FINDING TOPOGRAPHICALLY-SIMILAR REGIONS IN A T R I A N G U L A T E D TERRAIN M O D E L By Gwen Litchfield BMath, University of Waterloo, 1990 A T H E S I S S U B M I T T E D IN P A R T I A L F U L F I L L M E N T O F T H E R E Q U I R E M E N T S F O R T H E D E G R E E O F M A S T E R O F S C I E N C E in T H E F A C U L T Y O F G R A D U A T E S T U D I E S C O M P U T E R S C I E N C E We accept this thesis as conforming to the required standard T H E U N I V E R S I T Y O F BRITISH C O L U M B I A April 1997 © Gwen Litchfield, 1997 In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for refer-ence and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Computer Science The University of British Columbia 2366 Main Mal l Vancouver, Canada V6T 1Z4 Date: April SS- , 1197-Abstract Two shortcomings of systems developed to automatically extract the topographic features of terrain are that feature definitions are often based on a single attribute, such as Gaussian curvature or slope, and. that interactive queries regarding the similarity of one region to another are rarely supported. We report on a prototype system for triangulated terrain models that attempts to address these two issues. It supports interactive classification of the terrain via a user-controlled classification system. Each classification category is defined in terms of ranges of five key terrain attributes: local relief, slope aspect, slope gradient, plan curvature and profile curvature. The system also supports queries of. the type: "Which regions are topographically-similar to the selected region?" We use optimal-monomorphism in attributed graphs so that "topographically-similar" goes beyond simply matching terrain attributes to include the attributes and spatial relations of adjacent regions. Table of Contents Abstract ii List of Tables vii List of Figures viii 1 Introduction 1 1.1 Thesis Goals 3 1.2 Thesis Organization 4 2 Definitions: Digital Terrain Models 6 2.1 Digital Elevation Model (DEM) 6 2.2 Triangulated Irregular Network (TIN) 7 2.2.1 Delaunay Triangulation 8 2:2:2 Data Structures . . . 8 3 Feature Extraction from Digital Terrain Models (DTMs) 11 3.1 Land Classification Systems 11 3.1.1 Weaknesses 13 iii 3.2 Network Extraction Systems 14 3.2.1 Weaknesses . 15 3.3 Feature-Preserving Generalization Systems 15 3.3.1 Weaknesses ' . 18 3.4 Terrain Matching/Classification Systems : 18 3.4.1 Weaknesses . . 20 3.5 Summary 20 4 D e f i n i n g T o p o g r a p h y 2 2 4.1 Shape . ! 23 4.2 Differential Geometry: 3D Surface Characteristics . 23 4.3 Geomorphometry . . . 24 '4.4 Our Choice of Terrain Parameters 25 5 C o m p u t i n g T e r r a i n P a r a m e t e r s 2 6 5.1 Region of Computation . 26 5.1.1 Region of Computation in Our System . 28 5.2 Computation in a D E M 28 5.3 Computation in a T IN 30 5.3.1 Natural Neighbor Interpolation 31 5.4 Computation of Terrain Parameters in Our System . 32 iv 6 M a t c h i n g 34 6.1 Comparing Parameter Values in Triangles 35 6.2 Selection of Region to be Matched: Triangle Grouping . . 37 6.3 Comparing Parameter Values in Regions • • • 38 6.4 Storing Spatial Relations: The Attributed Graph 39 6.5 Graph Matching: Graph Optimal-Monomorphism . 40 6.6 Graph Optimal-Monomorphism Algorithm of Wong, You and Chan 41 6.6.1 Evaluation Function, f{N) . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.6.2 Heuristic Function, h{N) • . • • • • • • 43 6.6.3 Algorithm in Pseudocode 44 6.6.4 Our Cost Functions for Wong, You and Chan's Algorithm 46 7 O u r Sys t em 48 7.1 Overview 48 7.1.1 Implementation Tools 49 7.2 Steps in Using the System 50 7.2.1 Delaunay Triangulation 50 7.2.2 Computing Terrain Characteristics 52 7.2.3 Region Classification , 53 7.2.4 Building the "Base Graph" . 53 v 7.2.5 Target Region Selection . 55 7.2.6 Building the "Pattern Graph" 55 7.2.7 User-Assigned Weights and Pattern Graph Modifications 58 7.2.8 Graph Optimal-Monomorphism Algorithm 59 7.3 Discussion: Strengths and Weaknesses of the System 61 8 Summary, Conclusions and Future Work 67 8.1 Summary and Conclusions 67 8.2 Suggestions for Future Work 71 Bibliography 72 Appendices 77 A Natural Neighbor Interpolation 77 A . l First Partial Derivatives 78 A.2 Second Partial Derivatives 79 vi List of Tables > 6.1 Sample Classification Category List 38 7.2 Categorization Applied to Sample Terrain . 53 vii List of Figures 7.1 Delaunay Triangulation of Sample Terrain Dataset 51 7.2 Sample Terrain Colored by Aspect • 52 7.3 Sample Terrain Classified into Regions 54 7.4 Sample Terrain Base Graph 56 7.5 Sample Terrain Pattern Graph 57 7.6 Sample Terrain Matching Results 60 viii Chapter 1 Introduction Topography is defined as "the configuration of a surface including its relief and the position of its natural and man-made features." When studying the earth's surface, relief (elevation) is the most commonly measured topographic attribute. Elevation measurements are usually stored in a "digital terrain model" (DTM), which is the formal name given to any discrete representation of the variation of elevation over an area [TG90]. There are three main types of DTMs [Mar78]: the digital line graph (DLG), the raster-based digital elevation model (DEM) and the triangulated irregular network (TIN). The DLG is composed of a set of elevation contours, the D E M is a regular grid with elevation values stored at each (X,Y) location (grid point),.and the TIN is a set of elevations defined at irregularly spaced (X,Y) locations with a triangulation formed of these (X,Y) points. Given a D T M of any of these three types, elevation and many other "topographic attributes," such as slope or curvature, may be extracted. We define a topographic attribute to be a quantity that is derivable from a D T M treated as a mathematical surface. Thus slope and curvature, but not vegetation type, are examples of topographic attributes. . Extraction of topographic attributes from a D T M is useful to the fields of hydrology. [MGL93] and geomorphology [MGL93, Spe68, 01177], and to environmentaland land-use suitability as-sessments [Ham64, GH90]. For example, soil water content, a hydrologically important at-tribute, has been shown to be related to curvature. In geomorphology, land features are studied in hopes of understanding the processes that created them. Profile curvature has been found 1 Chapter 1. Introduction 2 to be a determinant of two geomorphological processes: erosion and deposition of a hillslope. Erosion, or the threat thereof, may also be important to a land-use suitability study. Slope steepness and length are two topographic attributes commonly used to assess erosion hazard. In addition to attributes, "topographic features" may be extracted from a D T M . We define a topographic feature to be a characteristic of a point or area that is determined by examining the point or area's attributes in relation to attributes at other points or areas in the model. For example, a peak, if defined to be the point (or area) of globally maximum elevation in,the model, is a topographic feature. One of the most common topographic features extracted from a D T M is the drainage network (i.e., network of flow of surface water). A drainage network is a complex topographic feature that depends upon other simpler topographic features, such as peaks, pits, ridges, or valleys. Many systems have been developed to automatically extract topographic attributes and features from DTMs. These fall into four basic categories: 1. Land Classification. Systems that help categorize broad areas of terrain (land classifica-tion) [Dik89, GU93]. 2. Network Extraction Systems that extract more complex features, such as drainage or ridge networks [PD75, OM84, Ban86, JD88, JWM90]. 3. Feature-Preserving Generalization Systems that extract features for the purpose of con-structing a (usually, but not always) simpler and feature-preserving representation of the terrain model [Dou86, Pfa76,FS91, KK94]. ; 4. Terrain Matching/Classification Systems that extract features for the purpose of recogni-tion or matching of particular regions of the terrain [GHL88, Fay96]. The focus of this thesis is systems in the fourth category: systems that extract topographic attributes and features for the purpose of matching terrain regions. The two primary weaknesses Chapter 1. Introduction 3 of current systems in this category are: 1. Limited flexibility: the definition of a topographic feature is often based upon a single attribute, such as Gaussian curvature, or slope. The field of general geomorphometry suggests that altitude and its various derivatives are required to characterize landforms [Eva72,. Mar75]. Pike [Pik88] suggests that 7 to 50 attributes are required to characterize a landform surface. 2. Limited interactivity: interactive queries regarding the similarity of topography of one region to another are rarely supported. Similarity of one region to another has either been done by manually compar-ing output of a feature classification system or has been solved as a pattern-recognition problem for DEMs, where a database of terrain images is searched for a specific target pattern. 1.1 Thesis Goals The primary goal of this thesis was to develop a topographic-feature matching system that addressed the weaknesses of existing terrain matching systems. Specifically, we wanted to:. • Base the definition of a topographic feature on more than just a single attribute, such as elevation, or Gaussian curvature. • Allow for matching of regions in terms of both topographic attributes and topographic features. • Support interactive queries regarding the similarity of one region in the D T M to another. • Use a triangulated irregular network (TIN) as the type of D T M , and investigate how this affects computation of topographic attributes and features. Most existing systems use a Chapter 1. Introduction 4 digital elevation model (DEM). In developing such a system, we also investigated three more general issues: What is a good operational definition of "topographically-similar"? What is an appropriate data structure and associated algorithm for performing similarity matching of terrain regions within a TIN terrain model? How can we assist the user in identifying a region of interest (i.e., a region to be matched) in the terrain model? The result of our investigation was the development of a prototype system for triangulated ter-rain models. It supports interactive classification of the terrain model via a user-controlled clas-sification system. Each classification category is defined in terms of five key terrain attributes: local relief, slope aspect, slope gradient, plan curvature and profile curvature. The classification ; segments the terrain into regions. Given these regions, the system also supports queries of the type: "Which regions are topographically-similar to the selected region?" A graph-optimal monomorphism algorithm for attributed graphs is used so that "topographically-similar" goes beyond simply matching terrain attributes to include the attributes and spatial relations of adjacent regions. 1.2 Thesis Organization The remainder of this thesis is organized as follows: • Chapter 2 presents more formal definitions of digital terrain models, focusing on the triangulated irregular network (TIN) and associated data structures. • Chapter 3 provides examples of systems from each of the four types of topographic feature extraction systems. Chapter 1. Introduction 5 • Chapter 4 highlights the difficulties of constructing a quantitative definition of topography, and presents the.beginnings of our prototype system's definition. . • Chapter 5 describes issues related to computing topographic parameters. For example, what size of an area should slope be computed over, and how computing slope in a D E M differs from computing slope in a TIN. • Chapter 6 discusses the concept of matching, and examines various matching algorithms, data structures and similarity, measures. • Chapter 7 outlines our prototype system, which reduces the terrain region matching problem to the matching of attributed graphs. A graph-optimal monomorphism algorithm is used to perform the graph matching. • Chapter 8 contains conclusions and suggestions for future work. Chapter 2 Definitions: Digital Terrain Models In this chapter, we present more formal definitions of various digital terrain models, with the main focus being the triangulated irregular network (TIN) and data structures commonly used to store it. ^ Our prototype system makes use of a TIN, which we store in the quad-edge data structure. The quad-edge structure was chosen as it simplifies the following three tasks: traversing the triangular facets of the TIN, traversing the edges of a triangle in a clockwise (CW) or counter-clockwise (CCW) order, and finding all neighboring points of a given point in a CW or CCW order. 2.1 Digital Elevation Model (DEM) The digital elevation model (DEM) is the most common terrain model. It is a regular, grid, with an elevation stored at each grid point. The grid points are regularly spaced in both the X and Y directions, although the spacing may be different for X and Y. DEMs provide efficient storage, as the (X,Y) locations of the grid points need not be stored explicitly. (The X and Y coordinates for any grid cell can usually be computed from the (X,Y) location of the lower-left corner cell of the grid and the spacing between cells in -X and Y directions. For some DEMs though, the elevation is recorded at the grid cell center, rather than at the lower-left corner of the grid cell. The (X,Y) location of the grid cell center is also easily computed and need not be stored explicitly.) 6 Chapter 2. Definitions: Digital Terrain Models 7 DEMs are available for most of the world at varying resolutions. The United States Geological. Survey (USGS) produces DEM data that corresponds to its published maps, and this data is available free of charge via the Internet. ~ DEMs are convenient for programming due to the simple storage and fixed data resolution. The neighborhood of a point can easily be reconstructed without searching the entire D E M . , One of the problems with DEMs is that the grid must be quite fine to accurately model rough ar-eas of terrain or precise breaklines such as ridges or watercourses, but this fineness is redundant in smooth regions of the terrain. 2 . 2 Triangulated Irregular Network (TIN) The triangulated irregular network (TIN) [PFLM78] is another common terrain model. It addresses the redundancy problem of DEMs, and can sometimes be defined more directly from source data. A TIN is a set of irregularly-distributed vertices linked together by straight lines to form a continuous, non-overlapping set of triangular elements [Mar78]. The choice of vertices to include and how to construct the triangular elements are two active areas of research. The most common method of constructing the triangulation is to create a Delaunay triangulation, which will be discussed in detail in the next section. Irregularly-spaced elevation data is not common in the public domain, but private companies often have survey or stereoplotter data that is sparse (and irregularly distributed) due to the expense of conducting surveys and digitizing. (Nearly all raster data is derived from irregularly-spaced source data.) . TINs require more storage than a DEM for each point - the (X,Y) location and elevation for each point must be stored. However, by having few points in smooth areas of terrain, and many Chapter 2. Definitions: Digital Terrain Models 8 points in rough areas, a TIN could potentially require less storage than a DEM, which would have to be uniformly fine in order to provide the same accuracy. 2.2.1 Delaunay Triangulation . i The Delaunay triangulation is a triangulation in which each triangle's circumcircle (circle through its three vertices) does not contain any other vertex of the triangulation. This tri-angulation maximizes the minimum interior angle of all of the triangles, (i.e., it avoids long "skinny" triangles with small interior angles). The Delaunay triangulation is often used as it has two additional nice properties: it is the dual of the Voronoi diagram, and for a set of vertices, it is unique. The Voronoi diagram (also known as the Thiessen tessellation) for a set of 2D vertices, V, is a set of Voronoi cells (polygons), one about each vertex. Each cell contains all the points closer to its vertex than to any other vertex in V. (These cells are sometimes referred to as "influence" polygons. ) The Delaunay triangulation is constructed from a Voronoi diagram by connecting every two vertices whose Voronoi cells are adjacent. The uniqueness of the Voronoi diagram implies the uniqueness of the Delaunay triangulation. 2.2.2 Data Structures Data structures for TINs must store the (X,Y) location and elevation (Z) for each point, as well as the connectivity between the points (triangle edges), so that the neighbors of a point can be found. Some structures store separate lists of vertices, edges and triangular faces (linked-list), while others store all three types of information together (winged edge, quad, edge). Each of these different data structures will be described below along with their advan-tages/disadvantages. Chapter 2. Definitions: Digital Terrain Models 9 Linked-Lists The linked-list data structure for a TIN usually consists of three separate doubly-linked lists - the vertices, edges, and triangular faces each stored in a list. There is no significance to the ordering of elements in the lists, which represent sets. An element in the vertex list contains the (X,Y) coordinates of the vertex, and the elevation (Z). An element in the edge list contains pointers to the two vertices that are the endpoints of the edge, and pointers to the two adjacent triangular faces. An element in the face list contains six pointers - three to the vertices that are the corners of the triangular face, and three to the edges of the face. The ordering of pointers in edge elements and face elements is arbitrary. The biggest advantage of the linked-list data structure is the low storage cost. The storage requirements are 6 pointers per face, 4 pointers per edge and 3 values per vertex (X, Y, and Z). A disadvantage of using linked-lists is that the data structure does not return ordered informa-tion. For example, finding all points that are neighbors of a given point (i.e., are connected to it) can be done by an exhaustive search of the edge list. Finding the neighboring points in a clockwise ordering is not directly supported by the data structure. Winged-Edge The "winged-edge" data structure was developed by Baumgart in 1975 [Bau75]. It was one of the first representations developed for storing the boundary of a polyhedron. The focus of this data structure is the edge, and it does not actually require that the plane faces be triangles (although for a TIN, all faces will be triangles). There are three separate lists - one for vertices, one for edges and one for faces. Each vertex points to (an arbitrary) one of its incident edges. Each face points to (an arbitrary) one of its bounding edges. Each edge record contains 8 pointers: two to the endpoint vertices, two to the adjacent faces (one left, one right) and four to edges incident to the endpoint vertices in both Chapter,2. Definitions: Digital Terrain Models 10 clockwise (CW) and counter-clockwise (CCW) directions (these 4 edges are called the "wings" of the edge). An advantage of the winged-edge data structure is that it is simple to traverse the edges of a face in CW or GCW order. A disadvantage of the winged-edge structure is its high storage cost, which exceeds that of the linked list and quad-edge structures. The storage requirements are: 1 pointer per face, 8 pointers per edge, and 4 values per vertex (X, Y, Z, and 1 pointer to the edge). Quad-Edge The quad-edge data structure was developed in 1985 by Guibas and Stolfi [GS85]. Like the winged-edge structure, the focus is the edge. Since it is meant to be for the boundary represen-tation of general polyhedra, the faces not need be triangles. Each edge record is part of four circular lists: two for endpoints and two for adjacent faces. An advantage of the quad-edge structure is that it encodes the dual subdivision automatically. The dual of a triangulation assigns a node to each triangular face and an edge connecting two such nodes if the triangular faces they represent are adjacent. The dual is useful for quickly traversing all the faces (triangles) in the TIN. Given a quad-edge structure, it is easy to find all neighboring points of a given point in a clockwise (CW) or counter-clockwise (CCW) ordering. It is also simple to traverse the edges of a given triangle in a CW or CCW order. A disadvantage of the quad edge structure is the storage cost. The storage requirements are: 8 pointers per edge (4 quad-edges per edge and 2 pointers per quad-edge (origin vertex and next quad-edge)), and 3 values per vertex (X, Y, and Z). Chapter 3 Feature Extraction from Digital Terrain Models (DTMs) Many systems and methods have been developed to extract topographic features from digital terrain models. The definition of the term "topographic feature" varies with each system, but it is typically defined to be some characteristic that can only be determined by examining the attribute values for one point (or area) in relation to the values at other points in the model. Examples of features are: peaks, pits, ridges, areas of homogeneous curvature, and drainage networks. In this chapter we review the four most common types of topographic feature extraction sys-tems: land classification systems, network extraction systems, feature-preserving generalization systems, and terrain matching/classification systems. We give examples of each of the four types of systems, and provide a summary of the main weaknesses of systems of each type. 3.1 Land Classification Systems Land classification is important to the field of geomorphology as it allows comparisons to be made between several areas of terrain (if the same classification system is applied to all of them). Geomorphologists have three general approaches to land classification: the genetic approach, the landscape approach and the parametric approach [Mab68j. In the genetic approach, land is subdivided into natural units on the basis of environmental factors such as climate and structure. In the landscape approach, the units are components with 11 Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 12 similar landforms, soils, and vegetation (i.e., groups that could be mapped from an airphoto pattern). In the parametric approach, land is divided and classified on the basis of selected parameter values. Many different sets of key parameters have been defined. For example, Speight [Spe68] suggested a parametric system where land was classified on the basis of four parameters: slope, rate of change of slope, contour curvature, and unit catchment area (measurement of the size of the area upslope from a point). Speight's system then defined seven named landform element categories (e.g., crest or hill-slope) in terms of ranges of values of these four attributes. The parametric approach is the one that has lent itself most readily to automation, as the parameters are usually easily computable from a DEM. Dikau [Dik89] developed a hierarchical automated land classification system for DEMs. The terrain is divided into form elements, which are further subdivided into form facets. A form element is a unit of homogeneous plan (horizontal) curvature and profile (vertical) curvature. A form, facet is a unit with homogeneous gradient, aspect, and plan and profile curvature. One of the strengths of this system is its hierarchical structure, which allows for classification at different levels of detail, and over differing sized areas. Graff [GU93] developed an automated terrain classification system for DEMs that uses two generic terrain feature categories - mount and non-mount. A "mount" is an isolated, elevated mass with a distinguishable boundary and a summit or peak. Everything else is categorized as non-mount. The working definition of mount is based on altitude, slope and critical points (peaks and .ridge points). The weaknesses of this system are: • It classifies a very smallpart of the terrain - mount. • The algorithm produces many small isolated mounts.. Post-processing is'thus used to eliminate all mounts of less than 25 grid cells. The choice of 25 grid cells is arbitrary, and this size is probably dependent on terrain type. Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 13 • The boundary between mount and non-mount is set to be some percent slope value. For example, the mount region must have percent slope greater than 10, percent. The correct choice for this percent slope value seems to be related to local relief (maximum elevation minus minimum elevation over terrain region under consideration), but this relation is not well defined. 3.1.1 Weaknesses From examination of the Dikau [Dik89] and Graff [GU93] systems, weaknesses and questions for automated land classification systems are apparent: • The choice of parameters and the mapping of ranges of parameter values to named topo-graphic feature categories seems arbitrary. • One must be careful to define a mapping so that no point in the terrain model either fails to be assigned to a named feature category or is assigned to more than one. • The systems typically accept only DEMs as the input terrain model. • It is not clear whether critical points (e.g., peaks) or geomorphometric type parameters (e.g., curvature) or both should be used. • Should a land classification system be hierarchical in nature? This might allow for recogni-tion of relations between neighboring landforms, and not just simple landforms computed at a single point. • Should point or area measures be used? For example, should percent slope be computed for a point or averaged over an area? Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 14 3.2 Network Extraction Systems A complex global topographic feature of interest to hydrologists is the drainage network, which is the network of flow of surface water over an area of terrain. Ridge lines and valley or stream lines are often extracted first as they help to determine the drainage network (stream lines are part of the drainage network^ and drainage paths cannot cross ridge lines). . Peucker and Douglas [PD75] extracted ridges and valleys by using a two-by-two moving window over a DEM. The grid cell with the lowest elevation in the window is flagged. Any unflagged cells remaining after the window has passed over the entire D E M are ridges. Similarly, the highest cell in the window is flagged, and any unflagged cells remaining are the valley lines. The weaknesses of this method are: • It assumes that every point on the surface can be classified solely on the basis of an analysis of the point's neighbors. • The method is highly sensitive to noise in the DEM. • The discreteness of the D E M poses problems in recognizing intrinsically continuous fea-tures, such as ridgelines. O'Callaghan and Mark [OM84], Band [Ban86], and Jenson [JD88] also developed algorithms for extracting stream/ridge networks from DEMs. One common concern of these algorithms is how to deal with depressions: areas surrounded by higher elevation values. In determining hydrologic flow directions, the depressions must fill before the flow can continue.- Some of these depressions may be errors in the data, while others may be real topographic features such as potholes or quarries. O'CalJaghan and Mark attempt to remove depressions by smoothing the data, Band ignores them completely, and Jenson uses flow direction and watershed information to determine how depressed cells should be raised. Jones et al. [JWM90] presented a method for computing the path of steepest descent from a Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 15 given starting point in a TIN, and extended it to compute the approximate stream network. Two problems of this method are flat triangles and pits. The path of steepest descent is undefined for points that lie in a flat triangle, as the path is computed locally within that triangle. A pit is a vertex in the triangulation whose surrounding edges all slope upward from the vertex. Although pits can be a natural part of the terrain, false pits can be created by the triangulation process itself. 3.2.1 Weaknesses The algorithms discussed above highlight several problems that arise when extracting any topographic feature (not just the drainage network) from a terrain model: • It can be difficult to distinguish between an error in the.data and a real topographic feature. This was illustrated by the problem of depressions. • The introduction of a data structure for the terrain data may in itself introduce false topographic features. For example, the false pits in the TIN model as mentioned above. A DEM is not necessarily free from this problem either - the original terrain data may have been gathered at irregularly spaced locations and interpolated to a regular grid, .or the original grid may have been resampled to a grid of differing resolution. Both interpolation and resampling can introduce false topographic features. 3.3 Feature-Preserving Generalization Systems The D E M and TIN are the most commonly used terrain models, but attempts have been made to construct simpler models from a D E M or TIN without loss of topographic feature information. There are two categories of simpler models: those that are significantly different from the DEM or TIN, and those that try to build a simpler TIN. We will first discuss models which differ from a DEM or TIN. Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 16 Douglas [Dou86] introduced the "RICHLINE" digital elevation model. The name stands for ridge and channel and other information-rich lines. The model is vector (not raster) based, and includes only the points from the original D E M that lie on one of these "richlines." To illustrate the feature-preserving nature of the richline model, Douglas presents an algorithm for converting the richline model back to a DEM. The richline model is similar to the work of Pfaltz [Pfa76] who defines a graph representation of the terrain surface, where the vertices are critical points (pits, peaks, etc.) and edges are ravine and ridge lines. Falcidieno [FS91] proposed the construction of a "Characteristic Region Configuration Graph" by extracting shape features from a TIN. These shape features are found by testing the ad-jacency type (convex, concave, plane) between surface facets (triangles) in a TIN. First, each edge is assigned an attribute value that indicates whether the angle between the two triangles sharing the edge is concave, convex, or plane. Next, each triangle is assigned to one of six categories depending upon the angle attribute values of its three edges. Characteristic regions are then formed by region growing where triangles in the same category are combined. The result is that each characteristic region is a contiguous block of triangles that have uniform Gaussian curvature sign. Each characteristic region becomes a node in the Characteristic Re-gion Configuration Graph, and attributed arcs are added to the graph to represent adjacency of these characteristic regions to critical lines and points (such as ridgelines or peaks). Some of the problems with this graph representation are: • A tolerance is required in criteria for edge classification. For example, a flat region in the real terrain surface is not represented by a set of perfectly coplanar triangular facets. • Grouping triangles by uniform Gaussian curvature sign causes fragmentation of the terrain into small characteristic regions, whereas the terrain would be better described in terms of its general convex shape. For example, a hill is convex overall, but may have small concave areas inside the area. Use of uniform signed Gaussian curvature means that the Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 17 hill is not stored as a single region in the graph, but as many small regions. • Ridges or valleys of the actual terrain may not be coincidental with edges of the trian-gulation and therefore may not be coincident with characteristic region boundaries. This means that they would not be present at all in the graph. Kweon et al. [KK94] introduced the use of the "topographic change tree" or contour tree in their work on extraction of peaks, pits, ridges and ravines. The contour tree is constructed by building a contour map from a DEM, and then generating the connectivity tree of all regions separated by the contours. Kweon et al. claim that the resulting tree is simpler than the DEM, and that it allows for more accurate definition and extraction of peaks, pits, ravines and ridges. Some of the problems with the contour tree are: • The contour tree requires a contour map from a DEM, so there must be a parameter to determine the number of contour levels to construct from the DEM. It is not clear what the value of this parameter should be for the most accurate contour tree. • The contour tree is not used as a final representation, just as an intermediate step in the search for ridge and ravine lines. Its usefulness as an alternate to the D E M is unknown. So far we have discussed feature-preserving terrain models that are different from a DEM or TIN. There is a second group of models - those that try to construct a simpler TIN from a subset.of the original data points. These models are often referred to as "data-dependent" triangulations because points are chosen for inclusion so that accuracy requirements, defined in terms of local or global error in height, curvature, etc., are met while minimizing the total number of triangles. Garland et al. [GH95] provide a good overview of triangular surface simplification methods. > The primary weakness of data-dependent triangulations is that meeting accuracy requirements Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 18 (such as global error in height) does not necessarily imply that topographic features are well-preserved. 3.3.1 Weaknesses The discussion of alternative terrain models (to the DEM and TIN) has highlighted several common weaknesses: • Each model has its own definition of critical points and lines (peaks, pits, ridge and. ravine lines). Some models don't even define them at all (data-dependent triangulations). • With the exception of the data-dependent triangulation, each model is constructed from a DEM or a TIN, not from the original sample points. Information may already have been lost or false information introduced in the starting DEM or TIN. • The construction of the alternate model seems to involve the selection of some arbitrary parameter value. For example, the number of contour levels from a DEM in the contour tree, the tolerance for edge classification in the characteristic region configuration graph, - or the error threshold in the data-dependent triangulation. ' 3.4 . Terrain Matching/Classification Systems The fourth type of topographic feature extraction system is one which allows for the automatic comparison or matching of features for two or more terrain regions. The land classification systems previously discussed have the goal of allowing for comparison of terrain regions, but this comparison is not automated. Goldgof et al. [GHL88] developed an algorithm which uses Gaussian curvature to extract feature points on the terrain, and then uses these points to locate a small terrain sample within a large reference area. For both the large reference DEM and the smaller sample DEM, the D E M is Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 19 first smoothed, and then Gaussian curvature is computed for each pixel by fitting a quadratic surface over a square window (centered on the pixel) and computing directional derivatives of the surface. Thresholding is used to extract points with large absolute values of Gaussian curvature, and these points are then matched .between the sample D E M and the reference DEM. Goldgof et al. also investigate the case of different sampling for the sample and reference DEMs, and the case of unknown rotation of the sample plane with respect to the plane of the reference DEM. They acknowledge that Gaussian curvature does not completely determine the surface, but for complicated surfaces, they suggest that there will be enough features in the Gaussian curvature image to recognize the surface. In the examples given, their algorithm performs well except for terrains with many man-made objects (such as flat buildings), since the Gaussian curvature is zero for the object except at the corner points. This leads to very few feature points and difficulties in the matching, as decreasing the threshold to acquire more feature points would increase the sensitivity to noise. Fayek [Fay96] presented a method for compression of 3D range data and the recognition of objects within this compressed data. For experimental verification of the method, Fayek at-tempted the recognition of man-made structures in aerial images. This is still terrain matching - between an image and a stored object model, rather than between two images (treated as DEMs). Fayek's method first creates a TIN of all terrain data points in the image (DEM). Next, topographically important vertices and edges (such as peaks, pits, valleys and ridge lines, outside boundary edges) are determined. These edges are simplified using a line simplification method. A constrained triangulation of just the simplified edges and topographically important vertices produces a coarser, but yet feature-preserving triangulation. This compression can be repeated on the new coarser TIN to produce an even coarser representation of the terrain. The coarsened TIN is then compared to a generic model of a 3D man-made structure (such as a house). The model is constructed of nearly-planar patches that represent the surface boundary of the object. The strength of Fayek's algorithm depends upon accurate initial selection ofthe topographically important vertices and edges, and good generic object models. Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 20 3.4.1 Weaknesses The examination of the two terrain matching systems above has highlighted several common problems: • To keep the computational cost of matching low, the number of features/parameters to be matched must be small. There will always be a compromise between computational cost and the accuracy of the match. • Methods that match natural terrain may not do well for man-made terrain and vice-versa. 3.5 Summary In this chapter we reviewed examples of each of the four types of topographic feature extraction systems: land classification systems, network extraction systems, feature-preserving generaliza-tion systems, and terrain matching/classification systems. It is apparent that while the four types of systems have different goals, they do share problems and raise common issues: .-. • • The original data may contain noise. The drainage network extraction method of Peucker and Douglas [PD75] was said to be highly sensitive to noise in the D E M . The terrain matching system of Goldgof et al. [GHL88] had problems with noise when the terrain contained many man-made structures. • • The construction of a D E M or TIN or other terrain model can introduce false topographic features. False pits in a TIN, introduced by the triangulation itself, can alter the result of a drainage network extraction. Chapter 3. Feature Extraction from Digital Terrain Models (DTMs) 21 • There is a reliance on some arbitrary parameter(s). Examples include: the percent slope boundary between mount and non-mount categories in a land classification system, the assigned direction of path of steep-est descent in a flat triangle, or the threshold level of Gaussian curvature used to identify feature points in a terrain matching system. • It is not clear whether local or global methods should be used for extracting topographic features. A ridgeline is an intrinsically continuous feature which is often extracted by local methods (e.g., compare a point to its neighbors within a 3x3 moving window in a DEM). While local methods are fast, the resulting ridgelines are not very accurate. • It is not clear how to define topography,- in terms of critical points (pits, peaks, ridgelines, etc.) or in terms of geomorphometric parameters (relief, slope, curvature, etc.). The DEM (certainly) and the TIN (possibly) may not have even captured the critical points, therefore searching for them may be fruitless. Chapter 4 Denning Topography Our discussion of topography up to this point has been in terms of topographic attributes and features, for which we gave general definitions. An attribute was defined to be something that had a value at each point in the D T M and was derivable from elevation, such as slope. A feature was defined as something that could only be determined by examining the attribute values at a point (or area) in relation to the attribute values at other points in the model, such as a peak. These general definitions do not lend themselves to our goal of matching region topographies. Ideally, we would like to define topography in terms of a set of easily computable parameters. In order to facilitate matching, our definition needs to include the list of parameters, how we will compute them, and how we will quantify the correspondence between sets of parameter values (i.e., what is an appropriate similarity metric for these parameters). We must also be concerned with the spatial relations of these sets of values, as topographic features are determined in this way. In this chapter we focus on choosing an appropriate set of parameter values to form our definition of topography. We review the concept of shape, geometric surface characteristics, and the field of general geomorphometry in order to assist us with our selection. The set of parameters that we choose is local relief, slope aspect, slope gradient, plan curvature, and profile curvature. Issues related to computing these parameters are discussed separately in Chapter 5. Similar-ity measures for the sets of parameter values and methods for storing and comparing spatial relations are discussed in Chapter 6. 22 Chapter 4. Defining Topography 23 4.1 Shape Shape is difficult to define. Webster's dictionary defines shape as "spatial form." An introduc-tory text on computer vision provides the definition: "shape is that quality of an object which depends on the relative position of all points composing its outline or external surface" [BB82]. Shape is a property of both 2D and 3D objects, and does not normally include size or position of the object. Shapiro [Sha80] uses a structural definition of shape to assist in shape recognition. She states that: "a structural definition of shape must include a set of shape primitives, their properties, and their interrelationships." Attempts have been made to define shape more quantitatively, but there is no consensus: • Ireton et al. [10X92] define 2D shape in terms of circularity, transparency, aspect ratio, irregularity, and extreme point ratio. : • The QBIC (Query by Image Content) project [NBE+93] uses another definition of 2D shape: area, circularity, eccentricity, major axis orientation and a set of algebraic moment invariants. 4.2 Differential Geometry: 3D Surface Characteristics The field of differential geometry considers the mathematical treatment of surfaces in n-dimensional space. Differential geometry suggests that a discrete surface element has eight degrees of free-dom: three to describe its (X, Y, Z) position, three for its orientation in space, usually described in terms of a unit surface normal vector, and two for its maximum surface curvature (kl) and minimum curvature values (k2) [Kas89]. 1 Chapter 4. Defining Topography 24 4.3 Geomorphometry The field of general geomorphometry is "the measurement and analysis of those characteristics of landforms which are applicable to any continuous rough surface" [Eva72j. It may be considered a sub-discipline of geomorphology. The goal of general geomorphometry is to determine the minimal set of parameters needed to characterize any land surface. Unlike geomorphology, it does not consider the mapping of values of these parameters to feature type names. Evans [Eva72] studied DEMs, and concluded that the key geomorphometric parameters are: altitude at a point, slope gradient (rate of change of altitude (Z)), slope aspect (rate of change of X wrt Y, for constant Z), downslope convexity (second vertical derivative or rate of change of gradient), and cross-slope convexity (second horizontal derivative or rate of change of aspect). Evans suggested that these five parameters should be computed over point values in order to produce statistical distributions of the parameter values over the entire terrain. The distribu-tions can then be characterized by the mean (measure of central tendency), standard deviation (measure of variability), dimensionless skewness (measure of asymmetry), and kurtosis (mea-sure of peakedness). The distributions of the parameters rather than individual point values then become the defining characteristics of the terrain. Mark [Mar75] reviewed geomorphometric parameters with attention to the measurability of the parameter from a computer-based terrain storage structure, and the geomorphic significance of the measure. He concluded that (almost) all important terrain information is contained within four measures: local relief, mean slope (tangent of the slope angle is actually used), roughness factor, and the hypsometric integral. (The hypsometric integral is a measure of the interrelation of area and altitude. It can be computed as E = ^ for a given area, and is equal to the elevation-relief ratio identified by Wood and Snell in [WS60].) Pike [Pik88] conducted experiments on D E M terrain samples from California to determine which of the 70 parameters computed by an existing computer program (from JPL, Pasadena, Chapter 4. Defining Topography 25 CA) were required for a "geometric signature". This geometric signature would be the minimal subset of the 70 parameters that could distinguish between these terrain samples. He concluded that 17 of the 70 parameters were sufficient for characterization of the given terrain samples, but that a different geometric signature would have to be developed for each land formation process. These 17 parameters thus do not fulfill the goal of general geomorphometry, as they do not apply to any land surface. 4.4 Our Choice of Terrain Parameters We chose the following set of terrain parameters as the basis of our definition of topography: local relief, slope aspect, slope gradient, plan curvature (curvature in a direction perpendicular to the gradient), and profile curvature (curvature in the gradient direction). Our decision was based on our requirement for an easily computable set of parameters, and that these parameters formed a common subset of the various sets discussed in the previous section. Chapter 5 Computing Terrain Parameters In the previous chapter we chose the set of parameters that form the basis of our definition of topography (local relief, slope aspect, slope gradient, and plan and profile curvature), but we did not discuss how we would compute these parameters. In this chapter we discuss the issue of point vs. area computation of parameters, and how terrain parameters are typically computed in a DEM vs. a TIN. Given that we want to use a TIN, we conclude that it is best to compute the parameters over each triangle in our TIN, rather than over points or some arbitrary area. 5.1 Region of Computation For a particular terrain parameter, such as slope, one can debate what the region of computation should be. Should we compute the slope at a point? Should we average the slope over an area? Which is a more meaningful value? Mark [Mar75] suggested that the appropriate region of computation depends on the scale of horizontal variations in topography. He defined "texture" to be the shortest significant wave-length in the topography and "grain" as the longest significant wavelength. Texture is then related to the smallest landform elements one wishes to detect, and obviously dictates the sam-pling density when elevation data is acquired. Mark stated that grain is the size of the area over which one measures other parameters, but did not present a clear method for computing 26 Chapter 5. Computing Terrain Parameters 27 the grain. Wood et al. [WS60] suggested that grain "is dependent on the spacing of major ridges and valleys." They proposed a method for computing grain. Local relief (highest point minus lowest point) is determined within concentric circles centered at a randomly selected data sample point, and then plotted vs. radial distance. The grain is then the diameter at the "knick-point" in this curve, where the slope of curve decreases abruptly and then remains steady. One weakness of this method is that the choice of circle radii seems rather arbitrary. Thompson [Tho64] (in [Eva72]) used a similar method for computing grain, but "representa-tive" high and low points in the neighborhood of the random sample point were defined as the first points touched by spheroids lowered from above and raised from below the sample point. The high and low points and resulting differences were plotted vs. sphere radii. The radii of the spheres then decide which knick-point on the curve is used. Other methods propose that geomorphometric parameters, especially local relief, should be computed within a certain fixed sized radius or grid square. While easier to implement, there is no consensus as to the size of the square or circle that should be used. Evans [Eva72] cited studies that used squares ranging from 1 km2 to 90km2 and referred to a study by Hammond [Ham58], which suggested that it may be necessary to use different sizes in areas of markedly different topography. It is important to note that most of these investigations reflect the computing constraints of the time - prior to easy access to digital D E M data. The use of fixed sized square areas simplified the computations. If this problem were to be studied now, fixed sized areas may not be the best solution. Chapter 5. Computing Terrain Parameters 28 5.1.1 Region of Computation in Our System Given that we wanted to use a TIN as our terrain model, we chose the triangle as our basic region of computation. We triangulate the terrain data points using the Delaunay triangulation, which is known to produce triangles with good aspect ratio. We then compute the values of our chosen terrain parameters over the resulting triangles. Ideally, one would like interactive control within the system over the choice of grain; our current prototype does not provide this directly, but merely accepts an arbitrary TIN and makes no assumptions about the importance of any particular vertex in the TIN. This implies that the grain of triangles in the TIN should reflect the grain of the terrain. 5.2 Computation in a D E M Computing parameters such as slope and slope derivatives in a D E M typically involves the use of a finite difference or surface fitting method, as these are continuous parameters being computed from a discrete model. Skidmore [Ski89] presented a comparison of six techniques for calculating the gradient and aspect from a DEM. The techniques were as follows: 1. For a 3x3 window in a DEM, aspect was defined to be the direction of maximum drop from the center pixel to the eight nearest cells. So aspect was the direction (in 45 degree intervals) of the maximum gradient. 2. For a 3x3 window in a DEM, gradient was defined to be the maximum gradient of the steepest drop or steepest rise. Aspect was then the direction of the maximum gradient (again in 45 degree intervals). 3. A second order finite difference method using a 2x2 moving window to compute gradient and aspect (numerical differentiation). Chapter 5. Computing Terrain Parameters 29 4. A third order finite difference method using a 3x3 moving window. This was like the method of Sharpnack and Akin [SA69], but had a weighting factor for non-diagonally adjacent cells. 5. A multiple linear regression model where the surface was fitted to nine grid cells in a 3x3 moving window using least squares. 6. Another linear regression model. Skidmore concluded that: • Spurious values of aspect and gradient may be generated over some terrain features re-gardless of the method used to calculate the gradient and aspect. • Methods 4-6 have the best potential for estimating gradient and aspect from a gridded DEM. Evans [Eva72] fitted a 5-term quadratic trend surface, z = ax2 + by2 + cxy + dx + ey, to the 3x3 neighborhood of each DEM sample point in order to compute various parameters in his geomorphometric study. Heerdegen and Beran [HB82] used this method to compute aspect, gradient and plan and profile curvature in a DEM according to the formulae: -e -d aspect (degrees) = 180 — arctan(——) — 50(y^ j-) (5-1) gradient (mm-1) = (d2 + e2)^2 (5.2) I I -1 \ (Cd ~ %ae) I T O \ plan curvature [m ) = e——: (5.3) (er + e/i)1-i' „. , , - 2(ad2 + be2 + cde)(1.0 + d2 + e")1 profile curvature (m ) = r-rj (cr + ez) (5.4) This work was then extended by Zevenbergen and Thorne [ZT87] to exactly fit the 9-term quadratic polynomial, z = Ax2y2 + Bx2 y + Cxy2 + Dx2 + Ey2 + Fxy + Gx + Hy + I, to all Chapter 5. Computing Terrain Parameters 30 9 points in the 3x3 neighborhood grid. Moore et al. [MGL93] provide formulae for aspect, gradient and plan and profile curvature (of the midpoint of the 3x3 grid) based on this 9-term quadratic polynomial: H G aspect, i> = 180 - arctan(-) + 90(—^) (5.5) G IG | gradient, f3 = arctan[(G2 + H2)1 /2} (5.6) DH2 + EG2 - FGH plan curvature, u = 2 — 5 5 (5-7) G'* + n DG2 + EH2 + FGH profile curvature, <p = — 2 ^ 2 _|_ j]2 \P-°) 5.3 Computation in a TIN Computing parameters such as slope and the first derivatives of slope in a TIN relies on treating each triangle as a planar surface, and estimating an average parameter value for the triangle. It is important to note that every triangle in the TIN may be of a different size and shape. This differs from computation in a DEM, where every measure is computed over an area of the same size and shape (a grid cell) to produce a single value for each cell. Tajchman [Taj81] represented the triangular facet as the plane equation z = Ax + By + C, where A, B, and Care determined by the simultaneous solution of the equation at the three vertices of the triangle (Pt - (xl ,yl, zl), Pz = [x2,y2,z2),P3 = (x3,y3,z3)). Moore et al. [MGL93] gave formulae for aspect and gradient based on Tajchman's method: B A aspect, i> = 180 - arctan(—) +90(T—T) (5.9) A \A\ gradient, p = arctan[(A2 + B2)1/s] (5.10) Unfortunately, Tajchman's method cannot be used to calculate parameters such as plan and profile curvature as the second derivatives of the planar surface equation do not exist. A local interpolation method which does allow for the computation of second derivatives is Chapter 5. Computing Terrain Parameters 31 natural neighbor interpolation. As this is the method used in our system, natural neighbor interpolation is discussed in detail in the next sub-section. 5.3.1 Natural Neighbor Interpolation Given a TIN, and the need to interpolate values for a point (x,y) not in the TIN, a typical local interpolant is: where f(x, y) is the interpolated function value at the point (a;, y), the /,'s are the data values at n local nodes in the TIN, and the cfo's are the weights associated with each node. The smoothness of the interpolation depends on the choice of n local nodes and associated weights. If linear interpolation is sufficient, then the values for a point (x, y) not in the TIN can be found by using the three vertices of the triangle which contains (x, y) as the nodes, and weights based on the location of (x, y) in the triangle. Natural neighbor interpolation, first introduced by Sibson [Sib80, Sib81], is a local interpolant that has two important properties: • it is an exact interpolant (i.e., the original function values are recovered exactly at the reference (in this case, TIN) nodes), • the derivatives of the interpolation function are continuous everywhere except at the reference points. In natural neighbor interpolation, the n nodes used in the local interpolation function (Eqn. 5.11) are the "natural neighbors" of the point (x,y), and the weights used (<?!>;'s) are the "natural neighbor coordinates" of (x, y) with respect to each of these n nodes. The natural neighbors of (x, y) are just the n nodes to which the point would be connected if it were added to the Delau-nay triangulation. The natural neighbor coordinate of (x, y) with respect to natural neighbor n (5.11) i=l Chapter 5. Computing Terrain Parameters 32 node i is defined to be the area of the second-order Voronoi cell between (x, y) and node i, normalized by the total area of the Voronoi cell around (x, y). (The second-order Voronoi cell between (x, y) and node i is just the region of overlap between the original Voronoi cell around node i and the new Voronoi cell around (x, y) after it was added to the Delaunay triangulation.) Sambridge et al. [SBM95] provide a detailed look at natural neighbor interpolation, including a description of Watson's method [Wat92] for computing the natural neighbor coordinates. They also derive the first partial derivatives of the natural neighbor interpolation function, fx and fy. 5.4 Computation of Terrain Parameters in Our System As our system uses a TIN model, all terrain parameters are computed on a per-triangle basis (rather than at each vertex in the TIN). We compute local relief for each triangle as zmax — zm{n where zmax is the maximum, and zmin the minimum, of the z (elevation) values for each of the three triangle vertices. For aspect and gradient, we first use Tajchman's method (see Section 5.3 for details) to derive the planar equation z = Ax + By + C for each triangle. Given this equation, we then use the formulae for aspect and gradient as outlined by Moore et al. (see Section 5.3) to provide us with average values of aspect and gradient for the triangle. For plan and profile curvature, we use natural neighbor interpolation to determine the values of these parameters at the in-center of each 2D triangle. The in-center of a planar triangle is the origin of the largest circle which lies entirely within the triangle. Given a planar triangle defined by vertices Pj = (xi,yi), P2 = (#2,2/2), and P3 — (x5, y3) (i.e., we ignore the z value), the in-center can be computed as: _ l-Pg ~ Ps\Pi + \Ps - Pi \P& + \Pi - Pz\P3 ,r 1 2 s I n ~ PerimeteriAPtPgPs) ' Chapter 5. Computing Terrain Parameters 33 Given the (x, y) coordinates of the in-center, Cj„, we compute the natural neighbor coordinates of Cin using Watson's method [Wat92] (as outlined in Appendix A of the paper by Sambridge et al. [SBM95]). This gives us the list of the n natural neighbor nodes to C/„, and the weights, 0,'s (just the natural neighbor coordinates of Cjn with respect to each natural neighbor node i). We can then compute elevation, z, at the in-center of the triangle as: i = l where the /,'s are the elevations at the n nodes. We compute the first and second partial derivatives of z (elevation) at the in-center of the triangle: fx, fy, fxx, fyy, and fxy. We use the formulae for the first partial derivatives of the natural neighbor interpolation function as outlined in Sambridge et al. [SBM95]. For the second partial derivatives of the natural neighbor interpolation function, we extended the first partials derivation of Sambridge et al. [SBM95] (see Appendix A for details). Given the first and second partial derivatives at the in-center of the triangle, we can finally compute plan and profile curvature for the in-center point using the following formulae (from Mitasova and Hofierka [MH93]): n (5.13) plan curvature, u — fxxfy <2fxyfxfy ~\~ fyyfx (5.14) profile curvature, <f> = (5.15) where: P = fxS+fy2 q = p+ 1 (5.16) (5.17) Chapter 6 Matching We have now defined the set of parameters that form the basis of our definition of topography: local relief, slope aspect, slope gradient, and plan and profile curvature (see Chapter 4). We have also described how we will compute these terrain parameters in the TIN model - on a per triangle basis, using a combination of planar equations and natural neighbor interpolation (see Chapter 5). In this chapter, we focus on several issues related to matching of terrain regions: • How can we compare the values of these parameters from one triangle to another? If the parameter values of triangle A differ from those of triangle B by one degree in aspect, and differ from those of triangle C by one degree in gradient, is triangle A more similar to triangle B or to triangle C? We need to define a "unit" of change for each of our five terrain parameters. • How can we group the individual triangles into regions that represent a topographic feature or have some meaning to the user? We want the user to be able to locate more than just topographically similar triangles. • How can we compare the set of values of terrain parameters for the triangles in one feature region to the set of values for another region? Region A may have 100 triangles, with values for the five terrain parameters at each triangle, and Region B may have 17 triangles, with values for the five parameters at each triangle. How do we compare a set of 100 X 5 = 500 values to a set of 17 X 5 = 85 values? 34 Chapter 6. Matching 35 • How can we keep track of the adjacency, or spatial relations of the feature regions? We want to match a feature using information about the adjacent features in addition to the values of the five terrain parameters inside the feature region. • Given a method for comparing parameter values, grouping triangles into feature regions, and storing adjacency of these regions, what algorithm can we use to do the actual match-ing? We must match both parameter values and spatial relations. 6.1 Comparing Parameter Values in Triangles A match is a correspondence for which the "goodness" has been quantified. This measure of goodness is the similarity metric. We require a similarity metric for our set of terrain parameter values. In this section, we consider how to define such a metric by first examining metrics used in pattern recognition and object recognition problems. In pattern recognition, a feature vector, v, is found for each image. This vector is a set of measurements vi, ...Vk that condense the description of the relevant properties of an image into a small, Euclidean feature space of k dimensions [BB82]. To compare two images, the feature vectors of the two images, v and w, are compared using a distance metric, which assumes the same units for all dimensions of the vector. Examples of distance metrics are: Euclidean = ^5>;-«>,•)* (6-18) Manhattan(Lt) = ^ — (6.19) i=l Chessboard(L^) = max(\v{ — W{\), for i = l,k (6.20) The smaller the value, the closer the match, with a value of 0 being the perfect match. Color is often used for object recognition in computer vision, as it is a local surface property that is view invariant and largely independent of resolution [SB90]. To compare the color properties Chapter 6. Matching 36 of two images, V and W, a color histogram vector is created for each image. These histogram vectors, v = vj ...Vk and w = wt ...Wk, are created by sorting the pixels of each image into k bins, where each bin represents a range of colors, w,- and equal the number of pixels in images V and W respectively whose color is in the range associated with bin i. v and w can then be compared using a histogram similarity metric, H [SB90]: H(v,w) = E?=' ( 6 . 2 1 ) The numerator is the number of pixels from image W that have corresponding pixels of the same color in image V. H is normalized to produce a fractional match value between 0 and 1 by dividing by the number of pixels in image W. This is similar to pattern recognition metrics if we just consider each histogram bin as a component of the feature vector. We could use the Manhattan metric (Eqn. 6.19) for our set of five terrain parameters as long as the units were the same for all parameters. They are not - local relief is in meters, and aspect and gradient are in degrees. However, we can normalize the parameters by dividing each by their standard deviation, which is a measure of the variability of a distribution. Standard deviation, a, of a parameter x, is defined to be: a = \ 3 — (6.22) V n — 1 where n is the number of triangles in the TIN, and x is the sample mean over the n trian-gles [GH84]. This metric still has the limitation that all terrain parameters are considered equally important. We could modify this to allow for a user-specified importance weight for each parameter. Given our set of five terrain parameters: local relief, aspect, gradient, plan cur-vature, and profile curvature, then the metric (for comparing two sets of these five parameters, v = vi ...vs and w = Wi ...Ws) would become: m e t r i c = y ~ "'1 (6.23) where w,- is the user-assigned weight > 0 for terrain parameter i, and CT; is the standard devi-ation for the ith parameter. A weight of 0 means that the parameter will not be included as Chapter 6. Matching 37 part of the metric. This metric with weights (Eqn. 6.23) would work well if we only compared individual triangles within our system. However, recall that we wished to provide the user with the ability to locate more than just two topographically similar triangles. This implied that we needed some way to group triangles together into regions that either represented a topographic feature or had some other significant meaning to the user. This issue will be discussed in the next section. Similarity metrics more suitable for region comparison will be discussed in Section 6.3. 6.2 Selection of Region to be Matched: Triangle Grouping One of the original goals of our system was to match both terrain attributes and features. Matching terrain features, such as a ridge or a valley, is more difficult than we first realized. Part of the complexity is that the literature provides varying definitions depending on the field of study (hydrology, geomorphology, geography, etc.). The definition given is also often dependent upon the type of D T M used. Ideally, we would like to formulate definitions for individual topographic features that make use of the information that we have: triangles, and the values of our five terrain parameters for each triangle. This nows seems to be too simple of an approach. If we had a working definition of various terrain features, we would classify each triangle in the TIN as being part of some feature, which would segment the TIN into distinct regions. This would be presented as the the initial view of the terrain to the user. The user could then pick a feature region of interest as the region to be matched. Rather than construct oversimplified definitions of terrain features, we allow the user to provide a list of categories to be used to classify the terrain for the initial view. Each category is defined in terms of ranges of our five terrain parameters, and has a name and an associated color. A user may omit a range for a specific parameter, in which case that parameter will not be considered Chapter 6. Matching in the classification. Table 6.1 shows an example of a category list. 38 Category Name Color Local Relief Aspect Gradient Plan Curvature Profile Curvature Flat/low slope magenta n/a n/a [0.0, 25.0] n/a n/a Steep slope grey n/a n/a [43.5, 48.0] n/a n/a • NE facing red n/a [0,90) n/a n/a n/a SE facing green n/a [90,180) n/a n/a n/a SW facing blue n/a [180,270) n/a n/a n/a NW facing yellow n/a [270,360) n/a n/a n/a Table 6.1: Sample Classification Category List A category list is stored in a text file. The user can load the file, apply the classification to the TIN, and view the result - each triangle colored according to category. It is easy for the user to create many classifications and examine the resulting terrain segmentations, all within the same session. This flexibility allows the user to set up a category list specific to terrain type, or to explore the relation of two or more terrain parameters within their terrain model. 6.3 Comparing Parameter Values in Regions Given that the TIN has been grouped into individual regions that each belong to some user-specified category, we need a similarity metric for comparing a set of terrain parameter values within one region to a set of parameter values within another. Specifically, for each of the five parameters, we have a set of values defined on the triangles within the region rather than a single value applicable to the whole region. We could avoid dealing with a set of values for each parameter if we averaged the values of the parameter over all the triangles within a region. However, we would then lose any record of the variability over the region. Instead, we compute the minimum, maximum and mean of each of the five terrain parameters (local relief, slope aspect, slope gradient, plan and profile curvature) over all the triangles in the region. We then have 15 values defined for each region Chapter 6. Matching 39 and we can use the metric as specified in Eqn 6.23. For example, if the TIN has been classified into regions, then our "region metric" is: where i = {local relief, aspect, gradient, plan curvature, profile curvature}, j = {MIN of, MAX of, MEAN of) all triangles in the region, (7,-j is the standard deviation of the region parameter i,j over all regions, and u>ij is the user-assigned weight > 0 for parameter 6 . 4 Storing Spatial Relations: The Attributed Graph We now have a method for grouping the triangles into feature regions and comparing the terrain parameter values of two regions, but we have not yet dealt with the spatial adjacency relations of the feature regions. We need a data structure to store these spatial relations. Ideally, the same data structure would store the values associated with each region, and could be used directly for matching. We can use the "attributed graph," a commonly used data structure for shape comparison matching problems in image processing. Kitamoto et al. [KZT93] use an attributed graph to represent a NOAA satellite image (clouds/tropical storms). The graph is then compared against a database of image graphs in order to retrieve the images that most closely match the original image. Formally, an attributed graph, G = ( V, E, A, B) is: • a finite, possibly empty set of vertices (nodes) V, and • a finite, possibly empty set of edges, E, where EC V X V, and • a finite, possibly empty set of attributes A defined at each vertex (node) v € V, and • a finite, possibly empty set of attributes B defined at each edge e 6 E. 5 3 (6.24) We can create an attributed graph, G, of our classified TIN as follows: Chapter 6. Matching 40 • Create a vertex for every region in the classified TIN. These vertices become the set V. (The vertices in V are not the same as the vertices used to define the TIN - see Figure 7.4.) • Create an edge between every two vertices in V whose associated regions are adjacent in the classified TIN. These edges become the set E. (The edges in E are not the same as the edges in the TIN itself - see Figure 7.4.) • Define a set of attributes, A, to be stored at each vertex in V. Minimally, this will be the 15 values that we computed for each region (minimum, maximum and mean of {local relief, aspect, gradient, plan and profile curvature}). • Define a set of attributes, B, to be stored at each edge in E. For an edge, e = (vi, Vj), we should at least store the direction from the region represented by V{ to the region represented by VJ. This will provide additional spatial information. 6.5 Graph Matching: Graph Optimal-Monomorphism The attributed graph provides us with a data structure that stores both the region-based values and the spatial relations of the regions. As was mentioned in the previous section, the attributed graph can be used directly in matching algorithms, such as those that solve the "graph optimal-monomorphism" problem. The "graph monomorphism" problem seeks a one-to-one, incidence-preserving vertex mapping between a graph ("pattern graph") and a subgraph of another larger graph ("base graph"). This problem is also referred to as the "subgraph isomorphism" problem. If the two graphs are attributed graphs, then the "graph optimal-monomorphism" problem is to find the optimal mapping, given that a perfect match between the attributed values of the two graphs may not exist. If we store the entire classified TIN in an attributed graph (our base graph) and create another Chapter 6. Matching 41 attributed graph (our pattern graph) that represents the region of interest (and its neighboring regions), then we can use an algorithm which solves the graph optimal-monomorphism problem to perform terrain region matching. The graph optimal-monomorphism algorithm will produce an optimal mapping between the pattern graph and some other subgraph of our base graph. This other subgraph actually represents another region (and its neighbors) in the TIN, so we will have found the region that most closely matches our region of interest, and know how the region of interest and its neighbors map to the solution region and neighbors. Many algorithms have been developed to solve the graph optimal-monomorphism problem. Ghahraman et al. [GWA80] describe several algorithms based on the characterization of a graph optimal-monomorphism in terms of subgraphs of the Cartesian graph product. Shapiro and Haralick present a forward-checking algorithm for inexact graph matching [SH81], and Tsai and Fu present an error-correcting subgraph isomorphism algorithm for use in pattern recognition [TF83]. Wong et al. [WYC90] describe a decision-tree based approach to solving the graph optimal-monomorphism problem, which uses the branch-and-bound technique for searching through the decision-tree. We will use the approach of Wong et al., which is described in detail in the next section. 6.6 Graph Optimal-Monomorphism Algorithm of Wong, You and Chan The graph optimal-monomorphism algorithm presented by Wong et al. [WYC90] uses a decision-tree approach. Suppose the pattern graph has m vertices and the base graph has n vertices. Then in the decision-tree, let tree nodes be sequences of < m distinct base graph vertices. The tree node N = (bt, bs,bi) means that pattern vertex 1 is matched with base vertex bj, pattern vertex 2 with base vertex bs, pattern vertex / with base vertex 6/. A tree node JV = (bt, b%,bi-!, bi) is a child of tree node N' = (bt, bs,&/_;), thus, each node at level / has n - / children. The root of the tree is the empty sequence (). Chapter 6. Matching 42 6.6.1 Evaluation Function, f(N) Searching a decision-tree requires an evaluation function that assigns a cost to each tree node. The cost of a tree node, g*(N), is the cost of all vertices and edges that are matched between the pattern and base graphs. If N = (bj,&/), then: / g*(N) = £ > „ ( i , & , - ) + £ ce((i,j),(bi,bj)) (6.25) i = l i>l,j<l,i^j where cv(i, 6,-) is the cost of matching pattern vertex i with base vertex b{ and ce((i,j), (6,-, bj)) is the cost of matching pattern edge (i,j) with base edge (&;, bj). Ideally, the evaluation function f*(N) of a node N = (bt,6/) would tell us the minimum cost of extending the match TV to the rest of the pattern graph, as shown in the following equation: f*(N) = minbl+1_bm{g*{{b1,...,bubl+u...,bm))} (6.26) The extension cost h*(N) = f*{N)-g*(N) (6.27) is the contribution to the cost by extending the match to the pattern graph nodes / + 1 , m , which are yet unmatched. The value of /*(()) is the cost of the best match (between the pattern graph and the base graph), which is what we are trying to compute. Wong et al. define f(N), to be used instead of f*(N), as: f(N) = g*(N) + h(N) (6.28) where h(N) is a consistent lower bounded estimate of h*(N) using any available information. This is the key to their algorithm, as the closer the heuristic h(N) is to h*(N), the more efficient the algorithm will be. If h(N) < h*(N), then using f(N) instead of f*(N) will expand fewer nodes than a search algorithm that uses no heuristic information, and will always guarantee a minimum cost solution path in the decision-tree. Chapter 6. Matching 43 Wong et al. prove that for their heuristic, h(N), and a similar heuristic, h'(N) (of Tsai and Ku [TF79]), h'(N) < h(N) < h*(N) always holds, thereby making their algorithm more effi-cient. 6.6.2 Heuristic Function, h(N) Wong et al. define h(N) as follows. Let N = (bj, bz,&/) be the current node of the decision-tree at level /. Then there are sets of vertices, Np from the pattern graph, and NB from the base graph, which have already been matched: Np = { 1 , 2 , / } and NB = {bj, bz,&/} where iis matched with 6; for i = 1 , 2 , I . There are also sets of vertices Mp from the pattern graph and MB from the base graph which have not yet been matched: Let k(i,b) be the cost of appending vertex b to N, by matching pattern vertex i £ Mp to base vertex b £ MB- It is defined as: A mapping, H : Mp —> MB, is then found such that each unmatched vertex i £ Mp corresponds to the vertex b £ MB for which the cost k(i, b) is minimized. Note that the mapping //may be many-to-one. The function a(N) is the total cost of H: MP = {1+ 1,1+ 2,...,m} and MB = {b\b £ {1, 2,n}\NB}. k(i,b) = FL*((61,..,6,,6))-5*((6LL...,6,)) = c„(i, 6) + ]T ce((i,j), (b, bj)) + ce((j, i), (bj, b)) (6.29) m a(N) = ^2 min{k(i,b)\b £ MB} (6.30) i=l+i Chapter 6. Matching 44 Similarly, b(N) is defined as the total cost of the optimal mapping of edges, H' : Mp X Mp —> MB X MB-H' maps edge (i,j) with endpoints i,j G Mp to the edge (d, e) with endpoints d, e e MB for which the cost function ce is minimized. The mapping H' can also be many-to-one. b(N) is: b{N) = niin{ce((i,j),(d,e)) + ce((j,i),(e,d))\d,ee MB,d^e} (6.31) ',jeMP,i<j Finally, h(N) is defined in terms of a(N) and b(N): h(N) = a(N) + b(N) (6.32) To make use of Wong et al.'s algorithm for our terrain matching problem, we will have to define the functions cv and c e. Our definitions of these functions are presented in Section 6.6.4. 6.6.3 Algorithm in Pseudocode Finally, Wong et al. present a pseudocode version of their algorithm: • Initiation: - upper-bound = oo - tentative solution = {} (empty set) - NB - {} (empty set) - f(N) = 0 - active-node Jist — {(NB,/{N))} (root node only) • Main process: - WHILE (active.node Jist ^ {}) DO (node expansion) * current-node = first element of active-node-list * NB = first element of current-node (i.e., in (NB,f(N))) Chapter 6. Matching 45 number of elements in N'B * + 1 * Mp = {1 + 1,1+2,m) (m = size of pattern graph) M' = {1, 2,n}\N'B (n = size of base graph) * FOR each element s in M'B DO • bi = s • NB = N'B U bi (append 6/ to N'B) • MB = M'B\bi • f(N)=g*(N) + a(N) + b(N) • IF (/ m) THEN Insert (Ns,f(N)) in active-nodeJist in ascending order according to value of f(N). (new node generated) • ELSE IF (/(TV) < upper .bound) THEN upper-bound = f(N), tentative solution = {NB} * ENDFOR * active-node Jist = active-node-list — current-node (current-node is replaced by active sons) * Remove any element from active-nodeJist whose f(N) value is > upper-bound. (prune unpromising nodes) - ENDWHILE - Output optimal solution = {(i, b{)\i = 1,2,m} where 6; is the ith element of the tentative solution. - Output optimal-cost = upper-bound. Chapter 6. Matching 46 6.6.4 Our Cost Functions for Wong, You and Chan's Algorithm To make use of Wong et al.'s algorithm [WYC90] as described in the previous sub-section, we will have to define two functions: cv(p, b) and ce((pa,pb), (bc, bd)). cv(p, b) is the cost of matching vertex p from the pattern graph to vertex 6 of the base graph. ce((pa, pb), (bc, bj)) is the cost of matching the directed edge (pa, pb) from the pattern graph to the directed edge (bc, bd) of the base graph. In the definition of cv(p, b), we finally use the attributes that we have stored at the vertices (nodes) in the pattern and base graphs. Remember that a node in either graph represents a region. In Section 6.4, we suggested that for every node, we should at least store the 15 values that we computed for the associated region (minimum, maximum and mean of {local relief, aspect, gradient, plan and profile curvature}). We do store these 15 values, and one additional value, the total area of the region. We then define cv(p, b) using our region metric from Section 6.3, with the addition of the area of the region. The resulting definition of cv(p, b) is: • i = {local relief aspect, gradient, plan curvature, profile curvature}, • j = {MIN of, MAX of, MEAN of} all triangles in the region, • o~ij is the standard deviation of the region parameter i,j over all regions (i.e., all nodes in the base graph), • U{j is the user-assigned weight > 0 for parameter i,j, • parameter parea is the area of the pattern graph region, (6.33) where: Chapter 6. Matching 47 • parameter barea is the area of the base graph region, • f a r e a is the standard deviation of region area over all regions, and • u?area is the user-assigned weight > 0 for the region area parameter. The definition of ce((pa,pi,), (bc, bd)) involves the attributes that we have stored at the edges in the pattern and base graphs. As we suggested in Section 6.4, for edge (d, e) in either graph, we store the angle from the centroid of the region associated with node d to the centroid of the region associated with node e. This angle is in degrees clockwise from north. We then define ce{{pa, Pb), (bc, bj)) as: Ce((Pa,Pb),(bc,bd)) = U""9le\(Pa,Pb) angle ~ (K, bd) a n g l e | ( g 3 4 ) 0~ angle where oang\e is the standard deviation of the angles between the centroids of adjacent regions in the classified TIN (base graph), and u>angie is the user-assigned weight > 0 for the angle parameter. Chapter 7 Our System In this chapter we describe our prototype topographic feature matching system. In Section 7.1 we provide an overview of the system, including the software tools used to implement it. Sec-tion 7.2 describes in detail each of the steps of using the system, from reading in the set of irregular (x, y, z) terrain data points through to viewing the results of a match. We use a small set of simulated terrain data (it was sampled from DEM data for Butte, Montana, USA) to illustrate this. Finally, in Section 7.3 we discuss the strengths and weaknesses of the system. 7.1 Overview We have implemented a system that does the following: • Takes as input a set of irregularly spaced elevation data and creates a TIN using the Delaunay triangulation of the data points. • Computes local relief, slope aspect, slope gradient, and plan and profile curvature for each triangle. These terrain parameters, in conjunction with spatial relations, form the basis of our definition of topography. • Applies a user-defined classification system in order to group triangles into regions. Each triangle in the region belongs to the same category. A category is defined in terms of ranges of the five terrain parameters from the previous step. The classification is flexible - the user defines a list of category names and associated parameter ranges. 48 Chapter 7. Our System 49 • Computes the minimum, maximum and mean of {local relief, aspect, gradient, and plan and profile curvature} over all the triangles in each region. These 15 attributes, in addition to the region area, are stored for each region. • Allows the user to select one of these regions as the target region for similarity matching. The user can weight the importance of matching the attributes for that region, as well as the importance of matching adjacency relations and attributes of adjacent regions. • Encodes the region of interest, adjacent regions, and user-assigned importance weights as an attributed graph (pattern graph), and the classification of the entire terrain as another attributed graph (base graph). • Finding a match of the region of interest to another region within the terrain model is then an inexact subgraph matching problem (i.e., optimally match pattern graph to subgraph within base graph). We use a decision-tree-based algorithm for graph optimal-monomorphism to do the inexact subgraph matching. 7.1.1 Implementation Tools Our system was implemented in C and C++ on an SGI workstation, with use of the following libraries: • XForms [Z095] for the graphical user interface, • OpenGL [NDW93] for the 3D presentation of the terrain surface, and 2D presentation of the pattern and base graphs, and • LEDA [NU] for the C++ attributed graph class and various list classes. Chapter 7. Our System 50 7.2 Steps in Using the System In this section, we describe each of the steps in using the system, one step per sub-section. We have created a small set of simulated terrain data (sampled from DEM data for Butte, Montana, USA) that we will use to illustrate the steps. This set of data consists of 25 (x, y, z) points where the range in the (x, y) plane is approximately 7 kilometers and the range in elevation (z) is approximately 400 meters. Please note that in the following sub-sections, rather than repeat the details of computational or design decisions that have already been discussed, we refer the reader to the appropriate sections in previous chapters. 7.2.1 Delaunay Triangulation The user first specifies a file of input points, which contains a list of arbitrary (x, y) points with elevations (z). We compute the Delaunay triangulation (see Section 2.2.1) of the input points using an incremental algorithm as outlined in [GS85], with an improved "locate" procedure (to avoid an infinite loop) as found in the C++ source code of Garland's "scape" software [GH95]. We store the triangulation in a quad-edge data structure (see Section 2.2.2). The user can display the points and the TIN in 2D or 3D (i.e., with elevation) and rotate, translate, or scale the view. The user can also pick a triangle or point with the mouse and examine the coordinates of the vertices or the point. Figure 7.1 shows the application window with the TIN of the sample terrain dataset. Chapter 7. Our System Figure 7.1: Delaunay Triangulation of Sample Terrain Dataset Chapter 7. Our System 52 7.2.2 Computing Terrain Characteristics We then compute the following five terrain parameters for every triangle: local relief, slope aspect, slope gradient, plan curvature, and profile curvature (see Section 4.4). We use a com-bination of planar equations and natural neighbor interpolation to obtain these values (see Section 5.4). The user can select one of the five parameters, such as aspect, and a color scheme with associated ranges, in order to display the terrain colored by aspect value. Figure 7.2 shows the TIN colored by aspect value. Terrah fttlUOIfllrilBflBONJ IM*ION(T)| I diss Off I IftUqOtt | I toj* Off I | H i t 3 a » g | Erth^ jfcj \ ?Da, Beginrtig read ot e&ssfficatBrr system... mated read of daasfscattoo system... Data wad time: 0.0T778S *seccnd3. Figure 7.2: Sample Terrain Colored by Aspect Chapter 7. Our System 53 7.2.3 Region Classification The user specifies a file containing a classification system. This is a text file that contains a list of categories, where each category consists of: ranges of each of our five terrain parameters, a name, and an associated color. Table 7.2 shows the category list that we used for the sample terrain data. Category Name Color Local Relief Aspect Gradient Plan Curv. Profile Curv. East facing/flat slope red n/a [0.0, 180.0) [0.0,5.0) n/a n/a East facing/other green n/a [0.0, 180.0) [5.0,100.0) n/a n/a West facing/flat slope blue n/a [180.0,360.0) [0.0,5.0) n/a n/a West facing/other magenta n/a [180.0,360.0) [5.0,100.0) n/a n/a Table 7.2: Categorization Applied to Sample Terrain The classification system from the file is applied to the TIN to group the triangles into more meaningful regions. The triangles in each region belong to one category, and are colored ac-cording to the category color (as found in the file). The user is shown the resulting colored classification of the TIN, with the boundaries of the regions outlined in black. Figure 7.3 shows the TIN classified into regions according to the categories of Table 7.2. 7.2.4 Building the "Base Graph" After applying the user's classification, the terrain has been classified into regions, and an attributed graph (see Section 6.4) of the entire terrain can be built. This graph is referred to as the "base graph", and it must be rebuilt every time the user applies a new classification system. To build the base graph, each region in the classification is added as a node. An edge is constructed between two nodes in the graph if the two regions represented by nodes are adjacent Chapter 7. Our System Figure 7.3: Sample Terrain Classified into Regions Chapter 7. Our System 55 in the terrain model. At each node, the following information is stored: • the total area of the region, • the minimum, maximum and mean, of each of the five terrain parameters (local relief, aspect, gradient, plan curvature, and profile curvature) over all the triangles that comprise that region. At each edge, e = (a, b), the following information is stored: • the angle between the centroid of the region represented by node a and the centroid of the region represented by node b (in degrees CW from north). After construction of the base graph, the user can display the edges of the graph as an overlay on the region classification. Figure 7.4 shows the base graph edges displayed as an overlay on the TIN. 7.2.5 Target Region Selection The user can now select one of the classified regions in the TIN as the region of interest (i.e., region to be matched). At this time, they are limited to selecting a single region. 7.2.6 Building the "Pattern Graph" Once the user has selected a region of interest, an attributed graph that represents the region and its neighborhood is built. This graph is referred to as the "pattern graph," and it must be rebuilt every time the user selects a new region of interest. To build the pattern graph, a node is created that represents the region of interest. This is the main node. Additional nodes are added, one for each region that is adjacent to the region of Chapter 7. Our System m Figure 7.4: Sample Terrain Base Graph Chapter 7. Our System 57 interest in the terrain model (base graph). An edge is then added in between the main node (representing region of interest) and every other node. The type of information stored at the nodes and edges is identical to that in the base graph. After construction of the pattern graph, the user can display the edges of the graph as an overlay on the region classification. They can display the base graph, the pattern graph or both as overlays. As well, they can select a node or edge in the pattern graph and examine the stored attribute values. Figure 7.5 shows the pattern graph edges displayed as an overlay on the TIN. The numbers displayed are the pattern graph node numbers. Figure 7.5: Sample Terrain Pattern Graph Chapter 7. Our System 58 7.2.7 User-Assigned Weights and Pattern Graph Modifications At this point, the initial pattern graph has been constructed, and all node and edge attributes have been assigned a weight of 1. Note that there is a separate weight associated with each attribute at each node in the pattern graph, as opposed to one weight per node attribute for the whole pattern graph. Similarly, there is a separate weight associated with each edge attribute at each edge in the pattern graph. This allows the user more flexibility in the matching process. The user can proceed with the matching using the default pattern graph (the weights for all nodes' and edges' attributes set to 1). Alternatively, the user can modify any or all of the weights associated with each node and edge attribute in the pattern graph. The weights must all be > 0. For example, the user can exclude a specific attribute at a specific node or specific edge from being considered in the matching by setting the corresponding weight to be 0. If the user wants to ensure that a node's or edge's attribute value is matched as closely as possible, then the user should set the weight for that attribute to be some very large number (in comparison to the weights for other node's and edge's attributes). The actual value can be modified as well - by default it is the value computed for that node or edge in the pattern graph. By changing the value, the user can see how sensitive the matching is to this particular attribute. The user can proceed with the matching using the pattern graph with modified weights and attribute values, or they can make further modifications to the pattern graph specification. Suppose that for some node's or edge's attribute, instead of equalling some value, it is critical that the attribute in question should lie within a specific range. The user can then assign a range ([min,max]) to that attribute instead of a single value, and set the attribute's associated weight to a large number. Chapter 7. Our System 59 7.2.8 Graph Optimal-Monomorphism Algorithm Given the pattern graph and the base graph, we can now apply the graph optimal-monomorphism algorithm of Wong et al. [WYC90], as we outlined in Section 6.6. However, we must make a slight modification to the node and edge cost functions from Section 6.6.4 to account for the individual weights for each node's and edge's attributes, and the possible range specification for an attribute. Therefore, the final node cost function, c„, is: C v { p b ) = ^Jjp'area ~ K r e a ) \ - b,)\ ( ? ^ Oarea ! = j °~i where i = {(min of, max of, mean of) X (relief, aspect, gradient, plan curvature, profile curvature)}, and p[ is either the user-specified value or the median of the user-specified range for attribute i of node p of the pattern graph. The final edge cost function, c e, is: w , , u ^(p v) \((pa,Pb)'angle ~ (bc, bd)angle)\ ,„ Ce{{Pa,Pb)(bc,bd)) = 9- (7.36) ®angle where {pa, Pb)'angie l s either the user-specified value or the median of the user-specified range for the angle for edge [pa,Pb) of the pattern graph. In this example, we set the attributes and weights in the pattern graph as follows (refer to Figure 7.5 for the pattern graph node numbers): • Node 1: mean aspect: 9.04 (default computed value), weight 1.0. • Node 1: mean gradient: 3.8 (default computed value), weight 1.0. • Node 2: mean aspect: 38.0 (default computed value), weight 1.0. • Node 2: mean gradient: 6.5 (default computed value), weight 1.0. • Node 4: mean aspect: 9.6 (default computed value), weight 1.0. Chapter 7. Our System 60 • Node 4: mean gradient: 5.7 (default computed value), weight 1.0. • Edge from node 1 to node 4: angle 347.8 (default computed value), weight 1.0. • Edge from node 1 to node 2: angle 253.6 (default computed value), weight 1.0. • All other node and edge weights were set to 0.0 (i.e., not to be considered in the match). After the matching algorithm has completed, the user can display the "results graph" (subgraph of the base graph that matches the pattern graph) with dashed lines connecting the matched nodes between the results graph and the pattern graph. Figure 7.6 shows the matching graphs for our sample terrain data problem. Pattern graph rade Z Sase graph K J * Pattern graph rsee 3, Base graph rode 4 Pattern graph nede 4, Base graph oadF 1 ? S o M o n c o s t . 3.235131 Figure 7.6: Sample Terrain Matching Results The total cost of the match (as computed by the matching algorithm) is printed. The lower the cost, the better the match between the pattern graph (region of interest) and the results Chapter 7. Our System 61 graph (another region in the TIN), with a cost of 0 denoting a perfect match. The user can also re-run the matching algorithm to find the next best match, repeating this until there are no more feasible matches remaining. 7.3 Discussion: Strengths and Weaknesses of the System Based on our experimentation with the simulated terrain data, we have already identified several strengths and weaknesses of the system. The main strength of the system is its flexibility: • It makes no assumption about the significance of one data point over another in the initial set of terrain data points. • The user can examine the values of the various terrain parameters on a per triangle basis, or on a per region basis. • The user can easily create many different classification systems to be used to segment the TIN into regions. This allows the user to create a classification system specific to a terrain type and to explore the interaction of two or more terrain parameters within their terrain model. • In the matching process, the user selects a region of interest and can then modify the pattern graph prior to the match. For example, the user can modify an attribute value stored at an edge or node in the pattern graph or they can replace it with a range specification. It may be important to the user that an attribute at some specific node (region) in the pattern graph lie within a range as opposed to being exactly equal to some value. • Using individual importance weights for each node's and edge's attributes, the user can control the relative importance of the attributes. If the user wants to completely exclude Chapter 7. Our System 62 a certain node's or edge's attribute from being considered in the match, they can set the associated weight to 0. If it is critical that an attribute be equal to a value or a range in the results graph (matched region), then the user can set the associated weight to a very large number (relative to other weights). • The user can re-run the matching algorithm to find the next best match. This process can be repeated until the system reports that there are no more feasible matches remaining. • After performing a match, the user can retain the current pattern graph structure (i.e., region of interest), but change weights and/or attribute values/ranges and re-run the matcher. This will allow them to see how sensitive the matching result is to a particular attribute. Unfortunately, the flexibility of the system is also its main weakness. In creating any system, there is always a balance between what the user controls and what the system controls. The flexibility in our system introduces the following problems: • Making no assumptions about the initial set of terrain data points means that the onus is on the user to supply a reasonable set of data. For example, they may wish to perform a data-dependent triangulation to reduce the size of their input set, and to ensure that the points they keep are topographically significant. If the data set has noise, then the user must correct for this outside our system. As an aside, the user may wish to import the TIN directly, rather than a list of data points. • Allowing the user to create a classification system for region segmentation means that the user must spend time experimenting with ranges of terrain parameters to get an appropriate classification. (To some extent, the use of a classification system presupposes that the user knows what the range of values will be in the piece of terrain that they are studying.) Users may have trouble defining suitable ranges for some of the lesser known terrain parameters, such as local relief and plan and profile curvature. Chapter 7. Our System 63 It may be useful to provide the user with a library of classification systems that are applicable to various types of terrain. They could then select a default classification, another from the library, or one of their own. The classification system still does not completely solve the problem of matching topographic features, such as peaks or pits. It is probably not possible to create a category list based on ranges of our five terrain parameters that will divide the TIN into regions where each topographic feature lies entirely within one region. (It may be possible to segment the TIN in this way with a category list based on distributions (rather than ranges) of the five terrain parameters, but this was not investigated.) • The system allows the user to modify attribute weights, values and ranges. The user can then explore the effects that the weights and values have on the match results. However, it is difficult for the user to quantify the effects without running the matching algorithm repeatedly themselves. It may be useful for the user to be able to ask in a single query: for a specific node's or edge's attribute, what is the range of values that will produce the same match result? One of the reasons that it is difficult for the user to get a sense of the consistency of a set of weights is that the weights are potentially unbounded. Normally, weights are set to some value between 0 and 1 (0 turns associated attribute off, and 1 is the default weight), but recall that the user can set the weight to an arbitrarily large number if it is critical that the associated attribute equal a certain value or range. It would be better if the weights were always bounded. This could be done by only allowing the user to set weights in the range of 0 to 1, or still allowing arbitrarily large weights, but always normalizing the range of weights set by the user to the range of 0 to 1 before running the matching algorithm. In addition to the weaknesses mentioned above, there are some further limitations: • Our set of five terrain parameters includes local relief (change in elevation over some Chapter 7. Our System 64 area). This is a reasonable parameter if the terrain model used is a DEM, as local relief is always computed over the same sized area - a grid cell. In a TIN, this may not be a useful parameter. For example, if a user supplies a "good" TIN (one that has few points where the terrain surface is flat and more points in rough surface areas), then every triangle in the TIN will have approximately the same local relief value. This is because the large elevation change in the rough areas will be spread over many triangles (more TIN points in these areas), so the actual change in elevation within a triangle will still be low. • The user can only select a single region from the classified TIN as the region of interest. They may wish to select two or more adjacent or non-adjacent regions. Allowing for more than one region would help with the problem of trying to match topographic features (such as peaks, pits, ridges), where the classification did not ensure that each such topographic feature was wholly contained within a single region. • The system can only load one terrain model at a time, which means that the region of interest (pattern graph) and the matched region (results graph) are both in the same terrain model (base graph). It is an easy extension to allow more than one terrain model to be loaded. • The attribute stored at an edge in the pattern or base graph is the absolute angle between the centroids of the regions represented by the endpoints of that edge. This causes two problems: the centroid of the region may actually lie outside the region for a non-convex region, and the match cannot be done in a rotation invariant way. Instead of using the centroid, some other point known to always be in the region polygon could be used, such as the center of the largest in-circle. The solution to the second problem is to compute the relative angles between the edges in the pattern graph, and store these angles instead of the absolute angle between the endpoints' regions. In addition to these problems, the angle between the regions may not even be the best attribute to store at an edge. The distance between regions may be more topographically er 7. Our System 65 significant, or it may be that no attribute is required - the edge itself (which captures the fact that the two regions are adjacent) may be sufficient. The classification system may produce many small regions, which means that it is likely that there exists a region, r, which is adjacent to all of these small regions. This causes problems for the graph matching algorithm if region r is chosen as the region of interest, as the pattern graph main node will have a high degree. Since the matching algorithm looks to match every edge in the pattern graph with an edge in the results graph (i.e., the pattern graph edge cannot be matched to a null edge in the results graph), it is likely that no match will be found (no node in the base graph has such a high degree as the main node of the pattern graph). The solution is to use a different classification system. An extension to the matching algorithm could also be made to allow edges in the pattern graph to be mapped to null edges in the results graph. This would require that our cost functions include the cost of a missing edge and a missing node. It is not clear if this cost should be user or system-specified. The category that a region belongs to is not used as part of the matching process. The user may wish to include this. This would require that a metric either be developed or supplied by the user for comparing feature categories. For example, is category 1 the same "distance" from category 2 as it is from category 3? The user can use a very large weight on a node's or edge's attribute to denote that it is critical that the value be matched as closely as possible in the results graph. This may not always work (and would still not always work even if we made the weights be bounded as discussed in our list of weaknesses above). For example, it is still possible that weighting that attribute higher than others may not choose the best match of that attribute, as a worse match on that attribute may allow for a lower cost on all other nodes or edges. The solution would be to provide constrained matching, where a node's or edge's attribute can be considered mandatory (i.e., we want a perfect match). Forcing a perfect match Chapter 7. Our System 66 may mean that there is no feasible solution, but the lack of a match may be exactly the information that the user is after. • The list of attributes that the user works with includes the minimum, maximum and mean of the five terrain parameters over the triangles in a region. Minimum and maximum reflect the individual extremes, and given that the limits of a classification category are set to the minimum and maximum (see Table 7.2), using them to compare regions is less information "rich" than using a measure such as the standard deviation (of the terrain parameter over the triangles in the region). • Our region metric involves dividing by the standard deviation (see Eqn. 6.24) of the attribute. The two attributes aspect (vertex attribute, in degrees, in range [0,360]) and angle (edge attribute, in degrees, in range [0,360]), are not correctly treated as circular attributes. The result is that the mean and standard deviation of aspect and angle are incorrectly computed if the range of values over a region is greater than 180 degrees, or the range crosses the 0/360 degree line. Chapter 8 Summary, Conclusions and Future Work 8.1 Summary and Conclusions The primary goal of this thesis was to develop a topographic feature matching system that: • Based the definition of a topographic feature on more than just a single attribute, such as elevation, or Gaussian curvature. • Allowed for matching of regions in terms of both topographic attributes and topographic features. • Supported interactive queries regarding the similarity of one region in the D T M to another. • Used a triangulated irregular network (TIN) as the type of D T M so that we could inves-tigate how this affected computation of topographic attributes. We reviewed the varying definitions of topography and shape as found in computer vision, dif-ferential geometry and geomorphometry. We found varying notions of the set of key topographic attributes, which seemed to depend upon the researcher's field of study. We also found that the definition of a topographic feature, such as a ridge, was dependent both upon the field of study, and the type of digital terrain model being used (e.g., raster, contour or triangle-based). Based on our literature review, we chose a set of easily computable terrain attributes that formed our working definition of topography: local relief, slope aspect, slope gradient, and plan 67 Chapter 8. Summary, Conclusions and Future Work 68 and profile curvature. We realized that to define a topographic feature, such as a ridge, in terms of ranges of these five attributes may be too naive of an approach (distributions instead of ranges may have worked, but this was not investigated). Rather than create some simplistic definition of each feature, we allowed the user to create classification categories based on ranges of our five terrain parameters, and apply this classification to the terrain to segment it into regions. This provided the user with much greater flexibility. They could experiment with different classification systems to determine which was most applicable to a certain type of terrain. They could also investigate the interactions of two or more terrain attributes in their terrain model. We only allowed for matching based on topographic attributes, not on topographic features. This was due to our difficulty in creating reasonable definitions of features. However, our matching based on attributes included: the topographic attributes defined at a region, the at-tributes defined at the region's neighbors, and the spatial relation of the region to its neighbors. (The spatial relation included both adjacency and the angle between the two regions, although it is questionable as to whether the angle was really useful. Distance may have been more topographically significant.) We supported interactive queries regarding the similarity of one region to another via a graph optimal-monomorphism algorithm, which finds the optimal match of an attributed "pattern graph" to a subgraph of a larger attributed "base graph". The user could select a region of interest from the terrain (with the mouse) and it and its adjacent regions were encoded as the attributed pattern graph. Using the entire terrain encoded as an attributed base graph then allowed the graph matching algorithm to be used. We used the graph optimal-monomorphism algorithm of Wong et al. [WYC90], with some modifications to the cost functions. This algorithm is a decision-tree approach that requires two cost functions: the cost of matching node a in the pattern graph to node b in the base graph, and the cost of matching edge (at, as) in the pattern graph to edge (bt, 6g) in the base Chapter 8. Summary, Conclusions and Future Work 69 graph. We reviewed metrics for comparing attributes between triangles and between regions in the terrain. We defined the two required cost functions based on these metrics. The graph optimal-monomorphism algorithm was tricky to implement correctly - translating (correctly) the pseudocode algorithm of Wong et al. [WYC90] along with our modifications into C and C++ code was time consuming. Our system used a TIN as the data structure for the terrain model. The effect of using a TIN instead of a DEM was that it was more difficult to compute two of our terrain attributes: plan curvature and profile curvature. These are second derivatives of elevation, and the second derivatives are undefined for a triangular facet. Our solution was to use a natural neighbor interpolant to derive the plan and profile curvature at the in-center of the triangle. Even in 2D, natural neighbor interpolation is a complicated algorithm. If we had used a DEM, we would have had to use some surface fitting method to obtain the curvatures. Another point of concern regarding the use of a TIN was the area over which one should compute the terrain attributes. For example, should aspect or gradient be computed over an area larger than a triangle? We reviewed the literature and found that there was no consensus, and that computing the attributes on a per-triangle basis was no more arbitrary than other methods (although contrasted with computation in a DEM it is quite different since an attribute is computed over areas of equal size and spacing in a D E M (i.e., the grid cells)). Given that we computed the attribute values over triangles, local relief (change in elevation for a given area) may not have been a useful terrain attribute. If a "good" TIN is used (few points in flat areas, more points in rough areas), then the value of local relief for every triangle will be similar. In Chapter 1 we stated that we also wanted to investigate three more general issues during the development of this system: • What is a good operational definition of "topographically similar"? Chapter 8. Summary, Conclusions and Future Work 70 • What is an appropriate data structure and associated algorithm for performing similarity matching of terrain regions within a TIN terrain model? • How can we assist the user in identifying a region of interest (i.e., a region to be matched) in the terrain model? Our choice of the five terrain attributes combined with spatial relations of regions was not a particularly good definition of topographically similar. Several reasons for this are: • It was difficult for us to compute two of the attributes in a TIN (plan and profile curva-ture). (Computing attributes on a per-triangle basis in a TIN also meant that the terrain attribute local relief was not as meaningful as it is for a DEM.) • We were: unable to provide matching based on topographic features as we could not formulate reasonable definitions of features using just ranges of these five attributes (dis-tributions instead of ranges may have been sufficient, but this was not investigated). The attributed graph seems like a natural data structure for storing regions and their spatial relations in the terrain. Unfortunately, it was relatively costly space-wise in our system as we created the TIN using a quad-edge structure, and then created the base graph on top of this. There may be a more compact means of storing the triangles, regions, and attributes. The efficiency of the graph optimal-monomorphism algorithm relies on two good cost functions, based upon attributes that are good discriminators (i.e., two regions that are not alike should have differing values of these attributes). It is not clear if our five terrain attributes, or the cost functions that we derived from them, were good discriminators. Using the graph matching algorithm also meant that the user was very aware of the data structure being used to store regions and spatial relations. The user may not want to have to understand the workings of an attributed graph. Chapter 8. Summary, Conclusions and Future Work 71 The user-controlled classification system provided the user with a flexible means for segmenting the terrain into regions. This is still not the best way to help the user choose a region of interest for matching. The user would most likely want the terrain segmented into regions representing topographic features, such as peaks or pits. 8.2 Suggestions for Future Work There are several issues outstanding from the development of our system. In addition to address-ing the weaknesses and limitations outlined in Section 7.3, work could be done to investigate the following: • The issue of the scale over which to compute terrain attributes. • What is the most meaningful way to report the closeness of a match to the user? At the moment, the optimal cost as found by the matching algorithm is reported, but how does the user compare two overall cost numbers? • Query by example should be supported. This would allow the user to refine a query by adding in more information from a matched result. • How does simplification of the TIN affect both the classification and the matching process? • How does the graph matching method compare to other matching methods? It would be interesting to compare manual methods vs. graph searching without attributes (spatial relations only) vs. graph searching with attributes (our system). Bibliography [Ban86] Lawrence E. Band. Topographic partition of watersheds with digital elevation mod-els. Water Resources Research, 22(l):15-24, 1986. [Bau75] B. G. Baumgart. A polyhedron representation for computer vision. In Proc. AFIPS Natl. Comput. Conf., volume 44, pages 589-596, 1975. [BB82] Dana H. Ballard and Christopher M . Brown. Computer Vision. Prentice-Hall Inc., 1982. [Dik89] Richard Dikau. The application of a digital relief model to landform analysis in geomorphology. In Jonathan Raper, editor, Three Dimensional Applications in Ge-ographic Information Systems, pages 51-77. Taylor & Francis, Ltd., 1989. [Dou86] David H. Douglas. Experiments to locate ridges and channels to create a new type of digital elevation model. Cartographica, 23(4):29-61, 1986. [Eva72] Ian S. Evans. General geomorphometry, derivatives of altitude, and descriptive statistics. In R. J. Chorley, editor, Spatial Analysis in Geomorphology, pages 17-90. Methuen & Co Ltd, 1972. [Fay96] Reda Ezzat Fayek. Preserving topography in 3D data compression for shape recog-nition. In International Archives of Photogrammetry and Remote Sensing, Volume XXXI, Part B3, XVIIIISPRS Congress, pages 186-191, Vienna, Austria, 1996. [FS91] Bianca Falcidieno and Michela Spagnuolo. A new method for the characterization of topographic surfaces. International Journal of Geographical Information Systems, 5(4):397-412, 1991. [GH84] Gene V. Glass and Kenneth D. Hopkins. Statistical Methods in Education and Psychology. Prentice-Hall, New Jersey, 1984. [GH90] P. K. Garg and A. R. Harrison. Quantitative representation of land-surface morphol-ogy from digital elevation models. In Proceedings of the 4th International Symposium on Spatial Data Handling, pages 273-282, Zurich, July 1990. [GH95] Michael Garland and Paul S. Heckbert. Fast polygonal approximation of terrains and height fields. CMU-CS-95-181, School of Computer Science, Carnegie Mellon University, September 1995. 72 Bibliography 73 [GHL88] Dmitry B. Goldgof, Thomas S. Huang, and Hua Lee. Feature extraction and terrain matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 899-904, 1988. [GS85] L. J. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM Transactions on Graphics, 4:74-123, 1985. [GU93] Linda H. Graff and E. Lynn Usery. Automated classification of generic terrain fea-tures in digital elevation models. Photogrammetric Engineering & Remote Sensing, 59(9):1409-1417,1993. [GWA80] David E. Ghahraman, Andrew K. C. Wong, and Tung Au. Graph optimal monomor-phism algorithms. IEEE Transactions on Systems, Man, and Cybernetics, SMC-10(4):181-188, April 1980. [Ham58] Edwin H. Hammond. Procedures in the descriptive analysis of terrain. U.S. Office Naval Research, Geography Branch, Contract Nonr 1202(01), Project NR 387-015, Final Report, pages 85 + appendices (72 p.), 1958. [Ham64] Edwin H. Hammond. Analysis of properties in land form geography: An appli-cation to broad-scale land form mapping. Annals of the Association of American Geographers, 54(1):11-19, 1964. [HB82] Richard G. Heerdegen and Max A. Beran. Quantifying source areas through land surface curvature and shape. Journal of Hydrology, 57:359-373, 1982. [10X92] M. A. Ireton, J. P. Oakley, and C. S. Xydeas. An hierarchical classification method and its application in shape representation. In Albert A. Jamberdino and Wayne Niblack, editors, Proceedings, SPIE, Image Storage and Retrieval Systems, volume 1662, pages 154-165, February 1992. [JD88] S. K. Jenson and J. O. Domingue. Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engi-neering & Remote Sensing, 54(11):1593-1600, 1988. [JWM90] Norman L. Jones, Stephen G. Wright, and David R. Maidment. Watershed de-lineation with triangle-based terrain models. Journal of Hydraulic Engineering, 116(10):1232-1251, 1990. [Kas89] T. Kasvand. Analytic primitives in digital elevation models. In Digest of the 1989 In-ternational Geoscience and Remote Sensing Symposium (IGARSS), volume 3, pages 1214-1219, July 1989. [KK94] In So Kweon and Takeo Kanade. Extracting topographic terrain features from elevation maps. Computer Vision, Graphics and Image Processing (CVGIP), Image Understanding, 59(2):171-182, 1994. Bibliography 74 [KZT93] Asanobu Kitamoto, Changming Zhou, and Mikio Takagi. Similarity retrieval of NOAA satellite imagery by graph matching. In Wayne Niblack, editor, Proceedings, SPIE, Storage and Retrieval for Image and Video Databases, volume 1908, pages 60-73, February 1993. [Mab68] J. A. Mabbutt. Review of concepts of land classification. In G. A. Stewart, editor, Land Evaluation, pages 239-250. The Macmillan Company of Australia Ltd., 1968. [Mar75] David M. Mark. Geomorphometric parameters: A review and evaluation. Ge-ografiska Annaler, 57 A(3-4):165-177, 1975. [Mar78] David M. Mark. Concepts of "data structure" for digital terrain models. In Amer-ican Society of Photogrammetry, Proceedings of the Digital Terrain Models (DTM) Symposium, pages 24-31, 1978. [MGL93] I. D. Moore, R. B. Grayson, and A. R. Ladson. Digital terrain modelling: a review of hydrological, geomorphological and biological applications. In K. J. Beven and I. D. Moore, editors, Terrain Analysis and Distributed Modelling in Hydrology, pages 7-34. John Wiley & Sons, 1993. [MH93] H. Mitasova and J. Hofierka. Interpolation by regularized spline with tension: II. Application to terrain modeling and surface geometry analysis. Mathematical Geol-ogy, 25:657-669,1993. [NBE+93] W. Niblack, R. Barber, W. Equitz, M. Flickner, E . Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin. The QBIC project: Querying images by content using color, texture and shape. In Wayne Niblack, editor, Proceedings, SPIE, Storage and Retrieval for Image and Video Databases, volume 1908, pages 173-187, February 1993. [NDW93] Jackie Neider, Tom Davis, and Mason Woo. OpenGL Programming Guide. The Official Guide to Learning OpenGL, Release 1. Addison-Wesley, 1993. [NU] Stefan Naher and Christian Uhrig. The LED A User Manual, Version R3.3. Fach-bereich Mathematik und Informatik, Martin-Luther Universitat, Halle-Wittenberg, Halle, Germany and Max-Planck-Institut fur Informatik, Saarbriicken, Germany (URL ftp://ftp.mpi-sb.mpg.de/pub/LEDA/MANUAL-341.ps). [01177] C. D. Oilier. Terrain classification: Methods, applications and principles. In John R. Hails, editor, Applied Geomorphology, pages 277-316. Elsevier, 1977. [OM84] J. F. O'Callaghan and D. M. Mark. The extraction of drainage networks from digital elevation data. Computer Vision, Graphics and Image Processing, 28:323-344,1984. [PD75] Thomas K. Peucker and David H. Douglas. Detection of surface-specific points by local parallel processing of discrete terrain elevation data. Computer Graphics and Image Processing, 4:375-387, 1975. Bibliography 75 [Pfa76] J. L. Pfaltz. Surface networks. Geographical Analysis, 8:77-93, 1976. [PFLM78] T. K. Peucker, R. J. Fowler, J. J. Little, and D. M. Mark. The triangulated irregular network. In American Society of Photogrammetry, Proceedings of the Digital Terrain Models (DTM) Symposium, pages 516-532, 1978. [Pik88] Richard J. Pike. The geometric signature: Quantifying landslide-terrain types from digital elevation models. Mathematical Geology, 20(5):491-511, 1988. [SA69] D. A. Sharpnack and G. Akin. An algorithm for computing slope and aspect from elevations. Photogrammetric Engineering and Remote Sensing, 35:247-248, 1969. [SB90] Michael J. Swain and Dana H. Ballard. Indexing via color histograms. In IEEE Pro-ceedings, 3rd International Conference on Computer Vision, pages 390-393, 1990. [SBM95] Malcolm Sambridge, Jean Braun, and Herbert McQueen. Geophysical parameter-ization and interpolation of irregular data using natural neighbours. Geophysical Journal International, 122:837-857, 1995. [SH81] L. G. Shapiro and R. M. Haralick. Structural descriptions and inexact matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-3(5):504-519,1981. [Sha80] Linda G. Shapiro. A structural model of shape. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2(2):111-126, September 1980. [Sib80] R. Sibson. A vector identity for the Dirichlet tessellation. Mathematical Proceedings, Cambridge Philosophical Society, 87:151-155, 1980. [Sib81] R. Sibson. A brief description of natural neighbor interpolation. In V. Barnet, editor, Interpreting Multivariate Data, pages 21-36. Wiley, Chichester, 1981. [Ski89] Andrew K. Skidmore. A comparison of techniques for calculating gradient and aspect from a gridded elevation model. International Journal of Geographical Information Systems, 3(4).-323-334, 1989. [Spe68] J. G. Speight. Parametric description of land form. In G. A. Stewart, editor, Land Evaluation, pages 239-250. The Macmillan Company of Australia Ltd., 1968. [Taj81] S. J. Tajchman. On computing topographic characteristics of a mountain catchment. Canadian Journal of Forest Research, 11:768-774, 1981. [TF79] W. H. Tsai and K. S. Fu. Error-correcting isomorphism of attributed relational graphs for pattern analysis. IEEE Transactions on Systems, Man and Cybernetics, SMC-9(12):757-768, December 1979. [TF83] W. H. Tsai and K. S. Fu. Subgraph error-correcting isomorphism for syntactic pattern recognition. IEEE Transactions on Systems, Man and Cybernetics, S M C -13(l):48-62, Jan./Feb. 1983. Bibliography 76 [TG90] David M. Theobald and Michael F. Goodchild. Artifacts of TIN-based surface flow modeling. In ASPRS/ACSM Proceedings, GIS/LIS 90, volume 2, pages 955-967, Bethesda, MD, 1990. [Tho64] W. Thompson. Determination of the spatial relationships of locally dominant topo-graphic features. U.S. Department of Army, Natick Massachusetts, pages 14 p. + 10 p. of figs, 1964. [Wat92] D. F. Watson. Contouring: A Guide to the Analysis and Display of Spatial Data. Pergamon, Oxford, 1992. [WS60] Walter F. Wood and Joan B. Snell. A quantitative system for classifying landforms. Quartermaster Research and Engineering Command, U.S. Army, Tech. Rept. EP-124, I960. [WYC90] A. K. C. Wong, M. You, and S. C. Chan. An algorithm for graph optimal monomor-phism. IEEE Transactions on Systems, Man, and Cybernetics, 20(3):628-636,1990. [Z095] T. C. Zhao and Mark Overmars. Forms Library, A Graphical User Interface Toolkit for X. Dept. of Physics, University of Wisconsin-Milwaukee, Milwaukee, WI and Dept. of Computer Science, Utrecht University, Utrecht, the Netherlands (URL ftp://einstein.phys.uwm.edu/pub/xforms/DOC/forms.ps.gz), 1995. [ZT87] L. W. Zevenbergen and C. R. Thorne. Quantitative analysis of land surface topog-raphy. Earth Surface Processes and Landforms, 12:47-56, 1987. Appendix A Natural Neighbor Interpolation Natural neighbor interpolation, first introduced by Sibson [Sib80, Sib81] is a local interpolant defined as: n i=l where f(x, y) is the interpolated function value at the point (x, y), the /,'s are the data values at the n "natural neighbors" of the point, and the 0,'s are the "natural neighbor coordinates" of the point with respect to each of its n natural neighbors. The natural neighbors of a point (x, y) are just the n nodes to which the point would be connected if it were added to the Delaunay triangulation. The natural neighbor coordinate of (x,y) with respect to natural neighbor node i is defined to be the area of the second-order Voronoi cell between (x, y) and node i, normalized by the total area of the Voronoi cell around (x, y). (The second-order Voronoi cell between (x, y) and node i is just the region of overlap between the original Voronoi cell around node i and the new Voronoi cell around (x, y) after it was added to the Delaunay triangulation.) Sambridge et al. [SBM95] describe Watson's method [Wat92] for computing the natural neighbor coordinates, which calculates the areas of the second-order Voronoi cells by breaking them down into a sum of signed areas of sub-triangles. Watson's method first finds all circum-triangles of the point (a:, y) (i.e., every Delaunay triangle whose circumcircle contains the point). For each circum-triangle (with vertices pi, ps, and J 0 5 ) , four vectors are computed: v, Cj, eg, and C3. v is the circumcenter of the circum-triangle. ct (for i = 1, 2, 3) is the center of the circle passing 77 Appendix A. Natural Neighbor Interpolation 78 through the three points pj, pk, and (x, y), where j and k are fixed by i in cyclic rotation (i.e., 123, 231, and 312). Ci (for i = 1,2,3) can be found by solving the 2x2 linear system: { P j x 2 + P j y 2 ) - (x2 + y2) (pkx2 + pky2) - (x2 + y2) For each circum-triangle, three sub-triangles are formed with the vertices: (cz, c3, v), (c3, Ci, v), and (cj, c2, v). The area of each sub-triangle, at,i(x, y), then becomes (using cyclic rotation of i, j, k): (*t,i{x,y) - j \\(CJ - v) A (ck - v)\\, for i - 1,2,3 where A denotes a vector product, || || is the vector magnitude, and t refers to the particular circum-triangle. Sambridge et al. then rewrite the natural neighbor interpolation function as: ^ N 3 f{x,y) = T ^ E ^ ' t 1 ' ^ A t=i i=i where A is the total area of the Voronoi cell about (x, y), given by the sum of the areas of the second-order Voronoi cells, N 3 A = £ £ < * « , ; ( * > 2/) t = l i=l and N is the number of circum-triangles of the point (x, y). PjX — X - y pkx - X Pky - y ciy A . l First Partial Derivatives Sambridge et al. [SBM95] provide the following definitions for the first derivatives of their reformulated natural neighbor interpolation function, f(x,y): df{x, y) 8i 1 A datti(x,y) JL\* datii(x,y) E E —fir-ft" ~/(*'y) £ ^ ~^x— t—i J—i t=i i=i Appendix A. Natural Neighbor Interpolation 79 df(x,y) = 1_ dy A d<*t,i(x, V) t tt \ dott,i{x, y) Lt = l i = l t=l i=l dy where the derivatives of the signed sub-triangle areas, and ^f^1, are: dat,j{x, y) _ £ dx 2 , for i = 1, 2, dat,i{x, y) _ 1_ dy 2 — A{ck-v) + (cJ-v)A — , for i = 1,2,3 and the derivatives of the centers of the circumcircles with respect to the components of the dCj dy position vector, ^ and (for i = 1,2, 3), are found by solving the 2x2 linear systems: PjX — X Pjy- y dcjx dx 1 X PkX — X PkV~ y _ dcjy dx ~2 c^x X Plx — X PjV- y dcix dy 1 Ciy - y PkX — X PkV -y dciy . dy . ~2 Ciy - y _ A.2 Second Partial Derivatives We extend the first partial derivative formulas of Sambridge et al. [SBM95] to obtain the fol-lowing definitions of the second derivatives of f(x, y): N 3 df(x,y) dx2 1 A 1 A N 3 dat,i{x, y) df(x,y) ^ ^ dat,i{x,y) 1 - 1 3 x 2 d x f f i --1 1 = 1 t=l 1=1 N 3 da n dat,i{x, y) t=l 1=1 df{x,y) dy2 1 A 1 A • N 3 E E t=i i=i dott,i{x, y) ndf{x, y) A A datti(x, y) dy1' -ft,i-2-dy S§ dy t=l i=l Appendix A. Natural Neighbor Interpolation 80 df{x,y) dxdy 1 A 1 A Ss A dat,i(x,y) 1^ Z_> a„a„. h'1 lt=i i=i dxdy df(x,y) jL^ ^ dat,i(x,y) dy t=i i=i N 3 t = l i=l t = l i-l dx dat,i(x, y) dxdy where the second derivatives of the signed sub-triangle areas, ^^f, a n d dxdy' a r e : dat,j(x, y) dx2 dat,i{x, y) dy2 dat,i{x, y) dxdy dcj . n (dcj dck \ , . dck de3 dy2 de3 A (cjt - v) + 2 dcj dck\ \ . dek ~dy ~dy) — v) A , for i = 1,2,3 , for i = 1,2,3 de dxdy . . dck dcj dx dy dy dx dxdy , for i = 1,2, and the second derivatives of the centers of the circumcircles with respect to the components of the position vector, J^-, J^-, and ^ x c £ , are found by solving the following 2x2 linear systems: Pjx — x PjV - y dcjx dx* 2 ^ - 1 VkX — x VkV - y _ dciy dx2 2d^x _ x ox PjX — X VjV - y dcjx dy2 2%^ - 1 dy VkX — X VkV - y _ dc,y . dy2 . 2dc^y _ 1 Pjx - X VjV- - y dcix dxdy dcjx , dcjy dy ' dx VkX - X VkV - y dciy dxdy dcix i dciy dy dx
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Finding topographically-similar regions in a triangulated...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Finding topographically-similar regions in a triangulated terrain model Litchfield, Gwen 1997
pdf
Page Metadata
Item Metadata
Title | Finding topographically-similar regions in a triangulated terrain model |
Creator |
Litchfield, Gwen |
Date Issued | 1997 |
Description | Two shortcomings of systems developed to automatically extract the topographic features of terrain are that feature definitions are often based on a single attribute, such as Gaussian curvature or slope, and that interactive queries regarding the similarity of one region to another are rarely supported. We report on a prototype system for triangulated terrain models that attempts to address these two issues. It supports interactive classification of the terrain via a user-controlled classification system. Each classification category is defined in terms of ranges of five key terrain attributes: local relief, slope aspect, slope gradient, plan curvature and profile curvature. The system also supports queries of the type: "Which regions are topographically-similar to the selected region?" We use optimal-monomorphism in attributed graphs so that "topographically-similar" goes beyond simply matching terrain attributes to include the attributes and spatial relations of adjacent regions. |
Extent | 5870528 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-03-12 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051522 |
URI | http://hdl.handle.net/2429/5945 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1997-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1997-0252.pdf [ 5.6MB ]
- Metadata
- JSON: 831-1.0051522.json
- JSON-LD: 831-1.0051522-ld.json
- RDF/XML (Pretty): 831-1.0051522-rdf.xml
- RDF/JSON: 831-1.0051522-rdf.json
- Turtle: 831-1.0051522-turtle.txt
- N-Triples: 831-1.0051522-rdf-ntriples.txt
- Original Record: 831-1.0051522-source.json
- Full Text
- 831-1.0051522-fulltext.txt
- Citation
- 831-1.0051522.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0051522/manifest