Topology Building and Random Polygon Generation by CHONGJIAN ZHU B.Sc., Peking University, 1982 M.Eng., Tsinghua University, 1986 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA APRIL 1994 © CHONGJIAN ZHU, 1994 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. (Signature) Department of The University of British Columbia Vancouver, Canada Date DE.6 (2/88) 4?, 2,9, ‘114 Abstract In the island adoption problem from geographical information system we are asked to identify which islands are located in which lakes. This problem translates directly to polygon nesting in computational geometry: given a set of polygons, find their nesting structure. We present our research into a broader nesting problem, namely connected component nesting, beginning with the underlying concept of topology-building and a related issue of random polygon generation. Topology building is a process of structuring data. We develop a plane sweep al gorithm for building a quad-edge data structure that captures the topological structure of connected components of a set of line segments. The algorithm starts with a data structure representing a single edge then adds edges into the data structure at each step while sweeping across the connected components The algorithm’s time complexity is determined by the time to sort the vertices of the line segments. We develop two approaches for obtaining the nesting structure of polygons. The first adopts a basic idea of Bajaj and Dey [1], but introduces a new notch definition to simplify their algorithm. The second generalizes the nesting problem to a broader class including the nesting of connected components. We present a sweep algorithm, based on a union-find data structure, that computes the nesting of the connected components. In order to test and verify the time complexity of our polygon nesting algorithm, we present an algorithm that generates x-monotone polygons uniformly at random over a vertex set of n points. This algorithm scans the point set to calculate the total number of monotone polygons that can be created, then reverses the scan to generate a random monotone polygon. This process generates a random polygon over the n vertices in 0(K) time, where n K 2 is the number edges of the visibility graph of the x-monotone n chain whose vertices are the given n points. The space complexity of our algorithm is 0(n). 11 Table of Contents Abstract ii List of Tables vii List of Figures viii Acknowledgment x 1 Introduction 1 1.1 Topology Building 1 1.2 Nesting Structures 3 1.3 Random Polygon Generation 4 1.4 Organization 5 2 Polygon Nesting 7 2.1 The Background of the Polygon Nesting 7 2.2 Extracting Polygon Nesting Structure. 2.2.1 Definitions 2.2.2 The Plane Sweep Algorithm . 9 9 111 11 3 Topology Building 16 3.1 Introduction 16 3.2 Edge Functions 19 3.3 3.4 3.5 4 3.2.1 Basic Edge Functions 20 3.2.2 Duality 21 3.2.3 Properties of Edge Functions 23 Quad Edge Data Structure and Topological Operators 25 3.3.1 Quad Edge Data Structure 25 3.3.2 Basic Topological Operators 28 Topology Building by a One Pass Sweep 31 3.4.1 Constructing the Org Ring of an Edge 32 3.4.2 Algorithm 34 Conclusion 37 Connected Component Nesting 39 4.1 Introduction 39 4.2 Disjoint Sets and Notches 42 4.2.1 Disjoint Sets 42 4.2.2 Notches 44 4.3 Plane sweep 47 iv 5 4.3.1 Update at the Begin Point v of Edge e 47 4.3.2 Update at the End Point v of Edge e 51 4.3.3 Getting Nesting Structure 53 4.4 Algorithm 59 4.5 Conclusion 62 Generating Random Monotone Polygons 64 5.1 Introduction 64 5.1.1 Motivation 64 5.1.2 Problem 65 5.2 Preliminaries 66 5.3 Generating Monotone Polygons at Random 68 5.3.1 Counting Monotone Polygons 69 5.3.2 Generating Monotone Polygons 73 5.4 The Analysis of Polygon_Generator 75 5.5 Computing Visibility 79 5.5.1 Computing Visibility Forward 79 5.5.2 Computing Visibility backward 83 5.6 Time and Space Complexity Analysis 86 5.7 Conclusion 88 V 6 Conclusions 89 Bibliography 92 vi List of Tables 5.1 The Procedure for calculating TN and BN 73 5.2 The Procedure for generating monotone polygons 74 5.3 The Procedure of generating top chains 76 5.4 The Procedure of generating bottom chains 77 5.5 The Procedure of computing top_tree(k) from top_tree(k + 1) 83 5.6 The Procedure of Graham-Scan-Top 84 vii List of Figures 2.1 The input and the answer of the polygon nesting problem 2.2 The comparison of two notch definitions 10 2.3 The above relation of the edges 11 2.4 Update ordering 0 at a left endpoint 13 3.1 A set of nested connected components and faces. 17 3.2 The ring of edges out of a vertex 21 3.3 The edge functions 24 3.4 Edge record showing Next links 26 3.5 The subdivision and its quad edge data structure 28 3.6 The result of MakeEdge 29 3.7 The effect of Splice: Trading a vertex for a face 30 3.8 The Org ring and its quad edge data structure 33 3.9 The result of Splice[e, b] 34 8 3.10 The connection cases 34 3.11 The order of the vertices: (ei, e , €3 e 2 ,b 4 ,b 1 ,b 2 ,b 3 ) 4 36 vii’ 4.1 The connected components and the nesting structure 40 4.2 2 The left-edge e 1 and right-edge e 41 4.3 The idea of using disjoint sets to extract the nesting structure 43 4.4 Vertex v is a notch 45 4.5 v, 4.6 Vertex v is a begin point of edge e 48 4.7 The nesting cases 49 4.8 The cases of a notch v as the end point of edge e 4.9 The parent of C, can not be detected directly ..., , 1 9 are notches and s v ..., are simple lines 12 46 . . . . 4.10 The direct edge chain in the tree of nesting sets 4.11 Path compression during the operation find parent 51 57 58 . 61 5.1 A monotone polygon generated from S 12 66 5.2 The top and bottom monotone chains 68 5.3 The above-visible and below-visible points from point {12} 69 5.4 The original set and its equivalent set 72 5.5 The generating process 75 5.6 A point set S 5 and the data of top_tree(5). 81 5.7 A point set S and the data of bot_tree(5). 82 5.8 top_tree(k) is generated from top_tree(k + 1) 85 ix Acknowledgment I sincerely thank my supervisor Dr. Jack Snoeyink. I am very grateful for the super vision Dr. Snoeyink gave me over the whole period of my Msc. program including taking courses, solving the proposed problem theoretically, doing experiments and proofreading this thesis. I thank him for being a exemplary researcher and a considerate mentor. I also thank him for the constant encouragement I received from him. I thank Mr. Dan Lemkow of Essential Planning Systems for introducing the polygon nesting topic and supplying the real data to me. I thank my wife, Dr. Ying Li, for her love and support. x Chapter 1 Introduction In this thesis, we address three parts of our research work. In the first part, we survey the quad-edge data structure and propose an algorithm to build this data structure for capturing the topological structure of the connected components. In the second part, we study the nesting structure of both simple polygons and connected components, and developed algorithms to solve the nesting problems of both types of objects. In the final part, we give an algorithm to generate x-monotone polygon uniformly at random. In the rest of this chapter the motivation of the research work and the background of these problems are detailed. 1.1 Topology Building Geometric algorithms involve the manipulation of objects that are not handled at the machine language level. We must therefore organize these complex objects by means of simpler data types directly representable by the computer. These organizations are universally referred to as data structures. The most important property of data structures is that they capture structural relationships between the data elements. When the data elements are organized in a data structure, we then have structured data. To retrieve structural information from structured data becomes much easier than from unstructured data. In the simplest form, a set of one-dimensional points, unstructured data is simply 1 Chapter 1. Introduction 2 an amorphous mass of numerical values. By assigning an ordering structure on the point set, we are able to answer such questions as: which element is the smallest one? which is the largest one? which elements precede and follow a given one? etc. in an efficient way. In more complicated situations, the data might be two-dimensional or higher. In order to capture the structural information of the data, the data elements should be organized appropriately, such as nesting structure among the simple polygons. Many applications require new algorithms to efficiently represent and manipulate the structural information presented within data. One such example, drawn from the geographical information system (GIS) domain, is to retrieve a collection of areas with a common property from a larger and more complex set with many properties. Two-dimension structuring of data is called topology building in GIS. After struc turing the input data, the traditional topological structures such as connected compo nents, polygons, rings, and chains, can be retrieved through pointer lists. Topology build ing is an important technique in GIS, computational geometry, and computer graphics. By knowing the topological structure of the data one can perform various operations ef ficiently on the data, such as area intersection, union, difference, clipping, and parentage tracking in overlay process in GIS [16]. To capture the topological structure of connected components, we develop a sweepbased algorithm that builds a quad edge data structure of the components. Because the quad edge data structure [7] captures all the topological properties of vertices, edges, and faces of two-dimension graphs, this data structure is the ideal candidate for an intermediate data structure in topology building. The quad edge data structure of a set of line segments can be built by two simple operations from the local relations of the line segments: using only two primitives avoids subtle bugs due to pointer errors. A single topological operator, called Splice, and one primitive for the creation of isolated edges, called Makeedge, is sufficient to construct and modify this data structure. Chapter 1. Introduction 3 The plane sweep method [14] is a widely used computational geometry paradigm that has been applied to solve many different problems involving spatial data [10, 16]. We have adopted this method in our algorithms for building quad edge data structure, extracting the nesting structure among simple polygons and connected components. 1.2 Nesting Structures Typical GIS problems include: How can one retrieve a single area within a certain con nected component? Here a connected component is a undirected planar graph, embedded in the Euclidean plane, in which there is a path that joins any two vertices. What is the parentage relation among these connected components? The parentage relation problem is to find the nesting structure among the connected components. The nesting structure problem is called the topological relation problem. Work by Egenhofer and Sharma [3] on topological relation problems is based upon knowing the topology structure of the data. Before we can extract the topological relation it is important to know the topo logical structure of the objects. But in many GIS applications the topological structure of the given data may not be known. Without knowing the topological structure of con nected components, to extract the nesting structure of them will be a more practical and challenging problem. A region of a connected component is the area enclosed by a set of edges of the component. For two nonintersecting connected components Ml and M2, we say that Ml is nested in M2 if there exists a region R of M2 where every region of Mi is contained in R. If the connected components are simple nonintersecting polygons, the problem of finding the nesting structure of a set of connected components becomes the lake and islands problem in GIS. In this simplified case, M2 has only one region, that is, R and M1CR. = M2 Chapter 1. Introduction 4 Bajaj and Dey [1] extracted the nesting structure from a set of simple nonintersecting polygons. They did not discuss how to find the nesting structures from a set of connected components. Applications in GIS, graphics, and medical applications use the nesting structure from more complicated objects. Thus, an algorithm that extracts the nesting structure from a set of connected components becomes more practical than extracting nesting structure from simple polygons. We develop an algorithm to extract the nesting structure from a set of connected com ponents by a one pass sweep. We sweep a line L in the plane through all the connected components, while maintaining the ordering of edges that induced by L. Meanwhile dis joint region and subregion sets and nesting sets are created or united together according to the sets connection information discovered during the sweep. The final nesting sets capture the nesting structure of the connected components. 1.3 Random Polygon Generation Interest in generating random geometric objects has increased recently, both in theory [8] and in applications [4, 13]. The generation of random geometric objects has applications which include testing and verifying the time complexity of computational geometry al gorithms. There are two ways to test computational geometry algorithms. The first involves the construction of geometric objects that the implementer considers difficult cases for the algorithm. For example, our polygon-nesting algorithm, based on a plane sweep, may require special case code for some polygons. It is important to make those polygons candidates for exposing errors of the algorithm. The second approach to testing involves executing the algorithm on a large set of geometric objects generated at random. We expect errors to be exposed if enough different valid inputs are applied to the algorithm. Chapter 1. Introduction 5 To verify that an implementation of an algorithm achieves the stated algorithm time complexity is the problem we often have in implementation-oriented computational ge ometry research. This is done by timing the execution of the algorithm for various inputs of different sizes. There are many possible inputs of any given size, and the choice is im portant, since an algorithm may take more time on some inputs than others of the same size. If an average execution time is computed over a set of randomly generated objects of a given size, the relationship between time and problem size will typically follow a curve corresponding to its complexity. We can then check this complexity against the stated algorithm’s complexity. Some research has been done on generating geometric objects at random, such as Epstein [4]. In order to test and verify the time complexity of our polygon nesting algorithm we give an algorithm to generate monotone polygons uniformly at random. 1.4 Organization This thesis proceeds as follows: Chapter 2 describes the polygon nesting problem and the basic idea of the plane sweep process. An algorithm is proposed for extracting the nesting structure from a set of simple nonintersecting polygons. Chapter 3 describes the basic concepts, the edge functions, and the topological oper ators of the quad edge data structure. Then an algorithm to build the quad edge data structure of connected components by a one pass sweep is proposed. Chapter 4 adapts the plane sweep process and also gives an algorithm to extract the nesting structure from a set of connected components. Chapter 5 describes how to generate random x-monotone polygons. Both a theoretical Chapter 1. Introduction analysis and an algorithm are given in detail. The conclusions and directions of future work follow in Chapter 6. 6 Chapter 2 Polygon Nesting In this chapter we describe the polygon nesting problem, also known as the island adop tion problem, and a plane sweep algorithm to extract the nesting structure of simple nonintersecting polygons. 2.1 The Background of the Polygon Nesting To determine the area of a body of water, one must subtract the area of its islands from the body of water. For this, one must know which island belong to which body of water. Moreover, islands may contain lakes with islands; we would like to know the entire nesting structure of the shoreline polygons that separate water and land. The polygon nesting problem is to determine, for each polygon P in a set of disjoint polygons, which polygon directly contains F. We have adapted a plane sweep algorithm (see Preparata and Shamos [14]) to compute polygon nesting. A sweep algorithm turns a static two dimension problem into a dynamic one dimension problem, in this case a dynamic interval containment problem. When the sweep encounters a polygon vertex, the intervals in which the sweep line intersects the polygon can be created or destroyed and can be split or merged. Keeping track of which polygons own which intervals allows us to compute nesting information with one pass through the data. In fact, if the data is clean we can compute topology within the same 7 Chapter 2. Polygon Nesting 8 pass at the cost of additional bookkeeping. The Problem. Let P be a set of m non-intersecting simple polygons P, i = 1, 2, ..., m. For each polygon P, we define ancestor(Pj as the set of polygons containing P. The polygon Fk in ancestor(P) is called the parent of P if ancestor(Pk) = arzcestor(P) — Pk• If ancestor(P) is empty we say that the parent of P is null. Any polygon whose parent is Pk is called a child of Fk; see Figure 2.1. The nesting structure G of P is an acyclic directed graph (a forest) in which there is a node n, corresponding to each polygon P in 1, and there is a directed edge from a node n to n 3 if and only if F 3 is the parent of F. The polygon nesting problem is to compute the nesting structure of a set of simple non-intersecting polygons. Ti: 1 n T3: ‘ 7 n 1 P 1 nil T4: T2 nb (a) A set of simple polygons ‘12 ::: (b) The answer trees of the polygons Figure 2.1: The input and the answer of the polygon nesting problem. In [1] Bajaj and Dey give an algorithm that computes the polygon nesting structure. We adapted their algorithm and introduced our notch definition (see definition 4.1) so that the number of notches is the same as the number of subchains. It reduces the number of subchains induced by Bajaj and Dey’s notch definition. Chapter 2. Polygon Nesting 2.2 9 Extracting Polygon Nesting Structure Definitions 2.2.1 Let P be a simple polygon with vertices z’, i” , 2 ..., v, in clockwise order. A vertex v is defined as a notch if the two edges that connect with vertex z are either both at the left side of or both at the right side of the vertex v. These left side and right side ordering relations can be defined with respect to the projection on any line, L, but the x coordinate is selected as the axis of projection and the precise definition of notch is given as follows. Definition 2.1 Let v—*x be the x-coordinate of vertex v. A vertex v is a notch of P if —x 1 z’_ < v—x and v —x 1 v—+x or v_—x v—*x and v+i—x v—x. Between any two consecutive notches v and v 3 in the clockwise order, the sequence of vertices (,ij, v , 1 ..., ) 3 i” is called a x-monotone polygonal line, or subchain, of P. Figure 2.2 (a) shows the notches and the monotone polygonal lines in a simple polygon. Our notch definition is different from that of Bajaj and Dey’s. They defined a vertex v to be a notch of P if the inner angle between the edge (vj , v) and (vi, + 1 ) is larger 1 than 1800. From this definition the polygonal line between any two consecutive notches is a convex polygonal line. Then they partitioned the convex polygonal line into convex chains and further partitioned the convex chains into the x-monotone polygonal lines. But from our definition we can directly partition a polygon into x-monotone polygonal lines that make the definition clearer and reduce the total number of subchains. Figure 2.2 (a) shows that there are only 1 u and 1 2 two notches, and Ci and C u 2 two subchains induced by our notch definition. But for the same polygon there are (n/2 — 1) notches induced by Bajaj and Dey’s notch definition, and all the edges become subchains, which is shown in Figure 2.2 (b). Chapter 2. Polygon Nesting 10 V2 notches (a) Two notches by our notch definition (n/2 - 1) notches (b) Bajaj and Dey’s notch definition Figure 2.2: The comparison of two notch definitions. From our notch definition, we get the following lemma. Lemma 2.1 Let P be a simple polygon with N notches. The number of subchains S, in P is N. Proof. From Definition 2.1 we know that between any two consecutive notches v and 3 in the clockwise order, there is a subchain. So the number of .subchains of P is N. D v A vertex or an edge is said to lie inside a polygon if it lies completely inside the polygonal region enclosed by the boundary of the polygon. A vertex or an edge is said to be contained in a polygon if it lies on the boundary of the polygon. Let L be any line drawn through a set of polygons. Let E be the set of edges that intersect L. The direction of an edge is always from left to right. If an edge ek is parallel to L, the direction of ek points up or down, whichever is consistent with edges connecting to ek. A vertex v is said to be above v if v lies geometrically above 3 v An edge e . 1 is said to be above the edge e 2 in E if the point of intersection of L and e 1 lies above the point of intersection of L and e . If e 2 1 and e 2 have a common vertex through which L passes, e 1 Chapter 2. Polygon Nesting 11 Figure 2.3: The above relation of the edges. is “above” e 2 if e 1 is at the left of e 2 (see edges e 5 with e 6 and e 7 with e 8 in Figure 2.3). L induces a partial order R on the edges in E with respect to the “above” relation. If L passes through a vertex v, we define above(v) as the set of edges whose point of intersection with L is above i’. V:. The lowest edge in above(vj is called the ueighbor of Order R extends naturally to another order 0 of subchains associated with the edges in R. If edges e 1 and e 2 of subchains C 1 and C 2 intersect L, then C 1 is above C 2 if e 1 is above 2.2.2 62. The Plane Sweep Algorithm A polygon P consists of subchains C ,C 1 , 2 .., Ck. We sweep a vertical line L in the plane through all the polygons, while maintaining the ordering 0 of the subchains C induced by L. To maintain this order we stop only at the endpoints of the subchains, while sweeping from left to right. First, we break the boundaries of all the polygons into subchains in 0(n) time where n is the total number of vertices of all polygons. Let N be the number of subchains. Clearly we have 2m N n where m is the number of polygons. We sort only the Chapter 2. Polygon Nesting 12 endpoints of all the N subchains according to their x coordinates. At each end point we update the ordering 0 and detect the parent of the polygon if required. The algorithm is as follows. Algorithm Input: A set of rn simple, non-intersecting polygons. Output: A directed acyclic graph G, called the nesting structure. Step 1: Detect the endpoints of subchains in all polygons. Step 2: Sort the x-coordinates of these endpoints in increasing order. If two points have the same x-coordinates, the one with lower y-coordinate is sorted before the other. Let this sorted sequence W be v, v , 2 ..., va,. Step 3: Create a node in G for each polygon. Insert the two subchains connected with the leftmost vertex v 1 edges connected to vertex each vertex Va e W by inserting to the ordering 0 the two polygon “i Then we sweep from left to right, taking steps at of W as follows. Step : If vertex ‘a associated with polygon P is a left endpoint of a subchain C. We can find another subchain C 2 connected to ”a t (a) Insert the subchains C 2 into ordering 0 on L by inserting the two 1 aild C polygon edges connected to Va in C 1 and C 2 by a simple binary search. The search is based on a procedure for determining the position of Va with respect to the edge inserted by L on a subchain C, already present in ordering 0. Figure 2.4 shows the changes of the ordering 0. For the later purpose, we keep the last visited edge associated with each sub chain C, in 0. This can be accomplished by the following simple bookkeeping. Chapter 2. Polygon Nesting 13 Cl Cl Cl C2 C2 ,e 3 c 1 C4 , 3 C 4 e , eal 6 c 3 c 5 C C 7 e a2 C4 C5 5 C L Figure 2.4: Update orderillg 0 at a left endpoint: Let the edge associated with C initially be e , the first edge of the subchain 1 ,e 1 , 2 C. We visit the sequence of edges e ek ..., el of C,, stopping at the first edge that intersects L. We determine the position of associate edge ek Va with respect to ek and with C. Later, when we need to classify any other vertex with respect to C we start from edge ek; Figure 2.4 shows the edge changes of 03. Obviously the edges of C are visited only once without updating the ordering 0 throughout a sweep. (b) Detect the parent of the polygon associated with Let e be the neighbor edge of Va ‘a found while inserting C 3 be 1 and 02. Let F the polygon containing e on the boundary. We determine k, the number of edges (or equivalently the number of subchains) of the polygon F 3 that are in above(va). Maintaining the ordering of subchains of each polygon separately, this number can be obtained in O(log S ) time where S is the number of 1 subchains in the polygon F. If k is odd and F 3 3 to be the parent P, we set P of P,. Otherwise, we set the parent of F 3 to be the parent of P. Certainly, parent determination at each update adds up to at most O(log S), where S is the total number of subchains. From our notch and subchain definitions we Chapter 2. Polygon Nesting know that S = 14 N, where N is the total number of notches. Thus, the total time for parent determination is O(log N). Step 5: If vertex associated with polygon F is a right endpoint of subchains C and 02. We delete the subchains Cl and 02 from ordering 0 on L by a simple binary search. Now we prove the correctness of detecting the parent of a polygon. Lemma 2.2 Let L be any line passing through vertex e be the neighbor of Va. Va of a polygon F. Let the edge The parent of F, is either the polygon F containing e or Fj’s parent (possible null). Proof. If the neighbor edge e of Va is an edge of F , which is a parent of F, the lemma 3 holds trivially. Suppose F 3 is not the parent of F. We claim that 1 if and only if e lies inside it. Suppose e lies inside F F 1 and segment between Va Va a lies inside polygon does not. Then the vertical and e on L contains a part that is outside of F . This is impossible 1 since e is the neighbor edge of Va. Similarly, we can argue that if Va lies inside polygon , so does e. Hence e lies inside the same set of polygons, within which 1 F Va lies. Hence, if Fk is the parent of F it is a parent of F 3 and vice versa. D Lemma 2.3 Let L be any line passing through vertex contained in the polygon Fk, k Va of a polygon F . Vertex v, is 2 i if and only if the number of edges of Fk that are in above(va) is odd. Proof. Since any edge demarks the region that is “inside polygon F” and “outside polygon F” on L the lemma is obvious. D 15 Chapter 2. Polygon Nesting Theorem 2.4 Let L be any line passing through the neighbor of Va Ofl Va of P. Let edge e of polygon Fk be L. If the number of edges of Fk in above(va) is odd and k i, then Fk is the parent of P. Otherwise, Ph’S parent is the parent of P. Proof. Combine lemmas 2.2 and 2.3. D Theorem 2.5 The problem of polygon nesting for m polygons can be solved in O(n + N log N) time where n is the total number of vertices in the polygons and N is the total number of notches of all polygons. Proof. Detecting the endpoints of the subchains takes 0(n) time. Sorting these end points requires 0(8 log S) time. Updating and determining parent takes O(n + Slog 5) time. By Lemma 2.1, 3, the total number of subchains, is N where N is the total number of notches. Hence, total time spent is O(n + NlogN). D Chapter 3 Topology Building 3.1 Introduction The topology building technique is especially useful in GIS, graphics, and computational geometry. GIS, CAD/CAM, and medical applications contain large amounts of spatial data which users want to analyze and from which they want to extract spatial information. An important criterion, fundamental to most spatial analysis, is the information about the relative spatial location among the objects. For instance, earth scientists monitoring global change want to retrieve all water area in a particular region; airplane designers are interested in all devices within reaching distance from the cockpit; or doctors, analyzing X-rays or tomographies, want to determine which organs are affected by a tumor. We will focus topology building on vector-based geographical information systems. In GIS, maps are generally represented by separating areas having different properties by boundary arcs, which are made of line segments, e.g. forest, lakes, roads, agricultural areas, and some boundaries are joined together geographically. Joined boundaries form connected components. Retrieving the areas enclosed by the line segments is the basic step for retrieving the geographical information of the areas with the same property. One may ask the questions as follows: how can one retrieve a single area within a certain connected component? What is the relative spatial location of the given line segments? We solve this problem by constructing the topological structure and building 16 Chapter 3. Topology Building 17 a pointer list for retrieving individual areas. We assume that no topological information is given in the input data, because most data are supplied as sets of line segments, called edges. The only way for us to know the topological structure of the edges is to display them on a computer screen or draw them on a piece of paper, such as maps. For example, Figure 3.1 shows us the topological structure clearly. But without the help of this visual drawing we could not know anything ,e 1 ,. 2 about the topological structure except a set of edges {e . . , e}. Figure 3.1: A set of nested connected components and faces. Before we describe the problem we give the precise definitions of a connected compo nent and faces of the connected component. Definitions. Let 0 denote an arbitrary planar graph with vertex set V = ,v 1 {v , 2 ..., and edge set E. A (planar) embedding of G is a drawing of G in the plane which each vertex is represented as a point and each edge is represented as a simple curve joining its endpoints in such a way that no pair of edges intersect except (possibly) at their end points. An embedding G of G induces a partition of the plane into regions or faces bounded by the edges of G. We call this the planar subdivision associated with G. If Chapter 3. Topology Building 18 each of the edges of G is straight line segment (respectively, a sequence of straight line segments) then G is said to be a straight-edge embedding and the associated subdivision is a straight-edge subdivision. It is well known that all planar graphs have straight-edge embeddings; in fact every planar embedding has a topologically equivalent straight-edge embedding [9]. Hereafter we will restrict our attention to linear embedding. A path of length k from a vertex (VO,v1,...,Vk) of vertices such that The path contains the vertices , v 0 v , 1 to a vertex = ..., vo, it’ it’ = vk, in a graph 0 = (YE) is a sequence and e E for i ,v) 1 (v_ z’, and the edges (v , z’), (v 0 ,v 1 ), 2 there is a path p from it to v, we say that i’ ..., = 1,2,...,k. (ak_i, Uk). If is reachable from it via p. The connected components of a graph are the equivalence classes of the vertices under the “is reachable from” relation. The graph in Figure 3.1 has five connected components: 01,02, C3 04, C 5 and the connected component C has six faces: , 11 , f 12 , f 13 f The Problem. Let {ei, e ,. 2 . . , 15 . f , 16 f e} be a set of n edges. Our topology building problem is to compute a quad-edge data structure [7] for the input edges as an intermediate data structure that contains the topological structure of the set of edges, meanwhile setting up pointer lists for retrieving the connected components and faces of the connected components. We build the quad-edge data structure of the input data with a plane sweep method. The quad-edge data structure may be seen as a variant of the “winged edge” represen tation for polyhedral surfaces [2]. In the next section we will give the details of the quad edge data structure. A good property of the quad edge data structure is that the global topological structure can be built by simple operations from the local relations of the edges. This property ensures us that we can construct the data structure correctly by using the plane sweep method. To construct and modify this data structure a single topological operator, called Splice, together with a single primitive for the creation of Chapter 3. Topology Building 19 isolated edges, called MakeEdge, is sufficient. This data structure is general enough to allow the representation of undirected graphs embedded in arbitrary two-dimensional manifolds. The definition of the data structure is supplied by Guibas and Stolfi [7]. In Section 3.2 and Section 3.3 we give the brief review of the quad edge data structure. If you are familiar with the data structure you may skip these two sections and continue reading from Section 3.4 because we use the same notations that are used by Guibas and Stolfi [7]. 3.2 Edge Functions In this section we give a precise definition for the informal concept of an embedding of an undirected graph on a surface. Special instances of this concept are sometimes referred to as a subdivision of the plane, a generalized polyhedron, a two-dimensional diagram, or other similar names. They have been extensively discussed in the solid modeling literature of computer graphics [2, 11]. We want a definition that accurately reflects the topological properties one would intuitively expect. (For instance, that every edge is on the boundary of two faces, every face is bounded by a closed chain of edges and vertices, every vertex is surrounded by a cyclic sequence of faces and edges, etc.). Our intuition leads the following definition. Definition 3.1 A subdivision of a manifold Mis a subset S of Mpartitioned into three finite collections of disjoint parts: the vertices, the edges, and faces (denoted, respectively, by V, E, and F), with the following properties: 51. Every vertex is a point of M. S2. Every edge is a line segment of M. S3. Every face is homotopic to (deformable to) an open disk of M. S4. The boundary of every face is a closed path of edges and vertices. 20 Chapter 3. Topology Building The vertices, edges, and faces of a subdivision are called its elements. In order to explain the quad edge data structure clearly, we first define the edge functions and describe the dual of a planar graph. Then we discuss the properties of edge functions. 3.2.1 Basic Edge Functions The edge functions, their dual, and the properties of the functions are given by Guibas and Stolfi [7]. Here we only give a brief description in order to understand the quad edge data structure. In Figure 3.2, on any disk D of a manifold there are exactly two ways of defining a local “clockwise” sense of rotation; these are called the two possible orientations on D. There are also exactly two consistent ways of defining a linear order among points of a line 1; each of these orderings is called a direction along 1. A directed edge of a subdivision P is an edge of P together with a direction along it. Since directions and orientations can be chosen independently, for every edge of a subdivision there are four directed and oriented edges. Observe that this is true even if the edge is a loop or is incident twice to the same face of P. Because in many applications, that we are going to discuss including ours, all manifolds to be handled are orientable. This means we can assign a specific orientation to each edge, vertex, and face of the subdivision so that any two incident elements have compatible orientation. Because we can pre-orient the edges we give a simplified version of the edge functions. For the details for non-orientable manifolds, please see Guibas and Stolfi’s article [7]. For any oriented and directed edge e we can define its vertex of origin, eOrg, its destination, eDest, its left face, eLeft, and its right face eRight. We define also the symmetric of e, called eSym, as being the same undirected edge with the opposite direction Chapter 3. Topology Building 21 but the same orientation as e. Traversing the boundary of a disk D around a vertex v in the counterclockwise di rection establishes a cyclical ordering of the edges. We obtain what is called the ring of edges out of v, shown in Figure 3.2. Note that if e is a ioop, it will occur twice in the ring of edges out of e. To be precise, both e and eSym will occur once each: since the manifold around v is like a disk, e will appear only once in each circuit. Figure 3.2: The ring of edges out of a vertex. We can define the next edge with same origin, eOnext, as the edge immediately fol lowing e (counterclockwise) in this ring, such as d in Figure 3.2 is the eOnext. Similarly, given an edge e we define the next counterclockwise edge with same left face, denoted by eLnext, as being the first edge we encounter after e when moving along the boundary of the face F = eLeft in the counterclockwise sense as determined by the orientation of F. The edge eLeft is oriented and directed so that eLnextLeft 3.2.2 = F (including orientation). Duality The dual of a planar graph G can be informally defined as a graph G* obtained from G by interchanging vertices and faces while preserving the incidence relationships. The Chapter 3. Topology Building 22 definition below extends the intuitive concept to arbitrary subdivision. Definition 3.2 Two subdivisions S and S* are said to be dual to each other if for every directed and oriented edge e of either subdivision there is another edge eDual of the other such that Dl. (eDual)Dual D2. (eSym)Dual D3. (eLNext)Dual = = e. (eDual)Sym. . 1 (eDual)Onezt = Equation D3 states that moving counterclockwise around the left face of e in one subdivision is the same as moving clockwise around the origin of eDual in other sub division. To see why, note that the edges on the boundary of the face F = eLeft, in counterclockwise are { e, eLnext for some m , eLnext , 2 . . . , m eLriext = e} (3.1) 1. This path maps through Dual to the sequence { eDual , (eDual)Oriext 1 , (eDual)Onext , 2 . all edges coming out of the vertex v . . , m (eDual)Onext = eDual}: (eDual)Org of S*, in clockwise order around v. We can therefore extend Dual to vertices and faces of the two subdivisions by defining (eLeft)Dual = (eDual)Org and (eOrg)Dual = (eDual)Left. Equations D2 and D3 imply that any two edges that differ only in direction will be mapped to two versions of the same undirected edge. The Dual establishes a correspondence between and E*, between V and V*, and between F and F”, such that incident elements of S correspond to incident elements of S* and vice versa. The edge eRot is called the rotated version of e; it is the dual of e, directed from eRight to eLeft and oriented so that moving counterclockwise around the right face of Chapter 3. Topology Building 23 e corresponds to moving counterclockwise around the origin of eRot. Then we have (eRot)Rot 3.2.3 = eSym instead of e. Properties of Edge Functions The functions Rot, and On ext satisfy the following properties: El. 4 eRot E2. eRotOnextRotOnext E3. 2 eRot E4. e E ES if eRot E ES*. E5. e E ES if eOnext = = e. eSym = e. e. ES. A number of useful properties can be deduced from these, as for example 1 eRot = 3 eRot 1 eOnext = eRotOnextRot and so forth. For added convenience some derived functions are introduced. By analogy with eLnext and eOnext, for a given e we define the next edge with same right face, eRight, and with same destination, eDnext, as the first edges that we encounter when moving counterclockwise from e around eRight and eDest, respectively. These functions satisfy also the following equations: eLnext = eRot’OnextRot eRnext = eRotOnextRot’ eDnext = eSymOnextSym. The direction of these edges is defined so that eLnextLeft eRight, and eDnextDest = eDest. Note that eRnextDest = = eLeft, eRnextRight = eOrg, rather than vice versa. 24 Chapter 3. Topology Building / eDnext I eSym eRot e eRnext eOnext / Figure 3.3: The edge functions. By moving clockwise around a fixed endpoint or face, we get the inverse functions, defined by eOprev = eOnext’ = eRotOnextRot eLprev = 1 eLnext = eOriextSym eRprev = eRnext’ = eSymOnext eDprev = eDnexr’ = 1 eRot’OnextRot It is important to notice that every function defined so far can be expressed as the composition of a constant number of Rot and O.next operations, independently of the size or complexity of the subdivision. Figure 3.3 illustrates these various functions. Chapter 3. Topology Building 3.3 25 Quad Edge Data Structure and Topological Operators Now we may have this question: do these edge functions accurately capture all the topological properties of a subdivision? The authors of the article [7] gave us the answer. In order to describe the result we first give a definition. Definition 3.3 An edge algebra is an abstract algebra (E, E*, Onext , Rot) where E and are arbitrary finite sets and Onext and Rot are functions on E and E* satisfying properties El — E5. The result is that the topology of a subdivision is completely determined by its edge algebra, and vice versa. 3.3.1 Quad Edge Data Structure We represent a subdivision S (and simultaneously a dual subdivision S*) by means of the quad edge data structure, which is a natural computer implementation of the corre sponding edge algebra. The edges of the algebra can be partitioned in groups of four: each group consists of the two directed versions of of an undirected edge of S plus the two versions of its dual edge. The group containing a particular edge e is therefore the orbit of e under the subalgebra generated by Rot. To build the data structure we select arbitrary canonical representative, from each edge group. Then any edge e can be writ ten as eoRotr, where r E {O, 1, 2, 3} and e 0 is the canonical representative of the group containing e. The group of edges containing e is represented in the data structure by one edge record e, divided into four parts e[O] through e[3]. Part e[r] corresponds to the edge eoRotr, see Figure 3.4. Chapter 3. Topology Building 26 Figure 3.4: Edge record showing Next links. An edge e = eoRotr is represented by the tuple (eo, r), called the edge reference. We may think of this tuple as a pointer to the “quarter-record” e[rj. Each part e[r} of an edge record contains two fields, Data and Next. The Data field is used to hold geometric and other nontopological information about the edge eoRotT. This field neither affects nor is affected by the topological operation that we will describe, so its contents and format are entirely dependent on the application. The Next field of e[r] contains a reference to the edge eoRotrOnext. Given an arbitrary edge reference (e, r), the two basic edge functions Rot and Onext are given by the formulas (e,r)Rot = (e,r+1), (e,r)Onext = e[r].Next, (3.2) Where the r is computed modulo 4. In the first expression in 3.2, this corresponds to rotate e 990 counterclockwise, we advance to the next part of the Next field of e[r]. The second expression says that applying the Onext function to e[r] we get the Next field of e[r]. From these formulas it follows also that (e, r)Sym = (e, r + 2), (e, r)Ror’ = (e, r + 3), (e, r)Oprev = (e[r + 1].Next)Rot, (3.3) Chapter 3. Topology Building 27 and so forth. The actual quad edge data structure we use for each e[r} is as follows. typedef struct Qedge } { Point *data; /* data field associated with e.Org *1 struct Qedge *next; 1* reference to next edge of ring */ mt mt mark; /* marker for transversal */ tmp; 1* reserved for future use *1 Qedge; Where the field data is a structure Point that simply consists of the x and y coordi nates of the vertex; the fields mark and tmp are used by our display routines to show the quad edge data structure. Figure 3.5 illustrates a portion of a subdivision and its quad edge data structure. We may think of each record as belonging to four circular lists, corresponding to the two vertices and two faces incident to the edge. The quad edge data structure contains no separate records for vertices or faces; a vertex is implicitly defined as ring of edges, and the standard way to refer to it is to specify one of its outgoing edges. This has the added advantage of specifying a reference point on its edge ring, which is frequently necessary when the vertex is used as a param eter to topological operations. Similarly, the standard way of referring to a connected component of the edge structure is by giving one of its directed edges. In this way, we are also specifying one of the two dual subdivisions and a “starting place” and “starting di rection” on it. Therefore a subdivision referred to by the edge e can be “instantaneously” transformed into its dual by referring to eRot instead. Chapter 3. Topology Building 28 1 H f d (a) A subdivision of a plane (b) The data structure for the subdivision (a) Figure 3.5: The subdivision and its quad edge data structure. 3.3.2 Basic Topological Operators Two basic topological operators are sufficient to construct and modify the quad edge data structure. The first operator is denoted by e — MakeEdge[org, dest]. It takes two points org and dest as parameters, and returns an edge e of a newly created data structure representing a subdivision of the plane (see Figure 3.6), where org and dest are the end points of a line segment. By assigning org to the eOrg and dest to eDest, we set up the direction of the new quad edge e. Because e is the only edge of the subdivision its left and right faces are closed to be one face. Then the origin points of left and right faces are set to be null. Apart from the orientation and direction, e is the only edge of the subdivision and will not be a loop; then we only need to set up the Onext rings of e. Those are eOnext = e, eSyrnOnext = eSym, eRotOriext = eRot’, and eRot’Onext = eRot. The second operator is denoted by Splice[a, b], takes two edges a and b as input Chapter 3. Topology Building 29 Figure 3.6: The result of MakeEdge. parameters and returns no value. This operation affects the two edge rings aOrg and bOrg and, simultaneously, the two edge rings aLeft and bLeft. These rings are defined in section 3.4.1. In each case, (a) if the two rings are distinct, Splice will combine them into one; (b) if the two are exactly the same ring, Splice will break it into two separate pieces; The parameters a and b determine the place where the edge rings will be cut and joined. For the rings aOrg and bOrg, the cuts will occur immediately after a and b (at aDest and bDest); for the rings aLeft and bLeft, the cut will occur immediately before aRot and bRot. Figure 3.7 illustrates this process for one of the simplest cases, when a and b have the same origin and distinct left faces. In this case Splice[a, bj splits the common origin of a and b in two separate vertices and joins their left faces. If the origins are distinct and the left faces are the same the effect will be precisely the opposite: the vertices are joined and the left faces are split. Indeed, Splice is its own inverse: if we perform Splice[a, b] twice in a row we will get back the same subdivision. In the edge algebra, the Org and Left rings of an edge e are the orbits under On ext Chapter 3. Topology Building 30 13 (a) aOrg = bOrg, aLe ft (b) aOrg bLeft. bOrg, aLeft = bLeft. Figure 3.7: The effect of Splice: Trading a vertex for a face. of e and eOriextRot, respectively. The effect of Splice can be described as the con struction of a new edge algebra A’ A = = (E, E*, Onext’ , Rot) from an existing algebra (E, E*, Onext , Rot), where Onext’ is obtained from Oriext by redefining some of its values. The modifications needed to obtain the effect described above are quite simple. If we let a = aOriextRot and 3 = bOnextRot, basically all we have to do is to interchange the values of aOnext with bOnext and aOnext with f3Onext. The apparently complex behavior of Splice now can be recognized as the familiar effect of interchanging the next links of two circular list nodes. aOnext’ = bOnext, bOnext’ = aOnext; aOnext’ = ,BOnext, 3 Onext’ = aOnext, Note that these equations reduce to Onext’ = Onext if b (3.4) = a. Since aOnext’ = Chapter 3. Topology Building 31 bOnext, to satisfy axiom E5 we must have a E E iff bOnext E E, which is equivalent to a E E if b E E, where E is defined in Definition 3.3. We will take this as a precondition for validity of Splice[a, bj. The following theorem proves that these manipulations are correct (for the proof please see article [7]). Theorem 3.1 If A is an edge algebra, a and b are both primal or both dual, then the algebra A’ obtained by performing the operation Splice [a,b] on A is also an edge algebra. This theorem guarantees to build a valid quad edge data structure by adding one edge at a time. Based upon this theorem we have adapted a plane sweep algorithm [14] to compute a quad edge data structure from a set of line segments. When the sweep-line encounters a vertex, the edges associated with this vertex can be added or deleted from an array in a constant time and can be spliced according to the connectivity of the edges. Keeping track of this local connection information allows us to build up the quad edge data structure. In the next section we will describe how to construct the quad edge data structure by a one pass sweep. 3.4 Topology Building by a One Pass Sweep The Org and Left rings of an edge e are the orbits of e under Onext and eOnextRot, respectively. The process of topology building is to construct a new edge algebra A’ (E, E*, Onext’ , Rot) from an existing algebra A = = (E, E*, Onext , Rot), where Onext’ is obtained from Oriext by applying the Splice operation on A in certain order. Since Splice does not affect Rot we only need to construct the Org ring of e correctly. The Left ring comes for free by duality. In another words, the Org ring of an edge e is a Chapter 3. Topology Building 32 sequence of edges that satisfies the following equation for some m 2 { e, eOnext, eOriext , ..., m eOnext = 1. e} (3.5) The basic idea to build the quad edge data structure for a connected component by a one pass sweep is to make use of the properties of the edge functions and the operators that can modify the Org ring of an existing quad data to build up the entire structure correctly. Splice gives us the tool to construct the quad edge data structure with only local information. Constructing the Org Ring of an Edge 3.4.1 We start building an Org ring in the quad edge data structure by calling MakeEdge[] for e, the first edge we encounter. Now the Org ring of edge e consists of only itself, that is, eOnext e, see Figure 3.6. From the edge functions (see Figure 3.3) we know that the edges in the Org ring of edge e are the edges that have the same origin vertex as edge e and different left faces, see Figure 3.8 (a). These edges are connected to be a cycle in their quad edge data structure; see Figure 3.8 (b). Let the current Org ring of edge e be {eOnext, eOnext 2 Let a 3 = eOnext for j = 1,2, ..., , ..., eOnext i, then the Org ring of e is {ai, a , 2 the same origin vertex eOrg. For a newly created edge b, if eOrg = = ..., e}, for i a = 1. e} with bOrg, which means edges e and b are connected, we get the new Org ring of edge e by doing the operation Splice[e, b], see Figure 3.9. Since Splice does not affect the Rot function, we get the correct Left ring of edge e after we construct the Org ring of edge e. In order to construct the Org rings of the edges consistently we assign the direction of the edges as follows. Let a and b be the two end points of an edge e. Let (a.x, a.y) and (b.x, b.y) be the x- and y- coordinates of the endpoints a and b. The direction of edge e Chapter 3. Topology Building 33 5 e=b (a) The Org ring of edge e. (b) The quad edge data structure of (a). Figure 3.8: The Org ring and its quad edge data structure. is from a to b if a.x < b.x or a.y < b.y if the a.x = b.x. If the direction of edge e is from a to b, we say that endpoint a of edge e is the origin of edge e and endpoint b of edge e is the destination of edge e. According to this direction assignment there are three cases that need to be handled to build up Org rings; see Figure 3.10. (a) The edge e and b have the same origin vertex, that is, bOrg = eOrg. By doing the operation Splice[e, bJ we get the new Org ring of edge e, see Figure 3.10 (a). (b) The destination vertex eDest of edge e is connected with bOrg of edge b, see Figure 3.10 (b). We know that the edge b is on the Org ring of edge eSym. We simply do the operation Splice[eSym, b] and the Org ring of edge eSym is constructed correctly. By duality, the Org ring of edge e is constructed correctly. (c) The edge b and edge e have the same destination vertex eDest of edge e, see Figure 3.10 (c). In this case we find that both edge e and edge b are on the Org ring of edge eSym. By doing the Splice operation on eSym and bSym, that is, Chapter 3. Topology Building 34 a3 /4 a3 (4 5 a 6 e=a eOrg 2 a I Splice 1 a eOrg 6 e=a cj2 1 a Figure 3.9: The result of Splice[e, bj. b (a) (b) (c) Figure 3.10: The connection cases. Splice[eSym, bSym], we get the Org ring of edge eSym at this vertex. By constructing the Org ring of the edges step by step we get the entire quad edge data structure of the connected components. The following section will give the detailed description of the algorithm. 3.4.2 Algorithm Any face of a connected component is enclosed by a chain of edges. In the quad edge data structure this chain of edges satisfies Equation 3.1. In another words, the edges of Chapter 3. Topology Building 35 the chain are within the same Left ring. If we get an edge e on the chain we can retrieve the face by following the Left ring of edge e in the quad edge data structure. So we define a handle of a face of a connected component to be a pointer pointing to the quad edge data structure of an edge e that is on the face. Note that there may be more than one handle for a face of a connected component because of the way we create the handles in Step 3 (a)-(2). But this will not affect the correctness. We get the same face by retrieving the Left ring of these handles. Input: A set of embedded line segments forms a connected component that has n vertices. Output: A list of handles of faces of the connected component. Stepi: Sort the vertices of all edges and create an event list for sweeping, breaking ties according to the following rules. (a) sort the x-coordinates of the vertices of edges (b) If two vertices have the same x-coordinate, the one with higher y-coordinate is sorted before the other. (c) If two vertices are the same, the one that is the destination vertex of an edge is sorted before the origin of another edge. (d) If the two vertices are the same and have the same type, such as they are the origin points or the destination points of two edges, the vertex of the edge that is on the left side of another edge is sorted before the other. Figure 3.11 shows us the order of the vertices associated with the edges. Copies of v are vertices of edge e ,e 1 ,e 2 ,e 3 4 and b ,b 1 ,b 2 ,4 3 b Their order after sorting is (e . ,e 1 ,e 2 ,e 3 ,b 4 , 1 Chapter 3. Topology Building 36 y x Figure 3.11: The order of the vertices: (ei, e ,e 2 ,e 3 , bj, b 4 ,b 2 ,b 3 ). 4 ,b 2 b ,b 3 ). Let this sorted list W be 4 zi, v , 2 ..., v,, where n is the total number of the vertices. The time of this sorting process is O(n log n). Step2: Create the first quad edge data structure fe by applying MakeEdge to the edge e associated with vertex v . Create the first handle of the face associated with edge e by 1 setting a pointer pointing to fe and mark edge e to be added. Store fe and edge e in the record SAVE. Step3: Sweep a line from left to right, taking steps at each vertex v of W. (a) If v, is the origin of edge b: We create the quad edge data structure bq of edge b by MakeEdge[b]. Mark edge b as added. Check if edge b is connected with edge e that is stored in the record SAVE. (1) If edge b is connected with edge e. If the origin of e is connected to operation Splice[e, b]. i’, the origin of edge b, then we perform the Chapter 3. Topology Building 37 If the destination of e is connected with zi, the origin of edge b, we do operation Splice[eSym, b]. (2) Else edge b is not connected with edge e. We create a handle of the quad edge data structure record of b and insert this handle into the handle list. Substitute e and fe with b and bq. (b) If v is the destination of edge b: We get the quad edge record bq of b that is created when sweeping at the origin of b. Check if edge b is connected with edge e that is stored in record SAVE. If edge b is connected with edge e then by our sorting procedure Step 1, the destination of e is connected with v, the destination of edge b. We do operation Splice[eSym, bSymj to construct the Org ring of the edges. After the Splice operation or if edge b is not connected with edge e that is stored in record SAVE, we substitute e and fe with b and bq in the record SAVE and mark the edge b as removed. The time of this algorithm is mainly spending on sorting the vertices in O(n log n). Because we do not need to maintain the vertical relations of the edges, the whole sweeping process is linear 0(n). So the time complexity of this algorithm is 0(n log n). 3.5 Conclusion In this chapter we gave a brief review on the basic concepts, the edge functions, and the topological operators of the quad edge data structure. Based upon the edge functions and the topological operators we propose a plane sweep algorithm to construct the quad edge data structure for connected components of a set of line segments. The basic concept of the algorithm is to construct the quad edge data structure by adding an edge each step Chapter 3. Topology Building 38 when sweeping across the connected components and applying the appropriate Splice operation on the connected edges. The time of the algorithm is O(n log n). Chapter 4 Connected Component Nesting As we mentioned in section 3.1, many objects in GIS are connected components instead of simple polygons and there are many other applications that need the nesting structure among connected components. In Chapter 2 we have studied the polygon nesting prob lem. In this chapter we describe the connected component nesting problem, and propose an algorithm to extract the nesting structure of connected components. Introduction 4.1 Unlike the polygon nesting problem for which the topological structure of the input data is given, we assume that no topological structure is given in the input data in the connected component nesting problem. Because of our application background we restrict our attention to straight-line em beddings of planar graphs. Each connected component is called a region and each face that is bounded by a closed chain of edges and vertices, is called a subregion of the region. Problem. Let C be a set of m connected components C in a straight-line planar graph, for i = 1,2, ..., m; Let SC be the set of k subregions SC, for j = 1,2, ..., k; We define ancestor(C:) to be the set of connected components that have a subregion SC containing C. The component k 0 is called the parent of G if ancestor(Ck) 39 = ancestor(C) — Ck. If Chapter 4. Connected Component Nesting 40 G: Ti: T2: 3 n Sfl 13 Ci 715 4 n (b) The nesting structure of the components (a) The connected components Figure 4.1: The connected components and the nesting structure. ) 1 ancestor(C = 1 is null. Any connected component whose 0 we say that the parent of C parent is Ck is called the child of Ck; see Figure 4.1. As in the polygon nesting problem, the nesting structure C of C is an acyclic directed graph (a forest of trees) in which there is a node ri, corresponding to each connected component C in C, and there is a node .snjj, corresponding to each subregion SC 1 in the connected component C . There is 1 a directed edge from a node n 1 to a node Sflkj if and only if a subregion SCk of Ck contains the region C 1 and Ck is the parent of C . The undirected edges of Figure 4.1 (b) 1 represent the regions with their subregions. The connected components nesting problem is to compute the nesting structure of a set of connected components. Figure 4.1 (b) shows the nesting structure forest of the connected components. A region is represented by a region number and the leftmost point of the connected component. The subregion is represented by a subregion number and the leftmost point of the subregion. The region data structure is typedef struct region { Chapter 4. Connected Component Nesting (b) (a) 41 (c) (d) Figure 4.2: The left-edge e 1 and right-edge e . 2 mt /* the region number assigned */ *num; /* the leftmost point of this region *1 Point *leftmost; } Region; The subregion data structures is typedef struct subregion mt *num; Point *leftmost; } { 1* the subregion number assigned *1 1* the leftmost point of this subregion *1 SubRegion; Definitions Let v 1 and v 2 be the two vertices of an edge e. We say that u 1 is the begin point of e if the x-coordinate of vj is less than the x-coordinate of v ; or if v 2 2 have 1 and v the same x-coordinate but v 1 has smaller y-coordinate. Then the other vertex v 2 is called the end point of e. We say that an edge e is on the region R of a connected component C if e is an edge of which C consists. Each edge e is on the boundary of two regions or two subregions. We call these two regions or subregions the left-region and right-region of e according to the direction of e that is from begin point to end point. The edge data Chapter 4. Connected Component Nesting 42 structure consists of two fields as follows. typedef struct edge } { Point *begin; 1* the begin point of this edge *1 Point *end; 1* the end point of this edge *1 EDGE; Let v be a begin point of edge e. We say that edge e is a left-edge if no edge that takes i’ as a begin point lies to the left of e. And edge e is a right-edge if no edge that takes ii as a begin point lies to the right of e. The edge e in Figure 4.2 (a) is both a left-edge and right-edge. The edge e 1 is left-edge and e 2 is a right-edge in Figure 4.2 (b). This definition is the same when the vertex ii is an end point of edge e, see Figure 4.2 (c) and (d). 4.2 4.2.1 Disjoint Sets and Notches Disjoint Sets Because the sweep algorithm transforms a static two dimension problem into a dynamic one dimension problem, we create disjoint sets of regions, subregions and, later, nest ing sets before we know their connection information. When connection information is discovered during the sweep, these disjoint sets are united together to represent the con nected components. In Figure 4.3 (a) we do not know any connection information of the regions and subregions, such as regions R2 and R7 and subregions sr2l and sr7l, before the sweep line reaches the vertex r. Before reaching vertex r we create a set to represent each possible region R or subregion sr . Depending on the information that we have 3 at this stage, we also create a list of nesting sets, such as nest 3 1 which indicates nest Chapter 4. Connected Component Nesting that R 5 1 and nest 3 is nested within R c 43 2 which indicates that R nest 2 nests R , etc. 5 After the sweep line reaches vertex r, we know that R 1 is connected with R 3 and R 2 is connected with R . According to the connection information R 7 1 is united with R 3 to be one region and R 2 is united with R 7 to be one region. The associated subregions and the nesting sets of the united regions are ullited accordingly. After the union operation, the regions, subregions, and nesting sets are shown in Figure 4.3 (b). R2 L sweeping direction (a) The region and subregion sets at point v (b) The region and subregion sets at point r Figure 4.3: The idea of using disjoint sets to extract the nesting structure. The nesting sets created with the region and subregion sets are not shown in Figure 4.3 because the nesting sets are described in detail following. The nesting set is represented by a tree with pointers from child to parent, and each nesting set consists of the region and subregion information of the coilnected component and has a pointer parent pointing to its parent nesting set, and a flag indicating the type of this set because the union operation of the nesting set may change the set’s type. If the region or the subregion that the set represents is truly a parent of its children sets, the flag is set to be 0, which Chapter 4. Connected Component Nesting 44 means this set is the region representative set, or to 2, which means this set is a subregion representative set. Otherwise is set to be 1, indicating that this set is united with other set and the parent of this set will be the parent of this set’s children. The nest data structure is typedef struct nest mt } { *f lag; /* the set property flag = 0, 1, or 2 */ struct nest *parent; 1* the parent of this set *1 Region *region; 1* the region on which this set is SubRegion *subregion; 1* the subregion on which this set is */ Nest; Two disjoint-set operations are used. Make_set creates the disjoint region, subregion, and nesting sets. Union unites two region set, or subregion sets, or nesting sets and make them consistent with the connection information among these sets. How to extract the nesting structure from connected components by the two disjointset operations will be discussed in detail in Section 4.3.3. 4.2.2 Notches Because connected components have more complicated structure than simple polygons, a new notch definition is needed. Definition 4.1 A vertex v is a notch of C (a) if vertex v is taken as a begin point by all the edges that intersect at vertex zi; (b) if vertex ii, is taken as an end point by all the edges that intersect at vertex vj; Chapter 4. Connected Component Nesting 45 V V (a) (c) (b) Figure 4.4: Vertex v is a notch. (c) if there are at least three edges are connected through v. In all these parts of Figure 4.4, v is a notch. Notice that in cases (a) and (b) the number of edges may be only one. We can see that between any two neighboring notches (vi, vi), the sequence of vertices (vj, vj , 1 ..., i’) is a x-monotone chain, we call these x-monotone chains simple lines; see Figure 4.5. The simple line is represented by a list of edges. The data structure is typedef struct line } { EDGE *edge; 1* one edge of this simple line *1 struct line *next; 1* the pointer to the next edge *1 LINE; We do not stop only at the notches such as in Section 2.2.2 in our sweep algorithm. Because we have assumed that the input data does not give us any connection information of the edges, we get nesting information simultaneously with the connection information. If by having the edge connection information we can spend less than O(N log N) time to get the notches and simple lines, where N is the number of notches of connected Chapter 4. Connected Component Nesting 46 8 S 2 V y Vi S 10 3 V x Figure 4.5: , 1 i” ..., v are notches and ..., i2 are simple lines. components, it maybe is worthwhile to preprocess the input data and use notches instead of all vertices in the plane sweeping algorithm. Because all the edges of a simple line have the same region, subregion, and nesting set, one can make simple modifications to our algorithm to substitute vertices and edges with notches and simple lines while sweeping to increase the efficiency of the algorithm because the all the edges of a simple line have the same region, subregion, and nesting set. For example, after we build the quad-edge data structure for the connected components, we can separate each face and reduce the connected component nesting problem to the polygon nesting problem. But we have more efficient way to extract the nesting structure from the connected components. We can get the nesting structure of the connected components while we build the quad edge data structure from them. Because no edge connection information is available at this moment, our algorithm can work with only vertices and edges. Chapter 4. Connected Component Nesting 4.3 47 Plane sweep As in Section 2.2.2, we sweep a line L through all the connected components, while maintaining the ordering 0 of the edges induced by L. The difference is that we also create and maintain a list of disjoint dynamic sets of regions, subregions, and nesting sets, for extracting the nesting structure of the connected components. To maintain this ordering and the disjoint sets we stop at endpoints of each edge of the connected components, while sweeping from left to right. When sweeping from left to right, the sweepline L encounters each vertex in the input data once. Depending on whether the vertex is a begin point or an end point of an edge, new region, subregion, and nesting sets may be created or existing region, subregion, and nesting sets may be united. For a begin point of an edge we update the ordering 0, region, subregion, and nesting sets in the following ways. 4.3.1 Update at the Begin Point v of Edge e There are six cases to be handled when the sweep algorithm encounters the begin point v of edge e. Case (a): v is a vertex such that no edge connected to e through v has been encountered. A new region Re is created for e. If more than one edge takes z.’ as the begin point and e is not a right-edge, (which means that there is at least one edge that takes v as a begin point and lies at the right of edge e), a subregion SRe is created for e. We insert e into ordering 0 on L by a simple binary search. Simultaneously, we get the above and below neighbor edges of e, say a and b. (The “above neighbor” of e is the lowest edge that is above e in above(v) and the “below neighbor” of e is the highest edge that is below e in above(vj, see Figure 4.6 (a).) A new nesting set neste is created by assigning Chapter 4. Connected Component Nesting 48 Re to neste. a b L L (a) (d) Figure 4.6: Vertex If a’s region 1’ L (b) (c) L L (e) (0 is a begin point of edge e. b’s region then we know that a’s right-region must equal b’s left-region. We take the nesting set nesta, representing the subregion SRa, to be the parent of Re by creating a pointer from nests to nesta. We also assign SRe or null to the subregion field of nest, and set the flag of nest to be 0. The left-region of edge e is set to be the right-region of edge a and the right-region of edge e is set to be the left-region of edge b. Now we store nests, Re, along with the edge e in the ordering 0. If a’s region way. Let Xa b’s region, we detect the parent nesting set of nestb in the following be the x-coordinate of the leftmost point of Ra and xb be the x-coordinate of the leftmost point of Rb. We assume that the neste Xe xb. We take the parent nesting set of as the parent of nests. Because there are no other edges between a and e and Chapter 4. Connected Component Nesting 49 ...... . Ia .. ....... IeA Ie .... . a.. .. . V’’c •.. V I Lb I L.b e } V ., I .. (c) (b) (a) Figure 4.7: The nesting cases. b and e there are oniy three situations under this condition. The first one is that and nestb is that have the same parent. The second one is that nestb nests nesta. nesta nest a nests nest&. The third one Figure 4.7 shows these three situations. After detecting the parent nesting set we assign the appropriate region and subregion data to the edge and nesting set of Case (b): ii: e. Then we continue sweeping forward. is a vertex such that edge e is connected with its above neighbor edge f that takes v as a begin point, see Figure 4.6 (b). (1) If e from is not a right-edge. Then e inherits the region and the parent information f. Because edge e is on the boundary of a new subregion we create a new subregion SR for e and set SRe to be the right region of e. According to their region, subregion, and nesting set information we create a new nesting set neste for e and the set flag is set to be 2. (2) If e is a right-edge then e inherits the region and the parent information from f. Chapter 4. Connected Component Nesting 50 Because e is on the boundary of left region of left-edge g we set e’s right-region = g’s left-region. With these region, subregion, and nesting set data we create a new nesting set neste for e but no new subregion set is created. Case (c): v is a vertex such that edge e is connected to only one edge f and f takes v, as its end point; see Figure 4.6 (c). This is the simplest case. The edge e simply inherits all the region, subregions, and nesting set from f. There is no new set created. Case (d): v is a vertex such that edge e is connected with an edge its end point; edge e is a left edge, and edge f f that takes v as is the only edge that takes i’ as its end point, see Figure 4.6 (d). Edge e inherits the region and the nesting set information from edge f. Now edge e and edge f have the same left-region, but different right-regions because e is on a new subregion boundary. Then we create a new subregion for e and set the left-region of e to be the new subregion. From the region, subregion, and the nesting set we create a new nesting set neste for e and set the flag of neste to be 2. Case (e): v is a vertex such that edge e is connected with an edge as its end point; edge f f that takes v is a right edge, and edge e is the only edge that takes ii as its begin point, see Figure 4.6 (e). Edge e inherits the region and the nesting set from But e and f have different f. left-regions because e has the same left-region as that g has. Because no new region and subregion are encountered we simply let edge e carry region, left-region, right-region, and nesting set data. Case (f): v is a vertex such that edge e is connected with a edge end point; edge f f that takes z as its is a right-edge, and edge e is a left-edge, see Figure 4.6 (f). In this case, edge e inherits only the region and nesting set from f because the e’s left-region and right-region are different from those of f’s. Edge e’s left-region is the left-region of g, and right-region is a new subregion. So we create a new subregion for e and with these data we make a new nesting set nest 6 for edge e. Chapter 4. Connected Component Nesting 51 f:.: (a) (b) (c) Figure 4.8: The cases of a notch v as the end point of edge e. In order to get the four left-edges and right-edges at vertex v, which is currently encountered while sweeping, we simply keep four extra records for vertex v and update these records while sweeping. 4.3.2 Update at the End Point v,, of Edge e When the sweep encounters the end point v of an edge e, there are two major situations. In the first one, there are edges that take v’ as begin point; see Figure 4.8 (a) and (b). In the second one, no edge takes v as its begin point, see Figure 4.8 (c). In these two situations we have three cases that we handle differently. But in common we delete edge e from the ordering 0 on L by a simple binary search. Case (a): v is a vertex such that no edge connected to e through i’ has been encountered. Because e is a left-edge we get c’s region, subregion, and nesting set data; then delete e from ordering 0 on L by a simple binary search. Case (b): v is a vertex such that edge e is connected with its above neighbor edge that takes v, as a end point, see Figure 4.8 (b). f Chapter 4. Connected Component Nesting Now we know edge e and f 52 are connected and they should have a consistent region, subregion, and nesting set. If the data of edge e is not consistent with the data of edge f, we unite these sets and make them consistent, such that e and data and consistent left-region of f f have the same region and right-region of e. The union policies are as follows. (1) Suppose edge e and edge f are in the same region but the right-region of edge f and the left-region of edge e are different. Let Le be the left-region of e and Rj be the right-region of edge f. Let Xe be the x-coordinate of the leftmost point of Le and vf be the X-coordinate of the leftmost point of Rf. Without losing generality we assume that Xf Xe. Then we substitute Le with Rj. Meanwhile all the nesting sets that have taken the nesting set of Le as their parent are changed to the nesting set of Rf as their parent. We implement this by simply changing the data in the records that the pointer points to. (2) Otherwise edge e and f f have different region data. We unite the regions of e and to be one region. Let Re be the region of e and Rf be the region of and Xf Xf f. Let Xe be the X-coordinate of the leftmost point of Re and Rf respectively, and Xe. Then we substitute Re with Rf. Meanwhile all the nesting sets that take the nesting set of region R as their parent are changed to the nesting set of Rf as their parent. Then we do a union operation on the left-region of e and right-region of f with the same policy as that in (1), if the left-region of e and right-region of f are different. Case (c): v is a vertex such that edge e is connected with its above neighbor edge f that takes v, as a end point, and there is no edge that takes vj as a begin point, see Figure 4.8 (c). In this case, we not only need to make the region, nesting sets, and subregions of Chapter 4. Connected Component Nesting the edges e and f 53 consistent, but also need to ensure that the right-region of e must be consistent with the left-region of the left-edge g, You may notice that g may equal there is no other edge between g and e. After we unite f and f if e as in Case (b), we unite the left-region of g with the right-region of e by the same policy used in Case (b). Then the nesting sets are united according to the region and the subregion data. 4.3.3 Getting Nesting Structure The correct number of regions, subregions, and the nesting structure of the connected components are determined by the topological structure of the input data. For instance, the region number should be equal to the number of connected components. In this section, we prove that our algorithm determines these quantities correctly. Lemma 4.1 The two disjoint-set operations, Make_set and Union preserve the number of connected components of the input data. Proof: The Make_set operation creates at least one region set for each connected component. Let C be a connected component. Let ip be the left-most point of C. Suppose that k (k 1) edges take ip as a begin point. When sweepline L encounters ip, it is the case (a) in Section 4.3.1 to be handled. Then a region set is created. Let R , R 1 , 2 ..., R, (t > 1), be the region sets created by Make_set for connected component C, while sweeping crossing C. For t = 1, no Union operation is applied to the region set R 1 from the analysis in Section 4.3.1 and Section 4.3.2. Then the lemma holds. For t > 1, there is at least one path from R to R, i j. On the path there is one vertex v that the two region sets R and R 3 meet. The vertex v must be in the case (b) and (c) in Section 4.3.2. While sweepline encounters vertex v the region sets, R and. R, Chapter 4. Connected Component Nesting 54 are united to be one region set according the union policies in Section 4.3.2. Finally, all the region sets, R , R 1 , 2 ..., R, are united to be one region set. Because no path connects the region sets of different connected components, no Union operation is applied to them. So the region sets of different connected com ponents will not be changed. Hence, there is only one region set for each connected component to represent the connected component. Then the lemma holds. D Lemma 4.2 The two disjoint-set operations, Make_set and Union preserve the correct number of the faces of connected components of the input data. Proof: Let f, •.., fk, (k > 1), be the faces of connected component C. The same as the proof of lemma 4.1, Make_set creates at least one subregion set for each face in a connected component. Let S , 1 ..., S, (t k), be the subregion sets created for connected component C. These subregions may belong to different region sets created by Make_set for connected component C. Each face is enclosed by a chain of edges and vertices. If more than one subregion sets are created for a face, there is a path that connects them. Suppose that two subregions S, and Sj meet at vertex v. The vertex v must be in the case (b) and (c) in Section 4.3.2. If S and S are in the same region set, then S and S are united to be one subregion set, say S, directly. If S, and S are in different region sets, the region sets are united first, then S and S are united to be one subregion set. Hence the subregion sets on the same face are united to be one subregion set. For the subregion sets of the different faces, no Union operation is applied on them because the Union consistent policies. For the subregion sets of different connected components, no Union operation is applied to them because no path connects these subregion sets. Finally, we have one subregion set for each face of connected components. D Chapter 4. Connected Component Nesting 55 Lemma 4.3 The two disjoint-set operations, Make.set and Union, preserve the correct structure of each connected component and its faces. Proof: From the analysis in Section 4.3.1 and Section 4.3.2, a nesting set with flag 0 is created when a region set is created and a nesting set with flag 2 is created when a subregion set is created. By combining lemma 4.1 and lemma 4.2, the nesting sets with flag 0 are united to be one nesting set when the region sets are united to be one set. Like the subregion sets, the nesting sets with flag 2, which represent the faces of the connected component, are united to be the nesting sets that each set represents a face of the connected component. Finally, we have one nesting set with flag 0 to represent the connected component and exactly the same number of nesting sets with flag 2 to represent the faces of the connected component. D From the analysis in Section 4.3.2, the union of two nesting sets n and n 3 to be one nesting set n is accomplished by marking the flag of nesting set n 3 to be 1 and setting the parent to nesting set n. We call a parent pointer of n 3 that points to n a direct edge from n 3 to n. Then between a parent nesting set and a child nesting set is a sequence of direct edges. We call this sequence of direct edges is an edge chain. The edge chain may have one direct edge. We call the nesting set between two consecutive direct edges intermediate node. The flag of a nesting set between two consecutive direct edges is 1 because intermediate nodes have been united. The flags of the two end nesting sets of the sequence of direct edges are marked either 0 or 2. Lemma 4.4 Let E be an edge chain that connects two nesting sets snt, (t 1) be the intermediate nodes. Let f Sj and .sj. Let sn , 1 be the face associated with s. Let C , 1 C and C 3 be the connected components associated with nesting sets sn , 1 s. Then the connected components C , 1 ..., C and C 3 are nested in the face ..., f. snt, and Chapter 4. Connected Component Nesting 56 Proof: Suppose this lemma is not true. Now let us check the relation between .s and . From the assumption we have that C 1 sn 1 is outside of f but the parent nesting set of 1 is s. According to the region and subregion consistency policy in Section 4.3.1, the sn region or subregion associated with s is outside of f. Then the subregion associated with f is united with its subregion representative and similarly s is united with the nesting set of its subregion representative. Thus, the flag of .s is set to be 2 which contradicts the given condition 1. This contradiction proves that C 1 is nested in Let C , 1 ..., k1 0 be the connected components nested in nected component not nested in , 1 Ck from C vertex , ..., and Ck be the first con in the edge chain. Clearly, the edges of f it is not possible to detect any of the connected components Ci, the parent of °k separate be the left most point of Ck. When sweepline encounters Let Ck_1. f f f. Ck_1 as ..., because the parent detecting is based on the neighbor edges of as in Section 4.3.1. Similarly, it is impossible to detect any faces of the connected components , 1 C ..., Ck_1 as the parent of the subregions in Ck. So it is impossible to set a direct edge from snk to f. k—1• Hence, the connected component Ck must be nested by the face The same kind of analysis can be applied to the connected component j 0 with nesting set .s,. Then all the connected components associated with sn , 1 are nested in the face f. associated ..., sn, and 0 Theorem 4.5 The tree of nesting sets records the nesting structure of the connected componeilts correctly. Proof: Let C , C 1 , 2 ..., Cm, for m 2, be a set of nested connected components. Accord ing to lemma 4.3, there are m nesting sets to represent the m connected components. Let n , 1 712, . .•, m be the nesting sets that represent the connected components in the 71 nesting tree. Let sn , 1 ..., snj, for i = 1,..., m, be the nesting sets that represent the faces of the connected components. Chapter 4. Connected Component Nesting 57 Figure 4.9: The parent of C 3 can not be detected directly. First, we prove that if C is the parent of C, i and an edge chain from n 3 to ik 5 j, then there is a nesting set in the nesting tree, where STik represents the face that subregion set Sjj represents in the connected component C. Because C 3 is nested in C, there is a face set of f be Sik and the nesting set of f be f ik• 8 of C that nests C . Let the subregion 3 Let lp 3 be the left most point of C . 3 When sweepline L encounters lp, the nesting set n 3 is created. According to the parent detecting policies in Section 4.3.1, if we detect the parent of C 3 directly, then the parent pointer is set from n 3 to snk. Otherwise, the parent of C detected at vertex lp 3 is not the subregion of C, but another connected component, say a subregion S of a region R . Let sn and nj 1 be the nesting sets associated with S and R . Because R 1 1 and S are not the parent of C, the subregion set S, and the nesting sets sn will be united with another subregion set and nesting set when sweepline encounters a certain vertex, say r. For example, in Chapter 4. Connected Component Nesting 58 2 s Sfl5 Sfl2 Sfllk Figure 4.10: The direct edge chain in the tree of nesting sets. Figure 4.9, the subregion set S 1 is S , the region set R 5 1 is R . For the region set R 3 1 may be united with another region set or may not, depending on the connecting structure of the connected component. In Figure 4.9, there is no region-set union operation at vertex r but the region set R . Subregion sets that 1 2 will be united to region set R 1 at vertex r are detected nesting C 3 will be united with another subregion set according 3 at vertex lp to the region and subregion consistency policies in Section 4.3.2. Let Sk and Rk be the subregion set and region set that S 1 and R 1 are united with. After the Union operation on Sk and S and Rh and R , the nesting sets are united 1 accordingly. After nesting set sn 1 is united with .snk, and the parent pointer is set pointing to nesting set nesting set k 5 srij has flag set to be 1 associated with subregion Sk. If the region set R 1 is united with region set Rk, the nesting set ni is united with nesting set nk. After the Union operation, nesting set pointer pointing nesting set nk. ni If the parent set of has flag set to be 1 and the parent .snk and nk is the true parent of C, we are done. Otherwise, we can use the same analysis as above to find the direct edge chain that connects C, with C in the tree of nesting sets. The final edge chain found in the tree of nesting sets shows in Figure 4.10. Now we prove that if an edge chain in the tree of the nesting sets that connects nesting sets from n, to .snlk, then the subregion Sik that jk 8 is associated with nests with the region R, that n 3 is associated with. By the lemma 4.4 it is clear. D Chapter 4. Connected Component Nesting 59 The edge chain of the Figure 4.9 is given in Figure 4.10, where the square boxes are intermediate nodes, the circles are end nodes of the edge chain, and the triangles are the subtrees under the nodes. For example, the subtrees under the intermediate nodes take the face f as their parent; the subtree under the end nodes take the end nodes as their parents. Corollary 4.6 The forest of the nesting sets records the nesting structure of the con nected components correctly. 4.4 Algorithm In section 4.2 we need the vertex type to indicate whether a vertex is a begin or an end point of an edge, the edge that associated with the vertex, and the number of edges that take the vertex as begin point and the number of edges that take the vertex as an end point when we sweep crossing the data. In order to have these data, we create an event data structure to store these data and linked through a list. The vertex type is indicated by 0 or 1 if the vertex is a begin point or an end point. The structure is typedef struct element mt vertype; { /* the vertex type */ EDGE *edge; /* the simple line associated with the notch */ mt /* the number of edges that take the notch left; as a begin point */ mt right; /* the number of edges that take the notch as a end point */ } ELEMENT; Chapter 4. Connected Component Nesting 60 These elements are linked by a list. In the following algorithm one can substitute the vertices and edges with notches and simple lines if the time of getting the notches and simple lines is less than O(Nlog N), where N is the total notches in the connected components. Algorithm Input: A set of embedded line segments that form a set of connected components that have total n vertices. Output: A directed acyclic graph G, called the nesting structure. Stepi: Sort all the vertex of the edges and create the event list for sweeping as described in section 3.4.2 Stepi. Edge order in the event list are (ei, e ,e 2 ,e 3 , b, b 4 ,b 2 ,b 3 ), (see Figure 3.11). Let this 4 sorted element list SW be s, s , 2 ..., s, where n is the total number of the vertices. Step2: Create the first region, subregion, and nesting set with parent equal to null for the leftmost vertex s of SW, and insert the edge e associated with s into ordering 0. Notice 0 is initially empty. Step3: Sweep a line from left to right, taking steps at each event element s of SW. (a) If s is a begin point of edge e. We update the region, subregion, and nesting set according to the policies in section 4.3.1. If using notches and simple lines we need to detect the position of s 1 with respect to the simple lines intersected by sweep line. For this, carry out a binary search in the ordering 0 of these simple lines. To detect the position of s with respect to a simple line S during binary search, find the edge e 1 of this simple line kept in 0 and then follow the linked list of edges e ,e 1 , 2 ..., ek Chapter 4. Connected Component Nesting (a) 61 (b) Figure 4.11: Path compression during the operation find parent. until the edge ek is found which intersects L. Insert the simple line associated with s in 0. (b) If .s, is a end point of the last edge e associated with the simple line that s is on. We do the union operation on the region, subregion, and nesting sets according to the policies in section 4.3.2. Then we delete the edge (or simple line associated with s) from ordering 0 by a simple binary search. Step4: Retrieve the nesting structure from the nesting sets. To retrieve the nesting structure is to find the parent of each nesting set. Notice this may repeatedly report one nesting relation among some nesting sets. But this will not affect the correct results. We represent the nesting sets by a linked-list. During find parent operation we use path compression, that is, make each nesting set pointing directly to its parent nesting set on the way of finding a set’s parent. This operation is shown in Figure 4.11. Figure 4.11 (a) shows a tree representing Chapter 4. Connected Component Nesting 62 a set prior to find parent. Triangles represent subtrees whose parents are the node shown. Each node represents a nesting set that has a pointer to its parent. The nodes b, c, d, e are united sets and their set flag is 1. Figure 4.11 (b) shows the same sets after the operation find parent. Each node on the find path now points directly to the root. From the results of Tarjan [15], the total time of n find parent operations on the nesting sets is O(na(n, n)), where c(n, n) is a very slowly growing inverse of Ackermann’s function. Theorem 4.7 The problem of connected components nesting for m connected compo nents can be solved in 0(n log n) time where n is the total number of vertices in m connected components. Proof. We sorting the n vertices is 0(n log n) time. Updating at the n vertices takes maximum O(n log n) time. Retrieving the nesting structure from the nesting sets needs O(na(n,n)) O(nlogn). Hence, total time spent is O(nlogn). D Corollary 4.8 If the notches and simple lines of the connected components can be obtained in 0(n) time, the problem of connected components nesting for m connected components can be solved in 0(N log N + Na(N, N) + n) = 0(N log N + n) time, where N is the total number of notches of the m connected components. 4.5 Conclusion In this chapter we discussed the connected components nesting problem, and proposed one algorithm to solve it. Clearly our algorithm for the connected components nesting problem can solve the simple polygon nesting problem with the time of 0(Nlog N + n). Chapter 4. Connected Component Nesting 63 The simple polygon nesting problem is oniy a simplified case in the problem we solved. We have generalized the nesting problem to a broader class inducing the nesting of connected components. Chapter 5 Generating Random Monotone Polygons 5.1 Introduction This chapter details some results that we have obtained in our study of generating random polygons. In particular, we describe an algorithm for generating x-monotone polygons uniformly at random. The remainder of this section provides motivation for this research and a detailed description of this problem. In Section 5.2, we give the general notation and definitions of our algorithm. In Section 5.3, we present our monotone polygon generating algorithm with the counting procedure and generating procedures. In Section 5.4 we prove that our algorithm can generate monotone polygons uniformly at random. In Section 5.5, we give the visibility computing procedures with the correctness proofs. In Section 5.6 we analyze the time and space complexity of our algorithm. A summary of our results and related open problems are presented in Section 5.7. 5.1.1 Motivation As well as being of theoretical interest, the generation of random geometric objects has applications that include the testing and verifying the time complexity for computational geometry algorithms. Algorithm Testing: The most direct use for a stream of geometric objects generated 64 Chapter 5. Generating Random Monotone Polygons 65 at random is for testing computational geometry algorithms. We can test such algorithms in two ways. The first involves the construction of geometric objects that the implementer considers difficult cases for the algorithm. For example, our polygon-nesting algorithm, based on a plane sweep, may require special case code for some polygons. It is important to make those polygons candidates for exposing errors of the algorithm. The second approach to testing involves executing the algorithm on a large set of geometric objects generated at random. Intuitively, we expect errors to be exposed if enough different valid inputs are applied to the algorithm. Verification of Average Time Complexity: In implementation-oriented compu tational geometry research, we are often given the problem of verifying that an imple mentation of an algorithm achieves the stated algorithm time complexity. This is done by timing the execution of the algorithm for various inputs of different sizes. There are many possible inputs of any given size, and the choice is important, since an algorithm may take more time on some inputs than others of the same size. If an average exe cution time is computed over a set of randomly generated objects of a given size, the relationship between time and problem size will typically follow a curve corresponding to its average time complexity. We can then check this complexity against the stated algorithm’s complexity. Research, such as Epstein [4], has been done on generating geometric objects at random. In this thesis we give an algorithm that generates random monotone polygons uniformly at random. 5.1.2 Let S,, Problem = {Si, 82, ..., Sn} be a set of n arbitrary points. In this chapter, we assume that the x-coordinates of the points in the point set n 5 are distinct. We want to generate a Chapter 5. Generating Random Monotone Polygons 66 simple polygon with vertex set Sj,,. at random. At this beginning stage we only consider generating a monotone polygon from S. Figure 5.1 shows a monotone polygon generated from a set of 12 points. 12 Figure 5.1: A monotone polygon generated from S 12 ) algorithm to generate triangulations of a given simple 4 In [4] Epstein gives an 0(n polygon at random. His algorithm, although it does not generate simple polygons at random, inspires us in constructing our algorithms for generating monotone polygons at random. In Section 5.3, we will give an algorithm that generates a monotone polygon randomly on a set of n points in 0(K) time and in 0(n) space, where n K 2 is the total n number of above-visible and below-visible point-pairs (see Section 5.2 for definitions) in the point set. In related work, Meijer and Rappaport [12] study monotone traveling salesmen tours and show that the number of x-monotone polygons on n vertices is between and (/g)(_2) /g)(3)/2 1 + 2 ( Mitchell and Sundaram [13] have independently developed a routine to generate random monotone polygons in 0(n) space and 0(n ) time. 2 5.2 Preliminaries Notation. We refer to a probability space as (1, E, Pr), where Q is the sample space, E Chapter 5. Generating Random Monotone Polygons 67 is the event space, and Fr is the probability function. The sample space is the set of all elementary events that are the possible outcomes of the experiment being described. The event space E is the set of all subsets of Fr : E —* that are assigned a probability. The function j defines the probability of events. A geometric object generator is an algorithm that produces a stream of geometric objects of a given type. We say that a generator is complete if it can produce every object in a given sample space . The Uniform Probability Distributions. Probability theory defines both discrete and continuous uniform probability distributions. We are interested only in the discrete case: the discrete uniform probability space for a finite sample space Q is defined as u, Eu, Pro), where E is the set of all subsets of u, and Pr(A) = 1/uI for all 1 ( A E flu. In other words, in a finite sample space, a uniform distribution is one in which each elementary event is equally likely. Since the sample space we deal with is finite, we use the discrete uniform probability distribution. A monotone polygon generator is called uniform if each of the monotone polygons has the same probability of being generated. Definitions. Let S.,,. = {Si, 2, ..., Sn} be a set of n arbitrary points and if i < s.x < s .x. Hereafter Sn is referred to as {1, 2, 3 ..., n}. Let S = {1, 2, ..., i} for 1 j then i n. The total number of monotone polygons that can be generated with vertex set S is denoted as N(i). Any monotone polygon constructed from S.j can be divided into two monotone chains of which the leftmost vertex is 1 and rightmost vertex is i. In Figure 5.2 the top monotone chain is {1, 2, 3, 6, 7, 11, 12} and bottom monotone chain is {1,4,5,8,1O,12}. Any point in S is either on the top or bottom chain, except 1 and i are on both chains because Chapter 5. Generating Random Monotone Polygons 68 they are the beginning and ending points of the chains. 11 2 10 Figure 5.2: The top and bottom monotone chains Let T(i) be the set of monotone polygons that are generated from S with the edge (i — 1, i) on their top chains. Let B(i) be the set of monotone polygons that are generated from S with the edge (i — i,i) on their bottom chains. We define TN(i) = the total number of monotone polygons included in T(i) and BN(i) = IT(iH to be IB(i) to be the total number of monotone polygons included in B(i). Let l(j, i) be the line determined by j and i. Now we define above-visible or below- visible for a point. We say that a point k is above-visible from i if k is above all l(j, i), for j =i j= — 1, ..., k — 1. And a point k is below-visible from i if k is below all l(j, i), for i—i,..., k—i. For example, in Figure 5.3, 10 is above-visible from 12, and {9, 7} are below-visible from 12. Let 4(i) be the set of all the points that are above-visible from point i. Let Vb(i) be the set of all the points that are below-visible from point i. For example, in Figure 5.3, the V(i2) = {10} and V (12) = {9, 7}. 6 5.3 Generating Monotone Polygons at Random We have two steps to generate monotone polygons randomly from S,.. The first one is to calculate the number of monotone polygons that can be generated from S,. Then we Chapter 5. Generating Random Monotone Polygons 69 Figure 5.3: The above-visible and below-visible points from point {12} scan S. backward to generate monotone polygons. 5.3.1 Counting Monotone Polygons Before we give the procedure to count monotone polygons we prove several theorems to build up the theoretical background. Lemma 5.1 The set of monotone polygons that are generated from Sk with edge (k—i, k) on their top chains is disjoint from the set of monotone polygons that are generated from Sk with edge (k — 1, k) on their bottom chains. That is T(k)flB(k)=ø. Proof. Clearly, there is no polygon in T(k) that could include edge (k bottom chain. And there is no polygon in B(k) that could include edge (k top chain. SoT(k)flB(k)=Ø. D From this lemma we get the following result. — — 1, k) in its 1, k) in its 70 otone Polygons on M m do an R Generating Chapter 5. ns monotone polygo of r be m nu e th ith k > 2, vertex set Sk, w y an or F 2 5. Theorem is ith vertex set Sk (5.1) generated w ) (k N B N(k) TN(k) + = n we know ted from Sk. The ra ne ge is at th one polygon or on P be any monot et L eans P T(k), m ch hi w , F of Proof. the top chain 1, k) is either on either TN(k) or (k by ge d ed te un e co th is P that (k). Thus, ch means P E B hi w P of D n ai ch N(k) + BN(k). T the bottom ) (k N ve ha a 5.1, we ording to Lemm cc A hai ). (k N B and bottom c n ai ch p to s it rated from Sk, e polygon gene on ot e bottoiz-on m e pl m p chain or on th to e th For any si on er th 1, k) is ei k), ther k. The edge (k i, to — (k 1 ge om ed fr n ai s does not cont are path r the chain that Fo n. go ly po e monoton we have the chain of the For the point j k. to ts ec nn co at for i < k 1, th t exists a poin j, - - — = — — results. from Sk. that is generated n go ly po e on ot mple mon e point Let P be any si r j < k 1, is th fo Lemma 5.3 d j, an fF o the top chain (k 1, k) is on ge ed e th If (1) om k. is below-visible fr j en th 1, is the r— k, to ts d j, for j < k connec an F of n ai ch e bottom (k 1, k) is on th ge ed e th If om k. ) (2 is above-visible fr j en th k, to ts ec ain that conn of the top ch holds.. then the lemma k om fr le ib is is below-v e prove (1). If j W e l(i, k)., f. o Pro ch that j is abov su k) i, l( ne li a e exists P. isible from k, ther the top chain of is not below-v on is i n, go ly a monotone po 1. Because P is ntradiction k < i < j polygon. This co e on ot on m e pl m a si ence P can not be below l(j, k). H . D that (1) is true at for (1). is the same as th ) (2 r fo f oo pr he T — — — — _ — — 71 Chapter 5. Generating Random Monotone Polygons Let P(j, k) be a subset of the polygon set T(k) in which the edge bottom chain, for chain} for j j Vb(k). That is, P(j, k) = T(k) fl{edge (j, k) (j, k) is on the is on the bottom (k). Now we have lemma as follows. 6 V Lemma 5.4 The number of monotone polygons in the set P(j, k) is BN(j + 1). Proof. For the monotone polygons in P(j, k), we know that points the bottom chains, and j+1,. . . is fixed. We can treat the path , j and k are on k are on the top chains. So the path ofj, k, k—1,--* j+1 j, k, k — 1,.. . ,j + 1 as an edge (j,j + 1) that is on the bottom chain. Figure 5.4 shows an example. Now we know that the number of monotone polygons in the set of P(j, k) equals the number of monotone polygons generated from 1 with the edge S (j,j + 1) on the bottom chains. Hence the lemma is true. We call the set of B(j + 1) the equivalent set for P(j, k). D Using a similar proof we have the following result. Lemma 5.5 The number of polygons in B(k)fl{edge (j,k) is on the top chain} for jEV(k) isTN(j+1). Theorem 5.6 For any point set Sk, we have TN(k) = BN(j +1) (5.2) TN(j +1) (5.3) jEV(k) BN(k) = jEVt(k) Proof. We prove formula (5.2). According to lemma 5.3, for any P E T(k), its bottom chain must use one of the points of V (k). Let 6 j be the point. Obviously Chapter 5. Generating Random Monotone Polygons 72 (a) The original monotone polygon in P(k) ,.. ‘I 1+1 j (b) The equivalent set of monotone polygons B(j + 1) Figure 5.4: The original set and its equivalent set. P E P(j, k). From lemma 5.4, we know that the number of monotone polygons in P(j, k) is BN(j+1). Then the total number of different monotone polygons is ZjEVb(k) BN(j+1). So (5.2) holds. D The proof for formula (5.3) is the same as that for formula (5.2). According to this theorem we can calculate TN and BN if we know V(k) and 4(k). Now we assume that we have 4(k) and V(k) then we can have the procedure that calculates TN and BN. The procedure is shown in Table 5.1. Chapter 5. Generating Random Monotone Polygons 73 getTNandBN(n) TN(2) = 1; BN(2) = 1; FOR 3, TO n TN(i) = 0; BN(i) = 0; FOR ALL j E V,(i) TN(i) = TN(i) + BN(j + 1); FOR ALL j E V,(i) BN(i) N(n) = = BN(i) + TN(j + 1); TN(n) + BN(n); Table 5.1: The Procedure for calculating TN and BN. After we get TN(i) and BN(i) for i = 2,. . . , n, we start to generate a monotone polygon on 5(n) at random, under the uniform distribution. The following section gives the details. 5.3.2 Generating Monotone Polygons For the general case, we give an algorithm to generate monotone polygons from S at random. Again we assume that we have Vb(k) and V(k), the below-visible and abovevisible vertices. The algorithm scans the point set S backward from the right to the left to generate monotone polygons. Table 5.2 shows the procedure for generating monotone polygons. Generate_Top, shown in Table 5.3, and Generate_Bottom, shown in Table 5.4, deal with two cases. Generate_Top deals with the case in which k — 1 is on the bottom 74 Chapter 5. Generating Random Monotone Polygons Generate PICK AN x WITHIN [1, N(n)] UNIFORMLY AT RANDOM; ADD n TO topchain; ADD n TO bottomchain; IF x <TN(n) ADD n — 1 TO top.chain; GenerateTop(n, x); ADD 1 TO bottomchain; ELSE x = x — ADD n TN(n); — 1 TO bottomchain; Generate_Bottom(n, x); ADD 1 TO topchain; END IF Table 5.2: The Procedure for generating monotone polygons. chain and k is on the top chain of the monotone polygon. In this case the undetermined points are { 1,. . . , k — 2}. Then the set of all monotone polygons that can be generated from the original set is equivalent to that from the subset S(k) with edge (k — 1, k) on the bottom chains; that is B(k). GenerateJ3ottom deals with the case in which k is on the bottom chain and k — 1 is on the top chain. In this case the set of all monotone polygons that can be generated is equivalent to T(k). These two cases are shown in Figure 5.5. Our generating algorithm combines getTNandBN and Generate together. Polygon..Generator getTNandBN(n) Generate Chapter 5. Generating Random Monotone Polygons (a) k 75 is on the top chain and its equivalent set B(k) k-i 10\ (b) k is on the bottom chain and its equivalent set T(k) Figure 5.5: The generating process. The following section will show us that our Polygon_Generator can generate mono tone polygons uniformly at random. 5.4 The Analysis of PolygonGenerator Let l(n) = ,F 1 {P ,. 2 . . , FN(n)} be the sample space of all monotone polygons with vertex set S. Then f(n) is a sample space. Each event in 2(n) is an unique monotone polygon F that can be generated from Sn. We map fl(n) to an integer set of [1,N(n)]. Each x [1,N(n)] corresponds to an unique monotone polygon F E 11(n). Now we have the following results. Chapter 5. Generating Random Monotone Polygons 76 Generate_Top(k, x) 1. IFk<2RETURN; 2. FIND THE SMALLEST i SUCH THAT i SATISFIES: x jEV(k)Aj<i BN(j + 1); 3. ADD POINT i TO bottomchain; 4. ADD ALL THE POINTS k 5. k=i+1; 6. x 7. = x (k)Aj<i 6 ZjEV — 2, k — 3,. . . , i + 1 TO topchain; BN(j + 1); Generate_Bottom(k, x) Table 5.3: The Procedure of generating top chains. Lemma 5.7 For n 2 and Vx monotone polygon P [1, TN(n)], Generate_Top generates an unique T(n) ç f(n); For n 2 and Vx’ ate_Bottom generates an unique monotone polygon Proof e [1, BN(n)], Gener E B(n) ç f1(n). We use induction on n (the size of the point set). Our base case is n Because of TN(2) = 1 and BN(2) = 1, we know that x is 1. = 2. From the procedure Generate the input of Generate_Top is that 1 and 2 are on the top chain and x = 1, and the input of Generate_Bottom is that 1 and 2 are on the bottom chain and x = 1. For this trivial base case Generate_Top and Generate_Bottom generate the correct trivial monotone polygon by simply returning to Generate. Now for all k < n, we assume that Vx E [1, TN(n)J Generate_Top generates an unique monotone polygon P e T(k) and Vx’ ates an unique monotone polygon Pi E B(k). [1, BN(n)j, Generate_Bottom gener Chapter 5. Generating Random Monotone Polygons 77 Generate_Bottom(k, x) 1. IFk<2RETURN; 2. FIND THE SMALLEST i SUCH THAT i SATISFIES: < X jeVt(k)Aj<i TN(j + 1); 3. ADD POINT i TO topchain; 4. ADD ALL THE POINTS k 5. k=i+1; 6. x = x — ZjEVb(k)Aj<iTN(i — 2, k — 3,. . . , i + 1 TO bottom.chain; + 1); END IF 7. Generate_Top(k, x); Table 5.4: The Procedure of generating bottom chains. For k = n, let x ,x 1 2 € [1, TN(n)] and P, 1,P 2 e T(n) be the monotone polygons that are generated by Generate_Top according to x 1 and x . Now we prove that if x 2 1 then P 1 . 2 P From Generate we know that 1 . Let i 2 P 1 and i 2 ii — 1 and n are on the top chains of both P 1 and 1 be the belowvisible points found in Generate_Top. There are two cases in this situation. Case 1: ii . Without loss of generality, let i 2 i 1 < i . From Generate_Top we 2 know that for F 1 is on the bottom chain and point i 2 is on the top chain. For , point i 1 P, point i 2 is on the bottom chain. This proves P 1 Case 2: i 1 TN(n) and x 2 P. . From Generate_Top we know that k 2 i = TN(n), we have = BN(j + 1) — jEV(k)Aj<il BN(ii + 1) = 1 + 1, Since i 78 Chapter 5. Generating Random Monotone Polygons and 1). BN(j+1)BN(i + 2 2 jEV(k)Aj<i x > Because of 1 1 x’ ZjEv&(k)Aj<i, 1 1. Then we have x’ 2 > BN(j + 1) and x ZjEV&(k)Aj<i, BN(j + 1), we have x, and x E [1,BN(k)] and x E [1,BN(k)]. From our assumption, GenerateJ3ottom generates two different monotone polygons P and P with edge (i 1 + 1) on the bottom chains. From lemma 5.4 and lemma 5.5, we know ,j 1 that P, P E B(k) and that B(k) is the equivalent set of P(k, k). Then we know that the part of polygons of P and polygons of P 1 1 and P . Hence P 2 without edge 1 (i ii + 1) are on the monotone , . D 2 P Using the similar proof, we can prove that for Vx’ E [1, BN(n)], GenerateJ3ottom generates an unique monotone polygon Pr’ E B(k). From this lemma we immediately get the following result. Theorem 5.8 For n 2 Generate generates monotone polygons from n) uniformly at random. Proof Generate picks an x E [1, N(n)] uniformly at random. If x TN(n) Generate calls Generate_top. If x > TN(n) Generate calls Generate_bottom. From lemma 5.7 Generate generates an unique monotone polygon P Generate is an uniform monotone polygon generator. U Corollary 5.9 Generate is complete. e fl(n). Thus, 79 Chapter 5. Generating Random Monotone Polygons Computing Visibility 5.5 The algorithms of the previous section assumed that the above-visible and below-visible sets, V(i) and Vb(i) for i = 1,. . . , n, were available. A closer look, however, shows that these sets are only needed for one index i at a time: algorithm getTNandBN needs the sets in increasing order and algorithms Generate_top and Generate_top need them in decreasing order. In this section, we show how to calculate each of the sets V (i) incrementally as i increases (In Section 5.5.1) and as i decreases (In Section 5.5.2), using time proportional to V(i) and 0(n) space to compute V(i) from V(i — 1) or V(i + 1). The idea is the following. Let Sk denote the monotone chain with vertices k 5 1, s, If we think of Sk as a fence and compute the shortest paths in the plane above Sk from k 8 to each s with i tree rooted at k 5 k, then we obtain a tree that is known as the shortest path [5, 6]. The above-visible set V(i) is exactly the set of children of the shortest path tree rooted at trees rooted at 2, . . ., 3j k. 5 k 5 in Thus, we will incrementally compute shortest path to get the above-visible sets. We represent shortest path trees (in which a node may have many children) by binary trees in which each node has pointers to its uppermost child and next sibling. Section 5.5.1 gives the details for computing these trees in the forward direction: computing V(i) from V(i — 1). Section 5.5.2 gives the details for the reverse direction: computing V(i) from V(i + 1). 5.5.1 Computing Visibility Forward We use a tree data structure to calculate V(k) and Vb(k) recursively. Assuming V(k and V&(k — — 1) 1) have been calculated, we calculate V(k) and Vb(k) according to the results 80 Chapter 5. Generating Random Monotone Polygons of V(k — 1) and Vb(k — 1). The data structure that we use in the calculation is the tree of the shortest paths rooted at vertex k. We store top_tree(i) and bot_tree(i) using child and sibling pointers. For each vertex j E [1,n], we have a record for top_tree j: j ptr ptr stores the coordinates of vertex upc upc is a pointer pointing the upper child of j in top_tree(k) sib sib is a pointer pointing the sibling of j in top_tree(k) and a record for bot_tree j: j ptr ptr stores the coordinates of vertex iwe iwe is a pointer pointing the lower child of sib sib is a pointer pointing the sibling of j j in bot_tree(k) in bot_tree(k) The initial value of top_tree for the recursive calculation is 1.ptr 1.sib = nil. We assume that top_tree(i — = 1, 1.upc = nil and 1) has been completed. Then we call Make_Vt to calculate the above-visible set, V. In order to get V, the procedure Make_Vt calls the procedure Make_top to calculate the top_tree(i). Make_Vt(i) t = tmp; Make_top(i i.upc — 1, i, Var : t); tmp.sib; Chapter 5. Generating Random Monotone Polygons Procedure Make_top(i 81 1, i, lastsib) makes the tree edge from k to — j in top_tree(k), and puts it as the sibling of lastsib and updates lastsib. Then it recursively builds the top_tree(k). One example is shown in Figure 5.6. Make_top(j, k, Var: lastsib) nil and k is above l(j.upc,j) WHILE j.upc Make_top(j.upc, k, Var: lastsib); / make subtree for this child of j, which can be seen by k. j.upc j.upc.sib; / consider next child of / j *1 END WHILE lastsib.sib = lastsib = j; / make the connection to j, one of the children of k / j; 1 4 3 Figure 5.6: A point set S 5 and the data of top_tree(5). To compute the bot_tree is similar to computing the top_tree. We need only change upc and ‘above’ in procedure Make_V(i) and Make_top into lwc and ‘below’ to get the procedures Make_V&(i) and Make_bot. We use Make_Vb(i) and Make_bot to compute bot_tree(i) from bot_tree(i — 1). One example is shown in Figure 5.7. Chapter 5. Generating Random Monotone Polygons 1 82 4 3 Figure 5.7: A point set S 5 and the data of bot_tree(5). Knowing top.iree(k) and bot _tree(k) , we know the above-visible and below-visible point sets, V(k) and ‘(k) of vertex k. Now we give the theorem to show us how to get V(k) and Vb(k) from top_tree(k) and bot_tree(k). Let r be a record in the top_tree or bot_tree. We define r.sibz integer i > 0, and r.sib° r.sib’.sib, for any = r. Now we claim that the upper child of k and its siblings are the vertices visible from k, and any vertex that is visible from k is either the upper child of k or its sibling. This is proved in the next theorem. Theorem 5.10 Let CT(k) V(k) = {k.upc.(.sib)t = CT(k) Proof — {k — 0}. Let CB(k) ‘v/i 1} and Vb(k) First we prove V(k) that is above line l(k — = = CB(k) CT(k) = {k — {k {k — — {k.lwc.(sib) 1} there is no l(i, k exists no point that is above l(k For the general situation, ‘c/j — — = {k — 1}. If V(k) 1}. Hence V(k) 1) that is below k for i 1, k). Hence 4(k) ‘v/i I 0}. We have 1}. = 0 = = 0, there is no point = 1, k). This means that there is no l(i, k From Make_top, we know that CT(k) CT(k) — — = — = CT(k) 1,. CT(k) 1) that is below k. — . . , {k V(k), we havej is above all l(i, k) for i k — — — = {k 1}. If — 2. So there 1}. j+1,. . . ,k—i Chapter 5. Generating Random Monotone Polygons 83 Back_top(k + 1, k) 1. FIND i, SUCH THAT (k + 1).upc.sib 2. FORj=OTOi 3. 3 a_ = = (k + 1).upc.sib2; IFi=ORETURN; 4. ELSEIFi1 ,. 0 Granham-Scan-Top(i, a 5. . ., ag); END IF Table 5.5: The Procedure of computing top_tree(k) from top_tree(k + 1). that implies that k is above all l(j, i) for i 0. If there is no k.upc.(sib), i j = k.upc. Otherwise, j = ’. Then we 1 j’ = k.upc.(sib) Vj E CT(k) i j = + 1,.. . , k — — {k — j’ = V(k) and j + 1,. k 1. j’ <j such that k . . , Now we prove — j = k.upc.(sib)’’. So 14(k) ç CT(k) 1}, we know j k.upc.(sib)t. Then = 1. Otherwise, there exists a point, say j’, j = is above l(j’,j) then j’.sib. Similarly this induction can be applied to have j — {k j’, — that is, 1}. is above all l(i, k), for such that j’ > j and j is below l(j’, k). Then l(j, k) is below l(j’, k) that means k is below l(j,j’). From Make_top we know that that j j can not be the format of k.upc.(sib)t, i is above all l(i, k), for i implies 14(k) CT(k) The proof for 14(k) 5.5.2 — = {k — = j + 1,. . . , k — 1. Then we have that 1}. Now we have 14(k) CB(k) — {k — 0. This contradiction proves = CT(k) — {k — j E 14(k) that 1}. D 1} is similar to the proof above. Computing Visibility backward In procedure Generate_Top and Generate_Bottom, we need to find the smallest i satisfying the expression in line 2. Here we assume that top_tree(k + 1) and bot_tree(k + Chapter 5. Generating Random Monotone Polygons Granham-Scan-Top(i, a ,. 0 . , a) / 1. Push(ao,S); 2. 1 a 3 ; 1 ao.upc=a 4. Push(ai,S); 5. FORj=2TOi = . 84 S is a stack / ao.upc; WHILE the angle formed by points NEXT-TO-Top(S), Top(S), 6. and a 3 makes nonleft turn Pop(S); 7. END WHILE 8. 3 a 9. Push(S,a); = Top(S).upc; Table 5.6: The Procedure of Graham-Scan-Top. 1) have been completed, and define procedures Back_top and Back_bot to generate top_tree(k) and bot_tree(k). Let a_ (k + 1).upc and a 0 Q = V(k + 1) — = k. Let Q = (k + l).upc.sib3, for {aj,j = 0,. . . , j = 0, 1,. . . , i. Then a = i}. From theorem 5.10, we know {k}. If we take a 0 as the origin of of coordinates, according to the above-visible definition, the points in Q are sorted lexicographically by polar angle and distance from a . Then from Graham-Scan we can get the correct top_tree(k). This is 0 similar for calculating bot_tree(k). Table 5.5 and Table 5.6 show the procedures. One example to calculate top_tree(k) from top_tree(k + 1) is shown in Figure 5.8. Similarly we have the procedure to compute bot_tree(k) from bot_tree(k+1). They are called Back_bot and Graham-Scan-Bot(i, Q). We get them simply by changing upc and ‘nonleft turn’ of Back_top and Graham-Scan-Top(i, turn’. Q) into lwc and ‘nonright Chapter 5. Generating Random Monotone Polygons 85 k+ I 1 Figure 5.8: top_tree(k) is generated from top_tree(k + 1). Now we prove that these procedures compute correct results. Back_top and Graham-Scan-Top(i, Q) correctly compute Theorem 5.11 top_tree(k) from top_tree(k + 1). Back_bot and Graham-Scan-Bot(i,Q) correctly compute bot_tree(k) from bot_tree(k + 1). Proof We prove that Back_top and Graham-Scan-Top(i, Q) correctly compute top_tree(k) from top_tree(k + 1). In Back_bot we first find the upper child of k + 1 and its siblings. In order to get top_tree(k) from top_tree(k + 1) we must cut the edges of these vertices with k + 1 and reconnect them with appropriate vertices. These points are the only points that need to be reconnected. In Graham-Scan-Top(i, Q), point k is always kept in the bottom of the stack S. For any vertex visible from k + 1, there two cases. Case 1 is that it is visible from k. Case 2 is that it is not visible. Case 1: if point j is visible from k then all the points in the stack S are popped out but k. Now we output edge (j, k) and point j is pushed into S. Now there are at least two points in the stack S. Case 2: otherwise point j is not visible from k. We know that j must be visible from Chapter 5. Generating Random Monotone Polygons a vertex in S, say edge (j,j’) and j j’. 86 Then all the points on top of j’ are popped out, and we output the is pushed into S. After we checked all the points visible from k + 1, we reconnect the points correctly. D Similarly we can that prove Back_bot and Graham-Scan-Bot(i, Q) correctly com pute bot_tree(k) from bot_tree(k + 1). Now we have all the procedures to build up our algorithm. Next we give its time and space complexity. Time and Space Complexity Analysis 5.6 Lemma 5.12 The runtime ofMake_top(k time ofMake_bot(k Proof — — 1,k,Var: t) is O(V(k)). Arid the run- 1,k,Var: t) is O(I14(k)). Because of the similarity, we only prove the runtime of Make_top(k — 1, k, Var: t) is O(V(k)I). Let us assign the following amortized costs: WHILE checking 1 updating j.upc 1 updating lastsib 1 and each time we encounter the upper child, k.upc or its sibling k.npc.sib, but excluding k — 1, we get 3 credits. Clearly from theorem 5.10, we know that the total number of Chapter 5. Generating Random Monotone Polygons k.upc and k.upc.(.sib), excluding k — 87 1, is V(k). We shall now show that we can pay any operation costs by charging the amortized costs. We start from Make_top(k—1, k, Var: t) and we have 3 credits. Clearly if j is vis ible from k, WHILE checking succeeds. From this we get 3 more credits to pass to the next call to Make_top(j, k, Var: la.stsib). Then this call receives 3 credits to pay for its own checking and updating costs. If j is not visible from k, WHILE checking fails. Then the current call to Make_top saves 1 credit for upper level Make_top to pay another WHILE checking. We know that Make_top(j, k, Var: lastsib) with 3 credits can pay their own costs and the number of total recursive calling for Make_top(j, k, Var: lastsib) is V(k). Then 3 * I4(k) will pay all the costs. So the runtime of Make_top(k 1, k,Var : t) is O([4(k)). — D Theorem 5.13 Polygon_Generator has time complexity of 0(K) and space in 0(n), where K is the total number of above-visible and below-visible points of the points in the point set. Proof From lemma 5.12 the runtime of getTNandBN is, for some constant c, (IVt(k)I + IVb(k)I) <cK = 0(K). Clearly the runtime of Back_top is O(IV(k)) and the runtime of Back_bot is O(J(k)). The time complexity of Generate depends on the time complexity of Generate_Top and Generate_Bottom. Because they have a similar structure the time complexity of Generate_Top and Generate_Bottom is the same. Let tk be the run time of Generate_Top(k, x) From line 2 to 6, the time depends on the number of above-visible and below-visible points of k. Then we have, for some constant c k tk c*(Vt(k)I+IVt(k)I+k—i)+t+i. j=i+1 Chapter 5. Generating Random Monotone Polygons 88 So t <c * ((k)j + Vb(k)I) + n c * (K + n). Hence the run time of Generate is O(n + K). Obviously, n . The time 2 K < n complexity of our Polygon_Generator is 0(n + K) + 0(K) = 0(K). In the process of generating we need only to store the point set S, top_tree(n), bot_tree(n), and TN(i) with BN(i), for i = 2,. . . , n. Since each of the data struc tures use no more than 0(n) memory space, we have that the memory space of Poly gon_Generator is 0(n). LJ 5.7 Conclusion We have presented an algorithm to generate monotone polygons uniformly at random. The time complexity of our algorithm is 0(K), where n K 2 is the number edges n of the visibility graph of the x-monotone chain whose vertices are the given n points. The space complexity of our algorithm is 0(n). We have given the detail analysis of the algorithm and the proof of its correctness. A random monotone polygon generator is useful for testing the many algorithms that accept a simple polygon or a group of simple polygons as input. We are also interested in finding a polynomial algorithm to generate general simple polygons randomly from an arbitrary set of points. There is, to our knowledge, no efficient enumeration of the simple polygons on a given vertex set. Chapter 6 Conclusions Summary In this thesis, we have briefly reviewed the basic concepts, the edge functions, and the topological operators of the quad edge data structure. Based upon the edge functions and the topological operators, we presented a plane sweep algorithm that extracts connected components of a set of line segments and captures the topology in a quad edge data structure. We have described that to construct the quad edge data structure correctly is to construct the Org rings of the edges correctly. The validity and feasibility of our sweep algorithm method are proved mathematically and tested by implementation. The time of the sweep process is linear in the total number of vertices n. The sorting time dominates the algorithm’s complexity, so its time complexity is O(n log n). For the polygon nesting problem, we reported results along two fronts. First, we redefined the idea of a notch in Bajaj and Dey’s polygon nesting algorithm to provide a direct correspondence between notches and subchains. Second, we presented a plane sweep algorithm to solve the nesting problem on a set of connected components. In the connected components nesting problem the objects are more general than Bajaj and Dey’s restriction to simple polygons. The correctness of our sweep algorithm is proved mathematically. The time complexity of our sweep algorithm is O(n log n), where n is the total number of vertices of the connected components. If the notches and simple 89 90 Chapter 6. Conclusions lines of connected components can be obtained in 0(N log N) time, we find the nesting structure of the connected components in 0(N log N+n) time, where N is the number of notches and n is the total number of vertices. The nesting structure for simple polygons is an instance of the connected components problem, so our sweep algorithm can identify the structure in 0(N log N + n) time. The algorithms for collstructing quad edge data structure, polygon nesting, and con nected components nesting are implemented in C on Sun-workstation and Silicon Graph ics machines. We examined methods for generating random polygons and presented an algorithm to generate monotone polygons uniformly at random. We are able to generate a random monotone polygon in 0(K) and 0(n) space, where n is the total number of vertices in the point set and n K 2 is the number edges of the visibility graph of the x-monotone n chain whose vertices are the given n points. A detail analysis of the algorithm and a proof of its correctness are also given. A random monotone polygon generator is useful for testing many of the algorithms that accept a simple polygon or a group of simple polygons as input, such as the polygon nesting algorithm. Future Directions It is interesting to apply the algorithms to large GIS data set. Although we have not done this testing yet, we think that it is an important step in exploring more practical problems in the GIS application area. A second interesting direction is to combine, the quad edge data structure building algorithm, the polygon nesting algorithm, and the connected component nesting algo rithm, with a GIS database system to increase the tool power and the efficiency of the GIS. There is a great potential for many interesting problems to arise in this combinillg Chapter 6. Conclusions 91 process. Finally, because there is no efficient enumeration of the simple polygons on a given vertex set, we are interested in finding a polynomial algorithm to generate general simple polygons randomly from an arbitrary set of points. Bibliography 92 Bibliography [1] C. L. Bajaj and T. Dey. Polygon nesting and robustness. Information Processing Letters, 35(1):23—32, June 1990. [2] B. G. Baumgart. A polyhedron representation for computer vision. In 1975 National Computer Conference, volume 41 of AFIPS Conference Proceedings, pages 589—569. AFIPS Press, Arlington, Va., 1975. [3] M. J. Egenhofer and J. Sharma. Topological consistency. In 5th International Sym posium on Spatial Data Handling, pages 335 343. IGU Commission on GIS, August 1992. — [4] P. Epstein and J. Sack. Generating triangulation at random. In Proceedings of the Fourth Canadian Conference on Computational Geometry, pages 305 310, 1992. — [5] L. Guibas, J. Hershberger, D. Leven, M. Sharir, and R. Tarjan. Linear time algo rithms for visibility and shortest path problems inside triangulated simple polygons. Algorithmica, 2:209—233, 1987. [6] L. J. Guibas and J. Hershberger. Optimal shortest path queries in a simple polygon. Journal of Computer and System Sciences, 39(2):126—152, October 1989. [7] L. J. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM Transactions on Graphics, 4(2):74— 123, April 1985. [8] M. R. Jerrum, L. G. Valiant, and V. V. Vazirani. Random generation of com binatorial structures from a uniform distribution. Theoretical Computer Science, 43:169—188, 1986. [9] D. G. Kirkpatrick. Establishing order in planar subdivisions. Discrete Comput. Geom., 3:267—280, 1988. [10] 11.-P. Kriegel, T. Brinkhoff, and R. Schneider. The combination of spatial access methods and computational geometry in geographic database systems. In Data structures and efficient algorithms, number 594 in Lecture Notes in Computer Sci ence, pages 70—86. Springer-Verlag, 1992. Bibliography 93 [11] M. J. Mantyla and R. Sulonen. GWB: A solid modeler with Euler operators. IEEE Computer Graphics and Applications, 2(5):17—31, Sept. 1982. [12] H. Meijer and D. Rappaport. Upper and lower bounds for the number of monotone crossing free flamiltonian cycles from a set of points. ARS Combinatoria, 30:203— 208, 1990. [13] J. S. B. Mitchell and G. Sundaram. Generating random geometric objects. Unpub lished manuscript, 1993. [14] F. P. Preparata and M. Ian Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1985. [15] R. E. Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the Association for Computing Machinery, 22(2):215—225, 1975. [16] J. W. van Roessel. A new approach to plane-sweep overlay: Topological structuring and line-segment classification. Cartography and Geographic Information Systems, 18(1):49—67, January 1991.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Topology building and random polygon generation
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Topology building and random polygon generation Zhu, Chongjian 1994
pdf
Page Metadata
Item Metadata
Title | Topology building and random polygon generation |
Creator |
Zhu, Chongjian |
Date Issued | 1994 |
Description | In the island adoption problem from geographical information system we are asked to identify which islands are located in which lakes. This problem translates directly to polygon nesting in computational geometry: given a set of polygons, find their nesting structure. We present our research into a broader nesting problem, namely connected component nesting, beginning with the underlying concept of topology-building and a related issue of random polygon generation. Topology building is a process of structuring data. We develop a plane sweep al gorithm for building a quad-edge data structure that captures the topological structure of connected components of a set of line segments. The algorithm starts with a data structure representing a single edge then adds edges into the data structure at each step while sweeping across the connected components The algorithm’s time complexity is determined by the time to sort the vertices of the line segments. We develop two approaches for obtaining the nesting structure of polygons. The first adopts a basic idea of Bajaj and Dey [1], but introduces a new notch definition to simplify their algorithm. The second generalizes the nesting problem to a broader class including the nesting of connected components. We present a sweep algorithm, based on a union-find data structure, that computes the nesting of the connected components. In order to test and verify the time complexity of our polygon nesting algorithm, we present an algorithm that generates x-monotone polygons uniformly at random over a vertex set of n points. This algorithm scans the point set to calculate the total number of monotone polygons that can be created, then reverses the scan to generate a random monotone polygon. This process generates a random polygon over the n vertices in 0(K) time, where n K n2 is the number edges of the visibility graph of the x-monotone chain whose vertices are the given n points. The space complexity of our algorithm is 0(n). |
Extent | 1870341 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-02-27 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051255 |
URI | http://hdl.handle.net/2429/5246 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1994-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1994-0313.pdf [ 1.78MB ]
- Metadata
- JSON: 831-1.0051255.json
- JSON-LD: 831-1.0051255-ld.json
- RDF/XML (Pretty): 831-1.0051255-rdf.xml
- RDF/JSON: 831-1.0051255-rdf.json
- Turtle: 831-1.0051255-turtle.txt
- N-Triples: 831-1.0051255-rdf-ntriples.txt
- Original Record: 831-1.0051255-source.json
- Full Text
- 831-1.0051255-fulltext.txt
- Citation
- 831-1.0051255.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051255/manifest