Efficient Generation of Contour Trees in Three Dimensions

by Hamish Carr
B.Sc.(Hons.) 1987, LL.B. 1990, B.C.Sc.(Hons.) 1998, University of Manitoba

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science)

We accept this thesis as conforming to the required standard.

The University of British Columbia
April 2000
© Hamish Carr, 2000

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia, Vancouver, Canada

Abstract

Many scientific fields generate data in three-dimensional space. These fields include fluid dynamics, medical imaging, and X-ray crystallography. In all contexts, a common difficulty exists: how best to represent the data visually and analytically. One approach involves generating level sets: two-dimensional surfaces consisting of all points with a given value in the space. With large datasets common, efficient generation of these level sets is critical. Several methods exist: one such is the contour tree approach used by van Kreveld et al. [26]. This thesis extends the results of van Kreveld et al. [26] and Tarasov & Vyalyi [23].
An efficient algorithm for generating contour trees in any number of dimensions is presented, followed by details of an implementation in three dimensions.

Contents

Abstract
Contents
List of Figures

1 Introduction
1.1 Overview
1.2 X-ray Crystallography
1.3 Desiderata
1.4 Assumptions about Data

2 Definitions
2.1 Parameters for analysis

3 Prior Work
3.1 Marching Cubes
3.1.1 Analysis
3.1.2 Ambiguity
3.1.3 Local Coherence
3.1.4 Connectivity
3.2 Octrees
3.3 Span Space
3.4 Interval Trees
3.5 Contour Following
3.5.1 Seeds and Seed Sets
3.5.2 Analysis
3.5.3 Extrema Graphs
3.5.4 Thinning

4 Contour Trees
4.1 Morse Theory
4.1.1 A Guarantee that Critical Points are Vertices
4.2 Description of Contour Tree
4.3 Contour Properties
4.4 Contour Equivalence
4.5 Definition of the Contour Tree
4.6 Properties of the Contour Tree
4.7 Augmented Contour Tree
4.8 Contour Tree Algorithms

5 Join Trees and Split Trees
5.1 Definitions of Join and Split Trees
5.2 Vertex Degrees in Join and Split Trees
5.3 Preliminaries to Reconstruction
5.4 Basic Reconstruction Algorithm
5.5 Improved Reconstruction Algorithm
5.6 Join Trees of the Mesh & the Contour Tree

6 Join & Split Trees of the Mesh
6.1 Construction of the Join Tree

7 Contour Tree Construction Algorithm
7.1 Constructing the Augmented Contour Tree
7.2 Constructing the Contour Tree

8 Generating Seeds
8.1 Simple Seed Sets
8.2 Heuristic Seed Sets

9 Implementation
9.1 Simplicial Subdivision
9.1.1 Desiderata for Subdivision
9.1.2 Some Possible Subdivision Schemes
9.2 Boundary Effects
9.3 Symbolic Perturbation of Data
9.4 Searching for Interpolating Edge
9.5 Local Contours
9.6 Memory Requirements
9.7 Timing

10 Summary

11 Extensions
11.1 Higher and Lower Dimensions
11.2 Irregular Meshes
11.3 Flexible Contours
11.4 Transparent Shells
11.5 Scaling and Parallelism

Bibliography

List of Figures

1.1 Errors in Hypothesizing Phase Information
2.1 Sample Meshes
2.2 A Level Set consisting of 2 Contours
3.1 Marching Cube cases (after symmetry)
3.2 Ambiguous Marching Cube cases
3.3 Two Contours in the Same Cell
3.4 Contour Following
3.5 Possible Intersections of a Simplex and a Level Set
4.1 Development of Level Sets in 3-D
4.2 Contour Tree corresponding to Fig. 4.1
4.3 Development of Level Sets in 2-D
4.4 Simplices Intersecting a Contour
4.5 Possible Contours in a Simplex (from Fig. 3.5)
4.6 Critical Points and Boundaries Between Regions
4.7 The Augmented Contour Tree corresponding to Fig. 4.3
5.1 A sample C-Tree and subgraph above x_8
5.2 Join and Split Trees corresponding to Fig. 5.1
5.3 Components and Up-Arcs in the C-Tree
5.4 Reduction of a C-tree
5.5 Reducing a Join Tree at a Lower Leaf
5.6 Basic Reconstruction Algorithm
5.7 Improved Reconstruction Algorithm
5.8 The Leaf Queue and the Global Minimum
5.9 Constructing a graph path from a path in space
5.10 Region Representing an Edge in Augmented Contour Tree
6.1 Algorithm to Construct Join Tree
9.1 Minimal subdivision: 5 simplices / voxel
9.2 Axis-aligned subdivision: 6 simplices / voxel
9.3 Body-centred subdivision: 24 simplices / voxel
9.4 Face-centred subdivision: 24 simplices / voxel
9.5 Comparison of Local and Fixed Contours
9.6 Timing Results for the Atom9 dataset
9.7 Timing Results for the Caffeine2 dataset
9.8 Timing Results for the "29g" dataset

Chapter 1

Introduction

1.1 Overview

Applications in several disciplines require sampling of some physical quantity in three dimensions, followed by visualization of the data thus acquired. These disciplines include medical imaging [15, 18, 17], fluid dynamics [17], and X-ray crystallography [13, 6].

The sampling process produces a finite number of data points with associated values. Most, if not all, visualization techniques assume an underlying real-valued function, and approximate it by means of some interpolation function f(x) over the volume of space under consideration. I defer definition of this interpolation function to Def. 2.4, p.10.

Many techniques for visualizing such data exist; the technique used in this thesis is that of level sets¹: surfaces defined by {x ∈ ℝ³ : f(x) = h} for some fixed value h. Using X-ray crystallography as a driving problem, I discuss existing techniques for generating level sets, in particular techniques based on a data structure called the contour tree. I then propose an efficient and practical algorithm for constructing contour trees in arbitrary dimensions. After analyzing the performance of this algorithm, I consider implementation issues, and present some experimental results.

¹Often called isosurfaces: see discussion under Def. 2.6, p.11.

1.2 X-ray Crystallography

One of the major problems in biochemistry is the analysis of protein structure and conformation.
It is known that the function of a protein is intimately connected to its conformation [21], which is determined by the spatial locations of the atoms in the protein, and the interaction of the electron clouds surrounding each atom. Techniques exist to determine the sequence of amino-acid residues in a protein [21]. This sequence defines the number and connectivity of the atoms in the protein, but not the spatial location of the atoms. X-ray crystallography is concerned with determining the locations of the atoms in the protein.

Although minor variations exist, the following is an overview of the experimental procedure [9]. A quantity of the protein in question is crystallised: it is assumed that this lines up many copies of the protein in similar orientations. The resultant crystal is then bombarded with X-rays. The X-rays interact with the electrons associated with each atom, and are sometimes refracted. These refractions are measured by a planar array of detectors. The process is repeated with the detectors at varying distances from the crystal: this produces a three-dimensional dataset, consisting of scalar values (the measured intensities at the array).

These measurements, however, do not correspond directly to the electron distribution in the crystal, but are related to the Fourier transform of the electron distribution. If the electron distribution in a typical molecule is known, the Fourier transform of the distribution can be computed: this is defined over complex numbers. Since the experimental data is scalar-valued rather than complex-valued, the Fourier transform is underdetermined. It is, however, known that the data corresponds to the amplitude of the Fourier transform: only the phase information is lost [21].

The locations of the atoms are then found by an iterative procedure. The researcher hypothesizes the missing phase information, and performs the inverse Fourier transform to recover spatial data, in the form of an electron density map.
The researcher then visualizes the electron density map, and forms further hypotheses about the locations of the atoms, and about the missing phase information. This process is repeated, refining the locations of the atoms at each iteration, until the researcher is reasonably certain that the locations of the atoms have been determined.

It is possible, however, to make mistakes in this procedure, and recover incorrect data. In particular, the hypothesized phase information can lead to an entirely incorrect map.

Figure 1.1: Errors in Hypothesizing Phase Information. (a) A duck; (b) A cat; (c) Cat Phase, Duck Amplitude in Reciprocal Space; (d) Cat Phase, Duck Amplitude in Real Space.

As an example, consider Fig. 1.1(c), from [7], which combines the magnitude information from the duck, and the phase information from the cat. The result is the image shown in Fig. 1.1(d), in which the cat (phase) information dominates visually.

It is a common and useful assumption that the nuclei of atoms are local maxima (at the centre of an electron cloud), and that molecular bonds are saddle points: bridges between two electron clouds [9]. For this reason, visualization techniques which aid in the location of local maxima and saddle points are preferable. In particular, note that local maxima can be located in principle by choosing a high isovalue (Def. 2.5, p.11) to generate contours (Def. 2.6, p.11). If the isovalue is chosen appropriately, this creates a small contour around each local maximum. As the isovalue decreases, these level sets will gradually merge at saddle points, allowing us to detect the saddle points.

However, a major problem exists: not all local maxima are equally high-valued, and some are lower-valued than the saddle points between the higher maxima. When this happens, it is impossible to select a suitable isovalue so that one can correctly identify all maxima.
Either the lower-valued maxima will fail to appear, or the isovalue chosen to show the lower-valued maxima will be lower than the high-valued saddle points, causing the higher-valued maxima to blur into each other.

Two possible causes of this problem exist: thermal noise, and type of atom. Each protein is composed of a backbone or main chain of carbon atoms: the locations of these atoms are generally highly constrained, producing high-valued maxima. Atoms off the main chain are somewhat more flexible, and have freedom to rotate about one or more of the bonds in the side chain. As a result of thermal noise, the maxima for these atoms will be diffused, and will be lower and flatter than for atoms on the backbone. Similarly, there is some variation between the heights of the maxima corresponding to atoms of different elements.

One solution to this problem is local contouring: generating contours for different isovalues in different regions of the data. See Sec. 9.5, p.98 for further details.

1.3 Desiderata

From the description above, we extract a number of desiderata for visualizing crystallographic data:

1. Local maxima and saddle points should be easily located. As noted above, level set techniques can be used to do this.

2. Level sets should be able to be generated efficiently, even for large datasets, to facilitate changing the isovalue at interactive rates.

3. It should be possible to compute and display contours at different isovalues in different portions of the image.

1.4 Assumptions about Data

I make certain assumptions about the data: most of these are necessary for the contour tree approach (Assns. 1.3, 1.4, 1.5). The remaining assumption simplifies the implementation (Assn. 1.2).
Definition 1.1 The input data is defined to be a set of real-valued measurements in a bounded volume V in ℝ^d: {(x_i, h_i) : x_i ∈ V ⊂ ℝ^d, h_i ∈ ℝ}.

Assumption 1.2 (Regular Rectilinear Data)
I assume that electron density maps are datasets sampled at regular intervals on a three-dimensional rectilinear grid.

Assumption 1.3 (Simplicial Mesh)
I assume that the mesh on which the interpolation function is based is a simplicial mesh (Def. 2.3, p.10). This assumption is necessary for the contour tree approach, but not necessary for other level set generation techniques. Methods for ensuring that the mesh is simplicial are discussed in Sec. 9.1, p.91.

Assumption 1.4 (Piecewise-Linear Interpolation Function)
I assume that the interpolation function (Def. 2.4, p.10) is piecewise-linear. Although this assumption is not strictly necessary, it greatly simplifies some preliminaries (for example, Lemma 5.21, p.69). It is also the most common interpolation function (see Def. 2.4, p.10). When combined with Assn. 1.3, this assumption means that the interpolation function inside a given simplex will be the linear interpolation between the vertices of the simplex.

Assumption 1.5 (Uniqueness of Data Values)
I assume that no two values in the data are identical. Without this assumption, it is not guaranteed that there will be a unique contour tree for a given dataset. The solution adopted, "simulation of simplicity" [11], is discussed in Sec. 9.3, p.97. This assumption is also required to guarantee that the critical points occur only at vertices of the mesh (Sec. 4.1.1, p.26), and applies to all contour tree-based methods.

Chapter 2

Definitions

In order to simplify discussion, some relevant terms are defined. Since this thesis is printed on two-dimensional paper, illustrations of two-dimensional datasets will sometimes be used for clarity.

Definition 2.1 A mesh in ℝ³ is a collection of polyhedral volumes or cells that completely fill the volume V.
The vertices, edges, and faces are shared between adjacent polyhedra, and the vertex set is the set of data points {x_i}. Note that definitions of mesh used elsewhere may differ slightly from this definition.

In practice, meshes are frequently rectilinear meshes (Def. 2.2), hexahedral meshes (which are similar to rectilinear meshes), or simplicial meshes (Def. 2.3).

Definition 2.2 A rectilinear mesh is a mesh (Def. 2.1) in which each cell is a rectangular prism, or voxel, as in Fig. 2.1(a).

Figure 2.1: Sample Meshes. (a) Rectilinear Mesh; (b) Simplicial Mesh.

The rectilinear mesh is generally the most natural mesh when data is sampled on a regular rectilinear grid (Assn. 1.2, p.7). However, there can be ambiguity in interpolation, notably when the Marching Cubes algorithm is used (Sec. 3.1.2).

Definition 2.3 A simplicial mesh is a mesh (Def. 2.1) in which each cell is a simplex, as in Fig. 2.1(b). In ℝ³, a simplex is a tetrahedron (not necessarily regular).

As noted in Sec. 1.1, visualization techniques usually assume that the data is sampled from a continuous function defined over V. An interpolation function is then chosen to approximate the continuous function in V. This interpolation function is normally based on a mesh M that fills the volume V:

Figure 2.2: A Level Set consisting of 3 Contours

Definition 2.4 An interpolation function is a function f(x) defined over the volume V, with the following properties:

1. Mesh-based: a mesh M (Def. 2.1, p.9) is chosen to fill the spatial volume V.

2. Local Interpolation: the function in any cell of the mesh is based solely on the values at the vertices of the cell.

The most commonly used interpolation function for simplicial meshes is the piecewise-linear function over the cells of the mesh: see Assn. 1.4, p.7. For rectilinear meshes, the most commonly used interpolation function is a tri-linear interpolation over the cells of the mesh.
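To make Def. 2.4 and Assn. 1.4 concrete, the following is a minimal sketch (illustrative code, not from the thesis; the function name and layout are hypothetical) of evaluating the piecewise-linear interpolant on a single 2-D simplex, i.e. a triangle. The tetrahedral case in ℝ³ is analogous, with four barycentric coordinates instead of three.

```python
# Hedged sketch: piecewise-linear (barycentric) interpolation over one
# simplex, here a triangle in 2-D. `verts` holds the three corner points
# and `h` the data values at those corners (Def. 2.4: local interpolation).
def interpolate_in_simplex(verts, h, x):
    """Evaluate the linear interpolant at point x inside the simplex."""
    (v0x, v0y), (v1x, v1y), (v2x, v2y) = verts
    ax, ay = v1x - v0x, v1y - v0y          # edge v0 -> v1
    cx, cy = v2x - v0x, v2y - v0y          # edge v0 -> v2
    rx, ry = x[0] - v0x, x[1] - v0y
    det = ax * cy - cx * ay                # twice the signed triangle area
    b1 = (rx * cy - cx * ry) / det         # barycentric coordinates of x
    b2 = (ax * ry - rx * ay) / det
    b0 = 1.0 - b1 - b2
    return b0 * h[0] + b1 * h[1] + b2 * h[2]

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(interpolate_in_simplex(tri, [1.0, 3.0, 5.0], (0.0, 0.0)))  # 1.0: exact at a vertex
print(interpolate_in_simplex(tri, [1.0, 3.0, 5.0], (0.5, 0.5)))  # 4.0: midpoint of far edge
```

Because the interpolant is linear inside each cell, its value at a vertex reproduces the data value exactly, and its restriction to any face of the simplex is again linear.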
Definition 2.5 A level set of a function f at an isovalue h is the set L(h) = {x ∈ Dom(f) : f(x) = h}.

Note that a level set may be empty, or may consist of one or more connected components, or contours (Def. 2.6), as shown in Fig. 2.2.

Definition 2.6 A contour is a connected component of a level set (Def. 2.5).

In the literature, isosurface is sometimes used to refer to a level set, and sometimes to a contour. In order to avoid confusion, I avoid the use of "isosurface", and use only "level set" and "contour".

2.1 Parameters for analysis

In analyzing algorithms for generating level sets, I use a number of parameters:

Definition 2.7 n is the number of vertices in the mesh (Def. 2.1) being used.

Definition 2.8 N is the number of cells in the mesh.

All algorithms considered use either simplices or voxels as cells in ℝ³. Both of these cell shapes have a fixed number of faces. Thus, the number of faces is O(N). In fact, since each face can belong to only 2 cells, the number of faces is Θ(N). Since an edge can belong to an arbitrary number of simplices, but each simplex has a fixed number of edges, it follows that the number of edges is O(N). In a rectilinear mesh, which must have a fixed maximum number of cells containing an edge, the number of edges is Θ(N). Similarly, cells have a fixed number of vertices, but vertices may belong to an arbitrary number of cells in a simplicial mesh, so n = O(N). For a rectilinear mesh, or for a regular simplicial mesh, such as those discussed in Sec. 9.1, p.91, n = Θ(N).

Definition 2.9 k is the size of the output.

For maximum generality, k refers to the number of cells intersected by a given level set. In some situations, it is more convenient to think of k as the number of triangles generated for rendering. Since all algorithms considered in ℝ³ generate at most 4 triangles per cell (see Fig. 3.1, p.15 and Fig. 3.5, p.22), these two measures are within a constant factor of each other.
Although k = Ω(N) in extreme cases, it is claimed that, for typical data in ℝ³, k = O(N^{2/3}) [16, 15].

Definition 2.10 t is the number of local extrema in the mesh M.

A local extremum in the mesh M is a vertex all of whose neighbours have larger values than h_i (a local minimum: Def. 4.9, p.33), or all of whose neighbours have smaller values (a local maximum: Def. 4.9, p.33). I will show in Sec. 4.5, p.35 that the size of the contour tree is O(t): thus, t can be used as a rough measure of the complexity of the dataset. In extreme cases, t = Ω(n): a turbulent fluid dataset from fluid dynamics might be one such case [17]. For X-ray crystallographic data, however, I expect that t < n.

Chapter 3

Prior Work

3.1 Marching Cubes

The simplest algorithm for generating level sets is the "Marching Cubes" algorithm of Lorensen and Cline [18]. This algorithm "marches" through all the cubes (i.e. voxels) in a rectilinear mesh, computing the intersection with the level set for each such voxel. For any voxel, each vertex is classified as "above" or "below" the level set, producing an 8-bit index into a lookup table. This lookup table, loosely based on a tri-linear interpolation function, defines how to compute the intersection (see Fig. 3.1).

Marching Cubes is simple and relatively easy to code. However, it has several disadvantages: running time, ambiguity, local coherence, and connectivity.

Figure 3.1: Marching Cube cases (after symmetry)

3.1.1 Analysis

Since Marching Cubes steps through all cells, it is trivial to analyze: the running time is Θ(N), and no preprocessing or additional space is required. Subsequent work has focussed on output-sensitive algorithms, which take advantage of the fact that k = O(N^{2/3}) (Def. 2.9) in typical cases. These algorithms normally require O(N log N) preprocessing time, although no lower bound has been proven.
Thus, Marching Cubes is preferred for generating single static level sets: alternate algorithms are preferred only when the preprocessing step can be amortized over multiple level sets, or when they provide additional information about the dataset.

Figure 3.2: Ambiguous Marching Cube cases

3.1.2 Ambiguity

For trilinear interpolation, it is possible to construct voxels with identical Marching Cube cases, but topologically different surfaces (Fig. 3.2). Marching Cubes resolves this by assuming all such cases to be of one type, leading to topologically incorrect level sets and visual artefacts, although solutions for these problems were later identified by Nielson & Hamann [19]. This disadvantage does not exist with linear interpolation in simplices, for which no ambiguous cases exist (see Sec. 3.5.1 and Fig. 3.5).

3.1.3 Local Coherence

As Howie & Blake [14] point out, most graphics hardware can process strips of triangles more efficiently than individual triangles, since 2/3 of the vertices in a triangle are implicitly carried forward from the previous triangle. This can reduce the geometric calculations in the graphics hardware by a factor of 3. Since there is frequently a bottleneck in transferring triangles to the graphics hardware, this consideration is of major importance, and algorithms that permit the generation of triangle strips are preferred. To achieve this, as many adjacent cells as possible should be processed sequentially.

Marching Cubes follows the grid axes: two voxels which are adjacent in the direction of travel will be processed sequentially. Thus, if the level set intersects the face between the two voxels, some vertices may be reused. However, for surfaces that do not follow the grid, this is not possible. Runs of cells from which triangle strips can be generated will tend to be short.
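Before leaving Marching Cubes, the corner classification described in Sec. 3.1 can be sketched in a few lines. This is an illustrative reconstruction of the description above, not the published implementation: each of a voxel's 8 corners contributes one bit, and the resulting number in 0..255 selects a triangulation from the 256-entry lookup table. The bit-ordering of the corners is a convention of the table; the one shown here is arbitrary.

```python
def mc_case_index(corner_values, isovalue):
    """Pack the above/below classification of a voxel's 8 corners
    into an 8-bit index for the Marching Cubes lookup table."""
    index = 0
    for bit, value in enumerate(corner_values):
        if value > isovalue:       # corner lies "above" the level set
            index |= 1 << bit
    return index

corners = [0.2, 0.9, 0.1, 0.3, 0.8, 0.1, 0.4, 0.0]
print(mc_case_index(corners, 0.5))   # 18: only corners 1 and 4 are above
print(mc_case_index(corners, -1.0))  # 255: every corner is above
```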
3.1.4 Connectivity

Marching Cubes cannot distinguish between separate contours in the level set, as no topological information is generated. In some cases, Marching Cubes will actually render facets of two or more contours in the same voxel at the same time (as in Fig. 3.3). Where it may be desirable to distinguish between contours, as in X-ray crystallography, Marching Cubes is therefore at a disadvantage.

Figure 3.3: Two Contours in the Same Cell

3.2 Octrees

Wilhelms & van Gelder [27] proposed using octrees to accelerate Marching Cubes. Under this scheme, each node in an octree is labelled with the minimum and maximum values of all voxels contained in the node's subtree. Level sets are then constructed by commencing at the root of the octree, and propagating downwards through all children spanning the desired isovalue. Construction of the octree takes O(N log N) time, and the worst-case time to construct a level set is O(k + log(n/k)) [17]. Most disadvantages of Marching Cubes are retained (Sec. 3.1.2, 3.1.3, 3.1.4). In addition, Shen & Johnson [22] note that octree-based methods are vulnerable to noise in the data. A variation on this has also been used by Bajaj and Pascucci [1] for progressive rendering of isosurfaces.

3.3 Span Space

Livnat, Shen & Johnson [17] consolidated much of the work on level sets, and proposed a new method based on Bentley's kd-trees [4]. They represent each cell as the interval defined by the minimum and maximum values of the cell. The cell then intersects any level set whose isovalue falls into the interval. This reduces the problem to that of finding all intervals containing a desired isovalue. The intervals are recorded as (min, max) pairs, and the resulting two-dimensional space (the span space) is searched using a kd-tree [17]. Construction of the kd-tree takes O(N log N) time and O(N) space. Construction of a level set based on the kd-tree takes O(√n + k) time, by retrieving all cells intersecting the desired interval.
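The span-space reduction above can be sketched as follows (an illustrative toy, not the published implementation): each cell is collapsed to its (min, max) value interval, and a level set at isovalue h is built from exactly the cells whose interval contains h. A real implementation searches these (min, max) points with a kd-tree; a linear scan shows the same selection logic.

```python
def active_cells(cell_values, h):
    """Return indices of cells whose (min, max) interval spans isovalue h.
    `cell_values` is a list of per-cell vertex value lists."""
    spans = [(min(vals), max(vals)) for vals in cell_values]
    return [i for i, (lo, hi) in enumerate(spans) if lo <= h <= hi]

# Four cells, each reduced to a (min, max) interval:
cells = [[0.1, 0.4], [0.3, 0.9], [0.6, 0.8], [0.45, 0.55]]
print(active_cells(cells, 0.5))  # [1, 3]: only these intervals contain 0.5
```

Note that the query returns an unordered set of cells: this is precisely why, as the next paragraph observes, span space sacrifices local coherence and the ability to separate contours.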
This method has advantages other than just speed: it works for rectilinear and simplicial meshes, and on both regular and irregular grids. Local coherence (Sec. 3.1.3) is not achieved, however, since the kd-tree generates a list in which the cells are not necessarily related to each other. In this respect, span space may in fact be worse than Marching Cubes, since there is no guarantee that adjacent cells are ever processed sequentially. Similarly, the ability to identify different contours in a level set (Sec. 1.3, 3.1.4) is also lost.

3.4 Interval Trees

Cignoni et al. [5] follow Livnat, Shen & Johnson (Sec. 3.3) in treating each cell as an interval. These intervals are then stored in Edelsbrunner's interval tree [10], instead of the kd-tree. The interval tree is then used to search for cells intersecting the level set: otherwise the technique is identical to the span space approach. Construction of the interval tree takes O(N log N) time and O(N log N) space: construction of a level set takes O(k + log n) time. All other advantages and disadvantages of the span space approach are preserved.

Figure 3.4: Contour Following

3.5 Contour Following

All of the algorithms referred to so far lack local coherence (Sec. 3.1.3) and connectivity (Sec. 3.1.4). As a result, these methods generate a list of unrelated cells intersecting the level set. A class of algorithms exists which preserves both coherence and connectivity: I refer to these algorithms as contour-following¹, since they follow a contour from cell to cell. In simple terms, the contour-following algorithm generates contours as follows:

¹Other names include continuation [20], and mesh propagation [14].

1. Choose a cell that intersects with the level set.

2. Construct the surface of intersection of the cell and the level set.

3. Note that for each face of the cell that intersects with this surface, the adjacent cell must also intersect with the surface (see Fig. 3.4).

4.
"Follow" the surface into each adjacent cell, by repeating steps 2–4 for that cell.

The surface constructed remains topologically connected at all times, since a cell is only visited if the surface under construction reaches into it from an adjacent cell. Thus, the algorithm constructs not a level set but a single component of the level set: a contour (Def. 2.6). Hence the term "contour-following". This was first implemented by Wyvill, McPheeters & Wyvill [28], using an explicit queue to visit the cells in a regular mesh, and was extended from regular to irregular meshes by Howie & Blake [14]. Both papers omit details on how to select an initial cell: in both cases, it appears that the initial cell was selected manually, or on the basis of domain-specific knowledge of the data set.

3.5.1 Seeds and Seed Sets

The contour-following algorithm stated above traces a single contour (i.e. connected component) of a level set. If the level set has more than one contour, multiple starting points are needed to trace the entire level set. Such starting points are called seeds, and can be edges, faces, or cells. A set of seeds sufficient to generate an entire level set is called a seed set.

Figure 3.5: Possible Intersections of a Simplex and a Level Set

In a simplicial mesh, there are only 4 possible intersections of the cell and a level set not passing through a vertex: see Fig. 3.5. In each case, no more than one contour can intersect the simplex. Thus, each cell can be a seed for at most one contour at a given isovalue, and a seed set for a given level set must contain at least as many seeds as the level set contains contours.

If the mesh is rectilinear, more than one contour can intersect a given voxel (e.g. Fig. 3.3). In this case, a cell may be a seed for more than one contour, and the seed set need not have as many seeds as there are contours in the level set. This complicates following the contours, but does not preclude it.
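The contour-following loop described above can be sketched as a breadth-first traversal of the cell-adjacency graph. This is a simplified illustration, not the thesis's implementation: the geometry of step 2 is omitted, and a cell is simply marked "active" if its value span contains the isovalue.

```python
from collections import deque

def follow_contour(adjacency, active, seed):
    """Return the set of active cells reachable from `seed` through
    active neighbours: the cells of one contour."""
    visited = {seed}
    queue = deque([seed])        # explicit queue, as in Wyvill et al. [28]
    while queue:
        cell = queue.popleft()
        for neighbour in adjacency[cell]:
            if neighbour in active and neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited

# Toy example: a 1-D chain of 7 cells with two separate runs of active
# cells (0-1-2 and 5-6), i.e. a level set with two contours.
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
active = {0, 1, 2, 5, 6}
print(sorted(follow_contour(adjacency, active, 1)))  # [0, 1, 2]
print(sorted(follow_contour(adjacency, active, 5)))  # [5, 6]
```

Running the traversal from one seed yields exactly one connected component, which is why a seed set needs (in the simplicial case) at least one seed per contour.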
3.5.2 Analysis

Using contour-following to generate level sets requires Θ(k) time to generate all contours from seeds, plus the time to find the seeds. Since contour-following itself requires no preprocessing, the preprocessing cost is simply the cost of generating any structures necessary to obtain seed sets for any desired isovalue.

Since contour-following expressly moves from one cell to the next across a shared boundary, it will usually produce connected triangles, thus preserving local coherence (see Sec. 3.1.3). In addition, since contour-following generates each contour separately, it can be used to distinguish contours in the level set, unlike Marching Cubes (see Sec. 3.1.4). Ambiguity (Sec. 3.1.2) is still a problem in non-simplicial meshes.

3.5.3 Extrema Graphs

Itoh & Koyamada [16, 15] use extrema graphs to generate seeds for contour-following. They preprocess the data to extract all local extrema, then connect pairs of extrema with arcs to form the extrema graph. Each edge of the extrema graph has an associated list of cells, in sorted order from a maximum to a minimum: one of these is chosen as a seed for any given isovalue. No exact analysis is provided, but the authors admit that it is not always efficient, and is not guaranteed to succeed. Livnat, Shen & Johnson [17] observe that the worst-case behaviour for generating a level set is Ω(n), as Itoh & Koyamada test all edges in the extrema graph for intersection with the isovalue.

3.5.4 Thinning

Bajaj et al. [2] have described a technique in which a set of cells is chosen as a seed set, then stored in a segment tree: generation of contours is then done by contour-following. To choose the seed set, the entire set of cells is initially chosen, then redundant cells are removed by a heuristic, thus reducing, or thinning, the seed set until an almost-optimal set is achieved. In the same paper, Bajaj et al.
also describe a greedy "climbing" algorithm which approximates the contour tree, and a sweep algorithm for constructing seed sets offline.

Chapter 4

Contour Trees

The algorithms discussed in the previous chapter were designed for 3-D datasets. Similar algorithms exist for 2-D datasets. For example, van Kreveld used interval trees (Sec. 3.4, p.19) to generate contour lines in 2-D [25]. In 1993, de Berg and van Kreveld [8] applied a structure called the contour tree in 2-D, and demonstrated that it could be constructed in O(N log N) time. In 1997, van Kreveld et al. [26] improved the algorithm for constructing the contour tree, and observed that it could be applied to higher dimensions, albeit with a construction time of O(N²). Their algorithm is discussed in detail in Sec. 4.8, as is a later result by Tarasov & Vyalyi, which extends the O(N log N) time bound to three dimensions [23].

4.1 Morse Theory

Contour trees are based, in part, on work in the field of Morse theory [3, 12]. Morse theory studies the changes in topology of the level sets of f as the isovalue h changes. Points at which the topology of the level sets changes are called critical points. Morse theory requires that the critical points be isolated, i.e. that they occur at distinct points and values. A function that satisfies this condition is called a Morse function. All points other than critical points are called regular points, and do not affect the number or genus of the components of the level sets.

4.1.1 A Guarantee that Critical Points are Vertices

Morse theory provides some useful theoretical results. In the case of simplicial meshes, the definition of f as a linear interpolant over a simplicial mesh with unique data values at vertices (Sec. 1.4) ensures that f is a Morse function, and that the critical points occur at vertices of the mesh [3]. A direct proof of this result is given in the following sections.
This proof does not rely on Morse theory, but is based solely on properties of the interpolation function.

4.2 Description of Contour Tree

If we think of the parameter x as time and watch the evolution of the level sets of f over time, then we see contours appear, split, change genus, join, and disappear.

[Figure 4.1: Development of Level Sets in 3-D]

[Figure 4.2: Contour Tree corresponding to Fig. 4.1]

In the example shown in Fig. 4.1, the level set evolves from four sticks, to two rings, to two cushions, to a hollow ball, to a solid, as the parameter decreases. The contour tree is a graph that tracks contours of the level set as they split, join, appear, and disappear. Fig. 4.2 shows the contour tree for the dataset illustrated in Fig. 4.1. Starting at the global maximum, four small contours appear in sequence (10, 9, 8, 7): these correspond to the four leaves at the top of the contour tree. The surfaces join (6, 5) in pairs, forming larger contours, which quickly become rings. These rings then flatten out into cushions, which join (4) to form a single contour. This contour gradually wraps around a hollow core, and pinches off at (3), splitting into two contours: one faces inwards, the other outwards. The inward contour contracts until it disappears at (2); the outward contour expands until it reaches the global minimum (1).

[Figure 4.3: Development of Level Sets in 2-D]

Note that changes in genus from disk to torus to disk are not reflected in the contour tree: even though the contours change topology, each can still be traced from a single seed point. All changes to the contours occur at critical points, but not all critical points cause changes to the contour tree. Formal definitions to support this intuitive description are found in the following sections.
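The sweep intuition above can be made concrete on the mesh's vertex-edge graph: process the vertices from high to low, maintain the connected components of the swept part with union-find, and record the vertices at which components appear or merge. This is only an illustrative sketch (the names and representation are mine, not the thesis's):

```python
def sweep_events(values, edges):
    """Sweep the isovalue from high to low, tracking connected components
    of the swept vertices with union-find.  Records ('create', v) when a
    new component appears at v (a local maximum of the graph), and
    ('join', v) when two or more components merge at v."""
    parent = {}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    nbrs = {i: set() for i in range(len(values))}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)

    events = []
    for v in sorted(range(len(values)), key=lambda i: values[i], reverse=True):
        parent[v] = v
        roots = {find(u) for u in nbrs[v] if u in parent and u != v}
        if not roots:
            events.append(('create', v))
        elif len(roots) > 1:
            events.append(('join', v))
        for r in roots:
            parent[r] = v  # merge the neighbouring components into v's
    return events
```

On a path graph with values [1, 3, 2, 4], the sweep creates components at vertices 3 and 1, then joins them at vertex 2, mirroring the appear/join events in the description above.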
Each "contour" referred to as appearing, joining, splitting and disappearing is actually a class of contours for different isovalues, and I define contour trees in terms of equivalence classes of contours. The contour tree is independent of the dimension of the space: for example, Fig. 4.3 shows a dataset in 2-D which has Fig. 4.2 as its contour tree. In some of the examples that follow, I shall use this contour tree instead of Fig. 4.1, where the desired property is clearer in 2-D than in 3-D.

4.3 Contour Properties

In order to define contour trees, I start by showing that contours are essentially unchanged between isovalues of vertices of the mesh. From Assn. 1.5, the vertices x_1, x_2, ..., x_n must have unique values: it is convenient for contour tree computation if they are in sorted order. To simplify some definitions, add h_0 = -∞ and h_{n+1} = +∞:

Assumption 4.1 h_0 < h_1 < h_2 < ... < h_n < h_{n+1}

Definition 4.2 The contour containing a point p, denoted γ_p, is the contour γ to which p belongs.

Definition 4.3 For a contour κ at isovalue h, define the set of simplices intersected by κ to be Simp(κ) = {σ a simplex of the mesh : σ ∩ κ ≠ ∅} (see Fig. 4.4).

The goal is to construct a contour tree over the entire spatial volume V, based on the values h_i at the vertices of the mesh: the first step towards this goal is to show that changes in the connectivity of the level set can only happen at vertices of the mesh. I start by showing that changes in the connectivity cannot happen between successive values h_i, h_{i+1}:

Lemma 4.1 Let κ, λ be contours at distinct isovalues h, h′ ∈ (h_{i-1}, h_i), with Simp(κ) ∩ Simp(λ) ≠ ∅. Then Simp(κ) = Simp(λ).

Proof: Since κ is connected, all the simplices in Simp(κ) must be connected. Assume that Simp(κ) ≠ Simp(λ). Then there must be at least one simplex σ_1 in Simp(κ) ∩ Simp(λ) from which κ continues through a face of σ_1 into some other simplex σ_2 which is in Simp(κ) but not Simp(λ) (or vice versa).
Without loss of generality, assume the former. There must then be an edge τ, common to σ_1 and σ_2, that intersects κ. Let x_j and x_k be the two vertices which τ connects, with j < k. Then h_j < h < h_k, by the assumption that the interpolation function is piecewise linear (Assn. 1.4, p. 7). Also, h_{i-1} < h < h_i, and the vertices are in sorted order, so h_j ≤ h_{i-1} < h < h_i ≤ h_k. Then h_j < h′ < h_k, since h_{i-1} < h′ < h_i. By the Mean Value Theorem, since f is continuous over each simplex, there exists some point x on the edge τ such that f(x) = h′. Since only one contour for a given isovalue can intersect a simplex (Sec. 3.5.1, p. 22), this means that λ ∩ τ ≠ ∅. But τ is contained in σ_2, so λ ∩ σ_2 ≠ ∅, i.e. σ_2 ∈ Simp(λ), which contradicts our assumption that σ_2 was in Simp(κ) but not Simp(λ). □

Note that this result will not hold in a rectilinear mesh, where two topologically separate contours can intersect the same cell (Sec. 3.1.2, p. 16). A consequence of this lemma is that we can continuously transform a contour at height h to a new contour at some nearby height h′:

Corollary 4.2 Let κ be a contour at isovalue h ∈ (h_{i-1}, h_i), and let h′ ∈ (h_{i-1}, h_i). Then there is exactly one contour λ at isovalue h′ such that Simp(κ) = Simp(λ).

Proof: Let σ be any simplex in Simp(κ). Then there is some x in σ such that f(x) = h. By the same argument as used in Lemma 4.1, there must also be some x′ in σ such that f(x′) = h′. Call the contour to which x′ belongs λ. Then, by Lemma 4.1, Simp(κ) = Simp(λ). Since only one contour for a given isovalue can intersect σ, λ must be unique. □

Clearly, little of interest occurs between isovalues of the vertices: see Fig. 4.5. We classify each vertex x_i by the behaviour of the contours that are just above and just below x_i, for some small ε > 0 such that h_{i-1} < h_i - ε < h_i < h_i + ε < h_{i+1}.

Definition 4.4 The star of a vertex x_i, denoted Star(x_i), is the set of simplices that have x_i as one of their vertices.
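For a mesh stored as a list of simplices, each a tuple of vertex indices, the star is straightforward to extract; a small illustrative sketch (representation and names mine, not the thesis's):

```python
def star(v, simplices):
    """Star of vertex v (Def. 4.4): the simplices having v as a vertex."""
    return [s for s in simplices if v in s]

def star_neighbours(v, simplices):
    """Vertices sharing a simplex with v: the vertices whose values
    determine the behaviour of contours just above and just below v."""
    return sorted({u for s in star(v, simplices) for u in s if u != v})
```

For the two triangles [(0, 1, 2), (1, 2, 3)], star(0, ...) is [(0, 1, 2)] and star_neighbours(0, ...) is [1, 2].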
Definition 4.5 The set of contours just above x_i: C_i^+ = {γ ∈ L(h_i + ε) : γ ∩ Star(x_i) ≠ ∅}.

Definition 4.6 The set of contours just below x_i: C_i^- = {γ ∈ L(h_i - ε) : γ ∩ Star(x_i) ≠ ∅}.

Now classify the vertices of the mesh as follows:

Definition 4.7 An ordinary point is a vertex x_i for which ||C_i^+|| = 1 and ||C_i^-|| = 1.

Definition 4.8 A local maximum is a vertex x_i for which ||C_i^+|| = 0.

Definition 4.9 A local minimum is a vertex x_i for which ||C_i^-|| = 0.

Note that this definition of local extrema is different from the usual definition of local extrema in a graph: vertices larger [smaller] than all their neighbours. However, the two definitions are equivalent: this definition was chosen to provide a consistent test for extrema and other critical points.

Definition 4.10 A join is a vertex x_i for which ||C_i^+|| > 1.

Definition 4.11 A split is a vertex x_i for which ||C_i^-|| > 1.

Definition 4.12 A critical point is a vertex x_i which is a local maximum, local minimum, join, or split.

Note that a vertex may be both a join and a split, a join and a local minimum, or a split and a local maximum. These cases normally occur only on the boundary of the volume, and not in the interior. All other possibilities are mutually exclusive. Note also that this definition of critical point is not quite the same as the definition of critical point in Morse theory: I treat a vertex where only the topological genus changes as an ordinary point.

4.4 Contour Equivalence

In this section, I define an equivalence relation that captures the intuitive description of the contour tree. The contour tree will then be defined in terms of the equivalence classes of this relation.

Definition 4.13 Let γ, γ′ be contours at isovalues h, h′ respectively, with h < h′. Then γ and γ′ are contour equivalent (γ ≡ γ′) if all of the following are true:

1. neither γ nor γ′ passes through a critical point,

2.
γ and γ′ are in the same connected component Υ of {x : f(x) > h}, and there is no join x_i ∈ Υ such that h < h_i < h′, and

3. γ and γ′ are in the same connected component Λ of {x : f(x) < h′}, and there is no split x_i ∈ Λ such that h < h_i < h′.

Definition 4.14 The equivalence classes of this relation will be called contour classes.

Definition 4.15 The contour class containing a point p, denoted [γ_p], is the contour class to which the contour γ_p belongs (recall that γ_p is the contour to which p belongs).

4.5 Definition of the Contour Tree

The definition of contour classes encapsulates the intuitive "sweep" described in Sec. 4.2. Contours not passing through critical points will belong to contour classes that map 1-1 with open intervals (h_i, h_j), where x_i, x_j are critical points, and i < j. I will sometimes refer to a contour class as being created at j, h_j, or x_j, and being destroyed at i, h_i, or x_i, thus preserving the intuitive description of a sweep from high to low values. Contours that pass through critical points will be the sole members of the contour classes to which they belong (i.e. finite contour classes). This correspondence between critical points and finite contour classes, and between open intervals and infinite contour classes, leads to the definition of the contour tree for a simplicial mesh:

Definition 4.16 The contour tree is a graph composed of:

1. vertices, or supernodes, which represent finite contour classes (i.e. critical points):

(a) leaf vertices, corresponding to:

(i) local maxima, at which an infinite contour class is created

(ii) local minima, at which an infinite contour class is destroyed

(b) interior vertices, at which

(i) at least one infinite contour class is created, and

(ii) at least one infinite contour class is destroyed

2.
edges, or superarcs, which represent infinite contour classes, and connect:

(a) the supernode corresponding to the critical point at which the contour class is created, and

(b) the supernode corresponding to the critical point at which the contour class is destroyed.

Label each superarc C_i^j, where x_j is the vertex at which the superarc is created, and x_i is the vertex at which it is destroyed.

[Figure 4.6: Critical Points and Boundaries Between Regions]

Fig. 4.6 shows the correspondence between the contour tree in Fig. 4.2 and regions of Fig. 4.3: the correspondence to Fig. 4.1 is similar, but harder to show on paper.

4.6 Properties of the Contour Tree

The first useful property of the contour tree is that it is in fact a tree. The proof of this is taken from de Berg and van Kreveld [8], with minor modifications, and is included for completeness:

Lemma 4.3 The contour tree C for a connected spatial volume is a tree.

Proof: Proof is by showing that the contour tree is acyclic, and is connected if the volume V is connected.

Connectedness: Let x and y be points in V that are connected by some path P. Each point p on P belongs to some contour γ_p, which in turn belongs to some contour class [γ_p], which may be either finite or infinite. Let P′ = {γ_p : p ∈ P}, i.e. P′ is the set of contours containing points on P. Then, since each γ_p contains the corresponding p, P′ contains P. Therefore P′ connects x and y. Note that if p and q are points on P with the same isovalue f(p) = f(q), then γ_p and γ_q may in fact be the same contour. If this happens, I reduce P′ by omitting all contours for points between p and q on P. Thus, I assume that P′, viewed as a set of contours, contains no repeats. Let P″ = {[γ_p] : p ∈ P}, i.e. P″ is the set of contour classes containing points on P. Since each [γ_p] contains the corresponding γ_p, and each γ_p contains the corresponding p, the set P″ must also contain P. Therefore P″ connects x and y.
As with P′, I assume that there are no duplicates: once P exits a contour class, it never re-enters that contour class. Consider the set Q consisting of all supernodes and superarcs that correspond to contour classes in P″. Clearly, Q contains both [γ_x] and [γ_y]. I claim that Q connects [γ_x] and [γ_y] in the contour tree. Let C_i^j be a superarc in the contour tree, corresponding to some infinite contour class in P″, and let x_j and x_i be the vertices at which C_i^j is created and destroyed, respectively. Note that the connectivity of the level set can only change at critical points (Sec. 4.1.1), and an infinite contour class contains no critical points. This implies that the infinite contour class C_i^j consists of exactly one contour for each isovalue in the open interval (h_i, h_j). Also note that each of these contours is a connected component of the level set for that isovalue. Therefore, any path from a point in C_i^j to a point not in C_i^j must include at least one point x on either γ_{x_i} or γ_{x_j}. If we consider what this means in the contour tree, we see that if the path P leaves C_i^j in P″, then either x_i or x_j belongs to Q. Assume that it is x_i, and note that C_i^j is connected to x_i in the contour tree. Similarly, when the path P leaves γ_{x_i}, it must enter some new infinite contour class, C_i^k or C_k^i, to which x_i is connected in the contour tree. Continuing in this way, we see that the sequence of contour classes along the path P corresponds exactly to a set of connected supernodes and superarcs in the contour tree: i.e. Q is connected. □

It follows from this that, if the volume is connected, so is the contour tree.

Acyclicity: Suppose a cycle K exists in the contour tree. Without loss of generality, assume that K has no repeated supernodes, and pick the smallest-valued supernode x_i on the cycle: two distinct superarcs C_i^j and C_i^k are incident to x_i.
Since C_i^j represents contours for all isovalues in the open interval (h_i, h_j), I choose a point x on contour γ at isovalue h such that [γ] = C_i^j, and h_i < h < h_{i+1}. Similarly, I choose a point y on contour γ′ at isovalue h′ such that [γ′] = C_i^k, and h < h′ < h_{i+1}. Note that neither γ nor γ′ passes through a critical point. Thus, γ and γ′ satisfy the first condition for contour equivalence (Def. 4.13). Let Q be the open-ended path obtained by removing x_i from K without deleting C_i^j and C_i^k. By an argument similar to that used for connectivity above, this path Q must correspond to a connected set in V consisting solely of points with isovalues > h_i, and including x and y. I truncate this connected set by removing all points with isovalues < h. Since h_i < h < h_{i+1}, and changes to connectivity can only occur at vertices (Sec. 4.1.1), this does not disconnect the set. Since x and y sit on contours γ and γ′ respectively, it then follows that γ and γ′ belong to the same connected component Υ of {x : f(x) > h}. And since h_i < h < h′ < h_{i+1}, there is no join x_j ∈ Υ such that h < h_j < h′, satisfying the second condition of Def. 4.13. Finally, let R be the open-ended path in K obtained by taking x_i, C_i^j and C_i^k. A similar argument to that just used then shows that γ and γ′ are in the same connected component Λ of {x : f(x) < h′}, and that there is no split x_j ∈ Λ such that h < h_j < h′, satisfying the third condition of Def. 4.13. Thus, γ ≡ γ′, and it follows that C_i^j = C_i^k, so by contradiction, no cycle K exists in the contour tree. □

Corollary 4.4 The contour tree C for the volume V is of size Θ(t).

Proof: From Def. 4.8 and Def. 4.9, it is clear that the leaves of the contour tree must be local extrema. By Def. 2.10, p. 13, there are t local extrema.
Since at least half of the nodes of a tree must be leaves, the size of the contour tree is at least t + 1 and at most 2t - 1. □

4.7 Augmented Contour Tree

For some purposes, we may wish to know more information than that provided by the contour tree. In particular, we may wish to know which non-critical points in the mesh belong to which superarc. One method of doing this is to augment the superarcs with all of the corresponding vertices, as follows:

Definition 4.17 The augmented contour tree AC_M of the mesh M is the tree on the vertices of M such that x_i and x_j are adjacent in AC_M if:

1. x_i and x_j are both supernodes of the contour tree, C_i^j is a superarc of the contour tree, and there is no vertex x_k such that [γ_{x_k}] = C_i^j;

2. x_i is a supernode of the contour tree, x_j is not a supernode of the contour tree, [γ_{x_j}] = C_l^i, and there is no vertex x_m such that [γ_{x_m}] = C_l^i and h_j < h_m < h_i;

3. x_i is a supernode of the contour tree, x_j is not a supernode of the contour tree, [γ_{x_j}] = C_i^l, and there is no vertex x_m such that [γ_{x_m}] = C_i^l and h_i < h_m < h_j; or

4. x_i and x_j are not supernodes of the contour tree, [γ_{x_i}] = [γ_{x_j}] = C_m^l, and there is no vertex x_p such that [γ_{x_p}] = C_m^l and h_p is strictly between h_i and h_j.

[Figure 4.7: The Augmented Contour Tree corresponding to Fig. 4.3]

This definition strings the vertices belonging to a superarc out along that superarc in sorted order (see Fig. 4.7). Each superarc has become a path in the augmented contour tree. This path starts at the vertex where the superarc is created, passes through all of the vertices belonging to the superarc, and ends at the vertex where the superarc is destroyed.

Lemma 4.5 If x_i is an ordinary point in the mesh M, then x_i has exactly two neighbours x_j and x_k in AC_M, and h_j < h_i < h_k.

Proof: Proof is by exhaustion of the cases in Def. 4.17. By Def. 4.7, ||C_i^+|| = 1 and ||C_i^-|| = 1. Thus x_i cannot be an endpoint of the superarc C_q^v of the contour tree to which x_i belongs.
Consider x_p, the supernode at the top of C_q^v, and suppose that there is no vertex x_k belonging to C_q^v such that h_k > h_i. Then by Def. 4.17(2), x_i x_p is an edge of AC_M. Since x_i belongs to exactly one superarc, C_q^v, there can be no other supernode that satisfies Def. 4.17(2). And since there is no x_k belonging to C_q^v that is higher than x_i, Def. 4.17(4) cannot connect x_i to any higher-valued vertex. This leaves two cases, both of which require a supernode for the lower-valued vertex. Thus the only higher-valued vertex connected to x_i is x_p. If there is at least one vertex on C_q^v that is higher than x_i, let x_k be the smallest-valued such vertex: a similar argument to the foregoing shows that x_k is the only higher-valued vertex connected to x_i. The same arguments then apply symmetrically to show that x_i has exactly one smaller-valued neighbour in AC_M. □

4.8 Contour Tree Algorithms

van Kreveld et al. [26] construct a contour tree on a simplicial mesh. Their construction assumes that saddle points are simple (i.e. involve no more than two contours splitting or joining). Multiple saddle points are split into multiple simple saddle points. The algorithm proceeds by sweeping through the values of the parameter, maintaining the level set at all times as a set of contours, each contour being represented as a polygon. As the sweep passes through each vertex in the mesh, the level set is updated locally. If the vertex is a (simple) saddle point, either two contours join or two contours separate. In either case, a new supernode is generated, and the superarcs corresponding to each contour are set to terminate at the supernode. If two contours join at the saddle point, the level set is updated in time proportional to the smaller of the two contours. Similarly, a split is achieved in time proportional to the smaller of the two resulting contours. This allowed van Kreveld et al. to prove an upper bound of O(N log N) time in two dimensions.
The same algorithm in higher dimensions resulted in a time bound of O(N²), primarily due to the increased cost of maintaining the level set in dimensions above two. Tarasov & Vyalyi [23] extend the result of van Kreveld et al. [26] to achieve a time of O(N log N) in three dimensions. Again, the mesh is assumed to be simplicial, and saddle points are assumed to be simple. The single sweep in [26] is replaced by three sweeps. Two sweeps are performed to identify local minima and maxima respectively; the third sweep then identifies and resolves saddle points in a way similar to [26]. Boundary cases are handled by special cases within the algorithm. Tarasov & Vyalyi's assumption that saddle points are simple is justified by a preprocessing step. This step involves a barycentric subdivision of each simplex into 24 smaller simplices, and may also involve further subdivisions. As a result, the size of the input data is magnified by a factor of 24 in all cases, and 360 in the worst case. When combined with the cost of simplicial subdivision (Sec. 9.1, p. 91), this leads to a total magnification between 120 and 8640 for rectilinear data: this is prohibitive in practice.

I propose a new algorithm (Algorithm 7.2, p. 84) for computing contour trees over simplicial meshes with:

1. time requirements of O(n log n + N + t α(t)) for constructing contour trees, in any number of dimensions (Corollary 7.5, p. 85),

2. space requirements of O(n) working storage (Theorem 7.4, p. 84),

3. simple treatment of boundary effects, and

4. simple treatment of multi-saddle points.

Before giving details of the algorithm, some preliminary results are necessary. Ch. 5 deals with reconstructing trees from partial information. Ch. 6 then describes how to extract sufficient information from the data set to perform the reconstruction. Finally, Ch. 7 puts all the pieces together to produce an algorithm.
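As a preview of how the pieces fit together, the following sketch implements the whole pipeline on a height graph: two union-find sweeps build the join and split trees (the subject of Ch. 6), and repeated leaf-pruning with arc contraction merges them into the contour tree (the subject of Chs. 5 and 7). This is an illustrative Python sketch of the approach under my own naming and data layout, not the thesis's Algorithm 7.2 itself:

```python
from collections import defaultdict, deque

def merge_tree(order, adj):
    """One sweep over the vertices in 'order', with union-find components
    of the swept set; each component remembers its most recently swept
    vertex.  Sweeping high-to-low yields the join tree; low-to-high, the
    split tree."""
    parent, last, arcs = {}, {}, []

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for v in order:
        parent[v], last[v] = v, v
        for u in adj[v]:
            if u in parent and find(u) != find(v):
                arcs.append((v, last[find(u)]))  # arc to that component's extreme
                parent[find(u)] = find(v)
                last[find(v)] = v
    return arcs

def contour_tree(heights, adj):
    """Contour tree arcs of a height graph, as (lower, higher) pairs."""
    n = len(heights)
    down_order = sorted(range(n), key=lambda i: heights[i], reverse=True)
    # up[t][i] / dn[t][i]: higher / lower neighbours of vertex i in tree t,
    # where t = 0 is the join tree and t = 1 the split tree.
    up = [defaultdict(set), defaultdict(set)]
    dn = [defaultdict(set), defaultdict(set)]
    for t, order in ((0, down_order), (1, down_order[::-1])):
        for a, b in merge_tree(order, adj):
            lo, hi = (a, b) if heights[a] < heights[b] else (b, a)
            up[t][lo].add(hi)
            dn[t][hi].add(lo)

    def is_leaf(i):
        # upper leaf: no up-arc in the join tree and not a split below;
        # lower leaf: no down-arc in the split tree and not a join above.
        return ((not up[0][i] and len(dn[1][i]) <= 1) or
                (not dn[1][i] and len(up[0][i]) <= 1))

    def reduce_tree(t, i):
        # Remove i from tree t, splicing its (at most one each)
        # upper and lower neighbours together.
        a = next(iter(up[t][i]), None)
        b = next(iter(dn[t][i]), None)
        if a is not None:
            dn[t][a].discard(i)
        if b is not None:
            up[t][b].discard(i)
        if a is not None and b is not None:
            dn[t][a].add(b)
            up[t][b].add(a)
        up[t][i].clear()
        dn[t][i].clear()

    arcs, removed = [], set()
    queue = deque(i for i in range(n) if is_leaf(i))
    while queue and len(removed) < n - 1:
        i = queue.popleft()
        if i in removed or not is_leaf(i):
            continue
        if not up[0][i]:               # upper leaf: its arc is its join-tree arc
            j = next(iter(dn[0][i]))
            arcs.append((j, i))
            up[0][j].discard(i)
            dn[0][i].clear()
            reduce_tree(1, i)
        else:                          # lower leaf: its arc is its split-tree arc
            j = next(iter(up[1][i]))
            arcs.append((i, j))
            dn[1][j].discard(i)
            up[1][i].clear()
            reduce_tree(0, i)
        removed.add(i)
        if is_leaf(j):
            queue.append(j)
    return arcs
```

On a 1-D path mesh the contour tree coincides with the mesh itself, which gives a convenient sanity check on the sketch.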
Chapter 5 Join Trees and Split Trees

This chapter presents some graph-theoretic results which form the basis of the contour-tree construction algorithm in Ch. 7. Recall that Def. 4.13, p. 35, defined contour equivalence in terms of components of the sets {x : f(x) > h} and {x : f(x) < h′}. Def. 4.16, p. 36 then defined the contour tree in terms of classes of equivalent contours over the volume V. These contour classes reflect the underlying continuity of the volume V. In a graph, such as the mesh M underlying the interpolation function, equivalence classes are rather less meaningful. However, connectivity still exists, and it is possible to define structures similar to the contour tree using only the sets {x : x is a vertex and f(x) > h} and {x : x is a vertex and f(x) < h}. The information represented by these sets is collected separately in two structures called the join tree and the split tree.

This chapter contains two results: the reconstruction of an unknown tree from its known join and split trees, and a demonstration that the join tree and split tree for the mesh M are identical to the join and split trees for the augmented contour tree AC_M. The following chapter (Ch. 6) gives an efficient algorithm to compute the join and split trees of the mesh. Ch. 7 then unites the results in this chapter and Ch. 6 to obtain the algorithm to construct the contour tree.

5.1 Definitions of Join and Split Trees

The join and split trees ignore the physical origin of the data: they are defined over an abstract class of graphs, called height graphs. For any given height graph, the join tree and split tree are then defined.

Definition 5.1 A height graph is a graph G, each of whose n vertices x_i has a unique height h_i. As in Assn. 4.1, p. 30, it is assumed that the vertices are in sorted order, i.e. h_1 < h_2 < ... < h_n.

Definition 5.2 A C-tree is a height graph C that is also a tree.
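Real data rarely arrives with vertices already numbered in height order; the sorted-order convention of Def. 5.1 can be established once, up front, by relabelling. A small illustrative helper (names mine, not the thesis's):

```python
def to_height_order(heights, edges):
    """Relabel vertices so that index order equals height order,
    i.e. h_1 < h_2 < ... < h_n as assumed in Def. 5.1.
    Returns (sorted_heights, relabelled_edges, rank), where
    rank[old_index] = new_index."""
    order = sorted(range(len(heights)), key=lambda i: heights[i])
    rank = {old: new for new, old in enumerate(order)}
    return ([heights[i] for i in order],
            [(rank[a], rank[b]) for a, b in edges],
            rank)
```

For example, to_height_order([5, 1, 3], [(0, 1), (1, 2)]) relabels the lowest vertex as 0 and the highest as 2, giving heights [1, 3, 5] and edges [(2, 0), (0, 1)].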
Having defined height graphs, it is useful to consider what a height graph looks like "above" a given vertex (see Fig. 5.1), i.e. to consider the connectivity of the subgraph of all higher-valued vertices. This concept recurs so frequently that a formal definition is in order:

Definition 5.3 The subgraph of a height graph G above h_i, denoted Γ_i^+(G), is the subgraph of G induced by the vertices with height > h_i.

[Figure 5.1: A sample C-tree (a) and the subgraph above x_8 (b)]

Definition 5.4 The subgraph of a height graph G below h_i, denoted Γ_i^-(G), is the subgraph of G induced by the vertices with height < h_i.

I now define the join and split trees corresponding to a given height graph. Although the join and split trees are notionally on the same set of vertices as the height graph, the proofs become more intelligible if I adopt the convention that x_i is a vertex in a height graph (or C-tree), y_i is the corresponding vertex in the join tree, and z_i the corresponding vertex in the split tree (see Fig. 5.2 for the join and split trees corresponding to Fig. 5.1).

Definition 5.5 The join tree J_G of a height graph G is a graph on the vertices y_1, ..., y_{||G||} in which two vertices y_i and y_j, with h_i < h_j, are connected when:

1. x_j is the smallest-valued vertex of some connected component γ of Γ_i^+(G), and

2. x_i is adjacent in G to a vertex of γ.

[Figure 5.2: The Join Tree (a) and the Split Tree (b) corresponding to Fig. 5.1]

Definition 5.6 The split tree S_G of a height graph G is a graph on the vertices z_1, ..., z_{||G||} in which two vertices z_i and z_j, with h_i > h_j, are connected when:

1. x_j is the largest-valued vertex of some connected component γ of Γ_i^-(G), and

2. x_i is adjacent in G to a vertex of γ.

I now abandon height graphs temporarily, and work only with C-trees.
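Definitions 5.5 and 5.6 translate directly into a single-sweep algorithm (anticipating Ch. 6): process the vertices from highest to lowest, maintain the connected components of the subgraph above the sweep value with union-find, and remember each component's smallest-valued vertex so far. A hedged sketch in Python (representation mine; the split tree is the dual, obtained by negating the heights):

```python
def join_tree(heights, adj):
    """Join tree of a height graph (Def. 5.5), built by a high-to-low
    sweep.  Each union-find component is a connected component of the
    subgraph above the sweep value; lowest[r] is that component's
    smallest-valued vertex so far.  When the sweep adds x_i next to a
    component whose smallest vertex is x_j, Def. 5.5 connects y_i to y_j.
    Returns arcs (i, j) with heights[i] < heights[j]."""
    parent, lowest, arcs = {}, {}, []

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for v in sorted(range(len(heights)), key=lambda i: heights[i], reverse=True):
        parent[v], lowest[v] = v, v
        for u in adj[v]:
            if u in parent and find(u) != find(v):
                arcs.append((v, lowest[find(u)]))
                parent[find(u)] = find(v)
                lowest[find(v)] = v
    return arcs

def split_tree(heights, adj):
    """Dual 5.1 in code: the split tree is the join tree of the negated
    heights.  Returns arcs (i, j) with heights[i] > heights[j]."""
    return join_tree([-h for h in heights], adj)
```

For the path graph with heights [1, 3, 2, 4], join_tree returns [(2, 1), (2, 3), (0, 2)]: vertex 2 is where the components created at the two local maxima (vertices 1 and 3) join.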
In order to simplify proofs, I rely on the observation that the join and split trees are essentially the same, but for the direction of comparisons, so that:

Dual 5.1 Any property that holds for join trees holds for split trees, with suitable modifications of up and down, higher and lower, < and >, &c.

All such properties will be noted as "Duals" of the relevant theorems and lemmas, without proof.

5.2 Vertex Degrees in Join and Split Trees

An important property of the join and split trees of a C-tree C is that the degree of a vertex in C can be deduced from its degree in J_C and S_C. This allows me to locate leaves of C without knowing C explicitly. Because the vertex heights are of critical importance, I differentiate between arcs leading "up" from a vertex and those leading "down", and similarly between the "up-degree" and the "down-degree":

Definition 5.7 An up [down] arc is an arc from a vertex x_i to a higher- [lower-] valued vertex x_j.

Definition 5.8 The up degree of a vertex x_i, δ^+(x_i), is the number of up arcs at x_i.

Definition 5.9 The down degree of a vertex x_i, δ^-(x_i), is the number of down arcs at x_i.

Once we have defined up and down arcs, it becomes meaningful to talk of local and global extrema:

Definition 5.10 A local maximum [minimum] x_i in a height graph is a vertex with up [down] degree of 0.

Definition 5.11 The global maximum [minimum] of a height graph is the unique highest- [lowest-] valued vertex.

Since leaves have only one arc, it is easy to see that a leaf must be an extremum. Either the arc is up, in which case the leaf is a local minimum, or the arc is down, in which case the leaf is a local maximum. But not all local extrema are leaves (for example, vertex 2 in Fig. 5.1): a vertex with up-degree 2 and down-degree 0 is obviously not a leaf.
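Up- and down-degrees are simple to tabulate from an arc list, and together they give the extremum and leaf tests just described. A small illustrative sketch (names mine, not the thesis's):

```python
def up_down_degrees(heights, arcs):
    """delta+ and delta- (Defs. 5.8, 5.9) for every vertex, from a list
    of arcs given as unordered vertex pairs."""
    up = [0] * len(heights)
    down = [0] * len(heights)
    for a, b in arcs:
        lo, hi = (a, b) if heights[a] < heights[b] else (b, a)
        up[lo] += 1     # the arc leads up from its lower endpoint...
        down[hi] += 1   # ...and down from its higher endpoint
    return up, down

def classify(heights, arcs):
    """Local maxima (delta+ = 0), local minima (delta- = 0), and leaves
    (total degree 1) of a height graph."""
    up, down = up_down_degrees(heights, arcs)
    maxima = [i for i in range(len(heights)) if up[i] == 0]
    minima = [i for i in range(len(heights)) if down[i] == 0]
    leaves = [i for i in range(len(heights)) if up[i] + down[i] == 1]
    return maxima, minima, leaves
```

For the path C-tree with heights [1, 3, 2, 4] and arcs (0, 1), (1, 2), (2, 3), vertex 2 comes out as a local minimum (down-degree 0) with up-degree 2, so it is an extremum but not a leaf, mirroring the remark above.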
It turns out to be critical to differentiate between a leaf at the "top" of the C-tree and a leaf at the "bottom", so I define:

Definition 5.12 An upper [lower] leaf of a C-tree is a vertex with down- [up-] degree of 1, and up- [down-] degree of 0.

Having defined my terms, I now prove some results about the degrees of vertices in the join and split trees. Recall that each vertex x_i in C has corresponding vertices y_i in J_C and z_i in S_C.

Theorem 5.2 Given a C-tree C and its corresponding join tree J_C, δ^+(x_i) = δ^+(y_i) for every vertex x_i in C and corresponding y_i in J_C.

[Figure 5.3: Components and Up-Arcs in the C-Tree]

Proof: To prove this, I show that each up-arc x_i x_j in C corresponds 1-1 with an up-arc y_i y_k in J_C.

(⇒): Let x_i x_j be an up-arc at x_i in C (see Fig. 5.3). Then x_j must belong to some connected component γ in Γ_i^+(C). I claim that no other vertex adjacent to x_i belongs to γ. Suppose there were another vertex x_l in γ adjacent to x_i. Since γ is a connected component, a path P from x_j to x_l would have to exist in γ. Then P + x_l x_i x_j would be a cycle in C. But C is a tree, and has no cycles, which contradicts the assumption that x_i is adjacent to more than one vertex in γ. Therefore, each up-arc at x_i corresponds to a component γ in Γ_i^+(C). Let x_k be the unique smallest-valued vertex in γ. By Def. 5.5, y_i is connected to y_k. Note that neither x_j nor x_k can belong to more than one connected component γ in Γ_i^+(C). It follows that, for each up-arc at x_i, there exists at least one up-arc at y_i.

(⇐): Let y_i y_k be an up-arc at y_i. By Def. 5.5, x_k belongs to some component γ in Γ_i^+(C), and x_i is connected to some vertex x_j in γ. As with (⇒), this x_j must be unique, because C is a tree.
Since h_j > h_i, we have shown that there is an up-arc at x_i in C corresponding to y_i y_k in J_C. □

Dual 5.3 Given a C-tree C and its corresponding split tree S_C, δ^-(x_i) = δ^-(z_i) for every vertex x_i in C and corresponding z_i in S_C. □

From this result, it is trivial to use properties of J_C and S_C to identify leaves of C:

Corollary 5.4 If δ^+(y_i) = 0 and δ^-(z_i) = 1, then x_i is an upper leaf in C.

Proof: From Theorem 5.2 and Dual 5.3, δ^+(x_i) = 0 and δ^-(x_i) = 1. From Def. 5.12, it then follows that x_i is an upper leaf in C. □

Dual 5.5 If δ^-(z_i) = 0 and δ^+(y_i) = 1, then x_i is a lower leaf in C. □

If the up-degrees in the join tree and C-tree are identical, what can be said about the down-degree? Conveniently, the down-degree of a vertex in the join tree will always be 1, except at the global minimum, where it will be 0.

Lemma 5.6 For each vertex y_j in the join tree J_C, δ^-(y_j) = 1 unless j = 1, in which case δ^-(y_1) = 0.

Proof: Suppose that δ^-(y_j) > 1. Then there are two down arcs at y_j, to y_i and y_k, with h_i < h_k < h_j. From Def. 5.5, x_j must be the smallest-valued vertex of some component γ of Γ_k^+(C), and x_k is adjacent to some x_l in γ. Again applying Def. 5.5, x_j belongs to some component ρ of Γ_i^+(C). If x_m is a vertex of γ, it must be connected to x_j by a path whose vertices all have height ≥ h_j (else x_j would not be the smallest-valued vertex of γ). Since h_i < h_j, it follows that x_m belongs to ρ, and that γ ⊆ ρ. Because x_l ∈ γ, it is also true that x_l ∈ ρ, and since x_k x_l is an arc between two vertices that are higher-valued than x_i, it follows that x_k ∈ ρ.
Since no vertex may have down-degree > 1 in Jc , there must be n — 1 vertices with down-degree 1, and one vertex with down-degree 0. Trivially, the global minimum, y 1 ? cannot have any down-arcs (since there are no vertices with smaller value), and o~~(yi) = 0. Thus, if j' / 1, 5~(yj) ^ 0. • D u a l 5.7 For each vertex Zj in the split tree Sc, $+{ZJ) — 1 unless j = n, in which case 5+(zn) = ().• Coro l la ry 5.8 The join tree Jc of a C-tree C is a tree, and also a C-tree. It is not difficult to see that Jc is connected: each vertex except y\ must be connected to a lower-valued vertex. This vertex in turn connects to another vertex with smaller value yet: this can only stop when y\ is reached: thus, "each vertex is connected by a path leading downwards to y\. Suppose that there is a cycle K in Jc with unique highest-valued vertex yl. Because K is a cycle, yi must be adjacent to two other vertices of K, say ijj and yk- Since yi is the highest-valued vertex in K, the arcs yi\jj and must be down-arcs. But this contradicts Lemma 5.6, which allows y2- to have at most 1 down arc. Since Jc is both connected and acyclic, it must be a tree. And since each vertex has a unique height associated, Jc is also a C Jtree.D 54 Dual 5.9 The split tree Sc of a C-tree C is a tree, and also a C-tree.O 5.3 Preliminaries to Reconstruction The previous section showed that leaves of a contour tree C can be located using the up- ,and down- degrees in Jc and Sc- This can be extended to locate the incident arc to the leaf, then reduce Jc and Sc to smaller join and split trees. This permits reconstruction of C from Jc and Sc with a recursive algorithm. The first step is to show that the incident arc to an upper [lower] leaf is present in the join [split] tree: Lemma 5.10 If x; is an upper leaf, and t/ii/j is the incident arc to y t in Jc, then XiXj is the incident arc to x4- in C. Proof: Let x8- belong to some component 7 in T~j(C). I claim that x,-is the only vertex in 7 . Suppose not. 
Then, since γ is a connected component, there is some other vertex x_k in γ to which x_i is adjacent. By Def. 5.5, x_i is the smallest-valued vertex in γ, so x_i x_k must be an up-arc at x_i. But, since x_i is an upper leaf, it has no up-arcs. It follows that x_i is the only vertex in γ. Applying Def. 5.5, we see that, if y_i y_j is an arc of J_C, then x_j must be adjacent to some vertex in γ. But, since x_i is the only vertex in γ, it follows that x_j is adjacent to x_i. □

Dual 5.11 If x_i is a lower leaf, and z_i z_j is the arc incident to z_i in S_C, then x_i x_j is the arc incident to x_i in C. □

For the recursive construction of C, we remove a leaf x_i from C, and calculate the new join and split trees directly from the old ones. Although we aim to remove leaves, an upper leaf in C may not be a leaf in S_C. In this case, removing z_i from S_C may disconnect the split tree. Similarly, removing a lower leaf may disconnect J_C. To avoid this, recall that the up-degree in S_C is always 1 (Dual 5.7). If the down-degree of z_i in S_C is more than 1, then the down-degree of x_i in C is also more than 1 (Dual 5.3), so x_i cannot be a leaf. Thus, z_i has exactly one up-arc in S_C, and either 0 or 1 down-arcs. Similarly, if x_i is a lower leaf, y_i has exactly one down-arc in J_C, and either 0 or 1 up-arcs.

In removing x_i, we maintain connectivity by contracting the incident arcs into a single arc. This operation will be called reduction to distinguish it from the simple removal of a vertex from a graph (Def. 5.14). Theorem 5.15 will then show that applying the reduction operation gives the join and split trees of the new, smaller graph.

Definition 5.13 Define T ⊖ x_i, the reduction of a C-tree T by a vertex x_i whose up-degree and down-degree are both ≤ 1, to be (see Fig. 5.4):

1. If x_i has arcs x_i x_j up and x_i x_k down in T, then T ⊖ x_i = T \ x_i ∪ x_j x_k

2. Otherwise, T ⊖ x_i = T \ x_i

Figure 5.4: Reduction of a C-tree

Lemma 5.12 Reduction does not increase the up- or down-degree of any vertex.

Proof: This is easily observed, since the reduction deletes vertices and arcs, and only adds the arc x_j x_k when deleting x_i x_j and x_i x_k. □

Definition 5.14 The subgraph of a graph G induced by removing a vertex x_i, denoted G \ x_i, is defined to be the subgraph of G which includes all vertices of G except x_i, and all arcs of G except those incident to x_i.

Lemma 5.13 If x_i is a leaf of a C-tree C, and y_j y_k is an arc of the corresponding join tree J_C such that h_j < h_k, and i ≠ j, k, then y_j y_k is also an arc of J_{C \ x_i}.

Proof: For convenience, let C′ refer to C \ x_i, and assume that paths have no repeated vertices. Let P be any path in C between vertices other than x_i. Since x_i is a leaf, x_i cannot be on P (any path through x_i would have to duplicate an edge, and therefore a vertex). Since x_i is not on P, the path P also exists in C′.

By Def. 5.5, x_j is adjacent to some vertex x_l in the component γ of Γ+_j(C) to which x_k belongs: i.e. there exists some path P from x_l to x_k in γ with no repeated vertices. But then P also exists in C′, and therefore also exists in Γ+_j(C′). Since x_j x_l is also in C′, x_j is adjacent to the component γ′ of Γ+_j(C′) to which x_k belongs. Note that each path P connecting two vertices of γ will also be in γ′, except for paths starting or ending at x_i: thus the vertices of γ are the same as those of γ′, with the possible exception of x_i. It then follows that x_k is the smallest-valued vertex of γ′, so by Def. 5.5, y_j is adjacent to y_k in J_{C′}. □

Corollary 5.14 J_C ⊖ y_i is contained in J_{C \ x_i}.

Proof: This follows from the observation that every arc of J_C that is not incident to y_i is also in J_{C \ x_i}. □

Theorem 5.15 If x_i is a leaf of a C-tree C, then J_{C \ x_i} = J_C ⊖ y_i.

Proof: In Corollary 5.14, I showed that J_{C \ x_i} contains all of the arcs of J_C that are not incident to y_i. I now show that the two are equivalent. For convenience, I will use C′ to refer to C \ x_i.
I separate the proof into three cases: x_i is an upper leaf, x_i is the global minimum, or x_i is a lower leaf (other than the global minimum):

Case I: x_i is an upper leaf:

If x_i is an upper leaf, it has a down-arc, and cannot be the global minimum x_1. By Lemma 5.6, y_i has one down-arc, say y_i y_j. Since J_C is a tree with n − 1 arcs (by Corollary 5.8), this leaves n − 2 arcs of J_C that are not incident to y_i. But, by Lemma 5.13, each of these arcs must be in J_{C′}. Since J_{C′} is a tree on n − 1 vertices, it has n − 2 arcs in total, and it follows that J_{C′} = J_C \ y_i. Since x_i is an upper leaf, y_i has no incident up-arc, and by Def. 5.13, part 2, J_{C′} = J_C \ y_i = J_C ⊖ y_i.

Case II: x_i is the global minimum:

By hypothesis, x_i is a leaf, so δ+(x_i) = 1 and δ−(x_i) = 0. By Theorem 5.2, δ+(y_i) = 1, and by Lemma 5.6, δ−(y_i) = 0. Again, there are n − 2 arcs of J_C that are not incident to y_i, each of which is also in J_{C′} by Lemma 5.13, and J_{C′} = J_C \ y_i. Since x_i is the global minimum, y_i has no down-arc, and by Def. 5.13, part 2, J_{C′} = J_C \ y_i = J_C ⊖ y_i.

Case III: x_i is a lower leaf other than the global minimum:

By hypothesis, x_i is a lower leaf, so δ+(x_i) = 1 and δ−(x_i) = 0. By Theorem 5.2, δ+(y_i) = 1, and by Lemma 5.6, δ−(y_i) = 1. This leaves n − 3 arcs of J_C which are not incident to y_i, and by Lemma 5.13, each of them is also an arc of J_{C′}. Since J_{C′} is a tree on n − 1 vertices, it has n − 2 arcs, so there is only one arc left unaccounted for.

Let the down-arc at y_i be y_i y_j, and the up-arc be y_i y_k (see Fig. 5.5). I claim that y_j y_k is an arc of J_{C′}. From Def. 5.5, x_i belongs to some component γ of Γ+_j(C), and x_j is adjacent to some vertex x_l in γ. Note that x_l x_j must be a down-arc at x_l, and since x_i has no down-arcs, x_l cannot be x_i. Also, since x_i is the smallest-valued vertex in γ, h_i < h_l. Consider the component ρ of Γ+_i(C) to which x_k belongs.
Figure 5.5: Reducing a Join Tree at a Lower Leaf

Since each vertex of ρ has value ≥ h_k, and h_k > h_j, ρ ⊆ γ, and x_k must be in γ. Since x_i is the smallest-valued vertex of γ, the only arc of γ that is not in ρ must be the arc incident to x_i, so ρ = γ \ x_i. But this must be a component of Γ+_j(C′). Since x_l ≠ x_i, it follows that x_l must have been connected to x_i in γ, as was x_k. Then x_l must be connected to x_k by a path whose vertices all have values > h_i. Therefore, x_l and x_k belong to the same component ρ, and since x_k is the smallest-valued vertex of γ \ x_i, y_j must be connected to y_k in J_{C′}.

Note that y_j y_k cannot be an arc in J_C, because y_i y_j, y_i y_k, and y_j y_k would then form a cycle in J_C, and J_C is known to be a tree by Corollary 5.8. Thus, I have added an arc to the n − 3 arcs already shown to be in J_{C′}, for a total of n − 2. Since J_{C′} is a tree on n − 1 vertices, there are no more arcs to be found in J_{C′}. From Corollary 5.14, J_C \ y_i ⊆ J_{C′}, and I have just shown that y_j y_k is the one remaining arc of J_{C′}, so J_{C′} = J_C \ y_i ∪ y_j y_k. Since y_i had an up-arc y_i y_k and a down-arc y_i y_j, it follows by Def. 5.13, part 1, that J_{C′} = J_C ⊖ y_i.

Thus, in all three cases, J_{C′} = J_C ⊖ y_i. □

ReconstructCTree(join tree J_C, split tree S_C) returns C-tree C:

1. Base Case: If ||J_C|| ≤ 2, let C = J_C.

2. Recursive Case:

(a) Choose an index i such that δ+(y_i) = 1 and δ−(z_i) = 0 (a lower leaf), or δ+(y_i) = 0 and δ−(z_i) = 1 (an upper leaf).

(b) If x_i is an upper leaf, find the incident arc y_i y_j in J_C; if x_i is a lower leaf, find the incident arc z_i z_j in S_C.

(c) Let J′ = J_C ⊖ y_i, and S′ = S_C ⊖ z_i.

(d) Let C′ = ReconstructCTree(J′, S′)

(e) Let C = C′ ∪ x_i x_j

Figure 5.6: Basic Reconstruction Algorithm

Dual 5.16 If x_i is a leaf of a C-tree C, then S_{C \ x_i} = S_C ⊖ z_i. □

5.4 Basic Reconstruction Algorithm

In the previous sections, I have shown a number of properties of the join and split trees.
I now put these together to obtain a recursive algorithm that reconstructs C from J_C and S_C:

Algorithm 5.1 Algorithm to Reconstruct a C-Tree: In this algorithm (Fig. 5.6), recall the convention that the vertex x_i in the C-tree C corresponds to the vertex y_i in the join tree J_C and to the vertex z_i in the split tree S_C.

Theorem 5.17 Given a join tree J_C and the corresponding split tree S_C for a C-tree C, Algorithm 5.1 correctly reconstructs the C-tree C.

Proof: Proof is by induction on the size of J_C. Note that J_C, S_C, and C are all the same size.

Base case: ||J_C|| ≤ 2. Since C and J_C are C-trees, and there is only one possible C-tree on each of 0, 1, and 2 vertices, C is correctly computed by Step 1.

Inductive case: Assume that the algorithm works correctly for ||J_C|| < k, and show that it works correctly for ||J_C|| = k. By Corollary 5.4, Step 2a correctly identifies a leaf of C from properties of J_C and S_C only. By Lemma 5.10, Step 2b correctly identifies the arc of C that is incident to x_i. By Theorem 5.15, Step 2c correctly constructs J_{C \ x_i} and S_{C \ x_i}. By the inductive hypothesis, since ||J_{C \ x_i}|| = k − 1, ReconstructCTree() returns C \ x_i. We have already observed that x_i x_j is the arc of C that is incident to x_i. Thus, since C is a tree on k vertices, and C′ = C \ x_i is a tree on k − 1 vertices, Step 2e correctly computes C. □

5.5 Improved Reconstruction Algorithm

A straightforward implementation of Algorithm 5.1 is quadratic in the size of the C-tree, but improvements are possible, based on the following observations:

1. Leaves in C other than the x_i chosen will remain leaves in C \ x_i (from Lemma 5.12, and the observation that C \ x_i is a tree).

2. After constructing J′ and S′, J_C and S_C are no longer needed.

3. Before adding x_i x_j to C, C′ is not needed.

4. The arcs x_i x_j identified on different recursive levels are disjoint.
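Before turning to the improved version, the recursion of Algorithm 5.1 can be sketched concretely. The sketch below is purely illustrative and is not the thesis's implementation: trees are stored as dictionaries mapping each vertex, identified by its rank in sorted height order (larger means higher), to the set of its neighbours, and all function names are hypothetical.

```python
# Illustrative sketch of Algorithm 5.1 (Fig. 5.6). A tree is a dict:
# vertex rank -> set of neighbour ranks. reconstruct() consumes (mutates)
# the join and split trees passed to it.

def up_degree(tree, v):
    return sum(1 for w in tree[v] if w > v)

def down_degree(tree, v):
    return sum(1 for w in tree[v] if w < v)

def reduce_vertex(tree, v):
    """T ⊖ v (Def. 5.13): remove v, splicing its two arcs if it has both."""
    nbrs = tree.pop(v)
    for w in nbrs:
        tree[w].discard(v)
    if len(nbrs) == 2:          # one up-arc and one down-arc: contract
        a, b = nbrs
        tree[a].add(b)
        tree[b].add(a)

def reconstruct(join, split):
    if len(join) <= 2:                        # Step 1: base case, C = J_C
        return {v: set(n) for v, n in join.items()}
    for v in join:                            # Step 2a: find a leaf of C
        if up_degree(join, v) == 0 and down_degree(split, v) == 1:
            j = next(iter(join[v]))           # upper leaf: arc lies in J_C
            break
        if up_degree(join, v) == 1 and down_degree(split, v) == 0:
            j = next(iter(split[v]))          # lower leaf: arc lies in S_C
            break
    # (a leaf is guaranteed to exist for valid join/split trees)
    reduce_vertex(join, v)                    # Step 2c: J' = J_C ⊖ y_v
    reduce_vertex(split, v)                   #          S' = S_C ⊖ z_v
    c = reconstruct(join, split)              # Step 2d
    c.setdefault(v, set()).add(j)             # Step 2e: C = C' ∪ x_v x_j
    c.setdefault(j, set()).add(v)
    return c
```

Because each level scans all remaining vertices for a leaf before recursing, a direct implementation like this is quadratic in the worst case, which is exactly the motivation for the queue-based improvements that follow.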
These observations allow elimination of the tail recursion, and result in a much more efficient algorithm, by making the following changes:

1. A queue containing the leaves of C is maintained at all times.

2. J′ and S′ are constructed by reducing J_C and S_C in place, rather than making copies.

3. C starts off empty, and is constructed in place, adding each x_i x_j as it is identified.

Algorithm 5.2 Improved Algorithm to Reconstruct a C-Tree: In this algorithm (Fig. 5.7), I assume that the join tree J_C and split tree S_C are constructed using adjacency lists with half-arcs: that is, each arc y_i y_j in J_C is stored as a directed arc α in y_i's adjacency list, linked to a directed arc α′ in y_j's adjacency list.

Theorem 5.18 Given a valid join tree J_C and the corresponding valid split tree S_C, Algorithm 5.2 correctly reconstructs the C-tree C.

Proof: Since J_C and S_C are not used after constructing J_{C \ x_i} and S_{C \ x_i}, reducing J_C and S_C in place in Step 3d does not affect the correctness of the algorithm.

ImprovedReconstructCTree(join tree J_C, split tree S_C) returns C-tree C:

1. For each vertex x_i, if its up-degree in J_C plus its down-degree in S_C is 1, queue x_i.

2. Initialize C to an empty graph on ||J_C|| vertices.

3. While queue size > 1:

(a) Dequeue the first vertex x_i on the leaf queue.

(b) If x_i is an upper leaf, find the incident arc y_i y_j in J_C. Otherwise, x_i is a lower leaf, so find the incident arc z_i z_j in S_C.

(c) Add x_i x_j to C.

(d) J_C ← J_C ⊖ y_i; S_C ← S_C ⊖ z_i.

(e) Test whether x_j is a leaf, and if so, transfer it to the leaf queue if it is not already on the leaf queue.

Figure 5.7: Improved Reconstruction Algorithm

Similarly, since the arcs chosen at each iteration are disjoint, building C in place in Step 3c does not affect the correctness, either.

Correctness then depends upon the invariant that the leaf queue contains all leaves of C′ at all times, where C′ is the reduced C-tree that corresponds to the reduced join and split trees. Initially, all vertices are scanned, and the leaves placed on the queue.
From Lemma 5.12, the degrees of vertices in J_C and S_C never increase as reductions take place. Since the up-degree of a vertex in C′ matches the up-degree in J_C, and the down-degree matches the down-degree in S_C, it follows that the degrees in C′ never increase. Also, since C′ must remain a tree at all times, the degree of a leaf on the queue cannot decrease (this would result in a vertex with degree 0, disconnecting the tree).

From the mechanics of reduction in the join tree, either an upper leaf, the global minimum, or a lower leaf other than the global minimum is reduced.

Case I: x_i is an upper leaf:

Theorem 5.15 showed that y_i is adjacent to precisely one vertex y_j. It then follows that y_j has its degree reduced by 1, and that all other vertices retain their degree after the reduction. In this case, Step 3e maintains the property as required.

Case II: x_i is a lower leaf other than the global minimum:

If x_i was a lower leaf other than the global minimum, Theorem 5.15 shows that y_i is between y_j and y_k: we replace y_i y_j and y_i y_k with y_j y_k. In this case, neither y_j nor y_k has its degree reduced, so y_j does not become a leaf unless it already was one, and neither does y_k. Again, Step 3e maintains the property as required.

Case III: x_i is the global minimum:

If x_i is the global minimum, z_i is adjacent to some z_j, and by Dual 5.11, x_i is adjacent to x_j. Since x_i is the global minimum, y_i has no down-arcs, and must have at least one up-arc. But since x_i is a leaf, δ+(x_i) = δ+(y_i) = 1. Let the up-arc from y_i be y_i y_k. If y_k is y_j, then deleting y_i from J_C makes y_j the new global minimum, and does not change the degree of any other vertex of J_C. It then follows that Step 3e maintains the property as required.

Otherwise, y_j is not the same as y_k. By Def. 5.5, x_k is the smallest-valued vertex of some component γ in Γ+_i(C).
Figure 5.8: The Leaf Queue and the Global Minimum

Also, because x_i is a leaf and adjacent to x_j, x_i cannot be adjacent to x_k. Thus, since x_i is the global minimum, x_k is not adjacent to any x_m in C such that h_m < h_k, which is to say that δ−(x_k) = 0 and δ−(z_k) = 0 before the reduction takes place. Since h_i < h_k, deleting the arc y_i y_k cannot affect δ+(y_k), and by Theorem 5.2, δ+(x_k) is also unaffected. I now break into subcases depending on the up-degree of x_k (which is the same as the up-degree of y_k):

Subcase IIIa: δ+(y_k) = 1:

Since δ−(z_k) = 0, x_k was a lower leaf before the reduction took place, by Dual 5.5. Thus, by the invariant, x_k was already on the leaf queue, and does not need to be tested at this iteration.

Subcase IIIb: δ+(y_k) > 1:

If δ+(y_k) > 1 before reduction, δ+(y_k) > 1 after reduction, and x_k does not need to be tested at this iteration. Step 3e maintains the property as required.

Subcase IIIc: δ+(y_k) = 0:

This case can never happen. To see this, observe that J_C is a tree (Corollary 5.8), and δ−(y_k) = 1 before reduction (Lemma 5.6). After deleting y_i y_k from J_C, δ+(y_k) = δ−(y_k) = 0, which is only possible if J_C has size 1. But in this case, there can be no other vertex y_j in J_C, which contradicts the assumption that x_i was connected to x_j in C. Thus it follows that δ+(y_k) ≠ 0 before reduction.

I have now shown that, in all cases that can occur, Step 3e maintains the invariant. □

Theorem 5.19 Algorithm 5.2 takes O(t) time and space to reconstruct a C-tree C with t vertices.

Proof: Again, we consider the time and space requirements for each step:

1. Step 1 takes O(1) time to test each vertex to see if it is a leaf. Since a leaf has maximum degree 2 in either the join or split tree, it takes O(1) time to test each vertex, for a total of O(t) time to check all vertices.

2. Step 2 takes at most O(t) time to initialize C.

3. Step 3 iterates t − 1 times, since one arc is added to C on each iteration.
(a) Step 3a dequeues the first vertex x_i on the leaf queue in O(1) time.

(b) Step 3b takes O(1) time to locate the arc in J_C or S_C.

(c) Step 3c takes O(1) time to add the arc to C.

(d) Step 3d takes O(1) time to reduce J_C and S_C. Since x_i is a leaf, y_i has maximum degree 2, and the arcs y_i y_j and y_i y_k can be located in O(1) time each for deletion. Since the arcs are stored as half-arcs in adjacency lists at each vertex, y_j y_k can be added as follows: locate the half-arcs α and β from y_i to y_j and y_k; for each of these, find the corresponding half-arcs α′ and β′; delete α and β, and connect α′ to β′. This takes at most 3 operations to reduce each tree, each of which is performed in O(1) time.

(e) Step 3e tests whether x_j is a leaf in O(1) time, and if so, transfers it to the leaf queue in O(1) time.

Summing up the time required for the algorithm gives a total of O(t) time. O(t) space is required to store each of J_C, S_C, and C, since they are trees on t vertices. The leaf queue will also require O(t) space, for a total of O(t) space required. □

5.6 Join Trees of the Mesh & the Contour Tree

The algorithm described in the previous section is useful only if we have a convenient way of obtaining the join and split trees for the contour tree. Fortunately, the join and split trees for the augmented contour tree are identical to those for the mesh. To show this, I start with the trivial observation that the contour tree is a C-tree: this implies that the join tree is well-defined, and that the algorithm in the previous section can reconstruct the contour tree correctly.

Lemma 5.20 The contour tree C of a mesh M is a C-tree.

Proof: Def. 5.1 defines a height graph to be a graph whose vertices have heights associated, in sorted order. We know from Def. 4.16, p.36 that the vertices of the contour tree are finite contour classes, and are thus associated with the vertices x_1, ..., x_n, which are assumed to be in sorted order by Assn. 4.1, p.30.
Thus a contour tree is a height graph. Def. 5.2 defines a C-tree to be a height graph that is also a tree. By Lemma 4.3, p.38, we know that the contour tree is a tree when the mesh is connected. Since this is the case that interests us, it then follows that the contour tree is a C-tree. □

In Def. 4.16, p.36, I defined the contour tree to be a tree whose vertices were the critical points of the mesh. In Def. 5.5, I defined the join tree so that it included all the vertices of the mesh. Clearly, these two vertex sets need not be the same. However, I also defined the augmented contour tree in Sec. 4.7, p.41 in such a way that it included all the vertices of the mesh. Thus, if I can relate the join tree of the mesh to that of the augmented contour tree, it only remains to reduce the augmented contour tree to the contour tree.

I will show that the components of the subgraphs above [below] any vertex in the augmented contour tree are essentially identical to the corresponding components of the mesh, but first, I must prove a preliminary result:

Lemma 5.21 x_i and x_j belong to the same component of Γ+_k(M) precisely when x_i and x_j belong to the same component of {x : f(x) > h_k}.

Proof:

Figure 5.9: Constructing a graph path from a path in space

(⇒): It suffices to show that this is true for any x_i and x_j connected in the mesh M by the edge x_i x_j. From Assn. 1.4, p.7, the interpolation is piecewise-linear over the cells of the mesh. Any point x on the edge x_i x_j must have a value between those of x_i and x_j: thus x belongs to {x : f(x) > h_k}. Since this is true for all points along the edge, x_i and x_j must belong to the same connected component of {x : f(x) > h_k}.

(⇐): Let x_i and x_j be connected in {x : f(x) > h_k}. Then there exists some path P from x_i to x_j such that f(p) > h_k for all points p in P (see Fig. 5.9). Since the mesh is assumed to be simplicial (Assn. 1.3, p.7), the path P is completely contained by the sequence of simplices through which it passes.
If a vertex of one of these simplices has value > h_k, then the vertex belongs to the same component of {x : f(x) > h_k} as P (by Assn. 1.4, p.7). Similarly, any edge between two vertices of a simplex on the path is in the same component if both ends of the edge have value > h_k. P may only enter and exit a simplex through faces which have at least one vertex higher than x_k. Pick the highest vertices x_p and x_q on the entry and exit faces. Since these vertices are in the same simplex, they are adjacent, and the edge x_p x_q is in {x : f(x) > h_k}. Thus, I can replace the path P with a path Q which travels only along edges of the mesh: this path Q must also exist in Γ+_k(M), since it is composed of edges between vertices with values > h_k. It then follows that x_i and x_j belong to the same connected component of Γ+_k(M). □

Dual 5.22 x_i and x_j belong to the same component of Γ−_k(M) precisely when x_i and x_j belong to the same component of {x : f(x) < h_k}. □

Lemma 5.23 For each component in Γ+_i(AC_M) (where AC_M is the augmented contour tree for the mesh M), there exists a component in Γ+_i(M) containing exactly the same vertices.

Proof: Proof is by finite induction, starting with the highest vertex x_n, for which the property is trivially true. Assuming that the hypothesis is true for k ≤ i ≤ n, I now consider the vertex x_k: the only difference between the components of Γ+_k(M) and Γ+_{k−1}(M) is that x_k and its up-arcs have been added to the latter. Thus, I break the proof into three cases: local maxima, joins, and other points:

Case I: x_k is a local maximum:

Since x_k has no up-arcs, there are no edges added to Γ+_k(M) to obtain Γ+_{k−1}(M). What is the corresponding change in the augmented contour tree?

Figure 5.10: Region Representing an Edge in Augmented Contour Tree

From Def. 4.16, p.36, a local maximum only has one edge: this descends from it.
Thus, since x_k is not adjacent to any components of Γ+_k(M), no edges are added to Γ+_k(AC_M) to obtain Γ+_{k−1}(AC_M). By the inductive hypothesis, the components of Γ+_k(M) and Γ+_k(AC_M) contain the same vertices. It then follows that the components of Γ+_{k−1}(M) and Γ+_{k−1}(AC_M) contain the same vertices.

Case II: x_k is a join:

Suppose that x_k is adjacent to x_j in AC_M, with h_k < h_j. From Def. 4.17, p.41, x_k and x_j both either belong to some superarc, or are endpoints of it. Since the superarcs and supernodes correspond to connected contour classes, I take the union of these contour classes, and obtain a connected set in the original space of points with values between h_k and h_j. Therefore, there is a path P from x_k to x_j in this set (see Fig. 5.10).

But this set is contained in some component γ of {x : f(x) > h_{k−1}}. So, by Lemma 5.21, x_k and x_j must also be connected in Γ+_{k−1}(M). This is true for each edge x_k x_j in AC_M (with h_k < h_j). Also, the components of Γ+_k(M) and Γ+_k(AC_M) have the same vertex sets by the induction hypothesis. Thus, it follows that x_k is connected to the same components of Γ+_k(M) in M as in AC_M. As a result, the component of Γ+_{k−1}(M) to which x_k belongs will correspond directly to the component of Γ+_{k−1}(AC_M) to which x_k belongs. Components to which x_k does not connect will be unaffected, so we conclude that the components of Γ+_{k−1}(M) and Γ+_{k−1}(AC_M) contain the same vertices, as required.

Case III: x_k is neither a join nor a local maximum:

In this case, x_k is adjacent to only one component of Γ+_k(M), and an argument similar to that of Case II applies to show that the components of Γ+_{k−1}(M) and Γ+_{k−1}(AC_M) contain the same vertices. □

Dual 5.24 For each component in Γ−_i(AC_M), there exists a component in Γ−_i(M) containing exactly the same vertices. □

A corollary of this result is that the augmented contour tree has the same join tree as the mesh does:

Corollary 5.25 The augmented contour tree AC_M and the mesh M have the same join tree (i.e.
J_{AC_M} = J_M).

Proof: In Def. 5.5, I defined the join tree of a height graph G in terms of the components of Γ+_i(G). By Lemma 5.23, these components are identical in AC_M and M, and we saw in the proof of the theorem that x_i will be connected to the same components of Γ+_i(AC_M) and Γ+_i(M). It follows immediately from Def. 5.5 that J_{AC_M} = J_M. □

Dual 5.26 The augmented contour tree AC_M and the mesh M have the same split tree (i.e. S_{AC_M} = S_M). □

Chapter 6

Join & Split Trees of the Mesh

The contour tree is useful because it represents the topology of the level sets in the data. Taking a cross-section of the contour tree at a given height h gives us the connectivity of the level set {x : f(x) = h}. Corresponding to this property, it can be shown that the join tree and split tree of the mesh give us the connectivity of the sets {x : f(x) > h} and {x : f(x) < h} respectively: the interior and exterior of the level set. Describing these sets as "interior" and "exterior" is consistent with sweeping through the isovalues from high to low: the interior is the set of points enclosed by the level set as it expands "outwards."

This property, although illustrative, is not necessary for what follows: an algorithm for the construction of the join and split trees of a sorted simplicial mesh in O(N + tα(t)) time and O(n) space. The following chapter uses this algorithm, and the algorithms demonstrated in the previous chapter, to construct the contour tree for the mesh.

Throughout this chapter, I shall use M to refer to the simplicial mesh that fills the spatial volume. Since each vertex x_i of M has an associated height h_i, in sorted order, M satisfies the definition of a height graph in Def. 5.1, p.47.

6.1 Construction of the Join Tree

From Def. 5.5, p.48, each vertex y_i of J_M must be connected to the smallest-valued vertex in each component in Γ+_i(M). At y_i, these components will merge: at lower values of i, there will be one component corresponding to all of them.
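This merging behaviour can be sketched with a small union-find structure before the algorithm is stated precisely. The code below is illustrative only (integer vertices in sorted height order, adjacency lists, hypothetical names); Fig. 6.1 gives the thesis's version, with the LowestVertex bookkeeping spelled out step by step.

```python
# Illustrative sketch: join-tree arcs from a descending sweep with union-find.
# Vertices are 0..n-1 in sorted order (larger index = higher vertex);
# adj maps each vertex to a list of its mesh neighbours.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def join_tree(n, adj):
    uf = UnionFind(n)
    lowest = {}     # component root -> lowest vertex seen in that component
    arcs = []
    for i in range(n - 1, -1, -1):        # sweep from highest to lowest
        for j in adj[i]:
            if j < i:                     # only neighbours already swept
                continue
            rj = uf.find(j)
            if uf.find(i) == rj:          # already merged into i's component
                continue
            arcs.append((i, lowest[rj]))  # arc y_i -- lowest vertex of component
            uf.union(j, i)
        lowest[uf.find(i)] = i            # i is now the lowest (covers maxima too)
    return arcs
```

The sweep adds each mesh edge once, at its lower end, so the work is dominated by the union-find operations, in line with the O(N + tα(t)) bound claimed for the precise algorithm.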
This property makes it possible to use Tarjan's disjoint set union algorithm [24] to compute J_M. Tarjan's algorithm progressively constructs the connectivity of subgraphs of M by adding edges to the union-find structure, in any arbitrary order. At each step, the union-find structure represents the connectivity of the subgraph of M induced by the edges already added. Each edge x_j x_i in M runs from a "higher" end x_j to a "lower" end x_i (i.e. h_j > h_i, or j > i). Since Tarjan's algorithm does not require any particular order, I add the edges of M to the union-find structure in order of the height of the lower end. This leads to the following algorithm:

Algorithm 6.1 Algorithm To Construct J_M: In the algorithm (Fig. 6.1), the only item added to the usual union-find representation is LowestVertex, which stores the lowest vertex for each component: this allows me to add edges to the join tree at each step.

1. for i := n downto 1 do:

(a) Component[i] := NONE

(b) for each vertex x_j adjacent to x_i:

(i) if (j < i) or (Component[i] = Component[j]), skip x_j

(ii) if Component[i] = NONE:
(A) Component[i] := Component[j]
else:
(B) UFMerge(Component[i], Component[j])

(iii) AddEdgeToJoinTree(y_i, LowestVertex[Component[j]])

(iv) LowestVertex[Component[j]] := y_i

(c) if Component[i] = NONE:

(i) Component[i] := i

(ii) LowestVertex[i] := y_i

Figure 6.1: Algorithm to Construct Join Tree

To show that this algorithm correctly computes J_M, I shall prove by induction that the components in the union-find structure correspond directly with the components of Γ+_i(M) immediately before iteration i. Next, I show that LowestVertex holds the lowest vertex in each component. Finally, I shall show that Step 1(b)(iii) generates all the edges of J_M.

Lemma 6.1 Immediately before step i, the components of the union-find structure used by Algorithm 6.1 correspond to the components of Γ+_i(M).

Proof: Proof is by finite induction, with base case of i = n.
In this base case, the union-find structure is empty, and Γ+_n(M) contains no vertices. Since neither has any components, the correspondence is obvious. I assume, therefore, that the result is true for k ≤ i ≤ n, and show that the result holds for k − 1. The proof now breaks into three cases, depending on whether x_k is adjacent to 0, 1, or ≥ 2 components of Γ+_k(M).

Case I: x_k is adjacent to no components yet:

Since vertices x_n to x_{k+1} have already been processed, this means that x_k is not adjacent to any higher vertices. Thus x_k is a local maximum, and j < k for every vertex x_j adjacent to x_k. Accordingly, Step 1(b)(i) causes the algorithm to skip each of these vertices, and when Step 1b completes, Step 1c executes, setting x_k to represent a new component in the union-find structure. This corresponds to the component {x_k} in Γ+_{k−1}(M). Any other component in Γ+_{k−1}(M) must consist solely of vertices higher than x_k. By the inductive hypothesis, these were correctly represented prior to step k. Since step k does not merge any of these components, it follows that these components are still correctly represented in the union-find structure.

Case II: All adjacent vertices higher than x_k belong to one component:

In this case, x_k is adjacent to vertices of exactly one component in Γ+_k(M). When the first of these vertices is encountered, Step 1(b)(ii)(A) adds x_k to the component. Subsequent neighbours higher than x_k all belong to this same component, and are skipped due to Step 1(b)(i). Thus, x_k is added to the component, and all other components are left untouched.

Case III: Adjacent vertices higher than x_k belong to more than one component:

As in Case II, Step 1(b)(ii)(A) adds x_k to the first component encountered adjacent to x_k.
When the first vertex from a different component is encountered, Step 1(b)(ii)(B) merges the two components together: subsequent adjacent vertices belonging to either of these components will then be skipped at Step 1(b)(i). Thus, the first vertex from each adjacent component causes that component to merge into x_k's component. As in the previous cases, components not adjacent to x_k are unaffected. The result then follows by induction. □

Lemma 6.2 For each component γ in Γ+_i(M), LowestVertex[γ] holds the smallest vertex in γ before step i of Algorithm 6.1.

Proof: Again, proof is by finite induction on i. The base case is trivial: no components exist before step n.

In Case I of Lemma 6.1, I showed that Step 1(b)(ii) is never reached: only Step 1c is executed, creating a component containing only x_k. Step 1(c)(ii) then sets its lowest vertex correctly to x_k.

In Case II of Lemma 6.1, I showed that Step 1(b)(ii)(A) adds x_k to an existing component. Step 1(b)(iv) then sets the lowest vertex of this component to x_k. Since the only vertices in this component prior to step k were higher than x_k, x_k is in fact the lowest vertex: other components are unaffected.

In Case III of Lemma 6.1, I showed that all components adjacent to x_k were merged by Step 1(b)(ii)(B): Step 1(b)(iv) then sets the lowest vertex correctly to x_k. The result again follows by induction. □

Theorem 6.3 Algorithm 6.1 correctly computes J_M.

Proof: The two preceding lemmas (Lemma 6.1 and Lemma 6.2) have established that the union-find structure represents the components of Γ+_i(M) at step i, and that LowestVertex holds the smallest-valued vertex of each component at the same time. From Def. 5.5, p.48, we see that y_i is connected to y_j in J_M precisely when x_j is the smallest-valued vertex of a component γ of Γ+_i(M), and x_i is adjacent to some vertex in γ. But this is exactly the edge we add to the join tree in Step 1(b)(iii).
It follows that the result is J_M. □

Theorem 6.4 Algorithm 6.1 computes J_M in O(N + tα(t)) time and O(n) space.

Proof: The steps of the algorithm each execute the following number of times:

1. Step 1 executes n times.

2. Step 1a executes once per loop, for a total of n times.

3. Step 1b executes once for each incident edge at each vertex, or twice for each edge in the mesh in total. As noted in Def. 2.8, p.12, the total number of edges in a simplicial mesh is O(N).

4. Step 1(b)(i) executes once per edge (at the lower end): a total of O(N) times.

5. Step 1(b)(ii) executes at most once per vertex: a total of O(n) times, with O(1) work on each execution.

6. Step 1(b)(ii)(B) performs at most t merges, for a total time of O(tα(t)).

7. Step 1c executes once for each local maximum, for a total of O(t) times.

Summing these times up, we get a total of O(N + n + tα(t)), which simplifies to O(N + tα(t)). Space requirements are O(n) for the union-find structure, O(n) for the lowest vertices, and O(n) for the join tree that is created, for a total of O(n) space. □

Dual 6.5 By the same duality observed in Dual 5.1, p.49, the split tree S_M may be computed in O(N + tα(t)) time and O(n) space. □

Chapter 7

Contour Tree Construction Algorithm

In the previous chapters, I have defined the contour tree (Ch. 4), described how to reconstruct a C-tree from join and split trees (Ch. 5), and shown how to compute the join and split trees of the mesh efficiently (Ch. 6). In this chapter, I put these pieces together to get an efficient algorithm for computing the contour tree of the entire mesh.

7.1 Constructing the Augmented Contour Tree

In Sec. 5.6, p.68, I showed that the mesh and the augmented contour tree have identical join and split trees.
From this, Algorithm 6.1, p. 76 and Algorithm 5.2, p. 63, I can now assemble an algorithm to compute the augmented contour tree:

Algorithm 7.1 Algorithm to construct the Augmented Contour Tree

Given a mesh M, compute the augmented contour tree AC_M as follows:
1. Use Algorithm 6.1, p. 76 to compute the join and split trees J_M and S_M.
2. Use Algorithm 5.2, p. 63 to compute AC_M from J_M and S_M.

Theorem 7.1 Algorithm 7.1 correctly constructs the augmented contour tree for the mesh M.

Proof: By Theorem 6.3, p. 79, Step 1 correctly computes J_M and S_M. By Corollary 5.25 and Dual 5.26, these are identical to J_{AC_M} and S_{AC_M}. And by Theorem 5.18, p. 63, Step 2 correctly computes AC_M. □

Theorem 7.2 Algorithm 7.1 computes the augmented contour tree for the mesh M in O(N + t α(t)) time and O(n) working space.

Proof: By Theorem 6.4, p. 80, Step 1 takes O(N + t α(t)) time and O(n) space. By Theorem 5.19, p. 67, Step 2 takes O(n) time and space. Since n = O(N), the result follows. □

7.2 Constructing the Contour Tree

I have now given an algorithm for computing the augmented contour tree. But what of the contour tree itself? Fortunately, it is easy to compute the contour tree from the augmented contour tree by using the reduction operation defined in Def. 5.13, p. 56. This leads to the following algorithm for computing the contour tree:

Algorithm 7.2 Algorithm to construct the Contour Tree

Given a mesh M, we compute the contour tree C as follows:
1. Use Algorithm 7.1 to compute the augmented contour tree AC.
2. For each vertex x_i in AC:
   (a) If δ+(i) = δ−(i) = 1:
       (i) Reduce vertex x_i.

Theorem 7.3 Algorithm 7.2 correctly constructs the contour tree for the mesh M.

Proof: By Theorem 7.1, p. 83, Step 1 correctly computes the augmented contour tree AC. In Sec. 4.7, p. 41, I observed that the ordinary points of the mesh were inserted into the superarcs to which they belonged, and that each ordinary point has one up-arc and one down-arc.
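Step 2 of Algorithm 7.2, the reduction of ordinary vertices, can be sketched on an adjacency-set representation. The `up`/`down` dictionaries here are an illustrative encoding, not the thesis data structures.

```python
def reduce_to_contour_tree(up, down):
    """up[v] and down[v] hold v's neighbours above and below in the
    augmented contour tree. Remove every ordinary vertex (up-degree =
    down-degree = 1) by splicing its two arcs, leaving only supernodes."""
    for v in list(up):
        if len(up[v]) == 1 and len(down[v]) == 1:
            (a,) = up[v]     # unique neighbour above v
            (b,) = down[v]   # unique neighbour below v
            down[a].discard(v)
            down[a].add(b)
            up[b].discard(v)
            up[b].add(a)
            del up[v], down[v]
    return up, down
```

A monotone chain collapses to a single arc, while genuine supernodes (for example a join with two up-arcs) are left untouched.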
This is not true for critical points, since these will always have either the up-degree or the down-degree ≠ 1. Thus, Step 2a correctly identifies the ordinary points in the contour tree, and Step 2(a)(i) correctly removes them from the contour tree. □

Theorem 7.4 Algorithm 7.2 computes the contour tree for the mesh M in O(N + t α(t)) time and O(n) working space, and requires O(t) output size.

Proof: By Theorem 7.2, Step 1 requires O(N + t α(t)) time and O(n) working space. Step 2 executes O(n) times, but, as in Theorem 5.19, p. 67, Step 2a and Step 2(a)(i) take O(1) time for each vertex. This gives a total of O(N + t α(t)) time and O(n) working space. The output is the contour tree, which requires O(t) space. □

Although Algorithm 7.2 correctly computes the contour tree, note that it assumes that the vertices x_1, ..., x_n are in sorted order. If this is not the case, then the cost of sorting must be included:

Corollary 7.5 Algorithm 7.2 computes the contour tree for an unsorted mesh in O(n log n + N + t α(t)) time and O(n) working space. □

Finally, some small optimizations of this algorithm are possible. If the augmented contour tree is not required, it is possible to use the reduction operation to suppress ordinary points during the computation of the join and split trees (see Algorithm 6.1, p. 76). In this case, we store the HighestVertex for each component as well as the LowestVertex, and only generate a join tree arc at a join or the global minimum. Note that splits and local minima will not be represented in this join tree: they must be inserted. Fortunately, all splits and local minima are identified in the split tree. If sufficient information is retained from the join tree computation, it is easy to determine which arc of the join tree the splits and local minima belong to. Inserting these vertices on this arc in sorted order then gives the correct join tree for the contour tree.
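For completeness, the leaf-peeling merge in the spirit of Algorithm 5.2 (the step Algorithm 7.1 uses to combine J_M and S_M) can be sketched as follows. This is a reconstruction from the text's description, not the thesis implementation; it assumes the leaf condition that a vertex may be transferred when its join-tree up-degree plus split-tree down-degree equals one.

```python
def merge_trees(jt_up, jt_down, st_up, st_down):
    """Merge a join tree and a split tree (given as up/down adjacency sets
    over the same vertex set) into the arcs of the augmented contour tree."""
    arcs = []
    verts = set(jt_up)

    def is_leaf(v):
        return len(jt_up[v]) + len(st_down[v]) == 1

    def delete(v, up, down):
        # remove v from one tree, contracting past it if it was interior
        ups, downs = up.pop(v), down.pop(v)
        for a in ups:
            down[a].discard(v)
        for b in downs:
            up[b].discard(v)
        if len(ups) == 1 and len(downs) == 1:
            (a,), (b,) = ups, downs
            down[a].add(b)
            up[b].add(a)

    queue = [v for v in verts if is_leaf(v)]
    while len(verts) > 1:
        v = queue.pop()
        if v not in verts or not is_leaf(v):
            continue
        if len(jt_up[v]) == 0:        # upper leaf: take its join-tree arc
            (w,) = jt_down[v]
            arcs.append((w, v))
        else:                          # lower leaf: take its split-tree arc
            (w,) = st_up[v]
            arcs.append((v, w))
        delete(v, jt_up, jt_down)
        delete(v, st_up, st_down)
        verts.discard(v)
        if is_leaf(w):
            queue.append(w)
    return arcs
```

On a 1-D "W"-shaped terrain (values 0, 3, 1, 4 along a path), the merge recovers the terrain's own contour tree.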
Since the vertices are already sorted, this can be done in O(1) time per vertex, for a maximum of O(t) time. However, doing this adds a significant amount of complexity for relatively little gain, since the augmented contour tree is used for the purpose of generating seed sets (see next chapter).

It appears to be possible to avoid the O(n log n) cost of sorting the vertices. To do this, note that, for each vertex x_i, we need to know the components in the union-find of its neighbours only: this does not depend on a global ordering of the vertices. If components are "grown" downwards from each local maximum, it should be possible to avoid sorting entirely. However, this would require separate search queues for each local maximum: this makes the algorithm much more complex, and is likely to be prohibitive in practice. Note that the O(n log n) in this algorithm is entirely due to the sort, and not due to the cost of building a data structure. Since sorting can usually be done efficiently in practice, the cost of sorting may not be a major concern.

Chapter 8

Generating Seeds

The contour tree represents the connectivity of all level sets: to generate the isosurfaces, we need seeds for contour-following (Sec. 3.5.1). I start by computing both the contour tree and the augmented contour tree. The contour tree is used to identify which superarcs intersect the desired level set: this can be done by searching all superarcs, or by storing the superarcs in an interval tree. Once the relevant superarcs are identified, a seed edge is selected for each superarc. This can be done in one of several ways:

8.1 Simple Seed Sets

The simplest method is to store the entire augmented contour tree. To identify a seed cell for an isovalue h, the corresponding superarcs are searched in the augmented contour tree to find the arcs x_i x_j for which h_i < h < h_j. However, x_i and x_j need not be adjacent to each other in the mesh.
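The search for superarcs spanning an isovalue can be sketched as a linear scan (a sketch only; `value` maps vertex indices to their scalar values, and the interval-tree alternative mentioned above would replace the scan):

```python
def spanning_superarcs(superarcs, value, h):
    """Return the superarcs (i, j) whose endpoint values straddle the
    isovalue h. One seed per returned superarc is enough to start
    contour-following for the level set at h."""
    return [(i, j) for (i, j) in superarcs
            if value[i] < h <= value[j]]
```

With t superarcs this scan costs O(t) per query; sorting or an interval tree reduces it, as discussed below.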
If x_i is not a split, all its down-arcs intersect the desired contour, so we simply pick a down-arc. Similarly, if x_j is not a join, any of its up-arcs can be used as a seed. If x_i is a split and x_j is a join, a problem arises: it is difficult to pick an edge for the interpolation. This can be dealt with by picking an up-arc for each up superarc when we construct the join tree, and a down-arc for each down superarc, although this adds complexity. This simple search takes O(n) time if done with a linear search, or O(log n) for a binary search. In extreme cases, this could lead to Ω(n log n) cost to find the seed set.

8.2 Heuristic Seed Sets

Instead of precomputing the seed set, I borrow from Itoh & Koyamada [16, 15] and Bajaj et al. [2] the trick of associating a path with each superarc. This path can be computed in the following way at runtime:
1. At a local maximum, start the path with the edge to the lowest-valued adjacent vertex.
2. At a local minimum, start the path with the edge to the highest-valued adjacent vertex.
3. At a join, each ascending superarc corresponds to a join component that terminates at the join. On each such superarc, store the edge between the current vertex and the highest adjacent vertex belonging to that join component. Start from this edge when finding a seed.
4. At a split, each descending superarc corresponds to a split component: again, store a starting edge on the superarc.
5. If none of the above apply, the superarc must descend from a join to a split: in this case, there is only one descending superarc from the join (and only one ascending superarc from the split). Therefore, any edge to a lower-valued vertex adjacent to the join vertex can be used to start the path, as can any edge to a higher-valued vertex adjacent to the split vertex.

In all these cases, a supernode has been identified, as has an adjacent starting node on either an ascending or descending path.
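At query time, the rules above bottom out in a simple ascent loop. A minimal sketch for the ascending case (function and parameter names are hypothetical; it assumes the isovalue h is reachable above the starting vertex):

```python
def ascending_seed(start, h, value, neighbours):
    """Follow edges to the highest adjacent vertex, starting from `start`,
    until an edge straddles the isovalue h; return that edge as a seed."""
    v = start
    while value[v] < h:
        w = max(neighbours[v], key=lambda u: value[u])  # highest neighbour
        if value[w] >= h:
            return (v, w)   # this edge crosses the isovalue
        v = w
    return None             # start was already at or above h
```

The descending case is symmetric, following the lowest adjacent vertex instead.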
Assuming an ascending path, if the supernodes do not already straddle the desired isovalue, advance to the edge from the adjacent node to its highest adjacent neighbour. Repeat until the supernodes straddle the desired isovalue. Note that this heuristic will never reach a dead end at a local maximum: if there is more than one maximum to be found in this way, there must be a join vertex at a lower value than all such maxima. The desired superarc must then terminate at or before the join vertex. If the paths are to be preprocessed and stored, perform a breadth-first search in the same fashion for the isovalue at the far end of the superarc, thus finding the shortest path to a vertex above that isovalue. This preprocessing can be performed in O(N) time and O(n) space, since the same edge is never considered more than once.

As noted in [17], a dataset can consist entirely of local extrema: in this case, t = Ω(n), and it will take O(n) time to generate a level set that intersects all of the superarcs. But no algorithm can improve on this, since the output size k will be Ω(n). To determine which superarcs span a level set, the superarcs can be stored in an interval tree, as Cignoni et al. [5] do with the edges of the original dataset. This approach is used by Bajaj et al. in [2]. Using it reduces the time to generate a level set to O(log t) + k, at the cost of O(t log t) preprocessing. Since we expect that t ≪ N (Def. 2.10, p. 13), storing superarcs in this way should not be necessary for crystallographic data.

Chapter 9

Implementation

This chapter discusses the major issues that arose during the implementation of the algorithm, and the solutions adopted in each case.

9.1 Simplicial Subdivision

The first difficulty to be resolved is the assumption that the data is acquired upon a rectilinear grid (Assn. 1.2). Ideally, we would like to use the rectilinear grid directly, with the natural tri-linear interpolation function over each voxel.
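The tri-linear interpolant over a unit voxel, which the subdivision schemes of this chapter try to approximate, can be written directly (a sketch; `c[i][j][k]` is the sample at corner (i, j, k)):

```python
def trilinear(c, x, y, z):
    """Tri-linear interpolation inside a unit voxel, for (x, y, z) in
    [0,1]^3: interpolate along x on the four x-aligned edges, then along
    y on the two resulting values per face, then along z."""
    def lerp(a, b, t):
        return a + t * (b - a)

    c00 = lerp(c[0][0][0], c[1][0][0], x)
    c10 = lerp(c[0][1][0], c[1][1][0], x)
    c01 = lerp(c[0][0][1], c[1][0][1], x)
    c11 = lerp(c[0][1][1], c[1][1][1], x)
    return lerp(lerp(c00, c10, y), lerp(c01, c11, y), z)
```

On data that is linear in the coordinates the interpolant is exact, which makes it a convenient reference when judging subdivision artefacts.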
Unfortunately, all existing contour tree algorithms assume that the data is in the form of a simplicial mesh: the simplices prevent ambiguities of the interpolating function inside the mesh (Assn. 1.3). The immediate consequence of this disparity is that either the algorithm must be modified so that it works with voxels, or the grid must be converted into a simplicial mesh. I chose to convert the grid to a simplicial mesh, by subdividing each voxel into simplices.

9.1.1 Desiderata for Subdivision

In order for the algorithm to work, our definition of a mesh (Def. 2.1, p. 9) requires that faces be shared between adjacent cells. Thus, where the face of a voxel is subdivided, the subfaces must be the same in both of the voxels that intersect at that face. Note that, in choosing a subdivision, we implicitly choose an interpolation function over the original voxel, composed of the interpolation function for each simplex in the voxel. In choosing a subdivision, the ideal is to approximate the tri-linear interpolation function over the voxels. This gives several criteria for quality, to which we add some criteria related to efficiency of processing. Not surprisingly, it is not possible to satisfy all of the following goals:

i) the interpolation function for a given point should depend solely on the values at the vertices of the voxel containing the point.
ii) the subdivision should be symmetrical: all vertices should be treated equally in a given cell.
iii) the subdivision should not magnify the dataset, i.e. it should not require the addition of data points.
iv) the subdivision should generate as few simplices as possible.
v) if possible, the subdivision should be implicit, for processing efficiency.

Figure 9.1: Minimal subdivision: 5 simplices / voxel. (a) The Subdivision (b) Even Parity (c) Odd Parity

9.1.2 Some Possible Subdivision Schemes

A number of schemes for subdividing a voxel have been used, or could potentially be used.
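One commonly used scheme, the minimal subdivision into five tetrahedra per voxel with parity alternating between adjacent voxels so that subfaces match (cf. Fig. 9.1), can be sketched as follows. The corner encoding and the parity convention here are illustrative assumptions, not the thesis implementation.

```python
def minimal_subdivision(parity):
    """Split the unit voxel into 5 tetrahedra. Corners are bit triples
    (x, y, z); flipping the parity mirrors the diagonal pattern so that
    face-adjacent voxels agree on their shared subfaces."""
    # central tetrahedron on one set of alternating corners; the other
    # four corners each form a tetrahedron with their three neighbours
    even = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
    apexes = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
    if parity:
        flip = lambda c: (1 - c[0], c[1], c[2])  # mirror in x
        even = [flip(c) for c in even]
        apexes = [flip(c) for c in apexes]
    tets = [tuple(even)]  # the central tetrahedron
    for apex in apexes:
        # base = the three central-set corners one edge away from the apex
        base = [c for c in even if sum(a != b for a, b in zip(apex, c)) == 1]
        tets.append(tuple(base) + (apex,))
    return tets
```

Both parities yield five tetrahedra covering all eight corners; the mirrored pattern is what makes shared faces consistent between neighbouring voxels.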
One of the sample datasets, "atom9", is used to demonstrate deviations from the ideal tri-linear interpolation. This dataset was constructed artificially, by placing local "atoms" in R^3. Each "atom" was assumed to be the centre of a Gaussian distribution of electron density: the volume was sampled at regular intervals to produce the data. One area of this dataset proved especially good at revealing artefacts due to the subdivision: a zig-zag of atoms placed on vertices of the sampling grid.

a) Minimal subdivision, using 5 simplices (Fig. 9.1(a)). This fails criterion ii): not all vertices are treated equally. Fig. 9.1(b) and Fig. 9.1(c) show the result of applying this subdivision to a sample dataset. This asymmetry could be mitigated by randomizing the orientation of the subdivision in each voxel, which would violate our condition that subfaces must match between voxels.

Figure 9.2: Axis-aligned subdivision: 6 simplices / voxel. (a) The Subdivision (b) Artefacts

b) Axis-aligned subdivision, with 6 simplices, arranged around a major diagonal of the voxel (Fig. 9.2(a)). Again, this fails condition ii). Fig. 9.2(b) shows the result of applying this subdivision to a sample dataset. As with a), this asymmetry could be reduced by randomized orientations, but the same difficulty would be encountered: it would violate our subface-matching condition.

c) Body-centred subdivision, with 12 simplices in a BCC (body-centred cubic) lattice (Fig. 9.3(a)). This subdivision shares simplices between two adjacent voxels, reducing the average number of simplices per cell to 12. However, it fails criterion i): that interpolation should be restricted to the voxel itself.

d) Face-centred subdivision, with 24 simplices, constructed by subdividing the voxel into face-centred square pyramids (Fig.
9.4(a)).

Figure 9.3: Body-centred subdivision: 24 simplices / voxel. (a) The Subdivision (b) Artefacts

This subdivision fails criterion iii) by requiring an average of 4 interpolated data points per voxel, and criterion iv): it has the largest magnification factor of all the subdivisions considered. However, as seen in Fig. 9.1(b), Fig. 9.2(b), and Fig. 9.4(b), it fulfills criterion ii), which neither a) nor b) does.

I implemented each of these subdivisions for comparison purposes: the choice of which one to use depends on the problem from which the data derives. If the features of the data are large relative to the spacing of the samples, then artefacts such as those shown in Figs. 9.1(b), 9.1(c), and 9.2(b) are much less prominent. Either the minimal subdivision or the axis-aligned subdivision can be used. If, on the other hand, features are closely spaced (as in Fig. 9.1(b) and Fig. 9.1(c)), then the face-centred subdivision (24 simplices / voxel) has fewer unpleasant artefacts than any of the other subdivisions, but requires significantly more memory and processing time to generate isosurfaces. In Sec. 9.7, I compare the minimal subdivision scheme with Marching Cubes (Sec. 3.1).

9.2 Boundary Effects

Both van Kreveld et al. [26] and Tarasov & Vyalyi [23] assume that the contours may extend to the boundaries of the dataset: this complicates their algorithms for processing the contour tree, results in open surfaces, and adds additional splits and joins in the contour tree.

In the early stages of developing the algorithm given above, I was concerned about boundary effects. As a result, I chose to avoid them by embedding the entire dataset in a layer of zeroes (or some other value smaller than all values in the dataset). This not only avoided boundary cases, but also reduced the number of splits and joins, and guaranteed that all surfaces would be closed topologically. The extra layer of cells was rendered as well as the original dataset.
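The zero-embedding amounts to padding the grid with one layer of a value smaller than every sample; a minimal sketch on nested lists:

```python
def embed_in_minimum(grid, pad_value=None):
    """Surround a 3-D grid with one layer of a value below the global
    minimum, so that every isosurface closes topologically."""
    if pad_value is None:
        pad_value = min(v for plane in grid for row in plane for v in row) - 1.0
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[pad_value] * (nz + 2) for _ in range(ny + 2)]
           for _ in range(nx + 2)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                out[i + 1][j + 1][k + 1] = grid[i][j][k]  # copy into the core
    return out
```

This runs in O(n) time, matching the observation below that the padding could equally be added while reading the data.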
However, since the interpolation inside the original dataset is unaffected by the embedding process, the outer layer could have been suppressed for rendering. Instead of automatically adding the zero layer internally, I added the zeroes to the file in which the data was stored; the zeroes could instead be added when reading in the data, in O(n) time. This zero-embedding step turned out to be unnecessary: Algorithm 7.2 operates correctly at the boundary. In an implicitly-represented mesh such as I used, some special handling would still be required, since boundary vertices do not have the same connectivity as interior vertices.

9.3 Symbolic Perturbation of Data

As noted above, I assume that the data values are unique (Assn. 1.5): that no two vertices have the same value associated with them. This assumption is necessary for the guarantee that the critical points occur at the vertices (Sec. 4.1.1). For the sake of simplicity, I chose to perturb the data symbolically, by adding to each value an ε proportional to its location in RAM [11]: when comparing values, ties are broken by comparing memory addresses. This results in a stable sort order, provided we do not move data around in memory. Otherwise, this form of symbolic perturbation may give inconsistent results. Since I assume an array of data, which is initialized at the beginning of the program and never moved, this does not pose a problem.

A minor complication is added in the zero-embedding stage: if the global minimum is adjacent to the zero-embedding layer, but not to the element of that layer which is adjacent to the global minimum in the sorted list, one or more spurious joins will be added in the zero-embedding layer. This was resolved by special-case treatment of the zero-embedding layer, which can be assumed to belong to one component. As noted in Sec. 9.2, the zero-embedding is not required, so this complication can be avoided.
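The symbolic perturbation reduces to lexicographic comparison on (value, address); a sketch that uses the vertex index in place of the memory address:

```python
def perturbed_order(values):
    """Sort vertex indices by (value, index). The index stands in for the
    memory address used in the implementation: it breaks ties consistently,
    so equal values still yield a strict total order."""
    return sorted(range(len(values)), key=lambda i: (values[i], i))


def below(values, i, j):
    """True if vertex i is 'below' vertex j under the perturbed order."""
    return (values[i], i) < (values[j], j)
```

Because the tie-break key never changes, the resulting order is stable across the whole run, which is exactly the property the algorithm needs.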
9.4 Searching for the Interpolating Edge

To generate an actual level set, it is necessary to use the contour tree to find seeds to start the contour-following algorithm. This is discussed in more detail in Ch. 8.

9.5 Local Contours

Since contour trees preserve topological information about the entire dataset, it proved possible to generate contours locally around individual local maxima (see Fig. 9.5(a)): this provides better resolution of individual peaks than a single level set can (Fig. 9.5(b)). Further exploration of this technique is desirable.

Figure 9.5: Comparison of Local and Fixed Contours. (a) Local Contours (b) Fixed Contours

9.6 Memory Requirements

Due to some early design choices, the memory overhead of the current implementation is significant. For each vertex, I store the following information:
1. type: whether the vertex is one of the original data points, an interpolated vertex on a face of a voxel, or an interpolated vertex in the centre of a voxel
2. value: the value at the vertex
3. 3 tree nodes: pointers to the vertex in the contour tree, join tree and split tree; the data types defined for edges and vertices are themselves inefficient
4. ID: an ID number for the node
5. queue flag: a flag marking whether the vertex is on the leaf queue
6. follow flags: flags for which simplices have been visited in the voxel based at this vertex.

In the case of the body-centred and face-centred subdivisions (Sec. 9.1.2), this is increased by the need to store the interpolated vertices, if only for sorting. This was achieved in practice by doubling each dimension of the grid, increasing storage requirements by a factor of 8. Thus, the face-centred subdivision uses approximately 800 bytes per vertex at present on a 32-bit machine. Clearly, this is excessive: I estimate that the amount of memory required is about 10-20 bytes per vertex for the minimal subdivision, and perhaps as much as 100 bytes for the face-centred subdivision.
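For a sense of scale, a packed record along the lines of the 10-20-byte estimate can be sketched with fixed-width fields. The field widths here are illustrative assumptions, not the layout used in the thesis: a 32-bit float for the value, one byte each for the type, queue flag and follow flags, a pad byte, and a 32-bit index (rather than a full pointer) into each of the contour, join and split trees.

```python
import struct

# little-endian, explicitly padded: 4 + 3 + 1 + 12 = 20 bytes per vertex
VERTEX = struct.Struct("<f3Bx3i")


def pack_vertex(value, vtype, queued, follow, ct, jt, st):
    """Pack one hypothetical vertex record into a 20-byte buffer."""
    return VERTEX.pack(value, vtype, queued, follow, ct, jt, st)
```

Even this naive packing is a factor of forty below the 800 bytes per vertex of the current implementation.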
This will be affected, however, by the complexity of the contour tree, which we saw to be O(t) (in the worst case, O(n)).

File: atom9.txt. Grid size: 13x13x13. 1728 voxels; 8640 simplices. Memory required: 219,700 bytes. 9.33 milliseconds to sort data; 12.07 milliseconds to build join and split trees; 0.23 milliseconds to merge join and split trees. 19 nodes in join tree; 4 nodes in split tree; 21 nodes in contour tree.

Figure 9.6: Timing Results for the Atom9 dataset (Triangles Generated vs. Time; Marching Cubes vs. Contour Tree)

File: caffeine2.txt. Grid size: 19x19x11. 3240 voxels; 16200 simplices. Memory required: 397,100 bytes. 18.24 milliseconds to sort data; 24.42 milliseconds to build join and split trees; 0.54 milliseconds to merge join and split trees. 37 nodes in join tree; 4 nodes in split tree; 39 nodes in contour tree.

Figure 9.7: Timing Results for the Caffeine2 dataset (Triangles Generated vs. Time; Marching Cubes vs. Contour Tree)

File: 29g.txt. Grid size: 53x53x53. 140,608 voxels; 703,040 simplices. Memory required: 14,887,700 bytes. 9.33 milliseconds to sort data; 12.07 milliseconds to build join and split trees; 0.23 milliseconds to merge join and split trees. 19 nodes in join tree; 4 nodes in split tree; 21 nodes in contour tree.

Figure 9.8: Timing Results for the "29g" dataset (Triangles Generated vs. Time; Marching Cubes vs. Contour Tree)

9.7 Timing

In Figs. 9.6, 9.7, and 9.8, I give some sample times for the implementation. The timing was run on a 300 MHz Power Macintosh for two reasons. First, the current operating system (MacOS 8.1) is non-pre-emptive, so it can be assumed that no other processes interrupted the test runs: thus, timing should be quite accurate (I rarely observed a deviation of more than about 5%).
Secondly, a convenient function exists to give the current clock value in microseconds since start-up, providing fine granularity of measurement.

Since the interface for the implementation used OpenGL, I conducted timing runs by measuring how long it took to construct a display list to render the isosurface, with normals. This decoupled actual rendering from the cost of moving through the mesh to locate and generate triangles. A simple version of the marching cubes algorithm was implemented for comparison. This consisted of nested loops to march through all the cells in the mesh, generating triangles in any that intersected the isosurface. In contrast, the contour-following algorithm takes two passes through the data: the first pass generates the triangles, marking the cells that have been visited, while the second pass unmarks the cells.

Due to the memory issues referred to above (Sec. 9.6), timing was performed using the minimal subdivision. All available optimizations were enabled, including inlining. However, the contour-following routine can be assumed not to be optimized, as it is implemented recursively: thus, the times shown include overhead for the function calls. (See Def. 2.10, p. 13 and Def. 2.7, p. 12 for definitions of the parameters.) Each display list was created 100 times, then immediately destroyed: the average of the results is shown; deviation from the average was small. Also shown are the triangle counts for both algorithms, the number of cells visited by the contour-following algorithm, and the maximum depth of recursive call. Isosurfaces were generated at 5% intervals, based on the maximum value in the dataset: thus, 19 results are shown for each dataset.

Note that Marching Cubes tends to be faster than Contour Following for large isosurfaces, but slower for small ones. Also note that the two algorithms do not produce the same isosurfaces, since the interpolation inside each voxel differs.
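The averaging scheme can be sketched in a few lines, using a generic high-resolution timer in place of the MacOS microsecond clock:

```python
import time


def average_build_time_ms(build, runs=100):
    """Repeat a build function `runs` times and return the average time in
    milliseconds, mirroring how the display-list timings were averaged."""
    start = time.perf_counter()
    for _ in range(runs):
        build()
    return (time.perf_counter() - start) * 1000.0 / runs
```

Averaging over many repetitions smooths out clock granularity and transient effects, which is why the reported deviations were small.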
In particular, Marching Cubes generates fewer triangles in most cases. This is not necessarily an advantage, since contour-following can potentially generate long triangle strips (see Sec. 3.1.3, p. 16 and Sec. 3.5.2, p. 22), reducing the cost of sending the triangles to hardware. Although triangle strips were not implemented, the depth of the recursive call tended to be approximately 2/3 of the total number of cells visited (see Figs. 9.6, 9.7, and 9.8). This indicates that very long triangle strips are possible in practice.

Finally, a word about the datasets: since the implementation was on a small scale, and real crystallographic datasets are generally much larger, a simple simulation was used: a helper program created datasets from a list of atomic nuclei and their positions. The caffeine dataset (Fig. 9.7) was created by taking a standard crystallographic PDB datafile, and can be seen in Fig. 9.5. The 9-atom dataset (Fig. 9.6) was created by placing 9 nuclei of different radii in entirely arbitrary positions. It was particularly useful for showing artefacts in the data, and can be seen in Fig. 9.4(b). The last dataset (Fig. 9.8), the largest, consisted of three isolated nuclei. Thus, the isosurfaces are smaller than would normally be the case: this dataset was included to give some idea of the potential speed advantage over the marching cubes algorithm for datasets where N ≫ k. Although these results are far from rigorous, it is clear that the contour-following algorithm is worth implementing for large datasets.

Chapter 10

Summary

3-D datasets arise naturally in various application fields. Some fields, such as X-ray crystallography, require interactive visualization of large datasets. One technique for visualizing such datasets is the use of contour trees to construct isosurfaces. This thesis describes a practical and efficient algorithm for constructing contour trees for 3-D datasets.
Details of an implementation were sketched, and the technique was compared with other techniques. In particular, if we review the properties listed in Sec. 1.3, p. 6, we see that contour tree techniques satisfy all of the properties desired for dealing with X-ray crystallographic data (see Sec. 9.5, p. 98 for details on the last item).

Chapter 11

Extensions

A number of extensions are possible in practice. These include: dimensions other than 3-D, application to irregular meshes, flexible contours, the use of transparent shells for visualization, and scaling issues.

11.1 Higher and Lower Dimensions

Although the discussion throughout has focussed on 3-D datasets such as those acquired in X-ray crystallography, the algorithm depends in no way on this assumption. Implementation details will vary slightly in higher or lower dimensions, but the construction outlined above remains valid: the algorithm is equally useful in 2, 3, 4, or more dimensions, although due to the size of datasets, 4-D may be the practical limit. Most of the other algorithms described in Ch. 3 can be modified to work in higher dimensions. Marching Cubes (Sec. 3.1, p. 14) would become Marching Hypercubes, and would have 65536 cases before symmetry. Most of the other algorithms would scale rather more easily.

11.2 Irregular Meshes

This thesis assumed that the data was presented on a regular mesh (Assn. 1.2). As with dimensionality (Sec. 11.1), this assumption is unnecessary: the algorithm as outlined works equally well for irregular meshes. I expect to implement a version for irregular meshes at some future date. Again, most of the algorithms described in Ch. 3 can be modified to work on irregular meshes, with more or less success.

11.3 Flexible Contours

Local contours (Sec. 9.5) have been implemented to show the immediate neighbourhood of local maxima. So far, this has only been implemented for the superarc incident to each local maximum.
To do this, an isovalue is interpolated along the superarc incident to the local maximum, and the corresponding contour generated: this fulfills the goal expressed in Sec. 1.3, p. 6 for crystallography. This needs further development, particularly with respect to user-interface issues, such as when and how to merge contours as the isovalue passes a supernode. Note that Marching Cubes, and all of the other algorithms in Ch. 3 except for the contour-following algorithms (Sec. 3.5, p. 20), are unable to generate flexible or local contours without significant additional processing time.

11.4 Transparent Shells

A simple extension to the use of isosurfaces is the use of multiple transparent isosurfaces, or shells. As with contour lines on a map, this permits the entire dataset to be viewed simultaneously, rather than a single slice. This idea has surfaced in the literature (e.g. in Guo [13]), but is not well-developed.

11.5 Scaling and Parallelism

Datasets in 3-D can contain as many as 1000^3 data points, and may not fit into available memory. In the future, I will be researching ways to scale the algorithm to perform efficiently on such large datasets. One approach that seems promising is to use parallel processors to compute partial join trees or contour trees, then merge the results.

Bibliography

[1] Chandrajit L. Bajaj and Valerio Pascucci. Progressive IsoContouring. Technical Report 99-36, Texas Institute of Computational and Applied Mathematics, Austin, Texas, 1999.

[2] Chandrajit L. Bajaj, Valerio Pascucci, and Daniel R. Schikore. Seed Sets and Search Structures for Optimal Isocontour Extraction. Technical Report 99-35, Texas Institute of Computational and Applied Mathematics, Austin, Texas, 1999.

[3] Thomas F. Banchoff. Critical Points and Curvature for Embedded Polyhedra. Journal of Differential Geometry, 1:245-256, 1967.

[4] Jon Louis Bentley. Multidimensional Binary Search Trees Used for Associative Searching.
Communications of the ACM, 18(9):509-517, 1975.

[5] Paolo Cignoni, Paola Marino, Claudio Montani, Enrico Puppo, and Roberto Scopigno. Speeding Up Isosurface Extraction Using Interval Trees. IEEE Transactions on Visualization and Computer Graphics, 3(2):158-169, 1997.

[6] Paolo Cignoni, Claudio Montani, Enrico Puppo, and Roberto Scopigno. Multiresolution Representation and Visualization of Volume Data. IEEE Transactions on Visualization and Computer Graphics, 3(4):352-369, 1997.

[7] Kevin Cowtan. Book of Fourier. http://www.yorvic.york.ac.uk/~cowtan/fourier/fourier.html, 1994.

[8] Mark de Berg and Marc van Kreveld. Trekking in the Alps Without Freezing or Getting Tired. In First Annual European Symposium on Algorithms (ESA '93), pages 121-132, 1993.

[9] Jan Drenth. Principles of Protein X-ray Crystallography. Springer-Verlag, Berlin; Heidelberg; New York, 1994.

[10] Herbert Edelsbrunner. Dynamic Data Structures for Orthogonal Intersection Queries. Technical report, Inst. Informationsverarb., Tech. Univ. Graz, Graz, Austria, 1980.

[11] Herbert Edelsbrunner and E. P. Mücke. Simulation of Simplicity: A technique to cope with degenerate cases in geometric algorithms. ACM Transactions on Graphics, 9(1):66-104, 1990.

[12] H. Freeman and S. P. Morse. On Searching a Contour Map for a Given Terrain Elevation Profile. Journal of the Franklin Institute, 284(1):1-25, 1967.

[13] Baining Guo. Interval Set: A Volume Rendering Technique Generalizing Isosurface Extraction. In IEEE Proceedings on Visualization '95, pages 3-10. IEEE, 1995.

[14] C. T. Howie and Edwin H. Blake. The Mesh Propagation Algorithm for Isosurface Construction. Computer Graphics Forum, 13:65-74, 1994.

[15] Takayuki Itoh and Koji Koyamada. Automatic Isosurface Propagation Using an Extrema Graph and Sorted Boundary Cell Lists. IEEE Transactions on Visualization and Computer Graphics, 1(4):319-327, 1995.

[16] Takayuki Itoh and Koji Koyamada. Isosurface Extraction By Using Extrema Graphs.
IEEE Transactions on Visualization and Computer Graphics, 1:77-83, 1995.

[17] Yarden Livnat, Han-Wei Shen, and Christopher R. Johnson. A Near Optimal Isosurface Extraction Algorithm Using the Span Space. IEEE Transactions on Visualization and Computer Graphics, 2(1):73-84, 1996.

[18] William E. Lorensen and Harvey E. Cline. Marching Cubes: A High Resolution 3D Surface Construction Algorithm. Computer Graphics, 21(4):163-169, 1987.

[19] Gregory M. Nielson and Bernd Hamann. The Asymptotic Decider: Resolving the Ambiguity in Marching Cubes. In IEEE Proceedings on Visualization '91, pages 83-91. IEEE, 1991.

[20] Paul Ning and Jules Bloomenthal. An Evaluation of Implicit Surface Tilers. IEEE Computer Graphics and Applications, 13:33-41, 1993.

[21] Max Perutz. Protein Structure. W. H. Freeman & Company, New York, 1992.

[22] Han-Wei Shen and Christopher R. Johnson. Sweeping Simplices: A fast iso-surface extraction algorithm for unstructured grids. In IEEE Proceedings on Visualization '95, pages 143-150. IEEE, 1995.

[23] Sergey P. Tarasov and Michael N. Vyalyi. Construction of Contour Trees in 3D in O(n log n) steps. In Proceedings of the 14th ACM Symposium on Computational Geometry, pages 68-75. ACM, 1998.

[24] R. E. Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the ACM, 22:215-225, 1975.

[25] Marc van Kreveld. Digital Elevation Models and TIN Algorithms. In Algorithmic Foundations of Geographic Information Systems, pages 37-78. Springer-Verlag, Berlin; Heidelberg; New York, 1997.

[26] Marc van Kreveld, René van Oostrum, Chandrajit L. Bajaj, Valerio Pascucci, and Daniel R. Schikore. Contour Trees and Small Seed Sets for Isosurface Traversal. In Proceedings of the 13th ACM Symposium on Computational Geometry, pages 212-220. ACM, 1997.

[27] Jane Wilhelms and Allen van Gelder. Octrees for Faster Isosurface Generation. ACM Transactions on Graphics, 11(3):201-227, 1992.

[28] Geoff Wyvill, Craig McPheeters, and Brian Wyvill.
Data Structure for Soft Objects. Visual Computer, 2:227-234, 1986. 113
Item Metadata

Title: Efficient Generation of Contour Trees in Three Dimensions
Creator: Carr, Hamish
Date Issued: 2000
Genre: Thesis/Dissertation
Language: English
Provider: Vancouver : University of British Columbia Library
DOI: 10.14288/1.0051283
URI: http://hdl.handle.net/2429/10372
Degree: Master of Science - MSc
Program: Computer Science
Degree Grantor: University of British Columbia
Graduation Date: 2000-05