Low-Stretch Trees for Network VisualizationbyRebecca Linda McKnightB.Sc., The University of Victoria, 2012A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFMaster of ScienceinTHE FACULTY OF GRADUATE AND POSTDOCTORALSTUDIES(Computer Science)The University of British Columbia(Vancouver)August 2015c© Rebecca Linda McKnight, 2015AbstractLow-stretch trees are spanning trees which provide approximate distance preser-vation for edges in the original graph by minimizing stretch. We explore the ap-plication of these trees to network visualization. In particular, we present a noveledge bundling technique, LSTB, that computes edge bundles explicitly and effi-ciently and does not rely on fixed vertex positions. This approach is in contrast toprevious methods, which require the user to provide a layout of the input graph.We introduce an abstract framework for edge bundling methods, which provides aunifying formalization of bundling terminology and techniques, as well as a clas-sification of such methods. Based on this framework, LSTB provides algorithmicsupport for sophisticated visual encodings, including dynamic layout adjustmentand interactive bundle querying.In addition, we explore the use of the multiplicative weights update methodto compute a distribution over low-stretch trees in order to achieve low stretch forall edges in expectation, rather than on average. We present the results of usingthis distribution in place of a single low-stretch tree as a routing graph for LSTB.While the distribution provides better stretch guarantees, we find that from a visualperspective a single low-stretch tree provides a better routing graph for the LSTBedge bundling application.iiPrefaceThis thesis is the expanded version of currently unpublished work done with NicholasJ. A. Harvey and Tamara Munzner. The ideas and algorithms presented here weredeveloped jointly by us. I performed all implementation and results gathering pre-sented in Chapter 6. Chapter 5 and Chapter 7 were written solely by me.Figures 1.2, 6.1a, 6.1c, 6.1e, and 6.1g used with permission.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72.1 Low-Stretch Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 72.2 Edge Bundling . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.3 Network Visualization . . . . . . . . . . . . . . . . . . . . . . . 93 Abstract Framework for Edge Bundling . . . . . . . . . . . . . . . . 113.1 Framework Details . . . . . . . . . . . . . . . . . . . . . . . . . 113.2 Application to Existing Methods . . . . . . . . . . . . . . . . . . 133.2.1 Force-Directed Edge Bundling . . . . . . . . . . . . . . . 133.2.2 Multilevel Agglomerative Edge Bundling . . . . . . . . . 133.2.3 Clustered Edge Routing . . . . . . . . . . . . . . . . . . 14iv4 LSTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154.1 Bundling Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 154.1.1 Routing Graph Generation . . . . . . . . . . . . . . . . . 154.1.2 Edge Routing and Bundling . . . . . . . . . . . . . . . . 164.2 Visual Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . 184.2.1 Vertex Layout . . . . . . . . . . . . . . . . . . . . . . . . 194.2.2 Edge Drawing . . . . . . . . . . . . . . . . . . . . . . . 194.2.3 Interactivity . . . . . . . . . . . . . . . . . . . . . . . . . 215 Low-Stretch Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225.1 Low-Stretch Tree Construction . . . . . . . . . . . . . . . . . . . 225.1.1 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 235.2 Distribution over Low-Stretch Trees . . . . . . . . . . . . . . . . 255.2.1 Multiplicative Weights Update Method . . . . . . . . . . 265.2.2 Tree Oracle . . . . . . . . . . . . . . . . . . . . . . . . . 275.2.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 295.2.4 Application to LSTB . . . . . . . . . . . . . . . . . . . . 316 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326.1 Data and Implementation . . . . . . . . . . . . . . . . . . . . . . 326.2 LSTB Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 336.3 Distribution over Low-Stretch Trees . . . . . . . . . . . . . . . . 357 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387.1 LSTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387.2 Stretch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397.3 Distribution over Low-Stretch Trees . . . . . . . . . . . . . . . . 407.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43vA Supporting Materials . . . . . . . . . . . . . . . . . . . . . . . . . . 46A.1 Zero-Sum Games and the Minimax Theorem . . . . . . . . . . . 46A.2 Proof of Lemma 5.2.3.2 . . . . . . . . . . . . . . . . . . . . . . . 47viList of TablesTable 1.1 A comparison of edge bundling method usage, broken downbased on whether the graph has a layout or hierarchy. Ourmethod, LSTB, is applicable in all four cases. . . . . . . . . . 6Table 6.1 Graph statistics for data sets used in this paper. . . . . . . . . . 32Table 6.2 Performance statistics of LSTB. Running time is in seconds,and is averaged over 100 runs. . . . . . . . . . . . . . . . . . . 35Table 6.3 Maximum edge stretch comparison between a single low-stretchtree T and a distribution D over low-stretch trees, computed asin Section 5.2 with varying sizes |D | ∈ {2,5,10}. . . . . . . . . 35viiList of FiguresFigure 1.1 Comparison between the input graph and the output of LSTBon the Poker data set. . . . . . . . . . . . . . . . . . . . . . . 2Figure 1.2 c© 2006 IEEE Comparison between two bundled layouts ofsoftware package interaction data from Hierarchical Edge Bun-dles [17]. In each case, although the vertex positions differ, thebundles are the same. . . . . . . . . . . . . . . . . . . . . . . 3Figure 1.3 LSTB system workflow. . . . . . . . . . . . . . . . . . . . . 3Figure 1.4 Comparison between trees with poor (high) and good (low)stretch, respectively. . . . . . . . . . . . . . . . . . . . . . . 4Figure 1.5 Comparison between a minimum spanning tree and a low-stretch spanning tree for an 8-by-8 grid graph. . . . . . . . . . 5Figure 3.1 Illustration of the input and output of edge segmentation. . . . 12Figure 4.1 Illustration of the bundling algorithm of LSTB. A low-stretchtree T = (V,ET ) is computed for use as a routing graph. Seg-mentation is then performed by routing edges through this tree,using the routing algorithm illustrated in Figure 4.2. Bundlesare then formed from groups of segments sharing the sameendpoints as edges in ET . . . . . . . . . . . . . . . . . . . . . 17Figure 4.2 Illustration of the routing algorithm of LSTB from Section4.1.2. The algorithm roots tree T = (V,ET ) at an arbitraryvertex in order to speed up queries, which then only need tofind the lowest common ancestor between two vertices. . . . . 18viiiFigure 4.3 Comparison between vertex layout methods for the routinggraph T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19Figure 4.4 Examples of visual encoding options for LSTB. . . . . . . . . 20Figure 4.5 Comparison between bundle querying and edge querying onhover. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21Figure 6.1 Comparison between previous methods and LSTB. . . . . . . 34Figure 6.2 Comparison between visual results obtained on the Flare dataset using a distribution over two low-stretch trees (b) and asingle low-stretch tree (c) for routing in LSTB. . . . . . . . . 37Figure 7.1 Comparison between using an arbitrary spanning tree and alow-stretch tree as the routing graph in LSTB on the Emaildata set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39ixAcknowledgmentsI would like to thank my research supervisor, Nick Harvey, for his help over thepast two years. His teaching, both in and out of the classroom, has been invaluableto me. I am so grateful for the time and effort he put into guiding me through thisprocess.I would also like to thank Tamara Munzner for her input and insight into thevisual aspects of this work. Her keen eye for beauty and design was an invaluableresource and essential to the success of this thesis.I thank Alex Telea for providing data for comparison and Quirijn Bouts forproviding Figure 6.1e.I am also deeply grateful to my husband, Christopher, and to my family fortheir love and support throughout my degree. Their faith and encouragement giveme motivation to succeed.xChapter 1IntroductionGraphs are common in many applications, such as: computational biology, com-puter networking, and natural language processing. Two key areas that studygraphs and their representations are graph theory and information visualization.Graph theory studies properties and applications of graphs. Common problemsin graph theory include constructing graphs with particular properties, computingsubgraphs with a particular structure, and finding routes through a graph. Infor-mation visualization (“vis”) creates “visual representations of datasets designedto help people carry out tasks more effectively” [23]. In particular, network visu-alization focuses on visualizing graph data.Finding a visually effective embedding of a given graph is often difficult, how-ever, unless the data set is very small or the graph admits a nice property, suchas planarity. In order to improve the visual representation of a graph by reducingvisual clutter, two approaches might be considered: adjusting vertex positions andadjusting edge positions. The latter is known as edge bundling, and has been stud-ied extensively in the information visualization literature. Edge bundling methodsvisually group edges into bundles in order to minimize edge clutter. We intro-duce a new classification of such methods as either layout-based or layout-free.Layout-based methods require the input graph to have a pre-computed layout: thatis, geometric positions for all vertices. These methods use information from thelayout, such as edge proximity, to perform bundling. For example, geo-spatial dataincludes a specific embedding which can be used to inform bundling. However,1a particular layout may not match a user’s intent or needs, or they may not knowhow to choose an appropriate layout. In addition, pre-computing a layout may beexpensive. In such situations, layout-based edge bundling will likely not be useful.In contrast, layout-free edge bundling methods do not use any information re-garding vertex positions in order to perform bundling. That is, the bundling com-puted by such methods is independent of layout. Therefore, any graph layout canbe used. An example of this is shown in Figure 1.2, where different tree layouts areused to draw the same software package interaction data. However, the bundlingin both cases is the same: the same edges are bundled together regardless of ad-justed vertex positions. This example is taken from Hierarchical Edge Bundles(HEB) [17], which is the only pre-existing layout-free edge bundling method weare aware of. HEB takes a compound graph as input: that is, a graph together with aspecific hierarchical relationship defined by a rooted spanning tree on the vertices.The hierarchy is used as a routing graph, where graph edges are routed through thehierarchy in order to form bundles.This method of using a routing graph to perform bundling is also commonamong layout-based methods [10, 11, 21, 26]. However, some of the routing graphsused are dense, which leads to inefficiencies when finding routes. From this per-(a) Input graph. (b) LSTB bundled graph.Figure 1.1: Comparison between the input graph and the output of LSTB on thePoker data set.2IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 12, NO. 5, SEPTEMBER/OCTOBER 2006Fig. 13. A software system and its associated call graph (caller = green, callee = red). (a) and (b) show the system with bundling strength β = 0.85using a balloon layout (node labels disabled) and a radial layout, respectively. Bundling reduces visual clutter, making it easier to perceive theactual connections than when compared to the non-bundled versions (figures 2a and 11a). Bundled visualizations also show relations betweensparsely connected systems more clearly (encircled regions); these are almost completely obscured in the non-bundled versions. The encircledregions highlight identical parts of the system for (a), (b), and figure 15.Fig. 14. Using the bundling strength β to provide a trade-off between low-level and high-level views of the adjacency relations. The value of βincreases from left-to-right; low values mainly provide low-level, node-to-node connectivity information, whereas high values provide high-levelinformation as well by implicit visualization of adjacency edges between parent nodes that are the result of explicit adjacency edges between theirrespective child nodes.regarded as being aesthetically pleasing. SIG and FEI Company Eind-hoven are currently supporting further development by providing uswith additional data sets and feedback regarding the resulting visual-izations.More specifically, most of the participants particularly valued thefact that relations between items at low levels of the hierarchy wereautomatically lifted to implicit relations between items at higher lev-els by means of bundles. This quickly gave them an impression of thehigh-level connectivity information while still being able to inspectthe low-level relations that were responsible for the bundles by inter-actively manipulating the bundling strength.This is illustrated in figure 14, which shows visualizations usingdifferent values for the bundling strength β . Low values result in vi-sualizations that mainly provide low-level, node-to-node connectivityinformation. High values result in visualizations that provide high-level information as well by implicit visualization of adjacency edgesbetween parent nodes that are the result of explicit adjacency edgesbetween their respective child nodes.Another aspect that was commented on was how the bundles gavean impression of the hierarchical organization of the data as well,thereby strengthening the visualization of the hierarchy. More specif-ically, a thick bundle shows the presence of two elements at a fairlyhigh level of the hierarchy, whereas the fanning out of a bundle showsthe subdivision of an element into subelements.Most participants preferred the radial layout over the balloon layoutand the squarified treemap layout. Another finding was the fact that therooted layout and the slice-and-dice treemap layout were consideredless pleasing according to several participants. This is probably due tothe large number of collinear nodes within these layouts, which causesbundles to overlap along the collinearity axes. This is illustrated infigure 17.Although our main focus while developing hierarchical edge bun-dles was on the visualization itself, interaction is an important aspectin determining the usability of our technique. Based on our own in-sight and feedback gathered from participants, we contend that bundle-based interaction as described below could provide a convenient wayof interacting with the visualizations.Figure 16 shows how the bundling strength β could be used in con-Figure 1.2: c© 2006 IEEE Comparison between two bundled layouts of softwarepackage interaction data from Hierarchical Edge Bundles [17]. In each case, al-though the vertex positions differ, the bundles are the same.spective, HEB is ideal, sinc it us s a spa ni g tree to route edges. HEB requiresthis hierarchy as input, though, which renders the method unusable in cases whereno hierarchy is kn wn. For example, a protein interaction graph has no inherentlayout and no nderlying tree struc ure; therefo , neith r layout-based methodsnor HEB are sufficient.To address the e problems, we introduce a new edge bundling system: LSTB.Figure 1.1 shows an example of the input and output of our method. Our system iscomprised of two steps: the edge bundling algorithm and the visual encoding step.Figure 1.3 shows the system workflow.Bundling AlgorithmVisual EncodingInput graphG=(V,E)Low-stretch tree T,bundles ℬ,edge map εFigure 1.3: LSTB system workflow.In the first step, our layout-free bundling algorithm computes a particular typeof spanning tree, a low-stretch tree [3], to use for edge routing. Given a graphG = (V,E) and a spanning tree T = (V,ET ), the stretch [3] is the ratio betweenpath length in the tree and path length in the original graph. A low-stretch tree is aspanning tree that approximately minimizes the stretch of edges on average. Such3a tree provides the distance preservation that we desire in a good routing graph.For graph G, the stretch of an edge e = (u,v) ∈ E in tree T is defined as:sT (e) = dT (u,v)where dT (u,v) is the path length from u to v in T . The overall stretch of G isdefined as:sT (G) =1|E |∑e∈EsT (e)Figure 1.4 gives an example of two spanning trees: one with poor (large)stretch, and one with good (small) stretch.One of our main contributions is to introduce the use of low-stretch trees tothe problem of edge bundling. Low-stretch trees became objects of interest in thegraph embeddings literature several decades ago, but they are also the key objectsunderlying several recent breakthroughs in spectral graph algorithms, such as max-imum flow in near-linear time [28].The power of low-stretch trees can be seen through a simple example. Consideran n-by-n grid graph, or mesh. Arbitrary spanning trees generally do a poor job ofencapsulating the structure of this graph, as shown in Figure 1.5a. However, Alonet al. [3] show that low-stretch trees succeed at capturing this structure. This isillustrated in Figure 1.5b.Most low-stretch tree construction algorithms, however, only guarantee thepreservation of distances on average. In contrast, we explore the use of a distri-bution over low-stretch trees to achieve low stretch for edges in expectation. Suchuv(a) Original graph G, withcomparison edgee = (u,v) highlighteduv(b) Tree T with sT (e) = 6,sT (G) = 28/11uv(c) Tree T with sT (e) = 2,sT (G) = 17/11Figure 1.4: Comparison between trees with poor (high) and good (low) stretch,respectively.4(a) Minimum spanning tree. (b) Low-stretch spanning tree, as shownin Alon et al. [3].Figure 1.5: Comparison between a minimum spanning tree and a low-stretch span-ning tree for an 8-by-8 grid graph.a distribution can be obtained via a reduction to a zero-sum game, where the payoffvalues are stretches of edges from the input graph in spanning trees. One playerwishes to find an edge with maximum stretch, whereas the other wishes to find atree which minimizes the stretch of that edge. By Von Neumann’s Minimax The-orem, there exists a distribution which gives each player the best possible payoffgiven their opponent’s strategy. We use the multiplicative weights update method[6] to approximate such a distribution. In contrast to using a single tree, we alsoexplore using this distribution over low-stretch trees for our edge bundling algo-rithm.Many existing edge bundling methods perform bundling visually without realconsideration of the bundles themselves: the bundles are either not explicitly com-puted, or not explicitly shown to the user. In contrast, our bundling algorithmoutputs the bundles directly. In order to explain this concept clearly, we introducean abstract framework for edge bundling which includes a formal definition of abundle. This definition provides the data abstraction for our visualization design,according to the nested model [23].The visual encoding step of our system is responsible for drawing the bundledgraph. This step corresponds to the visual encoding and algorithm levels of the5nested model for visualization design [23]. Since our bundling algorithm is layout-free, we can use any existing vertex layout method. We also explore several waysof visually encoding bundles; the techniques we use are pre-existing, but their ap-plication to edge bundles is new. In addition, we incorporate interactive featuressuch as dynamic layout adjustment and bundle query on hover.Depending on the task, one bundling algorithm may be more suitable thananother. However, the data in question poses restrictions on which methods areeven feasible. Table 1.1 explains these restrictions in conjunction with our layout-based and layout-free classifications by comparing method usage depending onwhether the user has a hierarchy or layout for the input graph. As seen in thistable, LSTB can be used in each scenario, and clearly fills a gap in the current edgebundling literature.In summary, we report on the following contributions: LSTB, a new layout-freeedge bundling technique that computes bundles explicitly; an abstract frameworkfor edge bundling that clearly defines terminology and techniques used in suchmethods; the utilization of the multiplicative weights update method to compute adistribution over low-stretch trees; and the introduction of low-stretch trees to theedge bundling literature.Has Hierarchy No HierarchyHas Layout H3L3: Any methodH7L3: Any layout-basedmethod, or LSTBNo Layout H3L7: HEB [17], or LSTB H7L7: LSTBTable 1.1: A comparison of edge bundling method usage, broken down based onwhether the graph has a layout or hierarchy. Our method, LSTB, is applicable inall four cases.6Chapter 2Literature ReviewWe discuss prior work related to low-stretch trees, edge bundling, and other visual-ization techniques. Low-stretch tree computation methods will be reviewed as wellas motivations and popular applications of these objects. We also give an overviewof important work on edge bundling, particularly the methods that inspired LSTB.In addition, we discuss previous work on several visualization techniques that weuse to display the results of our bundling algorithm.2.1 Low-Stretch TreesLow-stretch trees were first introduced by Alon, Karp, Peleg and West in 1995 [3]for use in the k-server problem, a key problem in online algorithms. They formu-late a zero-sum game for the problem, where the tree player picks a spanning treeof the graph and the edge player picks an edge in the graph; the payoff is then thestretch of the chosen edge in the chosen tree. This framework is described in detailin Section A.1. By von Neumann’s Minimax Theorem [29], for any fixed strategyx by the edge player there exists a strategy y for the tree player consisting of asingle tree which achieves miny xᵀMy, where M is the payoff matrix of the game.Alon et al. assume a uniform strategy for the edge player, and provide an algorithmthat constructs a single tree as a fixed strategy for the tree player which providesstretch exp(O(√log n log log n)) = O(n0.01) for edges on average.In 2001, Boman and Hendrickson [8] discovered low-stretch trees could be7used as preconditioners for solving linear systems where the matrix is the Lapla-cian of a graph. The work of Spielman and Teng, and others [28] dramaticallyextended the work of Boman and Hendrickson by giving provably near-linear timealgorithms for solving Laplacian linear systems. These fast Laplacian solvers areused in problems such as max flow and bipartite matching [28]. This breakthroughprompted a renewed interest in developing algorithms for computing low-stretchtrees.Elkin, Emek, Spielman and Teng [12] introduced a star-decomposition tech-nique which their algorithm uses to compute a spanning tree with average stretchO(log2 n log log n). Abraham, Bartal and Neiman [2] extend this technique to com-pute a distribution T over spanning trees in order to obtain good expected stretch:ET ∈T [sT (e)] = O˜(log n). This distribution consists of all spanning trees inducedby their hierarchical star partition algorithm with non-zero probability. Based onthis result, they obtain an average stretch guarantee of O(log n log log n(log log log n)3).A more recent algorithm for computing low-stretch trees from Abraham and Neimanuses similar ideas to develop a petal-decomposition technique which improves av-erage stretch to O(log n log log n) [1]. The goals of these algorithms, however,are to improve stretch guarantees. Therefore, most of these algorithms are verytheoretical and are not practical to implement.2.2 Edge BundlingOne of the first papers on edge bundling was Hierarchical Edge Bundling (HEB) [17].As mentioned previously, this algorithm takes as input a compound graph and usesthe hierarchy to route the additional edges. Existing tree drawing methods are usedto lay out the hierarchy, and splines are used to draw the routed edges through thetree. While this method does not explicitly compute bundles before laying out thetree, the bundles are layout-independent and therefore this method is layout-free.Interestingly, all subsequent work has been exclusively layout-based, to the best ofour knowledge.Some early work on edge bundling focused on routing-based techniques, wherebundles are formed by edges being routed along a mesh or grid. Geometry-BasedEdge Clustering [11] generates a control mesh and computes control points on the8mesh which edges are routed through. Winding Roads [21] computes a hybridquad-tree/Voronoi diagram according to vertex positions of the input graph, andthen routes the edges along this grid. Force-Directed Edge Bundling (FDEB) [18],another early method, takes a different approach. This algorithm subdivides edgesand exerts spring forces on the subdivision points in order to attract edges towardsone another. Unfortunately, this iterative procedure tends to be quite slow, andinvolves many parameters.More recent methods employ unique and varied techniques. Skeleton-BasedEdge Bundling (SBEB) [13] clusters similar edges together and generates a skele-ton from their medial axes, then attracts edges to the skeleton to produce bundles.Kernel Density Estimation Edge Bundling (KDEEB) [19] computes a density mapof the input graph’s layout using kernel density estimation, and then applies im-age sharpening to merge local high points in order to form bundles. MultilevelAgglomerative Edge Bundling (MINGLE) [14] constructs a proximity graph foredges, then bundles edges according to this graph with the objective of minimizingink used. This method does identify bundles explicitly, but does not exploit themwhen drawing the resulting bundled graph. However, this method is the fastestknown at this time.Several recent methods have employed spanners as routing graphs [10, 25, 26].A spanner is a subgraph that approximately preserves edge distances. These meth-ods compute a spanner of the visibility graph and use it to route edges. Unfortu-nately, spanners have known poor guarantees in terms of sparsity: to preserve dis-tance with factor O(t), Ω(n1+1/t ) edges are needed [4]. Furthermore, these meth-ods are computationally expensive: the most recent method requires O(n2 log2 n)time to compute a spanner [10]. We will use a routing graph which can be quicklycomputed, and provides both sparsity and approximate distance preservation onaverage.2.3 Network VisualizationThe computation of the routing graph in our bundling algorithm relies on an itera-tive coarsening process that is similar in spirit to the coarsening steps used in manymulti-level graph layout systems such as FM3 [15]. However, low-stretch trees9come with provable bounds on quality, which make them ideal routing graphs (seeSection 4.1.1 for more details). In our visual encoding step, we employ existinggraph visualization techniques. We use a tree as the backbone for graph layout; thisapproach is used by many previous systems, including SPF [5]. We create percep-tual layers using colour and opacity both statically and dynamically, with supportfor interactive highlighting of bundles or edges on hover, as in previous systemssuch as Constellation [24].10Chapter 3Abstract Framework for EdgeBundlingOne contribution of this paper is to formalize an abstract framework for edgebundling that answers the question: what is a bundle? We wish to solidify theintuitive notion of a grouping of edges in order to explicitly compute and visu-ally represent edge bundles. Our framework will use this definition to describe therequisites of a bundling algorithm.3.1 Framework DetailsWe define the process of edge segmentation as a key component of all bundlingmethods. Let G = (V,E) be a graph with n = |V | vertices and m = |E | edges.An example graph is shown in Figure 3.1a. We wish to subdivide each edge e =(u,v) ∈ E into segments, such that each edge can be represented by an orderedlist of segments. A segment s = (x, y) is an edge which cannot be subdividedfurther. A segment’s endpoints may be: an endpoint of the original edge e, anothervertex from V , or a dummy vertex. Dummy vertices may be introduced duringsegmentation to subdivide an edge. This case is shown for edge (c,d) in Figure 3.1,where x is a dummy vertex. Alternatively, edges may be routed through existingvertices in the graph. This case is shown for edge (a,b) in Figure 3.1, where thesegmentation routes through c. Also, an edge may consist only of a single segment.11These cases are shown for edges (a,c) and (b,c) in Figure 3.1. The decision ofhow to segment edges depends on the bundling algorithm. Once all edges havebeen segmented, we have a mapping E from each edge e ∈ E to its correspondingordered list of segments.dcab(a) Input: original graph G = (V,E),where edges are differentiated by colourand dash style.cdxab(b) Output: segmented graph M = (V ∪{x},S), with corresponding colour anddash styles.Figure 3.1: Illustration of the input and output of edge segmentation.Let S be the multiset of all segments, and U be the set of all dummy vertices.The result of the segmentation process is a multigraph M = (V ∪U,S), as shownin Figure 3.1b. A bundle is then a set of two or more segments, B ⊂ S, |B | ≥ 2.Let B denote the set of all bundles. Segments in the same bundle are referred to asbundle neighbours, although they need not share any endpoints from V in orderto be bundled together. An edge e ∈ E belongs to any bundle that contains oneof its segments. A segment may be contained in at most one bundle. A particularbundling algorithm must decide how to assign segments to bundles.We define an edge bundling algorithm as an algorithm which takes as input agraph G = (V,E) and outputs a set of bundles B, as well as a mapping E betweenedges e ∈ E and ordered lists of segments. Based on this definition, an algorithmmust determine how to perform edge segmentation, as well as how to assign seg-ments to bundles. While most existing edge bundling methods implement a systemwhere the output is a graph drawing, we believe the concepts from this frameworkapply nonetheless. This framework provides a context from which all bundlingmethods can be discussed.123.2 Application to Existing MethodsOne of our goals for this framework is to provide a formalization of bundling tech-niques such that these methods can be compared in a uniform way. We will exem-plify this by applying the framework to several of the existing methods introducedin Chapter 2. For each method, we will discuss the edge segmentation process andhow bundles are formed.3.2.1 Force-Directed Edge BundlingFDEB [18] models edges as springs which attract one another in order to formbundles. The algorithm proceeds in cycles, where each cycle consists of a numberof iterations. Edge segmentation is performed using dummy vertices. Edges aresubdivided uniformly by dummy vertices, where the number of such vertices in-troduced starts at one and doubles each cycle. Consecutive vertices along the sameedge are attracted to one another via a spring force. Edge compatibility measuresare used to determine whether two edges should interact, where pairs of edges withcompatibility above a certain threshold are considered to be interacting edges. Forsuch pairs of edges, an electrostatic force is used to attract corresponding pairs ofdummy vertices, which is how edges are bundled together.This method implements a system which outputs the graph drawing directly.Although edges are drawn together in bundles, the bundles are not explicitly com-puted, so it is not known whether two edges are bundled together or not. While theedge compatibility measure indicates whether two edges exert force on each other,they may not appear visually in the same bundle.3.2.2 Multilevel Agglomerative Edge BundlingMINGLE [14] takes as input a set of edges with fixed endpoint positions. Eachedge is considered to be a four-dimensional vector based on the location of its end-points. An edge proximity graph is constructed by finding the k closest neighboursof an edge according to Euclidean distance in the 4D space; the vertices of thisgraph are edges of the original graph. An iterative coarsening process is then usedto form bundles, where vertices in the proximity graph are bundled together andcoalesced if their bundling minimizes ink used to draw the edges.13In this case, edges are bundled without segmentation. Bundles are computedexplicitly, since edges are directly assigned to bundles based on their ink-savingcompatibility. However, these bundles are only used to guide edge drawing, ratherthan for additional visual feedback or interactivity. When drawing bundled edges,edges are segmented using two dummy vertices as meeting points. For a set ofbundled edges, the corresponding dummy vertices of each edge will be drawn inthe same location, so the edges share a common segment. The edges then fan outfrom these meeting points to their respective endpoints.3.2.3 Clustered Edge RoutingClustered Edge Routing [10] performs edge bundling using spanner-based routing.First, links are clustered using a well-separated pair decomposition which is con-sistent with the compatibility measures of Holten and van Wijk [18]. This givesan explicit bundling, since each well-separated pair of point sets (A,B) induces acluster containing edges which have one endpoint in A and the other in B. Next, avisibility graph is computed such that vertices are connected by an edge if they arevisible to each other. Dummy vertices are also added to route around any obstacles,which include vertices of the original graph. A spanner of this graph is computedin order to improve sparsity; this technique is discussed further in Section 2.2. Thisspanner is then used to route edges, where edges which are bundle neighbours mustshare a common sub-path in the graph. This bundles the edges visually.As in the previous case, edges are bundled without segmentation and the bun-dles are only used for drawing. When drawing, each edge is segmented by thedummy vertices on its route through the routing graph. Several additional tech-niques are used to reduce visual clutter further, including merging obstacles, edgeordering and crossing minimization.14Chapter 4LSTBOur layout-free bundling method takes as input a connected graph G = (V,E). Thefirst step of the system is the bundling algorithm, which computes the routing treeT = (V,ET ) and the set of bundles B. The visual encoding step then draws thebundled graph, which includes interactive elements for data exploration.4.1 Bundling AlgorithmOur novel edge bundling algorithm computes bundles directly without relying onthe geometry of vertices or edges. This layout-free approach routes edges througha tree to determine both edge segmentation and bundle membership. The algo-rithm consists of two steps: routing graph generation, where the spanning tree iscomputed; and edge routing, where edges are segmented and bundles are formed.4.1.1 Routing Graph GenerationBouts and Speckmann [10] quantify several desirable properties of a routing graph,including two that are layout-independent: sparsity, and that the shortest path be-tween two vertices in the routing graph should not be much longer than their di-rect connection in the graph. Since we are computing a spanning tree, the firstproperty is achieved by definition. Therefore, we focus on the second property,which amounts to a guarantee that the spanning tree preserves edge distances fromthe original graph. This is equivalent to a low-stretch guarantee, as introduced in15Chapter 1.We will use a low-stretch tree as our routing graph for bundling. In order tocompute such a tree, we use the algorithm from Alon et al. [3]. This algorithmperforms an iterative coarsening process in order to compute the low-stretch tree(LST). This process is explained in detail in Section 5.1.This algorithm runs in time O(m log n), and the tree is guaranteed to have lowaverage stretch: exp(O(√log n log log n)) = O(n0.01). As discussed in Section 2.1,other methods with better theoretical guarantees are known [2, 12], but their algo-rithms are not practical to implement. Regardless, any method can be substitutedto compute the LST in this step, provided there are guarantees on the stretch of theresulting tree.4.1.2 Edge Routing and BundlingOur routing graph is a low-stretch tree T = (V,ET ). Let the remainder edges bethose not in the spanning tree: ER = E \ ET . We will segment the remainder edgesby routing them through T . Tree edges ET will be represented by single segments,as (a,c) and (b,c) are in Figure 3.1b.For any two vertices in a connected graph, there is a unique path between themin any spanning tree of the graph. For any edge e = (u,v) ∈ ER , the unique u-vpath through T gives its segmentation. That is, for every edge (x, y) on the u-vpath in T , a new segment will be added and the ordered list of these segmentswill represent e. In terms of our framework, we are segmenting through existingvertices in V , rather than adding dummy vertices. This case is shown for the (a,b)edge in Figure 3.1.As discussed in Chapter 3, the output of this segmentation process is a multi-graph that is supported on the same set of edges as T . We define our bundles as setsof segments that have the same endpoints. Therefore we will have at most n − 1bundles, since |ET | = n − 1, and for any segment (x, y) it must also be true that(x, y) ∈ ET . Figure 4.1 illustrates this process.To perform segmentation and bundling efficiently, we need a fast subroutineto handle queries for u-v paths in our spanning tree T . A naı¨ve approach is tosimply perform breadth-first search, but this requires Θ(|V |) time per query. Since16dcab(a) Original graph G, where colour anddash style are used to differentiate edges.dcab(b) Routing tree T .cdabB1B2B3(c) Result of the bundling algorithm of LSTB. Seg-ments have the same colour and dash style as theircorresponding edge in G, and bundles B1,B2,B3 areindicated.Figure 4.1: Illustration of the bundling algorithm of LSTB. A low-stretch tree T =(V,ET ) is computed for use as a routing graph. Segmentation is then performedby routing edges through this tree, using the routing algorithm illustrated in Figure4.2. Bundles are then formed from groups of segments sharing the same endpointsas edges in ET .Ω(|ER |) queries are needed, this approach requires Ω( |V | · |ER |) time. Instead, wewill use a two-phase approach that pre-processes the graph in linear time, but canthen perform queries in time proportional to the length of the returned path (whichis optimal, as the path itself is returned). Performing such a query for every edgein ER requires time O(sT (G) · |ER |), which is small since it is a low-stretch tree.This subroutine can be easily implemented as follows. The pre-processing stepsimply roots the tree at an arbitrary vertex and directs all edges towards the root.Then, to find a u-v path in T , step in parallel from each of u and v towards the root,marking the nodes along the way. The first marked vertex encountered on either17path is the lowest common ancestor `, and the time to identify it is proportionalto the length of the u-v path. The path through the tree is then the concatenationof paths u-` and `-v, and the segmentation of (u,v) is this path. This process isillustrated in Figure 4.2.x(a) Original (unrooted) tree T .x(b) T rooted at arbitrary vertex x.uvxl(c) To find the path from u to v in T , we find theirlowest common ancestor, `, by climbing up the tree.The path from u to v is then the concatenation of pathsu-` and `-v.Figure 4.2: Illustration of the routing algorithm of LSTB from Section 4.1.2. Thealgorithm roots tree T = (V,ET ) at an arbitrary vertex in order to speed up queries,which then only need to find the lowest common ancestor between two vertices.Once the u-v paths have been computed for each (u,v) ∈ ER , bundles areformed such that all segments that have the same endpoints are bundle neighbours.The output of our bundling algorithm is then: T , our low-stretch tree; B, our set ofbundles; and E, our map from edges in E to their segmentation.4.2 Visual EncodingWe wish to take advantage of the key assets of our bundling algorithm when de-signing the visual encodings for LSTB: namely, that it is layout-free and that it18computes bundles directly. We will do so by allowing arbitrary layout adjustment,as well as adding interactive bundle queries. The figures in this section show thePoker graph (see Table 6.1 for more details).4.2.1 Vertex LayoutSince the bundles we compute are independent of layout, we can position the ver-tices in any manner we choose. We lay out the routing graph T = (V,ET ) using thestandard force-directed graph layout built into D3 [9]. Since our routing graph isa tree, we could also take advantage of tree layouts, such as the Reingold-Tilfordalgorithm [27]. Figure 4.3 shows a comparison of these layouts.(a) Force-directed layout (b) Radial Reingold-Tilford layoutFigure 4.3: Comparison between vertex layout methods for the routing graph T .Additionally, vertices may be arranged according to a user-defined layout, butimposing this layout on our tree backbone may not produce an effective visual-ization. Figure 6.1h shows such an example for comparison purposes. We alsoallow dynamic layout adjustment: vertices can be dragged in order to adjust theirposition, as discussed further in Section 4.2.3.4.2.2 Edge DrawingPrevious edge bundling methods draw edges individually, so bundles are shown asclosely located or overlapping edges. However, we wish to also take advantage of19(a) Bundles only (b) Remainder edges only(c) Bundles foreground (d) Remainder edges foregroundFigure 4.4: Examples of visual encoding options for LSTB.the fact that our algorithm outputs bundles, and that all segments in a bundle sharethe same endpoints. We will do so by drawing bundles explicitly, both indepen-dently from and in combination with individual drawing of remainder edges.Figure 4.4 shows the different visual encoding options for bundles and edges.Bundles are depicted by straight lines with tapered endpoints and varying thick-ness, as shown in Figure 4.4a. The thickness of bundles varies in proportion withbundle size: that is, the number of segments in a bundle. Individual edges aredrawn as splines, as shown in Figure 4.4b. The control points for the spline of edge(u,v) are the endpoints of its segments, which correspond to the vertices in the u-vpath in T . This approach is similar to the method of HEB [17].Bundles and edges can also be drawn simultaneously, where distinct perceptuallayers are created by adjusting opacity. Figures 4.4c and 4.4d show the differencebetween having bundles and remainder edges as the foreground layer.204.2.3 InteractivityAnother contribution of our bundling method is increased support for user inter-action. Vertex re-positioning by dragging is possible without re-running the edgebundling algorithm because the algorithm is layout-free and bundles are computedindependently of layout, in contrast to previous methods. Different hovering tech-niques are also possible, since bundles are computed explicitly and edge member-ship in bundles is known. Figure 4.5a shows how when a bundle is hovered over,all edges belonging to that bundle are highlighted. Figure 4.5b shows how singleedges are individually highlighted on hover.(a) Hovering over a bundle high-lights edges in that bundle(b) Hovering over an individualedge highlights that edgeFigure 4.5: Comparison between bundle querying and edge querying on hover.21Chapter 5Low-Stretch TreesLSTB uses low-stretch trees as routing graphs for edge bundling. The history be-hind these trees is discussed in Section 2.1. This chapter will go into more detail onthe construction algorithm of Alon et al. in addition to exploring the computationof distributions over low-stretch trees in order to achieve low stretch for edges inexpectation.In this chapter, we will consider weighted graphs G = (V,E,w) where w :E → R≥0. We will update our previous definition of stretch as follows, whereT = (V,ET , z) is a spanning tree of G with weights z : ET → R≥0:sT (e) =dT (u,v)w(e)for e = (u,v)Here, dT (u,v) is the z-weighted distance from u to v along the unique path betweenthem in T .5.1 Low-Stretch Tree ConstructionLSTB computes low-stretch trees using the method of Alon, Karp, Peleg andWest [3]. Their algorithm, given in Algorithm 1, takes as input a weighted n-vertexmultigraph G. The algorithm starts by breaking the edges into weight classes, andthen proceeds in rounds. At each step of the algorithm, the vertices of the graphare partitioned into clusters such that each cluster has low topological diameter.A shortest-paths spanning tree is then computed for each cluster, and the edges22from these trees are added to the (initially empty) low-stretch tree. Each clusteris then contracted into a meta-vertex, and edges are added to represent connec-tions between vertices in different clusters. The algorithm then iterates on this newmultigraph.The key properties of the clusters are that each cluster has a spanning tree ofradius ≤ y j+1 at round j, and in every non-empty edge class Ei for 1 ≤ i ≤ j, thefraction of inter-cluster edges is at most 1/x, where the optimal value for x is shownto be exp(√log n log log n). This algorithm is guaranteed to find a low-stretch treewith stretch exp(O(√log n log log n)) = O(n0.01) [3].In their paper, Alon et al. do not give runtime guarantees for their algorithm.However, an informal analysis shows that the runtime is O(m log n) by looking atthe number of iterations performed. At each step, the number of edges shrinks bya factor of x, where x ≈ log n. Therefore the number of iterations is logx m ≤loglog n n2 = O(log n). At each round we do O(m) work, so the total runtime isindeed O(m log n).5.1.1 ParametersThe low-stretch tree algorithm of Alon et al. depends on two key parameters, xand y. These parameters are set at the beginning of the algorithm, where x =exp(√log n log log n) and y = 27x ln n ·⌈ln nln x⌉. Alon et al. set the values for theseparameters according to their theoretical guarantees for the stretch of the resultingtree. The x parameter is responsible for controlling the expansion of each cluster.The cluster starts at a root node, then expands outward to its neighbours, workinglevel-by-level until the next level contains fewer edges than an x-fraction of theexisting edges in the cluster. The y parameter is used for weighted graphs to segre-gate edges into weight classes Ei . This way, the algorithm considers lighter edgesfirst in order to attain the radius bound given earlier.23Algorithm 1 AKPW algorithm to compute low-stretch tree T in multigraph G [3]function LOWSTRETCHTREE(G)input: Graph G = (V,E,w)output: Low-stretch tree T = (V,ET ,w)// Set parametersx = exp(√log n log log n), ρ =⌈3 ln nln x⌉, µ = 9ρ ln n, y = xµimax = blogy max(u,v)∈E w(u,v)c + 1Ei = {(u,v) ∈ E : w(u,v) ∈ [yi−1, yi ) for i = 1, . . . , imax// Perform iterative roundsj = 1T = (V,ET = ∅)while⋃i Ei , ∅ doC = CLUSTER(G j , j,Ei’s, x, y)for all c ∈ C doGc = INDUCEDSUBGRAPH(G j ,c)Tc = SHORTESTPATHSPANNINGTREE(Gc ) // Uses Dijkstra’s algfor all (u,v) ∈ ETc doET = ET ∪ {(u,v)}VG j+1 = {vc : c ∈ C} // Contract each cluster into a single vertexEG j+1 = ∅for all (u,v) ∈ EG j doif u ∈ ci and v ∈ cj and i , j thenEG j+1 = EG j+1 ∪ {(vci ,vc j )} // Add inter-cluster edgeselsei = blogy w(u,v)c + 1Ei = Ei \ (u,v) // Remove intra-cluster edgesG j+1 = (VG j+1 ,EG j+1 )j = j + 1return T = (V,ET ,w)24Algorithm 1 AKPW algorithm to compute low-stretch tree T in multigraph G [3]function CLUSTER(G, Ei’s, j, x, y)input: Graph G = (V,E,w), edge buckets Ei , and parameters j, x, youtput: Clustering C = {c1, . . . ,cK } where⋃k∈{1, ...,K } ck = VC = ∅Vrem = Vwhile Vrem , ∅ doChoose u ∈ Vrem arbitrarilyV (0) = {u}Ei (0) = {}` = 0repeat` = ` + 1V (`) = {v ∈ Vrem : |SHORTESTUNWEIGHTEDPATH(G,u,v) | = `}Ei (`) = {(v1,v2) ∈ Ei : v1 ∈ V (`),v2 ∈ V (`) ∪ V (` − 1)}until ∀1 ≤ i ≤ j, |Ei (`) | ≤ 1x |Ei (1) ∪ Ei (2) ∪ . . . ∪ Ei (` − 1) |c = V (1) ∪ V (2) ∪ . . . ∪ V (` − 1)C = C ∪ {c}Vrem = Vrem \ creturn C5.2 Distribution over Low-Stretch TreesSection 2.1 discusses several methods for computing low-stretch trees. In general,these methods compute a single tree and provide low stretch on average over alledges in the original graph. The method of Abraham et al., however, gives theo-retical results based on finding a distribution over low-stretch trees [2]. In doingso, they provide low stretch for each edge in expectation. Unfortunately, their al-gorithm is theoretical in nature and not practical to implement. We will present apractical solution for finding a distribution over low-stretch trees.We will use a reduction to a zero-sum game, given in Section A.1. We wish tofind a distribution D over trees T such that ET∼D[sT (e)] is small for all e ∈ E. In25the zero-sum game framework, for some strategy (i.e. probability distribution) overtrees y, we have ET∼y [sT (ei )] =∑T ∈y Pr[T] · sT (ei ) = (My)i for any ei ∈ E.Therefore, our goal is to find miny maxx xᵀMy, which is exactly the goal of thetree player in the game. Our distribution over trees is thus given by y∗, the opti-mal distribution from von Neumann’s Minimax Theorem [29]. The multiplicativeweights update method [6] is an algorithm for approximating such distributions.The Multiplicative Weights Update Method (MWUM) is an algorithmic frame-work which optimizes decision-making by adjusting weights on decisions. Eachdecision has some associated cost, which is revealed only after the decision hasbeen made. The algorithm proceeds in rounds, updating the weights depending onthe consequences of the decision made in that round. As the algorithm progresses,the weights are distributed to reflect the knowledge of which decisions are better orworse in terms of cost. The expected cost of the algorithm can be shown to be notmuch worse than making a fixed decision throughout; this holds true for the bestoverall decision as well [6].Arora et al. [6] provide a tailored version of MWUM for solving zero-sumgames approximately. In this case, the algorithm’s rounds correspond to simulatedrounds of the game and weight updates correspond to changes in a player’s strategy.This method can be used to find strategies xˆ and yˆ such that, for error δ,minyxˆᵀMy ≥ λ∗ − δ and maxxxᵀM yˆ ≤ λ∗ + δ (5.1)Therefore, regardless of their opponent’s strategy, a player following such a strat-egy is guaranteed payoff at most δ from the optimal. In this case, we wish to findyˆ, since this will be our distribution over trees. In order to compute the trees in thisdistribution, we will use the method of Alon et al. [3].5.2.1 Multiplicative Weights Update MethodAlgorithm 2 adapts the method from Arora et al. to solve our zero-sum game reduc-tion, with the goal of finding a distribution over trees. If Alice is our edge playerand Bob is our tree player, we want Bob’s distribution over trees to achieve lowstretch for any distribution over edges that Alice may choose. At each round, thealgorithm adjusts Alice’s weights to focus on edges which had high stretch in the26previous round, so that the resulting distribution over trees will be robust.The algorithm performs ρ weight update rounds, where the weights γ are overthe edges (e1, . . . ,em ). Note that the graph may have edge weights w as well,but these should not be confused with the multiplicative weights γ of the algo-rithm. At the start of round t, the weights γ are normalized to form the distributionx (t ), which is Alice’s current strategy. Then, the TREEORACLE method returns alow-stretch tree with respect to the distribution x (t ) (this will be discussed furtherin Section 5.2.2). In doing so, Bob is choosing the best pure strategy given Al-ice’s distribution. The edge weights are then updated, based on whether an edgehad high or low stretch in the tree, resulting in better or worse payoffs for Alice,respectively. The process is then repeated in the next round. Once all rounds arecomplete, Alice’s final strategy is computed as the average of each round’s distribu-tion x (t ). Bob’s final strategy is computed as the average over the fixed distributionsfrom each round, giving us a distribution over low-stretch trees.Some distinctions should be noted. Firstly, the trees are computed on demand,since we don’t know which trees will be in the distribution before executing themethod. Therefore, the matrix M is not explicitly computed; rather, we keep trackof the trees as they are computed, and calculate the stretch as needed. Also, sincethe same tree may be generated in multiple rounds, we only have an upper bound,ρ, on the number of trees in the distribution. In the case of tree repetition, however,we will simply have some probabilities equal to zero. In addition, Arora et al.restrict payoff values to the range [0,1]. Therefore, at each round, stretch valuesare normalized according to the maximum stretch of that round.5.2.2 Tree OracleIn order for this method to succeed, we need a way to compute a tree which haslow stretch with respect to a distribution over edges. If we think in regards to thepayoff matrix scenario, the TREEORACLE method must find some tree Tj such that(xᵀM) j is minimized, where x is Alice’s distribution over edges:arg minj(xᵀM) j = arg minj∑ixiMi, j = arg minj∑ixi sT j (ei )27Algorithm 2 Multiplicative Weights Update Method for Low-Stretch Treesfunction MWUM-LST(δ,G)input: Desired error δ, graph G = (V,E = (e1, . . . ,em ),w)output: Distributions xˆ ∈ Rm and yˆ ∈ Rρ , and list of trees T// Set parameters = δ/2, ρ = dln(m)/2eT = [ ]γ(1)i = 1 for i = 1, . . . ,m// Perform iterative roundsfor t = 1, . . . , ρ dox (t ) = γ (t )/∑mi=1 γ(t )iT (t ) = TREEORACLE(x (t ),G)if T (t ) < T thenk = |T| + 1T[k] = T (t )y(t )i =0 T[i] , T (t )1 T[i] = T (t )for i = 1, . . . , ρsmax = maxi sT (t ) (ei )si = sT (t ) (ei )/smax for i = 1, . . . ,m // Normalize to fit range [0,1]γ(t+1)i = γ(t )i · (1 + · si ) for i = 1, . . . ,mxˆ =∑ρt=1 x(t )/ρ, yˆ =∑ρt=1 y(t )/ρreturn xˆ, yˆ,TThat is, if we consider Alice’s distribution to be an importance weighting of theedges in the graph G, then Bob will want to choose a tree that minimizes thestretch of edges according to their importance. Therefore, we define the follow-ing weighted stretch value for graph G = (V,E,w) and tree T = (V,ET , z):sT (G, x) =∑ei∈Exi · sT (ei ) =∑ei=(u,v)∈Exi ·dT (u,v)w(ei )where x is a distribution over edges in G, so∑i xi = 1, xi ≥ 0 ∀i. In the uniformcase, we have xi = 1|E | ∀ei ∈ E, which gives sT (G), as expected.Therefore, we need an algorithm which computes a low-weighted-stretch treeaccording to the value sT (G, x) for graph G = (V,E,w). We know the algorithm28of Alon et al. finds a low-stretch tree T = (V,ET ,w) according to the valuesT (G) =∑ei∈EsT (ei ) ·1|E |=∑ei=(u,v)∈EdT (u,v)w(ei )·1|E |Therefore, let us re-weight the graph G with weights z : E → R≥0 wherez(ei ) =w(ei )xi · |E |Now, the algorithm will compute a tree T = (V,ET , z) according to the valuesT (G = (V,E, z)) =∑ei∈EsT (ei )·1|E |=∑ei=(u,v)∈EdT (u,v)z(ei )·1|E |=∑ei=(u,v)∈Exi ·dT (u,v)w(ei )which is equivalent to our weighted stretch value, under T weighted with z. Notealso that in the uniform case, this is equivalent to sT (G = (V,E,w)), since z(ei ) =w(ei ) when xi = 1|E | for all i.Our oracle, shown in Algorithm 3, will therefore re-weight the graph accordingto weights z, so that the low-stretch tree T = (V,ET , z) returned from Algorithm 1will have low weighted stretch.Algorithm 3 Compute low-stretch tree WRT distribution over edgesfunction TREEORACLE(x,G)input: Distribution x over edges, graph G = (V,E = (e1, . . . ,em ),w)output: Tree T = (V,ET , z) in G with low stretch WRT distribution xz(ei ) = w(ei )/xi · m for i = 1, . . . ,mT = LOWSTRETCHTREE(G = (V,E, z))return T = (V,ET , z)5.2.3 AnalysisWe wish to show that the distribution computed by Algorithm 2 provides lowstretch in expectation for all edges in E. That is, for all edges ei ∈ E, ET∼D[sT (ei )]29is small. As discussed earlier, for some ei , we have:ET∼D[sT (ei )] =∑T ∈DPr[T] · sT (ei ) = (My)iIn order to show that this is small, we will use the theorem from Arora et al. [6]:Theorem 5.2.3.1 (adapted from Arora et al. [6]). Given an error parameter δ ∈(0,1) , MWUM-LST finds xˆ and yˆ such that:minyxˆᵀMy ≥ λ∗ − δ and maxxxᵀM yˆ ≤ λ∗ + δusing O(ln(m)/δ2) calls to TREEORACLE , with an additional processing time ofO(m) per call.This theorem follows nicely from the next two results.Lemma 5.2.3.2 (Harvey [16]). For all i,ρ∑t=1x (t )ᵀMy(t )ρ≥ρ∑t=1Mi, j (t )ρ− δwhere j (t ) is the index in T of the tree T (t ) chosen at round t , such that y(t )j (t )= 1 .This lemma shows that the algorithm’s performance will not be much worsethan choosing a fixed action throughout. The key thing to note is that this holds forall possible fixed actions, including the optimal. The proof of this lemma is givenin Section A.2.Corollary 5.2.3.3 (Harvey [16]). For any distribution q ∈ Rm , we haveρ∑t=1qᵀMy(t )ρ− δ ≤ρ∑t=1x (t )ᵀMy(t )ρ≤ λ∗The lower bound follows from Lemma 5.2.3.2. Since q is a distribution, weknow∑mi=1 qi = 1. Therefore, we can multiply both sides by the sum, and simplifythe RHS to obtain our result. The upper bound comes from the observation thatx (t )My(t ) = miny x (t )My ≤ λ∗. More details can be found in the full proof [16].We are now ready to prove Theorem 5.2.3.1.30Proof (Theorem 5.2.3.1). In particular, let’s focus on the right-hand guarantee.The left-hand case is similar [16]. Let yˆ =∑ρt=1 y(t )/ρ be the result returned fromthe algorithm. Apply Corollary 5.2.3.3 to the distribution x = arg maxx xᵀM yˆ.Then we have:xᵀM yˆ =ρ∑t=1xᵀMy(t )ρ≤ λ∗ + δIn terms of running time, we know that ρ = ln(m)/2 = 4 ln(m)/δ2 rounds areperformed, and TREEORACLE is called once per round. Within each round, wealso update all weights γi for i = 1, . . . ,m. Therefore, we have O(ln(m)/δ2) callsto TREEORACLE, and perform an additional O(m) work each round.Theorem 5.2.3.1 shows that the MWUM-LST algorithm returns a distributionyˆ such thatmaxxxᵀM yˆ ≤ λ∗ + δwhere Alon et al. show that λ∗ ≤ exp(O(√log n log log n)), since their tree stretchis upper bounded by that value. This implies that (M yˆ)i is small for all i. There-fore, the distribution provides low expected stretch for all edges.5.2.4 Application to LSTBIn order to utilize this distribution as a routing graph for LSTB, we must determinehow to perform routing and how to draw the resulting bundled graph. For routing,we use the simple approach of choosing the shortest path among the trees in thedistribution. That is, for some edge e = (u,v) ∈ E, its route through the distribu-tion is the unique path between u and v in T , where T = arg minT ∈D dT (u,v). Todraw the distribution, we combine the trees into a single routing graph GR with ver-tices V and edges⋃T=(V ,ET )∈D ET . We lay out GR using a force-directed layoutalgorithm. Bundled edges are then drawn as described in Section 4.2.2.Section 6.3 shows the results of using a distribution over trees for routing inLSTB. For these results, the size of the distribution was fixed by setting the ρparameter explicitly.31Chapter 6ResultsThis chapter presents the results of this work, including visual output and perfor-mance of LSTB and experiments with distributions over trees.6.1 Data and ImplementationThe data sets listed in Table 6.1 were chosen based on their use in previous work toenable comparisons to existing methods. The “Case” column indicates which casefrom Table 1.1 the data falls under.Data set |V | |E | Description CaseFlare 220 708 Software class hierarchy1 H3L7US Airlines 235 1297 US airline network2 H7L3Poker 859 2127 Poker game network3 H7L7Email 1133 5451 Email interchange network4 H7L7Yeast 2224 6609 Protein interaction network4 H7L7wiki-Vote 7066 100736 Wikipedia admin elections4 H7L7Table 6.1: Graph statistics for data sets used in this paper.1https://gist.github.com/mbostock/10442422https://github.com/upphiminn/d3.ForceBundle/tree/master/example/bundling data3Courtesy of A. Telea [19]4http://yifanhu.net/GALLERY/GRAPHS/32LSTB was implemented in Python and JavaScript, using D3 [9] for drawing.Our experiments were run on a MacBook Pro with a 2.6 GHz Intel Core i7 and 16GB of 1600 MHz DDR3 RAM.6.2 LSTB EvaluationThe visual results of LSTB are layouts that closely follow the spanning tree back-bones. Figure 6.1 compares LSTB to previous edge bundling systems: KDEEB [19],Clustered Edge Routing [10], MINGLE [14], SBEB [13], and HEB [17]. We usethe same layouts as in previous work for the layout-based methods, and when nonewas available we provide an arbitrary layout. For both the wiki-Vote (Figures 6.1aand 6.1b) and Yeast (Figures 6.1c and 6.1d) data sets, the bundled graphs have atree-like structure, but LSTB makes it easier to identify clusters of vertices anddisparities in their sizes. For the Email data, shown in Figures 6.1e and 6.1f, Clus-tered Edge Routing [10] shows a mesh-like structure in contrast to our tree layout,so choosing the more appropriate method would depend on the task at hand. TheUS Airlines data has a geographical layout, so the layout-based bundling shown inFigure 6.1g would be suitable for tasks that pertain to spatial position. We showour method of drawing bundles with that geographic layout for comparison pur-poses in Figure 6.1h, in contrast to the layout computed by LSTB in Figure 6.1ithat emphasizes different patterns in the data. Figures 6.1j-l show the Flare dataset, which is a software package hierarchy where edges represent imports of onepackage from another. While the LST used as a routing graph is not the originalpackage hierarchy, we can see from the package names that similar packages arestill clustered together, and hovering over edge bundles makes it easy to see whichpackages call each other.Our approach achieves competitive performance with the fastest previous method,MINGLE [14]. Table 6.2 gives running times for our method, which range from0.06 seconds for a 708-edge graph to 8.067 seconds for a 100736-edge graph. Onthis large graph, MINGLE reports a running time of 18.4 seconds, albeit on an olderarchitecture. We note that while MINGLE is implemented in C using OpenGL on aGPU, our proof-of-concept implementation uses unoptimized scripting languages;further speed improvements would result from a GPU port.33(a)wiki-Vote|V | = 7066|E | = 100736H7L7KDEEB [19](b)wiki-VoteLSTB(c)Yeast|V | = 2224|E | = 6609H7L7MINGLE [14] (colours inverted)(d)YeastLSTB(e)Email|V | = 1133|E | = 5451H7L7Clustered Edge Routing [10](f)EmailLSTB(g)US Airlines|V | = 235|E | = 1297H7L3SBEB [13] c© 2011 IEEE(h)US AirlinesLSTB(i)US AirlinesLSTB(j)Flare|V | = 220|E | = 708H3L7HEB [17] (D3 implementation [9])(k)FlareLSTB(l)FlareLSTBFigure 6.1: Comparison between previous methods and LSTB.34Data set |V | |E | LST Bundling Drawing TotalFlare 220 708 0.022 0.005 0.032 0.060US Airlines 235 1297 0.035 0.009 0.042 0.086Poker 859 2127 0.082 0.026 0.108 0.216Email 1133 5451 0.180 0.053 0.283 0.516Yeast 2224 6609 0.247 0.070 0.342 0.659wiki-Vote 7066 100736 4.275 1.010 2.782 8.067Table 6.2: Performance statistics of LSTB. Running time is in seconds, and isaveraged over 100 runs.6.3 Distribution over Low-Stretch TreesThe aim of computing a distribution D over low-stretch trees versus a single treeis to obtain low stretch in expectation for each edge in the graph. We can validatethe results of our experiments by comparing the maximum expected stretch of anyedge over the distribution, maxe∈E ET∼D[sT (e)] to the maximum stretch of anyedge in a single tree T , maxe∈E sT (e). These results are given in Table 6.3. In alldistribution computations we use δ = 0.5.maxe∈E ET∼D[sT (e)]Data set maxe∈E sT (e) |D | = 2 |D | = 5 |D | = 10Flare 9.00 7.10 8.36 8.13US Airlines5 34.08 34.30 32.24 25.90Poker 12.00 12.15 10.82 12.07Email 8.00 7.97 8.34 9.02Yeast 10.00 10.08 10.19 11.74wiki-Vote 6.00 6.03 6.29 6.97Table 6.3: Maximum edge stretch comparison between a single low-stretch treeT and a distribution D over low-stretch trees, computed as in Section 5.2 withvarying sizes |D | ∈ {2,5,10}.From these results we can see that the distribution generally fulfills its purposeof ensuring the expected stretch is low for each edge, versus the single tree. In5This graph has a non-uniform input weighting.35a few cases, however, the maximum expected stretch goes up compared to themaximum stretch in the single low-stretch tree. Since there is no monotonicityguarantee, the stretch may not go down at each round. Also, since the guaranteesgiven in Section 5.2 are for a distribution of size ρ = dln m/2e and we computedistributions of constant size, we may not obtain the optimal result.Next we will evaluate the results of applying the distribution to LSTB, as dis-cussed in Section 5.2.4. Figure 6.2 shows the output of LSTB when using a dis-tribution over two low-stretch trees as the routing graph. When remainder edgesand bundles are shown, there is a large amount of visual clutter and it is hard todistinguish patterns among edges and bundles. This is particularly evident whencomparing this result, shown in Figure 6.2b, to the original output of LSTB witha single tree, shown in Figure 6.2c. These issues are only worsened when largerdata sets are used, and for larger distributions. This is discussed in more detail inSection 7.3.36(a) Force layout of unbundled Flare data.(b) Bundled result using two trees as therouting graph (with 317 edges). Whilethis visualization is still somewhat clut-tered, the maximum expected stretch forany edge is 7.1.(c) Bundled result using one tree (orig-inal LSTB output) as the routing graph(with 219 edges). This visualization isless cluttered than (b), but has worsemaximum stretch of 9.0.Figure 6.2: Comparison between visual results obtained on the Flare data set usinga distribution over two low-stretch trees (b) and a single low-stretch tree (c) forrouting in LSTB.37Chapter 7DiscussionSeveral interesting issues arise when using low-stretch trees for a network visual-ization application. This chapter will discuss some of them, including the use oftrees for routing in edge bundling, the deeper meaning behind “low” stretch, anddifficulties in drawing a distribution over trees. We will also evaluate our techniqueof using low-stretch trees and a distribution over those trees for edge bundling. Thischapter concludes with ideas for future work.7.1 LSTBThe crux of this thesis is the use of low-stretch trees for edge bundling. Whilerouting through a tree results in similar-looking bundled graphs for most data sets,using a tree provides the sparsity needed for an uncluttered bundling. That said,LSTB specifically uses a low-stretch tree to route edges. Section 4.1.1 explainsthe reasoning behind this decision, but one might also wonder why a different typeof spanning tree is not used. For example, a minimum spanning tree might seemappropriate, since it minimizes its total weight. In the unweighted case, any span-ning tree is a minimum spanning tree. In either case, however, this tree structuredepends very little on the edges that are not included in the tree, despite the factthat these edges are crucial for edge bundling. Edges may therefore be distorted byhaving endpoints far apart in the tree. To illustrate this, Figure 7.1 shows a compar-ison between an arbitrary spanning tree and a low-stretch tree for the unweighted38Email data set. In Figure 7.1a, the large bundles along the interior edges implymany edges are being routed from exterior branches through the middle, whichmeans endpoints from edges in the graph are far apart in the tree. This congestiondistorts the edges of the original graph.(a) Arbitrary spanning tree routinggraph.(b) Low-stretch tree routing graph.Figure 7.1: Comparison between using an arbitrary spanning tree and a low-stretchtree as the routing graph in LSTB on the Email data set.7.2 StretchFrom the name, we have an intuitive sense of what stretch is. From the definition,we know it is the ratio between the distance between edge endpoints in a tree andthe weight of that edge in the original graph. However, we are also interested in themeaning of stretch. In particular, what does it mean to have low stretch? We haveseen two quantities: low stretch for edges on average, and low stretch for everyedge. Most low-stretch tree constructions focus on the former [1, 3, 12], whilecomputing a distribution over trees focuses on the latter [2]. As in many situations,there is a trade-off: we can achieve low stretch in expectation for every edge, butwe must accept a distribution; on the other hand, we can compute a single tree, butmust be satisfied with low average stretch.In this work, we faced another trade-off in choosing which low-stretch tree39construction to use. We chose the simpler method of Alon et al., but must use themultiplicative weights update method [6] to obtain a distribution with low expectedstretch for all edges. Had we chosen the method of Abraham et al., we would beable to sample directly from the distribution, but the implementation would be farmore complex (and likely impractical).In the unweighted (or uniformly weighted) case, stretch is far easier to under-stand. The “units” are edges: if two vertices were connected in the original graph,then their stretch corresponds to the number of edges separating them in a giventree. In the weighted case, however, things get more complex, particularly whencomparing stretches. In such situations, the meaning of stretch depends on theweighting. For geographical data, such as the US Airlines data set (see Table 6.1),edge weights measure physical distance. In other cases, though, the weighting maybe less clear. In these situations, stretch results should be thought about from thecontext of the weighting, rather than edge quantity.7.3 Distribution over Low-Stretch TreesBased on the results shown in Section 6.3, our distribution over (few) low-stretchtrees performs generally as expected: in most cases, the distribution reduces themaximum stretch of individual edges. However, with a constant number of treesin the distribution and no monotonicity guarantee, this may not always be the case.This caveat should be taken into account when evaluating the use of a distributionof low-stretch trees for routing in LSTB.Another factor for evaluation is the method by which edges are routed throughthe distribution. Choosing to route an edge through the tree with the shortest pathbetween its endpoints is simple, but may be computationally expensive. Randomlysampling a tree from the distribution for each edge is fast, but may not provide thebest stretch. The most important factor to take into consideration, however, is thevisual aspect. Drawing a routing graph which consists of multiple trees presentsseveral difficulties. One such issue is the representation of weights. The trees havea uniform probability of being sampled from the distribution, but the tree edges areweighted. It is unclear what the best way to represent these weights is.The main problem, though, is visual clutter. In graph theory, any graph with40O(n) edges is considered extremely sparse. In network visualization, however,there is a considerable difference between n edges and 2n edges. While increasingthe size of the distribution leads to better theoretical results, more edges in therouting graph results in more visual clutter. The graph is more difficult to lay out,and the number of bundles is increased, making them harder to distinguish. Theseeffects are highlighted in Figure 6.2. After taking these issues into consideration,it seems that a single low-stretch tree provides a better routing graph for the LSTBedge bundling application.7.4 Future WorkWe present several ideas for further research in this area. In terms of improvingLSTB, different graph layouts and visual encodings could be explored. It wouldalso be interesting to investigate the adaptation of LSTB for layout-based bundling.Ultimately, a united system of both layout-based and layout-free edge bundlingshould be developed.Rather than using the method of Alon et al. [3] for computation of low-stretchtrees, other methods with better theoretical guarantees could be tried. Along thisvein, using the method of Abraham et al. [2] would remove the dependence on themultiplicative weights update method for a distribution over low-stretch trees. Ineither case, however, the implementation difficulties may outweigh the benefits ofthese other methods.Other sparsifiers may also be tried as routing graphs for edge bundling. Asmentioned in Section 7.3, there is a notable difference between n and 2n edgesfor visualization. However, new approaches such as Kolla et al. [20] computeultrasparsifiers , which have n + o(n) edges. A famous sparsification result byBatson et al. [7] proved too slow for practical purposes in early experiments, butnew work by Lee and Sun [22] improves this with a near-linear time algorithm.In addition, low-stretch trees could be applied to other areas of network vi-sualization. They are particularly useful for applications which require both thepreservation of edge distances and a sparse representation of the original graph.41Chapter 8ConclusionWe present LSTB, a novel edge bundling technique which is, by our classifica-tion, layout-free. While previous bundling methods rely on an input graph lay-out or explicit hierarchical structure, we use topological features of the graph inorder to compute a low-stretch tree which we use to route edges. Our bundlingmethod is fast and simple, and provides algorithmic support for sophisticated vi-sual encodings and interactivity. In addition, our abstract framework for edgebundling presents a formalization of bundling terminology and techniques that al-lows bundling methods to be compared in a uniform way.Our application of the multiplicative weights update method to a zero-sumgame over edges and trees enables the computation of a distribution over low-stretch trees. This distribution ensures all edges in the original graph have lowstretch in expectation. We apply this distribution as a routing graph for LSTB, butour analysis shows that a single low-stretch tree obtains better visual results.42Bibliography[1] I. Abraham and O. Neiman. Using petal-decompositions to build a lowstretch spanning tree. In Proc. ACM Symp. Theory of Computing (STOC) ,pages 395–406, 2012. ISBN 978-1-4503-1245-5. → pages 8, 39[2] I. Abraham, Y. Bartal, and O. Neiman. Nearly tight low stretch spanningtrees. In Proc. IEEE Symp. Foundations of Computer Science (FOCS) , pages781–790, 2008. → pages 8, 16, 25, 39, 40, 41[3] N. Alon, R. M. Karp, D. Peleg, and D. West. A graph-theoretic game and itsapplication to the k-server problem. SIAM Journal on Computing, 24(1):78–100, 1995. ISSN 0097-5397. → pages 3, 4, 5, 7, 16, 22, 23, 24, 25, 26,29, 31, 39, 40, 41[4] I. Altho¨fer, G. Das, D. Dobkin, and D. Joseph. Generating sparse spannersfor weighted graphs. In Scandinavian Workshop on Algorithm Theory,volume 447 of Lecture Notes in Computer Science, pages 26–37. SpringerBerlin Heidelberg, 1990. ISBN 978-3-540-52846-3. → pages 9[5] D. Archambault, T. Munzner, and D. Auber. Smashing peacocks further:Drawing quasi-trees from biconnected components. Visualization andComputer Graphics, IEEE Transactions on , 12(5):813–820, 2006. → pages10[6] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method:a meta-algorithm and applications. Theory of Computing, 8(1):121–164,2012. → pages 5, 26, 27, 30, 40[7] J. Batson, D. A. Spielman, and N. Srivastava. Twice-ramanujan sparsifiers.SIAM Journal on Computing, 41(6):1704–1721, 2012. → pages 41[8] E. Boman and B. Hendrickson. On spanning tree preconditioners.Manuscript, Sandia National Lab , 2001. → pages 743[9] M. Bostock, V. Ogievetsky, and J. Heer. D3: Data-driven documents. IEEETrans. Visualization & Comp. Graphics (Proc. InfoVis) , 17(12):2301–2309,2011. → pages 19, 33, 34[10] Q. W. Bouts and B. Speckmann. Clustered edge routing. In Proc. IEEEPacific Visualization Symp. (PacificVis) , 2015. To appear. → pages 2, 9, 14,15, 33, 34[11] W. Cui, H. Zhou, H. Qu, P. C. Wong, and X. Li. Geometry-based edgeclustering for graph visualization. IEEE Trans. Visualization & Comp.Graphics, 14(6):1277–1284, Nov 2008. ISSN 1077-2626.doi:10.1109/TVCG.2008.135. → pages 2, 8[12] M. Elkin, Y. Emek, D. A. Spielman, and S.-H. Teng. Lower-stretch spanningtrees. SIAM Journal on Computing, 38(2):608–628, 2008. → pages 8, 16, 39[13] O. Ersoy, C. Hurter, F. V. Paulovich, G. Cantareiro, and A. Telea.Skeleton-based edge bundling for graph visualization. IEEE Trans.Visualization & Comp. Graphics , 17(12):2364–2373, Dec 2011. ISSN1077-2626. doi:10.1109/TVCG.2011.233. → pages 9, 33, 34[14] E. R. Gansner, Y. Hu, S. North, and C. Scheidegger. Multilevelagglomerative edge bundling for visualizing large graphs. In Proc. IEEEPacific Visualization Symp. (PacificVis) , pages 187–194, 2011. → pages 9,13, 33, 34[15] S. Hachul and M. Ju¨nger. Drawing large graphs with a potential-field-basedmultilevel algorithm. In Graph Drawing, volume 3383 of Lecture Notes inComputer Science, pages 285–295. Springer Berlin Heidelberg, 2005. →pages 9[16] N. Harvey. Lecture notes in mathematical programming.http://www.math.uwaterloo.ca/∼harvey/F10/Lecture12Notes.pdf, 2010. →pages 30, 31, 47[17] D. Holten. Hierarchical edge bundles: Visualization of adjacency relationsin hierarchical data. IEEE Trans. Visualization & Comp. Graphics , 12(5):741–748, Sept 2006. ISSN 1077-2626. doi:10.1109/TVCG.2006.147. →pages viii, 2, 3, 6, 8, 20, 33, 34[18] D. Holten and J. J. van Wijk. Force-directed edge bundling for graphvisualization. Computer Graphics Forum, 28(3):983–990, 2009. ISSN1467-8659. → pages 9, 13, 1444[19] C. Hurter, O. Ersoy, and A. Telea. Graph bundling by kernel densityestimation. Computer Graphics Forum, 31(3pt1):865–874, 2012. → pages9, 32, 33, 34[20] A. Kolla, Y. Makarychev, A. Saberi, and S.-H. Teng. Subgraph sparsificationand nearly optimal ultrasparsifiers. In Proc. ACM Symp. Theory ofComputing (STOC) , pages 57–66, New York, NY, USA, 2010. ACM. →pages 41[21] A. Lambert, R. Bourqui, and D. Auber. Winding roads: Routing edges intobundles. Computer Graphics Forum, 29(3):853–862, 2010. → pages 2, 9[22] Y. T. Lee and H. Sun. Constructing linear-sized spectral sparsification inalmost-linear time. In IEEE Symp. Foundations of Computer Science(FOCS) , 2015. To appear. → pages 41[23] T. Munzner. Visualization Analysis and Design, chapter 4, pages 67–93. AK Peters Visualization Series. CRC Press, 2014. → pages 1, 5, 6[24] T. Munzner, F. Guimbretiere, and G. Robertson. Constellation: Avisualization tool for linguistic queries from MindNet. In Proc. IEEE Symp.Information Visualization (InfoVis) , pages 132–135, 154, 1999. → pages 10[25] S. Pupyrev, L. Nachmanson, and M. Kaufmann. Improving layered graphlayouts with edgebundling. In Graph Drawing, volume 6502 of LectureNotes in Computer Science, pages 329–340. Springer Berlin Heidelberg,2011. ISBN 978-3-642-18468-0. → pages 9[26] S. Pupyrev, L. Nachmanson, S. Bereg, and A. E. Holroyd. Edge routing withordered bundles. In Graph Drawing, volume 7034 of Lecture Notes inComputer Science, pages 136–147. Springer Berlin Heidelberg, 2012. ISBN978-3-642-25877-0. → pages 2, 9[27] E. M. Reingold and J. S. Tilford. Tidier drawings of trees. IEEE Trans.Software Engineering, SE-7(2):223–228, March 1981. → pages 19[28] N. K. Vishnoi. Lx = b . Foundations and Trends in Theoretical ComputerScience. NOW, 2013. → pages 4, 8[29] J. von Neumann. Zur theorie der gesellschaftsspiele. MathematischeAnnalen, 100(1):295–320, 1928. → pages 7, 26, 4745Appendix ASupporting MaterialsA.1 Zero-Sum Games and the Minimax TheoremConsider a game between two players, Alice and Bob. Let M be the payoff matrixsuch that, on a given turn, if Alice chooses row i and Bob chooses row j, Bob mustpay Alice the value Mi, j . Therefore, Alice wishes to maximize her payoff fromM and Bob wishes to minimize it. We will formulate such a zero-sum game forlow-stretch trees, where the payoff values will be the stretch of a given edge in agiven tree.For graph G = (V,E), choose an arbitrary ordering of edges e1, . . . ,em whereE = {ei : 1 ≤ i ≤ m}. Let the edges ei = (u,v) of G correspond to the rows of M ,and let there be ρ columns, each one corresponding to some (currently undefined)spanning tree Tj . Then the payoff values will be the stretch of a given edge in agiven tree, so Mi, j = sT j (ei ). Alice wishes to choose an edge ei that will have highstretch in whichever tree Bob chooses, and Bob wishes to choose a tree Tj that willhave low stretch for whichever edge Alice chooses.Let x ∈ Rm be a distribution over edges, and y ∈ Rρ be a distribution overtrees, such that∑mi=1 xi = 1 and∑kj=1 y j = 1, and all xi and y j are non-negative.These distributions represent player strategy, where xi denotes the probability thatAlice chooses edge ei , and y j denotes the probability that Bob chooses tree Tj .The expected value of Alice’s payoff for choosing edge ei , given Bob is playing bystrategy y, is (My)i . Likewise, the expected value of Bob’s payment for choosing46treeTj , given Alice is playing by strategy x, is (xM) j . By von Neumann’s MinimaxTheorem [29], there exists a distribution x∗ over edges and a distribution y∗ overtrees such that:maxxminyxᵀMy = x∗ᵀMy∗ = minymaxxxᵀMywhere both players achieve the best possible payoff given the other’s strategy. Wedenote the value of the game as λ∗ = x∗ᵀMy∗.A.2 Proof of Lemma 5.2.3.2Adapted from Harvey [16]. Let Γ(t ) =∑mi=1 γ(t )i . Consider the algorithm’s stateat the start of round t + 1.Γ(t+1) =m∑i=1γ(t+1)i =m∑i=1γ(t )i · (1 + · si )Recall that si is the normalized stretch of edge ei in tree T (t ). That is, si =sT (t ) (ei )/smax. Since we assume M is normalized, this is equivalent to Mi, j (t ) ,where j (t ) is the index in T of T (t ) such that y(t )j (t )= 1. Therefore, we also haveMi, j (t ) = (My(t ))i . Therefore, we obtain:Γ(t+1) =m∑i=1γ(t )i · (1 + · (My(t ))i )=m∑i=1γ(t )i + γ(t )i (My(t ))i=m∑i=1γ(t )i + m∑i=1γ(t )i (My(t ))i= Γ(t ) + Γ(t )m∑i=1x (t )i (My(t ))iwhere the last step uses that x (t ) = γ (t )/Γ(t ). Simplifying this expression gives us:Γ(t+1) = Γ(t ) (1 + · x (t )My(t ))47Using that (1 + y) ≤ ey ∀y, this becomes:Γ(t+1) ≤ Γ(t ) exp( · x (t )My(t ))Now, consider the end of the algorithm, after ρ rounds have been performed.We know that the base case of the recurrence is Γ(1) = m. Therefore, we obtain:Γ(ρ+1) ≤ Γ(ρ) exp( · x (ρ)My(ρ))≤ Γ(1)ρ∏t=1exp( · x (t )My(t ))= m · exp ·ρ∑t=1x (t )My(t )where the product has been pulled inside the exponential. This gives us an upperbound on Γ(ρ+1); next, we will find a lower bound.Note that any γ (t+1)i is a lower bound for Γ(t+1) for all i = 1, . . . ,m and t =1, . . . , ρ since the weights are non-negative. Therefore,Γ(t+1) ≥ γ (t+1)i≥ γ (t )i · (1 + · Mi, j (t ) )Consider the result after ρ iterations. Here, our base case is γ (1)i = 1. Therefore,Γ(ρ+1) ≥ γ (ρ+1)i≥ρ∏t=1(1 + · Mi, j (t ) )≥ρ∏t=1(1 + )Mi, j (t )since (1 + x) ≥ (1 + )x for x ∈ [0,1] and ≥ 0. Putting the upper and lower48bounds together gives:m · exp ·ρ∑t=1x (t )My(t ) ≥ Γ(ρ+1) ≥ρ∏t=1(1 + )Mi, j (t )Taking the natural logarithm of both sides gives:ln m + ·ρ∑t=1x (t )My(t ) ≥ρ∑t=1Mi, j (t ) · ln(1 + )ρ∑t=1x (t )My(t ) ≥1ln(1 + )ρ∑t=1Mi, j (t ) − ln m≥ (1 − )ρ∑t=1Mi, j (t ) −ln m≥ρ∑t=1Mi, j (t ) − ρ −ln m≥ρ∑t=1Mi, j (t ) − 2ρusing the fact that ln(1 + ) ≥ − 2, and∑ρt=1 Mi, j (t ) ≤ ρ. Dividing by ρ, weobtain our result:ρ∑t=1x (t )My(t )ρ≥ρ∑t=1Mi, j (t )ρ− 2≥ρ∑t=1Mi, j (t )ρ− δ49
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Low-stretch trees for network visualization
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Low-stretch trees for network visualization McKnight, Rebecca Linda 2015
pdf
Page Metadata
Item Metadata
Title | Low-stretch trees for network visualization |
Creator |
McKnight, Rebecca Linda |
Publisher | University of British Columbia |
Date Issued | 2015 |
Description | Low-stretch trees are spanning trees which provide approximate distance preservation for edges in the original graph by minimizing stretch. We explore the application of these trees to network visualization. In particular, we present a novel edge bundling technique, LSTB, that computes edge bundles explicitly and efficiently and does not rely on fixed vertex positions. This approach is in contrast to previous methods, which require the user to provide a layout of the input graph. We introduce an abstract framework for edge bundling methods, which provides a unifying formalization of bundling terminology and techniques, as well as a classification of such methods. Based on this framework, LSTB provides algorithmic support for sophisticated visual encodings, including dynamic layout adjustment and interactive bundle querying. In addition, we explore the use of the multiplicative weights update method to compute a distribution over low-stretch trees in order to achieve low stretch for all edges in expectation, rather than on average. We present the results of using this distribution in place of a single low-stretch tree as a routing graph for LSTB. While the distribution provides better stretch guarantees, we find that from a visual perspective a single low-stretch tree provides a better routing graph for the LSTB edge bundling application. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2015-08-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada |
DOI | 10.14288/1.0166557 |
URI | http://hdl.handle.net/2429/54490 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2015-09 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/2.5/ca/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2015_september_mcknight_rebecca.pdf [ 12.88MB ]
- Metadata
- JSON: 24-1.0166557.json
- JSON-LD: 24-1.0166557-ld.json
- RDF/XML (Pretty): 24-1.0166557-rdf.xml
- RDF/JSON: 24-1.0166557-rdf.json
- Turtle: 24-1.0166557-turtle.txt
- N-Triples: 24-1.0166557-rdf-ntriples.txt
- Original Record: 24-1.0166557-source.json
- Full Text
- 24-1.0166557-fulltext.txt
- Citation
- 24-1.0166557.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0166557/manifest