GOING CRITICAL* AN INVESTIGATION OF DIAMETER-CRITICAL GRAPHS by JOSHUA MADDEN B.A., University of Oregon, 1994 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES Department of Mathematics 1 /! Institute of Applied Mathematics We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA April 1999 © Joshua Madden, 1999 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of M ^ J W v W a c f f ^ S The University of British Columbia Vancouver, Canada Date 3 0 Af-l\ 1 W DE-6 (2/88) Abstract We define a graph G = (V, E) with m = \E\ edges, n = \V\ vertices, maximum degree D, and diameter d, to be d-critical if it has the property that for any edge e G E, the graph G — e has diameter > d. We explore some results relating to a conjecture on the maximum possible number of edges in a 2-critical graph. We also present efficient algorithms for identifying spanning d-critical subgraphs of graphs having diameter d where d = 2 or d = 3. The algorithm for d = 2 runs in 0(mn), and the algorithm for d = 3 runs in 0(mnD). n Table of Contents Abstract ii Table of Contents iii List of Figures iv Acknowledgements v Chapter 1. Introduct ion 1 1.1 Background and Motivation 1 1.2 Problem Domain 5 1.2.1 Weighted vs. Unweighted Graphs 5 1.2.2 Directed vs. Undirected Graphs 6 1.3 Terminology and Notation J 7 Chapter 2. 2-Critical Graphs wi th 0 ( n 3 ) Triangles 9 2.1 2-Critical Graphs and Triangles 9 2.1.1 2-Critical Graphs with All Edges in Triangles 14 2.2 Examples of 2-Critical Graphs with 0 (n 3 ) Triangles 15 2.2.1 The Graph Family Qp 19 2.2.2 The Graph Families Uv and Wp 23 2.3 Analysis of our Examples 25 2.3.1 Associated Triangles 26 2.3.2 Cliques, Triangles, and Edges 27 Chapter 3. d-Critical Spanning Subgraphs 34 3.1 Calculating the Diameter of a Graph .- 34 3.1.1 Breadth-First Search Trees 34 3.1.2 Matrix Multiplication 35 3.2 Recognizing <i-Critical Graphs 40 3.3 Identifying d-Critical Spanning Subgraphs 41 3.3.1 Properties of Breadth-First Search Trees 45 3.3.2 Algorithm for Diameter 2 Graphs 47 3.3.3 Algorithm for Diameter 3 Graphs 55 3.3.4 Algorithms for General Graphs 60 3.4 Concluding Remarks 63 3.4.1 Further Uses of Matrix Multiplication 63 3.4.2 Parallelization 64 Bibl iography 65 iii List of Figures 1.1 Transformation from a weighted edge to a series of unweighted edges 6 2.1 An example of an extremal triple 12 2.2 Members and non-members of Si 12 2.3 The family of graphs A(r) 15 2.4 The graph 5(2, 3) 16 2.5 Original concept for 2-critical graph with critical triangle (x,y,z) 18 2.6 Gi,g2 30 2.7 g3,g4 31 2.8 rtii 1L2, 1L3 32 2.9 Wi,-W2,W3 33 3.1 An example of a BFS tree and its virtual edges 45 3.2 An example of case 2 for diameter 2: T{ 49 3.3 An example of case 3 for diameter 2: Tk 51 3.4 An example of case 2 for diameter 3: Ti 56 3.5 An example of case 3 for diameter 3: Tk, l(Tk, i) = 1, l(Tk,j) = 2 .• 57 3.6 An example of case 4 for diameter 3: Tk, l(Tk, i) = 2, l(Tk,j) = 3 58 IV Acknowledgements I would like to thank a number of people for their encouragement and assistance in the writing of this work: my supervisor, Richard Anstee; my parents, Bill and Beth Madden; my leman, Megan Foster; my academic mentors, including (but not limited to) Frances Cogan, Bill Kantor, Tom McCormick, Andrzej Proskurowski, Hollis Williams, and Chris Wilson; Jack Snoeyink and Joel Friedman, for providing much of my financial support; Alan Mackworth, David Poole, and Craig Boutilier, for pointing the way to the next stage of my education; Alex West and Jeff Lassahn, for helping to keep my mathematical talents alive in the years between finishing my B.A. and starting my M.Sc; the folks in the UBC Computer Science Theory Seminar (David Kirkpatrick, Hamish Carr and Andrea Mantler in particular), who suggested a number of improvements and extensions; and a number of innocent bystanders who held still long enough for me to explain what my thesis was about, despite their lack of background (or, occasionally, interest!) in mathematics. Their contributions have been invaluable. v Chapter 1 Introduction 1.1 Background and Motivation A graph consists of a collection of objects (vertices) and the connections between them (edges). Graphs, which are more commonly known as networks in nonmathematical contexts, are ubiquitous in today's society. Some of their manifestations are explicit: for example, a modern national transportation system consists of cities and the available paths between them, e.g., highways, railroads, airplane routes, and so on. However, this does not exhaust the representational power of graphs by any means. Many problems in communication, resource management, scheduling, planning, and countless other subjects are very naturally represented by graphs, and the best solutions known to these problems have generally been arrived at by the study of graphs and their properties. Indeed, one could make the argument that the modern world is characterized by graphs, as the complex interconnections between people and corporations which graphs so naturally represent are an obvious concomitant of the emerging global society. When one constructs a graph for a specific purpose, especially in the real world, one is often concerned with the expenditure necessary to build the associated structure which the graph represents. To return to our example of transportation, you probably would not want to build a separate road connecting each pair of cities in Canada; many of these roads would hardly be used. Nor is it likely that you would want to construct six separate 1 Chapter 1. Introduction roads just to connect two cities. On the other hand, if you built the smallest possible number of roads sufficient to connect all the cities, many routes would be inconveniently long. Furthermore, the more roads that you have to take, the more likely you are to get lost or take a wrong turn. Thus, we formulate the following problem: "How many roads do we have to build so that you can get from any city in Canada to any other using only a few roads?" Using ideas from graph theory, this problem can be approached in two different ways: • Find a graph that represents the system, satisfies the requirements of the system, and has the property that no subgraph of this graph satisfies the system's require-ments. Such a graph is called a minimal graph with respect to the requirements. • Find a graph that represents the system, satisfies the requirements of the system, and has as few edges as possible. This graph is called a minimum cost graph with respect to the requirements (assuming that the cost of a graph is defined as the number of edges which comprise it). Generally speaking, finding a minimal solution to a problem in graph theory is not difficult, but finding a solution of minimum cost to the same problem is prohibitively time-consuming. (Note that a minimum cost solution is usually also a minimal solution by default.) There are exceptions to this generalization, of course; unfortunately, as far as we are aware, this particular problem is not one of them. (See Chapter 3 for more details on this issue.) However, in many cases a minimal solution is what we are interested in, or is at least good enough for our purposes. One of the subjects of study in the field of extremal graph theory is those graphs which are minimal with respect to some set of requirements. Formally, we say that a graph G having property P is critical with respect to P , or P-critical, if no subgraph of G has property P. 2 Chapter 1. Introduction A fundamental property of graphs whose behavior is not always well understood in general is diameter. One can readily calculate the distance between all pairs of vertices in a graph (where the distance between vertices i and j in a graph is defined as the length of the shortest path in the graph which connects i to j). The diameter of a graph is defined as being the longest distance between any two vertices in the graph. It is straightforward to calculate the diameter of a given graph, but it is sometimes difficult to make generalizations about the diameter of an arbitrary member of a family of graphs. One reason for this is that local changes in a graph (e.g., removing an edge) can have global effects on the diameter. For an example of the practical relevance of the diameter of a graph, we consider those graphs which represent communications networks. The diameter of a network (graph) represents the length of the longest path that a signal must travel between two nodes in the network (vertices in the graph), assuming that a signal always uses the shortest available path. A signal propagates across a network in time proportional to the length of the path taken, and the quality of the signal degrades in proportion to the number of links in the network (edges in the graph) that the signal must traverse in order to reach its destination. Thus, the diameter of a network is a crucial measure of its performance. Suppose, then, that we wish to design a communications network, subject to the require-ment that there should be a path of no more than d links between any two nodes. A network that satisfies this requirement, and has the property that no subnetwork satisfies this requirement, is called diameter-critical in general, and d-critical if d is specified. The purpose of this thesis is to explore some results relating to two issues in the theory of diameter-critical graphs. One of these issues is straightforward: we would like to be able to efficiently find a diameter-critical subgraph of a given graph. This is useful for improving the efficiency (in terms of number of edges needed) of a graph which represents 3 Chapter 1. Introduction a network. If diameter is our chosen measure of performance, we would like to find a minimal graph of the same diameter as the graph that we were given. The algorithms that we have developed for this task are presented in Chapter 3. The other issue is more subtle, and relates to a conjecture in this field regarding the maximum number of edges that a critical graph of diameter 2 (2-critical graph) can have. Briefly, the conjecture states that a 2-critical graph of n vertices can have no more 2 than ^j- edges. In Chapter 2, we explore this conjecture by examining the properties of 2-critical graphs which have a large number of triangles. It turns out that the truth of the conjecture turns on whether it is possible to construct a 2-critical graph with 6(n3) triangles that have certain specific characteristics. As a practical example of the importance of the number of edges in a given (2-critical) graph, we observe that the design of a communications network involves balancing, among other things, the following three factors: • Cost and complexity: every link added to a network increases its cost and its complexity. Complex networks are more prone to failure of individual components, and their errors are more difficult to diagnose and repair. • The desire to have available alternate efficient paths between any two nodes in the network (in case a link fails or is busy). • The intended topology of the network. The topology is the general scheme for how the nodes are connected to one another, which may be dictated to a degree by hardware restrictions. For example, there is usually a limit to the number of links that can be connected to a given network node. We present a number of families of 2-critical graphs with many triangles, which therefore also have a large number of edges. The fact that there exist many such graphs with 4 Chapter 1. Introduction distinct topologies gives a network designer a number of options if they wish to use a 2-critical graph with a large number of edges as the network template. 1.2 Problem Domain 1.2.1 Weighted vs. Unweighted Graphs A weighted graph is a graph for which each edge has an associated (positive) weight. This weight can represent the cost of traversing that edge, or the time necessary to traverse it, or any similar scalar quantity. The diameter of a weighted graph is defined in similar fashion to that of an unweighted graph. The diameter is still considered to be the maximum of the distances between any pair of nodes in the graph, the distance between a pair of nodes (i,j) is defined to be the length of the shortest r/-path, and the length of an ij-path is defined to be the sum of the weights of the edges which comprise the path (rather than the number of edges which comprise the path, as in the unweighted case). This thesis considers only unweighted graphs, and we offer the following justifications: • When we consider communications networks, it is often the case that the crucial factor in determining the transmission time from one node to another has little to do with the lengths of the links between nodes in the network, because the signal travels at the speed of light along the link. Instead, the transmission time is largely determined by the time it takes for the router at a network node to decode the address of the intended recipient and send the signal along the correct link. This time may be quite large in comparison to the link transmission time, especially if there are several signals which.are attempting to pass through the node simultaneously. Thus, the total transmission time is often proportional to the number of nodes on the path from the source node to the destination node, or, in other words, the number of links traversed. In this case, the graph representing the network is effectively unweighted. 5 Chapter 1. Introduction • Suppose that we have a weighted graph W with a weight function w : E —» N {i.e., each edge has positive integer weight). We define a mapping from W = (V, E) to an unweighted graph U — (V, E') as follows: replace each weighted edge e = (i,j) G E with w(e) consecutive unweighted edges in E' which join w(e) — 1 new vertices in V placed between i and j . (See Figure 1.1, below.) Figure 1.1: Transformation from a weighted edge to a series of unweighted edges This mapping preserves the structure of the graph: for any vertex v £ V(W), there exists a corresponding vertex v' € V such that deg{v) = deg(v'), where deg(v) is the number of edges in E incident to v. If one considers the weight of an unweighted edge to be 1, the sum of the weights of the edges on the new r/'-path is exactly w(e). So long as the weights in W are not too large (i.e., the size of the maximum weight is polynomial in m and n) the new unweighted graph provides an efficient representation of the old one. 1.2.2 Directed vs. Undirected Graphs An undirected graph is one in which an edge e = (i,j) is an unordered pair of vertices, and thus can be traversed in either direction (either from i to j , or from j to i). By contrast, a directed graph is a graph in which an edge e = (i,j) is an ordered pair of 6 Chapter 1. Introduction vertices, i.e., e has an associated orientation and can only be traversed in the direction indicated by the ordering: i to j . We consider only undirected graphs for several reasons: • The conjecture which we address in Chapter 2 relates to undirected, unweighted graphs. • For at least two of the major classes of applications for diameter-critical networks (transportation and communication) it is reasonable to suppose in many cases that network links (graph edges) may be traversed in either direction. • In order to apply the results of Chapter 3 to directed graphs, we would need to make only a few minor modifications to our definitions and algorithms. For instance, the definition of diameter would remain the same for a directed graph D: "the longest distance between any two vertices i and j in D". The definition of distance, in turn, would be modified only slightly to read "the length of the shortest directed path between i and j " . The necessary changes to the algorithms are of a similar character and magnitude, and do not affect their efficiency or complexity measures at all. 1.3 Terminology and Notation • A graph G has edge set E or E(G), vertex set V or V(G), m = \E\ edges, n = \V\ vertices, and diameter diam(G) = d. • A graph which has diameter d and is critical with respect to the operation of edge removal is called d-critical. Formally, a graph G with diameter d is d-critical if and only if diam(G — e) > d. (Plesnik, in [12], refers to such a graph as an edge minimal graph of diameter d.) 7 Chapter 1. Introduction • An edge e is called d-critical if and only if diam(G) = d and diam(G — e) > d. • The degree of a vertex v, denoted deg(v), is the number of edges that are incident to v (or, equivalently, the number of vertices to which v is connected). We denote the maximum degree of any v G V{G) as D or D(G). Note that D is in 0{n). • For x,y G V(G), we use x ~ y to indicate that there exists an edge connecting x to y; the edge is denoted xy.-We use x ^ y to indicate that no such edge exists. • triangle: a cycle in a graph of length 3. • The set of triangles in a graph G is denoted A or A(G). • simple graph: a graph having no parallel edges or loops. All graphs that we consider are simple. • ij-path: a path, having no repeated vertices, from vertex i to vertex j . • Given S C V, (S) denotes the subgraph of G induced by S. • The neighbor set N(v) of a vertex v G V is defined to be the set of vertices adjacent to v, i.e., those vertices u G V such that v ~ u. • Given a connected graph G = (V,E), a spanning subgraph of G is a connected subgraph S of G with vertex set V and edge set E' C E. 8 Chapter 2 2-Critical Graphs with 0 ( n 3 ) Triangles The purpose of this chapter is to discuss some approaches to proving a conjecture regard-ing the maximum possible number of edges in a 2-critical graph. Pursuant to this, we will relate this conjecture to a result that frames the conjecture in terms of 2-critical graphs with 0(n3) triangles, present some examples of such graphs, and discuss the reasons why the examples are not counterexamples to the conjecture. 2.1 2-Critical Graphs and Triangles Graphs that have a large number of triangles are of particular interest in the context of constructing 2-critical graphs. A triangle in isolation is not, itself, 2-critical. In fact, it can readily be shown that all diameter 2 graphs with no triangles are 2-critical. Proposition 2.1 A simple graph G of diameter 2 which has no triangles is 2-critical. Proof: Consider two vertices i,j G V connected by an edge ij. Suppose that we remove ij from G. Any ij-path of length 1 would have to be an edge parallel to ij. Suppose, then, that there exists a zj-path of length 2 which passes through some vertex k; this would imply the existence of edges ik and jk. However, if such edges existed, they would 9 Chapter 2. 2-Critical Graphs with 6(n3) Triangles form the triangle ikj with the edge ij. Thus, ij must be 2-critical, and therefore G is 2-critical. • We can see from the proof that in order for a triangle to exist in a 2-critical graph, there must be some associated structure in the graph which requires the presence of each edge in the triangle in order to maintain diameter 2. Given these facts, one might suppose that 2-critical graphs have few triangles, if they have any at all. We are interested in 2-critical graphs with many triangles not only because they are perhaps counterintuitive or unusual, but as a byproduct of our interest in dense 2-critical graphs as mentioned in Chapter 1. A theorem due to Mantel [8] places a bound on the number of edges that a general graph can have if it has no triangles. (This theorem was proved later in a more general form by Turan [13], and is generally known as Turan's Theorem.) Theorem 2.2 (Mantel [8], Turan [13]) If a graph G has no triangles, then m < [n2/4], with equality holding if and only if There is a conjecture with very similar form, proposed independently by Simon and Murty (referenced in [3]) and Plesnik [10], that relates the number of edges to that of vertices in a 2-critical graph. Conjecture 2.3 (Simon and Murty [3], Plesnik [10]) IfG is a 2-critical graph, then m < [n2/4], 10 Chapter 2. 2-Critical Graphs with @(n3) Triangles with equality holding if and only if G = tf|-n/2Un/2j • / A result of Fiiredi [7] has proven the conjecture for n > n0. However, n0 is an inconceiv-ably (and inconveniently) huge number: it is a tower of 2s of height approximately 1014. Since, for practical purposes, we are usually interested in graphs which are smaller than this, further investigation is warranted. The conjecture has been proven by Fan [6] for n < 24 and n = 26. Fan also proved that the upper bound on m (for n > 25) is not greater than n2 n2 - 16.2n + 56 320 0.2532n2 Xu [14] presented a general proof of this conjecture; however, he retracted it later after finding a fault in his proof. If we consider only graphs with no triangles, the proof of the conjecture follows trivially from Turan's Theorem. The similarity between the conjecture and Turan's Theorem suggests that if one could bound the number of triangles in a 2-critical graph, this might contribute to a general proof of the conjecture. Pursuant to this goal, Caccetta and Haggkvist [3] demonstrated a result of particular interest: they showed that the conjecture is true for those graphs which have 0(n3~£) triangles with a certain property. In order to present this result, we must first establish some additional terminology. • Let (x,y,z) be an unordered triple of distinct vertices such that \({x, y, z})\ — 2. (Recall that (S) denotes the subgraph of G induced by S C V.) Without loss of generality, let x </- y. Then if N(x) D N(y) = {z}, then (x,y,z) is an extremal triple. Figure 2.1 shows an example of an extremal triple. 11 Chapter 2. 2-Critical Graphs with 6(n3) Triangles Figure 2.1: An example of an extremal triple S\ = {(x, y, z) : E{{x, y, z}) = {yz}, N(x) D N(y) n N(z) = {w}, and one or more of {(x,y,w), (x,z,w)} is an extremal triple}. (a) (b) (c) Figure 2.2: Members and non-members of Si In Figure 2.2, (a) and (b) are examples of members of Si; (c) is not, because neither (x,y,w) nor (x,z,w) is an extremal triple. Let r e A with vertices (x,y,z). (Recall that A denotes the set of triangles in G.) Without loss of generality, we select the edge xy from r. If G is 2-critical, xy must lie on the unique shortest path from some vertex a ^ r to either x or y. Thus, there is a 12 Chapter 2. 2-Critical Graphs with G(n3) Triangles vertex a G G — r such that a ~ s, or a ~ y, but not both, and a ^ z. Again, without loss of generality, let a ~ y. As a result, N(x) n A/'(a) = {y}. Therefore, (a, x, y) is an extremal triple, and (a,x,z) € Si. We say that r is associated with (a, x, 2), and observe that no two triangles in G can be associated with the same element of S\. Also, because each element of r must be critical, each edge of r belongs to an extremal triple. Thus, r must be associated with at least two elements of Si. In Figure 2.2(a), the triangle (w, y, z) is associated with one element of Si, to wit, (x,y, z). Finally, we define 5*2 = {(x,y,z) : (x,y,z) e A and is associated with exactly two elements of Si}. Proposition 2.4 (Caccetta and Haggkvist [3]) If G is a 2-critical graph, n2 j J 1 6 ( S 2 | ) \ This implies that the conjecture is true for the class of 2-critical graphs having the prop-erty that IS2I is in 0(n3~e), since in that case the value of the expression asymptotically approaches n2/4 as n increases without bound. Thus, we may restrict our search for graphs which might constitute counterexamples to the conjecture to those graphs with 0(n3) triangles. We note that while this is a necessary property of a counterexample, it is not a sufficient property; the proposition provides only an upper bound on the number of edges of such a graph, and not a lower bound. While we were initially unaware of any examples in the literature of families of 2-critical graphs with 0(n3) triangles, we found one example of such a family in a paper by 13 Chapter 2. 2-Critical Graphs with 6(n3) Triangles Plesnik; see [12]. However, this family appears to have this characteristic as a coincidental emergent property, since the family was constructed for a different (although related) purpose, which we address briefly in Section 2.1.1. Furthermore, Plesnik made no mention of this additional property in his paper. Therefore, we hope that the examples of 2-critical graphs with G(n3) triangles that we present, and our notes and observations on their construction and properties, may provide some useful insights into possible methods for proving (or disproving) the conjecture. Of course, Fiiredi's result guarantees that there are no infinite families of graphs which violate the conjecture. Thus, the only possible counterexample to the conjecture would consist of a finite set of 2-critical graphs with > ^- edges, rather than a family of such graphs. Nevertheless, we develop methods to construct families of 2-critical graphs with 0(n3) triangles because we hope that examining such graphs, and determining why they do not have enough edges to violate the conjecture, will shed some light on the question of whether it is possible to construct any graph which violates the conjecture. 2.1.1 2-Critical Graphs with All Edges in Triangles It is possible to construct families of 2-critical graphs which have the property that each edge is in at least one triangle in the graph. Plesnik [12] presented several such families, and designated them 2-MT graphs. We will show by means of examples that although 2-MT graphs might appear to be related to graphs with 0(n3) triangles, neither property is either a sufficient or a necessary condition for the other. The family of graphs A(r) depicted in Figure 2.3 on page 15 clearly has 0(n) triangles; by inspection, |A(A(r))| = r + 6. Figure 2.4 on page 16 shows a member of the graph family B(s, t), where s — deg(u) — 1 and t is the number of children of Vi (i > 0; v0 has t + 1 children). Using a proof similar 14 Chapter 2. 2-Critical Graphs with 6(n3) Triangles Figure 2.3: The family of graphs A(r) to the one that we will develop in the next section, we can show that B(s, t) indeed has 0(n3) triangles (where n, the number of vertices in B(s,t), is (s + l)(i) + s + 2). However, [12] did not contain a proof that \A(B(s,t))\ is in 0(n3), nor did it discuss the number of triangles in any family of graphs. At any rate, it is clear that while the classes of 2-MT graphs and 2-critical graphs with 0(n3) triangles are not disjoint, neither are they strongly related. The 2-MT property is neither necessary nor sufficient to yield a graph with 0(n3) triangles. 2.2 Examples of 2-Critical Graphs with 0 ( n 3 ) Trian-gles Proposition 2.4 showed that Conjecture 2.3 would be proven to be true if there did not exist any graphs such that l^l was in 0(n3). Each element of S? is a triangle with a specific property, as we discussed in Section 2.1. Thus, an obvious step towards proving (or disproving) the conjecture is to construct families of 2-critical graphs where |A| is in 0(n3) , and examine their properties. We have constructed and will present three 15 Chapter 2. 2-Critical Graphs with 6(n3) Triangles Figure 2.4: The graph 5 (2 ,3 ) different such families. We explored two approaches to constructing graphs with large numbers of triangles. The first approach involved using greedy heuristics in an attempt to find 2-critical subgraphs of Kn with many triangles; the second approach was based on the same greedy heuristics, starting with graphs that had triangles which were guaranteed to exist in any 2-critical spanning subgraph. The first approach is described by the algorithm below. Input: The complete graph Kn. Output: A 2-critical graph Gc C Kn. graph 2_cri t ical_tr iangular(graph G) { graph Gc; edge e,eT; vertex_set V; edge_set E,EC; 16 Chapter 2. 2-Critical Graphs with 0(n3) Triangles i n t e g e r TG,Te; boolean critical; V <— v e r t i c e s ( G ) ; E <r- edges (G) ; EC^E; TG^0; critical 4— FALSE; whi le ( c r i t i c a l = FALSE) do { Gc -f- def ine_graph(V, Ec) ; eT ^ 0 ; fo r each e G E do i f ( d i a m « 3 c - e ) < 2) { Te 4— num_tr iangles (G c — e) ; i f (Te>TG) { eT <-e; TG^Te; } } i f ( e r ^ 0 ) -E'c «- Ec — er; e l s e critical - TRUE; } r e t u r n Gc; } The heuristic can be changed from "maximize triangles" ("at each stage, find the non-critical edge whose removal induces a subgraph with the maximum number of triangles") to "minimize triangles" by replacing the (Te > To) comparison with (Te < TG). It turns out that both the maximizing and minimizing heuristics yield (apparently due to some cosmic graph-theoretic joke, as far as I can tell) graphs with no triangles. The first heuristic yields KiiTl-i; the second yields Km<n^m where m = 2k,k = maxfc{/c : | < 17 Chapter 2. 2-Critical Graphs with 0(n3) Triangles The second approach is this: build a graph around a triangle r in such a fashion that each edge of r is critical. Add as many other edges to the graph as possible, subject to the restriction that the edges of r remain critical. Using the resultant graph as a starting point, apply one of the greedy heuristics described in Approach 1. Figure 2.5 shows the basic idea behind such a graph. If we define the vertex sets A = {a,j} and B = {bi}, then in order for this graph to be 2-critical, we must have A n B = 0, but that some of the vertices of A may coincide. (Note that this graph has diameter > 2; this is a conceptual framework on which a 2-critical graph is to be constructed, rather than the graph itself.) Figure 2.5: Original concept for 2-critical graph with critical triangle (x,y,z) 18 Chapter 2. 2-Critical Graphs with 0(n3) Triangles 2.2.1 T h e G r a p h Family Qp After some experimentation with this basic structure, we were able to construct a family Qp of 2-critical graphs. The first four members of it are shown in Figure 2.6 on page 30 and Figure 2.7 on page 31. The properties of this family are as follows: • The vertices divide naturally into three disjoint subsets A, B, C. The three out-ermost vertices comprise subset A = {a i ,a2 ,a 3 }; the three vertices of the middle "layer" comprise subset B = {6i, &2, ^3}, and the remaining vertices comprise sub-set C. (Note that subset C includes the original triangle ({x,y,z}) of our tem-plate). Subset C is further divided into three subsets C\ = {cu(= x), C i 2 , . . . , Cip}, C2 = {c2 i(= y), c22, • • •, c2p}, and C3 = {c31(= z),c32, •••, c3 p}. • The edges also divide naturally into four disjoint subsets Ei, E2, E3, E±. - E1 = {(ai,bj) : at e A,bj e B} - E2 = {{bi, dj) :biEB, ci:j e Cu d C C} - E3 = {(cik,cjk) : cik e Chcjk G Cj,j ^ 1} - Ei = {(ctj, clk) : i 7^ 1, j ' ^ k} Given Qp, the inductive step that creates Qp+i consists of adding one vertex to each of the vertex subsets Cj. The new vertices are connected to the vertices of Qv via new edges in E2,E3, and EA. Thus, a member Qp of this family of graphs can be constructed as follows: we define Q0 to be the graph in this family that has no vertices in C and consequently no edges in E2, E3, or E^. Starting with Go, attach p vertices to each vertex of B. Connect c^ - to ckj V/c ^ i. Finally, connect the vertices of C2 and of C3 so as to form two copies of Kp. 19 Chapter 2. 2-Critical Graphs with 9(n3) Triangles To summarize, a member Qp of this family of graphs has the following properties: 1. Va; G A : N(oi) = B via Ex 2. VbiEB : JV(6i) = A U Q via £ 2 3. V? such that c^ - G Q : Cij ~ C2j, Cy ~ C3J, C2j ~ 03^ via S3 4. Vj, A; such that c^-, Cj^ ^ Ci,j ^ k,i ^ 1 : Cij ~ c^ via £4 5. |A| = |B| = 3, | C i | = p , |V(Sp)| = 3p + 6 Propos i t ion 2.5 diameter(Qp) = 2 , V p G N . Proof: We prove this proposition by examining the shortest path lengths dg (u,v) be-tween vertices u and v in all pairs of vertex subsets. We will make use of the properties of Gp listed above. • Case 1: u G A,v G A. By property 1, 35; G B such that u ~ 6; and -y ~ ^ ; thus there exists a path u, 6j, v of length 2. • Case 2: u G A,w G B. By property 1, u ~ u. • Case 3: u 6 A,« 6 Cj. By property 1, u ~ &,; similarly, by property 2, w ~ 6j. Thus, there exists a path u,bi,v of length 2. • Case 4: u = bi,v G Q . By property 2, u ~ v. 20 Chapter 2. 2-Critical Graphs with Q(n3) Triangles • Case 5: u = bi,v G Cj,j ^ i. Let v = Cjk- By property 2, it ~ c^; by property 3, v ~ c^. Thus, there exists a path it, Cjfe, i> of length 2. • Case 6: u £ Ci,v £ Ci. By property 2, it ~ 6j and i> ~ 6j. Thus, there exists a path it, bi, v of length 2. • Case 7: u = cik, v = cjk, i ^ j . By property 3, it ~ v. • Case 8: u = Cik, v = Cji, i ^ j , k ^ I. By property 3, u ~ Cjk] similarly, by property 4, v ~ c^ . Thus, there exists a path it, Cjfc, i> of length 2. • Propos i t ion 2.6 Qv is 2-critical V p G N . Proof: Since we have already demonstrated that £/p has diameter 2, we need only demon-strate that the edges in each of Ei, E2, E3, E4 are 2-critical. We will again make use of the properties of Qp listed above. • Case 1: (u,v) G E\. Without loss of generality, let u = aj and v = bj. By property 1, u is adjacent only to vertices in B. By properties 2 and 3, v is not adjacent to any vertex in B. Thus, (u, v) lies on the unique path of length < 2 connecting u to v and is therefore 2-critical. • Case 2: (u,v) G £ 2 -Without loss of generality, let u — bi and v — Cij, and select an arbitrary vertex ak £ A. By property 1, ak is adjacent only to vertices in B, so any path of length 21 Chapter 2. 2-Critical Graphs with 0(n3) Triangles < 2 from ak to v must pass through some vertex bi 6 B. In particular, any such path must include a path of length 1 from bt to v. By properties 2 and 3, the only vertex in B that is adjacent to v is u. Thus, (u, v) lies on the unique path of length < 2 connecting ak to -u, and is therefore 2-critical. • C a s e 3: (u, v) G E3. Without loss of generality, let u = cik and v — Cjk. By properties 2 and 3, the vertex bi is adjacent only to vertices in A and d. By property 1, vertices in A are adjacent only to vertices in B. Thus, any path of length < 2 from bi to any vertex in C must pass through a vertex en G C,. By property 4, the only vertex in d that is adjacent to v is it. Thus, (u,w) lies on the unique path of length < 2 connecting bi to v, and is therefore 2-critical. • C a s e 4: (u, v) G £4. Without loss of generality, let u = Cij and v = Cik, i 7^ 1- By properties 3 and 4, the vertex cij is adjacent to {bi,u, Qj(l / i)}. By properties 2 and 3, b\ is not adjacent to any vertices in d(i ^ 1), so there is no path of length < 2 from c\j to v passing through b\. Similarly, by properties 3 and 4, cij is adjacent to only one vertex in d iu) s o there is no path of length < 2 from C\j to v passing through cy. Thus, (u, u) lies on the unique path of length < 2 connecting cy to v, and is therefore 2-critical. • Now that we have established that Qp is 2-critical, we will calculate |A(£P) | and m = 1^ (^ )1-By property 3 of Qp, the edges in E3 form p triangles. By property 4 of Qp, the edges in E4, in conjunction with the edges in E2 that connect b2 to the vertices of C2 and 63 to the vertices of C3, form 2 cliques of order p+1. There are no other triangles in Qv. Thus we have 22 Chapter 2. 2-Critical Graphs with 0(n3) Triangles \A(gp)\=P + 2(p+3V By property 5 of Qp, \V{gp)\=n = 3P + 6=>P=—^-. This gives us a formula for |A(C/P)| in terms of n, which we can simplify to: . A ,„ x. n3 - 18n2 + 126n - 324 |A(<5P)| = Note that we have established that there exists a family of 2-critical graphs Qv with 0(n3) total triangles, not that \S2(QP)\ G G(n3), so this result is not sufficient to disprove the conjecture. In fact, m < n2/4 for Qp, as we will now demonstrate. Since E = {£], £2 , £3, £4}, we can calculate the size of each Ei individually and then calculate £) |£i| = m. By properties 1 and 5 of Qp, \EX\ = 9. By properties 2 and 3 of Qp, \E2\ = |£31 = 3p. Finally, by property 4 of Qp, E4 is comprised of 2 cliques of order p, so I £41 = 2 m = p2 — p. Thus we have m = 9 + 3p + 3p + p2 — p = p2 — 5p + 9. As above, by property 5 of Qp, p — '°^. This gives us the formula for m in terms of n: n2 + 3n + 27 m = 2.2.2 The Graph Families Up and Wj v The central idea behind constructing the family Qp turned out to be the idea of replicating triangles in a graph in such a fashion that for each replicated triangle, two of the vertices • 23 Chapter 2. 2-Critical Graphs with Q(n3) Triangles must form distinct cliques with all other similar vertices. Since cliques of order 0 ( n ) have 0 (n 3 ) triangles, and we added a constant number of vertices to cliques with each increase in the order of Qp, the number of triangles in Qp must be 0 ( n 3 ) . The recognition of this pattern led to the speculation that we could use this heuristic to construct additional families of 2-critical graphs with 0 (n 3 ) triangles. We were successful in this, and as a result we were able to construct the families 7ip and Wp. We will not present the properties of these families in detail as we did for Qp, but we do make note of |A| and \E\ for each family. The graphs in Figure 2.8 on page 32 are the first three elements of the family H,p of 2-critical graphs. • Tip has three disjoint sets of triangles: the replicated triangles, a clique of order p and another clique of order p + 2. | A ( « P ) | = 2+72n-162 81 . | £ ( f t p ) | = s!±2pi8 The graphs in Figure 2.9 on page 33 are the first three elements of the family Wp of 2-critical graphs. • Wp has three disjoint sets of triangles: the replicated triangles and two cliques of order p + 1. . |A(W„)| = "3-1 2 n 2 8 + i6 6 n~1 3 6 . \E{Wp)\ = ^ ^ We define a triangle r G G to be vertex-isolated if it has no vertices in common with any other triangle in G, and edge-isolated if it has no edges in common with any other triangle in G. Note that vertex-isolation implies edge-isolation. 24 Chapter 2. 2-Critical Graphs with 0(n3) Triangles We observe a few facts about the families of graphs that we have generated: • In each case, |A| = fj + o(n3) and m = \E\ — ^ + o(n2). • The replicated triangle in Q\ and W\ is vertex-isolated; the replicated triangle in Tip is only edge-isolated. Our experiments in generating these families suggest that in order for a triangle in a 2-critical graph to act as a "seed"-a triangle whose replication generates a family of 2-critical graphs with 0 (n 3 ) triangles-it must be edge-isolated. Attempts to use a non-edge-isolated triangle as a seed resulted in graph families that did not exhibit the clique growth that we see in our examples, and that did not have a large number of triangles or edges. "Seed" triangles which are edge-isolated but not vertex-isolated (such as rlp) seem to result in an asymmetry in the sizes of the generated cliques. We denote a vertex which is shared by several triangles in the original graph (including the "seed") as vs, and observe that the asymmetry is apparently due to the necessity of connecting the replicated copies of vs to each vertex of the triangles of which vs was a part in the original graph. These connections cause the clique including vs to be larger (by a constant number of vertices) than the clique whose members are the copies of a vertex which is part of only one triangle in the original graph. However, since the difference in the clique sizes is only a constant, the presence or absence of the property of vertex-isolation in the "seed" does not appear to affect the asymptotic number of triangles or edges of a graph family. 2.3 Analysis of our Examples We have presented several families of 2-critical graphs with 0 (n 3 ) triangles; nevertheless, none of them have had enough edges to provide a counterexample to the conjecture. In 25 Chapter 2. 2-Critical Graphs with 6(n3) Triangles the following two sections, we will explore in detail the reasons why this is the case here, and why it may be not be possible to construct such a counterexample. 2.3.1 Associated Triangles Recall that Proposition 2.4 states that a counterexample to the conjecture would not only have 0 (n 3 ) triangles, but would have G(n3) members of 52 (see page 11 for the definition of S2 and related terms). The families of graphs that we have constructed, as well as Plesnik's B(s,t) in [12], have all originated in the replication of some appropriate triangle in the original 2-critical graph, and at the concomitant growth of cliques whose vertices are comprised, in part, of the similar copies of the vertices of the replicated triangles. Thus, each vertex of one of these cliques is incident to at least two other vertices not in the clique: the vertices of the replicated triangle. As an example, consider, without loss of generality, the triangle (C21, C22, C23) in Q4 (whose picture is in Figure 2.7 on page 31). As we demonstrated in the proof that Qp is 2-critical, the edge C22C21 lies on the unique path of length < 2 from c i 2 to c2i. Since ci2 7^ c2i and N(ci2) nJV(c2i) = {C22}, we know that (ci2, C21, C22) is an extremal triple. Furthermore, since iV(ci2)niV(c2i)n./V(c23) = {C22}, we know that (ci2, c22, C23) € S\. Thus, the triangle (C21, c22, C23) is associated with (ci2, C22, C23). Using similar arguments, we can show that (c2i,C22,c23) is also associated with (c32, c23 ,c2i), (c33,c2i, c22), and (c13, c2i, c22), which are also members of S\. Thus, (c2i,c22,c23) is associated with at least four members of Si, and is therefore clearly not a member of S2-This argument applies to any triangle in any of the cliques in the examples of 2-critical graphs with Q(n3) triangles that we have seen. Since the number of triangles in these families of graphs that are not in the cliques is clearly in 0 (n 3 ~ £ ) , none of these families 26 Chapter 2. 2-Critical Graphs with Q(n3) Triangles are candidates for providing a counterexample to the conjecture. Furthermore, since the existence of the cliques was made possible by the replication of the attached triangles, it seems unlikely that one can construct a graph which has enough triangles to violate the conjecture without "disqualifying" most of the triangles by forcing them to be associated with too many members of S\. 2.3.2 Cliques, Triangles, and Edges 2 Recall that Conjecture 2.3 states that the number of edges in a 2-critical graph is < ^-, 2 and that the only 2-critical graph with ^ edges is if[„/2"|,|n/2j-We denote an arbitrary family of 2-critical graphs among those that we have discussed as Xp. For any Xt, then, we construct Xi+1 by adding a triangle r to Xi. Of course, some additional edges beyond those of r are also used to connect r to Xt and to insure that Xi+i has diameter 2, but we wish to emphasize that three vertices are added for every increase in p of 1. By contrast, if we set p = n/2, then for •K'[p"|,Lpj, we know that p increases by 1 for every two vertices that we add. Furthermore, for every such r added to Xp, two vertices of r contribute to disjoint cliques of order 0(p) in Xp, and thus these vertices have degree Q(p). We define the set EK to be the set of edges in Xp which are part of one of these cliques, and note that the remaining vertex in r invariably has degree 3. Thus, asymptotically, we can characterize |£,(A'p)| by considering only \EK(XP)\; it seems clear that |£'^(A'?,)| G Q(\E(XP)\). Therefore, we have the following: •j where b is the number of vertices in the graph which we add to Xi in order to form Xi+i (r is an example of such a graph), q^ is the number of vertices which contribute to the 27 Chapter 2. 2-Critical Graphs with 9(n3) Triangles growth of clique j , and n is, as usual, |y(A^)|. (Note that in general, J2j Qj ¥" b.) As we explore the use of clique-growing in the construction of families of 2-critical graphs which might provide counterexamples to the conjecture, we must consider several possible choices of qj and b: • qx = l,q2 = 1,6 = 3. These are the values of qj and b for all the examples that we have presented. In this case, | £7(^)1 = \ + o(n2). • qx = 1, q2 = 1,<?3 = 1,6 = 3. In this case, |J5(^P)| = ^ + o(n2). • qi — 1, q2 — 1, b — 2. In this case, 1^(^)1 = \ + o(n2). Of course, in this case we are no longer adding triangles but 2-cliques (two vertices connected by an edge). • qi = 2, b = 3. In this case, two vertices of three contribute to a single clique, and \E(Xp)\ = 2-£ + o(n2). None of these cases have enough edges to violate the conjecture, with the possible exception of the third (since we might asymptotically approach ^ from above). However, since \E(K^n/2],[n/2\)\ = \ , we suspect that any 2-critical graph whose number of edges is — + o(n2) may be isomorphic to i^[n/2],[n/2j • • qi = 2,q2 = l,b = S. In this case, 1^(^)1 = ^- + o(n2). • qx = b. In this case, ^(A^,)! = %- + o(n2) for any value of b. As a result of these observations, we make the following conjecture: C o n j e c t u r e 2.7 Suppose that Xv is a family of 2-critical graphs such that given Xi, we generate Xi+i by adding a graph B = (VB,EB) to Xi in such a way that some S C B combines with existing cliques in Xi to form cliques in Xi+X. Then \E(XP)\ > ^ only if Ly* + e vertices of B become part of a single clique in Xi+i for any £ > 0. 28 Chapter 2. 2-Critical Graphs with 0(n3) Triangles This conjecture, if true, might seem to suggest that we should use techniques other than 2 clique-growing in our attempts to construct a 2-critical graph with > ^ edges. However, recall that Proposition 2.4 requires that any such graph must have 0(ra3) triangles, and thus must have structures in it that are not overly different from cliques. In summary, we hope that our focus of attention on certain well-defined issues arising from Conjecture 2.3 (and its relation to Proposition 2.4) will make it easier for researchers to prove or disprove it. 29 Chapter 2. 2-Critical Graphs with 0(n3) Triangles Figure 2.6: & , & 30 Chapter 2. 2-Critical Graphs with 0(n3) Triangles Figure 2.7: & , & 31 Chapter 2. 2-Critical Graphs with 0(n3) Triangles Figure 2.8: ri1,n2,riz 32 Chapter 2. 2-Critical Graphs with 0(n3) Triangles Figure 2.9: Wi,W2j>V3 33 Chapter 3 d-Critical Spanning Subgraphs This chapter is devoted to the discussion of methods for identifying d-critical spanning subgraphs. We will review some methods of calculating the diameter of a graph and of recognizing a d-critical graph. We will then present efficient algorithms for identifying 2-critical and 3-critical spanning subgraphs of graphs of diameters 2 and 3, respectively. Finally, we will discuss the problem of efficiently identifying a d-critical spanning sub-graph of a graph of diameter d where d > 4. 3.1 Calculating the Diameter of a Graph Recall that we defined the diameter of a graph G to be the longest distance between any two vertices i and j in G, where the distance do(i,j) is itself defined as the length of the shortest zj-path in G. From this definition, we derive two methods of calculating the diameter of a graph. 3.1.1 Breadth-First Search Trees The first method is quite straightforward: we make use of the fact that, for a breadth-first search (BFS) tree T generated from a graph G having diameter d, and rooted at v £ V(G), the shortest paths from v in T are exactly the shortest paths from v in G. Thus, a BFS tree rooted at any vertex in a graph G must have height < d, and the tallest 34 Chapter 3. d-Critical Spanning Subgraphs of these BFS trees will have height = d. Generating a BFS tree requires 0(m) time, since each of the m edges in G is traversed once in a breadth-first search, and each traversal requires 0(1) time. Since there are n vertices, the process of generating the BFS trees requires 0(mn) time. Furthermore, we can keep track of the height of a BFS tree as we build it, and calculate the maximum height among all BFS trees, using no more than 0(mn) additional time. Therefore, the total time required to calculate the diameter using this method is 0(mn). 3.1.2 Matrix Multiplication The ideas behind calculating the diameter of a graph using matrix multiplication are somewhat more involved. We define the adjacency matrix A of a graph G as follows: A(i,j) = < 1 ife=(i,j)eE 0 otherwise We further define Ak inductively as follows: A1 = A, and Ak+1 = AAk. We denote the adjacency matrix (A + I)k as Ak, and the (i,j)th entry of Ak is denoted Ak(i,j). Finally, we say that Ak > 0 if every entry in Ak is > 0. We denote the complexity of multiplying two n x n matrices as M(n), since there are several algorithms for matrix multiplication, each with its own advantages. Coppersmith and Winograd [4] have established that M(n) E 0(n2,377) for general matrices. However, we observe that an adjacency matrix is a Boolean matrix: all elements are in {0,1} and the sum and product operations on rows and columns are equivalent to the (bitwise) log-ical OR and AND operators, respectively. There is a practical algorithm for multiplying Boolean matrices, known as the "four Russians" algorithm (due to Arlazarov, et al. [2]), whose refinement, described in [9], has a time complexity of 0 (n 3 / lg2 n). In practice, the 35 Chapter 3. d-Critical Spanning Subgraphs refined "four Russians" algorithm tends to be more efficient on Boolean matrices than Coppersmith and Winograd's algorithm does. P r o p o s i t i o n 3.1 Ak(i,j) > 0 only if there is a path from i to j of k or fewer edges. Proof: We will prove this result by induction on k. Base case: A\ > 0 if and only if i and j are connected by an edge, by the definition of an adjacency matrix. Inductive step: We must show that if Ak(i,j) > 0 exactly when there exists an i j-path of length < k, then Ak+\{i,j) > 0 exactly when there exists an ij-path of length < k + 1. Ak+i(i,j) > 0 exactly when there is at least one nonzero element Ak(i,l) and a corre-sponding nonzero element Ax(l,j), by the definition of matrix multiplication on nonneg-ative matrices. Furthermore, by definition, if A\(l,j) > 0, then I is connected to j , and by inductive assumption, if Ak(i, I) > 0, then there exists an il-p&th of length < k. Thus, if Ai(l,j) > 0 and Ak(i, I) > 0, then there exists a path (i,..., l,j) of length < k + 1. • Propos i t ion 3.2 Ak+i > Ak. Proof: This follows directly from Proposition 3.1. If there exists an r?-path of < k steps, there is certainly an r/'-path of < k + 1 steps, so Ak+i is bounded from below by Ak. Furthermore, there may exist an ij-paih of k + 1 steps where there exists no zj-path of k steps, so Ak+i may be greater than Ak. • L e m m a 3.3 diam(G) is the smallest k for which Ak > 0. Proof: Ak > 0 means that there is a path of length < k between each pair of vertices 36 Chapter 3. d-Critical Spanning Subgraphs i, j E V; or, in other words, that the distance between any pair of vertices in V is bounded above by k. Suppose that k was the smallest power of (A+I) such that Ak > 0, but diam(G) = d < k. This would imply that the distance between any pair of vertices was < d, but by this assumption, Ad has some 0 entries, so there is some pair of vertices for which the shortest path is longer than d ==> contradiction. If Ak > 0, we know that the distance between any pair of vertices in V is bounded above by k, so the diameter, being the maximum of all distances in G, can be no more than k. * Theorem 3.4 We can calculate the diameter of a graph G in 0(M(n) Igd) time, where d = diam(G). Proof: Recall that d = min^jAfc > 0}. We generate powers oi A\ by repeated squaring: (Ai)2 —> A2, (A2)2 —> A i , . . . , (A2i)2 —» A2i+i, stopping at iteration j — 1, when A2i > 0. We store the matrices that we have generated for use later in the algorithm. The A2i > 0 test at each iteration requires 0(n2) time, and there are j — 1 iterations. By the definition of d given above, we know that j = [lgcf|. Therefore, thus far we have executed Igd iterations, each of which required 0(M(n) + n2) = 0(M(n)) time, for a total of 0(M(n) Igd) time. Again, by the definition of d, we know that d lies in the interval [2J_1, 2J], so we can use binary search to find d. We make use of the following facts: • The size of the interval that we are searching is always a power of 2: initially the interval size is 2-7-1, and thereafter, each iteration of the binary search cuts the interval exactly in half. 37 Chapter 3. d-Critical Spanning Subgraphs • The matrix Ad can be decomposed into a product of matrices of the form (Ai ) 2 \ In particular, matrix (Ai)2* contributes to this product if and only if the ith bit of d is 1. One can interpret the stages of the binary search as determining, starting from the most significant bit of d, whether the successive bits (in decreasing order of significance) are 0 or 1. We already know that the most significant bit of d is 1, by definition. These facts, in conjunction with the generation of the matrices of the form {Ai)T in the first part of the algorithm, allow us to achieve our desired bound of O(lgd) matrix multiplications for the algorithm. A more naive approach would involve doing binary search on the interval [1, n] (note that d < n — 1). However, if n is not a power of 2, the midpoint of the current interval (represented by the matrix Ak) would be most efficiently generated by a process of repeated squaring followed by binary search. This is exactly what our algorithm does to determine Ad, but the naive algorithm would require such a calculation at each stage of the binary search for Ad- This would require O(lgn) matrix multiplications at each stage of the binary search, and would yield an overall bound of 0 ( l g 2 n ) , rather than O(lgd), matrix multiplications for the algorithm. To make it easier to deal with the boundary cases, we search for the largest i such that A{ ^ 0, rather than the smallest k such that Ak > 0. This i is of course d — 1, so we merely add 1 to the result of the binary search to yield the diameter. The search then proceeds as follows: we initialize Ak to A2j-i, initialize i to j — 2, and continue as long as i > 0. This algorithm is formalized below in the function d iamete r_bsea rch( ) . It assumes that the matrix set A has already been generated by the repeated squaring process described above. The function s e t_b i t (d, j , n) sets the ith bit of the integer d to n, where n e {0,1}. 38 Chapter 3. d-Critical Spanning Subgraphs Input: A set of matrices A = {A2k : k = 0 , 1 , . . . , |~lg d]}. Output: The diameter d of the graph from which A was generated. graph d iameter_bsearch(mat r ix_se t A, i n t e g e r j) { i n t e g e r i,d; mat r ix B; B <- A2i-i; d <- 2*-1; fo r i from j — 2 downto 0 do { i f {BA2i > 0) { B <r- BA2,; s e t _ b i t ( d , i, 1) ; } } r e t u r n d + 1; } This is not exactly the usual form of a binary search routine, so perhaps some explanation is in order. A binary search routine proceeds by determining, at each stage, whether the search object is above or below the midpoint of the current interval. If above, the new interval is bounded by the midpoint and the upper bound of the current interval; if below, the new interval is bounded by the lower bound and midpoint of the current interval. Recall that we are looking for the largest i such that Aj ^ 0. The matrix B is simply the current lower bound for this A{ in the search. The index i specifies the midpoint of the search interval by indicating the matrix A2« which we will multiply by B to see whether the matrix product is > 0. We do not need to represent the upper bound of the search interval explicitly. There are j — 1 iterations of this algorithm; we have already established that j = [lg d]. 39 Chapter 3. d-Critical Spanning Subgraphs Each iteration requires 0(1) matrix multiplications, 0(n2) work to determine whether BAi > 0, and one call to s e t J b i t Q , which we will assume requires 0(1) time. Thus, the binary search requires a total of lg d(M(n) + 0{n2)) = 0(M(n) lgd) time. Therefore, as the repeated squaring was also determined to require 0(M(n)\gd) time, the algorithm as a whole requires 0(M(n) lg d) time. • We observe that the BFS tree algorithm is asymptotically faster than the matrix multi-plication algorithm for sparse graphs, and vice versa for dense graphs. As we mentioned earlier, the "four Russians" Boolean matrix multiplication algorithm described in [2] runs in 0 (n 3 / lg2 n) time. Thus, if m — 0(n2/ lg2 n) then we would use BFS trees to calculate the diameter, and if m 6 f2((n2/lg2 n)l+£) then we would use matrix multiplications. Thus, we say that we can calculate the diameter of a graph in 0(min{mn, M(n) Igd}) by determining the density of the graph (which requires 0(1) time) and choosing the appropriate algorithm. We note that this is an improvement over the bound given by Anstee and Caccetta in [1], who cited an 0(min{mn, M(n) lgn})-time bound for diam-eter calculation. 3.2 Recognizing d-Critical Graphs A naive approach to recognizing diameter-critical graphs consists of m diameter compu-tations: given a graph G such that diam(G) = d, if 3e 6 E such that diam(G — e) > d, then G is diameter-critical. This algorithm runs in 0(min{m2n, m • M{n) Igd}) time. In [1], Anstee and Caccetta developed several more efficient algorithms for recognizing diameter-critical graphs: • An 0(M(n))-time algorithm for recognizing 2-critical graphs. • An 0(nm + M(n))-time algorithm for recognizing 3-critical graphs. 40 Chapter 3. d-Critical Spanning Subgraphs • An 0(n2m)-time algorithm for recognizing cZ-critical graphs of any diameter d. We are not aware of any algorithm which is more efficient than the general one given in [1] for graphs of diameter > 4. These results are relevant in that they contribute to an understanding of the issues involved in the development of efficient algorithms for identifying d-critical spanning subgraphs. We will discuss this more fully in the next section. 3.3 Identifying d-Critical Spanning Subgraphs We are interested in identifying, for a given graph G with diameter d, spanning subgraphs of G which are d-critical. We conjecture that the corresponding optimization problem, i.e., finding the d-critical spanning subgraph of G which has minimum weight, is NP-hard; it is known that the following closely related problem is NP-hard. Theorem 3.5 (Plesnik [11]) The following problem is NP-hard: given an edge-weighted graph G with cost function c : E —> R and an integer B, find a spanning subgraph F of G with cost Y^eeE(F) c(e) < B and w^h diam(F) minimized. We consider instead finding efficient solutions to this problem: ci-Critical Spanning Subgraph Problem: Given a graph G such that diam(G) — d, find a spanning subgraph Gc of G such that Gc is minimal with respect to the operation of edge removal and has the property diam(Gc) = d. The following algorithm provides a straightforward (if somewhat naive) general solution to this problem for any value of d. Input: A graph G = (V, E) having diameter d. Output: A spanning d-critical graph Gc C G. 41 Chapter 3. d-Critical Spanning Subgraphs graph na ive_c r i t i c a l_ subg raph (g raph G, i n t e g e r d) { graph GC,G'; edge e; ve r t ex_se t V; edge_set E, Ec; V <r- ve r t i c e s (GO ; E «- edges (GO; EC^E; f o r each e 6 E do { G" <S- def ine_graph(l/ ; Ec - e) ; i f (diam(G') < d) £ c ^ £ c - e ; } Gc 4— def ine_graph(V^, i?c) ; r e t u r n Gc; } Since each edge is considered exactly once, and the graph considered in each call to the diameter computation routine is a subgraph of G, the complexity of this algorithm is that of m diameter computations. Thus, we have an algorithm which runs in time 0(min{m 2 n, ralgd • M(n)}). (Note that this is exactly the complexity of the naive approach to diameter-critical graph recognition.) It is interesting to observe that, for graphs of diameter 2 or 3, we can achieve es-sentially the same time bounds for the d-critical spanning subgraph problem by us-ing the d-critical graph recognition algorithms described in [1]. (Note that , for fixed d, n a i v e _ c r i t i c a l _ s u b g r a p h ( ) , as implemented above, runs in time 0(min{m2ro,m • M(n)}).) Specifically, we replace, in n a i v e _ c r i t i c a l _ s u b g r a p h ( ) , the calculation and test of the diameter (the call to diam(G') and comparison to d) with a call to the ap-propriate recognition algorithm: i s _ 2 _ c r i t i c a l ( G ' ) or i s _ 3 _ c r i t i c a l ( G " ) . For graphs of diameter 2, this yields an algorithm which runs in 0(m • M(n)) time; for graphs of 42 Chapter 3. d-Critical Spanning Subgraphs diameter 3, the resultant algorithm runs in 0(m2n + m • M(n)) time. As with the recognition problem, we suppose that it is possible to construct an algorithm that is more efficient than these naive approaches. Neither calculating the diameter, nor testing the graph for diameter-criticality, after the removal of each edge seems terribly efficient. While we recognize that diameter is a global property of a graph, we would like to be able to eliminate some of the redundancy in the m diameter computations. It turns out that we can indeed improve the efficiency of the naive algorithm, at least for graphs of diameter 2 or 3. In order to make improvements on the bounds thus far achieved, we will retain the original call to diam(G'), and alter the function called so that it computes the diameter using BFS trees, and updates the BFS trees efficiently, as necessary, when an edge is deleted from the graph. We make use of the fact that we established earlier: that the diameter of G is exactly equal to the height of the tallest BFS tree generated from the vertices of G. Formally, then, for a given graph G, we construct a set T of the n BFS trees rooted at the vertices of G, and designate Tv to be the BFS tree rooted at vertex v. Then we have diam(G) = msix{height(Tv) : Tv e T} This fact implies that, if we can efficiently update the BFS trees as edges of G are deleted, we can speed up the naive algorithm for identifying a diameter critical spanning subgraph of G. We attempt to remove each edge e from G and obtain the BFS trees T for the graph G — e. We then check to see whether any Tv 6 T has height greater than d. If there is any such Tv, then e is critical and is not removed. Otherwise, we remove e from G and update the BFS trees. Input: A graph G = (V, E) having diameter d. Output: A spanning d-critical graph Gc C G. 43 Chapter 3. d-Critical Spanning Subgraphs graph crit ical_subgraph(graph G, integer d) { graph Gc; vertex_set V; edge_set E, Ec; vertex v; edge e; b f s _ t r e e _ l i s t T,T'; boolean critical; V <— ver t ices (GO; E 4- edges (G) ; EC^E; for each v G V do Tv 4- create_BFS_tree (V, E, v) ; for each e G E do { critical «— FALSE; for each v G V do { while (critical = FALSE) do { Z^ •f- remove_edge(TlJ, e) ; i f (height (T^) >d) critical <— TRUE; } } i f (critical = FALSE) { Ec^- Ec-e; for each v G V do } } Gc <— def ine_graph(l/, i^c) ; return Gc; 44 Chapter 3. d-Critical Spanning Subgraphs Figure 3.1: An example of a BFS tree and its virtual edges 3.3.1 Properties of Breadth-First Search Trees There are a few properties of BFS trees (beyond those which we have mentioned previ-ously) which are worthy of mention here, as they have a direct bearing on the implemen-tation of the restructuring routine, called here remove_edge(). • A BFS tree of a graph G has only n — 1 edges, while the number of edges in G may be 0(n2) . Therefore, there may be as many as 0(n2) edges in G that do not appear in a particular Tv. We refer to such edges as virtual edges for the tree Tv. Figure 3.1 on page 45 shows a BFS tree with its virtual edges; the BFS tree edges are solid, and the virtual edges are dotted. • We denote the level of a vertex k as l(k), or, if it is not clear which BFS tree we are considering, l(Ti,k), where the ordered pair (Ti,k) designates vertex k in BFS tree Tj. The level of a node is defined recursively as follows: - l{Ti,i) = 0. - Let j be the parent of k in T-. Then l(T{, k) = l(Tuj) + 1. Thus £(T*, k) is the number of edges on the path in T, from the root (vertex i) to k; or, in other 45 Chapter 3. d-Critical Spanning Subgraphs words, the length of the shortest ifc-path. ' • Since the BFS tree Tk consists of the shortest path tree from k to all other vertices in G, a vertex will never be repositioned nearer the root of its BFS tree as a result of the deletion of the edge connecting it to its parent. In fact, as we are considering only undirected graphs (so (u,v) e E 4=> {v,u) G E), there are only three types of virtual edges in these BFS trees which are incident to a vertex v: 1. Edges which connect v to a vertex of level l(v) — 1. There are no such edges if v is a child of k in T^, i.e., v is a child of the root. 2. Edges which connect v to a vertex of level l(v). 3. Edges which connect v to a vertex of level l(v) + 1. Thus, for a given vertex v in a graph of diameter d, l(v) will monotonically increase up to 0(d) times over the course of the identification algorithm. This becomes important in our discussion of the algorithm for general d. Note that although virtual edges only connect v to vertices one level different from its own, it is possible for the deletion of an edge to result in an increase in l(v) of 0(n). As an example, consider the graph consisting of a simple cycle. If we remove an edge e = (i,j) from this graph, the shortest distance between i and j increases from 1 to n — 1. We will show that the efficiency of this identification algorithm depends on the effi-ciency of the implementation of the remove_edge() function. In particular, we will show how to improve the performance of our algorithm by an efficient implementation of the remove_edge() function. If the BFS trees were simply rebuilt from scratch each time an edge was removed, the complexity of the algorithm would be in 0(m2n): 0(m) time is necessary to generate a BFS tree, and there are n BFS trees to be (re)generated as each 46 Chapter 3. d-Critical Spanning Subgraphs of m edges is tested for criticality. Note that this is, unsurprisingly, just the complexity we noted earlier for the naive approach. To do better, we must do as little restructuring of the BFS trees as possible. The following theorems will specify how this restructuring should proceed for graphs of diameters 2 or 3. 3.3.2 Algorithm for Diameter 2 Graphs Theorem 3.6 Let G = (V, E) be a diameter 2 graph having m edges and n vertices. We can identify a diameter 2 critical spanning subgraph in 0(mn) time. Proof: As above, we generate a set T of the n BFS trees rooted at the vertices of G, and designate T{ to be the BFS tree rooted at vertex i. As the BFS trees are generated, we record the level of each vertex in each BFS tree. As we established earlier, the process of generating the BFS trees requires 0(mn) time. For each Tj, we generate an adjacency list representation of the entire graph G: each vertex (T{, k) is assigned a list adj(Ti, k) of the vertices adjacent to k. The crucial idea is that each adjacency list is divided into disjoint sublists by level; since the level of a vertex is stored at the vertex, and the ordering of vertices within a sublist is unimportant, only 0(1) time is needed to direct each vertex to its proper sublist. The combined size of all adjacency lists of Tj is in 0(m), since vertex adjacency lists are simply an implicit listing of the edges of a graph, and each edge e = (u, v) is represented twice: v is a member of adj(Ti, u), and u is a member of adj(Tt, v). Generating the adjacency lists for all the BFS trees requires 0{mn) time: there are n BFS trees, and 0(m) work is required to build the adjacency list for each BFS tree. Once the BFS trees and adjacency lists have been constructed, we examine each edge e = (hj) 6 G in conjunction with each Tj. G T to see whether its removal will increase the height of any T^. We create a temporary copy of each T^, denoted T'k, and, assuming 47 Chapter 3. d-Critical Spanning Subgraphs that e is not a virtual edge for Tj., remove e from T'k. T'k must be restructured so as to restore the BFS tree property, i.e., the vertices disconnected from T'k by the removal of e must be reconnected to T'k. If all of the restructured BFS trees have height < d, then e is not critical, each T„ becomes the new Tv, and the algorithm continues on the graph G — e. Otherwise, the algorithm retains the current copies of Tv and continues with the next edge in G, until all edges have been tested for criticality. Thus far, we have simply described in detail the algorithm specified above, and demon-strated that the setup (generating the BFS trees and adjacency lists) requires 0(mn) time. In order to make the BFS tree restructuring which occurs in remove_edge() effi-cient, we will use a technique known as lazy deletion, in which the deletion of elements of a data structure is delayed to improve the efficiency of the deletion routine. If the presence or absence of the element during later stages of the algorithm's execution is irrelevant, the deletion may be delayed indefinitely, or never occur. We use lazy deletion so that we may avoid doing unnecessary updates to the BFS trees and their adjacency lists. We observe that, in the end, the only information about the BFS trees that we are interested in is their height, and as long as we know that the height of a restructured BFS tree is < d, that is sufficient. Thus we update the BFS trees and associated adjacency lists only when necessary; the details are given below. We also note that if we attempted to update all the adjacency lists of a BFS tree immediately in response to an edge deletion, this would require 0(m) time per BFS tree, leading to the 0(m2n) time bound of the naive algorithm. For each BFS tree, there are three cases: • C a s e 1: e ^ T^. By definition, e is a virtual edge for T^. Therefore, T^ is the same for the graphs G = (V, E) and G' = (V,E — e). Thus, removing e from G has no immediate effect 48 Chapter 3. d-Critical Spanning Subgraphs Figure 3.2: An example of case 2 for diameter 2: Tj on Tfc. In keeping with the principle of lazy deletion, Tk is not updated at this time to reflect the deletion of e; update may take place at a later stage if necessary. (For details on when and how updates occur, see cases 2 and 3.) • Case 2: e = (i,j) G ThTj. Without loss of generality, consider the BFS tree Tj. Since e = (i,j), e is adjacent to the root in Tj. Thus, removing e from the tree disconnects 0(D) vertices in the subtree r,- of Tj of which j is the root. (Recall that D is defined as being the maximum degree of any v G V, and is therefore an upper bound on the size of Tj.) For each v G r,-, search adj(Ti,v) for an adjacent vertex a in the level 1 adjacency sublist. Also, check the current level of a, since a may have moved due to a previous restructuring of Tj. — If a — j , ignore it (since j will no longer be at level 1 once the restructuring of Tj is done), and continue searching. — If a is still at level 1, attach w to a in Tj. — If a is no longer at level 1, remove it from the level 1 adjacency sublist, put it in the sublist appropriate to its new level, and continue searching. — If the edge (v, a) is no longer in G (having been determined to be noncritical at an earlier stage, and then deleted), then remove a from adj(Ti,v), and 49 Chapter 3. d-Critical Spanning Subgraphs continue searching. - If there is no such a, e is 2-critical. Abort the restructuring of the BFS trees, and restore all T, G T to their prior states using backup copies. We must also undo the changes made to the adjacency lists as a result of the assumed deletion of e. The mechanism for this, and the reason for not simply reverting to the backup copies of the adjacency lists, are described in the proof of Claim 3.7. The deletion of e requires each v G Tj to be repositioned in Tj. We define I(Ti, v, e) to be the number of invalid elements of adj(Ti,v) which are encountered in the search for an element of adj(Ti, v) to which v may be reattached. (Invalid elements are those which are skipped, moved to a different level sublist, or deleted according to the rules given above.) The time required to reattach each v G Tj, then, is 0(1) for the reattachment + 0(1) for each invalid element, or 0(1 + I(Ti,v,e)). Thus, since there are 0(D) vertices in Tj, the total time required to reattach all vertices in Tj is OiD+^KT^e)) VETj Note that we are not searching all elements of T; to see whether j is now incorrectly listed as being on level 1 in their adjacency lists. It is a characteristic of the lazy deletion to which we referred earlier that we update the adjacency lists only when we encounter members which have been invalidated in the course of a previous iteration of the algorithm. • Case 3: e = (z, j) eTk,k^ i, l{i) = 1, l(j) = 2. Since j is at level 2 of Tk, the subtree T/ which is disconnected from Tk by the removal of e must consist of exactly one vertex, j . We use the procedure described in case 2 to reattach j . Thus, the time required in this case is 0(I(Tk,j,e)). 50 Chapter 3. d-Critical Spanning Subgraphs k j j Figure 3.3: An example of case 3 for diameter 2: Tk We observe that the work done in this algorithm can be separated into three parts for purposes of calculating the time complexity: • creating the BFS trees and adjacency lists • repositioning of vertices in BFS trees due to edge deletion • searching for valid entries and disposing of invalid entries in adjacency lists, pur-suant to vertex repositioning We have already established that the setup for this algorithm requires 0(mn) time. We can easily demonstrate that the vertex repositioning requires a total of 0(mn) time (recall that repositioning a vertex requires 0(1) time, not including the time required to search for a valid adjacency list entry). Case 1 requires no vertex repositioning. Case 2, which can occur in two BFS trees for a given edge, requires the repositioning of 0(D) vertices. Case 3 can occur in 0(n) BFS trees for a given edge, and requires the repositioning of a single vertex. Thus we have a total of 0(mD + mn) = 0(mn) vertices that are repositioned over the course of the algorithm, and therefore the vertex repositioning requires 0(mn) time. In order to establish that the traversal of the adjacency lists also requires 0(mn) time, we use a technique known as amortized analysis. Amortized analysis is an analysis of 51 Chapter 3. d-Critical Spanning Subgraphs algorithm complexity that bounds the total cost of a series of operations, rather than attempting to bound the cost of a single operation. This is a useful technique when • there is a wide variation in the costs of individual operations, or • broadly speaking, if a particular operation is expensive, subsequent operations will be relatively cheap (and vice versa). The time complexity of cases 2 and 3 depends on I(Tk,v,e), the number of members of adj(Ti,v) which are invalidated in the process of restructuring Tk in response to the removal of edge e. In the worst case, 0(D) edges incident to v have been removed from G and v's adjacency list has never yet been updated, so I(Tk,v,e) is in 0(D). This is where the amortized analysis comes in: instead of looking at the individual I(Ti,v,e), we consider the total number of obsolete adjacency list entries that we may encounter, measured over all Tk e T and all edges e G E. Claim 3.7 £ £ J2l(Tk,v,e) = 0(mn). eeETkeTverj Proof: This sum represents the number of invalid adjacency list entries that are encoun-tered over the course of the execution of the algorithm. We will prove that the number of adjacency list entries that are rendered invalid by the algorithm is bounded by 0(mn). Thus, we must first justify the use of this value to bound the number of adjacency list entries that we encounter during the course of the algorithm. We must demonstrate that an invalid adjacency list entry is encountered (at most) once. At first glance, this seems trivial; when we encounter an invalid adjacency list entry, we fix it in 0(1) time as described in case 2. However, if the candidate edge e is found to be critical, we must restore the BFS trees to the state that they were in before we 52 Chapter 3. d-Critical Spanning Subgraphs started testing e for criticality. We must do some restoration of the adjacency lists, as well, or there would be adjacency list entries that were improperly marked as invalid. However, if we undo all the changes made to the adjacency lists, then we may encounter a given invalid adjacency list entry 0(m) times. For an example of this worst-case behavior, consider a graph with exactly one noncritical edge, which is deleted in the first iteration. The algorithm will then attempt to restructure the BFS trees in response to the deletion of each edge, and since all remaining edges are critical, it will revert to the previous copies of the adjacency lists after each iteration. Thus, each adjacency list entry which was invalidated in the first iteration may be encountered once for each succeeding (noncritical) edge that the algorithm examines. Therefore, for each candidate edge e, we keep a record of the changes that are made to the adjacency lists, and separate them into two disjoint categories: changes that are made in response to the tentative deletion of e, and changes that were made in previous iterations (in response to the deletion of edges that have been proven to be noncritical). Using this information, we can undo the changes of the first category while preserving those of the second category. All changes made to the BFS trees fall into the first category, so we can simply revert the BFS trees to their previous state if e is found to be critical. Recall that the changes to the adjacency lists are of two types: deleting entries that refer to edges that are no longer in the graph, and shifting entries to the adjacency sublist of the proper level. If we have a record of these modifications, we can readily reinsert adjacency list entries, or return them to their original sublists, in 0 (1) time. We have demonstrated that each invalid adjacency list entry is only encountered once (and then permanently fixed). We have also shown that we can undo the modifications of the adjacency lists that are made in response to the tentative deletion of a critical edge in no more time than was required to make the modifications in the first place. Thus, we 53 Chapter 3. d-Critical Spanning Subgraphs may bound the number of invalid adjacency list entries that we encounter by the number of invalid adjacency list entries that we generate, and use the latter value to prove the desired result. An entry a in an adjacency list is invalid when one of two things has occurred: • The edge (v, a) is no longer in G. • The vertex at that entry is no longer at the level in this BFS tree which is indicated by the sublist of the adjacency list which it occupies. Both of these are side effects of an edge being deleted. In the first case, this is obvious; in the second case, we simply observe that we only move a vertex within a BFS tree (and therefore possibly change its level) if it is in a subtree which has been disconnected by the removal of an edge. We further observe that if v is already at level 2, it must be repositioned so that it is still at level 2. If it cannot be so repositioned, then the edge e whose removal occasioned the repositioning of v must be critical. Therefore, a successful repositioning of such a vertex v cannot invalidate any adjacency list entries. With this in mind, we consider (a) the number of vertices in each case whose repositioning can invalidate an adjacency list entry and (b) the number of adjacency list entries that can be invalidated by such the repositioning of such a vertex. In Case 1, no vertices are repositioned, and hence no adjacency list entries are invalidated. In Case 2, the vertex j , originally at level 1, may be repositioned at level 2, thus invalidating some adjacency list entries. All the other elements of Tj (the subtree disconnected from its BFS tree by the removal of an edge e = (i,j)) are already at level 2. Finally, in Case 3, the single vertex to be repositioned (j) is already at level 2, and thus its repositioning cannot invalidate any adjacency list entries. We observe that a vertex may appear in D adjacency lists (recall that D is the maximum number of neighbors of any vertex in the graph), so the repositioning of a single vertex to a different level causes 0(D) adjacency 54 Chapter 3. d-Critical Spanning Subgraphs list invalidations. The deletion of an edge e = (i,j) invalidates the two entries in each BFS tree's set of adjacency lists which are associated with e: the entry j in adj(Tk,i) and the entry i in adj(Tk,j), for a total of 0(n) entries. We now consider the total number of additional invalidations over all BFS trees caused by the repositioning of the vertices v e Tj, in the light of the arguments above regarding the number of possible adjacency list invalidations that can take place. Since in Case 1 and Case 3, no adjacency list entries may be invalidated, we need not consider them here. Case 2, however, occurs in exactly two BFS trees, and in each instance 0(D) adjacency list entries are invalidated. Thus, the total number of invalidations incurred over all Tk £ T is 0(D) for a given edge e. Since there are m edges, the sum over all edges e £ E of the number of invalidated adjacency list entries is 0(mn + mD) = 0(mn). • We have already established that the setup and repositioning costs are 0(mn) as well. Therefore, the complexity for the algorithm as a whole is 0(mn). • 3.3.3 Algorithm for Diameter 3 Graphs Theorem 3.8 Let G = (V, E) be a diameter 3 graph having m edges and n vertices. We can identify a diameter 3 critical spanning subgraph in 0(mnD) time. Proof: We construct the set of BFS trees T and adjacency lists adj(Ti,k) for each Tj G T as in the diameter 2 algorithm described above; these processes require 0(mn) time. The algorithm for diameter 3 graphs, in fact, is identical to the algorithm for diameter 2 graphs, with the exception of the restructuring step (remove_edge()), which is divided into more cases which are handled differently. • Case 1: e ^ T&. j Chapter 3. d-Critical Spanning Subgraphs Figure 3.4: An example of case 2 for diameter 3: Tj By definition, e is a virtual edge for Tk. Therefore, Tk is the same for the graphs G = (V, E) and G' = (V, E — e). Thus, removing e from G has no immediate effect on Tk. In keeping with the principle of lazy deletion, Tk is not updated at this time to reflect the deletion of e; update may take place at a later stage if necessary. (For details on when and how updates occur, see cases 2, 3, and 4.) Case 2: e = (i,j) e ThTj. Rebuild the BFS tree on the graph G' = (V,E — e), and regenerate the adjacency lists for each vertex in the BFS tree. This requires 0(m) time. Case 3: e = (i, j) G Tk, l(Tk, i) = 1, l(Tk,j) = 2. In the discussion of BFS tree properties in section 3.3.1, we observed that a vertex could never be repositioned closer to the root as a result of an edge deletion. The children of j are already at level 3, and so they must be repositioned at level 3 (or below, but if they were repositioned below level 3, that would indicate that the diameter of G — e was > 3). Since j is at level 2, it may be placed at level 2 or 3. If we can find a vertex p adjacent to j which is at level 1, then we can simply reattach the subtree rooted at j to p. The search for such a p occurs before we attempt 56 Chapter 3. d-Critical Spanning Subgraphs A • I Figure 3.5: An example of case 3 for diameter 3: Tk,l(Tk,i) = l,l(Tk,j) = 2 to reposition any other vertex in Tj, because if it succeeds, no other repositioning is necessary. We recall that the definition of I(Tk,v,e) does not depend on the number of sublists of adj(Tk, v) which are searched, so the cost of this initial search of j ' s level 1 sublist is included in I(Tk,j,e). If this search fails, this case is then dealt with in a fashion identical to that of Case 2 of the diameter 2 algorithm, except that the level 2 sublist is searched rather than the level 1 sublist. Since the operations performed are identical (with the addition of the extra test noted above, whose cost has been accounted for) the complexity of this case is also that of Case 2 of the diameter 2 algorithm: 0(D-\-YJV^T I(Tk, v, e)). (Recall that D is an upper bound on the number of vertices in Tj.) • C a s e 4: e = {i,j) € Tk,l(Tk,i) = 2,l(Tk,j) = 3. Because j is on level 3 of Tk, the subtree Tj which is disconnected from Tk by the removal of e must consist of exactly one vertex, j . Thus this case is identical to that of Case 3 of the diameter 2 algorithm, and the complexity is therefore 0(I(Tk,j, e)). The analysis of this algorithm proceeds in similar fashion to that of the diameter 2 algorithm. We use an amortized analysis to bound the number of invalid adjacency list 57 J Chapter 3. d-Critical Spanning Subgraphs Figure 3.6: An example of case 4 for diameter 3: Tk, l(Tk, i) = 2, l(Tk,j) — 3 entries that we may encounter over the course of the execution of the algorithm. Claim 3.9 E E Y,nTk,v,e) = 0(mnD). Proof: As before, we will prove the claim for the number of adjacency list entries that are rendered invalid over the course of the algorithm. Recall that this is an upper bound for the number of invalid adjacency list entries that we encounter. (See the proof of Claim 3.7 for a justification of this statement.) Using the same analysis as for the diameter 2 case, we observe that the number of vertices repositioned which may result in the invalidation of adjacency list entries are 0 (Case 1), 1 (vertex j in Case 3), and 0 (Case 4). Case 2 is considered separately: although 0(n) vertices are repositioned by the rebuilding of the BFS tree, all adjacency lists for this tree are also updated. (One might hope that this "free" cleanup of the adjacency lists would help us at later stages of the algorithm. We will discuss this notion in the concluding section of this chapter.) We also observe that the number of adjacency list entries that may be invalidated by the repositioning of j in Case 3 is 0(D). 58 Chapter 3. d-Critical Spanning Subgraphs The deletion of an edge e = (i,j) invalidates the two entries in each BFS tree's set of adjacency lists which are associated with e: the entry j in adj(Tk,i) and the entry i in adj(Tk,j), for a total of 0(n) entries. We now consider the invalidations caused by the repositioning of vertices v G Tj (the vertices disconnected from their BFS trees by the deletion of e). We need only consider Case 3, since by the argument above, this is the only case in which any vertex repositioning can result in the invalidation of adjacency list entries. Case 3 may be encountered in any number of BFS trees from zero to n — 2; in each case, 0(D) entries are invalidated, which yields a total of 0(nD) adjacency list entries which are invalidated. Thus, the total number of adjacency list entry invalidations (measured over all Tk G T) that take place in response to a given edge deletion is 0(n + nD) = 0(nD). Since there are m edges, the sum over all edges e G E of the number of invalidated adjacency list entries is 0(mnD). • We have already established that the initial construction of the BFS trees and generation of the adjacency lists take 0(mn) time. The remaining costs to be calculated, then, are the time required for repositioning the vertices in the BFS trees in response to the deletion of an edge, and the time required to restructure the BFS trees from scratch in instances of Case 2. As in the algorithm for diameter 2, we observe that the time required for repositioning a vertex is 0 (1) if we do not include the time required to search through the adjacency lists for a valid entry. Thus, we need only consider the number of vertices that must be repositioned over the course of the algorithm. Case 1 requires no vertex repositioning. The costs for Case 2 are considered separately. Each instance of Case 3, which may occur in 0(n) BFS trees for a given edge, requires 0(D) vertices to be repositioned. Each instance of Case 4, which may occur in 0(n) BFS trees for a given edge, requires 0 (1 ) vertices to be repositioned. Thus, since there are m edges to be considered, the total number of vertices to be repositioned over the entire algorithm is 0(mnD). 59 Chapter 3. d-Critical Spanning Subgraphs Case 2 occurs twice per edge deleted, and requires 0(m) time in each instance, for a total of 0{m2) time for all edges. Thus we have a complexity, for all operations in the algorithm, of 0{mn + mnD + m2) = 0(mnD) (since m = 0{nD)). • 3.3.4 Algorithms for General Graphs Recall that the complexity of the naive approach to identifying d-critical spanning sub-graphs is 0(m2n). We have not yet succeeded in developing an algorithm which is more efficient than this for graphs of diameter d > 4. The purpose of this section is to explore the reasons why our current algorithms do not appear to have natural extensions to the graphs of general d, and discuss alternate methods that might be used to improve this bound. A General izat ion That Doesn' t Work In order to discuss the generalization of the BFS-tree-based algorithms to graphs of general diameter, we must first establish some terminology. As above, we define TJ to be the subtree that is disconnected from a given BFS tree Tfc by the removal of e. We also define U(Tk, A) to be the subset of Tj whose vertices are at level p < A and which have not yet been reconnected to Tk- Initially, A is set to l(j) — 1. We then use a slight modification of the procedure described in the algorithms for d = 2 and d = 3 to reattach each u £ U(Tk, A). We search the level A sublist of the adjacency lists for each such u. If we encounter an adjacency list element which is an element of [/(Tfc, A) (and hence of currently undefined level, since we have not yet reconnected it to Tfc) we move it to the sublist for vertices of level A + 1. If we manage to reconnect all u G C/(Tfc, A), then we are done; all subtrees rooted at vertices in [/(Tfc, A) will also be at the closest possible distance to the root of Tfc. Otherwise, we increment A and continue until either all vertices are reattached to Tfc, or we find that there exists a 60 Chapter 3. d-Critical Spanning Subgraphs vertex u e U(Tk, A) such that l(Tk, u) > d (thus proving that e is critical). We observe that a vertex in U(Tk,X) can only change its level 0(d) times in response to the deletion of a given edge e. (One might think that a vertex in any Tk could move down at most d times over the course of the entire algorithm. However, if e is critical, all Tk G T are returned to the state that they were in before we started restructuring them in response to the anticipated deletion of e.) Since \TJ\ = 0(D), and \T\ = 0(n), the total amount of work done over the course of the algorithm in repositioning vertices is 0(dmnD). Note that, for fixed d, this portion of the algorithm requires asymptotically as much time as the algorithm for d = 3. Any adjacency list entry can be invalidated 0(d) times in response to the removal of an edge e, since each time an invalid entry is encountered in the level A sublist, it is repositioned in the level A + 1 sublist. Since there are a total of mn adjacency list entries associated with the BFS trees, the number of invalid adjacency list entries that we may encounter is 0(dm2n). As before, the extra factor of 0(m) arises, ironically, from the fact that if e is discovered to be critical, the BFS trees and adjacency lists are returned to their previous states. Thus, this strategy does not seem to generalize gracefully; the total complexity of this approach is 0(dm2n), which is certainly no better than the complexity of the original naive approach: 0(m2n). Given the fact that this generalized strategy closely resembles the tree-growing that results from a naive restructuring of the BFS trees, this result is perhaps not terribly surprising. Poss ible Improvements When we attempt to improve on the efficiency of a given algorithm, we should remember that there are two ways of doing so. The first, and most straightforward, is to design an 61 Chapter 3. d-Critical Spanning Subgraphs algorithm that uses a different approach to solving the problem. The second makes use of the fact that, in many cases, there is no proof that the anticipated worst-case behavior of the algorithm actually ever arises in practice (and sometimes not even in theory, once the theory is sufficiently closely examined). Thus, it is sometimes possible to "improve" the running time of an algorithm by demonstrating that tighter bounds apply than those that have already been proven. In the case of these algorithms, we have not yet succeeded in generating a graph that demonstrates the worst-case behavior that we predict. As a result, we would like to investigate the following possibilities: • In the d = 3 case, the BFS trees are restructured whenever the edge e = (i,j) that we are deleting is found in the BFS trees Tj and Tj. Since the adjacency lists for these trees are brought up to date at this point, it is possible that a more careful analysis of the number of invalid adjacency list entries would reveal that there cannot be as many as we have predicted. If so, perhaps we might improve on the complexity of the algorithm for general d described above. Recall that the process that determines the complexity of the algorithm is that of searching through the adjacency lists for a valid entry. By judicial rebuilding of the BFS trees (either when we have reason to believe that there are sufficient numbers of invalid adjacency list entries to make it worthwhile, or at regular intervals), we might improve on the worst-case predictions of the number of invalid adjacency list entries that we could encounter. • We calculate the number of adjacency list entries that are invalidated, and use this to bound the number of invalid adjacency list entries that are actually encoun-tered. Perhaps the number of invalid entries that the algorithm must deal with is asymptotically fewer than the number of invalid entries that are generated. 62 Chapter 3. d-Critical Spanning Subgraphs 3.4 Concluding Remarks We have seen that there are efficient methods of identifying d-critical spanning subgraphs of graphs of diameter d = 2 or d = 3. Unfortunately, these methods do not appear to readily generalize to general d. (This has prompted us to speculate that we have perhaps unfairly maligned the method we first discussed-tentatively removing each edge in turn and calculating the diameter of the resultant graph-by referring to it repeatedly as "naive".) 3.4.1 Further Uses of Matrix Multiplication The success of Anstee and Caccetta [1] in adapting matrix multiplication to the problem of recognizing diameter-critical graphs leads us to believe that there may be benefits in the further exploration of its use in our closely related problem. We already have an algorithm for identifying d-critical spanning subgraphs that uses matrix multiplication to calculate the diameter of the graph as each edge is removed, and has a complexity of 0(m\gd • M{nj) (or, for a given value of d, 0(m • M(n))). It seems likely that this is as far as this particular use of matrix multiplication can be taken. However, suppose that we have a graph G with diameter d represented by an adjacency matrix A. If we calculate d by the method that we described in section 3.1.2, we generate a series of matrices Ai,A2,... ,A2i,... ,A(i (where A^ = (A + I)k). If we delete an edge e, recalculate Ad, and Ad ^ 0, then we know that e is critical. We would like to have some way of updating Ad in response to the deletion of an edge e that does not require as much work as matrix multiplication. (Of course, we expect to have to update at least some of the other matrices as well in order to be able to update A^.) We observe that deleting an edge changes only 2 entries in A, and hypothesize that there may be some way of quickly updating the matrices that takes advantage of this fact and runs in, perhaps, 0(n2) or 63 • Chapter 3. d-Critical Spanning Subgraphs 0(n2lgd) time. This would give us an overall time of 0(mn2\gd). 3.4.2 Parallelization The two different methods that we have discussed of calculating the diameter of a graph-BFS trees and matrix multiplication-yield efficient parallel algorithms for find-ing (i-critical spanning subgraphs. The n BFS trees can be grown and manipulated by n different processors in parallel, or, if only k < n processors are available, the work can be distributed evenly among the k processors (so that each handles n/k BFS trees) with no difficulty. The advantages of this method are that it is very easy to implement as a parallel algorithm. Its implementation does not depend on the topology of the parallel machine, it requires no knowledge of special parallel programming tricks, and the algorithm can efficiently use any number of processors from 1 to n. The parallel implementation of the naive algorithm using BFS trees would have a time complexity of 0(m2) (with 0[n) processors). (We can, of course, achieve the same Q(n) speedup with the algorithms that we have developed for graphs of diameter d = 2 and d = 3.) On the other hand, we can multiply matrices using n2 processors in 0(n) time. The parallel implementation of the naive algorithm using matrix multiplication would then have a time complexity of 0(mn) (with n2 processors), which is clearly faster than the parallel implementation using BFS trees. This method does have some drawbacks, however. It is a more specialized use of parallel computation, since the implementation of the algorithm requires a very specific communication protocol among the processors. As a result, the efficiency of the algorithm is also dependent to a certain extent on the topology of the machine on which it is to be executed. It does not have any application to the faster algorithms for d = 2 and d = 3. Finally, it is not as readily scalable as the parallel implementation of the BFS tree algorithm; in other words, it is not immediately clear how to implement this algorithm on a system with < n2 processors. 64 Bibliography Anstee, R. and L. Caccetta. Recognizing diameter critical graphs. Combinatorics, Complexity, and Logic: Proceedings of Discrete Mathematics and Theoretical Com-puter Science '96, Springer-Verlag, Singapore (1996) pp. 105-112. Arlazarov, V.L., E.A. Dinic, M.A. Kronrod, and LA. Faradzev. On economical con-struction of the transitive closure of a directed graph. Soviet Math Dokl, 11 (1970) pp. 1209-1210. Caccetta, L. and R. Haggkvist. On diameter critical graphs. Discrete Mathematics 28 (1979) pp. 223-229. Coppersmith, D. and S. Winograd. Matrix multiplication via arithmetic progres-sions. Proceedings of the Nineteenth Annual ACM Symposium on Theory of Com-puting (1987) pp. 1-6. Dekel, E., D. Nassimi, and S. Sahni. Parallel matrix and graph algorithms. SIAM Journal on Computing 10 (1981), pp. 657-675. Fan, G. On diameter 2-critical graphs. Discrete Mathematics 67 (1987) pp. 235-240. Furedi, Z. The maximum number of edges in a minimal graph of diameter 2. Journal of Graph Theory, vol. 16, no. 1 (1992) pp. 81-98. Mantel, W. Problem 28, soln. by H. Gouwentak, W. Mantel, J. Teixeira de Mattes, F. Schuh and W. A. Wythoff. Wiskundige Opgaven 10 (1906) pp. 60-61. Masek, W.J. and M.S. Paterson. How to compute string-edit distances quickly. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sanhoff and J.B. Kruskal, eds. Addison-Wesley, Reading, MA (1983) pp. 337-349. Plesnik; J. Critical graphs of given diameter. Acta Facultatis Rerum Naturalium Universitatis Comenianae Mathematica 30 (1975) pp. 71-93. Plesnik, J. The complexity of designing a network with minimum diameter. Networks 11 (1981) pp. 77-85. Plesnik, J. On minimal graphs of diameter 2 with every edge in a 3-cycle. Mathe-matica Slovaca 36 (1986) pp. 145-149. Turan, P. Eine Extremalaufgabe aus der Graphentheorie. Matematikai Fizika Lapok 48 (1941) pp. 436-452. (in Hungarian) Xu, J.M. Proof of a conjecture of Simon and Murty. Journal of Mathematical Re-search and Exposition 4 (1984) pp. 85-86. (in Chinese) [Corrigendum. Ibid. 5 (1985) p. 38] 65
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Going critical : an investigation of diameter -critical...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Going critical : an investigation of diameter -critical graphs Madden, Joshua 1999
pdf
Page Metadata
Item Metadata
Title | Going critical : an investigation of diameter -critical graphs |
Creator |
Madden, Joshua |
Date Issued | 1999 |
Description | We define a graph G=(V, E) with m=\E\ edges, n=\V\ vertices, maximum degree D, and diameter d, to be d-critical if it has the property that for any edge e G E, the graph G — e has diameter > d. We explore some results relating to a conjecture on the maximum possible number of edges in a 2-critical graph. We also present efficient algorithms for identifying spanning d-critical subgraphs of graphs having diameter d where d=2 or d=3. The algorithm for d=2 runs in 0(mn), and the algorithm for d=3 runs in 0(mnD). |
Extent | 2673799 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-07-03 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080060 |
URI | http://hdl.handle.net/2429/10030 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1999-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1999-0278.pdf [ 2.55MB ]
- Metadata
- JSON: 831-1.0080060.json
- JSON-LD: 831-1.0080060-ld.json
- RDF/XML (Pretty): 831-1.0080060-rdf.xml
- RDF/JSON: 831-1.0080060-rdf.json
- Turtle: 831-1.0080060-turtle.txt
- N-Triples: 831-1.0080060-rdf-ntriples.txt
- Original Record: 831-1.0080060-source.json
- Full Text
- 831-1.0080060-fulltext.txt
- Citation
- 831-1.0080060.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0080060/manifest