ON THE EFFICIENCY OF CLIQUE DETECTION IN GRAPHS by ANTHONY HUNTER DIXON B.Sc. , University of British Columbia, 1968 M.Sc, University of British Columbia, 1970 fl THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in the Department ' ofCOMPUTER SCIENCE We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA May,1973 In presenting this thesis an advanced degree at the L i b r a r y I further in p a r t i a l fulfilment of the requirements the U n i v e r s i t y of B r i t i s h Columbia, I agree s h a l l make i t freely available for agree t h a t p e r m i s s i o n f o r e x t e n s i v e copying o f of this representatives. thesis for It financial this thesis of gain s h a l l not Computer Science The U n i v e r s i t y o f B r i t i s h Columbia Vancouver 8, Canada Date June 18, 1973 or i s understood that copying or p u b l i c a t i o n written permission. Department that r e f e r e n c e and s t u d y . f o r s c h o l a r l y purposes may be g r a n t e d by the Head o f my Department by h i s for be allowed without my II ABSTRACT This thesis examines the devices employed by various algorithms to search for maximal complete subgraphs in graphs. Also known as cliques,in Chapter 1 these subgraphs are seen to play an important role in graph theory, information retrieval, Sociometry, logic design, and computational complexity. The enumeration of cliques using the Harary-Rcss, Bonner, Peay, and Bron-Kerbosch algorithms i s discussed in Chapter 2. The Reduced Redundancy performance of alternative approach algorithms i s shown algorithm the five given graph whose size i s introduced, procedures to empirical to generate can be and the i s assessed testing. using an Each of the a "derivation tree" for a used as a measure of i t s efficiency. In Chapter 3, the possibility of exploiting vertex similarity i s examined with a view to reducing the size of the derivation tree. As a consequence, algorithms are proposed for finding non-similar cliques. The concept of "complete subgraph equivalence" of vertices i s introduced to develop a means for expressing the cliques of a graph as the Cartesian product cf vertex subsets. An algorithm for detecting the existence cf a order k in a certain class of k-partite graphs in polynomial time i s proposed in Chapter 4. This graphs clique cf reducible class consists of a l l by the algorithm to k-partite graphs having at most two vertices per block of degree greater than 0. This ii) algorithm i s shown to provide an efficient heuristic which can be used in a procedure for determining whether a well-formed formula i s a tautology. The thesis i s concluded with an empirical analysis of the techniques employed by the enumeration algorithms 2. In addition, the efficiency algorithm i s compared with algorithm. that of of the the Clique Reduced of Chapter Detection Redundancy IV TABLE OF CONTENTS CHAPTER 1 : INTRODUCTION 1.1 DEFINITIONS AND NOTATION 1 1.2 INTRODUCTORY REMARKS 5 1.3 HISTORICAL SURVEY 6 CHAPTER 2 : ENUMERATION OF CLIQUES 2.1 INTRODUCTION 16 2.2 MATHEMATICAL ANALYSIS OF ENUMERATION ALGORITHMS 18 2.3 ANALYSIS OF THE HARARY-ROSS ALGORITHM 22 2.3.1 NOTATION 23 2.3.2 THE ALGORITHM 24 2.3.3 NUMBER OF VERTEX SETS EXAMINED 35 2.3.4 STORAGE REQUIREMENTS 38 2.4 BONNER * S ALGORITHM 41 2.4.1 NOTATION 42 2.4.2 THE ALGORITHM 42 2.4.3 NUMBER OF VERTEX SETS GENERATED 46 2.4.4 STORAGE REQUIREMENTS 50 2.5 ANALYSIS OF PEAY'S ALGORITHM 51 2.5.1 NOTATION 52 2.5.2 THE ALGORITHM 53 2.5.3 NUMBER OF VERTEX SETS GENERATED 57 2.5.4 STORAGE REQUIREMENTS 64 2.6 A NEW ENUMERATION ALGORITHM 66 2.6.1 DESCRIPTION OF THE ALGORITHM 70 2.6.2 NOTATION 70 2.6.3 THE REDUCED REDUNDANCY ALGORITHM 71 V 2.6.4 NUMBER OF VERTEX SETS GENERATED 72 2.6.5 ANALYSIS OF AM ITERATION 75 2.6.6 STORAGE REQUIREMENTS 76 2.7 THE BRON-KERBOSCH ALGORITHM 78 2.7.1 NOTATION 79 2.?.2 THE ALGORITHM 79 2.7.3 NUMBER OF VERTEX SETS GENERATED 84 2.7.4 ANALYSIS OF ONE ITERATION 85 2.7.5 STORAGE REQUIREMENTS 88 CHAPTER 3 : CLIQUE DETECTION USING VERTEX SIMILARITY 3.1 INTRODUCTION 89 3.2 POINT AND LINE SYMMETRIC GRAPHS 91 3.3 DETERMINATION OF NON-SIMILAR CLIQUES 98 3.4 ALGORITHM FOR FINDING NON-SIMILAR CLIQUES 102 3.5 DISCUSSION OF THE ALGORITHM 105 3.6 ANOTHER APPLICATION OF ORBI/TAL PARTITIONING 108 3.7 ANOTHER APPROACH TO DESCRIBING THE CLIQUES 111 CHAPTER 4 : EFFICIENT DETECTION OF CLIQUES 4.1 INTRODUCTION 126 4.2 CLIQUE DETECTION AND SATISFIABILITY 127 4.3 CLIQUE DETECTION ALGORITHM 130 4.4 TIMING CONSIDERATIONS 138 4.5 STORAGE CONSIDERATIONS 139 4.6 INPLICATIONS OF THE PROCEDURE 140 CHAPTER 5 : EMPIRICAL OBSERVATIONS AND SUMMARY 5.1 INTRODUCTION 141 5.2 THE TEST DATA 142 5.3 DISCUSSION OF THE RESULTS 145 vi 5.1 SUMMARY 149 BIBLIOGRAPHY 151 APPENDIX A 156 APPENDIX B 157 APPENDIX C 169 LIST OF TABLES TABLE 4.1 BESULTS OF THE ALGORITHM FOR FIG. 4.1 TABLE 5.1 CHARACTERISTICS OF THE TEST GRAPHS TABLE 5.2 COMPARISON OF ALGORITHMS TABLE 5.3 COMPARISON OF DETECTION AND ENUMERATION V'IU LIST OF FIGURES FIG. 2.1 K (3,3,3) 33 FIG. 2.2 HARAR Y-ROSS ALGORITHM: DERIVATION TREE 34 FIG. 2.3 SUBTREE OF DERIVATION TRIE 37 FIG. 2.4 A PATH OF THE DERIVATION TREE 40 FIG. 2.5 BONNER'S ALGORITHM: DERIVATION TREE 49 FIG. 2.6 SUBTREE OF THE DERIVATION TREE 59 FIG. 2.7 PEAY'S ALGORITHM: DERIVATION TREE 62 FIG. 2.8 63 DERIVATION TREE FOR K(3,3) FIG. 2.9 MINIMAL COVER COUNTEREXAMPLE 69 FIG. 2.10 MAXIMAL COVER COUNTEREXAMPLE 69 FIG. 2.11 REDUCED REDUNDANCY ALGORITHM: DERIVATION TRIE 74 FIG. 3.1 POINT SYMMETRIC GRAPH 93 FIG. 3.2 93 INDUCED SUBGRAPH OF FIG. 3.1 FIG. 3.3 POINT-SYMMETRIC GRAPH NOT LINE-SYMMETRIC 94 FIG. 3.4 GRAPH WITH ALL CLIQUES NON-SIMILAR 107 FIG. 3.5 A PATH OF DERIVATION OF NON-SIMILAR CLIQUES 120 FIG. 3.6 124 DERIVATION TREE FOR K (3,3,3,3) FIG. 3.7 DERIVATION TREE FOR FIG. 3.6 125 5 FIG. 4.1 GRAPH CONTAINING SUBGRAPH K(1 ) • 136 ACKNOWLEDGEMENT I wish to Mowshowitz, for thank my supervisor, h i s assistance both Professor A. academically and 0 financially thesis, and attention to i n the development also Professor and E. presentation preparation my wife, of this Patty, f o r her manuscript. this L a w l e r f o r b r i n g i n g t o my the importance o f t h e problem. thank of Finally , assistance I wish i n the 1 CHAPTER U INTRODUCTION 1. 1 DEFINITIONS AND NOTATION The purpose of this thesis will be to examine which search for maximal particular the manner subgraphs will be in procedures complete subgraphs cf a graph. In which explored algorithms in order enumerate these to discover why such algorithms have an exponential computation time. The of detecting the existence cf a complete particular order will also be explored and an be proposed which uses a different efficiency of the procedure subgraph cf a algorithm will method enumeration. This technigue i s instrumental in problem from that of improving the of clique detection, and fcr a particular class cf graphs the algorithm can be shown to have a polynomial computation time. First we shall provide a fairly definitions pertinent to the problem at there complete list hand. Unfortunately, i s no universally accepted terminology in graph theory and for this reason the author has chosen his definitions be of compatible where possible with the widely known text of Harary [43]. The definitions will also serve to introduce notation to the to be used in the body cf this text for the concepts defined. DEFINITION 1.1: A jjrach G consists of a vertices together finite set V (G) of with a prescribed subset E(G) of unordered pairs of elements from V (G) called the edges of G. If <u,v) is 2 an edge of G then the vertices u and v are said to be adjacent. DEFINITION JL2J. The degree , djyj. , of a vertex v in a graph G is the number of vertices in G adjacent to v. DEFINITION 1.3^ A labelling, of a graph G with | V (G) | = n vertices i s an assignment of n distinguishing labels, one to each of the vertices in V(G). DEFINITION lz.Hl A k-partite graph GJm £n>2l... tE^l i sa graph whose vertices can be partitioned into k blocks V , V ,...,V such that for any two vertices u and v in the same block (u,v) is not an edge of G (m ,m ,...,m ) . Given such a partition m. 1 2 k i denotes the number of vertices i n block 7 . i DEFINITION 1.5: A complete k-partite g.rap_h K (m , m , . . ., m ) ~" x <. is k a k-partite graph such that for any two distinct blocks V^, and any vertex u in V , and any vertex v in V , (u,v) i s an 1 *~ edge of K(m ,m ,...,m ) . DEFINITION V.62 The chromatic number^ %1GJ_X of a graph , G the minimum number of blocks V , 7 ,...,V 1 2 is . possible in any A(.GJ partition of V (G) such that for any two vertices u,v in the same block (u,v) i s not an edge of G. DEFINITION l-.ll The p_oint independence number ,/3^(G), i s the largest number of mutually non-adjacent vertices in a graph. DEFINITION 1.8; The complement , G, of a graph G i s a graph such that V ( G) = V(G) and (u,v) i s an edge of G i f and only i f (u,v) i s not an edge of G and u*v. DEFINITION 1.9: Let the vertices of G be labelled 1 through n where |V(G)| = n. The adjacency matrix , A (G) , of the graph G i s a (0,1) matrix such that a =1 i f and only i f (i,j) is an edge ij 3 of the labelled graph. The adjacency set ,A(v), cf a vertex v is the set of vertices adjacent to v. J^EFINITIQN _1«IQl A unicligual vertex v of a which belongs to exactly one cligue graph of G is one the graph (see is a graph with definition 1.14). DEFINITION -1^11: A subgraph H of a graph G V(B) V(G) and E(H) E (G) such that i f (u,v) i s an edge of H, then (u,v) i s an edge of G. DEFINITION 1.12 For any set of vertices W £ V ( G ) , Sil^2I§£li G ^ is t n e maximal subgraph the induced of G with vertex set V(G^)=W. That i s , for any u,v in V (Gw) (u,v) i s an edge of G^ i f and only i f (u,v) i s an edge of G. DEFINITION 1^.12}. A complete subgraph of order k of a graph G i s a subgraph defined on k vertices of V(G) for which any vertices in the two subgraph are adjacent. A trianale cf G i s a complete subgraph of order 3. DEFINITION J i l i i A cligue C of order k complete subgraph of G V(G)-V{C) adjacent to a l l vertices of a graph G is a for which there exists no vertex in in V(C). Cliques are therefore maximal complete subgraphs. DEFINITION 1.15: An automorphism of a graph G i s a permutation of the vertex set V (G) which preserves adjacencies. DEFINITION 1.16: The collection of a l l automorphisms cf a graph G forms a group called the automorphism grogjE ,P(G), of the graph G. DEFINITION 1.17: Two vertices u,v of a graph G are similar i f there exists an cc in T(G) such otu=v. DEFINITION 1.18: A graph G i s points-symmetric i f for any two 4 vertices u,v in V(G) there exists an «c in T(G) such that<*u=v. DEFINITION 1.19: A graph G i s line-symmetric i f for any two edges (u ,v ) , (u ,v ) in E(G) there exists an oc in r (G) that either m ul=u2 and ^ 1 =v 2 DEFINITION ls.2.0.1 fin considered efficient requirements such or ocu^v^ andocv1=u2. algorithm which operates on a graph i s i f the computation time and storage can be expressed as functions of n bounded above by a polynomial in n, where n i s the number of vertices in the graph. DEFINITION 1. 2,1: The Hadamard product , AXB, of two matrices A and B i s the matrix C where c = a. . • b . . . 5 1.2 INTRODUCTORY REMARKS The focal point cliques in graphs. of this thesis is the detection of The importance of this topic from both a graph theoretic and an applications standpoint will manifest i t s e l f in the historical survey of the next section. There are several variants or special cases of this subject which can be considered. We will confine ourselves to a study of the following four problems: 1. ) The enumeration of cliques of a graph; 2. ) Detection of the largest clique of a graph; 3. ) Determination of the non-similar cliques cf a graph; 4. ) Determination of cliques of a specific order. The object in each case will characteristics of the problem, be to study the special and then to exploit these features to develop useful procedures for achieving the gcal. In addition, an effort will be made to determine bounds on the efficiency of several such procedures and to seek insight into intrinsic properties estimating the best of the methods used which might aide in that one can expect designed to solve these problems. from procedures 6 1.3 HISTORICAL SURVEY The study of methods subgraphs originated efficient and for detecting principally objective with maximal complete the representation search for an of the structure of social groups. Such groups can be modeled with a sociogram, a graph which characterizes the responses of individuals in a sociometric test which requires that each participant some subset of the group to which he wishes to belong (Moreno [62]). Pioneers in a mathematical treatment were Forsyth and Katz [33] of this and operations of column permutations could be applied to achieve some more desirable form in could problem who proposed representation of sociograms as matrices to which the elementary row specify be more easily which the groupings observed., Such of individuals a representation was subseguently refined by Luce and Perry [54], Festinger and Luce [55]. A matrix of the properties of (0,1) matrix, effectively the adjacency sociogram, a graph was utilized, derived from and the characterizing the structure of distance the square, cube, and higher powers of the adjacency matrix were used as in [31], the group. indicators Festinger observed that a unicliqual member i belonged to a clique of k persons i f and only i f the ith diagonal element in the cube of the adjacency matrix was equal to (k-1) (k-2) . It was therefore a simple matter to find the order of the clique to which an individual belonged, provided he was clique. Weiss and Jacobsen Luce, Perry, and Festinger to [77] a member cf only one applied the techniques of analyze the relationships individuals and their tasks in business organizations. of 7 As a result "cligue" cf its sociological beginnings the became synonymous with the graph theoretic notion cf "maximal complete subgraph". The apparent graphical method of representation usefulness of of the general need manipulationg the sccicgram particular, the cliques techniques for and the of socicmetric data was suggested by the early results. However, investigators aware word more powerful became methods of i t s associated matrix; in of determining in an efficient but general manner suggested for by the studying model. The social importance groups thus of motivated exploitation of a graph theoretic approach to the detection of cliques. Because of the finiteness of the problem, there exists a naive, "brute force" method for finding the cliques of a graph with n vertices—merely examine vertices for k=1,2,...,n. This sets, an each clearly obviously involves looking at intractable number for even moderately large graphs. Because sociograms could be very large, analysis by cligue membership would be practicable cnly with the advent cf much better detection procedures. Early techniques depended primarily on a "bag of tricks" which incorporated knowledge of the induced by the particular exhaustive search or estimation longer possible. In 1956 structure application, when Harary cf and the resorted simplification and graph was to no Ross [41] proposed a method wh.ere.by a graph was systemically reduced to components each of which contained at least one unicligual vertex and for 8 which the observation of Festinger, previously mentioned, could be used to find i t . The reduction procedure employed the following algorithm: 1. ) I n i t i a l l y , let the graph G i t s e l f be the component under consideraticn. 2. ) If v is a consideration, then unicligual vertex in a component under the subgraph induced on v and those vertices adjacent to v i s a clique. Define a new component for consideration by deleting from the current component v and a l l unicligual vertices adjacent to v. 3. ) If the component under consideraticn contains no unicligual vertices, l e t v be a vertex belonging to a number of trianqles of that component. Define minimal two new components for consideration: a. ) the subgraph induced on v together with those vertices adjacent to v, and b. ) the union of subgraphs induced on the set of vertices not adjacent to v together with their respective sets of adjacent vertices. Since the union of these two components can equal be shown to be to the current component under consideration, delete i t from the l i s t of components to be considered. 4. ) Choose a component from the l i s t and go to 2 until the l i s t i s exhausted. A procedure for improving the efficiency of the algorithm was also suggested by Harary and Ross by which a component 9 containing not more than three processed at step 2. The cligues Harary could and be ccmpletely Eoss algorithm thus provided a more practical method than the naive algorithm identifying cligues in sociometric fcr data representable as a (0,1) matrix. Subsequent researchers have pursued techniques of the information in an about group. vertices development cf for -the analysis of variations and generalizations model social the in the By a effort to include structure associating sociogram, in a obtain more representation cf a weights the and with degree or relationship could be characterized (see for each pair of strength of the example Hubell [45]). By utilizing an adjustable threshold value, a hierarchy of graphs could be established, each graph being defined cn the same set of vertices but consisting of whose weights Johnson exceeded [47], hierarchichal Boyle structure particular Peay determining and the the [6B] threshold [7] of only value. have groups cligue by Dorien [20], this an the method. In algorithm structure representation. A family of sets of vertices call edges investigated recently developed hierarchichal those cf which for such a we shall "potential cliques" is generated from a graph G of order n and associated with a threshold value t. 1. ) I n i t i a l l y V(G) is the only potential clique in the family. Denote by w (v ,v ) the weight of edge (v ,v ) in E(G). i 3 1 j 2. ) If there exists an edge (v ,v ) in E(G) such that i j w (v ,v ) < t, and i f there exists a potential cligue C induced i j 10 on a subset of k vertices containing v potential cliques C , C 1 2 and v then two new are induced on C-fv } and C-{v }. i j 3. ) C i s deleted from the family and C 1 1 and C are added 2 to i t provided neither i s contained in some current member of the family. 4. ) When it is the case that w(v , v )>t for any pair i J (v^v.) in any potential clique, then the family constitutes the set of cliques associated with the threshold value t . It is clear successively adjacent refine pairs hence such sets "potential that the sets until of effect of of vertices the algorithm i s to which contain non- such refinement i s no longer possible; vertices cliques". To can. be use the considered tc induce algorithm to find maximal complete subgraphs, a threshold value of 1 i s assumed. Yet another important that stimulated the sociological study of property cliques in graphs tendency for individuals to "cluster" into groups way that members of a cluster retained in outside of techniques groups was the such a a high degree of similarity, while different clusters characterized properties of dissimilar (Davis [15]). Cluster theory also had applications Sociometry. Abraham to solve interconnections of [1] has used clustering the problem of minimizing the number of electrical assemblies, a problem also explored by Lawler [51], while Bonner [6] applied such methods to medical taxonomy problems. Bonner's efforts resulted in the design of an apparently efficient algorithm, so considered 11 because his method eliminated the generated vertex subsets need for comparing newly with previously generated sets for containment, a necessary part of the procedures employed by Harary and Ross or Peay. As a conseguence, Bonner's popularity and was used recently algorithms. proposed in algorithm comparative studies of extraneous vertex some with more However, Augustson and Minker [5] showed this efficiency was often illusory number enjoyed subsets since a large could be generated in certain cases. Besides the interpretation explored application of sociometric application retrieval. Early retrieval were applied made such of is to data, the developments by cligue technigues the cf the information area [57] while the problem to tc the most recent widely problem in Meetham detection of document Abraham of [2,3] thesaurus construction. In addition, the previously cited work of Bonner was also an application of cligue detection methods tc information retrieval problems. More recently, Gotlieb technigues to represent the between index and Kumar [36] used clustering degree of semantic terms used in the classification of documents. The thesaurus problem was further explored Minker who association employed a by Augustson and new algorithm developed by Bierstone which was shown by empirical methods to be better than that cf Bonner in empirically many by cases. Bierstone's Mulligan [63] with algorithm two was compared recently developed 12 algorithms for clique enumeration, one by [8] and Bron the second by Corneil (see Mulligan [63]). From this study Mulligan concluded the Bron-Kerbosch superior. these and Kerbosch The problems algorithms will and be algorithm to be techniques of comparing some of discussed further in the next chapter. Several important graph theoretical problems are related to the detection of maximal complete known (see for example subgraphs. Nordhaus It i s well [66]) that the pcint independence number of a graph G i s egual to the order of the maximal clique in G, chromatic number of G independent efficient cliques means invariants. A i t s complement. In addition, the i s equal in G. At of computing procedure to the minimum present either for example theory, and [14,19,26]) hence serves there i s no known of these graphical for determining either number would provide an important tool in pursuing (see number of the host cf problems that exist in chromatic graph to emphasize the importance of studying complete subgraphs. As a result the literature results relating tc the existence abounds with a variety of of complete subgraphs in graphs. As an example, one may consider the celebrated problem of Ramsey [69] concerning the smallest number of vertices that a graph order m, may or have an and contain either a complete subgraph of independent set of k vertices. The determination of such a number, r(m,k), i s an unsolved problem for general m and k, although the published results include 13 the calculation cf specific values, bounds [21,29,35,37,38,48]. It theorems, and i s evident that an efficient cligue detection procedure would providing existence provide a useful tool by a faster way of examining graphs for their complete subgraphs. Perhaps the results of extremal graph by Turan[74,75], contributed most theory, pioneered directly to the clique detection problem. In 1965, Moon and Moser [60] verified by the maximum number earlier established direct methods of cliques possible in a graph, a result by Erdos [ 27] through an inductive argument: The maximum number of cligues in a graph with n vertices i s : a. ) 3 n/3 i f n = 0 mod 3 n b. ) 4-3^ "^ n /3 i f n = 1 mod 3 2 c. ) 2-3 ( " )/3 i f n = 2 mod It was also shown that 3. the graphs which contain the maximum number of cliques were: a. ) the complete nr-partite graph K(3,3,...,3) if n = 0 3 mod 3 b. ) the complete (n-1)-partite graph K(3,3,...,4) i f 3 n = 1 mod 3 c. ) the complete (n*1)-partite graph K(3,3,...,2) i f n = 2 mod 3. These graphs Moser graphs. shall henceforth be referred to as Moon- 14 It follows from this result that, as a number of i t s vertices, at least once of the a graph may contain an exponential number of cliques. Hence any clique function algorithm which (ie. Sequential) examines each may be expected to require an exponential amount of time to enumerate the cliques of a graph. Although from a graph theoretic point of view this disheartening result, or nearly graphs appear the to a an examination of graphs which contain such numbers of cliques reveals that a l l cliques same is are of the same order. Further the cliques in such be homogeneously distributed over the vertices, each vertex belonging to the same or nearly the same (again exponential) number of cliques. From a practical pcint of view, the number of edges in the graphical models generated by the empirical data cf the applications suggests conditions are unlikely to occur. The that sparseness adjacency matrix of such graphs could therefore be used rough a priori test of the such of the as a number of cliques a detection algorithm might be expected to find. Cliques in graphs having a maximal number of cligues all of the same size. number of different vertices is bounded their lower bound on bounded below sizes by Moon and Moser also showed that the of above this are cligues in a graph with n by n-log (n). Erdos [30] improved number by showing that it was n-log (n)-H (n) -0 (1) , where H(n) denotes for some k the k-fold iterated logarithm, 0(1) i s an unspecified constant. log log.. .log (n), and 15 The r e s u l t s of Erdos, algorithm for Mccn and Rcser suggest that an the enumeration of c l i g u e s i s an example of an exponential combinatorial effort the theory of computation tc e s t a b l i s h a h i e r a r c h y in of complexity by the c l a s s e s of c o m b i n a t o r i a l lack of algorithms f o r processes. success a large This work that c o m b i n a t o r i a l recognition in was problems. no of Using such efficient expressed a efficient combinatorial by Ccck [11] be motivated proving important pioneered can and an who as showed language representation certain algorithms have jet been were shown to be e g u i v a l e n t i n the sense that each reducible to the s a t i s f i a b l e . The expanded by class Lawler exists of problems [52,53] a polynomial so reducible Karp [49] and and has p r i n c i p a l r e s u l t i s that bounded algorithm strongly suggests that the was been includes the either f o r each problem i n the c l a s s , or f o r none cf them. However, the nature problems was problem cf whether a well formed formula c l i q u e d e t e c t i o n problem. The there algorithms, finding number problems problems f o r which devised process. There has r e c e n t l y been of the l a t t e r case i s i n f a c t true. The study of the d e t e c t i o n cf c l i g u e s may t h i s question s i n c e Howshowitz [ 6 3 ] formed formula with k c l a u s e s can such a way has t h a t the w e l l formed formula i s s t a t e d by Karp [49 1, and shewn that be represented enly i f there e x i s t s no complete subgraph result help t c r e s c l v e a well by a graph i n i s a tautology i f and of i s implicit order k. in Ccck. This [11]. 16 CHAPTER 2 i ENUMERATION OF CJ.IO.UES 2.1 INTRODUCTION In order tc gain insight into the complexity of the problem of cligue enumeration, we shall examine the algorithms described in Chapter 1 in some detail. The technigue evolved exploits the way in which the vertex subsets are determined during an iteration of each algorithm. A cligue enumeration algorithm will then be proposed and shown to be more efficient than those previously examined. The because algorithms of Harary and Ross and Bonner were chosen of their apparent availability differences in the literature, of approach, and their their freguency of citation as references in subsequent literature on the subject of cliques in graphs. In addition, algorithm i s considered precedent for algorithms. the Peay»s the Harary and Ross by this author to be the historical development algorithm of was clique chosen enumeration because i t is comparatively recent, offers a conceptually simple approach to the problem and there efficiency. this to be no analysis of i t s Although not yet readily available at the time of documentation, included appears the Eron-Kerbosch algorithm has been because of i t s superiority over some of the previous methods as determined by Mulligan [64]. Although the algorithms cited employ apparently different techniques primarily to achieve their one of detail, goals, this difference for an examination is reveals the 17 following common features: 1. ) Each algorithm refines determined vertex subset or decomposes a previously to obtain new vertex subsets each containing at least one cligue of the original graph,ftchoice i s made of a vertex from the i n i t i a l vertex adjacency old subset, and its properties are used to define the new subsets. The vertex subset i s subsequently removed from further consideration. 2. ) Each alqorithm has the property that every clique of the original graph i s contained i n exactly one of the possibly several vertex sets available for consideration at any stage of the algorithm. 3. ) Each algorithm employs p i t f a l l s of multiply defining clique some complete situation can occur a cligue subgraph whenever some device to avoid the or including as a which is not maximal. Such a vertex subsets are generated which are properly contained in other vertex subsets, or which contain complete subgraphs maximal on that vertex subset but not on the original vertex set. In the remainder of this chapter i t will be seen that i t is precisely these properties of the clique algorithms which profoundly procedures. affect enumeration the efficiency of such 18 2.2 MATHEMATICAL ANALYSIS OF ENUMERATION ALGORITHMS To obtain some means of estimating analytically the computational time required by each algorithm, the computation may be divided determining into two parts. The f i r s t the effort required consists of for one iteration of the algorithm; that i s , the time required to determine new vertex subsets from an old one. The second involves the determination of the number of iterations required to find a l l the cliques of the graph. We shall required see that the number of iterations i s related to the number of vertex subsets generated during the execution of an enumeration algorithm A. ^ DEFINITION 2^1^ A vertex subset W i s directly derivable from a vertex set V using algorithm A i f during some iteration in the execution of A, W i s determined from V. DEFINITION 2.2: A vertex subset » i s derivable from a vertex subset V using algorithm A i f there exist subsets IL ,0 ,...,U. such that U ^ i s directly derivable from V, U i s directly from D\ for i < j , and W i s directly derivable from U\ . Since the set W i s contained in the set V from was derived, using these definitions i t i s easy to see that derivability induces a partial ordering vertex subsets which i t on the set of all generated by algorithm A during the course of enumerating the cligues, provided we add the stipulation that every subset i s directly derivable from i t s e l f . This partial ordering may be represented by a directed tree whose root represents the vertex set of the graph, and whose vertices represent the vertex subsets generated by 19 algorithm A. Vertex u, representing subset U, i s connected a directed edge to vertex wf representing subset W, i f H i s directly derivable from a. It i s clear dependent upon the enumeration that this tree i s algorithm used and the labelling of the graph, hence we shall always any derivation by tree a labelled associate with graph G and an enumeration algorithm A. DEFINITION 2.3: A vertex subset W will be said to be redundant i f i t i s properly contained in some vertex subset V from which i t was not derived. It i s interesting to observe that the behavior of clique enumeration algorithms to be described subseguently can be compared to a tree searching procedure, the tree in this being the derivation tree case of a graph G as determined by an algorithm A. The determination of methods for minimizing the development of redundant nodes in the determination of the derivation tree may then be likened to the problem of finding suitable tree pruning technigues. been noted by Mulligan [ 6 4 3 , exploited i t in their while algorithm This similarity has also Bron which and Kerbosch have used a "branch and bound" technique on the derivation tree. To determine the computational effort required single iteration during a of each enumeration algorithm, a technique developed by Corneil [12, APPENDIX A ] will be used. The tasks performed within each algorithm will be grouped into blocks. Each block will be defined from a set of basic operation types determined by implementation considerations. These operation 20 types and their associated time constants are given in TABLE 1 of APPENDIX A. Host of the instruction types shall elaborate briefly, operations. The need indeterminate are self-explanatory. We however, on for such the "push" and "pop" operations arises save implemented as a data useful obtain storage structure is a linked l i s t with additional storage added on as required. The essence of the "push" to most such subsets, i f they are needed for some subsequent processing, on a push-down store. Such easily the amount of storage required for saving partially determined vertex subsets. It will be seen to be to from operation i s for the current vertex subset tc be saved together with a link address pointing to the next data item in the store. The top item i s always directly accessible a pointer to the top of the through store . The "pop" operation deletes the top data item of the store and resets the pointer to the store top so that i t points to the next data item which thus becomes the new top of the store. In most cases i t is difficult expression for the computation time consideration. This is due to of obtain the primarily to an exact algorithm under difficulties determining the number of vertex subsets having number of vertices. determination of such determining (such as the particular the A an extent "first iteration. second factor expression is of edge" For a particular complicating the in difficulty the of search to satisfy some condition in Peay's these algorithm) during a reasons our principal goal 21 will be to determine the order (as a function cf the number of vertices in the graph) of an iteration, and the number of vertices in the derivation tree. In determining estimates of the reguired computation time for the enumeration of cligues by various methods, the primary graph to be considered will be the complete k-partite graph with m vertices per block, denoted K(m,m,...,m) or K (m ) . The reason for such a choice i s that every k-partite graph i s a subgraph of K (nr*) and the derivation tree of any other k- partite graph i s smaller than the derivation tree of K(m ) . As a consequence of this choice of graph for consideration i t is possible to determine the number derivation tree using each algorithm. of vertices in its 22 2.3 ANALYSIS OF THE HARARY-ROSS ALGORITHM As mentioned in the introduction of this chapter, the Harary-Ross algorithm was selected comparative for consideration study of some cligue detection algorithms because of i t s historical precedent. It i s interesting despite in a to note that i t s frequent reference i n subseguent papers on clique detection examined by this author, no mention had been made of an error in the alqorithm until Harary himself referred to i t s existence i n a recent paper on the application cf graph theory to the Social Sciences [UU], Harary observed that although the method found a l l the cliques, i t also subgraphs included in the set of maximal complete subgraphs as well. These "other" subgraphs are in fact complete are not maximal the problem, possibly attempt made to of the discovery of the a modification to correct the defect i s fairly simple. When a complete subgraph has been there which due to the existence of more efficient algorithms by the time error, subgraphs and hence each i s contained in some clique. Although there appears to be no subseguent correct some "other" exists a vertex discovered, determine i f adjacent to a l l vertices subgraph. If not, a cligue has been in that found. Otherwise it is obvious that such a subgraph i s not a cligue and i s therefore deleted from further consideration. This modification i s included in the subseguent analysis. In the interests of improving the efficiency of their algorithm, Harary and Ross cited in the historical modified the general procedure survey by defining a procedure for 23 determining whether or not the subgraph induced on the vertex set under consideration had at most three cligues. If so, then such a subgraph method (rather could than be completely processed by a direct the recursive method of the general procedure) and the cligues of the subgraph determined. However in most cases only procedure i s required subgraph one additional iteration of the general to determine a l l the cligues of a containing at most three cliques. In addition such a scheme requires whether a additional subgraph computation, has at namely determining most three cligues, during each iteration of the general procedure thus increasing the overall computation time. * For overall efficiency these reasons, the contribution to i s small and for simplicity has not been included in the analysis or implementation of the Harary-Boss algorithm. 2.3.1 NOTATION FOB THE ALGORITHM G^ The subgraph currently under consideration. VJGJ. The vertex subset of the subgraph G. The adjacency matrix of the subgraph G. rj. An array such that r ( i ) contains 2 the sum of a l l elements in row i of B= A XA , the Hadamard product cf the adjacency matrix and i t s sguare. d^ An array containing the degrees of the vertices in the subgraph G. Bi a r(i). pointer to the vertex i in G with minimum row sum 24 I K The number of vertices in the vertex subset V(G). 2.3.2 THE ALGORITHM STEPO: I n i t i a l l y place V(G) on the stack. STEPli Choose a vertex set V(G) from the stack of vertex subsets to be considered. If stack empty then go to step13. STEP2: Compute the matrix product 2 A XA where "X" i s the Hadamard product. 2 STEP3 : Let B= A XA . Compute the row sums of B as well as the degrees r (1) ,r (2) ,... , r (n ) d (1), d (2) ,...,d (n ) of the vertices in the subgraph G. STBP4: Set i to 1 and m to 1. STJP5i I f r (i) =d (i) (d(i)-1) then go to STEP10. STEP6: i f r{i) < r(m) then set m to i . STEPJi set i to i+1. I f i<n then go to STEP5. STEP8^ No unicliqual vertices exist. Therefore define two new vertex subsets V (G ) and V{G ) as follows. V(G ) i s the set of 1 all vertices adjacent 2 to 1 m, the vertex of minimum row sum r{m). V(G2) i s the set of a l l vertices together with not adjacent to m, a l l vertices adjacent to at least one of these vertices not adjacent to m. STEP9: Store V (G^) and V(G2) on the stack of vertex sets tc be considered and go to STEP1. STEP 10: A unicliqual vertex i intersection of a l l rows has been of found. the adjacency Compute the matrix of the original graph corresponding to vertices adjacent tc i . 25 STEP j 1_: If the result cf STEP10 yields an empty set complete subgraph determined then the by i is maximal. Hence print i and the set of vertices adjacent to i . STEP12:. Delete from V (G) vertex i and a l l vertices adjacent to i which are also unicligual. Place V (G) on the stack of vertex sets to be considered and go to STEP1. STEP13: a l l vertex sets have Been processed. Therefore stop. The tasks to be performed by the fiarary-Boss algorithm may be logically divided into four blocks. The function of block 1 i s to compute the degrees of the vertices of the graph and to 2 compute [A(G)] X A (G) and determine the row sums of this matrix. Block 2 uses this information unicligual vertex relationship r sum, and d found, two by finding v a row the degree of the vertex. I f such a vertex i s not new subgraphs are determined in block 3, one of Otherwise we the other saved for further proceed to block U to search for a induced on the unicligual and those vertices to which i t i s adjacent. If the complete subgraph is a cligue i t i s vertices for is the corresponding complete subgraph of the graph G vertex search vertex which satisfies the = d • <d -1) where r which i s returned to block 1, processing. a to printed, a l l unicligual in the discovered complete subgraph are deleted from V(G ) and the subgraph G i s returned to block 1 i processing. for further 26 BLOCKJ 1 pop V(G) 2 i <-- 1 3 r. <— 0 4 j <-- i+1 5 -> r . <— 0 3 6 substr (a^ , j,0) : 0 7 a. <— a. +1 8 a 9 k <-- 1 10 s <— 0 X 1 <— a.+i 4 3 11 3 'S <— substr (a . a.,k,1) • s 12 k <— k+1 13 15 r . <— r. *s x x r . <— r. +s 16 j <-- j*1 17 j : n_ 18 i <— i+1 19 l 14 3 3 : n, The parameter n i s equal to the number G the vertex* set V(G) , of vertices of and a (i) denotes the adjacency set cf vertex i in G. BLOCK 1 dominates algorithm; depend the computation in the Harary-Boss the other blocks of the procedure will be seen to linearly on n_. For this reason we defer description until we have further examined BLOCK 1. their 27 In general expression for Harary-Ross the is difficult computational algorithm. determining satisfied it the This number . I t is of also to obtain effort is times to branch subsets that by the difficulty in conditions even complete k-partite graph, to determine the explicit reguired due difficult, an are for an arbitrary number of vertex are generated with a particular distribution of vertices over the blocks. Instead, we shall consider the behavior of the Harary-Ross algorithm when finding the cligues k of K(3 ) . A similar procedure could also be adopted for each of the other two types of Moon-Moser graphs. An example of the derivation tree for K(3 ) is given in Fig. 2.2. Each the tree has node of been labelled according to the distribution of vertices among the blocks of the induced complete 3-partite subgraph which i t represents. Let G be an induced subgraph defined on a vertex subset obtained during the execution of and suppose blocks G with 2 containing 3 i,+i +i =k and 1 2 has the i ^ blocks vertices vertices. per 1 2 Necessarily 3 algorithm, with 1 vertex per block, ±2 block, i.,+2i +3i = n „ . 3 Harary-Ross The and i ^ blocks i t is number the of each case times that that Lr execution passes from line 6 to line 7 during the execution of block 1 is matrix of G complete egual to the number of one's in the adjacency above the diagonal. Since the original f one's in G is therefore (i , i , i )= [ (i +2i +3i ) G 1 is k-partite every induced subgraph containing at least one vertex per block i s also complete k- partite. of graph 2 3 1 .2 3 2- number given by f Q ( i ^ i ^ i ^ ) where (i +ui +9i ) ]. 2 1 The y 28 Hence steps 7 through 15 of block f^ ( i ^ i ^ i 1 will be performed ) times and the computation time of block 1 during one iteration i s given by T (n^ ) where T (n ) = C»+ n C»+n (n -1)C»+f (i , i , i )(C»+ n C* ) 1 G 1 G 2 G G 3. 6 1 2 3 4 G 5 with C* = t +t 1 1 2 C 2 = 3 t i + 2 t 3 C* = 2t 1+ t 3+ 2 t 4 + t 8 c* = 6 ^ + a t . 1 J 4 C 5 = 2 t l + 2 t 3 + t 4 + t 6 + t 8 The constants t ^ f i=1,2,...,10, represent the cycle times for the various operations as specified in APPENDIX A 29 BL0CK2 20 i <-- 1 21 m <— 1 22 \ <V >- -*r. 23 r : r— i 24 -*-Block4 1 m 25 m <— i i <— i+ 1 26 n <r- I Block3 The computational time for BLOCK 2 i s given by: 2 2 T (n) = C +g C 2 1 l 2 where 2t, C| = 3t1+2t3+3t^+tg It was assumed that line 24 which points to the minimal r(i) was executed one-half the time. The term g^ i s egual to one less than encountered. the label of the f i r s t unicligual vertex i 30 BLOCK3 27 VtG^ <— a m V(G) 28 substr (V(G^),m,1) <— 1 29 i <— 1 30 V(G2) <-- -7(6^ 31 i <— i + 1 31 substr (V (G1) , i , 1) : 0 32 V(G2) 33 i <-- i+1 < 34 i : n. 35 push VCG^VfG,,) 7(G2) a± The computation for one iteration of block 3 i s T 3V ( = c i + ( n C3 a G~ s = 4t +2t 1 C 3 2 ( i ) , C 1 n C |' W h e r e *t. *t +t 2 6 7 8 rt = t +t , l = t 1 6 ' *2t, +t c 3 2* 1 3 4 and d (i) i s the degree of vertex G on V (G). 8 i in the subgraph induced BL0CK4 36 37 C <— a . x substr (C,i,1) <— 1 38 T <-- T -T 39 k <-- 1 40 substr {C,k, 1) 41 T <— T a,. 42 k <— k+1 <- 43 A : 0 — k : n 44 T : 0 -i 45 print C 46 substr (V (G) ,i,1) <— 0 47 j <-- j*1 48 49 ••-substr (a i # j , 1) : 0 50 i 3 substr (VG) , j , 1) <— 0 51 j <-- j+1 52 3 : n, 53 push V (G) The computation time for BLOCK4 i s T 3 (n G ) = C*+dG(i)C*2+nGC3*bC*-iC*f the constants being given by Ct 2 - l**3-V*6 2t C C 5 " V*8 3 i m . t l , t 3 t 2 \ * t » and h i s the number of unicligual vertices in the cligue. 32 For K{3 ) the next section will show that there are J[_(3 2 3) components which do not have unicligual vertices; hence the value of g. in block 2 for each of these vertex sets i s n „ , 1 u +C and therefore the maximum value for n (nQ) i s cf 2 Q* For blocks 3 and 4 d^(i) and h are also bounded by n^ and hence T„ (n0) and T . (n_) are also linear polynomials with respect to n^. , T (n^, ) i s thus clearly the dominant term in the computation time for one iteration of the Harary-Ross algorithm and i s a polynomial of order n*. 33 Fig. 2.1: U 3, 3, 3 ) Ill 35 2.3.3 NUMBER OF VERTEX SUBSETS EXAMINED BY THE ALGORITHM From arbitrary an examination graph using example Fig. 2.2) i t components or of the the Harary-Ross is easy vertex sets derivation to that tree of algorithm determine the (see for number the of will be generated. This i s because, given that the number of cligues in the graph since an is N, derivation tree from this algorithm i s binary, the number of nodes i s 2N-1. The nodes of the derivation tree of K (3 ) can be separated into two parts according to the type of processing carried out on the component represented by that node of the tree. In particular, whenever a unicligual is found via block 2, the cligue vertex to which i t belongs i s determined in block 4 and the component corresponding to this cligue is output rather than returned to the processing stack, which consists of vertex subsets yet to be examined further for cligues. For example, i f we examine F i g . 2.2, subgraph induced on vertex block but one and consider the a vertex set having one vertex in every ( eg. (1,1,3), (1,1,2), (1,2,1), (2,1,1)) we see that a l l vertices in the block containing more than one vertex are unicligual. A vertex vertex block, and algorithm a clique returned for would be chosen from this through the computation in block 4 of the would further be determined and a vertex set processing. In the case of the example such a vertex set would have the form (1,1,2) or (1,1,1). For the general complete k-partite graph K(3 ) r among a l l vertex sets generated having unicligual vertices, there is 36 only one vertex set of the form (1,1,1,...,1,3). This i s because such a set i s derivable only from previous sets having either one or three vertices per vertex block. Such set yields three cligues according a vertex to the seguence of derivations given in F i g . 2 . 3 . Of these three cligues one i s reprocessed in block 1 as a conseguence of the previous discussion. Of the remaining cliques, all are derived from vertex sets having unicliqual vertices in some block, one vertex in each remaining blocks, and k 3 -3 two of the hence again according to the previous argument one-half of these w i l l be reprocessed ene further time. The purpose of this discussion has been tc ascertain how many vertex sets processing occurs. k of the 2 « 3 - 1 generated are subject to in block 1 where the major portion of computation This number i s thus k k k 3 -1+1+JL(3 -3) = 2 ( 3 - 1 ) . 2 addition, the number of vertex In 2 sets containing unicliqual vertices, and therefore examined in block 4, i s 2 + 1.(3 - 3 ) . 2 Finally the number of vertex sets not containing a unicliqual vertex and therefore processed i n block 3 i s given by k k k k (2-3 - 1) - 3 - (2 + l ( 3 - 3)) = l ( 3 - 3) 2 2 If we denote by T^T^T^ '^4* e s t ; L i n a t e s o f t n e for blocks 1,2,3,and 4 respectively then an computation time estimate computation time i s given by: k T appx = (f +T_) (l(3 -1)) •3 T, (3 ( 3 1 2 ^ 2 k_1 k -1)) + f.(1(3 *1)) * of the Fig. 2.3 SUBTREE OF DERIVATION TREE 38 2.3.4 STORAGE REQUIREMENTS The Harary-Ross algorithm as defined in blocks 1 through 4 reguires only one adjacency matrix be stored, that original of the graph. The appropriate sub-matrix i s then determined during each iteration by keeping track of the appropriate vertices defining each induced subgraph. Also, since the r and d arrays are pertinent only to the induced subgraph currently under consideration, only one array of size n cf each i s reguired. As a l l other terms are counters or pointers the only other storage ' reguirement i s made by maintenance of a push- down store for keeping track of the vertex sets remaining to be processed. At any iteration we usually define two vertex sets named VfG^) and V (G2) in the description of the algorithm. order their If we position on the push-down store so that VfG.^) i s always chosen f i r s t from the push-down store, then since no path in the derivation tree is of length greater than k, there cannot be more than k vertex sets waiting in the store. Each corresponds to the "other" vertex set paired with that vertex subset in the represented by a node derivation tree. As an example lying consider on the path the derivation of clique (258) in K (3 ) as labelled in Eig. 2.1. The seguence of events is illustrated in Fig. 2.4. For each pair of direct derivations, V (G^ ) corresponds to the " l e f t " derivation, V (G )•to the "right" derivation. Before we can reach the vertex i n the derivation tree labelled (23,456,789) we must have processed (1,456,789) since we have 39 arranged to do this f i r s t . Conseguently a l l cligues containing vertex 1 have been determined and vertex set (1,456, 789) or i t s derivatives no longer appear on the push-down similar argument applies store. A to (2,4,789) and (2,5,7). The cnly nodes remaining tc be processed are (3,456,789), (2,6,789) and (2,5,9) . From a programming point of view i t is most convenient as well as efficient strings. Since to maintain lists of vertices as bit both the vertex sets of the algorithm as well as the rows of the adjacency matrix are vertex lists i tis clear that the storage requirements are given by 2n+C "integer units" of memory plus n(n+k) b i t s . An "integer unit" will depend on the storage conventions for integer representation on a particular system. For our purposes during implementation this is egual to a "half-word" or 16 bits. The storage requirements for the Harary and Ross algorithm i s thus n (n + k)+16 (2n+C) bits where C i s the nuirber of pointers and counters reguired. Fig. 2.U A PATH OF THE DERIVATION TREE 2.a ANALYSIS OF BONNER^S ALGORITHM Bonner's algorithm has generated researchers wishing to employ some interest among the analysis cf cligues in graphs to their particular application because of i t s apparent efficiency since no clique or vertex subset generated need be examined for containment difficulty arose in some previous in the modified component. This Harary-Ross algorithm previously described and i s also inherent in Peay's algorithm, to be discussed next. In addition, Bonner's interesting to examine in a comparative enumeration algorithms because it algorithm i s study cf clique been compared has empirically with the efficiency of more recent algorithms. The approach taken by Bonner is rather different from that of Harary and Ross or Peay i n that i t i s a constructive procedure rather than one of reduction. The method employed i s to build up the vertex sets of the cliques from a set of potential candidates, membership being determined by the adjacency properties associated with each vertex. We describe the steps of the algorithm as given by Bonner [ 6 ] , including a minor correction noted by Augustson and Minker [ 5 ] in their discussion of the efficiency of the procedure. The paper of Augustson and Minker showed that the efficiency of Bonner's algorithm may often be illusory because many complete subgraphs or components may be generated the course of execution only to be discarded at seme later stage. It was discovered that graphs containing large cliques and a few during very small ones several very resulted in an 42 excessive amount cf computation being performed on extraneous components which the algorithm would eventually delete. This generation of extra vertex subsets using Bonner's algorithm i s also present in the enumeration partite of cligues of complete k- graphs, upon which we are focusing our discussion. We shall establish Augustson a generalization of observations and Minker which will then be used tc determine the number of vertices i n the derivation Bonner's made by tree of K (m ) using algorithm. We f i r s t , however, describe the procedure itself. 2.4. 1 NOTATION ' A z_ an array representing the set of objects in the complete subgraph to the present stage of calculation. C 2 a « array of potential candidates for increasing the size of the complete subgraph induced on vertices in A . L : the last vertex of C to be considered ~i~ to the complete subgraph induced on A . for addition S _: the adjacency matrix of the original graph. 2.4.2 BONNER'S ALGORITHM (Augustson, Minker[5]) STEPJ: set i to 1, C to V (G) , A^^ to gS , / L 1 to 1. STEP2: If L i i s not in set C I then set L. to L.+ 1 and go to X I STEP5. STEP3: Set C to { C O S (L ) }-{L. } and A . , to A. l_J {L. }. il x x x xl x x 43 STEP4i set L ^ t o 1^ + 1, i to i+1. STEP5: If there i s an element in larger than L.^ then go to STEP 2. STEP6: Set T to k^. If C^=$ subgraph. Else either thenflj_i s a maximal complete has been found before cr i t i s not maximal. STEP7^ Set i to i-1. If i=0 then stop. STEP8: Set U to be the set of a l l objects in greater than L. . If U £ T then go to STEP7. x STEP9i Set tc L± + 1 and go to STEP2. The tasks divided into performed Bonner's algorithm have been two blocks. The function of the steps performed in block 1 i s tc subgraph by determine i s maximal and whether to find a discovered complete the next component tc be processed. If one i s found, control i s transferred to block which constructs another complete subgraph returning discovered complete subgraph to block 1 for testing. 2 the 4<4 BL0CK1 1 W 2 C. : 0 <— fl i A. I 3 print W 4 i <— i-1 5 stop 6 U <— C 7 substr (U,i,1) <— 0 8 • (0 u W) : w 9 L. <— x L. + 1 > 10 1 The computation time, T^ for block 1 i s given by V i * 'io*" !* c 0 with constants being given by C and value l* = "1**3**4 = 1 i f W i s a cligue, 0 otherwise, and h<i i s the f i r s t of i for which k^,!^, are the vertices of a complete subgraph contained in a cligue not yet found, 45 BL0CK2 10 ->substr 11 C <- 12 substr (Ci lt L 13 A 14 substr(fl^ ^* ±*^) 15 L. i 1 C s ~ i° (Li) & i 1 <- ir 1) <— < 0 i L T L.+1 , 1) : 0 — < — 0 L. +1 — X 1 X 16 i <-- 17 substr (C.. ,1^ + 1^-1. ±) : 0 i+1 Every vertex subseguent to the original exactly once until there are no further included. Therefore loop:17 to 10 i s executed 1 times for an arbitrary graph. For Kfm ^ j) vertices is examined vertices at most there are at most m (k- i there are m-1 vertices not in J C. , namely the other vertices in the block. Since x on such i the in graph i s labelled such that vertices in block labels (i-1)m + 1, (i-1)m*2f... ,(i-1)m*m, then An is occasions loop:17 to 10 i s executed times for each of the next k- j -1 blocks of vertices If upper n-L^ in C\, where j denotes the block to which vertex lu belongs. For each value of incremented to be bound cn j = ^1^ x T2(L.,n),= (m-1)C|+mC2 ) where l = C 2 C m-1 . i have 1j . the time consumed in block 2 i s given by T>> defined as a function of L. and n: C not 2 = 2 3 = 6 t l + 7 W V*3 V 2 V 2 t 8 2 t 8 46 Since the value of h in block 1 i s less than or equal tc n while k-| L. -11 i s maximized when L. i s in the f i r s t block, i t i s clear that the computation for one iteration of Bonner's algorithm i s bounded by n = mk, the number of vertices in the k graph K(m ). 2.4.3 NUMBER OF VERTEX SETS GENERATED To determine the number of nodes in the derivation tree of Bonner's algorithm i t i s necessary f i r s t to establish the following result, a generalization of observations made by Augustson and Minker [ 5 ] . THEOREM 2.4: For m>2 every complete subgraph of K(m ) i s generated during the execution of Bonner's algorithm. Proof: Using Bonner's notation the set Ai complete subgraph of order i defined consists of a on vertices labelled L-.I^ ; the set C. consists of a l l vertices adjacent 1 2 l l to every vertex in A^. if Suppose the vertices of block V. in V (K (m )) are labelled (j-1)m + 1, (j-1)m+2,..., (j-1)m+m, for m>2. If we algorithm to obtain the " f i r s t " clique of K (m ) we obtain the following assignment to A^ and perform the for i=1,2,...,k: Lx= 1 Ax = {1} L = m+1 A2 = { 1,m+1 } L = 2m+1 A = {1,m+1,2o+1} L = (k-1)m + 1 A = { 1, m+1 ,2m+1,.. . , (k-1) m+1 } 47 Let U be a vertex subset of consisting cf a l l vertices with labels greater than L^. Such a subset i s not the vertex set of a complete subgraph unless there i s at most one vertex in 0 because of the labelling of vertices in K(m ) . Therefore, by execution of the algorithm, L^ i s set to L^ + 1 in step 8 and we return to step 2. Since when we i i s determined by the number of vertices in A^ entered step 6, and since every possible value of L^ from i t s i n i t i a l one of (i-1)m*1 up to mk i s adjacent tc L ,L ,...,L. , and also contained in C., i t i s the case that a 1 2 i-l i complete subgraph with vertex set given by A^, 1<i<k*1, i s generated where: 1. ) A^ = { L^,1 2 ,.,L ^ }, 2. ) (j-1)m+1<L <mk, 1<j<i -'• 3. ) L1<L2<...<L . QED We have thus established subgraph that of K(m ) i s generated Bonner's algorithm. The special applying the procedure for m>2 during case m=1 to a complete every complete the execution of corresponding tc graph generates k complete subgraphs as described above in determining the f i r s t cligue of the graph. Since every possible subset U of C contained so Afc in A that the is , clearly no return i s ever made to step 2, algorithm terminates after printing ={ 1,2,...,k}. The derivation illustration algorithm. tree of the vertex From this k for K(3 ) i s given F i g . 2.5 as an sets generated one can clearly by Bonner's see the property cf 48 Bonner's algorithm defined in theorem 2.4. As number a result the , k of components generated by Bonner's algorithm on K (m ) is given by the following: ISJOSJfi 2-.5.1 The number of nodes in the derivation k tree of k K(m ) using Bonner's algorithm i s (1 + m) . Proof: From Theorem 2.4 i t i s clear that every complete subgraph occurs as a node during some stage of execution. The k number of complete subgraphs of order i<k in K (m ) i s egual to the number of ways of choosing i from k blocks V ,V,...,V , and then choosing 1 vertex from each of the i chosen x blocks. k This i s s i / k1 \ . Hence the total number of complete subgraphs i s = ' (1+m) -1. Since the root node of the derivation k • I $ m^kN tree i s not yet included this results in a total of (1+m) i=1 li) nodes. QED. 50 2.4-4 STORAGE REQUIREMENTS The storage reguirements for Bonner's algorithm are similar to those for the Harary and Ross algorithm. Two arrays A and C of length n are required, each element corresponding to a vertex subset which can be represented as a bit string as can the rows of the adjacency matrix S. In addition an integer array of pointers L i s reguired. Two temporary bit strings T and U are also needed in addition to a counter half-word of 16 bits z as the integer unit, 2 i . Using the the storage reguirements are: 3n • 16 (n+1) • 2n = 3n + 18n • 16 bits. 51 2.5 ANALX.SIS O F PEAY^S ALGORITHM The computation involved in the Harary-Ross algorithm was dominated by the computation of a matrix product generation of associated Bonner's a large complete algorithm subgraphs, only of components which were by and later the their deleted. was also dominated by the generation of a number of superfluous generates number and components. Because components which Peay's algorithm are essential to the final determination of a l l cligues, i t i s of interest as i t may have a reasonably small derivation tree. However, the means by which Peay deletes non-essential components results in a large number of additional operations. Specifically, Peay compares each of two newly generated components to an ever growing l i s t of vertex sets of cliques and subgraphs cliques. Thus, for cligues, as function exponential a number a graph of with the which are potential an exponential number of number of vertices, an of comparisons i s required in addition to the time for generation. As will be seen, this off-sets tc a considerable extent the time saved by avoiding the analysis of redundant vertex sets. For modification to the algorithm storage this reason which reduces the amount of required. The extent of this reduction is determined in our storage analysis. The procedure to be obtaining this breadth" implemented for improvement depends on ordering the selection of vertex subsets so as to before we discuss here a obtain a development of "depth of the derivation tree. The stack containing these vertex subsets i s then altered to vertex complete subgraphs. This sets which do not induce contain only those 52 drastically reduces the size of eliminating the growing l i s t of cligues previously being kept there. Instead, a test for cligue that employed in our the push-down membership modification of store by is made, like the Harary-Ross algorithm. These modifications are included in cur subseguent analysis. Before proceeding, we should note however that the inefficiencies inherent in the algorithm as cited by Peay are a consequence of the application to which such a procedure was being put, namely the determination of a hierarchy cf cliques in sociograros. As a rule the goal of a graphical treatment such data i s to assign the "individuals" to one or more of a few sets which i t i s hoped characterize the structure group. of Hence the number of determined by such an analysis difficulties of a cligues is small in of the a social group as and therefore the possibly exponential number of cligues is not pertinent. As cur treatment of cligue detection algorithms is graph information theoretic, we about the have not assumed any a , priori structure of the graph induced by i t s physical interpretation and must therefore be concerned such problems. 2.5.1 NOTATION VJGJL_: the vertex set currently under consideration. gx the subgraph induced on V (G) I K the number of vertices in G. M: the number of vertex sets in the stack. with 53 ajij_2 adjacency set of vertex i . 1J9.-, LL 119.oil G : 2.5.2 newly generated vertex sets. the subgraphs induced on the new vertex sets. THE ALGORITHM STEPOi I n i t i a l l y place V(G) on the stack. gTEPJ: Choose a vertex subset V (G) from the stack of vertex sets to be considered. If the stack i s empty then stop. v v STEP2: Find a pair of vertices ^ » ^ both in (v. ,v .) V(G) such that is not an edge of the original graph. If no such pair exists then go to STEP5. STEP3: define new vertex sets V (G )=v (G)-{v. ], i ' V(G2) V l )"{V..}. = V(G STEP4: l For k=1,2, i f v ( G ) k i sn o t currently on the stack then put V (G^) contained in vertex set on the stack. Go to STEP1. STEP5: V (G) induces a complete subgraph. If there exists no vertex in the original graph adjacent to a l l vertices in V (G) then print V(G) as a clique. In either case go tc STEP 1. The tasks of Peay's algorithm can be logically grouped into two blocks. The function of block 1 subgraph is to examine the induced on a subset of the vertices of a graph G, in oreder to find a pair of non-adjacent vertices. If a pair is not found then a cligue has been discovered and i t is printed. If two vertices, say v and w, are not adjacent then control is passed to block 2 which defines two new vertex sets. Each i s 5H saved for further processing provided i t is not contained in some previously generated vertex set. Control returns to block 1 which chooses another vertex set for examination. r 55 BLOCKJ. 1 ->pop V(G) 2 M <— 3 i <— M-1 1 c <— c - c 5 - * j <— 6 i +1 C <— 7 C a. ->> Block2 •substr ( a ^ j , 1) : 0- 8 j <-- 9 •j : n 10 i <— 11 •i : n 12 •C : 0 j+1 i+1 print V(G) 13 If G i s in fact a clique, then a l l n (n-1) ones in i t s 2 adjacency matrix above the computation time bounded above,by of diagonal block will be with constants i= 3 W W C. = 3 V The 1 for one iteration is therefore T (n ) = C» + n c*+n (n -1)C» c examined. 2 V V 4 t °3 " ' I ' V ^ ' * 1 6 io 56 BLOCK2 14 V(G1) <-- V(G) 15 substr (V 16 V(G2) <-- V(G ) 17 substr (V(G2) f j,1) <~ 0 18 k <-- 1 19 ) , j , 1) <— 0 0 •*V(G1) k 20 V (G1) : V(G^ ^)-i- 21 V(G1) <-- 0 22 v(G2) 23 0 <+• +£- V (G) : V (G ^ 24 V(G2) <-- 0 25 V(G1) 26 i <— i + 1 27 4 i :M 28 V(Gi) : 0 - 29 push V(G1) 30 M <— M + 1 31 V 1(G ) : 0 -«2 32 33 -> exit V(G2) push V(G2) M <— M+1 exit The computation time i s maximized when neither new vertex set i s contained in some previous vertex set. When this occurs 2 the time for BLOCK2 i s T = C2+MC where C 2 1 C2 2 2 1 2 = 7t +2t +2t +2t +2t 1 2 3 = t +t +6t +t t 1 3 4 6 and M i s defined in section 2.5.1. 4 8 57 2.5.3 NUMBER OF VERTEX SETS GENERATED The complete derivation tree for algorithm the is nature K (3,3,3) using Peay's guite extensive. As will be seen this i s due to of the development of new vertex sets for consideration. In order to obtain an expression for the number of nodes in the derivation tree i t will be convenient to consider the development which occurs during the processing of one block of determines vertices. two new discovered which i s consider That i s , since components not an Peay's whenever edge of a the algorithm vertex graph, pair is we shall a l l such pairs defined upon a single vertex block of G. Fig. 2.6 gives such a development for block v = {1,2,3] of K (3,3,3) as labelled in Fig. 2.1. Examination of reveals that three this sub-tree of the derivation tree components are eventually generated with the property that each has exactly one vertex in block 1 and 3 vertices in each of the remaining two blocks. Since, for of these sets, the one remaining vertex is adjacent to a l l other vertices in the vertex subset of that evident component i t is that no further computation will involve that vertex. Thus the induced subgraph of each vertex subset i s to each K(3,3) with eguivalent vertices labelled 4,5,6,7,8,9. The number of nodes generated i s given by the sum of those determined during the generation cf three components, K (1,3,3), from one component, K(3,3,3), and the number of nodes generated in the reduction of each K (1,3,3) (which is eguivalent tc the reduction of K(3,3)). Therefore i f for K (m ) we can determine 58 the number K(1,m of k-1 ) nodes algorithm. of nodes we created in generating m components can obtain a recurrence relation for the number rn the derxvation tree of K (m ) using Peay*s 59 Fig. 2.6 SUBTREE; OF THE DERIVATION TREE 60 THEOREM 2.6: The of vertex sets generated by Peay's k k algorithm to find the cligues of K(3 ) i s 3(3 )-2. v number u V Proof z Let i # 2 f . . . # k b e t h e b l° c J l s o f K m ( ) a n a label the vertices in block V. :(j-1)m+1, (j-1)m+2,... (j-1)m+m. Consider J the sequence of derivations defined by processing the vertex pairs (non-edges) (1,2) ,(1,3), derived t starting from ...,(1,m). The vertex sets ({ 1 ,2 ,3 ,. . . , m }, V , . ..,V ) are respectively ({ 1 ,3,4,. .., m ], V2 ,.. . , Vfe) , ({ 1, U,5,. . . , m }, V2,..., ),...,{{ 1,m},V2,...,Vk) , and finally ({ 1 }, V£ , . . . , Vfc) . If ({1,i,i +1,...,m },V ,...,V ) i s a typical vertex set from this seguence of derivations, ({ 1,i+1,,..,m},v2,...,Vk) and two new sets, ({i , i * 1,..., m}, V2 ,..., Vfc ), are derived by separating vertices 1 and i . The latter vertex set i s deleted since i t i s contained in the previously determined set ({ 2,3,. .. , m },V ,. .., V ). We therefore have a total of 2(m-1) vertex sets determined during this seguence remaining ones of derivations, half of which are deleted, the being those given above. This process i s illustrated in Pig. 2.7. The number of vertex sets considered in reducing one m-1 vertex block of K(m ) i s given by 1+E 2i = 1 + m(m-1) . i=l k m Let a be the number of vertices in the derivation k of K(n>k) block of eguivalent using V(G) Peay's algorithm. Since the reduction of one yields to that m vertex for K{m^~^-) given by the recurrence solution tree relation sets , whose processing i s the number cf vertices i s m m a = 1 + m (m-2)+ma whose m m k k k-1 i s : a = C m + 1+m (m-2). The complete derivation tree k 1 1-m 61 for K(3,3) i s given in Fig. 2.8 from which we obtain 3 m=3. Solving for C 1 3 3 k a ™ with K we get C = 3 and a = 3{3 )-2. QED. - 1 k 64 2.5.4 STORAGE REQUIREMENTS As has been defined observed, required generated storage during exponential its number practicable. Peay's space algorithm as originally for a l l new execution. of cliques For a such a vertex graph sets with demand is an not In our discussion we have described and analyzed a modification to the procedure which eliminates the need for maintaining in the stack the cligues as they are discovered. By developing each path in the derivation possible, the number as far as of nodes placed in the stack i s never greater than the length of the path corresponding tree to the "other" generated, vertex each entry subset of the pair of vertex subsets developed at that stage. Let V(G) be a vertex ,subset complete k-partite graph vertices in (k-i) blocks. derivations by fixinq such that V (G) induces a having 1 vertex in i blocks and m If we one carry vertex out a sequence of in block i+1, say v and sequentially derive new vertex subsets from the set of nonedges (v,w ) , (v,w ) , . . . , (v,w X ) then according to the rC~ -L argument presented in a discussion of the derivation tree for k K(m ) developed by Peay's algorithm, two vertex sets will be added to the stack after such a seguence of derivations. The first consists cf a l l vertices of the original vertex subset other than v. The length of the path in the derivation corresponding tree to this seguence i s m-1, the number of vertices not adjacent ot v. Since there are k blocks k in K(m ), the maximum length of any path i s (m-1)k. However by choosing non- 65 edges as described, after the (m-1)i th node only one of the i+1 vertex subsets have been saved for further processing, hence the number cf vertex sets on the stack i s at most k+1. The only other storage required i s that for the adjacency matrix and a number of integer counters. Therefore, since each entry in the stack can be represented by a bit-string of length n, the total storage reguirements are n(k+1) + n bits. 2 + 16C 66 2.6 A NEW ENUMERATION ALGORITHM The analysis of previous algorithms, while providing expressions for a comparative analysis of sequential clique enumeration algorithms properties desired by a "good" hazards one must properties has also attempt revealed some procedure and some of the to avoid. are characteristic of the In addition, certain of algorithms for explicitly enumerating the cligues of a graph. These algorithms appear to reguire a means of determining the sets of vertices adjacent procedures use this discussed to a given vertex information since all to generate the components to be used in further analysis. This surprising as is net tec any graph i s characterized by this sort of information. However, the adjacency matrix representation cf a graph provides this most simply and directly. The representation of the rows of the adjacency matrix as a string of bits greatly simplifies the computation determining new components. The importance matrix representation reguired in of an adjacency over some other representation i s thus emphasized by these observations. The desirable enumeration algorithm components which do complete properties subgraphs are two-fold. not destroy with possible a seguential First, the existence as l i t t l e Secondly, generate as few such best of effort components cligue generate new of maximal as possible. as possible. The situation i s to avoid the need for determining whether a complete subgraph or component just generated i s 67 properly contained in some other cligue of the graph and reguires some means of choosing just the right set of vertices so that no redundant component i s ever generated. Because the complexity of the possible intersections of the vertex sets of cliques in a graph, i t i s d i f f i c u l t to determine how such covers the minimal vertex independent set of example i l l u s t r a t e s . Consider the vertices labelled a set cf vertices graph as the following graph of Fig. 2.9. The 1 and 4 in the graph constitute a minimal independent set of vertices covering the graph. just a set could be chosen. It i s not sufficient to find either a maximal or a which of vertex set of the It i s clear however that the cligue K (1,1,1,1) induced on vertices 2,3,5 and 6 contains no vertex in this particular minimal covering. Similarly consider the graph in Fig. 2. 10. It has a 5, maximal independent set of vertices labelled 1,3 and of which none i s a member of the cligue induced on vertices 2,4, and 6. Clearly one reguires the prescient ability to appropriate among an an independent covering set of independent vertices exponential number of possible choices. There i s presently no known way for accomplishing such efficient choose manner. The a task in an algorithm to be described generates a i reduced number of redundant vertex sets, and uses an efficient procedure for detecting such redundancy. It will be seen that with some modifications the procedure to be described combines some of the better features 68 of the previous algorithms. As a consequence we shall show that the performance of this algorithm is comparable to and in many cases (eg. Graphs with many cligues) better than that to be expected from the others. 6 5 Fig. 2.9 MINIMAL COVER COUNTEREXAMPLE 1 3 2 5 Fig. 2.10 MAXIMAL COVER COUNTEREXAMPLE 70 2.6.1 DESCRIPTION OF THE ALGORITHM Each vertex subset to be processed i s divided into two parts; V(G1), the set of a l l vertices in that subset yet to be examined and V(G2), the set of a l l vertices previously examined and which induce a complete subgraph in the original graph. The vertices in V(G1) have the additional property that they represent subgraph a l l possible induced on V (G2) extensions which yield of a the complete larger complete subgraph. If V(G1) i s empty, then provided there does net vertex adjacent clique. Such further a tc a l l vertices condition is exist a in V(G2), we have found a maintained by deleting from consideration any vertex subset a l l of whose members are adjacent to some vertex outside the subset. If V(G1) i s not empty then we generate n - d(v) new sets by f i r s t choosing a vertex v, and then considering i t together with the n - d(v) - 1 vertices not adjacent to v. Each vertex from this set i s used to define a new vertex subset by it to V(G2) and thus extending the set of vertices already considered, and then defining a new considered from adding V (G1) by set including of only vertices those to be vertices adjacent to that vertex just added to V(G2). 2.6.2 NOTATION VJGV)_i set of vertices in the current vertex set be considered. yet to 71 V(G2)set of vertices in the current vertex set which- induces a complete subgraph. VJH1); new set of vertices to be considered. VJH2),: new expanded set of vertices inducing a complete subgraph. F:. set of vertices not adjacent to a chosen vertex from V{G1) . Az the adjacency matrix of the subgraph induced on V(G1)U V(G2). 2.6.3 REDUCED RDUNCANCY ALGORITHM STEPO: i n i t i a l l y place V(G)i_j$ on the stack. STEP1: Choose a vertex subset V(G1)vjV(G2) from the stack of subsets to be considered. If stack empty, stop. STEP2: If there exists a vertex adjacent to a l l vertices in 7(G1)u V(G2) then go to STEP1 STEP32 If V(G1) i s empty then print V (G2) as a cligue and go to STEP1. STEPjfi Choose a vertex v in V(G1) and define F to be a set consisting of v together with a l l vertices not adjacent to v. STEP52 Choose a vertex w in F and define a new V(H1)OV(H2 ) where V (H1) i s the set of a l l vertices in subset V(G1) adjacent to w, and V (H2) = V(G2)o{w}. STEP6: Delete vertex w from sets V(G1) and F. STEPJl If I" empty then go to ST EP 1; else go to STEP5 In order to compute the time for one iteration of the 72 algorithm the instructions performed are as follows: 1 -•pop V (G1) ,V (G2)-J 2 C <— C 3 i <— 1 4 -C > substr (V <G1) 5 C <— C 6 i <— i + 1-« 7 V(G2),i,1) : 0 A (i) x : n 8 C :0 9 0 V (G1) print V (G2) 10 v <— index (V (G1) , 1) 11 F <— 12 w <— index (F,1)< 13 V(H1) 14 V (H2) <— V (G2) 15 substr (V (H2) , w,1) 16 push V (HT) , V (H2) 17 substr (V (G1) , w, 1) <— 0 18 substr (F,w,1) <— 0 19 F : 0± -A (v) <— V(G1) A (w) <— 1 2.6.4 JOBBER OF VERTEX SETS GENERATED As an example we again consider the derivation K (3,3,3) labelled as previously tree of in Fig. 2.1, the tree this time being determined by our algorithm. It i s given in F i g . 2.11. Each vertex not representing a cligue is labelled by the 73 pair (V (G1) , V (G2)) representing the vertex subset generated by the algorithm. The number of nodes in the derivation tree for K(m ) i s given in the following theorem: THEOREM 2.7^ The number of nodes in k K(m ) k 1 is m k of derivation tree of - 1, Proof: Let the nodes of K(m ) vertex the block be labelled such that i f v^ i s a V. , v. a vertex of block V. and i < j , then i J J the label of v. is smaller than the label of v. . The algorithm processes the vertices a labelling. algorithm has of typical a graph component i vertices in in ascending during V (G2), one execution from each order of of the of the blocks V ,V ,...,V. . V(G1) induces a complete (k-i) partite 1 2 i graph with m vertices per block. This component therefore determines m new components, one for a selected vertex in block V. , and m-1 for the m-1 other vertices in V. _ the only x 1 x 1 i vertices not adjacent to the selected one. Each component therefore determines m new ones, until there are k vertices in V(G2) in which case there are none in V(G1). The number of k ! vertices in the derivatxon tree i s therefore IZ m i=0 i ki = m -1. ^E?L 7k 75 2.6.5 ANALYSIS OF AN ITERATION The algorithm computational effort to for one iteration of the the cliques of K(m ) i s easy tc determine find from the Iverson description. The loop:7 to 4 i s executed n-1 times where n i s the number of vertices in the original graph, the loop: x 19 while to 12 is determined as fellows. Let the vertex set currently under consideration have i vertices in V(G2), and (k-i) m vertices in V(G1). Then F defined in line 11 consists of a l l vertices in one block and conseguently loop: 19 to 12 is executed m-1 times. The expression for the computation time during one iteration i s therefore given by T Q (n) = CJ+nC9,4mC^ where co = u t ^ t ^ t ^ + t Q CO = 2t 1+ t 3 + 2 t 6 * t 8 co = provided the vertex 6t i +t + +t 2 t4 6 set under +3t + e t9 consideraticn does not have V(G1) empty. If V(G1) i s in fact empty, as i t will be for a l l nodes of the derivation tree representing cligues, then only lines 1 through 9 are performed and the this instance computation time in i s T.^ (n) = C* + nC°, where C9, i s given above and C» = 2t, +t +2t,. 1 1 2 4 We can now combine the results of for one the computation time iteration with the number of nodes in the derivation tree to obtain an expression for the total computation time k reguired to find the cligues of K (m ) . There are m nodes for 76 which the computation time for one iteration i s T^(mk) k mH 1 - m k * m- 1 nodes , and where T n (mk) i s the computation time. Hence T(mk) = T,(mk)m +T nu(mkl m -1. m>1. 1 The case fcr m = 1 m-l clearly defines a derivation tree consisting cf a single path of length k. Hence the computation time to determine that k K{1 ) i s a clique i s T (k) = T (k) • (k-1) T (k) . 2.6.6 STORAGE REfiUIBEMEBTS Like the previous algorithms of Harary-Ross, Bonner, and Peay, the Reduced Redundancy algorithm maintains cnly one adjacency matrix, that of the original graph G. Vertex subsets are maintained rows of the on a stack and used to select the appropriate adjacency matrix of G, to obtain adjacency properties of the subgraph of G induced on the vertices in the vertex subset. Our algorithm, however, generates n -d(v) new vertex sets during an iteration where n is the number of vertices in the set and d (v) is the number of vertices in that set adjacent tc v. For this reason i t i s more d i f f i c u l t to determine the storage reguirements of the push-down store for an the arbitrary graph. Instead we shall again examine k complete k-partite graph K(m ). If we derivation again tree adopt in a the "depth strategy of developing the before breadth" manner, i t i s clear that no path i s of length greater than k. Further, from the previous discussion we know that every vertex set V(G1)uiV(G ) i s complete k-partite with 1 vertex in each of i 77 blocks. V(G1) consists of the m(k-i) vertices in the remaining k - i blocks. If we select a vertex v from V(G1) tc determine new components, the number of vertices not adjacent to v i s m1. Therefore, to each node in a derivation tree there are m-1 vertex sets other path of length k in the nodes corresponding to yet to be processed. Hence the push-down store must be capable of handling k(m-1) vertex sets. The storage reguirements for the new algorithm applied to 2 K(m ) are k (m-1) + (mk) +2mk+16C b i t s , C counters and pointers being the number of used. Since we can partition a graph into k blocks no block of which has more than m vertices for k = X(G) * the chromatic number of the graph, this expression also serves as an upper bound on the storage requirements an arbitrary graph. for 78 2.7 THE BRON^KERBOSCH ALGORITHM The Bron-Kerbosch algorithm i s the most recent cligue enumeration algorithm known to this author. Mulligan [ 6 4 ] described the procedure and found i t to Bierstone's algorithm, a method also discussed be has superior tc by Augustson and Minker [5]. The algorithm employs a recursive procedure which i s used to modify a global vertex set consisting of a l l vertices which form a complete subgraph of the original graph. The function of the recursive procedure i s to number of vertices in the extend, i f possible, complete subgraph. accomplished by maintaining several l i s t s and stack generated through recursive These include two vertex subsets, which can be used calls one a the This i s pointers in a to the procedure. set of candidates to. extend the complete subgraph, and the second a set of vertices which have already been used to carry out such an extension. Since the contents of the vertex subset is under continual maintain a pointer modification indicating Some other counters are stacking of definitions also which i t i s also necessary to the last entry into the set. maintained will be through recursive apparent from the description of the algorithm. In what follows we shall use the notation developed by Mulligan algorithm. and his formulation of the 79 2.7.1 NOTATION ru number of vertices in the original graph. Com£sub^ complete subgraph currently being extended. order of compsub. Vz vertex set currently under consideration. Ne^ number, of vertices already examined i n V. Cez, total number of vertices in V. S_2 pointer to selected vertex from V used to extend compsub. R2§1 position of a potential candidate. - Nodj. number cf vertices not adjacent to a fixed vertex among the set of vertices in V already examined. Minnoch minimum number of vertices not adjacent to a fixed vertex. Pixp^: vertex with maximum degree in the subgraph induced on V. NEW: new vertex set. Newne: number of vertices in NEW that have been examined before. Newce_: total number of vertices in NEW. 2.7.2 THE BRON-KERBOSCH ALGORITHM (Mulligan [64]) A. I n i t i a l c a l l to recursive procedure EXTEND STEP 1: Set V to V (G) , c to 0 ne to 0, ce to n. STEP2: Call recursive procedure EXTEND stop. (V,ne,ce). On return 80 B. EXTEND lVjjae x cel recursive procedure. STEP 1: set minnod to ce, nod to 0 , i to 0. STEP2: set i to i+1. If i>ce or minnod = 0 then go to STEP6. STEP3: set count to 0. For each V ( j ) , j=ne+1 to ce not adjacent to V(i) set count to count+1 and pos to j . STEP4: i f count < minnod then set f ixp to V ( i ) , minnod to count and go to STEP5; else go to STEP2. STEP5:_ i f i < ne then set s to pos; else set s to 2 and nod to 1. In either case go to STEP2. STEP6 Set nod to minnod+nod. STEP?; If nod < 0 then return. STEP8: Interchange V (s) with V(ne + 1). STEP9: Set newne { V (1) ,V (2) egual ,V (ne) ) to the number adjacent of to vertices V(ne + 1). in Set NEW (1),NEW (2) ,... ,NEW (newne) egual to those vertices. STEP10: Set NEW (newne+1),...,NEW (newce) egual tc those i vertices in { V (ne+2) ,. .. ,V (ce) } adjacent to V(ne + 1). Newce eguals the total number of vertices in NEW. STEP 11: set c to c + 1, compsub (c) to V(ne + 1). STEP12 If newce eguals 0 then print compsub(i) , as a cligue; else i f newne less than i = 1,2,...,c newce then call EXTEND(NEW,newne,newce) STEP 13^ Set c to c-1, ne to ne + 1. STEP 14: set nod to nod-1. vertex from If nod > 0 then choose another {V (ne+1),...,V (ce) } not adjacent to fixp and not yet chosen. Go tc STEP7. The Bron-Kerbosch algorithm has been divided into three blocks. The task performed by block 1 is to extend i f possible 81 a complete subgraph contained in G by examining vertices not previously encountered to see whether they are adjacent to a l l of the vertices of the complete subgraph under Control i s passed to consideration. block 3 where i f such an extension i s possible i t i s made, a record being kept of those vertices previously encountered and yet to be examined. If the complete subgraph cannot be extended i t i s printed recursively c a l l s block 1 returning only after out. Block 3 a l l possible extensions have been examined. Control i s then passed to block 2 which makes the next possible extension tc the vertex set under consideration at the present level of recursion. BL0CK1 1 minnod <— ce 2 nod <— 0 3 i <— 0 4 ->i <— i+1 > l : ce 5 6 minnod : 0 7 count <— 0 8 j <— 9 ne+1 I—substr(MV <i)) ,V (j) ,1) 10 count <— 11 pes <— j 12 j <— 13 j 14 count : minnod 15 f ixp <— 16 minnod <— count 17 I count+1 j+1 < : ce : ne 18 s <— pos 19 S 20 nod <— 1 < i-*r V (i) BL0CK2 21 c <— c-1 22 ne <— ne+1 23 nod <— nod-1 24 s <— ne+1 25 26 27 *• substr (V (s), fixp, 1) : 0- -29 s <— s+1 - s : ce return BL0CK3 28 nod <— minnod+nod 29 nod : 0 30 sel <— V (s) 31 V (s) <— V (ne+1) 32 V(ne+1) <— s e l 33 newne <— 0 34 i <— 1 35 »return -=>substr (fl (V (i)) ,V (ne + 1) , 1) : 0 36 newne <— newne+1 37 NEW (newne) <— V (i) 38 i <— i + 1 •* 39 i 40 : ne •i-newce <— newne 41 i <— ne+2 42 substr (A (V (i)) ,V (ne+1) , 1): 0 43 newce <— newce+1 •45 84 44 NEW (newce) <— 45 i <-- i+1 42 <—— 46 V (i) l : ce 47 c <— 48 compsub (c) <— 49 newce : 0 — 50 i <— 1 51 print compsub (i) 52 i <-- i+1 53 i :c 54 21 < 55 21 •* c+1 V(ne+1) — newne : newce < c a l l EXTEND (NEW,newne,newce) 2.7.3 THE NUMBER OF VERTEX SETS GENERATED v V Let ^ » 2 ' * * " ' ' k b e t h e D ^ o c k s . ° ^ vertices of K (m ) , each containing m mutually non-adjacent vertices. The Bron-Kerbcsch algorithm proceeds by fixing vertex subset a vertex and defining a new to be the set of a l l vertices adjacent to the fixed vertex. This vertex subset i s partitioned into two parts to provide imformation subgraph level for determining whether a complete i s maximal or has been considered before. At a given i of the recursion other vertex sets are generated whenever control again returns to that level by choosing anong the set of vertices not adjacent to the original fixed vertex. This selection procedure is analogous employed by the Reduced Redundancy to algorithm the mechanism fcr generating new vertex subsets and as a conseguence the number of vertices 85 in the derivation tree i s the same, namely m this is the case i t is only necessary -1. Tc see that m-1 to establish an equivalence between the nodes of the derivation tree generated by our algorithm and the nodes in the derivation tree generated by the Bron-Kerbosch algorithm. If we assign level 0 to the node of the derivation tree corresponding to the original vertex set of the graph, then a node at level the algorithm i in corresponds to the vertex subset of a complete (k-i) partite graph while the selection of fixed vertices made in the generation of a path from contained the root tc level i is in the array compsub . From previous discussion our algorithm has a node a distance i from the root with two vertex sets V(G1),V(G2). G2 corresponds to a complete subgraph of order i , while V(G1) induces a complete (k-i) partite graph. By choosing that path of length V(G2) containing i which results the same vertices as compsub, V(G1) the same set as the vertex subset generated by in is then the Bron- Kerbosch algorithm. 2.7.4 ANALYSIS OF ONE ITERATION Since the algorithm employs the same technique for vertex set generation as our algorithm, the relative efficiencies of the two procedures are dependent upon how the properties of the vertex set so generated are exploited during an iteration. This depends cn three factors; the represented, the order of development of the way the data derivation and the means by which redundant components are avoided. is tree 86 To determine the computation time required by the Bron- Kerbosch algorithm as implemented vertex by Mulligan set under consideration at level outstanding of which i have procedure EXTEND i t s e l f . A l l parameters V be a i . The maximum depth of recursion by the algorithm is k. At level calls let i been we have i+1 by the within the called defined procedure are saved, a feature important in the determination of storage reguirements. A vertex set generated by the algorithm cliques of Mra^) has the property that in finding at level vertices that have been considered l i e in blocks those yet to be considered l i e in V iall V ,V ,...,V i ,...,V . i+2 k 1 while the 2 ,V i+1 There are m (k-i) vertices in the latter collection since every vertex is adjacent to any vertex in V . from which the selected candidate was chosen. Hence every vertex k blocks of K(m ) the vertex remaining under any vertex consideration previous currently block can already last (k-i) have at most m-1 considered, 4 is repeated times plus once more when choosing a vertex for fixp, while loop:18-to 4 i s repeated at most ne times. Finally, inner the in compsub. Since minnod > 0 for a l l vertices in the vertex subset, loop:14 to ce-ne-1 the has the minimum number of disconnections since set vertices from in loop: 13 to the 9 is repeated m (k-i) times for each vertex already considered and m(k-i-1) times for each vertex yet to be considered. The time for one iteration of block 1 i s thus bounded by T = C»+ (ce-1)Ci + ne (C* + m (k-i)C») + (ce-ne)m (k-i-1)C* 1 1 2 3 A- 4 87 where = 4 7 t t l 4 + C» = 3t,+2t~+4t, 2 c C Loop:54 3 4 to 21 1 = 3 = t l 4 j) + t 4 3 t 1 + 2 t 3+ t 4 + t 8 i s repeated after every return from the recursive c a l l in line 55. This equals n-d(fixp), of the number vertices not adjacent to fixp. Loops 27 to 25 and 46 to 42 are repeated ce-ne times, while loop:39 to 35 i s repeated times. ne If i<k-1 a clique has not yet been found so statements 50 through 53 are skipped. If we define one iteration as being the total computation performed until a return i s made at line 29, then the computation time for blocks 2 and 3 together i s bounded by T2 = (n-d (fixp)) (C|+(ce-ne) C|+neC2) +t^ with J C?1 =' ,13^+61^ "1 C| = (2t 1 + t 3 + t 4 + t 8 ) + (t 1 +t 3 + 2t 4 +t g ) C The order 3 = 3 V 2 V 2 t 4 + t 8 of the computation time for one iteration i s 2 therefore between n and n . For vertex subsets such that ne=0, the computation for one iteration i s of order square of the number consideration. of vertices 2 2 m (k-i) , the in the subset under 88 2.7.5 STORAGE REQUIREMENTS As previously observed, the maximum depth of recursion i s k. Hence a l l variables defined within the recursive function must be stored k times. This consists of pointers and counters and the array NEW an integer array of size n. The adjacency matrix of the original graph and the array compsub of order are maintained outside the recursive adjacency matrix i s stored as an storage reguirements array n procedure. Since the of b i t strings the for the Bron-Kerbosch algorithm as 2 implemented by Mulligan are n + n (k + 1)+16Ck + 16 bits where C i s the number of integer scalars EXTEND, and the additional 16 bits globally defined variable. defined are an within the routine allowance for a 89 CHAPTER 3i CLIQUE DETECTION USING VERTEX SIMILARITY 3.1 INTRODUCTION It i s evident from the observations and results of Chapter 2 that the efficient detection of cligues is severely hampered by the possibly exponential number of such subgraphs. Even the act of printing them out can occupy an inordinate amount of time simultaneously unless there exists seme of each the vertex cligue. Two approaches to this problem will be examined separately in this chapter, each of them properties cf identifying several cligues and also some more compact form of notation than explicitly defining sets means exploiting which measure the degree to which any two vertices are different. One such automorphism device group is similarity of vertices. The of a graph partitions the set cf vertices V (G) into equivalence classes called the orbits of V (G). Two vertices are similar i f and only i f they are members of the same orbit. Hence there exists a permutation in T(G) which maps u onto w where u and w are vertices in the same orbit. An examination vertices in block can be of complete k-partite graphs , reveals that every vertex in any block interchanged with any other, ie vertices in any block are similar. Let u, w be two such vertices. Then the cliques to which u belongs and a permutation u onto w, we also have a l l the cligues Such a with representation to which i s more compact i f we know which maps w belongs. as i t requires 90 explicitly defining only the cligues associated with u. In the development which follows we shall tacitly that assume a procedure for determining the orbits i s available. For implementation we shall employ Cornell's algorithm[12 ]. It i s important to note here that while Cornell's procedures have not failed on any graphs correctness depends encountered on an to date, unproved that conjecture. their Corneil therefore describes his algorithm as a heuristic one, a policy which we formally must also follow when using his routines. ft difficulty with such an here overall approach as offered for improving the efficiency of clique detection occurs when u and w, vertices in the same orbit, are both members of some common clique, in other words u i s adjacent tc w. This i s often the case as i s illustrated by the existence of connected point symmetric graphs, which by definition have a l l vertices belonging to the same orbit. Since vertex used at several levels of cligue similarity detection other enumeration we shall defer further discussion cn this until later. can be than problem 91 3.2 POINT AND LINE SYMMETRIC GRAPHS The that remarks , of the introduction to this chapter suggest similarity of characterization of vertices the may cligues contribute to a in a graph. To what extent x this i s true will be examined in this and the next section. Of particular interest will be the behavior of subgraphs by induced a single vertex since this i s the major mechanism by which a graph can be decomposed into smaller subgraphs processing. for further This feature has already been observed previously in the seguential algorithms of chapter 2. The major portion examination of of this section is devoted to an cligues in point-symmetric and line-symmetric graphs. Of particular interest i s the degree of symmetry of the induced subgraphs. THEOREM 2-.ll T1 »e subgraph induced on the adjacency set of a fixed vertex in a line-symmetric graph is point-symmetric. Proof^ Let G be a line-symmetric graph, v a and v. denote Since by { w^,w G 0 is ct2f<X.3 »• • • r ^ i 2 in V(G), ,. . . ,w^ } the set of vertices adjacent to line-symmetric n vertex there exist permutations T(G) such that: o(-2 ( v ^ ) = (v,w2) oC_(v,w ) = (v,w,) These permutations, together with their inverses, hold v fixed and hence belong to P , the stabilizer of v. Since every ot in 92 p(G) w\ where preserves adjacencies, every «_ maps w^ onto some , vertices w are in {w^,...,w J. in { w ! w If i ' j K a r ea n ? t w 0 v, } then at .ocf (v, w .) = (v,w.) and hence K j X X J the set of vertices {w ,,..,w } i s similar. Further, since permutation oLeT^ maps w^ onto some w^ the integrity of every the subgraph induced on {w ,...,w } i s preserved. The subgraph induced on the adjacency vertex set of a fixed in a point-symmetric graph i s not necessarily point- symmetric. This i s illustrated in the counter-example given in Fig. 3.1. The subgraph induced on vertices adjacent tc vertex 1 i s given in Fig. 3.2 and i s clearly not point-symmetric. Dauber and Harary [43] and Folkman [32] have investigated the extent to which line-symmetric graphs are point-symmetric. The principal result of Dauber and Harary i s the establishment of conditions which characterize such graphs, namely that every line-symmetric graph with uo isolated points i s point- symmetric or bipartite. This result together with the previous theorem establishes a sufficient condition for point-symmetric graphs to have point-symmetric subgraphs induced on the adjacency set of a fixed vertex, that condition being that the graph be line-symmetric or regular condition bipartite. point-symmetric and every this (1,6). graph the edge bipartite subgraph induced on a set of vertices adjacent to a single vertex i s also For this i s not necessary i s illustrated by the graph given in F i g . 3.3, a graph not line-symmetric or regular but That point-symmetric. (1,2) i s not similar to the edge 6 5 3.1 POINT SYMMETRIC GRAPH Fig. 7 3.2 INDUCED SUBGRAPH OF FIG. 3.1 Fig. 3 Fig. 3.3 POINT-SYMMETRIC GRAPH NOT LINE-SYMMETRIC 95 We shall denote by A(v) the set of vertices in a graph G adjacent to the vertex v. LEMMA 3^2^: let v-^*^ and denote by G1 b e a n t w s n, * ° ^ ^^ ar and G^ the subgraphs induced on A (v^ ) a nd A(v2) respectively. Then there exists = oc G 1 vertices of a graph G, oc in V (G) such that G . 2 £l2°Jl Since , v2 are similar the number of vertices in G-^ i s egual to the number in G 2 » Let oc be an automorphism of G mapping onto v 2 » Then for each u in V(G^) there exists a unigue image ocu. Further every such vertex ocu in otV(G^) i s adjacent to v2 since every vertex u in V (G^) is adjacent to v . Now by definition V<G ) i s the set of vertices adjacent to v2 , and hence V(G2) = <*.v(G1). Since oc i s an automorphism, (u,w) i s in E (G) only i f (<xu,<*w) and i s in E(G). Hence for any u,w in V (G^) , (u,w) is in E(G1) i f and only i f (<*u, ocw) i s in «tu, ocw if are vertices from V(G2 ) . Therefore E (G2) E (G2) since = oc E (G^) = G^. and hence oc G ^ As a conseguence of this Lemma we have the following: LEMMA 3_.2x Let G be a point symmetric cL , oc ,.. . , ot graph, automorphisms of G such that V *3 1 v *k l = = V V and denote by 96 Then i t i s the case that: G = G = *3 1 *k l where G V G ,G ,.,.,G, A ( v ^ ,A (v2 ) Let 3' denote the subgraphs induced cn ,A (vk) respectively. {'V^,vL..,vh i = 1,2,...,k be and suppose the vertex without set of G. loss of generality, that ot v"*" = v*. Then clearly i f we know the cligues of G n , x 3 3 find we can 1 a l l the remaining cligues of G knowing the permutations u ,<*.,.... ,cc . To avoid duplication of cligues 2 for j the following k test can be employed. If we component are examining G^, then delete the cligues associated with a l l cligues containing vertex v. for j = 1,2,...,i-1 as such cligues have 3 already been found during the examination of component G... i Such a strategy encounters d i f f i c u l t i e s First, as we have seen previously, the point-symmetry of a graph i s no guarantee for the pointinduced cn two fronts. symmetry cf subgraphs on vertices adjacent to a point, hence the problem of determining the cliques of G i s as yet unresolved. Secondly, the determination of automorphisms oL oL ,... ,<* t i s in general a d i f f i c u l t problem. We can overcome the f i r s t difficulty by generalizing procedure the to include graphs which are not necessarily point- symmetric. Then, the existence of an algorithm fcr determining 97 the orbits can be exploited i n the following way. Every member of each orbit can be represented by a single vertex which determines the induced subgraph for further processing. If Q = {v^ ,v^ , ...,v^ } i s an orbit of P(G) f c r some 1 2 k graph G and i f oi • ,<*., ,...,<x2 3 x such that: 2 X u X ± ± \ oe, v. are permutations in P(G) Tc x 2 X = v i = v± then by an argument simliar to that cligues with ,...,V; associated v. previously given the can be determined i f we know the cligues associated with v. . The development x algorithm utilizing of an l such technigues will be the focus of the next section. The object will not be to find a l l the cliques because of the d i f f i c u l t i e s associated with determining the permutations which Rather, map the algorithm similar shall vertices attempt onto the automorphism group other. to find a set of ncn- similar cligues cf a graph G which together with of each P(G) will a knowledge be sufficient to determine a l l the cligues of the graph. The algorithm can thus be considered as a sub-program which when incorporated with sub-program for determining the automorphism provide a mechanism for finding the cligues. group a will 98 3.3 DETERMINATION OF NON-SIMILAR CLIQUES The purpose of the procedure to be described here somehow characterize is to the non-similar cligues of a graph. Two cligues C^ and C^ of a graph G are said to be similar i f there exists an automorphism oc of G such that «. c^ = C ^ . A naive approach to the problem of determining similar generating classes" induced of vertices of G on by k=1,2,....n-1. For representative the the set the group for equivalence class, choosing one as a member. Since the automorphism group preserves class only subgraphs cf G defined on the vertices represented by the k-tuples are intersted T(G), given k, we examine the cligues that adjacencies, two k-tuples are members of the same clearly "equivalence a l l unordered k-tuples of automorphism any are members of each the non- cligues of a graph which serves to illustrate what we are attempting to find involves if the provides only isomorphic. more in Such information those classes a procedure, therefore, than we desire as we are whose k-tuples are the vertices of maximal complete subgraphs. The mechanism to be employed will depend primarily on the observations of the previous section; namely, that i t is possible to generate a l l the non-similar cligues of a graph by reducing a graph to components egual in number to of the the orbits automorphism group of the graph, each component being the subgraph induced by a vertex from an orbit. Each component i will then serve as input and subsets of vertices will thus generated in a be recursive manner analogous tc the seguential 99 procedure described in chapter 2. There i s not, unfortunately, sufficient determine a l l the non-similar cliques information to of a graph from the orbital partition cf V (G) alone. This i s illustrated following by the example. Suppose we apply the procedure cf choosing vertices as just described to the graph of F i g . 3*3. Since this graph i s point-symmetric, one vertex should be sufficient to characterize the " f i r s t " vertex of a l l the cliques. Let that be vertex 1 . The subgraph induced on vertices adjacent to 1 consists of three independent vertices 2,5 which belongs and 6 each of to the same block of the orbital partition of V (G). The "second" vertex of a l l cliques should therefore be by one vertex, say 2. As (1,2) i s a clique of characterized the graph and since we have argued that a single vertex and the subgraph induced on adjacent vertices should be sufficient to characterize a l l cligues of the graph, we would have to claim that a l l cligues were similar to (1,2) which we know to be false since the graph being examined is net line-symmetric. In fact we have previously observed that edge (1,6) was net similar to edge (1,2) and generated therefore should also have been in the determination of the non-similar cligues of the graph. We can resolve the two non-similar cligues of the graph of Fig. 3.3 by using not only the orbital partition of V (G) but also by determining the orbits induced stabilizer of 1. This on 2,5,6 by the results in a partitioning of {2,5,6] into two sets {2,5) and {6}. By selecting a representative 100 vertex from each cf these sets we can obtain two non-similar cligues say (12) and (16). This example illustrates that although the fact the group T(G) of a graph may be transitive on the vertex set V (G) i t is not necessarily transitive cn A(v). We have previously shown that i f G is line-symmetric then this will be the case. Before we present a description of an algorithm for finding the non-similar cliques cf a graph its orbital structure we stabilizer of a particular from examine further the orbits of the vertex v in the following two theorems. THEOREM 3.2: Every automorphism ot in the stabilizer, P ^ of v i s an automorphism of G , the subgraph of G induced on the set of vertices adjacent to v. Proof: Let P be a automorphism oc vertices of of G vertices in G permutation matrix corresponding the in p . Without loss of generality assume the v are labelled 1,2,...,m, and are labelled v A 2 the remaining m+1 ,m+2,...,n. Let A (G) be the adjacency matrix of G, A (G ) that of G . Clearly v v the form: 'A (G ) A Since to A(G) i s of 2 A J 3 P corresponds to an automorphism in the stabilizer of v, by Lemma 3.1 P maps V (G ) onto V (G ) and i s therefore of v v the form: \ 0 ° P where P acts labelled 1,2,...,m, the vertices in V(G^) on vertices 101 is Since an automorphism of G, i t i s the case that P A (G) P = A (G) . Hence K A(Gv) 0 T P A2 T J P l ,o I A2 A(Gy) 0 P . T L 2 A A2' A A3J 3^ or P A (Gv) P = A (Gv) . Consequently ot maps G^ onto i t s e l f . QED. P 2 2 As a corollary to this theorem we have following result: COROLLARY: Every orbit of P^. i s contained in seme orbit cf Proofz By the previous theorem every automorphism of G i s one of G . Let u,w be vertices in V (G ) and suppose there exists ot in r and such that otu = w. Hence ot i s an automorphism of G^ and u w must be members of the same orbit induced on V(G ) by P(G ) . QED. THEOREM 3z_3± Let G be a connected point-symmetric graph. Then G i s line-symmetric i f and only i f for an arbitrary vertex v, the stabilizer P S T(G) of v i s transitive on A(v) the set of v vertices adjacent to v. Proof: Let A (v) = {w ,w ,...,w }. If G i s line-symmetric for then any w.,w.,i # j , (v,w. ) i s similar to (v,w.). Hence at i s transitive on A (v). Conversely suppose P i s transitive on A(v). Then for any v w ,w there exists OL i n P such that ot(v,w.) = (v,w.). i j V 1 J Further since G i s point-symmetric, for any other vertex u ^ v in V (G), A(u) = [ x there exists / 3 in P (G) ,... ,x ^} then such that / 3 v = u. If since /3 preserves adjacencies 102 a (u) =fyaw2_'f 2' 3v 3w * * * '/ k ^ a n ^ hence/3(v, w.^) = (u,x..) for some in A(u). Finally every edge (u^ovu) i s mapped onto by the automorphism /3ct/3,~^~, hence (u,^sw%) every edge incident tc a vertex u or v i s similar to any other edge also incident to vertex a u or v. Since u was chosen arbitrarily we can conclude that G i s line-symmetric. The observations made in the the machinery by which we previous can define theorems an provide algorithm for determining the non-similar cligues of a graph provided we are equipped with a procedure for determining orbital partitions. 3 . 4 DESCRIPTION OF THE ALGORITHM Since two vertices which are members of will be the orbit members of similar cliques, we i n i t i a l l y determine k representative vertices one for each of the k graph. same orbits of the Further, we shall reguire a knowledge of the orbits of the respective stabilizers of the representative vertices. The algorithm recursively decomposes subgraphs defined on the set of vertices adjacent to a representative as many into new subgraphs as the number of blocks of the orbital partition of the stability representative subgroup fixing to that particular vertex. Each new subgraph i s determined as the subgraph induced on the set of vertices of adjacent vertex the old subgraph a single vertex chosen from one of the blocks of the orbital partition and i s subseguently reduced in a similar manner. 103 A record i s maintained of the representative vertices as they are chosen, and when there exists an isolated vertex in a block of the partition, a complete subgraph has been found. This subgraph i s then examined to see i f i t i s procedure maximal by a similar to that of the Reduced Redundancy algorithm in the previous chapter. As stated previously, to determine the orbits employ Cornell's Quotient algorithm Graph, a conjectured, graph corresponds shall for constructing the Terminal each of whose vertices, i t is to a block of the orbital partition of V (G) [ 1 3 ] . Cornell's algorithm i s ideally purposes we suited to our since in determining the Terminal Quotient Graph, he determines belonging not only the vertices of the original graph to each block of the partition but also the orbits r for a vertex v from each orbit of P(G). Since the v adjacency set of v i s obviously a subset of V (G) i t i s easy to of determine the orbits of V to which they belong. Cornell's v algorithm provides this information in the determination of the vertex guotient graphs of G which are constructed by fixing a vertex and then determining the partition induced the remaining vertices of V (G). Corneil uses the vertex quotient graphs to determine the orbits of P(G) by two vertices i n the same class identical vertex quotient graphs. on grouping i f and only i f they have 1 J 104 NOTATION Q± : the ith orbit of P (G ) . A J v)_: row v of the adjacency matrix of G. G l l vertices yet tc be considered for a particular subgraph. G2_2 vertices which induce a complete subgraph. Hi: new vertex set to be examined, derived from G1. S2l expanded vertex set inducing a complete subgraph. NON-SIMILAR CLIQUES ALGORITHM STEPJk Use Cornell's algorithm to find the orbits Q°,0°,••-,8? <c X of P (G) . In addition, let # 7 , 0 L . . . be the £ 1 orbits rC of Kv V induced on A (v). STEP2: Choose a vertex set (G1,G2,W) from the stack of candidates. If stack empty, then stop. STEP3: Compute T & = V / Q ^ G 2 * I f T n 0 t eni Pty then go to STEP2. STEP4i If G1 empty then print G2 as a cligue and go to STEP2. STEP5: Set i to 1 and F to 61. STEP6: If ^HG1 STEP7: empty then go to STEP10. Choose v in 0 ™ O G 1 not previously chosen. If none left to examine, l e t v be any vertex in 0 ™ n G 1 and go to STEP9. STEP8.: If G2r»A(v) not empty then go to STEP9; else go to STEP7. STEP9:. Define a new vertex set (H1,H2,V) with H1 = FOA(v) and H2 = G2u{v}. Put (H1,H2,V) on the stack. STEPIO^ set i to i + 1 and F to in (~). If i<k then gc to 105 STEP6; else go tc STEP2. 3.5 DISCOSSION OF THE ALGORITHM The algorithm for enumerating non-similar graph is similar to the seguential enumeration of a l l cligues proposed whereas the seguential algorithm Chapter graph similarity and 2. the of vertices in is the However, number dependent of of upon the the graph. It i s clear that this determines the number of orbits of the group as well as the non-similar cligues. Since i t is only necessary to consider one vertex from each of the blocks of of for a in a vertex subset, the determination of non-similar cliques by the method just described number of algorithm's efficiency was dependent upon the number of cligues in the elements in cligues the partition A(v) induced by the stabilizer of an appropriate vertex v, the number of vertices that need be number of new vertex subsets examined generated and hence the i s reduced i f the number of blocks in the orbital partitions determined in stepl is small. It i s possible for a graph to have an exponential of cligues, none of number which i s similar to any other. This i s illustrated by the graph of Fig. 3.4. The subgraph induced vertices on {1,2,...,8} i s K (3,3,2) the graph on eight vertices with maximum number of cligues. Additional vertices are then added to insure that the graph has identity group. Hence every cligue of G i s non-similar to every other. In general i t is 106 possible to construct a graph on 6k vertices having order K 3 cligues in a similar manner. The purpose of this demonstration is to emphasize the fact that the detection of non-similar cligues may i t s e l f be an exponential process. Fig. 3 - U GRAPH WITH ALL CLIQUES NON-SIMILAR 108 3.6 ANOTHER APPLICATION OF ORBITAL PARTITIONING ID the algorithm for determining the non-similar of a graph, i t was not possible cliques for us tc apply orbital partitioning to the vertex sets of each component obtained the in reduction of the graphs. This was because the non-similar p cligues were determined by the orbits of of v, and not v P(G ) . He have previously shown that for every automorphism v of P , there i s an automorphism of P (G ). However the converse v v is not necessarily true since two similar vertices in G be non-similar in G . might If grouped in the same class, a non- similar clique would be lost. We can determine however only employ this strategy i f we wish the existence of cliques of different orders. Such a technique i s seen to examine fewer vertex subsets a procedure to for than finding the non-similar cligues of a graph, since we can take advantage of any symmetry that exists in the subgraph induced on a particular sets which are nearly vertex subset. The vertex resolved into cligues exhibit a high degree of vertex similarity and by only distinguishing between vertices in different orbits, the number of vertices is greatly originally reduced. present The in the fact that graph cliques will be examined of a l l orders obtained is established by the following argument. If we determine the orbits of u,v in P(G) on V (G), two vertices V (G) are members of the same orbit i f and only i f the subgraphs induced on those vertices of V(G) adjacent to u those adjacent to v are isomorphic. Hence each and induced 109 subgraph has the same number of cliques of each order and conseguently either induced subgraph may be chosen fcr further processing and the other ignored without fear of losing a l l cligues of a particular order. THE ALGORITHM STEPO: I n i t i a l l y place (G1,$) on the stack. STEPJ.2 Choose a vertex set(G1,G2) from the stack. If the stack is empty ,then stop. If T i s not empty STEP2: Compute then go to STEP 1. STEP3: If G1 i s empty then print G2 and go to STEP 1. STEP4: Determine the orbits ^ / ^ " " "' o f t h e aDtcn,or P Dlsnl group of the subgraph induced on vertex set G1. i to 1 and F to G l . STEP5i Set STEP6:. Choose v in 0. r\ G 1 ^ (~A (v)) and define a new vertex set x (H1,H2) where H1 = FOA(v) and H2 = G2 U {v}. Place (H1,H2) on the stack. STEP7i Set i tc i+1 and F to F n ( - f t ) . If i<k then go to STEP6; else go to STEP1. This algorithm is very similar to the Reduced algorithm for Redundancy the enumeration of cligues. It is obvious that the latter algorithm could be employed to determine the orders of the hon-isomorphic cligues of a graph. However in their possibly exponential reducing of number, i t is desirable to find some means of reducing the number of vertex subsets by view generated the number of vertices that need tc be examined. 110 This is achieved in our algorithm by again exploiting the fact that two similar vertices belong to the same number of cligues of different orders and therefore in the situation wish where we to find the orders of the different sized cligues of the graph, i t i s only necessary to treat one of the two similar vertices. It should be noted that in step 6 of the algorithm i t i s not sufficient tc choose one vertex from each of the k of the automorphism group of the subgraph induced on G1. This is because i t may turn out that the number of crbits the orbits number of exceeds vertices not adjacent to v, in which case our algorithm would perform more poorly during that iteration than the Reduced Redundancy algorithm of the previous chapter since i t would generate more new vertex sets than the seguential procedure. For this reason v i s chosen from 5,nGl n ( ~ i (v)). 111 3,7 ANOTHER APPROACH TO DESCRIBING THE CLIQUES In the previous sections we have attempted to exploit the similarity of vertices in a graph as an aid to the detection of i t s cliques. This was only partially successful, ene cf the major d i f f i c u l t i e s being the group of the graph, difficulty which was of necessary enumeration of the cligues. Even the task of non-similar determining for the a complete determining the cligues has proved to be limited by the existence of few good procedures for finding the orbital partition of the vertex set. Finally, we saw where i t was even possible fcr a graph with identity group to have an exponential number of non-similar cligues. In this section we shall explore an alternative in approach which two vertices will be related by a condition stronger than that of similarity. DEFINITION 3^X1 be complete T w c vertices u and w of a graph G are said subgraph eguivalent J CS eguivalent) i f for any subgraph of G defined on u,v^,v^,...,v mutually are to vertices u,v ,v ,...,v. adjacent i f and cf V(G), only i f w,v^ ,v^,.,.,v_ are mutually adjacent. Two vertices of degree 0 are CS equivalent. It is clear from the definition that i f two vertices CS equivalent then they are similar. This fellows from the fact that two CS equivalent vertices are adjacent to the set of vertices are same and can be interchanged. By finding a l l the cligues to which vertex u belongs, we have also found a l l the 112 cliques to which vertex w belongs and i t is a simple matter to determine the latter explicitly: for each occurrence of u in a cligue, replace i t by w. A supplementary method which reduces but equally the number important of vertex advantage of a subsets to be considered by finding complete subgraph eguivalent vertices is that i t provides a means of representing a l l the cligues of the graph in a more concise manner than explicit Given enumeration. the vertex set V(G) of a graph G, let v ,v ,...,v set of CS eguivalent vertices, a l l of which are by adjacent only to vertices in A (v^), the be a definition set of vertices adjacent to v^. If we denote by C^ the set of maximal complete subgraphs induced product on the subsets {v ,v ,...,v } X C X k 2 is of A (v^) precisely the the Cartesian set cligues of G containing one of the vertices v ,v,...,v . Ic 1 2 is evident each also expressed as a set of Cartesian one being determined by a set cf CS equivalent vertices and their common set of adjacent vertices defined the subgraph 3 are CS eguivalent the vertices adjacent to vertices isolates the 1,2, and are each adjacent to vertices 4,5,6,7,8,9. In the subgraph induced on vertices, on induced on A(v ) . As an example we may consider again the graph K (3,3,3) given in Fig. 2.1. The vertices and It that this procedure could be extended so that the cligues of C^ were products of a l l 1 4,5,6 this latter set of are CS equivalent and each i s 7,8,9. Since the vertices 7,8,9 are they are also CS equivalent. Thus a l l the cliques of graph are given { 1,2,3}X{{4,5,6}X{7,8,9}}. by the expression: 113 This i s obviously a much more compact way cf defining the twenty-seven cligues of K(3,3f3). The primary drawback of implementing the technigue described just i s the paucity of vertices which are CS equivalent in an arbitrary graph. Instead, we shall implement a procedure which uses a weakened form subgraph to group equivalence of the definition the vertices cf complete i n a similar manner. DEFINITION 3^2^ Two vertices u and w are weakly CS if there exists a complete subgraph defined on some subset of vertices of G say u,v ,v ,...,v, such that 1 2 j induced on w,v^,v^»-••#v. is also complete. It equivalent i s clear vertices from the subgraph the definition that weakly CS eguivalent are not necessarily similar, and that complete subgraph equivalent vertices are weakly CS equivalent. We now consider the properties of a set cf weakly CS equivalent vertices defined in the following way. Let v_^ be an arbitrary vertex from V (G) and l e t {v ,...,v } be a set cf 2 j vertices also from 7(G) such that each vertex v is adjacent i to a l l vertices in A(v^). A l l the complete subgraphs containing v^ including the cligues are induced cn {VjluA(v^). If {w ,w ,...,w ] £ A ( v ) induces a complete subgraph then 12 k 1 v ,w ,w ,...,w also induces a complete subgraph. Further, X 2 X since k v i s adjacent to every vertex in A(v ) , i t i s adjacent i 1 n e n c e V H to ^ » ' * *"' k ^ j _ ' "]_» 2 '" * *' k* induces a complete subgraph. Therefore v^ ,v. are weakly CS eguivalent. Most w W 2 a D d W 1 14 importantly i t i s the vertices from case the that set every cligue {v^JuA(v^) is {v ,v ,...,v ,}UA(v ) . In other words, given {v.,w X ,...,w } inducing X a complete of on vertex set a l l possible K vertices which could be used to extend that vertex set to with induced any subgraph, G induce a larger complete so as subgraph must be contained in (v 1 ,v 2 ,...,v.]uMv 1 ). If we denote by subgraphs which the are maximal {v ,v ,...,v. }, and denote by complete subgraphs X This sets of a l l complete on the subgraph V the vertex induced sets determined by the result has an alternative graphs G^ and G 2 » denoted G^ + G2# on V (G^) U of G and V ( G 2 Cartesian interpretation as a [ 4 3 ] ) of i s the graph G defined ) such that every edge of G^ or G^ i s an edge for every vertex v in V (G^) and vertex w in V (G^ ) , (v,w) is also an edge of G. Let C^ which of a l l induce complete subgraphs which are maximal on product of graphs. The join (see for example Harary two on which are maximal on the subgraph induced on A (v ) , then the vertex sets product vertex i s maximal be a complete subgraph on the subgraph induced on [v ,v ,...,v. }, and l e t C^ be a subgraph which i s maximal on the subgraph induced on A(v^). Then C^+ C2 i s a cligue of G. Re graph by following illustrate the finding weakly example. determination CS Consider of the cligues of the eguivalent the graph vertices in the of Fig. 2. 9,. It i s evident by inspection that no pair of vertices exists which i s 115 complete subgraph eguivalent. Instead we define vertices by sets cf f i r s t choosing arbitrarily vertex 1 and defining one set to be A(1) = {2,6}. Now vertices adjacent two 3 and 5 are also to {2,6} so the second set i s {1,3,5}. By repeating this procedure on the subgraphs induced on the f i r s t of these vertex sets we discover that {2,6} can be separated into two sets expressible as a Cartesian product {2}X{6} corresponding to a complete subgraph however consists of two complete of order 2. The vertex set {1,3,5} components, an isolate 1, and a subgraph cf order 2 defined on {3,5} and expressible as {3}. The complete subgraphs of the subgraph induced on {1,3,5} are induced on vertex sets {1} and {3}X{5} and hence some of the cligues of G are given by {{ 2 }X{ 6 } }X{ 1, { 3 }X{5}} which corresponds to the cliques (261) and (2635). It i s clear that not a l l the cliques have been found for we have not yet examined vertex 4. We therefore determine two new and {2,4,6} i n the same sets {3,5} way, and the processing of their induced subgraphs yields {{ 3 }X{ 5 } }X { 4,{ 2 }X{ 6 }} giving us the third subgraph (354). This example illustrates the principal drawback of this procedure for enumeration, namely the generation of redundant cligues. We have encountered this type of problem in nearly a l l of the algorithms previously means of overcoming discussed. The most this problem has teen to simply exauine each vertex of the induced subgraph to see i f there some vertex exists not in the set, yet adjacent to a l l vertices in the set. Such a mechanism i s clearly not applicable case. usual Alternatively, Peay in his algorithm in this as criginally 116 described, maintained a l l cligues in a stack compare with newly which determined maximal complete subgraphs to see whether i t had been found before. Although applicable he could more directly to our situation, this method is net useful since we have no explicit representation for each cligue with to compare. In addition exponential amount overcome in we of storage. the following must make This use of a possibly difficulty algorithm which i s partly by making use cf information pertaining to the current derivation path in the tree of derivations in a manner described in the next section. SUBGRAPH EQUIVALENCE ALGORITHM Two stacks are used in the algorithm. Stack 1 consists of all vertex sets derived in the development of the current derivation path except those from which derivable. Stack 2 consists the current of a l l vertex set is sets directly derivable from the last vertex set in the derivation path. BLOCK A: Initialization procedure. g-TEP.ll v (G) be the set of a l l vertices in the graph and i n i t i a l l y set both stacks to be empty. STEP2: Call recursive procedure ENUM(V(G)) defined in BLOCK B. The order and adjacency matrix of G are defined globally tc ENUM On return, stop. BLOCK B: Recursive procedure ENUM (V (G)). STEPI^ Choose a vertex v in V (G)and define a vertex set F 117 equal to the union of v together with a l l vertices not adjacent to v. STEP2:. Choose a vertex H1 = V(G) o A (w) w from F and define vertex sets and H2 egual to the set of a l l vertices in F adjacent to every vertex in HI. STEP.3: If the vertex set H1 «-> H2 i s contained in a vertex set previously defined during this iteration then go to STEP7. STEP4: If the vertex set H1 u H2 contains a previously defined vertex set during this iteration then replace that vertex set by H1 u H2 on stack 2. STEP5: Compare H1 U H2 with a l l vertex sets previous iterations derivation path H1 U H2 generated during in the development cf the current other than those vertex sets from which was derived. If H1 U H2 i s contained in some previous such set then delete i t from stack 2. Otherwise place the new set on stack 1. STEP6:, A new pair of vertex sets has been found. Stack 2 contains their union as well as the vertex w used to define them. STEP7:. Delete w from F. If F i s not empty then go to STEP2. STEP8: If no new vertex sets have been added to stack 2 this iteration then return. STEP9 Choose a pair of sets H1 U H2 from stack 2 together with their defining vertex w. Remove this set from stack 1. STEP IO.: If H1 i s empty then print vertex w and go to STEP 14. STEP VI:. Call recursive procedure ENUH (H2). 11 STEPJ22 Print X" . (The maximal complete subgraphs of H1 U H2 will be given by the Cartesian Product of the results 118 upon return from calls in STEP 11 and STEP 13. STEP13^ Call recursive procedure ENDM ( 1 ) . STEPJHjj. Return H1UH2 and w to stack 1 but delete i t from stack 2. If stack 2 not empty then go to STEP9. STEg15i Return. As previously described, the algorithm finds the cligues of the graph by determining vertices sets of weakly CS equivalent in a particular way. A new vertex set whose vertices have been partitioned into two sets of weakly CS equivalent vertices is determined by choosing a vertex v from a set F and defining the two blocks H1 and H2 of the partition according to STEP2. The set F consists of a vertex v and a l l vertices of the induced subgraph on the current under consideration set of vertices, V(G), not adjacent to v. This set insures that a l l cligues will be found and was employed in the Harary-Ross algorithm and the Bron-Kerbosch algorithm as well as our own seguential algorithm previously discussed in Chapter 2. We are thus guaranteed of finding a l l the cliques and i t is therefore only necessary to minimize the possibility of finding redundant cliques. As we have mentioned, this is net a simple problem because of the nature of the representation being exploited in our algorithm. The technique employed i s tc keep track of a l l vertex subsets from which a newly determined vertex subset could possibly sufficient to keep 'track of only possible branches be derived. Tc dc this i t i s the i n i t i a l nodes of all in the derivation tree which deviate from the path of derivations we have taken to reach the current 1 19 vertex subset under consideration. By definition a l l other vertex subsets derived during the execution will be in contained one of the vertex subsets represented by these ncdes. In a derivation path cf length kf let v be the defining vertex of the vertex set H1 U H2 represented by a node on the derivation path at distance maximum number i from the root. of vertex It i s evident subsets generated that the during the generation of set H1 U H2 i s n , -d(v , ) where n . , i s the i-i i - l i - i number of vertices i n the i-1st vertex subset in the path of derivations and v i s i t s defining vertex. The maximum i-1 number of vertex subsets placed on stack 1 is thus } (1+(n. n -d(v. ) ) ) . i=l In i - l i - l Fig. 3.5 we illustrate the vertex subsets involved in such a seguence of derivations. Since our algorithm employs a depth before breadth technigue of development, i t i s clear stack 1 i s not exponentially growing. Hence any gains in efficiency from such a representation will not be offset inordinate storage reguirements. that by V(G) Fig. 3.5 A PATH IN THE DERIVATION OF NON-SIMILAR CLIQUES 121 It i s d i f f i c u l t to assess the overall efficiency of this algorithm because the choice of a "best" each defining vertex at iteration i s a non-deterministic procedure. Clearly, the worst case occurs when the choice that i s made results explicit enumeration of every in an cligue in the graph. In this situation, since the mechanism by which new vertex subsets are generated is similar tc that of the Bron-Kerbosch or our sequential algorithm for the enumeration of cliques, a one-one correspondence can be made between the nodes of the derivation tree of this new algorithm with either of the previously discussed seguential algorithms. k Because a search of as many as £.:(n- n -d(v. , )) elements ^ in a stack must X"*X — X X be made (see Fig. 3.5) for each new vertex subset in addition to i t s generation, i t i s evident time reguired for one iteration will that the be longer than that reguired by the seguential method. Since we have employed the same techniques of our previous algorithm to the generation of new vertex subsets, the time reguired for one itera-tion i s k proportional to reguired 51 (n-d (v. )) • T (n) where T (n) is the time 1 1=1 for one iteration of the seguential algorithm of Chapter 2. When, however, we can take eguivalence of vertices advantage to minimize of the weak CS- the number cf vertex subsets generated, the maximum efficiency is realized greatly reduced by the derivation tree. This is clearly evident for any complete k-partite graph K(m,m,...,m) . Here, each block X of m. vertices corresponds to a ^ x\ set of complete subgraph 122 eguivalent vertices, and hence also a eguivalent vertices. The derivation tree set for of weakly CS K (m , m , . . . ,ra) 1 2 using our new iteration V algorithm i s linear, as only one vertex at each defines derivation Ic a new for K(3^ subset. We illustrate such a in Fig. 3.6, where the vertices of block are labelled (i-1)m+1, (i-1) m+2,..., (i-1) m + m, i=1,2,3,U, with m = 3. For the complete k k-partite graph K(m ), the number cf nodes in the derivation tree determined by our algorithm is mk+2(k-1)+1 for k > 1 and 2(k-1) + 1 for k = 1. Obviously then the derivation trees are smallest for complete k-partite graphs. As mentioned previously case to be encountered occurs the worst when there are no weakly CS eguivalent vertices in the graph and conseguently cligues enumerated explicitly. The derivation determine the number of cligues in examine Fig. vertex have according 3.6, been to the edges of the related by by whether tree graph. are may be used to If we again tree incident to a common the symbols "X" or "," the sets determined in that derivation can be combined in a Cartesian Product to obtain a subgraph of the original graph G. If not then their union (denoted by i s a subgraph of G. We illustrate this notation with ",") the example of Fig. 3.7, a derivation of the cligues of Fig. 2 . 9 : . Using such a notation, the cligues are given by expression: {1 X [2 X 6}},{3 X {{2 X {5 X 6)},{4 X 5}}}. From this example one can see that the cligue (126) been explicitly defined the has while those containing the vertex 3 123 (ie. (2356) and (345)) are grouped from together. It i s evident F i g . 3.7 that there cannot be any edges not incident tc the root of the derivation tree which are related by a •*," for a graph whose cligues are a l l determined explicitly. Since the number of vertex subsets generated from the root i s at most nd(v) where v is a vertex of minimum degree, such a fewer than n cligues. Hence a l l graphs having more than n cligues have some vertices which are weakly CS the graph has eguivalent in induced subgraph defined on some vertex subset; therefore some improvement over a seguential algorithm can often be obtained by reducing the number of vertex subsets that must be considered. Fig. 3-6 DERIVATION TREE FOR K(3 V(G) / \ / \ 5* 6 Fig. 3.7 DERIVATION TREE FOR FIG. 3.1 I CHAPTER 126 EFFICIENT DETECTION CF CLIQUES 4.1 INTRODUCTION In the previous two chapters we have explored some of the ways in which cliques can be detected in graphs. We have examined how also various properties associated with graphs might be used to improve the efficiency major be made i s that i t i s not at a l l clear observation to of such how one might devise an efficient cligue algorithms. detection The algorithm even to detect cliques of a particular order. In this chapter, however, a procedure for the detection of such cliques i s proposed which can be proved to be an efficient algorithm a particular for class of graphs and for which no counterexample has yet been found for general k-partite graphs. An important application of cligue detection in graphs i s motivated by the fact that i t i s possible to represent a formed formula of well the propositional calculus in disjunctive normal form as a k-partite graph where k conjuncts survey of Chapter 1 we in the sentence. In the i s the number of mentioned briefly the efforts of Cock, Karp and Lawler, among others in developing a taxonomy of combinatorial problems. In particular we relates the noted an important result been Cook's which tautology problem to a number of other important combinatorial problems. An extensive l i s t has of of these problems prepared by Karp £ 4 9 ] . We shall use his notation to define the concepts reguired in describing the eguivalence a k-partite graph to a of well-formed formula in disjunctive 127 normal form. 4.2 CLIQUE DETECTION AND He turn now to a Problem" SATISFIABILITY consideration of the "Satisfiability as defined by Karp [49] and i t s solution through the detection of cligues in a graph as suggested by Mowshowitz [63]. DEFINITION JKARP]_J. The s a t i s f i a b i l i t y problem i s defined as follows: Given as input the clauses C^,C 2 , ...,Cp, cf formed a well- formula in conjunctive normal form, does there exist a set S c { x^ ,x2,. . . ,xn<; x.^ ,x2 . i . ,xn } such that # a. ) S does not contain a complementary pair of literals and b. ) SOC He * $ are A = C r\ c 1 thus <^C p 2 for k=1,2,...,p. given the well - formed and asked to decide whether or formula not i t is satisfiable. To do this we convert i t s negation to disjunctive normal form. Suppose ~A i s a tautology. Then for a l l possible assignments of truth values to the variables cf ~A, ~A i s true and conseguently A i s false for a l l possible assignments. Therefore A i s satisfiable tautology. It i f and only if ~~k is not a i s the disjunctive normal form cf ~A that we shall represent by a graph. Let 1 propositional conjunct o... S = D UD 2 calculus o D, k in ± 1 i 2 a sentence of the disjunctive normal form with each D = a n a r» ...r» a. 1 1 be m where a. x 3 is a literal. 128 Define a k-partite graph G as follows. Each vertex in V (G) corresponds to a l i t e r a l of S, there being as many vertices as there are l i t e r a l s of S. The unordered corresponding vertex pair (v ,v ) a b tc l i t e r a l s a and b i s an edge of G i f and only i f a i s not the complement of b, and a and b are net both members of the same conjunct. Thus to each conjunct of S there corresponds a vertex block of G consisting cf mutually non- adjacent vertices. This representation can be used to determine whether or not S i s a tautology. The decision rule i s : THEOREM iiilj. S i s a tautology i f and only i f there does not exist a cligue of crder k in the corresponding graph G , k being the number of conjuncts i n the disjunctive normal form. Proof: A cligue of order k in G exists i f and only i f there exists a selection of l i t e r a l s , one from each such conjunct that no l i t e r a l and i t s complement are both contained in the selection. If such a selection exists then we the of S value can assign 0 (false) to each of the l i t e r a l s in the selection and hence negate the well-formed formula. On the other hand i f such a selection i s not possible this corresponds to the fact that no such 1 assignment to the l i t e r a l s of the well-formed formula can be made and hence i t must be a tautology. QED. The object cf this chapter i s to describe which provides an efficient heuristic for determining whether a cligue of order k exists in an arbitrary Such an algorithm k-partite graph. an algorithm can then be employed through theorem 4.1 as 129 an efficient solution to the tautology problem. Before presenting such an algorithm i t i s necessary to define and discuss a collection of vertex sets determined by the algorithm for a k-partite graph mechanism G which provides the for detecting the existence of cligues cf order k. The importance of these vertex sets will be established subsequent theorem. F i r s t , however, we shall assume that we are given a k-partite graph partitioned into in a G with k blocks i t s vertex V ,7 ,. •.,V X set V(G) of mutually nonK *L adjacent vertices. We define wf(u,w), i$s$k-2 to be be a subset of block associated with an edge <u,w) such t that W (u,w) f 4 , m ai=1,2,...,t; t=1,2,...,s-1. s If i=s then W (u,w) = V n A(u)nA(w) s where A(u),A(w) denote s the adjacency sets of u and w respectively. Else for i < s W (u,w) i s the set of a l l vertices v in 7. i i i such that S _1 (a) v.e W? (u,w)0 l l 1 1 f fT^f S^sT* ^ L r=i+lLv *W°(uw) r (b) W" (u,v )0 w ^ ( w ^ ) *§ 1 _1 (u#v ) O w T ( w , v W^ I r' I r *J for j=1,2,...,i-1. It is evident from the definition that for any particular value of s a family of sets associated with an edge determined in the order (u,w) is s W (u, w) ,M (u, w) , . . . ,(u,w) . This s S*~X S X order i s a conseguence of the fact that W?(u,w) i s dependent upon s s W.,(u,w) ,W.Ju,w) ,... ,W (u,w). A number S XX X£ cf properties s associated with this family of sets may be readily from the definition: S 1 PROPERTY Jz W^(u,w) <= W ^ (U, w) . determined 130 PROPERTY 2: W?(u,w) not empty implies s that W.. X X (u,w), J j=1,i,...,s-i, not empty. PROPERTY 3j_ Every vertex in w? (u,w) i s adjacent to u and w. 4.4 CLIQUE DETECTION ALGORITHM The algorithm proceeds by constructing the vertex sets (u,w), s=1,2,...,k-2 for each edge in the graph G. An edge (u,w) i s deleted from empty w k ~ 2 for the some graph whenever i=1,2,...,s. (u,w),i=1,2,...,k-2, have been (u,w) After becomes the constructed sets for remaining in G connecting vertices in blocks V edges and Vk , and at least one such edge remains, then the sets H f (u,w) if redefined by iterating the above procedure. These are iterations continue until one of two conditions occurs: (a) A l l edges have been eliminated from G. (b) The latest iteration s resulted i n no further deletions. The following theorem establishes that condition (a) implies G contains no complete subgraph of order k. 4.2z THEOREM k W~ 2 (u,w) If G has a complete subgraph of order k then i s not empty for some edge (u,w) of G. Proof: From the definition, the theorem is true Assume s that for k = s i t is the case that K(1 ) for k = 3,4. is a subgraph 131 s of G implies W~ E(G), (u,w) i s not empty for some edge (u,w) S where u,w are in V(K(1 )). Suppose further that K(1 s is a subgraph of G where V(K(1 x)) = {v. ,v ....,v 1 V U V (Mf )) = { 1 ' 2 " * * ' s - 2 ' ' " s V (K(1 )) = {v ,v 1 s V (K(1 )) = { Therefore S in each of s 2 WT 1 v, ,v ....,v 1 from 2 W ~ (u,w) V V of order ,v , ,u} s-l V W } ' s-1 ' * the induction hypothesis v i s contained s 2 s 2 s-l are mutually adjacent, 1 implies 1 v^ 1 1r| [ W ^ ( u , vr) r\ H"" (w, is (w,v , ) . . Since s-l v contained in 1 1 contained ) ] for r = 2,3,... ,s-1. definition v i s contained in W ° (u,w). most two vertices per vertex block greater than 0 after a l l possible edge in Hence by QED. The following lemmas and theorem show that i f at s S-<£ " - - ' V 2 2 1 3 (u,w),W r (u,v , ) , and W~ ,u,w 0 s-2 2 v 2 ) v. s-l 2 being a vertex from block V . Complete subgraphs are defined on 3vertex sets: v V .,u,w}, s 1 in there are of G with degrees deletions have been made then G contains a complete subgraph of order k. LEMMA iJiJi Let V ,V ,...,V partite graph G and suppose w k-2 ( U # H j ' jj = 1 f o r i=1,2,...,k. Then k n o t empty implies K(1 ) is contained in G. k Proof: denote the vertex blocks of a kv By Property 2 W " k 2 (u,w) i s not empty for i=1,2,...,k- 2 2. Since |V | = 1 W ~ (u,w) = V . Let v be the vertex in i i i i block V . From the definition: k-2 wk-2 ( U , w) = {v }n [ [W r-l { u # v ) Wr-1{ W r V ) 3 n J 1 i i r=i+-l 1 r 1 r 132 implies vi is adjacent i=1, i , . . . ,k-2. Hence to v { vr v for r=i+1,i+2,...,k-2, and # v U W i ' 2 * * * * k-2 * ' J * s a s e t ° ^ mutually adjacent vertices. LEMMA 4.2^ If G i s a k-partite graph such that each vertex block has exactly Wk-2 two vertices, then for any edge (u,w) (u,w) not empty implies K(1 k ) i s contained in G. 3 Proof: Let k=5 and suppose v^ i s a vertex in there exists v^ in 2 (u,w) 2 such that v^ (u,w). Then i s contained in 2 (u,w)rkH^ (u,v.j)r\W (w,v.j). It also follows from the 2 2 definitions of W.^ (u,w), 9^ (u,v,j), and 2 (w,v^) that one can choose vertices x,y,z in induced on vertex such that complete sets {v^,x,u,w}, subgraphs are {^,y,v,j,u}, and { v 1 # z,v 3 ,w }. Now since |V\| = 2, either x = y, x = y or x = z then x i s adjacent to v subgraph i s induced on {v^,x, v^,n, w}. adjacent a * l ' v Y ' V 3 ' x = z, to ' U W w and complete or y = z. If and hence a complete If y = z subgraph then i s induced s W in i (u,w) w f (u,w). Then by 3 1 v is a definition there exists v in s 1 such that v.. s W" on J" Assume the lemma is true for k=s-1 and suppose vertex y is is contained in 1 s (ur w) r^- fi ~ 1 s (u,vg)o» ^ s 1 1 (w,vg). s each of the sets W " (u,w) ,W ~ JL 1 Since i s a member of s (u,v ),and W ~ -L 1 -L S (w,v ) , by the S induction hypothesis one can find vertex sets X, Y, and Z such that complete subgraphs of order s+1 are defined on 133 {v ,u,w}ux, {v ,v ,u}UY, and (v ,v ,w}uz. Each of the sets X X contains one X S vertex from each s of the vertex blocks V . V „ , . . . ,V \ . 2 s-1 3 Let s = xny, S X = X H Z , and S ^ = I n Z . Then S US contains a vertex from each of blocks *c V_,V_,...,V , 2 |V | = 2. us X J j since s-JL 3 Since each vertex in Sn i s adjacent to every vertex 1 i in X and Y, i t i s adjacent to every vertex in S2 hence S US and 1 2 SOS„ 1 ' are 3 a set of and i s adjacent vertex also adjacent Z and hence vertices. S^S^ Hence is S^us^s^ and mutually adjacent vertices. Similarly each vertex in S2 in S^ to every a set of mutually induces a complete subgraph of order s-2. Further, by construction v ,v ,u, and w X are adjacent tc every vertex in S^u s^U S^ and therefore G contains a complete subgraph of order s+2. THEOREM 4..3:. If G is s a k-partite k graph QED. with at most two 2 vertices in each block, and i f W ~ (u,w) i s not empty for some edge (u,w), then G contains a complete subgraph cf order k. Proof: The result follows almost immediately from lemmas 4.1 k 2 and 4.2. If for some i W~ (u,w) contains only one vertex i then as a consequence l of every vertex in every other there are r < k-2 sets v the definition v^ is adjacent to set k 2 W~ 3 (u,w), i + j . Suppose having 2 vertices. By the method in lemma 4.2 we can find a complete subgraph of order r+2 defined on vertices chosen from these sets. By the remark above, each vertex in a set which i t is the sole member is adjacent to a l l other vertices and therefore G contains a complete subgraph of 134 order k. QED. Finally, we note that the following cligue detection algorithm could be modified to work for a l l k-partite graphs. If condition (b) occurs and i f the condition of theorem 4.3 i s not satisfied, then a cligue enumeration procedure can be applied to the subgraph of G which remains edge deletions verification occur. This after nc further method has been implemented for purposes and the results are summarized in Chapter 5. CLIQUE DETECTION ALGORITHM Let G be a k-partite graph with blocks , V"2,. .. ,Vfc . Denote by A (u) the adjacency set of vertex u. STEP 1: Set s to 1. Define graph HQ egual to G. k STEP2: If E (H Jempty then stop—K{1 ) i s not in G. ~ s-l STEP3: Choose an edge (u,w) in E(H . ) where u and w are s-l vertices in V(H , ) - V . s s-l s STEP4: Set w (u,w) to v n A (u)rv A (w). s s STEP52 If s = 1 then go to STEP16. STEP6:. set i to s-1. STEP?: Set r to i+1. Set P to be empty. STEP8: Choose a vertex v in W^(u,w). STEP9: r If (u,v) and (w,v) are edges of H , and r-1 W ~l(u,v)n W (w,v) not empty then go to STEP12. i is s STEP10: Set W (u,w) to W (u,w) - fv}. s r s r STEP11: If W (u,w) empty ,then set W (u,w) to be empty r 1 and gc 135 to STEP 16. S e tp t o p STEP 122 U[ W 1 ^ ( U f V j n 1 W \ (w,v) ] . s STEP 13: If some v in W (u,w) has not been examined then go to r STEP8. STEP 14: If P not empty then 3 3 1 set W (u,w) to W" (u,w)OP. 3 Otherwise set W (u,w) to be empty and go to STEP16. Set i to i-1. If i>1 then go to STEP7. STEP152 3 STEPI62 in E (H s-1 If W (u,w) has not been computed for a l l edges (u,w) ) , where u,w are in V(B ,)-V , then gc to STEP3. s-1 s STEP 17: Define graph H £ H ., as follows: s-1 s a. ) V(H ) i s set to V (H n )-V, s-1 s s b. ) (u,w) i s an edge of E(H ) i f and only i f (u,w) i s an edge of E (H and s-1 1 Set s to s*1. If s<k-2 then go to STEP2. STEPJ82 STEP 19^ s s ) and W (u,w) i s not empty. L e tG only ' b et b e subgraph of G such that v i s in V (G*) i f i f there exists (u,w) in E (H k 2 W~ (u,w) for some i = 1,2, i G = G ' and go to STEP1. STEP20: If G* has at most ) such that v i s in ,k-2. If G ' 4 G 2 vertices in each then let block of k degree > 0 then K(1 ) i s contained in G ; else a cligue must be verified by enumeration. As summarize an example, consider the results the graph of the algorithm of Fig 4. 1. We in TABLE 4.1. The subgraph defined by STEP 19 after 1 iteration i s the complete subgraph of order 5. The second iteration defines sets for this graph as indicated deleted in the table. Vertices which are according to STEP 10 are indicated in parentheses. By STEP 19, the algorithm terminates after the second iteration. Fig. U.l GRAPH CONTAINING SUBGRAPH K(l^) ITERATION EDGE W 2 (36) § 1 2 (37) 2 (38) 2 (46) § (47) 2 (49) 2 3 2 3 w W w l W W 2 3 W (56) .1 (58) 1,2 (59) 1,2 (68) 1 (3) ,5 1 (69) 1 (U) ,5 1 (78) 1,2 3 2 (79) 1.2 4 2 (89) 1,2 5 1,2 (56) 1 (58) 1 (59) 1 (68) 1 5 1 (69) 1 5 1 5 1 (89) 1 6, (7) 5 6 5 1 1 TABLE 4.1 RESULTS OF THE ALGORITHM FOR FIG.4.1 138 4.4 TIMING CONSIDERATIONS Let G be a complete k-partite graph with each block m vertices in i = 1,2,...,k. For such a graph any choice of vertices v .,,v„ ,... ,v. with v. a member of V. i s the vertex set 1 2 k l x of some K (1 ) contained in G. Further every edge member of in G is a some E(K{1 ) ) . G therefore constitutes the "worst" k-partite graph the algorithm can encounter. During V u . . . VJ V u V S+l iteration s+2 such all edges defined over must be examined. There areraj(k-s) (k-s-1) K edges. s s Fcr each edge s (u»«) we 2 must compute s W (u, w) , W ; (u, w) ,..., wlf (u,w). H (u,w) can be determined in s s-l 1 s one intersection (that of rows u and w of the adjacency matrix treated as n digit binary numbers), while H?(u,w) i s computed from i STS/ the definition r-l W . (w,v ) r»-l i \ W . (u,v ) vrcW|(u,w) x r i as is follows, computed in m r intersections, m-1 unions and m tests for emptyness for each r = i+1,i+2,...,s. The total computation time fcr wf(u,«) i s therefore 1+(s-i) (3m-1), sc to compute a l l s sets reguires s-l 1 + Zl ((s-i) (3m-1) + 1) steps. The total number cf steps at i=l iteration s for a l l edges i s thus given by: s-l (mf (k-s) (k-s-1)) (1 + n ((s-i)(3m-1) + 1) 2 Since i=l there are k-3 iterations reguiring these 2 computations (iteration 1 only computes W^(u,w) fcr ^. (k-1)(k2) edges), the total number of steps in the computation i s : k-2 m* (k-1) (k-2) + r T - 2 2 2 m (k-s) (k-s-1) s=2 T s(s-1)(3m-1) + s J 2 fm (k -3k + 2)l r3m-1k3 + m + 1k - 3m-38k + 3 (3m-in h 30 2 5 5 1 I 1 J 139 Hence ? an iteration of the algorithm i s 0(k ) with leading 2 coefficient m (3m-1). 120 Since at least one edge iteration must be deleted during each of the procedure, a crude upper bound on the number of iterations i s given by the number of edges of the graph. In general i t i s d i f f i c u l t to obtain empirical tests cn a large a sharper sample bound although of graphs yielded none reguiring more than two iterations to reach a decision. In any case, the time reguired to execute the procedure remains bounded by a polynomial in k. 4.5 STORAGE CODS IDEBAT IONS Let G be the graph described in our discussion cf timing considerations. Both the adjacency matrix of G and the storage reguired for saving the W-sets place the space. Each greatest demand cn s W (u,w) and row of the adjacency matrix can be i represented by bit strings of length m and mk respectively. Since edges with incident vertices in V will not be r considered after iteration r-1, such edges reguire storage for r-1 terms R~:J",W ,..., W F"""''. An examination of the complete r - l r-2 1 2 k-partite graph G with block size m reveals that m edges will r r have in V k-2 such terms computed for them (those with ene vertex k-1 2 and one i n V ) , 2m k terms,..., 2 (k-2)m edges will have k-3 such edges will have only one such term. Since only the most recent values of any such term are ever reguired the maximum number cf terms 2 k-2 i s m i £: «l 2 i(k-i-1) = m ./k\. The 2 b/ 140 total storage reguirement in bits for these items i s thus (mk)2 • £ /*M. 3 4.6 IMPLICITIONS OF THE PROCEDUBE We have established that the procedure i s an efficient algorithm for detecting k-cligues in graphs having at most two vertices per block. When used as a tautology testing procedure, this agrees with Cook's result[11] that well-formed formulae having determined at most two literals per clause can be to be tautologies in polynomial time. Moreover, i f graphs having more degree > 0, can than be two vertices per block, each of reduced to graphs having at most two per block, then the procedure i s s t i l l guaranteed to detect the existence of k-cligues in polynomial time. Thus, the algorithm accepts a larger class of well-formed formulae (represented as graphs) than the Davis-Putnam procedure [66]. In light of the work of Cook and Karp, the procedure provides a mechanism for greatly reducing the number of formed the formulae that might require an exponential solution to tautology counterexample Cook well- and tautology Karp problem. might The of a specific help to resolve the guestions raised by concerning problem. construction the exponential nature However, finding remains an open problem. such of the a counterexample 141 CHAPTER 5i EMPIRICAL OBSERVATIONS AND SUMMARY 5. 1 INTRODUCTION As has been observed in Chapter 2, the clique enumeration algorithms discussed there a l l operate in essentially the same way, namely the development of a derivation tree from representing the i n i t i a l vertex a node set of the graph to ncdes representing the cligues of the graph. The thesis of Chapter 2 is that the size of the derivation algorithm applied to a tree on can be used derivation to develop new nodes different of the tree in as much as this procedure determines their total number. The algorithms examined in Chapter methods to generate new the same derivation which use was seen to tree as the Reduced Redundancy algorithm. The purpose of comparison in this the actual 2 all vertex subsets with the exception of the Bron-Kerbosch algorithm compare as a of i t s efficiency. This efficiency therefore depends the technique employed develop by each Moon-Moser graph together with the order of computation for one iteration measure developed performance Chapter of different i s to methods for developing the derivation tree with the results of Chapter 2. The same set of data used to test the cligue algorithms was also used to test the efficiency of the clique detection procedure of Chapter 4 in order much better enumeration i t s performance enumeration algorithm. to determine hew was than employing an ordinary 142 5.2 THE TEST DATA In order to determine the performance of the random graphs of various edge densities were generated for graphs having 9, 12, 18, and 21 vertices. Each then run algorithms, algorithm was on the test data. For each graph and each algorithm the following statistics were recorded: a. ) The time required to find a l l the cligues, b. ) The number of vertex sets examined, c. ) The number of cligues found. Sixteen test graphs were generated and their orders, edge densities, and number of cliques are given in TABLE results given of applying in TABLE implementations 5.2 used 5.1. each algorithm to this set of graphs i s (time may in seconds) be found while the in APPENDIX actual B. Due to excessive computation time, the algorithms of Harary-Boss Peay were features not of The and applied to some of the graphs. Because of the dynamic manipulation, PL/1 storage was allocation and bit string used as the programming language for the implementations of these algorithms. The programs were run on an IBM 360/67 Duplex system. The actual graphs may be found in APPENDIX C. edge graph vertices density cligues 1 9 0.4 8 2 9 0.6 10 3 9 0.8 17 4 9 1.0 27 5 12 0.4 12 6 12 0.6 19 7 12 0.8 31 8 12 1.0 81 9 18 0.4 34 10 18 0.6 39 11 18 0.8 69 12 18 1.0 729 13 21 0.4 43 14 21 0.6 58 15 21 0.8 144 16 21 0.9 392 TABLE 5.1 CHARACTERISTICS OF THE TEST GRAPHS HARARY-ROSS ALGORITHM GRAPH PE AY «S BONNER'S ALGORITHM ALGORITHM R. R. ALGORITHM TIME NODES TIME NODES TIME NODES TIME NODES 1 0.51 15 0.20 36 0. 15 27 0. 20 20 2 0.71 19 0. 18 46 0. 17 34 0.23 24 3 1.28 33 0.47 49 0.22 48 0. 28 31 4 1.97 53 0. 71 79 0.27 64 0.42 40 5 1.85 23 0.64 62 0.25 55 0.35 37 6 2.62 37 1.11 103 0.36 77 0.50 51 7 a.75 61 0.97 99 0.70 142 0.68 65 8 11.80 161 2. US 241 0.97 256 1. 18 121 9 8.06 67 2.48 182 0.55 117 0.85 78 10 9.48 79 2.91 221 0.71 143 1.36 91 11 18.67 137 11. 13 655 2.77 465 2.01 150 15.31 4096 12 13 13.91 85 14 16.32 115 15 16 3.32 14.03 1093 231 0.73 149 1. 54 108 6. 12 444 1.62 354 2.35 21.75 1490 4.80 1177 15.03 4023 TAELE 5.2 COMPARISON OF ALGORITHMS 167 5. 36 374 11.01 688 145 5.3 DISCUSSION OF THE RESULTS The results cf running each algorithm on the test data of TABLE 5.1 were used to verify the predictions size of the derivation tree. graphs (see page respectively. Because algorithm as they algorithm attempt was made tc obtain such correspond to 13) on 9, 12, and 18 vertices of the slowness and Peay's for the For this purpose, the graphs numbered 4, 8, and 12 were , examined Moon-Moser made of the Harary-Ross for the smaller graphs no results for the Moon-Moser graph on 18 vertices. For the test cases, the Harary-Ross algorithm generated the fewest nodes of the derivation tree in finding the cligues of a graph yet performed more poorly than which generated the largest Bonner's derivation conseguence of the fact that the method trees. used algorithm This is a by Harary and Ross to generate new vertex subsets while being very selective is also very time this algorithm sacrifices consuming as was seen in our analysis of in Chapter efficient node 2. On the other hand, Bonner generation in the derivation tree for a simple means cf defining new vertex subsets. In spite of i t s defects as observed by Augustson method appears and Minker £ 5 ] , this to be very successful particularly with small graphs as one would expect, since for such a graph the size of the derivation tree does not yet dominate This the the computation. hypothesis is further supported by the observation that Reduced competitive Redundancy with algorithm Bonner's appears to be aost algorithm for graphs of high edge 146 density. Such graphs have large numbers of cligues and hence their derivation trees will be large. Peay's Harary-Ross algorithm performed significantly better than the algorithm competitive with and the would other probably algorithms have i f the derivation tree generated were reduced. Such seems been wore a size of modification feasible i f one were to employ a technigue of examining a l l the non-adjacent vertices associated vertex than at one time approach similar to rather that taken with a particular step-by-step. This i s an in the Reduced Redundancy algorithm. If the order cf computation for one iteration multiplied by the number of nodes in the derivation tree cf K(3 } derived in Chapter 2 i s used as a rough measure of relative efficiency, then good agreement i s obtained with the empirical results. although such an estimate does net indicate accurately how much better one algorithm i s than another, the difference large in magnitudes graphs, provides of these values, particularly with some guide in choosing the most efficient algorithm. The inefficiency of cligue enumeration algorithms for finding the existence of a maximal complete subgraph cf order k in a TABLE 5.3 k-partite where we graph i s revealed by the results given in compare the computation time of the decision procedure of Chapter 4 applied to the graphs of TABLE 5.1 with the best time available for the Reduced Redundancy algorithm. While i t is true that such enumeration procedures 147 could be modified tc terminate when discovered, this would not preclude the a k-cligue had been possibility of cliques being discovered rather late in the enumeration. such ENUMERATION DECISION ALGORITHM ALGORITHM GRAPH TIME TIME 1 0. 15 0.06 2 0. 17 0.07 3 0.22 0.09 4 0.27 0. 11 5 0.25 0. 15 6 0.36 0. 16 7 0.68 0. 20 8 0.97 0.28 9 0.55 0.64 10 0.71 0.77 11 2.01 1. 05 12 14.03 4.28 13 0.73 0.82 14 1.62 0.89 15 4.80 1. 54 16 11.01 3.98 TABLE 5.3 COMPARISON OF DETECTION AND ENUMERATION 149 5.4 SUMMARY The principal goal of this thesis has been to examine ways of detecting cligues in graphs. In particular, sought to derive was have an efficient algorithm fcr determining the existence of a clique cf order k in a goal we achieved k-partite graph. This in Chaper 4 for a subset of a l l graphs by adopting a different approach to the problem than that offered in Chapters 2 and 3. In these chapters unlikely that we could solve we saw that i t was our problem by employing any cligue enumeration algorithm. This was because such algorithms could be compared to tree searching processes which are known to be inefficient procedures. In Chapter graphs which together 3 we attempted to exploit some properties of would allow us to group discussed: graph equivalence. improvement The Two kinds of "similarity" latter offered some possibility cf because of the concise notational representation. defining for avoiding cligues although type of notation while the problem of an attempt was made to minimize such behavior. Further, i t should new were theoretic similarity, and complete subgraph However no solution was found the vertices so that we would not have to examine each vertex in such a group individually. multiply "similar" be observed that concise, does not readily display either the cligues or their orders. As the number of clique enumeration algorithms in the literature increases, i t i s useful to carry out some empirical comparison of the performance of these procedures. It has been 150 our observation that the implementations of several of these algorithms have which been improve have been extremely their tc modifications efficiency.'Some of these modifications discussed implementations. sensitive previously The and were included improvements thesis, such approach and to changes the have algorithms not really able discussed changed to in this the basic as a conseguence their computation times remain exponential. Therefore empirical tests of such procedures possible the efficiency of these implementations was examined in the previous section. While we have been suggest in only for moderately graphs, determination of the large size are graphs. For very large of the derivation tree provides a more useful and less expensive method for assessing the performance of enumeration algorithms. In detects Chapter the 4, an efficient algorithm was defined which existence of k-cligues in certain k-partite graphs. Such graphs have the property that they can be reduced by the algorithm to graphs having at most two vertices of degree > 0 in each vertex block. The work of Cook and Karp suggests (but does not imply) that the algorithm will not work for all k-partite graphs. proving this may help in characterizing why the tautology problem does not appear to be solvable in polynomial time. 151 BIBLIOGRAPHY ____________ ( 1 Abraham, C T . "An application of clustering techniques to minimizing the number of interconnections in electrical assemblies " Some Problems In Information Science (M. Kochen ed.), 1965 pp. 252-265 2 Abraham, C T . "Techniques for thesaurus organization and evaluation " Some Problems In Information Science (M. Kochen ed.), 19657 pp.~137-150 3 Abraham, C T . "Graph theoretic techniques for the organization of linked data " Some Problems In Information Science (M. Kochen ed.), 19657 PF~ 229-251 4 Andrasfai, B. "New proof of a graph theoretic theorem of Turan " Magyar Tud. Akad. Hat. Kutato. Int. Kczl 7 (1962) pp. 193-196" 5 Augustson, J . Gary, Minker, Jack "An analysis of some graph theoretical cluster techniques " J . Assoc Commuting Machinery. 17 (1970) pp. 571-588 6 Bonner, R.E. "On some clustering technigues " IBM J . Research And Development 8 (1964) pp. 22-32. 7 Boyle, R.P. "Algebraic systems for normal and hierarchichal sociograms " Spcipmetry 32(1969) pp. 99-119 8 v Bron, Coen, Kerbosch, J.A.G-.M. "Finding a l l cliques of an undirected graph " Communications Assoc. Computing Machinery tc appear ~ ~ 9 Cartwright, D., Harary, F. "Structural balance: a generalization of Heider's theory " Psych. Rev. 63(1956) pp. 146-153 10 Coleman, J.S., MacRae, D. (jr.) "Electrcnic processing of sociometric data for groups up tp 1000 in size " Amer. Sociological Rev.. 25 (1960) pp. 722-727 11 Cook, S.A. "The complexity procedures " Third ACM Symposium Computing 1970, pp.~15T-T58 ~ 12 Corneil, D.G. "Graph isomorphism " Department Of Computer Science^ University Of Toronto Ph.D thesis, 1968 13 for theorem-proving The Theory Of Corneil, D.G., Gotlieb, C C "An efficient algorithm graph isomorphism " J . Assoc_. Computing Machinery 17 (1970) pp. 14 of On 51-64 Culik, K. "On the chromatic decompositions and chromatic numbers of graphs " SpJLsjy Pri red. Fak. Univ. Brno. 1959 pp. 177-185 152 15 Davis, J.A. "Clustering and structural balance in graphs " Hunan Relations 20(1967) pp. 181-187. 16 Dirac, G. "Extensions of Turan's theorem on graphs " Acta^ Math. Acad. Sci. Hungar. 14(1963) pp. 417-422 17 Dirac, G. "On complete subgraphs and complete stars contained as subgraphs in graphs " Math. Scand. 12(1963) pp. 39-46 18 Dirac, G. "Extensions of the theorems of Turan and Zarankiewicz " Proc. Symp. Smolenice 1963 pp.127-132 19 Dirac, G. "Chromatic number and topological complete subgraphs " Canad. Math. B u l l . 8(1965) pp.711-715 20 Doreian, P. "A note on the detection of cligues in valued graphs " Sociomgtrv 32(1969) pp. 237-242. 21 Erdos, P. "Remarks on a theorem of Ramsey " Bull.. Res. Council Israel Sect. F 7 (1957-1958) pp. 21-24 22 Erdos, P. "Graph theory and probability I " Canad. J . Math^ 11(1959) pp.34-38 23 Erdos, P. "Graph theory and probability II " Canad. MatJls. 13(1961) pp. 346-352 24 Erdos, P. "On a theorem of Rademacher-Turan " 111. J.. Math.. 6(1962) pp 122-127 25 Erdos, P. "On the number of complete subgraphs contained in certain graphs " Magjfar Tud. Akad.. Mat^ Kafat2i I S i i Kozl* 7(1962) pp. 459-464 26 Erdos, P. "On circuits and subgraphs cf chromatic graphs " Mathematika 9(1962) pp. 170-175 27 Erdos, P. yon complete topological subgraphs of certain graphs " Ann. Univ.. Sci. Budapest Eotvos Sect, "ath. 7 (1964) pp. 143-149 28 Erdos, P. "On the number of triangles contained in certain graphs " Canad.. Math. Bull. 7 (1964) pp. 53-56 29 Erdos, P. "Some remarks on Ramsey»s theorem " Canad. l a t h . Bull.. 7 (1964) pp. 619-622 30 Erdos, P. "On cligues in graphs 4 (1966) pp. 233-234. 31 Festinger, L. "Time analysis of sociograras using matrix algebra " Human Relations 2(1949) pp. 153-158. " Israel J± Math. 153 Folkman, J . "Regular line symmetric Combinatorial Theory 3(1967) pp.215-232 Forsyth, E., Katz, L. "A analysis of socioraetric data: Sociometry 9(1946) pp. 340-347. graphs " J,. matrix approach to the preliminary report " G i l l , A. "Analysis of nets by numerical methods " AssoS' Computing Machinery 7 (1960) pp. 251-254 Giraud, G. "An upper bound for Ramsey number (5,5) " C.R. Acad.. Sci. Paris 265 (1968) pp. 809-81 1 Gotlieb, C.C., Kumar, S. "Semantic clustering of index terms " Assoc. Computing Machinery 15(1968) pp. 493-513 Graver, J . , Yackel, J . "An upper bound for Ramsey numbers " g u l l . Amer. Math. Soc. 72(1966) pp. 1076-1079 Graver, J . , Yackel, J . "Some graph theoretic results associated with Ramsey's theorem " J.. Comb. Theory 4 (1968) pp. 125-175 Hajnal, A., Suranyi, J . "On the decomposition of graphs into complete subgraphs " Ann.. Dniv. Sci.. Budapest, fotyos Sect.. Math^ 1 (1958) pp.""T 13-12~ Harary, F., Norman, R.Z. Graph Thecry As A Mathematical Model In Social Science Ann-Arbor: Institute for Social Research, 1953 Harary, F., Ross, I . "A procedure for clique detection using the group matrix " Sociometry.. 20 (1957) pp. 205-215. Harary, F. "Graph theoretic methods in the management sciences " Management Science 5(1959) pp. 387-403 Harary, F. Graph Mass., 1969 Theory Addison-Wesley, Reading, Harary, F. "Graph theory as a structural model in the social sciences " Graph Theory And Its Applications Bernard Harris (ed.) ,"970, ppT 1-16 Hubbell, C.H. "An input-output approach to cligue identification " Sociometry.. 28 (1965) pp. 377-399. I l z i n i a , I. "Finding the cliques of a graph Aytomat. I. Vychisl.. Tekn^ #2 (1967) pp. 7-11. Johnson, S.C. "Hierarchichal Psychometrika 32(1967) pp. 241-254 clustering " schemes " 154 Kalbfleisch, J . "On an unknown Ramsey number " Math. 13 (1966) pp385-392 Mich. Karp, R.M. "Reducibility among combinatorial problems " Proceeding Of The IBM Conference On The Complexity Of Computations Plenum Press, New~York, l972~ ~ ~ Kochen m. Some Problems In Information Scarecrow Press, new york, 1965. Science. Lawler, E.L. "Electrical assemblies with minimum number of intersections " IRE Trans. EC-11(1962) pp. 8688 Lawler, E.L. "Polynomial bounded and (apparently) non polynomial bounded matroid computations " Proceedings Of The NYO-ONR Symposium On Combinatorial Algorithms ~ (tc appear7~Prentice-Hall, New York,~1972 Lawler, E.L. "Matroids with parity conditions: a new class of combinatorial optimization problems " Mathematical Programming submitted 1972 Luce, R.D., Perry, A.D. "A method of matrix analysis of group structure " Psychometrika. 14(1949) pp. 95-116. Luce, R.D. "Connectivity and generalized cligues in sociometric group structure " Psychometrika. 15(1950) pp. 169-190. Meetham, A.R. "Algorithm to assist in finding the complete subgraphs' of a given graph " Nature 21 1 (1966) p. 105 Meetham, A.R. "Graph separability and word grouping " Proc. Assoc. Comp. Mach. 2,1st National Conference 1966 pp. 513-574 Mendelson, E. Introduction To Mathematical Logic Van Nostrand, Princeton, N.J.,"l964. ~ ~~ Moon, J.W. "On the number of complete subgraphs of a graph " Canad. Math. Bull. 8(1965) pp. 831-834 Moon, J.W., Moser L. "On cligues in graphs " Israel J_. Math.. 3 (1965) pp. 23-28. Moon, J . "On independent complete subgraphs graph " Canad. J . Math. 20 (1968) pp. 95-102 in a Moreno, J.L. Who Shall Survive? A New Approach To The Problem of Human Inter-Relations Beacon House,~New~York, 1934 Mowshowitz, A., ( private communication ) 155 64 Mulligan, G.D. "Algorithms for finding cliques of a graph " Department Of Computer Science^ University Of Toronto Technical report Mo. 4^, May 1972 65 Mulligan, G.D., Corneil, D.G. "Corrections to Bierstone's algorithm for generating cligues " Assoc. Computing Machinery 16(1972) pp. 244-247 66 Putnam, H., Davis, M. "A computing procedure for guantification theory" J . Assoc. Computing Machinery 7 (1960) pp. 201-215 67 Nordhaus, E.A. "On the density and chromatic numbers of graphs " Lecture Motes In Mathematics. 110(1969) pp. 245-249. 68 Peay, E.R. (jr.) "An iterative cligue procedure " Mich. Math. Psych. Prog^ 4 (1970). 69 Peay, E.R. (jr.) "Monmetric grouping: clusters and cligues " Mich. Math. Psych., Procj., 5(1970) 70 Ramsey, F. 0n a problem of formal Lond. Math. Soc. 30 (1930) pp. 264-286 71 Rose, M.J. "Classification of Computer J. 7 (196 4) pp. 208-211. 72 Sauer M. "A generalization of a theorem of Turan " J . Comb. Theory 10(1971) pp.109-112 73 Sims, Charles C. "Graphs and f i n i t e groups " Mat!u Zeitschr^ 95 (1967) pp 76-86 74 Spencer, J.H. "On cligues in graphs " Israel J ^ Math. 9 (1971) pp. 419-421. 75 Turan,: P. "Eine , extremalaufgabe aus der graphentheorie " Mat. Fiz. Lapo_k 48(1941) pp. 436-452 76 Turan, P. "On the theory cf graphs 3<1954) pp. 19-30 77 Turner, James "Point-symmetric graphs with a prime number of points " JL. Combinatorial Theory 3(1967) pp. 136-145 78 Weiss, R.S., Jacobsen E. "A method for the analysis of the structure of complex organizations " Ajer.. Sociological Review. 20(1955) pp. 661-668. 79 Zelinka, B. "On the number of independent complete subgraphs " Publ. Math. Debrecen. 13(1966) pp. 9 5-97 M detection logic " Proc., a set of elements " " permutation Cc1log. Math. APPENDIX A operation symbol STORE < PUSH,POP — time constant \ push,pop ADD,SUBTRACT • COMPARE M0LTIP1Y UNION,INTERSECTION u, rv COMPLEMENT - SUBSTRING substr INDEX index PRINT print \ s H s t 10 LIST OF OPERATIONS 157 APPENDIX B 158 ************************************************************ * * * THE MODIFIED HARARY-ROSS ALGORITHM * * * * * ************************************************************ HAROSS:PROC(A, N) ; DCL (A(N), /* ADJACENCY MATRIX */ G, /*CURRENT VERTEX SET*/ CLIQ, /*COHPLETE SUBGRAPH VERTICES */ GTEMP) BIT (N) , (N, /* NUMBER OF VERTICES IN GRAPH*/ WT, /*NUMBEB OF VERTICES IN G */ VTX(N), /* LABELS OF VERTICES OF G */ R(N), /*ROW SUMS OF (A**2XA) FOR G*/ D(N)) /* DEGRES OF VERTICES OF G */ FIXED BIN, VSET BIT(*) CTL; /* STACK OF SETS */ DCL CTR FIXED BIN; CTR=0; G=G| ( G) ; PUT SKIP; PUT SKIP LIST ('THE CLIQUES ARE:*); START: /* DETERMINE THE VERTICES IN SUBGRAPH G */ IF G-'O'B THEN GO TO NEXT; GTEMP=GTEMP| ( GTEMP) ; WT=0; DO 1=1 TO N; IF SUBSTR (G,I,1) = '0»B THEN GO TO LP 1 ; GTEMP=GTEMPSA (I) ; WT=WT+1 ; VTX(WT)=I; LP 1: END; IF GTEMP = 'O'B THEN GO TO NEXT; /* V CALCULATE ROW SUMS OF (A**2XA) AND DEGREES OF VERTICES OF G R,D =0; DO 1=1 TO WT-1; DO J=I+1 TO WT; SUM=0; IF SUBSTR (A (VTX (I)) ,VTX(J) , 1) = • 1 • B THEN BEGIN; ^ D(I)=D(I) + 1; D(J)=D(J) + 1; DO K=1 TO WT; IF SUBSTR (A (VTX (I) ) 8A (VTX (J) ) ,VTX (K) , 1) = »1»B THEN SUM=SUM + 1 ; END; END; 159 R (I)=R (I) *SUM; R (J)=R (J) +SUM; END; END; /* SEARCH FOR UNICLIQUAL VERTICES */ MIN=1; DO 1=1 TO WT; IF R (I) = D (I) * (B (I)-1) THEN DO; /* UNICLIQUAL VERTEX DISCOVERED */ CLIQ=A (VTX (I) ) SG; SUBSTR (CLIQ, VTX.(I) , 1) = «1*B; /* V /* IS THIS A MAXIMAL COMPLETE SUBGRAPH? GTEMP=GTEMP| ( GTE MP) ; DO 12=1 TO WT; IF SUESTR (CLIQ,VTX (12) , 1) = »1«B THEN GTEMP=GTEMP6A (VTX (12) ) ; END; IF GTEMP=»0'B THEN PUT LIST(CLIQ); DELETE ALL UNICLIQUAL VERTICES IN TBIS COMPLETE SUBGRAPH FROM G. */ SUBSTR (G,VTX (I) , 1) = • 0 • B ; DO J=I+1 TO WT; IF SUBSTR (A ( VTX (I) ) , VTX (J),1)=*0'B THEN GO TO LP2; , IF R(J)=R(I) THEN SUBSTR (G, VTX (J) , 1) = '0 B; LP2: END; GO TO START; END; ELSE IF R(MIN)>R(I) THEN MIN=I; ILP: END; /* END I LOOP */ /* V NO UNICLIQUAL VERTEX IN G DEFINE TWO VERTEX SETS. SAVE ONE AND PROCESS THE OTHER. ALLOCATE VSET BIT (N) ; GTEMP=G&A (VTX (MIN) ) ; SUBSTR (GTEMP,VTX(MIN) ,1) = M'B; VSET= GTEMP; DO J=1 TO IT; IF SUBSTR (GTEMP,VTX (J) , 1) = »0»B THEN VSET=VSET|A (VTX (J)); END; VSET=VSETSG; CTR=CTR+1; G=GTEMP; 160 GO TO START; /* GET A NEW SET FROM THE STACK. */ NEXT: IF CTR > 0 THEN DO; G=VSET; CTR=CTR-1; FREE VSET; GO TO START; END; FREE VSET; RETURN; END; /* END HAROSS PROCEDURE */ 161 ***************************** ********** * * * BONNER'S ALGORITHM * * * ************************************************************ BON: PROC (G,N) ; DCL (G (N) ,A (N) ,C (N) ,U,T) BIT (N)r L (N) ; STEP 1: 1=1; C(1) = '0«B; C(1)= C(1); A(1) = »0'B; L(1) = 1; STEP2: IF SUBSTR (C (I) ,L (I) ,1) = '1'B THEN BEGIN; STEP3: C (I + 1)=C (I) 8G (L (I) ) ; SUBSTR (C (1+1) , L (I) , 1) = ' 0 * B; A (I+1)=A (I) ; SUBSTR (A (1+1) ,L (I) , 1) = » 1 «B; STEPU: L (I+1)=L (I)+1 ; 1=1+1; END; ELSE I (I)=L (I) +1; STEP5: IF SUBSTR (C (I) , L (I) ,N+1-L (I) ) = '0«B THEN GO TO STEP2; T=A(I) ; STEP6: IF C(I)=»0«B THEN STEP7: 1=1-1; IF 1=0 THEN RETURN; STEP8: U=»0«B; SUBSTR (U,L (I)+1 ,N-L (I) ) = SUBSTR (C (I) ,L (I) +1,N-L (I) ) ; IF (T | U) = T THEN GO TO STEP7; L(I) = L(I)+1; GO TO STEP2; END; /*END BON PROC */ 162 ********************************** * * ************************** THE MODIFIED PEAY ALGORITHM * * * * * * ************************************************************ PEAY: PROC (A,N) ; DCL (A(N), /* ADJACENCY MATRIX */ G, /* CURRENT VERTEX SET */ GTEMP) BIT (N) , (N , /*NUMBER OF VERTICES IN GRAPH*/ WT, /* NUMBER OF VERTICES IN G */ VTX(N)) /*LABELS OF G'S VERTICES*/ FIXED BIN, 1 VSTORE BASED (VPTR), /* STACK OF SETS*/ 2 VNXT PTR, 2 VCTR FIXED BIN, 2 VN FIXED BIN, 2 VSET BIT (N REFER (VN)) , VHD PTR; /*POINTS TO STACK TOP*/ DCL P PTR; DCL CTR FIXED BIN; G=G| ( G) ; VHD=NULL; CTR=0; PUT SKIP; PUT SKIP LIST ('THE CLIQUES ARE: *) ; START:WT=0; /* DETERMINE THE LABELS OF THE VERTICES IN G V DO 1=1 TO N; IF SUBSTR (G,I,1) = '0'B WT=WT+1; VTX (WT) =1; LP1: END; THEN GO TO LP 1; /* FIND TWO NON-ADJACENT VERTICES IN G V /* */ DO 1=1 TO WT; GTEMP=A(VTX (I))6G; SUBSTR (GTEMP, VTX (I) , 1) = M « B ; IF GTEMP = G THEN GO TO FOUND; END; G IS A COMPLETE SUBGRAPH. DETERMINE WHETHER IT IS MAXIMAL. GTEMP=GTEMP| ( GTEMP) ; DO 1=1 TO WT; GTEMP=GTEMPSA (VTX (I)) ; END; , IF GTEMP=«0 B THEN IF STACK NOT EMPTY THEN CHOOSE ANOTHER VERTEX SET FOR FDRTHER PROCESSING */ IF VHD = NOLL THEN DO; G=VHD-> VSET; P=VHD; VHD=VHD->VNXT; CTR=CTR-1; FREE P->VSTORE; GO TO START; END; RETURN; FOUND: /* DETERMINE TWO NEW VERTEX SOBSETS,SAVE ONE AND PROCESS THE OTHER. */ /* V GTEMP= A (VTX (I) ) ; DO K=1 TO N; IF K = VTX (I) THEN IF SOBSTR (A (VTX (I)) ,K, 1) = '0 'B THEN GTEMP=GTEMP| A (VTX (I) ) ; END; SOBSTR (GTEMP,VTX (I) ,1)=«0'B; CHECK THE STACK TO SEE IF NEW SET "GTEMP" IS CONTAINED IN A PREVIOUS ONE AWAITING PROCESSING. P=VHD; DO WHILE (P =NULL) ; IF (GTEMP|P->VSET) = P->VSET THEN GO TO START; P=P->VNXT; END ; CTR=CTR+1; ALLOCATE VSTORE; VSET=GTEMPSG; VNXT=VHD; VCTR=CTR; VHD^VPTR; G=A (VTX (I) ) SG; SUBSTR (G,VTX(I) ,1) = '1»B; GO TO START; END; /* END PEAY PROCEDURE */ 164 * REDUCED REDUNDANCY ALGORITHM * * * * * * * * * * 4 1 4 c * * * * * * * * * * * * * * * * * * ENUM:PROC (A,N); DCL (A(N), Gf H, CSUB, * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * /* ADJACENCY MATRIX */ /* CURRENT VERTEX SET */ /* NEWLY DEFINED SET */ /*COMPLETE SUBGRAPH TO BE EXTENDED BY VERTICES IN G */ /* NEWLY EXTENDED CSUB */ /*SET OF DEFINING VERTICES */ CLQ, F, GMX, CMX) BIT (N) , (V, /* VERTEX CHOSEN FROM F */ N) /* NUMBER OF VERTICES OF GRAPH*/ FIXED BIN; DCL CTR FIXED BIN; DCL VTEMP PTR; DCL VSET BIT (*) CTL; /* STACK OF SETS */ CTR=0; NN=N*2; G=G| ( G) ; CSUB= * 0•B; /* DETERMINE WHETHER THERE IS A VERTEX ADJACENT TO ALL VERTICES IN G|CSUB. IF YES THEN NC CLIQUE IS DEFINED ON G|CSUB SO CHOOSE A NEW VERTEX SET. V START: GMX=G|CSUB; NV=0; DO 1=1 TO N; IF (A(I) |GMX) = A(I) THEN GO TO NEWSET; END; V=INDEX (G,•1'B) ; /* F IS THE SET OF VERTICES IN G NOT ADJACENT TO V. */ F=GS ( A (V) ) ; LOOP: H=GSA (V) ; CLQ=CSUB; SUBSTR (CLQ,V,1)='1'B; IF H=«0'B THEN /* * NO FURTHER VERTICES CAN BE ADDED TO CLQ, HENCE CLQ IS A COMPLETE SUBGRAPH. DETERMINE IF IT IS MAXIMAL. OUT: /* DO; NV=0; DO 1=1 TO N; IF (A(I)|CLQ) = A (I) THEN GO TO OUT END; GO TO NXTV; END; PLACE H AND CLQ ON THE STACK FOR FURTHER PROCESSING */ CTR=CTR+1; ALLOCATE VSET BIT (N N) ;. VSET=H||CLQ; NXTV: SUBSTR(F,V,1) = «0«B; SUBSTR (G, V, 1) = «0»B; V=INDEX (Ff * 1 ' B) ; IF V =0 THEN GO TO LOOP; /* V ALL VERTEX SETS HAVE NOW BEEN DETERMINED FOR THIS ITERATION. CHOOSE A NEW SET FROM THE STACK FOR FURTHER PROCESSING NEWSET: IF CTR <= 0 THEN RETURN; G=SUBSTR(VSET,1,N); CSUB=SUBSTR (VSET,N + 1, N) ; FREE VSET; CTR=CTR-1; GO TO START; END; /* END ENUM PROC */ 166 I * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 44**4*44*******4***4****** * * * * ALGORITHM TO DETECT THE EXISTENCE OF A CLIQUE OF ORDER K IN A K-PARTITE GRAPH * * * * ****************************4****444*4********************** KGRPH:PROC(A,N,M,K,MX); DCL A (N) /* ADJACENCY MATRIX */ BIT (N) , (M (K) , /* NO. VERTICES PER BLOCK */ MX, /* MAX. NO. VERTICES/BLOCK */ N, /* NO. VERTICES IN GRAPH */ K, /* NO. OF BLOCKS */ R,S,T,VX,VTX,P0S,P0S2,M1M2) FIXED BIN; DCL VB (K) FIXED BIN ; DCL AO (N) BIT (N) ; DCL U FIXED BIN; STEP0:A0=A; ITER ITER+1; N1=N-M (1)-M (K) ; POS=M(1)+1 ; POS2=POS+M(2); M1M2=POS2; NSP= (N-M1M2) *N1+N-M (K) ; BEGIN; DCL MSUM (K) ; DCL (P,Q,TEMP,W (NSP,K-2) ) BIT (MX) ; W=«0'B; MSUM (1) = M (1) ; DO 1=2 TO K; MSUM (I) = MSUM (1-1) +M (I) ; END; DO S=1 TO K-2; DO I=POS TO N-M (K) ; /* */ CHOOSE AN EDGE (I,J) NOT YET DELETED FROM FURTHER CONSIDERATION. DO J=I+1 TO N; 1 IF SUBSTR (A(I),J,1)= 1'B THEN BEGIN; P='0»B; /* MAP THE VERTEX PAIR (I.J) ONTO AN INTEGER I J . INITIALLY DEFINE THE SET W(IJ,S). */ /* V IJ= (J-M1M2)*N1 + I; W (IJ,S)=SUBSTR (A (I) 6A (J) ,POS-M (S) ,M (S) ) ; DEFINE THE SET W(IJ,T) FOR T=S-1,S-2,...,1 AS A FUNCTION OF THE PREVIOUSLY DETERMINED SETS W (IJ,T+1) ,W(IJ,T+2) , . . . , W (I J , S) . DO NT=1 TO S; 167 T=S+1-NT; IF T=S THEN GO TO SKP; DO R=T*1 TO S; VTX=INEEX (W(IJ,R),'1*B) ; DO WHILE (VTX = 0); VX=VTX+MSUM (R-1) ; IF SDBSTR (A (J) ,VX, 1) SSUBSTR {A (I) ,VX, 1) = *0 «B THEN BEGIN; V SUBSTR (W (I J , R) ,VTX,1)='0*B; GO TO SKP3; END; Q=W ( (J-M1M2) *N1+VX ,T) &W { (I-M1M2) +N1+VX ,T) ; /* IF THERE EXISTS A VERTEX VTX NOT ADJACENT TO ANY VERTEX IN W(IJ,T) THEN DELETE IT FROM FURTHER CONSIDERATION */ IF Q = 'O'B THEN P=P|Q; ELSE SUBSTR (W (IJ,R),VTX,1)='0»B; IF VTX >= MX THEN GO TO CHK4; TEMP=»0'B; SUBSTR (TEMP,VTX + 1,MX*1-VTX) = SUBSTR(W(IJ,R),VTX+1,MX+1-VTX); VTX=INDEX (TEMP, • 1 'B) ; SKP 3: END; CHK4; /* */ SKP: W (IJ,T)=W(IJ,T)&P; END; /* END R LOOP */ IF W(IJ,T) IS EMPTY FOR ANY T=1,2,...,S THEN DELETE EDGE (I,J) FROM FURTHER CONSIDERATION , , IF W(IJ,T)= 0 B THEN BEGIN; SUBSTR (A (I) ,J,1) = '0'B; SUBSTR (A (J) ,1, 1)= «0 «B; GO TO NEXT; END; END; /* END T LOOP */ END; /* END BEGIN BLOCK */ NEXT:END; /* END J LOOP */ END; /* END I LOOP */ POS=POS2; POS2=M (S + 2) +POS2; END; /* END S LOOP */ /* V TEST WHETHER ALL EDGES HAVE BEEN DELETED FROM THE SET OF CANDIDATES. IF S=K-2 AND NOT ALL EDGES HAVE BEEN ELIMINATED, THEN TEST FOR FURTHER ITERATION. DO I=N-M(K)-M(K-1) + 1 TO N-M (K) ; DO J=N-M(K)+1 TO N; IF W ( (J-M1M2) *N1*I, 1) = «0»B THEN 168 GO TO STEP19; END; END; CONDA:PUT SKIP LIST('NO K-CLIQUE'); RETURN; STEP19:DO 1=1 TO K; VB (I)=0; DO J=1 TO M (I) ; U=I*MX+J-MX; IF A0(U) = A (U) THEN GO TO STEPO ; IF A(U) = »0'B THEN VE (I) =VB (I) *1 ; END; END; CONDB:PUT SKIP LIST(»NO CHANGE'); DO 1=1 TO K; IF VB(I) > 2 THEN GO TO STEP20 ; END; PUT SKIP LIST ('K-CLIQOE EXISTS BY THEOREM U.3'); RETURN; STEP20:CALL ENUM (Af N) ; RETURN; END; /* END KGRPH */ 169 APPENDIX C graph 1 graph 2 000101001 000001100 000100101 101000101 000000110 110000110 011111000 101100000 000101001 000111100 000100111 111000111 010000110 110000110 011111000 101100000 graph 3 graph 4 000111001 000111110 000 1 11111 111000111 11 1000110 111000111 011111000 011111000 101101000 000111111 000111111 000111111 111000111 111000111 111000111 111111000 111111000 111111000 graph 5 graph 6 000101001000 000001 100000 000100101110 101000101101 000000110110 110000110011 0111 11000100 000011000101 101100000111 0011 10111000 001011001000 000101011000 000101001100 000111100111 000100111011 111000111000 010000110110 110000110101 011111000110 001111000111 101100000011 110011110000 011010111000 011001011000 graph 7 graph 8 000111001110 0001 11110101 000111111111 111000111110 111000110111 111000111111 011111000011 011111000011 101101000111 111111001000 101111111000 011011111000 000111111111 000111111111 000111 111111 111000111111 111000111111 111000111111 111111000111 111111000111 111111000111 111111111000 111111111000 111111111000 graph 9 graph 10 0000000001 11100111 000001011010111101 000000001111000001 000000010000010101 000000001110111001 010000000110100011 000000000101000001 010100000011100111 011010000000001C00 101011100000111111 11 1011010000010100 101000110000001000 110011010100000010 010110000110000010 010010001101000001 110100010110000000 100001010100110000 111111110100001000 000001111100110110 000001000111001101 000001001101100100 00000011100010011 1 000000010010101110 111000000011100011 100100000001111011 100110000011011110 101100000000110000 111000000000111011 010011010000010110 011001110000000001 101111101100000110 100000111110000001 010010110100000000 111110010010100000 100111110110100000 010101100101010000 graph 11 graph 12 000101101111001110 000011111111111111 000001110111111111 100000111110100111 010000011111101111 111000001111111011 111100000111011100 011110000001101110 110111000001001011 111111100000111111 111111100000000110 111011111000001111 011111010100000111 011001100100000011 111011111101000000 111110110111100000 111111011111110000 011111001101110000 000111111111111111 000111111111111111 000111111111111111 111000111111111111 111000111111111111 111000111111111111 111111000111111111 111111000111111111 111111000111111111 111111111000111111 111111111000111111 111111111000111111 111111111111000111 111111111111000111 111111111111000111 111111111111111000 111111111111111000 111111111111111000 graph 13 graph 14 000000000111100111010 000011010111101110001 000001000001010100000 000000101100111011100 010000000110100011001 011000000000010011100 000100000100000100000 010000000011110010100 000100000001000000101 1101 10100000001 101010 110010010000001110000 11 100001 1000001 101001 110110010000000011100 001101010000000C00011 010100000111000001001 11 1000100111000000000 110111010010100000001 1001 11000101101000000 000101011000100000000 100000000100010000000 010010001001011010000 0001111110010 10 111000 000011100011110011110 000001011011110110110 100000011111011011011 110000010001001101001 111000000011001011001 11000000000011001 1110 101110000011110111000 101100000000111111110 000100000000101100100 011101010000011001011 111111010000001010100 011000111100000111101 111100111010000011110 00011 1001 11 1000000001 101010011100100000110 11 1 101 11 1001 1 10000010 110111111010110000001 01100010110111010CCOO 011100101010010110000 000111000010101001000 graph 15 graph 16 00011111111111001101 00001111111101101010 00000111111111011110 10000011110100111101 11000001011110111011 111000001111111111110 111 10000001011 11101 1 11111000001 1111111110 11110100000011011111 11111100000011111011 11 101111000001111011 11111101000000010111 10101111110000011111 11100111111000001111 0101 1111011000000111 00111111111110000011 11 1 11111111011000001 10110101100111100000 011011111111111100000 100111111111111110000 111110101111111111000 000111111 000011111 000001111 00000 111 0000011 000001 ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo 100000 1100000 11100000 111100000 1111100000 1111110000 1111111000
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- On the efficiency of clique detection in graphs
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
On the efficiency of clique detection in graphs Dixon, Anthony Hunter 1973
pdf
Page Metadata
Item Metadata
Title | On the efficiency of clique detection in graphs |
Creator |
Dixon, Anthony Hunter |
Publisher | University of British Columbia |
Date Issued | 1973 |
Description | This thesis examines the devices employed by various algorithms to search for maximal complete subgraphs in graphs. Also known as cliques, in Chapter 1 these subgraphs are seen to play an important role in graph theory, information retrieval, Sociometry, logic design, and computational complexity. The enumeration of cliques using the Harary-Rcss, Bonner, Peay, and Bron-Kerbosch algorithms is discussed in Chapter 2. The Reduced Redundancy algorithm is introduced, and the performance of the five procedures is assessed using an alternative approach to empirical testing. Each of the algorithms is shown to generate a "derivation tree" for a given graph whose size can be used as a measure of its efficiency. In Chapter 3, the possibility of exploiting vertex similarity is examined with a view to reducing the size of the derivation tree. As a consequence, algorithms are proposed for finding non-similar cliques. The concept of "complete subgraph equivalence" of vertices is introduced to develop a means for expressing the cliques of a graph as the Cartesian product of vertex subsets. An algorithm for detecting the existence of a clique of order k in a certain class of k-partite graphs in polynomial time is proposed in Chapter 4. This class consists of all graphs reducible by the algorithm to k-partite graphs having at most two vertices per block of degree greater than 0. This algorithm is shown to provide an efficient heuristic which can be used in a procedure for determining whether a well-formed formula is a tautology. The thesis is concluded with an empirical analysis of the techniques employed by the enumeration algorithms of Chapter 2. In addition, the efficiency of the Clique Detection algorithm is compared with that of the Reduced Redundancy algorithm. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-03-03 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0052075 |
URI | http://hdl.handle.net/2429/31952 |
Degree |
Doctor of Philosophy - PhD |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-UBC_1973_A1 D59_6.pdf [ 7.25MB ]
- Metadata
- JSON: 831-1.0052075.json
- JSON-LD: 831-1.0052075-ld.json
- RDF/XML (Pretty): 831-1.0052075-rdf.xml
- RDF/JSON: 831-1.0052075-rdf.json
- Turtle: 831-1.0052075-turtle.txt
- N-Triples: 831-1.0052075-rdf-ntriples.txt
- Original Record: 831-1.0052075-source.json
- Full Text
- 831-1.0052075-fulltext.txt
- Citation
- 831-1.0052075.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0052075/manifest