Some Characteristics of the Second Betti Number of Random Two Dimensional Simplicial Complexes by Kang Tan B.Sc, Fudan University, 1989 A THESIS SUBMITTED TN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Department of Mathematics) We accept this thesis as coiiforrriing to the required standard The University of British Columbia April 1996 © Kang Tan, 1996 In presenting this degree at the thesis in University of partial fulfilment of of department granted or understood his or her representatives. permission. Department of The University of British Columbia Vancouver, Canada DE-6 (2/88) M IS,!** an advanced Library shall make it this thesis for scholarly purposes may be by for agree that permission for extensive It publication of this thesis for financial gain shall not Da,e requirements British Columbia, I agree that the freely available for reference and study. I further copying the is by the that head of copying my or be allowed without my written ABSTRACT In this thesis, through generating random two-dimensional simplicial complexes, we studied the event probability of event (b2=0) (b2=0) for some specific probabilities. We found when the takes on certain specific values, the pair (no,n) lies on 2 certain lines. However, this research is limited by our sample space ( i.e. for P(b=0) « 10%, 10 < no < 80; for P(b=0) « 50%, 12 < no < 100; for P(b=0) « 90%, 2 2 2 12 < no < 145). The " linear behavior" may not hold asymptotically. In the same time, we endeavor to find the number of tetrahedra and 6-triangles in the simplicial complexes. When the event (b=0) occurs in our specific probabilities, it seems the 2 second Betti number should comefromtetrahedra and 6-triangles with high probability. However, the expectation of the number of tetrahedra and 6-triangles goes to zero, when no goes to infinity and there exists linear relationships between the pair (no,n). 2 This evidence may also support that the " linear behavior" may not hold asymptotically. If n and n vary linearly with no going to infinity, then the probability that no(K) — no is 2 0 extremely small in model MB for reasons are similar to the Coupon Collector's problem. Hence, the probability that we cannotfindelement in S(no,n) is large, which 2 indicates that the model may have problems. TABLE OF CONTENTS Abstract ii Table of Contents iii List of Tables iv List of Figures v Acknowledgment vi INTRODUCTION 1 Section One Algebraic Topology Concepts 2 Simplices and Simplicial Complexes 2 Boundary and Homology 3 Section Two The Betti Number and the Euler Number 4 The Probabilistic Model 5 Section Three The Random Simplicial Complex Generator Section Four and Experimental Outcomes 10 The Random Simplicial Complex Generator 10 The Data and Data Analysis 14 Tetrahedra and Six-triangles in a Simplicial Complex 18 The Expected Number of Tetrahedra and Six-triangles in S(no, n2) 19 The Process to Detect Tetrahedra and Six-triangles in a Simplicial Complex Bibliography 21 36 iii LIST OF TABLES Table 1 The Upper Bounds of P M B O ^ I ) for Some Specific Values of no and n Table 2 Computation Results of Floating Point and Exact Precision for Some Data 14 Table 3 Data Distribution for P(b=0) « 10 14 Table 4 Data Distribution for P(b=0) » 50% 15 Table 5 Data Distribution for P(b=0) « 90% 15 Table 6 Regression Functions for P(b=0) * 10% 16 Table 7 Regression Functions for P(b=0) * 50% 17 Table 8 Regression Functions for P(b=0) « 90% 17 Table 9 Portion of the Second Betti Number ComingfromTetrahedra 2 2 2 2 2 2 and Six-triangles 8 2 23 iv LIST OF FIGURES Figure 1 Scatter Plot of n against no when P(b=0) near 10% Figure 2 Scatter Plot of n against no when P(b=0) near 50% 25 Figure 3 Scatter Plot of n against no when P(b=0) near 90% 26 Figure 4 Regression Line of n against no when P(b=0) = 10% 27 Figure 5 Regression Line of n against no when P(b=0) near 10% 28 Figure 6 Regression Line of n against no when P(b=0) = 50% 29 Figure 7 Regression Line of n against no when P(b=0) near 50% 30 Figure 8 Regression Line of n against no when P(b=0) = 90% 31 Figure 9 Regression Line of n against no when P(b=0) near 90% 32 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 24 Figure 10 Distribution of b when no=20 n = 143 33 Figure 11 Distribution of b when no=30 n=200 33 Figure 12 Distribution of b when no=50 n=400 33 Figure 13 Distribution of b when no=45 n=256 34 Figure 14 Distribution of b when no=50 n=260 34 Figure 15 Distribution of b when no=80 n=500 34 Figure 16 Distribution of b when no=30 n=80 35 Figure 17 Distribution of b when no=75 n=250 35 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Figure 18 Distribution of D2 when no=l 00 n2=350 v 35 ACKNOWLEDGMENT I wish to express gratefulness to my supervisor, Prof. Joel Friedman. During the 1 year study and research, he always gave me valuable suggestions, patiently explained anything and kindly guided me tofinishthis thesis. I am also thankful to Prof. Jack Snoeyink for his helpful comments on revising the thesis which led to a considerable improvement of my presentation. vi Introduction Distinguishing the topological isomorphism classes of topological spaces is one of the main interests in thefieldof algebraic topology. To distinguish these isomorphism classes, one of the methods used is to compute the homology groups of topological spaces. The homology groups of a topological space can be computedfromthose of an "approximating" geometric simplicial complex; the homology groups of a geometric simplicial complex are equal to those of its abstraction, an abstract simplicial complex, which is a finite combinatorial object (generalizing graphs). Throughout this thesis, when we refer to simplicial complexes we mean abstract simplicial complexes. A part of a homology group is its rank. The i-th Betti number, bi, is the rank of the i-th homology group. The integers, bo, bi, b„, are called Betti numbers, and are fundamental invariants of the n-dimensional topological spaces. The Betti numbers of a simplicial complex have intuitive meanings. For example, the 0-th Betti number, bo, is the number of connected components in the simplicial complex. The general goal of this line of research is to understand the Betti numbers of the simplicial complexes in any dimension. In a one-dimensional simplicial complex (i.e. graph), one can easilyfindbi for any graph, G, if bo is known, by the Euler characteristic formula bi - b = x(G) = E(G) - V(G), 0 where %(G), E(G), and V(G), are the Euler characteristic, number of edges, and number of vertices of graph G, respectively. A two-dimensional simplicial complex is a collection of subsets of afixedset, V, which is closed under taking subsets, and where the largest faces are triangles. In a two-dimensional simplicial complex, K, even if bo is given , one cannot 1 2 'A triangle is a set of size three, namely {vi, v , v }. If a simplicial complex has more than one components, we can study each component separately. 1 2 2 3 / 1 find b or bi from the Euler characteristic formula, b -bi +bo= %(K)=n (K) - n\(K) + n (K), 2 2 2 0 where %(K), n (K), ni(K), and no(K), are the Euler characteristic, number of triangles, 2 number of edges, and number of vertices of K, respectively. For this reason, simplicial complexes in two dimensions are the lowest dimensional simplicial complexes of interest. When studying these two dimensional simplicial complexes, we will use a probabilistic model to study typical simplicial complexes. The remaining sections of this thesis are arranged as follows. In section 1, we review the necessary concepts of algebraic topology. We describe our probabilistic model in section 2. We explicate the process of generating a random simplicial complex, display computational data, and interpret the results in section 3. In section 4, we study some combinatorial quantities which seem to be related to the second Betti number, when certain conditions hold in our model. 1. Algebraic Topology Concepts Let us briefly review some necessary conceptsfromalgebraic topology. All of our terminology follows that in [2]. 1.1 Simplices and Simplicial Complexes For 0 < k < n, a geometric k-simplex c in R" is the convex hull of a set T of k+1 geometrically independent points. The dimension of C is dim(c) = |T| - 1 = k. For every U c T , the simplex a defined by U is a face of C, and if TJ ^ T then a is called a proper face of c. In R , we use the terms vertex for 0-simplex, edge for 1-simplex, and triangle n for 2-simplex. A collection of simplices, H, is a geometric simplicial complex if it satisfies 2 following two properties: (i) if a is a face of c and c sH, then asH. (ii) if a e if, o e if, then a n o either a face of both or an empty set. The largest dimension of any simplex in H is the dimension ofH. An abstract simplicial complex is a collection of subsets of a fixed set, K, which is closed under taking subsets. A simplex a ofK is an element of K. The dimension of a is one less than the number of vertices in it. A k-simplex of K is a simplex with dimension k. Each nonempty subset of a is called a face of a. The dimension of K is the maximum dimension of its simplices. The vertex set of K is the union of the O-simplices of K. Clearly, an abstract simplicial complex is completely determined by its maximal faces. Two abstract simplicial complexes K and Tare said to be isomorphic if there is a bijective correspondence / { v, 0 mapping the vertex set of K to the vertex set of T such that v } e K if and only if {f[yo),/v„) } e T. n Let H be a geometric simplicial complex, and let K be an abstract simplicial complex whose vertices are in one to one correspondence with the vertices of H, a subset of vertices being a simplex of K if and only if they correspond to the vertices of some simplex of H. K is called an abstraction of H, and any geometric simplicial complex having K as an abstraction is called a realization of AT. The homology groups of a geometric simplicial complex, H, are equal to those of its abstraction, K. Hence, the Betti numbers of H are the same as those of K. 1.2. B o u n d a r y and Homology Each k-simplex of a complex K can be oriented by assigning a linear ordering to its vertices, denoted by c = [v<>, vi, vj. Two orientations of K are the same if one sequence differs from the other by an even number of transpositions. The boundary of o is defined to be d c = Xi=o (-1)' [v , Vi, k k 0 v , .., v ], where the caret means that v is A ; k 3 ; omitted. If o is a vertex, then 9 c is defined to be zero. So 8k maps each oriented 0 k-simplex to a formal sum of oriented (k-l)-simplices. A formal sum of integer multiples of oriented k-simplices is called a k-chain. If the coefficient of a k-simplex in some k-chains is non-zero, we say the simplex belongs to the chain. Thefreeabelian group of k-chains of K is denoted by Ck = Ck(X). The map 9k naturally extends to the boundary homomorphism 3k: Ck —> Ck-i defined by 3k(Zj=iajOj) = Lj=i aj9k(Cj), where the aj are d d integers, the Oj are k-simplices of K, and d is the dimension of Ck. Note that a (k-1)-simplex can occur in more than one term of this sum. The coefficients of these terms are added in the usual way. A number of interesting groups can be defined using the boundary homomorphism. The group of .ST-cycles, Zk = Zk(K), is the kernel of 9k, which is, the subgroup of k-chains z e Ck with 3kZ = 0. The group of k-boundaries, Bk = Bk(X), is the image of 9k+i, which is the subgroup of k-chains z e Ck for which there exists a (k+l)-chain z' e Ck+i with z = 3k+i z*. It can be easily shown that 9k°9k+i z = 0 for every (k+l)-chain z'; therefore, Bk 1 is subgroup of Z . Finally the quotient group, H = H (fy) = Z /B = ker(9)/im(9k+i), is k k k k k k defined to be the k-th homology group of K. 1.3. The Betti number and the Euler number By the fundamental theorem forfinitelygenerated abelian groups, such a group, G, is isomorphic to a direct sum Z © Z/ti © Z/t ©...© Z/t , where b is non-negative integer, b 2 m and the torsion coefficients, ti, are integers greater than or equal to 2 such that t; divide t;+i for 1 < i < m - 1. If G isfree,G is isomorphic to Z , and is denoted as G = Z . Let b b G = H , then b = b (£) = b(Hk) (i.e. the dimension of homology group ker(9k)/im(3+i) ), k k k k 4 and where bk is called the k-th Beth' number of K. Hence, bk= dim(ker(3 )) - dim(im(3 )). k k+1 The Euler number of a simplicial complex K is defined classically by the equation, X(K) = I , (-l) ni, where n; is the number of simplicies of K in dimension i and d is the d i w dimension of K; by Hopf trace theorem, we have, x(£) Xi=o (-l)'bi. = d 2. The Probabilistic Model In previous research on the evolution of random graphs, one adds random edges to a fixed vertex set and studies how connected components evolve. A variation of this technique allows us to add random triangles and study the evolution of the Betti numbers of the simplicial complexes. Let V(no) be the vertex set {1, 2, no} and S(no,n) be the 2 set { K | no(£) = no, n (K) = n , where K is a two-dimensional simplicial complex }. With 2 2 T initially empty, we will generate a random element inS(no,n ) by repeatedly adding to T 2 a random triangle not in T until the number of triangles in T reaches n . Then we take 2 closure of triangles in T to get a simplicial complex, K. We only keep a simplicial complex, K, if no(K) = no. The above process, defined to be model MA, is used to generate a random simplicial complex; it only dependents on parameters no and n . 2 In order to simplify the calculation of the probability in our model, we introduce another model MB which is like MA except that we do not discard simplicial complexes with no(K) < no. For any event, A(X), in MA, we have: PMA(A(/£) occurring ) = P M B ( A(K) occurring given n<)(K) = no ) 5 (1) Let us consider event ( no(K) < no ) (i.e. 3 v e V(no) and v g V(K) ) in MB. Fix a Vk in V(rio) and consider the event that Vk£V(AT). We pick triangles one by one without repetition. The first triangle in V(K) does not contain Vk with probability Cs^'VCs", where 0 C3 is the binomial coefficient of no choose 3. Then the second triangle in V(K) does not 110 contain v with probability (C^-iyiCs^-l); similarly, the i-th triangle in V(K) does not k contain v with probability (C " -i+l)/(C3 -i+l). n<, k 1 ,v> 3 Therefore, PMB( v gV(K) ) = n^KCs^-i+iyCCs^-i+l)]. We have: k PMB( no(K) < no) = PMB( 3 some v in V(no) such that v e V(K) ) k (2) k = PMB(^k=i °{vkgV(Z)}) (3) ^Zk^P^v^V^) (4) < noTL-^Cs^-i+iyCsM+l) (5) <tio(C < " /C3T (6) = no(l-3/nor (7) n n > 1 , 3 < noe (8) 3nj/n<> We get formula (6) by the fact: x/y > (x-i)/(y-i), where 0 < x < y and 0 < i < y. From formula (7) to (8), we use the inequity: log(l+x) < x, where x > (-1). Notice that when n > (nologno)/3, formula (8) is bounded by 1. We calculate formula (8) in the values of 2 pairs (no,n) considered and obtain (8) is bounded by 0.03. Hence, PMB(IIO(£) = no) > 0.97. 2 PMA(A(Z) occurring) < P M B ( A(K) occurring) / P M B ( no(K) = no) < 1.03 1PMB( A(K) occurring) (9) (10) We will use formula (10) to get the upper bound of the probability for any event occurring in our model. 6 . The Betti numbers of random simplicial complexes are random variables depending on the pair (no, n ). If we know any two of them, we canfindanother one by 2 the Euler characteristic formula. Suppose that K is a simplicial complex generated by our probabilistic model. The 0-th Betti number, bo(£), is the number of connected components in K. Intuitively speaking, after adding enough random triangles to K, K contains only one component with large probability. Let us consider the event (bo(X) > 1) (i.e. K has at least two components ) in model MB; suppose Ci(K) and C (K) to be any two components of 2 K. We have 6 < V ( d ( £ ) ) + V(C (K)) < no (11) 2 where is the number of vertices in component Ci(K) because each component V(CI(JK)) has at least three vertices. Without loss of generality, suppose 3 < V(Ci(£)) < [no/2] (12) where [no/2] is the maximal integer less than or equal to no/2. Let A be subset of V(no) such that A - V(Ci(X)) and 3 < |A| =j < [no/2]; let CA be complement of A in V(no). We pick triangles one by one without repetition. After picking the (i-l)-th triangle, the number of triangles whose vertices lie in either A or CA is (Dj-i+1), where Dj = C °" + C3 n J 3 is the number of triangles whose vertices lie in either A or CA; the total number of triangles is (Cs^-i+l) after the (i-l)-th triangle is picked. Hence, given that all the previous triangles lie in either A or CA, the probability that all of the i-th triangle's vertices also lie in either A or CA is (Dj-iH4)/(C °-i+l). We have: n 3 PMB( b (£)>l ) = P M B ( 3 subset of V(no), A, with size j such that some triangles' vertices are in A and the other triangles' vertices are in CA) (13) 0 = PMB( u [n(/2] j=3 ( |A|=j and 3 d(K) c K s.t. V(d(£)) = A) 7 (14) J < ZJPMB( IA|=j and 3 c K s . t V(Ci(£)) = A ) d(K) (15) < ZjCj^niCDj -i +l)/(C -i+l) no (16) 3 < ZjCj (D/C ) no (17) no n2 3 where 3 < j < [no/2] and 1 < i < n . We get (16) because there are Q" subsets of V(n ) with size j;fromformula (16) to (17), we use the same fact asfromformula (5) to (6) for Dj < Cs*. 0 2 0 0 We calculate formula (17) for some specific values of no and n and list the results 2 in Table 1. (no, n ) (10, 55) (12, 25) (145, 600) PMB(D >1) < 1.254e-478 1.526e-152 2.41e-564 2 0 Table 1. The upper bounds of P M B O ^ I ) for some specific values of n and n . 0 2 In Table 1, we only checked three pairs of (no,n). Of the data we have, 10 is the 2 smallest no, 145 is the largest no, and 25/12 is the smallest ratio of n/no. Because 2 PMA(DO=1) = 1- PMA(DO>1) > 1- PMB(bo>l), b is almost certainly equal to one if n is large 0 2 with respect to no. Therefore, if we know either one of bi or b , we canfindthe other one 2 by the Euler characteristic formula. In this thesis, we focus on properties of the second Betti number. Some reasons for us to study b rather than bi are: (i)findingb requires less computation thanfindingbi 2 2 in that computing b requires the computation of only one matrix's rank, but computing bi 2 requires the computation of ranks of two matrices, (ii) when b does not equal zero, it is 2 important to know how many tetrahedra and 6-triangles the simplicial complex contains. 3 4 In a two-dimensional simplicial complex, a tetrahedron is a set of the form { {vi,V2,v }, {vi, v , v }, {vi, v , v }, {y , v , v } } for any four vertices set {vi, v , v v }. A 6-triangle is a set of six triangles such that there is a triangle, A, not in the simplicial complex,and 3 3 3 4 2 3 4 2 4 8 3> 4 2 4 If, for some values of (no, n2), the second Betti number, b , comes from tetrahedra and 2 6-triangles in that they form a basis for rational H , we can use a more efficient way to 2 study the properties of the Betti numbers of the class S(no,n). 2 Note that for fixed no, as triangles are added, the second Betti number, b , does not 2 decrease. To describe b , there are many interesting events such as (b =0), (b <10), 2 2 2 and (b< VnT). However, we choose to focus on the event (b=0) because it is the simplest and 2 2 requires the least computational resources. To study the event (b =0), we should 2 investigate how the event P(b=0) depends on the pair (no, n ) or investigate which n 2 2 2 values give us P(b=0) near various values for given no. In this thesis we focus specifically 2 on the three probabilities; P(b=0) * 10% ( which we take to be 8-12% ), P(b =0) * 50% 2 2 ( i.e 48-52%), and P(b=0) « 90% (i.e. 88-92%). It should be noted that choosing these 2 three numbers (10%, 50%, 90%) is somewhat arbitrary. One can choose any other three percentages or more to study the event (b=0). In order to get a broader understanding of 2 how n depends on no when P(b=0) jumps from one level to another, and to avoid 2 2 difficulty in analyzing the data, it is better not to choose these percentages too close to one another. There are many other probabilistic models which add a random triangle. For example, model(i): add to K a random triangle (i.e. repeats of a triangle are allowed); model(ii): add to K a random triangle having no common edges with the triangles already in K. To study the event (b=0), we compare the above different random simplicial 2 complex generators with our model. Once there exists two of the same triangles in the simplicial complex generated by model(i), the second Betti number b will be greater than 2 or equal to one. Hence, P(b=0) = 0. It is easy to reduce the computation of the second 2 Betti number, b , of a simplicial complex generated by model(i) by computing the b of 2 2 there are exactly two tetrahedra with a common triangle A if A is added into the set. 9 the simplicial complex generated using different triangles. Let Ki be the simplicial complex generated by model(i), let K be the simplicial complex reducedfromKi by keeping only 2 one copy of each triangle in K\. It is easy to see b (Ki) = b (K ) + 2i( multiplicity of 2 2 2 triangle i in K\ - 1), where 0 < i < no(K{). We do not use the model (ii), because it is also easy to see that the second Betti number, b , of a simplicial complex generated by this 2 model is always zero. 3. The Random Simplicial Complex Generator and Experimental Outcomes In this section we completely describe the process used to generate a random simplicial complex and our experiments. First, let us review notation used throughout the rest of this thesis: b denotes the second Betti number of a simplicial complexes; d 2 2 denotes the matrix induced by the boundary map, C —>d; T(K) denotes the set of 2 triangles of simplicial complex K; B(K) denotes the set of edges of simplicial complex K; V(K) denotes the vertex set of simplicial complex K; n (K) denotes the number of triangles 2 of K such that |T(K)| = n (K); m(K) denotes the number of edges K such that |E(K)| = 2 n\(K); no(AT) denotes the number of vertices of K such that |V(£)| = n (K); 0 V(n ) = {1, 2, 0 no} and S(no,n) = { K \ ru>(K) = no, n (K) = n , where K is a two2 2 2 dimensional simplicial complex}. 3.1 The Random Simplicial Complex Generator Given a pair (no,n), we want to generate a random simplicial complex, K, 2 according to the model MA. We use three different integers to denote the three vertices of 10 a triangle. For each fixed no, we randomly generate three different integers in V(n ) by 0 C++ library function randO- In other words, we randomly create a triangle with its vertices in V(no). In order to guarantee that a random triangle is generated, we first create an array, N, with size no+1 and N[i] = i, where 1 < i < no. Secondly, we generate the first 5 random integer, ii, in V(no). If no ^ ii, we swap N[no] and N[ii]. Thirdly, we generate the second random integer, i , in V(no-l). If no-1 * ii, we swap N[no-l] and N[i ]. Swapping 2 2 the integers in this way ensures that no duplicate integers are generated. Lastly, we generate the third random integer, i , in V(no-2). Now we have a triangle { N[n ], 3 0 N[no-l], N[i ] }. Because any two triangles are the same if their vertices are different only 3 in order, we sort these three integers in increasing order. With initial F(£) = 0, we generate the maximal faces set of a random simplicial complex K by adding different random triangles to F(K) until the number of triangles in F(K) reaches n (i.e. we discard 2 the repeated triangle if we find it is already in F(K)). Furthermore, we keep all of the triangles generated in lexicographical order. Keeping the triangles in this order allows easier detection of duplicate triangles and easier detection of tetrahedra which will be discussed in section 5. We should also notice that the number of vertices of that simplicial complex, K, generated by F(£) may be smaller than no. To make sure that no is the value attained, we use an array VT with size no+1 to record thefrequencyof each vertex when we add the triangle {vi, v , v }. If VT[i] * 0, V 1 < i < no, we list all edges of F(£), arrange them in 2 3 lexicographical order and put them in B(K). Now a random simplicial complex K such that K e S(no,n) has been successfully generated. 2 5 In C++, an array index starts from zero. 11 In order tofindthe chain matrix d , we create an n by ni matrix 3 , for the i-th 2 2 2 triangle (vi, v , v } and the j-th edge (v , v }, let 9 (i, j) = 0, if the edge m n r s 2 {v„ v} £ {vi, v , v„}. Let 3 (i, j) = 1 if the j-th edge (v , v„} is either {vi, v } or {v , v„}, s m 2 r m m otherwise 3 (i, j) = -1, where 1, m, n, r, and s belong to V(K) such that 1< m < n and r < s. 2 We use double precision to store the matrix entries. Then we perform Gaussian elimination to the chain matrix, 3 , in which we repeatedly choose thefirstelement, c, 2 such that |c| > 10" in each column as the pivot element, and treat c as zero if |c| < 10". 8 6 8 By the definition of b , we have b = dim(ker(3 )) = n - dim(im(3 )) = n - rank(3 ). 2 2 2 2 2 2 2 We give our algorithm tofindthe distribution of b of the class S(no,n) 2 2 Algorithm: input no, n . 2 • generate a simplicial complex K e S(no,n). 2 • find the chain matrix 3 . 2 • compute the rank of d via Gaussian elimination. • b = n - rank(3 ). 2 2 2 2 repeat above 100 times using different random seed to estimate P(b=0) for 7 2 fixed pair (no, n ) 2 Forfixedno, our goal is tofindthe corresponding n such that the pair (no,n) gives 2 2 P(b=0) « 10%, P(b=0) « 50%, and P(b=0) = 90%, respectively. Forfixedn < 50, we 2 2 2 0 choose n = 4no. If the value n results in the P(b=0) being greater than 12%, then we 2 2 2 increase thefirstvalue of n = 4no by 20 and repeat the same process. We stop increasing 2 %e guess it is small enough for our matrix. However, when some nonzero entries below the pivot element in a column of a matrix are smaller than 10" in exact arithmetric, the rank calculated in this way may be less than the true rank of the matrix. To estimate P(b=0) for fixed pair (no,n ), the larger sample, the better. We choose 100 as sample size due to computer time constraint. The standard deviation of P(b=0) * 10% (or 90%) is 3 for 100 experiments. The standard deviation of P(b=0) » 50% is 5 for 100 experiments. 8 7 2 2 2 2 12 n once P(b=0) * 10% or P(b=0) < 7%. If we cannot get n such that P(b=0) » 10%, we 2 2 2 2 2 reduce 20 to 10, use the smallest value of n which satisfies P(b=0) > 13%, and repeat. 2 2 For no > 50, we use the same process except that we begin with n = 7no and increase the 2 n value using intervals of 40. For P(b=0) » 50%, we begin with some n which is the 2 2 2 nearest to P(b=0) = 50% in the results of finding n such that P(b=0) « 10%. Then we 2 2 2 decrease n until P(b=0) > 52% or P(b=0) « 50%. If we cannot find the appropriate n , 2 2 2 2 then we increase n and repeat the same process. To find n such that P(b=0) « 90%, we 2 2 2 start with n=2no and increase this n value using intervals of 10, until that we find some n 2 2 2 that yields P(b=0) < 88%. Then we adjust n by increasing the value and repeat the same 2 2 process. After we find the first n , we can adjust this n 2 2 value slightly and input new random seed to find another n . We should point out that choosing n=2no, n=4no, 2 2 2 n=7no, and increasing values is somewhat arbitrary. One can choose other small values of 2 n and increase these values to achieve the desired values of P(b =0). 2 2 Our code is written in C++. We use double precision to store the matrix entries which introduces two problems. One problem is that we cannot compute b for large n 2 2 and ni due to the lack of computer storage space. In fact, this is a general problem for any large computation. The other problem is that using float precision to compute the rank of the matrix may be unreliable. We used exact multiple precision calculations to check some floating point calculations. It is clearly shown in Table 2 that we get the same results for the pairs (12, 100) and (20, 200). However, multiple precision calculations did not work in our computer when testing the large pair (25, 300), because the matrix involved could be as large as 300 by 300. For such a pair the computer gave the error message "Virtual memory exceeded in 'new' ". Hence, the multiple precision calculations need much more storage space than double precision calculations, and cannot compute the second Betti number of the simplicial complexes such as those in S(25,300). To overcome the above 13 problems, a new technique for computing the Betti numbers was developed, which is faster and requires less storage [3]. (no,n) (12, 100) (20, 200) (25, 300) floating point precision b=10 b=18 b=8 multiple precision b=10 b=18 out of memory 2 2 2 2 2 2 Table 2. Computation results of floating point and exact precision for some data 3.2. The Data and Data Analysis To obtain a comprehensive notion of behaviour of event (b=0), one must get large 2 number of (no,n) sample pairs. This, however, is not practical for us because of limited 2 resources and time. For this reason, we had to confine our study to a limited number of test result and in our case we choose 100 (no,n) pairs. 2 For P(b=0) « 10%, P(b=0) » 50%, and P(b=0) « 90%, the scatterplots are 2 2 2 shown in Figure 1-3, respectively. For example, when P(b=0) « 10%, each point (no, n ) in 2 2 the plot denotes that P(b=0) of the class S(no, n ) is near to 10%. We use different symbols 2 2 to denote the different values of P(b=0) (see Figure 1). As well, Table 3-5 list the sample 2 forms for each case. P(b=0) 8% 9% 10% 11% 12% distribution of data 17 20 18 20 25 2 Table 3. Data distribution for P(b=0) * 10% 2 14 P(b =0) 48% 49% 50% 51% 52% distribution of data 24 15 19 15 27 2 Table 4. Data distribution for P(b =0) « 5 0 % 2 P(b =0) 88% 89% 90% 91% 92% distribution of data 22 18 18 16 26 2 Table 5. Data distribution for P(b =0) » 90% 2 In Figure 1-3, there seems to be a strong linear relationship between no and n . 2 Therefore, to analyze the data, we consider model (A), n = Co + aono for the data such 2 that P(b =0) 2 = 10%, P(b =0) = 50% 2 and P(b =0) 2 = 90%, where ao and c are unknown 0 regression coefficients. For each case, we have about 20 sample points (see Table 3-5). We also consider model (B), n = c + aono, for the data such that P(b =0) 2 « 50% and P(b =0) 2 0 » 90%, where ao 2 « 10%, P(b =0) 2 and c are unknown regression coefficients. For 0 each case, we have 100 sample points which are all assumed to be of the same probability (see Table 3-5). In order to make the best use of the data, it is reasonable to consider the coefficients, ao and Co in model (B), as the function of p = P(b =0). This will hopefully give 2 us a less biased estimate of ao and Co because much more information is provided. The simplest case is the linear part of the Taylor series at a specific value of P(b =0). We 2 consider the model (C) to be: n = Co + ci( p - 10%) + ( ao + ai( p - 10%))no for P(b =0) « 10%, 2 2 n = Co + ci( p - 50%) + ( ao + a ( p - 50%))no for P(b =0) « 50%, and 2 2 x n = Co + ci( p - 90%) + ( ao + ai( p - 90%))no for P(b =0) « 90%, 2 2 15 where ao, ai, c , and Ci are unknown variables, and p is the observed probability for the data. 0 A useful measure of how well an estimated regression fits the observed (no, n ) is 2 the R -value, the squared correlation between the observed n and f(no), where fQ is the 2 2 fitted function. The formula for the Revalue is R = 1- JXpj 2 - fW)) /!^ 2 1 - E(n )) , 2 2 where E(n ) is the mean of sample n ,fi^no )is the fitted value at point no' and 1 < i < 100. 1 2 2 If R -value is near to zero, then f() is a poor fit; if R -value is near to one, f() is a good fit. 2 2 However, it is not sufficient merely to study the R -value and then to evaluate the model 2 on this basis. There are many examples with the same R -value, but the nature of the 2 relationship between the two variables on the plots varies enormously as is shown in [4]. Clearly, the scatterplots are vital for helping us to gain real insight into the nature of the data. All of our three models can be fit using the various statistical functions, (i.e. multiregress and R-squared ) in the Maple package developed by the University of Waterloo. The results are shown in Table 6-8. parameters R-squared value model (A) ao=9.6927, Co =-67.907 0.9908 model (B) ao=9.51, 0.9926 model (C) ao=9.50567, ai=-4.02949, Co=-61.05784, Ci=105.16423 10% Co = -61.384 0.9963 Table 6. Regression functions of model (A), model (B) and model (C) 16 50% parameters R-squared value model (A) ao=6.6807, Co=-50.928 0.9926 model (B) ao=6.7691, c = -57.775 0.9913 model (C) ao=6.6744, a^-1.50752, CQ= -50.6871, Ci=36.8622 0.9972 0 Table 7. Regression functions o f model (A), model (B) and model (C) 90% R-squared value parameters model (A) ao=4.0021, c =-36.445 0.9869 model (B) ao=4.1481,Co=-47.141 0.9839 model (C) ao=4.0577, ^=-0.11962, Co=-42.111, Ci=-4.6374 0.9956 0 Table 8. Regression functions o f model (A), model (B) and model (C) Each model gives a good fit for each of P(b=0) « 10%, P(b=0) « 50% and 2 2 P(b=0) = 90%. The difference between the coefficients, ao and c in model (A) and (C), 2 0 shows that model (A) and model (C) are consistent. Although the difference between the R-squared values is very small, model (C) improves the fit for each case because it contains more parameters. In our data , if the pair (no,n) satisfies a linear relationship 2 (see Tables 6-8 ), P(b=0) achieves some specific values (i.e. 10%, 50%, 90%). This linear 2 17 relationship is limited by the data scope for P(b=0) = 10%, where 10 < no < 80; for 2 P(b=0) « 50%, where 12 < no < 100; and for P(b=0) « 90%, where 12 < no < 145. We 2 2 display the regression line for P(b=0) = 10%, P(b=0) = 50%, and P(b=0) = 90% in 2 2 2 Figures 4, 6, and 8, respectively. In Figures 5, 7, and 9, we give the regression lines generated by model (B). In Figures 5, 7, and 9, we have indicated all of the different probabilities with the same symbol due to computer software contraint. When P(b=0) « 10%, 50% and 90%, the distributions of b are shown in 2 2 Figures 10-12, 13-15, and 16-18, respectively, where E(b ) means the expectation of the 2 second Betti number, b , of the sample simplicial complexes and S(b ) means the unbiased 2 2 variance of b of the sample simplicial complexes. For each case, we also investigated 2 many other b 's distributions for different pairs (no, n ), but their shapes are all similar to 2 2 those already shown in Figures 10-18. 4. Tetrahedra and Six-triangles in a Simplicial Complex By a " hole" in a simplicial complex, K, we mean that a subset of K, denoted by H, such that for each 1-simplex c e H, there are exactly two 2-simplices in H, namely a and P, 3 o = anp\ Its size is the number of 2-simplices in it. Intuitively, a hole is soccerball whose patches are triangles rather than octagons. By a collection of independent holes in a simplicial complex, K, we mean that these holes are independent as 2-chains. In some specific situations, the number of independent holes in K are equal to the second Betti number of K. Therefore, one way to approximate the second Betti 18 number of a simplicial complex is to detect all of the independent holes in it. Suppose there are k triangles making up a hole, then the number of edges of this collection triangles is 3k/2 which must be an integer. Therefore, k is even. In the simplicial complex generated by our model, there are no duplicate triangles. We have k = 2m, where m > 2. Hence, we approximate the second Betti number byfindingall independent tetrahedra ( i.e. holes of size four), 6-triangles ( holes of size six), and so on in the simplicial complex. For our specific values of P(b2=0), we guess that the second Betti number mainly comes from tetrahedra and 6-triangles. In this section, t(K) denotes number of tetrahedra in the simplicial complex K; s(K) denotes number of 6-triangles in the simplicial complex K. We calculate the probability of the event occurring in model MB. 4.1. The Expected Number of Tetrahedra and Six-triangles in S(n ,n ) 0 2 Given no and n , in order to calculate the expected number of tetrahedra with 2 respect to model MB, we pick four triangles in T(K) to see if they form a tetrahedron, where T(K) consists of n random triangles. We order the four triangles arbitrarily such as 2 {triangle 1, triangle 2, triangle 3, triangle 4 }. Notice that if the four triangles are to form a tetrahedron, thefirsttriangle can have its three vertices chosen arbitrarily. To form a tetrahedron with thefirsttriangle, the second triangle must have two common vertices with the first triangle. The number of ways of choosing these two vertices is 3, the number of ways of choosing one other vertex is no-3, and the number of triangles is (Cs^-l) because no repeated triangles are allowed, so the probability that the second triangle forms the start of a tetrahedron with thefirsttriangle is 3(no-3)/(C -l). For the third triangle to no 3 form the start of a tetrahedron with the previous triangles, two vertices arefixed,so we can only choose one of two vertices which are common vertices of thefirsttwo triangles, therefore, the probability that the third triangle forms the start of a tetrahedron with the 1? previous triangles is 2/(C3° -2). The fourth triangle's vertices are fixed, which determine 0 only one triangle, then the probability that the fourth triangle forms the start of a tetrahedron with the previous triangles is l/(C3"°-3). Thus, the probability ofK containing a tetrahedron is [3(n -3)/(C -l)][2/(C3 -2)][l/(C3 -3)]. The number of the collection of n<, 0 n<, no 3 four triangles in n triangles is As a result, the expected number of tetrahedra in K, 2 E(T), is E(T) = C [3(n -3)/(C3 -l)][2/(C3 -2)][l/(C3 -3)] nj n,, 4 n,, (18) n<, 0 Formula (18) is approximately 54n /no . When n « 0.38no, we have E(T) « 1; when 4 8 2 2 2 n=cno, we have E(T) » 54c /no , where c is a constant. Notice that E(T) goes to zero as 4 4 2 n goes to infinity. 0 Let C(no) be the set of triangles whose vertices are in set V(no) and CF(K) be the complement of F(X) in C(no). The size of CF(K) is n« = | CF(K) |= C - n . By the n<> 3 2 definition of 6-triangle, we know that a 6-triangle in the simplicial complex K is determined by one triangle in the set CF(K) and the two other vertices in V(no). Fix a triangle, a, in CF(K). The number of ways of choosing the two other vertices in V(no) to form a 6-triangle with a is C^' . Fix these two vertices. The probability that any six 3 triangles with fixed vertices in F(£) form a 6-triangle is is the number of triangles excluding a, and C6 ° N 1/C6 °, N where No = C3"°-l, since N 0 is the total number of the collection of six triangles in N triangles. The number of the collection of six triangles in K is C^ . The 1 0 expected number of 6-triangles mK, E(S), is therefore: E(S) = C (C - n ) nj 6 n<> 3 2 (19) CF*IC?* 20 When n « C "°, E(S) is .approximately 3888n2 /no ; when n=cno, we have 6 2 13 3 2 E(S) « 3888c /no , where c is a constant. Notice that E(S) goes to zero as no goes to 6 7 infinity. 4.2. T h e Processes to Detect Tetrahedra a n d Six-triangles A tetrahedron in the simplicial complex is made up of four triangles of the form {vi, v , v }, {vi, v , v }, {vi, v , v } and {v , v , v }. For each triangle {vi, v , v } in the 2 3 2 4 3 4 2 3 4 2 3 simplicial complex K, we look for the second triangle of the form (vi, v , v }, where v is 2 4 4 any vertex in K such that v is greater than v , if the triangle {vi, v , v } is not found, then 4 3 2 4 we choose the next triangle after {vi, v , v } as the new {vi, v , v } and start again. If the 2 3 2 3 second triangle {vi, v , v } is found, then we look for the third triangle {vi, v , v } mK. If 2 4 3 4 the third triangle (vi, v , v } is found, then we look for the fourth triangle {v , v , v }. 3 4 2 3 4 Otherwise we choose the next triangle below (vi, v , v } as {vi, v , v }. If we cannot find 2 3 2 3 the fourth triangle (v , v , v }, then we choose the next triangle below (vi, v , v }. As 2 3 4 2 3 well, we record the fourth triangle in set M for later use. If the fourth triangle (v , v , v } 2 3 4 is found, we have then found one tetrahedron in K. Each tetrahedron independent of the previous ones in the simplicial complex increases the second Betti number, b , by one. 2 To find the number of 6-triangles, we let K = KKJG for each a in M and find the a number of tetrahedra in K . Because of the way we generate the set M, we know that a there is at least one tetrahedron in K<j. Let d = t(K )- t(K), if d > 1, then K contains a a a (C °) 6-triangles which increases the second Betti number, b , by d - 1. For each element d 2 2 a c in M, we have (C °) 6-triangles. The total number of 6-triangles in the simplicial d 2 complex K is s(K) = £a(C °). We sum all of the do-1 over each element in M to get d 2 contribution to the second Betti number by 6-triangles. 21 We should point out that any one 6-triangle and any two tetrahedra found in this way are independent. Because we focus on the specific probabilities of event (b=0) 2 (i.e n «C3 °), n 2 we conjecture that at these probabilities the second Betti number of the 8 simplicial complex K is greater than or equal to t(K) +£o(d -l) with large probability The 0 main advantage offindingtetrahedra and 6-triangles is that we only need to store the triangle set, F(X), which is far smaller than the corresponding chain matrix, especially when no is large. The algorithm is simple and the only operation is to compare the two triangles to see if they are same. Because it requires less storage space, computing the number of tetrahedra and 6-triangles is quick. Although it is not clear how to find the number of higher-triangles efficiently, we might give a lower bound of the second Betti number, b , for a simplicial complex through detecting the tetrahedra and 6-triangles in it 2 under certain situation. Table 9 gives the experimental data resultingfromcalculations performed on some pairs (no,n) with no small. P(# of tetrahedra > 0) = 5% means, among 100 trials, there are 2 5 sample simplicial complexes which contain tetrahedra. The average number of tetrahedra by our experiment is given as E( # of tetrahedra ). E(T) is the expected number of tetrahedra calculated by formula (18). E(S) is the expected number of 6-triangles calculated by formula (19). When n is small, the tetrahedra and 6-triangles seem to be independent 2 22 (no, n ) (12,25) (15,35) (20,143) (30,80) P(b *0) 10% 10% 89% 11% E(b ) 0.12 0.1 2.65 0.12 P(# of tetrahedra >0) 10% 5% 52% 8% E(# of tetrahedra) 0.11 0.05 1.27 0.08 P(#of6-triangles>0) 0 0 18% 0 E(# of 6-triangles) 0 0 0.22 0 E(T) 0.05 0.04 1.16 0.004 E(S) 0.009 0.004 0.41 0.00006 2 2 2 Table 9. Portion of b coming from tetrahedra and 6-triangles 2 23 112 750 T O 700 + 0 o ft 650 600 O o 550 500 + 450 I • 8% 8 0 9% 6 400 A10% 350 f ° | O ^ 250 | <p 300 012% 4- .& O 200 + 150 + 100 50 H 1 1 1 1 5 10 15 20 25 1——H 30 35 1 1 1 1 1 1 1 1 1 40 45 50 55 60 65 70 75 80 Figure 1. Scatter plot of m against no when P(b2=0) near 10% 24 no m 650 " • 600 " 9 o * $ 550 " 3 500 " $ ? 450 " a 048% ft 400 * S 049% 350 • A 50% o -51% t 300 • • 52% 250 - 200 - * * * * a 150 " 100 " • 50 " 0 i 0 5 1 i 1 i t 10 15 20 < i ; 1 25 i 1 i 1 i 1 i 1 i 1 i 30 35 40 45 50 55 1 i i 1 i 1 60 65 70 1 i 1 75 i i i i 1 1 1 1 80 85 90 95 Figure 2. Scatter plot of m against no when P(b2=0) near 50% 25 i 100 no 112 • o . • q • o A O" • • 0 88% ft O 89% A 90% -91% • 92% 95 100 110 Figure 3. Scatter plot of m against no when P(b2=0) near 90% 26 120 130 140 150 no 27 28 29 30 31 Figure 9. Regression line of 112 against no, when P(b2=0) near 90% 32 probability of D2 0.8 0.6 0.4 4- 24% 0.2 12% 18% 15% 15% 9% 2 3 4 6% 1% b 5 2 Figure 10. Distribution of bi when no=20 112=143 E(b2>=2.65 S(b2)= 1.74295 probability of b2 0.8 40.6 0.4 0.2 22% 29% 15% 12% 7% 9% 3 1 0 1 2 3 4 5 o / o 3o/ 0 b2 6 I - 7 Figure 11. Distribution of D2 when no=30 m=200 E(b2)=2.35 S(b2) =1.72548 probability of D2 0.6 0.4 23%.... 0.2 0 18% 1^70 -8%. 14% — \ !— 1 2 3 4 5 Figure 12. Distribution of b2 when no = 50 m = 400 E ( b 2 ) = 2.65 S(b2)=1.6959 33 1%—I b 2 probability of b2 bz 1 0 2 3 4 5 Figure 13. Distribution of b2 when no= 45 n2= 256 E(b2)=.75 S(b2)=1.04563 probability of b2 0.8 0.6 -- "" 50% 34% -- -- 19% 0 4% \ b2 1 1 2 3 Figure 14. Distribution of b2 when no= 50 m=260 S(b2)=.834847 E(b2)= .7 probability of b2 0.8 0.6 48% 0.4 33% 0.2 + 13% 5% 0 1 2 3 Figure 15. Distribution of b2 when no =80 m=500 S(b2)=1.00449 34 1% 4 E(b2>= =ib2 .78 probability of b2 l T 0.8 4 89% 0.6 0.4 0.2 4 10% 1% b Figure 16. Distribution of D2 when no=30 S(b2)=.356186 112= 80 E(b2>= .12 probability of b2 1 92% 0.8 0.6 4 0.4 0.2 1 0 4 0 1 1 1 2 Figure 17. Distribution of b2 when no=75 m=250 S(b2)= .027266 E(b2>= .08 probability of 02 89% 0.8 4 0.6 0.4 0.2 4 10% i2L 0 =1 b2 Figure 18. Distribution of b2 when no = 100 m=350 E(b2>=.12 S(b2)=.356186 35 2 Bibliography [1] Bela Bollobas. Random Graphs. Academic Press. 1985 [2] James R. Munkes. Element of Algebraic Topology. Academic Press. 1984. [3] Joel Friedman. Computing Betti Numbers via Combinatorial Laplacians. 1995 Preprint. [4] John M.Chambers etc. Graph Methods for Data Analysis.Duxbury Press. 1983 36
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Some characteristics of the second betti number of...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Some characteristics of the second betti number of random two dimensional simplicial complexes Tan, Kang 1996
pdf
Page Metadata
Item Metadata
Title | Some characteristics of the second betti number of random two dimensional simplicial complexes |
Creator |
Tan, Kang |
Date Issued | 1996 |
Description | In this thesis, through generating random two-dimensional simplicial complexes, we studied the event (b₂=0) for some specific probabilities. We found when the probability of event (b₂=0) takes on certain specific values, the pair (n₀,n₂) lies on certain lines. However, this research is limited by our sample space ( i.e. for P(b₂=0) ≈ 10%, 10 ≤ n₀ ≤ 80; for P(b₂=0) ≈ 50%, 12 ≤ n₀ ≤ 100; for P(b₂=0) ≈ 90%, 12 ≤ n₀ ≤ 145). The " linear behavior" may not hold asymptotically. In the same time, we endeavor to find the number of tetrahedra and 6-triangles in the simplicial complexes. When the event (b₂=0) occurs in our specific probabilities, it seems the second Betti number should come from tetrahedra and 6-triangles with high probability. However, the expectation of the number of tetrahedra and 6-triangles goes to zero, when no goes to infinity and there exists linear relationships between the pair (n₀,n₂). This evidence may also support that the " linear behavior" may not hold asymptotically. If n₂ and n₀ vary linearly with n₀ going to infinity, then the probability that n₀(Κ) — n₀ is extremely small in model MB for reasons are similar to the Coupon Collector's problem. Hence, the probability that we cannot find element in S(n₀,n₂) is large, which indicates that the model may have problems. |
Extent | 1502253 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-02-11 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0079898 |
URI | http://hdl.handle.net/2429/4436 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1996-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1996-0282.pdf [ 1.43MB ]
- Metadata
- JSON: 831-1.0079898.json
- JSON-LD: 831-1.0079898-ld.json
- RDF/XML (Pretty): 831-1.0079898-rdf.xml
- RDF/JSON: 831-1.0079898-rdf.json
- Turtle: 831-1.0079898-turtle.txt
- N-Triples: 831-1.0079898-rdf-ntriples.txt
- Original Record: 831-1.0079898-source.json
- Full Text
- 831-1.0079898-fulltext.txt
- Citation
- 831-1.0079898.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0079898/manifest