Forbidden configurations 2011
Item Metadata
Title | Forbidden configurations |
Creator |
Raggi, Miguel |
Publisher | University of British Columbia |
Date Created | 2011-08-09T17:05:00Z |
Date Issued | 2011-08-09 |
Date | 2011 |
Description | In this work we explore the field of Forbidden Configurations, a problem in Extremal Set Theory. We consider a family of subsets of {1,2,...,m} as the corresponding {0,1}-incidence matrix. For {0,1}-matrices F, A, we say F is a *subconfiguration* of A if A has a submatrix which is a row and column permutation of F. We say a {0,1}-matrix is *simple* if it has no repeated columns. Let ||A|| denote the number of columns of A. A {0,1}-matrix F with row and column order stripped is a *configuration*. Given m and a family of configurations G, our main function of study is forb(m,G) := max{||A|| : A simple and for all F in G, we have F not a subconfiguration of A }. We give a general introduction to the main ideas and previous work done in the topic. We develop a new more computational approach that allows us to tackle larger problems. Then we present an array of new results, many of which were solved in part thanks to the new computational approach. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | Eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2011-08-09T17:05:00Z |
Rights | Attribution-NonCommercial 2.5 Canada |
DOI | 10.14288/1.0072028 |
Degree |
Doctor of Philosophy - PhD |
Program |
Mathematics |
Affiliation |
Science, Faculty of |
Degree Grantor | University of British Columbia |
Graduation Date | 2011-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by/3.0/ |
URI | http://hdl.handle.net/2429/36598 |
Aggregated Source Repository | DSpace |
Digital Resource Original Record | https://open.library.ubc.ca/collections/24/items/1.0072028/source |
Full Text
Forbidden Configurations
by Miguel Raggi

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Faculty of Graduate Studies (Mathematics)

The University of British Columbia (Vancouver)
August 2011
© Miguel Raggi, 2011

Abstract

In this work we explore the field of Forbidden Configurations, a problem in Extremal Set Theory. We consider a family of subsets of $\{1, 2, \ldots, m\}$ as the corresponding $\{0,1\}$-incidence matrix. For $\{0,1\}$-matrices $F$, $A$, we write $F \prec A$ if $A$ has a submatrix which is a row and column permutation of $F$. We say a $\{0,1\}$-matrix is simple if it has no repeated columns. Let $\|A\|$ denote the number of columns of $A$. A $\{0,1\}$-matrix $F$ with row and column order stripped is a configuration. Given $m \in \mathbb{N}$ and a family of configurations $\mathcal{F}$, our main function of study is
\[ \mathrm{forb}(m, \mathcal{F}) := \max\{\|A\| : A \text{ simple and for all } F \in \mathcal{F} \text{ we have } F \nprec A\}. \]
We give a general introduction to the main ideas and previous work done in the topic. We develop a new, more computational approach that allows us to tackle larger problems. Then we present an array of new results, many of which were solved in part thanks to the new computational approach. We use both new ideas and new spins on old ideas to tackle the problems. The new results include finding exact bounds for small configurations that were previously unknown, and proving some previously conjectured asymptotic bounds for “boundary” configurations. We also develop a relationship between Forbidden Configurations and Patterns, which we use to prove some results.

Preface

Most of the results in this thesis were obtained jointly with both Dr. Richard Anstee and Dr. Attila Sali, and we produced four (submitted) papers. There are three papers with Dr. Anstee and Dr. Sali: “Evidence for a Forbidden Configuration Conjecture; one more case solved,” [ARS10a], “Forbidden Configurations: Quadratic Bounds,” [ARS11] and “Forbidden Configurations and Product Constructions,” [ARS10b].
There is one with Dr. Anstee, “Genetic Algorithms applied to problems of Forbidden Configurations,” [AR11]. The preprints for these papers can be downloaded from http://www.math.ubc.ca/~anstee/.

A more detailed description is in order. Chapter 1 and Chapter 2 have an introductory character and the ideas contained in them were thought of before I was involved. Chapter 3 is mostly my doing, under the supervision of Dr. Anstee. I wrote all the code and developed the algorithms, but of course, with good advice and encouragement from my supervisor. From Chapter 4, Section 4.1 (from [AR11]) and Section 4.2 (from [ARS10b]) were done mostly by me as well, again with good advice from Dr. Anstee. In particular I produced a list of conjectures for all unknown exact bounds for $3 \times 4$ configurations (using a Genetic Algorithm as described in Chapter 3), and Dr. Anstee advised me on which ones to pursue, namely those he felt were more likely to produce results. He was right. Section 4.3 was done jointly with Dr. Anstee. All results from Chapter 5 (from [ARS10a] and [ARS11]) were done jointly with Dr. Anstee and Dr. Sali. The results in Chapter 6 and Chapter 7 (from [ARS10b]) were started off by Dr. Anstee and myself, but were completed by Dr. Anstee and Dr. Sali in Hungary (although using some of my computations).

Contents

Abstract
Preface
Contents
List of Tables
Acknowledgements
Dedication
1 Introduction
  1.1 Summary
  1.2 Extremal Problems
  1.3 Basic Definitions
    1.3.1 Matrices of 0’s and 1’s
    1.3.2 Configurations
    1.3.3 Basic Properties
  1.4 The Conjecture
  1.5 Table of Notation
  1.6 History
    1.6.1 Extremal Combinatorics and Set Theory
    1.6.2 Forbidden Configurations
2 Basic Techniques
  2.1 Standard Induction
  2.2 What Is Missing?
  2.3 Implications
3 Computer Program Developed
  3.1 Representation of a Configuration
  3.2 Subconfigurations
  3.3 Determining X(F)
  3.4 Boundary Cases to Classify Configurations
  3.5 Finding What Is Missing and Forbidden
  3.6 Guessing Forb
    3.6.1 Brute Force Greedy Search
    3.6.2 Hill Climbing
    3.6.3 Genetic Algorithm
4 Exact Bounds
  4.1 Two 3x4 Exact Bounds
    4.1.1 Introduction
    4.1.2 The Bound for W
    4.1.3 The Bound for V
  4.2 Exact Bound for Ten Products
  4.3 Critical Substructures
5 Three Asymptotic Bounds
  5.1 Introduction
  5.2 Quadratic Bound for a 4-rowed Configuration
    5.2.1 What Is Missing?
    5.2.2 Case Analysis
    5.2.3 Linear Bound for the Inductive Children
  5.3 Quadratic Bound for a 5-rowed Configuration
    5.3.1 Applying Standard Induction
    5.3.2 What Is Missing?
    5.3.3 Linear Bound for the Inductive Children
  5.4 Classification of 6-rowed Quadratic Bounds
6 Patterns and Splits
  6.1 Patterns and Splits in 2-Dimensions
  6.2 Patterns and Splits in d-Dimensions
7 Products
  7.1 Introduction
  7.2 Submatrices of TxT
  7.3 Submatrices of IxI
  7.4 Submatrices of IxT
  7.5 Submatrices of IxTxI^c
  7.6 Fractional Exponent Bound for a Family of Configurations
  7.7 All Pairs of Columns
8 Conclusions
  8.1 Open Problems
    8.1.1 Monotonicity
    8.1.2 A Common 4-rowed Subconfiguration
    8.1.3 Other 3x4 Exact Bounds
    8.1.4 Critical Substructures
  8.2 Concluding Remarks
Bibliography
Index

List of Tables

Table 1.1 Classification: 1 row.
Table 1.2 Classification: 2 rows.
Table 1.3 Classification: 3 rows.
Table 1.4 Classification: 4 rows.
Table 1.5 Definitions
Table 8.1 Conjectured values of $\mathrm{forb}(m, V_i)$
Table 8.2 Conjectured values for $\mathrm{forb}(m, F_3)$ and $\mathrm{forb}(m, K_4)$
Table 8.3 Conjectured values for $\mathrm{forb}(m, 2 \cdot K_4^2)$

Acknowledgements

First, I would like to thank my supervisor, Richard Anstee, for his endless patience and the incredible amount of time he spent working with me. It was fun. I would also like to thank my committee members, Stephanie Van Willigenburg and Joel Friedman, as well as the University Examiners and the External Examiner, all for taking the time to go through my rather long (but quite fun) thesis. Also deserving of a thankful mention is my other co-author, Attila Sali.

I would also like to thank the Department of Mathematics at UBC. In particular, I would like to thank all professors, especially those from whom I took (at least) one course: Richard Anstee, Kai Behrend, Patrick Brosnan, Jingyi Chen, Joel Feldman, Joel Friedman, Richard Froese, Christoph Hauert, Kalle Karu, Richard Kenyon, Kevin Leyton-Brown (from Comp. Sci.), Greg Martin, Wayne Nagata, Ed Perkins, Malabika Pramanik, Laura Scull, Andrew Rechnitzer, Dale Rolfsen, Denis Sjerve, Lior Silberman, Jozsef Solymosi and Vinayak Vatsal.

The acknowledgements wouldn’t be complete without thanking all the people who were my friends during my time in Vancouver. By no means a comprehensive list: Yuri, Dennis, Hardeep, Ramon, Itzia, Ignacio, Felipe, and all the others.

In my long life I have met many nice people who have done me good. I’ll be eternally grateful to all of them. In my undergraduate thesis acknowledgements (or “agradecimientos”) I gave my best shot to mention as many of them as I could. To mention all of them again would undoubtedly make me hit the much dreaded 173 page mark, quite unlucky indeed.
Therefore, I would like to say the following: \input{agradecimientos.tex}

Some individuals, though, were so instrumental in my (alleged) success that I feel they deserve the mention twice. My family: Malú, Gerardo, Tanja, Daniel, Grola, O’Kuri, Medo, Gole and, even though she doesn’t live with us anymore, Kima. I’m sure she is running happily in a farm upstate somewhere.

I feel like I’m missing someone... someone important...

Dedication

To Teresa. Although this thesis represents the time I was not with you, finishing it means being able to return to you.

Chapter 1
Introduction

1.1 Summary

We give a brief description of the contents of each chapter of the thesis. For a more in-depth description, each chapter contains some introductory remarks. For this summary we try to use as few concepts as possible, but in order to describe the contents of some chapters it is unavoidable to use some key concepts defined in Section 1.3.

Forbidden Configurations is at its core an extremal set theory problem: We attempt to maximize the size of a family of subsets of $\{1, 2, \ldots, m\}$ subject to some restriction, namely that the family doesn’t contain a particular object which we call a configuration. The function $\mathrm{forb}(m, F)$ represents the maximum size of a family of subsets of $\{1, 2, \ldots, m\}$ with no configuration $F$. For proper definitions, see Section 1.3.

We start in Chapter 1 by giving a general introduction and motivation to the field of Forbidden Configurations (in Section 1.2) and some basic definitions (in Section 1.3) that are used throughout the whole thesis. In Section 1.4 we describe a conjecture of Anstee and Sali (Conjecture 1.4.1) which has been a driving force in studying Forbidden Configurations. There is a brief history of the topic in Section 1.6 including related extremal problems.

In Chapter 2, we explain three techniques that have been quite fruitful in the field of Forbidden Configurations, namely those of Standard Induction, What Is Missing and Implications.
We have greatly increased the reach of Standard Induction (and to a lesser extent, Implications) with new ways to use them in solving problems. We have increased the usability of What Is Missing by developing a computer program to do it automatically.

In Chapter 3 we describe the computer program we use to push the boundaries by increasing the size of the problems we are able to deal with. We’ll often refer to this chapter throughout the rest of the thesis. We used the software in three fundamentally different ways. First and most importantly, we used it to obtain relevant information about configurations (or families of configurations), by finding What Is Missing as described in Section 2.2. Second, we used some specific-purpose code to perform case analysis when we found it hard or tedious to do so by hand, helping in various proofs. Lastly, we used some local search strategies (described in Section 3.6) to guess extremal matrices and structures. Section 3.6 could be removed entirely and all proofs would still be correct, but a reader might be left wondering “How did they come up with this answer?”. This would be a perfectly valid question for which the only reasonable answer is “the computer told us so.”

Chapter 4 provides some new exact values for $\mathrm{forb}(m, F)$ (see Definition 1.3.24) for some configurations $F$ for which the bound was not previously known. In Section 4.1 in particular we prove exact bounds for two of the smallest configurations for which the answer wasn’t known. These results can be found in [AR11]. In Section 4.2 we prove an exact bound for a family of 10 configurations, a result that can be found in [ARS10b]. Finally, in Section 4.3 we give all critical substructures (see Definition 4.3.1) of $K_4$, as well as a conjecture for all critical substructures of $K_k$.
In this chapter we make heavy use of the program described in Section 3.6 to guess extremal configurations and then proceed to prove they are indeed extremal, mostly using Standard Induction and What Is Missing followed by case analysis.

In Chapter 5 we give proofs of some boundary cases (see Definition 1.4.3), which serve as further evidence for our motivating Conjecture 1.4.1. We give the asymptotic bound for a 4-rowed configuration $F_8(t)$ in Section 5.2 (appears in [ARS10a]), a 5-rowed configuration $F_7$ in Section 5.3 and a 6-rowed configuration $G_{6 \times 3}$ in Section 5.4. This last one in particular is proven to be the unique quadratic 6-rowed boundary case. The last two results appear in [ARS11].

For Chapter 6 we give an entirely different extremal combinatorics problem with a more geometric flavour. This chapter is independent of all previous chapters, but is used in Chapter 7 for some proofs, as some problems considered in Chapter 7 can be interpreted as problems of patterns.

We dedicate Chapter 7 to the study of product constructions. We define a sister function to ‘forb’ and prove various interesting results about it. Then we prove an unexpected bound (for ‘forb’) of a family of configurations, which at first glance would seem to be a counterexample to Conjecture 1.4.1 (but isn’t). Most of the results of Chapter 6 and Chapter 7 can be found in [ARS10b].

Finally, Chapter 8 contains some open problems, as well as some ideas and thoughts on Forbidden Configurations and what the future might yield.

The results of Section 4.1.3 and Section 5.3 and its application to Section 5.4 provide a good sample of the flavour of the new results given in this thesis.

1.2 Extremal Problems

A general extremal problem in combinatorics has the following character: For a given property $P$, what is the maximum size of a family that satisfies $P$?
For example, if the objects of study are families of subsets of $\{1, 2, \ldots, m\}$ and property $P$ is that no two subsets have two elements in common, the question becomes: What is the maximum number of subsets of $\{1, 2, \ldots, m\}$ such that no two subsets have two elements in common? For this particular example we will give the answer later in Example 1.3.14.

In the field of Forbidden Configurations, we consider a particular extremal problem that arises as a generalization of celebrated extremal results in combinatorics, namely those of Erdős and Stone [ES46] and Erdős and Simonovits [ES66]. They consider the following problem: Given $m \in \mathbb{N}$ and a fixed (small) graph $F$, find the maximum number of edges in a (simple) graph $G$ with $m$ vertices that avoids having a subgraph isomorphic to $F$. For example, Mantel’s result answers the question “what is the maximum number of edges in a triangle-free graph on $m$ vertices?” and can be viewed as a special case. We will discuss this further in Section 1.6.

There are a number of ways to generalize the above problem to hypergraphs, but in this thesis we consider the following generalization: Given $m \in \mathbb{N}$ and a hypergraph $F$, find the maximum number of edges in a simple hypergraph $H$ on $m$ vertices that avoids having a subhypergraph (or trace) isomorphic to $F$. We consider the notion of subhypergraph as follows: The vertices of a subhypergraph are a subset of the vertices of the original hypergraph. For the edges, we choose a subset of the edges from the original hypergraph and consider the multiset consisting of intersections of the chosen edges with the vertices of the subhypergraph (more on this later).

As stated before, there are alternate generalizations of this problem.
For example, instead of considering the edges of the subhypergraph to be intersections of edges of the original hypergraph with the vertices of the subhypergraph, one might consider a subhypergraph to have only edges that are fully contained in the chosen subset of the vertices. This alternate definition makes sense especially when considering $k$-uniform hypergraphs, in the sense that all edges have size $k$ (a graph is a 2-uniform hypergraph). This alternate view of subhypergraphs is consistent with the way subgraphs of graphs are usually defined. In contrast, the way we see subhypergraphs when restricted to 2-uniform hypergraphs would suggest that when considering a subset of the vertices, we might have to consider “edges” of size 1 or size 0, even though we started with all edges of size exactly 2, since the intersection of an edge of size 2 with the vertices of the subhypergraph might have size 1 or 0. Moreover, the subhypergraph need not be simple, even if the original hypergraph was.

When studying difficult extremal problems, such as Forbidden Configurations, an exact answer to the extremal question might not come easily. In such cases, it’s common practice to settle for asymptotic bounds, given in Big-O notation. If $f, g : \mathbb{N} \to \mathbb{N}$, we define the following notation:

• We say $f$ is $O(g)$ if there exist a constant $c > 0$ and a number $N$ such that for all $n \ge N$, we have $f(n) < c \cdot g(n)$.
• We say $f$ is $\Omega(g)$ if there exist a constant $d > 0$ and a number $N$ such that for all $n \ge N$, we have $f(n) > d \cdot g(n)$.
• We say $f$ is $\Theta(g)$ if $f$ is both $O(g)$ and $\Omega(g)$. That is, if there exist constants $c > 0$, $d > 0$ and $N$ such that for all $n > N$, $d \cdot g(n) < f(n) < c \cdot g(n)$. In such a case we say $f$ has the same asymptotic growth as $g$.

1.3 Basic Definitions

1.3.1 Matrices of 0’s and 1’s

First some standard notation. Let $X$ be a finite set. We denote by $2^X$ the power set of $X$: the set that consists of all subsets of $X$ (note that there are $2^{|X|}$ of them).
Denote all subsets of $X$ of a given size $t$ by $\binom{X}{t}$. A set system $\mathcal{A}$ on a set $X$ is a family of subsets of $X$. In other words, $\mathcal{A} \subseteq 2^X$. We might view a set system on $X$ as a simple hypergraph on the vertices $X$.

We find it convenient to encode hypergraphs or set systems in the language of matrices. For a given order of the elements of $X$, and an ordering of the elements of a family $\mathcal{A}$, we can encode a set system as a matrix in the following way: A single subset $\alpha \subseteq X$ can be thought of as the incidence vector of 0’s and 1’s with $|X|$ entries, where the entry $i$ is 1 if and only if $i \in \alpha$. The family $\mathcal{A}$ gives rise to a collection of $\{0,1\}$-vectors with $|X|$ components. Consider these $\{0,1\}$-vectors as column vectors and concatenate them together to form a $\{0,1\}$-matrix. Since the elements of $X$ don’t play any role in our investigations, we might as well take $X = [m] := \{1, 2, \ldots, m\}$ with $m$ being the size of $X$. For example, the matrix
\[ A = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix} \]
represents the family $\mathcal{A} = \{\{1\}, \{2, 3\}, \{1, 3\}, \{2, 3, 4\}, \emptyset\}$. Notice that in this encoding no two columns are equal: if two columns were equal, they would represent the same subset of $X$.

Different orderings of the columns of $A$ give rise to different ways of ordering the sets of the family $\mathcal{A}$. We usually think of the set $[m]$ as having a canonical order, but we can think of different row orders as giving rise to equivalent hypergraphs. We will define the notion of equivalence of $\{0,1\}$-matrices formally in the next section, but for now let us talk about matrices a little longer.

Definition 1.3.1 We say a $\{0,1\}$-matrix $A$ is simple if no two columns of $A$ are equal.

The size of the family $\mathcal{A}$ (i.e. the number of edges in the hypergraph) is represented by the number of columns of matrix $A$. Since we will refer to this number often, we find it convenient to create a short notation for it.

Definition 1.3.2 For a $\{0,1\}$-matrix $A$, we define $\|A\|$ to be the number of columns of $A$.
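The encoding of a set system as a $\{0,1\}$-matrix is mechanical enough to sketch in a few lines of Python. What follows is only an illustration and not the program described in Chapter 3: the function names (`incidence_matrix`, `is_simple`, `num_columns`) are hypothetical, the matrix is stored as a list of rows, and the ground set is assumed to be $[m]$.

```python
def incidence_matrix(family, m):
    """Rows are the elements 1..m, columns are the sets of `family` in order.

    Entry (i, j) is 1 exactly when element i belongs to the j-th set.
    (Hypothetical helper; the list-of-rows layout is an assumption.)
    """
    return [[1 if i in s else 0 for s in family] for i in range(1, m + 1)]


def is_simple(A):
    """A matrix is simple when no two of its columns are equal."""
    cols = list(zip(*A))  # transpose: one tuple per column
    return len(set(cols)) == len(cols)


def num_columns(A):
    """The quantity written ||A|| in the text."""
    return len(A[0]) if A else 0


# The family used as the example in the text.
family = [{1}, {2, 3}, {1, 3}, {2, 3, 4}, set()]
A = incidence_matrix(family, 4)
for row in A:
    print(*row)
print(is_simple(A), num_columns(A))
```

Since a family of sets has no repeated members, a matrix produced this way from distinct sets is always simple.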
The size of each edge in the hypergraph is represented by the number of 1’s in the corresponding column.

Definition 1.3.3 For a $\{0,1\}$-matrix $A$ we denote by $\sigma_0(A)$ the number of 0’s in matrix $A$ and by $\sigma_1(A)$ the number of 1’s. In the case of a column $\alpha$, we say $\alpha$ has column sum $k$ (and write $\sigma_1(\alpha) = k$) if it has exactly $k$ ones.

For $m$-rowed $A$, we have $\sigma_0(A) + \sigma_1(A) = m \cdot \|A\|$.

Definition 1.3.4 Define $0_m$ to be the $m$-rowed column of all 0’s and $1_m$ to be the $m$-rowed column of all 1’s.

We now define some operations on $\{0,1\}$-matrices that will be useful for constructing new matrices from previously constructed ones.

Definition 1.3.5 Let $A$ be a $\{0,1\}$-matrix. We denote by $A^c$ the $\{0,1\}$-complement of $A$. That is, the matrix that results from replacing every 0 in $A$ by a 1, and every 1 by a 0. For example,
\[ A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix} \implies A^c = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix}. \]

Definition 1.3.6 Let $A$ and $B$ be $\{0,1\}$-matrices with the same number of rows. Define the concatenation $[A \mid B]$ to be the matrix that results from taking all columns of $A$ together with all columns of $B$. For $t \in \mathbb{N}$, we define the product
\[ t \cdot A := [\underbrace{A \mid A \mid \cdots \mid A}_{t \text{ times}}]. \]
The operation $\cdot$ has precedence over $\mid$, so that $[t \cdot A \mid B] = [(t \cdot A) \mid B]$.

Definition 1.3.7 Let $A$ and $B$ be $\{0,1\}$-matrices. We construct the product $A \times B$ by taking each column of $A$, and putting it on top of each column of $B$. So if the columns of $A$ are $\alpha_1, \ldots, \alpha_a$ and the columns of $B$ are $\beta_1, \ldots, \beta_b$, then $A \times B$ is the matrix with $ab$ columns:
\[ A \times B = \begin{bmatrix} \alpha_1 & \alpha_1 & \cdots & \alpha_1 & \alpha_2 & \alpha_2 & \cdots & \alpha_2 & \cdots & \alpha_a & \alpha_a & \cdots & \alpha_a \\ \beta_1 & \beta_2 & \cdots & \beta_b & \beta_1 & \beta_2 & \cdots & \beta_b & \cdots & \beta_1 & \beta_2 & \cdots & \beta_b \end{bmatrix}. \]
Here is an example of a product:
\[ A = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \implies A \times B = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}. \]

Often we will consider single columns and rows of $\{0,1\}$-matrices.
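As a brief aside before turning to single columns, the complement, concatenation and product operations of Definitions 1.3.5 to 1.3.7 can be sketched as follows. This is a minimal illustration under our own assumptions (matrices stored as lists of rows; the names `complement`, `concat`, `times` and `column_product` are hypothetical), not the code of Chapter 3.

```python
from itertools import product as cartesian


def complement(A):
    """The {0,1}-complement: swap every 0 and 1."""
    return [[1 - x for x in row] for row in A]


def concat(A, B):
    """[A | B] for matrices with the same number of rows."""
    assert len(A) == len(B), "A and B need the same number of rows"
    return [ra + rb for ra, rb in zip(A, B)]


def times(t, A):
    """t . A, that is, A concatenated with itself t times."""
    return [row * t for row in A]


def column_product(A, B):
    """A x B: every column of A stacked on top of every column of B."""
    cols_a, cols_b = list(zip(*A)), list(zip(*B))
    # The outer loop runs over the columns of A and the inner loop over
    # the columns of B, matching the column order used in Definition 1.3.7.
    cols = [a + b for a, b in cartesian(cols_a, cols_b)]
    return [list(row) for row in zip(*cols)]


A = [[0, 1, 1], [0, 0, 1]]
B = [[1, 0], [0, 1]]
P = column_product(A, B)
for row in P:
    print(*row)
```

With these inputs the printed rows agree with the worked product example above, and the number of columns of the product is the product of the numbers of columns of the two factors.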
Definition 1.3.8 For a column $\alpha$ and a $\{0,1\}$-matrix $A$, we define the multiplicity of $\alpha$ in $A$, written as $\lambda(\alpha, A)$, as the number of columns of $A$ which are equal to $\alpha$. For example,
\[ \lambda\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \right) = 2. \]

Definition 1.3.9 Let $A$ and $B$ be $m$-rowed $\{0,1\}$-matrices. We define subtraction $A - B$ to be the $m$-rowed $\{0,1\}$-matrix such that for each $m$-rowed column $\alpha$, $A - B$ satisfies
\[ \lambda(\alpha, A - B) = \max\{0, \lambda(\alpha, A) - \lambda(\alpha, B)\}. \]
The order in which the columns appear in $A - B$ isn’t important, but just so that it is a well defined operation of matrices, we might choose the order of the columns of $A - B$ to be the same as for $A$. This operation corresponds to set difference of the corresponding set systems. Here is an example of the difference of two matrices:
\[ \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} - \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Of the columns of $A$, the column $(0, 1, 0)^T$ has multiplicity 1 in $A$ and 0 in $B$, so it must have multiplicity 1 in $A - B$. The column $(1, 0, 0)^T$ has multiplicity 3 in $A$ and 1 in $B$, so it must have multiplicity $3 - 1 = 2$ in $A - B$. Finally, the column $(1, 1, 1)^T$ has multiplicity 1 in $A$ and 1 in $B$ so it must have multiplicity 0 in $A - B$.

Definition 1.3.10 Let $A$ be a $\{0,1\}$-matrix. Given a subset of the rows $S$, we define the restriction $A|_S$ to be a $\{0,1\}$-matrix formed from rows $S$ of $A$. For example,
\[ A = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} \implies A|_{\{2,4\}} = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix}. \]

The following notation is convenient.

Definition 1.3.11 Let $\alpha$ and $\beta$ be $m \times 1$ columns. Then we say $\alpha \le \beta$ if for every row $r$ for which the $r$-th entry of $\alpha$ is a 1, the $r$-th entry of $\beta$ is also a 1. For example,
\[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \le \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} \quad \text{but} \quad \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \not\le \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}. \]

Definition 1.3.12 Let $\alpha$ be an $m \times 1$ column and $A$ be an $m$-rowed matrix. We use $\alpha \in A$ if $\alpha$ is a column of $A$.

The following three special matrices turn out to be remarkably useful in our investigations.
We shall see why they are important in Conjecture 1.4.1.

Definition 1.3.13 We define the following three matrices:

• The identity matrix $I_m$ is the $m \times m$ matrix with 1’s in the diagonal and 0’s everywhere else. It corresponds to the set system $\{\{1\}, \{2\}, \ldots, \{m\}\}$. For example,
\[ I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
• The identity complement $I_m^c$ is the $\{0,1\}$-complement of $I_m$. That is, every entry is a 1, except for the diagonal entries, which are 0’s. It corresponds to the set system $\{[m] \setminus \{1\}, [m] \setminus \{2\}, \ldots, [m] \setminus \{m\}\}$. For example,
\[ I_4^c = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}. \]
• The tower matrix $T_m$ is the $m \times (m + 1)$ matrix that corresponds to the set system $\{\emptyset, [1], [2], [3], \ldots, [m]\}$. For example,
\[ T_4 = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}. \]

For these three matrices, when the number of rows is implicit we might omit the subindex $m$ and write $I$, $I^c$ or $T$.

Example 1.3.14 Returning to the problem of finding the maximum number of subsets of $[m]$ such that no two subsets have two elements in common, we can restate it as the problem of finding the maximum number of columns a simple matrix with $m$ rows can have without having the following submatrix:
\[ F = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}. \]
The answer to this particular example is straightforward: The maximum number of columns is
\[ \binom{m}{0} + \binom{m}{1} + \binom{m}{2}. \]
Indeed, we can take one column with no 1’s (the empty set), all columns with one 1 and all columns with two 1’s. Conversely, for every pair of rows there is at most one column with 1’s in both rows of that pair, and every column with more than one 1 has 1’s in both rows of at least one pair. Thus the number of columns we can have with more than one 1 is at most $\binom{m}{2}$. For example, for $m = 3$, this is the matrix:
\[ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix}. \]
This concludes Example 1.3.14.

Example 1.3.15 Now consider the following question. What is the maximum number of subsets of $\{1, 2, \ldots, m\}$ such that no three sets each have an element that the other two sets don’t have? This is a very similar problem. We wish to find the maximum number of columns a
We wish to find the most number of columns a 9 {0, 1}-matrix can have, having neither of the following six submatrices:1 0 00 1 0 0 0 1 nor 1 0 00 0 1 0 1 0 nor 0 1 01 0 0 0 0 1 nor 0 1 00 0 1 1 0 0 nor 0 0 11 0 0 0 1 0 nor 0 0 10 1 0 1 0 0 . In other words, we wish to find the maximum number of columns in a matrix that doesn’t have any row and column permutation of I3 as a submatrix. We will answer this question later (in Proposition 4.3.3). This demonstrates the need to get rid of row and column ordering. We do so in the next section. 1.3.2 Configurations Definition 1.3.16 (informal) A configuration is a {0, 1}-matrix with column and row order stripped. More formally, letMm be the set of all m-rowed matrices. Define an equivalence relation: If A,B ∈Mm, we say A ∼ B iff A is a permutation of the rows and columns of B. Consider the set of configurations Cm by taking the quotient: Cm :=Mm/ ∼, and if F is a {0, 1}-matrix, define F̃ to be the equivalence class (or configuration) to which F belongs. We will usually abuse notation and not distinguish between a configuration and a {0, 1}- matrix representative, always remembering that we can permute the order of the rows and columns without altering the configuration. For example, the six matrices given in Exam- ple 1.3.15 represent the same configuration. Notice that if F , G are configurations and t is a number, the operations we defined for {0, 1}-matrices ‖F‖, F × G, F c, t · F are well defined in configurations, but concatenation [F |G] and subtraction F −G are not. We will therefore use the notation [F |G] and F −G carefully, when the order of rows is understood. The notion of being a simple configuration 10 is also well defined: a configuration for which any representative is a simple matrix. We have that F ×G = G× F as configurations, and if F and G are simple, F ×G is also simple. As with matrices, we might refer to the multiplicity of a column α (and use the notation λ(α, F )). 
What we mean by this is that for a specific representative of a configuration $F$, there are $k$ columns that are equal to $\alpha$. Of course, if we choose a different representative for $F$ (i.e. permute the rows and columns), we will have $k$ copies of a row permutation of $\alpha$.

Definition 1.3.17 Denote by $K_k$ the unique $k \times 2^k$ simple "complete" configuration corresponding to the set system $2^{[k]}$. That is, the one representing the power set of $[k]$. The notation was chosen to mirror the common notation for the "complete" graph $K_k$. As an example, here is $K_3$:
\[ K_3 = \begin{bmatrix} 0&1&0&0&1&1&0&1 \\ 0&0&1&0&1&0&1&1 \\ 0&0&0&1&0&1&1&1 \end{bmatrix}. \]
Denote by $K_k^s$ the $k \times \binom{k}{s}$ configuration with $k$ rows and all possible columns of column sum $s$. Here is $K_3^2$:
\[ K_3^2 = \begin{bmatrix} 1&1&0 \\ 1&0&1 \\ 0&1&1 \end{bmatrix}. \]
We also use notation such as $K_k^{\le s}$ (resp. $K_k^{\ge s}$), meaning the configuration with $k$ rows and all possible columns of column sum $\le s$ (resp. $\ge s$). Observe that for $k, \ell \in \mathbb{N}$, we have $K_k \times K_\ell = K_{k+\ell}$.

Definition 1.3.18 For a configuration $F$ and a {0,1}-matrix $A$ (or a configuration $A$), we say that $F$ is a subconfiguration of $A$, and write $F \prec A$, if there is a representative of $F$ which is a submatrix of $A$. We say $A$ has no configuration $F$ (or doesn't contain $F$ as a configuration, or avoids $F$) if $F$ is not a subconfiguration of $A$.

The notion of subhypergraph in our matrix notation becomes the notion of a subconfiguration. Observe that $\prec$ defines a partial ordering on the set $\mathcal{C}_m$ of configurations, since $\prec$ is reflexive ($F \prec F$), anti-symmetric (if $F \prec G$ and $G \prec F$, then $F = G$ as configurations), and transitive ($F \prec G \prec H$ implies $F \prec H$).

Definition 1.3.19 Let
\[ \mathrm{Avoid}(m, F) := \{A \in \mathcal{C}_m : A \text{ is simple and } F \nprec A\}. \]
In other words, the set of all $m$-rowed simple configurations with no $F$ as a subconfiguration. For example,
\[ \mathrm{Avoid}\left(2, \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix}\right) = \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0&1 \\ 0&0 \end{bmatrix}, \begin{bmatrix} 0&1 \\ 0&1 \end{bmatrix}, \begin{bmatrix} 1&1 \\ 0&1 \end{bmatrix}, \begin{bmatrix} 0&1&1 \\ 0&0&1 \end{bmatrix} \right\}. \]

Definition 1.3.20 Our main object of study is the following extremal function.
We define
\[ \mathrm{forb}(m, F) := \max\{\|A\| : A \in \mathrm{Avoid}(m, F)\}. \]
In other words, $\mathrm{forb}(m, F)$ is the maximum number of columns a simple $m$-rowed {0,1}-matrix can have with no $F$ as a subconfiguration. For example, we may conclude that
\[ \mathrm{forb}\left(2, \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix}\right) = 3 \]
by looking at all configurations in $\mathrm{Avoid}\left(2, \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix}\right)$.

Remark 1.3.21 Observe that forb is an "increasing" function in the second variable, meaning
\[ F \prec G \implies \mathrm{forb}(m, F) \le \mathrm{forb}(m, G). \]
Indeed, if $F \prec G$, then every matrix containing $G$ as a configuration also contains $F$. Thus $\mathrm{Avoid}(m, F) \subseteq \mathrm{Avoid}(m, G)$, which means $\mathrm{forb}(m, F) \le \mathrm{forb}(m, G)$.

Interestingly, it is an important open problem whether $\mathrm{forb}(m, F)$ is increasing in the first variable, in the sense that $m < n$ implies $\mathrm{forb}(m, F) \le \mathrm{forb}(n, F)$. We have strong evidence to believe this is the case, but it hasn't been proven. More about this in Section 8.1.

Remark 1.3.22 Observe that $\mathrm{forb}(m, F) = \mathrm{forb}(m, F^c)$ by symmetry.

We may also consider forbidding a family of configurations $\mathcal{F}$:

Definition 1.3.23 Let $\mathcal{F}$ be a family of configurations. Define
\[ \mathrm{Avoid}(m, \mathcal{F}) := \bigcap_{F \in \mathcal{F}} \mathrm{Avoid}(m, F). \]
In other words, the set of all $m$-rowed simple matrices that avoid all configurations in $\mathcal{F}$.

Definition 1.3.24 We now define the analogous extremal function
\[ \mathrm{forb}(m, \mathcal{F}) := \max\{\|A\| : A \in \mathrm{Avoid}(m, \mathcal{F})\}. \]
Note that according to this definition, there exists a simple matrix of size $m \times \mathrm{forb}(m, \mathcal{F})$ that doesn't contain any $F \in \mathcal{F}$ as a subconfiguration, but every simple matrix of size $m \times (\mathrm{forb}(m, \mathcal{F}) + 1)$ contains at least one $F \in \mathcal{F}$. Also note that $(\mathcal{F}, \prec)$ can be considered as a partially ordered set, and if $\mathcal{H}$ is the set of minimal elements of $\mathcal{F}$, then $\mathrm{Avoid}(m, \mathcal{F}) = \mathrm{Avoid}(m, \mathcal{H})$.

Definition 1.3.25 We define the extremal matrices
\[ \mathrm{ext}(m, \mathcal{F}) := \{A \in \mathrm{Avoid}(m, \mathcal{F}) : \|A\| = \mathrm{forb}(m, \mathcal{F})\}. \]

Problem 1.3.26 (The Main Problem) We are interested in determining $\mathrm{forb}(m, \mathcal{F})$. We would like to find $\mathrm{forb}(m, \mathcal{F})$ exactly for any family, but we'll settle for asymptotic bounds when it's too hard to find $\mathrm{forb}(m, \mathcal{F})$ exactly.
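For very small $m$, the definitions of Avoid and forb can be checked by exhaustive search. The following is a minimal Python sketch and not the program used in this thesis: the function names `contains` and `forb`, and the representation of a matrix as a list of column tuples, are assumptions made for illustration. It tries every ordered choice of rows and compares column multiplicities, so it is only practical for about $m \le 3$.

```python
from collections import Counter
from itertools import combinations, permutations, product


def contains(a_cols, f_cols, m):
    """Return True if the configuration F is a subconfiguration of A.

    a_cols and f_cols are sequences of columns, each column a tuple of 0/1
    entries; every column of A has m entries.
    """
    k = len(f_cols[0])
    need = Counter(f_cols)  # multiplicity of each column of F
    # An ordered choice of k rows of A accounts for row permutations of F;
    # column order is irrelevant because only multiplicities are compared.
    for rows in permutations(range(m), k):
        have = Counter(tuple(col[r] for r in rows) for col in a_cols)
        if all(have[c] >= n for c, n in need.items()):
            return True
    return False


def forb(m, f_cols):
    """Brute-force forb(m, F): the largest simple m-rowed matrix avoiding F."""
    all_cols = list(product((0, 1), repeat=m))
    # A simple matrix is a set of distinct columns; try the largest sets first.
    for size in range(len(all_cols), -1, -1):
        for a_cols in combinations(all_cols, size):
            if not contains(a_cols, f_cols, m):
                return size
    return 0
```

Under these assumptions, `forb(2, [(1, 0), (0, 1)])` evaluates to 3, in agreement with the example following Definition 1.3.20, and `forb(3, [(1, 1), (1, 1)])` evaluates to 7, matching the count $\binom{3}{0} + \binom{3}{1} + \binom{3}{2}$ of Example 1.3.14.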
This problem was first introduced by Anstee in [Ans85] for $|\mathcal{F}| = 1$. We deal mostly with the case $|\mathcal{F}| = 1$.

We now give some definitions that will be helpful in our investigations.

Definition 1.3.27 We say a family $\mathcal{A}$ is laminar if for every $X, Y \in \mathcal{A}$ we have that either $X \subseteq Y$, $Y \subseteq X$ or $X \cap Y = \emptyset$. Analogously, we say a {0,1}-matrix $A$ is laminar if for every two columns $\alpha, \beta$ of $A$ we have either $\alpha \le \beta$, or $\beta \le \alpha$, or there are no rows in which both $\alpha$ and $\beta$ have a 1. Equivalently, we say $m$-rowed $A$ is laminar if $A \in \mathrm{Avoid}(m, F)$ for
\[ F := \begin{bmatrix} 1&1 \\ 1&0 \\ 0&1 \end{bmatrix}. \]
For example,
\[ \begin{bmatrix} 1&0&1&1 \\ 0&0&1&1 \\ 0&1&0&1 \\ 0&1&0&1 \end{bmatrix} \text{ is laminar but } \begin{bmatrix} 1&0&1&1 \\ 0&0&1&1 \\ 0&1&1&1 \\ 0&1&0&1 \end{bmatrix} \text{ isn't.} \]

1.3.3 Basic Properties

While studying $\mathrm{forb}(m, \mathcal{F})$ for some family of configurations $\mathcal{F}$, we usually give lower bounds and upper bounds on $\mathrm{forb}(m, \mathcal{F})$. A lower bound usually consists of an example of a {0,1}-matrix with no $F \in \mathcal{F}$ as a configuration. An upper bound is usually accomplished by a theoretical argument.

Proposition 1.3.28 We have that
\[ \mathrm{forb}(m, 0_k) = \mathrm{forb}(m, 1_k) = \binom{m}{k-1} + \binom{m}{k-2} + \cdots + \binom{m}{0} \]
and
\[ \mathrm{forb}(m, 2 \cdot 0_k) = \mathrm{forb}(m, 2 \cdot 1_k) = \binom{m}{k} + \binom{m}{k-1} + \cdots + \binom{m}{0}. \]
Thus, if $0_k \prec F$ or $1_k \prec F$ for some configuration $F$, we have that $\mathrm{forb}(m, F)$ is $\Omega(m^{k-1})$, and if $2 \cdot 0_k \prec F$ or $2 \cdot 1_k \prec F$, then $\mathrm{forb}(m, F)$ is $\Omega(m^k)$.

Proof: The bound for $\mathrm{forb}(m, 1_k)$ is easy. Consider an extremal matrix $A$ such that $1_k \nprec A$. Note that in $A$, we cannot have columns with column sum $k$ or more. This leaves only $\binom{m}{k-1} + \binom{m}{k-2} + \cdots + \binom{m}{0}$ columns (the ones with column sum $k - 1$ or less).

The bound for $\mathrm{forb}(m, 2 \cdot 1_k)$ is also easy. For the columns with column sum $k$ or higher, for each $k$-subset of the rows there must be at most one column with 1's in that subset of the rows, so there are at most $\binom{m}{k}$ such columns, plus the columns with column sum $< k$, which cannot contribute to create $2 \cdot 1_k$.

The bounds for $\mathrm{forb}(m, 0_k)$ and $\mathrm{forb}(m, 2 \cdot 0_k)$ are analogous.

Here is an important result proved independently by V. Vapnik and A. Chervonenkis, N.
Sauer and M. Perles and S. Shelah:

Theorem 1.3.29 [VC71][Sau72][She72] We have that
\[ \mathrm{forb}(m, K_k) = \mathrm{forb}(m, 1_k) = \binom{m}{k-1} + \binom{m}{k-2} + \cdots + \binom{m}{0}, \]
and so $\mathrm{forb}(m, K_k)$ is $\Theta(m^{k-1})$. Thus for any simple $k$-rowed configuration $F$, $\mathrm{forb}(m, F)$ is $O(m^{k-1})$.

An important generalization of this result from Anstee and Füredi is:

Theorem 1.3.30 [AF86] For fixed $k$, $t$, we have that as $m \to \infty$,
\[ \mathrm{forb}(m, t \cdot K_k) = \mathrm{forb}(m, t \cdot 1_k) = \frac{t-2}{k+1} \binom{m}{k} (1 - o(1)) + \binom{m}{k} + \binom{m}{k-1} + \cdots + \binom{m}{0}. \]
This in particular says that if $F$ has $k$ rows, we have that $\mathrm{forb}(m, F)$ is $O(m^k)$ [Für83], since $F$ is contained in $t \cdot K_k$ for some $t$.

1.4 The Conjecture

Anstee and Sali conjectured that the "best" asymptotic constructions in terms of avoiding a single configuration $F$ would be formed from products of $I$, $I^c$ and $T$. There is ample evidence for this conjecture, but no proof or counterexample has been found yet. The research in forbidden configurations is often guided by this conjecture.

Conjecture 1.4.1 [AS05] Let $F$ be a configuration. Let
\[ P_r(a, b, c) := \underbrace{I_r \times \cdots \times I_r}_{a \text{ times}} \times \underbrace{I_r^c \times \cdots \times I_r^c}_{b \text{ times}} \times \underbrace{T_r \times \cdots \times T_r}_{c \text{ times}}. \]
Define $X(F)$ to be the largest number such that there exist numbers $a, b, c \in \mathbb{N}$ with $a + b + c = X(F)$ such that for all $r \in \mathbb{N}$, $F \nprec P_r(a, b, c)$. Then $\mathrm{forb}(m, F)$ is $\Theta(m^{X(F)})$.

Observe that $X(F)$ is always an integer. Also note that $\|P_r(a, b, c)\| = r^{a+b} \cdot (r+1)^c$, which is $\Theta(r^{X(F)})$, so by taking $r = \lceil m / X(F) \rceil$ (and perhaps deleting some rows in case $X(F) \nmid m$), we have that $\|P_r(a, b, c)\|$ is $\Omega(m^{X(F)})$. Hence the fact that $\mathrm{forb}(m, F)$ is $\Omega(m^{X(F)})$ is built into the conjecture.

In order to prove the conjecture, all that would be required would be to prove that $\mathrm{forb}(m, F)$ is $O(m^{X(F)})$ for every $F$. A disproof would be easier, as only a counterexample would be required.

A valid objection is that finding $X(F)$ given $F$ is not a trivial task, but for relatively small configurations $F$ we have a computer program that yields the answer very quickly.
We can compute $X(F)$ for $F$ having fewer than about 10 rows in just a few seconds. This task takes merely exponential time, not doubly exponential.

A simple (but surprising) corollary of the conjecture is that repeating columns more than twice in $F$ has no effect on the asymptotic behavior of $\mathrm{forb}(m, F)$. In other words, assuming the conjecture were true, the multiplicity of a column in a configuration would not affect the asymptotic bound: for asymptotic bounds, it would only matter whether a column is not there (has multiplicity 0), appears once (has multiplicity 1), or appears "multiple times" (has multiplicity 2 or more).

Lemma 1.4.2 Let $F_t = [G \,|\, t \cdot H]$ with $G$ and $H$ simple {0,1}-matrices that have no columns in common. Then $X(F_2) = X(F_t)$ for all $t \ge 2$. In particular, if the conjecture were true, then $\mathrm{forb}(m, F_t)$ and $\mathrm{forb}(m, F_2)$ would have the same asymptotic behavior.

Proof: It suffices to show that given $t$, $G$, $H$, $a$, $b$ and $c$, there exists an $R$ such that for every $r \ge R$, we have
\[ F_2 = [G \,|\, 2 \cdot H] \prec P_r(a, b, c) \iff F_t = [G \,|\, t \cdot H] \prec P_r(a, b, c). \]
Since $F_2 \prec F_t$, we only need to prove that if $F_2 \prec P_r(a, b, c)$ for some $r$, then $F_t \prec P_R(a, b, c)$ for some $R$.

Suppose then $F_2$ is contained in the product $P_r(a, b, c)$ for some $r$. The idea is to find a subconfiguration of $P_r(a, b, c)$ in which there are some columns with multiplicity 1, and for the columns with multiplicity 2 or more, the multiplicity depends on $r$. We need $r$ large enough so that the multiplicity of any one column (with multiplicity 2 or more) is at least $t$.

Let $x$ be the number of rows of $F_t$. Notice the following three facts, which include definitions for $E_I$, $E_{I^c}$ and $E_T$:
\[ E_I(x, r) := [(r - x) \cdot 0_x \,|\, I_x] \prec I_r, \]
\[ E_{I^c}(x, r) := [(r - x) \cdot 1_x \,|\, I_x^c] \prec I_r^c, \]
\[ E_T(x, r) := \left\lfloor \frac{r}{x} \right\rfloor \cdot T_x \prec T_r. \]
The first and second facts are easy to see; just take any subset of $x$ rows from $I_r$ or $I_r^c$. The third statement is true by taking the $\lfloor r/x \rfloor$-th row of $T_r$, the $2\lfloor r/x \rfloor$-th row of $T_r$, etcetera, up to the $x\lfloor r/x \rfloor$-th row.
For example, if $r = 5$ and $x = 2$, we may take the second and fourth row from $T_5$:
\[ T_5 = \begin{bmatrix} 0&1&1&1&1&1 \\ 0&0&1&1&1&1 \\ 0&0&0&1&1&1 \\ 0&0&0&0&1&1 \\ 0&0&0&0&0&1 \end{bmatrix} \implies T_5|_{\{2,4\}} = \begin{bmatrix} 0&0&1&1&1&1 \\ 0&0&0&0&1&1 \end{bmatrix} = E_T(2, 5). \]
Note that in the three configurations $E_I(x, r)$, $E_{I^c}(x, r)$ and $E_T(x, r)$, there are some columns of multiplicity 1 and there are some columns whose multiplicity can be made as large as we wish by making $r$ large. Formally, let $E(x, r)$ be one of $E_I(x, r)$, $E_{I^c}(x, r)$ or $E_T(x, r)$. For every $x$-rowed column $\alpha$ there are three possibilities: either $\lambda(\alpha, E(x, r)) = 0$ for all $r$, or $\lambda(\alpha, E(x, r)) = 1$ for all $r$, or $\lim_{r \to \infty} \lambda(\alpha, E(x, r)) = \infty$.

If $\alpha$ is a column for which $\lim_{r \to \infty} \lambda(\alpha, E(x, r)) = \infty$, we may conclude that there is an $R$ for which $\lambda(\alpha, E(x, r)) \ge t$ for every $r \ge R$.

Since $F_2$ is contained in $P_r(a, b, c)$ for some $r$, the columns in $H$ will have multiplicity at least 2 in some subset of the rows of $P_r(a, b, c)$. By taking $P_R(a, b, c)$, we see that $F_t$ is also a subconfiguration of $P_R(a, b, c)$.

Conjecture 1.4.1 has been a driving force behind the field of Forbidden Configurations, specifically when forbidding a single configuration (see [Ans]).

We search for maximal and minimal forbidden configurations with a specific asymptotic bound, since if both $\mathrm{forb}(m, F)$ and $\mathrm{forb}(m, G)$ are $\Theta(m^k)$, and we have a configuration $H$ such that $F \prec H \prec G$, then $\mathrm{forb}(m, H)$ is also $\Theta(m^k)$. For a given $k$ and $s$, we search for minimal and maximal $s$-rowed configurations $F$ for which $\mathrm{forb}(m, F)$ is $\Theta(m^k)$. We hope to use this to classify the asymptotic bounds for all configurations.

Definition 1.4.3 We say a configuration $F_t = [G \,|\, t \cdot H]$ (for $t \ge 2$) with $G$ and $H$ simple with no columns in common is a predicted boundary case if for any column $\alpha$ not present in $H$, $X([F_t \,|\, \alpha]) > X(F_t)$. We say $F_t$ is a boundary case if $\mathrm{forb}(m, F_t)$ is $\Theta(m^k)$ but for any column $\alpha$ not present in $H$, $\mathrm{forb}(m, [F_t \,|\, \alpha])$ is $\Omega(m^{k+1})$.
Conjecture 1.4.1 is that predicted boundary cases are the same as boundary cases. Boundary cases for s up to 4 were classified by Fleming ([Ans]). The conjecture has been proven for all k× ` configurations F with k = 1, 2, 3 and many others in various papers. For k = 2 in [AGS97]. For k = 3 in [AGS97], [AFS01] and was completed in [AS05]. For k = 4, the case when F is simple was completed in [AF86]. For k = 4 and F non-simple, there were only three cases left to do. We completed one of them ([ARS10a]) and the proof is given in Section 5.2. For ` = 2, the conjecture was verified in [AK06]. For k = 5, there are nine 5× 6 predicted boundary configurations F for which X(F ) = 2 ([ARS11]). We prove one of them in Section 5.3. For k = 6, we give a complete classification of the configurations F for which forb(m,F ) is quadratic ([ARS11]). If we only consider simple configurations, then we may also consider a restricted version of a boundary case. Definition 1.4.4 We say a simple configuration F is a predicted s-boundary case if for any column α not present in F , we have that X([F |α]) ≥ X(F ) + 1. We say it is a s-boundary case if forb(m,F ) is Θ(mk) but forb(m, [F |α]) is Ω(mk+1) for any column α not present in F . The classification of s-boundary cases for five rows was done by Ryan [Ans]. We now give here a classification of the minimal and maximal boundary cases. For t ≥ 2, 18 Minimal Maximal Constant [0], [1] [0 1] Linear [0 0], [1 1] t · [0 1] Table 1.1: Classification: 1 row. Minimal Maximal Constant [ 1 0 ] [ 1 0 ] Linear [ 0 0 ] , [ 1 1 ] [ 1 0 0 1 ] , [ 1 1 0 0 ] [ 0 0 1 0 1 1 t · [ 1 0 ]] , [ 0 0 t · [ 1 0 0 1 ]] Quadratic [ 0 0 0 0 ] , [ 1 1 1 1 ] ,[ 0 1 1 0 0 1 0 0 0 1 1 1 ] , t ·K2 Table 1.2: Classification: 2 rows. 
19 Minimal Maximal Linear 00 1 , 11 0 1 0 1 00 1 1 1 0 0 0 1 Quadratic 00 0 , 11 1 , 1 0 00 1 0 0 0 1 , 0 1 11 0 1 1 1 0 , 1 10 0 0 0 , 1 11 1 0 0 , 1 0 1 00 1 0 1 0 0 1 1 0 0 1 10 0 0 1 0 1 1 1 t · 1 0 1 00 1 1 1 0 0 0 1 , 0 0 1 10 0 1 1 0 1 0 1 t · 1 0 1 00 1 0 1 0 0 1 1 , 00 0 t · 1 0 0 1 10 1 0 1 0 0 0 1 0 1 , 11 1 t · 0 1 1 0 01 0 1 0 1 1 1 0 1 0 Cubic 2·03, [2·K13 |K23 ], [2·K13 |K33 ], [K03 |2 ·K23 ], [K13 |2 ·K23 ], 2 ·13 t ·K3 Table 1.3: Classification: 3 rows. 20 Before showing the table for 4 rows, let us define some configurations: B1 = 0 0 0 0 1 0 0 0 1 1 0 0 1 1 1 0 1 1 1 1 , B2 = 0 0 0 1 1 0 0 0 1 1 0 0 1 1 1 0 1 1 0 1 , B3 = 0 0 0 1 1 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 , B4 = 0 0 0 0 1 0 0 1 1 1 0 0 1 1 1 0 1 0 1 1 , B5 = 0 0 0 1 1 0 0 1 1 1 0 0 1 0 1 0 1 0 1 1 , B6 = 0 0 0 1 1 0 0 1 1 1 0 0 1 1 1 0 1 0 0 1 . D = 0 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 0 0 1 1 0 0 1 1 . 21 Minimal Maximal Linear 1 1 0 0 1 1 1 0 0 1 0 0 Quadratic 1 1 1 1 0 0 0 0 , 1 1 1 0 , 0 0 0 1 , 1 0 1 0 0 1 0 1 , 1 0 0 0 1 0 0 0 1 1 1 1 , 0 1 1 1 0 1 1 1 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 1 0 0 0 1 t· 1 0 1 1 1 1 0 0 0 1 0 1 0 0 1 0 (Conjectured) 1 0 1 1 0 1 1 1 0 0 1 0 0 0 0 1 t· 1 1 0 1 1 0 1 0 0 0 1 1 0 1 0 0 (Conjectured) 1 0 1 0 0 1 0 1 0 0 1 1 0 0 1 1 t· 0 1 1 0 1 1 0 0 Solved in Section 5.2 22 Minimal Maximal Cubic 1 0 0 0 1 0 0 0 1 0 0 1 , 0 1 1 1 0 1 1 1 0 1 1 0 , 1 0 0 0 1 0 0 0 1 0 0 0 , 0 1 1 1 0 1 1 1 0 1 1 1 , 1 0 1 0 1 0 0 1 0 1 1 0 0 1 0 1 , 1 1 1 1 , 1 0 0 1 0 1 0 1 0 0 1 1 1 1 1 0 , 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 1 , 0 0 0 0 , 1 1 0 0 0 1 1 0 0 1 0 1 0 0 1 1 , 1 1 1 1 1 1 0 0 , 0 0 0 0 0 0 1 1 All these are conjectured. [K4 | t · [K4 −Bi]] for i = 1, 2, ...6, [K04 | t ·D] [K04 | t ·D]c Quartic All others (see [Ans] for a list) t ·K4 Table 1.4: Classification: 4 rows. 
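For five rows, the minimal and maximal matrices haven't been classified, except for the quadratic case, and these are the nine maximal five-rowed configurations which are conjectured to have a quadratic bound. Of these, only $F_7$ is known to have a quadratic bound, proved in Section 5.3.
\[ F_3 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&1 \\ 0&1&1&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix} \quad F_4 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&0 \\ 0&1&1&1&0&1 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix} \quad F_5 = \begin{bmatrix} 0&0&1&0&0&0 \\ 0&1&0&0&0&1 \\ 1&0&0&0&1&0 \\ 1&1&1&1&0&1 \\ 1&1&1&1&1&0 \end{bmatrix} \]
\[ F_6 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix} \quad F_7 = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&0&1 \\ 0&0&1&0&0&1 \\ 0&0&0&0&1&0 \end{bmatrix} \quad F_8 = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&0&0 \\ 0&0&1&0&0&1 \\ 0&0&0&0&1&1 \end{bmatrix} \]
\[ F_9 = \begin{bmatrix} 0&0&1&0&0&1 \\ 0&1&0&0&0&0 \\ 1&0&1&0&1&1 \\ 1&1&0&1&1&0 \\ 1&1&1&1&0&0 \end{bmatrix} \quad F_{10} = \begin{bmatrix} 1&1&0&1&0&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&1&0 \\ 0&0&1&0&1&1 \\ 0&0&0&0&0&1 \end{bmatrix} \quad F_{11} = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&0&1 \\ 0&1&0&0&1&0 \\ 0&0&1&0&0&1 \\ 0&0&0&1&1&1 \end{bmatrix} \]

1.5 Table of Notation

For the following definitions, $A$, $B$ are $m$-rowed {0,1}-matrices, $F$ and $G$ are configurations, $\mathcal{F}$ is a family of configurations, $\alpha$ is a column, and $S$ is a subset of rows. We only give informal definitions. For full definitions, see Section 1.3.

Each entry lists: Name. Notation. Informal definition. Example.

• Simple Configuration. Notation: N/A. A configuration with no repeated columns. Example: $\begin{bmatrix} 1&1&0 \\ 0&1&1 \end{bmatrix}$ is simple; $\begin{bmatrix} 1&1&1 \\ 0&1&0 \end{bmatrix}$ is not.
• Number of Columns. Notation: $\|F\|$. The number of columns of a representative. Example: $\left\| \begin{bmatrix} 1&1&1 \\ 0&1&0 \end{bmatrix} \right\| = 3$.
• Number of 0's/1's. Notation: $\sigma_0(\alpha), \sigma_1(\alpha), \sigma_0(A), \sigma_1(A)$. The number of 0's/1's in a column or matrix. Example: $\sigma_0\left( \begin{bmatrix} 1&0 \\ 0&0 \end{bmatrix} \right) = 3$, $\sigma_1\left( \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right) = 2$.
• Column of 0's/1's. Notation: $0_m, 1_m$. An $m$-rowed column of 0's/1's. Example: $0_3 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, $1_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.
• Complement. Notation: $F^c$. Replace 1's by 0's and 0's by 1's. Example: $\begin{bmatrix} 1&0&1 \\ 0&0&1 \end{bmatrix}^c = \begin{bmatrix} 0&1&0 \\ 1&1&0 \end{bmatrix}$.
• Concatenation. Notation: $[A|B]$. The columns of $A$ together with those of $B$. Example: $\left[ \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} \middle| \begin{bmatrix} 0&0 \\ 0&1 \end{bmatrix} \right] = \begin{bmatrix} 1&0&0&0 \\ 0&1&0&1 \end{bmatrix}$.
• Product. Notation: $t \cdot F$. The concatenation $[F \,|\, F \,|\, \cdots \,|\, F]$ of $t$ copies of $F$. Example: $2 \cdot \begin{bmatrix} 1&0 \\ 1&1 \end{bmatrix} = \begin{bmatrix} 1&0&1&0 \\ 1&1&1&1 \end{bmatrix}$.
• Product. Notation: $F \times G$. The configuration that consists of all combinations of each column of $F$ on top of each column of $G$. Example: $A = \begin{bmatrix} 0&1&1 \\ 0&0&1 \end{bmatrix}$, $B = \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} \implies A \times B = \begin{bmatrix} 0&0&1&1&1&1 \\ 0&0&0&0&1&1 \\ 1&0&1&0&1&0 \\ 0&1&0&1&0&1 \end{bmatrix}$.
• Multiplicity. Notation: $\lambda(\alpha, A)$. Number of columns of $A$ which are equal to $\alpha$. Example: $\lambda\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0&1&1&1 \\ 1&0&1&0 \end{bmatrix} \right) = 2$.
• Difference. Notation: $A - B$. Columns of $A$ minus columns of $B$. Example: $\begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} - \begin{bmatrix} 1&1 \\ 0&1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
• Restriction. Notation: $A|_S$. Rows $S$ of $A$ put together. Example: $A = \begin{bmatrix} 0&0&1&1 \\ 0&1&0&1 \\ 0&0&1&0 \end{bmatrix} \implies A|_{\{1,3\}} = \begin{bmatrix} 0&0&1&1 \\ 0&0&1&0 \end{bmatrix}$.
• Identity. Notation: $I_m$. 0's everywhere except for the diagonal. Example: $I_3 = \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix}$.
• Identity Complement. Notation: $I_m^c$. 1's everywhere except for the diagonal. Example: $I_3^c = \begin{bmatrix} 0&1&1 \\ 1&0&1 \\ 1&1&0 \end{bmatrix}$.
• Tower. Notation: $T_m$. 1's on the upper-right corner and 0's on the lower-left corner. Example: $T_3 = \begin{bmatrix} 0&1&1&1 \\ 0&0&1&1 \\ 0&0&0&1 \end{bmatrix}$.
• Complete. Notation: $K_m$. All possible columns. Example: $K_2 = \begin{bmatrix} 0&1&0&1 \\ 0&0&1&1 \end{bmatrix}$.
• Complete of sum $k$. Notation: $K_m^k$. All columns of column sum $k$. Example: $K_3^2 = \begin{bmatrix} 1&1&0 \\ 1&0&1 \\ 0&1&1 \end{bmatrix}$.
• Subconfiguration. Notation: $F \prec G$. There is a row and column permutation of $F$ which is a submatrix of $G$. Example: $\begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} \prec \begin{bmatrix} 0&0&1 \\ 0&1&1 \\ 1&1&0 \end{bmatrix}$, but $\begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} \nprec \begin{bmatrix} 1&1&1 \\ 1&0&1 \\ 0&0&1 \end{bmatrix}$.
• Avoid. Notation: $\mathrm{Avoid}(m, \mathcal{F})$. The set of $m$-rowed simple matrices which avoid all $F \in \mathcal{F}$. Example: $\mathrm{Avoid}\left(2, \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0&1 \\ 0&1 \end{bmatrix} \right\}$.
• Forb. Notation: $\mathrm{forb}(m, \mathcal{F})$. Maximum number of columns of a matrix in $\mathrm{Avoid}(m, \mathcal{F})$. Example: $\mathrm{forb}\left(m, \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = 2$.
• Ext. Notation: $\mathrm{ext}(m, \mathcal{F})$. Set of matrices in $\mathrm{Avoid}(m, \mathcal{F})$ with exactly $\mathrm{forb}(m, \mathcal{F})$ columns. Example: $\mathrm{ext}\left(2, \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \left\{ \begin{bmatrix} 0&1 \\ 0&1 \end{bmatrix} \right\}$.
• Laminar Configuration. Notation: N/A. $\begin{bmatrix} 1&1 \\ 1&0 \\ 0&1 \end{bmatrix} \nprec A$. Example: $\begin{bmatrix} 1&1&0 \\ 1&0&0 \\ 0&0&1 \end{bmatrix}$.

Table 1.5: Definitions

1.6 History

1.6.1 Extremal Combinatorics and Set Theory

Extremal Combinatorics deals with problems of finding the maximum or minimum size of a combinatorial class. As such, the path taken by the study of Extremal Combinatorics is filled with alleyways, short roads with many stop signs and quite a few landmarks.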
The reason for the sinuous nature of this field is that coming up with extremal combinatorial problems is extremely easy, but coming up with answers to said problems often turns out to be very hard. Very similar problems can have vastly different solutions.

It is impossible to pinpoint exactly when such questions were first studied, but some (like [Juk01]) argue it was Euler who first studied such problems systematically. The last few decades, however, have seen an explosion of problems of this ilk, one of the reasons being that the hungry monster of Computer Science has posed and keeps posing many such questions with practical applications.

Extremal Combinatorics (at least maximization problems) often involves a race to provide better and better upper bounds (which are generally accomplished by theoretical arguments), and better and better lower bounds (which are generally accomplished by constructions). The field of Forbidden Configurations is no different, and we often try to "sandwich" $\mathrm{forb}(m, \mathcal{F})$ by providing simple matrices with no configuration $F \in \mathcal{F}$ and then using a theoretical argument for the upper bound.

Many different problems have been studied by a diverse collection of mathematicians. Here we mention a few examples of problems and theorems.

Theorem 1.6.1 [Man07] Let $G = (V, E)$ be a graph on $n$ vertices with the property that there is no triangle. Then $|E| \le n^2/4$.

A generalization came from Turán.

Theorem 1.6.2 (Turán's theorem) [Tur41] Let $G = (V, E)$ be a graph on $n$ vertices such that $G$ does not contain a subgraph isomorphic to $K_k$. Then
\[ |E| \le \frac{k-2}{k-1} \cdot \frac{n^2}{2} = \left(1 - \frac{1}{k-1}\right) \cdot \frac{n^2}{2}. \]

A further generalization is due to Erdős and Stone, asking for the maximum number of edges that avoids a given graph $H$. This question was answered asymptotically by Erdős, Stone, and Simonovits.
Theorem 1.6.3 [ES46] [ES66] (Erdős-Stone-Simonovits) Given a number $n$ and a graph $H$, let $\mathrm{forb}(n, H)$ denote the maximum number of edges of a graph on $n$ vertices with no $H$ as a subgraph. Let $\chi(H)$ denote the chromatic number of $H$. Then for every $\epsilon > 0$ there exists $N$ such that for all $n \ge N$ we have
\[ \left(1 - \frac{1}{\chi(H) - 1} - \epsilon\right) \frac{n^2}{2} \le \mathrm{forb}(n, H) \le \left(1 - \frac{1}{\chi(H) - 1} + \epsilon\right) \frac{n^2}{2}. \]

The related problem of Zarankiewicz asks for the maximum number of edges in a subgraph of a given complete bipartite graph that avoids a smaller given complete bipartite graph. A bound was given by Kővári, Sós and Turán ([KST54]). A better bound was given by Füredi in [Für96]. We use this result in our investigations in Chapter 7.

There is a rich class of results involving intersecting families (i.e. every two sets in the family intersect). The following are some highlights.

Theorem 1.6.4 Let $\mathcal{A}$ be an intersecting family on $[n]$. Then $|\mathcal{A}| \le 2^{n-1}$.

Theorem 1.6.5 [EKR61] (Erdős, Ko, Rado) Let $\mathcal{A}$ be an intersecting $k$-uniform family on $[n]$, with $n \ge 2k$. Then
\[ |\mathcal{A}| \le \binom{n-1}{k-1}. \]

A generalization of this theorem to $t$-intersecting families is due to Ahlswede and Khachatrian ([AK97]), who completely characterized the extremal families. A stability result by Anstee and Keevash ([AK06]) was obtained for certain values of the parameters. This was used to establish the asymptotics of $\mathrm{forb}(m, F)$ for $k \times 2$ configurations $F$.

Here is a foundational antichain result.

Theorem 1.6.6 [Spe28] (Sperner's Theorem) Let $\mathcal{A}$ be a set system on $[n]$ for which there are no distinct $A, B \in \mathcal{A}$ such that $A \subseteq B$. Then
\[ |\mathcal{A}| \le \binom{n}{\lceil n/2 \rceil}. \]

The following problem is one of many in the field of Ramsey Theory. Many generalizations of this problem also exist and there are many interesting results.

Problem 1.6.7 [Ram30] Given a number $k$, find the maximum number of vertices $n$ for a graph $G$ such that neither $G$ nor $G^c$ contains $K_k$.

By [Ram30], it is known that such a number $n$ exists for any $k$. The full Ramsey Theorem considers colouring all $t$-sets of $[n]$ using $\ell$ colours.
The Theorem states that for every $k$ there is $n$ such that any such colouring on $n$ vertices has a full monochromatic $k$ 'clique'. We may use this, together with the idea of Section 2.2, to conclude that for $m$ large enough there is a clique (of a fixed wanted size) of rows of the matrix $A$ for which the same possibility of "What Is Missing" occurs.

We now give a very small sampling of extremal problems in many other areas of combinatorics. In graph theory,

Problem 1.6.8 Given a graph $G$, what is the minimum number of colours one needs in order to colour the vertices of $G$ so that two vertices that share an edge do not have the same colour?

It is well known that 4 colours are enough for any planar graph [AHK77]. For general graphs, the number of colours needed is called the chromatic number.

In additive combinatorics,

Problem 1.6.9 [TV06] Given an Abelian group $G$, what is the maximum size of a sum-free set $A \subseteq G$? (Sum-free means there are no $a, b, c \in A$ with $a + b = c$.)

In computational geometry,

Problem 1.6.10 [O'R87] (Art Gallery Problem) Given a polygon $P$, what is the minimum number of lights needed to fully illuminate the interior of the polygon?

As the reader might have realized by now, there are probably more problems in extremal combinatorics than there are mathematicians (find the minimum number of mathematicians to state all extremal combinatorics problems?) and so we could go on and on.

1.6.2 Forbidden Configurations

The field of Forbidden Configurations began when Vapnik and Chervonenkis, Sauer, Perles and Shelah independently proved Theorem 1.3.29, each for different purposes. The theorem was proved first by Vapnik and Chervonenkis in 1968 in [VC68]. Sauer and Bollobás attribute the original problem to Erdős.
This theorem and the related concepts of VC-dimension and shattered sets have applications in fields as diverse as Machine Learning and Pattern Recognition ([BEHW89], [Vap00], [WD81]), Probability ([Ste78]), Combinatorial Geometry ([Ver05], [Mat02]), Extremal Set Theory ([MZ07], [BKS05]), etc.

The general question, "For a {0,1}-matrix $F$, what is the maximum number of columns a simple {0,1}-matrix $A$ can have so that $A$ doesn't contain a row and column permutation of $F$ as a submatrix?", was first posed by Anstee in [Ans85]. Since then, many people have studied this question. The first paper studying this problem in a systematic way is [AGS97]. Both asymptotic and exact bounds were considered for small forbidden configurations (small in the number of rows). A large number of exact bounds have been proven in [AFS01], [AK07], [ABS11], [AK10]. A particular $4 \times 2$ configuration is given in [ABS11] for which finding an exact bound is highly unlikely. A clearer explanation is in [AK10]. In a private communication, Dukes has obtained tight bounds on the leading coefficient of the associated quadratic bound.

Conjecture 1.4.1 arose from the various investigations of asymptotic bounds. Product constructions were introduced in [AGS97]. The importance of the three building blocks $I$, $I^c$ and $T$ was also noted in results by Balogh and Bollobás [BB05]. The reader might note that $I$, $I^c$ and $T$ arise repeatedly in proofs, such as in the results of Section 5.3, and product constructions arise even in exact bounds, such as in the results of Section 4.1.3. Conjecture 1.4.1 was first stated in [AS05], but Anstee already believed that forbidden configuration bounds would be $\Theta(m^k)$ for some integer $k$ (that depended on $F$) long before that. Since then, much of the progress in asymptotic forbidden configurations has been guided by this conjecture.

Chapter 2 Basic Techniques

We have a number of powerful techniques that work in many cases and that we will use repeatedly in this thesis.
A careful look at the particular properties of each family of configurations we wish to study is typically required. We will describe some of these techniques in generality here and we will refer to them when used to solve particular examples.

2.1 Standard Induction

Let $\mathcal{F}$ be a family of configurations and let $A \in \mathrm{Avoid}(m, \mathcal{F})$. We wish to do induction on $m$, but if we delete a row $r$ from $A$, we might run into trouble: the resulting matrix after the deletion might not be simple. Let $C_r = C_r(A)$ be the matrix that consists of the $(m-1)$-rowed columns that have multiplicity 2 in the matrix that results from deleting row $r$ from $A$. If we permute the rows and columns of $A$ (i.e. we take a representative of the configuration $A$) so that $r$ becomes the first row, the columns of $A \setminus \{\text{row } r\}$ can be divided into four blocks:
\[ A = \begin{array}{c} r \rightarrow \\ {} \end{array} \left[ \begin{array}{cc} 0 \cdots 0 & 1 \cdots 1 \\ B_r \;\; C_r & C_r \;\; D_r \end{array} \right], \tag{2.1.1} \]
where $B_r = B_r(A)$ are the columns that appear with a 0 on row $r$ but don't appear with a 1, $D_r = D_r(A)$ are the columns that appear with a 1 but not a 0, and $C_r$ are the columns that appear with both. We call this the standard decomposition of $A$.

Note that $[B_r | C_r | D_r]$ is simple, and since $F \nprec A$ for every $F \in \mathcal{F}$, we have that $F \nprec [B_r | C_r | D_r]$. So $[B_r | C_r | D_r] \in \mathrm{Avoid}(m-1, \mathcal{F})$. This means any upper bound on $\|C_r\|$ (as a function of $m$) automatically yields an inductive upper bound on $\|A\|$:
\[ \|A\| \le \mathrm{forb}(m-1, \mathcal{F}) + \|C_r\|. \]
We now study the structure of $C_r$. Note that $[0\ 1] \times C_r$ is in $A$.

Definition 2.1.1 We say a configuration $H$ is an inductive child of a configuration $F$ if
\[ F \prec \begin{bmatrix} 0 \cdots 0 & 1 \cdots 1 \\ H & H \end{bmatrix}. \]

Note that if $H$ is an inductive child of $F$ and $F \nprec A$, then $H \nprec C_r$. Therefore if we define $\mathcal{H}_F$ to be the minimal elements of the family of inductive children of $F$ under the order $\prec$, we know $C_r \in \mathrm{Avoid}(m-1, \mathcal{H}_F)$.

Proposition 2.1.2 Let $\mathcal{F}$ be a family of configurations. Then
\[ \mathrm{forb}(m, \mathcal{F}) \le \sum_{i=1}^{m-1} \mathrm{forb}(i, \mathcal{H}_{\mathcal{F}}). \]

Proof: Let $A \in \mathrm{ext}(m, \mathcal{F})$.
Then we know that for any row $r$,
\[ \mathrm{forb}(m, \mathcal{F}) = \|A\| = \|C_r\| + \|[B_r | C_r | D_r]\| \le \|C_r\| + \mathrm{forb}(m-1, \mathcal{F}) \le \mathrm{forb}(m-1, \mathcal{H}_{\mathcal{F}}) + \mathrm{forb}(m-1, \mathcal{F}) \le \sum_{i=1}^{m-1} \mathrm{forb}(i, \mathcal{H}_{\mathcal{F}}), \]
where the last step follows by induction on $m$. This concludes the proof.

Since this proposition gives an upper bound on $\mathrm{forb}(m, \mathcal{F})$ with respect to $\mathrm{forb}(m, \mathcal{H}_{\mathcal{F}})$, and $\mathcal{H}_{\mathcal{F}}$ consists of smaller configurations (but perhaps more of them), we may repeat this process and find a bound instead on $\mathrm{forb}(m, \mathcal{H}_{\mathcal{H}_{\mathcal{F}}})$, and so on.

Given $\mathcal{F}$, constructing $\mathcal{H}_{\mathcal{F}}$ by hand is easy. We have a computer program that does this for us, but it almost never gets used since it's so easy to do it by hand. The inductive children of a configuration must have either the same number of rows as the original, or one less. Then we must look at columns which get "repeated". Perhaps an example would help to clarify this.

Example 2.1.3 The inductive children of
\[ F = \begin{bmatrix} 0&0&1&1 \\ 0&0&1&1 \\ 1&1&0&1 \end{bmatrix} \]
are
\[ \begin{bmatrix} 0&0&1&1 \\ 1&1&0&1 \end{bmatrix}, \quad \begin{bmatrix} 0&0&1 \\ 0&0&1 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 0&1&1 \\ 0&1&1 \\ 1&0&1 \end{bmatrix}. \]
These are obtained by deleting a row of $F$ and taking only one copy of each column that appears with both a 0 and a 1 in the deleted row of $F$, as well as by taking the repeated columns of $F$ only once.

Sometimes this bound is not enough because $\mathrm{forb}(m, \mathcal{H}_{\mathcal{F}})$ is too large. What we actually need in order to proceed as above by induction is that $\|C_r(A)\|$ is small for some choice of $r$, not necessarily that $\mathrm{forb}(m, \mathcal{H}_{\mathcal{F}})$ is small. Often it is the case that we must search for the row $r$ with the least amount of repetition, one for which $\|C_r(A)\|$ is as small as possible. If we can prove that for every $A \in \mathrm{Avoid}(m, \mathcal{F})$ there must always be a row $r$ with $\|C_r(A)\|$ being "small enough", we may appeal to induction and proceed as above.

There is another method we might use when $\|C_r(A)\|$ is too large. We may delete a limited number of columns (without deleting any row) before proceeding to do induction.
For example, if $A \in \mathrm{Avoid}(m, \mathcal{F})$, we might select some columns $U$ from $A$, so that $A = [A' \,|\, U]$, with the property that $\|U\|$ is small and $\|C_r(A')\|$ is also small (and therefore we might do induction on $A'$).

In this thesis we also use a new method in order to prove the results of Section 5.3 and Section 7.6 that involves considering only "essential" rows (in some sense).

Definition 2.1.4 For a matrix $A$ and a row $r$, let $L(r)$ be a minimal subset of the rows of $C_r(A)$ such that $C_r|_{L(r)}$ is a simple configuration. This involves some choice. For each row $r$ we fix this choice.

This concept turns out to be quite useful in our investigations, and it arose by considering deleting "non-essential" rows of $C_r(A)$ and looking at the structure of the remaining rows.

In the next section we will give an alternate way to study the structure of a matrix $A \in \mathrm{Avoid}(m, \mathcal{F})$.

2.2 What Is Missing?

Let $\mathcal{F}$ be a (finite) family of forbidden configurations and let $t$ be the maximum multiplicity of any column in any $F \in \mathcal{F}$. In other words,
\[ t := \max\{\lambda(\alpha, F) : \alpha \text{ is a column, and } F \in \mathcal{F}\}. \]
Suppose we have a {0,1}-matrix $A \in \mathrm{Avoid}(m, \mathcal{F})$. Given $s \in \mathbb{N}$, consider all $s$-tuples of rows from $A$, and for each $s$-tuple of rows $S$, consider the matrix $A|_S$ formed from rows $S$ of $A$ as in Definition 1.3.10.

Without any restriction, $A|_S$ could have all $2^s$ possible columns, each with "high" multiplicity. But with the restriction that $F \nprec A$ for each $F \in \mathcal{F}$, in particular $F \nprec A|_S$, so some columns have to be missing: we can't have every one of the $2^s$ columns appearing $t$ or more times, or else we would definitely have all $F \in \mathcal{F}$ as subconfigurations. That is, some columns must have multiplicity less than $t$ (possibly 0). Perhaps an example might clarify this idea.

Suppose $\mathcal{F}$ consists of a single configuration
\[ F := \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \,\middle|\, 2 \cdot \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right] = \begin{bmatrix} 1&1&1 \\ 0&1&1 \\ 0&0&0 \end{bmatrix}, \]
so in this case, $t = 2$. Let
\[ A = \begin{bmatrix} 0&0&1&0&1&0&0&1 \\ 0&0&0&1&1&1&0&1 \\ 0&1&1&0&0&1&1&1 \\ 0&0&0&1&1&0&1&1 \end{bmatrix}. \]
For $S = \{2, 3, 4\}$ we see that
\[ A|_S = \begin{bmatrix} 0&0&0&1&1&1&0&1 \\ 0&1&1&0&0&1&1&1 \\ 0&0&0&1&1&0&1&1 \end{bmatrix}. \]
In $A|_S$, the columns $(1,0,0)^T$ and $(0,0,1)^T$ have multiplicity 0, the columns $(0,0,0)^T$, $(1,1,1)^T$, $(0,1,1)^T$ and $(1,1,0)^T$ have multiplicity 1, and the columns $(0,1,0)^T$ and $(1,0,1)^T$ have multiplicity 2. Now imagine what would happen if all columns had multiplicity 2 in $A|_S$ instead. Then $F$ would be a subconfiguration of $A|_S$, which would be a contradiction. So some columns must have multiplicity less than 2. This motivates the following definition.

Definition 2.2.1 Given a matrix $A$, a number $s \in \mathbb{N}$ and an $s$-tuple $S$ of the rows of $A$, we say an $s$-rowed column $\alpha$ is absent if $\lambda(\alpha, A|_S) = 0$. We say it is in short supply if $\lambda(\alpha, A|_S) < t$. We say $\alpha$ is in long supply if $\lambda(\alpha, A|_S) \ge t$.

In the example, we can conclude that for any $A \in \mathrm{Avoid}(m,F)$, in each triple of rows $(a,b,c)$ of $A$ there is an ordering $(x,y,z)$ of $(a,b,c)$ for which the columns marked “no” must be absent, the columns marked $< 2$ must be in short supply and the rest may potentially be in long supply:

\[
\begin{array}{c|ccc}
 & \text{no} & \text{no} & \text{no} \\ \hline
x & 1 & 0 & 0 \\
y & 0 & 1 & 0 \\
z & 0 & 0 & 1
\end{array}
\quad\text{or}\quad
\begin{array}{c|ccc}
 & < 2 & < 2 & < 2 \\ \hline
x & 1 & 1 & 0 \\
y & 1 & 0 & 1 \\
z & 0 & 1 & 1
\end{array}
\quad\text{or}\quad
\begin{array}{c|cccc}
 & \text{no} & \text{no} & < 2 & < 2 \\ \hline
x & 1 & 0 & 1 & 0 \\
y & 0 & 1 & 0 & 1 \\
z & 0 & 0 & 1 & 1
\end{array}
\]

Of course if there are no columns of sum 1 in $A|_S$ (the first case), then $F \nprec A|_S$. The same is true if all columns of sum 2 are in short supply (the second case). The third case might be a little harder to see, but if we take a look at the columns potentially in long supply, we see why:

\[
\underbrace{\begin{array}{cc}
\text{no} & \text{no} \\
1 & 0 \\
0 & 1 \\
0 & 0
\end{array}}_{\text{absent}}
,\quad
\underbrace{\begin{array}{cc}
< 2 & < 2 \\
1 & 0 \\
0 & 1 \\
1 & 1
\end{array}}_{\text{short supply}}
\implies
\underbrace{\begin{array}{cccc}
\text{l.s.} & \text{l.s.} & \text{l.s.} & \text{l.s.} \\
0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1
\end{array}}_{\text{long supply}}
\]

Clearly $F$ is not a subconfiguration, no matter how many times we repeat each column marked as in long supply, as long as each column marked as in short supply appears at most once.

Notice that in our example, since $A$ does not have $F$ as a subconfiguration, in $A|_S$ we see that the third case is the one that occurs, with row order $(x,y,z)$ being $(2,3,1)$, so $A|_S$ satisfies

\[
\begin{array}{cccc}
\text{no} & \text{no} & < 2 & < 2 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0
\end{array}
\]
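The multiplicities quoted for $A|_S$ in this example can be recomputed mechanically. The following C++ fragment is a minimal sketch of that computation and is not taken from the program described in this thesis: the names `Matrix`, `restrictedMultiplicities`, `supply` and `exampleA` are hypothetical, rows are 0-indexed, and a restricted column is encoded as an integer whose most significant bit is the first row of $S$.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A {0,1}-matrix stored row by row; rows and columns are 0-indexed here.
using Matrix = std::vector<std::vector<int>>;

// Returns mult[alpha] = lambda(alpha, A|_S). The s-rowed column alpha is
// encoded as an integer whose most significant bit is the first row of S.
std::vector<int> restrictedMultiplicities(const Matrix& A,
                                          const std::vector<std::size_t>& S) {
    assert(!A.empty() && S.size() < 31);
    std::vector<int> mult(std::size_t{1} << S.size(), 0);
    for (std::size_t col = 0; col < A[0].size(); ++col) {
        std::size_t alpha = 0;
        for (std::size_t row : S) {
            assert(row < A.size());
            alpha = (alpha << 1) | static_cast<std::size_t>(A[row][col]);
        }
        ++mult[alpha];
    }
    return mult;
}

// Labels a multiplicity following Definition 2.2.1 for a threshold t.
// An absent column is also in short supply; the stronger label is reported.
std::string supply(int multiplicity, int t) {
    if (multiplicity == 0) return "absent";
    return multiplicity < t ? "short" : "long";
}

// The 4 x 8 matrix A of the running example.
Matrix exampleA() {
    return {{0, 0, 1, 0, 1, 0, 0, 1},
            {0, 0, 0, 1, 1, 1, 0, 1},
            {0, 1, 1, 0, 0, 1, 1, 1},
            {0, 0, 0, 1, 1, 0, 1, 1}};
}
```

Calling `restrictedMultiplicities(exampleA(), {1, 2, 3})` corresponds to $S = \{2,3,4\}$ in the 1-indexed notation of the text.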
Some effort is required to determine What Is Missing when we avoid all $F \in \mathcal{F}$ for a given $\mathcal{F}$. In the case above, it is straightforward to check that the list of 3 cases is complete. In this particular example $s = 3$ and the number of rows of $F$ was also 3, but $s$ and the number of rows of $F$ do not need to be equal. When $s$ is less than the number of rows of $F$, no column is forced to be absent or in short supply. We can choose $s$ to be anything, but $s$ being the number of rows of $F$ is often used. Larger $s$ provides more information, but there is a trade-off: such information is often harder to analyze.

2.3 Implications

Sometimes while studying the set of possibilities for What Is Missing from a 3-tuple of rows $\{x, y, z\}$, we find that one of the possibilities has this restriction:

\[
\begin{array}{c|cc}
 & < t & < t \\ \hline
x & 0 & 0 \\
y & 1 & 1 \\
z & 0 & 1
\end{array}
\]

Notice that this means in particular that in rows $x, y$ there are at most $2t-2$ columns that have a 0 in row $x$ and a 1 in row $y$:

\[
\begin{array}{c|cc}
 & < t & < t \\ \hline
x & 0 & 0 \\
y & 1 & 1 \\
z & 0 & 1
\end{array}
\implies
\begin{array}{c|c}
 & \le 2t-2 \\ \hline
x & 0 \\
y & 1
\end{array}
\]

This observation motivates the following definition.

Definition 2.3.1 Given a matrix $A$ and a function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$, we say for a row $x$ and a row $y$ that we have the implication $x \to y$ if the following is satisfied on the pair of rows $x, y$:

\[
\begin{array}{c|c}
 & \le f(t) \\ \hline
x & 0 \\
y & 1
\end{array}
\tag{2.3.1}
\]

Notice that $x \to y$ means that if in some column of $A$ there is a 0 in row $x$, then there is also a 0 in row $y$, except in at most $f(t)$ columns.

Definition 2.3.2 We say a column violates an implication $x \to y$ if it has a 0 in row $x$ and a 1 in row $y$.

Thus, $x \to y$ means there are at most $f(t)$ columns that violate the implication. We call implications that never get violated pure implications, and implications that get violated at least once impure implications.

Proposition 2.3.3 Let $A$ be a $\{0,1\}$-matrix and consider the directed graph $G$, where the vertices are the rows, and the arrows are the implications. Suppose we had an implication $x \to y$ and we also had a path of implications $x = x_0 \to x_1 \to x_2 \to \cdots \to x_n = y$.
Then we may conclude that if a column of $A$ violates $x \to y$, it must also violate an implication of the form $x_{i-1} \to x_i$ for some $i$ with $1 \le i \le n$.

Proof: Indeed, if a column $\alpha$ has a 0 in row $x$ and a 1 in row $y$, consider the smallest index $i$ for which $\alpha$ has a 1 in row $x_i$. Such an index exists and satisfies $i \ge 1$, because $\alpha$ has a 0 in row $x_0 = x$ and a 1 in row $x_n = y$. Then the column must have a 0 in row $x_{i-1}$ and therefore violates $x_{i-1} \to x_i$.

Given $A$, using this technique we might be able in some cases to select just some small set $I$ of implications, so that if a column of $A$ violates any implication, then it must necessarily violate an implication in our set $I$. The number of columns violating any implication will then be at most $f(t) \cdot |I|$. The power of implications comes from the fact that if $|I|$ and $f(t)$ are small enough, even if the total number of implications is large, we can delete every column that violates any implication, thus making every implication pure. The result is that columns previously marked as being “in short supply” are now completely absent.

This is a powerful technique, because many times we are able to find an upper bound on $\mathrm{forb}(m,[G \mid H])$ with $G, H$ simple with no columns in common, and we wish to find an upper bound on $\mathrm{forb}(m,[G \mid t \cdot H])$. Considering a matrix $A$ that has no $[G \mid t \cdot H]$, we are able to delete at most $f(t) \cdot |I|$ columns from $A$, and then this new matrix won’t have $[G \mid H]$. So under these assumptions, we are able to conclude that

\[
\|A\| \le f(t) \cdot |I| + \mathrm{forb}(m,[G \mid H]).
\]

Chapter 3
Computer Program Developed

In this chapter we will describe a C++ program we used extensively in the results that follow. We give a description of the algorithms and data structures. Our code performs the following tasks:

Sub-Configuration: Answers the question of whether or not a configuration $F$ is a subconfiguration of a configuration $A$.
Conjecture: Determines the asymptotic bound predicted by Conjecture 1.4.1 for a given configuration $F$. In other words, it finds $X(F)$.
Classification: Once we know how to find $X(F)$, given a certain number of rows, this program finds all boundary cases.
What Is Missing? Given $s$ and a family of configurations $\mathcal{F}$, this program finds the list of What Is Missing in each $s$-tuple of rows if we forbid $\mathcal{F}$.
Finding forb: Given $\mathcal{F}$ and a small value of $m$ (specifically, $m \le 5$), this program computes $\mathrm{forb}(m,\mathcal{F})$.
Guessing forb: For $m = 6, 7, 8, 9$, this program uses local search (in particular, a Genetic Algorithm among others) to guess the extremal matrices $\mathrm{ext}(m,\mathcal{F})$.

In the subsequent sections we’ll deal with all these problems and describe our own approach to these tasks. All these have been implemented in C++ (and some of them in sage as well) and are available for download from http://www.math.ubc.ca/~anstee/

For some of these tasks, complexity is unfortunately doubly exponential in the number of rows (for What Is Missing, Finding forb, Classification), while for others (Sub-Configuration, Conjecture) it is merely exponential in the number of rows. In any case, we only deal with these problems for small configurations. Even for such cases we are often unable to use the information given by the computer to prove results, so better algorithms would not necessarily lead to new results in Forbidden Configurations.

3.1 Representation of a Configuration

We are interested in an efficient representation for configurations, in order to perform the tasks described above. In the course of our investigations, we have had various versions of the program. We describe here how the posted version of the program works; it is archived at the link above. Earlier versions did various things differently, and each subsequent version mainly improved usability and speed.
Each version of the program was tested against many different configurations for correctness, and some tasks were implemented in sage (http://www.sagemath.org/) as well, in order to check the answers when they were too large to check by hand. For the “What Is Missing” calculation it is easy to check that each possibility, if satisfied, indeed implies that the given configurations are avoided, but there is no easy way to check that the list of possibilities is complete.

First of all, we explain how the program stores configurations. Most of what we want the program to do involves performing a huge number of configuration comparison operations, that is, testing whether or not a configuration $F$ is a subconfiguration of a configuration $A$. As a first approach it would seem as if, for this task, we would be required to test each row and column permutation of $F$ against each submatrix of $A$. This is of course very slow.

A simple trick to speed up the computations is to keep the columns of a configuration always in some canonical order. Then, to test whether or not a configuration $F$ is contained in another configuration $A$, we just need to permute rows of $F$, take subsets $S$ of rows of $A$ and place the columns of $A|_S$ in canonical order.

Most of the tasks we described above involve checking whether or not a given (fixed) configuration $F$ is a subconfiguration of a vast number of configurations $A$. In particular, any pre-processing we do on $F$ can be considered as almost free. For example, finding all row permutations of $F$ and storing them would need to be done once for each configuration $F$, and not at all for configurations $A$.

After many attempts, it seems that the best (fastest) way to store a configuration that would make many of the other tasks reasonably fast is this: Maintain an array of integers where the indices of the array, written in binary, are the columns of the configuration, and the actual numbers of the array represent the number of times a column appears.
That is to say, a configuration $F$ in $m$ rows is represented by an array (C++ vector) F of size $2^m$. For a number $\alpha$, consider the binary representation of $\alpha$ and consider it as a column with $m$ rows. If necessary, put enough 0’s at the beginning of the binary representation in order to have the required $m$ bits. The number F[$\alpha$] (the $\alpha$-th number of the array) represents the number of times that column $\alpha$ appears in configuration $F$.

In the implementation, we use an array of unsigned characters instead of integers, since we never need a configuration with the same column repeated more than 255 times. An unsigned character consists of 1 byte (8 bits).

For example, the array F = [1, 0, 0, 2, 0, 1, 0, 1] represents the following configuration (notice it has 3 rows, since the array has size $8 = 2^3$):

\[
F = \begin{bmatrix}
0 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 1 & 1
\end{bmatrix}.
\]

To see this, remember we start from 0. There is a one in position $0 = 000_b$, meaning the column $(0,0,0)^T$ appears once. There is a two in position $3 = 011_b$, a one in position $5 = 101_b$ and a one in position $7 = 111_b$. The columns of this matrix are the representations of these numbers in binary form.

An observant reader might complain that this has the disadvantage that it requires storing $2^m$ bytes, and if $F$ doesn’t have many columns, most of those will be 0’s. But it’s a minor disadvantage, because even at 10 rows we would only need 1024 bytes, and we usually have configurations for which the number of rows is 5 or less (32 bytes). Perhaps this would become more of an issue with configurations with a high number of rows, but for those configurations, most of our tasks would require too much time to be of any practical use.

We came to this representation after an implementation which represented columns as an array of bits (C++ bitset) and stored them in an ordered tree-like structure (C++ STL multiset).
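The count-array representation described in this section can be illustrated with a short C++ fragment. This is only a sketch under stated assumptions, not the posted program: the type alias `Configuration` and the helpers `fromColumns` and `toColumns` are hypothetical names, and the top row of a column is taken to be the most significant bit of its index, which agrees with the example F = [1, 0, 0, 2, 0, 1, 0, 1].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[alpha] = number of times column alpha appears; the size is 2^m.
using Configuration = std::vector<unsigned char>;

// Builds the count array from a list of m-rowed columns, each given top to bottom.
Configuration fromColumns(std::size_t m,
                          const std::vector<std::vector<int>>& columns) {
    Configuration counts(std::size_t{1} << m, 0);
    for (const auto& column : columns) {
        assert(column.size() == m);
        std::size_t alpha = 0;
        for (int bit : column)
            alpha = (alpha << 1) | static_cast<std::size_t>(bit);
        assert(counts[alpha] < 255);  // one byte per column, as in the text
        ++counts[alpha];
    }
    return counts;
}

// Expands the count array back into an explicit list of columns, in index order.
std::vector<std::vector<int>> toColumns(std::size_t m,
                                        const Configuration& counts) {
    assert(counts.size() == (std::size_t{1} << m));
    std::vector<std::vector<int>> columns;
    for (std::size_t alpha = 0; alpha < counts.size(); ++alpha) {
        std::vector<int> column(m);
        for (std::size_t row = 0; row < m; ++row)
            column[row] = static_cast<int>((alpha >> (m - 1 - row)) & 1u);
        for (unsigned char k = 0; k < counts[alpha]; ++k)
            columns.push_back(column);
    }
    return columns;
}
```

Because columns are recovered in index order, `toColumns` also yields a canonical column order for free, which is the property the comparison of configurations relies on.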
The bitset-and-multiset implementation might be more natural, but profiling the code made clear that the program was spending most of its time counting how many columns of a certain type appeared in a configuration, and was also spending a considerable amount of time navigating the tree. Explicitly storing the number of times each column appears, and making that number instantly accessible by storing it in an array (for random access), gives a very noticeable speedup and allows us to consider larger problems.

By representing columns as numbers, we can do a lot of preprocessing and compute large tables with almost instant access time. For example, consider the following problem, which has to be solved many times for our tasks: Given a column $\alpha$ and a subset $S$ (represented by an integer as written in binary), what column is $\alpha|_S$? This is relatively slow to compute, but we can fill out a table by preprocessing to speed up any further access to it. Since we do this a few million times, the investment is sound.

The other advantage is that it becomes immediately clear how to compare two configurations with the same number of rows to see if one is, up to a column permutation, a submatrix of the other: check whether for any column (index) $c$ the integer at position $c$ of the first array is bigger than that of the second array. If there is no such index, the first configuration is contained in the second. To check if $F$ is a subconfiguration of $A$, we also need all permutations of the rows of $F$ (which we need to find just once per configuration).

3.2 Subconfigurations

Suppose $F$ and $A$ are configurations and we want to decide if $F \prec A$. Then for every $s$-tuple of rows $S$ (where $s$ is the number of rows of $F$), we can extract from $A$ the configuration $A|_S$ easily with our pre-stored table of columns and subsets. Once we’ve done this for each column of $A$ and found $A|_S$, then for each permutation of the rows of $F$, we check whether every column $\alpha$ appears in the array of $F$ at most as many times as it appears in the array of $A|_S$.
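The comparison just described can be sketched in C++ as follows. This is a hedged illustration rather than the thesis program: all names are hypothetical, counts are stored as `int` instead of one byte so that merged columns cannot overflow, no precomputed table is used, and instead of permuting the rows of $F$ the sketch tries every ordering of the selected rows of $A$, which is equivalent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Configuration = std::vector<int>;  // counts[alpha], size 2^rows

// Number of rows of a configuration stored as a count array of size 2^rows.
std::size_t rowCount(const Configuration& c) {
    std::size_t rows = 0;
    while ((std::size_t{1} << rows) < c.size()) ++rows;
    assert((std::size_t{1} << rows) == c.size());
    return rows;
}

// Restricts A to the ordered row list `rows` (0 = top row of A).
Configuration restrictTo(const Configuration& A,
                         const std::vector<std::size_t>& rows) {
    const std::size_t m = rowCount(A);
    Configuration result(std::size_t{1} << rows.size(), 0);
    for (std::size_t alpha = 0; alpha < A.size(); ++alpha) {
        std::size_t beta = 0;
        for (std::size_t r : rows)
            beta = (beta << 1) | ((alpha >> (m - 1 - r)) & 1u);
        result[beta] += A[alpha];
    }
    return result;
}

// True if every column occurs in F at most as often as in B (same row count).
bool columnwiseContained(const Configuration& F, const Configuration& B) {
    assert(F.size() == B.size());
    for (std::size_t c = 0; c < F.size(); ++c)
        if (F[c] > B[c]) return false;
    return true;
}

// Tests F ≺ A by trying every ordered s-tuple of distinct rows of A.
bool isSubconfiguration(const Configuration& F, const Configuration& A) {
    const std::size_t s = rowCount(F), m = rowCount(A);
    if (s > m) return false;
    std::vector<int> chosen(m, 0);
    std::fill(chosen.begin(), chosen.begin() + static_cast<std::ptrdiff_t>(s), 1);
    do {
        std::vector<std::size_t> rows;
        for (std::size_t r = 0; r < m; ++r)
            if (chosen[r]) rows.push_back(r);
        do {  // rows starts sorted, so next_permutation visits every order
            if (columnwiseContained(F, restrictTo(A, rows))) return true;
        } while (std::next_permutation(rows.begin(), rows.end()));
    } while (std::prev_permutation(chosen.begin(), chosen.end()));
    return false;
}
```

The early exits mentioned in the text (comparing the number of 1’s, columns and rows) are omitted here to keep the sketch short.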
We can check every subset $S$ like this. If at any point the check succeeds, we can return true.

There are a few speedups. Sometimes it’s immediately obvious that a configuration can’t be contained in another. For example, if there are more 1’s in $F$ than in $A$, or if $F$ has more columns or rows than $A$, then $F \nprec A$.

3.3 Determining X(F)

Given a configuration $F$, we wish to find $X(F)$. In other words, we wish to find the conjectured asymptotic bound for $\mathrm{forb}(m,F)$. We may make a simplification using Lemma 1.4.2 and assume the multiplicity of any column of $F$ is at most 2.

First, suppose we wanted to test whether or not there exists $r$ such that configuration $F$ is contained in the product

\[
P_r(a,b,c) = \underbrace{I_r \times \cdots \times I_r}_{a \text{ times}} \times \underbrace{I^c_r \times \cdots \times I^c_r}_{b \text{ times}} \times \underbrace{T_r \times \cdots \times T_r}_{c \text{ times}}.
\]

Building this object with $r = R$ as calculated in Lemma 1.4.2 would be prohibitively slow. Instead, we build a set $\mathcal{X}$ of subconfigurations from $P_R(a,b,c)$ such that if $F \prec P_R(a,b,c)$, then $F \prec X$ for some $X \in \mathcal{X}$.

Notice that if $F \prec P_R(a,b,c)$, then the rows of $F$ get “partitioned” into $a+b+c$ parts (a part can be empty), where each part belongs to a factor of the product $P_R(a,b,c)$. Because of Lemma 1.4.2, we can assume each column appears at most twice. Given $s \in \mathbb{N}$, consider the following matrices:

\[
A_I(s) = [\, 0_s \;\; 0_s \;\; I_s \,], \qquad A_{I^c}(s) = [\, 1_s \;\; 1_s \;\; I^c_s \,], \qquad A_T(s) = [\, T_s \;\; T_s \,].
\]

We see that an $s$-rowed configuration $F$ with each column repeated at most twice is contained in $I_m$ for some large $m$ if and only if $F \prec A_I(s)$. We can then consider all partitions of the rows of $F$ and see if each part is contained in the corresponding $A_I$, $A_{I^c}$ or $A_T$. For example, to test whether

\[
F = \begin{bmatrix}
1 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1
\end{bmatrix}
\]

is contained in $I \times T \times T$, we would partition the rows of $F$ into three parts. In this case, $F$ has five rows, so consider, for example, the following partition of 5: $(2, 2, 1)$.
Consider the following representatives:

\[
A_I(2) = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad
A_T(2) = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}, \qquad
A_T(1) = \begin{bmatrix} 0 & 0 & 1 & 1 \end{bmatrix},
\]

and build a 5-rowed matrix $A := A_I(2) \times A_T(2) \times A_T(1)$. If $F \prec A$, then $F \prec I \times T \times T$. If we do this for every possible partition of 5, we get the desired result.

To find $X(F)$ we can build a tree of possibilities of products and return the largest for which $F$ is not a subconfiguration of a product of this form, observing that if $F$ is a subconfiguration of a product $P_r(a,b,c)$, then it will be a subconfiguration of any product $P_r(a',b',c')$ with $a' \ge a$, $b' \ge b$ and $c' \ge c$.

3.4 Boundary Cases to Classify Configurations

We wish to find all boundary cases (Definition 1.4.3) for a given number of rows $s$ and a number $k$ using the computer. We use the program described in the previous section to find $X(F)$ for many different $F$.

The method we use is very straightforward: start adding columns, one by one. Build a tree of configurations, where the children of a configuration $F$ are the ones that consist of $F$ plus a column. Find $X(F)$ for each configuration in the tree. Store all those configurations $F$ for which $X(F) = k$, and then find only the maximal and minimal configurations in the ordering $\prec$. If at some point we add a column and the bound jumps to $k+1$ or higher, discard and go to the next configuration. Columns will be repeated at most twice because of Lemma 1.4.2.

Unfortunately this method is very slow because the same configuration is searched multiple times, since each configuration may have many representatives. To get rid of repetition we might check for equivalence of configurations against everything we have stored so far. But checking if two configurations are equivalent is usually slow, so doing it every time is also very slow.

What seems to work best is to check for equivalence, but only up to a point. For example, we can consider all pairs of columns, test those and only take one representative of each equivalence class.
Then from each pair, start building the tree as described above. After that, it will be relatively unlikely that two configurations we search are equivalent, so the amount of repetition will be relatively low. Of course, much repetition will still occur, but much less than in the original tree.

Notice that it isn’t as critical that classifying configurations be a fast calculation. We need to do it once for every $s$ and $k$, but no more. Once we know the maximal and minimal quadratics for five rows, we never need to calculate them again. Other calculations, such as finding $X(F)$ or What Is Missing, have to be performed for each configuration (or family) we wish to study, so decreasing the running time is more of a priority in those cases, since faster programs allow us to study bigger configurations.

3.5 Finding What Is Missing and Forbidden

We describe the part of the program whose input is a family of configurations $\mathcal{F}$ and a number $s$, and whose output is the list of possibilities for columns absent or in short supply. This computation has been the most useful for the thesis.

To find a list of possibilities of What Is Missing in $s$ rows after forbidding every $F \in \mathcal{F}$, we present the problem in a complementary way and instead find what can be present. To find this we may start with $S_t = t \cdot K_s$ (where $t$ is the maximum multiplicity of a column in any configuration of $\mathcal{F}$) and remove columns from it doing breadth-first search, to find all the maximal configurations $A$ for which $F \nprec A$ for every $F \in \mathcal{F}$. Then the $\{0,1\}$-complements of such $A$’s are the list of What Is Missing.

This is very slow, since the search space could have $2^{2^s t}$ configurations. But we can speed up the calculations, for example by carefully choosing which columns we start our search with (i.e. removing unnecessary columns from $S_t$). The main speedup for this program comes from choosing a different starting point by considering each column sum separately in the following manner.
For a sequence $(t_0, t_1, \ldots, t_s) \in \mathbb{N}^{s+1}$, consider

\[
S := [\, t_0 \cdot K^0_s \mid t_1 \cdot K^1_s \mid \cdots \mid t_s \cdot K^s_s \,].
\]

Instead of making $t_i = t$, we can make $t_i$ the largest multiplicity with which a column of column sum $i$ appears in any configuration $F \in \mathcal{F}$. Of course, since configurations $F \in \mathcal{F}$ might have fewer than $s$ rows, each column might count for many different column sums, in all the ways it could be filled. In other words, if $F$ has $k$ rows, we consider the $s$-rowed configuration $F \times T_{s-k}$ and then count how many times each column appears. Then for each column sum take the maximum, and then take the maximum over all configurations $F \in \mathcal{F}$.

Perhaps an example would be useful. Suppose $s = 4$ and $\mathcal{F}$ consists of the following two configurations:

\[
F_1 = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}
\quad\text{and}\quad
F_2 = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
\]

We wish to find a starting point $S$ for the search, so we are looking for $t_0, t_1, t_2, t_3$ and $t_4$. As instructed above, since $F_1$ only has 3 rows, it acts as $F_1 \times T_1 = F_1 \times [0 \; 1]$ for the purpose of considering column sums.

We can deduce that $t_0 = 0$, since there are no columns of column sum 0. Since $F_1 \times T_1$ with a 0 on the bottom contains columns of column sum 1, but no repeated columns of column sum 1, we can set $t_1 = 1$. $F_2$ has a column of sum two with multiplicity two, so $t_2 = 2$. And because of $F_1$, we have both $t_3 = 1$ and $t_4 = 1$. Then instead of starting with $2 \cdot K_4$, which has 32 columns, we may start with the following matrix:

\[
S = [\, K^1_4 \mid 2 \cdot K^2_4 \mid K^3_4 \mid K^4_4 \,] =
\left[\begin{array}{*{21}{c}}
1&0&0&0&1&1&1&1&0&0&1&1&0&0&0&0&1&1&1&0&1 \\
0&1&0&0&1&1&0&0&1&1&0&0&1&1&0&0&1&1&0&1&1 \\
0&0&1&0&0&0&1&1&1&1&0&0&0&0&1&1&1&0&1&1&1 \\
0&0&0&1&0&0&0&0&0&0&1&1&1&1&1&1&0&1&1&1&1
\end{array}\right],
\]

which has only $4 + 12 + 4 + 1 = 21$ columns. The search space has size $2^{21}$, which is 2048 times smaller than $2^{32}$. This alone provides a massive reduction for most families $\mathcal{F}$.

Using Lemma 1.4.2, if the multiplicity of a column is 3 or more, we can usually assume it is 2 for our purposes. We only distinguish between columns absent, those in short supply and those in long supply.

Once we have the list of subconfigurations $A$ of $S$ such that $F \nprec A$ for every $F \in \mathcal{F}$, we take only the maximal ones with respect to the order $\prec$.
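The choice of the starting multiplicities $t_i$ can be sketched in C++ as follows. This is an illustrative reading of the rule above, not the thesis code: the names `ColumnList`, `startingMultiplicities` and `startingColumnCount` are hypothetical, and the sketch assumes that padding a $k$-rowed column with the columns of $T_{s-k}$ raises its column sum by every amount from 0 to $s-k$ without changing its multiplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A configuration given explicitly: `rows` rows, columns listed top to bottom
// (a repeated column is simply listed several times).
struct ColumnList {
    std::size_t rows;
    std::vector<std::vector<int>> columns;
};

// t[i] = largest multiplicity of a column that can have column sum i once a
// k-rowed configuration is padded to s rows (the role of F x T_{s-k}).
std::vector<int> startingMultiplicities(const std::vector<ColumnList>& family,
                                        std::size_t s) {
    std::vector<int> t(s + 1, 0);
    for (const ColumnList& F : family) {
        assert(F.rows <= s);
        for (const auto& column : F.columns) {
            assert(column.size() == F.rows);
            const int multiplicity = static_cast<int>(
                std::count(F.columns.begin(), F.columns.end(), column));
            const std::size_t sum = static_cast<std::size_t>(
                std::count(column.begin(), column.end(), 1));
            // Padding with s - k extra rows can add between 0 and s - k ones.
            for (std::size_t i = sum; i <= sum + (s - F.rows); ++i)
                t[i] = std::max(t[i], multiplicity);
        }
    }
    return t;
}

// Number of columns of S = [t_0 K_s^0 | ... | t_s K_s^s]: sum of t_i * C(s, i).
long long startingColumnCount(const std::vector<int>& t) {
    assert(!t.empty());
    const std::size_t s = t.size() - 1;
    long long total = 0, binomial = 1;  // binomial holds C(s, i)
    for (std::size_t i = 0; i <= s; ++i) {
        total += t[i] * binomial;
        binomial = binomial * static_cast<long long>(s - i) /
                   static_cast<long long>(i + 1);
    }
    return total;
}
```

For the family $\{F_1, F_2\}$ of the example with $s = 4$, this sketch returns $(t_0, \ldots, t_4) = (0, 1, 2, 1, 1)$.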
What Is Missing can be obtained by considering $S - A$. For the example above, the computer gives us the list of What Is Missing in about one second. In the list below, each column is written top to bottom as a 4-bit string; columns marked “no” must be absent and columns marked “< 2” must be in short supply.

P0: no: 1001, 0101, 0011, 1110, 1101, 1011, 0111, 1111
P1: no: 0101, 0011, 1110, 1101, 1011, 0111, 1111; < 2: 0110, 1001
P2: no: 0110, 0101, 0011, 1110, 1101, 1011, 0111, 1111
P3: no: 0011, 1110, 1101, 1011, 0111, 1111; < 2: 1010, 0110, 1001, 0101
P4: no: 1010, 0110, 1001, 0101, 1110, 1101, 1011, 0111, 1111
P5: no: 0110, 0101, 0011, 0010, 1110, 0001, 1101, 0111, 1111
P6: no: 1010, 0110, 1001, 0101, 0010, 1110, 0001, 1101, 1111
P7: no: 1110, 1101, 1011, 0111, 1111; < 2: 1100, 1010, 0110, 1001, 0101, 0011
P8: no: 0110, 0101, 0010, 1110, 0001, 1101, 0111, 1111; < 2: 1100, 0011
P9: no: 1100, 0110, 0101, 0010, 1110, 0001, 1101, 0111, 1111
P10: no: 0110, 0101, 0011, 0100, 0010, 0001, 0111, 1111
P11: no: 1010, 1001, 0101, 0011, 0100, 0010, 0001, 1101, 1011, 1111
P12: no: 1010, 0110, 1001, 0101, 0011, 0100, 0010, 0001
P13: no: 0110, 1001, 0011, 1000, 0100, 0010, 0001, 1011, 0111, 1111; < 2: 1010, 0101
P14: no: 1010, 0110, 1001, 0101, 1000, 0100, 0010, 0001

In practice checking configurations with $s \le 4$ is almost instantaneous, $s = 5$ takes, depending on the configuration, anywhere from a few minutes to a couple of hours, and with $s = 6$ it’s typically hopeless, although it can be
done if, for example, the configuration only has columns of column sum 3 and is simple, since $2^{\binom{6}{3}}$ is still a reasonable number.

Finally, note that finding $\mathrm{forb}(m,\mathcal{F})$ is a subproblem of this, when $s = m$. Start with the complete matrix $K_m$ and remove columns one by one until no $F \in \mathcal{F}$ is a subconfiguration. Perform breadth-first search on the tree of column deletions in order to find the configuration with the largest number of columns for which no $F \in \mathcal{F}$ is a subconfiguration.

Algorithm 3.5.1 Here is some pseudo-code to find What Is Missing (WIM).

Input: A family of configurations $\mathcal{F}$, a number $s$
Output: A list of possibilities for What Is Missing

    for $i = 0$ to $s$ do
        $t_i := \max\{\lambda(\alpha, F \times T_{s-k}) : \alpha$ column of column sum $i$ and $F \in \mathcal{F}$ with $k$ rows$\}$
    end for
    $S := [\, t_0 \cdot K^0_s \mid t_1 \cdot K^1_s \mid \cdots \mid t_s \cdot K^s_s \,]$
    List := $\emptyset$
    for all $s$-rowed $K \prec S$ do
        if $F \nprec K$ for every $F \in \mathcal{F}$ then
            Add $K$ to List
        end if
    end for
    List = maximals(List)
    ListWIM := $\{S - K : K \in$ List$\}$
    return ListWIM

3.6 Guessing Forb

Consider a family $\mathcal{F}$ of (small) configurations and a (small) fixed integer $m$. Suppose we wish to find (or rather, guess) $\mathrm{forb}(m,\mathcal{F})$ using the computer. This approach has helped considerably with the proofs of 3 exact bounds (described in Section 4.1 and Section 4.3). For these results, once we had an extremal matrix that avoids $\mathcal{F}$ it was relatively straightforward to construct a proof of this fact. We defer the details of the proofs to the subsequent sections. For now, we will describe the methods used to provide us with what seems to be a good and dependable guess of $\mathrm{forb}(m,\mathcal{F})$.

The idea is to consider all the $2^m$ columns in some order and add them one by one into a matrix $A$, making sure at each step we don’t create any $F \in \mathcal{F}$ as a subconfiguration. The order in which the columns are added determines $\|A\|$ at the end of the process. To do this, enumerate all the columns and for each permutation of $[2^m]$ find the number of columns that would be added while avoiding all $F \in \mathcal{F}$ if we were to add them in that order.
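The single pass over a fixed ordering described above can be sketched in C++. This is a minimal illustration, not the thesis program: the names `Column`, `greedyPass` and `avoidsIdentity2` are hypothetical, columns are encoded as integers, and the avoidance test is passed in by the caller. The test supplied here handles only the single configuration $I_2$, using the fact that two columns contain $I_2$ exactly when neither column’s set of 1’s contains the other’s.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Column = unsigned;  // a column encoded as an integer

// Splits `order` into the columns that get added and those that get rejected.
// `avoids(current, candidate)` must report whether adding `candidate` to the
// columns in `current` still avoids every forbidden configuration.
std::pair<std::vector<Column>, std::vector<Column>> greedyPass(
    const std::vector<Column>& order,
    const std::function<bool(const std::vector<Column>&, Column)>& avoids) {
    std::vector<Column> added, rejected;
    for (Column c : order) {
        if (avoids(added, c)) added.push_back(c);
        else rejected.push_back(c);
    }
    return {added, rejected};
}

// Avoidance test for the single configuration I_2 = [1 0; 0 1] only: two
// columns contain I_2 exactly when neither support contains the other.
bool avoidsIdentity2(const std::vector<Column>& current, Column candidate) {
    for (Column c : current)
        if ((c & ~candidate) != 0 && (candidate & ~c) != 0) return false;
    return true;
}
```

For a general family, `avoids` would call a full subconfiguration test on the matrix formed by the columns added so far; the size of the first component of the result is then the quantity to maximize over orderings.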
We call the columns that would get added in this procedure good columns and the columns that get discarded bad columns, with respect to the given permutation. We wish then to find a permutation that maximizes the number of good columns. So the search space has size $(2^m)!$, which is too big to search exhaustively, but our experiments suggest local search can often find the correct answer when $m$ isn’t too large. We’ll describe three methods.

Perhaps this explanation could benefit from an example. Suppose we wanted to forbid

\[
F = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

and suppose $m = 3$. Consider all eight 3-rowed columns in the following order:

\[
\begin{array}{cccccccc}
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 1 & 1
\end{array}
\]

Then consider the permutation $(4, 1, 3, 6, 5, 8, 2, 7)$. We start adding columns in the order given by this permutation. If adding a certain column would give rise to a copy of $F$, then we don’t add it and continue to the next one. This is how it would work for this permutation:

1. Add column 4: $A = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$
2. Add column 1: $A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \end{bmatrix}$
3. Can’t add column 3.
4. Add column 6: $A = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}$
5. Can’t add column 5.
6. Add column 8: $A = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}$
7. Can’t add column 2.
8. Can’t add column 7.

We end up with 4 columns, with the good columns being $\{4, 1, 6, 8\}$ and the bad columns being $\{3, 5, 2, 7\}$.

3.6.1 Brute Force Greedy Search

The brute force method does the following:

• Choose a permutation at random.
• Separate into good and bad columns.
• Count the number of good columns.
• Repeat, while keeping track of the “best” so far (i.e. the one with the largest number of good columns).

The strength of this method is that we may do this thousands of times in a relatively short time. This method is good for simple configurations in which there are many different ways to achieve the bound, but in our experience, it fails to find the optimal bound in many cases.

An easy bound on the probability $p$ of finding the best configuration in a single try is the following. Let $f = \mathrm{forb}(m,\mathcal{F})$.
Then

\[
p \ge \frac{2^m - f + 1}{\binom{2^m}{f}}.
\]

Indeed, if $A \in \mathrm{ext}(m,\mathcal{F})$, $g_1, g_2, \ldots, g_f$ is any ordering of the columns of $A$ and $b_1, b_2, \ldots, b_{2^m-f}$ is any ordering of the columns that are not in $A$, we see that for any $j \in \{0, 1, \ldots, 2^m - f\}$ any permutation of the form

\[
(g_1, g_2, \ldots, g_{f-1}, b_1, b_2, \ldots, b_j, g_f, b_{j+1}, \ldots, b_{2^m-f})
\]

gives rise to a matrix in $\mathrm{ext}(m,\mathcal{F})$. When separating good columns from bad columns, as described in the previous section, we are allowed to add $g_1, g_2, \ldots, g_{f-1}$ to the good columns without interruption, and we must be allowed to add to the good columns at least one of $b_1, b_2, \ldots, b_j$ or $g_f$. There are $f! \, (2^m - f)! \, (2^m - f + 1)$ such permutations out of $(2^m)!$, which gives the bound. The above bound might not be very accurate, as many other permutations might give rise to extremal matrices, but at least it grounds the problem.

3.6.2 Hill Climbing

We describe a method that improves over brute force. Start with some permutation and separate the $2^m$ columns into good and bad columns. Then for each column $b$ in bad, form a new permutation by putting $b$ at the beginning, but leaving all others in place. This ensures that the chosen column $b$ is selected. From all the possibilities for the choice of $b$, choose the best one, that is, the one that gives the most columns in good. If there are ties, pick one at random. Repeat this process until there is no improvement.

In our example, we would have to consider all the following permutations:

1. (3, 4, 1, 6, 5, 8, 2, 7)
2. (5, 4, 1, 3, 6, 8, 2, 7)
3. (2, 4, 1, 3, 6, 5, 8, 7)
4. (7, 4, 1, 3, 6, 5, 8, 2)

In this case all of them give size 4, but in general some might be better than others. Local search as described above gets stuck very quickly, but it does perform better than brute force.

Algorithm 3.6.1 Here is some pseudo-code to perform hill climbing. This code gives up if it gets stuck. It’s easy to modify it to be able to “walk” while being stuck, and to stop if there is no improvement after, say, 20 iterations.
Input: A family of configurations $\mathcal{F}$, a number $m$, a permutation $\sigma$ of $[2^m]$.
Output: A configuration in $\mathrm{Avoid}(m,\mathcal{F})$ that is thought to be extremal.

    stuck := false
    Find good($\sigma$) and bad($\sigma$)
    while not stuck do
        for all $b \in$ bad($\sigma$) do
            $\sigma_b := (b, \sigma \setminus b)$
            Separate $\sigma_b$ into good($\sigma_b$) and bad($\sigma_b$).
        end for
        $\varphi := \operatorname{argmax}\{|\mathrm{good}(\sigma_b)| : b \in \mathrm{bad}(\sigma)\}$
        if $|\mathrm{good}(\varphi)| > |\mathrm{good}(\sigma)|$ then
            $\sigma \leftarrow \varphi$
        else
            stuck = true
        end if
    end while
    return $\sigma$

3.6.3 Genetic Algorithm

An even better way to search the space is using a Genetic Algorithm. The idea is to mimic evolution. We maintain a population of ‘individuals’ and assign ‘fitness’ to them in some way (in our case, the number of ‘good’ columns). Then we pick two at random, but giving higher probability to those with higher fitness. Call them father and mother. Then we combine them in some way to produce offspring, hopefully with better results than either father or mother. We do this for many generations.

There are many ways to combine permutations of course. The one we found that works well is as follows:

1. For both father and mother, separate the columns into good and bad.
2. Pick a random entry in the good part of mother, and consider all entries of mother from the start up to and including the chosen one.
3. Permute the entries of father so that the numbers in the chosen part of mother are in the same order as those in mother.

This is better shown with an example. Suppose we had a father and a mother like this:

father = (2, 3, 6, 1 | 7, 4, 8, 5) and mother = (5, 3, 8, 1, 4 | 7, 2, 6)

with the numbers shown before the vertical line being the good part and the numbers shown after the vertical line being the bad part. Pick a random entry in the good part of mother. For example, pick 1 and look at the numbers that appear (in mother) up to and including the picked number. In this case, (5, 3, 8, 1). In father, select these numbers (shown in brackets):

father = (2, [3], 6, [1] | 7, 4, [8], [5])

Make child by shuffling the selected entries in father so that they match the order of mother:

child = (2, [5], 6, [3], 7, 4, [8], [1]).
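The combination step illustrated by this example can be sketched in C++ as follows. This is a hedged illustration, not the thesis implementation: the function name `crossover` and its parameter `prefixLength` are hypothetical, and the random choice of the cut point inside the good part of mother is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reorders, inside `father`, the entries that occur among the first
// `prefixLength` entries of `mother` so that they follow mother's order.
// All other entries of `father` stay where they are.
std::vector<int> crossover(const std::vector<int>& father,
                           const std::vector<int>& mother,
                           std::size_t prefixLength) {
    assert(father.size() == mother.size() && prefixLength <= mother.size());
    const auto prefixEnd =
        mother.begin() + static_cast<std::ptrdiff_t>(prefixLength);
    std::vector<int> child = father;
    auto next = mother.begin();  // next entry of mother's prefix to place
    for (int& entry : child) {
        if (std::find(mother.begin(), prefixEnd, entry) != prefixEnd) {
            entry = *next;
            ++next;
        }
    }
    // Holds when both inputs are permutations of the same set.
    assert(next == prefixEnd);
    return child;
}
```

With father = (2, 3, 6, 1, 7, 4, 8, 5), mother = (5, 3, 8, 1, 4, 7, 2, 6) and `prefixLength` = 4, the sketch reproduces the child (2, 5, 6, 3, 7, 4, 8, 1) of the example.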
And finally, separate the columns of child into good and bad. This approach has given very good results in our experience.

Algorithm 3.6.2 Here is some pseudo-code to perform genetic search as described in this section. After some empirical experimentation, we find that a population size of 40 and 200 generations are good numbers.

Input: A family of configurations $\mathcal{F}$, a number $m$
Output: A list of configurations in $\mathrm{Avoid}(m,\mathcal{F})$ that are thought to be extremal.

    Define Population to be a list of 40 random permutations of $[2^m]$.
    for $i = 0$ to 200 do
        for all $\sigma \in$ Population do
            Separate $\sigma$ into good($\sigma$) and bad($\sigma$) columns.
        end for
        Define NewGeneration := 10 best from Population.
        while |NewGeneration| < 40 do
            Pick father and mother from Population {Pick random permutations $\sigma$ according to how big good($\sigma$) is}
            Mix father and mother to produce child
            Add child to NewGeneration
        end while
        Population $\leftarrow$ NewGeneration
    end for

Chapter 4
Exact Bounds

4.1 Two 3x4 Exact Bounds

A list of $3 \times 4$ configurations for which the exact bounds are not known is given in [AK10]. We solve two of them, and give conjectured bounds for the rest in Section 8.1.3.

4.1.1 Introduction

We study the following configurations $V, W$ (where $\mathrm{forb}(m,V)$ and $\mathrm{forb}(m,W)$ weren’t previously known), and compare to the common submatrix $X$ (where $\mathrm{forb}(m,X)$ was found in [ABS11]).

\[
X = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 0 & 0 \end{bmatrix}, \quad
V = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad
W = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]

We used our Genetic Algorithm to seek extremal matrices $A \in \mathrm{ext}(m,V)$ and $A \in \mathrm{ext}(m,W)$ for small $m$. From these examples we guessed the structure of extremal matrices in general and were subsequently able to prove that these matrices are indeed extremal. Guessing such structures would have been challenging without the help of the Genetic Algorithm.

Theorem 4.1.1 [ABS11] Let $m \ge 3$. Then

\[
\mathrm{forb}(m,X) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + 1.
\]

Note that $X \prec V$ and $X \prec W$. We initially thought (before using the Genetic Algorithm) that $\mathrm{forb}(m,V) = \mathrm{forb}(m,W) = \mathrm{forb}(m,X)$.
Related results are in [ABS11], [AK06]. The structure of the apparently extremal matrices generated by the Genetic Algorithm provided the strategy to tackle the bounds for forb(m, V) and forb(m, W). Interestingly, the Genetic Algorithm was used again in each example in the inductive proof to guess forb(m, H_V) and forb(m, H_W).

4.1.2 The Bound for W

Recall that

$$W = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

We can use the computer to compute forb(m, W) for m ≤ 5 as described in Section 3.5. We then use the Genetic Algorithm to compute guesses for forb(m, W) for m = 6, 7, 8 and also guess the structure of the extremal matrices. We then proceed to prove these guesses.

Theorem 4.1.2 Let m ≥ 2. Then

$$\mathrm{forb}(m, W) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + m - 2.$$

In order to prove this we proceed using the Standard Induction of Section 2.1. Let A ∈ Avoid(m, W). If we could find some row r for which the number of repeated columns satisfies ‖Cr(A)‖ ≤ m + 1, then we would be done:

$$\|A\| \le m + 1 + \binom{m-1}{2} + \binom{m-1}{1} + \binom{m-1}{0} + (m-1) - 2 = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + m - 2.$$

So we may assume that ‖Cr(A)‖ ≥ m + 2 for every r. The minimal inductive children of W are:

$$G_1 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad G_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad G_3 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \end{bmatrix}.$$

Given that Cr(A) ∈ Avoid(m − 1, {G1, G2, G3}), under these assumptions the following Lemma establishes that ‖Cr(A)‖ = (m − 1) + 3 = m + 2.

Lemma 4.1.3 Let m ≥ 4. Then forb(m, {G1, G2, G3}) = m + 3.

Proof: For the lower bound forb(m, {G1, G2, G3}) ≥ m + 3 an example (which was again found using our Local Search strategies) suffices. Consider the matrix A = [0_m | K_m^1 | 1_m | α], where α is any other column. Clearly A ∈ Avoid(m, {G1, G2, G3}) and ‖A‖ = m + 3.

To prove the upper bound, we proceed by induction on m. Let A ∈ Avoid(m, {G1, G2, G3}). If we forbid {G1, G2, G3}, below are the 16 possible cases of What Is Missing on each quadruple of rows, found using the program described in Section 3.5. The list is complete because the search is exhaustive.
Each case is listed on three lines. Every column on the four rows is written as a 4-tuple read from top to bottom, and the columns are grouped by their label: no (the column is absent), ≤ 1 (the column appears at most once), and > 1 (the column may appear more than once).

Case 1
  no:  1010 0110 1001 0101 1101 0011 1011 0111 1111
  ≤ 1: 1000 0100 0010 1110 0001
  > 1: 0000 1100
Case 2
  no:  1010 0110 1110 1001 0101 1101 0011 1011 0111
  ≤ 1: 1000 0100 0010 0001 1111
  > 1: 0000 1100
Case 3
  no:  1100 1001 0101 1010 0110 1110 0011 1011 0111
  ≤ 1: 1000 0100 0001 0010 1111
  > 1: 0000 1101
Case 4
  no:  1010 0110 1110 1001 0101 1101 0011 1011 0111 1111
  ≤ 1: 0100 1100 0010 0001
  > 1: 0000 1000
Case 5
  no:  1100 1010 0110 1001 0101 1101 0011 1011 0111 1111
  ≤ 1: 0100 0010 1110 0001
  > 1: 0000 1000
Case 6
  no:  1100 1010 0110 1110 1001 0101 1101 0011 1011 0111
  ≤ 1: 0100 0010 0001 1111
  > 1: 0000 1000
Case 7
  no:  1100 1010 0110 1110 1001 0101 1101 0011 1011 0111
  ≤ 1: 1000 0100 0010 0001
  > 1: 0000 1111
Case 8
  no:  0010 1010 0110 1110 0001 1001 0101 1101 1011 0111
  ≤ 1: 1000 0100 0011 1111
  > 1: 0000 1100
Case 9
  no:  1100 0010 1010 0110 1110 0001 1001 0101 1101 0111
  ≤ 1: 1000 0100 0011 1111
  > 1: 0000 1011
Case 10
  no:  0010 1010 0110 1110 0001 1001 0101 1101 1011 0111 1111
  ≤ 1: 0100 1100 0011
  > 1: 0000 1000
Case 11
  no:  1100 0010 1010 0110 1110 0001 1001 0101 1101 0111 1111
  ≤ 1: 0100 0011 1011
  > 1: 0000 1000
Case 12
  no:  1100 0010 1010 0110 1110 0001 1001 0101 1101 1011 0111
  ≤ 1: 0100 0011 1111
  > 1: 0000 1000
Case 13
  no:  1100 0010 1010 0110 1110 0001 1001 0101 1101 1011 0111
  ≤ 1: 1000 0100 0011
  > 1: 0000 1111
Case 14
  no:  0100 1100 0010 1010 0110 1110 0001 1001 0101 1101 0011 1011
  ≤ 1: 0111 1111
  > 1: 0000 1000
Case 15
  no:  0100 1100 0010 1010 0110 1110 0001 1001 0101 1101 0011 1011
  ≤ 1: 1000 0111
  > 1: 0000 1111
Case 16
  no:  1000 0100 0010 1010 0110 1110 0001 1001 0101 1101 1011 0111
  ≤ 1: 1100 0011
  > 1: 0000 1111

We check this list to see that there are at most seven columns present on any four rows (at least nine of the sixteen are absent), so forb(4, {G1, G2, G3}) ≤ 7, and together with the lower bound forb(4, {G1, G2, G3}) = 7.

Now consider m ≥ 5. It's easy to check, by considering every quadruple of rows, that there is a row and a column we can delete from A and keep the remaining (m − 1)-rowed matrix A′ simple. Then by induction ‖A′‖ ≤ (m − 1) + 3 = m + 2, and so ‖A‖ ≤ ‖A′‖ + 1 ≤ m + 3. To find such a row and column, look at the columns marked ≤ 1 and > 1, and see that there is a row we can delete such that the only repeat (if there is one) involves one of the columns marked ≤ 1. We used the help of the free software sage (http://www.sagemath.org/), but it could also be done by hand. For example, for the first possibility,

$$\begin{array}{c|ccccccc} & \le 1 & \le 1 & \le 1 & \le 1 & \le 1 & > 1 & > 1 \\ \hline i & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ j & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\ k & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ \ell & 0 & 0 & 0 & 0 & 1 & 0 & 0 \end{array}$$

deleting the row of A corresponding to row ℓ and the column of A corresponding to the fifth column above, (0, 0, 0, 1)^T (if it exists), keeps the remaining matrix simple.
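The deletion step in this example can be checked mechanically. The short Python sketch below encodes the seven columns of the first possibility as strings read from top to bottom (rows i, j, k, ℓ), as we read them from the example above; the helper `delete_row` is a hypothetical name of ours and is not part of the sage session mentioned in the text.

```python
def delete_row(columns, row):
    """Remove entry number `row` (0-based) from every column string."""
    return [col[:row] + col[row + 1:] for col in columns]


# Columns of the first possibility, in the order shown in the example.
columns = ["1000", "0100", "0010", "1110", "0001", "0000", "1100"]

# Deleting only the last row creates exactly one repeated column.
after_row = delete_row(columns, 3)
repeats = len(after_row) - len(set(after_row))

# Deleting the fifth column as well leaves a simple matrix.
without_fifth = [c for j, c in enumerate(columns) if j != 4]
remaining = delete_row(without_fifth, 3)
is_simple = len(set(remaining)) == len(remaining)
print(repeats, is_simple)
```

The single repeat after removing row ℓ is between the fifth column and the column of zeros, and the fifth column is one of those marked ≤ 1, which is what the induction needs.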
In all cases above one can delete the final row (together with at most one column in short supply). Thus ‖A‖ ≤ m + 3.

We need a more detailed lemma about an m × (m + 3) matrix in Avoid(m, {G1, G2, G3}), one that was predicted using our Genetic Algorithm.

Lemma 4.1.4 Let A ∈ Avoid(m, {G1, G2, G3}) with m ≥ 3 and ‖A‖ = m + 3. Then K_m^1 ≺ A. Moreover, the remaining three columns are 0_m and two additional columns α, β with α < β (meaning that on each row for which α has a 1, β also has a 1).

Proof: We proceed by induction on m. We checked all cases with m = 3, 4 using a computer. Assume A ∈ Avoid(m, {G1, G2, G3}) with ‖A‖ = m + 3 and m ≥ 5. From our proof of Lemma 4.1.3, there is a row and a column of A that we can delete to obtain an (m − 1) × (m + 2) simple matrix A′. We may assume K_{m−1}^1 ≺ A′ by induction. Assume we deleted the last row from A to obtain A′ and that the deleted column was the last column of A. If we restrict ourselves to the first m − 1 rows, the deleted column has to be a repeat of one of the columns of A′, else we would have an (m − 1) × (m + 3) simple matrix, contradicting Lemma 4.1.3. At this point the proof is finished save for a case analysis on each of the possible columns to repeat.

Aside from the column of zeros, there are two columns α, β of A′ which aren't in K_{m−1}^1. Given α < β, we call α the small column and β the big column. We consider 3 different types of rows:

• Row type 1: Both α and β have a 0 in the row.
• Row type 2: Column β has a 1 and α has a 0 in the row.
• Row type 3: Both α and β have a 1 in the row.

There may not be any rows of type 1, but there has to be at least one row of type 2 (in order to differentiate between α and β) and at least two rows of type 3 (in order to differentiate between α and a column of column sum 1). Consider the generic rows below. We've included the appropriate parts of the copy of K_{m−1}^1 and the column of 0's. The entries marked c1, c2, . . . , c8, r1, . . . , r4 are the entries of the deleted row and column.
$$\begin{array}{cccccccc|l} & & & & & \alpha & \beta & & \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & r_1 & \text{type 1} \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & r_2 & \text{type 2} \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & r_3 & \text{type 3} \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & r_4 & \text{type 3} \\ c_1 & c_2 & c_3 & c_4 & c_5 & c_6 & c_7 & c_8 & \end{array}$$

Of course there might be many rows of each of the types 1, 2, 3, but there is no loss of generality if we focus on these rows. There have to be at least two rows of type 3, so it is possible to have two rows which correspond to the entries r3, r4. We have to be careful because the row of r1 might not exist. There are several cases, according to which column gets repeated. In each case we attempt to find either G1, G2 or G3. The fact that A has no G1 means A is laminar, as in Definition 1.3.27.

The following case analysis is probably much easier as an exercise for the reader than it is to either write or read about.

Case 1: 0_{m−1} is the repeated column. Then we have

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & c_2 & c_3 & c_4 & c_5 & c_6 & c_7 & 1 \end{bmatrix}.$$

So either K_m^1 ≺ A or some of c2, c3, c4, c5 are 1. If c2 = 1 then c6 = 0 and c7 = 0 in order to have a laminar matrix. But then we have G2 in the last and next-to-last rows. So we may assume c2 = 0. If c3 = 1 then c5 = 0, c6 = 0 and c7 = 1 by the laminar property, and we then have

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & c_4 & 0 & 0 & 1 & 1 \end{bmatrix}.$$

But then we have G2 in the last two rows. We may assume then that c3 = 0. If c4 = 1 then both c6 and c7 have to be 1, and so we get

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & c_5 & 1 & 1 & 1 \end{bmatrix},$$

which has G3 in the last two rows. This completes Case 1.

Case 2: The repeated column has column sum 1. Then there are three subcases, depending on the position of the 1 in the new column. Let r be the row on which, other than the last row, the new column has a 1.

Subcase 2a: r is of type 1. We have

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ c_1 & 0 & c_3 & c_4 & c_5 & c_6 & c_7 & 1 \end{bmatrix},$$

which contains G2 in the first two rows.

Subcase 2b: r is of type 2.
We have

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ c_1 & c_2 & 0 & c_4 & c_5 & c_6 & c_7 & 1 \end{bmatrix},$$

which contains G2 in the second and third rows.

Subcase 2c: r is of type 3. We have

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ c_1 & 0 & c_3 & c_4 & c_5 & c_6 & c_7 & 1 \end{bmatrix},$$

which contains G3 in the third and fourth rows.

Case 3: The repeated column is the small column α. Then we have this:

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ c_1 & c_2 & c_3 & c_4 & c_5 & 0 & c_7 & 1 \end{bmatrix}.$$

So c7 has to be 1 in order to have a laminar matrix. If either c4 or c5 were 0, then we would have G3 in the last row together with one of the next-to-last rows. So both have to be 1. But this contradicts the fact that we have a laminar matrix.

Case 4: The repeated column is the big column β. Then we have this:

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ c_1 & c_2 & c_3 & c_4 & c_5 & c_6 & 0 & 1 \end{bmatrix}.$$

This yields G3 in the second and third rows. This completes all cases.

We've concluded that K_m^1 ≺ A. We deduce that, apart from the columns of K_m^1 and 0_m, A has two other columns α and β. Since G2 ⊀ A we deduce that either α < β or β < α.

Proof of Theorem 4.1.2: We use induction on m. The result is true for m = 2, 3, so we may assume m ≥ 4. Let A ∈ Avoid(m, W). Apply the decomposition (2.1.1). If ‖Cr(A)‖ ≤ m + 1 for some row r, then we can apply the Standard Induction of Section 2.1 to establish the bound for ‖A‖. So assume ‖Cr(A)‖ ≥ m + 2 for all choices of r. By Lemma 4.1.3, we deduce that ‖Cr(A)‖ = m + 2 for every row r, and by Lemma 4.1.4, we have that K_{m−1}^1 ≺ Cr(A), also for every row r. Thus K_m^2 ≺ A, since all columns of column sum 1 in Cr(A) appear with a 1 in row r (and this happens for every row r). We also have [K_m^1 | K_m^0] ≺ A. Now in every triple of rows of K_m^2 we have the matrix G1 once in every ordering of the triple. Given that W = 2 · G1, the columns of A of column sum at least 3 have no configuration G1.
We appeal to Lemma 4.1.5 below to deduce that the number of columns in A of column sum at least 3 is at most m − 2. Then

$$\|A\| \le \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + m - 2,$$

which yields the desired bound.

A quick counting argument reveals the following general result about laminar families, which we use in the proof of Theorem 4.1.2.

Lemma 4.1.5 Let m ≥ 3 and let 𝒵 be a laminar family of subsets of [m] = {1, 2, . . . , m} with the property that for all Z ∈ 𝒵 we have |Z| ≥ 3. Then |𝒵| ≤ m − 2 and furthermore, this bound is tight (i.e. there exists a family 𝒵 for which |𝒵| = m − 2). Thus, if A ∈ Avoid(m, G1) satisfies that all column sums are at least 3, then ‖A‖ ≤ m − 2.

Proof: Let f(x) denote the size of the biggest laminar family of subsets of [x] with no sets of size less than or equal to 2. Recall that a laminar family is equivalent to a configuration which avoids G1. Assume A ∈ Avoid(m, G1) with the property that all column sums are at least 3. We wish to show that f(m) = m − 2. The family {[3], [4], . . . , [m]} has size m − 2, which proves f(m) ≥ m − 2. We wish to prove f(m) ≤ m − 2. Let 𝒵 be such that |𝒵| = f(m). We may assume [m] ∈ 𝒵: if [m] ∉ 𝒵, observe that 𝒵 ∪ {[m]} is also a laminar family, of bigger size. Suppose then that the next biggest set Z in 𝒵 has size k. We partition [m] into two disjoint sets: Z and [m]\Z. Every set Y ∈ 𝒵 satisfies either Y ⊆ Z or Y ⊆ [m]\Z or Y = [m]. This gives the recurrence

$$f(m) \le 1 + f(k) + f(m - k).$$

If k ≠ m − 1, then by induction f(k) = k − 2 and f(m − k) = m − k − 2, and so f(m) ≤ 1 + k − 2 + m − k − 2 = m − 3, a contradiction. When k = m − 1 we have f(m) ≤ m − 2. Moreover, if f(m) = m − 2 we observe that 𝒵 is "equivalent" to {[3], [4], . . . , [m]}.

4.1.3 The Bound for V

Recall that

$$V = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

Using the computer, we can prove by exhaustively looking at all the possibilities that forb(3, V) = 8, forb(4, V) = 13, and forb(5, V) = 18.
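These exhaustive values can be set beside the formula for forb(m, X) from Theorem 4.1.1, which was the initial guess for forb(m, V). The Python sketch below does only this arithmetic; the names `forb_X` and `exhaustive_V` are ours, and the three values in the dictionary are the ones quoted in the text, not recomputed.

```python
from math import comb


def forb_X(m):
    """forb(m, X) as given by Theorem 4.1.1 (valid for m >= 3)."""
    return comb(m, 2) + comb(m, 1) + comb(m, 0) + 1


# Exhaustive values of forb(m, V) quoted in the text.
exhaustive_V = {3: 8, 4: 13, 5: 18}

# How far forb(m, V) exceeds the initial guess forb(m, X).
gaps = {m: value - forb_X(m) for m, value in exhaustive_V.items()}
for m in sorted(gaps):
    print(m, forb_X(m), exhaustive_V[m], gaps[m])
```

The two quantities agree at m = 3 and already differ at m = 4, so the initial guess forb(m, V) = forb(m, X) cannot hold in general.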
Using the Genetic Algorithm of Section 3.6, we obtained large matrices with no subconfiguration V, which suggested to us that forb(6, V) = 25, forb(7, V) = 32, forb(8, V) = 40. For m ≥ 6, this suggests

$$\mathrm{forb}(m, V) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + 3,$$

two more than our first guess that forb(m, V) = forb(m, X).

Theorem 4.1.6 Let m ≥ 6. Then

$$\mathrm{forb}(m, V) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + 3.$$

The Genetic Algorithm also predicted the following structure of matrices in ext(m, V). Recall the product × defined in Definition 1.3.7. Consider a partition of the m rows into two disjoint sets, say U = {1, 2, . . . , u} and L = {u + 1, u + 2, . . . , m}, so that |U| = u and |L| = ℓ with u + ℓ = m. Let A have the following structure, where in each product the first factor lies on the rows U and the second on the rows L:

$$A = \left[\, 0_u \times 0_\ell \;\middle|\; 0_u \times 1_\ell \;\middle|\; 0_u \times K_\ell^{\ell-1} \;\middle|\; 0_u \times K_\ell^{\ell-2} \;\middle|\; K_u^1 \times 1_\ell \;\middle|\; K_u^1 \times K_\ell^{\ell-1} \;\middle|\; K_u^2 \times 1_\ell \;\middle|\; 1_u \times 0_\ell \;\middle|\; 1_u \times 1_\ell \,\right]. \qquad (4.1.1)$$

We easily check that A ∈ Avoid(m, V) and that, for 3 ≤ u, ℓ ≤ m − 3,

$$\|A\| = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + 3.$$

We will prove that A ∈ ext(m, V) and hence establish Theorem 4.1.6. To prove this, consider A ∈ Avoid(m, V) and apply the standard decomposition (2.1.1). The minimal inductive children of V are:

$$H_1 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad H_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad H_3 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix}.$$

Thus, Cr(A) ∈ Avoid(m − 1, {H1, H2, H3}). We used the computer again to conjecture a structure on a matrix in Avoid(m, {H1, H2, H3}).

Lemma 4.1.7 Let m ≥ 4 and A ∈ Avoid(m, {H1, H2, H3}). We have ‖A‖ ≤ m + 2.
Proof: Using the program described in Section 3.5, we find that one of the following must hold for each quadruple of rows of A: no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 1 1 0 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 0 1 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 0 0 0 70 no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 1 0 0 0 1 1 0 0 no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 1 0 0 0 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 0 1 1 1 1 1 1 0 0 1 1 1 0 no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 1 0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 1 0 0 0 1 1 0 0 0 0 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 1 1 1 1 1 no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1 1 0 0 1 1 1 0 1 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 1 1 0 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 0 0 0 71 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 1 0 0 0 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 0 1 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 0 0 0 0 0 1 0 0 1 1 1 1 1 0 0 0 1 1 0 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 0 1 1 0 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 
> 1 1 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 0 1 1 1 0 0 0 0 1 0 1 1 1 1 1 1 1 0 0 1 1 1 0 1 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 0 0 0 1 0 1 0 1 1 1 1 1 1 0 0 1 1 1 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 1 0 1 1 1 1 0 0 0 1 1 0 1 1 0 1 1 1 1 1 0 1 1 1 1 72 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 > 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 0 1 1 0 1 1 1 0 1 1 1 1 no no no no no no no no no no no no ≤ 1 ≤ 1 > 1 > 1 0 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 1 0 0 1 1 0 0 0 0 0 0 1 0 0 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 0 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1 0 1 1 0 1 1 0 1 1 1 0 0 0 no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 ≤ 1 > 1 0 0 0 0 1 1 0 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 1 1 1 0 no no no no no no no no no no no no ≤ 1 ≤ 1 > 1 > 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 0 1 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 0 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 1 1 1 no no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 1 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 0 1 0 1 0 0 1 1 0 0 0 0 0 73 no no no no no no no no no no no no ≤ 1 ≤ 1 ≤ 1 > 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 1 0 0 1 1 1 1 1 This verifies in particular that forb(4, {H1, H2, H3}) = 6 (at least 10 columns are absent in each of the twenty cases). 
We proceed as we did in Lemma 4.1.3 and verify that each of these possibilities yields one row (the final row in each case) and at most one column that we can delete from A, maintaining simplicity. This means that ‖A‖ ≤ m + 2 by induction. Lemma 4.1.8 Assume m ≥ 4. Let A ∈ Avoid(m, {H1, H2, H3}) with ‖A‖ = m + 2. Then there exists a partition U ⊆ [m] and L = [m]\U with |U | = u ≥ 1 and |L| = ` = m− u ≥ 1 so that if we permute rows, A = U L 0u K 1 u 0u 0u × × × × 1` 1` K `−1 ` 0` or A = U L 0u K 1 u 0u 1u × × × × 1` 1` K `−1 ` 1` (4.1.2) Note that in the former case we must have ` ≥ 2 and in the latter u ≥ 2. Proof: We use induction on m. The computer is able to show that the result is true for m = 3. Assume m ≥ 4. Let A ∈ Avoid(m, {H1, H2, H3}). By our argument in Lemma 4.1.7, there is a row and a single column we can delete, leaving the remainder simple. Let A′ be the resulting simple matrix. We may assume by induction that there exists disjoint sets U ′, L′ such that |U ′| = a ≥ 1, |L′| = b ≥ 1 where a + b = m− 1 so that after permuting rows and columns, A′ = U ′ L′ 0a K 1 a 0a 0a × × × × 1b 1b K b−1 b 0b , or A′ = U ′ L′ 0a K 1 a 0a 1a × × × × 1b 1b K b−1 b 1b . We will assume b ≥ 2. Assume the last column (m + 2) and last row (m) of A is deleted to obtain A′. After deleting row m, the last column of A must be one of the columns of A′ 74 given that ‖A′‖ = forb(m, {H1, H2, H3}). In order to avoid H1, H2 and H3 in A, we can show that we have the desired structure (4.1.1) in A (with either a or b one larger than before). We can make a few general comments about row m. If we have both a 0 and a 1 in row m under the columns containing K1a , then using the two columns containing the 0 and 1 and two rows of the U ′ together with row m, we obtain a copy of H1, a contradiction. Similarly we cannot have both a 0 and a 1 in row m under the columns containing Kb−1b . 
It is also true that in row m we cannot have a 0 under K1a and a 1 under K b−1 b else we find a copy of H1 in two columns containing 0 and 1 in row m and in a row of U ′, a row of L′ and row m. We will consider the two cases that A′ has either the column 0m−1 or 1m−1 together. Note that if a = 1, then A′ has both 0m−1 and 1m−1. As in the proof of Theorem 4.1.2, we must do some case analysis for which column gets repeated. Let γ be the last (repeated) column of A. Case 1: γ = 0a × 1b. We deduce that column 0m−1 of A′ (if present) appears with a 0 in row m else we have the subconfiguration H1 in two rows of L ′ and row m. Similarly, if column 1m−1 is present and a ≥ 2 then 1m−1 appears with a 1 in row m else we have a copy of H1 in two rows of U ′ and row m. Now if we have 1’s in row m under both K1a and K b−1 b , then we have the desired structure with U = U ′ and L = L′ ∪ {m}. Similarly, if we have 0’s in row m under both K1a and under K b−1 b then we have the desired structure with U = U ′ ∪ {m} and L = L′. Thus we may assume we have 1’s under K1a and 0’s under K b−1 b . Recall that either A ′ has 0m−1, or A′ has 1m−1 but in that case a ≥ 2. We have two subcases. Subcase 1a: A′ has 0m−1 (so A has 0m). If a = 1 we note that A′ has 1m−1 (and therefore we are in Subcase 1b). Else we find a copy of H3 in A in a row of L ′ and row m in the column 0m of A, a column from 0a×Kb−1b × 01 with a 0 in the chosen row of L′, the column with 0a × 1b × 01 and a column from K1a × 1b+1. Subcase 1b: A′ has 1m−1 (so A has 1m) and a ≥ 2. We have a ≥ 2 and find a copy of H3 in A in a row of U ′ and row m in the column 1m of A, a column from K1a × 1b+1 with a 1 in the chosen row of U ′, the column with 0a × 1b × 01 and a column from 0a ×Kb−1b × 01. This completes Case 1. 75 Case 2: γ = 0a × βb, where βb ∈ Kb−1b . Given that 0a×βb is repeated in A on rows [m−1], we deduce that it appears both with a 1 and with a 0 on row m. 
Assume any other column 0a×β′b of A′ (with β′b ∈ Kb−1b ) has a 0 in row m. Then we find a copy of H1 in row m and the two rows of L′ where βb and β′b differ and in the column of A containing 0a×β′b and one of the two columns of A containing 0a×βb which differs from the first column in row m. Case 3: γ = αa × 1b, where αa ∈ K1a. For a ≥ 2 we may follow the argument of the pre- vious case and find a copy of H1. Given a = 1, we have that A ′ contains 0m−1 as well as αa × 1b = α1 × 1m−2 = 1m−1. We deduce that A has 1m and 1m−1 × 01. Thus we can find a copy of H3 in a row of U ′ together with a row of L′, in the two columns with 1m−1 in A′, the column with 0m−1 in A′ and a column of 01 ×Kb−1b selected in order to have a 0 on the chosen row of L′. Case 4: γ = 0a × 0b. We find H3 in two rows of L′ since b ≥ 2. Case 5: γ = 1a × 1b. If a ≥ 2, we can find H3 in two rows of U ′. In case a = 1, we know 0m−1 as well as 1m−1 are in A′. We find H3 using a row of U ′ and a row of L′ where we choose a column of 0a × Kb−1b that has a 0 in the chosen row of L′, plus 0m−1 of A′, 1m−1 × 0 and 1m−1 × 1. This completes the proof or Lemma 4.1.8. Proof of Theorem 4.1.6: We use induction on m for m ≥ 6. We established by computer that forb(5, V ) = 18 (which is smaller than the bound of Theorem 4.1.6). Noting that forb(5, {H1, H2, H3}) = 7, we deduce using Section 2.1 that forb(6, V ) ≤ forb(5, V ) + forb(5, {H1, H2, H3}) = 18 + 7 = 25 and so forb(6, V ) = 25, because of construction (4.1.1). Thus, we may assume m ≥ 7. By induction, assume forb(m− 1, V ) = (m−1 2 ) + ( m−1 1 ) + ( m−1 0 ) + 3. Let A ∈ Avoid(m,V ) with ‖A‖ = forb(m,V ). Apply the standard decomposition (2.1.1) to A for some row r. If 76 ‖Cr(A)‖ ≤ m, we obtain ‖A‖ ≤ ‖Cr(A)‖+ forb(m− 1, V ) ≤ m+ ( m− 1 2 ) + ( m− 1 1 ) + ( m− 1 0 ) + 3 = ( m 2 ) + ( m 1 ) + ( m 0 ) + 3 Thus, we may assume that for every r we have ‖Cr(A)‖ ≥ m+ 1. Using Lemma 4.1.7 (with m replaced by m−1), we may assume ‖Cr(A)‖ = m+1 for each r. 
Then using Lemma 4.1.8 we can assume Cr(A) has the structure of (4.1.2) for every r so that every Cr(A) partitions [m]\r rows into sets Ur, Lr with |Ur|, |Lr| ≥ 2. Note also that the only difference between the two possible structures is a column of 0’s or a column of 1’s neither of which is used in the case analysis below. Furthermore, we will prove that there is a partition of the rows [m] of A into U,L where Ur = U\r and Lr = L\r. Take two rows, say s and t. Consider Cs(A) and Ct(A) as determined by (2.1.1) by setting r = s and r = t. Applying Lemma 4.1.8 when considering Cs(A) and Ct(A) we obtain the partitions Us, Ls, Ut, Lt (Upper and Lower) of rows. Remember that Cs(A) and Ct(A) both appear twice in A with 0’s and 1’s in rows s and t respectively. We now define partitions U ′s = Us\t, L′s\t, U ′t\s, L′t\s so that U ′s ∪L′s = [m]\{s, t} = U ′t ∪L′t. We assumed m ≥ 7 and so |[m]\{s, t}| ≥ 5. Hence we may assume that at least one of U ′s and L′s has size at least 3. Without loss of generality, assume |U ′s| ≥ 3. Let X = 1 11 1 0 0 , Y = 0 00 0 1 1 . Consider the following three cases: |U ′s ∩ L′t| ≥ 3: We can find V in rows U ′s∩L′t in A (since A contains two copies of K13 in each triple of rows of U ′s and two copies of K 2 3 in each triple of rows of L ′ t). |U ′s ∩ L′t| = 2: Then |U ′s∩U ′t| ≥ 1, and so we can find V in A|U ′s by taking two rows of U ′s∩L′t together with any row in the intersection U ′s ∩ U ′t . We find Y as a submatrix in any row order (A contains two copies of K13 in each triple of rows of U ′ s) and we also have X as a submatrix whose first two rows are from U ′s ∩L′t and the last one from U ′s ∩U ′t . 77 This yields V . |U ′s ∩ L′t| = 1: We have |U ′s ∩ U ′t| ≥ 2, and so we can find V in A|U ′s by taking the row of U ′s ∩ L′t together with two rows in the intersection U ′s ∩ U ′t . 
We find Y as a submatrix in any row order (A contains two copies of K13 in each triple of rows of U ′ s) and we also have X as a submatrix whose first row is U ′s ∩ L′t and last two rows are from U ′s ∩ U ′t . This yields V . This means that U ′s ⊆ U ′t , but then |U ′t| ≥ 3 and so analogously U ′t ⊆ U ′s. So U ′s = U ′t , and then L′s = L ′ t. The same conclusion will hold if |L′s| ≥ 3. Thus for all s, t ∈ [m], U ′s = U ′t , and then L′s = L ′ t. Using m ≥ 4, we now may deduce that there is a partition U,L of [m] so that for any r, Ur = U\r and Lr = L\r This proves that the partition for each Cr is really a global partition. Let |U | = u and |L| = `. We may argue u, ` ≥ 2 since for example if |U ′r| ≥ 1 and U ′r = {s} then U ′s ∪ {s} ⊆ U and we have |U | ≥ 2. Note that for every row r, we have that [0 1] × Cr(A) ≺ A. We deduce A contains the following columns: B = U L 0u× 0` 0u × 1` 0u × K`−1` 0u × K`−2` K1u × 1` K1u × K`−1` K2u × 1` 1u × 1` . (4.1.3) We have included the column of 0’s and the column of 1’s since such columns can be added to any matrix without creating V . What other columns might we add to this? For u ≥ 3, matrix B contains 1 0 01 0 0 0 1 1 in any triple of rows or U in any row order. So (A−B)|U must not contain the configuration (1, 1, 0)T , else A has subconfiguration V . Similarly for ` ≥ 4, (A − B)|L must not contain the configuration (1, 0, 0)T . Thus, for u, ` ≥ 3, all columns of (A − B) are in [0u | K1u | 1u] × [0` | K`−1` | 1`]. The only column not already in B is 1u × 0` which is a column of the hypothesized structure (4.1.1). Thus, without loss of generality, we need only consider the case u = 2, ` ≥ 5. Let U = {a, b} and consider any two rows c, d ∈ L. We know B contains K1u × [1` | K`−1` ] and 78 [0u | K1u]×K`−1` . So B has: a 0 0 1 0 0 1 b 1 1 0 c 1 1 0 0 0 1 d 1 1 0 . Note we can interchange a with b and c with d. To avoid V we must not have columns in (A−B) with a 1 1 b 0 c 0 1 d 0 . 
Thus, the only possible columns of A − B are

$$\begin{array}{c|ccccc} a & 0 & 1 & 0 & 1 & 1 \\ b & 0 & 0 & 1 & 1 & 1 \\ L & \alpha & 1_\ell & 1_\ell & 0_\ell & 1_\ell \end{array}$$

where α is any column. Recall that since ℓ ≥ 3, any such α must avoid the configuration (1, 0, 0)^T. All these columns are already in B, except for 1_2 × 0_ℓ, which together with B completes the hypothesized structure (4.1.1). The desired bound follows.

Interestingly, the structure of (4.1.1) falls short of the bound in the case u = 2.

4.2 Exact Bound for Ten Products

This section is dedicated to proving an exact bound. The following four matrices are all the 2 × 2 simple matrices (up to row and column permutations). Let

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad T_2 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad U_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad V_2 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}.$$

We note forb(m, {I2, T2, U2, V2}) = 1. Define {I2, T2, U2, V2} × {I2, T2, U2, V2} = {X × Y : X, Y ∈ {I2, T2, U2, V2}} as the 10 possible products of these matrices (note that some of the 16 are equivalent).

Theorem 4.2.1 We have forb(m, {I2, T2, U2, V2} × {I2, T2, U2, V2}) = m + 3.

Proof: Use the notation F = {I2, T2, U2, V2} × {I2, T2, U2, V2}. We establish the lower bound by construction. Let α = 1_{m−1} × 0_1. The m × (m + 3) matrix A consisting of [0_m | I | α | 1_m] avoids all configurations in F, hence forb(m, F) ≥ m + 3.

We use induction on m for the upper bound. We verified forb(4, F) = 7 using the computer program described in Section 3.5, so it remains to prove the bound for m ≥ 5. For an m-rowed matrix A that doesn't contain any configuration in F, it suffices by induction to show there exists a row r for which ‖Cr‖ ≤ 1, using the standard decomposition as in (2.1.1). If this were so, we could delete row r and perhaps one column (one instance of the column forming Cr) from A, keeping the remaining matrix simple. This would yield

forb(m, F) ≤ 1 + forb(m − 1, F) = 1 + (m − 1) + 3 = m + 3

as desired. Let us proceed by contradiction. Suppose then that for every row r, ‖Cr‖ ≥ 2. We then have at least two columns α and β in C1(A). The matrix A would look like this, with row 1 on top:

$$A = \begin{array}{c} 1 \\ {} \end{array} \left[ \begin{array}{cccccc} 0 \cdots & 0 & 0 & 1 & 1 & \cdots 1 \\ & \alpha & \beta & \alpha & \beta & \end{array} \right].$$
But α and β must differ in some row. Without loss of generality, assume they differ on row 2, and suppose α2 = 0 and β2 = 1. We will prove that α and β must be complements of each other. Suppose otherwise and suppose they had something in common, say in row 3. The first four rows of A would look like this: 1 2 3 4 0 · · · 0 0 1 1 · · · 1 0 1 0 1 a a a a b c b c . for some values of a, b, c (we are using the fact that the matrix has at least 4 rows). Then in 80 rows 1 and 3 and rows 2 and 4 we get that this matrix contains 1 3 [ 0 1 a a ] × 2 4 [ 0 1 b c ]. which is a configuration of F (for any a, b, c), so we conclude β = α. Now C2(A) must have two repeated columns, say γ and δ. As argued above, δ = γ. Here is part of the matrix A: 0 0 1 1 0 0 1 10 0 0 0 1 1 1 1 α γ α γ α γ α γ . We have abused notation and refer to α and γ on m − 2 rows as α and γ again. This abuse will continue throughout this proof. Since α and γ have to differ somewhere, we can assume α3 = a, and γ3 = a. Since α and γ must differ somewhere, we can assume α4 = b and γ4 = b. Furthermore, since we have at least 5 rows, we can then write the selected columns of A where the columns are given labels below to indicate the source of the column. 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 a a a a a a a a b b b b b b b b d c d c d c d c α γ α γ α γ α γ . There are two cases. Either d = c or d = c. So we either have 81 (d = c) : 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 a a a a a a a a b b b b b b b b d d d d d d d d α γ α γ α γ α γ or (d = c) : 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 a a a a a a a a b b b b b b b b d d d d d d d d α γ α γ α γ α γ . These yield the following configurations in F respectively: (d = c) : 2 4 [ 0 1 b b ] × 3 5 [ a a d d ], (d = c) : 2 3 [ 0 1 a a ] × 4 5 [ b b d d ]. This is a contradiction to ‖Cr‖ ≥ 2 and hence for m ≥ 5, there must be some row r for which ‖Cr‖ ≤ 1. The bound is achieved by induction. 4.3 Critical Substructures In view of Remark 1.3.21, we know that G ≺ F implies forb(m,G) ≤ forb(m,F ). 
So given $F$, it is natural to look for minimal configurations $G \prec F$ such that $\mathrm{forb}(m,F) = \mathrm{forb}(m,G)$. In this section we give a conjecture for all such subconfigurations of $K_k$ and prove our conjecture for $k = 4$. Recall that by Theorem 1.3.29 we have

\[
\mathrm{forb}(m,K_k) = \binom{m}{k-1} + \binom{m}{k-2} + \cdots + \binom{m}{0}.
\]

Definition 4.3.1 Let $G$ be a configuration. We say $G \prec F$ is a critical substructure if $\mathrm{forb}(m,G) = \mathrm{forb}(m,F)$.

We give a conjecture that has been verified for $k = 2, 3$, and we verify it for $k = 4$ in Proposition 4.3.3, Proposition 4.3.7 and Proposition 4.3.9.

Conjecture 4.3.2 Let $k$ be given. Then the minimal critical substructures of $K_k$ are the $k + 3$ configurations $2 \cdot 1_{k-1}$, $2 \cdot 0_{k-1}$ and $K_k^{\ell}$ for $0 \le \ell \le k$.

Previous results established this conjecture for $k = 3$. The fact that all these configurations are critical substructures of $K_k$ isn't hard to prove. The difficulty lies in proving that every critical substructure contains one of these as a configuration. We show that Conjecture 4.3.2 is equivalent to Conjecture 4.3.8 (for any $k$) and then prove Conjecture 4.3.8 for $k = 4$. The following proposition was proven by Füredi and Quinn [FQ83], and Gronau [Gro80].

Proposition 4.3.3 [FQ83][Gro80] For all $\ell, k$ with $0 \le \ell \le k$ we have

\[
\mathrm{forb}(m,K_k) = \mathrm{forb}(m, 2 \cdot 1_{k-1}) = \mathrm{forb}(m, 2 \cdot 0_{k-1}) = \mathrm{forb}(m,K_k^{\ell}).
\]

Notice in particular that this proposition answers the question posed in Example 1.3.15. We have $I_3 = K_3^1$ and so $\mathrm{forb}(m, I_3) = \mathrm{forb}(m,K_3) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0}$.

We now consider a shifting argument first used by Alon for this context in [Alo83]. Let $A$ be a simple $\{0,1\}$-matrix. We construct a shifted matrix $\tilde{A}$ by performing the following operation repeatedly. Consider an entry of the matrix that has a 1 and make it into a 0 if the resulting matrix is still simple. Keep doing this until there is no entry with a 1 which can be made 0 with the matrix remaining simple.
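The shifting operation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: columns are represented as tuples, the function names `shift_row`, `shift`, `is_downset` and `traces` are hypothetical, and the 1-to-0 changes are applied one row at a time (the classical down-shift), which is one particular order in which the operation described above can be carried out.

```python
from itertools import combinations

def shift_row(cols, r):
    """Lower the 1 in row r of every column whose lowered version is absent."""
    out = set()
    for col in cols:
        lowered = col[:r] + (0,) + col[r + 1:]
        # Keep the matrix simple: only lower when the result is not already a column.
        out.add(lowered if col[r] == 1 and lowered not in cols else col)
    return out

def shift(cols, m):
    """Apply row shifts repeatedly until no 1 can be turned into a 0."""
    cols = set(cols)
    while True:
        before = cols
        for r in range(m):
            cols = shift_row(cols, r)
        if cols == before:
            return cols

def is_downset(cols, m):
    """Check that lowering any single 1 of any column gives another column."""
    return all(col[:r] + (0,) + col[r + 1:] in cols
               for col in cols for r in range(m) if col[r] == 1)

def traces(cols, rows):
    """Number of distinct columns after restricting to the given rows."""
    return len({tuple(col[r] for r in rows) for col in cols})

A = {(1, 1, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)}
shifted = shift(A, 3)
```

On this small example the shifted family has the same number of columns as `A` and is closed under turning 1's into 0's.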
If a column $\alpha$ appears in $\tilde{A}$, then every column with 0's in positions where $\alpha$ is 0 (and perhaps others) also appears in $\tilde{A}$. In other words, the associated family for the shifted matrix $\tilde{A}$ is a downset (if a set is in the family, every subset of that set is in the family as well).

Lemma 4.3.4 [Alo83] Let $A$ be a simple $\{0,1\}$-matrix and let $\tilde{A}$ be a shifted version of $A$. For any set of rows $S$, the number of different columns of $A|_S$ is at least the number of different columns of $\tilde{A}|_S$.

Lemma 4.3.5 Let $A \in \mathrm{ext}(m,K_k)$. Then for any set $S \in \binom{[m]}{k}$, there is a $k \times 1$ column $\alpha$ such that $(K_k - \alpha) \prec A|_S$.

Proof: Let $A \in \mathrm{ext}(m,K_k)$. Then for any set $S \in \binom{[m]}{k}$, $A|_S$ has at most $2^k - 1$ different columns since it does not have $K_k$. Let $\tilde{A}$ be a shifted version of $A$, so that $\tilde{A}|_S$ has at most $2^k - 1$ different columns. Now the columns of $\tilde{A}$ form a downset and so $\tilde{A}$ cannot have a column of sum $k$ or larger. Now $\|\tilde{A}\|$ is $\mathrm{forb}(m,K_k)$, which is equal to the number of columns of sum $k - 1$ or smaller, and so $\tilde{A}$ consists of all columns of sum at most $k - 1$. Thus $\tilde{A}|_S$ has exactly $2^k - 1$ different columns (it must have at least $2^k - 1$ but cannot have $2^k$ since it would contain $K_k$). Then, by Lemma 4.3.4, $A|_S$ has exactly $2^k - 1$ different columns, establishing our result.

We establish the $k$-rowed critical substructures of $K_k$ using the following lemma.

Lemma 4.3.6 Let $B$ be a $k \times (k + 1)$ matrix consisting of one column of each column sum $i$ for $0 \le i \le k$. Let $F = K_k - B$. Assume $m \ge k$. Then $\mathrm{forb}(m,F) < \mathrm{forb}(m,K_k)$.

Proof: Let $A \in \mathrm{ext}(m,F)$. Assume $\mathrm{forb}(m,F) = \mathrm{forb}(m,K_k)$. Then also $A \in \mathrm{ext}(m,K_k)$. But then by Lemma 4.3.5 there is a $k \times 1$ column $\alpha$ so that $A$ has $K_k - \alpha$. Now $F$ is a configuration in $K_k - \alpha$ (for any choice of $\alpha$), contradicting our assumption. Thus $\mathrm{forb}(m,F) < \mathrm{forb}(m,K_k)$.

Proposition 4.3.7 Let $F$ be a minimal $k$-rowed critical substructure of $K_k$. Then $F = K_k^{\ell}$ for some $\ell$.

Proof: Consider any $k$-rowed critical substructure $F$ of $K_k$.
If there exists an $\ell$ such that $K_k^{\ell} \prec F$, then, since we know $\mathrm{forb}(m,K_k^{\ell}) = \mathrm{forb}(m,K_k)$, $F$ cannot be minimal unless $F = K_k^{\ell}$. Consider the case where $K_k^{\ell} \nprec F$ for all $\ell$. In this case $F$ is contained in $K_k - B$ for some $B$ a collection of columns with one column of each column sum. Using Lemma 4.3.6, we can conclude $\mathrm{forb}(m,F) \le \mathrm{forb}(m,K_k - B) < \mathrm{forb}(m,K_k)$. Thus, $F$ is not a critical substructure.

For $(k-1)$-rowed critical substructures we have more work to do. To prove Conjecture 4.3.2, it would suffice to show that the only $(k-1)$-rowed minimal critical substructures are $2 \cdot 0_{k-1}$ and $2 \cdot 1_{k-1}$. Given the previous lemmas, we conclude Conjecture 4.3.2 is equivalent to the following:

Conjecture 4.3.8 Let $F_{k-1} = [0_{k-1} \mid 2 \cdot K_{k-1}^1 \mid 2 \cdot K_{k-1}^2 \mid \cdots \mid 2 \cdot K_{k-1}^{k-2} \mid 1_{k-1}]$. Then there exists $M$ so that for $m \ge M$, $\mathrm{forb}(m,F_{k-1}) < \mathrm{forb}(m,K_k)$.

Let $A \in \mathrm{ext}(m,K_k)$. A consequence of Lemma 4.3.5 is that for every $S \in \binom{[m]}{k}$ there is some choice of $\alpha$ for which $K_k - \alpha \prec A|_S$. Assume that $\alpha$ has $p$ 1's and $(k - p)$ 0's. Then for some $T \subseteq S$ with $|T| = k - 1$, we have that for some choice of $\beta$, $K_{k-1} - \beta \prec A|_T$, where $\beta$ has either $p$ 1's and $k - p - 1$ 0's or has $p - 1$ 1's and $k - p$ 0's. Let $C$ be a $(k-1)$-rowed matrix with at most one column of each column sum, having a column of 0's, a column of 1's and, for each $1 \le i \le k - 3$, either a column of sum $i$ or a column of sum $i + 1$. Then $A|_T$ has $2 \cdot K_{k-1} - C$. This handles $k = 3$, for which we can take $C$ to consist of a column of 0's and a column of 1's. We can handle the case $k = 4$ as well.

Proposition 4.3.9 Let $m \ge 6$. Let $F_3 = [0_3 \mid 2 \cdot K_3^1 \mid 2 \cdot K_3^2 \mid 1_3]$ as in Conjecture 4.3.8. Then $\mathrm{forb}(m,F_3) < \mathrm{forb}(m,K_4)$.

Proof: Let $A \in \mathrm{ext}(m,F_3)$ and assume $\mathrm{ext}(m,F_3) \subseteq \mathrm{ext}(m,K_4)$. Consider any column $\alpha$ of $[K_m^0 \mid K_m^1 \mid K_m^{m-1} \mid K_m^m]$. By extremality of $A$, $[A \mid \alpha]$ has $K_4$ as a subconfiguration, say on rows $S$. Then for any row $j \in S$ with $R_j := S \setminus \{j\}$, we have that $A|_{R_j}$ contains $2 \cdot K_3 - \alpha|_{R_j}$.
Selecting $j \in S$ so that $\alpha|_{R_j}$ is either $1_3$ or $0_3$, we have then that $A$ contains $F_3$, a contradiction. So we deduce that $[K_m^0 \mid K_m^1 \mid K_m^{m-1} \mid K_m^m] \prec A$ already, and hence $A$ contains $K_3$ in every set of 3 rows. Let $B$ consist of the remaining columns of $A$ (those not in $[K_m^0 \mid K_m^1 \mid K_m^{m-1} \mid K_m^m]$). We deduce that $B$ has no $[K_3^1 \mid K_3^2]$. Then

\[
\|B\| \le \mathrm{forb}(m, [K_3^1 \mid K_3^2]) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0}.
\]

Thus $\|A\| \le 2m + 2 + \binom{m}{2} + \binom{m}{1} + \binom{m}{0}$. For $m \ge 6$, we have $2m + 2 < \binom{m}{3}$, so $\|A\| < \mathrm{forb}(m,K_4)$, a contradiction.

In Section 8.1 we give some further ideas on how to deal with $k \ge 5$.

Chapter 5 Three Asymptotic Bounds

5.1 Introduction

When we are not able to obtain an exact bound on $\mathrm{forb}(m,F)$ for some $F$, we might settle for asymptotic bounds. The research in this topic is often guided by Conjecture 1.4.1. In this chapter we prove three main results, each of which completes some part which would be required in order to establish Conjecture 1.4.1.

In Section 5.2 we prove the quadratic asymptotic bound predicted by Conjecture 1.4.1 for the configuration $F_8(t)$, one of the three maximal 4-rowed configurations for which the conjecture predicts a quadratic bound. Here are these three configurations (called $F_6(t)$, $F_7(t)$, $F_8(t)$ for historical reasons).

\[
F_6(t) = \left[ \begin{matrix} 1&0&1&1 \\ 0&1&1&0 \\ 0&0&1&1 \\ 0&0&0&1 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&0&1&1 \\ 1&1&0&0 \\ 0&1&0&1 \\ 0&0&1&0 \end{bmatrix} \right],
\qquad
F_7(t) = \left[ \begin{matrix} 1&0&1&1 \\ 0&1&1&1 \\ 0&0&1&0 \\ 0&0&0&1 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&1&0&1 \\ 1&0&1&0 \\ 0&0&1&1 \\ 0&1&0&0 \end{bmatrix} \right],
\]

\[
F_8(t) := \left[ \begin{matrix} 1&0&1&0 \\ 0&1&0&1 \\ 1&1&0&0 \\ 1&1&0&0 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&0 \\ 0&1 \\ 1&1 \\ 0&0 \end{bmatrix} \right].
\]

As of this writing, quadratic bounds for $F_6(t)$ and $F_7(t)$ have not been found. Note that both $F_6(t)$ and $F_7(t)$ contain the following:

\[
t \cdot \begin{bmatrix} 1&0 \\ 1&0 \\ 0&1 \\ 0&1 \end{bmatrix}.
\]

It is not known (although Conjecture 1.4.1 predicts it) whether the above matrix has a quadratic bound for $t \ge 3$, but for $t = 2$ and $t = 1$ the quadratic bound was proven by Anstee in [Ans90]. The methods described in the following chapters have so far failed to produce results for these other configurations.
There are some cases in their respective lists of What Is Missing which we don't know how to deal with.

Then in Section 5.3 we prove the bound for $F_7$, one of the nine maximal 5-rowed configurations below which are predicted to be quadratic by Conjecture 1.4.1.

\[
F_3 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&1 \\ 0&1&1&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix},
\quad
F_4 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&0 \\ 0&1&1&1&0&1 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix},
\quad
F_5 = \begin{bmatrix} 0&0&1&0&0&0 \\ 0&1&0&0&0&1 \\ 1&0&0&0&1&0 \\ 1&1&1&1&0&1 \\ 1&1&1&1&1&0 \end{bmatrix},
\]

\[
F_6 = \begin{bmatrix} 1&1&0&1&1&1 \\ 1&0&1&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix},
\quad
F_7 = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&0&1 \\ 0&0&1&0&0&1 \\ 0&0&0&0&1&0 \end{bmatrix},
\quad
F_8 = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&0&0 \\ 0&0&1&0&0&1 \\ 0&0&0&0&1&1 \end{bmatrix},
\]

\[
F_9 = \begin{bmatrix} 0&0&1&0&0&1 \\ 0&1&0&0&0&0 \\ 1&0&1&0&1&1 \\ 1&1&0&1&1&0 \\ 1&1&1&1&0&0 \end{bmatrix},
\quad
F_{10} = \begin{bmatrix} 1&1&0&1&0&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&1&0 \\ 0&0&1&0&1&1 \\ 0&0&0&0&0&1 \end{bmatrix},
\quad
F_{11} = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&0&1 \\ 0&1&0&0&1&0 \\ 0&0&1&0&0&1 \\ 0&0&0&1&1&1 \end{bmatrix}.
\]

It is not known whether or not any of the other eight boundary configurations will satisfy the conjecture.

Finally, in Section 5.4 we use the result about $F_7$ to classify all 6-rowed quadratic configurations by proving the unique boundary case:

\[
G_{6 \times 3} = \begin{bmatrix} 1&1&1 \\ 1&1&0 \\ 1&0&1 \\ 0&1&0 \\ 0&0&1 \\ 0&0&0 \end{bmatrix}.
\]

5.2 Quadratic Bound for a 4-rowed Configuration

Let $F_8(t)$ be the following $4 \times (4 + 2t)$ configuration (as in the Introduction):

\[
F_8(t) := \left[ \begin{matrix} 1&0&1&0 \\ 0&1&0&1 \\ 1&1&0&0 \\ 1&1&0&0 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&0 \\ 0&1 \\ 1&1 \\ 0&0 \end{bmatrix} \right].
\]

Theorem 5.2.1 Let $t \ge 1$ be given. Then $\mathrm{forb}(m,F_8(t))$ is $\Theta(m^2)$. More precisely, $\mathrm{forb}(m,F_8(t)) \le 12tm^2$. Furthermore, $F_8(t)$ is a boundary quadratic case.

We have several ingredients to the proof. The first is our Standard Induction given in Section 2.1: we find the inductive children $H_{F_8(t)}$. By Standard Induction, it suffices then to prove that $\mathrm{forb}(m,H_{F_8(t)})$ is $\Theta(m)$. We then find What Is Missing, as in Section 2.2, in a triple of rows if $H_{F_8(t)}$ is avoided. Lastly, we use Implications, as described in Section 2.3, to delete a linear number of columns from a matrix $A \in \mathrm{Avoid}(m,H_{F_8(t)})$ so that $\|C_r(A)\|$ becomes constant.
First, let us calculate $H_{F_8(t')}$ for some number $t'$. Let $A \in \mathrm{Avoid}(m,F_8(t'))$ and consider $C_r(A)$. The fact that $A$ contains no configuration $F_8(t')$ means $C_r(A)$ doesn't have any of the following three configurations, for $t = \lfloor \frac{t'+1}{2} \rfloor + 1$:

\[
H_1(t) = \left[ \begin{matrix} 1&0 \\ 0&1 \\ 0&0 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&0 \\ 0&1 \\ 1&1 \end{bmatrix} \right],
\quad
H_2(t) = \left[ \begin{matrix} 1&0 \\ 0&1 \\ 1&1 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 1&0 \\ 0&1 \\ 0&0 \end{bmatrix} \right],
\quad
H_3(t) = \left[ \begin{matrix} 0&1&0&1 \\ 1&1&0&0 \\ 1&1&0&0 \end{matrix} \;\middle|\; t \cdot \begin{bmatrix} 0&1 \\ 1&1 \\ 0&0 \end{bmatrix} \right].
\]

So we may conclude that $H_{F_8(t)} = \{H_1(t), H_2(t), H_3(t)\}$. We now focus our attention on matrices $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$. We make the following bold claim:

Lemma 5.2.2 Let $t$ be given. Then $\mathrm{forb}(m, \{H_1(t), H_2(t), H_3(t)\}) \le 12tm$.

Note that Theorem 5.2.1 is not equivalent to the previous lemma: the theorem could hold even if the lemma were false. The analogue of the lemma does fail for $F_6(t)$ and $F_7(t)$, for which we know the bound for the inductive children is quadratic and not linear.

The proof of Lemma 5.2.2 appears at the end of this subsection. Using this lemma, we are ready to prove Theorem 5.2.1.

Proof of Theorem 5.2.1: Let $A \in \mathrm{Avoid}(m,F_8(t))$. We simply use induction as in Proposition 2.1.2 (replacing $t$ by $\lfloor \frac{t+1}{2} \rfloor + 1$), using Lemma 5.2.2 to deduce that $\|C_r(A)\|$ is linear and hence $\mathrm{forb}(m,F_8(t))$ is $O(m^2)$.

The fact that $F_8(t)$ is a boundary case follows from the constructions in Conjecture 1.4.1. If $\alpha \in F_8(0)$, then $\alpha$ has column sum 1 or 3, but then $[F_8(t) \mid \alpha] \nprec I \times I \times I$. Consider then the possibilities for $\alpha$ not in $F_8(1)$. If $\alpha$ consists of all 1's, or three 1's, or two 1's exactly on the first two rows, then each pair of rows of $[F_8(1) \mid \alpha]$ has a column of two 1's and so $[F_8(1) \mid \alpha]$ is not in $I_{m/3} \times I_{m/3} \times I_{m/3}$. This also handles the complementary case, i.e. where $\alpha$ is all 0's, or has one 1, or has two 1's on the last two rows, using $I^c_{m/3} \times I^c_{m/3} \times I^c_{m/3}$. There are two remaining cases: $\alpha$ having two 1's in the first and fourth rows, or in the second and fourth rows. Then $[F_8(1) \mid \alpha]$ has the $2 \times 2$ matrix $I_2$ on each pair of rows and so $[F_8(1) \mid \alpha]$ is not in $T_{m/3} \times T_{m/3} \times T_{m/3}$. Thus $\mathrm{forb}(m, [F_8(1) \mid \alpha])$ is $\Omega(m^3)$.
To prove Lemma 5.2.2, we will need some additional lemmas and properties.

5.2.1 What Is Missing?

The following are all the possibilities of columns that are either missing or in short supply on three rows if we forbid $H_1(t), H_2(t), H_3(t)$. This was computed using the C++ program referred to in Chapter 3. Checking that if a triple of rows satisfies each $P_i$ then configurations $H_1(t), H_2(t), H_3(t)$ are avoided is quite easy, but the computer is used to avoid the enormous amount of work that would be required to establish that the list is complete.

Lemma 5.2.3 Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$. Each triple of rows $(x, y, z)$ satisfies (at least) one of the 21 cases $P_1, P_2, \ldots, P_{21}$ described below for some ordering of $(x, y, z)$. In the list, each $3 \times 1$ column is written transposed, and is preceded by "no" if it is missing or by "$< t$" if it is in short supply. The cases are:

$P_1$: no $(1,0,1)^T$, no $(0,1,1)^T$
$P_2$: $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$, no $(1,1,1)^T$
$P_3$: $< t$ $(0,1,0)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$, no $(1,1,1)^T$
$P_4$: $< t$ $(1,1,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$
$P_5$: $< t$ $(1,0,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$
$P_6$: no $(0,1,0)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$
$P_7$: no $(0,0,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$
$P_8$: no $(0,0,0)^T$, $< t$ $(0,1,0)^T$, $< t$ $(1,0,1)^T$, no $(0,1,1)^T$
$P_9$: $< t$ $(0,1,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, $< t$ $(0,1,1)^T$, no $(1,1,1)^T$
$P_{10}$: $< t$ $(1,0,0)^T$, $< t$ $(0,1,0)^T$, $< t$ $(1,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{11}$: no $(0,1,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{12}$: no $(0,0,0)^T$, $< t$ $(0,1,0)^T$, $< t$ $(0,0,1)^T$, $< t$ $(1,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{13}$: no $(0,0,1)^T$, no $(0,1,1)^T$, no $(1,1,1)^T$
$P_{14}$: $< t$ $(1,0,0)^T$, no $(0,0,1)^T$, no $(0,1,1)^T$
$P_{15}$: no $(0,0,0)^T$, no $(0,0,1)^T$, no $(0,1,1)^T$
$P_{16}$: $< t$ $(0,1,0)^T$, no $(0,0,1)^T$, $< t$ $(0,1,1)^T$, no $(1,1,1)^T$
$P_{17}$: $< t$ $(1,0,0)^T$, no $(0,0,1)^T$, $< t$ $(0,1,1)^T$, no $(1,1,1)^T$
$P_{18}$: $< t$ $(1,0,0)^T$, $< t$ $(0,1,0)^T$, no $(0,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{19}$: no $(0,0,0)^T$, $< t$ $(0,1,0)^T$, no $(0,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{20}$: no $(0,0,0)^T$, $< t$ $(1,0,0)^T$, no $(0,0,1)^T$, $< t$ $(0,1,1)^T$
$P_{21}$: no $(0,1,0)^T$, no $(0,0,1)^T$

Proof: By exhaustive computer search as described in Section 3.5.

5.2.2 Case Analysis

We now proceed to do some case analysis of the 21 cases mentioned in Lemma 5.2.3. The first trick to discard some cases involves attempting to do induction one more time. For some of the 21 cases, if they happened to occur in $A$ we may apply the Standard Induction again.

Definition 5.2.4 We say that a row $r$ of $A$ is non-essential if $\|C_r(A)\| \le 4t$.
The motivation for this definition is that if we could find a non-essential row of $A$, we could use Standard Induction as in Section 2.1 for the case where we are forbidding $\{H_1(t), H_2(t), H_3(t)\}$ to prove Lemma 5.2.2 (in fact $\|C_r(A)\| \le 12t$ would suffice).

Lemma 5.2.5 If $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ has a triple of rows $(i, j, k)$ that satisfies one of $P_2, P_4, P_8, P_9, P_{10}, P_{12}, P_{17}, P_{18}$ and $P_{19}$, then there is a non-essential row $r$ of $A$.

Proof: If we analyze the columns that could be in long supply for each of the cases, we see in each case that one of the rows of $A$ isn't necessary to distinguish between columns in long supply.

Perhaps an example would be useful. The other cases are similar. Suppose $A$ is missing $P_2$ in rows $i, j, k$, in that order. So $A$ satisfies the following on rows $i, j, k$:

\[
P_2 = \begin{array}{c|cccc}
 & <t & <t & \text{no} & \text{no} \\ \hline
i & 0 & 1 & 0 & 1 \\
j & 0 & 0 & 1 & 1 \\
k & 1 & 1 & 1 & 1
\end{array}
\]

Let us analyze the columns that could be in long supply (denoted l.s.). These columns are:

\[
\begin{array}{c|cccc}
 & \text{l.s.} & \text{l.s.} & \text{l.s.} & \text{l.s.} \\ \hline
i & 1 & 1 & 0 & 0 \\
j & 1 & 0 & 1 & 0 \\
k & 0 & 0 & 0 & 0
\end{array}
\]

We can see that $\|C_k(A)\| \le 2t$, and therefore row $k$ is non-essential. The same happens for the other cases: in each of them, there is a row $r$, like row $k$, for which $\|C_r(A)\| \le 4t$.

If we assume there are no non-essential rows, then we may restrict our attention to matrices which for all triples of rows satisfy one of the cases $P_1, P_3, P_5, P_6, P_7, P_{11}, P_{13}, P_{14}, P_{15}, P_{16}, P_{20}$ or $P_{21}$.

We will now use the technique of Implications described in Section 2.3 to delete a linear number of columns from $A$ (without deleting any row) in order to obtain a matrix $A'$ for which $\|C_r(A')\|$ is bounded by a constant (for any row $r$). First we give a lemma which is purely a property of directed graphs.

Lemma 5.2.6 Let $G$ be a directed graph on $m$ vertices. Then we can colour the edges of $G$ using three colours (blue, red and green) in such a way that for vertices $r, a, b$ we have that $G$ satisfies the following properties:

(R) There are at most $2m$ red edges.
(B) If $r \to a$ and $r \to b$ are blue, then there is neither an edge $a \to b$ nor an edge $b \to a$ (of any colour).

(G) If $a \to b$ is green, there is a blue-red path from $a$ to $b$.

Proof: The idea for this colouring came from an idea first introduced by Anstee and Sali in [AS05], although the actual colouring is different. We provide an algorithmic proof.

1. Divide $G$ into strongly connected components $X_1, X_2, \ldots, X_k$, ordered in a way consistent with the acyclic ordering (so that if $i < j$, there might be a path from a vertex of $X_i$ to a vertex of $X_j$, but there is no path back).

2. Pick a strongly connected component $X_i$. It is a well known property of directed graphs that there is a strongly connected subgraph $Y_i$ of $X_i$ that uses all the vertices of $X_i$ and has at most $2|X_i|$ edges. For every edge of $X_i$, see whether it is in $Y_i$ or not. If it is, colour it red, and if it isn't, colour it green.

3. Colour every remaining edge blue.

Notice that currently the only property that may not be satisfied is (B). We will change some of the blue edges to green (leaving the red ones intact) until we get the desired property, never breaking (R) or (G). Notice also that red edges always stay in the same strongly connected component, while blue edges always go to a higher level. To make this statement precise, define a level function $\lambda : V(G) \to \mathbb{N}$ by $\lambda(v) := i$ if $v \in X_i$. We have the property that if $v \to u$ is a red edge, then $\lambda(v) = \lambda(u)$, and if $v \to u$ is a blue edge, then $\lambda(v) < \lambda(u)$. This property will be preserved during all steps of the colouring algorithm. In particular, it is true after applying steps 1 through 3.

4. Go through all the strongly connected components and, for each component, go through each vertex. Suppose we are at vertex $v$. Look at the set of blue edges coming out of $v$. Say their endpoints are $v_1, v_2, \ldots, v_d$.

5. If there is an edge $v_i \to v_j$ (of any colour) and $v \to v_i$ and $v \to v_j$ are (still) blue, paint $v \to v_j$ green.
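Steps 1 through 5 can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: the function names are hypothetical, vertices are assumed to be `0, ..., n-1` with no loops, the components are found with Kosaraju's algorithm, and the subgraph $Y_i$ of step 2 is taken (as one possible choice) to be the union of a breadth-first out-tree and in-tree from a root of $X_i$, which has at most $2(|X_i| - 1)$ edges.

```python
from collections import defaultdict

def components(n, edges):
    """Kosaraju's algorithm: comp[v] is the level of v, in acyclic order (step 1)."""
    out, rev = defaultdict(list), defaultdict(list)
    for a, b in edges:
        out[a].append(b)
        rev[b].append(a)

    def visit(v, adj, seen, acc):          # recursive DFS, fine for small graphs
        seen.add(v)
        for w in adj[v]:
            if w not in seen:
                visit(w, adj, seen, acc)
        acc.append(v)

    seen, order = set(), []
    for v in range(n):
        if v not in seen:
            visit(v, out, seen, order)
    seen, comp, level = set(), {}, 0
    for v in reversed(order):
        if v not in seen:
            members = []
            visit(v, rev, seen, members)
            for u in members:
                comp[u] = level
            level += 1
    return comp

def bfs_tree(root, members, adj):
    """(parent, child) pairs of a breadth-first tree inside one component."""
    tree, seen, queue = [], {root}, [root]
    for v in queue:                         # the queue grows while we iterate
        for w in adj[v]:
            if w in members and w not in seen:
                seen.add(w)
                queue.append(w)
                tree.append((v, w))
    return tree

def colour_edges(n, edges):
    edges = list(dict.fromkeys(edges))      # drop repeated edges, keep order
    comp = components(n, edges)
    out, rev = defaultdict(list), defaultdict(list)
    for a, b in edges:
        out[a].append(b)
        rev[b].append(a)
    groups = defaultdict(set)
    for v in range(n):
        groups[comp[v]].add(v)
    red = set()                             # step 2: the subgraphs Y_i
    for members in groups.values():
        root = min(members)
        red |= set(bfs_tree(root, members, out))
        red |= {(child, parent) for parent, child in bfs_tree(root, members, rev)}
    colour = {}
    for a, b in edges:                      # steps 2 and 3
        if comp[a] != comp[b]:
            colour[(a, b)] = "blue"
        else:
            colour[(a, b)] = "red" if (a, b) in red else "green"
    for v in range(n):                      # steps 4 and 5
        changed = True
        while changed:
            changed = False
            blue = [w for w in out[v] if colour[(v, w)] == "blue"]
            for vi in blue:
                for vj in blue:
                    if vi != vj and (vi, vj) in colour:
                        colour[(v, vj)] = "green"
                        changed = True
                        break
                if changed:
                    break
    return colour

example = [(0, 1), (1, 0), (0, 2), (0, 3), (2, 3), (3, 4), (0, 4)]
colouring = colour_edges(5, example)
```

In this example the edge from 0 to 3 ends up green, because 0 has blue edges to both 2 and 3 and there is an edge from 2 to 3.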
It is easy to check that step 5 preserves property (G). We only need to prove that there is a blue-red path from $v$ to $v_j$. We know there is a blue-red path $P_{v_i,v_j}$ from $v_i$ to $v_j$ by (G), and since $v \to v_i$ is blue, we can consider the path $v \to v_i$ followed by $P_{v_i,v_j}$. This can be done as long as $P_{v_i,v_j}$ doesn't contain $v \to v_j$, and indeed it can't, because red edges always stay in the same strongly connected component, while blue edges always go to a higher level, and since $v \to v_i$ is blue and the path $P_{v_i,v_j}$ is blue-red, $\lambda(v) < \lambda(v_i) \le \lambda(v_j)$. This implies that $P_{v_i,v_j}$ cannot contain vertex $v$.

Recall our definition of implications from Section 2.3.

Lemma 5.2.7 Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ have no non-essential rows. Form the directed graph $G$ on $m$ vertices whose edges are the implications in $A$ and colour $G$ using Lemma 5.2.6. Then we can delete at most $4tm$ columns from $A$ so that all red implications are pure.

Proof: There are at most $2m$ red edges and there are at most $2t$ columns that violate any given implication, hence there are at most $4tm$ columns that violate red implications. We can delete at most $4tm$ columns to make the red implications pure.

If a column violates a green implication, it must also violate an implication in a blue-red path, so it must violate either a blue or a red implication. So if we manage to purify the blue implications, no column could violate a green implication either. We will devote some time to proving there are at most $4tm$ columns that violate at least one blue implication.

Lemma 5.2.8 Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ with no non-essential rows (i.e. for every row $r$, we have $\|C_r(A)\| > 4t$). Colour the associated implications graph as in Lemma 5.2.6. In any triple of rows $r, r_i, r_j$ where $r \to r_i$ and $r \to r_j$ are blue and impure, either $P_{16}$ or $P_{21}$ is satisfied.

Proof: By Lemma 5.2.5, we know that in any triple of rows one of $P_1, P_3, P_5, P_6, P_7, P_{11}, P_{13}, P_{14}, P_{15}, P_{16}, P_{20}, P_{21}$ has to be satisfied on the triple $r, r_i, r_j$.
Having the implications $r \to r_i$ and $r \to r_j$ means that the following is satisfied in the triple of rows $r, r_i, r_j$ for some $0 \le a < 2t$:

\[
\begin{array}{c|ccc}
 & <2t-a & <2t-a & =a \\ \hline
r & 0 & 0 & 0 \\
r_i & 1 & 0 & 1 \\
r_j & 0 & 1 & 1
\end{array}
\]

We can go through each of the remaining cases (except $P_{16}$ and $P_{21}$) and observe that the implications $r \to r_i$ and $r \to r_j$ are already not violated.

In each case we will find a contradiction by finding either a non-essential row, contradicting the hypothesis, or $r_i \to r_j$ (or $r_j \to r_i$), which contradicts the fact that $r \to r_i$ and $r \to r_j$ are blue. For each case $P_i$, we number the rows of $P_i$ by 1, 2, 3 as they appear in our listing of What Is Missing. Note that we can't have two implications $a \to b$ and $b \to a$ (this yields a non-essential row). There are in fact at most two implications on the three rows $r, r_i, r_j$. Here is a quick check for each case:

$P_1$: If $r$ corresponds to row 1 of $P_1$, then we get either $r_i \to r_j$ or $r_j \to r_i$, a contradiction. If $r$ corresponds to row 2 we get the same contradiction. And if $r$ corresponds to row 3, then one of the rows $r_i$ and $r_j$ is non-essential.

$P_3$: Already has the implication $1 \to 2$, so row $r$ must correspond to row 1, but then the row corresponding to row 3 of $P_3$ will be non-essential.

$P_5$: Already has $1 \to 3$ and $2 \to 3$, so regardless of ordering we'll have the contradiction.

$P_6$: Already has the implication $1 \to 2$. If we set $r$ to correspond to 1, then we also get the implication $2 \to 3$.

$P_7$: Already has $2 \to 3$ and $1 \to 3$, so no matter how we set row $r$ we'll have the contradiction.

$P_{11}$: Already has $2 \to 3$ and $1 \to 3$.

$P_{13}$: The columns involved in the implications have already been marked with "no", so they never get violated anyway.

$P_{14}$: Already has $1 \to 3$. Then row $r$ must correspond to row 1 of $P_{14}$, but then row $r$ becomes non-essential.

$P_{15}$: Already has $1 \to 3$. Then row $r$ must correspond to row 1 of $P_{15}$, but then row $r$ becomes non-essential.

$P_{20}$: Already has $1 \to 3$. Then row $r$ must correspond to row 1 of $P_{20}$, but then row $r$ becomes non-essential.
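The "already has" checks in the list above can be mechanised. The sketch below is an illustration under two assumptions that are not spelled out in this section: an implication $a \to b$ is read as "columns with a 0 on row $a$ and a 1 on row $b$ are restricted", which matches the display for $r \to r_i$ and $r \to r_j$ above, and the restricted columns of each case are as read from the listing in Lemma 5.2.3. The names `CASES` and `forced_implications` are hypothetical. The function returns every implication forced in this sense, which can be more than the ones quoted in the text.

```python
# Restricted columns ("no" or "< t") of some What Is Missing cases,
# written as (row 1, row 2, row 3).  The labels are not needed for this check.
CASES = {
    "P3":  [(0, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)],
    "P5":  [(1, 0, 0), (0, 0, 1), (1, 0, 1), (0, 1, 1)],
    "P6":  [(0, 1, 0), (1, 0, 1), (0, 1, 1)],
    "P7":  [(0, 0, 0), (0, 0, 1), (1, 0, 1), (0, 1, 1)],
    "P11": [(0, 1, 0), (0, 0, 1), (1, 0, 1), (0, 1, 1)],
    "P14": [(1, 0, 0), (0, 0, 1), (0, 1, 1)],
    "P15": [(0, 0, 0), (0, 0, 1), (0, 1, 1)],
    "P20": [(0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1)],
}

def forced_implications(columns):
    """Pairs (a, b), numbered from 1, for which both columns having
    0 on row a and 1 on row b are restricted."""
    restricted = set(columns)
    found = []
    for a in range(3):
        for b in range(3):
            if a == b:
                continue
            c = 3 - a - b                   # the remaining row
            needed = []
            for x in (0, 1):
                col = [0, 0, 0]
                col[a], col[b], col[c] = 0, 1, x
                needed.append(tuple(col))
            if all(col in restricted for col in needed):
                found.append((a + 1, b + 1))
    return found

implications = {name: forced_implications(cols) for name, cols in CASES.items()}
```

For instance, `implications["P7"]` lists $1 \to 3$ and $2 \to 3$, the two implications quoted for $P_7$.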
Hence we must have either $P_{16}$ or $P_{21}$ on any triple $r, r_i, r_j$.

Lemma 5.2.9 Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ with no non-essential rows, whose implication graph is coloured to satisfy the conditions (R), (B), (G) of Lemma 5.2.6. We may delete at most $4tm$ columns from $A$ so that no blue implication is violated in what remains.

Proof: We will prove something a bit stronger: for every row $r$, the number of columns that violate a blue implication coming out of $r$ is bounded by $4t$. Take a row $r$ and consider $\mathrm{Blue}_r$, the induced subgraph on the blue children of $r$. That is, $\mathrm{Blue}_r = \{s \in G : r \to s \text{ is blue}\}$. We will assume $|\mathrm{Blue}_r| \ge 3$; if $|\mathrm{Blue}_r| \le 2$, we have at most $4t$ columns that violate the blue implications out of $r$. Let $\mathrm{Blue}_r = \{r_1, \ldots, r_\ell\}$. Notice that if a triple of rows $r, r_i, r_j$ with $r \to r_i$ and $r \to r_j$ impure and blue satisfies either $P_{16}$ or $P_{21}$, then in particular it must satisfy this:

\[
\begin{array}{c|ccc}
 & <t & \text{no} & <2t \\ \hline
r & 0 & 0 & 0 \\
r_i & 1 & 0 & 1 \\
r_j & 0 & 1 & 1
\end{array}
\]

which means that in a column where row $r$ is 0 and row $r_j$ is 1, row $r_i$ is 1. Restrict our attention to the submatrix of $A$ given as $[B_r C_r]$ in (2.1.1). Using the notation $\mathrm{row}(r_i)$ to denote the set given by row $r_i$ considered as an incidence vector (but restricted to the submatrix $[B_r C_r]$), we have $\mathrm{row}(r_j) \subseteq \mathrm{row}(r_i)$. Every pair of rows in $\mathrm{Blue}_r$ then must have one contained in the other (under the zeros of row $r$), which means we can order the sets $\mathrm{row}(r_1), \mathrm{row}(r_2), \ldots, \mathrm{row}(r_\ell)$ into an ascending chain.

Therefore we can separate the columns that have a zero on row $r$ into three categories. The first, $C_0$, consists of the columns with all the entries in rows $r_1, r_2, \ldots, r_\ell$ being 0. The second category, $C_1$, consists of all columns with all the entries in rows $r_1, r_2, \ldots, r_\ell$ being 1. And the last, $C$, consists of columns that start with some number of zeros and end with ones, like this:

\[
\begin{array}{c|c}
r & 0 \\
r_1 & 0 \\
\vdots & \vdots \\
r_i & 0 \\
r_{i+1} & 1 \\
\vdots & \vdots \\
r_\ell & 1
\end{array}
\]

We deal with these three categories separately.

Columns in $C_0$: These columns already don't violate any implication $r \to r_i$.
Columns in $C$: Rows $r_1$ and $r_\ell$, in addition to satisfying $\mathrm{row}(r_1) \subseteq \mathrm{row}(r_\ell)$, must also satisfy

\[
\begin{array}{c|c}
 & <t \\ \hline
r & 0 \\
r_1 & 0 \\
r_\ell & 1
\end{array}
\]

which means the number of columns with 0 in row $r$, 0 in row $r_1$ and 1 in row $r_\ell$ is at most $t - 1$. This means $|C| < t$.

Columns in $C_1$: We may use the fact that each triple $r, r_i, r_j$ satisfies

\[
\begin{array}{c|c}
 & <2t \\ \hline
r & 0 \\
r_i & 1 \\
r_j & 1
\end{array}
\]

so $|C_1| < 2t$ as well.

In conclusion, for each row $r$, we can delete $3t$ columns, and then every blue implication $r \to r_i$ is pure. We repeat this for every row $r$, and by deleting at most $3tm$ columns of $A$, every blue implication is pure.

After these column deletions every blue implication is pure and we may now assume every implication is pure: no column violates either blue or red implications, which means no column violates any green implication. We've managed to delete only a linear number of columns of $A$ without deleting any row, and now no implication gets violated.

5.2.3 Linear Bound for the Inductive Children

We are now ready to prove Lemma 5.2.2.

Proof of Lemma 5.2.2: We will show that $\mathrm{forb}(m, \{H_1(t), H_2(t), H_3(t)\}) \le 12tm$. Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$. By Proposition 2.1.2 as described above, we may assume $A$ has no non-essential rows. For every triple of rows of $A$, we choose $P_i$ if $P_i$ is satisfied in that triple (other $P_j$ might be satisfied as well, but pick one for every triple). This yields a map $p : S_3(A) \to \{P_i : i \in \{1, 3, 5, 6, 7, 11, 13, 14, 15, 16, 20, 21\}\}$ (by Lemma 5.2.5), where $S_3(A)$ is the set of triples of rows of $A$.

Appealing to Lemma 5.2.7 and Lemma 5.2.9, we can delete at most $8tm$ columns and conclude that all implications associated to one of the $P_i$ in the image of $p$ are pure.

We now do induction again with a new hypothesis. We wish to show that if $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ is such that each triple of rows has a chosen satisfied condition $P_i$ (for some $i \in \{1, 3, 5, 6, 7, 11, 13, 14, 15, 16, 20, 21\}$) and any implication arising from these chosen conditions is pure, then $\|A\| \le 4tm$.
We note that any submatrix of $A$ satisfies the same hypotheses. Thus it suffices to appeal to our induction argument, Proposition 2.1.2, and show that $A$ has a non-essential row. We will in fact show that row 1 is non-essential.

Let $A \in \mathrm{Avoid}(m, \{H_1(t), H_2(t), H_3(t)\})$ be such that each triple of rows has a chosen satisfied condition $P_i$ (for some $i \in \{1, 3, 5, 6, 7, 11, 13, 14, 15, 16, 20, 21\}$) and any implication arising from these chosen conditions is pure. Consider a triple of rows $1, r, s$ from $A$. In this triple, one of the 12 cases will have been chosen to be satisfied, so for the pair $r, s$ in $C_1(A)$, two rows corresponding to two rows of one of the 12 $P_i$'s must be satisfied, since $C_1(A)$ consists of the repeated columns.

Consider a (0,1)-row with $n$ columns as the incidence vector of a subset of $[n] = \{1, 2, \ldots, n\}$. We now use $\mathrm{row}(r)$ to denote row $r$ in $C_1(A)$, since we've restricted our attention to this matrix. We show by the case analysis below that if two columns are absent on a pair of rows $r, s$, then either $\mathrm{row}(r) = \emptyset$ or $\mathrm{row}(r) = [n]$ (or $\mathrm{row}(s) = \emptyset$ or $\mathrm{row}(s) = [n]$), or $\mathrm{row}(r) = \mathrm{row}(s)$ or $\mathrm{row}(r) = \mathrm{row}(s)^c$. If two columns are absent, then only two columns can be present. Thus, writing each column on rows $r, s$ as $(r, s)^T$:

1. If the two absent columns are $(1,0)^T$ and $(1,1)^T$, or $(0,1)^T$ and $(0,0)^T$, then $\mathrm{row}(r) = \emptyset$ (row $r$ is all 0's) in the former case, or $\mathrm{row}(r) = [n]$ (row $r$ is all 1's) in the latter.

2. If the absent columns are $(0,1)^T$ and $(1,0)^T$, then $\mathrm{row}(r) = \mathrm{row}(s)$.

3. If the absent columns are $(0,0)^T$ and $(1,1)^T$, then $\mathrm{row}(r) = \mathrm{row}(s)^c$.
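The three cases above, together with their mirror images in which the roles of $r$ and $s$ are exchanged, cover all six ways of choosing two absent columns out of four. The following Python sketch makes this explicit; the names `LABELS`, `absent_columns` and `classify` are hypothetical and the rows are given as tuples of 0's and 1's.

```python
from itertools import product

# Each pair of absent columns (written as (entry on r, entry on s))
# forces one relation between row(r) and row(s).
LABELS = {
    frozenset({(1, 0), (1, 1)}): "row(r) is empty",
    frozenset({(0, 0), (0, 1)}): "row(r) is all of [n]",
    frozenset({(0, 1), (1, 1)}): "row(s) is empty",
    frozenset({(0, 0), (1, 0)}): "row(s) is all of [n]",
    frozenset({(0, 1), (1, 0)}): "row(r) = row(s)",
    frozenset({(0, 0), (1, 1)}): "row(r) is the complement of row(s)",
}

def absent_columns(row_r, row_s):
    """The 2 x 1 columns that do not occur on the pair of rows."""
    present = set(zip(row_r, row_s))
    return frozenset(product((0, 1), repeat=2)) - present

def classify(row_r, row_s):
    """Relation forced by the absent columns, or None if fewer than two are absent."""
    absent = absent_columns(row_r, row_s)
    for pair, label in LABELS.items():
        if pair <= absent:
            return label
    return None
```

For example, `classify((0, 1, 1, 0), (1, 0, 0, 1))` reports that the two rows are complementary, while two rows on which all four columns occur give `None`.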
We may check each case $P_1, P_3, P_5, P_6, P_7, P_{11}, P_{13}, P_{14}, P_{15}, P_{16}, P_{20}, P_{21}$ to find that two columns are absent for each pair of rows of $C_1(A)$, except for $P_6$ and $P_{14}$ when row 1 of $A$ corresponds to row 3 of $P_6$ or row 2 of $P_{14}$. In the case when $1, r, s$ form a $P_6$ or a $P_{14}$, and row 1 of $A$ corresponds to row 3 of $P_6$ or row 2 of $P_{14}$, we have (for some order of $r$ and $s$)

\[
\begin{array}{c|cc}
 & \text{no} & <t \\ \hline
r & 0 & 1 \\
s & 1 & 0
\end{array}
\]

which means $\mathrm{row}(s) \subseteq \mathrm{row}(r)$, and the difference $\mathrm{row}(r) \setminus \mathrm{row}(s)$ has size at most $t$.

Construct the following coloured semi-directed graph:

• The vertices are the rows $r$ of $C_1(A)$ with $\mathrm{row}(r) \ne \emptyset$ and $\mathrm{row}(r) \ne [n]$.

• Place a purple edge between two rows $r, s$ if $\mathrm{row}(r) = \mathrm{row}(s)$.

• Place a yellow edge between two rows $r, s$ if $\mathrm{row}(r) = \mathrm{row}(s)^c$.

• Place a directed edge $r \to s$ if $\mathrm{row}(r) \subseteq \mathrm{row}(s)$.

If some rows are equal, we will treat them as being just one row. So we can take the quotient of the graph over the purple edges and work in the new graph. If two yellow edges share a vertex, the non-shared vertices must have a purple edge between them, because the complement of the complement is the set itself. Since we took the quotient over purple edges, we can assume no two yellow edges share a vertex. So we are left with only directed and yellow edges.

We will prove there are no yellow edges. We proceed by contradiction. Suppose we have a yellow edge between rows $r_1$ and $r_2$, so that $\mathrm{row}(r_1) = \mathrm{row}(r_2)^c$. If there is no other row then the matrix has at most 2 columns and we are done. Assume $r$ is another row, different from $r_1$ and $r_2$. Consider the edge between $r$ and $r_1$ and the edge between $r$ and $r_2$. Clearly neither can be yellow or purple. Let us analyze the four possibilities.

• If $r \to r_1$ and $r \to r_2$, then $\mathrm{row}(r) \subseteq \mathrm{row}(r_1)$ and $\mathrm{row}(r) \subseteq \mathrm{row}(r_2)$ contradicts $\mathrm{row}(r_1) = \mathrm{row}(r_2)^c$ if $\mathrm{row}(r) \ne \emptyset$. So we conclude $\mathrm{row}(r) = \emptyset$, contradicting our construction.

• If $r \to r_1$ and $r_2 \to r$, then $\mathrm{row}(r_2) \subseteq \mathrm{row}(r_1)$, a contradiction.

• If $r_1 \to r$ and $r \to r_2$, then $\mathrm{row}(r_1) \subseteq \mathrm{row}(r_2)$, a contradiction.
• If $r_1 \to r$ and $r_2 \to r$, then $\mathrm{row}(r)$ contains both a set and its complement. This means $\mathrm{row}(r) = [n]$, contradicting our construction.

Every pair of rows $r, s$ has a directed edge and therefore we have a tournament. We note that the graph has no directed cycles (since a directed edge means containment of rows) and hence it is a transitive tournament. This in particular yields a path that goes through all the vertices (a total ordering of the rows).

We have more information: a directed edge only occurs in cases $P_6$ and $P_{14}$, when row 1 of $A$ corresponds to row 3 of $P_6$ or row 2 of $P_{14}$. In these two cases, when a row $r$ contains row $s$, we also have that rows $r$ and $s$ differ in at most $t$ columns. And since the first row in the path (the one with the least number of ones) and the last (the one with the most number of ones) have to differ in at most $t$ places, there must be at most $t + 2$ columns in $C_1(A)$, and so row 1 is non-essential, which proves the lemma.

5.3 Quadratic Bound for a 5-rowed Configuration

Previous work of Ryan, reported in [Ans], computed nine 5-rowed simple matrices $F$ which by Conjecture 1.4.1 should be boundary cases and for which $\mathrm{forb}(m,F)$ should be $\Theta(m^2)$. One of them, named $F_7$ in [Ans], is

\[
F_7 = \begin{bmatrix} 1&1&0&1&1&0 \\ 1&0&1&1&1&1 \\ 0&1&0&1&0&1 \\ 0&0&1&0&0&1 \\ 0&0&0&0&1&0 \end{bmatrix}.
\]

Note that $F_7 = F_7^c$.

Theorem 5.3.1 We have that $\mathrm{forb}(m,F_7)$ is $\Theta(m^2)$. Moreover, for any $5 \times 1$ $\{0,1\}$-column $\alpha$, $\mathrm{forb}(m, [F_7 \mid \alpha])$ is $\Omega(m^3)$.

The proof uses Standard Induction (Section 2.1) and a linear bound for three smaller matrices in Lemma 5.3.2, which in a novel way uses Standard Induction. We give the proof of Theorem 5.3.1 from Lemma 5.3.2 in Section 5.3.1 and the proof of Lemma 5.3.2 in Section 5.3.3.

5.3.1 Applying Standard Induction

Let $A \in \mathrm{Avoid}(m,F_7)$ and apply the standard decomposition of (2.1.1) for $r = 1$. Our goal is to show $\|A\|$ is quadratic by showing that $\|C_1(A)\|$ is linear.
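The decomposition (2.1.1) is not reproduced in this section. The sketch below assumes it is the usual one: after deleting row $r$, the columns are split into those that occurred only with a 0 on row $r$ ($B_r$), those that occurred with both a 0 and a 1 ($C_r$, the repeated columns), and those that occurred only with a 1 ($D_r$). The function name `decompose` is hypothetical, and columns are represented as tuples.

```python
def decompose(columns, r):
    """Split a simple matrix, given as a set of column tuples, around row r.

    Returns (B_r, C_r, D_r) as sets of columns on the remaining rows.
    """
    def strip(col):
        return col[:r] + col[r + 1:]

    zero = {strip(col) for col in columns if col[r] == 0}
    one = {strip(col) for col in columns if col[r] == 1}
    return zero - one, zero & one, one - zero

# All columns of sum at most 1 on three rows.
A = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)}
B, C, D = decompose(A, 0)
```

Under this reading, $\|A\| = \|B_r\| + 2\|C_r\| + \|D_r\|$, and $[B_r \, C_r \, D_r]$ is a simple matrix on one fewer row, so a linear bound on $\|C_r\|$ adds only a linear term at each step of the induction.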
We find the inductive children of $F_7$ (basically, delete each row of $F_7$ in turn) and note that $C_1(A)$ cannot contain any of the configurations $H_1, H_2, H_3, H_4, H_5$:

\[
H_1 = \begin{bmatrix} 1&0&1&1&1&1 \\ 0&1&1&0&0&1 \\ 0&0&0&1&0&1 \\ 0&0&0&0&1&0 \end{bmatrix},
\quad
H_2 = \begin{bmatrix} 1&0&1&1&0 \\ 0&0&1&0&1 \\ 0&1&0&0&1 \\ 0&0&0&1&0 \end{bmatrix},
\quad
H_3 = \begin{bmatrix} 1&1&0&1 \\ 0&1&1&1 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix},
\]

\[
H_4 = \begin{bmatrix} 0&1&1&0&1&1 \\ 1&1&0&1&1&1 \\ 0&0&1&1&1&0 \\ 0&0&0&0&0&1 \end{bmatrix},
\quad
H_5 = \begin{bmatrix} 1&1&0&1&0 \\ 1&0&1&1&1 \\ 0&1&0&1&1 \\ 0&0&1&0&1 \end{bmatrix}.
\]

We observe that $H_3^c = H_3$, $H_4 = H_1^c$ and $H_2^c = H_5$. Also $H_3 \prec H_1$ (columns 2, 3, 5, 6) and so $H_3 \prec H_4$, and so we may ignore $H_1$ and $H_4$. Standard Induction (Section 2.1) gives the bound $\mathrm{forb}(m,F_7) < m \cdot \mathrm{forb}(m, \{H_2, H_3, H_5\})$. The next lemma states that $\mathrm{forb}(m, \{H_2, H_3, H_5\})$ has a linear bound, which means $\mathrm{forb}(m,F_7)$ has a quadratic bound.

Lemma 5.3.2 We have that $\mathrm{forb}(m, \{H_2, H_3, H_5\})$ is $O(m)$.

We will prove Lemma 5.3.2 in Section 5.3.3. We can now prove that $\mathrm{forb}(m,F_7)$ is quadratic.

Proof of Theorem 5.3.1: The fact that $\mathrm{forb}(m,F_7)$ is $\Omega(m^2)$ comes directly out of the conjecture, as $F_7 \nprec I \times I$. We show $\mathrm{forb}(m,F_7)$ is $O(m^2)$ using induction on $m$. Consider $A \in \mathrm{Avoid}(m,F_7)$ with $\|A\| = \mathrm{forb}(m,F_7)$. Then using Proposition 2.1.2, we have

\[
\mathrm{forb}(m,F_7) = \|A\| \le \sum_{i=1}^{m-1} \mathrm{forb}(i, \{H_2, H_3, H_5\}).
\]

Given that there is a constant $c$ so that $\mathrm{forb}(i, \{H_2, H_3, H_5\}) \le ci$ by Lemma 5.3.2, we deduce the bound $\mathrm{forb}(m,F_7) \le c \cdot m(m-1)/2$, which is $O(m^2)$.

Now consider any $5 \times 1$ column $\alpha$. We immediately deduce that $\mathrm{forb}(m, [F_7 \mid \alpha])$ is $\Omega(m^3)$ for $\alpha$ having zero, one, four or five 1's, or if $\alpha$ is a column in $F_7$ (considered as a matrix). It is a computational exercise to show that every other $\alpha$ results in $\mathrm{forb}(m, [F_7 \mid \alpha])$ being $\Omega(m^3)$, and the computer program described in Section 3.4 indeed gave us this result. We give a proof here for completeness. We need only consider $\alpha$ having two 1's, since $F_7^c = F_7$.
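If $\alpha$ has 0's on rows 2, 3 then $[F_7\,|\,\alpha] \nprec I^c \times I^c \times I^c$ (each pair of rows from the four rows 1, 2, 3, 4 of $[F_7\,|\,\alpha]$ has $(0,0)^T$). If $\alpha$ has 0's on rows 1, 4 then $[F_7\,|\,\alpha] \nprec I^c \times I^c \times I^c$ (each pair of rows from the four rows 1, 3, 4, 5 has $(0,0)^T$). This only leaves four columns, three of which are already in $F_7$. Only $\alpha = (0,0,1,1,0)^T$ is not already in $F_7$, and in that case $[F_7\,|\,\alpha] \nprec T \times T \times T$ since every pair of rows from the four rows 1, 2, 3, 4 has the $2 \times 2$ configuration $I_2$.

5.3.2 What Is Missing?

Applying the technique of What Is Missing described in Section 2.2 to $\mathcal{F} = \{H_2, H_3, H_5\}$, we get the following lemma.

Lemma 5.3.3 Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$. Then there are 13 possibilities $Q_0, Q_1, \ldots, Q_{12}$ for What Is Missing on each 4-set of rows.
\[
Q_0 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1\\ 1&1&1&1&1&1
\end{array}
\qquad
Q_1 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&1&0&1&0&1\\ 1&0&1&1&0&0\\ 1&0&0&0&1&1\\ 0&1&1&1&1&1
\end{array}
\]
\[
Q_2 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&1&0&1&0&1\\ 1&1&1&1&0&0\\ 1&1&0&0&1&1\\ 0&0&1&1&1&1
\end{array}
\qquad
Q_3 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&1&1&0&1&0\\ 1&0&1&0&0&0\\ 0&0&0&1&1&1\\ 1&1&1&0&0&1
\end{array}
\]
\[
Q_4 = \begin{array}{cccc}
\text{no}&\text{no}&\text{no}&\text{no}\\
1&0&1&0\\ 0&1&0&1\\ 1&1&0&0\\ 0&0&1&1
\end{array}
\qquad
Q_5 = \begin{array}{cccccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&1&0&1&0&1&1&0\\ 0&1&0&0&1&1&0&1\\ 1&1&0&0&0&0&1&1\\ 0&0&1&1&1&1&1&1
\end{array}
\]
\[
Q_6 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&0&1&0&0&1\\ 0&1&1&0&1&1\\ 1&1&1&0&0&0\\ 0&0&0&1&1&1
\end{array}
\qquad
Q_7 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&0&1&0&1&1\\ 0&1&1&0&0&1\\ 1&1&1&0&0&0\\ 0&0&0&1&1&1
\end{array}
\]
\[
Q_8 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
1&0&0&0&0&0\\ 1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1
\end{array}
\qquad
Q_9 = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&0&1&0&1&1\\ 1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1
\end{array}
\]
\[
Q_{10} = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&0&0&0&0&0\\ 1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1
\end{array}
\qquad
Q_{11} = \begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
0&1&0&1&0&1\\ 1&1&0&0&0&0\\ 0&0&1&1&0&0\\ 0&0&0&0&1&1
\end{array}
\]
\[
Q_{12} = \begin{array}{cccccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
1&0&0&1&0&0&0&0\\ 0&1&0&1&0&1&0&1\\ 0&0&1&1&0&0&1&1\\ 0&0&0&0&1&1&1&1
\end{array}
\]

Proof of Lemma 5.3.3: An exhaustive computer search by the program described in Section 3.5 yields the result.

5.3.3 Linear Bound for the Inductive Children

The rest of the section is a proof of Lemma 5.3.2. Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$.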
We will use special features of $H_2, H_3, H_5$ to obtain a linear bound on $\|A\|$. The forbidden configuration $H_3$ is used most often in this proof. We will show $\|A\| \le 7m$ by induction on $m$. We analyze the 13 cases of Lemma 5.3.3 one by one and have special arguments for the three troublesome cases $Q_2, Q_3, Q_{11}$.

Lemma 5.3.4 Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$. Consider the standard decomposition (2.1.1) of $A$ based on row $r$. Let $L(r) \neq \emptyset$ be a minimal set of rows such that $C_r|_{L(r)}$ is simple. Then each triple of rows $\{i,j,k\}$ in $L(r)$ yields a quadruple of rows $\{r,i,j,k\}$ on which one of the cases $Q_2, Q_3, Q_{11}$ occurs, with row $r$ being the first row of each of the cases $Q_2, Q_3, Q_{11}$ as given in Lemma 5.3.3.

Proof: For each $Q_i$ we record pairs of rows containing "a copy of $K_2$": namely, among the columns marked absent we find
\[
\begin{array}{c|cccc}
 & \text{no}&\text{no}&\text{no}&\text{no}\\ \hline
r & a&b&c&d\\
i & e&f&g&h\\
j & 0&1&0&1\\
k & 0&0&1&1
\end{array}
\]
Suppose $A$ had these columns missing on the quadruple of rows $r, i, j, k$ and that rows $i, j, k$ belong to $L(r)$. Then the simple matrix $C_r$ from (2.1.1) has the four $3 \times 1$ columns $(e,0,0)^T$, $(f,1,0)^T$, $(g,0,1)^T$ and $(h,1,1)^T$ missing on the triple of rows $\{i,j,k\}$. We deduce that row $i$ cannot belong to $L(r)$, a contradiction.

By analyzing the cases, we find that $Q_0, Q_1, Q_5, Q_6, Q_7, Q_8, Q_{10}, Q_{12}$ have 3 rows each pair of which has a "$K_2$", and $Q_4, Q_9$ have two disjoint pairs of rows each with a "$K_2$". Thus in any of these cases, What Is Missing on a triple of rows in $C_r$ will contain a copy of "$K_2$", and so we can delete a row from $C_r$ without disturbing simplicity of the remainder of $C_r$. In cases $Q_2, Q_3, Q_{11}$, if we choose row $r$ to be any row but the first row in each of the cases then there is a "$K_2$" on the remaining triple.

We would like to show that for all $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$ we can choose row $r$ so that $\|C_r\| \le 7$ as in (2.1.1). Then by Proposition 2.1.2 and induction, $\|A\| \le 7m$. We will assume the contrary, namely that there is $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$ such that for every row $r$, $\|C_r\| \ge 8$.
In each of the troublesome cases $Q_2, Q_3, Q_{11}$, we end up with the following sets of columns missing on a triple of rows in $C_r$ (arising from What Is Missing in $A$ on a quadruple of rows involving $r$), and we name the cases correspondingly $P_2, P_3, P_{11}$. Note the implications arising from $P_3$.
\[
P_2:\quad \begin{array}{ccc}
\text{no}&\text{no}&\text{no}\\
1&1&0\\ 1&0&1\\ 0&1&1
\end{array}
\tag{5.3.1}
\]
\[
P_3:\quad \begin{array}{c|cccc}
 & \text{no}&\text{no}&\text{no}&\text{no}\\ \hline
i & 1&0&0&0\\
j & 0&0&1&1\\
k & 1&1&1&0
\end{array}
\quad\text{yielding}\quad
\begin{array}{c|ccc}
 & \text{no}&\text{no}&\text{no}\\ \hline
i & 0&0& \\
j & 1& &0\\
k &  &1&1
\end{array}
\tag{5.3.2}
\]
\[
P_{11}:\quad \begin{array}{ccc}
\text{no}&\text{no}&\text{no}\\
1&0&0\\ 0&1&0\\ 0&0&1
\end{array}
\tag{5.3.3}
\]

Lemma 5.3.5 Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$. Consider the standard decomposition (2.1.1) of $A$ based on row $r$. Let $L(r) \neq \emptyset$ be a minimal set of rows such that $C_r|_{L(r)}$ is simple. Then each triple of rows $\{i,j,k\}$ in $L(r)$ is in one of the cases $P_2$, $P_3$ or $P_{11}$. Moreover, if any triple in $L(r)$ is in case $P_2$, then all triples of rows of $L(r)$ are in case $P_2$. Similarly, if any triple in $L(r)$ is in case $P_3$ (respectively $P_{11}$), then all triples of rows are in case $P_3$ (resp. $P_{11}$).

Proof: By Lemma 5.3.4, every triple of rows of $L(r)$ satisfies one of $P_2$, $P_3$ or $P_{11}$. A triple of rows $\{a,b,c\}$ in case $P_3$ cannot overlap with a triple of rows in case $P_2$ (respectively $P_{11}$) on two rows $\{a,b\}$, since on the two rows $\{a,b\}$ What Is Missing (by (5.3.2)) will extend to one new column missing on the triple from $P_2$ (resp. $P_{11}$), yielding a "$K_2$". This would allow us to delete a further row from $C_r|_{L(r)}$ while preserving simplicity, a contradiction to the fact that $L(r)$ is minimal with $C_r|_{L(r)}$ simple. Thus, if any triple of rows of $L(r)$ is in case $P_3$, then all triples of rows of $L(r)$ are in case $P_3$.

Assume all triples of rows are in case $P_2$ or $P_{11}$. We cannot have a triple of rows in case $P_2$ overlap with a triple of rows in case $P_{11}$ on two rows, as shown below. On the quadruple of rows we have marked 'OK' over the columns which can occur on the quadruple of rows. At most 6 columns can be present in $C_r|_{L(r)}$, and we note that we can delete the second or third row from $C_r|_{L(r)}$ and not affect simplicity of $C_r|_{L(r)}$, a contradiction.
Hence such an overlap cannot occur.
\[
\begin{array}{cccccc}
\text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\
1&1&0&*&*&*\\
1&0&1&1&0&0\\
0&1&1&0&1&0\\
*&*&*&0&0&1
\end{array}
\qquad
\begin{array}{cccccc}
\text{OK}&\text{OK}&\text{OK}&\text{OK}&\text{OK}&\text{OK}\\
0&1&0&0&1&1\\
0&0&1&0&1&1\\
0&0&0&1&1&1\\
0&0&1&1&0&1
\end{array}
\]
Given that each triple of the remaining rows of $C_r$ must be in case $P_2$ or $P_{11}$, we must have all triples satisfy only one of the two.

Lemma 5.3.6 Assume all triples in $L(r)$ are in case $P_3$. Then the rows of $C_r$ can be ordered so that each triple of rows $a < b < c$ corresponds to $a = i$, $b = j$, and $c = k$ in $P_3$.

Proof: We claim there is an ordering of the rows of $L(r)$ so that all triples are consistent with the ordering given. We noted in (5.3.2) that having $P_3$ on rows $i, j, k$ in that order corresponds to three columns, each on two rows, being absent. If we cannot find a consistent ordering of the rows of $L(r)$, then on some pair of rows we will be missing two columns, and this implies that one of the two rows can be deleted while preserving simplicity of $C_r|_{L(r)}$. This contradiction proves the result.

In view of Lemma 5.3.5, we will say $L(r)$ is type $i$ if each triple of rows in $L(r)$ is in case $P_i$, for $i = 2$, 3 or 11. Recall we assumed $\|C_r\| \ge 8$. We obtain $M(r)$ from $L(r)$ as follows, where the type of $M(r)$ is the type of $L(r)$:
\[
M(r) = \begin{cases}
L(r) & \text{if } L(r) \text{ is type 2 or 11,}\\
L(r) \setminus \{\text{first and last row in ordering}\} & \text{if } L(r) \text{ is type 3.}
\end{cases}
\tag{5.3.4}
\]
Note $C_r|_{M(r)}$ need not be simple.

Lemma 5.3.7 Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$ with (2.1.1) applied for row $r$ and $M(r)$ from (5.3.4).

(i) If $M(r)$ is type 2, then $C_r|_{M(r)}$ must consist of $[0_{|M(r)|}\ I_{|M(r)|}]$ and possibly column $1_{|M(r)|}$ and no other column. Thus $\|C_r\| - 2 \le |M(r)| \le \|C_r\| - 1$. In addition, columns of $A|_{M(r)}$ are from $[0_{|M(r)|}\ I_{|M(r)|}\ 1_{|M(r)|}]$.

(ii) If $M(r)$ is type 11, then $C_r|_{M(r)}$ must consist of $[I^c_{|M(r)|}\ 1_{|M(r)|}]$ and possibly column $0_{|M(r)|}$ and no other column. Thus $\|C_r\| - 2 \le |M(r)| \le \|C_r\| - 1$. In addition, columns of $A|_{M(r)}$ are from $[0_{|M(r)|}\ I^c_{|M(r)|}\ 1_{|M(r)|}]$.

(iii) If $M(r)$ is type 3, then $C_r|_{M(r)}$ must consist of $[0_{|M(r)|}\ T_{|M(r)|}\ 1_{|M(r)|}]$. Thus $|M(r)| = \|C_r\| - 3$.
In addition, columns of $A|_{M(r)}$ are from $T_{|M(r)|}$.

Proof: For $M(r)$ being type 2, we observe that columns of $C_r|_{M(r)}$ must belong to $[0_{|M(r)|}\ I_{|M(r)|}\ 1_{|M(r)|}]$. By minimality of $L(r)$ (which is $M(r)$), we cannot delete any rows from $C_r|_{M(r)}$ and preserve simplicity. Thus all columns of $[0_{|M(r)|}\ I_{|M(r)|}]$ must be present. A quick count reveals $\|C_r\| - 2 \le |M(r)| \le \|C_r\| - 1$. Similarly, for $M(r)$ being type 11, $C_r|_{M(r)}$ must consist of $[I^c_{|M(r)|}\ 1_{|M(r)|}]$, and possibly column $0_{|M(r)|}$ and no other column. For $M(r)$ being type 3, with the row ordering of Lemma 5.3.6, $C_r|_{L(r)}$ must consist of $T_{|L(r)|}$. Hence $C_r|_{M(r)}$ must consist of $[0_{|M(r)|}\ T_{|M(r)|}\ 1_{|M(r)|}]$, and so $|M(r)| = \|C_r\| - 3$.

The restricted columns on $C_r|_{M(r)}$ extend to restricted columns on $A|_{M(r)}$ as follows. If $M(r)$ is type 2 then for any $H \subseteq M(r)$ with $|H| = 3$, the 6 forbidden columns on rows $r \cup H$ of $Q_2$ yield the restrictions $P_2$ of 3 forbidden columns on rows $H$ of $A$. Thus the columns of $A|_{M(r)}$ are all contained in $[0_{|M(r)|}\ I_{|M(r)|}\ 1_{|M(r)|}]$. In a similar way, if $M(r)$ is type 11 then the columns of $A|_{M(r)}$ are all contained in $[0_{|M(r)|}\ I^c_{|M(r)|}\ 1_{|M(r)|}]$.

If $L(r)$ is type 3 we noted $C_r|_{L(r)}$ is $T_{|L(r)|}$. Indeed, by Lemma 5.3.6, $Q_3$ has each triple $i, j, k \in L(r)$ ordered consistently with the ordering of the rows of $L(r)$, yielding $T$. We deduce the following columns are absent in $A$ on rows $i < j < k$:
\[
\begin{array}{c|c} i&1\\ j&0\\ k&1 \end{array}
\qquad
\begin{array}{c|c} i&0\\ j&1\\ k&0 \end{array}
\]
The following two columns are also forbidden on the 4 rows $r, i, j, k$ of $A$ by $Q_3$:
\[
\alpha = \begin{array}{c|c} r&0\\ i&0\\ j&1\\ k&1 \end{array}
\qquad
\beta = \begin{array}{c|c} r&1\\ i&0\\ j&0\\ k&1 \end{array}
\]
Thus, using $\alpha$, under the 0's in row $r$ in $[B_r\,C_r]|_{L(r)}$ we may only have the columns of $T_{|L(r)|}$ plus one additional column consisting of all 0's except a 1 in the last row of $L(r)$. Similarly, using $\beta$, under the 1's in row $r$ in $[C_r\,D_r]|_{L(r)}$ we may only have the columns of $T_{|L(r)|}$ plus one additional column consisting of all 1's except a 0 in the first row of $L(r)$. Thus if $M(r)$ is $L(r)$ with the first and last row deleted, then $C_r|_{M(r)} = [0\ T\ 1]$ and the columns of $A|_{M(r)}$ are contained in $T_{|M(r)|}$.
Proof of Lemma 5.3.2: Let $A \in \mathrm{Avoid}(m,\{H_2,H_3,H_5\})$. Use the decomposition of $A$ given in (2.1.1). Our procedure is as follows. We use Lemma 5.3.5 to deduce the possible cases we need to consider. Under the assumption that $\|C_r\| \ge 8$ for all rows $r$, we will establish by induction an infinite sequence $r_1, r_2, r_3, \ldots$ and associated sets of rows $N(r_1), N(r_2), N(r_3), \ldots$ with $|N(r_i)| \ge 4$ for each $i$. The sets $N(r)$ differ very little from $L(r)$ and $M(r)$. We are able to show that the sets $N(r_1)\setminus r_2$, $N(r_2)\setminus r_3$, $\ldots$, $N(r_i)\setminus r_{i+1}$ are all disjoint (see the beginning of Case 1a) and yet $|N(r_j)\setminus r_{j+1}| \ge 3$. This yields a contradiction (there are only $m$ rows!) and so we may conclude that for some $r$, $\|C_r\| \le 7$. Hence by our induction we deduce that $\|A\| \le 7m$.

Assume for all rows $r$ that $\|C_r\| \ge 8$ and hence find the sets $M(r)$ with $|M(r)| \ge 5$ (checking the three cases of Lemma 5.3.7). Let $r_1$ be some row of $A$. We form $M(r_1)$. Note that if $M(r_1)$ is type 3 then we have deleted the first and last rows (in the ordering) from the originally determined $L(r_1)$. We determine the sets $N(r_i)$ from $M(r_i)$ as follows:
\[
N(r) = \begin{cases}
M(r) & \text{if } M(r) \text{ is type 2 or 11,}\\
M(r) \setminus \{\text{first row in ordering}\} & \text{if } M(r) \text{ is type 3.}
\end{cases}
\tag{5.3.5}
\]
Our general step commences with $N(r_i)$. We select a row $r_{i+1} \in N(r_i)$, making sure that when $N(r_i)$ is of type 3, we select the first row in the ordering of Lemma 5.3.6. We obtain $M(r_{i+1})$ by applying Lemma 5.3.4, Lemma 5.3.5, Lemma 5.3.6 and Lemma 5.3.7. Given our assumption that $\|C_r\| \ge 8$ we have $|M(r_{i+1})| \ge 5$. Now by (5.3.5) we deduce $|N(r_{i+1})| \ge 4$ in all cases. We hope that distinguishing $L(r)$, $M(r)$ and $N(r)$ makes the proof clearer.

To show the desired properties of the sets $N(r_i)$, we set up an inductive hypothesis concerning the structure of $A$. In what follows let $Z$ denote a matrix of 0's (or perhaps a matrix of no columns) and $J$ denote a matrix of 1's (or perhaps a matrix of no columns). The critical inductive structure is the following. The middle columns correspond to the columns of $C_{r_i}$ as shown in (5.3.6).
For each $p$ with $p < i$ and $N(r_p)$ of type 2 or 3 we have the structure shown in rows $N(r_p)\setminus r_{p+1}$. For each $q$ with $q < i$ and $N(r_q)$ of type 11 we have the structure shown in rows $N(r_q)\setminus r_{q+1}$. We have three cases depending on the type of $N(r_i)$.

When $N(r_i)$ is type 2: we have $S = [0\ I]$ or $[0\ I\ 1]$ and the columns of $U_i$ and $V_i$ are in $[0\ I\ 1]$.

When $N(r_i)$ is type 11: we have $S = [I^c\ 1]$ or $[0\ I^c\ 1]$ and the columns of $U_i, V_i$ are in $[0\ I^c\ 1]$.

When $N(r_i)$ is type 3: we have $S = [0\ T\ 1]$ and the columns of $U_i, V_i$ are in $T$.
\[
A = \begin{array}{r|cccccc}
r_i \to & 0\cdots0 & 0\cdots0 & 0\cdots0 & 1\cdots1 & 1\cdots1 & 1\cdots1\\
 & \vdots & & & & & \vdots\\
N(r_p)\setminus r_{p+1} & Z & W^0_p & Z & Z & W^1_p & Z\\
 & \vdots & & & & & \vdots\\
N(r_q)\setminus r_{q+1} & J & W^0_q & J & J & W^1_q & J\\
 & \vdots & & & & & \vdots\\
N(r_i) & U_i & ZJ & \underbrace{S}_{C_{r_i}} & \underbrace{S}_{C_{r_i}} & ZJ & V_i\\
 & \vdots & & & & & \vdots
\end{array}
\tag{5.3.6}
\]
We proceed to verify that we have the same inductive structure for $r_{i+1}$. There will be cases to explore. It is helpful to display representatives of $H_2, H_3, H_5$ that we will use in our arguments. For $M(r_{i+1})$ type 2 or 11 we will use
\[
H_2 = \begin{array}{r|ccccc}
r_{i+1}&0&0&0&1&1\\ s&1&0&0&0&0\\ i&1&1&0&1&0\\ j&0&0&1&0&1
\end{array},
\qquad
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&1&1&1\\ s&0&0&0&1\\ i&1&1&0&1\\ j&0&0&1&0
\end{array}
\tag{5.3.7}
\]
\[
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&0&0&1\\ t&0&1&1&1\\ i&1&0&1&1\\ j&0&1&0&0
\end{array},
\qquad
H_5 = \begin{array}{r|ccccc}
r_{i+1}&0&0&1&1&1\\ t&1&1&1&1&0\\ i&0&1&0&1&1\\ j&1&0&1&0&0
\end{array}
\tag{5.3.8}
\]
For $M(r_{i+1})$ type 3 we will use
\[
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&1&1&1\\ s&1&0&0&0\\ i&1&0&1&1\\ j&0&0&0&1
\end{array},
\qquad
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&1&1&1\\ s&0&0&0&1\\ i&1&0&1&1\\ j&1&0&0&0
\end{array}
\tag{5.3.9}
\]
\[
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&0&0&1\\ t&0&1&1&1\\ i&1&0&1&1\\ j&1&0&0&0
\end{array},
\qquad
H_3 = \begin{array}{r|cccc}
r_{i+1}&0&0&0&1\\ t&1&1&1&0\\ i&0&1&1&1\\ j&0&0&1&0
\end{array}
\tag{5.3.10}
\]

Case 1: $N(r_i)$ is type 2.

Begin with the inductive structure of (5.3.6). Given $N(r_i)$ of type 2 we have $S = [0\ I]$ or $[0\ I\ 1]$. Choose a row $r_{i+1} \in N(r_i)$. Now consider the decomposition (2.1.1) applied to $A$ using row $r = r_{i+1}$. Apply Lemma 5.3.4, Lemma 5.3.5, Lemma 5.3.6 and Lemma 5.3.7 to obtain $M(r_{i+1})$.

Case 1a: $M(r_{i+1})$ is type 2.

The columns of $C_{r_{i+1}}$ must appear once with a 0 in row $r_{i+1}$ and once with a 1 in row $r_{i+1}$. By Lemma 5.3.7 we know that columns of $A|_{N(r_i)}$ are contained in $[0\ I\ 1]$.
The only columns of $A|_{N(r_i)}$ which differ only in row $r_{i+1}$ would be the column of 0's and the column of all 0's except a 1 in row $r_{i+1}$. Thus the repeated columns of $C_{r_{i+1}}$, when restricted to rows $N(r_i)\setminus r_{i+1}$, must be all 0's. By examining (5.3.6), the only columns of $A$ that are all 0's on the rows $N(r_i)$, or that have a single 1 (on row $r_{i+1}$) on the rows $N(r_i)$, are the columns which are $Z$ in rows $N(r_p)\setminus r_{p+1}$ for those $p < i$ with $N(r_p)$ being type 2 or 3, and $J$ in rows $N(r_q)\setminus r_{q+1}$ for those $q < i$ with $N(r_q)$ being type 11.

We need to show that $N(r_{i+1})$ is disjoint from $N(r_j)\setminus r_{j+1}$ for all $j < i+1$. All columns in $W^0$ or $W^1$ of (5.3.6) are either all 0's or all 1's on the rows of $N(r_i)$ and so will not give rise to columns of $C_{r_{i+1}}$. We deduce that the columns of $C_{r_{i+1}}$ are all 0's in rows $N(r_p)\setminus r_{p+1}$ for those $p < i$ with $N(r_p)$ being type 2 or 3, and all 1's in rows $N(r_q)\setminus r_{q+1}$ for those $q < i$ with $N(r_q)$ being type 11. Recalling that we form $L(r_{i+1})$ by deleting rows of $C_{r_{i+1}}$ while preserving simplicity, we deduce that $L(r_{i+1})$ (and hence $M(r_{i+1})$ and $N(r_{i+1})$) is disjoint from $N(r_j)\setminus r_{j+1}$ for all $j < i+1$. This gives us the structure of $C_{r_{i+1}}$ given below in (5.3.11), where the two copies of $C_{r_{i+1}}$ occupy the central columns.

To complete (5.3.11) we define $W^0$ and $W^1$ (likely different from those in (5.3.6) in the paragraph above). We choose from the columns of $B_{r_{i+1}}$ and $D_{r_{i+1}}$ all columns which, for some $\ell < i$ where $N(r_\ell)$ is type 2 or 3 (and hence rows $N(r_\ell)$ are $Z$ in $C_{r_{i+1}}$), have a 1 in some row of $N(r_\ell)$, or which, for some $\ell < i$ where $N(r_\ell)$ is type 11 (and hence rows $N(r_\ell)$ are $J$ in $C_{r_{i+1}}$), have a 0 in some row of $N(r_\ell)$. We identify such columns in $B_{r_{i+1}}$ as $W^0$ and such columns in $D_{r_{i+1}}$ as $W^1$. Moreover let $W^0_t$ (respectively $W^1_t$) denote the submatrix of $W^0$ (respectively $W^1$) in rows $N(r_t)\setminus r_{t+1}$ for $t = 1, \ldots, i$ or in rows $M(r_t)$ for $t = i+1$.
All remaining columns of $B_{r_{i+1}}$ and $D_{r_{i+1}}$ are all 0's on rows of each $N(r_\ell)$ where $N(r_\ell)$ is type 2 or 3, and all 1's on rows of each $N(r_\ell)$ where $N(r_\ell)$ is type 11, for $\ell < i$.
\[
\begin{array}{r|cccccc}
r_{i+1} \to & 0\cdots0 & 0\cdots0 & 0\cdots0 & 1\cdots1 & 1\cdots1 & 1\cdots1\\
 & \vdots & & & & & \vdots\\
N(r_p)\setminus r_{p+1} & Z & W^0_p & Z & Z & W^1_p & Z\\
 & \vdots & & & & & \vdots\\
N(r_q)\setminus r_{q+1} & J & W^0_q & J & J & W^1_q & J\\
 & \vdots & & & & & \vdots\\
N(r_i)\setminus r_{i+1} & Z & W^0_i & Z & Z & W^1_i & Z\\
M(r_{i+1}) & U_{i+1} & W^0_{i+1} & 0\ I\ 1 & 0\ I\ 1 & W^1_{i+1} & V_{i+1}\\
 & \vdots & & & & & \vdots
\end{array}
\tag{5.3.11}
\]
By Lemma 5.3.7 we know that columns of $A|_{N(r_i)}$ are contained in $[0\ I\ 1]$, and so we deduce that columns of $U_{i+1}, V_{i+1}$ are in $[0\ I\ 1]$. Our remaining goal is to show that $W^0_{i+1} = ZJ$ and $W^1_{i+1} = ZJ$ to complete the induction. We will use the four forbidden matrices of (5.3.7), (5.3.8), which have been ordered and labelled to assist the reader in seeing the occurrence of the forbidden objects $H_2, H_3, H_5$.

Assume for some column $\alpha$ of $W^0$ that $\alpha$ has a 1 in row $s \in N(r_p)\setminus r_{p+1}$ where $N(r_p)$ is type 2 or 3. We will give this first case in greater detail. All columns of $C_{r_{i+1}}$ have 0's in the rows of $N(r_p)$ and in particular in row $s$. Given that $M(r_{i+1})$ is type 2 or 11 we deduce $C_{r_{i+1}}|_{M(r_{i+1})}$ contains either $I$ or $I^c$. Thus each pair of rows $i, j \in M(r_{i+1})$ will contain $\begin{bmatrix}1&0\\0&1\end{bmatrix}$ in each copy of $C_{r_{i+1}}$. We find the following entries in $A$ in the rows $r_{i+1}, s, i, j$, where the left column comes from $\alpha$ and the remaining columns are from the two copies of $C_{r_{i+1}}$:
\[
\begin{array}{r|ccccc}
r_{i+1}&0&0&0&1&1\\ s&1&0&0&0&0\\ i&a&1&0&1&0\\ j&b&0&1&0&1
\end{array}
\]
If $\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}1\\0\end{bmatrix}$ or $\begin{bmatrix}0\\1\end{bmatrix}$ then we have a representative of $H_2$ as noted in the left matrix of (5.3.7). Thus the column $\alpha$ which contains a 1 in some row $s$ of $W^0_p$ must either be all 0's or all 1's on the rows $M(r_{i+1})$.

Assume for some column $\beta$ of $W^0$ that $\beta$ has a 0 in row $t \in N(r_q)\setminus r_{q+1}$ where $N(r_q)$ is type 11. Using the left matrix of (5.3.8) we may argue as above that column $\beta$ must either be all 0's or all 1's on the rows $M(r_{i+1})$. Given our choice of $W^0$, this is enough to show that $W^0_{i+1}$ is $ZJ$.
Assume for some column $\alpha$ of $W^1$ that $\alpha$ has a 1 in row $s \in N(r_p)\setminus r_{p+1}$ where $N(r_p)$ is type 2 or 3, and hence we find 0's in $C_{r_{i+1}}$ in row $s$. Hence by the right matrix in (5.3.7), $\alpha$ cannot have the submatrix $\begin{bmatrix}1\\0\end{bmatrix}$ on rows $i, j$ for any choice of $i, j \in M(r_{i+1})$. As above, the column $\alpha$ is either all 1's or all 0's on the rows of $M(r_{i+1})$. Similarly, using the right matrix of (5.3.8), we can show that any column $\beta$ of $W^1$ that has a 0 in row $t \in N(r_q)\setminus r_{q+1}$, where $N(r_q)$ is type 11, cannot have the submatrix $\begin{bmatrix}1\\0\end{bmatrix}$ on rows $i, j$ for any choice of $i, j \in M(r_{i+1})$. Hence $\beta$ is either all 0's or all 1's on the rows of $M(r_{i+1})$. Thus $W^1_{i+1} = ZJ$ as desired.

Setting $N(r_{i+1}) = M(r_{i+1})$ results in the same structure of (5.3.6) with $r_i$ replaced by $r_{i+1}$ and $S = [0\ I]$ or $[0\ I\ 1]$.

Case 1b: $M(r_{i+1})$ is type 11.

We can use the argument of Case 1a if $M(r_{i+1})$ is type 11, since any two rows of $I^c$ contain $I_2$, allowing us to use the matrices of (5.3.7), (5.3.8) as above. We would obtain (5.3.6) with $r_i$ replaced by $r_{i+1}$, $N(r_{i+1}) = M(r_{i+1})$ and $S = [I^c\ 1]$ or $[0\ I^c\ 1]$.

Case 1c: $M(r_{i+1})$ is type 3.

We follow the argument at the beginning of Case 1a to obtain most of the structure of (5.3.12). Given that we form $L(r_{i+1})$ by deleting rows of $C_{r_{i+1}}$ while preserving simplicity, we deduce that $L(r_{i+1})$ (and hence $M(r_{i+1})$) is disjoint from $N(r_j)\setminus r_{j+1}$ for all $j < i+1$. We will use (5.3.9) and (5.3.10) and, arising from the left matrix of (5.3.10), we discover a row of $M(r_{i+1})$ that must be deleted.
\[
\begin{array}{r|cccccc}
r_{i+1} \to & 0\cdots0 & 0\cdots0 & 0\cdots0 & 1\cdots1 & 1\cdots1 & 1\cdots1\\
 & \vdots & & & & & \vdots\\
N(r_p)\setminus r_{p+1} & Z & W^0_p & Z & Z & W^1_p & Z\\
 & \vdots & & & & & \vdots\\
N(r_q)\setminus r_{q+1} & J & W^0_q & J & J & W^1_q & J\\
 & \vdots & & & & & \vdots\\
N(r_i)\setminus r_{i+1} & Z & W^0_i & Z & Z & W^1_i & Z\\
M(r_{i+1}) & U_{i+1} & W^0_{i+1} & 0\ T\ 1 & 0\ T\ 1 & W^1_{i+1} & V_{i+1}
\end{array}
\tag{5.3.12}
\]
Do not be concerned that $C_{r_{i+1}}$ as shown is not simple, as we have deleted two rows from $L(r_{i+1})$ to obtain $M(r_{i+1})$, and these rows are not displayed here. As before, we note that by Lemma 5.3.7, the columns of $U_{i+1}, V_{i+1}, W^0_{i+1}, W^1_{i+1}$ are contained in $T$.
Our goal to complete the induction is to show $W^0_{i+1} = ZJ$ and $W^1_{i+1} = ZJ$. We use the four forbidden matrices of (5.3.9), (5.3.10). Given that $C_{r_{i+1}}|_{M(r_{i+1})} = [0\ T\ 1]$, each pair of rows $i, j \in M(r_{i+1})$ with $i < j$ in the special row ordering of $M(r_{i+1})$ will contain $\begin{bmatrix}0&1&1\\0&0&1\end{bmatrix}$ in each copy of $C_{r_{i+1}}$.

If we have a column $\alpha$ of $W^1$ with a 1 in a row $s \in N(r_j)\setminus r_{j+1}$ where $N(r_j)$ is type 2 or 3, then we find 0's in columns of $C_{r_{i+1}}$ in row $s$. Hence by the right matrix in (5.3.9), $\alpha$ cannot have the submatrix $\begin{bmatrix}1\\0\end{bmatrix}$ on rows $i, j$ for each pair of rows $i, j \in M(r_{i+1})$ with $i < j$. Given that $\alpha|_{M(r_{i+1})}$ is a column in $T$, we deduce that column $\alpha$ is either all 1's or all 0's on the rows of $M(r_{i+1})$. If we have a column $\beta$ of $W^1$ with a 0 in a row $t \in N(r_j)\setminus r_{j+1}$ where $N(r_j)$ is type 11, we find 1's in row $t$ of $C_{r_{i+1}}$. Hence by the right matrix in (5.3.10), $\beta$ cannot have the submatrix $\begin{bmatrix}1\\0\end{bmatrix}$ on rows $i, j$ for each pair of rows $i, j \in M(r_{i+1})$ with $i < j$. As above, the column $\beta$ is either all 1's or all 0's on the rows of $M(r_{i+1})$. This considers all columns of $W^1$ and so $W^1_{i+1} = ZJ$.

If we have a column $\alpha$ of $W^0$ with a 1 in a row $s \in M(r_p)\setminus r_{p+1}$ where $M(r_p)$ is type 2 or 3, we find 0's in row $s$ of $C_{r_{i+1}}$. Hence by the left matrix in (5.3.9), $\alpha$ cannot have the submatrix $\begin{bmatrix}1\\0\end{bmatrix}$ on rows $i, j$ for each pair of rows $i, j \in M(r_{i+1})$ with $i < j$, and so the column $\alpha$ is either all 1's or all 0's on the rows of $M(r_{i+1})$.

If we have a column $\beta$ of $W^0$ with a 0 in row $t \in N(r_q)$ where $N(r_q)$ is type 11, then we follow a different argument that we explain more carefully. For $i, j \in M(r_{i+1})$ with $i < j$, we find the entries given below in the rows $r_{i+1}, t, i, j$ in the given column $\beta$ (the column on the left) and selected columns of $C_{r_{i+1}}$ (on the right):
\[
\begin{array}{r|cccc}
r_{i+1}&0&0&0&1\\ t&0&1&1&1\\ i&a&0&1&1\\ j&b&0&0&0
\end{array}
\]
If $\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}1\\1\end{bmatrix}$ then this yields $H_3$ in $A$ as noted in the left matrix in (5.3.10). Now $\beta|_{M(r_{i+1})}$ is a column in $T$ and yet cannot have the submatrix $\begin{bmatrix}1\\1\end{bmatrix}$.
Thus $\beta$ on the rows of $M(r_{i+1})$ is either all 0's or possibly the column of all 0's except a single 1 in the first row of $M(r_{i+1})$. It is for this case that we need to delete the first row of $M(r_{i+1})$ to obtain $N(r_{i+1})$ (as in (5.3.5)) so that on the rows $N(r_{i+1})$, the matrix $W^0_{i+1} = ZJ$. We now have obtained (5.3.6) with $r_i$ replaced by $r_{i+1}$, and $S = T$.

Case 2: $N(r_i)$ is type 11.

We use the same argument as Case 1. When $N(r_i)$ is type 11 we would have to replace $I$ by $I^c$ in $S$ in (5.3.6) and then proceed to $M(r_{i+1})$ of type 2 or 11 (essentially Case 1a or 1b) or $M(r_{i+1})$ of type 3 (essentially Case 1c).

Case 3: $N(r_i)$ is type 3.

Begin with the inductive structure of (5.3.6) where $N(r_i)$ is type 3 and $S = [0\ 0\ T\ 1\ 1]$. Now choose the first row $r_{i+1} \in N(r_i)$ using the ordering on $N(r_i)$. Now consider Standard Induction applied to $A$ using row $r_{i+1}$. We deduce that in rows $N(r_i)\setminus r_{i+1}$, the repeated columns in $C_{r_{i+1}}$ are $Z$, since for a column to be repeated its extensions to row $r_{i+1}$ with both a 0 and a 1 must be present in $A$. The repeated columns under the 1's in row $r_{i+1}$ must correspond to columns of a single 1 on rows $N(r_i)\setminus r_{i+1}$, and by (5.3.6) we can then deduce the structure of the other rows of the columns in $C_{r_{i+1}}$. Note that in what follows I have rearranged the columns of $B_{r_{i+1}}$ and $D_{r_{i+1}}$ so that we have put in the columns of the $W_j$'s all columns which have a 1 in a row of $N(r_j)$ where $N(r_j)$ is type 2 or 3 (and hence is $Z$ in $C_{r_{i+1}}$), and all columns which have a 0 in a row of $N(r_j)$ where $N(r_j)$ is type 11 (and hence is $J$ in $C_{r_{i+1}}$), matching the choice of $W^0$ and $W^1$ in Case 1a. This yields (5.3.11) or (5.3.12).

If $M(r_{i+1})$ is type 2 we follow the same argument as in Case 1a to deduce that $W^0_{i+1}$ and $W^1_{i+1}$ have only constant columns. Similarly, the case where $M(r_{i+1})$ is type 11 can use the argument of Case 1b by switching $I$ with $I^c$. In either case we set $N(r_{i+1}) = M(r_{i+1})$.
If $M(r_{i+1})$ is type 3, we follow the same argument as in Case 1c and again may have to delete the first row of $M(r_{i+1})$ to obtain $N(r_{i+1})$, which yields (5.3.6) with $r_i$ replaced by $r_{i+1}$. This concludes the induction, and so we have proven that we can find rows $r_1, r_2, r_3, \ldots$ and disjoint sets $N(r_i)\setminus r_{i+1}$ with $|N(r_i)\setminus r_{i+1}| \ge 3$, yielding a contradiction. As noted, this proves the result.

We still have eight $5 \times 6$ simple $F$ for which the conjecture predicts they are boundary cases with $\mathrm{forb}(m,F)$ being $O(m^2)$. Given the complicated case analysis of this proof, it seems challenging to prove such bounds. One positive observation is that Lemma 5.3.5 may not be necessary. We were only interested in having a large set $L(r)$, say $|L(r)| \ge 8$, for which each triple is in a given case. We could appeal to Ramsey Theory [Ram30]: given a finite number of cases, we can identify a large (!) constant $c$ so that if $\|C_r\| \ge c$ then there are, say, 8 rows such that every triple is in the same case and in the same row ordering. This would avoid appealing to the particular structures of cases $P_2, P_3, P_{11}$ but is not advantageous for our proof.

5.4 Classification of 6-rowed Quadratic Bounds

This section is devoted to proving a quadratic bound for the $6 \times 3$ configuration $G_{6\times 3}$. We prove $\mathrm{forb}(m,G_{6\times 3})$ is $\Theta(m^2)$, and furthermore, we prove $G_{6\times 3}$ is the only 6-rowed boundary quadratic case, therefore classifying all quadratic 6-rowed cases. The following theorem classifies all 6-rowed configurations $F$ for which $\mathrm{forb}(m,F)$ is $\Theta(m^2)$ by giving the unique boundary case.

Theorem 5.4.1 Let $F$ be any 6-rowed configuration. Then $\mathrm{forb}(m,F)$ is $\Theta(m^2)$ if and only if $F$ is a configuration in
\[
G_{6\times 3} = \begin{bmatrix}
1&1&1\\ 1&1&0\\ 1&0&1\\ 0&1&0\\ 0&0&1\\ 0&0&0
\end{bmatrix}.
\]
Furthermore, if $F \nprec G_{6\times 3}$, then $\mathrm{forb}(m,F)$ is $\Omega(m^3)$.

Given the classification and Remark 1.3.22, we are not surprised that $G_{6\times 3}^c = G_{6\times 3}$.
Anstee and Keevash [AK06] established the asymptotic bounds for all $k \times 2$ configurations, and in particular concluded that
\[
\mathrm{forb}\left(m, \begin{bmatrix}1&1\\1&1\\1&0\\0&1\\0&0\\0&0\end{bmatrix}\right)
\quad\text{and}\quad
\mathrm{forb}\left(m, \begin{bmatrix}1&1\\1&0\\1&0\\0&1\\0&1\\0&0\end{bmatrix}\right)
\]
are both $\Theta(m^2)$. The proof of the second of these begins to use the full power of the proof in [AK06], and so it is interesting that Theorem 5.4.1 provides a generalization of both of them using an inductive proof (admittedly, a rather complicated one for Theorem 5.3.1) that is quite different from that in [AK06].

In order to prove Theorem 5.4.1, we will use three results. First, Lemma 5.4.2 is the "only if" part of the theorem. The second, Lemma 5.4.3, generalizes Lemma 3.2 in [AK06]. Lastly, we will use the main result of the previous section, Theorem 5.3.1.

Lemma 5.4.2 Let $F$ be a 6-rowed configuration such that $F \nprec G_{6\times 3}$. Then $\mathrm{forb}(m,F)$ must be $\Omega(m^3)$.

Proof: We may assume all of $F$'s columns have column sum 3; otherwise, if $F$ had a column of column sum 4 or more, then $F \nprec I \times I \times I$, and if $F$ had a column of column sum 2 or less, then $F \nprec I^c \times I^c \times I^c$. Without loss of generality, let the first column of $F$ be $(1,1,1,0,0,0)^T$. With these assumptions, there are only a few cases left to check, and an exhaustive computer search revealed the lemma to be true. But we present here a more constructive proof, if for no other reason than to check the computer code.

Note that the following 2-columned matrices have at least a cubic bound:
\[
\begin{bmatrix}1&1\\1&1\\1&1\\0&0\\0&0\\0&0\end{bmatrix} \nprec I \times I \times I,
\qquad
\begin{bmatrix}1&0\\1&0\\1&0\\0&1\\0&1\\0&1\end{bmatrix} \nprec I \times I \times T.
\]
This means that in order to have $F$ for which $\mathrm{forb}(m,F)$ is not $\Omega(m^3)$, we must put together columns of sum 3 such that for each pair of columns, the number of rows where both columns have 1's is either one or two.
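Here are all the possibilities for (the first) two columns having 1's in (the first) two rows in common:
\[
\underbrace{\begin{bmatrix}1&1&1\\1&1&1\\1&0&0\\0&1&0\\0&0&1\\0&0&0\end{bmatrix}}_{\nprec\, I^c\times I^c\times I^c}
\quad
\underbrace{\begin{bmatrix}1&1&1\\1&1&0\\1&0&1\\0&1&1\\0&0&0\\0&0&0\end{bmatrix}}_{\nprec\, I\times I\times I}
\quad
\underbrace{\begin{bmatrix}1&1&1\\1&1&0\\1&0&1\\0&1&0\\0&0&1\\0&0&0\end{bmatrix}}_{=\, G_{6\times 3}}
\quad
\underbrace{\begin{bmatrix}1&1&1\\1&1&0\\1&0&0\\0&1&0\\0&0&1\\0&0&1\end{bmatrix}}_{\nprec\, I^c\times I^c\times I^c}
\quad
\underbrace{\begin{bmatrix}1&1&0\\1&1&0\\1&0&1\\0&1&1\\0&0&1\\0&0&0\end{bmatrix}}_{\nprec\, I\times I\times I}
\]
The only other possibility is that each pair of columns has a 1 in only one row in common:
\[
\begin{bmatrix}1&1&0\\1&0&1\\1&0&0\\0&1&1\\0&1&0\\0&0&1\end{bmatrix} \nprec I \times I \times T.
\]
Thus, the only four-columned matrices $F$ for which $\mathrm{forb}(m,F)$ could be $O(m^2)$ have to contain $G_{6\times 3}$ in every three-columned subset. The only possibility is then
\[
\begin{bmatrix}1&1&1&0\\1&1&0&1\\1&0&1&0\\0&1&0&1\\0&0&1&1\\0&0&0&0\end{bmatrix} \nprec I \times I \times T,
\]
which means $\mathrm{forb}(m,F)$ is $\Omega(m^3)$ for the above matrix $F$. This concludes the proof of the lemma.

The following lemma generalizes Lemma 3.2 in [AK06].

Lemma 5.4.3 Let
\[
F = \begin{bmatrix} 0 \cdots 0\\ 1 \cdots 1\\ F' \end{bmatrix}.
\]
Then we can conclude that
\[
\mathrm{forb}(m,F) \le \mathrm{forb}\left(m, \begin{bmatrix} 1 \cdots 1\\ F' \end{bmatrix}\right) + \mathrm{forb}\left(m, \begin{bmatrix} 0 \cdots 0\\ F' \end{bmatrix}\right).
\]

Proof: Let $A \in \mathrm{Avoid}(m,F)$ with $\|A\| = \mathrm{forb}(m,F)$. Then permute the columns of $A$ (take another representative in the equivalence class) and write it as
\[
A = \begin{bmatrix} 0 \cdots 0 & 1 \cdots 1\\ A' & A'' \end{bmatrix}.
\]
Note that $A'$ and $A''$ are simple. Since $A'$ cannot have $\begin{bmatrix} 1 \cdots 1\\ F' \end{bmatrix}$ as a subconfiguration, and $A''$ cannot have $\begin{bmatrix} 0 \cdots 0\\ F' \end{bmatrix}$ as a subconfiguration, the bound follows.

We note that $G_{6\times 3}$ has a row of 0's and a row of 1's, and therefore by the previous lemma the quadratic bound for $\mathrm{forb}(m,G_{6\times 3})$ would follow from quadratic bounds for $\mathrm{forb}(m,G)$ and $\mathrm{forb}(m,G')$, with $G$ and $G'$ obtained by removing the row of 1's and the row of 0's from $G_{6\times 3}$ respectively:
\[
G = \begin{bmatrix}1&1&0\\1&0&1\\0&1&0\\0&0&1\\0&0&0\end{bmatrix}
\quad\text{and}\quad
G' = \begin{bmatrix}1&1&1\\1&1&0\\1&0&1\\0&1&0\\0&0&1\end{bmatrix}.
\]
We will prove more, as both are contained in the boundary case $F_7$. Observe that $G' = G^c$ as configurations. We are now ready to prove Theorem 5.4.1.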
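The containments $G \prec F_7$ and $G' \prec F_7$, as well as $F_7 = F_7^c$ and $H_3 \prec H_1$ from Section 5.3, are small enough to check by brute force. The following Python sketch is an illustration only and is not the program of Chapter 3; the function names `columns`, `is_subconfiguration` and `complement` are hypothetical, and the matrices are typed in row by row from the displays in this chapter.

```python
from collections import Counter
from itertools import permutations


def columns(M, rows):
    """Multiset of the columns of M restricted to the ordered tuple `rows`."""
    return Counter(tuple(M[r][c] for r in rows) for c in range(len(M[0])))


def is_subconfiguration(F, A):
    """True if some row and column permutation of F is a submatrix of A.

    Brute force: try every ordered choice of len(F) rows of A and compare
    column multisets, so this is only meant for very small matrices.
    """
    need = columns(F, range(len(F)))
    for rows in permutations(range(len(A)), len(F)):
        have = columns(A, rows)
        if all(have[col] >= k for col, k in need.items()):
            return True
    return False


def complement(M):
    """The 0-1 complement of M."""
    return [[1 - x for x in row] for row in M]


F7 = [[1, 1, 0, 1, 1, 0],
      [1, 0, 1, 1, 1, 1],
      [0, 1, 0, 1, 0, 1],
      [0, 0, 1, 0, 0, 1],
      [0, 0, 0, 0, 1, 0]]
G = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
G_prime = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1]]
H1 = [[1, 0, 1, 1, 1, 1],
      [0, 1, 1, 0, 0, 1],
      [0, 0, 0, 1, 0, 1],
      [0, 0, 0, 0, 1, 0]]
H3 = [[1, 1, 0, 1], [0, 1, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1]]

print(is_subconfiguration(G, F7), is_subconfiguration(G_prime, F7))
```

Since $F_7$ and $F_7^c$ have the same dimensions, `is_subconfiguration(complement(F7), F7)` returning `True` is the same as saying the two are equal as configurations. The search is exponential in the number of rows, so it is only a sanity check for matrices of this size.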
Proof of Theorem 5.4.1: To prove $\mathrm{forb}(m,G_{6\times 3})$ is $O(m^2)$ we use Lemma 5.4.3, and so we only need to prove $\mathrm{forb}(m,G)$ and $\mathrm{forb}(m,G')$ are both $O(m^2)$. We check that $G \prec F_7$ and $G' \prec F_7$. Now Theorem 5.3.1 shows that $\mathrm{forb}(m,F_7)$ is $O(m^2)$. Applying Remark 1.3.21 yields the bound for $G_{6\times 3}$. Lemma 5.4.2 verifies that every configuration $F$ not contained in $G_{6\times 3}$ has $\mathrm{forb}(m,F)$ being $\Omega(m^3)$.

Chapter 6 Patterns and Splits

6.1 Patterns and Splits in 2-Dimensions

We move away from the world of Forbidden Configurations to consider another extremal problem which will have applications to Forbidden Configurations (in Chapter 7), but no knowledge of Forbidden Configurations is required in order to understand this chapter. Instead of considering the maximum number of columns an $m$-rowed $\{0,1\}$-matrix can have, we are going to consider the maximum number of 1's an $m \times n$ $\{0,1\}$-matrix can have subject to some property. These problems are about geometric patterns in a grid and are close relatives of Zarankiewicz' problem [KST54], [Für96] and the investigations of patterns in [FH92], [MT04], [Tar05], [KM07].

Consider the grid $[m] \times [n]$ in the Euclidean 2-dimensional space. Let $\mathcal{A} \subseteq [m] \times [n]$. Any such subset can be represented by an $m \times n$ $\{0,1\}$-matrix $A$, where $A$ has a 1 in position $(x,y)$ iff $(x,y) \in \mathcal{A}$. In order to give the following definitions, we find it convenient to talk about the subset $\mathcal{A}$ instead of the matrix $A$, but in our applications and proofs it becomes more convenient to talk about the matrix $A$. Since $\mathcal{A}$ and $A$ are equivalent, we will switch back and forth between the language of grids and the language of matrices. Recall that $\sigma_1(A)$ denotes the number of 1's of $A$. Observe that $\sigma_1(A) = |\mathcal{A}|$.

We consider horizontal and vertical lines on integer values (i.e. $y = \ell$ or $x = \ell$ for some integer $\ell$). For a horizontal line $y = \ell$, we partition the points of the grid into two parts.
The first part consists of those points which lie in the bottom region of the line (points $(x,y)$ such that $y < \ell$) and the second part consists of points which lie above the line or on the line itself (points $(x,y)$ with $y \ge \ell$).

Let $p$ and $q$ be numbers. If we consider $p-1$ horizontal lines and $q-1$ vertical lines in the plane, we naturally divide the plane into $p \cdot q$ rectangular regions. The horizontal lines can be represented by $p-1$ numbers $h_1 < h_2 < \cdots < h_{p-1}$ and the vertical lines by $q-1$ numbers $v_1 < v_2 < \cdots < v_{q-1}$. Set $h_0 = v_0 = 1$ and set $h_p = n$, $v_q = m$. Then the regions are of the form
\[
R(i,j) := [h_i, h_{i+1}) \times [v_j, v_{j+1}).
\]

Definition 6.1.1 Let $p$ and $q$ be numbers and let $\mathcal{A} \subseteq [m] \times [n]$. We say $\mathcal{A}$ has a $(p,q)$-split if there exist $p-1$ horizontal lines and $q-1$ vertical lines such that for every one of the $pq$ regions $R(i,j)$ we have $R(i,j) \cap \mathcal{A} \neq \emptyset$.

Below is an example of a 3,3-split in $A$, where a 1 from each block is indicated.

[Figure: a matrix $A$ divided into nine blocks by two horizontal and two vertical lines, with one 1 indicated in each block. Caption: A 3,3 split.]

Definition 6.1.2 Let $\mathrm{NoSplit}(m,n;p,q)$ denote the maximum number of 1's in an $m \times n$ $\{0,1\}$-matrix that does not have a $p,q$-split.

Here is a definition that is related to splits, but differs. Note that row and column order is important.

Definition 6.1.3 Let $F$ be a $\{0,1\}$-matrix. We say a $\{0,1\}$-matrix $A$ has $F$ as a pattern if there is a submatrix $G$ of $A$ with $F \le G$. That is, if the entry $(i,j)$ is 1 in $F$ then entry $(i,j)$ is also 1 in $G$.

The following theorem was proven by Marcus and Tardos, but using different language.

Theorem 6.1.4 (Klazar and Marcus [KM07], Marcus and Tardos [MT04], Balogh, Bollobás and Morris [BBM06]). Let $k$ be given. Then there exists a constant $c_k$ such that $\mathrm{NoSplit}(n,n;k,k) \le c_k n$.

The result in [MT04] involving forbidden permutation patterns implies the above result by choosing the permutation appropriately. Moreover the proof of [MT04] directly extends to the above result. The papers [KM07], [BBM06] note this as well as deriving higher dimension generalizations.
While the constants involved in [MT04] are not optimal (in fact, they are very far from optimal), we can produce best possible constants for small values:

Theorem 6.1.5 Let $m,n$ be given with $m,n \ge 2$. Then
$$\mathrm{NoSplit}(m,n;2,q) = m + (q-1)n - (q-1)$$
and
$$\mathrm{NoSplit}(m,n;3,q) = 2m + (q-1)n - 2(q-1).$$

Proof: For any $m,n,p,q$, an $m \times n$ matrix $A$ with $(p-1)m + (q-1)n - (p-1)(q-1)$ 1's can be constructed with 1's in the first $p-1$ rows and the first $q-1$ columns. Then $A$ has no $p,q$ split. We show an example of this for $m = 7$, $n = 6$, $p = 3$ and $q = 2$.
$$\begin{bmatrix}
1&1&1&1&1&1&1\\
1&1&1&1&1&1&1\\
1&0&0&0&0&0&0\\
1&0&0&0&0&0&0\\
1&0&0&0&0&0&0\\
1&0&0&0&0&0&0
\end{bmatrix}$$
The matrix shown above has no 3,2-split: if we take any two horizontal lines and one vertical line, the bottom-right region will have no 1's. This yields a lower bound for $\mathrm{NoSplit}(m,n;p,q)$.

For $p = 2$, let $A$ be a matrix with no $2,q$ split and form a matrix $A'$ from $A$ by first deleting the bottommost 1 from each column. Thus the bottom row has all 0's. At most $m$ 1's were deleted. Then delete the $q-1$ rightmost 1's from each non-zero row. At most we delete $(n-1)(q-1)$ 1's. Then we have
$$\sigma_1(A) \le \sigma_1(A') + m + (n-1)(q-1).$$
Now, if $\sigma_1(A') > 0$, pick a 1 from $A'$. It must have $q-1$ 1's of $A$ to its right, and each of these $q$ 1's must have a 1 below it. This produces a $2,q$ split, a contradiction.

For $p = 3$ we proceed in a similar way. Form $A'$ by deleting from $A$ the bottommost 1 from each column. Then delete the topmost 1 from each column. Note that the bottom row as well as the top row cannot have 1's. Then delete the $q-1$ rightmost 1's in each non-zero row. So
$$\sigma_1(A) \le \sigma_1(A') + 2m + (n-2)(q-1).$$
Now, if $\sigma_1(A') > 0$, pick a 1 from $A'$. It must have $q-1$ 1's to its right, and each of these $q$ 1's must have a 1 below and a 1 above. This produces a $3,q$ split, a contradiction.

This proof technique was introduced to the authors by Jozsef Solymosi as a curling technique (the winter sport of curling uses a strategy called 'peeling').

We do not have good bounds for $\mathrm{NoSplit}(m,n;p,q)$ with $p,q \ge 4$.
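The lower-bound construction in the proof of Theorem 6.1.5 can be checked on the small example with $m = 7$, $n = 6$, $p = 3$, $q = 2$. The Python sketch below is illustrative only: the helper names (`extremal`, `ones`, `has_split`) are hypothetical, matrices are stored as lists of rows with $n$ rows and $m$ columns (matching the displayed example), and the split test is a plain brute-force search. It confirms the count of 1's, that the construction has no 3,2-split, and that switching any single 0 to a 1 forces a 3,2-split, as the theorem predicts for this size.

```python
from itertools import combinations

def strips(length, parts):
    """Yield every way to cut range(length) into `parts` consecutive nonempty strips."""
    for cuts in combinations(range(1, length), parts - 1):
        bounds = (0, *cuts, length)
        yield [range(bounds[i], bounds[i + 1]) for i in range(parts)]

def has_split(A, p, q):
    """Brute-force test for p row strips and q column strips with a 1 in every block."""
    return any(
        all(any(A[i][j] for i in R for j in C) for R in rs for C in cs)
        for rs in strips(len(A), p)
        for cs in strips(len(A[0]), q)
    )

def extremal(m, n, p, q):
    """Matrix with n rows and m columns, 1's in the first p-1 rows and first q-1 columns."""
    return [[1 if (i < p - 1 or j < q - 1) else 0 for j in range(m)] for i in range(n)]

def ones(A):
    """Number of 1's in A (sigma_1 in the text)."""
    return sum(map(sum, A))

m, n, p, q = 7, 6, 3, 2
A = extremal(m, n, p, q)
formula = (p - 1) * m + (q - 1) * n - (p - 1) * (q - 1)
no_split = not has_split(A, p, q)

# Switch each 0 to a 1 in turn and test whether a 3,2-split appears.
forced = True
for i in range(n):
    for j in range(m):
        if A[i][j] == 0:
            A[i][j] = 1
            forced = forced and has_split(A, p, q)
            A[i][j] = 0

print(ones(A), formula, no_split, forced)
```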
We conjecture that $\mathrm{NoSplit}(m,n;4,4) = 3m + 3n + \min(m,n) - 13$. We can prove that $\mathrm{NoSplit}(m,n;4,4) \ge 3m + 3n + \min(m,n) - 13$ by giving an example with no (4,4)-split. Here is the example for $m = n = 8$, but it can be easily generalized.
$$\begin{bmatrix}
1&1&0&0&0&0&1&1\\
1&1&1&0&0&0&1&1\\
0&1&1&1&0&0&1&1\\
0&0&1&1&1&0&1&1\\
0&0&0&1&1&1&1&1\\
0&0&0&0&1&1&1&1\\
1&1&1&1&1&1&1&1\\
1&1&1&1&1&1&1&0
\end{bmatrix}$$
The construction above can be generalized for bigger $p,q$. We believe this to be the best possible construction, but have so far been unable to prove it.

Conjecture 6.1.6 Let $n,p$ be given. Then for $p \ge 4$,
$$\mathrm{NoSplit}(n,n;p,p) = (3p-5)n + O(p^2).$$

The lower bound is given by a construction like the one given above for $p = 4$, $q = 4$.

We use the following lemma about a particular 3,3 split in Chapter 7.

Lemma 6.1.7 Let $A$ be a $\{0,1\}$-matrix with $\sigma_1(A) > 4m - 4$. Then $A$ has a special 3,3 split for which the 1's in rows are in the same line. In other words, $A$ has nine 1's that satisfy the following structure,
$$\begin{array}{ccccc}
1 & \longleftarrow & 1 & \longrightarrow & 1\\
 & & \uparrow & & \\
1 & \longleftarrow & 1 & \longrightarrow & 1\\
 & & \downarrow & & \\
1 & \longleftarrow & 1 & \longrightarrow & 1
\end{array}$$

Proof: The proof is the same as that of Theorem 6.1.5. Form a matrix $A'$ from $A$ by deleting from $A$ in turn the leftmost and rightmost 1's in each row and then the topmost and bottommost 1's in each column. We have that $\sigma_1(A) \le \sigma_1(A') + 4m - 4$. Hence $\sigma_1(A') > 0$. Select a 1 in $A'$. There is a 1 above and a 1 below. Then for each of these three 1's, there is a 1 to the right and a 1 to the left. This yields the desired structure.

We also study a particular 2,4 split.

Lemma 6.1.8 There exists a $\{0,1\}$-matrix $A$ with $\sigma_1(A)$ being $\Omega(m\alpha(m))$, where $\alpha(m)$ is the inverse Ackermann function, and which doesn't have a 2,4 split in which the topmost 1's lie in the same row and the bottommost 1's lie in the same row as well. In other words, there are no eight 1's with the following structure:
$$\begin{array}{ccccccc}
1 & \longrightarrow & 1 & \longrightarrow & 1 & \longrightarrow & 1\\
1 & \longrightarrow & 1 & \longrightarrow & 1 & \longrightarrow & 1
\end{array}$$

The following observation was suggested to us by Tardos.
Observe that if $A$ has a special split like the one described above, then it must also have the pattern
$$P = \begin{bmatrix} 1&0&1&0\\ 0&1&0&1 \end{bmatrix}.$$
The following proposition of Füredi and Hajnal provides a construction.

Proposition 6.1.9 [FH92] There exists a $\{0,1\}$-matrix $A$ with no pattern $P$ for which $\sigma_1(A)$ is $\Omega(m\alpha(m))$, where $\alpha(m)$ denotes the inverse Ackermann function.

Since the Ackermann function grows very quickly, its inverse grows very slowly, but it is still unbounded. This means $m\alpha(m)$ is more than linear.

6.2 Patterns and Splits in d-Dimensions

The papers [KM07],[BBM06] consider Theorem 6.1.4 generalized to $d$-dimensional arrays. Unfortunately, the notation for talking about these splits becomes cumbersome as we consider many dimensions, but the ideas remain simple. We use the following notation. Given integers $n_1, n_2, \ldots, n_d$ we can consider the positions $\prod_{i=1}^{d}[n_i]$ in an $n_1 \times n_2 \times \cdots \times n_d$ $\{0,1\}$-array $A$. Our main interest is in the case $n_1 = n_2 = \cdots = n_d$. Let $p_1, p_2, \ldots, p_d \ge 2$ be given. Assume we have $d$ sets of indices $I(j) = \{r_1(j), r_2(j), \ldots, r_{p_j-1}(j)\}$, one for each coordinate $j = 1, 2, \ldots, d$. For each $j$ we can form $p_j$ sets $R_1(j), R_2(j), \ldots, R_{p_j}(j)$ with $\bigcup_{i=1}^{p_j} R_i(j) = [n_j]$ as follows: $R_1(j) = \{1, 2, \ldots, r_1(j)\}$, $R_2(j) = \{r_1(j)+1, r_1(j)+2, \ldots, r_2(j)\}$, ..., $R_{p_j}(j) = \{r_{p_j-1}(j)+1, r_{p_j-1}(j)+2, \ldots, n_j\}$. We generalize to many dimensions the notion of splits and NoSplit.

Definition 6.2.1 We say $A$ has a $p_1, p_2, \ldots, p_d$ split if we can choose the sets as above so that, for every choice of $t_j \in [p_j]$ for each $j \in [d]$, with $R(j) = R_{t_j}(j)$, each of the $\prod_{j=1}^{d} p_j$ blocks $A|_{R(1)\times R(2)\times\cdots\times R(d)}$ contains at least one 1.

Perhaps an example would be useful to understand this rather cumbersome notation.
Example 6.2.2 The following matrix (with the blocks $R_1(1), R_2(1), R_3(1)$ and $R_1(2), R_2(2)$ marked in the original figure)
$$\begin{bmatrix}
0&0&0&0&1&0&0&1\\
0&1&0&0&0&1&0&0\\
0&0&0&0&0&1&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&0&0&0&0&1\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0
\end{bmatrix}$$
in $[8]\times[8]$ has a 3,2 split, where $p_1 = 3$, $p_2 = 2$, $I(1) = \{3,6\}$ and $I(2) = \{3\}$, and so $R_1(1) = \{1,2,3\}$, $R_2(1) = \{4,5,6\}$, $R_3(1) = \{7,8\}$, $R_1(2) = \{1,2,3\}$, $R_2(2) = \{4,5,6,7,8\}$.

Definition 6.2.3 Let $\mathrm{NoSplit}(n_1, n_2, \ldots, n_d; p_1, p_2, \ldots, p_d)$ be the maximum number of 1's in an $n_1 \times n_2 \times \cdots \times n_d$ $\{0,1\}$-array that has no $p_1, p_2, \ldots, p_d$ split.

The following describes the asymptotic behavior of NoSplit.

Theorem 6.2.4 Klazar and Marcus [KM07], Balogh, Bollobás and Morris [BBM06]. Let $k,d$ be given. Then there exists a constant $c_{k,d}$ so that
$$\mathrm{NoSplit}(\underbrace{n,\ldots,n}_{d};\underbrace{k,\ldots,k}_{d}) \le c_{k,d}\, n^{d-1}.$$

We may also extend the argument in Theorem 6.1.5 to $\underbrace{3,3,\ldots,3}_{d-1},q$ splits of $d$-dimensional arrays. It is surprising that we get exact results here yet the results of [KM07] and [BBM06] do not have a reasonable bound for $\mathrm{NoSplit}(m,n;4,4)$.

Theorem 6.2.5 Let $B$ be the $\prod_{i=1}^{d}[n_i]$ $\{0,1\}$-array with 1's in entries whose $j$th coordinate is 1 or 2 for some $j = 1, 2, \ldots, d-1$ or whose $d$th coordinate is $1, 2, \ldots$ or $q-1$. The array $B$ has no $\underbrace{3,3,\ldots,3}_{d-1},q$ split. Let $A$ be any $\prod_{i=1}^{d}[n_i]$ $\{0,1\}$-array with no $\underbrace{3,3,\ldots,3}_{d-1},q$ split. Then $\sigma_1(A) \le \sigma_1(B)$, and hence for $q = 3$,
$$\mathrm{NoSplit}(n_1, n_2, \ldots, n_d; \underbrace{3,3,\ldots,3}_{d-1},q) = \sigma_1(B).$$

Proof: Let $A$ be any $\prod_{i=1}^{d}[n_i]$ $\{0,1\}$-array with no $\underbrace{3,3,\ldots,3}_{d-1},q$ split. For each direction $e_i$ with $i = 1, 2, \ldots, d-1$, remove from $A$, in each line in the direction $e_i$, the two 1's with the smallest and largest coordinate value $x_i$. Finally, in the direction $e_d$, remove the largest $q-1$ points in each line in the direction $e_d$. Let $A'$ be the resulting array. Now if $A'$ has a 1 in position $(y_1, y_2, \ldots, y_d)$ then we note that $A$ has 1's in positions $(y_1, y_2, \ldots, y_{d-1}, y_d(j))$ for $j \in [q]$, where $y_d = y_d(1)$ and $y_d(1) < y_d(2) < \cdots < y_d(q)$. It is now straightforward to show that choosing indices $I(1) = \{y_1 - 1, y_1\}$, $I(2) = \{y_2 - 1, y_2\}$, ..., $I(d-1) = \{y_{d-1} - 1, y_{d-1}\}$ and $I(d) = \{y_d(1), y_d(2), \ldots, y_d(q-1)\}$ yields a $\underbrace{3,3,\ldots,3}_{d-1},q$ split. We can show that $\sigma_1(A) - \sigma_1(A') \le \sigma_1(B)$; hence if $\sigma_1(A) > \sigma_1(B)$, then $\sigma_1(A') > 0$ and $A$ would have the desired split.

The following notation is helpful.

Definition 6.2.6 Thinking of the positions in $B$ as elements of $[n]^d$, we let the coordinates of $B$ be $x_1, x_2, \ldots, x_d$ and for a position $\mathbf{y} \in [n]^d$ we define $x_i(\mathbf{y})$ to be the value of coordinate $x_i$ in $\mathbf{y}$. Let $\mathrm{proj}_i(B)$ denote the $[n]^{d-1}$ $(d-1)$-dimensional $\{0,1\}$-array $C$ obtained from $B$ by projecting in the direction $e_i$ (where $e_i$ is the $d$-dimensional $\{0,1\}$-vector with a single 1 in coordinate $x_i$). For each $\mathbf{y} \in [n]^d$ we form $\mathbf{y}_{\bar{\imath}}$ in $[n]^{d-1}$ by deleting the $i$th coordinate of $\mathbf{y}$, and we place a 1 in position $\mathbf{y}_{\bar{\imath}}$ of $C$ if and only if there is at least one 1 in a position $\mathbf{z} \in [n]^d$ of $B$ with $\mathbf{z}_{\bar{\imath}} = \mathbf{y}_{\bar{\imath}}$.

We are able to use the proof technique of Theorem 6.1.5 to obtain the following result about arrangements of 1's.

Lemma 6.2.7 Let $C$ be an $(n/3)\times(n/3)\times(n/3)$ 3-dimensional $\{0,1\}$-array with more than $6(n/3)^2$ 1's. Then there are twenty-seven 1's as follows. There are three values $a, b, c$ for the $x_1$ coordinate and each plane $x_1 = a$, $x_1 = b$ and $x_1 = c$ of $C$ contains nine points. The nine points in each plane form a special 3,3 split as in Lemma 6.1.7 with the central 1's having the same $x_2, x_3$ coordinates in each of the three planes.

Proof: Form a matrix $C'$ from $C$ by deleting from $C$ in turn the top and bottom 1 in each line in the direction $x_3$, the top and bottom 1 in each line in the direction $x_2$ and the top two 1's in each line in the direction $x_1$. We obtain $\sigma_1(C) \le \sigma_1(C') + 6(n/3)^2$. If $C'$ has a 1 in position $\mathbf{y}_1$, we can find twenty-seven 1's yielding a special 3,3,3 split as follows.
There are two 1's of $C$ in positions $\mathbf{y}_2, \mathbf{y}_3$ with $x_1(\mathbf{y}_1) < x_1(\mathbf{y}_2) < x_1(\mathbf{y}_3)$, $x_2(\mathbf{y}_1) = x_2(\mathbf{y}_2) = x_2(\mathbf{y}_3)$ and $x_3(\mathbf{y}_1) = x_3(\mathbf{y}_2) = x_3(\mathbf{y}_3)$. Then there are 1's of $C$ in positions $\mathbf{x}_j, \mathbf{z}_j$ for $j = 1, 2, 3$, where $x_2(\mathbf{x}_j) < x_2(\mathbf{y}_j) < x_2(\mathbf{z}_j)$ and $x_1(\mathbf{x}_j) = x_1(\mathbf{y}_j) = x_1(\mathbf{z}_j)$, $x_3(\mathbf{x}_j) = x_3(\mathbf{y}_j) = x_3(\mathbf{z}_j)$. Now we obtain positions $\mathbf{v}'_j, \mathbf{v}''_j$ for $j = 1, 2, 3$ and $\mathbf{v} = \mathbf{x}, \mathbf{y}, \mathbf{z}$ with $x_3(\mathbf{v}'_j) < x_3(\mathbf{v}_j) < x_3(\mathbf{v}''_j)$ and $x_1(\mathbf{v}'_j) = x_1(\mathbf{v}_j) = x_1(\mathbf{v}''_j)$, $x_2(\mathbf{v}'_j) = x_2(\mathbf{v}_j) = x_2(\mathbf{v}''_j)$. In particular there are three planes $x_1 = a$, $x_1 = b$, $x_1 = c$ each with nine 1's and each plane has a special 3,3 split as in Lemma 6.1.7 with the central 1's of each plane (namely $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3$) having the same $x_2, x_3$ coordinates and the horizontal direction corresponding to the $x_2$ direction.

Chapter 7 Products

In this chapter we will study another extremal problem related to Conjecture 1.4.1. Instead of considering how many columns out of all possible columns we can have avoiding a configuration $F$, we now turn our attention to the problem of finding how many columns we can have out of only a particular, restricted set of columns, while avoiding $F$. For this we will make heavy use of splits and patterns of Chapter 6. At the end of this section we conclude with forbidden configuration bounds for certain families arising as products.

7.1 Introduction

The following are the maximal 2-rowed simple submatrices of the matrices $I$, $T$, $I^c$. Let
$$E_1 = \begin{bmatrix} 0&1&0\\ 0&0&1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 0&1&1\\ 0&0&1 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1&0&1\\ 0&1&1 \end{bmatrix}.$$
Note that $E_2 = T_2$. Let $T'_m = T_m - \mathbf{0}_m$. That is, $T'_m$ is the tower matrix $T_m$ except for the column of 0's.

We now give an analogous definition to forb.

Definition 7.1.1 Let $\mathcal{F}$ be a family of configurations, and let $P$ be an $m$-rowed $\{0,1\}$-matrix. We define
$$\mathrm{MaxChoiceCols}(\mathcal{F}, P) = \max\{\|A\| : A \prec P \text{ and } A \in \mathrm{Avoid}(m,\mathcal{F})\}.$$
Notice that according to this definition, $\mathrm{forb}(m,F) = \mathrm{MaxChoiceCols}(F, K_m)$.

Theorem 7.1.2 $\mathrm{MaxChoiceCols}(E_1 \times E_1, I_{m/2} \times I_{m/2})$ is $\Theta(m^{3/2})$.

Theorem 7.1.3 $\mathrm{MaxChoiceCols}(E_1 \times E_2, I_{m/2} \times T'_{m/2}) \le 2m$.
Theorem 7.1.4 $\mathrm{MaxChoiceCols}(E_2 \times E_2, T'_{m/2} \times T'_{m/2}) \le 2m$.

The bound of Theorem 7.1.2 is perhaps unexpected in view of Conjecture 1.4.1 but it is not a counterexample. The remaining three cases ($E_1 \times E_3$ in $I_{m/2} \times I^c_{m/2}$, $E_2 \times E_3$ in $T'_{m/2} \times I^c_{m/2}$ and $E_3 \times E_3$ in $I^c_{m/2} \times I^c_{m/2}$) essentially follow by taking appropriate $\{0,1\}$-complements. The proof of Theorem 7.1.4 is in Section 7.2, the proof of Theorem 7.1.2 is in Section 7.3 and the proof of Theorem 7.1.3 is in Section 7.4. Related results such as

Theorem 7.1.5 $\mathrm{MaxChoiceCols}(E_1 \times E_2 \times E_3, I_{m/3} \times T'_{m/3} \times I^c_{m/3})$ is $\Theta(m^2)$.

are proved in Section 7.5. A central idea to studying these products is to encode columns of a $p$-fold product $A_1 \times A_2 \times \cdots \times A_p$ as entries in a $p$-dimensional $\{0,1\}$-array $B$ whose $i$th coordinate is indexed by the columns of $A_i$. Then the problem of finding $\mathrm{MaxChoiceCols}(F, A_1 \times A_2 \times \cdots \times A_p)$ can be transformed to the problem of finding the maximum number of 1's in a $\{0,1\}$-array that avoids a certain pattern of 1's. In Chapter 6 we gave results about patterns that we will use in this chapter.

The following basic result is proven in Section 7.2.

Proposition 7.1.6 Let $p, q, r, u, v, w$ be given positive integers. Define $x^+ = \max\{0, x\}$. The configuration given by the product
$$F(u,v,w) = \underbrace{E_1 \times \cdots \times E_1}_{u} \times \underbrace{E_2 \times \cdots \times E_2}_{v} \times \underbrace{E_3 \times \cdots \times E_3}_{w}$$
is contained in the product
$$P_m(p,q,r) = \underbrace{I_m \times \cdots \times I_m}_{p} \times \underbrace{T'_m \times \cdots \times T'_m}_{q} \times \underbrace{I^c_m \times \cdots \times I^c_m}_{r}$$
for some $m$ if and only if
$$2\left((u-p)^+ + (v-q)^+ + (w-r)^+\right) \le (p-u)^+ + (q-v)^+ + (r-w)^+.$$

For example with $u = 2$, $q = 3$ and the rest being 0, Proposition 7.1.6 yields that $E_1 \times E_1 \nprec T' \times T' \times T'$ and hence $\mathrm{forb}(m, E_1 \times E_1)$ is $\Omega(m^3)$.

Proof: We note that any row from $E_i$ contains $[0\ 1]$ and also note that $[0\ 1] \times [0\ 1] = K_2$. Since none of our product terms $I$, $T'$, $I^c$ contain $K_2$, two rows of $F$ chosen from two different terms among the $u+v+w$ 2-rowed product terms of $F$ will necessarily contain $K_2$.
This implies that if $F$ is contained in the $(p+q+r)$-fold product
$$\underbrace{I \times \cdots \times I}_{p} \times \underbrace{T' \times \cdots \times T'}_{q} \times \underbrace{I^c \times \cdots \times I^c}_{r},$$
then each product term $I$, $T'$, $I^c$ has at most 2 rows of $F$ and if it has two rows then they come from the same 2-rowed product term $E_i$ of $F$. Of the three matrices $I$, $T'$, $I^c$, we note that we can find $E_1$ only in $I$, $E_2$ only in $T'$ and $E_3$ only in $I^c$.

We now consider forbidden families of configurations. It was noted in [AF86] that $\mathrm{forb}(m, \{E_1, E_2, E_3\}) = 2$. Balogh and Bollobás [BB05] have the much more general result that for a given $k$, there is a constant $c_k$ such that $\mathrm{forb}(m, \{I_k, T'_k, I^c_k\}) = c_k$. Let $\{E_1, E_2, E_3\} \times \{E_1, E_2, E_3\}$ denote the 6 possible 2-fold products whose terms are chosen from $\{E_1, E_2, E_3\}$. We would like to compute $\mathrm{forb}(m, \{E_1, E_2, E_3\} \times \{E_1, E_2, E_3\})$ but in the interest of a more tractable proof we consider $I_2$ as a replacement for both $E_1$ and $E_3$ ($I^c_2 = I_2$) and $T'_2$ as a replacement for $E_2$. It is likely (but unknown) that $\mathrm{forb}(m, \{E_1, E_2, E_3\} \times \{E_1, E_2, E_3\})$ is $O(m^{3/2})$.

One might ask about the relationship of Theorem 7.1.7 to Conjecture 1.4.1. The Conjecture (which applies only for a single forbidden configuration) says that only product constructions are needed for best possible asymptotic bounds, but in this case the configurations $\{I_2 \times I_2, I_2 \times T'_2, T'_2 \times T'_2\}$ are simultaneously missing from all 1-fold products but not simultaneously missing from any 2-fold product of $I$, $I^c$ and $T$. In particular $I \times I$ avoids $I_2 \times T'_2$ and $T'_2 \times T'_2$ but does not avoid $I_2 \times I_2$ (Proposition 7.1.6). Surprisingly there is an $O(m^{3/2})$ construction contained in $I \times I$ and yet avoiding $I_2 \times I_2$ (Theorem 7.1.2) and of course also avoiding $I_2 \times T'_2$ and $T'_2 \times T'_2$. The other 2-fold products $I_2 \times T'_2$ (Theorem 7.1.3) and $T'_2 \times T'_2$ (Theorem 7.1.4) behave as the conjecture might suggest. We note $\mathrm{forb}(m, \{I_2, T'_2\}) = 2$ and the bounds of Theorem 7.1.2, Theorem 7.1.3 and Theorem 7.1.4 apply.
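The remark in the proof of Proposition 7.1.6 that $E_1$ is found only in $I$, $E_2$ only in $T'$ and $E_3$ only in $I^c$ can be confirmed by brute force for a small $m$. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, matrices are stored as lists of rows, $m = 4$ is an arbitrary small choice, and $T'_m$ is taken to be the matrix whose $i$th column has 1's exactly in rows $1, \ldots, i$.

```python
from itertools import permutations

def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def tower_prime(m):
    # Column j (0-indexed) has 1's in rows 0..j; the all-zero column is omitted.
    return [[1 if i <= j else 0 for j in range(m)] for i in range(m)]

def complement(M):
    return [[1 - x for x in row] for row in M]

def contains(A, F):
    """True if some submatrix of A is a row and column permutation of F."""
    needed = list(zip(*F))                      # columns of F, with multiplicity
    for rows in permutations(range(len(A)), len(F)):
        # Columns of A restricted to the chosen rows, in the chosen row order.
        pool = [tuple(A[r][c] for r in rows) for c in range(len(A[0]))]
        found_all = True
        for col in needed:
            if col in pool:
                pool.remove(col)                # each column of A is used at most once
            else:
                found_all = False
                break
        if found_all:
            return True
    return False

E = {
    "E1": [[0, 1, 0], [0, 0, 1]],
    "E2": [[0, 1, 1], [0, 0, 1]],
    "E3": [[1, 0, 1], [0, 1, 1]],
}
m = 4
hosts = {"I": identity(m), "T'": tower_prime(m), "Ic": complement(identity(m))}
found = {name: [h for h, M in hosts.items() if contains(M, F)] for name, F in E.items()}
print(found)
```

Trying ordered row choices (permutations rather than combinations) is what accounts for row permutations of $F$; the column permutation is handled by treating the columns as a multiset.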
In Section 7.6 we prove a rather surprising result in which the exponent turns out to be fractional, when in most other cases the exponents were always integral:

Theorem 7.1.7 $\mathrm{forb}(m, \{I_2 \times I_2, I_2 \times T'_2, T'_2 \times T'_2\})$ is $\Theta(m^{3/2})$.

Here
$$I_2 \times I_2 = \begin{bmatrix} 1&1&0&0\\ 0&0&1&1\\ 1&0&1&0\\ 0&1&0&1 \end{bmatrix},\quad
I_2 \times T'_2 = \begin{bmatrix} 1&1&0&0\\ 0&0&1&1\\ 1&1&1&1\\ 0&1&0&1 \end{bmatrix},\quad
T'_2 \times T'_2 = \begin{bmatrix} 1&1&1&1\\ 0&0&1&1\\ 1&1&1&1\\ 0&1&0&1 \end{bmatrix}. \tag{7.1.1}$$
We make use of our Standard Induction (Section 2.1) in a way similar to the way it was used in Section 5.3.3.

7.2 Submatrices of T × T

In this section we'll show how to exploit the results about splits in the context of forbidden configurations.

Proof of Theorem 7.1.4 that $\mathrm{MaxChoiceCols}(E_2 \times E_2, T'_{m/2} \times T'_{m/2}) \le 2m$: Let $F = E_2 \times E_2$. Recall the $i$th column of $T'_k$ is the column with 1's in rows $1, 2, \ldots, i$ and 0's in the remaining rows. Let $A$ be an $m$-rowed submatrix of $T'_{m/2} \times T'_{m/2}$. We can create an $m/2 \times m/2$ $\{0,1\}$-matrix $B$ from $A$ by placing a 1 in position $(r,c)$ if $A$ has the column obtained from the $r$th column of $T'_{m/2}$ placed on top of the $c$th column of $T'_{m/2}$, namely the column with 1's only in rows $1, 2, \ldots, r$ and $m/2+1, m/2+2, \ldots, m/2+c$. We note that $\|A\|$ is $\sigma_1(B)$.

We claim that $F \prec A$ if and only if $B$ has a 3,3 split. The only way for a submatrix of $T'_{m/2} \times T'_{m/2}$ to be a row and column permutation of $F$ is to lie in rows $r_1, r_2, m/2+c_1, m/2+c_2$ for some choices $2 \le r_1 < r_2 \le m/2$ and $2 \le c_1 < c_2 \le m/2$ (using the argument of Proposition 7.1.6 and noting that the first row of $T'_{m/2}$ is all 1's). We have that any two rows of $T'_{m/2}$ (not including the first) have a copy of $E_2$. We note that the $t$th column of $T'_{m/2}$ restricted to rows $r_1, r_2$ (with $r_1 < r_2$) is
$$\begin{bmatrix} 0\\ 0 \end{bmatrix} \text{ for } 1 \le t < r_1, \qquad \begin{bmatrix} 1\\ 0 \end{bmatrix} \text{ for } r_1 \le t < r_2, \qquad \begin{bmatrix} 1\\ 1 \end{bmatrix} \text{ for } r_2 \le t.$$
Assume $A$ has a copy of $F$ in the 4 rows $r_1, r_2, m/2+c_1, m/2+c_2$.
We discover that the nine columns of $F$ would correspond to nine 1's, one 1 in each of the nine blocks in the 3,3 split of $B$ given by $I(1) = \{r_1 - 1, r_2 - 1\}$ and $I(2) = \{c_1 - 1, c_2 - 1\}$ (notation from Chapter 6). Similarly a 3,3 split of $B$ yields a copy of $F$ in $A$. We now appeal to the bound in Theorem 6.1.5.

An immediate generalization is the following.

Lemma 7.2.1 We have that
$$\mathrm{MaxChoiceCols}(\underbrace{E_2 \times E_2 \times \cdots \times E_2}_{d}, \underbrace{T'_{m/d} \times T'_{m/d} \times \cdots \times T'_{m/d}}_{d})$$
is $\Theta(m^{d-1})$.

Proof: We use the $d$-dimensional generalization of splits, Theorem 6.2.4, where the $d$-fold product $E_2 \times E_2 \times \cdots \times E_2$ will correspond to a $\underbrace{3,3,\ldots,3}_{d}$ split. We have an exact bound from Theorem 6.2.5 if needed.

A further generalization considers the matrix $T_k$.

Lemma 7.2.2 We have that
$$\mathrm{MaxChoiceCols}(\underbrace{T_k \times T_k \times \cdots \times T_k}_{d}, \underbrace{T'_{m/d} \times T'_{m/d} \times \cdots \times T'_{m/d}}_{d})$$
is equal to $\mathrm{NoSplit}(m/d, m/d, \ldots, m/d; k+1, k+1, \ldots, k+1)$, and so is $\Theta(m^{d-1})$.

Proof: We again use the $d$-dimensional generalization of splits, Theorem 6.2.4, where the $d$-fold product $T_k \times T_k \times \cdots \times T_k$ will correspond to a $\underbrace{k+1, k+1, \ldots, k+1}_{d}$ split.

A rather interesting version of Theorem 7.1.4 and Lemma 7.2.1 that uses the idea of 'peeling' from Theorem 6.1.5 is the following.

Lemma 7.2.3 $\mathrm{MaxChoiceCols}(E_2 \times E_2, \underbrace{T'_{m/p} \times T'_{m/p} \times \cdots \times T'_{m/p}}_{p}) \le 2 \cdot 4^{p-2} m$.

Proof: Let $F = E_2 \times E_2$. We consider $A$ as an $(m/p) \times \cdots \times (m/p)$ $p$-dimensional $\{0,1\}$-array $B$ as follows. Let $x_1, x_2, \ldots, x_p$ be the $p$ coordinate directions in $B$. The entries in coordinate direction $x_i$ are indexed by the columns of $T'_{m/p}$ in the given order. We note that $\|A\| = \sigma_1(B)$.

We first handle the case $p = 3$. By Theorem 7.1.4, we have that for $i = 1, 2, 3$, $\sigma_1(\mathrm{proj}_i(B)) \le 2m$. In fact if $\sigma_1(\mathrm{proj}_i(B)) > 2m$ then we have a 3,3 split in $\mathrm{proj}_i(B)$ and that yields $F$ in $A$, where no rows of $F$ come from the $i$th term $T'_{m/3}$ of the product, 2 rows of $F$ come from another term $T'_{m/3}$ and the other 2 rows of $F$ come from the remaining term $T'_{m/3}$.
Now proceed to form a matrix $B'$ from $B$ by deleting from $B$ in turn the top 1 in each line in the direction $x_3$, then the bottom 1 in each line in the direction $x_2$ and finally the top two entries in each line in the direction $x_1$. We have
$$\sigma_1(B) \le \sigma_1(B') + \sigma_1(\mathrm{proj}_3(B)) + \sigma_1(\mathrm{proj}_2(B)) + 2\sigma_1(\mathrm{proj}_1(B)) \le \sigma_1(B') + 8m.$$
Let $\mathbf{y}_1$ be a 1 of $B'$. Then, by our construction, there are 2 further 1's of $B$ in positions $\mathbf{y}_2, \mathbf{y}_3$ with $x_1(\mathbf{y}_1) < x_1(\mathbf{y}_2) < x_1(\mathbf{y}_3)$, $x_2(\mathbf{y}_1) = x_2(\mathbf{y}_2) = x_2(\mathbf{y}_3)$ and $x_3(\mathbf{y}_1) = x_3(\mathbf{y}_2) = x_3(\mathbf{y}_3)$. For each $\mathbf{y}_j$ we will have two 1's in positions $\mathbf{y}'_j, \mathbf{y}''_j$ of $B$ where $\mathbf{y}'_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_2$ where $x_2(\mathbf{y}'_j) < x_2(\mathbf{y}_j)$ and $\mathbf{y}''_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_3$ where $x_3(\mathbf{y}_j) < x_3(\mathbf{y}''_j)$. Then these nine 1's in $B$ correspond to a copy of $F$ in $A$ as follows. We choose two coordinates $a, b$ from $x_1$ so that when we consider the columns of $A$ corresponding to $\mathbf{y}_1$ (and $\mathbf{y}'_1, \mathbf{y}''_1$ respectively), $\mathbf{y}_2$ (and $\mathbf{y}'_2, \mathbf{y}''_2$ resp.), $\mathbf{y}_3$ (and $\mathbf{y}'_3, \mathbf{y}''_3$ resp.) we have
$$\begin{array}{c|ccc}
 & \mathbf{y}_1 & \mathbf{y}_2 & \mathbf{y}_3\\ \hline
a & 0 & 1 & 1\\
b & 0 & 0 & 1
\end{array}$$
For the next part note that column $t$ of $T'_{m/3}$ has a 0 in row $r$ if and only if $t < r$. Noting that $x_2(\mathbf{y}'_j) < x_2(\mathbf{y}_j) = x_2(\mathbf{y}''_j)$ and that $x_3(\mathbf{y}'_j) = x_3(\mathbf{y}_j) < x_3(\mathbf{y}''_j)$, we choose a value $c$ for $x_2$ and a value $d$ for $x_3$ so that in $A$, the columns corresponding to the 1's $\mathbf{y}_j, \mathbf{y}'_j, \mathbf{y}''_j$ have
$$\begin{array}{c|ccc}
 & \mathbf{y}_j & \mathbf{y}'_j & \mathbf{y}''_j\\ \hline
m/3 + c & 1 & 0 & 1\\
2(m/3) + d & 0 & 0 & 1
\end{array}$$
This yields a copy of $F$ in $A$ in rows $a, b, m/3 + c, 2(m/3) + d$, a contradiction. We deduce that $\sigma_1(B') = 0$ and hence $\sigma_1(B) \le 8m$, concluding the proof for $p = 3$.

For $p \ge 4$, we proceed in a similar fashion. By induction on $p$, $\sigma_1(\mathrm{proj}_i(B))$ is at most $2 \cdot 4^{p-3} m$. We form a matrix $B'$ from $B$ by deleting from $B$ in turn the top 1 in each line in the direction $x_4$, then the bottom 1 in each line in the direction $x_3$, then the top 1 in each line in the direction $x_2$ and finally the bottom 1 in each line in the direction $x_1$. We have
$$\sigma_1(B) \le \sigma_1(B') + \sum_{i=1}^{4} \sigma_1(\mathrm{proj}_i(B)) \le \sigma_1(B') + 4 \cdot 2 \cdot 4^{p-3} m.$$
Let $\mathbf{y}_1$ be a 1 of $B'$. Then, by our construction, there are 2 further 1's of $B$ in positions $\mathbf{y}_2, \mathbf{y}_3$ with $x_1(\mathbf{y}_2) < x_1(\mathbf{y}_1)$ and $x_i(\mathbf{y}_2) = x_i(\mathbf{y}_1)$ for $i \neq 1$, and $x_2(\mathbf{y}_1) < x_2(\mathbf{y}_3)$ and $x_i(\mathbf{y}_1) = x_i(\mathbf{y}_3)$ for $i \neq 2$. For each $\mathbf{y}_j$ we will have two 1's in positions $\mathbf{y}'_j, \mathbf{y}''_j$ of $B$ where $\mathbf{y}'_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_3$ where $x_3(\mathbf{y}'_j) < x_3(\mathbf{y}_j)$ and $\mathbf{y}''_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_4$ where $x_4(\mathbf{y}_j) < x_4(\mathbf{y}''_j)$. Then these nine 1's in $B$ correspond to a copy of $F$. In particular we can choose coordinate values $a, b$ so that in $A$ the columns contain
$$\begin{array}{c|ccc}
 & \mathbf{y}_2 & \mathbf{y}_1 & \mathbf{y}_3\\ \hline
a & 0 & 1 & 1\\
b + (m/4) & 0 & 0 & 1
\end{array}$$
As above we can choose values $c, d$ and obtain the copy of $F$ from the nine columns of $A$ given by the nine 1's of $B$, a contradiction. We deduce that $\sigma_1(B') = 0$ and hence $\sigma_1(B) \le 2 \cdot 4^{p-2} m$.

Some growth in the bound with respect to $p$ is to be expected since $\mathrm{forb}(m, E_2 \times E_2)$ is $\Theta(m^3)$. Note that the proof technique suggests generalizations of Theorem 6.1.4 and Theorem 6.2.4 where we forbid a larger class of arrangements of 1's in an $(m/d) \times (m/d) \times \cdots \times (m/d)$ $d$-dimensional $\{0,1\}$-array and are able to conclude that the array has only a linear number of 1's. Our geometric argument above has no obvious generalization which would allow us to generalize Lemma 7.2.2. A first step would be a simple geometric proof of the linear bound for $\mathrm{NoSplit}(m/2, m/2; 4, 4)$ but Lemma 6.1.8 suggests other difficulties.

7.3 Submatrices of I × I

Proof of Theorem 7.1.2 that $\mathrm{MaxChoiceCols}(E_1 \times E_1, I_{m/2} \times I_{m/2})$ is $\Theta(m^{3/2})$: Let $F = E_1 \times E_1$. Let $A$ be a submatrix of $I_{m/2} \times I_{m/2}$ that has no configuration $E_1 \times E_1$. We consider $A$ as an $(m/2) \times (m/2)$ $\{0,1\}$-matrix $B$ whose rows are indexed by the columns of $I_{m/2}$ and whose columns are indexed by the columns of $I_{m/2}$. Then $\|A\| = \sigma_1(B)$.
Now 4 rows of $A$ contain $F$ if and only if 2 rows of $A$ chosen from the first $m/2$ contain the first two rows of $F$ (and so correspond to one copy of $E_1$) and 2 rows of $A$ chosen from the second $m/2$ correspond to the second two rows of $F$ (and the other copy of $E_1$). Now the nine columns of $A$ containing $F$ correspond to $B$ having nine 1's as follows: a $2 \times 2$ submatrix of 1's, at least one more 1 in each row of the $2 \times 2$ submatrix, at least one more 1 in each column of the submatrix and at least one more 1 in neither of the two chosen rows nor the two chosen columns. To see this, consider the 2 chosen rows of $A$ from the first $m/2$ to be $i, j$ and the two other chosen rows to be $k + m/2, \ell + m/2$. Now $I_{m/2}$ restricted to rows $i, j$ has $\begin{bmatrix} 1\\ 0 \end{bmatrix}$ in column $i$, has $\begin{bmatrix} 0\\ 1 \end{bmatrix}$ in column $j$ and has $\begin{bmatrix} 0\\ 0 \end{bmatrix}$ in all columns not equal to $i, j$. Thus we get a copy of $I_2 \times I_2$ in rows $i, j, k + m/2, \ell + m/2$ of $A$ from four 1's in $B$ in rows $i, j$ and columns $k, \ell$. Similar observations yield the other 5 columns of $F$ from the five 1's of $B$ as described.

We initially process $B$ by deleting any row or column with at most two 1's (and hence up to $2m$ 1's), repeating the deletion process if necessary so that the resulting matrix $B'$ has row and column sums at least 3. We note that $\sigma_1(B) \le \sigma_1(B') + 2m$. We now appeal to Kővári, Sós and Turán [KST54] for a solution of Zarankiewicz' problem and deduce that if the number of 1's in $B'$ is $\Omega(m^{3/2})$, then $B'$ has a $2 \times 2$ block of 1's and then $B'$ has the configuration of nine 1's yielding $E_1 \times E_1$ in $A$. Moreover from [KST54] we can point out that a construction using projective planes establishes that $\mathrm{MaxChoiceCols}(E_1 \times E_1, I_{m/2} \times I_{m/2})$ is $\Omega(m^{3/2})$.

Problem 7.3.1 Determine $\mathrm{MaxChoiceCols}(E_1 \times E_1, I_{m/3} \times I_{m/3} \times I_{m/3})$.

We note that $\mathrm{MaxChoiceCols}(E_1 \times E_1, I_{m/3} \times I_{m/3} \times I_{m/3})$ is $\Omega(m^{3/2})$ by Theorem 7.1.2 and is also $O(m^{7/4})$.

Problem 7.3.2 Determine $\mathrm{MaxChoiceCols}(E_1 \times E_1 \times E_1, I_{m/3} \times I_{m/3} \times I_{m/3})$.
The core of this problem would be determining the maximum number of 1's in a 3-dimensional $(m/3) \times (m/3) \times (m/3)$ $\{0,1\}$-array which has no $2 \times 2 \times 2$ subarray of eight 1's. Erdős [Erd64] has obtained a bound $O(m^{11/4})$ for this but no matching construction. Note the sharp contrast with results such as Theorem 7.1.4, Lemma 7.2.3, Lemma 7.2.1.

7.4 Submatrices of I × T

Proof of Theorem 7.1.3 that $\mathrm{MaxChoiceCols}(E_1 \times E_2, I_{m/2} \times T'_{m/2}) \le 2m$: Let $F = E_1 \times E_2$. Let $A$ be an $m$-rowed submatrix of $I_{m/2} \times T'_{m/2}$ with no $F$. We consider $A$ as an $(m/2) \times (m/2)$ $\{0,1\}$-matrix $B$ whose rows are indexed by the columns of $I_{m/2}$ and whose columns are indexed by the columns of $T'_{m/2}$ in the usual order. We note that $\|A\|$ is $\sigma_1(B)$. Now $A$ contains $F$ if and only if there are three rows and in each row there are three 1's such that we can divide the nine 1's as a sort of "1,3 split", where the three leftmost 1's in each row are entirely to the left of the middle three 1's in each row which are entirely to the left of the rightmost three 1's in each row. By Lemma 6.1.7 we have the desired result. The construction $A = I_{m/2} \times \mathbf{1}_{m/2}$ avoids $F$ and is $\Omega(m)$.

Lemma 7.4.1 $\mathrm{MaxChoiceCols}(E_1 \times E_2, I_{m/3} \times T'_{m/3} \times T'_{m/3})$ is $\Theta(m^2)$.

Proof: Let $F = E_1 \times E_2$. Let $A$ be an $m$-rowed submatrix of $I_{m/3} \times T'_{m/3} \times T'_{m/3}$ with no $F$. We consider $A$ as an $(m/3) \times (m/3) \times (m/3)$ 3-dimensional $\{0,1\}$-array $B$ with coordinate $x_1$ indexed by the columns of $I_{m/3}$ and with coordinates $x_2, x_3$ indexed by the columns of $T'_{m/3}$ in the usual order. We note that $\|A\|$ is $\sigma_1(B)$. By Theorem 7.1.3 we have $|\mathrm{proj}_2(B)|$ and $|\mathrm{proj}_3(B)|$ being $O(m)$, and of course $|\mathrm{proj}_1(B)|$ is $O(m^2)$. Form a matrix $B'$ from $B$ by deleting from $B$ in turn the top 1 in each line in the direction $x_3$, the bottom 1 in each line in the direction $x_2$ and the top two 1's in each line in the direction $x_1$. The number of deleted 1's is $O(m^2)$. Now assume $B'$ has a 1 in position $\mathbf{y}_1$.
Then, by our construction, there are 2 further 1's of $B$ in positions $\mathbf{y}_2, \mathbf{y}_3$ with $x_1(\mathbf{y}_1) < x_1(\mathbf{y}_2) < x_1(\mathbf{y}_3)$, $x_2(\mathbf{y}_1) = x_2(\mathbf{y}_2) = x_2(\mathbf{y}_3)$ and $x_3(\mathbf{y}_1) = x_3(\mathbf{y}_2) = x_3(\mathbf{y}_3)$. For each $\mathbf{y}_j$ we will have two 1's in positions $\mathbf{y}'_j, \mathbf{y}''_j$ of $B$ where $\mathbf{y}'_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_2$ where $x_2(\mathbf{y}'_j) < x_2(\mathbf{y}_j)$ and $\mathbf{y}''_j$ agrees with $\mathbf{y}_j$ except in coordinate $x_3$ where $x_3(\mathbf{y}_j) < x_3(\mathbf{y}''_j)$. Then these nine 1's in $B$ correspond to a copy of $F = E_1 \times E_2$. We choose $r_1, r_2$ so that in the columns of $I_{m/3}$ indexed by $x_1(\mathbf{y}_1), x_1(\mathbf{y}_2), x_1(\mathbf{y}_3)$ we find $E_1$:
$$\begin{array}{c} r_1\\ r_2 \end{array} \begin{bmatrix} 0&1&0\\ 0&0&1 \end{bmatrix}.$$
We choose two additional rows $b, c$ following the discussion for $p = 3$ in Lemma 7.2.3 when looking for $E_2$. Thus $\sigma_1(B') = 0$ and then $\sigma_1(B)$ is $O(m^2)$.

Lemma 7.4.2 $\mathrm{MaxChoiceCols}(E_1 \times E_2 \times E_2, I_{m/3} \times T'_{m/3} \times T'_{m/3})$ is $\Theta(m^2)$.

Proof: Let $A$ be an $m$-rowed submatrix of $I_{m/3} \times T'_{m/3} \times T'_{m/3}$ with no $E_1 \times E_2 \times E_2$. As above, we translate $A$ into the 3-dimensional array $B$ with $\|A\| = \sigma_1(B)$. Now by Lemma 6.2.7, if $\sigma_1(B) > 6(m/3)^2$ there will be twenty-seven 1's in $B$ as described and this will yield a copy of $E_1 \times E_2 \times E_2$ in $A$. Thus $\sigma_1(B) \le 6(m/3)^2$.

Using an analogous argument one obtains

Lemma 7.4.3 $\mathrm{MaxChoiceCols}(E_1 \times \underbrace{E_2 \times \cdots \times E_2}_{p-1}, I_{m/p} \times \underbrace{T'_{m/p} \times \cdots \times T'_{m/p}}_{p-1})$ is $\Theta(m^{p-1})$.

We note some difficulty may arise when considering $T_k$ instead of $E_2 = T_2$.

Theorem 7.4.4 $\mathrm{MaxChoiceCols}(T_3 \times I_2, T \times I)$ is $\Omega(m\alpha(m))$, where $\alpha(m)$ denotes the inverse Ackermann function.

Proof: Form our usual array $B$ and observe that $B$ can't have the structure described in Lemma 6.1.8.

Initially we had thought $\mathrm{MaxChoiceCols}(T_3 \times I_2, T \times I)$ would be linear, but it isn't.

7.5 Submatrices of I × T × Ic

Proof of Theorem 7.1.5 that $\mathrm{MaxChoiceCols}(E_1 \times E_2 \times E_3, I_{m/3} \times T'_{m/3} \times I^c_{m/3})$ is $\Theta(m^2)$: Let $F = E_1 \times E_2 \times E_3$. Let $A$ be an $m$-rowed submatrix of $I_{m/3} \times T'_{m/3} \times I^c_{m/3}$ with no $F$. We consider $A$ as an $(m/3) \times (m/3) \times (m/3)$ 3-dimensional $\{0,1\}$-array $B$ as follows. Let $x_1, x_2, x_3$ be the three coordinate directions in $B$.
The entries in coordinate direction $x_1$ are indexed by the columns of $I_{m/3}$, the entries in coordinate direction $x_2$ are indexed by the columns of $T'_{m/3}$ in that order and the entries in coordinate direction $x_3$ are indexed by the columns of $I^c_{m/3}$. As before $\|A\|$ is $\sigma_1(B)$. By Lemma 6.2.7, we know that if $\sigma_1(B) > 6(m/3)^2$, then there is the configuration of twenty-seven points as described. Then these twenty-seven 1's in $B$ correspond to a special 3,3,3 split that yields a copy of $F$ in $A$ following our usual arguments. This contradiction implies that $\sigma_1(B) \le 6(m/3)^2$.

7.6 Fractional Exponent Bound for a Family of Configurations

Proof of Theorem 7.1.7 that $\mathrm{forb}(m, \{I_2 \times I_2, I_2 \times T'_2, T'_2 \times T'_2\})$ is $\Theta(m^{3/2})$: By Theorem 7.1.2 there exists a matrix $A \prec I_{m/2} \times I_{m/2}$ with $\|A\|$ being $\Theta(m^{3/2})$ and no $I_2 \times I_2$. Because of Proposition 7.1.6, neither $I_2 \times T'_2 \prec A$ nor $T'_2 \times T'_2 \prec A$. This gives the lower bound.

For the upper bound, let $A \in \mathrm{Avoid}(m, \{I_2 \times I_2, I_2 \times T'_2, T'_2 \times T'_2\})$. We begin using our usual standard decomposition (2.1.1) on row $r$:
$$A = \begin{array}{c} r \rightarrow\\ \ \end{array} \begin{bmatrix} 0\,0 \cdots 0 & 1\,1 \cdots 1\\ B_r \;\; C_r & C_r \;\; D_r \end{bmatrix}$$
We would be done by induction if we could show that $\|C_r\| \le 36 m^{1/2}$ for some $r$, so we may assume $\|C_r\| \ge 36 m^{1/2}$ for all $r$. Our proof will show that we can associate the matrix $C_r$ with a certain set of rows $M(r)$ (to be defined later), where $|M(r)| \ge \|C_r\|/4$. Consider a given choice $r$ and set $M(r)$ of rows with $|M(r)| \ge 9 m^{1/2}$. Then for $t = 9 m^{1/2}$ choices $r_1, r_2, \ldots, r_t \in M(r)$ we will show that
$$|M(r_i) \cap M(r_j)| \le 9 \tag{7.6.1}$$
and so we obtain disjoint sets $M(r_1)$, $M(r_2) \setminus M(r_1)$, $M(r_3) \setminus (M(r_1) \cup M(r_2))$, ..., $M(r_t) \setminus (M(r_1) \cup M(r_2) \cup \cdots \cup M(r_{t-1}))$, which yields that $M(r_1) \cup M(r_2) \cup M(r_3) \cup \cdots$ is of size at least
$$9 m^{1/2} + (9 m^{1/2} - 9) + (9 m^{1/2} - 18) + \cdots > m,$$
a contradiction given that we only have $m$ rows.

We can find the inductive children of $\{I_2 \times I_2, I_2 \times T'_2, T'_2 \times T'_2\}$ easily:
$$F_4 = \begin{bmatrix} 1&1\\ 1&1\\ 0&1 \end{bmatrix}, \qquad F_5 = \begin{bmatrix} 1&1\\ 1&0\\ 0&1 \end{bmatrix}. \tag{7.6.2}$$
This follows since if $C_r$ has $F_4$, then $A$ has $T'_2 \times T'_2$, and if $C_r$ has $F_5$, then $A$ has $T'_2 \times I_2$, both forbidden configurations.
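Because of $F_5$ we know $C_r$ is a laminar matrix (as in Definition 1.3.27). If we consider both forbidden configurations $F_4$ and $F_5$ we deduce that the columns of $C_r$ of sum at least 2, considered as sets, are disjoint.

We need more detailed information and begin by computing what happens on quadruples of rows of $A$ in order to avoid the three $4 \times 4$ configurations. There are 11 cases $Q_0, Q_1, \ldots, Q_{10}$. These cases were computed using the program described in Chapter 3. In each case the columns listed under "no" are the columns that are missing on the quadruple of rows.
$$Q_0 = \begin{array}{cccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1&0&0\\ 1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1 \end{array}, \qquad
Q_1 = \begin{array}{cccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1&0&1\\ 1&0&1&0&1&0\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1 \end{array}$$
$$Q_2 = \begin{array}{cccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1&1&1\\ 1&0&1&0&1&1\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1 \end{array}, \qquad
Q_3 = \begin{array}{cccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&1&0&1&1\\ 1&0&1&1&1&1\\ 0&1&1&0&0&1\\ 0&0&0&1&1&1 \end{array}$$
$$Q_4 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1&1\\ 1&0&1&1&0\\ 0&1&1&0&1\\ 0&0&0&1&1 \end{array}, \qquad
Q_5 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&0&1\\ 1&0&1&0&1\\ 0&1&0&1&1\\ 0&0&1&1&1 \end{array}$$
$$Q_6 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1&0\\ 1&0&1&0&1\\ 0&1&0&1&1\\ 0&0&1&1&1 \end{array}, \qquad
Q_7 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&1&1&1\\ 1&0&1&1&0\\ 0&1&1&0&1\\ 0&0&0&1&1 \end{array}$$
$$Q_8 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&1&1&0\\ 1&0&1&1&1\\ 0&1&1&0&1\\ 0&0&0&1&1 \end{array}, \qquad
Q_9 = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&1&1&0\\ 1&0&1&0&1\\ 0&1&0&1&1\\ 0&0&1&1&1 \end{array}$$
$$Q_{10} = \begin{array}{ccccc} \text{no}&\text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&1&1&1\\ 1&0&1&0&1\\ 0&1&0&1&1\\ 0&0&1&1&1 \end{array}$$

As in the proof given in Lemma 5.3.2, choose $L(r)$ to be a minimal subset of rows of $C_r$ such that $C_r|_{L(r)}$ is a simple matrix. Note that $C_r|_{L(r)}$ also has the property that the columns of sum at least 2 are disjoint as sets. Now consider the family of sets $\mathcal{C}$ corresponding to the columns of $C_r$. We can have the empty set as well as sets of size 1 and some disjoint family of sets of size at least 2. Let $V \in \mathcal{C}$ be a set of size at least 2 and let $U_1, U_2, \ldots, U_t$ be the sets of $\mathcal{C}$ of size 1 contained in $V$. Now, if $t < |V|$ and $|V| \ge 3$, we could delete a row of $C_r|_{L(r)}$ and preserve simplicity, a contradiction. As an example, consider deleting the last row of $G$ below. For $|V| = 2$, we can have $t = 1 < |V|$ and still have no row to delete, as shown for example in $H$ below.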
$$G = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad H = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$$

Our choice of $L(r)$ ensures that no row of $C_r|_{L(r)}$ can be deleted (while preserving simplicity), and hence $C_r|_{L(r)}$ contains the column of 0's ($0_{|L(r)|}$) and at least half the columns of $I_{|L(r)|}$, as well as possibly up to $|L(r)|/2$ disjoint columns of sum 2 or more. Thus we may estimate $|L(r)| \ge \|C_r\|/2$, and the number of sets of size 1 in $\mathcal{C}$ is at least $|L(r)|/2$. We choose $M(r)$ to correspond to the sets of size 1, namely $C_r|_{M(r)}$ contains $I_{|M(r)|}$ as well as $0_{|M(r)|}$. Note that $C_r|_{M(r)}$ need not be a simple matrix and may in fact have up to 2 copies of each column from $I_{|M(r)|}$ (imagine deleting the second row of $H$). We obtain $|M(r)| \ge \|C_r\|/4 \ge 9m^{1/2}$.

In what follows we analyze closely each of the 11 cases above to deduce the structure of $C_r|_{L(r)}$ in order to prove (7.6.1). It is possible to determine What Is Missing in $C_r|_{L(r)}$ on the triple of rows $i, j, k \in L(r)$ by considering What Is Missing on the quadruple of rows $r, i, j, k$, as we did for $F_7$ in Section 5.3. We obtain a contradiction if we find a copy of "$K_2$" in What Is Missing, namely if on the triple $i, j, k$ there is a pair of rows $i, j$ with all 4 columns of $K_2$ appearing as follows:

$$\begin{array}{c} \\ i \\ j \\ k \end{array} \begin{array}{cccc} \text{no}&\text{no}&\text{no}&\text{no}\\ 0&0&1&1\\ 0&1&0&1\\ a&b&c&d \end{array}$$

where $a, b, c, d \in \{0, 1\}$. Perhaps other columns are missing on rows $i, j, k$. Note that we could then delete row $k$ from $C_r|_{L(r)}$ and preserve simplicity, contradicting our choice of $L(r)$. The reason for this is that on the three rows $i, j, k$, the columns present would possibly be

$$\begin{array}{c} i \\ j \\ k \end{array} \begin{bmatrix} 0&0&1&1\\ 0&1&0&1\\ \bar{a}&\bar{b}&\bar{c}&\bar{d} \end{bmatrix}$$

where $\bar{x}$ denotes the $\{0, 1\}$-complement of $x$. We can see that deleting row $k$ will not result in repeated columns, assuming $C_r|_{L(r)}$ has no repeated columns.
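The examples $G$ and $H$ from the discussion of $L(r)$ can be checked mechanically. A minimal sketch follows; the helper name `deletable_rows` is hypothetical and is not taken from the program of Chapter 3. It lists the rows whose deletion keeps a matrix simple, which is exactly the property that a minimal row set $L(r)$ rules out.

```python
def deletable_rows(M):
    """Indices (0-based) of rows whose deletion leaves all columns distinct."""
    n_cols = len(M[0])
    result = []
    for r in range(len(M)):
        rest = [row for i, row in enumerate(M) if i != r]
        cols = {tuple(row[j] for row in rest) for j in range(n_cols)}
        if len(cols) == n_cols:  # no two columns collapsed into one
            result.append(r)
    return result

G = [[0, 1, 0, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]
H = [[0, 1, 1],
     [0, 0, 1]]

print(deletable_rows(G))  # [2]: only the last row can be deleted
print(deletable_rows(H))  # []: no row can be deleted
```

For $G$ the set $V$ has size 3 and only $t = 2$ singletons, so a row is deletable; for $H$ the set $V$ has size 2 and $t = 1$, and no row is deletable.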
We note that $Q_0, Q_1, Q_2$ each have 3 rows, each pair of which has a $K_2$ in What Is Missing (rows 1, 2, 3 for $Q_0$, rows 2, 3, 4 for $Q_1$, rows 2, 3, 4 for $Q_2$), and $Q_5, Q_6$ have two disjoint pairs of rows, each of which has a $K_2$ (rows 1, 2 and rows 3, 4 for $Q_5$, and rows 1, 3 and rows 2, 4 for $Q_6$). So deleting any row of the quadruple leaves a $K_2$, and hence a row that can be deleted from $C_r$. By our choice of $L(r)$ we therefore cannot have the cases represented by $Q_0, Q_1, Q_2, Q_5, Q_6$ in quadruples of rows consisting of $r$ plus a triple of rows from $L(r)$.

For the remaining cases, to avoid leaving a $K_2$ after deleting a row, we find that we must have deleted particular rows from each quadruple. Note that $Q_3$ and $Q_8$ have a $K_2$ on rows 3, 4, and so if present, to avoid leaving a copy of $K_2$, row $r$ must correspond either to row 3 or to row 4. The cases $Q_4, Q_7, Q_9, Q_{10}$ all have a $K_2$ on rows 2, 4 and on rows 3, 4, and so row $r$ must be row 4 of such a quadruple. We use $P_i$ to denote the condition on a triple of rows arising from the quadruple $Q_i$ in these ways.

$$P_3: \begin{array}{cccc} \text{no}&\text{no}&\text{no}&\text{no}\\ 0&1&1&1\\ 1&1&0&1\\ 0&0&1&1 \end{array} \quad \text{or} \quad \begin{array}{cccc} \text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1\\ 1&0&1&1\\ 0&0&1&1 \end{array}$$

$$P_4 \text{ or } P_9: \begin{array}{ccc} \text{no}&\text{no}&\text{no}\\ 1&1&0\\ 1&0&1\\ 0&1&1 \end{array}$$

$$P_8: \begin{array}{cccc} \text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1\\ 1&0&1&1\\ 0&1&1&1 \end{array} \quad \text{or} \quad \begin{array}{cccc} \text{no}&\text{no}&\text{no}&\text{no}\\ 1&1&0&1\\ 1&0&1&1\\ 0&0&1&1 \end{array}$$

$$P_7 \text{ or } P_{10}: \begin{array}{ccc} \text{no}&\text{no}&\text{no}\\ 1&1&1\\ 1&0&1\\ 0&1&1 \end{array}$$

Note that the presence of $I_{|M(r)|}$ means that on each triple of rows in $M(r)$ we have all three columns of sum 1 present, and so we cannot have $P_3$ (arising from $Q_3$ with row $r$ being either the third or fourth row) or the second of the two cases for $P_8$ (arising from $Q_8$ with row $r$ being the third row) on triples of rows in $M(r)$, since each forbids a column of sum 1. This careful detail is used in what follows.

We now consider what is possible in the full width of $A$ on the rows of $M(r)$. By considering the corresponding $Q_i$'s and the row of $Q_i$ that corresponds to $r$, we find that under the 0's in row $r$, any triple of rows of $[B_r C_r]|_{M(r)}$ must have two columns of sum 2 absent, and hence $[B_r C_r]|_{M(r)}$ cannot have the configuration $F_5$.
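The bookkeeping for the eleven cases can be re-checked with a few lines of code. This is a hedged sketch rather than the program of Chapter 3: the dictionary `Q` is a transcription of the "no" columns of $Q_0, \dots, Q_{10}$ (each string is one absent column read top to bottom), and the helper names `k2_pairs` and `triple_condition` are hypothetical. The second helper relies on the fact that a column is absent from $C_r$ on a triple as soon as one of its two extensions to row $r$ is absent from $A$.

```python
from itertools import combinations

# Columns marked "no" in each case, written top to bottom (rows 1..4).
Q = {
    0: ["1100", "1010", "0110", "1001", "0101", "0011"],
    1: ["1100", "1010", "0110", "1001", "0101", "1011"],
    2: ["1100", "1010", "0110", "1001", "1101", "1111"],
    3: ["1100", "1010", "1110", "0101", "1101", "1111"],
    4: ["1100", "1010", "0110", "1101", "1011"],
    5: ["1100", "1010", "0101", "0011", "1111"],
    6: ["1100", "1010", "0101", "1011", "0111"],
    7: ["1100", "1010", "1110", "1101", "1011"],
    8: ["1100", "1010", "1110", "1101", "0111"],
    9: ["1100", "1010", "1101", "1011", "0111"],
    10: ["1100", "1010", "1101", "1011", "1111"],
}

def k2_pairs(missing):
    """Pairs of rows (1-based) on which all of 00, 01, 10, 11 are missing."""
    pairs = set()
    for i, j in combinations(range(4), 2):
        if len({(col[i], col[j]) for col in missing}) == 4:
            pairs.add((i + 1, j + 1))
    return pairs

def triple_condition(missing, r):
    """Columns absent in C_r on the other three rows when row r (1-based)
    of the quadruple plays the role of the decomposition row."""
    return {col[:r - 1] + col[r:] for col in missing}

for i in sorted(Q):
    print(i, sorted(k2_pairs(Q[i])))
print(sorted(triple_condition(Q[3], 4)), sorted(triple_condition(Q[3], 3)))
```

The printed pairs can be compared with the rows named in the text, and `triple_condition(Q[i], r)` reproduces the conditions $P_i$ for the admissible choices of $r$.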
Thus the columns form a laminar family on rows $M(r)$, namely, considered as subsets of $M(r)$, any two sets are either disjoint or one is contained in the other. One has to consider $Q_4, Q_8, Q_9, Q_{10}$, and in each case the row corresponding to row $r$ must be the fourth row of each such $Q_i$ (else a $K_2$ remains on the triple). Similarly, under the 1's in row $r$, in $[C_r D_r]|_{M(r)}$ we cannot have the configuration $F_5$, and so the columns also form a laminar family on $M(r)$. Had we used row $r$ as the third or fourth row of $Q_3$, then $[C_r D_r]|_{M(r)}$ might have $F_5$, and had we used row $r$ as the third row of $Q_8$, then $[B_r C_r]|_{M(r)}$ might have the configuration $F_5$. Fortunately these cases are not possible.

Now consider $r_j \in M(r)$ and the standard decomposition based on row $r_j$. The columns of $C_{r_j}$ must correspond to columns of $A$ which appear with a 1 and with a 0 in row $r_j$ and are the same elsewhere, and hence this is also true when restricted to the rows $M(r) \cup r$. We have pairs of columns as follows:

$$\begin{array}{r} r_j \\ M(r) \setminus r_j\ \{ \\ r \end{array} \begin{bmatrix} 0 & 1 \\ \alpha & \alpha \\ a & a \end{bmatrix}$$

We can show that there are at most three possible choices for pairs of columns and hence at most three choices for $\alpha$. In fact, consider $[B_r C_r]$ and let $\beta \ne \gamma$ be two nonzero choices for $\alpha$.

$$\begin{array}{r} r_j \\ M(r) \setminus r_j\ \{ \\ r \end{array} \begin{bmatrix} 0 & 1 & 0 & 1 \\ \beta & \beta & \gamma & \gamma \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The columns of $A$ with 0's in row $r$, when restricted to the rows $M(r)$, must form a laminar family. Columns 2 and 4 have 1's in common on row $r_j$, and we deduce, without loss of generality by considering columns 2 and 4, that $\beta \le \gamma$. Now considering columns 2 and 3 and using $0 \ne \beta \le \gamma$, we violate the laminar property, a contradiction to $\beta \ne \gamma$. Thus there is at most one nonzero choice for $\alpha$ in $[B_r C_r]$ and at most one nonzero choice for $\alpha$ in $[C_r D_r]$.

Given that there are at most three choices for columns of $C_{r_j}$ restricted to the rows $M(r) \setminus r_j$, we can deduce that $|L(r_j) \cap M(r)| \le 2$ (we would be able to delete all but two rows of $M(r) \setminus r_j$ from $C_{r_j}|_{L(r_j)}$ without affecting simplicity). So let $M'(r_j) = M(r_j) \setminus M(r)$ and note that $|M(r_j)| - 2 \le |M'(r_j)| \le |M(r_j)|$.
From our previous observations we have that the columns of $A$ indexed by the rows of $M'(r_j)$ consist of the column of 0's, the columns of the identity matrix, and (possibly) columns of sum at least 2 which are disjoint when considered as sets. We use the following notation.

$$C_{r_j} = \begin{array}{r} M(r) \setminus r_j\ \{ \\ M'(r_j)\ \{ \\ \ \end{array} \begin{bmatrix} X \\ Y \\ \vdots \end{bmatrix}, \qquad A = \begin{array}{r} r_j \rightarrow \\ M(r) \setminus r_j\ \{ \\ M'(r_j)\ \{ \\ \ \end{array} \begin{bmatrix} 0 \cdots 0 & 0 \cdots 0 & 1 \cdots 1 & 1 \cdots 1 \\ & X & X & \\ & Y & Y & \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}.$$

By our previous observations for $C_r|_{L(r)}$ and $C_r|_{M(r)}$ we know that columns of $Y$ are either columns of 0's, columns of sum 1, or possibly columns of sum 2 (disjoint as sets). By reducing to $M'(r_j)$ and deleting the possible row of overlap with $M(r)$, we may assume that $Y$ contains $I_{|M'(r_j)|}$. We can show that two different columns of $Y$ of sum 1 (i.e. with 1's in different rows) on rows $M'(r_j)$ cannot lie under two identical nonzero columns (of $X$) on rows $M(r) \setminus r_j$, else we have $I_2 \times T'_2$ in $A$ as follows. Let $i, k$ correspond to the rows of $M'(r_j)$ which contain the 1's of the two selected columns of $Y$, and let $a \in M(r) \setminus r_j$ be a row containing 1's in the repeated nonzero column of $X$. This gives $I_2 \times T'_2$:

$$\begin{array}{c} r_j \\ a \\ i \\ k \end{array} \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}$$

Recall that there are at most 3 different columns in $X$ and that $Y$ has $I_{|M'(r_j)|}$. Let $M''(r_j) \subseteq M'(r_j)$ denote the rows corresponding to the 1's in the columns of the identity $I_{|M'(r_j)|}$ which lie under columns of 0's in rows of $M(r) \setminus r_j$. We have that $|M'(r_j)| - 2 \le |M''(r_j)| \le |M'(r_j)|$, else we create $I_2 \times T'_2$ as above, since there are at most two different nonzero columns of $X$. Thus $|M(r_j)| - 4 \le |M''(r_j)| \le |M(r_j)|$.

Now consider two rows $r_p, r_q \in M(r)$. Let $i, j \in M''(r_p) \cap M''(r_q)$. From the standard decomposition using row $r_p$, we can find two columns with 1's in row $r_p$, 0's in row $r_q$ (and also 0's in all other rows of $M(r)$) and $I_2$ on rows $i, j$. Similarly we can find two columns with 1's in row $r_q$, 0's in row $r_p$ (and all other rows of $M(r)$) and $I_2$ on rows $i, j$.
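Then we can find a copy of $I_2 \times I_2$:

$$\begin{array}{c} r_p \\ r_q \\ i \\ j \end{array} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}$$

We deduce that $|M''(r_p) \cap M''(r_q)| \le 1$, which in turn establishes (7.6.1), namely that $|M(r_i) \cap M(r_j)| \le 9$, completing our proof.

7.7 All Pairs of Columns

We now establish an asymptotic bound that generalizes the exact bound proved in Section 4.2. Consider all pairs of possible $\ell \times 1$ columns:

$$\mathcal{F}_\ell := \{[\alpha\,\beta] : \alpha, \beta \in \{0, 1\}^\ell,\ \alpha \ne \beta\}.$$

For example, $\mathcal{F}_2$ is the set of matrices considered in Section 4.2, where we proved $\mathrm{forb}(m, \mathcal{F}_2 \times \mathcal{F}_2) = m + 3$.

$$\mathcal{F}_2 = \left\{ \begin{bmatrix} 0&0\\0&1 \end{bmatrix}, \begin{bmatrix} 0&1\\0&1 \end{bmatrix}, \begin{bmatrix} 1&0\\0&1 \end{bmatrix}, \begin{bmatrix} 1&1\\0&1 \end{bmatrix} \right\}$$

and

$$\mathcal{F}_3 = \left\{ \begin{bmatrix} 1&1\\1&1\\1&0 \end{bmatrix}, \begin{bmatrix} 1&1\\1&0\\1&0 \end{bmatrix}, \begin{bmatrix} 1&0\\1&0\\1&0 \end{bmatrix}, \begin{bmatrix} 1&1\\1&0\\0&1 \end{bmatrix}, \begin{bmatrix} 1&1\\1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 1&0\\1&0\\0&1 \end{bmatrix}, \begin{bmatrix} 1&0\\1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 1&0\\0&1\\0&0 \end{bmatrix}, \begin{bmatrix} 1&0\\0&0\\0&0 \end{bmatrix} \right\}.$$

Theorem 4.2.1 has an asymptotic version. Observe that $\mathrm{forb}(m, \mathcal{F}_\ell) = 1$ for $m \ge \ell$ (and $2^m$ for $m < \ell$).

Theorem 7.7.1 Let $\mathcal{T}$ be a family of forbidden configurations with $\mathrm{forb}(m, \mathcal{T})$ being $O(m^k)$. Then for any $\ell$, we have that $\mathrm{forb}(m, \mathcal{F}_\ell \times \mathcal{T})$ is $O(m^{k+1})$.

Proof: We consider a matrix $A$ with no configuration in $\mathcal{T} \times \mathcal{F}_\ell$. We perform an operation analogous to the standard decomposition (2.1.1) on $\ell$ rows. For any two columns $\alpha, \beta$ with $\alpha \ne \beta$, we let $A_{\alpha,\beta}$ denote the $(m - \ell)$-rowed matrix formed by the set of columns of $A|_{\{\ell+1, \ell+2, \dots, m\}}$ which appear under both $\alpha$ and $\beta$. For example, for $\ell = 2$, we would consider the following representative, in which each block $A_\alpha$ sits under the columns whose first two rows are $\alpha$:

$$A = \begin{bmatrix} 0 \cdots 0 & 1 \cdots 1 & 0 \cdots 0 & 1 \cdots 1 \\ 0 \cdots 0 & 0 \cdots 0 & 1 \cdots 1 & 1 \cdots 1 \\ A_{\left[\begin{smallmatrix} 0\\0 \end{smallmatrix}\right]} & A_{\left[\begin{smallmatrix} 1\\0 \end{smallmatrix}\right]} & A_{\left[\begin{smallmatrix} 0\\1 \end{smallmatrix}\right]} & A_{\left[\begin{smallmatrix} 1\\1 \end{smallmatrix}\right]} \end{bmatrix}$$

where the $A_\alpha$ are all simple matrices for $\alpha \in \left\{ \left[\begin{smallmatrix} 0\\0 \end{smallmatrix}\right], \left[\begin{smallmatrix} 1\\0 \end{smallmatrix}\right], \left[\begin{smallmatrix} 0\\1 \end{smallmatrix}\right], \left[\begin{smallmatrix} 1\\1 \end{smallmatrix}\right] \right\}$. In such a matrix, $A_{\left[\begin{smallmatrix} 0\\0 \end{smallmatrix}\right], \left[\begin{smallmatrix} 1\\0 \end{smallmatrix}\right]}$ would be the matrix formed by the columns that are in both $A_{\left[\begin{smallmatrix} 0\\0 \end{smallmatrix}\right]}$ and $A_{\left[\begin{smallmatrix} 1\\0 \end{smallmatrix}\right]}$.

We deduce that $A_{\alpha,\beta}$ is a simple matrix with no configuration in $\mathcal{T}$, since if $F \in \mathcal{T}$ is in $A_{\alpha,\beta}$ then $A$ has $[\alpha\,\beta] \times F \in \mathcal{F}_\ell \times \mathcal{T}$. Hence $A_{\alpha,\beta}$ has at most $\mathrm{forb}(m - \ell, \mathcal{T})$ columns. There are at most $\binom{2^\ell}{2}$, a constant number, of choices for pairs $\alpha$ and $\beta$, and so after deleting at most $O(m^k)$ columns from $A$ to produce $A'$ we can ensure that $A'|_{\{\ell+1, \ell+2, \dots, m\}}$ is simple and hence, by induction, has at most $\mathrm{forb}(m - \ell, \mathcal{T} \times \mathcal{F}_\ell)$ columns. Thus $\mathrm{forb}(m, \mathcal{T} \times \mathcal{F}_\ell) \le \mathrm{forb}(m - \ell, \mathcal{T} \times \mathcal{F}_\ell) + O(m^k)$, and iterating at most $m/\ell$ times yields the result.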
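The sizes of the families $\mathcal{F}_2$ and $\mathcal{F}_3$ listed at the start of this section can be reproduced by enumerating pairs of distinct columns up to row and column permutation. The sketch below is illustrative only; the names `canonical` and `family` are hypothetical. A pair $[\alpha\,\beta]$ is represented by its sorted list of rows, and swapping the two columns is handled by taking the smaller of the two sorted lists.

```python
from itertools import product

def canonical(alpha, beta):
    """Canonical form of the l x 2 matrix [alpha beta] as a configuration."""
    rows = sorted(zip(alpha, beta))        # row order does not matter
    swapped = sorted(zip(beta, alpha))     # neither does column order
    return tuple(min(rows, swapped))

def family(l):
    """All configurations [alpha beta] with alpha != beta on l rows."""
    cols = list(product((0, 1), repeat=l))
    return {canonical(a, b) for a in cols for b in cols if a != b}

for l in (1, 2, 3):
    print(l, len(family(l)))
```

For $\ell = 2$ and $\ell = 3$ this gives 4 and 9 configurations, matching the displayed lists.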
The $O(m)$ bound of Theorem 4.2.1 would follow from taking $\mathcal{T} = \mathcal{F}_2$, although to get the exact bound we use the proof given in that section.

Chapter 8
Conclusions

8.1 Open Problems

8.1.1 Monotonicity

It is quite reasonable to conjecture that the function $\mathrm{forb}(m, F)$ is nondecreasing in the first variable $m$, as we know it is in the second variable $F$ (Remark 1.3.21).

Problem 8.1.1 Let $F$ be a $k$-rowed (0,1)-matrix and let $m \ge 1$. Is it true that $\mathrm{forb}(m, F) \le \mathrm{forb}(m + 1, F)$?

We can answer positively for single forbidden configurations with certain properties. We know this is false for families of forbidden configurations.

Proposition 8.1.2 Let $F$ be a configuration that lacks at least one of the following: a row of 0's, a row of 1's, or a repeated row. Then for all $m$, $\mathrm{forb}(m, F) \le \mathrm{forb}(m + 1, F)$.

Proof: Let $A \in \mathrm{ext}(m, F)$. Then consider the $(m + 1)$-rowed matrix

$$A' = \begin{cases} A \times [0] & \text{if } F \text{ doesn't have a row of 0's,} \\ A \times [1] & \text{if } F \text{ doesn't have a row of 1's,} \\ \begin{bmatrix} A \\ r \end{bmatrix} & \text{if } F \text{ doesn't have a repeated row,} \end{cases}$$

where $r$ is any row of $A$. Then $A'$ is simple, $\|A'\| = \|A\|$ and $F \nprec A'$, so $A' \in \mathrm{Avoid}(m + 1, F)$.

Note that for families of configurations this is not always the case. Here is an amusing case.

Theorem 8.1.3

$$\mathrm{forb}\left(m, \left\{ \begin{bmatrix} 0&0\\0&0 \end{bmatrix}, \begin{bmatrix} 1&1\\1&1 \end{bmatrix} \right\}\right) = \begin{cases} 2 & \text{for } m = 1 \text{ or } m \ge 7, \\ 6 & \text{for } m = 3, 4, \\ 4 & \text{for } m = 2, 5, 6. \end{cases}$$

Proof: The result is easy for $m = 1, 2$. First consider pairs of columns. Let $F_{abcd}$ denote the $(a + b + c + d) \times 2$ matrix of $a$ rows $[0\ 0]$, $b$ rows $[1\ 0]$, $c$ rows $[0\ 1]$ and $d$ rows $[1\ 1]$. Consider two columns of $A$ and suppose they are $F_{abcd}$. We have $a, d \le 1$. If $A$ has at least three columns, we deduce that $b, c \le 2$, else with $b \ge 3$ ($F_{0300}$), by the pigeonhole principle a third column will create either $\begin{bmatrix} 0&0\\0&0 \end{bmatrix}$ or $\begin{bmatrix} 1&1\\1&1 \end{bmatrix}$ on those 3 rows. For $m \ge 7$ we deduce then that there are at most 2 columns. For $m = 6$ we can use this to deduce that every pair of columns must form $F_{1221}$, and then the $6 \times 4$ configuration $(K_4^2)^T$ establishes the bound. For $m = 5$ we can use this to deduce that every pair of columns must form $F_{1211}$ or $F_{1220}$ or $F_{0221}$.
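As an aside, the values stated in Theorem 8.1.3 can be confirmed for small $m$ by exhaustive search. The sketch below is an illustration, not the software of Chapter 3, and the function name `forb_two_blocks` is hypothetical. It uses the observation above that a simple matrix avoids both $2 \times 2$ blocks exactly when every pair of its columns has $a \le 1$ and $d \le 1$, which turns the problem into a maximum-clique search over $\{0,1\}^m$.

```python
from itertools import product

def forb_two_blocks(m):
    """Largest simple m-rowed matrix with no 2x2 block of 0's or of 1's."""
    cols = list(product((0, 1), repeat=m))

    def compatible(x, y):
        both0 = sum(1 for s, t in zip(x, y) if s == 0 and t == 0)  # a
        both1 = sum(1 for s, t in zip(x, y) if s == 1 and t == 1)  # d
        return both0 <= 1 and both1 <= 1

    n = len(cols)
    adj = [{j for j in range(n) if j != i and compatible(cols[i], cols[j])}
           for i in range(n)]
    best = 0

    def extend(size, candidates):
        nonlocal best
        best = max(best, size)
        if size + len(candidates) <= best:
            return  # cannot beat the best clique found so far
        for v in sorted(candidates):
            # Only extend with larger indices so each clique is built once.
            extend(size + 1, {u for u in candidates if u > v and u in adj[v]})

    extend(0, set(range(n)))
    return best

print([forb_two_blocks(m) for m in range(1, 8)])  # [2, 4, 6, 6, 4, 4, 2]
```

The search is tiny for $m \le 7$ because every column is compatible with only a handful of others.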
We can create 4 columns by deleting one row from $(K_4^2)^T$.

Let $A \in \mathrm{ext}\left(m, \left\{ \begin{bmatrix} 0&0\\0&0 \end{bmatrix}, \begin{bmatrix} 1&1\\1&1 \end{bmatrix} \right\}\right)$. Using the technique of What Is Missing, we have that on any pair of rows there is at most one column $\begin{bmatrix} 0\\0 \end{bmatrix}$ and at most one column $\begin{bmatrix} 1\\1 \end{bmatrix}$:

$$\le 1 \begin{bmatrix} 0\\0 \end{bmatrix}, \qquad \le 1 \begin{bmatrix} 1\\1 \end{bmatrix}$$

On any three rows, every column uses up one of these restrictions (it has two equal entries on some pair of the three rows), and there are only 6 restrictions available on three rows, so there are at most 6 columns. For $m = 3$ we may use the construction $[K_3^2 \mid K_3^1]$ and for $m = 4$ we may use the configuration $K_4^2$.

One could potentially circumvent this example by only seeking monotonicity for $m$ large enough.

Conjecture 8.1.4 Let $\mathcal{F}$ be a family of forbidden configurations. Then there exists $M$ such that for all $m \ge M$, $\mathrm{forb}(m, \mathcal{F}) \le \mathrm{forb}(m + 1, \mathcal{F})$.

8.1.2 A Common 4-rowed Subconfiguration

As mentioned in the introduction to Chapter 5, there are two 4-rowed configurations for which we do not know the asymptotic bound. Both configurations have the following configuration in common:

$$F(t) = t \cdot \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}$$

This is the smallest configuration $F$ for which an asymptotic bound on $\mathrm{forb}(m, F)$ is not known. Interestingly, for $t = 1$ and $t = 2$ a quadratic bound has been proven.

Problem 8.1.5 Prove that $\mathrm{forb}(m, F(t))$ is $\Theta(m^2)$ (or find a counterexample).

8.1.3 Other 3x4 Exact Bounds

In Section 4.1, we found two exact bounds on $3 \times 4$ configurations. There are only 10 others for which the answer isn't known. We can apply the local search techniques described in Section 3.6 to find conjectured extremal matrices. The following table shows the conjectured answers. The entries of the table represent $\mathrm{forb}(m, V_i)$. The "?" symbol indicates that the computer has found an example (i.e. a lower bound) but we haven't proven the upper bound.

Configuration | m = 3 | m = 4 | m = 5 | m = 6 | m = 7 | m = 8 |
---|---|---|---|---|---|---|
$V_1 = \begin{bmatrix} 1&1&1&1\\1&1&1&1\\1&0&0&0 \end{bmatrix}$ | 8 | 16 | 22 | 32? | 43? | 55? |
$V_2 = \begin{bmatrix} 1&1&1&1\\1&1&1&0\\1&0&0&1 \end{bmatrix}$ | 8 | 12 | 18 | 26? | 36? | 45? |
$V_3 = \begin{bmatrix} 1&1&1&1\\1&1&1&0\\1&0&0&0 \end{bmatrix}$ | 8 | 12 | 18 | 26? | 36? | 45? |
$V_4 = \begin{bmatrix} 1&1&1&0\\1&1&1&0\\1&0&0&1 \end{bmatrix}$ | 8 | 12 | 18 | 26? | 36? | 45? |
$V_5 = \begin{bmatrix} 1&1&1&0\\1&1&1&0\\1&0&0&0 \end{bmatrix}$ | 8 | 12 | 18 | 26? | 36? | 44? |
$V_6 = \begin{bmatrix} 1&1&1&0\\1&1&1&0\\1&0&0&0 \end{bmatrix}$ | 8 | 16 | 22 | 29? | 37? | 45? |
$V_7 = \begin{bmatrix} 1&1&1&1\\1&1&1&0\\0&0&0&0 \end{bmatrix}$ | 8 | 16 | 22 | 29? | 37? | 46? |
$V_8 = \begin{bmatrix} 1&1&1&0\\1&1&1&0\\0&0&0&1 \end{bmatrix}$ | 8 | 16 | 22 | 29? | 37? | 46? |
$V_9 = \begin{bmatrix} 1&1&1&0\\1&1&1&0\\0&0&0&0 \end{bmatrix}$ | 8 | 16 | 22 | 29? | 37? | 46? |
$V_{10} = \begin{bmatrix} 1&1&1&0\\1&1&0&1\\0&0&1&1 \end{bmatrix}$ | 8 | 12 | 17 | 23? | 30? | 38? |

Table 8.1: Conjectured values of $\mathrm{forb}(m, V_i)$

The last numbers in this table (for 8 rows) take a few hours to find. The next number, for 9 rows, might take a few weeks of computer time. One could conjecture that

$$\mathrm{forb}(m, V_{10}) = \binom{m}{2} + \binom{m}{1} + \binom{m}{0} + 1.$$

8.1.4 Critical Substructures

We give some ideas extending those of Section 4.3.

Problem 8.1.6 Prove Conjecture 4.3.2 or the equivalent Conjecture 4.3.8, or give a counterexample.

What follows are ideas and attempts at proving Conjecture 4.3.2 for $k \ge 5$. We prove some of the properties required for general $k$ and we prove some for $k = 5$.

Consider applying the idea of the proof of Proposition 4.3.9 for general $k$ to try to show that $\mathrm{forb}(m, F_{k-1}) < \mathrm{forb}(m, K_k)$. If we could show this is true for any $m$, then it will be true for all larger $m$ by induction. One might try $m = k$ as a base case, but unfortunately we find that $\mathrm{forb}(k,\ 2 \cdot K_{k-1} - 1_{k-1} - 0_{k-1}) = \mathrm{forb}(k, K_k) = 2^k - 1$, namely if we delete any column of $K_k$ which has column sum not 0 or $k$, then the resulting matrix has no $2 \cdot K_{k-1} - 1_{k-1} - 0_{k-1}$.

One might try next to obtain the inequality for $m = k + 1$. Proceed by contradiction and assume $\mathrm{forb}(k + 1, F_{k-1}) = \mathrm{forb}(k + 1, K_k)$. Let $A \in \mathrm{ext}(k + 1, F_{k-1})$. As argued before, for any $(k + 1) \times 1$ column $\alpha$ which is not a column in $A$, $[A \mid \alpha]$ has a copy of $K_k$, say on rows $S$. Similarly, for a column $\beta$ not in $A$, we have that $[A \mid \beta]$ has a copy of $K_k$, say on rows $T$. Now if $S = T$, then $A|_S$ has the submatrix $K_k - \alpha|_S$ and also the submatrix $K_k - \beta|_S$. If $\alpha|_S \ne \beta|_S$, then we deduce $A$ has $K_k$, a contradiction. There are at most 2 columns on $k + 1$ rows agreeing on a $k$-set of rows $S$. Now if $S \ne T$ then $|S \cap T| = k - 1$. A similar argument yields a contradiction if $\alpha|_{S \cap T} \ne \beta|_{S \cap T}$.
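Two closed forms that appear in Section 8.1 are easy to tabulate. The sketch below assumes the standard bound of Sauer, Perles and Shelah, and Vapnik and Chervonenkis, $\mathrm{forb}(m, K_k) = \binom{m}{k-1} + \binom{m}{k-2} + \cdots + \binom{m}{0}$, which this section uses without restating; the function names `forb_complete` and `conjectured_forb_v10` are hypothetical. The second function only evaluates the conjectured formula for $\mathrm{forb}(m, V_{10})$, so agreement with the table is evidence, not a proof.

```python
from math import comb

def forb_complete(m, k):
    """forb(m, K_k) = C(m, k-1) + ... + C(m, 0) (assumed standard bound)."""
    return sum(comb(m, i) for i in range(k))

def conjectured_forb_v10(m):
    """Closed form conjectured in the text for forb(m, V_10)."""
    return comb(m, 2) + comb(m, 1) + comb(m, 0) + 1

# Conjectured formula for V_10, m = 3..8.
print([conjectured_forb_v10(m) for m in range(3, 9)])  # [8, 12, 17, 23, 30, 38]

# Base case used in the discussion of critical substructures.
print(all(forb_complete(k, k) == 2**k - 1 for k in range(1, 10)))  # True

# Comparison values forb(m, K_4) and forb(m, K_5) for small m.
print([forb_complete(m, 4) for m in range(4, 8)])
print([forb_complete(m, 5) for m in range(5, 8)])
```

The first list agrees with the $V_{10}$ row of the table of conjectured values, and the last two lists give the exact baselines against which the computer-found lower bounds for smaller configurations are compared.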
Table 8.2 shows some computer-found lower bounds (using the algorithms described in Chapter 3). Remember these are conjectures for $m \ge 6$. We can show $\mathrm{forb}(k + 2, F_k) = \mathrm{forb}(k + 2, K_{k+1})$ for $k \ge 4$ as follows. The $k + 3$ columns to delete from $K_{k+2}$ are

$$B = \begin{bmatrix} 1_3 & I_3^c & 1_3 \\ \times & \times & \times \\ 0_{k-1} & 0_{k-1} & I_{k-1} \end{bmatrix}$$

One can show that this matrix $B$ has the property that for every $k$-set of rows $S$, there is a column repeated three times in $B|_S$, while $K_{k+2}$ has four copies of each column on a $k$-set of rows.

m | forb(m, F_3) | forb(m, K_4) |
---|---|---|
4 | 15 | 15 |
5 | 25 | 26 |
6 | 40(?) | 42 |
7 | 59(?) | 64 |

m | forb(m, F_4) | forb(m, K_5) |
---|---|---|
5 | 31 | 31 |
6 | 57 | 57 |
7 | 98(?) | 99 |

Table 8.2: Conjectured values for $\mathrm{forb}(m, F_3)$ and $\mathrm{forb}(m, K_4)$

m | forb(m, 2·K_4^2) | forb(m, K_5) |
---|---|---|
4 | 16 | 16 |
5 | 30 | 31 |
6 | 53(?) | 57 |
7 | 86(?) | 99 |

Table 8.3: Conjectured values for $\mathrm{forb}(m, 2 \cdot K_4^2)$

We should also expect that $\mathrm{forb}(m, 2 \cdot K_4^2)$ is much less than $\mathrm{forb}(m, K_5)$. Several proof techniques will work, but the easiest is to consider $\mathrm{forb}(5, 2 \cdot K_4^2)$ and show that $\mathrm{forb}(5, 2 \cdot K_4^2) \le 2^5 - 2$, i.e. you must delete at least 2 columns in order to avoid $2 \cdot K_4^2$. This is relatively easy. A column of sum 2 can only yield $1_2 \times 0_2$ on three 4-sets of rows, but there are five 4-sets of rows. One way to achieve this is to delete from $K_5$ a column of sum 2 and a column of sum 3 that overlap in two rows. Table 8.3 shows some more computer experiments.

Lemma 8.1.7 We can establish that $\mathrm{forb}(m, 2 \cdot K_4^2) \le \mathrm{forb}(m, K_5) - m + 4$.

Proof: We use induction as follows. We first note that $\mathrm{forb}(m, [2 \cdot K_3^1 \mid 2 \cdot K_3^2]) < \mathrm{forb}(m, K_4)$. Now applying the standard decomposition (2.1.1) repeatedly, we have

$$\mathrm{forb}(m, 2 \cdot K_4^2) \le \mathrm{forb}(m - 1, 2 \cdot K_4^2) + \mathrm{forb}(m - 1, [2 \cdot K_3^1 \mid 2 \cdot K_3^2])$$

and then

$$\mathrm{forb}(m, 2 \cdot K_4^2) \le \mathrm{forb}(m - 2, 2 \cdot K_4^2) + \mathrm{forb}(m - 2, [2 \cdot K_3^1 \mid 2 \cdot K_3^2]) + \mathrm{forb}(m - 1, [2 \cdot K_3^1 \mid 2 \cdot K_3^2]),$$

and then we repeat this $m - 4$ times, using that $[2 \cdot K_3^1 \mid 2 \cdot K_3^2]$ is in $F_3$ and $\mathrm{forb}(5, F_3) = \mathrm{forb}(5, K_4) - 1$.

The following idea is probably of no help in establishing these inequalities but it should be noted.
Proposition 8.1.8 If $\mathrm{forb}(m, F) < \mathrm{forb}(m, K_k)$ for $m \ge m_0$, then $\mathrm{forb}(m, [0\ 1] \times F) < \mathrm{forb}(m, K_{k+1}) - m + m_0$.

Proof: Simply use induction with $\mathrm{forb}(m, [0\ 1] \times F) \le \mathrm{forb}(m - 1, [0\ 1] \times F) + \mathrm{forb}(m - 1, F)$.

Let $\mathrm{forb}(m, F_4) = \mathrm{forb}(m, K_5)$ and let $A \in \mathrm{ext}(m, F_4) \subseteq \mathrm{ext}(m, K_5)$. Then on each 5-set of rows exactly one column is missing, as described above. So each 5-set contains $K_5 - \alpha$ for some choice of $5 \times 1$ column $\alpha$. Now if $\alpha$ has column sum 0, 1, 4 or 5, then we find that $K_5 - \alpha$ has $F_4$. So we deduce that $\alpha$ has column sum 2 or 3. Thus we deduce that $A$ has no $[K_5^2\, K_5^3]$, on the grounds that one column is missing. Immediately we deduce that $A$ has all columns of sum $0, 1, m - 1, m$ (adding these columns can't create a $K_5$). But then $A - [K_m^0 \mid K_m^1 \mid K_m^{m-1} \mid K_m^m]$ has no $[K_4^1 \mid 2 \cdot K_4^2 \mid K_4^3]$. Now applying our inductive decomposition on $[K_4^1 \mid 2 \cdot K_4^2 \mid K_4^3]$ yields $F_3$. We would be done if we could show $\mathrm{forb}(m, F_3) \le \mathrm{forb}(m, K_4) - 3$.

Here's an idea that might not help at all, but gives rise to an interesting problem. Let $F$ be a $k$-rowed configuration for which $\mathrm{forb}(m, F)$ is known. Let $G \prec F$ be an $\ell$-rowed subconfiguration. Suppose we wish to find out if $G$ is a critical substructure of $F$. That is, suppose we wish to find out if $\mathrm{forb}(m, G) = \mathrm{forb}(m, F)$. Proceeding by contradiction, assume this is true and consider a matrix $A \in \mathrm{ext}(m, G) \subseteq \mathrm{ext}(m, F)$. If we add a column $\alpha$ to $A$, by hypothesis we should get that $[A \mid \alpha]$ contains $F$ as a configuration. When does this mean $A$ contains $G$? In particular, this means $A$ contains $F - \varphi$ for some column $\varphi$ of $F$ with $\varphi = \alpha|_S$ for some subset of the rows $S$. Note then that $G \nprec F - \varphi$. Thus, any copy of $G$ in $[A \mid \alpha]$ must be missing at most 1 column of $F$. Consider the following reinterpretation of the problem.

Lemma 8.1.9 If for every column $\varphi$ of $F$ we have that $G \prec F - \varphi$, then $\mathrm{forb}(m, G) < \mathrm{forb}(m, F)$.
Proof: Suppose, for contradiction, that $\mathrm{forb}(m, G) = \mathrm{forb}(m, F)$, and let $A$ be in $\mathrm{ext}(m, G) \subseteq \mathrm{ext}(m, F)$. By adding to $A$ any column not already present in $A$, we must obtain the configuration $F$, since $A \in \mathrm{ext}(m, F)$. This means that for some column $\varphi$ of $F$, the configuration $F - \varphi$ is present in $A$. Since $A$ has no $G$, this means $G \nprec F - \varphi$, a contradiction.

8.2 Concluding Remarks

We have seen many results concerning a single Forbidden Configuration and just a peek at results concerning multiple Forbidden Configurations. An observant reader might have noticed that the tools we use to study a single forbidden configuration are mostly the same tools we use to study a family of forbidden configurations, so, the observant reader might ask, why then have we neglected our families and opted for singles? A perhaps unsatisfying answer could be that single configurations are hard enough.

Imagine for a moment we had an oracle that, when given a family of configurations $\mathcal{F}$, would immediately provide the (correct) value of $\mathrm{forb}(m, \mathcal{F})$ and perhaps an extremal matrix satisfying the bound. What could we do with such a wonder? Among other things, this oracle would solve all Erdős-Stone-Simonovits type results, as one could imagine forbidding $(1, 1, 1)^T$ together with the incidence matrix of a graph $H$. When forbidding these configurations, the extremal matrix one would obtain would be $[0 \mid I \mid G]$, where $G$ is the incidence matrix of a graph that does not have $H$ as a configuration.

Because of this, the idea of having a general theory for families of forbidden configurations looks tantalizing, but perhaps a little daunting. Results about small families of small forbidden configurations could very well flourish with a systematic study. If I were to continue research in forbidden configurations, I would certainly start here, as I am sure there are many new results waiting to be discovered.
As of this writing we don't yet have the tools to study configurations with 6 rows or more, but we hope that by chipping away at the problem we are making progress toward perhaps a proof or refutation of Conjecture 1.4.1. Proving this pivotal conjecture would indeed be a milestone for the field, but it is my personal belief that the conjecture will turn out to be false. The reason for this belief, in the face of so much evidence for the conjecture, is its constructive nature. Sure, the product constructions described in Conjecture 1.4.1 have given the best asymptotic bounds for many (small) configurations, but these are not, by any stretch, a random sample of configurations. They are chosen to be small and manageable. As we've seen, a very small change in a configuration (for example, substituting a 0 by a 1 or adding a column) can lead to wildly different bounds. I find there is not much reason to believe a large, random configuration would satisfy the conjecture that the best (asymptotic) construction will be products of $I$, $I^c$ and $T$.

On the other hand, the configurations $I$, $I^c$ and $T$ do seem to "pop up" repeatedly while studying configurations. Perhaps they are in some way special, and we have indeed proven many cases in which the conjecture turns out to be true.

Bibliography

[ABS11] R.P. Anstee, F. Barekat, and A. Sali, Small forbidden configurations V: Exact bounds for 4×2 cases, Studia. Sci. Math. Hun. 48 (2011), 1–22.
[AF86] R.P. Anstee and Z. Füredi, Forbidden submatrices, Discrete Math. 62 (1986), 225–243.
[AFS01] R.P. Anstee, R. Ferguson, and A. Sali, Small forbidden configurations II, Electronic Journal of Combinatorics 8 (2001), R4 25pp.
[AGS97] R.P. Anstee, J.R. Griggs, and A. Sali, Small forbidden configurations, Graphs and Combinatorics 13 (1997), 97–118.
[AHK77] Kenneth Appel, Wolfgang Haken, and John Koch, Every planar map is four colorable, Illinois Journal of Mathematics (1977).
[AK97] R. Ahlswede and L.H. Khachatrian, The complete intersection theorem for systems of finite sets, European Journal of Combinatorics 16 (1997), 125–136.
[AK06] R.P. Anstee and P. Keevash, Pairwise intersections and forbidden configurations, European Journal of Combinatorics 27 (2006), 1235–1248.
[AK07] R.P. Anstee and N. Kamoosi, Small forbidden configurations III, Electronic Journal of Combinatorics 14 (2007), R79 34pp.
[AK10] R.P. Anstee and S.N. Karp, Forbidden configurations: Exact bounds determined by critical substructures, Electronic Journal of Combinatorics 17 (2010), R50 27pp.
[Alo83] N. Alon, On the density of sets of vectors, Discrete Math 46 (1983), 199–202.
[Ans] R.P. Anstee, A survey of forbidden configuration results, http://www.math.ubc.ca/~anstee/.
[Ans85] R.P. Anstee, General forbidden configuration theorems, Journal of Combinatorial Theory A 40 (1985), 108–124.
[Ans90] R.P. Anstee, Forbidden configurations, determinants and discrepancy, European Journal of Combinatorics 11 (1990), 15–19.
[AR11] R.P. Anstee and Miguel Raggi, Genetic algorithms applied to problems of forbidden configurations, preprint. http://www.math.ubc.ca/~anstee/ (2011), 22pp.
[ARS10a] R.P. Anstee, Miguel Raggi, and Attila Sali, Evidence for a forbidden configuration conjecture; one more case solved., preprint. http://www.math.ubc.ca/~anstee/ (2010), 16pp.
[ARS10b] R.P. Anstee, Miguel Raggi, and Attila Sali, Forbidden configurations and product constructions, preprint. http://www.math.ubc.ca/~anstee/ (2010), 22pp.
[ARS11] R.P. Anstee, Miguel Raggi, and Attila Sali, Forbidden configurations: Quadratic bounds, preprint. http://www.math.ubc.ca/~anstee/ (2011), 21pp.
[AS05] R.P. Anstee and A. Sali, Small forbidden configurations IV, Combinatorica 25 (2005), 503–518.
[BB05] J. Balogh and B. Bollobás, Unavoidable traces of set systems, Combinatorica 25 (2005), 633–643.
[BBM06] J. Balogh, B. Bollobás, and R. Morris, Hereditary properties of partitions, ordered graphs and ordered hypergraphs, European Journal of Combinatorics 8 (2006), 1263–1281.
[BEHW89] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth, Learnability and the Vapnik–Chervonenkis dimension, Journal of the ACM 36(4) (1989), 929–965.
[BKS05] J. Balogh, P. Keevash, and B. Sudakov, Disjoint representability of sets and their complements, Journal of Combinatorial Theory B 95 (2005), 12–28.
[EKR61] P. Erdős, C. Ko, and R. Rado, Intersection theorems for systems of finite sets, Quarterly Journal of Mathematics, Oxford Series series 2 12 (1961), 313–320.
[Erd64] P. Erdős, On extremal problems of graphs and generalized graphs, Israel Journal of Mathematics 2 (1964), 183–190.
[ES46] P. Erdős and A.H. Stone, On the structure of linear graphs, Bulletin of the American Mathematical Society 52 (1946), 1089–1091.
[ES66] P. Erdős and M. Simonovits, A limit theorem in graph theory., Studia Scientiarum Mathematicarum Hungarica 1 (1966), 51–57.
[FH92] Z. Füredi and P. Hajnal, Davenport-Schinzel theory of matrices, Discrete Math 103 (1992), 233–251.
[FQ83] Z. Füredi and F. Quinn, Traces of finite sets, Ars Combin. 18 (1983), 195–200.
[Für83] Z. Füredi, personal communication, 1983.
[Für96] Z. Füredi, Combinatorics, Probability and Computing 5 (1996), 29–33.
[Gro80] H.O.F. Gronau, An extremal set problem, Studia Sci.Math. Hungar 15 (1980), 29–30.
[Juk01] S. Jukna, Extremal combinatorics with applications in computer science, Springer-Verlag, 2001.
[KM07] M. Klazar and A. Marcus, Extensions of the linear bound in the Füredi-Hajnal conjecture, Advances in Applied Mathematics 38 (2007), 258–266.
[KST54] Kövari, V. Sós, and P. Turán, On a problem of K. Zarankiewicz, Colloq. Math. 3 (1954), 50–57.
[Man07] Problem 28 soln by H. Gouwentak, W. Mantel, J. Teixeira de Mattes, F. Schuh, and W.A. Wythoff, Wiskundige Opgavem 10 (1907), 60–61.
[Mat02] Jiri Matousek, Lectures in discrete geometry, Springer, 2002.
[MT04] A. Marcus and G. Tardos, Excluded permutation matrices and the Stanley-Wilf Conjecture, Journal of Combinatorial Theory Ser. A 107 (2004), 153–160.
[MZ07] D. Mubayi and Y. Zhao, Forbidding complete hypergraphs as traces, Graphs and Combinatorics 23 (2007), 667–679.
[O'R87] Joseph O'Rourke, Art gallery theorems and algorithms, Oxford University Press, 1987.
[Ram30] F. P. Ramsey, On a problem of formal logic, Proceedings London Mathematical Society s2 30 (1) (1930), 264–286.
[Sau72] N. Sauer, On the density of families of sets, Journal of Combinatorial Theory. Series A 13 (1972), 145–147.
[She72] S. Shelah, A combinatorial problem: Stability and order for models and theories in infinitary languages, Pacific Journal of Mathematics 4 (1972), 247–261.
[Spe28] Emanuel Sperner, Ein satz über untermengen einer endlichen menge, Mathematische Zeitschrift 27 (1) (1928), 544–548.
[Ste78] J. M. Steele, Empirical discrepancies and subadditive processes, Annals of Probability 6(1) (1978), 118–227.
[Tar05] G. Tardos, On 0-1 matrices and small excluded submatrices, Journal of Combinatorial Theory Series A 111 (2005), 266–288.
[Tur41] P. Turán, On an extremal problem in graph theory, Matematikai és Fizikai Lapok 48 (1941), 436–452.
[TV06] Terence Tao and Van H. Vu, Additive combinatorics, Cambridge Studies in Advanced Mathematics, Cambridge University Press, 2006.
[Vap00] V.N. Vapnik, The nature of statistical learning theory. information science and statistics, Springer-Verlag, 2000.
[VC68] V. N. Vapnik and A. Ya. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Soviet Math. Dokl. 9 (1968), 915–918.
[VC71] V.N. Vapnik and A.Ya. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and Applications (1971), 264–280.
[Ver05] R. Vershynin, Integer cells in convex sets, Advances in Mathematics 197 (2005), 248–273.
[WD81] R. S. Wencour and R. M. Dudley, Some special Vapnik–Chervonenkis classes, Discrete Mathematics 33(3) (1981), 313–318.

Index

(p, q)-split
{0, 1}-complement
p1, p2, . . . , pd split
s-boundary case
proj_i(B)
absent
avoids
boundary case
column sum
concatenation
configuration
contain F as a configuration
critical substructure
has no configuration
identity complement I^c_m
identity matrix I_m
implication
impure
inductive child
laminar
long supply
multiplicity
non-essential
pattern
power set
predicted s-boundary case
predicted boundary case
pure
restriction
set system
shifted matrix
short supply
simple
simple configuration
simple hypergraph
standard decomposition
subconfiguration
subtraction
tower matrix T_m
violates an implication