UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A survey of results toward the class number problem for real quadratic fields Nevin, Joshua Borwein 2015

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


24-ubc_2015_may_nevin_joshua.pdf [ 466.27kB ]
JSON: 24-1.0166230.json
JSON-LD: 24-1.0166230-ld.json
RDF/XML (Pretty): 24-1.0166230-rdf.xml
RDF/JSON: 24-1.0166230-rdf.json
Turtle: 24-1.0166230-turtle.txt
N-Triples: 24-1.0166230-rdf-ntriples.txt
Original Record: 24-1.0166230-source.json
Full Text

Full Text

A SURVEY OF RESULTS TOWARD THE CLASS NUMBER PROBLEM FOR REALQUADRATIC FIELDSbyJOSHUA BORWEIN NEVINB.Sc., California Institute of Technology, 2013A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THEDEGREE OFMASTER OF SCIENCEinTHE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES(Mathematics)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)April 2015© Joshua Borwein Nevin, 2015AbstractThe class number problem is one of the central open problems of algebraic number theory. It haslong been known that there are only finitely many imaginary quadratic fields of class number one, andthe full list of such fields is given by the Stark-Heegner theorem, but the corresponding problem forreal quadratic fields is still open. It is conjectured that infinitely many real quadratic fields have classnumber one but at present it is still unknown even whether infinitely many algebraic number fieldshave class number one. This thesis reviews the relevant work that has been done on this problemin the last several decades. It is primarily concerned with a heuristic model of the sequence of realquadratic class groups called the Cohen-Lenstra heuristics, since this appears to offer the best hope ofpotentially solving the class number problem. The work of several other people who have put forwardinterpretations of the Cohen-Lenstra heuristics in other contexts, or who have generalized the heuristics,is also reviewed.iiPrefaceThis thesis is the unpublished, independent work of the author, Joshua Borwein Nevin.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vIntroduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 An Overview of the Basic Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.1 The Ideal Class Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 The Dedekind ζ-function and the Class Number Formula . . . . . . . . . . . . . . . . . 52 The Cohen-Lenstra Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.1 Modules over Dedekind Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.2 The Weighted ζ-Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.3 Applying The Heuristic Assumption of the Model . . . . . . . . . . . . . . . . . . . . . . 282.4 Justification for the Cohen-Lenstra Heuristic . . . . . . . . . . . . . . . . . . . . . . . . 343 Interpretations of the Cohen-Lenstra Heuristics . . . . . . . . . . . . . . . . . . . . . . 443.1 Young Diagrams of Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443.2 The Young Tableau Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463.3 The Conjugacy Class Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53ivAcknowledgementsI would like to thank my supervisor, Professor Sujatha Ramdorai, as well as Professor Greg Martin,who took over the task of helping me finish my thesis when Sujatha could not be here. I would also liketo thank Professor Mike Bennett for acting as a reader for this thesis.vIntroductionQuadratic fields have been intensely studied, beginning with Gauss’ development of the theory of binaryquadratic forms. It is remarkable that Gauss developed the theory of class numbers without havingany of the modern language of groups and fields. He conjectured that only finitely many negativefundamental discriminants have class number 1. This conjecture was proven by Heilbronn in 1934.The full list of imaginary quadratic fields with class number 1 was completed by Kurt Heegner andHarold Stark. It is conjectured that infinitely many real quadratic fields have class number one, butthis problem is much more difficult than the corresponding problem for imaginary quadratic fields.Imaginary quadratic fields have unit group {±1} (except Q(√−1) and Q(√−3)) but real quadraticfields have a unit group with a free part of rank 1 and so have a non-trivial regulator which is verydifficult to control. Much of the work on this conjecture stems from a model created by Henri Cohen andHendrik Lenstra. Cohen and Lenstra noticed that the (deterministic) sequence of quadratic class groupsbehaved as they were subject to a probability measure in which groups with many automorphisms wereless likely to appear. If the heuristic assumptions of the Cohen-Lenstra model are true, the implicationsare far beyond what we can currently prove. For example, according to this model, the probability thata real quadratic class group has a trivial odd part is 75.446%.The primary source for this model is from the original paper of Cohen and Lenstra, Heuristics on ClassGroups of Number Fields, but the ideas contained therein have been expanded on by several otherpeople, such as Lengler (see [3]) and Fulman (see [2]).1Chapter 1An Overview of the Basic Theory1.1 The Ideal Class GroupIf K is a number field with number ring OK , then the group of fractional ideals of OK form a freeabelian group JK generated by the primes. This contains a subgroup PK generated by all fractionalprincipal ideals. The ideal class group is the quotient group JK/PK . Crucially, this group is finite.There are many elegant proofs of the finiteness of this group. One such method starts by embeddingthe field K into the Euclidean space Rr ×Cs, where K has r embeddings into R and s conjugate pairsof embeddings into C. The finitness of the ideal class group can then be deduced by regarding eachnonzero ideal as a lattice in Rr+2s and applying Minkowski’s theorem. The class number of the numberfield K is |JK/PK |.For K a number field, the number ring OK is a free Z-module whose rank is equal to the number ofembeddings of K into C. Every field K comes equipped with a discriminant.Definition 1.1.1 If K is a number field whose number ring OK has basis {α1, · · · , αn} as a free Z-module, and such that {σ1, σ2, · · · , σn} is the set of embeddings of K into C then the discriminant ofK is the following quantity2detσ1(α1) σ2(α1) · · · σn(α1)σ2(α1) σ2(α2) · · · σ2(αn)....... . ....σn(α1) σn(α2) · · · σn(αn).We denote this quantity by disc(K).Our interest lies in the class groups of the quadratic number fields, that is, a number field of the formK = Q(√k) for k squarefree.Proposition 1.1.2 The discriminant of the number field Q(√k), for k squarefree, isdisc(Q(√k))=k if k ≡ 1 mod 44k if k ≡ 2, 3 mod 4.Later on, we will develop the theory of binary quadratic forms. In the language of Gauss’ theory ofbinary quadratic forms, the discriminant of a quadratic is a fundamental discriminant.Definition 1.1.3 A binary quadratic form is a homogenous degree 2 polynomial Q(x, y) = ax2 + bxy+cy2 with integer coefficients. The discriminant of the binary quadratic form ax2 + bxy + cy2 is theinvariant b2 − 4ac. That is, a discriminant is an integer of the form b2 − 4ac where a, b, c ∈ Z. Adiscriminant D is called fundamental if it cannot be written in the form D = D0f2 for D0, f integerswith f > 1.The discriminant of a quadratic number field (as in definition 1.1.1/proposition 1.1.2) is a fundamentaldiscriminant (as in definition 1.1.3). Thus, the notation h(d) will always be reserved for the class numberof the unique quadratic field with fundamental discriminant d. For an algebraic number field K, we willalso use the notation h(K) to refer to the class number of K.In the case where d < 0, the problem of studying h(d) is made considerably simpler by the fact that thecorresponding quadratic field has a finite unit group:Proposition 1.1.4 (Dirichlet’s Unit Theorem)Let K be an algebraic number field with r real embeddings and s pairs of conjugate complex embeddings.Then the unit group O∗K is isomorphic to µ(K) × Zr+s−1, where µ(K) ⊆ C∗ is the group of roots of3unity lying in K.Thus, imaginary quadratic fields, for which r = 0 and s = 1, have no free part to their unit group. Fork > 0 and K = Q(√−k), the unit group of K is just {1,−1}, unless k = 1 or k = 3. But for any otheralgebraic number field K apart from the imaginary quadratic fields and Q, K comes equipped with asystem of fundamental units. A set S ⊆ O∗K is called a system of fundamental units if it is a minimalgenerating set of the free part of O∗K .In the particular case we are interested in, that of real quadratic fields, a system of fundamental units isjust a singleton, since r+ s− 1 = 1. In this case, we have an exact characterization of the fundamentalunits in terms of Diophantine equations. Namely, a fundamental unit of a real quadratic field is aminimal solution to Pell’s equation:Proposition 1.1.5 Let k > 1 be a squarefree integer and d the discriminant of the real quadratic numberfield K = Q(√k). Let x1, y1 be the uniquely determined minimal positive rational integer solution ofthe equation x2 − dy2 = −4, unless no such solutions exist. In that case, let x1, y1 be the uniquelydetermined minimal positive rational integer solutions of the equation x2−dy2 = 4. Then u1 =x1+y1√d2is a fundamental unit of K.Proof: We consider two cases. In the first case k ≡ 1 mod 4. In that case k = d by proposition 1.1.2.ThusN(u1) =(x21 − dy21)4= ±1.Here the norm is with respect to the extension K/Q. So in any case u1 is a unit. Suppose it is not afundamental unit. The number ring of Q(√k) is Z[1+√d2], as k = d. Therefore, there is a fundamentalunit u expressible asu = a+ b(1 +√d2)=(2a+ b) + b√d2where a, b ∈ Z.4Note we can assume that both 2a+ b and b are nonnegative integers because if both are negative, then−u is also a fundamental unit. If one of them is negative, then we can just pass to the Galois conjugateof u (or −u if necessary), which is also a fundamental unit. So we may assume that u has 2a+b > 0 andb > 0. Since u1 is not a fundamental unit, there exists a j ∈ Z where j 6= 0, 1,−1, such that u1 = uj .By minimality, we have0 < x1 ≤ 2a+ b0 < y1 ≤ b.Since 2a+b, b, x1, y1 are all positive integers, and d ≥ 5 (since d ≡ 1 mod 4), this means that u1 > 1 andu > 1. Therefore, we are forced to have j > 0, otherwise u1 < 1. But this contradicts the minimality ofx1, y1.Now we do the case where k ≡ 2, 3 mod 4. Thus d = 4k by 1.1.2. Furthermore, in this case the numberring of K is Z[√k]. As before, we suppose u1 =x1+√dy12 where x1, y1 are the minimal positive solutionsto the Diophantine equation above. Suppose u1 is not a fundamental unit of O∗K . In this case, thefundamental unit u of OK can be expressed in the form a+ b√k. Since d = 4k we write this as 2a+b√d2 .As before, we may suppose that 2a, b are nonnegative integers. Then for some j 6= 0, 1,−1 we haveu1 = uj . Since u1 =x1+2y1√d2 , we have u1 > 1 so as above, j > 1, contradicting minimality.1.2 The Dedekind ζ-function and the Class Number FormulaFor any number field K we have a Dirichlet series∑I 6=0(NI)−swhere the sum is taken over all nonzero ideals I of OK . The notation NI will always denote the idealnorm of the ideal I. That is, NI = |OK : I|.This Dirichlet series converges on the half-plane Re(s) > 1 and can be meromorphically continuedto a function ζK(s) on the half-plane Re(s) > 1 − 1|K:Q| with a simple pole at s = 1. In particularif K = Q this is just the ordinary Riemann-zeta function. This machinery is enough to provide an5analytic formula for the class number of K. Through Dirichlet’s Unit Theorem, each algebraic numberfield K is associated an invariant reg(K), the regulator of K, which is a measure of the density of theunit group in the logarithmic space.Definition 1.2.1 For an algebraic number field K with degree n = r + 2s, set t = r + s− 1. Then let{u1, · · · , ut} be a system of fundamental units of O∗K . Let τ1, · · · , τt+1 be the different embeddings of Kinto R or C (up to conjugation). Then we form a t× (t+1) matrix whose (i, j)th entry is Nj log |τj(ui)|,where Nj = 1 if τj is a real embedding and 2 if it is a complex embedding up to conjugation. Then letR be the absolute value of the determinant of an arbitrary t× t minor of the matrix. Then R is calledthe regulator of K.It can be shown that the determinant of the matrix above is independent of our choice of system offundamental units. In the case of imaginary quadratic fields, we just set the regulator to be 1, becausethere are no fundamental units. In the case of a real quadratic field Q(√k), suppose that we have afundamental unit a+ b√k. In that case, by the definition above, we form a 1× 2 matrixlog |a+ b√k|log |a− b√k| .Since a + b√k is a unit, it has norm 1. Thus log |a + b√k| = − log |a − b√k|, so we obtain the sameabsolute value regardless of which 1 × 1 minor we choose. The absolute value | log |a + b√k|| is theregulator of Q(√k).The connection between the regulator and the ideal class group is given by the class number for-mula.Proposition 1.2.2 (The Class Number Formula)For a number field K with r real embeddings and s pairs of complex embeddings, the class number h(K)is given by(lims→1ζK(s)ζ(s))#µ(K)√|disc(K)|2r+spisreg(K).This quantity(lims→1ζK(s)ζ(s))can now be studied by introducing a more general class of functions.Dirichlet L-functions, which were originally introduced to prove Dirichlet’s theorem on arithmetic pro-6gression, can be used to give an expression for(lims→1ζK(s)ζ(s)). For a Dirichlet character χ : (Z/mZ)∗ →C, we associate the Dirichlet series∞∑n=0χ(n)ns.This converges in the half-plane Re(s) > 1 and can be analytically continued to a meromorphic functionL(s, χ) on the complex plane. Now, if K is an abelian extension of Q with Galois group G = Gal(K/Q),and corresponding character group Gˆ, then G is contained in some cyclotomic extension Q(e2pii/m)by the Kronecker-Weber theorem. Thus each element χ : G → C∗ of the character group Gˆ may beregarded as a Dirichlet character mod m.The term(lims→1ζK(s)ζ(s))can be evaluated in terms of L-functions, at least in the case where K isabelian:Proposition 1.2.3 Let K be an abelian extension of Q contained in the cyclotomic extension Q(e2pii/m).Let G = Gal(K/Q) and let Gˆ be the corresponding character group. For each prime p|m, let fp be theinertia degree of p in K and let rp be the number of distinct primes in K above p. Then(lims→1ζK(s)ζ(s))=∏p|m(1− p−1) (1− p−fp)−rp ∏χ∈Gˆχ 6=1L(1, χ).Proof: See [6].Before continuing, we introduce several important properties of characters.Definition 1.2.4 Let χ : (Z/nZ)∗ → C be a character mod n. Suppose that m is a multiple of n,and so there is a projection pi : (Z/mZ)∗ → (Z/nZ)∗. Then χ is said to induce the character mod mχ′ = χ ◦ pi. A character χ′ mod m is called primitive if it is not induced by a character mod n for anydivisor n of m such that n 6= m. The character χ′ is called even if χ′(−1) = 1 and odd if χ′(−1) = −1.In the case of a quadratic field K = Q(√k) with Galois group G, we are interested in describing the lonenontrivial character χ : G → C∗ by considering an induced cyclotomic character. Let m = |disc(K)|.Then K is contained in the cyclotomic field Q(e2pii/m). Thus Gal(Q(e2pii/m)/Q) projects to Gal(K/Q).The first Galois group is (Z/mZ)∗. Let H ≤ Gal(Q(ω)/Q) be subgroup fixing K, so that Gal(K/Q) ∼=7G/H. If p is an odd prime, with p - m, (and so p is unramified in Q(ω)), then p mod m ∈ H if and onlyif k is a square mod p. Furthermore, if p = 2 then p mod m ∈ H if and only if k ≡ 1 mod 8.So, for the character χ mod m induced by a nontrivial character on (Z/mZ)∗/H, χ takes the value of1 on a ∈ (Z/mZ)∗ if and only if a ∈ H. Otherwise, it takes the value −1. Thus, for odd primes p - mwe obtainχ(p) =1 if k is a square mod p−1 otherwise=(kp)andχ(2) =1 if k ≡ 1 mod 8−1 if k ≡ 5 mod 80 otherwiseSo the character χ in the class number formula is the multiplicative extension of the character above.We introduce the Jacobi symbol. For a ∈ Z and odd b > 0 with (a, b) = 1 with b = pr11 · · · prkk wedefine(ab)=k∏i=1(api)riwhere(api)is the usual Legendre symbol. Thus for odd n we obtainχ(n) =(kn).And with χ(2) defined above, χ is now defined over the positive integers.It is straightforward to show that χ is a primitive (mod m) even character when d > 0 and a primitive(mod m) odd character when d < 0. So with this description in hand, for each fundamental discriminantd we set χd to be the unique nontrivial character above associated to the quadratic field with discriminantd, which is a character mod |d|. Furthermore, we note that this is a primitive character mod |d|. Then8we have the following formula for h(d):Proposition 1.2.5 If d < 0, let µd be the group of roots of unity contained in the unique quadratic fieldwith fundamental discriminant d (if d > 0 then the only roots of unity in the corresponding quadraticfield are 1 and −1). If d > 0 then let ud be a fundamental unit of the unique quadratic field withfundamental discriminant d. Thenh(d) =#µd√|d|2pi L(1, χd) if d < 0√d2| log(|ud|)|L(1, χd) if d > 0.Proof: See [6]Thus, in the case of d < 0, studying h(d) amounts to studying L(1, χd). The growth ofL(1,χd)√|d|as|d| → ∞ depends on the size of the zero-free region of L(s, χd). For d < 0, χd is an odd, real primitivecharacter mod |d|. Hecke proved the following theorem:Proposition 1.2.6 For d < 0 and χ an odd, real primitive character mod |d|, if L(s, χ) 6= 0 in theregion s > 1− clog |d| for s real and c > 0 some fixed absolute constant, thenL(1, χd)1log |d|.This theorem was the first step to solving a long-standing problem of Gauss. Gauss had conjectured thatlimd→−∞ h(d) =∞. This is referred to as Gauss’ conjecture for imaginary quadratic fields. Gauss alsoconjectured that infinitely many real quadratic fields have class number one. That is, he conjecturedthat lim infd→∞ h(d) = 1. This is referred to as Gauss’ conjecture for real quadratic fields.The proof of Gauss’ conjecture was then completed by Heilbronn in 1934, who showed that h(d)→∞as d→ −∞ unconditionally.The behavior of the real quadratic class fields is much harder to study, owing to the difficulty ofcontrolling the term 1| log(|ud|)| . So far, Gauss’ conjecture for real quadratic field has resisted any attemptat an analytic proof (or any other kind of proof). The approach outlined in the following section is9perhaps the current best hope of achieving a full solution. In this model, the sequence of odd parts ofthe real quadratic class groups are treated as if they are subject to a probability measure. Under theassumption that the odd parts of the real quadratic class groups behave as if they were generated by aweighted random process with this probability measure, we can predict the probability that a particularodd part of a real quadratic class group will be trivial (we can predict much more, as we shall see). Theproof of this heuristic assumption, if it exists, must lie very deep.10Chapter 2The Cohen-Lenstra ModelThroughout this chapter we fix the following notation. The letter A will denote a number ring. The letterG will always denote a finite A-module. The notation G will denote the category of finite A-modules.For G a finite A-module, Aut(G) will denote the A-automorphisms of G. Furthermore, if B,C are A-modules, then unless otherwise indicated Hom(B,C) will always denote the A-module homomorphismsfrom B to C, i.e HomA(B,C). The letter I will always denote a nonzero ideal of A. The script letter pwill always denote a prime ideal of A. We present some of the proofs of the propositions which form themodel, in order to give an overview of the methods being used, but we leave some proofs as cited.2.1 Modules over Dedekind DomainsWe review the theory of finite modules over Dedekind domains. A finite module G over A decomposesinto a direct sum of the formG =n⊕i=1A/pmiiwhere p1, · · · , pn are prime ideals of A (not necessarily distinct). The ideal pm11 · · · pnmn of A is aninvariant associated to G. We denote this ideal as χA(G). We refer to χA(G) as the cardinal of G. Notethat in the case where A = Z, then χZ(G) = (#G)Z. In the general case, NχA(G) = #G, where N isthe ideal norm on A. We recall that NI = |A : I| for I 6= 0 an ideal of A. Throughout this chapter,11NI will always denote the ideal norm of I. One advantage of χA is its multiplicativity through exactsequences of finite A-modules. That is, for0→ G1 → G→ G/G1 → 0an exact sequence of finite A-modules we haveχA(G) = χA(G1)χA(G/G1).For a given prime ideal p of A, the p-rank of G is the dimension of G/pG as an A/p-vector space. Notethat this is just the number of direct summands of G of the form A/pm in the decomposition above,with m > 0. We denote the p-rank of G by rp(G). We note that G has a nonzero p-rank if and onlyif p|χA(G). Furthermore, we define the rank of G to be the minimal number of generators of G as anA-module. We denote this by r(G). Finally, for any prime ideal p and G ∈ GA, the notation Gp willdenote the p-part of G. More generally, for any P ⊆ Spec(A) and finite A-module G, the notationGP will denote the P-part of G. If there is only a single prime ideal p|χA(G) then we refer to G as apA-module. More generally, if P ⊆ Spec(A) and the primes dividing χA(G) all lie in P then we refer toG as a PA-module.For each integer k ≥ 1 we introduce a k-weight to each finite A-module GDefinition 2.1.1wk(G) :=#{ϕ ∈ HomA(Ak, G) : ϕ surjective}(#G)k(#Aut(G)).w(G) :=1#(Aut(G)).The purpose of introducing wk(G) is for ease of calculation. We will want to calculate the averagesof certain functions f : GA → C in which each G ∈ GA is weighted by w(G). It is usually easier toweight each G ∈ GA by wk(G) and then let k → ∞, since, as we shall see, limk→∞ wk(G) = w(G) forall G ∈ GA.We note that #{ϕ∈HomA(Ak,G):ϕ surjective}(#Aut(G)) is simply #{J ≤ Ak : Ak/J ∼= G}, because two surjectivehomomorphisms ϕ1, ϕ2 : Ak → G have the same kernel if and only if ϕ2 = φ ◦ ϕ1 for some φ ∈ Aut(G).12We also note that #{ϕ ∈ HomA(Ak, G) : ϕ surjective} = 0 if k < rp(G) for any prime ideal p ofA.Finally, we note that w(G) induces a local probability measure on GA in the following sense. For eachp set GA,p to be the set of finite pA-modules. Then for any M ⊆ GA,p, the probability that M occurs(with respect to w(G)) is just∑G∈M (#(Aut(G)))−1∑G∈GA,p(#(Aut(G)))−1.This makes sense because∑G∈GA,p1#(Aut(G)) < ∞, which we shall see in the next section when weregard the weights w(G) as Dirichlet coefficients.Because the ideals of a number ring A are ordered by their norms, we can talk about the density of aparticular subset of the set of all finite A-modules by ordering these modules by their cardinals. In viewof this we introduce the following notations:Definition 2.1.2 For any function f : GA → C and nonzero ideal I of A we introduce the notation∑G(I)f(G) :=∑G an isomorphism classχA(G)=If(G).Definition 2.1.3 For I a nonzero ideal of A and k ≥ 1 an integer we setwk(I) :=∑G(I)wk(G)andw(I) :=∑G(I)w(G).Definition 2.1.4 For any function f : GA → C and k ≥ 1 an integer we define the CL-k-average of fto beMk,A(f) = limx→∞∑NI≤x(NI)−1∑G(I)ϕ∈Hom(A,G)wk(G)f(G/im(ϕ))∑NI≤x(NI)−1∑G(I)ϕ∈Hom(A,G)wk(G)13and we define the CL-average of f to beMA(f) = limx→∞∑NI≤x(NI)−1∑G(I)ϕ∈Hom(A,G)w(G)f(G/im(ϕ))∑NI≤x(NI)−1∑G(I)ϕ∈Hom(A,G)w(G).If the ring A is clear from from the context, we just write Mk(f) and M(f) respectively. Of course,neither Mk(f) nor M(f) is guaranteed to exist for a particular function f , so the above definitionsapply only if Mk(f) and M(f) exist respectively. The original paper of Cohen and Lenstra (see [5])introduces a more general notion of a (k, u)-average with applications in calculating the ranks of groupsof units, but the full generality of the model is not necessary for our purposes. If P is a property of afinite A-module and fP : GA → C is the corresponding indicator function, which takes the value of 1if G satisfies P and 0 otherwise, then we refer to Mk(fP ) as the CL-k-probability of P and we refer toM(fP ) as the CL-probability of P .We note now that the denominator of Mk(f) is simply∑NI≤xwk(I)and likewise for k = ∞, with w(I) replacing wk(I). This is because for χA(G) = I, #HomA(A,G) =#G = NI.The goal of the next section will be to express Mk(f) and M(f) in terms of Dirichlet series. To do thiswe must first provide formulae for wk(G) and wk(I).Proposition 2.1.5 For any k ≥ 1 and G ∈ GA with χA(G) = Ii)wk(G) =1#Aut(G)∏p|I∏ki=k−rp(G)+1(1−Np−i) if k ≥ rp(G)0 otherwiseii)w(G) = limk→∞wk(G).Proof: We first note that if p1, · · · , pj is the list of primes dividing I then14wk(G) =j∏i=1wk(Gpi)since choosing a surjective homomorphism ϕ : Ak → G just amounts to choosing a surjective homomor-phism ϕ : Ak → Gpi for each 1 ≤ i ≤ j, likewise for A-automorphisms of G. So it suffices to prove theformula in i) for a pA-module G. So let p be the lone prime dividing I and set r = rp(G). It is clearthat if k < r then wk(G) = 0 as there are no surjective homomorphisms from Ak to G in that case. Sowe suppose k ≥ r.Now we note that ϕ : Ak → G induces a A/p-vector space homomorphism ϕ : (A/p)k → G/pG. Tosee this, we let pi ◦ ϕ : Ak → G/pG be the homomorphism from Ak to G/pG induced by projection.Now, note that for each 1 ≤ i ≤ k, pi ◦ ϕ induces an A-module homomorphism (pi ◦ ϕ)i : A → G/pGvia (pi ◦ ϕ)i(x) = (pi ◦ ϕ)(0, · · · , x, · · · , 0), where (0, · · · , x, · · · , 0) ∈ Ak is the k-tuple with x in the ithcoordinate. Of course, the annihlator AnnA(G/pG) ⊇ p. Thus (pi ◦ ϕ)i factors through A/p, and sowe get a well defined homomorphism (pi ◦ ϕ)i : A/p → G/pG for each 1 ≤ i ≤ k. Thus we obtain awell-defined homomorphism ϕ : (A/p)k → G/pG.It is clear that if ϕ is surjective then so is ϕ. The converse is also true. If ϕ : (A/p)k → G/p is asurjective A/p-vector space homomorphism and ϕ : Ak → G is an A-module homomorphism whichinduces ϕ, then ϕ is also surjective.To see this, we note that if ϕ is surjective, then there exist some {h1, · · · , hs} ∈ ϕ(Ak), where s =#G#pG ,which form a complete system of residue classes mod pG. Now suppose toward a contradiction thatthere is some g ∈ G with g 6∈ ϕ(Ak). Then we may write g = hi1 + p1g1 for some p1 ∈ p and g1 ∈ G.Thus p1g1 6∈ ϕ(Ak). We repeat this process. We write g1 = hi2 +p2g2 for some p2 ∈ p and g2 ∈ G. Thusp1p2g2 6∈ ϕ(Ak). Continuing in this way, we obtain a sequence of elements p1g1, p1p2g2, p1p2p3g3, · · · ,each lying outside ϕ(Ak), where the nth element lies in pnG. But this is clearly false. Since G is a finitepA-module, there exists some n such that pnG = 0.Thus we conclude that ϕ is surjective if and only if ϕ is surjective. Furthermore, for each ϕ : (A/p)k →G/pG, the number of lifts of ϕ to a surjective homomorphism from Ak to G is just #{φ ∈ Hom(Ak, G) :φ = 0}, since two surjective homomorphisms ϕ1, ϕ2 : Ak → G pass to the same quotient homomorphismif and only if they lie in the same coset of Hom(Ak, G) modulo the submodule {φ ∈ Hom(Ak, G) : φ =0}.15Thus we get an equality between #{ϕ ∈ Hom(Ak, G) : ϕ surjective} and #{ϕ ∈ HomA/p((A/p)k, G/pG) :ϕ surjective}#{φ ∈ Hom(Ak, G) : φ = 0}, as required.Now we count #{ϕ ∈ HomA/p((A/p)k, G/pG) : ϕ surjective}. G/pG is just a A/p-vector space ofdimension r. So we want to count the k× r matrices X which have entries in A/p and which have rankr. So there areNpk choices for the first column vector v1 ofX, where v1 ∈ (A/p)k, sinceNpk = #(A/p)k.Now we choose a second column vector v2 ∈ (A/p)k linearly independent to v1. There are Npk − Npsuch choices, as we just avoid all the scalar multiples of v1. Now we choose a v3 ∈ (A/p)k which isindependent to v1, v2. There are Npk −Np2 such choices, as we just avoid all A/p-linear combinationsof v1, v2, of which there are Np2. Thus continuing in this way we obtain#{ϕ ∈ HomA/p((A/p)k, G/pG) : ϕ surjective} =∏1≤i≤r(Npk −Npi).Now we consider the other term. #{φ ∈ Hom(Ak, G) : φ = 0} is just the set of φ ∈ Hom(Ak, G) suchthat φ(v) ∈ pG for all v ∈ Ak. The number of A-module homomorphisms φ : A → pG is just #pG, sothis quantity is just(#pG)k =#Gk#(G/pG)k=(NI)k(Np)kr.Thus#{ϕ ∈ Hom(Ak, G) : ϕ surjective} =(NI)k(Np)kr∏1≤i≤r(Npk −Npi).Furthermore:∏1≤i≤r(Npk −Npi) = Npkrk∏i=k−r+1(1−Np−i).Thuswk(G) =#{ϕ ∈ Hom(Ak, G) : ϕ surjective}w(G)#Gk=1w(G)k∏i=k−r+1(1−Np−i)16since #G = NI. This completes the proof of i). Letting k →∞, the product∏ki=k−r+1(1−Np−i)→ 1,so ii) follows immediately.Now that we have provided a formula for wk(G), we would like to provide a formula for wk(I) for I anonzero ideal of A. We immediately note that wk is multiplicative in coprime ideals of A since for I, Jcoprime and G,H finite A-modules with χA(G) = I and χA(H) = J , wk(G)wk(H) = wk(G ⊕H) andso∑χ(G)=Iwk(G)∑χ(H)=Jwk(H) =∑χA(K)=IJwk(K).Thus it suffices to provide a formula for wk(pm).Proposition 2.1.6 For any m ≥ 0 and k ≥ 1i)wk(pm) = Np−m(k+m−1∏i=k(1−Np−i))(m∏i=1(1−Np−i)−1)ii)w(pm) = Np−m(m∏i=1(1−Np−i)−1).Proof: i) See [5].ii) now follows directly from i) by letting k →∞ and applying proposition The Weighted ζ-FunctionSince wk(I) is multiplicative in coprime ideals of A, it is natural for us to consider the correspondingDirichlet series associated to each k ≥ 1:17ζk,w(s) :=∑I 6=0wk(I)(NI)−sand likewiseζ∞,w(s) :=∑I 6=0w(I)(NI)−s.We will see below that both of these Dirichlet series converge in the half plane Re(s) > 0. The notationζk,w(s) indicates that this is a weighted ζ-function with respect to the weight wk(I) given to eachideal. By multiplicativity, we have the following Euler product formula for these ζ-functions in the halfplane:ζk,w(s) =∏p prime(∞∑a=0wk(pa)Np−a).ζ∞,w(s) =∏p prime(∞∑a=0w(pa)Np−a).The task now is to find a product formula for each p-term in the product above. Consider the followingDirichlet series:∞∑a=0wk(pj)(Np)−as.To find a formula for this, we first must provide a recursive formula for wk+1(pa) in terms of wk(pa):Proposition 2.2.1wk+1(pa) =a∑j=0(Np)−jkw1(pj)wk(pa−j).Proof: Let G be an A-module with χA(G) = pa. We begin by writing#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective} =∑H≤G#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective and ϕ(Ak) = H}.18We can rewrite this as∑H≤G∑φ∈Hom(Ak,H)φ surjective#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective and ϕ|Ak = φ}.For a fixed H and fixed surjective φ ∈ Hom(Ak, H), we have the following correspondence between thetwo sets {ϕ ∈ Hom(Ak+1, G) : ϕ surjective and ϕ|Ak = φ} and {ψ ∈ Hom(A,G/H) : ψ surjective}. Foreach ϕ : Ak+1 → G with ϕ|Ak = φ, we restrict ϕ to the last coordinate of Ak+1 and then project toG/H. Each surjective ψ : A→ G/H then arises from exactly #H choices of ϕ : Ak+1 → G in this way.Thus#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective and ϕ|Ak = φ} = (#H)#{ψ ∈ Hom(A,G/H) : ψ surjective}.Thus#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective} =∑H≤G(#H)∑φ∈Hom(Ak,H)φ surjective#{ψ ∈ Hom(A,G/H) : ψ surjective}.Now since {ψ ∈ Hom(A,G/H) : ψ surjective} is just #Aut(G/H)#(G/H)w1(G/H) (and #(G/H) =#G#H ) and likewise #{φ ∈ Hom(Ak, H) : φ surjective} is just #Aut(H)(#H)kwk(H), we obtain#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective} = #G∑H≤G#Aut(G/H)w1(G/H)#Aut(H)wk(H)(#H)k.Thus, since wk+1(G) = (#G)−(k+1)#Aut(G)−1#{ϕ ∈ Hom(Ak+1, G) : ϕ surjective}wk+1(G) =1#Aut(G)∑H≤G#(G/H)−k#Aut(G/H)w1(G/H)#Aut(H)wk(H).So now we sum over isomorphism classes of G with χA(G) = pa:19wk+1(pa) =∑G(pa)1#Aut(G)∑H≤G#(G/H)−k#Aut(G/H)w1(G/H)#Aut(H)wk(H).We can interchange the order of summation and then rewrite this as a sum over A-ideals dividing pa,writing C = G/H and K = H to obtaina∑j=0(Np)−jk∑C(pj)#Aut(C)w1(C)∑K(pa−j)Aut(K)wk(K)×∑G(pa)1#Aut(G)#{H ≤ G : H ∼= K and G/H ∼= C} .We emphasize that the rightmost term in this sum depends upon the two independent sums over C(pj)and K(pa−j). From a lemma in [5] we have the following∑G(pa)1#Aut(G)#{H ≤ G : H ∼= K and G/H ∼= C} = (#Aut(K))−1(#Aut(C))−1.Thuswk+1(pa) =a∑j=0(Np)−jk∑C(pj)w1(C)∑K(pa−j)wk(K)=a∑j=0(Np)−jkw1(pj)wk(pa−j).With this recursive formula in hand, the product form of the Dirichlet series above is a corollary:Corollary 2.2.2 For any nonzero prime ideal p of A and Re(s) > −1∞∑a=0wk(pa)(Np)−as =k∏j=1(1−Np−j−s)−1and20∞∑a=0w(pa)(Np)−as =∞∏j=1(1−Np−j−s)−1.Proof: We prove this by induction on k. The base case is k = 1.w1(pa) = Np−a.∞∑a=0w1(pa)Np−as =∞∑a=0Np−a(s+1) = (1−Np−(s+1))−1.This is justified since −(s+ 1) < 0.Now suppose the induction hypothesis is true in the kth case.∞∑a=0wk(pa)(Np)−as =k∏j=1(1−Np−j−s)−1.Now we apply proposition 2.2.1:∞∑a=0wk+1(pa)(Np)−as =∞∑a=0a∑j=0(Np)−jkw1(pj)wk(pa−j)Np−as.Applying proposition 2.1.6 we obtain:w1(pj) = Np−j .Thus∞∑a=0wk+1(pa)(Np)−as =∞∑a=0a∑j=0wk(pa−j)Np−j(k+1)Np−as.Now interchange the order of summation to obtain21∞∑j=0Np−j(k+1)∞∑a=jwk(pa−j)Np−as =∞∑j=0Np−j(k+1)∞∑r=0wk(pr)Np−(r+j)s=∞∑j=0Np−j(k+1+s)∞∑r=0wk(pr)Np−rs = (1−Np−(k+1+s))−1∞∑r=0wk(pr)Np−rswhich, by our induction hypothesis, equals(1−Np−(k+1+s))−1k∏j=1(1−Np−j−s)−1 =k+1∏j=1(1−Np−(j+s))−1.This completes the induction proof. Thus for all k ≥ 1 we obtain∞∑a=0wk(pa)(Np)−as =k∏j=1(1−Np−j−s)−1.Letting k → ∞, absolute converge of the product justifies us taking the limit inside the sum to ob-tain∞∑a=0w(pa)(Np)−as =∞∏j=1(1−Np−j−s)−1.From this we conclude the following:Corollary 2.2.3 For Re(s) > 0, the Dirichlet series ζk,w(s) and ζ∞,w(s) converge, and:i)ζk,w(s) =k∏j=1ζA(s+ j)ii)ζ∞,w(s) =∞∏j=1ζA(s+ j)where ζA(s) is the usual Dedekind ζ-function of A.22Proof: Applying the product formula for ζk,w and ζ∞,w and the result above we obtainζk,w(s) =∏p primek∏j=1(1−Np−j−s)−1 =k∏j=1ζA(s+ j).This clearly converges for Re(s) > 0, since each ζA(s + j) converges for j ≥ 1. The result for ζ∞,w(s)follows by letting k →∞.Thus we have formulae for the weighted ζ-functions ζw,k and ζw,∞ purely in terms of the values of theusual Dedekind ζ-function ζA. Note that from 2.2.2 we obtain the result we claimed in the previoussection, namely that∑G∈GA,p1#(Aut(G))<∞since this is just∞∑a=0w(pa) =∞∏j=1(1−Np−j)−1.Now we continue with the goal of finding ζ-function formulae for Mk(f) and M∞(f).Definition 2.2.4 For f : GA → C and k ≥ 1 an integer and I a nonzero ideal of A we setwk(f, I) =∑G(I)wk(G)f(G).Now we introduce the following ζ-functions weighted with respect to the function f :Definition 2.2.5 For k ≥ 1 and f : GA → C we set:ζf,k(s) :=∑I 6=0wk(f, I)(NI)−sand23ζf,∞(s)L =∑I 6=0w(f, I)(NI)−s.This is a purely formal notion, since we cannot guarantee the convergence of ζk,f or ζ∞,f anywherein C for an arbitrary f : GA → C, but we will be interested in sufficiently nice functions f that theircorresponding ζ-functions ζf,k and ζf,∞ converge in the half-plane.Affer some work we arrive at the following ζ-function formula for the k-average of a function f : GA →CProposition 2.2.6 For any k ≥ 1 and f : GA → C, if Mk(f) exists then the following holds:Mk(f) =ζf,k(1)ζw,k(1)and likewise if M(f) exists thenM(f) =ζf,∞(1)ζw,∞(1).Proof: See [5].We can now use our formula for M(f) to calculate the averages of functions f : GA → C with respectto the weighting w(G). Some examples are done below:Proposition 2.2.7 Let P ⊆ Spec(A), Let L be a fixed finite PA-module, and let Q denote the propertythat #GP ∼= L as an A-module. Let fQ denote the corresponding indicator function. Then the CL-probability that G satisfies Q, which is M(fQ), is given by1#L#(Aut(L))∏p∈P∞∏j=2(1−Np−j).Proof: We calculate the coefficients of ζfQ,∞(s). We have24w(fQ, I) =∑G(I)G(Q)w(G)where the sum∑G(I)G(Q)is just taken over all G with χA(G) = I which satisfy property Q. Let J1 denotethe P-part of the ideal I and J2 the coprime-to-P part of I. Thus by multiplicativity:w(fQ, I) =∑H(J2)w(H)∑K(J1)K(fQ)w(K) .Now note∑K(J1)K(fQ)w(K) =w(L) if χA(L) = J10 otherwise.Thus∑I 6=0w(fQ, I)(NI)−s =∑J 6=0J prime to Pw(J)(NJ)−sw(L)(NχA(L))−s.ThusζfQ,∞(1) =∑J 6=0J prime to Pw(J)(NJ)−1w(L)(NχA(L))−1.Whereas, again apply multiplicativity:ζw,∞(1) =∑I 6=0w(I)(NI)−1 =∑I 6=0I prime to Pw(I)(NI)−1∑I 6=0p|I→p∈Pw(I)(NI)−1 .The rightmost term is simply the sum taken over all ideals I whose prime factors lie among P.For the rightmost term we apply multiplicativity one more time and write the sum as a product:25∑I 6=0p|I→p∈Pw(I)(NI)−1 =∏p∈P(∞∑a=0w(pa)Np−a).Now we just apply corollary 2.2.2:∑I 6=0p|I→p∈Pw(I)(NI)−1 =∏p∈P∞∏j=1(1−Np−j−1)−1.ThusζfQ,∞(1)ζw,∞(1)=w(L)N(χA(L))∏p∈P∞∏j=1(1−Np−j−1)=1#L#(Aut(L))∏p∈P∞∏j=1(1−Np−j−1).This calculation demonstrates the usefulness of having a ζ-function expression for the average of afunction f with respect to this probability measure. We do one more example:Proposition 2.2.8 Fix a nonzero prime p of A. The CL-probability that Gp 6= 0 is1−∞∏j=2(1−Np−j).Proof: Let Q denote the property that Gp = 0 and let fQ denote the corresponding indicator function.As before, we calculate the coefficients of ζfQ,∞(s).w(fQ, I) =∑G(I)f(G)w(G).Let J denote the prime-to-p part of I and pm the p part of I (where k might be zero), so by multiplica-tivity we obtain26w(fQ, I) =∑H(J)w(H)∑K(pm)f(K)w(K) .Of course, the rightmost term is just∑K(pm)f(K)w(K) =1 if m = 00 otherwise.ThusζfQ,∞(1) =∑I 6=0w(fQ, I)(NI)−1 =∑I 6=0p-Iw(I)(NI)−1.Whereas, applying multiplicativity of w(I) and corollary 2.2.2ζw,∞(1) =∑I 6=0p-Iw(I)(NI)−1∞∑j=0w(pj)Np−j=∑I 6=0p-Iw(I)(NI)−1∞∏j=1(1−Np−j−1)−1 .ThusζfQ,∞(1)ζw,∞(1)=∞∏j=1(1−Np−j−1).With the tools of the previous section in hand, we will see how we can make calculations about thebehavior of the odd parts of the real quadratic class groups. We introduce the heuristic assumption inthe next section.272.3 Applying The Heuristic Assumption of the ModelIn order to introduce the heuristic assumption of the Cohen-Lenstra model, we will need to introducesome notions from commutative algebra. To each abelian group Γ of order n, we associate a ring AΓwhich is a finite product of rings of integers of number fields. Consider the Q-algebra Q[Γ]/(∑g∈Γ g).The group ring Q[Γ] is a semisimple Q-algebra by Maschke’s theorem. Thus by Artin-Wedderburn, Q[Γ]is (as a Q-algebra) isomorphic to a product of the form R1×· · ·×Rk, where each Ri is a Q-algebra of theform Mni(∆i), where ∆i is a division ring over Q and Mni(∆i) is the Q-algebra of all ni × ni matricesover ∆i. Of course, since Q[Γ] is commutative, this means ∆i is a number field and each ni = 1, i.eQ[Γ] is a finite product of number fields.∑g∈Γ g is a primitive central idempotent of Q[Γ], so under the Wedderburn decomposition of Q[Γ]it corresponds to a tuple having 1 in one position, say the ith, and 0 in all other positions, and soQ[Γ]/(∑g∈Γ g)is again a finite product of number fields. If Γ is the Galois group of an abelianextension K/Q, then Cl(K) is a Γ-module in the usual way. The idempotent∑g∈Γ g annihlates theprime-to-|Γ| part of Cl(K), so Cl(K) is a Z[Γ]/(∑g∈Γ g)-module.Now we introduce the following definitions:Definition 2.3.1 Let V be a finite-dimensional Q-vector space. Then we define a full Z-lattice in Vto be a finitely generated Z-submodule M of V such that QM = V , where QM is the set of all finiteQ-linear combinations of M .Definition 2.3.2 Let A be a Q-algebra. A Z-order in A is a unital subring Λ of A such that Λ is a fullZ-lattice in A. A maximal Z-order in A is a Z-order not properly contained in any other Z-order of A.We will now use the theory of Z-orders to develop the Cohen-Lenstra model. We want to show thatQ[Γ]/(∑g∈Γ g)has a unique maximal Z-order, and that this unique maximal Z-order is a finite productof number rings. We first note that if Λ is a Z-order in a Q-algebra A, then every element of Λ is integralover Z. In particular, if K is a number field, then the number ring OK is the unique maximal Z-orderin K. For our case, as Q[Γ]/(∑g∈Γ g)is a finite product of number fields, the unique maximal Z-orderin Q[Γ]/(∑g∈Γ g)is just the corresponding product of number rings.Thus we conclude that for Γ a finite abelian group, there is a unique maximal Z-order in Q[Γ]/(∑g∈Γ g)and this unique maximal Z-order is a finite product of number rings. Let AΓ denote this unique maximalZ-order. Let Γ be an abelian group of N , and r1, r2 ≥ 0 integers with r1 + 2r2 = N , let FN,r1,r2 denote28the family of abelian extensions of Q with Galois group Γ, r1 real embeddings and r2 pairs of complexembeddings up to conjugation.We can order the fields in FN,r1,r2 by the absolute value of their discriminants, where the order amongthe finite list of fields in FN,r1,r2 having the same discriminant d does not matter. For K ∈ FN,r1,r2let H(K) denote the prime-to-N part of the class group Cl(K). Then, as indicated above, H(K) isan AΓ-module. For f a complex-valued function defined on the set of isomorphism classes of finiteAΓ-modules, we setE(f) = limX→∞∑K∈FΓ,r1,r2|disc(K)|≤Xf(H(K)))∑K∈FΓ,r1,r2|disc(K)|≤X1.That is, E(f) is just the usual notion of the average value of f(H(K)) taken over all the fields inK ∈ FΓ,r1,r2 .Before introducing the heuristic assumption of the Cohen-Lenstra model, we need one more definition.Previously, we introduced the definition of the CL-average of a function f : GA → C, where GA is thecategory of finite modules over a fixed number A. To state the heuristic assumption, we must extendthis definition to functions defined on finite modules over finite products of number rings.Definition 2.3.3 Let A1, · · · , An be number rings and let A =⊕ni=1Ai and GA denote the categoryof finite A-modules. Let f : GA → C and for each 1 ≤ i ≤ n let fi : Ai → C be the correspondingcoordinate function. Then the CL-average of f , denoted MA(f) is defined to beMA(f) :=n∏i=1MAi(fi)provided that each MAi(fi) exists.We now have enough definitions in hand to introduce the Cohen-Lenstra heuristic.Definition 2.3.4 Let FΓ,N,0 be the family of all totally real extensions K/Q of degree N (that is, r2 = 0and r1 = N) having Galois group Γ, where Γ is an abelian group of order N . Let AΓ be the category ofall finite AΓ-modules. Then the Cohen-Lenstra heuristic states that for all reasonably-behaved functionsf : AΓ → C, E(f) is equal to the CL-average MAΓ(f) of f .Intuitively, this heuristic assumption says that we should expect abelian groups with more automor-29phisms to be less likely to appear as class groups of abelian extensions of Q. At the end of this chapter, wewill give some justification for this heuristic, and make more precise the notion of “reasonably-behavedfunctions”.In the quadratic case, where Γ = Z/2Z, the ring Q[Γ] is just Q[x]/(x2 − 1) ∼= Q ⊕ Q, and soQ[Γ]/(∑g∈Γ g)∼= Q so the corresponding maximal order AΓ is just Z.The heuristic assumption implies that if f is the characteristic function of some property P of finiteAΓ-modules, then M(f) is equal to the CL-probability of P . We note from the definition above thatin this case, M(f) is the probability that a field K ∈ FΓ,r1,r2 selected at random is such that theAΓ-module H(K) has property P .Thus, for example, the heuristic assumption of the model, together with proposition 2.2.7 of the previoussection, imply that the probability that a real quadratic field selected at random has a trivial odd partis2∞∏j=1(1− 2−j)−1∏p prime∞∏j=2(1− p−j) = 2∞∏j=1(1− 2−j)−1∞∏j=2ζ(j)−1 ≈ 0.75446.We can generalize this to the following:Proposition 2.3.5 Let ` be an odd positive integer and L be an abelian group of order `. Then theprobability that a real quadratic field selected at random has odd part isomorphic to L is12`(#Aut(L))∞∏j=2ζ(j)−1∞∏j=2(1− 2−j)−1.Proof: Apply proposition 2.2.7 to the ring A = Z and the set of primes Spec(Z) \ {2}.In particular, if p is an odd prime, then the probability that a a real quadratic field selected at randomhas odd part of cardinality p is12p(p− 1)∞∏j=2ζ(j)−1∞∏j=2(1− 2−j)−1 ≈0.754462p(p− 1).30We can also compute the probability that a real quadratic field selected at random has a class groupwith a cyclic odd part.Proposition 2.3.6 Let A be a number ring and let P ⊆ Spec(A). Let Q be the property of a finiteA-module G that the P-part of G is cyclic. Then the CL-probability of Q is∏p∈P1 + (Np)−1 − (Np)−3(1− (Np)−2)(1− (Np)−1)∞∏j=2(1−Np−j).Proof: Let fQ be the indicator function corresponding to Q. As before, we calculute the coefficientsof ζfQ,∞(s).w(fQ, I) =∑G(I)G(Qw(G).Here, the sum∑G(I)G(Q)is taken over all finite A-modules G with χA(G) = I which satisfy property Q. LetJ1 denote the P-part of the ideal I and J2 the coprime-to-P part of I. Thus by multiplicativityw(fQ, I) =∑H(J2)w(H)∑K(J1)K(Q)w(K) .Now note∑K(J1)K(Q)w(K) = w(A/J1) =1#(Aut(A/J1))=1#((A/J1)∗).Here, (A/J1)∗ is the multiplicative group of A/J1.ThusζfQ,∞(1) =∑I 6=0w(fQ, I)(NI)−1 =∑I 6=0I prime to Pw(I)(NI)−1∑I 6=0p|I→p∈P(#(A/I)∗)−1(NI)−1 .Whereas31ζw,∞(1) =∑I 6=0w(I)(NI)−1 =∑I 6=0I prime to Pw(I)(NI)−1∑I 6=0p|I→p∈Pw(I)(NI)−1 .Now, for I, J coprime ideals, we note that #(A/IJ)∗ = #(A/I)∗#(A/J)∗ by the Chinese Remaindertheorem. Thus, we may write∑I 6=0p|I→p∈P(#(A/I)∗)−1(NI)−1 as an Euler product:∑I 6=0p|I→p∈P(#(A/I)∗)−1(NI)−1 =∏p∈P(∞∑m=0(#(A/pm)∗)−1Np−m).To determine∑∞m=0(#(A/pm)∗)−1Np−m, we note that we can calculuate #(A/pm) with the generalizedEuler totient function ϕA (where ϕA(I) gives the number of invertible elements of A/I. We first note that(A/p)∗ has a cardinality of Np−1, as A/p is a field, and furthermore, the projection map (A/pm)→ A/pis such that a + pm ∈ (A/pm)∗ if and only if a + p ∈ (A/p)∗. Furthermore, each a + p ∈ (A/p)∗ hasexactly Npm−1 preimages in (A/pm)∗. Thus we conclude thatϕA(pm) = Npm−1(Np− 1)which is what we expect. We need to be somewhat careful here. This formula is valid if m ≥ 1. Ifm = 0 then of course the trivial A-module has an automorphism group of size 1. We take this intoaccount below.Thus∞∑m=0(#(A/pm)∗)−1Np−m = 1 + (Np− 1)−1∞∑m=1Np1−2m= 1 +Np(Np− 1)Np−2(1−Np−2)= 1 +Np−2(1− (Np)−1)(1− (Np)−2)=(1− (Np)−1)(1− (Np)−2) +Np−2(1− (Np)−1)(1− (Np)−2)=1− (Np)−1 + (Np)−3(1− (Np)−1)(1− (Np)−2).32ζfQ,∞(1)ζw,∞(1)=∏p∈P1− (Np)−1 + (Np)−3(1− (Np)−1)(1− (Np)−2)∞∏j=1(1−Np−j−1).Here, we have again used the fact that∑I 6=0I prime to Pw(I)(NI)−1 =∏p6∈P∞∏j=1(1−Np−j−1)and likewise for the set of ideals I with all prime divisors in P. Thus, the CL-probability that a finiteA-module has cyclic P-part is∏p∈P1− (Np)−1 + (Np)−3(1− (Np)−1)(1− (Np)−2)∞∏j=2(1−Np−j).Thus, in the real quadratic case, where AΓ = Z, we see that the probability of selecting a real quadraticfield at random which has a class group with cyclic odd part is12 ·341− 12 +18∞∏j=2(1− 2−j)−1∏p prime1− p−1 + p−3(1− p−1)(1− p−2)∞∏j=2(1− p−j)=35∞∏j=2(1− 2−j)−1∞∏j=2ζ(j)−1∏p prime1− p−1 + p−3(1− p−1)(1− p−2).∏p prime1− p−1 + p−3(1− p−1)(1− p−2)= ζ(2)∏p prime(1 +p−31− p−1).So the probability that a real quadratic calss field selected at random will be cyclic (i.e have rank ≤ 1)is3 · ζ(2)5∞∏j=2ζ(j)−1∞∏j=2(1− 2−j)−1∏p prime(1 +1p3(1− p−1))which is approximately 0.33267.33We can go further than this and ask what is the probability that a real quadratic class group selectedat random will have rank r, but producing a formula like the one in 2.3.3 is not possible in the generalcase. The local version of this question is answerable, however. Fix a prime p. Then we can find theprobability that a real quadratic class group selected at random has a p-rank of r.Proposition 2.3.7 For a number ring A and a fixed prime p of A and an integer r ≥ 0, the CL-probability that a finite A-module G has rp(G) = r isNp−r(r+1)∞∏j=r+11−Np−jr+1∏j=1(1−Np−j)−1 .Proof: See [5].2.4 Justification for the Cohen-Lenstra HeuristicIn order to give some heuristic justification for 2.3.3, it is convenient to pass between the language of idealclasses and the language of binary quadratic forms to describe the class group of a number field. Recallfrom the first section that a binary quadratic form is a homogeneous polynomialQ(x, y) = ax2+bxy+cy2.We will pass between the quadratic form ax2 +bxy+cy2 and the tuple of coefficients (a, b, c). An integralquadratic form is a binary quadratic form with integer coefficients. An integral quadratic form (a, b, c)is called primitive if a, b, c are coprime. From now on in this section, all binary quadratic forms unlessotherwise stated will be integral quadratic forms.Definition 2.4.1 Let P (X,Y ) and Q(A,B) be two binary quadratic forms. Then P (X,Y ) and Q(A,B)are called equivalent if there exists a linear transformation T ∈ SL2(Z) such that P (X,Y ) = Q(TX, TY ).That is, there are m,n, p, q such that P (X,Y ) = Q(mX + nY, pX + qY ) and mq − np = ±1.It is clear that this defines an equivalence relation on the binary quadratic forms. It is also clear thatthis defines a group action of SL2(Z) onto {Q(x, y) ∈ Z[X,Y ] : deg(Q) = 2, Q,homogeneous} where(T,Q(X,Y ))→ Q(TX, TY ).Recall that a binary quadratic form Q(x, y) = ax2 + bxy + cy2 has discriminant D(Q) = b2 − 4ac.Clearly, for any binary quadratic form Q, D(Q) ≡ 0 mod 4 or D(Q) ≡ 1 mod 4. Furthermore, it34is clear that any integer a which is 0 or 1 mod 4 is of the form D(Q), since a differs from a perfectsquare by a multiple of 4. It is also clear that two equivalent binary quadratic forms represent the sameinteger.Proposition 2.4.2 Let P (X,Y ) and Q(A,B) be two equivalent binary quadratic forms. Then D(P ) =D(Q).Proof: Suppose that P (X,Y ) = Q(mX +nY, pX + qY ), where m,n, p, q are integers with mq−np =±1. Suppose that Q(X,Y ) = aX2 + bXY + cY 2. ThusP (X,Y ) = a(mX + nY )2 + b(mX + nY )(pX + qY ) + c(pX + qY )2= (am2 + bmp+ cp2)X2 + (2amn+ bmq + bnp+ 2cpq)XY + (an2 + bnq + cq2)Y 2.ThenD(Q) = (2amn+ bmq + bnp+ 2cpq)2 − 4(am2 + bmp+ cp2)(an2 + bnq + cq2)= (b2 − 4ac)(mq − np)2 = b2 − 4ac= D(P ).Thus, we may speak of the discriminant of an equivalence class of binary quadratic forms (i.e a SL2(Z)-orbit of {Q(x, y) ∈ Z[X,Y ] : deg(Q) = 2, Qhomogeneous}).Gauss defined an operation called composition on the set of equivalence calsses of binary quadratic formshaving a fixed discriminant D. The deepest insight of Gauss was that for any discriminant D, there areonly finitely many equivalence classes of binary quadratic forms representing D, and this finite familyof equivalence classes forms a group under composition. This group is isomorphic to the class group ofQ(√D).35Definition 2.4.3 A binary quadratic form (a, b, c) of discriminant D < 0 is called reduced if thefollowing conditions hold:|√D − 2|a|| < b <√D if D > 0|b| ≤ a ≤ cb ≥ 0 if |b| = a or a = cif D < 0.For a fixed discriminant D, we define RD to be the set of reduced forms of discriminant D. An elementof RD may be written as (a, b), since c = b2−D4a .Proposition 2.4.4 RD has only finitely many elements.Proof: Let (a, b, c) be a form of discriminant D.We first do the case where D > 0. Suppose that |√D − 2|a|| < b <√D. Thus (√D − 2|a|)2 < D andso D + 4|a|2 − 4|a|√D < D. Thus 4|a|2 < 4|a|√D, and so |a| <√D. From this it follows immediatelythat RD is finite, since there are only finitely many choices of a, and for each a there are only finitelymany choices of b, and for each choice of a, b there is at most one choice of c. However, more is true inthis case. We haveD + 4|a|2 − 4|a|√D < b2 < D.Subtracting D from all three terms we get4|a|2 − 4|a|√D < 4ac < 0.So taking absolute values and reversing the inequalities4|a|√D − 4|a|2 > 4|a||c|.Thus36√D > |c|+ |a|.Now we do the case where D < 0. In this case, 0 ≤ a ≤ c. Since D = b2 − 4ac, we have D ≤ b2 − 4a2.Since b2 ≤ a2, we have D ≤ −3a2, we have |D|/3 ≥ a2, so there are only finitely many choices of a andthus, only finitely many choices of b. So again there are only finitely many elements of RD. Gauss devised a reduction algorithm to transfer any binary quadratic form into an equivalent reducedform. The reduction Algorithm for Binary Quadratic Forms is as follows.Let P (X,Y ) = aX2 + bXY + cY 2 be a binary quadratic form of discriminant D. Appy to P (X,Y ) anelement T ∈ SL2(Z) of the following form:Tm =1 m0 1where m ∈ Z.Note that applying T to P (X,Y ) produces the equivalent binary quadratic formP (X +mY, Y ) = a(X +mY )2 + b(X +mY )Y + cY 2= aX2 + (b+ 2am)XY + (c+ bm+m2)Y 2.If we denote this new binary quadratic form by (a′, b′, c′), then we note that we can choose b′ to be inany real interval of length 2|a| by an appropriate choice of m ∈ Z, since the translates are all the pointsdiffering from b by an integer multiple of 2a.We can also apply to P (X,Y ) the element S ∈ SL2(Z)S =0 −11 0 .37When we apply S to P (X,Y ), we getP (−Y,X) = aY 2 − bXY + cX2.Thus, applying S to (a, b, c) we obtain the equivalent binary quadratic form (c,−b, a).The algorithm consists of the following steps:1. Start with an integer binary quadratic form (a, b, c) of discriminant D.2. If D < 0 then choose an m ∈ Z so that when Tm is applied to (a, b, c), the resulting binary quadraticform (a′, b′, c′) has b′ ∈ {x ∈ R : −|a| < x ≤ |a|}.If D > 0 and 0 < |a| <√D, choose an m ∈ Z so that when Tm is applied to (a, b, c), the resultingbinary quadratic form (a′, b′, c′) has b′ ∈ {x ∈ R : −|a| < x < |a|}.If D > 0 and a = 0, then first apply S to (a, b, c) to get (c,−b, a) and then choose m ∈ Z so that when Tmis applied to (c,−b, a) the resulting binary quadratic form (a′, b′, c′) has b′ ∈ {x ∈ R : −|c| < x < |c|},unless c = 0 as well. If c = 0 as well, then first apply T1 to (0, b, 0) to get (0, b, b+ 1) and then redo theabove.If D > 0 and |a| ≥√D, choose an m ∈ Z so that when Tm is applied to (a, b, c), the resulting binaryquadratic form (a′, b′, c′) has b′ ∈ {x ∈ R :√D − 2|a| < x <√D}.3. After step 2, we have obtained a binary quadratic form (a′, b′, c′). If (a′, b′, c′) is reduced, we aredone. If not, replace (a′, b′, c′) with (c′, b′,−a′) using S, and go back to step 2, now applying step 2 to(c′, b′,−a′).Now we can show that the algorithm does terminate with the desired output.Proposition 2.4.5 For any binary quadratic form (a, b, c) as input, the algorithm above will terminateand its output will be a reduced form (a′, b′, c′) equivalent to (a, b, c).Proof: Let (a0, b0, c0) have discriminant D. We first do the case where D < 0. We first note that inthis case, both a0 6= 0 and c0 6= 0, otherwise D = b20 > 0. Suppose we start with |b0| > a0. By the rightchoice of Tm0 we get a new form (a0, b1, c1) with b1 ∈ {x ∈ R : −|a0| < x ≤ |a0|}. Thus |b1| ≤ |a0|. Nowwe break all the possible cases satisfied by the form (a0, b1, c1) into the tree below38(a0,b1,c1)|b1|≤|a0|tt **a0 > 0&&yya0 < 0a0 > c1 a0 ≤ c1xx &&a0 = c1 a0 < c1xx &&|b1| = a0 |b1| < a0.The idea is to show that in each of the subcases on the left main fork of the tree, the algorithm terminateswith the desired output, and then show that if (a0, b1, c1) lies in the right main fork of the tree, thenwe can run the algorithm to obtain an equivalent form which lies in the left main fork, and then we aredone.We start from the bottom of the left main fork. If |b1| < a0 and 0 < a0 < c1 then we are done. Wehave a reduced form. On the other hand, suppose we have a form satisfying c1 > a0 > 0 and |b1| = a0.If b1 > 0 we have a reduced form, so we are done. If not, then we have the form (a0,−a0, c1). Thisform is almost reduced, except that the XY -coefficient has the wrong sign. Applying the element T1we obtain the form (a0, a0, c1 + 1− a0). This is not reduced since 0 ≤ c1 + 1− a0 ≤ a0, so we go to step3 and obtain the form (c1 + 1− a0,−a0, a0). If c1 ≤ 2a0 − 1 then we have the form (c2,−a0, a0) where0 < c2 ≤ a0. Then we apply T1 to this form to obtain (c2, 2c2 − a0, 1). Then we go to step 3 to obtainthe form (1, a0 − 2c2, c2) and this form finally is reduced since if c2 = 1 then c1 = a0 which is false, anda0 − 2c2 ≤ 0. On the other hand, if c1 > 2a0 − 1 then (c1 + 1− a0,−a0, a0) is already reduced.So now both branches of the a0 < c1 subcase are done. So now suppose a0 = c1. Then we have theform (a0, b1, a0). If b1 > 0 then we are done. If not, we just apply S to obtain (a0,−b1, a0) and we aredone. So now, both branches of the a0 ≤ c1 subcase are done.So now suppose a0 > 0 and a0 > c1 in the form (a0, b1, c1). If this form is not reduced, then we go tostep 3 and apply S to obtain (c1,−b1, a0). If |a0| ≤ c1 we are done, since that just reduces to the rightfork of a0 > 0 in the diagram above, which we just completed. On the other hand, if |b0| > c1 thenwe apply some Tm1 to produce the form (c1, b2, c2) with |b2| ≤ c1. If c1 ≤ c2 then we are done, since39that just reduces to the right form of a0 > 0 in the diagram above. If c1 > c2, we apply S to obtain(c2,−b2, c1)We repeat until we obtain a form satisfying the right fork of a0 > 0.This completes the case where a0 > 0. On the other hand, if a0 < 0 then we can always pass to anequivalent reduced form (a, b, c) satisfying a0 > 0 by some appropriate sequence of Tm and S.This finally completes the proof in the case where D < 0. The case where D > 0 is more difficult. Theproof for this case is in [4].Note that this provides another way proof of the finiteness of the class group without using anyMinkowski theory: Since every form of discriminant D is equivalent to a form in RD, and there are onlyfinitely many such forms, the class group of D is finite. In the case where D < 0, each binary quadraticform of discriminant D is equivalent exactly one reduced form.Proposition 2.4.6 Let D < 0 and let Q(x, y) be a form of discriminant D. Then there is exactly oneform in RD equivalent to Q(x, y).Proof: We have already proven that for any form (a, b, c), there is an equivalent reduced form. Thuswe may assume that (a, b, c) is a reduced form. We want to show that any reduced form equivalent to(a, b, c) is (a, b, c). Suppose that (d, e, f) is equivalent to (a, b, c). So letT =m np qbe the transformation that goes between (a, b, c) and (d, e, f). We recall from the proof of 2.4.2thatd = am2 + bmp+ cp2.Since mq − np = ±1, mq is coprime to np. Since (a, b, c) is reduced, we have |b| ≤ a ≤ c. We want toshow that a ≤ d. We can assume that m 6= 0. If m = 0 then d = cp2 ≥ a, so we are done.40Now we writeam2 + bmp+ cp2 = am2(1 +bapm)+ cp2.We also writeam2 + bmp+ cp2 = am2 + cp2(1 +bamp).We know that∣∣ ba∣∣ ≤ 1. There are two cases to consider Either∣∣ pm∣∣ < 1 or∣∣ pm∣∣ > 1. In the first case, weapply the first identity to obtain 0 ≤ a ≤ d. In the second case, we apply the second identity to againobtain 0 ≤ a ≤ d.Thus, if (a, b, c) is a any reduced form, and (d, e, f) is any equivalent reduced form, then a is minimalamong the first coordinates of all reduced forms equivalent to (a, b, c). Thus a = d. Therefore wegeta = am2(1 +bapm)+ cp2.a = am2 + cp2(1 +bamp).Since 0 ≤ a ≤ c, if∣∣ pm∣∣ < 1 then the first identity implies that p = 0 and m = 1, and if∣∣∣mp∣∣∣ < 1, thesecond identity implies that p = 0 and m = 1. Thus our matrix T must be of the formT =1 n0 ±1 .We require q = ±1 since det(T ) = ±1.We recall from 2.4.2 thate = 2amn+ bmq + bnp+ 2cpq = 2an± b.41Since a = d, we must have e ∈ (−a, a] since |e| ≤ a and (d, e, f) is reduced. Thus n = 0. Finally, wecannot have q = −1. If we let q = −1 and then apply T to (a, b, c) we obtain (c,−b, a). If (c,−b, a) isalso reduced, it implies that a = c since a ≤ c and c ≤ a. But that requires that b ≥ 0 and −b ≥ 0,so b = 0. But then e = 0 as well by the identity above, so (a, b, c) = (d, e, f), which implies thatq = 1.Thus (a, b, c) = (d, e, f). Thus, in the case of D < 0, the set RD can be identified with the ideal class group of Q(√D). This isnot true in the case where D > 0. In the case where D > 0, a binary quadratic form of discriminant Dis equivalent to finitely many reduced forms.In the case D > 0 we can describe RD in the following way. Let ρ : RD → RD be a map whichtakes a reduced binary quadratic form (a, b, c) and assigns it to (a′, b′, c′), where (a′, b′, c′) satisfies thefollowing:• a′ = c• b′ is the unique integer in (√D − 2|c|,√D) which is congruent to −b mod 2|c|We note that with a′ and b′ defined, c′ is defined as well since it is determined by a′, b′ and D.We can view the map ρ as performing a reduction step in accordance with the algorithm above on abinary quadratic form which is already reduced.Proposition 2.4.7 For each D > 0 the map ρ : RD → RD as described above is a permutation of RD.Proof: See [1].Since ρ is a permutation of RD, we can examine the cycle decomposition of RD. We note that theleading coefficients of (a, b, c) and ρ((a, b, c)) have opposite signs. To see this, we note that |a| <√D,since (a, b, c) is reduced, and thus a2 < b2−4ac, so 4ac < b2−a2 < 0, the right inequality again followingfrom the fact that (a, b, c) is reduced. Thus each ρ-cycle in RD has even length.Proposition 2.4.8 Two binary quadratic forms (a, b, c) and (a′, b′, c′) in RD are equivalent iff theybelong to the same ρ-cycle.42Proof: One direction just follows from the definition of ρ. The other direction may be found in[1].Thus, in the case D > 0, the ideal class group of Q(√D) can be identified with the ρ-cycles of RD.Thus, RD comes equipped with a principal cycle, the ρ-cycle corresponding to the identity element ofCl(Q(√D)).The Cohen-Lenstra heuristic assumption then asserts that Cl(Q(√D) behaves like a random group ofthe form G/〈σ〉, where G is a random group weighted by 1/|Aut(G)|, where G/〈σ〉 can be thought ofas RD modulo the principal cycle in the space of ρ-cycles of RD. Although RD is not a group, andeach ρ-cycle in RD does not have the same number of cycles, RD has a “group-like” structure which isdescribed in [3], which makes the notion of RD modulo the principle cycle sensible.We have seen that the Cohen-Lenstra heuristic assumption implies results much stronger than anythingwe can currently prove. Although the ideas of the previous section provide some intuitive grounds tobelieve the Cohen-Lenstra heuristic, they aren’t a formal proof. In the next and final chapter, we willreview some interpretations of the Cohen-Lenstra probability measure. That is, we will show that theCohen-Lenstra probability measure on the family of finite abelian groups corresponds to several otherprobability measures.43Chapter 3Interpretations of theCohen-Lenstra HeuristicsIn this section, we will be concerned with connecting the Cohen-Lenstra probability measure on finiteabelian groups, to other probability measures that occur in other contexts. Throughout this chapter wewill be focused only on the local Cohen-Lenstra probability measure.Definition 3.0.9 Fix an integer prime p. Then let Gp denote the category of all finite Abelian p-groups.The local Cohen-Lenstra probability measure on Gp is a map P : P(GAp)→ R≥0 defined asP (M) =∑G∈M |Aut(G)|−1∑G∈Gp|Aut(G)|−1for M ⊆ Gp.If M has a single element G, we will just write P (G) in place of P ({G}).We will keep p as a fixed prime throughout this chapter.3.1 Young Diagrams of PartitionsEach element of GAp clearly corresponds to an integer partition. An integer partition is just a multisetwith elements from N. For example, the integer partitions of 5 are just the mulisets {5}, {1, 4}, {2, 3},44{1, 1, 3}, {1, 2, 2}, {1, 1, 1, 2}, {1, 1, 1, 1, 1}. The multset {a1, a2, · · · , an} corresponds to the abeliangroupZ/pa1Z⊕ Z/pa2Z · · ·Z/panZ.Clearly there is a 1-1 correspondence between the abelian groups in GAp and the integer partitions.Thus, for an integer partition λ, we will write P (λ) to denote the probability of selecting the abeliangroup corresponding to λ, with respect to the probability measure P above.We can also associate to each integer partition a Young diagram.Definition 3.1.1 Let {a1, · · · , an} be an integer partition, such that a1 ≥ a2 ≥ · · · an. Then the Youngdiagram corresponding to {a1, · · · , an} is a grid of boxes with n rows, with ai boxes in the ith row.For example, the Young diagram corresponding to the integer partition {5, 3, 3, 2, 1} is.Clearly there is a 1-1 correspondence between integer partitions and Young diagrams. Through itsYoung diagram, every integer partition is associated with a conjugate partition.Definition 3.1.2 Let {a1, · · · , an} be an integer partition, with a1 ≥ · · · ≥ an. The conjugate partitionof {a1, · · · , an} is the integer partition corresponding to the Young diagram whose rows are the columnsof the Young diagram of {a1, · · · , an}.Using the previous Young diagram as an example, if we make the columns of the Young diagram of{5, 3, 3, 2, 1} into rows, we obtain the following Young diagram:.Thus, the conjugate partition to {5, 3, 3, 2, 1} is {6, 5, 3, 1, 1}.45For a partition λ and an integer s ≥ 1, the notation λs will denote the number of blocks in the sth-rowof the Young diagram of λ (read from top to bottom). For example, if λ = {5, 3, 3, 2, 1} as above, thenλ1 = 5, λ2 = 3, λ3 = 3, λ4 = 2, λ5 = 1 and λs = 0 for s > 5. If λ is the unique integer partitionsatisfying λs = 0 for all s ≥ 1 then we refer to λ as the empty partition. For a partition λ, the notationλ′ will always denote the conjugate partition to λ.Definition 3.1.3 Let {a1, · · · , an} be an integer partition with a1 ≥ · · · ≥ an, and∑ni=1 ai = M .A Young tableau of {a1, · · · , an} is the Young diagram of {a1, · · · , an} with each box filled in withone element from the set {1, · · · ,M}, so that each element of {1, · · · ,M} appears exactly once in thediagram.For example, a Young tableau corresponding to the integer partition {5, 3, 3, 2, 1} would be2 7 9 10 111 6 143 8 134 512 .Clearly there are M ! distinct Young tableaux corresponding to a Young diagram of an integer partition{a1, · · · , an} with∑ni=1 ai = M . A Young tableau is called standard if each row has entries of increasingvalue when read from left to right. The above Young tableau is a standard Young tableau.Several probabilistic interpretations of the Cohen-Lenstra probability measure on Gp were given byJason Fulman. We describe some of these below.3.2 The Young Tableau AlgorithmIn this section we examine the Young Tableau Algorithm. This is a probabilistic algorithm which outputsan integer partition {a1, · · · , an} with probability w(G), where G is the abelian p-group correspond-ing to {a1, · · · , an} (recall that throughout this chapter, p is a fixed prime). The algorithm is not adeterministic algorithm, but at each step, makes a decision with a certain probability.The steps of the algorithm are as follows:0. Start with the empty partition λ and a collection of weighted coins indexed by N = {1, 2, · · · }, such46that coin i has probability p−i of heads, and 1− p−i of tails. Also start with an N = 1.1. Flip coin N . If this coin comes up tails, then set N := N + 1 and redo step 1. Otherwise go to step2.2. Choose an integer S ≥ 1 according to the following probabilistic rule: Set S := 1 with probabilitypN−λ′1−1pN−1 . For s > 1, set S := s with probabilitypN−λ′s−pN−λ′s−1pN−1 . In either case, after S is chosen in thisway, add one block to the Sth column of the Young diagram of λ, and then return to step 1.In other words, what we are doing is starting with a partition and an integer N . We flip the Nth coin.If it comes up tails, we flip the N + 1th coin and repeat. If not, we add another block to the Youngdiagram of our partition and then start again. Note the step 3 is well-defined because if we are at step2 and have partition λ and integer N , and λ has k columns, then the probability of selecting s > k + 1is zero.Below we do an example of following several steps of the algorithm:Suppose we are currently at step 1, with N = 5 and the partition λ = {5, 2, 2, 1}, so we have thefollowing Young diagram.Thus the conjugate partition is.Now we flip coin 5. With probability p−5 we get heads, and with probability 1− p−5 we get tails. If weget tails, we redo the process with coin 6. Otherwise, we choose an integer S according to the followingprobabilities. We choose S := 1 with probability p−1p5−1 . We choose S := 2 with probabilityp2−pp5−1 . Wechoose S := 3 with probability p4−p2p5−1 . We choose S := 4 with probability 0. We choose S := 5 withprobability 0. We choose S := 6 with probability p5−p4p5−1 . We choose S > 6 with probability 0.47We note that this probability distribution is well defined since∑s≥1 P (S := s) =(p−1)+(p2−p)+(p4−p2)+(p5−p4)p5−1 =p5−1p5−1 = 1. If we have selected, say, S := 3 then we replace the Young diagram of λ with the Youngdiagram.which is the Young diagram of {5, 3, 2, 1}. Now we go back to step 1.This algorithm doesn’t halt, but with probability 1, it ouputs a finite partition. This means that if,after some finite number of steps, the algorithm has arrived at the partition ω, there is a positiveprobability that in all the infinitely many steps of the algorithm after this point, the algorithm willadd no more blocks to the Young diagram of ω (in other words, the algorithm outputs ω with positiveprobability).Proposition 3.2.1 With probability 1, the algorithm above outputs an integer partition. That is, if wetreat the possible outputs of the algorithm as infinite sequences of Young diagrams, the probability thatthe algorithm will output a sequence which stops adding blocks after finitely many steps is one.Proof: For each N ≥ 1 let AN denote the event “the Nth coin comes up heads at least once”. Theconditional probability that the Nth coin always comes up tails, given that the algorithm reaches theinteger N in finite time, is equal to the conditional probability that the Nth coin comes up tails the firsttime the algorithm reaches N . since a coin can only come up tails once. This is equal to 1− p−N . Thusthe conditional probability of AN , given that the algorithm reaches N is finite time, is p−N . Thus∑N≥1P (AN ) ≤∑N≥1p−N <∞.Thus, by the Borel-Cantenelli lemma, with probability 1 only finitely many of the events {AN : N ≥ 1}occur. For N, k let BN,k denote the event that the Nth coin comes up heads k times.Suppose then that the algorithm outputs a sequence in which the events AN1 , · · · , ANr occur. ThenP (BNi,k | ANi) = p−Nik. Thus48∑k≥1P (BNi,k) <∞.Thus a second application of the Borel-Cantenelli lemma implies that any coin coming up heads at leastonce, does so only finitely many times with probability 1.Thus with probability 1, the algorithm outputs a sequence in which for some i sufficiently large, afterthe ith stage all coin tosses come up tails. That is, the algorithm outputs a sequence which stops addingblocks to the Young diagram after some finite step. So, with probability 1, the algorithm outputs afinite partition in finitely many steps.The connection between the algorithm and the local Cohen-lenstra probability measure is as fol-lows:Proposition 3.2.2 For a particular integer partition ω, the probability that the algorithm above outputsthe conjugate partition to ω (that is, outputs an infinite sequence which reaches ω′ after finitely manysteps, and then stops adding blocks) is P (ω).Proof: See [2].3.3 The Conjugacy Class InterpretationAnother interpretation of the CL probability measure on GAp provided by Fulman is in terms of conju-gacy classes of elements of GL(n, p) (that is, the group of invertible n× n matrices over Fp). To movefrom conjugacy classes to integer partitions, we define the Jordan-Chevalley normal form.Definition 3.3.1 Let φ(X) ∈ Fp[X] be a monic polynomial of degree m, where φ(X) = Xm +am−1Xm−1 + · · · + a1X + a0. To this polynomial we associate a companion matrix over Fp withsize m×m. This matrix is denoted by C(φ) and is defined as49C(φ) =0 1 0 · · · 00 0 1 · · · 0.......... . ....0 0 0 · · · 1−a0 −a1 −a2 · · · −am−1.Definition 3.3.2 A Jordan-Chevalley normal form matrix in GL(n, p) is a block matrix of the followingformJ1 0 0 · · · 00 J2 0 · · · 00 0 J3 · · · 0.......... . ....0 0 0 · · · Jrwhere the block Ji is of the form C(φsii ), where φi(X) ∈ Fp[X] is a monic irreducible polynomial and siis a positive integer, such thatn =r∑i=1sideg(φi).That is, a Jordan-Chevalley normal form is given by associating to each irreducible polynomial φ(X) ∈Fp[X] an integer partition, where all but finitely many irreducible monic polynomials in Fp[X] recieve theempty partition. As an example, if a particular monic irreducible ψ(X) ∈ Fp[X] recieves the partitionλψ = {4, 4, 3, 3, 3, 1}, this means that the corresponding Jordan-Chevalley normal form contains twoblocks of the form C(ψ4), three blocks of the form C(ψ3) and one block of the form C(ψ).Thus, a matrix A ∈ GL(n, p) in Jordan-Chevalley normal form corresponds to a family (λφ)φ, whereeach λφ is an integer partition and φ runs over all irreducible monic polynomials in Fp[X], all butfinitely many of the λφ are empty partitions. Clearly the monic irreducible polynomial X must recievethe empty partition, otherwise the matrix A is not invertible. Furthermore, if we let λφ,s denote thenumber of blocks in the s-th row of the Young diagram of λφ,s, then the condition that the sum of thesizes of the Jordan blocks adds to n is expressed as50∑φ,sdeg(φ) · λφ,s = n.Conversely, suppose we start with a family (λφ)φ of integer partitions indexed by all the monic irreduciblepolynomials in Fp[X], and this family satisfiees the following properties• λX = ∅•∑φ,s deg(φ) · s · λφ,s = n.Then the family (λφ)φ determines a conjugacy class in GL(n, p).Now that we have the appropriate dictionary between integer partitions and conjugacy classes inGL(n, p), and integer partitions can be regarded as elements of Gp, we have the followingProposition 3.3.3 Fix a monic polynomial φ(X) ∈ Fp[X] of degree 1. Let λ be any integer parti-tion. For each conjugacy class [A] of GL(n, p), let (λ[A],φ)φ denote the corresponding family of integerpartitions. Then if we choose a conjugacy class [A] in GL(n, p) uniformly at random we getlimn→∞P(λ[A],φ = λ) = w(λ).In other words, if we choose a conjugacy class [A] of GL(n, p) uniformly at random, with correspondingpartition family (λ[A],φ)φ the probability that λ[A],φ = λ converges to P (λ) as n→∞.Proof: See [2].51Chapter 4ConclusionThe approach of Cohen and Lenstra appears to be a promising one for solving the class number problemfor real quadratic fields. Applying the methods and results of the previous chapter, results in grouptheory and representation theory about conjugacy classes of GL(n, p) can be transferred to the study ofthe Cohen-Lenstra probability measure. This offers a potential avenue for proving the Cohen-Lenstraheuristic and therefore solving the class number problem for real quadratic fields.52Bibliography[1] Dirichlet, P.G. and Dedekind, R., Vorlesungen u¨ber Zahlentheorie. New York: Springer-Verlag,1968 (reprint).[2] Fulman, J., A Probabilistic Approach Toward Conjugacy Classes in the Finite General Linear andUnitary Groups. Journal of Algebra 212, 557-590, 1999.[3] Lengler, J., The Cohen-Lenstra heuristic: Methodology and results. Journal of Algebra 323, 2960-2967, 2010.[4] Lenstra, H.W., On the Calculation of Regulators and Class Numbers of Quadratic Fields. Cam-bridge: University Press, 1982.[5] Lenstra, H. and Cohen, H.W., Heuristics on Class Groups of Number Fields. Lecture Notes onMath vol. 1086. Berlin: Springer-Verlag, 1984.[6] Marcus, A., Number Fields. New York: Springer-Verlag, 1977.[7] Neukirch, J., Algebraic Number Theory. Berlin: Springer-Verlag, 1999.[8] Reiner, I., Maximal Orders. Oxford: Clarendon Press, 2003.[9] Stark, H., The Gauss class-number problems. Clay Mathematics Proceedings, Volume 7, 2007.53


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items