Equations in the Primes

by

Brian Michael Cook

B.Sc., Southern Polytechnic University, 2005
M.Sc., Georgia State University, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate Studies

(Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

April 2012

© Brian Michael Cook 2012

Abstract

We provide results related to the study of prime points on level sets of homogeneous integral forms which are linear or quadratic. In the linear case we present an extension of the Green-Tao Theorem, which finds affine copies of finite intervals in relatively dense subsets of the primes, to a higher dimensional setting in which one finds affine copies of suitably generic point configurations in relatively dense subsets of a Cartesian product of the primes. For general integral quadratic forms we present a result which is a Birch-Goldbach type theorem for a single quadratic form with sufficient rank. This guarantees solubility among the primes on the level set of a quadratic form subject to local conditions. This is an extension of a well known result of Hua.

Preface

The material from Chapter 2 is taken from the following manuscript, which is joint work with A. Magyar.

Constellations in $\mathbb{P}^d$ (with A. Magyar), Int Math Res Notices (2011), doi: 10.1093/imrn/rnr127

Table of Contents

Abstract
Preface
Table of Contents
Acknowledgements
Dedication
1 Introduction
    1.1 Introduction
        1.1.1 Linear Equations in the Primes
        1.1.2 Diagonal Forms in the Primes
    1.2 Overview
        1.2.1 A Multidimensional Green-Tao Theorem
        1.2.2 General Quadratic Forms in the Primes
2 A Higher Dimensional Green-Tao Theorem
    2.1 Introduction
    2.2 Norms, Transference, and Pseudo-random Measures
    2.3 The Generalized von Neumann Inequality.
    2.4 The Dual Function Estimate.
    2.5 Proof of the Main Results.
        2.5.1 Proof of Theorem 1.2.2.
        2.5.2 Proof of Theorem 1.2.1.
    2.6 The Correlation Condition.
3 Quadratic Equations
    3.1 Introduction.
    3.2 The Minor Arcs
        3.2.1 Sufficiently Off Diagonal Forms
        3.2.2 Insufficiently Off Diagonal Forms
    3.3 The Major Arcs and an Asymptotic Formula
    3.4 The Singular Series
    3.5 Conclusions
4 Conclusion
    4.1 Future Projects
        4.1.1 A Conjecture
        4.1.2 A Reasonable Approach
    4.2 Final Remarks
Bibliography

Acknowledgements

First and foremost, I wish to acknowledge my advisor Prof. A. Magyar.
I can think of no other person with whom I would rather have worked. Other names of note are Prof. G. Chen, Prof. N. Lyall, Prof. J. Ziegler, and Prof. J. Whitenton. The faculty and staff at the following universities also require recognition: University of British Columbia; University of Georgia; Georgia State University; and Southern Polytechnic State University.

Dedication

To Tigger.

Chapter 1

Introduction

1.1 Introduction

The main purpose of this work is to provide some background on the study of solving equations in the primes, as well as to contribute a few results in this area. The main types of equations we are interested in are homogeneous polynomials with integer coefficients. We are especially interested in affine varieties defined by the level sets of such equations. This problem is of particular importance due to the large number of related problems in additive number theory and additive combinatorics.

It is not a matter of necessity, but one of convenience, that we do not measure the density of prime points on surfaces directly, but rather a weighted version. All weights are given essentially by the von Mangoldt function, denoted by $\Lambda$, which takes the value $\log p$ at a power of a prime $p$, and zero elsewhere. Every result appearing in this paper may be translated directly to the primes by dividing through by the appropriate power of a logarithm, the appropriate power being the number of variables. This is standard. We note that $\mathbb{P}$ is used to denote the set of primes.

Before describing the main results, we fix some notation and review some of the known results that have previously been obtained in this area.

As usual, we use $\mathbb{Z}$ for the integers, $\mathbb{R}$ for the reals, and $\mathbb{C}$ for the set of complex numbers. We use the shorthand notation $\mathbb{Z}_m$ to denote the group of residue classes $\mathbb{Z}/m\mathbb{Z}$.
We use $\|f\|_{L^p(X)}$ to denote the standard $L^p$ norm of a function $f$ on a given measure space $X$, and unless the situation necessitates otherwise we shall omit $X$ and simply write $\|f\|_{L^p}$. We employ the notation $\mathbb{E}_{x \in X} = |X|^{-1} \sum_{x \in X}$. The Bachmann-Landau notation $O$ and $o$ is used frequently. The notation $f \ll g$ is also used as an alternative to $f = O(g)$. We use $f \approx g$ to mean $f \ll g$ and $g \ll f$. Further notational conventions are introduced as they appear.

1.1.1 Linear Equations in the Primes

The study of linear systems of equations in prime unknowns has a long history. However, the recent work of Green and Tao encompasses the majority of what is known in the subject, and this is where we shall focus. We need some preliminary ideas before the main conjecture and the results that partially resolve it may be stated. These definitions are taken directly from [12].

Definition 1.1.1. (Affine-linear forms) Let $R, n$ be integers. An affine-linear form on $\mathbb{Z}^R$ is a function $\psi : \mathbb{Z}^R \to \mathbb{Z}$ which is the sum $\psi = \dot{\psi} + \psi(0)$ of a linear form $\dot{\psi}$ on $\mathbb{Z}^R$ and an integer $\psi(0)$. A system of affine-linear forms is then a collection $\Psi = (\psi_1, \dots, \psi_n)$ where each $\psi_i$ is an affine-linear form. The image of $\mathbb{Z}^R$ under $\Psi$ is referred to as an affine-sublattice of $\mathbb{Z}^n$. The size of such a system relative to the scale $N$ is given by

\[
\|\Psi\|_N = \sum_{i=1}^{n} \sum_{j=1}^{R} |\dot{\psi}_i(e_j)| + \sum_{i=1}^{n} |\psi_i(0) N^{-1}|, \tag{1.1}
\]

where the $e_j$ are the standard basis elements for $\mathbb{Z}^R$.

To avoid trivialities, it is assumed that a given system of affine-linear forms contains no constant forms and no two affine-linear forms which are rational multiples of each other.

With a given system of affine-linear forms, $\Psi = (\psi_1, \dots, \psi_n)$, the main problem is to evaluate the sum

\[
\sum_{x \in K \cap [-N,N]^R} \prod_{i=1}^{n} \Lambda(\psi_i(x)) \tag{1.2}
\]

for a given convex body $K$. This sum counts prime points represented (with multiplicity) by the system of affine-linear forms, where each prime point, more precisely each prime power point, is weighted with the von Mangoldt function.
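To make the weighted sum (1.2) concrete, the following is a minimal numerical sketch, not part of the original text. It assumes the one-variable case $R = 1$ with $K = [1, N]$, and represents each affine-linear form $\psi_i(x) = a_i x + b_i$ by a pair `(a, b)`; the function names `von_mangoldt` and `weighted_count` are hypothetical and chosen only for this illustration.

```python
import math


def von_mangoldt(n: int) -> float:
    """Return log(p) if n = p^k for a prime p and k >= 1, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor p of n by trial division.
    p = 2
    while p * p <= n:
        if n % p == 0:
            break
        p += 1
    else:
        return math.log(n)  # no factor found, so n itself is prime
    # n is a prime power exactly when dividing out p leaves 1.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def weighted_count(forms, N: int) -> float:
    """Evaluate sum_{1 <= x <= N} prod_i Lambda(a_i * x + b_i).

    `forms` is a list of pairs (a_i, b_i); this is the sum (1.2) under the
    illustrative assumption R = 1 and K = [1, N].
    """
    total = 0.0
    for x in range(1, N + 1):
        term = 1.0
        for a, b in forms:
            term *= von_mangoldt(a * x + b)
            if term == 0.0:
                break
        total += term
    return total


if __name__ == "__main__":
    # The system (x, x + 2) weights pairs of prime powers differing by 2.
    for N in (100, 1000, 10000):
        print(N, weighted_count([(1, 0), (1, 2)], N) / N)
```

For a single form $\psi(x) = x$ the sum reduces to the Chebyshev function, which is one way to sanity-check the sketch; for several forms the ratio printed above is the quantity that the heuristic discussed next is meant to predict.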
A heuristic argument, which dates back to Hardy and Littlewood at least, provides an expected value for the sum in terms of a singular series, which is actually a product of factors taking into account solubility at each place. The Archimedean factor is

\[
\beta_\infty = \mathrm{vol}\big(K \cap [-N,N]^R \cap \Psi^{-1}((\mathbb{R}^+)^n)\big). \tag{1.3}
\]

Here vol denotes the volume and $\Psi$ is extended to $\mathbb{R}$ in the natural way. The order of this term is $N^R$ in general. The local factors are defined in terms of localized versions of the von Mangoldt function, which for each prime $p$ is given by $\Lambda_p(y) = p/(p-1)$ for each integer $y$ not divisible by $p$, and $\Lambda_p(y) = 0$ otherwise. The local factors are then

\[
\beta_p = \mathbb{E}_{x \in \mathbb{Z}_p^R} \prod_{i=1}^{n} \Lambda_p(\psi_i(x)). \tag{1.4}
\]

The heuristic argument leads to the following conjecture.

Conjecture 1.1.1. (Generalized Hardy-Littlewood conjecture). Let $N$, $R$, $n$, and $L$ be positive integers. Also, let $\Psi = (\psi_1, \dots, \psi_n)$ be a system of affine-linear forms with size $\|\Psi\|_N \le L$, and $K \subset [-N,N]^R$ be a convex body. Then we have

\[
\sum_{x \in K \cap [-N,N]^R} \prod_{i=1}^{n} \Lambda(\psi_i(x)) = \beta_\infty \prod_p \beta_p + o_{n,R,L}(N^R). \tag{1.5}
\]

The original formulation of Hardy and Littlewood only deals with the systems composed of the affine-linear forms $\psi_i(x) = x + b_i$ on $\mathbb{Z}$. This generalization originally appears in [13]. In the same paper Green and Tao prove a significant portion of this conjecture, albeit conditionally. The assumptions required in their proof are known as the Möbius and nilsequences conjecture and the Gowers-norm conjecture. Both have subsequently been proved: the former by Green and Tao in [1], and the latter by Green, Tao, and Ziegler in [2]. To state the main result of Green and Tao we need one more definition.

Definition 1.1.2. (Complexity). Let $\Psi = (\psi_1, \dots, \psi_n)$ be a system of affine-linear forms. If $1 \le i \le n$ and $s \ge 0$, we say that $\Psi$ has $i$-complexity at most $s$ if one can cover the $n - 1$ forms $\{\psi_j : j = 1, \dots, n;\ j \ne i\}$ by $s + 1$ classes such that $\psi_i$ does not lie in the affine-linear span of any of these classes.
The complexity of the system is then defined as the minimal $s$ such that the system has $i$-complexity at most $s$ for each $i$. If no such $s$ exists, then the complexity is infinite.

With this we have the following.

Theorem 1.1.1. (Green and Tao) The generalized Hardy-Littlewood conjecture is true for all systems of finite complexity.

From the definition of complexity it is easily seen that the only excluded cases are those systems which have two affine-linear forms which are affinely related, meaning that the homogeneous parts of the two forms are rational multiples of one another. So while this result provides numerous examples, a few of which we point out below, it does nothing for problems like the Goldbach conjecture.

Thus far the problems and results have been phrased in terms of simultaneously representing primes by affine-linear forms. In the linear setting, it turns out that in many situations this is the same as finding prime points on level sets of a system of linear forms. Green and Tao apply a little algebra to Theorem 1.1.1 and obtain the following.

Theorem 1.1.2. (Green and Tao: Linear equations in the primes). Let $A$ be an $R \times n$ integral matrix with $R \le n$. Assume that $A$ has full rank $R$, and that the only element of the row-space of $A$ over $\mathbb{Q}$ with two or fewer non-zero entries is the zero vector. Let $N > 1$ and $b \in \mathbb{Z}^R$ be a vector in $A\mathbb{Z}^n$. Then we have

\[
\sum_{x \in [N]^n,\ Ax = b} \prod_{i=1}^{n} \Lambda(x_i) = \mu_\infty \prod_p \mu_p + o(N^{n-R}). \tag{1.6}
\]

The local densities $\mu_p$ are given by

\[
\mu_p = \lim_{M \to \infty} \mathbb{E}_{x \in [-M,M]^n,\ Ax = b} \prod_{i=1}^{n} \Lambda_p(x_i), \tag{1.7}
\]

and the global factor $\mu_\infty$ is given by

\[
\mu_\infty = \#\{x \in \mathbb{Z}^n : x \in [0,N]^n,\ Ax = b\}. \tag{1.8}
\]

The implied constant in the little $o$ term depends on $A$, $R$, and $n$ only.

Let us highlight a couple of examples.

Example 1. The weighted number of solutions to the equation $x_1 + x_2 + x_3 = N$ with each $x_i \le N$ prime obeys the asymptotic

\[
\tfrac{1}{2}\,\mathfrak{S}(N) N^2 + o(N^2), \tag{1.9}
\]

with

\[
\mathfrak{S}(N) = \prod_{(p,N) \ne 1} \left(1 - \frac{1}{(p-1)^2}\right) \prod_{(p,N) = 1} \left(1 + \frac{1}{(p-1)^3}\right). \tag{1.10}
\]

This is Vinogradov's Three Primes Theorem, see e.g. [16].
Relatively recently it has been shown in [7], conditionally on the Generalized Riemann Hypothesis, that there exists at least one prime point solution for every odd integer $N \ge 7$.

Example 2. The system $L_i(x) = x_i - 2x_{i+1} + x_{i+2} = 0$ for $i = 1, \dots, k-2$ in $k$ variables counts $k$-term arithmetic progressions. The weighted number of solutions with $x_1 < \dots < x_k$ is given by

\[
(\mathfrak{S}(N) + o(1)) N^2, \tag{1.11}
\]

where

\[
\mathfrak{S}(N) = \frac{1}{2(k-1)} \prod_p \beta_p \tag{1.12}
\]

with

\[
\beta_p =
\begin{cases}
\dfrac{1}{p} \left(\dfrac{p}{p-1}\right)^{k-1} & \text{if } p \le k, \\[2ex]
\left(1 - \dfrac{k-1}{p}\right) \left(\dfrac{p}{p-1}\right)^{k-1} & \text{if } p > k.
\end{cases} \tag{1.13}
\]

The ease of adding the conditions $x_i < x_{i+1}$ is a nice feature of allowing a general convex body. We return to this example shortly.

More examples are given in [13], and are in general not overly difficult to arrive at.

1.1.2 Diagonal Forms in the Primes

The study of finding prime points on level sets of homogeneous forms of higher degree has received much less attention than the linear case. The major exception is for forms which are diagonal, meaning all terms are of the form $a_i x_i^d$. Forms of this type share some properties with linear ones, and it is this approach Hua takes in showing the results of this section.

The main result Hua obtains is an asymptotic similar in nature to the one presented in Theorem 1.1.2. For convenience we look only at $F(x) = x_1^d + \dots + x_n^d$ where $d \ge 1$. This is the most important instance of a diagonal form; the problem of finding prime solutions to $F = v$ for various values of $v$ is known as the Goldbach-Waring problem. This form, and more general diagonal forms, are covered by the work of Chapter 3, albeit not as effectively.

Set

\[
M(v, N) = \int_0^1 \sum_{x_1, \dots, x_n = 1}^{N} \Lambda(x_1) \cdots \Lambda(x_n)\, e((F(x) - v) r)\, dr, \tag{1.14}
\]

that is, the weighted number of prime points on the level set $F = v$. Note that this is only relevant when $v \approx N^d$, so we shall just assume the equality and set $M(N^d, N) = M(N)$.

Modifying the methods Vinogradov employs, Hua arrives at an asymptotic with the following singular series.
\[
W_{a,q} = \sum_{s \in \mathbb{Z}_q^*} e(a s^d / q), \qquad
B(v, q) = \sum_{(a,q) = 1} \phi(q)^{-n}\, W_{a,q}^{n}\, e(-va/q), \qquad
\mathfrak{S}(v) = \sum_{q=1}^{\infty} B(v, q). \tag{1.15}
\]

Here $\phi$ is the totient function of Euler, and $(a, q)$ represents the greatest common divisor. Also, $\mathbb{Z}_q^*$ is the multiplicative group of $\mathbb{Z}_q$.

Theorem 1.1.3. We have

\[
M(N) = \frac{\Gamma^n(1 + d^{-1})}{\Gamma(n d^{-1})}\, \mathfrak{S}(N)\, N^{n/d - 1} + o(N^{n/d - 1}) \tag{1.16}
\]

provided that

\[
n \ge
\begin{cases}
2^d + 1 & \text{if } d \le 11, \\
2d^2 (2 \log d + \log\log d + 2.5) & \text{if } d > 11.
\end{cases} \tag{1.17}
\]

Here $\Gamma$ denotes the standard gamma function.

One of the main aspects of Hua's work concerns the singular series. He provides an infinite arithmetic progression $Z$, dependent on $d$, such that $\mathfrak{S}(N)$ is bounded below on $Z$. Some specific instances are as follows.

Corollary 1.1.1. Every sufficiently large odd integer is the sum of three primes.

Corollary 1.1.2. Every sufficiently large integer congruent to 5 modulo 24 is the sum of five squares of primes.

Corollary 1.1.3. Every sufficiently large odd integer is the sum of nine cubes of primes.

Corollary 1.1.4. Every sufficiently large integer congruent to 17 modulo 240 is the sum of 17 fourth powers of primes.

While these results are for a specific arithmetic progression, it is possible to obtain results on all large numbers for these particular forms. For example, the square of a prime is congruent to 1, 4, or 9 modulo 24, and, as one can easily check, the sums of at most four such squares cover all residue classes modulo 24. With this one can obtain the following from Corollary 1.1.2.

Theorem 1.1.1. Every sufficiently large integer is the sum of at most nine squares of primes.

1.2 Overview

We now come to a description of the results presented in this document. These are split into two main parts. The first describes a result which generalizes the theorem of Green and Tao that the primes contain arbitrarily long arithmetic progressions. The second introduces an extension of the result of Hua to general quadratic forms.
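As an aside on the nine-squares statement of Section 1.1.2, the residue computation modulo 24 can be checked mechanically. The sketch below is illustrative only and rests on one stated assumption: the squares of all primes are admitted, including those of 2 and 3, so that the available residues are 1, 4, and 9. The function names are hypothetical.

```python
from itertools import combinations_with_replacement


def prime_square_residues(modulus: int, primes) -> list:
    """Residues of p^2 modulo `modulus` for the given primes, sorted."""
    return sorted({(p * p) % modulus for p in primes})


def sums_of_at_most(k: int, residues, modulus: int) -> set:
    """All residues reachable as a sum of at most k elements of `residues`."""
    reachable = {0}  # the empty sum
    for count in range(1, k + 1):
        for combo in combinations_with_replacement(residues, count):
            reachable.add(sum(combo) % modulus)
    return reachable


if __name__ == "__main__":
    small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    residues = prime_square_residues(24, small_primes)
    covered = sums_of_at_most(4, residues, 24)
    print("residues of prime squares mod 24:", residues)
    print("all classes covered:", covered == set(range(24)))
    # Squares of primes larger than 3 alone only give the residue 1.
    print("using residue 1 only:", sorted(sums_of_at_most(4, [1], 24)))
```

The last line indicates why the primes 2 and 3 matter under this reading: every prime larger than 3 has square congruent to 1 modulo 24, so without 4 and 9 only the classes 0 through 4 would be reachable.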
1.2.1 A Multidimensional Green-Tao Theorem

One way of phrasing the celebrated theorem of Green and Tao [12] is the statement that subsets of positive relative upper density of the primes contain an affine copy of any finite set of the integers, and in particular contain arbitrarily long arithmetic progressions. It is natural to ask if similar results hold in the multi-dimensional setting, especially in light of the multi-dimensional extensions of the closely related theorem of Szemerédi [19] on arithmetic progressions in dense subsets of the integers. Indeed such a result was obtained by Tao [20], showing that the Gaussian primes contain arbitrary constellations. In the same paper the problem of finding constellations in dense subsets of $\mathbb{P}^d$ was raised and briefly discussed.

The difficulty in this setting comes from two facts. First, the natural majorant of the $d$-tuples of primes is not pseudo-random with respect to the box norms, which replace the Gowers uniformity norms in the multi-dimensional case. This may be circumvented by assuming the set $e$ is in general position as described below, as is already suggested in [20]. However, even under this non-degeneracy assumption, the so-called correlation conditions in [12] do not seem to be sufficient, and a key observation of this note is to use more general correlation conditions to obtain the dual function estimates in the multi-dimensional case. Also, we need a more abstract form of the transference principle of Green and Tao [12]. The formulation we use is due to Gowers [10]; however, essentially equivalent results have been obtained by Tao and Ziegler [21], as well as by Reingold et al. [17].

Finally let us note that we expect the main result of this paper to remain true for sets which are not in general position.
For example, in the simplest case, when $e = \{(x, y), (x + d, y), (x, y + d)\}$, it is easy to see that both subsets of the form $A = B \times C$ and random subsets $A \subseteq \mathbb{P}^2$ of positive relative density contain many affine copies of $e$. However, to prove such a result our approach needs to be modified in an essential way, as the box norms do not seem to control such constellations in the relative setting.

Let $e = \{e_1, \dots, e_l\} \in (\mathbb{Z}^d)^l$ be a set of vectors; a constellation defined by $e$ is then a set $e' = \{x, x + te_1, \dots, x + te_l\}$ where $t \ne 0$ is a scalar, that is, an affine image of the set $e \cup \{0\}$.

Definition 1.2.1. We say that a set of $l$ vectors $e \in (\mathbb{Z}^d)^l$ is in general position if $|\pi_i(e \cup \{0\})| = l + 1$ for each $i$, where $\pi_i$ is the orthogonal projection to the $i$th coordinate axis.

Let us also recall that a subset $A$ of the $d$-tuples of primes $\mathbb{P}^d$ is of positive upper relative density if

\[
\limsup_{N \to \infty} \frac{|A \cap [1,N]^d|}{\pi(N)^d} > 0,
\]

where $\pi(N)$ denotes the number of primes at most $N$. Our main result is then the following.

Theorem 1.2.1. Given any set $A \subseteq \mathbb{P}^d$ of positive relative upper density, we have that $A$ contains infinitely many constellations defined by a set of vectors $e \in (\mathbb{Z}^d)^l$ in general position.

Remarks: We note that for $d = 1$ this translates back to the above described theorem of Green and Tao [12], as any finite subset of $\mathbb{Z}$ is in general position.

Also, by passing to higher dimensions, one may assume that $l = d$ and that the set $e = \{e_1, \dots, e_d\} \subseteq \mathbb{Z}^d$ forms a basis of $\mathbb{R}^d$ besides being in general position. Indeed, if $e \in (\mathbb{Z}^d)^l$ then let $\{f_1, \dots, f_l\} \subseteq \mathbb{Z}^l$ be linearly independent vectors, and define a basis

\[
e' = \{e'_1 = (e_1, f_1), \dots, e'_l = (e_l, f_l), e'_{l+1}, \dots, e'_{l+d}\} \subseteq \mathbb{Z}^{d+l}
\]

by extending the linearly independent set of vectors $e'_i = (e_i, f_i)$, $1 \le i \le l$. Here we have used $(e_i, f_i)$ to denote the concatenation of the vectors $e_i$ and $f_i$.
If $e$ was in general position then it is easy to make the construction so that $e'$ is also in general position, and if the set $A' := A \times \mathbb{P}^l$ contains a constellation $x' + te'$, then $A$ contains $x + te$. Thus from now on we will always assume that $e$ is also a basis of $\mathbb{R}^d$.

Theorem 1.2.1 may be viewed as a relative version of the so-called Multidimensional Szemerédi Theorem [8], stating that any subset of $\mathbb{Z}^d$ of positive upper density contains infinitely many constellations defined by any finite set of vectors $e \subseteq \mathbb{Z}^d$. As is customary, we will work in the finitary setting, where the underlying space is the group $\mathbb{Z}_N^d = (\mathbb{Z}/N\mathbb{Z})^d$, $N$ being a large prime. In this setting we need the following, more quantitative version:

Theorem 1.2.1. (Furstenberg-Katznelson [8]). Let $\alpha > 0$, $d \in \mathbb{N}$ and let $e = \{e_1, \dots, e_d\} \subseteq \mathbb{Z}_N^d$ be a fixed set of vectors. If $f : \mathbb{Z}_N^d \to [0, 1]$ is a given function such that $\mathbb{E}(f(x) : x \in \mathbb{Z}_N^d) \ge \alpha$, then one has

\[
\mathbb{E}(f(x) f(x + te_1) \cdots f(x + te_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N) \ge c(\alpha, e), \tag{1.18}
\]

where $c(\alpha, e) > 0$ is a constant depending only on $\alpha$ and the set $e$.

Here we have used the expectation notation

\[
\mathbb{E}(f(x) : x \in A) = \frac{1}{|A|} \sum_{x \in A} f(x).
\]

In the relative setting, when $A \subseteq \mathbb{P}^d$, the condition $\mathbb{E}(f(x) : x \in \mathbb{Z}_N^d) \ge \alpha$ (after identifying $[1, N]^d$ with $\mathbb{Z}_N^d$) does not hold for the indicator function $f = 1_A$; however, it holds for $f = 1_A \Lambda^d$, where $\Lambda^d$ is the $d$-fold tensor product of the von Mangoldt function $\Lambda$. The price one pays is that the function $f$ is no longer bounded uniformly in $N$. Following the strategy of [12] we will show that the $d$-fold tensor product $\otimes^d \nu$ of the pseudo-random measure $\nu$ used in [12] is sufficiently random in our setting in order to apply the transference principle of [10]; we will refer to such measures $\nu$ as $d$-pseudo-random measures. We postpone the definition of $d$-pseudo-random measures until later, but state our main result in the finitary setting below:

Theorem 1.2.2. Let $\alpha > 0$ be given, and $d$ be fixed.
There exists a constant $c(\alpha, e) > 0$ such that the following holds. If $0 \le f \le \mu$ is a given function on $\mathbb{Z}_N^d$ such that $\mu = \otimes^d \nu$ where $\nu$ is $d$-pseudo-random, and $\mathbb{E}(f(x) : x \in \mathbb{Z}_N^d) \ge \alpha$, then for any basis $e = \{e_1, \dots, e_d\}$ in general position, we have that

\[
\mathbb{E}(f(x) f(x + te_1) \cdots f(x + te_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N) \ge c(\alpha, e). \tag{1.19}
\]

1.2.2 General Quadratic Forms in the Primes

Here we introduce an analogue of Theorem 1.1.3 for general quadratic forms in $n$ variables, i.e. forms which may not be diagonal. Let $Q(x) = \langle x, Ax \rangle$ be an integral quadratic form in $n$ variables, so that $A$ is a symmetric $n \times n$ matrix with integer coefficients. Also, let $S_v = \{x \in \mathbb{Z}^n : Q(x) = v\}$ be its level surface. With $\mathbb{P}_N$ being the set of primes which are at most $N$, we wish to study $|S_v \cap \mathbb{P}_N^n|$, that is, the number of solutions of the equation $Q(x_1, \dots, x_n) = v$ among the prime numbers. Our work builds on that of Hua and the method follows a similar outline, while also taking into account the methods of Birch [3] and Davenport [6], which treated integer solutions of general forms.

The difficulty in treating general quadratic forms is that the work of Hua, and in fact most subsequent work addressing the number of prime solutions of diophantine equations, exploits the additive structure of diagonal equations. For general quadratic forms this additive structure is not available. To overcome this we first consider forms $Q(x) = \langle x, Ax \rangle$ where the underlying matrix $A$ has an off-diagonal block of sufficiently large rank.

We start out, as usual, by writing the number of weighted prime solutions via the expression

\[
M(v, N) = \sum_{x_1, \dots, x_n = 1}^{N} \Lambda(x_1) \cdots \Lambda(x_n)\, S_v(x_1, \dots, x_n)
= \int_0^1 \sum_{x_1, \dots, x_n = 1}^{N} \Lambda(x_1) \cdots \Lambda(x_n)\, e((Q(x) - v) r)\, dr, \tag{1.20}
\]

where $\Lambda$ again denotes the von Mangoldt function and $S_v(x_1, \dots, x_n)$ denotes the indicator function of the level surface $S_v$. Our approach is to apply the Hardy-Littlewood circle method to obtain an asymptotic for (1.20).
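The identity (1.20) rests on the orthogonality relation $\int_0^1 e(mr)\,dr = 1$ if $m = 0$ and $0$ for any other integer $m$. The sketch below, which is not part of the original text, checks this numerically for a small form. It assumes a discretization that is exact for trigonometric polynomials: the integral is replaced by an average over $r = k/m$ for $0 \le k < m$, with $m$ chosen larger than every $|Q(x) - v|$ that occurs. The function names and the sample matrix are hypothetical.

```python
import cmath
import itertools
import math


def von_mangoldt(n: int) -> float:
    """Return log(p) if n = p^k for a prime p and k >= 1, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            break
        p += 1
    else:
        return math.log(n)
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def quad_form(A, x) -> int:
    """Q(x) = <x, Ax> for an integer matrix A given as a list of rows."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def weighted_points(A, N: int):
    """Pairs (Lambda(x_1)...Lambda(x_n), Q(x)) over x in [1, N]^n with nonzero weight."""
    points = []
    for x in itertools.product(range(1, N + 1), repeat=len(A)):
        weight = math.prod(von_mangoldt(xi) for xi in x)
        if weight > 0.0:
            points.append((weight, quad_form(A, x)))
    return points


def M_direct(A, v: int, N: int) -> float:
    """Left-hand side of (1.20): weighted count of points on Q = v."""
    return sum(w for w, q in weighted_points(A, N) if q == v)


def M_circle(A, v: int, N: int) -> float:
    """Right-hand side of (1.20), with the integral replaced by an exact average."""
    points = weighted_points(A, N)
    if not points:
        return 0.0
    m = max(abs(q - v) for _, q in points) + 1  # m > |Q(x) - v| for all x
    total = 0j
    for k in range(m):
        for w, q in points:
            phase = ((q - v) * k) % m  # reduce exactly in integers first
            total += w * cmath.exp(2j * math.pi * phase / m)
    return (total / m).real


if __name__ == "__main__":
    A = [[1, 1], [1, 2]]  # Q(x) = x1^2 + 2*x1*x2 + 2*x2^2, not diagonal
    print(M_direct(A, 34, 7), M_circle(A, 34, 7))
```

The two printed values should agree up to floating point error. The circle method proper does not evaluate the integral this way; it splits $[0,1]$ into major and minor arcs and estimates each part separately, as described next.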
We (eventually) define the major arcs the same way as in [14]; the major arcs shall be the collection of $r$'s such that $|r - a/q| \le (\log N)^c / N^2$ for some reduced fraction $a/q$ with denominator $q \le (\log N)^c$, $c$ being a constant depending only on the underlying dimension $n$.

In the case of an off-diagonal block of large enough rank, we first eliminate the von Mangoldt function by two applications of the Cauchy-Schwarz inequality, picking up a logarithmic type loss. However, using the Birch-Davenport method we can get strong enough bounds on the minor arcs to compensate.

In the opposite case we treat the matrix $A$ as a block diagonal matrix consisting of a small and two large blocks, exploiting that the remaining off-diagonal blocks have small rank. Here the minor arc estimates are similar to those of Hua [14] and Vinogradov [22], using uniform estimates and rewriting $L^2$ bounds as counts of solutions of systems of equations.

The treatment of the major arcs is fairly standard, reducing the integral over them to a product of local factors and a singular integral while making acceptable errors. This process culminates in an asymptotic formula of the form

\[
M(v, N) = N^{n-2}\, \mathfrak{S}(v) J(\mu) + O((\log N)^{-\delta} N^{n-2}), \tag{1.21}
\]

where $\delta > 0$, and $\mathfrak{S}$ and $J$ are the singular series and singular integral, respectively.

The asymptotic may be used to deduce several results. The following is the analogue of Theorem 1 in [3], which in this case is essentially the Hasse-Minkowski Theorem, see e.g. [4]. In the statement, $B$ is the unit cube $[0, 1]^n$ in $\mathbb{R}^n$.

Theorem 1.2.3. Let $Q$ be a homogeneous quadratic polynomial in $n$ variables. If the rank of $Q$ is at least 34, then

\[
M(v, N) = N^{n-2}\, \mathfrak{S}(v) J(\mu) + O(N^{n-2} (\log N)^{-\delta}),
\]

where $\delta > 0$, the $O$-term is uniform in $v$ and $N$, and $\mu = N^{-2} v$. Here $\mathfrak{S}(v)$ is positive so long as $Q$ has a non-singular point in the reduced residue class modulo every sufficiently large prime power; and $J(Q(\bar{x}))$ exceeds a positive lower bound if $\bar{x}$ runs through a closed subset of the interior of $B \setminus V^*$.
Here $B = [0, 1]^n \subset \mathbb{R}^n$, and $V^*$ is the null space of the matrix $A$ over $\mathbb{R}^n$.

Another result of interest is the following.

Theorem 1.2.4. If $Q$ is a positive definite integral quadratic form with rank at least 34 in $n$ variables, then there exists an arithmetic progression $Z$ such that, when restricted to $\mathbb{P}^n$, $Q$ represents all sufficiently large elements in $Z$.

Theorem 1.2.4 may be viewed as a generalization of Theorem 1.1.3. As we have previously seen, Hua's main result (for quadratics) is that every sufficiently large integer congruent to 5 modulo 24 can be written as a sum of five squares of primes.

Chapter 2

A Higher Dimensional Green-Tao Theorem

2.1 Introduction

The aim of this chapter is to prove Theorem 1.2.1. The method of proof follows the lines of the proof given in [12].

In Sections 2.3 and 2.4 we prove two key propositions, the so-called generalized von Neumann inequality and the dual function estimate. The first roughly says that the number of constellations defined by a set $e$ is controlled by the appropriate box norm. The second is the essential step in showing that the box norms are QAP norms.

In Section 2.5, we prove our main results assuming that the measure exhibited in [12] is also $d$-pseudo-random in the sense of Definition 2.2.4. First we show Theorem 1.2.2, which then follows easily from the Transference Principle. Next, we prove Theorem 1.2.1 by a standard argument passing from $\mathbb{Z}_N$ to $\mathbb{Z}$.

Finally, in the last section, we provide the additions to the results given in [12] which are needed to prove the $d$-pseudo-randomness of the measure $\nu$ used by Green and Tao. This is done by slightly modifying their arguments in Section 10 of [12], which are based on earlier work of Goldston and Yıldırım [9] [5].

2.2 Norms, Transference, and Pseudo-random Measures

First we introduce the $d$-dimensional box norms. We actually introduce one norm for each linearly independent set of vectors $\{e_1, \dots, e_d\} \subseteq \mathbb{Z}_N^d$.
For a function $f : \mathbb{Z}_N^d \to \mathbb{C}$ this norm with respect to a basis $e$ is given by

\[
\|f\|_{\Box^d(e)}^{2^d} = \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d} f(x + \omega t e) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N^d \Big)
\]

with the notation $\omega t e = \omega_1 t_1 e_1 + \dots + \omega_d t_d e_d$. That this is actually a norm is not immediate, but for the standard basis it can be shown by repeated applications of the Cauchy-Schwarz inequality, similarly as for the Gowers norms (see for example [11]). For a different basis, note that we have $\|f\|_{\Box^d(e)} = \|f \circ T\|_{\Box^d}$ for an appropriate linear transformation $T$, where $\|f\|_{\Box^d}$ is the norm with respect to the standard basis. In the same way one shows [11] that the analogue of the so-called Gowers-Cauchy-Schwarz inequality holds.

Proposition 2.2.1. ($\Box^d(e)$-Cauchy-Schwarz inequality) Given $2^d$ functions, indexed by elements of $\{0, 1\}^d$, we have

\[
\langle f_\omega : \omega \in \{0,1\}^d \rangle = \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d} f_\omega(x + \omega t e) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N^d \Big) \le \prod_{\omega \in \{0,1\}^d} \|f_\omega\|_{\Box^d(e)}.
\]

Gowers presents an alternative approach to the Green-Tao Transference Theorem from a more functional analytic point of view, making use of the Hahn-Banach Theorem. The specific version he provides is presented below after we recall some definitions. First we note that $\|\cdot\|^*$ is defined to be the dual norm of $\|\cdot\|$.

Definition 2.2.1. Let $\|\cdot\|$ be a norm on $H = L^2(\mathbb{Z}_n)$ such that $\|f\|_{L^\infty} \le \|f\|^*$, and let $X \subseteq H$ be bounded. Then $\|\cdot\|$ is a quasi algebra predual (QAP) norm with respect to $X$ if there exists an operator $D : H \to H$, a positive function $c$ on $\mathbb{R}$ and an increasing positive function $C$ on $\mathbb{R}$ satisfying:

(i) $\langle f, Df \rangle \le 1$ for all $f \in X$,
(ii) $\langle f, Df \rangle \ge c(\epsilon)$ for every $f \in X$ with $\|f\| \ge \epsilon$, and
(iii) $\|Df_1 \cdots Df_K\|^* \le C(K)$ for any $f_1, \dots, f_K \in X$.

This definition is enough to state the transference principle.

Theorem 2.2.1. (Gowers [10]) Let $\mu$ and $\omega$ be non-negative functions on $Y$, $Y$ finite, with $\|\mu\|_{L^1}, \|\omega\|_{L^1} \le 1$, and let $\eta, \delta > 0$ be given parameters. Also let $\|\cdot\|$ be a QAP norm with respect to $X$, the set of all functions bounded above by $\max\{\mu, \omega\}$ in absolute value.
There exists an $\epsilon > 0$ such that the following holds: If we have that $\|\mu - \omega\| < \epsilon$, then for every function $f$ with $0 \le f \le \mu$ there exists a function $g$ with $0 \le g \le \omega/(1 - \delta)$ and $\|f - g\| \le \eta$.

Remarks: By a simple re-scaling of the norms the constants 1 in Definition 2.2.1 and Theorem 2.2.1 can be replaced by any other fixed constants. The actual form given by Gowers is more explicit, in fact giving a specific choice of $\epsilon$. However, for our purposes, we only need such an $\epsilon$ that is independent of the size of $Y$. Also, for our purposes one may choose $\omega \equiv 1$ and $\delta = 1/2$.

The definition of a pseudo-random measure here is slightly stronger than that of Green and Tao, adapted to the higher dimensional setting. Let us begin with the one dimensional case. Following [12], we define a measure to be a non-negative function $\nu : \mathbb{Z}_N \to \mathbb{R}$ such that

\[
\mathbb{E}(\nu(x) : x \in \mathbb{Z}_N) = 1 + o(1),
\]

where the $o(1)$ notation means a quantity which tends to 0 as $N \to \infty$. A measure will be deemed pseudo-random if it satisfies two properties at a specific level. The first of these is known as the linear forms condition; as we will use only forms with integer coefficients, we need a slightly simplified version.

Definition 2.2.2. (Green-Tao [12]) Let $\nu$ be a measure, and $m_0, t_0 \in \mathbb{N}$ be small parameters. Then $\nu$ satisfies the $(m_0, t_0)$-linear forms condition if the following holds. For $m \le m_0$ and $t \le t_0$ arbitrary, suppose that $\{L_{i,j}\}_{1 \le i \le m,\, 1 \le j \le t}$ are integers, and that $b_i$ are arbitrary elements of $\mathbb{Z}_N$. Given $m$ linear forms $\phi_i : \mathbb{Z}_N^t \to \mathbb{Z}_N$ with

\[
\phi_i(x) = \sum_{j=1}^{t} L_{i,j} x_j + b_i,
\]

$x = (x_1, \dots, x_t)$ and $b = (b_1, \dots, b_m)$, if we have that each $\phi_i$ is nonzero and that they are pairwise linearly independent, then

\[
\mathbb{E}\Big( \prod_{i=1}^{m} \nu(\phi_i(x)) : x \in \mathbb{Z}_N^t \Big) = 1 + o(1), \tag{2.1}
\]

where the $o(1)$ term is independent of the choice of the $b_i$'s.

The next condition is referred to as the correlation condition.

Definition 2.2.3. Let $\nu$ be a measure.
Then $\nu$ satisfies the $(m_0, m_1)$-correlation condition if for every $1 \le m \le m_0$ there exists a function $\tau = \tau_m : \mathbb{Z}_N \to \mathbb{R}^+$ such that for all $k \in \mathbb{N}$

\[
\mathbb{E}(\tau^k(x) : x \in \mathbb{Z}_N) = O_{m,k}(1)
\]

and also

\[
\mathbb{E}\Big( \prod_{i=1}^{m_1} \prod_{j=1}^{m_0} \nu(\phi_i(y) + h_{i,j}) : y \in \mathbb{Z}_N^r \Big) \le \prod_{i=1}^{m_1} \Big( \sum_{1 \le j < j' \le m_0} \tau(h_{i,j} - h_{i,j'}) \Big),
\]

where the functions $\phi_i : \mathbb{Z}_N^r \to \mathbb{Z}_N$ are pairwise independent linear forms.

Remarks: This is a stronger condition than what is used in [12]; in fact they used the special case when $m_1 = 1$ and $\phi$ is the identity. We define below a $d$-pseudo-random measure to be a measure satisfying these conditions at specific levels.

Definition 2.2.4. We call a measure $\nu$ $d$-pseudo-random if $\nu$ satisfies the $((d^2 + 2d) 2^{d-1}, 2d^2 + d)$-linear forms condition and the $(d, 2^d)$-correlation condition.

We will deal with $d$-fold tensor products of measures, $\mu = \otimes_{i=1}^{d} \nu$, and call them $d$-measures. We call such a $d$-measure $\mu$ pseudo-random if the corresponding measure $\nu$ is $d$-pseudo-random. Finally, note that for a $d$-measure

\[
\mathbb{E}(\mu(x) : x \in \mathbb{Z}_N^d) = \prod_{i=1}^{d} \mathbb{E}(\nu(x_i) : x_i \in \mathbb{Z}_N) = 1 + o(1).
\]

2.3 The Generalized von Neumann Inequality.

Let $e = \{e_1, \dots, e_d\} \subseteq \mathbb{Z}_N^d$ be a basis of $\mathbb{Z}_N^d$ which is also in general position, which in this setting means that $|\pi_i(e \cup \{0\})| = d + 1$ for each $i$, where $\pi_i : \mathbb{Z}_N^d \to \mathbb{Z}_N$ is the orthogonal projection to the $i$-th coordinate axis.

Proposition 2.3.1. (Generalized von Neumann Inequality) Let $w = \otimes^d \nu$ be a pseudo-random $d$-measure. Given a function $0 \le f \le w$, we have that

\[
\Lambda f := \mathbb{E}(f(x) f(x + te_1) \cdots f(x + te_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N) = O(\|f\|_{\Box^d(e')}), \tag{2.2}
\]

where $e' = \{e_d, e_d - e_1, \dots, e_d - e_{d-1}\}$.

Proof. We shall apply the Cauchy-Schwarz inequality several times. Begin by writing

\[
\Lambda f \equiv \Lambda = \mathbb{E}\Big( f(x) \prod_{i=1}^{d} f(x + t_1 e_i) : x \in \mathbb{Z}_N^d,\ t_1 \in \mathbb{Z}_N \Big).
\]

Push through the summation on $t_1$ and split the $f$ to write this as

\[
\mathbb{E}\Big( \sqrt{f(x)}\; \mathbb{E}\Big( \sqrt{f(x)} \prod_{i=1}^{d} f(x + t_1 e_i) : t_1 \in \mathbb{Z}_N \Big) : x \in \mathbb{Z}_N^d \Big).
\]

Applying the Cauchy-Schwarz inequality, we get

\[
\Lambda^2 \le \mathbb{E}\Big( w(x) \prod_{i=1}^{d} f(x + t_1 e_i) \prod_{j=1}^{d} f(x + t_1 e_j + t_2 e_j) : t_1, t_2 \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big),
\]

where we have made the substitution $t_2 \to t_1 + t_2$ for the new variable.
Note that there should be an $\mathbb{E}(w(x)) = 1 + o(1)$ multiplier, following from the fact that $f \le w$ and from the linear forms condition, but for convenience we suppress it and will continue to do so (this is a big O result, so this is not of any consequence). We make one further substitution, $x \to x - t_1 e_1$, yielding
\[ \Lambda^2 \le \mathbb{E}\Big( w(x - t_1 e_1) \prod_{i=2}^{d} \prod_{\omega \in \{0,1\}} f\big(x + t_1 e_i^{(1)} + \omega t^{(1)} e_i\big) \times \prod_{\omega \in \{0,1\}} f\big(x + \omega t^{(1)} e_1\big) : t_1, t_2 \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big), \]
where we have introduced the notations $e_i^{(j)} = e_i - e_j$ and $t^{(i)} = \{t_{1+j}\}_{j=1}^{i}$. Note that the final product of this expression is independent of $t_1$.

We now repeat this procedure exactly, pushing through the $t_1$ sum and splitting the terms independent of $t_1$, followed by a change of variables. After $l$ applications of the Cauchy-Schwarz inequality, we claim to have
\[ \Lambda^{2^l} \le \mathbb{E}\Big( W_l(x, t_1, \ldots, t_{l+1}) \prod_{i=l+1}^{d} \prod_{\omega \in \{0,1\}^l} f\big(x + t_1 e_i^{(l)} + \omega t^{(l)} e_{i;l}\big) \times \prod_{\omega \in \{0,1\}^l} f\big(x + \omega t^{(l)} e_{l;l-1}\big) : t_1, \ldots, t_{l+1} \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big), \tag{2.3} \]
for an appropriate weight function $W_l$ which is a product of $w$'s, evaluated on linear forms which are pairwise linearly independent. The notations introduced here are $e_{i;l} = \{e_i, e_i^{(1)}, \ldots, e_i^{(l-1)}\}$ (note that $l > 1$), and
\[ \omega t^{(l)} e_{i;l} = \omega_1 t_2 e_i + \omega_2 t_3 e_i^{(1)} + \cdots + \omega_l t_{l+1} e_i^{(l-1)}. \]

To check this form, using induction, apply the Cauchy-Schwarz inequality one more time with the new variable $t_1 + t_{l+2}$ to get
\[ \Lambda^{2^{l+1}} \le \mathbb{E}\Big( W_l(x, t_1, \ldots, t_{l+1})\, W_l(x, t_1 + t_{l+2}, \ldots, t_{l+1}) \times \prod_{i=l+1}^{d} \prod_{\omega \in \{0,1\}^l} f\big(x + t_1 e_i^{(l)} + \omega t^{(l)} e_{i;l}\big)\, f\big(x + t_1 e_i^{(l)} + t_{l+2} e_i^{(l)} + \omega t^{(l)} e_{i;l}\big) \times \prod_{\omega \in \{0,1\}^l} w\big(x + \omega t^{(l)} e_{l;l-1}\big) : t_1, \ldots, t_{l+2} \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big). \]
Write
\[ W'_{l+1}(x, t_1, \ldots, t_{l+2}) = W_l(x, t_1, \ldots, t_{l+1})\, W_l(x, t_1 + t_{l+2}, \ldots, t_{l+1}) \times \prod_{\omega \in \{0,1\}^l} w\big(x + \omega t^{(l)} e_{l;l-1}\big). \]
We now apply the substitution $x \to x - t_1 e_{l+1}^{(l)}$, note that $e_i^{(l)} - e_{l+1}^{(l)} = e_i^{(l+1)}$, and set
\[ W_{l+1}(x, t_1, \ldots, t_{l+2}) = W'_{l+1}\big(x - t_1 e_{l+1}^{(l)}, t_1, \ldots, t_{l+2}\big). \tag{2.4} \]
This gives
\[ \Lambda^{2^{l+1}} \le \mathbb{E}\Big( W_{l+1}(x, t_1, \ldots, t_{l+2}) \times \prod_{i=l+2}^{d} \prod_{\omega \in \{0,1\}^{l+1}} f\big(x + t_1 e_i^{(l+1)} + \omega t^{(l+1)} e_{i;l+1}\big) \times \prod_{\omega \in \{0,1\}^{l+1}} f\big(x + \omega t^{(l+1)} e_{l+1;l}\big) : t_1, \ldots, t_{l+2} \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big), \]
and this is the form we wanted to obtain. After $d - 1$ iterations, one arrives at the form
\[ \Lambda^{2^{d-1}} \le \mathbb{E}\Big( W_{d-1}(x, t_1, \ldots, t_d) \times \prod_{\omega \in \{0,1\}^d} f\big(x + \omega t^{(d-1)} e_{d;d-1}\big) : t_1, \ldots, t_d \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big). \]
This may be written as
\[ \Lambda^{2^{d-1}} \le \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d} f\big(x + \omega t^{(d-1)} e_{d;d-1}\big) : t_2, \ldots, t_d \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big) + E, \]
where
\[ E = \mathbb{E}\Big( \big(W_{d-1}(x, t_1, \ldots, t_d) - 1\big) \times \prod_{\omega \in \{0,1\}^d} f\big(x + \omega t^{(d-1)} e_{d;d-1}\big) : t_1, \ldots, t_d \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big). \]
To see that the main term is in fact an appropriate box norm, notice that $e_{d;d-1} = \{e_d, e_d - e_1, \ldots, e_d - e_{d-1}\}$ is also in general position. To deal with the error term $E$, we apply the Cauchy-Schwarz inequality one more time to get
\[ E \le \mathbb{E}\Big( \big(W(x, t_2, \ldots, t_d) - 1\big)^2 \times \prod_{\omega \in \{0,1\}^d} w\big(x + \omega t^{(d)} e_{d;d-1}\big) : t_2, \ldots, t_{d+1} \in \mathbb{Z}_N,\ x \in \mathbb{Z}_N^d \Big), \]
where we have set $W(x, t_2, \ldots, t_d) = \mathbb{E}(W_{d-1}(x, t_1, t_2, \ldots, t_d) : t_1 \in \mathbb{Z}_N)$ and again used the fact that $f \le w$.

Now to show that $E = o(1)$, it is enough to show that the linear forms defining $W$ are pairwise independent, after of course expanding $(W - 1)^2$ and applying the linear forms condition. By following the construction of $W$, this amounts to showing that at each step $W_l$ satisfies pairwise independence, which itself reduces to showing that the coefficient of $x$ is 1 in each form and each form has a nonzero coefficient in $t_1$ (in each coordinate). To be more precise, the case $l = 1$ is immediate. Assuming this is so for $l$ fixed, then
\[ W'_{l+1}(x, t_1, \ldots, t_{l+2}) = W_l(x, t_1, \ldots, t_{l+1})\, W_l(x, t_1 + t_{l+2}, \ldots, t_{l+1}) \times \prod_{\omega \in \{0,1\}^l} w\big(x + \omega t^{(l)} e_{l;l-1}\big) \]
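certainly satisfies this, as the forms in $W_l(x, t_1, \ldots, t_{l+1})$ and $W_l(x, t_1 + t_{l+2}, \ldots, t_{l+1})$ are pairwise independent because the $t_1$ coefficient is non-zero, and $\prod_{\omega \in \{0,1\}^l} w(x + \omega t^{(l)} e_{l;l-1})$ is independent of $t_1$. The statement about the coefficient of $x$ is obvious. Also, it is not hard to see that the vector multiple of $t_1$ is either $e_{l+1}$ or $e_{l+1}^{(i)}$ (for forms appearing after $i$ applications of Cauchy-Schwarz). Thus the statement is true for $l + 1$. The fact that $E = o(1)$ then follows directly from the $(d(d+2)2^{d-1},\ d(2d+1))$ linear forms condition.

2.4 The Dual Function Estimate.

As before we assume that a basis $e = \{e_1, \ldots, e_d\} \subseteq \mathbb{Z}_N^d$ is given which is in general position. We will use the notation $\omega y e = \omega_1 y_1 e_1 + \cdots + \omega_d y_d e_d$, for $\omega \in \{0,1\}^d$ and $y \in \mathbb{Z}_N^d$. First we define the dual of a function $f : \mathbb{Z}_N^d \to \mathbb{R}$ with respect to the $\Box(e)^d$ norm.

Definition 2.4.1. Let $f : \mathbb{Z}_N^d \to \mathbb{R}$ be a given function and let $e = \{e_1, \ldots, e_d\} \subseteq \mathbb{Z}_N^d$ be a basis of $\mathbb{Z}_N^d$. The dual of the function $f$ is the function
\[ \mathcal{D}f(x) = \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0} f(x + \omega t e) : t \in \mathbb{Z}_N^d \Big). \tag{2.5} \]

Proposition 2.4.1. With $X$ and $\mathcal{D}$ as above, and $e$ in general position, we have
\[ \|\mathcal{D}f_1 \cdots \mathcal{D}f_K\|^{*}_{\Box(e)^d} \le C(K) \]
for any $f_1, \ldots, f_K \in X$.

Proof. We must show that $\langle f, \mathcal{D}f_1 \cdots \mathcal{D}f_K \rangle \le C_K \|f\|_{\Box(e)^d}$, by the definition of the dual norm. By applying the definition of $\mathcal{D}f$, the left hand side gives
\[ \langle f, \mathcal{D}f_1 \cdots \mathcal{D}f_K \rangle = \mathbb{E}\Big( f(x) \prod_{i=1}^{K} \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0} f_i(x + \omega t_i e) : t_i \in \mathbb{Z}_N^d \Big) : x \in \mathbb{Z}_N^d \Big). \]
Expanding out the products then gives the right hand side as
\[ \mathbb{E}\Big( \mathbb{E}\Big( f(x) \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0}\ \prod_{i=1}^{K} f_i(x + \omega t_i e + \omega t e) : x, t \in \mathbb{Z}_N^d \Big) : T = (t_1, \ldots, t_K) \in (\mathbb{Z}_N^d)^K \Big) \]
after a substitution $t_i \to t + t_i$ for each $i$ for some fixed $t$, and adding a redundant summation in $t$. Now we call $F_{(\omega,T)}(x) = \prod_{i=1}^{K} f_i(x + \omega t_i e)$ for non-zero $\omega$, and $F_{(0^d,T)}(x) = f(x)$. The last expression then becomes
\[ \mathbb{E}\big( \langle F_{(\omega,T)} : \omega \in \{0,1\}^d \rangle : T \in (\mathbb{Z}_N^d)^K \big). \]
By applying the $\Box(e)$-Cauchy-Schwarz inequality, we have arrived at
\[ \|\mathcal{D}f_1 \cdots \mathcal{D}f_K\|^{*}_{\Box(e)^d} \le \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0^d} \|F_{(\omega,T)}\|_{\Box(e)^d} : T \in (\mathbb{Z}_N^d)^K \Big). \]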
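An application of the Hölder inequality gives that the right hand side is bounded above by
\[ \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0^d} \mathbb{E}\Big( \|F_{(\omega,T)}\|^{2^d}_{\Box(e)^d} : T \in (\mathbb{Z}_N^d)^K \Big)^{1/2^d}, \]
where we added one factor of the constant 1 function, which has $L^q$-norm one for each $q$. Thus, we now just need to show that for a fixed $\omega \ne 0^d$ we have
\[ \mathbb{E}\Big( \|F_{(\omega,T)}\|^{2^d}_{\Box(e)^d} : T \in (\mathbb{Z}_N^d)^K \Big) = O_K(1) \]
for $T = (t_1, \ldots, t_K)$. We continue by expanding the last expression for a fixed $\omega \ne 0^d$, writing $\omega'$ for the index that runs over the cube:
\[ \mathbb{E}\Big( \|F_{(\omega,T)}\|^{2^d}_{\Box(e)^d} : T \in (\mathbb{Z}_N^d)^K \Big) = \mathbb{E}\Big( \prod_{\omega' \in \{0,1\}^d} \prod_{i=1}^{K} f_i(x + \omega t_i e + \omega' t e) : x, t, t_1, \ldots, t_K \in \mathbb{Z}_N^d \Big). \]
The right hand side factorizes as
\[ \mathbb{E}\Big( \prod_{i=1}^{K} \mathbb{E}\Big( \prod_{\omega' \in \{0,1\}^d} f_i(x + \omega y e + \omega' t e) : y \in \mathbb{Z}_N^d \Big) : x, t \in \mathbb{Z}_N^d \Big). \]
Applying the bound $f_i \le \mu$ gives
\[ \mathbb{E}\Big( \mathbb{E}^K\Big( \prod_{\omega' \in \{0,1\}^d} \mu(x + \omega y e + \omega' t e) : y \in \mathbb{Z}_N^d \Big) : x, t \in \mathbb{Z}_N^d \Big). \]
Since $\mu = \otimes^d \nu$, the inner sum is now split component-wise as
\[ \mathbb{E}\Big( \prod_{j=1}^{d} \prod_{\omega' \in \{0,1\}^d} \nu\big( (\omega y e)_j + (\omega' t e + x)_j \big) : y \in \mathbb{Z}_N^d \Big), \]
where the notation $(x)_j$ denotes the $j$th coordinate. The terms $(\omega y e)_j$ represent the linear forms $\sum_{s=1}^{d} \omega_s y_s (e_s)_j$, which satisfy the hypothesis in the $(d, 2^d)$ correlation condition by the assumptions on $e$. Hence we have
\[ \mathbb{E}\Big( \prod_{j=1}^{d} \prod_{\omega' \in \{0,1\}^d} \nu\big( (\omega y e)_j + (\omega' t e + x)_j \big) : y \in \mathbb{Z}_N^d \Big) \le \prod_{j=1}^{d} \sum_{\omega' \ne \omega''} \tau\big( ((\omega' - \omega'') t e)_j \big), \]
as the $(x)_j$ terms drop out in the subtraction. Plugging this bound back in gives
\[ \mathbb{E}\Big( \Big( \prod_{j=1}^{d} \sum_{\omega' \ne \omega''} \tau\big( ((\omega' - \omega'') t e)_j \big) \Big)^K : t \in \mathbb{Z}_N^d \Big). \]
Making use of the triangle inequality in $L^{dK}$, after another application of Hölder, reduces our task to bounding, for each $1 \le j \le d$ and each pair $\omega' \ne \omega''$, the moment
\[ \mathbb{E}\Big( \tau^{dK}\big( ((\omega' - \omega'') t e)_j \big) : t \in \mathbb{Z}_N^d \Big). \]
By the assumptions on $e$ and the fact that $\omega' - \omega'' \ne 0^d$, the form $((\omega' - \omega'') t e)_j$ provides a uniform cover of $\mathbb{Z}_N$, and we may reduce this to $\mathbb{E}(\tau^{dK}(t) : t \in \mathbb{Z}_N)$. This expression is $O_K(1)$.

2.5 Proof of the Main Results.

In this section we prove our main results under the assumption that the measure exhibited in [12] is $d$-pseudo-random, i.e. it satisfies Definition 2.2.4.

2.5.1 Proof of Theorem 1.2.2.

Let $e = \{e_1, \ldots, e_d\} \subseteq \mathbb{Z}_N^d$ be a basis which is in general position. For a function $f : \mathbb{Z}_N^d \to \mathbb{R}$ we define its dual by
\[ \mathcal{D}f(x) = \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^d,\ \omega \ne 0} f(x + \omega t e) : t \in \mathbb{Z}_N^d \Big). \]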
(2.6)

Then clearly
\[ \langle f, \mathcal{D}f \rangle = \|f\|^{2^d}_{\Box(e)^d}. \tag{2.7} \]
Let $\mu = \otimes^d \nu$ be a pseudo-random $d$-measure, and let $X$ be the set of functions $f$ on $\mathbb{Z}_N^d$ such that $|f| \le \mu$ pointwise.

Lemma 2.5.1. The norm $\|\cdot\|_{\Box(e)^d}$ is a quasi algebra predual (QAP) norm, with respect to the set $X$ and the operator $\mathcal{D}$.

Proof. We have already shown part (iii) of Definition 2.2.1, which was the content of Proposition 2.4.1. If $\|f\|_{\Box(e)^d} \le \varepsilon$ then $\langle f, \mathcal{D}f \rangle = \|f\|^{2^d}_{\Box(e)^d} \le \varepsilon^{2^d}$, and part (ii) follows. Finally, since $|f| \le \mu$ it follows that
\[ \langle f, \mathcal{D}f \rangle \le \|\mu\|^{2^d}_{\Box(e)^d} = 1 + o(1), \]
as the linear forms $(x + \omega t e)_j$ are pairwise linearly independent (for each $j$) and $\nu$ satisfies the linear forms condition.

We are in the position to apply the transference principle to decompose a function $0 \le f \le \mu$ into the sum of a bounded function $g$ and a function $h$ which has small contribution to the expression in (1.2).

Proof of Theorem 1.2.2. Let $\alpha > 0$ and let $0 \le f \le \mu$ be a function such that $\mathbb{E}f \ge \alpha$, where $\mu$ is a pseudo-random $d$-measure on $\mathbb{Z}_N^d$. We apply Theorem 2.2.1 with $Y = \mathbb{Z}_N^d$, $\delta = 1/2$ and $\eta > 0$. Note that since $\mu$ is a measure one has that $\|\mu\|_{L^1} = \mathbb{E}\mu = 1 + o(1)$. Since $\|\cdot\|_{\Box(e)^d}$ is a QAP norm with respect to the set $X = \{f : Y \to \mathbb{R},\ |f| \le \mu\}$, it follows that there is an $\varepsilon > 0$ such that if
\[ \|\mu - 1\|_{\Box(e)^d} < \varepsilon \tag{2.8} \]
then there is a decomposition $f = g + h$ such that
\[ 0 \le g \le 2 \quad \text{and} \quad \|h\|_{\Box(e)^d} < \eta. \tag{2.9} \]
Since $\mu$ is pseudo-random, $\|\mu - 1\|_{\Box(e)^d} = o(1)$, thus (2.8) holds for large enough $N$. Using this decomposition together with Theorem 1.2.1 and Proposition 2.3.1 one may write
\[ \mathbb{E}\big( f(x) f(x + t e_1) \cdots f(x + t e_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N \big) = \mathbb{E}\big( g(x) g(x + t e_1) \cdots g(x + t e_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N \big) + O\big( \|h\|_{\Box(e)^d} \big) \ge c(\alpha, e) - C_d\, \eta \ge c(\alpha, e)/2 \]
by choosing $\eta$ sufficiently small with respect to $\alpha$ and $e$. This proves Theorem 1.2.2.

2.5.2 Proof of Theorem 1.2.1.

Let us identify $[1, N]^d$ with $\mathbb{Z}_N^d$. First we show that constellations in $\mathbb{Z}_N^d$ defined by $e$ which are contained in a box $B \subseteq [1, N]^d$ of size $\varepsilon N$ are in fact genuine constellations contained in $B$. We say that $e = \{e_1, \ldots, e_d\} \in \mathbb{Z}^{d^2}$ is primitive if the segment $[0, e]$ does not contain any lattice points of $\mathbb{Z}^{d^2}$ other than its endpoints, where $e$ is considered as a lattice point in $\mathbb{Z}^{d^2}$. Let us also define the positive quantity $\tau(e)$ by
\[ \tau(e) = \inf_{m \notin \{0, e\},\ x \in [0, e]} \|m - x\|_{L^\infty}, \qquad \text{where } |x|_\infty = \max_{1 \le j \le d^2} |x_j| \]
and $m$ runs through the lattice points of $\mathbb{Z}^{d^2}$ other than 0 and $e$.

Lemma 2.5.2. Let $0 < \varepsilon < \tau(e)$. Let $N$ be sufficiently large, and let $B = I^d$ be a box of size $\varepsilon N$ contained in $[1, N]^d \simeq \mathbb{Z}_N^d$. If there exist $x \in \mathbb{Z}_N^d$ and $t \in \mathbb{Z}_N \setminus \{0\}$ such that $x \in B$ and $x + te \subseteq B$ as a subset of $\mathbb{Z}_N^d$, then there exists a scalar $t' \ne 0$ such that $x + t'e \subseteq B$ also as a subset of $\mathbb{Z}^d$. Moreover, if $e$ is primitive (and $1 \le t < N$) then one may take $t' = t$ or $t' = t - N$.

Proof. First, note that one can assume $e$ is primitive, as $x + te = x + ts e'$ for a fixed primitive $e'$ and $s \in \mathbb{N}$. By our assumption, there is an $x \in [1, N]^d$ and $t \in [1, N - 1]$ such that $x \in B$ and $x + t e_j \in B + (N\mathbb{Z})^d$ for all $1 \le j \le d$. Thus for each $j$ there exists $m_j \in \mathbb{Z}^d$ such that $\|t e_j - N m_j\|_{L^\infty} \le \varepsilon N$, and hence $|\lambda e - m|_\infty \le \varepsilon$, where $m = \{m_1, \ldots, m_d\} \in \mathbb{Z}^{d^2}$ and $\lambda = t/N$. Since $0 < \lambda < 1$ and $\varepsilon < \tau(e)$, this implies that $m = 0$ or $m = e$. If $m = 0$ then $|te|_\infty \le \varepsilon N$, and since $x \in B$ it follows that $x + te \subseteq B \subseteq \mathbb{Z}^d$. If $m = e$ then $\|(t - N) e_j\|_{L^\infty} \le \varepsilon N$, thus $x + (t - N)e \subseteq B \subseteq \mathbb{Z}^d$, so $x + t'e \subseteq B$ as a subset of $\mathbb{Z}^d$. This proves the lemma.

Let us briefly recall the pseudo-random measure $\nu$ defined in Sec. 9 of [12]. Let $w = w(N)$ be a sufficiently slowly growing function (choosing $w(N) = O(\log\log N)$ is sufficient, as in [12]) and let $W = \prod_{p \le w} p$ be the product of primes up to $w$. For given $b$ relatively prime to $W$ define the modified von Mangoldt function $\bar\Lambda_b$ by
\[ \bar\Lambda_b(n) = \begin{cases} \dfrac{\phi(W)}{W} \log(Wn + b) & \text{if } Wn + b \text{ is a prime;} \\ 0 & \text{otherwise,} \end{cases} \tag{2.10} \]
where $\phi$ is the Euler function. Note that by Dirichlet's theorem on the distribution of primes in residue classes one has that $\sum_{n \le N} \bar\Lambda_b(n) = N(1 + o(1))$.
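The normalisation of the modified von Mangoldt function can be checked numerically. The sketch below is only an illustration under assumed parameters that do not come from the text: it takes $w = 5$ (so $W = 30$, $\phi(W) = 8$), $b = 1$ and $N = 10^4$, and the helper names `primes_sieve` and `modified_von_mangoldt_sum` are hypothetical.

```python
import math

def primes_sieve(limit):
    """Return a bytearray flag with flag[i] == 1 exactly when i is prime (0 <= i <= limit)."""
    flag = bytearray([1]) * (limit + 1)
    flag[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if flag[i]:
            # Clear every multiple of i starting from i*i.
            flag[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flag

def modified_von_mangoldt_sum(N, W, b, phi_W):
    """Sum of (phi(W)/W) * log(W*n + b) over 1 <= n <= N with W*n + b prime."""
    flag = primes_sieve(W * N + b)
    total = 0.0
    for n in range(1, N + 1):
        m = W * n + b
        if flag[m]:
            total += math.log(m)
    return phi_W / W * total

# Illustrative (assumed) parameters: w = 5, hence W = 2 * 3 * 5 and phi(W) = 8.
W, phi_W, b, N = 30, 8, 1, 10_000
ratio = modified_von_mangoldt_sum(N, W, b, phi_W) / N
print(f"sum / N = {ratio:.4f}")  # should be close to 1 for large N
```

The point of the factor $\phi(W)/W$ is visible here: without it the sum would be of size $N W/\phi(W)$ rather than $N$.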
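Also, if $A \subseteq P^d$ is of positive relative density $\alpha$ and if $\bar\Lambda_b^d := \otimes^d \bar\Lambda_b$ is the $d$-fold tensor product of $\bar\Lambda_b$, then it is easy to see that there exists a $b$ such that
\[ \limsup_{N \to \infty}\ N^{-d} \sum_{x \in [1,N]^d} 1_A(x)\, \bar\Lambda_b^d(x) > \alpha/2. \tag{2.11} \]
We will fix such a $b$ and choose $N$ sufficiently large for which the expression in (2.11) is at least $\alpha/2$.

Let $R = N^{d^{-1} 2^{-d-5}}$ and recall the Goldston-Yıldırım divisor sum [12], [9]
\[ \Lambda_R(n) = \sum_{d \mid n,\ d \le R} \mu(d) \log(R/d), \]
$\mu$ being the Möbius function. For given small parameters $0 < \varepsilon_1 < \varepsilon_2 < 1$ (whose exact values will be specified later) recall the Green-Tao measure
\[ \nu(n) = \begin{cases} \dfrac{\phi(W)}{W}\, \dfrac{\Lambda_R(Wn + b)^2}{\log R} & \text{if } \varepsilon_1 N \le n \le \varepsilon_2 N; \\ 1 & \text{otherwise.} \end{cases} \tag{2.12} \]
Note that $\nu(n) \ge 0$ for all $n$, and also it is easy to see that for $N$ sufficiently large one has that
\[ \nu(n) \ge d^{-1} 2^{-d-6}\, \bar\Lambda_b(n) \tag{2.13} \]
for all $\varepsilon_1 N \le n \le \varepsilon_2 N$. Indeed, this is trivial unless $Wn + b$ is a prime. In that case, since $\varepsilon_1 N > R$, $\Lambda_R(Wn + b) = \log R = d^{-1} 2^{-d-5} \log N$ and (2.13) follows.

Proof of Theorem 1.2.1. Set $\mu = \otimes^d \nu$, and let
\[ g(x) := c_d\, \bar\Lambda_b^d(x)\, 1_A(x)\, 1_{[\varepsilon_1 N, \varepsilon_2 N]^d}(x) \qquad (c_d = d^{-d}\, 2^{-d^2 - 6d}). \tag{2.14} \]
Then by (2.13) one has that $g(x) \le \mu(x)$ for all $x \in \mathbb{Z}_+^d$. By (2.11) one may choose a sufficiently large number $N'$ for which
\[ (N')^{-d} \sum_{x \in [1,N']^d} 1_A(x)\, \bar\Lambda_b^d(x) > \alpha/2 \tag{2.15} \]
and a prime $N$ such that $\big(1 - \tfrac{\alpha}{100d}\big) N' \le \varepsilon_2 N \le N'$. If $\varepsilon_1$ is such that $\varepsilon_1/\varepsilon_2 \le \alpha/100d$, then by the Prime Number Theorem in arithmetic progressions
\[ (N')^{-d} \sum_{x \in [1,N']^d \setminus [\varepsilon_1 N, \varepsilon_2 N]^d} \bar\Lambda_b^d(x) \le \alpha/10. \tag{2.16} \]
It follows from (2.15) and (2.16) that
\[ N^{-d} \sum_{x \in [1,N]^d} g(x) \ge c_d\, N^{-d} \sum_{x \in [\varepsilon_1 N, \varepsilon_2 N]^d} 1_A(x)\, \bar\Lambda_b^d(x) \ge c_d\, \varepsilon_2^d\, \alpha/4. \tag{2.17} \]
Using the identification $[1, N]^d \simeq \mathbb{Z}_N^d$, one has that $\mathbb{E}(g(x) : x \in \mathbb{Z}_N^d) \ge \alpha'$ (with $\alpha' = c_d\, \varepsilon_2^d\, \alpha/4$), and $0 \le g(x) \le \mu(x)$ for all $x$. Thus, save for proving that the measure $\nu$ is $d$-pseudo-random, Theorem 1.2.2 implies that
\[ \mathbb{E}\big( g(x) g(x + t e_1) \cdots g(x + t e_d) : x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N \big) \ge c(\alpha', e) > 0. \]
Note that the contribution of trivial constellations, corresponding to $t = 0$, is at most $O(N^{-1} \log^d N)$, as $|\bar\Lambda_b^d| \le \log^d N$ uniformly on $[1, N]^d$.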
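The key fact behind (2.13), that the truncated divisor sum equals $\log R$ at primes larger than $R$, is easy to test on small numbers. The following is a minimal sketch with hypothetical helper names (`mobius`, `lambda_R`) and toy values of $R$ and $n$ chosen only for illustration; it is not the range of parameters used in the proof.

```python
import math

def mobius(n):
    """Moebius function computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def lambda_R(n, R):
    """Truncated divisor sum: sum over d | n with d <= R of mu(d) * log(R / d)."""
    return sum(mobius(d) * math.log(R / d)
               for d in range(1, int(R) + 1) if n % d == 0)

R = 10
# For a prime n > R the only divisor d <= R is d = 1, so the sum is log R.
for prime in (11, 13, 101):
    print(prime, lambda_R(prime, R), math.log(R))
# For composite n the Moebius signs produce cancellation.
print(12, lambda_R(12, R))
```

At a prime $n > R$ only the term $d = 1$ survives, which is exactly why $\nu$ dominates a constant multiple of $\bar\Lambda_b$ on $[\varepsilon_1 N, \varepsilon_2 N]$.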
Since the support of $g$ is contained in $A \cap [\varepsilon_1 N, \varepsilon_2 N]^d$, Lemma 2.5.2 implies that $A \cap [\varepsilon_1 N, \varepsilon_2 N]^d$ must contain genuine constellations of the form $\{x, x + t e_1, \ldots, x + t e_d\}$ as a subset of $\mathbb{Z}^d$. Choosing an infinite sequence of $N$'s, it follows that $A$ contains infinitely many constellations defined by $e$.

2.6 The Correlation Condition.

To complete the proof of Theorem 1.2.1, one needs to show that the measure $\nu$ defined in (2.12) satisfies both the linear forms conditions and the $(d, 2^d)$ correlation conditions given above. Since the measure $\nu$ is the same (apart from the slight change in the interval where $\nu \equiv 1$) as the one given in [12] (see Definition 9.3 there), the linear forms condition is already established in Prop. 9.8 in [12]. It turns out that the arguments given in [12] (see Prop. 9.6, Lemma 9.9 and Prop. 9.10) generalize in a straightforward manner to obtain the more general $(m_0, m_1)$ correlation condition for any given specific values of $m_0$ and $m_1$.

Proposition 2.6.1. For fixed $m_0, m_1$, there exists a function $\tau$ such that $\mathbb{E}\tau^k = O_k(1)$ and also
\[ \mathbb{E}\Big( \prod_{i=1}^{m_1} \prod_{j=1}^{m_0} \nu(\phi_i(y) + h_{i,j}) : y \in \mathbb{Z}_N^r \Big) \le \prod_{i=1}^{m_0} \Big( \sum_{1 \le j < j' \le m_0} \tau(h_{i,j} - h_{i,j'}) \Big), \tag{2.18} \]
where the $\phi_i : \mathbb{Z}_N^r \to \mathbb{Z}_N$ are pairwise linearly independent linear forms.

Let us first note that the arguments of Lemma 9.9 and Prop. 9.10 of [12] apply to our case, and it is enough to establish the following inequality (see Prop. 9.6 [12]):
\[ \mathbb{E}\Big( \prod_{i=1}^{m_1} \prod_{j=1}^{m_0} \Lambda_R^2\big( W(\phi_i(y) + h_{i,j}) + b \big) : y \in B \Big) \le C_M \Big( \frac{W \log R}{\phi(W)} \Big)^{M} \prod_{i=1}^{m_1} \prod_{p \mid \Delta_i} \big( 1 + O_M(p^{-1/2}) \big), \tag{2.19} \]
where $M = m_1 m_0$ and $B$ is a box of size at most $R^{10M}$. Moreover one can assume that $h_{i,j} \ne h_{i,j'}$ for all $i$ and $j \ne j'$. The next step is, following [12], to write the expression
\[ \mathbb{E}\Big( \prod_{i=1}^{M} \Lambda_R^2(\theta_i(y)) : y \in B \Big), \]
where $\theta_i = W\big( \phi_{\lfloor i/m_1 \rfloor}(y) + h_{\lfloor i/m_1 \rfloor,\ i\,(m_1)} \big) + b$ ($\lfloor x \rfloor$ is the floor function, $i\,(m_1)$ is $i$ modulo $m_1$), as a contour integral of the following form plus a small error:
\[ (2\pi i)^{-M} \int_{\Gamma_1} \cdots \int_{\Gamma_1} F(z, z') \prod_{j=1}^{M} \frac{R^{z_j + z'_j}}{z_j^2\, {z'_j}^2}\, dz_j\, dz'_j, \tag{2.20} \]
where $z = (z_1, \ldots, z_M)$, $z' = (z'_1, \ldots, z'_M)$, and the function $F(z, z')$ takes the form of an Euler product $F(z, z') = \prod_p E_p(z, z')$, where
\[ E_p(z, z') = \sum_{X, X' \subseteq [M]} \frac{(-1)^{|X| + |X'|}\, \omega_{X \cup X'}(p)}{p^{\sum_{j \in X} z_j + \sum_{j \in X'} z'_j}}. \]
The function $\omega$ relates this expression to the particular forms. Specifically,
\[ \omega_X(p) = \mathbb{E}\Big( \prod_{i \in X} 1_{\theta_i \equiv 0\ (p)} : x \in \mathbb{Z}_N^r \Big). \]

Lemma 2.6.1. (Local factor estimate). Set the intervals $I_i = [(i-1)m_1 + 1,\ i m_1]$ as a partition of $[M]$. For $\alpha \in I_i$, the homogeneous part of $\theta_\alpha$ is $W\phi_i$. Also, set
\[ \Delta_i = \prod_{j < j';\ j, j' \in I_i} |h_{i,j} - h_{i,j'}|. \]
The following estimates hold for $\omega_X(p)$:
1. If $p \le w(N)$, then $\omega_X(p) = 1$ if $|X| = 0$, and is 0 otherwise.
2. If $p > w(N)$ and $|X| = 0$, then $\omega_X(p) = 1$.
3. If $p > w(N)$ and $X \subseteq I_i$ is nonempty, we have $\omega_X(p) = p^{-1}$ when $|X| = 1$, and $\omega_X(p) \le p^{-1}$ when $|X| > 1$. In the latter case, if $p \nmid \Delta_i$, we have that $\omega_X(p) = 0$.
4. If $p > w(N)$ and $X \cap I_i \ne \emptyset$ and $X \cap I_{i'} \ne \emptyset$ for some $i \ne i'$, we have $\omega_X(p) \le p^{-2}$.

Proof. When $p \le w(N)$, then $W\phi_i + b \equiv b\ (p)$, giving the first result. The second statement is trivial. For the third statement, let us start with $X \subseteq I_i$ with $|X| = 1$. Then we have
\[ \mathbb{E}\big( 1_{W(\phi_i(y) + h_{i,j}) + b \equiv 0\ (p)} : y \in \mathbb{Z}_N^r \big) = p^{-1} \]
for any fixed $j$, proving the first part. The second part requires an estimate of
\[ \mathbb{E}\big( 1_{W(\phi_i(y) + h_{i,j}) + b \equiv 0\ (p)}\, 1_{W(\phi_i(y) + h_{i,j'}) + b \equiv 0\ (p)} : y \in \mathbb{Z}_N^r \big), \]
with $j \ne j'$. If $p \mid |h_{i,j} - h_{i,j'}|$, then we are left with simply a single equation (as $p \nmid W$), and may refer to the first part. When $p \nmid \Delta_i$, $\omega_X(p) = 0$, as $h_{i,j}$ is not congruent to $h_{i,j'}$ modulo $p$. For the last statement, we have the upper bound
\[ \mathbb{E}\big( 1_{W(\phi_i(y) + h_{i,j}) + b \equiv 0\ (p)}\, 1_{W(\phi_{i'}(y) + h_{i',j'}) + b \equiv 0\ (p)} : y \in \mathbb{Z}_N^r \big) \]
for some $i \ne i'$ and $j, j'$. The forms $\phi_i$ and $\phi_{i'}$ are linearly independent modulo $p$ (see the proof of Lemma 10.1 in [12]), hence we have the intersection of two distinct linear algebraic sets, which has size at most $p^{r-2}$.
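The four cases of the local factor estimate can be observed directly by brute force on a toy example. The sketch below is an illustration under assumptions of our own: it averages over $(\mathbb{Z}/p\mathbb{Z})^r$ instead of $\mathbb{Z}_N^r$, and the values $W = 6$, $b = 1$, $p = 7$, $r = 2$ and the two forms are hypothetical choices, as are the helper names `theta` and `omega`.

```python
from fractions import Fraction
from itertools import product

def theta(phi, h, W, b):
    """Affine form y -> W*(phi . y + h) + b, stored as (coefficients, constant)."""
    return ([W * c for c in phi], W * h + b)

def omega(forms, p, r):
    """Exact density of y in (Z/pZ)^r at which every given form vanishes mod p."""
    hits = sum(
        1
        for y in product(range(p), repeat=r)
        if all((sum(c * yi for c, yi in zip(coeffs, y)) + const) % p == 0
               for coeffs, const in forms)
    )
    return Fraction(hits, p ** r)

# Hypothetical toy data: W = 6 (so w = 3), b = 1, a prime p = 7 > w, r = 2 variables.
W, b, p, r = 6, 1, 7, 2
phi1, phi2 = (1, 2), (1, 3)   # linearly independent modulo 7

single = omega([theta(phi1, 0, W, b)], p, r)
distinct_shifts = omega([theta(phi1, 0, W, b), theta(phi1, 1, W, b)], p, r)
congruent_shifts = omega([theta(phi1, 0, W, b), theta(phi1, 7, W, b)], p, r)
two_forms = omega([theta(phi1, 0, W, b), theta(phi2, 3, W, b)], p, r)
print(single, distinct_shifts, congruent_shifts, two_forms)  # prints: 1/7 0 1/7 1/49
```

The four printed values correspond, in order, to a single form (density $p^{-1}$), the same form with shifts that differ modulo $p$ (density 0), the same form with shifts congruent modulo $p$ (density $p^{-1}$ again), and two independent forms (density $p^{-2}$).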
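The terms $E_p$ in the Euler product can be separated as
\[ E_p(z, z') = 1 - 1_{p > w(N)} \sum_{j=1}^{M} \big( p^{-1-z_j} + p^{-1-z'_j} - p^{-1-z_j-z'_j} \big) + \sum_{i=1}^{m_1} 1_{p > w(N);\ p \mid \Delta_i}\, \lambda_p^{(i)}(z, z') + \sum_{\substack{X \cup X' \not\subset I_\alpha,\ \alpha \in [m_1] \\ |X \cup X'| > 1}} \frac{O_M(p^{-2})}{p^{\sum_X z_j + \sum_{X'} z'_j}}, \]
where
\[ \lambda_p^{(i)}(z, z') = \sum_{X \cup X' \subset I_i;\ |X \cup X'| > 1} \frac{O_M(p^{-1})}{p^{\sum_X z_j + \sum_{X'} z'_j}}. \]
We define the terms
\[ E_p^{(0)} = 1 + \sum_{i=1}^{m_1} 1_{p > w(N);\ p \mid \Delta_i}\, \lambda_p^{(i)}(z, z'), \]
and factorize $E_p = E_p^{(0)} E_p^{(1)} E_p^{(2)} E_p^{(3)}$ as follows:
\[ E_p^{(1)} = \big( E_p^{(0)} \big)^{-1} \times \frac{E_p}{\prod_{j=1}^{M} (1 - 1_{p > w(N)}\, p^{-1-z_j})(1 - 1_{p > w(N)}\, p^{-1-z'_j})(1 - 1_{p > w(N)}\, p^{-1-z_j-z'_j})^{-1}}, \]
\[ E_p^{(2)} = \prod_{j=1}^{M} (1 - 1_{p \le w(N)}\, p^{-1-z_j})^{-1} (1 - 1_{p \le w(N)}\, p^{-1-z'_j})^{-1} (1 - 1_{p \le w(N)}\, p^{-1-z_j-z'_j}), \]
and
\[ E_p^{(3)} = \prod_{j=1}^{M} (1 - p^{-1-z_j})(1 - p^{-1-z'_j})(1 - p^{-1-z_j-z'_j})^{-1}. \]
Also set $G_i = \prod_p E_p^{(i)}$, noting that
\[ G_3 = \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)}{\zeta(1 + z_j)\, \zeta(1 + z'_j)}. \]

The following is the analogue of Lemma 10.6 in [12]. To state it, let us recall the domain $D_\sigma^M$ to be the set
\[ \{ z_j, z'_j : \Re z_j,\ \Re z'_j \in (-\sigma, 100),\ 1 \le j \le M \}. \]
We also have norms for $f$ analytic on $D_\sigma^M$, denoted $\|f\|_{C^k(D_\sigma^M)}$, given by
\[ \|f\|_{C^k(D_\sigma^M)} = \sup \Big\| \Big( \frac{\partial}{\partial z_1} \Big)^{\alpha_1} \cdots \Big( \frac{\partial}{\partial z_M} \Big)^{\alpha_M} \Big( \frac{\partial}{\partial z'_1} \Big)^{\alpha'_1} \cdots \Big( \frac{\partial}{\partial z'_M} \Big)^{\alpha'_M} f \Big\|_{L^\infty(D_\sigma^M)}, \]
where the supremum is taken over all $\alpha_1, \ldots, \alpha_M, \alpha'_1, \ldots, \alpha'_M$ whose sum is at most $k$.

Lemma 2.6.2. Let $0 < \sigma = 1/(6M)$. Then the Euler products $G_i$ are absolutely convergent for $i = 0, 1, 2$ in the domain $D_\sigma^M$, and hence represent analytic functions on this domain. We also have the estimates
\[ \|G_0\|_{C^r(D_\sigma^M)} = O_M\big( \log(R)/\log\log(R) \big)^r \prod_{p \mid \prod_{i=1}^{m_1} \Delta_i} \big( 1 + O_M(p^{2M\sigma - 1}) \big), \]
\[ \|G_0\|_{C^M(D^M_{1/6M})} \le \exp\big( O_M(\log^{1/3}(R)) \big), \]
\[ \|G_1\|_{C^M(D^M_{1/6M})} \le O_M(1), \]
\[ \|G_2\|_{C^M(D^M_{1/6M})} \le O_{M, w(N)}(1), \]
\[ G_0(0, 0) = \prod_{i=1}^{m_1} \prod_{p \mid \Delta_i} \big( 1 + O_M(p^{-1/2}) \big), \]
\[ G_1(0, 0) = 1 + o_M(1), \]
\[ G_2(0, 0) = (W/\phi(W))^M, \]
where the first bound is for all $0 \le r \le M$.

Proof. The estimates proceed exactly as in Lemma 10.3 and Lemma 10.6 in [12] with $\Delta = \prod_{i=1}^{m_1} \Delta_i$, barring the statement about $G_0(0, 0)$. To see this, we have
\[ G_0(0, 0) = \prod_{p \mid \Delta} E_p^{(0)} = \prod_{p \mid \Delta} \Big( 1 + \sum_{i=1}^{m_1} \lambda_p^{(i)}(0, 0) \Big) \le \prod_{i=1}^{m_1} \prod_{p \mid \Delta_i} \big( 1 + |\lambda_p^{(i)}(0, 0)| \big), \]
and we crudely have $1 + |\lambda_p^{(i)}(0, 0)| = 1 + O_M(p^{-1/2})$.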
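The identification of $G_3$ with a quotient of zeta values is just the Euler product for $\zeta$ applied factor by factor, and it can be sanity-checked numerically. The sketch below is only illustrative: it takes $M = 1$ and the real point $z = z' = 1$ (chosen for fast convergence, well away from the neighbourhood of $0$ that matters in the lemma), truncates the product at an assumed bound of $10^5$, and hard-codes the value of $\zeta(3)$.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning the list of primes <= limit."""
    flag = bytearray([1]) * (limit + 1)
    flag[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if flag[i]:
            flag[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if flag[i]]

def local_factor(p, z, zp):
    """The M = 1 factor (1 - p^(-1-z)) (1 - p^(-1-z')) / (1 - p^(-1-z-z'))."""
    return (1 - p ** (-1 - z)) * (1 - p ** (-1 - zp)) / (1 - p ** (-1 - z - zp))

z = zp = 1.0
partial_product = math.prod(local_factor(p, z, zp) for p in primes_up_to(100_000))

zeta2 = math.pi ** 2 / 6          # zeta(1 + z) = zeta(2)
zeta3 = 1.2020569031595943        # zeta(1 + z + z') = zeta(3), hard-coded constant
closed_form = zeta3 / (zeta2 * zeta2)

print(partial_product, closed_form)
```

The truncated product and the closed form should agree to several decimal places, the discrepancy coming only from the primes above the truncation point.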
The expression in (2.20) takes the form
\[ (2\pi i)^{-M} \int_{\Gamma_1} \cdots \int_{\Gamma_1} G(z, z') \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)\, R^{z_j + z'_j}}{\zeta(1 + z_j)\, \zeta(1 + z'_j)\, z_j^2\, {z'_j}^2}\, dz_j\, dz'_j \]
with $G = G_0 G_1 G_2$. To estimate it, let us recall the following general result on contour integration from [12], see Lemma 10.4 there.

Lemma 2.6.3. (Goldston-Yıldırım [12][5]) Let $R$ be a positive number. If $G(z, z')$ is analytic in the $2M$ variables on $D_\sigma^M$ for some $\sigma > 0$, and we have the estimate
\[ \|G\|_{C^k(D_\sigma^M)} = \exp\big( O_{M,\sigma}(\log^{1/3}(R)) \big), \]
then
\[ (2\pi i)^{-M} \int_{\Gamma_1} \cdots \int_{\Gamma_1} G(z, z') \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)\, R^{z_j + z'_j}}{\zeta(1 + z_j)\, \zeta(1 + z'_j)\, z_j^2\, {z'_j}^2}\, dz_j\, dz'_j = G(0, \ldots, 0) \log^M(R) + \sum_{j=1}^{M} O_{M,\sigma}\big( \|G\|_{C^j(D_\sigma^M)} \big) \log^{M-j}(R) + O_{M,\sigma}\big( \exp(-\delta \log(R)) \big) \]
for some $\delta > 0$.

Estimate (2.19) follows easily by applying Lemma 2.6.3 (with $\sigma = 1/6M$) to $G = G_0 G_1 G_2$ using Lemma 2.6.2, which in turn implies Proposition 2.6.1, where the function $\tau$ is defined precisely as in [12]. This finishes the proof of Theorem 1.2.1.

Chapter 3

3.1 Introduction.

The main goal for this portion of our work is to provide the analogue of Theorem 1.1.3 for a general integral quadratic form. As previously noted, we are applying the circle method of Hardy and Littlewood. The minor arcs are dealt with in Section 3.2, which is done in two separate cases. The methods for the major arcs are standard, and are worked out in Section 3.3. Section 3.4 is dedicated to the singular series. The implications stated in Chapter 1 are dealt with in the final section.

3.2 The Minor Arcs

3.2.1 Sufficiently Off Diagonal Forms

For this section, we make the stronger assumption that $A$ has an $m_1$ by $m_2$ off-diagonal block, say $B$, of rank at least $R$, which we shall determine later. The ability to handle this scenario was first noticed by Liu [15]. We set
\[ T(r) = \sum_{x_1, \ldots, x_n = 1}^{N} \Lambda(x_1) \cdots \Lambda(x_n)\, e(Q(x) r). \tag{3.1} \]
One may write this in the form
\[ T(r) = \sum_{y = (x_1, \ldots, x_{m_1})}\ \sum_{z = (x_{m_1+1}, \ldots, x_n)} F(y)\, G(z)\, e(Q(y, z) r). \]
We use $F$ and $G$ simply as shorthand for the corresponding products of the von Mangoldt function.
With two applications of the Cauchy-Schwarz inequality, and the fact that $\sum_{x=1}^{N} \Lambda(x)^2 \ll N \log N$, we have the Weyl-type inequality
\[ |T(r)|^4 \ll N^{3n} (\log N)^{2n} \times \sum_{h \in [-N,N]^{m_1},\ l \in [-N,N]^{m_2}} e\big( 2 \langle l, Bh \rangle r \big) = N^{3n} (\log N)^{2n}\, V(r). \tag{3.2} \]
Writing $w = (h, l)$, we have that $2\langle l, Bh \rangle = \langle w, A'w \rangle$, where $A'$ is obtained from $A$ by making all entries $a_{ij}$ zero when both $i \le m_1$ and $j \le m_1$, or both $i > m_1$ and $j > m_1$. In other words $A'$ consists of the off-diagonal block $B$ and its transpose $B^T$, hence it has rank $2R$. Let us define the set of major arcs according to a parameter $0 < \theta < 1$ as
\[ \mathfrak{M}(\theta) = \bigcup_{1 \le q \le N^\theta}\ \bigcup_{(a,q)=1} \mathfrak{M}_{a,q}(\theta), \qquad \text{where } \mathfrak{M}_{a,q}(\theta) = \{ r : 2|qr - a| \le N^{-2+\theta} \}, \]
and the minor arcs are simply the complement of the major arcs. Then it is fairly standard (see e.g. Lemma 3.3 [3] and Lemma 3.2 in [6]) that one has the estimate, for $r \notin \mathfrak{M}(\theta)$,
\[ |V(r)| \le C_n (\log N)^n N^{n - R\theta}. \]
Thus we have shown

Lemma 3.2.1. Suppose $A$ is a symmetric $n \times n$ matrix which has an $m_1 \times m_2$ off-diagonal block of rank $R$. Then for $r \notin \mathfrak{M}(\theta)$, we have
\[ |T(r)| \le C_n (\log N)^n N^{n - \frac{R\theta}{4}}. \tag{3.3} \]

A much more precise formulation of this result is given by Liu [15] in his treatment of this case. We will assume from now on that $R \ge 9$, and fix $\theta = \theta_1$ such that $R\theta_1 > 8$. Then in particular $T(r) = O(N^{n-2-\delta})$ for some fixed $\delta > 0$ for $r \notin \mathfrak{M}(\theta_1)$. We will now use a "sliding scale" argument due to Birch (see Lemma 4.4 [3]) to reduce the major arcs to those corresponding to a value $\theta$ such that $N^\theta \approx (\log N)^C$, while keeping the error terms of size $O(N^{n-2} (\log N)^{-c})$ for some fixed constant $c$ which depends on $n$. To do that we will use the fact $|\mathfrak{M}(\theta)| \le N^{-2+2\theta}$, which is immediate from the definition. We set up a sequence $\theta_1, \ldots, \theta_t$ such that $9\theta_1 > 8$ and $\theta_{i+1} = \tfrac{17}{18}\theta_i$. For $R \ge 9$ this gives $\tfrac{R}{4}\theta_{i+1} \ge \tfrac{9}{4} \cdot \tfrac{17}{18}\theta_i = \tfrac{17}{8}\theta_i$, which ensures that $2\theta_i - \tfrac{R}{4}\theta_{i+1} \le -\tfrac{\theta_i}{8}$, thus by (3.3)
\[ \int_{\mathfrak{M}(\theta_i) \setminus \mathfrak{M}(\theta_{i+1})} |T(r)|\, dr \ll (\log N)^n N^{n - \frac{R}{4}\theta_{i+1}}\, |\mathfrak{M}(\theta_i)| \le (\log N)^n N^{n - 2 - \frac{\theta_i}{8}}. \tag{3.4} \]
Now if we fix $\theta_t$ such that $N^{\theta_t} \approx (\log N)^c$ for some fixed constant $c > 0$, then the number of steps $t$ is of order $\log\log N$, thus we have shown

Lemma 3.2.2.
Assume that the matrix $A$ has an off-diagonal block of rank $R \ge 9$. Let $c > 0$ be fixed, and let $0 < \theta_t < 1$ be such that $N^{\theta_t} \approx (\log N)^c$. Then one has the minor arcs estimate
\[ \int_{\mathfrak{m}(\theta_t)} |T(r)|\, dr = O\Big( N^{n-2}\, \frac{\log\log N}{(\log N)^C} \Big), \tag{3.5} \]
with $C = \tfrac{c}{8} - n$, assuming $N$ is sufficiently large.

3.2.2 Insufficiently Off Diagonal Forms

We decompose the matrix $A$ which defines $Q$ into the following form:
\[ A = \begin{pmatrix} a & l_1 & l_2 \\ l_1 & A_1 & B \\ l_2 & B^t & A_2 \end{pmatrix}. \]
Here $l_1, l_2$ are vectors and $A_1, A_2, B$ are matrices, and of course $a$ comes from the pure quadratic term (which we assume is for $x_1$). Then we write
\[ Q(x) = a x_1^2 + 2 x_1 L_1(y) + 2 x_1 L_2(z) + Q_1(y) + Q_2(z) + 2 By \cdot z, \]
where we have decomposed $\mathbb{Z}^n = \mathbb{Z} \times \mathbb{Z}^{m_1} \times \mathbb{Z}^{m_2}$ ($x = (x_1, y, z)$).

The first thing we need to discuss is the decomposition of $A$, which is accomplished once we pick $A_1$. If we had any such decomposition giving $B$ rank larger than 8 then we may use the previous section, so we assume $\mathrm{rank}(B) \le 8$. If we assume that $A$ has overall rank $R \ge 34$, then we can choose $m_1$ such that the matrix $(A_1\ \ B)$ from the above form has rank precisely 20. Then the rank of $A_1$ is at least $20 - 8 = 12$. It follows that the rank of $(B^t\ \ A_2)$ is at least $R - 20 - 2 \ge 12$. So we have that the rank of the matrix $A_2$ is at least $R - 22 - 8 \ge 4$. So assuming $R \ge 34$ gives the ability to select $A_1$ with rank $R_{A_1} \ge 12$, and $A_2$ with rank $R_{A_2} \ge 4$.

For now let us fix a generic minor arc $\mathfrak{m}$, and look at the integral
\[ I_{\mathfrak{m}} := \int_{\mathfrak{m}} \sum_{(x_1, y, z) \in [N] \times [N]^{m_1} \times [N]^{m_2}} \Lambda(x_1) F(y) G(z) \times e\big( (a x_1^2 + x_1 L_1(y) + x_1 L_2(z) + Q_1(y) + Q_2(z) + By \cdot z)\, r \big)\, dr. \tag{3.6} \]
We partition the sum in the integral along the level sets of the linear forms $L_1(y)$, $L_2(z)$, and $By$. Then we have
\[ \sum_{t_1, t_2, t_3} \int_{\mathfrak{m}}\ \sum_{x_1 \in [N],\ y \in L_1^{-1}(t_1) \cap B^{-1}(t_3),\ z \in L_2^{-1}(t_2)} \Lambda(x_1) F(y) G(z) \times e\big( (a x_1^2 + t_1 x_1 + t_2 x_1 + Q_1(y) + Q_2(z) + t_3 \cdot z)\, r \big)\, dr, \tag{3.7} \]
where $L_1^{-1}(t_1) = \{ y \in [N]^{m_1} : L_1(y) = t_1 \}$, $L_2^{-1}(t_2) = \{ z \in [N]^{m_2} : L_2(z) = t_2 \}$ and $B^{-1}(t_3) = \{ y \in [N]^{m_1} : B(y) = t_3 \}$. Note that since the rank of $B$ is $R_B \le 8$, $t_3$ runs through $\Gamma_B \cap [-CN, CN]^{m_2}$, where $\Gamma_B = B(\mathbb{Z}^{m_1})$ is a sublattice of rank $R_B$.
First let us assume the generic case, when the linear form $L_1(y)$ is linearly independent of the forms defining $By$. Otherwise, the value $t_3$ would uniquely determine $t_1$, so we would not need to restrict to the level set of the form $L_1(y)$, a case we will get back to later. Similarly, we assume first that $L_2(z)$ is not identically zero. The innermost sums now split into a product. Call the $x_1$ sum $S_0$, the $y$ sum $S_1$, and the $z$ sum $S_2$, and we have the form
\[ \sum_{t_1, t_2, t_3} \int_{\mathfrak{m}} S_0(r, t_1, t_2)\, S_1(r, t_1, t_3)\, S_2(r, t_2, t_3)\, dr := \sum_{t_1, t_2, t_3} U(t_1, t_2, t_3). \]
We have the simple bound
\[ U(t_1, t_2, t_3) \le \|S_0(\cdot, t_1, t_2)\|_{L^\infty(\mathfrak{m})}\, \|S_1(\cdot, t_1, t_3)\|_{L^2(\mathbb{T})}\, \|S_2(\cdot, t_2, t_3)\|_{L^2(\mathbb{T})}, \]
where $\mathbb{T}$ denotes $\mathbb{R}/\mathbb{Z}$. If t1 + t2 = 0, then we may apply Hua's bound on $S_0$ (see e.g. Lemma 10.8 [14]). If we have t1 + t2 = 0, then the following argument may be rerun to give a power gain. Let us assume then that we have t1 + t2 = 0. Then we may choose the parameter $c$ defining the minor arcs such that
\[ \|S_0(\cdot, t_1, t_2)\|_{L^\infty(\mathfrak{m})} \ll N (\log N)^{-C} \tag{3.8} \]
on $\mathfrak{m}$ for any given constant $C$, uniformly in $t_1$ and $t_2$. It now follows from the Cauchy-Schwarz inequality, and the fact that the parameters $(t_1, t_2, t_3)$ can take $O(N^{R_B + 2})$ values, that
\[ |I_{\mathfrak{m}}|^2 \ll N^{R_B + 4} (\log N)^{-C} \sum_{t_1, t_2, t_3} \|S_1(\cdot, t_1, t_3)\|^2_{L^2(\mathbb{T})}\, \|S_2(\cdot, t_2, t_3)\|^2_{L^2(\mathbb{T})}. \tag{3.9} \]
For fixed $t_1, t_2, t_3$, the $L^2$ estimates are the weighted number of solutions in the primes to the systems of equations
\[ Q_1(y) = Q_1(y'), \qquad L_1(y) = L_1(y') = t_1, \qquad By = By' = t_3, \]
and
\[ Q_2(z) + t_3 \cdot z = Q_2(z') + t_3 \cdot z', \qquad L_2(z) = L_2(z') = t_2. \]
If we sum these over $t_1$ and $t_2$ then the systems become
\[ Q_1(y) = Q_1(y'), \qquad L_1(y) = L_1(y'), \qquad By = By' = t_3, \]
and
\[ Q_2(z) + t_3 \cdot z = Q_2(z') + t_3 \cdot z', \qquad L_2(z) = L_2(z'). \]
Let $u(t_3)$ and $v(t_3)$ denote the number of solutions to these systems over the natural numbers of size at most $N$. Then $u(t_3) v(t_3)$ is the number of solutions to the system of equations
\[ \begin{aligned} Q_1(y) &= Q_1(y') \\ L_1(y) &= L_1(y') \\ By &= By' = t_3 \\ Q_2(z) + By \cdot z &= Q_2(z') + By' \cdot z' \\ L_2(z) &= L_2(z'). \end{aligned} \]
The sum over $t_3$ reduces this to the number of solutions of
\[ \begin{aligned} Q_1(y) &= Q_1(y') \\ L_1(y) &= L_1(y') \\ By &= By' \\ Q_2(z) + By \cdot z &= Q_2(z') + By' \cdot z' \\ L_2(z) &= L_2(z'), \end{aligned} \tag{3.10} \]
which we denote by $W$. Since the weights are at most $\log N$, the integral over the minor arcs is then bounded above by
\[ |I_{\mathfrak{m}}|^2 \ll N^{R_B + 4} (\log N)^{-C + n}\, W. \]
The following, and a few additional remarks, finish the argument for Lemma 3.2.4 below.

Lemma 3.2.3. We have the bound $W \ll N^{2n - R_B - 8}$.

Proof. We will use the well-known fact (see e.g. [18]) that if $Q(x)$ is an integral quadratic form of rank at least 5 in $n$ variables and if $v \in \mathbb{Z}^n$, then the number of solutions of the equation $Q(x) + v \cdot x = 0$ in $[-N, N]^n$ is $O(N^{n-2})$. Now for the system (3.10), we have that $Q_1(y) - Q_1(y') = 0$, that is $Q'(y, y') = 0$ with the quadratic form $Q'$ of rank twice the rank of $A_1$, so it is at least 24 by our construction. Now restricting $Q'(y, y')$ to the subspace $M$ defined by the linear equations $L_1(y) - L_1(y') = 0$, $By - By' = 0$, which by our assumptions is a subspace of codimension $R_B + 1 \le 9$ in $\mathbb{R}^{2m_1}$, its rank is still at least $2R_{A_1} - 18 \ge 6$. Thus the number of solutions $(y, y') \in M \cap [N]^{2m_1}$ is $O(N^{2m_1 - R_B - 3})$, where the implicit constant depends only on the coefficients of the matrix $A$.

Next, fix a solution $(y, y')$ and consider the number of pairs $(z, z') \in [-N, N]^{2m_2}$ for which $Q_2(z) - Q_2(z') + By \cdot z - By' \cdot z' = 0$ and $L_2(z) - L_2(z') = 0$. Since the rank of the form $Q_2(z) - Q_2(z')$ is $2R_{A_2} \ge 8$, it follows that its restriction to the hyperplane $\{L_2(z) - L_2(z') = 0\}$ has rank at least 6. Thus the number of solutions $(z, z')$ is of magnitude $O(N^{2m_2 - 3})$, where the implicit constant depends only on the matrix $A$. Since $n = m_1 + m_2 + 1$, this yields
\[ W \ll N^{2m_1 + 2m_2 - R_B - 6} = N^{2n - R_B - 8}. \]

For the case when $L_1(y)$ is linearly dependent on $By$, that is, $L_1(y) = By \cdot \gamma$ for some fixed rational vector $\gamma$, we only need to restrict the summation along the level sets of $By$ and $L_2(z)$.
Thus one has
\[ |I_{\mathfrak{m}}| \le \sum_{t_2, t_3} \int_{\mathfrak{m}}\ \sum_{x_1 \in [N],\ y \in B^{-1}(t_3),\ z \in L_2^{-1}(t_2)} \Lambda(x_1) F(y) G(z) \times e\big( (a x_1^2 + t_3 \cdot \gamma\, x_1 + t_2 x_1 + Q_1(y) + Q_2(z) + t_3 \cdot z)\, r \big)\, dr, \tag{3.11} \]
and the rest of the analysis goes along the same lines. Similarly, if $L_2(z)$ is identically 0, then there is of course no need for the parameter $t_2$. We now have achieved

Lemma 3.2.4. Assume that the matrix $A$ has rank $R \ge 34$. Let $C > 0$ be a fixed constant. If $c > 0$ is a fixed constant, sufficiently large with respect to $C$ and $N$, and if $0 < \theta < 1$ is such that $N^\theta = (\log N)^c$, then one has the minor arcs estimate
\[ \int_{\mathfrak{m}(\theta)} |T(r)|\, dr \ll N^{n-2} (\log N)^{-C}. \tag{3.12} \]

3.3 The Major Arcs and an Asymptotic Formula

The major arcs are now a union of intervals of the form $\mathfrak{M}_{a,q}((\log N)^c)$ ($q \le (\log N)^c$), where $c$ is given by Lemma 3.2.4 and is fixed throughout this section. For a fixed $a, q$ we look at the exponential sum $T$, and as the major arcs are small, we may use any approximation that has a logarithmic gain in the error. To start we fix a $q \le (\log N)^c$ and some $a \in \mathbb{Z}_q^*$. We follow the standard arguments, albeit with a slightly different look. Write
\[ \begin{aligned} T(r) &= \sum_{x \in [N]^n} F(x)\, e(Q(x) r) \\ &= \sum_{s \in \mathbb{Z}_q^n}\ \sum_{x \in [N]^n} 1_{x \equiv s\ (q)}\, F(x)\, e(aQ(s)/q)\, e(Q(x)\tau) \\ &= \sum_{s \in \mathbb{Z}_q^n} e(Q(s)a/q) \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\psi_s(z), \end{aligned} \tag{3.13} \]
where we have set $\tau = r - a/q$, and $\psi_s(z) = \psi_{s_1}(z_1) \cdots \psi_{s_n}(z_n)$ for
\[ \psi_l(y) = \sum_{t \equiv l\ (q),\ t \le y} \Lambda(t), \]
and $\mathcal{B}$ is $[0, 1]^n \subset \mathbb{R}^n$.

Lemma 3.3.1. On each major arc $\mathfrak{M}_{a,q}((\log N)^c)$, the following holds: Fix a constant $C > 2c$. For each $s \in \mathbb{Z}_q^n$ we have
\[ \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\psi_s(z) = 1_{s \in (\mathbb{Z}_q^*)^n}\, \phi(q)^{-n} \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, dz + O\big( N^n (\log N)^{c - C/2} \big). \tag{3.14} \]

Proof. Define for a fixed $l$ the one dimensional signed measure $d\nu_l = d\psi_l - d\omega_l$, where $d\omega_l$ is the Lebesgue measure divided by the totient $\phi(q)$ if $l \in \mathbb{Z}_q^*$, and zero otherwise. For a continuous function $f$ and $l \in \mathbb{Z}_q^*$ one then has
\[ \int_{[0,N]} f(z)\, d\nu_l(z) = \sum_{x \in [N],\ x \equiv l\ (q)} \Lambda(x) f(x) - \phi(q)^{-1} \int_0^N f(z)\, dz. \]
Also set $d|\nu_l| = d\omega_l + d\psi_l$. We have
\[ \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\psi_s(z) = \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, \big( d\nu_{s_1}(z_1) + d\omega_{s_1}(z_1) \big) \times \cdots \times \big( d\nu_{s_n}(z_n) + d\omega_{s_n}(z_n) \big). \]
(3.15)

Expanding out the products in the last integral gives the form
\[ \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\omega_s(z) + \sum_{i=1}^{2^n - 1} \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\mu_{i,s}(z), \tag{3.16} \]
where $d\mu_{i,s}$ runs over all the corresponding products, barring the $d\omega_s(z)$ term. Consider
\[ \int_{z \in N\mathcal{B}} e(Q(z)\tau)\, d\mu_{i,s}(z) \]
for some fixed $i$. Assume without loss of generality that $d\mu_{i,s}$ is of the form $d\nu_{s_1}(z_1)\, d\sigma_s(z_2, \ldots, z_n)$, where $d\sigma_s$ may be signed in some variables (and is of course independent of $s_1$). Now for the first component we shall split the continuous interval $[0, N]$ into smaller disjoint intervals of size $N (\log N)^{-C'}$. Here $C'$ is simply chosen to be between $C/2$ and $C$ such that $(\log N)^{C'}$ is an integer, say $J$. The equality $[0, N] = \bigcup_{j=1}^{J} I_j$ follows. Also let us set $\mathcal{B}_j = I_j \times [0, N]^{n-1}$, which absorbs the factor of $N$. Now for a fixed interval $I_j$ select some $y \in I_j$, and we have
\[ \begin{aligned} \int_{z \in \mathcal{B}_j} e(Q(z)\tau)\, d\mu_{i,s} &= \int_{z \in \mathcal{B}_j} e(Q(y, z_2, \ldots, z_n)\tau)\, d\nu_{s_1}(z_1)\, d\sigma_s(z_2, \ldots, z_n) \\ &\quad + \int_{z \in \mathcal{B}_j} \big( e(Q(z_1, \ldots, z_n)\tau) - e(Q(y, z_2, \ldots, z_n)\tau) \big)\, d\nu_{s_1}(z_1)\, d\sigma_s(z_2, \ldots, z_n) \\ &:= E_1 + E_2. \end{aligned} \]
We have
\[ |E_1| \le \int_{z_2, \ldots, z_n \in [0,N]} \Big| \int_{I_j} d\nu_{s_1}(z_1) \Big|\, d|\sigma_s|(z_2, \ldots, z_n) = O\big( N^n e^{-c_0 \sqrt{\log N}} \big) \]
by the Siegel-Walfisz Theorem, as $q \le (\log N)^c$. To bound $E_2$ we note that on $I_j$ the integrand is $O(N^n (\log N)^{c - C})$. In turn,
\[ |E_2| \ll N^{-c\theta_N} \int_{z \in \mathcal{B}_j} d|\nu_{s_1}|(z_1)\, d|\sigma_s|(z_2, \ldots, z_n) \ll N^n (\log N)^{c - 2C}. \]
Summing over the intervals gives the result for each error term. There are $2^n - 1$ error terms and the proof is complete.

The integral appearing in the last result, namely
\[ \int_{N\mathcal{B}} e(Q(z)\tau)\, dz = N^n \int_{\zeta \in \mathcal{B}} e(Q(\zeta) N^2 \tau)\, d\zeta, \]
is denoted by $N^n I(\mathcal{B}, N^2\tau)$ in [3]. This function is independent of $a$ and $q$. Thus the integral over any major arc yields the common integral
\[ \int_{|\tau| \le (\log N)^c} I(\mathcal{B}, N^2\tau)\, e(-\tau v)\, d\tau. \]
With $\mu = N^{-2} v$, set
\[ J(\mu; \Phi) = \int_{|\tau| \le \Phi} I(\mathcal{B}, \tau)\, e(-\tau\mu)\, d\tau, \qquad \text{and} \qquad J(\mu) = \lim_{\Phi \to \infty} J(\mu; \Phi). \]
The following is Lemma 5.3 in [3].

Lemma 3.3.2. $J(\mu)$ is continuous and uniformly bounded in $\mu$. Moreover,
\[ |J(\mu) - J(\mu; \Phi)| \ll \Phi^{-\frac{1}{2}} \]
holds uniformly in $\mu$.

If we define
\[ W_{a,q} = \sum_{s \in (\mathbb{Z}_q^*)^n} e(Q(s)a/q), \]
then we now have

Lemma 3.3.3.
For a fixed major arc $\mathfrak{M}_{a,q}((\log N)^c)$ and a fixed constant $C > 3c$ we have
\[
\int_{\mathfrak{M}_{a,q}} T(r)\, e(-vr)\, dr = N^{n-2} \phi(q)^{-n}\, W_{a,q}\, e(-va/q)\, J(\mu) + O\big(N^n (\log N)^{3c/2 - C/2}\big),
\]
where $\mu = N^{-2}v$.

Recall that the measure of the major arcs is at most $N^{-2 + 2c\theta_N}$, and define
\[
B(v, q) = \sum_{(a,q)=1} \phi(q)^{-n}\, W_{a,q}\, e(-va/q), \qquad S(v, N) = \sum_{q \le (\log N)^c} B(v, q).
\]
It follows that

Lemma 3.3.4. By choosing $C = 4c$ in the above arguments and setting $\delta = c/2 > 0$ we have
\[
M(N, v) = S(v, N)\, J(\mu)\, N^{n-2} + O\big(N^{n-2} (\log N)^{-\delta}\big).
\]

3.4 The Singular Series

Here we analyze the singular series $S(v, N)$ following the outline of [14].

Lemma 3.4.1. For a given prime $p$, let $R_p$ denote the rank of $A$ over $\mathbb{Z}_p$. For all $a \in \mathbb{Z}_p^*$ the estimate
\[
|W_{a,p}| \ll p^{n - R_p/2}
\]
holds, and the implied constant is dependent only on $n$. In turn
\[
|B(v, p)| \ll p^{1 - R_p/2}
\]
holds uniformly in $v$.

Proof. Define the sets $Y_i = \{s \in \mathbb{Z}_p^n : s_i \equiv 0\ (p)\}$, $i = 1, \dots, n$, and $Y = \bigcup_i Y_i$. Using the principle of inclusion-exclusion gives
\[
\sum_{s \in \mathbb{Z}_p^n} 1_{s \in (\mathbb{Z}_p^*)^n}\, e(Q(s)a/p) = \sum_{s \in \mathbb{Z}_p^n} e(Q(s)a/p) - \sum_{k=1}^{n} (-1)^{k-1} \sum_{L \subseteq [n],\ |L| = k}\ \sum_{s \in \mathbb{Z}_p^n} 1_{s \in Y_L}\, e(Q(s)a/p),
\]
where $Y_L = \bigcap_{i \in L} Y_i$. The first term on the right hand side above is a Gaussian sum, and has the upper bound $p^{n - R_p/2}$. For a fixed $k$ above we have $\binom{n}{k}$ choices for $L$. For each choice we again have a Gaussian sum, now in $n - k$ variables, which has rank at least $\alpha_k = \max\{0, R_p - 2k\}$. Hence for $k$ fixed we again have a bound of
\[
\binom{n}{k} p^{n - k - \alpha_k/2} \le \binom{n}{k} p^{n - R_p/2}.
\]
The result then follows.

Define, for a given prime $p$, $\beta_p$ to be the largest power of $p$ dividing all the coefficients of $A$. Then set $\gamma_p = \beta_p + 1$ for $p > 2$ and $\gamma_2 = \beta_2 + 2$ for $p = 2$.

Lemma 3.4.2. Fix a prime $p$, let $t \ge 2\gamma_p$, and for $\alpha > 0$ define $R_t$ to be the rank of the map $A : \mathbb{Z}^n_{p^{t-\alpha}} \to \mathbb{Z}^n_{p^{\alpha}}$. It follows that
\[
|W_{a,p^t}| \le p^{tn - R_t(t - \lfloor t/2 \rfloor)}.
\]
Hence for large enough $t$, dependent only on $A$, we have
\[
|W_{a,p^t}| \le p^{tn - R(t - \lfloor t/2 \rfloor)},
\]
where $R$ is the rank of $A$ over $\mathbb{R}$. Moreover, if $A$ is nonsingular modulo $p$, then $W_{a,p^t} = 0$ for all $t > \gamma_p$.

Proof. Fix a prime $p$, and for simplicity set $\gamma_p = \gamma$. Let $t \ge 2\gamma$ and set $\alpha = \lfloor t/2 \rfloor$.
Then we apply the substitution $s = s_1 + p^{t-\alpha} s_2$ to get
\begin{align*}
W_{a,p^t} &= \sum_{s \in (\mathbb{Z}_{p^t}^*)^n} e(Q(s)a/p^t) \\
&= \sum_{s_1 \in (\mathbb{Z}_{p^{t-\alpha}}^*)^n}\ \sum_{s_2 \in \mathbb{Z}_{p^{\alpha}}^n} e(Q(s_1 + p^{t-\alpha} s_2)a/p^t) \\
&= \sum_{s_1 \in (\mathbb{Z}_{p^{t-\alpha}}^*)^n}\ \sum_{s_2 \in \mathbb{Z}_{p^{\alpha}}^n} e(Q(s_1)a/p^t)\, e((2As_1 \cdot s_2)a/p^{\alpha}),
\end{align*}
as $\alpha \le t/2$. The inner sum is zero if $2As_1$ has a coordinate that is nonzero modulo $p^{\alpha}$. Thus
\[
|W_{a,p^t}| \le p^{\alpha n}\, \big|\{s_1 \in (\mathbb{Z}_{p^{t-\alpha}}^*)^n : 2As_1 \equiv 0\ (p^{\alpha})\}\big|.
\]
This gives the upper bound
\[
|W_{a,p^t}| \le p^{(t-\alpha)(n - R_t)}\, p^{\alpha n} \le p^{tn - R_t(t - \alpha)}.
\]
Also, if $t$ is sufficiently large with respect to $A$ then we have that $R_t = R$. Finally, since $s_1 \in (\mathbb{Z}_{p^{t-\alpha}}^*)^n$, it follows that $2As_1 \equiv 0$ has no solutions if $A$ is nonsingular modulo $p$. Applying the above argument with $\alpha = 1$ completes the proof.

We note that Lemma 3.4.1 covers $W_{a,p}$ for all sufficiently large primes. Also, for any prime $p$, Lemma 3.4.2 provides bounds for $W_{a,p^t}$ for all sufficiently large $t$.

Lemma 3.4.3. If $(q_1, q_2) = 1$, then
\[
W_{a,q_1q_2} = W_{aq_2,q_1}\, W_{aq_1,q_2} \quad \text{and} \quad B(v, q_1q_2) = B(v, q_1)\, B(v, q_2).
\]

See [14] for the proof (Lemma 8.1). We are now in a position to provide a bound for $B(v, q)$.

Lemma 3.4.4. Given $\varepsilon > 0$, we have
\[
B(v, q) \ll q^{1 - R/2 + \varepsilon}
\]
uniformly in $v$.

Proof. Write $q = p_1^{t_1} \cdots p_l^{t_l}$. Applying Lemma 3.4.3 gives
\[
B(v, q) = \phi(q)^{-n} \sum_{(a,q)=1} W_{a,q}\, e(-va/q) = \sum_{(a,q)=1} \phi(p_1^{t_1})^{-n}\, W_{a,p_1^{t_1}} \cdots \phi(p_l^{t_l})^{-n}\, W_{a,p_l^{t_l}}.
\]
Now we apply Lemma 3.4.1 and Lemma 3.4.2 to get
\[
|B(v, q)| \ll q \prod_{i=1}^{l} (1 - 1/p_i)^{-n}\, p_i^{-R(t_i - \lfloor t_i/2 \rfloor)},
\]
where the implied constant absorbs the finite number of pairs $(p_i, t_i)$ for which the rank is insufficient. We have
\[
\prod_{i=1}^{l} (1 - 1/p_i)^{-n} \le \prod_{p \le q} (1 - 1/p)^{-n} \ll (\log q)^n.
\]
Thus
\[
|B(v, q)| \ll q^{1 + \varepsilon} \prod_{i=1}^{l} p_i^{-R(t_i - \lfloor t_i/2 \rfloor)} \le q^{1 + \varepsilon - R/2}
\]
as claimed.

It easily follows now that the singular series is absolutely convergent when the rank of $A$ is at least 5. The infinite product representation follows as usual: with
\[
\chi_p(v) = 1 + \sum_{t=1}^{\infty} B(v, p^t),
\]
we have
\[
S(v) = \lim_{N \to \infty} S(v, N) = \prod_p \chi_p(v).
\]
Define $M(p^t, v)$ to be the number of solutions of $Q(x) \equiv v\ (p^t)$ where $x \in (\mathbb{Z}_{p^t}^*)^n$. We have the analogue of Lemma 8.6 in [14].

Lemma 3.4.5. We have
\[
M(p^t, v) = \phi(p^t)^n\, p^{-t} \Big(1 + \sum_{j=1}^{t} B(v, p^j)\Big).
\]

We conclude this section with one final result.

Lemma 3.4.6.
If $A$ has rank at least 5, then there exist integers $\lambda$ and $K$, and a positive number $\delta$, such that $S(v) \ge \delta$ whenever $v \equiv \lambda\ (K)$.

Proof. With the above estimates for $|B(v, q)|$, there exists a $p_0$ such that
\[
\prod_{p > p_0} \chi_p(v) \ge \delta > 0
\]
holds for some positive $\delta$ and all $v$. Set $\chi_p(v, t)$ to be the $t$th partial sum of the series defining $\chi_p$. The estimates for $|B(v, p^t)|$ provide a $t_0$ such that
\[
|\chi_p(v, t) - \chi_p(v)| < \frac{1}{2p_0 + 1}
\]
holds for all $v$ and all $t \ge t_0$. By simple averaging, Lemma 3.4.5 provides a $v_p$ in $\mathbb{Z}_{p^t}$ such that $\chi_p(v_p, t) \ge 1$. We now choose $\lambda$ with $\lambda \equiv v_p\ (p^t)$ for every $p \le p_0$ and set $K = \prod_{p \le p_0} p^t$; the result follows from the Chinese Remainder Theorem.

3.5 Conclusions

Here we simply collect the pieces to prove Theorem 1.2.3 and Theorem 1.2.4, which are stated in the opening chapter.

Proof of Theorem 1.2.3. The asymptotic formula has already been shown to hold under this hypothesis. The statement about the positivity of $S$ is a consequence of Hensel's Lemma, see e.g. [4]. The statement regarding the function $J$ is precisely the same as the one given in [3]. This completes the proof.

Proof of Theorem 1.2.4. We have seen in Lemma 3.4.6 that there is an infinite arithmetic progression $Z$ such that $S(v) \ge \delta > 0$ for all sufficiently large elements $v \in Z$. Also, over $\mathbb{R}$ it is easily seen that $Q = v$ has a nonsingular solution (as $Q$ is canonically equivalent to $x_1^2 + \dots + x_n^2$). Thus the function $J$ can be bounded below by a positive constant for these $v \approx N^2$, and the proof is complete.

Chapter 4 Conclusion

We take some time to conclude with a discussion of future projects.

4.1 Future Projects

4.1.1 A Conjecture

The most natural continuation of this work is to extend the results of Chapter 3 to equations of higher degree. We put forth a general conjecture, which is an analogue of Birch's Theorem for prime points. Let us overview Birch's results given in [3].

Let $f = (f_1, \dots, f_R)$ be a system of homogeneous integral forms of common degree $d$ in $n$ variables. For a fixed $v \in \mathbb{Z}^R$ set $V(v)$ to be the complex affine variety defined by $f = v$.
Set $V^*$ to be the collection of points where the rank of $\mathrm{Jac}_f$ is strictly less than $R$. Set $K = 2^{1-d}\, \mathrm{codim}(V^*)$, where codim denotes the codimension. One should note that in the case that $f$ is represented by a single quadratic form $Q = \langle x, Ax \rangle$, the codimension of $V^*$ is simply the rank of the matrix $A$. The main result of Birch states that if $K > R(R+1)(d-1)$, then the number of integer points in the box $x \in [N]^n$ satisfying $f(x) = v$, call this $\mathcal{N}(v, N)$, obeys
\[
\mathcal{N}(v, N) = S(v)\, J(N^{-Rd} v)\, N^{n - Rd} + O(N^{n - Rd - \delta}) \tag{4.1}
\]
for some $\delta > 0$, where $S$ is given here by the product of $p$-adic densities for the equation $f = v$, and $J$ is precisely the same as in the previous chapter.

Here we conjecture the following. Define the singular series as
\begin{align*}
W_{a,q} &= \sum_{s \in (\mathbb{Z}_q^*)^n} e(Q(s)a/q), \\
B(v, q) &= \sum_{a \in (\mathbb{Z}_q^*)^R} \phi(q)^{-n}\, W_{a,q}\, e((-v \cdot a)/q), \\
S(v, N) &= \sum_{q=1}^{\infty} B(v, q).
\end{align*}

Conjecture 4.1.1. There exists a constant $K_0 = K(R, d)$ such that if the singular variety associated to the set $V = \{f = v\}$, $f$ as above, has codimension $K \ge K_0$, then the weighted number of prime points on $V \cap [N]^n$ satisfies
\[
M(v, N) = S(v, N)\, J(N^{-Rd} v)\, N^{n - Rd} + o(N^{n - Rd}), \tag{4.2}
\]
where $S$ is the singular series defined above, $J$ is the same as in (4.1), and the implied constant in the little-o term depends only on $n$, $R$, and $d$.

It is worth noting that the constant $K_0$ gives a lower bound on the number of variables $n$. The case $d = 1$ is rendered moot by the results of Green and Tao discussed in Chapter 1. The results of Chapter 3 resolve this case when $d = 2$, $R = 1$. From the point of view of the transference principle of Green and Tao in [12], which is essentially the one presented in Chapter 2, it seems reasonable that one should be able to take $K_0 = R(R+1)(d-1)2^{d-1}$. This says nothing of the positivity of the singular series, however.

It is also worth noting that this is essentially a minor arc question. The treatment of the major arcs given in Chapter 3 is easily modified to the above situation, and provides precisely the main term as stated above.
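In the single-form case $R = 1$, $d = 2$, the local sums $W_{a,q}$ and $B(v,q)$ that make up the singular series can be evaluated by brute force for small moduli. The following is a minimal sketch, not part of the thesis: the function names and the example matrices are illustrative assumptions. It compares a direct count of solutions of $Q(x) \equiv v\ (p^t)$ in reduced residues with the expression $\phi(p^t)^n p^{-t}(1 + \sum_{j \le t} B(v, p^j))$ from Lemma 3.4.5.

```python
import cmath
from itertools import product
from math import gcd


def e(num, den):
    """e(num/den) = exp(2*pi*i*num/den); num is reduced mod den for accuracy."""
    return cmath.exp(2j * cmath.pi * (num % den) / den)


def quad(A, s):
    """Q(s) = <s, A s> for an integer symmetric matrix A (list of lists)."""
    n = len(A)
    return sum(A[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def units(q):
    """Reduced residues modulo q."""
    return [u for u in range(1, q + 1) if gcd(u, q) == 1]


def W(A, a, q):
    """W_{a,q}: sum of e(Q(s) a / q) over s in (Z_q^*)^n."""
    return sum(e(a * quad(A, s), q) for s in product(units(q), repeat=len(A)))


def B(A, v, q):
    """B(v,q): sum over (a,q)=1 of phi(q)^{-n} W_{a,q} e(-v a / q)."""
    U = units(q)
    return sum(W(A, a, q) * e(-v * a, q) for a in U) / len(U) ** len(A)


def count_unit_solutions(A, v, p, t):
    """Direct count of x in (Z_{p^t}^*)^n with Q(x) = v mod p^t."""
    q = p ** t
    return sum(1 for x in product(units(q), repeat=len(A))
               if (quad(A, x) - v) % q == 0)


def predicted_count(A, v, p, t):
    """Right-hand side of Lemma 3.4.5 (complex, with negligible imaginary part)."""
    q = p ** t
    phi = len(units(q))
    series = 1 + sum(B(A, v, p ** j) for j in range(1, t + 1))
    return phi ** len(A) * series / q


if __name__ == "__main__":
    A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # Q = x1^2 + x2^2 + x3^2 (example only)
    for v in range(9):
        print(v, count_unit_solutions(A, v, 3, 2),
              round(predicted_count(A, v, 3, 2).real, 6))
```

The enumeration is exponential in $n$ and only meant for sanity checks on small $p^t$; it says nothing about the size of $K_0$ or about the minor arcs.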
One final note: in relation to the work on linear equations, the difficulty of this conjecture is on par with linear systems of complexity one. Essentially this boils down to the fact that we allow $n$ to be taken large compared to $R$ and $d$. In comparison, systems of $R$ linear forms in $n > 2R+1$ variables have complexity one.

4.1.2 A Reasonable Approach

Here we shall discuss an approach for the case $R = 1$, $d > 2$ of Conjecture 4.1.1. First we shall point out a few partial results that are obtainable from the methods we have used.

Take $F$ to be a homogeneous integral form of degree $d$. Associated to $F$ is a symmetric $d$-linear form $F(x^{(1)}, \dots, x^{(d)})$ over $(\mathbb{C}^n)^d$. Thus we recover our original form when we restrict ourselves to the diagonal, which is of course a copy of $\mathbb{C}^n$. If there exists a splitting of the variables $x^{(1)} = (y, 0)$ and $x^{(2)} = (0, z)$ with $(y, z) \in \mathbb{C}^n$ such that $\mathrm{codim}(L)$ is large (dependent only on $d$), where
\[
L = \{((y, 0), (0, z), \dots, x^{(d)}) : F((y, 0), (0, z), \dots, x^{(d)}) = 0\},
\]
then the methods in Section 3.2.1 can provide an appropriate asymptotic.

The method applied in Section 3.2.2 is not directly generalizable to higher degree polynomials, as the notion of rank loses meaning. However, it does provide a framework in which to approach such a generalization. The work of Schmidt [18] may prove to be quite useful here. His variant of Birch's method provides a more thorough treatment of systems of forms which are highly singular. His approach is to decompose a form as
\[
Q = R_1 S_1 + \dots + R_m S_m, \tag{4.3}
\]
where $R_i$ and $S_i$ are forms of positive degree. The minimal value of $m$ provides a natural generalization of the rank condition above. Moreover, over $\mathbb{C}$ this notion is essentially equivalent to the condition of Birch. The goal is then to modify the decomposition which appears in Section 3.2.2 to the extent that when the off diagonal analogue fails, one may apply a similar mean value type estimate for the 'good' parts of the form (those with a large Schmidt condition).
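To see why the minimal $m$ in (4.3) generalizes the rank, it may help to work out the quadratic case over $\mathbb{C}$. The identity below is a standard fact added here purely as an illustration; it is not taken from [18], and the diagonal shape of the form is an assumption made for simplicity.

```latex
% A diagonal quadratic form of even rank 2m splits into m products of linear forms:
\[
x_1^2 + \cdots + x_{2m}^2
  = \sum_{j=1}^{m} (x_{2j-1} + i\,x_{2j})(x_{2j-1} - i\,x_{2j}).
\]
% For odd rank 2m+1 one extra product is needed:
\[
x_1^2 + \cdots + x_{2m+1}^2
  = \sum_{j=1}^{m} (x_{2j-1} + i\,x_{2j})(x_{2j-1} - i\,x_{2j})
    + x_{2m+1} \cdot x_{2m+1}.
\]
% Conversely, each product R_j S_j of linear forms has a symmetric matrix of
% rank at most 2, so a sum of m such products has rank at most 2m.
% Hence a quadratic form of rank r over C has minimal m equal to ceil(r/2).
```

So for quadratic forms the Schmidt-type invariant and the rank determine each other up to a factor of two, which is the sense in which a large value of $m$ plays the role of a large rank.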
4.2 Final Remarks

This section brings our presentation to a close, and the author would like to take this time to thank the reader.

Bibliography

[1] B. Green and T. Tao. The Möbius function is strongly orthogonal to nilsequences. Annals of Math., to appear.
[2] B. Green, T. Tao, and T. Ziegler. An inverse theorem for the Gowers $U^{s+1}[N]$-norm. Annals of Math., to appear.
[3] B.J. Birch. Forms in many variables. Proc. Roy. Soc. Ser. A, 265:245–263, 1962.
[4] J. W. S. Cassels. Rational Quadratic Forms. Dover, 2008.
[5] D. Goldston and C. Yıldırım. Higher correlations of divisor sums related to primes III: small gaps between primes. Proc. London Math. Soc., 95:653–686, 2003.
[6] H. Davenport. Cubic forms in 32 variables. Phil. Trans. A, 251:193–232, 1975.
[7] Deshouillers, Effinger, te Riele, and Zinoviev. A complete Vinogradov 3-primes theorem under the Riemann hypothesis. Electronic Research Announcements of the American Mathematical Society, 3 (15):99–104, 2008.
[8] H. Furstenberg and Y. Katznelson. An ergodic Szemerédi theorem for commuting transformations. J. d'Analyse Math., 34:275–291, 1978.
[9] D. Goldston and C. Yıldırım. Higher correlations of divisor sums related to primes I: triple correlations. Integers: Electronic Journal of Combinatorial Number Theory, 3:1–66, 2003.
[10] T. Gowers. Decompositions, approximate structure, transference, and the Hahn-Banach theorem. Bull. Lon. Math. Soc., 42 (4):573–606, 2010.
[11] W.T. Gowers. Hypergraph regularity and the multidimensional Szemerédi theorem. Annals of Math., 166/3:897–946, 2007.
[12] B.J. Green and T. Tao. The primes contain arbitrarily long arithmetic progressions. Annals of Math., 167:481–547, 2008.
[13] B.J. Green and T. Tao. Linear equations in the primes. Annals of Math., 171-3:1753–1850, 2010.
[14] L.K. Hua. Additive Theory of Prime Numbers. Translations of Mathematical Monographs, 13, American Mathematical Society, Providence, R.I., 1965.
[15] J. Liu. Integral points on quadrics with prime coordinates. Monatsh Math, 164:439–465, 2011.
[16] M. B. Nathanson. Additive Number Theory: The Classical Bases. Springer, 1996.
[17] O. Reingold, L. Trevisan, M. Tulsiani, and S. Vadhan. Dense subsets of pseudorandom sets. Electronic Colloquium on Computational Complexity, Report TR08-045, 2008.
[18] W. Schmidt. The density of integer points on homogeneous varieties. Acta Math., 154:243–296, 1985.
[19] E. Szemerédi. On sets of integers containing no k elements in arithmetic progression. Acta Arith., 27:299–345, 1975.
[20] T. Tao. The Gaussian primes contain arbitrarily shaped constellations. Journal d'Analyse Mathématique, 99 (1):109–176, 2006.
[21] T. Tao and T. Ziegler. The primes contain arbitrarily long polynomial progressions. Acta Math., 201:213–305, 2008.
[22] I.M. Vinogradov. The method of trigonometrical sums in the theory of numbers. London: Interscience Publishers. Translated, revised, and annotated by K. F. Roth and A. Davenport.
Equations in the primes Cook, Brian Michael 2012-12-31
Item Metadata
Title | Equations in the primes |
Creator |
Cook, Brian Michael |
Publisher | University of British Columbia |
Date | 2012 |
Date Issued | 2012-04-23 |
Description | We provide results related to the study of prime points on level sets of homogeneous integral forms which are linear or quadratic. In the linear case we present an extension of the Green-Tao Theorem, which finds affine copies of finite intervals in relatively dense subsets of the primes, to a higher dimensional setting in which one finds affine copies of suitably generic point configurations in relatively dense subsets of a Cartesian product of the primes. For general integral quadratic forms we present a result which is a Birch-Goldbach type theorem for a single quadratic form with sufficient rank. This guarantees solubility among the primes on the level set of a quadratic form subject to local conditions. This is an extension of a well known result of Hua. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2012-04-23 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0080625 |
URI | http://hdl.handle.net/2429/42217 |
Degree |
Doctor of Philosophy - PhD |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2012-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |