You are currently browsing the monthly archive for August 2016.
Let $\tau(n) := \sum_{d|n} 1$ be the divisor function. A classical application of the Dirichlet hyperbola method gives the asymptotic
$$\sum_{n \leq x} \tau(n) \sim x \log x$$
where $X \sim Y$ denotes the estimate $X = (1+o(1)) Y$ as $x \to \infty$. Much better error estimates are possible here, but we will not focus on the lower order terms in this discussion. For somewhat idiosyncratic reasons I will interpret this estimate (and the other analytic number theory estimates discussed here) through the probabilistic lens. Namely, if $\mathbf{n} = \mathbf{n}_x$ is a random number selected uniformly between $1$ and $x$, then the above estimate can be written as
$$\mathbf{E} \tau(\mathbf{n}) \sim \log x, \qquad (1)$$
that is to say the random variable $\tau(\mathbf{n})$ has mean approximately $\log x$. (But, somewhat paradoxically, this is not the median or mode behaviour of this random variable, which instead concentrates near $\log^{\log 2} x$, basically thanks to the Hardy-Ramanujan theorem.)
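As a quick numerical sanity check of the mean value estimate above, here is a minimal sketch that sieves the divisor function and compares its average with $\log x$. The function names are illustrative only, and the comparison value $\log x + 2\gamma - 1$ uses the standard lower order term from the Dirichlet divisor problem, which is not discussed in this post.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def divisor_counts(x):
    """Return a list tau with tau[n] = number of divisors of n, for 0 <= n <= x."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        # every multiple of d gains one divisor
        for m in range(d, x + 1, d):
            tau[m] += 1
    return tau

def mean_tau(x):
    """Average of tau(n) over 1 <= n <= x, i.e. E tau(n) for n uniform in [1, x]."""
    tau = divisor_counts(x)
    return sum(tau[1:]) / x

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        # columns: x, empirical mean, leading term, leading term plus 2*gamma - 1
        print(x, mean_tau(x), math.log(x), math.log(x) + 2 * EULER_GAMMA - 1)
```

The ratio of the empirical mean to $\log x$ approaches $1$ only slowly, which is why the lower order constant is included for comparison.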
Now we turn to the pair correlations $\sum_{n \leq x} \tau(n) \tau(n+h)$ for a fixed positive integer $h$. There is a classical computation of Ingham that shows that
$$\sum_{n \leq x} \tau(n) \tau(n+h) \sim \frac{6}{\pi^2} \sigma_{-1}(h)\, x \log^2 x, \qquad (2)$$
where $\sigma_{-1}(h) := \sum_{d|h} \frac{1}{d}$.
The error term in (2) has been refined by many subsequent authors, as has the uniformity of the estimates in the $h$ aspect, as these topics are related to other questions in analytic number theory, such as fourth moment estimates for the Riemann zeta function; but we will not consider these more subtle features of the estimate here. However, we will look at the next term in the asymptotic expansion for (2) below the fold.
Using our probabilistic lens, the estimate (2) can be written as
$$\mathbf{E} \tau(\mathbf{n}) \tau(\mathbf{n}+h) \sim \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x. \qquad (3)$$
From (1) (and the asymptotic negligibility of the shift by $h$) we see that the random variables $\tau(\mathbf{n})$ and $\tau(\mathbf{n}+h)$ both have a mean of $\sim \log x$, so the additional factor of $\frac{6}{\pi^2} \sigma_{-1}(h)$ represents some arithmetic coupling between the two random variables.
Ingham’s formula can be established in a number of ways. Firstly, one can expand out and use the hyperbola method (splitting into the cases
and
and removing the overlap). If one does so, one soon arrives at the task of having to estimate sums of the form
for various . For
much less than
this can be achieved using a further application of the hyperbola method, but for
comparable to
things get a bit more complicated, necessitating the use of non-trivial estimates on Kloosterman sums in order to obtain satisfactory control on error terms. A more modern approach proceeds using automorphic form methods, as discussed in this previous post. A third approach, which unfortunately is only heuristic at the current level of technology, is to apply the Hardy-Littlewood circle method (discussed in this previous post) to express (2) in terms of exponential sums
for various frequencies
. The contribution of “major arc”
can be computed after a moderately lengthy calculation which yields the right-hand side of (2) (as well as the correct lower order terms that are currently being suppressed), but there does not appear to be an easy way to show directly that the “minor arc” contributions are of lower order, although the methods discussed previously do indirectly show that this is ultimately the case.
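Each of the methods outlined above requires a fair amount of calculation, and it is not obvious while performing them that the factor $\frac{6}{\pi^2} \sigma_{-1}(h)$ will emerge at the end. One can at least explain the $\frac{6}{\pi^2}$ as a normalisation constant needed to balance the $\sigma_{-1}(h)$ factor (at a heuristic level, at least). To see this through our probabilistic lens, introduce an independent copy $\mathbf{n}'$ of $\mathbf{n}$, then
$$\mathbf{E} \tau(\mathbf{n}) \tau(\mathbf{n}') = (\mathbf{E} \tau(\mathbf{n}))^2 \sim \log^2 x; \qquad (4)$$
using symmetry to order $\mathbf{n}' > \mathbf{n}$ (discarding the diagonal case $\mathbf{n} = \mathbf{n}'$) and making the change of variables $\mathbf{n}' = \mathbf{n}+h$, we see that (4) is heuristically consistent with (3) as long as the asymptotic mean of $\frac{6}{\pi^2} \sigma_{-1}(h)$ in $h$ is equal to $1$. (This argument is not rigorous because there was an implicit interchange of limits present, but still gives a good heuristic “sanity check” of Ingham’s formula.) Indeed, if $\mathbf{E}_h$ denotes the asymptotic mean in $h$, then we have (heuristically at least)
$$\mathbf{E}_h \sigma_{-1}(h) = \sum_d \frac{1}{d} \mathbf{E}_h 1_{d|h} = \sum_d \frac{1}{d^2} = \frac{\pi^2}{6}$$
and we obtain the desired consistency after multiplying by $\frac{6}{\pi^2}$.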
This still however does not explain the presence of the $\sigma_{-1}(h)$ factor. Intuitively it is reasonable that if $h$ has many prime factors, and $\mathbf{n}$ has a lot of factors, then $\mathbf{n}+h$ will have slightly more factors than average, because any common factor to $h$ and $\mathbf{n}$ will automatically be acquired by $\mathbf{n}+h$. But how to quantify this effect?
One heuristic way to proceed is through analysis of local factors. Observe from the fundamental theorem of arithmetic that we can factor
$$\tau(n) = \prod_p \tau_p(n)$$
where the product is over all primes $p$, and $\tau_p(n) := \sum_{p^j | n} 1$ is the local version of $\tau(n)$ at $p$ (which in this case, is just one plus the $p$-valuation $v_p(n)$ of $n$: $\tau_p(n) = 1 + v_p(n)$). Note that all but finitely many of the terms in this product will equal $1$, so the infinite product is well-defined. In a similar fashion, we can factor
$$\sigma_{-1}(h) = \prod_p \sigma_{-1,p}(h)$$
where $\sigma_{-1,p}(h) := \sum_{p^j | h} \frac{1}{p^j}$ (or in terms of valuations, $\sigma_{-1,p}(h) = (1 - p^{-v_p(h)-1})/(1-p^{-1})$). Heuristically, the Chinese remainder theorem suggests that the various factors $\tau_p(\mathbf{n})$ behave like independent random variables, and so the correlation between $\tau(\mathbf{n})$ and $\tau(\mathbf{n}+h)$ should approximately decouple into the product of correlations between the local factors $\tau_p(\mathbf{n})$ and $\tau_p(\mathbf{n}+h)$. And indeed we do have the following local version of Ingham’s asymptotics:
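Proposition 1 (Local Ingham asymptotics) For fixed $p$ and integer $h$, we have
$$\mathbf{E} \tau_p(\mathbf{n}) = \frac{p}{p-1} + o(1)$$
and
$$\mathbf{E} \tau_p(\mathbf{n}) \tau_p(\mathbf{n}+h) = \left(1-\frac{1}{p^2}\right) \sigma_{-1,p}(h) \left(\frac{p}{p-1}\right)^2 + o(1).$$
From the Euler formula
$$\prod_p \left(1-\frac{1}{p^2}\right) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}$$
we see that
$$\frac{6}{\pi^2} \sigma_{-1}(h) = \prod_p \left(1-\frac{1}{p^2}\right) \sigma_{-1,p}(h)$$
and so one can “explain” the arithmetic factor $\frac{6}{\pi^2} \sigma_{-1}(h)$ in Ingham’s asymptotic as the product of the arithmetic factors $(1-\frac{1}{p^2}) \sigma_{-1,p}(h)$ in the (much easier) local Ingham asymptotics. Unfortunately we have the usual “local-global” problem in that we do not know how to rigorously derive the global asymptotic from the local ones; this problem is essentially the same issue as the problem of controlling the minor arc contributions in the circle method, but phrased in “physical space” language rather than “frequency space”.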
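The local asymptotic can be tested numerically. The sketch below is only an illustration, with function names that are not from the post; it takes $\sigma_{-1,p}(h)$ to be the sum of $p^{-j}$ over the powers $p^j$ dividing $h$ and compares the empirical average of $\tau_p(n) \tau_p(n+h)$ over $1 \leq n \leq x$ with the predicted constant $(1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2$.

```python
from fractions import Fraction

def valuation(n, p):
    """p-adic valuation v_p(n) of a positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def tau_p(n, p):
    """Local divisor function at p: one plus the p-valuation."""
    return 1 + valuation(n, p)

def sigma_local(h, p):
    """sigma_{-1,p}(h): sum of 1/p^j over the powers p^j dividing h."""
    return sum(Fraction(1, p**j) for j in range(valuation(h, p) + 1))

def local_constant(h, p):
    """Predicted limit of E tau_p(n) tau_p(n+h), as an exact rational."""
    return (1 - Fraction(1, p**2)) * sigma_local(h, p) * Fraction(p, p - 1) ** 2

def empirical_local_correlation(h, p, x):
    """Average of tau_p(n) * tau_p(n+h) over 1 <= n <= x."""
    return sum(tau_p(n, p) * tau_p(n + h, p) for n in range(1, x + 1)) / x

if __name__ == "__main__":
    for p, h in ((2, 1), (2, 4), (3, 9), (5, 2)):
        print(p, h, float(local_constant(h, p)), empirical_local_correlation(h, p, 10**5))
```

For instance, with $p=2$ and $h=4$ the predicted constant is $21/4$, and the product of `sigma_local(12, p)` over $p = 2, 3$ recovers $\sigma_{-1}(12) = 7/3$.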
Remark 2 The relation between the local means $\frac{p}{p-1}$ and the global mean $\log x$ can also be seen heuristically through the application
$$\prod_{p \leq x^{1/e^\gamma}} \frac{p}{p-1} \sim \log x$$
of Mertens’ theorem, where $1/e^\gamma$ is Pólya’s magic exponent, which serves as a useful heuristic limiting threshold in situations where the product of local factors is divergent.
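The Mertens heuristic in this remark is easy to test numerically. The following sketch (helper names are illustrative) multiplies the local means $\frac{p}{p-1}$ over primes up to $x^{1/e^\gamma}$ and compares the result with $\log x$; it relies only on the classical form of Mertens' theorem, $\prod_{p \leq y} \frac{p}{p-1} \sim e^\gamma \log y$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def primes_up_to(y):
    """Sieve of Eratosthenes: list of primes <= y (y >= 2)."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(y) + 1):
        if sieve[i]:
            # cross out multiples of i starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, y + 1, i)))
    return [i for i in range(2, y + 1) if sieve[i]]

def local_mean_product(x):
    """Product of p/(p-1) over primes p <= x^(1/e^gamma)."""
    y = int(x ** math.exp(-EULER_GAMMA))
    prod = 1.0
    for p in primes_up_to(y):
        prod *= p / (p - 1)
    return prod

if __name__ == "__main__":
    for x in (10**4, 10**6, 10**8):
        print(x, local_mean_product(x), math.log(x))
```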
Let us now prove this proposition. One could brute-force the computations by observing that for any fixed $j$, the valuation $v_p(\mathbf{n})$ is equal to $j$ with probability $\sim \frac{p-1}{p} \frac{1}{p^j}$, and with a little more effort one can also compute the joint distribution of $v_p(\mathbf{n})$ and $v_p(\mathbf{n}+h)$, at which point the proposition reduces to the calculation of various variants of the geometric series. I however find it cleaner to proceed in a more recursive fashion (similar to how one can prove the geometric series formula by induction); this will also make visible the vague intuition mentioned previously about how common factors of $\mathbf{n}$ and $h$ force $\mathbf{n}+h$ to have a factor also.
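It is first convenient to get rid of error terms by observing that in the limit $x \to \infty$, the random variable $\mathbf{n} = \mathbf{n}_x$ converges vaguely to a uniform random variable $\mathbf{n}_\infty$ on the profinite integers $\hat{\mathbf{Z}}$, or more precisely that the pair $(v_p(\mathbf{n}_x), v_p(\mathbf{n}_x+h))$ converges vaguely to $(v_p(\mathbf{n}_\infty), v_p(\mathbf{n}_\infty+h))$. Because of this (and because of the easily verified uniform integrability properties of $\tau_p(\mathbf{n}_x)$ and their powers), it suffices to establish the exact formulae
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) = \frac{p}{p-1} \qquad (5)$$
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty+h) = \left(1-\frac{1}{p^2}\right) \sigma_{-1,p}(h) \left(\frac{p}{p-1}\right)^2 \qquad (6)$$
in the profinite setting (this setting will make it easier to set up the recursion).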
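We begin with (5). Observe that $\mathbf{n}_\infty$ is coprime to $p$ with probability $1-\frac{1}{p}$, in which case $\tau_p(\mathbf{n}_\infty)$ is equal to $1$. Conditioning to the complementary probability $\frac{1}{p}$ event that $\mathbf{n}_\infty$ is divisible by $p$, we can factor $\mathbf{n}_\infty = p \mathbf{n}'_\infty$ where $\mathbf{n}'_\infty$ is also uniformly distributed over the profinite integers, in which event we have $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$. We arrive at the identity
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) = \left(1-\frac{1}{p}\right) + \frac{1}{p}\left(1 + \mathbf{E} \tau_p(\mathbf{n}'_\infty)\right).$$
As $\mathbf{n}_\infty$ and $\mathbf{n}'_\infty$ have the same distribution, the quantities $\mathbf{E} \tau_p(\mathbf{n}_\infty)$ and $\mathbf{E} \tau_p(\mathbf{n}'_\infty)$ are equal, and (5) follows by a brief amount of high-school algebra.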
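We use a similar method to treat (6). First treat the case when $h$ is coprime to $p$. Then we see that with probability $1-\frac{2}{p}$, $\mathbf{n}_\infty$ and $\mathbf{n}_\infty+h$ are simultaneously coprime to $p$, in which case $\tau_p(\mathbf{n}_\infty) = \tau_p(\mathbf{n}_\infty+h) = 1$. Furthermore, with probability $\frac{1}{p}$, $\mathbf{n}_\infty$ is divisible by $p$ and $\mathbf{n}_\infty+h$ is not; in which case we can write $\mathbf{n}_\infty = p \mathbf{n}'_\infty$ as before, with $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty+h) = 1$. Finally, in the remaining event with probability $\frac{1}{p}$, $\mathbf{n}_\infty+h$ is divisible by $p$ and $\mathbf{n}_\infty$ is not; we can then write $\mathbf{n}_\infty+h = p \mathbf{n}'_\infty$, so that $\tau_p(\mathbf{n}_\infty+h) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty) = 1$. Putting all this together, we obtain
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty+h) = \left(1-\frac{2}{p}\right) + \frac{2}{p}\left(1 + \mathbf{E} \tau_p(\mathbf{n}'_\infty)\right)$$
and the claim (6) in this case follows from (5) and a brief computation (noting that $\sigma_{-1,p}(h) = 1$ in this case).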
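Now suppose that $h$ is divisible by $p$, thus $h = p h'$ for some integer $h'$. Then with probability $1-\frac{1}{p}$, $\mathbf{n}_\infty$ and $\mathbf{n}_\infty+h$ are simultaneously coprime to $p$, in which case $\tau_p(\mathbf{n}_\infty) = \tau_p(\mathbf{n}_\infty+h) = 1$. In the remaining $\frac{1}{p}$ event, we can write $\mathbf{n}_\infty = p \mathbf{n}'_\infty$, and then $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty+h) = 1 + \tau_p(\mathbf{n}'_\infty+h')$. Putting all this together we have
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty+h) = \left(1-\frac{1}{p}\right) + \frac{1}{p} \mathbf{E} \left(1+\tau_p(\mathbf{n}'_\infty)\right)\left(1+\tau_p(\mathbf{n}'_\infty+h')\right)$$
which by (5) (and replacing $\mathbf{n}'_\infty$ by $\mathbf{n}_\infty$) leads to the recursive relation
$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty+ph') = \frac{p+1}{p-1} + \frac{1}{p} \mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty+h')$$
and (6) then follows by induction on the number of powers of $p$.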
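The recursive argument can be replayed in exact rational arithmetic. The sketch below is an illustration with hypothetical function names: `correlation` follows the case analysis in the proof (the case of $h$ coprime to $p$, then division of $h$ by $p$), while `closed_form` evaluates $(1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2$ directly, so that the two can be compared.

```python
from fractions import Fraction

def mean_tau_p(p):
    """E tau_p(n): solve E = (1 - 1/p) + (1/p)(1 + E), giving E = p/(p-1)."""
    return 1 / (1 - Fraction(1, p))

def correlation(h, p):
    """E tau_p(n) tau_p(n+h) for n uniform on the profinite integers, h >= 1."""
    m = mean_tau_p(p)
    if h % p != 0:
        # both coprime to p with probability 1 - 2/p;
        # exactly one of n, n+h divisible by p with probability 1/p each
        return (1 - Fraction(2, p)) + 2 * Fraction(1, p) * (1 + m)
    # p divides h: both coprime with probability 1 - 1/p,
    # otherwise n = p n' and h = p h', and we recurse on h'
    inner = 1 + 2 * m + correlation(h // p, p)
    return (1 - Fraction(1, p)) + Fraction(1, p) * inner

def closed_form(h, p):
    """(1 - 1/p^2) * sigma_{-1,p}(h) * (p/(p-1))^2 as an exact rational."""
    v = 0
    while h % p == 0:
        h //= p
        v += 1
    sigma = sum(Fraction(1, p**j) for j in range(v + 1))
    return (1 - Fraction(1, p**2)) * sigma * Fraction(p, p - 1) ** 2

if __name__ == "__main__":
    for p in (2, 3, 5):
        for h in (1, p, p * p, 6 * p):
            print(p, h, correlation(h, p), closed_form(h, p))
```

Working with `Fraction` rather than floats means the agreement is an exact identity for each tested pair, not a numerical coincidence.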
The estimate (2) of Ingham was refined by Estermann, who obtained the more accurate expansion
for certain complicated but explicit coefficients . For instance,
is given by the formula
where is the Euler-Mascheroni constant,
The formula for is similar but even more complicated. The error term
was improved by Heath-Brown to
; it is conjectured (for instance by Conrey and Gonek) that one in fact has square root cancellation
here, but this is well out of reach of current methods.
These lower order terms are traditionally computed either from a Dirichlet series approach (using Perron’s formula) or a circle method approach. It turns out that a refinement of the above heuristics can also predict these lower order terms, thus keeping the calculation purely in physical space as opposed to the “multiplicative frequency space” of the Dirichlet series approach, or the “additive frequency space” of the circle method, although the computations are arguably as messy as the latter computations for the purposes of working out the lower order terms. We illustrate this just for the term below the fold.
Fifteen years ago, I wrote a paper entitled Global regularity of wave maps. II. Small energy in two dimensions, in which I established global regularity of wave maps from two spatial dimensions to the unit sphere, assuming that the initial data had small energy. Recently, Hao Jia (personal communication) discovered a small gap in the argument that requires a slightly non-trivial fix. The issue does not really affect the subsequent literature, because the main result has since been reproven and extended by methods that avoid the gap (see in particular this subsequent paper of Tataru), but I have decided to describe the gap and its fix on this blog.
I will assume familiarity with the notation of my paper. In Section 10, some complicated spaces are constructed for each frequency scale
, and then a further space
is constructed for a given frequency envelope
by the formula
where is the Littlewood-Paley projection of
to frequency magnitudes
. Then, given a spacetime slab
, we define the restrictions
where the infimum is taken over all extensions of
to the Minkowski spacetime
; similarly one defines
The gap in the paper is as follows: it was implicitly assumed that one could restrict (1) to the slab to obtain the equality
(This equality is implicitly used to establish the bound (36) in the paper.) Unfortunately, (1) only gives the lower bound, not the upper bound, and it is the upper bound which is needed here. The problem is that the extensions of
that are optimal for computing
are not necessarily the Littlewood-Paley projections of the extensions
of
that are optimal for computing
.
To remedy the problem, one has to prove an upper bound of the form
for all Schwartz (actually we need affinely Schwartz
, but one can easily normalise to the Schwartz case). Without loss of generality we may normalise the RHS to be
. Thus
for each , and one has to find a single extension
of
such that
for each . Achieving a
that obeys (4) is trivial (just extend
by zero), but such extensions do not necessarily obey (5). On the other hand, from (3) we can find extensions
of
such that
the extension will then obey (5) (here we use Lemma 9 from my paper), but unfortunately is not guaranteed to obey (4) (the
norm does control the
norm, but a key point about frequency envelopes for the small energy regularity problem is that the coefficients
, while bounded, are not necessarily summable).
This can be fixed as follows. For each we introduce a time cutoff
supported on
that equals
on
and obeys the usual derivative estimates in between (the
time derivative of size
for each
). Later we will prove the truncation estimate
Assuming this estimate, then if we set , then using Lemma 9 in my paper and (6), (7) (and the local stability of frequency envelopes) we have the required property (5). (There is a technical issue arising from the fact that
is not necessarily Schwartz due to slow decay at temporal infinity, but by considering partial sums in the
summation and taking limits we can check that
is the strong limit of Schwartz functions, which suffices here; we omit the details for sake of exposition.) So the only issue is to establish (4), that is to say that
for all .
For this is immediate from (2). Now suppose that
for some integer
(the case when
is treated similarly). Then we can split
where
The contribution of the term is acceptable by (6) and estimate (82) from my paper. The term
sums to
which is acceptable by (2). So it remains to control the
norm of
. By the triangle inequality and the fundamental theorem of calculus, we can bound
By hypothesis, . Using the first term in (79) of my paper and Bernstein’s inequality followed by (6) we have
and then we are done by summing the geometric series in .
It remains to prove the truncation estimate (7). This estimate is similar in spirit to the algebra estimates already in my paper, but unfortunately does not seem to follow immediately from these estimates as written, and so one has to repeat the somewhat lengthy decompositions and case checkings used to prove these estimates. We do this below the fold.
[This blog post was written jointly by Terry Tao and Will Sawin.]
In the previous blog post, one of us (Terry) implicitly introduced a notion of rank for tensors which is a little different from the usual notion of tensor rank, and which (following BCCGNSU) we will call “slice rank”. This notion of rank could then be used to encode the Croot-Lev-Pach-Ellenberg-Gijswijt argument that uses the polynomial method to control capsets.
Afterwards, several papers have applied the slice rank method to further problems: to control tri-colored sum-free sets in abelian groups (BCCGNSU, KSS) and from there to the triangle removal lemma in vector spaces over finite fields (FL), to control sunflowers (NS), and to bound progression-free sets in $p$-groups (P).
In this post we investigate the notion of slice rank more systematically. In particular, we show how to give lower bounds for the slice rank. In many cases, we can show that the upper bounds on slice rank given in the aforementioned papers are sharp to within a subexponential factor. This still leaves open the possibility of getting a better bound for the original combinatorial problem using the slice rank of some other tensor, but for very long arithmetic progressions (at least eight terms), we show that the slice rank method cannot improve over the trivial bound using any tensor.
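It will be convenient to work in a “basis independent” formalism, namely working in the category of abstract finite-dimensional vector spaces over a fixed field $\mathbf{F}$. (In the applications to the capset problem one takes $\mathbf{F} = \mathbf{F}_3$ to be the finite field of three elements, but most of the discussion here applies to arbitrary fields.) Given $k$ such vector spaces $V_1,\dots,V_k$, we can form the tensor product $\bigotimes_{i=1}^k V_i$, generated by the tensor products $v_1 \otimes \dots \otimes v_k$ with $v_i \in V_i$ for $i=1,\dots,k$, subject to the constraint that the tensor product operation $(v_1,\dots,v_k) \mapsto v_1 \otimes \dots \otimes v_k$ is multilinear. For each $1 \leq j \leq k$, we have the smaller tensor products $\bigotimes_{1 \leq i \leq k: i \neq j} V_i$, as well as the $j^{th}$ tensor product
$$\otimes_j : V_j \times \bigotimes_{1 \leq i \leq k: i \neq j} V_i \to \bigotimes_{i=1}^k V_i$$
defined in the obvious fashion. Elements of $\bigotimes_{i=1}^k V_i$ of the form $v_j \otimes_j v_{\hat j}$ for some $v_j \in V_j$ and $v_{\hat j} \in \bigotimes_{1 \leq i \leq k: i \neq j} V_i$ will be called rank one functions, and the slice rank (or rank for short) $\mathrm{rank}(v)$ of an element $v$ of $\bigotimes_{i=1}^k V_i$ is defined to be the least nonnegative integer $r$ such that $v$ is a linear combination of $r$ rank one functions. If $V_1,\dots,V_k$ are finite-dimensional, then the rank is always well defined as a non-negative integer (in fact it cannot exceed $\min(\dim V_1, \dots, \dim V_k)$). It is also clearly subadditive:
$$\mathrm{rank}(v+w) \leq \mathrm{rank}(v) + \mathrm{rank}(w). \qquad (1)$$
For $k=1$, $\mathrm{rank}(v)$ is $0$ when $v$ is zero, and $1$ otherwise. For $k=2$, $\mathrm{rank}(v)$ is the usual rank of the $2$-tensor $v \in V_1 \otimes V_2$ (which can for instance be identified with a linear map from $V_1$ to the dual space $V_2^*$). The usual notion of tensor rank for higher order tensors uses complete tensor products $v_1 \otimes \dots \otimes v_k$, $v_i \in V_i$ as the rank one objects, rather than $v_j \otimes_j v_{\hat j}$, giving a rank that is greater than or equal to the slice rank studied here.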
From basic linear algebra we have the following equivalences:
Lemma 1 Let
be finite-dimensional vector spaces over a field
, let
be an element of
, and let
be a non-negative integer. Then the following are equivalent:
- (i) One has
.
- (ii) One has a representation of the form
where
are finite sets of total cardinality
at most
, and for each
and
,
and
.
- (iii) One has
where for each
,
is a subspace of
of total dimension
at most
, and we view
as a subspace of
in the obvious fashion.
- (iv) (Dual formulation) There exist subspaces
of the dual space
for
, of total dimension at least
, such that
is orthogonal to
, in the sense that one has the vanishing
for all
, where
is the obvious pairing.
Proof: The equivalence of (i) and (ii) is clear from definition. To get from (ii) to (iii) one simply takes to be the span of the
, and conversely to get from (iii) to (ii) one takes the
to be a basis of the
and computes
by using a basis for the tensor product
consisting entirely of functions of the form
for various
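. To pass from (iii) to (iv) one takes each subspace in (iv) to be the annihilator, in the dual space, of the corresponding subspace in (iii); conversely, to pass from (iv) to (iii) one takes annihilators again. The dimension counts match because the dimensions of a subspace and of its annihilator sum to the dimension of the ambient space.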
One corollary of the formulation (iv) is that the set of tensors of slice rank at most $r$ is Zariski closed (if the field $\mathbf{F}$
is algebraically closed), and so the slice rank itself is a lower semi-continuous function. This is in contrast to the usual tensor rank, which is not necessarily semicontinuous.
Corollary 2 Let
be finite-dimensional vector spaces over an algebraically closed field
. Let
be a nonnegative integer. The set of elements of
of slice rank at most
is closed in the Zariski topology.
Proof: In view of Lemma 1(i and iv), this set is the union over tuples of integers with
of the projection from
of the set of tuples
with
orthogonal to
, where
is the Grassmannian parameterizing
-dimensional subspaces of
.
One can check directly that the set of tuples with
orthogonal to
is Zariski closed in
using a set of equations of the form
locally on
. Hence, because the Grassmannian is a complete variety, the projection of this set to
is also Zariski closed. So the finite union over tuples
of these projections is also Zariski closed.
We also have good behaviour with respect to linear transformations:
Lemma 3 Let
be finite-dimensional vector spaces over a field
, let
be an element of
, and for each
, let
be a linear transformation, with
the tensor product of these maps. Then
Furthermore, if the
are all injective, then one has equality in (2).
Thus, for instance, the rank of a tensor is intrinsic in the sense that it is unaffected by any enlargements of the spaces
.
Proof: The bound (2) is clear from the formulation (ii) of rank in Lemma 1. For equality, apply (2) to the injective , as well as to some arbitrarily chosen left inverses
of the
.
Computing the rank of a tensor is difficult in general; however, the problem becomes a combinatorial one if one has a suitably sparse representation of that tensor in some basis, where we will measure sparsity by the property of being an antichain.
Proposition 4 Let
be finite-dimensional vector spaces over a field
. For each
, let
be a linearly independent set in
indexed by some finite set
. Let
be a subset of
.
where for each
,
is a coefficient in
. Then one has
where the minimum ranges over all coverings of
by sets
, and
for
are the projection maps.
Now suppose that the coefficients
are all non-zero, that each of the
are equipped with a total ordering
, and
is the set of maximal elements of
, thus there do not exist distinct
,
such that
for all
. Then one has
In particular, if
is an antichain (i.e. every element is maximal), then equality holds in (4).
Proof: By Lemma 3 (or by enlarging the bases ), we may assume without loss of generality that each of the
is spanned by the
. By relabeling, we can also assume that each
is of the form
with the usual ordering, and by Lemma 3 we may take each to be
, with
the standard basis.
Let denote the rank of
. To show (4), it suffices to show the inequality
for any covering of by
. By removing repeated elements we may assume that the
are disjoint. For each
, the tensor
can (after collecting terms) be written as
for some . Summing and using (1), we conclude the inequality (6).
Now assume that the are all non-zero and that
is the set of maximal elements of
. To conclude the proposition, it suffices to show that the reverse inequality
holds for some covering
. By Lemma 1(iv), there exist subspaces
of
whose dimension
sums to
Let . Using Gaussian elimination, one can find a basis
of
whose representation in the standard dual basis
of
is in row-echelon form. That is to say, there exist natural numbers
such that for all ,
is a linear combination of the dual vectors
, with the
coefficient equal to one.
We now claim that is disjoint from
. Suppose for contradiction that this were not the case, thus there exists
for each
such that
As is the set of maximal elements of
, this implies that
for any tuple other than
. On the other hand, we know that
is a linear combination of
, with the
coefficient one. We conclude that the tensor product
is equal to
plus a linear combination of other tensor products with
not in
. Taking inner products with (3), we conclude that
, contradicting the fact that
is orthogonal to
. Thus we have
disjoint from
.
For each , let
denote the set of tuples
in
with
not of the form
. From the previous discussion we see that the
cover
, and we clearly have
, and hence from (8) we have (7) as claimed.
As an instance of this proposition, we recover the computation of diagonal rank from the previous blog post:
Example 5 Let
be finite-dimensional vector spaces over a field
for some
. Let
be a natural number, and for
, let
be a linearly independent set in
. Let
be non-zero coefficients in
. Then
has rank
. Indeed, one applies the proposition with
all equal to
, with
the diagonal in
; this is an antichain if we give one of the
the standard ordering, and another of the
the opposite ordering (and ordering the remaining
arbitrarily). In this case, the
are all bijective, and so it is clear that the minimum in (4) is simply
.
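For small examples, the combinatorial minimisation in Proposition 4 can be done by exhaustive search. The sketch below is an illustrative brute force, not an efficient algorithm; the function name is hypothetical. It uses the reduction from the proof that the covering sets may be taken disjoint, so it suffices to try every way of assigning each tuple of the support to one of the $k$ coordinates.

```python
from itertools import product

def min_covering_cost(gamma, k):
    """Minimum over partitions gamma = G_1 u ... u G_k of sum_j |pi_j(G_j)|.

    gamma is a collection of k-tuples and pi_j is projection to coordinate j.
    The search is exponential in len(gamma), so only use it on tiny supports.
    """
    gamma = list(gamma)
    best = None
    for labels in product(range(k), repeat=len(gamma)):
        # projections[j] collects the j-th coordinates of tuples assigned to part j
        projections = [set() for _ in range(k)]
        for point, j in zip(gamma, labels):
            projections[j].add(point[j])
        cost = sum(len(s) for s in projections)
        if best is None or cost < best:
            best = cost
    return best

if __name__ == "__main__":
    diagonal = [(i, i, i) for i in range(4)]
    print(min_covering_cost(diagonal, 3))  # the diagonal case of Example 5
```

On the diagonal every projection is injective, so each tuple costs exactly one regardless of where it is placed, which matches the rank computed in Example 5. By contrast, a support such as `[(0, 0), (0, 1), (0, 2)]` with `k = 2` has cost one, reflecting a rank one matrix.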
The combinatorial minimisation problem in the above proposition can be solved asymptotically when working with tensor powers, using the notion of the Shannon entropy of a discrete random variable
.
Proposition 6 Let
be finite-dimensional vector spaces over a field
. For each
, let
be a linearly independent set in
indexed by some finite set
. Let
be a non-empty subset of
.
Let
be a tensor of the form (3) for some coefficients
. For each natural number
, let
be the tensor power of
copies of
, viewed as an element of
. Then
and
range over the random variables taking values in
.
Now suppose that the coefficients
are all non-zero and that each of the
are equipped with a total ordering
. Let
be the set of maximal elements of
in the product ordering, and let
where
range over random variables taking values in
. Then
as
. In particular, if the maximizer in (10) is supported on the maximal elements of
(which always holds if
is an antichain in the product ordering), then equality holds in (9).
Proof:
as , where
is the projection map. Then the same thing will apply to
and
. Then applying Proposition 4, using the lexicographical ordering on
and noting that, if
are the maximal elements of
, then
are the maximal elements of
, we obtain both (9) and (11).
We first prove the lower bound. By compactness (and the continuity properties of entropy), we can find a random variable taking values in
such that
Let be a small positive quantity that goes to zero sufficiently slowly with
. Let
denote the set of all tuples
in
that are within
of being distributed according to the law of
, in the sense that for all
, one has
By the asymptotic equipartition property, the cardinality of can be computed to be
if goes to zero slowly enough. Similarly one has
Now let be an arbitrary covering of
. By the pigeonhole principle, there exists
such that
which by (13) implies that
(noting that the $\frac{1}{k}$ factor can be absorbed into the $o(n)$
error). This gives the lower bound in (12).
Now we prove the upper bound. We can cover by
sets of the form
for various choices of random variables
taking values in
. For each such random variable
, we can find
such that
; we then place all of
in
. It is then clear that the
cover
and that
for all , giving the required upper bound.
It is of interest to compute the quantity in (10). We have the following criterion for when a maximiser occurs:
Proposition 7 Let
be finite sets, and
be non-empty. Let
be the quantity in (10). Let
be a random variable taking values in
, and let
denote the essential range of
, that is to say the set of tuples
such that
is non-zero. Then the following are equivalent:
- (i)
attains the maximum in (10).
- (ii) There exist weights
and a finite quantity
, such that
whenever
, and such that
for all
, with equality if
. (In particular,
must vanish if there exists a
with
.)
Furthermore, when (i) and (ii) hold, one has
Proof: We first show that (i) implies (ii). The function is concave on
. As a consequence, if we define
to be the set of tuples
such that there exists a random variable
taking values in
with
, then
is convex. On the other hand, by (10),
is disjoint from the orthant
. Thus, by the hyperplane separation theorem, we conclude that there exists a half-space
where are reals that are not all zero, and
is another real, which contains
on its boundary and
in its interior, such that
avoids the interior of the half-space. Since
is also on the boundary of
, we see that the
are non-negative, and that
whenever
.
By construction, the quantity
is maximised when . At this point we could use the method of Lagrange multipliers to obtain the required constraints, but because we have some boundary conditions on the
(namely, that the probability that they attain a given element of
has to be non-negative) we will work things out by hand. Let
be an element of
, and
an element of
. For
small enough, we can form a random variable
taking values in
, whose probability distribution is the same as that for
except that the probability of attaining
is increased by
, and the probability of attaining
is decreased by
. If there is any
for which
and
, then one can check that
for sufficiently small , contradicting the maximality of
; thus we have
whenever
. Taylor expansion then gives
for small , where
and similarly for . We conclude that
for all
and
, thus there exists a quantity
such that
for all
, and
for all
. By construction
must be nonnegative. Sampling
using the distribution of
, one has
almost surely; taking expectations we conclude that
The inner sum is , which equals
when
is non-zero, giving (17).
Now we show conversely that (ii) implies (i). As noted previously, the function is concave on
, with derivative
. This gives the inequality
for any (note the right-hand side may be infinite when
and
). Let
be any random variable taking values in
, then on applying the above inequality with
and
, multiplying by
, and summing over
and
gives
By construction, one has
and
so to prove that (which would give (i)), it suffices to show that
or equivalently that the quantity
is maximised when . Since
it suffices to show this claim for the quantity
One can view this quantity as
By (ii), this quantity is bounded by , with equality if
is equal to
(and is in particular ranging in
), giving the claim.
The second half of the proof of Proposition 7 only uses the marginal distributions and the equation (16), not the actual distribution of
, so it can also be used to prove an upper bound on
when the exact maximizing distribution is not known, given suitable probability distributions in each variable. The logarithm of the probability distribution here plays the role that the weight functions do in BCCGNSU.
Remark 8 Suppose one is in the situation of (i) and (ii) above; assume the nondegeneracy condition that
is positive (or equivalently that
is positive). We can assign a “degree”
to each element
by the formula
then every tuple
in
has total degree at most
, and those tuples in
have degree exactly
. In particular, every tuple in
has degree at most
, and hence by (17), each such tuple has a
-component of degree less than or equal to
for some
with
. On the other hand, we can compute from (19) and the fact that
for
that
. Thus, by asymptotic equipartition, and assuming
, the number of “monomials” in
of total degree at most
is at most
; one can in fact use (19) and (18) to show that this is in fact an equality. This gives a direct way to cover
by sets
with
, which is in the spirit of the Croot-Lev-Pach-Ellenberg-Gijswijt arguments from the previous post.
We can now show that the rank computation for the capset problem is sharp:
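Proposition 9 Let $V = \mathbf{F}_3^{\mathbf{F}_3}$ denote the space of functions from $\mathbf{F}_3$ to $\mathbf{F}_3$. Then the function $(x,y,z) \mapsto \delta_{0^n}(x+y+z)$ from $\mathbf{F}_3^n \times \mathbf{F}_3^n \times \mathbf{F}_3^n$ to $\mathbf{F}_3$, viewed as an element of $V^{\otimes n} \otimes V^{\otimes n} \otimes V^{\otimes n}$, has rank $\exp((H^*+o(1))n)$ as $n \to \infty$, where $H^* \approx 1.0135$ is given by the formula
$$H^* = \alpha \log \frac{1}{\alpha} + \beta \log \frac{1}{\beta} + \gamma \log \frac{1}{\gamma} \qquad (20)$$
with $\alpha = \frac{32}{3(15+\sqrt{33})} \approx 0.51419$, $\beta = \frac{4(\sqrt{33}-1)}{3(15+\sqrt{33})} \approx 0.30495$ and $\gamma = \frac{(\sqrt{33}-1)^2}{6(15+\sqrt{33})} \approx 0.18086$.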
Proof: In , we have
Thus, if we let be the space of functions from
to
(with domain variable denoted
respectively), and define the basis functions
of indexed by
(with the usual ordering), respectively, and set
to be the set
then is a linear combination of the
with
, and all coefficients non-zero. Then we have
. We will show that the quantity
of (10) agrees with the quantity
of (20), and that the optimizing distribution is supported on
, so that by Proposition 6 the rank of
is
.
To compute the quantity at (10), we use the criterion in Proposition 7. We take to be the random variable taking values in
that attains each of the values
with a probability of
, and each of
with a probability of
; then each of the
attains the values of
with probabilities
respectively, so in particular
is equal to the quantity
in (20). If we now set
and
we can verify the condition (16) with equality for all , which from (17) gives
as desired.
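The constant in Proposition 9 can be recovered numerically. The sketch below assumes (as the symmetry of the problem suggests, though this reduction is my reading rather than a statement from the proof) that the optimisation reduces to maximising the entropy of a single distribution on $\{0,1,2\}$ with mean $2/3$. Such a maximiser has weights proportional to $r^t$ for some $r > 0$, which is found here by bisection; the function names are illustrative.

```python
import math

def max_entropy_distribution(mean, support=(0, 1, 2)):
    """Maximum-entropy distribution on `support` with the given mean.

    The maximiser has weights proportional to r**t; we bisect on log r,
    using the fact that the mean is increasing in r.
    """
    def dist(r):
        weights = [r**t for t in support]
        total = sum(weights)
        return [w / total for w in weights]

    def mean_of(r):
        return sum(t * q for t, q in zip(support, dist(r)))

    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean_of(math.exp(mid)) < mean:
            lo = mid
        else:
            hi = mid
    return dist(math.exp((lo + hi) / 2))

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in probs if q > 0)

if __name__ == "__main__":
    probs = max_entropy_distribution(2 / 3)
    print(probs, entropy(probs), math.exp(entropy(probs)))
```

Under this assumption the exponential of the entropy comes out near $2.755$, the familiar growth rate in the Ellenberg-Gijswijt capset bound, and the probability of the value $0$ agrees with the closed form $\frac{32}{3(15+\sqrt{33})}$ obtained by solving $4r^2 + r - 2 = 0$.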
This statement already follows from the result of Kleinberg-Sawin-Speyer, which gives a “tri-colored sum-free set” in of size
, as the slice rank of this tensor is an upper bound for the size of a tri-colored sum-free set. If one were to go over the proofs more carefully to evaluate the subexponential factors, this argument would give a stronger lower bound than KSS, as it does not deal with the substantial loss that comes from Behrend’s construction. However, because it actually constructs a set, the KSS result rules out more possible approaches to give an exponential improvement of the upper bound for capsets. The lower bound on slice rank shows that the bound cannot be improved using only the slice rank of this particular tensor, whereas KSS shows that the bound cannot be improved using any method that does not take advantage of the “single-colored” nature of the problem.
We can also show that the slice rank upper bound in a result of Naslund-Sawin is similarly sharp:
Proposition 10 Let
denote the space of functions from
to
. Then the function
from
, viewed as an element of
, has slice rank
Proof: Let and
be a basis for the space
of functions on
, itself indexed by
. Choose similar bases for
and
, with
and
.
Set . Then
is a linear combination of the
with
, and all coefficients non-zero. Order
the usual way so that
is an antichain. We will show that the quantity
of (10) is
, so that applying the last statement of Proposition 6, we conclude that the rank of
is
,
Let be the random variable taking values in
that attains each of the values
with a probability of
. Then each of the
attains the value
with probability
and
with probability
, so
Setting and
, we can verify the condition (16) with equality for all
, which from (17) gives
as desired.
We used a slightly different method in each of the last two results. In the first one, we use the most natural bases for all three vector spaces, and distinguish from its set of maximal elements
. In the second one we modify one basis element slightly, with
instead of the more obvious choice
, which allows us to work with
instead of
. Because
is an antichain, we do not need to distinguish
and
. Both methods in fact work with either problem, and they are both about equally difficult, but we include both as either might turn out to be substantially more convenient in future work.
Proposition 11 Let
be a natural number and let
be a finite abelian group. Let
be any field. Let
denote the space of functions from
to
.
Let
be any
-valued function on
that is nonzero only when the
elements of
form a
-term arithmetic progression, and is nonzero on every
-term constant progression.
Then the slice rank of
is
.
Proof: We apply Proposition 4, using the standard bases of . Let
be the support of
. Suppose that we have
orderings on
such that the constant progressions are maximal elements of
and thus all constant progressions lie in
. Then for any partition
of
,
can contain at most
constant progressions, and as all
constant progressions must lie in one of the
, we must have
. By Proposition 4, this implies that the slice rank of
is at least
. Since
is a
tensor, the slice rank is at most
, hence exactly
.
So it is sufficient to find orderings on
such that the constant progressions are maximal elements of
. We make several simplifying reductions: We may as well assume that
consists of all the
-term arithmetic progressions, because if the constant progressions are maximal among the set of all progressions then they are maximal among its subset
. So we are looking for an ordering in which the constant progressions are maximal among all
-term arithmetic progressions. We may as well assume that
is cyclic, because if for each cyclic group we have an ordering where constant progressions are maximal, on an arbitrary finite abelian group the lexicographic product of these orderings is an ordering for which the constant progressions are maximal. We may assume
, as if we have an
-tuple of orderings where constant progressions are maximal, we may add arbitrary orderings and the constant progressions will remain maximal.
So it is sufficient to find orderings on the cyclic group
such that the constant progressions are maximal elements of the set of
-term progressions in
in the
-fold product ordering. To do that, let the first, second, third, and fifth orderings be the usual order on
and let the fourth, sixth, seventh, and eighth orderings be the reverse of the usual order on
.
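The choice of orderings can be sanity-checked by brute force for small moduli. The following is a rough sketch, not part of the proof: it represents residues mod n by the integers 0, ..., n-1, takes the usual order in positions 1, 2, 3, 5 and the reverse order in positions 4, 6, 7, 8, and searches for a non-constant eight-term progression that dominates a constant progression in the product ordering. The function names and the `usual` parameter are hypothetical, introduced only for this illustration.

```python
from itertools import product

# 0-indexed positions using the usual order; the other four positions
# use the reverse order.
USUAL = (0, 1, 2, 4)

def dominates(prog, c, usual=USUAL):
    """True if prog >= (c, ..., c) in every coordinate of the product ordering."""
    for i, x in enumerate(prog):
        if i in usual:
            if x < c:          # usual order: need x >= c
                return False
        elif x > c:            # reverse order: need x <= c
            return False
    return True

def counterexamples(n, usual=USUAL):
    """All (c, a, b) with b != 0 mod n whose progression dominates the constant c."""
    found = []
    for c, a, b in product(range(n), range(n), range(1, n)):
        prog = [(a + i * b) % n for i in range(8)]
        if dominates(prog, c, usual):
            found.append((c, a, b))
    return found

if __name__ == "__main__":
    for n in range(2, 31):
        assert counterexamples(n) == [], n
    print("no counterexamples found for n = 2..30")
```

For contrast, using the usual order in all eight positions (`usual=range(8)`) immediately produces counterexamples, since every progression dominates the constant progression at 0.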
Then let be a constant progression and for contradiction assume that
is a progression greater than
in this ordering. We may assume that
, because otherwise we may reverse the order of the progression, which has the effect of reversing all eight orderings, and then apply the transformation
, which again reverses the eight orderings, bringing us back to the original problem but with
.
Take a representative of the residue class in the interval
. We will abuse notation and call this
. Observe that
, and
are all contained in the interval
modulo
. Take a representative of the residue class
in the interval
. Then
is in the interval
for some
. The distance between any distinct pair of intervals of this type is greater than
, but the distance between
and
is at most
, so
is in the interval
. By the same reasoning,
is in the interval
. Therefore
. But then the distance between
and
is at most
, so by the same reasoning
is in the interval
. Because
is between
and
, it also lies in the interval
. Because
is in the interval
, and by assumption it is congruent mod
to a number in the set
greater than or equal to
, it must be exactly
. Then, remembering that
and
lie in
, we have
and
, so
, hence
, thus
, which contradicts the assumption that
.
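The reduction from arbitrary finite abelian groups to cyclic groups via lexicographic products can be checked in the same brute-force spirit. The sketch below is an illustration under stated assumptions: the group is given as a product of cyclic groups with hypothetical moduli, each factor is represented by 0, ..., m-1, and elements are compared lexicographically (Python's built-in tuple comparison), with the reverse of that order used in positions 4, 6, 7, 8.

```python
from itertools import product

# 0-indexed positions using the (lexicographic) usual order.
USUAL = (0, 1, 2, 4)

def constants_are_maximal(moduli):
    """Check that no non-constant 8-term progression in Z/m1 x ... x Z/mr
    dominates a constant progression in the product of the eight orderings."""
    group = list(product(*(range(m) for m in moduli)))
    zero = tuple(0 for _ in moduli)

    def add(x, y):
        return tuple((s + t) % m for s, t, m in zip(x, y, moduli))

    for c in group:
        for a in group:
            for b in group:
                if b == zero:
                    continue  # skip constant progressions
                prog = [a]
                for _ in range(7):
                    prog.append(add(prog[-1], b))
                # Tuples compare lexicographically, so >= and <= below are
                # the lexicographic product order and its reverse.
                if all((prog[i] >= c) if i in USUAL else (prog[i] <= c)
                       for i in range(8)):
                    return False
    return True

if __name__ == "__main__":
    for moduli in [(2, 2), (3, 2), (2, 4), (3, 3)]:
        print(moduli, constants_are_maximal(moduli))
```

The search is cubic in the group order, so it is only practical for very small groups.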
In fact, given a -term progression mod
and a constant, we can form a
-term binary sequence with a
for each step of the progression that is greater than the constant and a
for each step that is less. Because a rotation map, viewed as a dynamical system, has zero topological entropy, the number of
-term binary sequences that appear grows subexponentially in
. Hence there must be, for large enough
, at least one sequence that does not appear. In this proof we exploit a sequence that does not appear for
.
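To make this remark concrete, one can enumerate the binary sequences that actually arise from progressions in small cyclic groups. The sketch below rests on conventions that are assumptions of this illustration: residues are represented by 0, ..., n-1, a step is coded 1 when it is strictly greater than the constant and 0 otherwise (so ties are coded 0), and the function name `patterns` is hypothetical. The sequence corresponding to the orderings used in the proof is 1, 1, 1, 0, 1, 0, 0, 0.

```python
def patterns(k, max_n):
    """Set of k-term binary sequences arising from progressions a, a+b, ...
    mod n (for 2 <= n <= max_n) compared against a constant c.
    Convention (an assumption): 1 if the step is > c, else 0."""
    seen = set()
    for n in range(2, max_n + 1):
        for c in range(n):
            for a in range(n):
                for b in range(n):
                    seen.add(tuple(1 if (a + i * b) % n > c else 0
                                   for i in range(k)))
    return seen

if __name__ == "__main__":
    for k in range(1, 9):
        found = patterns(k, 16)
        print(k, len(found), "of", 2 ** k)
    missing = (1, 1, 1, 0, 1, 0, 0, 0)
    print("pattern used in the proof appears:", missing in patterns(8, 16))
```

For short sequences every pattern shows up, while for eight terms the pattern used in the proof is absent, consistent with the argument above.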