Assaf Naor and I have just uploaded to the arXiv our paper “Random Martingales and localization of maximal inequalities“, to be submitted shortly. This paper investigates the best constant $C_n$ in generalisations of the classical Hardy-Littlewood maximal inequality

$$\Big| \Big\{ x \in {\bf R}^n: \sup_{r > 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)|\ dy \geq \lambda \Big\} \Big| \leq \frac{C_n}{\lambda} \|f\|_{L^1({\bf R}^n)}$$

for any absolutely integrable $f: {\bf R}^n \to {\bf C}$ and any $\lambda > 0$, where $B(x,r)$ is the Euclidean ball of radius $r$ centred at $x$, and $|E|$ denotes the Lebesgue measure of a subset $E$ of ${\bf R}^n$. This inequality is fundamental to a large part of real-variable harmonic analysis, and in particular to *Calderón-Zygmund theory*. A similar inequality in fact holds with the Euclidean norm replaced by any other convex norm on ${\bf R}^n$.
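As a purely illustrative sanity check (my own sketch, with an arbitrary grid size and test function, and not code from the paper), one can discretise the one-dimensional centred maximal function and verify the weak-type $(1,1)$ inequality for a spike approximating a point mass:

```python
import numpy as np

def centred_maximal(f, dx):
    """Discrete centred Hardy-Littlewood maximal function on a 1D grid."""
    N = len(f)
    Mf = np.zeros(N)
    for i in range(N):
        for r in range(1, N):  # balls of radius r*dx centred at grid point i
            lo, hi = max(0, i - r), min(N, i + r + 1)
            avg = np.abs(f[lo:hi]).sum() / (hi - lo)
            Mf[i] = max(Mf[i], avg)
    return Mf

dx = 0.01
f = np.zeros(200)
f[100] = 100.0              # a tall narrow spike: roughly a unit point mass
Mf = centred_maximal(f, dx)
l1 = np.abs(f).sum() * dx   # ||f||_1 = 1
lam = 5.0
superlevel = (Mf >= lam).sum() * dx
print(superlevel, l1 / lam)  # superlevel measure vs (1/lambda)*||f||_1
```

For an approximate point mass the superlevel set $\{Mf \geq \lambda\}$ should have measure about $1/\lambda$, and indeed the discrete computation returns a superlevel measure just below $\|f\|_{L^1}/\lambda = 0.2$.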

The exact value of the constant $C_n$ is only known in $n=1$, with a remarkable result of Melas establishing that $C_1 = \frac{11+\sqrt{61}}{12}$. Classical covering lemma arguments give the exponential upper bound $C_n \leq 2^n$ when properly optimised (a direct application of the Vitali covering lemma gives $C_n \leq 3^n$, but one can reduce this to $2^n$ by being careful). In an important paper of Stein and Strömberg, the improved bound $C_n = O(n \log n)$ was obtained for any convex norm by a more intricate covering lemma argument, and the slight improvement $C_n = O(n)$ obtained in the Euclidean case by another argument more adapted to the Euclidean setting that relied on heat kernels. In the other direction, a recent result of Aldaz shows that $C_n \to \infty$ in the case of the $\ell^\infty$ norm, and in fact in an even more recent preprint of Aubrun, the lower bound $C_n \gg \log^{1-\epsilon} n$ for any $\epsilon > 0$ has been obtained in this case. However, these lower bounds do not apply in the Euclidean case, and one may still conjecture that $C_n$ is in fact uniformly bounded in this case.

Unfortunately, we do not make direct progress on these problems here. However, we do show that the Stein-Strömberg bound $C_n = O(n \log n)$ is extremely general, applying to a wide class of metric measure spaces obeying a certain “microdoubling condition at dimension $n$”; and conversely, at this level of generality, it is essentially the best estimate possible, even with additional metric measure hypotheses on the space. Thus, if one wants to improve this bound for a specific maximal inequality, one has to use specific properties of the geometry (such as the connections between Euclidean balls and heat kernels). Furthermore, in the general setting of metric measure spaces, one has a general *localisation principle*, which roughly speaking asserts that in order to prove a maximal inequality over all scales $0 < r < \infty$, it suffices to prove such an inequality in a smaller range $r_0 \leq r \leq n r_0$, uniformly in $r_0 > 0$. It is this localisation which ultimately explains the significance of the $n \log n$ growth in the Stein-Strömberg result (there are essentially $n \log n$ distinct scales in any range $[r_0, n r_0]$). It also shows that if one restricts the radii $r$ to a lacunary range (such as the powers of $2$), the best constant improves to $O(\log n)$; if one restricts the radii to an even sparser range such as the powers of $n$, the best constant becomes $O(1)$.

The method of proof is a little unusual from a harmonic analysis perspective, though it is now quite standard in computer science and metric space geometry. The main idea is to try to convert the underlying metric to an ultrametric, such as those arising from dyadic models of Euclidean space (e.g. from infinite binary trees). The reason for this is that maximal inequalities are easy to prove for ultrametrics; indeed, the Hardy-Littlewood inequality is always true in such settings with constant $C = 1$. A key reason for this is the following nesting property: in an ultrametric space, if two balls $B(x,r)$, $B(x',r')$ intersect, then the larger ball in fact contains the smaller ball. Because of this, the Hardy-Littlewood maximal inequality becomes essentially trivial, as can be seen by considering the maximal balls on which the function $f$ has mean at least $\lambda$, which are disjoint by the ultrametric property.
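The disjointness of the maximal cells can be seen numerically in the dyadic model of the unit interval. The sketch below (my own illustration, not code from the paper) computes the maximal function over dyadic intervals and checks the weak $(1,1)$ bound with constant $1$ on a random sample:

```python
import numpy as np

def dyadic_maximal(f):
    """Maximal averages over dyadic intervals of [0,1) (an ultrametric family:
    two dyadic intervals either nest or are disjoint)."""
    N = len(f)                 # must be a power of 2
    Mf = np.abs(f).copy()      # finest scale: single grid cells
    avg = np.abs(f).copy()
    while len(avg) > 1:
        avg = 0.5 * (avg[0::2] + avg[1::2])        # next coarser dyadic averages
        Mf = np.maximum(Mf, np.repeat(avg, N // len(avg)))
    return Mf

rng = np.random.default_rng(0)
f = rng.exponential(size=1024)
dx = 1.0 / len(f)
Mf = dyadic_maximal(f)
lam = 3.0
# weak (1,1) with constant exactly 1: the maximal dyadic intervals on which
# the average is at least lam are disjoint, so the superlevel set is small
print((Mf >= lam).sum() * dx <= np.abs(f).sum() * dx / lam)  # True
```

The inequality here is guaranteed by the nesting property, not by luck of the sample: the superlevel set is covered by disjoint maximal dyadic intervals, each carrying at least $\lambda$ times its length in mass of $|f|$.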

Now, a direct application of this idea to such metrics as the Euclidean metric gives losses that are exponential in the dimension, mainly because one has to triple (resp. double) the size of a ball before it will capture all the balls of lesser radius that intersect it (resp. contain the centre of the original ball). Indeed, this is morally speaking where the Vitali covering lemma bound $C_n \leq 3^n$ comes from. The standard way to say this is that the *doubling constant* of the Euclidean metric is exponentially large (indeed, it is $2^n$), and indeed it is easy to construct examples of metric measure spaces where the best constant in the Hardy-Littlewood inequality is as large as the doubling constant. (The canonical example here is the star graph with the graph metric and counting measure, where the doubling constant and maximal constant are both essentially the degree of the star.)
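The star graph example is simple enough to quantify in a few lines. In the hypothetical script below (my own illustration of the lower bound), $f$ is a point mass at the hub, so that every leaf sees an average of at least $1/2$ over the radius-$1$ ball consisting of itself and the hub:

```python
def star_maximal_constant(d):
    """Lower bound for the weak (1,1) constant on the star graph with d leaves
    (hub 0 joined to leaves 1..d), graph metric, counting measure.

    Take f = point mass at the hub, so ||f||_1 = 1.  The ball of radius 1
    around any leaf is {leaf, hub}, so the maximal function is >= 1/2 there
    (and the hub itself trivially has maximal average 1)."""
    lam = 0.5
    superlevel = d + 1            # all d+1 vertices satisfy Mf >= 1/2
    return superlevel * lam       # constant >= |{Mf >= lam}| * lam / ||f||_1

print(star_maximal_constant(100))   # 50.5: grows linearly with the degree
```

So both the doubling constant and the maximal constant on the star grow linearly in the degree, in agreement with the general principle stated above.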

But suppose that, instead of trying to capture all balls of lesser radius, one only wanted to capture all balls of *much* smaller radius $r'$ intersecting a given ball $B(x,r)$, and in particular if $r' \leq r/n$. Then one only needs to enlarge the original ball by a factor of $1+\frac{2}{n}$ to achieve this, and this only increases the volume by $(1+\frac{2}{n})^n = O(1)$. Thus we see that the Euclidean metric behaves “like an ultrametric” so long as one can somehow separate scales by a ratio of $n$ or more. (This observation has also turned out to be crucial for efficient estimates in modern additive combinatorics, resulting in a technique of scale separation for balls (or related objects such as Bohr sets) that is sometimes informally referred to as “Bourgainisation” due to the breakthrough work of Bourgain on Roth’s theorem, which introduced the technique.) This is the intuition behind the localisation result.
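One can check numerically that enlarging the radius of an $n$-dimensional ball by a factor $1+\frac{2}{n}$ costs only a bounded factor in volume, in contrast with doubling; a quick sanity check:

```python
import math

for n in (1, 2, 5, 10, 50, 100):
    enlarge = (1 + 2 / n) ** n    # volume ratio when the radius grows by 1+2/n
    double = 2.0 ** n             # volume ratio when the radius doubles
    assert enlarge <= math.e ** 2 # (1+2/n)^n increases to e^2 ~ 7.39
    print(n, round(enlarge, 2), double)
```

So the “gentle” enlargement stays below $e^2 \approx 7.39$ in every dimension, while doubling costs $2^n$.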

How can one formalise this intuition? To avoid some annoying technical issues it is convenient to work on the unit torus $({\bf R}/{\bf Z})^n$ (with the induced Euclidean metric) rather than the Euclidean space ${\bf R}^n$; note that a simple scaling argument shows that the maximal inequality constants for the former control those of the latter.

Let us impose a random $M$-ary mesh on the torus for a suitable scale parameter $M$ (which one should think of as a power of $n$), by identifying the torus with a randomly translated copy of the unit cube $[0,1]^n$, then partitioning this cube into $M^n$ subcubes of length $1/M$, each of which is partitioned further into $M^n$ subcubes of length $1/M^2$, and so forth.

This mesh induces an ultrametric on the torus, with the distance between two distinct points $x, y$ defined as the sidelength $\delta(x,y)$ of the minimal cube in the mesh that contains both $x$ and $y$; this is basically the ultrametric coming from viewing the mesh as an infinite tree of nested cubes. However, this ultrametric is not particularly close to the Euclidean metric; in particular, two points close to, but on opposite sides of, the boundary between two cubes in the mesh will be close in the Euclidean metric but far apart in the ultrametric.
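Here is a small illustration of both the ultrametric structure and the boundary defect just mentioned, in one dimension and with a binary mesh as an arbitrary choice (my own sketch, not the construction of the paper):

```python
import math
import random

def mesh_ultrametric(x, y, M=2, depth=40):
    """Ultrametric induced by an M-ary mesh of [0,1): the distance between
    distinct points is the sidelength of the smallest mesh cube (here, an
    interval) containing both of them."""
    side = 1.0
    for _ in range(depth):
        nxt = side / M
        if math.floor(x / nxt) != math.floor(y / nxt):
            return side          # the points first separate at this scale
        side = nxt
    return 0.0                   # indistinguishable at this resolution

# nearby points straddling a mesh boundary are far apart in the ultrametric
print(mesh_ultrametric(0.4999, 0.5001))  # 1.0, despite Euclidean distance 0.0002

# the strong triangle inequality d(x,z) <= max(d(x,y), d(y,z)) holds exactly
random.seed(2)
for _ in range(1000):
    x, y, z = (random.random() for _ in range(3))
    d = mesh_ultrametric
    assert d(x, z) <= max(d(x, y), d(y, z))
```

The random translation in the actual construction is precisely what tames the boundary defect exhibited in the first print statement.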

But one can avoid this issue by the standard trick of *padding* the mesh. Suppose one takes every cube in the mesh and shrinks it (about its centre) by a factor of $1 - \frac{1}{n}$, thus decreasing the volume by the constant factor $(1-\frac{1}{n})^n \sim 1/e$; call this the *padded* cube at this scale. Note from the random nature of the mesh construction that every point $x$ has a constant probability of lying in the padded cube at a fixed scale; in particular, by Fubini’s theorem (aka the first moment method), given a measurable set $E$ in the torus, the padded cubes at a fixed scale will (in expectation) capture a constant fraction of $E$. It is because of this that we can afford to restrict the maximal function to padded cubes without significantly losing control of the constants (let me be a bit vague here about exactly how one does this).
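The constant-probability claim is easy to simulate. The sketch below (with arbitrary choices of dimension, scale, and the per-coordinate shrink factor $1-\frac{1}{n}$ assumed here) estimates the probability that a fixed point lands in a padded cube under a random translation of the mesh:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20                     # dimension
side = 0.25                # sidelength of the mesh cubes at this scale
pad = side / (2 * n)       # margin shaved off each face (shrink factor 1 - 1/n)

x = rng.random(n)          # an arbitrary fixed point of the torus [0,1)^n
trials = 100_000
shifts = rng.random((trials, n))     # random translations of the mesh
pos = (x - shifts) % side            # position of x within its mesh cube
inside = ((pos >= pad) & (pos <= side - pad)).all(axis=1)
print(inside.mean(), (1 - 1 / n) ** n)   # empirical vs exact (1-1/n)^n ~ 1/e
```

Each coordinate independently lands in the padded range with probability $1-\frac{1}{n}$, so the point lies in the padded cube with probability $(1-\frac{1}{n})^n$, a constant bounded below by a fixed fraction of $1/e$ uniformly in $n$.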

The key point is that inside a padded cube of sidelength $\delta$, any ball of radius $\delta/4n$ or less intersecting this padded cube will not intersect any other padded cube at this scale. Thus, balls at this scale or less behave somewhat as if the Euclidean metric was an ultrametric, and this turns out to be enough to achieve localisation.

This argument turns out to be extremely general, and applies to any metric measure space (i.e. a metric space equipped with a Radon measure $\mu$) for which one has the *microdoubling condition* $\mu(B(x, (1+\frac{1}{n})r)) = O(\mu(B(x,r)))$. In particular, this applies to *Ahlfors-David regular spaces*, in which $\mu(B(x,r))$ is comparable to $r^n$ for some constant $n$. Instead of using cubes to create the mesh, it turns out that one can just use randomly placed balls, whose radii are essentially powers of $n$, to achieve a much stranger looking, but functionally comparable, mesh.

Because of microdoubling, one can assume without loss of generality that all radii $r$ are powers of $1+\frac{1}{n}$, since one can round off any radius to the nearest such power without much loss. There are only $O(n \log n)$ such powers in any range $[r_0, n r_0]$, which thus gives the Stein-Strömberg bound $O(n \log n)$ in the general microdoubling setting. (We also give an alternate proof of this result, adapting a slightly different probabilistic argument of Lindenstrauss.)
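Counting the scales is a one-liner: the number of powers of $1+\frac{1}{n}$ in a range of ratio $n$ is $\log n / \log(1+\frac{1}{n}) \approx n \log n$, which the following sketch confirms numerically:

```python
import math

def num_scales(n, r0=1.0):
    """Number of powers of (1+1/n) lying in the localised range [r0, n*r0]."""
    return math.floor(math.log(n) / math.log(1 + 1 / n)) + 1

for n in (10, 100, 1000):
    # the count tracks n*log(n) closely, since log(1+1/n) ~ 1/n
    print(n, num_scales(n), round(n * math.log(n)))
```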

On the other hand, one can concoct microdoubling spaces in which the balls of radius $(1+\frac{1}{n})^j$ for $j = 1, \ldots, n \log n$ are genuinely “different”, so much so that the best constant in the maximal inequality is comparable to $n \log n$, even if one makes reasonable assumptions such as translation-invariance (assuming an abelian group structure on the space), Ahlfors-David regularity, and $L^p$ bounds (the maximal operator is bounded on $L^p$ for every $p > 1$). The basic counterexample here is constructed by working in a finite field vector space with counting measure and creating a rather unusual (but still Ahlfors-David regular) metric which, for each $j$, distributes a large portion of the mass of the ball of radius $(1+\frac{1}{n})^j$ centred at the origin to a different subspace of the vector space. This essentially converts the Hardy-Littlewood maximal function to a maximal function over averages of unrelated subspaces, which one can show to be large by straightforward computations.

Finally, we give an example of a natural space without any microdoubling properties whatsoever for which one still has a maximal inequality, namely the free non-abelian group on a bounded number of generators. Here one can rely instead on the expander (or isoperimetric) properties of the Cayley graph. This suggests to us that one should have a Hardy-Littlewood maximal inequality for measure-preserving actions of the free group on a probability space, but unfortunately the standard transference technology for doing this does not apply because the free group is highly non-amenable. This question therefore remains open (although $L^p$ bounds for $p > 1$ were previously established by Nevo and Stein).

## 9 comments


8 December, 2009 at 5:45 am

Dave G: I think a word (maybe “holds”?) is missing from the final sentence of the first paragraph. “A similar inequality in fact with the Euclidean norm…”

[Corrected, thanks – T.]

8 December, 2009 at 7:25 am

ioannis parissis: First of all, this is a very interesting article, and I can’t claim to understand it in the depth that it goes. I will stick to the Euclidean world:

– It is my impression that in order to improve the known bounds in the Euclidean case (or in the $\ell^\infty$ case), one has to “address all scales at the same time”. The Stein-Strömberg proof (via the covering lemma), as well as the Lindenstrauss lemma approach, use the localization principle you explain here and get the final bound by counting how many distinct scales there are. Another, more naive, way to put it is that everything happens for radii that are between powers of $1+\frac{1}{n}$ and powers of $n$, as you mention in your notes for the Lindenstrauss Lemma. Any sparser set of radii than powers of $n$ gives constants $O(1)$, and any denser set than powers of $1+\frac{1}{n}$ is already the unrestricted maximal function (up to absolute constants). This has already been observed by Soria and Menarguez (corrected link here), to do them justice. Between these two endpoints, there are essentially $n \log n$ scales. Philosophically speaking, this approach cannot ever give us anything better than $n \log n$. Contrast that to the Euclidean case; the comparison with the heat kernel does not use this localization, but rather the Euclidean symmetry for all scales at the same time (and it is enough to do that for large enough radii).

– Aiming somewhere in the middle, it would be interesting to understand if the bound $O(\log n)$ for (say) lacunary radii $2^k$ is best possible. We get this bound, again, as a consequence of the localization principle. But I don’t know of a direct approach to this, meaning an approach that’s specific to this scale.

8 December, 2009 at 6:40 pm

Terence Tao: Thanks for the Soria-Menarguez reference (though it appears that their Collectanea paper is more relevant here than the one you linked, which is mostly concerned with the 1D case). Yes, the localisation principle formalises the gap between powers of $n$ (where everything is understood) and powers of $1+\frac{1}{n}$ (which is essentially the full case) by saying that we can reduce to just the scales between, say, $1$ and $n$. At this point, it seems that one cannot get any further just from the triangle inequality, and must now use something special about the underlying metric structure.

It was also pointed out to us that we neglected to mention some closely related work of Strömberg, Rochberg-Taibleson, and Cowling-Meda-Setti on the free group and on hyperbolic space (we had already intended to cite these papers, but somehow things got lost in the process). The next revision of the paper will address these issues.

8 December, 2009 at 6:51 pm

ioannis parissis: You are absolutely right. I linked the wrong paper, so feel free to edit my comment with the correct link (the Collectanea paper is freely available here as far as I can tell).

[OK, fixed. -T]

9 December, 2009 at 7:07 pm

CJ: Some formulas in this post aren’t showing up, like in a previous posting of yours. For example, whatever formula defines $C_n$ in the first few sentences doesn’t show up. (I’m running Firefox on OS 10.4, if that matters.)

11 December, 2009 at 5:15 am

ioannis parissis: Not very important, but for the sake of accuracy: on page 4, line 23 of the article, “[…] it was necessary to exploit the relationship between averaging on balls and the Poisson semi-group […]”; I think you mean the heat semi-group.

à propos: I haven’t actually checked whether we can use the Poisson semi-group instead here. In the Bourgain/Carbery strong result, the Poisson semi-group is used instead and seems to be more appropriate. I also haven’t checked (maybe a naive comment, but an obvious one) whether an improvement on the weak (1,1) bound can be achieved for the $\ell^\infty$ metric by comparing to a suitable semi-group.

16 October, 2010 at 8:29 pm

245A, Notes 5: Differentiation theorems « What’s new: […] to a result of Stein and Strömberg, but it is not known if $C_n$ is bounded in $n$ or grows as $n \to \infty$. See this blog post for some further discussion. Exercise 19 (Dyadic maximal inequality) If $f$ is an absolutely […]

17 November, 2012 at 9:34 am

A Few Mathematical Snapshots from India (ICM2010) | Combinatorics and more: […] by Hardy and Littlewood depending on the dimension. Here is a related paper by Naor and Tao and a related post on “what’s […]

18 May, 2015 at 8:09 pm

Failure of the L^1 pointwise and maximal ergodic theorems for the free group | What's new: […] this for the Cesàro averages, but not for itself. About six years ago, Assaf Naor and I tried our hand at this problem, and were able to show an associated maximal inequality on the free group, but due to the non-amenability of the free group, […]