This is a well-known problem in multilinear harmonic analysis; it is fascinating to me because it lies barely beyond the reach of the best technology we have for these problems (namely, multiscale time-frequency analysis), and because the most recent developments in quadratic Fourier analysis seem likely to shed some light on this problem.

Recall that the Hilbert transform is defined on test functions $f \in {\mathcal S}({\Bbb R})$ (up to irrelevant constants) as

$Hf(x) := p.v. \int_{\Bbb R} f(x+t) \frac{dt}{t},$

where the integral is evaluated in the principal value sense (removing the region $|t| < \epsilon$ to ensure integrability, and then taking the limit as $\epsilon \to 0$).

One of the basic results in (linear) harmonic analysis is that the Hilbert transform is bounded on $L^p({\Bbb R})$ for every $1 < p < \infty$; thus, for each such $p$ there exists a finite constant $C_p$ such that

$\| Hf \|_{L^p({\Bbb R})} \leq C_p \|f\|_{L^p({\Bbb R})}.$

One can view this boundedness result (which is of importance in complex analysis and one-dimensional Fourier analysis, while also providing a model case of the more general Calderón-Zygmund theory of singular integral operators) as an assertion that the Hilbert transform is “not much larger than” the identity operator. And indeed the two operators are very similar; both are invariant under translations and dilations, and on the Fourier side, the Hilbert transform barely changes the magnitude of the Fourier transform at all:

$\hat{Hf}(\xi) = \pi i \hbox{sgn}(\xi) \hat f(\xi).$
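As a quick numerical sanity check of this multiplier (a sketch only, and not part of any proof), one can approximate $p.v. \int_{\Bbb R} e^{2\pi i \xi t} \frac{dt}{t}$ directly. I am assuming here the normalisation $\hat f(\xi) = \int_{\Bbb R} f(x) e^{-2\pi i x \xi}\ dx$ for the Fourier transform; the function name `hilbert_multiplier` and the truncation parameters are illustrative choices of mine.

```python
import math

def hilbert_multiplier(xi, cutoff=1000.0, steps=200_000):
    """Approximate p.v. int e^{2 pi i xi t} dt/t.

    The cosine part of the integrand is odd in t and cancels in the principal
    value, leaving 2i * int_0^infty sin(2 pi xi t)/t dt.  Substituting
    u = 2 pi |xi| t removes |xi| entirely (this is the dilation invariance),
    so only the sign of xi survives.  The u-integral is truncated at `cutoff`.
    """
    if xi == 0:
        return 0j
    h = cutoff / steps
    total = 0.0
    # Midpoint rule for int_0^cutoff sin(u)/u du; midpoints avoid u = 0.
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.sin(u) / u
    total *= h
    return complex(0.0, 2.0 * math.copysign(1.0, xi) * total)

for xi in (-3.0, -0.5, 0.25, 7.0):
    m = hilbert_multiplier(xi)
    # The last column should be close to 1, i.e. |m| is close to pi.
    print(xi, m, abs(m.imag) / math.pi)
```

The truncation error decays only like the reciprocal of `cutoff`, which is one more manifestation of the fact that the kernel is not absolutely integrable.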

In fact, one can show the only reasonable (e.g. $L^2$-bounded) operators which are invariant under translations and dilations are just the linear combinations of the Hilbert transform and the identity operator. (A useful heuristic in this area is to view the singular kernel $p.v. 1/t$ as being of similar “strength” to the Dirac delta function $\delta(t)$ – for instance, they have the same scale-invariance properties.)

Note that the Hilbert transform is formally a convolution of $f$ with the kernel $1/t$. This kernel is almost, but not quite, absolutely integrable – the integral of $1/|t|$ diverges logarithmically both at zero and at infinity. If the kernel were absolutely integrable, then the above $L^p$ boundedness result would be a simple consequence of Young’s inequality (or Minkowski’s inequality); the difficulty is thus “just” one of avoiding a logarithmic divergence. To put it another way, if one dyadically decomposes the Hilbert transform into pieces localised at different scales (e.g. restricting to an “annulus” $|t| \sim 2^n$), then it is a triviality to establish boundedness of each component; the difficulty is ensuring that there is enough cancellation or orthogonality that one can sum over the (logarithmically infinite number of) scales and still recover boundedness.
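To see the logarithmic divergence concretely, here is a small sketch (the function names are my own) that measures the $L^1$ mass of the kernel $1/|t|$ on each dyadic annulus $2^n \leq |t| < 2^{n+1}$. Each annulus carries the same mass $2 \log 2$, so Young’s inequality bounds every single-scale piece with the same constant, but summing those bounds over $N$ scales with the triangle inequality alone gives a bound growing linearly in $N$.

```python
import math

def annulus_mass(n, samples=10_000):
    """Midpoint-rule approximation of the L^1 mass of 1/|t| on 2^n <= |t| < 2^(n+1)."""
    a, b = 2.0 ** n, 2.0 ** (n + 1)
    h = (b - a) / samples
    one_side = sum(h / (a + (k + 0.5) * h) for k in range(samples))
    return 2.0 * one_side  # factor 2 accounts for both signs of t

def truncated_mass(n_min, n_max):
    """L^1 mass of 1/|t| restricted to 2^n_min <= |t| < 2^(n_max + 1)."""
    return sum(annulus_mass(n) for n in range(n_min, n_max + 1))

for n in (-10, 0, 10):
    print(n, annulus_mass(n), 2 * math.log(2))   # the same mass at every scale

# 41 scales give about 41 * 2 log 2: linear in the number of scales,
# i.e. logarithmic in the ratio of the largest to the smallest scale.
print(truncated_mass(-20, 20), 41 * 2 * math.log(2))
```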

There are a number of ways to establish boundedness of the Hilbert transform. One way is to decompose all functions involved into wavelets – functions which are localised in space and scale, and whose frequencies stay at a fixed distance from the origin (relative to the scale). By using standard estimates concerning how a function can be decomposed into wavelets, how the Hilbert transform acts on wavelets, and how wavelets can be used to reconstitute functions, one can establish the desired boundedness. The use of wavelets to mediate the action of the Hilbert transform fits well with the two symmetries of the Hilbert transform (translation and scaling), because the collection of wavelets also obeys (discrete versions of) these symmetries. One can view the theory of such wavelets as a dyadic framework for Calderón-Zygmund theory.

Just as the Hilbert transform behaves like the identity, it was conjectured by Calderón (motivated by the study of the Cauchy integral on Lipschitz curves) that the bilinear Hilbert transform

$B(f,g)(x) := p.v. \int_{\Bbb R} f(x+t) g(x+2t) \frac{dt}{t}$

would behave like the pointwise product operator $f, g \mapsto fg$ (exhibiting again the analogy between $p.v. 1/t$ and $\delta(t)$); in particular, one should have the Hölder-type inequality

$\| B(f,g) \|_{L^r({\Bbb R})} \leq C_{p,q} \|f\|_{L^p({\Bbb R})} \|g\|_{L^q({\Bbb R})}$ (*)

whenever $1 < p,q < \infty$ and $\frac{1}{r} = \frac{1}{p} + \frac{1}{q}$. (There is nothing special about the “2” in the definition of the bilinear Hilbert transform; one can replace this constant by any other constant except for 0, 1, or infinity, though it is a delicate issue to maintain good control on the constant $C_{p,q}$ as the new constant approaches one of these exceptional values. Note that by setting $g=1$ and looking at the limiting case $q=\infty$ we recover the linear Hilbert transform theory from the bilinear one; thus we expect the bilinear theory to be harder.) Again, this claim is trivial when localising to a single scale $|t| \sim 2^n$, as it can then be quickly deduced from Hölder’s inequality. The difficulty is then to combine all the scales together.

It took some time to realise that Calderón-Zygmund theory, despite being incredibly effective in the linear setting, was not the right tool for the bilinear problem. One way to see the problem is to observe that the bilinear Hilbert transform B (or more precisely, the estimate (*)) enjoys one additional symmetry beyond the scaling and translation symmetries that the Hilbert transform H obeyed. Namely, one has the modulation invariance

$B( e_{-2\xi} f, e_\xi g ) = e_{-\xi} B(f,g)$

for any frequency $\xi$, where $e_{\xi}(x) := e^{2\pi i \xi x}$ is the linear plane wave of frequency $\xi$, which leads to a modulation symmetry for the estimate (*). This symmetry – which has no non-trivial analogue in the linear Hilbert transform – is a consequence of the algebraic identity

$\xi x - 2\xi (x+t) + \xi (x+2t) = 0$

which can in turn be viewed as an assertion that linear functions have a vanishing second derivative.
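The modulation invariance can also be checked numerically at the level of the integrand, before any principal value is taken. The following is a minimal sketch; the Gaussian test functions `f`, `g` and the sample grid are hypothetical choices made purely for illustration.

```python
import cmath
import math

def e(xi, x):
    """Linear plane wave e_xi(x) = exp(2 pi i xi x)."""
    return cmath.exp(2j * math.pi * xi * x)

def f(x):  # hypothetical test function (a Gaussian)
    return math.exp(-x * x)

def g(x):  # hypothetical test function (a shifted Gaussian)
    return math.exp(-(x - 1.0) ** 2)

def integrand_modulated(xi, x, t):
    """Integrand of B(e_{-2 xi} f, e_xi g)(x) at t, omitting the common factor 1/t."""
    return e(-2 * xi, x + t) * f(x + t) * e(xi, x + 2 * t) * g(x + 2 * t)

def integrand_plain(xi, x, t):
    """e_{-xi}(x) times the integrand of B(f, g)(x) at t, omitting the factor 1/t."""
    return e(-xi, x) * f(x + t) * g(x + 2 * t)

def max_discrepancy(xi, points):
    """Largest pointwise difference between the two integrands over the sample points."""
    return max(abs(integrand_modulated(xi, x, t) - integrand_plain(xi, x, t))
               for x, t in points)

points = [(0.3 * i - 1.0, 0.17 * j - 0.9) for i in range(8) for j in range(12)]
print(max_discrepancy(1.7, points))  # at the level of floating-point rounding
```

Since the two integrands agree for every $t$, the identity survives the truncation to $|t| > \epsilon$ and the limit $\epsilon \to 0$.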

It is a general principle that if one wants to establish a delicate estimate which is invariant under some non-compact group of symmetries, then the proof of that estimate should also be largely invariant under that symmetry (or, if it does eventually decide to break the symmetry (e.g. by performing a normalisation), it should do so in a way that will yield some tangible profit). Calderón-Zygmund theory gives the frequency origin $\xi = 0$ a preferred role (for instance, all wavelets have mean zero, i.e. their Fourier transforms vanish at the frequency origin), and so is not the appropriate tool for any modulation-invariant problem.

The conjecture of Calderón was finally verified in a breakthrough pair of papers by Lacey and Thiele, first in the “easy” region $2 < p,q,r' < \infty$ (in which all functions are locally in $L^2$ and so local Fourier analytic methods are particularly tractable) and then in the significantly larger region where $r > 2/3$. (Extending the latter result to $r=2/3$ or beyond remains open, and can be viewed as a toy version of the trilinear Hilbert transform question discussed below.) The key idea (dating back to Fefferman) was to replace the wavelet decomposition by a more general wave packet decomposition – wave packets being functions which are well localised in position, scale, and frequency, but are more general than wavelets in that their frequencies do not need to hover near the origin; in particular, the wave packet framework enjoys the same symmetries as the estimate that one is seeking to prove. (As such, wave packets are a highly overdetermined basis, in contrast to the exact bases that wavelets offer, but this turns out to not be a problem, provided that one focuses more on decomposing the operator $B$ rather than the individual functions $f,g$.) Once the wave packets are used to mediate the action of the bilinear Hilbert transform $B$, Lacey and Thiele then used a carefully chosen combinatorial algorithm to organise these packets into “trees” concentrated in mostly disjoint regions of phase space, applying (modulated) Calderón-Zygmund theory to each tree, and then using orthogonality methods to sum the contributions of the trees together. (The same method also leads to the simplest proof known of Carleson’s celebrated theorem on convergence of Fourier series.)

Since the Lacey-Thiele breakthrough, there has been a flurry of other papers (including some that I was involved in) extending the time-frequency method to many other types of operators; all of these had the characteristic that these operators were invariant (or “morally” invariant) under translation, dilation, and some sort of modulation; this includes a number of operators of interest to ergodic theory and to nonlinear scattering theory. However, in this post I want to instead discuss an operator which does not lie in this class, namely the trilinear Hilbert transform

$T(f,g,h)(x) := p.v. \int_{\Bbb R} f(x+t) g(x+2t) h(x+3t) \frac{dt}{t}.$

Again, since we expect $p.v. 1/t$ to behave like $\delta(t)$, we expect the trilinear Hilbert transform to obey a Hölder-type inequality

$\| T(f,g,h) \|_{L^s({\Bbb R})} \leq C_{p,q,r} \|f\|_{L^p({\Bbb R})} \|g\|_{L^q({\Bbb R})} \|h\|_{L^r({\Bbb R})}$ (**)

whenever $1 < p,q,r < \infty$ and $\frac{1}{s} = \frac{1}{p} + \frac{1}{q} + \frac{1}{r}$. This conjecture is currently unknown for any exponents $p,q,r$ – even the case $p=q=r=4$, which is the “easiest” case by symmetry, duality and interpolation arguments. The main new difficulty is that in addition to the three existing invariances of translation, scaling, and modulation (actually, modulation is now a two-parameter invariance), one now also has a quadratic modulation invariance

$T( q_{-3\xi} f, q_{3\xi} g, q_{-\xi} h ) = q_{-\xi} T(f,g,h)$

for any “quadratic frequency” $\xi$, where $q_{\xi}(x) := e^{2\pi i \xi x^2}$ is the quadratic plane wave of frequency $\xi$, which leads to a quadratic modulation symmetry for the estimate (**). This symmetry is a consequence of the algebraic identity

$\xi x^2 - 3\xi (x+t)^2 + 3\xi (x+2t) ^2 - \xi (x+3t)^2= 0$

which can in turn be viewed as an assertion that quadratic functions have a vanishing third derivative.
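Both algebraic identities are, up to an overall sign, instances of the fact that a forward difference of order $k+1$ annihilates polynomials of degree $k$. Here is a short sketch that verifies this in exact rational arithmetic; the helper `forward_difference` and the sample values are my own illustrative choices.

```python
from fractions import Fraction
from math import comb

def forward_difference(phase, x, t, order):
    """Forward difference of the given order with step t:
    sum over j of (-1)^(order - j) * C(order, j) * phase(x + j*t)."""
    return sum((-1) ** (order - j) * comb(order, j) * phase(x + j * t)
               for j in range(order + 1))

xi = Fraction(5, 7)                      # arbitrary rational "frequency"
x, t = Fraction(-3, 2), Fraction(11, 4)  # arbitrary rational sample point

linear = lambda y: xi * y
quadratic = lambda y: xi * y * y

# Pattern (1, -2, 1) on linear phases: the identity behind modulation invariance of B.
print(forward_difference(linear, x, t, 2))      # 0
# Pattern (-1, 3, -3, 1) on quadratic phases: the identity behind the quadratic symmetry of T.
print(forward_difference(quadratic, x, t, 3))   # 0
# The pattern (1, -2, 1) does not cancel quadratic phases: the result is 2*xi*t^2.
print(forward_difference(quadratic, x, t, 2) == 2 * xi * t * t)  # True
```

The last line illustrates why the quadratic modulations are genuinely new at the trilinear level: with only two shifted copies of a quadratic phase there is no non-trivial combination that is independent of $t$.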

It is because of this symmetry that time-frequency methods based on Fefferman-Lacey-Thiele style wave packets seem to be ineffective (though the failure is very slight; one can control entire “forests” of trees of wave packets, but when summing up all the relevant forests in the problem one unfortunately encounters a logarithmic divergence; also, it is known that if one ignores the sign of the wave packet coefficients and only concentrates on the magnitude – which one can get away with for the bilinear Hilbert transform – then the associated trilinear expression is in fact divergent). Indeed, wave packets are certainly not invariant under quadratic modulations. One can then hope to work with the next obvious generalisation of wave packets, namely the “chirps” – quadratically modulated wave packets – but the combinatorics of organising these chirps into anything resembling trees or forests seems to be very difficult. Also, recent work in the additive combinatorial approach to Szemerédi’s theorem (as well as in the ergodic theory approaches) suggests that these quadratic modulations might not be the only obstruction, that other “2-step nilpotent” modulations may also need to be somehow catered for. Indeed I suspect that some of the modern theory of Szemerédi’s theorem for progressions of length 4 will have to be invoked in order to solve the trilinear problem. (Again based on analogy with the literature on Szemerédi’s theorem, the problem of quartilinear and higher Hilbert transforms is likely to be significantly more difficult still, and thus not worth studying at this stage.)

This problem may be too difficult to attack directly, and one might look at some easier model problems first. One that was already briefly mentioned above was to return to the bilinear Hilbert transform and try to establish an endpoint result at $r=2/3$. At this point there is again a logarithmic failure of the time-frequency method, and so one is forced to hunt for a different approach. Another is to look at the bilinear maximal operator

$M(f,g)(x) := \sup_{r > 0} \frac{1}{2r} \int_{-r}^r f(x+t) g(x+2t)\ dt$

which is a bilinear variant of the Hardy-Littlewood maximal operator, in much the same way that the bilinear Hilbert transform is a variant of the linear Hilbert transform. It was shown by Lacey that this operator obeys most of the bounds that the bilinear Hilbert transform does, but the argument is rather complicated, combining the time-frequency analysis with some Fourier-analytic maximal inequalities of Bourgain. In particular, despite the “positive” (non-oscillatory) nature of the maximal operator, the only known proof of the boundedness of this operator is oscillatory. It is thus natural to seek a “positive” proof that does not require as much use of oscillatory tools such as the Fourier transform; in particular, it is tempting to try an additive combinatorial approach. Such an approach has had some success with a slightly easier operator in a similar spirit, in an unpublished paper of Demeter, Thiele, and myself. There is also a paper of Christ in which a different type of additive combinatorics (coming, in fact, from work on the Kakeya problem) was used to establish a non-trivial estimate for single-scale models of various multilinear Hilbert transforms or maximal operators. If these operators are understood better, then perhaps additive combinatorics can be used to attack the trilinear maximal operator, and thence the trilinear Hilbert transform. (This trilinear maximal operator, incidentally, has some applications to pointwise convergence of multiple averages in ergodic theory.)
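To make the definition of the bilinear maximal operator $M(f,g)$ concrete, here is a sketch of a discrete, periodic analogue on ${\Bbb Z}/N{\Bbb Z}$. Replacing ${\Bbb R}$ by a cyclic group, and averaging over $2r+1$ points rather than over an interval of length $2r$, are simplifying assumptions of mine for illustration; this is not the operator treated in the papers mentioned above.

```python
def bilinear_maximal(f, g, max_radius=None):
    """Discrete periodic analogue of M(f, g) on Z/NZ (assumes N >= 2):

        M(f, g)(x) = max over 1 <= r <= R of
                     (2r + 1)^(-1) * sum over |t| <= r of f(x + t) * g(x + 2t),

    with all indices taken modulo N = len(f).
    """
    n = len(f)
    assert len(g) == n and n >= 2
    if max_radius is None:
        max_radius = n // 2
    out = []
    for x in range(n):
        running = f[x] * g[x]  # the t = 0 term
        best = float("-inf")
        for r in range(1, max_radius + 1):
            # Add the two new terms t = r and t = -r to the running sum.
            running += f[(x + r) % n] * g[(x + 2 * r) % n]
            running += f[(x - r) % n] * g[(x - 2 * r) % n]
            best = max(best, running / (2 * r + 1))
        out.append(best)
    return out

f = [0, 1, 2, 0, 3, 1, 0, 0]
g = [1, 0, 2, 1, 0, 0, 4, 1]
print(bilinear_maximal(f, g))
```

For non-negative inputs one always has the trivial pointwise bound $M(f,g) \leq \|f\|_\infty \|g\|_\infty$; the difficulty lies entirely in obtaining $L^p$-type bounds that are uniform in the number of scales.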

Another, rather different, approach would be to work in the “finite field model” in which the underlying field ${\Bbb R}$ is replaced by a Cantor ring $\overline{F(t)}$ of formal Laurent series over a finite field F; in such “dyadic models” the analysis is known to be somewhat simpler (in large part because in this non-Archimedean setting it now becomes possible to create wave packets which are localised in both space and frequency). Nazarov has an unpublished proof of the boundedness of the bilinear Hilbert transform in characteristic 3 settings based on a Bellman function approach; it may be that one could achieve something similar over the field of 4 elements for (a suitably defined version of) the trilinear Hilbert transform. This would at least give supporting evidence for the analogous conjecture in ${\Bbb R}$, although it looks unlikely that a positive result in the dyadic setting would have a direct impact on the continuous one.