
This set of notes discusses aspects of one of the oldest questions in Fourier analysis, namely the nature of convergence of Fourier series.

If ${f: {\bf R}/{\bf Z} \rightarrow {\bf C}}$ is an absolutely integrable function, its Fourier coefficients ${\hat f: {\bf Z} \rightarrow {\bf C}}$ are defined by the formula

$\displaystyle \hat f(n) := \int_{{\bf R}/{\bf Z}} f(x) e^{-2\pi i nx}\ dx.$

If ${f}$ is smooth, then the Fourier coefficients ${\hat f}$ are absolutely summable, and we have the Fourier inversion formula

$\displaystyle f(x) = \sum_{n \in {\bf Z}} \hat f(n) e^{2\pi i nx}$

where the series here is uniformly convergent. In particular, if we define the partial summation operators

$\displaystyle S_N f(x) := \sum_{|n| \leq N} \hat f(n) e^{2\pi i nx}$

then ${S_N f}$ converges uniformly to ${f}$ when ${f}$ is smooth.
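This rapid uniform convergence for smooth functions is easy to observe numerically. The following sketch (my own; the grid size and the smooth test function ${e^{\cos(2\pi x)}}$ are arbitrary choices) approximates the Fourier coefficients by a normalised DFT and measures the sup-norm error of the partial sums ${S_N f}$:

```python
import numpy as np

# Sample a smooth 1-periodic function on a fine grid.
M = 1024
x = np.arange(M) / M
f = np.exp(np.cos(2 * np.pi * x))  # smooth and 1-periodic

# For periodic functions the trapezoid rule for \hat f(n) is exactly a
# normalised DFT; fhat[n % M] approximates \hat f(n) for |n| << M.
fhat = np.fft.fft(f) / M

def S_N(N):
    """Partial sum S_N f on the grid, using coefficients |n| <= N."""
    n = np.arange(-N, N + 1)
    coeffs = fhat[n % M]  # negative frequencies wrap around in the DFT
    return (coeffs[None, :] * np.exp(2j * np.pi * np.outer(x, n))).sum(axis=1)

for N in (2, 4, 8, 16):
    print(N, np.max(np.abs(S_N(N) - f)))  # sup-norm error decays rapidly
```

Since the coefficients of this particular ${f}$ decay faster than exponentially, a handful of modes already reproduces ${f}$ to near machine precision.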

What if ${f}$ is not smooth, but merely lies in an ${L^p({\bf R}/{\bf Z})}$ class for some ${1 \leq p \leq \infty}$? The Fourier coefficients ${\hat f}$ remain well-defined, as do the partial summation operators ${S_N}$. The question of convergence in norm is relatively easy to settle:

Exercise 1
• (i) If ${1 < p < \infty}$ and ${f \in L^p({\bf R}/{\bf Z})}$, show that ${S_N f}$ converges in ${L^p({\bf R}/{\bf Z})}$ norm to ${f}$. (Hint: first use the boundedness of the Hilbert transform to show that ${S_N}$ is bounded in ${L^p({\bf R}/{\bf Z})}$ uniformly in ${N}$.)
• (ii) If ${p=1}$ or ${p=\infty}$, show that there exists ${f \in L^p({\bf R}/{\bf Z})}$ such that the sequence ${S_N f}$ is unbounded in ${L^p({\bf R}/{\bf Z})}$ (so in particular it certainly does not converge in ${L^p({\bf R}/{\bf Z})}$ norm to ${f}$). (Hint: first show that ${S_N}$ is not bounded in ${L^p({\bf R}/{\bf Z})}$ uniformly in ${N}$, then apply the uniform boundedness principle in the contrapositive.)
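The failure of uniform boundedness in part (ii) can be seen concretely in the Lebesgue constants ${\|D_N\|_{L^1}}$, where ${D_N(x) = \sum_{|n| \leq N} e^{2\pi i nx} = \frac{\sin((2N+1)\pi x)}{\sin(\pi x)}}$ is the Dirichlet kernel; these are the ${L^1 \to L^1}$ operator norms of ${S_N}$, and they famously grow like ${\frac{4}{\pi^2}\log N}$. A quick numerical check (a sketch of mine; the midpoint rule and the grid size are arbitrary choices):

```python
import numpy as np

def lebesgue_constant(N, M=200000):
    # Dirichlet kernel D_N(x) = sin((2N+1)πx) / sin(πx); integrate |D_N|
    # over one period by the midpoint rule (which avoids the point x = 0).
    x = (np.arange(M) + 0.5) / M
    D = np.sin((2 * N + 1) * np.pi * x) / np.sin(np.pi * x)
    return np.mean(np.abs(D))  # ≈ ∫_{R/Z} |D_N|

for N in (10, 100, 1000):
    print(N, lebesgue_constant(N))  # grows like (4/π²) log N
```

Each tenfold increase in ${N}$ adds roughly ${\frac{4}{\pi^2}\ln 10 \approx 0.93}$ to the constant, consistent with logarithmic growth.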

The question of pointwise almost everywhere convergence turned out to be a significantly harder problem:

Theorem 2 (Pointwise almost everywhere convergence)
• (i) (Kolmogorov, 1923) There exists ${f \in L^1({\bf R}/{\bf Z})}$ such that ${S_N f(x)}$ is unbounded in ${N}$ for almost every ${x}$.
• (ii) (Carleson, 1966; conjectured by Lusin, 1913) For every ${f \in L^2({\bf R}/{\bf Z})}$, ${S_N f(x)}$ converges to ${f(x)}$ as ${N \rightarrow \infty}$ for almost every ${x}$.
• (iii) (Hunt, 1967) For every ${1 < p \leq \infty}$ and ${f \in L^p({\bf R}/{\bf Z})}$, ${S_N f(x)}$ converges to ${f(x)}$ as ${N \rightarrow \infty}$ for almost every ${x}$.

Note from Hölder’s inequality that ${L^2({\bf R}/{\bf Z})}$ contains ${L^p({\bf R}/{\bf Z})}$ for all ${p\geq 2}$, so Carleson’s theorem covers the ${p \geq 2}$ case of Hunt’s theorem. We remark that the precise threshold near ${L^1}$ between Kolmogorov-type divergence results and Carleson-Hunt pointwise convergence results, in the category of Orlicz spaces, is still an active area of research; see this paper of Lie for further discussion.

Carleson’s theorem in particular was a surprisingly difficult result, lying just out of reach of classical methods (as we shall see later, the result is much easier if we smooth either the function ${f}$ or the summation method ${S_N}$ by a tiny bit). Nowadays we realise that the reason for this is that Carleson’s theorem essentially contains a frequency modulation symmetry in addition to the more familiar translation symmetry and dilation symmetry. This basically rules out the possibility of attacking Carleson’s theorem with tools such as Calderón-Zygmund theory or Littlewood-Paley theory, which respect the latter two symmetries but not the former. Instead, tools from “time-frequency analysis” that essentially respect all three symmetries should be employed. We will illustrate this by giving a relatively short proof of Carleson’s theorem due to Lacey and Thiele. (There are other proofs of Carleson’s theorem, including Carleson’s original proof, its modification by Hunt, and a later time-frequency proof by Fefferman; see Remark 18 below.)

A recurring theme in mathematics is that of duality: a mathematical object ${X}$ can either be described internally (or in physical space, or locally), by describing what ${X}$ physically consists of (or what kind of maps exist into ${X}$), or externally (or in frequency space, or globally), by describing what ${X}$ globally interacts or resonates with (or what kind of maps exist out of ${X}$). These two fundamentally opposed perspectives on the object ${X}$ are often dual to each other in various ways: performing an operation on ${X}$ may transform it one way in physical space, but in a dual way in frequency space, with the frequency space description often being an “inversion” of the physical space description. In several important cases, one is fortunate enough to have some sort of fundamental theorem connecting the internal and external perspectives. Here are some (closely inter-related) examples of this perspective:

1. Vector space duality A vector space ${V}$ over a field ${F}$ can be described either by the set of vectors inside ${V}$, or dually by the set of linear functionals ${\lambda: V \rightarrow F}$ from ${V}$ to the field ${F}$ (or equivalently, the set of vectors inside the dual space ${V^*}$). (If one is working in the category of topological vector spaces, one would work instead with continuous linear functionals; and so forth.) A fundamental connection between the two is given by the Hahn-Banach theorem (and its relatives).
2. Vector subspace duality In a similar spirit, a subspace ${W}$ of ${V}$ can be described either by listing a basis or a spanning set, or dually by a list of linear functionals that cut out that subspace (i.e. a spanning set for the orthogonal complement ${W^\perp := \{ \lambda \in V^*: \lambda(w)=0 \hbox{ for all } w \in W \}}$). Again, the Hahn-Banach theorem provides a fundamental connection between the two perspectives.
3. Convex duality More generally, a (closed, bounded) convex body ${K}$ in a vector space ${V}$ can be described either by listing a set of (extreme) points whose convex hull is ${K}$, or else by listing a set of (irreducible) linear inequalities that cut out ${K}$. The fundamental connection between the two is given by the Farkas lemma.
4. Ideal-variety duality In a slightly different direction, an algebraic variety ${V}$ in an affine space ${A^n}$ can be viewed either “in physical space” or “internally” as a collection of points in ${V}$, or else “in frequency space” or “externally” as a collection of polynomials on ${A^n}$ whose simultaneous zero locus cuts out ${V}$. The fundamental connection between the two perspectives is given by the Nullstellensatz, which then leads to many of the fundamental theorems in classical algebraic geometry.
5. Hilbert space duality An element ${v}$ in a Hilbert space ${H}$ can either be thought of in physical space as a vector in that space, or in momentum space as a covector ${w \mapsto \langle v, w \rangle}$ on that space. The fundamental connection between the two is given by the Riesz representation theorem for Hilbert spaces.
6. Semantic-syntactic duality Much more generally still, a mathematical theory can either be described internally or syntactically via its axioms and theorems, or externally or semantically via its models. The fundamental connection between the two perspectives is given by the Gödel completeness theorem.
7. Intrinsic-extrinsic duality A (Riemannian) manifold ${M}$ can either be viewed intrinsically (using only concepts that do not require an ambient space, such as the Levi-Civita connection), or extrinsically, for instance as the level set of some defining function in an ambient space. Some important connections between the two perspectives include the Nash embedding theorem and the Theorema Egregium.
8. Group duality A group ${G}$ can be described either via presentations (lists of generators, together with relations between them) or representations (realisations of that group in some more concrete group of transformations). A fundamental connection between the two is Cayley’s theorem. Unfortunately, in general it is difficult to build upon this connection (except in special cases, such as the abelian case), and one cannot always pass effortlessly from one perspective to the other.
9. Pontryagin group duality A (locally compact Hausdorff) abelian group ${G}$ can be described either by listing its elements ${g \in G}$, or by listing the characters ${\chi: G \rightarrow {\bf R}/{\bf Z}}$ (i.e. continuous homomorphisms from ${G}$ to the unit circle, or equivalently elements of ${\hat G}$). The connection between the two is the focus of abstract harmonic analysis.
10. Pontryagin subgroup duality A subgroup ${H}$ of a locally compact abelian group ${G}$ can be described either by generators in ${H}$, or generators in the orthogonal complement ${H^\perp := \{ \xi \in \hat G: \xi \cdot h = 0 \hbox{ for all } h \in H \}}$. One of the fundamental connections between the two is the Poisson summation formula.
11. Fourier duality A (sufficiently nice) function ${f: G \rightarrow {\bf C}}$ on a locally compact abelian group ${G}$ (equipped with a Haar measure ${\mu}$) can either be described in physical space (by its values ${f(x)}$ at each element ${x}$ of ${G}$) or in frequency space (by the values ${\hat f(\xi) = \int_G f(x) e( - \xi \cdot x )\ d\mu(x)}$ at elements ${\xi}$ of the Pontryagin dual ${\hat G}$). The fundamental connection between the two is the Fourier inversion formula.
12. The uncertainty principle The behaviour of a function ${f}$ at physical scales above (resp. below) a certain scale ${R}$ is almost completely controlled by the behaviour of its Fourier transform ${\hat f}$ at frequency scales below (resp. above) the dual scale ${1/R}$ and vice versa, thanks to various mathematical manifestations of the uncertainty principle. (The Poisson summation formula can also be viewed as a variant of this principle, using subgroups instead of scales.)
13. Stone/Gelfand duality A (locally compact Hausdorff) topological space ${X}$ can be viewed in physical space (as a collection of points), or dually, via the ${C^*}$-algebra ${C(X)}$ of continuous complex-valued functions on that space, or (in the case when ${X}$ is compact and totally disconnected) via the Boolean algebra of clopen sets (or equivalently, the idempotents of ${C(X)}$). The fundamental connection between the two is given by the Stone representation theorem or the (commutative) Gelfand-Naimark theorem.
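Several of these dualities become completely concrete on a finite abelian group such as ${G = {\bf Z}/N{\bf Z}}$, where the Pontryagin dual is again ${{\bf Z}/N{\bf Z}}$ and all analytic subtleties disappear. The following sketch (my own; the choice of ${N}$, the subgroup, and the counting-measure normalisations are mine) verifies the Fourier inversion formula of item 11 and the finite form of the Poisson summation formula of item 10:

```python
import numpy as np

# Fourier analysis on G = Z/NZ with counting measure; the dual group is
# again Z/NZ, with characters x -> e^{2πi ξx/N}.
N = 12
rng = np.random.default_rng(0)
f = rng.normal(size=N) + 1j * rng.normal(size=N)

x = np.arange(N)
chars = np.exp(-2j * np.pi * np.outer(x, x) / N)  # chars[ξ, x] = e^{-2πi ξx/N}
fhat = chars @ f                                   # \hat f(ξ) = Σ_x f(x) e^{-2πi ξx/N}

# Fourier inversion: f(x) = (1/N) Σ_ξ \hat f(ξ) e^{2πi ξx/N}
f_rec = chars.conj().T @ fhat / N
print(np.max(np.abs(f_rec - f)))  # ≈ 0

# Poisson summation for the subgroup H = 3Z/NZ, whose annihilator is
# H^⊥ = 4Z/NZ:  Σ_{h∈H} f(h) = (1/|H^⊥|) Σ_{ξ∈H^⊥} \hat f(ξ).
H = np.arange(0, N, 3)
Hperp = np.arange(0, N, 4)
lhs = f[H].sum()
rhs = fhat[Hperp].sum() / len(Hperp)
print(abs(lhs - rhs))  # ≈ 0
```

Note how summing ${\hat f}$ over the annihilator ${H^\perp}$ recovers the sum of ${f}$ over ${H}$, which is the subgroup-versus-annihilator duality of item 10 in miniature.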

I have discussed a fair number of these examples in previous blog posts (indeed, most of the links above are to my own blog). In this post, I would like to discuss the uncertainty principle, that describes the dual relationship between physical space and frequency space. There are various concrete formalisations of this principle, most famously the Heisenberg uncertainty principle and the Hardy uncertainty principle – but in many situations, it is the heuristic formulation of the principle that is more useful and insightful than any particular rigorous theorem that attempts to capture that principle. Unfortunately, it is a bit tricky to formulate this heuristic in a succinct way that covers all the various applications of that principle; the Heisenberg inequality ${\Delta x \cdot \Delta \xi \gtrsim 1}$ is a good start, but it only captures a portion of what the principle tells us. Consider for instance the following (deliberately vague) statements, each of which can be viewed (heuristically, at least) as a manifestation of the uncertainty principle:

1. A function which is band-limited (restricted to low frequencies) is featureless and smooth at fine scales, but can be oscillatory (i.e. containing plenty of cancellation) at coarse scales. Conversely, a function which is smooth at fine scales will be almost entirely restricted to low frequencies.
2. A function which is restricted to high frequencies is oscillatory at fine scales, but is negligible at coarse scales. Conversely, a function which is oscillatory at fine scales will be almost entirely restricted to high frequencies.
3. Projecting a function to low frequencies corresponds to averaging out (or spreading out) that function at fine scales, leaving only the coarse scale behaviour.
4. Projecting a function to high frequencies corresponds to removing the averaged coarse scale behaviour, leaving only the fine scale oscillation.
5. The number of degrees of freedom of a function is bounded by the product of its spatial uncertainty and its frequency uncertainty (or more generally, by the volume of the phase space uncertainty). In particular, there are not enough degrees of freedom for a non-trivial function to be simultaneously localised to both very fine scales and very low frequencies.
6. To control the coarse scale (or global) averaged behaviour of a function, one essentially only needs to know the low frequency components of the function (and vice versa).
7. To control the fine scale (or local) oscillation of a function, one only needs to know the high frequency components of the function (and vice versa).
8. Localising a function to a region of physical space will cause its Fourier transform (or inverse Fourier transform) to resemble a plane wave on every dual region of frequency space.
9. Averaging a function along certain spatial directions or at certain scales will cause the Fourier transform to become localised to the dual directions and scales. The smoother the averaging, the sharper the localisation.
10. The smoother a function is, the more rapidly decreasing its Fourier transform (or inverse Fourier transform) is (and vice versa).
11. If a function is smooth or almost constant in certain directions or at certain scales, then its Fourier transform (or inverse Fourier transform) will decay away from the dual directions or beyond the dual scales.
12. If a function has a singularity spanning certain directions or certain scales, then its Fourier transform (or inverse Fourier transform) will decay slowly along the dual directions or within the dual scales.
13. Localisation operations in position approximately commute with localisation operations in frequency so long as the product of the spatial uncertainty and the frequency uncertainty is significantly larger than one.
14. In the high frequency (or large scale) limit, position and frequency asymptotically behave like a pair of classical observables, and partial differential equations asymptotically behave like classical ordinary differential equations. At lower frequencies (or finer scales), the former becomes a “quantum mechanical perturbation” of the latter, with the strength of the quantum effects increasing as one moves to increasingly lower frequencies and finer spatial scales.
15. Etc., etc.
16. Almost all of the above statements generalise to locally compact abelian groups other than ${{\bf R}}$ or ${{\bf R}^n}$, in which the concept of a direction or scale is replaced by that of a subgroup or an approximate subgroup. (In particular, as we will see below, the Poisson summation formula can be viewed as another manifestation of the uncertainty principle.)

I think of all of the above (closely related) assertions as being instances of “the uncertainty principle”, but it seems difficult to combine them all into a single unified assertion, even at the heuristic level; they seem to be better arranged as a cloud of tightly interconnected assertions, each of which is reinforced by several of the others. The famous inequality ${\Delta x \cdot \Delta \xi \gtrsim 1}$ is at the centre of this cloud, but is by no means the only aspect of it.
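As a sanity check on this central inequality, one can compute the spreads ${\Delta x}$ and ${\Delta \xi}$ numerically. The sketch below is my own; it uses the convention ${\hat f(\xi) = \int f(x) e^{-2\pi i x\xi}\,dx}$, under which the sharp bound is ${\Delta x \cdot \Delta \xi \geq \frac{1}{4\pi}}$, with equality exactly for Gaussians, and the grids and test functions are arbitrary choices of mine:

```python
import numpy as np

# Numerical check of Δx · Δξ >= 1/(4π) for the convention
# \hat f(ξ) = ∫ f(x) e^{-2πi xξ} dx.  Gaussians attain equality.
L, M = 15.0, 1001
x = np.linspace(-L, L, M)
dx = x[1] - x[0]
xi = np.linspace(-10.0, 10.0, 401)
dxi = xi[1] - xi[0]

def spread(t, dt, g):
    """Standard deviation of the probability density |g|^2 on the grid t."""
    w = np.abs(g) ** 2
    w = w / (w.sum() * dt)
    mean = np.sum(t * w) * dt
    return np.sqrt(np.sum((t - mean) ** 2 * w) * dt)

def product(f):
    fhat = (np.exp(-2j * np.pi * np.outer(xi, x)) @ f) * dx  # quadrature FT
    return spread(x, dx, f) * spread(xi, dxi, fhat)

for a in (0.5, 1.0, 2.0):
    print(a, product(np.exp(-np.pi * (x / a) ** 2)))  # ≈ 1/(4π) for Gaussians

print(product(1 / (1 + x ** 2)))  # strictly larger than 1/(4π)
```

Rescaling the Gaussian trades spatial spread for frequency spread while keeping the product pinned at ${\frac{1}{4\pi} \approx 0.0796}$, whereas the Cauchy-type profile sits strictly above the bound.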

The uncertainty principle (as interpreted in the above broad sense) is one of the most fundamental principles in harmonic analysis (and more specifically, in the subfield of time-frequency analysis), second only to the Fourier inversion formula (and more generally, Plancherel’s theorem) in importance; understanding this principle is a key piece of intuition in the subject that one has to internalise before one can really get to grips with this subject (and also with closely related subjects, such as semi-classical analysis and microlocal analysis). Like many fundamental results in mathematics, the principle is not actually that difficult to understand, once one sees how it works; and when one needs to use it rigorously, it is usually not too difficult to improvise a suitable formalisation of the principle for the occasion. But, given how vague this principle is, it is difficult to present this principle in a traditional “theorem-proof-remark” manner. Even in the more informal format of a blog post, I was surprised by how challenging it was to describe my own understanding of this piece of mathematics in a linear fashion, despite (or perhaps because of) it being one of the most central and basic conceptual tools in my own personal mathematical toolbox. In the end, I chose to give below a cloud of interrelated discussions about this principle rather than a linear development of the theory, as this seemed to more closely align with the nature of this principle.

Camil Muscalu, Christoph Thiele and I have just uploaded to the arXiv our joint paper, “Multi-linear multipliers associated to simplexes of arbitrary length“, submitted to Analysis & PDE. This paper grew out of our project from many years ago to attempt to prove the nonlinear (or “scattering”) version of Carleson’s theorem on the almost everywhere convergence of Fourier series. This version is still open; our original approach was to handle the nonlinear Carleson operator by multilinear expansions in terms of the potential function V, but while the first three terms of this expansion were well behaved, the fourth term was unfortunately divergent, due to the unhelpful location of a certain minus sign. [This survey by Michael Lacey, as well as this paper of ours, covers some of these topics.]

However, what we did find out in this paper was that if we modified the nonlinear Carleson operator slightly, by replacing the underlying Schrödinger equation by a more general AKNS system, then for “generic” choices of this system, the problem of the ill-placed minus sign goes away, and each term in the multilinear series is, in fact, convergent (we have not yet verified that the series as a whole converges, but in view of the earlier work of Christ and Kiselev on this topic, this seems likely). The verification of this convergence (at least with regard to the scattering data, rather than the more difficult analysis of the eigenfunctions) is the main result of our current paper. It builds upon our earlier estimates of the bilinear term in the expansion (which we dubbed the “biest”, as a multilingual pun). The main new idea in our earlier paper was to decompose the relevant region of frequency space $\{ (\xi_1,\xi_2,\xi_3) \in {\Bbb R}^3: \xi_1 < \xi_2 < \xi_3 \}$ into more tractable regions, a typical one being the region in which $\xi_2$ was much closer to $\xi_1$ than to $\xi_3$. The contribution of each region can then be “parafactored” into a “paracomposition” of simpler operators, such as the bilinear Hilbert transform, which can be treated by standard time-frequency analysis methods. (Much as a paraproduct is a frequency-restricted version of a product, the paracompositions that arise here are frequency-restricted versions of composition.)

A similar analysis happens to work for the multilinear operators associated to the frequency region $S := \{ (\xi_1,\ldots,\xi_n): \xi_1 < \ldots < \xi_n \}$, but the combinatorics are more complicated; each of the component frequency regions has to be indexed by a tree (in a manner reminiscent of the well-separated pairs decomposition), and a certain key “weak Bessel inequality” becomes considerably more delicate. Our ultimate conclusion is that the multilinear operator

$T(V_1,\ldots,V_n) := \int_{(\xi_1,\ldots,\xi_n) \in S} \hat V_1(\xi_1) \ldots \hat V_n(\xi_n) e^{2i (\xi_1+\ldots+\xi_n) x}\ d\xi_1 \ldots d\xi_n$ (1)

(which generalises the bilinear Hilbert transform and the biest) obeys Hölder-type $L^p$ estimates (note that Hölder’s inequality corresponds to the situation in which the (projective) simplex S is replaced by the entire frequency space ${\Bbb R}^n$).

For the remainder of this post, I thought I would describe the “nonlinear Carleson theorem” conjecture, which is still one of my favourite open problems, being an excellent benchmark for measuring progress in the (still nascent) field of “nonlinear Fourier analysis“, while also being of interest in its own right in scattering and spectral theory.
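To give a flavour of what “nonlinear Fourier analysis” means concretely: in the simplest discrete models, the role of the exponential sum $\sum_n F_n z^n$ is played by an ordered product of transfer matrices, whose entries $(a(z), b(z))$ obey a nonlinear analogue of the Plancherel identity on the unit circle. The sketch below is my own toy implementation of one common $SU(1,1)$-type model (conventions and normalisations vary across the literature, so treat the precise matrices as an assumption of mine); it checks the identity $|a|^2 - |b|^2 = 1$ numerically:

```python
import numpy as np

# A toy SU(1,1)-type discrete nonlinear Fourier transform: for a finitely
# supported sequence F with |F_n| < 1, form the ordered product of
# (1-|F_n|^2)^{-1/2} [[1, F_n z^n], [conj(F_n) z^{-n}, 1]].
# On |z| = 1 each factor lies in SU(1,1), so the product has the form
# [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1.

def nlft(F, z):
    M = np.eye(2, dtype=complex)
    for n, Fn in enumerate(F):
        T = np.array([[1, Fn * z ** n], [np.conj(Fn) * z ** (-n), 1]],
                     dtype=complex)
        M = M @ T / np.sqrt(1 - abs(Fn) ** 2)
    return M[0, 0], M[0, 1]  # (a(z), b(z))

rng = np.random.default_rng(1)
F = 0.5 * rng.uniform(-1, 1, 8)  # a random "potential" with |F_n| < 1/2
for theta in (0.0, 1.0, 2.5):
    a, b = nlft(F, np.exp(1j * theta))
    print(abs(a) ** 2 - abs(b) ** 2)  # ≈ 1 for every z on the unit circle
```

The multiplicative (hence non-commutative) structure of this product, replacing the additive structure of the linear Fourier transform, is the source of the difficulties alluded to above.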

Ciprian Demeter, Michael Lacey, Christoph Thiele and I have just uploaded our joint paper, “The Walsh model for $M_2^*$ Carleson” to the arXiv. This paper (which was recently accepted for publication in Revista Iberoamericana) establishes a simplified model for the key estimate (the “$M_2^*$ Carleson estimate”) in another (much longer) paper of ours on the return times theorem of Bourgain, in which the Fourier transform is replaced by its dyadic analogue, the Walsh-Fourier transform. This model estimate is established by the now-standard techniques of time-frequency analysis: one decomposes the expression to be estimated into a sum over tiles, and then uses combinatorial stopping time arguments to group the tiles into trees, and the trees into forests. One then uses (phase-space localised, and frequency-modulated) versions of classical Calderón-Zygmund theory (or in this particular case, a certain maximal Fourier inequality of Bourgain) to control individual trees and forests, and sums up over the trees and forests using orthogonality methods (excluding an exceptional set if necessary).

Rather than discuss time-frequency analysis in detail here, I thought I would dwell instead on the return times theorem, and sketch how it is connected to the $M_2^*$ Carleson estimate; this is a more complicated version of the “$M_2$ Carleson estimate”, which is an estimate which is logically equivalent to Carleson’s famous theorem (and its extension by Hunt) on the almost everywhere convergence of Fourier series.

This is a well-known problem in multilinear harmonic analysis; it is fascinating to me because it lies barely beyond the reach of the best technology we have for these problems (namely, multiscale time-frequency analysis), and because the most recent developments in quadratic Fourier analysis seem likely to shed some light on this problem.

Recall that the Hilbert transform is defined on test functions $f \in {\mathcal S}({\Bbb R})$ (up to irrelevant constants) as

$Hf(x) := p.v. \int_{\Bbb R} f(x+t) \frac{dt}{t},$

where the integral is evaluated in the principal value sense (removing the region $|t| < \epsilon$ to ensure integrability, and then taking the limit as $\epsilon \to 0$).
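With this convention, $H$ acts on the Fourier side as multiplication by $i\pi\,\mathrm{sgn}(\xi)$ (using $\hat f(\xi) = \int f(x) e^{-2\pi i x\xi}\,dx$). The sketch below is mine; it approximates the line by a long periodic interval (so the grid lengths are arbitrary choices, and a small periodization error is expected) and compares the multiplier implementation against a direct principal value quadrature of the defining integral:

```python
import numpy as np

# With Hf(x) := p.v. ∫ f(x+t) dt/t, the Fourier multiplier of H is
# iπ sgn(ξ), for the convention \hat f(ξ) = ∫ f(x) e^{-2πi xξ} dx.
L, M = 50.0, 8192
x = np.linspace(-L, L, M, endpoint=False)
dx = x[1] - x[0]
f = np.exp(-x ** 2)  # a convenient Schwartz function

xi = np.fft.fftfreq(M, d=dx)  # frequencies ξ in cycles per unit length
Hf = np.fft.ifft(1j * np.pi * np.sign(xi) * np.fft.fft(f)).real

def H_pv(x0, h=1e-3, T=30.0):
    """Principal value integral on a symmetric midpoint grid, which pairs
    t with -t and so implements the ε → 0 cancellation directly."""
    t = (np.arange(int(T / h)) + 0.5) * h
    g = (np.exp(-(x0 + t) ** 2) - np.exp(-(x0 - t) ** 2)) / t
    return np.sum(g) * h

for i in (M // 2, M // 2 + 100, M // 2 - 300):
    print(x[i], Hf[i], H_pv(x[i]))  # the two computations agree
```

The agreement of the two computations is a numerical shadow of the multiplier identity; in the time-frequency proofs discussed above it is this frequency-side description of $H$ (a sharp cutoff to half the frequencies) that plays the central role.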