You are currently browsing the tag archive for the ‘Fourier transform’ tag.
Roth’s theorem on arithmetic progressions asserts that every subset of the integers of positive upper density contains infinitely many arithmetic progressions of length three. There are many versions and variants of this theorem. Here is one of them:
Theorem 1 (Roth’s theorem) Let be a compact abelian group, with Haar probability measure , which is -divisible (i.e. the map is surjective) and let be a measurable subset of with for some . Then we have
where denotes the bound for some depending only on .
This theorem is usually formulated in the case that is a finite abelian group of odd order (in which case the result is essentially due to Meshulam) or more specifically a cyclic group of odd order (in which case it is essentially due to Varnavides), but is also valid for the more general setting of -divisible compact abelian groups, as we shall shortly see. One can be more precise about the dependence of the implied constant on , but to keep the exposition simple we will work at the qualitative level here, without trying at all to get good quantitative bounds. The theorem is also true without the -divisibility hypothesis, but the proof we will discuss runs into some technical issues due to the degeneracy of the shift in that case.
We can deduce Theorem 1 from the following more general Khintchine-type statement. Let denote the Pontryagin dual of a compact abelian group , that is to say the set of all continuous homomorphisms from to the (additive) unit circle . Thus is a discrete abelian group, and functions have a Fourier transform defined by
If is -divisible, then is -torsion-free in the sense that the map is injective. For any finite set and any radius , define the Bohr set
where denotes the distance of to the nearest integer. We refer to the cardinality of as the rank of the Bohr set. We record a simple volume bound on Bohr sets:
Lemma 2 (Volume packing bound) Let be a compact abelian group with Haar probability measure . For any Bohr set , we have
Proof: We can cover the torus by translates of the cube . Then the sets form an cover of . But all of these sets lie in a translate of , and the claim then follows from the translation invariance of .
Given any Bohr set , we define a normalised “Lipschitz” cutoff function by the formula
where is the constant such that
thus
The function should be viewed as an -normalised “tent function” cutoff to . Note from Lemma 2 that
We then have the following sharper version of Theorem 1:
Theorem 3 (Roth-Khintchine theorem) Let be a -divisible compact abelian group, with Haar probability measure , and let . Then for any measurable function , there exists a Bohr set with and such that
where denotes the convolution operation
A variant of this result (expressed in the language of ergodic theory) appears in this paper of Bergelson, Host, and Kra; a combinatorial version of the Bergelson-Host-Kra result that is closer to Theorem 3 subsequently appeared in this paper of Ben Green and myself, but this theorem arguably appears implicitly in a much older paper of Bourgain. To see why Theorem 3 implies Theorem 1, we apply the theorem with and equal to a small multiple of to conclude that there is a Bohr set with and such that
But from (2) we have the pointwise bound , and Theorem 1 follows.
Below the fold, we give a short proof of Theorem 3, using an “energy pigeonholing” argument that essentially dates back to the 1986 paper of Bourgain mentioned previously (not to be confused with a later 1999 paper of Bourgain on Roth’s theorem that was highly influential, for instance in emphasising the importance of Bohr sets). The idea is to use the pigeonhole principle to choose the Bohr set to capture all the “large Fourier coefficients” of , but such that a certain “dilate” of does not capture much more Fourier energy of than itself. The bound (3) may then be obtained through elementary Fourier analysis, without much need to explicitly compute things like the Fourier transform of an indicator function of a Bohr set. (However, the bound obtained by this argument is going to be quite poor – of tower-exponential type.) To do this we perform a structural decomposition of into “structured”, “small”, and “highly pseudorandom” components, as is common in the subject (e.g. in this previous blog post), but even though we crucially need to retain non-negativity of one of the components in this decomposition, we can avoid recourse to conditional expectation with respect to a partition (or “factor”) of the space, using instead convolution with one of the considered above to achieve a similar effect.
The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group with the property that the centraliser of any non-identity element is abelian; equivalently, the commuting relation (defined as the relation that holds when commutes with , thus ) is an equivalence relation on the non-identity elements of . Trivially, every abelian group is CA. A non-abelian example of a CA-group is the group of invertible affine transformations on a field . A little less obviously, the special linear group over a finite field is a CA-group when is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in , for instance, when is bounded and is large), is typically a maximal torus (because most elements in are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have
Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superceded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
Moving even further down the ladder of simple precursors of CSFG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group which has a subgroup (called the Frobenius complement) with the property that all the non-trivial conjugates of for , intersect only at the origin. For instance the group is also a Frobenius group (take to be the affine transformations that fix a specified point , e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if is a CA-group and is a maximal abelian subgroup of , then any conjugate of that is not identical to will intersect only at the origin (because and each of its conjugates consist of equivalence classes under the commuting relation , together with the identity). So if a maximal abelian subgroup of a CA-group is its own normaliser (thus is equal to ), then the group is a Frobenius group.
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let be a Frobenius group with Frobenius complement . Then there exists a normal subgroup of (called the Frobenius kernel of ) such that is the semi-direct product of and .
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of , since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let be a group of permutations acting transitively on a finite set , with the property that any non-identity permutation in fixes at most one point in . Then the set of permutations in that fix no points in , together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when is the group of affine permutations on a field (i.e. the group for that field), and is the set of points on that field. In that case, the set of permutations in that do not fix any points are the non-trivial translations.
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in . Conversely, to deduce Theorem 2 from Theorem 3, set to be the space of left-cosets of , with the obvious left -action; one easily verifies that this action is faithful, transitive, and each non-identity element of fixes at most one left-coset of (basically because it lies in at most one conjugate of ). If we let be the elements of that do not fix any point in , plus the identity, then by Theorem 3 is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of . From construction is the identity plus the complement of all the conjugates of , which are all disjoint except at the identity, so by counting elements we see that
As normalises and is disjoint from , we thus see that is all of , giving Theorem 2.
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
Let be Hermitian matrices, with eigenvalues and . The Harish-Chandra-Itzykson-Zuber integral formula exactly computes the integral
where is integrated over the Haar probability measure of the unitary group and is a non-zero complex parameter, as the expression
when the eigenvalues of are simple, where denotes the Vandermonde determinant
and is the constant
There are at least two standard ways to prove this formula in the literature. One way is by applying the Duistermaat-Heckman theorem to the pushforward of Liouville measure on the coadjoint orbit (or more precisely, a rotation of such an orbit by ) under the moment map , and then using a stationary phase expansion. Another way, which I only learned about recently, is to use the formulae for evolution of eigenvalues under Dyson Brownian motion (as well as the closely related formulae for the GUE ensemble), which were derived in this previous blog post. Both of these approaches can be found in several places in the literature (the former being observed in the original paper of Duistermaat and Heckman, and the latter observed in the paper of Itzykson and Zuber as well as in this later paper of Johansson), but I thought I would record both of these here for my own benefit.
The Harish-Chandra-Itzykson-Zuber formula can be extended to other compact Lie groups than . At first glance, this might suggest that these formulae could be of use in the study of the GOE ensemble, but unfortunately the Lie algebra associated to corresponds to real anti-symmetric matrices rather than real symmetric matrices. This also occurs in the case, but there one can simply multiply by to rotate a complex skew-Hermitian matrix into a complex Hermitian matrix. This is consistent, though, with the fact that the (somewhat rarely studied) anti-symmetric GOE ensemble has cleaner formulae (in particular, having a determinantal structure similar to GUE) than the (much more commonly studied) symmetric GOE ensemble.
Let be a compact group. (Throughout this post, all topological groups are assumed to be Hausdorff.) Then has a number of unitary representations, i.e. continuous homomorphisms to the group of unitary operators on a Hilbert space , equipped with the strong operator topology. In particular, one has the left-regular representation , where we equip with its normalised Haar measure (and the Borel -algebra) to form the Hilbert space , and is the translation operation
We call two unitary representations and isomorphic if one has for some unitary transformation , in which case we write .
Given two unitary representations and , one can form their direct sum in the obvious manner: . Conversely, if a unitary representation has a closed invariant subspace of (thus for all ), then the orthogonal complement is also invariant, leading to a decomposition of into the subrepresentations , . Accordingly, we will call a unitary representation irreducible if is nontrivial (i.e. ) and there are no nontrivial invariant subspaces (i.e. no invariant subspaces other than and ); the irreducible representations play a role in the subject analogous to those of prime numbers in multiplicative number theory. By the principle of infinite descent, every finite-dimensional unitary representation is then expressible (perhaps non-uniquely) as the direct sum of irreducible representations.
The Peter-Weyl theorem asserts, among other things, that the same claim is true for the regular representation:
Theorem 1 (Peter-Weyl theorem) Let be a compact group. Then the regular representation is isomorphic to the direct sum of irreducible representations. In fact, one has , where is an enumeration of the irreducible finite-dimensional unitary representations of (up to isomorphism). (It is not difficult to see that such an enumeration exists.)
In the case when is abelian, the Peter-Weyl theorem is a consequence of the Plancherel theorem; in that case, the irreducible representations are all one dimensional, and are thus indexed by the space of characters (i.e. continuous homomorphisms into the unit circle ), known as the Pontryagin dual of . (See for instance my lecture notes on the Fourier transform.) Conversely, the Peter-Weyl theorem can be used to deduce the Plancherel theorem for compact groups, as well as other basic results in Fourier analysis on these groups, such as the Fourier inversion formula.
Because the regular representation is faithful (i.e. injective), a corollary of the Peter-Weyl theorem (and a classical theorem of Cartan) is that every compact group can be expressed as the inverse limit of Lie groups, leading to a solution to Hilbert’s fifth problem in the compact case. Furthermore, the compact case is then an important building block in the more general theory surrounding Hilbert’s fifth problem, and in particular a result of Yamabe that any locally compact group contains an open subgroup that is the inverse limit of Lie groups.
I’ve recently become interested in the theory around Hilbert’s fifth problem, due to the existence of a correspondence principle between locally compact groups and approximate groups, which play a fundamental role in arithmetic combinatorics. I hope to elaborate upon this correspondence in a subsequent post, but I will mention that versions of this principle play a crucial role in Gromov’s proof of his theorem on groups of polynomial growth (discussed previously on this blog), and in a more recent paper of Hrushovski on approximate groups (also discussed previously). It is also analogous in many ways to the more well-known Furstenberg correspondence principle between ergodic theory and combinatorics (also discussed previously).
Because of the above motivation, I have decided to write some notes on how the Peter-Weyl theorem is proven. This is utterly standard stuff in abstract harmonic analysis; these notes are primarily for my own benefit, but perhaps they may be of interest to some readers also.
A recurring theme in mathematics is that of duality: a mathematical object can either be described internally (or in physical space, or locally), by describing what physically consists of (or what kind of maps exist into ), or externally (or in frequency space, or globally), by describing what globally interacts or resonates with (or what kind of maps exist out of ). These two fundamentally opposed perspectives on the object are often dual to each other in various ways: performing an operation on may transform it one way in physical space, but in a dual way in frequency space, with the frequency space description often being a “inversion” of the physical space description. In several important cases, one is fortunate enough to have some sort of fundamental theorem connecting the internal and external perspectives. Here are some (closely inter-related) examples of this perspective:
- Vector space duality A vector space over a field can be described either by the set of vectors inside , or dually by the set of linear functionals from to the field (or equivalently, the set of vectors inside the dual space ). (If one is working in the category of topological vector spaces, one would work instead with continuous linear functionals; and so forth.) A fundamental connection between the two is given by the Hahn-Banach theorem (and its relatives).
- Vector subspace duality In a similar spirit, a subspace of can be described either by listing a basis or a spanning set, or dually by a list of linear functionals that cut out that subspace (i.e. a spanning set for the orthogonal complement . Again, the Hahn-Banach theorem provides a fundamental connection between the two perspectives.
- Convex duality More generally, a (closed, bounded) convex body in a vector space can be described either by listing a set of (extreme) points whose convex hull is , or else by listing a set of (irreducible) linear inequalities that cut out . The fundamental connection between the two is given by the Farkas lemma.
- Ideal-variety duality In a slightly different direction, an algebraic variety in an affine space can be viewed either “in physical space” or “internally” as a collection of points in , or else “in frequency space” or “externally” as a collection of polynomials on whose simultaneous zero locus cuts out . The fundamental connection between the two perspectives is given by the nullstellensatz, which then leads to many of the basic fundamental theorems in classical algebraic geometry.
- Hilbert space duality An element in a Hilbert space can either be thought of in physical space as a vector in that space, or in momentum space as a covector on that space. The fundamental connection between the two is given by the Riesz representation theorem for Hilbert spaces.
- Semantic-syntactic duality Much more generally still, a mathematical theory can either be described internally or syntactically via its axioms and theorems, or externally or semantically via its models. The fundamental connection between the two perspectives is given by the Gödel completeness theorem.
- Intrinsic-extrinsic duality A (Riemannian) manifold can either be viewed intrinsically (using only concepts that do not require an ambient space, such as the Levi-Civita connection), or extrinsically, for instance as the level set of some defining function in an ambient space. Some important connections between the two perspectives includes the Nash embedding theorem and the theorema egregium.
- Group duality A group can be described either via presentations (lists of generators, together with relations between them) or representations (realisations of that group in some more concrete group of transformations). A fundamental connection between the two is Cayley’s theorem. Unfortunately, in general it is difficult to build upon this connection (except in special cases, such as the abelian case), and one cannot always pass effortlessly from one perspective to the other.
- Pontryagin group duality A (locally compact Hausdorff) abelian group can be described either by listing its elements , or by listing the characters (i.e. continuous homomorphisms from to the unit circle, or equivalently elements of ). The connection between the two is the focus of abstract harmonic analysis.
- Pontryagin subgroup duality A subgroup of a locally compact abelian group can be described either by generators in , or generators in the orthogonal complement . One of the fundamental connections between the two is the Poisson summation formula.
- Fourier duality A (sufficiently nice) function on a locally compact abelian group (equipped with a Haar measure ) can either be described in physical space (by its values at each element of ) or in frequency space (by the values at elements of the Pontryagin dual ). The fundamental connection between the two is the Fourier inversion formula.
- The uncertainty principle The behaviour of a function at physical scales above (resp. below) a certain scale is almost completely controlled by the behaviour of its Fourier transform at frequency scales below (resp. above) the dual scale and vice versa, thanks to various mathematical manifestations of the uncertainty principle. (The Poisson summation formula can also be viewed as a variant of this principle, using subgroups instead of scales.)
- Stone/Gelfand duality A (locally compact Hausdorff) topological space can be viewed in physical space (as a collection of points), or dually, via the algebra of continuous complex-valued functions on that space, or (in the case when is compact and totally disconnected) via the boolean algebra of clopen sets (or equivalently, the idempotents of ). The fundamental connection between the two is given by the Stone representation theorem or the (commutative) Gelfand-Naimark theorem.
I have discussed a fair number of these examples in previous blog posts (indeed, most of the links above are to my own blog). In this post, I would like to discuss the uncertainty principle, that describes the dual relationship between physical space and frequency space. There are various concrete formalisations of this principle, most famously the Heisenberg uncertainty principle and the Hardy uncertainty principle – but in many situations, it is the heuristic formulation of the principle that is more useful and insightful than any particular rigorous theorem that attempts to capture that principle. Unfortunately, it is a bit tricky to formulate this heuristic in a succinct way that covers all the various applications of that principle; the Heisenberg inequality is a good start, but it only captures a portion of what the principle tells us. Consider for instance the following (deliberately vague) statements, each of which can be viewed (heuristically, at least) as a manifestation of the uncertainty principle:
- A function which is band-limited (restricted to low frequencies) is featureless and smooth at fine scales, but can be oscillatory (i.e. containing plenty of cancellation) at coarse scales. Conversely, a function which is smooth at fine scales will be almost entirely restricted to low frequencies.
- A function which is restricted to high frequencies is oscillatory at fine scales, but is negligible at coarse scales. Conversely, a function which is oscillatory at fine scales will be almost entirely restricted to high frequencies.
- Projecting a function to low frequencies corresponds to averaging out (or spreading out) that function at fine scales, leaving only the coarse scale behaviour.
- Projecting a frequency to high frequencies corresponds to removing the averaged coarse scale behaviour, leaving only the fine scale oscillation.
- The number of degrees of freedom of a function is bounded by the product of its spatial uncertainty and its frequency uncertainty (or more generally, by the volume of the phase space uncertainty). In particular, there are not enough degrees of freedom for a non-trivial function to be simulatenously localised to both very fine scales and very low frequencies.
- To control the coarse scale (or global) averaged behaviour of a function, one essentially only needs to know the low frequency components of the function (and vice versa).
- To control the fine scale (or local) oscillation of a function, one only needs to know the high frequency components of the function (and vice versa).
- Localising a function to a region of physical space will cause its Fourier transform (or inverse Fourier transform) to resemble a plane wave on every dual region of frequency space.
- Averaging a function along certain spatial directions or at certain scales will cause the Fourier transform to become localised to the dual directions and scales. The smoother the averaging, the sharper the localisation.
- The smoother a function is, the more rapidly decreasing its Fourier transform (or inverse Fourier transform) is (and vice versa).
- If a function is smooth or almost constant in certain directions or at certain scales, then its Fourier transform (or inverse Fourier transform) will decay away from the dual directions or beyond the dual scales.
- If a function has a singularity spanning certain directions or certain scales, then its Fourier transform (or inverse Fourier transform) will decay slowly along the dual directions or within the dual scales.
- Localisation operations in position approximately commute with localisation operations in frequency so long as the product of the spatial uncertainty and the frequency uncertainty is significantly larger than one.
- In the high frequency (or large scale) limit, position and frequency asymptotically behave like a pair of classical observables, and partial differential equations asymptotically behave like classical ordinary differential equations. At lower frequencies (or finer scales), the former becomes a “quantum mechanical perturbation” of the latter, with the strength of the quantum effects increasing as one moves to increasingly lower frequencies and finer spatial scales.
- Etc., etc.
- Almost all of the above statements generalise to other locally compact abelian groups than or , in which the concept of a direction or scale is replaced by that of a subgroup or an approximate subgroup. (In particular, as we will see below, the Poisson summation formula can be viewed as another manifestation of the uncertainty principle.)
I think of all of the above (closely related) assertions as being instances of “the uncertainty principle”, but it seems difficult to combine them all into a single unified assertion, even at the heuristic level; they seem to be better arranged as a cloud of tightly interconnected assertions, each of which is reinforced by several of the others. The famous inequality is at the centre of this cloud, but is by no means the only aspect of it.
The uncertainty principle (as interpreted in the above broad sense) is one of the most fundamental principles in harmonic analysis (and more specifically, to the subfield of time-frequency analysis), second only to the Fourier inversion formula (and more generally, Plancherel’s theorem) in importance; understanding this principle is a key piece of intuition in the subject that one has to internalise before one can really get to grips with this subject (and also with closely related subjects, such as semi-classical analysis and microlocal analysis). Like many fundamental results in mathematics, the principle is not actually that difficult to understand, once one sees how it works; and when one needs to use it rigorously, it is usually not too difficult to improvise a suitable formalisation of the principle for the occasion. But, given how vague this principle is, it is difficult to present this principle in a traditional “theorem-proof-remark” manner. Even in the more informal format of a blog post, I was surprised by how challenging it was to describe my own understanding of this piece of mathematics in a linear fashion, despite (or perhaps because of) it being one of the most central and basic conceptual tools in my own personal mathematical toolbox. In the end, I chose to give below a cloud of interrelated discussions about this principle rather than a linear development of the theory, as this seemed to more closely align with the nature of this principle.
In these notes we lay out the basic theory of the Fourier transform, which is of course the most fundamental tool in harmonic analysis and also of major importance in related fields (functional analysis, complex analysis, PDE, number theory, additive combinatorics, representation theory, signal processing, etc.). The Fourier transform, in conjunction with the Fourier inversion formula, allows one to take essentially arbitrary (complex-valued) functions on a group (or more generally, a space that acts on, e.g. a homogeneous space ), and decompose them as a (discrete or continuous) superposition of much more symmetric functions on the domain, such as characters ; the precise superposition is given by Fourier coefficients , which take values in some dual object such as the Pontryagin dual of . Characters behave in a very simple manner with respect to translation (indeed, they are eigenfunctions of the translation action), and so the Fourier transform tends to simplify any mathematical problem which enjoys a translation invariance symmetry (or an approximation to such a symmetry), and is somehow “linear” (i.e. it interacts nicely with superpositions). In particular, Fourier analytic methods are particularly useful for studying operations such as convolution and set-theoretic addition , or the closely related problem of counting solutions to additive problems such as or , where are constrained to lie in specific sets . The Fourier transform is also a particularly powerful tool for solving constant-coefficient linear ODE and PDE (because of the translation invariance), and can also approximately solve some variable-coefficient (or slightly non-linear) equations if the coefficients vary smoothly enough and the nonlinear terms are sufficiently tame.
The Fourier transform also provides an important new way of looking at a function , as it highlights the distribution of in frequency space (the domain of the frequency variable ) rather than physical space (the domain of the physical variable ). A given property of in the physical domain may be transformed to a rather different-looking property of in the frequency domain. For instance:
- Smoothness of in the physical domain corresponds to decay of in the Fourier domain, and conversely. (More generally, fine scale properties of tend to manifest themselves as coarse scale properties of , and conversely.)
- Convolution in the physical domain corresponds to pointwise multiplication in the Fourier domain, and conversely.
- Constant coefficient differential operators such as in the physical domain corresponds to multiplication by polynomials such as in the Fourier domain, and conversely.
- More generally, translation invariant operators in the physical domain correspond to multiplication by symbols in the Fourier domain, and conversely.
- Rescaling in the physical domain by an invertible linear transformation corresponds to an inverse (adjoint) rescaling in the Fourier domain.
- Restriction to a subspace (or subgroup) in the physical domain corresponds to projection to the dual quotient space (or quotient group) in the Fourier domain, and conversely.
- Frequency modulation in the physical domain corresponds to translation in the frequency domain, and conversely.
(We will make these statements more precise below.)
On the other hand, some operations in the physical domain remain essentially unchanged in the Fourier domain. Most importantly, the norm (or energy) of a function is the same as that of its Fourier transform, and more generally the inner product of two functions is the same as that of their Fourier transforms. Indeed, the Fourier transform is a unitary operator on (a fact which is variously known as the Plancherel theorem or the Parseval identity). This makes it easier to pass back and forth between the physical domain and frequency domain, so that one can combine techniques that are easy to execute in the physical domain with other techniques that are easy to execute in the frequency domain. (In fact, one can combine the physical and frequency domains together into a product domain known as phase space, and there are entire fields of mathematics (e.g. microlocal analysis, geometric quantisation, time-frequency analysis) devoted to performing analysis on these sorts of spaces directly, but this is beyond the scope of this course.)
In these notes, we briefly discuss the general theory of the Fourier transform, but will mainly focus on the two classical domains for Fourier analysis: the torus , and the Euclidean space . For these domains one has the advantage of being able to perform very explicit algebraic calculations, involving concrete functions such as plane waves or Gaussians .
[This post was typeset using a LaTeX to WordPress-HTML converter kindly provided to me by Luca Trevisan.]
Many properties of a (sufficiently nice) function are reflected in its Fourier transform , defined by the formula
For instance, decay properties of are reflected in smoothness properties of , as the following table shows:
If is… | then is… | and this relates to… |
Square-integrable | square-integrable | Plancherel’s theorem |
Absolutely integrable | continuous | Riemann-Lebesgue lemma |
Rapidly decreasing | smooth | theory of Schwartz functions |
Exponentially decreasing | analytic in a strip | |
Compactly supported | entire and at most exponential growth | Paley-Wiener theorem |
Another important relationship between a function and its Fourier transform is the uncertainty principle, which roughly asserts that if a function is highly localised in space, then its Fourier transform must be widely dispersed in space, or to put it another way, and cannot both decay too strongly at infinity (except of course in the degenerate case ). There are many ways to make this intuition precise. One of them is the Heisenberg uncertainty principle, which asserts that if we normalise
then we must have
thus forcing at least one of or to not be too concentrated near the origin. This principle can be proven (for sufficiently nice , initially) by observing the integration by parts identity
and then using Cauchy-Schwarz and the Plancherel identity.
Another well known manifestation of the uncertainty principle is the fact that it is not possible for and to both be compactly supported (unless of course they vanish entirely). This can be in fact be seen from the above table: if is compactly supported, then is an entire function; but the zeroes of a non-zero entire function are isolated, yielding a contradiction unless vanishes. (Indeed, the table also shows that if one of and is compactly supported, then the other cannot have exponential decay.)
On the other hand, we have the example of the Gaussian functions , , which both decay faster than exponentially. The classical Hardy uncertainty principle asserts, roughly speaking, that this is the fastest that and can simultaneously decay:
Theorem 1 (Hardy uncertainty principle) Suppose that is a (measurable) function such that and for all and some . Then is a scalar multiple of the gaussian .
This theorem is proven by complex-analytic methods, in particular the Phragmén-Lindelöf principle; for sake of completeness we give that proof below. But I was curious to see if there was a real-variable proof of the same theorem, avoiding the use of complex analysis. I was able to find the proof of a slightly weaker theorem:
Theorem 2 (Weak Hardy uncertainty principle) Suppose that is a non-zero (measurable) function such that and for all and some . Then for some absolute constant .
Note that the correct value of should be , as is implied by the true Hardy uncertainty principle. Despite the weaker statement, I thought the proof might still might be of interest as it is a little less “magical” than the complex-variable one, and so I am giving it below.
In the next few lectures, we will be studying four major classes of function spaces. In decreasing order of generality, these classes are the topological vector spaces, the normed vector spaces, the Banach spaces, and the Hilbert spaces. In order to motivate the discussion of the more general classes of spaces, we will first focus on the most special class – that of (real and complex) Hilbert spaces. These spaces can be viewed as generalisations of (real and complex) Euclidean spaces such as and to infinite-dimensional settings, and indeed much of one’s Euclidean geometry intuition concerning lengths, angles, orthogonality, subspaces, etc. will transfer readily to arbitrary Hilbert spaces; in contrast, this intuition is not always accurate in the more general vector spaces mentioned above. In addition to Euclidean spaces, another fundamental example of Hilbert spaces comes from the Lebesgue spaces of a measure space . (There are of course many other Hilbert spaces of importance in complex analysis, harmonic analysis, and PDE, such as Hardy spaces , Sobolev spaces , and the space of Hilbert-Schmidt operators, but we will not discuss those spaces much in this course. Complex Hilbert spaces also play a fundamental role in the foundations of quantum mechanics, being the natural space to hold all the possible states of a quantum system (possibly after projectivising the Hilbert space), but we will not discuss this subject here.)
Hilbert spaces are the natural abstract framework in which to study two important (and closely related) concepts: orthogonality and unitarity, allowing us to generalise familiar concepts and facts from Euclidean geometry such as the Cartesian coordinate system, rotations and reflections, and the Pythagorean theorem to Hilbert spaces. (For instance, the Fourier transform is a unitary transformation and can thus be viewed as a kind of generalised rotation.) Furthermore, the Hodge duality on Euclidean spaces has a partial analogue for Hilbert spaces, namely the Riesz representation theorem for Hilbert spaces, which makes the theory of duality and adjoints for Hilbert spaces especially simple (when compared with the more subtle theory of duality for, say, Banach spaces). Much later (next quarter, in fact), we will see that this duality allows us to extend the spectral theorem for self-adjoint matrices to that of self-adjoint operators on a Hilbert space.
These notes are only the most basic introduction to the theory of Hilbert spaces. In particular, the theory of linear transformations between two Hilbert spaces, which is perhaps the most important aspect of the subject, is not covered much at all here (but I hope to discuss it further in future lectures.)
I’m continuing my series of articles for the Princeton Companion to Mathematics by uploading my article on the Fourier transform. Here, I chose to describe this transform as a means of decomposing general functions into more symmetric functions (such as sinusoids or plane waves), and to discuss a little bit how this transform is connected to differential operators such as the Laplacian. (This is of course only one of the many different uses of the Fourier transform, but again, with only five pages to work with, it’s hard to do justice to every single application. For instance, the connections with additive combinatorics are not covered at all.)
On the official web site of the Companion (which you can access with the user name “Guest” and password “PCM”), there is a more polished version of the same article, after it had gone through a few rounds of the editing process.
I’ll also point out David Ben-Zvi‘s Companion article on “moduli spaces“. This concept is deceptively simple – a space whose points are themselves spaces, or “representatives” or “equivalence classes” of such spaces – but it leads to the “correct” way of thinking about many geometric and algebraic objects, and more importantly about families of such objects, without drowning in a mess of coordinate charts and formulae which serve to obscure the underlying geometry.
[Update, Oct 21: categories fixed.]
Recent Comments