You are currently browsing the tag archive for the ‘Fourier transform’ tag.
The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group with the property that the centraliser
of any non-identity element
is abelian; equivalently, the commuting relation
(defined as the relation that holds when
commutes with
, thus
) is an equivalence relation on the non-identity elements
of
. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the
group of invertible affine transformations
on a field
. A little less obviously, the special linear group
over a finite field
is a CA-group when
is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in
, for instance, when
is bounded and
is large), is typically a maximal torus (because most elements in
are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have
Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superceded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
Moving even further down the ladder of simple precursors of CSFG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group which has a subgroup
(called the Frobenius complement) with the property that all the non-trivial conjugates
of
for
, intersect
only at the origin. For instance the
group is also a Frobenius group (take
to be the affine transformations that fix a specified point
, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if
is a CA-group and
is a maximal abelian subgroup of
, then any conjugate
of
that is not identical to
will intersect
only at the origin (because
and each of its conjugates consist of equivalence classes under the commuting relation
, together with the identity). So if a maximal abelian subgroup
of a CA-group is its own normaliser (thus
is equal to
), then the group is a Frobenius group.
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let
be a Frobenius group with Frobenius complement
. Then there exists a normal subgroup
of
(called the Frobenius kernel of
) such that
is the semi-direct product
of
and
.
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of , since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let
be a group of permutations acting transitively on a finite set
, with the property that any non-identity permutation in
fixes at most one point in
. Then the set of permutations in
that fix no points in
, together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when is the group of affine permutations on a field
(i.e. the
group for that field), and
is the set of points on that field. In that case, the set of permutations in
that do not fix any points are the non-trivial translations.
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in . Conversely, to deduce Theorem 2 from Theorem 3, set
to be the space of left-cosets of
, with the obvious left
-action; one easily verifies that this action is faithful, transitive, and each non-identity element
of
fixes at most one left-coset of
(basically because it lies in at most one conjugate of
). If we let
be the elements of
that do not fix any point in
, plus the identity, then by Theorem 3
is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of
. From construction
is the identity plus the complement of all the
conjugates of
, which are all disjoint except at the identity, so by counting elements we see that
As normalises
and is disjoint from
, we thus see that
is all of
, giving Theorem 2.
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
Let be
Hermitian matrices, with eigenvalues
and
. The Harish-Chandra-Itzykson-Zuber integral formula exactly computes the integral
where is integrated over the Haar probability measure of the unitary group
and
is a non-zero complex parameter, as the expression
when the eigenvalues of are simple, where
denotes the Vandermonde determinant
and is the constant
There are at least two standard ways to prove this formula in the literature. One way is by applying the Duistermaat-Heckman theorem to the pushforward of Liouville measure on the coadjoint orbit (or more precisely, a rotation of such an orbit by
) under the moment map
, and then using a stationary phase expansion. Another way, which I only learned about recently, is to use the formulae for evolution of eigenvalues under Dyson Brownian motion (as well as the closely related formulae for the GUE ensemble), which were derived in this previous blog post. Both of these approaches can be found in several places in the literature (the former being observed in the original paper of Duistermaat and Heckman, and the latter observed in the paper of Itzykson and Zuber as well as in this later paper of Johansson), but I thought I would record both of these here for my own benefit.
The Harish-Chandra-Itzykson-Zuber formula can be extended to other compact Lie groups than . At first glance, this might suggest that these formulae could be of use in the study of the GOE ensemble, but unfortunately the Lie algebra associated to
corresponds to real anti-symmetric matrices rather than real symmetric matrices. This also occurs in the
case, but there one can simply multiply by
to rotate a complex skew-Hermitian matrix into a complex Hermitian matrix. This is consistent, though, with the fact that the (somewhat rarely studied) anti-symmetric GOE ensemble has cleaner formulae (in particular, having a determinantal structure similar to GUE) than the (much more commonly studied) symmetric GOE ensemble.
Let be a compact group. (Throughout this post, all topological groups are assumed to be Hausdorff.) Then
has a number of unitary representations, i.e. continuous homomorphisms
to the group
of unitary operators on a Hilbert space
, equipped with the strong operator topology. In particular, one has the left-regular representation
, where we equip
with its normalised Haar measure
(and the Borel
-algebra) to form the Hilbert space
, and
is the translation operation
We call two unitary representations and
isomorphic if one has
for some unitary transformation
, in which case we write
.
Given two unitary representations and
, one can form their direct sum
in the obvious manner:
. Conversely, if a unitary representation
has a closed invariant subspace
of
(thus
for all
), then the orthogonal complement
is also invariant, leading to a decomposition
of
into the subrepresentations
,
. Accordingly, we will call a unitary representation
irreducible if
is nontrivial (i.e.
) and there are no nontrivial invariant subspaces (i.e. no invariant subspaces other than
and
); the irreducible representations play a role in the subject analogous to those of prime numbers in multiplicative number theory. By the principle of infinite descent, every finite-dimensional unitary representation is then expressible (perhaps non-uniquely) as the direct sum of irreducible representations.
The Peter-Weyl theorem asserts, among other things, that the same claim is true for the regular representation:
Theorem 1 (Peter-Weyl theorem) Let
be a compact group. Then the regular representation
is isomorphic to the direct sum of irreducible representations. In fact, one has
, where
is an enumeration of the irreducible finite-dimensional unitary representations
of
(up to isomorphism). (It is not difficult to see that such an enumeration exists.)
In the case when is abelian, the Peter-Weyl theorem is a consequence of the Plancherel theorem; in that case, the irreducible representations are all one dimensional, and are thus indexed by the space
of characters
(i.e. continuous homomorphisms into the unit circle
), known as the Pontryagin dual of
. (See for instance my lecture notes on the Fourier transform.) Conversely, the Peter-Weyl theorem can be used to deduce the Plancherel theorem for compact groups, as well as other basic results in Fourier analysis on these groups, such as the Fourier inversion formula.
Because the regular representation is faithful (i.e. injective), a corollary of the Peter-Weyl theorem (and a classical theorem of Cartan) is that every compact group can be expressed as the inverse limit of Lie groups, leading to a solution to Hilbert’s fifth problem in the compact case. Furthermore, the compact case is then an important building block in the more general theory surrounding Hilbert’s fifth problem, and in particular a result of Yamabe that any locally compact group contains an open subgroup that is the inverse limit of Lie groups.
I’ve recently become interested in the theory around Hilbert’s fifth problem, due to the existence of a correspondence principle between locally compact groups and approximate groups, which play a fundamental role in arithmetic combinatorics. I hope to elaborate upon this correspondence in a subsequent post, but I will mention that versions of this principle play a crucial role in Gromov’s proof of his theorem on groups of polynomial growth (discussed previously on this blog), and in a more recent paper of Hrushovski on approximate groups (also discussed previously). It is also analogous in many ways to the more well-known Furstenberg correspondence principle between ergodic theory and combinatorics (also discussed previously).
Because of the above motivation, I have decided to write some notes on how the Peter-Weyl theorem is proven. This is utterly standard stuff in abstract harmonic analysis; these notes are primarily for my own benefit, but perhaps they may be of interest to some readers also.
A recurring theme in mathematics is that of duality: a mathematical object can either be described internally (or in physical space, or locally), by describing what
physically consists of (or what kind of maps exist into
), or externally (or in frequency space, or globally), by describing what
globally interacts or resonates with (or what kind of maps exist out of
). These two fundamentally opposed perspectives on the object
are often dual to each other in various ways: performing an operation on
may transform it one way in physical space, but in a dual way in frequency space, with the frequency space description often being a “inversion” of the physical space description. In several important cases, one is fortunate enough to have some sort of fundamental theorem connecting the internal and external perspectives. Here are some (closely inter-related) examples of this perspective:
- Vector space duality A vector space
over a field
can be described either by the set of vectors inside
, or dually by the set of linear functionals
from
to the field
(or equivalently, the set of vectors inside the dual space
). (If one is working in the category of topological vector spaces, one would work instead with continuous linear functionals; and so forth.) A fundamental connection between the two is given by the Hahn-Banach theorem (and its relatives).
- Vector subspace duality In a similar spirit, a subspace
of
can be described either by listing a basis or a spanning set, or dually by a list of linear functionals that cut out that subspace (i.e. a spanning set for the orthogonal complement
. Again, the Hahn-Banach theorem provides a fundamental connection between the two perspectives.
- Convex duality More generally, a (closed, bounded) convex body
in a vector space
can be described either by listing a set of (extreme) points whose convex hull is
, or else by listing a set of (irreducible) linear inequalities that cut out
. The fundamental connection between the two is given by the Farkas lemma.
- Ideal-variety duality In a slightly different direction, an algebraic variety
in an affine space
can be viewed either “in physical space” or “internally” as a collection of points in
, or else “in frequency space” or “externally” as a collection of polynomials on
whose simultaneous zero locus cuts out
. The fundamental connection between the two perspectives is given by the nullstellensatz, which then leads to many of the basic fundamental theorems in classical algebraic geometry.
- Hilbert space duality An element
in a Hilbert space
can either be thought of in physical space as a vector in that space, or in momentum space as a covector
on that space. The fundamental connection between the two is given by the Riesz representation theorem for Hilbert spaces.
- Semantic-syntactic duality Much more generally still, a mathematical theory can either be described internally or syntactically via its axioms and theorems, or externally or semantically via its models. The fundamental connection between the two perspectives is given by the Gödel completeness theorem.
- Intrinsic-extrinsic duality A (Riemannian) manifold
can either be viewed intrinsically (using only concepts that do not require an ambient space, such as the Levi-Civita connection), or extrinsically, for instance as the level set of some defining function in an ambient space. Some important connections between the two perspectives includes the Nash embedding theorem and the theorema egregium.
- Group duality A group
can be described either via presentations (lists of generators, together with relations between them) or representations (realisations of that group in some more concrete group of transformations). A fundamental connection between the two is Cayley’s theorem. Unfortunately, in general it is difficult to build upon this connection (except in special cases, such as the abelian case), and one cannot always pass effortlessly from one perspective to the other.
- Pontryagin group duality A (locally compact Hausdorff) abelian group
can be described either by listing its elements
, or by listing the characters
(i.e. continuous homomorphisms from
to the unit circle, or equivalently elements of
). The connection between the two is the focus of abstract harmonic analysis.
- Pontryagin subgroup duality A subgroup
of a locally compact abelian group
can be described either by generators in
, or generators in the orthogonal complement
. One of the fundamental connections between the two is the Poisson summation formula.
- Fourier duality A (sufficiently nice) function
on a locally compact abelian group
(equipped with a Haar measure
) can either be described in physical space (by its values
at each element
of
) or in frequency space (by the values
at elements
of the Pontryagin dual
). The fundamental connection between the two is the Fourier inversion formula.
- The uncertainty principle The behaviour of a function
at physical scales above (resp. below) a certain scale
is almost completely controlled by the behaviour of its Fourier transform
at frequency scales below (resp. above) the dual scale
and vice versa, thanks to various mathematical manifestations of the uncertainty principle. (The Poisson summation formula can also be viewed as a variant of this principle, using subgroups instead of scales.)
- Stone/Gelfand duality A (locally compact Hausdorff) topological space
can be viewed in physical space (as a collection of points), or dually, via the
algebra
of continuous complex-valued functions on that space, or (in the case when
is compact and totally disconnected) via the boolean algebra of clopen sets (or equivalently, the idempotents of
). The fundamental connection between the two is given by the Stone representation theorem or the (commutative) Gelfand-Naimark theorem.
I have discussed a fair number of these examples in previous blog posts (indeed, most of the links above are to my own blog). In this post, I would like to discuss the uncertainty principle, that describes the dual relationship between physical space and frequency space. There are various concrete formalisations of this principle, most famously the Heisenberg uncertainty principle and the Hardy uncertainty principle – but in many situations, it is the heuristic formulation of the principle that is more useful and insightful than any particular rigorous theorem that attempts to capture that principle. Unfortunately, it is a bit tricky to formulate this heuristic in a succinct way that covers all the various applications of that principle; the Heisenberg inequality is a good start, but it only captures a portion of what the principle tells us. Consider for instance the following (deliberately vague) statements, each of which can be viewed (heuristically, at least) as a manifestation of the uncertainty principle:
- A function which is band-limited (restricted to low frequencies) is featureless and smooth at fine scales, but can be oscillatory (i.e. containing plenty of cancellation) at coarse scales. Conversely, a function which is smooth at fine scales will be almost entirely restricted to low frequencies.
- A function which is restricted to high frequencies is oscillatory at fine scales, but is negligible at coarse scales. Conversely, a function which is oscillatory at fine scales will be almost entirely restricted to high frequencies.
- Projecting a function to low frequencies corresponds to averaging out (or spreading out) that function at fine scales, leaving only the coarse scale behaviour.
- Projecting a frequency to high frequencies corresponds to removing the averaged coarse scale behaviour, leaving only the fine scale oscillation.
- The number of degrees of freedom of a function is bounded by the product of its spatial uncertainty and its frequency uncertainty (or more generally, by the volume of the phase space uncertainty). In particular, there are not enough degrees of freedom for a non-trivial function to be simulatenously localised to both very fine scales and very low frequencies.
- To control the coarse scale (or global) averaged behaviour of a function, one essentially only needs to know the low frequency components of the function (and vice versa).
- To control the fine scale (or local) oscillation of a function, one only needs to know the high frequency components of the function (and vice versa).
- Localising a function to a region of physical space will cause its Fourier transform (or inverse Fourier transform) to resemble a plane wave on every dual region of frequency space.
- Averaging a function along certain spatial directions or at certain scales will cause the Fourier transform to become localised to the dual directions and scales. The smoother the averaging, the sharper the localisation.
- The smoother a function is, the more rapidly decreasing its Fourier transform (or inverse Fourier transform) is (and vice versa).
- If a function is smooth or almost constant in certain directions or at certain scales, then its Fourier transform (or inverse Fourier transform) will decay away from the dual directions or beyond the dual scales.
- If a function has a singularity spanning certain directions or certain scales, then its Fourier transform (or inverse Fourier transform) will decay slowly along the dual directions or within the dual scales.
- Localisation operations in position approximately commute with localisation operations in frequency so long as the product of the spatial uncertainty and the frequency uncertainty is significantly larger than one.
- In the high frequency (or large scale) limit, position and frequency asymptotically behave like a pair of classical observables, and partial differential equations asymptotically behave like classical ordinary differential equations. At lower frequencies (or finer scales), the former becomes a “quantum mechanical perturbation” of the latter, with the strength of the quantum effects increasing as one moves to increasingly lower frequencies and finer spatial scales.
- Etc., etc.
- Almost all of the above statements generalise to other locally compact abelian groups than
or
, in which the concept of a direction or scale is replaced by that of a subgroup or an approximate subgroup. (In particular, as we will see below, the Poisson summation formula can be viewed as another manifestation of the uncertainty principle.)
I think of all of the above (closely related) assertions as being instances of “the uncertainty principle”, but it seems difficult to combine them all into a single unified assertion, even at the heuristic level; they seem to be better arranged as a cloud of tightly interconnected assertions, each of which is reinforced by several of the others. The famous inequality is at the centre of this cloud, but is by no means the only aspect of it.
The uncertainty principle (as interpreted in the above broad sense) is one of the most fundamental principles in harmonic analysis (and more specifically, to the subfield of time-frequency analysis), second only to the Fourier inversion formula (and more generally, Plancherel’s theorem) in importance; understanding this principle is a key piece of intuition in the subject that one has to internalise before one can really get to grips with this subject (and also with closely related subjects, such as semi-classical analysis and microlocal analysis). Like many fundamental results in mathematics, the principle is not actually that difficult to understand, once one sees how it works; and when one needs to use it rigorously, it is usually not too difficult to improvise a suitable formalisation of the principle for the occasion. But, given how vague this principle is, it is difficult to present this principle in a traditional “theorem-proof-remark” manner. Even in the more informal format of a blog post, I was surprised by how challenging it was to describe my own understanding of this piece of mathematics in a linear fashion, despite (or perhaps because of) it being one of the most central and basic conceptual tools in my own personal mathematical toolbox. In the end, I chose to give below a cloud of interrelated discussions about this principle rather than a linear development of the theory, as this seemed to more closely align with the nature of this principle.
In these notes we lay out the basic theory of the Fourier transform, which is of course the most fundamental tool in harmonic analysis and also of major importance in related fields (functional analysis, complex analysis, PDE, number theory, additive combinatorics, representation theory, signal processing, etc.). The Fourier transform, in conjunction with the Fourier inversion formula, allows one to take essentially arbitrary (complex-valued) functions on a group (or more generally, a space
that
acts on, e.g. a homogeneous space
), and decompose them as a (discrete or continuous) superposition of much more symmetric functions on the domain, such as characters
; the precise superposition is given by Fourier coefficients
, which take values in some dual object such as the Pontryagin dual
of
. Characters behave in a very simple manner with respect to translation (indeed, they are eigenfunctions of the translation action), and so the Fourier transform tends to simplify any mathematical problem which enjoys a translation invariance symmetry (or an approximation to such a symmetry), and is somehow “linear” (i.e. it interacts nicely with superpositions). In particular, Fourier analytic methods are particularly useful for studying operations such as convolution
and set-theoretic addition
, or the closely related problem of counting solutions to additive problems such as
or
, where
are constrained to lie in specific sets
. The Fourier transform is also a particularly powerful tool for solving constant-coefficient linear ODE and PDE (because of the translation invariance), and can also approximately solve some variable-coefficient (or slightly non-linear) equations if the coefficients vary smoothly enough and the nonlinear terms are sufficiently tame.
The Fourier transform also provides an important new way of looking at a function
, as it highlights the distribution of
in frequency space (the domain of the frequency variable
) rather than physical space (the domain of the physical variable
). A given property of
in the physical domain may be transformed to a rather different-looking property of
in the frequency domain. For instance:
- Smoothness of
in the physical domain corresponds to decay of
in the Fourier domain, and conversely. (More generally, fine scale properties of
tend to manifest themselves as coarse scale properties of
, and conversely.)
- Convolution in the physical domain corresponds to pointwise multiplication in the Fourier domain, and conversely.
- Constant coefficient differential operators such as
in the physical domain corresponds to multiplication by polynomials such as
in the Fourier domain, and conversely.
- More generally, translation invariant operators in the physical domain correspond to multiplication by symbols in the Fourier domain, and conversely.
- Rescaling in the physical domain by an invertible linear transformation corresponds to an inverse (adjoint) rescaling in the Fourier domain.
- Restriction to a subspace (or subgroup) in the physical domain corresponds to projection to the dual quotient space (or quotient group) in the Fourier domain, and conversely.
- Frequency modulation in the physical domain corresponds to translation in the frequency domain, and conversely.
(We will make these statements more precise below.)
On the other hand, some operations in the physical domain remain essentially unchanged in the Fourier domain. Most importantly, the norm (or energy) of a function
is the same as that of its Fourier transform, and more generally the inner product
of two functions
is the same as that of their Fourier transforms. Indeed, the Fourier transform is a unitary operator on
(a fact which is variously known as the Plancherel theorem or the Parseval identity). This makes it easier to pass back and forth between the physical domain and frequency domain, so that one can combine techniques that are easy to execute in the physical domain with other techniques that are easy to execute in the frequency domain. (In fact, one can combine the physical and frequency domains together into a product domain known as phase space, and there are entire fields of mathematics (e.g. microlocal analysis, geometric quantisation, time-frequency analysis) devoted to performing analysis on these sorts of spaces directly, but this is beyond the scope of this course.)
In these notes, we briefly discuss the general theory of the Fourier transform, but will mainly focus on the two classical domains for Fourier analysis: the torus , and the Euclidean space
. For these domains one has the advantage of being able to perform very explicit algebraic calculations, involving concrete functions such as plane waves
or Gaussians
.
[This post was typeset using a LaTeX to WordPress-HTML converter kindly provided to me by Luca Trevisan.]
Many properties of a (sufficiently nice) function are reflected in its Fourier transform
, defined by the formula
For instance, decay properties of are reflected in smoothness properties of
, as the following table shows:
| If |
then |
and this relates to… |
| Square-integrable | square-integrable | Plancherel’s theorem |
| Absolutely integrable | continuous | Riemann-Lebesgue lemma |
| Rapidly decreasing | smooth | theory of Schwartz functions |
| Exponentially decreasing | analytic in a strip | |
| Compactly supported | entire and at most exponential growth | Paley-Wiener theorem |
Another important relationship between a function and its Fourier transform
is the uncertainty principle, which roughly asserts that if a function
is highly localised in space, then its Fourier transform
must be widely dispersed in space, or to put it another way,
and
cannot both decay too strongly at infinity (except of course in the degenerate case
). There are many ways to make this intuition precise. One of them is the Heisenberg uncertainty principle, which asserts that if we normalise
then we must have
thus forcing at least one of or
to not be too concentrated near the origin. This principle can be proven (for sufficiently nice
, initially) by observing the integration by parts identity
and then using Cauchy-Schwarz and the Plancherel identity.
Another well known manifestation of the uncertainty principle is the fact that it is not possible for and
to both be compactly supported (unless of course they vanish entirely). This can be in fact be seen from the above table: if
is compactly supported, then
is an entire function; but the zeroes of a non-zero entire function are isolated, yielding a contradiction unless
vanishes. (Indeed, the table also shows that if one of
and
is compactly supported, then the other cannot have exponential decay.)
On the other hand, we have the example of the Gaussian functions ,
, which both decay faster than exponentially. The classical Hardy uncertainty principle asserts, roughly speaking, that this is the fastest that
and
can simultaneously decay:
Theorem 1 (Hardy uncertainty principle) Suppose that
is a (measurable) function such that
and
for all
and some
. Then
is a scalar multiple of the gaussian
.
This theorem is proven by complex-analytic methods, in particular the Phragmén-Lindelöf principle; for sake of completeness we give that proof below. But I was curious to see if there was a real-variable proof of the same theorem, avoiding the use of complex analysis. I was able to find the proof of a slightly weaker theorem:
Theorem 2 (Weak Hardy uncertainty principle) Suppose that
is a non-zero (measurable) function such that
and
for all
and some
. Then
for some absolute constant
.
Note that the correct value of should be
, as is implied by the true Hardy uncertainty principle. Despite the weaker statement, I thought the proof might still might be of interest as it is a little less “magical” than the complex-variable one, and so I am giving it below.
In the next few lectures, we will be studying four major classes of function spaces. In decreasing order of generality, these classes are the topological vector spaces, the normed vector spaces, the Banach spaces, and the Hilbert spaces. In order to motivate the discussion of the more general classes of spaces, we will first focus on the most special class – that of (real and complex) Hilbert spaces. These spaces can be viewed as generalisations of (real and complex) Euclidean spaces such as and
to infinite-dimensional settings, and indeed much of one’s Euclidean geometry intuition concerning lengths, angles, orthogonality, subspaces, etc. will transfer readily to arbitrary Hilbert spaces; in contrast, this intuition is not always accurate in the more general vector spaces mentioned above. In addition to Euclidean spaces, another fundamental example of Hilbert spaces comes from the Lebesgue spaces
of a measure space
. (There are of course many other Hilbert spaces of importance in complex analysis, harmonic analysis, and PDE, such as Hardy spaces
, Sobolev spaces
, and the space
of Hilbert-Schmidt operators, but we will not discuss those spaces much in this course. Complex Hilbert spaces also play a fundamental role in the foundations of quantum mechanics, being the natural space to hold all the possible states of a quantum system (possibly after projectivising the Hilbert space), but we will not discuss this subject here.)
Hilbert spaces are the natural abstract framework in which to study two important (and closely related) concepts: orthogonality and unitarity, allowing us to generalise familiar concepts and facts from Euclidean geometry such as the Cartesian coordinate system, rotations and reflections, and the Pythagorean theorem to Hilbert spaces. (For instance, the Fourier transform is a unitary transformation and can thus be viewed as a kind of generalised rotation.) Furthermore, the Hodge duality on Euclidean spaces has a partial analogue for Hilbert spaces, namely the Riesz representation theorem for Hilbert spaces, which makes the theory of duality and adjoints for Hilbert spaces especially simple (when compared with the more subtle theory of duality for, say, Banach spaces). Much later (next quarter, in fact), we will see that this duality allows us to extend the spectral theorem for self-adjoint matrices to that of self-adjoint operators on a Hilbert space.
These notes are only the most basic introduction to the theory of Hilbert spaces. In particular, the theory of linear transformations between two Hilbert spaces, which is perhaps the most important aspect of the subject, is not covered much at all here (but I hope to discuss it further in future lectures.)
I’m continuing my series of articles for the Princeton Companion to Mathematics by uploading my article on the Fourier transform. Here, I chose to describe this transform as a means of decomposing general functions into more symmetric functions (such as sinusoids or plane waves), and to discuss a little bit how this transform is connected to differential operators such as the Laplacian. (This is of course only one of the many different uses of the Fourier transform, but again, with only five pages to work with, it’s hard to do justice to every single application. For instance, the connections with additive combinatorics are not covered at all.)
On the official web site of the Companion (which you can access with the user name “Guest” and password “PCM”), there is a more polished version of the same article, after it had gone through a few rounds of the editing process.
I’ll also point out David Ben-Zvi‘s Companion article on “moduli spaces“. This concept is deceptively simple – a space whose points are themselves spaces, or “representatives” or “equivalence classes” of such spaces – but it leads to the “correct” way of thinking about many geometric and algebraic objects, and more importantly about families of such objects, without drowning in a mess of coordinate charts and formulae which serve to obscure the underlying geometry.
[Update, Oct 21: categories fixed.]
[This post is authored by Gil Kalai, who has kindly “guest blogged” this week’s “open problem of the week”. - T.]
The entropy-influence conjecture seeks to relate two somewhat different measures as to how a boolean function has concentrated Fourier coefficients, namely the total influence and the entropy.
We begin by defining the total influence. Let be the discrete cube, i.e. the set of
vectors
of length n. A boolean function is any function
from the discrete cube to {-1,+1}. One can think of such functions as “voting methods”, which take the preferences of n voters (+1 for yes, -1 for no) as input and return a yes/no verdict as output. For instance, if n is odd, the “majority vote” function
returns +1 if there are more +1 variables than -1, or -1 otherwise, whereas if
, the “
dictator” function returns the value
of the
variable.
We give the cube the uniform probability measure
(thus we assume that the n voters vote randomly and independently). Given any boolean function f and any variable
, define the influence
of the
variable to be the quantity
where is the element of the cube formed by flipping the sign of the
variable. Informally,
measures the probability that the
voter could actually determine the outcome of an election; it is sometimes referred to as the Banzhaf power index. The total influence I(f) of f (also known as the average sensitivity and the edge-boundary density) is then defined as
Thus for instance a dictator function has total influence 1, whereas majority vote has total influence comparable to . The influence can range between 0 (for constant functions +1, -1) and n (for the parity function
or its negation). If f has mean zero (i.e. it is equal to +1 half of the time), then the edge-isoperimetric inequality asserts that
(with equality if and only if there is a dictatorship), whilst the Kahn-Kalai-Linial (KKL) theorem asserts that
for some k. There is a result of Friedgut that if
is bounded by A (say) and
, then f is within a distance
(in
norm) of another boolean function g which only depends on
of the variables (such functions are known as juntas).

Recent Comments