[This post was typeset using a LaTeX to WordPress-HTML converter kindly provided to me by Luca Trevisan.]

Many properties of a (sufficiently nice) function {f: {\mathbb R} \rightarrow {\mathbb C}} are reflected in its Fourier transform {\hat f: {\mathbb R} \rightarrow {\mathbb C}}, defined by the formula

\displaystyle \hat f(\xi) := \int_{-\infty}^\infty f(x) e^{-2\pi i x \xi}\ dx. \ \ \ \ \ (1)

For instance, decay properties of {f} are reflected in smoothness properties of {\hat f}, as the following table shows:

If {f} is… | then {\hat f} is… | and this relates to…
Square-integrable | square-integrable | Plancherel’s theorem
Absolutely integrable | continuous | Riemann-Lebesgue lemma
Rapidly decreasing | smooth | theory of Schwartz functions
Exponentially decreasing | analytic in a strip |
Compactly supported | entire and at most exponential growth | Paley-Wiener theorem

Another important relationship between a function {f} and its Fourier transform {\hat f} is the uncertainty principle, which roughly asserts that if a function {f} is highly localised in space, then its Fourier transform {\hat f} must be widely dispersed in frequency, or to put it another way, {f} and {\hat f} cannot both decay too strongly at infinity (except of course in the degenerate case {f=0}). There are many ways to make this intuition precise. One of them is the Heisenberg uncertainty principle, which asserts that if we normalise

\displaystyle \int_{{\mathbb R}} |f(x)|^2\ dx = \int_{\mathbb R} |\hat f(\xi)|^2\ d\xi = 1

then we must have

\displaystyle (\int_{\mathbb R} |x|^2 |f(x)|^2\ dx) \cdot (\int_{\mathbb R} |\xi|^2 |\hat f(\xi)|^2\ d\xi)\geq \frac{1}{(4\pi)^2}

thus forcing at least one of {f} or {\hat f} to not be too concentrated near the origin. This principle can be proven (for sufficiently nice {f}, initially) by observing the integration by parts identity

\displaystyle \mathrm{Re} \langle xf, f' \rangle = \mathrm{Re} \int_{\mathbb R} x f(x) \overline{f'(x)}\ dx = - \frac{1}{2} \int_{\mathbb R} |f(x)|^2\ dx

and then using Cauchy-Schwarz and the Plancherel identity.
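As a quick numerical sanity check (a sketch only, not part of the proof), one can confirm that the normalised Gaussian {f(x) = 2^{1/4} e^{-\pi x^2}} attains equality in the Heisenberg inequality. The helper `integrate` and the truncation to {[-8,8]} are choices made for this illustration.

```python
import math

def integrate(g, lo=-8.0, hi=8.0, n=16000):
    """Midpoint rule for g on [lo, hi]; adequate for rapidly decaying integrands."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def f(x):
    # L^2-normalised Gaussian; under convention (1) it is its own Fourier transform.
    return 2 ** 0.25 * math.exp(-math.pi * x * x)

norm = integrate(lambda x: f(x) ** 2)                   # should be 1
second_moment = integrate(lambda x: x * x * f(x) ** 2)  # should be 1/(4 pi)
product = second_moment ** 2   # the frequency-side moment is the same, since \hat f = f
lower_bound = 1.0 / (4.0 * math.pi) ** 2
```

Here `product` and `lower_bound` agree up to quadrature error, so the constant {\frac{1}{(4\pi)^2}} cannot be improved.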

Another well known manifestation of the uncertainty principle is the fact that it is not possible for {f} and {\hat f} to both be compactly supported (unless of course they vanish entirely). This can in fact be seen from the above table: if {f} is compactly supported, then {\hat f} is an entire function; but the zeroes of a non-zero entire function are isolated, yielding a contradiction unless {f} vanishes. (Indeed, the table also shows that if one of {f} and {\hat f} is compactly supported, then the other cannot have exponential decay.)

On the other hand, we have the example of the Gaussian functions {f(x) = e^{-\pi a x^2}}, {\hat f(\xi) = \frac{1}{\sqrt{a}} e^{-\pi \xi^2/a }}, which both decay faster than exponentially. The classical Hardy uncertainty principle asserts, roughly speaking, that this is the fastest that {f} and {\hat f} can simultaneously decay:

Theorem 1 (Hardy uncertainty principle) Suppose that {f} is a (measurable) function such that {|f(x)| \leq C e^{-\pi a x^2 }} and {|\hat f(\xi)| \leq C' e^{-\pi \xi^2/a}} for all {x, \xi} and some {C, C', a > 0}. Then {f(x)} is a scalar multiple of the gaussian {e^{-\pi ax^2}}.
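The Gaussian pair that saturates the hypotheses of Theorem 1 can be checked numerically against definition (1). The following is a minimal sketch; the function names, the truncation of the integral and the midpoint rule are assumptions of this illustration.

```python
import math

def gaussian_hat(a, xi, half_width=12.0, n=48000):
    """Approximate (1) for f(x) = exp(-pi a x^2) by the midpoint rule.

    f is even, so the sine part of exp(-2 pi i x xi) integrates to zero
    and only the cosine part is summed.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += math.exp(-math.pi * a * x * x) * math.cos(2.0 * math.pi * x * xi)
    return h * total

def closed_form(a, xi):
    """The claimed transform a^{-1/2} exp(-pi xi^2 / a)."""
    return math.exp(-math.pi * xi * xi / a) / math.sqrt(a)
```

For instance, `gaussian_hat(2.0, 0.7)` and `closed_form(2.0, 0.7)` should agree to within quadrature error.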

This theorem is proven by complex-analytic methods, in particular the Phragmén-Lindelöf principle; for the sake of completeness we give that proof below. But I was curious to see if there was a real-variable proof of the same theorem, avoiding the use of complex analysis. I was able to find the proof of a slightly weaker theorem:

Theorem 2 (Weak Hardy uncertainty principle) Suppose that {f} is a non-zero (measurable) function such that {|f(x)| \leq C e^{-\pi a x^2 }} and {|\hat f(\xi)| \leq C' e^{-\pi b \xi^2}} for all {x, \xi} and some {C, C', a, b > 0}. Then {ab \leq C_0} for some absolute constant {C_0}.

Note that the correct value of {C_0} should be {1}, as is implied by the true Hardy uncertainty principle. Despite the weaker statement, I thought the proof might still be of interest as it is a little less “magical” than the complex-variable one, and so I am giving it below.

— 1. The complex-variable proof —

We first give the complex-variable proof. By dilating {f} by {\sqrt{a}} (and contracting {\hat f} by {1/\sqrt{a}}) we may normalise {a=1}. By multiplying {f} by a small constant we may also normalise {C=C'=1}.

The super-exponential decay of {f} allows us to extend the Fourier transform {\hat f} to the complex plane, thus

\displaystyle \hat f(\xi + i \eta) = \int_{\mathbb R} f(x) e^{-2\pi i x \xi} e^{2\pi \eta x}\ dx

for all {\xi, \eta \in {\mathbb R}}. We may differentiate under the integral sign and verify that {\hat f} is entire. Taking absolute values, we obtain the upper bound

\displaystyle |\hat f(\xi + i \eta)| \leq \int_{\mathbb R} e^{-\pi x^2} e^{2\pi \eta x}\ dx;

completing the square, we obtain

\displaystyle |\hat f(\xi + i \eta)| \leq e^{\pi \eta^2} \ \ \ \ \ (2)

for all {\xi, \eta}. We conclude that the entire function

\displaystyle F(z) := e^{\pi z^2} \hat f(z)

is bounded in magnitude by {1} on the imaginary axis; by hypothesis on {\hat f}, we also know that {F} is bounded in magnitude by {1} on the real axis. Formally applying the Phragmén-Lindelöf principle (or maximum modulus principle), we conclude that {F} is bounded on the entire complex plane, which by Liouville’s theorem implies that {F} is constant, and the claim follows.

Now let’s go back and justify the Phragmén-Lindelöf argument. Strictly speaking, Phragmén-Lindelöf does not apply, since it requires exponential growth on the function {F}, whereas we have quadratic-exponential growth here. But we can tweak {F} a bit to solve this problem. Firstly, we pick {0 < \theta < \pi/2} and work on the sector

\displaystyle \Gamma_\theta := \{ re^{i\alpha}: r > 0, 0 \leq \alpha \leq \theta \}.

Using (2) we have

\displaystyle |F(\xi + i\eta)| \leq e^{\pi \xi^2}.

Thus, if {\delta > 0}, and {\theta} is sufficiently close to {\pi/2} depending on {\delta}, the function {e^{i\delta z^2} F(z)} is bounded in magnitude by {1} on the boundary of {\Gamma_\theta}. Then, for any sufficiently small {\epsilon > 0}, {e^{i\epsilon e^{i\epsilon} z^{2+\epsilon}} e^{i\delta z^2} F(z)} (using the standard branch of {z^{2+\epsilon}} on {\Gamma_\theta}) is also bounded in magnitude by {1} on this boundary, and goes to zero at infinity in the interior of {\Gamma_\theta}, so is bounded by {1} in that interior by the maximum modulus principle. Sending {\epsilon \rightarrow 0}, and then {\theta \rightarrow \pi/2}, and then {\delta \rightarrow 0}, we obtain {F} bounded in magnitude by {1} on the upper right quadrant. Similar arguments work for the other quadrants, and the claim follows.
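To see how close to {\pi/2} the angle {\theta} has to be, note that on the ray {z = re^{i\theta}} one has {|e^{i\delta z^2}| = e^{-\delta r^2 \sin 2\theta}}, while the bound on {F} is {e^{\pi r^2 \cos^2 \theta}}; the product is at most {1} exactly when {\tan \theta \geq \pi/(2\delta)}. The sketch below checks this; the function names are made up for this illustration.

```python
import cmath
import math

def threshold_angle(delta):
    """Smallest theta in (0, pi/2) with pi*cos(theta)^2 <= delta*sin(2*theta)."""
    return math.atan(math.pi / (2.0 * delta))

def log_bound_on_ray(delta, theta, r):
    """log of |exp(i delta z^2)| * exp(pi xi^2) at z = r e^{i theta} = xi + i eta."""
    z = cmath.rect(r, theta)
    damping = (1j * delta * z * z).real  # log|exp(i delta z^2)| = Re(i delta z^2)
    growth = math.pi * z.real ** 2       # log of the bound exp(pi xi^2) on |F|
    return damping + growth
```

A non-positive value of `log_bound_on_ray` means the modified function is bounded by {1} at that point of the ray; this happens for every {r} once `theta` is at least `threshold_angle(delta)`, and fails for smaller angles.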

— 2. The real-variable proof —

Now we turn to the real-variable proof of Theorem 2, which is based on the fact that polynomials of controlled degree do not resemble rapidly decreasing functions.

Rather than use complex analyticity of {\hat f}, we will rely instead on a different relationship between the decay of {f} and the regularity of {\hat f}, as follows:

Lemma 3 (Derivative bound) Suppose that {|f(x)| \leq C e^{-\pi a x^2 }} for all {x \in {\mathbb R}}, and some {C, a > 0}. Then {\hat f} is smooth, and furthermore one has the bound {|\partial_\xi^k \hat f(\xi)| \leq \frac{C}{\sqrt{a}} \frac{k! \pi^{k/2}}{(k/2)! a^{k/2}}} for all {\xi \in {\mathbb R}} and every even integer {k}.

Proof: The smoothness of {\hat f} follows from the rapid decrease of {f}. To get the bound, we differentiate under the integral sign (one can easily check that this is justified) to obtain

\displaystyle \partial_\xi^k \hat f(\xi) = \int_{\mathbb R} (-2\pi i x)^k f(x) e^{-2\pi i x \xi}\ dx

and thus by the triangle inequality for integrals (and the hypothesis that {k} is even)

\displaystyle |\partial_\xi^k \hat f(\xi)| \leq C \int_{\mathbb R} e^{-\pi a x^2} (2\pi x)^k\ dx.

On the other hand, by differentiating the Fourier analytic identity

\displaystyle \frac{1}{\sqrt{a}} e^{-\pi \xi^2/a} = \int_{\mathbb R} e^{-\pi a x^2} e^{-2\pi i x \xi}\ dx

{k} times at {\xi = 0}, we obtain

\displaystyle \frac{d^k}{d\xi^k}(\frac{1}{\sqrt{a}} e^{-\pi \xi^2/a})|_{\xi=0} = \int_{\mathbb R} e^{-\pi a x^2} (2\pi i x)^k\ dx;

expanding out {\frac{1}{\sqrt{a}} e^{-\pi \xi^2/a}} using Taylor series we conclude that

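\displaystyle \frac{k!}{\sqrt{a}} \frac{(-\pi/a)^{k/2}}{(k/2)!} = \int_{\mathbb R} e^{-\pi a x^2} (2\pi i x)^k\ dx

and the claim follows on taking absolute values (recall that {k} is even) and inserting this into the previous bound. ◻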

Using Stirling’s formula {k! = k^k (e+o(1))^{-k}}, we conclude in particular that

\displaystyle |\partial_\xi^k \hat f(\xi)| \leq (\frac{2\pi}{e a}+o(1))^{k/2} k^{k/2} \ \ \ \ \ (3)

for all large even integers {k} (where the decay of {o(1)} can depend on {a, C}).

We can combine (3) with Taylor’s theorem with remainder, to conclude that on any interval {I \subset {\mathbb R}}, we have an approximation

\displaystyle \hat f(\xi) = P_I(\xi) + O( \frac{1}{k!} (\frac{2\pi}{e a}+o(1))^{k/2} k^{k/2} |I|^k )

where {|I|} is the length of {I} and {P_I} is a polynomial of degree less than {k}. Using Stirling’s formula again, we obtain

\displaystyle \hat f(\xi) = P_I(\xi) + O( (\frac{2\pi e}{a}+o(1))^{k/2} k^{-k/2} |I|^k ) \ \ \ \ \ (4)
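The Gaussian moment identity behind Lemma 3 is easy to test numerically, and the same closed form lets one watch the growth rate that Stirling’s formula extracts. This is only a sketch: the function names are invented here, the integral is truncated to a finite window, and `math.lgamma` is used so that large {k} does not overflow.

```python
import math

def gaussian_moment(a, k, half_width=12.0, n=48000):
    """Midpoint-rule value of the integral of exp(-pi a x^2) (2 pi x)^k over R."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += math.exp(-math.pi * a * x * x) * (2.0 * math.pi * x) ** k
    return h * total

def log_closed_form(a, k):
    """log of k! pi^{k/2} / ((k/2)! a^{(k+1)/2}), for even k."""
    return (math.lgamma(k + 1) - math.lgamma(k / 2 + 1)
            + (k / 2) * math.log(math.pi) - ((k + 1) / 2) * math.log(a))

def growth_base(a, k):
    """The number B for which the closed form equals B^{k/2} * k^{k/2}."""
    return math.exp(2.0 * log_closed_form(a, k) / k) / k
```

Comparing `gaussian_moment(a, k)` with `math.exp(log_closed_form(a, k))` for small even `k` checks the identity, while evaluating `growth_base(a, k)` for increasing `k` shows the base settling down, for comparison with the base appearing in (3).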

Now we apply a useful bound.

Lemma 4 (Doubling bound) Let {P} be a polynomial of degree at most {k} for some {k \geq 1}, let {I = [x_0-r,x_0+r]} be an interval, and suppose that {|P(x)| \leq A} for all {x \in I} and some {A>0}. Then for any {N \geq 1} we have the bound {|P(x)| \leq (CN)^k A} for all {x \in NI := [x_0-Nr, x_0+Nr]} and for some absolute constant {C}.

Proof: By translating we may take {x_0=0}; by dilating we may take {r=1}. By dividing {P} by {A}, we may normalise {A=1}. Thus we have {|P(x)| \leq 1} for all {-1 \leq x \leq 1}, and the aim is now to show that {|P(x)| \leq (CN)^k} for all {-N \leq x \leq N}.

Consider the trigonometric polynomial {P(\cos \theta)}. By de Moivre’s formula, this function is a linear combination of {\cos(j \theta)} for {0 \leq j \leq k}. By Fourier analysis, we can thus write {P(\cos \theta) = \sum_{j=0}^k c_j \cos(j \theta)}, where

\displaystyle c_j = \frac{1}{\pi} \int_{-\pi}^\pi P(\cos \theta) \cos(j \theta)\ d\theta.

Since {P(\cos \theta)} is bounded in magnitude by {1}, we conclude that {c_j} is bounded in magnitude by {2} (for {j=0} the formula above should carry an extra factor of {\frac{1}{2}}, which only improves the bound). Next, we use de Moivre’s formula again to expand {\cos(j \theta)} as a polynomial in {\cos(\theta)} and {\sin^2(\theta)}, with coefficients of size {O(1)^k}; expanding {\sin^2(\theta)} further as {1 - \cos^2(\theta)}, we see that {\cos(j \theta)} is a polynomial in {\cos(\theta)} with coefficients {O(1)^k}. Putting all this together, we conclude that the coefficients of {P} are all of size {O(1)^k}. Since {P} has at most {k+1} coefficients, and {|x|^j \leq N^k} whenever {|x| \leq N} and {0 \leq j \leq k}, the claim follows. ◻

Remark 1 One can get slightly sharper results by using the theory of Chebyshev polynomials. (Is the best bound for {C} known? I do not know the recent literature on this subject. I think though that even the sharpest bound for {C} would not fully recover the sharp Hardy uncertainty principle, at least with the argument given here.)
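To illustrate Lemma 4 on the example mentioned in Remark 1, the sketch below evaluates the Chebyshev polynomial {T_k}, which satisfies {T_k(\cos \theta) = \cos(k\theta)} and is therefore bounded by {1} on {[-1,1]}, and measures its growth at a point {N \geq 1}. Classically {T_k} is the extremal case for this kind of growth, a standard fact that is not proved in this post; the function names are made up for the illustration.

```python
import math

def chebyshev_t(k, x):
    """T_k(x) via the recurrence T_{j+1} = 2x T_j - T_{j-1}, with T_0 = 1, T_1 = x."""
    prev, cur = 1.0, x
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur

def sup_on_unit_interval(k, samples=2001):
    """Maximum of |T_k| over an evenly spaced grid on [-1, 1]."""
    return max(abs(chebyshev_t(k, -1.0 + 2.0 * i / (samples - 1)))
               for i in range(samples))

def doubling_ratio(k, big_n):
    """|T_k(N)|^{1/k} / N; a bound of the shape (C N)^k keeps this at most C."""
    return abs(chebyshev_t(k, big_n)) ** (1.0 / k) / big_n
```

In this example `doubling_ratio(k, N)` never exceeds `2`, since {T_k(N) = \frac{1}{2}((N+\sqrt{N^2-1})^k + (N-\sqrt{N^2-1})^k) \leq (2N)^k} for {N \geq 1}.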

We return to the proof of Theorem 2. We pick a large integer {k} and a parameter {r > 0} to be chosen later. From (4) we have

\displaystyle \hat f(\xi) = P_r(\xi) + O( \frac{r^2}{ak} )^{k/2}

for {\xi \in [-r,2r]}, and some polynomial {P_r} of degree less than {k}. In particular, we have

\displaystyle P_r(\xi) = O( e^{-br^2} ) + O( \frac{r^2}{ak} )^{k/2}

for {\xi \in [r,2r]}. Applying Lemma 4, we conclude that

\displaystyle P_r(\xi) = O( 1 )^k e^{-br^2} + O( \frac{r^2}{ak} )^{k/2}

for {\xi \in [-r,r]}. Applying (4) again we conclude that

\displaystyle \hat f(\xi) = O( 1 )^k e^{-br^2} + O( \frac{r^2}{ak} )^{k/2}

for {\xi \in [-r,r]}. If we pick {r := \sqrt{\frac{k}{cb}}} for a sufficiently small absolute constant {c}, we conclude that

\displaystyle |\hat f(\xi)| \leq 2^{-k} + O( \frac{1}{ab} )^{k/2}

(say) for {\xi \in [-r,r]}. If {ab \geq C_0} for large enough {C_0}, the right-hand side goes to zero as {k \rightarrow \infty} (which also implies {r \rightarrow \infty}), and we conclude that {\hat f} (and hence {f}) vanishes identically, contradicting the hypothesis that {f} is non-zero. Thus {ab < C_0}, which gives Theorem 2. ◻