You are currently browsing the tag archive for the ‘symmetric polynomials’ tag.

Let ${\Omega}$ be some domain (such as the real numbers). For any natural number ${p}$, let ${L(\Omega^p)_{sym}}$ denote the space of symmetric real-valued functions ${F^{(p)}: \Omega^p \rightarrow {\bf R}}$ on ${p}$ variables ${x_1,\dots,x_p \in \Omega}$, thus

$\displaystyle F^{(p)}(x_{\sigma(1)},\dots,x_{\sigma(p)}) = F^{(p)}(x_1,\dots,x_p)$

for any permutation ${\sigma: \{1,\dots,p\} \rightarrow \{1,\dots,p\}}$. For instance, for any natural numbers ${k,p}$, the elementary symmetric polynomials

$\displaystyle e_k^{(p)}(x_1,\dots,x_p) = \sum_{1 \leq i_1 < i_2 < \dots < i_k \leq p} x_{i_1} \dots x_{i_k}$

will be an element of ${L({\bf R}^p)_{sym}}$. With the pointwise product operation, ${L(\Omega^p)_{sym}}$ becomes a commutative real algebra. We include the case ${p=0}$, in which case ${L(\Omega^0)_{sym}}$ consists solely of the real constants.

Given two natural numbers ${k,p}$, one can “lift” a symmetric function ${F^{(k)} \in L(\Omega^k)_{sym}}$ of ${k}$ variables to a symmetric function ${[F^{(k)}]_{k \rightarrow p} \in L(\Omega^p)_{sym}}$ of ${p}$ variables by the formula

$\displaystyle [F^{(k)}]_{k \rightarrow p}(x_1,\dots,x_p) = \sum_{1 \leq i_1 < i_2 < \dots < i_k \leq p} F^{(k)}(x_{i_1}, \dots, x_{i_k})$

$\displaystyle = \frac{1}{k!} \sum_\pi F^{(k)}( x_{\pi(1)}, \dots, x_{\pi(k)} )$

where ${\pi}$ ranges over all injections from ${\{1,\dots,k\}}$ to ${\{1,\dots,p\}}$ (the latter formula making it clearer that ${[F^{(k)}]_{k \rightarrow p}}$ is symmetric). Thus for instance

$\displaystyle [F^{(1)}(x_1)]_{1 \rightarrow p} = \sum_{i=1}^p F^{(1)}(x_i)$

$\displaystyle [F^{(2)}(x_1,x_2)]_{2 \rightarrow p} = \sum_{1 \leq i < j \leq p} F^{(2)}(x_i,x_j)$

and

$\displaystyle e_k^{(p)}(x_1,\dots,x_p) = [x_1 \dots x_k]_{k \rightarrow p}.$

Also we have

$\displaystyle [1]_{k \rightarrow p} = \binom{p}{k} = \frac{p(p-1)\dots(p-k+1)}{k!}.$

With these conventions, we see that ${[F^{(k)}]_{k \rightarrow p}}$ vanishes for ${p=0,\dots,k-1}$, and is equal to ${F}$ if ${k=p}$. We also have the transitivity

$\displaystyle [F^{(k)}]_{k \rightarrow p} = \frac{1}{\binom{p-k}{p-l}} [[F^{(k)}]_{k \rightarrow l}]_{l \rightarrow p}$

if ${k \leq l \leq p}$.

The lifting map ${[]_{k \rightarrow p}}$ is a linear map from ${L(\Omega^k)_{sym}}$ to ${L(\Omega^p)_{sym}}$, but it is not a ring homomorphism. For instance, when ${\Omega={\bf R}}$, one has

$\displaystyle [x_1]_{1 \rightarrow p} [x_1]_{1 \rightarrow p} = (\sum_{i=1}^p x_i)^2 \ \ \ \ \ (1)$

$\displaystyle = \sum_{i=1}^p x_i^2 + 2 \sum_{1 \leq i < j \leq p} x_i x_j$

$\displaystyle = [x_1^2]_{1 \rightarrow p} + 2 [x_1 x_2]_{1 \rightarrow p}$

$\displaystyle \neq [x_1^2]_{1 \rightarrow p}.$

In general, one has the identity

$\displaystyle [F^{(k)}(x_1,\dots,x_k)]_{k \rightarrow p} [G^{(l)}(x_1,\dots,x_l)]_{l \rightarrow p} = \sum_{k,l \leq m \leq k+l} \frac{1}{k! l!} \ \ \ \ \ (2)$

$\displaystyle [\sum_{\pi, \rho} F^{(k)}(x_{\pi(1)},\dots,x_{\pi(k)}) G^{(l)}(x_{\rho(1)},\dots,x_{\rho(l)})]_{m \rightarrow p}$

for all natural numbers ${k,l,p}$ and ${F^{(k)} \in L(\Omega^k)_{sym}}$, ${G^{(l)} \in L(\Omega^l)_{sym}}$, where ${\pi, \rho}$ range over all injections ${\pi: \{1,\dots,k\} \rightarrow \{1,\dots,m\}}$, ${\rho: \{1,\dots,l\} \rightarrow \{1,\dots,m\}}$ with ${\pi(\{1,\dots,k\}) \cup \rho(\{1,\dots,l\}) = \{1,\dots,m\}}$. Combinatorially, the identity (2) follows from the fact that given any injections ${\tilde \pi: \{1,\dots,k\} \rightarrow \{1,\dots,p\}}$ and ${\tilde \rho: \{1,\dots,l\} \rightarrow \{1,\dots,p\}}$ with total image ${\tilde \pi(\{1,\dots,k\}) \cup \tilde \rho(\{1,\dots,l\})}$ of cardinality ${m}$, one has ${k,l \leq m \leq k+l}$, and furthermore there exist precisely ${m!}$ triples ${(\pi, \rho, \sigma)}$ of injections ${\pi: \{1,\dots,k\} \rightarrow \{1,\dots,m\}}$, ${\rho: \{1,\dots,l\} \rightarrow \{1,\dots,m\}}$, ${\sigma: \{1,\dots,m\} \rightarrow \{1,\dots,p\}}$ such that ${\tilde \pi = \sigma \circ \pi}$ and ${\tilde \rho = \sigma \circ \rho}$.

Example 1 When ${\Omega = {\bf R}}$, one has

$\displaystyle [x_1 x_2]_{2 \rightarrow p} [x_1]_{1 \rightarrow p} = [\frac{1}{2! 1!}( 2 x_1^2 x_2 + 2 x_1 x_2^2 )]_{2 \rightarrow p} + [\frac{1}{2! 1!} 6 x_1 x_2 x_3]_{3 \rightarrow p}$

$\displaystyle = [x_1^2 x_2 + x_1 x_2^2]_{2 \rightarrow p} + [3x_1 x_2 x_3]_{3 \rightarrow p}$

which is just a restatement of the identity

$\displaystyle (\sum_{i < j} x_i x_j) (\sum_k x_k) = \sum_{i

Note that the coefficients appearing in (2) do not depend on the final number of variables ${p}$. We may therefore abstract the role of ${p}$ from the law (2) by introducing the real algebra ${L(\Omega^*)_{sym}}$ of formal sums

$\displaystyle F^{(*)} = \sum_{k=0}^\infty [F^{(k)}]_{k \rightarrow *}$

where for each ${k}$, ${F^{(k)}}$ is an element of ${L(\Omega^k)_{sym}}$ (with only finitely many of the ${F^{(k)}}$ being non-zero), and with the formal symbol ${[]_{k \rightarrow *}}$ being formally linear, thus

$\displaystyle [F^{(k)}]_{k \rightarrow *} + [G^{(k)}]_{k \rightarrow *} := [F^{(k)} + G^{(k)}]_{k \rightarrow *}$

and

$\displaystyle c [F^{(k)}]_{k \rightarrow *} := [cF^{(k)}]_{k \rightarrow *}$

for ${F^{(k)}, G^{(k)} \in L(\Omega^k)_{sym}}$ and scalars ${c \in {\bf R}}$, and with multiplication given by the analogue

$\displaystyle [F^{(k)}(x_1,\dots,x_k)]_{k \rightarrow *} [G^{(l)}(x_1,\dots,x_l)]_{l \rightarrow *} = \sum_{k,l \leq m \leq k+l} \frac{1}{k! l!} \ \ \ \ \ (3)$

$\displaystyle [\sum_{\pi, \rho} F^{(k)}(x_{\pi(1)},\dots,x_{\pi(k)}) G^{(l)}(x_{\rho(1)},\dots,x_{\rho(l)})]_{m \rightarrow *}$

of (2). Thus for instance, in this algebra ${L(\Omega^*)_{sym}}$ we have

$\displaystyle [x_1]_{1 \rightarrow *} [x_1]_{1 \rightarrow *} = [x_1^2]_{1 \rightarrow *} + 2 [x_1 x_2]_{2 \rightarrow *}$

and

$\displaystyle [x_1 x_2]_{2 \rightarrow *} [x_1]_{1 \rightarrow *} = [x_1^2 x_2 + x_1 x_2^2]_{2 \rightarrow *} + [3 x_1 x_2 x_3]_{3 \rightarrow *}.$

Informally, ${L(\Omega^*)_{sym}}$ is an abstraction (or “inverse limit”) of the concept of a symmetric function of an unspecified number of variables, which are formed by summing terms that each involve only a bounded number of these variables at a time. One can check (somewhat tediously) that ${L(\Omega^*)_{sym}}$ is indeed a commutative real algebra, with a unit ${[1]_{0 \rightarrow *}}$. (I do not know if this algebra has previously been studied in the literature; it is somewhat analogous to the abstract algebra of finite linear combinations of Schur polynomials, with multiplication given by a Littlewood-Richardson rule. )

For natural numbers ${p}$, there is an obvious specialisation map ${[]_{* \rightarrow p}}$ from ${L(\Omega^*)_{sym}}$ to ${L(\Omega^p)_{sym}}$, defined by the formula

$\displaystyle [\sum_{k=0}^\infty [F^{(k)}]_{k \rightarrow *}]_{* \rightarrow p} := \sum_{k=0}^\infty [F^{(k)}]_{k \rightarrow p}.$

Thus, for instance, ${[]_{* \rightarrow p}}$ maps ${[x_1]_{1 \rightarrow *}}$ to ${[x_1]_{1 \rightarrow p}}$ and ${[x_1 x_2]_{2 \rightarrow *}}$ to ${[x_1 x_2]_{2 \rightarrow p}}$. From (2) and (3) we see that this map ${[]_{* \rightarrow p}: L(\Omega^*)_{sym} \rightarrow L(\Omega^p)_{sym}}$ is an algebra homomorphism, even though the maps ${[]_{k \rightarrow *}: L(\Omega^k)_{sym} \rightarrow L(\Omega^*)_{sym}}$ and ${[]_{k \rightarrow p}: L(\Omega^k)_{sym} \rightarrow L(\Omega^p)_{sym}}$ are not homomorphisms. By inspecting the ${p^{th}}$ component of ${L(\Omega^*)_{sym}}$ we see that the homomorphism ${[]_{* \rightarrow p}}$ is in fact surjective.

Now suppose that we have a measure ${\mu}$ on the space ${\Omega}$, which then induces a product measure ${\mu^p}$ on every product space ${\Omega^p}$. To avoid degeneracies we will assume that the integral ${\int_\Omega \mu}$ is strictly positive. Assuming suitable measurability and integrability hypotheses, a function ${F \in L(\Omega^p)_{sym}}$ can then be integrated against this product measure to produce a number

$\displaystyle \int_{\Omega^p} F\ d\mu^p.$

In the event that ${F}$ arises as a lift ${[F^{(k)}]_{k \rightarrow p}}$ of another function ${F^{(k)} \in L(\Omega^k)_{sym}}$, then from Fubini’s theorem we obtain the formula

$\displaystyle \int_{\Omega^p} F\ d\mu^p = \binom{p}{k} (\int_{\Omega^k} F^{(k)}\ d\mu^k) (\int_\Omega\ d\mu)^{p-k}.$

Thus for instance, if ${\Omega={\bf R}}$,

$\displaystyle \int_{{\bf R}^p} [x_1]_{1 \rightarrow p}\ d\mu^p = p (\int_{\bf R} x\ d\mu(x)) (\int_{\bf R} \mu)^{p-1} \ \ \ \ \ (4)$

and

$\displaystyle \int_{{\bf R}^p} [x_1 x_2]_{2 \rightarrow p}\ d\mu^p = \binom{p}{2} (\int_{{\bf R}^2} x_1 x_2\ d\mu(x_1) d\mu(x_2)) (\int_{\bf R} \mu)^{p-2}. \ \ \ \ \ (5)$

On summing, we see that if

$\displaystyle F^{(*)} = \sum_{k=0}^\infty [F^{(k)}]_{k \rightarrow *}$

is an element of the formal algebra ${L(\Omega^*)_{sym}}$, then

$\displaystyle \int_{\Omega^p} [F^{(*)}]_{* \rightarrow p}\ d\mu^p = \sum_{k=0}^\infty \binom{p}{k} (\int_{\Omega^k} F^{(k)}\ d\mu^k) (\int_\Omega\ d\mu)^{p-k}. \ \ \ \ \ (6)$

Note that by hypothesis, only finitely many terms on the right-hand side are non-zero.

Now for a key observation: whereas the left-hand side of (6) only makes sense when ${p}$ is a natural number, the right-hand side is meaningful when ${p}$ takes a fractional value (or even when it takes negative or complex values!), interpreting the binomial coefficient ${\binom{p}{k}}$ as a polynomial ${\frac{p(p-1) \dots (p-k+1)}{k!}}$ in ${p}$. As such, this suggests a way to introduce a “virtual” concept of a symmetric function on a fractional power space ${\Omega^p}$ for such values of ${p}$, and even to integrate such functions against product measures ${\mu^p}$, even if the fractional power ${\Omega^p}$ does not exist in the usual set-theoretic sense (and ${\mu^p}$ similarly does not exist in the usual measure-theoretic sense). More precisely, for arbitrary real or complex ${p}$, we now define ${L(\Omega^p)_{sym}}$ to be the space of abstract objects

$\displaystyle F^{(p)} = [F^{(*)}]_{* \rightarrow p} = \sum_{k=0}^\infty [F^{(k)}]_{k \rightarrow p}$

with ${F^{(*)} \in L(\Omega^*)_{sym}}$ and ${[]_{* \rightarrow p}}$ (and ${[]_{k \rightarrow p}}$ now interpreted as formal symbols, with the structure of a commutative real algebra inherited from ${L(\Omega^*)_{sym}}$, thus

$\displaystyle [F^{(*)}]_{* \rightarrow p} + [G^{(*)}]_{* \rightarrow p} := [F^{(*)} + G^{(*)}]_{* \rightarrow p}$

$\displaystyle c [F^{(*)}]_{* \rightarrow p} := [c F^{(*)}]_{* \rightarrow p}$

$\displaystyle [F^{(*)}]_{* \rightarrow p} [G^{(*)}]_{* \rightarrow p} := [F^{(*)} G^{(*)}]_{* \rightarrow p}.$

In particular, the multiplication law (2) continues to hold for such values of ${p}$, thanks to (3). Given any measure ${\mu}$ on ${\Omega}$, we formally define a measure ${\mu^p}$ on ${\Omega^p}$ with regards to which we can integrate elements ${F^{(p)}}$ of ${L(\Omega^p)_{sym}}$ by the formula (6) (providing one has sufficient measurability and integrability to make sense of this formula), thus providing a sort of “fractional dimensional integral” for symmetric functions. Thus, for instance, with this formalism the identities (4), (5) now hold for fractional values of ${p}$, even though the formal space ${{\bf R}^p}$ no longer makes sense as a set, and the formal measure ${\mu^p}$ no longer makes sense as a measure. (The formalism here is somewhat reminiscent of the technique of dimensional regularisation employed in the physical literature in order to assign values to otherwise divergent integrals. See also this post for an unrelated abstraction of the integration concept involving integration over supercommutative variables (and in particular over fermionic variables).)

Example 2 Suppose ${\mu}$ is a probability measure on ${\Omega}$, and ${X: \Omega \rightarrow {\bf R}}$ is a random variable; on any power ${\Omega^k}$, we let ${X_1,\dots,X_k: \Omega^k \rightarrow {\bf R}}$ be the usual independent copies of ${X}$ on ${\Omega^k}$, thus ${X_j(\omega_1,\dots,\omega_k) := X(\omega_j)}$ for ${(\omega_1,\dots,\omega_k) \in \Omega^k}$. Then for any real or complex ${p}$, the formal integral

$\displaystyle \int_{\Omega^p} [X_1]_{1 \rightarrow p}^2\ d\mu^p$

can be evaluated by first using the identity

$\displaystyle [X_1]_{1 \rightarrow p}^2 = [X_1^2]_{1 \rightarrow p} + 2[X_1 X_2]_{2 \rightarrow p}$

(cf. (1)) and then using (6) and the probability measure hypothesis ${\int_\Omega\ d\mu = 1}$ to conclude that

$\displaystyle \int_{\Omega^p} [X_1]_{1 \rightarrow p}^2\ d\mu^p = \binom{p}{1} \int_{\Omega} X^2\ d\mu + 2 \binom{p}{2} \int_{\Omega^2} X_1 X_2\ d\mu^2$

$\displaystyle = p (\int_\Omega X^2\ d\mu - (\int_\Omega X\ d\mu)^2) + p^2 (\int_\Omega X\ d\mu)^2$

or in probabilistic notation

$\displaystyle \int_{\Omega^p} [X_1]_{1 \rightarrow p}^2\ d\mu^p = p \mathbf{Var}(X) + p^2 \mathbf{E}(X)^2. \ \ \ \ \ (7)$

For ${p}$ a natural number, this identity has the probabilistic interpretation

$\displaystyle \mathbf{E}( X_1 + \dots + X_p)^2 = p \mathbf{Var}(X) + p^2 \mathbf{E}(X)^2 \ \ \ \ \ (8)$

whenever ${X_1,\dots,X_p}$ are jointly independent copies of ${X}$, which reflects the well known fact that the sum ${X_1 + \dots + X_p}$ has expectation ${p \mathbf{E} X}$ and variance ${p \mathbf{Var}(X)}$. One can thus view (7) as an abstract generalisation of (8) to the case when ${p}$ is fractional, negative, or even complex, despite the fact that there is no sensible way in this case to talk about ${p}$ independent copies ${X_1,\dots,X_p}$ of ${X}$ in the standard framework of probability theory.

In this particular case, the quantity (7) is non-negative for every nonnegative ${p}$, which looks plausible given the form of the left-hand side. Unfortunately, this sort of non-negativity does not always hold; for instance, if ${X}$ has mean zero, one can check that

$\displaystyle \int_{\Omega^p} [X_1]_{1 \rightarrow p}^4\ d\mu^p = p \mathbf{Var}(X^2) + p(3p-2) (\mathbf{E}(X^2))^2$

and the right-hand side can become negative for ${p < 2/3}$. This is a shame, because otherwise one could hope to start endowing ${L(X^p)_{sym}}$ with some sort of commutative von Neumann algebra type structure (or the abstract probability structure discussed in this previous post) and then interpret it as a genuine measure space rather than as a virtual one. (This failure of positivity is related to the fact that the characteristic function of a random variable, when raised to the ${p^{th}}$ power, need not be a characteristic function of any random variable once ${p}$ is no longer a natural number: “fractional convolution” does not preserve positivity!) However, one vestige of positivity remains: if ${F: \Omega \rightarrow {\bf R}}$ is non-negative, then so is

$\displaystyle \int_{\Omega^p} [F]_{1 \rightarrow p}\ d\mu^p = p (\int_\Omega F\ d\mu) (\int_\Omega\ d\mu)^{p-1}.$

One can wonder what the point is to all of this abstract formalism and how it relates to the rest of mathematics. For me, this formalism originated implicitly in an old paper I wrote with Jon Bennett and Tony Carbery on the multilinear restriction and Kakeya conjectures, though we did not have a good language for working with it at the time, instead working first with the case of natural number exponents ${p}$ and appealing to a general extrapolation theorem to then obtain various identities in the fractional ${p}$ case. The connection between these fractional dimensional integrals and more traditional integrals ultimately arises from the simple identity

$\displaystyle (\int_\Omega\ d\mu)^p = \int_{\Omega^p}\ d\mu^p$

(where the right-hand side should be viewed as the fractional dimensional integral of the unit ${[1]_{0 \rightarrow p}}$ against ${\mu^p}$). As such, one can manipulate ${p^{th}}$ powers of ordinary integrals using the machinery of fractional dimensional integrals. A key lemma in this regard is

Lemma 3 (Differentiation formula) Suppose that a positive measure ${\mu = \mu(t)}$ on ${\Omega}$ depends on some parameter ${t}$ and varies by the formula

$\displaystyle \frac{d}{dt} \mu(t) = a(t) \mu(t) \ \ \ \ \ (9)$

for some function ${a(t): \Omega \rightarrow {\bf R}}$. Let ${p}$ be any real or complex number. Then, assuming sufficient smoothness and integrability of all quantities involved, we have

$\displaystyle \frac{d}{dt} \int_{\Omega^p} F^{(p)}\ d\mu(t)^p = \int_{\Omega^p} F^{(p)} [a(t)]_{1 \rightarrow p}\ d\mu(t)^p \ \ \ \ \ (10)$

for all ${F^{(p)} \in L(\Omega^p)_{sym}}$ that are independent of ${t}$. If we allow ${F^{(p)}(t)}$ to now depend on ${t}$ also, then we have the more general total derivative formula

$\displaystyle \frac{d}{dt} \int_{\Omega^p} F^{(p)}(t)\ d\mu(t)^p \ \ \ \ \ (11)$

$\displaystyle = \int_{\Omega^p} \frac{d}{dt} F^{(p)}(t) + F^{(p)}(t) [a(t)]_{1 \rightarrow p}\ d\mu(t)^p,$

again assuming sufficient amounts of smoothness and regularity.

Proof: We just prove (10), as (11) then follows by same argument used to prove the usual product rule. By linearity it suffices to verify this identity in the case ${F^{(p)} = [F^{(k)}]_{k \rightarrow p}}$ for some symmetric function ${F^{(k)} \in L(\Omega^k)_{sym}}$ for a natural number ${k}$. By (6), the left-hand side of (10) is then

$\displaystyle \frac{d}{dt} [\binom{p}{k} (\int_{\Omega^k} F^{(k)}\ d\mu(t)^k) (\int_\Omega\ d\mu(t))^{p-k}]. \ \ \ \ \ (12)$

Differentiating under the integral sign using (9) we have

$\displaystyle \frac{d}{dt} \int_\Omega\ d\mu(t) = \int_\Omega\ a(t)\ d\mu(t)$

and similarly

$\displaystyle \frac{d}{dt} \int_{\Omega^k} F^{(k)}\ d\mu(t)^k = \int_{\Omega^k} F^{(k)}(a_1+\dots+a_k)\ d\mu(t)^k$

where ${a_1,\dots,a_k}$ are the standard ${k}$ copies of ${a = a(t)}$ on ${\Omega^k}$:

$\displaystyle a_j(\omega_1,\dots,\omega_k) := a(\omega_j).$

By the product rule, we can thus expand (12) as

$\displaystyle \binom{p}{k} (\int_{\Omega^k} F^{(k)}(a_1+\dots+a_k)\ d\mu^k ) (\int_\Omega\ d\mu)^{p-k}$

$\displaystyle + \binom{p}{k} (p-k) (\int_{\Omega^k} F^{(k)}\ d\mu^k) (\int_\Omega\ a\ d\mu) (\int_\Omega\ d\mu)^{p-k-1}$

where we have suppressed the dependence on ${t}$ for brevity. Since ${\binom{p}{k} (p-k) = \binom{p}{k+1} (k+1)}$, we can write this expression using (6) as

$\displaystyle \int_{\Omega^p} [F^{(k)} (a_1 + \dots + a_k)]_{k \rightarrow p} + [ F^{(k)} \ast a ]_{k+1 \rightarrow p}\ d\mu^p$

where ${F^{(k)} \ast a \in L(\Omega^{k+1})_{sym}}$ is the symmetric function

$\displaystyle F^{(k)} \ast a(\omega_1,\dots,\omega_{k+1}) := \sum_{j=1}^{k+1} F^{(k)}(\omega_1,\dots,\omega_{j-1},\omega_{j+1} \dots \omega_{k+1}) a(\omega_j).$

But from (2) one has

$\displaystyle [F^{(k)} (a_1 + \dots + a_k)]_{k \rightarrow p} + [ F^{(k)} \ast a ]_{k+1 \rightarrow p} = [F^{(k)}]_{k \rightarrow p} [a]_{1 \rightarrow p}$

and the claim follows. $\Box$

Remark 4 It is also instructive to prove this lemma in the special case when ${p}$ is a natural number, in which case the fractional dimensional integral ${\int_{\Omega^p} F^{(p)}\ d\mu(t)^p}$ can be interpreted as a classical integral. In this case, the identity (10) is immediate from applying the product rule to (9) to conclude that

$\displaystyle \frac{d}{dt} d\mu(t)^p = [a(t)]_{1 \rightarrow p} d\mu(t)^p.$

One could in fact derive (10) for arbitrary real or complex ${p}$ from the case when ${p}$ is a natural number by an extrapolation argument; see the appendix of my paper with Bennett and Carbery for details.

Let us give a simple PDE application of this lemma as illustration:

Proposition 5 (Heat flow monotonicity) Let ${u: [0,+\infty) \times {\bf R}^d \rightarrow {\bf R}}$ be a solution to the heat equation ${u_t = \Delta u}$ with initial data ${\mu_0}$ a rapidly decreasing finite non-negative Radon measure, or more explicitly

$\displaystyle u(t,x) = \frac{1}{(4\pi t)^{d/2}} \int_{{\bf R}^d} e^{-|x-y|^2/4t}\ d\mu_0(y)$

for al ${t>0}$. Then for any ${p>0}$, the quantity

$\displaystyle Q_p(t) := t^{\frac{d}{2} (p-1)} \int_{{\bf R}^d} u(t,x)^p\ dx$

is monotone non-decreasing in ${t \in (0,+\infty)}$ for ${1 < p < \infty}$, constant for ${p=1}$, and monotone non-increasing for ${0 < p < 1}$.

Proof: By a limiting argument we may assume that ${d\mu_0}$ is absolutely continuous, with Radon-Nikodym derivative a test function; this is more than enough regularity to justify the arguments below.

For any ${(t,x) \in (0,+\infty) \times {\bf R}^d}$, let ${\mu(t,x)}$ denote the Radon measure

$\displaystyle d\mu(t,x)(y) := \frac{1}{(4\pi)^{d/2}} e^{-|x-y|^2/4t}\ d\mu_0(y).$

Then the quantity ${Q_p(t)}$ can be written as a fractional dimensional integral

$\displaystyle Q_p(t) = t^{-d/2} \int_{{\bf R}^d} \int_{({\bf R}^d)^p}\ d\mu(t,x)^p\ dx.$

Observe that

$\displaystyle \frac{\partial}{\partial t} d\mu(t,x) = \frac{|x-y|^2}{4t^2} d\mu(t,x)$

and thus by Lemma 3 and the product rule

$\displaystyle \frac{d}{dt} Q_p(t) = -\frac{d}{2t} Q_p(t) + t^{-d/2} \int_{{\bf R}^d} \int_{({\bf R}^d)^p} [\frac{|x-y|^2}{4t^2}]_{1 \rightarrow p} d\mu(t,x)^p\ dx \ \ \ \ \ (13)$

where we use ${y}$ for the variable of integration in the factor space ${{\bf R}^d}$ of ${({\bf R}^d)^p}$.

To simplify this expression we will take advantage of integration by parts in the ${x}$ variable. Specifically, in any direction ${x_j}$, we have

$\displaystyle \frac{\partial}{\partial x_j} d\mu(t,x) = -\frac{x_j-y_j}{2t} d\mu(t,x)$

and hence by Lemma 3

$\displaystyle \frac{\partial}{\partial x_j} \int_{({\bf R}^d)^p}\ d\mu(t,x)^p\ dx = - \int_{({\bf R}^d)^p} [\frac{x_j-y_j}{2t}]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx.$

Multiplying by ${x_j}$ and integrating by parts, we see that

$\displaystyle d Q_p(t) = \int_{{\bf R}^d} \int_{({\bf R}^d)^p} x_j [\frac{x_j-y_j}{2t}]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx$

$\displaystyle = \int_{{\bf R}^d} \int_{({\bf R}^d)^p} x_j [\frac{x_j-y_j}{2t}]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx$

where we use the Einstein summation convention in ${j}$. Similarly, if ${F_j(y)}$ is any reasonable function depending only on ${y}$, we have

$\displaystyle \frac{\partial}{\partial x_j} \int_{({\bf R}^d)^p}[F_j(y)]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx$

$\displaystyle = - \int_{({\bf R}^d)^p} [F_j(y)]_{1 \rightarrow p} [\frac{x_j-y_j}{2t}]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx$

and hence on integration by parts

$\displaystyle 0 = \int_{{\bf R}^d} \int_{({\bf R}^d)^p} [F_j(y) \frac{x_j-y_j}{2t}]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx.$

We conclude that

$\displaystyle \frac{d}{2t} Q_p(t) = t^{-d/2} \int_{{\bf R}^d} \int_{({\bf R}^d)^p} (x_j - [F_j(y)]_{1 \rightarrow p}) [\frac{(x_j-y_j)}{4t}]_{1 \rightarrow p} d\mu(t,x)^p\ dx$

and thus by (13)

$\displaystyle \frac{d}{dt} Q_p(t) = \frac{1}{4t^{\frac{d}{2}+2}} \int_{{\bf R}^d} \int_{({\bf R}^d)^p}$

$\displaystyle [(x_j-y_j)(x_j-y_j)]_{1 \rightarrow p} - (x_j - [F_j(y)]_{1 \rightarrow p}) [x_j - y_j]_{1 \rightarrow p}\ d\mu(t,x)^p\ dx.$

The choice of ${F_j}$ that then achieves the most cancellation turns out to be ${F_j(y) = \frac{1}{p} y_j}$ (this cancels the terms that are linear or quadratic in the ${x_j}$), so that ${x_j - [F_j(y)]_{1 \rightarrow p} = \frac{1}{p} [x_j - y_j]_{1 \rightarrow p}}$. Repeating the calculations establishing (7), one has

$\displaystyle \int_{({\bf R}^d)^p} [(x_j-y_j)(x_j-y_j)]_{1 \rightarrow p}\ d\mu^p = p \mathop{\bf E} |x-Y|^2 (\int_{{\bf R}^d}\ d\mu)^{p}$

and

$\displaystyle \int_{({\bf R}^d)^p} [x_j-y_j]_{1 \rightarrow p} [x_j-y_j]_{1 \rightarrow p}\ d\mu^p$

$\displaystyle = (p \mathbf{Var}(x-Y) + p^2 |\mathop{\bf E} x-Y|^2) (\int_{{\bf R}^d}\ d\mu)^{p}$

where ${Y}$ is the random variable drawn from ${{\bf R}^d}$ with the normalised probability measure ${\mu / \int_{{\bf R}^d}\ d\mu}$. Since ${\mathop{\bf E} |x-Y|^2 = \mathbf{Var}(x-Y) + |\mathop{\bf E} x-Y|^2}$, one thus has

$\displaystyle \frac{d}{dt} Q_p(t) = (p-1) \frac{1}{4t^{\frac{d}{2}+2}} \int_{{\bf R}^d} \mathbf{Var}(x-Y) (\int_{{\bf R}^d}\ d\mu)^{p}\ dx. \ \ \ \ \ (14)$

This expression is clearly non-negative for ${p>1}$, equal to zero for ${p=1}$, and positive for ${0 < p < 1}$, giving the claim. (One could simplify ${\mathbf{Var}(x-Y)}$ here as ${\mathbf{Var}(Y)}$ if desired, though it is not strictly necessary to do so for the proof.) $\Box$

Remark 6 As with Remark 4, one can also establish the identity (14) first for natural numbers ${p}$ by direct computation avoiding the theory of fractional dimensional integrals, and then extrapolate to the case of more general values of ${p}$. This particular identity is also simple enough that it can be directly established by integration by parts without much difficulty, even for fractional values of ${p}$.

A more complicated version of this argument establishes the non-endpoint multilinear Kakeya inequality (without any logarithmic loss in a scale parameter ${R}$); this was established in my previous paper with Jon Bennett and Tony Carbery, but using the “natural number ${p}$ first” approach rather than using the current formalism of fractional dimensional integration. However, the arguments can be translated into this formalism without much difficulty; we do so below the fold. (To simplify the exposition slightly we will not address issues of establishing enough regularity and integrability to justify all the manipulations, though in practice this can be done by standard limiting arguments.)

The complete homogeneous symmetric polynomial ${h_d(x_1,\dots,x_n)}$ of ${n}$ variables ${x_1,\dots,x_n}$ and degree ${d}$ can be defined as

$\displaystyle h_d(x_1,\dots,x_n) := \sum_{1 \leq i_1 \leq \dots \leq i_d \leq n} x_{i_1} \dots x_{i_d},$

thus for instance

$\displaystyle h_0(x_1,\dots,x_n) = 0,$

$\displaystyle h_1(x_1,\dots,x_n) = x_1 + \dots + x_n,$

and

$\displaystyle h_2(x_1,\dots,x_n) = x_1^2 + \dots + x_n^2 + \sum_{1 \leq i < j \leq n} x_i x_j.$

One can also define all the complete homogeneous symmetric polynomials of ${n}$ variables simultaneously by means of the generating function

$\displaystyle \sum_{d=0}^\infty h_d(x_1,\dots,x_n) t^d = \frac{1}{(1-t x_1) \dots (1-tx_n)}. \ \ \ \ \ (1)$

We will think of the variables ${x_1,\dots,x_n}$ as taking values in the real numbers. When one does so, one might observe that the degree two polynomial ${h_2}$ is a positive definite quadratic form, since it has the sum of squares representation

$\displaystyle h_2(x_1,\dots,x_n) = \frac{1}{2} \sum_{i=1}^n x_i^2 + \frac{1}{2} (x_1+\dots+x_n)^2.$

In particular, ${h_2(x_1,\dots,x_n) > 0}$ unless ${x_1=\dots=x_n=0}$. This can be compared against the superficially similar quadratic form

$\displaystyle x_1^2 + \dots + x_n^2 + \sum_{1 \leq i < j \leq n} \epsilon_{ij} x_i x_j$

where ${\epsilon_{ij} = \pm 1}$ are independent randomly chosen signs. The Wigner semicircle law says that for large ${n}$, the eigenvalues of this form will be mostly distributed in the interval ${[-\sqrt{n}, \sqrt{n}]}$ using the semicircle distribution, so in particular the form is quite far from being positive definite despite the presence of the first ${n}$ positive terms. Thus the positive definiteness is coming from the finer algebraic structure of ${h_2}$, and not just from the magnitudes of its coefficients.

One could ask whether the same positivity holds for other degrees ${d}$ than two. For odd degrees, the answer is clearly no, since ${h_d(-x_1,\dots,-x_n) = -h_d(x_1,\dots,x_n)}$ in that case. But one could hope for instance that

$\displaystyle h_4(x_1,\dots,x_n) = \sum_{1 \leq i \leq j \leq k \leq l \leq n} x_i x_j x_k x_l$

also has a sum of squares representation that demonstrates positive definiteness. This turns out to be true, but is remarkably tedious to establish directly. Nevertheless, we have a nice result of Hunter that gives positive definiteness for all even degrees ${d}$. In fact, a modification of his argument gives a little bit more:

Theorem 1 Let ${n \geq 1}$, let ${d \geq 0}$ be even, and let ${x_1,\dots,x_n}$ be reals.

• (i) (Positive definiteness) One has ${h_d(x_1,\dots,x_n) \geq 0}$, with strict inequality unless ${x_1=\dots=x_n=0}$.
• (ii) (Schur convexity) One has ${h_d(x_1,\dots,x_n) \geq h_d(y_1,\dots,y_n)}$ whenever ${(x_1,\dots,x_n)}$ majorises ${(y_1,\dots,y_n)}$, with equality if and only if ${(y_1,\dots,y_n)}$ is a permutation of ${(x_1,\dots,x_n)}$.
• (iii) (Schur-Ostrowski criterion for Schur convexity) For any ${1 \leq i < j \leq n}$, one has ${(x_i - x_j) (\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j}) h_d(x_1,\dots,x_n) \geq 0}$, with strict inequality unless ${x_i=x_j}$.

Proof: We induct on ${d}$ (allowing ${n}$ to be arbitrary). The claim is trivially true for ${d=0}$, and easily verified for ${d=2}$, so suppose that ${d \geq 4}$ and the claims (i), (ii), (iii) have already been proven for ${d-2}$ (and for arbitrary ${n}$).

If we apply the differential operator ${(x_i - x_j) (\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j})}$ to ${\frac{1}{(1-t x_1) \dots (1-tx_n)}}$ using the product rule, one obtains after a brief calculation

$\displaystyle \frac{(x_i-x_j)^2 t^2}{(1-t x_1) \dots (1-tx_n) (1-t x_i) (1-t x_j)}.$

Using (1) and extracting the ${t^d}$ coefficient, we obtain the identity

$\displaystyle (x_i - x_j) (\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j}) h_d(x_1,\dots,x_n)$

$\displaystyle = (x_i-x_j)^2 h_{d-2}( x_1,\dots,x_n,x_i,x_j). \ \ \ \ \ (2)$

The claim (iii) then follows from (i) and the induction hypothesis.

To obtain (ii), we use the more general statement (known as the Schur-Ostrowski criterion) that (ii) is implied from (iii) if we replace ${h_d}$ by an arbitrary symmetric, continuously differentiable function. To establish this criterion, we induct on ${n}$ (this argument can be made independently of the existing induction on ${d}$). If ${(y_1,\dots,y_n)}$ is majorised by ${(x_1,\dots,x_n)}$, it lies in the permutahedron of ${(x_1,\dots,x_n)}$. If ${(y_1,\dots,y_n)}$ lies on a face of this permutahedron, then after permuting both the ${x_i}$ and ${y_j}$ we may assume that ${(y_1,\dots,y_m)}$ is majorised by ${(x_1,\dots,x_m)}$, and ${(y_{m+1},\dots,y_n)}$ is majorised by ${(x_{m+1},\dots,x_n)}$ for some ${1 \leq m < n}$, and the claim then follows from two applications of the induction hypothesis. If instead ${(y_1,\dots,y_n)}$ lies in the interior of the permutahedron, one can follow it to the boundary by using one of the vector fields ${(x_i - x_j) (\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j})}$, and the claim follows from the boundary case.

Finally, to obtain (i), we observe that ${(x_1,\dots,x_n)}$ majorises ${(x,\dots,x)}$, where ${x}$ is the arithmetic mean of ${x_1,\dots,x_n}$. But ${h_d(x,\dots,x)}$ is clearly a positive multiple of ${x^d}$, and the claim now follows from (ii). $\Box$

If the variables ${x_1,\dots,x_n}$ are restricted to be nonnegative, the same argument gives Schur convexity for odd degrees also.

The proof in Hunter of positive definiteness is arranged a little differently than the one above, but still relies ultimately on the identity (2). I wonder if there is a genuinely different way to establish positive definiteness that does not go through this identity.

Recently, I had tentatively announced a forthcoming result with Ben Green establishing the “Gowers inverse conjecture” (or more accurately, the “inverse conjecture for the Gowers uniformity norm”) for vector spaces ${\Bbb F}_p^n$ over a finite field ${\Bbb F}_p$, in the special case when p=2 and when the function $f: {\Bbb F}_p^n \to {\Bbb C}$ for which the inverse conjecture is to be applied is assumed to be a polynomial phase of bounded degree (thus $f= e^{2\pi i P/|{\Bbb F}|}$, where $P: {\Bbb F}_p^n \to {\Bbb F}_p$ is a polynomial of some degree $d=O(1)$). See my FOCS article for some further discussion of this conjecture, which has applications to both polynomiality testing and to various structural decompositions involving the Gowers norm.

This conjecture can be informally stated as follows. By iterating the obvious fact that the derivative of a polynomial of degree at most d is a polynomial of degree at most d-1, we see that a function $P: {\Bbb F}_p^n \to {\Bbb F}_p$ is a polynomial of degree at most d if and only if

$\sum_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}} (-1)^{\omega_1+\ldots+\omega_{d+1}} P(x +\omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) = 0$

for all $x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n$. From this one can deduce that a function $f: {\Bbb F}_p^n \to {\Bbb C}$ bounded in magnitude by 1 is a polynomial phase of degree at most d if and only if the Gowers norm

$\|f\|_{U^{d+1}({\Bbb F}_p^n)} := \bigl( {\Bbb E}_{x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n} \prod_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}}$

${\mathcal C}^{\omega_1+\ldots+\omega_{d+1}} f(x + \omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) \bigr)^{1/2^{d+1}}$

is equal to its maximal value of 1. The inverse conjecture for the Gowers norm, in its usual formulation, says that, more generally, if a function $f: {\Bbb F}_p^n \to {\Bbb C}$ bounded in magnitude by 1 has large Gowers norm (e.g. $\|f\|_{U^{d+1}} \geq \varepsilon$) then f has some non-trivial correlation with some polynomial phase g (e.g. $\langle f, g \rangle > c(\varepsilon)$ for some $c(\varepsilon) > 0$). Informally, this conjecture asserts that if a function has biased $(d+1)^{th}$ derivatives, then one should be able to “integrate” this bias and conclude that the function is biased relative to a polynomial of degree d. The conjecture has already been proven for $d \leq 2$. There are analogues of this conjecture for cyclic groups which are of relevance to Szemerédi’s theorem and to counting linear patterns in primes, but I will not discuss those here.

At the time of the announcement, our paper had not quite been fully written up. This turned out to be a little unfortunate, because soon afterwards we discovered that our arguments at one point had to go through a version of Newton’s interpolation formula, which involves a factor of d! in the denominator and so is only valid when the characteristic p of the field exceeds the degree. So our arguments in fact are only valid in the range $p > d$, and in particular are rather trivial in the important case $p=2$; my previous announcement should thus be amended accordingly.