
Kronecker is famously reported to have said, “God created the natural numbers; all else is the work of man”. The truth of this statement (literal or otherwise) is debatable; but one can certainly view the other standard number systems {{\bf Z}, {\bf Q}, {\bf R}, {\bf C}} as (iterated) completions of the natural numbers {{\bf N}} in various senses. For instance:

  • The integers {{\bf Z}} are the additive completion of the natural numbers {{\bf N}} (the minimal additive group that contains a copy of {{\bf N}}).
  • The rationals {{\bf Q}} are the multiplicative completion of the integers {{\bf Z}} (the minimal field that contains a copy of {{\bf Z}}).
  • The reals {{\bf R}} are the metric completion of the rationals {{\bf Q}} (the minimal complete metric space that contains a copy of {{\bf Q}}).
  • The complex numbers {{\bf C}} are the algebraic completion of the reals {{\bf R}} (the minimal algebraically closed field that contains a copy of {{\bf R}}).

These descriptions of the standard number systems are elegant and conceptual, but not entirely suitable for constructing the number systems in a non-circular manner from more primitive foundations. For instance, one cannot quite define the reals {{\bf R}} from scratch as the metric completion of the rationals {{\bf Q}}, because the definition of a metric space itself requires the notion of the reals! (One can of course construct {{\bf R}} by other means, for instance by using Dedekind cuts or by using uniform spaces in place of metric spaces.) The definition of the complex numbers as the algebraic completion of the reals does not suffer from such a circularity issue, but a certain amount of field theory is required to work with this definition initially. For the purposes of quickly constructing the complex numbers, it is thus more traditional to first define {{\bf C}} as a quadratic extension of the reals {{\bf R}}, and more precisely as the extension {{\bf C} = {\bf R}(i)} formed by adjoining a square root {i} of {-1} to the reals, that is to say a solution to the equation {i^2+1=0}. It is not immediately obvious that this extension is in fact algebraically closed; this is the content of the famous fundamental theorem of algebra, which we will prove later in this course.
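For readers who like to see the quadratic extension realised concretely, here is a minimal illustrative sketch in Python (the class name and layout are ad hoc choices, not part of any standard construction) of arithmetic in {{\bf R}(i)}: a complex number is modelled as a pair of reals, with {i^2 = -1} built into the multiplication rule.

```python
# A sketch of C as the quadratic extension R(i): a complex number is a
# pair (a, b) representing a + bi, with i^2 = -1 encoded in multiplication.
from dataclasses import dataclass

@dataclass(frozen=True)
class Cx:
    re: float
    im: float

    def __add__(self, other):
        return Cx(self.re + other.re, self.im + other.im)

    def __mul__(self, other):
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
        return Cx(self.re * other.re - self.im * other.im,
                  self.re * other.im + self.im * other.re)

i = Cx(0.0, 1.0)
assert i * i == Cx(-1.0, 0.0)   # i is a square root of -1 by construction
```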

The two equivalent definitions of {{\bf C}} – as the algebraic closure, and as a quadratic extension, of the reals respectively – each reveal important features of the complex numbers in applications. Because {{\bf C}} is algebraically closed, all polynomials over the complex numbers split completely, which leads to a good spectral theory for both finite-dimensional matrices and infinite-dimensional operators; in particular, one expects to be able to diagonalise most matrices and operators. Applying this theory to constant coefficient ordinary differential equations leads to a unified theory of the solutions to such equations, in which real-variable ODE behaviour such as exponential growth or decay, polynomial growth, and sinusoidal oscillation all become aspects of a single object, the complex exponential {z \mapsto e^z} (or more generally, the matrix exponential {A \mapsto \exp(A)}). Applying this theory more generally to diagonalise arbitrary translation-invariant operators over some locally compact abelian group, one arrives at Fourier analysis, which is thus most naturally phrased in terms of complex-valued functions rather than real-valued ones. If one drops the assumption that the underlying group is abelian, one instead discovers the representation theory of unitary representations, which is simpler to study than its real-valued counterpart, the theory of orthogonal representations. For closely related reasons, the theory of complex Lie groups is simpler than that of real Lie groups.

Meanwhile, the fact that the complex numbers are a quadratic extension of the reals lets one view the complex numbers geometrically as a two-dimensional plane over the reals (the Argand plane). Whereas a point singularity in the real line disconnects that line, a point singularity in the Argand plane leaves the rest of the plane connected (although, importantly, the punctured plane is no longer simply connected). As we shall see, this fact causes singularities in complex analytic functions to be better behaved than singularities of real analytic functions, ultimately leading to the powerful residue calculus for computing complex integrals. Remarkably, this calculus, when combined with the quintessentially complex-variable technique of contour shifting, can also be used to compute some (though certainly not all) definite integrals of real-valued functions that would be much more difficult to compute by purely real-variable methods; this is a prime example of Hadamard’s famous dictum that “the shortest path between two truths in the real domain passes through the complex domain”.

Another important geometric feature of the Argand plane is the angle between two tangent vectors at a point in the plane. As it turns out, multiplication by a non-zero complex scalar preserves the magnitude and orientation of such angles; the same fact is true for any non-degenerate complex analytic mapping, as can be seen by performing a Taylor expansion to first order. This fact ties the study of complex mappings closely to that of the conformal geometry of the plane (and more generally, of two-dimensional surfaces and domains). In particular, one can use complex analytic maps to conformally transform one two-dimensional domain to another, leading among other things to the famous Riemann mapping theorem, and to the classification of Riemann surfaces.

If one Taylor expands complex analytic maps to second order rather than first order, one discovers a further important property of these maps, namely that they are harmonic. This fact makes the class of complex analytic maps extremely rigid and well behaved analytically; indeed, the entire theory of elliptic PDE now comes into play, giving useful properties such as elliptic regularity and the maximum principle. In fact, due to the magic of residue calculus and contour shifting, we already obtain these properties for maps that are merely complex differentiable rather than complex analytic, which leads to the striking fact that complex differentiable functions are automatically analytic (in contrast to the real-variable case, in which real differentiable functions can be very far from being analytic).

The geometric structure of the complex numbers (and more generally of complex manifolds and complex varieties), when combined with the algebraic closure of the complex numbers, leads to the beautiful subject of complex algebraic geometry, which motivates the much more general theory developed in modern algebraic geometry. However, we will not develop the algebraic geometry aspects of complex analysis here.

Last, but not least, because of the good behaviour of Taylor series in the complex plane, complex analysis is an excellent setting in which to manipulate various generating functions, particularly Fourier series {\sum_n a_n e^{2\pi i n \theta}} (which can be viewed as boundary values of power (or Laurent) series {\sum_n a_n z^n}), as well as Dirichlet series {\sum_n \frac{a_n}{n^s}}. The theory of contour integration provides a very useful dictionary between the asymptotic behaviour of the sequence {a_n}, and the complex analytic behaviour of the Dirichlet or Fourier series, particularly with regard to its poles and other singularities. This turns out to be a particularly handy dictionary in analytic number theory, for instance relating the distribution of the primes to the Riemann zeta function. Nowadays, many of the analytic number theory results first obtained through complex analysis (such as the prime number theorem) can also be obtained by more “real-variable” methods; however the complex-analytic viewpoint is still extremely valuable and illuminating.

We will frequently touch upon many of these connections to other fields of mathematics in these lecture notes. However, these are mostly side remarks intended to provide context, and it is certainly possible to skip most of these tangents and focus purely on the complex analysis material in these notes if desired.

Note: complex analysis is a very visual subject, and one should draw plenty of pictures while learning it. I am however not planning to put too many pictures in these notes, partly as it is somewhat inconvenient to do so on this blog from a technical perspective, but also because pictures that one draws on one’s own are likely to be far more useful to you than pictures that were supplied by someone else.


[This blog post was written jointly by Terry Tao and Will Sawin.]

In the previous blog post, one of us (Terry) implicitly introduced a notion of rank for tensors which is a little different from the usual notion of tensor rank, and which (following BCCGNSU) we will call “slice rank”. This notion of rank could then be used to encode the Croot-Lev-Pach-Ellenberg-Gijswijt argument that uses the polynomial method to control capsets.

Since then, several papers have applied the slice rank method to further problems – to control tri-colored sum-free sets in abelian groups (BCCGNSU, KSS) and from there to the triangle removal lemma in vector spaces over finite fields (FL), to control sunflowers (NS), and to bound progression-free sets in {p}-groups (P).

In this post we investigate the notion of slice rank more systematically. In particular, we show how to give lower bounds for the slice rank. In many cases, we can show that the upper bounds on slice rank given in the aforementioned papers are sharp to within a subexponential factor. This still leaves open the possibility of getting a better bound for the original combinatorial problem using the slice rank of some other tensor, but for very long arithmetic progressions (at least eight terms), we show that the slice rank method cannot improve over the trivial bound using any tensor.

It will be convenient to work in a “basis independent” formalism, namely working in the category of abstract finite-dimensional vector spaces over a fixed field {{\bf F}}. (In the applications to the capset problem one takes {{\bf F}={\bf F}_3} to be the finite field of three elements, but most of the discussion here applies to arbitrary fields.) Given {k} such vector spaces {V_1,\dots,V_k}, we can form the tensor product {\bigotimes_{i=1}^k V_i}, generated by the tensor products {v_1 \otimes \dots \otimes v_k} with {v_i \in V_i} for {i=1,\dots,k}, subject to the constraint that the tensor product operation {(v_1,\dots,v_k) \mapsto v_1 \otimes \dots \otimes v_k} is multilinear. For each {1 \leq j \leq k}, we have the smaller tensor products {\bigotimes_{1 \leq i \leq k: i \neq j} V_i}, as well as the {j^{th}} tensor product

\displaystyle \otimes_j: V_j \times \bigotimes_{1 \leq i \leq k: i \neq j} V_i \rightarrow \bigotimes_{i=1}^k V_i

defined in the obvious fashion. Elements of {\bigotimes_{i=1}^k V_i} of the form {v_j \otimes_j v_{\hat j}} for some {v_j \in V_j} and {v_{\hat j} \in \bigotimes_{1 \leq i \leq k: i \neq j} V_i} will be called rank one functions, and the slice rank (or rank for short) {\hbox{rank}(v)} of an element {v} of {\bigotimes_{i=1}^k V_i} is defined to be the least nonnegative integer {r} such that {v} is a linear combination of {r} rank one functions. If {V_1,\dots,V_k} are finite-dimensional, then the rank is always well defined as a non-negative integer (in fact it cannot exceed {\min( \hbox{dim}(V_1), \dots, \hbox{dim}(V_k))}). It is also clearly subadditive:

\displaystyle \hbox{rank}(v+w) \leq \hbox{rank}(v) + \hbox{rank}(w). \ \ \ \ \ (1)

 

For {k=1}, {\hbox{rank}(v)} is {0} when {v} is zero, and {1} otherwise. For {k=2}, {\hbox{rank}(v)} is the usual rank of the {2}-tensor {v \in V_1 \otimes V_2} (which can for instance be identified with a linear map from {V_1} to the dual space {V_2^*}). The usual notion of tensor rank for higher order tensors uses complete tensor products {v_1 \otimes \dots \otimes v_k}, {v_i \in V_i} as the rank one objects, rather than {v_j \otimes_j v_{\hat j}}, giving a rank that is greater than or equal to the slice rank studied here.
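Since for {k=2} the slice rank is just the matrix rank, one can sanity-check the definitions numerically. The following is a minimal illustrative sketch (using numpy; the data is random and all names are ad hoc):

```python
# For k=2 the slice rank of v in V_1 ⊗ V_2 is the usual matrix rank:
# both shapes of rank one function, f(x)g(y) and f(y)g(x), are rank one
# matrices.  (Illustrative sketch only.)
import numpy as np

rng = np.random.default_rng(0)
v = sum(np.outer(rng.standard_normal(5), rng.standard_normal(5))
        for _ in range(3))                      # a sum of 3 outer products
print(np.linalg.matrix_rank(v))                 # 3, generically

w = np.outer(rng.standard_normal(5), rng.standard_normal(5))
# subadditivity (1): rank(v + w) <= rank(v) + rank(w)
assert np.linalg.matrix_rank(v + w) <= np.linalg.matrix_rank(v) + 1
```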

From basic linear algebra we have the following equivalences:

Lemma 1 Let {V_1,\dots,V_k} be finite-dimensional vector spaces over a field {{\bf F}}, let {v} be an element of {V_1 \otimes \dots \otimes V_k}, and let {r} be a non-negative integer. Then the following are equivalent:

  • (i) One has {\hbox{rank}(v) \leq r}.
  • (ii) One has a representation of the form

    \displaystyle v = \sum_{j=1}^k \sum_{s \in S_j} v_{j,s} \otimes_j v_{\hat j,s}

    where {S_1,\dots,S_k} are finite sets of total cardinality {|S_1|+\dots+|S_k|} at most {r}, and for each {1 \leq j \leq k} and {s \in S_j}, {v_{j,s} \in V_j} and {v_{\hat j,s} \in \bigotimes_{1 \leq i \leq k: i \neq j} V_i}.

  • (iii) One has

    \displaystyle v \in \sum_{j=1}^k U_j \otimes_j \bigotimes_{1 \leq i \leq k: i \neq j} V_i

    where for each {j=1,\dots,k}, {U_j} is a subspace of {V_j} of total dimension {\hbox{dim}(U_1)+\dots+\hbox{dim}(U_k)} at most {r}, and we view {U_j \otimes_j \bigotimes_{1 \leq i \leq k: i \neq j} V_i} as a subspace of {\bigotimes_{i=1}^k V_i} in the obvious fashion.

  • (iv) (Dual formulation) There exist subspaces {W_j} of the dual space {V_j^*} for {j=1,\dots,k}, of total dimension at least {\hbox{dim}(V_1)+\dots+\hbox{dim}(V_k) - r}, such that {v} is orthogonal to {\bigotimes_{j=1}^k W_j}, in the sense that one has the vanishing

    \displaystyle \langle \bigotimes_{j=1}^k w_j, v \rangle = 0

    for all {w_j \in W_j}, where {\langle, \rangle: \bigotimes_{j=1}^k V_j^* \times \bigotimes_{j=1}^k V_j \rightarrow {\bf F}} is the obvious pairing.

Proof: The equivalence of (i) and (ii) is clear from the definition. To get from (ii) to (iii) one simply takes {U_j} to be the span of the {v_{j,s}}, and conversely to get from (iii) to (ii) one takes the {v_{j,s}} to be a basis of the {U_j} and computes {v_{\hat j,s}} by using a basis for each tensor product {U_j \otimes_j \bigotimes_{1 \leq i \leq k: i \neq j} V_i} consisting entirely of functions of the form {v_{j,s} \otimes_j e} for various {e}. To pass from (iii) to (iv) one takes {W_j} to be the annihilator {\{ w_j \in V_j^*: \langle w_j, v_j \rangle = 0 \hbox{ for all } v_j \in U_j \}} of {U_j}, and conversely to pass from (iv) to (iii) one takes {U_j} to be the annihilator of {W_j}. \Box

One corollary of the formulation (iv) is that the set of tensors of slice rank at most {r} is Zariski closed (if the field {{\mathbf F}} is algebraically closed), and so the slice rank itself is a lower semi-continuous function. This is in contrast to the usual tensor rank, which is not necessarily semicontinuous.

Corollary 2 Let {V_1,\dots, V_k} be finite-dimensional vector spaces over an algebraically closed field {{\bf F}}. Let {r} be a nonnegative integer. The set of elements of {V_1 \otimes \dots \otimes V_k} of slice rank at most {r} is closed in the Zariski topology.

Proof: In view of Lemma 1((i) and (iv)), this set is the union over tuples of integers {d_1,\dots,d_k} with {d_1 + \dots + d_k \geq \hbox{dim}(V_1)+\dots+\hbox{dim}(V_k) - r} of the projection from {\hbox{Gr}(d_1, V_1) \times \dots \times \hbox{Gr}(d_k, V_k) \times ( V_1 \otimes \dots \otimes V_k)} of the set of tuples {(W_1,\dots,W_k, v)} with {v} orthogonal to {\bigotimes_{j=1}^k W_j}, where {\hbox{Gr}(d,V)} is the Grassmannian parameterizing {d}-dimensional subspaces of {V}.

One can check directly that the set of tuples {(W_1,\dots,W_k, v)} with {v} orthogonal to {\bigotimes_{j=1}^k W_j} is Zariski closed in {\hbox{Gr}(d_1, V_1) \times \dots \times \hbox{Gr}(d_k, V_k) \times ( V_1 \otimes \dots \otimes V_k )}, using a set of equations of the form {\langle \bigotimes_{j=1}^k w_j, v \rangle = 0} locally on {\hbox{Gr}(d_1, V_1) \times \dots \times \hbox{Gr}(d_k, V_k) }. Hence, because the Grassmannian is a complete variety, the projection of this set to {V_1 \otimes \dots \otimes V_k} is also Zariski closed. So the finite union over tuples {d_1,\dots,d_k} of these projections is also Zariski closed. \Box

We also have good behaviour with respect to linear transformations:

Lemma 3 Let {V_1,\dots,V_k, W_1,\dots,W_k} be finite-dimensional vector spaces over a field {{\bf F}}, let {v} be an element of {V_1 \otimes \dots \otimes V_k}, and for each {1 \leq j \leq k}, let {\phi_j: V_j \rightarrow W_j} be a linear transformation, with {\bigotimes_{j=1}^k \phi_j: \bigotimes_{j=1}^k V_j \rightarrow \bigotimes_{j=1}^k W_j} the tensor product of these maps. Then

\displaystyle \hbox{rank}( (\bigotimes_{j=1}^k \phi_j)(v) ) \leq \hbox{rank}(v). \ \ \ \ \ (2)

 

Furthermore, if the {\phi_j} are all injective, then one has equality in (2).

Thus, for instance, the rank of a tensor {v \in \bigotimes_{j=1}^k V_j} is intrinsic in the sense that it is unaffected by any enlargements of the spaces {V_1,\dots,V_k}.

Proof: The bound (2) is clear from the formulation (ii) of rank in Lemma 1. For equality, apply (2) to the injective {\phi_j}, as well as to some arbitrarily chosen left inverses {\phi_j^{-1}: W_j \rightarrow V_j} of the {\phi_j}. \Box

Computing the rank of a tensor is difficult in general; however, the problem becomes a combinatorial one if one has a suitably sparse representation of that tensor in some basis, where we will measure sparsity by the property of being an antichain.

Proposition 4 Let {V_1,\dots,V_k} be finite-dimensional vector spaces over a field {{\bf F}}. For each {1 \leq j \leq k}, let {(v_{j,s})_{s \in S_j}} be a linearly independent set in {V_j} indexed by some finite set {S_j}. Let {\Gamma} be a subset of {S_1 \times \dots \times S_k}.

Let {v \in \bigotimes_{j=1}^k V_j} be a tensor of the form

\displaystyle v = \sum_{(s_1,\dots,s_k) \in \Gamma} c_{s_1,\dots,s_k} v_{1,s_1} \otimes \dots \otimes v_{k,s_k} \ \ \ \ \ (3)

 

where for each {(s_1,\dots,s_k)}, {c_{s_1,\dots,s_k}} is a coefficient in {{\bf F}}. Then one has

\displaystyle \hbox{rank}(v) \leq \min_{\Gamma = \Gamma_1 \cup \dots \cup \Gamma_k} |\pi_1(\Gamma_1)| + \dots + |\pi_k(\Gamma_k)| \ \ \ \ \ (4)

 

where the minimum ranges over all coverings of {\Gamma} by sets {\Gamma_1,\dots,\Gamma_k}, and {\pi_j: S_1 \times \dots \times S_k \rightarrow S_j} for {j=1,\dots,k} are the projection maps.

Now suppose that the coefficients {c_{s_1,\dots,s_k}} are all non-zero, that each of the {S_j} are equipped with a total ordering {\leq_j}, and {\Gamma'} is the set of maximal elements of {\Gamma}, thus there do not exist distinct {(s_1,\dots,s_k) \in \Gamma'}, {(t_1,\dots,t_k) \in \Gamma} such that {s_j \leq t_j} for all {j=1,\dots,k}. Then one has

\displaystyle \hbox{rank}(v) \geq \min_{\Gamma' = \Gamma_1 \cup \dots \cup \Gamma_k} |\pi_1(\Gamma_1)| + \dots + |\pi_k(\Gamma_k)|. \ \ \ \ \ (5)

 

In particular, if {\Gamma} is an antichain (i.e. every element is maximal), then equality holds in (4).

Proof: By Lemma 3 (or by enlarging the bases {v_{j,s_j}}), we may assume without loss of generality that each of the {V_j} is spanned by the {v_{j,s_j}}. By relabeling, we can also assume that each {S_j} is of the form

\displaystyle S_j = \{1,\dots,|S_j|\}

with the usual ordering, and by Lemma 3 we may take each {V_j} to be {{\bf F}^{|S_j|}}, with {v_{j,s_j} = e_{s_j}} the standard basis.

Let {r} denote the rank of {v}. To show (4), it suffices to show the inequality

\displaystyle r \leq |\pi_1(\Gamma_1)| + \dots + |\pi_k(\Gamma_k)| \ \ \ \ \ (6)

 

for any covering of {\Gamma} by {\Gamma_1,\dots,\Gamma_k}. By removing repeated elements we may assume that the {\Gamma_i} are disjoint. For each {1 \leq j \leq k}, the tensor

\displaystyle \sum_{(s_1,\dots,s_k) \in \Gamma_j} c_{s_1,\dots,s_k} e_{s_1} \otimes \dots \otimes e_{s_k}

can (after collecting terms) be written as

\displaystyle \sum_{s_j \in \pi_j(\Gamma_j)} e_{s_j} \otimes_j v_{\hat j,s_j}

for some {v_{\hat j, s_j} \in \bigotimes_{1 \leq i \leq k: i \neq j} {\bf F}^{|S_i|}}. Summing and using (1), we conclude the inequality (6).

Now assume that the {c_{s_1,\dots,s_k}} are all non-zero and that {\Gamma'} is the set of maximal elements of {\Gamma}. To conclude the proposition, it suffices to show that the reverse inequality

\displaystyle r \geq |\pi_1(\Gamma_1)| + \dots + |\pi_k(\Gamma_k)| \ \ \ \ \ (7)

 

holds for some {\Gamma_1,\dots,\Gamma_k} covering {\Gamma'}. By Lemma 1(iv), there exist subspaces {W_j} of {({\bf F}^{|S_j|})^*} whose dimensions {d_j := \hbox{dim}(W_j)} sum to

\displaystyle \sum_{j=1}^k d_j = \sum_{j=1}^k |S_j| - r \ \ \ \ \ (8)

 

such that {v} is orthogonal to {\bigotimes_{j=1}^k W_j}.

Let {1 \leq j \leq k}. Using Gaussian elimination, one can find a basis {w_{j,1},\dots,w_{j,d_j}} of {W_j} whose representation in the standard dual basis {e^*_{1},\dots,e^*_{|S_j|}} of {({\bf F}^{|S_j|})^*} is in row-echelon form. That is to say, there exist natural numbers

\displaystyle 1 \leq s_{j,1} < \dots < s_{j,d_j} \leq |S_j|

such that for all {1 \leq t \leq d_j}, {w_{j,t}} is a linear combination of the dual vectors {e^*_{s_{j,t}},\dots,e^*_{|S_j|}}, with the {e^*_{s_{j,t}}} coefficient equal to one.

We now claim that {\prod_{j=1}^k \{ s_{j,t}: 1 \leq t \leq d_j \}} is disjoint from {\Gamma'}. Suppose for contradiction that this were not the case, thus there exists {1 \leq t_j \leq d_j} for each {1 \leq j \leq k} such that

\displaystyle (s_{1,t_1}, \dots, s_{k,t_k}) \in \Gamma'.

As {\Gamma'} is the set of maximal elements of {\Gamma}, this implies that

\displaystyle (s'_1,\dots,s'_k) \not \in \Gamma

for any tuple {(s'_1,\dots,s'_k) \in \prod_{j=1}^k \{ s_{j,t_j}, \dots, |S_j|\}} other than {(s_{1,t_1}, \dots, s_{k,t_k})}. On the other hand, we know that {w_{j,t_j}} is a linear combination of {e^*_{s_{j,t_j}},\dots,e^*_{|S_j|}}, with the {e^*_{s_{j,t_j}}} coefficient one. We conclude that the tensor product {\bigotimes_{j=1}^k w_{j,t_j}} is equal to

\displaystyle \bigotimes_{j=1}^k e^*_{s_{j,t_j}}

plus a linear combination of other tensor products {\bigotimes_{j=1}^k e^*_{s'_j}} with {(s'_1,\dots,s'_k)} not in {\Gamma}. Taking inner products with (3), we conclude that {\langle v, \bigotimes_{j=1}^k w_{j,t_j}\rangle = c_{s_{1,t_1},\dots,s_{k,t_k}} \neq 0}, contradicting the fact that {v} is orthogonal to {\bigotimes_{j=1}^k W_j}. Thus we have {\prod_{j=1}^k \{ s_{j,t}: 1 \leq t \leq d_j \}} disjoint from {\Gamma'}.

For each {1 \leq j \leq k}, let {\Gamma_j} denote the set of tuples {(s_1,\dots,s_k)} in {\Gamma'} with {s_j} not lying in the set {\{ s_{j,t}: 1 \leq t \leq d_j \}}. From the previous discussion we see that the {\Gamma_j} cover {\Gamma'}, and we clearly have {|\pi_j(\Gamma_j)| \leq |S_j| - d_j}, and hence from (8) we have (7) as claimed. \Box

As an instance of this proposition, we recover the computation of diagonal rank from the previous blog post:

Example 5 Let {V_1,\dots,V_k} be finite-dimensional vector spaces over a field {{\bf F}} for some {k \geq 2}. Let {d} be a natural number, and for {1 \leq j \leq k}, let {e_{j,1},\dots,e_{j,d}} be a linearly independent set in {V_j}. Let {c_1,\dots,c_d} be non-zero coefficients in {{\bf F}}. Then

\displaystyle \sum_{t=1}^d c_t e_{1,t} \otimes \dots \otimes e_{k,t}

has rank {d}. Indeed, one applies the proposition with {S_1,\dots,S_k} all equal to {\{1,\dots,d\}}, with {\Gamma} the diagonal in {S_1 \times \dots \times S_k}; this is an antichain if we give one of the {S_i} the standard ordering, and another of the {S_i} the opposite ordering (and ordering the remaining {S_i} arbitrarily). In this case, the {\pi_j} are all bijective, and so it is clear that the minimum in (4) is simply {d}.
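For tiny examples, the right-hand sides of (4) and (5) can be brute-forced directly, since any covering may be refined to an assignment of each element of {\Gamma} to a single class. Here is a minimal illustrative sketch (the function name is ad hoc) confirming the value {d} in Example 5:

```python
# Brute-force the covering minimisation in (4): minimise
# |π_1(Γ_1)| + ... + |π_k(Γ_k)| over assignments of each point of Γ to
# one of k classes (refining a covering never increases the sum).
from itertools import product

def covering_min(gamma, k):
    gamma = list(gamma)
    best = float('inf')
    for assignment in product(range(k), repeat=len(gamma)):
        parts = [set() for _ in range(k)]
        for point, j in zip(gamma, assignment):
            parts[j].add(point[j])        # record the j-th projection
        best = min(best, sum(len(p) for p in parts))
    return best

# the diagonal in {0,1,2}^3, as in Example 5: the minimum is d = 3
diag = [(t, t, t) for t in range(3)]
print(covering_min(diag, 3))  # 3
```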

The combinatorial minimisation problem in the above proposition can be solved asymptotically when working with tensor powers, using the notion of the Shannon entropy {h(X)} of a discrete random variable {X}.

Proposition 6 Let {V_1,\dots,V_k} be finite-dimensional vector spaces over a field {{\bf F}}. For each {1 \leq j \leq k}, let {(v_{j,s})_{s \in S_j}} be a linearly independent set in {V_j} indexed by some finite set {S_j}. Let {\Gamma} be a non-empty subset of {S_1 \times \dots \times S_k}.

Let {v \in \bigotimes_{j=1}^k V_j} be a tensor of the form (3) for some coefficients {c_{s_1,\dots,s_k}}. For each natural number {n}, let {v^{\otimes n}} be the tensor power of {n} copies of {v}, viewed as an element of {\bigotimes_{j=1}^k V_j^{\otimes n}}. Then

\displaystyle \hbox{rank}(v^{\otimes n}) \leq \exp( (H + o(1)) n ) \ \ \ \ \ (9)

 

as {n \rightarrow \infty}, where {H} is the quantity

\displaystyle H = \hbox{sup}_{(X_1,\dots,X_k)} \hbox{min}( h(X_1), \dots, h(X_k) ) \ \ \ \ \ (10)

 

and {(X_1,\dots,X_k)} range over the random variables taking values in {\Gamma}.

Now suppose that the coefficients {c_{s_1,\dots,s_k}} are all non-zero and that each of the {S_j} are equipped with a total ordering {\leq_j}. Let {\Gamma'} be the set of maximal elements of {\Gamma} in the product ordering, and let {H' = \hbox{sup}_{(X_1,\dots,X_k)} \hbox{min}( h(X_1), \dots, h(X_k) ) } where {(X_1,\dots,X_k)} range over random variables taking values in {\Gamma'}. Then

\displaystyle \hbox{rank}(v^{\otimes n}) \geq \exp( (H' + o(1)) n ) \ \ \ \ \ (11)

 

as {n \rightarrow \infty}. In particular, if the maximizer in (10) is supported on the maximal elements of {\Gamma} (which always holds if {\Gamma} is an antichain in the product ordering), then equality holds in (9).

Proof: It will suffice to show that

\displaystyle \min_{\Gamma^n = \Gamma_{n,1} \cup \dots \cup \Gamma_{n,k}} |\pi_{n,1}(\Gamma_{n,1})| + \dots + |\pi_{n,k}(\Gamma_{n,k})| = \exp( (H + o(1)) n ) \ \ \ \ \ (12)

 

as {n \rightarrow \infty}, where {\pi_{n,j}: \prod_{i=1}^k S_i^n \rightarrow S_j^n} is the projection map; the same statement then applies with {\Gamma} and {H} replaced by {\Gamma'} and {H'}. Applying Proposition 4, using the lexicographical ordering on {S_j^n} and noting that, if {\Gamma'} is the set of maximal elements of {\Gamma}, then {\Gamma'^n} is the set of maximal elements of {\Gamma^n}, we then obtain both (9) and (11).

We first prove the lower bound. By compactness (and the continuity properties of entropy), we can find a random variable {(X_1,\dots,X_k)} taking values in {\Gamma} such that

\displaystyle H = \hbox{min}( h(X_1), \dots, h(X_k) ). \ \ \ \ \ (13)

 

Let {\varepsilon = o(1)} be a small positive quantity that goes to zero sufficiently slowly with {n}. Let {\Sigma = \Sigma_{X_1,\dots,X_k} \subset \Gamma^n} denote the set of all tuples {(a_1, \dots, a_n)} in {\Gamma^n} that are within {\varepsilon} of being distributed according to the law of {(X_1,\dots,X_k)}, in the sense that for all {a \in \Gamma}, one has

\displaystyle |\frac{|\{ 1 \leq l \leq n: a_l = a \}|}{n} - {\bf P}( (X_1,\dots,X_k) = a )| \leq \varepsilon.

By the asymptotic equipartition property, the cardinality of {\Sigma} can be computed to be

\displaystyle |\Sigma| = \exp( (h( X_1,\dots,X_k)+o(1)) n ) \ \ \ \ \ (14)

 

if {\varepsilon} goes to zero slowly enough. Similarly one has

\displaystyle |\pi_{n,j}(\Sigma)| = \exp( (h( X_j)+o(1)) n ),

and for each {s_{n,j} \in \pi_{n,j}(\Sigma)}, one has

\displaystyle |\{ \sigma \in \Sigma: \pi_{n,j}(\sigma) = s_{n,j} \}| \leq \exp( (h( X_1,\dots,X_k)-h(X_j)+o(1)) n ). \ \ \ \ \ (15)

 

Now let {\Gamma^n = \Gamma_{n,1} \cup \dots \cup \Gamma_{n,k}} be an arbitrary covering of {\Gamma^n}. By the pigeonhole principle, there exists {1 \leq j \leq k} such that

\displaystyle |\Gamma_{n,j} \cap \Sigma| \geq \frac{1}{k} |\Sigma|

and hence by (14), (15)

\displaystyle |\pi_{n,j}( \Gamma_{n,j} \cap \Sigma)| \geq \frac{1}{k} \exp( (h( X_j)+o(1)) n )

which by (13) implies that

\displaystyle |\pi_{n,1}(\Gamma_{n,1})| + \dots + |\pi_{n,k}(\Gamma_{n,k})| \geq \exp( (H + o(1)) n )

(noting that the {\frac{1}{k}} factor can be absorbed into the {o(1)} error). This gives the lower bound in (12).

Now we prove the upper bound. We can cover {\Gamma^n} by {O(\exp(o(n)))} sets of the form {\Sigma_{X_1,\dots,X_k}} for various choices of random variables {(X_1,\dots,X_k)} taking values in {\Gamma}. For each such random variable {(X_1,\dots,X_k)}, we can find {1 \leq j \leq k} such that {h(X_j) \leq H}; we then place all of {\Sigma_{X_1,\dots,X_k}} in {\Gamma_{n,j}}. It is then clear that the {\Gamma_{n,j}} cover {\Gamma^n} and that

\displaystyle |\pi_{n,j}(\Gamma_{n,j})| \leq \exp( (H+o(1)) n )

for all {j=1,\dots,k}, giving the required upper bound. \Box

It is of interest to compute the quantity {H} in (10). We have the following criterion for when a maximiser occurs:

Proposition 7 Let {S_1,\dots,S_k} be finite sets, and {\Gamma \subset S_1 \times \dots \times S_k} be non-empty. Let {H} be the quantity in (10). Let {(X_1,\dots,X_k)} be a random variable taking values in {\Gamma}, and let {\Gamma^* \subset \Gamma} denote the essential range of {(X_1,\dots,X_k)}, that is to say the set of tuples {(t_1,\dots,t_k)\in \Gamma} such that {{\bf P}( X_1=t_1, \dots, X_k = t_k)} is non-zero. Then the following are equivalent:

  • (i) {(X_1,\dots,X_k)} attains the maximum in (10).
  • (ii) There exist weights {w_1,\dots,w_k \geq 0} and a finite quantity {D \geq 0}, such that {w_j=0} whenever {h(X_j) > \min(h(X_1),\dots,h(X_k))}, and such that

    \displaystyle \sum_{j=1}^k w_j \log \frac{1}{{\bf P}(X_j = t_j)} \leq D \ \ \ \ \ (16)

for all {(t_1,\dots,t_k) \in \Gamma}, with equality if {(t_1,\dots,t_k) \in \Gamma^*}. (In particular, {w_j} must vanish if there exists a {t_j \in \pi_j(\Gamma)} with {{\bf P}(X_j=t_j)=0}.)

Furthermore, when (i) and (ii) hold, one has

\displaystyle D = H \sum_{j=1}^k w_j. \ \ \ \ \ (17)

 

Proof: We first show that (i) implies (ii). The function {p \mapsto p \log \frac{1}{p}} is concave on {[0,1]}. As a consequence, if we define {C} to be the set of tuples {(h_1,\dots,h_k) \in [0,+\infty)^k} such that there exists a random variable {(Y_1,\dots,Y_k)} taking values in {\Gamma} with {h(Y_j)=h_j}, then {C} is convex. On the other hand, by (10), {C} is disjoint from the orthant {(H,+\infty)^k}. Thus, by the hyperplane separation theorem, we conclude that there exists a half-space

\displaystyle \{ (h_1,\dots,h_k) \in {\bf R}^k: w_1 h_1 + \dots + w_k h_k \geq c \},

where {w_1,\dots,w_k} are reals that are not all zero, and {c} is another real, which contains {(h(X_1),\dots,h(X_k))} on its boundary and {(H,+\infty)^k} in its interior, such that {C} avoids the interior of the half-space. Since {(h(X_1),\dots,h(X_k))} is also on the boundary of {(H,+\infty)^k}, we see that the {w_j} are non-negative, and that {w_j = 0} whenever {h(X_j) \neq H}.

By construction, the quantity

\displaystyle w_1 h(Y_1) + \dots + w_k h(Y_k)

is maximised when {(Y_1,\dots,Y_k) = (X_1,\dots,X_k)}. At this point we could use the method of Lagrange multipliers to obtain the required constraints, but because we have some boundary conditions on the {(Y_1,\dots,Y_k)} (namely, that the probability that they attain a given element of {\Gamma} has to be non-negative) we will work things out by hand. Let {t = (t_1,\dots,t_k)} be an element of {\Gamma}, and {s = (s_1,\dots,s_k)} an element of {\Gamma^*}. For {\varepsilon>0} small enough, we can form a random variable {(Y_1,\dots,Y_k)} taking values in {\Gamma}, whose probability distribution is the same as that for {(X_1,\dots,X_k)} except that the probability of attaining {(t_1,\dots,t_k)} is increased by {\varepsilon}, and the probability of attaining {(s_1,\dots,s_k)} is decreased by {\varepsilon}. If there is any {j} for which {{\bf P}(X_j = t_j)=0} and {w_j \neq 0}, then one can check that

\displaystyle w_1 h(Y_1) + \dots + w_k h(Y_k) - (w_1 h(X_1) + \dots + w_k h(X_k)) \gg \varepsilon \log \frac{1}{\varepsilon}

for sufficiently small {\varepsilon}, contradicting the maximality of {(X_1,\dots,X_k)}; thus we have {{\bf P}(X_j = t_j) > 0} whenever {w_j \neq 0}. Taylor expansion then gives

\displaystyle w_1 h(Y_1) + \dots + w_k h(Y_k) - (w_1 h(X_1) + \dots + w_k h(X_k)) = (A_t - A_s) \varepsilon + O(\varepsilon^2)

for small {\varepsilon}, where

\displaystyle A_t := \sum_{j=1}^k w_j \log \frac{1}{{\bf P}(X_j = t_j)}

and similarly for {A_s}. We conclude that {A_t \leq A_s} for all {s \in \Gamma^*} and {t \in \Gamma}, thus there exists a quantity {D} such that {A_s = D} for all {s \in \Gamma^*}, and {A_t \leq D} for all {t \in \Gamma}. By construction {D} must be nonnegative. Sampling {(t_1,\dots,t_k)} using the distribution of {(X_1,\dots,X_k)}, one has

\displaystyle \sum_{j=1}^k w_j \log \frac{1}{{\bf P}(X_j = t_j)} = D

almost surely; taking expectations we conclude that

\displaystyle \sum_{j=1}^k w_j \sum_{t_j \in S_j} {\bf P}( X_j = t_j) \log \frac{1}{{\bf P}(X_j = t_j)} = D.

The inner sum is {h(X_j)}, which equals {H} when {w_j} is non-zero, giving (17).

Now we show conversely that (ii) implies (i). As noted previously, the function {p \mapsto p \log \frac{1}{p}} is concave on {[0,1]}, with derivative {\log \frac{1}{p} - 1}. This gives the inequality

\displaystyle q \log \frac{1}{q} \leq p \log \frac{1}{p} + (q-p) ( \log \frac{1}{p} - 1 ) \ \ \ \ \ (18)

 

for any {0 \leq p,q \leq 1} (note the right-hand side may be infinite when {p=0} and {q>0}). Let {(Y_1,\dots,Y_k)} be any random variable taking values in {\Gamma}. Applying the above inequality with {p = {\bf P}(X_j = t_j)} and {q = {\bf P}( Y_j = t_j )}, multiplying by {w_j}, and summing over {j=1,\dots,k} and {t_j \in S_j} gives

\displaystyle \sum_{j=1}^k w_j h(Y_j) \leq \sum_{j=1}^k w_j h(X_j)

\displaystyle + \sum_{j=1}^k \sum_{t_j \in S_j} w_j ({\bf P}(Y_j = t_j) - {\bf P}(X_j = t_j)) ( \log \frac{1}{{\bf P}(X_j=t_j)} - 1 ).

By construction, one has

\displaystyle \sum_{j=1}^k w_j h(X_j) = \min(h(X_1),\dots,h(X_k)) \sum_{j=1}^k w_j

and

\displaystyle \sum_{j=1}^k w_j h(Y_j) \geq \min(h(Y_1),\dots,h(Y_k)) \sum_{j=1}^k w_j

so to prove that {\min(h(Y_1),\dots,h(Y_k)) \leq \min(h(X_1),\dots,h(X_k))} (which would give (i)), it suffices to show that

\displaystyle \sum_{j=1}^k \sum_{t_j \in S_j} w_j ({\bf P}(Y_j = t_j) - {\bf P}(X_j = t_j)) ( \log \frac{1}{{\bf P}(X_j=t_j)} - 1 ) \leq 0,

or equivalently that the quantity

\displaystyle \sum_{j=1}^k \sum_{t_j \in S_j} w_j {\bf P}(Y_j = t_j) ( \log \frac{1}{{\bf P}(X_j=t_j)} - 1 )

is maximised when {(Y_1,\dots,Y_k) = (X_1,\dots,X_k)}. Since

\displaystyle \sum_{j=1}^k \sum_{t_j \in S_j} w_j {\bf P}(Y_j = t_j) = \sum_{j=1}^k w_j

it suffices to show this claim for the quantity

\displaystyle \sum_{j=1}^k \sum_{t_j \in S_j} w_j {\bf P}(Y_j = t_j) \log \frac{1}{{\bf P}(X_j=t_j)}.

One can view this quantity as

\displaystyle {\bf E}_{(Y_1,\dots,Y_k)} \sum_{j=1}^k w_j \log \frac{1}{{\bf P}_{X_j}(X_j=Y_j)}.

By (ii), this quantity is bounded by {D}, with equality if {(Y_1,\dots,Y_k)} is equal to {(X_1,\dots,X_k)} (and is in particular ranging in {\Gamma^*}), giving the claim. \Box

The second half of the proof of Proposition 7 only uses the marginal distributions {{\bf P}(X_j=t_j)} and equation (16), not the actual distribution of {(X_1,\dots,X_k)}, so it can also be used to prove an upper bound on {H} when the exact maximizing distribution is not known, given suitable probability distributions in each variable. The logarithm of the probability distribution here plays the role that the weight functions do in BCCGNSU.

Remark 8 Suppose one is in the situation of (i) and (ii) above; assume the nondegeneracy condition that {H} is positive (or equivalently that {D} is positive). We can assign a “degree” {d_j(t_j)} to each element {t_j \in S_j} by the formula

\displaystyle d_j(t_j) := w_j \log \frac{1}{{\bf P}(X_j = t_j)}, \ \ \ \ \ (19)

 

then every tuple {(t_1,\dots,t_k)} in {\Gamma} has total degree at most {D}, and those tuples in {\Gamma^*} have degree exactly {D}. In particular, every tuple in {\Gamma^n} has degree at most {nD}, and hence by (17), each such tuple has a {j}-component of degree less than or equal to {nHw_j} for some {j} with {w_j>0}. On the other hand, we can compute from (19) and the fact that {h(X_j) = H} for {w_j > 0} that {Hw_j = {\bf E} d_j(X_j)}. Thus, by asymptotic equipartition, and assuming {w_j \neq 0}, the number of “monomials” in {S_j^n} of total degree at most {nHw_j} is at most {\exp( (h(X_j)+o(1)) n )}; one can in fact use (19) and (18) to show that this is in fact an equality. This gives a direct way to cover {\Gamma^n} by sets {\Gamma_{n,1},\dots,\Gamma_{n,k}} with {|\pi_j(\Gamma_{n,j})| \leq \exp( (H+o(1)) n)}, which is in the spirit of the Croot-Lev-Pach-Ellenberg-Gijswijt arguments from the previous post.

We can now show that the rank computation for the capset problem is sharp:

Proposition 9 Let {V_1^{\otimes n} = V_2^{\otimes n} = V_3^{\otimes n}} denote the space of functions from {{\bf F}_3^n} to {{\bf F}_3}. Then the function {(x,y,z) \mapsto \delta_{0^n}(x+y+z)} from {{\bf F}_3^n \times {\bf F}_3^n \times {\bf F}_3^n} to {{\bf F}_3}, viewed as an element of {V_1^{\otimes n} \otimes V_2^{\otimes n} \otimes V_3^{\otimes n}}, has rank {\exp( (H^*+o(1)) n )} as {n \rightarrow \infty}, where {H^* \approx 1.013455} is given by the formula

\displaystyle H^* = \alpha \log \frac{1}{\alpha} + \beta \log \frac{1}{\beta} + \gamma \log \frac{1}{\gamma} \ \ \ \ \ (20)

 

with

\displaystyle \alpha = \frac{32}{3(15 + \sqrt{33})} \approx 0.51419

\displaystyle \beta = \frac{4(\sqrt{33}-1)}{3(15+\sqrt{33})} \approx 0.30495

\displaystyle \gamma = \frac{(\sqrt{33}-1)^2}{6(15+\sqrt{33})} \approx 0.18086.

Proof: In {{\bf F}_3 \times {\bf F}_3 \times {\bf F}_3}, we have

\displaystyle \delta_0(x+y+z) = 1 - (x+y+z)^2

\displaystyle = (1-x^2) - y^2 - z^2 + xy + yz + zx.

Thus, if we let {V_1=V_2=V_3} be the space of functions from {{\bf F}_3} to {{\bf F}_3} (with domain variable denoted {x,y,z} respectively), and define the basis functions

\displaystyle v_{1,0} := 1; v_{1,1} := x; v_{1,2} := x^2

\displaystyle v_{2,0} := 1; v_{2,1} := y; v_{2,2} := y^2

\displaystyle v_{3,0} := 1; v_{3,1} := z; v_{3,2} := z^2

of {V_1,V_2,V_3} indexed by {S_1=S_2=S_3 := \{ 0,1,2\}} (with the usual ordering), respectively, and set {\Gamma \subset S_1 \times S_2 \times S_3} to be the set

\displaystyle \{ (2,0,0), (0,2,0), (0,0,2), (1,1,0), (0,1,1), (1,0,1),(0,0,0) \}

then {\delta_0(x+y+z)} is a linear combination of the {v_{1,t_1} \otimes v_{2,t_2} \otimes v_{3,t_3}} with {(t_1,t_2,t_3) \in \Gamma}, and all coefficients non-zero. Then we have {\Gamma'= \{ (2,0,0), (0,2,0), (0,0,2), (1,1,0), (0,1,1), (1,0,1) \}}. We will show that the quantity {H} of (10) agrees with the quantity {H^*} of (20), and that the optimizing distribution is supported on {\Gamma'}, so that by Proposition 6 the rank of {\delta_{0^n}(x+y+z)} is {\exp( (H+o(1)) n)}.

To compute the quantity at (10), we use the criterion in Proposition 7. We take {(X_1,X_2,X_3)} to be the random variable taking values in {\Gamma} that attains each of the values {(2,0,0), (0,2,0), (0,0,2)} with a probability of {\gamma \approx 0.18086}, and each of {(1,1,0), (0,1,1), (1,0,1)} with a probability of {\alpha - 2\gamma = \beta/2 \approx 0.15247}; then each of the {X_j} attains the values of {0,1,2} with probabilities {\alpha,\beta,\gamma} respectively, so in particular {h(X_1)=h(X_2)=h(X_3)} is equal to the quantity {H^*} in (20). If we now set {w_1 = w_2 = w_3 := 1} and

\displaystyle D := 2\log \frac{1}{\alpha} + \log \frac{1}{\gamma} = \log \frac{1}{\alpha} + 2 \log \frac{1}{\beta} = 3H^* \approx 3.04036

we can verify the condition (16) with equality for all {(t_1,t_2,t_3) \in \Gamma'}, which from (17) gives {H=H'=H^*} as desired. \Box
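The numerology in this proof is easy to corroborate numerically. The following sketch (a sanity check only; names are ad hoc) verifies that condition (16) holds with equality on {\Gamma'} and with strict inequality at {(0,0,0)}:

```python
# Check of condition (16) for the distribution in the proof: with
# w_1 = w_2 = w_3 = 1, the sum Σ_j log(1/P(X_j = t_j)) equals D on Γ'
# and is strictly smaller at the remaining element (0,0,0) of Γ.
from math import sqrt, log

s = sqrt(33)
alpha = 32 / (3 * (15 + s))
beta  = 4 * (s - 1) / (3 * (15 + s))
gamma = (s - 1) ** 2 / (6 * (15 + s))
p = {0: alpha, 1: beta, 2: gamma}      # marginal law of each X_j

Gamma = [(2,0,0), (0,2,0), (0,0,2), (1,1,0), (0,1,1), (1,0,1), (0,0,0)]
D = 2 * log(1 / alpha) + log(1 / gamma)    # = 3 H* ≈ 3.040366

for t in Gamma:
    A_t = sum(log(1 / p[tj]) for tj in t)
    if t == (0, 0, 0):
        assert A_t < D                  # strict inequality off Γ'
    else:
        assert abs(A_t - D) < 1e-9      # equality on Γ'
print("condition (16) verified numerically")
```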

This statement already follows from the result of Kleinberg-Sawin-Speyer, which gives a “tri-colored sum-free set” in {\mathbb F_3^n} of size {\exp((H^*+o(1))n)}, as the slice rank of this tensor is an upper bound for the size of a tri-colored sum-free set. If one were to go over the proofs more carefully to evaluate the subexponential factors, this argument would give a stronger lower bound than KSS, as it does not incur the substantial loss that comes from Behrend’s construction. However, because it actually constructs a set, the KSS result rules out more possible approaches to give an exponential improvement of the upper bound for capsets. The lower bound on slice rank shows that the bound cannot be improved using only the slice rank of this particular tensor, whereas KSS shows that the bound cannot be improved using any method that does not take advantage of the “single-colored” nature of the problem.

We can also show that the slice rank upper bound in a result of Naslund-Sawin is similarly sharp:

Proposition 10 Let {V_1^{\otimes n} = V_2^{\otimes n} = V_3^{\otimes n}} denote the space of functions from {\{0,1\}^n} to {\mathbb C}. Then the function {(x,y,z) \mapsto \prod_{i=1}^n (x_i+y_i+z_i-1)} from {\{0,1\}^n \times \{0,1\}^n \times \{0,1\}^n \rightarrow \mathbb C}, viewed as an element of {V_1^{\otimes n} \otimes V_2^{\otimes n} \otimes V_3^{\otimes n}}, has slice rank {(3/2^{2/3})^n e^{o(n)}}.

Proof: Let {v_{1,0}=1} and {v_{1,1}=x} be a basis for the space {V_1} of functions on {\{0,1\}}, itself indexed by {S_1=\{0,1\}}. Choose similar bases for {V_2} and {V_3}, with {v_{2,0}=1, v_{2,1}=y} and {v_{3,0}=1,v_{3,1}=z-1}.

Set {\Gamma = \{(1,0,0),(0,1,0),(0,0,1)\}}. Then {x+y+z-1} is a linear combination of the {v_{1,t_1} \otimes v_{2,t_2} \otimes v_{3,t_3}} with {(t_1,t_2,t_3) \in \Gamma}, and all coefficients non-zero. Order {S_1,S_2,S_3} the usual way so that {\Gamma} is an antichain. We will show that the quantity {H} of (10) is {\log(3/2^{2/3})}, so that applying the last statement of Proposition 6, we conclude that the rank of {\prod_{i=1}^n (x_i+y_i+z_i-1)} is {\exp( (\log(3/2^{2/3})+o(1)) n)= (3/2^{2/3})^n e^{o(n)}}.

Let {(X_1,X_2,X_3)} be the random variable taking values in {\Gamma} that attains each of the values {(1,0,0),(0,1,0),(0,0,1)} with a probability of {1/3}. Then each of the {X_i} attains the value {1} with probability {1/3} and {0} with probability {2/3}, so

\displaystyle h(X_1)=h(X_2)=h(X_3) = (1/3) \log (3) + (2/3) \log(3/2) = \log 3 - (2/3) \log 2= \log (3/2^{2/3}).

Setting {w_1=w_2=w_3=1} and {D=3 \log(3/2^{2/3})=3 \log 3 - 2 \log 2}, we can verify the condition (16) with equality for all {(t_1,t_2,t_3) \in \Gamma'}, which from (17) gives {H=\log (3/2^{2/3})} as desired. \Box

We used a slightly different method in each of the last two results. In the first one, we use the most natural bases for all three vector spaces, and distinguish {\Gamma} from its set of maximal elements {\Gamma'}. In the second one we modify one basis element slightly, with {v_{3,1}=z-1} instead of the more obvious choice {z}, which allows us to work with {\Gamma = \{(1,0,0),(0,1,0),(0,0,1)\}} instead of {\Gamma=\{(1,0,0),(0,1,0),(0,0,1),(0,0,0)\}}. Because {\Gamma} is an antichain, we do not need to distinguish {\Gamma} and {\Gamma'}. Both methods in fact work with either problem, and they are both about equally difficult, but we include both as either might turn out to be substantially more convenient in future work.

Proposition 11 Let {k \geq 8} be a natural number and let {G} be a finite abelian group. Let {{\bf F}} be any field. Let {V_1 = \dots = V_k} denote the space of functions from {G} to {{\bf F}}.

Let {F} be any {{\bf F}}-valued function on {G^k} that is nonzero only when its {k} coordinates form a {k}-term arithmetic progression in {G}, and is nonzero on every {k}-term constant progression.

Then the slice rank of {F} is {|G|}.

Proof: We apply Proposition 4, using the standard bases of {V_1,\dots,V_k}. Let {\Gamma} be the support of {F}. Suppose that we have {k} orderings on {G} such that the constant progressions are maximal elements of {\Gamma} and thus all constant progressions lie in {\Gamma'}. Then for any partition {\Gamma_1,\dots, \Gamma_k} of {\Gamma'}, {\Gamma_j} can contain at most {|\pi_j(\Gamma_j)|} constant progressions, and as all {|G|} constant progressions must lie in one of the {\Gamma_j}, we must have {\sum_{j=1}^k |\pi_j(\Gamma_j)| \geq |G|}. By Proposition 4, this implies that the slice rank of {F} is at least {|G|}. Since {F} is a {|G| \times \dots \times |G|} tensor, the slice rank is at most {|G|}, hence exactly {|G|}.

So it is sufficient to find {k} orderings on {G} such that the constant progressions are maximal elements of {\Gamma}. We make several simplifying reductions: We may as well assume that {\Gamma} consists of all the {k}-term arithmetic progressions, because if the constant progressions are maximal among the set of all progressions then they are maximal among its subset {\Gamma}. So we are looking for orderings in which the constant progressions are maximal among all {k}-term arithmetic progressions. We may as well assume that {G} is cyclic, because if for each cyclic group we have an ordering where constant progressions are maximal, on an arbitrary finite abelian group the lexicographic product of these orderings is an ordering for which the constant progressions are maximal. We may assume {k=8}, as if we have an {8}-tuple of orderings where constant progressions are maximal, we may add arbitrary orderings and the constant progressions will remain maximal.

So it is sufficient to find {8} orderings on the cyclic group {\mathbb Z/n} such that the constant progressions are maximal elements of the set of {8}-term progressions in {\mathbb Z/n} in the {8}-fold product ordering. To do that, let the fourth, sixth, seventh, and eighth orderings be the usual order on {\{0,\dots,n-1\}} and let the first, second, third, and fifth orderings be the reverse of the usual order on {\{0,\dots,n-1\}}.

Then let {(c,c,c,c,c,c,c,c)} be a constant progression and for contradiction assume that {(a,a+b,a+2b,a+3b,a+4b,a+5b,a+6b,a+7b)} is a progression greater than {(c,c,c,c,c,c,c,c)} in this ordering. We may assume that {c \in [0, (n-1)/2]}, because otherwise we may reverse the order of the progression, which has the effect of reversing all eight orderings, and then apply the transformation {x \rightarrow n-1-x}, which again reverses the eight orderings, bringing us back to the original problem but with {c \in [0,(n-1)/2]}.

Take a representative of the residue class {b} in the interval {[-n/2,n/2]}. We will abuse notation and call this {b}. Observe that {a}, {a+b}, {a+2b}, and {a+4b} are all contained in the interval {[0,c]} modulo {n}, as these are the coordinates given the reversed ordering. Take a representative of the residue class {a} in the interval {[0,c]}. Then {a+b} is in the interval {[mn,mn+c]} for some {m}. The distance between any distinct pair of intervals of this type is greater than {n/2}, but the distance between {a} and {a+b} is at most {n/2}, so {a+b} is in the interval {[0,c]}. By the same reasoning, {a+2b} is in the interval {[0,c]}. Therefore {|b| \leq c/2< n/4}. But then the distance between {a+2b} and {a+4b} is at most {n/2}, so by the same reasoning {a+4b} is in the interval {[0,c]}. Because {a+3b} is between {a+2b} and {a+4b}, it also lies in the interval {[0,c]}. Because {a+3b} is in the interval {[0,c]}, and by assumption it is congruent mod {n} to a number in the set {\{0,\dots,n-1\}} greater than or equal to {c} (the fourth ordering being the usual one), it must be exactly {c}. Then, remembering that {a+2b = c-b} and {a+4b = c+b} lie in {[0,c]}, we have {c-b \leq c} and {c+b \leq c}, so {b=0}, hence {a=c}, thus {(a,\dots,a+7b)=(c,\dots,c)}, which contradicts the assumption that {(a,\dots,a+7b)>(c,\dots,c)}. \Box
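For small moduli, the conclusion of this argument can also be confirmed by exhaustive search. The following illustrative sketch (names are ad hoc) checks that, with the eight orders chosen above, no {8}-term progression in {\mathbb Z/n} strictly dominates a constant progression in the product ordering:

```python
# Brute-force check (small n) that, with the eight orders in the proof,
# no 8-term progression in Z/n strictly dominates a constant progression
# in the product order.  A sanity check, not a proof.
def dominates(p, q, reversed_coords):
    ge = all((x <= y) if j in reversed_coords else (x >= y)
             for j, (x, y) in enumerate(zip(p, q)))
    return ge and p != q

REV = {0, 1, 2, 4}   # first, second, third, fifth orders reversed (0-indexed)

for n in range(2, 30):
    for c in range(n):
        const = tuple([c] * 8)
        for a in range(n):
            for b in range(n):
                prog = tuple((a + j * b) % n for j in range(8))
                assert not dominates(prog, const, REV), (n, c, a, b)
print("no constant progression is dominated, as claimed")
```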

In fact, given a {k}-term progression mod {n} and a constant, we can form a {k}-term binary sequence with a {1} for each step of the progression that is greater than the constant and a {0} for each step that is less. Because a rotation map, viewed as a dynamical system, has zero topological entropy, the number of {k}-term binary sequences that appear grows subexponentially in {k}. Hence there must be, for large enough {k}, at least one sequence that does not appear. In this proof we exploit a sequence that does not appear for {k=8}.

A capset in the vector space {{\bf F}_3^n} over the finite field {{\bf F}_3} of three elements is a subset {A} of {{\bf F}_3^n} that does not contain any lines {\{ x,x+r,x+2r\}}, where {x,r \in {\bf F}_3^n} and {r \neq 0}. A basic problem in additive combinatorics (discussed in one of the very first posts on this blog) is to obtain good upper and lower bounds for the maximal size of a capset in {{\bf F}_3^n}.

Trivially, one has {|A| \leq 3^n}. Using Fourier methods (and the density increment argument of Roth), the bound of {|A| \leq O( 3^n / n )} was obtained by Meshulam, and improved only as late as 2012 to {O( 3^n /n^{1+c})} for some absolute constant {c>0} by Bateman and Katz. But in a very recent breakthrough, Ellenberg (and independently Gijswijt) obtained the exponentially superior bound {|A| \leq O( 2.756^n )}, using a version of the polynomial method recently introduced by Croot, Lev, and Pach. (In the converse direction, a construction of Edel gives capsets as large as {(2.2174)^n}.) Given the success of the polynomial method in superficially similar problems such as the finite field Kakeya problem (discussed in this previous post), it was natural to wonder whether this method could be applicable to the cap set problem (see for instance this MathOverflow comment of mine on this from 2010), but it took a surprisingly long time before Croot, Lev, and Pach were able to identify the precise variant of the polynomial method that would actually work here.

The proof of the capset bound is very short (Ellenberg’s and Gijswijt’s preprints are both 3 pages long, and Croot-Lev-Pach is 6 pages), but I thought I would present a slight reformulation of the argument which treats the three points on a line in {{\bf F}_3^n} symmetrically (as opposed to treating the third point differently from the first two, as is done in the Ellenberg and Gijswijt papers; Croot-Lev-Pach also treat the middle point of a three-term arithmetic progression differently from the two endpoints, although this is a very natural thing to do in their context of {({\bf Z}/4{\bf Z})^n}). The basic starting point is this: if {A} is a capset, then one has the identity

\displaystyle \delta_{0^n}( x+y+z ) = \sum_{a \in A} \delta_a(x) \delta_a(y) \delta_a(z) \ \ \ \ \ (1)

 

for all {(x,y,z) \in A^3}, where {\delta_a(x) := 1_{a=x}} is the Kronecker delta function, which we view as taking values in {{\bf F}_3}. Indeed, (1) reflects the fact that, for {x,y,z \in A}, the equation {x+y+z=0} holds precisely when {x,y,z} are either all equal, or form a line; the latter possibility is ruled out precisely when {A} is a capset.
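This dichotomy (and hence the identity (1)) is easy to verify by brute force in small dimension; here is a minimal illustrative sketch, using one standard maximal capset in {{\bf F}_3^2} (the particular choice of {A} is just an example):

```python
# Sanity check of identity (1) over F_3^n for a small capset: for
# (x,y,z) in A^3, x+y+z = 0 holds iff x = y = z, since A has no lines.
# (Indicator equality over the integers is equivalent to equality of the
# F_3-valued expressions in (1).)
from itertools import product

n = 2
A = [(0, 0), (0, 1), (1, 0), (1, 2)]   # a maximal capset in F_3^2

def add(u, v, w):
    return tuple((a + b + c) % 3 for a, b, c in zip(u, v, w))

for x, y, z in product(A, repeat=3):
    lhs = 1 if add(x, y, z) == (0,) * n else 0   # δ_{0^n}(x+y+z)
    rhs = 1 if (x == y == z) else 0              # Σ_a δ_a(x)δ_a(y)δ_a(z)
    assert lhs == rhs, (x, y, z)
```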

To exploit (1), we will show that the left-hand side of (1) is “low rank” in some sense, while the right-hand side is “high rank”. Recall that a function {F: A \times A \rightarrow {\bf F}} taking values in a field {{\bf F}} is of rank one if it is non-zero and of the form {(x,y) \mapsto f(x) g(y)} for some {f,g: A \rightarrow {\bf F}}, and that the rank of a general function {F: A \times A \rightarrow {\bf F}} is the least number of rank one functions needed to express {F} as a linear combination. More generally, if {k \geq 2}, we define the rank of a function {F: A^k \rightarrow {\bf F}} to be the least number of “rank one” functions of the form

\displaystyle (x_1,\dots,x_k) \mapsto f(x_i) g(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k)

for some {i=1,\dots,k} and some functions {f: A \rightarrow {\bf F}}, {g: A^{k-1} \rightarrow {\bf F}}, that are needed to generate {F} as a linear combination. For instance, when {k=3}, the rank one functions take the form {(x,y,z) \mapsto f(x) g(y,z)}, {(x,y,z) \mapsto f(y) g(x,z)}, {(x,y,z) \mapsto f(z) g(x,y)}, and linear combinations of {r} such rank one functions will give a function of rank at most {r}.

It is a standard fact in linear algebra that the rank of a diagonal matrix is equal to the number of non-zero entries. This phenomenon extends to higher dimensions:

Lemma 1 (Rank of diagonal hypermatrices) Let {k \geq 2}, let {A} be a finite set, let {{\bf F}} be a field, and for each {a \in A}, let {c_a \in {\bf F}} be a coefficient. Then the rank of the function

\displaystyle (x_1,\dots,x_k) \mapsto \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k) \ \ \ \ \ (2)

 

is equal to the number of non-zero coefficients {c_a}.

Proof: We induct on {k}. As mentioned above, the case {k=2} follows from standard linear algebra, so suppose now that {k>2} and the claim has already been proven for {k-1}.

It is clear that the function (2) has rank at most equal to the number of non-zero {c_a} (since the summands on the right-hand side are rank one functions), so it suffices to establish the lower bound. By deleting from {A} those elements {a \in A} with {c_a=0} (which cannot increase the rank), we may assume without loss of generality that all the {c_a} are non-zero. Now suppose for contradiction that (2) has rank at most {|A|-1}, then we obtain a representation

\displaystyle \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k)

\displaystyle = \sum_{i=1}^k \sum_{\alpha \in I_i} f_{i,\alpha}(x_i) g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) \ \ \ \ \ (3)

 

for some sets {I_1,\dots,I_k} of cardinalities adding up to at most {|A|-1}, and some functions {f_{i,\alpha}: A \rightarrow {\bf F}} and {g_{i,\alpha}: A^{k-1} \rightarrow {\bf F}}.

Consider the space of functions {h: A \rightarrow {\bf F}} that are orthogonal to all the {f_{k,\alpha}}, {\alpha \in I_k} in the sense that

\displaystyle \sum_{x \in A} f_{k,\alpha}(x) h(x) = 0

for all {\alpha \in I_k}. This space is a vector space whose dimension {d} is at least {|A| - |I_k|}. A basis of this space generates a {d \times |A|} coordinate matrix of full rank, which implies that there is at least one non-singular {d \times d} minor. This implies that there exists a function {h: A \rightarrow {\bf F}} in this space which is nowhere vanishing on some subset {A'} of {A} of cardinality at least {|A|-|I_k|}.

If we multiply (3) by {h(x_k)} and sum in {x_k}, we conclude that

\displaystyle \sum_{a \in A} c_a h(a) \delta_a(x_1) \dots \delta_a(x_{k-1})

\displaystyle = \sum_{i=1}^{k-1} \sum_{\alpha \in I_i} f_{i,\alpha}(x_i)\tilde g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})

where

\displaystyle \tilde g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})

\displaystyle := \sum_{x_k \in A} g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) h(x_k).

The right-hand side has rank at most {|A|-1-|I_k|}, since the summands are rank one functions. On the other hand, from the induction hypothesis the left-hand side has rank at least {|A|-|I_k|}, giving the required contradiction. \Box

On the other hand, we have the following (symmetrised version of a) beautifully simple observation of Croot, Lev, and Pach:

Lemma 2 On {({\bf F}_3^n)^3}, the rank of the function {(x,y,z) \mapsto \delta_{0^n}(x+y+z)} is at most {3N}, where

\displaystyle N := \sum_{a,b,c \geq 0: a+b+c=n, b+2c \leq 2n/3} \frac{n!}{a!b!c!}.

Proof: Using the identity {\delta_0(x) = 1 - x^2} for {x \in {\bf F}_3}, we have

\displaystyle \delta_{0^n}(x+y+z) = \prod_{i=1}^n (1 - (x_i+y_i+z_i)^2).

The right-hand side is clearly a polynomial of degree {2n} in {x,y,z}, which is then a linear combination of monomials

\displaystyle x_1^{i_1} \dots x_n^{i_n} y_1^{j_1} \dots y_n^{j_n} z_1^{k_1} \dots z_n^{k_n}

with {i_1,\dots,i_n,j_1,\dots,j_n,k_1,\dots,k_n \in \{0,1,2\}} with

\displaystyle i_1 + \dots + i_n + j_1 + \dots + j_n + k_1 + \dots + k_n \leq 2n.

In particular, from the pigeonhole principle, at least one of {i_1 + \dots + i_n, j_1 + \dots + j_n, k_1 + \dots + k_n} is at most {2n/3}.

Consider the contribution of the monomials for which {i_1 + \dots + i_n \leq 2n/3}. We can regroup this contribution as

\displaystyle \sum_\alpha f_\alpha(x) g_\alpha(y,z)

where {\alpha} ranges over those {(i_1,\dots,i_n) \in \{0,1,2\}^n} with {i_1 + \dots + i_n \leq 2n/3}, {f_\alpha} is the monomial

\displaystyle f_\alpha(x_1,\dots,x_n) := x_1^{i_1} \dots x_n^{i_n}

and {g_\alpha: {\bf F}_3^n \times {\bf F}_3^n \rightarrow {\bf F}_3} is some explicitly computable function whose exact form will not be of relevance to our argument. The number of such {\alpha} is equal to {N}, so this contribution has rank at most {N}. The remaining contributions arising from the cases {j_1 + \dots + j_n \leq 2n/3} and {k_1 + \dots + k_n \leq 2n/3} similarly have rank at most {N} (grouping the monomials so that each monomial is only counted once), so the claim follows.

Upon restricting from {({\bf F}_3^n)^3} to {A^3}, the rank of {(x,y,z) \mapsto \delta_{0^n}(x+y+z)} is still at most {3N}. The two lemmas then combine to give the Ellenberg-Gijswijt bound

\displaystyle |A| \leq 3N.
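Before turning to the asymptotics, one can evaluate {N} exactly for moderate {n}; the following minimal sketch does so and watches {N^{1/n}} approach the limiting value {\approx 2.755}:

```python
# Exact evaluation of N = Σ_{a+b+c=n, b+2c ≤ 2n/3} n!/(a! b! c!); the
# n-th root of N creeps up towards e^{1.013455} ≈ 2.755.
from math import comb, exp, log

def N(n):
    return sum(comb(n, b) * comb(n - b, c)      # = n!/(a! b! c!)
               for b in range(n + 1)
               for c in range(n + 1 - b)
               if 3 * (b + 2 * c) <= 2 * n)     # b + 2c ≤ 2n/3

for n in (10, 100, 1000):
    print(n, exp(log(N(n)) / n))                # → 2.755...
```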

All that remains is to compute the asymptotic behaviour of {N}. This can be done using the general tool of Cramér’s theorem, but can also be derived from Stirling’s formula (discussed in this previous post). Indeed, if {a = (\alpha+o(1)) n}, {b = (\beta+o(1)) n}, {c = (\gamma+o(1)) n} for some {\alpha,\beta,\gamma \geq 0} summing to {1}, Stirling’s formula gives

\displaystyle \frac{n!}{a!b!c!} = \exp( n (h(\alpha,\beta,\gamma) + o(1)) )

where {h} is the entropy function

\displaystyle h(\alpha,\beta,\gamma) = \alpha \log \frac{1}{\alpha} + \beta \log \frac{1}{\beta} + \gamma \log \frac{1}{\gamma}.

We then have

\displaystyle N = \exp( n (X + o(1)) )

where {X} is the maximum entropy {h(\alpha,\beta,\gamma)} subject to the constraints

\displaystyle \alpha,\beta,\gamma \geq 0; \alpha+\beta+\gamma=1; \beta+2\gamma \leq 2/3.

A routine Lagrange multiplier computation shows that the maximum occurs when

\displaystyle \alpha = \frac{32}{3(15 + \sqrt{33})}

\displaystyle \beta = \frac{4(\sqrt{33}-1)}{3(15+\sqrt{33})}

\displaystyle \gamma = \frac{(\sqrt{33}-1)^2}{6(15+\sqrt{33})}

and {h(\alpha,\beta,\gamma)} is approximately {1.013455}, giving rise to the claimed bound of {O( 2.756^n )}.
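
One can verify the claimed maximiser numerically (a check of the arithmetic, not a replacement for the Lagrange multiplier computation):

    from math import sqrt, log, exp

    s = sqrt(33)
    alpha = 32 / (3 * (15 + s))
    beta = 4 * (s - 1) / (3 * (15 + s))
    gamma = (s - 1) ** 2 / (6 * (15 + s))

    print(alpha + beta + gamma)   # 1.0 (the probabilities sum to one)
    print(beta + 2 * gamma)       # 0.666... (the constraint is active)
    h = -(alpha * log(alpha) + beta * log(beta) + gamma * log(gamma))
    print(h, exp(h))              # approximately 1.013455 and 2.7551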

Remark 3 As noted in the Ellenberg and Gijswijt papers, the above argument extends readily to fields other than {{\bf F}_3}, to control the maximal size of a subset of {{\bf F}^n} that has no non-trivial solutions to the equation {ax+by+cz=0}, where {a,b,c \in {\bf F}} are non-zero constants that sum to zero. Of course, one replaces the function {(x,y,z) \mapsto \delta_{0^n}(x+y+z)} in Lemma 2 by {(x,y,z) \mapsto \delta_{0^n}(ax+by+cz)} in this case.

Remark 4 This symmetrised formulation suggests that one might be able to improve slightly on the numerical quantity {2.756} by finding a more efficient way to decompose {\delta_{0^n}(x+y+z)} into rank one functions; however, I was not able to do so (though such improvements are reminiscent of the Strassen type algorithms for fast matrix multiplication).

Remark 5 It is tempting to see if this method can get non-trivial upper bounds for sets {A} with no length {4} progressions, in (say) {{\bf F}_5^n}. One can run the above arguments, replacing the function

\displaystyle (x,y,z) \mapsto \delta_{0^n}(x+y+z)

with

\displaystyle (x,y,z,w) \mapsto \delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w);

this leads to the bound {|A| \leq 4N} where

\displaystyle N := \sum_{a,b,c,d,e \geq 0: a+b+c+d+e=n, b+2c+3d+4e \leq 2n} \frac{n!}{a!b!c!d!e!}.

Unfortunately, {N} is asymptotic to {\frac{1}{2} 5^n} and so this bound is in fact slightly worse than the trivial bound {|A| \leq 5^n}! However, there is a slim chance that there is a more efficient way to decompose {\delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w)} into rank one functions that would give a non-trivial bound on {|A|}. I experimented with a few possible such decompositions, but unfortunately without success.
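
The asymptotic {N \sim \frac{1}{2} 5^n} can also be checked empirically for small {n} (a brute-force sketch, illustrative only):

    from math import factorial

    def N(n):
        # multinomial sum over a+b+c+d+e = n with b + 2c + 3d + 4e <= 2n
        total = 0
        for a in range(n + 1):
            for b in range(n + 1 - a):
                for c in range(n + 1 - a - b):
                    for d in range(n + 1 - a - b - c):
                        e = n - a - b - c - d
                        if b + 2 * c + 3 * d + 4 * e <= 2 * n:
                            total += (factorial(n) // (factorial(a) * factorial(b)
                                      * factorial(c) * factorial(d) * factorial(e)))
        return total

    for n in [5, 10, 20, 40]:
        print(n, N(n) / 5 ** n)   # decreases slowly towards 1/2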

Remark 6 Return now to the capset problem. Since Lemma 1 is valid for any field {{\bf F}}, one could perhaps hope to get better bounds by viewing the Kronecker delta function {\delta} as taking values in another field than {{\bf F}_3}, such as the complex numbers {{\bf C}}. However, as soon as one works in a field of characteristic other than {3}, one can adjoin a cube root {\omega} of unity, and one now has the Fourier decomposition

\displaystyle \delta_{0^n}(x+y+z) = \frac{1}{3^n} \sum_{\xi \in {\bf F}_3^n} \omega^{\xi \cdot x} \omega^{\xi \cdot y} \omega^{\xi \cdot z}.

Moving to the Fourier basis, we conclude from Lemma 1 that the function {(x,y,z) \mapsto \delta_{0^n}(x+y+z)} on {({\bf F}_3^n)^3} now has rank exactly {3^n}, and so one cannot improve upon the trivial bound of {|A| \leq 3^n} by this method using fields of characteristic other than three as the range field. So it seems one has to stick with {{\bf F}_3} (or the algebraic completion thereof).
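
For concreteness, here is the {n=1} case of this Fourier decomposition checked numerically over {{\bf C}}, with {\omega = e^{2\pi i/3}} (each summand {\omega^{\xi \cdot x} \omega^{\xi \cdot y} \omega^{\xi \cdot z}} being a rank one function):

    import cmath

    omega = cmath.exp(2j * cmath.pi / 3)

    for x in range(3):
        for y in range(3):
            for z in range(3):
                fourier = sum(omega ** (xi * (x + y + z)) for xi in range(3)) / 3
                delta = 1 if (x + y + z) % 3 == 0 else 0
                assert abs(fourier - delta) < 1e-9
    print("Fourier decomposition of the delta function verified for n = 1")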

Thanks to Jordan Ellenberg and Ben Green for helpful discussions.

Because of Euler’s identity {e^{\pi i} + 1 = 0}, the complex exponential is not injective: {e^{z + 2\pi i k} = e^z} for any complex {z} and integer {k}. As such, the complex logarithm {z \mapsto \log z} is not well-defined as a single-valued function from {{\bf C} \backslash \{0\}} to {{\bf C}}. However, after making a branch cut, one can create a branch of the logarithm which is single-valued. For instance, after removing the negative real axis {(-\infty,0]}, one has the standard branch {\hbox{Log}: {\bf C} \backslash (-\infty,0] \rightarrow \{ z \in {\bf C}: |\hbox{Im} z| < \pi \}} of the logarithm, with {\hbox{Log}(z)} defined as the unique choice of the complex logarithm of {z} whose imaginary part has magnitude strictly less than {\pi}. This particular branch has a number of useful additional properties:

  • The standard branch {\hbox{Log}} is holomorphic on its domain {{\bf C} \backslash (-\infty,0]}.
  • One has {\hbox{Log}( \overline{z} ) = \overline{ \hbox{Log}(z) }} for all {z} in the domain {{\bf C} \backslash (-\infty,0]}. In particular, if {z \in {\bf C} \backslash (-\infty,0]} is real, then {\hbox{Log} z} is real.
  • One has {\hbox{Log}( z^{-1} ) = - \hbox{Log}(z)} for all {z} in the domain {{\bf C} \backslash (-\infty,0]}.

One can then also use the standard branch of the logarithm to create standard branches of other multi-valued functions, for instance creating a standard branch {z \mapsto \exp( \frac{1}{2} \hbox{Log} z )} of the square root function. We caution however that the identity {\hbox{Log}(zw) = \hbox{Log}(z) + \hbox{Log}(w)} can fail for the standard branch (or indeed for any branch of the logarithm).

One can extend this standard branch of the logarithm to {n \times n} complex matrices, or (equivalently) to linear transformations {T: V \rightarrow V} on an {n}-dimensional complex vector space {V}, provided that the spectrum of that matrix or transformation avoids the branch cut {(-\infty,0]}. Indeed, from the spectral theorem one can decompose any such {T: V \rightarrow V} as the direct sum of operators {T_\lambda: V_\lambda \rightarrow V_\lambda} on the non-trivial generalised eigenspaces {V_\lambda} of {T}, where {\lambda \in {\bf C} \backslash (-\infty,0]} ranges over the spectrum of {T}. For each component {T_\lambda} of {T}, we define

\displaystyle  \hbox{Log}( T_\lambda ) = P_\lambda( T_\lambda )

where {P_\lambda} is the Taylor expansion of {\hbox{Log}} at {\lambda}; as {T_\lambda-\lambda} is nilpotent, only finitely many terms in this Taylor expansion are required. The logarithm {\hbox{Log} T} is then defined as the direct sum of the {\hbox{Log} T_\lambda}.
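
Numerically, this standard branch can be explored with scipy, whose logm routine computes the principal matrix logarithm and so agrees with the construction above when the spectrum avoids the cut (a hedged illustration, not a reimplementation of the definition):

    import numpy as np
    from scipy.linalg import expm, logm

    rng = np.random.default_rng(0)
    # a small perturbation of the identity, so the spectrum stays near 1,
    # well away from the branch cut (-infty, 0]
    T = np.eye(3) + 0.1 * rng.standard_normal((3, 3))

    X = logm(T)
    print(np.allclose(expm(X), T))                   # exp(Log T) = T
    print(np.allclose(logm(np.linalg.inv(T)), -X))   # Log(T^{-1}) = -Log(T)

These two checks correspond to properties (i) and (vii) in the list below.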

The matrix standard branch of the logarithm has many pleasant and easily verified properties (often inherited from their scalar counterparts), whenever {T: V \rightarrow V} has no spectrum in {(-\infty,0]}:

  • (i) We have {\exp( \hbox{Log} T ) = T}.
  • (ii) If {T_1: V_1 \rightarrow V_1} and {T_2: V_2 \rightarrow V_2} have no spectrum in {(-\infty,0]}, then {\hbox{Log}( T_1 \oplus T_2 ) = \hbox{Log}(T_1) \oplus \hbox{Log}(T_2)}.
  • (iii) If {T} has spectrum in a closed disk {B(z,r)} in {{\bf C} \backslash (-\infty,0]}, then {\hbox{Log}(T) = P_z(T)}, where {P_z} is the Taylor series of {\hbox{Log}} around {z} (which is absolutely convergent in {B(z,r)}).
  • (iv) {\hbox{Log}(T)} depends holomorphically on {T}. (Easily established from (ii), (iii), after covering the spectrum of {T} by disjoint disks; alternatively, one can use the Cauchy integral representation {\hbox{Log}(T) = \frac{1}{2\pi i} \int_\gamma \hbox{Log}(z)(z-T)^{-1}\ dz} for a contour {\gamma} in the domain enclosing the spectrum of {T}.) In particular, the standard branch of the matrix logarithm is smooth.
  • (v) If {S: V \rightarrow W} is any invertible linear or antilinear map, then {\hbox{Log}( STS^{-1} ) = S \hbox{Log}(T) S^{-1}}. In particular, the standard branch of the logarithm commutes with matrix conjugations; and if {T} is real with respect to a complex conjugation operation on {V} (that is to say, an antilinear involution), then {\hbox{Log}(T)} is real also.
  • (vi) If {T^*: V^* \rightarrow V^*} denotes the transpose of {T} (with {V^*} the complex dual of {V}), then {\hbox{Log}(T^*) = \hbox{Log}(T)^*}. Similarly, if {T^\dagger: V^\dagger \rightarrow V^\dagger} denotes the adjoint of {T} (with {V^\dagger} the complex conjugate of {V^*}, i.e. {V^*} with the conjugated multiplication map {(c,z) \mapsto \overline{c} z}), then {\hbox{Log}(T^\dagger) = \hbox{Log}(T)^\dagger}.
  • (vii) One has {\hbox{Log}(T^{-1}) = - \hbox{Log}( T )}.
  • (viii) If {\sigma(T)} denotes the spectrum of {T}, then {\sigma(\hbox{Log} T) = \hbox{Log} \sigma(T)}.

As a quick application of the standard branch of the matrix logarithm, we have

Proposition 1 Let {G} be one of the following matrix groups: {GL_n({\bf C})}, {GL_n({\bf R})}, {U_n({\bf C})}, {O(Q)}, {Sp_{2n}({\bf C})}, or {Sp_{2n}({\bf R})}, where {Q: {\bf R}^n \rightarrow {\bf R}} is a non-degenerate real quadratic form (so {O(Q)} is isomorphic to a (possibly indefinite) orthogonal group {O(k,n-k)} for some {0 \leq k \leq n}). Then any element {T} of {G} whose spectrum avoids {(-\infty,0]} is exponential, that is to say {T = \exp(X)} for some {X} in the Lie algebra {{\mathfrak g}} of {G}.

Proof: We just prove this for {G=O(Q)}, as the other cases are similar (or a bit simpler). If {T \in O(Q)}, then, viewing {T} as a complex-linear map on {{\bf C}^n} and using the complex bilinear form associated to {Q} to identify {{\bf C}^n} with its complex dual {({\bf C}^n)^*}, we see that {T} is real and {T^{*-1} = T}. By the properties (v), (vi), (vii) of the standard branch of the matrix logarithm, we conclude that {\hbox{Log} T} is real and {- \hbox{Log}(T)^* = \hbox{Log}(T)}, and so {\hbox{Log}(T)} lies in the Lie algebra {{\mathfrak g} = {\mathfrak o}(Q)}, and the claim now follows from (i). \Box

Exercise 2 Show that {\hbox{diag}(-\lambda, -1/\lambda)} is not exponential in {GL_2({\bf R})} if {-\lambda \in (-\infty,0) \backslash \{-1\}}. Thus we see that the branch cut in the above proposition is largely necessary. See this paper of Djokovic for a more complete description of the image of the exponential map in classical groups, as well as this previous blog post for some more discussion of the surjectivity (or lack thereof) of the exponential map in Lie groups.

For a slightly less quick application of the standard branch, we have the following result (recently worked out in the answers to this MathOverflow question):

Proposition 3 Let {T} be an element of the split orthogonal group {O(n,n)} which lies in the connected component of the identity. Then {\hbox{det}(1+T) \geq 0}.

The requirement that {T} lie in the identity component is necessary, as the counterexample {T = \hbox{diag}(-\lambda, -1/\lambda )} for {\lambda \in (0,1) \cup (1,\infty)} shows: here {\hbox{det}(1+T) = (1-\lambda)(1-1/\lambda) = 2 - \lambda - \frac{1}{\lambda} < 0}.

Proof: We think of {T} as a (real) linear transformation on {{\bf C}^{2n}}, and write {Q} for the quadratic form associated to {O(n,n)}, so that {O(n,n) \equiv O(Q)}. We can split {{\bf C}^{2n} = V_1 \oplus V_2}, where {V_1} is the sum of all the generalised eigenspaces corresponding to eigenvalues in {(-\infty,0]}, and {V_2} is the sum of all the remaining eigenspaces. Since {T} and {(-\infty,0]} are real, {V_1,V_2} are real (i.e. complex-conjugation invariant) also. For {i=1,2}, the restriction {T_i: V_i \rightarrow V_i} of {T} to {V_i} then lies in {O(Q_i)}, where {Q_i} is the restriction of {Q} to {V_i}, and

\displaystyle  \hbox{det}(1+T) = \hbox{det}(1+T_1) \hbox{det}(1+T_2).

The spectrum of {T_2} consists of positive reals, as well as complex pairs {\lambda, \overline{\lambda}} (with equal multiplicity), so {\hbox{det}(1+T_2) > 0}. From the preceding proposition we have {T_2 = \exp( X_2 )} for some {X_2 \in {\mathfrak o}(Q_2)}; this will be important later.

It remains to show that {\hbox{det}(1+T_1) \geq 0}. If {T_1} has {-1} in its spectrum then {\hbox{det}(1+T_1)} vanishes and we are done, so we may assume that {T_1} has spectrum only in {(-\infty,-1) \cup (-1,0)} (being invertible, {T} has no spectrum at {0}). We split {V_1 = V_3 \oplus V_4}, where {V_3,V_4} correspond to the portions of the spectrum in {(-\infty,-1)}, {(-1,0)} respectively; these are real, {T}-invariant spaces. We observe that if {V_\lambda, V_\mu} are generalised eigenspaces of {T} with {\lambda \mu \neq 1}, then {V_\lambda, V_\mu} are orthogonal with respect to the (complex-bilinear) inner product {\cdot} associated with {Q}; this is easiest to see first for the actual eigenspaces (since { \lambda \mu u \cdot v = Tu \cdot Tv = u \cdot v} for all {u \in V_\lambda, v \in V_\mu}), and the extension to generalised eigenvectors then follows from a routine induction. From this we see that {V_1} is orthogonal to {V_2}, and {V_3} and {V_4} are null spaces, which by the non-degeneracy of {Q} (and hence of the restriction {Q_1} of {Q} to {V_1}) forces {V_3} to have the same dimension as {V_4}; indeed {Q} now gives an identification of {V_3^*} with {V_4}. If we let {T_3, T_4} be the restrictions of {T} to {V_3,V_4}, we thus identify {T_4} with {T_3^{*-1}}, since {T} lies in {O(Q)}; in particular {T_3} is invertible. Thus

\displaystyle  \hbox{det}(1+T_1) = \hbox{det}(1 + T_3) \hbox{det}( 1 + T_3^{*-1} ) = \hbox{det}(T_3)^{-1} \hbox{det}(1+T_3)^2

and so it suffices to show that {\hbox{det}(T_3) > 0}.

At this point we need to use the hypothesis that {T} lies in the identity component of {O(n,n)}. This implies (by a continuity argument) that the restriction of {T} to any maximal-dimensional positive subspace has positive determinant (since such a restriction cannot be singular, as this would mean that {T} maps a positive norm vector to a non-positive norm vector). Now, as {V_3,V_4} have equal dimension, {Q_1} has a balanced signature, so {Q_2} does also. Since {T_2 = \exp(X_2)}, {T_2} already lies in the identity component of {O(Q_2)}, and so has positive determinant on any maximal-dimensional positive subspace of {V_2}. We conclude that {T_1} has positive determinant on any maximal-dimensional positive subspace of {V_1}.

We choose a complex basis of {V_3}, to identify {V_3} with {V_3^*}, which has already been identified with {V_4}. (In coordinates, {V_3,V_4} are now both of the form {{\bf C}^m}, and {Q( v \oplus w ) = v \cdot w} for {v,w \in {\bf C}^m}.) Then {\{ v \oplus v: v \in V_3 \}} becomes a maximal positive subspace of {V_1}, and the restriction of {T_1} to this subspace is conjugate to {T_3 + T_3^{*-1}}, so that

\displaystyle  \hbox{det}( T_3 + T_3^{*-1} ) > 0.

Since {\hbox{det}( T_3 + T_3^{*-1} ) = \hbox{det}(T_3) \hbox{det}( 1 + T_3^{-1} T_3^{*-1} )} and { 1 + T_3^{-1} T_3^{*-1}} is positive definite, we conclude that {\hbox{det}(T_3)>0} as required. \Box

Analytic number theory is only one of many different approaches to number theory. Another important branch of the subject is algebraic number theory, which studies algebraic structures (e.g. groups, rings, and fields) of number-theoretic interest. With this perspective, the classical field of rationals {{\bf Q}}, and the classical ring of integers {{\bf Z}}, are placed inside the much larger field {\overline{{\bf Q}}} of algebraic numbers, and the much larger ring {{\mathcal A}} of algebraic integers, respectively. Recall that an algebraic number is a root of a polynomial with integer coefficients, and an algebraic integer is a root of a monic polynomial with integer coefficients; thus for instance {\sqrt{2}} is an algebraic integer (a root of {x^2-2}), while {\sqrt{2}/2} is merely an algebraic number (a root of {4x^2-2}). For the purposes of this post, we will adopt the concrete (but somewhat artificial) perspective of viewing algebraic numbers and integers as lying inside the complex numbers {{\bf C}}, thus {{\mathcal A} \subset \overline{{\bf Q}} \subset {\bf C}}. (From a modern algebraic perspective, it is better to think of {\overline{{\bf Q}}} as existing as an abstract field separate from {{\bf C}}, but which has a number of embeddings into {{\bf C}} (as well as into other fields, such as the completed p-adics {{\bf C}_p}), no one of which should be considered favoured over any other; cf. this MathOverflow post. But for the rudimentary algebraic number theory in this post, we will not need to work at this level of abstraction.) In particular, we identify the algebraic integer {\sqrt{-d}} with the complex number {\sqrt{d} i} for any natural number {d}.

Exercise 1 Show that the field of algebraic numbers {\overline{{\bf Q}}} is indeed a field, and that the ring of algebraic integers {{\mathcal A}} is indeed a ring, and is in fact an integral domain. Also, show that {{\bf Z} = {\mathcal A} \cap {\bf Q}}, that is to say the ordinary integers are precisely the algebraic integers that are also rational. Because of this, we will sometimes refer to elements of {{\bf Z}} as rational integers.

In practice, the field {\overline{{\bf Q}}} is too big to conveniently work with directly, having infinite dimension (as a vector space) over {{\bf Q}}. Thus, algebraic number theory generally restricts attention to intermediate fields {{\bf Q} \subset F \subset \overline{{\bf Q}}} between {{\bf Q}} and {\overline{{\bf Q}}}, which are of finite dimension over {{\bf Q}}; that is to say, finite degree extensions of {{\bf Q}}. Such fields are known as algebraic number fields, or number fields for short. Apart from {{\bf Q}} itself, the simplest examples of such number fields are the quadratic fields, which have dimension exactly two over {{\bf Q}}.

Exercise 2 Show that if {\alpha} is a rational number that is not the square of another rational number, then the field {{\bf Q}(\sqrt{\alpha})} generated by {{\bf Q}} and either of the square roots of {\alpha} is a quadratic field. Conversely, show that all quadratic fields arise in this fashion. (Hint: show that every element of a quadratic field is a root of a quadratic polynomial over the rationals.)

The ring of algebraic integers {{\mathcal A}} is similarly too large to conveniently work with directly, so in algebraic number theory one usually works with the rings {{\mathcal O}_F := {\mathcal A} \cap F} of algebraic integers inside a given number field {F}. One can (and does) study this situation in great generality, but for the purposes of this post we shall restrict attention to a simple but illustrative special case, namely the quadratic fields with a certain type of negative discriminant. (The positive discriminant case will be briefly discussed in Remark 42 below.)

Exercise 3 Let {d} be a square-free natural number with {d=1\ (4)} or {d=2\ (4)}. Show that the ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} of algebraic integers in {{\bf Q}(\sqrt{-d})} is given by

\displaystyle  {\mathcal O} = {\bf Z}[\sqrt{-d}] = \{ a + b \sqrt{-d}: a,b \in {\bf Z} \}.

If instead {d} is square-free with {d=3\ (4)}, show that the ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} is instead given by

\displaystyle  {\mathcal O} = {\bf Z}[\frac{1+\sqrt{-d}}{2}] = \{ a + b \frac{1+\sqrt{-d}}{2}: a,b \in {\bf Z} \}.

What happens if {d} is not square-free, or negative?

Remark 4 In the case {d=3\ (4)}, it may naively appear more natural to work with the ring {{\bf Z}[\sqrt{-d}]}, which is an index two subring of {{\mathcal O}}. However, because this ring only captures some of the algebraic integers in {{\bf Q}(\sqrt{-d})} rather than all of them, the algebraic properties of these rings are somewhat worse than those of {{\mathcal O}} (in particular, they generally fail to be Dedekind domains) and so are not convenient to work with in algebraic number theory.

We refer to fields of the form {{\bf Q}(\sqrt{-d})} for natural square-free numbers {d} as quadratic fields of negative discriminant, and similarly refer to {{\mathcal O}_{{\bf Q}(\sqrt{-d})}} as a ring of quadratic integers of negative discriminant. Quadratic fields and quadratic integers of positive discriminant are just as important to analytic number theory as their negative discriminant counterparts, but we will restrict attention to the latter here for simplicity of discussion.

Thus, for instance, when {d=1}, the ring of integers in {{\bf Q}(\sqrt{-1})} is the ring of Gaussian integers

\displaystyle  {\bf Z}[\sqrt{-1}] = \{ x + y \sqrt{-1}: x,y \in {\bf Z} \}

and when {d=3}, the ring of integers in {{\bf Q}(\sqrt{-3})} is the ring of Eisenstein integers

\displaystyle  {\bf Z}[\omega] := \{ x + y \omega: x,y \in {\bf Z} \}

where {\omega := e^{2\pi i /3}} is a cube root of unity.

As these examples illustrate, the additive structure of a ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} of quadratic integers is that of a two-dimensional lattice in {{\bf C}}, which is isomorphic as an additive group to {{\bf Z}^2}. Thus, from an additive viewpoint, one can view quadratic integers as “two-dimensional” analogues of rational integers. From a multiplicative viewpoint, however, the quadratic integers (and more generally, integers in a number field) behave very similarly to the rational integers (as opposed to being some sort of “higher-dimensional” version of such integers). Indeed, a large part of basic algebraic number theory is devoted to treating the multiplicative theory of integers in number fields in a unified fashion, that naturally generalises the classical multiplicative theory of the rational integers.

For instance, every rational integer {n \in {\bf Z}} has an absolute value {|n| \in {\bf N} \cup \{0\}}, with the multiplicativity property {|nm| = |n| |m|} for {n,m \in {\bf Z}}, and the positivity property {|n| > 0} for all {n \neq 0}. Among other things, the absolute value detects units: {|n| = 1} if and only if {n} is a unit in {{\bf Z}} (that is to say, it is multiplicatively invertible in {{\bf Z}}). Similarly, in any ring of quadratic integers {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} with negative discriminant, we can assign a norm {N(n) \in {\bf N} \cup \{0\}} to any quadratic integer {n \in {\mathcal O}_{{\bf Q}(\sqrt{-d})}} by the formula

\displaystyle  N(n) = n \overline{n}

where {\overline{n}} is the complex conjugate of {n}. (When working with number fields other than quadratic fields of negative discriminant, one instead defines {N(n)} to be the product of all the Galois conjugates of {n}.) Thus for instance, when {d=1,2\ (4)} one has

\displaystyle  N(x + y \sqrt{-d}) = x^2 + dy^2 \ \ \ \ \ (1)

and when {d=3\ (4)} one has

\displaystyle  N(x + y \frac{1+\sqrt{-d}}{2}) = x^2 + xy + \frac{d+1}{4} y^2. \ \ \ \ \ (2)

Analogously to the rational integers, we have the multiplicativity property {N(nm) = N(n) N(m)} for {n,m \in {\mathcal O}} and the positivity property {N(n) > 0} for {n \neq 0}, and the units in {{\mathcal O}} are precisely the elements of norm one.

Exercise 5 Establish the three claims of the previous paragraph. Conclude that the units (invertible elements) of {{\mathcal O}} consist of the four elements {\pm 1, \pm i} if {d=1}, the six elements {\pm 1, \pm \omega, \pm \omega^2} if {d=3}, and the two elements {\pm 1} if {d \neq 1,3}.

For the rational integers, we of course have the fundamental theorem of arithmetic, which asserts that every non-zero rational integer can be uniquely factored (up to permutation and units) as the product of irreducible integers, that is to say non-zero, non-unit integers that cannot be factored into the product of integers of strictly smaller norm. As it turns out, the same claim is true for a few additional rings of quadratic integers, such as the Gaussian integers and Eisenstein integers, but fails in general; for instance, in the ring {{\bf Z}[\sqrt{-5}]}, we have the famous counterexample

\displaystyle  6 = 2 \times 3 = (1+\sqrt{-5}) (1-\sqrt{-5})

that decomposes {6} non-uniquely into the product of irreducibles in {{\bf Z}[\sqrt{-5}]}. Nevertheless, it is an important fact that the fundamental theorem of arithmetic can be salvaged if one uses an “idealised” notion of a number in a ring of integers {{\mathcal O}}, now known in modern language as an ideal of that ring. For instance, in {{\bf Z}[\sqrt{-5}]}, the principal ideal {(6)} turns out to uniquely factor into the product of (non-principal) ideals {(2) + (1+\sqrt{-5}), (2) + (1-\sqrt{-5}), (3) + (1+\sqrt{-5}), (3) + (1-\sqrt{-5})}; see Exercise 27. We will review the basic theory of ideals in number fields (focusing primarily on quadratic fields of negative discriminant) below the fold.
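
The failure of unique factorisation in {{\bf Z}[\sqrt{-5}]} is easy to confirm by hand using the norm (1) with {d=5}; here is the same computation as a small Python sketch (illustrative only), with elements represented as pairs {(a,b)} standing for {a+b\sqrt{-5}}:

    def mul(u, v):
        # (a + b sqrt(-5)) * (c + d sqrt(-5)) = (ac - 5bd) + (ad + bc) sqrt(-5)
        (a, b), (c, d) = u, v
        return (a * c - 5 * b * d, a * d + b * c)

    def norm(u):
        # the norm N(a + b sqrt(-5)) = a^2 + 5 b^2
        a, b = u
        return a * a + 5 * b * b

    print(mul((2, 0), (3, 0)))    # (6, 0): 6 = 2 * 3
    print(mul((1, 1), (1, -1)))   # (6, 0): 6 = (1 + sqrt(-5))(1 - sqrt(-5))
    print([norm(u) for u in [(2, 0), (3, 0), (1, 1), (1, -1)]])   # 4, 9, 6, 6

Since {a^2+5b^2} never takes the values {2} or {3}, the factors {2, 3, 1 \pm \sqrt{-5}} (of norms {4, 9, 6, 6}) are all irreducible, so these really are two genuinely different factorisations into irreducibles.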

The norm forms (1), (2) can be viewed as examples of positive definite quadratic forms {Q: {\bf Z}^2 \rightarrow {\bf Z}} over the integers, by which we mean a polynomial of the form

\displaystyle  Q(x,y) = ax^2 + bxy + cy^2

for some integer coefficients {a,b,c}. One can declare two quadratic forms {Q, Q': {\bf Z}^2 \rightarrow {\bf Z}} to be equivalent if one can transform one to the other by an invertible linear transformation {T: {\bf Z}^2 \rightarrow {\bf Z}^2}, so that {Q' = Q \circ T}. For example, the quadratic forms {(x,y) \mapsto x^2 + y^2} and {(x',y') \mapsto 2 (x')^2 + 2 x' y' + (y')^2} are equivalent, as can be seen by using the invertible linear transformation {(x,y) = (x',x'+y')}. Such equivalences correspond to the different choices of basis available when expressing a ring such as {{\mathcal O}} (or an ideal thereof) additively as a copy of {{\bf Z}^2}.

There is an important and classical invariant of a quadratic form {(x,y) \mapsto ax^2 + bxy + c y^2}, namely the discriminant {\Delta := b^2 - 4ac}, which will of course be familiar to most readers via the quadratic formula, which among other things tells us that a quadratic form will be positive definite precisely when its discriminant is negative. It is not difficult (particularly if one exploits the multiplicativity of the determinant of {2 \times 2} matrices) to show that two equivalent quadratic forms have the same discriminant. Thus for instance any quadratic form equivalent to (1) has discriminant {-4d}, while any quadratic form equivalent to (2) has discriminant {-d}. Thus we see that each ring {{\mathcal O}_{{\bf Q}(\sqrt{-d})}} of quadratic integers is associated with a certain negative discriminant {D}, defined to equal {-4d} when {d=1,2\ (4)} and {-d} when {d=3\ (4)}.
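
Both the example equivalence given above and the invariance of the discriminant under it can be confirmed by direct computation; a minimal Python sketch:

    def disc(a, b, c):
        # discriminant of a x^2 + b xy + c y^2
        return b * b - 4 * a * c

    Q1 = lambda x, y: x * x + y * y                   # x^2 + y^2
    Q2 = lambda x, y: 2 * x * x + 2 * x * y + y * y   # 2x^2 + 2xy + y^2

    # the change of variables (x, y) = (x', x' + y') carries Q1 to Q2
    assert all(Q1(u, u + v) == Q2(u, v) for u in range(-5, 6) for v in range(-5, 6))

    print(disc(1, 0, 1), disc(2, 2, 1))   # both equal -4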

Exercise 6 (Geometric interpretation of discriminant) Let {Q: {\bf Z}^2 \rightarrow {\bf Z}} be a quadratic form of negative discriminant {D}, and extend it to a real form {Q: {\bf R}^2 \rightarrow {\bf R}} in the obvious fashion. Show that for any {X>0}, the set {\{ (x,y) \in {\bf R}^2: Q(x,y) \leq X \}} is an ellipse of area {2\pi X / \sqrt{|D|}}.

It is natural to ask the converse question: if two quadratic forms have the same discriminant, are they necessarily equivalent? For certain choices of discriminant, this is the case:

Exercise 7 Show that any quadratic form {ax^2+bxy+cy^2} of discriminant {-4} is equivalent to the form {x^2+y^2}, and any quadratic form of discriminant {-3} is equivalent to {x^2+xy+y^2}. (Hint: use elementary transformations to try to make {|b|} as small as possible, to the point where one only has to check a finite number of cases; this argument is due to Legendre.) More generally, show that for any negative discriminant {D}, there are only finitely many quadratic forms of that discriminant up to equivalence (a result first established by Gauss).

Unfortunately, for most choices of discriminant, the converse question fails; for instance, the quadratic forms {x^2+5y^2} and {2x^2+2xy+3y^2} both have discriminant {-20}, but are not equivalent (Exercise 38). This particular failure of equivalence turns out to be intimately related to the failure of unique factorisation in the ring {{\bf Z}[\sqrt{-5}]}.
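
In the spirit of Exercise 7, one can enumerate the reduced forms of a given negative discriminant (those with {|b| \leq a \leq c}, and {b \geq 0} when {|b|=a} or {a=c}, following the Legendre-style reduction hinted at in that exercise) by brute force; for {D=-20} this recovers exactly the two inequivalent forms just mentioned. A small sketch:

    def reduced_forms(D):
        # reduced positive definite forms a x^2 + b xy + c y^2 of discriminant D < 0;
        # the reduction conditions force 3a^2 <= |D|, so the search is finite
        forms = []
        a = 1
        while 3 * a * a <= -D:
            for b in range(-a, a + 1):
                if (b * b - D) % (4 * a) == 0:
                    c = (b * b - D) // (4 * a)
                    if c >= a and not (b < 0 and (-b == a or a == c)):
                        forms.append((a, b, c))
            a += 1
        return forms

    print(reduced_forms(-4))    # [(1, 0, 1)]            -- one class
    print(reduced_forms(-3))    # [(1, 1, 1)]            -- one class
    print(reduced_forms(-20))   # [(1, 0, 5), (2, 2, 3)] -- two classes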

It turns out that there is a fundamental connection between quadratic fields, equivalence classes of quadratic forms of a given discriminant, and real Dirichlet characters, thus connecting the material discussed above with the last section of the previous set of notes. Here is a typical instance of this connection:

Proposition 8 Let {\chi_4: {\bf N} \rightarrow {\bf R}} be the real non-principal Dirichlet character of modulus {4}, or more explicitly {\chi_4(n)} is equal to {+1} when {n = 1\ (4)}, {-1} when {n = 3\ (4)}, and {0} when {n = 0,2\ (4)}.

  • (i) For any natural number {n}, the number of Gaussian integers {m \in {\bf Z}[\sqrt{-1}]} with norm {N(m)=n} is equal to {4(1 * \chi_4)(n)}. Equivalently, the number of solutions to the equation {n = x^2+y^2} with {x,y \in{\bf Z}} is {4(1*\chi_4)(n)}. (Here, as in the previous post, the symbol {*} denotes Dirichlet convolution.)
  • (ii) For any natural number {n}, the number of Gaussian integers {m \in {\bf Z}[\sqrt{-1}]} that divide {n} (thus {n = dm} for some {d \in {\bf Z}[\sqrt{-1}]}) is {4(1*1*1*\mu\chi_4)(n)}.

We will prove this proposition later in these notes. We observe that as a special case of part (i) of this proposition, we recover the Fermat two-square theorem: an odd prime {p} is expressible as the sum of two squares if and only if {p = 1\ (4)}. This proposition should also be compared with the fact, used crucially in the previous post to prove Dirichlet’s theorem, that {1*\chi(n)} is non-negative for any {n}, and at least one when {n} is a square, for any quadratic character {\chi}.
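
Proposition 8(i) is also easy to test numerically for small {n} (a brute-force check, not a proof):

    from math import isqrt

    def chi4(n):
        return (0, 1, 0, -1)[n % 4]

    def r2(n):
        # number of representations n = x^2 + y^2 with x, y integers
        root = isqrt(n)
        return sum(1 for x in range(-root, root + 1)
                     for y in range(-root, root + 1) if x * x + y * y == n)

    for n in range(1, 200):
        assert r2(n) == 4 * sum(chi4(d) for d in range(1, n + 1) if n % d == 0)
    print("r_2(n) = 4 (1 * chi_4)(n) verified for n < 200")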

As an illustration of the relevance of such connections to analytic number theory, let us now explicitly compute {L(1,\chi_4)}.

Corollary 9 {L(1,\chi_4) = \frac{\pi}{4}}.

This particular identity is also known as the Leibniz formula.
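
Before giving the proof, it may be amusing to see the identity numerically: the partial sums of {1 - \frac{1}{3} + \frac{1}{5} - \dots} do indeed creep towards {\pi/4}, albeit slowly (the error is of size {O(1/x)}):

    from math import pi

    def partial_L(x):
        # partial sum of sum_{n <= x} chi_4(n) / n = 1 - 1/3 + 1/5 - ...
        return sum((1 if n % 4 == 1 else -1) / n for n in range(1, x, 2))

    print(partial_L(10 ** 6), pi / 4)   # 0.78539..., 0.78539...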

Proof: For a large number {x}, consider the quantity

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1

that is to say, the number of Gaussian integers of norm at most {x}. On the one hand, this is the same as the number of lattice points of {{\bf Z}^2} in the disk {\{ (a,b) \in {\bf R}^2: a^2+b^2 \leq x \}} of radius {\sqrt{x}}. Placing a unit square centred at each such lattice point, we obtain a region which differs from the disk by a region contained in an annulus of area {O(\sqrt{x})}. As the area of the disk is {\pi x}, we conclude the Gauss bound

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1 = \pi x + O(\sqrt{x}).

On the other hand, by Proposition 8(i) (and removing the {n=0} contribution), we see that

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1 = 1 + 4 \sum_{n \leq x} 1 * \chi_4(n).

Now we use the Dirichlet hyperbola method to expand the right-hand side sum, first expressing

\displaystyle  \sum_{n \leq x} 1 * \chi_4(n) = \sum_{d \leq \sqrt{x}} \chi_4(d) \sum_{m \leq x/d} 1 + \sum_{m \leq \sqrt{x}} \sum_{d \leq x/m} \chi_4(d)

\displaystyle  - (\sum_{d \leq \sqrt{x}} \chi_4(d)) (\sum_{m \leq \sqrt{x}} 1)

and then using the bounds {\sum_{d \leq y} \chi_4(d) = O(1)}, {\sum_{m \leq y} 1 = y + O(1)}, {\sum_{d \leq \sqrt{x}} \frac{\chi_4(d)}{d} = L(1,\chi_4) + O(\frac{1}{\sqrt{x}})} from the previous set of notes to conclude that

\displaystyle  \sum_{n \leq x} 1 * \chi_4(n) = x L(1,\chi_4) + O(\sqrt{x}).

Comparing the two formulae for {\sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1} and sending {x \rightarrow \infty}, we obtain the claim. \Box

Exercise 10 Give an alternate proof of Corollary 9 that relies on obtaining asymptotics for the Dirichlet series {\sum_{n \in {\bf N}} \frac{1 * \chi_4(n)}{n^s}} as {s \rightarrow 1^+}, rather than using the Dirichlet hyperbola method.

Exercise 11 Give a direct proof of Corollary 9 that does not use Proposition 8, instead using Taylor expansion of the complex logarithm {\log(1+z)}. (One can also use Taylor expansions of some other functions related to the complex logarithm here, such as the arctangent function.)

More generally, one can relate {L(1,\chi)} for a real Dirichlet character {\chi} with the number of inequivalent quadratic forms of a certain discriminant, via the famous class number formula; we will give a special case of this formula below the fold.

The material here is only a very rudimentary introduction to algebraic number theory, and is not essential to the rest of the course. A slightly expanded version of the material here, from the perspective of analytic number theory, may be found in Sections 5 and 6 of Davenport’s book. A more in-depth treatment of algebraic number theory may be found in a number of texts, e.g. Fröhlich and Taylor.

Read the rest of this entry »

As laid out in the foundational work of Kolmogorov, a classical probability space (or probability space for short) is a triplet {(X, {\mathcal X}, \mu)}, where {X} is a set, {{\mathcal X}} is a {\sigma}-algebra of subsets of {X}, and {\mu: {\mathcal X} \rightarrow [0,1]} is a countably additive probability measure on {{\mathcal X}}. Given such a space, one can form a number of interesting function spaces, including

  • the (real) Hilbert space {L^2(X, {\mathcal X}, \mu)} of square-integrable functions {f: X \rightarrow {\bf R}}, modulo {\mu}-almost everywhere equivalence, and with the positive definite inner product {\langle f, g\rangle_{L^2(X, {\mathcal X}, \mu)} := \int_X f g\ d\mu}; and
  • the unital commutative Banach algebra {L^\infty(X, {\mathcal X}, \mu)} of essentially bounded functions {f: X \rightarrow {\bf R}}, modulo {\mu}-almost everywhere equivalence, with {\|f\|_{L^\infty(X, {\mathcal X}, \mu)}} defined as the essential supremum of {|f|}.

There is also a trace {\tau = \tau_\mu: L^\infty(X, {\mathcal X}, \mu) \rightarrow {\bf R}} on {L^\infty} defined by integration: {\tau(f) := \int_X f\ d\mu}.

One can form the category {\mathbf{Prb}} of classical probability spaces, by defining a morphism {\phi: (X, {\mathcal X}, \mu) \rightarrow (Y, {\mathcal Y}, \nu)} between probability spaces to be a function {\phi: X \rightarrow Y} which is measurable (thus {\phi^{-1}(E) \in {\mathcal X}} for all {E \in {\mathcal Y}}) and measure-preserving (thus {\mu(\phi^{-1}(E)) = \nu(E)} for all {E \in {\mathcal Y}}).

Let us now abstract the algebraic features of these spaces as follows; for want of a better name, I will refer to this abstraction as an algebraic probability space. It is very similar to the non-commutative probability spaces studied in this previous post, except that these spaces are now commutative (and real).

Definition 1 An algebraic probability space is a pair {({\mathcal A}, \tau)} where

  • {{\mathcal A}} is a unital commutative real algebra;
  • {\tau: {\mathcal A} \rightarrow {\bf R}} is a linear functional such that {\tau(1)=1} and {\tau( f^2 ) \geq 0} for all {f \in {\mathcal A}};
  • Every element {f} of {{\mathcal A}} is bounded in the sense that {\sup_{k \geq 1} \tau( f^{2k} )^{1/2k} < \infty}. (Technically, this isn’t an algebraic property, but I need it for technical reasons.)

A morphism {\phi: ({\mathcal A}_1, \tau_1) \rightarrow ({\mathcal A}_2, \tau_2)} is a homomorphism {\phi^*: {\mathcal A}_2 \rightarrow {\mathcal A}_1} which is trace-preserving, in the sense that {\tau_1(\phi^*(f)) = \tau_2(f)} for all {f \in {\mathcal A}_2}.
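
To see that classical examples satisfy these axioms, here is a minimal finite toy model (all names here are illustrative): take {{\mathcal A}} to be the real functions on a three-point set and {\tau} the expectation against a probability vector {p}; the quantities {\tau(f^{2k})^{1/2k}} then increase towards the sup norm of {f}, illustrating the boundedness axiom.

    import numpy as np

    p = np.array([0.2, 0.5, 0.3])    # a probability measure on three points
    f = np.array([1.0, -2.0, 0.5])   # an element of the algebra A

    def tau(g):
        # the trace: expectation with respect to p
        return float(np.dot(p, g))

    print(tau(np.ones(3)))           # tau(1) = 1
    print(tau(f * f) >= 0)           # tau(f^2) >= 0
    for k in (1, 2, 4, 8, 16, 32):
        print(k, tau(f ** (2 * k)) ** (1 / (2 * k)))   # increases towards max|f| = 2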

For want of a better name, I’ll denote the category of algebraic probability spaces as {\mathbf{AlgPrb}}. One can view this category as the opposite category to (a subcategory of) the category of tracial commutative real algebras. One could emphasise this opposite nature by denoting the algebraic probability space as {({\mathcal A}, \tau)^{op}} rather than {({\mathcal A},\tau)}; another suggestive (but slightly inaccurate) notation, inspired by the language of schemes, would be {\hbox{Spec}({\mathcal A},\tau)} rather than {({\mathcal A},\tau)}. However, we will not adopt these conventions here, and refer to algebraic probability spaces just by the pair {({\mathcal A},\tau)}.

By the previous discussion, we have a covariant functor {F: \textbf{Prb} \rightarrow \textbf{AlgPrb}} that takes a classical probability space {(X, {\mathcal X}, \mu)} to its algebraic counterpart {(L^\infty(X, {\mathcal X},\mu), \tau_\mu)}, with a morphism {\phi: (X, {\mathcal X}, \mu) \rightarrow (Y, {\mathcal Y}, \nu)} of classical probability spaces mapping to a morphism {F(\phi): (L^\infty(X, {\mathcal X},\mu), \tau_\mu) \rightarrow (L^\infty(Y, {\mathcal Y},\nu), \tau_\nu)} of the corresponding algebraic probability spaces by the formula

\displaystyle  F(\phi)^* f := f \circ \phi

for {f \in L^\infty(Y, {\mathcal Y}, \nu)}. One easily verifies that this is a functor.

In this post I would like to describe a functor {G: \textbf{AlgPrb} \rightarrow \textbf{Prb}} which partially inverts {F} (up to natural isomorphism), that is to say a recipe for starting with an algebraic probability space {({\mathcal A}, \tau)} and producing a classical probability space {(X, {\mathcal X}, \mu)}. This recipe is not new – it is basically the (commutative) Gelfand-Naimark-Segal construction (discussed in this previous post) combined with the Loomis-Sikorski theorem (discussed in this previous post). However, I wanted to put the construction in a single location for sake of reference. I also wanted to make the point that {F} and {G} are not complete inverses; there is a bit of information in the algebraic probability space (e.g. topological information) which is lost when passing back to the classical probability space. In some future posts, I would like to develop some ergodic theory using the algebraic foundations of probability theory rather than the classical foundations; this turns out to be convenient in the ergodic theory arising from nonstandard analysis (such as that described in this previous post), in which the groups involved are uncountable and the underlying spaces are not standard Borel spaces.

Let us describe how to construct the functor {G}, with details postponed to below the fold.

  1. Starting with an algebraic probability space {({\mathcal A}, \tau)}, form an inner product on {{\mathcal A}} by the formula {\langle f, g \rangle := \tau(fg)}, and also form the spectral radius {\rho(f) :=\lim_{k \rightarrow \infty} \tau(f^{2^k})^{1/2^k}}.
  2. The inner product is clearly positive semi-definite. Quotienting out the null vectors and taking completions, we arrive at a real Hilbert space {L^2 = L^2({\mathcal A},\tau)}, to which the trace {\tau} may be extended.
  3. Somewhat less obviously, the spectral radius is well-defined and gives a norm on {{\mathcal A}}. Taking {L^2} limits of sequences in {{\mathcal A}} of bounded spectral radius gives us a subspace {L^\infty = L^\infty({\mathcal A},\tau)} of {L^2} that has the structure of a real commutative Banach algebra.
  4. The idempotents {1_E} of the Banach algebra {L^\infty} may be indexed by elements {E} of an abstract {\sigma}-algebra {{\mathcal B}}.
  5. The Boolean algebra homomorphisms {\delta_x: {\mathcal B} \rightarrow \{0,1\}} (or equivalently, the real algebra homomorphisms {\iota_x: L^\infty \rightarrow {\bf R}}) may be indexed by elements {x} of a space {X}.
  6. Let {{\mathcal X}} denote the {\sigma}-algebra on {X} generated by the basic sets {\overline{E} := \{ x \in X: \delta_x(E) = 1 \}} for every {E \in {\mathcal B}}.
  7. Let {{\mathcal N}} be the {\sigma}-ideal of {{\mathcal X}} generated by the sets {\bigcap_n \overline{E_n}}, where {E_n \in {\mathcal B}} is a sequence with {\bigcap_n E_n = \emptyset}.
  8. One verifies that {{\mathcal B}} is isomorphic to {{\mathcal X}/{\mathcal N}}. Using this isomorphism, the trace {\tau} on {L^\infty} can be used to construct a countably additive measure {\mu} on {{\mathcal X}}. The classical probability space {(X, {\mathcal X}, \mu)} is then {G( {\mathcal A}, \tau )}, and the abstract spaces {L^2, L^\infty} may now be identified with their concrete counterparts {L^2(X, {\mathcal X}, \mu)}, {L^\infty(X, {\mathcal X}, \mu)}.
  9. Every algebraic probability space morphism {\phi: ({\mathcal A}_1,\tau_1) \rightarrow ({\mathcal A}_2,\tau_2)} generates a classical probability morphism {G(\phi): (X_1, {\mathcal X}_1, \mu_1) \rightarrow (X_2, {\mathcal X}_2, \mu_2)} via the formula

    \displaystyle  \delta_{G(\phi)(x_1)}( E_2 ) = \delta_{x_1}( \phi^*(E_2) )

    using a pullback operation {\phi^*} on the abstract {\sigma}-algebras {{\mathcal B}_1, {\mathcal B}_2} that can be defined by density.

Remark 1 The classical probability space {X} constructed by the functor {G} has some additional structure; namely {X} is a {\sigma}-Stone space (a Stone space with the property that the closure of any countable union of clopen sets is clopen), {{\mathcal X}} is the Baire {\sigma}-algebra (generated by the clopen sets), and the null sets are the meager sets. However, we will not use this additional structure here.

The partial inversion relationship between the functors {F: \textbf{Prb} \rightarrow \textbf{AlgPrb}} and {G: \textbf{AlgPrb} \rightarrow \textbf{Prb}} is given by the following assertion:

  1. There is a natural transformation from {F \circ G: \textbf{AlgPrb} \rightarrow \textbf{AlgPrb}} to the identity functor {I: \textbf{AlgPrb} \rightarrow \textbf{AlgPrb}}.

More informally: if one starts with an algebraic probability space {({\mathcal A},\tau)} and converts it back into a classical probability space {(X, {\mathcal X}, \mu)}, then there is a trace-preserving algebra homomorphism of {{\mathcal A}} to {L^\infty( X, {\mathcal X}, \mu )}, which respects morphisms of the algebraic probability space. While this relationship is far weaker than an equivalence of categories (which would require that {F \circ G} and {G \circ F} are both natural isomorphisms), it is still good enough to allow many ergodic theory problems formulated using classical probability spaces to be reformulated instead as an equivalent problem in algebraic probability spaces.

Remark 2 The opposite composition {G \circ F: \textbf{Prb} \rightarrow \textbf{Prb}} is a little odd: it takes an arbitrary probability space {(X, {\mathcal X}, \mu)} and returns a more complicated probability space {(X', {\mathcal X}', \mu')}, with {X'} being the space of homomorphisms {\iota_x: L^\infty(X, {\mathcal X}, \mu) \rightarrow {\bf R}}. While there is “morally” an embedding of {X} into {X'} using the evaluation map, this map does not exist in general because points in {X} may well have zero measure. However, if one takes a “pointless” approach and focuses just on the measure algebras {({\mathcal X}, \mu)}, {({\mathcal X}', \mu')}, then these algebras become naturally isomorphic after quotienting out by null sets.

Remark 3 An algebraic probability space captures a bit more structure than a classical probability space, because {{\mathcal A}} may be identified with a proper subset of {L^\infty} that describes the “regular” functions (or random variables) of the space. For instance, starting with the unit circle {{\bf R}/{\bf Z}} (with the usual Haar measure and the usual trace {\tau(f) = \int_{{\bf R}/{\bf Z}} f}), any unital subalgebra {{\mathcal A}} of {L^\infty({\bf R}/{\bf Z})} that is dense in {L^2({\bf R}/{\bf Z})} will generate the same classical probability space {G( {\mathcal A}, \tau )} on applying the functor {G}, namely one will get the space {({\bf R}/{\bf Z})'} of homomorphisms from {L^\infty({\bf R}/{\bf Z})} to {{\bf R}} (with the measure induced from {\tau}). Thus for instance {{\mathcal A}} could be the continuous functions {C( {\bf R}/{\bf Z} )}, the Wiener algebra {A({\bf R}/{\bf Z})} or the full space {L^\infty({\bf R}/{\bf Z})}, but the classical space {G( {\mathcal A}, \tau )} will be unable to distinguish these spaces from each other. In particular, the functor {F \circ G} loses information (roughly speaking, this functor takes an algebraic probability space and completes it to a von Neumann algebra, but then forgets exactly what algebra was initially used to create this completion). In ergodic theory, this sort of “extra structure” is traditionally encoded in topological terms, by assuming that the underlying probability space {X} has a nice topological structure (e.g. a standard Borel space); however, with the algebraic perspective one has the freedom to have non-topological notions of extra structure, by choosing {{\mathcal A}} to be something other than an algebra {C(X)} of continuous functions on a topological space. I hope to discuss one such example of extra structure (coming from the Gowers-Host-Kra theory of uniformity seminorms) in a later blog post (this generalises the example of the Wiener algebra given previously, which is encoding “Fourier structure”).

A small example of how one could use the functors {F, G} is as follows. Suppose one has a classical probability space {(X, {\mathcal X}, \mu)} with a measure-preserving action of an uncountable group {\Gamma}, which is only defined (and an action) up to almost everywhere equivalence; thus for instance for any set {E} and any {g, h \in \Gamma}, {T^{gh} E} and {T^g T^h E} might not be exactly equal, but only equal up to a null set. For similar reasons, an element {E} of the invariant factor {{\mathcal X}^\Gamma} might not be exactly invariant with respect to {\Gamma}, but instead one only has {T^g E} and {E} equal up to null sets for each {g \in \Gamma}. One might like to “clean up” the action of {\Gamma} to make it defined everywhere, and a genuine action everywhere, but this is not immediately achievable if {\Gamma} is uncountable, since the union of all the null sets where something bad occurs may cease to be a null set. However, by applying the functor {F}, each shift {T^g: X \rightarrow X} defines a morphism {T^g: L^\infty(X, {\mathcal X}, \mu) \rightarrow L^\infty(X, {\mathcal X}, \mu)} on the associated algebraic probability space (i.e. the Koopman operator), and then applying {G}, we obtain a shift {T^g: X' \rightarrow X'} on a new classical probability space {(X', {\mathcal X}', \mu')} which now gives a genuine measure-preserving action of {\Gamma}, and which is equivalent to the original action from a measure algebra standpoint. The invariant factor {{\mathcal X}^\Gamma} now consists of those sets in {{\mathcal X}'} which are genuinely {\Gamma}-invariant, not just up to null sets. (Basically, the classical probability space {(X', {\mathcal X}', \mu')} contains a Boolean algebra {\overline{\mathcal B}} with the property that every measurable set {A \in {\mathcal X}'} is equivalent up to null sets to precisely one set in {\overline{\mathcal B}}, allowing for a canonical “retraction” onto {\overline{\mathcal B}} that eliminates all null set issues.)

More indirectly, the functors {F, G} suggest that one should be able to develop a “pointless” form of ergodic theory, in which the underlying probability spaces are given algebraically rather than classically. I hope to give some more specific examples of this in later posts.

Read the rest of this entry »

In this previous post I recorded some (very standard) material on the structural theory of finite-dimensional complex Lie algebras (or Lie algebras for short), with a particular focus on those Lie algebras which were semisimple or simple. Among other things, these notes discussed the Weyl complete reducibility theorem (asserting that semisimple Lie algebras are the direct sum of simple Lie algebras) and the classification of simple Lie algebras (with all such Lie algebras being (up to isomorphism) of the form {A_n}, {B_n}, {C_n}, {D_n}, {E_6}, {E_7}, {E_8}, {F_4}, or {G_2}).

Among other things, the structural theory of Lie algebras can then be used to build analogous structures in nearby areas of mathematics, such as Lie groups and Lie algebras over more general fields than the complex field {{\bf C}} (leading in particular to the notion of a Chevalley group), as well as finite simple groups of Lie type, which form the bulk of the classification of finite simple groups (with the exception of the alternating groups and a finite number of sporadic groups).

In the case of complex Lie groups, it turns out that every simple Lie algebra {\mathfrak{g}} is associated with a finite number of connected complex Lie groups, ranging from a “minimal” Lie group {G_{ad}} (the adjoint form of the Lie group) to a “maximal” Lie group {\tilde G} (the simply connected form of the Lie group) that finitely covers {G_{ad}}, and occasionally also a number of intermediate forms which finitely cover {G_{ad}}, but are in turn finitely covered by {\tilde G}. For instance, {\mathfrak{sl}_n({\bf C})} is associated with the projective special linear group {\hbox{PSL}_n({\bf C}) = \hbox{PGL}_n({\bf C})} as its adjoint form and the special linear group {\hbox{SL}_n({\bf C})} as its simply connected form, and intermediate groups can be created by quotienting out {\hbox{SL}_n({\bf C})} by some subgroup of its centre (which is isomorphic to the group of {n^{th}} roots of unity). The minimal form {G_{ad}} is simple in the group-theoretic sense of having no non-trivial normal subgroups, but the other forms of the Lie group are merely quasisimple, although traditionally all of the forms of a Lie group associated to a simple Lie algebra are known as simple Lie groups.

Thanks to the work of Chevalley, a very similar story holds for algebraic groups over arbitrary fields {k}; given any Dynkin diagram, one can define a simple Lie algebra with that diagram over that field, and also one can find a finite number of connected algebraic groups over {k} (known as Chevalley groups) with that Lie algebra, ranging from an adjoint form {G_{ad}} to a universal form {G_u}, with every form having an isogeny (the analogue of a finite cover for algebraic groups) to the adjoint form, and in turn receiving an isogeny from the universal form. Thus, for instance, one could construct the universal form {E_7(q)_u} of the {E_7} algebraic group over a finite field {{\bf F}_q} of order {q}.

When one restricts the Chevalley group construction to adjoint forms over a finite field (e.g. {\hbox{PSL}_n({\bf F}_q)}), one usually obtains a finite simple group (with a finite number of exceptions when the rank and the field are very small, and in some cases one also has to pass to a bounded index subgroup, such as the derived group, first). One could also use other forms than the adjoint form, but one then recovers the same finite simple group as before if one quotients out by the centre. This construction was then extended by Steinberg, Suzuki, and Ree by taking a Chevalley group over a finite field and then restricting to the fixed points of a certain automorphism of that group; after some additional minor modifications such as passing to a bounded index subgroup or quotienting out a bounded centre, this gives some additional finite simple groups of Lie type, including classical examples such as the projective special unitary groups {\hbox{PSU}_n({\bf F}_{q^2})}, as well as some more exotic examples such as the Suzuki groups or the Ree groups.

While I learned most of the classical structural theory of Lie algebras back when I was an undergraduate, and have interacted with Lie groups in many ways in the past (most recently in connection with Hilbert’s fifth problem, as discussed in this previous series of lectures), I have only recently had the need to understand more precisely the concepts of a Chevalley group and of a finite simple group of Lie type, as well as better understand the structural theory of simple complex Lie groups. As such, I am recording some notes here regarding these concepts, mainly for my own benefit, but perhaps they will also be of use to some other readers. The material here is standard, and was drawn from a number of sources, but primarily from Carter, Gorenstein-Lyons-Solomon, and Fulton-Harris, as well as the lecture notes on Chevalley groups by my colleague Robert Steinberg. The arrangement of material also reflects my own personal preferences; in particular, I tend to favour complex-variable or Riemannian geometry methods over algebraic ones, and this influenced a number of choices I had to make regarding how to prove certain key facts. The notes below are far from a comprehensive or fully detailed discussion of these topics, and I would refer interested readers to the references above for a properly thorough treatment.

Read the rest of this entry »

An abstract finite-dimensional complex Lie algebra, or Lie algebra for short, is a finite-dimensional complex vector space {{\mathfrak g}} together with an anti-symmetric bilinear form {[,] = [,]_{\mathfrak g}: {\mathfrak g} \times {\mathfrak g} \rightarrow {\mathfrak g}} that obeys the Jacobi identity

\displaystyle [[x,y],z] + [[y,z],x] + [[z,x],y] = 0 \ \ \ \ \ (1)

for all {x,y,z \in {\mathfrak g}}; by anti-symmetry one can also rewrite the Jacobi identity as

\displaystyle [x,[y,z]] = [[x,y],z] + [y,[x,z]]. \ \ \ \ \ (2)

We will usually omit the subscript from the Lie bracket {[,]_{\mathfrak g}} when this will not cause ambiguity. A homomorphism {\phi: {\mathfrak g} \rightarrow {\mathfrak h}} between two Lie algebras {{\mathfrak g},{\mathfrak h}} is a linear map that respects the Lie bracket, thus {\phi([x,y]_{\mathfrak g}) =[\phi(x),\phi(y)]_{\mathfrak h}} for all {x,y \in {\mathfrak g}}. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted {0}; Lie algebras of positive dimension will be called non-trivial.

Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).

A particular example of a Lie algebra is the general linear Lie algebra {{\mathfrak{gl}}(V)} of linear transformations {x: V \rightarrow V} on a finite-dimensional complex vector space (or vector space for short) {V}, with the commutator Lie bracket {[x,y] := xy-yx}; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of {{\mathfrak{gl}}(V)} for some vector space {V}, and similarly define a representation of a Lie algebra {{\mathfrak g}} to be a homomorphism {\rho: {\mathfrak g} \rightarrow {\mathfrak h}} into a concrete Lie algebra {{\mathfrak h}}. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
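
The verification that the commutator bracket obeys (1) and (2) is a short expansion of terms; for the sceptical, here is the same check done numerically on random matrices (illustrative only):

    import numpy as np

    rng = np.random.default_rng(1)
    x, y, z = (rng.standard_normal((4, 4)) for _ in range(3))

    def bracket(a, b):
        # the commutator Lie bracket on gl(V)
        return a @ b - b @ a

    print(np.allclose(bracket(x, y), -bracket(y, x)))   # anti-symmetry
    jacobi = (bracket(bracket(x, y), z) + bracket(bracket(y, z), x)
              + bracket(bracket(z, x), y))
    print(np.allclose(jacobi, np.zeros((4, 4))))        # the Jacobi identity (1)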

Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra {{\mathfrak g}} is to factor it into two simpler algebras {{\mathfrak h}, {\mathfrak k}} via a short exact sequence

\displaystyle 0 \rightarrow {\mathfrak h} \rightarrow {\mathfrak g} \rightarrow {\mathfrak k} \rightarrow 0, \ \ \ \ \ (3)

thus one has an injective homomorphism from {{\mathfrak h}} to {{\mathfrak g}} and a surjective homomorphism from {{\mathfrak g}} to {{\mathfrak k}} such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify {{\mathfrak g}} with the vector space {{\mathfrak h} \times {\mathfrak k}} equipped with a Lie bracket of the form

\displaystyle [(t,x), (s,y)]_{\mathfrak g} = ([t,s]_{\mathfrak h} + A(t,y) - A(s,x) + B(x,y), [x,y]_{\mathfrak k}) \ \ \ \ \ (4)

for some bilinear maps {A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}} and {B: {\mathfrak k} \times {\mathfrak k} \rightarrow {\mathfrak h}} that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps {A,B} are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding {{\mathfrak g}} can be reduced to that of understanding that of its factors {{\mathfrak k}, {\mathfrak h}}. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation

\displaystyle {\mathfrak g} = {\mathfrak h} . {\mathfrak k} \ \ \ \ \ (5)

although one should caution that for given {{\mathfrak h}} and {{\mathfrak k}}, there can be multiple non-isomorphic {{\mathfrak g}} that can form a short exact sequence with {{\mathfrak h},{\mathfrak k}}, so that {{\mathfrak h} . {\mathfrak k}} is not a uniquely defined combination of {{\mathfrak h}} and {{\mathfrak k}}; one could emphasise this by writing {{\mathfrak h} ._{A,B} {\mathfrak k}} instead of {{\mathfrak h} . {\mathfrak k}}, though we will not do so here. We will refer to {{\mathfrak g}} as an extension of {{\mathfrak k}} by {{\mathfrak h}}, and read the notation (5) as “ {{\mathfrak g}} is {{\mathfrak h}}-by-{{\mathfrak k}}“; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation {.} is not commutative, and it is only partly associative: every Lie algebra of the form {{\mathfrak k} . ({\mathfrak h} . {\mathfrak l})} is also of the form {({\mathfrak k} . {\mathfrak h}) . {\mathfrak l}}, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of {{\mathfrak h} . {\mathfrak k}} as a (twisted) “sum” of {{\mathfrak h}} and {{\mathfrak k}} rather than a “product”; for instance, we have {{\mathfrak g} = 0 . {\mathfrak g}} and {{\mathfrak g} = {\mathfrak g} . 0}, and also {\dim {\mathfrak h} . {\mathfrak k} = \dim {\mathfrak h} + \dim {\mathfrak k}}.

Special examples of extensions {{\mathfrak h} .{\mathfrak k}} of {{\mathfrak k}} by {{\mathfrak h}} include the direct sum (or direct product) {{\mathfrak h} \oplus {\mathfrak k}} (also denoted {{\mathfrak h} \times {\mathfrak k}}), which is given by the construction (4) with {A} and {B} both vanishing, and the split extension (or semidirect product) {{\mathfrak h} : {\mathfrak k} = {\mathfrak h} :_\rho {\mathfrak k}} (also denoted {{\mathfrak h} \rtimes {\mathfrak k} = {\mathfrak h} \rtimes_\rho {\mathfrak k}}), which is given by the construction (4) with {B} vanishing and the bilinear map {A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}} taking the form

\displaystyle A( t, x ) = \rho(x)(t)

for some representation {\rho: {\mathfrak k} \rightarrow \hbox{Der} {\mathfrak h}} of {{\mathfrak k}} in the concrete Lie algebra of derivations {\hbox{Der} {\mathfrak h} \subset {\mathfrak{gl}}({\mathfrak h})} of {{\mathfrak h}}, that is to say the algebra of linear maps {D: {\mathfrak h} \rightarrow {\mathfrak h}} that obey the Leibniz rule

\displaystyle D[s,t]_{\mathfrak h} = [Ds,t]_{\mathfrak h} + [s,Dt]_{\mathfrak h}

for all {s,t \in {\mathfrak h}}. (The derivation algebra {\hbox{Der} {\mathfrak g}} of a Lie algebra {{\mathfrak g}} is analogous to the automorphism group {\hbox{Aut}(G)} of a Lie group {G}, with the two concepts being intertwined by the tangent space functor {G \mapsto {\mathfrak g}} from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)
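
To make the split extension construction concrete, here is a minimal numerical sketch (my own Python/numpy illustration; the choice of algebras and of the matrix {R} is arbitrary). We take {{\mathfrak h}} to be the abelian algebra {{\bf C}^2}, so that every linear map on {{\mathfrak h}} is a derivation, let a one-dimensional {{\mathfrak k}} act through a fixed matrix {R}, and verify the Jacobi identity for the bracket (4) (with {B=0}) on random triples:

    import numpy as np

    # Split extension (4) with B = 0: h is the abelian algebra C^2 (any linear
    # map on an abelian algebra is a derivation), and k = C acts on h via
    # rho(x) = x * R.  Elements of g = h x k are stored as triples (t1, t2, x).
    rng = np.random.default_rng(0)
    R = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

    def bracket(u, v):
        t, x = u[:2], u[2]
        s, y = v[:2], v[2]
        # ([t,s]_h + A(t,y) - A(s,x), [x,y]_k), with both internal brackets
        # vanishing here, and A(t,y) = rho(y)(t) = y * R @ t as in the text
        return np.concatenate([y * (R @ t) - x * (R @ s), [0j]])

    def jacobi(u, v, w):
        return (bracket(bracket(u, v), w) + bracket(bracket(v, w), u)
                + bracket(bracket(w, u), v))

    for _ in range(5):
        u, v, w = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
        assert np.allclose(jacobi(u, v, w), 0)  # Jacobi identity holds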

There are two general ways to factor a Lie algebra {{\mathfrak g}} as an extension {{\mathfrak h} . {\mathfrak k}} of a smaller Lie algebra {{\mathfrak k}} by another smaller Lie algebra {{\mathfrak h}}. One is to locate a Lie algebra ideal (or ideal for short) {{\mathfrak h}} in {{\mathfrak g}}, thus {[{\mathfrak h},{\mathfrak g}] \subset {\mathfrak h}}, where {[{\mathfrak h},{\mathfrak g}]} denotes the Lie algebra generated by {\{ [x,y]: x \in {\mathfrak h}, y \in {\mathfrak g} \}}, and then take {{\mathfrak k}} to be the quotient space {{\mathfrak g}/{\mathfrak h}} in the usual manner; one can check that {{\mathfrak h}}, {{\mathfrak k}} are also Lie algebras and that we do indeed have a short exact sequence

\displaystyle {\mathfrak g} = {\mathfrak h} . ({\mathfrak g}/{\mathfrak h}).

Conversely, whenever one has a factorisation {{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}, one can identify {{\mathfrak h}} with an ideal in {{\mathfrak g}}, and {{\mathfrak k}} with the quotient of {{\mathfrak g}} by {{\mathfrak h}}.

The other general way to obtain such a factorisation is to start with a homomorphism {\rho: {\mathfrak g} \rightarrow {\mathfrak m}} of {{\mathfrak g}} into another Lie algebra {{\mathfrak m}}, take {{\mathfrak k}} to be the image {\rho({\mathfrak g})} of {{\mathfrak g}}, and {{\mathfrak h}} to be the kernel {\hbox{ker} \rho := \{ x \in {\mathfrak g}: \rho(x) = 0 \}}. Again, it is easy to see that this does indeed create a short exact sequence:

\displaystyle {\mathfrak g} = \hbox{ker} \rho . \rho({\mathfrak g}).

Conversely, whenever one has a factorisation {{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}, one can identify {{\mathfrak k}} with the image of {{\mathfrak g}} under some homomorphism, and {{\mathfrak h}} with the kernel of that homomorphism. Note that if a representation {\rho: {\mathfrak g} \rightarrow {\mathfrak m}} is faithful (i.e. injective), then the kernel is trivial and {{\mathfrak g}} is isomorphic to {\rho({\mathfrak g})}.

Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras {{\mathfrak g}}, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra {{\bf C}}. Conversely, by using an arbitrary linear basis of {{\mathfrak g}}, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of {{\bf C}}.

Now consider a Lie algebra {{\mathfrak g}} that is not necessarily abelian. We then form the derived algebra {[{\mathfrak g},{\mathfrak g}]}; this algebra is trivial if and only if {{\mathfrak g}} is abelian. It is easy to see that {[{\mathfrak h},{\mathfrak k}]} is an ideal whenever {{\mathfrak h},{\mathfrak k}} are ideals, so in particular the derived algebra {[{\mathfrak g},{\mathfrak g}]} is an ideal and we thus have the short exact sequence

\displaystyle {\mathfrak g} = [{\mathfrak g},{\mathfrak g}] . ({\mathfrak g}/[{\mathfrak g},{\mathfrak g}]).

The algebra {{\mathfrak g}/[{\mathfrak g},{\mathfrak g}]} is the maximal abelian quotient of {{\mathfrak g}}, and is known as the abelianisation of {{\mathfrak g}}. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than {{\mathfrak g}}. From this, it is natural to associate two series to any Lie algebra {{\mathfrak g}}, the lower central series

\displaystyle {\mathfrak g}_1 = {\mathfrak g}; {\mathfrak g}_2 := [{\mathfrak g}, {\mathfrak g}_1]; {\mathfrak g}_3 := [{\mathfrak g}, {\mathfrak g}_2]; \ldots

and the derived series

\displaystyle {\mathfrak g}^{(1)} := {\mathfrak g}; {\mathfrak g}^{(2)} := [{\mathfrak g}^{(1)}, {\mathfrak g}^{(1)}]; {\mathfrak g}^{(3)} := [{\mathfrak g}^{(2)}, {\mathfrak g}^{(2)}]; \ldots.

By induction we see that these are both decreasing series of ideals of {{\mathfrak g}}, with the derived series being slightly smaller ({{\mathfrak g}^{(k)} \subseteq {\mathfrak g}_k} for all {k}). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra {{\mathfrak{gl}}_n = {\mathfrak{gl}}({\bf C}^n)}, which can be identified with the Lie algebra of {n \times n} complex matrices, the subalgebra {{\mathfrak n}} of strictly upper triangular matrices is nilpotent (but not abelian for {n \geq 3}), while the subalgebra {{\mathfrak b}} of upper triangular matrices is solvable (but not nilpotent for {n \geq 2}). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.
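
These examples are small enough to check by machine; here is a throwaway Python/numpy sketch of my own (using the fact that in these small examples the linear span of the pairwise brackets already yields the relevant ideals) that computes the two series for {{\mathfrak n}} and {{\mathfrak b}} inside {{\mathfrak{gl}}_3}:

    import numpy as np
    from itertools import product

    def basis_of_span(mats, tol=1e-10):
        """Greedily extract a basis of the linear span of a list of matrices."""
        basis = []
        for m in mats:
            stack = np.array([b.ravel() for b in basis + [m]])
            if np.linalg.matrix_rank(stack, tol=tol) > len(basis):
                basis.append(m)
        return basis

    def bracket_span(X, Y):
        return basis_of_span([x @ y - y @ x for x, y in product(X, Y)])

    E = lambda i, j: np.eye(3)[:, [i]] @ np.eye(3)[[j], :]  # matrix unit E_ij
    n = [E(0, 1), E(0, 2), E(1, 2)]       # strictly upper triangular
    b = n + [E(0, 0), E(1, 1), E(2, 2)]   # upper triangular

    g, dims = n, []
    while g:
        g = bracket_span(n, g)            # lower central series of n
        dims.append(len(g))
    print("lower central series of n has dims", dims)   # [1, 0]: nilpotent

    g, dims = b, []
    while g:
        g = bracket_span(g, g)            # derived series of b
        dims.append(len(g))
    print("derived series of b has dims", dims)         # [3, 1, 0]: solvable
    # the lower central series of b stabilises at [b,[b,b]] = n, of dimension
    # 3, so b is not nilpotent:
    print(len(bracket_span(b, bracket_span(b, b))))     # 3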

From the above discussion we see that a Lie algebra is solvable if and only if it can be represented by a tower of abelian extensions, thus

\displaystyle {\mathfrak g} = {\mathfrak a}_1 . ({\mathfrak a}_2 . \ldots ({\mathfrak a}_{k-1} . {\mathfrak a}_k) \ldots )

for some abelian {{\mathfrak a}_1,\ldots,{\mathfrak a}_k}. Similarly, a Lie algebra {{\mathfrak g}} is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions {{\mathfrak h} . {\mathfrak k}} in the above factorisation, {{\mathfrak h}} is central in {{\mathfrak h} . {\mathfrak k}}, where we say that {{\mathfrak h}} is central in {{\mathfrak g}} if {[{\mathfrak h},{\mathfrak g}]=0}). We also see that an extension {{\mathfrak h} . {\mathfrak k}} is solvable if and only if both factors {{\mathfrak h}, {\mathfrak k}} are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.

For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra {{\mathfrak g}} has an adjoint representation {\hbox{ad}: {\mathfrak g} \rightarrow \hbox{ad} {\mathfrak g} \subset {\mathfrak{gl}}({\mathfrak g})}, where for each {x \in {\mathfrak g}}, {\hbox{ad} x \in {\mathfrak{gl}}({\mathfrak g})} is the linear map {(\hbox{ad} x)(y) := [x,y]}; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that {\hbox{ad} [x,y] = [\hbox{ad} x, \hbox{ad} y]} for all {x,y \in {\mathfrak g}}). The kernel of this representation is the center {Z({\mathfrak g}) := \{ x \in {\mathfrak g}: [x,{\mathfrak g}] = 0\}}, which is the maximal central subalgebra of {{\mathfrak g}}. We thus have the short exact sequence

\displaystyle {\mathfrak g} = Z({\mathfrak g}) . \hbox{ad} {\mathfrak g} \ \ \ \ \ (6)

which, among other things, shows that every abstract Lie algebra is a central extension of a concrete Lie algebra (which can serve as a cheap substitute for Ado’s theorem mentioned earlier).
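
For instance (a small numpy sketch of my own), for the Heisenberg algebra {{\mathfrak n}} of strictly upper triangular {3 \times 3} matrices, one can verify numerically that {\hbox{ad}} is a homomorphism and that its kernel {Z({\mathfrak n})} is the one-dimensional span of the matrix unit {E_{02}}, so that (6) expresses the three-dimensional {{\mathfrak n}} as a central extension of the two-dimensional concrete algebra {\hbox{ad} {\mathfrak n}} by {{\bf C}}:

    import numpy as np

    # The Heisenberg algebra n, with basis (E01, E02, E12).
    E01, E02, E12 = (np.zeros((3, 3)) for _ in range(3))
    E01[0, 1] = E02[0, 2] = E12[1, 2] = 1.0
    basis = [E01, E02, E12]

    def coords(m):                        # coordinates in the basis above
        return np.array([m[0, 1], m[0, 2], m[1, 2]])

    def ad(x):                            # matrix of y -> [x, y] in that basis
        return np.column_stack([coords(x @ b - b @ x) for b in basis])

    x, y = 2 * E01 + E12, E01 - 3 * E02
    comm = x @ y - y @ x
    assert np.allclose(ad(comm), ad(x) @ ad(y) - ad(y) @ ad(x))  # homomorphism

    M = np.column_stack([ad(b).ravel() for b in basis])  # the map x -> ad(x)
    print("dim Z(n) =", 3 - np.linalg.matrix_rank(M))    # 1, the span of E02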

For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra {{\mathfrak g}} is simple if it is non-abelian and has no ideals other than {0} and {{\mathfrak g}}; thus simple Lie algebras cannot be factored {{\mathfrak g} = {\mathfrak h} . {\mathfrak k}} into strictly smaller algebras {{\mathfrak h},{\mathfrak k}}. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem:

Theorem 1 (Equivalent definitions of semisimplicity) Let {{\mathfrak g}} be a Lie algebra. Then the following are equivalent:

  • (i) {{\mathfrak g}} does not contain any non-trivial solvable ideal.
  • (ii) {{\mathfrak g}} does not contain any non-trivial abelian ideal.
  • (iii) The Killing form {K: {\mathfrak g} \times {\mathfrak g} \rightarrow {\bf C}}, defined as the bilinear form {K(x,y) := \hbox{tr}_{\mathfrak g}( (\hbox{ad} x) (\hbox{ad} y) )}, is non-degenerate on {{\mathfrak g}}.
  • (iv) {{\mathfrak g}} is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.

We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 32), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
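
As a concrete illustration of criterion (iii) (again a quick numpy sketch of my own), one can compute the Killing form of {\mathfrak{sl}_2} in the standard basis {(h,e,f)}, with brackets {[h,e]=2e}, {[h,f]=-2f}, {[e,f]=h}, and confirm that it is non-degenerate, consistent with the simplicity of {\mathfrak{sl}_2}:

    import numpy as np

    h = np.diag([1.0, -1.0])
    e = np.array([[0.0, 1.0], [0.0, 0.0]])
    f = np.array([[0.0, 0.0], [1.0, 0.0]])
    basis = [h, e, f]

    def coords(m):                        # m = a*h + b*e + c*f for traceless m
        return np.array([m[0, 0], m[0, 1], m[1, 0]])

    def ad(x):
        return np.column_stack([coords(x @ b - b @ x) for b in basis])

    # Killing form K(x,y) = tr(ad x ad y), as a Gram matrix in the basis (h,e,f)
    K = np.array([[np.trace(ad(x) @ ad(y)) for y in basis] for x in basis])
    print(K)                                                # [[8,0,0],[0,0,4],[0,4,0]]
    print("non-degenerate:", abs(np.linalg.det(K)) > 1e-9)  # True (det = -128)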

If {{\mathfrak h}} and {{\mathfrak k}} are solvable ideals of a Lie algebra {{\mathfrak g}}, then it is not difficult to see that the vector sum {{\mathfrak h}+{\mathfrak k}} is also a solvable ideal (because on quotienting by {{\mathfrak h}} we see that the derived series of {{\mathfrak h}+{\mathfrak k}} must eventually fall inside {{\mathfrak h}}, and thence must eventually become trivial by the solvability of {{\mathfrak h}}). As our Lie algebras are finite dimensional, we conclude that {{\mathfrak g}} has a unique maximal solvable ideal, known as the radical {\hbox{rad} {\mathfrak g}} of {{\mathfrak g}}. The quotient {{\mathfrak g}/\hbox{rad} {\mathfrak g}} is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition

\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} . ({\mathfrak g} / \hbox{rad} {\mathfrak g})

expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra {{\mathfrak g}/\hbox{rad}{\mathfrak g}} by a solvable algebra {\hbox{rad} {\mathfrak g}} (and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension

\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} : ({\mathfrak g} / \hbox{rad} {\mathfrak g})

although we will not need or prove this result here.

In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra {{\bf C}}. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over {{\bf C}}, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras is completely classified:

Theorem 2 (Classification of simple Lie algebras) Up to isomorphism, every simple Lie algebra is of one of the following forms:

  • {A_n = \mathfrak{sl}_{n+1}} for some {n \geq 1}.
  • {B_n = \mathfrak{so}_{2n+1}} for some {n \geq 2}.
  • {C_n = \mathfrak{sp}_{2n}} for some {n \geq 3}.
  • {D_n = \mathfrak{so}_{2n}} for some {n \geq 4}.
  • {E_6, E_7}, or {E_8}.
  • {F_4}.
  • {G_2}.

(The precise definition of the classical Lie algebras {A_n,B_n,C_n,D_n} and the exceptional Lie algebras {E_6,E_7,E_8,F_4,G_2} will be recalled later.)

(One can extend the families {A_n,B_n,C_n,D_n} of classical Lie algebras a little bit to smaller values of {n}, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)

This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals {{\bf R}}), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook on Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.


The fundamental notions of calculus, namely differentiation and integration, are often viewed as being the quintessential concepts in mathematical analysis, as their standard definitions involve the concept of a limit. However, it is possible to capture most of the essence of these notions by purely algebraic means (almost completely avoiding the use of limits, Riemann sums, and similar devices), which turns out to be useful when trying to generalise these concepts to more abstract situations in which it becomes convenient to permit the underlying number systems involved to be something other than the real or complex numbers, even if this makes many standard analysis constructions unavailable. For instance, the algebraic notion of a derivation often serves as a substitute for the analytic notion of a derivative in such cases, by abstracting out the key algebraic properties of differentiation, namely linearity and the Leibniz rule (also known as the product rule).

Abstract algebraic analogues of integration are less well known, but can still be developed. To motivate such an abstraction, consider the integration functional {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} from the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of complex-valued Schwartz functions {f: {\bf R} \rightarrow {\bf C}} to the complex numbers, defined by

\displaystyle  I(f) := \int_{\bf R} f(x)\ dx

where the integration on the right is the usual Lebesgue integral (or improper Riemann integral) from analysis. This functional obeys two obvious algebraic properties. Firstly, it is linear over {{\bf C}}, thus

\displaystyle  I(cf) = c I(f) \ \ \ \ \ (1)

and

\displaystyle  I(f+g) = I(f) + I(g) \ \ \ \ \ (2)

for all {f,g \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and {c \in {\bf C}}. Secondly, it is translation invariant, thus

\displaystyle  I(\tau_h f) = I(f) \ \ \ \ \ (3)

for all {h \in {\bf R}}, where {\tau_h f(x) := f(x-h)} is the translation of {f} by {h}. Motivated by the uniqueness theory of Haar measure, one might expect that these two axioms already uniquely determine {I} after one sets a normalisation, for instance by requiring that

\displaystyle  I( x \mapsto e^{-\pi x^2} ) = 1. \ \ \ \ \ (4)

This is not quite true as stated (one can modify the proof of the Hahn-Banach theorem, after first applying a Fourier transform, to create pathological translation-invariant linear functionals on {{\mathcal S}({\bf R} \rightarrow {\bf C})} that are not multiples of the standard integration functional), but if one adds a mild analytical axiom, such as continuity of {I} (using the usual Schwartz topology on {{\mathcal S}({\bf R} \rightarrow {\bf C})}), then the above axioms are enough to uniquely pin down the notion of integration. Indeed, if {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} is a continuous linear functional that is translation invariant, then from the linearity and translation invariance axioms one has

\displaystyle  I( \frac{\tau_h f - f}{h} ) = 0

for all {f \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and non-zero reals {h}. If {f} is Schwartz, then as {h \rightarrow 0}, one can verify that the Newton quotients {\frac{\tau_h f - f}{h}} converge in the Schwartz topology to {-f'} (recalling the convention {\tau_h f(x) = f(x-h)}), so by the continuity axiom and linearity one has

\displaystyle  I(f') = 0.

Next, note that any Schwartz function of integral zero has an antiderivative which is also Schwartz, and so {I} annihilates all zero-integral Schwartz functions, and thus must be a scalar multiple of the usual integration functional. Using the normalisation (4), we see that {I} must therefore be the usual integration functional, giving the claimed uniqueness.

Motivated by the above discussion, we can define the notion of an abstract integration functional {I: X \rightarrow R} taking values in some vector space {R}, and applied to inputs {f} in some other vector space {X} that enjoys a linear action {h \mapsto \tau_h} (the “translation action”) of some group {V}, as being a functional which is both linear and translation invariant, thus one has the axioms (1), (2), (3) for all {f,g \in X}, scalars {c}, and {h \in V}. The previous discussion then considered the special case when {R = {\bf C}}, {X = {\mathcal S}({\bf R} \rightarrow {\bf C})}, {V = {\bf R}}, and {\tau} was the usual translation action.

Once we have performed this abstraction, we can now present analogues of classical integration which bear very little analytic resemblance to the classical concept, but which still have much of the algebraic structure of integration. Consider for instance the situation in which we keep the complex range {R = {\bf C}}, the translation group {V = {\bf R}}, and the usual translation action {h \mapsto \tau_h}, but we replace the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of Schwartz functions by the space {Poly_{\leq d}({\bf R} \rightarrow {\bf C})} of polynomials {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} of degree at most {d} with complex coefficients, where {d} is a fixed natural number; note that this space is translation invariant, so it makes sense to talk about an abstract integration functional {I: Poly_{\leq d}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}}. Of course, one cannot apply traditional integration concepts to non-zero polynomials, as they are not absolutely integrable. But one can repeat the previous arguments to show that any abstract integration functional must annihilate derivatives of polynomials of degree at most {d}:

\displaystyle  I(f') = 0 \hbox{ for all } f \in Poly_{\leq d}({\bf R} \rightarrow {\bf C}). \ \ \ \ \ (5)

Clearly, every polynomial of degree at most {d-1} is thus annihilated by {I}, which makes {I} a scalar multiple of the functional that extracts the top coefficient {a_d} of a polynomial, thus if one sets a normalisation

\displaystyle  I( x \mapsto x^d ) = c

for some constant {c}, then one has

\displaystyle  I( x \mapsto a_0 + a_1 x + \ldots + a_d x^d ) = c a_d \ \ \ \ \ (6)

for any polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d}. So we see that up to a normalising constant, the operation of extracting the top order coefficient of a polynomial of fixed degree serves as the analogue of integration. In particular, despite the fact that integration is supposed to be the “opposite” of differentiation (as indicated for instance by (5)), we see in this case that integration is basically ({d}-fold) differentiation; indeed, compare (6) with the identity

\displaystyle  (\frac{d}{dx})^d ( a_0 + a_1 x + \ldots + a_d x^d ) = d! a_d.

In particular, we see that, in contrast to the usual Lebesgue integral, the integration functional (6) can be localised to an arbitrary location: one only needs to know the germ of the polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} at a single point {x_0} in order to determine the value of the functional (6). This localisation property may initially seem at odds with the translation invariance, but the two can be reconciled thanks to the extremely rigid nature of the class {Poly_{\leq d}({\bf R} \rightarrow {\bf C})}, in contrast to the Schwartz class {{\mathcal S}({\bf R} \rightarrow {\bf C})} which admits bump functions and so can generate local phenomena that can only be detected in small regions of the underlying spatial domain, and which therefore forces any translation-invariant integration functional on such function classes to measure the function at every single point in space.
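
These properties of the functional (6) are easy to confirm symbolically. Here is a small sympy sketch of my own, taking the normalisation {c=1} (so that {I} is {d}-fold differentiation divided by {d!}) and {d=4}:

    import sympy as sp

    x, h = sp.symbols('x h')
    d = 4

    def I(f):                      # (6) with c = 1: extract the top coefficient
        return sp.diff(f, x, d) / sp.factorial(d)

    f = 3*x**4 - 2*x**3 + x - 7
    assert sp.simplify(I(f.subs(x, x - h)) - I(f)) == 0  # translation invariance
    g = 5*x**4 + x**2
    assert I(sp.diff(g, x)) == 0   # derivatives are annihilated, as in (5)
    print(I(f))                    # 3, the top coefficient a_4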

The reversal of the relationship between integration and differentiation is also reflected in the fact that the abstract integration operation on polynomials interacts with the scaling operation {\delta_\lambda f(x) := f(x/\lambda)} in essentially the opposite way from the classical integration operation. Indeed, for classical integration on {{\bf R}^d}, one has

\displaystyle  \int_{{\bf R}^d} f(x/\lambda)\ dx = \lambda^d \int_{{\bf R}^d} f(x)\ dx

for Schwartz functions {f \in {\mathcal S}({\bf R}^d \rightarrow {\bf C})}, and so in this case the integration functional {I(f) := \int_{{\bf R}^d} f(x)\ dx} obeys the scaling law

\displaystyle  I( \delta_\lambda f ) = \lambda^d I(f).

In contrast, the abstract integration operation defined in (6) obeys the opposite scaling law

\displaystyle  I( \delta_\lambda f ) = \lambda^{-d} I(f). \ \ \ \ \ (7)

Remark 1 One way to interpret what is going on is to view the integration operation (6) as a renormalised version of integration. A polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} is, in general, not absolutely integrable, and the partial integrals

\displaystyle  \int_0^N a_0 + a_1 x + \ldots + a_d x^d\ dx

diverge as {N \rightarrow \infty}. But if one renormalises these integrals by the factor {\frac{1}{N^{d+1}}}, then one recovers convergence,

\displaystyle  \lim_{N \rightarrow \infty} \frac{1}{N^{d+1}} \int_0^N a_0 + a_1 x + \ldots + a_d x^d\ dx = \frac{1}{d+1} a_d

thus giving an interpretation of (6) as a renormalised classical integral, with the renormalisation being responsible for the unusual scaling relationship in (7). However, this interpretation is a little artificial, and it seems that it is best to view functionals such as (6) from an abstract algebraic perspective, rather than to try to force an analytic interpretation on them.
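
For what it is worth, the renormalised limit above is easily confirmed symbolically; here is a quick sympy check of my own with {d=4}:

    import sympy as sp

    x, N = sp.symbols('x N', positive=True)
    a = sp.symbols('a0:5')                    # coefficients a_0, ..., a_4
    p = sum(a[k] * x**k for k in range(5))
    expr = sp.integrate(p, (x, 0, N)) / N**5  # renormalised partial integral
    print(sp.limit(expr, N, sp.oo))           # a4/5, i.e. a_d/(d+1) for d = 4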

Now we return to the classical Lebesgue integral

\displaystyle  I(f) := \int_{\bf R} f(x)\ dx. \ \ \ \ \ (8)

As noted earlier, this integration functional has a translation invariance associated to translations along the real line {{\bf R}}, as well as a dilation invariance by real dilation parameters {\lambda>0}. However, if we refine the class {{\mathcal S}({\bf R} \rightarrow {\bf C})} of functions somewhat, we can obtain a stronger family of invariances, in which we allow complex translations and dilations. More precisely, let {\mathcal{SE}({\bf C} \rightarrow {\bf C})} denote the space of all functions {f: {\bf C} \rightarrow {\bf C}} which are entire (or equivalently, are given by a Taylor series with an infinite radius of convergence around the origin) and also admit rapid decay in a sectorial neighbourhood of the real line, or more precisely there exists an {\epsilon>0} such that for every {A > 0} there exists {C_A > 0} such that one has the bound

\displaystyle  |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z)| \leq A + \epsilon |\hbox{Re}(z)|}. For want of a better name, we shall call elements of this space Schwartz entire functions. This is clearly a complex vector space. Typical examples of Schwartz entire functions are the complex gaussians

\displaystyle  f(z) := e^{-\pi (az^2 + 2bz + c)}

where {a,b,c} are complex numbers with {\hbox{Re}(a) > 0}. From the Cauchy integral formula (and its derivatives) we see that if {f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, then the restriction of {f} to the real line lies in {{\mathcal S}({\bf R} \rightarrow {\bf C})}; conversely, from analytic continuation we see that every function in {{\mathcal S}({\bf R} \rightarrow {\bf C})} has at most one extension in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}. Thus one can identify {\mathcal{SE}({\bf C} \rightarrow {\bf C})} with a subspace of {{\mathcal S}({\bf R} \rightarrow {\bf C})}, and in particular the integration functional (8) is inherited by {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, and by abuse of notation we denote the resulting functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} as {I} also. Note, in analogy with the situation with polynomials, that this abstract integration functional is somewhat localised; one only needs to evaluate the function {f} on the real line, rather than the entire complex plane, in order to compute {I(f)}. This is consistent with the rigid nature of Schwartz entire functions, as one can uniquely recover the entire function from its values on the real line by analytic continuation.

Of course, the functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} remains translation invariant with respect to real translation:

\displaystyle  I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf R}.

However, thanks to contour shifting, we now also have translation invariance with respect to complex translation:

\displaystyle  I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf C},

where of course we continue to define the translation operator {\tau_h} for complex {h} by the usual formula {\tau_h f(x) := f(x-h)}. In a similar vein, we also have the scaling law

\displaystyle  I(\delta_\lambda f) = \lambda I(f)

for any {f \in \mathcal{SE}({\bf C} \rightarrow {\bf C})}, if {\lambda} is a complex number sufficiently close to {1} (where “sufficiently close” depends on {f}, and more precisely depends on the sectoral aperture parameter {\epsilon} associated to {f}); again, one can verify that {\delta_\lambda f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})} for {\lambda} sufficiently close to {1}. These invariances (which relocalise the integration functional {I} onto other contours than the real line {{\bf R}}) are very useful for computing integrals, and in particular for computing gaussian integrals. For instance, the complex translation invariance tells us (after shifting by {b/a}) that

\displaystyle  I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi a z^2} )

when {a,b,c \in {\bf C}} with {\hbox{Re}(a) > 0}, and then an application of the complex scaling law (and a continuity argument, observing that there is a compact path connecting {a} to {1} in the right half plane) gives

\displaystyle  I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi z^2} )

using the branch of {a^{-1/2}} on the right half-plane for which {1^{-1/2} = 1}. Using the normalisation (4) we thus have

\displaystyle  I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)}

giving the usual gaussian integral formula

\displaystyle  \int_{\bf R} e^{-\pi (ax^2 + 2bx + c)}\ dx = a^{-1/2} e^{-\pi (c-b^2/a)}. \ \ \ \ \ (9)
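
As a sanity check (a numerical sketch of my own, with arbitrarily chosen parameters), one can confirm (9) by comparing a Riemann sum over a long real interval against the closed form; since the integrand decays rapidly, the Riemann sum is essentially exact:

    import numpy as np

    a, b, c = 1.5 + 0.7j, 0.3 - 0.2j, 0.1 + 0.4j     # Re(a) > 0
    x = np.linspace(-20, 20, 200001)
    fx = np.exp(-np.pi * (a * x**2 + 2 * b * x + c))
    lhs = np.sum(fx) * (x[1] - x[0])
    rhs = a**-0.5 * np.exp(-np.pi * (c - b**2 / a))  # principal branch of a^{-1/2}
    print(abs(lhs - rhs))                            # comparable to machine precision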

This is a basic illustration of the power that a large symmetry group (in this case, the complex homothety group) can bring to bear on the task of computing integrals.

One can extend this sort of analysis to higher dimensions. For any natural number {n \geq 1}, let {\mathcal{SE}({\bf C}^n \rightarrow {\bf C})} denote the space of all functions {f: {\bf C}^n \rightarrow {\bf C}} which are jointly entire in the sense that {f(z_1,\ldots,z_n)} can be expressed as a Taylor series in {z_1,\ldots,z_n} which is absolutely convergent for all choices of {z_1,\ldots,z_n}, and such that there exists an {\epsilon > 0} such that for any {A>0} there is {C_A>0} for which one has the bound

\displaystyle  |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z_j)| \leq A + \epsilon |\hbox{Re}(z_j)|} for all {1 \leq j \leq n}, where {z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}} and {|z| := (|z_1|^2+\ldots+|z_n|^2)^{1/2}}. Again, we call such functions Schwartz entire functions; a typical example is the function

\displaystyle  f(z) := e^{-\pi (z^T A z + 2b^T z + c)}

where {A} is an {n \times n} complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number. We can then define an abstract integration functional {I: \mathcal{SE}({\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by integration on the real slice {{\bf R}^n}:

\displaystyle  I(f) := \int_{{\bf R}^n} f(x)\ dx

where {dx} is the usual Lebesgue measure on {{\bf R}^n}. By contour shifting in each of the {n} variables {z_1,\ldots,z_n} separately, we see that {I} is invariant with respect to complex translations of each of the {z_j} variables, and is thus invariant under translating the joint variable {z} by {{\bf C}^n}. One can also verify the scaling law

\displaystyle  I(\delta_A f) = \hbox{det}(A) I(f)

for {n \times n} complex matrices {A} sufficiently close to the identity, where {\delta_A f(z) := f(A^{-1} z)}. This can be seen for shear transformations {A} by Fubini’s theorem and the aforementioned translation invariance, while for diagonal transformations near the identity this can be seen from {n} applications of the one-dimensional scaling law, and the general case then follows by composition. Among other things, these laws then easily lead to the higher-dimensional generalisation

\displaystyle  \int_{{\bf R}^n} e^{-\pi (x^T A x + 2 b^T x + c)}\ dx = \hbox{det}(A)^{-1/2} e^{-\pi (c-b^T A^{-1} b)} \ \ \ \ \ (10)

whenever {A} is a complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number, basically by repeating the one-dimensional argument sketched earlier. Here, we choose the branch of {\hbox{det}(A)^{-1/2}} for all matrices {A} in the indicated class for which {\hbox{det}(1)^{-1/2} = 1}.

Now we turn to an integration functional suitable for computing complex gaussian integrals such as

\displaystyle  \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z}, \ \ \ \ \ (11)

where {z} is now a complex variable

\displaystyle  z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix},

{z^\dagger} is the adjoint

\displaystyle  z^\dagger := (\overline{z_1},\ldots, \overline{z_n}),

{A} is a complex {n \times n} matrix with positive definite Hermitian part, {b, \tilde b} are column vectors in {{\bf C}^n}, {c} is a complex number, and {dz d\overline{z} = \prod_{j=1}^n 2 d\hbox{Re}(z_j) d\hbox{Im}(z_j)} is {2^n} times Lebesgue measure on {{\bf C}^n}. (The factors of two here turn out to be a natural normalisation, but they can be ignored on a first reading.) As we shall see later, such integrals are relevant when performing computations on the Gaussian Unitary Ensemble (GUE) in random matrix theory. Note that the integrand here is not complex analytic due to the presence of the complex conjugates. However, this can be dealt with by the trick of replacing the complex conjugate {\overline{z}} by a variable {z^*} which is formally conjugate to {z}, but which is allowed to vary independently of {z}. More precisely, let {\mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C})} be the space of all functions {f: (z,z^*) \mapsto f(z,z^*)} of two independent {n}-tuples

\displaystyle  z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, z^* = \begin{pmatrix} z_1^* \\ \vdots \\ z_n^* \end{pmatrix}

of complex variables, which is jointly entire in all {2n} variables (in the sense defined previously, i.e. there is a joint Taylor series that is absolutely convergent for all independent choices of {z, z^* \in {\bf C}^n}), and such that there is an {\epsilon>0} such that for every {A>0} there is {C_A>0} such that one has the bound

\displaystyle  |f(z,z^*)| \leq C_A (1 + |z|)^{-A}

whenever {|z^* - \overline{z}| \leq A + \epsilon |z|}. We will call such functions Schwartz analytic. Note that the integrand in (11) is Schwartz analytic when {A} has positive definite Hermitian part, if we reinterpret {z^\dagger} as the transpose of {z^*} rather than as the adjoint of {z} in order to make the integrand entire in {z} and {z^*}. We can then define an abstract integration functional {I: \mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by the formula

\displaystyle  I(f) := \int_{{\bf C}^n} f(z,\overline{z})\ dz d\overline{z}, \ \ \ \ \ (12)

thus {I} can be localised to the slice {\{ (z,\overline{z}): z \in {\bf C}^n\}} of {{\bf C}^n \times {\bf C}^n} (though, as with previous functionals, one can use contour shifting to relocalise {I} to other slices also.) One can also write this integral as

\displaystyle  I(f) = 2^n \int_{{\bf R}^n \times {\bf R}^n} f(x+iy, x-iy)\ dx dy

and note that the integrand here is a Schwartz entire function on {{\bf C}^n \times {\bf C}^n}, thus linking the Schwartz analytic integral with the Schwartz entire integral. Using this connection, one can verify that this functional {I} is invariant with respect to translating {z} and {z^*} by independent shifts in {{\bf C}^n} (thus giving a {{\bf C}^n \times {\bf C}^n} translation symmetry), and one also has the independent dilation symmetry

\displaystyle  I(\delta_{A,B} f) = \hbox{det}(A) \hbox{det}(B) I(f)

for {n \times n} complex matrices {A,B} that are sufficiently close to the identity, where {\delta_{A,B} f(z,z^*) := f(A^{-1} z, B^{-1} z^*)}. Arguing as before, we can then compute (11) as

\displaystyle  \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z} = \hbox{det}(A)^{-1} e^{-2\pi (c - b^\dagger A^{-1} \tilde b)}. \ \ \ \ \ (13)

In particular, this gives an integral representation for the determinant-reciprocal {\hbox{det}(A)^{-1}} of a complex {n \times n} matrix with positive definite Hermitian part, in terms of gaussian expressions in which {A} only appears linearly in the exponential:

\displaystyle  \hbox{det}(A)^{-1} = \int_{{\bf C}^n} e^{-2\pi z^\dagger A z}\ dz d\overline{z}.
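
As a quick numerical check of the {n=1} case of this representation (my own sketch, with an arbitrary choice of {a}): for a complex number {a} with {\hbox{Re}(a) > 0}, the integral of {e^{-2\pi \overline{z} a z}} against {2\ d\hbox{Re}(z)\ d\hbox{Im}(z)} should equal {1/a}:

    import numpy as np

    a = 0.8 + 0.5j
    t = np.linspace(-4, 4, 1001)
    X, Y = np.meshgrid(t, t)        # real and imaginary parts of z
    dA = (t[1] - t[0])**2
    lhs = 2 * np.sum(np.exp(-2 * np.pi * a * (X**2 + Y**2))) * dA
    print(lhs, 1 / a)               # both approximately 0.8989 - 0.5618j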

This formula is then convenient for computing statistics such as

\displaystyle  \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1}

for random matrices {W_n} drawn from the Gaussian Unitary Ensemble (GUE), and some choice of spectral parameter {E+i\eta} with {\eta>0}; we review this computation later in this post. By the trick of matrix differentiation of the determinant (as reviewed in this recent blog post), one can also use this method to compute matrix-valued statistics such as

\displaystyle  \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1} (W_n-E-i\eta)^{-1}.

However, if one restricts attention to classical integrals over real or complex (and in particular, commuting or bosonic) variables, it does not seem possible to easily eradicate the negative determinant factors in such calculations, which is an obstacle when computing many statistics of interest in random matrix theory, such as the expected Stieltjes transform

\displaystyle  \mathop{\bf E} \frac{1}{n} \hbox{tr} (W_n-E-i\eta)^{-1},

which is the Stieltjes transform of the density of states. However, it turns out (as I learned recently from Peter Sarnak and Tom Spencer) that it is possible to cancel out these negative determinant factors by balancing the bosonic gaussian integrals with an equal number of fermionic gaussian integrals, in which one integrates over a family of anticommuting variables. These fermionic integrals are closer in spirit to the polynomial integral (6) than to Lebesgue type integrals, and in particular obey a scaling law which is inverse to the Lebesgue scaling (in particular, a linear change of fermionic variables {\zeta \mapsto A \zeta} ends up transforming a fermionic integral by {\hbox{det}(A)} rather than {\hbox{det}(A)^{-1}}), which conveniently cancels out the reciprocal determinants in the previous calculations. Furthermore, one can combine the bosonic and fermionic integrals into a unified integration concept, known as the Berezin integral (or Grassmann integral), in which one integrates functions of supervectors (vectors with both bosonic and fermionic components), and is of particular importance in the theory of supersymmetry in physics. (The prefix “super” in physics means, roughly speaking, that the object or concept that the prefix is attached to contains both bosonic and fermionic aspects.) When one applies this unified integration concept to gaussians, this can lead to quite compact and efficient calculations (provided that one is willing to work with “super”-analogues of various concepts in classical linear algebra, such as the supertrace or superdeterminant).

Abstract integrals of the flavour of (6) arose in quantum field theory, when physicists sought to formally compute integrals of the form

\displaystyle  \int F( x_1, \ldots, x_n, \xi_1, \ldots, \xi_m )\ dx_1 \ldots dx_n d\xi_1 \ldots d\xi_m \ \ \ \ \ (14)

where {x_1,\ldots,x_n} are familiar commuting (or bosonic) variables (which, in particular, can often be localised to be scalar variables taking values in {{\bf R}} or {{\bf C}}), while {\xi_1,\ldots,\xi_m} were more exotic anticommuting (or fermionic) variables, taking values in some vector space of fermions. (As we shall see shortly, one can formalise these concepts by working in a supercommutative algebra.) The integrand {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} was a formally analytic function of {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}, in that it could be expanded as a (formal, noncommutative) power series in the variables {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}. For functions {F(x_1,\ldots,x_n)} that depend only on bosonic variables, it is certainly possible for such analytic functions to be in the Schwartz class and thus fall under the scope of the classical integral, as discussed previously. However, functions {F(\xi_1,\ldots,\xi_m)} that depend on fermionic variables {\xi_1,\ldots,\xi_m} behave rather differently. Indeed, a fermionic variable {\xi} must anticommute with itself, so that {\xi^2 = 0}. In particular, any power series in {\xi} terminates after the linear term in {\xi}, so that a function {F(\xi)} can only be analytic in {\xi} if it is a polynomial of degree at most {1} in {\xi}; more generally, an analytic function {F(\xi_1,\ldots,\xi_m)} of {m} fermionic variables {\xi_1,\ldots,\xi_m} must be a polynomial of degree at most {m}, and an analytic function {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} of {n} bosonic and {m} fermionic variables can be Schwartz in the bosonic variables but will be polynomial in the fermionic variables. As such, to interpret the integral (14), one can use classical (Lebesgue) integration (or the variants discussed above for integrating Schwartz entire or Schwartz analytic functions) for the bosonic variables, but must use abstract integrals such as (6) for the fermionic variables, leading to the concept of Berezin integration mentioned earlier.

In this post I would like to set out some of the basic algebraic formalism of Berezin integration, particularly with regards to integration of gaussian-type expressions, and then show how this formalism can be used to perform computations involving GUE (for instance, one can compute the density of states of GUE by this machinery without recourse to the theory of orthogonal polynomials). The use of supersymmetric gaussian integrals to analyse ensembles such as GUE appears in the work of Efetov (and was also proposed in the slightly earlier works of Parisi-Sourlas and McKane, with a related approach also appearing in the work of Wegner); the material here is adapted from this survey of Mirlin, as well as the later papers of Disertori-Pinson-Spencer and of Disertori.


The determinant {\det(A)} of a square matrix {A} obeys a large number of important identities, the most basic of which is the multiplicativity property

\displaystyle  \det(AB) = \det(A) \det(B) \ \ \ \ \ (1)

whenever {A,B} are square matrices of the same dimension. This identity then generates many other important identities. For instance, if {A} is an {n \times m} matrix and {B} is an {m \times n} matrix, then by applying the previous identity to equate the determinants of {\begin{pmatrix} 1 & -A \\ B & 1 \end{pmatrix} \begin{pmatrix} 1 & A \\ 0 & 1 \end{pmatrix}} and {\begin{pmatrix} 1 & A \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & -A \\ B & 1 \end{pmatrix}} (where we will adopt the convention that {1} denotes an identity matrix of whatever dimension is needed to make sense of the expressions being computed, and similarly for {0}) we obtain the Sylvester determinant identity

\displaystyle  \det( 1 + AB ) = \det( 1 + BA ). \ \ \ \ \ (2)

This identity, which relates an {n \times n} determinant with an {m \times m} determinant, is very useful in random matrix theory (a point emphasised in particular by Deift), particularly in regimes in which {m} is much smaller than {n}.
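
As with all the identities in this post, (2) is easy to test numerically; here is a throwaway numpy check of my own with random rectangular factors:

    import numpy as np

    rng = np.random.default_rng(6)
    n, m = 5, 2
    A, B = rng.standard_normal((n, m)), rng.standard_normal((m, n))
    lhs = np.linalg.det(np.eye(n) + A @ B)   # an n x n determinant...
    rhs = np.linalg.det(np.eye(m) + B @ A)   # ...equals an m x m determinant
    assert np.isclose(lhs, rhs)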

Another identity generated from (1) arises when trying to compute the determinant of a {(n+m) \times (n+m)} block matrix

\displaystyle  \begin{pmatrix} A & B \\ C & D \end{pmatrix}

where {A} is an {n \times n} matrix, {B} is an {n \times m} matrix, {C} is an {m \times n} matrix, and {D} is an {m \times m} matrix. If {A} is invertible, then we can manipulate this matrix via block Gaussian elimination as

\displaystyle  \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & A^{-1} B \\ C & D \end{pmatrix}

\displaystyle  = \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ C & 1 \end{pmatrix} \begin{pmatrix} 1 & A^{-1} B \\ 0 & D - C A^{-1} B \end{pmatrix}

and on taking determinants using (1) we obtain the Schur determinant identity

\displaystyle  \det \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det(A) \det( D - C A^{-1} B ) \ \ \ \ \ (3)

relating the determinant of a block matrix with the determinant of the Schur complement {D-C A^{-1} B} of the upper left block {A}. This identity can be viewed as the correct way to generalise the {2 \times 2} determinant formula

\displaystyle  \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad-bc = a ( d - c a^{-1} b).

It is also possible to use determinant identities to deduce other matrix identities that do not involve the determinant, by the technique of matrix differentiation (or equivalently, matrix linearisation). The key observation is that near the identity, the determinant behaves like the trace, or more precisely one has

\displaystyle  \det( 1 + \epsilon A ) = 1 + \epsilon \hbox{tr}(A) + O(\epsilon^2) \ \ \ \ \ (4)

for any bounded square matrix {A} and infinitesimal {\epsilon}. (If one is uncomfortable with infinitesimals, one can interpret this sort of identity as an asymptotic as {\epsilon\rightarrow 0}.) Combining this with (1) we see that for square matrices {A,B} of the same dimension with {A} invertible and {A^{-1}, B} bounded, one has

\displaystyle  \det( A + \epsilon B ) = \det(A) \det(1 + \epsilon A^{-1} B )

\displaystyle = \det(A) (1 + \epsilon \hbox{tr}( A^{-1} B ) + O(\epsilon^2) )

for infinitesimal {\epsilon}. To put it another way, if {A(t)} is a square matrix that depends in a differentiable fashion on a real parameter {t}, then

\displaystyle  \frac{d}{dt} \det(A(t)) = \det(A(t)) \hbox{tr}( A(t)^{-1} \frac{d}{dt} A(t) )

whenever {A(t)} is invertible. (Note that if one combines this identity with cofactor expansion, one recovers Cramer’s rule.)
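
This derivative formula is also easy to test numerically (a finite-difference sketch of my own, with {A(t) = A_0 + tB} for random {A_0,B}):

    import numpy as np

    rng = np.random.default_rng(1)
    A0, B = rng.standard_normal((2, 4, 4))
    det = lambda t: np.linalg.det(A0 + t * B)
    eps = 1e-6
    numeric = (det(eps) - det(-eps)) / (2 * eps)        # central difference
    exact = det(0.0) * np.trace(np.linalg.solve(A0, B)) # det(A) tr(A^{-1} A')
    print(abs(numeric - exact))   # small: the discretisation error is O(eps^2)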

Let us see some examples of this differentiation method. If we take the Sylvester identity (2) and multiply one of the rectangular matrices {A} by an infinitesimal {\epsilon}, we obtain

\displaystyle  \det( 1 + \epsilon A B ) = \det( 1 + \epsilon B A);

applying (4) and extracting the linear term in {\epsilon} (or equivalently, differentiating in {\epsilon} and then setting {\epsilon=0}) we conclude the cyclic property of trace:

\displaystyle  \hbox{tr}(AB) = \hbox{tr}(BA).

To manipulate derivatives and inverses, we begin with the Neumann series approximation

\displaystyle  (1 + \epsilon A)^{-1} = 1 - \epsilon A + O(\epsilon^2)

for bounded square {A} and infinitesimal {\epsilon}, which then leads to the more general approximation

\displaystyle  (A + \epsilon B)^{-1} = (1 + \epsilon A^{-1} B)^{-1} A^{-1}

\displaystyle  = A^{-1} - \epsilon A^{-1} B A^{-1} + O(\epsilon^2) \ \ \ \ \ (5)

for square matrices {A,B} of the same dimension with {B, A^{-1}} bounded. To put it another way, we have

\displaystyle  \frac{d}{dt} A(t)^{-1} = -A(t)^{-1} (\frac{d}{dt} A(t)) A(t)^{-1}

whenever {A(t)} depends in a differentiable manner on {t} and {A(t)} is invertible.

We can then differentiate (or linearise) the Schur identity (3) in a number of ways. For instance, if we replace the lower block {D} by {D + \epsilon H} for some test {m \times m} matrix {H}, then by (4), the left-hand side of (3) becomes (assuming the invertibility of the block matrix)

\displaystyle  (\det \begin{pmatrix} A & B \\ C & D \end{pmatrix}) (1 + \epsilon \hbox{tr} \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} \begin{pmatrix} 0 & 0 \\ 0 & H \end{pmatrix} + O(\epsilon^2) )

while the right-hand side becomes

\displaystyle  \det(A) \det(D-CA^{-1}B) (1 + \epsilon \hbox{tr}( (D-CA^{-1}B)^{-1} H ) + O(\epsilon^2) );

extracting the linear term in {\epsilon} (after dividing through by (3)), we conclude that

\displaystyle  \hbox{tr} (\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} \begin{pmatrix} 0 & 0 \\ 0 & H \end{pmatrix}) = \hbox{tr}( (D-CA^{-1}B)^{-1} H ).

As {H} was an arbitrary {m \times m} matrix, we conclude from duality that the lower right {m \times m} block of {\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}} is given by the inverse {(D-CA^{-1}B)^{-1}} of the Schur complement:

\displaystyle  \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} ?? & ?? \\ ?? & (D-CA^{-1}B)^{-1} \end{pmatrix}.
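
One can confirm this block inverse computation numerically (another throwaway numpy check of my own):

    import numpy as np

    rng = np.random.default_rng(2)
    n, m = 3, 2
    A, B = rng.standard_normal((n, n)), rng.standard_normal((n, m))
    C, D = rng.standard_normal((m, n)), rng.standard_normal((m, m))
    M = np.block([[A, B], [C, D]])
    schur = D - C @ np.linalg.solve(A, B)    # the Schur complement D - C A^{-1} B
    assert np.allclose(np.linalg.inv(M)[n:, n:], np.linalg.inv(schur))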

One can also compute the other components of this inverse in terms of the Schur complement {D-CA^{-1} B} by a similar method (although the formulae become more complicated). As a variant of this method, we can perturb the block matrix in (3) by an infinitesimal multiple of the identity matrix giving

\displaystyle  \det \begin{pmatrix} A+\epsilon & B \\ C & D+\epsilon \end{pmatrix} = \det(A+\epsilon) \det( D +\epsilon - C (A+\epsilon)^{-1} B ). \ \ \ \ \ (6)

By (4), the left-hand side is

\displaystyle  (\det \begin{pmatrix} A & B \\ C & D \end{pmatrix}) (1 + \epsilon \hbox{tr} \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} + O(\epsilon^2) ).

From (5), we have

\displaystyle  D + \epsilon - C (A+ \epsilon)^{-1} B = D - C A^{-1} B + \epsilon(1 + C A^{-2} B) + O(\epsilon^2)

and so from (4) the right-hand side of (6) is

\displaystyle  \det(A) \det(D-CA^{-1} B) \times

\displaystyle  \times ( 1 + \epsilon (\hbox{tr}(A^{-1}) + \hbox{tr}( (D-CA^{-1} B)^{-1} (1 + C A^{-2} B)) ) + O(\epsilon^2) );

extracting the linear component in {\epsilon}, we conclude the identity

\displaystyle  \hbox{tr} \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \hbox{tr}(A^{-1}) + \hbox{tr}( (D-CA^{-1} B)^{-1} (1 + C A^{-2} B)) \ \ \ \ \ (7)

which relates the trace of the inverse of a block matrix, with the trace of the inverse of one of its blocks. This particular identity turns out to be useful in random matrix theory; I hope to elaborate on this in a later post.
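
Again, (7) is easy to confirm numerically (my own quick check):

    import numpy as np

    rng = np.random.default_rng(3)
    n, m = 3, 2
    A, B = rng.standard_normal((n, n)), rng.standard_normal((n, m))
    C, D = rng.standard_normal((m, n)), rng.standard_normal((m, m))
    M = np.block([[A, B], [C, D]])
    Ainv = np.linalg.inv(A)
    S = D - C @ Ainv @ B                     # Schur complement
    lhs = np.trace(np.linalg.inv(M))
    rhs = np.trace(Ainv) + np.trace(
        np.linalg.inv(S) @ (np.eye(m) + C @ Ainv @ Ainv @ B))
    assert np.isclose(lhs, rhs)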

As a final example of this method, we can analyse low rank perturbations {A+BC} of a large ({n \times n}) matrix {A}, where {B} is an {n \times m} matrix and {C} is an {m \times n} matrix for some {m<n}. (This type of situation is also common in random matrix theory, for instance it arose in this previous paper of mine on outliers to the circular law.) If {A} is invertible, then from (1) and (2) one has the matrix determinant lemma

\displaystyle  \det( A + BC ) = \det(A) \det( 1 + A^{-1} BC) = \det(A) \det(1 + CA^{-1} B);

if one then perturbs {A} by an infinitesimal matrix {\epsilon H}, we have

\displaystyle  \det( A + BC + \epsilon H ) = \det(A + \epsilon H ) \det(1 + C(A+\epsilon H)^{-1} B).

Extracting the linear component in {\epsilon} as before, one soon arrives at

\displaystyle  \hbox{tr}( (A+BC)^{-1} H ) = \hbox{tr}( A^{-1} H ) - \hbox{tr}( (1 + C A^{-1} B)^{-1} C A^{-1} H A^{-1} B )

assuming that {A} and {A+BC} are both invertible; as {H} is arbitrary, we conclude (after using the cyclic property of trace) the Sherman-Morrison formula

\displaystyle  (A+BC)^{-1} = A^{-1} - A^{-1} B (1 + C A^{-1} B)^{-1} C A^{-1}

for the inverse of a low rank perturbation {A+BC} of a matrix {A}. While this identity can be easily verified by direct algebraic computation, it is somewhat difficult to discover this identity by such algebraic manipulation; thus we see that the “determinant first” approach to matrix identities can make it easier to find appropriate matrix identities (particularly those involving traces and/or inverses), even if the identities one is ultimately interested in do not involve determinants. (As differentiation typically makes an identity lengthier, but also more “linear” or “additive”, the determinant identity tends to be shorter (albeit more nonlinear and more multiplicative) than the differentiated identity, and can thus be slightly easier to derive.)
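
Here is a quick numerical confirmation of the Sherman-Morrison formula just derived (a numpy sketch of my own):

    import numpy as np

    rng = np.random.default_rng(4)
    n, m = 4, 2
    A = rng.standard_normal((n, n))
    B, C = rng.standard_normal((n, m)), rng.standard_normal((m, n))
    Ainv = np.linalg.inv(A)
    direct = np.linalg.inv(A + B @ C)
    formula = Ainv - Ainv @ B @ np.linalg.inv(np.eye(m) + C @ Ainv @ B) @ C @ Ainv
    assert np.allclose(direct, formula)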

Exercise 1 Use the “determinant first” approach to derive the Woodbury matrix identity (also known as the binomial inverse theorem)

\displaystyle  (A+BDC)^{-1} = A^{-1} - A^{-1} B (D^{-1} + CA^{-1} B)^{-1} C A^{-1}

where {A} is an {n \times n} matrix, {B} is an {n \times m} matrix, {C} is an {m \times n} matrix, and {D} is an {m \times m} matrix, assuming that {A}, {D} and {A+BDC} are all invertible.

Exercise 2 Let {A,B} be invertible {n \times n} matrices. Establish the identity

\displaystyle  \det(A + B) \det(A - B) = \det(B) \det( AB^{-1} A - B)

and differentiate this in {A} to deduce the identity

\displaystyle  (A+B)^{-1} + (A-B)^{-1} = 2 (A - BA^{-1} B)^{-1}

(assuming that all inverses exist); swapping the roles of {A} and {B} (noting that {(B-A)^{-1} = -(A-B)^{-1}}) and averaging the resulting two identities then gives

\displaystyle  (A+B)^{-1} = (A - BA^{-1} B)^{-1} + (B - AB^{-1} A)^{-1}.

Rotating {B} by {i} then gives

\displaystyle  (A+iB)^{-1} = (A + BA^{-1} B)^{-1} - i (B + AB^{-1} A)^{-1},

which is useful for inverting a matrix {A+iB} that has been split into a self-adjoint component {A} and a skew-adjoint component {iB}.
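
One can test this final identity numerically as well (my own sketch, with {A,B} random real symmetric matrices, so that {A} and {iB} are indeed the self-adjoint and skew-adjoint components of {A+iB}):

    import numpy as np

    rng = np.random.default_rng(5)
    n = 4
    A = rng.standard_normal((n, n)); A = A + A.T
    B = rng.standard_normal((n, n)); B = B + B.T
    inv = np.linalg.inv
    lhs = inv(A + 1j * B)
    rhs = inv(A + B @ inv(A) @ B) - 1j * inv(B + A @ inv(B) @ A)
    assert np.allclose(lhs, rhs)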
