You are currently browsing the monthly archive for October 2007.

This month I have been at the Institute for Advanced Study, participating in their semester program on additive combinatorics. Today I gave a talk on my forthcoming paper with Tim Austin on the property testing of graphs and hypergraphs (I hope to make a preprint available here soon). There has been an immense amount of progress on these topics recently, based in large part on the graph and hypergraph regularity lemmas; but we have discovered some surprising subtleties regarding these results, namely a distinction between undirected and directed graphs, between graphs and hypergraphs, between partite hypergraphs and non-partite hypergraphs, and between monotone hypergraph properties and hereditary ones.

For simplicity let us first work with (uncoloured, undirected, loop-free) graphs G = (V,E). In the subject of graph property testing, one is given a property ${\mathcal P}$ which any given graph G may or may not have. For example, ${\mathcal P}$ could be one of the following properties:

1. G is planar.
2. G is four-colourable.
3. G has a number of edges equal to a power of two.
4. G contains no triangles.
5. G is bipartite.
6. G is empty.
7. G is a complete bipartite graph.

We assume that the labeling of the graph is irrelevant. More precisely, we assume that whenever two graphs G, G’ are isomorphic, that G satisfies ${\mathcal P}$ if and only if G’ satisfies ${\mathcal P}$. For instance, all seven of the graph properties listed above are invariant under graph isomorphism.

We shall think of G as being very large (so $|V|$ is large) and dense (so $|E| \sim |V|^2$). We are interested in obtaining some sort of test that can answer the question “does G satisfy ${\mathcal P}$?” with reasonable speed and reasonable accuracy. By “reasonable speed”, we mean that we will only make a bounded number of queries about the graph, i.e. we only look at a bounded number k of distinct vertices in V (selected at random) and base our test purely on how these vertices are connected to each other in E. (We will always assume that the number of vertices in V is at least k.) By “reasonable accuracy”, we will mean that we specify in advance some error tolerance $\varepsilon > 0$ and require the following:

1. (No false negatives) If G indeed satisfies ${\mathcal P}$, then our test will always (correctly) accept G.
2. (Few false positives in the $\varepsilon$-far case) If G fails to satisfy ${\mathcal P}$, and is furthermore $\varepsilon$-far from satisfying ${\mathcal P}$ in the sense that one needs to add or remove at least $\varepsilon |V|^2$ edges in G before ${\mathcal P}$ can be satisfied, then our test will (correctly) reject G with probability at least $\varepsilon$.

When a test with the above properties exists for each given $\varepsilon > 0$ (with the number of queried vertices k being allowed to depend on $\varepsilon$), we say that the graph property ${\mathcal P}$ is testable with one-sided error. (The general notion of property testing was introduced by Rubinfeld and Sudan, and first studied for graph properties by Goldreich, Goldwasser, and Ron; see this web page of Goldreich for further references and discussion.) The rejection probability $\varepsilon$ is not very important in this definition, since if one wants to improve the success rate of the algorithm one can simply run independent trials of that algorithm (selecting fresh random vertices each time) in order to increase the chance that G is correctly rejected. However, it is intuitively clear that one must allow some probability of failure, since one is only inspecting a small portion of the graph and so cannot say with complete certainty whether the entire graph has the property ${\mathcal P}$ or not. For similar reasons, one cannot reasonably demand to have a low false positive rate for all graphs that fail to obey ${\mathcal P}$, since if the graph is only one edge modification away from obeying ${\mathcal P}$, this modification is extremely unlikely to be detected by only querying a small portion of the graph. This explains why we need to restrict attention to graphs that are $\varepsilon$-far from obeying ${\mathcal P}$.

An example should illustrate this definition. Consider for instance property 6 above (the property that G is empty). To test whether a graph is empty, one can perform the following obvious algorithm: take k vertices in G at random and check whether they have any edges at all between them. If they do, then the test of course rejects G as being non-empty, while if they don’t, the test accepts G as being empty. Clearly there are no false negatives in this test, and if k is large enough depending on $\varepsilon$ one can easily see (from the law of large numbers) that we will have few false positives if G is $\varepsilon$-far from being empty (i.e. if it has at least $\varepsilon |V|^2$ vertices). So the property of being empty is testable with one-sided error.

On the other hand, it is intuitively obvious that property 3 (having an number of edges equal to a power of 2) is not testable with one-sided error.

So it is reasonable to ask: what types of graph properties are testable with one-sided error, and which ones are not?

Today I’d like to discuss (part of) a cute and surprising theorem of Fritz John in the area of non-linear wave equations, and specifically for the equation

$\partial_{tt} u - \Delta u = |u|^p$ (1)

where $u: {\Bbb R} \times {\Bbb R}^3 \to {\Bbb R}$ is a scalar function of one time and three spatial dimensions.

The evolution of this type of non-linear wave equation can be viewed as a “race” between the dispersive tendency of the linear wave equation

$\partial_{tt} u - \Delta u = 0$ (2)

and the positive feedback tendencies of the nonlinear ODE

$\partial_{tt} u = |u|^p$. (3)

More precisely, solutions to (2) tend to decay in time as $t \to +\infty$, as can be seen from the presence of the $\frac{1}{t}$ term in the explicit formula

$u(t,x) = \frac{1}{4\pi t} \int_{|y-x|=t} \partial_t u(0,y)\ dS(y) + \partial_t[\frac{1}{4\pi t} \int_{|y-x|=t} u(0,y)\ dS(y)],$ (4)

for such solutions in terms of the initial position $u(0,y)$ and initial velocity $\partial_t u(0,y)$, where $t > 0$, $x \in {\Bbb R}^3$, and dS is the area element of the sphere $\{ y \in {\Bbb R}^3: |y-x|=t \}$. (For this post I will ignore the technical issues regarding how smooth the solution has to be in order for the above formula to be valid.) On the other hand, solutions to (3) tend to blow up in finite time from data with positive initial position and initial velocity, even if this data is very small, as can be seen by the family of solutions

$u_T(t,x) := c (T-t)^{-2/(p-1)}$

for $T > 0$, $0 < t < T$, and $x \in {\Bbb R}^3$, where c is the positive constant $c := (\frac{2(p+1)}{(p-1)^2})^{1/(p-1)}$. For T large, this gives a family of solutions which starts out very small at time zero, but still manages to go to infinity in finite time.

The equation (1) can be viewed as a combination of equations (2) and (3) and should thus inherit a mix of the behaviours of both its “parents”. As a general rule, when the initial data $u(0,\cdot), \partial_t u(0,\cdot)$ of solution is small, one expects the dispersion to “win” and send the solution to zero as $t \to \infty$, because the nonlinear effects are weak; conversely, when the initial data is large, one expects the nonlinear effects to “win” and cause blowup, or at least large amounts of instability. This division is particularly pronounced when p is large (since then the nonlinearity is very strong for large data and very weak for small data), but not so much for p small (for instance, when p=1, the equation becomes essentially linear, and one can easily show that blowup does not occur from reasonable data.)

The theorem of John formalises this intuition, with a remarkable threshold value for p:

Theorem. Let $1 < p < \infty$.

1. If $p < 1+\sqrt{2}$, then there exist solutions which are arbitrarily small (both in size and in support) and smooth at time zero, but which blow up in finite time.
2. If $p > 1+\sqrt{2}$, then for every initial data which is sufficiently small in size and support, and sufficiently smooth, one has a global solution (which goes to zero uniformly as $t \to \infty$).

[At the critical threshold $p = 1 + \sqrt{2}$ one also has blowup from arbitrarily small data, as was shown subsequently by Schaeffer.]

The ostensible purpose of this post is to try to explain why the curious exponent $1+\sqrt{2}$ should make an appearance here, by sketching out the proof of part 1 of John’s theorem (I will not discuss part 2 here); but another reason I am writing this post is to illustrate how to make quick “back-of-the-envelope” calculations in harmonic analysis and PDE which can obtain the correct numerology for such a problem much faster than a fully rigorous approach. These calculations can be a little tricky to handle properly at first, but with practice they can be done very swiftly.

I’ve just come back from the 48th Annual IEEE Symposium on the Foundations of Computer science, better known as FOCS; this year it was held at Providence, near Brown University. (This conference is also being officially reported on by the blog posts of Nicole Immorlica, Luca Trevisan, and Scott Aaronson.) I was there to give a tutorial on some of the tools used these days in additive combinatorics and graph theory to distinguish structure and randomness. In a previous blog post, I had already mentioned that my lecture notes for this were available on the arXiv; now the slides for my tutorial are available too (it covers much the same ground as the lecture notes, and also incorporates some material from my ICM slides, but in a slightly different format).

In the slides, I am tentatively announcing some very recent (and not yet fully written up) work of Ben Green and myself establishing the Gowers inverse conjecture in finite fields in the special case when the function f is a bounded degree polynomial (this is a case which already has some theoretical computer science applications). I hope to expand upon this in a future post. But I will describe here a neat trick I learned at the conference (from the FOCS submission of Bogdanov and Viola) which uses majority voting to enhance a large number of small independent correlations into a much stronger single correlation. This application of majority voting is widespread in computer science (and, of course, in real-world democracies), but I had not previously been aware of its utility to the type of structure/randomness problems I am interested in (in particular, it seems to significantly simplify some of the arguments in the proof of my result with Ben mentioned above); thanks to this conference, I now know to add majority voting to my “toolbox”.

I’m continuing my series of articles for the Princeton Companion to Mathematics by uploading my article on the Fourier transform. Here, I chose to describe this transform as a means of decomposing general functions into more symmetric functions (such as sinusoids or plane waves), and to discuss a little bit how this transform is connected to differential operators such as the Laplacian. (This is of course only one of the many different uses of the Fourier transform, but again, with only five pages to work with, it’s hard to do justice to every single application. For instance, the connections with additive combinatorics are not covered at all.)

On the official web site of the Companion (which you can access with the user name “Guest” and password “PCM”), there is a more polished version of the same article, after it had gone through a few rounds of the editing process.

I’ll also point out David Ben-Zvi‘s Companion article on “moduli spaces“. This concept is deceptively simple – a space whose points are themselves spaces, or “representatives” or “equivalence classes” of such spaces – but it leads to the “correct” way of thinking about many geometric and algebraic objects, and more importantly about families of such objects, without drowning in a mess of coordinate charts and formulae which serve to obscure the underlying geometry.

[Update, Oct 21: categories fixed.]

As you may have noticed, I had turned off the “Snap preview” feature on this blog (which previews any external link that the mouse hovers over) for a few weeks as an experiment. After a mixed response, I have decided to re-enable the feature for now, but would be interested in getting feedback on whether this feature makes noticeable differences (both positive and negative) to the viewing experience (in particular, to the loading time of the web page), so that I can decide whether to permanently re-enable it. (In the link above, incidentally, it is noted that one can turn off all snap preview windows by following an options link to set a cookie.)

This post would also be a good forum to discuss any other formatting and presentation issues with this blog (wordpress blogs are remarkably customisable). For instance, I have been looking out for a way to enable previewing of comments, but as far as I can tell this appears to not be possible on wordpress. But if anyone knows some workaround or substitute for this feature, I would like to hear about it.

[Update, Oct 17: Judging from the comments, the response to Snap preview continues to be mixed. But I have discovered how to turn Snap preview on and off for individual articles or links. So I will keep Snap preview on for Wikipedia links and off for the rest; see the articles currently in the front page of this blog for examples of this. I also note that in the Snap preview options menu, there are options to turn Snap off permanently, or to increase the delay before the preview shows up.

Also, following suggestions in the comments, I have darkened the text and also changed the blockquote format, in order to increase readability, and also changed the RSS feed from summaries to the full article (or more precisely, the portion of the article before the jump). ]

[Update, Oct 18: Comments moved to just below the post, as opposed to below the sidebar.]

In one of my recent posts, I used the Jordan normal form for a matrix in order to justify a couple of arguments. As a student, I learned the derivation of this form twice: firstly (as an undergraduate) by using the minimal polynomial, and secondly (as a graduate) by using the structure theorem for finitely generated modules over a principal ideal domain. I found though that the former proof was too concrete and the latter proof too abstract, and so I never really got a good intuition on how the theorem really worked. So I went back and tried to synthesise a proof that I was happy with, by taking the best bits of both arguments that I knew. I ended up with something which wasn’t too different from the standard proofs (relying primarily on the (extended) Euclidean algorithm and the fundamental theorem of algebra), but seems to get at the heart of the matter fairly quickly, so I thought I’d put it up on this blog anyway.

Before we begin, though, let us recall what the Jordan normal form theorem is. For this post, I’ll take the perspective of abstract linear transformations rather than of concrete matrices. Let $T: V \to V$ be a linear transformation on a finite dimensional complex vector space V, with no preferred coordinate system. We are interested in asking what possible “kinds” of linear transformations V can support (more technically, we want to classify the conjugacy classes of $\hbox{End}(V)$, the ring of linear endomorphisms of V to itself). Here are some simple examples of linear transformations.

1. The right shift. Here, $V = {\Bbb R}^n$ is a standard vector space, and the right shift $U: V \to V$ is defined as $U(x_1,\ldots,x_n) = (0,x_1,\ldots,x_{n-1})$, thus all elements are shifted right by one position. (For instance, the 1-dimensional right shift is just the zero operator.)
2. The right shift plus a constant. Here we consider an operator $U + \lambda I$, where $U: V \to V$ is a right shift, I is the identity on V, and $\lambda \in {\Bbb C}$ is a complex number.
3. Direct sums. Given two linear transformations $T: V \to V$ and $S: W \to W$, we can form their direct sum $T \oplus S: V \oplus W \to V \oplus W$ by the formula $(T \oplus S)(v,w) := (Tv, Sw)$.

Our objective is then to prove the

Jordan normal form theorem. Every linear transformation $T: V \to V$ on a finite dimensional complex vector space V is similar to a direct sum of transformations, each of which is a right shift plus a constant.

(Of course, the same theorem also holds with left shifts instead of right shifts.)

I have just uploaded to the arXiv my paper “A quantitative formulation of the global regularity problem for the periodic Navier-Stokes equation”, submitted to Dynamics of PDE. This is a short note on one formulation of the Clay Millennium prize problem, namely that there exists a global smooth solution to the Navier-Stokes equation on the torus $({\Bbb R}/{\Bbb Z})^3$ given any smooth divergence-free data. (I should emphasise right off the bat that I am not claiming any major breakthrough on this problem, which remains extremely challenging in my opinion.)
This problem is formulated in a qualitative way: the conjecture asserts that the velocity field $u$ stays smooth for all time, but does not ask for a quantitative bound on the smoothness of that field in terms of the smoothness of the initial data. Nevertheless, it turns out that the compactness properties of the periodic Navier-Stokes flow allow one to equate the qualitative claim with a more concrete quantitative one. More precisely, the paper shows that the following three statements are equivalent:

1. (Qualitative regularity conjecture) Given any smooth divergence-free data $u_0: ({\Bbb R}/{\Bbb Z})^3 \to {\Bbb R}^3$, there exists a global smooth solution $u: [0,+\infty) \times ({\Bbb R}/{\Bbb Z})^3 \to {\Bbb R}^3$ to the Navier-Stokes equations.
2. (Local-in-time quantitative regularity conjecture)
Given any smooth solution $u: [0,T] \times ({\Bbb R}/{\Bbb Z})^3 \to {\Bbb R}^3$ to the Navier-Stokes equations with $0 < T \leq 1$, one has the a priori bound$\| u(T) \|_{H^1(({\Bbb R}/{\Bbb Z})^3)} \leq F( \| u(0) \|_{H^1(({\Bbb R}/{\Bbb Z})^3)} )$ for some non-decreasing function $F:[0,+\infty) \to [0,+\infty)$.
3. (Global-in-time quantitative regularity conjecture) This is the same conjecture as 2, but with the condition $0 < T \leq 1$ replaced by $0 < T < \infty$.

It is easy to see that Conjecture 3 implies Conjecture 2, which implies Conjecture 1. By using the compactness of the local periodic Navier-Stokes flow in $H^1$, one can show that Conjecture 1 implies Conjecture 2; and by using the energy identity (and in particular the fact that the energy dissipation is bounded) one can deduce Conjecture 3 from Conjecture 2. The argument uses only standard tools and is likely to generalise in a number of ways, which I discuss in the paper. (In particular one should be able to replace the $H^1$ norm here by any other subcritical norm.)

In my discussion of the Oppenheim conjecture in my recent post on Ratner’s theorems, I mentioned in passing the simple but crucial fact that the (orthochronous) special orthogonal group $SO(Q)^+$ of an indefinite quadratic form on ${\Bbb R}^3$ can be generated by unipotent elements. This is not a difficult fact to prove, as one can simply diagonalise Q and then explicitly write down some unipotent elements (the magic words here are “null rotations“). But this is a purely algebraic approach; I thought it would also be instructive to show the geometric (or dynamic) reason for why unipotent elements appear in the orthogonal group of indefinite quadratic forms in three dimensions. (I’ll give away the punch line right away: it’s because the parabola is a conic section.) This is not a particularly deep or significant observation, and will not be surprising to the experts, but I would like to record it anyway, as it allows me to review some useful bits and pieces of elementary linear algebra.

I’m continuing my series of articles for the Princeton Companion to Mathematics with my article on the Schrödinger equation – the fundamental equation of motion of quantum particles, possibly in the presence of an external field. My focus here is on the relationship between the Schrödinger equation of motion for wave functions (and the closely related Heisenberg equation of motion for quantum observables), and Hamilton’s equations of motion for classical particles (and the closely related Poisson equation of motion for classical observables). There is also some brief material on semiclassical analysis, scattering theory, and spectral theory, though with only a little more than 5 pages to work with in all, I could not devote much detail to these topics. (In particular, nonlinear Schrödinger equations, a favourite topic of mine, are not covered at all.)

As I said before, I will try to link to at least one other PCM article in every post in this series. Today I would like to highlight Madhu Sudan‘s delightful article on information and coding theory, “Reliable transmission of information“.

[Update, Oct 3: typos corrected.]

[Update, Oct 9: more typos corrected.]