You are currently browsing the category archive for the ‘update’ category.

I have just uploaded to the arXiv my paper “Commutators close to the identity“, submitted to the Journal of Operator Theory. This paper resulted from some progress I made on the problem discussed in this previous post. Recall in that post the following result of Popa: if are bounded operators on a Hilbert space whose commutator is close to the identity in the sense that

for some , then one has the lower bound

In the other direction, for any , there are examples of operators obeying (1) such that

In this paper we improve the upper bound to come closer to the lower bound:

Theorem 1For any , and any infinite-dimensional , there exist operators obeying (1) such that

One can probably improve the exponent somewhat by a modification of the methods, though it does not seem likely that one can lower it all the way to without a substantially new idea. Nevertheless I believe it plausible that the lower bound (2) is close to optimal.

We now sketch the methods of proof. The construction giving (3) proceeded by first identifying with the algebra of matrices that have entries in . It is then possible to find two matrices whose commutator takes the form

for some bounded operator (for instance one can take to be an isometry). If one then conjugates by the diagonal operator , one can eusure that (1) and (3) both hold.

It is natural to adapt this strategy to matrices rather than matrices, where is a parameter at one’s disposal. If one can find matrices that are almost upper triangular (in that only the entries on or above the lower diagonal are non-zero), whose commutator only differs from the identity in the top right corner, thus

for some , then by conjugating by a diagonal matrix such as for some and optimising in , one can improve the bound in (3) to ; if the bounds in the implied constant in the are polynomial in , one can then optimise in to obtain a bound of the form (4) (perhaps with the exponent replaced by a different constant).

The task is then to find almost upper triangular matrices whose commutator takes the required form. The lower diagonals of must then commute; it took me a while to realise then that one could (usually) conjugate one of the matrices, say by a suitable diagonal matrix, so that the lower diagonal consisted entirely of the identity operator, which would make the other lower diagonal consist of a single operator, say . After a lot of further lengthy experimentation, I eventually realised that one could conjugate further by unipotent upper triangular matrices so that all remaining entries other than those on the far right column vanished. Thus, without too much loss of generality, one can assume that takes the normal form

for some , solving the system of equations

It turns out to be possible to solve this system of equations by a contraction mapping argument if one takes to be a “Hilbert’s hotel” pair of isometries as in the previous post, though the contraction is very slight, leading to polynomial losses in in the implied constant.

There is a further question raised in Popa’s paper which I was unable to resolve. As a special case of one of the main theorems (Theorem 2.1) of that paper, the following result was shown: if obeys the bounds

(where denotes the space of all operators of the form with and compact), then there exist operators with such that . (In fact, Popa’s result covers a more general situation in which one is working in a properly infinite algebra with non-trivial centre.) We sketch a proof of this result as follows. Suppose that and for some . A standard greedy algorithm argument (see this paper of Brown and Pearcy) allows one to find orthonormal vectors for such that for each , one has for some comparable to , and some orthogonal to all of the . After some conjugation (and a suitable identification of with , one can thus place in a normal form

where is a isometry with infinite deficiency, and have norm . Setting , it then suffices to solve the commutator equation

with ; note the similarity with (3).

By the usual Hilbert’s hotel construction, one can complement with another isometry obeying the “Hilbert’s hotel” identity

and also , . Proceeding as in the previous post, we can try the ansatz

for some operators , leading to the system of equations

Using the first equation to solve for , the second to then solve for , and the third to then solve for , one can obtain matrices with the required properties.

Thus far, my attempts to extend this construction to larger matrices with good bounds on have been unsuccessful. A model problem would be to express

as a commutator with significantly smaller than . The construction in my paper achieves something like this, but with replaced by a more complicated operator. One would also need variants of this result in which one is allowed to perturb the above operator by an arbitrary finite rank operator of bounded operator norm.

Apoorva Khare and I have updated our paper “On the sign patterns of entrywise positivity preservers in fixed dimension“, announced at this post from last month. The quantitative results are now sharpened using a new monotonicity property of ratios of Schur polynomials, namely that such ratios are monotone non-decreasing in each coordinate of if is in the positive orthant, and the partition is larger than that of . (This monotonicity was also independently observed by Rachid Ait-Haddou, using the theory of blossoms.) In the revised version of the paper we give two proofs of this monotonicity. The first relies on a deep positivity result of Lam, Postnikov, and Pylyavskyy, which uses a representation-theoretic positivity result of Haiman to show that the polynomial combination

of skew-Schur polynomials is Schur-positive for any partitions (using the convention that the skew-Schur polynomial vanishes if is not contained in , and where and denotes the pointwise min and max of and respectively). It is fairly easy to derive the monotonicity of from this, by using the expansion

of Schur polynomials into skew-Schur polynomials (as was done in this previous post).

The second proof of monotonicity avoids representation theory by a more elementary argument establishing the weaker claim that the above expression (1) is non-negative on the positive orthant. In fact we prove a more general determinantal log-supermodularity claim which may be of independent interest:

Theorem 1Let be any totally positive matrix (thus, every minor has a non-negative determinant). Then for any -tuples of increasing elements of , one haswhere denotes the minor formed from the rows in and columns in .

For instance, if is the matrix

for some real numbers , one has

(corresponding to the case , ), or

(corresponding to the case , , , , ). It turns out that this claim can be proven relatively easy by an induction argument, relying on the Dodgson and Karlin identities from this previous post; the difficulties are largely notational in nature. Combining this result with the Jacobi-Trudi identity for skew-Schur polynomials (discussed in this previous post) gives the non-negativity of (1); it can also be used to directly establish the monotonicity of ratios by applying the theorem to a generalised Vandermonde matrix.

(Log-supermodularity also arises as the natural hypothesis for the FKG inequality, though I do not know of any interesting application of the FKG inequality in this current setting.)

Fifteen years ago, I wrote a paper entitled Global regularity of wave maps. II. Small energy in two dimensions, in which I established global regularity of wave maps from two spatial dimensions to the unit sphere, assuming that the initial data had small energy. Recently, Hao Jia (personal communication) discovered a small gap in the argument that requires a slightly non-trivial fix. The issue does not really affect the subsequent literature, because the main result has since been reproven and extended by methods that avoid the gap (see in particular this subsequent paper of Tataru), but I have decided to describe the gap and its fix on this blog.

I will assume familiarity with the notation of my paper. In Section 10, some complicated spaces are constructed for each frequency scale , and then a further space is constructed for a given frequency envelope by the formula

where is the Littlewood-Paley projection of to frequency magnitudes . Then, given a spacetime slab , we define the restrictions

where the infimum is taken over all extensions of to the Minkowski spacetime ; similarly one defines

The gap in the paper is as follows: it was implicitly assumed that one could restrict (1) to the slab to obtain the equality

(This equality is implicitly used to establish the bound (36) in the paper.) Unfortunately, (1) only gives the lower bound, not the upper bound, and it is the upper bound which is needed here. The problem is that the extensions of that are optimal for computing are not necessarily the Littlewood-Paley projections of the extensions of that are optimal for computing .

To remedy the problem, one has to prove an upper bound of the form

for all Schwartz (actually we need affinely Schwartz , but one can easily normalise to the Schwartz case). Without loss of generality we may normalise the RHS to be . Thus

for each , and one has to find a single extension of such that

for each . Achieving a that obeys (4) is trivial (just extend by zero), but such extensions do not necessarily obey (5). On the other hand, from (3) we can find extensions of such that

the extension will then obey (5) (here we use Lemma 9 from my paper), but unfortunately is not guaranteed to obey (4) (the norm does control the norm, but a key point about frequency envelopes for the small energy regularity problem is that the coefficients , while bounded, are not necessarily summable).

This can be fixed as follows. For each we introduce a time cutoff supported on that equals on and obeys the usual derivative estimates in between (the time derivative of size for each ). Later we will prove the truncation estimate

Assuming this estimate, then if we set , then using Lemma 9 in my paper and (6), (7) (and the local stability of frequency envelopes) we have the required property (5). (There is a technical issue arising from the fact that is not necessarily Schwartz due to slow decay at temporal infinity, but by considering partial sums in the summation and taking limits we can check that is the strong limit of Schwartz functions, which suffices here; we omit the details for sake of exposition.) So the only issue is to establish (4), that is to say that

for all .

For this is immediate from (2). Now suppose that for some integer (the case when is treated similarly). Then we can split

where

The contribution of the term is acceptable by (6) and estimate (82) from my paper. The term sums to which is acceptable by (2). So it remains to control the norm of . By the triangle inequality and the fundamental theorem of calculus, we can bound

By hypothesis, . Using the first term in (79) of my paper and Bernstein’s inequality followed by (6) we have

and then we are done by summing the geometric series in .

It remains to prove the truncation estimate (7). This estimate is similar in spirit to the algebra estimates already in my paper, but unfortunately does not seem to follow immediately from these estimates as written, and so one has to repeat the somewhat lengthy decompositions and case checkings used to prove these estimates. We do this below the fold.

Two quick updates with regards to polymath projects. Firstly, given the poll on starting the mini-polymath4 project, I will start the project at Thu July 12 2012 UTC 22:00. As usual, the main research thread on this project will be held at the polymath blog, with the discussion thread hosted separately on this blog.

Second, the Polymath7 project, which seeks to establish the “hot spots conjecture” for acute-angled triangles, has made a fair amount of progress so far; for instance, the first part of the conjecture (asserting that the second Neumann eigenfunction of an acute non-equilateral triangle is simple) is now solved, and the second part (asserting that the “hot spots” (i.e. extrema) of that second eigenfunction lie on the boundary of the triangle) has been solved in a number of special cases (such as the isosceles case). It’s been quite an active discussion in the last week or so, with almost 200 comments across two threads (and a third thread freshly opened up just now). While the problem is still not completely solved, I feel optimistic that it should fall within the next few weeks (if nothing else, it seems that the problem is now at least amenable to a brute force numerical attack, though personally I would prefer to see a more conceptual solution).

Just a quick update on my previous post on gamifying the problem-solving process in high school algebra. I had a little time to spare (on an airplane flight, of all things), so I decided to rework the mockup version of the algebra game into something a bit more structured, namely as 12 progressively difficult levels of solving a linear equation in one unknown. (Requires either Java or Flash.) Somewhat to my surprise, I found that one could create fairly challenging puzzles out of this simple algebra problem by carefully restricting the moves available at each level. Here is a screenshot of a typical level:

After completing each level, an icon appears which one can click on to proceed to the next level. (There is no particular rationale, by the way, behind my choice of icons; these are basically selected arbitrarily from the default collection of icons (or more precisely, “costumes”) available in Scratch.)

The restriction of moves made the puzzles significantly more artificial in nature, but I think that this may end up ultimately being a good thing, as to solve some of the harder puzzles one is forced to really start thinking about how the process of solving for an unknown actually works. (One could imagine that if one decided to make a fully fledged game out of this, one could have several modes of play, ranging from a puzzle mode in which one solves some carefully constructed, but artificial, puzzles, to a free-form mode in which one can solve arbitrary equations (including ones that you input yourself) using the full set of available algebraic moves.)

One advantage to gamifying linear algebra, as opposed to other types of algebra, is that there is no need for disjunction (i.e. splitting into cases). In contrast, if one has to solve a problem which involves at least one quadratic equation, then at some point one may be forced to divide the analysis into two disjoint cases, depending on which branch of a square root one is taking. I am not sure how to gamify this sort of branching in a civilised manner, and would be interested to hear of any suggestions in this regard. (A similar problem also arises in proving propositions in Euclidean geometry, which I had thought would be another good test case for gamification, because of the need to branch depending on the order of various points on a line, or rays through a point, or whether two lines are parallel or intersect.)

My graduate text on measure theory (based on these lecture notes) is now published by the AMS as part of the Graduate Studies in Mathematics series. (See also my own blog page for this book, which among other things contains a draft copy of the book in PDF format.)

A few days ago, I released a preprint entitled “Localisation and compactness properties of the Navier-Stokes global regularity problem“, discussed in this previous blog post. As it turns out, I was somewhat impatient to finalise the paper and move on to other things, and the original preprint was still somewhat rough in places (contradicting my own advice on this matter), with a number of typos of minor to moderate severity. But a bit more seriously, I discovered on a further proofreading that there was a subtle error in a component of the argument that I had believed to be routine – namely the persistence of higher regularity for mild solutions. As a consequence, some of the implications stated in the first version were not exactly correct as stated; but they can be repaired by replacing a “bad” notion of global regularity for a certain class of data with a “good” notion. I have completed (and proofread) an updated version of the ms, which should appear at the arXiv link of the paper in a day or two (and which I have also placed at this link). (In the meantime, it is probably best not to read the original ms too carefully, as this could lead to some confusion.) I’ve also added a new section that shows that, due to this technicality, one can exhibit smooth initial data to the Navier-Stokes equation for which there are no smooth solutions, which superficially sounds very close to a negative solution to the global regularity problem, but is actually nothing of the sort.

Let me now describe the issue in more detail (and also to explain why I missed it previously). A standard principle in the theory of evolutionary partial differentiation equations is that *regularity in space can be used to imply regularity in time*. To illustrate this, consider a solution to the supercritical nonlinear wave equation

(1)

for some field . Suppose one already knew that had some regularity in space, and in particular the norm of was bounded (thus and up to two spatial derivatives of were bounded). Then, by (1), we see that two *time* derivatives of were also bounded, and one then gets the additional regularity of .

In a similar vein, suppose one initially knew that had the regularity . Then (1) soon tells us that also has the regularity ; then, if one differentiates (1) in time to obtain

one can conclude that also has the regularity of . One can continue this process indefinitely; in particular, if one knew that , then these sorts of manipulations show that is infinitely smooth in both space and time.

The issue that caught me by surprise is that for the Navier-Stokes equations

(2)

(setting the forcing term equal to zero for simplicity), infinite regularity in space does *not* automatically imply infinite regularity in time, even if one assumes the initial data lies in a standard function space such as the Sobolev space . The problem lies with the pressure term , which is recovered from the velocity via the elliptic equation

(3)

that can be obtained by taking the divergence of (2). This equation is solved by a non-local integral operator:

If, say, lies in , then there is no difficulty establishing a bound on in terms of (for instance, one can use singular integral theory and Sobolev embedding to place in . However, one runs into difficulty when trying to compute time derivatives of . Differentiating (3) once, one gets

.

At the regularity of , one can still (barely) control this quantity by using (2) to expand out and using some integration by parts. But when one wishes to compute a second time derivative of the pressure, one obtains (after integration by parts) an expansion of the form

and now there is not enough regularity on available to get any control on , even if one assumes that is smooth. Indeed, following this observation, I was able to show that given generic smooth data, the pressure will instantaneously fail to be in time, and thence (by (2)) the velocity will instantaneously fail to be in time. (Switching to the vorticity formulation buys one further degree of time differentiability, but does not fully eliminate the problem; the vorticity will fail to be in time. Switching to material coordinates seems to makes things very slightly better, but I believe there is still a breakdown of time regularity in these coordinates also.)

For later times t>0 (and assuming homogeneous data f=0 for simplicity), this issue no longer arises, because of the instantaneous smoothing effect of the Navier-Stokes flow, which for instance will upgrade regularity to regularity instantaneously. It is only the initial time at which some time irregularity can occur.

This breakdown of regularity does not actually impact the original formulation of the Clay Millennium Prize problem, though, because in that problem the initial velocity is required to be Schwartz class (so all derivatives are rapidly decreasing). In this class, the regularity theory works as expected; if one has a solution which already has some reasonable regularity (e.g. a mild solution) and the data is Schwartz, then the solution will be smooth in spacetime. (Another class where things work as expected is when the *vorticity* is Schwartz; in such cases, the solution remains smooth in both space and time (for short times, at least), and the Schwartz nature of the vorticity is preserved (because the vorticity is subject to fewer non-local effects than the velocity, as it is not directly affected by the pressure).)

This issue means that one of the implications in the original paper (roughly speaking, that global regularity for Schwartz data implies global regularity for smooth data) is not correct as stated. But this can be fixed by weakening the notion of global regularity in the latter setting, by limiting the amount of time differentiability available at the initial time. More precisely, call a solution and *almost smooth* if

- and are smooth on the half-open slab ; and
- For every , exist and are continuous on the full slab .

Thus, an almost smooth solution is the same concept as a smooth solution, except that at time zero, the velocity field is only , and the pressure field is only . This is still enough regularity to interpret the Navier-Stokes equation (2) in a classical manner, but falls slightly short of full smoothness.

(I had already introduced this notion of almost smoothness in the more general setting of smooth finite energy solutions in the first draft of this paper, but had failed to realise that it was also necessary in the smooth setting also.)

One can now “fix” the global regularity conjectures for Navier-Stokes in the smooth or smooth finite energy setting by requiring the solutions to merely be almost smooth instead of smooth. Once one does so, the results in my paper then work as before: roughly speaking, if one knows that Schwartz data produces smooth solutions, one can conclude that smooth or smooth finite energy data produces almost smooth solutions (and the paper now contains counterexamples to show that one does not always have smooth solutions in this category).

The diagram of implications between conjectures has been adjusted to reflect this issue, and now reads as follows:

Christian Elsholtz and I have recently finished our joint paper “Counting the number of solutions to the Erdös-Straus equation on unit fractions“, submitted to the Journal of the Australian Mathematical Society. This supercedes my previous paper on the subject, by obtaining stronger and more general results. (The paper is currently in the process of being resubmitted to the arXiv, and should appear at this link within a few days.)

As with the previous paper, the main object of study is the number of solutions to the Diophantine equation

with positive integers. The Erdös-Straus conjecture asserts that for all . Since for all positive integers , it suffices to show that for all primes .

We single out two special types of solutions: *Type I* solutions, in which is divisible by and are coprime to , and *Type II* solutions, in which is coprime to and are divisible by . Let denote the number of Type I and Type II solutions respectively. For any , one has

with equality when is an odd primes . Thus, to prove the Erdös-Strauss conjecture, it suffices to show that at least one of , is positive whenever is an odd prime.

Our first main results are the asymptotics

This improves upon the results in the previous paper, which only established

and

The double logarithmic factor in the upper bound for is artificial (arising from the inefficiency in the Brun-Titchmarsh inequality on very short progressions) but we do not know how to remove it.

The methods are similar to those in the previous paper (which were also independently discovered in unpublished work of Elsholtz and Heath-Brown), but with the additional input of the Erdös divisor bound on expressions of the form for polynomials , discussed in this recent blog post. (Actually, we need to tighten Erdös’ bound somewhat, to obtain some uniformity in the bounds even as the coefficients of become large, but this turns out to be achievable by going through the original arguments of Erdös more carefully.)

We also note an observation of Heath-Brown, that in our notation gives the lower bound

thus, we see that for typical , that most solutions to the Erdös-Straus equation are not of Type I or Type II, in contrast to the case when is prime.

We also have a number other new results. We find a way to systematically unify all the previously known parameterisations of solutions to the Erdös-Straus equation, by lifting the Cayley-type surface to a certain three-dimensional variety in six-dimensional affine space, in such a way that integer points in the former arise from integer points in the latter. Each of the previously known characterisations of solutions then corresponds to a different choice of coordinates on this variety. (This point of view was also adopted in a paper of Heath-Brown, who interprets this lifted variety as the *universal torsor* of the Cayley surface.) By optimising between these parameterisations and exploiting the divisor bound, we obtain some bounds on the worst-case behaviour of and , namely

and

which should be compared to a recent previous bound of Browning and Elsholtz. In the other direction, we show that for infinitely many , and for almost all primes . Here, the main tools are some bounds for the representation of a rational as a sum of two unit fractions in the above-mentioned work of Browning and Elsholtz, and also the Turán-Kubilius inequality.

We also completely classify all the congruence classes that can be solved by polynomials, completing the partial list discussed in the previous post. Specifically, the Erdös-Straus conjecture is true for whenever one of the following congruence-type conditions is satisfied:

- , where are such that .
- and , where are such that .
- and , where are such that .
- or , where are such that and .
- , where are such that .
- , where are such that .

In principle, this suggests a way to extend the existing verification of the Erdös-Straus conjecture beyond the current range of by collecting all congruences to small moduli (e.g. up to ), and then using this to sieve out the primes up to a given size.

Finally, we begin a study of the more general equation

where are fixed. We can obtain a partial analogue of our main bounds for the case, namely that

and

were denotes the number of solutions to (2) which are of “Type II” in the sense that are all divisible by . However, we do not believe our bounds to be sharp in the large regime, though it does show that the expected number of solutions to (2) should grow rapidly in .

*[This is a (lightly edited) repost of an old blog post of mine, which had attracted over 400 comments, and as such was becoming difficult to load; I request that people wishing to comment on that puzzle use this fresh post instead. -T] *

This is one of my favorite logic puzzles, because of the presence of two highly plausible, but contradictory, solutions to the puzzle. Resolving this apparent contradiction requires very clear thinking about the nature of knowledge; but I won’t spoil the resolution here, and will simply describe the logic puzzle and its two putative solutions. (Readers, though, are welcome to discuss solutions in the comments.)

— The logic puzzle —

There is an island upon which a tribe resides. The tribe consists of 1000 people, with various eye colours. Yet, their religion forbids them to know their own eye color, or even to discuss the topic; thus, each resident can (and does) see the eye colors of all other residents, but has no way of discovering his or her own (there are no reflective surfaces). If a tribesperson does discover his or her own eye color, then their religion compels them to commit ritual suicide at noon the following day in the village square for all to witness. All the tribespeople are highly logical and devout, and they all know that each other is also highly logical and devout (and they all know that they all know that each other is highly logical and devout, and so forth).

Of the 1000 islanders, it turns out that 100 of them have blue eyes and 900 of them have brown eyes, although the islanders are not initially aware of these statistics (each of them can of course only see 999 of the 1000 tribespeople).

One day, a blue-eyed foreigner visits to the island and wins the complete trust of the tribe.

One evening, he addresses the entire tribe to thank them for their hospitality.

However, not knowing the customs, the foreigner makes the mistake of mentioning eye color in his address, remarking “how unusual it is to see another blue-eyed person like myself in this region of the world”.

What effect, if anything, does this

faux pashave on the tribe?

**Note 1**: For the purposes of this logic puzzle, “highly logical” means that any conclusion that can logically deduced from the information and observations available to an islander, will automatically be known to that islander.

**Note 2:** Bear in mind that this is a *logic* puzzle, rather than a description of a real-world scenario. The puzzle is not to determine whether the scenario is plausible (indeed, it is extremely implausible) or whether one can find a legalistic loophole in the wording of the scenario that allows for some sort of degenerate solution; instead, the puzzle is to determine (holding to the spirit of the puzzle, and not just to the letter) which of the solutions given below (if any) are correct, and if one solution is valid, to correctly explain why the other solution is invalid. (One could also resolve the logic puzzle by showing that the assumptions of the puzzle are logically inconsistent or not well-defined. However, merely demonstrating that the assumptions of the puzzle are *highly unlikely*, as opposed to *logically impossible to satisfy*, is not sufficient to resolve the puzzle.)

**Note 3**: An essentially equivalent version of the logic puzzle is also given at the xkcd web site. Many other versions of this puzzle can be found in many places; I myself heard of the puzzle as a child, though I don’t recall the precise source.

Below the fold are the two putative solutions to the logic puzzle. If you have not seen the puzzle before, I recommend you try to solve it first before reading either solution.

Last year, Emmanuel Breuillard, Ben Green, Bob Guralnick, and I wrote a paper entitled “Strongly dense free subgroups of semisimple Lie groups“. The main theorem in that paper asserted that given any semisimple algebraic group over an uncountable algebraically closed field , there existed a free subgroup which was *strongly dense* in the sense that any non-abelian subgroup of was Zariski dense in . This type of result is useful for establishing expansion in finite simple groups of Lie type, as we will discuss in a subsequent paper.

An essentially equivalent formulation of the main result is that if are two non-commuting elements of the free group on two generators, and is a *generic* pair of elements in , then and are not contained in any proper closed algebraic subgroup of . Here, “generic” means “outside of at most countably many proper subvarieties”. In most cases, one expects that if are generically drawn from , then will also be generically drawn from , but this is not always the case, which is a key source of difficulty in the paper. For instance, if is conjugate to in , then and must be conjugate in and so the pair lie in a proper subvariety of . It is currently an open question to determine all the pairs of words for which is not generic for generic (or equivalently, the double word map is not dominant).

The main strategy of proof was as follows. It is not difficult to reduce to the case when is simple. Suppose for contradiction that we could find two non-commuting words such that were generically trapped in a proper closed algebraic subgroup. As it turns out, there are only finitely many conjugacy classes of such groups, and so one can assume that were generically trapped in a conjugate of a fixed proper closed algebraic subgroup . One can show that , , and are generically regular semisimple, which implies that is a maximal rank semisimple subgroup. The key step was then to find another proper semisimple subgroup of which was *not* a *degeneration* of , by which we mean that there did not exist a pair in the Zariski closure of the products of conjugates of , such that generated a Zariski-dense subgroup of . This is enough to establish the theorem, because we could use an induction hypothesis to find in (and hence in such that generated a Zariski-dense subgroup of , which contradicts the hypothesis that was trapped in for generic (and hence in for all .

To illustrate the concept of a degeneration, take and let be the stabiliser of a non-degenerate -space in . All other stabilisers of non-degenerate -spaces are conjugate to . However, stabilisers of *degenerate* -spaces are not conjugate to , but are still degenerations of . For instance, the stabiliser of a totally singular -space (which is isomorphic to the affine group on , extended by ) is a degeneration of .

A significant portion of the paper was then devoted to verifying that for each simple algebraic group , and each maximal rank proper semisimple subgroup of , one could find another proper semisimple subgroup which was not a degeneration of ; roughly speaking, this means that is so “different” from that no conjugate of can come close to covering . This required using the standard classification of algebraic groups via Dynkin diagrams, and knowledge of the various semisimple subgroups of these groups and their representations (as we used the latter as obstructions to degeneration, for instance one can show that a reducible representation cannot degenerate to an irreducible one).

During the refereeing process for this paper, we discovered that there was precisely one family of simple algebraic groups for which this strategy did not actually work, namely the group (or the group that is double-covered by this group) in characteristic . This group (which has Dynkin diagram , as discussed in this previous post) has one maximal rank proper semisimple subgroup up to conjugacy, namely , which is the stabiliser of a line in . To find a proper semisimple group that is not a degeneration of this group, we basically need to find a subgroup that does not stabilise any line in . In characteristic larger than three (or characteristic zero), one can proceed by using the action of on the five-dimensional space of homogeneous degree four polynomials on , which preserves a non-degenerate symmetric form (the four-fold tensor power of the area form on ) and thus embeds into ; as no polynomial is fixed by all of , we see that this copy of is not a degeneration of .

Unfortunately, in characteristics two and three, the symmetric form on degenerates, and this embedding is lost. In the characteristic two case, one can proceed by using the characteristic fact that is isomorphic to (because in characteristic two, the space of null vectors is a hyperplane, and the symmetric form becomes symplectic on this hyperplane), and thus has an additional maximal rank proper semisimple subgroup which is not conjugate to the subgroup. But in characteristic three, it turns out that there are no further semisimple subgroups of that are not already contained in a conjugate of the . (This is not a difficulty for larger groups such as or , where there are plenty of other semisimple groups to utilise; it is only this smallish group that has the misfortune of having exactly one maximal rank proper semisimple group to play with, and not enough other semisimples lying around in characteristic three.)

As a consequence of this issue, our argument does not actually work in the case when the characteristic is three and the semisimple group contains a copy of (or ), and we have had to modify our paper to delete this case from our results. We believe that such groups still do contain strongly dense free subgroups, but this appears to be just out of reach of our current method.

One thing that this experience has taught me is that algebraic groups behave somewhat pathologically in low characteristic; in particular, intuition coming from the characteristic zero case can become unreliable in characteristic two or three.

## Recent Comments