You are currently browsing the category archive for the 'question' category.
Camil Muscalu, Christoph Thiele and I have just uploaded to the arXiv our joint paper, “Multi-linear multipliers associated to simplexes of arbitrary length“, submitted to Analysis & PDE. This paper grew out of our project from many years ago to attempt to prove the nonlinear (or “scattering”) version of Carleson’s theorem on the almost everywhere convergence of Fourier series. This version is still open; our original approach was to handle the nonlinear Carleson operator by multilinear expansions in terms of the potential function V, but while the first three terms of this expansion were well behaved, the fourth term was unfortunately divergent, due to the unhelpful location of a certain minus sign. [This survey by Michael Lacey, as well as this paper of ourselves, covers some of these topics.]
However, what we did find out in this paper was that if we modified the nonlinear Carleson operator slightly, by replacing the underlying Schrödinger equation by a more general AKNS system, then for “generic” choices of this system, the problem of the ill-placed minus sign goes away, and each term in the multilinear series is, in fact, convergent (though we did not yet verify that the series actually converged, though in view of the earlier work of Christ and Kiselev on this topic, this seems likely). The verification of this convergence (at least with regard to the scattering data, rather than the more difficult analysis of the eigenfunctions) is the main result of our current paper. It builds upon our earlier estimates of the bilinear term in the expansion (which we dubbed the “biest”, as a multilingual pun). The main new idea in our earlier paper was to decompose the relevant region of frequency space into more tractable regions, a typical one being the region in which
was much closer to
than to
. The contribution of each region can then be “parafactored” into a “paracomposition” of simpler operators, such as the bilinear Hilbert transform, which can be treated by standard time-frequency analysis methods. (Much as a paraproduct is a frequency-restricted version of a product, the paracompositions that arise here are frequency-restricted versions of composition.)
A similar analysis happens to work for the multilinear operators associated to the frequency region , but the combinatorics are more complicated; each of the component frequency regions has to be indexed by a tree (in a manner reminiscent of the well-separated pairs decomposition), and a certain key “weak Bessel inequality” becomes considerably more delicate. Our ultimate conclusion is that the multilinear operator
(1)
(which generalises the bilinear Hilbert transform and the biest) obeys Hölder-type estimates (note that Hölder’s inequality related to the situation in which the (projective) simplex S is replaced by the entire frequency space
).
For the remainder of this post, I thought I would describe the “nonlinear Carleson theorem” conjecture, which is still one of my favourite open problems, being an excellent benchmark for measuring progress in the (still nascent) field of “nonlinear Fourier analysis“, while also being of interest in its own right in scattering and spectral theory.
[This post is authored by Gil Kalai, who has kindly “guest blogged” this week’s “open problem of the week”. - T.]
The entropy-influence conjecture seeks to relate two somewhat different measures as to how a boolean function has concentrated Fourier coefficients, namely the total influence and the entropy.
We begin by defining the total influence. Let be the discrete cube, i.e. the set of
vectors
of length n. A boolean function is any function
from the discrete cube to {-1,+1}. One can think of such functions as “voting methods”, which take the preferences of n voters (+1 for yes, -1 for no) as input and return a yes/no verdict as output. For instance, if n is odd, the “majority vote” function
returns +1 if there are more +1 variables than -1, or -1 otherwise, whereas if
, the “
dictator” function returns the value
of the
variable.
We give the cube the uniform probability measure
(thus we assume that the n voters vote randomly and independently). Given any boolean function f and any variable
, define the influence
of the
variable to be the quantity
where is the element of the cube formed by flipping the sign of the
variable. Informally,
measures the probability that the
voter could actually determine the outcome of an election; it is sometimes referred to as the Banzhaf power index. The total influence I(f) of f (also known as the average sensitivity and the edge-boundary density) is then defined as
Thus for instance a dictator function has total influence 1, whereas majority vote has total influence comparable to . The influence can range between 0 (for constant functions +1, -1) and n (for the parity function
or its negation). If f has mean zero (i.e. it is equal to +1 half of the time), then the edge-isoperimetric inequality asserts that
(with equality if and only if there is a dictatorship), whilst the Kahn-Kalai-Linial (KKL) theorem asserts that
for some k. There is a result of Friedgut that if
is bounded by A (say) and
, then f is within a distance
(in
norm) of another boolean function g which only depends on
of the variables (such functions are known as juntas).
This problem in compressed sensing is an example of a derandomisation problem: take an object which, currently, can only be constructed efficiently by a probabilistic method, and figure out a deterministic construction of comparable strength and practicality. (For a general comparison of probabilistic and deterministic algorithms, I can point you to these slides by Avi Wigderson).
I will define exactly what UUP matrices (the UUP stands for “uniform uncertainty principle“) are later in this post. For now, let us just say that they are a generalisation of (rectangular) orthogonal matrices, in which the columns are locally almost orthogonal rather than globally perfectly orthogonal. Because of this, it turns out that one can pack significantly more columns into a UUP matrix than an orthogonal matrix, while still capturing many of the desirable features of orthogonal matrices, such as stable and computable invertibility (as long as one restricts attention to sparse or compressible vectors). Thus UUP matrices can “squash” sparse vectors from high-dimensional space into a low-dimensional while still being able to reconstruct those vectors; this property underlies many of the recent results on compressed sensing today.
There are several constructions of UUP matrices known today (e.g. random normalised Gaussian matrices, random normalised Bernoulli matrices, or random normalised minors of a discrete Fourier transform matrix) but (if one wants the sparsity parameter to be large) they are all probabilistic in nature; in particular, these constructions are not 100% guaranteed to actually produce a UUP matrix, although in many cases the failure rate can be proven to be exponentially small in the size of the matrix. Furthermore, there is no fast (e.g. sub-exponential time) algorithm known to test whether any given matrix is UUP or not. The failure rate is small enough that this is not a problem for most applications (especially since many compressed sensing applications are for environments which are already expected to be noisy in many other ways), but is slightly dissatisfying from a theoretical point of view. One is thus interested in finding a deterministic construction which can locate UUP matrices in a reasonably rapid manner. (One could of course simply search through all matrices in a given class and test each one for the UUP property, but this is an exponential-time algorithm, and thus totally impractical for applications.) In analogy with error-correcting codes, it may be that algebraic or number-theoretic constructions may hold the most promise for such deterministic UUP matrices (possibly assuming some unproven conjectures on exponential sums); this has already been accomplished by de Vore for UUP matrices with small sparsity parameter.
The parity problem is a notorious problem in sieve theory: this theory was invented in order to count prime patterns of various types (e.g. twin primes), but despite superb success in obtaining upper bounds on the number of such patterns, it has proven to be somewhat disappointing in obtaining lower bounds. [Sieves can also be used to study many other things than primes, of course, but we shall focus only on primes in this post.] Even the task of reproving Euclid’s theorem - that there are infinitely many primes - seems to be extremely difficult to do by sieve theoretic means, unless one of course injects into the theory an estimate at least as strong as Euclid’s theorem (e.g. the prime number theorem). The main obstruction is the parity problem: even assuming such strong hypotheses as the Elliott-Halberstam conjecture (a sort of “super-generalised Riemann Hypothesis” for sieves), sieve theory is largely (but not completely) unable to distinguish numbers with an odd number of prime factors from numbers with an even number of prime factors. This “parity barrier” has been broken for some select patterns of primes by injecting some powerful non-sieve theory methods into the subject, but remains a formidable obstacle in general.
I’ll discuss the parity problem in more detail later in this post, but I want to first discuss how sieves work [drawing in part on some excellent unpublished lecture notes of Iwaniec]; the basic ideas are elementary and conceptually simple, but there are many details and technicalities involved in actually executing these ideas, and which I will try to suppress for sake of exposition.
The Skolem-Mahler-Lech theorem in algebraic number theory is a significant generalisation of the obvious statement that a polynomial either has finitely many zeroes (in particular, the set of zeroes is bounded), or it vanishes identically. It appeals to me (despite not really being within my areas of expertise) because it is one of the simplest (non-artificial) results I know of which (currently) comes with an ineffective bound - a bound which is provably finite, but which cannot be computed! It appears that to obtain an effective result, one may need a rather different proof. (I thank Kousha Etessami and Tom Lenagan for drawing this problem to my attention.)
Ineffective bounds seem to arise particularly often in number theory. I am aware of at least three ways in which they come in:
- By using methods from soft (infinitary) analysis.
- By using the fact that any finite set in a metric space is bounded (i.e. is contained in a ball of finite radius centred at a designated origin).
- By using the fact that any set of finite diameter in a metric space is bounded.
Regarding #1, there are often ways to make these arguments quantitative and effective, as discussed in my previous post. But #2 and #3 seem to be irreducibly ineffective: if you know that a set A has finite cardinality or finite diameter, you know it has finite distance to the origin, but an upper bound on the cardinality or diameter does not translate to an effective bound on the radius of the ball centred at the origin needed to contain the set. [In the spirit of the preceding post, one can conclude an effective "meta-bound" on such a set, establishing a large annulus in which the set has no presence, but this is not particularly satisfactory.] The problem with the Skolem-Mahler-Lech theorem is that all the known proofs use #2 at some point.
This is a well-known problem in multilinear harmonic analysis; it is fascinating to me because it lies barely beyond the reach of the best technology we have for these problems (namely, multiscale time-frequency analysis), and because the most recent developments in quadratic Fourier analysis seem likely to shed some light on this problem.
Recall that the Hilbert transform is defined on test functions (up to irrelevant constants) as
where the integral is evaluated in the principal value sense (removing the region to ensure integrability, and then taking the limit as
.)
[This lecture is also doubling as this week's "open problem of the week", as it discusses the Birch and Swinnerton-Dyer conjecture and the effective Mordell conjecture.]
Like many other maths departments, UCLA has a distinguished lecture series for eminent mathematicians to present recent developments in a field of mathematics, both to a broad audience and to specialists. Unlike most departments, though, our lecture series goes by the descriptive (but unimaginative) name of “Distinguished Lecture Series“, supported by the Gill Foundation. This week the lecture series is given by Shou-wu Zhang from Columbia, and revolves around the topic of rational points on curves, a key subject of interest in arithmetic geometry and number theory. The first of three talks, which was on Tuesday, was a very accessible and enjoyable overview talk, which I am reproducing here (to use this opportunity to learn this stuff myself, and also to continue the diversification of subject matter here on this blog). As before, I do not vouch for 100% accuracy, and all errors are my responsibility rather than Shou-wu’s.
[This post is authored by Gil Kalai, who has kindly “guest blogged” this week’s “open problem of the week”. - T.]
This is a problem in discrete and convex geometry. It seeks to quantify the intuitively obvious fact that large convex bodies are so “fat” that they cannot avoid “detection” by a small number of observation points. More precisely, we fix a dimension d and make the following definition (introduced by Haussler and Welzl):
- Definition: Let
be a finite set of points, and let
. We say that a finite set
is a weak
-net for X (with respect to convex bodies) if, whenever B is a convex body which is large in the sense that
, then B contains at least one point of Y. (If Y is contained in X, we say that Y is a strong
-net for X with respect to convex bodies.)
For example, in one dimension, if , and
where k is the integer part of
, then Y is a weak
-net for X with respect to convex bodies. Thus we see that even when the original set X is very large, one can create a
-net of size as small as
. Strong
-nets are of importance in computational learning theory, and are fairly well understood via Vapnik-Chervonenkis (or VC) theory; however, the theory of weak
-nets is still not completely satisfactory.
One can ask what happens in higher dimensions, for instance when X is a discrete cube . It is not too hard to cook up
-nets of size
(by using tools such as Minkowski’s theorem), but in fact one can create
-nets of size as small as
simply by taking a random subset of X of this cardinality and observing that “up to errors of
“, the total number of essentially different ways a convex body can meet X grows at most polynomially in
. (This is a very typical application of the probabilistic method.) On the other hand, since X can contain roughly
disjoint convex bodies, each of which contains at least
of the points in X, we see that no
-net can have size much smaller than
.
Now consider the situation in which X is now an arbitrary finite set, rather than a discrete cube. More precisely, let be the least number such that every finite set X possesses at least one weak
-net for X with respect to convex bodies of cardinality at most
. (One can also replace the finite set X with an arbitrary probability measure; the two formulations are equivalent.) Informally, f is the least number of “guards” one needs to place to prevent a convex body from covering more than
of any given territory.
- Problem 1: For fixed d, what is the correct rate of growth of f as
?
This problem lies in the highly interconnected interface between algebraic combinatorics (esp. the combinatorics of Young tableaux and related objects, including honeycombs and puzzles), algebraic geometry (particularly classical and quantum intersection theory and geometric invariant theory), linear algebra (additive and multiplicative, real and tropical), and the representation theory (classical, quantum, crystal, etc.) of classical groups. (Another open problem in this subject is to find a succinct and descriptive name for the field.) I myself haven’t actively worked in this area for several years, but I still find it a fascinating and beautiful subject. (With respect to the dichotomy between structure and randomness, this subject lies deep within the “structure” end of the spectrum.)
As mentioned above, the problems in this area can be approached from a variety of quite diverse perspectives, but here I will focus on the linear algebra perspective, which is perhaps the most accessible. About nine years ago, Allen Knutson and I introduced a combinatorial gadget, called a honeycomb, which among other things controlled the relationship between the eigenvalues of two arbitrary Hermitian matrices A, B, and the eigenvalues of their sum A+B; this was not the first such gadget that achieved this purpose, but it was a particularly convenient one for studying this problem, in particular it was used to resolve two conjectures in the subject, the saturation conjecture and the Horn conjecture. (These conjectures have since been proven by a variety of other methods.) There is a natural multiplicative version of these problems, which now relates the eigenvalues of two arbitrary unitary matrices U, V and the eigenvalues of their product UV; this led to the “quantum saturation” and “quantum Horn” conjectures, which were proven a couple years ago. However, the quantum analogue of a “honeycomb” remains a mystery; this is the main topic of the current post.
[This lecture is also doubling as this week's "open problem of the week", as it (eventually) discusses the soliton resolution conjecture.]
In this third lecture, I will talk about how the dichotomy between structure and randomness pervades the study of two different types of partial differential equations (PDEs):
- Parabolic PDE, such as the heat equation
, which turn out to play an important role in the modern study of geometric topology; and
- Hamiltonian PDE, such as the Schrödinger equation
, which are heuristically related (via Liouville’s theorem) to measure-preserving actions of the real line (or time axis)
, somewhat in analogy to how combinatorial number theory and graph theory were related to measure-preserving actions of
and
respectively, as discussed in the previous lecture.
(In physics, one would also insert some physical constants, such as Planck’s constant , but for the discussion here it is convenient to normalise away all of these constants.)
The question in extremal graph theory I wish to discuss here originates from Luca Trevisan; it shows that we still don’t know everything that we should about the “local” properties of large dense graphs.
Let be a large (undirected) graph, thus V is the vertex set with some large number n of vertices, and E is the collection of edges {x,y} connecting two vertices in the graph. We can allow the graph to have loops {x,x} if one wishes; it’s not terribly important for this question (since the number of loops is so small compared to the total number of edges), so let’s say there are no loops. We define three quantities of the graph G:
- The edge density
, defined as the number of edges in G, divided by the total number of possible edges, i.e.
;
- The triangle density
, defined as the number of triangles in G (i.e. unordered triplets {x,y,z} such that {x,y},{y,z}, {z,x} all lie in G), divided by the total number of possible triangles, namely
;
- The diamond density
, defined as the number of diamonds in G (i.e. unordered pairs { {x,y,z}, {x,y,w} } of triangles in G which share a common edge), divided by the total number of possible diamonds, namely
.
This is a well known problem (see for instance this survey) in the area of “quantum chaos” or “quantum unique ergodicity”; I am attracted to it both for its simplicity of statement (which I will get to eventually), and also because it focuses on one of the key weaknesses in our current understanding of the Laplacian, namely is that it is difficult with the tools we know to distinguish between eigenfunctions (exact solutions to ) and quasimodes (approximate solutions to the same equation), unless one is willing to work with generic energy levels rather than specific energy levels.
The Bunimovich stadium is the name given to any planar domain consisting of a rectangle bounded at both ends by semicircles. Thus the stadium has two flat edges (which are traditionally drawn horizontally) and two round edges, as this picture from Wikipedia shows:
![]()
Despite the simple nature of this domain, the stadium enjoys some interesting classical and quantum dynamics. The classical dynamics, or billiard dynamics on is ergodic (as shown by Bunimovich) but not uniquely ergodic. In more detail: we say the dynamics is ergodic because a billiard ball with randomly chosen initial position and velocity (as depicted above) will, over time, be uniformly distributed across the billiard (as well as in the energy surface of the phase space of the billiard). On the other hand, we say that the dynamics is not uniquely ergodic because there do exist some exceptional choices of initial position and velocity for which one does not have uniform distribution, namely the vertical trajectories in which the billiard reflects orthogonally off of the two flat edges indefinitely.
It is always dangerous to venture an opinion as to why a problem is hard (cf. Clarke’s first law), but I’m going to stick my neck out on this one, because (a) it seems that there has been a lot of effort expended on this problem recently, sometimes perhaps without full awareness of the main difficulties, and (b) I would love to be proved wrong on this opinion :-) .
The global regularity problem for Navier-Stokes is of course a Clay Millennium Prize problem and it would be redundant to describe it again here. I will note, however, that it asks for existence of global smooth solutions to a Cauchy problem for a nonlinear PDE. There are countless other global regularity results of this type for many (but certainly not all) other nonlinear PDE; for instance, global regularity is known for Navier-Stokes in two spatial dimensions rather than three (this result essentially dates all the way back to Leray’s thesis in 1933!). Why is the three-dimensional Navier-Stokes global regularity problem considered so hard, when global regularity for so many other equations is easy, or at least achievable?
(For this post, I am only considering the global regularity problem for Navier-Stokes, from a purely mathematical viewpoint, and in the precise formulation given by the Clay Institute; I will not discuss at all the question as to what implications a rigorous solution (either positive or negative) to this problem would have for physics, computational fluid dynamics, or other disciplines, as these are beyond my area of expertise. But if anyone qualified in these fields wants to make a comment along these lines, by all means do so.)
[This post is authored by Ben Green, who has kindly "guest blogged" this week's "open problem of the week". - T.]
In an earlier blog post Terry discussed Freiman’s theorem. The name of Freiman is attached to a growing body of theorems which take some rather “combinatorial” hypothesis, such that the sumset |A+A| of some set A is small, and deduce from it rather “algebraic” information (such that A is contained in a subspace or a grid).
The easiest place to talk about Freiman’s theorem is in the finite field model (see my survey article on this subject for a full discussion). Here it was shown by Ruzsa that if |A+A| is at most
then A is contained in a subspace of size no more than about
. The exponent has been improved a few times since Ruzsa’s paper, the best result currently in print being due to Sanders, who obtains an upper bound of
. Terry and I are in the process of writing a paper which obtains
, which is best possible in view of the example
where
; this set has doubling roughly K but is not contained in a subspace of dimension smaller than 2K.
This result has an air of finality (except for the true nature of the o(K) term, which represents an interesting open problem). This is something of an illusion, however. Even using this theorem, one loses an exponential every time one tries to transition between “combinatorial” structure and “algebraic” structure and back again. Indeed if one knows that A is contained in a subspace of size then the strongest assertion one can make about the doubling of A is that it is at most
.
The Polynomial Freiman-Ruzsa conjecture (PFR), in , hypothesises a more precise structure theorem for sets with small doubling. Using this
conjecture, one may flit back and forth between combinatorial and algebraic structure with only polynomial losses. Ruzsa attributes the conjecture to
Marton: it states that if A has doubling at most K then A is contained in the union of translates of some subspace H of size at most |A|.
It seems that I have unwittingly started an “open problem of the week” column here; certainly it seems easier for me to pose unsolved problems than to write papers :-) .
This question in convex geometry has been around for a while; I am fond of it because it attempts to capture the intuitively obvious fact that cubes and octahedra are the “pointiest” possible symmetric convex bodies one can create. Sadly, we still have very few tools to make this intuition rigorous (especially when compared against the assertion that the Euclidean ball is the “roundest” possible convex body, for which we have many rigorous and useful formulations).
To state the conjecture I need a little notation. Suppose we have a symmetric convex body in a Euclidean space, thus B is open, convex, bounded, and symmetric around the origin. We can define the polar body
by
.
This is another symmetric convex body. One can interpret B as the unit ball of a Banach space norm on , in which case
is simply the unit ball of the dual norm. The Mahler volume
of the body is defined as the product of the volumes of B and its polar body:
This is another one of my favourite open problems, falling under the heading of inverse theorems in arithmetic combinatorics. “Direct” theorems in arithmetic combinatorics take a finite set A in a group or ring and study things like the size of its sum set or product set
. For example, a typical result in this area is the sum-product theorem, which asserts that whenever
is a subset of a finite field of prime order with
, then
for some . (This particular theorem was first proven here, with an earlier partial result here; more recent and elementary proofs with civilised bounds can be found here, here or here. It has a number of applications.)
In contrast, inverse theorems in this subject start with a hypothesis that, say, the sum set A+A of an unknown set A is small, and try to deduce structural information about A. A typical goal is to completely classify all sets A for which A+A has comparable size with A. In the case of finite subsets of integers, this is Freiman’s theorem, which roughly speaking asserts that if , if and only if A is a dense subset of a generalised arithmetic progression P of rank O(1), where we say that A is a dense subset of B if
and
. (The “if and only if” has to be interpreted properly; in either the “if” or the “only if” direction, the implicit constants in the conclusion depends on the implicit constants in the hypothesis, but these dependencies are not inverses of each other.) In the case of finite subsets A of an arbitrary abelian group, we have the Freiman-Green-Ruzsa theorem, which asserts that
if and only if A is a dense subset of a sum P+H of a finite subgroup H and a generalised arithmetic progression P of rank O(1).
Earlier this month, in the previous incarnation of this page, I posed a question which I thought was unsolved, and obtained the answer (in fact, it was solved 25 years ago) within a week. Now that this page now has better feedback capability, I am now tempted to try again, since I have a large number of such questions which I would like to publicise. (Actually, I even have a secret web page full of these somewhere near my home page, though it will take a non-trivial amount of effort to find it!)
Perhaps my favourite open question is the problem on the maximal size of a cap set - a subset of (
being the finite field of three elements) which contains no lines, or equivalently no non-trivial arithmetic progressions of length three. As an upper bound, one can easily modify the proof of Roth’s theorem to show that cap sets must have size
(see e.g. this paper of Meshulam). This of course is better than the trivial bound of
once n is large. In the converse direction, the trivial example
shows that cap sets can be as large as
; the current world record is
, held by Edel. The gap between these two bounds is rather enormous; I would be very interested in either an improvement of the upper bound to
, or an improvement of the lower bound to
. (I believe both improvements are true, though a good friend of mine disagrees about the improvement to the lower bound.)

Recent Comments