This is going to be a somewhat experimental post. In class, I mentioned that when solving the type of homework problems encountered in a graduate real analysis course, there are really only about a dozen or so basic tricks and techniques that are used over and over again. But I had not thought to actually try to make these tricks explicit, so I am going to try to compile here a list of some of these techniques. This list is going to be far from exhaustive; if other recent students of real analysis would like to share their own methods, then I encourage you to do so in the comments (even – or especially – if the techniques are somewhat vague and general in nature).
(See also the Tricki for some general mathematical problem solving tips. Once this page matures somewhat, I might migrate it to the Tricki.)
Note: the tricks occur here in no particular order, reflecting the stream-of-consciousness way in which they were arrived at. Indeed, this list will be extended on occasion whenever I find another trick that can be added to this list.
This is a technical post inspired by separate conversations with Jim Colliander and with Soonsik Kwon on the relationship between two techniques used to control non-radiating solutions to dispersive nonlinear equations, namely the “double Duhamel trick” and the “in/out decomposition”. See for instance these lecture notes of Killip and Visan for a survey of these two techniques and other related methods in the subject. (I should caution that this post is likely to be unintelligible to anyone not already working in this area.)
For the sake of discussion we shall focus on solutions $u$ to a nonlinear Schrödinger equation

$$ i u_t + \Delta u = F(u) $$

and we will not concern ourselves with the specific regularity of the solution $u$, or the specific properties of the nonlinearity $F$ here. We will also not address the issue of how to justify the formal computations being performed here.

Solutions to this equation enjoy the forward Duhamel formula

$$ u(t) = e^{i(t-t_0)\Delta} u(t_0) - i \int_{t_0}^t e^{i(t-t')\Delta} F(u(t'))\, dt' $$

for times $t$ to the future of $t_0$ in the lifespan of the solution, as well as the backward Duhamel formula

$$ u(t) = e^{i(t-t_1)\Delta} u(t_1) + i \int_t^{t_1} e^{i(t-t')\Delta} F(u(t'))\, dt' $$

for all times $t$ to the past of $t_1$ in the lifespan of the solution. The first formula asserts that the solution at a given time is determined by the initial state and by the immediate past, while the second formula is the time reversal of the first, asserting that the solution at a given time is determined by the final state and the immediate future. These basic causal formulae are the foundation of the local theory of these equations, and in particular play an instrumental role in establishing local well-posedness for these equations. In this local theory, the main philosophy is to treat the homogeneous (or linear) term $e^{i(t-t_0)\Delta} u(t_0)$ or $e^{i(t-t_1)\Delta} u(t_1)$ as the main term, and the inhomogeneous (or nonlinear, or forcing) integral term as an error term.
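As a quick sanity check on the signs in the forward Duhamel formula, here is a minimal numerical sketch. It assumes the normalisation $i u_t + \Delta u = F$ and looks at a single Fourier mode, on which $\Delta$ acts as multiplication by $-k$ (with $k = |\xi|^2$), so that the mode obeys $u' = -iku - iF(t)$. The forcing is a prescribed function of time rather than a genuine nonlinearity, and the function names are hypothetical, chosen only for this illustration.

```python
import numpy as np

def duhamel_forward(u0, k, forcing, t0, t, n=20001):
    """Evaluate e^{-ik(t-t0)} u0 - i * int_{t0}^{t} e^{-ik(t-s)} forcing(s) ds."""
    s = np.linspace(t0, t, n)
    integrand = np.exp(-1j * k * (t - s)) * forcing(s)
    h = s[1] - s[0]
    # Composite trapezoid rule, written out by hand.
    integral = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return np.exp(-1j * k * (t - t0)) * u0 - 1j * integral

def solve_ode_rk4(u0, k, forcing, t0, t, n=20000):
    """Integrate u' = -i*k*u - i*forcing(s) directly with classical RK4."""
    def rhs(s, u):
        return -1j * k * u - 1j * forcing(s)
    h = (t - t0) / n
    u, s = complex(u0), t0
    for _ in range(n):
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, u + h / 2 * k1)
        k3 = rhs(s + h / 2, u + h / 2 * k2)
        k4 = rhs(s + h, u + h * k3)
        u = u + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        s += h
    return u

forcing = lambda s: np.cos(3.0 * s)  # hypothetical prescribed forcing
u_duhamel = duhamel_forward(1.0 + 0.5j, k=4.0, forcing=forcing, t0=0.0, t=2.0)
u_direct = solve_ode_rk4(1.0 + 0.5j, k=4.0, forcing=forcing, t0=0.0, t=2.0)
print(abs(u_duhamel - u_direct))  # discrepancy between the two computations
```

The two values should agree up to discretisation error; flipping the sign in front of the integral makes them disagree, which is a cheap way to catch sign slips in the formula.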
The situation is reversed when one turns to the global theory, and looks at the asymptotic behaviour of a solution as one approaches a limiting time (which can be infinite if one has global existence, or finite if one has finite time blowup). After a suitable rescaling, the linear portion of the solution often disappears from view, leaving one with an asymptotic blowup profile solution which is non-radiating in the sense that the linear components of the Duhamel formulae vanish, thus

$$ u(t) = - i \int_{t_-}^t e^{i(t-t')\Delta} F(u(t'))\, dt' \qquad (1) $$

and

$$ u(t) = i \int_t^{t_+} e^{i(t-t')\Delta} F(u(t'))\, dt' \qquad (2) $$

where $-\infty \leq t_- < t_+ \leq +\infty$ are the endpoint times of existence. (This type of situation comes up for instance in the Kenig-Merle approach to critical regularity problems, by reducing to a minimal blowup solution which is almost periodic modulo symmetries, and hence non-radiating.) These types of non-radiating solutions are propelled solely by their own nonlinear self-interactions from the immediate past or immediate future; they are generalisations of “nonlinear bound states” such as solitons.
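A key task is then to somehow combine the forward representation (1) and the backward representation (2) to obtain new information on $u(t)$ itself, that cannot be obtained from either representation alone; it seems that the immediate past and immediate future can collectively exert more control on the present than they each do separately. This type of problem can be abstracted as follows. Let $\|u(t)\|_{Y_+}$ be the infimal value of $\|F_+\|_N$ over all forward representations of $u(t)$ of the form

$$ u(t) = \int_{t_-}^t e^{i(t-t')\Delta} F_+(t')\, dt' \qquad (3) $$

where $N$ is some suitable spacetime norm, and similarly let $\|u(t)\|_{Y_-}$ be the infimal value of $\|F_-\|_N$ over all backward representations of $u(t)$ of the form

$$ u(t) = \int_t^{t_+} e^{i(t-t')\Delta} F_-(t')\, dt'. \qquad (4) $$

Typically, one already has (or is willing to assume as a bootstrap hypothesis) control on $F(u)$ in the norm $N$, which gives control of $u(t)$ in the norms $Y_+, Y_-$. The task is then to use the control of both the $Y_+$ and $Y_-$ norm of $u(t)$ to gain control of $u(t)$ in a more conventional Hilbert space norm $X$, which is typically a Sobolev space.

This task amounts to establishing an a priori bound of the form

$$ \|u\|_X \lesssim \|u\|_{Y_+} + \|u\|_{Y_-} \qquad (5) $$

for all reasonable $u$. The double Duhamel trick accomplishes this by establishing the stronger estimate

$$ |\langle u, v \rangle_X| \lesssim \|u\|_{Y_+} \|v\|_{Y_-} \qquad (6) $$

for all reasonable $u, v$; note that setting $u = v$ and applying the arithmetic-geometric inequality then gives (5). The point is that if $u$ has a forward representation (3) and $v$ has a backward representation (4), then the inner product $\langle u, v \rangle_X$ can (formally, at least) be expanded as a double integral

$$ \langle u, v \rangle_X = \int_{t_-}^t \int_t^{t_+} \langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_X\, dt''\, dt'. $$

The dispersive nature of the linear Schrödinger equation often causes $\langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_X$ to decay, especially in high dimensions. In high enough dimension (typically one needs five or higher dimensions, unless one already has some spacetime control on the solution), the decay is stronger than $1/|t'-t''|^2$, so that the integrand becomes absolutely integrable and one recovers (6).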
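To see where the dimensional threshold comes from, here is a rough model computation. It assumes (this is not made precise in the discussion here) that the integrand of the double integral is bounded by $(1+|t''-t'|)^{-\alpha}$ with $\alpha = d/2$, as the usual dispersive decay rate in $d$ dimensions would suggest once the short-time singularity has been handled by other means. Taking the present time to be $0$ and the lifespan to be all of ${\bf R}$, and substituting $a = -t'$, $b = t''$, $s = a+b$:

```latex
\int_{-\infty}^{0} \int_{0}^{\infty} \frac{dt''\, dt'}{(1 + t'' - t')^{\alpha}}
  = \int_0^\infty \int_0^\infty \frac{da\, db}{(1 + a + b)^{\alpha}}
  = \int_0^\infty \frac{s\, ds}{(1+s)^{\alpha}}
  = \frac{1}{(\alpha-1)(\alpha-2)} \qquad \text{for } \alpha > 2,
```

while the integral diverges for $\alpha \leq 2$. With $\alpha = d/2$ this requires $d > 4$, which is consistent with the heuristic that five or more dimensions are needed.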
Unfortunately it appears that estimates of the form (6) fail in low dimensions (for the type of norms that actually show up in applications); there is just too much interaction between past and future to hope for any reasonable control of this inner product. But one can try to obtain (5) by other means. By the Hahn-Banach theorem (and ignoring various issues related to reflexivity), (5) is equivalent to the assertion that every $u \in X$ can be decomposed as $u = u_+ + u_-$, where $\|u_+\|_{Y_+^*} \lesssim \|u\|_X$ and $\|u_-\|_{Y_-^*} \lesssim \|u\|_X$; here $Y_\pm^*$ denotes the norm dual to $Y_\pm$ under the $X$ inner product. Indeed once one has such a decomposition, one obtains (5) by computing the inner product of $u$ with $u = u_+ + u_-$ in $X$ in two different ways. One can also (morally at least) write $\|u_+\|_{Y_+^*}$ as $\|e^{i(\cdot-t)\Delta} u_+\|_{N^*([t_-,t])}$ and similarly write $\|u_-\|_{Y_-^*}$ as $\|e^{i(\cdot-t)\Delta} u_-\|_{N^*([t,t_+])}$.
So one can dualise the task of proving (5) as that of obtaining a decomposition of an arbitrary initial state $u$ into two components $u_+$ and $u_-$, where the former disperses into the past and the latter disperses into the future under the linear evolution. We do not know how to achieve this type of task efficiently in general – and doing so would likely lead to a significant advance in the subject (perhaps one of the main areas in this topic where serious harmonic analysis is likely to play a major role). But in the model case of spherically symmetric data $u$, one can perform such a decomposition quite easily: one uses microlocal projections to set $u_+$ to be the “inward” pointing component of $u$, which propagates towards the origin in the future and away from the origin in the past, and $u_-$ to similarly be the “outward” component of $u$. As spherical symmetry significantly dilutes the amplitude of the solution (and hence the strength of the nonlinearity) away from the origin, this decomposition tends to work quite well for applications, and is one of the main reasons (though not the only one) why we have a global theory for low-dimensional nonlinear Schrödinger equations in the radial case, but not in general.
The in/out decomposition is a linear one, but the Hahn-Banach argument gives no reason why the decomposition needs to be linear. (Note that other well-known decompositions in analysis, such as the Fefferman-Stein decomposition of BMO, are necessarily nonlinear, a fact which is ultimately equivalent to the non-complemented nature of a certain subspace of a Banach space; see these lecture notes of mine and this old blog post for some discussion.) So one could imagine a sophisticated nonlinear decomposition as a general substitute for the in/out decomposition. See for instance this paper of Bourgain and Brezis for some of the subtleties of decomposition even in very classical function spaces. Alternatively, there may well be a third way to obtain estimates of the form (5) that does not require either decomposition or the double Duhamel trick; such a method may well clarify the relationship between past, present, and future for critical nonlinear dispersive equations, which seems to be a key aspect of the theory that is still only partially understood. (In particular, it seems that one needs a fairly strong decoupling of the present from both the past and the future to get the sort of elliptic-like regularity results that allow us to make further progress with such equations.)
In this post I would like to make some technical notes on a standard reduction used in the (Euclidean, maximal) Kakeya problem, known as the two ends reduction. This reduction (which takes advantage of the approximate scale-invariance of the Kakeya problem) was introduced by Wolff, and has since been used many times, both for the Kakeya problem and in other similar problems (e.g. by Jim Wright and myself to study curved Radon-like transforms). I was asked about it recently, so I thought I would describe the trick here. As an application I give a proof of the case of the Kakeya maximal conjecture.
From Tim Gowers’ blog comes the announcement that the Tricki – a wiki for various tricks and strategies for proving mathematical results – is now live. (My own articles for the Tricki are also on this blog; also Ben Green has written up an article on using finite fields to prove results about infinite fields which is loosely based on my own post on the topic, which is in turn based on an article of Serre.) It seems to already be growing at a reasonable rate, with many contributors.
Title: Give yourself an epsilon of room.
Quick description: You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.). To do so, pick a small $\varepsilon > 0$, and first prove a weaker statement $S_\varepsilon$ (which allows for “losses” which go to zero as $\varepsilon \to 0$) about some perturbed object $x_\varepsilon$. Then, take limits $\varepsilon \to 0$. Provided that the dependency and continuity of the weaker conclusion $S_\varepsilon$ on $\varepsilon$ are sufficiently controlled, and $x_\varepsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.
One can of course play a similar game when proving a statement $S_\infty$ about some object $X_\infty$, by first proving a weaker statement $S_N$ on some approximation $X_N$ to $X_\infty$ for some large parameter N, and then sending $N \to \infty$ at the end.
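General discussion: Here are some typical examples of a target statement $S_0$, and the approximating statements $S_\varepsilon$ that would converge to $S_0$:

| $S_0$ | $S_\varepsilon$ |
| --- | --- |
| $f(x_0) > 0$ | $f(x_\varepsilon) \geq c - o(1)$ for some $c > 0$ independent of $\varepsilon$ |
| $f(x_0)$ is finite | $f(x_\varepsilon)$ is bounded uniformly in $\varepsilon$ |
| $f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises f) | $f(x_\varepsilon) \geq f(x) - o(1)$ for all $x \in X$ (i.e. $x_\varepsilon$ nearly maximises f) |
| $f_n(x_0)$ converges as $n \to \infty$ | $f_n(x_\varepsilon)$ fluctuates by at most o(1) for sufficiently large n |
| $f_0$ is a measurable function | $f_\varepsilon$ is a measurable function converging pointwise to $f_0$ |
| $f_0$ is a continuous function | $f_\varepsilon$ is an equicontinuous family of functions converging pointwise to $f_0$ OR $f_\varepsilon$ is continuous and converges (locally) uniformly to $f_0$ |
| The event $E_0$ holds almost surely | The event $E_\varepsilon$ holds with probability 1-o(1) |
| The statement $P_0(x)$ holds for almost every x | The statement $P_\varepsilon(x)$ holds for x outside of a set of measure o(1) |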
Of course, to justify the convergence of $S_\varepsilon$ to $S_0$, it is necessary that $x_\varepsilon$ converge to $x_0$ (or $f_\varepsilon$ converge to $f_0$, etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as $f(x_0) \leq M$, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.) Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.
It is also necessary in many cases that the control $S_\varepsilon$ on the approximating object $x_\varepsilon$ is somehow “uniform in $\varepsilon$”, although for “$\sigma$-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion $S_\varepsilon$ on $x_\varepsilon$ that needs to have this uniformity in $\varepsilon$; one is permitted to have some intermediate stages in the derivation of $S_\varepsilon$ that depend on $\varepsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]
By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis. For instance, by replacing “rough”, “infinite-complexity”, “continuous”, “global”, or otherwise “infinitary” objects $x_0$ with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants $x_\varepsilon$, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals). [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. $x_\varepsilon$ is should be expected to degrade in the limit $\varepsilon \to 0$, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\varepsilon$.] Similarly, issues such as whether the supremum $M := \sup \{ f(x): x \in X \}$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\varepsilon$, e.g. one which comes within an epsilon of that supremum M (or which is larger than $1/\varepsilon$, if M turns out to be infinite). Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.
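A small numerical sketch of this trade-off, using one standard regularisation chosen here purely for illustration: the rough function $|x|$ is replaced by the smooth approximant $\sqrt{\varepsilon^2 + x^2}$. The function names are hypothetical. The sup-norm error and the first derivative are controlled uniformly in $\varepsilon$, whereas the second derivative at the origin equals $1/\varepsilon$ and degrades as $\varepsilon \to 0$, so it is the kind of quantity one should not rely on in the final estimate.

```python
import numpy as np

def smooth_abs(x, eps):
    """Smooth approximant sqrt(eps^2 + x^2) to |x|."""
    return np.sqrt(eps**2 + x**2)

def smooth_abs_second_derivative(x, eps):
    """Second derivative of smooth_abs in x: eps^2 / (eps^2 + x^2)^(3/2)."""
    return eps**2 / (eps**2 + x**2) ** 1.5

x = np.linspace(-1.0, 1.0, 2001)
for eps in (1e-1, 1e-2, 1e-3):
    sup_error = np.max(np.abs(smooth_abs(x, eps) - np.abs(x)))  # at most eps
    first_deriv_sup = np.max(np.abs(x / smooth_abs(x, eps)))    # at most 1
    curvature_at_0 = smooth_abs_second_derivative(0.0, eps)     # equals 1/eps
    print(eps, sup_error, first_deriv_sup, curvature_at_0)
```

In the language of the loan metaphor, the Lipschitz bound is something one can afford to keep, while the curvature bound has to be repaid (i.e. eliminated) before sending $\varepsilon \to 0$.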
To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).
A variant: It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations where all objects are already “finitary”, and all formal computations are easily justified. However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day. (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.) Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.
Prerequisites: Graduate real analysis. (Actually, this isn’t so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.) Some examples also require some exposure to PDE.
As many readers may already know, my good friend and fellow mathematical blogger Tim Gowers, having wrapped up work on the Princeton Companion to Mathematics (which I believe is now in press), has begun another mathematical initiative, namely a “Tricks Wiki” to act as a repository for mathematical tricks and techniques. Tim has already started the ball rolling with several seed articles on his own blog, and asked me to also contribute some articles. (As I understand it, these articles will be migrated to the Wiki in a few months, once it is fully set up, and then they will evolve with edits and contributions by anyone who wishes to pitch in, in the spirit of Wikipedia; in particular, articles are not intended to be permanently authored or signed by any single contributor.)
So today I’d like to start by extracting some material from an old post of mine on “Amplification, arbitrage, and the tensor power trick” (as well as from some of the comments), and converting it to the Tricks Wiki format, while also taking the opportunity to add a few more examples.
Title: The tensor power trick
Quick description: If one wants to prove an inequality $X \leq Y$ for some non-negative quantities X, Y, but can only see how to prove a quasi-inequality $X \leq CY$ that loses a multiplicative constant C, then try to replace all objects involved in the problem by “tensor powers” of themselves and apply the quasi-inequality to those powers. If all goes well, one can show that $X^M \leq C Y^M$ for all $M \geq 1$, with a constant C which is independent of M, which implies that $X \leq Y$ as desired by taking $M^{th}$ roots and then taking limits as $M \to \infty$.
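The mechanism can be illustrated numerically in a toy setting, chosen here as an assumption for concreteness: $\ell^p$ norms of finite sequences, which are multiplicative under tensor (Kronecker) powers. The helper names are hypothetical. The sketch checks that the norm of the $M$-fold tensor power is the $M^{th}$ power of the norm, and that a lossy constant $C$ becomes $C^{1/M}$ after taking roots, which tends to 1.

```python
import numpy as np

def lp_norm(f, p):
    """l^p norm of a finite sequence (counting measure)."""
    return float(np.sum(np.abs(f) ** p) ** (1.0 / p))

def tensor_power(f, M):
    """M-fold Kronecker (tensor) power of a 1-D array, flattened."""
    out = np.array([1.0])
    for _ in range(M):
        out = np.kron(out, f)
    return out

def rooted_constant(C, M):
    """Constant left over after applying X^M <= C * Y^M and taking M-th roots."""
    return C ** (1.0 / M)

f = np.array([1.0, 2.0, 3.0])
p, M = 3.0, 3
print(lp_norm(tensor_power(f, M), p), lp_norm(f, p) ** M)  # the two should agree
print([rooted_constant(10.0, m) for m in (1, 10, 100, 1000)])  # decreases towards 1
```

The trick is only available when both sides of the quasi-inequality behave multiplicatively under the tensor power in this way and the constant does not grow with $M$.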
It occurred to me recently that the mathematical blog medium may be a good venue not just for expository “short stories” on mathematical concepts or results, but also for more technical discussions of individual mathematical “tricks”, which would otherwise not be significant enough to warrant a publication-length (and publication-quality) article. So I thought today that I would discuss the amplification trick in harmonic analysis and combinatorics (and in particular, in the study of estimates); this trick takes an established estimate involving an arbitrary object (such as a function f), and obtains a stronger (or amplified) estimate by transforming the object in a well-chosen manner (often involving some new parameters) into a new object, applying the estimate to that new object, and seeing what that estimate says about the original object (after optimising the parameters or taking a limit). The amplification trick works particularly well for estimates which enjoy some sort of symmetry on one side of the estimate that is not represented on the other side; indeed, it can be viewed as a way to “arbitrage” differing amounts of symmetry between the left- and right-hand sides of an estimate. It can also be used in the contrapositive, amplifying a weak counterexample to an estimate into a strong counterexample. This trick also sheds some light as to why dimensional analysis works; an estimate which is not dimensionally consistent can often be amplified into a stronger estimate which is dimensionally consistent; in many cases, this new estimate is so strong that it cannot in fact be true, and thus dimensionally inconsistent inequalities tend to be either false or inefficient, which is why we rarely see them. (More generally, any inequality on which a group acts on either the left or right-hand side can often be “decomposed” into the “isotypic components” of the group action, either by the amplification trick or by other related tools, such as Fourier analysis.)
The amplification trick is a deceptively simple one, but it can become particularly powerful when one is arbitraging an unintuitive symmetry, such as symmetry under tensor powers. Indeed, the “tensor power trick”, which can eliminate constants and even logarithms in an almost magical manner, can lead to some interesting proofs of sharp inequalities, which are difficult to establish by more direct means.
The most familiar example of the amplification trick in action is probably the textbook proof of the Cauchy-Schwarz inequality

$$ |\langle v, w \rangle| \leq \|v\| \|w\| \qquad (1) $$

for vectors v, w in a complex Hilbert space. To prove this inequality, one might start by exploiting the obvious inequality

$$ \|v - w\|^2 \geq 0 \qquad (2) $$

but after expanding everything out, one only gets the weaker inequality

$$ \hbox{Re} \langle v, w \rangle \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2. \qquad (3) $$

Now (3) is weaker than (1) for two reasons; the left-hand side is smaller, and the right-hand side is larger (thanks to the arithmetic mean-geometric mean inequality). However, we can amplify (3) by arbitraging some symmetry imbalances. Firstly, observe that the phase rotation symmetry $v \mapsto e^{i\theta} v$ preserves the RHS of (3) but not the LHS. We exploit this by replacing v by $e^{i\theta} v$ in (3) for some phase $\theta$ to be chosen later, to obtain

$$ \hbox{Re}\, e^{i\theta} \langle v, w \rangle \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2. $$

Now we are free to choose $\theta$ at will (as long as it is real, of course), so it is natural to choose $\theta$ to optimise the inequality, which in this case means to make the left-hand side as large as possible. This is achieved by choosing $e^{i\theta}$ to cancel the phase of $\langle v, w \rangle$, and we obtain

$$ |\langle v, w \rangle| \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2. \qquad (4) $$

This is closer to (1); we have fixed the left-hand side, but the right-hand side is still too weak. But we can amplify further, by exploiting an imbalance in a different symmetry, namely the homogenisation symmetry $(v, w) \mapsto (\lambda v, \frac{1}{\lambda} w)$ for a scalar $\lambda > 0$, which preserves the left-hand side but not the right. Inserting this transform into (4) we conclude that

$$ |\langle v, w \rangle| \leq \frac{\lambda^2}{2} \|v\|^2 + \frac{1}{2\lambda^2} \|w\|^2 $$

where $\lambda > 0$ is at our disposal to choose. We can optimise in $\lambda$ by minimising the right-hand side, and indeed one easily sees that the minimum (or infimum, if one of v and w vanishes) is $\|v\| \|w\|$ (which is achieved at $\lambda = \sqrt{\|w\|/\|v\|}$ when $v, w$ are non-zero, or in an asymptotic limit $\lambda \to 0$ or $\lambda \to \infty$ in the degenerate cases), and so we have amplified our way to the Cauchy-Schwarz inequality (1). [See also this discussion by Tim Gowers on the Cauchy-Schwarz inequality.]
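The two amplification steps can be checked numerically. The sketch below uses random vectors in ${\bf C}^4$ as an illustrative assumption, with hypothetical helper names: it rotates the phase so that $\hbox{Re}\, e^{i\theta} \langle v, w \rangle$ becomes $|\langle v, w \rangle|$, and then scans the rescaled bound $\frac{\lambda^2}{2} \|v\|^2 + \frac{1}{2\lambda^2} \|w\|^2$ over $\lambda > 0$ to confirm that its minimum is $\|v\| \|w\|$, attained at $\lambda = \sqrt{\|w\|/\|v\|}$.

```python
import numpy as np

def amplified_bound(v, w, lam):
    """Right-hand side (lam^2 / 2) * ||v||^2 + (1 / (2 * lam^2)) * ||w||^2."""
    return 0.5 * lam**2 * np.linalg.norm(v) ** 2 + 0.5 * np.linalg.norm(w) ** 2 / lam**2

rng = np.random.default_rng(0)
v = rng.normal(size=4) + 1j * rng.normal(size=4)
w = rng.normal(size=4) + 1j * rng.normal(size=4)

inner = np.vdot(w, v)                     # <v, w> = sum_j v_j * conj(w_j)
theta = -np.angle(inner)                  # phase chosen to cancel the phase of <v, w>
rotated = (np.exp(1j * theta) * inner).real

lam_opt = np.sqrt(np.linalg.norm(w) / np.linalg.norm(v))
best = amplified_bound(v, w, lam_opt)
grid = np.array([amplified_bound(v, w, lam) for lam in np.linspace(0.1, 10.0, 1000)])

print(rotated, abs(inner))                               # equal after the phase rotation
print(best, np.linalg.norm(v) * np.linalg.norm(w))       # equal at the optimal lambda
print(grid.min() >= best - 1e-12, abs(inner) <= best)    # no lambda beats lam_opt; (1) holds
```

Each symmetry is used on the side of the inequality it does not preserve: the phase rotation improves the left-hand side, and the rescaling improves the right-hand side.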