
The self-chosen remit of my blog is “Updates on my research and expository papers, discussion of open problems, and other maths-related topics”.  Of the 774 posts on this blog, I estimate that about 99% of the posts indeed relate to mathematics, mathematicians, or the administration of this mathematical blog, and only about 1% are not related to mathematics or the community of mathematicians in any significant fashion.

This is not one of the 1%.

Mathematical research is clearly an international activity.  But actually a stronger claim is true: mathematical research is a transnational activity, in that the specific nationality of individual members of a research team or research community is (or should be) of no appreciable significance for the purpose of advancing mathematics.  For instance, even during the height of the Cold War, there was no movement in (say) the United States to boycott Soviet mathematicians or theorems, or to only use results from Western literature (though the latter did sometimes happen by default, due to the limited avenues of information exchange between East and West, and the former did occasionally occur for political reasons, most notably with the Soviet Union preventing Gregory Margulis from traveling to receive his Fields Medal in 1978 EDIT: and also Sergei Novikov in 1970).  The national origins of even the most fundamental components of mathematics, whether it be the geometry (γεωμετρία) of the ancient Greeks, the algebra (الجبر) of the Islamic world, or the Hindu-Arabic numerals 0,1,\dots,9, are primarily of historical interest, and have only a negligible impact on the worldwide adoption of these mathematical tools. While it is true that individual mathematicians or research teams sometimes compete with each other to be the first to solve some desired problem, and that a citizen could take pride in the mathematical achievements of researchers from their country, one did not see any significant state-sponsored “space races” in which it was deemed in the national interest that a particular result ought to be proven by “our” mathematicians and not “theirs”.

Mathematical research ability is highly non-fungible, and the value added by foreign students and faculty to a mathematics department cannot be completely replaced by an equivalent number of domestic students and faculty, no matter how large and well educated the country (though a state can certainly work at the margins to encourage and support more domestic mathematicians).  It is no coincidence that all of the top mathematics departments worldwide actively recruit the best mathematicians regardless of national origin, and often retain immigration counsel to assist with situations in which these mathematicians come from a country that is currently politically disfavoured by their own.

Of course, mathematicians cannot ignore the political realities of the modern international order altogether.  Anyone who has organised an international conference or program knows that there will inevitably be visa issues to resolve because the host country makes it particularly difficult for certain nationals to attend the event.  I myself, like many other academics working long-term in the United States, have certainly experienced my own share of immigration bureaucracy, starting with various glitches in the renewal or application of my J-1 and O-1 visas, then to the lengthy vetting process for acquiring permanent residency (or “green card”) status, and finally to becoming naturalised as a US citizen (retaining dual citizenship with Australia).  Nevertheless, while the process could be slow and frustrating, there was at least an order to it.  The rules of the game were complicated, but were known in advance, and did not abruptly change in the middle of playing it (save in truly exceptional situations, such as the days after the September 11 terrorist attacks).  One just had to study the relevant visa regulations (or hire an immigration lawyer to do so), fill out the paperwork and submit to the relevant background checks, and remain in good standing until the application was approved in order to study, work, or participate in a mathematical activity held in another country.  On rare occasion, some senior university administrator may have had to contact a high-ranking government official to approve some particularly complicated application, but for the most part one could work through normal channels in order to ensure for instance that the majority of participants of a conference could actually be physically present at that conference, or that an excellent mathematician hired by unanimous consent by a mathematics department could in fact legally work in that department.

With the recent and highly publicised executive order on immigration, many of these fundamental assumptions have been seriously damaged, if not destroyed altogether.  Even if the order was withdrawn immediately, there is no longer an assurance, even for nationals not initially impacted by that order, that some similar abrupt and major change in the rules for entry to the United States could not occur, for instance for a visitor who has already gone through the lengthy visa application process and background checks, secured the appropriate visa, and is already in flight to the country.  This is already affecting upcoming or ongoing mathematical conferences or programs in the US, with many international speakers (including those from countries not directly affected by the order) now cancelling their visit, either in protest or in concern about their ability to freely enter and leave the country.  Even some conferences outside the US are affected, as some mathematicians currently in the US with a valid visa or even permanent residency are uncertain if they could ever return back to their place of work if they left the country to attend a meeting.  In the slightly longer term, it is likely that the ability of elite US institutions to attract the best students and faculty will be seriously impacted.  Again, the losses would be strongest regarding candidates that were nationals of the countries affected by the current executive order, but I fear that many other mathematicians from other countries would now be much more concerned about entering and living in the US than they would have previously.

It is still possible for this sort of long-term damage to the mathematical community (both within the US and abroad) to be reversed or at least contained, but at present there is a real risk of the damage becoming permanent.  To prevent this, it seems insufficient to me for the current order to be rescinded, as desirable as that would be; some further legislative or judicial action would be needed to begin restoring enough trust in the stability of the US immigration and visa system that the international travel that is so necessary to modern mathematical research becomes “just” a bureaucratic headache again.

Of course, the impact of this executive order is far, far broader than just its effect on mathematicians and mathematical research.  But there are countless other venues on the internet and elsewhere to discuss these other aspects (or politics in general).  (For instance, discussion of the qualifications, or lack thereof, of the current US president can be carried out at this previous post.) I would therefore like to open this post to readers to discuss the effects or potential effects of this order on the mathematical community; I particularly encourage mathematicians who have been personally affected by this order to share their experiences.  As per the rules of the blog, I request that “the discussions are kept constructive, polite, and at least tangentially relevant to the topic at hand”.

Some relevant links (please feel free to suggest more, either through comments or by email):

In logic, there is a subtle but important distinction between the concept of mutual knowledge – information that everyone (or almost everyone) knows – and common knowledge, which is not only knowledge that (almost) everyone knows, but something that (almost) everyone knows that everyone else knows (and that everyone knows that everyone else knows that everyone else knows, and so forth).  A classic example arises from Hans Christian Andersen’s fable of the Emperor’s New Clothes: the fact that the emperor in fact has no clothes is mutual knowledge, but not common knowledge, because everyone (save, eventually, for a small child) is refusing to acknowledge the emperor’s nakedness, thus perpetuating the charade that the emperor is actually wearing some incredibly expensive and special clothing that is only visible to a select few.  My own personal favourite example of the distinction comes from the blue-eyed islander puzzle, discussed previously here, here and here on the blog.  (By the way, I would ask that any commentary about that puzzle be directed to those blog posts, rather than to the current one.)

I believe that there is now a real-life instance of this situation in the US presidential election, regarding the following

Proposition 1.  The presumptive nominee of the Republican Party, Donald Trump, is not even remotely qualified to carry out the duties of the presidency of the United States of America.

Proposition 1 is a statement which I think is approaching the level of mutual knowledge amongst the US population (and probably a large proportion of people following US politics overseas): even many of Trump’s nominal supporters secretly suspect that this proposition is true, even if they are hesitant to say it out loud.  And there have been many prominent people, from both major parties, that have made the case for Proposition 1: for instance Mitt Romney, the Republican presidential nominee in 2012, did so back in March, and just a few days ago Hillary Clinton, the likely Democratic presidential nominee this year, did so in this speech:

I highly recommend watching the entirety of the (35 mins or so) speech, followed by the entirety of Trump’s rebuttal.

However, even if Proposition 1 is approaching the status of “mutual knowledge”, it does not yet seem to be close to the status of “common knowledge”: one may secretly believe that Trump cannot be considered as a serious candidate for the US presidency, but must continue to entertain this possibility, because they feel that others around them, or in politics or the media, appear to be doing so.  To reconcile these views can require taking on some implausible hypotheses that are not otherwise supported by any evidence, such as the hypothesis that Trump’s displays of policy ignorance, pettiness, and other clearly unpresidential behaviour are merely “for show”, and that behind this facade there is actually a competent and qualified presidential candidate; much like the emperor’s new clothes, this alleged competence is supposedly only visible to a select few.  And so the charade continues.

I feel that it is time for the charade to end: Trump is unfit to be president, and everybody knows it.  But more people need to say so, openly.

Important note: I anticipate there will be any number of “tu quoque” responses, asserting for instance that Hillary Clinton is also unfit to be the US president.  I personally do not believe that to be the case (and certainly not to the extent that Trump exhibits), but in any event such an assertion has no logical bearing on the qualification of Trump for the presidency.  As such, any comments that are purely of this “tu quoque” nature, and which do not directly address the validity or epistemological status of Proposition 1, will be deleted as off-topic.  However, there is a legitimate case to be made that there is a fundamental weakness in the current mechanics of the US presidential election, particularly with the “first-past-the-post” voting system, in that (once the presidential primaries are concluded) a voter in the presidential election is effectively limited to choosing between just two viable choices, one from each of the two major parties, or else refusing to vote or making a largely symbolic protest vote. This weakness is particularly evident when at least one of these two major choices is demonstrably unfit for office, as per Proposition 1.  I think there is a serious case for debating the possibility of major electoral reform in the US (I am particularly partial to the Instant Runoff Voting system, used for instance in my home country of Australia, which allows for meaningful votes to third parties), and I would consider such a debate to be on-topic for this post.  But this is very much a longer term issue, as there is absolutely no chance that any such reform would be implemented by the time of the US elections in November (particularly given that any significant reform would almost certainly require, at minimum, a constitutional amendment).


It has been a little over two weeks now since the protest site was set up to register declarations of non-cooperation with Reed Elsevier in protest of their research publishing practices, inspired by this blog post of Tim Gowers.   Awareness of the protest has certainly grown in these two weeks; the number of signatories is now well over four thousand, across a broad array of academic disciplines, and the protest has been covered by many blogs and also the mainstream media (e.g. the Guardian, the Economist, Forbes, etc.), and even by Elsevier stock analysts.    (Elsevier itself released an open letter responding to the protest here.)  My interpretation of events is that there was a significant amount of latent or otherwise undisclosed dissatisfaction already with the publishing practices of Elsevier (and, to a lesser extent, some other commercial academic publishers), and a desire to support alternatives such as university or society publishers, and the more recent open access journals; and that this protest (and parallel protests, such as the movement to oppose the Research Works Act) served to drive these feelings out into the open.

The statement of the protest itself, though, is rather brief, reflecting the improvised manner in which the site was created.  A group of mathematicians including myself therefore decided to write and sign a more detailed explanation of why we supported this protest, giving more background and references to support our position.   The 34 signatories are Scott Aaronson, Douglas N. Arnold, Artur Avila, John Baez, Folkmar Bornemann, Danny Calegari, Henry Cohn, Ingrid Daubechies, Jordan Ellenberg, Matthew Emerton, Marie Farge, David Gabai, Timothy Gowers, Ben Green, Martin Grotschel, Michael Harris, Frederic Helein, Rob Kirby, Vincent Lafforgue, Gregory F. Lawler, Randall J. LeVeque, Laszlo Lovasz, Peter J. Olver, Olof Sisask, Richard Taylor, Bernard Teissier, Burt Totaro, Lloyd N. Trefethen, Takashi Tsuboi, Marie-France Vigneras, Wendelin Werner, Amie Wilkinson, Gunter M. Ziegler, and myself.  (Note that while Daubechies is current president of the International Mathematical Union, Lovasz is a past president, and Grotschel is the current secretary, they are signing this letter as individuals and not as representatives of the IMU. Similarly for Trefethen and Arnold (current and past president of SIAM).)

Of course, the 34 of us do not presume to speak for the remaining four thousand signatories to the protest, but I hope that our statement is somewhat representative of the position of many of its supporters.

Further discussion of this statement can be found at this blog post of Tim Gowers.

EDIT: I think it is appropriate to quote the following excerpt from our statement:

All mathematicians must decide for themselves whether, or to what extent, they wish to participate in the boycott. Senior mathematicians who have signed the boycott bear some responsibility towards junior colleagues who are forgoing the option of publishing in Elsevier journals, and should do their best to help minimize any negative career consequences.

Whether or not you decide to join the boycott, there are some simple actions that everyone can take, which seem to us to be uncontroversial:

  1. Make sure that the final versions of all your papers, particularly new ones, are freely available online, ideally both on the arXiv and on your home page.
  2.  If you are submitting a paper and there is a choice between an expensive journal and a cheap (or free) journal of the same standard, then always submit to the cheap one.

A few days ago, inspired by this recent post of Tim Gowers, a web page entitled “the cost of knowledge” has been set up as a location for mathematicians and other academics to declare a protest against the academic publishing practices of Reed Elsevier, in particular with regard to their exceptionally high journal prices, their policy of “bundling” journals together so that libraries are forced to purchase subscriptions to large numbers of low-quality journals in order to gain access to a handful of high-quality journals, and their opposition to the open access movement (as manifested, for instance, in their lobbying in support of legislation such as the Stop Online Piracy Act (SOPA) and the Research Works Act (RWA)).   [These practices have been documented in a number of places; this wiki page, which was set up in response to Tim’s post, collects several relevant links for this purpose.  Some of the other commercial publishers have  exhibited similar behaviour, though usually not to the extent that Elsevier has, which is why this particular publisher is the focus of this protest.]  At the protest site, one can publicly declare a refusal to either publish at an Elsevier journal, referee for an Elsevier journal, or join the board of an Elsevier journal.

(In the past, the editorial boards of several Elsevier journals have resigned over the pricing policies of the journal, most famously the board of Topology in 2006, but also the Journal of Algorithms in 2003, and a number of journals in other sciences as well.  Several libraries, such as those of Harvard and Cornell, have also managed to negotiate an unbundling of Elsevier journals, but most libraries are still unable to subscribe to such journals individually.)

For a more thorough discussion as to why such a protest is warranted, please see Tim’s post on the matter (and the 100+ comments to that post).   Many of the issues regarding Elsevier were already known to some extent to many mathematicians (particularly those who have served on departmental library committees), several of whom had already privately made the decision to boycott Elsevier; but nevertheless it is important to bring these issues out into the open, to make them commonly known as opposed to merely mutually known.  (Amusingly, this distinction is also of crucial importance in my favorite logic puzzle, but that’s another story.)   One can also see Elsevier’s side of the story in this response to Tim’s post by David Clark (the Senior Vice President for Physical Sciences at Elsevier).

For my own part, though I have sent about 9% of my papers in the past to Elsevier journals (with one or two still in press), I have now elected not to submit any further papers to these journals, nor to serve on their editorial boards, though I will continue refereeing some papers from these journals.  As of this time of writing, over five hundred mathematicians and other academics have also signed on to the protest in the four days that the site has been active.

Admittedly, I am fortunate enough to be at a stage of career in which I am not pressured to publish in a very specific set of journals, and as such, I am not making a recommendation as to what anyone else should do or not do regarding this protest.  However, I do feel that it is worth spreading awareness, at least, of the fact that such protests exist (and some additional petitions on related issues can be found at the previously mentioned wiki page).

A friend of mine recently asked me for some suggestions for games or other activities for children that would help promote quantitative reasoning or mathematical skills, while remaining fun to play (i.e. more than just homework-type questions poorly disguised in game form).    The initial question was focused on computer games (and specifically, on iPhone apps), but I think the broader question would also be of interest.

I myself have not seriously played these sorts of games for years, so I could only come up with a few examples immediately: the game “Planarity“, and the game “Factory Balls” (and two sequels).   (Edit: Rubik’s cube and its countless cousins presumably qualify also, due to their implicit use of group theory.)  I am hopeful though that readers may be able to come up with more suggestions.

There is of course no shortage of “educational” games, computer-based or otherwise, available, but I think what I (and my friend) would be looking for here are games with production values comparable to other, less educational games, and for which the need for mathematical thinking arises naturally in the gameplay rather than being artificially inserted by fiat (e.g. “solve this equation to proceed”).  (Here I interpret “mathematical thinking” loosely, to include not just numerical or algebraic thinking, but also geometric, abstract, logical, probabilistic, etc.)

[Question for MathOverflow experts: would this type of question be suitable for crossposting there?   The requirement that such questions be “research-level” seems to suggest not.]

It’s been a while since I’ve added to my career advice and writing pages on this blog, but I recently took the time to write up another such page on a topic I had not previously covered, entitled “Write in your own voice“.  The main point here is that while every piece of mathematical research inevitably builds upon the previous literature, one should not mimic the style and text of that literature slavishly, but instead develop one’s own individual style, while also updating and adapting the results and insights of previous authors.

This is a technical post inspired by separate conversations with Jim Colliander and with Soonsik Kwon on the relationship between two techniques used to control non-radiating solutions to dispersive nonlinear equations, namely the “double Duhamel trick” and the “in/out decomposition”. See for instance these lecture notes of Killip and Visan for a survey of these two techniques and other related methods in the subject. (I should caution that this post is likely to be unintelligible to anyone not already working in this area.)

For sake of discussion we shall focus on solutions to a nonlinear Schrödinger equation

\displaystyle  iu_t + \Delta u = F(u)

and we will not concern ourselves with the specific regularity of the solution {u}, or the specific properties of the nonlinearity {F} here. We will also not address the issue of how to justify the formal computations being performed here.

Solutions to this equation enjoy the forward Duhamel formula

\displaystyle  u(t) = e^{i(t-t_0)\Delta} u(t_0) - i \int_{t_0}^t e^{i(t-t')\Delta} F(u(t'))\ dt'

for times {t} to the future of {t_0} in the lifespan of the solution, as well as the backward Duhamel formula

\displaystyle  u(t) = e^{i(t-t_1)\Delta} u(t_1) + i \int_t^{t_1} e^{i(t-t')\Delta} F(u(t'))\ dt'

for all times {t} to the past of {t_1} in the lifespan of the solution. The first formula asserts that the solution at a given time is determined by the initial state and by the immediate past, while the second formula is the time reversal of the first, asserting that the solution at a given time is determined by the final state and the immediate future. These basic causal formulae are the foundation of the local theory of these equations, and in particular play an instrumental role in establishing local well-posedness for these equations. In this local theory, the main philosophy is to treat the homogeneous (or linear) term {e^{i(t-t_0)\Delta} u(t_0)} or {e^{i(t-t_1)\Delta} u(t_1)} as the main term, and the inhomogeneous (or nonlinear, or forcing) integral term as an error term.
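As a rough numerical sanity check of the two Duhamel formulae, one can restrict to a single Fourier mode, on which {\Delta} acts as multiplication by {-\lambda}, so that the equation becomes the ODE {u' = -i\lambda u - iF(u)}. The following Python sketch is only an illustration under that simplifying assumption; the cubic nonlinearity, the value of {\lambda}, the RK4 integrator, and all function names are choices made here and are not taken from the discussion above.

```python
import cmath

def simulate_mode(lam, u0, total_time, steps, F):
    """Integrate u' = -i*lam*u - i*F(u) with classical RK4.

    This is the single-mode model of i u_t + Delta u = F(u), with Delta
    replaced by multiplication by -lam (an assumption of this sketch).
    Returns the list of samples u(t_0 + j*h) and the step size h.
    """
    h = total_time / steps
    rhs = lambda u: -1j * lam * u - 1j * F(u)
    us = [complex(u0)]
    for _ in range(steps):
        u = us[-1]
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * h * k1)
        k3 = rhs(u + 0.5 * h * k2)
        k4 = rhs(u + h * k3)
        us.append(u + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4))
    return us, h

def trapezoid(vals, h):
    """Trapezoid rule for equally spaced samples."""
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def forward_duhamel(lam, us, h, F):
    """Right-hand side of the forward formula, evaluated at the final time t_1."""
    n = len(us) - 1
    linear = cmath.exp(-1j * lam * n * h) * us[0]
    vals = [cmath.exp(-1j * lam * (n - j) * h) * F(us[j]) for j in range(n + 1)]
    return linear - 1j * trapezoid(vals, h)

def backward_duhamel(lam, us, h, F):
    """Right-hand side of the backward formula, evaluated at the initial time t_0."""
    n = len(us) - 1
    linear = cmath.exp(1j * lam * n * h) * us[n]
    vals = [cmath.exp(1j * lam * j * h) * F(us[j]) for j in range(n + 1)]
    return linear + 1j * trapezoid(vals, h)

cubic = lambda u: abs(u) ** 2 * u  # hypothetical choice of nonlinearity
us, h = simulate_mode(lam=2.0, u0=1 + 0.5j, total_time=1.0, steps=2000, F=cubic)
forward_error = abs(forward_duhamel(2.0, us, h, cubic) - us[-1])
backward_error = abs(backward_duhamel(2.0, us, h, cubic) - us[0])
print(forward_error, backward_error)
```

Both errors should be small (limited by the quadrature and time-stepping accuracy), reflecting that the solution at the final time is recovered from the initial state and the past, and the solution at the initial time from the final state and the future.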

The situation is reversed when one turns to the global theory, and looks at the asymptotic behaviour of a solution as one approaches a limiting time {T} (which can be infinite if one has global existence, or finite if one has finite time blowup). After a suitable rescaling, the linear portion of the solution often disappears from view, leaving one with an asymptotic blowup profile solution which is non-radiating in the sense that the linear components of the Duhamel formulae vanish, thus

\displaystyle  u(t) = - i \int_{t_0}^t e^{i(t-t')\Delta} F(u(t'))\ dt' \ \ \ \ \ (1)


\displaystyle  u(t) = i \int_t^{t_1} e^{i(t-t')\Delta} F(u(t'))\ dt' \ \ \ \ \ (2)

where {t_0, t_1} are the endpoint times of existence. (This type of situation comes up for instance in the Kenig-Merle approach to critical regularity problems, by reducing to a minimal blowup solution which is almost periodic modulo symmetries, and hence non-radiating.) These types of non-radiating solutions are propelled solely by their own nonlinear self-interactions from the immediate past or immediate future; they are generalisations of “nonlinear bound states” such as solitons.

A key task is then to somehow combine the forward representation (1) and the backward representation (2) to obtain new information on {u(t)} itself, that cannot be obtained from either representation alone; it seems that the immediate past and immediate future can collectively exert more control on the present than they each do separately. This type of problem can be abstracted as follows. Let {\|u(t)\|_{Y_+}} be the infimal value of {\|F_+\|_N} over all forward representations of {u(t)} of the form

\displaystyle  u(t) = \int_{t_0}^t e^{i(t-t')\Delta} F_+(t') \ dt' \ \ \ \ \ (3)

where {N} is some suitable spacetime norm (e.g. a Strichartz-type norm), and similarly let {\|u(t)\|_{Y_-}} be the infimal value of {\|F_-\|_N} over all backward representations of {u(t)} of the form

\displaystyle  u(t) = \int_{t}^{t_1} e^{i(t-t')\Delta} F_-(t') \ dt'. \ \ \ \ \ (4)

Typically, one already has (or is willing to assume as a bootstrap hypothesis) control on {F(u)} in the norm {N}, which gives control of {u(t)} in the norms {Y_+, Y_-}. The task is then to use the control of both the {Y_+} and {Y_-} norm of {u(t)} to gain control of {u(t)} in a more conventional Hilbert space norm {X}, which is typically a Sobolev space such as {H^s} or {L^2}.

One can use some classical functional analysis to clarify this situation. By the closed graph theorem, the above task is (morally, at least) equivalent to establishing an a priori bound of the form

\displaystyle  \| u \|_X \lesssim \|u\|_{Y_+} + \|u\|_{Y_-} \ \ \ \ \ (5)

for all reasonable {u} (e.g. test functions). The double Duhamel trick accomplishes this by establishing the stronger estimate

\displaystyle  |\langle u, v \rangle_X| \lesssim \|u\|_{Y_+} \|v\|_{Y_-} \ \ \ \ \ (6)

for all reasonable {u, v}; note that setting {u=v} and applying the arithmetic mean-geometric mean inequality then gives (5). The point is that if {u} has a forward representation (3) and {v} has a backward representation (4), then the inner product {\langle u, v \rangle_X} can (formally, at least) be expanded as a double integral

\displaystyle  \int_{t_0}^t \int_{t}^{t_1} \langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_X\ dt'' dt'.

The dispersive nature of the linear Schrödinger equation often causes {\langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_X} to decay, especially in high dimensions. In high enough dimension (typically one needs five or higher dimensions, unless one already has some spacetime control on the solution), the decay is stronger than {1/|t'-t''|^2}, so that the integrand becomes absolutely integrable and one recovers (6).
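
To spell out the two formal steps used in this argument (a sketch only, under the additional assumption that the propagators {e^{is\Delta}} are unitary on {X}, as is the case for {X = H^s}; the dummy variable {t''} for the backward representation is a notational choice): inserting (3) for {u} and (4) for {v}, and then moving one propagator across the inner product, one has

\displaystyle  \langle u, v \rangle_X = \int_{t_0}^t \int_{t}^{t_1} \langle e^{i(t-t')\Delta} F_+(t'), e^{i(t-t'')\Delta} F_-(t'') \rangle_X\ dt'' dt' = \int_{t_0}^t \int_{t}^{t_1} \langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_X\ dt'' dt'.

Similarly, setting {u=v} in (6) and using the arithmetic mean-geometric mean inequality in the form {ab \leq \frac{1}{4}(a+b)^2} gives

\displaystyle  \|u\|_X^2 \lesssim \|u\|_{Y_+} \|u\|_{Y_-} \leq \frac{1}{4} ( \|u\|_{Y_+} + \|u\|_{Y_-} )^2,

and (5) follows on taking square roots.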

Unfortunately it appears that estimates of the form (6) fail in low dimensions (for the type of norms {N} that actually show up in applications); there is just too much interaction between past and future to hope for any reasonable control of this inner product. But one can try to obtain (5) by other means. By the Hahn-Banach theorem (and ignoring various issues related to reflexivity), (5) is equivalent to the assertion that every {u \in X} can be decomposed as {u = u_+ + u_-}, where {\|u_+\|_{Y_+^*} \lesssim \|u\|_X} and {\|u_-\|_{Y_-^*} \lesssim \|u\|_X}. Indeed once one has such a decomposition, one obtains (5) by computing the inner product of {u} with {u=u_++u_-} in {X} in two different ways. One can also (morally at least) write {\|u_+\|_{Y_+^*}} as {\| e^{i(\cdot-t)\Delta} u_+\|_{N^*([t_0,t])}} and similarly write {\|u_-\|_{Y_-^*}} as {\| e^{i(\cdot-t)\Delta} u_-\|_{N^*([t,t_1])}}.

So one can dualise the task of proving (5) as that of obtaining a decomposition of an arbitrary initial state {u} into two components {u_+} and {u_-}, where the former disperses into the past and the latter disperses into the future under the linear evolution. We do not know how to achieve this type of task efficiently in general – and doing so would likely lead to a significant advance in the subject (perhaps one of the main areas in this topic where serious harmonic analysis is likely to play a major role). But in the model case of spherically symmetric data {u}, one can perform such a decomposition quite easily: one uses microlocal projections to set {u_+} to be the “inward” pointing component of {u}, which propagates towards the origin in the future and away from the origin in the past, and {u_-} to similarly be the “outward” component of {u}. As spherical symmetry significantly dilutes the amplitude of the solution (and hence the strength of the nonlinearity) away from the origin, this decomposition tends to work quite well for applications, and is one of the main reasons (though not the only one) why we have a global theory for low-dimensional nonlinear Schrödinger equations in the radial case, but not in general.

The in/out decomposition is a linear one, but the Hahn-Banach argument gives no reason why the decomposition needs to be linear. (Note that other well-known decompositions in analysis, such as the Fefferman-Stein decomposition of BMO, are necessarily nonlinear, a fact which is ultimately equivalent to the non-complemented nature of a certain subspace of a Banach space; see these lecture notes of mine and this old blog post for some discussion.) So one could imagine a sophisticated nonlinear decomposition as a general substitute for the in/out decomposition. See for instance this paper of Bourgain and Brezis for some of the subtleties of decomposition even in very classical function spaces such as {H^{1/2}(R)}. Alternatively, there may well be a third way to obtain estimates of the form (5) that do not require either decomposition or the double Duhamel trick; such a method may well clarify the relative relationship between past, present, and future for critical nonlinear dispersive equations, which seems to be a key aspect of the theory that is still only partially understood. (In particular, it seems that one needs a fairly strong decoupling of the present from both the past and the future to get the sort of elliptic-like regularity results that allow us to make further progress with such equations.)

One of the most basic theorems in linear algebra is that every finite-dimensional vector space has a finite basis. Let us give a statement of this theorem in the case when the underlying field is the rationals:

Theorem 1 (Finite generation implies finite basis, infinitary version) Let {V} be a vector space over the rationals {{\mathbb Q}}, and let {v_1,\ldots,v_n} be a finite collection of vectors in {V}. Then there exists a collection {w_1,\ldots,w_k} of vectors in {V}, with {0 \leq k \leq n}, such that

  • ({w} generates {v}) Every {v_j} can be expressed as a rational linear combination of the {w_1,\ldots,w_k}.
  • ({w} independent) There is no non-trivial linear relation {a_1 w_1 + \ldots + a_k w_k = 0}, {a_1,\ldots,a_k \in {\mathbb Q}} among the {w_1,\ldots,w_k} (where non-trivial means that the {a_i} are not all zero).

In fact, one can take {w_1,\ldots,w_k} to be a subset of the {v_1,\ldots,v_n}.

Proof: We perform the following “rank reduction argument”. Start with {w_1,\ldots,w_k} initialised to {v_1,\ldots,v_n} (so initially we have {k=n}). Clearly {w} generates {v}. If the {w_i} are linearly independent then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the {w_i}, say {w_k}, is a rational linear combination of the {w_1,\ldots,w_{k-1}}. In such a case, {w_k} becomes redundant, and we may delete it (reducing the rank {k} by one). We repeat this procedure; it can only run for at most {n} steps and so terminates with {w_1,\ldots,w_k} obeying both of the desired properties. \Box
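The rank reduction argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vectors are taken to be tuples of rationals (so {V = {\mathbb Q}^d}), and the helper names `rank` and `rank_reduction` are hypothetical, not taken from any library. Instead of exhibiting the linear relation explicitly, the sketch detects a redundant vector by checking that deleting it does not lower the rank.

```python
from fractions import Fraction


def rank(vectors):
    """Rank over Q of a list of equal-length vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def rank_reduction(vs):
    """Delete redundant vectors one at a time until the rest are independent."""
    ws = list(vs)  # start with w = v, so k = n
    while ws and rank(ws) < len(ws):  # a non-trivial relation exists
        for i in range(len(ws)):
            rest = ws[:i] + ws[i + 1:]
            if rank(rest) == rank(ws):  # ws[i] lies in the span of the others
                del ws[i]  # the rank k drops by one
                break
    return ws


vs = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 2, 0)]
print(rank_reduction(vs))
```

Each pass of the `while` loop deletes exactly one vector, so the loop runs at most {n} times, as in the proof. Note that the sketch returns an empty list when every input vector is zero, a degenerate case that the statement of the theorem does not address.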

In additive combinatorics, one often wants to use results like this in finitary settings, such as that of a cyclic group {{\mathbb Z}/p{\mathbb Z}} where {p} is a large prime. Now, technically speaking, {{\mathbb Z}/p{\mathbb Z}} is not a vector space over {{\mathbb Q}}, because one can only multiply an element of {{\mathbb Z}/p{\mathbb Z}} by a rational number if the denominator of that rational is not divisible by {p}. But for {p} very large, {{\mathbb Z}/p{\mathbb Z}} “behaves” like a vector space over {{\mathbb Q}}, at least if one restricts attention to the rationals of “bounded height” – where the numerator and denominator of the rationals are bounded. Thus we shall refer to elements of {{\mathbb Z}/p{\mathbb Z}} as “vectors” over {{\mathbb Q}}, even though strictly speaking this is not quite the case.

On the other hand, saying that one element of {{\mathbb Z}/p{\mathbb Z}} is a rational linear combination of another set of elements is not a very interesting statement: any non-zero element of {{\mathbb Z}/p{\mathbb Z}} already generates the entire space! However, if one again restricts attention to rational linear combinations of bounded height, then things become interesting again. For instance, the vector {1} can generate elements such as {37} or {\frac{p-1}{2}} using rational linear combinations of bounded height, but will not be able to generate such elements of {{\mathbb Z}/p{\mathbb Z}} as {\lfloor\sqrt{p}\rfloor} without using rational numbers of unbounded height.

For similar reasons, the notion of linear independence over the rationals doesn’t initially look very interesting over {{\mathbb Z}/p{\mathbb Z}}: any two non-zero elements of {{\mathbb Z}/p{\mathbb Z}} are of course rationally dependent. But again, if one restricts attention to rational numbers of bounded height, then independence begins to emerge: for instance, {1} and {\lfloor\sqrt{p}\rfloor} are independent in this sense.
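To make the notion of bounded height concrete, here is a small brute-force experiment in Python. It is a sketch under assumptions of my choosing: the prime {p = 10007} is an arbitrary illustrative value, the height of a rational {a/b} is taken to be {\max(|a|,|b|)}, and `min_height` is a hypothetical helper. It searches for the least height {M} at which a target element is a rational multiple {a/b} of a generator in {{\mathbb Z}/p{\mathbb Z}}, that is, {b \cdot \hbox{target} = a \cdot \hbox{generator}} mod {p}.

```python
from math import isqrt


def min_height(target, generator, p, search_limit):
    """Least M <= search_limit such that b*target = a*generator (mod p)
    for some 1 <= b <= M and |a| <= M; None if no such M is found."""
    for M in range(1, search_limit + 1):
        for b in range(1, M + 1):
            for a in range(-M, M + 1):
                if (b * target - a * generator) % p == 0:
                    return M
    return None


p = 10007  # an illustrative prime, not from the text
print(min_height(37, 1, p, 200))            # 37 = (37/1) * 1
print(min_height((p - 1) // 2, 1, p, 200))  # (p-1)/2 = (-1/2) * 1, so height 2
print(min_height(isqrt(p), 1, p, 200))      # height 100 here, of size about sqrt(p)
```

The first two heights do not depend on {p}, whereas the third grows like {\sqrt{p}}; this is the sense in which {1} fails to generate {\lfloor\sqrt{p}\rfloor} at bounded height, and in which the two elements are independent at bounded height.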

Thus, it becomes natural to ask whether there is a “quantitative” analogue of Theorem 1, with non-trivial content in the case of “vector spaces over the bounded height rationals” such as {{\mathbb Z}/p{\mathbb Z}}, which asserts that given any bounded collection {v_1,\ldots,v_n} of elements, one can find another set {w_1,\ldots,w_k} which is linearly independent “over the rationals up to some height”, such that the {v_1,\ldots,v_n} can be generated by the {w_1,\ldots,w_k} “over the rationals up to some height”. Of course to make this rigorous, one needs to quantify the two heights here, the one giving the independence, and the one giving the generation. In order to be useful for applications, it turns out that one often needs the former height to be much larger than the latter; exponentially larger, for instance, is not an uncommon request. Fortunately, one can accomplish this, at the cost of making the height somewhat large:

Theorem 2 (Finite generation implies finite basis, finitary version) Let {n \geq 1} be an integer, and let {F: {\mathbb N} \rightarrow {\mathbb N}} be a function. Let {V} be an abelian group which admits a well-defined division operation by any natural number of size at most {C(F,n)} for some constant {C(F,n)} depending only on {F,n}; for instance one can take {V = {\mathbb Z}/p{\mathbb Z}} for {p} a prime larger than {C(F,n)}. Let {v_1,\ldots,v_n} be a finite collection of “vectors” in {V}. Then there exists a collection {w_1,\ldots,w_k} of vectors in {V}, with {1 \leq k \leq n}, as well as an integer {M \geq 1}, such that

  • (Complexity bound) {M \leq C(F,n)} for some {C(F,n)} depending only on {F, n}.
  • ({w} generates {v}) Every {v_j} can be expressed as a rational linear combination of the {w_1,\ldots,w_k} of height at most {M} (i.e. the numerator and denominator of the coefficients are at most {M}).
  • ({w} independent) There is no non-trivial linear relation {a_1 w_1 + \ldots + a_k w_k = 0} among the {w_1,\ldots,w_k} in which the {a_1,\ldots,a_k} are rational numbers of height at most {F(M)}.

In fact, one can take {w_1,\ldots,w_k} to be a subset of the {v_1,\ldots,v_n}.

Proof: We perform the same “rank reduction argument” as before, but translated to the finitary setting. Start with {w_1,\ldots,w_k} initialised to {v_1,\ldots,v_n} (so initially we have {k=n}), and initialise {M=1}. Clearly {w} generates {v} at this height. If the {w_i} are linearly independent up to rationals of height {F(M)} then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the {w_i}, say {w_k}, is a rational linear combination of the {w_1,\ldots,w_{k-1}}, whose height is bounded by some function depending on {F(M)} and {k}. In such a case, {w_k} becomes redundant, and we may delete it (reducing the rank {k} by one), but note that in order for the remaining {w_1,\ldots,w_{k-1}} to generate {v_1,\ldots,v_n} we need to raise the height upper bound for the rationals involved from {M} to some quantity {M'} depending on {M, F(M), k}. We then replace {M} by {M'} and continue the process. We repeat this procedure; it can only run for at most {n} steps and so terminates with {w_1,\ldots,w_k} and {M} obeying all of the desired properties. (Note that the bound on {M} is quite poor, being essentially an {n}-fold iteration of {F}! Thus, for instance, if {F} is exponential, then the bound on {M} is tower-exponential in nature.) \Box
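The finitary argument can also be run mechanically on small examples. The Python sketch below is a brute-force illustration only, under assumptions that are not in the text: it works in {{\mathbb Z}/p{\mathbb Z}} for an arbitrarily chosen prime, takes the height of {a/b} to be {\max(|a|,|b|)}, and all function names are hypothetical. Rather than bounding {M'} in advance, it tracks the exact rational coefficients expressing each {v_j} in terms of the surviving {w_i}, and sets {M} to the largest height that actually occurs.

```python
from fractions import Fraction
from itertools import product


def height(q):
    return max(abs(q.numerator), q.denominator)


def rationals_up_to_height(H):
    """All rationals a/b with |a| <= H and 1 <= b <= H, without repeats."""
    return sorted({Fraction(a, b) for a in range(-H, H + 1) for b in range(1, H + 1)})


def to_mod_p(q, p):
    """Read the rational q as an element of Z/pZ (needs denominator < p)."""
    return q.numerator * pow(q.denominator, -1, p) % p


def find_relation(ws, H, p):
    """Search for a non-trivial relation sum a_i w_i = 0 (mod p), heights <= H."""
    for coeffs in product(rationals_up_to_height(H), repeat=len(ws)):
        if any(coeffs) and sum(to_mod_p(a, p) * w for a, w in zip(coeffs, ws)) % p == 0:
            return coeffs
    return None


def finitary_rank_reduction(vs, F, p):
    idx = list(range(len(vs)))  # indices of the surviving w's
    # expr[j][i] is the coefficient of vs[i] in the current formula for vs[j]
    expr = [{i: Fraction(int(i == j)) for i in idx} for j in range(len(vs))]
    M = 1
    while idx:
        if F(M) >= p:
            raise ValueError("height F(M) is too large for this prime")
        rel = find_relation([vs[i] for i in idx], F(M), p)
        if rel is None:  # independent up to height F(M): done
            break
        a = dict(zip(idx, rel))
        drop = max(i for i in idx if a[i] != 0)  # this w is redundant
        idx.remove(drop)
        for e in expr:  # substitute w_drop = -sum (a_i / a_drop) w_i
            c = e.pop(drop)
            for i in idx:
                e[i] += c * (-a[i] / a[drop])
        M = max([1] + [height(c) for e in expr for c in e.values()])
    return [vs[i] for i in idx], M


ws, M = finitary_rank_reduction([1, 2, 100], lambda M: 2 * M, 10007)
print(ws, M)
```

In this example the relation {2 \cdot 1 - 2 = 0} removes the vector {2} and raises {M} to {2}, after which {1} and {100} admit no relation of height at most {F(2) = 4}. The search is exponential in the number of vectors, so this is only usable for toy inputs, and it implicitly assumes that the tracked coefficients never acquire a denominator divisible by {p}.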

(A variant of this type of approximate basis lemma was used in my paper with Van Vu on the singularity probability of random Bernoulli matrices.)

Looking at the statements and proofs of these two theorems it is clear that the two results are in some sense the “same” result, except that the latter has been made sufficiently quantitative that it is meaningful in such finitary settings as {{\mathbb Z}/p{\mathbb Z}}. In this note I will show how this equivalence can be made formal using the language of non-standard analysis. This is not a particularly deep (or new) observation, but it is perhaps the simplest example I know of that illustrates how nonstandard analysis can be used to transfer a quantifier-heavy finitary statement, such as Theorem 2, into a quantifier-light infinitary statement, such as Theorem 1, thus lessening the need to perform “epsilon management” duties, such as keeping track of unspecified growth functions such as {F}. This type of transference is discussed at length in this previous blog post of mine.

In this particular case, the amount of effort needed to set up the nonstandard machinery in order to deduce Theorem 2 from Theorem 1 is too great for this transference to be particularly worthwhile, especially given that Theorem 2 has such a short proof. However, when performing a particularly intricate argument in additive combinatorics, in which one is performing a number of “rank reduction arguments”, “energy increment arguments”, “regularity lemmas”, “structure theorems”, and so forth, the purely finitary approach can become bogged down with all the epsilon management one needs to do to organise all the parameters that are flying around. The nonstandard approach can efficiently hide a large number of these parameters from view, and it can then become worthwhile to invest in the nonstandard framework in order to clean up the rest of a lengthy argument. Furthermore, an advantage of moving up to the infinitary setting is that one can then deploy all the firepower of an existing well-developed infinitary theory of mathematics (in this particular case, this would be the theory of linear algebra) out of the box, whereas in the finitary setting one would have to painstakingly finitise each aspect of such a theory that one wished to use (imagine for instance trying to finitise the rank-nullity theorem for rationals of bounded height).

The nonstandard approach is very closely related to the use of compactness arguments, or of the technique of taking ultralimits and ultraproducts; indeed we will use an ultrafilter in order to create the nonstandard model in the first place.

I will also discuss two variants of both Theorem 1 and Theorem 2 which have actually shown up in my research. The first is that of the regularity lemma for polynomials over finite fields, which came up when studying the equidistribution of such polynomials (in this paper with Ben Green). The second comes up when one is dealing not with a single finite collection {v_1,\ldots,v_n} of vectors, but rather with a family {(v_{h,1},\ldots,v_{h,n})_{h \in H}} of such vectors, where {h} ranges over a large set {H}; this gives rise to what we call the sunflower lemma, and came up in this recent paper of myself, Ben Green, and Tamar Ziegler.

This post is mostly concerned with nonstandard translations of the “rank reduction argument”. Nonstandard translations of the “energy increment argument” and “density increment argument” were briefly discussed in this recent post; I may return to this topic in more detail in a future post.


Next month, I am scheduled to give a short speech (three to five minutes in length) at the annual induction ceremony of the American Academy of Arts and Sciences in Boston.  This is a bit different from the usual scientific talks that I am used to giving; there are no projectors, blackboards, or other visual aids available, and the audience of Academy members is split evenly between the humanities and the sciences (as well as people in industry and politics), so this will be an interesting new experience for me.  (The last time I gave a speech was in 1985.)

My chosen topic is the future impact of internet-based technologies on academia (somewhat similar in theme to my recent talk on this topic).  I have a draft text below the fold, though it is currently too long and my actual speech is likely to be a significantly abridged version of the one below.  [Update, Oct 12: The abridged speech is now at the bottom of the post.]  In the spirit of the theme of the talk, I would of course welcome any comments and suggestions.

For comparison, the talks from last year’s ceremony, by Jim Simons, Peter Kim, Susan Athey, Earl Lewis, and Indra Nooyi, can be found here.  Jim’s chosen topic, incidentally, was what mathematics is, and why mathematicians do it.

[Update, Nov 3: Video of the various talks by myself and the other speakers (Emmylou Harris, James Earl Jones, Elizabeth Nabel, Ronald Marc George, and Edward Villela) is now available on the Academy web site here.]


In the discussion on what mathematicians need to know about blogging mentioned in the previous post, it was noted that there didn’t seem to be a single location on the internet to find out about mathematical blogs.  Actually, there is a page, but it has been relatively obscure – the Mathematics/Statistics subpage of the Academic Blogs wiki.  It does seem like a good idea to have a reasonably comprehensive page containing all the academic mathematics blogs that are out there (as well as links to other relevant sites), so I put my own maths blogroll onto the page, and encourage others to do so also (though you may wish to read the FAQ for the wiki first).

It may also be useful to organise the list into sublists, and to add more commentary on each individual blog.  (In theory, each blog is supposed to have its own sub-page, though in practice it seems that very few blogs do at present.)

