
It has been a little over two weeks now since the protest site at thecostofknowledge.com was set up to register declarations of non-cooperation with Reed Elsevier in protest of their research publishing practices, inspired by this blog post of Tim Gowers.   Awareness of the protest has certainly grown in these two weeks; the number of signatories is now well over four thousand, across a broad array of academic disciplines, and the protest has been covered by many blogs and also the mainstream media (e.g. the Guardian, the Economist, Forbes, etc.), and even by Elsevier stock analysts.    (Elsevier itself released an open letter responding to the protest here.)  My interpretation of events is that there was a significant amount of latent or otherwise undisclosed dissatisfaction already with the publishing practices of Elsevier (and, to a lesser extent, some other commercial academic publishers), and a desire to support alternatives such as university or society publishers, and the more recent open access journals; and that this protest (and parallel protests, such as the movement to oppose the Research Works Act) served to drive these feelings out into the open.

The statement of the protest itself, though, is rather brief, reflecting the improvised manner in which the site was created.  A group of mathematicians including myself therefore decided to write and sign a more detailed explanation of why we supported this protest, giving more background and references to support our position.   The 34 signatories are Scott Aaronson, Douglas N. Arnold, Artur Avila, John Baez, Folkmar Bornemann, Danny Calegari, Henry Cohn, Ingrid Daubechies, Jordan Ellenberg, Matthew Emerton, Marie Farge, David Gabai, Timothy Gowers, Ben Green, Martin Grötschel, Michael Harris, Frédéric Hélein, Rob Kirby, Vincent Lafforgue, Gregory F. Lawler, Randall J. LeVeque, László Lovász, Peter J. Olver, Olof Sisask, Richard Taylor, Bernard Teissier, Burt Totaro, Lloyd N. Trefethen, Takashi Tsuboi, Marie-France Vignéras, Wendelin Werner, Amie Wilkinson, Günter M. Ziegler, and myself.  (Note that while Daubechies is the current president of the International Mathematical Union, Lovász is a past president, and Grötschel is the current secretary, they are signing this letter as individuals and not as representatives of the IMU. Similarly for Trefethen and Arnold (current and past president of SIAM).)

Of course, the 34 of us do not presume to speak for the remaining four thousand signatories to the protest, but I hope that our statement is somewhat representative of the position of many of its supporters.

Further discussion of this statement can be found at this blog post of Tim Gowers.

EDIT: I think it is appropriate to quote the following excerpt from our statement:

All mathematicians must decide for themselves whether, or to what extent, they wish to participate in the boycott. Senior mathematicians who have signed the boycott bear some responsibility towards junior colleagues who are forgoing the option of publishing in Elsevier journals, and should do their best to help minimize any negative career consequences.

Whether or not you decide to join the boycott, there are some simple actions that everyone can take, which seem to us to be uncontroversial:

  1. Make sure that the final versions of all your papers, particularly new ones, are freely available online, ideally both on the arXiv and on your home page.
  2.  If you are submitting a paper and there is a choice between an expensive journal and a cheap (or free) journal of the same standard, then always submit to the cheap one.

A few days ago, inspired by this recent post of Tim Gowers, a web page entitled “the cost of knowledge” was set up as a location for mathematicians and other academics to declare a protest against the academic publishing practices of Reed Elsevier, in particular with regard to their exceptionally high journal prices, their policy of “bundling” journals together so that libraries are forced to purchase subscriptions to large numbers of low-quality journals in order to gain access to a handful of high-quality journals, and their opposition to the open access movement (as manifested, for instance, in their lobbying in support of legislation such as the Stop Online Piracy Act (SOPA) and the Research Works Act (RWA)).   [These practices have been documented in a number of places; this wiki page, which was set up in response to Tim's post, collects several relevant links for this purpose.  Some of the other commercial publishers have exhibited similar behaviour, though usually not to the extent that Elsevier has, which is why this particular publisher is the focus of this protest.]  At the protest site, one can publicly declare a refusal to publish in an Elsevier journal, to referee for an Elsevier journal, or to join the editorial board of an Elsevier journal.

(In the past, the editorial boards of several Elsevier journals have resigned over the pricing policies of those journals, most famously the board of Topology in 2006, but also the Journal of Algorithms in 2003, and a number of journals in other sciences as well.  Several libraries, such as those of Harvard and Cornell, have also managed to negotiate an unbundling of Elsevier journals, but most libraries are still unable to subscribe to such journals individually.)

For a more thorough discussion as to why such a protest is warranted, please see Tim’s post on the matter (and the 100+ comments to that post).   Many of the issues regarding Elsevier were already known to some extent to many mathematicians (particularly those who have served on departmental library committees), several of whom had already privately made the decision to boycott Elsevier; but nevertheless it is important to bring these issues out into the open, to make them commonly known as opposed to merely mutually known.  (Amusingly, this distinction is also of crucial importance in my favorite logic puzzle, but that’s another story.)   One can also see Elsevier’s side of the story in this response to Tim’s post by David Clark (the Senior Vice President for Physical Sciences at Elsevier).

For my own part, though I have sent about 9% of my papers in the past to Elsevier journals (with one or two still in press), I have now elected not to submit any further papers to these journals, nor to serve on their editorial boards, though I will continue refereeing some papers from these journals.  At the time of writing, over five hundred mathematicians and other academics have also signed on to the protest in the four days that the site has been active.

Admittedly, I am fortunate enough to be at a stage of my career in which I am not pressured to publish in a very specific set of journals, and as such, I am not making a recommendation as to what anyone else should do or not do regarding this protest.  However, I do feel that it is worth spreading awareness, at least, of the fact that such protests exist (and some additional petitions on related issues can be found at the previously mentioned wiki page).

A friend of mine recently asked me for some suggestions for games or other activities for children that would help promote quantitative reasoning or mathematical skills, while remaining fun to play (i.e. more than just homework-type questions poorly disguised in game form).    The initial question was focused on computer games (and specifically, on iPhone apps), but I think the broader question would also be of interest.

I myself have not seriously played these sorts of games for years, so I could only come up with a few examples immediately: the game “Planarity”, and the game “Factory Balls” (and two sequels).   (Edit: Rubik’s cube and its countless cousins presumably qualify also, due to their implicit use of group theory.)  I am hopeful though that readers may be able to come up with more suggestions.

There is of course no shortage of “educational” games, computer-based or otherwise, available, but I think what I (and my friend) would be looking for here are games with production values comparable to other, less educational games, and for which the need for mathematical thinking arises naturally in the gameplay rather than being artificially inserted by fiat (e.g. “solve this equation to proceed”).  (Here I interpret “mathematical thinking” loosely, to include not just numerical or algebraic thinking, but also geometric, abstract, logical, probabilistic, etc.)

[Question for MathOverflow experts: would this type of question be suitable for crossposting there?   The requirement that such questions be "research-level" seems to suggest not.]

It’s been a while since I’ve added to my career advice and writing pages on this blog, but I recently took the time to write up another such page on a topic I had not previously covered, entitled “Write in your own voice“.  The main point here is that while every piece of mathematical research inevitably builds upon the previous literature, one should not mimic the style and text of that literature slavishly, but instead develop one’s own individual style, while also updating and adapting the results and insights of previous authors.

This is a technical post inspired by separate conversations with Jim Colliander and with Soonsik Kwon on the relationship between two techniques used to control non-radiating solutions to dispersive nonlinear equations, namely the “double Duhamel trick” and the “in/out decomposition”. See for instance these lecture notes of Killip and Visan for a survey of these two techniques and other related methods in the subject. (I should caution that this post is likely to be unintelligible to anyone not already working in this area.)

For sake of discussion we shall focus on solutions to a nonlinear Schrödinger equation

\displaystyle  iu_t + \Delta u = F(u)

and we will not concern ourselves with the specific regularity of the solution {u}, or the specific properties of the nonlinearity {F} here. We will also not address the issue of how to justify the formal computations being performed here.

Solutions to this equation enjoy the forward Duhamel formula

\displaystyle  u(t) = e^{i(t-t_0)\Delta} u(t_0) - i \int_{t_0}^t e^{i(t-t')\Delta} F(u(t'))\ dt'

for times {t} to the future of {t_0} in the lifespan of the solution, as well as the backward Duhamel formula

\displaystyle  u(t) = e^{i(t-t_1)\Delta} u(t_1) + i \int_t^{t_1} e^{i(t-t')\Delta} F(u(t'))\ dt'

for all times {t} to the past of {t_1} in the lifespan of the solution. The first formula asserts that the solution at a given time is determined by the initial state and by the immediate past, while the second formula is the time reversal of the first, asserting that the solution at a given time is determined by the final state and the immediate future. These basic causal formulae are the foundation of the local theory of these equations, and in particular play an instrumental role in establishing local well-posedness for these equations. In this local theory, the main philosophy is to treat the homogeneous (or linear) term {e^{i(t-t_0)\Delta} u(t_0)} or {e^{i(t-t_1)\Delta} u(t_1)} as the main term, and the inhomogeneous (or nonlinear, or forcing) integral term as an error term.
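(As a quick formal sanity check, and ignoring all regularity issues as announced above, the forward Duhamel formula can be derived by setting {v(t) := e^{-i(t-t_0)\Delta} u(t)} and using the equation to compute

\displaystyle  \partial_t v(t) = e^{-i(t-t_0)\Delta} ( u_t(t) - i \Delta u(t) ) = -i e^{-i(t-t_0)\Delta} F(u(t));

integrating this from {t_0} to {t} and then applying {e^{i(t-t_0)\Delta}} to both sides recovers the forward formula, and the backward formula follows in the same way after reversing the direction of time.)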

The situation is reversed when one turns to the global theory, and looks at the asymptotic behaviour of a solution as one approaches a limiting time {T} (which can be infinite if one has global existence, or finite if one has finite time blowup). After a suitable rescaling, the linear portion of the solution often disappears from view, leaving one with an asymptotic blowup profile solution which is non-radiating in the sense that the linear components of the Duhamel formulae vanish, thus

\displaystyle  u(t) = - i \int_{t_0}^t e^{i(t-t')\Delta} F(u(t'))\ dt' \ \ \ \ \ (1)

and

\displaystyle  u(t) = i \int_t^{t_1} e^{i(t-t')\Delta} F(u(t'))\ dt' \ \ \ \ \ (2)

where {t_0, t_1} are the endpoint times of existence. (This type of situation comes up for instance in the Kenig-Merle approach to critical regularity problems, by reducing to a minimal blowup solution which is almost periodic modulo symmetries, and hence non-radiating.) These types of non-radiating solutions are propelled solely by their own nonlinear self-interactions from the immediate past or immediate future; they are generalisations of “nonlinear bound states” such as solitons.

A key task is then to somehow combine the forward representation (1) and the backward representation (2) to obtain new information on {u(t)} itself, that cannot be obtained from either representation alone; it seems that the immediate past and immediate future can collectively exert more control on the present than they each do separately. This type of problem can be abstracted as follows. Let {\|u(t)\|_{Y_+}} be the infimal value of {\|F_+\|_N} over all forward representations of {u(t)} of the form

\displaystyle  u(t) = \int_{t_0}^t e^{i(t-t')\Delta} F_+(t') \ dt' \ \ \ \ \ (3)

where {N} is some suitable spacetime norm (e.g. a Strichartz-type norm), and similarly let {\|u(t)\|_{Y_-}} be the infimal value of {\|F_-\|_N} over all backward representations of {u(t)} of the form

\displaystyle  u(t) = \int_{t}^{t_1} e^{i(t-t')\Delta} F_-(t') \ dt'. \ \ \ \ \ (4)

Typically, one already has (or is willing to assume as a bootstrap hypothesis) control on {F(u)} in the norm {N}, which gives control of {u(t)} in the norms {Y_+, Y_-}. The task is then to use the control of both the {Y_+} and {Y_-} norm of {u(t)} to gain control of {u(t)} in a more conventional Hilbert space norm {X}, which is typically a Sobolev space such as {H^s} or {L^2}.

One can use some classical functional analysis to clarify this situation. By the closed graph theorem, the above task is (morally, at least) equivalent to establishing an a priori bound of the form

\displaystyle  \| u \|_X \lesssim \|u\|_{Y_+} + \|u\|_{Y_-} \ \ \ \ \ (5)

for all reasonable {u} (e.g. test functions). The double Duhamel trick accomplishes this by establishing the stronger estimate

\displaystyle  |\langle u, v \rangle_X| \lesssim \|u\|_{Y_+} \|v\|_{Y_-} \ \ \ \ \ (6)

for all reasonable {u, v}; note that setting {u=v} and applying the arithmetic-geometric inequality then gives (5). The point is that if {u} has a forward representation (3) and {v} has a backward representation (4), then the inner product {\langle u, v \rangle_X} can (formally, at least) be expanded as a double integral

\displaystyle  \int_{t_0}^t \int_{t}^{t_1} \langle e^{i(t-t')\Delta} F_+(t'), e^{i(t-t'')\Delta} F_-(t'') \rangle_X\ dt'' dt'.

The dispersive nature of the linear Schrödinger equation often causes {\langle e^{i(t-t')\Delta} F_+(t'), e^{i(t-t'')\Delta} F_-(t'') \rangle_X} to decay, especially in high dimensions. In high enough dimension (typically one needs five or higher dimensions, unless one already has some spacetime control on the solution), the decay is stronger than {1/|t'-t''|^2}, so that the integrand becomes absolutely integrable and one recovers (6).
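(To see heuristically where the dimensional restriction comes from, suppose for sake of illustration that {X = L^2_x}; this is only a caricature, as in applications {X} is typically a Sobolev norm and {N} a Strichartz-type norm, but it conveys the main point. By unitarity of the free propagator one can formally write {\langle e^{i(t-t')\Delta} F_+(t'), e^{i(t-t'')\Delta} F_-(t'') \rangle_{L^2_x} = \langle e^{i(t''-t')\Delta} F_+(t'), F_-(t'') \rangle_{L^2_x}}, and the standard dispersive estimate {\| e^{is\Delta} f \|_{L^\infty_x} \lesssim |s|^{-d/2} \|f\|_{L^1_x}} in {d} spatial dimensions then suggests decay of the order of {|t'-t''|^{-d/2}}, assuming suitable {L^1_x}-type control on {F_\pm}. The exponent {d/2} exceeds {2} precisely when {d \geq 5}, which matches the dimensional threshold mentioned above; the contribution near the diagonal {t' \approx t''} has to be handled separately, using the assumed spacetime control on {F_\pm} in {N}.)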

Unfortunately it appears that estimates of the form (6) fail in low dimensions (for the type of norms {N} that actually show up in applications); there is just too much interaction between past and future to hope for any reasonable control of this inner product. But one can try to obtain (5) by other means. By the Hahn-Banach theorem (and ignoring various issues related to reflexivity), (5) is equivalent to the assertion that every {u \in X} can be decomposed as {u = u_+ + u_-}, where {\|u_+\|_{Y_+^*} \lesssim \|u\|_X} and {\|u_-\|_{Y_-^*} \lesssim \|u\|_X}. Indeed once one has such a decomposition, one obtains (5) by computing the inner product of {u} with {u=u_++u_-} in {X} in two different ways. One can also (morally at least) write {\|u_+\|_{Y_+^*}} as {\| e^{i(\cdot-t)\Delta} u_+\|_{N^*([t_0,t])}} and similarly write {\|u_-\|_{Y_-^*}} as {\| e^{i(\cdot-t)\Delta} u_-\|_{N^*([t,t_1])}}.
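(To spell out the second step slightly: given such a decomposition, one has (formally)

\displaystyle  \|u\|_X^2 = \langle u, u_+ \rangle_X + \langle u, u_- \rangle_X \leq \|u\|_{Y_+} \|u_+\|_{Y_+^*} + \|u\|_{Y_-} \|u_-\|_{Y_-^*} \lesssim ( \|u\|_{Y_+} + \|u\|_{Y_-} ) \|u\|_X,

where the duality is taken with respect to the {X} inner product; dividing by {\|u\|_X} (which we may take to be non-zero) then gives (5).)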

So one can dualise the task of proving (5) as that of obtaining a decomposition of an arbitrary initial state {u} into two components {u_+} and {u_-}, where the former disperses into the past and the latter disperses into the future under the linear evolution. We do not know how to achieve this type of task efficiently in general – and doing so would likely lead to a significant advance in the subject (perhaps one of the main areas in this topic where serious harmonic analysis is likely to play a major role). But in the model case of spherically symmetric data {u}, one can perform such a decomposition quite easily: one uses microlocal projections to set {u_+} to be the “inward” pointing component of {u}, which propagates towards the origin in the future and away from the origin in the past, and {u_-} to similarly be the “outward” component of {u}. As spherical symmetry significantly dilutes the amplitude of the solution (and hence the strength of the nonlinearity) away from the origin, this decomposition tends to work quite well for applications, and is one of the main reasons (though not the only one) why we have a global theory for low-dimensional nonlinear Schrödinger equations in the radial case, but not in general.

The in/out decomposition is a linear one, but the Hahn-Banach argument gives no reason why the decomposition needs to be linear. (Note that other well-known decompositions in analysis, such as the Fefferman-Stein decomposition of BMO, are necessarily nonlinear, a fact which is ultimately equivalent to the non-complemented nature of a certain subspace of a Banach space; see these lecture notes of mine and this old blog post for some discussion.) So one could imagine a sophisticated nonlinear decomposition as a general substitute for the in/out decomposition. See for instance this paper of Bourgain and Brezis for some of the subtleties of decomposition even in very classical function spaces such as {H^{1/2}(R)}. Alternatively, there may well be a third way to obtain estimates of the form (5) that do not require either decomposition or the double Duhamel trick; such a method may well clarify the relative relationship between past, present, and future for critical nonlinear dispersive equations, which seems to be a key aspect of the theory that is still only partially understood. (In particular, it seems that one needs a fairly strong decoupling of the present from both the past and the future to get the sort of elliptic-like regularity results that allow us to make further progress with such equations.)

One of the most basic theorems in linear algebra is that every finite-dimensional vector space has a finite basis. Let us give a statement of this theorem in the case when the underlying field is the rationals:

Theorem 1 (Finite generation implies finite basis, infinitary version) Let {V} be a vector space over the rationals {{\mathbb Q}}, and let {v_1,\ldots,v_n} be a finite collection of vectors in {V}. Then there exists a collection {w_1,\ldots,w_k} of vectors in {V}, with {1 \leq k \leq n}, such that

  • ({w} generates {v}) Every {v_j} can be expressed as a rational linear combination of the {w_1,\ldots,w_k}.
  • ({w} independent) There is no non-trivial linear relation {a_1 w_1 + \ldots + a_k w_k = 0}, {a_1,\ldots,a_k \in {\mathbb Q}} among the {w_1,\ldots,w_k} (where non-trivial means that the {a_i} are not all zero).

In fact, one can take {w_1,\ldots,w_k} to be a subset of the {v_1,\ldots,v_n}.

Proof: We perform the following “rank reduction argument”. Start with {w_1,\ldots,w_k} initialised to {v_1,\ldots,v_n} (so initially we have {k=n}). Clearly {w} generates {v}. If the {w_i} are linearly independent then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the {w_i}, say {w_k}, is a rational linear combination of the {w_1,\ldots,w_{k-1}}. In such a case, {w_k} becomes redundant, and we may delete it (reducing the rank {k} by one). We repeat this procedure; it can only run for at most {n} steps and so terminates with {w_1,\ldots,w_k} obeying both of the desired properties. \Box
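The rank reduction argument above is completely algorithmic. For readers who like to see such things in executable form, here is a minimal sketch (not from the original discussion; the function name and interface are invented for illustration) that extracts an independent generating subset of some given rational vectors using exact arithmetic. It runs the argument in the equivalent “greedy” direction, keeping a vector only when it is not already a rational combination of the vectors kept so far, rather than deleting redundant vectors from the full list; the end result is the same kind of subset {w_1,\ldots,w_k} as in Theorem 1.

```python
from fractions import Fraction

def independent_subset(vectors):
    """Return a sublist of `vectors` (rational vectors given as lists of
    ints/Fractions) that is linearly independent over Q and generates all
    of the input vectors by rational linear combinations."""
    echelon = []   # pairs (row, pivot): reduced rows spanning the kept vectors
    kept = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        # reduce w against the rows found so far (exact Gaussian elimination)
        for row, pivot in echelon:
            if w[pivot] != 0:
                c = w[pivot] / row[pivot]
                w = [wi - c * ri for wi, ri in zip(w, row)]
        if any(x != 0 for x in w):
            # w is not a rational combination of the kept vectors: keep it
            pivot = next(i for i, x in enumerate(w) if x != 0)
            echelon.append((w, pivot))
            kept.append(list(v))
        # otherwise v is redundant and is discarded (the "rank reduction" step)
    return kept

# Example in Q^2: the relation v_3 = v_1 + v_2 makes the third vector redundant.
print(independent_subset([[1, 0], [0, 1], [1, 1]]))   # [[1, 0], [0, 1]]
```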

In additive combinatorics, one often wants to use results like this in finitary settings, such as that of a cyclic group {{\mathbb Z}/p{\mathbb Z}} where {p} is a large prime. Now, technically speaking, {{\mathbb Z}/p{\mathbb Z}} is not a vector space over {{\mathbb Q}}, because one can only multiply an element of {{\mathbb Z}/p{\mathbb Z}} by a rational number if {p} does not divide the denominator of that rational. But for {p} very large, {{\mathbb Z}/p{\mathbb Z}} “behaves” like a vector space over {{\mathbb Q}}, at least if one restricts attention to the rationals of “bounded height” – where the numerator and denominator of the rationals are bounded. Thus we shall refer to elements of {{\mathbb Z}/p{\mathbb Z}} as “vectors” over {{\mathbb Q}}, even though strictly speaking this is not quite the case.

On the other hand, saying that one element of {{\mathbb Z}/p{\mathbb Z}} is a rational linear combination of another set of elements is not a very interesting statement: any non-zero element of {{\mathbb Z}/p{\mathbb Z}} already generates the entire space! However, if one again restricts attention to rational linear combinations of bounded height, then things become interesting again. For instance, the vector {1} can generate elements such as {37} or {\frac{p-1}{2}} using rational linear combinations of bounded height, but will not be able to generate such elements of {{\mathbb Z}/p{\mathbb Z}} as {\lfloor\sqrt{p}\rfloor} without using rational numbers of unbounded height.

For similar reasons, the notion of linear independence over the rationals doesn’t initially look very interesting over {{\mathbb Z}/p{\mathbb Z}}: any two non-zero elements of {{\mathbb Z}/p{\mathbb Z}} are of course rationally dependent. But again, if one restricts attention to rational numbers of bounded height, then independence begins to emerge: for instance, {1} and {\lfloor\sqrt{p}\rfloor} are independent in this sense.

Thus, it becomes natural to ask whether there is a “quantitative” analogue of Theorem 1, with non-trivial content in the case of “vector spaces over the bounded height rationals” such as {{\mathbb Z}/p{\mathbb Z}}, which asserts that given any bounded collection {v_1,\ldots,v_n} of elements, one can find another set {w_1,\ldots,w_k} which is linearly independent “over the rationals up to some height”, such that the {v_1,\ldots,v_n} can be generated by the {w_1,\ldots,w_k} “over the rationals up to some height”. Of course to make this rigorous, one needs to quantify the two heights here, the one giving the independence, and the one giving the generation. In order to be useful for applications, it turns out that one often needs the former height to be much larger than the latter; exponentially larger, for instance, is not an uncommon request. Fortunately, one can accomplish this, at the cost of making the height somewhat large:

Theorem 2 (Finite generation implies finite basis, finitary version) Let {n \geq 1} be an integer, and let {F: {\mathbb N} \rightarrow {\mathbb N}} be a function. Let {V} be an abelian group which admits a well-defined division operation by any natural number of size at most {C(F,n)} for some constant {C(F,n)} depending only on {F,n}; for instance one can take {V = {\mathbb Z}/p{\mathbb Z}} for {p} a prime larger than {C(F,n)}. Let {v_1,\ldots,v_n} be a finite collection of “vectors” in {V}. Then there exists a collection {w_1,\ldots,w_k} of vectors in {V}, with {1 \leq k \leq n}, as well as an integer {M \geq 1}, such that

  • (Complexity bound) {M \leq C(F,n)} for some {C(F,n)} depending only on {F, n}.
  • ({w} generates {v}) Every {v_j} can be expressed as a rational linear combination of the {w_1,\ldots,w_k} of height at most {M} (i.e. the numerator and denominator of the coefficients are at most {M}).
  • ({w} independent) There is no non-trivial linear relation {a_1 w_1 + \ldots + a_k w_k = 0} among the {w_1,\ldots,w_k} in which the {a_1,\ldots,a_k} are rational numbers of height at most {F(M)}.

In fact, one can take {w_1,\ldots,w_k} to be a subset of the {v_1,\ldots,v_n}.

Proof: We perform the same “rank reduction argument” as before, but translated to the finitary setting. Start with {w_1,\ldots,w_k} initialised to {v_1,\ldots,v_n} (so initially we have {k=n}), and initialise {M=1}. Clearly {w} generates {v} at this height. If the {w_i} are linearly independent up to rationals of height {F(M)} then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the {w_i}, say {w_k}, is a rational linear combination of the {w_1,\ldots,w_{k-1}}, whose height is bounded by some function depending on {F(M)} and {k}. In such a case, {w_k} becomes redundant, and we may delete it (reducing the rank {k} by one), but note that in order for the remaining {w_1,\ldots,w_{k-1}} to generate {v_1,\ldots,v_n} we need to raise the height upper bound for the rationals involved from {M} to some quantity {M'} depending on {M, F(M), k}. We then replace {M} by {M'} and continue the process; it can only run for at most {n} steps and so terminates with {w_1,\ldots,w_k} and {M} obeying all of the desired properties. (Note that the bound on {M} is quite poor, being essentially an {n}-fold iteration of {F}! Thus, for instance, if {F} is exponential, then the bound on {M} is tower-exponential in nature.) \Box
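(To make the height bookkeeping slightly more concrete, here is one crude admissible choice of {M'}, not claimed to be optimal. If {a_1 w_1 + \ldots + a_k w_k = 0} with the {a_i} of height at most {F(M)} and, say, {a_k \neq 0}, then {w_k = \sum_{i<k} (-a_i/a_k) w_i}, and each coefficient {-a_i/a_k} has height at most {F(M)^2}. Substituting this into a representation of some {v_j} as a combination {\sum_{i \leq k} c_i w_i} with the {c_i} of height at most {M}, the new coefficients take the form {c_i - c_k a_i/a_k}, whose numerators and denominators are bounded in magnitude by {2 M^2 F(M)^2}; thus one may take {M' := 2 M^2 F(M)^2}, for instance.)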

(A variant of this type of approximate basis lemma was used in my paper with Van Vu on the singularity probability of random Bernoulli matrices.)

Looking at the statements and proofs of these two theorems it is clear that the two results are in some sense the “same” result, except that the latter has been made sufficiently quantitative that it is meaningful in such finitary settings as {{\mathbb Z}/p{\mathbb Z}}. In this note I will show how this equivalence can be made formal using the language of non-standard analysis. This is not a particularly deep (or new) observation, but it is perhaps the simplest example I know of that illustrates how nonstandard analysis can be used to transfer a quantifier-heavy finitary statement, such as Theorem 2, into a quantifier-light infinitary statement, such as Theorem 1, thus lessening the need to perform “epsilon management” duties, such as keeping track of unspecified growth functions such as {F}. This type of transference is discussed at length in this previous blog post of mine.

In this particular case, the amount of effort needed to set up the nonstandard machinery in order to deduce Theorem 2 from Theorem 1 is too great for this transference to be particularly worthwhile, especially given that Theorem 2 has such a short proof. However, when performing a particularly intricate argument in additive combinatorics, in which one is performing a number of “rank reduction arguments”, “energy increment arguments”, “regularity lemmas”, “structure theorems”, and so forth, the purely finitary approach can become bogged down with all the epsilon management one needs to do to organise all the parameters that are flying around. The nonstandard approach can efficiently hide a large number of these parameters from view, and it can then become worthwhile to invest in the nonstandard framework in order to clean up the rest of a lengthy argument. Furthermore, an advantage of moving up to the infinitary setting is that one can then deploy all the firepower of an existing well-developed infinitary theory of mathematics (in this particular case, this would be the theory of linear algebra) out of the box, whereas in the finitary setting one would have to painstakingly finitise each aspect of such a theory that one wished to use (imagine for instance trying to finitise the rank-nullity theorem for rationals of bounded height).

The nonstandard approach is very closely related to use of compactness arguments, or of the technique of taking ultralimits and ultraproducts; indeed we will use an ultrafilter in order to create the nonstandard model in the first place.

I will also discuss two variants of both Theorem 1 and Theorem 2 which have actually shown up in my research. The first is that of the regularity lemma for polynomials over finite fields, which came up when studying the equidistribution of such polynomials (in this paper with Ben Green). The second comes up when one is dealing not with a single finite collection {v_1,\ldots,v_n} of vectors, but rather with a family {(v_{h,1},\ldots,v_{h,n})_{h \in H}} of such vectors, where {H} ranges over a large set; this gives rise to what we call the sunflower lemma, and came up in this recent paper of myself, Ben Green, and Tamar Ziegler.

This post is mostly concerned with nonstandard translations of the “rank reduction argument”. Nonstandard translations of the “energy increment argument” and “density increment argument” were briefly discussed in this recent post; I may return to this topic in more detail in a future post.


Next month, I am scheduled to give a short speech (three to five minutes in length) at the annual induction ceremony of the American Academy of Arts and Sciences in Boston.  This is a bit different from the usual scientific talks that I am used to giving; there are no projectors, blackboards, or other visual aids available, and the audience of Academy members is split evenly between the humanities and the sciences (as well as people in industry and politics), so this will be an interesting new experience for me.  (The last time I gave a speech was in 1985.)

My chosen topic is on the future impact of internet-based technologies on academia (somewhat similar in theme to my recent talk on this topic).  I have a draft text below the fold, though it is currently too long, and my actual speech is likely to be a significantly abridged version of the one below.  [Update, Oct 12: The abridged speech is now at the bottom of the post.]  In the spirit of the theme of the talk, I would of course welcome any comments and suggestions.

For comparison, the talks from last year’s ceremony, by Jim Simons, Peter Kim, Susan Athey, Earl Lewis, and Indra Nooyi, can be found here.  Jim’s chosen topic, incidentally, was what mathematics is, and why mathematicians do it.

[Update, Nov 3: Video of the various talks by myself and the other speakers (Emmylou Harris, James Earl Jones, Elizabeth Nabel, Ronald Marc George, and Edward Villela) is now available on the Academy web site here.]


In the discussion on what mathematicians need to know about blogging mentioned in the previous post, it was noted that there didn’t seem to be a single location on the internet to find out about mathematical blogs.  Actually, there is a page, but it has been relatively obscure – the Mathematics/Statistics subpage of the Academic Blogs wiki.  It does seem like a good idea to have a reasonably comprehensive page containing all the academic mathematics blogs that are out there (as well as links to other relevant sites), so I put my own maths blogroll onto the page, and encourage others to do so also (though you may wish to read the FAQ for the wiki first).

It may also be useful to organise the list into sublists, and to add more commentary on each individual blog.  (In theory, each blog is supposed to have its own sub-page, though in practice it seems that very few blogs do at present.)

John Baez has been invited to write a short opinion piece for the Notices of the AMS to report about the maths blogging phenomenon to the wider mathematical community, and in the spirit of that phenomenon, has opened up a blog post to solicit input for that piece, over at the n-Category café.  Given that the readers here are, by definition, familiar with mathematical blogging, I thought that some of you might like to visit that thread to share your own thoughts on the advantages and disadvantages of this mode of mathematical communication.

The mini-polymath project to find solutions to Problem 6 of the 2009 IMO is still ongoing, but I thought that, while the memories of the experience are still fresh, it would be a good time to open a parallel thread to collect the impressions that participants and observers had of how the project was conducted, how successful it was, and how it (or future projects) could be made to run more smoothly.

Just to get the ball rolling, here are some impressions I got as a (rather passive) moderator:

  1. There is no shortage of potential interest in polymath projects. I was impressed by how the project could round up a dozen interested and qualified participants in a matter of hours; this is one particular strength of the polymath paradigm.  Of course, it helped that this particular project was elementary, and was guaranteed to have an elementary (and relatively short) solution.  Nevertheless, the availability of volunteers does bode well for future projects of this type.
  2.  A wiki needs to be set up as soon as possible. The wiki for polymath1 was an enormously valuable resource, once it was set up.  I had naively thought that the mini-polymath1 project would be short enough that a wiki was not necessary, but now I see that it would have come in handy for organising and storing the arguments, strategies, insights, and ideas that arose through the linear blog thread format, but which were difficult to summarise in that format.  (I have belatedly set up a wiki for this project here.)  For the next polymath project (I have none planned yet, but can imagine that one would eventually arise), I will try to ensure a wiki is available early on.
  3. There is an increasing temptation to work offline as the project develops. In the rules of the polymath projects to date, the idea is for participants to avoid working “offline” for too long, instead reporting all partial progress and thoughts on the blog and/or the wiki as it occurs.  This ideal seems to be adhered to well in the first phases of the project, when the “easy” but essential observations are being made, and the various “low-hanging fruits” are harvested, but at some point it seems that one needs to do more non-trivial amounts of computation and thought, which is still much easier to do offline than online.  It is possible that future technological advances (e.g. the concurrent editing capabilities of platforms such as Google Wave) may change this; a culture and etiquette of collaborative thinking might also evolve over time, much as how mathematical research has already adapted to happily absorb new modes of communication, such as email.  In the meantime, though, I think one has to accommodate both online and offline modes of thinking to make a polymath project as successful as possible, avoiding degeneration into a mass of low-quality observations on one hand, and a fracturing into isolated research efforts on the other.
  4. Without leadership or organisation, the big picture can be obscured by chaos. As I was distracted by other tasks (for instance, flying from Bremen back to Los Angeles), and had already known of a solution to the problem, I adopted a laissez faire attitude to the task of moderating the project.  This worked to some extent, and there was certainly no shortage of ideas being tossed back and forth, arguments being checked and repaired, etc., but I think that with more active moderation, one could have had a bit more focus on longer-term strategy and vision than there was.  Perhaps in future projects one could be more explicit in the rules about encouraging this sort of perspective (for instance, in encouraging periodic summaries of the situation either on the blog or on the wiki).
  5. Polymath projects tend to generate multiple solutions to a problem, rather than a single solution. A single researcher will tend to focus on only one idea at a time, and is thus generally led to just a single solution (if that idea ends up being successful); but a polymath project is more capable of pursuing several independent lines of attack simultaneously, and so often when the breakthrough comes, one gets multiple solutions as a result.  This makes it harder to do direct comparison of success between polymath projects and individual efforts; from the (limited) data points available, I tentatively hypothesise that polymath projects tend to be slower, but obtain broader and deeper results, than what a dedicated individual effort would accomplish.
  6. Polymath progress is both very fast and very slow. I’ve noticed something paradoxical about these projects.  On the one hand, progress can be very fast in the sense that ideas get tossed out there at a rapid rate; also, with all the proofreaders, errors in arguments get picked up much quicker than when only one mathematician is involved.  On the other hand, it can take a while for an idea or insight obtained by one participant to be fully absorbed by the others, and sometimes the key observation can be drowned out by a large number of less important observations.  The process seems somewhat analogous to that of evolution and natural selection in biology; consider for instance how the meme of “try using induction”, which was the ultimately successful approach, had to first fight among competing memes such as “try using contradiction”, “try counting arguments”, “try topological arguments on the cube”, etc., before gaining widespread acceptance.  In contrast, an individual might through luck (or experience) hit upon the right approach (in this case, induction) very early on and end up with a solution far quicker than a polymath effort; conversely, he or she may select the wrong approach and end up wasting far more time than a polymath would.
  7. The WordPress blog format is adequate, but far from ideal. Technical problems (most notably, the spam filter, the inability to preview or edit comments [except by myself], and the (temporary) lack of nesting and automatic comment numbering) made things more frustrating and clunky than they should be.  Adding the wiki helps some of the problems, but definitely not all, especially since there is no integration between the blog and the wiki.  But the LaTeX support included in the WordPress blog is valuable, even if it does act up sometimes. Hopefully future technologies will provide better platforms for this sort of thing.  (As a temporary fix, one might set up some dedicated blog (or other forum) for polymath projects with customised code, rather than relying on existing hosts.)