The month of April has been designated as Mathematics Awareness Month by the major American mathematics organisations (the AMS, ASA, MAA, and SIAM).  I was approached to write a popular mathematics article for April 2011 (the theme for that month is “Mathematics and Complexity”).  While I have written a fair number of expository articles (including several on this blog) aimed at a mathematical audience, I actually have not had much experience writing articles at the popular mathematics level, and so I found this task to be remarkably difficult.  At this level of exposition, one not only needs to explain the facts, but also to tell a story; I have experience in the former but not in the latter.

I decided to write on the topic of universality – the phenomenon that the macroscopic behaviour of a dynamical system can be largely independent of the precise microscopic structure.   Below the fold is a first draft of the article; I would definitely welcome feedback and corrections.  It does not yet have any pictures, but I plan to rectify that in the final draft.  It also does not have a title, but this will be easy to address later.   But perhaps the biggest thing lacking right now is a narrative “hook”; I don’t yet have any good ideas as to how to make the story of universality compelling to a lay audience.  Any suggestions in this regard would be particularly appreciated.

I have not yet decided where I would try to publish this article; in fact, I might just publish it here on this blog (and eventually, in one of the blog book compilations).

— Untitled —

One of the great triumphs of classical mathematics and physics was the ability to predict, with great precision, the future behaviour of simple physical systems.  For instance, in Isaac Newton’s Principia, Newton’s law of universal gravitation was applied to completely describe the behaviour of two bodies under the influence of each other’s gravity, and in particular to explain Johannes Kepler’s laws of planetary motion as a consequence.

In contrast, the infamous three-body problem, in which one seeks to generalise Newton’s analysis from two bodies under the influence of gravitation to three, remains unsolved to this day despite centuries of efforts by such great mathematicians as Isaac Newton and Henri Poincaré; Newton himself complained that this was the one problem that gave him headaches.

This is not to say that no progress at all has been made on this problem.  We now know rigorously that the three-body problem exhibits chaotic behaviour, which helps explain why no exact solution has ever been located; and we are also able to use computers to obtain numerical solutions to this problem with great accuracy and over long periods of time.  The motion of the planets in the solar system, for instance, can be extrapolated a hundred million years into the future or the past.

If however we increase the complexity of the system further, by steadily increasing the number of degrees of freedom, the ability to predict the behaviour of the system degrades rapidly.  For instance, weather prediction requires one to understand the interaction of a dozen atmospheric variables from thousands of locations throughout the planet, leading to an enormously complicated system that we can barely predict (with only moderate accuracy) a week into the future at best, and that only with the assistance of the latest algorithms and extremely powerful computers.

Given how complicated life is, it may thus seem that the ability of mathematics to model real-world situations is sharply limited.  And this is certainly the case in many contexts; for instance, the ability of mathematical models to predict key economic indicators even just a few months into the future is notoriously poor, and in some cases only marginally better than random chance.

There is, however, a remarkable phenomenon that occurs with some (but not all) complex systems: as the number of degrees of freedom increases, a system can become more predictable and orderly, rather than less.  Even more remarkable is the phenomenon of universality: many complex systems have the same macroscopic behaviour, even if they have very different microscopic dynamics.

A simple example of this occurs when predicting how voters will act in an election.  It is enormously difficult to predict how one, ten, or a hundred voters will cast their vote; but when the electorate is in the millions, one can obtain quite precise predictions by polling just a few thousand of the voters, if done with proper methodology.  For instance, just before the 2008 US elections, the statistician Nate Silver used a weighted analysis of all existing polls to correctly predict the outcome of all the US Senate races, and of the presidential election in 49 out of 50 states.

Mathematically, the accuracy of such predictions ultimately derives from two fundamental results in probability theory: the law of large numbers, and the central limit theorem.    Very roughly speaking, the law of large numbers asserts that if one takes a large number of randomly chosen quantities (e.g. the voting preferences of randomly chosen voters), then the average of these quantities (known as the empirical average) will usually be quite close to a specific number, namely the expectation of these quantities (which, in this case, would be the mean voting preference of the entire electorate).  This is not quite an exact law; there is a small error between the empirical average and the expectation which can fluctuate randomly, but the central limit theorem describes the size and distribution of this random error; roughly speaking, this error is distributed according to the normal distribution (or Gaussian distribution), more popularly known as the bell curve.
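To see both results at work, one can simulate many small polls of a large electorate.  The sketch below is purely illustrative: the 52% level of support, the poll size of 1000, and the function name simulate_polls are all assumptions chosen for the example, and real polling involves sampling difficulties that this idealised model ignores.

```python
import math
import random
import statistics


def simulate_polls(true_support=0.52, poll_size=1000, num_polls=1000, seed=0):
    """Return the empirical averages obtained from repeated independent polls."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_polls):
        # Each polled voter independently supports the candidate
        # with probability true_support (an idealised assumption).
        yes_votes = sum(rng.random() < true_support for _ in range(poll_size))
        results.append(yes_votes / poll_size)
    return results


TRUE_SUPPORT = 0.52
POLL_SIZE = 1000
polls = simulate_polls(TRUE_SUPPORT, POLL_SIZE)

# Law of large numbers: poll averages cluster around the true support.
mean_of_polls = statistics.fmean(polls)

# Central limit theorem: the error has a predictable size, sqrt(p(1-p)/n),
# and roughly 95% of polls should land within two such units of the truth.
observed_spread = statistics.pstdev(polls)
predicted_spread = math.sqrt(TRUE_SUPPORT * (1 - TRUE_SUPPORT) / POLL_SIZE)
within_two_sigma = sum(
    abs(p - TRUE_SUPPORT) <= 2 * predicted_spread for p in polls
) / len(polls)

print(f"average of poll results: {mean_of_polls:.4f}")
print(f"observed spread:         {observed_spread:.4f}")
print(f"predicted spread:        {predicted_spread:.4f}")
print(f"share within two sigma:  {within_two_sigma:.3f}")
```

With these assumed numbers the predicted spread is about 0.016, so a poll of only a thousand voters is typically within a couple of percentage points of the truth, no matter how large the electorate is.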

The central limit theorem is astonishingly universal; whenever a large number of independent random quantities are averaged together to form a single combined quantity, the distribution of that quantity usually follows the bell curve distribution (with the caveat that if the quantities are combined together in a multiplicative manner rather than an additive one, then the bell curve only appears when the quantity is plotted logarithmically).  For instance, the heights of human adults (or of adults of any other given species) follow a bell curve distribution closely (when plotted logarithmically), as do standardised test scores, velocities of particles in a gas, or a casino’s take from a day’s worth of gambling.  This is not because there is any fundamental common mechanism linking height, test performance, atomic velocities, and gambling profits together; from a “low-level” perspective, the factors that go into these quantities behave completely differently from each other.  But these quantities are all formed by aggregating together many independent factors (for instance, human height arises from a combination of diet, genetics, environment, health, and childhood development), and it is this high-level structure that gives them all the same universal bell curve distribution.

The universality of the bell curve allows one to model many complex phenomena, from the noise of electromagnetic interference to the insurance costs of medical diseases, which would otherwise defy any rational analysis; it is of fundamental importance in many practical disciplines, such as engineering, statistics, and finance.  It is important to note, however, that there are limits to this universality principle: if the individual factors that are aggregated into a combined quantity are not independent of each other, but are instead highly correlated with each other, then the combined quantity can in fact be distributed in a manner radically different from the bell curve.  The financial markets discovered this unfortunate fact recently, most notably with the collateralised debt obligations that aggregated together many mortgages into a single instrument.  If the mortgages behaved independently, then it would be inconceivable that a large fraction of them would fail simultaneously; but, as we now know, there was a lot more dependence between them than had previously been believed.  A mathematical model is only as strong as the assumptions behind it.
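The effect of correlation can be seen in a toy simulation.  The sketch below is not a model of any real financial instrument: the pool sizes, the default probabilities, and the “downturn” mechanism are all assumptions invented for illustration.  In both scenarios each loan defaults about 5% of the time on average; the only difference is whether the loans share a common source of risk.

```python
import random


def fraction_of_bad_pools(correlated, num_pools=400, loans_per_pool=500,
                          failure_threshold=0.20, seed=1):
    """Fraction of simulated loan pools in which more than 20% of loans default."""
    rng = random.Random(seed)
    bad_pools = 0
    for _ in range(num_pools):
        if correlated:
            # Hypothetical shared shock: 10% of the time there is a downturn
            # that makes every loan in the pool riskier at once.
            downturn = rng.random() < 0.10
            default_prob = 0.32 if downturn else 0.02
        else:
            # Independent case: each loan defaults with probability 5%,
            # regardless of what the other loans do.
            default_prob = 0.05
        defaults = sum(rng.random() < default_prob for _ in range(loans_per_pool))
        if defaults > failure_threshold * loans_per_pool:
            bad_pools += 1
    return bad_pools / num_pools


independent_rate = fraction_of_bad_pools(correlated=False)
correlated_rate = fraction_of_bad_pools(correlated=True)

print(f"independent loans: {independent_rate:.3f} of pools fail badly")
print(f"correlated loans:  {correlated_rate:.3f} of pools fail badly")
```

In the independent scenario a mass failure is so improbable that it should essentially never appear in the simulation, whereas in the correlated scenario it should occur in roughly one pool in ten, even though the average default rate per loan is the same.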

The central limit theorem is one of the most basic examples of the universality phenomenon in nature, in which the compound quantity being measured is simply an aggregate of the individual components.  But there are several other universal laws for complex systems, in which one is performing a more complex operation than mere aggregation.  One example of this occurs in the theory of percolation.  This theory models such phenomena as the percolation of water through a porous medium of coffee grounds, the conductance of electricity through nanomaterials, or the extraction of oil through cracks in the earth.  In this theory, one starts with a network of connections in a two-dimensional, three-dimensional, or higher-dimensional space (such as a cubic lattice, a square lattice, or a triangular lattice), and then activates some randomly selected fraction of this network.  These active connections then organise into complicated-looking fractal shapes known as clusters.  If the proportion of connections being activated is low, then the clusters tend to be small and numerous; the water does not percolate through the coffee, the nanomaterial is an insulator, and the oil does not seep out of the earth.  At the other extreme, when the proportion is high, then the connections tend to mostly coalesce into one huge cluster; the water percolates freely, the nanomaterial conducts, and the oil seeps through.  But the most interesting phenomena happen near the critical density of percolation, in which the clusters are almost, but not quite, threatening to join up into one single giant cluster.  When observing such a phase transition, we have found that the size, shape, and distribution of such clusters follow a beautifully fractal, but very specific, law.  Furthermore, this law is universal: one can start with two different networks (e.g. a square lattice and triangular lattice), and the behaviour of the clusters near the critical density will be virtually identical (provided that certain basic statistics of the problem, such as the dimension of the lattice, are kept constant).  Such universality laws have tantalising implications for physics; they suggest that we may not actually need to fully understand the laws of physics at the microscopic level in order to be able to obtain predictions at the macroscopic level, much as the laws of thermodynamics can be derived without necessarily knowing all the subtleties of atomic physics.  Unfortunately, the rigorous mathematical understanding of universality for such theories as percolation is still incomplete, despite many recent advances (for instance, very recently a Fields Medal, one of the highest honours in mathematics, was awarded to the Russian mathematician Stas Smirnov, in part for his breakthroughs in the theory of percolation for certain specific networks).
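The two extremes described above are easy to observe numerically.  The following is a minimal sketch under stated assumptions: it uses a small square lattice, activates each connection independently with a fixed probability, and tracks clusters with a standard union-find structure.  The function name largest_cluster_fraction and the lattice size of 60 are choices made for this illustration, and a lattice this small says nothing reliable about the fine fractal structure at the critical density (which for this particular lattice is known to be one half).

```python
import random
from collections import Counter


def largest_cluster_fraction(size, open_prob, seed=0):
    """Bond percolation on a size x size square lattice.

    Each connection between neighbouring sites is activated independently
    with probability open_prob.  Returns the fraction of all sites that
    belong to the largest cluster.
    """
    rng = random.Random(seed)
    parent = list(range(size * size))

    def find(site):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for row in range(size):
        for col in range(size):
            site = row * size + col
            if col + 1 < size and rng.random() < open_prob:
                union(site, site + 1)      # connection to the right neighbour
            if row + 1 < size and rng.random() < open_prob:
                union(site, site + size)   # connection to the neighbour below

    cluster_sizes = Counter(find(site) for site in range(size * size))
    return max(cluster_sizes.values()) / (size * size)


for p in (0.2, 0.5, 0.8):
    fraction = largest_cluster_fraction(60, p)
    print(f"proportion activated {p:.1f}: largest cluster holds {fraction:.3f} of sites")
```

At a low proportion the largest cluster should hold only a tiny fraction of the sites, while at a high proportion it should contain nearly all of them; the intermediate value sits at the critical density, where the outcome is most sensitive to the random choices.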

Another major example of universality, which is also only partially understood at present, arises in understanding the spectra of various large systems.  Historically, the first instance of this came with the work of Eugene Wigner in the 1950s on the scattering of neutrons off large nuclei, such as the uranium-238 nucleus.  Much as the laws of quantum mechanics dictate that an atom only absorbs some frequencies of light and not others, leading to a distinctive colour for each atomic element, the electromagnetic and nuclear forces of a nucleus, when combined with the laws of quantum mechanics, predict that a neutron will pass through a nucleus virtually unimpeded for some energies, but will bounce off that nucleus at other energies, known as scattering resonances.  One can compute these resonances directly from first principles when dealing with very simple nuclei, such as the hydrogen and helium nuclei, but the internal structure of larger nuclei is so complex that it has not been possible to compute these resonances either theoretically or numerically, leaving experimental data as the only option.

These resonances have an interesting distribution; they are not independent of each other, but instead seem to obey a precise repulsion law that makes it quite unlikely that two adjacent resonances are too close to each other.  In the decades since Wigner’s work, exactly the same governing laws have been found for many systems, both physical and mathematical, that have absolutely nothing to do with neutron scattering, from the arrival times of buses at a bus stop to the zeroes of the Riemann zeta function.  The latter example is particularly striking, as it comes from one of the purest subfields of mathematics, namely number theory, and in particular the distribution of the prime numbers.  The prime numbers are distributed in an irregular fashion through the integers; but if one performs a spectral analysis on this distribution, one can discern certain long-term oscillations in this distribution (sometimes known as the music of the primes), the frequencies of which are described by a sequence of complex numbers known as the (non-trivial) zeroes of the Riemann zeta function, which were first studied by Bernhard Riemann.  In principle, these numbers tell us everything we would wish to know about the primes.  One of the most famous and important problems in number theory is the Riemann hypothesis, which asserts that these numbers all lie on a single line in the complex plane.  It has many consequences in number theory, particularly for the distribution of the prime numbers.  However, even the powerful Riemann hypothesis does not settle everything in this subject, in part because it does not directly say much about how the zeroes are distributed on this line.  But there is extremely strong numerical evidence that these zeroes obey the same precise law that is observed in neutron scattering and in other systems; in particular, the zeroes seem to “repel” each other in a certain very precise fashion.
The formal description of this law is known as the Gaussian Unitary Ensemble (GUE) hypothesis.  Like the Riemann hypothesis, it is currently unproven, but it has powerful consequences for the distribution of the prime numbers.
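The repulsion effect can be glimpsed even in the smallest possible random matrix model.  The sketch below is a toy under stated assumptions, not a test of the GUE hypothesis itself: it draws random 2 x 2 Hermitian matrices with Gaussian entries, for which the gap between the two eigenvalues has a simple closed form, and compares how often that gap is very small against the gaps between points placed completely independently at random.  The function names and the cutoff of 0.1 are choices made for this illustration; the law discussed above concerns very large matrices, for which 2 x 2 matrices are only a rough guide.

```python
import math
import random


def gue_2x2_spacings(num_samples, rng):
    """Eigenvalue gaps of random 2x2 Hermitian matrices [[a, b+ic], [b-ic, d]]."""
    gaps = []
    for _ in range(num_samples):
        a = rng.gauss(0, 1)
        d = rng.gauss(0, 1)
        b = rng.gauss(0, math.sqrt(0.5))
        c = rng.gauss(0, math.sqrt(0.5))
        # Closed form for the gap between the two eigenvalues.
        gaps.append(math.sqrt((a - d) ** 2 + 4 * (b * b + c * c)))
    return gaps


def independent_spacings(num_samples, rng):
    """Gaps between consecutive points that are scattered independently at random."""
    return [rng.expovariate(1.0) for _ in range(num_samples)]


def fraction_of_small_gaps(gaps, cutoff=0.1):
    mean_gap = sum(gaps) / len(gaps)
    # Rescale so that the average gap is 1, then count the unusually small gaps.
    return sum(g / mean_gap < cutoff for g in gaps) / len(gaps)


rng = random.Random(2)
repelling = fraction_of_small_gaps(gue_2x2_spacings(100_000, rng))
independent = fraction_of_small_gaps(independent_spacings(100_000, rng))

print(f"random matrix eigenvalues: {repelling:.4f} of gaps are very small")
print(f"independent points:        {independent:.4f} of gaps are very small")
```

Independent points should produce very small gaps far more often than the eigenvalues do, which is the “repulsion” in action: the eigenvalues behave as if they actively avoid one another.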

The fact that the music of the primes and the energy levels of nuclei obey the same universal law is very surprising; legend has it that during a tea at the Institute for Advanced Study, the number theorist Hugh Montgomery had been telling the renowned physicist Freeman Dyson of his remarkable findings on the distribution of the zeroes of the zeta function (or more precisely, a statistic of this distribution known as the pair correlation function), only to have Dyson write down this function exactly, based on nothing more than his experience with scattering and with the random matrix models used to make predictions about it.  But this does not mean that the primes are somehow nuclear-powered, or that atomic physics is somehow driven by the prime numbers; instead, it is evidence that a single distribution (known as the GUE distribution) is so universal that it is the natural end product of any number of different processes, whether it comes from nuclear physics or number theory.

We still do not have a full explanation of why this particular distribution is so universal.  There has, however, been some recent theoretical progress in this direction.  For instance, recent work of László Erdős, Benjamin Schlein, and Horng-Tzer Yau has shown that the GUE distribution has a strong attraction property: if a system does not initially obey the GUE distribution, then after randomly perturbing the various coefficients of that system in a certain natural manner, the distribution converges quite rapidly in a certain sense to GUE.  This can already be used to rigorously demonstrate the presence of the GUE distribution in a number of theoretical randomised models known as random matrix models (though, sadly, not for the zeroes of the zeta function, as this is a deterministic system rather than a random one).  In related recent work of Van Vu and myself, we showed that the spectral distribution of a random matrix model does not change much if one replaces one (or even all) of the components of the system with another component that fluctuates in a comparable manner; this implies that large classes of systems end up having almost the same spectral statistics, which is further evidence towards universality, and can be used to give results similar to those obtained by Erdős, Schlein, and Yau.  The study of these random matrix models, and their implications regarding the universality phenomenon, is a highly active current area of research.
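A very crude version of this insensitivity to the individual components can be checked by hand.  The sketch below is only an illustration under stated assumptions: it builds two random symmetric matrices, one with Gaussian entries and one with entries that are simply +1 or -1, and compares a single coarse spectral statistic (the normalised fourth moment of the eigenvalues, computed without finding the eigenvalues themselves).  This is a far blunter statistic than the fine-scale ones treated in the work described above, and the function name and matrix size are choices made for this example.

```python
import random


def normalised_fourth_moment(size, draw_entry):
    """Return (1/size) * trace((M / sqrt(size))**4) for a random symmetric matrix M."""
    m = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            # Fill the upper triangle and mirror it to keep M symmetric.
            m[i][j] = m[j][i] = draw_entry()
    # For symmetric M, trace(M^4) equals the sum of the squared entries of M^2,
    # and entry (i, j) of M^2 is the dot product of rows i and j of M.
    total = 0.0
    for i in range(size):
        row_i = m[i]
        for j in range(size):
            entry = sum(x * y for x, y in zip(row_i, m[j]))
            total += entry * entry
    return total / size ** 3


rng = random.Random(3)
gaussian_moment = normalised_fourth_moment(100, lambda: rng.gauss(0, 1))
sign_moment = normalised_fourth_moment(100, lambda: rng.choice((-1.0, 1.0)))

print(f"Gaussian entries: fourth moment {gaussian_moment:.3f}")
print(f"+1/-1 entries:    fourth moment {sign_moment:.3f}")
```

Both values should come out close to 2, even though a coin flip and a Gaussian variable look nothing alike individually; all that the two kinds of entries share is their mean and the size of their fluctuations.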

There are, however, important settings in which we are unable to rely on universality, for instance in fluid mechanics.  At microscopic (molecular) scales, one can understand a fluid by modeling the collisions of individual molecules.  At macroscopic scales, one can use the universal equations of fluid mechanics (such as the Navier-Stokes equations), which reduce all the microscopic structure of the component molecules to a few key macroscopic statistics such as viscosity and compressibility.  However, at mesoscopic scales that are too large for microscopic analysis to be effective, but too small for macroscopic universality to kick in, our ability to model fluids is still very poor.  An important example of a mesoscopic fluid is blood flowing through blood vessels; the blood cells that make up this liquid are so large that they cannot be treated merely as an ensemble of microscopic molecules, but must instead be treated as mesoscopic agents with complex behaviour.  Understanding exactly which scenarios admit a universality phenomenon to simplify the analysis, and knowing what to do even in the absence of such phenomena, is a continuing challenge to mathematical modeling today.