The page you are looking for does not exist; it may have been moved, or removed altogether. You might want to try the search function. Alternatively, return to the front page.
Recent Comments
Diversions
- Abstruse Goose
- Assembler
- FiveThirtyEight
- Gapminder
- Nassim Taleb’s essay on the limits of statistics
- Planarity
- snopes
- Strange maps
- Television tropes and idioms
- The Daily Show with Jon Stewart
- The Economist
- The Onion
- The Straight Dope
- This American Life on the financial crisis I
- This American Life on the financial crisis II
- xkcd
Mathematics
- 0xDE
- A mathematical melting pot
- A Mind for Madness
- A Portion of the Book
- AMS Graduate Student Blog
- Analysis & PDE
- Analysis & PDE Conferences
- Annoying Precision
- Ars Mathematica
- Avzel's journal
- Combinatorics and more
- Compressed sensing resources
- Computational Complexity
- Concrete nonsense
- Delta epsilons
- Detexify
- DispersiveWiki
- Disquisitiones Mathematicae
- Embûches tissues
- Emmanuel Kowalski’s blog
- Frank Morgan’s blog
- Gödel’s Lost Letter and P=NP
- Geometric Group Theory
- Geometry and the imagination
- God Plays Dice
- Good Math, Bad Math
- Hydrobates
- Images des mathématiques
- In theory
- Journal of the American Mathematical Society
- LaTeX to Wordpress
- Le Petit Chercheur Illustré
- Lewko's blog
- London number theory
- Low Dimensional Topology
- Math Overflow
- Mathematical musings
- Mathematics blogs wiki
- Mathematics Illuminated
- Mathematics in Australia
- Mathematics Jobs Wiki
- Mathematics under the Microscope
- Mathlog
- Motivic stuff
- Multiple Choice Quiz Wiki
- neverendingbooks
- nLab
- Noncommutative geometry blog
- Nuit-blanche
- Number theory web
- outofprintmath
- Quomodocumque
- Reasonable Deviations
- Richard Borcherds’ blog
- Rigorous Trivialities
- Sage: Open Source Mathematical Software
- Secret Blogging Seminar
- Shtetl-Optimized
- Sketches of topology
- SymOmega
- tcs math
- The accidental mathematician
- The Everything Seminar
- The Geomblog
- The n-Category Café
- The n-geometry cafe
- The polymath blog
- The polymath wiki
- The Tricki
- The twofold gaze
- The Unapologetic Mathematician
- Tim Gowers’ blog
- Tim Gowers’ mathematical discussions
- Todd and Vishal’s blog
- Vivatsgasse 7
- Williams College Math/Stat Blog
- Wiskundemeisjes
- XOR’s hammer
Selected articles
- American Academy of Arts and Sciences speech
- Amplification, arbitrage, and the tensor power trick
- An airport-inspired puzzle
- Benford's law, Zipf's law, and the Pareto distribution
- Compressed sensing and single-pixel cameras
- Einstein’s derivation of E=mc^2
- On multiple choice questions in mathematics
- Quantum mechanics and Tomb Raider
- Sailing into the wind, or faster than the wind
- Simons lectures on structure and randomness
- Small samples, and the margin of error
- Soft analysis, hard analysis, and the finite convergence principle
- The blue-eyed islanders puzzle
- The cosmic distance ladder
- Ultrafilters, non-standard analysis, and epsilon management
- What is a gauge?
- What is good mathematics?
- Why global regularity for Navier-Stokes is hard
The sciences
Top Posts
- An inverse theorem for the Gowers U^4 norm
- Course announcement: 254A - random matrices
- Career advice
- From Bose-Einstein condensates to the nonlinear Schrodinger equation
- Books
- On writing
- Google Wave
- Does one have to be a genius to do maths?
- About
- UCLA Math Seeks Exceptional Student for Undergraduate Merit Scholarship
Archives
- December 2009 (2)
- November 2009 (8)
- October 2009 (15)
- September 2009 (6)
- August 2009 (13)
- July 2009 (10)
- June 2009 (11)
- May 2009 (9)
- April 2009 (11)
- March 2009 (14)
- February 2009 (13)
- January 2009 (18)
- December 2008 (8)
- November 2008 (9)
- October 2008 (10)
- September 2008 (5)
- August 2008 (6)
- July 2008 (7)
- June 2008 (8)
- May 2008 (11)
- April 2008 (12)
- March 2008 (12)
- February 2008 (13)
- January 2008 (17)
- December 2007 (10)
- November 2007 (9)
- October 2007 (9)
- September 2007 (7)
- August 2007 (9)
- July 2007 (9)
- June 2007 (6)
- May 2007 (10)
- April 2007 (11)
- March 2007 (9)
- February 2007 (4)
Categories
- expository (70)
- tricks (5)
- guest blog (7)
- Mathematics (304)
- math.AC (2)
- math.AG (9)
- math.AP (50)
- math.AT (11)
- math.CA (54)
- math.CO (75)
- math.CT (3)
- math.CV (6)
- math.DG (26)
- math.DS (40)
- math.FA (15)
- math.GM (4)
- math.GN (9)
- math.GR (17)
- math.GT (8)
- math.HO (8)
- math.IT (8)
- math.LO (19)
- math.MG (18)
- math.MP (10)
- math.NA (8)
- math.NT (39)
- math.OA (7)
- math.PR (32)
- math.QA (3)
- math.RA (7)
- math.RT (4)
- math.SG (3)
- math.SP (14)
- math.ST (3)
- non-technical (68)
- admin (16)
- advertising (10)
- media (8)
- journals (1)
- obituary (4)
- opinion (21)
- paper (78)
- question (39)
- polymath (19)
- talk (55)
- DLS (18)
- teaching (71)
- travel (24)
Tags
additive combinatorics
almost periodicity
Ben Green
compactness
compressed sensing
condition number
correspondence principle
distributions
eigenvalues
entropy
equidistribution
ergodic theory
finite fields
Fourier transform
Freiman's theorem
gradient shrinking solitons
graph theory
hard analysis
harmonic analysis
hypergraphs
inverse conjecture
Kakeya conjecture
Littlewood-Offord theorems
meta
multiple recurrence
NLS
politics
polymath1
polynomial method
polynomials
prime numbers
project heatwave
property testing
random matrices
randomness
Ratner's theorem
regularity lemma
Ricci flow
Schrodinger equation
sieve theory
structure
Szemeredi's theorem
universality
Van Vu
wave maps
The Polymath Blog
- Proposals (Tim Gowers): Polynomial DHJ, and Littlewood’s problem
- Proposal (Tim Gowers) The Origin of Life
- (Research thread V) Determinstic way to find primes
- Nature article on Polymath
- (Research Thread IV) Determinstic way to find primes
- (Research Thread III) Determinstic way to find primes
- (Research thread II) Deterministic way to find primes
- Polymath on other blogs
- Deterministic way to find primes: discussion thread
- Selecting another polymath project
Mathematics in Australia
- Cheryl Praeger named as 2009 Western Australian Scientist of the YearU 2 December, 2009
- Positions at Australian National University 4 November, 2009
- Mathematics skills out for the count 26 October, 2009
- Federal government awards $2 million for Improving Mathematics Education in Schools project 13 October, 2009
- Message from AMSI regarding proposed cuts at Victoria University 9 October, 2009
- Jurassic Quark 8 October, 2009
- Nobel Prize Laureates 7 October, 2009
- Access to mathematics is vital for equity 29 September, 2009
- Aussie mathematician blogs 26 September, 2009
- ARC releases consultation paper on its peer review process 18 September, 2009

No comments yet
24 February, 2007 at 2:06 pm
Tom
About your postscriptum: apparently there’s an easy way to add some LaTeX in blogger, see:
http://wolverinex02.googlepages.com/emoticonsforblogger2
24 February, 2007 at 6:10 pm
Kate Sanderstead
A word of warning: your math images are going to be very unstable – they rely on a script running at forkosh.com. If this goes offline for some reason all the formulae in your posts will disappear!
Seeing as this blog is only just starting, I suggest you move to wordpress.com, which seems to have (very recently) acquired LaTeX support without any hacks:
http://faq.wordpress.com/2007/02/18/can-i-put-math-or-equations-in-my-posts/
All the best,
Kate
26 February, 2007 at 7:39 pm
Dave Bacon
Very cool post! I like it, but am a bit skeptical about pushing the quantization of probabilities over to the quantization in the Schrodigner equation, but I’ll have to think about it more.
Another interesting problem to think about in computer universes is what entities in these universes would say about predicting their future. Suppose you’ve got a run of the mill deterministic cellular automata and your a big blob in this universe. Can you predict you’re own future? You can’t because you only have access to the information in your past light cone. Now, after the fact, you’ll be able to have a consistent history of your universe, but you won’t be able to predict the future. Sort of quantum mechanical as well!
26 February, 2007 at 7:40 pm
Infinite Reflections » Blog Archive » Fields Medal and Blogging …
[...] post on Lara Croft Tomb Raider and Quantum Mechanics is pretty entertaining and insightfull [...]
26 February, 2007 at 7:46 pm
The Quantum Pontiff » Post To Read While the World Spins
[...] factors while Terrance Tao discusses some quantum mechanics based on…Tomb [...]
26 February, 2007 at 9:32 pm
Jacques Distler
I found myself scratching my head a bit over this post.
Classical physics is no stranger to probabilities. Nor to the “quantization” of certain observables (think, e.g. of Sturm-Liouville systems). What distinguishes quantum mechanics from classical physics is non-commutativity. Observables (self-adjoint operators on Hilbert Space) don’t commute, in general. This non-cummutativity is at the heart of all interference phenomena, is responsible for the (generalized) Heisenberg Uncertainty relation (a lower bound on the product of the variances, in any quantum state, of two non-commuting observables), and all of the weird stuff we call “quantum.”
In the limit that you can neglect it, you recover classical physics (via Ehrenfest’s Theorem).
I didn’t really see how non-commutativity was captured in your Tomb-Raiders analogy. In fact, all of the phenomena sounded distinctly classical, albeit rather odd, unaccustomed as we are to dealing with the classical physics of an ENSEMBLE of Laras.
26 February, 2007 at 9:56 pm
Vish Subramanian
The comment of Jacques Distler leaves me scratching my head a bit. Apparently, physics PhD’s take it upon themselves to mock mathematicians for stepping on their turf, without understanding what they say.
Anyway, its quite clear that Terry Tao is not talking about a classical ensemble of Lara’s, because the various Lara’s are interacting with each other. Its exactly the same as an electron behaving as though there were multiple slits between it and a screen – interference results.
The non-commutative property doesnt come though perfectly – its only an analogy(!), but it can be seen as follows – imagine one operator giving you the exact incarnation of a particular Lara (position), and the other one giving the final score of the game (momentum). They are non-commutative. If you measure only the final score, the incarnation that reached it is indeterminate, and if you measure only the incarnation (by playing only one game without interference), you cant be sure what the final score is.
26 February, 2007 at 10:21 pm
Jacques Distler
Why the hostility, Vish?
Anyway, “interacting” with the corpses of Laras from previous games is purely classical phenomenon. The fact that “Jaqueline” doesn’t know about the corpses makes this a kind of “hidden variable theory.”
26 February, 2007 at 10:33 pm
Terence Tao
Dear Jacques,
Thanks for the comments. The model I have here is classical in the
external universe, but non-classical in the game universe; the point is that the hidden variables are external to the game universe. (The many-worlds interpretation is similar, by the way; it is a local hidden variable explanation of quantum mechanics, but only in the “external” setting in which one regards the entire universe as a wave function. In the
“internal” setting of the classical universe observed by a macroscopic
observer, a hidden variable explanation is impossible.)
I don’t yet have an explicit example of Bell inequality violation in the
game universe – which is an interesting puzzle, and I’ll think about it – but in the meantime, let me offer you a Tomb Raider analogue of the two slit experiment. Imagine a tomb with the following layout:
…Entrance…..
………|………
.Antechamber.
…|………..|…
Door A…Door B
…|………..|…
…..Seesaw….
………|………
…….Exit……..
Suppose that the doors are one way: on reaching the antechamber,
Lara has to choose one of the two doors, and on doing so, is stuck
on one end of the seesaw. Suppose that the seesaw is trapped in
such a way that one has to keep the seesaw balanced for, say,
five minutes before the trap is defused. Classically, this would
be impossible for Lara, as she is only on one side. But if she goes
through once, say on side A, and then dies, then when the game
is restored, she can go in on side B and balance herself against
the corpse from the previous game to defuse the trap. So she in
fact has a 50% chance of survival here.
But on the other hand, note that if you lock either one of the doors,
her survival rate drops to 0%.
This does not have an easy classical explanation within the game
universe, even with hidden variables, at least if you make the
“locality” assumption that Lara can only go through one of the two
one-way doors, and if you assume that the locks have no effect other
than to stop Lara from choosing one of the doors.
26 February, 2007 at 10:58 pm
Jacques Distler
Great!
What would be clearest would be to try to rewrite this this old discussion of the Greenberger-Horne-Shimony-Zeilinger theorem, in terms of your “Tomb-Raiders” universe.
GHSZ really put the distinction between hidden variable theories and QM is sharp relief.
26 February, 2007 at 11:43 pm
It is all about me! » Blog Archive » Determinism
[...] it but I never truly embraced it….. because I had nothing to write about. But physics and quantum mechanics in particular are bloggable. | Suppose you’ve got a run of the | mill deterministic cellular [...]
27 February, 2007 at 12:53 am
zerocold
Great post professor Tao and welcome to the exciting word of blogosfera. I hope that you will be an example to other professional matematician to share they studies and their knowledges with the worl non only with thecnical papers on math journal. I’ve already added your feed to my rssreader.
zerocold
27 February, 2007 at 12:57 am
Terry Tao
Okay, I think I can get a Bell’s inequality violation now, though as in the usual Bell’s inequality the situation is complicated.
I will need two characters (Lara and Indiana), and three locations widely separated in space: Tomb A, Gate L, and Gate I. Gate L and Gate I both have two up-down switches which either character can manipulate into any of the four positions before trying to open the gate: up-up, up-down, down-up, or down-down. However, the gates are trapped: only two of the positions allow the gate to be opened safely; the other two positions will ensure that the gate electrocutes whoever is trying to open it. Lara and Indiana know that the gates are anti-symmetric: if one flips both switches then that toggles whether the gate is safe or not (e.g. if down-up is safe, then up-down electrocutes). But they do not know exactly which combinations are safe.
Lara and Indiana desperately need to open both gates, but do not know the safe combination. They believe (inaccurately, as it turns out) that inside Tomb A, there is inscribed a combination (of one of the four positions) which will safely open both gates. Their plan is to jointly go to Tomb A, find the combination, write that combination down on two piece of paper (one for Lara, one for Indiana), and then Lara and Indiana will travel separately to Gate L and Gate I to try that combination to unlock both gates. At this point, the player saves the game and play continues repeatedly from this restore point. In this particular scenario, the player actually has no control over Lara and Indiana’s actions; they are independent AI’s, following the plan described above.
Unfortunately for Lara and Indiana, the combination in Tomb A is simply a random combination – up-up, up-down, down-up, and down-down are each 25% likely to be found in tomb A. In truth, the combinations to Gate L and Gate I have been set by Jacqueline, as one of the following possible settings:
Gate L, Setting 1: opens safely on up-up or up-down, electrocutes on down-up or down-down
Gate L, Setting 2: opens safely on up-up or down-up, electrocutes on up-down or down-down
Gate I, Setting 1: Opens safely on up-up or up-down, electrocutes on down-up or down-down
Gate I, Setting 2: opens safely on up-down or down-down, electrocutes on up-up or down-up.
Jacqueline is interested in the probability that Lara and Indiana end up with the same fate, i.e.
p = Prob( (Lara and Indiana both survive) OR (Lara and Indiana both die) ).
Let’s see what this probability is in various cases.
First suppose that Gate L and Gate I are both set to setting 1, thus they open on up-* and electrocute on down-*. If Lara and Indiana find an up-* pattern in Tomb A then they both survive. In some cases they may both be electrocuted, but only if they both hold down-* codes. If Lara and Indiana later encounter corpses of themselves clutching a down-* code, they are intelligent enough to apply the opposite of that code (overriding whatever false clue they got from Tomb A) and pass through safely. As the situation is totally symmetric we see in this case that p = p_11=1.
Now suppose that Gate L and Gate I are both set to setting 2, thus Gate L is only safe for *-up and gate I is only safe for *-down. Then what happens every time the game is played is that exactly one of Lara or Indiana dies. Note that due to the entangled nature of the corpse mechanic, this means that Lara and Indiana never see any useful corpses which could save their lives. So in this case p = p_22=0.
Now suppose that Gate L is in setting 1 and Gate I is in setting 2, or vice versa. Then what happens, if Indiana and Lara see no corpses, is that they have an independent 50% chance of survival, and thus a 50% chance of meeting the same fate. On the other hand, if Indiana and Lara see corpses (and the way the mechanic works, if one of them sees a corpse, the other does also), then they will use the more intelligent negation strategy to open both gates. Thus in these cases p_12 or p_21 is _strictly_ greater than 1/2.
These facts put together violate (the CHSH form of) Bell’s inequality:
p_11 + p_12 + p_21 – p_22 \not \leq 2.
p.s. Thanks for the pointer to the GHSZ theorem, which I was unaware of. Unfortunately I will probably not have the time to try to concoct
an analogue for that theorem in my model…
27 February, 2007 at 3:46 am
nc
“Anyway, its quite clear that Terry Tao is not talking about a classical ensemble of Lara’s, because the various Lara’s are interacting with each other. Its exactly the same as an electron behaving as though there were multiple slits between it and a screen – interference results.”
The solar system would be as chaotic as a multi-electron atom if the gravitational charges (masses) of the planets were all the same (as for electrons) and if the sum or planetary masses was the sun’s mass (just as the sum of electron charges is equal to the electric charge of the nucleus). This is the 3+ body problem of classical mechanics:
‘… the ‘inexorable laws of physics’ … were never really there … Newton could not predict the behaviour of three balls … In retrospect we can see that the determinism of pre-quantum physics kept itself from ideological bankruptcy only by keeping the three balls of the pawnbroker apart.’
– Dr Tim Poston and Dr Ian Stewart, ‘Rubber Sheet Physics’ (science article, not science fiction!) in Analog: Science Fiction/Science Fact, Vol. C1, No. 129, Davis Publications, New York, November 1981.
Obviously Bohr did not know anything about this chaos in classical systems, when when coming up with complementarity and correspondence principles in the Copenhagen Interpretation. Nor did even David Bohm, who sought the Holy Grail of a potential which becomes deterministic at large scales and chaotic (due to hidden variables) at small scales.
What is interesting is that, if chaos does produce the statistical effects for multi-body phenomena (atoms with a nucleus and at least two electrons), what produces the interference/chaotic statistically describable (Schroedinger equation model) phenomena when a single photon has a choice of two slits, or when a single electron orbits a proton in hydrogen?
Quantum field theory phenomena obviously contribute to quantum chaotic effects. The loops of charges spontaneously and randomly appearing around a fermion between IR – UV cutoffs could cause chaotic deflections on the motion of even a single orbital electron:
‘… the Heisenberg formulae can be most naturally interpreted as statistical scatter relations [between virtual particles in the quantum foam vacuum and real electrons, etc.] … There is, therefore, no reason whatever to accept either Heisenberg’s or Bohr’s subjectivist interpretation …’
– Sir Karl R. Popper, Objective Knowledge, Oxford University Press, 1979, p. 303.
Yang-Mills exchange radiation is what constitutes electromagnetic fields, both of the electrons in the screen containing the double slits, and also the electromagnetic fields of the actual photon of light itself.
‘Light … “smells” the neighboring paths around it, and uses a small core of nearby space. (In the same way, a mirror has to have enough size to reflect normally: if the mirror is too small for the core of nearby paths, the light scatters in many directions, no matter where you put the mirror.)’ – R. P. Feynman, QED, Penguin, 1990, page 54.
The electrons are exchanging a net amount of gauge boson energy where you have electromagnetic forces doing work. The claim of the failure of Maxwell’s classical electromagnetism (that a spinning charge should radiate, and there should be no ground state) is due to ignoring the fact that a static electric field is simply an equilibrium of light speed radiation exchange between all charges. There could be plenty of interesting checkable mathematical results from rigorously explaining the distinction between classical and quantum results without involking speculation.
27 February, 2007 at 9:08 am
Anonymous
I may be suffering from all the blind spots that a father has when he praises his son, but I believe that my computer program “Quantum Fog” (mac freeware available at Apple website) is the best software ever for understanding quantum mechanics.
27 February, 2007 at 9:15 am
Matt Leifer
Nice post! I’m a bit puzzled by your assertion that the Tomb Raider model is like many-worlds though. To me it smells more like a nonlocal hidden variable theory in the vein of Bohmian mechanics, except that the continuous nonlocal potential is replaced by instantaneous discretized loading and saving operations. One reason for saying this is that you don’t have any ambiguity over the choice of “basis” like you do in quantum theory. The “external picture” in quantum mechanics is just a “wave-vector of the universe” for which no preferred bases or probabilities exist a priori. This is a big conceptual issue with many-worlds because you require an explanation in terms of the external picture (like decoherence) of why the worlds split in the “pointer basis” rather than any other. In your model there is a fixed dead/alive “basis” in which the splitting always occurs, which is a primitive component of both the internal and external pictures, and which is put in by hand. This is analogous to the way that the position variables are put in by hand as part of the external picture in Bohmian mechanics.
27 February, 2007 at 9:32 am
Anonymous
The size of the largest cap sets for F_3^n for n=1,2,3 and 4 are 2,4,9 and 20 right?
Just checking if my second year math project (on the mathematics of Set) is actually related to this.
27 February, 2007 at 9:58 am
Terence Tao
A couple final remarks, and then I will have to go back to my day job :-)
It is possible to tweak the Bell inequality example I gave above to give a GHSZ type result, but only in an asymptotic limit. I doubt the model can give a non-asymptotic GHSZ result without some major rule changes, basically because the first run of any puzzle will be completely classical, even if subsequent runs are not.
Previously, I was implicitly assuming that once both Lara and Indiana make it through the game, the player stops restoring the game. But suppose now that the game is restored a large number of times regardless of whether Lara and Indiana succeed or not. Assume also that corpses stick around for a long time. Then we still have p_11 = 1 and p_22 = 0 as before, but in the 12 or 21 situations what will happen is that after a small number of game iterations, corpses will appear in both gates, and after that both Lara and Indiana can safely navigate both gates every time by using the negation trick. So asymptotically p_12 = p_21 = 1 as well. In other words, the four statements
Lara survives setting 1 iff Indiana survives setting 1
Lara survives setting 2 iff Indiana does not survive setting 2
Lara survives setting 1 iff Indiana survives setting 2
Lara survives setting 2 iff Indiana survives setting 1
all are true with asymptotic probability 1, even though the four statements
are inconsistent in a classical local hidden variable theory.
Of course, there are many fundamental ways in which this type of Tomb Raider model differs heavily from quantum mechanics. QM has a Hilbert space, a complex structure, and an evolution operator which is linear, time-reversible, and unitary. Tomb Raider has none of these, at least without massive rule changes. (For instance, Tomb Raider trivially fails to obey the no-cloning theorem.) In particular, it is the nonlinear, time irreversible nature of the game mechanic which is allowing me to construct the above examples; QM is far more rigid in this regard.
It’s also true that this Tomb Raider model does not fully capture the unintuitive nature of the many-worlds interpretation due to the presence of a canonical basis for the universal wave function. I have some thoughts as to how to capture a lack of basis that by turning the game into a massively multiplayer online game, in which each player’s version of Lara will share items and experiences with other simultaneous players, but I haven’t seriously thought it through, and as I said in the main post I think there is a serious limit as to how far this analogy can and should be pushed anyway. :-)
27 February, 2007 at 11:02 am
John Sidles
If memory serves, Stanislaw Lem’s short story collection The Cyberiad deals with many of these same issues. I will see if I still have my copy of it.
27 February, 2007 at 11:28 am
Bob Hawkins
“… she usually encounters a number of corpses which look uncannily like herself…”
Exactly this happens in the 1960 novel _Rogue Moon_ by Algis Budrys ( http://www.amazon.com/s/ref=nb_ss_gw/103-0139294-7387843?url=search-alias%3Dstripbooks&field-keywords=Rogue+Moon ).
In the novel, the process you describe mirrors the creative destruction of science — the “correct” theory is the one that isn’t killed by contrary fact. It necessarily mirrors QM, since QM is also creative destruction — the observed trajectory is the one that is not destructively interfered with by the integral over any other path.
27 February, 2007 at 12:18 pm
Bee
Indeed, a very nice post! I used to play Tombraider a lot :-) Anyway, I just meant to remark that your explanation of branching into various histories only works if the ‘internal’ world is really completely decoupled from the ‘external’ (deterministic) world. Which it can’t be because if it were there was no branching in the first place. Whether Lara’s consciousness would be able to keep up with it or not, in the presence of a player saving her that is not decoupled from the ‘internal’ world, it would be possible for her to figure out a quite weird timeline where she’s killed, reanimated, goes back, tries again, and again, and again. The reason being that the information saved on disk lacks information and can’t be used to fully determine her future. So. I guess I’d agree with Matt that it’s more a hidden variables scenario.
27 February, 2007 at 12:51 pm
Tomb Raider Quantum Mechanics « The truth makes me fret.
[...] Raider Quantum Mechanics Terry Tao explains QM with a Tomb Raider [...]
27 February, 2007 at 2:28 pm
Mason Porter
Mathematicians should never give up video games. :) Among other things, they can provide good fodder for allusions in Notices articles. :)
27 February, 2007 at 4:20 pm
Kea
So I guess this comes under the heading of related material, which begs the question of exactly how it is related to your research.
28 February, 2007 at 3:59 am
Andrew Thomas
A interesting article, but it didn’t really ring true for me. I felt you were trying to shoe-horn a Many Worlds interpretation into a video game analogy but the analogy just wasn’t there (corpses don’t “pile-up” each time you pass through the game, and video game physics are classical, not non-classical as you suggest).
I don’t subscripe to Many Worlds at all. How do you feel about decoherence? I think your comparison to Schrodinger’s cat and the whole Many Worlds thing doesn’t reflect the latest ideas of decoherence at the microscopic level:
http://physicsweb.org/articles/world/13/8/3
Your statement: “Nevertheless, by studying the statistical behaviour of large numbers of microscopic objects we can indirectly infer their quantum nature via experiment and theoretical reasoning.” Er, that’s news to me! You can see intereference effects in – say – the double slit experiment but I can’t think of any other macroscopic experiment where you can view quantum effects. Surely that’s the whole point of the classical model? I think you should elaborate on this.
There’s plenty of other work in this area. I run my own site on spacetime and quantum mechanics in a computer-simulated world:
http://www.ipod.org.uk/reality/reality_big_brother.asp
Also see the Simulism Wiki:
http://www.simulism.org/Simulism_Home
28 February, 2007 at 4:01 am
Andrew Thomas
Just to add I think I’d agree more with Matt Leifer’s comment above that the video game simulation more closely resembles “a nonlocal hidden variable theory in the vein of Bohmian mechanics”. The simulated beings do not get to see the hidden variables, which would be represented by the inner workings of the computer.
28 February, 2007 at 7:09 am
Captain Slack
You learn something new every day. Until I read this article (here via The Weasel King), I totally thought Benjamin Hutchins invented Natla for Neon Exodus Evangelion.
28 February, 2007 at 7:56 am
Terry Tao is now blogging « Perfectly Reasonable Deviations
[...] reasons why Terry started blogging in wordpress is that wordpress now allows embedding (more info here). Welcome to the blogosphere [...]
28 February, 2007 at 9:54 am
Terence Tao
Regarding the various interpretations possible for this Tomb Raider analogy (ranging from a completely classical interpretation in the external universe, to various non-classical or non-local interpretations within the game universe), it occurred to me that perhaps the QM interpretation that this analogy is closest to is the sum-over-histories model of Feynman, except that the sum operation is replaced by a non-commutative (and also nonlinear) one: past games affect the future ones via the corpse mechanic, but not vice versa. The non-commutativity and nonlinearity serve in some sense to compensate for the lack of a complex phase in this model, so that one can generate non-classical effects which are similar (though not quite identical) to those generated by quantum mechanics.
Of course, from a purely empirical point of view there is no distinction between the various equivalent interpretations so long as they predict the outcome of experimental data (which in our model would be the survival probabilities of various Tomb Raider levels) correctly.
28 February, 2007 at 12:59 pm
Yan Feng » Quantum mechanics and Tomb Raider
[...] just started a blog at wordpress.com. In a recent post, he drawed an analogy between computer game Tomb Raider and Quantum mechanics. Very [...]
28 February, 2007 at 2:53 pm
Pseudo-Polymath » Blog Archive » A Blog of Note
[...] on a possible philosophical way to interpret quantum machanics classically take a gander at this post (read the comments [...]
28 February, 2007 at 3:05 pm
Digital Meadows » Blog Archive » Meccanica quantistica e Tomb Raider
[...] Cosa si ottiene mescolando meccanica quantistica e Lara Croft? Un post (invero un po’ lungo) che cerca di spiegare in maniera comprensibile alcune teorie fisiche particolarmente oscure. Fatta la tara alle inevitabili imprecisioni e semplificazioni (di cui io, visto che non so niente di quantistica, sono comunque totalmente inconsapevole :D) resta un post estremamente stimolante: Quantum mechanics and Tomb Raider. [...]
28 February, 2007 at 3:26 pm
its about time» Blog Archive » links for 2007-02-28
[...] Quantum mechanics and Tomb Raider A lucid expanation, I love it:) (tags: quantum tombraider fun clear science) [...]
1 March, 2007 at 8:55 am
petenello
Imagine a superposition of two quantum events. Event A: a slow moving quantum particle hits a bomb at time t1 (and Lara dies at time t1). Orthogonal event B: the slow moving quantum particle does not hit the bomb at time t1 (and Lara does not die at time t1). In general we’ll have a superposition of both situations. The interesting point is that, at time t1 +
epsylon, Lara knows that the other Lara already died. She does not experience any effects from the other state at all, but she can go there,
to find ‘her’ corpse. You do not need to ’save’ anything, necessarily.
2 March, 2007 at 3:31 pm
ben green
Terry,
A very interesting post. Following some work I did with Tom Sanders, I’d like to suggest a possible splitting of non-commutative Freiman into two parts. The first is called “weak Freiman” and I can state it explicitly; the second is “strong Freiman” and I do not have any thoughts to add to those you have already put down in your post.
Weak Freiman is the statement that if A has small tripling then some large subset A’ of A is contained in a “Bourgain system”. A Bourgain system is a certain notion of what it means to be an approximate group. The idea comes from Bourgain’s 1998 paper on Roth’s theorem, hence the name.
In my paper with Sanders we only defined Bourgain systems in the abelian case, but I’d guess it extends to the non-abelian case as follows.
A Bourgain system in some group G is a collection (X_t)_{t = 0}^{10} of sets such that
(i) If g is in X_t then so is g^{-1}
(ii) X_t X_u and X_u X_t are both subsets of X_{t + u} when t,u > c_1(K)|A| such that A’ is contained inside a Bourgain system with |X_1| U_d(C), where d is not too big. Then the pull-back under pi of any ball about the identity matrix will be a Bourgain system of dimension about d (kind of a non-abelian Bohr set, by the way). What can one say about the structure of such sets? One would seem to need some kind of non-abelian geometry of numbers.
Anyway there are some great questions in this area waiting to be worked on…..
Ben
2 March, 2007 at 3:34 pm
ben green
Terry, half my post got deleted. Any idea what happened?
Ben
3 March, 2007 at 2:50 am
Anonymous
Ben,
Most blogs limit the size of a “fast reply” to a few hundred characters. If you plan to post a long reply, it’s best to type it in notepad or some other text editor, and copy and paste it into the reply window. Then if it is truncated, simply copy and paste the rest in another reply.
(Another advantage of not typing long shpiels directly in the reply box is that you don’t risk losing the lot if you accidently hit the “back” button – Firefox is a bit better at saving input in forms, and I think the latest IE has improved, but it used to be maddening when this happened!)
Terence,
Have you ever considered the ABC Conjecture, or do you know if any progress has been made on it? Seems like a glass cliff face to me, but then I’d have said the same about primes in arbitrarily long arithmetic progressions and many of the other problems you have tackled.
Cheers
John R Ramsden
3 March, 2007 at 2:55 am
John R Ramsden
P.S. Oops – I should have addressed the ABC conjecture question to Terence _and_ Ben (and for that matter to anyone else who might be qualified to speculate).
3 March, 2007 at 7:58 am
Terence Tao
Dear John,
Thanks for clearing up the truncation issue. I poked around but couldn’t find a way to modify the threshold for these “fast replies”. It seems that if one has a wordpress account then it is easier to make and then edit fancy comments, though.
Regarding the ABC conjecture, I tend to distinguish between two types of number theory problems: “many-solutions” problems and “no-solutions” problems. “many-solutions” problems ask for one or more solutions to some system of equations in integers or primes. Examples include the Goldbach or Waring problems, the twin prime conjecture, or finding long progressions in primes. “no-solutions” problems, on the other hand, ask that the number of solutions to a certain system in a certain range is in fact zero (or bounded independently of various parameters). Fermat’s last theorem is the most famous example of the latter; the ABC conjecture can also be rephrased in this form. (Specifically, if N_1, N_2, N_3 are large integers and A_i is the set of numbers of size at least (N_1 N_2 N_3)^{1+epsilon} with radical at most N_i, there are no solutions to a_1 + a_2 = a_3 with a_i in A_i.) Other examples of “no-solutions” problems include the finitude of Fermat primes and the (now solved) Catalan conjecture.
With “many-solutions” problems, one can try to solve the problem by getting a lower bound on the number of solutions; as long as the lower bound is non-trivial, one solves the problem. This suggests analytical methods, as one can tolerate error terms as long as they are dominated by the main term in the lower bound. But such techniques are much less likely to work for “no-solutions” problems; one has to get an upper bound which is less than 1, but all upper bounds are clearly at least zero, so one is aiming for an incredibly miniscule target and can basically afford no losses in error terms whatsoever. I believe that the best hope for such problems instead lies in methods from algebraic number theory.
3 March, 2007 at 8:37 am
Terence Tao
Thinking about it a little more, I guess one of the indicators of difficulty of a number-theoretic problem is not so much whether it is asking for many solutions or for none, but rather just how large of a “conspiracy” amongst the natural numbers would be needed to disprove the conjecture.
For instance, take FLT. It would only take a single solution to a^n + b^n = c^n for some n > 2 to destroy the conjecture; this looks like a very small conspiracy, difficult to detect by most methods, and so the conjecture should be very difficult. But it turns out that such a conspirator must inevitably involve a very strange elliptic curve, which in turn drags in many other mathematical objects into the conspiracy, which makes it more feasible to detect and then destroy such a conspiracy.
On the other hand, a failure of the twin prime conjecture (say) would require an infinite number of conspiracies: past a certain point, every prime p is somehow forbidding p-2 and p+2 to be prime. We can’t yet say that all these conspiracies don’t happen at once, but it is plausible that one could set up some sort of computable statistic involving the primes which would detect and then exclude this scenario.
With arithmetic progressions, the conspiracy has to be even more “global”: if there are no prime progressions of length k, then whenever p, p+r, …, p+(k-2)r are large primes, then p+(k-1)r is mysteriously forced to be composite. Since p and r can be independently quite large, this gives a lot of structural information on the primality of p+(k-1)r which can hopefully be detected or exploited. (Indeed we sort of exploit this kind of information in our proof, via the machinery of “dual functions”. Rather amusingly, we don’t directly exclude the “conspiracy” of exotic structure in the primes, but instead we “co-opt” any such conspiracy, placing it into a homogenised model f_{U^\perp} of the primes which can then be forced (via Szemeredi’s theorem) to cough up progressions. In later papers we exclude these conspiracies directly.)
3 March, 2007 at 8:42 am
Terence Tao
Ah, I got caught by the truncation issue also :-) . Actually I found the source of the problem: the less-than sign is interpreted as an HTML tag and so should be avoided in comments. I’m guessing that the portion of Ben’s post which was removed was the portion between a \lt sign and a \gt sign.
Returning back to the ABC conjecture, in order for this conjecture to be false one needs a sparse sequence of counterexamples, namely a sequence of solutions a+b=c with rad(abc) \leq c^{1+o(1)}. This sequence can be arbitrarily sparse and thus does not generate the type of aggregate bias which could be detectable by an analytical statistic (though this method might conceivably give a non-trivial upper bound to the number of ABC counterexamples). Instead, one would probably have to algebraically transform such a counterexample to another setting (possibly one from transcendence theory?) in which any conspiracy becomes large enough to be detectable. (cf. the “dispersion method” of Linnik.) Not that I have any idea how to achieve this…
3 March, 2007 at 10:31 am
ben green
Dear Terry,
Here is an attempt to revisit my post of yesterday without using less than/greater than symbols,
which HTML doesn’t seem to cope with.
Your post was very interesting. Following some work I did with Tom Sanders, I’d like to suggest a possible splitting of non-commutative Freiman into two parts. The first is called “weak Freiman” and I can state it explicitly; the second is “strong Freiman” and I do not have any thoughts to add to those you have already put down in your post.
Weak Freiman is the statement that if A has small tripling then some large subset A’ of A is contained in a “Bourgain system”. A Bourgain system is a certain notion of what it means to be an approximate group. The idea comes from Bourgain’s 1998 paper on Roth’s theorem, hence the name.
In my paper with Sanders we only defined Bourgain systems in the abelian case, but I’d guess it extends to the non-abelian case as follows.
A Bourgain system in some group G is a collection (X_t)_{t = 0}^{10} of sets such that
(i) If g is in X_t then so is g^{-1};
(ii) X_t X_u and X_u X_t are both subsets of X_{t + u} when t, u are at most 5;
(iii) 0 is in X_t for all t;
(iv) |X_{2t}| is no more than C|X_t| for some C and for all t at most 5.
log C may be interpreted as a kind of “dimension” of the Bourgain system.
What is the point of this definition? There are many places inside a Bourgain system where one can look and see
objects which are almost closed under addition. For example if t and u are “typical” values and if u is much less than
t then X_t + X_u will be almost the same as X_t. These properties are enough to enable Tom and I (in the abelian case)
to do a kind of “approximate harmonic analysis” on a Bourgain system. Actually all the ideas for doing that came from
Bourgain’s work and some unpublished notes of yours. For some applications (such as the one Tom and I had) this is all
one needs, and any more precise algebraic structural information about the Bourgain system is unnecessary. For example,
though the theorem you call the “Freiman-Green-Ruzsa Theorem” lets you relate an abelian Bourgain system to a coset
progression, we did not need this fact in our application.
Thus a Bourgain system (in the abelian case) qualifies as an approximate group in a quite strong sense.
I’m guessing that in the non-abelian case things are pretty similar, and one can do a kind of approximate representation
theory, at least of low-dimensional representations – I haven’t worked out the details yet.
So here’s my guess at a “weak Freiman”:
Suppose that |A+A| is at most K|A|. Then there is some subset A’ of A with size at least c_1(K)|A| such that
A’ is contained inside a Bourgain system of dimension c_2(K) with |X_1| having size at most c_3(K)|A|.
How might one prove such a weak Freiman theorem? Unfortunately one of the key steps in the proof of the Freiman-Green-Ruzsa theorem
is simply false for non-abelian groups. That is the existence of “good models”. Thus one can have a set A with *very*
small tripling (say about 3) with no large subset which is Freiman-isomorphic to a dense subset of a group.
This means that you can’t repeat our trick of transferring to a setting where the tools of (non-abelian) harmonic analysis
can be applied. I think there may be some hope of generalising a recent “energy-increment” argument of you and I; Tom
Sanders and I are going to think about this.
I can’t add much more to what you wrote concerning the correct statement of a “strong Freiman theorem”.
There is one example which I think ought to be instructive. Let G be a finite abelian group and suppose that pi from G to
U_d(C) is a unitary representation. Then the pull-back under pi of any ball about the identity matrix is
likely (assuming some amount of equidistribution of pi(G) in U_d(C)) to be a Bourgain system of dimension about d.
What can one say about the algebraic structure of such “unitary Bohr sets”? Is there a kind of “non-abelian geometry
of numbers” which allows one to find structure inside these sets? I think that the study of some
examples with small d (which I haven’t done yet) could be instructive here.
Anyway there are some great questions in this area waiting to be worked on…..
Ben
4 March, 2007 at 11:59 pm
zerocold
Ben please could you post some of these great questions in this area?
zerocold
5 March, 2007 at 6:53 am
akshay
Lindenstrauss has suggested that a non-commutative Freiman theorem should be connected to quantifications of Gromov’s theorem: a group of polynomial growth is virtually nilpotent. As far as I know, no quantitative form of that is known. Anyway, that may suggest candidates for ???
5 March, 2007 at 8:30 am
Terence Tao
Thanks Akshay!
It does seem that virtual nilpotency is perhaps the key (the above-mentioned paper of Chang also makes this type of connection). Incidentally I realised that one can use a nilpotent group to concoct a set of small tripling for which my “standard candidate” doesn’t come close to approximating – take the Heisenberg group of upper-triangular matrices
and consider the set A in which
.
Perhaps the revised candidate for ??? would be the inverse image (under the projection N(H) -> N(H)/H by a finite group H) of a set B of small tripling contained inside a nilpotent group.
6 March, 2007 at 3:47 pm
Matthew Leifer
I don’t agree about the Feynman path integral “interpretation” because it is not really an interpretation at all, just an incredibly useful calculational technique that physicists often read far too much into. To qualify as an interpretation, you have to give me a concrete statement about what things exist in reality, how measurement outcomes are accounted for, etc. Feynman doesn’t do this, and is notably adamant that his formalism shouldn’t be thought of in this way.
On the other hand, your tomb raider analogy does include Laras who are in some definite state in some definite tomb. Quantum mechanics does not include such things at the operational level, i.e. it gives no account of what is going on in reality in between measurements. In tomb raider, this would be like refusing to make any statements about whether or not any Lara exists in any given tomb until the scientist has performed her experiment. So long as you make the same experimental predictions for the scientist, this is the minimal account of the tomb raider scenario that you could give, and would be compatible with a variety of different interpretations. Since you do have definite Laras the situation is not like an operational view of quantum theory and it must be more akin to picking one interpretation or another. I still think de Broglie-Bohm is closer, but the difference of opinion may be more due to ambiguities in how to map the tomb raider analogy to quantum theory than anything deeper.
7 March, 2007 at 11:42 am
anonymous
You probably mean a,c = O(N) and c = O(N^2)
7 March, 2007 at 11:43 am
anonymous
You probably mean a,c = O(N) and b = O(N^2)
8 March, 2007 at 10:20 am
Andy D
Terence, it is really valuable to have informal expository writing from an experienced mathematician, interspersing accurate descriptions of problems with your perspectives on why they are difficult and where various techniques are likely to apply (e.g., what you said about the strengths of analytic vs. algebraic number theory).
So thanks! Also, I think your readers will not be at all disappointed if you blog about math that is not new or open but that deserves to be better known –or for its Tomb Raider interpretation to be better known :) –or affords a good opportunity for learning/problem solving.
8 March, 2007 at 11:45 am
sirix
I totally agree with Andy D. So, big thank You from me as well!
8 March, 2007 at 1:29 pm
D. Eppstein
I found some references indicating that the conjecture was known to be true for d=2, but not for d=3. Is it still unknown in that case? Can it be proven that the local minimizing shapes for this conjecture are polyhedra, and if so are there any known constraints on the extremal polyhedra (e.g. they should have edges tangent to a sphere)? I am wondering whether enough constraints can be made on the form of the solution to reduce the problem to calculation over a finite number of cases in bounded dimensions; e.g., only over polyhedra with a bounded number of facets.
8 March, 2007 at 2:12 pm
Terence Tao
As far as I know, the d=3 case is open. It does seem intuitively plausible that local minimisers of Mahler volume should have lots of “flat” sides, and thus presumably be a polyhedron, and it may well be that any rigorous proof of that fact may even come with an upper bound on the complexity of that polyhedron. But I am not aware of any actual results in this direction. Note that as Mahler volume is an affine invariant, it is unlikely that the Euclidean sphere (or any other fixed shape) is involved unless one first normalises the body (e.g. to make the John ellipsoid Euclidean).
8 March, 2007 at 3:01 pm
Bo'az Klartag
Suppose we remove the assumption that B is symmetric about the origin.
It seems reasonable to guess that the minimal mahler volume is now attained for a simplex whose barycenter is at the origin. The mahler volume of the simplex is roughly (e/4)^d times that of the cube.
This non-symmetric Mahler conjecture might be a little bit easier, since the set of conjectured minimizers is more approachable: I think it should be only simplices. Is there a process that a) preserves simplices whose barycenter lies at the origin b) does not increase the volume product and c) converges to a simplex?
Also, Santalo’s inequality holds for any centrally-symmetric set B, not necessarily convex (taking the convex hull just increases the volume and doesn’t change the polar), while for the Mahler problem, you actually need to use the convexity of B somehow. This might make the problem more difficult.
Finally, about Bourgain-Milman’s proof: I am not completely sure whether the type and cotype arguments are really the main ingredient there. At least the proof that that is given in Pisier’s book doesn’t have much relation to type and cotype.
8 March, 2007 at 6:15 pm
Terence Tao
Thanks, Bo’az, for the comments!
It does indeed look like the non-symmetric problem is going to be an easier one (and in fact embodies the idea that Mahler volume is minimised for “pointy” bodies even better than the symmetric problem), as it does look a little easier to make a process that converges to a simplex than to a cube/octahedron. Here’s one crazy idea: it suffices by limiting arguments to prove the claim for polyhedra. But every polyhedron is the projection of a simplex – perhaps a lifting argument (suitably normalising the Mahler volume as the dimension increases) might work? One may need to work with log-concave functions instead of polyhedra to get better behaviour with respect to projections, though one then has to figure out what the analogue of polar body becomes. (Young dual?)
In Kuperberg’s papers, one of the key ideas is to try to work with the product
as a unified object sitting in “phase space”
, and specifically inside the set
. It may be that one may have to run the process at this level rather than purely in “physical space”. On the other hand, the above set is not convex, which creates a certain amount of difficulty.
I looked through Pisier’s book briefly – it seems he deduces the reverse Santalo inequality from Milman’s quotient-subspace theorem, which appears to be a slightly different route than the original Bourgain-Milman approach (but I don’t have the original papers handy right now to check).
8 March, 2007 at 11:06 pm
Gil Kalai
Dear all, This is a very beautiful problem and it is great that Greg Kuperberg had this recent breakthrough putting the bound at
. Maybe, in view of Greg’s result hoping for a gap of
is realistic.
The known extremal cases are called Hanner polytopes, and as explained in the post they are obtained from intervals symmetric around 0 by the operations: 1) Cartesian product, and 2) moving from P to its polar.
Hanner Polytopes are known or conjectured to be the only extreme examples to several other problems.
1) Attaining the maximum Banach-Mazur distance to the unit ball. (I am not sure about the precise status of this problem.)
This suggests to try to find the maximum of
when there is an upper bound on the Banach-Mazur distance between and a unit ball.
2) The number of non empty faces (of all dimension; including P itself) of every Hanner n-dimensional polytopes is
. This is conjectured (but embarrassingly not known) to be the minimum for centrally-symmetric n-dimensional polytopes.
3) Every Hanner polytopes has the property that if two of three translates have a point in common then all three have a point in common. (This was proved by Hanner himself.) A difficult result by Lima (I think, maybe with Hansen) shows that this Helly-type property characterizes Hanner polytopes.
(There is also a more general notion of weak Hanner polytopes arising in graph theory (Fulkerson, Chvatal, Hansen).)
9 March, 2007 at 12:45 am
Gil Kalai
Indeed this is a great problem. The Roth-Meshulam argument says something like:
If the density of a set A is t then there is a co-dimension one subspace where the density is f(t).
To get f(t) you have to balance between two scenarios: if there is a large Fourier coefficient (for the characteristic function of A) then this gives you a subspace with higher density right a way. If there is not, a little computation based on Parseval’s theorem tells you that you will get the desired improvement “on average”.
And now you iterate and get the theorem. Because if the density is larger than 2/3 you certainly have a line there. (What is beautiful about Meshulam’s argument is that various difficulties in the method in other cases vanish and we get “right to the point”.)
In this problem, as in other combinatorial problems, gradually increasing the density (by moving to a co-dimension one subspace) is the only known method to get anything non-trivial. It looks that we already reached the limit of what this method gives. As far as I know there is no known way to jump directly to low-dimensional subspaces and get better density improvements.
9 March, 2007 at 4:44 am
Gil
Correction: I oversimplified the argument a bit: Either you get close to
the “right number of lines” (calculated for the case that your set is random?) OR you have a subspace with larger density f(t).
9 March, 2007 at 11:25 am
Danny Calegari
Is it really inconceivable that some flow might find exactly the desired minima?
It’s true that they come in interesting, complicated families, but aren’t
these families all symmetric spaces for the natural orthogonal group
action on the “space of convex bodies”? Couldn’t the flow be O-equivariantly
Morse-Smale, or whatever it’s called? Of course, there would (presumably)
be other critical points, i.e. products of cubes (or polar bodies thereof) with
spheres but presumably enough is known about the equivariant cohomology
of spaces with orthogonal group actions, and about the homotopy type of
the “space of convex bodies” to see if this makes sense. Just a question at
the “village idiot” level – what does grad M look like on the space of convex
bodies (thought of as a subspace of the space of smooth functions on RP^n)?
9 March, 2007 at 11:31 am
Bo'az Klartag
Yes, you’re right, maybe the problem is a little bit simpler with log-concave functions. Write
for the Legendre transform of
.

for which the barycenter of the density
is at the origin. The conjectured Mahler inequality is equivalent to

. The function
for which
when all
, and that equals infinity elsewhere, should be a minimizer. That’s our “simplex”.
Then Santalo’s inequality is equivalent to
for any convex function
for any convex function
A nice thing with this log-concave formulation, is that the polar of the product is just the product of the polars. With convex bodies it’s a more messy. Also, denote by
the minimal Mahler volume in dimension
, for log-concave functions. Then we see that
. So, the desired conjecture would follow if we could show that
. Does it somehow help for the “lifting”?
If we work in the phase space, the Mahler inequality we want looks a little bit like some exact uncertainty principle. So, maybe the Fourier analysis approach has still something to contribute? The Legendre transform is some degenaration of the the Laplace transform, in some vague sense. I don’t know.
9 March, 2007 at 11:34 am
Bo'az Klartag
My formulae have already failed on the html level, let alone mathematically.
Is there a way to test the way a formula would appear, before posting a comment?
9 March, 2007 at 11:52 am
Greg Kuperberg
Hi folks. It is very, very nice to have this discussion about my recent paper.
As Terry alludes, the first important idea is to consider another body, the “diamond body”
. I do not know how widely important this body is in convex geometry in general, but I it seems like a basic structure in convex geometry. If it is useful for the Bourgain-Milman theorem, then it may well be useful for some other problems.
I came up with
, and the conjecture that its volume is minimized when
is an ellipsoid, when I was a senior at Harvard. My dad suggested the name “the bottleneck conjecture”. David Kazhdan assigned me the Bourgain-Milman theorem for my senior thesis, and eventually I mentioned this idea to Joseph Bernstein, who was my reader. Bernstein estimated that my program had a 1 in 10 chance of working, but that these were actually very respectable odds for such an optimistic idea. Later I wrote two papers with partial results that were inspired by the conjecture that I really wanted to solve, but obviously fell short. I also found some “warning signs” in d=3 that suggested that my conjecture might not be true. I felt very discouraged and basically shelved the question.
In 2006, I figured that since differential geometry is in such great shape because of Perelman, I might as well give the “bottleneck” conjecture another try. It made all the difference that I had learned about linking integrals both from working on Kontsevich’s configuration space integrals, and from a paper of Gluck and DeTurck on the Biot-Savart law in
. I had actually thought of using a homological method for the conjecture before — from examples such as Gromov and Thurston norms. But I had thought that it wouldn’t work because there are no closed differential forms available for the argument. But, as the Gluck-DeTurck example shows, Gauss linking forms may be weakly closed double forms rather than closed forms.
I spent the most time thinking about parabolic flows, as Terry says, to prove the bottleneck conjecture. Conceivably these could still work, but the structure of the linking form that appears in my paper is some evidence against. On the other hand, one of the two forms of the (generalized) bottleneck conjecture is still open, and who knows what methods it needs. I think that it is an interesting question in indefinite geometry.
Sorry to go on with all of the egotistical details of this question. As it happens, I was almost exactly twice as old when I proved the bottleneck conjecture as when I proposed it, 19 1/2 versus 39. I also did a key last step when I was in the hospital after surgery on my ankle, unable to sleep. I have no idea how I managed to accomplish anything in those circumstances. (“I thank Sutter-Davis hospital for its hospitality.”) During the 3 months of recovery (I am fully recovered now), I figured maybe I can prove another good theorem, but no such thing happened. Anyway it really has been a non-trivial personal journey.
9 March, 2007 at 11:56 am
Terence Tao
Dear Bo’az: I manually repaired the LaTeX. In order to get the ability to preview a post or comment it seems you need to get a wordpress account (which, incidentally, is trivial), but in the meantime the manual approach seems to work. The Legendre formulation looks quite promising, as does the trick of working in the asymptotic regime
. At that level there may be an asymptotically equivalent formulation which is more tractable (e.g. involving entropy – but now I’m just throwing out buzzwords randomly). This sort of “tensor product trick” has proven effective in a number of other contexts.
The Mahler conjecture is indeed reminiscent of an uncertainty principle. I had discarded the Fourier approach in part because it seemed highly unlikely that any Fourier inequality would be optimised for, say, the indicator function of a cube or simplex (it’s more typical to see Gaussians or something as the extremisers). But now that we have an asymptotic formulation, which gives us a bit more wiggle room, this objection might start to disappear.
Dear Danny: Thanks for the comments! One minor quibble: the relevant structure group here is
rather than
, as Mahler volume is an affine invariant and not just a Euclidean invariant. There may indeed be an affinely equivariant Morse function which has the moduli space of Hanner polytopes as the minimal points, but it would be a rather strange function, in particular it is likely to contain some other exotic critical points. If one could actually write down a function which looked critical at these polytopes then one is probably already more than halfway done to solve the problem.
Incidentally, I think your “villages” are more mathematically literate than most :-) . Computing grad M in terms of the graphing function of the body is likely to be a bit messy because the polar body is given in this non-local manner. Using the log-concave formulation may be a bit more pleasant; it may be cleverer still to somehow “relax” the problem from a product of a body and its polar, to a more general class of objects (or functions) in phase space, in which it is easier to move around in a purely “local” fashion (both in physical space and in the dual space). It may be feasible at least to show using this method that a smooth, strictly convex body cannot be a local minimum.
Dear Gil: thanks for the post! The fact that there are indeed other quantities which are known or conjectured to be extremised at these polytopes does give one some hope. At the bare minimum it should allow one to “factor” the Mahler conjecture into two presumably simpler conjectures…
9 March, 2007 at 12:19 pm
Greg Kuperberg
Also I should mention that I got a lot of very good negative advice in Madrid from differential geometers about just how non-trivial the moduli space of metrics and submanifolds are. Maybe they didn’t think of their advice as negative, but they had a lot of non-trivial things to say and I took it as warnings against naivete. There may be a PDE flow to solve the remaining open version of the bottleneck conjecture. But it is clear that optimizing flows don’t just “grow on trees”. Besides just parabolic flows, I was considering geodesics in moduli spaces of submanifolds of a manifold. The problem is that the formal Riemannian metric structure that you get on submanifolds of a Riemannian manifold is pathological in some ways.
Since there is no preview button here, I ended up with a TeX-o in my previous message, \otimes should be \times. [Fixed now - T.]
Terry points out that the set in phase space between the positive and negative unit hyperboloids is not convex, and he astutely points out that this makes the analysis harder. This is true, but actually I do not know whether
is ever non-convex when
is a convex body. By definition, the boundary of
is the join (with straight line segments) of two necks
and
of the hyperboloids. I am not completely sure, but I sort-of see that the hyperboloids are contact manifolds, and that the necks
are Legendrian submanifolds. If these necks are interesting objects of study, as I think they are, then the fact that they are Legendrian would give you a lot of geometry to work with. This is in addition to what I prove in the paper, that
are spacelike and timelike submanifolds of the hyperboloids
as well as topological cores.
By the way, Terry uses my somewhat unlucky old notation for the Mahler volume,
. This is almost the same notation as a different constant defined by Milman. In my new paper I write
for the Mahler volume. The formula that Terry gives is in my paper, but it is very slightly off from the real bound:
Gil points out that the Hanner polytopes have the same Mahler volume as the cube. Hanner polytopes also seem special in my paper, because they have the property that
. I know of no other convex bodies with this property. Very optimistically, this is a hint at completely proving the Mahler conjecture. The whole point of my paper is that
approximately grows as
shrinks, despite the fact that the former is a subset of the latter.
9 March, 2007 at 12:34 pm
Greg Kuperberg
By the way, to answer David and Danny, I am optimistic that there is a variational proof of the Mahler conjecture in 3 dimension from a Morse perspective. In fact, I would be tempted to ask David and Danny themselves about the geometry relevant to this approach.
You shoud look at the proof of the Mahler conjecture in 2 dimensions. The Mahler volume of a polygon is concave down as you pivot a pair of opposite sides. If you try to maximize the Mahler volume, the polygon limits to a regular polygon as you repeatedly pivot all pairs of sides. If you try to minimize it, then you end up losing a pair of sides. This shows you that parallelograms have the least Mahler volume among centrally symmetric polygons.
I believe that this proof can work in 3 dimensions. There is a happy theorem that the moduli space of polyhedra in 3D with any fixed combinatorial type is homotopy equivalent to SO(3). (Of course you have to mark the polyehdron to eliminate any symmetries.) This theorem is still true in the centrally symmetric case, and it is (a small) part of the grand scheme of Thurston’s ideas. I conjecture that Mahler volume is “Morse trivial” on any such moduli space, and that with a suitable metric, the downward gradient flow must eliminate an opposite pair of edges of the polyhedron. As Terry says, the right argument would be fully GL(3,R)-invariant.
Unfortunately, the moduli space of a polytope type in higher dimensions can be anything. Or it may not be so unfortunate, if you believe that 3D is a special world that yields special arguments.
9 March, 2007 at 2:18 pm
Terence Tao
Incidentally, the proof of Mahler’s conjecture in 2D seems to first appear in
K. Mahler, Ein Minimalproblem fur konvexe Polygone, Mathematica (Zutphen) B, 118-127 (1939);
it unfortunately predates MathSci (and certainly MathSciNet).
Several people have asked me now for the ability to preview comments. I hear that there is some way to change the wordpress settings to do so, but I was unable to locate one. If there are any wordpress experts out there who could help, I would greatly appreciate a tip :-) .
9 March, 2007 at 3:21 pm
Greg Kuperberg
Scott Aaronson uses WordPress and found a way to have instant preview.
To answer Danny’s question about gradient flows: Unless you choose a Riemannian metric on the space of convex bodies, the Mahler volume does not have a gradient, but rather a differential which is a 1-form. There are all too many choices for this metric; it is non-trivial to come up with one that doesn’t look like trash. As Terry noted, the Mahler business is GL(n,R)-invariant, so it already looks problematic if you make a metric on convex bodies by modelling them as polar functions on a round sphere.
I have looked for metrics of this sort not to minimize Mahler volume, but rather
. Unlike Mahler volume, this volume is minimized for ellipsoids. Nonetheless, I never found a useful downhill flow. The inequality that I did find has an oscillatory factor that makes me wonder whether there is any reasonable flow-based proof of my result.
Well, you can make interesting uphill flows for Mahler volume using infinitesimal Steiner symmetrization. Such a flow is not unique and is at best O(n)-invariant, but I think that it still works. However, I don’t think that anything good happens if you try to reverse such a flow. It should be parabolic in the uphill direction (or parabolic-ish since it is non-local), so that the downhill flow would be pathological.
9 March, 2007 at 5:23 pm
Greg Kuperberg
Also, Dave Bacon uses WordPress and somehow has a preview button.
9 March, 2007 at 6:27 pm
Terence Tao
Dear Bo’az,
I’m increasingly excited about the Fourier-analytic (or Laplace-analytic) approach to the problem, coupled perhaps with the idea of working in the asymptotically high dimensional regime to deal with
losses.
If f is convex and t is large, then the Laplace transform of
is essentially
, ignoring lower order corrections. (The slogan here is that Legendre is the “zero-temperature” limit of Laplace; rather perversely, one can also say “tropical” instead of “zero-temperature”.) If I wave my hands and mumble either “Wick rotation” or “stationary phase”, then there is some sort of similar statement for the Fourier transform instead of the Laplace transform. In which case Bo’az’s formulation of Santalo’s inequality begins to look quite a lot like Beckner’s sharp Hausdorff-Young inequality in the limit as
and
. This also fits with the ball being the extremiser, given that the Gaussian is the extremiser for Beckner’s inequality. It also fits well with the
factor.
Now to get back to Mahler’s conjecture, we would need some reverse Hausdorff-Young inequality. This isn’t available in general, of course, but we have a lot of convexity and positivity to exploit. An obvious thing to do is now try to flow along the temperature parameter t. If one looks at, say
and differentiates in t, one begins to get expressions which look like the entropy uncertainty principle, and it is quite likely that indeed one could get yet another proof of Santalo’s inequality from that principle. So now one needs some sort of reverse entropy uncertainty principle for Laplace transforms of log-concave functions… there is a hope that having differentiated along the temperature, one has “linearised” the problem enough that some new techniques (log Sobolev inequalities??) might come into play…
What is appealing about this whole approach is that everything is set up to be invariant under the three symmetries of the problem: linear transformations, polar duality, and Cartesian products.
9 March, 2007 at 7:44 pm
Greg Kuperberg
Despite having written three papers on the Mahler conjecture, I apparently never thought properly about the asymmetric case. I had thought that it was off from the symmetric case by a factorial factor.
Milman mentioned to me by e-mail that my inequality is false for asymmetric convex bodies. But I didn’t realize that it’s a close call. My lower bound is roughly
times the Mahler volume of a cube. As Boaz says, the Mahler volume of a simplex is roughly
times that of a cube. So it’s a matter of replacing
by
!
I have to then agree that the Hanner polytope story could be taken as a red herring. There are some theorems in convex geometry where people can show that a simplex is the worst case. For example, Keith Ball has shown that a simplex has the worst isoperimetric constant among convex bodies, when each body is placed in its best affine position.
I just scrambled to check what my inequality implies for asymmetric convex bodies. I can’t think of any better idea than to convert from asymmetric to symmetric. Rogers and Shephard proved in 1957 that if
is convex and
is its difference body, then
with equality for simplices only. Using the difference body, we get that the asymmetric Mahler conjecture is true up to a factor of roughly
.
9 March, 2007 at 7:58 pm
Greg Kuperberg
Terry and Boaz are masters at combining approximate estimates, whereas my paper has just a single, elementary approximation followed by a geometric argument. The point of the approximation is what I call “table-turning” in the paper. Given a statistic
of convex bodies which is supposed to be maximized for ellipsoids and minimized for less nice shapes, maybe you can find another statistic
that is minimized for ellipsoids instead, but nonetheless gives you a good lower bound for
. Of course you can also reverse all of the inequalities in this approach.
Here then is another table-turning conjecture. Terry already mentioned the elementary relation
which is important in my paper. In any fixed dimension, then, which
maximizes the average value of
on
? Several people have conjectured that it is the ellipsoid, at least in the symmetic case. So is there some nice flow or symmetrization that proves this? It would imply the well-known isoperimetric constant conjecture, moreover with a good value of the constant.
If the average value is too difficult, what about just the integral? I have the impression that you can prove that the ellipsoid maximizes the integral of
just by standard Steiner symmetrization. This would be a warm-up to the harder problem of maximizing the average value. (Maybe it is only a luke-warm-up. Even so, it would be a start.)
9 March, 2007 at 8:42 pm
Terence Tao
Ugh. My previous post was full of rather bad sign errors, and hence contains a large fraction of rubbish (particularly all the stuff involving Hausdorff-Young, and thence the entropy uncertainty principle). :-( I still feel there should be some sort of “detropicalised” version of the Mahler conjecture which involves the Laplace transform rather than Legendre, but I will have to think much more carefully to try to formulate it properly.
10 March, 2007 at 7:27 am
Terence Tao
OK, I think I weeded out the sign errors. Given an exponentially decaying log-concave function
, define the Laplace transform
by
Suppose we have the Hausdorff-Young like inequality
where
(!),
, and o(1) is a quantity depending only on n and p which goes to zero in the joint limit
.
Note that p is less than 1, so the right-hand side norm is only a quasinorm, but what is weirder, p’ is negative, so the left-hand side norm is really a quasinorm of the reciprocal of the Laplace transform rather than the Laplace transform itself. (It kind of makes sense, since the Laplace transform of a non-negative exponentially decaying function is going to be exponentially growing, in general.) Such an inequality, if true, would have to heavily exploit the log-concavity of F. On the other hand, the inequality is consistent with affine symmetry and Cartesian products, and we shall soon see that it has the Mahler conjecture as its tropical limit, so it is not immediately false.
Anyway, if we have such an inequality, we can apply it to
for a convex f which grows superlinearly at infinity. Then the Laplace transform has the asymptotic
and so a little bit of computation in the asymptotic limit
then gets
and the o(1) term can then be eliminated by taking Cartesian products and passing to the asymptotic limit
(this step can be combined with the preceding asymptotic limit to create a joint limit). The requirement that f grow superlinearly at infinity can then be removed by the usual limiting argument.
I’m not entirely sure that the Hausdorff-Young formulation is better, now that I see the negative norms, but it certainly looks rather different.
p.s. Regarding one of Gil’s remarks, if one can get within an epsilon of the Mahler conjecture, then one automatically gets the conjecture itself by the “tensor power trick”. Indeed if one knew that
for any
, then by replacing B with the Cartesian product
one easily sees that
for any k. Taking
roots and then letting k go to infinity, and then letting
go to zero, we obtain the Mahler conjecture.
10 March, 2007 at 8:32 am
Greg Kuperberg
Terry: Since you are interested in detropicalizations, there is a published detropicalization of the Blaschke-Santalo inequality due to Lutwak and Zhang. Their paper is Blaschke-Santalo inequalities.
On a substantially unrelated topic, if you are just generally a fan of p-norm inequalities, I have a quantum version of the pigeonhole principle which takes the form of a Holder inequality with complementary norms. It is Theorem 4.1 in quant-ph/0203105. The form of the pigeonhole principle that I generalize says that if you write a symbol from an alphabet of size n in a shorthand alphabet of size k < n, then you cannot expect to decode it correctly with probability better than k/n.
10 March, 2007 at 10:23 am
Greg Kuperberg
Also, another non-mathematical comment: The Bourgain-Milman theorem was once mentioned in the New York Times! A Gina Kolata article in 1994 on the Fields Medalists said the following:
10 March, 2007 at 2:10 pm
Greg Kuperberg
Also, an emendation to my previous comment about the conversion from the asymmetric case to the symmetric case: What I said before is not optimized. Instead of the difference body
, you can use the symmetrized body
. The dual norm induced by
is then the average of the norms induced by
and
. So it is easy to see by concavity of the integrand that
Thus, using the Rogers-Shephard inequality, the simplex has the least Mahler volume among asymmetric convex bodies up to a factor of
.
10 March, 2007 at 4:02 pm
Mike
The cartoon “Reboot” used to play on the dual universe concept. The story took the viewpoint of anthropomorphized electronic elements of the computer. These elements had their own lives, but when a game began (descended from the sky in a great black block) the main characters were transformed into game elements and had to defeat the User (who was mythologized) or perish.
11 March, 2007 at 7:32 am
Gil Kalai
How small can Vol(K) (equivalently M(K)) be for a self-dual centrally symmetric convex body? Namely for a body K such that the polar of K equals to -K.
11 March, 2007 at 9:26 am
(Ben Green) The Polynomial Freiman-Ruzsa conjecture « What’s new
[...] an earlier blog Terry discussed Freiman’s theorem. The name of Freiman is attached to a growing body of [...]
11 March, 2007 at 10:20 am
Greg Kuperberg
Gil: Given two symmetric convex bodies
and
, you can define the p-norm Cartesian product
. If
and
are dual, then
is isodual. This construction is the best that I can think for your question. To optimize the construction, you should to let
(equivalently
) be a Hanner polytope.
I do not know how much this construction says about the full Mahler conjecture. However, it does mean that the Bourgain-Milman theorem for isodual symmetric convex bodies is equivalent to the theorem for all symmetric convex bodies. A similar consideration tells you that the Bourgain-Milman theorem implies the bottleneck conjecture for symmetric convex bodies up to an exponential factor. (Of course the converse implication motivated the botttleneck conjecture.) Happily the bottleneck conjecture turns out to be true with no fudge factor.
11 March, 2007 at 5:59 pm
Greg Kuperberg
I thought about this a bit and I do not quite understand the finite field heuristic in combinatorics. For instance, we can also define a line in
as a triple that sums to 0. If so, then
is a terrible approximation to
, because the latter has sets of positive density without such triples.
The lines in
are an example of a Steiner triple system, or otherwise they may be viewed as just a collection of
triples of points, where
. I have a heuristic calculation that suggests that if the collection is unstructured, then you can only expect triple-free sets of size
. This suggests that the structure of the lines in
and the arithmetic progressions in
are unusually favorable for the existence of large triple-free sets. Granted, Ben Green and others have much more experience with these problems than I do. Can someone say why these two collections of triples are considered to be comparably favorable?
Does the empirical data have something to do with it? The maximum cap set cardinality is sequence A090245 in Sloane’s Encyclopedia of Integer Sequences. The known terms are 2,4,9,20,45, and (according to a review article by Davis and Maclagan) the next term is between 112 and 114. Well, this small amount of data does suggest that Meshulam’s bound is not far from the truth. What about subsets of
that do not have triples in arithmetic progression?
11 March, 2007 at 9:35 pm
Terence Tao
Dear Greg,
It does seem on first glance that there should not be much difference between lines
and triples summing to zero
, but the theory for these two patterns very different (though of course they are the same for the characteristic 3 setting of
). The reason is the following: any dense set will necessarily contain a large number of trivial lines
, but there is no corresponding notion of a “trivial” triple summing to zero that all dense sets will contain lots of (except in characteristic 3). Because of this, we do not expect dense sets to necessarily contain triples summing to zero in general, whereas we do expect dense sets to contain lines (or more precisely arithmetic progressions of length three).
There appears to be a deep fact, not well understood at present, that if a “low-complexity object” (such as a set) necessarily contains many “trivial” examples of a “high-complexity object” (such as a line), then it must also contain many non-trivial examples of that high-complexity object as well. It is not always true, of course; it seems to require a certain amount of “symmetry” or “homogeneity” present in the problem. (It also requires the trivial solutions to be arranged in a suitably “transverse” manner, which I won’t define here.) Nevertheless, this deep fact seems to be rather important; it explains, for instance, we can find infinitely many arithmetic progressions
in the primes, or even polynomial progressions
where all the integer polynomials
have zero constant coefficient, but we cannot currently find infinitely many twin primes
.
There is a family of results with names such as the “hypergraph removal lemma” which attempt to make this type of intuition precise, and which have led to a number of deep consequences in property testing. Many recurrence theorems in ergodic theory (e.g. the Poincare recurrence theorem) have this flavour also, since a set can trivially recur to itself if you allow to iterate 0 times rather than a positive number of times.
One well-known place where this deep fact manifests itself is Ramsey theory. One formulation of Ramsey’s theorem is that if there is a symmetric binary relation between sufficiently many people, then one can find (say) 100 people between which the relation is always true or always false. Note that the claim is trivial if you allow the 100 people to be the same. On the other hand, the claim fails if you drop the symmetry assumption.
To address your final comment, I think the number of empirical data points we have is too small to meaningfully extract any convincing conclusion; the above phenomenon seems to be an asymptotic one only.
p.s. Regarding the “finite field heuristic”, one caveat is that the characteristic of the finite field should be large enough to avoid various local obstructions; for instance, Roth’s theorem is rather silly in characteristic 2. It turns out that characteristic 3 seems to enough to model arithmetic progressions of length three properly, whereas to model triples summing to zero one needs characteristic 5 at least.
12 March, 2007 at 12:06 pm
Gil
Marton’s conjecture and the many equivalent formulation by Ruzsa are very beautiful. Let me mention something little that I find interesting: You have these two equivalent formulations (1) and (2) (from Green’s paper). (1) says that we can find inside A a subset A’ of size at least
which is contained in a coset of size at most
. (2) says that you can cover |A| with
such sets A’. The non trivial implication (1) –> (2) is rather easy in view of Ruzsa’s lemma 1.2 (in Green’s paper).
In abstract combinatorial settings you can expect (if things go your way and you can maintain the assumptions after deleting A’) an additional factor of
moving from (1) to (2). Some sort of integrality gap. And getting rid of it is at times impossible, at times open, and at times hard. Here, somehow the additive structure makes (1) and (2) quite easily, almost the same. (In combinatorial geometry you also sometimes see similar phenomena.)
12 March, 2007 at 12:53 pm
Terence Tao
Dear Gil,
I tend to view the Freiman-Ruzsa theory as more of an “approximate group theory” than a purely combinatorial theory, to emphasise the additional algebraic (rather than combinatorial) structure present.
For instance, if H and L are two finite subgroups of an abelian group G, and K is an integer, then it is not hard to show that the following two statements are equivalent: (1) H has a subset of size at least
which is contained in a coset of L, and (2) H is covered by K translates of L. Indeed, the equivalence is essentially the first homomorphism theorem from undergraduate algebra. Ruzsa’s lemma is thus the robust version of this theorem.
I am hoping that eventually we will be able to “robustify” many other results in group theory (or ring theory, etc.) to extend to approximate groups. For instance, as I mentioned in an earlier post, Freiman’s theorem can be viewed as a robust version of the classification of finite abelian groups as the direct product of cyclic groups (or perhaps, a classification of the finitely generated abelian groups as the direct product of cyclic groups and copies of Z). Ultimately we might hope to manipulate “1%-structured objects” (such as sets which are closed under addition 1% of the time) with the almost the same degree of versatility as we currently enjoy while manipulating “100%-structured objects” (such as sets which are closed under addition 100% of the time, i.e. groups). An intermediate stage in this program would be to study “99%-structured objects”, e.g. a finite group with 1% of its elements removed and another 1% of garbage elements added. Here we seem to have a very good theory, due to the use of majority-vote and similar tools to “clean up” all the noise.
12 March, 2007 at 3:46 pm
Greg Kuperberg
On the general subject of robust versions of results in group theory, there is the theorem of Ben-Or, Coppersmith, Luby, and Rubinfeld that an approximate homomorphism between two finite groups is close to a group homomorphism.
13 March, 2007 at 8:08 am
ben green
Greg,
Yes, indeed there is. They definitely beat us additive combinatorialists to this whole game of making exact algebraic structures out of approximate ones. But I recently discovered that essentially the same idea occurs in a paper of Fournier from 1977 called “Sharpness of Young’s inequality for convolution“.
He is interested in the following question: for which groups G is the Young inequality, which states (say) that
, basically sharp? I.e. in which circumstances can you improve it to the statement that
, where c is less than one?
He shows that one can do this provided that G does not have compact open subgroups (note that G does not need to be abelian).
You can sort of see the connection to what we’re talking about here – if there is some f for which
is about as big as
then this means (after expanding out) that the support of f has got to be 99 percent closed under addition. He then goes on to show that the support must be close to a true subgroup.
So I sort of think of Fournier as the father of approximate, or “99 percent” theory, unless someone can point me to an earlier paper :-)
13 March, 2007 at 10:23 am
Greg Kuperberg
I think that there is no question that American children could learn more mathematics on average than they do. Of course the issue is not just a specific set of facts, but thinking mathematically and liking it. What I see in the average childhood experience in this country is a lack of cultural appreciation for mathematics. Proper cultural appreciation is something of a chicken-and-egg problem, because you need a lot of people who both like mathematics and have learned the right kind of mathematics that deserves to be liked. In the absence of such a cultural backdrop, every idea for improving math education can be co-opted and rendered a travesty. It has happened many times. Many entirely positive sentiments — things that I would tell people myself — have been co-opted by all sides: parents, teachers, Democrats, Republicans, unionists, anti-unionists, you name it.
In the ideal cultural backdrop, most children would be surrounded by teachers, parents, and other children who all enjoy trying to prove a theorem. Of course, demanding general improvement on all sides is no solution at all. But I think it would help a great deal if elementary school teachers knew and liked mathematics more themselves. Roger Howe had an interesting book review that discussed just how unprepared many math teachers are in their subject.
You can see the process of co-opting in this review. On the one hand, the idea that teachers need more training has long been coopted. American teachers are actually more trained than many other teachers, but it isn’t training in mathematics. On the other hand, the book that Howe reviews leads to yet another mathematics education mantra, “PUFM”. In the letters to the Notices a few months later, some educators testily note that they already use ideas equivalent to “PUFM”.
Certainly my experience growing up was that math classes had too much drill; they did not adequately convey creative problem-solving. That too has been coopted by a “reform math” movement with an excessively unstructured, unrigorous approach. It’s also often a labor-intensive approach, and an excuse to slow down the material.
So fine, reform math goes to far. One antidote to that is uniform testing. Just like all other good ideas, it has been coopted, this time by “No Child Left Behind” methods. They introduce a heavy-handed system of financial rewards and penalties without decent control of the tests themselves. Typically the tests are used for moral judgements — “our schools get a failing grade” — instead of as a dispassionate measure of proficiency.
Is there any escape from the cycle of distortions of good intentions? There are ways to improve the system, but they are generally subtle or unpalatable. Teachers should have less required training in education, and more required proficiency in the subject matter. Teaching mathematics should be differentiated more from teaching K12 in general, so that we don’t have the task of teaching more mathematics to all K12 teachers. There should be more connections between high schools and universities; and connections should not be utterly dominated by university education departments. There should be more choices for parents and students — but this should mainly mean curriculum choices, not school vouchers. There should be standardized testing — but it should really be a national standard, and it should not be weighed down by incentives and pass lines. State curriculum standards should be written with adequate input from mathematicians, scientists, and engineers. Well, all of these steps, and some others, are easier to think of than to properly implement and protect.
13 March, 2007 at 11:04 am
thomas1111
Thank you for initiating this discussion! First a trivial thing: isn’t your number theory page here instead (broken link)?
As for the question about ways to promote interest in maths by the academic community: perhaps its relations with the broader community of math and science teacher is not as smooth as it should be (it seems to be quite close to a step function rather). Learning more examples of mathematical concepts in a very low-key way might improve the teachers’ abilities to pass on their message to their pupils. More generally, you mentionned several times in interviews to have a “bag of tricks” that grows with accumulated experience and I would think many of these have a wider applicability than their pure math setting. Just maybe, a website maintained by the mathematical community covering such mathematically-minded general strategies to attack a given problem (using plain words as much as possible) could be of interest to the wider public.
Yet obviously drawing people’s attention towards math is no easy task even with enthousiastic initiatives: recall the “Puzzles on Wheels” initiative by the MSRI a few years ago (see also this article;). Apparently it ended pretty fast unfortunately. (Sorry for the long post!)
13 March, 2007 at 11:11 am
Greg Kuperberg
Okay, having vented at length, I realized that I didn’t answer Terry’s good question, which is ways that university math departments can help. One to way to answer the question is to simply list examples of successful conduits of the mathematical tradition from universities to school-age children. (With the proviso that forms of all of these good ideas have compromised versions.)
1. Wikipedia.
2. The Ross Young Scholars Program.
3. The Math for America teaching program.
4. AP calculus, both the test and the courses. (*)
5. University math courses for high school students.
(*) Granted, AP calculus has many shortcomings, but in my view it’s a lot better than nothing.
13 March, 2007 at 12:21 pm
Greg Kuperberg
Oh yeah, I was going to list the California State Math standards in the previous as a positive example of influence. Of course it was the involvement of the Stanford math department, not the Hoover institute, that mattered.
The California state math standards have been very useful for my own children. The standards are very clear; they make it easier to decide whether a student has mastered specific material taught in any given grade or class. The draft standards that they replaced were indistinct on this crucial point. They read instead as a description of what students should experience in a given grade rather than what they are supposed to learn or know. It made it harder to answer practical questions like “is my child ready for this class? Is my child learning the material in this class? Is the teacher teaching the right material?”
13 March, 2007 at 2:10 pm
Gil
Well, we have here (HU; Jerusalem at our local IAS) a special semester (March-August) on Polytopes, complexes and connections to combinatorics and topology and we devoted the first seminar to a lecture by Semyon Alesker on Greg Kuperberg’s ingenious and mysterious proof. It was very nice!
Michael Larsen and Ayelet Lindenstrauss came to the lecture and to add to Greg’s nice personal account above, let me mention that Michael even remembered some of this stuff from the time when he and Greg studied in Harvard and worked on their junior thesis (it should be in plural thesise (?) thesa(?), whatever).
Semyon being the speaker inspired Imre Barany and me to ask about minimizing the product of other mixed volumes (say surface area). Here, in the plane the circle bits the square. Tom Braden and Victor Batyrev who will be speaking along with Sasha Zvonkin on Sunday, in the little workshop on toric varieties, polytope duality, and relation to Koszul duality and mirror symmetry, where also in the lecture.
So it looks that relations between metric/combinatoric/arithmetic and their combinations of convex bodies and their polars will still occupy us for a while. For the local combinatorialists like Ehud Friedgut and Irit Dinur this was the third combinatorics seminar of the week. And the CS theory seminar is yet to come tomorrow.
We had fierce competition from the little conference at the same time in the adjacent lecture hall on “sex, biology and game theory” ! followed by a lecture on “what is a language”. Tough!
13 March, 2007 at 2:33 pm
Greg Kuperberg
Michael Larsen and Ayelet Lindenstrauss came to the lecture and to add to Greg’s nice personal account above, let me mention that Michael even remembered some of this stuff from the time when he and Greg studied in Harvard and worked on their junior thesis (it should be in plural thesise (?) thesa(?), whatever).
That has to be a little out of order, because Michael was a senior at Harvard when I was a freshman. He was in graduate school at Princeton when I started this work. But I probably did talk to him about it later; in the 1990s I cared about the problem a lot.
13 March, 2007 at 5:08 pm
Deane Yang
How can university math departments help? Let me draw your attention to the letter to the March 2007 Notices of the AMS from Pisheng Ding (you can find it in http://www.ams.org/notices/200703/commentary.pdf), which I found to be quite insightful.
I wrote a letter in response to this that expressed the view that we cannot fix secondary school math education until we fix the graduate math education in this country. It has been convenient for us research mathematicians to paint the math education faculty and departments as the villains, but the fact is that our math departments are all graduating far too many incompetent majors at both the graduate and undergraduate levels. Until we are able to somehow set minimal competency requirements at all degree levels from Ph.D. down, I don’t expect us to make much progress with improving secondary school education.
Some of you may believe, as I once did, that the Ph.D. written and oral qualifying exams suffice as proof of minimal competency. It was a rude shock when I started to encounter, for example on Wall Street, math Ph.D.’s from highly ranked institutions who were capable of reciting the statements and proofs of sophisticated theorems but were incapable of dissecting and attacking elementary problems they were given (I wasn’t even looking for a solution). I learned, to my dismay, that it is indeed possible to obtain a Ph.D. in mathematics from a reputable institution through sheer memorization of facts without having developed strong basic mathematical analytical skills.
As for undergraduate math education, I’ll spare you any further rantings. Those of you unable to restrain your curiosity can find my writings about this by googling my name and “calculus”.
We in the research mathematical community are very proud of how unstructured and free our discipline is, in comparison to professional fields such as engineering, law, and medicine. I myself was attracted by this, when I first chose to go into mathematics. But I think we have to recognize the high price paid by both us and our society for our unwillingness to professionalize our discipline. It means that almost anyone can call themselves a “mathematician” or a “math teacher”, and when efforts are made by others to define what competency means, we are all too often on the periphery of the discussion, rather than leading it, as we should be.
There are some hopeful trends, with mathematicians like Wilfried Schmid getting involved in math education, but I am doubtful that we will be able to accomplish much unless we are willing to take a hard look at ourselves and make some tough pragmatic decisions.
13 March, 2007 at 6:52 pm
Nona
Is it possible for someone to provide a link to the actual video file so I can download it and watch it at my computer? I right clicked on the video link and did “save as” and only got a small .RAM file.
13 March, 2007 at 6:55 pm
Richard
Greg said “What I see in the average childhood experience in this country is a lack of cultural appreciation for mathematics.”
I agree with this statement completely, and I believe that this is the primary reason for decline of mathematics skills in the U.S. A talented kid can survive bad standard curricula, but many will not survive if they are never told that their talent and their work is something of value. This has always been the case in the U.S., but I believe that the problem is getting worse. Long ago, I went to a high school whose principal valued the sports trophy case more than anything else, and never approved advanced or special classes for talented kids. I was only saved by a single smart teacher who encouraged me to apply for and attend NSF summer programs in mathematics at several universities. Unfortunately, I don’t believe that these exist anymore.
Many immigrants to this country came from ancestry that highly valued learning of all kinds. Somehow much of this has been lost and forgotten, perhaps by this stupid American attitude that learning/talent = elitism.
13 March, 2007 at 7:43 pm
Greg Kuperberg
I went to a high school whose principal valued the sports trophy case more than anything else, and never approved advanced or special classes for talented kids.
Yes, outright antiintellectualism certainly is one side of the problem, but I also had in mind something else. There is a lack of cultural understanding of mathematics, even among most well-educated people. For instance, I can offer this quote from Robert Fixmer, who was once an editor for the New York Times: “Mathematics has no emotional impact. What physicists do challenges people’s notions of origins and creations. Math doesn’t challenge any fundamental beliefs or what it means to be human.” This regrettable comment came from a successful American intellectual, not from someone who only values the sports trophy case.
I suspect that most K12 schools will say that it is very important to learn mathematics. But they may not teach good mathematics, and they may not reveal a major reason that mathematicians themselves learn it, which is that it is beautiful. (I do mathematics precisely because Fixmer is wrong, it does have an emotional impact.)
To be sure, there is more to good teaching than liking mathematics. I don’t always do a good job of it when I teach. Many of my K12 teachers and my kids’ teachers genuinely like children and communicate with them well, for all I know better than I would. But they won’t convey a perspective that they don’t have themselves. Even if they do reward talent, they may not reward it in the right way.
I was only saved by a single smart teacher who encouraged me to apply for and attend NSF summer programs in mathematics at several universities.
I believe that the NSF program that fits this description ended, but the American Mathematical Society still supports several such programs with its Epsilon Fund program. These are indeed one of the main ways that univerisity math departments can popularize higher mathematics at the K12 level. I mentioned the example of the Ross program above.
Another way for math departments to help is to organize math contests and math contest training. I have seen first-hand how it can popularize rigorous mathematics. Just like all other solutions, this vehicle can be overused or coopted; it’s important to remember that the point is to stoke interest, not to “win”.
13 March, 2007 at 8:12 pm
Richard
Greg,
Well put! It’s just about bed time here, so I’ll only comment on emphasizing interest (and I’ll add fun and beauty) rather than winning. Terence, for example, seems to have escaped that trap well, but the emphasis on winning can lead to burnout and the inability to settle into a long and satisfying research program which requires much work, patience and reflectivity.
Your viewpoint on the cultural and emotional aspect of mathematics is interesting, has not been discussed much, and probably needs to be.
13 March, 2007 at 8:33 pm
Terence Tao
I actually don’t have much of substance to say in this discussion, and am happy to primarily play the role of moderator, but I can make a few small observations.
The culture in the U.S. might not be the most intellectual one in the world, but it is at least relatively supportive of activities which lead to individual “success”, especially financial success. There seems to be a general (albeit vague) public awareness here that mathematics is somehow “useful” for various industries and careers (e.g. IT, finance, engineering, etc.), and so people here do seem to agree that maths is important, even if they mostly don’t want to touch the stuff themselves. It’s not ideal, but it’s significantly better than apathy or outright anti-intellectualism. For instance, while teaching undergraduate linear algebra here in the U.S., I’ve found that students respond well to the story of how two Stanford graduate students in mathematics and computer science managed to exploit the theory of singular value decompositions of large sparse matrices to create a rather well-known multi-billion dollar web search company :-) .
And there are some rather good educational resources lying around, if you know where to look. For instance, my four-year old son loves the Flash videos from brainpop.com, which are remarkably well done, both in presentation and in content.
Mathematics competitions, when used in moderation, are indeed a good way to show high school students that maths has more aspects than the often rather dry material covered in classes. Out here in the west coast, though, they haven’t seem to have taken much of a hold, especially compared against more well-known activities such as the Spelling Bee. Perhaps one problem is that while a good performance at the Bee can be appreciated by just about anyone, a good result at a maths competition is only really appreciated by the participant and the grader. It would be interesting to have a maths-themed event which might have wider appeal to a non-mathematical audience.
13 March, 2007 at 9:22 pm
Greg Kuperberg
Out here in the west coast, though, they haven’t seem to have taken much of a hold, especially compared against more well-known activities such as the Spelling Bee.
Some of the math graduate students here at Davis are doing a great job with ARML “preparation”. To some extent it really is preparation for ARML, but they completely realize that it is also an excuse for a fun evening of problem solving, once a week. More students attend the meetings than will be on the team. I think that ARML is perfectly adequate for this mode of outreach. If not many students know about it in California, then people could work to change that. West Coast ARML is presently at UNLV.
Within the high school system there are also the AMC-10 and AMC-12 contests, which are the first American rung to the IMO.
At the more serious level of math summer camps, California has COSMOS.
The culture in the U.S. might not be the most intellectual one in the world, but it is at least relatively supportive of activities which lead to individual “success”, especially financial success.
Well, yes. I agree that this motivation is much better than nothing. But you see its limits when you teach standard calculus, for example; and I personally have some trouble coping with those limits. (But limits in calculus are great :-).) The syllabus for standard calculus reads like a carpentry manual. I am much happier with rigorous calculus or undergraduate analysis; the difference for me is the tone of the course rather than the caliber of the students.
14 March, 2007 at 11:08 am
MK
There is an interesting (to some) random matrix problem that when scaled by sqrt{n} the empirical distribution of an n times n matrix A with i.i.d. complex valued entries with mean zero and unit variance, converges to uniform distribution on the unit disk (“Girko’s circular law” – the non-Hermitian analogue of Wigner’s semicircle law).
Bai has shown that the essential problem is to get a bound on the probability that (zI-A) has a small singular value(which has been done only under certain conditions on the distribution of the entries). A much weaker bound than the type that you have in this paper is needed. I wonder if your results don’t actually settle this problem?
14 March, 2007 at 11:42 am
Jonathan Vos Post
As a fomer Adjunct Professor of Mathematics (and of Astronomy), son of a teacher who specialized in Math, married to a Physics professor who’s taught in 4 countries, I think that the USA’s Math Education problem has at least these components:
(1) the overall failure of the public education system in the mean and the bottom, although sometimes successful at the top;
(2) Failure in most states even to admit the existence of, let alone address, the problem of Dyscalculia;
(3) Loss of girls and women “in the pipeline”;
(4) Lack of cultural context — where da Vinci in the elementary textbooks? Or Mozart and Bach?
(5) Very incomplete detection of young talent, and thus ability to connect the talented with the best teachers. Contrast with the Jewish quarter of Budapest in the decade that produced von Neumann, Wigner, Szilard, and Edward Teller, having attended the same “best high school system in Europe”, and enjoyed the same cafe scene, where scientists mingled with poets and philosophers and painters. About 10 world-class geniuses, who won 7 or 8 Nobel prizes, share that description, and some had the same specific teachers. Then they saw their beloved home town ruined by Nazi and Communist invasions. Some did not come to America, but rather to England or France.
As to Dyscalculia, it is a scandal that the innumerate are allowed to teach Math, and that teachers and parents alike tell students with Math Disability” don’t worry, I never liked Math either. Remember the Talking Barbie Doll who would actually say: “I hate Math. Let’s go shopping!”
I have successfully gotten students in their 50s to
finally grasp what eluded them through bad teachers in their past, peer pressure, bad parenting, poor self-esteem. “Discalculia” = “Math Disability.”
The educational literature on Discalculia is
quite clear that roughly 2/3 of the clinically Math
Disabled CAN — with proper teaching my specialists — finally “get it” and be able to do at least algebra.
Only about 1/3 have their brains hardwired to make
intervention ineffective.
The problem is NOT in teaching people what they don’t know (and know that they don’t know). The big problem is in teaching people who THINK that they know something, but have the wrong facts, axioms, methods in place through wrong ideas uncorrected by good teachers. My wife has published in the area of college students who fail Physics because their high school teachers were wrong, and their textbooks were wrong.
I didn’t teach myself calculus from college textbooks until I was 11, but my son learned to at least take derivatives and integrals of polynomials by age 10, passed his college entrance exam aged 12, but we made him finish 8th grade before jumpring directly to University at age 13. he’s about to get his double B.S. in Math and Computer science, having just turned 18. But — and this gets to the deprecation of Math in the USA — he will then not go for a Math or CS graduate fellowship, but instead start at a top 10 Law School, still aged 18.
14 March, 2007 at 12:02 pm
Deane Yang
To follow up Jonathan’s posting, my diagnosis of the situation is that it stems from the failure of the research math community to find a way to professionalize itself. Engineering professional societies have set up a well-defined process for determining whether a person is allowed to call herself a “professional engineer” or not. The same is true for doctors, dentists, lawyers, and even architects.
We in the research mathematical community are very proud of how unstructured and free our discipline is, in comparison to the professional fields. I myself was attracted by this, when I first chose to go into mathematics. But I think we have to recognize the high price paid by both us and our society for our unwillingness to professionalize our discipline. It means that almost anyone can call themselves a “mathematician” or a “math teacher”, and when efforts are made by others to define what competency means, we are all too often on the periphery of the discussion, rather than leading it, as we should be.
I’d like to call your attention to the letter to the March 2007 Notices of the AMS from Pisheng Ding (you can find it in http://www.ams.org/notices/200703/commentary.pdf), which I found to be quite insightful about what’s wrong with university mathematics education i this country. I suggest that we need fix these things, before we can hope to fix the secondary school math education.
15 March, 2007 at 9:53 am
Terence Tao
Dear MK, Thank you for the interesting suggestion, which we will look into! My co-author Van pointed out to me that there is some precedent for this type of connection: a recent paper by Gotze and Tikhomirov uses (modifications of) some earlier work of Rudelson on bounding the least singular value of random matrices to justify a circular law in certain cases. Our work does share many features in common with Rudelson’s work, including the treatment of various exceptional directions and the use of inverse Littlewood-Offord technology.
15 March, 2007 at 12:43 pm
van vu
Dear MK,
Thanks for pointing it out. I have a quick look at a recent paper of Gotze and Tikhomirov and it seemed that our result can be applied directly to zI-A, and so one can skip
lots of calculations there.
There was another paper of Girko that we could not comprehend, about central limit theorem for the determinant of a random matrix (where the entries can have very general distributions, not just gaussians). Have you seen that ?
15 March, 2007 at 6:42 pm
Quant Jack
Did you guys ever consider that American children are smart enough to realize that studying too much math is a bad career and life–style investment? That investment banking, law, and medicine make better lives, careers, and well-rounded people? If you’re so smart, why are you not rich?
15 March, 2007 at 11:08 pm
Gil Kalai
Dear all, There are many nice open problems around the random 0-1 matrices and connections to Sperner’s theorem and Littlewood-Offord. Here is one problem which might be the most bizarre:
let
be the probability that a random n by n 0-1 matrix has determinant equal to k. So
is exponentially small and
is probably even smaller when k is not zero. (maybe
, this is interesting but not yet bizarre)
The question is if
Here are two heuristics in favor and against it.
Heuristic 1: the determinant as a sum of n! signed random product clearly behave smoothly. OK,
is sort of a singular value, but you can expect that
will be similar to
when
and
are small (in absolute value) (even just
.) Certainly
and
are very close.
Apropos this. Jeff Kahn challenged to prove that for
n by n matrices the permanent is zero with probability o(1). There you can also expect smoothness of the probabilities (beside that obvious parity condition.)
Heuristic 2: what is the probability that det = 0 (mod p) for a prime p? Well, 0-1 matrices should behave like random matrix mod p with entries 0,1,,…,p-1. But there the answer (denote it
} is known:
it is this familiar infinite product. (Not too far from 1/p.)
The probability
that a random such matrix has determinant 1 (mod p) is
.
Next we can assume that the behavior is independent for different primes. So you get the prediction that the probability for 0-1 matrices that det = 1 differ by a constant from the probability that det = a power of 2.
And the probability that
should perhaps be similar for different
.
All in all
is smaller than
by a factor of n or
or so.
And by this strange heuristic if |j| is not too large,
depends on how many prime factors
has.
Heuristic 2 is rather dubious but perhaps enough to shed some doubts.
16 March, 2007 at 5:51 am
vanvu
Dear Gil,
To me, Heuristic 2 looks like the situation with sieves in number theory. For very small
p, the independence look OK. But for relatively large p, it is quite wrong. If we do sieve the trivial way (before Brunn), the estimate was quite miserable.
16 March, 2007 at 7:24 am
Greg Kuperberg
I should also mention, in fact I should have mentioned much sooner, the “Math Circle” program at Davis and a number of other universities. These are weekly seminars for area high school students interested in mathematics. The one at Davis seems to be at least as successful as the ARML training. In particular, I asked the director, Yvonne Lai, whether the Davis program had stoked interest among any students who had once had no intention of studying higher mathematics. She said that she could definitely think of examples.
The broader lesson is that there is apparently enough extra labor at universities to export cultural appreciation of mathematics directly to many high school students. It’s hard to think of any real shortcoming to the “math circle” format. It may not scale up to the whole country, but I see no reason not to scale it up to more PhD-granting American research universities.
16 March, 2007 at 7:33 am
Terence Tao
Dear Quant Jack,
It is somewhat ironic that people who choose non-mathematical careers in order to avoid “too much math” find that this lack of mathematical literacy comes back to haunt them at later stages of their career. I’ve seen investment bankers who need to learn Black-Scholes theory or other mathematical aspects of risk management, doctors who need to know advanced statistics in order to correctly follow current medical literature, and lawyers who need scientific literacy in order to handle expert testimony, not to mention understanding basic probability theory, rigorous definitions, and propositional logic. All three of them could also use mathematical literacy when it comes to more mundane tasks, such as selecting a mortgage for an expensive home.
I wouldn’t recommend mathematical academia for everyone – the main attractions are things like academic freedom, creative expression, intellectual challenge and satisfaction, (eventual) job security, flexible schedule, and lasting recognition or legacy, rather than purely monetary incentives – but I would definitely recommend mathematical literacy, both for its own intellectual sake and for its ability to enhance a surprisingly large number of careers. Though, if it is riches that is your primary goal, I can point to people such as Sergey Brin or Jim Simons as examples of people who have used an advanced mathematical education to become extremely wealthy and successful. But it is worth noting that while intellectual ability and training can be converted into money, the converse is not always true: if you’re so rich, why are you not smart?
16 March, 2007 at 9:49 am
Quant Jack
Professor,
You indeed score valid intellectual points! It is easy to find many technically trained people who use technology (and luck) to become rich. And for every such rich techie, I can name an even more successful non-mathie (e.g., famous movie stars, writers, lawyers, politicians, investment bankers, even doctors) who might as well be mathematically illiterate because they don’t ever personally use anything beyond algebra (if that!) in their careers. But this example-counter-example sequence begs my question and does not address the problem at hand.
I am a practical fellow and the problem at hand is as follows. Suppose you are a youngster in America (such as your little boy) with life ahead of you. You want to take maximum advantage of all the opportunities America has to offer. You desire a happy, well rounded life, a wide circle of friends, a beautiful wife, wide travels, financial security, a career with high probability of success, and enough wealth to live in a great house in an upscale neighborhood, say, Bel Aire or Westwood. (Is this so unreasonable?) While you are reasonably intelligent (e.g., IQ = 100 – 125), you are not specially academically or mathematically gifted.
In this situation — which describes 90% of youngsters in middle America — what optimal strategy should a youngster adopt to achieve his aforementioned life goals? In particular, how much math should he study (if he doesn’t naturally enjoy math)? Should he take anything but the standard required math courses in high school and college? Why should he take AP Math? Why should he major in math instead of business, political science, communications, or pre-med, or pre-law? Why should he join Greg K’s “special math programs” instead of playing baseball with his friends? Why should he attend a “weekly high school math seminar” instead of enjoying his teen girl friend?
By the way, what do you advise your own kids?
QJ
16 March, 2007 at 2:11 pm
MK
Dear Vu and Tao, Gotze and Tikhomirov apparently need Gaussian tails for the entries, so one point of interest would be to remove all such assumptions (Bai has a Lindeberg type condition to reduce the circular law problem to one of estimating smallest singular values and perhaps there should be no more assumptions on the entries at all?).
I do not know which paper of Girko you refer to, but his claimed proofs of circular law are incomplete as he skips over this issue of small singular values of ZI-A. There are also some other nonHermitian random matrix problems which were held up by this issue so far and should be accessible now (assuming there is no catch in extending your result to complex matrices).
16 March, 2007 at 4:10 pm
vanvu
Dear MK,
If I understand well, there is a standard machinery to reduce the CL to the problem of estimating the smallest singular value. If it is true, then what would be the most general assumption for which this machinery works ? (Then we just have to check whether our result still hold under that assumption; it is certainly hold for complex entries.)
Btw, can we know your full name ? (You can send private email to us if not want to disclose it here).
Thanks, Van Vu
16 March, 2007 at 6:20 pm
Richard
Why should he join Greg K’s “special math programs” instead of playing baseball with his friends? Why should he attend a “weekly high school math seminar” instead of enjoying his teen girl friend?
Jack,
My nephew, when asked, elected to go to math circle rather than trick or treating on Halloween. Why? He made a choice to do what he he has fun doing. It’s that simple. Apparently you do not like or understand math, and that’s your choice, but not necessarily the choice of others.
16 March, 2007 at 7:16 pm
Greg Kuperberg
For every such rich techie, I can name an even more successful non-mathie (e.g., famous movie stars, writers, lawyers, politicians, investment bankers, even doctors) who might as well be mathematically illiterate because they don’t ever personally use anything beyond algebra (if that!) in their careers.
There is a degree of illusion in your description of the rich people in America. A really tremendous amount of money has gone to technology in America; and even in some of the other professions, the most successful people are more mathematically literate than the others. Meanwhile very few writers or movie stars are rich, it’s rather that the few who are tend to flaunt their wealth. Most techies and businessmen tend to hide their wealth. So do many professors, for that matter. Some professors really do spend up to their credit limit, but some of them have saved a lot more money than you might expect.
That brings me to another issue, whether or why it’s worthwhile to be rich in the first place. Would you accept dementia for a million-dollar salary? (Some boxers and football players have made bargains of that sort.) I certainly wouldn’t. In my view, unless you have a certain amount of intellectual wealth, you’re not in a position to appreciate or even retain financial wealth. A striking fraction of lottery winners go bankrupt.
There are a great many Americans who fumble or quarrell away most of their money. The main reason is not a lapse of morality, because most people have a pretty good emotional sense of responsibility. Rather, the problem is that many people just do not trust or enjoy mathematics enough to make the right decisions. They splurge, they fret, they gamble, they overprotect idle cash, they accuse friends and relatives, they delegate the math to CPAs. They do everything but sit down and do the right calculation and act on it. As you say, it usually requires no more than high-school algebra. But even though most reasonably bright people have learned it, they have not learend enough beyond it to actually live by it.
On the flip side, if you do have enough intellectual wealth, you may not need a Bel Air mansion to be happy. I enjoy a raise just like everyone else, but I don’t really need my own swimming pool, because I’m perfectly happy with a long bike ride. I don’t enjoy spending money all that much; spending money just for fun is inane.
Why should he join Greg K’s “special math programs” instead of playing baseball with his friends?
I completely respect playing sports with or without friends. (Frankly I think that baseball is boring, but there is basketball, ultimate, bicycling, etc.) My argument for a mathematics seminar for high school students is straight from “Green Eggs and Ham”. Just try it and see if you like it. I am confident that at least 1 in 40 students, from high-school through college, can learn to enjoy recreational, proof-based math problems. Not just learn to do it, learn to enjoy it, whether or not you’re good at it. If you are able, you should certainly try, because it’s a wonderful, always-available, free life experience. It is also one way (certainly not the only way) to be a more logical thinker who does better at most jobs and doesn’t squander savings.
But the truth is that far fewer than 1 in 40 students are ever exposed to exciting, rigorous mathematics; it’s more like 1 in 1000. So I think that there is a lot of room for us mathematicians to popularize our tradition.
I am also confident that at least of half of high-school students could learn to enjoy word problems at the level of high school algebra. Or at least, learn to do them and not dislike them. It could make a real difference for their life experiences. Algebra-based word problems are much more accessible than proof-based problem-solving, so it’s not really the same educational issue, but it is related.
17 March, 2007 at 9:26 am
Terence Tao
This is perhaps somewhat tangential, but regarding the topic of explaining the nature and purpose of mathematics to a non-mathematical audience, Tim Gowers does an excellent job, both in his talk “The importance of mathematics” (here’s the video and slides) and in his little book “Mathematics: a very short introduction“.
17 March, 2007 at 10:21 am
Manjunath Krishnapur
The machinery for reducing CL to estimating small singular values is in Bai’s 1997 paper “Circular law” (Ann. Probab.) The recent book of Bai and Silverstein seems to say that the weakest known condition is (A is the
random matrix)
for any t>0 and some s>0. (page 326, “Spectral analysis of large dimensional random matrices”).
And thanks for confirming that your theorems remain valid for complex matrices!
17 March, 2007 at 12:13 pm
Gil Kalai
Dear all. Thanks for your comment, Van. Sure, if Heuristic 2 can be improved a la Brunn this is even better. I suppose my question is whether the probability that an n by n matrix with entries 0/1 (or plus or minus one,) has determinant equal to k, depends in a strong way on the arithmetic of k. Heuristic 2 suggests that this IS the case and this is an appealing conjecture simply because it will be more interesting.
For matrices with entries plus or minus one I would expect the probability for the permanent to be equal to 2k behave very smoothly for k, like the probability of a length n! random walk from the origin landing in location 2k. But for determinants the heuristic will suggest a very different and nicer picture.
17 March, 2007 at 2:15 pm
Upcoming Activities... and Math Education « Less Drift, More Diffusion
[...] For an interesting read on this, check out Tery Tao’s recent blog entry. [...]
17 March, 2007 at 9:23 pm
Quant Jack
>> I can point to people such as Sergey Brin or Jim Simons as examples of people who have used an advanced mathematical education to become extremely wealthy and successful.
As a mathematician, why do you mis-use statistics? You should know better than to use extreme “tail” points to support your argument. The existence of Sergey Brey does not imply math is a good career choice any more than the existence of Shaq or Kareem Abdul Jabar proves that basketball is a good career choice for the average kid!
I also note that mathematics departments (maybe even the AMA) often mis-use statistics to try to attract math majors. In particular, many math depts put up stats touting that the “average starting salary” of a math major or actuary exceeds that of, say, the average starting salary of a business or political science major. This is a mis-use of statistics on two counts. First, if they do not control for incoming student characteristics (e.g., IQ, hours of effort invested in degree). It is likely that a high earnings math major would also earn more if s/he majored in business or pre-med — the earnings difference isn’t caused by the choice of major, its due to the innate hardworking character of the student. Second, starting salary does not reflect lifetime salary. Pre-law and business students start out lower than actuaries, but 10 years later, they are way, way ahead! If math depts are going to do a fair comparison of salaries associated with choice of major, they should compare expected lifetime salaries and control for differences of the incoming students. I have not doubt that, if they do the comparison fairly, a math degree, on average, leads to one of the LEAST rewarding careers from a financial compensation perspective. For every Terry Tao, there are a 1000 underpaid, overworked calculus teachers or math phds who can not find rewarding work in mathematics even though they “love” math.
18 March, 2007 at 12:16 am
Terence Tao
Dear Quant Jack,
Sergey Brin and Jim Simons are indeed exceptional individuals, but they are also part of a much larger story. For instance, both of them run large and successful companies (Google and Renaissance Technologies), which together hire thousands of well-paid people for which mathematical literacy is an essential job requirement (and hundreds of very well-paid people for which advanced mathematical education is similarly required). More broadly, they represent two major sectors of the modern economy (IT and finance) which both have an immense demand for the type of skills which are provided by mathematical training at both undergraduate and graduate levels. Maths departments routinely “lose” many of their fresh PhDs to lucrative careers in these (and other) sectors; there are real shortages created here that need to be addressed, because not enough good people are entering mathematics and related sciences. For instance, in the US, IT, finance, and academia all need to employ a significant fraction of their mathematical talent from overseas, in no small part because the domestic supply is insufficient to meet demand all by itself. In contrast, there is not exactly a shortage of people (in the US or elsewhere) lining up for an MBA or an MD.
If you have some figures for your “fair comparison of salaries”, you are welcome to share them here. As you point out, though, it is indeed a difficult task to do the maths correctly. For instance, which job is better financially: a $120K/year attorney job in which one works 60 hours/week and 12 months/year, or an $80K/year tenure-track academic job in which one works 40 hours/week and 9 months/year? (There is also the issue of repaying student loans for law or business school, while on the medical track there is also the issue of malpractice insurance; these are not entirely trivial financial burdens.) From the point of view of a “youngster in middle America”, one also needs to consider how competitive it will be to even get into the career track one is envisaging. Even the best and brightest students are not assured entry into the pre-med, law, or business school of their choice, due to the intense competition. In contrast, the top maths departments routinely compete for the first class prospective graduate students – because there are far too few bright students who want to do maths. The situation is of course more competitive for the more typical prospective graduate student in maths, but such a student would also have an even tougher time trying to get into law, business, or pre-med.
It would be a mistake to recommend one career as the “best” choice for everybody – just because (say) business careers are more lucrative than all others, does not mean that we should recommend that 100% of all students go into the business track (and discourage all interest in the other tracks). The love of the subject (and, of course, one’s natural talent for the subject) should of course be a major factor. As this discussion shows, though, students who have a potential love or talent for mathematics are often discouraged from achieving this potential for a number of reasons, including a lack of awareness of the career options that a mathematical education opens up (or equivalently, the many career options which become limited or even closed if one tries to avoid having too much maths in one’s education).
18 March, 2007 at 10:09 am
KJY
“Second, starting salary does not reflect lifetime salary. Pre-law and business students start out lower than actuaries, but 10 years later, they are way, way ahead!”
I’m not sure about that. Recall that those who enter law and business credential themselves through formal schooling, usually before they enter the field, whereas actuaries credential themselves during their working years. If you look at the salary surveys by D.W. Simpson, an actuarial recruiting firm, you find that, for example, fellows of the Casualty Actuarial Society (fellow being the final step in the credentialing process) with 9.5-14.5 years of experience have compensation packages ranging from $138,000 to $270,000 (10th-90th percentiles). Fellows with 19.5+ years of experience have 10th-90th percentiles of $155,000-$421,000+.
A brief search for lawyer salaries yields the site payscale.com, which tells us that the median salary for lawyers/attorneys with 10-19 years of experience is $107,000, while the median salary for those with 20+ years of experience is $120,000. The payscale.com entry for actuaries lists a 10-19 years median of $111,000, and a 20+ years median of $125,000, both of which are higher than the respective salaries for lawyers. The BLS’s occupational outlook handbook does tell us that a quarter of all lawyers make more than $143,000, and Monster’s salary center tells us that in Cleveland, Ohio (which is a relatively average cost-of-living area) the median salary for “Attorney III’s,” who typically have at least 5-8 years of experience, is about $147,000. So, there is some question as to whether the payscale.com numbers are entirely accurate, but in any case, I think the first paragraph shows that experienced, credentialed actuaries have salaries that are at least reasonably competitive with those of experienced, credentialed lawyers. Obviously, it would be better if someone was able to find more directly comparable numbers for lawyers.
Investment bankers will of course make more than actuaries, but it is not clear to me that it is worth sacrificing the best years of one’s life in order to make a huge amount of money later. I might add that most people who enter the field burn out; the managing directors and so forth who earn millions of dollars are not at all representative of the typical IB entrant.
Doctors also will tend to make more than actuaries, but at the cost of four years of medical school and four to eight years of residency, during which they rack up debt, and then struggle to live on a resident’s salary. We should also add that the much-hyped salaries of say, neurosurgeons and plastic surgeons, are totally different from those of say, internists and general practitioners.
Another point to make is that what mathematicians seem to be promoting is mathematical literacy, and the encouragement of serious interest in mathematics for those with the natural ability and curiosity for it. Mathematical literacy is beneficial for just about everyone, and as long as students are not “misled” into entering mathematics, the promotion of the field will only attract additional students inasmuch as those students are willing to forgo the apparently horrific compensation in exchange for whatever it is about mathematics that appeals to them. Mathematicians have never, to my knowledge, claimed that a math degree is for everyone, or that a math degree is the most lucrative option available.
On the topic of controlling for student characteristics, what about the fact that doctors, lawyers, etc, are apparently much, much more “well-rounded” than mathematicians? Perhaps most mathematicians are not capable of entering those other fields, due to their lack of well-roundedness, thus making a career in mathematics their ideal choice?
Finally, on being “well-rounded.” The distinction is often made that lawyers, doctors, executives, etc, are much more well-rounded than scientists, mathematicians, and engineers, and that this, in addition to salary considerations, means that the former are better careers, and more valuable to society. However, it was not well-rounded individuals who got us to where we are today; it was people who were extremely, extremely good at one or two things in particular. A well-rounded individual may be exceptionally talented at functioning within a given society, but he or she will never advance that society (unless, of course, he or she is a genius at everything). The rare example of Google is perhaps not directly relevant to an individual’s career choice*, but what it does is illustrate the enormous value of mathematics as a whole (especially in comparison to its meager cost), which then leads to the consideration that we should, in the interests of society, find ways of encouraging talented and interested students to choose careers in mathematics (i.e., to make mathematics a good career choice).
* This is still not entirely true, as safe careers such as medicine or law are better-paying on average, but will not yield multi-billionaires such as Brin and Page. Thus, if individuals value greatly the tiny probability of becoming insanely rich, then the fact that two Stanford graduate students used mathematics to become billionaires could be relevant to their career decision. This same idea can be extended beyond mathematics to science and technology, neither of which tend to be as well-paying, on average, as medicine, law, or investment banking, but which will occasionally yield individuals who become far richer than any doctor, lawyer, or investment banker in existence.
18 March, 2007 at 12:51 pm
Greg Kuperberg
In all honesty, research appointments in mathematics are so competitive that it’s hard for me to argue that there is a pressing shortage. There is certainly a shortage of Americans interested in research in mathematics, as evidenced by the fact that many mathematics departments are more than half foreign-born. For Americans, “Come study mathematics, because academia needs you badly!” is not really the right message, even though it is a popular message.
I think that mathematics departments, or NSF-funded programs, could do a lot more to connect undergraduate and graduate programs in mathematics to industry. Going to graduate school makes you want to be a research mathematician, even though the appointments are very competitive. And I suspect that many corporations don’t know to hire mathematicians, although it’s hard to blame them if the mathematicians don’t want to come. We could reform graduate training (for instance by including more computer programming) and post-graduate placement (for instance by encourging companies to use mathjobs.org) to justify the argument that more Americans should go to graduate school in mathematics.
18 March, 2007 at 1:10 pm
C.D.
I really think you have a good point QJ. I’m a (bachelor-)student of mathematics (Univ. of. Copenhagen), and i often wonder about how my future is going to be? I thought I would give you my point of view, since I am very much in a situation relevant to the questions which you are discussing (I apologize for any spelling mistakes etc.):
I’m no wonder-kid, i just like mathematics a lot, and choose to study it, really because of it’s beautiful, beautiful nature (‘the Queen of sciences’ :-). I try to work hard and disciplined, and I would love to get a PhD some day, and to get a career doing research in pure mathematics.
But everywhere I turn I seem to be told that only the very best ‘gifted’ and ‘talented’ students will be able to have a somewhat successful international career; hard work isn’t enough, you need some sort of natural ability or gift, and if you didn’t compete in and won some mathematics-competition or learned to do calculus at age 10 you’re no good. if you haven’t got this ‘natural ability’, too bad if you love math and would love to spend your life doing mathematical research; you won’t be able to reach the level of those top mathematical researches. Perhaps if I am lucky, I will end up as assistant professor at the local university, never having published anything and spend all my time teaching…
So my question is really if I want to invest the next 5-10 years of my life, trying (most probably in vain) to achieve the level and skill required to do serious mathematical research. Spending many hours every day working on mathematical problems, reading and attending lectures (all of which i love), but probably never going to be able to make any original contribution. I’ll then end with a salary much lower than that of business student who never spend much time and energy studying.
And by the way, every time we (at the uni.) talk to business-representatives, they tell us: ‘Oh yes, the private industry-sector absolutely hires mathematicians, mainly because of their ability to think structured and analytical bla bla bla’. Although, we would very much prefer if you have some programming skills, or have taken some enginering-courses.’
So my only alternative is to take some job where I am never going to put to use anything I have learned during my 5 years at uni, and most important: not being able to work with mathematics.
Summa sumarum:
I consider myself an average mathematics student, perhaps with a somewhat ‘above-average’ will to do alot of studying. I absolutely love to study mathematics, but I am having no prospect of being able to get a career within mathematical research, and no prospect of otherwise using my knowledge somehow. Not heartening prospects at all!
At least, this is my impression of my status quo, here in Denmark.
Best regards C.D.
(Please forgive me for any grammatical mistakes, my English could be better.)
PS. A recent article in Scientific American (Aug. 2006) indicates that ‘effortful study’ is the key to becoming an expert within some field (of course), but much more important than presumed: ”The preponderance of psychological evidence indicates that experts are made, not born ”. This article have given me some hope, but still it doesn’t seem to rhyme with the general opinion on these matters. What are you’re opinions of all this?
18 March, 2007 at 1:55 pm
Greg Kuperberg
Although, we would very much prefer if you have some programming skills, or have taken some enginering-courses.
As I’ve been saying, I think that this is good advice. I have followed it myself. But it doesn’t mean that you shouldn’t study the theorems. For me, computer programming without mathematics is like sex without love.
18 March, 2007 at 7:14 pm
Greg Kuperberg
This is an interesting-looking review of a topic that I don’t know beans about. But I notice that you do not mention models of fluid flow with quantized vortices. These models approximate continuous 3+1-dimensional fluid equations by non-local 1+1-dimensional equations. It seems conceivable, although I know nothing about the analytic issues, that you could hope for some kind of convergence as the quantization parameter goes to zero.
My wife, Rena Zieve, studies fluids with quantized vortices. In her case, the motivation is not that the model approximates classical fluids, but that it is a true description of superfluid helium.
18 March, 2007 at 9:07 pm
Terence Tao
Dear Greg,
It is indeed the analytic issues (and specifically, the establishing of estimates which control the solution, or convergence of approximate solutions to the true solution) which are the heart of the matter. Suppose for instance that we do manage to construct a global solution
to a quantised Navier-Stokes for every
(this is certainly conceivable, as one can do similar things for other relaxations, regularisations, or discretisations of Navier-Stokes). Now, as you say, one now needs to establish convergence in some topology of these quantised solutions in the limit
. In order to get any useful sort of limit, as a bare minimum one needs the sequence
to enjoy some sort of bound in some function space norm, uniformly in
, otherwise all sorts of bad things could happen (e.g. breakdown of the energy conservation identity). The key phrase here is uniformity in
. If one has some argument that bounds solutions to the quantised Navier-Stokes uniformly in
, then by taking limits one expects the same argument to work directly in the limit
. In other words, one can dispense with the quantised Navier-Stokes and work with the original Navier-Stokes equation directly.
Thus, for the point of view of establishing estimates (which is the key problem), perturbing the equation by varying a small parameter does not really help. Such tricks are however useful for more qualitative or formal aspects of the theory. For instance, to justify things like the energy conservation identity for very rough solutions (for which one cannot directly justify things like differentiation under the integral sign), a typical trick is to first relax the solution to a smoother solution (and perhaps a smoother equation), establish the relevant identity in that smooth setting, and then take limits to recover the identity in the original solution. These sorts of perturbations are also useful for a number of simple topological arguments, such as an application of the continuity method (showing that all solutions obey some property P by showing first that the set of solutions obeying P is open, closed, and non-empty, and then using the connectedness of the solution space).
From a physical viewpoint, it may well be that one of these modified equations is in fact a more realistic model for fluids than Navier-Stokes. But for the narrow purposes of solving the Clay Prize Problem, we’re stuck with the original Navier-Stokes equation :-) .
18 March, 2007 at 9:13 pm
Greg Kuperberg
You make it sound like it could be worthwhile to try to disprove the conjecture.
19 March, 2007 at 5:01 am
Nets Katz
Terry,
I think you are a little pessimistic about strategy 1 for proving blow-up – working with ensembles. For that strategy to work it is not absolutely essential that the typical solution blow up. In order for Navier Stokes to blow up, you just to have each scale “activated”, that is have energy flow sufficiently into each scale.
Suppose you could prove with positive probabiity that energy cascades into a generic situation in which it is one scale higher. Generic has to mean that we get positive probability of flowing to the next scale. You might get a measure zero set of solutions with blow-up but nevertheless use a probabilistic argument
to prove that it occurs.
19 March, 2007 at 8:50 am
Terence Tao
Greg, I certainly think one should pursue the blowup direction as well as the regularity problem. As noted in my main post, though, our technologies for establishing blowup are rather limited at present. Which brings me to Nets’ interesting idea. This idea may at least be able to establish a “norm inflation” scenario, in which some high regularity Sobolev norm is shown to increase quite rapidly in a bounded amount of time. That would be enough to disprove a fairly strong version of global regularity, namely that one can bound some critical global spacetime norm uniformly by, say, something depending only on the
norm of the inital data. (This would imply as a corollary by standard persistence-of-regularity arguments that any Sobolev norm of the solution at a late time is controlled by the Sobolev norm at time zero, and the
norm of the initial data.) I suspect that this type of global regularity result, while very common for critical problems, is probably false for supercritical problems, and might be disprovable by some sort of contradiction argument (e.g. by Bourgain’s induction on energy method). If one was extremely optimistic one might then hope to run a Baire category type argument to create an actual blowup, but this looks somewhat unlikely to me.
19 March, 2007 at 9:43 am
RK
>In all honesty, research appointments in mathematics are so competitive that it’s hard for me to argue that there is a pressing shortage. There is certainly a shortage of Americans interested in research in mathematics, as evidenced by the fact that many mathematics departments are more than half foreign-born.
As someone who works with maths in industry, I simply don’t see a need for more advance math research and probably a lot less should be done that what is going on in academia. I believe most math majors like me in industry find it too esoteric and self-centered for maths done for its own sake, without any regards to real-world applications. I am not saying it shouldn’t be done by a select few geniuses, since it is what they do best and could possibly lead to some breakthrough applications in the future, but for the majority of the math-loving populace, it’s a plain waste of time and should not be supported by tax dollars. Government funding for basic research in math or any other pure sciences are probably bloated, which explains why all these advance math department are seeing so many foreign-borns, when taking away these fundings would do more good and discourage such overt foreign slave-labor. The way I see it, more funding for applied math research and better math teachers who know how math is applied to everyday living is the way to improve our current maths situation. There’s too many ivory towers with their wizards and elves (publishing unfathomable works of doubtful benefit) that creates this feeling of apathy and even some resentment towards advance maths. Paring it down to a select few would more likely create the feeling of wonder and respect. I see a good analogy in the movie X-men between mutants and advance math researchers (AMR for short). How many more AMRs does the general populace need before intolerance takes over ? By the way, it has happened before in history, persecution of intellects in China and Iran are some well-known cases.
19 March, 2007 at 7:19 pm
Terence Tao
Dear Nets,
I thought about it a bit more, and this strategy to establish blowup may run up against the same supercriticality issues which plague the global regularity problem.
Thanks to the recent work of Escauriaza-Seregin-Sverak and others, we know that blowup solutions to Navier-Stokes must in fact blow up in the critical norm
. This rules out self-similar type solutions in which the solution shifts all of its energy from one scale to the next while keeping critical norms under control. It is plausible that these results can be localised, leading one to also rule out “self-similar + radiation” solutions (such as those appearing in the recent works of Merle and Raphael for NLS, as well as even more recent work on wave maps by Krieger-Schlag-Tataru and Rodnianski-Sterbenz) in which a lot of
mass is radiated away to coarse scales but the concentrating portion of the solution stays bounded in
. If these scenarios are ruled out, then the blowup solution must become increasingly large (and hence increasingly nonlinear) in critical norms at fine scales as one approaches the blowup time. This creates a lot of scope for an anti-Maxwell’s demon (Maxwell’s angel?) to cause trouble – one may start with a promisingly randomly distributed ensemble at coarse scales, but as one pushes the ensemble into fine scales with large critical norm, and the evolution becomes very nonlinear and unpredictable, Maxwell’s angel could conceivably sneak in and drain a lot of entropy out of the ensemble, and eventually nudge the entire ensemble into states in which the energy all dissipates harmlessly and one has global regularity. (Now, if we could obtain some sort of rigorous version of the second law of thermodynamics here, one might be able to prevent this from happening, but this looks very remote at present.)
19 March, 2007 at 7:53 pm
Terence Tao
Dear RK,
I have seen mathematicians compared to several things, but to be equated to X-men is a new one for me. Certainly it would be nice to have some sort of secret super-power. :-)
The proposal to subdivide math into “useful math” and “useless math”, and then save money by defunding the latter, has come up in the past. The talk of Tim Gowers that I mentioned in an earlier post already addresses the problems with this proposal rather well, but I can recap some of Tim’s points here.
Firstly, the cost-benefit ratio of mathematics is quite large: a mathematical breakthrough (e.g. fast linear programming algorithms, to name one at random) can benefit multiple sciences at once, and doesn’t require expensive labs or field tests. There are many examples of waste in government spending, but math research is nowhere near the dominant contribution to such. (See for instance this graphic depiction of the US discretionary federal budget to get some idea of the relative size of the NSF (the main federal funding body for mathematics in the US) against other government agencies.)
Secondly, the benefit of mathematics can come from unexpected sources. Elliptic curves, for instance, would likely have been placed on hypothetical lists of “useless mathematics” for decades, right up until they turned out to be of major importance in cryptography. It is not entirely clear at present what fields of mathematics will contribute to the sciences of the future (e.g. protein folding or quantum computing), but it may well be ones which previously have seen very little application.
Finally, maths is much more highly interconnected than one may think; individual researchers may only focus on one or two fields, but collectively every field is connected to every other by a surprisingly small number of degrees of separation. It becomes rather self-defeating to try to draw a sharp line between useful and useless. Tim gives some examples in his talk, but I can give one from my own research experience. The closest thing I have to a “useful” research product is my work with Emmanuel Candes on compressed sensing, which has helped lead to the development of a new type of camera as well as my only patent. This work resulted from investigations into random minors of Fourier matrices, which resulted in part from my work on the uncertainty principle in cyclic groups, which resulted from my attempt to understand the Cauchy-Davenport inequality, which resulted from my attempt to understand the additive combinatorics underlying Gowers’ work, which resulted from my attempt with Nets Katz to understand and improve the additive combinatorics underlying a paper of Bourgain, which resulted from my attempt to understand the Kakeya conjecture, which originally was a cute but impractical problem on how to rotate a needle around in the plane. At each stage of this process I was unaware of where next my investigations would take me, and it was a fair surprise to me to realise that my pure mathematical research had suddenly taken on a practical dimension. But this type of thing happens surprisingly often in a fundamental science such as mathematics.
(Note: for some reason two earlier posts from Deane Yang were automatically flagged as spam, and I only discovered this by accident today; they are now visible.)
19 March, 2007 at 11:20 pm
Nets Katz
Terry,
That’s really unfortunate. For this to work, nonlinearity needs to be a
friend and not an enemy. In fact the test problem should first to prove blow up
for 3D Euler with finite energy.
Some advantages there: Energy never dissipates away, cascading always
occurs exactly by vortex stretching so that some structure seems to be preserved (for instance the locations of zeroes just convect)
Of course the passage to high scales should happen at time scales so startlingly
fast that the linear term is having very little effect at all so that the blow up for
Euler should indicate the blow up for Navier Stokes.
20 March, 2007 at 12:14 am
Kashif Javed
hi!
So I guess this comes under the heading of related material, which begs the question of exactly how it is related to your research.
http://www.expertsitsolutions.com
20 March, 2007 at 12:21 am
Nets Katz
In fact, so that I can understand your objection better, let’s restrict
our attention to 3D Euler. The critical norm which you say we should be
desperately clinging to is L^{infty} of vorticity. We don’t need a recent result
here. Beale-Kato-Majda already prove that if we have blow up of Euler
then the L^{infty} of vorticity must blow up. Does that mean the equation
becomes “increasingly unstable”?
By contrast let me mention a 2D problem for Euler which is critical. There
global solvability is guaranteed since vorticity is convected so that L^{\infty}
of vorticity is conserved. Beale-Kato-Majda show that the growth of
Sobolev norms in time is doubly exponential. The question, perhaps somewhat less glamorous than global solvability, is whether this estimate
is sharp and double exponential growth actually occurs. As we’ve discussed in the past, if one looks at the highest order part of the growth
exclusively, in a certain position, one is forced to consider a system
of ODE for SL(2) valued variables, one variable for each active scale. The
criticality enforces that different scales could be of equal strength and
that many different scales have to work together to achieve double exponential growth. More over if we look at this system of N equations
which governs the growth of the frequencies that are like 2^{-N}, this
system is highly nonlinear, in that, in the relevant time scale which
is 1/N, you could have all of the N/2 highest scales change completely.
You can make very precise that this critical setting becomes increasingly
unstable as the N gets larger. In particular the system has increasing
numbers of variables more remote from each other in scale, all of which
interact. That is tough going.
On the other hand, in the supercritical case, if you posit any reasonable
worst cascading of energy, what happens is that the highest order scale
dominates exponentially the lower order scales. The cascading process
should indeed be nonlinear, but it seems self-similar as frequencies increase. In the time we can activate the N+1st scale, the Nth scale might get all messed up – that’s the nonlinearity – but we don’t care so
much what happens at slightly lower scales. The problem is essentially local in scale.
Now you might argue that this local process could have a derandomizing effect on the ensemble. (And the more times you do it, the more derandomizing it gets.) But if you could show that were true, it would
probably be the most amazing theorem in fluid mechanics ever. In any
case, in this setting, it seems to me the supercritical problem might be
easier to understand than the critical one. Am I wrong?
20 March, 2007 at 11:49 am
RK
>I have seen mathematicians compared to several things, but to be equated to X-men is a new one for me. Certainly it would be nice to have some sort of secret super-power. :-)
Thank you Terry. It is interesting that you don’t see yourself as having some secret superpower, and it feels somewhat disconcerting. If you solve a problem in less than 10 seconds while the person sitting next to you is still fumbling with it for an hour … well, you get the picture. In the Xmen movie, the mutants do tend to consider themselves as the “normal” people, while the rest of the population are treated by them as “subnormal”. Gaussian statistics dictates otherwise , don’t you agree ? Fortunately, we live in a world where one can be a dumbass in math, and be a genius in basketball :)
>Firstly, the cost-benefit ratio of mathematics is quite large: a mathematical breakthrough (e.g. fast linear programming algorithms, to name one at random) can benefit multiple sciences at once, and doesn’t require expensive labs or field tests. There are many examples of waste in government spending, but math research is nowhere near the dominant contribution to such. (See for instance this graphic depiction of the US discretionary federal budget to get some idea of the relative size of the NSF (the main federal funding body for mathematics in the US) against other government agencies.)
as I said, funding needs to be optimized, so only the deserving gets it. Not an easy thing to do I have to concede. In any case, in a free and efficient economy, NSF or other basic research funding should be small and and remain small. Out of 100 basic math proposals only 5 should be funded. Which 5 to pick is another matter for discussion. Other agencies are getting the lion’s share simply because it is what the market dictates. I would argue that most math breakthroughs occur in industry (like your fast linear programming example), not in academia. Just like in the old times, it’s the practical problems that create new useful maths (Newton, Bernoullis, Fourier are great examples). You mentioned elliptic curves. In the cryptographic field , of which I’m a bit familiar, they’re treated more as a curiosity rather than having any real applications. There are far better and sophisticated algorithms that are invented because of the need for secure communication. I happen to believe that necessity is the mother of all useful inventions. You might have invented a better mousetrap or some other trap where it’s not clear what use it may have in the future, but until someone needs it or sees a need for it, you fund it yourself first.
It seems to me that your camera patent resulted from your friend’s need to solve a practical problem, which you happened to do or have done so already. It is entirely possible that if he hadn’t met you, he’ d end up reinventing what you did with Fourier matrices, or something even new. However, kudos to you for even tackling one practical problem! I’ve had encounters with college math professors who treat any practical uses in their subject of expertise as if it were a pile of dung! (I distinctly remember one waving his hands as if having a shock) . The connections you mentioned are quite impressive, but it still require that conscious effort to make the last practical one. I imagine you have this tree of connections, that unless it ends in a leaf or flower or a fruit, will simply keep on branching nowhere. Not a pretty tree to see. I also want this tree to grow a branch where it needs to. Otherwise, I cut off all the unkempt branches. If I need an apple, I’d grow an apple tree. Someone out there likes to grow many kinds of trees for its own sake, you can do that on your own time and money. Of course, someone might need an orange in the future and you hit the jackpot if you have one, it doesn’t happen often I’m sure. Now if that camera end up in my house someday, I’d have you to thank for :).
20 March, 2007 at 7:07 pm
Richard
Unfortunately, RK, apple trees, like most trees, take a very long time to grow, mature, and produce fruit, acorns, or whatever. You’ll starve to death long before the apples appear.
20 March, 2007 at 8:26 pm
Terence Tao
Dear Nets,
It’s a bit trickier to use scaling analysis for Euler, since it has a two-parameter scaling symmetry rather than a one-parameter one; the spatial scale, temporal scale, and velocity magnitude are related only by a single constraint rather than two. Nevertheless I agree that the problem of improving the double-exponential Beale-Kato-Majda bound for 2D Euler is a great problem – it is perhaps the one place where the wall separating the impossibly supercritical problems from the feasible critical problems is thinnest. It is clear that some improvement should be possible, and whatever technique is used to accomplish this will undoubtedly be interesting.
I also agree that it is against all physical intuition that fluid equations could derandomise ensembles of solutions – but we don’t seem to have any way at present to make this intuition rigorous. If we have to admit the existence of derandomisers to move the solution to either the best-case scenario or worst-case scenario at whim, then any supercritical equation could conceivably do just about anything from generic blowup to global regularity, depending on the mood of the derandomiser. This is why I feel that “understanding pseudorandomness” needs to be part of the solution (unless we find a new monotonicity formula or something, which lets us gain more control on either the best case or worst case and allows us to establish results even if the equation tries its best to derandomise the flow).
20 March, 2007 at 8:37 pm
Ars Mathematica » Blog Archive » Why Navier-Stokes is Hard
[...] Tao has a thoughtful post that explains why proving existence results for Navier-Stokes equations is so [...]
21 March, 2007 at 9:14 am
Terence Tao
Dear RK,
I actually think that the concept of “secret superpowers”, in maths (or elsewhere), is in fact rather damaging, as it conveys the impression that maths is inaccessible to mere mortals, requiring some sort of “X-men gene” in order to play. (See also “the cult of genius“.) In reality, good mathematical work can be accomplished by anyone of reasonable intelligence given the right resources (interest, education, time, patience, work ethic, study ethic, access to literature, exposure to mathematical culture, inquisitiveness, good mentorship, and so forth); in that sense, it is not terribly different from other skilled professions. (Of course, giving people access to these resources is non-trivial, and is basically the original topic of this discussion.) Raw talent is of course a factor, but the differences in quality between mathematicians are mostly a matter of degree rather than kind, excepting perhaps some very rare true geniuses (Ramanujan comes to mind) who did indeed have extraordinary natural gifts. As a child, I myself was not actually able to solve hour-length problems in 10 seconds; nowadays, I can occasionally pull it off, but only because I happen to have 10 years of experience in the problem area. Ultimately it is long-term skills and experience which matter the most; initial talent may help one achieve short-term goals such as a university degree, but it is not the decisive factor in long-term success.
Now regarding the relative “practicality” of different types of mathematics. It is indeed true that pure maths, applied maths, and industrial maths all have different goals, mindsets, and priorities, and many mathematicians specialise in just one of these spheres and neglect the other two. However, mathematicians in all three spheres use (essentially) the same mathematical language, and publish in journals which are generally accessible to mathematicians in the other two spheres. As a consequence, it is not necessary (or even desirable) to get every single mathematician involved (or even interested) in every single sphere. As long as there is a significant minority of interdisciplinary mathematicians who can take the published research from one sphere and apply it to the other, it is in fact a much more efficient division of labour to encourage those who have talents in pure maths to specialise in pure maths, and similarly for applied, industrial, or interdisciplinary mathematics. A typical pure mathematician may have no interest in applications (or vice versa), but the research produced will appear in journals, conferences, or in conversations with other mathematicians, and it only takes a single mathematician with an interdisciplinary mindset to pick up on the potential of that research to transfer it to another area.
Of course it is not true that all research is of equal value; the best examples of good mathematics are indeed focused on one or more goals – not only immediate applications, but also addressing more fundamental questions, such as trying to formalise and understand a mysterious mathematical phenomenon. All of these goals are important, and surprisingly often the pursuit of one goal can lead to unexpected progress on other goals (cf. Wigner’s “unreasonable effectiveness of mathematics“). See also my article on “what is good mathematics?“.
p.s. Regarding elliptic curves, the NSA’s “Suite B” set of cryptographic algorithms, which may well become a standard in the near future, exclusively uses elliptic curve cryptography for digital signatures and key exchange (though of course other types of cryptography are used for hashing or symmetric-key encryption). Also, elliptic curve factoring methods are still widely used for medium-length numbers for which both trial division and number field sieve type methods are impractical.
21 March, 2007 at 10:41 am
van vu
To RK,
You wrote:
as I said, funding needs to be optimized, so only the deserving gets it. Not an easy thing to do I have to concede. In any case, in a free and efficient economy, NSF or other basic research funding should be small and and remain small. Out of 100 basic math proposals only 5 should be funded. Which 5 to pick is another matter for discussion.
—————————————-
What you wrote is actually not too far from the truth (:=((. I am not sure we have free and efficient economy, but NSF funding for math is certainly small. It costs about the making of Water World (very bad movie–if you have not seen it, don’t) to fund the whole math community in the US. Furthermore, if every eligible faculties
applies for funding, the acceptance ratio would be far worse than 5/100 !!
21 March, 2007 at 1:52 pm
Greg Kuperberg
I actually think that the concept of “secret superpowers”, in maths (or elsewhere), is in fact rather damaging, as it conveys the impression that maths is inaccessible to mere mortals, requiring some sort of “X-men gene” in order to play.
Of course it is defeatist to believe that mathematics research is only for a handful of geniuses. It’s hard to think of what to say to people who want to believe that. I would go back to the “Green Eggs and Ham” argument: Just try proof-based mathematics and see if you like it. Worry later about whether you would enjoy a career in it.
If it is meant as an argument against mathematics funding, then as Van Vu says, funding is already extremely competitive. NSF funds a great range of first-rate work with very little money. In a sense, Congress could hardly fund it less. Research mathematics funding could be viewed as an accident; it survives because it is a good thing and it is below the radar. Many Congressmen may not even know that it exists.
It costs about the making of Water World
Right, one year of NSF funding of mathematics costs about the same as one high-budget Hollywood movie. Ten years of NSF funding of mathematics costs less than the Chandra X-Ray Telescope. If all research in mathematics together is cheaper than just one telescope, then it really is almost nothing given the great range of work. And nobody debates whether telescopic astronomy is practical; it is funded just because it’s cool.
21 March, 2007 at 2:06 pm
Greg Kuperberg
By the way, there is another direction in math education which maybe you or we should speak to. Home schooling is very popular in the United States these days. To be sure, mathematical prodigies have to have some degree of special studies with elements of home schooling. In some circles, mathematical prodigies come across as a kind of jackpot of radical educational choices: “Unleash the genius within your child”, “see what home schooling can do”, “see what public schools can’t do”, etc.
Now, I certainly think that many public school policies are too rigid, and that public schools could teach mathematics better. But I have even less happy about overpromotion of home schooling. I can accept a right to home schooling. But even passable home schooling is hard work; good home schooling is even harder work. It is especially troublesome that parents are not in a position to criticize themselves for falling short. Bucking the system just for its own sake is a mistake.
21 March, 2007 at 9:40 pm
Nets Katz
One should not be too critical of home schooling.
Public schools do not exist to teach mathematics. They exist to integrate students into society. The two goals are perhaps not so compatible.
Another thing one might add is that there is plenty that university faculty
can do to aid the progress of home schooling in general. Home schoolers
are hungry for ideas and materials. Often much more so than teachers.
Nets
22 March, 2007 at 1:29 am
oz
Regarding math funding:
The MONEY spent on math seems indeed to be negligible compared both to other fields
and to its benefits, so let’s approximate it by zero.
However, I think what was missed in previous comments is the TRUE RESOURCE here,
which is the talent and time of all these smart people spending their life doing pure math.
Surely, it sometimes brings to unpredictable and useful things.
But many of these talented people can surely contribute much to more applied
fields in science and engineering (simons and brin are great examples, and also eric lander)
The true question is, had all these terry tao’s and their likes worked on applied problems in
engineering and science, wouldn’t their work lead to more useful stuff?
22 March, 2007 at 8:36 am
Imagine If Quantum Mechanics Ceases To Be So Esoteric? at Orbit Change Conversations
[...] brings me to Terence Tao’s insightful post on “Quantum Mechanics and Tomb Raider”. It uses a modified Tomb Raider computer game to describe many aspects of QM: the [...]
22 March, 2007 at 9:06 am
Terence Tao
But many of these talented people can surely contribute much to more applied fields in science and engineering (simons and brin are great examples, and also eric lander)
To some extent, this is what is already happening. I just happened across this cover story “Math will rock your world” in, of all things, BusinessWeek, which broadly (if a bit breathlessly) describes how maths is permeating various modern industries. And as discussed above, there are certainly financial incentives to work in areas other than academia. All of this does lead to shortages of mathematical talent at the higher levels, although as Greg pointed out above this is unfortunately not always translating into increased openings in academia.
But we don’t live in a centrally planned economy, and people also have the right to work in the field they enjoy the most, and get the most satisfaction out of, even if it is not the most “productive” in a utilitarian or financial sense. One should also bear in mind Ricardo’s law of comparative advantage. Just because someone has some absolute advantage in, say, creating new widgets, does not mean that this is necessarily the best thing for that person to do, because he or she may have an even stronger comparative advantage, in, say, proving theorems. For instance, a talented mathematician might be able to design widgets 1.5 times more efficiently than your average widget designer, but might also be able to prove theorems 100 times more efficiently than the widget designer. It is then more efficient to let the widget designer design the widgets, and let the mathematician prove the theorems.
22 March, 2007 at 9:12 am
ST
Mr. oz:
You must be religious(scientology?) becasue your rhetorical question presumes the existence of some mighty beings who will continue to provide the essential tools like calculus and number theory so that the terry tao’s and their likes can work on applied problems in engineering and science.
22 March, 2007 at 3:35 pm
Khoa Tran
Terry raised a good point on “Comparative Advantage”.
To address Oz’s question “had all these terry tao’s and their likes worked on applied problems in engineering and science, wouldn’t their work lead to more useful stuff?”, I’d like to add another argument, although it’s non-falsifiable. Had Terry Tao worked on Engineering, I doubt that he would have contributed to Engineering as much as he has done in Mathematics.
As Terry pointed out earlier, a person’s productivity/talent depends so much on his passion and hard work rather than raw intelligence.
22 March, 2007 at 4:26 pm
kt
on math funding…
is it feasible to patent certain mathematical results so that royalt fees can be collected for commercial applications that make use of the results? the fees can then go into the “International Monetary Fund For The Advancement Of Mathematics”.
kt
22 March, 2007 at 6:32 pm
Freiman's theorem in finite fields via extremal set theory « What’s new
[...] theorem in the 2-torsion case, and in particular getting some partial results on the polynomial Freiman-Ruzsa conjecture. Specifically, we show that if has doubling constant at most K, thus , then A can be contained in [...]
23 March, 2007 at 3:17 am
oz
ST:
I’m not religious. The question wasn’t rhetoric but a real one.
Sorry for the word ‘all’ which is a bit too strong – I actually meant ’some/most’.
Anyway, the comparative advantage and ‘do what you like’ arguments are good ones. Still its a matter of finding the correct balance. Whether society would be better off with more/less pure mathematicians than there are today I guess nobody could ever tell.
“Had Terry Tao worked on Engineering, I doubt that he would have contributed to Engineering as much as he has done in Mathematics. ”
Probably true. But there are opposite examples like Shannon.
23 March, 2007 at 8:18 am
Not Even Wrong » Blog Archive » All Sorts of Links
[...] each have fascinating blog entries on Millenium problems. Terry Tao writes a long explanation of Why Global Regularity for Navier-Stokes is Hard. He also comments about the recent New York Times piece about him and about math education issues. [...]
24 March, 2007 at 1:09 am
Anonymous
WHY DO PEOPLE HAVE TO KNOW MATHEMATICS BEYOND THE LEVEL WHICH THEY ARE TO TEACH ? 1 2 [Next]
24 March, 2007 at 7:09 am
zf
Dear RK
You said “Government funding for basic research in math or any other pure sciences are probably bloated, which explains why all these advance math department are seeing so many foreign-borns, when taking away these fundings would do more good and discourage such overt foreign slave-labor.”
I am from a third-world nation. I intend on pursuing a math phd from a good school in the US. I doubt i’m ’slave labor’. I want to go to the US because it has many good mathematicians. Your phrase sheds much light on the limitations of your thinking.
Moreover, I do math not because it has applications…its a bit like asking a musician why he makes patterned sound-waves which can’t be used for anything (perhaps heavy metal can scare away crows?).
Perhaps you’ve never felt joy doing math.
Strangely this is not much different from the joy ive seen felt by my physics peers..i remember being asked in my first year by a math student what possible use could there be of hilbert spaces…he shifted to physics…he loves QM and now he loves functional analysis too…but not ONLY because it has uses in QM or because its a beautiful fact that beautiful math explains deep things about nature. But because he now apprehends the beauty of the idea.
When i or any other serious student of math (even physics and some eco students) do math, applications is the last thing on out minds. Its not even a goal. And remember i come from a third world country — earning ‘big bucks’ is a big thing for us and we are aware of the fact that a banker gets more money than a mathematician. And yet many of us want to stick to the ‘irrational’ choice of taking up math as a career and perhaps come to the US where the pay for being just a phd student is better than what it is in my country (that helps, I wouldnt care to own a bentley, but its nice to be able to buy a small house and pay for your kids education).
- zf
25 March, 2007 at 7:49 am
Paul Cohen « What’s new « cow_gone_mad
[...] « What’s new Posted in Uncategorized by cowgonemad on the March 25th, 2007 Okay, over at Paul Cohen « What’s new, I learnt that Paul Cohen has recently died. The only thing about him I know personally, that I [...]
25 March, 2007 at 10:02 am
Anonymous
Cohen’s proofs (like virtually all set theoretic independence proofs) are relative consistency proofs. He proved that if ZFC is consistent, then ZFC is consistent with the failure of the continuum hypothesis. The power of Cohen’s technique is hard to underestimate – it is essentially the only known way to enlarge a model of set theory. (There are lots of ways to shrink them.) Cohen also showed that the axiom of choice is independent of the other axioms using a variation of this technique.
25 March, 2007 at 1:28 pm
On genius and hard work « Mulling Math
[...] of this cult of all, it does permeate math as well. Terry Tao in particular mentions this here and more extensively in his excellent career [...]
25 March, 2007 at 9:46 pm
HL
I have a question about the case of a general Abelian group G. What is the best known result for the case G has *no* finite subgroups?
26 March, 2007 at 6:45 am
Terence Tao
Dear HL,
In the torsion-free abelian case (i.e. G is abelian with no non-trivial finite subgroups), one can use the “Freiman isomorphism” trick to reduce the problem to the classical case
of the integers. (Quick sketch: since the set A one is studying is finite, one can reduce to the group generated by A, i.e. one can assume G is finitely generated. Since G is also abelian and torsion free, it is now isomorphic to a lattice
. Now, one cannot embed all of
homomorphically inside the integers. But we are not really using all of
, since A is finite, we can work inside a big d-dimensional cube inside
, which can be mapped homomorphically and injectively into the integers in a variety of ways.)
In the integers, the classical theorem of Freiman applies: if
, then A can be contained in a generalised arithmetic progression of rank at most F(K) and cardinality at most G(K) |A|. The best bounds for F and G are currently due to Chang, who obtained F(K) = K-1 and
for K large. There is also a variant due to Bilu and refined slightly by Ben Green and myself, which has the same hypothesis but concludes instead that A is contained inside
translates of a generalised arithmetic progression of rank at most
and cardinality at most |A|, where
is arbitrary.
27 March, 2007 at 8:50 am
Ford Denison
You can’t imagine how flattering it is to have my blog listed right under Bruce Schneier’s!
27 March, 2007 at 9:05 am
Greg Kuperberg
I’d like to throw out a specific recreational mathematics problem which I realized is on the same theme as non-commutative Freiman theorems.
It is known that the diameter of the Rubik’s cube group (in the half- and quarter-turn metric) is between 20 and 27. Meanwhile
here is a histogram of the word lengths of a random sample of 1,000 positions. You would guess just from looking at the histogram that the diameter is 20; it would be strange if it were more than 21.
Is there a way to prove that the diameter of the Rubik’s Cube group is 20, by combining Freiman-style results with simple-minded computer data? If not an outright proof, is there a probabilistic protocol, in the vein of Arthur-Merlin protocols or probabilistic primality?
27 March, 2007 at 10:41 pm
Greg Kuperberg
For example, the histogram does immediately demostrate, if you are content with a probabilistic verification, that the diameter of the Rubik’s cube group is at most 36. That’s just by the pigeonhole principle: more than half of the positions can be solved in 18 moves. I have heard that for a time, this probabalistic verification was the best upper bound on the diameter of the cube group.
27 March, 2007 at 10:44 pm
Greg Kuperberg
Actually, the pigeonhole principle even establishes a probabilistic verification that the diameter is at most 35, better than 36. More than a quarter of the positions can be solved in 17 or fewer moves, while only 3% require 19 or more moves. Therefore every ball of radius 17 intersects every ball of radius 18.
(Sorry, I made a mistake; this emendation could be attached to the previous post. )
28 March, 2007 at 8:45 am
Greg Kuperberg
I am not sure whether this is a helpful suggestion for actually proving this conjecture, but let me ask whether the following stronger version appears to be the truth.
Suppose that you start with an eigenstate of a rectangle, whose Laplacian can of course be solved explicitly. Not just any eigenstate, but one whose momentum is almost entirely vertical; the horizontal quantum number is the smallest possible. Then suppose that you continuously bulge the sides of the rectangle, say by attaching half-ellipses whose horizontal semiradii grow from 0 to half the height of the rectangle. There should be a spectral flow. Do the eigenstates in question flow to counterexamples to QUE?
28 March, 2007 at 8:48 am
thomas1111
Very naive question: is there any chance to ever know about “the shape and size of specific eigenfunctions for a general manifold” since there might not be analytical ways of expressing them in closed form, except in highly non-generic cases (just like the Galois theory story for roots of polynomials) ?
(Aside comment: isn’t there a latex formula missing in the paragraph “The big gap…” after “but to even get control…”)
28 March, 2007 at 8:50 am
Greg Kuperberg
NB — there may be geometrically easier ways to do the spectral flow. The question of course has both quantum and classical interpretations. You can imagine a stadium drumhead, for example. If the semicircular sides are covered with heavy rubber mats, then the drumhead reduces to a rectangle. But now you can shave rubber mats continuously until they disappear. Again, the question is whether a vertical mode flows to a scarred state.
28 March, 2007 at 9:04 am
Terence Tao
Dear Thomas: thanks for the correction, I think I fixed it. It is true that, like many other PDE questions, exact closed-form solutions are probably impossible. But that doesn’t necessarily prevent us from establishing more “fuzzy” properties of these solutions, such as their size in various function space norms, or how they are distributed in space, indeed much of the previous century’s work in PDE was (and still is) devoted to these sorts of questions. For instance, we understand very well how large these eigenfunctions will be in the Sobolev norms
, which describe the regularity of these eigenfunctions, although our understanding of the
norms, which describe the distribution of these eigenfunctions, is still far from satisfactory.
Greg: this is an interesting suggestion. There is a problem when two eigenvalues come close to colliding, at which point the spectral flow can become very unstable; a scarred state may end up switching identities with a non-scarred state that it nearly collides with (sort of a quantum version of an elastic collision). I haven’t done the requisite dimensional analysis back-of-the-envelope calculation, but I would imagine that the number of near-collisions one would experience over the lifetime of the deformation would be a positive power of the energy, making it almost hopeless to maintain control of the final state in terms of the initial state. One might hope that the effect of all these collisions cancel themselves out (some sort of quantum second law of thermodynamics), but this brings us back to the seemingly intractable problem of how to show that deterministic dynamical systems exhibit pseudorandom behaviour, which as I mentioned in the Navier-Stokes post is a basic problem which plagues a lot of modern analysis.
28 March, 2007 at 9:15 am
Greg Kuperberg
I would even have to learn how to do such a dimensional analysis, but even if it is unfavorable, it seems possible that the vertical modes are protected by their emergent translational symmetry. (That is, the mode locally looks more and more like f(x)sin(ky), which has a translational symmetry in the y direction.)
Life would be a little cruel if you can’t find a spectral flow that even works numerically. Spectral flow is related to one’s faith in the quasimodes in the first place.
28 March, 2007 at 9:27 am
Terence Tao
Greg, that’s a good point. Though the interesting thing is that the quasimode f(x) sin(ky) does not seem to be concentrated in a single pure mode, but is instead a superposition of O(1) pure modes; conversely, each pure mode consists only partly of the translation-symmetric quasimode but also has a significant fraction of chaotic junk as well. The basic problem is that in the quasimode, f(x) needs to vanish to high order near both open ends of the rectangle; however, to be a pure mode, f(x) “wants” to be an eigenfunction of
. These two facts are incompatible (since f of course needs to be non-zero). However, because smooth compactly supported functions have rapidly convergent Fourier series, one can express f(x) as a very rapidly convergent sum of pure modes of the interval, and so by analogy we expect something similar in the stadium.
So the spectral dynamics should not be spread out over a huge-dimensional space, but instead should be focused on some low-dimensional “cabal” of O(1) modes, which trade fragments of the quasimode between themselves. This makes things a bit more promising, though it’s still not clear how to proceed. Perhaps one approach is not to focus on a single mode but rather on the space spanned by all the relevant modes (or maybe some associated projection operator, i.e. a mixed state rather than a pure state). This may be a bit more stable than trying to evolve a single mode.
Part of the problem stems from the fact that the operator
, despite appearances, does not actually commute with the Laplacian
, due to effects coming from the non-horizontal portions of the boundary. One can localize that operator to within the rectangle with a cutoff function, but then it is the cutoff which doesn’t commute with the Laplacian. Indeed the uncertainty principle then forces a spectral spreading of O(1), as discussed in the main post, which by the Weyl law heuristic then incorporates a non-trivial number of pure modes (indeed, even in the rectangle, the quasimode f(x) sin ny is split up among the pure modes sin mx sin ny for m = O(1)). If one could tighten the spectral bandwidth from O(1) to o(1) then one might be in much better shape, but this is already looking difficult.
28 March, 2007 at 9:35 am
Greg Kuperberg
I can’t quite tell if you’re still saying that spectral flow stories are outright false (you won’t sustain scarring) or merely difficult to prove. You could also be saying that spectral flow is a spurious model on the grounds thta whathever you could prove with it, you could also prove without it.
28 March, 2007 at 9:38 am
Greg Kuperberg
Oh, sorry, I read the middle paragraph more closely; I get the discussion now.
28 March, 2007 at 9:59 am
Terence Tao
Greg, my guess is that you will have to somehow get either
or
involved in the spectral flow equation (which will force one to leave the world of single pure modes and work with more mixed objects), otherwise one will be faced with the identity switching problem.
A model case may illustrate the issue. Let A and B be two Hermitian matrices of size n, and consider their direct sum
, which is of size 2n. The spectrum of this sum is simply the union of the spectra of A and B. Now suppose we evolve A and B until one of the eigenvalues of A crosses one of the eigenvalues of B. What happens to the eigenstate of A under naive spectral flow in this case? Well, the flow becomes singular at the moment of collision.
But this is a highly non-generic case. Suppose we now add an epsilon of noise to the direct sum. This lets one avoid the codimension-2 set where eigenvalues collide; instead, the eigenvalues will bounce off of each other after getting within epsilon (or maybe square root of epsilon) of each other (the famous repulsion-of-eigenvalues effect). But then, if we evolve eigenvalues continuously, an eigenvalue which was previously close to an eigenvalue of A, would become something close to an eigenvalue of B instead after a collision; hence, an eigenstate close to an eigenstate of A will, after the collision, flow to an eigenstate close to that of B. The state which was close to that of A is now assigned to an eigenvalue
with a different value of k, because of the crossing. There has been an identity switch.
My guess is that a numerical scheme which somehow can detect when this switching has occurred (e.g. by using the above momentum operators) may be fairly stable. As long as only O(1) pure modes are present in the spectral band of the quasimode, things should be OK. As mentioned in the main post, though, the enemy is if the special energy levels
have some very weird “attractor” property, in that many more eigenvalues get attracted to these levels than Weyl’s law would predict. One can’t have too many (more than n) in this band, as that really would contradict Weyl’s law, but with our current understanding of the error term in Weyl’s law (and even with the conjectured best bounds on this term) one still can have far too many modes in this band for comfort. This shouldn’t happen, at least for generic values of n, but I have no idea how to prove that would be the case.
Incidentally, one can phrase this problem as a question of determining how stable the method of separation of variables is in solving PDE. This method trivially lets us determine the eigenfunctions of a rectangle; why shouldn’t some variant of it still give some control on eigenfunctions of a near-rectangle?
28 March, 2007 at 1:45 pm
Greg Kuperberg
I certainly understand that symmetric matrices with eigenvalue multiplicity have codimension 2, that Hermitian matrices with multiplicity have codimension 3, and that spectral flow can end up permuting the eigenvalues adversarilally. These principles show up, for example, in adiabatic quantum computing.
I think that you and may agree that there could be truths about spectral flows that could make this problem easier to solve, unlike for example the Navier-Stokes question, where a variety of truths are known to make the problem harder to solve.
28 March, 2007 at 2:03 pm
Terence Tao
Oops, you’re right; the codimension in the Hermitian case is indeed 3, not 2.
Certainly I rate this question as significantly easier than Navier-Stokes – not that that is really saying all that much. The problem here is one of linear constant coefficient elliptic PDE, rather than a nonlinear parabolic PDE, and the domain is about as simple as one can get without actually being able to get explicit closed-form solutions. The domain even has a little bit of symmetry, though this is unlikely to be terribly useful (and it does cause a minor amount of headache here, for instance many of the eigenvalues here will acquire some trivial multiplicity due to the symmetries).
Spectral flows are very pretty, but the fact that they don’t react well to eigenvalue collisions or multiplicity is somewhat disconcerting. It comes down to the fact that, from a functional calculus point of view, it doesn’t actually make much sense to talk about things like “the
eigenfunction”, because this doesn’t depend smoothly on the operator when multiplicity occurs. Whereas aggregate objects such as heat kernels, resolvents, wave propagators, etc. are much more robust, and make a lot more sense from the functional calculus perspective. So we can control eigenmodes on the average by very stable and robust methods, but it isn’t obvious that we can do something similar for individual eigenmodes.
Perhaps one needs to transform the question to a very different looking question which looks more stable. For instance, it is tempting to perform a Fourier expansion inside the rectangle and a spherical harmonic expansion (i.e. a Fourier expansion in polar coordinates) in the wings. Matching the two boundaries leads to a problem which looks very vaguely like an Anderson localisation problem, which is a fairly stable phenomenon in generic cases. I played around with this a bit in the past but didn’t really get anywhere, though.
28 March, 2007 at 7:34 pm
Greg Kuperberg
Actually there is something that I don’t quite understand about the existence argument that you credit to Heller and Zelditch. Is it supposed to apply to rectangles? At first glance, most rectangles do satisfy QUE.
28 March, 2007 at 7:53 pm
Terence Tao
Hmm, you’re right, I oversimplified the argument a bit.
It’s true that for rectangles with irrational sidelengths, all eigenfunctions, being plane waves, are uniformly distributed in physical space. I was hiding however the full definition of QUE – one has to look at the distribution of the eigenfunctions not just in physical space, but in momentum space and more generally in phase space. The plane waves are incredibly highly concentrated in momentum space, and rectangles are not quantum ergodic (which is unsurprising given that they are also not classically ergodic).
The formal definition of quantum ergodicity and QUE is a little technical; one can ask that the Wigner transform of the eigenfunction, suitably normalised, converges weakly to the uniform distribution; or one can ask that
for all pseudo-differential operators (or quantum observables, if you will) a(X,D) of order 0, where
is the integral of the symbol of a on the energy surface (the cosphere bundle). More informally, you want to see “quantum chaos” – the eigenfunction should be in all positions and all momenta at once.
The Heller-Zelditch argument for the rectangle would start with a quasimode, which is highly concentrated in frequency space (aka momentum space), and conclude the existence of pure modes with the same property; this would count as scarring, albeit in frequency space rather than physical space. One could pull something similar off in the stadium, provided that there are not too many eigenvalues in the relevant spectral band.
28 March, 2007 at 11:29 pm
Greg Kuperberg
With these remarks, I only have more trouble believing the story of out-of-control spectral flow. That story makes a lot of sense for many kinds of spectral flow, for example an overly optimistic adiabatic quantum algorithm. But in this case, suppose that you have a particle in a convex region which is somehow a pregnant rectangle. (A regular hexagon, a stadium, etc.) If a state is highly localized in momentum space, perpendicular to the parallel sides, then the pregnant regions in position space have to be blank. But if these regions are blank, then it is stable as you reshape the pregnant sides. If you execute spectral flow from a perfect rectangle, then the momentum must delocalize either fairly quickly or not at all.
29 March, 2007 at 5:52 am
Gil Kalai
Dear Terry
This is a wonderful post. I have a (rather generic) question: is looking at the problem at high dimensions (even asymptotically somehow) can make things easier regarding the negative direction or even regarding some aspects of the positive direction?
29 March, 2007 at 7:12 am
Terence Tao
Dear Greg, I suppose I should have said “concentrated” in frequency space rather than “localised”; the particle has a significant portion of its
norm located in the vertical frequency, but also a significant portion outside. Look for instance at the second and third eigenfunctions on the final image of the main post. At a PDE level, this is coming from the compatibility condition between the eigenfunction equation on the half-open rectangle (which wants to concentrate vertically) and the semicircular wings (which want to distribute the mode more evenly in phase space). Physically, I guess a quantum particle in this mode spends part of its time resonating vertically, then leaks out to bounce around arbitrarily around the stadium for a while, then returns to resonating vertically. (Of course, one can’t really use classical language to describe a mode like this, so one has to take this analogy with the usual grain of quantum salt.)
It’s because of this that I suspect that the spectral flow is going to be rather subtle. As we discussed earlier, the crossing of two eigenvalues of even two totally unrelated modes can cause a switch as long as there is even an epsilon of coupling between the dynamics of the two modes. One has to somehow restrict to the much smaller set of modes which exhibit at least some concentration in the desired vertical frequency before one would get a reasonable dynamics.
29 March, 2007 at 7:19 am
Terence Tao
Dear Gil,
Intuitively, the equation should behave worse in higher dimensions, because the relationship between the energy and the scaling becomes increasingly unfavourable. Certainly the equation is more unstable; but it is hard to convert instability to actual blowup (it’s the “Maxwell’s angel” problem discussed earlier – the instability might have the improbable effect of always rescuing the solution just before it blows up). Conversely, in two dimensions, where the energy becomes critical, the result of Beale-Kato-Majda shows that one has global smooth solutions for Navier Stokes.
Another popular model to study is the hyperdissipation model, in which the dispersive factor
in the Navier-Stokes equation is replaced by a power
, where
is a parameter which serves a similar purpose to dimension, in that it determines the relationship between energy and scaling. The level
is the threshold beyond which the energy becomes subcritical and one has global regularity. (I’m not sure what happens exactly at 5/4, but perhaps Nets does.) In this paper it is shown that as
moves from 5/4 continuously down to 1, the (upper bound on the) dimension of the singular set at blowup time moves continuously from 0 to 1.
29 March, 2007 at 8:40 am
Greg Kuperberg
Sure, if you want me to speak quantumly: x and y momentum are commuting quantum random variables, so you may as well consider them to be classical random variables, unless you were to attempt a joint distribution with variables that they don’t commute with. (*) I agree that no eigenstate of the Hamiltonian can be completely localized in a bounded region around +k and -k in momentum space, nor even localized with exponential decay.
(*) And non-commutation with position is sufficiently gentle that there is a kind of joint distribution, the Wigner function.
However, from two of the example plots, it looks like there exist eigenstates with pretty good momentum localization, i.e., with adequate polynomial bounds. (I admit that I have no idea how good they are.) There also exist eigenstates with less good momentum localization.
I also admit that I have no idea how to rigorously prove any really good theorem in PDEs; I have no practice. That said, I’m going to provocative and conjecture that spectral flow, as you get just by changing the shape of the corral from a square to a stadium, works. I think that it can be simulated without much more difficulty than solving the Laplacian with one shape. If it is really true that the spectral flow is merely subtle and not provably anomalous, then maybe the thing to do is just try it and see. (Could that qualify as agreeing to disagree?)
29 March, 2007 at 9:02 am
Terence Tao
Actually, I can come up with a rigorous argument as to why spectral flow won’t work. The basic problem is that spectral flow preserves the index k of the eigenvalue, which when combined with Weyl’s law will force an energy shift, whilst the quasimode stays at a roughly fixed energy.
Let A(t) be a Hermitian operator depending on a time parameter t; in this case, A(0) might be the Laplacian on the rectangle and A(1) the Laplacian on the stadium, with some continuous deformation in between. (Yeah, I know, the domains aren’t the same. We can fix this in a number of ways, e.g. conformally mapping all domains to a single reference domain, or something.) Suppose
is the
eigenvector and eigenvalue of A(0), normalised in
. If we then start spectral flow to create the time-evolved eigenvalues and eigenvectors
and
, one observes that barring an actual eigenvalue collision (which is unlikely, since this is a codimension 3 set that one is trying to hit with a dimension 1 trajectory), the eigenvalue
will retain its position as the
eigenvalue; for instance, we will have
. If there is an eigenvalue collision, then of course the spectral flow becomes singular and the whole strategy is moot anyway.
Now we have Weyl’s law, which asserts that the number of eigenvalues of a Laplacian
in a domain
with eigenvalues less than
is asymptotically
. This implies the asymptotic
.
Now let’s start the spectral flow with
the unit square and
, then
and
. We flow this to time 1, where
is now the stadium. Because spectral flow preserves k, we still have
, so by Weyl’s law the energy has dropped to
. But the stadium has area strictly larger than that of the square, and so the spectral flow has left one with an eigenstate of energy far lower than the energy band
where the quasimode
lives in. Thus the spectral flow has necessarily flowed away from the quasimode.
Now one can try to repair this by flowing from the stadium to a rectangle of equal area to the stadium, which would be better, but I don’t think it fixes things perfectly. The basic problem is that in order for spectral flow to work, the number of eigenvalues less than the eigenvalue of the scarred state in the stadium needs to be exactly equal to the number of eigenvalues less than the eigenvalue of the scarred state in the rectangle. Due to the error term in Weyl’s law, one is unlikely to hit this number exactly on the nose.
p.s. because of the effects of the boundary of the stadium, the x and y momenta operators don’t quite commute, but it’s close enough that one can still meaningfully talk about a joint distribution as long as one doesn’t try to localise to too fine a scale (one is pretty safe as long as one tolerates uncertainties of more than O(1) in both momenta).
29 March, 2007 at 9:44 am
Greg Kuperberg
Actually, I can come up with a rigorous argument as to why spectral flow won’t work. The basic problem is that spectral flow preserves the index k of the eigenvalue, which when combined with Weyl’s law will force an energy shift, whilst the quasimode stays at a roughly fixed energy.
Sigh. I was provocative all right, but I only provoked you to rigorously disprove my thinking. :-( I agree, there is no reason to believe that the qualitative attributes of the eigenvalues are strictly compatible with their ordering.
You’re absolutely right, it takes more thought to properly select the right eigenstates under any kind of spectral flow. Even a low-dimensional space of eigenstates is not good enough, by the same ordering argument. It could perhaps be interesting if you thought of a joint spectral flow of some kind involving two or more commuting operators. It isn’t worth a numerical simulation without more thought.
p.s. because of the effects of the boundary of the stadium, the x and y momenta operators don’t quite commute
Okay, this one is a purely semantic point, but I maintain that d/dx and d/dy do commute with each other. What you are saying is that they don’t quite commute with the Hamiltonian. That is perfectly true, but what it means is that when the Hamiltonian is definite, d/dx and d/dy have a (joint) distribution.
29 March, 2007 at 10:44 am
Terence Tao
Okay, this one is a purely semantic point, but I maintain that d/dx and d/dy do commute with each other. What you are saying is that they don’t quite commute with the Hamiltonian. That is perfectly true, but what it means is that when the Hamiltonian is definite, d/dx and d/dy have a (joint) distribution.
This point confused me for a long time when I first learnt it, but just because d/dx and d/dy commute on a dense subspace of
, e.g. the smooth functions supported on a compact subset of the interior, does not mean that their functional calculi commute with each other. You can see this already by thinking about the eigenfunction equations for d/dx and d/dy. (These operators are not elliptic and so don’t have discrete spectrum, but this exercise is illuminating nonetheless.) To be an eigenfunction for d/dx, the eigenfunction must look like a sine wave on each horizontal slice of the stadium. Similarly, to be an eigenfunction for d/dy, the eigenfunction must look like a sine wave on each vertical slice. Due to the funny geometry of the stadium and the Dirichlet boundary conditions, these two conditions are incompatible.
In fact there is a nice theorem of Fuglede which asserts that on a domain
, the partial derivative operators
have a joint functional calculus if and only if
enjoys an orthonormal basis consisting entirely of (normalised) plane waves. He then conjectured that this latter condition holds if and only if
tiles Euclidean space by translations (note that the stadium manifestly does not have this property, whereas the rectangle does). This conjecture was proven for convex bodies in the plane by Iosevich, Katz, and myself, but is false in higher dimensions (a result of myself, as well as one of Kolountzakis and Matolcsi).
29 March, 2007 at 11:01 am
Greg Kuperberg
My conventions are a bit different from this, but I believe that they work. Instead of looking at
, I would still define
as operators on
. Then they do commute exactly and have a meaningful joint distribution.
Now you could object that the Hamiltonian
does not make sense on this larger Hilbert space. But you can extend it to an operator whose spectrum lies in
, by assigning infinite energy to any state outside
. Another way to describe
is that
is a perfectly good bounded operator.
Another way to describe the momenta is that the projection from
to
changes
from a Hermitian operator to a POVM (on the smaller Hilbert space,
). The POVMs still commute.
Now, your operators
are also perfectly valid mathematically, although I think that they are not self-adjoint (or rather, non-normal) due to the boundary conditions. If so, they are technically not allowed as quantum random variables. However, you may have reason to like them better otherwise for the purpose of obtaining estimates in analysis.
29 March, 2007 at 11:06 am
Greg Kuperberg
The POVMs still commute.
Sorry, argh, they probably do not. But they have a mutual refinement, a positive operator-valued measure from momentum-space
to
. That’s the real trick: POVMs don’t strictly have to commute to admit a mutual refinement.
31 March, 2007 at 2:10 pm
Hongjie
Dear Terry,
3D Navier-Stokes equation is globally well-posed when \alpha=5/4. More generally, n D NSE is globally well-posed when \alpha=(n+2)/4. One proof of this can be found, for example, in the paper http://arxiv.org/abs/math/0104199 you mentioned. But it is not explicitly stated there…
31 March, 2007 at 8:25 pm
selling waves » Blog Archive » links for 2007-04-01
[...] Quantum mechanics and Tomb Raider Terence Tao again: “In trying to come up with a classical conceptual model in which to capture these non-classical phenomena, we eventually hit upon using the idea of using computer games as an analogy.” (tags: math quantum_mechanics tao tomb_raider mathematics physics science) [...]
31 March, 2007 at 9:27 pm
Gil Kalai
Is there a way to define/explain quantum ergodicity and quantum unique ergodicity in the context of graph Laplacians? (I suppose that since these are asymptotic properties they apply to sequences of graphs rather than a single graph; like the notion of expanders.)
31 March, 2007 at 10:29 pm
Terence Tao
Dear Gil, I believe the analogous notions for graph Laplacians are as follows. Let
be a graph sequence of increasingly large graphs; one should probably think of these graphs as having bounded degree, with polynomial growth of metric balls, otherwise the concepts below are likely to be trivial or useless. For each fixed k, let
be the
eigenfunction of
, normalised to have
norm equal to 1. Then one would say that the graph sequence is quantum uniquely ergodic if for each k, the eigenfunction density
is asymptotically uniformly distributed on large balls
in the graph metric, thus for arbitrary choices of
and
we have
as n goes to infinity (with k kept fixed). Quantum ergodicity would be the same statement but one would be allowed to exclude a set of k’s of density zero. For instance, a graph sequence which was disconnected with multiple components of positive density would not be quantum ergodic or quantum uniquely ergodic in the above senses.
I believe (but haven’t checked the details) that the above notions of quantum ergodicity (ignoring for now the technical distinction between physical space and phase space) correspond to the continuous notions defined above if we discretise the domain
in the obvious manner, namely replacing the domain by the finite set
for some large n, and then connecting any pair of adjacent vertices in this set (separated by either a horizontal or vertical displacement of 1/n) by an edge to form the graph
.
Incidentally, there appears to be some literature on “quantum graphs”, for which notions of quantum ergodicity and quantum unique ergodicity can be defined, but this seems to be a slightly different concept.
1 April, 2007 at 11:02 pm
Gil Kalai
Thanks, Terry. Usually the graph of the discrete cube is an interesting example to look at/start with. Here, with the naive definition it has QUE because all entries of all eigenvalues are plus or minus one. I suspect that with the notion of “phase spece” you referred to, it might be decorated with many scarrs related to its faces to the extent of desrving the title “scarface”. So maybe the discrete cube is a good example to demonstrate these notions and extra subtelties.
2 April, 2007 at 8:36 am
Terence Tao
Dear Gil,
I don’t think I can transplant the notion of phase space properly to the graph theory setting; I would need some notion of “momentum” or “frequency” to assign to the eigenfunction, which seems to require a Riemannian geometry structure (which one needs to define the notion of a plane wave). We do have a metric coming from the graph, but it isn’t obviously Riemannian unless one adds additional structure to the graph (e.g. one embeds it into a Riemannian manifold, such as the stadium, but then we are leaving graph theory and returning to geometry). The faces are not exactly the problem; for the discretised torus one also gets eigenfunctions which are evenly distributed in physical space but are highly localised in momentum space. (I do like the term “scarface” though, I wish I could cook up some situation where I coud introduce the term naturally. :-) )
2 April, 2007 at 5:02 pm
sum-product
two papers related to this problem came out
http://arxiv.org/abs/math/0703676
http://arxiv.org/abs/math/0703614
3 April, 2007 at 5:01 pm
Anonymous
Hi Terry,
Didn’t you use c(\beta) to denote two different things? You started with defining c(\beta) as the triangle-removal-lemma constant, but later say that a better bound for c(\beta) may imporve the triangle removal lemma.
4 April, 2007 at 10:21 am
Y. Charles Li
Hi, Terry
This is a very nice article, will be helpful especially to fresh Ph.D.’s.
About the scaling argument, simple as it is, it does dictate important
estimates, e.g. for 3D NS, Leray obtained: For any initial condition u(0),
when
t > T = C \nu^-5 || u(0) ||^4_L^2
no more singularity. Using the scaling,
|| u(0) ||^4_L^2 —> \lambda^2 || u(0) ||^4_L^2 , T —> \lambda^2 T
as dictated by the scaling t —> \lambda^2 t. Pretend this is 2D
(of course, 2D global regularity has easier argument),
|| u(0) ||^4_L^2 —> || u(0) ||^4_L^2 , T —> T
while t —> \lambda^2 t by scaling. So any local solution can be rescaled to global.
I’m also glad you start to appreciate “turbulence”. I wrote a thing on
Mathematical Intelligencer:
http://www.math.missouri.edu/~cli/Nature-T.pdf
It should have appeared in hard copy. It is a follow-up of the famous article by
Ruelle and Takens:
http://www.springerlink.com/content/h1760361517×10h2/
I hope it brings chaos “closer” to turbulence. How close? it is hard to judge.
From a different perspective, the case is like the “global warming” problem,
the debate of whether turbulence is chaos is more or less over, the question is:
What can one do about chaos (turbulence) in terms of physical “reachable” description?
I wrote something along this line which is too immature to make public. I asked
Joel Lebowitz to read it for me and he is reading it.
Best Regards
Charles
4 April, 2007 at 10:26 am
Y. Charles Li
Ruelle and Takens paper link is not working properly:
D. Ruelle, F. Takens, On the nature of turbulence, Comm. Math. Phys., 20 (167-192),
23 (243-244), 1971.
Y. Charles Li
4 April, 2007 at 11:27 am
Terence Tao
You’re right; I’ve changed the wording in the post to try to clarify the point. The hope is that this problem is easier than improving the triangle removal lemma bound directly, but perhaps a solution to this problem will then lead the way to find a new way to prove triangle removal.
5 April, 2007 at 4:47 pm
vanvu
Dear Greg,
Your arguments seem to say that most pairs of points have distance at most 35 from each other. I never played Rubik, but I guess the underlying graph is vertex-transitive. In that case, the histogram actually would imply that most pair of points have distance at most 20 from each other, as the distribution of the distance between two random points is the same as the distribution of the distance between a fixed point and a random one. Best, Van.
5 April, 2007 at 10:48 pm
Anonymous
The Rubik’s cube graph is actually a Cayley graph of the Rubik’s cube group. So yes, it is vertex-transitive, in fact freely transitive. My question is not about most pairs of points, it’s about all pairs of points. (I.e., the graph diameter or worst case.) If it is true, as the histogram almost proves, that the distance between 97% of pairs is 18 or less and the distance between 25% of pairs is 17 or less, then the distance between all pairs of vertices is 35 or less. So my question is whether you can use approximate group theory somehow to improve this immediate but pretty inference.
5 April, 2007 at 10:55 pm
Greg Kuperberg
Oh oops, I accidentally signed that as “anonymous”!
6 April, 2007 at 9:07 am
Simons Lecture I: Structure and randomness in Fourier analysis and number theory « What’s new
[...] lectures at MIT. (These lectures, incidentally, are endowed by Jim Simons, who was mentioned in some earlier discussion here.) While preparing these lectures, it occurred to me that I may as well post my lecture notes on my [...]
6 April, 2007 at 1:53 pm
van vu
I don’t understand your point. The histogram is about a set of random samples, how can it say something about all pairs ? In theory, the only thing a small set of random samples can say is that the number of bad pairs are small.
6 April, 2007 at 3:20 pm
Greg Kuperberg
The point is that the histogram is a probabilistic verification of a mathematical assertion. If there existed even one pair of points in the cube whose distance is 36, then the probability of making such a histogram would be less than
. Or if you are not satisfied with that histogram, you could make a bigger histogram whose odds of occurrence are less than
. Think about it.
Anyway, my question is whether one can improve such a probabilistic verification using better group theory ideas.
6 April, 2007 at 5:48 pm
stevenm
Hi Terry,
Your chosen theme “dichotomy between structure and randomness” is certainly a very interesting one not only in mathematics but in theoretical physics, complexity and biology: fluctuations or randomness will give
rise to the growth of structure, whether it is galaxies arising from
randomness in the early universe, or the complexity and the evolution of biological structures. The effect of random noise on multi-dimensional nonlinear systems is also of much interest, for example in turbulence or pattern formation. As regards the pseudorandom objects you mention, an open problem should surely still be whether one can truly distinguish
between a pseudorandom sequence and a truly random sequence, without knowing the generating algorithm that was used and the state in which it was initially prepared. The security of most encryption algorithms I assume would be based on the assumption that it is actually totally impossible to distinguish between these. (Is it though?). Von Neumann remarked that those who think they can generate random numbers using arithmetical algorithms are “in sin”. But since his time some rather sophisticated pseudorandom number generators have evolved such as the ‘Blum Blum Shub’, the ‘Fortuna’ and the ‘Mersenne Twister’. Could the output of one of these be distinguished–within a tractable sample size–from a truly random number sequence that has been generated via a physical process such as thermal noise or a quantum process like radioactive decay? Monte Carlo simulations in physics and finance etc. will assume that a pseudo-random sequence is good enough for the simulation of processes that are expected to be truly random. I was also curious whether your interests have ever included, or currently include, stochastic partial differential equations (SPDEs) or “stochastic geometries”; geometries or geometric structures that fluctuate or evolve in a random way?
7 April, 2007 at 8:51 am
Terence Tao
Dear Stevenm,
There is certainly an analogy between algorithmic pseudorandomness, and the type of pseudorandomness that I will talk about in these lectures (Fourier pseudorandomness (Gowers uniformity), ergodic pseudorandomness (mixing), graph pseudorandomness (regularity), and dispersive pseudorandomness (local decay). For instance, if one views an algorithm as a finite state machine, then the running of this algorithm does resemble a kind of discrete version of a dynamical system. I am not an expert in computational complexity, but it appears though that the analogous questions for algorithmic pseudorandomness are significantly more difficult to answer, touching on major unsolved problems such as
and
.
There is a heuristic conjecture that the primes (or more precisely, the Möbius function
) are so “algorithmically pseudorandom” that they having vanishing asymptotic correlation with any “finite complexity” bounded sequence a(n), though I am not sure I can define what “finite complexity” means precisely (something like the output at time n of a dynamical system which can be described algebraically using only finitely many symbols). This conjecture, even if it is stated rigorously, is probably beyond the reach of current technology.
Incidentally, I do have a project on random quasiconformal geometries, which should appear here in a few months.
7 April, 2007 at 9:01 am
Simons Lecture II: Structure and randomness in ergodic theory and graph theory « What’s new
[...] the last lecture we saw that sets of integers could be divided (rather informally) into structured sets, [...]
7 April, 2007 at 10:11 am
Greg Kuperberg
The conjecture that
is much more to the point of algorithmic pseudorandomness than
. There is a theorem that
relative to a random oracle, i.e., that
if
is a randomly chosen Boolean function. The exponential notation means that that
is supplied to the computer as a black-box function. The proof is the classic, original derandomization argument. If you run the randomized algorithm with the pseudorandom source with enough trials, and then combine the results with majority rule, then consensus will be wrong for only finitely many inputs with probability 1, with respect to the random choice of
. An algorithm that is only wrong for finitely many inputs can be repaired to one that is always right.
Much of the work in showing that
has gone into replacing
with a cryptographic function
. If you imagine using
is the above derandomization argument, then the requirement on
is exactly that it should be cryptographically pseudorandom. Because, you can pick the algorithm to be anything that attempts to distinguish
from a truly random
. (Note the two uses of the word “random” here.
means algorithms that are allowed randomness on the fly, whlie
is deterministic algorithms with a randomly chosen, infinite ROM table.)
Essentially, people believe that
because they believe that universal pseudorandom functions
exist. In fact, they can make specific examples, they just can’t prove that they work. I have the feeling that there are weaker circumstances that still prove that
, but this is a summary of what people think is really going on.
7 April, 2007 at 10:22 am
Dave Purvance
In the time evolution of a viscous, incompressible flow’s Fourier modes convection’s quadratic nonlinearity is expressed as a convolution of an unknown flow unshifted and shifted in wavenumber. When the unshifted portion is left to be some unknown function and its shifted counterpart is assumed to be a time series of finite order, and, when a discrete span of wavenumbers are considered simultaneously, then the three-space Navier-Stokes evolution equations become a large matrix differential equation. As the order of the time series and the number of wavenumbers considered get large, this Navier-Stokes matrix differential equation becomes ever closer to the continuous three-space Navier-Stokes evolution equations. Using conjugate symmetry and similarity transforms, it is argued at “http://arxiv.org/ftp/math/papers/0610/0610086.pdf” that the exact solution to the Navier-Stokes matrix differential equation is a stable matrix exponential and that the assumed time series is its truncated Taylor expansion in time.
Care to comment?
7 April, 2007 at 10:45 am
Terence Tao
Dear Dave,
This is an instance of the type 1 of Strategy 3 listed in my post above: “Using weaker or approximate notions of solution”. A localisation of the frequency space is essentially equivalent via the uncertainty principle to a discretisation of physical space. As discussed above, there is indeed no difficulty creating a global solution for the approximated equation. When however one tries to take a limit to create a global strong solution for the original equation, it is necessary to obtain a uniform subcritical or critical bound on all approximate solutions (i.e. to achieve Strategy 2) in order to extract a smooth limit. Otherwise, the best one can achieve are the global weak solutions of Leray.
7 April, 2007 at 1:48 pm
Terence Tao
Dear Greg,
This is an interesting question. My guess is that one will have to understand the representation theory of the Rubik’s cube group in order to be able to convert these sorts of probabilistic statements into rigorous ones. There is a nascent theory of quasirandom groups, as pioneered by Gowers, which asserts roughly speaking that if a finite group G has no low-dimensional representations, then its multiplicative structure is highly expansive, thus we expect for instance A.A.A to be much larger than A if A is already rather large to begin with. This type of result probably can be used to show that large balls intersect each other rather often, which might indeed lead to some diameter bound. This type of thing has for instance been carried out for
recently by Bourgain and Gamburd.
If there are small representations, there could be the very unlikely scenario that all the random elements you selected happen to lie in one corner of the representation – e.g. in a normal subgroup of small index. Then the data you have collected does not “span” enough of the group to definitively pin down the diameter.
7 April, 2007 at 6:08 pm
Greg Kuperberg
First, my own perspective is that the probabilistic statements actually are rigorous, in a way: They are entirely rigorous probabilistic verifications rather than rigorous proofs. The situation is similar to that of numbers that are known as “probable primes”. It is not the same kind of lack of rigor as a physics discussion where many things have not been defined. So let us say that it would be good to change probabilistic verifications into deterministic proofs. This is true — although it is known using less elegant combinatorial methods that the cube group diameter is at most 27.
But I also think that it would be exciting to find a new probabilistic verification that the cube group diameter is 26 or better. In the competition between probabilistic verifications and deterministic proofs, the former can never do worse; it would be fun to have more examples where they can do better. Unfortunately the diameter < 35 result was a temporary victory.
Anyway, for either purpose, I agree that it would be nice to know the representation theory of the Rubik’s cube group. But that isn’t mysterious. The Rubik’s cube group is an index 12 subgroup of the disassembly group. That group is the Cartesian product of the corner group and the edge group. The corner group is the wreath product of S8 and C3, while the edge group is the wreath product of S12 and C2. The reason that the cube group has index 12 is that the corner rotations and edge flips satisfy congruences, while the parity of the edge permutation has to equal the parity of the corner permutation.
Note that the center of the cube group has order 2. The central involution is called the “superflip” move or position. It is known to have distance 20, and for all I know it is the unique position of distance 20. There may be a lesson in that too.
I realize that it is a bit dull to prove a theorem (or probabilistically verify a theorem) about one specific finite case instead of infinite sequences. But I think that this particular one is an interesting test case for new ideas. Among other things, it robs you of the license to prove estimates with bad constants.
7 April, 2007 at 9:34 pm
rcourant
Dr. Tao, what do you think about Richard Courant as a mathematician?
7 April, 2007 at 9:47 pm
rcourant
Terry,
Do you believe that financial markets are predictable in the long run? Black and Scholes came up with the following:
which gives the price of a derivative
at a time
.
Thanks!
7 April, 2007 at 10:51 pm
Emmanuel Kowalski
Concerning exponential sums over primes, I wonder what you might think about exponential sums involving sqrt(p), in particular what kind of “conspiracies” or
natural problems they could be related to?
The point is that Iwaniec, Luo and Sarnak have related such exponential sums
to the Riemann Hypothesis for certain modular L-functions (see
http://www.numdam.org/numdam-bin/fitem?id=PMIHES_2000__91__55_0
specifically their Hypothesis (S), discussed in Section 10, and Appendix C;
technically, good cancellation in sums with sqrt(p) uniformly over arithmetic
progressions would lead to a zero-free strip, under some condition).
There is definitely a group underlying their work, namely SL(2,Z).
8 April, 2007 at 6:24 am
Dave Purvance
Dear Terence,
Thanks for your reply.
Unless something strange happens to a matrix eigendecomposition as the matrix size becomes infinitely large, convection remains oscillatory (purely imaginary eigenvalues) for all time. Convection is therefore absolutely bounded by the magnitude of the initial condition. Viscous shear (negative real eigenvalues) diffuses these convective oscillations to zero as time goes to infinity. The only problem I see in taking a limit here is that convective oscillations may become extremely (infinitely?) fast.
Is this the limit problem you are talking about ?
8 April, 2007 at 8:21 am
thomas1111
About prewiewing comments (which I was interested in too): this feature is not available at wordpress.com for security reasons, according to this answer of a wordpress.com person.
On the other hand wordpress.org releases software for people to run a blog either on their own server or on recommended hosts. In both cases this requires some unix/linux administrator skills, especially on security issues in the first case.
In that setting, previewing comments is then possible using plugins: there are many possibilities, for example this one has live preview. Besides,
display in the main posts is also available via this pluging (requires apparently a small manual change in the code to enable LaTeX in comments too). But I’m not sure how the LaTeX then interacts with comment preview, probably some further hacking is required (which almost means back to square one as far as I’m concerned, so I gave up on that).
8 April, 2007 at 9:23 am
Terence Tao
Dear Emmanuel,
Thank you very much for your comment! I had heard of the ILS paper before but had not had a chance to seriously look at it.
There are two sorts of conspiracies: blatant conspiracies, which for instance prevent
for various “algebraic” functions
, and the much weaker spoiler conspiracies, which instead cause just enough non-randomness to prevent
. A single zero of an L-function off the critical line, for instance, is a spoiler conspiracy, whereas one at 1 or at 1+it is a blatant conspiracy. I don’t really have the courage to state what spoiler conspiracies are possible – and the Hypothesis S in ILS is of a “no spoiler conspiracy” type. But it seems plausible that the only blatant conspiracies that the primes have are the “obvious” ones, coming from local considerations.
The story may be more complicated if one restricts to a thinner set than the full integers or the full primes, such as polynomial sequences f(T), where T ranges over the integers. For instance, in the “finite field model” in which the integers are replaced by a polynomial ring
over a finite field, Conrad-Conrad-Gross have a blatant conspiracy that shows that certain one-variable polynomials f(T) are biased to not be prime despite avoiding all obvious local obstructions (e.g. reducibility) – though this particular conspiracy is very much a feature of the finite characteristic of the underlying field. Then of course there are all the Hilbert 10th problem constructions, such as polynomials whose positive values are precisely the primes, and so forth. So I would be hesitant to make any grand conjectures on sparse sets, such as those given by polynomials or group actions, at present (though I hear that Bourgain, Gamburd, and Sarnak have made some very recent advances in this direction, at least regarding almost primes instead of primes).
8 April, 2007 at 10:12 am
Terence Tao
Dear Dave, this is part of the problem, but more importantly is the fact that as one transitions from finite-dimensional state spaces to infinite-dimensional ones, being bounded in one norm no longer implies being bounded in another. Thus for instance it is conceivable that limit remains bounded in energy, but is no longer smooth, which is the whole point of the Navier-Stokes global regularity property. (Bounded-energy global weak solutions were constructed all the way back in 1933 by Leray, essentially by a variant of the method you describe, but it is known that weak solutions need not be smooth.) In particular, the energy may concentrate in finer and finer length scales until a singularity occurs in finite time. This singularity is not entirely visible from the finite dimensional approximations, as they only have finitely many scales in the first place; it is a phenomenon which emerges in the limit.
In order to obtain regularity control on the limit, as opposed to merely a weak solution, it is not enough to bound the energy (which is a supercritical quantity); one also needs to bound a critical or subcritical quantity, and no such global bounds for arbitrarily large data are currently known at present (this is the “Strategy 2″ I discuss above).
8 April, 2007 at 12:01 pm
Simons Lecture III: Structure and randomness in PDE « What’s new
[...] Hamiltonian PDE, such as the Schrödinger equation , which are heuristically related (via Liouville’s theorem) to measure-preserving actions of the real line (or time axis) , somewhat in analogy to how combinatorial number theory and graph theory were related to measure-preserving actions of and respectively, as discussed in the previous lecture. [...]
8 April, 2007 at 12:02 pm
Simons Lecture III: Structure and randomness in PDE « What’s new
[...] For , one has global smooth solutions for small data with either sign. For large data in the focusing case, finite time blowup is possible. For large data in the defocusing case, the existence of global smooth solutions are unknown even for spherically symmetric data, indeed this problem, being supercritical, is of comparable difficulty to the Navier-Stokes global regularity problem. [...]
8 April, 2007 at 5:00 pm
Paul Krapivsky
Sorry, this is the last try…
Dear Terry and Emmanuel,
Another potentially interesting problem is the distribution of gaps in the sequence of fractional part of square-roots of primes. If one considers the sequence {n^a}, with some 0
8 April, 2007 at 8:41 pm
Terence Tao
Dear Paul,
I think your comments are turning the < signs into HTML. You can try using < and > instead. I know, it’s a pain, I wish I could do something about it…
9 April, 2007 at 3:56 am
Emmanuel Kowalski
I definitely agree with the distinction between “spoiler” and “blatant” conspiracies.
In classical prime number theory, it seems that most “blatant” conspiracies
are really symptoms of ignorance: they can/should be incorporated to the “main
term” (but of course that can be wishful thinking…), or used to reformulate the original
problem (e.g. p-1 can not be prime infinitely often, but (p-1)/2 can…).
Another (maybe) interesting point is that when we deal with prime numbers, sometimes
difficulties are really inherited from the integers. This is very clear when using sieve methods; for instance, in the type of problems that Bourgain, Sarnak and Gamburd are investigating (looking at points with prime or almost prime coordinates in orbits of certain groups actions, etc) even the corresponding counting problem for integers are hard, but they more or less have to be used as a starting point for looking at primes with sieve.
On the other hand, their problems, as far as I understand, do not actually involve sums
over primes (or integers) with functions like sqrt(p) — and correspondingly, the source
of cancellation lies in spectral gaps of various types (Selberg’s theorem, approximations to Ramanujan-Petersson conjecture, and especially expanding graphs in the more combinatorial settings, etc).
9 April, 2007 at 5:45 am
One side of "our" difficult problem (i.e. turbulence, did you think about something else?) « Alex’s Blog
[...] Why global regularity for Navier-Stokes is hard « What’s new [...]
9 April, 2007 at 10:08 am
Paul Krapivsky
Dear Terry and Emmanuel,
Another potentially interesting problem is the distribution of gaps in the sequence of fractional part of square-roots of primes. If one considers the sequence {n^A} of natural numbers n=1,…,N, and multiplies the gap lengths by N, the gap distribution becomes exponential in the large N limit for all exponents A (between 0 and 1) except A=1/2. This is an astonishing experimental observation; for A=1/2 the gap length distribution was recently found by Elskies and McMullen, and it is indeed very different from exponential — it exhibits two `phase transitions’ and it has an algebraic tail. It is possible of course that the replacement of the whole numbers by primes merely makes the problem difficult but does not alter the gap distribution.
9 April, 2007 at 11:20 am
Allen Knutson
I once saw a talk announced at Harvard by X-Y Z. I’m pretty sure it was Serre not Yau. Serre lecturing tended to shut down math departments all over Cambridge.
9 April, 2007 at 1:38 pm
Simple Country Boy
I once encountered the violinist Hillary Hahn playing incognito at Seattle’s Pike Place Market. Her music was fabulous. I barely paused as I did not want to intrude in her space. I later regretted not stopping.
When I was an undergraduate, Feynman came to my school for a visit with a small class. He made it very clear that nobody outside the class should know of his visit until after he had gone back to Caltech.
9 April, 2007 at 3:17 pm
JL
The article begs two questions: Would the circumstances have been different had Bell played in the Boston T or the New York subway? (I think the answer is at least a mild yes.) And, isn’t this one of the underrepresented joys of being an academic? Many of us work just as hard as our counterparts in the business world, but still we are not as busy as them, in a sense- we have more of an opportunity to stop and appreciate the beauty around us (so we come in ten minutes late to a meeting, or miss the intro of a talk…)
9 April, 2007 at 4:21 pm
Terence Tao
Dear JL, I think the WP deliberately avoided all attempts to make the setting more favourable for Bell (e.g. by choosing a more relaxing time, by choosing more crowd-pleasing music, or a better venue). The experiment seemed to be whether first-class concert-quality music, shorn of all supporting props and context, could by quality alone catch the attention of the average passer-by. Unfortunately the answer seemed largely to be no, though it would have been interesting (and more scientific) to have some sort of control experiment at the same time, such as an “ordinary” busker at a different part of the subway.
There is also the reverse experiment, in which one gets a busker to perform under Joshua Bell’s name at a concert (cf. the old joke about Einstein’s chauffeur). Presumably this would not have gone down well with the ticket-paying audience, however. :-)
9 April, 2007 at 4:43 pm
Deane Yang
I would not expect an average person, especially one who does not normally listen to classical music, to take particular note of Bell’s playing. But among the thousands of people who passed through, I would have expected more than a few amateur musicians who have a sufficiently well developed ear to be completely wowed by Bell’s playing. And I agree with JL that the results would have been different in Boston or Manhattan, but it’s also very likely that someone would have recognized Bell by sight.
I agree that an equivalent experiment involving math talks would be a lot of fun, but this seems hard to set up properly. You would want to set up a situation where, say, average mathematicians happen to walk into a math talk and see if they can appreciate the quality of the talk despite not knowing who is talking. One crucial difference, however, is that for many of us the best math talks are not ones where the speaker displays his or her virtuosity but where the speaker is able to present the ideas in a particularly simple and transparent way. I have never heard you talk (unfortunately, I arrived in Paris last summer too late to attend your talks there), but your writings definitely display this quality. I did hear Okounkov speak last summer, and these were among the most beautiful and yet easiest to understand talks I have ever heard anyone give.
There is, of course, no chance that an average passerby has any chance of differentiating between a Fields Medalist and a complete charlatan.
10 April, 2007 at 1:47 am
Nets Katz
Terry,
Once, about eight or nine years ago, I invited you to give a colloquium at UIC.
We were working on something pretty hard and I didn’t want to be disturbed
by trivia during your visit, but I got a frantic visit from the local senior harmonic analyst. No one was going to come to a colloquium in analysis. I had
to compose an e-mail to the department describing you and urging people to come. To my annoyance, I spent fifteen or twenty minutes composing this
e-mail, writing exactly what I thought. An hour later, this colleague came back
and told me that what I had written was entirely too effusive. He demanded
some embarassing changes to tone it down.
The colloquium was on the equivalence between Bochner Riesz and Restriction problems. It was well attended and it was a good talk. I think
the only person in the audience who appreciated it besides me was Henri
Gillet. I felt somewhat humiliated that no one wanted to go out to dinner with
us. The general consensus, I learned later, was that I had made a particularly
uninspired choice of colloquium speaker.
One does not have to look far to find examples of professional mathematicians not appreciating a lecture they have not been told to appreciate. Conversely,
I could provide a fairly dramatic example of a mathematician receiving
undeserved hype, far and wide, on the basis of a promotion campaign.
But it wouldn’t be appropriate for this blog.
Nets
10 April, 2007 at 2:42 am
陳力恒 Chan, Lik Hang Nick
Another analogy:
I can’t do maths under exam condition. For two reasons: I work too slow and I can’t memorise things well. That’s why I prefer assignments.
If the quality of music is affected when it’s played in the subway, it’s like we do make mistakes in our calculation in a exam. I believe this is generally true. But it’s not true for Terry, I guess, judging from his performance in the IMO.
10 April, 2007 at 4:07 am
Pseudo-Polymath » Blog Archive » Morning Highlights
[...] Terrence Tao notes an interesting human experiment involving one of the worlds best violinists and a subway at What’s New. [...]
10 April, 2007 at 6:26 am
Deane Yang
Of course, there is one critical difference between us and violinists that I overlooked. The quality of being a great violinist is the exactly the same as the quality of being able to give great performances. But the quality of being a great mathematician is not at all the same as being able to give great talks.
So we all too often suffer through completely incoherent and impenetrable talks just because of who the speaker is, and we sometimes skip out on much better talks, because we’re not sure who the speaker is. I’ve made both mistakes far too many times.
10 April, 2007 at 7:53 am
Allan Greenleaf
Terry,
The Washington Post experiment was interesting, but basically a journalistic stunt. After it was over, Joshua Bell could return to his life of a fully-booked calendar and recording contract. There are many skilled musicians of Bell’s caliber or close to it, who live his pretend life on a daily basis. I’ve visited Helsinki a number of times in the last few years, and there are a number of truly amazing classical musicians playing for whatever passersby will throw in their instrument cases. From their appearance and age, I would guess that most of them are refugees from the former Soviet Union, where they may have had positions (dare I say tenure?) in the good old days.
10 April, 2007 at 7:55 am
Anonymous
“This theorem relies on Zorn’s lemma, although it is possible to give a proof of the recurrence theorem without recourse to the axiom of choice. The proof requires various tools from infinitary analysis (e.g. the compactness of integral operators) but is relatively straightforward.”
Can you give us a link or reference?
10 April, 2007 at 9:02 am
Terence Tao
Dear Anonymous,
Furstenberg’s original 1977 paper proved the recurrence theorem without Zorn’s lemma, and with some effort can be made fully choice-free. The later proof of Furstenberg-Katznelson-Ornstein in 1982 is perhaps more elegant, proceeding via the Furstenberg structure theorem, but that theorem of course requires transfinite induction (Zorn’s lemma) to prove.
10 April, 2007 at 10:19 am
Not Even Wrong » Blog Archive » Math and Physics Roundup
[...] of the highest possible quality, see Terry Tao’s postings on his Simons lectures at MIT, here, here and [...]
10 April, 2007 at 10:23 am
Not Even Wrong » Blog Archive » Math and Physics Roundup
[...] the highest possible quality, see Terry Tao’s postings on his Simons lectures at MIT, here, here and [...]
10 April, 2007 at 10:24 am
Not Even Wrong » Blog Archive » Math and Physics Roundup
[...] For some more mathematics blogging of the highest possible quality, see Terry Tao’s postings on his Simons lectures at MIT, here, here and here. [...]
10 April, 2007 at 10:37 am
pspacexnoesis
Interesting experience,
though I wonder how many were really interested in classical music, and for the ones who were, how many did not only listen to it, but did also attend to ‘classical music concerts’ and so on, and thus ‘made’ their ears to fine classical music… My guess is that it leaves only a few people really able to judge and recognize, the talent of a musician (such as Bell), during a rush hour, passing close to him only for a some seconds. (and not thinking that it sounded not to bad because that was the only songs he learned to play).
But the experiment still keeps its flavour, and shows how people aren’t as ‘open-minded’ as they may think, and tend not to give even a glimpse when that’s not for their interest, or imediate profit.
Maybe, that if they had put a poster saying: “Maestro Joshua Bell, internationnaly recognized as the finest Violin virtuoso” a lot more people would have stopped by, maybe giving him money, or asking for autographs, pictures or whatever, just for the heck of showing-off at diner-time while telling that to their relatives, when in fact they did not know who this man was :)
And as Furukawa said in the article, flipping quarters at him, that’s very insulting, him or anybody else, that’s something I wouldn’t do either.
Something similar happened to me, while in a Baduk club, some mid-level player was showing off, showing all his “knowledge” to beginners and how he was right (and stupid), a friend of mine, in the top-level amateur players decided to go argue with him about some patterns, the result was that the guy always though he was right thinking everybody in the room was dumb enough to believe him, well he’s been taugh a lesson later on… If he had known who he was before, his behavior would have been just the opposit.
-pxnm.
10 April, 2007 at 11:34 am
Teresa
Beautiful, ‘Joshua Bell-like performances’ abound in our ‘ordinary’ existence. All we need to do is take notice with an open mind. I always find it fascinating that people would go to great lengths or make unusual sacrifices just to take a look at a lunar eclipse, for example. It’s like people are programmed into thinking that since an eclipse is a once-in-a-blue-moon event that it must be spectacular and not to be missed. In reality, however, it is not nearly as beautiful and wonderful to watch (to me at least) as a beautiful, glorious sunset, for example. Yet, how many people are fascinated by an ‘ordinary’ sunset?
10 April, 2007 at 1:47 pm
ST
Nets,
You can now invite Terry back to give the same colloquium to complete the control part of the experiment.
Also, kindly decline dinner invitations from those who didn’t go out last time.
10 April, 2007 at 3:24 pm
Dave Purvance
With the best of respect, I really believe the new bound provided by the Navier-Stokes “matrix” differential equation is the [-1,1] bound on convection (purely imaginary eigenvalues). Convection “chirps up” in frequency in all powers of increasing time, but stays bounded in magnitude by [-1,1]. Viscous shear (real negative eigenvalues) smoothly diffuses these highly nonlinear oscillatory chirps to zero as time goes to infinity. There needs to be no mention of energy.
And with this I’ll quit babbling…
11 April, 2007 at 3:12 am
Chui’s counterpoint » Blog Archive » Parallel Universes and Lara Croft
[...] Tao explains how parallel universes and quantum mechanics can be possible by way of Tomb Raider. In trying to come up with a classical conceptual model in which to capture these non-classical pheno…. The exact choice of game is not terribly important, but let us pick Tomb Raider – a popular game [...]
11 April, 2007 at 8:47 am
Why is many-worlds winning the foundations debate? « Quantum Quandaries
[...] April 11, 2007 at 11:47 am | In Quantum, Philosophy, Uncategorized | Almost every time the foundations of quantum theory are mentioned in another science blog, the comments contain [...]
12 April, 2007 at 7:39 am
Mahdi
This reminds me of a similar story in chess.
13 April, 2007 at 10:57 am
stevenm
For parabolic PDE like the heat equation, can one construct entropies to describe these dispersions or flows? In your example, you define the ‘Dirichlet energy’ but is there an entropy that can be constructed and associated with the removal of the random components from the structured components that you described? I think in one of Perelman’s papers he discusses an entropy in connection with the Ricci flow, which seems analogous to heat flow. (Although the details of his papers are beyond my understanding.) A good example of an entropy in Lorentzian (spacetime) geometry would be that associated with black holes and black hole mechanics (leading to Hawking radiation etc.) where an entropy is naturally associated with the area of the horizon, which is a purely geometrical entity.
The algebraic similarity of the heat/diffusion and Schrodinger equation is also quite interesting I think. On a historical note, Schrodinger in 1931 originally considered diffusion processes or Brownian motions, described by the parabolic diffusion/heat equation, as a basis for quantum mechanics. He attempted to formulate Brownian motions in a symmetric form of time reversal, which (supposedly) helped motivate Kolmogorov to pursue some of his own (now classic) work on stochastic processes. But, as you mention, the diffusion/heat equation is not time symmetric whereas quantum theory is, but Schrodinger still felt that quantum theory must be some kind of diffusion or stochastic theory. Feynman’s path integral or sum over paths does also seem to have a very natural interpretation as a sum over mathematical Brownian motions if you make a Wick rotation. I don’t know what the status of this stochastic interpretation and connection with quantum mechanics is these days though.
13 April, 2007 at 11:48 am
Hoang
Dear Professor,
“but that theorem of course requires transfinite induction (Zorn’s lemma) to prove.”
I guess transfinite induction is quite different from Zorn’s lemma. Transfinite induction works with every well-ordered set regardless of Choice (of course this is the case of every set if we assume Choice). As far as I know, the Furstenberg structure theorem makes use of transfinite induction on the ordinal numbers, which are already well-ordered regardless of Choice.
13 April, 2007 at 9:38 pm
randomwalker
On anyone else’s blog this would be an excellent article, but you should be proving deep and awe-inspiring theorems rather than engaging in pedagogy for non-technical audiences.
Re. P \neq NP: were you comparing it with learning-parity-with-noise?
13 April, 2007 at 10:33 pm
Harald Hanche-Olsen
Randomwalker, I am sure that Terence Tao is perfectly able to prioritize his own time. His track record so far proves it. It’s an interesting phenomenon to me that so many successful people have an unusually broad range of interests, and somehow find the time to pursue them. I believe this may be a help, rather than a hindrance, since having broad interests is helpful both in finding good research topics and in solving them. (I might worry too, if this blog got filled up with the sort of post we see here, but that doesn’t seem likely.)
13 April, 2007 at 10:43 pm
Mark
tough crowd :-)
13 April, 2007 at 11:28 pm
randomwalker
Harald: you’re right. I should think before speaking :)
13 April, 2007 at 11:58 pm
marco
I hope Terence Tao will keep publishing posts of this kind: they are very well written, accessible to non-technical people, and full of hints for those who want to dig a bit deeper.
That’s more than enough for me to read them.
14 April, 2007 at 1:15 am
Anonymous
If I were the author, I would be shocked or even saddened by the first comment …
14 April, 2007 at 2:08 am
Radu Grigore
Thank you for this great post.
Here is something you might find mildly interesting. A month ago TopCoder organized a marathon (i.e., one week long) match in which people had to write a program that reconstructs a 100×100 grid of `densities’ based on the information gathered from 10000 (noisy) measurements. A few tried the Radon transform but none used it in the end because it was hard to implement and didn’t work well at all with so few measurements. I wonder if people would have done better if they would have known about compressed sensing and read some of the articles you link to here…
14 April, 2007 at 4:51 am
aravind
thanks for your detailed description as usual, terry. you say: “… two randomly chosen 100,000-dimensional subspaces of a 300,000 dimensional ambient space will be almost certainly disjoint from each other”. i must be missing something, but from the “birthday phenomeon”, isn’t the above true only if the two subspaces have dimension up to about sqrt(300,000), i.e., much smaller than 100,000?
14 April, 2007 at 10:46 am
Brian Benson
randomwalker, to expand on Professor Hanche-Olsen’s comments, it may seem naively true that the opportunity cost of Professor Tao writing this post is quite high; however, this argument is a rather short-sighted. It is my interpretation that the post above seeks to increase overall public awareness of and interest in both Professor Tao’s work and the relatively new field of compressed sensing. This increase in public awareness is likely to increase the demand for further works by Professor Tao and promote future work in the area of compressed sensing. This is not to mention the educational value of the post to the community as a whole.
14 April, 2007 at 2:17 pm
Igor Carron
Radu, the current results seem to indicate that you need to know exactly the random functions or masks onto which the measurements are made. In other words, the amount of information needed to reconstruct the signal includes all the random functions or the masks on which the original signal was projected on. Therefore, being given only a signal corrupted with noise is not, as far as I understand, a case where one could reconstruct the original signal. Furthermore, nothing in the topcoder problem statement says anything about these densities being sparse in any basis. If these densities are random spikes, then less than 10000 random masks won’t do any good with the current results. For instance, if one uses a one-pixel camera to image some random brownian motion in a microscope application, it would take a long time to get a full picture. It might be better to flip those micro-mirrors one at a time!
14 April, 2007 at 2:54 pm
Greg Kuperberg
aravind: No, subspaces are different from subsets. Technically speaking, the subspaces will be disjoint with probability 1. You should instead consider a suitable notion of approximate intersection, so that if one of the angles between the subspaces is merely moderately acute, then that counts as tantamount to an intersection. A “moderately acute” angle is typically an angle less than, say, 80 degrees.
Nonetheless, if you have two 1/3-dimensional random subspaces of a large vector spaces, the chance of even an approximate intersection is exponentially small. This phenomenon arises, for instance, in quantum computing, where subsets are commonly replaced by subspaces for various purposes.
14 April, 2007 at 3:08 pm
Greg Kuperberg
Brian Benson: You could be right about the merit of Terry’s posts here, but in a way you’re also reading more into the question than is really there. I have met a few Fields Medalists such as Terry, and the truth is that they put their pants on just like everyone else. They can procrastinate, or get distracted, or otherwise be ordinary human beings.
On the other hand, it is true that Terry in particular does seem to be more efficient in his work than many other mathematicians. It is also true that many award-winning scientists and mathematicians get inundated with attention, most of which is perfectly respectable, but it gets to be too much and they are forced to prioritize.
14 April, 2007 at 3:58 pm
Top Posts « WordPress.com
[...] Compressed sensing and single-pixel cameras I’ve had a number of people ask me (especially in light of some recent publicity) exactly what “compressed […] [...]
14 April, 2007 at 6:23 pm
Terence Tao
Dear stevenm,
That is a good question, and I do not know the answer, though given that entropy and the heat equation both show up in thermodynamics, one would expect there to be some sort of connection; presumably it should be known to experts in probability. Certainly the heat flow has many interesting monotonicity properties, especially for non-negative solutions (which are the ones which have a physical Brownian motion interpretation), and I would not be surprised that an entropy-like expression has some nice monotonicity formulae. Perhaps one has to somehow lift the problem to very high dimension (e.g. taking repeated tensor products of the heat flow solution with itself) to see the connection more clearly, as entropy tends to be associated with various asymptotic statistics in the infinite-dimensional limit.
Perelman did introduce an entropy to study Ricci flow (indeed he interpreted Ricci flow as a gradient flow for this entropy), and Hamilton had earlier introduced another entropy-like quantity
which also enjoyed a monotonicity property under certain conditions.
It does seem helpful to think of quantum mechanics as a kind of complexified (i.e. Wick rotated) version of classical Brownian motion, in which paths which increase the Hamiltonian are phase rotated rather than damped. There are also some formal similarities between manipulations of bras and kets, and the laws of conditional probability. But as I said before, these analogies seem well suited for understanding the algebraic structure of quantum mechanics, but not its analytic structure.
14 April, 2007 at 6:31 pm
Terence Tao
Dear Paul,
Gap distributions in any statistics related to primes are of course very interesting, but also well beyond our current technology; they seem to require knowing an incredible amount of “fine-scale” or “local” behaviour for the primes, whereas our current analytic knowledge primarily extends to rather “coarse-scale” or “aggregate” statistics; even the generalised Riemann hypothesis, while it would dramatically improve our coarse-scale control of the primes, doesn’t actually do much for the fine-scale questions (e.g. the twin prime conjecture doesn’t seem to become much easier on GRH). Even finding the right conjectures to make for, say, the asymptotic distribution of prime gaps
, is remarkably subtle. Perhaps the best way to proceed here is simply by numerics, to see if anything unexpected happens – I would guess that the gap distribution for {p^A} would be similar to {n^A}, especially near generic (e.g. Diophantine) points in [0,1], but it could well be that some unusual irregularity emerges.
14 April, 2007 at 7:04 pm
Terence Tao
Actually, now that I think about it, I do know one explicit link between entropy and heat flow: one can interpret the Fisher information of a random variable as the rate of change of the Shannon entropy via heat flow (or more precisely, the Ornstein-Uhlenbeck process, which is basically heat flow renormalised by scaling). I believe this is a useful fact in probability theory, particularly in showing that gaussians are extremisers of various probabilistic quantities, but it is sort of outside my area of expertise.
14 April, 2007 at 7:17 pm
Terence Tao
Dear Hoang,
That’s a good point, though I do view Zorn’s lemma as a generalisation of transfinite induction, which takes place on a poset rather than a well-ordered set. Indeed, one often uses Zorn’s lemma in the following formulation: Assume that a property P holds for at least one element in a poset (base case), that the limit of every chain of elements obeying P, also obeys P (limit ordinal case), and that every non-maximal element of the poset obeying P has a larger element also obeying P (successor ordinal case). Then at least one maximal element of the poset obeys P. In this formulation the similarity to transfinite induction is quite clear.
One can prove Furstenberg’s recurrence theorem using Zorn’s lemma in the above formulation, where the poset is the poset of factors of the original system, and P is the property that Furstenberg’s recurrence theorem holds for that factor. The Furstenberg structure theorem is also proven by Zorn’s lemma, though in this case it is easier to use that lemma in its more usual formuation (to construct a maximal distal factor of a system) than to try to prove this structure theorem in an “inductive” manner.
It turns out that the tower of compact factors that shows up in the Furstenberg structure theorem has an ordinal which is at most countable (and that any such ordinal can arise in this way – a result of Foreman and Beleznay), which probably means that the structure theorem can be proven just using the axiom of countable choice, though I haven’t checked this.
14 April, 2007 at 7:33 pm
Terence Tao
Aravind: there is an analogy between the birthday paradox (which regards subsets of a finite set) and the intersection of random subspaces (which regards subspaces of a finite dimensional vector space). The birthday paradox tells you that if you have a set with N elements and you pick two random subsets of size much larger than
, then there will very likely be a collision, but if you pick random subsets of size much smaller than
, then there is very little chance of a collision.
Very informally, you can think of a d-dimensional space as having
elements, and an r-dimensional subspace as having
elements. And, as with birthday, if you pick two random subspaces of dimension r bigger than d/2, there is necessarily a non-trivial intersection, whereas if you pick subspaces of dimension r less than d/2, there is almost surely no intersection except at the origin.
High-dimensional geometry has a number of unintuitive properties which distinguish it from low-dimensional geometry. One of them is that in very high dimensions, any two randomly chosen directions (or one deterministic direction and one random direction) are very very likely to be very close to orthogonal. (For instance, the angle between two random vectors in
is going to deviate by more than 1 degree from a right angle with a probability something like
– i.e. it is very likely to deviate in low dimension but almost never deviates in very high dimensions.) This fact is also related to the law of large numbers.
Randomwalker: There may well be a connection between parity-with-noise and sparsest-solution-of-linear-systems, but the connection I had in mind was that one can encode the subset-sum problem as a very special case of the sparsest-solution-of-linear-systems problem, showing that the latter problem is NP-hard in the worst case. On the other hand, it seems that this problem can be solved in polynomial time by basis pursuit in the “average case” for suitable definitions of “average case”. As I understand it, this is a fairly typical state of affairs (a general class of problems being NP-hard worst-case but P or P# in average-case).
14 April, 2007 at 8:26 pm
techbuffet
Thanks for the great explanation. I have heard about compressed sensing for a while, but could not quite wrap my mind around it. I think I understand the basic idea now.
15 April, 2007 at 3:11 am
sgrajeev
What a thoughtful and inspiring discussion!
Some questions
Are there other examples of supercritical PDEs that are better understood, or are the difficulties you mention generic to all such cases? e.g. \waveoperator\phi+\lambda\phi^p=0 for p big enough.
The scaling arguments you start with also appear in
quantum field theory where supercritical means something like `non-renormalizable’. Is there more to this analogy? If there is, it is bad news for proving regularity.
There is some geometry in the Euler equations: the fluid follows a geodesic w.r.t. the L^2 metric on the group of volume preserving Diffeomoprhisms (Arnold). Does this suggest anything for Navier-Stokes?
15 April, 2007 at 6:52 am
Terence Tao
Dear sgrajeev,
Most supercritical evolution equations are just as poorly understood as the Navier-Stokes equations, unfortunately, with the notable exception of the supercritical elliptic and parabolic equations for which there is a favourable sign which allows for some sort of maximum principle or comparison principle to take hold; basically, in such cases, the nonlinearity is extremely strong, but it always acting in a favourable direction, using its strength to reduce the size of the solution rather than increase it. However, in oscillatory evolution equations such as wave or Schrodinger equations it appears that there is no similar phenomenon taking place; the nonlinearity can “try” to reduce the size of its solution, but it is so strong that it can “overshoot” and end up changing the sign or direction of the solution and making it much larger at the same time.
There do seem to be analogies between classical PDE and quantum field theory (which can be viewed as a kind of quantum PDE) but this is definitely an underexplored area of study. For instance, the Cauchy problem for quantum field theory has not been studied much, even in linear models (perhaps it is a bad question to ask).
It is true that the Euler equation has some nice geometrical structure, and I would indeed think this structure will be key in the future understanding of this equation, and thus indirectly to Navier-Stokes. There is however a major obstruction with using the diffeomorphism group structure to understand Navier-Stokes, namely that this equation contains the Laplacian (via the viscosity term), which relies on the Euclidean (or Riemannian) geometry of space. This type of geometry is not preserved at all by diffeomorphisms. Because of the presence of two very different types of geometry in Navier-Stokes it seems significantly less likely that we get the same type of “geometric miracles” (e.g. unexpected monotonicity formulae) that we do in, say, Ricci flow, which only involves one type of geometry. But there could still be many surprises in this equation.
15 April, 2007 at 11:08 am
Ostrowski lecture: The uniform uncertainty principle and compressed sensing « What’s new
[...] mentioned in the previous post here, compressed sensing is a relatively new measurement paradigm which seeks to capture the [...]
15 April, 2007 at 12:17 pm
Simons Lecture I: Structure and randomness in Fourier analysis and number theory « What’s new
[...] first lecture, I describe the dichotomy as it appears in Fourier analysis and in number theory. (In the second, I discuss the dichotomy in ergodic theory and graph theory, while in the third, I discuss [...]
15 April, 2007 at 12:21 pm
barry Friedman
In the quantum case, is there some “physically observable” consequence for the existence of the scar states at “arbitrarily” high energy?
For classical billiards, such physical observables exist. For example, for the Sinai billiard, looking at the velocity autocorrelation function
(vacf) averaged over the microcanonical ensemble distinguishes the case of bounded and unbounded horizon. for unbounded horizon the vacf decays with a power law vs the exponential decay in the bounded horizon case. the difference in the 2 situations is the presence of arbitrarily long trajectories that don’t collide with the central obstacle. (analogous to the bouncing ball trajectories in the stadium). even though these initial conditions have zero measure, they still make themselves felt in certain averaged quantities.
Does something of this nature occur in the quantum case ?
15 April, 2007 at 1:02 pm
Rajeev
Thanks.
). Euler equations are the geodesic equations for this metric, according to Arnold.
. Often dissipative equations can be thought of as a Hamiltonian system with a complex valued hamilton
. (Sorry to plug my work here, but it is relatively recent, so may not be known: arXiv:quant-ph/0701141).
The Euclidean metric on the domain does go into the Euler equations as well; e.g., in determining the metric on the group of volume preserving diffeomorphisms. The length^2 of a tangent vector to the group is just the length^2 of the velocity field w.r.t. the Euclidean metric integrated on the domain of the fluid ( which is also the kinetic energy
Viscosity introduces the Laplacian in a more direct way. The viscous force is the gradient of the function
Not claiming that this gives a better control on the scaling properties of Navier-Stokes. But, it looks like Navier-Stokes has a geometrical meaning too, using complex geometry. Might be of independent interest.
16 April, 2007 at 8:00 am
Igor Carron
Thank you for this post. I am not a mathematician so please forgive my ignorance on that subject matter but you said:
“..For a single function f, this type of localisation of the Plancherel identity… uses the same phenomena that explain why Monte Carlo integration is so effective.”
Do you mean to say that Monte Carlo integration is working well because the function to be integrated is sparse ? In the affirmative, does this mean that if I were to know better the sparsity of a function to be integrated using a Monte Carlo technique, I could, in principle, accelerate the convergence rate of that integration ?
16 April, 2007 at 9:58 am
piotr
Nice post! One comment: in addition to camera, MRI, astronomy, and linear coding, random linear measurements also have applications to so-called “data stream algorithms”. For example, the signal x could represent a pattern of traffic passing through a network router, with x_i being the number of packets, say, going to destination i. Since x is very high-dimensional, one cannot really store it explicitly. However, one can store the measurement vector Ax instead, and use it to recover an approximation of x. The linearity of measurements is crucial here, since it enables maintaining Ax under linear updates to x. E.g., if you see a new packet with destination i, you can easily compute A(x+e_i)=Ax + Ae_i from Ax. So everything stays in the measurement space, so to speak.
16 April, 2007 at 10:03 am
Terence Tao
Dear Igor,
Here we are summing the Fourier transform of f (or more precisely,
) rather than f itself. The uncertainty principle tells us that as f is sparse, the Fourier transform of f should be “smooth” or “structured” (or “low complexity”). For these types of functions, Monte Carlo integration works rather well (though there are other integration schemes that are even faster).
Compressed sensing methods are only moderately fast from a computational perspective; their main strength lies in the fact that they can give accurate results (after reasonable amounts of computation) from relatively small amounts of data; they are thus valuable in situations where computation is relatively cheap but measurement is expensive. These types of algorithms, for instance, would allow one to compute the integral of a function f exactly (not just approximately) if one knew that the Fourier transform of f was sparse, and one sampled f at a fairly small number of randomly chosen values. (In fact, one would recover all of f, not just its integral.)
16 April, 2007 at 10:13 am
Terence Tao
Dear Barry,
A good question! I’m not entirely sure as to the answer. The thing is that (a) the scarred states will be very infrequent, as we know that almost all of the eigenstates do not exhibit scarring, and (b) there are also a large number of scarred quasimodes which behave physically like scarred states under either the wave equation or Schrodinger equation for bounded amounts of time, and could thus “spoof” the scarred states on these intervals. However it may be that the very long time singularity propagation behaviour of either the wave or Schrodinger equation may hinge on the existence of scarred states; if there are many such states one could imagine that a wave which is initially singular on a periodic trajectory (e.g. take a horizontal wave front moving vertically) will retain some residual version of that singularity for indefinite periods of time, whereas in the absence of scarring this singularity will eventually be “evenly distributed” across phase space, whatever that means. I don’t have a rigorous version of this statement though.
16 April, 2007 at 11:18 am
Igor Carron
Thank you for your time.
17 April, 2007 at 7:44 am
Daniel Doro Ferrante
There’s a nice book that deals exactly with this problem: Distinction: A Social Critique of the Judgement of Taste, by Pierre Bourdieu.
Perhaps the question to be posed should have been along the following lines: “If, rather than Joshua Bell — a classical musician, violonist — they had Michael Jackson or some American Idol winner, how would the crowd react; how different would the results be?” ;-)
This looks much more like a “sociological measure” than an intrinsic measure of quality, taste or merit [on Bell's part].
Cheers.
17 April, 2007 at 11:07 am
Igor Carron
Another comment. I was looking at the some of the most advanced work on modeling the brain, and there is an uncanny parallel (if not match) between at least one model of visual processing and compressed sensing.
T. Serre, A. Oliva and T. Poggio. A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Science, 104(15), pp. 6424-6429, April 2007
http://www.pnas.org/cgi/content/abstract/104/15/6424?etoc
The model is a layered one where at every layer, cells respond to a specific and increasingly complex stimuli. The interesting bit is that the model shows, with painstaking parallel with current biology understanding, that it is in fact one way (no need to compute all the coefficients first and then threshold them) and moreover, at every step there is a random picking of some of the stimuli to be used by the next layer.
As stated in the abstract, the model “can predict the level and the pattern of performance achieved by humans on a rapid masked animal vs. non-animal categorization task.”
17 April, 2007 at 12:29 pm
Chris Hillman
Absolutely fascinating experiment! Although I am not sure one can draw any conclusion beyond the obvious: urban commuters tend to be very preoccupied, and strenuously avoid noticing anything unusual in their environment. After all, on an urban sidewalk, “something unusual” is more likely to mean a scary and unpleasant encounter with a criminal or a madman than an unexpected gift like the busking Bell.
Some might be tempted to argue that this experiment supports the assertion that few Americans are capable of appreciating really fine music, if only because too few public school students have the opportunity to learn to play some instrument. I am not sure that this conclusion is wrong, but I don’t think this stunt provides much evidence for it.
Interestingly enough, I always become very uneasy in settings in which I feel everyone is “required” to pay very close attention, like museums or orchestral concerts. I conjecture that the man who realized this was no ordinary busker, and who reported feeling conspicuous, humbled, and oddly embarassed, might know just what I mean.
17 April, 2007 at 12:37 pm
Terence Tao
A small update: Yuval Peres has pointed out to me that there is a conjecture of Sidorenko which has a somewhat similar flavour. It asserts that given bipartite graphs
and
(one should think of G as large and H as small), the number of copies of H inside G (mapping
to
,
to
, and allowing repeated vertices) is at least
, which is the value attained asymptotically when G is a large random graph. This conjecture is known to be true for instance when H is a cycle, a path, a star, and for a couple other graphs. It also has an analytic formulation similar to the one given above for the triangle-diamond problem.
17 April, 2007 at 1:15 pm
Chris Hillman
Changing topics (but perhaps not leaving the subject of sharp inequalities), I am curious to know whether other fans of “counting with categories” a la Baez and Dolan, who are aware of Cameron’s conjectures, have seen a recent article of Pouzet.
(For those who have no idea what I am talking about: I claim one can rewrite Wilf, Generatingfunctionology in terms of “structors” or combinatorial species, a kind of “categorification” of generating functions, along the lines one would guess after reading Baez and Dolan’s paper. This is “merely” a change of perspective but I think it suggests interesting new questions. See the undergraduate text by Cameron, Permutation Groups, Cambridge University Press, LMS student texts vol. 45, 1999, for a generalization of Polya enumeration which lies at the heart of this program, plus a very nice introduction to random graphs. It would be interesting to know more about the “density of p-cycles in a random Schreier digraphs”, if one can make sense of that notion. Schreier digraphs generalize Cayley graphs to general actions by a finite group.)
17 April, 2007 at 3:43 pm
Terence Tao
Dear Chris,
This is quite interesting. Certainly categorification is useful in enumerative combinatorics (being a souped up version of bijective proofs, as Baez and Dolan note), but as far as I am aware it has not really played much of a role in extremal combinatorics. In particular, the Cauchy-Schwarz inequality, arguably the most basic of the fundamental inequalities in the subject, does not appear to have a categorified version. A more concrete problem: suppose
is a map from one finite set to another. Then the relative product
is known to have cardinality at least
, by Cauchy-Schwarz. In other words,
is at least as large as
. Is there a categorical proof of this fact (which, incidentally, is basically equivalent to Sidorenko’s conjecture for paths of length 2)? See also Tim Gowers’ discussion of the Cauchy-Schwarz inequality.
17 April, 2007 at 4:06 pm
As coisinhas interessantes de hoje… at It’s Equal but It’s Different
[...] Simons Lecture I: Structure and randomness in Fourier analysis and number theory; [...]
17 April, 2007 at 4:07 pm
As coisinhas interessantes de hoje… at It’s Equal but It’s Different
[...] Simons Lecture III: Structure and randomness in PDE; [...]
17 April, 2007 at 4:43 pm
Greg Kuperberg
It may not count as categorification, but I can think of an example in extremal combinatorics where it pays to enhance bijections, rather than to erase them as in the example of the fibered Cartesian square of a set. Namely, Peter McMullen conjectured the optimal inequalities for the f-vector (vector of face numbers) of a simple polytope; the conjecture was that the associated h-vector is unimodal. Richard Stanley proved this conjecture by building homology groups (the homology of a toric variety) whose dimension sequence is the h-vector, and then showing that the homology groups form a representation of the Lie algebra sl(2). Thus, the h-vector is unimodal because it is a weight diagram.
Another example of proof by enhancement of combinatorics is Lovasz’s celebrated proof of the Kneser conjecture, using a topological method to bound the chromatic numbers of graphs.
Perhaps what can be said about these examples is that they are fairly exacting bounds. Most, or perhaps all, of the inequalities become elementary if you relax them moderately. This is unlike Ramsey theory, where in a sense the inequalities cut far deeper than any elementary bound. On the other hand, I am not sure whether this is really a robust distinction.
17 April, 2007 at 7:57 pm
Terence Tao
Dear Greg,
These are good examples, which I wasn’t previously aware of; I concede that there are several important non-analytic (and non-algorithmic) techniques in extremal combinatorics, at least some of which should be categorifiable :-) . Certainly algebraic or geometric tools are good for proving monotonicity or positivity properties, which are of course an important approach to extremal combinatorics problems. There is also the “polynomial method” in extremal combinatorics, which is based on various generalisations of the basic fact “a polynomial of one variable of degree n can have at most n zeroes”; a particularly useful such generalisation is the combinatorial Nullstellensatz of Alon.
I hadn’t seen Lovasz’s argument before – it is very beautiful – but topological arguments have appeared in extremal combinatorics in some other contexts. For instance, the crossing number inequality is really useful in incidence geometry and in arithmetic combinatorics, and is ultimately proven by a combination of Euler’s formula and the probabilistic method.
18 April, 2007 at 8:22 am
Chris Hillman
Fascinating thread!
Re astronomy, here’s an interesting proposal: http://news.bbc.co.uk/2/hi/science/nature/6566317.stm One cannot but ask: what if these miniature gliders carried a single pixel camera and had some way to communicate with an orbiter which could relay information back to Earth?
Many years ago, I coauthored an argument which I hoped to present to ESA, via a private thinktank, to the effect that it makes good sense to move toward space exploration missions using a highly redundant fleet of small, simple, relatively cheap spacecraft each designed to do just one thing. As far as I know, this never went anywhere.
18 April, 2007 at 8:58 am
Chris Hillman
Well, I seem to have committed the cardinal sin of posting before lurking, so I don’t know if you (plural) have said this here before! I hope you’ll all bear with me while I try to catch up. (Also, I haven’t tried to use pseudolatex in wordpress before, but I’ll take a guess and hope for the best.)
IIRC, what Terry called “relative product” and Greg called “fibered Cartesian square of a set” is called “kernel congruence” in Mac Lane, Categories for the Working Mathematician, Springer, 1971, and it is a special case of a “pullback square”. As a relation on X, the kernel congruence
does indeed correspond to the kernel when f is a group epimorphism, say, in which case the preimages all have the same size and equality holds.
More generally, drawing a picture of the kernel congruence for a mapping from a finite set X to a finite set Y as an |X| by |X| by |Y| “box” made up of unit cubes in which the ordered pairs in the kernel congruence are darkened, and where the elements of X are listed in the order first preimage, second preimage, and so on, the cubes in kernel form square “briefcases” one layer thick and the number of these cubes is seen to the sum over the image of f of the squares of the preimages. So in the CS inequality

with equality iff (a_j) and (b_j) are linearly dependent, i.e. iff the preimages all have the same size, where the last bit is I take it the interesting part. That’s what you have in mind by “the CS argument”, right?
let a_j be the sizes of the primages and let all the b_j = 1. This gives
18 April, 2007 at 9:12 am
Terence Tao
Dear Chris,
Sorry about the latex issues; it seems that the wordpress site doesn’t let one preview comments. But I can manually repair the latex (in this case, the editor didn’t like the use of primes, x’, or of \operatorname{}). Worst case, just type in pseudolatex and I can clean it up as best I can.
That is indeed the Cauchy-Schwarz argument I had in mind; the question is whether there is a “categorical” version of this proof, in which the numbers a_j, b_j are replaced by richer objects such as sets. Incidentally, there is now a parallel discussion of this topic over at the n-category cafe.
18 April, 2007 at 1:02 pm
Chris Hillman
Thanks for fixing my markup! And for the link to n-category cafe, although unfortunately, I don’t have MathML functionality so I can hardly ever read anything there. Your blog seems to work much better for me.
I just read your ICM talk and am feeling suitably inspired. Or distracted— I seem to be getting farther OT with every word! A few comments/queries:
1. “Solutions to highly nonlinear systems should behave randomly after accounting for conservation laws, etc.” Would it be fair to guess that the KdV would be “modestly nonlinear” in this sense? (“Too many” conservation laws?)
What about the BKL conjecture in general relativity? (I don’t know of an adequate exposition— the Wikipedia article is seriously misleading— but one version says something to the effect that curvature singularities in generic vacuum solutions of the Einstein field equation resemble the strong spacelike curvature singularity which occurs in the so-called Mixmaster universe. The kicker here is that approach to the Mixmaster singularity is “regulated” by a simple continued fraction expansion!
BTW, I recommend the article by Lagarias in The Unreasonable Effectiveness of Number Theory, ed. by Stefan A. Burr, AMS, 1992 as a wonderful survey of how number theory is relevant to dynamical systems.)
2. The difficulty of “assessing” (much less “navigating”) large function spaces, such as solution spaces of nonlinear PDEs: I have found the paper by C.T.C. Siklos, “Counting solutions of Einstein’s equations”, Class. Quantum Grav. 13 (1996): 1931-1948 to be very intriguing— and perfectly maddening. If I can ever say something sensible about this I plan to post it in http://www.arsmathematica.net/ …
3. I just came across http://tosio.math.toronto.edu/wiki/index.php/Main_Page and I want one of those! (A pywiki.) I have felt for some time that the future of wikis is this: authorship tools for research groups or even single authors. So I am very glad to see this well-disciplined example of an “open notebook”!
I guess you know the website http://eqworld.ipmnet.ru/ ? As it happens, I have computed the point symmetry group of a whole bunch of PDEs, including many mentioned in these places, and AFAIK there is no website collecting this information.
19 April, 2007 at 11:29 am
Terence Tao
Dear Chris,
Thanks for the comments. I have only very fuzzy answers to some of your questions, but yes, I would consider KdV as being very close to linear in behaviour (indeed, with regard to suitable action-angle coordinates, or via the inverse scattering transform) the evolution is (formally) perfectly linear. In particular one doesn’t see turbulent behaviour, such as cascades of energy into increasingly higher frequency scales, which seems to plague other very nonlinear equations and which seems to generate a lot of “mixing” in the dynamics.
For perturbations of integrable systems one of course has KAM theory on one hand, so that there is a positive measure subset of phase space in which the dynamics continues to behave like an integrable system, and Arnold diffusion on the other, where one spreads out to cover most of the energy surface. I suppose this is some sort of intermediate regime between integrable and strongly nonlinear.
I am not an expert on mathematical relativity, though I can confidently say that the rigorous understanding of singularity formulation for the vacuum Einstein equations, while it is making good progress, is still very far away from the point where we can say or even conjecture any really good structural properties of a typical blowup. If one makes some strong symmetry reductions then one might be able to say something more (cf. Christodoulou’s proof of the cosmic censorship conjecture assuming spherical symmetry) but then it is arguable whether one is now considering “generic” behaviour. It is cute to see that number theory plays a role in these special solutions, though in some other contexts (e.g. periodic dispersive equations) it seems that many of these number-theoretic structures are fragile (e.g. they go away if a torus is replaced by a slightly perturbed domain).
Thanks for the link to eqworld, which I hadn’t seen before. It is good to see many different types of experiments regarding how to make good mathematical use of the Web; I feel that this is still an under-utilised medium in our subject, though it is not entirely clear how to systematically exploit it more effectively.
19 April, 2007 at 1:43 pm
chris forrester
regarding compression.. do you think that it would be possible to create an infinitely compressible system using arbitrary data?
that is.. do you think you could utilize a stream to provide a folding mechanism for the data itself?
19 April, 2007 at 2:17 pm
Kea
Another open problem in this subject is to find a succinct and descriptive name for the field.
How about M theory?
20 April, 2007 at 8:03 pm
Weiyu Xu
It is very nice to see you remain active in the compressive sensing area.
This topic is very interesting, but I would like to know what open problems still look interesting to the applied mathematics and singal processing society? It seems to me that most have been done,… I would appreciate it if you can share with us your opinions about the future directions of the compressive sensing area.
21 April, 2007 at 8:22 am
Terence Tao
Dear Weiyu,
It appears to me that research in this area is now being primarily driven by fairly concrete applications (imaging, MRI, etc.) rather than by theory; in particular, there is a lot of work on efficient implementation of algorithms, numerical simulation, testing on real-world signals, etc. But there are still some interesting theoretical questions of a rather vague and broad nature – these types of applied fields don’t seem to lend themselves to the kind of precise conjectures one sees in pure mathematics. One is derandomisation; all of the measurement ensembles for which the really strong compressed sensing results are known have to be generated by a random process; no deterministic compressed sensing method which is rigorously backed by theory is known. Another question is to see whether one can improve the two basic algorithms of basis pursuit and matching pursuit and obtain better results (in either accuracy, speed, or robustness); given the lack of lower bounds in this subject it is difficult to tell how close the current algorithms are to being optimal. A third is to relax the hypotheses on the measurement ensemble, in particular to allow for some limited amount of linear dependence (or near-dependence) between columns of the measuremement matrix.
22 April, 2007 at 8:16 am
John R Ramsden
I think in his final paragraph Chris Hillman gets close to the essence of the situation, except that when striding through subway passages and down escalators at a brisk pace most people are in the opposite frame of mind to one which favours “close attention” to a busker’s music, not to mention that they are literally passing the busker by and therefore unable to savour the music for long. Also, these days people who do like music on the hoof are likely to have an iPod wired to their ear.
But I do agree that a skilled musician busking is oddly embarrasing and, paradoxically, may not make as much at it as a hopeless discordant clown going through the motions. The reason I think is that if people recognize disproportionate talent and skill in such a humble occupation they feel that to offer a pittance is demeaning and condescending, even if with a moment’s thought they’d realize it might be just as welcome and needed as the coin they wouldn’t hesitate to throw in a more unskilled busker’s hat.
22 April, 2007 at 9:43 am
RD
Great writeup!!
But I do not quite understand the following:
Sadly, there does not seem to be an
application to P\neq NP; ………… be
treated by the above methods
Terry: Could you please give a few more details?
22 April, 2007 at 10:26 am
Sri Hari
Dear Tao,
I am novice in the field of mathematics and I want to ask what do you mean by “degrees of freedom” of a function space ? Is it the number of complex exponentials which make the signal I mean is it the dimension of the Fourier basis?
Thank you
22 April, 2007 at 7:17 pm
Doug
I do not have access to this paper, but look at a 1986 abstract for mathematical modeling of the honeycomb with Fourier Transformation:
Journal of Mathematical Biology
‘Mathematical model of honeycomb construction’ MR Bell et al
http://www.springerlink.com/content/n3rrj73658488631/
Although this is certainly at a scale different from the quantum, comparison of the Bell paper with your effort may be warranted?
23 April, 2007 at 1:20 am
De ‘Mozart van de wiskunde’ over succes
[...] heeft trouwens een weblog. Meestal schrijft hij wiskundige abracadabra, maar vorige week had hij het over het merkwaardige experiment van de Washington Post, om een wereldberoemde concertviolist met een Stradivarius in een [...]
23 April, 2007 at 7:37 am
Terence Tao
Dear RD,
Suppose you are given a matrix A and a vector b and want to solve the problem “find the sparsest vector x for which Ax=b”.
For certain matrices A (basically, those in which there are plenty of linear dependencies between a relatively small number of columns) this problem is NP-hard.
What the work of ourselves and others shows is that for other types of matrices A (where the columns are close to being orthogonal, e.g. random matrices typically satisfy this), the problem can be solved in polynomial time by linear programming methods.
Unfortunately, these two classes of matrices seem to be totally disjoint, so there is no obvious progress on
here. As I said in an earlier comment, this situation (NP-hard in worst case, P in average case) seems to be rather common.
23 April, 2007 at 7:38 am
Terence Tao
In this case, yes. More generally, degrees of freedom refer to the number of independent real numbers one needs to fully describe the state of an object; it can be made precise with various mathematical notions of “dimension” but in this post I am using it mostly in an informal heuristic sense.
23 April, 2007 at 8:57 am
RD
Terry: Thanks for the explanation.
I understand this is a new area and there are possibly
many open problems.
Is there any particular (doable!) open problem that you think is interesting?
23 April, 2007 at 4:12 pm
Andy
Dear Prof. Tao,
Thanks so much for taking time to write this post! I feel it is very helpful.
Regarding your comment on high-dimensional geometry, a couple of months ago, I was actually playing around with the angle \theta between one fixed vector and another randomly picked vector on n-dim unit ball and trying to calculate the probability Pr( cos(\theta)>1/n^(0.5-\epsilon)), I found when \epsilon>0, this probability decays to zero exponentially fast with n, and if \epsilon
23 April, 2007 at 5:29 pm
Stephen
I wonder if ideas related to these Honeycombs could help analyze adiabatic quantum algorithms. Eddie Farhi and collaborators proposed an adiabatic quantum algorithm for solving NP-hard problems. The algorithm provably works in the limit where the runtime is allowed to go to infinity, but it is not known what the minimal runtime is. This hinges on the gap between the smallest and second smallest eigenvalue of the Hamiltonian, which is the sum of two 2^n by 2^n Hermitian matrices, each of which have well understood spectra. (The runtime is upper bounded by something like O(1/gap^3).) Figuring out how this eigenvalue gap scales for large n is a longstanding* open problem. Although some feel it is unlikely that the runtime could be polynomial in the worst case (this would imply that NP is contained in BQP), it seems quite plausible that runtime is polynomial for random instances, and this would still be a very exciting result. Have you ever thought about this?
*by quantum computation standards!
23 April, 2007 at 7:23 pm
chris forrester
What about using arbitrary data’s harmonic alignments with one another to prove P != NP? If you’re into this idea, please check out http://nimbusgear.wordpress.com/
to get an idea of what i’m talking about.
From what i can tell, because of the abstraction levels, this is a really good way of representing the problem in terms of something that is incredibly easy to check, but may take forever to determine the answer of.
It’s a.. holistic.. approach. My own application was to produce a mechanism for the infinite compression of arbitrary data, and i think the mechanism i discovered/shaped will also be of use in the P != NP problem.
24 April, 2007 at 12:37 am
Gil Kalai
Are honeycomb related to amoebas (or non-archimedean amoebas) considered by Kontsevich, Mikhalkin, Shustin and various othres? (this is a “tropical” area, I suppose.) At least the pictures look rather similar.
24 April, 2007 at 6:00 am
Anonymous
Does this potentially provide a better way than bicubic interpolation to resize an image from N x N to 2N x 2N? If so, maybe Photoshop should include it as a filter.
24 April, 2007 at 7:21 am
Terence Tao
Dear Andy,
There is a glitch in the commenting parser: if you have a post which contains both > and < then everything in between gets interpreted as HTML and thus disappears. You can use > and < instead.
Dear Chris: there are significant theoretical limits as to lossless compression of arbitrary data, even if one asssumes infinite computational time. For instance, for digital data, a simple counting argument shows that any lossless compression algorithm must lengthen at least some input data. For analog data one can do something similar once one realises that measurement error (and other sources of error) are non-zero. (Incidentally, this is a common pitfall regarding attempts to prove P=NP; if one wants to run an algorithm using an analog computer, the algorithm will only be equivalent to a P algorithm if _all_ parameters are polynomial size; not just the data input and running time, but also the measurement accuracy, the energy consumption, temperature, etc.)
Dear Anonymous: In principle one could use compressed sensing techniques for image resizing, but I believe that special-purpose image processing tools customised for this specific task will perform better. Also, the N x N array of measurements do not look particularly “random” with respect to the wavelet basis that one expects the image to be sparse or compressible with respect to, and so these algorithms will not perform as well as they do in the random measurement case.
24 April, 2007 at 7:28 am
Terence Tao
Dear Stephen, I know some people are looking at the limiting distribution of “random” honeycombs in the large n limit, which should have connections both with random tilings and to free convolution, but I don’t believe there are published results on this problem yet (though it is an interesting direction to look into). There is also the question of what “random” means; typically for this type of analysis one assumes that the probability distributions of the input matrices are U(n)-invariant, which may be too strong for the quantum computing application.
Dear Gil, there is indeed a tropical amoeba interpretation of the additive honeycomb, due to David Speyer. Roughly speaking, one replaces the real line R by the ring of Puiseux series (to get the tropicalisation), and starting from a triple of matrices A+B+C=0 the honeycomb can be recovered by looking at the curve det(xA+yB+zC)=0.
24 April, 2007 at 8:24 am
oz
Thanx a lot for two very interesting posts.
I have a problem which appears to be conceptually similar to the ‘compressed sensing’ problems.
The problem is finding a sparse convex representation:
Suppose you’re given m points a_1,.., a_m in R^n with m > > n.
You’re also given a point b which is a sparse convex combination of the points, i.e. b = \sum_{i} x_i a_i where the x_i’s are non-negative and sum to one, and only k of them are non-zero (k < n, but still too large to enumerate all sets of k points …). The goal is to find the weights x -
this amounts to finding a sparse positive solution for Ax=b, (here A is the matrix whose columns are a_1,..,a_m) so to me it looks like the setting mentioned in the post, and my guess is that the two approaches (i.e. ‘greedy’ and ‘convex optimization’) could work here.
Any suggestions/known results for this problem?
Thanx,
oz
24 April, 2007 at 11:07 am
Terence Tao
Dear Oz,
Assuming that the a_1,…,a_m are dispersed more or less randomly, it seems that one can simply plug in either of the two algorithms mentioned above to find a sparse solution to Ax=b. It is not a priori positive, but if k < < n and the a_1,…,a_m are in general position then there can be at most one k-sparse solution, so if a k-sparse positive solution exists, this method will find it (assuming that the hypotheses necessary for the reconstruction method to work are satisfied, of course).
One could add the positivity constraints to the linear program one solves for the basis pursuit method, but my guess is that this will not significantly improve performance, though one should try some numerical experiments both ways to see if this is the case.
24 April, 2007 at 1:17 pm
Andy
Dear Prof. Tao,
Sorry for the above message and thanks a lot for the help :) I will just type my question again:
between one fixed vector and another randomly picked vector on n-dim unit ball and trying to calculate the probability
, I found when
, this probability decays to zero exponentially fast with n, and if
, it goes to one. Only if
, there is a nontrivial probability around 0.1576 and 0.1675 asymptotically. I couldn’t get a good interpretation of what is happening when dimension is high and why
is special. It looks like the surface area of the unit ball is becoming more and more concentrated around the equator when n grows big. I’m interested to know more about its relationship to the law of large numbers and more generally about high dimensional geometry. It will be great if you could point me to any reference or explain briefly. Thanks so much!!
Regarding your comment on high-dimensional geometry, a couple of months ago, I was actually playing around with the angle
24 April, 2007 at 3:27 pm
Denis Charles
Nice article! You are right about the gulf between the worst-case hardness and average case hardness of a problem. To my knowledge the widest such gulf (for a natural problem) occurs for the computation of Groebner bases. This problem is known to be EXPSPACE-hard in the worst case (best time bounds would be doubly exponential), whereas (conjecturally) on average the problem is polynomial time. The conjecture being a conjecture of Bayer on the Mumford regularity of ideals. I would be interested if anyone knows of a wider gulf for a natural problem. It is easy to create artificial problems where the worst case is at least as hard as the Halting problem and the average case is polytime by a technique called padding but that is uninteresting.
On the other hand there are natural problems where the average case hardness and worst case hardness are believed to coincide. A famous instance being approximating the shortest non-zero lattice vector in L_2 norm.
24 April, 2007 at 4:32 pm
Nets Katz
Terry,
I’m interested in learning more about NLS after hearing an inspiring talk by Ana Vargas at the BMC in Wales. She made a big point about the open problem of global solvability for mass-critical defocusing NLS in dimension greater than or equal 3, for which you, Visan, and Zhang obtained the result in the radial case.
Would you say that problem is close to resolution? If not, can you say something about what might make it less tractable than other global solvability questions in the critical setting which have fallen in recent years?
One thing I found interesting about Vargas’ talk was her work (with Bezout?)
in which she gets superior estimates for data which are not well concentrated
in frequency space. It made me wonder whether one should expect supercritical defocusing NLS to satisfy partial regularity results analogous
to Caffarelli Kohn Nirenberg for Navier Stokes.
Do you consider defocusing supercritical NLS to be a more or less tractable
object of study than Navier Stokes?
Nets
24 April, 2007 at 7:07 pm
Terence Tao
Dear Nets,
One can broadly measure the difficulty of a critical equation by just how many symmetries are present; the more symmetries, the more difficult the problem, because all of your technology better be invariant under the relevant symmetries if it is going to be efficient – and for critical problems, one needs efficient technology.
For a typical NLS, one only has spacetime symmetry to deal with. For energy-critical NLS, one also has scaling, while for mass-critical NLS one has scaling and Galilean invariance. On the other hand, the assumption of spherical symmetry eliminates spatial translation and Galilean invariance. Finally, all things being equal, the large data defocusing problem is easier than the large data focusing problem, where in the focusing problem “large” is understood to mean “nearly as big as the ground state”. For various technical reasons, 3+ dimensions is also easier than 1 or 2 dimensions (Morawetz inequalities are favourable, as is the rapid decay of the fundamental solution). With these heuristics one can understand the historical development of critical global regularity and scattering results: energy-critical defocusing radial NLS (Bourgain, etc.) -> energy-critical defocusing NLS (CKSTT, etc.) -> energy-critical focusing radial NLS (Kenig-Merle) -> 3+ D defocusing mass-critical NLS (TVZ).
Monica and I are currently looking at the 2D radial problem. The higher-dim non-radial problem looks nasty (though not intractable) due to the triple symmetry: translation, scaling, Galilean. Perhaps the first step is to do the energy-critical focusing non-radial case (I think Kenig-Merle are looking at this), or to redo the energy-critical defocusing non-radial case (our proof is 80+ pages, surely there is a better way!). I get the sense that these problems are only a few years away from resolution, though. (The 2D non-radial mass-critical NLS will be a bit tougher, and the 1D critical problems (quintic NLS and quintic mKdV) will certainly have to wait for the next decade. I’m also looking at the energy-critical wave maps problem, of course, hopefully I’ll have something meaningful to report about that in the nearish future.)
The Begout-Vargas estimates actually play a crucial role in my work with Visan and Zhang on mass-critical NLS, as they show that the only way in which the nonlinearity is strong enough to counteract the dispersive effect of the linear Schrodinger equation (in the finite mass case of course) is if the solution is simultaneously concentrated in space and frequency. This lets us compactify the evolution of the minimal-mass blowup solutions modulo the symmetry group.
I know that people are actively looking at the partial regularity problem for supercritical NLS, or (what is very closely related) getting good mass/energy concentration estimates near singular points. The infinite speed of propagation is a significant issue, though. I think even in the radial case (where partial regularity is of course rather easy), the technology for getting the expected mass and energy concentration is still not quite perfect.
My guess is that the GWP problem for defocusing supercritical NLS is harder than Navier-Stokes, just because dispersive equations tend to be harder than parabolic ones, although in the former case one can at least impose spherical symmetry, which has no interesting analogue for Navier-Stokes. But I think spherically symmetric supercritical NLW is easier than both; note for instance this is the only place where we have penetrated the scaling barrier at all, albeit only by a lousy logarithm.
24 April, 2007 at 10:36 pm
Ars Mathematica » Blog Archive » Combinatorial Nullstellensatz
[...] this thread on his weblog, Terry Tao mentioned an exciting paper by Noga Alon. The paper explains a result that [...]
25 April, 2007 at 1:26 am
David Corfield
Have weak
-nets found any use in computational learning theory?
25 April, 2007 at 5:51 am
Alex
Thank You
25 April, 2007 at 5:57 am
Gil Kalai
Dear David, I do not know of such a use and will be happy to learn.
25 April, 2007 at 8:47 am
Gil Kalai
Dear David,
If the VC dimension of a family is finite, so strong
-nets exist, it is possible to PAC-learn for every distribution on the elements of the ground set.
Let me demonstrate it with the set of planar pentagons: For every probability distribution
for points in the plane, and a pentagon
. Suppose you see sufficiently many examples (
where a point
is drawn according to the probability distribution
then with high probability you will be able to guess the answers for further questions with high level of success.
If you replace pentagons by arbitrary convex sets there is no hope for something so strong. If your distribution is concentrated on unit circle, every sequence of questions and answers represents some convex set.
We are left with a hope that maybe PAC learnability is possible when we restrict the class of probability distributions, or perhaps (more realistically) let the notion of error depends on how well the set is “supported” by
. It would be nice to explore it. Good question!
Apropos learnability. I am not aware of a good criterion for the weaker notion of “PAC-testability”. Namely after a long list of questions as above to be able to tell with low chance of failure only if the queries represent some pentagon. (Or, in the general case some edge in your hypergraph.) PAC-learnability implies PAC testability. (So with pentagons we are OK.) But I am not aware of a more general condition that guarantee PAC-testability.
25 April, 2007 at 9:06 am
Terence Tao
Dear Andy,
If you fix one of the vectors to be, say,
, and use Euler angles, one soon sees that
has a probability distribution of
for some constant
, and so by changing variables
has a probability distribution of
. Note that this is a constant measure for n=3; this is a theorem of Archimedes, who liked it so much he engraved it on his tombstone. (It is also related to symplectic geometry, toric varieties, and moment mappings, but that’s another story.) When n is large, this measure concentrates in the region
, and in this region we can approximate
. This means that asymptotically,
is normalised around the origin with a variance of
. If you play around with the erf function that probably gives you a close match to your numerical results.
The connection with the central limit theorem arises when instead of considering the cosine of the angle between two unit vectors, one considers the closely related problem of computing the dot product between two independent normalised Gaussian vectors
and
, where the
are iid standard Gaussians of mean zero and variance 1. The CLT easily shows that
and
are almost normally distributed with mean 1 and variance
, while
is almost normally distributed with mean zero and variance
.
25 April, 2007 at 11:07 am
David Corfield
Dear Gil,
As you probably know, the direction Vapnik moved in was to Support Vector Machines, maximum-margin hyperplane classifiers. Although a VC-dimension could be found for the class of classifiers exceeding a certain margin, see Theorem 1 here, a PAC result couldn’t be given since the result was data-dependent.
However, Shawe-Taylor et al. employed the fat-shattering dimension rather than the VC-dimension to yield a PAC result for SVMs.
It would be interesting to know whther
-nets had anything to say to this result.
25 April, 2007 at 11:56 am
Vlado Nikiforov
Dear Teri,
In another tradition diamonds are called “books of size 2″. In general, a “book of size k” is a set of k triangles sharing a common edge. The size of the maximal book in a graph is called its booksize.
Since 1962 Erdős has asked many questions about the booksize of graphs. Some of them are answered in arXiv.org:math/0405080.
More up to the point is the following open question of Erdős and Rothschild (e.g., Chung and Graham “Erdős on graphs”, p. 48):
What is the minimal booksize of a graph of order n with edge density c>0 if every edge is in a triangle?
Using the Regularity Lemma, Szemerédi showed that this value goes to infinity with n. The upper bound due to Erdős and to Alon and Trotter is proportional to the square root of n.
To see the importance of graphs with every edge in a triangle note that the following statement, together with the Solymósi projection, implies the existence of a three term arithmetic progression in dense subsets of integers:
Let a graph of order n has edge density c>0 and every edge is in a triangle. Then, if n is sufficiently large, the graph contains a diamond.
This statement seems weaker than the triangle removal lemma. Is this really so?
Also it seems that the functions involved in the problems of Trevisan and of Erdős-Rothschild are related. Is there any clear relationship between them?
25 April, 2007 at 1:11 pm
chris forrester
Indeed, i did much studying as to pidgeons and holes and the impossibility of arbitrary compression of information.
however, to expand that specific argument entirely, one would have to say that there could be no meta-algorithms that are composed of multiple ways of encoding the information structures. literally, that there would be a balance of informational structures that couldn’t be compressed by any algorithm, whatsoever.
this, i’ve found, is not a true precept, since we have something that can convert information from one form to another form, losslessly.
in such a manner, by utilizing the abstraction of the mathematical base switch, to provide an alternative look at the information without needing to encode an entirely new method of observation (re: algorithm), we find a meta-construct already there in the perspective of how to view the information at all.
my experiment runs like so:
Taking F as the arbitrary data, somewhere in the hundreds of digital base 2 bits.
Since the information is already digitally encoded, it forms a digital wave when you take a look at from the vantage point of a specific base, lets call that E. There is then A, the number of elements involved which form the digital waveshape itself.
This number of elements has associated with it its own limitations, which are namely the number of integer bases that produce that many elements from the arbitrary information given.
Using base 2 for E is in fact the worse case scenario, since the next step involves a sort of skimming over the digital waveform.
That is, to percieve the digital signature as a waveform, one finds that for a specific F, there are multiple E’s, a set Edelta, to percieve F which results in A digits.
These A digits in bases Edelta, naturally, totally compose F, with zero loss.
Now, there is an offset, O, which is the same for all A digits that are in bases Edelta. Above that offset, for all A digits, is information in base D.
At some point, for a specific Efinal and Afinal, there is a waveform D which, combined with the offset O, the information to specify E and the information to specify A, is within ‘compression’ range.
For lower density F information, this is not that useful. However, the entire algorithm can be run *per binary digit*, which is where the F from Cloud C comes into play. By keeping the entirety of the information that you would want to compress, say an mp3 file, in cloud C, and always moving one single bit at a time to F, then running the algorithm on F, one always finds a way to compress C.
This of course hinges upon the fact that we have more than we originally thought we had:
One: that the cloud C, and thusly F, is of suitable length. this is a result of the offset O needing a specific number of bits to encode in the first place. Thus, A must be 3 or higher, and since it is Edelta being checked, this needs to have a suitable number of opportunities to be able to reexpress the information in F.
Result: yes. information is not infinitely compressible, since at some point, base 2 becomes the most efficient way of perceiving the information. however, this is only a few hundred bits or less. for *anything* over this, there is a guaranteed ‘compression’ by means of digital wave re-expression methodology.
Thusly, the pidgeons and holes theory does not pan out, since we already have something in the informational system itself that allows for lossless reorientation of the information. And, we can choose between how the information is represented on a case-by-case basis, which reveals, as the possibilities increase, an increasing number of ways to choose between the representations.
The reason i come up to the P and NP thing with this is that checking per binary bit, all ranges of A, Edelta, is definitely deterministic. To say when the actual D + O + E + A is, bit-wise, less than F, however is non-deterministic. All that can be proved is to say that, eventually, it will happen. At this point, D + O + E + A can even be moved back into C with an ‘expand fold’ marker.
Thus, folding information, in terms of spacial harmonics.
To summarize:
Re-expression utilizing very high bases to recompose the information into different formats.
Sorting through those recompositions to find minimal change of relative heights, as percieved in two dimensions.
Encoding the change in heights, the offset, the bases of both the height changes and the offset plus the height changes, and the number of digits utilized.
In terms of algorithms, since it’s never attempted to fold the non-re-expressional informational spectrum, and guaranteed to always find foldable structures in the larger spectrum, it reveals an infinite compression mechanism, of higher density information.
25 April, 2007 at 1:54 pm
Jonathan Vos Post
One can fit 5 unit-edge regular tetrahedra in R^3 sharing a common edge, but no more. This might be called a 3-book of size 5. In hyperbolic space, one can have more tetrahedra share an edge on polytetrahedra.
Polytetrahedra relate loosely to tetrahedralization, and tightly to the structure of proteins and glasses. Some results are known for the Euclidean geometry, usefully in projecting from R^4 to R^3. But are the equivalent results for 3-books known as for books as previously defined, in the graph-theory sense (where the “steric hinderence” of R^3 is not a constraint)?
Search on the term “polytetrahedra” at the Online Encyclopedia of Integer Sequences for an extremely elementary and yet novel combinatorial enumeration problem I’ve posed and only scratched the surface of answering.
25 April, 2007 at 10:12 pm
Gil Kalai
Dear David, I think the first interesting special case regarding the connection of weak
-net and learnability is this:
In what sense (if any) can we statistically learn (or statistically test) planar convex sets.
26 April, 2007 at 11:23 am
osias
Why I am seeing “formula does not parse” everywhere?
27 April, 2007 at 9:04 am
Mark Meckes
Andy:
The best-known application in high-dimensional geometry of the phenomenon you’ve observed is Dvoretzky’s theorem (in a given proof by Vitali Milman in the early 1970s). Possibly the most accessible reference for this is Lectures 7-9 of K. Ball, “An elementary introduction to modern convex geometry”, in this volume.
27 April, 2007 at 10:43 am
Fields Medalist Symposium « What’s new
[...] I gave a more detailed presentation of the material I had discussed in my first Simons lecture, and focussing a bit more on the role of nilflows in characterising the “structure” or [...]
27 April, 2007 at 11:53 am
Anon
Is monstrous moonshine really a completely different area? I was under the impression that vertex operators, infinite dimensional Grassmannians, and generalized Kac-Moody algebras arise naturally in conformal field theories, which are a baby version of QFT’s proper. Indeed, as far as I am aware Borcherd’s proof crucially relied on the no-ghost theorem from perturbative string theory.
27 April, 2007 at 12:06 pm
Terence Tao
Well, it’s outside my field, so I can’t really say, though Richard did tell me yesterday that he considers himself as having moved away from moonshine quite a bit, but perhaps this is only in regard to the focus of study, rather than the mathematical technology employed. He did list vertex algebras and Verma modules of affine Kac-Moody algebras as examples of “Level 2″ objects, though, in his talk. (Verma modules of plain old Lie algebras are merely “Level 1″ objects.)
27 April, 2007 at 6:51 pm
Greg Kuperberg
The symposium sounds very interesting, but it reminds of a quip that I made at a workshop that I attended not long ago. The workshop had an informal lunch buffet — I’m not sure that they even bothered with tablecloths — and a group of us had picked a particular table. I think that one Fields Medalist at the workshop (who shall remain nameless) was to eat with us. But he lost track of that and sat down with the other Fields Medalist at the workshop (who shall also remain nameless) at another table. So I said to the rest of the lunch party, “If it were only one Fields Medalist, we could still ask him to come sit with us. But since it’s two Fields Medalists, I think that we’re outvoted.” :-)
Anyway, to address the math question, yes, I believe that vertex operator algebras are intended as a axiomatization of a simplified part of conformal field theory. One of the simplifications is to restrict to CFT on a sphere with marked points, instead of surfaces with higher genus. Monstrous moonshine is an example CFT which is constructed with a certain amount of symmetry (that of the Leech lattice) but is revealed to have more symmetry (that of the monster group). It’s not my area either, but I think that this is a fair summary of what I have been told.
However, there is a great variety of quantum field theories in the world, so that you can easily move far away from any one set of examples and still do quantum field theory.
27 April, 2007 at 7:28 pm
Kaleberg
This approach reminds me of early X ray astronomy sensors, before they developed grazing lenses. They used a sheet of X ray blocking metal with a random pattern of holes punched in it and rotated the sheet while collecting a single pixel of data, the overall intensity. I think they could also move the sheet further and closer to the single sensor in a sort of “trombone” arrangement. I never worked with this, but I had a roommate who worked on the SAS-C team, and he tried explaining how the sensor, or perhaps an improved sensor, might work.
I’m not sure of the mathematics of this approach, since their “wavelet basis” was rather simple, and not completely random. Still, they did get the sensor count down to one, they traded off satellite based computing for ground computing, and they did get some neat images, included Cygnus X-1.
28 April, 2007 at 1:48 pm
Andres Corrada-Emmanuel
Thank you for this introduction to compressed sensing. I want to point out that the statement on the applicability of this technique to sensor networks needs to be tempered by an actual comparison between the power requirements of on-board processing versus transmission of data. For some sensor networks the greater cost is transmission. This will increasingly become so in the future since network bandwidths are physically limited or turned off by users that want ‘information’ not sensor data. In cases like that, it would pay to process data on-board and then relay extracted information rather than raw data itself. For example, an aerial mapping appliance would process all aerial images on-board and relay a map of the area with a ortho-rectified image on top. This would give a user all the information they require to do an impressive number of tasks and would have a compression ratio running in the 1000:1 range.
28 April, 2007 at 9:48 pm
Terence Tao
Dear Andres,
Thanks for your comments! Compressed sensing actually has comparable transmission costs to traditional processed sensing methods; I think you might have been comparing processed sensing with non-compressed non-processed sensing, which indeed would be inefficient on many levels.
Suppose for instance that a traditional sensor in a sensor network collects 1,000,000 bytes of data, processes the data on-board, and extracts 1,000 bytes to transmit, giving the 1000:1 compression ratio you mention. I agree that the non-compressed non-processed sensing alternative of simply transmitting all 1,000,000 bytes collected without any on-board processing would be expensive and inefficient. However, this is not what compressed sensing does; instead of collecting so much data, it would only collect, say, 3,000 bytes of measurements to transmit, and computers at the user end will extract the 1,000 bytes of processed data they need from the 3,000 transmitted bytes via one of the known data recovery algorithms for compressed sensing. Thus the data transmission costs in this case are comparable to a traditional sensor, but the data collection and processing costs are much lower. (It may seem in this example that the compressed sensing method requires three times as much data transmitted, but actually the gap can be a bit lower in noisy environments, because the traditional sensor will need to lengthen (and possibly retransmit) the data to enable error detection or correction, whereas the compressed sensing data will not need any such lengthening, as there is already significant redundancy contained within the transmitted data, and the recovery algorithms are robust with respect to loss or corruption of a small fraction of this data.)
29 April, 2007 at 12:57 pm
Not Even Wrong » Blog Archive » Various Events and Other News
[...] the Fields Medalist blogging front, there’s a report from Terry Tao about a symposium at UCLA where he and three other Fields medalists gave talks. He gives a detailed [...]
30 April, 2007 at 6:03 am
Utpal Sarkar
Dear Terence,
Do you know if your uncertainty principle for cyclic groups of prime order generalizes to general finite abelian groups if you restrict to functions whose support is not contained in any proper subgroup?
Thanks,
Utpal
30 April, 2007 at 7:53 am
Terence Tao
Dear Uptal,
That’s an interesting question; I don’t have any counterexample to your conjecture (though one should probably look first at
). As I said in the main post, the best uncertainty principle for arbitrary groups known is by Meshulam, and it is not well understood exactly how sharp his estimates are (though I believe they are quite sharp for p-groups). Such a result also seems consistent with Kneser’s theorem (just as the Cauchy-Davenport inequality is a consequence of the uncertainty principle in the prime order case, as discussed in my own paper); in the best case scenario, it would provide a new proof of Kneser’s theorem, which would be very worthwhile.
30 April, 2007 at 9:03 pm
Sacha
“It’s an interesting phenomenon to me that so many successful people have an unusually broad range of interests, and somehow find the time to pursue them. I believe this may be a help, rather than a hindrance, since having broad interests is helpful both in finding good research topics and in solving them. (I might worry too, if this blog got filled up with the sort of post we see here, but that doesn’t seem likely.)”
Of course it’s great to have a broad range of interests – it helps keep one’s mind active and alert!
Why would anyone be concerned if Terry started filling his blog with non-technical or non-mathematically related work? That sounds snobbish to me.
1 May, 2007 at 3:19 pm
Yi-Zhi Huang
Vertex operator algebras are the algebras of meromorphic
quantum fields on the sphere. Rartional conformal field
theories on the sphere and on the torus have now been
constructed mathematically using the representation theory
of vertex operator algebras. The higher-genus case
might be a little tedious but no fundamental difficulties
are expected. Indeed, in this construction, the Hilbert space
structure is the last step. Actually one can introduce the
Hilbert space structure using a suitable inner product
from the beginning but it is not very useful.
What we need is a nice dense subspace of the Hilbert
space constructed from representations of the vertex operartor
algebra such that correlation functions on Riemann surfaces
associated to elements of this dense subspace
can be constructed. Then in the last step, one can use these
correlation functions to give a topological completion
(not the inner product completion) of
the dense subspace and show that this completion is the
same as the inner product completion.
Now it is clear why the direct use of the Hilbert space might be
difficult: We can look at the construction of the Hilbert
space using correlation functions. These correlation functions
involve higher genus Riemann surface structure, but from the
Hilbert space structure obtained from
the inner product completion, we do not see anything
about Riemann surfaces.
1 May, 2007 at 3:40 pm
Andy
I don’t want to take too much space here, but really want to thank Terry and Mark for their comment and help!
1 May, 2007 at 5:40 pm
stevenm
Thanks for taking the time to provide summaries of these interesting talks. In Jones’ talk you mention he discusses the braid group and the Jones polynomial within the context of describing the dynamics of non-colliding points in the plane. I am wondering if he was referring to ‘vicious walkers’, which are Brownian motions or random walkers not allowed to intersect or which can annihilate when they meet. The probability densities for these types of random walks (for N particles) can be related to the partition function of a Chern-Simons theory on the 3-manifold
with gauge group U(N). As regards the Jones polynomial and QFT, a beautiful and quite famous result that should perhaps be mentioned here is that of Witten who demonstrated that the Jones polynomial has a
representation in terms of a topological quantum field theory, namely an non-abelian SU(2) Chern-Simons theory on a 3-manifold. In a sense, this result would also seem to be bridge between the talks of Jones and that of Borcherds on QFT. The problem mathematicians have with this construction of the Jones is the usual lack of a rigorous definition of the Feynman path integral measure, although for once the path integral here can actually be performed exactly.
I always find it particularly fascinating when fields that at first seem unrelated turn out to have these tantalizing connections: the Jones connects with von Neumann algebras, statistical mechanics, braids etc. as you mentioned, but also with 3-dimensional topological quantum field theory. In molecular biology the Jones polynomial–and its generalisations like the HOMFLY polynomial, derivable from the SU(N) Chern-Simons–are also a potentially useful and powerful tool in classifying and identifying knotted states and topological properties of dna, polymers and enzymes; essentially ‘biological strings’. But an SU(N) Chern-Simons theory can also be related to topological string theory in the large N limit. A lot of fascinating things and connections going on here. Incidently, did Borcherd mention anything about the hard problems of actually defining QFTs on 4-manifolds, specifically the Yang-Mills gauge theory, which of course is a Clay problem?
2 May, 2007 at 7:06 am
Distinguished Lecture Series I: Shou-wu Zhang, "Overview of rational points on curves" « What’s new
[...] the to be non-zero (otherwise there is a local obstruction at p, similar to what I discussed in my first Simons lecture). Hasse formed his famous local-to-global principle, which is the heuristic that the converse [...]
2 May, 2007 at 1:52 pm
Emmanuel Kowalski
The “Lefschetz principle” is presumably a reference to the Grothendieck-Lefschetz
with
elements is the alternating
ranging from 0 to
, where
is the dimension.
, the first trace (
) is
(trivial action on a one-dimensional space), the last trace (
) is
(multiplication by
on a
) is the trace on
(analogue of the classical first (co)homology group of a Riemann surface), written as sum of eigenvalues
.
trace formula, which states that the number of points on an algebraic variety
defined over a finite field
sum of traces of the “global” Frobenius automorphism acting on some finite-dimensional
vector spaces, the index
For a smooth irreducible projective curve,
one-dimensional space) and the middle (interesting) trace (
a vector space of dimension
2 May, 2007 at 5:08 pm
bb
I’d just like to point out a tiny mistake in the definition of the Tate-Shafarevich group (Sha, for short). What you’ve defined (the set of genus 1 curves over a field k with jacobian E, upto isomorphism) is actually the cohomology group H^1(k,E) (this can also be defined as the set of curves over k with a free and transitive E-action defined over k). In order to get Sha, we must account for local obstructions as we’re trying to capture the failure of the Hasse principle. So over a number field k, we define Sha(E,k) to be the subgroup of H^1(k,E) consisting of curves which have points over all completions of k. This is the group that’s conjectured to be finite — the H^1(k,E) could very well be infinite.
Also, as perhaps an interesting aside, I believe that in the function field case the truth of BSD conjecture is now equivalent to the finiteness of Sha which, in turn, follows from the 2-dimensional case of the remarkable conjecture of Artin that any smooth projective variety over a finite field (in fact, any proper scheme over Spec(Z) works!) has a finite Brauer group (Morita equivalence classes of central simple algebras).
2 May, 2007 at 5:45 pm
Terence Tao
Dear Emmanuel and bb,
Thanks very much for the corrections; they are especially helpful for me as I am trying to learn this subject properly. (I sort of absorbed bits and pieces of this through osmosis while a graduate student at Princeton – I was there when Wiles proved his theorem, in particular – but it’s only now that I think I’m getting a bit of a handle on things.)
3 May, 2007 at 3:38 am
Sadegh Jokar
Dear Terence,
CS is really interesting. I am working on this topic and your joint paper with Emmanuel is really useful and important. I have some questions about that.
I am wondering if UUP is also holds for Wavelets? If it holds, could you please give me a reference for that?
Also another question is that are Fourier+Wavlets satisfying on RIP conditon?
If yes could you let me know what is \delta_k in that case or where can I find the proof of this fact?
Third question is that it appear to me that for given matrix A, to check that A satisfies on RIP conditon is NP-Hard. Is there any methods to check for special class of matrices that this matrices are satisfying in this condition or not? It looks to me similar to finding the sparsest solutiond of underdetermined linear system in some cases using tractable BP method.
Also as last question, if for given $k $, $A$ has RIP constant $\delta_k$ and $B$ has RIP constant $\beta_k$, can someone say something about the RIP constant of $A+B$ in terms of $\delta_k$ and $\beta_k$ and some additional information about $A^TB$?
I will be appreciate, if you could help me to find the solutions for these questions?
Thanks.
Sadegh
3 May, 2007 at 9:24 am
Distinguished Lecture Series II: Shou-wu Zhang, “Gross-Zagier formula and Birch and Swinnerton-Dyer conjecture ” « What’s new
[...] In the previous lecture, Shou-wu defined an elliptic curve E as a genus 1 curve with a marked point P (which will eventually become the group identity element). It is traditional to move this point P to the point at infinity and view the curve affinely, in which case the curve can be placed via an affine transformation in the normal form [...]
3 May, 2007 at 9:31 am
Terence Tao
Dear Sadegh,
I believe the wavelet basis does not have good UUP type properties with regard to signals that are sparse in the Dirac basis, because the fine wavelet coefficients have too much correlation with some members of the Dirac basis. (There are some general results of Candes and Romberg which show that the UUP holds for random samples of a basis if there are no large correlations between that elements of that basis and elements where the signal is suspected to be sparse.)
For Fourier-Wavelet I think there is a similar problem with the very coarse wavelets, which are quite concentrated in the Fourier side. Of course, Gaussian-wavelet is OK, because a Gaussian matrix composed with any other matrix is still basically a Gaussian matrix.
It is an open problem (which I was intending to blog about at some point) to find a non-trivial class of matrices for which the UUP property can be detected in a reasonable amount of time (e.g. in polynomial time). By non-trivial, I mean that there is a non-zero probability that the matrix actually has the UUP property. In particular, no determinstic example of large UUP matrices are known in which there are significantly more columns than rows. (An orthonormal matrix obeys UUP but this is rather uninteresting, and must have at most as many columns as rows.)
As for sums of matrices, one only has really good bounds on the singular values of A+B (and hence on the RIP type constants) in terms of those of A and B if one of the two matrices A, B is very small; so there should be some results about the stability of RIP or UUP under small perturbations, but when summing two large matrices together, virtually anything can happen.
3 May, 2007 at 9:37 am
Terence Tao
Dear Stevenm,
Thanks for the interesting comments. Vaughan didn’t mention any sort of random walks in his talk, but given that these things already show up in classical stochastic field theories I am not surprised that some version of them also shows up in QFTs. As for the 4-dim QFTs, the sense I got from Richard was that the algebraic issues (working in the perturbative regime of formal power series) are reasonably well understood and on a fairly solid foundation, but all the analytic issues remain extremely difficult. (As an analogy, one can easily construct “global” “solutions” to the Navier-Stokes equation from analytic initial data by a formal power series expansion (e.g. using Cauchy-Kowaleski), but without any convergence result on such formal power series, this formal power series sheds no light on the global regularity problem.)
3 May, 2007 at 12:39 pm
Emmanuel Kowalski
The question of average rank of elliptic curves (or average order of vanishing) seems quite tricky, and there are really two schools of thought as to whether, ordering by discriminant, one should or not get 50% of rank 0 and rank 1, and a set with “density 0” of curves with rank > 1. There’s a fascinating account of the most recent numerical evidence in:
Average ranks of elliptic curves: Tension between data and conjecture
Baur Bektemirov; Barry Mazur; William Stein; Mark Watkins
Bull. Amer. Math. Soc. 44 (2007), 233-254.
The Random Matrix models for L-functions tend to strongly suggest the 50/50 split between rank 0 and 1 (assuming the B-SD conjecture), with some function field evidence by Katz and Sarnak.
In some other families of abelian varieties (obtained from modular curves in ways similar to elliptic curves, though involving modular forms with not-necessarily integral coefficients), the results are quite a bit more complete: there, it has been shown that at least 25% of the relevant abelian varieties have rank 0 (50% among those which should have even rank), and at least 43.75% have rank 1 (i.e., 87.5% among those which should have odd rank have rank 1). See e.g.
http://www.math.u-bordeaux1.fr/~kowalski/high-derivatives.pdf
(joint with P. Michel and J. Vanderkam).
Incidentally, these results arise from some form of very precise approximate orthogonality for the Fourier coefficients of the relevant modular forms (the Petersson formula). If the Fourier coefficients of elliptic curves, averaged properly with the desired ordering, have the same type of approximate orthogonality, there would be a similar result. This is described a bit in the following survey :
http://www.math.u-bordeaux1.fr/~kowalski/elliptic-curves-families.pdf
3 May, 2007 at 8:32 pm
Dima Jakobson
Hi Terry, one can define classical dynamics for graphs, this will just be a random walk. One can also define directions, just look at oriented edges (take a line graph). You can put weights on them, this will define the metric etc. You can write laplacian as d*d, where d would be the difference operator (that’s how Kirckhoff’s theorem can be proved). About uniform distribution, one should (as you suggest) take a sequence of graphs of growing size, and call it ergodic if “almost all” eigenvectors become equidistributed (as the size of the graph grows). This seems like a nice “random graph” problem. Btway, here is my favorite question about the spectrum of random regular graphs: is it simple (you are not allowed to put weights on the edges, of course)?
About scarring in different systems: the best understood case seems to be (quantum) cat maps, where you have both examples of scars (due to De Bievre, Faure and Nonnenmacher), and “non-scarring” results of Kurlberg-Rudnick. In the continuous case, the best chance for scarring is either Bunimovich stadium or Donnelly’s examples; in other situations (e.g. for Sinai billiard or in strictly negative curvature) you have much less chance.
4 May, 2007 at 8:52 am
Distinguished Lecture Series III: Shou-wu Zhang, “Triple L-series and effective Mordell conjecture” « What’s new
[...] discussed in the first lecture, one of the landmark achievements in the higher genus theory is Faltings’ theorem (proving [...]
5 May, 2007 at 12:23 pm
Not Even Wrong » Blog Archive » All Sorts of Stuff
[...] continues to come up with amazingly good blog entries. His latest is a series of three postings (here, here and here), reporting on my colleague Shouwu Zhang’s lectures at UCLA on the topic of [...]
5 May, 2007 at 12:28 pm
Not Even Wrong » Blog Archive » All Sorts of Stuff
[...] to come up with amazingly good blog entries. His latest is a series of three postings (here, here and here), reporting on my colleague Shouwu Zhang’s lectures at UCLA on the topic of rational [...]
5 May, 2007 at 1:20 pm
Scott Carnahan
I’m not an expert on renormalization, but my impression was that there is a space of Lagrangians, a space of renormalization prescriptions (basically perturbative Feynman path integral measures), and a way to combine a Lagrangian with a renormalization prescription to get a “quantum field theory”, which is a family of Green’s functions or generalized Wightman distributions. The group of renormalizations is some infinite dimensional nonabelian nilpotent group that acts on both spaces (changing coupling constants, counterterms, etc), in a way that the resulting quantum field theory is fixed. Finite dimensional orbits of this action on Lagrangians are called renormalizable.
I think Witten proposed the mass gap problem as a Clay prize problem in part because of its very nonperturbative nature. He has mentioned in some articles that there is both theoretical and computational evidence that the mass gap is exponentially damped as coupling constants are taken to zero. Since perturbative expanions are basically calculating in a formal neighborhood of zero, it sounds like they would be quite useless for this problem.
5 May, 2007 at 3:03 pm
Jordan Ellenberg
I agree with the philosophy ascribed to me above but I’m certain I don’t deserve any primary credit for articulating it!
(This is apparently quite remarkable; most moduli spaces do not have concrete representations as algebraic varieties. I gather from Shou-wu that this phenomenon is ultimately due to the fact that elliptic curves are an abelian variety.)
I’m not sure I agree with this, though (but I may not be completely clear about what you’re saying.) The moduli spaces that you meet in arithmetic geometry do have representations as algebraic varieties, or at least algebraic stacks; what is rare is to have, as with X_0(N), a reasonably nice way of writing down explicit equations for such a variety. I think that’s pretty rare even when you are studying moduli spaces of abelian varieties, so I’m not quite sure of the meaning of Shou-Wu’s comment here.
But in general it usually turns out not to be so useful to have explicit equations! (That said, I did write a paper where it was quite important to have explicit equations for a Hilbert modular surface, and it was a huge pain. But it would have been a better paper if there were a way to get the result without writing the equations down!) It tends to be more important just to know that there is a moduli space which is an algebraic variety, and to know its properties (dimension, smoothness, properness, morphisms to other varieties of interest, etc.)
5 May, 2007 at 4:00 pm
Top Posts « WordPress.com
[...] Distinguished Lecture Series III: Shou-wu Zhang, “Triple L-series and effective Mordell conjecture… On Thursday Shou-wu Zhang concluded his lecture series by talking about the higher genus case [LaTeX], and in […] [...]
5 May, 2007 at 9:19 pm
Terence Tao
Hi Jordan! Thanks for your comments. As for the “philosophy”, I know at least one expert in Diophantine equations who assigns your name to it, but I can believe it has been around for a while.
I went back to my notes for Shou-wu’s talk and tried to reconstruct his comments more precisely. I don’t really understand what’s going on, but it was something like this: just as moduli of elliptic curves give modular curves, moduli of abelian varieties give Shimura varieties (and this was only because
was abelian, apparently). Shou-wu made a “moral” at the end of the talk that a really good way to find rational points on a variety is to somehow get nontrivial maps into them from “richer” varieties such as modular varieties, in which the individual points on the variety themselves have algebraic structure. He did concede though that at the present time, this strategy has not really borne much fruit with regard to general Shimura varieties (there was some connection here between Shimura varieties involving a congruence subgroup in
and a moduli space
of curves, or a moduli space of subgroups of
, but I can’t reconstruct what the deal was from my notes).
My notes also state “For most moduli spaces, there is no “concrete” embedding into projective space”, which is presumably what you just said above. I’m not sure what “concrete” means here, but Shou-wu gave the exponential function
modeling the torus
in the complex plane as an analogy.
6 May, 2007 at 5:26 pm
Chris Woodward
Hi Terry,
I once had this crazy dream, that the quantum version of a honeycomb
should be a honeycomb on a thrice-punctured sphere with hyperbolic
metric. That is, one has these geodesics coming in from infinity,
as in the pictures in Thurston’s book, and one can talk about
honeycombs where each segment is a segment of a geodesic
of this type. Given data at infinity (namely, distinct
finite subsets of each circle at infinity) one can try to count honeycombs.
On the good side, there is an obvious Euclidean limit which gives
the usual notion of honeycomb. There would also be a strategy
for proving associativity (degenerate a quadruply punctured
sphere in two ways ). On the bad side, it wasn’t
clear to me whether one has the expected cyclic symmetry.
Or even, how does one choose the base points on the circles
at infinity?
I would love it if someone would explain why my crazy dream couldn’t
work (or could it?)
Chris
6 May, 2007 at 8:58 pm
Terence Tao
Hi Chris! Thanks for posting.
It’s certainly a very visually appealing idea, and fits of course with the role of the thrice-punctured sphere in the product of unitary matrices problem. One could start with investigating the n=1 case: the space of all “hyperbolic 1-honeycombs” cuts out a two-dimensional subset of the three circles at infinity, which one can view as a group-type operation (mapping the product of two circles to a third); it should then be easy to check whether this operation is isomorphic to the expected one (basically the addition operation on R/Z) by checking that it behaves in a commutative and associative manner.
The other difficulty I see is that there is no obvious discretisation which would count fusion product multiplicities.
It may be that we have to somehow abandon the idea of a honeycomb as a 1-D subset of a 2-D space. Take for instance the equivalent “hive” model of the additive honeycomb, which is a piecewise linear concave function with prescribed boundary values. The piecewise linearity is what is causing the honeycomb (which is sort of a dual object to the hive in some ways) to behave one-dimensionally, but one could imagine some sort of “hyperbolic hive” which had the concavity but not the piecewise linearity, and so the corresponding honeycomb becomes “smeared out” somehow. I don’t exactly see what such a hyperbolic hive could be, though; I vaguely want to take some sort of section of a bundle of thrice-punctured spheres over a triangle, or something strange like that.
But this is just wild speculation, not supported by much hard data :)
7 May, 2007 at 7:33 am
Terence Tao
p.s. Vaughan Jones pointed me towards the thesis of his student Scott Morrison, in which he describes the representation theory of the quantum group
in terms of spherical categories. One gets a number of topologically planar diagrams involved, related to Kuperberg’s webs and Jones’ planar algebras; they do of course look vaguely honeycomb-ish, but even in the classical additive setting we were never able to figure out what was the connection between these topologically planar objects and the geometrically planar honeycombs. The idea (implicit in your post) of using the uniformisation theorem to connect topology to geometry is very intruiging, though.
Vaughan ventured the opinion that planarity (or more specifically, a non-crossing condition) should play a much more prominent role in the quantum (multiplicative) setting than in the classical (additive setting), though I don’t know exactly what to make of this “hint”. Certainly the “overlay” operation for additive honeycombs looks like it should extend to multiplicative honeycombs, for instance.
8 May, 2007 at 1:29 am
Sacha
Terry, in using the term “quantum honeycomb”, are you saying that “it” (whatever it is) would be related to quantum algebraic structures (eg quantum algebras)? I’m unfamiliar with the algebraic structures you mention in your post.
8 May, 2007 at 8:59 am
Terence Tao
Dear Sacha,
Quantum (or “multiplicative”) honeycombs should relate to two “quantum” objects: firstly, quantum Schubert calculus, i.e. the structure of the quantum cohomology of the Grassmannian, which basically is counting how many holomorphic sections one can find in bundles of Grassmannians over punctured spheres (these counts are known as Gromov-Witten invariants). The other is the fusion product multiplicities of the Verlinde algebra of SU(n). I believe the latter is related to quantum groups, though I do not know the precise relationship. In the classical (or “additive”) honeycomb setting, the analogous objects are classical Schubert calculus (the intersection numbers of Schubert varieties), and tensor product multiplicities of irreducible representations of SU(n) or U(n).
Incidentally, just to make matters a bit more confusing, there is an unrelated “classical/quantum” distinction in the subject, between the “continuous” setting (which relates to sums of Hermitian matrices or products of unitary matrices) and the “discrete” setting (which is where the Schubert calculus and representation theory is). The continuous/discrete distinction is analogous to the distinction between classical (Hamiltonian) mechanics and quantum (Schrodinger/Heisenberg) mechanics. (I’m actually not entirely clear why quantum cohomology and quantum groups also have the adjective “quantum” attached, but that’s the standard terminology.)
9 May, 2007 at 4:56 am
Sacha
Thanks Terry – I’ll have to read up about these topics. I know a few things about quantum groups (I did my PhD on one family of quantum supergroups) – I think that, essentially, the use of “quantum” as an adjective is by analogy with quantum mechanics in relation to classical mechanics.
In one version of quantum groups, they can be seen as one-parameter deformations of the universal enveloping algebras of lie algebras or lie superalgebras. The name seems a bit funny to me that it seems to have stuck now. I also call them quantum algebras as they’re not groups (unless you are actually talking about the groups).
13 May, 2007 at 1:37 am
Sadegh
Dear Terence,
Thanks a lot for your useful comments and your response.
About my question on RIP.
Except Fourier, do you know another deterministic dictionary that satisfied in RIP condition with small delta?
or do you have any guess, which dictionary should have small RIP constant?
It seems that for deterministic case, there are not so many examples that have small RIP constant.(Fourier and cyclic matrices constructed by De Vore)
Thanks again for your time,
Sadegh
13 May, 2007 at 4:35 am
Rajeev’s Journal » Blog Archive » Fuzzy Fluids
[...] Terrence Tao has made some deep observations on why the regularity of three dimensional Navier-Stokes is such a hard problem. He has gone on to many other equally fascinating topics, I remain fixated on his main point there: that Navier-Stokes is `supercritical’. The nonlinearities become stronger at small distance scales, making it impossible to know (using present techniques) whether solutions remain smooth for all time. Thus it is crucial to understand the scale dependence of non-linearities in fluid mechanics. [...]
15 May, 2007 at 10:00 pm
Anonymous
If the moduli space is of general type, then in particular it is not rational, i.e. not birational to C^n. Thus it is not possible to write down a generic curve of genus g.
On the other hand, by the recent groundbreaking paper arXiv:math/0610203 now we know that varieties of general type do admit a canonical model, i.e. they are birational to a projective variety with negative first chern class.
15 May, 2007 at 11:18 pm
Anonymous
What connection are you talking about? Levi-Civita? If so, at what point did you admit a metric? And surely it is not flat in general… In the case when a connection is flat, the holonomy just detects the fundamental group, but in general, it is sensitive to curvature also…
What is the definition of the “affine bundle”? In particular, I believed that in the absense of any other geometric structure enabling identification of bundles, that torsion can only be defined for a connection on the tangent bundle. Presumably you are claiming that a projective structure imposes such an identification?
15 May, 2007 at 11:56 pm
sz
Dear Prof. Tao,
Your comment
“Torelli’s theorem abstractly shows that a surface is determined by the periods of its holomorphic differential forms, but does not show how to explicitly construct a concrete surface from these periods.”
is not right, such explicit construction do exist, it’s given by Aldo Andreotti in his 1958 paper “On a theorem of Torelli”. Roughly the periods determines an abelian variety, the Jacobians J of the Riemann surface, and integration along a based path gives the Abel embedding X->J, whose (g-1) iteration defines the theta divisor \Theta. The Gauss map (since the Jacobian is flat) defined by this theta divisor has an image which is the dual of the canonical curve in P^{g-1}. From this we can reconstruct the original Riemann surface.
Such explicit constructive Torelli however is completely missing for higher dimensions, notably for K3 surfaces for example, whose Torelli theorem is proved very indirectly.
16 May, 2007 at 6:19 am
Frederick
Professor Tao,
I want to ask a question. Given a sequence B(n), integer n from -infinity to +infinity. I require B(n)=B(-n). Now define a matrix C(p,q)=B(q-p)*q*q (square q). Now I ask under what conditions of B(n), the eigenvalues of matrix C are the eigenvalues of random matrix? The question comes from many fields from physics, such as the Helmholtz equation in a two dimensional billiard.
This is a very difficult question for me (although it seems easy). I thought it hard, but could not find any answer, even any hints to attack it. I even could not find any references to look at. I even don’t know whether C is some kind of special matrix, on which mathematicians have done a lot of work already.
Thanks!
Frederick
16 May, 2007 at 2:39 pm
Terence Tao
Dear Anonymous #2,
From what I understand of Yau’s talk, it is the geometric structure which is generating the torsion-free connection on some canonical bundle. This structure might be a Riemannian metric (in which case the connection is the Levi-Civita connection on the tangent bundle, which is torsion-free by definition), or it might be a preferred coordinate system in which the coordinate transformations between charts are linear, affine or projective (in which case the tangent or affine bundle locally trivialises to a flat bundle on Euclidean space, and one uses the standard connection for that bundle, which is clearly torsion-free by Clairaut’s theorem and is also independent of the chart by the assumption on the coordinate transformations). The affine Kahler manifolds that Yau discussed in his talk are of this latter type.
The other example Yau mentioned was that of complex structures, i.e. local complex coordinate charts of an almost complex manifold. Here the analogue of the torsion tensor would be the Nijenhuis tensor. I presume that this tensor can indeed be interpreted as the torsion of some natural connection on some natural bundle (perhaps the complexified tangent bundle?), which is manifestly torsion-free when local complex coordinates exist, but I don’t have the details.
By affine bundle, I think Yau means the direct sum of the tangent bundle and the trivial R bundle, thus sections of the affine bundle are formal sums of a vector field and a scalar field.
Dear sz: I’ll ask Yau tomorrow about Andreotti’s construction. I gather that Yau was for some reason particularly interested in constructing representatives of a given conformal class of Riemann surfaces as embedded surfaces in R^3 rather than P^{g-1}, though I don’t exactly know why.
16 May, 2007 at 3:59 pm
Top Posts « WordPress.com
[...] Distinguished Lecture Series I: Shing-Tung Yau, “What is a Geometric Structure” The final Distinguished Lecture Series for this academic year at UCLA was started on Tuesday by Shing-Tung Yau. […] [...]
16 May, 2007 at 9:06 pm
Anonymous
“For instance Yau conjectured that a Fano manifold admitted a Kähler metric if and only if it was stable in the GIT sense.”
Fano manifolds are projective and therefore they admid a Kahler metric.
I believe that the conjecture is wether they admit a Kahler-Einstein metric.
17 May, 2007 at 9:41 am
Terence Tao
Dear Frederick,
The matrix is not self-adjoint or normal, and so it may be that eigenvalues might not be the most relevant feature of this matrix (there is no spectral theorem, for instance). After a Fourier transform, the matrix is equivalent to the operator
acting on periodic functions, where b is the Fourier series with coefficients B. If instead you consider the self-adjoint version
(this corresponds to replacing q*q by q*p in your matrix), then you can use tools from microlocal analysis, such as Weyl’s law, to obtain some control on the eigenvalues, though to get fine information such as eigenvalue spacing distributions is still well beyond current technology.
17 May, 2007 at 2:43 pm
Terence Tao
Dear sz: I asked Yau to clarify what he meant by explicit reconstruction of a Riemann surface from the periods of its holomorphic forms. He said that he was interested in a concrete algebraic description of a surface (e.g. cut out by a set of algebraic equations); the Andreotti construction is more analytic in nature, giving a description which involves transcendental functions. In his talk, he gave the analogy of modeling a (topological) torus
explicitly and algebraically in the familiar manner as the surface of revolution of a circle around an axis in
; he said he would like to achieve something similar for, say, the conformal class of a geometric torus
but did not know of any “good” embedding into Euclidean space for this purpose (I don’t know what Yau’s exact criteria for “good” are, though). Yau was also interested in obtaining representations that could actually be worked out numerically by computers in a practical amount of time; apparently the Andreotti-type constructions are not feasible in this respect.
17 May, 2007 at 8:20 pm
Distinguished Lecture Series II: Shing-Tung Yau, "The Basic Tools to Construct Geometric Structures" « What’s new
[...] Yau’s first lecture, Yau informally defined a “geometric structure” as an object (such as a metric or a [...]
18 May, 2007 at 12:11 am
sz
Dear Prof. Tao,
Thank you. You are so kind. I know Yau must have some specific motivations in his mind. I guess “cut out by a set of algebraic equations” must mean “polynomial calculation time” for him. Now he’s talking about embedding an elliptic curve in to R^3, which of course would make Wiles horrified: “Where is the Modell-Weil group, man?”
But seriously, for a Riemann surface to explicitly find a set of algebraic equations to cut it out, is a difficult problem. For example let’s assume the Riemann surface is the canonical curve in P^{g-1}, which is defined by an canonical ideal. This ideal then by Hilbert’s syzygy theorem has a resolution, this means we can have a finite set of generators (i.e. equations for the curve), and a finite set of relations among the generators, and relations on relations, and so on. There is a well-known conjecture by your colleague Mark Green, about 25 years ago, which asserts this resolution is determined by whether the Riemann surface pocesses certain special divisors. But even this conjecture does not tell how to find these explicit set of equations.
18 May, 2007 at 6:31 am
Richard
Hello,
The “version of Mostow Rigidity” you mention, which states that the (minimally parabolic) geometrically finite hyperbolic structures on a manifold (with incompressible boundary) are parameterized by the Teichmüller space of the boundary, should be attributed to Ahlfors-Bers-Marden-Sullivan. Of course, the heart of Geometrization for Haken manifolds is the fixed point problem and its solution, and that’s certainly Thurston’s.
Cheers.
18 May, 2007 at 2:08 pm
Anonymous
Out of place, but congrats on your election to the Royal Society!
18 May, 2007 at 4:01 pm
Top Posts « WordPress.com
[...] Distinguished Lecture Series II: Shing-Tung Yau, “The Basic Tools to Construct Geometric Struc… On Thursday, Yau continued his lecture series on geometric structures, focusing a bit more on the tools and philosophy […] [...]
18 May, 2007 at 4:47 pm
warren smith
I want to know the following.
Are there an infinite number of 2-tuples (p,q) of odd primes such that
(p-1)*(q-1) is a product of at most a constant number of primes.
And is this still true if we demand both p and q be larger than C
(for any particular C)?
It sounds like this is similar to things you and Chen Jing-Run could do…
18 May, 2007 at 7:49 pm
Ars Mathematica » Blog Archive » Tao on Zhang
[...] Overview of rational points on curves [...]
18 May, 2007 at 8:16 pm
Distinguished Lecture Series III: Shing-Tung Yau, "Application of the Geometric Structures to Solve Problems in Algebraic Geometry and Topology" « What’s new
[...] metrics were also used in Yau’s proof of the Bogomolov-Miyaoka-Yau inequality mentioned in Zhang’s lectures. (In fact, Yau recalled that he learnt about this problem from a colloquium of Mumford right here [...]
18 May, 2007 at 8:19 pm
Terence Tao
Dear Warren,
The argument of Chen will give infinitely many primes p for which (p-1)/2 is the product of at most 2 primes, which then in turn gives arbitrarily large pairs (p-1)(q-1) which is the product of at most 4 primes, times 4 (so six primes in all). This is probably the best one can do with current technology, due to the parity problem (I plan to post more on this in the future).
One does not quite need the full power of Chen’s result to get a qualitative result of this nature; using slightly lower-tech methods such as the Selberg sieve coupled with the Bombieri-Vinogradov inequality, one can achieve the same result but with O(1) primes in the prime factorisation of (p-1)(q-1).
18 May, 2007 at 8:48 pm
Allen Knutson
collapses a rational curve inside a Calabi-Yau manifold into a (singular) conifold; one can then smooth out this conifold (destroying the Kähler structure, but retaining the complex structure) and then deform to another conifold, which lets one end up at a different Calabi-Yau manifold. (Yau also mentioned a mirror symmetric version
One collapses a complex curve by changing the symplectic structure while the complex structure remains constant. One nice toric picture starts with a 3-d ziggurat, seen from above looking like
\ /
|
/ \
where the vertical line corresponds to the curve. That line is shrunk to zero length, in a way that modifies the polytope but not its fan. So algebraic geometers who only watch the complex structure and not the Riemannian structure have a nasty surprise, when without warning the curve suddenly becomes a point.
The toric picture
\ /
/ \
is, locally, the conifold of 2×2 determinant zero matrices. That has a deformation of the complex structure to 2×2 determinant 1 matrices. There is a map from {det 1} ->> {det 0} collapsing the (Lagrangian!) SU(2) to the singular point, and elsewhere a symplectomorphism. In that sense one may imagine that it is a variation of the complex structure while the symplectic structure is held fixed, the mirror dual to the previous family.
18 May, 2007 at 8:50 pm
Allen Knutson
Greedy wordpress, for eating the space I put before that | !
19 May, 2007 at 12:11 am
sz
Dear prof. Tao,
Yau’s this statement:
“Kähler-Einstein metrics were used (together with a vanishing theorem for holomorphic sections) by Yau to determine which algebraic manifolds were Shimura varieties; roughly speaking, the Chern class has to be non-positive and a certain U(n)-irreducible component of a tensor power of the tangent bundle must have a non-trivial holomorphic section. One purely algebraic consequence of this is that any Galois conjugate of a Shimura variety is again a Shimura variety, a fact proven earlier by Kazhdan.”
mystefied me. I know Yau and Siu’s great work on characterizing the locally symmetric varieties or Shimura varieties by some curvature conditions and others, but this is purely over the complex numbers. Shimura varieties, among the other thing, has a rational model, means it can be defined by a set of algebraic equations with coefficients in some number field , and when Galois conjugates brought in to act on this coefficients, we get a new variety, and it is natural to ask if this new variety is also a Shimura variety. But over the complex numbers these two varieties are identical. I don’t see how can we tell a rational structure from some conditions over the complex numbers.
19 May, 2007 at 12:30 am
sz
Dear Prof. Tao,
I think Yau’s absolutely right to point out the spectrum of Kahler-Einstein metric is a fundamental invariants of the manifold. For example for Calabi-Yau manifold we know for each Kahler class we have an unique Ricci-flat metric. The spectrum of this metric then would provide the fundamental invariants for this Kahler class, and even might be used to construct global coordinate functions for the Kahler moduli space.
For example for Abelian varieties from the spectrum of the flat metric we can construct the theta functions, and the “theta null” is then a canonical coordinates for the moduli space of Abelian varieties.
But the big question is, what’s the geometric meaning of these spectrums?
19 May, 2007 at 1:34 am
bb
sz,
if i’m parsing your question correctly, you’re claiming that “Galois conjugates” of a fixed variety are all isomorphic (and consequently, the Kazhdan-Yau theorem is contentless).
this is false! the point is that Galois automorphisms of C over Q are extremely discontinuous when it comes to the analytic topology (think of one which restricts to the non-trivial automorphism of Q(sqrt(2)). Serre has constructed examples of varieties defined over number fields whose fundamental groups are different for different embeddings of the number field in to C (c.f: his paper “Exemples de variétés projectives conjuguées non homéomorphes” , C. R. Acad. Sci. Paris 258 1964 4194–4196).
what is true, however, is that varieties defined over Q have isomorphic Galois conjugates and this is simply because Q has a unique embedding in C. but it’s hard to find Shimura varieties whose canonical model is rational (for ex: the most basic Shimura variety, the modular curve X(N), has a canonical model defined over the ring of integers of the N-th cyclotomic field).
19 May, 2007 at 2:02 am
Frederick
Dear Professor Tao,
If b(x) is periodic, then lots of problems are solved by Bloch’s theorem. All the eigenfunctions are extended (no Anderson localization occurs). Actually I was thinking the eigenvalues and wavefunction of a billiard (scar). After making a conformal transformation, the Laplacian will have a factor/coefficient, I thought the factor/coefficient would be periodic. Actually it’s not periodic. So things are more complex than I first thought.
Lots of physical problems are either solved numerically such as using finite difference method or solved by the spectral method. But the spectral method always brings big infinite matrices. It is a pity there is not very effective to handle these kinds of infinite matrices.
Thanks very much!
Frederick
19 May, 2007 at 7:56 am
Doug
RE: Allen Knutson diagram above:
If o is substituted for | , then
\/
o
/\
This may represent at least two things:
a – flop transition [top row, figure 11.5, The Elegant Universe, Brian Greene], if rotated 90 degrees >oo representing maximum
19 May, 2007 at 9:08 am
sz
bb,
Thank you for pointing this out, I think you are right. My “identical” is only a “set theoretical identical” and this is obviously not I want.
But in this context to apply the complex analytic criterion like Yau’s, I think
over the complex numbers this also has to be continous, for otherwise how can we carry over the metric, curvature, and the existence of holomorphic sections from one to another? I am still quite puzzled.
19 May, 2007 at 12:19 pm
Walt
I use MimeTeX to generate LaTeX images for my weblog. It’s a low-hassle solution.
19 May, 2007 at 1:12 pm
Terence Tao
Dear sz and bb,
I found a paper of Milne discussing the Kazhdan-Yau result,
http://www.jmilne.org/math/Manuscripts/KazTh.pdf
I would imagine that the continuous structures on the Galois-transformed variety are not obviously related to the continuous structures on the original variety, but that the arguments of Kazhdan and Yau nevertheless enable one to construct such structures on the new variety.
19 May, 2007 at 2:30 pm
warren smith
to TT: I sent you some email, it appears your precediing post is the extra ingredient I needed to solve an open problem in coding theory.
(Or I’m just wrong.) Anyway, read the email. Cheerio.
19 May, 2007 at 4:01 pm
Top Posts « WordPress.com
[...] Distinguished Lecture Series III: Shing-Tung Yau, “Application of the Geometric Structures to … On Friday, Yau concluded his lecture series by discussing the PDE approach to constructing geometric structures, […] [...]
19 May, 2007 at 8:31 pm
bb
sz and TT,
i have no idea what Yau’s result is and can claim no familiarity with the techniques that go into it. but, based on what was posted, one guess i can make is the following:
Say X_C is the shimura variety in question (over the complex numbers) and X_k is the canonical model of X_C defined over a number field k with a specified embedding i:k –> C which realises X_C as the complex variety associated to the scheme X_k. then tangent bundle of X_C is actually the pullback of a bundle defined over k, the formation of the space of global “holomorphic” sections commutes with the base change i:k –> C, the chern class of the tangent bundle of X_k is defined as a class in the l-adic cohomology (or, better, the chow ring) of X_k and, on identifying the l-adic cohomology of X_k with singular cohomology of X_C, is identified with the topologically defined chern class of (the tangent bundle) of X_C i.e: all the gadgets quoted above are defined over k itself. assuming now that the properties in question (non-positivity of the chern class in the sense of intersection theory, and non-emptyness of the space of global sections) can be checked after the base change i: k –> C, yau tells us that the canonical model X_k satisfies the same properties. now further assuming these properties are preserved under a base change j: k –> C, we can conclude that these properties are satisfied for the complex variety Y obtained using the embedding j instead of i (which gives us X_C) and, thereby, establish what was asked. i’ve most definitely belaboured the obvious, but my point is that the properties in question admit algebraic formulations and, therefore, conceivably, might be preserved under Galois conjugations by virtue of “descent”…
i’ve already taken up too much space here, so i’ll stop now and wait for the experts to correct my mistakes.
20 May, 2007 at 12:40 am
sz
Dear bb and TT,
I think perhaps Yau’s talking about the compact Shimura varieties, where topological conditions like c_2=3C_1^2 and holomorphic sections can characterize it, and these conditions can be descent to the number field k. For the general noncompact case I checked Milne’s paper, the thing becomes quite complicated. The criterion of Hermitian symmetric domains involves the positive definiteness of Ricci tensor, this can not be descented to k, so instead some indirect arguments are needed.
But as Milne remarked, in the case when the Shimura varieties are moduli spaces of some abelian varieties, the question becomes easy. Galois conjugate of the moduli is just the moduli of Galois conjugate abelian varieties, provided everything is defined naturally. This already covers a large part of all Shimura varieties, the remaining cases are E_6, E_7, and certain mixed type. To prove his theorem Kazhdan actually checked these case by case, it’s quite outrageous that Shimura varieties should be treated so differently. Maybe we can hope someday these remaining cases can also be realized as moduli spaces of some objects, that would settle the matter once for all.
This is of course, only some wishful thinkings…
20 May, 2007 at 5:56 am
Gil Kalai
While quite far from my own research I find this problem and posting very interesting for the following reasons. First, shortly after the Calderon conjecture was proved I heard a talk about it by Chris Thiele at the HU colloquium which (to my surprise) I could follow and enjoy. (And even find there was some combinatorics.) Second, at a later time Rafi Coifman gave me the Lacey-Thiele theorem as an example how concepts and ideas from applied mathematics interlace and influence the study of classical pure math problem. (This aspect of the (even wider) story is something I will be happy to learn more about.) Third, I always wondered if Carleson’s theorem could ever be “industrialized”. (This is a fancy way to the ask if we can hope for a proof that can (really) be presented at class.) I do not know if the Lacey-Thiele proof quite get there but it is certainly in this direction. And, the analogy with Szemeredi theorem is intruiging.
Is there a quick reference/link for an easy proof for the boundedness of Hilbert transform, especially the proof using wavelets?
Regarding the analogy with Szemeredi’s theorem. Are wavelets or something similar useful there? Is there a way to think about the Roth case (or its improved versions) as related to a large spanning class of functions like wavelets?
I remember that various people (e.g. Eli Shamir) raised the idea that appropriate notion of wavelets can be useful to improve known results and find further results in places were Fourier analysis applies to combinatorics, (mainly Fourier analysis over the discrete cube, or products of fixed graphs); but I am not aware of some nice definitions/applications. In particular, I do not know what wavelets are in these contexts.
20 May, 2007 at 7:28 am
Terence Tao
Dear Gil,
in something like O( N log N ) steps by using wave packets; roughly speaking, there is a representation of the form

),
is the spatial width of P, and
are (
-normalised) wave packets which are essentially localised in phase space to P. There are about O(N log N) of these tiles, and the above inner products can all be computed by an FFT-like algorithm in O(N log N) time. (The FFT performs O(N log N) computations but only retains N of the numbers computed as the Fourier transform. If instead one saves all of the O(N log N) numbers that one computes, one essentially obtains the wave packet transform.) In accordance with Coifman’s principle, the above representation is precisely what Lacey and Thiele need to control the bilinear Hilbert transform properly.
and one wants to compute (exactly) the expression

and wants to compute
?
computations, even for nice groups G such as
. A faster algorithm to compute this expression exactly may be very useful to the trilinear Hilbert transform problem, by suggesting the “correct” way to represent that transform.
norm of f in O( |G| log |G| ) steps, but the fastest I can compute the
norm in is
steps (by using the recursive formula connecting each Gowers norm to its predecessor). Is there a faster way?
Good questions! I may need several responses to answer all of them.
Coifman actually has a nice principle connecting pure and applied analysis: given any operator T, there is a positive correlation between T being tractable to estimate in pure analysis, and T having an efficiently computable representation for applied analysis purposes. The point is that the representations which allow T to be computed quickly, tend to also be the representations which allow one to analyse T efficiently, and vice versa.
For instance, suppose one wanted to compute the discrete Hilbert transform Hf of a function f on an N point set (e.g. the cyclic group Z/NZ). A naive computation of Hf would take O(N^2) computations, but by using either the Fourier representation or the wavelet representation (and using the FFT or FWT) one cuts this down to O(N log N).
For the bilinear Hilbert transform B, discretised to Z/NZ, I don’t know of a way to compute B(f,g) for given f,g which is any faster than O(N^2) steps. However, given f, g, h, one can compute the inner product
where P ranges over all Heisenberg tiles (dyadic rectangles in phase space with area
No such efficient representation (better than O(N^2)) is known for the trilinear Hilbert transform; finding a fast representation here may be the key to this problem.
There is a “finite field”, “single scale” model which might be more tractable for this latter problem. Let G be a finite group (whose order is coprime to 2 and 3). If one is given three functions
then by using the FFT one can achieve this in O( |G| log |G| ) computations. But what if one is given four functions
I know of no way to compute this in fewer than
There is an analogous problem for the Gowers norms; the FFT allows one to compute the
20 May, 2007 at 7:48 am
Terence Tao
The Lacey-Thiele proof Carleson’s theorem can be presented in a graduate harmonic analysis class in a handful of lectures. It is “simple” modulo knowing all the modern machinery of harmonic analysis (Calderon-Zygmund theory, Littlewood-Paley theory, interpolation theory, and the uncertainty principle). Things are a little simpler in a “finite field” model, in which the real line is replaced by the Walsh-Cantor group
. See for instance my lecture notes on these topics.
Wavelets appear in harmonic analysis due to the role of scale, which is largely absent in additive combinatorics, and in particular in Szemeredi’s theorem. In Szemeredi’s theorem there is essentially no distinction made between an arithmetic progression of small step and an arithmetic progression of large step, thus there is no need to treat these two scales separately (and indeed, since there are so many more large steps than small steps, the net contribution of the small steps is negligible compared to the large). But in things like the Hilbert transforms, the presence of the 1/t factor in the integral causes the small scales to be elevated to be of “equal strength” to the large scales, and so they have to be treated separately. Wavelets are a good way to separate the fine-scale and coarse-scale behaviours from each other. Wave packets are generalisations of wavelets which also capture frequency information, indeed the (overdetermined) wave packet basis contains both the wavelet basis and the Fourier basis as subsets.
So, in summary, I would only expect wavelet-type tools to be useful in “multiscale” situations in which one needs to treat fine-scale and coarse-scale oscillations separately.
As to the “easy” proof of the boundedness of the Hilbert transform using wavelets, this is a little trickier to answer; again, this result is easy only modulo a certain amount of standard wavelet theory. Roughly speaking, the main tool is the Littlewood-Paley inequality for wavelets
which asserts in particular that the
norm of a function f is controlled by the magnitude of the wavelet coefficients
, but is insensitive to the phase. In the wavelet basis, the Hilbert transform is almost a diagonal operator; it alters the phase of the wavelet coefficients, and also mixes some nearby wavelet coefficients together, but does not significantly affect the magnitude of these coefficients. Making this precise, one can easily deduce the boundedness of the Hilbert transform (and many other operators) from the above inequality (plus some other useful tools, such as the Fefferman-Stein maximal inequality). The proof of the Littlewood-Paley inequality is not too difficult, but is basically a fancy version of the standard arguments used to establish boundedness of the Hilbert transform. Thus if boundedness of the Hilbert transform was your only goal, this would not be the most efficient way to establish it; but if one was interested in treating systematically a large class of operators by a single theory, this is a good approach.
20 May, 2007 at 10:08 pm
Gil Kalai
Thanks, Terry
You wrote: “In Szemeredi’s theorem there is essentially no distinction made between an arithmetic progression of small step and an arithmetic progression of large step, thus there is no need to treat these two scales separately (and indeed, since there are so many more large steps than small steps, the net contribution of the small steps is negligible compared to the large).”
Is what you wrote a characteristic of Szemeredi’s theorem or rather of its proofs? (We can talk just about Roth’s theorem or the analogous problem for bounds for cap sets.) We will be happy to find a 3-term AP of any step; For the analysis in the present proofs AP with small steps are negligebale. Couldn’t it give an advantage to consider kernels which introduce delicate multi scale situations?
21 May, 2007 at 8:40 am
Terence Tao
Dear Gil,
. This observation was used by Varnavides to show that once Szemeredi’s theorem is proven for a small interval, it automatically holds for a large interval as well, and furthermore provides a positive density family of arithmetic progressions in that larger interval.
. If there is no such large Fourier coefficient, then we “win”; if instead we find a frequency
where things are large, we can use that frequency to determine a “scale” (we can give each element x in Z/pZ a “norm”, defined as the distance of
to the nearest integer). We can then pass to a smaller scale (i.e. to a Bohr set generated by
), and this is the heart of the original “density-incremented” proof of Roth’s theorem. Or we can then go and hunt for more types of scales to capture as much of the global structure of A as one can, until we have so much control that it becomes “obvious” that there are lots of progressions; this brings us closer to the more “ergodic” or “energy-incremented” proofs of Roth’s theorem, or to the hybrid energy+density increment proofs of Heath-Brown and Szemeredi.
This is possible, but I think it is unlikely (at least if one uses “naive” notions of scale) because it doesn’t seem to be compatible with the symmetries of the problem. For instance, in Z/pZ, the statement of Szemeredi’s theorem is invariant under any dilation by an invertible element of the field Z/pZ. Thus, there is no reason to give small steps (i.e. steps r in an interval such as {1,…,R}) any more privileged role than steps in a dilated interval such as
That said, though, there is certainly a lot of scope for utilising adaptive notions of scale, tailored to the set or function that one is seeking arithmetic progressions inside. Indeed, the known proofs of Roth and Szemeredi all do this in one way or another. For instance, to find arithmetic progressions of length 3 inside a set A in Z/pZ, one can first look for a large Fourier coefficient
21 May, 2007 at 8:32 pm
Govind
Thank you Prof. Tao for the blog. It did help me to understand the bascis easily so that now I can attack the real leterature.
21 May, 2007 at 9:52 pm
Ars Mathematica » Blog Archive » Tao on Yau
[...] What is a geometric structure? [...]
22 May, 2007 at 8:00 am
Smooth Solution to the 3-Space Navier-Stokes Equations by David Purvance « Smooth Chaos in Fluid Dynamics
[...] post argues that viscous shear is both the coercive and critical quantity that assures smooth solutions to the 3-space-periodic Navier-Stokes [...]
22 May, 2007 at 8:19 am
DavePurvance
This post argues that viscous shear is both the “coercive” and “critical” quantity that assures smooth solutions to the 3-space Navier-Stokes equations. Isn’t this what physicists have long been expected?
23 May, 2007 at 7:44 am
math student
Hi Prof. Tao,
I am curious that if we assume the solution is periodic , any partial result ?
23 May, 2007 at 7:35 pm
Soft analysis, hard analysis, and the finite convergence principle « What’s new
[...] theorem on one hand and the Furstenberg recurrence theorem on the other, as briefly discussed in my Simons lecture on these topics. (In these contexts, the correspondence between the finitary and infinitary statements is known as [...]
23 May, 2007 at 8:33 pm
Terence Tao
Dear math student,
The difficulty in global regularity for Navier-Stokes lies at the fine spatial scales – in particular, scales much smaller than the period L. Intuitively, we expect the behaviour at such scales to not “see” the periodicity, and so whatever difficulties which are present in the non-periodic case will also be expected to be present in the periodic case.
Dear Dave,
There is an error in your post when you attempt to derive (42) from (40), trying to diagonalise different matrices A_n simultaneously. The rotation matrices T_n used to diagonalise A_n depend on n, so you cannot deduce (42) from (7) by conjugating by a single matrix.
More generally, there is no quantitative advantage gained in discretising the problem by introducing some truncation parameters (such as the parameters N and M in your post). The only bounds in the discretised model which would be of use to the original continuous model would be those bounds which are uniform in the truncation parameters, since these are the only bounds which will survive the passage back to the limit. But if there was a bound in the discretised model that was independent of those parameters, one could also have proven it directly in the continuous model simply by taking the limit of the _proof_ rather than of the result. Truncated models are useful for justifying some qualitative statements (e.g. ensuring that all sums and integrals converge, etc.) but do not progress towards the heart of the matter, which is to establish global bounds on solutions to the continuous equation.
24 May, 2007 at 3:27 am
thomas1111
That’s a fascinating post! I was wondering if this had any chance of appearing in book form one day (e.g. in a second edition of your undergraduate books on analysis)?
(Btw, a few trivial typos: 5. should read “The deduction of the finitary statement…”, and the ‘latex’ command is misplaced in the Hilbert space infinitary convergence principle, and missing in a paragraph a little later.)
24 May, 2007 at 5:59 am
Michael Kinyon
A very well-written and insightful essay. You should consider submitting this to the Monthly, if you haven’t already.
24 May, 2007 at 8:18 am
Terence Tao
Thanks for the suggestions! Actually, Ben Green and I are planning to write a paper on the finite convergence principle and the philosophy surrounding it, once we have collected a few more applications of this principle.
24 May, 2007 at 9:21 am
richard borcherds
I suggest that hard analysis is essentially first order (Peano) arithmetic, while soft analysis is
second order arithmetic; in other words, hard analysis proofs are those that can be encoded in Peano arithmetic. Harvey Friedman has done a lot of work on what happens when you try to convert a proof in second order arithmetic into one in first order arithmetic: even when this is possible, the proof can get MUCH longer.
Friedman has also investigated several subsystems of second order arithmetic, which might be related to the different subsystems of soft analysis mentioned in the article.
24 May, 2