Metrics of linear growth – the solution

21 December, 2017 in math.GR, math.MG, polymath | Tags: norms, polymath14 | by Terence Tao

In the tradition of “Polymath projects”, the problem posed in the previous two blog posts has now been solved, thanks to the cumulative effect of many small contributions by many participants (including, but not limited to, Sean Eberhard, Tobias Fritz, Siddhartha Gadgil, Tobias Hartnick, Chris Jerdonek, Apoorva Khare, Antonio Machiavelo, Pace Nielsen, Andy Putman, Will Sawin, Alexander Shamov, Lior Silberman, and David Speyer). In this post I’ll write down a streamlined resolution, eliding a number of important but ultimately removable partial steps and insights made by the above contributors en route to the solution.

Theorem 1 Let ${G = (G,\cdot)}$ be a group. Suppose one has a “seminorm” function ${\| \|: G \rightarrow [0,+\infty)}$ which obeys the triangle inequality

$\displaystyle \|xy \| \leq \|x\| + \|y\|$

for all ${x,y \in G}$ , with equality whenever ${x=y}$ . Then the seminorm factors through the abelianisation map ${G \rightarrow G/[G,G]}$ .

Proof: By the triangle inequality, it suffices to show that ${\| [x,y]\| = 0}$ for all ${x,y \in G}$ , where ${[x,y] := xyx^{-1}y^{-1}}$ is the commutator.

We first establish some basic facts. Firstly, by hypothesis we have ${\|x^2\| = 2 \|x\|}$ , and hence ${\|x^n \| = n \|x\|}$ whenever ${n}$ is a power of two. On the other hand, by the triangle inequality we have ${\|x^n \| \leq n\|x\|}$ for all positive ${n}$ , and hence by the triangle inequality again we also have the matching lower bound, thus

$\displaystyle \|x^n \| = n \|x\|$

for all ${n > 0}$ . The claim is also true for ${n=0}$ (apply the preceding bound with ${x=1}$ and ${n=2}$ ). By replacing ${\|x\|}$ with ${\max(\|x\|, \|x^{-1}\|)}$ if necessary we may now also assume without loss of generality that ${\|x^{-1} \| = \|x\|}$ , thus

$\displaystyle \|x^n \| = |n| \|x\| \ \ \ \ \ (1)$

for all integers ${n}$ .
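
To spell out the matching lower bound for positive ${n}$ used above (a routine step, recorded here only as a sketch): choose ${k}$ with ${2^k > n}$ . By the power-of-two case and the triangle inequality,

$\displaystyle 2^k \|x\| = \| x^{2^k} \| \leq \|x^n\| + \|x^{2^k-n}\| \leq \|x^n\| + (2^k-n) \|x\|,$

and rearranging gives ${\|x^n\| \geq n \|x\|}$ .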

Next, for any ${x,y \in G}$ , and any natural number ${n}$ , we have

$\displaystyle \|yxy^{-1} \| = \frac{1}{n} \| (yxy^{-1})^n \|$

$\displaystyle = \frac{1}{n} \| y x^n y^{-1} \|$

$\displaystyle \leq \frac{1}{n} ( \|y\| + n \|x\| + \|y^{-1}\| )$

so on taking limits as ${n \rightarrow \infty}$ we have ${\|yxy^{-1} \| \leq \|x\|}$ . Replacing ${x,y}$ by ${yxy^{-1},y^{-1}}$ gives the matching lower bound, thus we have the conjugation invariance

$\displaystyle \|yxy^{-1} \| = \|x\|. \ \ \ \ \ (2)$

Next, we observe that if ${x,y,z,w}$ are such that ${x}$ is conjugate to both ${wy}$ and ${zw^{-1}}$ , then one has the inequality

$\displaystyle \|x\| \leq \frac{1}{2} ( \|y \| + \| z \| ). \ \ \ \ \ (3)$

Indeed, if we write ${x = swys^{-1} = t zw^{-1} t^{-1}}$ for some ${s,t \in G}$ , then for any natural number ${n}$ one has

$\displaystyle \|x\| = \frac{1}{2n} \| x^n x^n \|$

$\displaystyle = \frac{1}{2n} \| swy \dots wy s^{-1}t zw^{-1} \dots zw^{-1} t^{-1} \|$

where the ${wy}$ and ${zw^{-1}}$ terms each appear ${n}$ times. From (2) we see that conjugation by ${w}$ does not affect the norm. Using this and the triangle inequality several times, we conclude that

$\displaystyle \|x\| \leq \frac{1}{2n} ( \|s\| + n \|y\| + \| s^{-1} t\| + n \|z\| + \|t^{-1} \| ),$

and the claim (3) follows by sending ${n \rightarrow \infty}$ .
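
As an illustration of the pattern, consider the case ${n=2}$ , writing ${u := s^{-1} t}$ (a symbol introduced only for this sketch). The word inside the norm is then ${s \cdot wywy\,u\,zw^{-1}zw^{-1} \cdot t^{-1}}$ . The triangle inequality followed by (2) gives

$\displaystyle \| x^4 \| \leq \|s\| + \| w (y w y u z w^{-1} z) w^{-1} \| + \|t^{-1}\| = \|s\| + \| y w y u z w^{-1} z \| + \|t^{-1}\|,$

and repeating the same two steps on the inner word gives

$\displaystyle \| y (w y u z w^{-1}) z \| \leq \|y\| + \| y u z \| + \|z\| \leq 2\|y\| + \|u\| + 2\|z\|.$

Each round thus strips one ${w}$ and one ${w^{-1}}$ from the middle at the cost of one ${\|y\|}$ and one ${\|z\|}$ , which is where the terms ${n\|y\|}$ and ${n\|z\|}$ in the general bound come from.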

The following special case of (3) will be of particular interest. Let ${x,y \in G}$ , and for any integers ${m,k}$ , define the quantity

$\displaystyle f(m,k) := \| x^m [x,y]^k \|.$

Observe that ${x^m [x,y]^k}$ is conjugate to both ${x (x^{m-1} [x,y]^k)}$ and to ${(y^{-1} x^m [x,y]^{k-1} xy) x^{-1}}$ , hence by (3) one has

$\displaystyle \| x^m [x,y]^k \| \leq \frac{1}{2} ( \| x^{m-1} [x,y]^k \| + \| y^{-1} x^{m} [x,y]^{k-1} xy \|)$

which by (2) leads to the recursive inequality

$\displaystyle f(m,k) \leq \frac{1}{2} (f(m-1,k) + f(m+1,k-1)).$
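
The two conjugacy claims behind this recursion are identities in the free group on ${x,y}$ , so they can be checked mechanically. The following is a minimal sketch, not part of the original argument: words are encoded as strings with uppercase letters standing for inverses, and the helper names (`reduce_word`, `element`, `check_conjugacy`) are hypothetical choices made for this illustration.

```python
# Words in the free group on x, y: lowercase = generator, uppercase = its inverse.

def reduce_word(word: str) -> str:
    """Freely reduce a word by cancelling adjacent inverse pairs such as 'xX' or 'Yy'."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)

def inverse(word: str) -> str:
    """Inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()

def power(word: str, n: int) -> str:
    """Return word^n for any integer n (negative n uses the inverse word)."""
    return word * n if n >= 0 else inverse(word) * (-n)

COMMUTATOR = "xyXY"  # [x,y] = x y x^{-1} y^{-1}

def element(m: int, k: int) -> str:
    """The reduced word x^m [x,y]^k."""
    return reduce_word(power("x", m) + power(COMMUTATOR, k))

def check_conjugacy(m: int, k: int) -> bool:
    """Check the identities used to derive the recursion for f(m,k)."""
    g = element(m, k)
    # First form: g = x * (x^{m-1} [x,y]^k), i.e. w = x.
    first_ok = reduce_word("x" + element(m - 1, k)) == g
    # Second form: y^{-1} g y = z * x^{-1} with z = y^{-1} x^m [x,y]^{k-1} x y.
    z = reduce_word("Y" + element(m, k - 1) + "xy")
    second_ok = reduce_word(z + "X") == reduce_word("Y" + g + "y")
    # z is itself conjugate (by xy) to x^{m+1} [x,y]^{k-1}, which gives f(m+1,k-1) via (2).
    z_ok = reduce_word("xy" + z + "YX") == element(m + 1, k - 1)
    return first_ok and second_ok and z_ok

if __name__ == "__main__":
    print(all(check_conjugacy(m, k) for m in range(-3, 4) for k in range(-3, 4)))
```

Since free reduction produces a unique normal form, equality of reduced words is equality in the free group, and hence the identities hold in every group ${G}$ .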

We can write this in probabilistic notation as

$\displaystyle f(m,k) \leq {\bf E} f( (m,k) + X )$

where ${X}$ is a random vector that takes the values ${(-1,0)}$ and ${(1,-1)}$ with probability ${1/2}$ each. Iterating this, we conclude in particular that for any large natural number ${n}$ , one has

$\displaystyle f(0,n) \leq {\bf E} f( Z )$

where ${Z := (0,n) + X_1 + \dots + X_{2n}}$ and ${X_1,\dots,X_{2n}}$ are iid copies of ${X}$ . We can write ${Z = (1,-1/2) (Y_1 + \dots + Y_{2n})}$ where $Y_1,\dots,Y_{2n} = \pm 1$ are iid signs. By the triangle inequality, we thus have

$\displaystyle f( Z ) \leq |Y_1+\dots+Y_{2n}| (\|x\| + \frac{1}{2} \| [x,y] \|),$

noting that $Y_1+\dots+Y_{2n}$ is an even integer. On the other hand, $Y_1+\dots+Y_{2n}$ has mean zero and variance $2n$ , hence by Cauchy-Schwarz

$\displaystyle f(0,n) \leq \sqrt{2n}( \|x\| + \frac{1}{2} \| [x,y] \|).$

But by (1), the left-hand side is equal to ${n \| [x,y]\|}$ . Dividing by ${n}$ and then sending ${n \rightarrow \infty}$ , we obtain the claim. $\Box$
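
The probabilistic step can be sanity-checked numerically. The sketch below is an illustration only (the function names are hypothetical): it verifies by brute force that ${Z}$ always lies on the line spanned by ${(1,-1/2)}$ , computes ${{\bf E}|Y_1+\dots+Y_{2n}|}$ exactly, and solves the final inequality for ${\|[x,y]\|}$ to show the resulting bound decaying like ${n^{-1/2}}$ .

```python
from fractions import Fraction
from itertools import product
from math import comb, sqrt

STEPS = [(-1, 0), (1, -1)]  # the two equally likely values of X

def check_line_representation(n: int) -> bool:
    """Check Z = (0,n) + X_1 + ... + X_{2n} equals (1,-1/2) * (Y_1 + ... + Y_{2n}) for every outcome."""
    for choice in product([0, 1], repeat=2 * n):
        z1 = sum(STEPS[c][0] for c in choice)
        z2 = n + sum(STEPS[c][1] for c in choice)
        s = sum(1 if c == 1 else -1 for c in choice)  # Y_i = +1 when X_i = (1,-1)
        if (z1, 2 * z2) != (s, -s):
            return False
    return True

def expected_abs_sum(n: int) -> Fraction:
    """Exact E|Y_1 + ... + Y_{2n}| for iid uniform signs, summing over the number j of +1 signs."""
    total = sum(comb(2 * n, j) * abs(2 * j - 2 * n) for j in range(2 * n + 1))
    return Fraction(total, 4 ** n)

def commutator_bound(n: int, norm_x: float) -> float:
    """Solve n*c <= sqrt(2n) * (norm_x + c/2) for c = ||[x,y]||; the denominator is positive for n >= 1."""
    return sqrt(2 * n) * norm_x / (n - sqrt(2 * n) / 2)

if __name__ == "__main__":
    print(check_line_representation(3))
    for n in (1, 10, 100, 10_000):
        print(n, float(expected_abs_sum(n)), sqrt(2 * n), commutator_bound(n, 1.0))
```

The exact expectation stays below ${\sqrt{2n}}$ , as Cauchy-Schwarz predicts, and the bound on ${\|[x,y]\|}$ tends to zero as ${n}$ grows.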

The above theorem reduces such seminorms to abelian groups. It is easy to see from (1) that any torsion element of such groups has zero seminorm, so we can in fact restrict to torsion-free groups, which we now write using additive notation ${G = (G,+)}$ , thus for instance ${\| nx \| = |n| \|x\|}$ for ${n \in {\bf Z}}$ . We think of ${G}$ as a ${{\bf Z}}$ -module. One can then extend the seminorm to the associated ${{\bf Q}}$ -vector space ${G \otimes_{\bf Z} {\bf Q}}$ by the formula ${\|\frac{a}{b} x\| := \left|\frac{a}{b}\right| \|x\|}$ , and then to the associated ${{\bf R}}$ -vector space ${G \otimes_{\bf Z} {\bf R}}$ by continuity, at which point it becomes a genuine seminorm (provided we have ensured the symmetry condition ${\|x\| = \|x^{-1}\|}$ ). Conversely, any seminorm on ${G \otimes_{\bf Z} {\bf R}}$ induces a seminorm on ${G}$ . (These arguments also appear in this paper of Khare and Rajaratnam.)
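
As a toy illustration of the abelian picture, the sketch below takes ${G = {\bf Z}^2}$ with the hypothetical seminorm ${\|(a,b)\| := |2a - 3b|}$ (chosen only for this example; it vanishes on multiples of ${(3,2)}$ , so it is a seminorm but not a norm) and checks the triangle inequality, the doubling identity, and the rational scaling rule in exact arithmetic.

```python
from fractions import Fraction
from itertools import product

def seminorm(v):
    """Hypothetical example seminorm on pairs: ||(a, b)|| = |2a - 3b|."""
    a, b = v
    return abs(2 * a - 3 * b)

def add(u, v):
    """Componentwise sum in Z^2 (or Q^2)."""
    return (u[0] + v[0], u[1] + v[1])

def scale(c, v):
    """Scalar multiple c * v."""
    return (c * v[0], c * v[1])

def check_on_grid(radius: int = 4) -> bool:
    """Check the triangle inequality and ||2v|| = 2||v|| on a small grid of integer points."""
    points = list(product(range(-radius, radius + 1), repeat=2))
    triangle = all(seminorm(add(u, v)) <= seminorm(u) + seminorm(v)
                   for u in points for v in points)
    doubling = all(seminorm(scale(2, v)) == 2 * seminorm(v) for v in points)
    return triangle and doubling

def check_rational_scaling(v, q: Fraction) -> bool:
    """Check ||q v|| = |q| ||v|| for a rational scalar q."""
    return seminorm(scale(q, v)) == abs(q) * seminorm(v)

if __name__ == "__main__":
    print(check_on_grid())
    print(check_rational_scaling((1, 5), Fraction(-3, 7)))
    print(seminorm((3, 2)))  # a nonzero element of seminorm zero
```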

141 comments

21 December, 2017 at 4:28 pm

Terence Tao

Thanks to everyone who participated! The project ended up being a de facto Polymath project, and played out much as other successful such projects have – with a large number of small observations that, despite some corrections and backtracking, did cause the project to ultimately converge towards a final solution, and in a manner much faster than if just a handful of us were working on the problem.

The result looks interesting enough (and non-trivial enough) to publish, possibly under the usual “D.H.J. Polymath” pseudonym. (Our tradition in that case is for participants to report their name, affiliation, and (if applicable) grant support on the corresponding wiki page, rather than being listed in the published paper.) Let me know what you all think about this. One should of course look to see if there are any applications or generalisations for the result… maybe Apoorva can tell us a bit more about the motivation for the problem?

21 December, 2017 at 6:30 pm

Pace Nielsen

I just got back from a family gathering to see the problem was finished up.
Great job everyone, and thanks Terry and Apoorva for giving us this problem to think about!

22 December, 2017 at 3:33 am

Tobias Fritz

It would be great to have it published, and I’d be happy to help with the writing to the best of my abilities as a member of D.H.J. Polymath.

(Technically I should be working on things that could please hiring committees. But then again I don’t want to adapt my work to that.)

22 December, 2017 at 10:08 am

Lior Silberman

I also support publishing, naturally under the pseudonym. Tobias: I’m not sure about the exact etiquette, and Terry has more experience, but I’m fairly certain you may include this paper in your list of publications.

22 December, 2017 at 10:24 am

Terence Tao

I include the Polymath papers I was involved in within my own “unofficial” list of publications, but my institution doesn’t formally accept them as publications under my name. However, they can still be entered in as “other work”. Also I think it is completely appropriate to list yourself as a contributor to the project in talks, research statements, citations in other papers, etc.

I’ll start a skeleton draft of a paper on a Dropbox folder later today and share it with you all soon.

Three unrelated retrospective comments about this project. Firstly, as with Polymath8 (bounded gaps between primes), it seemed to be quite helpful to have a quantitative way of measuring progress. In Polymath8, it was the bound on gaps between primes; here, it was the bound on $\|[a,b]\|$ . Because of this scorekeeping mechanism, it became easier to identify the most promising techniques and focus attention on them.

Secondly, again as with some previous Polymath projects, computer assistance was quite important, even if the final proof is not visibly computer-assisted in any way. In particular, the crucial “inward repetition” technique (now formalised in the proof of the inequality (3) above) was discovered by deconstructing some inequalities that were computer generated. We’re still some way off from the dream of computers routinely generating large chunks of proofs and/or conjectures for us, but nevertheless they are playing an increasingly essential role in mathematics.

Finally, I just wanted to observe that the proof techniques here are implicitly using (a variant of) the “tensor power trick”, which is one of my favourite techniques in analysis, particularly with regards to how it can “magically” eliminate lower order error terms from one’s bounds by taking the tensor power limit $n \to \infty$ .

For posterity, it might be nice for other participants to record any impressions they had on the project (particularly with regards to any suggestions on how to improve future projects of this type).

22 December, 2017 at 11:39 am

Apoorva Khare

These are three very informative observations, Terry!

Speaking of the Wiki page for this Polymath, can it be found on the main Polymath page:

http://www.michaelnielsen.org/polymath1/index.php?title=Main_Page

or should I be looking somewhere else?

22 December, 2017 at 12:24 pm

Lior Silberman

Yes — on the main Polymath wiki page look for the last item in the section “Polymath-like projects”

22 December, 2017 at 12:47 pm

Lior Silberman

Terry: would you consider trying a source control system (e.g. Subversion) instead of Dropbox? Our department has a server so I can host the project.

22 December, 2017 at 2:18 pm

Terence Tao

I think this particular project is simple enough (the paper is likely to just be six or seven pages) that we won’t need more advanced version control, but this is certainly something to consider in the future. (Usually for polymath projects we have a separate planning thread running concurrently with the research in which these things can be worked out, but this particular project started and ended too quickly to set all this up.)

From my experience, though, while almost all participants are willing to use Dropbox, there are some who are not as keen on using version control such as Subversion (and if one switches to even more sophisticated control systems such as git, then the enthusiasm can dip even lower). Dropbox seems “good enough” for small and medium sized writing projects, particularly if one breaks up files into pieces to help avoid edit conflicts. I certainly agree though that more advanced version control would be desirable if one had a more complicated project (e.g. a monograph).

Of course, if the other editors of the paper prefer Subversion (or some other platform) over Dropbox, it would be relatively easy to transfer over at this stage.

EDIT: one thing about Dropbox: it helps if people announce on this blog if they are going to perform substantial edits on a section of the paper, and then to also announce when they are done editing. However, quick minor edits can usually be done without any such announcement (provided that nobody else is claiming a “lock” on the relevant section). This is of course the price one pays for not having more sophisticated version control in place…

24 December, 2017 at 8:21 am

Siddhartha Gadgil

Firstly, this experience has been truly enjoyable – thanks to everyone for this.

An observation: one useful feature here was an “adversarial approach” – trying to construct semi-norms positive on commutators, while trying to rule these out. While this is normal in mathematics, it may be easier in polymath projects as it needs fewer mental switches, and also we have a range of attacks based on the ways of thinking of the participants.

21 December, 2017 at 5:12 pm

Anonymous

This solution seems to be from “the Book” !

21 December, 2017 at 5:49 pm

Siddhartha Gadgil

Looks like this elegant proof gives a quantitative refinement (applicable in particular to quasi-morphisms) where we replace the triangle inequality by

$\Vert xy\Vert \leq \Vert x\Vert + \Vert y \Vert + c$

where we have $c < \epsilon \min(\Vert x\Vert, \Vert y\Vert)$ .

If this holds, it is a curious rigidity phenomenon.

21 December, 2017 at 10:01 pm

Siddhartha Gadgil

That was too hopeful, but it looks like the norm of the commutator is bounded by $2c$ , which for actual semi-norms gives the result.

21 December, 2017 at 8:29 pm

Lior Silberman

Erratum: in the probabilistic inequality no need to take norm of $f((0,n)+X_1+\cdots)$ , while in the next sentence it’s $\vert\cdot\vert$ that is the Euclidean norm in the plane.

[Corrected, thanks -T.]

Finally, in the probabilistic context this usage of Cauchy–Schwarz is often called Chebycheff’s inequality, but it’s a matter of taste.

21 December, 2017 at 8:37 pm

Lior Silberman

Sorry — I’m wrong about the name of the inequality.

23 December, 2017 at 4:58 am

Tobias Fritz

I’m still a bit confused about the Cauchy-Schwarz inequality there. How exactly does that work? It’s clear by Jensen’s inequality and the concavity of the square root,

$\mathbf{E}|Z| = \mathbf{E}\sqrt{Z^2} \leq \sqrt{\mathbf{E}Z^2} = \sqrt{2n},$

but I don’t see how to do it with Cauchy-Schwarz. Can anyone clear this up?

23 December, 2017 at 6:16 am

Anonymous

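It is simply applied to the product $1 \cdot Z$ , that is, $\mathbf{E}|1 \cdot Z| \leq \sqrt{\mathbf{E} 1^2} \sqrt{\mathbf{E} Z^2}$ .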

22 December, 2017 at 12:00 am

Avi Levy

Just before the last display in the proof of Theorem 1, you refer to the random variable $(0,n)+X_1+\cdots+X_{2n}$ (call it $X$ ) as having variance $\tfrac{5}{4}n$ . But $X$ is two-dimensional, so I’m not sure what this means exactly – and in fact, its covariance matrix is not even diagonal. However, by a simple calculation $\mathbb E|X|^2=\tfrac{5}{2}n$ , yielding the (slightly weaker but still sufficient) estimate $f(0,n)\leq \sqrt{\frac{5n}{4}}(\|x\|+\|[x,y]\|)$ . Perhaps this was what was intended all along?

[Corrected, thanks – T.]

22 December, 2017 at 1:30 am

Anonymous

It also seems clearer to call $X$ “random vector” (instead of “random variable”)

[Corrected, thanks – T.]

22 December, 2017 at 3:18 am

Anonymous

Since the random vector is degenerate (its covariance matrix has rank 1), it seems clearer to represent it by
$X = (-1,0) (1/2 + Y) + (1,-1) (1/2 - Y) = (0, -1/2) + (-2, 1)Y$
where $Y$ is a random variable, taking the values $1/2, -1/2$ with probability $1/2$ each. Hence
$Z = (-2, 1) (Y_1 + \dots + Y_{2n})$ which implies
$f(Z) \leq (2\|x\| + \|[x,y]\|) |Y_1 + \dots + Y_{2n}|$
whose expectation is $O(n^{1/2})$ .

[Suggestion implemented, thanks – T.]

22 December, 2017 at 3:45 am

Tobias Fritz

To get the optimal bound that follows from that method, we could also use the fact that $Z=(0,n)+X_1+\ldots+X_{2n}$ is supported on a one-dimensional subspace, apply the central limit theorem there, and use the known expectation value of the half-normal distribution. I don’t have time to do this now, but I could do it later.

22 December, 2017 at 7:41 am

Tobias Fritz

But of course, optimizing the constant is completely pointless, since what we’re showing is that $f(0,n)=0$ anyway.
