Let be a natural number. We consider the question of how many “almost orthogonal” unit vectors one can place in the Euclidean space . Of course, if we insist on being *exactly* orthogonal, so that for all distinct , then we can only pack at most unit vectors into this space. However, if one is willing to relax the orthogonality condition a little, so that is small rather than zero, then one can pack a lot more unit vectors into , due to the important fact that pairs of vectors in high dimensions are typically almost orthogonal to each other. For instance, if one chooses uniformly and independently at random on the unit sphere, then a standard computation (based on viewing the as gaussian vectors projected onto the unit sphere) shows that each inner product concentrates around the origin with standard deviation and with gaussian tails, and a simple application of the union bound then shows that for any fixed , one can pack unit vectors into whose inner products are all of size .

One can remove the logarithm by using some number theoretic constructions. For instance, if is twice a prime , one can identify with the space of complex-valued functions , where is the field of elements, and if one then considers the different quadratic phases for , where is the standard character on , then a standard application of Gauss sum estimates reveals that these unit vectors in all have inner products of magnitude at most with each other. More generally, if we take and consider the different polynomial phases for , then an application of the Weil conjectures for curves, proven by Weil, shows that the inner products of the associated unit vectors with each other have magnitude at most .

As it turns out, this construction is close to optimal, in that there is a polynomial limit to how many unit vectors one can pack into with an inner product of :

Theorem 1 (Cheap Kabatjanskii-Levenstein bound)Let be unit vector in such that for some . Then we have for some absolute constant .

In particular, for fixed and large , the number of unit vectors one can pack in whose inner products all have magnitude at most will be . This doesn’t quite match the construction coming from the Weil conjectures, although it is worth noting that the upper bound of for the inner product is usually not sharp (the inner product is actually times the sum of unit phases which one expects (cf. the Sato-Tate conjecture) to be uniformly distributed on the unit circle, and so the typical inner product is actually closer to ).

Note that for , the case of the above theorem (or more precisely, Lemma 2 below) gives the bound , which is essentially optimal as the example of an orthonormal basis shows. For , the condition is trivially true from Cauchy-Schwarz, and can be arbitrariy large. Finally, in the range , we can use a volume packing argument: we have , so of we set , then the open balls of radius around each are disjoint, while all lying in a ball of radius , giving rise to the bound for some absolute constant .

As I learned recently from Philippe Michel, a more precise version of this theorem is due to Kabatjanskii and Levenstein, who studied the closely related problem of sphere packing (or more precisely, cap packing) in the unit sphere of . However, I found a short proof of the above theorem which relies on one of my favorite tricks – the tensor power trick – so I thought I would give it here.

We begin with an easy case, basically the case of the above theorem:

Lemma 2Let be unit vectors in such that for all distinct . Then .

*Proof:* Suppose for contradiction that . We consider the Gram matrix . This matrix is real symmetric with rank at most , thus if one subtracts off the identity matrix, it has an eigenvalue of with multiplicity at least . Taking Hilbert-Schmidt norms, we conclude that

But by hypothesis, the left-hand side is at most , giving the desired contradiction.

To amplify the above lemma to cover larger values of , we apply the tensor power trick. A direct application of the tensor power trick does not gain very much; however one can do a lot better by using the *symmetric* tensor power rather than the raw tensor power. This gives

Corollary 3Let be a natural number, and let be unit vectors in such that for all distinct . Then .

*Proof:* We work in the symmetric component of the tensor power , which has dimension . Applying the previous lemma to the tensor powers , we obtain the claim.

Using the trivial bound , we can lower bound

We can thus prove Theorem 1 by setting for some sufficiently large absolute constant .

## 23 comments

Comments feed for this article

18 July, 2013 at 2:34 pm

frankpmurphyhReblogged this on algebrafm.

18 July, 2013 at 7:13 pm

John MangualDidn’t you build independent vectors with inner product less than ? Then why is your bound in Lemma 2 ?

I am skeptical as we had drilled into our heads that had exactly basis vectors.

18 July, 2013 at 7:23 pm

Terence TaoLemma 2 concerns vectors whose inner product is less than in magnitude, which in the case would be . This is less than the bound obeyed by the example of the quadratic phase functions. The factor of makes a big difference!

This is one of the initially unintuitive aspects of high-dimensional geometry; perfect orthogonality is difficult to achieve (and, as you say, one can only pack perfectly orthogonal vectors into ), but almost orthogonality becomes increasingly easy to achieve as one loosens the definition of almost orthogonality, basically because with the exponentially large amount of “room” available in a high-dimensional space. With , one can pack vectors together with an inner product of at most , or vectors together with an inner product of at most , and so forth. Of course, these vectors cannot be linearly independent, but it is possible in high dimensions for a family of vectors to be simultaneously linearly dependent and almost orthogonal, as the latter only guarantees a “local” form of linear independence. (Among other things, this is one of the reasons why compressed sensing can work.)

19 July, 2013 at 12:26 am

Mikhail KatzI wonder if this ties in with Gil Kalai’s negative solution of the Borsuk problem (coloring pointsets in Euclidean space).

19 July, 2013 at 6:43 am

A.Sorry for the stupid question. I haven’t seen the symmetric component before. Is the scalar product of tensor powers in the symmetric component the same as in the tensor product?

19 July, 2013 at 6:58 am

A.Ok, I got it now.

19 July, 2013 at 10:44 am

valuevarThis is quite nice! A few years ago, Akshay wondered whether there was a bound of this quality with a simple proof (but didn’t get anything).

This means that the number of points one can pack on an -dimensional sphere at angles from each other is at most , where . Can the be removed?

19 July, 2013 at 2:06 pm

Terence TaoFor very small , comparable to , one can’t remove the logarithm, because of the number-theoretic examples, e.g. vectors all at angle at least from each other. But one may be able to do better for larger values of ; when the upper bound is of shape or so but the number-theoretic construction only gives vectors (this is something to do with the Sato-Tate type fluctuations in the Frobenius eigenvalues in the number theoretic constructions, and so may represent the limit of the number-theoretic method).

20 July, 2013 at 5:08 pm

The Riemann hypothesis in various settings | What's new[…] To enter in LaTeX in comments, use (without the < and > signs, of course; in fact, these signs should be avoided as they can cause formatting errors). See the about page for details and for other commenting policy. « A cheap version of the Kabatjanskii-Levenstein bound for almost orthogonal vectors […]

21 July, 2013 at 1:27 am

Gil KalaiA few (a bit rusty) remarks and thoughts on the post. We start with a nice symmetric space . In our case is an -dimensional sphere, and other important cases are when (Hamming scheme) , and (Johnson scheme). (Association schemes is a useful general setting, and Gelfand pairs may serve as another nice general relevant setting.) The questions we can ask are:

1) What is the maximum number of points of a set

YinXof vectors with the property that the distance between every two points inYis at least α?2) What is the maximum measure for a (measurable) set

Ywhere the distance between every two points is at most α?3) What is the maximum measure for a (measurable) set where the distance between every two points is different than α?

4) What is the maximum ratio where the distance between every two points in

Yis at least α, and between two points inXis greater than α.5) What is the maximum ratio between the measures where the distance between every two points in

Yis at least α, and between two points inXis greater than α?The object for question 1) are called

codes. (Error-correcting codes for and spherical codes for .) Question 2 is an isoperimetric question. Question 3 is related to the Frankl-Wilson theorem where the polynomial methods is used to derive (close to optimal) bounds for the {0,1}^n case. The Frankl-Rodl paper gives an amazing method to move from question 2 to question 3. Problem 5 is related to the Borsuk’s problem that Jeff Kahn and I disproved. A special case of Problem 4 is the Erdos-Faber-Lovasz conjecture. A natural construction for question 3 is to take open caps of radius β/2 around the codewords of an optimal spherical code of minimal distance α+2β for some β. (Maybe it is even always the best to take α+2β=π so the code is two antipodal points.)The very basic upper bounds for codes is the volume bound. There is a simple improvement based on intersecting our code with a small spherical strip (for ) or with all vectors of fixed weight (for the Hamming scheme) and apply the volume bound for the intersection. (I am not quite sure if this already gives the quantitative results in this post.) The most advanced method is Delsartes linear-programming method which is used by Kabatjanskii and Levenstein, to give the best known results for spherical codes. Improving asymptotically these bounds for spherical codes and the bounds by McELIECE, RODEMICH, RUMSEY, and WELCH for error correcting codes are outstanding open problems with no progress over three decades. Of course the question of spherical codes is closely related to the classical problem of the densest sphere packing in Euclidean spaces.

The tensor power/symmetric power trick is indeed very nice and it is curious that here we have to insist on symmetric powers. A similar trick is used to move from problem 3 to 5 in the case of Borsuk’s problem, and it also played a role in the disproof by Khot and Vishnoi of the Goemans-Linial conjecture. A propos this, I am curious if symmetric powers of sets of constant width have constant width but this is a long shot.

21 July, 2013 at 1:50 am

Gil KalaiLet me state more carefully problems 4 and 5.

4) What is the minimum over subsets of of the maximum over subsets of of the ratio , where the distance between every two points in is at least α, and between two points in is greater than α.

5) What is the minimum over of the maximum over of the ratio between the measures where the distance between every two points in Y is at least α, and between two points in Z is greater than α?

22 July, 2013 at 2:57 am

Gil KalaiSorry, 5) still needs to be corrected

5) What is the minimum over choices of of the maximum over of the ratio between the measures where the distance between every two points in Y is

at most α, and between every two points in Z isless than α?22 July, 2013 at 7:39 am

konradswanepoelThis proof seems to be closely related to results of of Noga Alon on matrices close to the identity. See Section 9 of his paper

Problems and results in extremal combinatorics, I, Discrete Math. 273 (2003), 31-53

or his paper

Perturbed identity matrices have high rank: proof and applications, Combinatorics, Probability and Computing 18 (2009), 3-15

22 July, 2013 at 9:22 am

Terence TaoThanks for the reference! This isn’t the first time I managed to rediscover some argument or trick that Noga had figured out years ago… :-) (I once gave a talk mentioning a result I had proven recently and was quite proud of, and within a minute Noga (who was in the audience) asked, “Did you use [technique X] to reduce the problem to [simpler problem Y]?”. Of course, this is what I had done, and was the heart of the argument.)

I also see that Noga has a discussion of Harald’s question of removing the logarithm factor (when is fixed rather than decaying in ), but it seems that this is closely related to the notorious problem of removing the logarithm in the Johnson-Lindenstrauss lemma, so is likely to be very difficult (though also very interesting).

22 July, 2013 at 9:52 am

Emmanuel KowalskiThis is very nice indeed!

Actually, I think that Kabatjanski-Levenstein never explicitly address the regime where the inner product is O(1/sqrt(n)); in my paper with Phllippe and Fouvry where we used such ideas to give upper bounds on the number of sheaves with bounded complexity on a curve ( http://www.math.ethz.ch/~kowalski/counting-sheaves.pdf ) we ended proving such a bound using their method and a lemma on zeros of Hermite polynomials.

The question of optimizing the lower bounds for the regime O(1/sqrt(n)) using as many sheaves as possible is very interesting, and related to conjectures of Drinfeld, Deligne and others. We haven’t yet succeeded in getting a serious improvement on counting merely rank one sheaves.

26 July, 2013 at 5:22 am

A cheap version of the Kabatjanskii-Levenstein bound for almost orthogonal vectors | Fashion[…] Read more… […]

8 November, 2013 at 10:59 pm

Mustafa SaidSome of the ideas and results in this blog may be used to study “almost normal matrices.” If we let be the matrix with “almost orthogonal vectors,” then will be small. Moreover, one can show that there exists a normal matrix and a universal constant, such that $||N-A|| \leq K ||AA^* – A^*A||$. In other words, “almost normal matrices” are near normal approximates.

6 February, 2014 at 2:50 am

gowersI don’t understand your remark that the bound when A=1 is O(n). It seems to be for an absolute constant C.

I’m asking because I’m interested to know what the correct bound is when A is something like 1/2. Do you have any information about this?

6 February, 2014 at 9:08 am

Terence TaoOops, you’re right; the argument above only gives a linear bound for A=1/2 rather than A=1 (Lemma 2). I’ve reworded the text accordingly.

6 February, 2014 at 10:41 am

gowersOops — I missed that what I wanted to know about was dealt with in Lemma 2.

3 March, 2014 at 3:08 pm

The Rank Lemma | Geometry and Combinatorics[…] Before giving a proof of Lemma 1, I present the following application, which is a slight strengthening of Lemma 2 of this post by Terence Tao. […]

5 June, 2014 at 9:55 am

When is correlation transitive? | What's new[…] the exceptional pairs that are even weakly correlated to each other become exponentially rare. (See this previous blog post for some related discussion; in particular Lemma 2 from that post is closely related to the van der […]

1 September, 2015 at 6:39 pm

AnonymousIs this post available in any of your books, I just checked some of them but I could not find anyone which includes with this post?

[The posts from 2012 onward have not been published in book format; I may do so at some later point in time, though. -T]