Last updated October 27, 2014

Topics in random matrix theory.

Terence TaoPublication Year: 2012

ISBN-10: 0-8218-7430-6

ISBN-13: 978-0-8218-7430-1

Graduate Studies in Mathematics, vol. 132

American Mathematical Society

This continues my series of books derived from my blog. The preceding books in this series were “Structure and Randomness“, “Poincaré’s legacies“, “An epsilon of room“, and “An introduction to measure theory“.

A draft version of the MS can be found here (last updated, Aug 23, 2011; note that the page numbering there differs from that of the published version). It is based primarily on these lecture notes.

Pre-errata (errors in the draft that were corrected in the published version):

- p. 20: In Exercise 1.1.11(i), “if and only for” should be “if and only if for”.
- p. 21: In Exercise 1.1.18, in the definition of convexity, should be .
- p. 46: In Exercise 1.3.16, Weilandt should be Wielandt. Similarly on p. 47 after Exercise 1.3.9, in Exercise 1.3.22(v) on page 53, on page 137 before (2.82), on page 184 after (2.129), and on page 208 before 2.6.6. Also, before (1.66), the supremum should be over rather than .
- p. 72: All occurrences of on this page should be .
- p. 183: The formula (2.127) should be attributed to Dyson ( The three fold way, J. Math. Phys. vol. 3 (1962) pgs. 1199-1215) rather than to Ginibre. Similarly on pages 251, 259, and 265.
- p. 225-226: U should be U_0 (several occurrences). Also, should be and should be .
- p. 225, Section 2.8.2: right parenthesis should be added after “sufficient decay at infinity.”
- p. 228, just before (2.179): “g_n” should be “f_n”
- p. 231: “lets ignore” should be “lets us ignore”
- p. 258: In the second paragraph, should be , and should be .

- Page 16: In Example 1.1.19, the Poisson distribution should be classified as having sub-exponential tail rather than being sub-Gaussian.
- Page 22: In Exercise 1.1.18(ii), the requirement that the take values in should be dropped.
- Page 29, In Definition 1.1.23, should lie in rather than .
- Page 32: In Definition 1.1.29, Remark 1.1.30, and Exercise 1.1.25, “-compact” should be “-compact and locally compact”.
- Page 41: In the proof of Theorem 1.3.1, should strictly speaking be (though it makes no difference to the remainder of the argument).
- Page 49: After (1.72), should be , and “orthogonal” should be “orthogonal (using the real part of the inner product)”.
- Page 51: In Section 1.3.6, the role of rows and columns should be reversed in “at least as many rows as columns”.
- Page 53: In Exercise 1.3.22 (vi), (vii), the eigenvalues should be replaced by singular values .
- Page 61: In the proof of Theorem 2.1.3, should be throughout (three occurrences), and should be . Also, the reference to (2.9) here may be replaced by (2.10).
- Page 68: In the last display of Proposition 2.1.9, should be .
- Page 70: The footnote “Note that we must have …” should read “Note that we should take if we wish to allow the variance to actually be able to attain the value “.
- Page 76: In the proof of Lemma 2.1.16, after (2.24), the expectation in the next two expressions should instead be conditional expectation with respect to .
- Page 78: In the proof of Proposition 2.1.19, the definitions of and are missing absolute value signs (they should be and respectively).
- Page 81: In Remark 2.2.2, “central limit” should be “central limit theorem”
- Page 90: In Theorem 2.2.9(i), k should range in 1,2,3,… rather than 0,1,2,… .
- Page 95: In the first display of the proof of Theorem 2.2.11, should be .
- Page 97: Near the end of Section 2.2.5: [TaVuKr2010] should be [TaVu2009b].
- Page 98: In the proof of Theorem 2.2.13, should be assumed to be Lipschitz and not just continuous.
- Page 99: In the final display, every term should have an expectation symbol attached to it.
- Page 113: Before (2.62), the symbol should be .
- Page 114: In Proposition 2.3.10, should be (two occurrences).
- Page 117: In item (ii) i the list after (2.69), the condition should be added.
- Page 126, second line: the error term needs to be improved to .
- Page 127: In the proof of Lemma 2.3.22, the first arrival can be either a fresh leg or a high multiplicity edge, not simply a fresh leg as stated in the text. However, this does not affect the rest of the argument.
- Page 128: For each non-innovative leg, one also needs to record a leg that had already departed from the vertex that one is revisiting; this increases the total combinatorial cost here from to (and the first display should be similarly adjusted). However, the rest of the argument remains unchanged. In the last display and the first display of the next page, should be .
- Page 130: The statement “(2.76) holds” should read “(2.76) fails”.
- Page 170: Before Exercise 2.5.10, the constraint should be .
- Page 174: In Exercise 2.5.15, the additional hypothesis that X and Y are self-adjoint should be added.
- Page 175: In the second display, an extra right parenthesis should be added to the left-hand side.
- Page 176: In the proof of Lemma 2.5.20, should be . Also, should be .
- Page 181: The formulae for in Exercises 2.5.20 and Exercises 2.5.21 should be swapped with each other.
- Page 183: In (2.127), the factor is missing from the denominator.
- Page 187-188: The derivation of the Ginibre formula requires modification, because the claim that the space of upper triangular matrices is preserved with respect to conjugation by permutation matrices is incorrect. Instead, the given data needs to be replaced by a pair consisting of the random matrix , together with a random enumeration of the eigenvalues of , and the factorisation is then subjected to the constraint that has diagonal entries in that order. (To put it another way, one works in an n!-fold cover of the space of matrices with simple spectrum.) One then performs the analysis in the text, with the enumeration of the eigenvalues of a perturbation of understood to be the one associated with the diagonal entries of . (Details may be found at the associated blog entry for this section.)
- Page 192: In Footnote 52 to Section 2.6.3, the exponent should be instead.
- Page 203: In Exercise 2.6.6, a factor of is missing in the error term. Earlier in the eigenfunction equation for , should be .
- Page 206: In Remark 2.6.8, the denominator in the first display should instead be in the numerator, and similarly for (2.169); the denominator two displays afterwards should similarly be .
- Page 212: For the application of Markov’s inequality and through to the next page, all appearances of should be replaced by , and “for at least values of ” should be “for at least values of .
- Page 213: In Exercise 2.7.1, should be , the condition should be , the final bound should be rather than , and should be . The definition of incompressibility should be , with to be chosen later, in the next display should be , and “within … positions” on the next paragraph should be “within … positions”. Finally, in footnote 58, the summation should go up to rather than to in both occurrences.
- Page 214: should be (two occurrences), and should be in Exercise 2.7.2.
- Page 215: In the last line “Proposition 2.7.3″ should be “Proposition 2.7.3 and (2.172)”, and on the next page, should be (two occurrences).
- Page 237: In Proposition 3.1.5, “same distribution as ” should be “same distribution as . Similarly in Proposition 3.1.16.
- Page 251: In Exercise 3.1.11, should be . In (3.12) and the preceding display, should be .

## 58 comments

Comments feed for this article

20 February, 2011 at 2:24 pm

SimonSpace missing on p. 273 – “In this sectionwe lay the necessary probability

theory…”

[Thanks, this will be corrected in the next revision of the ms. -T.]20 February, 2011 at 5:40 pm

PietroDear Terry,

I read a bit of your draft and saw some places where a line runs past the right margin, mostly because of formulas. Since these are easy to spot, I took the liberty of scanning the rest for other examples and here’s what I found. Pages 10, 28, 35, 39, 44, 60, 86, 88, 95, 111, 115, 138, 139, 175, 200, 2×205, 207, 210, 213, 216, 218, 228, 233, 243, 255, 2×257, 258, 260, 261, 262, 270, 280, 287, 290, 292, 303, 317.

Thanks for these notes!

21 February, 2011 at 5:42 am

tomsimHello,

preface towards bottom of page ix “as will as a …” should probably read “as well as a …”.

The only type of mistake a layman can catch…..

Thanks for your effort on the blog!

[Thanks, this will be corrected in the next revision of the ms. -T.]21 February, 2011 at 10:02 am

Ambuj TewariThanks for putting up the book draft here. Some comments up to p. 20 of the manuscript:

p. 8, last line: the ending double quotes in “X \in S” don’t look right

p. 9, item (ii): might be useful to introduce the bold R notation (used later) for reals here

p. 9, item (iii): same as above for bold C notation for complex numbers

p. 13, line 3: “measure P \times \mu on R” – did you mean “on \Omega \times R”?

p. 16, eq. 1.14: missing period

p. 19, Examples 1.1.9: Cauchy distribution is referred to here but not defined anywhere. Also, numbered examples have the heading “Examples x.y.z”. Should it be “Example x.y.z”?

p. 20, Lemma 1.1.10, part (iii): The text before this lemma says this is a consequence of eq. (1.23) but this one, in contrast to parts (i) and (ii), seems to directly follow from definition of sub-exponential tails.

p. 20, Exercise 1.1.4, line 2: “there exist C” -> “there exists C”

p. 20, Exercise 1.1.5, line 2: same as above

p. 20, sentence after eq. (1.26): “Upper bounds … are known as large deviation inequality.” Last word should be “inequalities”.

p. 20, 5th line from the bottom: missing space in “Chebyshev’s inequality(1.26)”

22 February, 2011 at 8:17 am

Ambuj TewariFurther comments from p. 21 up to the end of Section 1.1 (p. 41 of the manuscript)

p. 24, Exercise 1.1.12, 2nd line: “subsapce” -> “subspace”

p. 26, eq. (1.29): missing period

p. 26, Remark 1.1.14, last line: “we will discuss in the next set of notes” Does this change to “next chapter” etc. now that the notes are being converted into a book?

p. 27, 5th line from the bottom: “underyling” -> “underlying”

p. 33, Example 1.1.20, last sentence: linearity of condition expectation (and that it follows from linearity of unconditional expectation) is currently mentioned as part of this example. For unconditional expectation, linearity was emphasized and placed in the main text. Perhaps do the same here?

p. 36, Remark 1.1.27, 3rd line: extra semi-colon

[Thanks; these corrections will appear in the next revision of the ms.]23 February, 2011 at 6:54 am

Topics in random matrix theory « What’s new[…] finished writing the first draft of my second book coming out of the 2010 blog posts, namely “Topics in random matrix theory“, which was based primarily on my graduate course in the topic, though it also contains […]

24 February, 2011 at 5:33 am

SimonExercise 3.15: Perhaps X_1 should be relabelled X_0 to make the notation consistent.

3.1.4: Dyson Brownian motion. Seems like the definition of the Frobenius norm is missing a square root.

Slightly incongruous “OK” just above Theorem 3.1.16, though perhaps this is meant to be there :)

[Corrected for the next revision of the ms, thanks – T.]25 February, 2011 at 4:01 pm

Ambuj TewariFurther comments from p. 41 through p. 55 of the current manuscript:

section 1.2 (Stirling’s formula): The build-up to the final result (1.49) is very enlightening and I thoroughly enjoyed every bit of it! The use of techniques of increasing sophistication to derive the successively better approximations (1.43,44), (1.45), (1.46) and (1.49) is very nice. The observations recorded in footnote 7 (“one can often get a non-terrible bound for a series by using the largest term”) and Remark 1.2.2 (“near these maxima, e^\phi(x) usually behaves like a rescaled Gaussian”) show that these techniques have wider applicability.

p. 45, two times in the paragraph right after Exercise 1.1.2: “gaussian” -> “Gaussian”

p. 46, line 6: “constrains” -> “constrain”

p. 46, third line from the bottom: It is clear that by matrix C you mean the sum A+B but this is not mentioned anywhere.

p. 47, paragraph in the middle of the page: missing space in “augmented matrix(2.80)”

p. 53, section 1.3.3: It is amazing how all these named inequalities (Weyl, Lidskii, Ky-Fan) all follow from the minimax formulae. Remark 1.3.6 whets the reader’s appetite for more!

p. 53, last line: two periods after the reference “[Klyachko]”

p. 55, Exercise 1.3.7: “p-Schatten norms are indeed a norm” -> “p-Schatten norm is indeed a norm” (or is it okay as is?)

p. 55, 2 lines above (1.67): “coeffiicents” -> “coefficients”

[Thanks, these corrections will appear in the next revision of the ms. – T.]1 March, 2011 at 8:35 am

Ambuj TewariFurther comments from p. 56 through end of Chap. 1 (p. 64):

Section 1.3.4. (Eigenvalue deformation): This section is really nice. First, it shows how to derive (some) eigenvalue inequalities using the Hadamard first variation formula. Second, the mysterious “force” keeping distinct eigenvalues apart is alluded to on p. 56 and finally revealed in Exercise 1.3.13.

p. 56, line 4 from the bottom: “derivartive” -> “derivative”

p. 57, line 6 from the bottom: “For instance, If” -> “For instance, if”

p. 57, last line: “Lipschitz with norm” -> “Lipschitz with constant” (?)

p. 58, Exercise 1.3.12: “rather than Exercise 1.3.6″

Lidskii’s inequality was derived using Exercise 1.3.3 before. Did you mean exercise 1.3.3 here?

p. 58, Exercise 1.3.13: “By differentiating (1.70), (1.71) twice”

Differentiating the given equations twice will lead us into third derivatives. I guess you meant (1.68) and (1.69).

p. 59, Exercise 1.3.14, right after (1.74): “for all 1 <= i <= n". I guess it should be i < n.

[Thanks for the corrections. Was not able to salvage the last portion of your text though due to HTML formatting. – T.]2 March, 2011 at 8:19 am

Brianp. 69 I think footnote 1 should have the covariance set equal to 0.

[This will be corrected in the next revision, thanks – T.]4 March, 2011 at 10:38 am

Ambuj TewariComments from p. 65 through p. 85:

p. 69, line 1: missing space in “Chebyshev’s inequality(1.26)”

p. 69, footnote 1: As noted above, the covariance should be set to zero. Also, shouldn’t we take complex conjugates of the j-dependent terms?

p. 70, line 12 from the bottom: “ways one can assign” -> “ways one can choose” (?)

p. 71, Remark 2.1.1: “net effect of such care … unspecified constant C”

I wasn’t sure what this refers to. The are no unspecified constant in (2.7). Does this refer to the C in (2.8) below the remark?

p. 73, line 7 from the bottom: space missing in “Markov’s inequality(1.13)”

p. 74, Remark 2.1.4: “reliance on the identity e^{X+Y} = e^x e^Y”

Just a clarification: the later section on Golden-Thompson inequality (Section 3.2) will show to get around this barrier, right? Will it be good to refer to that section in this remark?

p. 76, line 7: space missing in “Markov’s inequality(1.13)”

p. 76–77: The 5 item summary of the preceding discussion is really nice. Zero-moment gives no decay (in lambda); First-moment gives linear decay; Second-moment gives quadratic decay; Higher moments give kth power decay; All moments combined or exponential moment method gives quadratic-exponential decay.

p. 80: last sentence in proof of strong law of large numbers: It could just be me but I couldn’t follow the instructions to complete the proof here. What does “lacunary nature of the n_m” mean?

p. 80, Proposition 2.1.9: “for some constants C_A, c_A depending on A,p,C,c”

There seems to be no “p” in the picture here.

p. 82, line 4 from the bottom: missing period after the math display “Y := …”

p. 85: proof of Theorem 2.1.12 (Gaussian concentration inequality for Lipschitz functions): “It is tempting to use the fundamental theorem of calculus along a line segment”

It is clear from the remaining argument that the circular arc is very clever choice since it allows one to use the Gaussian nature of X,Y to exploit independence between X_\theta and its derivative. But this does bring up a natural question: is there a less magical and more brute-force argument not relying heavily on Gaussian assumption that just does the obvious thing: use a line segment to apply fundamental theorem of calculus? I see that the next result (Talagrand’s inequality) lifts the Gaussian assumption but adds the convexity assumption on F. Any pointers to a concentration inequality (if there is one) for the “jointly Lipschitz case” that neither requires X_i’s to be Gaussian nor requires F to be convex would be great.

5 March, 2011 at 9:08 am

Terence TaoThanks for the corrections!

The lacunary nature of the n_m allows one to estimate and for any , whence the claims in the text.

I believe that the general concentration inequality fails if one does not make a convexity or Gaussian-type assumption (although there are substitute hypotheses, such as a log-Sobolev inequality, which can compensate for this). The book by Ledoux covers these topics in some detail.

8 March, 2011 at 9:48 am

Ambuj TewariThanks Prof. Tao for the explanation and the pointer to log-Sobolev inequalities! On reading further I found that Remark 2.1.18 also partly answers my question.

7 March, 2011 at 2:30 pm

Yan FyodorovDear Prof. Tao,

I wonder if you may wish to have a look through the article

on Random Matrix Theory which I am now preparing for Scholarpedia.

http://www.scholarpedia.org/article/Random_matrix_theory

There I tried to give a kind of historic overview of RMT with a brief mentioning of main research directions, accompanied by a rather extensive bibliography. Obviously my account of the field is rather strongly biased by my personal research interests & experience, as well as by my background in Theoretical Physics.

That is why I will be much obliged for bringing

to my attention omissions from the list/discussion of any important/influential mathematical papers which you may happen to notice.

Sure, any other corrections, amendments and suggestions are more than welcome – including those concerning the presentation style and/or grammar, as I am not a native English speaker.

with best regards,

Yan Fyodorov

1 February, 2012 at 9:17 pm

JohnHi Fyodorov,

I was just reading your arxiv paper on Introduction to Random Matrix Theory. It’s very well motivated. Thanks for the great exposition!

8 March, 2011 at 10:04 am

Ambuj TewariFurther comments from p. 86 through p. 93 (end of Section 2.1 “Concentration of Measure”) of the current manuscript:

p. 86, line 7 from the bottom: There is little possibility of confusion here but I’ll just point out that \Omega is being used for the unit disk here while it was used for the sample space in the foundational material.

p. 87, Lemma 2.1.14: After the assumption, “Let A be a convex set …”, the statement appears abruptly. Is some text required here?

p. 87, line 8 from the bottom: “coefficent” -> “coefficient”

p. 88, 2nd math display: missing power of 2 in the distances d_c inside the expectation

p. 90 bottom, p. 91 top: Should the x_n in the subscript below F (denoting directional derivative) be X_n?

p. 91, line 11 from the bottom: missing space in “Markov’s inequality(1.13)”

p. 93, line 4: “The summands here are pairwise independent …”

This confused me: do you mean that (X_i,X_j) and (X_i’,X_j’) are independent for (i,j) \neq (i’,j’)? Probably not, for it doesn’t seem right (they might share variables). But I am not able to figure out the intended interpretation of the above statement. This means that I am unable to follow the variance calculation for d(X,V)^2.

p. 93, line 12: missing space in “Chebyshev’s inequality(1.26)”

8 March, 2011 at 11:29 am

Terence TaoThanks for the corrections!

For page 90, it is best to use an independent variable (such as x_n) to denote partial differentiation, as opposed to the random variable X_n, to avoid confusion. For page 93, it is true that one does not have independence when variables are shared, but as long as one is off the diagonal, one still has zero correlation between the quantities , and on the diagonal one of course has full independence. So one can compute the variances for the on-diagonal and off-diagonal terms, and then combine them by the triangle inequality (losing a factor of 2 or so, but this is an acceptable loss). I’ll amend the text accordingly.

13 March, 2011 at 1:53 pm

Ambuj TewariFurther comments from p. 93 through p. 124 (“the central limit theorem”) of the current manuscript:

p. 94, line -10: “in this post” -> “in this section”

p. 101, proof of Theorem 2.2.1: Its amazing how slick the Fourier-analytic proof of CLT is. Once the machinery is in place, the actual proof is literally 10 lines long!!!

p. 102, Exercise 2.2.13, part (ii), line 5: small sigma should be capital sigma

p. 103, proof of Theorem 2.2.8, line 6: “an non-negative” -> “a non-negative”

p. 103, eq. (2.39): This is called sigma because of the ‘s’ in ‘sup’, right? It doesn’t look the standard deviation of anything.

pp. 108-109, explicit computation of moment up to k=4: It was really nice that these initial cases were explicitly worked out before jumping into the calculation for general k. This makes the later discussion (pp. 109-110) much easier to understand (and even anticipate).

p. 111. line -12: “boudned” -> “bounded”

p. 112, middle of the page: What is mentioned is a beautiful way to understand what the Lindeberg trick does: it factors/decomposes the central limit theorem into two components. First, it shows that the limiting distribution of interest doesn’t change if we replace arbitrary RVs by gaussian RVs. Second, the distribution is explicitly calculated in the case of gaussian RVs (which turns out to be gaussian itself(!) in the CLT case).

p. 115. line after eq. (2.45): “we have lost an exponent of 1/4″

The exponent is 1 in Theorem 2.2.8 and is 1/4 here. So, haven’t we lost 3/4 in the exponent?

p. 118, line 11, “to in particular to give”: there’s seems to be an extra “to” here

p. 119, line right after proof of Theorem 2.2.14: “improvement already reduces the 1/4 loss in (2.45) to 1/2″

If my comment for p. 115 made any sense, then this should also say “reduces the 3/4 loss … to 1/2″

p. 121, beginning of section 2.2.7. (“Predecessor comparison”):

“Suppose one had never of the normal distribution, but one still suspected the existence of the central limit theorem. … Could one still work out what that limit was?”

Could think of a better start for a section meant to end the discussion of the CLT!!!

p. 123, footnote 11: there seems to be an extra opening parenthesis here

p. 124, first para:

“it is profitable to study the distribution of some random object … by comparing it with its predecessor”

“it may … be helpful to approximate a discrete process with a continuous evolution”

This is one of those pithy “summary” paragraphs that are sprinkled throughout the text and make the reading all the more rewarding!!!

[Corrections implemented, thanks – T.]6 April, 2011 at 12:50 pm

Ambuj TewariFurther comments from p. 125 through p. 143 (up to the end of section 2.3.4) of the current manuscript:

p. 126, middle para: “several the tools seen here”

missing “of”

p. 126, first footnote: extra closing parenthesis at the end

p. 127, last line: missing space in “Markov’s inequality(1.14)

p. 128, second para: “refine the countable dense subset further, to a finite dense subset”

I got confused: how can a finite subset also be a dense subset of the unit sphere?

p. 130, line 6: “entropy loss”

This seem to refer to the loss we incur in the bound due to the cardinality of \Sigma. But why is it called “entropy loss”?

p. 131, last line before sec. 2.3.2: “… in this course” -> “… in this book” (?)

p. 135, line 2: “convex norm”

Is it possible for a norm to not be convex?

p. 136, sec. 2.3.4, line 6: “end of these notes” -> “end of this section” (?)

p. 139, 2nd para, 2nd line: “assumptions tells us” -> “assumptions tell us”

pp. 138-141: It is really helpful to see the cases k=4,6 explicitly worked out before handling the general k case. Also seeing the upper bound improve from n to n^{3/4} to n^{2/3} for k=2,4,6 raises hopes that higher k’s might eventually lead to n^{1/2} upper bound. Indeed, we see at bottom of p. 142 that k as large as log(n) is enough to get n^{1/2} (up to log factors).

p. 141, last para: The equivalence symbol is used three times but the second time, it is 3 horizontal bars (probably \equiv) while the other two times it’s a “~”. Is this is a typo or am I missing something?

p. 142, line 10 from the bottom: missing space in “Markov’s inequality(1.13)”

p. 143, 2nd line below prop. 2.3.13: was puzzled by what you mean by “half a logarithm”. Does it have something to do with the fact that log(sqrt{n}) = 1/2 log(n)?

p. 143, 3rd line below prop. 2.3.13: “…later in these notes” -> “… later in this chapter” (?)

[Thanks for the corrections! Entropy refers to metric entropy – the (logarithm of) the number of possible states in a configuration. “half of a logarithm” should just be “logarithm”. Convexity is used for emphasis (one occasionally talks about non-convex “norms”, such as the L^p norms for p<1, for instance). – T.]11 April, 2011 at 10:02 am

Florent Benaych-Georgesp. 69, footnote 1 : “equals zero” is missing.

[Thanks! I’ve just updated the draft to address all the collected typos etc. -T.]15 April, 2011 at 3:29 pm

Topics in Random Matrix Theory | Ebooks Forge[…] Download or Read Book here Might also interested…The Astrobiology Primer: An Outline of General KnowledgeA Guide to Claims-Based Identity and Access ControlMany-Minds RelativityThe Food of the Gods and How it Came to EarthEconomic aspects and business models of Free SoftwareComing up for AirOperator Algebras and Quantum Statistical MechanicsEvidence, Proof and Justice: Legal Philosophy and the Provable in English CourtsLet’s Go FishingThe Book of Wonder You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site. […]

19 April, 2011 at 10:37 am

Ambuj TewariThanks Prof. Tao for the clarifications regrading my previous comments. I also see that you have uploaded an updated version (on Apr 11, 2011). Some comments from p. 143 through p. 159 (end of section 2.3) of the current (Apr 11, 2011 version) manuscript:

p. 144, Example 2.3.14: “non-crossing cycle of length k” -> “non-crossing cycle of length 6″

p. 144, Exercise 2.3.10: “draw a line segment between … whenever i_a = i_b”

For the cycle (a,b,a,c,a,d) in Example 2.3.14, this will give us line segments joining 1 to 3, 3 to 5 and 1 to 5 since i_1 = i_3 = i_5 = a. This is k/2 not k/2-1. I guess we should not connect 1 to 5. Perhaps you meant “i_a = i_b and i_c \neq i_a for a < c

“formula derived in Section 2.2″ (?)p. 149, Proof of Theorem 2.3.21, line 10: “net contribution of the remaining cycles is O(… n^{k/2})”

It seems that n^{k/2} should be n^{k/2+0.98}.

p. 159, extra right parenthesis at the end of footnote 25.[Corrected, thanks. Unfortunately, HTML processing deleted a portion of your comment; you may need to repost that portion, using < and > instead of < and > to avoid the HTML parser. -T]19 April, 2011 at 10:49 am

Ambuj TewariSorry, something caused comments 2 and 3 to merge. My comment 3 was:

p. 147, Remark 2.3.18: “formula derived in Notes 1″ -> “formula derived in Section 2.2″ (?)

[Corrected, thanks – T.]19 April, 2011 at 8:44 pm

msdbetDear Prof. Tao

I am studying stochastic control as an electrical engineer and I wish to explore more deeply the connections that seem to exist between stochastic stability and martingales. I do understand the use of Lyapunov functions in determining stability of deterministic dynamical systems; but I wish to understand more clearly the stochastic analogs of the definition of stability and the Lyapunov functions. I can intuitively see that classical Newtonian potential theory and martingales are very similar but I wish to read more precise formulations of these concepts. I’d be grateful if you could give pointers to your work or any other existing literature on stochastic stability

30 April, 2011 at 2:43 pm

Ambuj TewariSome comments from p. 159 (start of sec. 2.4 “The semicircular law”) through p. 180 (end of sec. 2.4.3) of the current version (Apr 11, 2011) of the manuscript:

pp. 162-163

At three places here (p. 162, last para, section 2.4.1 first para and Exercise 2.4.2), the semicircular law is referred to as the “circular law”. Not sure if that was intentional.

p. 162, last line: “the the” -> “the”

p. 164, line 8 from the bottom: Missing space in “Frobenius nom(2.64)”

p. 167, just before eq. (2,87): “recall from Theorem 2.3.16 of that” -> “recall from Theorem 2.3.16 that”

p. 169, line 7

You refer to the Harish-Chandra integration formula in connection with the analogue of the Fourier method. You also say that it will not be discussed in this book. Is there an accessible reference for the reader of this book that can help him/her understand this approach based on non-commutative Fourier analysis?

p. 169, line 13 from the bottom: missing sqrt over n in the subscript of s_\mu

p. 169, last math display:

Two things seem missing here. I think there should a 1/n multiplying the first term -1/z. Second, all M_n’s should be M_n/sqrt{n}, right?

p. 173, line 9: “constaants” -> “constants”

p. 174, line 2: “concetration” -> “concentration”

p. 174, line just above 2nd math display: “from the linearity of trace” -> “from the linearity of expectation” (?)

p. 174, last line: Pointing out a minor notational inconsistency. Here, in “(X’)*”, prime denotes transpose and * denotes complex conjugate. However, later on pp. 175-176, the notation used is “X* R X” where * denotes both transpose and complex conjugate.

p. 178, eq. (2.102) and its discussion: “… this equation came by playing off two ways in which the spectral properties of a matrix M_n interacted with that of its minor M_{n-1}; firstly via Cauchy interlacing inequality, and secondly via the Schur complement formula.”

The basic idea is “predecessor comparison” but the way this is accomplished here is quite remarkable! Cauchy interlacing yields one key ingredient viz. eq. (2.96). But then, after the use of Schur complement formula, it wasn’t clear where things were going until you pointed out that in X* R X, R is independent of X and so we can bring in Talagrand’s inequality. Very nice!

1 May, 2011 at 8:48 am

Terence TaoThanks for the corrections!

As far as I know there is no text that treats the RMT and Harish-Chandra connections in general, but here is a typical paper that uses this connection: http://arxiv.org/abs/math/0306386

In the last display of p. 169, there is no 1/n factor because tr(I) = n. But there does need to be additional factors of 1/sqrt(n) in the other terms as you noted.

In p. 174, X’ was intended to just be an arbitrary vector unrelated to X; I’ve renamed it Y to reduce confusion.

11 May, 2011 at 10:16 am

Ambuj TewariSince section 2.4.4 relies on 3.1 (Brownian motion and Dyson Brownian motion), I skimmed through that as well. Just reporting a few typos I found (this for the April 11 version of the manuscript):

Section 3.1 (pp. 280–299):

p. 283, right after last math display

As in previous definition that was followed by “for t=0,1,2,…”, this should probably say “for t in N/2″

p. 291, line 3 from the bottom: “in this course” -> “in this book/text”

p. 292, line 6: “indentification” -> “identification”

p. 292, line 11 from the bottom: not a real typo but just confirming that you are using “Weyl chamber” and “Weyl cone” interchangeably, right?

Now, back to Section 2.4.4 (pp. 180–184)

p. 183: Unmatched left parenthesis in footnote 34

[Corrected for the next revision, thanks – T.]12 July, 2011 at 11:26 am

Ambuj TewariSome comments for pp. 184–217 (Section 2.5) of the current version (Apr 11, 2011) of the manuscript:

p. 193: 2nd line after eq. 2.114

“without difficulty as mathcal{X} is a *-algebra”

There is no mathcal{X} in the context. Did you mean mathcal{A}?

p. 194: line 8 from the bottom

“define an positive” -> “define a positive”

p. 198: line 2 from the bottom

missing space in “theoremto”

p. 199: line 10 from the bottom

“we can start define” -> “we can start to define”

p. 203: line 6 from the bottom

“we use the following looser definition”

This probably remains from an earlier version when Definition 2.5.12 was below this sentence. Now the definition appears before.

p. 210: Exercise 2.5.18 (Hint)

“construction here is in an abstraction” -> “construction here is an abstraction”

p. 211: Proof of Proposition 2.5.21

I know that Exercise 2.5.19 asks the reader to expand the sketch into a full proof. Still, I would like comment that it would have been very useful to see a full proof as part of the text (given that Prop. 2.5.21 connects freeness to the main theme of this book: random matrices).

p. 212: line 4 from the bottom

“one can that” -> “one can assume that” (?)

p. 214: R-transform

From the 3rd unnumbered equation on p. 214, it feels like one would define a transform by z(s) + 1/s. However, the R-transform is defined with s replaced by -s. Is there a particular reason for defining it this way?

p. 216: Remark 2.5.24: reference to [Sp]

In the bibliography (p. 327) [Sp] refers to a URL in which a tilde before “speicher” seems to be missing. It could be my pdf viewer’s problem or a problem in the pdf itself. If it’s the latter, the LaTeX package ‘url’ provides a command to typeset URLs containing tildes, etc.

p. 216: last line

“by the Bai-Yin theorem, Notes 3″ -> “by the Bai-Yin theorem, Section 2.3″ (?)

p. 217: line 7

missing right parenthesis in “(or just the additivity property of R-transforms”

[Thanks for the corrections! These will appear in the next revision of the paper. The reason for the R-transform sign conventions is that the literature is based on the moment power series instead of the Stieltjes transform; the two essentially differ by a minus sign, see (2.119). -T]15 July, 2011 at 5:04 pm

Srivatsan NarayananTerry, first of all, thank you for your wonderful book. This is a list of typos I (along with Sivaraman Balakrishnan) found.

p. 35: “second integration” should be “second disintegration”.

p. 39: Mismatched paranthesis in “(to the uniform”.

p. 40: “vaariable” should be “variable”

p. 48: This is not a typo, but I found the notation Rev*Tv slightly confusing. Perhaps adding a little space after Re might help?

p. 73, bottom: Do you mean concentration in the range $\mu + O(\sigma)$?

p. 82, Theorem 2.1.10: “fluctuates by most $c_i$” should be “fluctuates by at most $c_i$”.

p. 104: In the equation preceding “Putting all these together”, $\mathbf{P}$ should be $\mathbf{E}$.

p. 111: “The basic idea is follows” should be “The basic idea is as follows”.

p. 118, first para, last line: $\phi$ should be $\varphi$. Same in p. 123 in “view $\phi$ as depending”.

p. 122: In the heuristic equation (3rd equation after Equation 2.56), missing $dx$ in the left hand side.

p. 127, lines 4-5: Mismatched paranthesis in “(e.g. Exercise 2.1.4, Exercise 2.1.5, or Theorem 2.1.13”.

p. 127, proof of Lemma 2.3.1, line 2: “if we let” should be “If we let”.

p. 129, Remark 2.3.3: “an maximal epsilon-net” should be “a maximal epsilon-net”.

p. 129, Remark 2.3.3: “$\lambda/(1-\epsilon)$” should be “$\lambda (1-\epsilon)$”.

p. 129, Lemma 2.3.4: “a epsilon-net of” should be “an epsilon-net of”.

15 July, 2011 at 5:47 pm

Srivatsan NarayananMore corrections/clarifications.

p. 28, Remark 1.1.16, second line: “of the these measures” should be “of these measures”.

p. 129, Exercise 2.3.1: Adapting the volume argument, I managed to prove a lower bound of only $(c/\epsilon)^{n-1}$, rather than $(c/\epsilon)^n$. Can you double-check the question?

p. 129, Corollary 2.3.5: In “There exists absolute constants”, “exists” should be “exist”. Second occurrence in p. 130, Corollary 2.3.6.

p. 135, Proposition 2.3.10: In the inequality, the median operator $M$ should be in bold face.

p. 142: “Summign” should be “Summing”.

p. 143, second para from bottom: “each class of cycle” should be “each class of cycles”.

p. 144: In Exercise 2.3.10, I think that $a$ should be connected by a line segment to $b$ only if $i_a = i_b$ *and* $i_c \neq i_a$ for $a < c < b$. For instance, in the non-crossing partition consisting of just one part, just the first condition would give an edge between every pair of points, which is clearly too large. Is this correct?

p. 148, second para from bottom: In "$\epsilon$ to grow slowly in $n$", did you mean that $\epsilon$ goes to zero slowly in $n$?

p. 150, the paragraph following Equation 2.73: "apeparance" should be "appearance".

p. 159, last para of Section 2.5: "See this [So1999] for" should be "See [So1999] for".

p. 159, last para of Section 2.5: "There has also been" should be "There have also been".

p. 159, Footnote 23: Spurious closing paranthesis.

p. 161, Remark 2.4.1, just before the second equation: Missing "\emph{in}" in "converges the vague topology to $\mu$".

p. 161, Remark 2.4.1: "for all $\phi$" should be "for all $\varphi$".

p. 161, Exercise 2.4.1: In multiple places, the quantity $\mu_{1/\sqrt{n}}$ is missing the matrix $M_n$.

p. 162, last line: Repeated "the" in "we will study the the".

p. 162, last para, second line: By "circular law", did you actually mean "semicircular law"?

[Thanks for the corrections! These have been incorporated into the most recent version of the ms. -T]15 July, 2011 at 10:55 pm

Srivatsan NarayananAnother installment.

p. 137, proof of Lemma 2.4.3: “$\eps^2 (j-i) n$” should be “$\eps^2 (i-j) n$”.

p. 140, Exercise 2.4.7(i): Unnecessary caps in “…integer, \emph{Use} Talagrand’s”.

p. 140, Exercise 2.4.7(i): The condition that $k$ is even is repeated.

p. 143, Equation 2.92: $(1+o(1))/z$ should be $(-1+o(1))/z$. Same correction in p. 150, the equation before Equation (2.104).

p. 145, Hint to Exercise 2.4.10(iii): Should $s_{\mu}(a+ib)$ be $Im(s_{\mu}(a+ib))$?

p. 155: Redundant use of “theory” in “theory of non-commutative probability theory”.

p. 166, Proposition 2.5.7: This is just a clarification. Does the Proposition hold with the LHS replaced by $\rho(P(X))$? Did you intend this?

p. 166, Equation 2.122: The denominator of the Stieltjes transform should be $z^(k+1)$.

p. 166: In the definition of $\Omega$ following Equation (2.122), should $z$ be $|z|$?

p. 168, Exercise 2.5.6: “let $\mu_X$ as the” should be “let $\mu_X$ be the”.

p. 173, Exercise 2.5.13: “when $k$ is odd” should be “when $k$ is even”.

p. 183: Missing fullstop after “throughout the computation” in the main text, before Footnote 47.

p. 184, first para: “matirx” should be “matrix”.

p. 184, first para: “if R is diagonal” should be “if R is a diagonal”.

p. 185, third equation after (2.130): The Vandermonde determinant is missing a vertical line on the left.

p. 185, fourth equation after (2.130): The o(1) on the LHS should be 0.

p. 188, para 2 from bottom, first line: “eigenvector of $T_0$” should be “eigenvectors of $T_0$”.

p. 188, last para: “theorem make the” should be “theorem to make the”.

p. 191, Footnote 50: The semicircular law is wrong (square should be square root).

p. 194, second line: “matrix matrix” should be “matrix”.

p. 197, second para, first line: “For now, we we” should be “For now, we”.

p. 208: “Marchenko” should be “Marcenko”.

[Thanks, this will be incorporated into the next revision of the ms. -T.]16 August, 2011 at 2:51 pm

Ambuj TewariSome comments for Sections 2.6 and 2.7 of the current manuscript (July 15, 2011 version):

(I have tried to avoid repeating comments already made by readers above.)

p. 183, lines 5 and 7 from the bottom:

“V” (for the space of n x n Hermitian matrices) should be “V_n”

p. 188, line 8

“On the one hand” is missing a matching “On the other hand”.

p. 188, line 13

Missing period just before “As the eigenvalues”.

p. 189, line 8 from the bottom

“S” used twice to mean different things in “S ranges in the space S”

p. 191, line 6 from the bottom

Extra “be” in “lambda_i should be have”

p. 192, 4th and 9th (unnumbered) math displays

I got confused: what does “p.v.” stand for here?

p. 196, 4 lines after (2.149)

Extra “the” in “orthogonal to the”

p. 206, 3rd line of subsection 2.6.5

Missing argument “z” in “P_0″ while it is there in “P_{n-1}(z)”.

p. 209, Theorem 2.7.1 statement, 2nd line

extra “1” in “1p”

p. 213, line 1

“we prove the theorem” should be “we prove the proposition”

p. 215, Theorem 2.7.5 statement, line 4

lambda should appear in the subscript of small-oh (instead of epsilon)

p. 216, last line

“next set of lectures” should be “next section”

p. 218, discussion of “universality” in the 2nd para after Theorem 2.7.8

As you mention in the nice high-level discussion, Theorem 2.7.8 is proved by showing that the limiting distribution is the same whether we have the Bernoulli or Gaussian ensemble.

Is it known what is the “universality class” here? In other words, what is the set of distributions allowed for the entries of M such that we still get the same limit that Edelman worked out for the Gaussian ensemble?

17 August, 2011 at 10:15 am

Terence TaoThanks for the corrections! Regarding the universality of the least singular value, this is currently known for all iid matrix ensembles of mean zero and variance 1 that obey a finite moment condition (the C_0^th moment needs to be bounded for some sufficiently large C_0; the optimal value of C_0 is not currently known).

24 August, 2011 at 10:08 am

Ambuj TewariThanks for your reply to my qeustion regarding the universality class for Theorem 2.7.8. Here are some comments for Section 2.8 (pp. 223–234) of the current (Aug 23, 2011) version of the manuscript.

p. 224, line 8 from the bottom

U_0 is referred to as U three times

p. 225, Section 2.8.2, line 5

unmatched left parenthesis in “(at least, when the measure … sufficient decay at infinity.”

p. 228, line 3

“g_n” should be “f_n”

pp. 229–231

The discussion starting from Girko’s strategy, followed by Bai’s approach and ultimately ending in a description of the techniques used by Tao-Vu-Krishnapur is really nice. Unlike other sections of the book, some of the details are omitted. But the discussion prepares the reader well and motivates him/her to read the original papers!

p. 231, line 1

“lets ignore” should be “lets us/one ignore”

[Added to a pre-errata list, thanks. -T.]13 September, 2011 at 3:33 am

Florent Benaych-GeorgesAt the last equation page 258 (line -9), shouldn’t “n” be replaced by “d” (since “d” is the size of the matrices)?

[Corrected, thanks – T.]14 February, 2012 at 8:28 pm

Jesus A DominguezPage 21: Exercice 1.1.8 change “$\geq$” by “$\leq$” in the definition of convex function.

[Added, thanks – T.]28 February, 2012 at 9:39 pm

salazaarthere are a few typos of “Weilandt” which should be “Wielandt”, e.g., Exercise 1.3.6, p. 46. It is probably better do to do a global search

[Correction added, thanks – T.]28 February, 2012 at 11:54 pm

salazaarp. 46, before (1.66), the sup should be over $1 \leq i \leq n$, not $p$.

[Correction added, thanks – T.]27 March, 2012 at 3:42 am

dblairDoes the proof of WLLN actually use the uniform integrability of sequences of i.i.d. integrable random variables? The proof doesn’t mention it but it seems to me necessary if you’re going to pick N to bound all tails for large n.

27 March, 2012 at 8:32 am

Terence TaoIn my book I only consider the WLLN for identically distributed, absolutely integrable variables, which are automatically uniformly integrable, and for which selection of the truncation parameter N can be achieved directly from the monotone convergence theorem. The WLLN can be generalised to averages of uniformly integrable sequences by a similar argument to the one given in my book, though.

28 November, 2012 at 9:46 am

Nick CookMislabeled reference: end of the Lindberg individual swapping section of CLT, refers to Appendix D of TaVuKr10 for multidimensional Berry-Esseen. Should be TaVu2009b, or TaVu2010 for the uncorrelated case (still Appendix D in both). (On the blog it’s mislabeled but does link to TaVu2009b.)

[Corrected, thanks – T.]3 December, 2012 at 11:48 am

Gautam KamathHey Professor Tao,

Thanks a lot for these notes – I’m relatively new to concentration of measure, and this has helped in my research.

I did have one question/possible correction – in the proof of Proposition 2.1.9 on page 80, you perform a dyadic decomposition of the variables X_i. However, based on the indicator events you define, it seems that only the positive X_i events are decomposed, and all negative X_i events are grouped into X_i,0.

It seems like the obvious ways to decompose [-infinity,0] would be by either changing the existing random variables to X_i,m := X_i I(2^(m-1) < |X_i| < 2^m), or to define similar random variables as X_i I(-2^m < X_i < 2^(m-1)). Did you intend one of these decompositions?

Thanks,

G

[The former decomposition was intended; thanks for the correction! – T.]3 December, 2012 at 3:10 pm

Gautam KamathSorry, I messed up the page reference – I was going by the February 2011 version linked from http://terrytao.wordpress.com/2011/02/20/topics-in-random-matrix-theory/. In the August 2011 version linked from this post, Proposition 2.1.9 appears on page 68.

9 June, 2013 at 1:53 pm

salazarExercise 1.1.9, p. 16: it was claimed that Poisson distribution is sub-Gaussian. But it appears that its MGF grows double exponentially which is way faster than exp(quadratic). Also the tail probability scales 1/factorial, slower than gaussian tail.

[Correction added, thanks – T.]11 July, 2013 at 11:54 am

Probability and Statistics Books Online | Download free ebook[…] Topics in Random Matrix Theory by Terence Tao, 2011, 340 pages, 1.5MB, PDF […]

16 September, 2013 at 6:53 am

Free Mathematics eBooks Online : Probability and Statistics | Top Free Books[…] Topics in Random Matrix Theory by Terence Tao, 2011, 340 pages, 1.5MB, PDF […]

18 December, 2013 at 8:02 pm

SugarP40. the last line, “Re(v*Tv)-\lambda v*v” should be “Re(v*Tv)-\lambda (v*v-1) ”

[Correction added, thanks – T.]21 December, 2013 at 7:37 pm

SugarP48 (version: Aug 23,2011 pdf): from eqn(1.72): (\dot u_i^*) (u_i) + (u_i^*)(\dot u_i) = 0, one can not conclude (\dot u_i^*) (u_i) = 0 but only the “Real Part” of (\dot u_i^*) (u_i) is zero.

However, even without this result, (1.73) can follow directly from (1.71) by noticing that .

[Correction added, thanks -T.]13 October, 2014 at 12:48 pm

Doron PuderOn Page 139, in the list of four possible cycle types, type (ii) should be

“i_1=i_3, i_2=i_4 and i_1 \ne i_2″.

[Correction added, thanks – T.]15 October, 2014 at 11:03 am

Doron PuderAlso, on page 144, the current definition of ‘Dyck words’ does not exclude strings like ‘((‘.

21 October, 2014 at 10:57 am

F r i e d e r S i m o nShouldn’t the first formula on pp. 70 be ? If that is indeed the case, then how does one deal with the case that one appears only once ? (Since then mean zero doesn’t guarantee anymore that is zero.)

25 October, 2014 at 6:57 am

Terence Taois assumed to be even and the are assumed to be real, so that .

27 October, 2014 at 6:47 am

F r i e d e r S i m o nOh, I overlooked that $k$ was even, thank you.

Is there a reason why in the proof of Hoeffding’s lemma from pp. 61 you are emphasizing $X=O(1)$ *after* assuming that $b-a=1$ ? Since $X=O(1)$ should already follow from the hypothesis of $X$ being bounded (in which case the bounding constant is hidden inside the $O$). I feel like you wanted to convey some intuition here that I’m missing.

27 October, 2014 at 7:43 am

F r i e d e r S i m o n(A minor nitpick: I think in the proof of Theorem 2.1.3 on pp. 62 , when computing $\mathbf{E}\exp(tX)$, you’re referring to (2.10) instead of (2.9) – although since the former follows from latter one could call this a moot point.)

[Correction added, thanks -T.]27 October, 2014 at 9:11 am

Terence TaoIn some applications it is important that the implied constants in Hoeffding’s inequality do not depend on quantities such as a or b.

29 October, 2014 at 9:23 am

F r i e d e r S i m o nI’d be very grateful if you could indicate a short example or reference to such an application, since seeing hands-on what you mean would enable a better appreciation of the importance of the independence of the implied constants.

(I assume the case is similar for other inequalities, like Chernoffs bound, where the implied constants also should be independent of e.g. $\sigma$.)

3 November, 2014 at 9:30 am

F r i e d e r S i m o nIn the proof of Chernoff’s inequality you use a parameter , which is to be optimised later. I’m wondering about this restriction: Why do you allow ? In this case, since , you could only get the trivial bound, so one may as well omit .

Is there a specific reason to restrict to be at most ? The equality would be valid for all and I everything else seems to work too (the final bound on the probability would actually be better, I think, since one wouldn’t need to distinguish by cases if the that minimizes the right-hand side of the last inequality lies in or not). Maybe it is of advantage later in the text to have a bound of the form (2.11) ?

3 November, 2014 at 10:07 am

Terence TaoThe condition is needed so that we may ignore the term in (2.9). In this particular case, t=0 is indeed never the optimal choice, and could be removed if desired. (But it is traditional to allow parameters to range in compact spaces such as [0,1] if there is no harm in doing so, if only because this automatically ensures that any continuous optimisation problem attains its extremum, which is then automatically finite.)