You are currently browsing the tag archive for the ‘sunflowers’ tag.

I’ve just uploaded to the arXiv my paper The Ionescu-Wainger multiplier theorem and the adeles“. This paper revisits a useful multiplier theorem of Ionescu and Wainger on “major arc” Fourier multiplier operators on the integers ${{\bf Z}}$ (or lattices ${{\bf Z}^d}$), and strengthens the bounds while also interpreting it from the viewpoint of the adelic integers ${{\bf A}_{\bf Z}}$ (which were also used in my recent paper with Krause and Mirek).

For simplicity let us just work in one dimension. Any smooth function ${m: {\bf R}/{\bf Z} \rightarrow {\bf C}}$ then defines a discrete Fourier multiplier operator ${T_m: \ell^p({\bf Z}) \rightarrow \ell^p({\bf Z})}$ for any ${1 \leq p \leq \infty}$ by the formula

$\displaystyle {\mathcal F}_{\bf Z} T_m f(\xi) =: m(\xi) {\mathcal F}_{\bf Z} f(\xi)$

where ${{\mathcal F}_{\bf Z} f(\xi) := \sum_{n \in {\bf Z}} f(n) e(n \xi)}$ is the Fourier transform on ${{\bf Z}}$; similarly, any test function ${m: {\bf R} \rightarrow {\bf C}}$ defines a continuous Fourier multiplier operator ${T_m: L^p({\bf R}) \rightarrow L^p({\bf R})}$ by the formula

$\displaystyle {\mathcal F}_{\bf R} T_m f(\xi) := m(\xi) {\mathcal F}_{\bf R} f(\xi)$

where ${{\mathcal F}_{\bf R} f(\xi) := \int_{\bf R} f(x) e(x \xi)\ dx}$. In both cases we refer to ${m}$ as the symbol of the multiplier operator ${T_m}$.

We will be interested in discrete Fourier multiplier operators whose symbols are supported on a finite union of arcs. One way to construct such operators is by “folding” continuous Fourier multiplier operators into various target frequencies. To make this folding operation precise, given any continuous Fourier multiplier operator ${T_m: L^p({\bf R}) \rightarrow L^p({\bf R})}$, and any frequency ${\alpha \in {\bf R}/{\bf Z}}$, we define the discrete Fourier multiplier operator ${T_{m;\alpha}: \ell^p({\bf Z}) \rightarrow \ell^p({\bf Z})}$ for any frequency shift ${\alpha \in {\bf R}/{\bf Z}}$ by the formula

$\displaystyle {\mathcal F}_{\bf Z} T_{m,\alpha} f(\xi) := \sum_{\theta \in {\bf R}: \xi = \alpha + \theta} m(\theta) {\mathcal F}_{\bf Z} f(\xi)$

or equivalently

$\displaystyle T_{m;\alpha} f(n) = \int_{\bf R} m(\theta) {\mathcal F}_{\bf Z} f(\alpha+\theta) e( n(\alpha+\theta) )\ d\theta.$

More generally, given any finite set ${\Sigma \subset {\bf R}/{\bf Z}}$, we can form a multifrequency projection operator ${T_{m;\Sigma}}$ on ${\ell^p({\bf Z})}$ by the formula

$\displaystyle T_{m;\Sigma} := \sum_{\alpha \in \Sigma} T_{m;\alpha}$

thus

$\displaystyle T_{m;\alpha} f(n) = \sum_{\alpha \in \Sigma} \int_{\bf R} m(\theta) {\mathcal F}_{\bf Z} f(\alpha+\theta) e( n(\alpha+\theta) )\ d\theta.$

This construction gives discrete Fourier multiplier operators whose symbol can be localised to a finite union of arcs. For instance, if ${m: {\bf R} \rightarrow {\bf C}}$ is supported on ${[-\varepsilon,\varepsilon]}$, then ${T_{m;\Sigma}}$ is a Fourier multiplier whose symbol is supported on the set ${\bigcup_{\alpha \in \Sigma} \alpha + [-\varepsilon,\varepsilon]}$.

There are a body of results relating the ${\ell^p({\bf Z})}$ theory of discrete Fourier multiplier operators such as ${T_{m;\alpha}}$ or ${T_{m;\Sigma}}$ with the ${L^p({\bf R})}$ theory of their continuous counterparts. For instance we have the basic result of Magyar, Stein, and Wainger:

Proposition 1 (Magyar-Stein-Wainger sampling principle) Let ${1 \leq p \leq \infty}$ and ${\alpha \in {\bf R}/{\bf Z}}$.
• (i) If ${m: {\bf R} \rightarrow {\bf C}}$ is a smooth function supported in ${[-1/2,1/2]}$, then ${\|T_{m;\alpha}\|_{B(\ell^p({\bf Z}))} \lesssim \|T_m\|_{B(L^p({\bf R}))}}$, where ${B(V)}$ denotes the operator norm of an operator ${T: V \rightarrow V}$.
• (ii) More generally, if ${m: {\bf R} \rightarrow {\bf C}}$ is a smooth function supported in ${[-1/2Q,1/2Q]}$ for some natural number ${Q}$, then ${\|T_{m;\alpha + \frac{1}{Q}{\bf Z}/{\bf Z}}\|_{B(\ell^p({\bf Z}))} \lesssim \|T_m\|_{B(L^p({\bf R}))}}$.

When ${p=2}$ the implied constant in these bounds can be set to equal ${1}$. In the paper of Magyar, Stein, and Wainger it was posed as an open problem as to whether this is the case for other ${p}$; in an appendix to this paper I show that the answer is negative if ${p}$ is sufficiently close to ${1}$ or ${\infty}$, but I do not know the full answer to this question.

This proposition allows one to get a good multiplier theory for symbols supported near cyclic groups ${\frac{1}{Q}{\bf Z}/{\bf Z}}$; for instance it shows that a discrete Fourier multiplier with symbol ${\sum_{\alpha \in \frac{1}{Q}{\bf Z}/{\bf Z}} \phi(Q(\xi-\alpha))}$ for a fixed test function ${\phi}$ is bounded on ${\ell^p({\bf Z})}$, uniformly in ${p}$ and ${Q}$. For many applications in discrete harmonic analysis, one would similarly like a good multiplier theory for symbols supported in “major arc” sets such as

$\displaystyle \bigcup_{q=1}^N \bigcup_{\alpha \in \frac{1}{q}{\bf Z}/{\bf Z}} \alpha + [-\varepsilon,\varepsilon] \ \ \ \ \ (1)$

and in particular to get a good Littlewood-Paley theory adapted to major arcs. (This is particularly the case when trying to control “true complexity zero” expressions for which the minor arc contributions can be shown to be negligible; my recent paper with Krause and Mirek is focused on expressions of this type.) At present we do not have a good multiplier theory that is directly adapted to the classical major arc set (1) (though I do not know of rigorous negative results that show that such a theory is not possible); however, Ionescu and Wainger were able to obtain a useful substitute theory in which (1) was replaced by a somewhat larger set that had better multiplier behaviour. Starting with a finite collection ${S}$ of pairwise coprime natural numbers, and a natural number ${k}$, one can form the major arc type set

$\displaystyle \bigcup_{\alpha \in \Sigma_{\leq k}} \alpha + [-\varepsilon,\varepsilon] \ \ \ \ \ (2)$

where ${\Sigma_{\leq k} \subset {\bf R}/{\bf Z}}$ consists of all rational points in the unit circle of the form ${\frac{a}{Q} \mod 1}$ where ${Q}$ is the product of at most ${k}$ elements from ${S}$ and ${a}$ is an integer. For suitable choices of ${S}$ and ${k}$ not too large, one can make this set (2) contain the set (1) while still having a somewhat controlled size (very roughly speaking, one chooses ${S}$ to consist of (small powers of) large primes between ${N^\rho}$ and ${N}$ for some small constant ${\rho>0}$, together with something like the product of all the primes up to ${N^\rho}$ (raised to suitable powers)).

In the regime where ${k}$ is fixed and ${\varepsilon}$ is small, there is a good theory:

Theorem 2 (Ionescu-Wainger theorem, rough version) If ${p}$ is an even integer or the dual of an even integer, and ${m: {\bf R} \rightarrow {\bf C}}$ is supported on ${[-\varepsilon,\varepsilon]}$ for a sufficiently small ${\varepsilon > 0}$, then

$\displaystyle \|T_{m;\Sigma_{\leq k}}\|_{B(\ell^p({\bf Z}))} \lesssim_{p, k} (\log(1+|S|))^{O_k(1)} \|T_m\|_{B(L^p({\bf R}))}.$

There is a more explicit description of how small ${\varepsilon}$ needs to be for this theorem to work (roughly speaking, it is not much more than what is needed for all the arcs ${\alpha + [-\varepsilon,\varepsilon]}$ in (2) to be disjoint), but we will not give it here. The logarithmic loss of ${(\log(1+|S|))^{O_k(1)}}$ was reduced to ${\log(1+|S|)}$ by Mirek. In this paper we refine the bound further to

$\displaystyle \|T_{m;\Sigma_{\leq k}}\|_{B(\ell^p({\bf Z}))} \leq O(r \log(2+kr))^k \|T_m\|_{B(L^p({\bf R}))}. \ \ \ \ \ (3)$

when ${p = 2r}$ or ${p = (2r)'}$ for some integer ${r}$. In particular there is no longer any logarithmic loss in the cardinality of the set ${S}$.

The proof of (3) follows a similar strategy as to previous proofs of Ionescu-Wainger type. By duality we may assume ${p=2r}$. We use the following standard sequence of steps:

• (i) (Denominator orthogonality) First one splits ${T_{m;\Sigma_{\leq k}} f}$ into various pieces depending on the denominator ${Q}$ appearing in the element of ${\Sigma_{\leq k}}$, and exploits “superorthogonality” in ${Q}$ to estimate the ${\ell^p}$ norm by the ${\ell^p}$ norm of an appropriate square function.
• (ii) (Nonconcentration) One expands out the ${p^{th}}$ power of the square function and estimates it by a “nonconcentrated” version in which various factors that arise in the expansion are “disjoint”.
• (iii) (Numerator orthogonality) We now decompose based on the numerators ${a}$ appearing in the relevant elements of ${\Sigma_{\leq k}}$, and exploit some residual orthogonality in this parameter to reduce to estimating a square-function type expression involving sums over various cosets ${\alpha + \frac{1}{Q}{\bf Z}/{\bf Z}}$.
• (iv) (Marcinkiewicz-Zygmund) One uses the Marcinkiewicz-Zygmund theorem relating scalar and vector valued operator norms to eliminate the role of the multiplier ${m}$.
• (v) (Rubio de Francia) Use a reverse square function estimate of Rubio de Francia type to conclude.

The main innovations are that of using the probabilistic decoupling method to remove some logarithmic losses in (i), and recent progress on the Erdos-Rado sunflower conjecture (as discussed in this recent post) to improve the bounds in (ii). For (i), the key point is that one can express a sum such as

$\displaystyle \sum_{A \in \binom{S}{k}} f_A,$

where ${\binom{S}{k}}$ is the set of ${k}$-element subsets of an index set ${S}$, and ${f_A}$ are various complex numbers, as an average

$\displaystyle \sum_{A \in \binom{S}{k}} f_A = \frac{k^k}{k!} {\bf E} \sum_{s_1 \in {\bf S}_1,\dots,s_k \in {\bf S}_k} f_{\{s_1,\dots,s_k\}}$

where ${S = {\bf S}_1 \cup \dots \cup {\bf S}_k}$ is a random partition of ${S}$ into ${k}$ subclasses (chosen uniformly over all such partitions), basically because every ${k}$-element subset ${A}$ of ${S}$ has a probability exactly ${\frac{k!}{k^k}}$ of being completely shattered by such a random partition. This “decouples” the index set ${\binom{S}{k}}$ into a Cartesian product ${{\bf S}_1 \times \dots \times {\bf S}_k}$ which is more convenient for application of the superorthogonality theory. For (ii), the point is to efficiently obtain estimates of the form

$\displaystyle (\sum_{A \in \binom{S}{k}} F_A)^r \lesssim_{k,r} \sum_{A_1,\dots,A_r \in \binom{S}{k} \hbox{ sunflower}} F_{A_1} \dots F_{A_r}$

where ${F_A}$ are various non-negative quantities, and a sunflower is a collection of sets ${A_1,\dots,A_r}$ that consist of a common “core” ${A_0}$ and disjoint “petals” ${A_1 \backslash A_0,\dots,A_r \backslash A_0}$. The other parts of the argument are relatively routine; see for instance this survey of Pierce for a discussion of them in the simple case ${k=1}$.

In this paper we interpret the Ionescu-Wainger multiplier theorem as being essentially a consequence of various quantitative versions of the Shannon sampling theorem. Recall that this theorem asserts that if a (Schwartz) function ${f: {\bf R} \rightarrow {\bf C}}$ has its Fourier transform supported on ${[-1/2,1/2]}$, then ${f}$ can be recovered uniquely from its restriction ${f|_{\bf Z}: {\bf Z} \rightarrow {\bf C}}$. In fact, as can be shown from a little bit of routine Fourier analysis, if we narrow the support of the Fourier transform slightly to ${[-c,c]}$ for some ${0 < c < 1/2}$, then the restriction ${f|_{\bf Z}}$ has the same ${L^p}$ behaviour as the original function, in the sense that

$\displaystyle \| f|_{\bf Z} \|_{\ell^p({\bf Z})} \sim_{c,p} \|f\|_{L^p({\bf R})} \ \ \ \ \ (4)$

for all ${0 < p \leq \infty}$; see Theorem 4.18 of this paper of myself with Krause and Mirek. This is consistent with the uncertainty principle, which suggests that such functions ${f}$ should behave like a constant at scales ${\sim 1/c}$.

The quantitative sampling theorem (4) can be used to give an alternate proof of Proposition 1(i), basically thanks to the identity

$\displaystyle T_{m;0} (f|_{\bf Z}) = (T_m f)_{\bf Z}$

whenever ${f: {\bf R} \rightarrow {\bf C}}$ is Schwartz and has Fourier transform supported in ${[-1/2,1/2]}$, and ${m}$ is also supported on ${[-1/2,1/2]}$; this identity can be easily verified from the Poisson summation formula. A variant of this argument also yields an alternate proof of Proposition 1(ii), where the role of ${{\bf R}}$ is now played by ${{\bf R} \times {\bf Z}/Q{\bf Z}}$, and the standard embedding of ${{\bf Z}}$ into ${{\bf R}}$ is now replaced by the embedding ${\iota_Q: n \mapsto (n, n \hbox{ mod } Q)}$ of ${{\bf Z}}$ into ${{\bf R} \times {\bf Z}/Q{\bf Z}}$; the analogue of (4) is now

$\displaystyle \| f \circ \iota_Q \|_{\ell^p({\bf Z})} \sim_{c,p} \|f\|_{L^p({\bf R} \times {\bf Z}/Q{\bf Z})} \ \ \ \ \ (5)$

whenever ${f: {\bf R} \times {\bf Z}/Q{\bf Z} \rightarrow {\bf C}}$ is Schwartz and has Fourier transform ${{\mathcal F}_{{\bf R} \times {\bf Z}/Q{\bf Z}} f\colon {\bf R} \times \frac{1}{Q}{\bf Z}/{\bf Z} \rightarrow {\bf C}}$ supported in ${[-c/Q,c/Q] \times \frac{1}{Q}{\bf Z}/{\bf Z}}$, and ${{\bf Z}/Q{\bf Z}}$ is endowed with probability Haar measure.

The locally compact abelian groups ${{\bf R}}$ and ${{\bf R} \times {\bf Z}/Q{\bf Z}}$ can all be viewed as projections of the adelic integers ${{\bf A}_{\bf Z} := {\bf R} \times \hat {\bf Z}}$ (the product of the reals and the profinite integers ${\hat {\bf Z}}$). By using the Ionescu-Wainger multiplier theorem, we are able to obtain an adelic version of the quantitative sampling estimate (5), namely

$\displaystyle \| f \circ \iota \|_{\ell^p({\bf Z})} \sim_{c,p} \|f\|_{L^p({\bf A}_{\bf Z})}$

whenever ${1 < p < \infty}$, ${f: {\bf A}_{\bf Z} \rightarrow {\bf C}}$ is Schwartz-Bruhat and has Fourier transform ${{\mathcal F}_{{\bf A}_{\bf Z}} f: {\bf R} \times {\bf Q}/{\bf Z} \rightarrow {\bf C}}$ supported on ${[-\varepsilon,\varepsilon] \times \Sigma_{\leq k}}$ for some sufficiently small ${\varepsilon}$ (the precise bound on ${\varepsilon}$ depends on ${S, p, c}$ in a fashion not detailed here). This allows one obtain an “adelic” extension of the Ionescu-Wainger multiplier theorem, in which the ${\ell^p({\bf Z})}$ operator norm of any discrete multiplier operator whose symbol is supported on major arcs can be shown to be comparable to the ${L^p({\bf A}_{\bf Z})}$ operator norm of an adelic counterpart to that multiplier operator; in principle this reduces “major arc” harmonic analysis on the integers ${{\bf Z}}$ to “low frequency” harmonic analysis on the adelic integers ${{\bf A}_{\bf Z}}$, which is a simpler setting in many ways (mostly because the set of major arcs (2) is now replaced with a product set ${[-\varepsilon,\varepsilon] \times \Sigma_{\leq k}}$).

A family ${A_1,\dots,A_r}$ of sets for some ${r \geq 1}$ is a sunflower if there is a core set ${A_0}$ contained in each of the ${A_i}$ such that the petal sets ${A_i \backslash A_0, i=1,\dots,r}$ are disjoint. If ${k,r \geq 1}$, let ${\mathrm{Sun}(k,r)}$ denote the smallest natural number with the property that any family of ${\mathrm{Sun}(k,r)}$ distinct sets of cardinality at most ${k}$ contains ${r}$ distinct elements ${A_1,\dots,A_r}$ that form a sunflower. The celebrated Erdös-Rado theorem asserts that ${\mathrm{Sun}(k,r)}$ is finite; in fact Erdös and Rado gave the bounds

$\displaystyle (r-1)^k \leq \mathrm{Sun}(k,r) \leq (r-1)^k k! + 1. \ \ \ \ \ (1)$

The sunflower conjecture asserts in fact that the upper bound can be improved to ${\mathrm{Sun}(k,r) \leq O(1)^k r^k}$. This remains open at present despite much effort (including a Polymath project); after a long series of improvements to the upper bound, the best general bound known currently is

$\displaystyle \mathrm{Sun}(k,r) \leq O( r \log(kr) )^k \ \ \ \ \ (2)$

for all ${k,r \geq 2}$, established in 2019 by Rao (building upon a recent breakthrough a month previously of Alweiss, Lovett, Wu, and Zhang). Here we remove the easy cases ${k=1}$ or ${r=1}$ in order to make the logarithmic factor ${\log(kr)}$ a little cleaner.

Rao’s argument used the Shannon noiseless coding theorem. It turns out that the argument can be arranged in the very slightly different language of Shannon entropy, and I would like to present it here. The argument proceeds by locating the core and petals of the sunflower separately (this strategy is also followed in Alweiss-Lovett-Wu-Zhang). In both cases the following definition will be key. In this post all random variables, such as random sets, will be understood to be discrete random variables taking values in a finite range. We always use boldface symbols to denote random variables, and non-boldface for deterministic quantities.

Definition 1 (Spread set) Let ${R > 1}$. A random set ${{\bf A}}$ is said to be ${R}$-spread if one has

$\displaystyle {\mathbb P}( S \subset {\bf A}) \leq R^{-|S|}$

for all sets ${S}$. A family ${(A_i)_{i \in I}}$ of sets is said to be ${R}$-spread if ${I}$ is non-empty and the random variable ${A_{\bf i}}$ is ${R}$-spread, where ${{\bf i}}$ is drawn uniformly from ${I}$.

The core can then be selected greedily in such a way that the remainder of a family becomes spread:

Lemma 2 (Locating the core) Let ${(A_i)_{i \in I}}$ be a family of subsets of a finite set ${X}$, each of cardinality at most ${k}$, and let ${R > 1}$. Then there exists a “core” set ${S_0}$ of cardinality at most ${k}$ such that the set

$\displaystyle J := \{ i \in I: S_0 \subset A_i \} \ \ \ \ \ (3)$

has cardinality at least ${R^{-|S_0|} |I|}$, and such that the family ${(A_j \backslash S_0)_{j \in J}}$ is ${R}$-spread. Furthermore, if ${|I| > R^k}$ and the ${A_i}$ are distinct, then ${|S_0| < k}$.

Proof: We may assume ${I}$ is non-empty, as the claim is trivial otherwise. For any ${S \subset X}$, define the quantity

$\displaystyle Q(S) := R^{|S|} |\{ i \in I: S \subset A_i\}|,$

and let ${S_0}$ be a subset of ${X}$ that maximizes ${Q(S_0)}$. Since ${Q(\emptyset) = |I| > 0}$ and ${Q(S)=0}$ when ${|S| >k}$, we see that ${0 \leq |S_0| \leq K}$. If the ${A_i}$ are distinct and ${|I| > R^k}$, then we also have ${Q(S) \leq R^k < |I| = Q(\emptyset)}$ when ${|S|=k}$, thus in this case we have ${|S_0| < k}$.

Let ${J}$ be the set (3). Since ${Q(S_0) \geq Q(\emptyset)>0}$, ${J}$ is non-empty. It remains to check that the family ${(A_j \backslash S_0)_{j \in J}}$ is ${R}$-spread. But for any ${S \subset X}$ and ${{\bf j}}$ drawn uniformly at random from ${J}$ one has

$\displaystyle {\mathbb P}( S \subset A_{\bf j} \backslash S_0 ) = \frac{|\{ i \in I: S_0 \cup S \subset A_i\}|}{|\{ i \in I: S_0 \subset A_i\}|} = R^{|S_0|-|S_0 \cup S|} \frac{Q(S)}{Q(S_0)}.$

Observe that ${Q(S) \leq Q(S_0)}$, and the probability is only non-empty when ${S, S_0}$ are disjoint, so that ${|S_0|-|S_0 \cup S| = - |S|}$. The claim follows. $\Box$

In view of the above lemma, the bound (2) will then follow from

Proposition 3 (Locating the petals) Let ${r, k \geq 2}$ be natural numbers, and suppose that ${R \geq C r \log(kr)}$ for a sufficiently large constant ${C}$. Let ${(A_i)_{i \in I}}$ be a finite family of subsets of a finite set ${X}$, each of cardinality at most ${k}$ which is ${R}$-spread. Then there exist ${i_1,\dots,i_r \in I}$ such that ${A_{i_1},\dots,A_{i_r}}$ is disjoint.

Indeed, to prove (2), we assume that ${(A_i)_{i \in I}}$ is a family of sets of cardinality greater than ${R^k}$ for some ${R \geq Cr \log(kr)}$; by discarding redundant elements and sets we may assume that ${I}$ is finite and that all the ${A_i}$ are contained in a common finite set ${X}$. Apply Lemma 2 to find a set ${S_0 \subset X}$ of cardinality ${|S_0| < k}$ such that the family ${(A_j \backslash S_0)_{j \in J}}$ is ${R}$-spread. By Proposition 3 we can find ${j_1,\dots,j_r \in J}$ such that ${A_{j_1} \backslash S_0,\dots,A_{j_r} \backslash S_0}$ are disjoint; since these sets have cardinality ${k - |S_0| > 0}$, this implies that the ${j_1,\dots,j_r}$ are distinct. Hence ${A_{j_1},\dots,A_{j_r}}$ form a sunflower as required.

Remark 4 Proposition 3 is easy to prove if we strengthen the condition on ${R}$ to ${R > k(r-1)}$. In this case, we have ${\mathop{\bf P}_{i \in I}( x \in A_i) < 1/k(r-1)}$ for every ${x \in X}$, hence by the union bound we see that for any ${i_1,\dots,i_j \in I}$ with ${j \leq r-1}$ there exists ${i_{j+1} \in I}$ such that ${A_{i_{j+1}}}$ is disjoint from the set ${A_{i_1} \cup \dots \cup A_{i_j}}$, which has cardinality at most ${k(r-1)}$. Iterating this, we obtain the conclusion of Proposition 3 in this case. This recovers a bound of the form ${\mathrm{Sun}(k,r) \leq (k(r-1))^k+1}$, and by pursuing this idea a little further one can recover the original upper bound (1) of Erdös and Rado.

It remains to prove Proposition 3. In fact we can locate the petals one at a time, placing each petal inside a random set.

Proposition 5 (Locating a single petal) Let the notation and hypotheses be as in Proposition 3. Let ${{\bf V}}$ be a random subset of ${X}$, such that each ${x \in X}$ lies in ${{\bf V}}$ with an independent probability of ${1/r}$. Then with probability greater than ${1-1/r}$, ${{\bf V}}$ contains one of the ${A_i}$.

To see that Proposition 5 implies Proposition 3, we randomly partition ${X}$ into ${{\bf V}_1 \cup \dots \cup {\bf V}_r}$ by placing each ${x \in X}$ into one of the ${{\bf V}_j}$, ${j=1,\dots,r}$ chosen uniformly and independently at random. By Proposition 5 and the union bound, we see that with positive probability, it is simultaneously true for all ${j=1,\dots,r}$ that each ${{\bf V}_j}$ contains one of the ${A_i}$. Selecting one such ${A_i}$ for each ${{\bf V}_j}$, we obtain the required disjoint petals.

We will prove Proposition 5 by gradually increasing the density of the random set and arranging the sets ${A_i}$ to get quickly absorbed by this random set. The key iteration step is

Proposition 6 (Refinement inequality) Let ${R > 1}$ and ${0 < \delta < 1}$. Let ${{\bf A}}$ be a random subset of a finite set ${X}$ which is ${R}$-spread, and let ${{\bf V}}$ be a random subset of ${X}$ independent of ${{\bf A}}$, such that each ${x \in X}$ lies in ${{\bf V}}$ with an independent probability of ${\delta}$. Then there exists another ${R}$-spread random subset ${{\bf A}'}$ of ${X}$ whose support is contained in the support of ${{\bf A}}$, such that ${{\bf A}' \backslash {\bf V} \subset {\bf A}}$ and

$\displaystyle {\mathbb E} |{\bf A}' \backslash {\bf V}| \leq \frac{5}{\log(R\delta)} {\mathbb E} |{\bf A}|.$

Note that a direct application of the first moment method gives only the bound

$\displaystyle {\mathbb E} |{\bf A} \backslash {\bf V}| \leq (1-\delta) {\mathbb E} |{\bf A}|,$

but the point is that by switching from ${{\bf A}}$ to an equivalent ${{\bf A}'}$ we can replace the ${1-\delta}$ factor by a quantity significantly smaller than ${1}$.

One can iterate the above proposition, repeatedly replacing ${{\bf A}, X}$ with ${{\bf A}' \backslash {\bf V}, X \backslash {\bf V}}$ (noting that this preserves the ${R}$-spread nature of ${{\bf A}}$) to conclude

Corollary 7 (Iterated refinement inequality) Let ${R > 1}$, ${0 < \delta < 1}$, and ${m \geq 1}$. Let ${{\bf A}}$ be a random subset of a finite set ${X}$ which is ${R}$-spread, and let ${{\bf V}}$ be a random subset of ${X}$ independent of ${{\bf A}}$, such that each ${x \in X}$ lies in ${{\bf V}}$ with an independent probability of ${1-(1-\delta)^m}$. Then there exists another random subset ${{\bf A}'}$ of ${X}$ with support contained in the support of ${{\bf A}}$, such that

$\displaystyle {\mathbb E} |{\bf A}' \backslash {\bf V}| \leq (\frac{5}{\log(R\delta)})^m {\mathbb E} |{\bf A}|.$

Now we can prove Proposition 5. Let ${m}$ be chosen shortly. Applying Corollary 7 with ${{\bf A}}$ drawn uniformly at random from the ${(A_i)_{i \in I}}$, and setting ${1-(1-\delta)^m = 1/r}$, or equivalently ${\delta = 1 - (1 - 1/r)^{1/m}}$, we have

$\displaystyle {\mathbb E} |{\bf A}' \backslash {\bf V}| \leq (\frac{5}{\log(R\delta)})^m k.$

In particular, if we set ${m = \lceil \log kr \rceil}$, so that ${\delta \sim \frac{1}{r \log kr}}$, then by choice of ${R}$ we have ${\frac{5}{\log(R\delta)} < \frac{1}{2}}$, hence

$\displaystyle {\mathbb E} |{\bf A}' \backslash {\bf V}| < \frac{1}{r}.$

In particular with probability at least ${1 - \frac{1}{r}}$, there must exist ${A_i}$ such that ${|A_i \backslash {\bf V}| = 0}$, giving the proposition.

It remains to establish Proposition 6. This is the difficult step, and requires a clever way to find the variant ${{\bf A}'}$ of ${{\bf A}}$ that has better containment properties in ${{\bf V}}$ than ${{\bf A}}$ does. The main trick is to make a conditional copy ${({\bf A}', {\bf V}')}$ of ${({\bf A}, {\bf V})}$ that is conditionally independent of ${({\bf A}, {\bf V})}$ subject to the constraint ${{\bf A} \cup {\bf V} = {\bf A}' \cup {\bf V}'}$. The point here is that this constrant implies the inclusions

$\displaystyle {\bf A}' \backslash {\bf V} \subset {\bf A} \cap {\bf A}' \subset {\bf A} \ \ \ \ \ (4)$

and

$\displaystyle {\bf A}' \backslash {\bf A} \subset {\bf V}. \ \ \ \ \ (5)$

Because of the ${R}$-spread hypothesis, it is hard for ${{\bf A}}$ to contain any fixed large set. If we could apply this observation in the contrapositive to ${{\bf A} \cap {\bf A}'}$ we could hope to get a good upper bound on the size of ${{\bf A} \cap {\bf A}'}$ and hence on ${{\bf A} \backslash {\bf V}}$ thanks to (4). One can also hope to improve such an upper bound by also employing (5), since it is also hard for the random set ${{\bf V}}$ to contain a fixed large set. There are however difficulties with implementing this approach due to the fact that the random sets ${{\bf A} \cap {\bf A}', {\bf A}' \backslash {\bf A}}$ are coupled with ${{\bf A}, {\bf V}}$ in a moderately complicated fashion. In Rao’s argument a somewhat complicated encoding scheme was created to give information-theoretic control on these random variables; below the fold we accomplish a similar effect by using Shannon entropy inequalities in place of explicit encoding. A certain amount of information-theoretic sleight of hand is required to decouple certain random variables to the extent that the Shannon inequalities can be effectively applied. The argument bears some resemblance to the “entropy compression method” discussed in this previous blog post; there may be a way to more explicitly express the argument below in terms of that method. (There is also some kinship with the method of dependent random choice, which is used for instance to establish the Balog-Szemerédi-Gowers lemma, and was also translated into information theoretic language in these unpublished notes of Van Vu and myself.)