Let {n} be a natural number, and let {\sigma: \{1,\ldots,n\} \rightarrow \{1,\ldots,n\}} be a permutation of {\{1,\ldots,n\}}, drawn uniformly at random. Using the cycle decomposition, one can view {\sigma} as the disjoint union of cycles of varying lengths (from {1} to {n}). For each {1 \leq k \leq n}, let {C_k} denote the number of cycles of {\sigma} of length {k}; thus the {C_k} are natural number-valued random variables with the constraint

\displaystyle  \sum_{k=1}^n k C_k = n. \ \ \ \ \ (1)

We let {C := \sum_{k=1}^n C_k} be the number of cycles (of arbitrary length); this is another natural number-valued random variable, of size at most {n}.
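(As a concrete aside, the cycle counts are easy to extract by machine. The following minimal Python sketch, which is purely illustrative and not part of the classical discussion, takes a permutation of {\{0,\ldots,n-1\}} in one-line notation, with sigma[i] the image of i, and returns the counts {C_k}; the helper name cycle_counts is of course just a hypothetical choice.)

from collections import Counter

def cycle_counts(sigma):
    # sigma is a permutation of {0, ..., n-1} in one-line notation:
    # sigma[i] is the image of i.  Returns a Counter mapping each
    # cycle length k to C_k (absent keys behave as C_k = 0).
    n = len(sigma)
    seen = [False] * n
    counts = Counter()
    for start in range(n):
        if not seen[start]:
            # trace the cycle through `start` and record its length
            length, i = 0, start
            while not seen[i]:
                seen[i] = True
                i = sigma[i]
                length += 1
            counts[length] += 1
    return counts

For instance, cycle_counts([1, 2, 0, 4, 3]) returns Counter({3: 1, 2: 1}), corresponding to one {3}-cycle and one {2}-cycle, so {C = 2} in that case.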

I recently had need to understand the distribution of the random variables {C_k} and {C}. As it turns out this is an extremely classical subject, but as an exercise I worked out what I needed using a quite tedious computation involving generating functions that I will not reproduce here. But the resulting identities were so nice that they strongly suggested the existence of elementary bijective (or “double counting”) proofs, in which the identities are proven with a minimum of computation, by interpreting each side of the identity as the cardinality (or probability) of the same quantity (or event), viewed in two different ways. I then found these bijective proofs, which turned out to be rather cute; again, these are all extremely classical (closely related, for instance, to Stirling numbers of the first kind), but I thought some readers might be interested in trying to find these proofs themselves as an exercise (and I also wanted a place to write the identities down so I could retrieve them later), so I have listed the identities I found below.

  1. For any {1 \leq k \leq n}, one has {{\bf E} C_k = \frac{1}{k}}. In particular, {{\bf E} C = 1 + \frac{1}{2} + \ldots + \frac{1}{n} = \log n + O(1)}. (This identity, and several of the others below, are checked numerically in the sketches after this list.)
  2. More generally, for any {1 \leq k \leq n} and {j \geq 1} with {jk \leq n}, one has {{\bf E} \binom{C_k}{j} = \frac{1}{k^j j!}}.
  3. More generally still, for any {1 \leq k_1 < \ldots < k_r \leq n} and {j_1,\ldots,j_r \geq 1} with {\sum_{i=1}^r j_i k_i \leq n}, one has

    \displaystyle  {\bf E} \prod_{i=1}^r \binom{C_{k_i}}{j_i} = \prod_{i=1}^r \frac{1}{k_i^{j_i} j_i!}.

  4. In particular, we have Cauchy’s formula: if {\sum_{k=1}^n j_k k = n}, then the probability that {C_k = j_k} for all {k=1,\ldots,n} is precisely {\prod_{k=1}^n \frac{1}{k^{j_k} j_k!}}. (This in particular leads to a reasonably tractable formula for the joint generating function of the {C_k}, which is what I initially used to compute everything that I needed, before finding the slicker bijective proofs.)
  5. For fixed {k}, {C_k} converges in distribution as {n \rightarrow \infty} to the Poisson distribution of intensity {\frac{1}{k}}.
  6. More generally, for fixed {1 \leq k_1 < \ldots < k_r}, the random variables {C_{k_1},\ldots,C_{k_r}} converge in joint distribution to independent Poisson random variables of intensities {\frac{1}{k_1},\ldots,\frac{1}{k_r}} respectively. (A more precise version of this claim can be found in this paper of Arratia and Tavaré.)
  7. One has {{\bf E} 2^C = n+1}.
  8. More generally, one has {{\bf E} m^C = \binom{n+m-1}{n}} for all natural numbers {m}.
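All of the identities above can be checked numerically for small {n} by averaging over all {n!} permutations. The following brute-force Python sketch (reusing the hypothetical cycle_counts helper from earlier, and working in exact rational arithmetic so that the comparisons are not approximate) verifies identities 1, 2, 7, and 8 for {n \leq 6}; it is only a sanity check, of course, not a proof.

import math
from fractions import Fraction
from itertools import permutations

def check_identities(n):
    # Average over all n! permutations of {0, ..., n-1} exactly.
    counts = [cycle_counts(list(p)) for p in permutations(range(n))]
    N = math.factorial(n)
    # Identity 1: E C_k = 1/k for each 1 <= k <= n.
    for k in range(1, n + 1):
        assert Fraction(sum(c[k] for c in counts), N) == Fraction(1, k)
    # Identity 2: E binom(C_k, j) = 1/(k^j j!) whenever jk <= n.
    for k in range(1, n + 1):
        for j in range(1, n // k + 1):
            lhs = Fraction(sum(math.comb(c[k], j) for c in counts), N)
            assert lhs == Fraction(1, k ** j * math.factorial(j))
    # Identities 7 and 8: E m^C = binom(n + m - 1, n).
    for m in range(1, 5):
        lhs = Fraction(sum(m ** sum(c.values()) for c in counts), N)
        assert lhs == math.comb(n + m - 1, n)

for n in range(1, 7):
    check_identities(n)
print("identities 1, 2, 7, 8 verified for n = 1, ..., 6")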
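The Poisson limit in identities 5 and 6 can similarly be probed by Monte Carlo. The sketch below (again reusing cycle_counts; the parameter choices {n = 100}, {k = 2} and the trial count are arbitrary) compares the empirical distribution of {C_k} with the Poisson probabilities {e^{-1/k} k^{-j}/j!}; one should only expect agreement up to sampling error and finite-{n} effects.

import math
import random
from collections import Counter

def poisson_check(n=100, k=2, trials=100_000, seed=0):
    # Compare the empirical distribution of C_k with Poisson(1/k).
    rng = random.Random(seed)
    freq = Counter()
    for _ in range(trials):
        sigma = list(range(n))
        rng.shuffle(sigma)  # uniform random permutation
        freq[cycle_counts(sigma)[k]] += 1
    lam = 1 / k
    for j in range(4):
        empirical = freq[j] / trials
        poisson = math.exp(-lam) * lam ** j / math.factorial(j)
        print(f"P(C_{k} = {j}): empirical {empirical:.4f}, Poisson {poisson:.4f}")

poisson_check()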