Equidistribution of Syracuse random variables and density of Collatz preimages

25 January, 2020 in expository, math.CO, math.NT | Tags: Collatz conjecture, equidistribution | by Terence Tao

Define the Collatz map ${\mathrm{Col}: {\bf N}+1 \rightarrow {\bf N}+1}$ on the natural numbers ${{\bf N}+1 = \{1,2,\dots\}}$ by setting ${\mathrm{Col}(N)}$ to equal ${3N+1}$ when ${N}$ is odd and ${N/2}$ when ${N}$ is even, and let ${\mathrm{Col}^{\bf N}(N) := \{ N, \mathrm{Col}(N), \mathrm{Col}^2(N), \dots \}}$ denote the forward Collatz orbit of ${N}$ . The notorious Collatz conjecture asserts that ${1 \in \mathrm{Col}^{\bf N}(N)}$ for all ${N \in {\bf N}+1}$ . Equivalently, if we define the backwards Collatz orbit ${(\mathrm{Col}^{\bf N})^*(N) := \{ M \in {\bf N}+1: N \in \mathrm{Col}^{\bf N}(M) \}}$ to be all the natural numbers ${M}$ that encounter ${N}$ in their forward Collatz orbit, then the Collatz conjecture asserts that ${(\mathrm{Col}^{\bf N})^*(1) = {\bf N}+1}$ . As a partial result towards this latter statement, Krasikov and Lagarias in 2003 established the bound

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Col}^{\bf N})^*(1) \} \gg x^\gamma \ \ \ \ \ (1)$

for all ${x \geq 1}$ and ${\gamma = 0.84}$ . (This improved upon previous values of ${\gamma = 0.81}$ obtained by Applegate and Lagarias in 1995, ${\gamma = 0.65}$ by Applegate and Lagarias in 1995 by a different method, ${\gamma=0.48}$ by Wirsching in 1993, ${\gamma=0.43}$ by Krasikov in 1989, ${\gamma=0.3}$ by Sander in 1990, and some ${\gamma>0}$ by Crandall in 1978.) This is still the largest value of ${\gamma}$ for which (1) has been established. Of course, the Collatz conjecture would imply that we can take ${\gamma}$ equal to ${1}$ , which is the assertion that a positive density set of natural numbers obeys the Collatz conjecture. This is not yet established, although the results in my previous paper do at least imply that a positive density set of natural numbers iterates to an (explicitly computable) bounded set, so in principle the ${\gamma=1}$ case of (1) could now be verified by an (enormous) finite computation in which one verifies that every number in this explicit bounded set iterates to ${1}$ . In this post I would like to record a possible alternate route to this problem that depends on the distribution of a certain family of random variables that appeared in my previous paper, that I called Syracuse random variables.

Definition 1 (Syracuse random variables) For any natural number ${n}$ , a Syracuse random variable ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ on the cyclic group ${{\bf Z}/3^n{\bf Z}}$ is defined as a random variable of the form

$\displaystyle \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = \sum_{m=1}^n 3^{n-m} 2^{-{\mathbf a}_m-\dots-{\mathbf a}_n} \ \ \ \ \ (2)$

where ${\mathbf{a}_1,\dots,\mathbf{a}_n}$ are independent copies of a geometric random variable ${\mathbf{Geom}(2)}$ on the natural numbers with mean ${2}$ , thus

$\displaystyle \mathop{\bf P}( \mathbf{a}_1=a_1,\dots,\mathbf{a}_n=a_n) = 2^{-a_1-\dots-a_n}$

for ${a_1,\dots,a_n \in {\bf N}+1}$ . In (2) the arithmetic is performed in the ring ${{\bf Z}/3^n{\bf Z}}$ .

Thus for instance

$\displaystyle \mathrm{Syrac}({\bf Z}/3{\bf Z}) = 2^{-\mathbf{a}_1} \hbox{ mod } 3$

$\displaystyle \mathrm{Syrac}({\bf Z}/3^2{\bf Z}) = 2^{-\mathbf{a}_2} + 3 \times 2^{-\mathbf{a}_1-\mathbf{a}_2} \hbox{ mod } 3^2$

$\displaystyle \mathrm{Syrac}({\bf Z}/3^3{\bf Z}) = 2^{-\mathbf{a}_3} + 3 \times 2^{-\mathbf{a}_2-\mathbf{a}_3} + 3^2 \times 2^{-\mathbf{a}_1-\mathbf{a}_2-\mathbf{a}_3} \hbox{ mod } 3^3$

and so forth. After reversing the labeling of the ${\mathbf{a}_1,\dots,\mathbf{a}_n}$ , one could also view ${\mathrm{Syrac}({\bf Z}/3^n{\bf Z})}$ as the mod ${3^n}$ reduction of a ${3}$ -adic random variable

$\displaystyle \mathbf{Syrac}({\bf Z}_3) = \sum_{m=1}^\infty 3^{m-1} 2^{-{\mathbf a}_1-\dots-{\mathbf a}_m}.$

The probability density function ${b \mapsto \mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b )}$ of the Syracuse random variable can be explicitly computed by a recursive formula (see Lemma 1.12 of my previous paper). For instance, when ${n=1}$ , ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3{\bf Z}) = b )}$ is equal to ${0,1/3,2/3}$ for ${b=0,1,2 \hbox{ mod } 3}$ respectively, while when ${n=2}$ , ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3^2{\bf Z}) = b )}$ is equal to

$\displaystyle 0, \frac{8}{63}, \frac{16}{63}, 0, \frac{11}{63}, \frac{4}{63}, 0, \frac{2}{63}, \frac{22}{63}$

when ${b=0,\dots,8 \hbox{ mod } 9}$ respectively.
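
As a sanity check on these values, here is a minimal Python sketch that computes the law of ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ exactly. It does not implement the recursive formula of Lemma 1.12; instead it assumes the recursion $X_k = 2^{-\mathbf{a}}(1 + 3 X_{k-1}) \hbox{ mod } 3^k$ that can be read off from the ${3}$ -adic series above, together with the fact that ${2}$ has multiplicative order ${2 \times 3^{k-1}}$ modulo ${3^k}$ . The function name `syracuse_distribution` is purely illustrative.

```python
from fractions import Fraction


def syracuse_distribution(n):
    """Exact law of Syrac(Z/3^n Z), as a list of Fractions indexed by residue.

    Assumed recursion (not the one from Lemma 1.12):
        X_k = 2^{-a} * (1 + 3 * X_{k-1})  mod 3^k,
    with a an independent Geom(2) variable.
    """
    dist = [Fraction(1)]  # law on the trivial group Z/3^0 Z
    for k in range(1, n + 1):
        modulus = 3 ** k
        period = 2 * 3 ** (k - 1)  # order of 2 modulo 3^k
        inv2 = (modulus + 1) // 2  # inverse of 2 modulo the odd number 3^k
        new = [Fraction(0)] * modulus
        for r in range(1, period + 1):
            # P(a = r mod period) for a geometric with P(a = j) = 2^{-j}
            weight = Fraction(2 ** (period - r), 2 ** period - 1)
            scale = pow(inv2, r, modulus)  # 2^{-r} mod 3^k
            for y, p in enumerate(dist):
                if p:
                    new[scale * (1 + 3 * y) % modulus] += weight * p
        dist = new
    return dist


if __name__ == "__main__":
    for n in (1, 2):
        print(n, [str(p) for p in syracuse_distribution(n)])
```

For ${n=1}$ and ${n=2}$ this reproduces the probabilities listed above; larger ${n}$ can be explored the same way, at the cost of rapidly growing denominators.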

The relationship of these random variables to the Collatz problem can be explained as follows. Let ${2{\bf N}+1 = \{1,3,5,\dots\}}$ denote the odd natural numbers, and define the Syracuse map ${\mathrm{Syr}: 2{\bf N}+1 \rightarrow 2{\bf N}+1}$ by

$\displaystyle \mathrm{Syr}(N) := \frac{3N+1}{2^{\nu_2(3N+1)}}$

where the ${2}$ -valuation ${\nu_2(3N+1) \in {\bf N}}$ is the number of times ${2}$ divides ${3N+1}$ . We can define the forward orbit ${\mathrm{Syr}^{\bf N}(N)}$ and backward orbit ${(\mathrm{Syr}^{\bf N})^*(N)}$ of the Syracuse map as before. It is not difficult to then see that the Collatz conjecture is equivalent to the assertion ${(\mathrm{Syr}^{\bf N})^*(1) = 2{\bf N}+1}$ , and that the assertion (1) for a given ${\gamma}$ is equivalent to the assertion

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Syr}^{\bf N})^*(1) \} \gg x^\gamma \ \ \ \ \ (3)$

for all ${x \geq 1}$ , where ${N}$ is now understood to range over odd natural numbers. A brief calculation then shows that for any odd natural number ${N}$ and natural number ${n}$ , one has

$\displaystyle \mathrm{Syr}^n(N) = 3^n 2^{-a_1-\dots-a_n} N + \sum_{m=1}^n 3^{n-m} 2^{-a_m-\dots-a_n}$

where the natural numbers ${a_1,\dots,a_n}$ are defined by the formula

$\displaystyle a_i := \nu_2( 3 \mathrm{Syr}^{i-1}(N) + 1 ),$

so in particular

$\displaystyle \mathrm{Syr}^n(N) = \sum_{m=1}^n 3^{n-m} 2^{-a_m-\dots-a_n} \hbox{ mod } 3^n.$
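
The two displayed identities are easy to test numerically. The following sketch (helper names such as `orbit_with_valuations` and `affine_offset_mod` are made up for illustration) iterates the Syracuse map, records the valuations ${a_1,\dots,a_n}$ , and compares ${\mathrm{Syr}^n(N) \hbox{ mod } 3^n}$ against the sum on the right-hand side.

```python
def nu2(m):
    """2-adic valuation of a positive integer m."""
    count = 0
    while m % 2 == 0:
        m //= 2
        count += 1
    return count


def syr(N):
    """Syracuse map on odd natural numbers."""
    m = 3 * N + 1
    return m >> nu2(m)


def orbit_with_valuations(N, n):
    """Return (Syr^n(N), [a_1, ..., a_n]) with a_i = nu2(3 Syr^{i-1}(N) + 1)."""
    valuations = []
    for _ in range(n):
        valuations.append(nu2(3 * N + 1))
        N = syr(N)
    return N, valuations


def affine_offset_mod(valuations):
    """sum_{m=1}^n 3^{n-m} 2^{-(a_m + ... + a_n)} computed in Z/3^n Z."""
    n = len(valuations)
    modulus = 3 ** n
    inv2 = (modulus + 1) // 2  # inverse of 2 modulo the odd number 3^n
    total = 0
    for m in range(1, n + 1):
        tail = sum(valuations[m - 1:])  # a_m + ... + a_n
        total += 3 ** (n - m) * pow(inv2, tail, modulus)
    return total % modulus


if __name__ == "__main__":
    for N in range(1, 200, 2):
        for n in range(1, 6):
            value, a = orbit_with_valuations(N, n)
            assert value % 3 ** n == affine_offset_mod(a)
    print("identity verified for odd N < 200 and n <= 5")
```

For example, ${N=7}$ gives ${7 \mapsto 11 \mapsto 17 \mapsto 13}$ with valuations ${(1,1,2)}$ , and the sum evaluates to ${13 \hbox{ mod } 27}$ .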

Heuristically, one expects the ${2}$ -valuation ${a = \nu_2(3N+1)}$ of ${3N+1}$ for a typical odd number ${N}$ to be approximately distributed according to the geometric distribution ${\mathbf{Geom}(2)}$ , so one therefore expects the residue class ${\mathrm{Syr}^n(N) \hbox{ mod } 3^n}$ to be distributed approximately according to the random variable ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ .
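
For a single step, the geometric heuristic can be checked in an exact finite form: among the odd ${N < 2^k}$ , the valuation ${\nu_2(3N+1)}$ takes the value ${a}$ for exactly a proportion ${2^{-a}}$ of them when ${a < k}$ , because the condition pins ${N}$ to one residue class modulo ${2^{a+1}}$ . The sketch below (with an illustrative helper `valuation_counts`) tabulates this; it says nothing about the joint distribution of several consecutive valuations, which is where the real difficulty lies.

```python
from collections import Counter


def nu2(m):
    """2-adic valuation of a positive integer m."""
    count = 0
    while m % 2 == 0:
        m //= 2
        count += 1
    return count


def valuation_counts(k):
    """Histogram of nu2(3N + 1) over the odd N with 1 <= N < 2^k."""
    return Counter(nu2(3 * N + 1) for N in range(1, 2 ** k, 2))


if __name__ == "__main__":
    k = 12
    counts = valuation_counts(k)
    total = 2 ** (k - 1)  # number of odd N below 2^k
    for a in range(1, 8):
        print(a, counts[a] / total, 2.0 ** -a)
```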

The Syracuse random variables ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ will always avoid multiples of three (this reflects the fact that ${\mathrm{Syr}(N)}$ is never a multiple of three), but attain any non-multiple of three in ${{\bf Z}/3^n{\bf Z}}$ with positive probability. For any natural number ${n}$ , set

$\displaystyle c_n := \inf_{b \in {\bf Z}/3^n{\bf Z}: 3 \nmid b} \mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b ).$

Equivalently, ${c_n}$ is the greatest quantity for which we have the inequality

$\displaystyle \sum_{(a_1,\dots,a_n) \in S_{n,N}} 2^{-a_1-\dots-a_n} \geq c_n \ \ \ \ \ (4)$

for all integers ${N}$ not divisible by three, where ${S_{n,N} \subset ({\bf N}+1)^n}$ is the set of all tuples ${(a_1,\dots,a_n)}$ for which

$\displaystyle N = \sum_{m=1}^n 3^{m-1} 2^{-a_1-\dots-a_m} \hbox{ mod } 3^n.$

Thus for instance ${c_0=1}$ , ${c_1 = 1/3}$ , and ${c_2 = 2/63}$ . On the other hand, since all the probabilities ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b)}$ sum to ${1}$ as ${b \in {\bf Z}/3^n{\bf Z}}$ ranges over the non-multiples of ${3}$ , we have the trivial upper bound

$\displaystyle c_n \leq \frac{3}{2} 3^{-n}.$

There is also an easy submultiplicativity result:

Lemma 2 For any natural numbers ${n_1,n_2}$ , we have

$\displaystyle c_{n_1+n_2-1} \geq c_{n_1} c_{n_2}.$

Proof: Let ${N}$ be an integer not divisible by ${3}$ , then by (4) we have

$\displaystyle \sum_{(a_1,\dots,a_{n_1}) \in S_{n_1,N}} 2^{-a_1-\dots-a_{n_1}} \geq c_{n_1}.$

If we let ${S'_{n_1,N}}$ denote the set of tuples ${(a_1,\dots,a_{n_1-1})}$ that can be formed from the tuples in ${S_{n_1,N}}$ by deleting the final component ${a_{n_1}}$ from each tuple, then we have

$\displaystyle \sum_{(a_1,\dots,a_{n_1-1}) \in S'_{n_1,N}} 2^{-a_1-\dots-a_{n_1-1}} \geq c_{n_1}. \ \ \ \ \ (5)$

Next, observe that if ${(a_1,\dots,a_{n_1-1}) \in S'_{n_1,N}}$ , then

$\displaystyle N = \sum_{m=1}^{n_1-1} 3^{m-1} 2^{-a_1-\dots-a_m} + 3^{n_1-1} 2^{-a_1-\dots-a_{n_1-1}} M$

with ${M = M_{N,n_1,a_1,\dots,a_{n_1-1}}}$ an integer not divisible by three. By definition of ${S_{n_2,M}}$ and a relabeling, we then have

$\displaystyle M = \sum_{m=1}^{n_2} 3^{m-1} 2^{-a_{n_1}-\dots-a_{m+n_1-1}} \hbox{ mod } 3^{n_2}$

for all ${(a_{n_1},\dots,a_{n_1+n_2-1}) \in S_{n_2,M}}$ . For such tuples we then have

$\displaystyle N = \sum_{m=1}^{n_1+n_2-1} 3^{m-1} 2^{-a_1-\dots-a_m} \hbox{ mod } 3^{n_1+n_2-1}$

so that ${(a_1,\dots,a_{n_1+n_2-1}) \in S_{n_1+n_2-1,N}}$ . Since

$\displaystyle \sum_{(a_{n_1},\dots,a_{n_1+n_2-1}) \in S_{n_2,M}} 2^{-a_{n_1}-\dots-a_{n_1+n_2-1}} \geq c_{n_2}$

for each ${M}$ , the claim follows. $\Box$
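
For small ${n}$ the quantities ${c_n}$ can be computed exactly, which gives a quick numerical check of both the trivial upper bound and Lemma 2. The sketch below assumes the recursion $X_k = 2^{-\mathbf{a}}(1 + 3 X_{k-1}) \hbox{ mod } 3^k$ for the Syracuse random variables (read off from the ${3}$ -adic series, not taken from Lemma 1.12 of the paper); the names `syracuse_distribution` and `c` are illustrative only.

```python
from fractions import Fraction


def syracuse_distribution(n):
    """Exact law of Syrac(Z/3^n Z) via X_k = 2^{-a} (1 + 3 X_{k-1}) mod 3^k."""
    dist = [Fraction(1)]
    for k in range(1, n + 1):
        modulus = 3 ** k
        period = 2 * 3 ** (k - 1)  # order of 2 modulo 3^k
        inv2 = (modulus + 1) // 2  # inverse of 2 modulo 3^k
        new = [Fraction(0)] * modulus
        for r in range(1, period + 1):
            weight = Fraction(2 ** (period - r), 2 ** period - 1)
            scale = pow(inv2, r, modulus)
            for y, p in enumerate(dist):
                new[scale * (1 + 3 * y) % modulus] += weight * p
        dist = new
    return dist


def c(n):
    """Smallest probability of a non-multiple of 3 (with c_0 = 1 by convention)."""
    if n == 0:
        return Fraction(1)
    dist = syracuse_distribution(n)
    return min(p for b, p in enumerate(dist) if b % 3 != 0)


if __name__ == "__main__":
    values = {n: c(n) for n in range(5)}
    for n, value in values.items():
        # trivial upper bound c_n <= (3/2) 3^{-n}
        assert value <= Fraction(3, 2) / 3 ** n
        print(n, value, float(value))
    for n1 in range(1, 5):
        for n2 in range(1, 5):
            if n1 + n2 - 1 <= 4:
                # Lemma 2: c_{n1+n2-1} >= c_{n1} c_{n2}
                assert values[n1 + n2 - 1] >= values[n1] * values[n2]
```

Exact rational arithmetic is used deliberately, so the inequalities are checked without floating-point error; the price is that the denominators grow quickly with ${n}$ .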

From this lemma we see that ${c_n = 3^{-\beta n + o(n)}}$ for some absolute constant ${\beta \geq 1}$ . Heuristically, we expect the Syracuse random variables to be somewhat approximately equidistributed amongst the non-multiples of three in ${{\bf Z}/3^n{\bf Z}}$ (in Proposition 1.4 of my previous paper I prove a fine scale mixing result that supports this heuristic). As a consequence it is natural to conjecture that ${\beta=1}$ . I cannot prove this, but I can show that this conjecture would imply that we can take the exponent ${\gamma}$ in (1), (3) arbitrarily close to one:

Proposition 3 Suppose that ${\beta=1}$ (that is to say, ${c_n = 3^{-n+o(n)}}$ as ${n \rightarrow \infty}$ ). Then

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Syr}^{\bf N})^*(1) \} \gg x^{1-o(1)}$

as ${x \rightarrow \infty}$ , or equivalently

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Col}^{\bf N})^*(1) \} \gg x^{1-o(1)}$

as ${x \rightarrow \infty}$ . In other words, (1), (3) hold for all ${\gamma < 1}$ .

I prove this proposition below the fold. A variant of the argument shows that for any value of ${\beta}$ , (1), (3) holds whenever ${\gamma < f(\beta)}$ , where ${f: [0,1] \rightarrow [0,1]}$ is an explicitly computable function with ${f(\beta) \rightarrow 1}$ as ${\beta \rightarrow 1}$ . In principle, one could then improve the Krasikov-Lagarias result ${\gamma = 0.84}$ by getting a sufficiently good upper bound on ${\beta}$ , which is in principle achievable numerically (note for instance that Lemma 2 implies the bound ${c_n \leq 3^{-\beta(n-1)}}$ for any ${n}$ , since ${c_{kn-k+1} \geq c_n^k}$ for any ${k}$ ).
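
To unpack the parenthetical remark, here is one way the bound can be derived; this is a sketch of a standard limiting argument and is not spelled out in the text. Applying Lemma 2 with ${n_1 = n_2 = n}$ , then with ${n_1 = 2n-1}$ and ${n_2 = n}$ , and so on, gives

$\displaystyle c_{k(n-1)+1} \geq c_n^k$

for every natural number ${k}$ . Inserting ${c_m = 3^{-\beta m + o(m)}}$ with ${m = k(n-1)+1}$ and taking logarithms to base ${3}$ ,

$\displaystyle -\beta (k(n-1)+1) + o(k) \geq k \log_3 c_n,$

and dividing by ${k}$ and sending ${k \rightarrow \infty}$ with ${n}$ fixed yields

$\displaystyle \beta \leq \frac{\log_3 (1/c_n)}{n-1}$

for ${n \geq 2}$ , which is a restatement of ${c_n \leq 3^{-\beta(n-1)}}$ . With ${c_2 = 2/63}$ this reads ${\beta \leq \log_3(63/2) \approx 3.14}$ .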

— 1. Proof of proposition —

Assume ${\beta=1}$ . Let ${\varepsilon>0}$ be sufficiently small, and let ${n_0}$ be sufficiently large depending on ${\varepsilon}$ . We first establish the following proposition, that shows that elements in a certain residue class have a lot of Syracuse preimages:

Proposition 4 There exists a residue class of ${{\bf Z}/3^{n_0}{\bf Z}}$ with the property that for all integers ${N}$ in this class, and all non-negative integers ${j}$ , there exist natural numbers ${n_j, L_j}$ with

$\displaystyle (2-\varepsilon^2) n_j \leq L_j \leq (2+\varepsilon^2) n_j$

and

$\displaystyle (4/3)^{(1-\varepsilon^2) (1+\varepsilon)^j n_0} \leq 3^{-n_j} 2^{L_j} \leq (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^j n_0}$

and at least ${3^{-n_j - \varepsilon^4 n_j} 2^{L_j}}$ tuples

$\displaystyle (a_1,\dots,a_{n_j-1}) \in S'_{n_j,N}$

obeying the additional properties

$\displaystyle a_1+\dots+a_{n_j-1} = L_j \ \ \ \ \ (6)$

and

$\displaystyle a_1+\dots+a_i - \frac{\log 3}{\log 2} i \geq - \varepsilon^5 n_0 \ \ \ \ \ (7)$

for all ${1 \leq i \leq n_j-1}$ .

Proof: We begin with the base case ${j=0}$ . By (4) and the hypothesis ${\beta=1}$ , we see that

$\displaystyle \sum_{(a_1,\dots,a_{n_0-1}) \in S'_{n_0,N}} 2^{-a_1-\dots-a_{n_0-1}} \gg 3^{-(1+\varepsilon^6) n_0}$

for all integers ${N}$ not divisible by ${3}$ . Let ${S''_{n_0,N}}$ denote the tuples ${(a_1,\dots,a_{n_0-1})}$ in ${S'_{n_0,N}}$ that obey the additional regularity hypotheses

$\displaystyle |a_1 + \dots + a_i - 2i| \leq \varepsilon^5 n_0 \ \ \ \ \ (8)$

for all ${1 \leq i \leq n_0-1}$ ; note that this implies in particular the ${j=0}$ case of (7). From the Chernoff inequality (noting that the geometric random variable ${\mathrm{Geom}(2)}$ has mean ${2}$ ) and the union bound we have

$\displaystyle \sum_{b \in {\bf Z}/3^{n_0}{\bf Z}: 3 \nmid b} \sum_{(a_1,\dots,a_{n_0-1}) \in S'_{n_0,b} \backslash S''_{n_0,b}} 2^{-a_1-\dots-a_{n_0-1}} \ll 3^{-c \varepsilon^5 n_0}$

for an absolute constant ${c>0}$ (where we use the periodicity of ${S'_{n_0,N}, S''_{n_0,N}}$ in ${N}$ to define ${S'_{n_0,b}, S''_{n_0,b}}$ for ${b \in {\bf Z}/3^{n_0}{\bf Z}}$ by abuse of notation). Hence by the pigeonhole principle we can find a residue class ${b}$ not divisible by ${3}$ such that

$\displaystyle \sum_{(a_1,\dots,a_{n_0-1}) \in S'_{n_0,b} \backslash S''_{n_0,b}} 2^{-a_1-\dots-a_{n_0-1}} \ll 3^{-(1+c \varepsilon^5) n_0}$

and hence by the triangle inequality we have

$\displaystyle \sum_{(a_1,\dots,a_{n_0-1}) \in S''_{n_0,N}} 2^{-a_1-\dots-a_{n_0-1}} \gg 3^{-(1+\varepsilon^6) n_0}$

for all ${N}$ in this residue class.

Henceforth ${N}$ is assumed to be an element of this residue class. For ${(a_1,\dots,a_{n_0-1}) \in S''_{n_0,N}}$ , we see from (8)

$\displaystyle a_1 + \dots + a_{n_0-1} = (2+O(\varepsilon^5)) n_0,$

hence by the pigeonhole principle there exists ${L_0 = (2+O(\varepsilon^5)) n_0}$ (so in particular ${3^{-n_0} 2^{L_0} = (4/3)^{(1+O(\varepsilon^5))n_0}}$ ) such that

$\displaystyle \sum_{(a_1,\dots,a_{n_0-1}) \in S''_{n_0,N}: a_1+\dots+a_{n_0-1} = L_0} 2^{-L_0} \gg 3^{-(1+\varepsilon^6) n_0}$

so the number of summands here is at least ${\gg 2^{L_0} 3^{-(1+\varepsilon^6) n_0}}$ . This establishes the base case ${j=0}$ .

Now suppose inductively that ${j \geq 1}$ , and that the claim has already been proven for ${j-1}$ . By induction hypothesis, there exist natural numbers ${n_{j-1}, L_{j-1}}$ with

$\displaystyle (2-\varepsilon^2) n_{j-1} \leq L_{j-1} \leq (2+\varepsilon^2) n_{j-1}$

and

$\displaystyle (4/3)^{(1-\varepsilon^2) (1+\varepsilon)^{j-1} n_0} \leq 3^{-n_{j-1}} 2^{L_{j-1}} \leq (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^{j-1} n_0} \ \ \ \ \ (9)$

(which in particular imply that ${n_{j-1} = (1+O(\varepsilon^2)) (1+\varepsilon)^{j-1} n_0}$ ) and at least ${3^{-n_{j-1} - \varepsilon^4 n_{j-1}} 2^{L_{j-1}}}$ tuples

$\displaystyle (a_1,\dots,a_{n_{j-1}-1}) \in S'_{n_{j-1},N} \ \ \ \ \ (10)$

obeying the additional properties

$\displaystyle a_1+\dots+a_{n_{j-1}-1} = L_{j-1} \ \ \ \ \ (11)$

and (7) for all ${1 \leq i \leq n_{j-1}-1}$ .

Let ${n_{j}}$ be an integer such that

$\displaystyle 3^{-n_{j}} 2^{L_{j-1} + 2(n_{j}-n_{j-1})} \asymp (4/3)^{(1+\varepsilon)^j n_0}. \ \ \ \ \ (12)$

One easily checks that

$\displaystyle n_{j} = (1+\varepsilon+O(\varepsilon^2)) n_{j-1} = (1+O(\varepsilon^2)) (1+\varepsilon)^{j} n_0.$
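
For completeness, the check might be sketched as follows. This is a reconstruction from the stated bounds, reading the right-hand side of (12) as ${(4/3)^{(1+\varepsilon)^j n_0}}$ up to constants and taking ${n_0}$ large enough that ${O(1)}$ errors are absorbed. The left-hand side of (12) factors as

$\displaystyle 3^{-n_j} 2^{L_{j-1}+2(n_j-n_{j-1})} = (4/3)^{n_j-n_{j-1}} \times 3^{-n_{j-1}} 2^{L_{j-1}},$

and by (9) the second factor is ${(4/3)^{(1+O(\varepsilon^2))(1+\varepsilon)^{j-1} n_0}}$ . Comparing exponents of ${4/3}$ gives

$\displaystyle n_j - n_{j-1} = (1+\varepsilon)^j n_0 - (1+O(\varepsilon^2))(1+\varepsilon)^{j-1} n_0 + O(1) = (\varepsilon + O(\varepsilon^2)) (1+\varepsilon)^{j-1} n_0.$

Since ${n_{j-1} = (1+O(\varepsilon^2)) (1+\varepsilon)^{j-1} n_0}$ , the right-hand side is ${(\varepsilon + O(\varepsilon^2)) n_{j-1}}$ , so that ${n_j = (1+\varepsilon+O(\varepsilon^2)) n_{j-1}}$ and hence ${n_j = (1+O(\varepsilon^2))(1+\varepsilon)^j n_0}$ .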

For each tuple (10), we may write (as in the proof of Lemma 2)

$\displaystyle N = \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{-a_1-\dots-a_m} + 3^{n_{j-1}-1} 2^{-L_{j-1}} M_{\vec a}$

for some integers ${M_{\vec a}}$ . We claim that these integers lie in distinct residue classes modulo ${3^k}$ where

$\displaystyle k :=\lfloor \frac{\log 2}{\log 3} L_{j-1} - n_{j-1} + \varepsilon^4 n_{j-1} \rfloor.$

Indeed, suppose that ${M_{\vec a} = M_{\vec b} \hbox{ mod } 3^k}$ for two tuples ${\vec a = (a_1,\dots,a_{n_{j-1}-1})}$ , ${\vec b = (b_1,\dots,b_{n_{j-1}-1})}$ of the above form. Then

$\displaystyle \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{-a_1-\dots-a_m} = \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{-b_1-\dots-b_m} \hbox{ mod } 3^{n_{j-1}-1+k}$

(where we now invert ${2}$ in the ring ${{\bf Z}/3^{n_{j-1}-1+k}{\bf Z}}$ ), or equivalently

$\displaystyle \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{a_{m+1}+\dots+a_{n_{j-1}-1}} = \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{b_{m+1}+\dots+b_{n_{j-1}-1}} \hbox{ mod } 3^{n_{j-1}-1+k}.$

By (11), (7), all the summands on the left-hand side are natural numbers of size ${O( 2^{L_{j-1}} 3^{O(\varepsilon^5 n_{j-1})})}$ , hence the sum also has this size; similarly for the right-hand side. From the estimates of ${n_{j-1}, n_{j}}$ , we thus see that both sides are natural numbers between ${1}$ and ${3^{n_{j-1}-1+k}}$ , by hypothesis on ${k}$ . Thus we may remove the modular constraint and conclude that

$\displaystyle \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{a_{m+1}+\dots+a_{n_{j-1}-1}} = \sum_{m=1}^{n_{j-1}-1} 3^{m-1} 2^{b_{m+1}+\dots+b_{n_{j-1}-1}}$

and then a routine induction (see Lemma 6.2 of my paper) shows that ${(a_1,\dots,a_{n_{j-1}-1}) = (b_1,\dots,b_{n_{j-1}-1})}$ . This establishes the claim.

As a corollary, we see that every residue class modulo ${3^{n_j-n_{j-1}}}$ contains

$\displaystyle O( 3^{k - (n_j-n_{j-1})} ) = O( 2^{L_{j-1}} 3^{-n_j + \varepsilon^4 n_{j-1}} )$

of the ${M_{\vec a}}$ at most. Since there were at least ${3^{-n_{j-1} - \varepsilon^4 n_{j-1}} 2^{L_{j-1}}}$ tuples ${\vec a}$ to begin with, we may therefore forbid up to ${O(3^{n_j-n_{j-1} - 3 \varepsilon^4 n_{j-1}})}$ residue classes modulo ${3^{n_j-n_{j-1}}}$ , and still have ${\gg 3^{-n_{j-1} - \varepsilon^4 n_{j-1}} 2^{L_{j-1}}}$ surviving tuples ${\vec a}$ with the property that ${M_{\vec a}}$ avoids all the forbidden classes.

Let ${\vec a}$ be one of the tuples (10). By the hypothesis ${\beta = 1}$ , we have

$\displaystyle \sum_{(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'_{n_j-n_{j-1},M_{\vec a}}} 2^{-a_{n_{j-1}}-\dots-a_{n_j-1}} \gg 3^{-(1+\varepsilon^6) (n_j-n_{j-1})}.$

Let ${S'''_{n_j-n_{j-1},M}}$ denote the set of tuples ${(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'_{n_j-n_{j-1},M}}$ with the additional property

$\displaystyle |a_{n_{j-1}} + \dots + a_i - 2(i-n_{j-1}+1)| \leq \varepsilon^3 (n_j - n_{j-1})$

for all ${n_{j-1} \leq i \leq n_j - 1}$ , then by the Chernoff bound we have

$\displaystyle \sum_{b \in {\bf Z}/3^{n_j-n_{j-1}}{\bf Z}} \sum_{(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'_{n_j-n_{j-1},b} \backslash S'''_{n_j-n_{j-1},b}} 2^{-a_{n_{j-1}}-\dots-a_{n_j-1}}$

$\displaystyle \ll 3^{-c\varepsilon^3 (n_j-n_{j-1})}$

for some absolute constant ${c>0}$ . Thus, by the Markov inequality, by forbidding up to ${O(3^{n_j-n_{j-1} - 3 \varepsilon^4 n_{j-1}})}$ classes, we may ensure that

$\displaystyle \sum_{(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'_{n_j-n_{j-1},M_{\vec a}} \backslash S'''_{n_j-n_{j-1},M_{\vec a}}} 2^{-a_{n_{j-1}}-\dots-a_{n_j-1}} \ll 3^{-(1+\varepsilon^5) (n_j-n_{j-1})}$

and hence

$\displaystyle \sum_{(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'''_{n_j-n_{j-1},M_{\vec a}}} 2^{-a_{n_{j-1}}-\dots-a_{n_j-1}} \gg 3^{-(1+\varepsilon^6) (n_j-n_{j-1})}.$

We thus have

$\displaystyle \sum_{a_1,\dots,a_{n_j-1}} 2^{-a_{n_{j-1}}-\dots-a_{n_j-1}} \gg 3^{-n_{j-1} - \varepsilon^4 n_{j-1}} 2^{L_{j-1}} 3^{-(1+\varepsilon^6) (n_j-n_{j-1})}$

where ${(a_1,\dots,a_{n_j-1})}$ run over all tuples with ${\vec a = (a_1,\dots,a_{n_{j-1}-1})}$ being one of the previously surviving tuples, and ${(a_{n_{j-1}},\dots,a_{n_j-1}) \in S'''_{n_j-n_{j-1},M_{\vec a}}}$ . By (11) we may rearrange this a little as

$\displaystyle \sum_{a_1,\dots,a_{n_j-1}} 2^{-a_1-\dots-a_{n_j-1}} \gg 3^{-n_{j} - \varepsilon^4 n_{j-1}-\varepsilon^6 (n_j-n_{j-1})}.$

By construction, we have

$\displaystyle a_1 + \dots + a_{n_j-1} = L_{j-1} + (2 + O(\varepsilon^3)) (n_j - n_{j-1})$

for any tuple in the above sum, hence by the pigeonhole principle we may find an integer

$\displaystyle L_j = L_{j-1} + (2 + O(\varepsilon^3)) (n_j - n_{j-1}) \ \ \ \ \ (13)$

for which

$\displaystyle \sum_{a_1,\dots,a_{n_j-1}: a_1+\dots+a_{n_j-1}=L_j} 2^{-a_1-\dots-a_{n_j-1}} \geq 3^{-n_{j} - \varepsilon^4 n_j}.$

In particular the number of summands is at least ${3^{-n_{j} - \varepsilon^4 n_j} 2^{L_j}}$ . Also observe from (13), (12) that

$\displaystyle 3^{-n_j} 2^{L_j} = 3^{-n_{j} + O( \varepsilon^3 (n_j - n_{j-1}))} 2^{L_{j-1} + 2(n_j - n_{j-1})}$

$\displaystyle = (4/3)^{(1+\varepsilon)^j n_0} 3^{O( \varepsilon^3 (n_j - n_{j-1}))}$

so in particular

$\displaystyle (4/3)^{(1-\varepsilon^2) (1+\varepsilon)^j n_0} \leq 3^{-n_j} 2^{L_j} \leq (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^j n_0}.$

It is a routine matter to verify that all tuples in this sum lie in ${S'_{n_j,N}}$ and obey the requirements (6), (7), closing the induction. $\Box$

Corollary 5 For all ${N}$ in the residue class from the previous proposition, and all ${j \geq 0}$ , we have

$\displaystyle \# \{ M \in (\mathrm{Syr}^{\bf N})^*(N): M \leq 3 (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^j n_0} N \}$

$\displaystyle \gg (4/3)^{(1-\varepsilon) (1+\varepsilon)^j n_0}.$

In particular, we have

$\displaystyle \# \{ M \in (\mathrm{Syr}^{\bf N})^*(N): M \leq x \} \gg_{\varepsilon,n_0,N} x^{1-\varepsilon}$

as ${x \rightarrow \infty}$ .

Proof: For every tuple ${(a_1,\dots,a_{n_j-1})}$ in the previous proposition, we have

$\displaystyle N = \sum_{m=1}^{n_{j}-1} 3^{m-1} 2^{-a_1-\dots-a_m} + 3^{n_{j}-1} 2^{-L_{j}} M$

for some integer ${M}$ . As before, all these integers ${M}$ are distinct, and have magnitude

$\displaystyle M \leq 3^{-n_j+1} 2^{L_j} N \leq 3 (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^j n_0} N.$

From construction we also have ${\mathrm{Syr}^{n_j-1}(M) = N}$ , so that ${M \in (\mathrm{Syr}^{\bf N})^*(N)}$ . The number of tuples is at least

$\displaystyle 3^{-n_j - \varepsilon^4 n_j} 2^{L_j}$

which can be computed from the properties of ${n_j,L_j}$ to be of size at least ${(4/3)^{(1-\varepsilon) (1+\varepsilon)^j n_0}}$ . This gives the first claim, and the second claim follows by taking ${j}$ to be the first integer for which ${3 (4/3)^{(1+\varepsilon^2) (1+\varepsilon)^j n_0} N \geq x}$ . $\Box$
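
The counting function in Corollary 5 can be explored numerically by walking the Syracuse map backwards: the odd preimages of ${N}$ are the integers ${M = (2^a N - 1)/3}$ for those ${a \geq 1}$ making this an integer. The sketch below is an illustration only and is not the construction used in the proof; the helper names are invented, and it enumerates just those preimages whose entire backward path stays below the bound, so it gives a lower bound for ${\# \{ M \in (\mathrm{Syr}^{\bf N})^*(N): M \leq x \}}$ .

```python
def syr(N):
    """Syracuse map on odd natural numbers."""
    m = 3 * N + 1
    while m % 2 == 0:
        m //= 2
    return m


def syracuse_preimages(N, bound):
    """All odd M <= bound with Syr(M) == N (empty if 3 divides N)."""
    found = []
    power = 2
    while power * N - 1 <= 3 * bound:
        if (power * N - 1) % 3 == 0:
            found.append((power * N - 1) // 3)
        power *= 2
    return found


def bounded_backward_orbit(N, bound):
    """Elements of the backward orbit of N reachable through values <= bound."""
    seen = {N}
    stack = [N]
    while stack:
        current = stack.pop()
        for M in syracuse_preimages(current, bound):
            if M not in seen:
                seen.add(M)
                stack.append(M)
    return seen


if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, len(bounded_backward_orbit(1, x)))
```

Numbers such as ${27}$ , whose forward orbit climbs above ${1000}$ before descending, are missed when the bound is ${1000}$ ; this is the price of keeping the search finite.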

To conclude the proof of Proposition 3, it thus suffices to show that

Lemma 6 Every residue class ${b \hbox{ mod } 3^{n_0}}$ has a non-trivial intersection with ${(\mathrm{Syr}^{\bf N})^*(1)}$ .

Indeed, if we let ${b \hbox{ mod } 3^{n_0}}$ be the residue class from the preceding propositions, and use this lemma to produce an element ${N}$ of ${(\mathrm{Syr}^{\bf N})^*(1)}$ that lies in this class, then from the inclusion ${(\mathrm{Syr}^{\bf N})^*(N) \subset (\mathrm{Syr}^{\bf N})^*(1)}$ we obtain (3) with ${\gamma = 1-O(\varepsilon)}$ , and then on sending ${\varepsilon}$ to zero we obtain the claim.

Proof: An easy induction (based on first establishing that ${2^{2 \times 3^n} = 1 + 3^{n+1} \hbox{ mod } 3^{n+2}}$ for all natural numbers ${n}$ ) shows that the powers of two modulo ${3^{n_0+1}}$ occupy every residue class not divisible by ${3}$ . From this we can locate an integer ${N}$ in ${b \hbox{ mod } 3^{n_0}}$ of the form ${N = \frac{2^n-1}{3}}$ . Since ${\mathrm{Syr}(N)=1}$ , the claim follows. $\Box$
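
The proof is constructive, and for small ${n_0}$ it can be carried out by brute force. The sketch below (the name `preimage_of_one_in_class` is made up for illustration) searches for an exponent ${n}$ with ${2^n = 1 + 3b \hbox{ mod } 3^{n_0+1}}$ and returns ${N = (2^n-1)/3}$ , which then lies in the class ${b \hbox{ mod } 3^{n_0}}$ and satisfies ${\mathrm{Syr}(N) = 1}$ .

```python
def syr(N):
    """Syracuse map on odd natural numbers."""
    m = 3 * N + 1
    while m % 2 == 0:
        m //= 2
    return m


def preimage_of_one_in_class(b, n0):
    """Find N = (2^n - 1)/3 with N = b mod 3^n0, so that Syr(N) = 1."""
    modulus = 3 ** (n0 + 1)
    target = (1 + 3 * b) % modulus
    power = 1
    # 2 has order 2 * 3^n0 modulo 3^(n0+1), so one full period suffices.
    for n in range(1, 2 * 3 ** n0 + 1):
        power = power * 2 % modulus
        if power == target:
            return (2 ** n - 1) // 3
    raise ValueError("no exponent found; this should not happen")


if __name__ == "__main__":
    n0 = 2
    for b in range(3 ** n0):
        N = preimage_of_one_in_class(b, n0)
        assert N % 3 ** n0 == b and syr(N) == 1
        print(b, N)
```

The search starts at ${n=1}$ rather than ${n=0}$ so that the class ${b = 0}$ produces a genuine natural number (for instance ${N = 21}$ when ${n_0 = 1}$ ) rather than ${N = 0}$ .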

We remark that the same argument in fact shows (assuming ${\beta=1}$ of course) that

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Col}^{\bf N})^*(N_0) \} \gg_{N_0} x^{1-o(1)}$

in the limit ${x \rightarrow \infty}$ for any natural number ${N_0}$ not divisible by three.

96 comments

25 January, 2020 at 8:14 pm

Anonymous

Is it possible to extend proposition 3 for each given $\beta$ to find the explicit expression of the best possible $\gamma$ as a function of each given $\beta$ ?

26 January, 2020 at 9:11 am

Terence Tao

The argument I give in this post does give in principle an explicit way to convert a specific value of $\beta$ to a specific value of $\gamma$ , but it is quite inefficient and would not give the best possible $\gamma$ . Heuristically one can obtain a relationship as follows. From (5) we have for any natural number $N$ not divisible by 3 that

$\displaystyle \sum_{(a_1,\dots,a_{n-1}) \in S'_{n,N}} 2^{-a_1-\dots-a_{n-1}} \gg 3^{-(\beta+o(1)) n}.$

The law of large numbers suggests that typically one has $a_1+\dots+a_{n-1} \approx 2(n-1)$ . (Making this intuition precise is in fact a bit tricky, and is the reason why the arguments in the post are rather technical and lead to inefficiencies in converting from $\beta$ values to $\gamma$ values.) If one blindly inserts this law of large numbers approximation, we see that

$\displaystyle \# S'_{n,N} \gg 3^{-\beta n + o(n)} 2^{2n}.$

This basically tells us that $\mathrm{Syr}^{-n}(N)$ (or $\mathrm{Syr}^{-n+1}(N)$ ) has $\gg 3^{-\beta n + o(n)} 2^{2n}$ preimages, each of which has size about $3^{-n} 2^{a_1+\dots+a_{n-1}} N \approx (4/3)^n N$ by the equation just after (3). (A good rule of thumb is that each iterate of the Syracuse map tends to reduce the magnitude by about 3/4 on the average.) Applying this heuristic to $N=1$ , and setting $x = (4/3)^n$ , we then arrive at a predicted value of $\gamma$ given by

$\displaystyle ((4/3)^n)^\gamma = (3^{-\beta n} 2^{2n})$

or equivalently

$\displaystyle \gamma = \frac{\log 4 - \beta \log 3}{\log 4 - \log 3}$

$\displaystyle \beta = \frac{\log 4 - \gamma \log(4/3)}{\log 3}.$

If we take this numerology as the value of the best in principle value of $\gamma$ one can get for a given $\beta$ , this argument would need $\beta$ to be less than $\log 4 / \log 3 \approx 1.26$ to get a nontrivial bound for $\gamma$ , and less than $\frac{\log 4 - 0.84 \log(4/3)}{\log 3} \approx 1.04$ to improve upon Krasikov-Lagarias. (For comparison, the inequality $\beta \leq |\log_3 c_2|$ gives an upper bound for $\beta$ of $\log(63/2)/\log 3 \approx 3.14$ ; this can presumably be improved upon substantially, but clearly there is a gap to close here. Perhaps a computation of $c_3$ and $c_4$ would shed some more light on the situation.) However if one wishes to get a better numerical value of $\gamma$ , it may be possible to work with a more complicated quantity than $\beta$ , taking into account the entire distribution of the Syracuse random variable and not just the infimal value of the distribution function. (In some sense this is what Krasikov and Lagarias actually do, though not quite in this language.)
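
The numerology above is easy to reproduce. The following short sketch simply evaluates the two heuristic formulas relating $\beta$ and $\gamma$ ; the function names are illustrative, and the output carries no more weight than the heuristic itself.

```python
import math


def gamma_from_beta(beta):
    """Heuristic exponent gamma predicted from beta."""
    return (math.log(4) - beta * math.log(3)) / (math.log(4) - math.log(3))


def beta_from_gamma(gamma):
    """Inverse relation: the beta needed to reach a given gamma."""
    return (math.log(4) - gamma * math.log(4 / 3)) / math.log(3)


if __name__ == "__main__":
    print("gamma at beta = 1:", gamma_from_beta(1.0))
    print("beta needed for gamma > 0:", beta_from_gamma(0.0))
    print("beta needed for gamma > 0.84:", beta_from_gamma(0.84))
    print("bound on beta from c_2 = 2/63:", math.log(63 / 2) / math.log(3))
```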

26 January, 2020 at 11:33 am

Uwe Stroinski

Instead of computing the $c_n$ for large $n$ exactly it might be sufficient to lower bound the numerator by $1$ and to work out the denominator with your recursive formula.

26 January, 2020 at 12:06 pm

Terence Tao

The denominator unfortunately grows quite fast: the worst case is that the denominator for $c_n$ is $\prod_{m =0}^{n-1} (2^{2 \times 3^m} - 1)$ , though already for $n=2$ there is a little bit of cancellation and the final denominator ends up being $2^{2 \times 3}-1 = 63$ rather than $(2^2-1) \times (2^{2 \times 3}-1) = 189$ . My feeling is that while it is technically true that the probability densities are rational numbers, by the time $n$ gets even moderately large (e.g., $n=4$ ) one is better off thinking of them as being generic real numbers.

25 January, 2020 at 10:33 pm

Anonymous

The inequality $c_n \leq 3^{-\beta (n-1)}$ (given below proposition 3) seems to imply an upper (not lower!) bound on $\beta$ .

[Corrected, thanks – T.]

26 January, 2020 at 7:36 am

Jeff

Mistakes like these tend to come from fatigue. And they tend to artificially bolster one’s confidence, especially if the subject is integral to one’s philosophical gestalt.

Taking a break is a good thing. Take care! Best wishes.

26 January, 2020 at 1:47 am

Anonymous

Wow, very exciting result. I am just wondering, can you make o(1) arbitrarily small, meaning that o(1) goes to 0 and thus having the collatz conjecture settled?

26 January, 2020 at 9:20 am

Terence Tao

Not with the current argument, but I would imagine that the limit of the method would be to show that the preimage of any given number $N$ not divisible by 3 has positive density (so the $x^{-o(1)}$ error would be replaced by a constant depending only on $N$ ). Note that my previous paper already shows that a positive proportion of all orbits reach a bounded value, so by the pigeonhole principle there is at least one number $N$ whose preimage has positive density.

As I stated in the previous post, the full Collatz conjecture is still well out of reach of current methods (though Alex Kontorovich has recently pointed out that there is a remote possibility that one could perhaps engineer a counterexample to the conjecture if one could somehow demonstrate enough “Turing completeness” to the iteration). However, partial results such as these can still give various worthwhile insights on the nature of the problem and its connection to other questions, even if this is not sufficient to lead to a full resolution of the problem. For instance these arguments seem to indicate that the 3-adic structure of the Collatz iteration (as captured by the Syracuse random variables $\mathbf{Syrac}({\bf Z}/3^n{\bf Z})$ ) should be studied further, particularly for statistical versions of the Collatz conjecture (in which one is primarily focused on controlling the behaviour of “typical” orbits rather than _all_ orbits); previous work has focused more on the 2-adic structure than the 3-adic (and perhaps ultimately the two need to somehow be combined, though how to do so is still beyond our current technology). Related to this, these arguments also highlight the fact that the Syracuse map $\mathrm{Syr}$ is better suited than the Collatz map $\mathrm{Col}$ (or the standard acceleration $\mathrm{Col}_2$ ) for studying the 3-adic structure (this is basically because each application of $\mathrm{Syr}$ contains precisely one multiplication by 3 and a variable number of divisions by $2$ , in contrast for instance to $\mathrm{Col}_2: n \mapsto \frac{3n+1}{2}, \frac{n}{2}$ that contains exactly one division by 2 but a variable number of multiplications by $3$ ). As I previously mentioned, I think it will also be instructive to see how these results depend on the precise parameters of the Collatz-type iteration one is studying (e.g., if one replaces $3x+1$ with some other affine form).

26 January, 2020 at 9:41 am

Jeff

It may help if you define positive density. Do you mean positive ‘natural’ density? I think you are confusing natural density with some other one.

26 January, 2020 at 12:03 pm

Terence Tao

The arguments in my paper currently apply to logarithmic density, but I believe this is mainly a technical restriction and that with some additional work they can be upgraded to natural density (I believe someone is looking into this question right now); see Remarks 1.4 and 1.16 of my paper.

More generally, it is often the case in analytic number theory problems that an easier result in logarithmic density (or sometimes Dirichlet density) is established first, with the harder natural density result coming later, so it makes sense to focus first on the logarithmic density problem to get some initial results, basically because the logarithmic density enjoys some approximate multiplicative invariance properties that natural density lacks. (For instance, the fact that the set $\{ n: \lambda(n)=+1\}$ of numbers that are the product of an even number of primes has logarithmic density 1/2 can be proven in an elementary half-page argument, but to get natural density 1/2 is equivalent to the prime number theorem. The fact that $\{ n: \lambda(n)=\lambda(n+1)=+1\}$ has logarithmic density 1/4 was only established in 2015, but the natural density statement remains open, being the first unknown case of the Chowla conjecture.)

26 January, 2020 at 12:18 pm

Jeff

There are infinite sets where the natural and log densities are equivalent. But there are also, if I recall, infinite subsets where the natural density is not defined. And also infinite subsets where the natural density is zero. Bridging the gap between the two densities in light of all these obstacles seems crazy to me.

27 January, 2020 at 4:58 am

Jeff

Maybe you used this Theorem already,
Davenport–Erdős theorem.

The reverse direction iterations make the mind swirl. Trying to use the forms one gets to say whether some notion of logarithmic density transfers to natural density looks crazy.

My work gives concrete examples one can use to test the idea with. But even staying abstract is a waste of time.

I support horizontal progress, but it isn’t partial progress to me.

27 January, 2020 at 6:11 am

Jeff

Somewhere in the above arguments I saw n sub j minus n sub j, which is zero. Too tedious to track and check whether you’ve some indexing error. These iterations are a hell.

27 January, 2020 at 7:01 am

Jeff

Saw an inequality where it should perhaps read n sub (j -1) – 1 instead of n sub (j) – 1.

Anyway, epsilon cannot be between 0 and 1 and all of these inequalities hold in general.

26 January, 2020 at 6:07 am

Dmitriy Z

It seems that the inequality $c_n \le \frac{3}{2}3^{-n}$ should be in other direction. So this implies that $\beta \ge 1$ while an upper bound on $\beta$ is $|\log_3 c_2|$ .

[Corrected, thanks – T.]

26 January, 2020 at 8:00 am

Jeff

I don’t think it is possible to find and “repair” where all of the probabilistic type approaches “fail”. They fail from the start, and so cannot be “repaired”. There is no remedy. But it obviously won’t stop more folks from trying. Absolutely bizarre….

17 February, 2020 at 2:02 pm

Gottfried H

@Jeff – might you please provide a link to your work?

27 January, 2020 at 6:42 am

Anonymous

In the definition of $c_n$ , it seems that the constraint $3 \mid b$ should be removed.

[It’s actually $3 \nmid b$ , but there was a LaTeX issue that I will try to correct – T.]

27 January, 2020 at 9:48 am

Anonymous

It seems simpler to just remove this constraint (thereby simplifying the expression for $c_n$ ) since this constraint ( $3 \mid b$ ) clearly does not contribute to the probability expression of $c_n$ .

27 January, 2020 at 10:16 am

Terence Tao

Unfortunately this would then make the infimum equal to zero.

27 January, 2020 at 10:57 am

Anonymous

I meant that in the expression for $c_n$ , the probability in the RHS (for each fixed n) as a function of $b$ should vanish(!) whenever $3 \mid b$ . Hence inclusion or removal of this constraint should not affect the expression for $c_n$ .

27 January, 2020 at 2:48 pm

Terence Tao

The infimum of a set of non-negative numbers will vanish if one or more zeroes are added to the set on which the infimum is taken. (This is in contrast to the supremum, which remains unaffected by addition of zero entries.)

27 January, 2020 at 7:51 am

David Speyer

In the definition of $c_n$ (just above equation (4)), should $3^2$ be $3^n$ ?

[Corrected, thanks – T.]

27 January, 2020 at 8:11 am

David Speyer

Also, this isn’t an error, but it took me a while to figure out why the summand in Definition 1 is $3^{n-m} 2^{-a_m-a_{m+1} - \cdots - a_n}$ , but the summand in the 3-adic discussion below is $3^{m-1} 2^{-a_1-a_2-\cdots-a_m}$ . (Answer: You made the change of variable $m' = n+1-m$ and reindexed $a'_j = a_{n+1-j}$ . So $n-m = m'-1$ and $a_m+a_{m+1} + \cdots + a_n = a'_1 + a'_2 + \cdots + a'_m$ . Then you sent $n \to \infty$ .)

[Clarification added – T.]

27 January, 2020 at 8:20 am

Jürgen Sander

Dear Terence,

just for historical completeness you may wish to know that, independently of the work of Krasikov, a comparable – though slightly weaker (my constant 3/10, Krasikov’s constant 3/7) – result was published (see attachment) at the same time.

Best wishes,
Jürgen

[Reference added – T.]

27 January, 2020 at 10:29 am

Allan van Hulst

I’m curious if you were able to resolve (some) of the difficulties described in the previous post relating to the explicit computation (where a technical limitation around n=10 arose, if I remember correctly).

29 January, 2020 at 4:03 am

Anonymous

Dear Pro.Tao,
By the way, firstly I would like to greet” Happy New Year” to you. Next, I need to suggest some ideas which is not related to above topic. I really think you do not spend most of time solving Collatz conjecture 100%. Your past Collatz conjecture paper was very good.If there is a string between Collatz, Goldbach, Twin prime, Riemann , you should do Collatz 100% as you as possible. My idea is very stupid , foolish and crazy. I do not need any one understands me.
I think you should spend most of time focusing on Hodge conjecture and Birch Swinnerton Dyer conjecture . I think two above problems are very easy if comparing the rest of 4 problems. The level of two problems is equal to Poincare conjecture. Poincare conjecture is very very very hard , but it there is still a point end “bottom”. In contrast of four rest problems, they do not have bottom like the center of Pacific ocean. I guarantee more 200 years ,noone can solve them.Although I am not an expert in maths, I believe in my sense completely. This is not proved by sience. I always use my sense to help and save many people. My predictions are always true and right in every field of human beings. I always want to bring many good things to you. Pro.Tao , you should quickly do the greatest thing before 50 age. Not over 65 age.
Today, a crazy man meets a genius man each other. I do not need you reply my question. I only want you think deeply my idea every night before bedtime. I always side by side to you.
Byebye ,Pro.Tao.

27 January, 2020 at 11:57 am

Ehud Schreiber

I seem to be confused – you say that c_3 and c_4 should be computed, but in the previous post it seems you’ve computed the steady-state distribution up to n=10, so you essentially have the c’s up to that point. If so, what does the “empirical” slope of -log_3(c_n) look like? Does it support beta = 1?

[I have that data on my home computer; will look it up later this afternoon -T.]

27 January, 2020 at 4:29 pm

Terence Tao

OK, here is the data I have:

$n$ | $c_n$ | $-\log(c_n)/(n-1)$
2 | 0.031746 | 3.140
3 | 0.006096 | 2.321
4 | 0.001789 | 1.919
5 | 0.000445 | 1.756
6 | 0.000124 | 1.637
7 | 0.000035 | 1.555
8 | 0.000011 | 1.4785
9 | 0.000003 | 1.4331
10 | 0.000001 | 1.3941

(Not completely confident in the numerical accuracy of the n=10 data.)

Particularly reassuring is the fact that $c_n$ begins decreasing by a factor of close to 3 after a while, which is consistent with the upper bounds for $\beta$ converging to 1. On the other hand it doesn’t look like brute force by itself is going to lead to any computationally feasible improvement of the Krasikov-Lagarias exponent this way.
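
For small $n$ the first rows of this table can be reproduced in exact arithmetic. The following is only a minimal Python sketch, not the code that produced the data above. It assumes the recursion ${\bf P}( {\bf Syrac}({\bf Z}/3^n{\bf Z}) = b ) = \sum_{a \geq 1} 2^{-a} {\bf P}( {\bf Syrac}({\bf Z}/3^{n-1}{\bf Z}) = \frac{2^a b-1}{3} )$ (terms with non-integer argument dropped), and folds the sum over $a$ into one period $2 \times 3^{n-1}$ of the powers of 2 modulo $3^n$. The function names `syrac_dist` and `c` are illustrative, and the logarithm is taken to base 3.

```python
from fractions import Fraction
from math import log

def syrac_dist(n):
    """Exact law of Syrac(Z/3^n Z): a list p with p[b] = P(Syrac = b mod 3^n)."""
    dist = [Fraction(1)]                       # n = 0: the trivial law on Z/1Z
    for k in range(1, n + 1):
        mod = 3 ** k
        period = 2 * 3 ** (k - 1)              # multiplicative order of 2 mod 3^k
        geom = 1 / (1 - Fraction(1, 2 ** period))  # folds a, a+period, ... together
        new = []
        for b in range(mod):
            total = Fraction(0)
            for a in range(1, period + 1):
                t = pow(2, a, mod) * b - 1
                if t % 3 == 0:                 # (2^a b - 1)/3 must be an integer
                    total += Fraction(1, 2 ** a) * dist[(t // 3) % (mod // 3)]
            new.append(geom * total)
        dist = new
    return dist

def c(n):
    """c_n: the smallest probability over residues b not divisible by 3."""
    return min(p for b, p in enumerate(syrac_dist(n)) if b % 3 != 0)

for n in range(2, 5):
    cn = c(n)
    print(n, cn, round(-log(float(cn), 3) / (n - 1), 3))
```

The cost grows like $3^n \times 3^n$ exact rational operations, so this direct approach is only practical for quite small $n$.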

27 January, 2020 at 6:10 pm

Anonymous

Is it possible to combine the current approach with the Krasikov-Lagarias approach ?

28 January, 2020 at 10:38 am

Terence Tao

Good question! I think their linear programming approach could conceivably lead to some more refined bounds on $c_n$ that could lead to slightly sharper numerical upper bounds on $\beta$ , and perhaps they may shed some light on how one might hope to reach $\beta=1$ . Roughly speaking, translated to this setting, the approach proceeds by exploiting various inequalities between the more general quantities

$\displaystyle c_n(b,m) := \inf_{b' = b \hbox{ mod } 3^m} {\bf P}( {\bf Syrac}({\bf Z}/3^n{\bf Z}) = b' )$

for various residue classes $b \hbox{ mod } 3^m$ with $1 \leq m \leq n$ , thus for instance $c_n = \min( c_n(1,1), c_n(2,1) )$ (and $c_n(b,m)$ vanishes when $b$ is a multiple of 3). Clearly these quantities are all non-negative with

$\displaystyle c_n(b,m) = \mathrm{min}( c_n(b, m+1), c_n(b+3^m, m+1), c_n(b+2 \times 3^m, m+1))$

and there is also a recursive inequality

$\displaystyle c_{n+1}(b,m+1) \geq \frac{ \sum_{1 \leq a \leq 2 \times 3^m: 2^a b = 1 \hbox{ mod } 3} 2^{-a} c_n( \frac{2^a b - 1}{3}, m ) }{1 - 2^{-2 \times 3^m}}$

that follows easily from Lemma 1.12 of my paper. Thus for instance

$c_n(1,1) = \mathrm{min}( c_n(1,2), c_n(4,2), c_n(7,2) )$

$c_n(2,1) = \mathrm{min}( c_n(2,2), c_n(5,2), c_n(8,2) )$

$c_{n+1}(1,2) \geq \frac{64}{63} ( \frac{1}{4} c_n(1,1) + \frac{1}{16} c_n(2,1) )$

$c_{n+1}(2,2) \geq \frac{64}{63} ( \frac{1}{2} c_n(1,1) + \frac{1}{8} c_n(2,1) )$

$c_{n+1}(4,2) \geq \frac{64}{63} ( \frac{1}{64} c_n(1,1) + \frac{1}{4} c_n(2,1) )$

$c_{n+1}(8,2) \geq \frac{64}{63} ( \frac{1}{32} c_n(1,1) + \frac{1}{2} c_n(2,1) )$

$c_{n+1}(7,2) \geq \frac{64}{63} ( \frac{1}{16} c_n(1,1) + \frac{1}{64} c_n(2,1) )$

$c_{n+1}(5,2) \geq \frac{64}{63} ( \frac{1}{8} c_n(1,1) + \frac{1}{32} c_n(2,1) )$

etc. One can then hope to prove various lower bounds of the form $c_n(b,m) \geq a_{b,m} 3^{-\beta n}$ for various constants $a_{b,m}$ by induction provided that $a_{b,m}, \beta$ obey certain inequalities of their own (basically a dual set of inequalities to the ones listed above). This approach can probably improve slightly upon the bound $\beta \leq -\log(c_n)/(n-1)$ that I am using here (for instance I would imagine that the inequalities listed above would already improve upon the first bound $\beta \leq 3.14$ by a fair amount), though it would require significantly more numerical effort for any given value of $n$ .
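
The six explicit inequalities above can be generated mechanically from the general recursive inequality. The sketch below is a hedged illustration (the helper name `recursion_weights` is hypothetical): for a residue $b$ mod $3^{m+1}$ it collects, for each $a$ in one period with $2^a b = 1 \hbox{ mod } 3$, the weight $2^{-a}/(1-2^{-2 \times 3^m})$ attached to the class of $\frac{2^a b-1}{3}$ mod $3^m$.

```python
from fractions import Fraction

def recursion_weights(b, m=1):
    """Weights w[r] with c_{n+1}(b, m+1) >= sum over r of w[r] * c_n(r, m)."""
    period = 2 * 3 ** m
    geom = 1 / (1 - Fraction(1, 2 ** period))     # e.g. 64/63 when m = 1
    weights = {}
    for a in range(1, period + 1):
        t = 2 ** a * b - 1
        if t % 3 == 0:                            # only a with 2^a b = 1 mod 3
            r = (t // 3) % 3 ** m
            weights[r] = weights.get(r, Fraction(0)) + geom * Fraction(1, 2 ** a)
    return weights

for b in (1, 2, 4, 8, 7, 5):
    print(b, recursion_weights(b))
```

The weight attached to the class $r = 0$ can be ignored, since $c_n(b,m)$ vanishes when $b$ is a multiple of 3; the remaining weights on $r=1$ and $r=2$ are the coefficients of $c_n(1,1)$ and $c_n(2,1)$ in the displayed inequalities.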

28 January, 2020 at 11:41 am

Terence Tao

Looking at the Krasikov-Lagarias paper, they observe a symmetry that I had not previously noticed, which in my notation is that

${\bf P}( {\bf Syrac}({\bf Z}/3^n {\bf Z}) = 2b ) = 2 {\bf P}( {\bf Syrac}({\bf Z}/3^n {\bf Z}) = b )$

whenever $n \geq 1$ and $b = 1 \hbox{ mod } 3$ , which basically follows from splitting the geometric random variable $\mathbf{a}_1$ appearing in (2) as $2 \mathbf{a}'_1 - \mathbf{c}$ where $\mathbf{a}'_1$ is another geometric random variable of mean 4/3 (so $\mathbf{a}'_1=n$ with probability $3 \times 4^{-n}$ ) and $\mathbf{c} \in \{0,1\}$ is a Bernoulli random variable of mean 2/3, independent of all the other variables, and then observing that the residue class of (2) mod 3 is entirely determined by $\mathbf{c}$ . So this simplifies the preceding calculations because we now have $c_n(2b,m) = 2 c_n(b,m)$ whenever $b = 1 \hbox{ mod } 3$ . In particular $c_n(1,1) = c_n$ , $c_n(2,1) = 2c_n$ , and the above system of inequalities basically simplifies after some calculation to

$c_{n+1} \geq \frac{2}{21} c_n$

leading to the bound $\beta \leq 2.14$ instead of $\beta \leq 3.14$ . More generally I think with this new observation that the bound $c_{n_1+n_2-1} \geq c_{n_1} c_{n_2}$ improves to $c_{n_1+n_2-1} \geq 3 c_{n_1} c_{n_2}$ (we gain a factor of 3 in (5) now), leading to the improved bound $\beta \leq - \log(3c_n)/(n-1)$ which for instance improves the $n=10$ bound to $\beta \leq 1.2830$ .
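
The arithmetic behind these numbers can be checked in a few lines. This is a sketch under the assumptions stated in the comment above: the weights are copied from the inequalities for $c_{n+1}(1,2)$, $c_{n+1}(4,2)$, $c_{n+1}(7,2)$, the symmetry is used in the form $c_n(1,1) = c_n$, $c_n(2,1) = 2c_n$, logarithms are to base 3, and the value 1.3941 is taken from the $n=10$ row of the table. The names `weights`, `ratio` and `improved_bound` are illustrative.

```python
from fractions import Fraction as F
from math import log

# Weights (u, v) read off the inequalities
#   c_{n+1}(b,2) >= (64/63) * (u * c_n(1,1) + v * c_n(2,1))
# for the three classes b = 1, 4, 7 mod 9 that reduce to 1 mod 3.
weights = {1: (F(1, 4), F(1, 16)), 4: (F(1, 64), F(1, 4)), 7: (F(1, 16), F(1, 64))}

# Substitute c_n(1,1) = c_n and c_n(2,1) = 2 c_n, then take the worst class.
ratio = min(F(64, 63) * (u + 2 * v) for u, v in weights.values())
beta_bound = -log(float(ratio), 3)          # bound on beta from c_{n+1} >= ratio * c_n

def improved_bound(old_bound, n):
    """Turn -log_3(c_n)/(n-1) into -log_3(3 c_n)/(n-1)."""
    return old_bound - 1 / (n - 1)

print(ratio, round(beta_bound, 2), round(improved_bound(1.3941, 10), 4))
```

The minimum is attained at the class $7 \hbox{ mod } 9$, which is the one whose $a=2$ term lands on a multiple of 3 and therefore contributes nothing.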

29 January, 2020 at 10:07 am

Anonymous

Is it possible that there are other similar splittings of random variables which may lead to additional other symmetries and recursive relations for more efficient computation of $c_n$ and improved related inequalities ?

29 January, 2020 at 7:38 pm

Terence Tao

So I now realise that the above identity generalises to

${\mathbf P}( \mathbf{Syrac}({\mathbf Z}/3^n {\mathbf Z}) = 2b )$
$= 2 {\mathbf P}( \mathbf{Syrac}( {\mathbf Z}/3^n {\mathbf Z}) = b ) - {\mathbf P}( \mathbf{Syrac}( {\mathbf Z}/3^{n-1} {\mathbf Z}) = \frac{2b-1}{3} ),$

for any integer $b$ and any $n \geq 1$ , with the convention that the second term on the RHS vanishes when $\frac{2b-1}{3}$ is not an integer. In fact this identity uniquely determines the Syracuse random variables. The identity follows from the fact that the geometric random variable ${\mathbf a}_1$ is equivalent to the random variable that equals $1$ with probability $1/2$ and $\mathbf{a}_1+1$ with probability $1/2$ .
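
As a sanity check, the identity can be tested exactly for small $n$. The sketch below is an illustration only; it assumes the recursion ${\bf P}_n(b) = \sum_{a \geq 1} 2^{-a} {\bf P}_{n-1}( \frac{2^a b-1}{3} )$ for the laws ${\bf P}_n$ of $\mathbf{Syrac}({\bf Z}/3^n{\bf Z})$, with ${\bf P}_0$ the trivial law on ${\bf Z}/1{\bf Z}$, and the names `syrac_dist` and `identity_holds` are hypothetical.

```python
from fractions import Fraction

def syrac_dist(n):
    """Exact law of Syrac(Z/3^n Z) as a list indexed by b mod 3^n."""
    dist = [Fraction(1)]                       # n = 0: trivial law on Z/1Z
    for k in range(1, n + 1):
        mod, period = 3 ** k, 2 * 3 ** (k - 1)
        geom = 1 / (1 - Fraction(1, 2 ** period))
        new = []
        for b in range(mod):
            total = Fraction(0)
            for a in range(1, period + 1):
                t = pow(2, a, mod) * b - 1
                if t % 3 == 0:
                    total += Fraction(1, 2 ** a) * dist[(t // 3) % (mod // 3)]
            new.append(geom * total)
        dist = new
    return dist

def identity_holds(n):
    """Check P_n(2b) = 2 P_n(b) - P_{n-1}((2b-1)/3) for every b mod 3^n."""
    cur, prev = syrac_dist(n), syrac_dist(n - 1)
    mod = 3 ** n
    for b in range(mod):
        rhs = 2 * cur[b]
        t = 2 * b - 1
        if t % 3 == 0:                         # second term only when (2b-1)/3 is an integer
            rhs -= prev[(t // 3) % (mod // 3)]
        if cur[(2 * b) % mod] != rhs:
            return False
    return True

print([identity_holds(n) for n in (1, 2, 3)])
```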

27 January, 2020 at 8:05 pm

Anonymous

Lemma 2 indicates the possible convexity of the sequence
$-\log c_n /(n-1)$ of (decreasing ?) upper bounds for $\beta$ .

28 January, 2020 at 2:49 am

Ehud Schreiber

Writing a quick-and-dirty R code, I get the same results for $n \leq 4$ .
In particular, $c_3 = 1598/(2^{18} - 1)$ , so the cancellation persists. I can’t say for $c_4$ , as it involves $2^{54} - 1$ while double precision gives only 53 bits of precision…

Looking at the values of $-\log_3 c_n$ it certainly seems that $\beta = 1$ , although it would be difficult to distinguish between that and $\beta = 1.1$ . It even seems that $-\log_3 c_n$ is $n + O(1)$ or even $n + \text{const} + o(1)$ , but again distinguishing between that and $n + O(\log n)$ is impossible with this “empirical” data.

28 January, 2020 at 4:29 am

Jeff

You can use my work to get real data to compare this bizarre guesswork with. It was a hell to figure out.

But unfavorable comparisons take all the fun out of guessing games. Better to keep the blinders on :).

28 January, 2020 at 8:56 am

Anonymous

This observed cancellation indicates that the sequence
$d_n: = (2^{2 \times 3^{n-1}} -1) c_n$ is a sequence of integers.
If so, the recursive expression for $c_n$ can be used to find the corresponding recursive expression for $d_n$ which may be used to prove that $d_n$ is indeed a sequence of integers.

29 January, 2020 at 3:19 pm

Terence Tao

Numerically, I can obtain $d_1=1, d_2 = 2, d_3=1598, d_4 = 322293411417192$ , and then I run out of precision. This sequence does not appear in the OEIS (of course, we haven’t yet proven that it is even a sequence of integers; perhaps if we do I will submit it there). But it is clear that the sequence grows extremely fast (one has $d_n = 2^{2 \times 3^{n-1} - O( n )}$ for instance).

29 January, 2020 at 6:10 pm

Jeff

Yeah, I had to resort to doing the large calculations in my mind when my machine ran out of precision :). And of course my subsequent proofs are rigorous. But who cares?

28 January, 2020 at 10:19 am

Terence Tao

Interesting! I will check my own data later today but perhaps it can be conjectured that the probability densities ${\bf P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z})=b)$ are always integer multiples of $1 / (2^{2 \times 3^{n-1}}-1)$ . If this conjecture is true, it is presumably due to some alternate formula for these densities that I’m not seeing currently, but which may be worthwhile to uncover.

29 January, 2020 at 9:56 am

Anonymous

This conjecture means that the cancelled factor is $2^{3^{n-1}} +1$

28 January, 2020 at 3:59 am

J.P. McCarthy

that $1 \in \mathrm{Col}^{{\bf N}}(n)$ for all $N \in \bf{N}+1$ .

Should be $n\in\bf{N}+1$ .

[Corrected, thanks – T.]

29 January, 2020 at 1:30 am

Bogdan

You write:”…a positive density set of natural numbers iterates to an (explicitly computable) bounded set, so in principle the case \gamma=1 of (1) could now be verified by an (enormous) finite computation”

I do not understand this for two reasons.

First, the title of your paper talks about “almost bounded values”, not bounded ones. And to reduce the problem to finite computation, we really need the orbits to iterate to a BOUNDED set, not an almost bounded one.

Second, the case \gamma=1 of (1) seems to correspond to “almost all” with NATURAL density, while your paper proves the statement for a different density.

Can you please clarify?

29 January, 2020 at 4:33 am

Jeff

I’m confused about why anyone would do the above bizarre guesswork when I already rid us of such crutches several years ago.

The quest for a positive density is a waste of time too, using probabilistic methods. Good grief people. Didn’t the 3x-1 case teach you all anything?

29 January, 2020 at 8:36 am

Terence Tao

This is discussed further in Remarks 1.4 and 3.1 of the paper. The argument that shows that almost all values iterate to almost bounded values also shows (and is in fact equivalent to) the assertion that for any $\delta>0$ , there is a $C>0$ such that a set of natural numbers of logarithmic density at least $1-\delta$ iterate to a value bounded by $C$ (and in fact the arguments give an in principle explicit relation between $C$ and $\delta$ , which I believe to be of the form $C = \exp( \delta^{-O(1)})$ ). In particular, setting $\delta$ to be any fixed value between 0 and 1, e.g., $\delta=1/2$ , we conclude that there is a positive (logarithmic) density set of numbers that iterates to an explicitly computable bounded set.

30 January, 2020 at 10:49 am

Jeff

Have you tried working this argument out with pre-images of 3x-1 ?

What do you mean exactly by ‘in principle’?

Is it explicit or not?

30 January, 2020 at 3:48 pm

Terence Tao

Every step in the argument can be made quantitative (with no ineffective bounds), so it is just a matter of carefully keeping track of all the implied constants if one wants. However, this would be time consuming and quite messy to optimize; usually it is more efficient to wait until the non-explicit form of an argument has become more streamlined before attempting to invest the effort to convert it to a fully explicit form.

Regarding the 3x-1 problem: it needs to be checked, of course, but I expect that pretty much all the results in my paper (or in this blog post) will carry over with only minor changes to the 3x-1 problem. Now it is of course true that the direct analogue of the Collatz conjecture is false for the 3x-1 iteration due to the existence of the two known additional cycles (5,14,7,20,10) and (17,50, …, 34). However this does not exclude the possibility that progress can be made simultaneously on understanding both iterations. For instance, one could envisage that there is an argument common to both the 3x+1 and 3x-1 iterations that establishes the existence of an absolute constant C such that all orbits of either iteration end up reaching a number less than or equal to C: note that the two additional cycles of the 3x-1 iteration are not in contradiction to the above claim. If we could prove this partial result, and if C could be given explicitly and was of a computationally feasible magnitude, one could then finish off the 3x+1 conjecture for good by numerically verifying that all iterates of the 3x+1 map starting from a number less than or equal to C eventually reached 1. This would of course not work for the (false) 3x-1 conjecture due to the existence of the additional two cycles, so the failure of the 3x-1 conjecture does not prohibit this strategy from being a viable approach to the 3x+1 problem. [Basically, one should make a distinction between the asymptotic and non-asymptotic components of the dynamics. The 3x+1 and 3x-1 dynamics differ non-asymptotically (by which I mean the dynamics on small integers $n \leq C$ ) due to the additional cycles possessed by the latter but not the former, but there is no evidence at present that the asymptotic dynamics of the two (by which I mean the dynamics on large integers $n > C$ ) are particularly different from each other. 
Indeed, one could unify the 3x+1 and 3x-1 dynamics by working on the integers rather than the natural numbers, in which case the unified conjecture asserts that all orbits eventually reach one of the five known cycles (listed for instance on the Wikipedia page), of which only one lives in the positive integers, with three others living in the negative integers and one being the zero cycle.]

That said, I don’t view the results of my paper as coming close to giving a partial result of the above form. However, a realistic goal would be that of obtaining an explicit C for which iterations of either the 3x+1 or 3x-1 map would end up reaching a value C or below for a set of positive (logarithmic) density. Again, combined with a suitable numerical verification, this would establish the 3x+1 conjecture for a set of inputs of positive logarithmic density, without being in contradiction to the failure of the 3x-1 conjecture.

30 January, 2020 at 4:23 pm

Anonymous

It seems that establishing explicit numerical bounds $C$ corresponding to some given logarithmic densities and improving the current bound for $\gamma$ are suitable for a Polymath project.

30 January, 2020 at 5:11 pm

Jeff

I must say I admire your optimism and energy. And I especially admire your dedication to your convictions. If you polymath this, doing large calculations on 3x-1 prior may help. Perhaps the ‘17’ cycle acts as a catch or net of sorts such that ‘more’, in the sense of some density measure, numbers map to it than the other two separately, whether or not there are more cycles.

Just stay healthy sir.

29 January, 2020 at 7:29 pm

Terence Tao

I can now establish that $c_n$ is always an integer multiple of $1/(2^{2 \times 3^{n-1}}-1)$ . In fact, if for any integer $b$ we define the quantities

$\displaystyle d_n(b) := (2^{2 \times 3^{n-1}}-1) {\bf P}( \mathbf{Syrac}({\bf Z}/3^n {\bf Z}) = b \hbox{ mod } 3^n )$

then one can show inductively that the $d_n(b)$ are all natural numbers.

It is convenient to define $d_n(b)=0$ for $b$ non-integer. Then $d_n$ is periodic with period $3^n$ , and $d_1(0)=0, d_1(1)=1, d_1(2)=2$ . This establishes the $n=1$ case. To obtain the higher $n$ case we obtain a recursive formula for $d_n(b)$ that only involves integer arithmetic. We can assume that $b$ is an integer not divisible by 3, since $d_n(b)$ vanishes otherwise. From Lemma 1.12 of my paper and some calculation we see that

$\displaystyle d_n(b) = \frac{4^{3^{n-1}}}{4^{3^{n-2}}-1} \sum_{a=1}^{2 \times 3^{n-1}} d_{n-1}( \frac{2^a b-1}{3} ) 2^{-a}. \quad (1)$

The main problem here is the denominator $4^{3^{n-2}}-1$ , which needs to be canceled. The key point here is that

$\displaystyle 2^{2 \times 3^{n-2}} = 1 + 3^{n-1} \hbox{ mod } 3^{n}$

which is easily established by induction. Splitting the sum into triples $a, a+2 \times 3^{n-2},a+4 \times 3^{n-2}$ for $1 \leq a \leq 2 \times 3^{n-2}$ , we arrive at

$\displaystyle d_n(b) = \frac{4^{3^{n-2}}}{4^{3^{n-2}}-1} \sum_{a=1}^{2 \times 3^{n-2}} 2^{-a} \times$

$\displaystyle ( 4^{2 \times 3^{n-2}} d_{n-1}( \frac{2^a b-1}{3} ) + 4^{3^{n-2}} d_{n-1}( \frac{2^a b-1}{3} + 2^a b 3^{n-2})$

$\displaystyle + d_{n-1}( \frac{2^a b-1}{3} + 2^{a+1} b 3^{n-2} ) ).$

Now one observes that $\frac{2^a b-1}{3}, \frac{2^a b-1}{3} + 2^a b 3^{n-2}, \frac{2^a b-1}{3} + 2^{a+1} b 3^{n-2}$ sweep out the residue classes mod $3^{n-1}$ that reduce to $\frac{2^a b-1}{3}$ modulo $3^{n-2}$ , hence

$\displaystyle d_{n-1}( \frac{2^a b-1}{3} ) + d_{n-1}( \frac{2^a b-1}{3} + 2^a b 3^{n-2}) + d_{n-1}( \frac{2^a b-1}{3} + 2^{a+1} b 3^{n-2} )$
$\displaystyle = \frac{4^{3^{n-2}}-1}{4^{3^{n-3}}-1} d_{n-2}( \frac{2^a b-1}{3} ).$

Using this and the $n-1$ case of (1), we arrive after some calculation at the recursive identity

$\displaystyle d_n(b) = \sum_{a=1}^{2 \times 3^{n-2}} 2^{2 \times 3^{n-2} - a} (d_{n-1}( \frac{2^a b-1}{3} + 2^a b 3^{n-2} )$
$\displaystyle + (4^{3^{n-2}}+1) d_{n-1}( \frac{2^a b-1}{3} ) )$
$\displaystyle + d_{n-1}(b)$

and hence the $d_n(b)$ are all natural numbers.
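
Since the final recursive identity involves only integer arithmetic, it can be run directly starting from $d_1(0)=0, d_1(1)=1, d_1(2)=2$. The following is a minimal sketch of that computation (the function name `next_d` is illustrative, and $d_{n-1}$ is treated as zero at non-integer arguments, as in the convention above). As a consistency check, the values $d_n(b)$ should sum to $2^{2 \times 3^{n-1}}-1$ over $b$ mod $3^n$, since the underlying probabilities sum to 1.

```python
def next_d(d_prev, n):
    """Table of d_n(b), b = 0..3^n - 1, from the table of d_{n-1} (requires n >= 2)."""
    mod_prev = 3 ** (n - 1)
    h = 3 ** (n - 2)                   # shorthand for 3^(n-2)
    q = 4 ** h                         # shorthand for 4^(3^(n-2))
    table = []
    for b in range(3 ** n):
        total = d_prev[b % mod_prev]   # the final term d_{n-1}(b)
        for a in range(1, 2 * h + 1):
            t = 2 ** a * b - 1
            if t % 3 == 0:             # otherwise both d_{n-1} terms vanish
                x = t // 3
                shifted = d_prev[(x + 2 ** a * b * h) % mod_prev]
                total += 2 ** (2 * h - a) * (shifted + (q + 1) * d_prev[x % mod_prev])
        table.append(total)
    return table

d1 = [0, 1, 2]
d2 = next_d(d1, 2)
d3 = next_d(d2, 3)
d4 = next_d(d3, 4)
print(d2)
print(min(v for b, v in enumerate(d3) if b % 3 != 0))
```

Python integers have arbitrary precision, so this route does not run out of precision; the limiting factor is instead the size $3^n$ of the tables.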

30 January, 2020 at 4:35 am

Jeff

A bit of Basic algebra works wonders. Well, now you have a small taste of the amount of work I did.

Getting hooked on this will wear you down. It’s not worth it.

30 January, 2020 at 10:01 am

Gabe Khan

I realize this might be too optimistic, and apologies again if this is obvious. However, from the recursive identity it seems that the most important term to get lower bounds on $d_n(b)$ is $\sum_{a=1}^{2 \times 3^{n-2}} 2^{2 \times 3^{n-2}-a} 4^{3^{n-2}} d_{n-1} ( \frac{ 2^a b - 1}{3} )$ . Everything else in the series seems to be much smaller (i.e. grows slower than $2^{2 \times 3^{n-1}}$ ). As such, if you could get an upper bound on how many small values of $a$ you can pick so that $d_{n-1} ( \frac{ 2^a b - 1}{3} )$ is zero (i.e. how many $a$ so that $2^a b-1$ is divisible by 9), one might hope that this gives something to induct on to get lower bounds on the $c_n$ .

This alone is almost certainly not sharp enough to actually get good bounds, but that partial sum seems to be the key terms in the recursive formula.

30 January, 2020 at 3:35 pm

Terence Tao

Yes, this is the main term, and in fact the series should decay exponentially in $a$ , so that only the first few terms should be significant. It may indeed make sense to truncate the recursive formulae for these sorts of expressions to the low values of $a$ (e.g., $a \leq 10$ ) to obtain a sparser system of linear inequalities that could be more tractable even at the cost of a small amount of optimality in the final exponents.

31 January, 2020 at 8:32 am

Gabe Khan

Right, and since $\frac{1}{3} = \sum_{a=1}^\infty \frac{1}{4^a}$ , this suggests that for a given $n$ , we can expect to find $b$ so that $d_{n-1}( \frac{2^a b -1}{3})$ vanishes for $a = 1,3,5, \ldots$ However, it seems like this is somehow the “worst case”. By truncating the series, the hope would be to verify this to some level. As you mentioned, there will definitely be some loss in this approximation.

30 January, 2020 at 12:50 pm

Anonymous

A classical approach to estimate $c_n$ is to find from its recursive expression the corresponding (functional) identity satisfied by its generating function (e.g. $C(z) = \sum_n c_n z^n$ ) and estimate its coefficients (as done e.g. by the circle method for the partition function.)

30 January, 2020 at 3:37 pm

Terence Tao

This is indeed tempting; however the formulae for $c_n$ contain a minimum at one point, which does not interact well with classical generating functions. (There is a remote possibility that some sort of “tropical generating function” might be useful here, although those are more suited for working with maxima rather than minima.)

2 February, 2020 at 5:04 am

Dyachenko Eduard

In the table below, the transformation introduced by Collatz is worked through for the number 27.
The main point is that the “length of the number” is reduced when transforming an odd number of the form 4k+1, and may be preserved when transforming one of the form 4k+3.
Since the form 4k+3 cannot be maintained indefinitely, the “length of the number” periodically decreases.
As a result, the sequence is transformed into a number of the form 2^p/3^q.
The work can be read here(https://zenodo.org/record/3630682#.XjagkGhKjIU).
i n(i) D(i) r(i)-r(i)>1 4k(i)+1 4k(i)+3 k(i) P(i)
0 27 1/ 2 -1 4*6+3 14+13 even 1
1 41 1/ 4 -2 2 4*10 + 1
2 31 1/ 2 -1 4*7+3 16+15 odd 4
3 47 1/ 2 -1 4*11+3 odd
4 71 1/ 2 -1 4*17+3 odd
5 107 1/ 2 -1 4*26+3 even
6 161 1/ 4 -2 2 4*40+1
7 121 1/ 4 -2 2 4*30+1
8 91 1/ 2 -1 4*22+3 46+45 even 1
9 137 1/ 4 -2 2 4*34+1
10 103 1/ 2 -1 4*25+3 52+51 odd 2
11 155 1/ 2 -1 4*38+3 even
12 233 1/ 4 -2 2 4*58+1
13 175 1/ 2 -1 4*43+3 88+87 odd 3
14 263 1/ 2 -1 4*65+3 odd
15 395 1/ 2 -1 4*98+3 even
16 593 1/ 4 -2 2 4*148+1
17 445 1/ 8 -3 3 4*111+1
18 167 1/ 2 -1 4*41+3 84+83 odd 2
19 251 1/ 2 -1 4*62+3 even
20 377 1/ 4 -2 2 4*94+1
21 283 1/ 2 -1 4*70+3 213+212 even 1
22 425 1/ 4 -2 2 4*106+1
23 319 1/ 2 -1 4*79+3 160+159 odd 5
24 479 1/ 2 -1 4*119+3 odd
25 719 1/ 2 -1 4*179+3 odd
26 1079 1/ 2 -1 4*269+3 odd
27 1619 1/ 2 -1 4*404+3 even
28 2429 1/ 8 -3 3 4*607+1
29 911 1/ 2 -1 4*227+3 456+455 odd 3
30 1367 1/ 2 -1 4*341+3 odd
31 2051 1/ 2 -1 4*512+3 even
32 3077 1/16 -4 4 4*769+1
33 577 1/ 4 -2 2 4*144+1
34 433 1/ 4 -2 2 4*108+1
35 325 1/16 -4 4 4*81+1
36 61 1/ 8 -3 3 4*15+1
37 23 1/ 2 -1 4*5+3 12+11 odd 2
38 35 1/ 2 -1 4*8+3 even
39 53 1/32 -5 5 4*13+1
40 5 1/16 -4 4 4*1+1
41 1 1/ 1 0
balance -70 46 24
Text is available under the Creative Commons NonCommercial-NoDerivatives 4.0 International
(CC BY-NC-ND 4.0)

2 February, 2020 at 5:50 am

Dyachenko Eduard

Unfortunately, the table did not come out clearly, although I tried to insert it correctly.
The work can be read here(https://zenodo.org/record/3630682#.XjagkGhKjIU).

7 February, 2020 at 6:48 am

y.y.

apologies for my english.
I have commented your previous blog posts about the conjecture. there I have appended professor Peter Hellekalek’s this paper:
https://arxiv.org/abs/1605.02634
the paper seems to concern Syracuse function in 3-adic as you mentioned.
and I wrote that ∀x∈(2N_0+1), ∃μ∈N, f^(1+μ)(x)＜x.
because the function have its inverse and it can be defined by RCWA with metrices.
the structure contains S_3 especially C_3 and yes, 2-adic is somehow combined. there is no randomness.
I don’t have background in mathematics, but I would be grateful if professor Tao could get to know my findings. it seems something important to prove (or progress to prove) the conjecture. thank you.

8 February, 2020 at 2:35 am

y.y.

Mistake. when x=1, f^(1+μ)(x)=x. 1 is a unique fixed-point. and μ=2, of course. As for the indices μ, it can also be defined as resides when it is in inverse form.

10 February, 2020 at 7:33 pm

y.y.

So the particular orbit such as; x∈2N_0+1, Syrac(x)→2N_0+1:= (3x+1)/2^-(1+n), Syrac(x)＞x.
These numbers can be determined as a certain subset. Looking at the subset in inverse, we can also determine the image of the subset. Then, considering the intersection of the image and its preimage, we see that the intersection becomes empty after m iterations. I think that is not such an interesting part of the problem for mathematicians. The interesting observation about the problem is that the Syracuse function has an inverse on which S_3 acts.

10 February, 2020 at 8:46 pm

Anonymous

Again mistake. Remove “/” or “-” from (3x+1)/2^-(1+n) ! Don’t factorial!

10 February, 2020 at 9:14 am

Peter C

I’ve worked on this problem with varying approaches over the years. Recently msg’d you via email on a traverse method to the 3x+1 tree. Sourcing from odd multiple of 3 numbers, the problem can be converted into an infinite density series to explain that all numbers land at key nodes N*, and vice versa. Those key nodes (N*) all descend in a formulated way (which another infinite density series can explain) to other N*. I’d love to chat with you about it.

18 February, 2020 at 10:16 am

Peter C

As mentioned previously, if one defines a way to progress outward rather than the typical inward motions (i.e. to land a 1) it is easily shown that all numbers flow outwards to Mod(n,3)=0 nodes; traversing in the odd numbers only.

Such a step can be declared as:
NewN = w [(-3*(w^2)+7w)n -1] / [3(2^(w-1))], where w=MOD(n,3)
E.g 5->3, w=2, n=5, NewN=3; This would count as 1 step to get to divisible by 3.

A number like 17 would require 3 steps to get to a divisible by 3 outer node, div3 require no steps (i.e. 3, 9, 21, etc.)

This approach yields a density series: (x:0 to inf) Ʃ [2^x] / [3^(x+1)] = 1,
where x = STEPS to get w=0
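
The step rule and the examples quoted above can be checked with a short script. This is only a sketch of the rule as written in this comment (the names `outward_step` and `steps_to_multiple_of_3` are hypothetical); it reduces to $(4n-1)/3$ when $w=1$ and $(2n-1)/3$ when $w=2$. A step limit is included because $n=1$ maps to itself under this rule.

```python
from fractions import Fraction

def outward_step(n):
    """One outward step NewN = w[(-3w^2 + 7w)n - 1] / [3 * 2^(w-1)], with w = n mod 3."""
    w = n % 3
    if w == 0:
        raise ValueError("n is already divisible by 3")
    numerator = w * ((-3 * w * w + 7 * w) * n - 1)
    denominator = 3 * 2 ** (w - 1)
    assert numerator % denominator == 0        # holds for w = 1 and w = 2
    return numerator // denominator

def steps_to_multiple_of_3(n, limit=1000):
    """Number of outward steps until a multiple of 3 is reached, or None if limit is hit."""
    steps = 0
    while n % 3 != 0:
        if steps == limit:
            return None
        n = outward_step(n)
        steps += 1
    return steps

print(outward_step(5), steps_to_multiple_of_3(17), steps_to_multiple_of_3(35))

# Partial sums of the density series sum_{x>=0} 2^x / 3^(x+1) approach 1.
partial = sum(Fraction(2 ** x, 3 ** (x + 1)) for x in range(50))
print(float(partial))
```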

However, such a method doesn’t guarantee that a given number & chain actually attaches to the rest of tree. Therefore a secondary density series explains that the attaching nodes all derive from other attaching nodes.

So Logically, if all numbers flow outwards to a div3, that chain contains an N* (can be considered the source node for that chain of nodes), N* comes from a (N*-1)/4 node [which will flow inward to its own relative N* and outward div3], etc.. A similar series can be generated here to ensure the total sum =1 as well.

Example: 35 would take 2 steps to get to a div3. Its N* is 53, (N*-1)/4 = 13, 13 is an N*,
(13-1)/4 = 3, its N* is 5 (the base case).

This would account for all chains, unless one odd number out there isn’t part of a chain. If that happens, it would seem contradictory because that means an infinite amount of chains & nodes exist (so far not seen) that don’t link back using the (N*-1)/4 rule.

It is my impression that the typical approaches to solving this problem are the problem.

26 February, 2020 at 7:43 am

Anonymous

The main proffer is reduced the “length of the number” when converting oddness of the form 4k+1and possibilities preserving it when converting 4k+3.
Since the transformation 4k+3 cannot be stored indefinitely, periodically the “length of the number” decreases.

26 February, 2020 at 7:59 am

Peter C

I believe you are saying the method I have suggested gives a way to convert the overall tree into odds from {4K+1,4K+3} forms. Yes. The important aspect is the periodic nature of the Steps. Using densities to understand the entire set of odd numbers, you can convert the tree into the series mentioned above (x:0 to inf) Ʃ [2^x] / [3^(x+1)] = 1. The secondary part of most importance is only one N* exists in a chain, and that is joins to another N* as well. One must decompose the tree in a slightly different manner that it is typical known, but is equivalent… the answer becomes apparent then. All number flow outwards to a multiple of 3, all multiple of 3 have exactly 1 N* in its chain, and N* join inwards. Each have density series that are cyclical and sum to 1.

18 February, 2020 at 12:24 am

y.y.

I would be really glad if the conjecture were affirmatively solved in the near future, because I’ve been obsessed with this. I have no background in mathematics or related areas, so I cannot write a paper about this on my own. But, to say it again, I have found the inverse map of the Syracuse map, which is known to be surjective but not injective. The inverse map of course has a one-to-one correspondence for all x∈2N_0+1. The conjecture is for group theory. I wonder whom I should contact to discuss this.

18 February, 2020 at 11:33 am

Anonymous

Alex Kontorovich’s twitter thread is really interesting, another thing to keep us up at night. I guess it isn’t lost on you that it’s reminiscent of your approach to Navier-Stokes blowup.

18 February, 2020 at 12:05 pm

Anonymous

Another (graph theoretic) interpretation of the Collatz conjecture is that it is equivalent to the connectedness of the (infinite) graph corresponding to the Col mapping and that this graph has only the trivial cycle.
Is it possible that graph theoretic methods may help to get more results?

16 March, 2020 at 8:55 pm

Arunava mondal

Very good explanation.

16 March, 2020 at 8:56 pm

Arunava mondal

Sir, what is the formality conjecture?

12 August, 2020 at 8:41 pm

Li Jiang

The essence of the Collatz conjecture is the iterative operation on odd numbers. By the fundamental theorem of arithmetic, the number of factors of 2 in each even number is uniquely determined, so an iterative operation on odd numbers $x$ can be defined: $T(x)=\frac{3x+1}{2^a}$ (where $a$ is the number of factors of 2 in $3x+1$); the result $T(x)$ is again an odd number. From this definition, the general formula for repeated iteration of an odd number can be derived:

$T^{(k)}(x)=\dfrac{\displaystyle{3^k x+\sum_{i=1}^{k}2^{g(i-1)}3^{k-i}}}{2^{g(k)}}.$ (where $g(i+1)>g(i), g(0)=0$ ).

From this, if repeated iteration produces a loop, that is, $T^{(k)}(x)=x$, then we can deduce the equation

$x=\dfrac{\displaystyle{\sum_{i=1}^{k}2^{g(i-1)}3^{k-i}}}{2^{g(k)}-3^k}.$

Solving this equation shows that it has no positive integer solution other than 1, which indicates that repeated iteration cannot produce a loop.

On the other hand, the continuous iterative formula can be transformed into a linear indefinite equation

$A_kY=B_kx+C_k.$ (where $\displaystyle{Y=T^{(k)}(x), A_k=2^{g(k)}, B_k=3^k, C_k=\sum_{i=1}^{k}2^{g(i-1)}3^{k-i}}$ )

The process of solving the equation reveals that it is impossible for odd numbers to reach infinity through iterative operations. Therefore, all odd numbers will return to 1 after a finite number of iterations.
By extending this result to even numbers, it can be determined that all positive integers will return to 1 after a finite number of iterations, which fully proves the Collatz conjecture.

My article “The Collatz conjecture and linear indefinite equation” is at the following website:
http://www.sciepub.com/tjant/content/8/2
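
The iteration formula for $T^{(k)}(x)$ and the loop equation in this comment can at least be checked numerically. The following is a minimal sketch, assuming $g(i)$ is the running total of the exponents $a$ removed in the first $i$ steps (so $g(0)=0$ and $g$ is strictly increasing); the function names are hypothetical and not taken from the linked article. It tests only the two formulas, not the claims drawn from them.

```python
from fractions import Fraction


def syracuse_step(x):
    """For odd x, return (T(x), a) where T(x) = (3x + 1) / 2^a is odd."""
    y = 3 * x + 1
    a = 0
    while y % 2 == 0:
        y //= 2
        a += 1
    return y, a


def check_iterate_formula(x, k):
    """Compare k direct iterations of T with the closed formula for T^(k)(x)."""
    g = [0]  # assumption: g[i] = a_1 + ... + a_i, so g[0] = 0
    y = x
    for _ in range(k):
        y, a = syracuse_step(y)
        g.append(g[-1] + a)
    # C_k = sum_{i=1}^{k} 2^{g(i-1)} 3^{k-i}
    c_k = sum(2 ** g[i - 1] * 3 ** (k - i) for i in range(1, k + 1))
    # The formula says 2^{g(k)} * T^(k)(x) = 3^k x + C_k.
    return 3 ** k * x + c_k == y * 2 ** g[k]


def loop_candidate(exponents):
    """Solve the loop equation x = C_k / (2^g(k) - 3^k) for given a_1..a_k."""
    k = len(exponents)
    g = [0]
    for a in exponents:
        g.append(g[-1] + a)
    c_k = sum(2 ** g[i - 1] * 3 ** (k - i) for i in range(1, k + 1))
    return Fraction(c_k, 2 ** g[k] - 3 ** k)


if __name__ == "__main__":
    # Formula check over small odd x and small k.
    print(all(check_iterate_formula(x, k)
              for x in range(1, 200, 2) for k in range(1, 11)))
    print(loop_candidate([2]))     # exponent pattern of the trivial loop at 1
    print(loop_candidate([1, 2]))  # a pattern whose solution is not positive
```

For a prescribed exponent pattern the loop equation always has a rational solution; the difficulty lies in showing that no pattern other than the trivial one yields a positive integer, which this sketch does not address.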

12 August, 2020 at 11:01 pm

Anonymous

where is the 0.8 in your equation 4.9 coming from?

13 August, 2020 at 2:01 am

Li Jiang

$C_4>0.8$

13 August, 2020 at 4:27 am

Li Jiang

Correction: $C_4>0.8\times 3^k$

13 August, 2020 at 11:49 am

Anonymous

You mean $C_4>0.8\times 3^4$ ? $C_4 > 64.8$

13 August, 2020 at 11:51 am

Anonymous

$C_k$ then seems to be diverging.

13 August, 2020 at 3:35 pm

Hollis Williams

Be sceptical of your own work

13 August, 2020 at 1:57 pm

Li Jiang

Sorry, it should be: the sum of the first 4 terms of $C_k$ is greater than $0.8\times 3^k$ .
When $k\rightarrow\infty$ , $C_k$ is divergent, but $\frac{C_k}{A_k}$ is always a finite value.

13 August, 2020 at 11:06 pm

Anonymous

$A_k = 2^{g(k)}$ where $g(k)$ is typically less than $k$. So $\frac{C_k}{A_k}\approx\frac{3^k}{2^k}$ which also diverges.

14 August, 2020 at 2:34 am

Li Jiang

Please read my article:
In the process of determining the special solutions $X_0$ and $Y_0$ of the indefinite equation, formula (4.8) is obtained, where
$X_0=\frac{C_k+uA_k}{r_2}$

Since $r_2 < A_k$ , we have $X_0 > \frac{C_k}{A_k}+u.$

Therefore, if $X_0$ is finite then $\frac{C_k}{A_k}$ cannot be divergent.

14 August, 2020 at 2:42 am

Anonymous

since $r_2 < A_k$

14 August, 2020 at 2:43 am

Anonymous

so $X_0 > \frac{C_k}{A_k}+u$

12 February, 2021 at 5:31 am

Cifra Finale

Dear Prof. Tao and readers of this blog, we are Dr. Cinzia, Silvana (graduate in Materials Technology) and their dad Giovanni Di Savino (Air Force technician). We believe that Collatz’s conjecture can be simplified: Euclid with the formula 2 * n + 1 generates infinitely many odd numbers and proves that there is a prime number greater than the largest known prime. Certainly the first known and the first greater than the first known can be generated and identified, but no prime can be generated that can be defined as the first greater than all the former. Thales measures “the inaccessible and unattainable”, Euclid generates and demonstrates the existence of the inaccessible and unreachable prime number, Gauss with the Fundamental Theorem of Arithmetic demonstrates the factorization mentioned by Euclid and states that every natural number is the product of prime numbers. Einstein, with the theory of relativity and with E = m * c ^ 2, shows that as the digits that make up the number increase (the largest prime number we know today is written in the notation of a power of two minus one, 2 ^ 82,589,933 - 1, or with 24,862,048 decimal digits, which would fill kilometres of writing on paper), the number becomes inaccessible and unreachable, the “space” to be analyzed increases, and the quantity = space to be processed can be communicated only at a maximum speed that cannot be exceeded: “the speed of light”. Larger numbers correspond to larger spaces; it follows that it takes longer to process that number which, however large, inaccessible and unreachable it may be, Euclid has shown to exist, Thales has shown can be measured, and Einstein made us take note and showed us that our life cycles are insufficient to know the prime number greater than the largest known. The natural numbers, also generated with primes that exist and that we will never know, are the even and odd numbers that can be in a sequence that must be considered in the Collatz conjecture.
Results of verifications of numbers already made are to be considered as records to be surpassed. The numbers and sequences referred to in the conjecture are infinite, and therefore it is impossible to represent the sequences, but it is possible to satisfy the conjecture. Stated in 1930 by the German mathematician Lothar Collatz, the procedure is also known as the Ulam or Thwaites conjecture, as the Kakutani or Syracuse problem, as the Hasse algorithm, or even as the hailstone sequence. Intuition might suggest that the number you start with affects the number you end up with, but Collatz predicted that if you start with a positive integer and run this process long enough, all initial values will lead to 1. And once you reach 1, the rules of Collatz’s conjecture limit you to one cycle: 1, 4, 2, 1, 4, 2, 1, over and over forever. Euclid, formulating 2 * n + 1, generates all the infinitely many odd numbers which, factored with the Fundamental Theorem of Arithmetic, do not have 2 among their factors; Collatz, formulating n * 3 + 1, makes even any odd number, whether the starting number or one arising in the sequence; the even number, or the number which has become even, has 2 among its factors, and any even number is a multiple of it. The result of an even number divided by 2 can also be an odd number, but Collatz with his algorithm (n * 3 + 1) makes any known or unknown odd number even, and the known or unknown even number can be divided by 2; halving the even numbers, we always arrive at 2/2 = 1, which being odd becomes 4 (1 * 3 + 1), which in turn becomes 2 (4/2) and then 2/2 = 1, to continue indefinitely 1, 4, 2. Thanks for reading.

3 April, 2021 at 4:56 am

sacirisi

The Fundamental Theorem of Arithmetic proves that the infinite (hereinafter ∞) natural numbers are primes or compounds and are the producer of prime numbers; satisfying the two versions of Goldbach’s conjecture will allow us to prove that even ∞ natural numbers are the sum of 2 prime numbers, ∞ odd numbers are the sum of only 3 prime numbers; satisfying Euclid’s twin primes conjecture allows us to give a solution to the strong version of Goldbach’s conjecture, “all even ∞ are the sum of two primes” by not looking for the impossible combinatorics between two primes whose quantity and value we do not know , but analyzing the distances between two consecutive prime numbers spaced only by the multiple numbers. https://www.facebook.com/groups/5349868060/?multi_permalinks=10158013236403061

28 September, 2021 at 12:20 am

sacirisi

mutatis mutandis at 20210927. The Collatz conjecture is defined as a quagmire or labyrinth from which it is best to stay away. I went in, I left and I can go in and out whatever the starting number. The only way to get out of the “quagmire or labyrinth” was known that it was necessary to find the way to get to 1. All even numbers that are the result of a power of 2 arrive at 1, from half of the smaller power 2 ^ 1 to the middle from the largest and the smallest generable. The even number is: either a choice of the starting number and, if it is the result of 2 ^ n, we arrive at 1 after so many divisions by 2 equal to the value of the exponent of the power of 2, or it is the result of a number odd * 3 + 1. There is the odd number that multiplying it * 3 + 1 obtains a result equal to the result of a power of 2 with even exponent ≥2. The primes are infinite and are generated by the known primes, the odd numbers that * 3 + 1 generate the powers “to get out of the quagmire” are infinite and are the sum of the results of the even powers ≥0 notes or the odd binary number with digits which are the results of even powers ≥0. Each starting number generates new, unique and unrepeatable numbers; among these there is an odd number which is the sum of the results of powers of 2 with even exponent. https://www.facebook.com/photo/?fbid=3082225312098634&set=pcb.10158351014883061 Emzari Papava Fan Club

15 October, 2021 at 5:10 am

sacirisi

Mutatis mutandis, as of 15 Apr: the Collatz algorithm will never create an infinite loop. Starting from any natural number, the Collatz algorithm reaches an odd number which, multiplied by 3 and increased by 1, gives an even number that is a power of 2; the algorithm then halves these powers of 2 until it reaches 1. All numbers end with a power of 2, so all numbers end at 1. The work: https://www.facebook.com/groups/5349868060

15 October, 2021 at 5:12 am

sacirisi

The steps to get to 1: will never know the factors of natural numbers which are the result of the production of prime numbers raised to a power equal to a natural number. We do not know which is the largest number to verify and we will never know the largest value of the exponent of the factor 2. In the Collatz conjecture, this value determines the steps required to halve an even number and arrive at an odd number. For an even number that is the result of a power of 2, the number of operations required to reach 1 is the exponent value of the power of 2; for an even number which is the result of the product of a power of 2 and powers of odd primes, the number of operations necessary to arrive at 1 is the sum of the exponents of all powers of 2 which are factors of the even numbers that the algorithm generates by multiplying the odd * 3 + 1 and that, with the power of the final 2, include the succession of values that the algorithm generates in each starting number.

6 May, 2021 at 5:21 am

Anonymous

I would like to share my result on the problem with the polymath project.
I tried to write the result as mathematically as I could.
There might be some mistakes in notation because I am not a math researcher, sorry.
I think this would be of some help for understanding the pseudo-randomness, the invariant measure, and the most difficult part of the problem. Thank you.

https://drive.google.com/file/d/1KtzjqtceHlf5sCvjEMCkVvnVQbMo8nFP/view?usp=sharing
https://drive.google.com/file/d/1TmiKN4vLwY3r3gZ367u1G8hjrTNtFVr0/view?usp=sharing

22 August, 2021 at 2:29 am

Alberto Ibañez

On falling into a non-trivial orbit

Dear professor, despite trying to understand your arguments, I can’t, but I still wanted to ask you whether with this technique, using the pre-images (Collatz’s conjecture in reverse), it would be possible to show that every orbit contains a multiple of three (actually, that every orbit originates from a multiple of 3)?

22 August, 2021 at 1:09 pm

Terence Tao

Yes, this is true. For if $n$ is not a multiple of $3$ , then simple modular arithmetic reveals that $2^k n = 1 \hbox{ mod } 9$ for some natural number $k$ . Then $\frac{2^k n-1}{3}$ is a multiple of three whose orbit will contain $n$ .
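
The modular argument in this reply can be tried out on small cases. The sketch below reads the preimage as $m = \frac{2^k n - 1}{3}$ (the odd number that $\mathrm{Col}$ sends to $2^k n$) and searches for the smallest $k \geq 1$ with $2^k n = 1 \hbox{ mod } 9$; restricting to $k \geq 1$ and the function names are assumptions made for illustration.

```python
def collatz(n):
    """One step of the Collatz map Col on the positive integers."""
    return 3 * n + 1 if n % 2 else n // 2


def multiple_of_three_preimage(n):
    """For n not divisible by 3, return (k, m) with m = (2^k n - 1) / 3.

    k is the smallest exponent >= 1 with 2^k n = 1 mod 9; such a k exists
    because 2 generates the units mod 9. (Assumed convention: k >= 1, which
    makes 2^k n - 1 odd and hence m odd.)
    """
    if n % 3 == 0:
        raise ValueError("n must not be a multiple of 3")
    k = 1
    while (2 ** k * n) % 9 != 1:
        k += 1
    return k, (2 ** k * n - 1) // 3


def orbit_reaches(m, n, max_steps):
    """Check whether n appears within max_steps Collatz steps starting at m."""
    for _ in range(max_steps + 1):
        if m == n:
            return True
        m = collatz(m)
    return False


if __name__ == "__main__":
    for n in (1, 2, 5, 7):
        k, m = multiple_of_three_preimage(n)
        # m is odd, so Col(m) = 2^k n, and k halvings then give n.
        print(n, k, m, m % 3 == 0, orbit_reaches(m, n, k + 1))
```

Since $m$ is odd, one step takes it to $3m+1 = 2^k n$ and $k$ halvings then reach $n$, so a bound of $k+1$ steps suffices in `orbit_reaches`.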

23 August, 2021 at 12:10 pm

Alberto Ibañez

Thank you Professor and excuse me because I think I am not expressing myself well.
Every n odd multiple of 3 (or an orbit containing one) cannot fall into a non-trivial periodic orbit
Is it proven that no odd n can fall into periodic orbit? I think not yet, right?
So I was wondering if your technique could be used to show that every reverse orbit contains an odd n multiple of 3 (the origin).

Orbits that escape to infinity

I take the opportunity to ask what is the value of the fact that, if k is the number of multiplications by 3 and y is the number of divisions by 2, then when k goes to infinity, y-k also goes to infinity.
This is deduced from an accelerated Col2-type map, which follows the worst possible (ascending) trajectory, in the sense of how long an orbit keeps alternating one multiplication by 3 with one division by 2, and ensures that for every odd n its worst trajectory, in this sense, is finite and ends up dividing by 2 more than once.

If n is odd and n + 1 = z * 2 ^ t is even, then the worst orbit, in this sense, ends after t steps of a Col2 map:

$odd\, n + (n + 1) = n +( z \times 2 ^ t) \rightarrow n + z \times (3 ^ t-2 ^ t) = even$

Thanks

30 October, 2022 at 7:13 am

sacirisi

With the Collatz algorithm it is not possible to process all natural numbers because we do not know: quantities and values of even and odd numbers and all their factors. From Tartaglia’s triangle we can detect odd numbers which are the sum of the results of the infinite powers of 2 which have an even index and which are also equal to the previous odd * 4 + 1. These are all the odd numbers that * 3 + 1 generate an even number that is the result of a base power 2 and even index 2 ^ (2 * n≥1) and that, the nth half, ends at 1 because ½ of 2 ^ 1 = 2 ^ 0 = 1. pre print https://vixra.org/abs/2112.0004

27 October, 2023 at 7:39 am

Cynthia Moore

ISO someone interested in reviewing my Collatz sketch. Please contact CynthiaMooreMath@gmail.com. Thanks!
