I’ve just uploaded to the arXiv my paper “On product representations of squares”. This short paper answers (in the negative) a (somewhat obscure) question of Erdős. Namely, for any $k \geq 2$, let $F_k(N)$ be the size of the largest subset $A$ of $\{1,\dots,N\}$ with the property that no $k$ distinct elements of $A$ multiply to a square. In a paper by Erdős, Sárközy, and Sós, asymptotics and bounds for $F_k(N)$ were shown for fixed $k$; in particular, $F_2(N) = (\frac{6}{\pi^2}+o(1)) N$ and $F_3(N) = (1-o(1)) N$. However, the asymptotics for $F_k(N)$ for odd $k \geq 5$ were not completely settled. Erdős asked if one had $F_k(N) = (1-o(1)) N$ for odd $k \geq 5$. The main result of this paper is that this is not the case; that is to say, there exists $c_k > 0$ such that any subset $A$ of $\{1,\dots,N\}$ of cardinality at least $(1-c_k) N$ will contain $k$ distinct elements that multiply to a square, if $N$ is large enough. In fact, the argument works for all $k \geq 4$, although it is not new in the even case. I will also note that there are now quite sharp upper and lower bounds on $F_k$ for even $k \geq 4$, using methods from graph theory: see this recent paper of Pach and Vizer for the latest results in this direction. Thanks to the results of Granville and Soundararajan, we know that the constant $c_k$ cannot exceed the Hall–Montgomery constant

$\displaystyle 1 - \log(1+\sqrt{e}) + 2 \int_1^{\sqrt{e}} \frac{\log t}{t+1}\ dt = 0.171500\dots$

and I (very tentatively) conjecture that this is in fact the optimal value for this constant. This looks somewhat difficult, but a more feasible conjecture would be that the $c_k$ asymptotically approach the Hall–Montgomery constant as $k \rightarrow \infty$, since the aforementioned result of Granville and Soundararajan morally corresponds to the $k = \infty$ case.
In the end, the argument turned out to be relatively simple; no advanced results from additive combinatorics, graph theory, or analytic number theory were required. I found it convenient to proceed via the probabilistic method (although the more combinatorial technique of double counting would also suffice here). The main idea is to generate a tuple $\mathbf{n}_1, \dots, \mathbf{n}_k$ of distinct random natural numbers in $\{1,\dots,N\}$ which multiply to a square, and which are reasonably uniformly distributed throughout $\{1,\dots,N\}$, in that each individual number $1 \leq n \leq N$ is attained by one of the random variables $\mathbf{n}_i$ with a probability of $O(1/N)$. If one can find such a distribution, then if the density of $A$ in $\{1,\dots,N\}$ is sufficiently close to $1$, it will happen with positive probability that each of the $\mathbf{n}_i$ will lie in $A$, giving the claim.
When $k = 3$, this strategy cannot work, as it would contradict the results of Erdős, Sárközy, and Sós. The reason can be explained as follows. The most natural way to generate a triple $\mathbf{n}_1, \mathbf{n}_2, \mathbf{n}_3$ of random natural numbers in $\{1,\dots,N\}$ which multiply to a square is to set

$\displaystyle \mathbf{n}_1 := \mathbf{d}_{12} \mathbf{d}_{13}, \quad \mathbf{n}_2 := \mathbf{d}_{12} \mathbf{d}_{23}, \quad \mathbf{n}_3 := \mathbf{d}_{13} \mathbf{d}_{23}$

for some random natural numbers $\mathbf{d}_{12}, \mathbf{d}_{13}, \mathbf{d}_{23}$. But if one wants all these numbers to have magnitude $\asymp N$, one sees on taking logarithms that one would need

$\displaystyle \log \mathbf{d}_{12} + \log \mathbf{d}_{13},\ \log \mathbf{d}_{12} + \log \mathbf{d}_{23},\ \log \mathbf{d}_{13} + \log \mathbf{d}_{23} = \log N + O(1),$

which by elementary linear algebra forces

$\displaystyle \log \mathbf{d}_{12}, \log \mathbf{d}_{13}, \log \mathbf{d}_{23} = \frac{1}{2} \log N + O(1),$

so in particular each of the $\mathbf{n}_i$ would have a factor comparable to $\sqrt{N}$. However, it follows from known results on the “multiplication table problem” (how many distinct integers are there in the $n \times n$ multiplication table?) that most numbers up to $N$ do *not* have a factor comparable to $\sqrt{N}$. (Quick proof: by the Hardy–Ramanujan law, a typical number of size $N$ or of size $\sqrt{N}$ has $(1+o(1)) \log\log N$ prime factors, hence typically a number of size $N$ will not factor into two factors of size $\sqrt{N}$.) So the above strategy cannot work for $k = 3$.
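The scarcity of “middle” divisors is easy to observe numerically. The following self-contained Python sketch (the window $[\sqrt{N}/C, C\sqrt{N}]$ with $C = 2$ and the cutoff $N = 10^5$ are my own illustrative choices, not parameters from the paper) sieves out the integers up to $N$ that have a divisor comparable to $\sqrt{N}$:

```python
import math

def proportion_with_middle_divisor(N: int, C: float = 2.0) -> float:
    """Fraction of n in [1, N] having a divisor d with sqrt(N)/C <= d <= C*sqrt(N)."""
    lo = max(2, math.ceil(math.sqrt(N) / C))
    hi = math.floor(C * math.sqrt(N))
    has_middle = [False] * (N + 1)
    for d in range(lo, hi + 1):          # sieve: mark every multiple of each "middle" divisor
        for m in range(d, N + 1, d):
            has_middle[m] = True
    return sum(has_middle) / N

# Strictly less than 1, and (by the multiplication table problem) tending to 0 slowly as N grows.
print(proportion_with_middle_divisor(10**5))
```

The sieve costs about $N \log(C^2)$ operations, so it is fast even for fairly large $N$.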
However, the situation changes for larger $k$. For instance, for $k = 4$, we can try the same strategy with the ansatz

$\displaystyle \mathbf{n}_1 = \mathbf{d}_{12} \mathbf{d}_{13} \mathbf{d}_{14}; \quad \mathbf{n}_2 = \mathbf{d}_{12} \mathbf{d}_{23} \mathbf{d}_{24}; \quad \mathbf{n}_3 = \mathbf{d}_{13} \mathbf{d}_{23} \mathbf{d}_{34}; \quad \mathbf{n}_4 = \mathbf{d}_{14} \mathbf{d}_{24} \mathbf{d}_{34}.$

Whereas before there were three (approximate) equations constraining three unknowns, now we would have four equations and six unknowns, and so we no longer have strong constraints on any of the $\mathbf{d}_{ij}$. So in principle we now have a chance to find a suitable random choice of the $\mathbf{d}_{ij}$. The most significant remaining obstacle is the Hardy–Ramanujan law: since the $\mathbf{n}_i$ typically have $(1+o(1))\log\log N$ prime factors, it is natural in this $k = 4$ case to choose each $\mathbf{d}_{ij}$ to have $(\frac{1}{3}+o(1)) \log\log N$ prime factors. As it turns out, if one does this (basically by requiring each prime $p \leq N^{\varepsilon^2}$ to divide $\mathbf{d}_{ij}$ with an independent probability of about $\frac{1}{3p}$, for some small $\varepsilon > 0$, and then also adding in one large prime to bring the magnitude of the $\mathbf{n}_i$ to be comparable to $N$), the calculations all work out, and one obtains the claimed result.
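To illustrate the shape of this construction, here is a toy Python simulation (the prime cutoff and random seed are my own arbitrary choices, and the toy makes no attempt to enforce distinctness, the extra large prime, or the size constraints from the paper): each small prime joins each $\mathbf{d}_{ij}$ independently with probability $1/(3p)$, and the product $\mathbf{n}_1 \mathbf{n}_2 \mathbf{n}_3 \mathbf{n}_4$ is then a perfect square automatically, since it equals $(\prod_{i<j} \mathbf{d}_{ij})^2$.

```python
import math
import random

def sample_tuple(prime_cutoff: int = 50, seed: int = 0):
    """Toy version of the k=4 ansatz: n_i is the product of d_{ij} over the three pairs containing i."""
    rng = random.Random(seed)
    primes = [p for p in range(2, prime_cutoff + 1)
              if all(p % q for q in range(2, math.isqrt(p) + 1))]
    pairs = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
    d = {}
    for pair in pairs:
        d[pair] = 1
        for p in primes:
            if rng.random() < 1 / (3 * p):  # each prime divides d_{ij} with probability ~ 1/(3p)
                d[pair] *= p
    return {i: math.prod(d[pair] for pair in pairs if i in pair) for i in (1, 2, 3, 4)}

n = sample_tuple()
product = math.prod(n.values())
assert math.isqrt(product) ** 2 == product  # the product is a square by construction
print(n, product)
```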
24 comments
20 May, 2024 at 7:30 pm
Anonymous
Bravo!
20 May, 2024 at 7:53 pm
domotorp
See also https://arxiv.org/abs/2405.12088, which just appeared on the arXiv.
20 May, 2024 at 8:08 pm
Terence Tao
Wow, that is quite a coincidence in timing! It seems the papers, while very adjacent in topic, do not actually overlap, but I will certainly update my paper to reference theirs.
21 May, 2024 at 9:04 am
Anonymous
That’s certainly amazing.
21 May, 2024 at 3:56 pm
Anonymous
Why the case of odd $k \geq 5$? (see above)
22 May, 2024 at 9:58 am
Terence Tao
I am not sure exactly what the thrust of your question is, but one can get significantly stronger upper bounds in the even case because one can exploit the birthday paradox: if one can find two separate $k/2$-tuples in $A$ whose products have the same square-free part, then the product of the entire $k$-tuple will be a square. One can use this trick to generate even tuples that multiply to a square, but not odd tuples. Similarly, it is easier to generate strong lower bounds in the odd case than in the even case, because the set of numbers that have an odd number of prime factors in a given set $P$ of primes will necessarily have all odd products being a non-square, and all known counterexamples are based on this construction with a suitably optimized choice of $P$.
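[To make the square-free-part trick concrete, here is a small self-contained Python sketch (the search range and the choice of pairs, i.e. $k = 4$, are my own illustrative choices): it finds two disjoint pairs in $\{1,\dots,N\}$ whose products share a square-free part, so that the combined 4-tuple multiplies to a square. -Ed.]

```python
import math

def squarefree_part(n: int) -> int:
    """The squarefree s with n = s * (perfect square)."""
    s, p = n, 2
    while p * p <= s:
        while s % (p * p) == 0:
            s //= p * p
        p += 1
    return s

def find_square_quadruple(N: int):
    """Find two disjoint pairs in {1,...,N} whose pair-products share a squarefree
    part; their union is then 4 distinct elements multiplying to a square."""
    seen = {}
    for a in range(1, N + 1):
        for b in range(a + 1, N + 1):
            key = squarefree_part(a * b)
            if key in seen:
                c, d = seen[key]
                if len({a, b, c, d}) == 4:  # require the two pairs to be disjoint
                    return (c, d, a, b)
            else:
                seen[key] = (a, b)
    return None

print(find_square_quadruple(10))  # e.g. (1, 6, 2, 3): 1*6*2*3 = 36 = 6**2
```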
23 May, 2024 at 3:19 am
Will Sawin
Lovely!
If one were to make the upper bound on the density obtained by this method explicit, it would go to 1 with k. Indeed it couldn’t possibly be higher than 1-1/k since we need to lower bound the probability that a k-tuple of random variables lies in A based only on information in the marginal distribution, requiring us to use the union bound, which loses at least 1 minus the density for each element of the tuple.
Might it be possible to prove a bound uniform in k at least 5 by some modification of this method? It seems tricky, as one would have to prove that the events n_i in A are approximately independent in some sense. But actually I’m not sure that this is the main obstacle – it’s possible the current bounds from an explicit version of your argument go to 1 significantly faster than this.
23 May, 2024 at 9:40 am
Terence Tao
It seems one can get a lower bound on the leading constant that is uniform in $k$ by leveraging the even $k$ theory, which among other things tells us that any set of positive density will contain $k$ distinct elements multiplying to a square if $k$ is fixed and even. So then if $k \geq 9$ is odd, and $A$ is a subset of $\{1,\dots,N\}$ that has density close to $1$, then one can first locate a tuple of five elements multiplying to a square, delete those elements, and then locate a tuple of $k-5$ (which is an even number that is at least 4) elements that multiply to a square in the remaining set. (In the notation of the paper, this argument shows that $c_k \geq c_5$ for any fixed odd $k \geq 9$.) I’ll add this as a remark to the paper.
25 May, 2024 at 12:17 am
Csaba Sandor
Let $k$ be an odd positive integer. Then $F_{k+2}(N) \leq F_k(N) + 2$. Let $A$ be a largest subset of $\{1,\dots,N\}$ that does not contain $k+2$ distinct elements whose product is a square. It is known that $|A| = F_{k+2}(N) > F_2(N)$ for large $N$. Then one can locate elements $a$, $b \in A$ with $a \neq b$ multiplying to a square. Then $A \setminus \{a, b\}$ is a set that does not contain $k$ distinct elements whose product is a square.
25 May, 2024 at 7:06 am
Terence Tao
Nice! So now the monotonicity of $c_k$ for odd $k$ is settled.
3 June, 2024 at 2:27 am
Anonymous
can you add it to your article (with ack. for Sandor of course)?
3 June, 2024 at 2:42 am
Anonymous
my bad, v2 already is there…
25 May, 2024 at 7:31 am
Terence Tao
One way to think about this phenomenon is that the set $\{(n_1,\dots,n_k): n_1 \cdots n_k \hbox{ a perfect square}\}$ is not an “irreducible” set in some arithmetic geometry sense, but instead has multiple interesting “components”. In my paper I rely more or less exclusively on the component which has the parameterization $\mathbf{n}_i = \prod_{j \neq i} \mathbf{d}_{ij}$ for some natural numbers $\mathbf{d}_{ij}$ with $\mathbf{d}_{ij} = \mathbf{d}_{ji}$. But there are other components as well; for instance one can retain this sort of parameterization just for $\mathbf{n}_1, \dots, \mathbf{n}_{k-2}$, but have an unrelated parameterization $\mathbf{n}_{k-1} = \mathbf{d} \mathbf{e}^2$, $\mathbf{n}_k = \mathbf{d} \mathbf{f}^2$ for the final two components; this is related to the monotonicity of $c_k$ for odd $k$ discussed previously. It is not clear a priori which components give the strongest bounds on $c_k$ (and perhaps the best bounds could even come from a synergy between multiple components, rather than just taking the best bound provided by a single component).
25 May, 2024 at 8:22 am
Terence Tao
As an experiment, I asked ChatGPT to provide Python code to compute the function $F(N)$, defined as the size of the largest subset of $\{1,\dots,N\}$ with no odd number of elements multiplying to a square. This is a lower bound for all the $F_k(N)$ for odd $k$; the quantity $N - F(N)$ is also the minimal size of $\{n \leq N: f(n) = 1\}$, where $f$ ranges over $\{-1,+1\}$-valued completely multiplicative functions. As it turned out, the code ran flawlessly (other than that I had to figure out how to install the sympy package, and how to print the results), and only took me a few minutes to complete: https://chatgpt.com/share/e021a5fb-5040-427b-a1a8-260fb31eeb08 . On consulting the OEIS, I found that the sequence $N - F(N)$ already appears as https://oeis.org/A360659 (which also provided confirmation of the correctness of the GPT-provided code, which I also spot-checked with small values of $N$), although $F(N)$ is not, but I just submitted it to the OEIS. (I haven’t tried the analogous exercise with the $F_k(N)$, which are harder to compute for large $N$, but presumably one could get GPT to produce some algorithm that could produce enough entries to check the OEIS.)
UPDATE: $F(N)$ is now at https://oeis.org/A373114
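[For reference, a minimal Python implementation along the lines described above (my own sketch, not the ChatGPT code from the linked conversation): brute force over all $\{-1,+1\}$ sign patterns on the primes up to $N$, taking the largest set $\{n \leq N : f(n) = -1\}$. -Ed.]

```python
from itertools import product
from math import isqrt

def F(N: int) -> int:
    """Size of the largest subset of {1,...,N} with no odd number of elements
    multiplying to a square, via {-1,+1}-valued completely multiplicative f."""
    primes = [p for p in range(2, N + 1) if all(p % q for q in range(2, isqrt(p) + 1))]

    def exponent_parities(n):
        parities = []
        for p in primes:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            parities.append(e % 2)  # only the parity of each exponent affects f(n)
        return parities

    table = [exponent_parities(n) for n in range(1, N + 1)]
    best = 0
    for signs in product((1, -1), repeat=len(primes)):
        count = 0
        for parities in table:
            f = 1
            for s, e in zip(signs, parities):
                if e:
                    f *= s
            count += (f == -1)
        best = max(best, count)
    return best

print([F(N) for N in range(1, 7)])  # → [0, 1, 2, 2, 3, 3]
```

This is exponential in the number of primes up to $N$, so it is only practical for small $N$.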
25 May, 2024 at 9:55 am
Terence Tao
Indeed, ChatGPT could compute $F_5(N)$ by brute force for $N$ up to 10 in less than a minute: https://chatgpt.com/share/79ab37e7-d14d-4ae7-96d6-a5794a0a9581 . I could verify and extend the sequence by hand, and it does not appear to be in the OEIS; I will add it after the sequence $F(N)$ is approved (this is currently in process). [$F_3$ and $F_2$ already appear in the OEIS as https://oeis.org/A028391 and https://oeis.org/A013928 respectively.]
UPDATE: $F_5$ is now at https://oeis.org/A372306; $F_7$ is at https://oeis.org/A373119, $F_9$ is at https://oeis.org/A373178, and $F_{11}$ is at https://oeis.org/A373195.
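[For completeness, a direct exponential-time brute force for $F_k(N)$ itself, feasible only for very small $N$ (again my own sketch, not the code from the linked conversation). -Ed.]

```python
from itertools import combinations
from math import isqrt, prod

def F_k(k: int, N: int) -> int:
    """Largest size of a subset of {1,...,N} containing no k distinct elements
    whose product is a perfect square (exhaustive search over subsets)."""
    def admissible(subset):
        # True if no k elements of the subset multiply to a perfect square.
        return not any(isqrt(prod(t)) ** 2 == prod(t) for t in combinations(subset, k))

    for size in range(N, -1, -1):  # try the largest sizes first
        if any(admissible(s) for s in combinations(range(1, N + 1), size)):
            return size
    return 0

print(F_k(5, 6))  # → 5: e.g. {1,2,3,4,5} works, but {1,...,6} contains 1*2*3*4*6 = 144
```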
25 May, 2024 at 3:55 pm
Anonymous
Has the OEIS led to discoveries?
26 May, 2024 at 3:57 pm
Terence Tao
There are over 10,000 research papers that cite the OEIS, so I presume some non-trivial fraction of these found it useful.
30 May, 2024 at 1:58 pm
Anonymous
Are there similar results for representations of cubes?
[Yes; see the paper https://arxiv.org/abs/2405.12088 referenced in a previous comment -T.]
9 June, 2024 at 3:57 am
Anonymous
Is it possible to show that the sequence $c_k$ is monotonic?
14 June, 2024 at 2:41 pm
Anonymous
Yes.
14 June, 2024 at 3:16 pm
Terence Tao
The $c_k$ are known to equal 1 for even $k$, and are at most the Hall–Montgomery constant $0.171500\dots$ for odd $k$, so the sequence is not monotonic over the full natural numbers; however they are monotonic non-decreasing for odd $k$ (see the comment of Sandor above).
19 June, 2024 at 8:52 pm
Anonymous
Out of curiosity, how much time have you (Terry) spent browsing erdosproblems.com? I have spent some time clicking the random button.