The Collatz Conjecture: Simple Rules, Infinite Mystery

Statement of the Conjecture

The Collatz Conjecture, also known as the $ 3x + 1 $ problem, is one of the most famous unsolved problems in mathematics. It is deceptively simple to state.

[Image: Collatz sequence trajectories plotted as a graph, showing the number of iterations for various starting values to reach 1]

Important: The Collatz Conjecture (Lothar Collatz, 1937).
Define the function $ \phi : \mathbb{N}^+ \to \mathbb{N}^+ $ by
$$ \phi(x) = \begin{cases} x/2, & \text{if } x \text{ is even}, \\ 3x + 1, & \text{if } x \text{ is odd}. \end{cases} $$
Then for every positive integer $ x $, the sequence $ x, \phi(x), \phi^2(x), \phi^3(x), \ldots $ eventually reaches 1.

Example. Starting with $ x = 7 $:

$$ 7 \to 22 \to 11 \to 34 \to 17 \to 52 \to 26 \to 13 \to 40 \to 20 \to 10 \to 5 \to 16 \to 8 \to 4 \to 2 \to 1. $$

Seventeen steps. But it gets there.

Example. The number 27 is particularly interesting. Its Collatz sequence requires 111 steps and reaches a maximum value of 9,232 before finally descending to 1. This dramatic escalation illustrates how unpredictable Collatz trajectories can be.

[Image: Collatz sequence trajectories for starting values n = 3, n = 7, and n = 27]

Paul Erdos famously said of the Collatz Conjecture: “Mathematics may not be ready for such problems.”

Collatz sequence trajectories

Collatz tree structure

Basic Properties

Lemma. If $ x $ is odd, then $ \phi(x) = 3x + 1 $ is always even. Therefore, after every odd step, an even step necessarily follows.

Proof. If $ x $ is odd, then $ 3x $ is odd, and $ 3x + 1 $ is even.

Definition (Collatz Sequence / Trajectory). The Collatz sequence (or trajectory) of a positive integer $ x $ is the sequence

$$ x, \; \phi(x), \; \phi^2(x), \; \phi^3(x), \; \ldots $$

obtained by iterating the Collatz function.

Definition (Stopping Time). The stopping time of $ x $ is the smallest $ k $ such that $ \phi^k(x) < x $. The total stopping time is the smallest $ k $ such that $ \phi^k(x) = 1 $.

[Image: Tree visualization of the Collatz conjecture showing how multiple starting values converge through shared paths to reach 1]

The stopping times of small integers:

$ x $: 1 | Total stopping time: 0 | Max value reached: 1
$ x $: 2 | Total stopping time: 1 | Max value reached: 2
$ x $: 3 | Total stopping time: 7 | Max value reached: 16
$ x $: 5 | Total stopping time: 5 | Max value reached: 16
$ x $: 7 | Total stopping time: 16 | Max value reached: 52
$ x $: 27 | Total stopping time: 111 | Max value reached: 9,232
Computational Verification

The Collatz Conjecture has been verified computationally for all integers up to $ 2^{68} \approx 2.95 \times 10^{20} $. Every single one reaches 1. Yet this massive computational evidence does not constitute a proof.

A Sieve-Based Approach

Eswar Chellappa proposed a proof approach inspired by the Sieve of Eratosthenes. We present the method and then analyze its limitations.

The Sieve Process

Define the helper functions $ f(x) = x/2 $ for even $ x $ and $ g(x) = 3x + 1 $ for odd $ x $.

Step 1. Consider the natural numbers from 1 to $ N $.

Step 2. Process all even numbers in decreasing order. Each even number $ x $ maps to $ x/2 $, which is smaller. Mark each as “verified.”

Step 3. For each remaining odd number $ x $ (starting from 3), compute $ g(x) = 3x + 1 $. If $ g(x) $ has already been verified, then $ x $ is also verified.

Step 4. Continue until $ g(x) > N $, then increase $ N $ and repeat.

Example. For $ N = 20 $: After processing even numbers, we verify 3 via $ g(3) = 10 $ (already verified), 5 via $ g(5) = 16 $ (verified), 7 via $ g(7) = 22 $, but $ 22 > 20 $, so 7 is not verified in this range.

Extending to $ N = 100 $: now $ g(7) = 22 $ is in range, $ g(9) = 28 $, $ g(11) = 34 $, etc. Eventually we verify all odd numbers up to 33 (since $ g(33) = 100 $).

The Argument Against Loops

Chellappa argues that non-trivial loops are impossible because $ f(x) = x/2 $ is injective. If $ \phi^i(x) = \phi^j(x) \neq 1 $ for $ i \neq j $, one might think this leads to a contradiction. However, this argument requires more careful analysis.

Critique: Why the Sieve Approach Is Insufficient

While the sieve method correctly verifies the conjecture for finite ranges, it does not constitute a proof. There are three fundamental gaps:

Finite verification does not equal infinite proof. The sieve works perfectly for any fixed $ N $, but “keep extending the range” is an observation, not a proof technique. We need an argument that covers all natural numbers simultaneously.
The unbounded sequence problem. The sieve assumes all trajectories eventually enter the “already verified” region. But a number could, in principle, have a trajectory that increases forever, never returning to previously visited values. Ruling this out is precisely what the conjecture asserts.
Circular reasoning. The argument that “every number eventually maps to a smaller number” is equivalent to the conjecture. It cannot be used as a premise.

Tip: The number 27 illustrates the difficulty: it requires 111 steps and climbs to 9,232 before descending. The sieve method provides no a priori guarantee that every number will eventually descend.

Why Is the Collatz Conjecture So Hard?

The conjecture resists proof because it combines several difficult features:

Multiplicative and additive mixing. The operation $ 3x + 1 $ combines multiplication by 3 with addition of 1. These two types of operations interact in chaotic ways that are extremely difficult to analyze.
Unpredictable parity dynamics. After applying $ 3x + 1 $ to an odd number, the result is even, but the number of times we can divide by 2 before reaching another odd number is highly irregular.
No known invariant. Standard proof techniques (induction, descent, invariants) rely on finding a quantity that behaves predictably. Collatz trajectories defy such analysis.

Partial Results and Modern Progress

Theorem (Terras, 1976). Almost all positive integers (in the sense of natural density) have finite stopping time.

Theorem (Tao, 2019). Almost all Collatz orbits attain almost bounded values. Precisely: for any function $ f : \mathbb{N} \to \mathbb{R} $ with $ f(n) \to \infty $ as $ n \to \infty $, the set

$$ \{n \in \mathbb{N} : \min_{k \geq 0} \phi^k(n) \leq f(n)\} $$

has natural density 1.

Tao’s result is the strongest partial progress toward the Collatz conjecture. However, “almost all” is not “all”, and the conjecture remains open.

Possible Failure Modes

The conjecture could fail in exactly two ways:

Definition (Divergent Trajectory). A positive integer $ x $ has a divergent trajectory if $ \sup_k \phi^k(x) = \infty $.

Definition (Non-trivial Cycle). A non-trivial cycle is a finite set $ \{x_1, x_2, \ldots, x_m\} $ of positive integers, none equal to 1, such that $ \phi(x_i) = x_{i+1} $ (indices mod $ m $).

Theorem (Eliahou, 1993). Any non-trivial cycle in the Collatz iteration must have length at least 17,087,915.

This makes non-trivial cycles extremely unlikely but does not rule them out entirely. Similarly, divergent trajectories have never been observed but have not been proven impossible.

The Collatz Sequence for n = 27

The sequence beginning at 27 requires 111 steps and reaches a maximum of 9,232:

$$ 27 \to 82 \to 41 \to 124 \to 62 \to 31 \to 94 \to 47 \to 142 \to 71 \to 214 \to 107 \to 322 \to \cdots $$ $$ \cdots \to 9232 \to 4616 \to 2308 \to 1154 \to 577 \to \cdots \to 4 \to 2 \to 1. $$

Connections to the Broader Picture

The Primacy of Primes

Primes are the central characters across all of recreational number theory:

Fermat numbers ask: which numbers of the form $ 2^{2^n}+1 $ are prime?
Euler’s polynomial generates sequences of primes.
The Collatz conjecture involves the interplay between the prime 2 (division) and the prime 3 (multiplication).

Empirical Evidence vs. Proof

Important: A Cautionary Principle.
In number theory, empirical evidence can be spectacularly misleading:

Fermat’s five examples led to a false conjecture.

Euler’s 40 primes cannot extend further.

The Collatz conjecture, verified to $ 2^{68} $, remains unproven.

The gap between observation and proof is the defining challenge of the subject.

Simple Definitions, Deep Consequences

Each of the objects studied in this course is defined by an elementary formula:

$$ \begin{aligned} F_n &= 2^{2^n} + 1, \\ P(n) &= n^2 + n + 41, \\ \phi(x) &= \begin{cases} x/2, & x \text{ even}, \\ 3x+1, & x \text{ odd}. \end{cases} \end{aligned} $$

Yet each leads to problems of extraordinary difficulty. This phenomenon, simple rules generating complex behavior, is a hallmark of number theory and a bridge to fields like dynamical systems and computational complexity.

Dynamical Systems Perspective

The Collatz function defines a discrete dynamical system on the positive integers. Similarly, the iteration $ F_{n+1} = (F_n – 1)^2 + 1 $ gives a recurrence for Fermat numbers. While the Fermat recurrence grows predictably (doubly exponential), the Collatz iteration exhibits chaotic behavior. Understanding what distinguishes “tame” from “wild” iterations in number theory remains a fundamental open question.

Major Open Problems

We collect the main unsolved questions arising from this course:

Are there infinitely many Fermat primes? Most experts believe not, that $ F_0 $ through $ F_4 $ are the only Fermat primes, but this remains unproven.
Is the Collatz conjecture true? Despite verification to $ 2^{68} $ and Tao’s “almost all” result, a complete proof (or disproof) remains out of reach.
Is the Collatz conjecture decidable? Conway showed that generalizations are undecidable. The specific $ 3x+1 $ case might be independent of ZFC.
Are there better prime-generating polynomials? Rabinowitz’s theorem shows that $ n^2 + n + 41 $ is optimal among polynomials of the form $ n^2 + n + p $. But might other polynomial forms produce longer runs of primes?
What is the true density of primes in polynomial sequences? The Bunyakovsky conjecture predicts that any irreducible polynomial with no fixed prime divisor produces infinitely many prime values. This is unproven for any polynomial of degree $ \geq 2 $.

Exercises:

Compute the full Collatz sequence for $ x = 19 $. How many steps does it take to reach 1? What is the maximum value reached?

Prove that powers of 2 always reach 1 in exactly $ \log_2(x) $ steps. What does this tell you about the conjecture?

Show that the Collatz sequence for any number of the form $ 8k + 4 $ reaches $ 6k + 4 $ in exactly 3 steps.

The “shortcut” Collatz function applies $ (3x+1)/2 $ when $ x $ is odd (since the result of $ 3x+1 $ is always even). Compute the shortcut sequence for $ x = 27 $. How many steps does it save?

Investigate the $ 5x + 1 $ variant: compute the first 20 terms of the sequence starting at $ x = 13 $. Does it reach 1?

If a non-trivial Collatz cycle existed, explain why it must contain at least one odd number and at least one even number. Then argue why the cycle length must be at least 3.

Verify Terras’s result informally: for the first 100 positive integers, compute what fraction have stopping time less than 50. (A stopping time is when the sequence first goes below its starting value.)