Not quite, I think this is the relevant part of the paper:
> Structure of the proof. The proof of our main result, Theorem 1.1, is given in Section 6. The structure of the proof is as follows: machines are enumerated arborescently in Tree Normal Form (TNF) [9] – which drastically reduces the search space’s size: from 16,679,880,978,201 5-state machines to “only” 181,385,789; see Section 3. Each enumerated machine is fed through a pipeline of proof techniques, mostly consisting of deciders, which are algorithms trying to decide whether the machine halts or not. Because of the uncomputability of the halting problem, there is no universal decider and all the craft resides in creating deciders able to decide large families of machines in reasonable time. Almost all of our deciders are instances of an abstract interpretation framework that we call Closed Tape Language (CTL), which consists in approximating the set of configurations visited by a Turing machine with a more convenient superset, one that contains no halting configurations and is closed under Turing machine transitions (see Section 4.2). The S(5) pipeline is given in Table 3 – see Table 4 for S(2,4). All the deciders in this work were crafted by The bbchallenge Collaboration; see Section 4. In the case of 5-state machines, 13 Sporadic Machines were not solved by deciders and required individual proofs of nonhalting, see Section 5.
So, they figured out how to massively reduce the search space, wrote generic deciders that were able to prove whether large portions of the remaining search space halt or not, and then had to manually solve the remaining 13 machines that the generic deciders couldn't reason about.
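For intuition, here's a schematic of that pipeline structure in Python - a sketch of the shape described in the quoted passage, not the actual bbchallenge/Rocq code; all names here are illustrative placeholders:

    # Schematic decider pipeline (illustrative only). Each decider returns
    # "halts", "non-halting", or anything else for "unknown"; machines that
    # no decider can classify end up in the "sporadic" pile, which needs
    # individual proofs.
    HALTS, NONHALTING = "halts", "non-halting"

    def run_pipeline(machines, deciders):
        halters, nonhalters, sporadic = [], [], []
        for m in machines:
            for decide in deciders:
                verdict = decide(m)
                if verdict == HALTS:
                    halters.append(m)
                    break
                if verdict == NONHALTING:
                    nonhalters.append(m)
                    break
            else:  # no decider reached a verdict
                sporadic.append(m)
        return halters, nonhalters, sporadic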
Last but not least, those deciders were implemented and verified in the Rocq proof assistant, so we know they are correct.
We know that they correctly implement their specification*
No, they are correct, because the deciders themselves are just a cog in the proof of the overall theorem. The specification of the deciders is not part of the TCB, so to speak.
Those 13 sporadics are the interesting ones then ...
Probably just generating some fractal pattern on the tape.
There are a few concepts at play here. First you have to consider what can be proven given a particular theory of mathematics (presumably a consistent, recursively axiomatizable one). For any such theory, there is some finite N for which that theory cannot prove the exact value of BB(N). So with "infinite time", one could (in principle) enumerate all proofs and confirm successive Busy Beaver values only up to the point where the theory runs out of power. This is the Busy Beaver version of Gödel/Chaitin incompleteness. For BB(5), Peano Arithmetic suffices and RCA₀ likely does as well. Where do more powerful theories come from? That's a bit of a mystery, although there's certainly plenty of research on that (see Feferman's and Friedman's work).
Second, you have to consider what's feasible in finite time. You can enumerate machines and also enumerate proofs, but any concrete strategy has limits. In the case of BB(5), the authors did not use naive brute force. They exhaustively enumerated the 5-state machines (after symmetry reductions), applied a collection of certified deciders to prove halting/non-halting behavior for almost all of them, and then provided manual proofs (also formalized) for some holdout machines.
I think you have to exhaustively check each 5-state TM, but brute force will only help a bit with each one - brute force can't tell you that a TM will run forever, can it?
You cannot rely on brute force alone to compute these numbers. They are uncomputable.
The busy beaver numbers form an uncomputable sequence.
For BB(5), the proof of its value is an indirect computation: the verification involved both computation (running many machines) and proofs (showing the others run forever or halt earlier). The exhaustive, crowdsourced proof effort was a tour de force.
Isn't it rather that the Busy Beaver function is uncomputable, but particular values can be calculated - although anything greater than BB(5) is quite a challenge!
https://scottaaronson.blog/?p=8972
Only finitely many values of BB can be mathematically determined. Once your Turing Machines become expressive enough to encode your (presumably consistent) proof system, they can begin encoding nonsense of the form "I will halt only after I manage to derive a proof that I won't ever halt", which means that their halting status (and the corresponding Busy Beaver value) fundamentally cannot be proven.
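A hedged sketch of that construction in Python (the helpers are hypothetical stand-ins; a real version obtains the self-reference via Kleene's recursion theorem rather than a string literal):

    # Sketch of a machine whose halting status a consistent theory cannot
    # settle. `enumerate_proofs` and `proves` are hypothetical placeholders
    # for a proof enumerator and a proof checker.
    def enumerate_proofs(theory):
        """Hypothetical: yield all candidate proofs in the theory, shortest first."""
        raise NotImplementedError

    def proves(proof, statement):
        """Hypothetical: True iff `proof` validly derives `statement`."""
        raise NotImplementedError

    def contrarian():
        # Real constructions get this self-reference via the recursion theorem.
        target = "contrarian() never halts"
        for proof in enumerate_proofs("ZFC"):
            if proves(proof, target):
                return  # halt exactly when the theory proves we never halt

If the theory is consistent, contrarian() never halts, but the theory cannot prove that - so it cannot pin down BB(n) for any n large enough to implement contrarian() as an n-state machine.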
The bound on that is known to be no more than 745: the value of BB(745) is independent of ZFC [1].
[1] https://scottaaronson.blog/?p=7388
That's a misinterpretation of what the article says. There is no actual bound in principle on what can be computed. There is a practical bound, likely around BB(10) for all intents and purposes, but in principle there is no finite value of n for which BB(n) is somehow mathematically unknowable.
ZFC is not some God-given axiomatic system; it just happens to be one that mathematicians in a very niche domain have settled on because almost all problems under investigation can be captured by it. Most working mathematicians don't really concern themselves with it one way or another, almost no mathematical proofs actually reference ZFC, and with respect to busy beavers it's not at all uncommon to extend ZFC with even more powerful axioms, such as large cardinal axioms, in order to investigate them.
Anyhow, I just want to dispel a common misconception that comes up: that there is somehow a limit in principle on the largest BB(n) that can be computed. There are practical limits for sure, but there is no limit in principle.
You can compute a number that is equal to BB(n), but you can't prove that it is the right number you are looking for. For any fixed set of axioms, you'll eventually run into an n large enough that the value of BB(n) becomes independent.
>You can compute a number that is equal to BB(n), but you can't prove that it is the right number you are looking for.
You can't categorically declare that something is unprovable. You can state that a proposition is independent of some particular formal theory, but you can't state that it is independent of all possible formal theories.
They didn't claim that. They claimed that any (sound and consistent) recursively axiomatizable theory (basically, any theory with a recursively enumerable set of theorems) can only prove finitely many true statements of the form BB(n) = N.
I quoted the specific statement that I refuted.
Only if your goalpost for what counts as "mathematics" keeps endlessly shifting. To prove values of BB(50000) you're probably going to need some pretty wacky axioms in your system. For BB(any large number), it's going to be infeasible to justify that the system isn't tailored to prove exactly that fact, just short of adding "BB(x) = y" as an axiom.
It's not "there exists an n such that for all theories", but "for all theories there exists an n", such that BB(n) eventually becomes independent.
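To spell out the quantifier order in notation (a sketch, with T ranging over consistent, recursively axiomatizable theories and k_n denoting the true value of BB(n)), the true claim is the first line, not the second:

    \forall T \; \exists n : \; T \nvdash \mathrm{BB}(n) = k_n   % true: the cutoff n depends on T
    \exists n \; \forall T : \; T \nvdash \mathrm{BB}(n) = k_n   % false: no single n works for every theory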
Once you can express the Collatz conjecture, you're already in the deep end.
Yes, but as far as I know, nobody has shown that the Collatz conjecture is anything other than a really hard problem. It isn't terribly difficult to imagine that the Collatz problem space, considered generally, encodes Turing-complete computations in some mathematically meaningful way (even when we don't explicitly construct them to be "computational"), but as far as I know that is pure conjecture. I have to imagine some non-trivial mathematical time has been spent on that question too, so it is itself a hard problem.
But there is definitely a place in the Busy Beaver sequence where your axiom systems become self-referential, and that is a qualitative change on its own. Aaronson and some of his students have put an upper bound on it; the only question is exactly how loose it is, not whether it is loose. The upper bound is in the hundreds, but at [1], in the 2nd-to-last paragraph, Scott Aaronson expresses his opinion that the true boundary could be as low as 7, 8, or 9, rather than in the hundreds.
[1]: https://scottaaronson.blog/?p=8972
https://people.cs.uchicago.edu/~simon/RES/collatz.pdf Generalized Collatz is uncomputable.
> particular values can be calculated
You need proofs of non-termination for the machines that don't halt. That isn't something you can brute-force.
If such proofs exist and are checkable by a proof assistant, you can brute-force them by enumerating programs in the proof assistant (with a comically large runtime). Therefore there is some small N for which no proof of BB(N) is checkable with existing proof assistants. Essentially, this paper proves that BB(5) was brute-forceable!
The most naive algorithm is to use the assistant to check whether each length-1 Coq program proves halting, with computation limited to 1 second, then check each length-2 Coq program running for 2 seconds, and so on, until the proofs in the arXiv paper are run for more than their runtime.
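A minimal sketch of that dovetailing search, assuming a hypothetical checks_out(script, timeout_seconds) wrapper around the proof checker (the toy two-letter alphabet stands in for the assistant's real syntax):

    # Dovetailed brute-force proof search: scripts of length n get a time
    # budget of n seconds. `checks_out` is a hypothetical wrapper around a
    # real proof checker.
    from itertools import product

    ALPHABET = "ab"  # toy stand-in for the proof assistant's syntax

    def checks_out(script, timeout_seconds):
        """Hypothetical: True iff the checker accepts `script` as a proof
        of the target statement within the time budget."""
        raise NotImplementedError

    def dovetail_search():
        n = 1
        while True:
            for chars in product(ALPHABET, repeat=n):
                script = "".join(chars)
                if checks_out(script, timeout_seconds=n):
                    return script  # a machine-checkable proof, found blindly
            n += 1  # longer scripts get proportionally more time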
From this perspective, couldn't you equally say that all formalized mathematics has been brute-forced, because you found working programs that prove your desired results and that are short enough that a human being could actually discover and use them?
... even though your actual method of discovering the programs in question was usually not purely exhaustive search (though it may have included some significant computer search components).
More precisely, we could say that if mathematicians are working in a formal system, they can't find any results that a computer with "sufficiently large" memory and runtime couldn't also find. Yet currently, human mathematicians are often more computationally efficient in practice than computer mathematicians, and humans often find results that resource-bounded computer mathematicians can't. This could very well change in the future!
Likewise, it was somewhat clear in principle that a naive tree-search algorithm for chess should be able to beat any human player, given "sufficiently large" memory and runtime (e.g. to exhaustively check 30 or 40 moves ahead). However, real humans were at least occasionally able to beat top computer programs at chess until about 2005. (The analogy isn't perfect, because the correctness or incorrectness of a proof within a formal system is clear-cut, while relative strength at chess is hard to be absolutely sure of.)
Not quite. There is some N for which you can't prove the value of BB(N) with any existing proof assistant, but you might be able to prove it by writing a new, more powerful proof assistant. However, the problem "check whether a new, sufficiently powerful proof assistant is correct" is not decidable.
The length of the proof (or of the machine checking it) can be much bigger than BB(n) itself. There can even be specific machines that don't halt but for which no proof of that exists at all - you can encode problems independent of ZFC in a few hundred states.
You can try them with simple short-loop detectors, or perhaps with the "tortoise and hare" method. If I did that and a friend asked me how I solved it, I'd call that "brute force".
They solved a lot of the machines with something like that, some with more advanced methods, and the "13 Sporadic Machines" that don't halt were solved with hand-coded individual proofs.
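For the "tortoise and hare" idea above, a minimal sketch in Python (my own toy encoding, not the paper's): if a deterministic machine ever repeats an exact configuration, it provably never halts, and Floyd's algorithm detects such repeats in constant memory. Note this only catches loops where the configuration repeats exactly; machines whose tape keeps growing need stronger deciders.

    # Floyd's tortoise-and-hare on Turing machine configurations (toy
    # encoding, illustrative only). An exact repeated configuration of a
    # deterministic machine proves non-halting.
    def step(config, table):
        """One TM step; returns None if the machine halts.
        config = (state, head, tape), with tape a dict of non-blank cells."""
        state, head, tape = config
        action = table.get((state, tape.get(head, 0)))
        if action is None:
            return None  # undefined transition: halt
        write, move, new_state = action
        new_tape = dict(tape)
        new_tape[head] = write
        return (new_state, head + move, new_tape)

    def freeze(config):
        """Hashable snapshot for configuration equality (blanks dropped)."""
        state, head, tape = config
        return (state, head, frozenset((p, s) for p, s in tape.items() if s != 0))

    def loops_forever(start, table, budget=10**6):
        """True: a configuration repeats, so the machine never halts.
        False: the machine halts. None: undecided within the budget."""
        tortoise = hare = start
        for _ in range(budget):
            tortoise = step(tortoise, table)  # advances 1 step per iteration
            hare = step(hare, table)          # advances 2 steps per iteration
            if hare is not None:
                hare = step(hare, table)
            if tortoise is None or hare is None:
                return False  # the machine halted
            if freeze(tortoise) == freeze(hare):
                return True   # exact repeat: runs forever
        return None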
They are at the very boundary of what is computable!