More

Straw · 2025-06-28T17:41:53 1751132513

Sure, if someone just gives you the number, ZFC can represent it. But ZFC cannot prove that the value is correct, so how do you know you have the right number? Use a stronger proof system? Go a bit bigger and same issue.

ajkjk · 2025-06-28T18:01:57 1751133717

Not an expert, but I've read about this a bit because it bothered me also and I think this is the answer:

Most of these 'uncomputable' problems are uncomputable in the sense of the halting problem: you can write down an algorithm that should compute them, but it might never halt. That's the sense in which BB(x) is uncomputable: you won't know if you're done ever, because you can't distinguish a machine that never halts from one that just hasn't halted yet (since it has an infinite number of states, you can't just wait for a loop).

So presumably the independence of a number from ZFC is like that also: you can't prove it's the value of BB(745) because you won't know if you've proved it; the only way to prove it is essentially to run those Turing machines until they stop and you'll never know if you're done.

I'm guessing that for the very small Turing machines there is not enough structure possible to encode whatever infinitely complex states end up being impossible to deduce halting from, so they end up being Collatz-like and then you can go prove things about them using math. As you add states the possible iteration steps go wild and eventually do stuff that is beyond ZFC to analyze.

So the finite value 745 isn't really where the infinity/uncomputability comes from-it comes from the infinite tape that can produce arbitrarily complex functions. (I wonder if over a certain number of states it becomes possible to encoding a larger Turing machine in the tape somehow, causing a sort of divergence to infinite complexity?)

Scarblac · 2025-06-28T18:43:41 1751136221

And also, if BB were computable, then it could be used to solve the halting problem: run the Turing machine of size n for BB(n) steps, and if it hasn't halted yet, it never will. So the BB function is clearly not computable.

But to me as a layman that seems true regardless of formal axioms chosen, but I guess I need to read that linked thesis.

ajkjk · 2025-06-28T18:46:00 1751136360

That is the standard argument for why BB is uncomputable for general n, but it's not the same as why BB(n) would be independent of ZFC for fixed n.

dtech · 2025-06-28T18:27:52 1751135272

I am also not an expert, but this does not sound right to me. Godel's incompleteness theorem shows that there are certain things that cannot be proven. Being independent of ZFC means that something is such a case. So BB(643) being independent of ZFC means that we cannot prove or disprove that a certain number is BB(643). Aka we don't have the math to know for certain.

SAI_Peregrinus · 2025-06-28T22:23:25 1751149405

Independence from ZFC means we can't prove that any given number is BB(643) using ZFC. It doesn't mean we can't prove it at all, e.g. one could use a stronger set theory like NBG which can prove the consistency of ZFC to verify the value of BB(643). But there would be some n for which BB(n) is independent of that set theory, requiring a yet-stronger theory, and so on ad infinitum.

ZF & ZFC are as important as they are because they're the weakest set theories capable of working as the foundations of mathematics that we've found. We can always add axioms, but taking axioms away & still having a usable theory on which to base mathematics is much more difficult.

ajkjk · 2025-06-29T03:30:29 1751167829

sure, but it is still very hard to wrap one's head around how the value of a function can be independent of ZFC, and how it could not be for (e.g.) 642 but then be true for 643. That was the point of my post. It seems like you could just... run the function on every 643-state input and see what the value is, which would in some sense constitute a "proof" in ZFC? but maybe not, because you wouldn't even know if you had the answer? That's the part that is so intriguing about it.

SAI_Peregrinus · 2025-06-30T14:30:44 1751293844

Some 643-state inputs never halt. Some 643-state inputs do eventually halt. Only if you can run them for infinite time can you determine whether a given machine halts in a finite length of time: for any finite time you pick, if the machine is still running it could still halt eventually. That's just the halting problem, the impossibility of solving it is quite famous and it's easy to find the proof stated more formally than I want to with the limits of HN's markdown.

The interesting bit is they were able to construct a machine that halts if ZFC is consistent. Since a consistent axiomatic system can never prove its own consistency (another famous proof) ZFC can't prove that this machine halts. And ZFC can't prove that it never halts without running it for infinite steps.

That ZFC-consistency-proving machine has 643 states, so BB(643) either halts after the ZFC-consistency-proving machine or the ZFC-consistency-proving machine never halts. If BB(643) halts after the ZFC-consistency-proving machine, then ZFC is consistent and ZFC can't prove BB(643) halts since ZFC can't prove the ZFC-consistency-proving machine halts.

Straw · 2025-06-30T22:18:55 1751321935

Don't you want the weakest (ie makes the fewest assumptions) theory that works?

SAI_Peregrinus · 2025-07-02T02:14:02 1751422442

Yes, which is why ZFC gets used. NBG & MK are stronger and occasionally used, but ZFC being weaker meant it got more popular since it's almost always good enough.

ajkjk · 2025-06-28T18:59:06 1751137146

Yeah, but the vexing part is "how can that be true for e.g. N=643 but not N=642"? What happens at whatever number it starts true at?

Incidentally, Gödel's theorem eventually comes down to a halting-like argument as well (well, a diagonal argument). There is a presentation of it that is in like less than one page in terms of the halting problem---all of the Gödel-numbering stuff is essentially an antiquated proof. I remember seeing this in a great paper which I can't find now, but it's also mentioned as an aside in this blog post: https://scottaaronson.blog/?p=710

wait jk I found it: https://arxiv.org/abs/1909.04569

LegionMammal978 · 2025-06-28T19:32:07 1751139127

> What happens at whatever number it starts true at?

Usually, "what happens" is that the machines become large enough to represent a form of induction too strong for the axioms to 'reason' about. It's a function of the axioms of your theory, and you can add more axioms to stave it off, but of course you can't prove that your new axioms are consistent without even more axioms.

> There is a presentation of it that is in like less than one page in terms of the halting problem---all of the Gödel-numbering stuff is essentially an antiquated proof.

Only insofar as you can put faith into the Church–Turing thesis to sort out all the technicalities of enumerating and verifying proofs. There still must be an encoding, just not the usual Gödel numbering.

thaumasiotes · 2025-06-28T19:50:31 1751140231

> Incidentally, Gödel's theorem eventually comes down to a halting-like argument as well (well, a diagonal argument).

> There is a presentation of it that is in like less than one page in terms of the halting problem

Those are two very different ideas. Your second sentence says that Gödel's theorem is easy to prove if you have results about the halting problem. Your first one says that in order to prove Gödel's theorem, you need to establish results about the halting problem.

ajkjk · 2025-06-28T20:53:00 1751143980

I'm saying that if you want to understand why Gödel's theorem is true, look at the one-paragraph proof based on the halting problem, not the like 20-page one with Gödel numbers.

wasabi991011 · 2025-06-29T17:35:56 1751218556

Wow that might be the best, most entertaining, and most elucidating academic article I've ever read. Thanks for sharing.

thaumasiotes · 2025-06-29T00:01:30 1751155290

> Most of these 'uncomputable' problems are uncomputable in the sense of the halting problem: you can write down an algorithm that should compute them, but it might never halt. That's the sense in which BB(x) is uncomputable: you won't know if you're done ever, because you can't distinguish a machine that never halts from one that just hasn't halted yet (since it has an infinite number of states, you can't just wait for a loop).

> So presumably the independence of a number from ZFC is like that also: you can't prove it's the value of BB(745) because you won't know if you've proved it; the only way to prove it is essentially to run those Turing machines until they stop and you'll never know if you're done.

These aren't similar ideas. You can't know if a machine that hasn't halted yet will ever halt. But you can easily know if a machine that has already halted was going to halt.

Independence is the second case. For the value of BB(x) to be independent of ZFC, one of two things must hold:

(1) ZFC is inconsistent, and therefore all statements are independent of it.

(2) ZFC is consistent with two different statements, "BB(x) = a" and "BB(x) = b" for two different a, b. This means that a disproof of either statement cannot exist.

This, in turn, means that there is no observation you could ever make that would distinguish between the values a and b (for the identity of BB(x)). No matter what you believe the value of BB(x) might secretly be, there are no consequences; nothing anywhere could ever change if the value turned out to be different. Because, if there were an observable consequence of the value being different, the hypothetical observation of that consequence would be a disproof of the value that didn't cause it, and no such disproof can exist.

Neither value, a or b, can be more true than the other as the answer to the question "what is BB(x)?". It doesn't make sense to consider that question to have an answer at all.

josephcsible · 2025-06-29T15:42:33 1751211753

> (2) ZFC is consistent with two different statements, "BB(x) = a" and "BB(x) = b" for two different a, b. This means that a disproof of either statement cannot exist.

> This, in turn, means that there is no observation you could ever make that would distinguish between the values a and b (for the identity of BB(x)). No matter what you believe the value of BB(x) might secretly be, there are no consequences; nothing anywhere could ever change if the value turned out to be different. Because, if there were an observable consequence of the value being different, the hypothetical observation of that consequence would be a disproof of the value that didn't cause it, and no such disproof can exist.

There's one part of this I don't understand. "BB(x) = n" means "there is at least one x-state Turing machine that halts after exactly n steps, and there are no x-state Turing machines that halt after more than n steps", right? Then why wouldn't this approach work (other than the numbers being way too big to actually do in this universe)? WLOG, assume a < b. Run all possible x-state Turing machines for b steps. If any halted on step b, then you've disproved "BB(x) = a". If not, then you've disproved "BB(x) = b".

adgjlsfhk1 · 2025-07-02T02:40:12 1751424012

The trick is that if none halt on `b` steps, you don't know that BB(x)<b. Specifically, if you have one TM that keeps going, you don't know whether that TM halts eventually or keeps going forever.

josephcsible · 2025-07-02T14:35:57 1751466957

Sure, you don't know whether BB(x) < b or BB(x) > b for that reason, but wouldn't you still know BB(x) ≠ b, and isn't that good enough?

Scarblac · 2025-06-29T08:33:00 1751185980

What happens if you take the larger of a and b and run all the Turing machines for that many steps?

Kranar · 2025-06-29T19:11:15 1751224275

Among all possible values of BB(n) for some fixed n, it's the smallest such value that is the true value.

The issue is that there is no way within ZFC to determine which value is the smallest.

thaumasiotes · 2025-06-29T16:56:18 1751216178

What are a and b?

josephcsible · 2025-06-30T20:01:31 1751313691

Does it matter? My reading is basically "if you have two distinct candidates, isn't that a way to always disprove at least one of them?"

Straw · 2025-06-30T22:23:16 1751322196

It likely comes from the smallest machine that someone has been able to construct that can diagonalize over all proofs in ZFC, or something similar.

lupire · 2025-06-28T18:27:44 1751135264

It has to come from a finite value (specifically, the amount of complexity that can be enocoded in 745 pieces of information https://turingmachinesimulator.com/shared/vgimygpuwi), because the finite size 745 with infinite tape leads to uncomputability, but the size 5 does not.

In a very real sense, a deep kind of infinite complexity can be generated from 745 objects of certain kind, but not from 5 objects of that kind..

Turing machines have infinite tape, not infinite state. The entire set of all halting machines of a given size collectively only use finite tape. Totally finite. Only (some of) the non-halting machines use infinite tape.

The problem is that we don't know in advance how large the (definitely finite) upper bound on the amount of tape all the size-N halting machines use, until after enough of them (one per known equivalence class) halt. And we don't know (in general) how to run all the halting ones until they halt, without also running a non-halting program for an unbounded amount of time.

TL:DR: unbounded is not infinite, but big enough to be a problem.

ajkjk · 2025-06-28T18:42:06 1751136126

I am aware it's an infinite tape and finite state (maybe I misspoke somewhere), as well as the halting machines using finite tape (because of course they do).

But the overall 'complexity' (at a timestep, say) is going to be due to the states and the tape together. The BB(5) example that was analyzed, iirc, was a Collatz-like problem (Aaronson describes it here: https://scottaaronson.blog/?p=8088 ). My interpretation of this is that:

1. collatz-like functions have a lot of complexity just due to math alone 2. 5 states turned out to be enough to "reach" that one that 3. more states means you're going to reach more possible Collatz-like functions (they don't have to be Collatz-like; it's just easier to think about them like that) 4. eventually you reach ones that ZFC cannot show to halt, because there is effectively no way to prove it other than running them, and then you would have to solve the halting problem.

The part that was helpful for me to be less unsettle by BB(745) being independent of the ZFC was the notion that it eventually boils down to a halting problem, and asking ZFC to "solve" it... which is more agreeable than the idea that "ZFC cannot compute a function that seems to be solvable by brute force".

thechao · 2025-06-28T18:18:33 1751134713

We need to distinguish between a computer that's equivalent to BB(n), and a computer big enough to compute the value of the number that is BB(n). By (terrible) analogy: a 4004 can be made to write a finite loop that describes how many FLOPs the number 1 supercomputer can compute without, itself, being able to usefully perform the computations of that supercomputer. (The 4004 will run out of memory/addressable disk space.) Similarly, we can no longer build decidable programs in ZFC that can compute the number BB(748). Scott is saying that they now think this "disassociation" might occur at BB(7)!

vhcr · 2025-06-29T05:10:22 1751173822

Nobody can give you that number, because it's way bigger than what can be represented in the universe.

Straw · 2025-06-28T17:39:45 1751132385

The category error is in thinking that BB(748) is in fact, a number. It's merely a mathematical concept.

jerf · 2025-06-28T18:00:39 1751133639

No, that's one of the freakiest things about things like the Busy Beaver function. There is an exact integer that BB(748) defines. You can add one to it and then it would no longer be that number anymore.

If you are refering to the idea that nothing that can't exist in the real universe "really exists", then the "Busy Beaver" portion of that idea is extraneous, as 100% of integers can't exist in the real universe, and therefore, 100% of integers are equally just "mathematical concepts". That one of them is identified by BB(748) isn't a particularly important aspect. But certainly, a very specific number is identified by that designation, though nothing in this universe is going to know what it is in any meaningful sense.

Straw · 2025-06-30T21:58:02 1751320682

You say the busy beaver function is a function. But I can claim it's not, because you cannot make it constructively- in constructive analysis, all functions are computable.

Many other numbers and functions are computable, including e, pi, 10^100, etc- these are fundamentally different than BB.

So in what sense is it actually a number? There is no algorithm which can resolve questions such as BB(748) < x given x. That doesn't seem like a number to me!

In fact, for some x, such questions will depend on the consistency of ZFC. All normal math we do is expressible in ZFC, but by incompleteness, ZFC cannot prove it's own consistency or is inconsistent. So, we cannot really ever know the value, we can only ever find lower bounds. Does this seem like a number to you? It's not in the English sense and neither is it in what I would consider a reasonable definition of numbers you actually encounter, the computable numbers. Real numbers are in fact, not very real at all.

perthmad · 2025-06-28T18:29:54 1751135394

This integer only exists if you assume classical logic. Otherwise, there is no such integer a priori, and actually there is none in general.

jerf · 2025-06-28T19:54:00 1751140440

I'm fairly certain that's wrong, and I see a couple of other people may be making that mistake elsewhere in this conversation too. A Turing Machine is a Turing Machine. The execution trace of a Turing Machine is fully determined by its ruleset and its initial input. It doesn't matter which axioms you "take", nor does it matter what the intent of the initial construction of the 748-state machine was, or indeed even if that proof is somehow flawed. The definition of a Turing Machine is effectively the axiom set for this particular case. There is a finite set of 748-state Turing machines, and it is absolutely the case that there is a set of them that loop infinitely, the complementary set that do not, and that there is a maximum length amoung the set that do not. There is no situation where "the next step" of the Turing Machine "depends on your axioms" and could thereby be affected by such a decision.

For that to be the case, there would have to be some symbol under the tape and some state the machine is in for which the action the machine takes and the next state it goes to would depend on the axioms taken somehow. There is no place where the Turing Machine has somehow been running for so long and just gotten so large that its behavior becomes non-deterministic somehow.

What this means is that even if we lived in a universe where we had the unfathomable resources to actually have this number somehow meaningfully "in hand", we would be unable to prove that it was the correct one with just ZFC. Maybe one of the really quite numerous other machines still spinning away would in fact terminate in the future and be the real BB winner, because even this staggeringly monstrously large universe is still piddling nothing next to infinity and the other machines still require infinite resources to run them to discover they never terminate. But that doesn't do anything to affect whether or not there in fact is a single concrete integer that corresponds to BB(748).

Although one imagines that any universe with the resources to "have" BB(748) in it might also have some much more powerful axiom systems to play with in the process. The amount of computational power this universe apparently possesses is beyond all comprehension and who knows what they could know. But even if they used a more powerful system, it wouldn't change what BB(748) is... it just might affect whether or not they were correct about it.

LegionMammal978 · 2025-06-28T20:15:57 1751141757

> There is no situation where "the next step" of the Turing Machine "depends on your axioms" and could thereby be affected by such a decision.

That's easy, you just have to be an ultrafinitist, and say, "The definition of a TM presupposes an infinite set of natural numbers for time steps and tape configurations. But there aren't actually infinitely many natural numbers, infinitely long executions, arbitrarily long proofs, etc., outside of the formalism. If a formal statement and its negation do not differ regarding any natural numbers small enough to actually exist (in whatever sense), then neither is more true than the other." In particular, consistency statements may have no definite truth value, if the hypothetical proof of an inconsistency would be too large.

Of course, metamathematics tells us "you can't do that, in principle you could tell the lie if you wrote out the whole proof!" But that principle also presupposes the existence of arbitrarily-long proofs.

(Personally, hearing some of the arguments people make about BB numbers, I've become attracted to agnosticism toward ultrafinitist ideas.)

jerf · 2025-06-28T20:32:53 1751142773

To be honest I'm not even particularly impressed by that line of reasoning because even if you accept ultrafinitism, there's still a definite integer that it corresponds to. You can deny the "existence" of integers, and thus that the number "exists", but that's contingent on your definition of "existence". It doesn't change what it would be if it did exist.

Plus, ultafinitism is essentially relative to the universe you find yourself in. I hypothesized a universe in which BB(748) could actually exist, but you can equally hypothesize ones in which not only can it exist, it exists comfortably and is considered a small number by its denizens. We can't conceive of such a thing but there's no particular a priori reason to suppose it couldn't exist. If such a universe does actually "exist" does that mean our ultrafinitism is wrong? I'm actually a sort of a proponent of knowing whether your operating in a math space that corresponds to the universe (see also constructive mathematics), but concretely declaring that nothing could possibly exist that doesn't fit into our universe is a philosophical statement, not a mathematical one.

LegionMammal978 · 2025-06-28T20:46:31 1751143591

> there's still a definite integer that it corresponds to.

The formalism says that there's still a definite integer that it corresponds to. The ultrafinitist would deny that the formalism keeps capturing truth past where we've verified it to be true, or some unknown distance farther.

> I hypothesized a universe in which BB(748) could actually exist, but you can equally hypothesize ones in which not only can it exist, it exists comfortably and is considered a small number by its denizens.

Sure, but the ultrafinitist would argue, "All this is still just a shallow hypothesis: you've said the words, but that's not enough to breathe much 'life' into the concept. It is but the simplest of approximations that can fit into our heads, and such large things (if they could exist) would likely have an entirely different nature that is incomprehensible to us."

> We can't conceive of such a thing but there's no particular a priori reason to suppose it couldn't exist.

That's why I wouldn't call myself an ultrafinitist, but would prefer an agnostic approach. There may be no great a priori reason to suppose it cannot exist, but I similarly do not see any such reason it must necessarily exist. We empirically notice that our formalism works for numbers small enough to work with, and we pragmatically round it off to "this formalism is true", but one could argue that surprising claims about huge numbers need stronger support than mere pragmatism.

nyssos · 2025-06-28T18:47:43 1751136463

Classical logic is the presumed default for mathematics, if someone is working in a different system they will say so explicitly.

gylterud · 2025-06-28T19:09:04 1751137744

Pondering mathematical objects such as BB(n) is exactly the kind of stuff which rooks one’s faith in classical logic.

Dylan16807 · 2025-06-28T18:55:21 1751136921

> that's one of the freakiest things about things like the Busy Beaver function

Every sentence ever spoken and every view ever looked at is also a number. It's not a freaky thing about "things like" busy beaver, it's a freaky thing about the concept of information.

But even though everything is a number, saying "it's crazy that a number can be X" is usually someone making a mistake, using the everyday concept of numbers in their head. If you replace "a number" with "some text and code and data", people wouldn't say it's surprising that "some text and code and data" can be unprovable in ZFC.

Technically a photograph is a number, but primarily it's something else. BB(748) is the same, technically a number but primarily it's a series of detailed computer calculations.

gnramires · 2025-06-28T20:15:24 1751141724

> Every sentence ever spoken and every view ever looked at is also a number. It's not a freaky thing about "things like" busy beaver, it's a freaky thing about the concept of information.

I'd say that's a bit of a wrong or misleading statement. I think the correct version is "everything[1] can be encoded as a number". The concept of number is a very particular concept! It's pretty absurd to say "a screwdriver is a number" or "a word is a number". That is true for the peano axiomatization of numbers; but to me in particular, I believe numbers are a generalization (and formalization) of the idea or concept of quantity. There's a particular idea that refers to say 'two' apples, the quantity of apples. A word is not a quantity, it's a different concept. Even though each of them could be encoded as a number somehow!

[1]: Everything that we believe to be finite and of interest, that is. We don't know presently anything that could be used in reality (a music, picture, etc.) that can't in principle be encoded as a large enough number.

I think this is quite interesting, because this encoding is critical, and it completes the system. You essentially need a machine to turn things into numbers and numbers into things; and this is unavoidable. You can actually encode this machine itself with numbers! This number (which encodes this transcoding machine) can even be decoded by its own machine! But we cannot actually avoid the machine itself, some actual realization in the real world, because any number, in order to represent something, can only be translated by one "machine" (which can be essentially a computer, or a mind, etc.).

Instead of thinking of machines, you can also think of conventions. So you can have a convention that say the number '5' encodes the concept 'word', or maybe it simply encodes the string of letters "word" ("w"+"o"+"r"+"d"). But the convention interpretation isn't complete, because you still need someone, or something, to interpret this convention in practice and potentially turn concepts into reality, or simply manipulate those concepts in significant and useful ways.

Some more examples: (1) you can encode objects by describing a series of solid operations, essentially CAD modelling, so you have numbers that represent solids. The machine that interprets this number and is able to translate it for example into a picture, a series of instructions to be interpreted by a 3d printer, of performs operations (inferences) about the relevant solid model (for example, a structural analysis) is your "machine", i.e. your software, without which a number, or string of bits by itself doesn't mean anything (except the quantity associated with that binary number, perhaps), and again this encoding or number is essentially arbitrary, it could be very different. (2) a JPEG file for example encodes an image that is read by a software stack (jpeg decoder+picture viewer+operating system+display driver) and forwarded to your monitor to be viewed as a pixel array. Again the string of bits associated with any image could in principle represent anything else representable.

Information (Shannon information in particular) simply implies the encoding possible.

It's really interesting that a lot of the time we are performing essentially translations between different representations of a thing: a series of bits into states of pixels on a screen (a picture), [a series of bits] into a 3d printed object, into a visualization of an object on a screen, etc. (one way), or a reading of a camera sensor (photograph) into a series of bits, a conception of an object (3d modelling), a conception of a story (writing), etc. (the other way). We of course can (and must for them to be built of course) conceptualize those "machines" themselves (e.g. the software part), represent them in some way (our encoding), and then turn this representation into a realization of those machines (a software, a piece of hardware, or just a representation convention, etc.).

In other words, the mind or computer itself is always an integral part of the process, and information in a vacuum doesn't represent anything necessarily.

Finally, most of what we do is some kind of translation, inference, and construction -- everything to assist our lives. Of course some "machines" are capable of generating new concepts, those are very interesting "machines" :)

Dylan16807 · 2025-06-28T21:12:47 1751145167

> I'd say that's a bit of a wrong or misleading statement. I think the correct version is "everything[1] can be encoded as a number". The concept of number is a very particular concept! It's pretty absurd to say "a screwdriver is a number" or "a word is a number". That is true for the peano axiomatization of numbers; but to me in particular, I believe numbers are a generalization (and formalization) of the idea or concept of quantity. There's a particular idea that refers to say 'two' apples, the quantity of apples. A word is not a quantity, it's a different concept. Even though each of them could be encoded as a number somehow!

Well if we're using a more narrow view, then "BB(748)" isn't a number, it's an encoding of a partial algorithm. And it shouldn't be surprising that an algorithm might be unprovable in ZFC.

The actual number, the quantity, is quite easy to write down inside ZFC. And so is the beaver turing machine itself. The hard part is knowing which of the 748-state machines is the beaver.

Scarblac · 2025-06-28T18:47:36 1751136456

There is a finite number of Turing machines of size 748. The number of them that eventually halt is thus also finite, and BB(748) is the highest number of steps in the finite list of how many steps each took to halt. It has to be a number.

We just can't prove which number it is, we don't know which of the machines halt.

bmacho · 2025-06-28T19:14:22 1751138062

Let S be a statement. S is called semidecidible (also: Turing recognizable, most commonly "recursively enumerable", abbreviated as "r.e.", but I hate that one) if there is a Turing machine that halts if and only if S is true.

With this definition, we can say that "ZFC is inconsistent" is semidecidible: you run a program that searches for a contradiction.

The question BB(748) =/= 1000 is similarly semidecidable. You can run a program that will rule out 1000 if it is not BB(748).

So they are in the same "category", at least regarding their undecidability.

Also, if you turn "ZFC is consistent" into a number: {1 if ZFC is consistent; 0 if ZFC is inconsistent}, you will see, that BB(748) is not very much different, both are defined (well, equivalently) using the halting of Turing machines, or, the result of an infinite search.

Straw · 2025-06-30T22:15:48 1751321748

Yes, I would say that neither is really a number in the traditional sense of the word, nor in constructive analysis.

gylterud · 2025-06-28T19:04:00 1751137440

A constructive mathematician would indeed deny that BB(748) is a well defined number. One could define it as a predicate on natural numbers, but lest we find a contradiction in ZFC we cannot hope to constructively prove that it holds for any number.

Almondsetat · 2025-06-28T17:45:36 1751132736

As if numbers weren't merely mathematical concepts

Straw · 2025-06-30T22:21:29 1751322089

To downvoters:

I'm well aware that BB(748) is an integer definable in classical logic. My claim is that "integer definable in classical logic" does not actually correspond well to what people mean by "number" in almost any other setting when pushed to extremes such as this.

dtech · 2025-06-28T18:31:31 1751135491

It's as much a number as 12

lupire · 2025-06-28T18:39:17 1751135957

Only if you believe that a number you can't count is a number. You can believe that, but it's a leap.

falcor84 · 2025-06-28T19:06:38 1751137598

Couldn't you make the same argument for sqrt(2), or better yet for zero [0]?

[0] https://en.wikipedia.org/wiki/Zero:_The_Biography_of_a_Dange...

Dylan16807 · 2025-06-28T19:11:39 1751137899

For sqrt(2) I can tell you the order of magnitude and output as many digits as you want. I think that's plenty specific for this use case.

For zero I can not only do that, I can also count to it if you let me count both up and down, which seems like a very simple ask.

falcor84 · 2025-06-28T22:17:00 1751149020

But that's the thing - each generation struggles with whether some new thing is a number. We're typically very inclusive, accepting imaginary numbers and even weirder things like surreal numbers, which we definitely can't count.

But as someone in this generation, I see a good argument for rejecting the big busy beaver numbers, which are provably outside of the realm of calculating with all the resources of our universe's runtime, from being fully accepted as numbers, any more than the first uninteresting number [0].

[0] https://en.wikipedia.org/wiki/Interesting_number_paradox

Straw · 2025-06-30T22:29:43 1751322583

sqrt(2), and pretty much everything else you can think if, is computable- there's a program that can output rational numbers arbitrarily close.

BB(n) is not.

Straw · 2025-06-28T17:37:38 1751132258

Yes, that's right, dividing by that ratio essentially barely affects the number in a sense that 'adjacent' numbers in that notation give a much bigger change.

10↑↑10,000,000 / (sand grains per universe) is vastly larger than, say, 10↑↑9,999,999

So on system we're using to write these numbers, there's really no better way to write (very big)/ (only universally big) than by writing exactly that, and then in the notation for very big, it pretty much rounds to just (very big).

Straw · 2025-06-24T12:46:57 1750769217

You can do all of these in terms of matmul to some extent:

Solving AX=B can be done with Newton's method to invert A, which boils down to matmuls.

Matrix exponential is normally done with matmuls- the scale down, Taylor/Pade and square approach.

Why do you need Cholesky? It's typically a means to an end, and when matmul is your primitive, you reach for it much less often.

Eigendecomposition is hard. If we limit ourselves to symmetric, we could use a blocked Jacobi algorithm where we run a non-matmul Jacobi to do 128x128 off-diagonal blocks and then use the matmul unit to apply to the whole matrix- for large enough matrices, still bottlenecked on matmul.

SVD we can get from Polar decomposition, which has purely-matmul iterations, and symmetric eigendecomposition.

One does have to watch out for numerical stability and precision very carefully when doing all these!

RossBencina · 2025-07-05T09:46:26 1751708786

Cholesky for generating so-called sigma points in the Unscented Transformation.

Straw · 2025-06-15T14:32:37 1749997957

This post mischaracterizes AlphaZero/MuZero.

AlphaZero/MuZero are not model based, and aren't on policy either. They train at a significantly higher temperature, and thus intentional suboptimal play, than they use when playing to win. LeelaChessZero has further improvements to reduce the bias on training on suboptimal play.

There's a well known tradeoff in TD learning based on how many steps ahead you look- 1-step TD converges off policy, but can give you total nonsense/high bias when your Q function isn't trained. Many-step can't give you nonsense because it scores based on the real result, but that real result came from suboptimal play so your own off-policyness biases the results, plus it's higher variance. It's not hard to adjust between these two in AlphaZero training as you progress to minimize overall bias/variance. (As in, AlphaZero can easily do this- I'm not saying the tuning of the schedule of how to do it is easy!)

bee_rider · 2025-06-15T16:00:10 1750003210

It is the weekend, so let’s anthropomorphize.

The idea of sub-optimal play to increase learning is interesting. We can notice the human social phenomenon of being annoyed at players who play games “too mechanically” or boringly, and admiration of players (even in an overly studied game like chess) with an “intuitive” style.

I wonder how AI training strategies would change if the number of games they were allowed to play was fairly limited, to the handful of thousands of matches of a game that a person might play over the course of their lives. And perhaps if their “rank” was evaluated over the course of their training, like it is for humans.

Straw · 2025-06-15T17:05:19 1750007119

Its simpler than that- if you always play what you believe is optimal, you will never explore strategies that may in fact perform better.

edhu2017 · 2025-06-17T00:52:04 1750121524

Care to elaborate how AZ/MZ are not model based? MuZero learns a transition function. AZ uses a known transition model to do planning, but I would still say that's model-based RL.

Straw · 2025-06-19T13:51:46 1750341106

AZ uses a known transition model. I wouldn't call this model-based, because when people say model-based they usually mean learning a world model.

MuZero doesn't actually learn a state transition function- it only models reward. It's a model that predicts reward given an action sequence. I suppose we could consider this a hybrid that's slightly model based because it does have an incentive to learn what's going on. But there's no feedback where we predict future world states and feed that back in.

Straw · 2025-06-14T11:54:09 1749902049

You might find it interesting to know that in the mid 1800s, in the UK, during a 10-year period, private investment in railways totaled 40% of GDP- an average of 4% per year. For comparison, the Apollo program was <0.5% of GDP.

The list of enormously valuable privately developed technologies goes on and on- in fact, almost all technologies. Many of these developed in the 1800s and early 1900s when there was very little government spending on research/etc. For example, oil, railways, cars, etc.

Even in cases where government research did help, often the practical realization takes a lot of money!

P.S. Companies are not necessarily collectives. It depends on the ownership structure.

Straw · on May 11, 2025

I'd be very interested to see a state-size capacity analysis in the style of PCG- if you make cut down versions of your generator with reduced state size, say by reducing the word size of all three words of state, how low can you go while still passing PractRand/BigCrush? This gives a much better idea of how "close" to danger you are than simply passing.

Basically any generator with a 192 bit state can pass BigCrush/PranctRand- even known terrible ones like middle square!

https://www.pcg-random.org/posts/too-big-to-fail.html

the_othernet · on May 11, 2025

I'll have to re-optimize the rotations for the smaller state size, but I will do it and report here.

Straw · on May 11, 2025

I'm looking forward to seeing your results!

Here are corresponding results for LCGs, interestingly there's a clear affine relationship between state size and log(bytes to practrand failure).

https://www.pcg-random.org/posts/does-it-beat-the-minimal-st...

Veedrac · on May 11, 2025

Fun fact, you can extend this logic to measure the upper limit for how discriminating your RNG's test is, by shuffling a 2^n long vector with a CSRPNG and testing that until failure the same way. When I tried this ~8y ago, I believe these tested about 2 bits better on PractRand than PCG for 32 bit outputs. Sadly it's not really plausible to run this algorithm for 64 bit outputs.

The main thing this knowledge is worth when designing RNGs is that it tells you how reasonable it is to 'lose' N bits. Eg. if you're 2 bits worse than PCG, you're about twice as much worse than ideal.

the_othernet · on May 11, 2025

Using only 32bit fast_loop and mix (for 64bits of state) passes PractRand up to 256GB with only one unusual. That's about a 90 bit state LCG equivalent?

I did have to alter the output to be "(GR * mix) + fast_loop" and change the rotation constants to be 12 and 5.

the_othernet · on May 11, 2025

Super useful link. Thank you!

Straw · on May 1, 2025

Why does knowing your front door color have anything to do with hiring? You might just be someone who's very focused on things, so much so that you ignore the environment around you to focus!

WalterBright · on May 2, 2025

Read the article. It's about detecting laptop farms.

Straw · on May 2, 2025

I understand that, my claim is that you'd get false negatives- people who aren't laptop farm users but don't know the color of their front door and aren't at home to check.

Straw · on April 21, 2025

Lisp macros allow a general solution to this that doesn't just handle chained collection operators but allows you to decide the order in which you write any chain of calls.

For example, we can write: (foo (bar (baz x))) as (-> x baz bar foo)

If there are additional arguments, we can accommodate those too: (sin (* x pi) as (-> x (* pi) sin)

Where expression so far gets inserted as the first argument to any form. If you want it inserted as the last argument, you can use ->> instead:

(filter positive? (map sin x)) as (->> x (map sin) (filter positive?))

You can also get full control of where to place the previous expression using as->.

Full details at https://clojure.org/guides/threading_macros

gleenn · on April 21, 2025

I find the threading operators in Clojure bring much joy and increase readability. I think it's interesting because it makes me actually consider function argument order much more because I want to increase opportunities to use them.

aeonik · on April 22, 2025

These threading macros can increase performance, the developer even has a parallelizing threading macro.

I use these with xforms transducers.

https://github.com/johnmn3/injest

benrutter · on April 22, 2025

Yeah, I found this when I was playing around with Hy a while back. I wanted a generic `->` style operator, and isn't wasn't too much trouble to write a macro to introduce one.

That's sort of an argument for the existence of macros as a whole, you can't really do this as neatly in something like python (although I've tried) - I can see the downside of working in a codebase with hundreds of these kind of custom language features though.

sooheon · on April 22, 2025

Yes threading macros are so much nicer than method chaining, because it allows general function reuse, rather than being limited to the methods that happen to be defined in your initial data object.

Straw · on April 10, 2025

Adam Smith would be considered a right wing economist today, similar to Milton Friedman, perhaps with more support for certain government functions such as education. However, even on that issue, given the wildly different level of education available today, its not easy to predict what he would think.

Often quotations from him are taken wildly out of context to support a 'progressive' viewpoint. An example (From David Friedman's blog):

    "People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices.”

But, if we keep reading:

    It is impossible indeed to prevent such meetings, by any law which either could be executed, or would be consistent with liberty and justice. But though the law cannot hinder people of the same trade from sometimes assembling together, it ought to do nothing to facilitate such assemblies, much less to render them necessary.

Adam Smith was a proponent of free trade, free markets, and flat taxation (proportional to income). Not very socialist.

(https://daviddfriedman.substack.com/p/discovering-what-is-tr...)