Hacker News new | past | comments | ask | show | jobs | submit login
Weakening Cycles So That Turing Can Halt (jondgoodwin.com)
96 points by Kinrany on May 14, 2020 | hide | past | favorite | 68 comments



I feel like this is circling around total functional programming. I found this to be an accessible paper on this:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106...

The language it uses enforces that all recursive steps "simplify" the input, and gives a simple definition of "simplify".

Note that the language is not (nor does it try to be) Turing complete - intuitively you could imagine that there might be some level of expressiveness which is not enough to get to halting problem like levels of complexity, but enough to get "real world" complexity.


Total functional programming indeed is another example of a valid approach to weakening cycles and thereby guaranteeing termination. Although it works for that, it is too restrictive for my taste. I am looking for less restrictive forms of weakening that allow creation of a broader range of algorithms (including "imperative") that we can still guarantee will terminate.


> Although it works for that, it is too restrictive for my taste. I am looking for less restrictive forms of weakening that allow creation of a broader range of algorithms (including "imperative") that we can still guarantee will terminate.

How is it weak? Surely it's equivalent to any other approach: given a function and any form of termination guarantee for that function, you can encode that guarantee as a phantom type and then you have an expression of that function in a total functional language.


I mean, how long does it take to convince yourself a function will halt vs how long does it take to use implement it with well-founded recursion, with comparable readability? Not that it's not possible, it's just more of a pain.


The article uses an intuitive approach trying to find a way to crack the halting problem.

But the truth is that halting dilemma is real and despite any attempt to "unwind" or "simplify" the control and data flows, there is a real barrier. It is in the math.

My favorite example: finding an odd perfect number. Euler stated: "Whether (...) there are any odd perfect numbers is a most difficult question". More recently, Carl Pomerance has presented a heuristic argument suggesting that indeed no odd perfect number should exist. [1]

Does it exist? Definitely. Somewhere in 10^300 range. What is the exact value? Nobody knows. You can write a program that finds an odd perfect number:

    def odd_perfect_number():
        n = 1
        while True:
            if sum(filter(lambda x: n % x == 0, xrange(1, n))) == n:
                return "OK! I FIND IT!"
            n += 2
Will this program ever complete?

Maybe. However nobody can say for sure. Avoiding recursion or some kind of groups won't give you any favor.

This is the halting problem in all of its glory.

[1] https://en.wikipedia.org/wiki/Perfect_number


The best way the halting problem was explained to me was to think of it like this: imagine you write your perfect halting detection program. It reads in the contents of some program, and returns true if it halts or false if it doesn't.

Now, all I have to do is write a program that includes an embedded version of your program. My program reads in the contents of itself and passes it to the embedded program. Now, all my program has to do is take the output of the embedded halt checker, and do the opposite; if the halt checker says the program halts, just keep looping. If it says I don't halt, exit immediately.

Now, you might think this is cheating, because it is using your own program against you. However, this is at the core of the problem; it has to work for ALL programs, even ones written with knowledge of your program.


But why do I have to accept your arbitrary program? Why couldn't I only consider programs that come with a proof that they terminate? I don't care how you made the proof, maybe you used some restrictive programming language that can only express concepts that terminate, or your program implements paxos and you prove it by some embedding into Coq. Some restrictions though: the proof has to be formulated in a way that's compact and takes linear-time to check [1], and the proof goes down to some formalization of first-order logic that we can agree on.

[1]: Yes, I think this is possible. [2]

[2]: https://github.com/digama0/mm0#introduction


After a bit of searching, proof carrying code seems to be related to what I'm describing [3].

So I guess my point is: Does it really matter that you can come up with arbitrary programs that I will be unable to determine if they halt, if I can build up an entire OS platform of provably correct machine code with correctness proofs that I can scan out of a file to verify?

Ultimately, here's the point of re-framing it like this in the first place:

Now it's just a tooling problem.

I recognize that this is a very big 'just'.

[3] https://people.eecs.berkeley.edu/~necula/pcc.html


Doesn't this assume that your program can reliably access it's own source-code? Can it? It can of course read some file containing some source-code. But assume it accesses some file which contains the source-code of some other program, then wouldn't your logic still hold the same, since from the point of your program it is simply reading some file?


Doesn't matter. Doesn't have to be the executing source, can work with compiled representations; the original Turing paper "compiles" every program to a single giant integer.


No, it has nothing to do with source code.

The halting verifier program V never accesses itself. It takes two parameters: a program to check for halting (P) and an input to the program (I). It's job is to test whether P(I) (the program applied to the input) will halt.

To trick the halting verifier V we construct a diabolic P like this:

  program P(I) {
    if (V(I, I) == YES) {    // if Verifier says I(I) halts
      inifinite_loop();      // ... then do *not* halt
    } else {
      halt();                // ... otherwise halt
    }
  }
Like V, P is a special program that takes a program as input. It then asks the verifier: if the input program is given itself as input, does it halt? And then it behaves in the opposite manner: if the verifier says that I(I) halts, then P deliberately infinite-loops. Otherwise P halts.

Given that P, we ask the verifier the devastating question: V(P, P).

Does P halt if given itself as input; does P(P) halt?

V cannot decide. Note that if we consider P(P), then it means that in the above program, the input parameter I becomes P itself: I = P. And so the V(I, I) == YES expression is actually testing V(P, P) == YES.

But V(P, P) is the devastating question itself!

If V(P, P) yields YES (P(P) halts) P goes into an infinite loop (P(P) in fact fails to halt), and so V has calculated the wrong answer: the actual behavior of P(P) contradicts the YES answer from V(P, P), rendering V incorrect. The same problem occurs if V(P, P) yields NO.

Note that P contains no self refererence. The fact that P is applied to itself as P(P) doesn't come from P; it comes from the way P is used. P is used that way when we ask the verifier V(P, P). The verifier itself is checking what happens if P is given itself as input. P can happily accept other programs as the input I, programs which are not itself.

What P needs is a way to embed V, because it relies on it. The assumption is that whatever halting verifier V someone tries to develop, it has to be public. So that is to say, some CS researcher claims he or she has developed a universal halting verifier algorithm V and they publish it. Then, promptly, their adversary takes the algorithm V and sticks it into their program P(I) as a subroutine, and has an instant test case which defeats V.

The tricky thing is not self-reference, but the fact that P contains the verifier and turns it against itself. The self-reference is instigated from the outside by choice of inputs. In other words, the devastating question V(P, P) is what perpetrates the self-reference which breaks V.


Even if the code isn't public... it still exists. And since we are talking about "does their exist any possible program it can't determine", we know that some sequence of bytes will be that program, and therefore it is possible to exist.


Even if it can't access its source code, you can... you know what the code is


Author here. Cool story about odd perfect examples. That said, my post is not trying to find a way to crack the halting problem, and thereby demonstrate it is not real. Turing wrote a most excellent proof, and I stand by it wholeheartedly. I even cite another famous numeric example of a program we still do not know if it will terminate for all numbers: the Collatz conjecture.

Notice my description: "Turing proved that no general algorithm can be formulated that can answer this question across all possible programs." The operative part here is "across ALL possible programs". The proof does stop us from knowing that SOME programs will or will not halt. It only stops us from being able to determine this for EVERY possible program. My post explores a very specific subset of programs that we can prove will terminate, then asks what pattern(s) underlie such programs, and then explores the use of such patterns in a variety of interesting problems. It is this last result that most intrigued me, and caused me to write about it.


Important mistype. It should read: "The proof does NOT stop us from knowing that SOME programs will or will not halt.


You may also be interested in Willard's Self-Verifying Theories: https://en.wikipedia.org/wiki/Self-verifying_theories


It does however apply to more properties than merely halting: https://en.wikipedia.org/wiki/Rice's_theorem


A bit tangential, but turings proof only works with some very strict assumptions.

- Clasical logic as the foundation for math.

- Proofs by contradiction work.

- That the church turing thesis holds.

I guess these are kinda two sides of the same coin. But still worth noting.


I believe that it's not a proof by contradiction, but a direct proof of a negative statement. Subtle distinction:

Direct proof of negative statement: Assuming P is true, there is a contradiction. Thus, P is false.

Proof by contradiction: Assuming P is not true, there is a contradiction. Thus, P is true.

The key bit is that in constructive mathematics, "P is not true" does not imply "P is false."


After some digging I found several blogs that make the same distinction as you do.

However, basically all of literature calls the undecidability of the halting problem a proof by contradiction.

So I guess that there are some intuitionists that make that distinction. I'm not sure that I'd call it a direct or constructive proof though.

The thing is that if you follow constructive reasoning and you assume that the church turing thesis is true, then all of math becomes somewhat small. No more uncountably infinite stuff, no dense reals, computable objects only e.t.c.

I feel like we've painted mathematics into a dogmatic corner, and tbh I have no idea how to get out of it.


Sorry to say, this example has nothing to do with the halting problem. Rather, I would say, it's a reformulation of a(n unsolved) mathematical problem to the question whether a certain function halts on a certain input (being "no input", as the function is nullary). Or, if you insist, we can regard it as a trivial instance of the halting problem.

An instance of the halting problem asks, given (a partial recursive) function F, whether there is an decision algorithm A (aka. Boolean valued algorithm) on the universe of inputs U of F that calculates the domain of F (those inputs where F takes a value, ie. halts; ie, A(x) <=> F halts on x, where x in U).

For any finite subset S = {s_1, ..., s_n} of U, there is a trivial decision algorithm on U that calculates S:

  def A_S(x):
     for s in (s_1, ..., s_n):
         if x == s:
             return True
     return False
Therefore if the domain D of F is finite, the answer to the halting problem of F is trivially yes: A_D is a decision algorithm for F. (This has noting to do with the question whether we know D or not!)

If U is finite, then D is finite, so the answer to the halting problem of F is trivially yes.

If F is a thunk (nullary function), then U is finite, ie. the singleton set of "no input", so the answer to the halting problem of F is trivially yes.

odd_perfect_number is a thunk.


> An instance of the halting problem asks, given (a partial recursive) function F, whether there is an decision algorithm A

> so the answer to the halting problem of F is trivially yes.

The answer is "yes, there is a decision algorithm"? What would that decision algorithm be for this particular function (odd_perfect_number)? And what is its answer?


> The answer is "yes, there is a decision algorithm"?

Indeed.

> What would that decision algorithm be for this particular function (odd_perfect_number)?

Either of

  () => True
or

  () => False
where '()' represents the empty input.

> And what is its answer?

I'd be a Fields medal candidate if I knew ;)


> Does it exist? Definitely.

I am unsure how you reached this conclusion?


Interesting. Nitpick: Odd perfect numbers are no more in the 10^300 range but nowl provably beyond 10^1500. The Wikipedia links to a math paper to that extend.


If it definitely exists, don't we then know that this program will definitely halt?


Only from algebraic and philosophical standpoints. This knowledge is not 100% precise unless a specific number is found.


Nobody cares about the halting problem from a specific standpoint. If it's algebraically guaranteed to halt, it halts. When it halts is entirely meaningless for purposes of the halting problem.

But the thing is, it isn't algebraically guaranteed either - we still don't know if an odd perfect number exists. (I'm not sure where the "Definitely. Somewhere in 10^300 range." is coming from. There's nothing definite about the existence, only constraints that need to be satisfied if it exists)


> Nobody cares about the halting problem from a specific standpoint. If it's algebraically guaranteed to halt, it halts. When it halts is entirely meaningless for purposes of the halting problem.

Halting problem from a specific standpoint is a popular area of research. Microsoft Research in particular has been active in this area, and Windows device drivers and other kernel event handlers are analyzed to prove that they will halt in a reasonable amount of time to avoid locking up the OS.

https://www.microsoft.com/en-us/research/publication/t2-temp...


In fairness, though, the halting problem doesn't care if you can predict if SOME programs halt... you have to predict if ALL programs halt.


There's that, too. (Well, we don't have to predict anything. We have to show that halting or not is a decidable condition, and we've shown it isn't. Even if we had shown it was decidable, there still might be programs where we practically can't figure out it halts. So is the beauty of math - it gives us the gestalt of the world without caring about the world :)


If you've prove a finite upper bound,then you've proven the program will halt. You don't need to know what the exact number is.


The trick with boundaries does not stand true for evasive predicates on unbounded data.

Please note that suggested odd perfect number search algorithm in Python uses an evasive predicate on a big integer which is unbounded.

This means that a typical tracer (or verifier in Turing terms) won't be able to infer the upper bound for the given example.


>This is important because ... Pony can still experience deadlocks. By carefully modeling each dining philosopher and fork using distinct actors that synchronize by sending messages to each other, it becomes clear that at some point the philosopher actors just end up silently waiting on messages that will never arrive.

I'm interested by this rather offhand comment. Pony makes rather strong claims about being deadlock free - and what the author describes is not a deadlock, any more than hitting the end of `main()` represents a deadlock because your sequential program doesn't progress any further.

Are there examples of Pony programs which actually "deadlock" - that is, fail to continue executing in some way while there is still some unfinished work to be done (and not just because they choose not to progress)?


Author here. You are 100% correct that Pony makes a strong claim about being deadlock-free: "It’s deadlock free. This one is easy, because Pony has no locks at all! So they definitely don’t deadlock, because they don’t exist." (from the first page of the tutorial). Sylvan Clebsch is a really smart guy and Pony is a stunning piece of engineering based on rigorous analysis. He would surely know.

So far as I can determine, this part of the claim is true: it makes no use of locks, not in the runtime and none are surfaced to the Pony programmer. The only synchronization mechanism that Pony makes use of is the MPSC queues that make possible message passing between actors. Each actor has its own queue, and it is only "active" when it has messages on its queue that need to serviced. The work-stealing scheduler will give every actor with messages an opportunity to process at least one message. When it does so, it never blocks and (hopefully) finishes with no requirement to return a result to any other actor (although it may be pass messages to other actors if it wishes).

Now consider Wikipedia's definition of a deadlock: "a deadlock is a state in which each member of a group is waiting for another member, including itself, to take action, such as sending a message or more commonly releasing a lock". Notice that deadlocks can occur not only because of locks (which Pony does not have), but also because of waiting on another actor to send a message. The wiki article lists four conditions for a deadlock, which (it turns out) can also occur in a message-passing architecture.

How would Pony trigger this sort of deadlock? Consider implementing the dining philosophers problem in Pony by having the main actor spawn five fork actors (with on-table state) and then five philosopher actors, telling each the fork actor to their left and the fork actor to their right. The philosopher actors all wake up hungry, so they reach for their left fork first, by sending it an acquire message and then effectively suspend. If the left fork A is in on-table state, it changes its state to indicate it has been acquired by a specific philosopher and sends a message back to the philosopher to say it has been acquired. That reactivates the philospher to now send an acquire message to the right fork B. But maybe by this point in time another philosopher has already acquired that right fork B as its left fork! So it sends a message back to the requesting philosopher to say it is not currently available to be acquired, try again later.

If you walk through this implications of this carefully, you will see that although every philosopher is getting activated by messages regularly, at a certain point in time it is possible that meaningful forward progress can cease for at least one philosopher. At least some philosophers will starve, because they become deadlocked in not being able to get two forks needed to eat, because they are effectively contending over access to shared resources.

The situation is semantically equivalent to when we use locks for forks instead. And if we apply Dijkstra's partial order to how the philosopher actors acquire their forks in an actor-based message passing architecture, the deadlock risk vanishes and all the philosophers have a reasonable chance, eventually, to eat and therefore guarantee eventual forward progress.

Does this explanation of an example satisfy your question?


Is this meaningfully different than a "livelock"?


> not just because they choose not to progress

To a computer, what's the difference? Instructions are instructions.


AIUI, Pony programs don't have a blocking wait. They terminate if they are idle.


AIUI actors are regularly idle. However, you are right that Pony offers a clever mechanism not offered by Erlang, which allows an actor to effectively despawn itself. I believe how it does this is to notice when no other actor has a reference to it. If no other actor has a reference to it, no one cannot send it any messages. If it has no messages in its queue and it is idle (not currently dispatched on a message), we know it will never again receive any messages, and therefore it is safe to free itself because it will never again have work to do.


This was a very enjoyable read... twice. A metaphorical fugue on Turing, Godel, type theory and the generalization of this fundamental paradox that is Lawvere's proof. And, how it may relate to the more practical problems of memory management!

I've been reading Permutation City recently. There has to be a way to shoe horn the fundamental problem of consciousness in here somewhere.


Permutation City is fantastic!


I've noticed a related phenomenon in day-to-day life, especially regarding deadlocking. There are occasions when two parties will deadlock; the classic one is when two people walk towards each other, directly in line, and come to a point where they want to pass without colliding. It happens -- often with comic effect -- when the two parties sidestep at perceptibly the same point and in the same direction, again and again, deadlocking themselves. However, something kicks in after two-or-three failed attempts to resolve the deadlock.

I've noticed similar things at traffic junctions; roundabouts and single-lane roads, especially. Personally, I sometimes do this when I go to open a door that is often, but not always, locked: my hand will reach for the latch, then go for the handle, then back and forth semi-automatically, until my higher thinking kicks in and I realise I've just got to make a decision!

In these cases, I feel as though the cycles aren't weakened, but instead there's some kind of executive function -- that's normally asleep, because it's presumably expensive to run -- which oversees to break the impasse, when needed.


In the "two people walking towards each other" case, I've battled that my whole life. I typically stare right at the other person, anxiously waiting for them to pick a direction so I can go the other way.

My wife pointed out to me that just by picking a side, and then looking in that direction as the other person comes towards you, that they (possibly unconsciously) go the other way, and conflict avoided! Mind blown. And it works.


Another solution I've heard for this problem is to not look the other person in the eye and just look passed/through them without faltering. IIRC, this comes across as an assertion of dominance and the other party will submit and move aside. I don't actually know if this works :P Also, I assume there's a hilarious edge case where two super-assertive people just walk into each other.


Or you could just say that this is an assertion of “I am intelligent enough to pick the path I would like to go given the fact that the choice of a path is irrelevant for both of us”.

The fact that you lack confidence doesn’t suddenly make others alpha fe/males.


I think this is called a livelock.


Oh, wow, this is a very stimulating comparison.

My summary is that all of these are structurally the same:

* programs with recursion that either halts or not -- is there a partial order on states of a recursive algorithm such that some state is eventually terminal?

* threads with resource dependencies that either deadlock or not -- is there a partial order on resource acquisition so that eventually some operation is unblocked?

* garbage collection of memory references -- is there a partial order on resource ownership such that some element is eventually unowned?

* Russell's paradox -- presumably something like: is there a partial order on the definitions of some sets such that all of the definitions can be resolved?

* Godel's Incompleteness theorem and Liar's paradox, etc -- presumably something like: is there a partial order on statements that allows their truth values to be resolved?

In every case it seems like the solution is some form of diagonal argument.

I really like the version I just found in this paper[1]: if Y is a set and there's a function a(Y): Y that doesn't have a fixed point, then for any function f(T, T): Y there's a function g(T): Y that can't be written as g(s) = f(s,t) for all t. Namely, set g(t) = a(f(t,t)); clearly f(s,s) = g(s) = a(f(s,s)) != f(s,s).

For the Liar Paradox ("this sentence is false") and Godel ("this statement cannot be proved in G") the set Y is (True/Proven, False/Disproven) and the function a() just takes False <-> True and so has no fixed point. The function f(s, t) tries to assign a truth value to a statement 's' and also to the _claim_ made by 's'. The function g(s) tries to assign a truth value to 's' alone, but for 'this sentence is false' the definition is g(s) = a(f(s,s)), and the definition of f(s,s) = g(s), so g(s) = a(g(s)) and there's no solution -- just like a recursive function that says f(x) = !f(x) never terminates!

The connection to partial orders is: there must be a partial order on the possible states of the system such that there isn't an infinite loop. The states can be the truth values of statements, the input to a function, the states of a Turing machine's tape, the state of the condition a thread is waiting on, or the state of whether an object has been freed. Etc.

Of course I have never really thought through this before and I'm making it up as I go, but this is grand and delightful to sort-of hand-wavingly understand.

[1]: https://arxiv.org/pdf/math/0305282.pdf


It is indeed grand and delightful. I am glad you enjoyed the post so much. It was every bit as much fun for me to grok and then and share with others.


Paradoxical self-reference cycles seem to give rise to the halting problem, the incompleteness theorem, the liar paradox, and Russell's paradox (among other paradoxes and self-contradictions) so breaking those cycles seems like a practical idea.

However, in practice there really isn't a noticeable difference between a program that never completes and one that takes 100 years to complete. Both programs will almost certainly be halted (or at least paused indefinitely) by some external means. If a program takes too long to produce its desired output, then it simply isn't a useful piece of software.


Well, that depends. A program that takes 100 years to complete (say factoring a large number) can be optimized to not take 100 years, and it is worth looking for algorithmic advances that would make that possible. A program that can proven to never halt (a soundness detector that checks programs for behavioral violations) can never be perfect and must rely on heuristics or be known to fail on certain input.


In fact, 100 years to 1/10th of a year is a mere constant factor, that could be arrived at in some cases by buying a lot of GPUs.


If it doesn't take 100 years to complete, then it doesn't take 100 years to complete.

But it is a sort of good point that a 100 year runtime in 2020 might turn into a 10 year runtime in 2030, making it only a 20 year runtime overall. For some jobs, a 20 year runtime might be acceptable, but probably those jobs are rare.

In general if a program takes so long to complete that its output is no longer useful, that limits its practicality and utility.


If it doesn't take 100 years to complete, it doesn't take 100 years to complete.


Take an algorithm that halts soon enough. Now scan a bunch of them to find the good enough one. This new scanning process no longer halts soon enough. Even with runtime limits the halting problem is annoying.


It reminds me of something I had posted to a math/engineering forum some time ago but my answer was taken down because I didn't use proper mathematical terminology and I had ended the comment with a question mark asking for feedback.

My idea was that what if the halting detection function sometimes intentionally lied about whether or not the program halts (E.g. just returned a random true/false 1% the time instead of actually doing the analysis work) then because Turing's proof is recursive (and relies on the occurrence of an infinite recursive deadlock condition), eventually, at a certain random depth in the recursion, the halting detection function would break out of the recursion when it hits that random error case. So the intentional % error added to the halting detection function would prevent hitting the infinite recursion case on which Turing's proof relies. The downside of this approach would be that my hypothetical halting detection function would return incorrect results a certain percentage of the time when applied to other algorithms; but this problem can be offset by running it multiple times against the same input. The accuracy of such a function could approach 100% as you run it more often but never quite reach 100%.


So you’ll approach 100% of this program lying to you that the other program halts.

This isn’t any different from running a program for 30 seconds and checking whether it halted or not.

You are simply sampling it differently.


I'm not so sure I understand, because if you do have some code that a turning machine would throw for an infinite loop, wouldn't your 1% trick be the only escape the first time, second time, third time and forth time you tried to run the halting analyzer?

Thus you wouldn't get closer and closer to 100% accuracy, but rather, you'd get the randomness response every single attempt


Given that the analyzer function 'H' needs to work with any algorithm 'F'. We could separate 'F' into two distinct subsets of functions (the union of which fully spans the set of all possible F):

1. 'F1' which represents any possible function which does not invoke the H(F1) function internally.

2. 'F2' which represents any possible function which invokes the H(F2) function internally.

Turing's proof relies entirely on the existence of 'F2' functions and the fact that these kinds of functions always lead to a an infinite loop deadlock condition. Turing's proof relies on these 'F2' functions as a counter example to prove why H cannot exist. Turing's proof does not apply for functions of the 'F1' kind.

So my thinking is that assuming that there existed a probabilistic analyzer function `H` which worked with 'F1' functions and which could be modified to also work with 'F2' functions (I.e. by adding a probabilistic error % as I suggested), then that would invalidate Turing's counter-example. It would not give any answers about the halting problem itself though; just move the problem back to a state of uncertainty.

My idea aims to prove that if you have an 'H' which works for 'F1' then it can be easily modified to also work for 'F2' if you're prepared to accept an error probability each time you run the function.

A key part of my argument is that any F2 function can use an error % as an escape hatch to protect itself from infinite recursion and thus gives those functions the ability to decide their own output.

For these kinds of F2 functions, whatever output H(F2) returns becomes self-fulfilling after the recursion has finished unwinding and so it does not matter if H(F2) is true or false.


But still the diagonalization argument applies. In order for your slightly wrong oracle to work, you still need a 100% correct oracle. Now remove the slightly wrong wrapper, and you are back in square one.


Seems like the results in this blog post would be of interest.

https://www.scottaaronson.com/blog/?p=2741

Not so much in avoiding the halting problem while allowing complexity, but instead using it and actual runnable Turing machines to create ridiculously cool cases of unprovable truths.

For example a running Turing machine that only halts if it contradicts set theory.


The liar "paradox" is only one if you're operating with the assumption all statements are falsifiable and valid, which I've never been shown to be a good ground truth.


That's pretty much the same idea underlying intuitionistic logic, IIUC


Check out Kolmogorov complexity and the law of independence conservation.


Link for "the law of independence conservation"? Google results look mostly unrelated (except for one page trying to use it to prove AI is impossible).


See: https://www.sciencedirect.com/science/article/pii/S001999588...

look at section 1.2

it is the non growth theorem for algorithmic information, but also deals with randomness

Levin proves randomness + computation is not expected to increase algorithmic mutual information. In other words there is no randomness loophole in Godël's theorem.


wow brilliant article! I always love the connections to be made between theory and computer system! are there any good reads that cover topics like in detail?


You'd like "Gödel, Escher, Bach: An Eternal Golden Braid", mentioned in the article.


I really like the "delay" datatype as a way to implement general recursion (i.e. Turing-completeness) in a non-general/non-Turing-complete system like Coq or Agda.

We can think of 'Delay<T>' as being a bit like 'List<T>'. If we follow Lisp's encoding, there are two forms of list: the empty list 'Nil', which doesn't "contain" anything; and the pair 'Cons(x, y)' which contains a value 'x' (the "head" or "car" of the list) and another list 'y' (the "tail" or "cdr", which might be 'Nil').

'Delay' is the other way around: it is either 'Now(x)', which contains the value 'x', or it is 'Later(y)' which contains another delay 'y' (which might be a 'Now'). We can imagine 'Delay<T>' a bit like a list containing a single 'T' value at the end, preceded by an arbitrary amount of NULL elements. I've gone into a bit more depth on such constructions (AKA "initial algebras") in a blog post http://chriswarbo.net/blog/2014-12-04-Nat_like_types.html

This gets interesting if our language allows co-termination (like Agda, Coq, Haskell, etc.). This is where we're allowed to recurse 'inside' our return value. For example, the following function doesn't terminate, since it will always call itself recursively; yet it does co-terminate, since the recursion is happening "inside" the 'Cons'. We don't know what the full result will be, but at every stage we definitely know what the next part will be (a 'Cons' list). We can reach any finite "depth" of the result within a finite amount of time. This is also known as being "productive":

    function countUpFrom(x) {
      return new Cons(x, countUpFrom(x+1));
    }
The following function doesn't co-terminate (whether it always terminates is currently unknown), since (in the non-1 case) we don't 'narrow down' the return value before recursing. As far as we know, this could keep "passing the buck" forever, and we'd never get any closer to knowing what any part of the result looks like:

    function collatz(x) {
      return (x == 1)? 1
                     : collatz(even(x)? x / 2
                                      : 3 * x + 1);
    }
Implementing co-termination requires that the "contents" of a datastructure are calculated on-demand rather than straight away (AKA 'lazily'). We can actually do this in any language with first-class functions by creating (and subsequently forcing) "thunks", although it gets quite verbose, e.g.

    function countUpFrom(x) {
      return new Cons(x, function() { return countUpFrom(x+1); });
    }
I wrote a blog post which uses this to generate Fibonacci numbers in PHP: http://chriswarbo.net/blog/2014-07-23-fib.html

We can use co-termination to implement a Turing machine, or any other generally-recursive system, by simply making a list of each "step":

    function turingMachineList(tm) {
      return Cons(tm,
                  halted(tm)? Nil
                            : turingMachineList(step(tm)));
    }
This will produce a list of each machine state, which might stop (if the machine reaches a halted state) or go on forever. Likewise we can implement a list of Collatz values:

    function collatzList(x) {
      return Cons(x,
                  (x == 1)? Nil
                          : collatzList(even(x)? x / 2
                                               : x * 3 + 1));
    }
If we only care about the "result" then we can use 'Delay' instead of 'List':

    function turingMachineDelay(tm) {
      return halted(tm)? Now(tm)
                       : Later(turingMachineDelay(step(tm)));
    }

    function collatDelay(x) {
      return (x == 1)? Now(1)
                     : Later(collatzList(even(x)? x / 2
                                                : x * 3 + 1));
    }
These will return a value like 'Later(Later(Later(...)))' which will either keep nesting forever, or (if the calculation has an answer) will eventually contain 'Now(answer)'.

'Delay' is nice since we can write programs in pretty much the same way as normal. The only changes are mechanical and obvious:

- If a recursive call isn't "inside" the return value, we wrap it in 'Later(...)' so it is; and wrap any corresponding base-cases in 'Now(...)'.

- If we need to perform some calculation on the result of a 'Delay' (i.e. the value in its 'Now'), we can't do so directly, since there might not be a result (it could be 'Later(Later(...))' forever); and even if there is, we don't know how many 'Later' wrappers to dig through. Instead, we use these values indirectly, by mapping our calculation over the 'Delay' value. This isn't hard: it's exactly the same as mapping over the elements of a list, or over an optional value, etc. Here's what 'map' looks like for 'Delay':

    function mapDelay(f, x) {
      return case x {
          Now(y):   Now(f(y));
        Later(y): Later(mapDelay(f, y));
      };
    }
We can run such programs in two ways:

- We can remove (up to) some finite number of 'Later' wrappers, which is equivalent to running with a timeout. We can always do this inside the total (terminating/co-terminating) language. We might still end up with a 'Later(...)', unless we can prove that the computation will always finish before our timeout (i.e. if we have a termination proof).

- We can use an external program (in a Turing-complete language) as a clock to 'tick' through each step; essentially playing the role of OS or VM (similar to using a trampoline for recursion, in languages which don't eliminate tail-calls). In this sense, 'Delay' is like a language-enforced version of cooperative multitasking, since we must always eventually yield, at which point the OS can pick another delayed value to tick. The ticks are also a convenient place for the user to interrupt/cancel a program.

Of course, knowing that the next tick will be occur within a finite time doesn't guarantee that it will be happen before the heat-death of the universe!

Note that we can also use co-termination to build datatypes which have no base-case, and hence every value must be infinite. This would be pretty useless for 'Delay' (since there is only one infinite value, namely 'Later(Later(...))' forever). The always-infinite equivalent of lists are called "streams", and are defined as having a 'Cons' form but no 'Nil' form.

From the OS/VM perspective above, this lets us write programs which are guaranteed to run forever (e.g. servers which never crash), although they may run out of memory, and responses might take a while (bounded by the busy beaver number for the server's code size + the request size). The classic example is to represent input as an infinite stream, e.g. 'Stream<HTTPRequest>', have a function for constructing a response 'respond : HTTPRequest -> HTTPResponse', and construct a server via 'streamMap(respond, input) : Stream<HTTPResponse>'. If we want a stateful server we can include the new state with our reponse and have the server plumb it through:

    respond : (State, HTTPRequest) -> (State, HTTPResponse)

    server : (State, Stream<HTTPRequest>) -> Stream<HTTPResponse>

    function server(currentState, stream) {
      return case stream {
        StreamCons(request, rest): case respond(currentState, request) {
          (nextState, response): StreamCons(response,
                                            server(nextState, rest));
        };
      };
    }




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: