
> With a little more effort we can establish the same for any enumerable set of axioms.

This is a case of presentism: it is only because of Gödel's work that we now accept the costs and implications of his findings.

RE is semi-decidable, meaning a procedure can confirm every yes-instance, but it may run forever on no-instances.
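
A minimal sketch of what semi-decidability means operationally (the predicate and demo values are hypothetical): enumerate candidate witnesses forever, halting only when one is found.

    from itertools import count

    def semi_decide(is_witness):
        # Semi-decision procedure for an RE set: halts on yes-instances,
        # loops forever on no-instances. There is no point at which we
        # can soundly answer "no" -- that is the gap negation as failure
        # papers over.
        for n in count():
            if is_witness(n):
                return n  # yes-instance confirmed by an explicit witness

    # Demo: is 338350 the sum of the first k squares for some k?
    target = 338350
    print(semi_decide(lambda k: sum(i * i for i in range(1, k + 1)) == target))  # 100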

RE ∩ co-RE = R, while the author is only referencing RE.

Failure as negation does have its place, but it is also what will probably block strong AI.

Also note that Presburger arithmetic, FOL with (+,=), is fully decidable, as is FOL with (*,=) (Skolem arithmetic). But proofs require doubly exponential time, and worse-than-superfactorial time isn't practical.
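
As a hedged illustration (assuming the z3-solver Python package is installed), Z3 can decide Presburger sentences outright; the decidability is real even though the general decision procedure is doubly exponential in the worst case:

    # pip install z3-solver (assumed available)
    from z3 import Ints, ForAll, Exists, Or, prove

    x, y = Ints('x y')
    # Presburger sentence: every integer is even or odd.
    prove(ForAll(x, Exists(y, Or(x == 2 * y, x == 2 * y + 1))))  # prints "proved"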

Same with attention in feed-forward networks requiring exponential time, with a reduction in expressivity.

This is a lower bound for negation through exhaustion in zeroth-order (propositional) logic, via the Strong Exponential Time Hypothesis (SETH).
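
For concreteness, here is a minimal NumPy sketch of plain softmax self-attention; the n x n score matrix is the quadratic cost that the SETH-based lower bound (cited later in the thread) says cannot be avoided, even approximately:

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # Plain softmax self-attention over a length-n sequence.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])  # n x n: the quadratic part
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ V

    n, d = 512, 64
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, d))
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    out = self_attention(X, Wq, Wk, Wv)  # out.shape == (512, 64)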

The loss of reliable negation (¬) in all but the weakest logics has, in the general case, huge practical implications for all but the most trivial problems.

Unless you are lucky enough to have a fully decidable (i.e., recursive) model, the costs are high.

PA was shown to be consistent via transfinite induction up to ε₀ (Gentzen), BTW. But that works because the natural numbers are well-ordered.

But transformers with soft attention being limited to solving problems in TC^0, with failure as negation, is a very real cost of Gödel's (in)completeness theorems.



> Failure as negation does have its place, but it is also what will probably block strong AI.

Interesting. Could you please say more about this?


The deduction principle turns out to be invalid for weak rejections, and strong AI requires it.

Consider strong negation:

Did Homer write the Iliad? No, he did not.

vs. a (too computationally expensive) three-valued system with true/false/other and no knowledge of Homer:

Did Homer write the Iliad? No, Homer did not exist.

vs. negation as failure in binary-threshold ANNs:

Did Homer write the Iliad? No.
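
A toy sketch of those three behaviors (the knowledge base, entity list, and function names are all hypothetical, and each function models a separate system with its own knowledge):

    # Hypothetical toy knowledge base: Homer appears in no positive fact.
    FACTS = {("wrote", "Virgil", "Aeneid")}
    NEG_FACTS = {("wrote", "Homer", "Iliad")}  # explicitly asserted false
    ENTITIES = {"Virgil"}                      # entities known to exist

    def strong_negation(subj, pred, obj):
        # Answers "no" only from an explicit negative fact.
        if (pred, subj, obj) in FACTS:
            return "yes"
        if (pred, subj, obj) in NEG_FACTS:
            return "no, he did not"
        return "unknown"

    def three_valued(subj, pred, obj):
        # true / false / other: can report that Homer is not a known entity.
        if (pred, subj, obj) in FACTS:
            return "true"
        if subj not in ENTITIES:
            return "other: no such entity known"
        return "false"

    def negation_as_failure(subj, pred, obj):
        # Whatever cannot be proved is simply "no".
        return "yes" if (pred, subj, obj) in FACTS else "no"

    q = ("Homer", "wrote", "Iliad")
    print(strong_negation(*q))       # no, he did not (explicit knowledge)
    print(three_valued(*q))          # other: no such entity known
    print(negation_as_failure(*q))   # no (mere failure to prove)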

Transformers explicitly can find known unknowns, unknowable unknowns (e.g., future unknowns), etc.

But exhaustive unknowns may or may not be valid.

There is a lot more to it, but strong AI requires universal quantification, while ML gives you only existential quantification.
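
One way to see the asymmetry (the property and constants are hypothetical): sampling can genuinely witness an existential claim, but no amount of sampling settles a universal one.

    import random

    P = lambda x: x != 7919  # hypothetical property failing on one integer

    random.seed(0)
    samples = [random.randrange(10**6) for _ in range(10_000)]

    print(any(P(x) for x in samples))  # True: one witness proves "exists x. P(x)"
    print(all(P(x) for x in samples))  # almost surely True, yet proves nothing:
    print(P(7919))                     # False -- "forall x. P(x)" is simply false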

The above is just one way to think about why word-sense disambiguation and ATP are thought to be AI-complete.


How relevant are these considerations for practical use cases?

Specifically, for all practical purposes it is sufficient to have probabilistic guarantees. Suppose an AI were able to generate mathematical proofs as well as humans; it wouldn't really matter if in theory "they are limited to solving problems in TC^0 with failure as negation".


While TC^0 can divide or multiply (through, say, an FFT), it actually can't do counters or addition, which I am pretty sure require TC^1, i.e., log-depth threshold circuits.

LSTMs are much more powerful.
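
For reference, the building block of TC^0 is the unbounded-fan-in threshold gate; a toy sketch (function names hypothetical):

    def threshold_gate(bits, k):
        # Fires iff at least k of the input bits are 1; TC^0 circuits are
        # constant-depth, polynomial-size circuits of such gates.
        return int(sum(bits) >= k)

    def majority(bits):
        # MAJORITY is the canonical TC^0 function (it is not in AC^0).
        return threshold_gate(bits, len(bits) // 2 + 1)

    print(majority([1, 0, 1, 1, 0]))  # 1: three of the five bits are set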


> Same with attention in feed-forward networks requiring exponential time, with a reduction in expressivity.

Huh? Can you say more about that? What takes exponential time with transformers?


> We prove that the time complexity of self-attention is necessarily quadratic in the input length, unless the Strong Exponential Time Hypothesis (SETH) is false. This argument holds even if the attention computation is performed only approximately, and for a variety of attention mechanisms.

https://proceedings.mlr.press/v201/duman-keles23a/duman-kele...


Thanks! But quadratic time complexity (which I had heard attention requires, and which doesn't surprise me) is not exponential time.

I acknowledge that they relate this to the SETH assumption (which surprises me a little). But this doesn't mean that transformers take exponential time. I don't think I understood what you meant by the "with a reduction in expressivity" part of the statement, so maybe that is why I'm not following.



