> With a little more effort we can establish the same for any enumerable set of axioms.
This is a case of presentism; thanks to Gödel's work, we accept the costs that his findings imply.
RE is semi-decidable, meaning we can confirm yes-instances in finite time, but on no-instances the procedure may never halt.
RE ∩ co-RE = R, but the author is only referencing RE.
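For intuition, here is a minimal Python sketch of what semi-decidability buys you: a one-sided procedure that confirms yes-instances by witness search but can only fail silently on no-instances. The toy predicate is my own illustration (and is actually decidable); genuinely RE-complete sets like the halting problem admit only this one-sided behaviour.

```python
from itertools import count

def semi_decide(has_witness, candidates):
    """Semi-decision by witness search: halts with True on yes-instances,
    loops forever on no-instances (no reliable negation)."""
    for c in candidates:
        if has_witness(c):
            return True  # yes-instance confirmed in finite time
    # unreachable for an infinite candidate stream: we can never return False

# Toy instance: does x*x == n have a solution over the naturals?
def is_square(n):
    return semi_decide(lambda x: x * x == n, count(0))

print(is_square(49))   # halts: True
# is_square(50) would loop forever -- failure is the only "negation"
```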
Failure as negation does have its place, but it is also what will probably block strong AI.
Also note that Presburger arithmetic (FOL with + and =; the multiplicative analogue with × and = is Skolem arithmetic) is fully decidable. But deciding it requires doubly exponential time in the worst case (Fischer-Rabin), and doubly exponential time isn't practical.
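To see why "decidable but doubly exponential" is cold comfort, here is a trivial sketch of how 2^(2^(cn)) blows up; the constant c is set to 1 purely for illustration.

```python
# The Fischer-Rabin lower bound for Presburger arithmetic grows like
# 2^(2^(c*n)) in the formula length n. With c = 1 (illustration only):
def double_exp(n, c=1):
    return 2 ** (2 ** (c * n))

for n in range(1, 6):
    print(n, double_exp(n))
# Already at n = 5 this is 2^32; at n = 9 it exceeds ~10^150, far more
# than the number of atoms in the observable universe (~10^80).
```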
Same with attention in feed-forward networks, which requires exponential time and comes with a reduction in expressability.
This is a lower bound for negation by exhaustion in zeroth-order (propositional) logic, via the Strong Exponential Time Hypothesis.
The loss of reliable negation (¬) in all but the weakest logics has, in the general case, huge practical implications for all but the most trivial problems.
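"Negation by exhaustion" can be made concrete with a brute-force UNSAT check: to certify that no satisfying assignment exists, you check all 2^n of them, and SETH conjectures that nothing substantially faster exists in general. A minimal sketch (clause encoding is my own):

```python
from itertools import product

def brute_force_unsat(clauses, n_vars):
    """Certify the negation (UNSAT) of a CNF formula by exhaustion:
    each clause is a list of (variable_index, is_positive) literals."""
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[v] if pos else not assignment[v]
                   for v, pos in clause)
               for clause in clauses):
            return False  # found a model, so the negation fails
    return True  # exhausted all 2^n assignments: certified UNSAT

# (x0 or x1) and (not x0 or x1) and (not x1) -- unsatisfiable
clauses = [[(0, True), (1, True)], [(0, False), (1, True)], [(1, False)]]
print(brute_force_unsat(clauses, 2))  # True, but at 2^n cost
```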
Unless you are lucky enough to have a fully decidable (i.e., recursive) theory, the costs are high.
PA was shown to be consistent via transfinite induction (Gentzen), BTW. But that proof works because the ordinals up to ε₀ are well-ordered.
But transformers with soft attention being limited to solving problems in TC^0, with failure as negation, is a very real cost of Gödel's incompleteness theorems.
How relevant are these considerations for practical use cases?
Specifically, for all practical purposes, it is sufficient to have probabilistic guarantees. Suppose an AI were able to generate mathematical proofs as well as humans; it wouldn't really matter if in theory "they are limited to solving problems in TC^0 with failure as negation".
While TC^0 can divide or multiply (through, say, FFT-like circuits), I am pretty sure it can't do counters or addition, which would require TC^1, i.e. log-depth threshold circuits.
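Just to pin down the terms being argued over (not to settle the question): the building block of both TC^0 and TC^1 is the threshold gate, of which MAJORITY is the canonical example. A single gate, sketched in Python:

```python
def majority(bits):
    """One threshold gate: outputs 1 iff at least half the inputs are 1.
    Constant-depth, poly-size circuits of such gates define TC^0;
    O(log n)-depth circuits of them define TC^1."""
    return int(2 * sum(bits) >= len(bits))

print(majority([1, 0, 1, 1]))  # 1
print(majority([1, 0, 0, 0]))  # 0
```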
> We prove that the time complexity of self-attention is necessarily quadratic in the input length, unless the Strong Exponential Time Hypothesis (SETH) is false. This argument holds even if the attention computation is performed only approximately, and for a variety of attention mechanisms.
Thanks! But quadratic time complexity (which I had heard attention requires, and which doesn't surprise me) is not exponential time.
I acknowledge that they relate this to the SETH assumption (which surprises me a little). But this doesn't mean that transformers take exponential time. I don't think I understood what you meant by the "with a reduction in expressability" part of the statement, so maybe that is why I'm not following.
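For concreteness, here is a minimal pure-Python sketch of soft (softmax) attention; the n x n score matrix is exactly where the quadratic term in the quoted lower bound comes from. Dimensions and values below are arbitrary illustrations, not anyone's actual model.

```python
import math

def soft_attention(Q, K, V):
    """Soft attention over n queries/keys/values of dimension d.
    Computing the n x n score matrix is the Theta(n^2) step that the
    quoted SETH-based result says cannot be avoided in general."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]                       # n dot products per query
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]          # softmax row
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out  # each output row is a convex combination of value vectors

# Tiny example: 3 tokens, dimension 2
Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = soft_attention(Q, K, V)
```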