
I view transformers as something like the language center of the brain. When we write or speak, especially when it's critical to get things right, we have this ability to think "that doesn't make sense" and start over. I view this recursion as more of a strength than a weakness. You can get an LLM to generate an answer, and when asked about the validity of that answer it will acknowledge that it got it wrong. This raises the question: if it had perfect recall and understanding, why did it give the wrong answer in the first place?

I don't know how the reasoning part arises in us, but if we could build that capability into a transformer model, it would end up pretty good.



I agree, and also, when I'm writing, I am working towards a hierarchy of goals at the level of the sentence, the paragraph, and beyond, while also asking myself whether what I have written, and plan to write, could be confusing or misunderstood.

I think it's fair to ask whether these are essential techniques for improving precision and clarity, or just a way to compensate for not being able to see the whole picture all at once - but if the latter is the case, there's still room for improvement in LLMs (and me, for that matter). I notice that experts on a topic are often able to pick out what matters most without any apparent hesitation.


> I view this recursion as more of a strength than weakness

Sure, it's a strength given that transformers are currently limited by compute budget, but theoretically, if we had a way to overcome that limit, it seems obvious to me that a transformer's 'one-shot' ability would make it better.

That being said, the recursive aspect you're referencing can be built into a transformer as well; it's a sampling and training problem.
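To make the "sampling problem" framing concrete, here is a minimal sketch of a generate-critique-resample loop wrapped around a model call. The names (generate, answer_with_self_check) and the prompt wording are hypothetical placeholders, not any particular API; swap in whatever LLM client you actually use.

    def generate(prompt: str) -> str:
        # Placeholder: replace with a real model call (e.g. a request to your LLM API).
        raise NotImplementedError

    def answer_with_self_check(question: str, max_rounds: int = 3) -> str:
        # First attempt: the model's "one-shot" answer.
        answer = generate(question)
        for _ in range(max_rounds):
            # Ask the model to judge its own answer, mirroring the
            # "that doesn't make sense, start over" step.
            critique = generate(
                f"Question: {question}\nProposed answer: {answer}\n"
                "Does this answer make sense? Reply VALID or INVALID, with a reason."
            )
            if critique.strip().upper().startswith("VALID"):
                break
            # Resample, feeding the critique back in so the next attempt can correct it.
            answer = generate(
                f"Question: {question}\nPrevious attempt: {answer}\n"
                f"Critique: {critique}\nGive a corrected answer."
            )
        return answer

This is just the sampling half; the training half would be rewarding the model for answers that survive its own critique, which is roughly what self-correction fine-tuning setups try to do.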



