
I view transformers as something like the language center of the brain. When we write or speak, especially when it's critical to get things right, we have this ability to think "that doesn't make sense" and start over. I view this recursion as more of a strength than a weakness. You can get an LLM to generate an answer, and when asked about the validity of that answer it will acknowledge that it got it wrong. This raises the question: if it had perfect recall and understanding, why did it give the wrong answer in the first place?

I don't know how the reasoning part arises in us, but if we could build that capability into a transformer model, it would end up pretty good.



I agree, and also, when I'm writing, I am working towards a hierarchy of goals at the level of the sentence, the paragraph, and beyond, while also asking myself whether what I have written, and plan to write, could be confusing or misunderstood.

I think it's fair to ask whether these are essential techniques for improving precision and clarity, or just a way to compensate for not being able to see the whole picture all at once - but if the latter is the case, there's still room for improvement in LLMs (and me, for that matter). I notice that experts on a topic are often able to pick out what matters most without any apparent hesitation.


> I view this recursion as more of a strength than weakness

Sure, it's a strength given that transformers are currently limited by compute budget, but theoretically, if we had a way to overcome that limit, it seems obvious to me that a transformer's 'one-shot' ability would make it better.

That being said, the recursive aspect you're referencing can be built into a transformer as well; it's a sampling and training problem.
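To make the "sampling problem" framing concrete, here is a minimal sketch of a generate-critique-resample loop wrapped around a model call. The names (generate, answer_with_self_check) and the prompt wording are hypothetical placeholders, not any particular API; swap in whatever LLM client you actually use.

    def generate(prompt: str) -> str:
        # Placeholder: replace with a real model call (e.g. a request to your LLM API).
        raise NotImplementedError

    def answer_with_self_check(question: str, max_rounds: int = 3) -> str:
        # First attempt: the model's "one-shot" answer.
        answer = generate(question)
        for _ in range(max_rounds):
            # Ask the model to judge its own answer, mirroring the
            # "that doesn't make sense, start over" step.
            critique = generate(
                f"Question: {question}\nProposed answer: {answer}\n"
                "Does this answer make sense? Reply VALID or INVALID, with a reason."
            )
            if critique.strip().upper().startswith("VALID"):
                break
            # Resample, feeding the critique back in so the next attempt can correct it.
            answer = generate(
                f"Question: {question}\nPrevious attempt: {answer}\n"
                f"Critique: {critique}\nGive a corrected answer."
            )
        return answer

This is just the sampling half; the training half would be rewarding the model for answers that survive its own critique, which is roughly what self-correction fine-tuning setups try to do.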



