
It's not a fallacy. Previous words are a very important part of its scratch space. Few-shot learning is based on previous words. Prompt modifiers like "let's think step by step" encourage the model to encode its reasoning verbosely in words, which then lets simpler induction rules be pattern-matched onto those previous words. Previous words are what give an otherwise feed-forward network a way to recur.
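To make the few-shot point concrete, here's a rough sketch of what such a prompt looks like. The worked example and the build_prompt helper are purely illustrative, not taken from any particular paper or API; the idea is just that prior text carries the reasoning pattern the model then imitates:

    # Minimal sketch of few-shot prompting with a chain-of-thought trigger.
    # The example Q/A pair below is made up for illustration.

    FEW_SHOT_EXAMPLES = [
        ("Q: Roger has 5 balls and buys 2 cans of 3 balls each. How many balls now?",
         "A: Let's think step by step. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11."),
    ]

    def build_prompt(question: str) -> str:
        """Prepend worked examples so the model can pattern-match the
        verbalized reasoning style, then append the step-by-step trigger."""
        parts = []
        for q, a in FEW_SHOT_EXAMPLES:
            parts.append(f"{q}\n{a}\n")
        parts.append(f"Q: {question}\nA: Let's think step by step.")
        return "\n".join(parts)

    print(build_prompt("A train leaves at 3pm and arrives at 5:30pm. How long is the trip?"))

The prompt itself is the "scratch space": every token the model emits while reasoning becomes input for the next forward pass.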



> It's not a fallacy.

I feel it is, because it implies it's just some statistical trick being performed, which is not true at all, imho.

I don't know enough about language models; my machine learning knowledge stops at around 2018. But I know from image recognition/style transfer that there's a lot of high-level self-organization/abstraction in those neural nets, and from the results I get from ChatGPT there's no doubt in my mind it's well capable of reasoning and generalization.


"Guessing" implies those guesses are being compared for correctness against a reference. That only happens during training; the rest of the time, it's not guessing, it's selecting words. But then, how else would you expect a sentence to be made? First writing out all the vowels, and then filling in the rest of the letters between them?



