
Not exactly. ChatGPT was absolutely trained to produce statistically likely output; it just had an extra training step added on top, based on human ratings (RLHF). If they had relied entirely on human ratings, there would not have been sufficient data to train the model.
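For concreteness, here is a minimal sketch of what "trained to produce statistically likely output" means in practice, assuming a PyTorch-style next-token cross-entropy objective. The shapes and names are illustrative only, not OpenAI's actual setup:

    import torch
    import torch.nn.functional as F

    # Illustrative shapes only; nothing here is OpenAI's actual code.
    vocab_size, batch, seq_len = 50257, 2, 16
    logits = torch.randn(batch, seq_len, vocab_size)          # model predictions
    targets = torch.randint(0, vocab_size, (batch, seq_len))  # actual next tokens

    # Pretraining loss: cross-entropy on the next token, i.e. maximizing
    # the statistical likelihood of the training text.
    loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))

    # RLHF is a later, separate fine-tuning step: a reward model trained on
    # human ratings scores whole responses, and the policy is optimized
    # against that score. The ratings alone are far too few to train on.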


The last step is what matters. "Statistically likely" is very underdetermined anyway; answering everything with "e" is statistically likely, since "e" is the most common letter in English text.

(That's why the original GPT-3 was known for constantly ending up in infinite loops.)


"e" is not a likely response to anything. I think you are not understanding the type of statistics involved here.


GPT-3 doesn't produce "responses" at all, not until it has been fine-tuned to do so via RLHF.



