
> OpenAI's flagship models are not even correct 50% of the time[1]

You're misreading the link. They specifically picked questions that one or more models failed at, so the score isn't representative of how often the models are wrong in general.

From the paper:

> At least one of the four completions must be incorrect for the trainer to continue with that question; otherwise, the trainer was instructed to create a new question.
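To see how much that filter alone can drag the number down, here's a rough simulation. It's only a sketch under made-up assumptions: the 90% "true" accuracy, the independence between completions, and the `simulate` function are all mine, not from the paper. The only thing taken from the quote is the rule that a question is kept only if at least one of its four completions is incorrect.

```python
import random


def simulate(true_accuracy=0.9, completions_per_question=4,
             num_questions=100_000, seed=0):
    """Compare accuracy over all questions vs. only the questions kept by the filter.

    Assumption (not from the paper): every completion is independently
    correct with probability `true_accuracy`.
    """
    rng = random.Random(seed)
    all_correct = all_total = 0
    kept_correct = kept_total = 0

    for _ in range(num_questions):
        results = [rng.random() < true_accuracy
                   for _ in range(completions_per_question)]
        all_correct += sum(results)
        all_total += len(results)

        # The filter from the quote: keep the question only if at least
        # one of the completions is incorrect.
        if not all(results):
            kept_correct += sum(results)
            kept_total += len(results)

    return all_correct / all_total, kept_correct / kept_total


overall, filtered = simulate()
print(f"accuracy on all questions:  {overall:.3f}")
print(f"accuracy on kept questions: {filtered:.3f}")
```

Under these assumptions the accuracy on the kept questions comes out well below the accuracy on all questions, even though nothing about the simulated model changed. The real gap depends on how correlated the models' failures are, which this toy version ignores.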



