
It's not built on high test scores. Academics do benchmark models on various tests, but the many people who built up the hype mostly did so based on their personal experience with a chatbot, not by running long (and expensive) tests on those datasets.

The tests are used (and, despite their flaws, useful) to compare various facets of model A to model B. However, the validation of whether a model is good now comes from users, and that validation can't really be flawed much: if it's helpful (or not) to someone, then it is what it is. The proof of the pudding is in the eating.



