
Benchmark scores aren't a good measure because the benchmarks were built for previous generations of LLMs. That 2.23% uptick can actually represent a world of difference in subjective tests and can definitely be worth the investment.

Progress is not slowing down, but it is getting harder to quantify.


