
This model has the best score on that benchmark.

Edit: Huh... It does score highest in "Omniscience", but it also scores very high in Hallucination Rate (where a higher score is worse)...



This has one of the worst scores in AA-Omniscience Hallucination Rate.



