
You should take a look at the more extensive reasoning benchmarks currently used for LLMs, such as MuSR, which clearly can't be a case of the latter, since its questions are new: https://arxiv.org/abs/2310.16049

