UnlockedSecrets on Feb 2, 2025 | on: Recent results show that LLMs struggle with compos...

unless you train directly against solving those problems... in which case how could you theoretically design a test that could stand against training directly against the answer sheet?

munchler on Feb 2, 2025

That's why they keep the evaluation set private: "Submit a solution which scores 85% on the ARC-AGI private evaluation set and win $600K." [0]

[0] https://arcprize.org/guide
