Hacker News new | past | comments | ask | show | jobs | submit login

I recently watched some Claude Plays Pokemon and believe it's better measure than all those AI benchmarks. The game could be beaten by a 8yo which obviously doesn't have all that knowledge that even small local LLMs posess, but has actual intelligence and could figure out the game within < 100h. So far Claude can't even get past the first half and I doubt any other AI could get much further.



Now I want to watch Claude play Pokemon Go, hitching a ride on self-driving cars to random destinations and then trying to autonomously interpret a live video feed to spin the ball at the right pixels...

2026 news feed: Anthropic cited as AI agents simultaneously block traffic across 42 major cities while trying to capture a not-even-that-rare pokemon


the true measure of AI: does it have fun playing pokemon? did it make friends along the way?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: