
Minor edits to well-known problems do easily fool current models, though. Here's one that 4o and o1-mini fail on but o1-preview passes. (It's the mother/surgeon riddle, so kinda gory.)

https://chatgpt.com/share/6723477e-6e38-8000-8b7e-73a3abb652...

https://chatgpt.com/share/6723478c-1e08-8000-adda-3a378029b4...

https://chatgpt.com/share/67234772-0ebc-8000-a54a-b597be3a1f...



I think you didn't use the "share" function; I cannot open any of these links. Can you do it in a private browser session (so you're not logged in)?


Oops, fixed the links.

o1-mini's answer is correct, but in the very next sentence it forgets that fathers are male.



