
Given how many university-level tests GPT-4 scores above the 50th percentile on, I don't know if "catching up to 2 year olds" is a fair description. For that kind of text-based task it seems well ahead of the general adult human population.



To be fair, such tests are designed with the human mind in, well, mind, and assume that various hard-to-quantify variables – ones that the tester is actually interested in – correlate with test performance. But LLMs are alien minds with very different correlations. It’s clear, of course, that ChatGPT’s language skills vastly exceed those of an average 2-year-old, and indeed surpass the skills of a considerable fraction of the general adult population, but the generality of its intelligence is probably not above that of a human toddler.


You could write a quiz answer bot that is well ahead of the general population without any AI, just by summarizing the first page of Google results for that question. We test humans on these subjects because the information is relevant, not because they are expected to remember and reproduce it better than an electronic database.

If the test is designed to quantify intelligence and is not present in the corpus, ChatGPT does about as well as a dog, and there is little reason to think LLMs will improve drastically here.



