Even just in the exam-passing category, GPT-4 showed no improvement over GPT-3.5 on AP Language & Composition or AP English Literature, and scored quite poorly on both.
Now, granted, plenty of humans don't score above a 2 on those exams either. But I think it's indicative that there's still plenty of progress left to make before this technology is indistinguishable from magic.