Even just in the exam-passing category, GPT-4 showed no improvement over GPT-3.5 on AP English Language and Composition or AP English Literature and Composition, and scored quite poorly on both.

Now, granted, plenty of humans don't score above a 2 on those exams either. But I think it shows there's still plenty of progress left to make before this technology is indistinguishable from magic.