I don't think saying "humans pass and AI doesn't" makes any sense here because the two are not even taking the same exam for all the points outlined above.
Evolution alone means humans are "cheating" in this exam, making any comparisons fairly meaningless.
If all you care about is the results, or even specifically just the visible part of the costs, then there's no such thing as cheating.
That's both why I'm fine with the AI "cheating" via transistors that are faster than my synapses by roughly the same magnitude that my legs are faster than continental drift (no really, I checked), and why I'm fine with humans "cheating" via evolutionary history and a much more complex brain, around a few thousand times the size of GPT-3. Which is kinda wild, given what it implies about the potential of even rodent brains, given enough experience and the right (potentially evolved) structures.
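For anyone who wants to check that magnitude claim themselves, here's a rough back-of-envelope sketch. All the figures are my own assumed ballpark values (average cortical firing rate ~1 Hz, transistor switching ~1 GHz, walking ~1.4 m/s, drift ~2.5 cm/year), not numbers from any particular source, and the exact exponent shifts a bit depending on which you pick:

```python
import math

# All figures below are rough assumed ballparks, not measured values.
synapse_hz = 1.0                           # assumed average cortical neuron firing rate, ~1 Hz
transistor_hz = 1e9                        # assumed transistor switching speed, ~1 GHz
walking_mps = 1.4                          # assumed human walking speed, m/s
drift_mps = 0.025 / (365.25 * 24 * 3600)   # assumed continental drift, ~2.5 cm/year in m/s

speedup_ai = transistor_hz / synapse_hz    # how much faster transistors tick than synapses fire
speedup_legs = walking_mps / drift_mps     # how much faster legs move than continents

print(f"transistor vs synapse: ~10^{math.log10(speedup_ai):.0f}")
print(f"legs vs drift:         ~10^{math.log10(speedup_legs):.0f}")
```

With these assumptions both ratios land around nine orders of magnitude, though swapping in, say, peak firing rates instead of averages would shave a couple of orders off the first one.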
When the topic is qualia (whether "can the AI suffer?" or "are mind uploads a continuation of experience?"), I care about the inner workings; but for economic transformation and alignment risks, I care only whether the magic pile of linear algebra is cost-efficient at solving problems (including the problem "how do I draw a photorealistic werewolf in a tuxedo riding a motorbike past the pyramids"), nothing else.