
It's both. I recently saw a comparison of various models on two IQ tests: one public, and one carefully curated so that it could not be learned directly from the likely training sets.

On public tests, LLMs vary between "just below average human" and "genius".

On the hopefully-private test (it's difficult to be sure*), the best was o1, which scored "merely" just below an average human; Claude-3 Opus scored poorly; and all the rest landed in "would need a full-time caretaker" territory.

In both cases, improvements to the models came with higher scores; but there's still a lot you can gain by learning the test itself, and that is one thing LLMs are definitely superhuman at.

https://www.maximumtruth.org/p/massive-breakthrough-in-ai-in...

* I could have said the same last year about last year's models, so I'm emphatically not claiming o1 really is as smart as this test suggests; I'm only saying this demonstrates that these IQ tests are a learnable skill, up to at least this magnitude of difference.



