The point is that if you have benchmarks for intelligence that humans would also fail, then you have to concede either that humans are not intelligent, that the benchmarks are too strict, or that they aren't a measure of intelligence at all.
The thing is, LLMs would fail that test every time, but humans would pass it most of the time (hopefully). Just because humans are fallible doesn't make LLMs intelligent.
We really haven't got a grip on what intelligence actually is, but it seems that humans and LLMs aren't really in the same ballpark, or even the same league.
>haven't got a grip on what intelligence actually is
Because intelligence isn't a single thing, it's a bunch of different capabilities that intelligent beings have to varying degrees (or not at all).
This is why measures of intelligence always fail: we try to make it binary, which doesn't work. Intelligence is spiky. It scales from very small and dumb to very smart, but even the things that are very smart at a lot of tasks still do very dumb things. We also measure human intelligence across all of humanity, while we judge LLM intelligence on a particular model.