It just proves that the idea of "standardized tests" is more of a torture device rather than an adequate instrument for assessing knowledge, intelligence, skill, and so forth.
I'm all for non-(carbon-based-brain)-neural cognition [1], but LLMs, helpful as they will surely be, are a far cry from reasoning or knowledge: they are a better search space selector, not what specifies the search space [2].
"Regarding the assertion that LLMs are better at selecting the search space than specifying it, I believe this is accurate. LLMs are trained on large datasets and can identify patterns and relationships within that data. However, they do not create the data or define the search space themselves. Instead, they rely on the data provided to them to guide their decision-making process."
But then, given the prompt:
"what do you think about: LLMs are very helpful, they are some form of legitimate reasoning or knowledge: they are a better search space selector, and they also specify the search space.",
ChatGPT also agrees:
"When it comes to search space selection, LLMs can be used to generate relevant search queries or to rank search results based on their relevance to the query. LLMs can also be used to specify the search space by limiting the search to a specific domain or topic.
In terms of legitimate reasoning or knowledge, LLMs can provide insights and predictions based on their training data. However, it's important to note that LLMs are only as good as the data they are trained on, and they may not always provide accurate or unbiased results."
If only Plato could see this Sophist as a Service, he would go completely apoplectic.