ruszki 41 days ago, on: GPT-5 is already (ostensibly) available via API

And nowadays it is a better-known benchmark, so data scientists can overfit their models to it even more, and LLMs are already famous for overfitting. So I wouldn't trust any results on this specific test anymore.