This isn’t a complete answer, but my short list for moving the tech many steps forward would be:
* replying with “I don’t know” a lot more often
* consistent responses based on the accessible corpus
* far fewer errors (hallucinations)
* being able to beat Pokémon reliably and in a decent time frame without any assistance or prior knowledge about the game or gaming in general (Gemini 2.5 Pro had too much help)