Take active learning versus usual learning. Often with active learning you can learn much faster. That's a kind of "experience." Out of distribution problems where it fails to generalize could be dealt with much more efficiently when a model can ask "hey what's f(x=something really weird and specific that would never come up in an entire internet's worth of training data)?" Experience isn't passive, and that makes a whole world of difference. And that's not even touching on the difficulty of "tell me all about elephants" versus "let me interact with an elephant and see it and touch it and physically study it."