
It's more complicated than that. What you get out is defined by what you put in. At first it was random or selected internet garbage plus books and docs, i.e. data not designed for training. Then came tuning. Now we can use a trained model to generate data designed for training, with specific qualities, in this case reasoning, and then train the next model on it. Intuitively, that model can be smaller and still better at what we trained it for. I showed two options for how the data can be generated; there are others, of course.
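To make the "generate data designed for training" step concrete, here is a toy sketch, not anyone's actual pipeline. `teacher_solve` is a hypothetical stand-in for a trained model (here it just does two-digit multiplication by hand), and `generate_dataset` is likewise a made-up helper. The point is only the shape of the loop: the teacher produces a reasoning trace, the answer is checked, and only verified examples go into the dataset for the next model.

```python
import random


def teacher_solve(a, b):
    """Hypothetical stand-in for a trained model.

    Returns a step-by-step reasoning trace and a final answer
    for a * b by splitting b into tens and ones.
    """
    tens, ones = divmod(b, 10)
    steps = [
        f"{a} * {tens * 10} = {a * tens * 10}",
        f"{a} * {ones} = {a * ones}",
        f"{a * tens * 10} + {a * ones} = {a * tens * 10 + a * ones}",
    ]
    return steps, a * tens * 10 + a * ones


def generate_dataset(n, seed=0):
    """Build n training examples with reasoning traces (assumed format)."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        a, b = rng.randint(10, 99), rng.randint(10, 99)
        steps, answer = teacher_solve(a, b)
        # Keep only examples whose final answer can be verified independently.
        if answer == a * b:
            data.append({
                "prompt": f"What is {a} * {b}?",
                "reasoning": steps,
                "answer": answer,
            })
    return data


dataset = generate_dataset(5)
for example in dataset:
    print(example["prompt"], "->", example["answer"])
```

With a real model the verification step is where most of the work is, since the teacher can be wrong; here it always passes because the toy teacher is exact.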

As for humans, assuming they all have the same intellectual abilities genetically, you can still see differences in how different groups develop. That's mostly defined by how well each generation trains the next one. Schools exist exactly for this.