Probabilities matter. So the machine ends up with a model of human language in which errors are present but uncommon, which is the same trick a child pulls off. For unrelated reasons I was looking recently at the Wug test. Run that test on very young kids in Japan and two-year-olds pass with no problem: Japanese has no plural inflection, so "1 Wug" => "2 Wug". But English-native kids at that age struggle with "1 Wug" => ???. They're aware there's a rule for how this works, but they aren't yet confident what the rule is. A year or two later, "1 Wug" => "2 Wugs": they've learned the rule, make a plural by adding an -s sound to the word.
I expect ChatGPT can pass the Wug test. In fact, unlike a random two-year-old, it will certainly have read about the actual Wug test, so definitely don't ask it about that word in particular; make up a new word.
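If you want to run this probe yourself, here's a minimal sketch of the "make up a new word" step: generating a pronounceable nonce word and wrapping it in a Wug-style prompt. The word shapes and prompt wording are my own invention, not from the original test.

```python
import random

def make_nonce_word(rng=random.Random()):
    # Assemble onset + vowel + coda so the word is pronounceable
    # but almost certainly absent from any training corpus.
    onsets = ["bl", "dr", "fl", "gr", "pl", "sn", "tr", "kr"]
    vowels = ["a", "e", "i", "o", "u"]
    codas = ["b", "d", "g", "k", "m", "p", "t"]
    return rng.choice(onsets) + rng.choice(vowels) + rng.choice(codas)

word = make_nonce_word()
# A Wug-style fill-in-the-blank prompt for the model.
prompt = f"This is a {word}. Now there is another one. There are two of them. There are two ___."
print(prompt)
```

If the model reliably completes the blank with the nonce word plus an -s, it's applying the rule rather than recalling a memorized plural.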
Now, human kids are learning a spoken language and the model is learning a written language, but both are linear sequences, so it's not that different.