Yes, lots of people have argued that Chomsky is wrong about various things for v...

ants_everywhere · 2025-11-08T15:37:48 1762616268

A realistic child linguistic corpus for a 2 year old starting to form sentences would be about 15 million words over the course of their lifetime. Converted to LM units that's maybe about 20 million tokens. There are small language models trained on sets that small.

Some LMs are specifically trained on child-focused small corpora in the 10 million range, e.g. BabyLM: https://babylm.github.io.

Keep in mind that before age 2, children are using individual words and getting much richer feedback than LMs are.

Humans can and do echolocate: https://en.wikipedia.org/wiki/Human_echolocation. There are also anatomical differences that are not cognitive that affect the abilities like echolocation. For example, the positioning and frequency response of sensors (e.g. ears) can affect echolocation performance.

foldr · 2025-11-08T16:10:36 1762618236

Yes, humans can echolocate to a limited extent, just as some animals have very limited analogs of human language. That was the point of the comparison. It is no more sensible to view human language as just a more complex variant of vervet monkey calls than it is to view bat echolocation as just a more complex variant of whatever limited capacity humans have in that area. There is continuity viewed from the outside, if you squint a little, but that's unlikely to correspond to continuity in terms of the underlying cognitive mechanisms. Bats, for example, can make precise calculations of distance based on a built-in reference for the speed of sound: https://www.pnas.org/doi/10.1073/pnas.2024352118

Children don't get 'rich feedback' at all on the grammatical structure of their sentences. I think this idea is probably based on a misconception of what 'grammar' is from a generative linguistics perspective. When was the last time that a child got rich feedback on their misinterpretation of an ACD construction? https://www.bu.edu/bucld/files/2011/05/29-SyrettBUCLD2004.pd...

LLMs trained on small datasets don't perform that well from the point of view of language acquisition – even up to 100 million tokens. There's not a very large literature on this because, as I said, there are many more people interested in making a drive-by critique of generative linguistics than there are people who are genuinely interested in investigating different models of child language acquisition. But here is one suggestive result: https://aclanthology.org/2025.emnlp-main.761.pdf See also the last paragraph of p.6 onwards of https://arxiv.org/pdf/2308.03228

The other point that's often missed in evaluations of these models is their capacity for learning completely non-human-like languages. Thus, the BabyLM models have some limited success in learning (for example) some island constraints, but could just have easily acquired languages without island constraints. That then leaves the question of why we do not see human languages without such constraints.

tim333 · 2025-11-08T22:01:36 1762639296

>Children don't get 'rich feedback' at all on the grammatical structure of their sentences.

They probably do get parents and the like correcting them or giving an example. Kid says we goed fish, adult say yeah we went fishing. I taught English as a foreign language a bit and people learn almost entirely from examples like that rather than talking about ellipsis or any sort of grammar jargon.

It seems brains / neurons / LLMs are good at pattern recognition. Brains probably quicker on the uptake than LLM backpropagation though.

foldr · 2025-11-08T23:22:45 1762644165

That particular example is irrelevant to poverty of the stimulus arguments because no-one has ever suggested that kids acquiring English lack evidence for the irregular past tense of ‘go’.

See above for some examples of the kinds of grammatical principles that can form the basis of a poverty of the stimulus argument. They’re not generally the kind of thing that parental corrections could conceivably help with, for two reasons:

1) (The main reason) Poverty of the stimulus arguments relate to features of grammatical constructions that are rarely exemplified. As examples are rarely uttered, deviant instances are rarely corrected, even assuming the presence of superlatively wise and attentive caregivers.

2) (The reason that you mention) Explicit instruction on grammatical rules has almost no effect on most people, especially young children. So corrections at most add a few more examples of bad sentences to the child’s dataset, which they can probably obtain anyway via more indirect cues.

If corrections were really effective, someone should be able to do a killer experiment where they show an improved (i.e. more adult-like) handling of, say, quantifier scope in four year olds after giving them lots of relevant corrections. I am open minded about the outcome of such an experiment, but I’d bet a fairly large amount of money that it would go nowhere.