I had this conversation before. I point out how your interpretation is insane an...

mythrwy · 2025-04-02T00:39:47 1743554387

Let me give you a simple example; maybe it will make this easier to see.

Let's say a person has a recessive faulty gene. The gene doesn't get expressed because there is only one copy (recessive). We can notate this Aa (small "a" being the faulty gene, large "A" being the good copy). The person has two copies because they get one from each parent.

So "Aa" has a partner we can notate as "AA" (two good copies of the gene). AA and Aa have a child. What is the chance the child carries the recessive gene? 50%, because there are 4 possible combinations (A or A from one parent, A or a from the other) and 2 of them include "a". Can the child have two bad copies (i.e. "aa", where the gene gets expressed)? No, they cannot, because the parents have only one bad copy between them. At worst the child gets "Aa". The other 50% of the time they get "AA".
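
If you want to check the arithmetic rather than take my word for it, here is a minimal sketch that enumerates the combinations for a single gene with two alleles. The function name `cross` and the string notation for genotypes are just my choices for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return {genotype: probability} for a child of two parents.

    Each parent is a two-letter string such as "Aa"; each of its two
    alleles is passed on with equal probability.
    """
    # Enumerate the four allele pairings (the cells of the table).
    # Sorting puts the capital letter first, so "aA" is counted as "Aa".
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


child = cross("AA", "Aa")
for genotype, prob in sorted(child.items()):
    print(genotype, prob)  # AA 1/2, then Aa 1/2; "aa" never appears
```

Swap in `cross("Aa", "Aa")` to see the case where "aa" becomes possible (1 cell in 4).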

Let's say AA and Aa have a bunch of kids, and the kids intermarry. Then their kids intermarry. Now what is the chance of an individual having two bad copies (i.e. "aa")? What is the chance they have one bad copy (Aa)?

It's just probability calculations, and expression becomes more probable the more copies of the bad gene there are in the gene pool. I.e., within a population the errors accumulate, they build up, and there is a larger chance of getting expression of the defect (aa) with continued inbreeding.
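
A rough sketch of that calculation for the first round of intermarriage, under a simplifying assumption that is mine and not part of the example above: the kids pair up at random, and there are enough of them that half are AA and half are Aa. The helper names `offspring` and `next_generation` are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def offspring(parent1, parent2):
    """Genotype probabilities for one child of two given parents."""
    result = defaultdict(Fraction)  # Fraction() is 0
    for pair in product(parent1, parent2):
        # Each of the four allele pairings is equally likely.
        result["".join(sorted(pair))] += Fraction(1, 4)
    return dict(result)


def next_generation(dist):
    """Genotype distribution of children when parents pair at random.

    `dist` maps genotype -> share of the current generation.
    """
    result = defaultdict(Fraction)
    for (g1, p1), (g2, p2) in product(dist.items(), repeat=2):
        # Weight each couple by how likely that pairing is.
        for genotype, prob in offspring(g1, g2).items():
            result[genotype] += p1 * p2 * prob
    return dict(result)


kids = {"AA": Fraction(1, 2), "Aa": Fraction(1, 2)}
grandkids = next_generation(kids)
for genotype in ("AA", "Aa", "aa"):
    print(genotype, grandkids[genotype])  # AA 9/16, Aa 3/8, aa 1/16
```

So "aa" goes from impossible in the kids to 1 in 16 in the grandkids. Calling `next_generation` again on the result models the following generation under the same random-pairing assumption; mating between close relatives in a small family would need a different model.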

This works with desirable genes too, which is why we have so many kinds of dogs, for instance. We select for a trait and build up copies of the genes we want to see expressed, to the point there is a 100% (or close to it) chance of expression.

Hopefully you get this now. If not, read up on Mendelian genetics and Punnett squares (the table calculations); maybe that will help you see it.

------------------------

So let me take this back to the original example of LLMs. Suppose there is a 1% chance an LLM confidently claims that a Python library "Foo" exists and does XX when that's not true. This is analogous to a bad copy of the gene. If you train on that output (i.e. "inbreeding"), then use the result as a reference (more inbreeding), soon many sources will say "Foo" exists and you'll have a larger chance of getting "Foobarred" information from the LLM.
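
To make the feedback loop concrete, here is a toy sketch. Everything in it is an assumption for illustration, not a measurement of any real model: the 1% base rate comes from the example above, while the `synthetic_share` parameter (how much of each new training corpus is model output) and the rule that the model repeats a claim as often as it saw it in training are invented.

```python
def simulate(rounds, base_rate=0.01, synthetic_share=0.5):
    """Toy model: share of training text claiming "Foo" exists, per round.

    base_rate:       chance the model invents the claim on its own (assumed)
    synthetic_share: fraction of each new corpus that is model output (assumed)
    """
    share = 0.0  # the original human-written corpus never mentions "Foo"
    history = []
    for _ in range(rounds):
        # Assumption: the model repeats the claim as often as it saw it
        # in training, plus its own base rate of making it up.
        output_rate = min(1.0, base_rate + share)
        # The next corpus mixes the old text with fresh model output.
        share = (1 - synthetic_share) * share + synthetic_share * output_rate
        history.append(share)
    return history


for round_number, share in enumerate(simulate(5), start=1):
    print(f"round {round_number}: {share:.3%} of sources mention Foo")
```

In this toy model the share only ever goes up, because nothing removes the false claim once it is in the corpus. Real training pipelines filter and reweight data, so treat the shape of the curve as the point, not the numbers.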