
Because it IS wrong.

Just months ago we saw in research out of Harvard that even a very simplistic GPT model builds internalized abstract world representations from the training data within its NN.

People parroting the position you and the person before you are taking are like doctors who learned something in school but haven't kept up with the research that has since invalidated it, so they go around spreading misinformation because it was thought to be true when they learned it, even though it's now known to be false and the news just hasn't caught up with them yet.

So many armchair experts who took an ML course in undergrad are putting in their two cents without having read any of the papers from the past year.

This is a field where research perspectives are shifting within months, not years. So unless you are actively engaging with emerging papers, and given your comment I'm guessing you aren't, you may be on the wrong side of the Dunning-Kruger curve here.




> Because it IS wrong.

Do we really know it IS wrong?

That's a very strong claim. I believe you that there's a lot happening in this field, but it doesn't seem possible to answer the question either way yet. We don't know what reasoning looks like under the hood. It's still a "know it when you see it" situation.

> GPT model builds internalized abstract world representations from the training data within its NN.

Do any of those words even have well-defined meanings in this context?

I'll try to figure out what paper you're referring to. But if I don't find it / for the benefit of others just passing by, could you explain what they mean by "internalized"?


> Just months ago we saw in research out of Harvard that even a very simplistic GPT model builds internalized abstract world representations from the training data within its NN.

I've seen this asserted without citation numerous times recently, but I am quite suspicious. Not suspicious that a study exists which claims this, but that the claim is well supported.

There is no mechanism for directly assessing this, and I'd be suspicious that there is any good proxy for assessing it in AIs, either. Research on this type of cognition in animals tends to be contentious, and proxies for animals should be easier to construct than proxies for AIs.

> the wrong side of the Dunning-Kruger curve

The relationship between confidence and self-perception in the D-K paper, as I recall, is a line, and it's roughly “on average, people of all competency levels see themselves slightly closer to the 70th percentile than they actually are.” So, I guess the “wrong side” is anywhere under the 70th percentile in the skill in question?


> I guess the “wrong side” is the side anywhere under the 70th percentile in the skill in question?

This is being far too generous to parent’s claim, IMO. Note how much “people of all competency levels see themselves slightly closer to the 70th percentile than they actually are” sounds like regression to the mean. And it has been compellingly argued that that’s all DK actually measured. [1] DK’s primary metric for self-assessment was to guess your own percentile of skill against a group containing others of unknown skill. This fully explains why their correlation between self-rank and actual rank is less than 1, and why the data is regressing to the mean, and yet they ignored that and went on to call their test subjects incompetent, despite having no absolute metrics for skill at all and testing only a handful of Ivy League students (who are primed to believe their skill is high).
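If you want to see that regression-to-the-mean point concretely, here's a quick toy simulation (my own sketch, not something from the DK paper or the linked post): every simulated person estimates their own skill with zero systematic bias, only noise, and the familiar DK-shaped gap between perceived and actual percentile still appears.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # True skill plus an unbiased but noisy self-assessment:
    # nobody here systematically overrates themselves.
    skill = rng.normal(size=n)
    self_estimate = skill + rng.normal(scale=1.0, size=n)

    def percentile_rank(x):
        # rank each value against the group, scaled to [0, 100)
        return x.argsort().argsort() / len(x) * 100

    actual = percentile_rank(skill)
    perceived = percentile_rank(self_estimate)

    # Bin by actual-skill quartile and compare means, as in the DK plots.
    for q in range(4):
        mask = (actual >= q * 25) & (actual < (q + 1) * 25)
        print(f"quartile {q + 1}: actual {actual[mask].mean():5.1f}  "
              f"perceived {perceived[mask].mean():5.1f}")

Run it and the bottom quartile appears to overestimate its rank while the top quartile appears to underestimate it, even though no one is miscalibrated on average; noisy percentile estimates simply regress toward the 50th percentile.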

Furthermore, it’s very important to know that replication attempts have shown a complete reversal of the so-called DK effect for tasks that actually require expertise. DK only measured very basic tasks, and one of the four tasks was subjective(!). When people have tried to measure the DK effect on things like medicine or law or engineering, they’ve shown that it doesn’t exist. Knowledge of NN research is closer to an expert task than a high school grammar quiz, and so not only does DK not apply to this thread, we have evidence that it’s not there.

The singular reason DK even exists in the public consciousness may be that people love the idea that they can somehow see and measure incompetence in a debate based on how strongly an argument is worded. Unfortunately that isn't true, and one of the few things the DK paper actually did show is that people's estimates of their relative skill correlate with their actual relative skill, for the few specific skills measured. Personally I think the paper's methodology has a confounding-factor hole the size of the Grand Canyon, that the authors and the public have both dramatically over-estimated its applicability to all humans and all skills, and that it's one of the most shining examples of sketchy social-science research going viral, leaving the public with misconceptions, and being used incorrectly more often than not.

[1] https://www.talyarkoni.org/blog/2010/07/07/what-the-dunning-...


Why are you taking the debate personally enough to be nasty to others?

> you may be on the wrong side of the Dunning-Kruger curve here.

Have you read the Dunning & Kruger paper? It demonstrates a positive correlation between confidence and competence. Citing DK in the form of a thinly veiled insult is misinformation of your own, demonstrating and perpetuating a common misunderstanding of the research. And the paper is more than 20 years old...

So I’ve just read the Harvard paper, and it’s good to see people exploring techniques for X-raying the black box. Understanding better what inference does is an important next step. What the paper doesn’t explain is what’s different between a “world model” and a latent space. It doesn’t seem surprising or particularly interesting that a network trained on a game would have a latent-space representation of the board. Vision networks already did this; their latent spaces have edge and shape detectors. And yet we already know those older networks weren’t “reasoning”. Not that much has fundamentally changed since then, other than that we’ve learned how to train larger networks reliably and we use more data.

Arguing that this “world model” is somehow special seems premature and rather overstated. The Othello research isn’t demonstrating an “abstract” representation; it’s the opposite of abstract. The network doesn’t understand the game rules, can’t reliably play full Othello games, and can’t describe a board in any terms other than what it was shown. It only has an internal model of a board, formed by being shown millions of boards.
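For anyone wondering how claims like "the network has an internal model of the board" get tested at all: the usual approach is to train a small classifier (a "probe") on the network's hidden activations and see whether it can read off the board state. Here's a toy sketch of that idea using entirely synthetic activations; nothing in it is taken from the Harvard paper, whose actual probes and encodings may well differ.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for what a real probe uses: a hidden-state
    # vector captured from the model after each move, paired with the
    # true contents of one board square (0=empty, 1=black, 2=white).
    n_positions, hidden_dim = 4_000, 256
    square_state = rng.integers(0, 3, size=n_positions)

    # Bake a linear encoding of the label into the fake activations so
    # the probe has something to recover; in the real experiment,
    # whether such an encoding exists is exactly the question asked.
    class_directions = rng.normal(size=(3, hidden_dim))
    activations = rng.normal(size=(n_positions, hidden_dim)) \
        + class_directions[square_state]

    split = n_positions // 2
    probe = LogisticRegression(max_iter=2_000)
    probe.fit(activations[:split], square_state[:split])
    print("held-out probe accuracy:",
          probe.score(activations[split:], square_state[split:]))

If the probe generalizes to held-out positions, that's the kind of evidence people summarize as a "world model"; whether that representation deserves to be called "abstract" is exactly what's being argued above.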


Do you have a link to that Harvard research?




