So the researchers at Deepmind, OpenAI, Anthropic, etc, are not "serious front l...

ein0p · on June 19, 2024

Apparently not. Or maybe they are heavily incentivized by the hype cycle. I'll repeat one more time: none of the currently known approaches are going to get us to AGI. Some may end up being useful for it, but large chunks of what we think is needed (cognition, world model, ability to learn concepts from massive amounts of multimodal, primarily visual, and almost entirely unlabeled, input) are currently either nascent or missing entirely. Yann LeCun wrote a paper about this a couple of years ago, you should read it: https://openreview.net/pdf?id=BZ5a1r-kVsf. The state of the art has not changed since then.

comp_throw7 · on June 20, 2024

I hope you have some advanced predictions about what capabilities the current paradigm would and would not successfully generate.

Separately, it's very clear that LLMs have "world models" in most useful senses of the term. Ex: https://www.lesswrong.com/posts/nmxzr2zsjNtjaHh7x/actually-o...

I don't give much credit to the claim that it's impossible for current approaches to get us to any specific type or level of capabilities. We're doing program search over a very wide space of programs; what that can result in is an empirical question about both the space of possible programs and the training procedure (including the data distribution). Unfortunately it's one where we don't have a good way of making advance predictions, rather than "try it and find out".

ein0p · on June 20, 2024

It is in moments like these that I wish I wasn’t anonymous on here and could bet a 6 figure sum on AGI not happening in then next 10 years, which is how I define “foreseeable future”.

ToValueFunfetti · on June 20, 2024

You disagreed that 2047 was reasonable on the basis that researchers didn't think it wouldn't happen in the foreseeable future, so your definition must be at least 23 years for consistency's sake

ein0p · on June 20, 2024

I'd be OK with that, too, if we adjusted the bet for inflation. This is, in a way, similar to fusion. We're at a point where we managed to ignite plasma for a few milliseconds. Predictions of when we're going to be able to generate energy have become a running joke. The same will be the case with AGI.

fastball · on June 19, 2024

LeCun has his own interests at heart, works for one of the most soulless corporations I know of, and devotes a significant amount of every paper he writes to citing himself.

He is far from the best person to follow on this.

ein0p · on June 20, 2024

Be that as it may, do you disagree with anything concrete from this paper?

fastball · on June 20, 2024

Fair, ad hominems are indeed not very convincing. Though I do think everyone should read his papers through a lens of "having a very high h-index seems to be a driving force behind this man".

Moving on, my main issue is that it is mostly speculation, as all such papers will be. We do not understand how intelligence works in humans and animals, and most of this paper is an attempt to pretend otherwise. We certainly don't know where the exact divide between humans and animals is and what causes it, which I think is hugely important to developing AGI.

As a concrete example, in the first few paragraphs he makes a point about how a human can learn to drive in ~20 hours, but ML models can't drive at that level after countless hours of training. First you need to take that at face value, which I am not sure you should. From what I have seen, the latest versions of Tesla FSD are indeed better at driving than many people who have only driven for 20 hours.

Even if we give him that one though, LeCun then immediately postulates this is because humans and animals have "world models". And that's true. Humans and animals do have world models, as far as we can tell. But the example he just used is a task that only humans can do, right? So the distinguishing factor is not "having a world model", because I'm not going to let a monkey drive my car even after 10,000 hours of training.

Then he proceeds to talk about how perception in humans is very sophisticated and this in part is what gives rise to said world model. However he doesn't stop to think "hey, maybe this sophisticated perception is the difference, not the fundamental world model". e.g. maybe Tesla FSD would be pretty good if it had access to taste, touch, sight, sound, smell, incredibly high definition cameras, etc. Maybe the reason it takes FSD countless training hours is because all it has are shitty cameras (relative to human vision and all our other senses). Maybe linear improvements in perception leads to exponential improvement in learning rates.

Basically he puts forward his idea, which is hard to substantiate given we don't actually understand the source of human-level intelligence, and doesn't really want to genuinely explore (i.e. steelman) alternate ideas much.

Anyway that's how I feel about the first third of the paper, which is all I've read so far. Will read the rest on my lunch break. Hopefully he invalidates the points I just made in the latter 2/3rds.