
"I am not a madman for saying that it is likely that the code for artificial general intelligence is going to be tens of thousands of lines of code, not millions of lines of code. This is code that conceivably one individual could write, unlike writing a new web browser or operating system."

- John Carmack, Lex Fridman Podcast (August 4th, 2022)

This was around 3 months before ChatGPT's initial release.

Timestamped: https://www.youtube.com/watch?v=I845O57ZSy4&t=14677s



I listened to that podcast at the time. I remember being surprised by one thing in particular. He was talking about predictions of when the first people will land on Mars. He said that we tend to underestimate many of the challenges involved, so we tend to think it will happen sooner than it is reasonable to expect. He also recounted a time he was discussing this with other people: once they decided to bet money on whose prediction would be closest to reality, everyone involved started considering the challenges more thoroughly and became more cautious about claiming it would happen very soon. But then he threw all of his arguments about the bias towards unrealistic optimism, and the need for more nuanced analysis, out of the window when the conversation turned to when we will have AGI.


For a detailed accounting of the challenges involved, I recommend this video:

Why It Would Be Preferable To Colonize Titan Instead Of Mars https://youtu.be/_InuOf8u7e4?si=hRO1ZYCZtbQXUuK9

If you fully watch it, or already know these issues, the notion of going to Mars any time soon seems outright foolish.

If you have a large enough platform, it might still be useful as a fundraising campaign, a PR stunt, or a way to rally the uninformed masses around a bold vision. But that's about it.


His take was not really "novel", however; John McCarthy said basically the same thing multiple times in the 90s, and maybe even the 80s. He would say something along the lines of: "If we ever get to an algorithm that expresses general intelligence, we will be able to write it down in one or two pages of a manual. Such a book will still be rather long, and the rest of the pages will be about how we got to that algorithm and why it took us so long."


"Attention Is All You Need" and GPT-2 were well known at that point. Many might doubt whether this approach leads to "general" intelligence – that depends on the definition.

BTW, Karpathy has a nice video tutorial about building an LLM: https://www.youtube.com/watch?v=kCc8FmEb1nY
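
For a sense of how small that core is, here is a single-head causal self-attention block in plain NumPy. This is only an illustrative sketch, not Karpathy's code; the shapes, weight names, and toy input are made up.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(x, Wq, Wk, Wv):
        """Single-head causal self-attention over a (seq_len, d_model) input."""
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        # causal mask: each position may only attend to itself and the past
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
        return softmax(scores) @ v

    # toy usage: 8 tokens with 16-dimensional embeddings
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))
    Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
    out = self_attention(x, Wq, Wk, Wv)
    print(out.shape)  # (8, 16)

Stack a few dozen blocks like this with MLPs, embeddings, and a training loop, and you have essentially the whole model definition; the bulk of a real codebase is data pipelines and infrastructure.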


Depends what you mean by 'code', of course. There are some pretty high-level libraries doing the heavy lifting within those tens of thousands of lines. And if you also count the training data and the billions of learned parameters required to make this work as "code", then that's a lot of "lines".

At the other extreme, if you're happy to exclude 'libraries', then I could wrap those tens of thousands of lines in a bash script and claim I have written an artificial general intelligence in only one line.


Everybody who has worked on a large/huge code base knows that most of it is there for historical reasons: it's always safer to add a new feature than to replace something old, as most of the time people don't know if that old code is useful or not.

BTW, tinygrad shows that tens of thousands of lines are enough; it skips even most of the AMD kernel drivers and talks to the hardware directly.


I have seen people rewrite large chunks of functionality without knowing it already existed in a form they weren't aware of. I have done it myself a few times: write a largish blob of code, then replace it with one system call that does the same thing.

Also, most things are simple. It is the data, the libraries, and the applications built from those simple things that are interesting. Take C++, for example: the actual language is a few dozen keywords, but the body of libraries and software built from those small pieces is enormous.

Also, most drivers are written so that we can reuse the code. But with the right docs you can twiddle the hardware yourself; that was 'the 80s/90s way'. It was not until macOS/OS2/Windows/Xlib that we abstracted the hardware, mostly so we could reuse things consistently.



Time for another Year of Linux (TM) prediction:

Linux (and friends) do a lot of things on the command line. LLMs are good at writing text and using a command line interface. Creating LLMs and AIs takes relatively little work and is interesting, and open source is especially good at this type of work. Therefore, I predict that Linux will keep pace with other operating systems when it comes to voice control.


Time for another prediction that a computer will have a PhD by the end of the decade: 1960, 1970, 1980, 1990, 2010, 2020, 2030. Next, yawn.


His claim is just that AGI will largely consist of stateless mathematical operations -- all of which could be a single line of code if written that way.

As a claim, it's antique -- it would easily have been a view of Turing and others many decades ago.

And as a claim, it's as false now as it was then. Not least because the code that is the actual algorithm for generating an LLM includes all the code that goes into its data collection, which is just being inlined (/cached) into the weights.

However, more than that, it's an extremely impoverished view of general intelligence, one which eliminates any connection between intelligence and a body. All the "lines of code" beyond that single one are concerned with I/O and device manipulation.

Thus this is just another way of repeating the antique superstition that intelligence has nothing to do with embodiment.


> As a claim, it's antique -- it would easily have been a view of Turing and others many decades ago.

Absolutely.

We discussed this in the 90s and were of the opinion, then, that even state-of-the-art NNs (of the 90s) wouldn't get much more complex, given their actual mathematical descriptions.

The rest is all the bells and whistles around managing training, plus the real-world input/output translation code.

The core or 'general case' always will be tiny.
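
To make "tiny" concrete, here is a complete forward pass, hand-derived backward pass, and SGD update for a two-layer network in NumPy. This is a toy sketch with made-up data and hyperparameters, but it shows how the mathematical core fits on a page; the rest of a real system is training management and I/O.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 4))
    y = np.sin(X.sum(axis=1, keepdims=True))      # arbitrary toy target

    W1 = rng.normal(size=(4, 32)) * 0.1
    b1 = np.zeros((1, 32))
    W2 = rng.normal(size=(32, 1)) * 0.1
    b2 = np.zeros((1, 1))

    lr = 0.01
    for step in range(2000):
        # forward pass
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        loss = np.mean((pred - y) ** 2)
        # backward pass (gradients derived by hand for this two-layer net)
        d_pred = 2 * (pred - y) / len(X)
        dW2, db2 = h.T @ d_pred, d_pred.sum(axis=0, keepdims=True)
        d_h = (d_pred @ W2.T) * (1 - h ** 2)
        dW1, db1 = X.T @ d_h, d_h.sum(axis=0, keepdims=True)
        # SGD update
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print(f"final loss: {loss:.4f}")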


Executing computer code is embodied in a physical machine. Modern machines have vision and audio I/O capabilities, for example.

You seem to be assuming, without any evidence, a narrow view of what embodiment entails.

The "antique superstition" claim is particularly ironic in this context, since it much more clearly applies to your own apparently irrational hasty conclusion.


What claim do you think I'm making?


> antique claim ... extremely impoverished ... antique superstition

Frankly, when the total argument against a position consists of such "boo" words, I immediately suspect some projection of personal preferences.

But anyway I googled "intelligence vs embodiment" and found this quite nice summary albeit from 2012: https://pmc.ncbi.nlm.nih.gov/articles/PMC3512413/

The basic idea seems to be that human conscious experience is a mishmash of sensory-reflexive impulses and internal deliberation, combined into a sophisticated narrative. Simulating this kind of combination may help robots to move about in the world and to relate more closely to human experience. I have a lot of sympathy with this although I'm guarded about how much it really tells us about the potentials for AGI.

> Not least because all the code which is the actual algorithm for generating an LLM is all the code that goes into its data collection which is just being inlined (/cached) with weights.

Could this "data collection" code not potentially be put into a few thousand lines also?


> Could this "data collection" code not potentially be put into a few thousand lines also?

This would mean hardcoding a model of the world. Which maybe, with the help of some breakthrough from physics, would be possible (though I think the kind of breakthrough needed to get down to such a small size would be a theory of everything). But this means eliminating the self-learning part of current neural networks, which is what makes them attractive: you don't have to hardcode the rules, as the model is able to learn them.


What I mean is, we could program an entity which gathers its own data and updates its own weights, with the lines of code numbering in the tens of thousands.
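
For illustration only, a sketch of what that orchestration might look like, with model and environment as purely hypothetical objects standing in for all the hard parts:

    # Hypothetical sketch of a self-training loop: the agent gathers its
    # own data and updates its own weights. The orchestration is tiny; the
    # hard parts are hidden inside `environment` and `model` (both assumed).

    def run_agent(model, environment, steps=1_000_000, batch_size=64):
        buffer = []                                 # experience gathered so far
        observation = environment.reset()
        for _ in range(steps):
            action = model.act(observation)         # act using current weights
            next_observation, feedback = environment.step(action)
            buffer.append((observation, action, feedback))
            observation = next_observation

            if len(buffer) >= batch_size:           # learn from own experience
                model.update(buffer[-batch_size:])  # gradient step on recent data
        return model

The loop itself is trivially small; whether the learning rule inside model.update is up to the task is the open question.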


But the OP's point was that people talking about tiny LLMs fail to consider the number of parameters used. This has nothing to do with the ability to autonomously collect the training data.


That's also why Nvidia's position is fragile. The cost of the GPUs is so high that at some point AI will have to become profitable, and it is practical for those models to be ported to a competing platform.



