Again - I'd argue that the extraordinary success of LLMs, in a relatively short amount of time, using a fairly unsophisticated training approach, is strong evidence that coding models are going to get a lot better than they are right now. Will they definitely surpass every human? I don't know, but I wouldn't say we're lacking extraordinary evidence for that claim either.
The way you've framed it, it sounds like the only evidence you'll accept is the thing actually having happened.
Well, predicting the future is always hard. But if someone claims some extraordinary future event is going to happen, you at least ask for their reasons for claiming so, don't you?
In my mind, at this point we either need (a) some previously "hidden" super-massive source of training data, or (b) another architectural breakthrough. Without either, this is a game of optimization, and the scaling curves are going to plateau really fast.
a) It hasn't even been a year since the last big breakthrough: the o1/o3-style reasoning models only started appearing last September, and we don't know how far those will go yet. I'd wait a second before assuming the low-hanging fruit has all been picked.
b) I think coding is a really good environment for agents / reinforcement learning. Rather than requiring a continual supply of new training data, we give the model coding tasks to execute (writing / maintaining / modifying) and then test its code for correctness. We could, for example, take the entire history of a codebase and just give the model its changing unit + integration tests to implement. My hunch (with no extraordinary evidence) is that this is how coding agents start to nail some of the higher-level abilities.
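For a sense of what that loop could look like, here's a minimal sketch. It assumes a pytest-based project and a hypothetical `model` object with `propose_patch` / `update` methods; none of these names belong to any real framework, they're just placeholders for the idea of using a repo's own test suite as the reward signal.

    import os
    import shutil
    import subprocess
    import tempfile

    def apply_patch(repo_dir: str, patch: dict) -> None:
        """Apply a patch expressed as {relative_path: new_file_contents}."""
        for rel_path, contents in patch.items():
            path = os.path.join(repo_dir, rel_path)
            dir_name = os.path.dirname(path)
            if dir_name:
                os.makedirs(dir_name, exist_ok=True)
            with open(path, "w") as f:
                f.write(contents)

    def run_tests(repo_dir: str) -> float:
        """Run the repo's test suite; the pass/fail result is the reward signal."""
        # Assumes a pytest project; any test runner with an exit code would do.
        result = subprocess.run(["pytest", "-q", "--tb=no"], cwd=repo_dir,
                                capture_output=True, text=True)
        # Crude binary reward; a real setup would parse pass counts for a denser signal.
        return 1.0 if result.returncode == 0 else 0.0

    def training_episode(model, repo_snapshot: str, next_tests_dir: str) -> float:
        """One episode: give the model a historical repo state plus the *next*
        commit's unit/integration tests, ask for a patch, score it by running them."""
        workdir = tempfile.mkdtemp()
        shutil.copytree(repo_snapshot, workdir, dirs_exist_ok=True)
        shutil.copytree(next_tests_dir, os.path.join(workdir, "tests"), dirs_exist_ok=True)

        patch = model.propose_patch(workdir)   # hypothetical model call
        apply_patch(workdir, patch)

        reward = run_tests(workdir)
        model.update(patch, reward)            # hypothetical RL update, e.g. policy gradient
        return reward

The appeal of this setup is exactly what the comment says: the environment generates its own supervision (tests pass or they don't), so you're not bottlenecked on scraping ever more human-written code.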
the "reasoning" models are already optimization, not a breakthrough.
They are not reasoning in any real sense; they are writing pages and pages of text before giving you the answer. That isn't so different from the "ever bigger training data" approach, just applied to the output instead of the input.
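To make the "applied to output" point concrete, one common way that extra output gets spent is sampling several long reasoning traces and voting over the final answers (self-consistency). A minimal sketch, with a purely hypothetical `generate` function standing in for whatever LLM API you'd actually call:

    from collections import Counter

    def answer_with_more_output(generate, question: str, n_samples: int = 16) -> str:
        """Spend more compute on output instead of input: sample several long
        reasoning traces for the same question and majority-vote the answers.
        `generate` is a hypothetical interface returning (reasoning_text, final_answer),
        not any real library's API."""
        finals = []
        for _ in range(n_samples):
            _trace, final = generate(
                question,
                temperature=0.8,   # diversity between traces
                max_tokens=4096,   # room for the "pages and pages of text"
            )
            finals.append(final)
        return Counter(finals).most_common(1)[0][0]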
I'm not betting any money here - extrapolation is always hard. But just drawing a mental line from here that tapers to somewhere below one's own abilities - I'm not seeing a lot of justification for that either.
Here someone just claimed that it is "entirely clear" LLMs will become super-human, without any evidence.
https://en.wikipedia.org/wiki/Extraordinary_claims_require_e...