Hacker News

> I expect the models will continue improving though,

How? They've already been trained on all the code in the world at this point, so that's a dead end.

The only other option I see is increasing the context window, which is already showing diminishing returns (doubling the window for a 10% gain in accuracy, for example).

We're in a local maximum here.



This makes no sense. Claude 3.7 Sonnet is better than Claude 3.5 Sonnet and it’s not because it’s trained on more of the world’s code. The models are improving in a variety of ways, whether by being larger, faster, using the same number of parameters more effectively, better RLHF techniques, better inference-time compute techniques, etc.


> The models are improving in a variety of ways, whether by being larger, faster, using the same number of parameters more effectively, better RLHF techniques, better inference-time compute techniques, etc.

I didn't say they weren't improving.

I said there's diminishing returns.

There's been more effort put into LLMs in the last two years than in the two years prior, but the gains in the last two years have been much, much smaller than the gains in the two years before that.

That's what I meant by diminishing returns: the gains we see are not proportional to the effort invested.


You said we're in a local maximum. Your comment was at odds with itself.


One way is mentioned in the article: expanding and improving MCP integrations, i.e. giving the models tools to work more effectively within their limitations, on problems in the context of the full system.
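A rough sketch of the idea: an MCP-style server advertises tools to the model as JSON schemas, and the model issues structured calls against them instead of guessing at the codebase from its context window alone. Everything below (the `run_tests` tool, the schema fields, the stubbed dispatcher) is illustrative, not a real SDK's API.

```python
import json

# Hypothetical tool catalog an MCP-style server might advertise.
# The model sees these schemas and can request structured calls.
TOOLS = {
    "run_tests": {
        "description": "Run the project's test suite and return a summary.",
        "inputSchema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }
}

def list_tools() -> str:
    """Return the tool catalog the model sees when it connects."""
    return json.dumps(TOOLS, indent=2)

def call_tool(name: str, arguments: dict) -> dict:
    """Dispatch a tool call requested by the model (result is stubbed here)."""
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    # A real server would actually execute the tool; this sketch stubs it.
    return {"tool": name, "arguments": arguments, "result": "3 passed, 0 failed"}
```

So instead of pasting the whole repo into context, the model asks for `call_tool("run_tests", {"path": "tests/"})` and works from the result.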



