
Perhaps. Then how do you handle the computer being confidently wrong a large proportion of the time? In my experience, it's inaccurate in proportion to the significance of the task, so by the time it's writing real code it's more wrong than right. How can you turn that into something useful? I don't think the system around us is set up to handle such an unreliable agent. I don't want things in my life to be less reliable, I want them to be more reliable.

(Also if you exist in an ecosystem where being confidently wrong 70% of the time is acceptable, that's kinda suspect and I'll return to the argument of "useless jobs")




Filters. If you can come up with a problem where incorrect solutions can be filtered out, and you accept that LLM outputs are closer to a correct answer than a random string, then LLMs are a way to get to a correct answer faster than was previously possible, for a whole class of problems we didn't have answer generators for before.
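
A minimal sketch of that generate-and-filter idea, assuming you have some LLM call and a domain-specific verifier (the generate and is_correct names here are hypothetical stand-ins, not any particular API):

    def generate_and_filter(prompt, generate, is_correct, max_attempts=20):
        # generate: calls the LLM and returns one candidate answer (hypothetical)
        # is_correct: domain-specific filter, e.g. a test suite or type checker
        for _ in range(max_attempts):
            candidate = generate(prompt)
            if is_correct(candidate):
                return candidate
        return None  # no candidate survived the filter

The whole argument rests on the filter being cheap and reliable for your problem; the LLM only supplies candidates.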

And that's just the theory; in practice, LLMs are orders of magnitude closer to generating correct answers than anything we previously had.

And then there's the meta aspect: they can also act as filters themselves. What becomes possible if you can come up with filters for almost any problem a human can filter for, even if those filters have a chance of being incorrect? The possibilities are impossible to predict, but to me they are very exciting/worrying. LLMs really have expanded the realm of what it is possible to do with a computer. And in a much more useful domain than fintech.


As long as it’s right more often than random chance, it’s potentially useful - you just have to iterate enough times to reach your desired level of statistical certainty.
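
A rough sketch of the statistics behind that, under the (big) assumptions that attempts are independent and that you have a verifier that reliably rejects wrong answers:

    import math

    def attempts_needed(p, target=0.99):
        # Independent tries needed so that P(at least one correct) >= target,
        # i.e. 1 - (1 - p)**n >= target, assuming the verifier catches every wrong answer.
        return math.ceil(math.log(1 - target) / math.log(1 - p))

    attempts_needed(0.3)   # ~13 tries at 30% per-attempt accuracy
    attempts_needed(0.7)   # ~4 tries at 70%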

If you take the current trend in the cost of inference and assume it continues for even a few more cycles, then we already have sufficient accuracy in current models to more than satisfy the hype.


I'm not following the statistical argument.

Firstly, something has to verify the work is correct, right? Assuming you have a robust way to do this (even with human-written code it's challenging!), at some point the accuracy is so low that it's faster to create the thing manually than to verify many attempts - a problem I frequently run into with LLM autocomplete and small-scale features.
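
To put rough, purely illustrative numbers on that break-even: if each attempt is correct with probability p and each candidate costs v minutes to review, you expect to spend about v / p minutes of verification before one passes. At p = 0.3 and 10 minutes per review, that's already over half an hour of checking per accepted change, which can easily exceed the time to just write it yourself.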

Second, on certain topics the LLM is biased towards the wrong answer, and it's further biased by its own previous wrong reasoning if it's building on its earlier output. It becomes less likely that the LLM will choose the right method. Without strong guidance it will iterate itself into garbage, as we see with vibe-coding shenanigans. How would you iterate on an entire application created by an LLM, if any individual step it takes is likely to be wrong?

Third, I reckon it's just plain inefficient to iterate many times to get something we humans could've gotten correct in 1 or 2 tries. Many people seem to forget the environmental impact of running AI models. Personally I think we need to be doing less of everything, not producing more stuff at an increasing rate (even if the underlying technology gets incrementally more efficient).

Now maybe these things get solved by future models, in which case I will be more excited then, and only then. It does seem like an open question whether this technology will keep scaling to where we hope it will.



