There are also mathematically excellent reasons why that happened.
Self-driving cars are an impossibly complex problem.
Statistics are statistics.
Predicting the minority class correctly 99% of the time isn't good enough for autonomous driving. A car has to brake for little Suzie 100% of the time.
However, generating 1,000 lines of code for a CRUD app? That's 99% bug free?
That's a helluva lot better than I can do.
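To make the class-imbalance point concrete, here's a toy back-of-envelope sketch. Every number in it is invented purely for illustration, not a real-world figure:

```python
# Toy illustration of why 99% recall on a rare (minority) class
# isn't enough for safety-critical systems.
# All numbers below are made-up assumptions for the example.

minority_recall = 0.99          # detector catches 99% of true events
events_per_car_per_year = 200   # assumed pedestrian-braking events per car
fleet_size = 1_000_000          # assumed number of cars on the road

missed_per_year = fleet_size * events_per_car_per_year * (1 - minority_recall)
print(f"Expected missed braking events per year: {missed_per_year:,.0f}")
# With these invented assumptions: 2,000,000 misses per year across the fleet.
```

The same 1% error rate that's tolerable in a CRUD endpoint compounds into a catastrophe when the rare class is "a pedestrian in the road."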
As with all things, the solution is to watch what the domain experts do.
The equivalent is closer to a CRUD app that serves 99% of requests correctly, which is nowhere near good enough to use.
But even if we do go with 99% bug free for the sake of argument, the usefulness depends on the type of bug. How harmful is it? How easy is it to detect?
I had my wife (a physician) ask ChatGPT medical questions and it was almost always subtly but dangerously and confidently wrong. It looked fine to me but it took an expert to spot the flaws. And frequently it required specialist knowledge that a physician outside of my wife’s specialty wouldn’t even know to find the problems.
If you need a senior engineer to read and understand every line of code this thing spits out, I don’t see it as providing more than advanced autocomplete in real-world use (which, to be fair, could be quite helpful).
It frequently takes more time to read and really comprehend a junior engineer’s PR than it would have taken to just do it myself. The only reason I don’t is mentoring.