I've taught undergraduates and graduates how to code. I've contributed to Open Source projects. I'm a researcher and write research code with other people who write research code.
You could say I've seen A LOT of poorly written, human-generated code.
Yet, I still trust it more. Why? Well, one of the big reasons is exactly what we're joking about: I can trust a human to iterate. Lack of iteration would be fine if everything were containerized and code operated in an unchanging environment[0]. But in the real world, code needs to be iterated on, constantly. Good code doesn't exist. If it does exist, it doesn't stay good for long.
Another major problem is that AI generates code that optimizes for human preference, not correctness. Even the terrible students who were just doing enough to scrape by weren't trying to mask mistakes[1], but were still optimizing for correctness, even if it was the bare minimum. I can still walk through that code with the human and we can figure out what went wrong. I can ask the human about the code and I can tell a lot by their explanation, even if they make mistakes[2]. I can't trust the AI to give an accurate account of even its own code because it doesn't actually understand. Even the dumb human has a much larger context window. They can see all the code. They can actually talk to me and try to figure out the intent. They will challenge me if I'm wrong! And for the love of god, I'm going to throw them out if they are just constantly showering me with praise and telling me how much of a genius I am. I don't want to work with someone who makes me feel like, at any moment, they're going to start trying to sell me a used car.
There are a lot of reasons, more than I list here. Do I still prompt LLMs and use them while I write code? Of course. Do I trust them to write code? Fuck no. I know it isn't trivial to see that middle ground if all you do is vibe code or hate writing code so much you just want to outsource it, but there's a lot of room here between having some assistant and having AI write the code. Like the OP suggests, someone has got to write that last 10-20%, and that's the part that eats the time. That doesn't mean I've saved 80% of my time; I've maybe saved 20%. Pareto is a bitch.
[0] Ever hear of "code rot"?
[1] Well... I'd rightfully dock points if they wrote obfuscated code...
[2] A critical skill of an expert in any subject is the ability to identify other experts. https://xkcd.com/451/
Engineers use Claude Code for rapid prototyping by enabling "auto-accept mode" (shift+tab) and setting up autonomous loops in which Claude writes code, runs tests, and iterates continuously.
The tool rapidly prototypes features and iterates on ideas without getting bogged down in implementation details.
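To make that loop concrete, here's a minimal sketch under a pile of assumptions: the `agent` command is a placeholder for however you invoke the coding agent non-interactively with auto-accept, and the prompt, test command, and retry cap are invented for illustration.

```python
import subprocess

def sh(cmd: str) -> int:
    """Run a shell command and return its exit code."""
    return subprocess.run(cmd, shell=True).returncode

PROMPT = "Implement the feature described in TODO.md and make the tests pass."

# Checkpoint the repo so a bad run can simply be thrown away.
sh("git add -A && git commit --allow-empty -m 'checkpoint before agent run'")

for _ in range(3):  # cap how many times we pull the lever
    # 'agent' is a placeholder for a non-interactive agent invocation
    # (e.g. a coding CLI with auto-accept enabled).
    sh(f"agent {PROMPT!r}")
    if sh("pytest -q") == 0:
        # Tests pass: keep the result.
        sh("git add -A && git commit -m 'accept agent attempt'")
        break
    # Tests fail: discard everything and try again from the checkpoint.
    sh("git reset --hard && git clean -fd")
```

The checkpoint commit is what makes "either accept the result or start fresh" cheap: nothing from a failed attempt survives into the next one.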
Don't cherry-pick; act in good faith. I know you can also read the top comment I suggested.
I know it's a long article and the top comment is hard to find, so allow me to help:
> Treat it like a slot machine
>
> Save your state before letting Claude work, let it run for 30 minutes, then either accept the result or start fresh rather than trying to wrestle with corrections. ***Starting over often has a higher success rate than trying to fix Claude's mistakes.***
*YOU* might be able to iterate well with Claude, but I really don't think a slot machine is consistent with the type of iteration we're discussing here. You can figure out what things mean in context, or you can keep intentionally misinterpreting. At least the LLM isn't intentionally misinterpreting.
That’s actually an old workflow. Nowadays you spin up a thin container and let it go wild. If it messes up, you simply destroy the container, roll back the git history, and try again.
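As a rough sketch of that container-plus-rollback loop (the image name, the inner agent command, and the test step are all placeholders, not a real setup):

```python
import subprocess

def sh(cmd: str) -> int:
    """Run a shell command and return its exit code."""
    return subprocess.run(cmd, shell=True).returncode

# Snapshot the repo so the whole run can be undone.
sh("git add -A && git commit --allow-empty -m 'pre-agent snapshot'")

# Let the agent go wild inside a disposable container with the repo mounted.
# '--rm' destroys the container the moment it exits; 'agent-image' and the
# inner 'agent ...' command stand in for whatever you actually run.
sh('docker run --rm -v "$PWD":/work -w /work agent-image '
   'agent "refactor the parser and keep the tests green"')

# If it made a mess, undo the git history and try again.
if sh("pytest -q") != 0:
    sh("git reset --hard && git clean -fd")
```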
LLMs only work in one direction: they only ever produce the next token. They can't go back and edit. They would need to be able to backtrack and edit in place somehow.
Plus, the entire session/task history goes into every LLM prompt, not just the last message. So for every turn of the loop the LLM has the entire context with everything that previously happened in it, along with added "memories" and instructions.
That's different than seeing whether its current output made a mistake or not. It's not editing in place. You're just rolling the dice again with a different prompt.
Think of it like an append-only journal. To correct an entry, you add a new one with the correction. The LLM sees the mistake and the correction. That's no worse than mutating the history.
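Roughly, the journal the model gets handed looks like this; `call_llm` below is a stand-in for a real chat-completion client, but the shape of the message list is the point:

```python
def call_llm(history):
    """Stand-in for a real chat-completion call. A real client would send
    the *entire* history list every time and return the next message."""
    return "...model output..."

messages = [
    {"role": "user", "content": "Write a function that parses the config file."},
]

reply = call_llm(messages)            # suppose this version has a bug
messages.append({"role": "assistant", "content": reply})

# The journal correction: append the complaint, don't rewrite the old entry.
messages.append({"role": "user", "content": "That crashes on empty files; fix it."})

reply = call_llm(messages)            # the full history, mistake plus correction,
messages.append({"role": "assistant", "content": reply})  # goes in again
```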
You put some more information into its context window, then roll the dice again. And it produces text again, token by token. It's still not planning ahead and it's not looking back at what was just generated. There's no guarantee everything stays the same except the mistake. This is different than editing in place. You are rolling the dice again.
It is, though. The LLM gets the full history in every prompt until you start a new session. That's why it gets slower as the conversation/context gets big.
The developer could choose to rewrite or edit the history before sending it back to the LLM, but the user typically can't.
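If it helps, that developer-side rewrite could be as simple as this hypothetical trim applied before the next call; none of this is a real library API:

```python
def compact(history, keep_last=20):
    """Hypothetical developer-side edit: keep the system prompt plus only
    the most recent turns before resending the history to the model."""
    system, rest = history[:1], history[1:]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a coding assistant."}]
# ...many appended user/assistant turns later...
trimmed = compact(history)  # this edited view is what the next call would send
```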
> There's no guarantee everything stays the same except the mistake
Sure, but there's no guarantee about anything it will generate. But that's a separate issue.