I think we still need an LLM to enable the system as a whole to understand vague and half-baked human input.
I can easily ask an LLM to write me a function in a random programming language, then feed the output to a compiler, and pipe errors from the compiler back to the LLM.
What doesn't work so well is typing "pong in java" into a bash shell.
This isn't a perfect solution (not even for small projects), but it does demonstrate that automated validation can improve the output.
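A minimal sketch of that compile-and-retry loop in Python, assuming a hypothetical ask_llm() helper wired up to whatever model API you use, with javac standing in for the compiler:

```python
import re
import subprocess
import sys

def ask_llm(prompt: str) -> str:
    """Stand-in for whatever LLM API you use (hypothetical helper);
    it should return the model's raw text response."""
    raise NotImplementedError("wire this up to your LLM of choice")

def extract_java(response: str) -> str:
    """Pull the first fenced code block out of the response, if any."""
    match = re.search(r"```(?:java)?\n(.*?)```", response, re.DOTALL)
    return match.group(1) if match else response

def compile_loop(task: str, max_rounds: int = 5) -> str:
    prompt = f"Write a single Java class named Pong that implements: {task}"
    for _ in range(max_rounds):
        code = extract_java(ask_llm(prompt))
        with open("Pong.java", "w") as f:
            f.write(code)
        # Feed the candidate to javac and capture any compiler diagnostics.
        result = subprocess.run(
            ["javac", "Pong.java"], capture_output=True, text=True
        )
        if result.returncode == 0:
            return code  # compiles cleanly, stop here
        # Pipe the compiler errors back to the model and try again.
        prompt = (
            f"This Java code:\n{code}\n"
            f"failed to compile with these errors:\n{result.stderr}\n"
            "Please return a corrected version."
        )
    sys.exit("gave up after too many rounds")
```

This only checks that the code compiles; it says nothing about whether the program actually behaves like Pong, which is where tests or a human still come in.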
This is what ChatGPT's Code Interpreter does (it writes code in Python and then runs it to check for errors). I'm not sure if it's enabled for everyone yet, though.