ChatGPT is good for asking questions about languages, SDKs, and APIs, or for generating boilerplate, but it's useless if you want to give an AI a ticket and have it raise PRs for you.
This is where agentic solutions like Codex are far more useful, because they actually have access to your codebase and a dev environment where they can test and debug changes.
They still do really dumb things, but a lot of this can be avoided if you prompt well and give them the right types of problems to solve.
In my experience there's currently a sweet spot where these agentic coding platforms shine: semi-complicated tasks. Assuming you prompt well, they can generate 90% of the code you need, and you spend the remaining 10% of the effort fixing it up before it's ready for prod.
For tasks that are too simple (a few lines), it's a waste of time: you spend longer prompting and going back and forth with the agent than it would take to just make the change yourself.
At the other extreme, coding agents really struggle with very complicated tasks, especially ones that require some thought around architecture and performance. It's less that they can't do it, and more that for certain problems simply meeting the acceptance criteria (ACs) is far less important than how the ACs are met. Ideally you want to get the architecture right first yourself, then once that's in place you can break the remaining work down into pieces for the AI to pick up.