Absolutely stellar for 0-to-1 frontend tasks, and less so but still quite useful for isolated backend features. For larger changes, smaller changes in large or more interconnected codebases, refactors, test-run-fix loops, and similar, it has unfortunately provided mostly negative value for me. I keep wondering if it's a me problem. It would probably do much better if I wrote very lengthy prompts to micromanage the little details, but I've found that to be a surprisingly draining activity, so I prefer to give it a shot with a more generic prompt and then either let it run or give up, depending on which direction it takes.