Well, isn't it like Yolo mode from Claude Code that we've been using, without worry, locally for months now? I truly think that Yolo mode is absolutely fantastic, while dangerous, and I can't wait to see what the future holds there.
I run it from within a dev container. I never had issues with yolo mode before, but if it somehow decided to use the gcloud command (for instance) and affected the production stack, it’s my ass on the line.
If the code can call a method that provides the API key, what would stop the LLM from calling the same code? How do you propose to let an LLM run tests that execute code that requires API without the LLM also being able to grab the key?