My understanding is that it comes down to context size. Companies like Cursor are trying to minimize the amount of context sent to the models to keep their own costs down. Claude Code sends a lot more context with every request, and that seems to make the difference.
I had the opposite experience. Not with Opus (too expensive), but with Sonnet. I got things done way more efficiently using Sonnet with Roo than with Claude Code.
same. i ran a few tests ($100 worth of api calls) with opus 4 and didn’t see any difference compared to sonnet 4 other than the price.
also no idea why he thinks Roo is handicapped when Claude Code nerfs the thinking output and requires typing "think" / "think hard" / "think harder" / "ultrathink" just to expand the max thinking tokens... and even "ultrathink" only sets it to 32k, while the max in Roo is 51200 and it's just a setting.
I think you misread my comment. I wasn't asking for help. I get consistently good output from Sonnet 4 using RooCode, without needing Gemini for planning.
Edit: I think I know where our miscommunication is happening...
The "think"/"ultrathink" series of magic words are a claudecode specific feature used to control the max thinking tokens in the request. For example, in claude code, saying "ultrathink" sets the max thinking tokens to 32k.
On other clients these keywords do nothing. In Roo, max thinking tokens is a setting. You can just set it to 32k, and then that's the same as saying "ultrathink" in every prompt in Claude Code. But in Roo, I can also set up different settings profiles to use for each mode (with different max thinking token settings), configure the mode prompt, system prompt, etc. No magic keywords needed... and you have full control over the request.
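Under the hood it's just the extended-thinking budget on the API request. Here's a rough sketch of what that looks like with the Anthropic Python SDK; the model id, token numbers, and prompt are placeholders I picked for illustration, not anything from Roo or Claude Code itself:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Roughly what "ultrathink" does under the hood: set the thinking budget on
# the request. 32000 mirrors Claude Code's ultrathink cap; a client like Roo
# just exposes this number as a setting you can change.
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model id
    max_tokens=40000,                  # must be larger than the thinking budget
    thinking={
        "type": "enabled",
        "budget_tokens": 32000,        # max thinking tokens for this request
    },
    messages=[{"role": "user", "content": "Refactor this function..."}],
)

# Thinking blocks come back alongside the normal text blocks
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```

Point being: it's a plain request parameter, so any client that lets you set it directly gives you the same (or more) headroom as the magic keywords.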
i found them all disappointing in their own ways. At least the DeepSeek models actually listen to what i say instead of ignoring me and doing their own thing like a toddler.