@dickfickling beat me to it, but ultrathink is already explicitly called out in the public Anthropic documentation:
"Ask Claude to make a plan for how to approach a specific problem. We recommend using the word "think" to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: "think" < "think hard" < "think harder" < "ultrathink." Each level allocates progressively more thinking budget for Claude to use."
I don't know what the max allowable "budget_tokens" is for Claude 3.7 Thinking mode, but the SDK shows an example of 32k which matches up with the article's findings.
Looks like that documentation is incorrect. It suggests there are four levels - "think" < "think hard" < "think harder" < "ultrathink." - but if you look in the code there are actually only three.
Sincerely, I respect your response to how arbitrary it seems in this form.
But... I'd like you to take a moment and think really hard about whether this is truly novel behavior for LLMs, or rather something that has always been part of the interplay between inter-agent communication and intra-agent thought :)
It would be cool if these "secret keywords" were more directly exposed in the UI somehow, perhaps as a toggleable developer/experimental mode? I would have a lot of fun tinkering with them.
It's for Claude Code FWIW, just leaving a sigil here for fellow API implementers who are confused: your general point stands (though I wonder about UI affordances other than text given it's a CLI tool)
Crazy that it's a key word that's implemented in the code that expands the context window, and that a light touch of reverse engineering was required to find it.
"Ask Claude to make a plan for how to approach a specific problem. We recommend using the word "think" to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: "think" < "think hard" < "think harder" < "ultrathink." Each level allocates progressively more thinking budget for Claude to use."
https://www.anthropic.com/engineering/claude-code-best-pract...
I don't know what the max allowable "budget_tokens" is for Claude 3.7 Thinking mode, but the SDK shows an example of 32k which matches up with the article's findings.
reply