I switched back to Sonnet 4.5 or Opus yesterday since 4.6 was so slow and often “over thinking” or “over analyzing” the problem space. Tasks which reliably took under a minute in Sonnet 4.5 were still running after 5 minutes in 4.6 (yeah, I had them race on a few tasks).
Some of this could be system overload, I suppose.
Edit ~/.claude/settings.json and add "effortLevel": "medium". Alternatively, you can put it in .claude/settings.json in a project if you want to try it out first.
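For reference, the file would end up looking something like this (a minimal sketch assuming no other settings; if the file already has keys, just add this one alongside them):

```json
{
  "effortLevel": "medium"
}
```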
They recommend this in the announcement[1], but the way they suggest doing it is via a bogus /effort command that doesn't exist. See [2] for full details about thinking effort. It also recommends a bogus way to change effort by using the arrow keys when selecting a model, so don't use that either.
Good to know it works for some people! I think it's another case where they focus too much on MacOS and neglect the Windows and Linux releases. I use WSL for Claude Code since the Windows release is far worse and currently unusable due to several neglected issues.
Hoping to see several missing features land in the Linux release soon.
I'm also feeling weak and the pull of getting a Mac is stronger. But I also really don't like the neglect around being cross-platform. It's "cross-platform" except a bunch of crap doesn't work outside MacOS. This applies to Claude Code, Claude Desktop (MacOS and Windows only - no Linux or WSL support), Claude Cowork (MacOS only). OpenAI does the same crap - the new Codex desktop app is MacOS only. And now I'm ranting.
I'm on v2.1.37 and I have it set to auto-update, which it does. I also tend to run `claude update` when I see a new release thread on Twitter, and usually it has already updated itself.
Yep, and their documentation AI assistant will egregiously hallucinate whatever it thinks you want to hear, then repeat itself in a loop when you tell it that it's wrong.
Yesterday I asked a question about a Claude Code setting inside Claude Code (don't recall which), and their built-in documentation skill, or something like that, ended up doing a web search and found a wrong answer on a third-party site. Later I went to their documentation site and it was right there in the docs. Wonder why they can't bundle an AI-friendly version of their own docs (can't be more than a few hundred KB compressed?) inside their 174MB executable.
It's insane that they concluded the built-in introspection skill for Claude documentation should do a web search instead of simply packing the correct documentation in local files. I had the same experience as you, wasting tokens and my time because their architecture decision doesn't work in practice.
I have to Google the correct Anthropic documentation and pass the link to Claude Code myself, because Claude can't reliably do the same in order to learn how to use its own features.
They used to? I have a distinct memory of it doing exactly that a few months ago. Maybe it got dropped in the mad dash that passes for CC sprint cycles.
Pathetic how they have no support for modifying sampling settings, or even a "logit_bias" so I can ban my Claude from using the em dash (and regular dash), semicolons, or "not". I'd also upweight things like exclamation points.
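For comparison, here is roughly what that knob looks like on an API that does expose it (OpenAI's chat completions endpoint). The token ID below is a placeholder, since real IDs depend on the model's tokenizer:

```ruby
require 'net/http'
require 'json'

# Sketch of OpenAI's logit_bias parameter; Anthropic's API has no
# equivalent. Values range from -100 (effectively bans a token) to
# 100 (strongly upweights it).
uri = URI('https://api.openai.com/v1/chat/completions')
req = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json',
                               'Authorization' => "Bearer #{ENV['OPENAI_API_KEY']}")
req.body = {
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Rewrite this paragraph.' }],
  logit_bias: { '2001' => -100 } # placeholder token ID, e.g. for the em dash
}.to_json

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
puts JSON.parse(res.body).dig('choices', 0, 'message', 'content')
```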
Clearly, those whose job it is to "monitor" folks use this as their "tell" for whether someone AI-generated something. That's why every major LLM has this particular slop profile. It's infuriating.
Yeah, nothing is sped up. Their initial deployment of 4.6 is so unbearably slow that they're now offering you the opportunity to pay more for the same experience as 4.5. What's the word for that?
Remember having to write detailed specs before coding? Then folks realized it was faster and easier to skip the specs and write the code. So now are we back to where we were?
One of the problems with writing a detailed spec is that it presumes you understand the problem, but often the problem is not yet understood: you come to understand it through coding and testing.
Skip the specs, though, and you often ended up writing the wrong program, at substantial cost.
The main difference now is that the parrots have reduced the cost of the wrong program to near zero, thereby eliminating much of the perceived value of a spec.
We're not "thinking with portals" about these things enough yet. Typically we'd want a detailed spec beforehand, as coding is expensive and time consuming, so we want to make sure we're coding the right thing. With AI, though, coding is cheap. So let AI skip the spec and write the code badly. Then have it review the solution, build understanding, design a spec for a better solution, and have it write it again. Rinse and repeat as many times as you need.
It's also nothing new; it's basically Joe Armstrong's programming method. It's just that, for the first time in history, it's not prohibitively expensive.
Look up Sean Bell - not a stop and frisk, just an open fire.
Once, my wife and I were stopped, but not frisked, and cited for riding bikes on a sidewalk at 2AM on a stretch of Atlantic Ave that would kill you to ride on. It made no sense until I found out that my neighbor and his friend had been murdered at a street party. There was a dragnet out trying to find the killer, and they stopped anyone for anything.
This is our experience. We have added Sorbet to a 16-year-old Rails app. It is a big win: fewer errors and typos, better documentation and code completion, fewer tests required, etc.
And the LLMs take advantage of the types through the LSP and type checking.
One of the big advantages of types is documenting what is *not* allowed. This brings clarity to the developers and additionally ensures that what is not allowed does not happen.
Unit tests typically test for behaviours. These can be both positive and negative tests, but we often test only a subset of possibilities, just because of how people generally think (more positive cases than negative cases). Theoretically we could do all those tests with unit testing. But we need to ask ourselves honestly: do we have the kind of test coverage SQLite has? If yes, do we have it for very large codebases?
We have some tests that ensure the interface is correct - that the correct types of args are passed, say from a batch process to a mailer, and that a mail object is returned.
For these tests we don't care about the content, only that nothing got incorrectly set and that the mailer interface didn't change.
Now if a developer changes the mailer to require a user object, the type checker tells us there is an error. Sorbet will say "hey, you need to update your code here and here by adding a User object".
Before, we would have had test coverage for that - or maybe not, and missed the error.
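A minimal sketch of the kind of signature being described (class and method names here are hypothetical):

```ruby
# typed: true
require 'sorbet-runtime'

class User; end        # stand-ins for the real app classes
class MailMessage; end

class ReportMailer
  extend T::Sig

  # If a batch job calls this without a User, `srb tc` flags every
  # call site; no unit test is needed for that failure mode.
  sig { params(user: User, subject: String).returns(MailMessage) }
  def self.weekly_report(user, subject)
    MailMessage.new # ...build and return the mail object here
  end
end

# ReportMailer.weekly_report(nil, 'Weekly') # srb tc: expected User, got NilClass
```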
First one that pops to mind is some old Python code: the parameter that came in on some functions could be a single string or a list of them. Lots of bugs where arg[0] was a character rather than a string. So tests had to be written showing both being passed in.
The author said he had the assets and gave them to Claude. It would be obvious if he had one large image for all the planets instead of individual ones.
It seems to be gone from the repo, and doesn't seem to be worked on any more? A shame.
AOT compiling Ruby is hard. I'm trying [1] [2].
Sorbet would be in a good position, because part of the challenge of making Ruby fast is that it has a lot of rarely used semantics that make compiled Ruby really hard to speed up. E.g. bignum promotion adds overhead to every single arithmetic operation unless you can prove invariants about the range of the values; the metaprogramming likewise adds overhead and makes even very basic operations really expensive unless you can prove classes (or individual objects) aren't being mucked with...
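To make the bignum point concrete (assuming 64-bit MRI):

```ruby
# On 64-bit MRI, 2**62 - 1 is the largest value that fits in a tagged
# fixnum. In C the next add would overflow; Ruby silently promotes to
# a heap-allocated bignum and keeps the exact answer, which is why a
# compiler must guard every arithmetic op unless it can prove range
# invariants about the operands.
x = 2**62 - 1
puts x + 1 # => 4611686018427387904
```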
So starting with type checking is an interesting approach to potentially allow for compiling guarded type-specific fast paths. If my compiler ever gets close enough to feature complete (it's a hobby project, so depends entirely on how much time I get, though now I also justify more time for it by using it as a test-bed for LLM tooling), it's certainly a direction I'd love to eventually explore.
[2] https://github.com/vidarh/writing-a-compiler-in-ruby/ (updated now primarily by Claude Code; it's currently focusing on actually passing RubySpec and making speedy progress, at the cost of allowing some fairly ugly code - I do cleanup passes occasionally, but most of the cleanup will be deferred until more passes)
Something the type system should do is "make impossible states impossible", as Evan Czaplicki said (maybe others too).
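In Sorbet terms, a minimal sketch of that idea (the order-status domain is made up):

```ruby
# typed: true
require 'sorbet-runtime'

# With a T::Enum there is simply no way to construct a status outside
# these three values, so the "impossible state" can't occur.
class OrderStatus < T::Enum
  enums do
    Pending   = new
    Shipped   = new
    Delivered = new
  end
end

OrderStatus.deserialize('cancelled') # raises KeyError instead of
                                     # letting a bogus state through
```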
We have started to use typed HTML templates in Ruby using Sorbet. It definitely prevents some production bugs (our old HAML templates would have `nil` errors when first going into production).
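We don't know which template library they use, so here's a hypothetical sketch of the general idea: once the input is typed as nilable, the checker forces the rendering code to handle the nil branch instead of blowing up in production.

```ruby
# typed: true
require 'sorbet-runtime'

class User
  extend T::Sig
  sig { returns(String) }
  def name; 'Ada'; end
end

class ProfileCard
  extend T::Sig

  # Calling user.name without the nil check makes `srb tc` report
  # "Method `name` does not exist on `NilClass`".
  sig { params(user: T.nilable(User)).returns(String) }
  def self.render(user)
    return "<div class='card'>guest</div>" if user.nil?
    "<div class='card'>#{user.name}</div>"
  end
end

puts ProfileCard.render(nil)      # guest card, no NilClass error
puts ProfileCard.render(User.new) # Ada's card
```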