I disagree with the comparison between LLM behavior and traditional software get...

jjani · 2025-06-17T18:49:01 1750186141

> When regular software declines in quality, it’s usually noticeable through UI changes, release notes, or other signals.

Counterexample: 99% of average Joes have no idea how incredibly enshittified Google Maps has become, to name just one app. These companies intentionally boil the frog very slowly, and most people are incredibly bad at noticing gradual changes (see global warming).

Sure, they could know by comparing, but you could also know whether models are changing behind the scenes by having sets of evals.
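
To make "sets of evals" concrete, here is a rough sketch of what I mean: keep a fixed set of prompts with known-good answers, record the pass rate once, and re-run it periodically. Everything named here (`EVAL_SET`, `pass_rate`, `has_regressed`, the two stand-in model functions, the 5% tolerance) is made up for illustration. A real setup would call an actual API and use far more prompts and better scoring than a substring check.

```python
from typing import Callable, List, Tuple

# Hypothetical fixed eval set: (prompt, substring expected in a good answer).
EVAL_SET: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("Which keyword defines a function in Python?", "def"),
]


def pass_rate(model: Callable[[str], str],
              evals: List[Tuple[str, str]] = EVAL_SET) -> float:
    """Fraction of prompts whose answer contains the expected substring."""
    passed = sum(1 for prompt, expected in evals if expected in model(prompt))
    return passed / len(evals)


def has_regressed(baseline: float, current: float,
                  tolerance: float = 0.05) -> bool:
    """True if the pass rate dropped by more than `tolerance` (assumed value)."""
    return (baseline - current) > tolerance


# Stand-ins for real model calls; both are hypothetical canned responders.
def model_last_month(prompt: str) -> str:
    answers = {
        "What is 2 + 2?": "2 + 2 = 4",
        "What is the capital of France?": "The capital is Paris.",
        "Which keyword defines a function in Python?": "Use def.",
    }
    return answers.get(prompt, "")


def model_today(prompt: str) -> str:
    answers = {
        "What is 2 + 2?": "2 + 2 = 4",
        "What is the capital of France?": "The capital is Paris.",
        "Which keyword defines a function in Python?": "I'm not sure.",
    }
    return answers.get(prompt, "")


baseline = pass_rate(model_last_month)
current = pass_rate(model_today)
print(f"baseline={baseline:.2f} current={current:.2f} "
      f"regressed={has_regressed(baseline, current)}")
```

The point is only that the comparison is mechanical once the baseline is written down, so you are not relying on a gut feeling that responses "seem off".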

theturtletalks · 2025-06-17T19:02:30 1750186950

This is where switching costs matter. Take Google Maps: many people can’t switch to another app. In some areas, it’s the only app with accurate data, so Google can degrade the experience without losing users.

We can tell it’s getting worse because of UI changes, slower load times, and more ads. The signs are visible.

With LLMs, it’s different. There are no clear cues when quality drops. If responses seem off, users often blame their own prompts. That makes it easier for companies to quietly lower performance.

That said, many of us on HN use LLMs mainly for coding, so we can tell when things get worse.

Both cases involve the “boiling frog” effect, but with LLMs, users can easily jump to another pot. With traditional software, switching is much harder.