Hacker News | benterix's comments

> This is a reminder that benchmarks are meaningless – you should always curate your own out-of-sample benchmarks.

Yeah, I have my own set of tests and the results are a bit unsettling, in the sense that sometimes older models outperform newer ones. Moreover, results change even when officially the model doesn't. This is especially true of Gemini 2.5 Pro, which was performing much better on the same tests several months ago than it does now.


I maintain a set of prompts and scripts for development using Claude Code. They are still all locked to using Sonnet 4 and Opus 4.1, because Sonnet 4.5 is flaming hot garbage. I’ve stopped trusting the benchmarks for anything.

A lot of newer models are geared towards efficiency, and if you add the fact that more efficient models are trained on the output of less efficient (but more accurate) models...

GPT-4/o3 might be the best we will ever have


What makes you think it wouldn't end up in the training set anyway?

If you want a short answer: most people don't.

But a more nuanced answer is: the term "AI" has become almost meaningless, as everything is being marketed as AI, with startups and bigger companies doing it for different reasons. However, if you mean the GenAI subset, then very few people want it, in very specific products, and with certain defined functionality. What is happening now, though, is that everybody and their mum tries to slap it everywhere and see if anything sticks (spoiler: practically nothing does).


What a strange title. X11 is still more popular than Gnome, and stating a wish as if it were a fact doesn't make it so.

Anecdotally, I strongly doubt this is true, although my environment is probably quite biased. I know a ton of people who use Gnome, some who use KDE, and I think roughly all of these people use them with Wayland. The standalone-WM users I know are also mostly on Sway or other Wayland ones. The only real X11 holdouts seem to be people using X11-only DE's, such as Xfce or Cinnamon.

> I think roughly all of these people use them with Wayland.

While we're making unfounded statements based on our own anecdotal experiences: can't speak for Gnome users (very few in my circle), but for KDE and tiling window manager users, it's a lot of X11. Hard to say exactly, but would put it at ≥50% X11.


KDE Plasma defaults to Wayland as of KDE 6, with X11 scheduled for removal whenever 7 is released.

I'm talking actual usage, not defaults, and nothing I said relates to future KDE plans.

Xfce is working on Wayland session support. It works now with some limitations (restrictions on what you can embed in the panel are all that's left, I think).

It has been pretty much possible to use Gnome with X11 (until now).

Personally, I gave up on Wayland years ago. There will always be something I apparently should not have wanted to do in the first place while using Wayland. I would rather use X11 and have much better control.


Such a statement is pointless without any data to back it up.

Do you have a source for that statistic?


Be aware that Debian's xwayland depends on x11-common, so your number here will be the combined total of Xorg and Wayland.

You could try comparing xserver-xorg-core instead, but even then that'll only show you the number of submitters who have it installed, not the number that actually use it. The usual way to get a graphical desktop in Debian (task-desktop) pulls in both Wayland and Xorg, but uses the former by default.

The best estimate would be something like the number of xserver-xorg-core installs less the number of xwayland installs.

Using that method, it looks like there are roughly twice as many GNOME users as pure Xorg users.
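
If anyone wants to reproduce that arithmetic, here is a minimal sketch in Python. It assumes a locally saved copy of the plain-text popcon results (the by_inst listing from popcon.debian.org), whose exact column layout I'm assuming here, and it uses gnome-shell as a stand-in for "GNOME users":

    # Rough estimate of "pure Xorg" installs from Debian popcon data.
    # Assumes a local copy of the by_inst listing, with whitespace-separated
    # columns roughly like: rank, package, inst, vote, ... (adjust if needed).
    def popcon_installs(path):
        installs = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 3 and not line.startswith("#") and parts[2].isdigit():
                    installs[parts[1]] = int(parts[2])  # package -> install count
        return installs

    counts = popcon_installs("by_inst")

    xorg     = counts.get("xserver-xorg-core", 0)  # anyone with the X server installed
    xwayland = counts.get("xwayland", 0)           # Wayland sessions running X clients
    gnome    = counts.get("gnome-shell", 0)        # stand-in for GNOME users

    print("estimated pure-Xorg installs:", xorg - xwayland)
    print("gnome-shell installs:        ", gnome)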


Does Podman have, or plan to implement, something similar? Seems very useful.


OTOH, kudos to them for regretting AI slop (even if they don't want to point out who precisely is regretting). I know some who'd vehemently deny it in spite of the evidence.

They don't regret serving you AI slop, they regret that the "writer" didn't even read their own article and that they got caught because of it.

"We regrets that mistakes were noticed."

With time, I discovered something interesting: for us techies, using container orchestration is about reliability, zero-downtime deployments, limiting blast radius, etc.

But for management, it's completely different. It's all about managing complexity at an organizational level. It's so much easier to think in terms of "Team 1 is in charge of microservice A". And I know from experience that it works decently enough, at least in some orgs with competent management.


It’s not a management thing. I’m an engineer and I think it’s THE main advantage microservices actually provide: they split your code hard and allow a team to actually take ownership of its domain. No crossing domain boundaries, no in-between shared code, etc.

I know: it’s ridiculous to have an architectural barrier for an organizational reason, and the cost of a bad slice multiplies. I still think that in some situations it is better than the gas-station-bathroom effect of shared codebases.


I don't see why it's ridiculous to have an architectural barrier for org reasons. Requiring every component to be behind a network call seems like overkill in nearly all cases, but encapsulating complexity into a library where domain experts can maintain it is how most software gets built. You've got to lock those demons away where they can't affect the rest of the users.

The problem is that a library usually does not provide good enough boundaries. A C library can just shit all over your process memory. A Java library can wreak havoc on your objects with reflection, or just call System.exit(LOL). The minimal boundary that keeps demons at bay is the process boundary, and then you need some way for processes to talk to each other. If you're separating components into processes, it's very natural to put them on different machines, so you need your IPC to be network calls. One more step and you're implementing REST, because infra people love HTTP.

> it's very natural to put them on different machines, so you need your IPC to be network calls

But why is this natural? I’m not saying we shouldn’t have network RPC, but it’s not obvious to me that we should have only network RPC when there are cheap local IPC mechanisms.


Because horizontal scaling is the best scaling method, and moving services to different machines is the easiest way to scale. Of course you can keep them on the same machine until you actually need to scale (maybe forever), but it makes sense to make some architectural decisions early, ones that would not prevent scaling in the future if the need arises.

Premature optimisation is the root of all evil. But premature pessimisation is not a good thing either. You should keep options open, unless you have a good reason not to do so.

If your IPC involves moving gigabytes of transient data between components, maybe it's a good idea to use shared memory. But usually that's not required.
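
For illustration, here is a tiny sketch of that shared-memory case with Python's multiprocessing.shared_memory; the segment name and size are made up, and the "consumer" is shown inline rather than as a genuinely separate process:

    # Pass a large transient buffer between local processes via a named
    # shared-memory segment instead of copying it over a socket (Python 3.8+).
    from multiprocessing import shared_memory

    # producer side: allocate a named segment and write into it
    shm = shared_memory.SharedMemory(name="frame_buf", create=True, size=1024 * 1024)
    shm.buf[:5] = b"hello"           # in real life: gigabytes of transient data

    # consumer side (another process would attach by the same name)
    other = shared_memory.SharedMemory(name="frame_buf")
    print(bytes(other.buf[:5]))      # reads the data without a copy over IPC

    other.close()
    shm.close()
    shm.unlink()                     # free the segment when done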


I'm not sure I see that horizontally scaling necessarily requires a network call between two hosts. If you have an API gateway service, a user auth service, a projects service, and a search service, then some of them will be lightweight enough that they can reasonably run on the same host together. If you deploy the user auth and projects services together then you can horizontally scale the number of hosts they're deployed on without introducing a network call between them.

This is somewhat common in containerisation where e.g. Kubernetes lets you set up sidecars for logging and so on, but I suspect it could go a lot further. Many microservices aren't doing big fan-out calls and don't require much in the way of hardware.


And then we're back to the 1980s UNIX process model, before the wide adoption of dynamic loading, but because we need to be cool we call them microservices.

>Requiring every component to be behind a network call seems like overkill in nearly all cases

That’s what I was referring to, sorry for the inaccurate adjective.

Most people try to split a monolith into domains, move code into libraries, or things like that - but IMO you rarely avoid a shared space importing the subdomains, with blurry/leaky boundaries and with ownership falling between the cracks.

Microservices predispose you better to avoid that shared space, as there is less expectation of an orchestrating common space. But as you say, the cost is ridiculous.

I think there’s an unfilled space for an architectural design that somehow enforces boundaries and avoids common spaces as strongly as microservices do, without the physical separation.


How about old fashioned interprocess communication? You can have separate codebases, written in different languages, with different responsibilities, running on the same computer. Way fewer moving parts than RPC over a network.
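
As a toy illustration of that kind of local IPC, here is a sketch using a Unix domain socket. The socket path and the one-line request format are invented for the example; either side could just as well be written in a different language, since the socket is language-neutral:

    # Two separate processes on the same machine talking over a Unix domain
    # socket: a process boundary with no TCP, HTTP or serialization framework.
    # The socket path and "protocol" below are made up for illustration.
    import os, socket, time

    SOCK = "/tmp/projects.sock"  # hypothetical path

    def serve():
        if os.path.exists(SOCK):
            os.unlink(SOCK)
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(SOCK)
        srv.listen(1)
        conn, _ = srv.accept()
        request = conn.recv(1024).decode()              # e.g. "get_project 42"
        conn.sendall(f"reply to [{request.strip()}]\n".encode())
        conn.close()

    def call(message):
        cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        cli.connect(SOCK)
        cli.sendall(message.encode())
        return cli.recv(1024).decode()

    if __name__ == "__main__":
        if os.fork() == 0:       # child: pretend to be the "projects service"
            serve()
        else:                    # parent: the caller, e.g. an API gateway
            time.sleep(0.2)      # crude wait for the child to bind the socket
            print(call("get_project 42\n"))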

That was the original Amazon motivation, and it makes sense. Conway's law. A hundred developers on a single codebase needs significant discipline.

But that doesn't warrant its use in smaller organizations, or for smaller deployments.


Conway's Law:

Organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations.


And then you have some other group of people that sees all the redundancy and decides to implement a single unified platform on which all the microservices shall be deployed.

Libraries do exist; unfortunately, too many developers apparently never learn about code modularity.

> using container orchestration is about reliability, zero-downtime deployments

I think that's the first time I've heard any "techie" say we use containers because of reliability or zero-downtime deployments. Those feel like they have nothing to do with each other, and we've been building reliable server-side software with zero-downtime deployments since long before containers became the "go-to"; if anything, it was easier before containers.


It would be interesting to hear your story. Mine is that containers in general start an order of magnitude faster than VMs (in general! we can easily find edge cases), and hence e.g. horizontal scaling is faster. You say it was easier before containers; I say k8s, in spite of its complexity, is a huge blessing, as teams can upgrade their own parts independently and do things like canary releases easily, with automated rollbacks etc. It's so much faster than VMs or bare metal (which I still use a lot and don't plan to abandon anytime soon, but I understand their limitations).

In general, my experience is "more moving parts == less reliable", if I were to generalize across two decades of running web services. The most reliable platforms I've helped manage have been those that avoided adding extra complexity until they really couldn't, and when I left, they still deployed applications by copying a built binary to a Linux host, reloading the systemd service, switching the port in the proxy so traffic could hit the new service while health-checking, and, when green, switching over and stopping the old service.

Deploys usually took minutes (unless something was broken), scaling worked the same as with anything else (increase a number and redeploy), and there was no Kubernetes, Docker, or even a container as far as the eye could see.
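
That copy-reload-switch flow is simple enough to script. Here is a rough sketch of the health-check-then-switch part; the unit names, port, and proxy-reload command are placeholders rather than any particular stack's interface:

    # Rough sketch: start the new instance, health-check it, switch the proxy,
    # retire the old instance (or roll back). All names/ports are placeholders.
    import subprocess, time, urllib.request

    NEW_PORT = 8081
    NEW_UNIT, OLD_UNIT = "app-new.service", "app-old.service"

    def healthy(port, path="/healthz", tries=30):
        for _ in range(tries):
            try:
                with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}", timeout=2) as r:
                    if r.status == 200:
                        return True
            except OSError:
                pass                      # not up yet, keep polling
            time.sleep(1)
        return False

    subprocess.run(["systemctl", "restart", NEW_UNIT], check=True)            # start the new binary
    if healthy(NEW_PORT):
        subprocess.run(["systemctl", "reload", "proxy.service"], check=True)  # switch the port
        subprocess.run(["systemctl", "stop", OLD_UNIT], check=True)           # retire the old one
    else:
        subprocess.run(["systemctl", "stop", NEW_UNIT], check=True)           # roll back
        raise SystemExit("new instance never became healthy")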


As soon as there is more than one container to organise, it becomes a management task for said techies.

Then suddenly one realises that techies can also be bad at management.

Management of a container environment not only requires deployment skills but also documentation and communication skills. Suddenly it's not management but rather the techie who can't manage their tech stack.

This pointing of fingers at management is rather repetitive and simplistic but also very common.


True, but organizations are not uniform: one person decided to take a destructive action, and another one might want to fix it.

But I agree, if after receiving such a message someone is saying "let's hop on a call", there's little chance of things going right on the call.


How about:

**

First of all, I'm shocked to learn about what happened. We at Mozilla had no idea that sumobot was wreaking such havoc on your work. Please accept my sincere apologies.

It would be a great pity to have all your precious work undone, so we'll do our best to fix it. I would be very grateful if you could schedule a meeting with me so that I can better understand the issues you described and get them fixed asap.

Once again, I'm very sorry for what happened and I hope in spite of that you can continue doing great work for the Japanese Mozilla community. **


I think Trump is an idiot and almost everything he is doing is a disaster. And the fact that the country is still running in spite of this is thanks to a lot of effort by other people.

However, in this particular case, I do have doubts. Because drug cartels are a huge problem and local governments are often very bad at handling them. Now, I take into consideration that it might be poor Venezuelan fishermen who are being mistaken for drug dealers, but I very much doubt it. It wouldn't make sense for anyone: for Trump, once the truth comes out, for the military personnel doing the strikes, for the reconnaissance teams - it's just nonsensical. And I believe that Trump, even though I don't keep him in high regard, actually is not a fan of killing just for killing. Or, to put it more cynically, he won't win his dream Nobel prize for killing innocent people senselessly. So, maybe, in this one particular case, it could be effective in scaring the cartels into finding other routes.


Venezuela is more known for gold smuggling (and 'trafficking' people who want out) than for drug smuggling.

I bet some environmentalists will argue that gold smuggling is worse than drug trafficking, but still, my bet is that most of those killed were trafficked people and gold smugglers.


Venezuela has been a top 5 drug seizure country for many, many years.

https://en.wikipedia.org/wiki/Illegal_drug_trade_in_Venezuel...


>Because drug cartels are a huge problem and local governments are often very bad at handling them.

True, but the legal precedent this sets is very important. The requirement for sound legal justification is the only leverage the Judicial branch has. Today's Supreme Court may be too deferential to the President, but that's not to say they don't have a line (listen to yesterday's hearing on tariffs). Also, the Supreme Court a decade from now will rely on today's justifications.

I do not want to give any President the power to unilaterally conduct military killings of people he considers terrorists. For this specific President, remember that he's declared Antifa a terrorist organization. And that he has very casually accused a lot of citizens of being in Antifa before.


> Or, to put it more cynically, he won't win his dream Nobel prize for killing innocent people senselessly

You say that, but the lady who just won it this year is practically cheering on the prospect of Trump taking military action _on her own country_ to overthrow its leader. So I don't think a thirst for war or death precludes winning a peace prize, unfortunately.

https://www.politico.com/news/2025/11/05/machado-praises-tru...


Enemy of my enemy...

> It wouldn't make sense for anyone: for Trump, once the truth comes out, for the military personnel doing the strikes, for the reconnaissance teams - it's just nonsensical.

You don't need to think about military personnel or reconnaissance teams. They all report to the president, and as such don't have much choice in the matter. You already said that you think Trump is an idiot.

I maintain that he's doing this because he thinks it intimidates people and makes him look strong. When, in the past, has he ever worried about getting caught doing something wrong or stupid?

