
>This proposal is about bots identifying themselves through open HTTP headers.

The problem is that to CF, everything that isn't Chrome is a bot (only a slight exaggeration). So browsers that aren't made by large corporations wouldn't have this. It's like how CF uses CORS.

CORS isn't CF-only, but it's an example of their requiring obscure things almost no one else uses, and using them in weird ways that most browsers can't handle. The HTTP-header CA signing is yet another of these, and the weird modifications of TLS flags fall right in there too. It's basically Proof-of-Chrome via a Gish gallop of new "standards" they come up with.

>Absolutely nothing wrong with this, as it's site owners that make the decision for their own sites.

I agree. It's their choice. I am just laying out the consequences of these mostly uninformed choices. They won't initially be aware that they're blocking a large number of their actual human visitors. I've seen it play out again and again with sites and CF. Eventually the sites are doing so much work maintaining their whitelists of UAs and IPs that one wonders why they use CF at all, if they're doing the job themselves anyway.
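To make the maintenance burden concrete, here's a minimal sketch of the kind of allowlist logic a site ends up hand-curating alongside (or instead of) CF's filtering. All names and entries here are illustrative, not from any real deployment:

```python
# Hypothetical UA/IP allowlist a site operator ends up maintaining by hand.
from ipaddress import ip_address, ip_network

# Every non-Chrome browser that gets falsely flagged has to be added manually.
ALLOWED_UA_SUBSTRINGS = [
    "Firefox",
    "Mobile Safari",
]

# Stand-ins for a partner's crawler, an office VPN, etc.
# (203.0.113.0/24 and 2001:db8::/32 are documentation ranges, RFC 5737/3849.)
ALLOWED_NETWORKS = [
    ip_network("203.0.113.0/24"),
    ip_network("2001:db8::/32"),
]

def is_allowed(user_agent: str, client_ip: str) -> bool:
    """Return True if the request should bypass the bot challenge."""
    if any(s in user_agent for s in ALLOWED_UA_SUBSTRINGS):
        return True
    addr = ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Each false positive means another entry in one of those lists, which is exactly the job the service was supposed to be doing for you.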

And that's not even starting on the bad and aggressive defaults for CF free accounts. In the last month or two they have slightly improved this, so there's some hope. They know they're a problem, because they're so big.

"It was a decision I could make because I’m the CEO of a major Internet infrastructure company." ... "Literally, I woke up in a bad mood and decided someone shouldn't be allowed on the Internet. No one should have that power." - Cloudflare CEO Matthew Prince

(ps. You made some good and valid points, re: IETF process status quo, personal choice, etc, it's not me doing the downvotes)




There's another problem here that I haven't seen anyone talking about, and that's the futility of trying to distinguish between "good bots" and "bad bots".

The idea of Anubis is to stop bots that are meant to gather data for AI purposes. But you know who has a really big AI right now? Google. And you know who are the people who have the most bots indexing the web for their search engine? Yup, Google.

All these discussions have been assuming that Googlebot is a "good bot", but what exactly is stopping Google from using the data from Googlebot to feed Gemini? After all, nobody's going to block Googlebot, for obvious reasons.

At most, surely the only thing that blocking AI bots will do is stop locally-running bots, or stop OpenAI (because they don't have any other legitimate reason to be running bots over the web).
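The blunt instrument available today, robots.txt, illustrates the asymmetry: you can name crawlers whose only job is AI data collection, but blocking Googlebot also drops you from search. A sketch (the crawler tokens are real; the policy is illustrative):

```
# Illustrative robots.txt: the AI crawlers a site can afford to name
User-agent: GPTBot
Disallow: /

# Blocking Googlebot is technically just as easy, but it also removes
# the site from Google Search, which is why almost nobody does it.
User-agent: Googlebot
Allow: /
```

Nothing in that file constrains what Google does internally with what Googlebot fetches.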



