
And together, the thousands of other bots also hitting those account for far more than the legitimate traffic on many sites.


Yeah, there were times, even running a fairly busy site, that the bots would outnumber user traffic 10:1 or more, and the bots loved to endlessly troll through things like archive indexes that could be computationally (db) expensive. At one point it got so bad that I got permission to just blackhole all of .cn and .ru, since of course none of those bots even thought of half-obeying robots.txt. That literally cut CPU load on the database server by more than half.
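
For anyone wondering what obeying robots.txt would look like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `/archive/` path, the `ExampleBot` user agent, and the `allowed` helper are all made up for illustration; they are not taken from the site described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of expensive archive indexes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the (hypothetical) robots.txt permits fetching url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/archive/2010/", "https://example.com/about"):
        print(url, allowed("ExampleBot", url))
```

A polite crawler would run a check like this before every request. The bots described above skip it, which is why the blocking had to happen at the network level instead.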


In the last month, bot traffic on my forum has exploded to 10:1 over human traffic because of LLM bots, according to Cloudflare.

It would be one thing if it were driving more users to my forum. But human usage hasn't changed much, and the bots drop cache hit rate from 70% to 4% because they go so deep into old forum content.

I'd be curious to see a breakdown of what the bots are doing. On-demand searches? General data scraping? I ended up blocking them with CF's Bot Blocker toggle, but I'd allow them if they were doing something beneficial for me.
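
To put rough numbers on why the cache hit rate matters, here is a back-of-the-envelope sketch. It assumes that the 70% and 4% hit rates apply to all requests, that a 10:1 bot-to-human ratio means 11 times the original request volume, and that there are 1,000 human requests per interval. That last figure and the `origin_requests` helper are invented purely for illustration.

```python
def origin_requests(total_requests: int, cache_hit_rate: float) -> float:
    """Requests that miss the cache and reach the origin server."""
    return total_requests * (1.0 - cache_hit_rate)


# Hypothetical baseline: 1,000 human requests per interval, 70% cache hit rate.
human_requests = 1_000
before = origin_requests(human_requests, 0.70)

# Assumed bot scenario: 10 bot requests per human request, 4% cache hit rate.
total_with_bots = human_requests * (1 + 10)
after = origin_requests(total_with_bots, 0.04)

if __name__ == "__main__":
    print(f"origin requests before: {before:.0f}")
    print(f"origin requests after:  {after:.0f}")
    print(f"increase: {after / before:.1f}x")
```

Under those assumptions the origin server handles about 35 times as many requests as before, even though human usage is unchanged.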


For me (as I'm sure for plenty of other people as well) limiting traffic to actual users matters a lot because I'm using a free hosting tier for the time being. Bots could quickly exhaust it, and the website would then be unavailable for the rest of the current "free billing" cycle, i.e. until the quota gets renewed.



