



The intention is to crash bots' browsers, not users' browsers

Please point me to this 100% correct bot detection system with zero false positives.

You understand the difference between intent and reality, right?

The article even warns about this side-effect.




If you are scraping data that my robots.txt forbids, I don't give a damn. I am gonna mess with your bots however I like, and I'm willing to go as far as it takes to teach you a lesson about respecting my robots.txt.
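
For context, this is the kind of rule being ignored; a minimal robots.txt sketch, with an illustrative path:

    User-agent: *
    Disallow: /private/

Well-behaved crawlers honor the Disallow line; the whole dispute here is about the ones that don't.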

Then I will teach you a lesson about trying to make public data private. Residential proxies and headful browsers go brrrr.



Malware installation is something completely different from segfaulting an .exe file that is running the scraper process.

If illegal scraping is the machine's expected behavior, then what the machine is doing is already covered by the Computer Fraud and Abuse Act.


Several points here:

Not sure the jurisdictions covered by the Computer Fraud and Abuse Act have determined there is such a thing as "illegal scraping".

Does the Computer Fraud and Abuse Act cover segfaulting an .exe file? I don't know; I don't live in the country that has it.

If the Computer Fraud and Abuse Act says it is OK to segfault an .exe, which I highly doubt, is the organization doing this segfaulting as part of its protection against this supposed "illegal scraping" actually checking that the machines it is segfaulting are all in jurisdictions covered by the Act?

What happens if they segfault machines outside those jurisdictions and other laws apply there? I'm guessing they might be screwed then. They should have thought about that, being so clever.

Hey, I get it. I am totally the kind of guy who might decide to segfault someone who is costing me a lot of money by crawling my site and ignoring my robots.txt. I'm vengeful like that. But I would accept that what I am doing is probably illegal somewhere (too bad), I definitely wouldn't go around arguing it was totally legal, and I would be open to the possibility that this fight I'm jumping into might have some collateral damage. Sucks to be them.

Everybody else here seems to be all righteous about how they can destroy people's shit in retaliation, when the people whose computers they are destroying might not even know anyone has a beef with them.

On edit: obviously, once it got to the courts or the media, I would argue it was totally legal, ethical, and the right thing to do to prevent these people from attacking other sites with their "illegal scraping" behavior. Because I don't win the fight if I get punished for winning. I'm just talking about keeping a clear view of what one is actually doing in the process of winning the fight.


Not my problem. The problem will be for the malware creator. Twice.

If you are crashing some browser from a disallowed directory in robots.txt, it is not your fault.



> If you’re not familiar with this, read up on it, the reasons can be quite thought-provoking

Are the reasons relevant to headless web browsers?


Here's a potentially relevant example:

https://news.ycombinator.com/item?id=43947910


Some, definitely not. Others, quite possibly.

Because people may be hurt.

Which people may be hurt by crashing the machine where the bot is running?


When said people decide to rob your home, they lose the right to not be hurt, IMO. Of course proportionality and all that.

If that's the case, what do we do about websites and apps that do things like disable your back button (your mobile phone's native one) or your right-click capability (in a desktop browser), when that functionality disabling is not present in the ToS or even presented to you upon visiting the site or using the app?

Then maybe we need laws about crashing my server by crawling it 163,000 times per minute nonstop, ignoring robots.txt? Until then, no pity for the bots.

If your software crashes due to normal usage, then you only have yourself to blame.

Yes indeed. Nginx running out of RAM due to A"I" companies hammering my server is my fault.

Yes. Fix your configuration so it won't try to allocate more RAM than you have. You can still be upset about them hammering your site, but if your server software crashes because of it, that's a misconfiguration you should fix regardless.
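
For the nginx case specifically, a minimal sketch of per-IP throttling; the zone names, rates, and limits are illustrative, not a drop-in config:

    # Inside the http block: shared-memory zones keyed by client IP.
    limit_req_zone  $binary_remote_addr zone=perip:10m   rate=5r/s;
    limit_conn_zone $binary_remote_addr zone=peraddr:10m;

    server {
        location / {
            # Absorb short bursts; reject sustained floods with an error
            # instead of queueing them until memory runs out.
            limit_req  zone=perip burst=10 nodelay;
            # Cap concurrent connections per client IP.
            limit_conn peraddr 20;
        }
    }

This won't stop a distributed crawler on its own, but it does keep any single client from exhausting the server's memory.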

By all means, tell me how I should configure NGINX so that it properly serves all the real humans who wish to visit my website without crashing due to the idiot robots.

Try rereading the above instead of making up your own fantasies.

Running a bot farm?

Of course not; why are you immediately jumping to accusations? If I were, I'd just patch the bug locally and thank OP for pointing out how they're doing it.

It's just blatantly illegal, and I wouldn't want anyone to get into legal trouble.



