
> If your goal is to "free access to everyone except people I don't like but I can't ask people if they are people who I don't like", well, I suppose it isn't a good mitigation for you, sorry.

Ah. Yes, the part in quotes here is what I think would count as a solution -- I've been assuming that simply steering anonymous users towards logging in would otherwise be the obvious thing to do, and that doing so is unacceptable for some reason. I was hoping that, despite attackers dispersing themselves across IP addresses, there would be either (a) some signal that nevertheless identifies them with reasonable probability (perhaps Referer headers, or their absence on requests for deeply nested URLs), or (b) some blanket policy that can be enforced which hurts everyone a little but hurts attackers more (think chemotherapy).
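
To make (a) concrete, a minimal sketch of such a heuristic, assuming an HTTP middleware that can see the request path and Referer header; the depth threshold and function name are invented for illustration:

    // Hypothetical signal: anonymous requests for deeply nested URLs that
    // arrive with no Referer are unlikely to be a human clicking through
    // the site. The threshold (4) is an assumption, not a known-good value.
    function looksLikeScraper(path: string, referer: string | null): boolean {
      const depth = path.split("/").filter(Boolean).length;
      return depth >= 4 && referer === null;
    }

A real deployment would treat this as one weak signal among several, not a verdict on its own.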

> It will start rate-limiting (not blocking) anonymous users (not everyone).

If some entities (attackers) are making requests at 1000x the rate of others (legitimate users), the practical effect of rate-limiting the anonymous pool will be to block the latter nearly all the time: the attackers drain the shared budget before legitimate users get a look in.
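
To illustrate, a minimal sketch assuming the rate limit is a single token bucket shared by all anonymous traffic (class name and numbers are made up):

    // Shared bucket for all anonymous requests; capacity and refill rate
    // are illustrative, not recommendations.
    class AnonBucket {
      private tokens: number;
      constructor(private capacity: number, private refillPerSec: number) {
        this.tokens = capacity;
      }
      refill(elapsedSec: number): void {
        this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
      }
      tryTake(): boolean {
        if (this.tokens >= 1) { this.tokens -= 1; return true; }
        return false;
      }
    }

    // If attackers issue 1000 requests for every legitimate one, then once
    // the bucket is drained each refilled token goes to a legitimate user
    // with probability ~1/1001 -- rate limiting becomes de facto blocking.
    const bucket = new AnonBucket(100, 100);

Per-IP buckets don't obviously rescue this either, since the premise is that attackers are dispersed across many addresses and each one stays under the per-IP limit.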

> Scaling this up to an industrial-scale scraping operation

My understanding was that the PoW would be done in-browser, in which case this doesn't hold -- the attackers would simply use the multitudes of residential browsers they already control to do the PoW prior to making the requests, thus perfectly distributing that workload to other people's computers. What kind of PoW cannot be done in this way?
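
For reference, a minimal sketch of the kind of hashcash-style PoW typically run in-browser, using only the standard Web Crypto API (the function names are my own, and the scheme is an assumption about what such a challenge looks like):

    // Find a nonce such that SHA-256(challenge + nonce) has at least
    // `bits` leading zero bits. Nothing here ties the work to any
    // particular machine, which is exactly the concern above.
    async function solvePow(challenge: string, bits: number): Promise<number> {
      const enc = new TextEncoder();
      for (let nonce = 0; ; nonce++) {
        const digest = new Uint8Array(
          await crypto.subtle.digest("SHA-256", enc.encode(challenge + nonce)),
        );
        if (leadingZeroBits(digest) >= bits) return nonce;
      }
    }

    function leadingZeroBits(bytes: Uint8Array): number {
      let count = 0;
      for (const b of bytes) {
        if (b === 0) { count += 8; continue; }
        count += Math.clz32(b) - 24; // clz32 counts over 32 bits; a byte uses 8
        break;
      }
      return count;
    }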



> My understanding was that the PoW would be done in-browser, in which case this doesn't hold -- the attackers would simply use the multitudes of residential browsers they already control to do the PoW prior to making the requests, thus perfectly distributing that workload to other people's computers. What kind of PoW cannot be done in this way?

I could be mistaken, but I don't think these residential VPN services are actual botnets. You can use the connection, but not the browser. In any case, you can scale the work factor as you want, making "unlikely" endpoints harder to access (e.g. git blame for an old commit might require a proof 100x more expensive than the main page of a repository). This doesn't make it impossible to scrape your website; it makes it more expensive to do so, which is what the OP was complaining about ("externalizing costs onto me").
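
A minimal sketch of that kind of scaling, assuming a leading-zero-bits scheme like the one sketched above (paths, base difficulty, and increments are all invented; each extra bit doubles the expected work, so +7 bits is ~128x, in the ballpark of the 100x figure):

    // Hypothetical per-endpoint difficulty schedule.
    function powBits(path: string): number {
      if (path.includes("/blame/"))  return 12 + 7; // ~128x the base work
      if (path.includes("/commit/")) return 12 + 4; // ~16x for old commits
      return 12;                                    // cheap default, e.g. a repo's main page
    }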

All in all, it feels like there's real potential in leveraging proof of work as a way to maintain anonymous access while still limiting your exposure to excessive scrapers. It probably isn't a one-size-fits-all solution, but with some domain-specific knowledge it feels like it could be a useful tool to have in the new internet landscape.


> You can use the connection, but not the browser.

Fair enough -- that would likely be the case if they're using "legitimate" residential IP providers, in which case they would indeed need to pay for the PoW themselves somehow.



