Rate limiting by IP, blocking obvious datacenter ASNs, and blocking identifiable JA3 fingerprints is quite simple and surprisingly effective at stopping most scrapers, and it can be done entirely server side. I wouldn't be surprised if this catches more than half of the problematic requests to the average website. But I agree that if you have a website "worth" scraping, there will probably be some individuals motivated enough to bypass those restrictions.
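For illustration, here is a rough Python sketch of the kind of server-side filter described above. Everything named here is an assumption for the example: `RequestFilter`, `BLOCKED_ASNS`, `BLOCKED_JA3`, and the limits are hypothetical, the ASNs are ones reserved for documentation, and the JA3 hash is a placeholder. It also assumes the ASN and JA3 values are already supplied by something upstream (an ASN database lookup and the TLS terminator).

```python
import time
from collections import defaultdict, deque

# Hypothetical blocklists. The ASNs below are reserved for documentation
# (RFC 5398) and the JA3 value is a placeholder, not a real fingerprint.
BLOCKED_ASNS = {64496, 64497}
BLOCKED_JA3 = {"0123456789abcdef0123456789abcdef"}


class RequestFilter:
    """Sliding-window rate limit per IP, plus ASN and JA3 blocklists."""

    def __init__(self, max_requests=60, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, asn, ja3, now=None):
        now = time.monotonic() if now is None else now

        # Cheap static checks first: known datacenter ASN or known bot fingerprint.
        if asn in BLOCKED_ASNS or ja3 in BLOCKED_JA3:
            return False

        # Drop timestamps that have fallen out of the window.
        hits = self._hits[ip]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()

        # Reject if this IP has used up its budget for the current window.
        if len(hits) >= self.max_requests:
            return False

        hits.append(now)
        return True
```

A request handler would call `allow(ip, asn, ja3)` once per request and return an error (for example HTTP 429 or 403) when it returns `False`. The in-memory dictionary only works for a single process; a shared store would be needed behind a load balancer.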
But then you block all VPN users, and many countries currently have some kind of censorship, so please don't do that. I have used a personal VPN for over 5 years, and being blocked like that is annoying.
I understand the other side, and captchas, POW challenges, or additional checks are okay. But give people the choice to be private and uncensorable.
Disabling the VPN every minute to access an uncensored local site that blocks datacenter IPs, then enabling it again for general browsing, is a bit of a hell.
That's a fair point. Probably the best approach would be to fall back to a client-side challenge when the server-side checks fail, but at that point it's no longer as simple a setup. Toggling a VPN is definitely annoying, but a captcha or something like POW also comes with an impact on user experience, and in my experience both are easier (and cheaper) for bots to deal with: a good-quality residential proxy where you pay per GB quickly becomes a lot more expensive than a captcha solver service or the compute for a POW challenge.
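To make the "compute for a POW challenge" part concrete, here is a minimal hashcash-style sketch in Python. It is not any particular product's scheme: the function names, the `challenge:counter` encoding, and the difficulty value are assumptions made for the example. The server hands out a random challenge, the client searches for a counter whose SHA-256 hash has enough leading zero bits, and the server verifies with a single hash.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, counter: int, difficulty: int) -> bool:
    """Server side: one hash to check a submitted counter."""
    digest = hashlib.sha256(challenge + b":" + str(counter).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    counter = 0
    while not verify(challenge, counter, difficulty):
        counter += 1
    return counter


if __name__ == "__main__":
    challenge = os.urandom(16)  # issued by the server per session
    difficulty = 16             # hypothetical setting
    counter = solve(challenge, difficulty)
    print(counter, verify(challenge, counter, difficulty))
```

The asymmetry is the whole design: solving costs the client many hashes while verifying costs the server one. That cost is the same for a bot as for a real visitor, which is why it only raises the price of scraping rather than preventing it.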
Yes, but you can trigger captcha/POW challenges based on IP reputation, which leaves regular users unaffected. I don't mind captchas too much; using the VPN is my choice.
What I mean is that it's better to give VPN users the option of solving captchas than to ban them completely.
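A small Python sketch of that policy, under stated assumptions: `reputation` is a hypothetical score between 0 and 1 (higher means more trusted) that some IP reputation source would provide, and the threshold is made up. The point is only that a low score leads to a challenge, and a block happens only after the challenge is failed.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"
    BLOCK = "block"


def choose_action(reputation: float, challenge_threshold: float = 0.5) -> Action:
    """Low-reputation IPs (VPNs, datacenters) get a challenge, never a direct block."""
    if reputation >= challenge_threshold:
        return Action.ALLOW
    return Action.CHALLENGE


def handle(reputation: float, challenge_passed: Optional[bool] = None) -> Action:
    """Decide what to do, given an optional challenge result.

    challenge_passed is None while no challenge has been attempted yet.
    """
    action = choose_action(reputation)
    if action is Action.CHALLENGE and challenge_passed is not None:
        # The visitor chose to solve the challenge; the result decides the outcome.
        return Action.ALLOW if challenge_passed else Action.BLOCK
    return action
```

With this shape, regular visitors never see a challenge, and a VPN user gets through by solving one captcha or POW challenge.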