
You don't need to agree to a TOS to scrape a public website, so whatever they write in it is moot.



Reddit may not be a public website for long. They push their app and prompt for login so often that it wouldn't surprise me if they shut off parts of the site to logged-out users.


True, though that will seriously hurt their SEO.


They can still ban your service, e.g. block your IPs. Sure, you could then play a cat and mouse game with them, but your API clone going down every few days will make it unusable for anyone with a real purpose. The companies that reddit (officially) wants to target, e.g. OpenAI et al., have no problem scraping.


You can just host on GCloud and report as Googlebot, good luck with that. There are also plenty of proxy services.

> The companies that reddit (officially) wants to target, e.g. OpenAI et al., have no problem scraping.

Indeed, in their case they just need the datasets anyway.


Wasn't there a recent SCOTUS case that upheld the need to follow TOS?



