
I'm glad that all of the sites you target want your scraper to access them. In many cases the whole point of using a scraper is to get at information that isn't provided through an API or is otherwise locked up in HTML. Most of those sites' robots.txt files read "User-Agent: *\nDisallow: /\n"
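For what it's worth, a rule set like that can be checked mechanically before fetching anything. Here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `may_fetch`, the user agent `example-scraper`, and the example.com URL are placeholders I made up for illustration, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body quoted above: every user agent is disallowed everywhere.
ROBOTS_TXT = "User-Agent: *\nDisallow: /\n"


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: it parses the text directly instead of
    downloading it, so it can be tried without network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder agent name and URL, assumed for the example.
    print(may_fetch(ROBOTS_TXT, "example-scraper", "https://example.com/some/page"))
```

With the blanket "Disallow: /" rule the check comes back False for any path, which is exactly the situation being described. Note that this only reports what the file says; whether a scraper honors it is a separate question.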


Then we have no right to scrape that content.

Why is there an implied right to scrape?



