I'm glad all of the sites you target want your scraper to access them. In many cases the whole point of a scraper is to get at information that isn't exposed through an API and is only available embedded in HTML. Most of those sites' robots.txt files read "User-Agent: *\nDisallow: /\n".
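For anyone who does want to behave, checking robots.txt before fetching takes a few lines with Python's stdlib. A minimal sketch; the site URL and user-agent string are made up for illustration:

```python
from urllib import robotparser

# Hypothetical target and user-agent string, for illustration only.
ROBOTS_URL = "https://example.com/robots.txt"
USER_AGENT = "MyScraper/1.0"

rp = robotparser.RobotFileParser()
rp.set_url(ROBOTS_URL)
rp.read()  # fetch and parse the site's robots.txt

if rp.can_fetch(USER_AGENT, "https://example.com/some/page"):
    print("allowed; fetch away")
else:
    print("disallowed; skip it")
```

Against a blanket "Disallow: /" like the one above, can_fetch() returns False for every path, which is exactly the answer the site is giving you.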
"Just ignore it" is a great way to identify yourself as a crappy netizen.