Hacker News

Yes, but they aren't going to care for just 2400 pages.

As a general rule, make your scraper non-parallel and set a User-Agent that includes contact details in case of an issue, and you're probably all good.

After all, Wikipedia is meant to be used. Don't be unduly disruptive: don't scrape 20 million pages. But scraping a couple thousand is totally acceptable.

Source: I used to work for Wikimedia, albeit not in the SRE department. My opinions are of course totally my own.
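The advice above (sequential requests, a User-Agent with contact details) can be sketched in a few lines of Python standard library. The bot name, URL, and email address here are placeholders, and the one-second delay is an illustrative choice, not an official Wikimedia number:

```python
import time
import urllib.parse
import urllib.request

# A descriptive User-Agent with contact details, as advised above.
# The project name, URL, and email address are placeholders.
USER_AGENT = "MyResearchBot/1.0 (https://example.org/mybot; mybot@example.org)"


def build_request(title: str) -> urllib.request.Request:
    """Build a request for one Wikipedia page, identifying the bot politely."""
    url = "https://en.wikipedia.org/wiki/" + urllib.parse.quote(title)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def scrape(titles, delay_seconds=1.0):
    """Fetch pages one at a time (non-parallel), pausing between requests."""
    pages = {}
    for title in titles:
        req = build_request(title)
        with urllib.request.urlopen(req) as resp:
            pages[title] = resp.read()
        time.sleep(delay_seconds)  # be gentle: one request at a time
    return pages
```

The key point is that there is no concurrency anywhere: one request finishes, the loop sleeps, then the next begins.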



I don’t think the OP was talking specifically to the content author, but to all the people who read the article and get the idea to scrape Wikipedia.


Honestly, I'd rather people err on the side of scraping Wikipedia too much than live in fear of being disruptive and not do cool things as a result. Wikipedia is meant to be used to spread knowledge. That includes data-mining projects such as the one in this blog.

(Before anyone takes this out of context: no, I'm not saying it's OK to be intentionally disruptive, or to do things without exercising any care at all. Also, always set a unique, descriptive User-Agent with an email address if you're doing anything automated on Wikipedia.)


Having been on the other side of this, I’d rather we encourage people to use formats and interfaces designed for machines, the right tool for the job, instead of scraping everything.

It’s incredibly easy for careless scrapers to disrupt a site and cost real money without having a clue what they’re doing.

I want people to think twice and consider what they are doing before they scrape a site.
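For Wikipedia specifically, one machine-oriented interface is the MediaWiki Action API, which returns page content as JSON rather than rendered HTML. A minimal sketch of building such a query with the standard library (the batching limit noted in the comment is the API's documented per-request cap for regular users):

```python
import urllib.parse

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_api_query(titles):
    """Build a MediaWiki Action API URL that returns page wikitext as JSON,
    instead of scraping the rendered HTML."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",         # fetch revision data...
        "rvprop": "content",         # ...specifically the page content
        "rvslots": "main",           # from the main content slot
        "titles": "|".join(titles),  # the API accepts up to 50 titles per call
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)
```

One API call like this replaces dozens of HTML fetches, which is cheaper for the site and less fragile for the scraper. (For truly large jobs, Wikimedia also publishes full database dumps, which avoid live requests entirely.)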



