
Hmm, all they'd have to do is serve a dynamic robots.txt that forbids the Wayback Machine from crawling the deleted articles, and that would close the workaround too, no?
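For example, the generated robots.txt might look something like this; a minimal sketch where ia_archiver is the user agent the Internet Archive's crawler has historically used, and the article paths are hypothetical:

    User-agent: ia_archiver
    Disallow: /articles/2009/some-deleted-story
    Disallow: /articles/2011/another-deleted-story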


Once it's stored, I imagine they don't even need to scrape the page again, so robots.txt wouldn't do anything.


The Internet Archive does rescrape periodically, and it removes archived pages based on the current robots.txt. This is documented behavior of the Archive, and it goes beyond the normal conventions of robots.txt.


I would add that the content itself is not removed. They only stop displaying it whilst the robots.txt says not to. If they cannot reach your robots.txt, the content comes back, as I have experienced multiple times.


Sure, but they'd have to create a list of every deleted article, which seems like it would be pretty long.


I assume their software is quite capable of automatically generating such a list and including it in the robots.txt.
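Something along these lines would do it; a minimal sketch in Python, assuming the CMS can be queried for the paths of deleted articles (the deleted_paths list and output location here are hypothetical):

    # Hypothetical sketch: regenerate robots.txt from the site's list of
    # deleted article paths so the Wayback Machine stops displaying them.
    deleted_paths = [
        "/articles/2009/some-deleted-story",
        "/articles/2011/another-deleted-story",
    ]  # in practice, queried from the CMS database

    lines = ["User-agent: ia_archiver"]  # the Internet Archive's crawler
    lines += [f"Disallow: {path}" for path in deleted_paths]

    with open("robots.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

This could run on a schedule or as a hook whenever an article is deleted, so the file stays in sync without anyone maintaining the list by hand.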



