Internet archive does rescrape periodically, and it removes archived pages based... | Hacker News

dragonwriter on May 1, 2018 | on: Medium tries to prevent people reading deleted art...

The Internet Archive does rescrape periodically, and it removes archived pages based on the current robots.txt. This is documented behavior of the archive, and it goes beyond the normal conventions of robots.txt.

Bender on May 1, 2018

I would add that the content itself is not removed. They only stop displaying it while the robots.txt says not to. If they cannot reach your robots.txt, the content comes back, as I have experienced multiple times.
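
To make the behavior described above concrete, here is a minimal sketch in Python of a display decision that depends on the site's current robots.txt. It is an illustration under stated assumptions, not the archive's actual code: the function name `should_display` and the user agent `archive-bot` are hypothetical, and an unreachable robots.txt is modeled as `None`.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def should_display(current_robots_txt: Optional[str], url: str,
                   agent: str = "archive-bot") -> bool:
    """Decide whether an already-archived page is shown.

    Hypothetical helper: the stored copy is never deleted here, only
    hidden while the site's *current* robots.txt disallows the URL.
    """
    # Assumption: None means robots.txt could not be fetched, in which
    # case the stored content is displayed again.
    if current_robots_txt is None:
        return True

    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    blocking = "User-agent: *\nDisallow: /\n"
    print(should_display(blocking, "https://example.com/post"))
    print(should_display(None, "https://example.com/post"))
```

In this sketch the archived copy stays in storage the whole time; only the answer to "show it or not" changes when the robots.txt changes or becomes unreachable.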
