Many years ago, I was asked to look at why all the content had vanished from a site (not built by me). After digging in a bit, I found that:
1) the original developer's idea of handling an unauthorized /admin request was just to set a redirect header and then continue processing the request as normal.
2) the /admin page had a grid of all the content on the site, with handy 'Delete' links that ran over GET without confirmation.
You can probably guess where this is going – some search bot hit the overview page, ignored the redirect header, saw the content, and dutifully crawled every single link on it…
I think the state of the web has improved slightly over the last decade, but this is a great example of why browser vendors are so conservative. You can still do things like this, but only on an opt-in basis.
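The failure mode above can be sketched in a few lines. This is a hypothetical handler, not the actual site's code (which I never saw the source of): the unauthorized branch sets a Location header but never returns, so the admin grid, complete with its state-mutating GET "Delete" links, still lands in the response body. A browser follows the 302 and the user never notices; a crawler that ignores the redirect status sees the links and follows them.

```python
# Minimal sketch of the bug (hypothetical handler, no real framework).
def handle_admin(user_is_authorized: bool) -> tuple[int, dict, str]:
    status, headers = 200, {}
    if not user_is_authorized:
        status = 302
        headers["Location"] = "/login"
        # BUG: no early return here -- execution falls through and the
        # protected page is rendered into the body anyway.
    # Admin grid with a destructive action behind a plain GET link:
    body = '<a href="/admin/delete?id=1">Delete</a>'
    return status, headers, body

status, headers, body = handle_admin(False)
```

The fix is two-fold: return immediately after setting the redirect (or raise/short-circuit, depending on the framework), and never mutate state on GET, since anything that follows links, including well-behaved crawlers, will trigger it.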
Was it blekko? We had a website owner email us about that issue when blekko's ScoutJet crawler was new... although I don't recall the bit about ignored redirect headers.
I'm pretty sure everyone with a crawler has hit this sort of problem before. The first startup I was at did with someone's wiki that had "delete" links everywhere with no auth.
Now that I've hit it once, I watch out for websites with this problem. I was surprised to notice that a Fortune 50 tech company's internal employee-personal-webpages-maker-thingie had that issue. And then a week later they asked me if I could crawl their internal web. Uh, no, who knows what other internal systems had that problem?