The fact that reputable scientific journals can just disappear from the historical record is one more reason why public non-profit archives like arXiv are gradually, deservedly, becoming the dominant platforms for sharing and preserving new research.
I think LibGen and Sci-Hub are even more future-proof. Something as important as scientific (or any) literature must be distributed and uncensorable. Libraries have been destroyed by authoritarians over and over throughout history.
That's not the role of arxiv, and this problem is addressed much more easily with things like https://clockss.org/
Edit -
CLOCKSS is a backup of academic content, mirrored at 12 academic institutions around the world. It's built on existing tech, and it means that things like DOIs should still work (if I understand right: https://doi.org/10.4103/2315-7992.204679).
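To make the DOI point concrete: doi.org is just an HTTP redirect service, so if a journal dies and a dark archive like CLOCKSS "triggers" its copy, the DOI can simply be re-pointed and old links keep working. A minimal sketch in Python, using the real doi.org resolver and the DOI above (nothing else assumed):

    import requests

    # The doi.org handle resolver answers with an HTTP redirect to
    # wherever the publisher (or a successor archive) currently hosts
    # the content, so readers never need to know the hosting moved.
    doi = "10.4103/2315-7992.204679"
    resp = requests.head(f"https://doi.org/{doi}", allow_redirects=True, timeout=10)

    print("DOI:", doi)
    print("resolves to:", resp.url)
    print("status:", resp.status_code)

Some hosts reject HEAD requests; a plain GET with redirects followed does the same job.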
That may or may not be an intended "role" of archive services like arXiv, but it's one more reason, on top of many other reasons, why such services are gradually, and in my view deservedly, becoming the dominant platforms for sharing and preserving new research. Regardless of how we may think archival is most "easily" addressed, note that arXiv alone is already ~50x bigger than clockss.
Sharing, sure, it's a preprint platform. But why is it better for archiving content than CLOCKSS? How is it guaranteeing versions? What's the DOI resolution like?
> note that arXiv alone is already ~50x bigger than clockss.
I'm sorry, what's your measure of that? arXiv is about 2-3M articles IIRC, and CLOCKSS is 50M.
Why does making it "public" or "non-profit" make a difference?
Things disappear because no one is paying to keep them up, either directly in time or indirectly with money. The government cuts off projects all the time, causing them to disappear. Witness, for instance, how Henry VIII cut off the Catholic monasteries, effectively destroying their archives. Even things sent to well-funded organizations like the US National Archives disappear in confusion.
So while I think you're right that places like arXiv make more sense, I don't think that they're automatically better.
And it's an interesting question whether distributed, open-source versions have an advantage. Sure, people can clone repos, but there's often no main person who promises to keep the repo alive, which makes it easy for everyone to just assume someone else will cache a copy.
This is a very complex problem at the deepest level.
There are many companies that keep a complete archive of arXiv; arXiv itself even puts one in S3 and on Kaggle [1]. If someone just wants to serve static PDFs, it could be done very cheaply using R2 or something. I don't really think the old articles could die the way journals that forbid archiving their materials do.
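For the curious: arXiv's bulk data lives in a requester-pays S3 bucket named "arxiv", so you pay the transfer costs, not arXiv. A sketch of fetching the PDF manifest with boto3; the key name and region are from my reading of arXiv's bulk-data docs, so treat them as assumptions and check the docs before running:

    import boto3

    # arXiv's bulk PDF dump is a requester-pays bucket: downloads are
    # billed to *your* AWS account, which is why RequestPayer is required.
    s3 = boto3.client("s3", region_name="us-east-1")  # region per arXiv's docs

    # The manifest XML lists every pdf/arXiv_pdf_*.tar archive in the bucket.
    s3.download_file(
        Bucket="arxiv",
        Key="pdf/arXiv_pdf_manifest.xml",
        Filename="arXiv_pdf_manifest.xml",
        ExtraArgs={"RequestPayer": "requester"},
    )

From there, serving the extracted PDFs as static files from any object store is exactly the cheap path described above.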
My best friend and I started a journal together about 10 years ago. It folded when he passed away and now there’s almost no proof it even existed. Kinda sad.
I remember my postdoc mentor telling me this anecdote about another student of his PhD advisor:
"He published this result in this journal, which later disappeared. And I mean it literally, it was printed on acid paper which dissolved in the air over time."
Fortunately, that paper was digitized, so I could download it.
I keep reading this submission's title as "Where do journalists go to die?" Like there's some obscure place that aging journalists make a pilgrimage to after they've completed their final article (probably their own obituary) to lie down and breathe their last.
This is exactly the problem that the Internet Archive created their Scholar project to mitigate (https://scholar.archive.org/about). The https://fatcat.wiki component acts as a dashboard to track preservation of scholarly publications across multiple efforts. There are a bunch of projects in this area, including LOCKSS ("lots of copies keep stuff safe", including some fun/novel uses of cryptography), SciELO and similar regional platforms and archives (primarily outside the US/EU), PubMed Central, etc. Zenodo (CERN) and figshare end up being an accessible option for some small journals. There are definitely gaps that content falls through and gets lost.
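If you want to poke at the fatcat side programmatically, it exposes a small REST API. A sketch below, reusing the DOI mentioned upthread; the endpoint path and the "expand" parameter are from my reading of the fatcat API guide, so treat both as assumptions:

    import requests

    # Look up a "release" (one version of a paper) by DOI. The expanded
    # "files" list is how fatcat tracks which preserved copies are known.
    doi = "10.4103/2315-7992.204679"  # DOI mentioned upthread, just as an example
    resp = requests.get(
        "https://api.fatcat.wiki/v0/release/lookup",
        params={"doi": doi, "expand": "files"},
        timeout=10,
    )
    resp.raise_for_status()
    release = resp.json()
    print(release.get("title"))
    print("known preserved copies:", len(release.get("files") or []))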
A few folks have mentioned shadow libraries like Sci-Hub. These efforts can play an archival role, but tend to focus on access, which means there is not as much attention on content which is freely available today, but could disappear in the future.
A common dynamic here is that clout and funding flow to globally prestigious publications, and there is a bias against marginal publications. For sure there are many content farms and scammy publications, but a lot of gems and valuable small publications get bundled in and dismissed.
This is not a new issue. I’ve been hunting down some very specific issues of some very obscure journals for years.
To make things more complicated, some of the journals were privately published for a private audience, but to simplify the logistics, anyone could subscribe.
Such issues are sporadically found online and in libraries.
Why can't we just mandate that people send a copy of every publication to a UN Media Preservation Library and use something like redundant LTO tapes + Piql film to make sure stuff is never lost?
Yes, it's expensive, but if we divide that expense across 8 billion people it becomes affordable.