I don't know if I'm reading this right, but it sounds like the article takes "A cool URL doesn't change" to mean "Expect that URLs are forever" - where I take it to mean "To be cool - don't change your URLs".
My point is - I expect all URLs to rot or change. But you're still cool for keeping your URLs stable while it's in your power.
Reddit permalinks disappearing is the very reason for the phrase, not proof that it's wrong.
The article is using this phrase per https://www.w3.org/Provider/Style/URI , which has nothing to do with Reddit and certainly was not inspired by it. The point M. Siebenmann is clearly making is that a quarter of a century's hindsight shows a whole bunch of holes in the W3C's argument, with the recent Reddit events being simply one more counterexample in a long list of counterexamples.
I would say that this style doc would respect the technical choices of both Twitter and Reddit, since even though the rules of access changed, the URLs themselves didn't.
From that document:
> At W3C we divide the site into "Team access", "Member access" and "Public access". It sounds good, but of course documents start off as team ideas, are discussed with members, and then go public. A shame indeed if every time some document is opened to wider discussion all the old links to it fail! We are switching to a simple date code now.
Thus the correct choice in URL design was indeed made, as (AFAIK) no links need to be updated even though the rules for who is allowed to access the content at that URL changed.
This question of URL design is orthogonal to any debate about the reasons and justifications for restricting access to content that was previously public.
Agreed. The corollary being that URLs that do change usually do so because of poor stewardship on the part of some responsible party: Reddit and Twitter are good examples of that.
Reddit has had a delete button on comments and a private subreddit option from the start. I don't think anyone would expect a permalink to show deleted or hidden content.
"There is nothing more temporary than a permanent solution and nothing more permanent than a temporary solution" needs to be designated as a rule or law akin to Hofstadter's law or the Ninety-ninety rule. As a placeholder until a better name is chosen, I propose we call it Foundart's Rule.
Googling for "Daugherty's Law" only returns law firms, and a cursory reading up on Richard Daugherty doesn't seem to indicate anything of the aforementioned.
So, yes, would appreciate a citation so we can all learn a new trivia of the day.
A second and more illustrative case is that within the past day or so, Twitter has stopped letting you see anything (tweets, profiles, etc.) unless you're a logged-in user of the site.
These recent examples have probably also brought to prominence a point that I've always considered a given: if you think some data you come across online is important, save a copy; it might be the last time you're able to access that URL. As the old saying goes, "data that is not backed up is data you don't want to keep". Storage costs, relatively speaking, nothing for textual data, and very little for images. Relying on browser bookmarks, or even worse, search engine queries, to find content and expecting others to continue to provide what you truly want, increasingly seems like a bad idea.
Another good example of this might be the transition from HTTP to HTTPS. Granted, that's not actually the domain name, but most people probably wouldn't make the distinction. Although HTTP URLs still work on most domains in most browsers, most sites have transitioned to HTTPS. Further, HTTP might not always be supported by browsers, and as a site owner you don't have a lot of control over that. There are some solutions to protocol changes, such as redirecting, but that too could fail down the road.
The internet is so ephemeral and at the same time feels rather permanent.
In terms of backwards compatibility, I think that would be an easier one to resolve. Returning a 308 (Permanent Redirect) for every http request, pointing at the https version, is a single catch-all rule.
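For concreteness, here's a minimal sketch of that catch-all rule using only Python's standard library; the hostname fallback and port are illustrative, not a production setup:

```python
# Minimal sketch of a catch-all HTTP -> HTTPS redirect. Every request hitting
# port 80 gets a 308 pointing at the same host and path over https.
from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectToHTTPS(BaseHTTPRequestHandler):
    def _redirect(self):
        host = self.headers.get("Host", "example.com").split(":")[0]
        self.send_response(308)  # Permanent Redirect: method and body preserved
        self.send_header("Location", f"https://{host}{self.path}")
        self.end_headers()

    # the same single rule for every method
    do_GET = do_HEAD = do_POST = do_PUT = do_DELETE = _redirect

if __name__ == "__main__":
    HTTPServer(("", 80), RedirectToHTTPS).serve_forever()
```

In practice you'd put this in the web server or load balancer config rather than run a separate process, but the rule itself really is that small.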
Any harm in just making it default browser behavior to first try the link as clicked, then try https if it fails? It could possibly lead to unexpected behavior for resources loaded in the background, but I can't think of a concrete example.
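Roughly this, as client-side logic (a toy sketch with the standard library; real browsers layer timeouts, HSTS, and mixed-content policy on top of it):

```python
# "Try the link as clicked, then retry over https if it fails."
from urllib.request import urlopen
from urllib.error import URLError

def fetch_with_https_fallback(url: str, timeout: float = 10.0) -> bytes:
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except URLError:
        # only fall back when the original scheme was plain http
        if url.startswith("http://"):
            https_url = "https://" + url[len("http://"):]
            with urlopen(https_url, timeout=timeout) as resp:
                return resp.read()
        raise
```

The unexpected-behavior worry would mostly be around background requests being silently retried against what is technically a different origin.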
The worst part of this is embedded http content in an https site breaking due to changing browser policies. Images are upgraded to https if possible, otherwise broken. Script embeds are just unilaterally broken if the embed is http, even if https is available.
In retrospect this "cool URLs don't change" ideology might have been a net negative. Instead of trying to fight the inevitability that content moves around and occasionally gets lost, it could have been better to embrace the ephemerality and promote mirroring and storing local copies. "Cool URLs are mirrorable"
Notably, the WARC format was developed only 10 years after that famous essay, and there are a lot of other things that could have been done to make mirroring a proper first-class citizen on the web.
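To be fair, mirroring a page into a WARC is only a few lines these days. A sketch, assuming the third-party warcio and requests packages (it closely follows warcio's documented usage; the URL and filename are just examples):

```python
# Mirror one URL into a WARC file using the third-party `warcio` and
# `requests` packages. The target URL and output filename are illustrative.
import requests
from warcio.warcwriter import WARCWriter
from warcio.statusandheaders import StatusAndHeaders

url = "https://example.com/"

with open("mirror.warc.gz", "wb") as output:
    writer = WARCWriter(output, gzip=True)
    resp = requests.get(url, stream=True)
    http_headers = StatusAndHeaders("200 OK", resp.raw.headers.items(),
                                    protocol="HTTP/1.0")
    record = writer.create_warc_record(url, "response",
                                       payload=resp.raw,
                                       http_headers=http_headers)
    writer.write_record(record)
```

The hard part was never the format; it was making "save a mirrorable copy" a default expectation rather than an afterthought.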
No, they can’t. By definition, if they do that they are no longer cool.
Of course, Reddit and Twitter URLs were never especially cool in the first place. They’re particularly uncool now. Tragically unhip, you might even say.
A permanent link resembles the digital archive laws, where governments are mandated to keep documents readable indefinitely. Why not extend that to official publications and demand that gov links never change? Not only should the link stay available, the page must also be renderable on a device still available in 500 years.
It is complicated but doable. There is a list of allowed formats, like PDF/A, MS Word, and SQL, where there is consensus they will be readable forever. Not sure how archive.org does it, but I assume they also transform a page to static, standard HTML.
If you ever end up in the distant future, go to Svalbard and look for the Arctic World Archive. They have microfilm copies of a huge amount of data. They have Wikipedia pages in microfilm format, so all you need is a magnifying glass to get started. You can then look for the Github Code Vault slides that explain how to restart technology from scratch and run the code in the git repository archives.
One of the foundational ideas of the web (vs. most earlier hypertext systems) was that links could break, so there was no global scalability issue.
I've long thought we needed a replacement that doesn't have that property. NNTP had a good storage model for important data:
If you want to read articles from a site, then you create a mirror.
I wish we'd move back to that.
In addition to supporting distributed archival (the only type of archival that works), it breaks targeted ads, and eliminates the incentives that lead to clickbait.
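For anyone who never used it, the model is roughly this, a toy sketch with Python's nntplib (in the standard library through 3.12, removed in 3.13); the server and group names are made up:

```python
# NNTP's "mirror to read" model in miniature: pull recent articles from a
# group and keep local copies. Server and group names are hypothetical.
from nntplib import NNTP
from pathlib import Path

with NNTP("news.example.org") as server:
    _, count, first, last, name = server.group("example.discussion")
    outdir = Path("mirror") / name
    outdir.mkdir(parents=True, exist_ok=True)
    # grab the last ten articles; your copy outlives the server's retention
    for num in range(max(first, last - 9), last + 1):
        try:
            _, info = server.article(num)
        except Exception:
            continue  # expired or missing on this peer
        (outdir / f"{num}.eml").write_bytes(b"\r\n".join(info.lines))
```

Reading implied copying, so archival came for free and no single server was the canonical point of failure.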
I do wonder about all those cool custom .something TLDs. Currently the private registry might be charging tens or hundreds of dollars for them. But what will happen once the brand becomes a household name?
If Google had been on a .supercool domain back in the day, what would the owner of .supercool charge them today?
I know that today Google would probably buy them out, but in the transition period of going from nobody to Google they might need to pay hundreds of thousands of dollars for their domain.
Archiving sites are doing the lord's work with respect to this. The unfortunate reality is that the closest thing to a fixed point is (URL, timestamp).
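And that fixed point is already resolvable in practice. A sketch against the Internet Archive's public availability endpoint (standard library only; the example URL and timestamp are illustrative):

```python
# Resolve a (URL, timestamp) pair to the closest archived snapshot via the
# Wayback Machine's availability endpoint. Example inputs are illustrative.
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

def closest_snapshot(url: str, timestamp: str) -> Optional[str]:
    """timestamp is YYYYMMDD, optionally with HHMMSS appended."""
    query = urlencode({"url": url, "timestamp": timestamp})
    with urlopen(f"https://archive.org/wayback/available?{query}") as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None

print(closest_snapshot("example.com", "20150101"))
```

Note that even this only gets you the closest capture, not necessarily the exact moment you cared about.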
Domains cost money. Servers cost money. Updating software costs time and therefore money. Ergo, once the incentive to keep it alive is gone, the URLs change. Not to mention other considerations. Maybe the expectation shouldn't be that URLs won't change, but rather that they're temporary things.
When I was thinking about this problem a couple of years ago (eek, time flies!) I came to the conclusion that the only viable archival approach was to have a dedicated URL resolver service that was independent of the DNS system and could be swapped out.
Obviously you wind up with resolver-of-resolver-of-resolver services if this goes on for too long, but it is one of the few workarounds for the fact that time (and many other things) are not reified in the DNS system.
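Reduced to its essence, such a resolver is just an indirection layer that can be stacked; a hypothetical sketch (all identifiers and targets are made up):

```python
# Hypothetical "resolver service" above DNS/URLs: stable identifiers map to
# whatever the current location is, and resolvers can be chained, which is
# the resolver-of-resolvers case. All identifiers and targets are made up.
from typing import Callable, Dict, Optional

Resolver = Callable[[str], Optional[str]]

def table_resolver(table: Dict[str, str]) -> Resolver:
    return table.get

def chain(*resolvers: Resolver) -> Resolver:
    def resolve(identifier: str) -> Optional[str]:
        for r in resolvers:
            target = r(identifier)
            if target is not None:
                return target
        return None
    return resolve

# usage: a local override layered over a published (possibly stale) mapping
published = table_resolver({"doc:cool-uris": "https://www.w3.org/Provider/Style/URI"})
local = table_resolver({"doc:reddit-thread": "https://archive.example/thread-copy"})
resolve = chain(local, published)
print(resolve("doc:cool-uris"))
```

The interesting (and unsolved) part is governance: who runs the mapping, and what stops it rotting the same way URLs do.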
Yeah, if the service/website goes down, any of its URLs will go down with it. That can happen to anyone. By going down I mean policy changes, moderation, or the server going down. Pretty lame/catchy/scammy title, as the problem is not at all related to the URLs themselves but to the services.
It's not the intended use case & it's explicitly bound by restrictions in accepted DNS practice that say 7 days is the max time one should trust domain authority, but I really wish & hope somehow Signed HTTP Exchanges (SXG) can become a thing. https://web.dev/signed-exchanges/
The idea is that http content can sign itself, in a way where one can safely make a bundle of these exchanges and potentially sneakernet/thumb-drive them to a friend, who can then read the content and trust that it came from the origin it claims to come from.
Crossed with certificate transparency systems, it really creates a new possible expectation on the web: that users can take away the content they come across. This seems like a minimum bid towards FAIR. https://www.go-fair.org/fair-principles/
Same goes for the walled-off news articles posted on HN. If you can't link to it or provide an accessible copy of it, it can't be expected to be part of public discourse.
The word semantics game played in these articles is silly.
A URL cannot change; it is not a changeable object. For instance "https://example.com/index.html" is what it is: that string.
Just like the integer 42 cannot be changed to be 73.
All the problems with change have to do with how the URL resolves, like that it may point to different content at different times or become inaccessible. Changing and becoming inaccessible are not different problems: the former is impossible, the latter is the problem.