This is a long but excellent article - well worth reading.
By coincidence, I became aware of this bizarre "deletionist" culture at Wikipedia recently when I was searching for information about a particular musician. This is someone who has a handful of popular-ish songs on streaming sites, all from television soundtracks, but hasn't really charted as far as I can tell - in other words, an artist I would peg at about 50/50 odds to have had someone bother to write a Wikipedia entry for them.
Lo and behold, I found that someone had written a Wikipedia entry for them, and that it had been deleted because they weren't deemed famous enough to have an entry! I was dumbfounded... This is someone that millions of people have probably heard in the background of primetime television, and information about them was actively deemed unworthy of Wikipedia!
It was a strange, depressing, and disturbing moment of realization about how Wikipedia had evolved from the early days when I was consistently delighted to find information on all manner of obscure topics, lovingly curated by people who cared deeply enough about them to invest their time in informing the world about them. Thanks to this article I now know that it wasn't just an isolated incident.
Perhaps Wikipedia should approach this from another angle: loosen their baseline criteria, and then treat "notability" like Twitter's "verified" tag. Let the debate be whether or not an article should have the notability checkmark, not whether or not it should exist at all (within reason).
Because of the way Google prefers Wikipedia articles over other sources in its SERPs, this would ultimately have the effect of turning Wikipedia into a UGC version of AOL: it would be overrun with crud (because the Internet incentivizes Wikipedia articles), and the project would have to expend significant effort at retraining the Internet to look for "verified" stickers on the things that actually were encyclopedia articles. For what benefit?
Wikipedia's response to this problem is the right one: there's a whole big wide Internet out there, and if Wikipedia isn't the appropriate place for your article, surely there are many other places that are. UGC sites are falling over themselves to get people to write on them. Why force them on the one Internet project that doesn't want them?
Good point on SERPs, and certainly nobody should force Wikipedia to act against what they believe is their strategic best interest. But isn't "the way Google prefers Wikipedia articles over other sources in its SERPs" an epiphenomenon? In an alternate universe where Wikipedia was more inclusive and used some kind of "notability" flag, Google could just as easily prefer those "notable" articles the way it prefers Wikipedia articles more generally in this universe.
I think it is indeed the case that if Wikipedia was subjected to less entropy, so that its editors could work on building up the encyclopedia in a sort of secluded peace, the project would be far less itchy about notability.
But remember that minimizing error is only half the argument for notability. The other half is, again, definitional: an article about a non-notable topic is almost by definition original research, and "no original research" is one of Wikipedia's oldest rules. The project's charter is to be a tertiary source.
> an article about a non-notable topic is almost by definition original research, and "no original research" is one of Wikipedia's oldest rules
While I understand it as a means to keep random cranks out of the science pages, all it has ever incentivized is people 'laundering' their research via some 'reliable source'.
But which sources are 'reliable' is quite often purely a matter of editorial bias and there are other wiki projects with a very different take on the matter, for example:
Could you be more specific about this "laundering" of research through reliable sources?
I definitely saw savvy Wikipedia spammers lawyering their way into the encyclopedia, sometimes successfully, for instance by citing marginal trade press mentions as evidence of notability ("my client is notable because one time a trade press writer got a quote from them on the importance of FCIP products for disaster recovery programs").
What I don't see is a lot of bogus research hiding in the secondary sources of major articles.
Whatever you might think about WP's policies on what does or doesn't constitute a "reliable" source, I think it's difficult to argue that any community outside of Wikipedia has spent more time thinking about this problem.
It's simple: you put it on your web page with some puffed-up credentials and then have a friend link to it. This works better for things that aren't commonly challenged, as it often doesn't stand up to more than casual scrutiny; you do have to pass yourself off as somehow 'reliable'.
My experience editing Wikipedia suggests to me that this is a dubious tactic. Anything I cited on my own web pages, for topics I feel pretty comfortable asserting expertise on (like, for instance, the presence of a Lisp interpreter embedded in the Seatbelt ACL system in OS X), was immediately sniped by other editors.
Google can easily retrain itself to weight verified Wikipedia topics higher than unverified. There is a lot of benefit to having these articles maintained by a non-profit; Wikia's load time on a phone browser is stupid long, mainly because of ads and tracking scripts. It is several seconds before they are loaded, then you start getting bombarded with full-screen ads. There are dozens of Wikia apps in the Play store, so effectively that means that, unless this is a topic that I have a deep and pre-existing commitment to, I can't have the accelerated app experience. Wikia is just not a replacement for Wikipedia.
Yes, UGC sites are clamoring for content: so they can monetize the shit out of it with no respect for the content, community, creator, or reader. Your comment makes it seem like there's no merit to keeping the content on WP, but if you actually use some of these other sites I don't know how you can even compare the two.
Something sort of like it already exists; it's called a 'Featured Article' badge. But since there is an expectation that every article should eventually reach 'featured' status (or at least it should be possible in principle), it doesn't have the result you'd expect.
And why wouldn't people have this expectation? Why keep an article at all if it's just going to lie there and gather dust, unmaintained and exposed to vandalism?
Well written articles that contain no speculation shouldn't rot.
>Bill Cosby is a well respected television figure
Would be badly written content that, left unmaintained, would now be inaccurate.
>Bill Cosby is a television figure who was held in high regard during the '70s-'90s
Is better written, because it hinges on verifiable information that will not change. It would become incomplete if no one added to it, but I would argue that it is better to have incomplete information than no information.
Vandalism issues are a problem with the wikipedia software, not a problem with the concept of having lesser-trafficked pages. If this is an issue for low-traffic pages, then there should be an anti-vandalism queue where edits to articles that receive less than a certain amount of traffic should be put in a queue to be reviewed for vandalism (and just vandalism, not content accuracy etc). It might take a while for edits to bubble through, but that's basically ok if the page receives that little traffic. Eventual information is better than no information.
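A rough sketch of what that queue could look like, in Python. Everything here is hypothetical (the `EditRouter` name, the 100-views threshold, the `is_vandalism` callback); it isn't how MediaWiki works, just an illustration of routing edits to low-traffic pages through a vandalism-only check:

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Edit:
    page: str
    summary: str


@dataclass
class EditRouter:
    # Hypothetical threshold: pages below this many monthly views are held for review.
    min_monthly_views: int = 100
    review_queue: deque = field(default_factory=deque)
    published: list = field(default_factory=list)

    def submit(self, edit: Edit, monthly_views: int) -> str:
        """Publish edits to busy pages at once; hold edits to quiet pages."""
        if monthly_views < self.min_monthly_views:
            self.review_queue.append(edit)
            return "queued"
        self.published.append(edit)
        return "published"

    def review_next(self, is_vandalism: Callable[[Edit], bool]) -> Optional[str]:
        """Review the oldest queued edit for vandalism only, not accuracy."""
        if not self.review_queue:
            return None
        edit = self.review_queue.popleft()
        if is_vandalism(edit):
            return "rejected"
        self.published.append(edit)
        return "published"
```

The queue is first-in, first-out, so edits "bubble through" in the order they arrived; the reviewer only answers one question per edit.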
> Well written articles that contain no speculation shouldn't rot.
And who is going to ensure that they are well-written and contain no speculation?
> Vandalism issues are a problem with the wikipedia software, not a problem with the concept of having lesser-trafficked pages.
If you mean to say that any software explicitly written for the purpose of letting random people publish content instantaneously with no supervision is bad software, I might agree. Otherwise, no. Vandalism is a social problem resulting from lack of article oversight. And I'm not just talking about inserting 'PENIS' into pages, because that is basically solved already with edit filters. I am talking about hoaxes, I am talking about presenting speculation as fact, I am talking about fabricated citations, I am also talking about well-meaning people who inadvertently turn carefully written technical articles into Potato Jesus. A while ago, I looked at https://en.wikipedia.org/w/index.php?title=Blue_Screen_of_De... which contains this lovely sentence:
> Windows 98 and early builds of Windows Vista displayed the red screen from a boot loader error raised by ACPI.[19][20][21]
Cited to three sources. Looks great, doesn't it? But if you know anything about boot loaders, about ACPI or just about Windows, you understand this sentence is complete nonsense. You don't even need to look at the citations which obviously don't say any such thing. This is no doubt a result of shoddy copyediting by someone who didn't understand the subject. What is your answer to that?
That you should follow the links and see if the citation matches as part of routine anti-vandalism checks.
>Cited to three sources. Looks great, doesn't it? But if you know anything about boot loaders, about ACPI or just about Windows, you understand this sentence is complete nonsense.
All that says to me is that articles about computers should be reviewed by people who know about computers, which makes total common sense.
An encyclopedia is defined as much by its breadth as by its depth. Cutting notable but lesser-trafficked chunks out of it to save on effort is a lazy-arse solution that lowers the usefulness of the wiki.
I'm sure you could come up with a thousand reasons why it's too hard, but that's not how you make a good wiki. Hell, it's not how you make anything good.
Ultimately, as the consumer of your product, I don't care about your troubles. Sympathetic customer fallacy applies. If Wikipedia doesn't have the information I'm looking for, I won't use it. It really doesn't matter how lovely and maintainable the articles I'm not looking for are. If you wanna be proud of the work you do, you're gonna have to dig deep, maybe make some compromises, and figure it out.
Here's the list of FAs, to get some sense of what the bar is here. The overwhelming majority of WP articles --- including the important ones people care about --- aren't FAs; an FA is essentially a very high-quality, professional-grade encyclopedia article.
Wikipedia is currently -- for the second time -- trying to delete its article about the actress who voices/sings the title character of "Moana", on grounds that she's only notable for that one film and thus doesn't "deserve" to be written about separately from it.
What do you like so much about this article? What makes it excellent? What qualities of its analysis particularly appealed to you?
What I read was a piece that centered on a single --- and I thought dormant --- controversy about the purpose of Wikipedia: that, because the marginal cost of a new article on Wikipedia is zero, its charter should include comprehensive character-by-character breakdowns of fictional works. Not just of Pokemon, but of every fictional work in which its editors have an interest.
Reasonable people can disagree about this. But it's not the sweeping indictment of Wikipedia I'd expect from this comment thread.
The article offers two empirical studies to support its argument that deletionism is harming Wikipedia. In the first, it collects the "External Link" contributions of an anime-focused editor and records the (very small) percentage of proposed links that were ultimately accepted into the bodies of articles.
But the links he's highlighted appear to be of very low quality. I clicked through a random 10 of them, and most of them were 404'd. The ones that were alive all appeared to be UGC online reviews of individual episodes of anime films.
I'm not surprised that editors who watchlist anime titles were reluctant to stuff articles with tens of links to individual "mania.com" review articles (none of which appear to live today).
In his second experiment, Gwern selected a random set of Wikipedia articles and killed existing external links, to measure how long it would take for them to be restored. But by his own admission, the process of selecting random articles meant he was working primarily with articles nobody cares about. He suggests in the text that the median number of users watchlisting the articles he tampered with was 5 --- and links to the Wikipedia editor who told him so --- but never notes that 5 is a very low number; by comparison, the Wikipedia article on Paul Graham (once notable on HN for being put up for deletion on WP) has over 125.
But the bigger flaw with the experiment is that it misconstrues Wikipedia's position on external links. The WP:EL policy Gwern cited in his external deletions is direct about this: Wikipedia is not a collection of links. The project does not view sprawling collections of "External Links" as a good thing: what it wants are links to reliable sources that back up points made in the prose article itself.
It's no wonder that article watchers didn't rush to add links back to stories; the links that he removed were marginal, at times totally disconnected from the stories themselves (as with the "nsplacenames.ca" link on Rockingham, Nova Scotia), or dead (as with the video link, unreferenced by the article, on the Shahrnush Parsipur article). This can't be surprising to Gwern: his methodology clearly selects for marginal links, by trawling for them in the disfavored "External Links" sections of articles far out into the long tail of interest on WP.
This article was written in 2009, and makes predictions about the future of the project. Were those predictions accurate? It doesn't look like it: article creation has grown steadily, and editor participation has been remarkably stable for years. Has the Wikipedia community moved away from its "deletionist" stance? This comment thread sure doesn't think so, and I agree. So: what gives?
I bet all those links weren't 404 errors in 2009. Which raises the question: How are external links in Wikipedia articles maintained? Is there some automated process that checks those links periodically to see if they are rotten?
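For what it's worth, a periodic check like that is easy to sketch. The following is a minimal Python illustration, not a description of whatever Wikipedia actually runs; `http_status` and `find_rotten_links` are names I made up, and treating any status of 400 or above (or a failed request) as "rotten" is an assumption:

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for a HEAD request, or None if the request failed."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is still useful.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, timeout, refused connection, or malformed URL.
        return None


def find_rotten_links(
    urls: Iterable[str],
    fetch_status: Callable[[str], Optional[int]] = http_status,
) -> List[str]:
    """Return the URLs whose status is missing or is 400 and above."""
    rotten = []
    for url in urls:
        status = fetch_status(url)
        if status is None or status >= 400:
            rotten.append(url)
    return rotten
```

A job like this would run on a schedule over each article's external links. Some servers reject HEAD requests, so a real checker would probably retry with GET before declaring a link dead.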
> This is someone who has a handful of popular-ish songs on streaming sites, all from television soundtracks, but hasn't really charted as far as I can tell
That's what's called "fails [[WP:MUSIC]]" on Wikipedia. The basic criteria are two recordings on a major label, or some major award, or charting on some well known chart, or historical importance. These criteria filter out the several million garage and Myspace bands that would like to be on Wikipedia.
You state the rule, state one positive from it, and don't even bother to consider why it could possibly be a positive (or not). You also failed to consider any negatives.
Wikipedia could have a rule that if a piece of media is notable enough to have an article on wikipedia, then everyone in the credits for that piece of media is notable enough to have an article on wikipedia. 99% of those people aren't famous, but why not?
Can you reword this? I don't understand the suggestion you're making.
I think you might be making a category error. Notability on Wikipedia isn't a reward earned by acquisition of fame; it's a level of status at which sourcing about a subject is likely to be reliable. The problem Wikipedia has isn't that no-names might get articles. It's that articles will be written about subjects for whom reliable sourcing is impossible, because no reliable source has ever deigned to write about them.
This gets us to a fundamental principle of what an encyclopedia actually is --- and to a project rule that is as old as "NPOV". An encyclopedia --- at least in Wikipedia's conception --- is a tertiary source. It's a roadmap that summarizes and points to other, more in-depth sources.
By definition, if you don't have reliable sources to back the subject of an article up, it can't be hosted in an encyclopedia. The project's answer to this challenge would be, first create the reliable secondary sources you'd need to support an encyclopedia article, and then create the article itself.
Thank you for this explanation. You're right. For people who are only mentioned in credits for TV shows and not in newspaper articles, Wikipedia is not the place to list which other TV shows they were credited in.
It's true that an additional article on Wikipedia has zero marginal cost in terms of compute, bandwidth, and storage.
But every article on Wikipedia imposes maintenance costs on the project, because every article is an opportunity for error, and it's the responsibility of every member of the project to eliminate those errors.
You don't have to spend much time on Wikipedia to get a visceral sense of the validity of this argument, whether or not you agree with it. It is kind of a miracle --- not a small one --- that Wikipedia exists at all. It is one of the great achievements of the Internet writ large. And it exists in spite of:
* Enormous numbers of articles written as advertisements designed to piggyback off Google's preference for Wikipedia articles on its first SERPs
* Enormous amounts of casually malicious spam and vandalism, some of which is purposely designed to avoid detection as long as possible
* Enormous amounts of agenda-driven bias working continuously to turn Wikipedia articles into advocacy pieces for one side or another of a given controversy
The point of the notability standard isn't to reward people for fame, or to save hard disk space. It's to put some reasonable boundary on the subjects for which unpaid editors should be expected to mount the often-tedious defense against these forces.
> But every article on Wikipedia imposes maintenance costs on the project, because every article is an opportunity for error, and it's the responsibility of every member of the project to eliminate those errors
Indeed. The best way to answer that is to recruit more members, not to reduce the size of the project.
This article struck a chord with me as a lot of the issues are growing pains that we've been going through with OpenStreetMap. OSM is expressly not deletionist. We do have maintenance concerns - particularly in the case where "out-of-towners" come in and blitz a town, adding all the shops then leaving before a community has formed to maintain them. But our chief response to that is to try to grow our community.
OpenStreetMap is an impressive project, but Wikipedia is in some sense the most impressive, most ambitious project in the history of the Internet (and, because of how important the Internet is in the history of recorded human knowledge... well, &c &c &c).
The active Wikipedia editing community has been remarkably stable for years (the predictions in Gwern's 2009 post do not appear to have been borne out). It's a large and vibrant community. OSM is still a growth community (to wit: it is not the most important mapping project in the world, while Wikipedia is almost definitely the most important encyclopedia), and has a narrower charter. Community growth targets that might be a reasonable lift for OSM might not be for WP.
There are a lot of things Wikipedia could potentially do to grow the community. But curbing deletionism isn't likely to be one of the more important ones. Deletionism is primarily an Internet message board concern.
So my counterargument would be: Wikipedia should first do big things to improve participation (for instance: it can and should be made much easier, in a technical/UX sense, to write or edit an article). Then, once the community has grown, it can start turning the dials on how much vandalism and error its community can sustainably fight.
I think the page you linked to is a pretty wonderful capsule summary of how Wikipedia's community differs from other communities. I hadn't read it before and am glad I did. Thanks!
Except as you and I have argued about in the past, "notability" is redundant. Wikipedia already requires verifiable information from reliable sources. That should be the only bar.
"Notability" introduces a way to subjectively say "well, yes, this is verifiable information from reliable sources, but I don't like this subject, so it shouldn't be allowed anyway". No amount of hand-waving can change the fact that this is what the criterion is used to do, and it makes Wikipedia a laughingstock when coupled with its protestations of neutrality.
The same reason libraries deaccession books even when they have spare shelf space, and, unless you are a hoarder, you get rid of/donate/sell unused items from your dwelling.
Sifting through hoarded garbage has a huge cost - your time. You should not have to waste time even bothering to skip through garbage articles in an encyclopedia. Wikipedia editors should not have to waste their time reviewing updates to garbage articles, making sure the articles are categorized properly, etc.
I don't think you realize the amount of new stuff that pops up in the world every day. That would be a tremendous number of articles if we wrote about every single little thing.
> Wikipedia had evolved from the early days when I was consistently delighted to find information on all manner of obscure topics
IMO, encyclopaedias aren't meant for obscure topics. And any products or services with a focus on curation have to discriminate. Of course the criteria for doing so will probably always be controversial.
I'm a bit confused by this reasoning. I can see space considerations in a paper environment, but not on a wiki. Why wouldn't Wikipedia try to be the home of all knowledge?
Space has little to do with it. Wikipedia is meant to be a reliable compendium of knowledge, and it can succeed only to the extent that it contains easily verified information. Obscure topics will of course not have much in the way of verifiable information available.
If you have to do original research to justify a topic's inclusion, you're outside the charter of the encyclopedia. The only "research" Wikipedia accepts is direct citation of reliable sources.
That may be true regarding physical encyclopedias, which are, of course, limited by space and weight (ha!). With an online encyclopedia like Wikipedia, though, those limitations are gone.
I'm looking forward to seeing Infogalactic give Wikipedia a run for its money!
>If you would like to edit articles on Infogalactic: the planetary knowledge core, you may complete and submit the following form to request a user account.
>Please read the Terms of Service before requesting an account.
>Once the account is approved, you will be emailed a notification message and the account will be usable at login.
Request an account?!? What exclusionary language.
How did I find that out? Because I tried to make a tiny edit. Then found out I need an account. Then that I need to 'request' an account. Yeah, no, Infogalactic just lost me forever.
A very nice example of barriers to participation in action.
>Infogalactic has the right to block or ban any user for any reason whatsoever.
Even more totalitarian language... Infogalactic seems to be less inclusive & open than Wikipedia today.