I like Reddit. I recently obtained a data dump of every single submission and comment so I could perform interesting data analysis, and I may just determine what makes a post on Reddit go viral.
The problem I have with Reddit is that I'm still unsure if it's a positive externality. There are a lot of good aspects of Reddit (discovery, community), but there's so much bad about Reddit that it's impossible to overlook it (abusive subreddits, abusive users, no administrator transparency, etc.)
There's free speech, and then there's the ethics of promoting and profiting off of abusive/illegal content.
My dream startup would be a Reddit-esque link aggregator, which favors the actual quality of submissions, instead of submissions which are lowest-common-denominator which are optimized for the hive mind.
It's not like all forum-software-innovation stopped in June 2005 when the 2 of us launched reddit to the world.
The hard part is going to be quantifying "quality of submissions" in a scalable way. We thought a lot about this and while it's not perfect, the vast majority of content on reddit across those half million communities is indeed good.
It's a fascinating problem that I hope someone can solve -- improve on Steve's hotness algorithm!
Right now, the primary thing that causes something to succeed on reddit is the rate of upvotes. Anything that takes time to upvote will be less likely to succeed, because it will receive its initial upvotes at a lower rate. (It takes at least an hour to read and upvote a great New Yorker article vs. something that will be voted on based on the title alone or a 1-second click.)
To fix this, you need to track the click -> upvote interval and correct for it.
This is the main reason why subreddit quality goes down with size. Only the extreme head of the "upvote rate" distribution has an opportunity to succeed when the subreddit is large, so the "upvote rate" drowns out the "upvote ratio" as a factor.
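To make the click -> upvote correction concrete, here is a rough sketch of one way it might work. This is not how reddit actually ranks anything. The `Vote` record, the field names and the idea of crediting each upvote to the moment the voter clicked are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    # Both times are seconds since the post was submitted (hypothetical fields).
    clicked_at: float   # when the voter opened the link
    upvoted_at: float   # when the voter came back and upvoted


def raw_rate(votes, window):
    """Upvotes per second, counting a vote when the upvote itself arrived."""
    return sum(1 for v in votes if v.upvoted_at <= window) / window


def corrected_rate(votes, window):
    """Upvotes per second, crediting each vote to the time of the click.

    This removes the click -> upvote interval, so a long read is not
    penalized for the time people spend reading it.
    """
    return sum(1 for v in votes if v.clicked_at <= window) / window


# Made-up example: three readers per post, clicking at the same times.
long_read = [Vote(60, 60 + 3600), Vote(120, 120 + 3600), Vote(180, 180 + 3600)]
quick_image = [Vote(60, 61), Vote(120, 121), Vote(180, 181)]

WINDOW = 600  # first ten minutes after submission
print(raw_rate(long_read, WINDOW), raw_rate(quick_image, WINDOW))
print(corrected_rate(long_read, WINDOW), corrected_rate(quick_image, WINDOW))
```

The catch is that the corrected number for a long read is only known after the delayed upvotes come in, so a real system would probably have to estimate it from clicks and a typical upvote ratio.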
Isn't this an integral problem to the reddit model though, that you can point to half a million communities and say "Look, so many people doing so much good" while what many will point to is "Look, you've made thousands of dollars from celebrity leaks" and "You've got huge communities of people sharing images of underage girls"?
I guess my commentary would be that there are a lot of places for people to be pleasant to each other and to discuss their shared interests, be it enthusiast forums or Facebook. The risk is being the place people go to be abusive and share their degradation of other people, and it's difficult to just take the rough with the smooth in that respect, when other communities are held to account for that sort of behaviour.
I would agree with you that the vast majority of the content is indeed good. Unfortunately, the bad is often concentrated into a few subreddits, and at reddit scale that is still a lot of bad.
I think it's an interesting issue because the primary issue is what interests people, not the website itself. If a majority of people want to concentrate on the bad, then the bad shows up more. If the mods or admins make the site such that it's impossible to concentrate on the bad, then that would involve some kind of censorship that could be very biased towards someone's definition of good.
This is a great point. There is a demand for violence. It's counterintuitive and non-PC, but people pay good money to see it. It has nothing to do with reddit; IMHO it's more of a media phenomenon. Look at the Middle East.
I didn't know reddit data dumps were available, other than crawling with the API. I have plenty of hardware and would love to play around with the data. Could you make it available on an Amazon S3 bucket or something?
I couldn't agree with you more. The bad parts of reddit are what push me away. Of course there are some smaller subreddits that actually value quality content, but if they somehow grow to 'default' subreddit status then things seem to go downhill quickly.
Hopefully you publish some of your findings down the road. I'm interested to hear more.
Reddit hit critical mass when Digg went under, and I think that was a short window when Reddit was hugely popular but without its toxic culture. It was kind of a "Reverse September" moment, I think. I mean obviously jailbait and other embarrassments existed, but they didn't have a heavy impact on reddit's culture like the current infighting about misogyny and the like does.
Yes, Reddit is certainly unique! I'm often aggravated at the terrible things I see on there, the level of discourse, stupid snarky crap, etc. Hive-mind indeed, but that's up to each of us to deal with in ourselves.
But I still go on every day, because there probably isn't a single time I go on there that I don't learn something new and interesting. Not another site I can say that about, even HN.
One way to promote quality is to crack down on low-quality content (which is something that HN does and you've done a good job at doing. :P).
For example, many popular subreddits support the making of image macros/memes. Many of the serious subreddits have a) banned the use of image macros/memes and/or b) forced the sub to use text posts only, which forces the submitter to add discussion, and also reduces the incentive for karma-gaining since text posts generate no karma.
Another approach is to promote original content [OC], which gives an incentive to submitters to submit unique content instead of just being the first to post a submission to a new article on TechCrunch for internet points. The subreddit /r/dataisbeautiful has done this very well. (I really wish Reddit would remove its 10:1 rule, which was made to punish self-promoters and is a different issue entirely)
And of course, the standard machine learning techniques can be used to predict the probability of a post being good given, for example, a) keywords in title b) quality of domain's previous submissions c) quality of user's previous submissions, etc.
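As a rough sketch of what that could look like, here is a tiny logistic regression in plain Python. Everything in it is an assumption for illustration: the three features (a title keyword flag, the share of the domain's past submissions that were good, the same share for the user), the training rows, and the "good" labels are all made up, not taken from any reddit data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated probability that a post is 'good' given its features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(weights, bias, x) - y  # gradient of the log loss
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Made-up toy rows: [title keyword flag, domain good share, user good share]
ROWS = [
    [1.0, 0.9, 0.8],
    [1.0, 0.7, 0.9],
    [0.0, 0.8, 0.7],
    [0.0, 0.2, 0.3],
    [1.0, 0.1, 0.2],
    [0.0, 0.3, 0.1],
]
LABELS = [1, 1, 1, 0, 0, 0]  # 1 = judged good, 0 = judged bad (hypothetical)

WEIGHTS, BIAS = train(ROWS, LABELS)
print(predict(WEIGHTS, BIAS, [0.0, 0.9, 0.9]))  # strong domain and user history
print(predict(WEIGHTS, BIAS, [1.0, 0.1, 0.1]))  # weak domain and user history
```

The hard part is not the model but the labels: someone still has to decide what "good" means before any of this can be trained.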
However, the lack of transparency can lead to baseless speculation, which can be just as toxic, if not worse.
Case in point, during The Fappening, a large number of threads were filled with conspiracy theories about why the administrators hadn't banned it yet before Wong's blog post, which was made only a week after the first incident.
The problem with reddit right now is that community managers are rather ineffective at actually handling the community, which leads the other employees of reddit (engineers) to finally step in. The only time reddit usually does act is when there are serious legal implications. This is what often leads to the reddit administrators' actions being labeled as arbitrary.
Although, as of recently the reddit administrators started banning some of the more racist subreddits even though they didn't have any official change in policy.
The admins didn't want to ban /r/TheFappening but decided to after they got a flood of DMCA takedowns from the celebs' lawyers. The sub actually broke the site because of how much traffic it was generating, and that was almost 5 days before it got banned. They knew about the sub for several days and let it go because they have a hands-off policy and don't generally step in unless something drastic happens.
> Although, as of recently the reddit administrators started banning some of the more racist subreddits
Not completely sure, but IIRC those weren't banned because of racism but brigading/doxing. Which of course is bound to happen with subreddits catering to extremists.
> My dream startup would be a Reddit-esque link aggregator, which favors the actual quality of submissions, instead of submissions which are lowest-common-denominator which are optimized for the hive mind.
On that note, there's definitely a potential market opening, one that sites like Hacker News cater to.