At reddit we had two tactics for the frontpage problem.
The first thing we added was the "rising" page, which used to be reddit's default "new" page. The rising page was a weighted new page. It was a little hacky, but worked for a while. I see they've changed this, which I think may have been a mistake, but I haven't thought about it in forever.
The second thing we did, which worked really well, was to have an up-and-coming link placed at the top of the list on the frontpage. This helped those borderline posts get more visibility. I had a reddit simulator I used to use to test things like this. That space appears to be a subreddit search box for me, so it seems they've moved onto another solution.
Randomness is an interesting idea, though it might be a bit difficult to fit into the way we cached things.
I think the problem the randomized algorithm solves is not that good content gets missed. It is about giving equally good content the same chance, or at least minimizing the error introduced by the page limit.
> to have an up-and-coming link placed at the top of the list on the frontpage
Two things:
1) Either the up-and-coming links are a few borderline posts taken in order, in which case you've just extended the front page by a few more posts (i.e. it changes nothing),
2) or you've carefully chosen a heuristic for those posts, which is pretty much what is suggested here.
Anyway, thank you for sharing your experience with this problem, it's very interesting.
Another idea would be to hold the content quality fixed and let the number of posts on the front page vary.
Did you ever think of modifying "best" to apply to submissions? A measure of how quickly an article is rising would be a pretty good metric for determining its rightful prominence.
I think the submitted site (or article's) own title should be treated as canonical, and a text field provided for a byline description by a submitter (or maybe include some metadata with the domain in the header) could balance the need to accurately reflect the content and... whatever it is that determines changing the titles around here. I understand the reasoning against it though, among other things the competitive nature of submissions to HN would mean gaming the system with your own metadata would be unavoidable.
I've hypothesized a few times that: when many people submit the same url, x>1 of them are very likely to submit the actual title of the article while people who editorialize will tend to do so uniquely. Because of this HN can rely on an algo to update the title.
I realize there are moderators for HN, and in some cases they are confirmed to have changed titles, but I don't think this necessarily means they are responsible for all the title updates.
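To make the hypothesis above concrete, here is a minimal sketch of a title-consensus rule. It is not how HN is known to work; the function name `consensus_title` and the `min_count` threshold are made up for illustration, and it assumes you have every title submitted for the same URL.

```python
from collections import Counter


def _normalize(title):
    # Collapse whitespace and ignore case so trivial differences don't split the vote.
    return " ".join(title.split()).casefold()


def consensus_title(submitted_titles, min_count=2):
    """Return the title at least `min_count` submitters agree on, else None.

    Hypothetical helper: editorialized titles tend to be unique, so they
    rarely reach the threshold, while the article's real title repeats.
    """
    if not submitted_titles:
        return None
    key, count = Counter(_normalize(t) for t in submitted_titles).most_common(1)[0]
    if count < min_count:
        return None
    # Return the first submitted spelling that matches the winning key.
    for title in submitted_titles:
        if _normalize(title) == key:
            return " ".join(title.split())
    return None


titles = ["Foo Bar", "You won't believe Foo", "foo bar ", "Foo is dead"]
print(consensus_title(titles))
```

If no title is repeated, the function returns `None` and the submitter's title would simply stay as it is.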
Well there is some evidence that people vote on the title only without reading the article, sometimes even commenting just on the title. So if you were really interested in maximizing karma you would put an outrageous title on it, get some rage-comments to force it to the front page and then the mods switch it to the real title. Not saying this has ever happened of course but it did strike me that blogs use link baitey titles, why not karma-baitey titles?
I have no concrete proof that this is happening, but it sure feels like it from time to time. It's not uncommon to see articles chart to the front page with provocative headlines that, upon clicking through, bear little resemblance to the headlines or body of the linked content.
Gut-level guesstimate, but the following titling strategies seem to work disproportionately well:
1) Pointed, rhetorical questions (as much as we all claim to hate them)
2) "How I..." titles (usually some legitimate merit to these posts, but if one were so inclined to game the system...)
3) Contrarian declaratives, usually about popular topics. (Hypothetical example: "Facebook is not a social network." This will generate a lot of blind upvotes, plus at least a few knee-jerk comments in opposition).
In fairness to HN, the content actually matters here. I can't say the same for a lot of the subreddits I browse, where blind upvoting based on title alone is a lot more rampant. It's pretty hard to crack the front page with lousy but well-clickbait-titled content here, though we've all seen it done before (and it seems to happen at least once a week).
I wouldn't say it's too much harder than reddit. The key is long form content.
Legitimate commentary will take longer to flow in, and allow for knee jerk / blind influence to last longer from original submission time.
IMO the best would be a clickbait title for a long article that starts contrarian but ends on a neutral note.
Rational/logical people will take less offense b/c they're more likely to read the neutral perspectives and others will blind vote/comment based on title and first couple of sentences of the article.
Perhaps, but I don't think the link titles should be changing at all unless it's clearly necessary. They should reflect the content as the original site intended it to be presented (where possible or acceptable), if they provided any metadata to describe it. That's what metadata is for after all.
There already is a "title" bar for submitting links which is fine, but I think the temptation to tweak and try to game for karma might make that more suitable as a byline.
The 'new' page depresses the hell out of me. I read all of these articles about the best time to post to get maximum exposure, but you know it makes very little difference if you're going to end up with 1 vote anyway.
I looked at the 'new' page this morning while thinking about the "optimal" time (it was around 9am EST, which is supposedly one of the best times according to several articles). Almost every post had a single point. Since your post starts with one point, that means most posts have received absolutely no support whatsoever. I suppose this is inevitable. Only a small percentage of submitted content can get on the front, but nonetheless it's still frustrating to put effort into something and then... crickets.
I think there is a counter-intuitive explanation. I've found a very strange case of getting better results from posting at off-peak hours. My theory is that my post stays on the first page of "new" long enough for enough people to vote for it. If I post at peak hours, then my post gets pushed off the front page within an hour, meaning too few people have seen it to get it to the front page.
It seems that, if the volatility of posting is greater than the volatility of reading, then it is best to post at off-peak hours. It is only good to post at peak hours if there are more readers than posters, i.e. there is a large body of users who aren't trying to game the system.
This is a valid approach. This is actually exactly what dithering [1] is in signal processing. You spend a little additive noise, but gain a much more regular distribution. The pagination is adding a quantization error of sorts in the amount of exposure each article is getting.
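A rough sketch of the dithering analogy, under made-up assumptions: scores are plain numbers, the front page is a hard top-N cutoff, and the noise is uniform. The names `front_page` and `exposure` are hypothetical. Without noise, two equally scored posts at the cutoff get wildly different exposure; with a little noise they split it about evenly.

```python
import random


def front_page(scores, page_size, noise=0.0, rng=random):
    """Return the top `page_size` items after adding uniform noise to each score."""
    noisy = {item: s + rng.uniform(-noise, noise) for item, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:page_size]


def exposure(scores, page_size, noise, views=10_000, seed=0):
    """Fraction of page views in which each item made the front page."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    counts = dict.fromkeys(scores, 0)
    for _ in range(views):
        for item in front_page(scores, page_size, noise, rng):
            counts[item] += 1
    return {item: c / views for item, c in counts.items()}


scores = {"a": 10, "b": 5, "c": 5, "d": 1}
print(exposure(scores, page_size=2, noise=0.0))  # "b" always shown, "c" never
print(exposure(scores, page_size=2, noise=1.0))  # "b" and "c" each shown about half the time
```

The noise amplitude is the knob: too small and ties are still decided by insertion order, too large and clearly weaker posts start displacing stronger ones.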
Could this be used to get rid of "new" altogether?
Instead of blocking them from the homepage, just give them a very very small chance of appearing. Might make the "Knights of New" which are considered self-sacrificing users a thing of the past.
It's always going to be obvious which ones are new, because they've got 1 vote and are on the front page.
I read HN all the time, but confess I never visit "new".
I'd definitely be a fan of including 1 random article from "new" for each pageview of the front page -- but make it explicit. Keep it at the top, or in the middle. It seems like that should produce a much "fairer" result.
I'd be curious if there are reasons why this wouldn't work -- why HN, reddit, etc. keep "new" on its own page...
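For what it's worth, the "one explicit random slot" idea is easy to sketch. This is only an illustration with hypothetical names (`front_page_with_new_slot`, `slot`), assuming the ranked list and the "new" list are already available as plain lists of story ids.

```python
import random


def front_page_with_new_slot(ranked, new_items, page_size=30, slot=0, rng=random):
    """Build a front page with one randomly chosen "new" story at position `slot`.

    Returns (story, is_random_pick) pairs so the UI can label the pick explicitly.
    """
    page = list(ranked[:page_size])
    # Don't pick something that already earned its place on the page.
    candidates = [s for s in new_items if s not in page]
    if not candidates:
        return [(s, False) for s in page]
    pick = rng.choice(candidates)
    # Keep the page length fixed: the lowest-ranked story makes room.
    labeled = [(s, False) for s in page[:page_size - 1]]
    labeled.insert(slot, (pick, True))
    return labeled


ranked = [f"top{i}" for i in range(40)]
new_items = ["new1", "new2", "top3"]
page = front_page_with_new_slot(ranked, new_items, rng=random.Random(1))
print(page[0], len(page))
```

Because the pick is drawn per call, each pageview can show a different story from "new", which is what spreads the exposure around.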
I always visit /new and it's quite astonishing how many great posts never make it to the front page and instead, quietly drop off. This really has a lot to do with what time of day they post as well.
E.G. I've posted around 3AM - 6AM EST (which is usually when I have more free time) and it never makes it. But if I post a bit later, say around 9AM, then it has a much higher chance of coming to the front page.
There are a great many good stories that are several pages into /new and I think (with the exception of a handful that are already on the front page) those are barely seen by the vast majority of HN visitors.
> I've found a very strange case of getting better results from posting at off-peak hours. My theory is that my post stays on the first page of "new" long enough for enough people to vote for it. If I post at peak hours, then my post gets pushed off the front page within an hour, meaning too few people have seen it to get it to the front page.
The bottom line is that posting your content to a social news website is usually the wrong way to gain exposure.
With sites like Facebook and Reddit, you're simply shouting from a mountain top and hoping someone listens.
To be fair, your odds of gaining exposure for a tech-type story are greater on sites like HN and subreddits devoted to technology. In these places, you can count on listeners sharing some of your interests.
Tweaking the ranking algorithms may improve the situation, but the fact is, many of us have become complacent; we resort to throwing our content into social news forums and expecting a lot of exposure instead of doing our homework to determine who the appropriate point people are.
For example, if you're trying to pitch a new web app, maybe you should forget about Hacker News and try strategically emailing / calling a few friends / colleagues. When it comes to early exposure, quality can trump quantity.
When the difference between shouting off a mountaintop and not doing so is 30 seconds AND the upside can be so great (thousands of page views), why NOT submit your content?
> quality can trump quantity
Maybe in general, but submitting a link online is so damn easy that why wouldn't you spend 30 seconds doing it?
And lottery tickets are a dollar each, and the upside can be so great (thousands of dollars), why not buy one?
But online submissions are worse than lottery tickets. With a lottery ticket, you buy it and then check it from time to time. Submissions are seductive because there seems to be some pattern behind them and because they're so easy to test, and before you know it, those 30 seconds have turned into 30 hours of testing and reading up on the material for a paltry 600 views.
Technically, working at Costco and spending your earnings on AdWords would have brought you more exposure, with the added upside that those are more qualified leads (for business sites). Or spend it on Facebook, where that money could buy you thousands of impressions (I found some numbers putting the average CPM for a sponsored check-in story at $6 with a CTR of 3.2%, so spending your $192 from Costco will get you ~960 clicks @ 30k impressions, if you're doing a mediocre job).
So yeah, if it were 30 seconds it'd be a great deal. But unless you have the self-control of a Buddhist monk, it's not going to be 30 seconds. Never.
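A quick check of the ad arithmetic above, using only the figures quoted in the comment ($192 budget, $6 CPM, 3.2% CTR) and the usual definition of CPM as cost per thousand impressions. The function name `ad_reach` is just for illustration.

```python
def ad_reach(budget_usd, cpm_usd, ctr):
    """Return (impressions, clicks) bought by a budget at a given CPM and click-through rate."""
    impressions = budget_usd / cpm_usd * 1000  # CPM = cost per 1,000 impressions
    clicks = impressions * ctr
    return impressions, clicks


impressions, clicks = ad_reach(192, 6, 0.032)
print(round(impressions), round(clicks))  # prints "32000 1024"
```

That lands in the same ballpark as the rounded "~960 clicks @ 30k impressions" quoted above.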
People continue to complain that easy one-liner jokes consistently dominate the top ranks in Reddit threads, but Slashdot already came up with contextual upvotes (Funny, Insightful, etc) years ago.
Of course every time someone comes and tries to build a community we end up reinventing online comments from square one...
The suggestion appears to be to randomly adjust the scores of articles, but ISTM it would be more democratic to randomly adjust the scores of articles separately for each user. There's no reason we all have to see the same random posts.
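One way the per-user variant could be sketched, purely as an assumption on my part: derive the noise from a hash of the user and story ids, so each user sees a stable but different perturbation. `user_jitter`, `personal_ranking` and `scale` are hypothetical names.

```python
import hashlib


def user_jitter(user_id, story_id, scale=1.0):
    """Deterministic pseudo-random offset in [-scale, scale) for a (user, story) pair."""
    digest = hashlib.sha256(f"{user_id}:{story_id}".encode()).digest()
    unit = int.from_bytes(digest[:8], "big") / 2**64  # uniform-ish value in [0, 1)
    return (2 * unit - 1) * scale


def personal_ranking(scores, user_id, scale=1.0):
    """Rank stories by score plus that user's own jitter."""
    return sorted(
        scores,
        key=lambda story: scores[story] + user_jitter(user_id, story, scale),
        reverse=True,
    )


scores = {"a": 3.0, "b": 2.5, "c": 2.4, "d": 1.0}
print(personal_ranking(scores, "alice"))
print(personal_ranking(scores, "bob"))
```

Hashing instead of calling a random generator means the same user gets the same ordering on reload, and no per-user state has to be stored. Mixing a time bucket into the hash input would let the ordering change over time.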
There are some ideas here for getting different content to the front page.
Perhaps someone, after they've made a submission, could "pay to promote" - paying with a lockout of their account for 48 hours, or loss of downvoting, or somesuch, to give their post a smidgen of front page attention.
That means that the people just churning submissions at the rate of 6 a day have to carry on with their scattergun approach, and other people who really think they have an interesting article could give it more attention but with some pain to themselves.
I tend to use browser tabs as a 'to-do list'; and as a result don't often get back to the listing to upvote it. A different setup might get better ranking data... at least from me.
Or, even better, make a "second page" of HN that's more prominent (like a link in the nav), so that instead of a front page followed by several other pages, there is a front page AND a second page and then the rest. That way, articles that are in this limbo can get some extra attention, and I would assume that, with a "second page" differentiated from the following pages, people would be at least as likely (if not more) to check out what's not on the front page.
Great idea. I think 5 random submissions should get a front page chance. It increases variety and as the other person said, it also combats group-think. Five out of thirty, even in the worst case scenario (not even one of them is interesting to a person) is not enough to degrade HN.
Also something has to be done about the fact that many news stories that are bad PR for certain companies or good about their competitors are quickly dispatched to page 2 and 3.