
Yes, the difference is negligible, but you obviously can't change the rules of the game after it has already finished.


And to be clear: The winner of the contest is not determined by the number that you see listed on the leaderboard. The number on the leaderboard [1] is on the Quiz dataset, the winner is determined by running against the Test dataset (which is kept private by Netflix). As stated by Yehuda on team BellKor: "our team is top contender for winning the Grand Prize, as we have a better Test score than The Ensemble."

This turnaround does not surprise me. BellKor's Pragmatic Chaos took their sweet time getting to 10%+ - and in doing so they were very sure not to overfit [2] the data (making their solution a much more generic and viable solution to the dataset). It's my guess that The Ensemble rushed quickly to 10%+ and overfit their data like mad (which yielded high numbers on the public dataset, but evidently does not translate to a generic solution).

I'm looking forward to seeing the final papers published by BellKor et al. - they're going to be a fascinating read, regardless.

1: http://www.netflixprize.com/leaderboard 2: http://en.wikipedia.org/wiki/Overfitting


I don't think that is correct. The numbers on the leaderboard are not the scores on the training set.

The Netflix Prize uses three datasets: the training set, the leaderboard set, and the test set. The training set is distributed to everyone. The test set is totally secret. The leaderboard set can only be accessed indirectly, by submitting results (once per 24 hours) and looking at the score that comes back. It is not at all trivial to "overfit" to this leaderboard set. (It could be done by, e.g., submitting results with slight tweaks to the algorithm parameters, but this would take a lot of time since you can only submit every 24 hours. Also, you would basically have to do it consciously.)


A lot of people have been tripped up by this, and it's a VERY important distinction. I hope Netflix posts more about the final (Test) results from each when they officially announce the winner.


That, and the fact that computing the closed-form expression still needs O(log n) steps (for the exponentiation).
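
To illustrate the log(n) point, here is a rough sketch in Python. It is an assumption on my part, not something from the article: instead of floating-point powers of the golden ratio, it raises phi to the n-th power exactly, representing a + b*phi as the integer pair (a, b) and using phi^2 = phi + 1, so that phi^n = F(n-1) + F(n)*phi. The names `mul` and `fib_by_squaring` are made up for this example.

```python
def mul(x, y):
    """Multiply (a + b*phi) by (c + d*phi), reducing with phi^2 = phi + 1."""
    a, b = x
    c, d = y
    return (a * c + b * d, a * d + b * c + b * d)


def fib_by_squaring(n):
    """Return (F(n), squarings) using exponentiation by squaring on phi."""
    result = (1, 0)  # the number 1
    base = (0, 1)    # phi itself
    squarings = 0
    while n > 0:
        if n & 1:
            result = mul(result, base)
        n >>= 1
        if n:
            # one squaring per remaining bit of n
            base = mul(base, base)
            squarings += 1
    # phi^n = F(n-1) + F(n)*phi, so F(n) is the phi coefficient
    return result[1], squarings


if __name__ == "__main__":
    for n in (10, 100, 1000):
        value, squarings = fib_by_squaring(n)
        print(n, squarings, value.bit_length())
```

The number of squarings is the bit length of n minus one, so it grows like log(n), but each of those steps multiplies ever larger integers, so the steps themselves are not constant-time.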


The title mimics the common misunderstanding that the naive calculation of Fibonacci numbers runs in linear time, which is not true when the cost of arithmetic operations is not negligible. The whole point of the article, as I understand it, was to show that not taking such details into account will lead to flawed analysis.
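
A small sketch of what I mean, in Python (my own illustration, not code from the article; `fib_iterative_cost` is a hypothetical name, and charging each addition its operand's bit length is a simplified cost model):

```python
def fib_iterative_cost(n):
    """Return (F(n), additions, bit_cost) for the naive loop.

    bit_cost charges each addition the bit length of its larger operand,
    a simplified stand-in for the real cost of big-integer addition.
    """
    a, b = 0, 1
    additions = 0
    bit_cost = 0
    for _ in range(n):
        # b >= a throughout, so b is the larger operand
        bit_cost += b.bit_length()
        a, b = b, a + b
        additions += 1
    return a, additions, bit_cost


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        _, additions, bit_cost = fib_iterative_cost(n)
        print(n, additions, bit_cost)
```

The number of additions is exactly n, but F(n) has a number of bits proportional to n, so the summed bit cost grows roughly quadratically: doubling n should roughly quadruple it.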


That's right!


> but they're obligated to license them to Netflix

Technically they are obliged to license them only if they are about to receive the prize.


404


can you elaborate, please?


I can't edit that anymore, but thank you!


clever ads == good public image == exploiting bias of people == profit!


a nice one, because it doesn't even suggest any particular alternatives when an update would suffice


My name is vang3lis and I'm an alcoholic. So what?


I know Eliezer's name from Novamente, which is no small achievement.


Don't get me wrong, I like Eliezer's posts on Overcoming Bias, but I disapprove of name-calling in one-sentence book reviews.


Ah! In that case...

"Denny Crane!"


I want to clarify that this is a reference to the TV show Boston Legal, where William Shatner's character, a named partner in a law firm, says his own name frequently, as if to say 'I'm Denny Crane and I'm the best'.

