
vang3lis · on July 27, 2009

Yes, the difference is negligible, but you obviously can't change the rules of the game after it has already finished.

jeresig · on July 27, 2009

And to be clear: The winner of the contest is not determined by the number that you see listed on the leaderboard. The number on the leaderboard [1] is computed on the Quiz dataset; the winner is determined by running against the Test dataset (which is kept private by Netflix). As stated by Yehuda on team BellKor: "our team is top contender for winning the Grand Prize, as we have a better Test score than The Ensemble."

This turnaround does not surprise me. BellKor's Pragmatic Chaos took their sweet time getting to 10%+ - and in doing so they were very sure not to overfit [2] the data (making their solution a much more generic and viable solution to the dataset). It's my guess that The Ensemble rushed quickly to 10%+ and overfit their data like mad (which yielded high numbers on the public dataset, but evidently does not translate to a generic solution).

I'm looking forward to seeing the final papers published by BellKor, et al. - they're going to be a fascinating read, regardless.

1: http://www.netflixprize.com/leaderboard 2: http://en.wikipedia.org/wiki/Overfitting

lliiffee · on July 27, 2009

I don't think that is correct. The numbers on the leaderboard are not the scores on the training set.

The Netflix Prize uses three datasets: the training set, the leaderboard set, and the test set. The training set is distributed to everyone. The test set is totally secret. The only access to the leaderboard set is by submitting predictions (once per 24 hours) and looking at the resulting score. It is not at all trivial to "overfit" to this leaderboard set. (It could be done by, e.g., submitting results with slight tweaks to the algorithm parameters, but this would take a lot of time since you can only submit every 24 hours. Also, you would basically have to do it consciously.)

jedc · on July 27, 2009

A lot of people have been tripped up by this, and it's a VERY important distinction. I hope Netflix posts more about the final (Test) results from each when they officially announce the winner.

vang3lis · on July 7, 2009

That and the fact that computing the closed form expression still needs O(log n) steps (for the exponentiation).
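To illustrate the point about exponentiation, here is a minimal sketch of square-and-multiply, assuming the closed form in question is Binet's formula and so needs some n-th power. The function name `power_by_squaring` and the multiplication counter are made up for this example; the counter only shows that the number of multiplications grows with log(n), not that each multiplication is cheap.

```python
def power_by_squaring(base, n):
    """Return (base**n, multiplications) using square-and-multiply.

    Hypothetical helper for illustration; n must be a non-negative integer.
    """
    result = 1
    mults = 0
    while n > 0:
        if n & 1:
            # Current bit of the exponent is set: fold this power into the result.
            result *= base
            mults += 1
        n >>= 1
        if n:
            # Square the base for the next bit of the exponent.
            base *= base
            mults += 1
    return result, mults


if __name__ == "__main__":
    value, mults = power_by_squaring(3, 13)
    print(value, mults)
```

The multiplication count is at most about twice the bit length of n, so even the "constant time" closed form hides a logarithmic number of multiplications on numbers that keep growing.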

vang3lis · on July 7, 2009

The title mimics the common misunderstanding that the naive calculation of Fibonacci numbers is done in linear time, which is not true when the cost of arithmetic operations is not negligible. The whole point of the article, as I understand it, was to show that not taking such details into account will lead to flawed analysis.
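A rough sketch of that argument, not taken from the article: the iterative algorithm does n additions, but the operands grow by a constant number of bits per step, so the total bit-level work grows roughly quadratically. The function `fib_with_cost` and its cost model (one unit per bit of the longer operand) are assumptions made for this example.

```python
def fib_with_cost(n):
    """Return (F(n), bit_ops) for the naive iterative algorithm.

    bit_ops is an assumed cost model: each addition is charged the bit
    length of its longer operand (at least 1).
    """
    a, b = 0, 1
    bit_ops = 0
    for _ in range(n):
        # Charge this addition by the size of the numbers being added.
        bit_ops += max(a.bit_length(), b.bit_length(), 1)
        a, b = b, a + b
    return a, bit_ops


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        _, cost = fib_with_cost(n)
        # Doubling n roughly quadruples the cost under this model.
        print(n, cost)
```

Under this model, doubling n about quadruples the charged cost, which is what you would expect from n additions on numbers whose length is proportional to n.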

pkrumins · on July 7, 2009

That's right!

vang3lis · on June 27, 2009

> but they're obligated to license them to Netflix

Technically they are obliged to license them only if they are about to receive the prize.

vang3lis · on May 12, 2009

vang3lis · on April 20, 2009

can you elaborate, please?

vang3lis · on April 18, 2009

I can't edit that anymore, but thank you!

vang3lis · on April 18, 2009

clever ads == good public image == exploiting bias of people == profit!

vang3lis · on April 18, 2009

a nice one, because it doesn't even suggest any particular alternatives when an update would suffice

vang3lis · on April 18, 2009

My name is vang3lis and I'm an alcoholic. So what?

zandorg · on April 18, 2009

I know Eliezer's name from Novamente, which is no small achievement.

vang3lis · on April 18, 2009

Don't get me wrong, I like Eliezer's posts on Overcoming Bias, but I disapprove of name-calling in one-sentence book reviews.

zandorg · on April 18, 2009

Ah! In that case...

"Denny Crane!"

zandorg · on April 19, 2009

I want to clarify that this is a reference to the TV show Boston Legal, where William Shatner's character, a named partner in a law firm, says his own name frequently, as if to say 'I'm Denny Crane and I'm the best'.