Take Florida, for example. He predicted a 50.3% chance of an Obama win. If Romney actually wins it, he's only a little bit wrong - counting him as 0/1 on Florida in this case wouldn't be accurate. If Obama wins it, he's only a little bit right - counting him as 1/1 on Florida here wouldn't be accurate.

It's hard to quantify just how close he was. I give him props because his prediction was that Florida was very close, and that turned out to be true no matter which way the state ends up going in the final result.

Now take Virginia, which was one of the later states to be called. He had Obama at a 79.4% chance to win. But it turned out to be reasonably close -- which makes me wonder if we really had enough data to justify such a large %. It's a highly polled state, so maybe we did, and we just came close to hitting the roughly 20% of the time when all the polling is off. Or maybe he had Obama's chances too high.

Another way to judge a prediction model like this is to credit him with .794/1 for Virginia, rather than 1/1. For Florida he gets .503 if Obama wins, and .497 if Romney wins. That doesn't capture it perfectly, either, but at least it doesn't give him 0/1 in Florida if Romney wins even though he successfully predicted that the state was super close.
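
As a minimal sketch of that credit scheme (the only real numbers are the two probabilities quoted above; the outcomes are hypothetical):

    # Credit the forecaster with the probability he assigned to the actual winner.
    # Probabilities are the two quoted above; the "results" are hypothetical.
    predictions = {"Virginia": 0.794, "Florida": 0.503}   # P(Obama wins)
    results = {"Virginia": "Obama", "Florida": "Romney"}  # hypothetical outcomes

    def credit(p_obama, winner):
        return p_obama if winner == "Obama" else 1.0 - p_obama

    total = sum(credit(predictions[s], results[s]) for s in predictions)
    print(f"{total:.3f} out of {len(predictions)}")  # 0.794 + 0.497 = 1.291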




You are definitely confusing prediction confidence with winning margin.

Nate Silver's model gave Obama a 79% chance to win Virginia, and predicted that the vote share would be Obama 50.7% - 48.7% Romney. He predicted that the election would be close, but with high confidence would favor Obama.

The actual vote share was Obama 50.8% - 47.8% Romney, so if anything Silver's model predicted a narrower outcome than actually happened. Given the available evidence, the model might have underestimated Obama's probability of winning.


I am definitely not confusing the two. My entire point is that you can't look at the colors of Nate's map and the colors of the results map and say he got 50/50, because Nate's map is explicitly predicting some probability of a win in each state. They may be color coded, but his prediction in a given state wasn't "Obama wins" or "Romney wins" but was instead "there is an x% chance that Obama wins." My point relies entirely on my understanding that what was predicted in the big map is a percentage chance, not a result and not a vote share.

I may be communicating poorly but I feel I made that distinction clear in my post where I say "he predicted a 50.3% chance of an Obama win", "he had Obama at 79.4% chance to win", "makes me wonder if we really had enough data to justify such a large %". That last one is ambiguous but in the context it means "such a large % chance to win."

I don't know how to better clarify my post other than to restate that I am confident I am not confusing the two.

My point was simply that when a longshot comes in or is close to coming in, it's possible you were wrong about it being a longshot. It doesn't mean you were, but it's the first place to look for continued analysis.

I didn't know until just a few moments ago that Nate also had his vote share predictions on the blog. I've just compared all of those to the actual results and found only one state to be outside of his given margin of error.

http://pastebin.com/0RB5GRjQ


The confusion is probably due to the other commentators assuming you knew about Nate's vote share predictions when you didn't.

You don't think you are confusing "the two", since you thought that Nate only had one set of numbers in the first place.

The other commentators see you using vote share results to question winner probabilities, and assume that you're confusing Nate's winner probabilities with Nate's vote share predictions, since it would make more sense to compare vote share results with the vote share predictions.


I think you're confusing probability of winning with winning margin. His 79.4% number is his estimate of the probability that Obama would win Virginia by any margin. I don't think it's correct to say he's "a little bit right" -- either his prediction is correct, or it isn't. The only way I know to gauge the accuracy of his actual probabilities would be to run the election multiple times -- but there might be a more statistically advanced technique I'm not aware of.


I'm not confusing the two - I'm saying the fact that Florida was so close indicates Nate may have done about as well as possible with the data available in Florida, while the fact that Virginia was so close might indicate that Nate didn't do as well as possible with the data there. Neither case is at all conclusive.

It's possible the quality of the data was just different in the two states, due to random factors in sampling or random factors on election day.

And you are right, we couldn't know without running it multiple times with everything the same except these random factors.


Whether the race was close has nothing to do with how likely one side was to win. If there are 100 voters and I know with absolute certainty that 51 of them are die-hard republicans and 49 of them are die-hard democrats, I'm probably justified in saying that the republican candidate has some large probability of winning -- 90% or something, depending on the actuarial probability that some of the republicans die, etc. But that's a 90% probability of winning by a 51/49 margin. Alternatively, if there are 10 republicans, 10 democrats, and 80 independents, I may have no idea how the independents will vote, so AFAIK it's equally likely that either candidate wins. The margin of victory has nothing to do with the probability of the outcome.
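
A quick simulation of those two toy electorates (ignoring the actuarial wrinkle) makes the distinction concrete:

    import random

    # Monte Carlo over the two hypothetical electorates described above.
    # Die-hards always vote their party; independents flip a fair coin.
    def republican_win_probability(die_hard_r, die_hard_d, independents, trials=100_000):
        wins = 0
        for _ in range(trials):
            ind_r = sum(random.random() < 0.5 for _ in range(independents))
            wins += (die_hard_r + ind_r) > (die_hard_d + independents - ind_r)
        return wins / trials

    print(republican_win_probability(51, 49, 0))   # 1.0 here: a near-certain win, but only by 51-49
    print(republican_win_probability(10, 10, 80))  # ~0.46: essentially a toss-up (ties count as losses)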


You could reasonably conclude someone has 80% chance to win by using a large amount of slightly lopsided data, or you could reasonably conclude the same thing with a smaller amount of highly lopsided data.

As such there is a relationship involving the probability of the outcome, the quantity of the polling data, the quality of the polling data, and the margins of the polling data.
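
As a back-of-the-envelope sketch of that relationship (this is not Nate's actual model, and the numbers are made up): a modest polling lead backed by lots of consistent data, and a big lead backed by sparse or noisy data, can come out to roughly the same win probability.

    from math import erf, sqrt

    # Treat the polling average as normally distributed around the true margin.
    # Win probability = P(true lead > 0) given an estimated lead and its uncertainty.
    def win_probability(lead_pts, sigma_pts):
        return 0.5 * (1 + erf(lead_pts / (sigma_pts * sqrt(2))))

    print(win_probability(2.0, 2.4))   # small lead, lots of good data -> ~0.80
    print(win_probability(6.0, 7.2))   # big lead, sparse/noisy data   -> ~0.80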

At the time I wasn't aware that Nate had published his predicted margins. So, seeing he had a high probability for Obama in Virginia, which turned out to be close, I concluded that one of three things was true:

1. Virginia was very heavily polled and there was just a ton of good and consistent data, justifying a prediction of 80%. OR

2. Virginia had a normal amount of good and consistent data, but it was all very lopsided. OR

3. There was an error with Nate's model or with the data.

Had I known that his predicted margins were easily accessible, I could have seen that it was case 1. But without knowing that, I was ruling out case 2 because the state turned out to be very close.

My point was that it's not easy to just look at Nate's map and score it against the result map. That's a very shallow and fairly uninteresting way to look at it.

My secondary point was that even if you understand that he had predicted some percentage chance to win in each state, it's not even easy to look at that percentage chance and score it against the result map, because, given a prediction like "80% chance Obama wins", there are many different ways he could have arrived at his conclusion, and it would be more accurate to inspect the method by which he arrived at the conclusion and compare it against the actual margins in the state.

I've put up a text file comparing the two here: http://pastebin.com/0RB5GRjQ and it turns out the results were within Nate's given interval in 49 of 50 states. It's not the shallow and misleading "50/50" you get by just comparing the color of Nate's map with the results map -- but I think it's much more interesting, more accurate, and more impressive.
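
A stripped-down version of that check, with figures invented for illustration (the pastebin has the real numbers for all 50 states), might look like:

    # For each state: (predicted Obama share, margin of error, actual Obama share).
    # These figures are placeholders, not the ones in the pastebin.
    data = {
        "State A": (50.7, 2.0, 50.8),
        "State B": (49.5, 2.5, 52.5),
    }

    for state, (pred, moe, actual) in data.items():
        inside = abs(actual - pred) <= moe
        print(f"{state}: predicted {pred}+/-{moe}, actual {actual}, within interval: {inside}")

    print(sum(abs(a - p) <= m for p, m, a in data.values()), "of", len(data), "within the interval")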



It's not clear whether you mean that link as a rejoinder or as a supporting argument for the parent. My take is that it supports the parent.


> It's hard to quantify just how close he was.

It is hard to do with the final "chance of winning" numbers, but it is much easier with his share-of-the-vote predictions, and you can check those predictions that way for all states, even the ones where the winner was certain.

Say Nate predicts that candidate A will get 80% of the votes in a particular state, with a margin of error of 2%. Then, after the results are out, if that candidate got 81% of the votes, then it was a good prediction, if they got 90% of the votes, it was a bad prediction.

To quantify that, you want to compare the actual error with the predicted error. There are better ways than this, but I'll make one up on the spot: score = 1 / (actual error in percentage points * predicted error in percentage points). In the first case, the score would be 1 / (1 * 2) = 0.5. In the second, 1 / (10 * 2) = 0.05. The bigger the score, the better the prediction. (This isn't a great model, since a prediction with a large margin of error that happens to be bang on will score higher than a roughly right prediction with a narrow predicted error, but this is the general idea.)
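
In code, that made-up score is roughly (the shares and errors below are just the numbers from the example):

    # score = 1 / (actual error in points * predicted error in points)
    def score(predicted_share, predicted_error, actual_share):
        actual_error = abs(actual_share - predicted_share)
        actual_error = max(actual_error, 1e-9)  # avoid dividing by zero on a perfect call
        return 1.0 / (actual_error * predicted_error)

    print(score(80.0, 2.0, 81.0))   # 1 / (1 * 2)  = 0.5   (good prediction)
    print(score(80.0, 2.0, 90.0))   # 1 / (10 * 2) = 0.05  (bad prediction)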


Excellent point. In that case, only one state, West Virginia, fell outside his given margin of error.

I didn't do any error analysis beyond asking whether each result was within his margin of error, and all but one were. Hawaii was also exactly on the line.

http://pastebin.com/0RB5GRjQ



