
> the election were to happen today, what is the probability of each candidate winning?

He's explicitly not doing this.

Here's how I think about it. Silver is answering the question: "how much would the polls (as an aggregate) need to differ from the final result in order for Candidate X to win/lose, conditioned on some reasonable priors?"

Taleb is pointing out that the polls could be really wrong in all sorts of ways that are impossible to predict a priori.

The whole argument is sort of pointless from an intellectual/academic perspective. It's a war of public personalities more than anything else.

It's both the case that Silver designed a good piece of software that does what it's supposed to do and also the case that Taleb's skepticism is valid. But then, that sort of skepticism of statistical models is always valid, and yet we use these models to great effect in all sorts of settings.



If I am expected to make a better prediction of the election outcome by averaging your last 10 predictions than by taking your current prediction, then your current prediction is suboptimal.

This is Taleb's point, and it is solid as a rock.


But he provides no evidence that this is the case. The 2016 election was extremely volatile due to what actually happened during it, not the modelling. There were several bombshell press events all throughout the campaign that dramatically shifted polls.


That’s exactly Taleb’s point, both in this instance and in every book Taleb has written. Those bombshells should have been included in the model. Fat tail risks. If the modeling doesn’t include the potential for dramatic unknowns that can DO happen, then the model is no good. In the Clinton/Trump election, with Trump a showman and a name-caller focused on ratings and Clinton a career politician with accusations of dirt, the model should expect “big things can still happen today, tomorrow, next week,” and the absence of big events yesterday or last election holds little value relative to the probability of a big event happening in the next 24 hours.

Edit- I’m getting downvoted, I’m guessing, because I said Clinton has lots of dirt and called Trump obsessive. I’ve revised the comment to be less politically accusative. I’m not concerned with the politics, just interested in the obsession with these “predictions”


You’re not getting downvoted for saying mean things about the Clintons; don’t try and be a victim here. People disagree with your take on Taleb vs. Silver.


How are you sure?


Yeah, that was my reaction: maybe, maybe not. I just decided to move along. I know it’s uncouth and against the rules to comment on downvotes on HN but, for the first time for me, I seem to have been downvoted several times here on HN in the last couple weeks. I can’t shake the feeling that I’m receiving downvotes because people want to suppress different views on certain issues. My comments here are substantive and thoughtful: I read this article, have discussed Nate’s projections at length with friends, and have read all Taleb’s books except his latest. My initial comment may be wrong / worthy of rebuttal, but it’s not downvote wrong. (DanG- I won’t comment on downvotes again! Sorry!)


> Fat tail risks

The 538 model does have fat tails, and also, adding even fatter tails doesn't address Taleb's fundamental criticism.

> Edit- I’m getting downvoted, I’m guessing, because I said Clinton has lots of dirt and called Trump obsessive. I’ve revised the comment to be less politically accusative. I’m not concerned with the politics, just interested in the obsession with these “predictions”

The 538 model does have fat tails.


> If the modeling doesn’t include the potential for dramatic unknowns that can DO happen, then the model is no good.

Why do you think it didn't? It took a major news event breaking at exactly the right time (too early and people would have realized it was meaningless; too late and it wouldn't have had time to get out). And even with that, Trump barely eked out a win. That seems unlikely-but-not-impossible, which seems to match 538's estimates.


I disagree that it’s unlikely. Highly paid, brilliant people are pitted against each other with massive stakes. It’s very likely that something bizarre happens. And there are still thousands of other things that COULD happen that didn’t, like a candidate getting removed by assassination, car wreck, illness, fatigue, enemy attack on the State, pandemic, on and on, that will have a massive impact on the state of things. The model was misleading: it didn’t take into account the Trump team’s plan and the voters’ actions.

I really don’t know how to value/process information like “Clinton 90% most likely to win,” and “Clinton loses in hotly contested election.” How do you get from A to B? The outcome of that election is heavily into “butterfly effect” territory. Taleb says the model should never have been so confident. I would agree. Was it bad input info into the model or just a bad model because it relies on inputs subject to bias? I don’t know but the output information is certainly less valuable than the attention it’s receiving. Seems largely academic, and worthwhile, but not worthy of broad attention outside of the quant circles. (Nate Silver 538 is a popular topic around me, deeply non-quant territory, self included)


> It’s very likely that something bizarre happens. And there are still thousands of other things that COULD happen that didn’t, like a candidate getting removed by assassination, car wreck, illness, fatigue, enemy attack on the State, pandemic, on and on, that will have a massive impact on the state of things.

Yes, there are lots of things that could happen, but they're all pretty unlikely. A huge impact times a low probability doesn't affect the outcome much.

> I really don’t know how to value/process information like “Clinton 90% most likely to win,” and “Clinton loses in hotly contested election.” How do you get from A to B?

Well, it helps to not start from "Clinton 90% most likely to win"—if I remember correctly (which I may not), the final odds for Clinton were in the 65%-70% range.

> Taleb says the model should never have been so confident. I would agree.

538 was not especially confident.

> the output information is certainly less valuable than the attention it’s receiving. Seems largely academic, and worthwhile, but not worthy of broad attention outside of the quant circles.

Maybe, but it's popular because it's something that people want to know.


I was reading from the article: “Take a look at FiveThirtyEight’s forecast from the 2016 presidential election, where the probability of Clinton winning peaked at 90%“


> I disagree that it’s unlikely. Highly paid, brilliant people are pitted against each other with massive stakes.

By that logic, polls would be off every year in most races. But they're not. There are lots and lots of years where polls are highly predictive, including 2008, 2010, 2012, and 2018.

> And there are still thousands of other things that COULD happen that didn’t, like a candidate getting removed by assassination, car wreck, illness, fatigue, enemy attack on the State, pandemic, on and on, that will have a massive impact on the state of things.

It's unclear to me in which direction any of those things would push voters, to be honest.

If a guy in a MAGA hat assassinated Biden while he sat in Church, maybe one thing would happen. If a black bloc assassinated Trump while he walked down a suburban street then something else might happen. It's unclear to me that either of those scenarios is particularly likely, and it's also unclear to me which of those two scenarios is more likely than the other. Even 0.01% seems high for either? And they seem equally likely? So I guess add fat tails to both sides of the distribution. Which is what 538 does.

At work, in an area much more boring and less high-stakes than election modeling, we do our best to actively track these sorts of "out-of-distribution scenarios" and have a "Conservative Human Oversight Mode" the model gets pushed into whenever something crazy is happening. That mode does get activated! For us, getting rid of the model because it fails spectacularly every year or two would be economically idiotic. IDK what Taleb would do in our case, but I do know his hedge fund failed. WRT election models, I expect 538 would probably put a big warning banner on their forecast -- or even take it down -- if one of the candidates were assassinated. Which is sort of equivalent to "monitor for out-of-distribution and switch to human mode".

> I really don’t know how to value/process information like “Clinton 90% most likely to win,” and “Clinton loses in hotly contested election.” How do you get from A to B?

Silver's model gave Clinton a 70% chance, not a 90% chance.

On the night before the 2016 election, Silver predicted how you might get from Clinton's 70% lead to a Trump win [1]: larger than average polling error and undecideds breaking heavily for Trump. Which is exactly what happened.

(Read the headline of [1] again.)

Again, divorce yourself from the emotion of politics and personalities, and just treat it as another statistical forecast. It is what it is: not omniscient or genius, but a decent piece of software that does what it's supposed to do.

> I don’t know but the output information is certainly less valuable than the attention it’s receiving.

I tend to agree. I also think Taleb's rank skepticism of these models gets more attention than it deserves. Like I said in my original post, this whole contest is an intellectually boring fight between equally big personalities. It's entertainment for politics junkies.

> but not worthy of broad attention outside of the quant circles.

The one thing I appreciate about 538 is that they do pour a ton of resources into explaining -- in lay terms -- how their model works and what their model does and doesn't account for. I'm not aware of any other mass-consumed statistical model whose authors have put so much effort into explaining it for those willing to listen. Maybe weather and climate models. I appreciate this because it provides me a touch-point when explaining work stuff to non-technical stakeholders who happen to listen to the 538 podcast.

Anyways, junkies will be junkies and Taleb/Silver are their dealers. Point twitter and news sites at 127.0.0.1 in your /etc/hosts and go buy a nice bottle of scotch. It's going to be a long week.

--

[1] https://fivethirtyeight.com/features/final-election-update-t...


> If I am expected to make a better prediction of the election outcome by averaging your last 10 predictions than by taking your current prediction, then your current prediction is suboptimal.

That's not how poll averaging works. Or, at least, that's not how poll averaging works in the 538 model.


That's not what the OP was saying. The argument is if somebody can take Silver's model's predictions over time and produce a better next estimate than it, then Silver's model is incoherent about its own beliefs.


I came here to say exactly the same thing. Silver and 538 are explicit about not doing this. Their models do include the uncertainty of future events, and do adjust closer to 50/50 given the same polls the further from the election it is. That's why Biden's odds have been ticking upwards slightly over the past couple weeks despite polls remaining pretty stable.

So in theory, the forecast should be exactly what it appears: the percentage of the time the candidate would be expected to win given the situation they are in at the time of the forecast. So if on Oct. 14 it gives Biden a 72% chance, it means that a candidate in Biden's position, with his polling numbers, and the economic conditions and everything else that is the case on Oct. 14, would go on to win from there 72% of the time.

I agree that the volatility of the 2016 forecast appears larger than you would expect from a correctly calibrated model. I expect that uncertainty is better expressed in the 2020 model. However, it's certainly conceivable for such a model to swing correctly. I.e., there's nothing impossible about a candidate having a 90% chance of winning on date X, then having circumstances change such that they only have a 50% chance on date Y. (In fact, if you ignore the discontinuity on election day, at least 10% of the time their odds would have to pass through 50% at some point, as they would go on to lose. And of course there would also be scenarios where the odds dip to that level or below and the candidate still wins.)


I think Taleb’s point is a bit more damning — implying the software does not do what it says it would.

If all sorts of things can happen, where your predictions change widely, Taleb’s argument is that you can’t use a number like 90%. You have to include the error, which ends up saying “50%” for both before the election.

This doesn’t mean that all prediction suffers from this effect and that you have to be skeptical everywhere; I don’t think that’s a fair characterization. Taleb has made his money on mispriced options. His main statement is that the true probability of this particular event is much closer to 50%.


> If all sorts of things can happen, where your predictions change widely, Taleb’s argument is that you can’t use a number like 90%. You have to include the error, which ends up saying “50%” for both before the election.

Taleb is, simply, wrong.

The difference from 100%/0% is the uncertainty. If new information confirms rather than contradicts prior information, that uncertainty goes down; if it contradicts it, the uncertainty goes up. Most of the time you'd expect the general trend over time to be declining uncertainty, but when there is a period of continuing new information at odds with the prior information, you get a period of increasing uncertainty.

So if you had a very large number of Presidential election cycles with the same model, you'd expect most of them to generally trend toward greater certainty over time, but you'd expect a few of them to have extended periods of declining certainty.
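
A toy version of that dynamic, in Python (my own sketch, made-up numbers, nothing to do with 538's actual machinery): a conjugate normal update of the true vote margin as hypothetical polls arrive. Confirming polls push the win probability toward an extreme; a contradicting poll pulls it back toward 50%.

    # Toy sketch, assumed numbers: normal-normal Bayesian update of the true margin.
    from scipy.stats import norm

    prior_mean, prior_var = 2.0, 9.0   # assumed prior on the margin, in points
    poll_var = 9.0                     # assumed variance of a single poll

    def update(mean, var, poll):
        # Conjugate normal update with one new poll reading.
        post_var = 1.0 / (1.0 / var + 1.0 / poll_var)
        post_mean = post_var * (mean / var + poll / poll_var)
        return post_mean, post_var

    mean, var = prior_mean, prior_var
    for poll in (3.0, 4.0, 3.5, -1.0, 4.5):   # hypothetical poll margins
        mean, var = update(mean, var, poll)
        print(f"poll {poll:+.1f} -> P(win) = {norm.cdf(mean / var ** 0.5):.2f}")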

Does Silver's Presidential model work that way? It's hard to tell, partly because there aren't a lot of cycles to look at and partly because they aren't the exact same model from cycle to cycle, so it's not quite the data you'd want for assessing that. Still, 2012 and 2020 have had very much the general shape of growing certainty you'd expect, while 2016 didn't.


>Taleb is, simply, wrong.

Just not in any mathematics...


That's what confidence intervals are for. You cannot condense both your expectation value and uncertainty in a single percentage (it is not correct to say that the probability of an outcome is 50% if 80% of likely outcomes result in one candidate winning). Otherwise you'll end up with every 5-week weather forecast saying there's a 50% chance of rain.


> You cannot condense both your expectation value and uncertainty in a single percentage

Yes, you can.

The predicted outcome is a particular electoral vote total. The uncertainty around that predicted value is what gets you from that to a probability that the outcome will be at least 270. That's a single percentage that results from the combination of the predicted result and the uncertainty around it.
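
Here's a back-of-the-envelope illustration of that (all numbers assumed, not the actual model): a predicted electoral-vote total plus its uncertainty collapses into one headline percentage.

    from scipy.stats import norm

    predicted_ev = 335.0   # hypothetical point prediction for the candidate
    uncertainty = 60.0     # hypothetical standard deviation around it

    # The single headline percentage combines both numbers.
    p_win = 1.0 - norm.cdf(270, loc=predicted_ev, scale=uncertainty)
    print(f"P(at least 270 EV) = {p_win:.0%}")   # ~86% with these made-up inputs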


Indeed. Taleb’s point, if I understand it correctly, is that the uncertainty here is so high that the number realistically comes much closer to 50%.


We're not in agreement here -- the solution to uncertainty in statistics is to model the uncertainty and express it as confidence intervals, not throw your hands up and give everything a coin-flip probability. The probability isn't "realistically much closer to 50%" -- that's not how uncertainties in statistics work. It is possible (if the uncertainty in the model were smaller) that the "real" probability is 50%, but there's no way of knowing if that is the case.


That is precisely how probabilities work. If you have a simulation that gives an 80% chance of one outcome, but you only have 75% confidence in the assumptions that the simulation is based on, your actual prediction should be closer to 50%. This is not "throwing your hands up"; it is simply correctly assessing the possibility of model error.

There is no such thing as giving a probability and also giving a confidence interval, for an event with a binary outcome. Everything that your prediction can say about the outcome can be said with a single number between 0 and 1. To give an answer like "80% plus or minus 15%" just means you haven't finished calculating.
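
To make the arithmetic concrete (made-up numbers, just illustrating the mixing step):

    # Toy calculation: mix the simulation's answer with an uninformative 50%
    # according to how much you trust the simulation's assumptions.
    p_sim = 0.80        # probability from the simulation
    trust = 0.75        # confidence that the simulation's assumptions hold
    p_fallback = 0.50   # what you'd say if the assumptions are worthless

    p_final = trust * p_sim + (1 - trust) * p_fallback
    print(round(p_final, 3))   # 0.725 -- a single number, no separate error bar needed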


Right, but I think Silver is saying that averaging high quality polls produces a high level of certainty.

The only reason Trump has a 10% chance and not a 0% chance is that there's roughly a 10% chance the polls are off by far more than they were in 2016: Biden has consistently polled ahead of Trump in states totalling 270+ EV, he's polling at historically high levels nationally for a challenger, and he's facing an incumbent with historically low approval ratings.

I think Taleb is saying "there's so much uncertainty it should be a 50/50 race" and Silver is saying "there's uncertainty, but it's equally plausible in either direction, so let's go off of polling data and historical examples which looks like a 90/10 race".

In the end they both should just bet on their own predictions and let the winner emerge over time.


It’s impossible to be so certain of polls because certain demographics may change how forthcoming they are with pollsters over time.

Systemic bias is totally possible and can affect just one side, in a huge variety of ways, changing over time.

This is just one example of a fat tail event you can’t predict but which you can expect, and given Nate has only called a couple elections, there’s no way he can have 90% confidence a model of the polls will be accurate.


Models are backtested against the historical record where hundreds or thousands of elections (federal, state and local) can be used as training data.

For example, a weighted average of polls has Biden up ~5% in PA. In state-wide elections, I wouldn't be surprised if candidates with a lead of that size in PA (and in states with similar size and socioeconomic demographics) end up winning about 85% of the time, which is what the 538 model has Biden's chances at in PA. That 15% uncertainty accounts for the bias you describe. If polling were perfect and there were no bias, a ~5% polling lead (one exceeding the margin of error) would be a 100% guaranteed victory.

It would be great to see Taleb's forecast of federal and state elections so he has "skin in the game," as he likes to say. Assume you get points based on your % confidence, so if you predict there's a 55% chance of a particular outcome and you're right you get 55pts, and if you're wrong you lose 55pts. If Taleb pegs every race at 50/50 or 55/45 because you "can't predict elections", I doubt his score will be higher than Silver's. You'll end up with silly predictions like a 40% chance of Trump winning Washington DC or a 30% chance of Biden winning Wyoming, when in reality each has less than a 1% chance of doing so.
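
Rough sketch of that scoring scheme in Python (hypothetical numbers, just to show why hedging everything to 55/45 scores poorly when the favorites keep winning):

    # Stake points equal to your stated confidence in the outcome you picked.
    def score(confidence_pct, correct):
        return confidence_pct if correct else -confidence_pct

    # Hypothetical: ten races where the favorite wins nine of them.
    outcomes = [True] * 9 + [False]
    print(sum(score(55, o) for o in outcomes))   # hedged forecaster:    440
    print(sum(score(90, o) for o in outcomes))   # confident forecaster: 720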

When you make reasonable, weighted predictions based on state polling, you see that Biden has leads outside the margin of error in states totaling 270 EV. So, the 10% chance Silver gives Trump is saying, hey there's a chance that the polls could be extremely biased in all 7 swing states in Trump's favor. But, if Trump loses even 1 of those swing states, there are so many safe D electoral votes that Biden will win.


You totally ignored the point. Overfitting to previous results doesn’t account for the fact that the facts on the ground are changing. I mean, this year alone we have a once-in-a-century pandemic; can you fit that to any other election?


That's why you test predictions. Silver says 10% is enough uncertainty, Taleb says it's not. From the article: "Premise 1: If you give a probability, you must be willing to wager on it". Unless they're willing to publish and make a friendly bet on their predictions, it's just bloviating.

Every day/year/election is unique, but only to a certain magnitude. Some elections have had active wars, some health crises, some criminal scandals, some terrorist attacks, etc. In the end, the polls attempt to account for those uncertainties and historically that's been a much better predictor than a coin flip.

For example, Washington DC has never voted Republican since it gained electoral votes in 1961. Polls have Biden winning 80-90% of the vote there. Even Taleb would agree there's a near 100% chance of Biden winning there, pandemic or not. Silver's methodology is just extrapolating that out to each state. When it gets to some tipping-point states it's closer to 60/40 Biden or 70/30.

If Taleb wants to create his own forecast, it would be interesting to track its performance over time. I have a hard time believing that a model closer to a coin flip is going to outperform weighted polling data and historical precedent over multiple predictions.


Seems like the polls were systematically dramatically off, again.


Right, but the original point was that Biden's lead was likely large enough in enough swing states to overcome a 2016-style polling bias, which looks to have occurred in some states. That's why it was a 10% chance for Trump: he needed not just a large polling error in his favor, but one in enough swing states to total 270.

A landslide Biden victory was as probable as a narrow Biden victory. And both were more likely than a narrow Trump victory.


But that’s simply not true. Because Florida and Texas alone were so far off, 10% odds was definitely “way” wrong. Nate’s model predicted that if Florida went Trump the odds were at 33% or so. Texas moreso, etc. Since Florida went incredibly strongly Trump, we can say with certainty he was completely wrong on the 10% chance. That’s the point: he was way overconfident that his past model would fit; it should have been closer to a slight advantage, which would read something like 60/40 at best.


> Since Florida went incredibly strongly Trump, we can say with certainty he was completely wrong on the 10% chance.

The model generated 40k scenarios, and the scenario you're describing is one of them. The most extreme scenarios have Trump winning all swing states by a few %, which doesn't look likely. So what's playing out is not wildly outside the predictions by any means.

In other words, if a 10% chance happens, it doesn't mean the 10% prediction was wrong.

The only way to prove a prediction was wrong is to bet against it over time with your own predictions. As Silver has shown there's a huge appetite for election / sports predictions. Anyone able to beat him over time would have enormous income potential.

For example, if you thought Trump had at least a 40% chance to win, you'd have a great betting opportunity in the market that had Trump in the 25-33% range. You could have bought in then and sold when Trump's chance peaked at about 50-60%. If you arbitrage those mistakes you've identified in the market, over time you could become very wealthy.

Until then, it's just pundits pontificating after the results are known without putting any money or reputation on the line beforehand, similar to a casual sports fan late in the 4th quarter: "of course the 49ers were going to blow their lead -- I just knew it!"


His probability of Florida being won by that much was basically 0. But it was actually almost the opposite: in the real world, the probability of Biden winning was almost 0.

The polls were so far off systematically in one direction - it doesn’t mean he was right about the 10% chance, because that would mean there was equally likely a chance they were all off in the other direction, which is obviously just wrong. Polling missed huge groups of voters opinions - it was just wrong.

We have numbers now that show that Trump voters also aren’t forthcoming about their vote, probably on purpose, as an effect of the last election and spite towards pollsters in general. I actually had this as a strong prior, so my model predicted this would be close, but of course if you “just go by the polls” you’re essentially trusting a flawed system that is gameable. Anyone who trusts the polls is essentially a fool next time; they have been shown to be incorrect now twice by large amounts, and I wouldn’t doubt that the “meme” the right has started to purposely deceive them continues even after Trump.

You are absolutely wrong on your 49ers analogy and on the 10% covering this. Florida disproves it, as does the systematic nature of the error. If they weren’t off systematically, you’d have expected the polls to get Florida wrong in one direction but other states wrong in a different direction. That would support the model, but if every single state was multiple points off, all in the same direction, you don’t get to call margin of error. Your bell curve was shifted in one direction; it wasn’t a case of the curve being right but the dice rolling in the tail.


Yes, some states had big polling errors, while others like GA and AZ were reasonably accurate. But the model doesn't care if the polls are off in FL by 0.1% or 10% because it's a winner take all scenario. So while large polling errors are surprising, they have the same exact impact as a small polling error in a close race, which isn't surprising. But people like to cherry pick and say "how did you get X so wrong?" rather than "how did you get X/10 so right?" In fact, Trump might only end up winning 3 "upsets": FL, NC and ME-2, but they were all under 2% difference in the polls, so no one is shocked to see Trump win there. WI and MI would have been much bigger upsets, but that didn't end up happening and a 20k vote win is as good as a 2M vote win in the EC.

The 10% odds come from the path to 270. Here are the 9 states / districts that polled within 2% pre-election (considered toss ups): NE-2, AZ, FL, NC, ME-2, GA, OH, IA, TX. Even if Trump won all of those, he wouldn't get to 270. Using 50/50 odds for all 9 toss ups, Trump only gets a clean sweep 11% of the time. He would then still need another state like PA, WI or MI to get to 270. That's why Silver said that Biden's chances are so high, specifically because Biden could overcome a 2016-level polling error in every toss-up state and still win. In fact we saw 2016-level polling errors in some battleground states, but not all.

Another way to view it, is you have a 1/6 (16%) chance of rolling a 1 on a 6-sided die. Even if all 53 states and districts were projected at 84% confidence, and rolling a 1 meant "upset" and rolling 2-6 meant "polls were accurate", you'd end up with 8-9 states resulting in upsets! 84% confidence still means a lot of surprises. Their final FL prediction was only 69% confidence. Also, confidence intervals aren't linear, meaning that a 95% confidence anticipates half as many upsets as a 90% confidence interval. That +5% means -50% upsets, so 69 or 84% confidence is really not as sure a thing as it might sound.
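
The back-of-the-envelope version of that dice analogy (assuming independent races, which real models don't, but it shows the scale):

    # Expected number of "upsets" = races * (1 - confidence), assuming independence.
    races = 53
    for confidence in (0.84, 0.90, 0.95):
        print(f"{confidence:.0%} confidence -> ~{races * (1 - confidence):.1f} expected upsets")
    # 84% -> ~8.5, 90% -> ~5.3, 95% -> ~2.7 (half as many as at 90%)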

So, yes, it's worth re-examining why polling was very accurate in AZ & GA but very off in TX & FL (specifically Latinos in Miami). But a 10% chance was not crazy IMO given how easy it was for Biden to overcome even huge polling / vote disparities in multiple states and still win.

Again, those who disagree are free to get great odds in the betting markets.


Bringing in betting markets has nothing to do with the point, but you are right on one thing: if you had bet along with Nate’s odds, a Monte Carlo simulation based on the current results says you’d lose money.

The end story is the polls weren’t a reliable measure of the vote. Their predictive ability was low, and even playing by the rules of Nate’s model you have to admit he was off by some 30% at minimum.

Nate’s model has the average PA poll at Biden +6 (!!) where the final result looks to be under 1 (and could go either way). It’s simply impossible to argue that was an accurate model.


> Nate’s model has the average PA poll at Biden +6 (!!) where the final result looks to be under 1 (and could go either way). It’s simply impossible to argue that was an accurate model.

538's PA indicator ended at Biden +4.7%. But the overall 538 model doesn't care if a candidate wins PA by +0.7% or +4.7%. The 538 model is designed to do one thing: predict the EC winner. It uses state-wide predictions as indicators, but you can't judge a model based on the performance of a single low-level indicator; you judge it by its final prediction, which is Biden 90% / Trump 10% to win the EC.

But yes, the lower you go in the model the higher variance you'll see:

Level 1 -> EC Winner (538's focus -- could end up 100% correct)

Level 2 -> State Winners (low variance -- could end up 90-95% correct)

Level 3 -> State Polling Averages (moderate variance & could be 75-90% within margin of error)

Level 4 -> Individual Polls (high variance & could be 60-80% within margin of error)

If you click on any individual state (like PA: https://projects.fivethirtyeight.com/2020-election-forecast/...), you can see the state-wide vote projections. Biden was projected to earn between 49 and 55% of PA's vote with Trump expected to get between 45 and 50%. The final outcome looks certain to fall within those expected ranges. Again, the margin really doesn't matter, just that the idea of Trump getting above 50% and Biden getting below 50% seemed pretty difficult, hence Trump's low (but not impossible) 16% chance of carrying the state.

I like the way their "Winding Path to Victory" chart (https://projects.fivethirtyeight.com/2020-election-forecast) explains their model. It's basically saying: "Trump's path to 270 likely goes through PA, NE-2, AZ, FL, NC, ME-2, GA, OH, IA & TX, where he has a fighting chance of winning any of those, but a very low chance at winning all of them." On the flip side, "Biden's path to 270 goes through Wisconsin, Michigan, Nevada and Pennsylvania, where he's likely to win all of them". That seems like a pretty reasonable EC prediction to me.

If Silver was promoting the model as being able to accurately predict state vote percentages, I'd agree that it's underwhelming. But when you average all of that variability and uncertainty, you can get a pretty reasonable EC winner prediction IMO, at least better than others I've seen that had Biden closer to 95 or 98% to win the EC, or betting markets that had him as low as 60%.


If every single result lands at or below the lower bound of the range for one candidate and at or just past the upper bound for the other, systematically, that’s the definition of a flawed model. You seem to breeze past every point I make; I suppose there’s no educating the unwilling, so I’ll leave this thread. The bell curve was systematically off; the polls were systematically off. They actually missed the bounds in many races entirely. The betting markets were much more accurate.

I’ll leave this here:

https://mobile.twitter.com/NateSilver538/status/132287782408...

About 4pts off on average across them all in the same direction, all past the lower bound.


I really enjoyed reading your arguments. Thank you for going deep!


> If every single result lands at or below the lower bound of the range for one candidate and at or just past the upper bound for the other, systematically, that’s the definition of a flawed model.

Yes, I agree that individual low-level polls for 2 elections in a row have underrepresented Trump's actual support. But 538's model addresses exactly your point: low-level indicators like state polling can potentially be systematically flawed and biased. By simulating those unreliable low-level indicators through 40k scenarios, the scenario where Biden systematically underperforms biased pre-election polls in key swing states but still wins (as we're seeing) ends up being a very reasonable outcome, and is a large reason why Biden was favored overall. Trump needed to outperform the polls in 7 states to win, and that just didn't happen.

And the polls weren't off dramatically in every single race. Here are some of 538's last predictions in key states: AZ: 50.7% Biden / 48.1% Trump; NE-2: 51% Biden / 47.8% Trump; GA: 50.1% Biden / 49.2% Trump; PA: 52% Biden / 47.3% Trump; NC: 50.5% Biden / 48.8% Trump.

Each poll has a margin of error, usually +/-2.5% or more. So, 2-3% swings on Election day are completely normal. NC was off by a tiny margin and enough to flip the state to Trump, while the others were also off by tiny margins, but not enough to flip the state.

And here's how much polling has traditionally been off: https://en.wikipedia.org/wiki/Historical_polling_for_United_...

Sizable polling swings are not uncommon and historically have gone in both directions, but the model was more concerned about just how narrow Trump's path to 270 was, more so than a routine or even historical polling error.

> The betting markets were much more accurate.

Everyone who bet on Trump at a 30/40% chance to win is about to lose their bet. I definitely wouldn't have taken those odds for Trump to run the table in 7 straight must-win swing states all polling as a coin-flip and within the margin of error. In a completely random scenario Trump would have only had a 14% chance of pulling that off.


“The polls are not supposed to be off by that much, which is why we said that the polls messed up, and our forecast failed (in some part) by not correcting for these errors”

https://statmodeling.stat.columbia.edu/2020/11/06/comparing-...

You have a series of predictions about something that happens incredibly rarely (a 4-year interval). They land within a close range of accuracy. Your model is working.

Then, one year, the model is off by a full std dev or two.

How will it perform 4 years later? How do you correct for it if you don’t even have a strong prior on why it was off last time? How many effects do you just guess at and try to model for/against, and can you find any data for them that’s actually reliable?

How many unknown forces is your team arbitrarily deciding were acting on it last time vs. now, based entirely on their combined personal histories?

Models can be useful no doubt, whether they work 6/10 times or 9/10 - it entirely depends on the use. But if your model is right 8/10 times and the misses were the last 2? It’s time to rattle it and see what’s broken.

The takeaway: until someone makes a model that proves it can do a few elections in a row (not retroactively) who will really care about the model?


We can try going deeper.

I may be making a dumb mistake: when I go to 538, the main number I see is 89%. I don’t see 89% +- 40.

What is his confidence interval?


If you want a more data-heavy version go to the graphs, then click on the "electoral votes" tab. The shaded part is the 80% confidence interval of each prediction. This is the historical graph form of the "every outcome in our simulations" chart underneath the dot chart. It should be noted that the dot chart at the top of the page does kind of outline the confidence interval (giving a realistic example of what kinds of outcomes might happen) but is obviously geared towards lay-people who don't know what a confidence interval is.

> We simulate the election 40,000 times to see who wins most often. The sample of 100 outcomes below gives you a good idea of the range of scenarios our model thinks is possible.

I think it would be better if the confidence intervals were better signposted, but it's possible that only the electoral college win margins have solid confidence intervals (I'll admit I don't know enough statistics to know if you could trivially transform the confidence intervals of the electoral college votes to win percentage). The senate and house predictions explicitly show the confidence interval in their main graphs.
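
For what it's worth, here's a crude sketch of how a simulation-based forecast can report both the headline percentage and that 80% band (every input here is an assumption of mine, not 538's methodology):

    import numpy as np

    rng = np.random.default_rng(42)
    n_sims = 40_000

    # Hypothetical: draw a uniform national polling error and map it onto EV totals.
    national_error = rng.normal(0.0, 2.5, n_sims)             # points of poll bias
    ev = np.clip(335 + 25 * national_error, 0, 538).round()   # assumed mapping

    p_win = (ev >= 270).mean()             # the headline percentage
    lo, hi = np.percentile(ev, [10, 90])   # the 80% band shown on the EV chart
    print(f"P(win) = {p_win:.0%}, 80% interval = {lo:.0f}-{hi:.0f} EV")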


I think if you see 90% with 80% confidence interval, you can intuit that he isn't counting uncertainty in the tails.

Consider: would you be comfortable making a bet with someone on a 90/10 payoff? This, I think, is Taleb's central point: Silver's algorithm is not counting the uncertainty correctly.


The 89% probability of victory is the result of the actual predicted electoral vote total and the uncertainty in that prediction.

Placing an additional uncertainty on the uncertainty is...very much not understanding uncertainty.


In this case, Taleb's argument stands: 89% implies way too little uncertainty


Well, he's right in the sense that there's close to a 50% chance of a swing in either direction (blunders, dirt, health issues, polling errors, etc).

However, when one candidate is up 8+% (even months out) against an incumbent with 4 years of net negative approval ratings, to call that a toss up is like saying the frontrunner is substantially more likely to blunder than the underdog.

As Silver points out however, a Trump win is just as likely as the largest Democratic landslide victory in modern history. The polls could move either way, but the only way they move enough for Trump to win is a larger than 2016 polling error in his favor in all 7 battleground states, which doesn't sound like a 50% chance at all.


To say Biden is up by 8 points is missing a lot of microstructure. In every state Trump needs except PA, he’s within about 2 points. In PA, the critical state, he’s down by about 4.5 points, but Biden’s lead has been diminishing rapidly in the past two weeks.

These polls also rely on a turnout model. There’s no way to calibrate these models for a pandemic election where some people may be more scared to vote than others.

I give Biden the edge but it’s nowhere near 90/10.


I don't know where you got your numbers from. I got them from Tanenbaum's site https://www.electoral-vote.com/evp2020/Pres/Graphs/all.html and it shows that Trump would win all swing states and thus the election, were it not for the recent surprising swing in Texas from Rep to Dem, which broke Trump. You cannot really trust Tanenbaum's summary, since he's a heavy Dem fanboy who was wrong last time and is wrong this time, but the raw polling numbers are showing a clear picture. Early votes missed the Hunter Biden scandal, and Texas was the champion in early votes.


Taleb made a career from stating the obvious and making it appear as a smart discovery. I've read some of his writings and have never found anything really new.


My favorite story when I hear someone dismiss an argument saying 'that is obvious':

Lazarsfeld was writing about “The American Soldier”, a recently published study of over 600,000 servicemen, conducted by the research branch of the war department during and immediately after the second world war. To make his point, Lazarsfeld listed six findings that he claimed were representative of the report. Take number two: “Men from rural backgrounds were usually in better spirits during their Army life than soldiers from city backgrounds.”

“Aha,” says Lazarsfeld’s imagined reader, “that makes perfect sense. Rural men in the 1940s were accustomed to harsher living standards and more physical labour than city men, so naturally they had an easier time adjusting. Why did we need such a vast and expensive study to tell me what I already knew?” Why indeed.

But Lazarsfeld then reveals the truth: all six of the “findings” were in fact the exact opposite of what the study found. It was city men, not rural men, who were happier during their army life. Of course, had the reader been told the real answers in the first place, they could just as easily have reconciled them with other things they already thought they knew: “City men are more used to working in crowded conditions and in corporations, with chains of command, strict standards of clothing, etiquette, and so on. That’s obvious!” But this is exactly the point Lazarsfeld was making. When every answer and its opposite appears equally obvious then, as he put it, “something is wrong with the entire argument of ‘obviousness'”

More here: https://www.newscientist.com/article/mg21128210-100-the-huma...


Oh yes, I hate this kind of thing so much. People have so many pseudoscientific explanations and theories to support whatever random products and practices they like.

I call these “huh, that makes sense” explanations, since that’s often what people say after hearing them. You even used the magic three words above.


The related phenomenon in biology is called the “just so” fallacy; e.g. “X species evolved Y just so that they could do Z”. It’s a comforting story, but it will often lead you astray.


The same is true of pg, but his discoveries seem obvious only in hindsight.

(I’m objecting to “the discoveries are obvious” as a putdown. The simplest discoveries are some of the hardest. Though I doubt this matters for political pundit analysis.)


> The same is true of pg

Y Combinator is a success. I find PG’s writings fantastic, but sure, some of it is more good writing than concept building.

Taleb has no Y Combinator. His hedge fund failed. He’s just a talking head. That removes an element of grounding from his words.


Does it?

I have no horse in this race, but a success as rare as a unicorn could be what amounts to luck.

Out of billion people flipping a coin 30 times, a few of them may guess correctly all 30 times; but the books they write on coin flip guessing amount to meaningless drivel.

My point is that success or failure in such a complex, edge-case, luck-based world may have very little to do with knowledge.

PG, if you’re reading this, of course I mean no disrespect and I know little of the topic and nothing of this person at hand except that he seems to have run a failed hedge fund.


Frequentists miss the obvious stuff all the time. Taleb is not smarter; it's other people who are tied up in dogmatic thinking.


Totally agree, he comes off as a hack to me. But he’s famous and I’m not, so shrug


Taleb is seriously proficient at probability which can be judged from his technical publications.


I don't know much about Taleb, but one problem I've always had with Silver is that he's very smug about his models but they often enough don't work out. Then when they don't work out, instead of reflecting on his models to improve them (at least publicly,) he just tells people they don't understand probability or his models. His sports models are -especially- bad; no better than a coin flip in some instances. I know there are a ton of Silver fanboys, but surely you can agree he has a pretty big ego, which always makes me take him with a grain of salt. When it comes to politics, his former work with Obama's campaign and DailyKos also makes me question whether he can be truly unbiased in his predictions, but that's a separate issue.


Whose models are better? There have been years where Nate Silver's model has called every single one of the 50 states correct as far as which POTUS candidate they'd vote for. See: https://mashable.com/2012/11/07/nate-silver-wins/

He's not perfect (for instance, I think calling the 2016 model too certain early on is valid and I think some other people have basically equivalently-good models, but are less well known), but if confidence is smugness, then he has a right to be.


> Whose models are better? There have been years where Nate Silver's model has called every single one of the 50 states correct as far as which POTUS candidate they'd vote for.

Well, this certainly isn't how you'd go about evaluating model quality. Calling which candidate a state will vote for is not a good task; it's too discrete to be a good evaluator. For the same reason, no political model produces this output. You predict the vote share given to each candidate -- a nice continuous variable -- you don't predict who will win. Calling the state is a gimmick on top of that (since victory is a function of vote share, it's easy to degrade your model's output into victory predictions) which serves only to make the statistics worse.


Nate's probabilities are based on the electoral college, not the popular vote, so calling which way individual states go is by definition necessary for a good evaluation.


Consider states like Maine, which allocate some of their electors by congressional district rather than all in a block to the statewide winner.

Again, calling individual states is neither necessary for a good evaluation nor even helpful to it. It is a further processing step that runs on top of a finished prediction. If you have a vote share distribution, you can convert that into victor probabilities with basic algebra. You cannot operate in the reverse direction, and you cannot develop a good model that directly produces victor probabilities as its output.
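
A tiny illustration of the asymmetry (made-up numbers, not any real model): you can always collapse a vote-share distribution into a victor probability, but you can't go the other way.

    import numpy as np

    rng = np.random.default_rng(1)
    share = rng.normal(0.52, 0.02, 100_000)   # hypothetical two-party vote-share draws

    p_win = (share > 0.5).mean()              # degrade the continuous output
    print(f"mean share {share.mean():.3f}, P(win) = {p_win:.2f}")
    # Knowing only "P(win) = 0.84" can't tell you whether the expected share
    # was 50.5% or 60%, which is why the model's real output is the share.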


Nate's model calls the individual electors in Maine and Nebraska.

As we saw in 2016, trying to predict vote share is actually a bad idea for a presidential model, and predicting states is more useful.


To be perfectly honest, I think the Obama elections were a unique point in politics where models like Nate's worked a lot better. Realistically there weren't many up-for-grabs states, which means you could guess 8-12 states right and run the table. A lot has evolved since then, especially in the social media landscape and the accelerated death of answering phones, which live polls rely on so heavily. Maybe Silver will hit this election out of the park, I don't know, but I would err on the side that he gets a number of states wrong.


It’s worth pointing out that Nate was mad at this result because it meant his implied uncertainty for each state was wrong. It meant his model was sandbagged to some degree.


What are your criteria for saying "his models don't work out"? If he says "candidate A has a 90% chance of winning", and then they don't win, that doesn't mean his model was wrong. In the same vein, if the 90% candidate won, it doesn't mean his model was correct either.

You have to look at a lot of his predictions, and see if the percentages match up.

When you say his models don't work out, are you saying you have done that analysis and found that his percentages don't match the results? If so, I would love to see that analysis.


Well, that’s kind of on him to create, no?

Put another way, if Trump wins...again, do you take that as part of the expected outcomes or as evidence your distribution is wrong?

If he said something was a 10% likelihood, and it happened three times in a row, would you still think the prior is that the event was actually 10%, or that the estimate of a 10% likelihood was off?


They do do that analysis, in some detail: https://projects.fivethirtyeight.com/checking-our-work/


Pretty sure those are all based on sports or at least baseball. The question isn't whether Silver can predict baseball. We've known that since his time at Baseball Prospectus in 2007!


> Pretty sure those are all based on sports or at least baseball.

Both sports and political forecasts are included in the analysis, and are considered both separately and together (depending on the particular plot).

There's a dropdown at the top of the page that lets you pick specific polls to look at.


> he's very smug about his models but they often enough don't work out.

Taking all of the Presidential forecasts together, there are too few to really generalize about that. Taking the down-ballot results, though, the odds have been pretty close to what he's forecast.

So I'm not sure what your basis is for saying they “often enough don't work out”.

> His sports models are -especially- bad

Maybe, but his political models are especially good.

> I know there are a ton of Silver fanboys, but surely you can agree he has a pretty big ego

He's never really come off that way to me, not that whether or not he has a big ego has any bearing on the quality of his models.


>> The whole argument is sort of pointless from an intellectual/academic perspective. It's a war of public personalities more than anything else.

> ...he's very smug...

Right, it's just dueling personalities.


I frankly don't care if someone is smug or not; I want to know who builds the best models. If Silver's models are often wrong, who would you recommend we look into instead?


Silver might have the best models, but he isn't infallible, which a sizable number of his followers seem to believe...and I think that only feeds into his ego and is probably damaging to his models.


You’re saying Nate Silver has “followers” and that their opinion of him “probably” affects the work he is producing?

How does ego affect the model?

Does your theory extend into other celebrities? Is Taleb affected too?


He recently tweeted that the only way Trump can win is through polling error or if Trump steals the election. Do you think that sounds like an unbiased person? His model can't be the reason why he was wrong; it had to be the input data! Imagine saying that to your boss.


His model is based off of polling data. Ultimately what his model encodes is the polls plus a degree of uncertainty that the polls are completely wrong. If you looked at just the polls, there was no way for Trump to win, even assuming a polling error of the same variety as 2016, which appears to be about the situation we are in now.

So no, there's nothing wrong with his tweet. Otherwise, you should be able to explain how Trump could win without a polling error: which states that Trump was polling behind in by 5-7 points he would win, and how he would win them without a polling error.


> one problem I've always had with Silver is that he's very smug about his models but they often enough don't work out.

What does it mean for a probabilistic prediction to "not work out"? If I tell you that the odds of flipping 2 heads in a row with a fair coin is only 25%, and it happens, did my model "not work out"?

> instead of reflecting on his models to improve them (at least publicly,) he just tells people they don't understand probability or his models.

I really don't intend this in any sort of rude way, but I think you might be interpreting his response as dismissive because it's accurate...

It's difficult to say a model "doesn't work out" based on a single event, especially when it puts a significant probability on the less-likely option. For example, 538's 2016 forecast gave Trump a roughly 1 in 3 chance of winning: that's _higher_ than the odds of flipping 2 heads in a row.

Measuring the quality of a model is more complicated: one way would be applying the model to the same outcome multiple times, but this isn't possible for single-event forecasts. Another way to see how calibrated your predictions are is to aggregate multiple predictions and see how often they line up with reality: an event predicted to occur with X% probability should occur X% of the time.
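
A minimal sketch of what such a calibration check looks like (made-up forecasts, not 538's data): bucket forecasts by stated probability and compare each bucket's stated level to its observed hit rate.

    from collections import defaultdict

    # (stated probability, did the predicted event happen) -- hypothetical history
    forecasts = [(0.9, True), (0.9, True), (0.9, False), (0.7, True), (0.7, True),
                 (0.7, False), (0.6, True), (0.6, False), (0.3, False), (0.3, True)]

    buckets = defaultdict(list)
    for p, happened in forecasts:
        buckets[round(p, 1)].append(happened)

    for p in sorted(buckets):
        hits = buckets[p]
        print(f"forecast {p:.0%}: happened {sum(hits)}/{len(hits)} times "
              f"({sum(hits) / len(hits):.0%} observed)")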

Lucky for us, 538 has done exactly this analysis[1]! Naturally, it's internal, so take it with a grain of salt, but it looks like their predictions are fairly well-calibrated.

[1] https://projects.fivethirtyeight.com/checking-our-work/




