Why We Should Not Worship Nate Silver
by Joran Elias
30 January 2009
The recent presidential election provided quite a lot of fodder for those of us who have a fetish for quantitative data. I’m speaking, of course, of the staggering volume of election polling that took place. I have a healthy skepticism for the science (art?) of public opinion polling, but presidential elections seem to be providing ever-increasing amounts of public opinion data on a single topic. And if there is anything that I do like, it’s large quantities of data!
Of course, it wasn’t long before we were informed that, really, in order to understand the implications of this glut of polling data, we needed some system for aggregating and combining the information. RealClearPolitics (RCP) was an early leader here. The other heavy hitters are Pollster.com, 538, the lesser-known Sam Wang and quite a few others whom I will omit for space.
I became a fan very early on of Mark Blumenthal, who blogged as MysteryPollster and then went on to found Pollster.com with political scientist Charles Franklin. In particular, I appreciate a style that emphasizes a restrained approach to data analysis. They are interested in clearly and cleanly displaying data. Nothing more, nothing less. (There’s lots of other commentary on Pollster.com, but this is the heart of what they do.) Most of the other sites (RCP, 538, etc.) want to be oracles; Pollster.com wants to be a resource for public information.
In any case, one aspect of (most) sites like these was that they provided aggregated estimates of the level of support for Obama and McCain, not just nationally, but in each state as well. (Sam Wang is an exception here, as are some of the lesser-known sites that take a fully Bayesian approach.) Inevitably, the question arises: whose model did the best? This has been looked at (see here, here, and here) by others, who basically concluded that there wasn’t much of a difference.
This matters because the complexity of the approaches taken diverged wildly. I won’t go into the gory details, but RCP simply took (unweighted) averages of the most recent polls. Pollster.com fit non-parametric regression curves to all the polling data, and 538... well, yeah. Let’s just say that Nate Silver built up a lot of machinery to tackle this problem.
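To make the contrast between the two simpler approaches concrete, here is a minimal sketch. The poll numbers are hypothetical, and the Gaussian kernel smoother is only a stand-in for "non-parametric regression"; it is not a reproduction of what RCP or Pollster.com actually ran.

```python
import math

# Hypothetical polls: (days before the election, candidate's share in percent).
POLLS = [(30, 48.0), (24, 49.5), (18, 50.0), (12, 51.0), (6, 52.5), (2, 52.0)]

def recent_average(polls, n_recent=3):
    """Unweighted mean of the n_recent polls closest to election day."""
    latest = sorted(polls, key=lambda p: p[0])[:n_recent]
    return sum(share for _, share in latest) / len(latest)

def kernel_smooth(polls, at_day=0.0, bandwidth=7.0):
    """Kernel-weighted average (Nadaraya-Watson) evaluated at at_day.

    Polls taken near at_day get more weight; bandwidth (in days) controls
    how quickly older polls are discounted. Both defaults are assumptions.
    """
    weights = [math.exp(-0.5 * ((day - at_day) / bandwidth) ** 2) for day, _ in polls]
    total = sum(weights)
    return sum(w * share for w, (_, share) in zip(weights, polls)) / total

if __name__ == "__main__":
    print("recent average:", round(recent_average(POLLS), 2))
    print("kernel smooth :", round(kernel_smooth(POLLS), 2))
```

The first function throws away everything but the last few polls; the second uses every poll but discounts the old ones. With a very large bandwidth the smoother collapses to the plain average of all polls, which is one way to see why the two can end up close to each other.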
I was curious about how much these three methodologies diverged in accuracy as well (RCP, 538 and Pollster; Chris Bowers didn’t consider RCP, though he did include Gov and Sen races, which I did not). I won’t clog this post with zillions of graphs, but it suffices to say that there were essentially no differences in the accuracy of the predictions made by any of these sites, but that they all did much better than had we simply picked one poll near the election and used that as our prediction. (If this post generates a lot of discussion, I might be convinced to go back and put up some graphs and extended analysis. At the moment, I’m too lazy, so you’ll have to make do with me summarizing my conclusions.)
What can we learn from all this? First, we should always be aware of the diminishing returns of increasing model complexity. Second, we should be frightened by how easily Nate Silver garnered a reputation as being (essentially) infallible by
- Evincing an air of certitude and
- Using methods far more complex than are necessary
Is it really surprising that his background is that of a quant, and not really that of a statistician, as is commonly believed? (Technically, Wiki tells me that his education was in economics, and that he worked as a financial analyst.) This isn’t meant as a harsh knock on Nate Silver, despite appearances. I mean, his model was really accurate. It just wasn’t any better than RCP or Pollster.com.
The rapid rise of Nate Silver worries me, because it sometimes begins to take on a cult-like atmosphere. Nate Silver doesn’t bother me; if I had as much free time as him, I’d probably spend a lot of it doing something similarly too-complex (and fun!). (Indeed, I read the commentary on his site all the time, although Sean Quinn brought way more to that site than Nate Silver ever did, in my opinion.) What frightens me is how easily many people seem to be convinced that “more complicated” is automatically better. Particularly given how well the quants did with their super complicated models for credit scoring in recent years.
PS – I’m being glib about the quants and credit scoring; all I mean is that “we” (someone!) royally screwed that up. I know it’s complicated and don’t mean to single out any particular group for blame.

Jan 31, 07:14 PM
Thanks Joran, that was good. I’ve never liked Silver, but the kicker was when he wrote the post that prompted this “gentle correction” (http://gregmankiw.blogspot.com/2009/01/importance-of-being-exogenous.html) and then continued to defend his original position.
Jan 31, 07:14 PM
I didn’t follow any of these web sites that closely, but I thought 538 had a lot going for it.
First, given the difficulty of describing statistical techniques to a broad audience, I think Silver did a good job of that. Doing that on a daily basis seems very difficult.
It sounds like you’re less impressed since, at least in retrospect, some of this complexity didn’t buy him much. Still, I thought he communicated his techniques well considering what he was doing.
Also, could the other sites answer questions like “What’s the probability that Obama wins over 400 electoral votes while winning Pennsylvania but losing Ohio?”
If you want a restrained (descriptive) approach to data, those questions seem silly. But many people seemed titillated by that, and I thought the ability to simulate those outcomes was kind of neat.
It’s not careful enough that I’d trust it with my life savings, but if you compare it to most other websites people read to kill time at work, I think it’s pretty good.
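For readers wondering what "simulating those outcomes" amounts to, here is a minimal Monte Carlo sketch. Everything in it is an assumption made for illustration: the toy map, the win probabilities, and especially the independence between states, which a serious model would not assume. It is not 538's method.

```python
import random

# Hypothetical toy map: state -> (electoral votes, assumed win probability).
TOY_STATES = {
    "A": (21, 0.9),
    "B": (20, 0.6),
    "C": (27, 0.5),
    "D": (15, 0.7),
}

def joint_outcome_prob(states, must_win, must_lose, min_votes, n_sims=20000, seed=0):
    """Estimate P(win must_win, lose must_lose, and total votes >= min_votes).

    Each state is drawn independently from its assumed win probability.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_sims):
        won = {name for name, (_, p) in states.items() if rng.random() < p}
        votes = sum(states[name][0] for name in won)
        if must_win in won and must_lose not in won and votes >= min_votes:
            hits += 1
    return hits / n_sims

if __name__ == "__main__":
    print(joint_outcome_prob(TOY_STATES, must_win="A", must_lose="B", min_votes=40))
```

In this toy map the event requires winning A and C and losing B, so the estimate should land near 0.9 × 0.4 × 0.5 = 0.18. The answer is only as good as the assumed probabilities going in.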
Jan 31, 11:10 PM
Jon: Yeah, I’ve become less of a fan of his general writing style the more of it I’ve seen.
Dan: Huh. I don’t think that “seems” like a silly question, I think it IS a silly question. Which isn’t to say considering it isn’t fun; like I said, given enough free time I probably would have done something similar.
What I would NOT have done is go to great lengths to sell my answers to these silly and meaningless questions as Serious Statistical Research, and then leverage the resulting fame and prestige to become the Eminent Authority on the Truth of Polling.
Which is again making me sound a lot bitchier about him than I really feel. But look, it’s one thing to dick around with fun shit like that for you and your friends, but if you want to be PAID MONEY to provide statistical commentary on polling data to the general public, I think you have an obligation to do so responsibly.
The responsible (and correct) answer to all those “What is the probability of some particular election outcome?” questions is that they are fun but meaningless questions. If pressed, we could concoct some wacky simulation to provide us with an “answer”, but this would be an illusion, and pretending that this “answer” should have any implications for the real world would be foolish.
Now I’ve been ripping on him too much, so I’ll end on a positive note. His regression model was genuinely interesting. Why? First, I thought he was doing interesting stuff with it during the primaries that merited further consideration. Second, in the general his regression model alone (no polling) was remarkably accurate (I’ve checked this myself). This is pretty cool, but to my knowledge he never commented on it at all.
Feb 1, 11:30 AM
Joran,
If “What is the probability of some particular election outcome?” is a silly question, does that include the question “What is the probability of Obama winning the election?” If they are different, can you put your finger on what makes them different?
More broadly, what he’s doing may not be “serious statistical research,” but I doubt you’d object to him going on a sports show and saying “the Cubs have an X% chance of going to the World Series.” Why is it worse if he gets paid to go on a political show?
No one complains that Larry Sabato gets paid for confident predictions, and his methodology is even less transparent than Silver’s.
Feb 1, 08:51 PM
Dan,
Basically, I think they are both silly (or at least unhelpful) questions, although their relevance grows as we get closer to an election. So asking these questions was pointless in August, but maybe somewhat informative within a week of the election.
If we say (2 months out) that Obama has an 80% chance of winning the election, what could that possibly mean? We could mean that
(i) Obama has an 80% chance of winning if the election were held today, or
(ii) Obama has an 80% chance of winning on Nov 4th.
If what we mean is (i), then why make the statement at all? The election ISN’T being held today, so why would we be interested in his chances of winning today? More broadly, how is this any different from the banal statement that “Obama is currently ahead in public opinion polls”? Why dress that up as a prediction?
If what we mean is (ii), then I call bullshit. I don’t see how you can make this claim without having modeled the remaining 2 months of the election!!! Some people try to get out of this problem by saying that what we mean is (ii), but under the assumption that “things remain relatively the same”. But I think this just reduces our claim to the meaning in (i).
The closer we get to the actual election, the more likely it is that the current state of polling will accurately reflect public opinion on election day.
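The gap between readings (i) and (ii) can be sketched numerically. The sketch below assumes a normal polling error and, for (ii), an extra assumption that opinion drifts as a random walk with a fixed daily standard deviation. Every number and the drift model itself are hypothetical; the point is only that (ii) cannot be computed without committing to some model of the remaining days.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_prob_today(lead, poll_sd):
    """Reading (i): chance the true lead is positive if the vote were today."""
    return normal_cdf(lead / poll_sd)

def win_prob_election_day(lead, poll_sd, days_left, daily_drift_sd):
    """Reading (ii): the same, after adding assumed random-walk drift.

    Drift variance grows linearly with days_left, so the total uncertainty
    is sqrt(poll_sd**2 + days_left * daily_drift_sd**2).
    """
    total_sd = math.sqrt(poll_sd ** 2 + days_left * daily_drift_sd ** 2)
    return normal_cdf(lead / total_sd)

if __name__ == "__main__":
    lead, poll_sd, drift = 3.0, 3.0, 0.5  # hypothetical values, in points
    print("today       :", round(win_prob_today(lead, poll_sd), 3))
    for days in (60, 30, 7, 0):
        print(days, "days out:", round(win_prob_election_day(lead, poll_sd, days, drift), 3))
```

With zero days left the two functions agree exactly, and the further out we are, the more the answer to (ii) depends on the assumed drift rather than on the polls.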
The baseball point is a good one. First, I could argue that having a large number of people who are statistically illiterate with regard to the outcome of baseball seasons is not very harmful to our country, while the same being true about the outcome of elections is potentially fairly serious. But maybe that’s a strained argument…
I could also argue that baseball is just easier to model. It’s more or less a closed system with well defined outcomes and events. And we have a lot more data on typical baseball players than on “typical” politicians, whatever that means. (That said, I would admit that I find the extent of baseball modeling taking place to be excessive as well; indeed, I’ve read somewhere that a lot of people bitch about Nate Silver in this arena too, that he came up with a system (PECOTA) that only performed slightly better than its competitors but marketed it very well.)
Larry Sabato is also a good point. I really don’t know that much about him, but based on what I do know, I’m not really a fan.
I do like some of the poli sci predictions that are done using only macro-level economic conditions. That’s pretty cool, and interesting. Which brings me back to what I actually liked about Silver in the first place, which was his demographics based regression model that he started using in the primaries. It was interesting that he was having some success with that in situations where the polling was relatively sparse, and I kept waiting for him to analyze the performance of his regression model vs. pure polling, but he has yet to do that.
Basically, I feel (as an almost-statistician) that if you are doing statistical analyses for the general public, you have a responsibility to be super clear about the implications of all sources of uncertainty. Which was why this was the last straw for me with him. (See here for a somewhat caustic response.)
Feb 2, 10:16 AM
It sounds like we just use different points of comparison. Silver is over-reaching as scientists go. By the pundit standard, it represents some restraint that he isn’t yet predicting the 2012 GOP presidential nominee.