What the pollsters got wrong
Had the US presidential polls been correct in 2016, Nate Silver and other forecasters would be anointed oracles and the polling companies would be viewed as soothsayers revealing fundamental truths about society. None of these things happened. Instead, forecasters were embarrassed and polling companies got a bloody nose. If we want to understand if things will go any differently in 2020, we have to understand what happened in 2016 and why.
What happened in 2016
The simple narrative is: "the polls got it wrong in 2016", but this is an oversimplification. Let's look at what actually happened.
Generally speaking, there are two types of US presidential election opinion polls: national and state. National polls are conducted across the US and are intended to give a sense of national voting intentions. Prediction-wise, they are most closely related to the national popular vote. State polls are conducted within a state and are meant to predict the election result in that state.
All pollsters recognize uncertainty in their measurement and most of them quote a margin of error, which is usually a 95% confidence interval. For example, I might say candidate cat has 49% and candidate dog has 51% with a 4% margin of error. This means you should read my results as cat: 49±4% and dog: 51±4%, or more simply, that I think candidate dog will get between 47% and 55% of the vote and candidate cat between 45% and 53%. If the actual results are cat 52% and dog 48%, technically, that's within the margin of error and is a successful forecast. You can also work out a probability of a candidate winning based on opinion poll data.
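As a concrete sketch of where those numbers come from (the poll figures and sample size here are invented for illustration), the 95% margin of error and a win probability can both be computed from the sample size under the textbook simple-random-sampling assumption:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def win_probability(p, n):
    """P(true support > 50%) under a normal approximation to the sampling error."""
    se = math.sqrt(p * (1 - p) / n)
    return 0.5 * (1 + math.erf((p - 0.5) / (se * math.sqrt(2))))

# A hypothetical poll of 600 respondents: dog 51%, cat 49%
moe = margin_of_error(0.51, 600)
print(f"dog: 51% +/- {moe * 100:.1f} points")        # about +/- 4 points
print(f"P(dog wins): {win_probability(0.51, 600):.2f}")  # about 0.69
```

Note how a 51-49 split, which sounds decisive, only gives dog roughly a two-in-three chance of winning once the sampling error is taken into account.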
The 2016 national polling was largely correct. Clinton won the popular vote with a 2.1% margin over Trump. Wikipedia has a list of 2016 national polls, and it's apparent that the polls conducted closer to the election gave better results than those conducted earlier (unsurprisingly) as I've shown in the chart below. Of course, the US does not elect presidents on the popular vote, so this point is of academic interest.
The state polls are a different matter. First off, we have to understand that polls aren't conducted in every state. Wyoming is very, very Republican and as a result, few people would pay for a poll there - no newspaper is going to get a headline from "Republican leads in Wyoming". Obviously, the same thing applies to very, very Democratic states. Polls are conducted more often in hotly contested areas with plenty of electoral college votes. So how did the state polls actually do in 2016? To keep things simple, I'll look at the results from the poll aggregator Sam Wang and compare them to the actual results. The poll aggregation missed in these states:
(Table: actual spread vs. poll aggregator spread for each of these states, both expressed as Trump - Clinton.)
Poll aggregators use different error models for calculating their aggregated margin of error, but typically it falls in the 2-3% range. A few of these results are outside the margin of error, but more tellingly, the misses are all in the same direction. A wider analysis of all the state results shows the same pattern. The polls were biased in favor of Clinton, but why?
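To see why an aggregated margin of error is so much tighter than any single poll's, here is a textbook inverse-variance weighting sketch (the polls and numbers are invented; real aggregators use more sophisticated models):

```python
import math

def aggregate(polls):
    """Inverse-variance weighted average of poll spreads.
    polls: list of (spread, margin_of_error) pairs, both in points."""
    # Convert each 95% margin of error back to a standard error, then weight
    weights = [1 / (moe / 1.96) ** 2 for _, moe in polls]
    spread = sum(w * s for (s, _), w in zip(polls, weights)) / sum(weights)
    moe = 1.96 / math.sqrt(sum(weights))
    return spread, moe

# Three hypothetical state polls: Clinton +3, +2, +4, each with a 4-point MoE
spread, moe = aggregate([(3.0, 4.0), (2.0, 4.0), (4.0, 4.0)])
print(f"aggregate: Clinton +{spread:.1f} +/- {moe:.1f}")  # MoE shrinks to ~2.3
```

The catch is that this shrinkage is only valid if the polls' errors are independent. If they all share the same bias, as in 2016, the aggregate inherits the bias while claiming a tighter margin of error.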
Why they got it wrong
In the aftermath of the election, the American Association for Public Opinion Research created an ad-hoc commission to understand what went wrong. The AAPOR published their findings and I'm going to provide a summary here.
Quoting directly from the report, late changes in voter decisions led earlier polls to overestimate Clinton's support:
"Real change in vote preference during the final week or so of the campaign. About 13 percent of voters in Wisconsin, Florida and Pennsylvania decided on their presidential vote choice in the final week, according to the best available data. These voters broke for Trump by near 30 points in Wisconsin and by 17 points in Florida and Pennsylvania."
The polls oversampled those with college degrees and undersampled those without: "In 2016 there was a strong correlation between education and presidential vote in key states. Voters with higher education levels were more likely to support Clinton. Furthermore, recent studies are clear that people with more formal education are significantly more likely to participate in surveys than those with less education. Many polls – especially at the state level – did not adjust their weights to correct for the over-representation of college graduates in their surveys, and the result was over-estimation of support for Clinton."
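The weighting adjustment the AAPOR report describes is straightforward to illustrate. In this invented example (the shares and sample sizes are mine, not from any real poll), college graduates are over-represented among respondents, and re-weighting them down to their share of the electorate shifts the estimate away from Clinton:

```python
def weighted_estimate(sample, population_shares):
    """Re-weight respondents so education groups match their population shares.
    sample: dict of education level -> (respondent_count, clinton_share)
    population_shares: dict of education level -> share of the electorate"""
    total = sum(n for n, _ in sample.values())
    # Unweighted: each respondent counts equally
    raw = sum(n * share for n, share in sample.values()) / total
    # Weighted: each group counts in proportion to its electorate share
    weighted = sum(population_shares[level] * share
                   for level, (_, share) in sample.items())
    return raw, weighted

# Hypothetical state poll: college grads are 60% of respondents
# but only 40% of the electorate
sample = {"college": (600, 0.55), "no_college": (400, 0.40)}
shares = {"college": 0.40, "no_college": 0.60}
raw, weighted = weighted_estimate(sample, shares)
print(f"unweighted Clinton share: {raw:.1%}")   # 49.0%
print(f"weighted Clinton share:   {weighted:.1%}")  # 46.0%
```

A three-point shift from a single weighting decision is comparable in size to the state-level misses in the table above.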
The report also suggests that the "shy Trump voter" effect may have played a part.
Others also investigated the result, and a very helpful paper by Kennedy et al provides some key supporting data. Kennedy also identifies voter education as a key factor, showing charts that illustrate the relationship between education and voting in 2012 and 2016. As you might expect, education had little influence in 2012, but in 2016 it was a strong predictor. In 2016, most state-level polls did not adjust for education.
Although the polls in New Hampshire called the results correctly, they predicted a much larger win for Clinton. Kennedy quotes Andrew Smith, a UNH pollster, and I'm going to repeat the quote here because it's so important: "We have not weighted by level of education in our election polling in the past and we have consistently been the most accurate poll in NH (it hasn’t made any difference and I prefer to use as few weights as possible), but we think it was a major factor this year. When we include a weight for level of education, our predictions match the final number."
Kennedy also found good evidence of a late swing to Trump that was not caught by polls conducted earlier in the campaign.
On the whole, there does seem to be agreement that two factors were important in 2016:
- Voter education. In previous elections it didn't matter; in this one it did. State-level polls on the whole didn't control for it.
- Late swing to Trump missed by earlier polls.
2020 and beyond
The pollsters' business depends on making accurate forecasts and elections are the ultimate high-profile test of the predictive power of polls. There's good evidence that at least some pollsters will correct for education in this election, but what if there's some other factor that's important, for example, housing type, or diet, or something else? How will we be able to spot bias during an election campaign? The answer is, we can't. What we can do is assume the result is a lot less certain than the pollsters, or the poll aggregators, claim.
In the run-up to the 2016 election, I created an opinion poll-aggregation model. My model was based on the work of Sam Wang and used election probabilities. I was disturbed by how quickly a small spread in favor of a candidate gave a very high probability of winning; the election results always seemed more uncertain to me. Textbook poll aggregation models reduced the uncertainty still further.
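That sensitivity is easy to demonstrate. Under a normal model for the final spread (my own minimal sketch, not Sam Wang's actual model), a modest lead turns into near-certainty when the only uncertainty assumed is sampling error, but looks far shakier once a plausible allowance for systematic bias widens the distribution:

```python
import math

def win_prob(spread, sigma):
    """P(candidate wins) if the final spread (in points) is
    normally distributed with mean `spread` and sd `sigma`."""
    return 0.5 * (1 + math.erf(spread / (sigma * math.sqrt(2))))

# A 3-point lead looks near-certain with a sampling-only sigma...
print(f"sigma=1.5: {win_prob(3.0, 1.5):.2f}")  # ~0.98
# ...but much less certain once possible systematic bias is allowed for
print(f"sigma=4.0: {win_prob(3.0, 4.0):.2f}")  # ~0.77
```

The spread estimate is identical in both cases; only the assumed uncertainty changes, and it alone moves the headline probability by twenty points.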
The margin of error quoted by pollsters is just the sampling error, assuming random sampling. But sampling isn't wholly random, and there may be house effects or election-specific effects that bias the results. Pollsters and others assume these effects are zero, which isn't the case. Of course, pollsters change their methodology with each election to avoid repeating previous mistakes. The upshot is, it's almost impossible to assess the size of these non-random bias effects during an election. My feeling is that opinion poll results are a lot less certain than the quoted margin of error suggests, and the 'real' margin of error may be much greater.
The lesson for poll aggregators like me is to allow for other biases and uncertainty in our models. To his great credit, Nate Silver is ahead here as he is in so many other areas.