What I did
One of my hobbies is forecasting US presidential elections using opinion poll data. The election is over and Joe Biden has been sworn in, so this seems like a good time to look back on what I got right and what I got wrong.
What I got right
My final model correctly predicted the results of 49 out of 51 contests (the 50 states plus Washington, D.C.).
What I got wrong
The two states my model got wrong were Florida and North Carolina, and these were big misses: the results fell outside my confidence intervals. The cause in both cases was the polling data. In both states, the polls were consistently wrong and substantially overstated Biden's vote share.
My model also overstated Biden's margin of victory in many of the states he won. This is easy to overlook because my model forecast a Biden victory and Biden won, but in several cases his margin of victory was significantly smaller than my model predicted.
Here too, the cause was opinion polls overstating Biden's vote share.
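To make the difference between "called the winner" and "got the margin right" concrete, here is a minimal Python sketch of how the two can be scored separately. It is an illustration, not my actual model code: the `StateForecast` class, the `summarize` function, and the symmetric interval are assumptions, and the numbers for "State A", "State B", and "State C" are invented placeholders rather than real forecasts or results.

```python
from dataclasses import dataclass


@dataclass
class StateForecast:
    """One state's forecast and result (hypothetical structure).

    Margins are Biden minus Trump, in percentage points.
    """
    name: str
    forecast_margin: float
    interval_half_width: float  # assumed symmetric interval around the forecast
    actual_margin: float

    @property
    def error(self) -> float:
        # Positive error means the forecast overstated Biden's margin.
        return self.forecast_margin - self.actual_margin

    @property
    def winner_called(self) -> bool:
        # Directionally correct: forecast and result have the same sign.
        return (self.forecast_margin > 0) == (self.actual_margin > 0)

    @property
    def inside_interval(self) -> bool:
        # Did the actual margin land within the forecast interval?
        return abs(self.error) <= self.interval_half_width


def summarize(forecasts):
    """Count correct calls and interval hits, and average the signed error."""
    return {
        "winners_called": sum(f.winner_called for f in forecasts),
        "inside_interval": sum(f.inside_interval for f in forecasts),
        "mean_error": sum(f.error for f in forecasts) / len(forecasts),
    }


# Invented placeholder numbers, NOT real 2020 forecasts or results.
EXAMPLE = [
    StateForecast("State A", 8.0, 3.0, 3.0),    # right winner, margin overstated
    StateForecast("State B", 2.0, 3.0, -2.0),   # wrong winner, outside interval
    StateForecast("State C", -6.0, 3.0, -7.0),  # right winner, inside interval
]

summary = summarize(EXAMPLE)
print(summary)
```

In this made-up example, two of three winners are called correctly, yet only one result lands inside its interval and the average signed error is positive. That is the pattern described above: a good record on winners can hide a systematic overstatement of one candidate's margin.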
The polling industry and 2020
The polling industry as a whole overstated Biden's support by several percentage points across many states. This is easy to overlook because the polls called the winner correctly in most states, but it's still a wide miss.
In the aftermath of 2016, the industry examined its methods and promised it would do better next time, but the 2020 polls were still well off. The industry is going to do a retrospective to find out what went wrong in 2020.
I've read a number of explanations of polling misses in the press, but the press's motivation is selling advertising, not getting to the root cause. Polling is hard and 2020 was very different from previous years: there was a pandemic and Donald Trump was a highly polarizing candidate. This led to higher voter turnout and many more absentee ballots. If the cause were easy to find, we'd have found it by now.
The 2020 investigation needs to be thorough and credible, which means it will be several months at least before we hear anything. My best guess is that there will be an industry paper in six months, and several independent research papers starting in a few months. I'm looking forward to the analysis: I'm convinced I'm going to learn something new.
There are lots of tweaks I could make to my model, but I'm not going to make any of them until the underlying polling data improves. In other words, I'm going to forget about it all for three years. In fact, I'd quite like to forget about politics for a while.
If you liked this post, you might like these
- Forecasting the 2020 election: a retrospective
- What do presidential approval polls really tell us?
- Fundamentally wrong? Using economic data as an election predictor - why I distrust forecasting models built on economic and other data
- Can you believe the polls? - fake polls, leading questions, and other sins of opinion polling.
- President Hilary Clinton: what the polls got wrong in 2016 and why they got it wrong - why the polls said Clinton would win and why Trump did.
- Poll-axed: disastrously wrong opinion polls - a brief romp through some disastrously wrong opinion poll results.
- Who will win the election? Election victory probabilities from opinion polls
- Sampling the goods: how opinion polls are made - my experiences working for an opinion polling company as a street interviewer.
- The electoral college for beginners - how the electoral college works