Monday, October 12, 2020

Fundamentally wrong? Using economic data as an election predictor

What were you thinking?

Think back to the last time you voted. Why did you vote the way you did? Here are some popular reasons, how many apply to you?

  • The country's going in the wrong direction, we need something new.
  • My kind of people vote for party X, or my kind of people never vote for party Y.
  • I'm a lifelong party X voter.
  • Candidate X or party X is best suited to running the country right now.
  • Candidate Y or party Y will ruin the country.
  • Candidate Y or party X are the best for defense/the economy/my children's education and that's what's important to me right now.

(Ballot drop box. Image Source: Wikimedia Commons. Author: Paul Sableman. License: Creative Commons.)

Using fundamentals to forecast elections

In political science circles, there's been a movement to use economic data to forecast election results. The idea is, homo economicus is a rational being whose voting behavior depends on his or her economic conditions. If the economy is going well, then incumbents (or incumbent parties) are reelected, if things are going badly, then challengers are elected instead. If this assertion is true, then people will respond rationally and predictably to changing economic circumstances. If we understand how the economy is changing, we can forecast who will win elections.

Building models based on fundamentals follows a straightforward process:
  1. Choose an economic indicator (e.g. inflation, unemployment, GDP) and see how well it forecasts elections.
  2. Get it wrong for an election.
  3. Add another economic indicator to the forecast to correctly predict the wrong election.
  4. Get it wrong for an election.
  5. Either re-adjust the model weights or go to 3.
These models can get very sophisticated. In the United States, some of the models include state-level data and make state-level forecasts of results.

What happens in practice

Two University of Colorado professors, Berry and Bickers, followed this approach to forecast the 2012 presidential election.  They very carefully analyzed elections back to 1980 using state-level economic data.  Their model was detailed and thorough and they helpfully included various statistical metrics to guide the reader to understand the model uncertainties. Their forecast was very clear: Romney would win 330 electoral college votes - a very strong victory. As a result, they became darlings for the Republican party.

Unfortunately for them, things didn't work out that way. The actual result was 332 electoral college votes for Obama and 206 for Romney, an almost complete reversal of their forecast.

In a subsequent follow-up (much shorter than their original paper), the professors argued in essence that although the economy had performed poorly, voters didn't blame Obama for it. In other words, the state of the economy was not a useful indicator for the 2012 election, even considering state-level effects.

This kind of failure is very common for fundamentals. While Nate Silver was at the New York Times, he published a long piece on why and how these models fail. To cut to the chase, there is no evidence voters are homo economicus when it comes to voting. All kinds of factors affect how someone votes, not just economic ones. There are cultural, social class, educational, and many other factors at work.

Why these models fail - post hoc ergo propter hoc and spurious correlations

The post hoc fallacy is to assume that because X follows Y, Y must cause X. In election terms, the fundamentalists assume that an improving or declining economy leads to certain types of election results. However, as we've said, there are many factors that affect voting. Take George Bush's approval rating, in the aftermath of 9/11 it peaked around 88% and he won re-election in 2004. Factors other than the economy were clearly at work.

A related phenomenon is spurious correlations which I've blogged about before. Spurious correlations occur when two unrelated phenomena show the same trend and are correlated, but one does not cause the other. Tyler Vigen has a great website that shows many spurious correlations.

Let's imaging you're a political science researcher. You have access to large amounts of economic data and you can direct your graduate students to find more. What you can do is trawl through your data set to find economic or other indicators that correlate with election results. To build your model, you weight each factor differently, for example, inflation might have a weighting of 0.7 and unemployment 0.9. Or you could even have time-varying weights. You can then test your model against existing election results and publish your forecast for the next election cycle. This process is almost guaranteed to find spurious correlations and produce models that don't forecast very accurately. 

Forecasting using odd data happens elsewhere, but usually, more entertainingly. Paul the Octopus had a good track record of forecasting the 2010 World Cup and other football results - Wikipedia says he had an 85.7% success rate. How was he so successful? Probably dumb luck. Bear in mind, many animals have been used for forecasting and we only hear about the successful ones.

(Paul the Octopus at work. Image source: Wikimedia Commons. License: Creative Commons.)

To put it simply, models build with economic data alone are highly susceptible to error because there is no evidence voters consider economic factors in the way that proponents of these models suggest. 

All models are wrong - some are useful

The statistician George Box is supposed to have said, "all models are wrong, some are useful". The idea is simple, the simplifications involved in model building often reduce their fidelity, but some models produce useful (actionable) results. All election forecast models are just that, forecast models that may be right or wrong. The question is, how useful are they? 

Let's imagine that a fundamental model was an accurate forecaster. We would have to accept that campaigns had little or no effect on the outcome. But this is clearly at odds with reality. The polling data indicates that the course of the 2016 US presidential election changed course in the closing weeks of the campaign. Perhaps most famously, the same thing happened in 1948. One of the key issues in the 2004 US presidential election was the 'war on terror'. This isn't an economic effect and it's not at all clear how it could be reduced to a number.

In other words, election results depend on more than economic effects and may depend on factors that are hard to quantify.

To attempt to quantify these effects, we could turn to opinion polls. In 2004, we could have asked voters about their view of the war on terror and we could have factored that into a fundamentals model. But why not just ask them how they intend to vote?

(Paul the Octopus died and was memorialized by a statue. How many other forecasters will get statues? Image Source: Wikimedia Commons. Author: Christophe95. License: Creative Commons.)

Where I stand

I'm reluctant to throw the baby out with the bathwater. I think fundamentals may have some effect, but it's heavily moderated by other factors and what happens during the campaign. Maybe their best use might be to give politicians some idea of factors that might be important in a campaign. But as the UK Highway Code says of the green traffic light, it doesn't mean go, it means "proceed with caution".

If you liked this post, you might like these ones

No comments:

Post a Comment