Monday, August 17, 2020

Poll-axed: disastrously wrong opinion polls

Getting it really, really wrong

On occasion, election opinion polls have got it very, very wrong. I'm going to talk about some of their biggest blunders and analyze why they messed up so badly. There are lessons about methodology, hubris, and humility in forecasting.

 (Image credit: secretlondon123, Source: Wikimedia Commons, License: Creative Commons)

The Literary Digest - size isn't important

The biggest, baddest, and boldest polling debacle happened in 1936, but it still has lessons for today. The Literary Digest was a mass-circulation US magazine published from 1890 to 1938. In 1920, it started printing presidential opinion polls, which over the years acquired a good reputation for accuracy [Squire], so much so that they boosted the magazine's subscriptions. Unfortunately, its 1936 opinion poll sank the ship.

(Source: Wikimedia Commons. License: Public Domain)

The 1936 presidential election was fought between Franklin D. Roosevelt (Democrat), running for re-election, and his challenger Alf Landon (Republican).  The backdrop was the ongoing Great Depression and the specter of war in Europe. 

The Literary Digest conducted the largest-ever poll up to that time, sending surveys to 10 million people and receiving 2.3 million responses; even today, this is orders of magnitude larger than typical opinion polls. Through the Fall of 1936, they published results as their respondents returned surveys; the magazine didn't interpret or weight the surveys in any way [Squire]. After 'digesting' the responses, the Literary Digest confidently predicted that Landon would easily beat Roosevelt. Their reasoning was that the poll was so big it couldn't possibly be wrong; after all, the statistical margin of error was tiny.

Unfortunately for them, Roosevelt won handily. In reality, 'handily' is putting it mildly: he won a landslide victory (523 electoral college votes to 8).

So what went wrong? The Literary Digest sampled its own readers, people who were on lists of car owners, and people who had telephones at home. In the Great Depression, this meant their sample was not representative of the US voting population; the people they sampled were much wealthier. The poll also suffered from non-response bias: the people in favor of Landon were enthusiastic, filled in the surveys, and returned them; the Roosevelt supporters, less so. Unfortunately for the Literary Digest, Roosevelt's supporters weren't so lethargic on election day and turned up in force for him [Lusinchi, Squire]. No matter the size of the Literary Digest's sample, their methodology baked in bias, so it was never going to give an accurate forecast.

Bottom line: survey size can't make up for sampling bias.
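To see why, here's a minimal Python simulation. The numbers are invented for illustration (they are not the actual 1936 figures): it compares a huge poll drawn from a biased sampling frame with a small poll drawn at random.

```python
import random

random.seed(1936)

# Illustrative numbers only: 55% of the electorate favors candidate A,
# but A's supporters are far less likely to be reached or to respond.
TRUE_SUPPORT_A = 0.55

def biased_poll(n, response_rate_a=0.35, response_rate_b=1.0):
    """Draw n responses from a frame that under-represents A supporters."""
    responses = []
    while len(responses) < n:
        is_a = random.random() < TRUE_SUPPORT_A
        rate = response_rate_a if is_a else response_rate_b
        if random.random() < rate:
            responses.append(is_a)
    return sum(responses) / n

def random_poll(n):
    """Draw n responses uniformly at random from the electorate."""
    return sum(random.random() < TRUE_SUPPORT_A for _ in range(n)) / n

huge_but_biased = biased_poll(1_000_000)   # enormous sample, flawed frame
small_but_fair = random_poll(1_000)        # tiny sample, random selection

print(f"Biased poll of 1,000,000: A on {huge_but_biased:.1%} (margin of error ~0.1%)")
print(f"Random poll of 1,000:     A on {small_but_fair:.1%} (margin of error ~3%)")
# The huge poll is precisely wrong; the small random poll is roughly right.
```

The bias term doesn't shrink as the sample grows, so piling on more responses just gives you a more precise estimate of the wrong number.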

Sampling bias is an ongoing issue for pollsters. Factors that matter a great deal in one election might not matter in another, and pollsters have to estimate what will be important for voting so they know whom to select. For example, having a car or a phone might not correlate with voting intention in most elections, until one election comes along where it correlates very strongly. The Literary Digest's sampling method was crude, but it worked fine in previous elections. Unfortunately, in 1936 the flaws in their methodology made a big difference, and they called the election wrongly as a result. Fast-forwarding to 2016, flaws in sampling methodology led pollsters to underestimate support for Donald Trump.

Sadly, the Literary Digest never recovered from this misstep and folded two years later. 

Dewey defeats Truman - or not

The spectacular implosion of the 1936 Literary Digest poll gave impetus to the more 'scientific' polling methods of George Gallup and others [Igo]. But even these scientific polls came undone in the 1948 US presidential election. 

The election was held not long after the end of World War II and was between the incumbent, Harry S. Truman (Democrat), and his main challenger, Thomas E. Dewey (Republican). At the start of the election campaign, Dewey was the favorite over the increasingly unpopular Truman. While Dewey ran a low-key campaign, Truman led a high-energy, high-intensity campaign.

The main opinion polling companies of the time (Gallup, Roper, and Crossley) firmly predicted a Dewey victory. The Crossley Poll of 15 October 1948 put Dewey ahead in 27 states [Topping]. In fact, their results were so strongly in favor of Dewey that some polling organizations stopped polling altogether before the election.

The election result? Truman won convincingly. 

A few newspapers were so convinced that Dewey had won that they went to press with a Dewey victory announcement, leading to one of the most famous election pictures of all time.

(Source: Truman Library)

What went wrong?

As far as I can tell, there were two main causes of the pollsters' errors:

  • Undecided voters breaking for Truman. Pollsters had assumed that undecided voters would split their votes evenly between the candidates, which wasn't true then, and probably isn't true today (the sketch after this list shows how much this assumption can matter).
  • Voters changing their minds or deciding who to vote for later in the campaign. If you stop polling late in the campaign, you're not going to pick up last-minute electoral changes. 
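Here's a toy Python calculation, with made-up numbers rather than the real 1948 polling figures, showing how the treatment of undecided voters can flip a forecast.

```python
# Illustrative final poll: Dewey 46%, Truman 44%, undecided 10% (made-up numbers).
dewey, truman, undecided = 0.46, 0.44, 0.10

# Assumption the pollsters effectively made: undecideds split evenly.
even = (dewey + undecided / 2, truman + undecided / 2)

# Alternative assumption: undecideds break 30/70 towards the incumbent.
uneven = (dewey + 0.30 * undecided, truman + 0.70 * undecided)

print(f"Even split:  Dewey {even[0]:.0%}, Truman {even[1]:.0%}")
print(f"70/30 break: Dewey {uneven[0]:.0%}, Truman {uneven[1]:.0%}")
# A 2-point Dewey lead becomes a Truman win once the undecideds break his way.
```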

Just as in 1936, there was commercial fallout; for example, 30 newspapers threatened to cancel their contracts with Gallup.

As a result of this fiasco, the polling industry regrouped and moved towards random sampling and polling late into the election campaign.

US presidential election 2016

For the general public, this is the best-known example of polls getting the result wrong. There's a lot to say about what happened in 2016, so much in fact, that I'm going to write a blog post on this topic alone. It's not the clear-cut case of wrongness it first appears to be.

(Image credit: Michael Vadon, Source: Wikimedia Commons, License: Creative Commons)

For now, I'll just give you some hints: as in the Literary Digest example, sampling was one of the principal causes, exacerbated by late changes in the electorate's voting decisions. White voters without college degrees voted much more heavily for Donald Trump than for Hillary Clinton, and in 2016 opinion pollsters didn't control for education, leading them to underestimate Trump's support in key states. Polling organizations are learning from this mistake and changing their methodology for 2020. Back in 2016, a significant chunk of the electorate seemed to make up their minds in the last few weeks of the campaign, a shift that earlier polling missed.
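To make the education point concrete, here's a minimal sketch of weighting a poll by education (simple post-stratification). All the numbers are invented for illustration, and real pollsters weight on many more variables than this.

```python
# A raw poll that over-samples college graduates (all numbers illustrative).
respondents = (
    [{"edu": "college", "vote": "Clinton"}] * 420 +
    [{"edu": "college", "vote": "Trump"}] * 280 +
    [{"edu": "no_college", "vote": "Clinton"}] * 140 +
    [{"edu": "no_college", "vote": "Trump"}] * 160
)

# Assumed education mix of the actual electorate.
population_share = {"college": 0.40, "no_college": 0.60}

# Education mix of the sample.
sample_share = {
    edu: sum(r["edu"] == edu for r in respondents) / len(respondents)
    for edu in population_share
}

# Each respondent is weighted so the sample's education mix matches the population's.
def weight(r):
    return population_share[r["edu"]] / sample_share[r["edu"]]

def weighted_share(candidate):
    total_weight = sum(weight(r) for r in respondents)
    return sum(weight(r) for r in respondents if r["vote"] == candidate) / total_weight

raw_trump = sum(r["vote"] == "Trump" for r in respondents) / len(respondents)
print(f"Unweighted Trump share:         {raw_trump:.0%}")
print(f"Education-weighted Trump share: {weighted_share('Trump'):.0%}")
# Weighting by education moves Trump from 44% to 48% in this toy example.
```

In this toy example, skipping the weighting step understates Trump's support by four points; that's the shape of the error the state polls made in 2016.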

It seems the more things change, the more they remain the same.

Anarchy in the UK?

There are several properties of the US electoral system that make it well suited to opinion polling, but other electoral systems don't share these properties. To understand why polling is harder in the UK than in the US, we have to understand the differences between a US presidential election and a UK general election.

  • The US is a national two-party system; the UK is a multi-party democracy with regional parties. In some constituencies, there are three or more parties that could win.
  • In the US, the president is elected and there are only two main candidates; in the UK, the electorate votes for Members of Parliament (MPs), who select the prime minister. This means the candidates are different in each constituency and local factors can matter a great deal.
  • In the US, there are 50 states plus Washington DC, meaning 51 geographical areas; in the UK, there are currently 650 constituencies, meaning 650 geographical areas to survey.

These factors make forecasting UK elections harder than US elections, so perhaps we should be a bit more forgiving. But before we forgive, let's have a look at some of the UK's greatest election polling misses.

General elections

The 1992 UK general election was a complete disaster for the UK's opinion polling companies [Smith]. Every poll in the run-up to the election forecast either a hung parliament (meaning no single party has a majority) or a slim majority for the Labour party. Even the exit polls forecast a hung parliament. Unfortunately for the pollsters, the Conservative party won a comfortable working majority of seats. Bob Worcester, the best-known UK pollster at the time, said the polls were more wrong "...than in any time since their invention 50 years ago" [Jowell].

Academics proposed several possible causes [Jowell, Smith]:
  • "Shy Tories". The idea here is that people were too ashamed to admit they intended to vote Conservative, so they lied or didn't respond at all. 
  • Don't knows/won't say. In any poll, some people are undecided or won't reveal their preference. To predict an election, you have to model how these people will vote, or at least have a reliable way of dealing with them, and that wasn't the case in 1992 [Lynn].
  • Voter turnout. Different groups of people turn out to vote in different proportions. The pollsters didn't handle differential turnout very well, leading them to overstate the proportion of Labour votes (a toy turnout adjustment is sketched after this list).
  • Quota sampling methods. Polling organizations use quota-based sampling to try to get a representative sample of the population. If the sampling is biased, then the results will be biased [Lynn, Smith].
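As a rough illustration of the turnout point, here's a toy differential-turnout adjustment in Python. The intention and turnout figures are invented for illustration; they are not 1992 data.

```python
# Raw voting intention among all respondents (illustrative numbers).
intention = {"Conservative": 0.40, "Labour": 0.42, "Other": 0.18}

# Assumed probability that each group actually turns out to vote.
turnout = {"Conservative": 0.80, "Labour": 0.65, "Other": 0.70}

# Re-weight each party's support by its supporters' likelihood of voting.
adjusted = {party: intention[party] * turnout[party] for party in intention}
total = sum(adjusted.values())
forecast = {party: share / total for party, share in adjusted.items()}

for party in intention:
    print(f"{party}: raw {intention[party]:.0%} -> turnout-adjusted {forecast[party]:.0%}")
# A small Labour lead in raw intention becomes a Conservative lead once
# differential turnout is accounted for.
```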

As in the US in 1948, the pollsters re-grouped, licked their wounds and revised their methodologies.

After the disaster of 1992, surely the UK pollsters wouldn't get it wrong again? Moving forward to 2015, the pollsters got it wrong again!

In the 2015 election, the Conservative party won a working majority. This was a complex, multi-party election with strong regional effects, all of which were well-known at the time. As in 1992, the pollsters predicted a hung parliament and their subsequent humiliation was very public. Once again, there were various inquiries into what went wrong [Sturgis]. Shockingly, the "official" post-mortem once again found that sampling was the cause of the problem. The polls over-represented Labour supporters and under-represented Conservative supporters, and the techniques used by pollsters to correct for sampling issues were inadequate [Sturgis]. The official finding was backed up by independent research which further suggested pollsters had under-represented non-voters and over-estimated support for the Liberal Democrats [Melon].

Once again, the industry had a rethink.

There was another election in 2019. This time, the pollsters got it almost exactly right.

It's nice to see the polling industry getting a big win, but part of me was hoping Lord Buckethead or Count Binface would sweep to victory in 2019.

(Count Binface. Source: https://www.countbinface.com/)

(Lord Buckethead. Source: https://twitter.com/LordBuckethead/status/1273601785094078464/photo/1. Not the hero we need, but the one we deserve.)

EU referendum

This was the other great electoral shock of 2016. The polls forecast a narrow 'Remain' victory, but the reality was a narrow 'Leave' win. Very little has been published on why the pollsters got it wrong in 2016, but what little was published suggests that the survey method may have been important. The industry didn't initiate a broad inquiry; instead, individual polling companies were asked to investigate their own processes.

Other countries

There have been a series of polling failures in other countries. Here are just a few:

Takeaways

In university classrooms around the world, students are taught probability theory and statistics. It's usually an antiseptic view of the world, and opinion poll examples are often presented as straightforward math problems, stripped of the complex realities of sampling. Unfortunately, this leaves students unprepared for the chaos and uncertainty of the real world.

Polling is a complex, messy business. Sampling governs the success or failure of polls, but sampling is something of a dark art, and it's hard to assess its accuracy during a campaign. In 2020, do you know the sampling methodologies used by the different polling companies? Do you know who's more accurate than whom?

Every so often, the polling companies take a beating. They regroup, fix the issues, and survey again. They get more accurate, and after a while, the press forgets about the failures and talks in glowing terms about polling accuracy, and maybe even about doing away with the expensive business of elections in favor of polls. Then another debacle happens. The reality is, the polls are both more accurate and less accurate than the press would have you believe.

As Yogi Berra didn't say, "it's tough to make predictions, especially about the future".  


References

[Igo] '"A gold mine and a tool for democracy": George Gallup, Elmo Roper, and the business of scientific polling,1935-1955', Sarah Igo, J Hist Behav Sci. 2006;42(2):109-134

[Jowell] "The 1992 British Election: The Failure of the Polls", Roger Jowell, Barry Hedges, Peter Lynn, Graham Farrant and Anthony Heath, The Public Opinion Quarterly, Vol. 57, No. 2 (Summer, 1993), pp. 238-263

[Lusinchi] '“President” Landon and the 1936 Literary Digest Poll: Were Automobile and Telephone Owners to Blame?', Dominic Lusinchi, Social Science History 36:1 (Spring 2012)

[Lynn] "How Might Opinion Polls be Improved?: The Case for Probability Sampling", Peter Lynn and Roger Jowell, Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 159, No. 1 (1996), pp. 21-28 

[Melon] "Missing Nonvoters and Misweighted Samples: Explaining the 2015 Great British Polling Miss", Jonathan Mellon, Christopher Prosser, Public Opinion Quarterly, Volume 81, Issue 3, Fall 2017, Pages 661–687

[Smith] "Public Opinion Polls: The UK General Election, 1992",  T. M. F. Smith, Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 159, No. 3 (1996), pp. 535-545

[Squire] "Why the 1936 Literary Digest poll failed", Peverill Squire, Public Opinion Quarterly, 52, 125-133, 1988

[Sturgis] "Report of the Inquiry into the 2015 British general election opinion polls", Patrick Sturgis,  Nick Baker, Mario Callegaro, Stephen Fisher, Jane Green, Will Jennings, Jouni Kuha, Ben Lauderdale, Patten Smith

[Topping] '‘‘Never argue with the Gallup Poll’’: Thomas Dewey, Civil Rights and the Election of 1948', Simon Topping, Journal of American Studies, 38 (2004), 2, 179–198
