Tuesday, September 8, 2020

Can you believe the polls?

Opinion polls have known sin

Polling companies have run into trouble over the years in ways that render some poll results doubtful at best. Here are just a few of the problems:

  • Fraud allegations.
  • Leading questions.
  • Choosing not to publish results, or picking methodologies so that polls agree.

Running reliable polls is hard work that takes a lot of expertise and commitment. Sadly, companies sometimes get it wrong for several reasons:

  • Ineptitude.
  • Lack of money. 
  • Telling people what they want to hear. 
  • Fakery.

In this blog post, I'm going to look at some high-profile cases of dodgy polling and I'm going to draw some lessons from what happened.

(Are some polls real or fake? Image source: Wikimedia Commons. Image credit: Basile Morin. License: Creative Commons.)

Allegations of fraud part 1 - Research 2000

Backstory

Research 2000 started operating around 1999 and picked up some solid early clients. In 2008, The Daily Kos contracted with Research 2000 for polling during the upcoming US elections. In early 2010, Nate Silver at FiveThirtyEight gave Research 2000 an F rating and stopped using their polls. As a direct result, The Daily Kos terminated their contract and later took legal action to reclaim fees, alleging fraud.

Nate Silver's and others' analysis

After the 2010 Senate elections, Nate Silver analyzed polling results for 'house effects' and found a bias towards the Democratic party in Research 2000's polls. These kinds of biases appear all the time and vary from election to election. The Research 2000 bias was large (at 4.4%) but not crazy; for example, Rasmussen's Republican bias was larger. Nonetheless, for many reasons, he graded Research 2000 an F and stopped using their polling data.
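To make the idea of a 'house effect' concrete, here's a minimal sketch. The baseline choice and all the numbers are my own illustration, not Silver's actual methodology:

```python
import statistics

def house_effect(pollster_margins, baseline_margins):
    """Average signed difference between a pollster's margins
    (Democrat minus Republican) and a baseline for the same races,
    e.g., the final results or an average of all pollsters."""
    return statistics.mean(p - b for p, b in zip(pollster_margins, baseline_margins))

# Three hypothetical races: this pollster's D-R margins vs. the actual results
pollster = [0.06, -0.02, 0.10]
actual = [0.02, -0.05, 0.05]
print(f"house effect: {house_effect(pollster, actual):+.1%} toward the Democrats")
# house effect: +4.0% toward the Democrats
```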

In June 2010, The Daily Kos publicly dropped Research 2000 as their pollster, based on Nate Silver's ranking and more detailed discussions with him. Three weeks later, The Daily Kos sued Research 2000 for fraud. Once the legal action was public, Nate Silver blogged more details of his misgivings about Research 2000's results, which earned him a cease-and-desist letter from Research 2000's lawyers. Undeterred, Silver published yet more details. To summarize his findings: he was seeing data inconsistent with real polling - the distribution of the numbers was wrong. As it turned out, Research 2000 was in financial trouble around this time and was negotiating low-cost or free polling for The Daily Kos in exchange for accelerated payments.

Others were onto Research 2000 too. Three statisticians analyzed some of the polling data and found patterns inconsistent with real polling - again, real poll results tend to be distributed in certain ways, and some of the Research 2000 polls were not.
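One widely reported anomaly gives a flavor of their findings: in Research 2000's cross-tabs, the men's and women's percentages were almost always both even or both odd. In genuine polling, the two subsamples are nearly independent, so their parities should match only about half the time. Here's a quick simulation of my own (with made-up sample sizes, not the statisticians' actual code) showing the expected rate:

```python
import random

def parity_match_rate(trials=10_000, n_men=300, n_women=300, p=0.45):
    """Simulate polls and count how often the rounded men's and women's
    percentages are both even or both odd. For independent subsamples,
    the match rate should be close to 50%."""
    matches = 0
    for _ in range(trials):
        men = round(100 * sum(random.random() < p for _ in range(n_men)) / n_men)
        women = round(100 * sum(random.random() < p for _ in range(n_women)) / n_women)
        matches += (men % 2) == (women % 2)
    return matches / trials

print(f"parity match rate: {parity_match_rate():.2f}")  # ~0.50 for genuine polls
```

Research 2000's published cross-tabs matched parity far more often than this, which is very hard to square with genuine interviews.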

The result

The lawsuit progressed with strong evidence in The Daily Kos' favor. Perhaps unsurprisingly, the parties reached a settlement, with Research 2000 agreeing to pay The Daily Kos. Research 2000 effectively shut down after the agreement.

Allegations of fraud part 2 - Strategic Vision, LLC

Backstory

This story requires some care in the telling. At the time, there were two companies called Strategic Vision: one well-respected and wholly innocent, the other not so much. The innocent and well-respected company is Strategic Vision, based in San Diego; they have nothing to do with this story. The other company is Strategic Vision, LLC, based in Atlanta. From now on, when I talk about Strategic Vision, LLC, I mean the Atlanta company only.

To maintain trust in the polling industry, the American Association for Public Opinion Research (AAPOR) has guidelines and asks polling companies to disclose some details of their polling methodologies. The AAPOR rarely censures companies, and its censures don't have the force of law, but public shaming is effective, as we'll see.

What happened

In 2008, the AAPOR asked 21 polling organizations for details of their 2008 pre-election polling, including polling for the New Hampshire Democratic primary. Their goal was to quality-check the state of polling in the industry.

One polling company didn't respond for a year, despite repeated requests. As a result, in September 2009, the AAPOR published a public censure of Strategic Vision, LLC, which you can read here.

It's very unusual for the AAPOR to issue a censure, so the story was widely reported at the time, for example in The New York Times, The Hill, and The Wall Street Journal. Strategic Vision, LLC's public response to the press coverage was that they intended to comply but hadn't had time to submit their data. They denied any wrongdoing.

Subsequent to the censure, Nate Silver looked more closely at Strategic Vision LLC's results. Initially, he asked some very pointed and blunt questions. In a subsequent post, Nate Silver used Benford's Law to investigate Strategic Vision LLC's data, and based on his analysis he stated there was a suggestion of fraud - more specifically, that the data had been made up. In a post the following day, Nate Silver offered some more analysis and a great example of using Benford's Law in practice. Again, Strategic Vision LLC vigorously denied any wrongdoing.
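For context, Benford's Law says that in many naturally occurring datasets the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with 1 and fewer than 5% start with 9; fabricated numbers often fail this test. Here's a minimal sketch of a Benford-style check (my own illustration of the technique, not Silver's actual analysis):

```python
import math
from collections import Counter

def benford_chi_square(values):
    """Chi-square statistic comparing the leading-digit distribution of
    `values` against Benford's Law. Compare the result against the
    chi-square distribution with 8 degrees of freedom; large values
    suggest the data departs from Benford's Law."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts.get(d, 0) - expected) ** 2 / expected
    return chi2

# Usage: pass a list of numbers (e.g., reported counts) and compare the
# statistic against a critical value - roughly 15.5 at the 5% level.
```

A failed Benford-style test isn't proof of fraud on its own, which is why Silver framed his findings as a suggestion of fraud rather than a certainty.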

One of the most entertaining parts of this story is a citizenship poll conducted by Strategic Vision, LLC among high school students in Oklahoma. The poll was commissioned by the Oklahoma Council on Public Affairs, a think tank. The poll asked eight straightforward questions, for example:

  • Who was the first US president?
  • What are the two main political parties in the US?

and so on. The results were dismal: only 23% of students answered George Washington, and only 43% knew Democratic and Republican. Not one student in 1,000 got all eight questions correct, which is extraordinary; even if only a small fraction of students knew all the answers, you'd expect at least a few perfect scores in a sample that size. These types of polls are beloved of the press; there are easy headlines to be squeezed from students doing poorly, especially on issues around citizenship. Unfortunately, the poll results looked odd at best. Nate Silver analyzed the distribution of the results and concluded that something didn't seem right - the data was not distributed as you might expect. To their great credit, when the Oklahoma Council on Public Affairs became aware of problems with the poll, they removed it from their website and put up a page explaining what happened. They subsequently terminated their relationship with Strategic Vision, LLC.
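To see why zero perfect scores out of 1,000 is so implausible, here's a back-of-the-envelope calculation (my own assumptions, not Silver's actual analysis):

```python
# If a fraction q of 1,000 students independently know all eight answers,
# the probability of seeing zero perfect scores is (1 - q) ** 1000.
for q in (0.005, 0.01, 0.02):
    p_zero = (1 - q) ** 1000
    print(f"if {q:.1%} of students ace the test, P(no perfect scores) = {p_zero:.1e}")
# Even if only 0.5% of students knew all eight answers, the chance of
# seeing no perfect scores at all in 1,000 students is under 1%.
```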

In 2010, a University of Cincinnati professor gave Strategic Vision, LLC the "Phantom of the Soap Opera" award on the Media Ethics site. That site also has a little more background on the odd tale of Strategic Vision, LLC's offices, or lack of them.

The results

Strategic Vision, LLC continued to deny any wrongdoing. They never supplied their data to the AAPOR and they stopped publishing polls in late 2009. They've disappeared from the polling scene.

Other polling companies

Nate Silver rated other pollsters an F and stopped using them too. Not all the tales are as lurid as the ones I've described here: some involve accusations of fraud and fakery, while others are methodology disputes with no suggestion of impropriety. Here's a list of the pollsters Nate Silver rates an F.

Anarchy in the UK

It's time to cross the Atlantic and look at polling shenanigans in the UK. The UK hasn't seen the rise and fall of dodgy polling companies, but it has seen dodgy polling methodologies.

Herding

Let's imagine you commission a poll on who will win the UK general election, and you get a result different from the other polls. Do you publish your result? Now imagine you're a polling analyst with a choice of methodologies for analyzing your results. Do you do what everyone else does and get similar results, or do you do your own thing and risk getting different results from everyone else?

Sadly, there are many cases where contrarian polls weren't published, and there is evidence that polling companies made very similar analysis choices to deliberately give similar results. This leads to a phenomenon called herding, where published poll results tend to cluster together. Sometimes this is harmless, but sometimes it leads to multiple companies calling an election wrongly.

In 2015, the UK polls predicted a hung parliament, but the result was a working majority for the Conservative party. The subsequent industry poll analysis identified herding as one of the causes of the polling miss. 
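There's a simple statistical smell test for herding: independent polls should disagree by at least their sampling error, so if published polls cluster much more tightly than that, something other than chance is at work. Here's a rough sketch with made-up numbers:

```python
import math
import statistics

def herding_check(poll_shares, sample_sizes):
    """Compare the observed spread of poll results with the spread expected
    from sampling error alone. An observed spread far below the expected
    sampling error is a warning sign of herding."""
    expected_se = statistics.mean(
        math.sqrt(p * (1 - p) / n) for p, n in zip(poll_shares, sample_sizes)
    )
    observed_sd = statistics.stdev(poll_shares)
    return observed_sd, expected_se

# Five hypothetical polls of 1,000 people each, all clustered around 34%
shares = [0.34, 0.335, 0.34, 0.345, 0.34]
observed, expected = herding_check(shares, [1000] * 5)
print(f"observed spread: {observed:.4f}, expected from sampling alone: {expected:.4f}")
# The observed spread (~0.0035) is well below the sampling error (~0.015),
# which is suspiciously tight for five independent polls.
```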

This isn't the first time herding has been an issue with UK polling and it's occasionally happened in the US too.

Leading questions

The old British TV show 'Yes, Prime Minister' has a great piece of dialog neatly showing how leading questions work in surveys. 'Yes, Prime Minister' is a comedy, but UK polls have suffered from leading questions for a while.

The oldest example I've come across dates from the 1970s and the original European Economic Community membership referendum. Apparently, one poll put the following questions to two different groups:

  • France, Germany, Italy, Holland, Belgium and Luxembourg approved their membership of the EEC by a vote of their national parliaments. Do you think Britain should do the same?
  • Ireland, Denmark and Norway are voting in a referendum to decide whether to join the EEC. Do you think Britain should do the same?

These questions are highly leading and unsurprisingly elicited the expected positive result in both (contradictory) cases.

Moving forward in time to 2012, leading questions, or at least artful question wording, came up again. The background is press regulation: after a series of scandals in which the press behaved shockingly badly, the UK government considered press regulation to curb abuses. Various parties were for or against various aspects of press regulation, and they commissioned polls to support their viewpoints.

The polling company YouGov published a poll, paid for by The Media Standards Trust, which showed that 79% of people thought there should be an independent government-sanctioned regulator to investigate complaints against the press. Sounds comprehensive and definitive.

But another poll at about the same time, this one paid for by The Sun newspaper, found that only 24% of the British public wanted a government regulator for the press. The polling company? Also YouGov!

The difference between the 79% and the 24% came down to careful question wording - a nuance that was lost in the subsequent press reporting of the results. You can listen to the story on the BBC's More or Less program, which gives the wording of the questions used.

What does all this mean?

The quality of the polling company is everything

The established, reputable companies got that way through years of high-quality, reliable work. They will make mistakes from time to time, but they learn from them. When you're considering whether or not to believe a poll, ask who conducted it and consider the reputation of the company behind it.

With some exceptions, the press is unreliable

None of the cases of polling impropriety were caught by the press. In fact, the press has a perverse incentive to promote the wild and outlandish, which favors results from dodgy pollsters. Be aware that a newspaper that paid for a poll is not going to criticize its own paid-for product, especially when it's getting headlines out of it.

Most press coverage of polls focuses on what the results mean, not on how accurate they are or what their sources of bias might be. When these things are discussed at all, it's in a partisan way (disagreeing with a poll because the writer holds a different political view). I've never seen the kind of analysis Nate Silver does anywhere else in the press - and that's to the great detriment of the press and its credibility.

Vested interests

A great way to get press coverage is to commission a poll and publish the results, especially if you can ask leading questions. Sometimes the press gets very lazy and doesn't even report who commissioned a poll, even when there's plainly a vested interest.

Anytime you read a survey, ask who paid for it and what the exact questions were.

Outliers are outliers, not trends

Outlier poll results get more play than results in line with other pollsters. As I write this in early September 2020, Biden is about 7% ahead in the polls. Let's imagine two survey results coming in early September:

  • Biden ahead by 8%.
  • Trump ahead by 3%.

Which do you think would get more space in the media? Probably the shocking result, even though the dull result may be more likely. Trump-supporting journalists might start writing articles about a campaign resurgence, while Biden-supporting journalists might talk about his lead slipping and his campaign losing momentum. In reality, the Trump +3 poll might be an anomaly and probably doesn't justify much attention until it's backed by other polls.
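Sampling error alone makes a Trump +3 poll very hard to produce if Biden's true lead is 7 points. Here's a rough calculation, using my own assumed sample size and two-party vote shares:

```python
import math

n = 1000                       # assumed poll sample size
p_biden = 0.535                # two-party share consistent with a 7-point lead
true_margin = 2 * p_biden - 1  # = 0.07

# Standard error of the margin for a simple random sample
se_margin = 2 * math.sqrt(p_biden * (1 - p_biden) / n)

# How many standard errors away is a Trump +3 result (margin = -0.03)?
z = (true_margin - (-0.03)) / se_margin
print(f"standard error of margin: {se_margin:.3f}, z-score: {z:.1f}")
# z is above 3, so pure sampling error almost never produces such a poll:
# either the race has genuinely moved, or the poll is an outlier.
```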

Bottom line: outlier polls are probably outliers and you shouldn't set too much store by them.

There's only one Nate Silver

Nate Silver seems like a one-man army, rooting out false polling and pollsters. He's stood up to various legal threats over the years. It's a good thing that he exists, but it's a bad thing that there's only one of him. It would be great if the press took inspiration from him and brought a more nuanced, skeptical, and statistical view to polls.

Can you believe the polls?

Let me close by answering my own question: yes, you can believe the polls, but within limits, and depending on who the pollster is.

Reading more

This blog post is one of a series of blog posts about opinion polls. 
