Monday, August 3, 2020

Sampling the goods: how opinion polls are made

How opinion polls work on the ground

I worked as a street interviewer for an opinion polling organization and I know how opinion polls are made and executed. In this blog post, I'm going to explain how opinion polls were run on the ground,  educate you on why polls can go wrong, and illustrate how difficult it is to run a valid poll. I'm also going to tell you why everything you learned from statistical textbooks about polling is wrong.

(Image Credit: Wikimedia Commons, License: Public Domain)

Random sampling is impossible

In my experience, this is something that's almost never mentioned in statistics textbooks but is a huge issue in polling. If they talk about sampling at all, textbooks assume random sampling, but that's not what happens.

Random sampling sounds wonderful in theory, but in practice, it can be very hard; people aren't beads in an urn. How do you randomly select people on the street or on the phone - what's the selection mechanism? How do you guard against bias? Let me give you some real examples.

Imagine you're a street interviewer. Where do you stand to take your random sample? If you take your sample outside the library, you'll get a biased sample. If you take it outside the factory gates, or outside a school, or outside a large office complex, or outside a playground, you'll get another set of biases. What about time of day? The people out on the streets at 7am are different from the people at 10am and different from the people at 11pm.

Similar logic applies to phone polls. If you call landlines only, you'll get one set of biases. If you call people during working hours, your sample will be biased (is the mechanic fixing a car going to put down their power tool to talk to you?). But calling outside of office hours means you might not get shift workers or parents putting their kids to bed. The list goes on.

You might be tempted to say, do all the things: sample at 10am, 3pm, and 11pm; sample outside the library, factory, and school; call on landlines and mobile phones, and so on, but what about the cost? How can you keep opinion polls affordable? How do you balance calls at 10am with calls at 3pm?

Because there are very subtle biases in "random" samples, most of the time, polling organizations don't do wholly 'random' sampling.

Sampling and quotas

If you can't get a random sample, you'd like your sample to be representative of a population. Here, representative means that it will behave in the same way as the population for the topics you're interested in, for example, voting in the same way or buying butter in the same way. The most obvious way of sampling is demographics: age and gender etc.

Let's say you were conducting a poll in a town to find out residents' views on a tax increase. You might find out the age and gender demographics of the town and sample people in a representative way so that the demographics of your sample match the demographics of the town. In other words, the proportion of men and women in your sample matches that of the town, the age distribution matches that of the town, and so on.

(US demographics. Image credit: Wikimedia Commons. License: Public domain)

In practice, polling organizations use a number of sampling factors depending on the survey. They might include sampling by:

  • Gender
  • Age
  • Ethnicity
  • Income
  • Social class or employment category 
  • Education 

but more likely, some combination of them.

In practice, interviewers may be given a sheet outlining the people they should interview, for example, so many women aged 45-50, so many people with degrees, so many people earning over $100,000, and so on. This is often called a quota. Phone interviews might be conducted on a pre-selected list of numbers, with guidance on how many times to call back, etc.

Some groups of people can be very hard to reach, and of course, not everyone answers questions. When it comes to analysis time, the results are weighted to correct bias.  For example, if the survey could only reach 75% of its target for men aged 20-25, the results for men in this category might be weighted by 4/3.

Who do you talk to?

Let's imagine you're a street interviewer,  you have your quota to fulfill, and you're interviewing people on the street, who do you talk to? Let me give you a real example from my polling days; I needed a man aged 20-25 for my quota. On the street, I saw what looked like a typical and innocuous student, but I also saw an aggressive-looking skinhead in full skinhead clothing and boots. Who would you choose to interview?

(Image credit: XxxBaloooxxx via Wikimedia Commons. License: Creative Commons.)

Most people would choose the innocuous student, but that's introducing bias. You can imagine multiple interviewers making similar decisions resulting in a heavily biased sample. To counter this problem, we were given guidance on who to select, for example, we were told to sample every seventh person or to take the first person who met our quota regardless of their appearance. This at least meant we were supposed to ask the skinhead, but of course, whether he chose to reply or not is another matter.

The rules sometimes led to absurdity. I did a survey where I was supposed to interview every 10th person who passed by. One man volunteered, but I said no because he was the 5th person. He hung around so long that eventually, he became the 10th person to pass me by. Should I have interviewed him? He met the rules and he met my sampling quota.

I came across a woman who was exactly what I needed for my quota. She was a care worker who had been on a day trip with severely mentally handicapped children and was in the process of moving them from the bus to the care home. Would you take her time to interview her? What about the young parent holding his child when I knocked on the door? The apartment was clearly used for recent drug-taking. Would you interview him? 

As you might expect, interviewers interpreted the rules more flexibly as the deadline approached and as it got later in the day. I once interviewed a very old man whose wife answered all the questions for him. This is against the rules, but he agreed with her answers, it was getting late, and I needed his gender/age group/employment status for my quota.

The company sent out supervisors to check our work on the streets, but of course, supervisors weren't there all the time, and they tended to vanish after 5pm anyway.

The point is, when it comes to it, there's no such thing as random sampling. Even with quotas and other guided selection methods, there are a thousand ways for bias to creep into sampling and the biases can be subtle. The sampling methodology one company uses will be different from another company's, which means their biases will not be the same.

What does the question mean?

One of the biggest lessons I learned was the importance of clear and unambiguous questions, and the unfortunate creativity of the public. All of the surveys I worked on had clearly worded questions, and to me, they always seemed unambiguous. But once you hit the streets, it's a different world. I've had people answer questions with the most astonishing levels of interpretation and creativity; regrettably, their interpretations were almost never what the survey wanted. 

What surprised me was how willing people were to answer difficult questions about salary and other topics. If the question is worded well (and I know all the techniques now!), you can get strangers to tell you all kinds of things. In almost all cases, I got people to tell me their age, and when required, I got salary levels from almost everyone.

A well-worded question led to a revelation that shocked me and shook me out of my complacency.  A candidate had unexpectedly just lost an election in the East End of London and the polling organization I worked for had been contracted to find out why. To help people answer one of the questions, I had a card with a list of reasons why the candidate lost, including the option: "The candidate was not suitable for the area." A lot of people chose that as their reason. I was naive and didn't know what it meant, but at the end of the day, I interviewed a white man in pseudo-skinhead clothes, who told me exactly what it meant. He selected "not suitable for the area" as his answer and added: "She was black, weren't she?".

The question setters weren't naive. They knew that people would hesitate before admitting racism was the cause, but by carefully wording the question and having people choose from options, they provided a socially acceptable way for people to answer the question.

Question setting requires real skill and thought.

(Oddly. there are very few technical resources on wording questions well. The best I've found is: "The Art of Asking Questions", by Stanley Le Baron Payne, but the book has been out of print for a long time.)

Order, order

Question order isn't accidental either, you can bias a survey by the order you ask questions. Of course, you have to avoid leading questions. The textbook example is survey questions on gun control. Let's imagine there were two surveys with these questions:

Survey 1:
  • Are you concerned about violent crime in your neighborhood?
  • Do you think people should be able to protect their families?
  • Do you believe in gun control?
Survey 2:
  • Are you concerned about the number of weapons in society?
  • Do you think all gun owners secure their weapons?
  • Do you believe in gun control?

What answers do you think you might get?

As well as avoiding bias, question order is important to build trust, especially if the topic is a sensitive one. The political survey I did in the East End of London was very carefully constructed to build the respondent's trust to get to the key 'why' question. This was necessary for other surveys too. I did a survey on police recruitment, but as I'm sure you're aware, some people are very suspicious of the police. Once again, the survey was constructed so the questions that revealed it was about police recruitment came later on after the interviewer (me!) had built some trust with the respondent.

How long is the survey?

This is my favorite story from my polling days. I was doing a survey on bus transport in London and I was asked to interview people waiting for a bus. The goal of the survey was to find out where people were going so London could plan for new or changed bus routes. For obvious reasons, the set of questions were shorter than usual, but in practice, not short enough; a big fraction of my interviews were cut short because the bus turned up! In several cases, I was asking questions as people were getting on the bus, and in a few cases, we had a shouted back and forth to finish the survey before their bus pulled off out of earshot.

(Image credit: David McKay via Wikimedia Commons. License: Creative Commons)

To avoid exactly this sort of problem, most polling organizations use pilot surveys. These are test surveys done on a handful of people to debug the survey. In this case, the pilot should have uncovered the fact that the survey was too long, but regrettably, it didn't.

(Sometime later, I designed and executed a survey in Boston. I did a pilot survey and found that some of my questions were confusing and I could shorten the survey by using a freeform question rather than asking for people to choose from a list. In any survey of more than a handful of respondents, I strongly recommend running a pilot - especially if you don't have a background in polling.)

The general lesson for any survey is to keep it as short as possible and understand the circumstances people will be in when you're asking them questions.

What it all means - advice for running surveys

Surveys are hard. It's hard to sample right, it's hard to write questions well, and it's hard to order questions to avoid bias. 

Over the years, I've sat in meetings when someone has enthusiastically suggested a survey. The survey could be a HR survey of employees, or a marketing survey of customers, or something else. Usually, the level of enthusiasm is inversely related to survey experience. The most enthusiastic people are often very resistant to advice about question phrasing and order, and most resistant of all to the idea of a pilot survey. I've seen a lot of enthusiastic people come to grief because they didn't listen. 

If you're thinking about running a survey, here's my advice.

  • Make your questions as clear and unambiguous as you can. Get someone who will tell you you're wrong to review them.
  • Think about how you want the questions answered. Do you want freeform text, multiple choice, or a scale? Surprisingly, in some cases, free form can be faster than multiple choice.
  • Keep it short.
  • Always run a pilot survey. 

What it means - understanding polling results

Once you understand that polling organizations use customized sampling methodologies, you can understand why polling organizations can get the results wrong. To put it simply, if their sampling methodology misses a crucial factor, they'll get biased results. The most obvious example is state-level polling in the US 2016 Presidential Election, but there are a number of other polls that got very different results from the actual election. In a future blog post, I'll look at why the 2016 polls were so wrong and why polls were wrong in other cases too.

If you liked this post, you might like these ones

No comments:

Post a Comment