How you say it is important
Volume
Pitch (frequency)
- High pitch represents energy, excitement, and high emotion
- Low pitch represents calm seriousness
Speed
- High speed represents energy, excitement, and high emotion
- Low speed represents seriousness
I was told this story years ago by those who were supposedly involved. I worked for the company concerned, but I'm not sure if it makes the story truer or not. In any case, it's a fun story with a subtle moral.
The IT department of a big company was installing servers in an older building that didn't have dedicated server rooms or closets. Because there was nowhere else to put it, they installed a server in an office. The server was a typical, innocuous beige box.
After a few weeks, there were reports of trouble in the building. Fairly regularly, e-mail and other services would go down for five minutes at about the same time of day. The IT department investigated. They checked the server configurations, but the configurations weren't to blame. They checked the cabling, but that was just fine. They checked network cards and routers, but everything seemed to be working as expected. During the whole investigation, the network kept on going down at around 10:30am for about 5 minutes, but there seemed to be no hardware or software cause.
In desperation, the IT department posted someone to sit by the server all day to watch what happened.
At about 10:30am, a secretary filled an electric kettle with water. She walked into the office, unplugged the server, and plugged in her kettle. She made herself and her boss a nice pot of tea. When the tea was brewing, she unplugged the kettle and plugged the server back in. She then went to enjoy her break and have her cup of tea.
So the mystery was solved. The IT department put a notice on the server's plug asking people not to unplug it, and labeled the box as a server. The secretary found another, less convenient place to plug in her kettle, and the world moved on.
I was told this story by the IT department. In their telling, the villain of the piece was the secretary, who they thought should have known better. At the time, I accepted this and laughed with them. Now, I disagree. In my view, the villain was the IT department and the innocent party was the secretary.
No-one told the secretary that the server was important; it wasn't marked in any way. She wasn't a technical person and had no way of knowing what the box was or what it did. Because of the age of the building, the server was in an office instead of in a server closet, so there were lots of non-technical people in the area near the server. The IT department knew what the server was and knew that there were non-technical people around, but they chose not to mark the server or communicate to anyone what it was or how important it was to keep it plugged in.
Bottom line: don't blame people for not being psychic - it's your responsibility to communicate.
| State | Election spread (Trump - Clinton) | Poll aggregator spread (Trump - Clinton) |
|---|---|---|
| Florida | 1.2% | -1.5% |
| North Carolina | 3.66% | -1% |
| Pennsylvania | 0.72% | -2.5% |
| Michigan | 0.23% | -2.5% |
| Wisconsin | 0.77% | < -5% |
"Real change in vote preference during the final week or so of the campaign. About 13 percent of voters in Wisconsin, Florida and Pennsylvania decided on their presidential vote choice in the final week, according to the best available data. These voters broke for Trump by near 30 points in Wisconsin and by 17 points in Florida and Pennsylvania."
On occasion, election opinion polls have got it very, very wrong. I'm going to talk about some of their biggest blunders and analyze why they messed up so very badly. There are lessons about methodology, hubris, and humility in forecasting.
(Image credit: secretlondon123, Source: Wikimedia Commons, License: Creative Commons)
The biggest, baddest, and boldest polling debacle happened in 1936, but it still has lessons for today. The Literary Digest was a mass-circulation US magazine published from 1890 to 1938. In 1920, it started printing presidential opinion polls, which over the years acquired a good reputation for accuracy [Squire], so much so that they boosted the magazine's subscriptions. Unfortunately, its 1936 opinion poll sank the ship.
The 1936 presidential election was fought between Franklin D. Roosevelt (Democrat), running for re-election, and his challenger Alf Landon (Republican). The backdrop was the ongoing Great Depression and the specter of war in Europe.
The Literary Digest conducted the largest ever poll up to that time, sending surveys to 10 million people and receiving 2.3 million responses; even today, this is orders of magnitude larger than typical opinion polls. Through the Fall of 1936, they published results as their respondents returned surveys; the magazine didn't interpret or weight the surveys in any way [Squire]. After 'digesting' the responses, the Literary Digest confidently predicted that Landon would easily beat Roosevelt. Their reasoning was that the poll was so big it couldn't possibly be wrong; after all, the statistical margin of error was tiny.
Unfortunately for them, Roosevelt won handily. In fact, 'handily' is putting it mildly: he won a landslide victory (523 electoral college votes to 8).
So what went wrong? The Literary Digest sampled its own readers, people who were on lists of car owners, and people who had telephones at home. In the Great Depression, this meant their sample was not representative of the US voting population; the people they sampled were much wealthier. The poll also suffered from non-response bias: the people in favor of Landon were enthusiastic and filled in and returned their surveys, while the Roosevelt supporters were less so. Unfortunately for the Literary Digest, Roosevelt's supporters weren't so lethargic on election day and turned out in force for him [Lusinchi, Squire]. No matter the size of the Literary Digest's sample, their methodology baked in bias, so it was never going to give an accurate forecast.
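To put a number on how misleading that tiny margin of error was, here's a quick back-of-the-envelope calculation. It's a minimal sketch that assumes simple random sampling, which is precisely what the Literary Digest didn't have:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Naive 95% margin of error for a simple random sample of size n,
    using the worst-case proportion p = 0.5."""
    return z * math.sqrt(p * (1 - p) / n)

# The Literary Digest received about 2.3 million responses
print(f"Margin of error: {margin_of_error(2_300_000):.3%}")  # about 0.065%
```

A sampling error of well under a tenth of a percentage point, against a miss so large it flipped a landslide: the problem was the sampling frame, not the sample size.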
Bottom line: survey size can't make up for sampling bias.
Sadly, the Literary Digest never recovered from this misstep and folded two years later.
The spectacular implosion of the 1936 Literary Digest poll gave impetus to the more 'scientific' polling methods of George Gallup and others [Igo]. But even these scientific polls came undone in the 1948 US presidential election.
The election was held not long after the end of World War II and was between the incumbent, Harry S. Truman (Democrat), and his main challenger, Thomas E. Dewey (Republican). At the start of the election campaign, Dewey was the favorite over the increasingly unpopular Truman. While Dewey ran a low-key campaign, Truman led a high-energy, high-intensity campaign.
The main opinion polling companies of the time (Gallup, Roper, and Crossley) firmly predicted a Dewey victory. The Crossley Poll of 15 October 1948 put Dewey ahead in 27 states [Topping]. In fact, their results were so strongly in favor of Dewey that some polling organizations stopped polling altogether before the election.
The election result? Truman won convincingly.
A few newspapers were so convinced that Dewey had won that they went to press with a Dewey victory announcement, leading to one of the most famous election pictures of all time.
What went wrong?
As far as I can tell, there were two main causes of the pollsters' errors:
- Quota sampling. Instead of selecting respondents at random, interviewers were given quotas for the types of people to interview, which baked bias into the samples.
- Stopping polling too early. The pollsters stopped surveying weeks before election day, so they missed a late swing towards Truman.
Just as in 1936, there was commercial fallout; for example, 30 newspapers threatened to cancel their contracts with Gallup.
As a result of this fiasco, the polling industry regrouped and moved towards random sampling and polling late into the election campaign.
For the general public, this is the best-known example of polls getting the result wrong. There's a lot to say about what happened in 2016, so much in fact, that I'm going to write a blog post on this topic alone. It's not the clear-cut case of wrongness it first appears to be.
For now, I'll just give you some hints: as in the Literary Digest example, sampling was one of the principal causes, exacerbated by late changes in the electorate's voting decisions. White voters without college degrees voted much more heavily for Donald Trump than for Hillary Clinton, and in 2016, opinion pollsters didn't control for education, leading them to underestimate Trump's support in key states. Polling organizations are learning from this mistake and changing their methodology for 2020. Back in 2016, a significant chunk of the electorate seemed to make up their minds in the last few weeks of the campaign, a shift that earlier polling missed.
It seems the more things change, the more they remain the same.
There are several properties of the US electoral system that make it very well suited for opinion polling but other electoral systems don't have these properties. To understand why polling is harder in the UK than in the US, we have to understand the differences between a US presidential election and a UK general election.
These factors make forecasting UK elections harder than US elections, so perhaps we should be a bit more forgiving. But before we forgive, let's have a look at some of the UK's greatest election polling misses.
The 1992 UK general election was a complete disaster for the opinion polling companies in the UK [Smith]. Every poll in the run-up to the election forecast either a hung parliament (meaning, no single party has a majority) or a slim majority for the Labour party. Even the exit polls forecast a hung parliament. Unfortunately for the pollsters, the Conservative party won a comfortable working majority of seats. Bob Worcester, the best known UK pollster at the time, said the polls were more wrong "...than in any time since their invention 50 years ago" [Jowell].
As in the US in 1948, the pollsters re-grouped, licked their wounds, and revised their methodologies.
After the disaster of 1992, surely the UK pollsters wouldn't get it wrong again? Moving forward to 2015, the pollsters got it wrong again!
In the 2015 election, the Conservative party won a working majority. This was a complex, multi-party election with strong regional effects, all of which were well known at the time. As in 1992, the pollsters predicted a hung parliament and their subsequent humiliation was very public. Once again, there were various inquiries into what went wrong [Sturgis]. Shockingly, the "official" post-mortem once again found that sampling was the cause of the problem. The polls over-represented Labour supporters and under-represented Conservative supporters, and the techniques used by pollsters to correct for sampling issues were inadequate [Sturgis]. The official finding was backed up by independent research which further suggested pollsters had under-represented non-voters and over-estimated support for the Liberal Democrats [Melon].
Once again, the industry had a re-think.
There was another election in 2019. This time, the pollsters got it almost exactly right.
It's nice to see the polling industry getting a big win, but part of me was hoping Lord Buckethead or Count Binface would sweep to victory in 2019.
This was the other great electoral shock of 2016. The polls forecast a narrow 'Remain' victory, but the reality was a narrow 'Leave' win. Very little has been published on why the pollsters got it wrong in 2016, but what little was published suggests that the survey method may have been important. The industry didn't initiate a broad inquiry; instead, individual polling companies were asked to investigate their own processes.
There have been a series of polling failures in other countries. Here are just a few:
In university classrooms around the world, students are taught probability theory and statistics. It's usually an antiseptic view of the world, and opinion poll examples are often presented as straightforward math problems, stripped of the complex realities of sampling. Unfortunately, this leaves students unprepared for the chaos and uncertainty of the real world.
Polling is a complex, messy issue. Sampling governs the success or failure of polls, but sampling is something of a dark art and it's hard to assess its accuracy during a campaign. In 2020, do you know the sampling methodologies used by the different polling companies? Do you know who's more accurate than who?
Every so often, the polling companies take a beating. They re-group, fix the issues, and survey again. They get more accurate, and after a while, the press forgets about the failures and talks in glowing terms about polling accuracy, maybe even about doing away with the expensive business of elections in favor of polls. Then another debacle happens. The reality is, the polls are both more accurate and less accurate than the press would have you believe.
As Yogi Berra didn't say, "it's tough to make predictions, especially about the future".
[Igo] '"A gold mine and a tool for democracy": George Gallup, Elmo Roper, and the business of scientific polling,1935-1955', Sarah Igo, J Hist Behav Sci. 2006;42(2):109-134
[Jowell] "The 1992 British Election: The Failure of the Polls", Roger Jowell, Barry Hedges, Peter Lynn, Graham Farrant and Anthony Heath, The Public Opinion Quarterly, Vol. 57, No. 2 (Summer, 1993), pp. 238-263
[Lusinchi] '“President” Landon and the 1936 Literary Digest Poll: Were Automobile and Telephone Owners to Blame?', Dominic Lusinchi, Social Science History 36:1 (Spring 2012)
[Lynn] "How Might Opinion Polls be Improved?: The Case for Probability Sampling", Peter Lynn and Roger Jowell, Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 159, No. 1 (1996), pp. 21-28
[Melon] "Missing Nonvoters and Misweighted Samples: Explaining the 2015 Great British Polling Miss", Jonathan Mellon, Christopher Prosser, Public Opinion Quarterly, Volume 81, Issue 3, Fall 2017, Pages 661–687
[Smith] "Public Opinion Polls: The UK General Election, 1992", T. M. F. Smith, Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 159, No. 3 (1996), pp. 535-545
[Squire] "Why the 1936 Literary Digest poll failed", Peverill Squire, Public Opinion Quarterly, 52, 125-133, 1988
[Sturgis] "Report of the Inquiry into the 2015 British general election opinion polls", Patrick Sturgis, Nick Baker, Mario Callegaro, Stephen Fisher, Jane Green, Will Jennings, Jouni Kuha, Ben Lauderdale, Patten Smith
[Topping] '‘‘Never argue with the Gallup Poll’’: Thomas Dewey, Civil Rights and the Election of 1948', Simon Topping, Journal of American Studies, 38 (2004), 2, 179–198
How likely is it that your favorite candidate will win the election? If your candidate is ahead of their opponent by 5%, are they certain to win? What about 10%? Or if they're down by 2%, are they out of the race? Victory probabilities are related to how far ahead or behind a candidate is in the polls, but the relationship isn't a simple one and has some surprising consequences as we'll see.
Let's imagine there's a hard-fought election between candidates A and B. A newspaper publishes an opinion poll a few days before the election:
Should candidate A's supporters pop the champagne and candidate B's supporters start crying?
Let's use some standard notation. From the theory of proportions, the mean and standard error for the proportion of respondents who chose A are:
\[ p_a = {n_a \over n} \] \[ \sigma_a = { \sqrt {{p_a(1-p_a)} \over n}} \]
where \( n_a \) is the number of respondents who chose A and \( n \) is the total number of respondents. If the proportion of people who answered candidate B is \(p_b\), then obviously, \( p_a + p_b = 1\).
Election probability theory usually uses the spread, \(d\), which is the difference between the candidates: \[d = p_a - p_b = 2p_a - 1 \] From statistics theory, the standard error of \( d \) is: \[\sigma_d = 2\sigma_a\] (These relationships are easy to prove, but a bit tedious; a quick sketch of the proof is below.)
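For the record, here's the sketch. It's nothing more than the definition of variance applied to \(d = 2p_a - 1\):

\[ \sigma_d^2 = Var(d) = Var(2p_a - 1) = 4 Var(p_a) = 4\sigma_a^2 \implies \sigma_d = 2\sigma_a \]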
Obviously, for a candidate to win, their spread, \(d\), must be > 0.
From the central limit theorem (CLT), we know \(p_a\) and \(p_b\) are normally distributed, and also from the CLT, we know \(d\) is normally distributed. The next step towards a probability is to look at the normal distribution of candidate A's spread. The chart below shows the normal distribution with mean \(d\) and standard error \(\sigma_d\).
As with most things with the normal distribution, it's easier if we transform everything to the standard normal using the transformation: \[z = {(x - d) \over \sigma_d}\] The chart below is the standard normal representation of the same data.
The standard normal form of this distribution is a probability density function. We want the probability that \(d>0\) which is the light green shaded area, so it's time to turn to the cumulative distribution function (CDF), and its complement, the complementary cumulative distribution function (CCDF).
The CDF gives us the probability that we will get a result less than or equal to some value I'll label \(z_c\). We can write this as: \[P(z \leq z_c) = CDF(z_c) = \phi(z_c) \] The CCDF is defined so that: \[1 = P(z \leq z_c) + P(z > z_c)= CDF(z_c) + CCDF(z_c) = \phi(z_c) + \phi_c(z_c)\] Which is a long-winded way of saying the CCDF is defined as: \[CCDF(z_c) = P(z \gt z_c) = \phi_c(z_c)\]
The CDF is the integral of the PDF, and from standard textbooks: \[ \phi(z_c) = {1 \over 2} \left( 1 + erf\left( {z_c \over \sqrt2} \right) \right) \] We want the CCDF, \(P(z > z_c)\), which is simply 1 - CDF.
Our critical value occurs when the spread is zero. The transformation to the standard normal in this case is: \[z_c = {(x - d) \over \sigma_d} = {-d \over \sigma_d}\] We can write the CCDF as: \[\phi_c(z_c) = 1 - \phi(z_c) = 1- {1 \over 2} \left( 1 + erf\left( {z_c \over \sqrt2} \right) \right) \] \[= 1 - {1 \over 2} \left( 1 + erf\left( {-d \over {\sigma_d\sqrt2}} \right) \right)\] We can easily show that: \[erf(x) = -erf(-x)\] Using this relationship, we can rewrite the above equation as: \[ P(d > 0) = {1 \over 2} \left( 1 + erf\left( {d \over {\sigma_d\sqrt2}} \right) \right)\]
What we have is an equation that takes data we've derived from an opinion poll and gives us a probability of a candidate winning.
For candidate A:
For candidate B:
Obviously, the two probabilities add up to 1. But note the probability for candidate A. Did you expect a number like this? A 4 percentage point lead in the polls giving a 90% chance of victory?
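If you want to check that number yourself, here's a minimal sketch of the formula in code. The poll figures in the example (a 52%-48% split from 1,000 respondents) are my own illustration of a 4 percentage point lead, not taken from any real poll:

```python
import math

def win_probability(p_a, n):
    """Probability candidate A wins, given their poll proportion p_a and
    poll size n, assuming the only source of error is random sampling."""
    d = 2 * p_a - 1                               # spread, d = p_a - p_b
    sigma_d = 2 * math.sqrt(p_a * (1 - p_a) / n)  # standard error of the spread
    # P(d > 0) = 1/2 * (1 + erf(d / (sigma_d * sqrt(2))))
    return 0.5 * (1 + math.erf(d / (sigma_d * math.sqrt(2))))

p_win_a = win_probability(0.52, 1000)
print(f"Candidate A: {p_win_a:.1%}")       # roughly 90%
print(f"Candidate B: {1 - p_win_a:.1%}")   # roughly 10%
```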
Because the probability is based on \( erf \), you can quite quickly get to highly probable events as I'm going to show in an example. I've plotted the probability for candidate A for various leads (spreads) in the polls. Most polls nowadays tend to have about 800 or so respondents (some are more and some are a lot less), so I've taken 800 as my poll size. Obviously, if the spread is zero, the election is 50%:50%. Note how quickly the probability of victory increases as the spread increases.
What about the size of the poll, how does that change things? Let's fix the spread to 2% and vary the size of the poll from 200 to 2,000 (the usual upper and lower bounds on poll sizes). Here's how the probability varies with poll size for a spread of 2%.
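Here's a rough sketch of the calculations behind both charts, reusing the same win_probability function as above (swap the print loops for matplotlib calls if you want the actual plots):

```python
import math

def win_probability(p_a, n):
    """P(candidate A wins) from their poll proportion p_a and poll size n."""
    d = 2 * p_a - 1
    sigma_d = 2 * math.sqrt(p_a * (1 - p_a) / n)
    return 0.5 * (1 + math.erf(d / (sigma_d * math.sqrt(2))))

# Probability of victory vs spread, for a poll of 800 respondents
for spread_pct in range(0, 11, 2):
    p_a = 0.5 + spread_pct / 200   # a spread of s% means p_a = 50% + s/2 %
    print(f"spread {spread_pct:2d}%: P(win) = {win_probability(p_a, 800):.1%}")

# Probability of victory vs poll size, for a fixed 2% spread
for n in range(200, 2001, 200):
    print(f"n = {n:4d}: P(win) = {win_probability(0.51, n):.1%}")
```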
Now imagine you're a cynical and seasoned poll analyst working on candidate A's campaign. The young and excitable intern comes rushing in, shouting to everyone that A is ahead in the polls! You ask the intern two questions, and then, like the Oracle at Delphi, you predict happiness or not. What two questions do you ask?
There are two elephants in the room, and I've been avoiding talking about them. Can you guess what they are?
All of this analysis assumes the only source of error is random noise. In other words, that there's no systemic bias. In the real world, that's not true. Polls aren't wholly based on random sampling, and the sampling method can introduce bias. I haven't modeled it at all in this analysis. There are at least two systemic biases:
Understanding and allowing for bias is key to making a successful election forecast. This is an advanced topic for another blog post.
The other missing item is more subtle. It's undecided voters. Imagine there are two elections and two opinion polls. Both polls have 1,000 respondents.
Election 1:
The best source of election analysis I've read is in the book "Introduction to data science" and the associated edX course "Inference and modeling", both by Rafael Irizarry. The analysis in this blog post was culled from multiple books and websites, each of which only gave part of the story.
Here's a story about how something innocuous and low-level like serial numbers can damage your reputation and lose you business. I have advice on how to avoid it too!
Years ago, I worked for a specialty manufacturing company; its products were high-precision, low-volume, and expensive. The industry was cut-throat competitive, and commentary in the press was that not every manufacturer would survive; as a consequence, customer confidence was critical.
An overseas customer team came to us to design a specialty item. The company spent a week training them and helping them design what they wanted. Of course, the design was all on a CAD system with some templated and automated features. That's where the trouble started.
One of the overseas engineers spotted that a customer-based serial number was automatically included in the design. Unfortunately, the serial number was 16, implying that the overseas team was only the 16th customer (which was true). This immediately set off their alarm bells - a company with only 16 customers was probably not going to survive the coming industry shake-out. The executive team had to smooth things over, which included lying about the serial numbers. As soon as the overseas team left, the company changed their system to start counting serial numbers from some high, but believable number (something like 857).
Here's the point: customers can infer a surprising amount from your serial numbers, especially your volume of business.
Years later, I was in a position where I was approving vendor invoices. Some of my vendors didn't realize what serial numbers could reveal, and I ended up gaining insight into their financial state. Here are the rules I used to figure out what was going on financially, which was very helpful when it came to negotiating contract renewals.
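As an illustration of the kind of rule I mean, here's a minimal sketch. The invoice numbers and dates are made up; the point is that if a vendor numbers invoices sequentially, the gaps between the numbers on your own invoices reveal their total invoicing volume across all their customers:

```python
from datetime import date

# Hypothetical invoice numbers and dates from invoices a vendor sent us
invoices = [
    (date(2020, 1, 15), 1042),
    (date(2020, 4, 15), 1190),
    (date(2020, 7, 15), 1321),
]

# The difference in invoice numbers divided by the elapsed time gives the
# vendor's invoicing rate; comparing rates over successive periods shows
# whether their business is growing or shrinking.
(first_date, first_no), (last_date, last_no) = invoices[0], invoices[-1]
rate_per_day = (last_no - first_no) / (last_date - first_date).days
print(f"Estimated invoices issued per month: {rate_per_day * 30:.0f}")
```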
The accounting tool suppliers are wise to this, and many tools offer options for invoice numbering that stop this kind of analysis (e.g. starting invoices from a random number, random invoice increments, etc.). But not all vendors use these features and serial number analysis works surprisingly often.
Serial number analysis has been used in wartime too. In World War II, the Allied powers wanted to understand the capacity of Nazi industry to build tanks. Fortunately, German tanks were given consecutive serial numbers (this is a simplification, but it was mostly true). Allied troops were given the job of recording the serial numbers of captured or destroyed tanks, which they reported back. Statisticians were able to infer changes in Nazi tank production capabilities through serial number analysis, which after the war was found to be mostly correct. This is known as the German tank problem, and you can read a lot more about it online.
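For the curious, the classic frequentist estimator for the German tank problem fits in a couple of lines. This sketch uses made-up serial numbers, not the wartime data:

```python
def estimate_total(serials):
    """German tank problem: estimate the size N of a production run from
    the serial numbers of a random sample of captured items, assuming
    serial numbers run 1..N."""
    k = len(serials)
    m = max(serials)
    return m + m / k - 1   # minimum-variance unbiased estimator

captured = [19, 40, 68, 94, 117]   # hypothetical serial numbers
print(f"Estimated production run: {estimate_total(captured):.0f}")  # about 139
```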
The bottom line is simple: serial numbers can give away more about your business than you think. They can tell your customers how big your customer base is, and whether it's expanding or contracting; crucial information when it comes to renegotiating contracts. Pay attention to your serial numbers and invoices!