Wednesday, August 12, 2020

Who will win the election? Election victory probabilities from opinion polls

Polls to probabilities

How likely is it that your favorite candidate will win the election? If your candidate is ahead of their opponent by 5%, are they certain to win? What about 10%? Or if they're down by 2%, are they out of the race? Victory probabilities are related to how far ahead or behind a candidate is in the polls, but the relationship isn't a simple one and has some surprising consequences as we'll see.

Opinion poll example

Let's imagine there's a hard-fought election between candidates A and B. A newspaper publishes an opinion poll a few days before the election:

Candidate A: 52%
Candidate B: 48%
Sample size: 1,000

Should candidate A's supporters pop the champagne and candidate B's supporters start crying?

The spread and standard error

Let's use some standard notation. From the theory of proportions, the mean and standard error for the proportion of respondents who chose A is:

\[ p_a = {n_a \over n} \] \[ \sigma_a = { \sqrt {{p_a(1-p_a)} \over n}} \]

where $ n_a $ is the number of respondents who chose A and $ n $ is the total number of respondents. If the proportion of people who answered candidate B is $p_b$, then obviously, $ p_a + p_b = 1$.

Election probability theory usually uses the spread, $d$, which is the difference between the candidates: \[d = p_a - p_b = 2p_a - 1 \] From statistics theory, the standard error of $ d $ is: \[\sigma_d = 2\sigma_a\] (these relationships are easy to prove, but a bit tedious, if anyone asks, I'll show the proof.)

Obviously, for a candidate to win, their spread, $d$, must be > 0.

Everything is normal

From the central limit theorem (CLT), we know $p_a$ and $p_b$ are normally distributed, and also from the CLT, we know $d$ is normally distributed. The next step to probability is viewing the normal distribution for candidate A's spread. The chart below shows the normal distribution with mean $d$ and standard error $\sigma_d$.

As with most things with the normal distribution, it's easier if we transform everything to the standard normal using the transformation: \[z = {(x - d) \over \sigma_d}\] The chart below is the standard normal representation of the same data.

The standard normal form of this distribution is a probability density function. We want the probability that $d>0$ which is the light green shaded area, so it's time to turn to the cumulative distribution function (CDF), and its complement, the complementary cumulative distribution function (CCDF).

CDF and CCDF

The CDF gives us the probability that we will get a result less than or equal to some value I'll label $z_c$. We can write this as: \[P(z \leq z_c) = CDF(z_c) = \phi(z_c) \] The CCDF is defined so that: \[1 = P(z \leq z_c) + P(z > z_c)= CDF(z_c) + CCDF(z_c) = \phi(z_c) + \phi_c(z_c)\] Which is a long-winded way of saying the CCDF is defined as: \[CCDF(z_c) = P(z_c \gt 0) = \phi_c(z_c)\]

The CDF is the integral of the PDF, and from standard textbooks: \[ \phi(z_c) = {1 \over 2} \left( 1 + erf\left( {z_c \over \sqrt2} \right) \right) \] We want the CCDF, $P(z > z_c)$, which is simply 1 - CDF.

Our critical value occurs when the spread is zero. The transformation to the standard normal in this case is: \[z_c = {(x - d) \over \sigma_d} = {-d \over \sigma_d}\] We can write the CCDF as: \[\phi_c(z_c) = 1 - \phi(z_c) = 1- {1 \over 2} \left( 1 + erf\left( {z_c \over \sqrt2} \right) \right)\ \] \[= 1 - {1 \over 2} \left( 1 + erf\left( {-d \over {\sigma_d\sqrt2}} \right) \right)\] We can easily show that: \[erf(x) = -erf(-x)\] Using this relationship, we can rewrite the above equation as: \[ P(d > 0) = {1 \over 2} \left( 1 + erf\left( {d \over {\sigma_d\sqrt2}} \right) \right)\]

What we have is an equation that takes data we've derived from an opinion poll and gives us a probability of a candidate winning.

Probabilities for our example

For candidate A:

$n=1000$
$ p_a = {520 \over 1000} = 0.52 $
$\alpha_a = 0.016 $
$d = {{520 - 480} \over 1000} = 0.04$
$\alpha_d = 0.032$
$P(d > 0) = 90\%$

For candidate B:

$n=1000$
$ p_b = {480 \over 1000} = 0.48 $
$\alpha_b = 0.016 $
$d = {{480 - 520} \over 1000} = -0.04$
$\alpha_d = 0.032$
$P(d > 0) = 10\%$

Obviously, the two probabilities add up to 1. But note the probability for candidate A. Did you expect a number like this? A 4% point lead in the polls giving a 90% chance of victory?

Some consequences

Because the probability is based on $ erf $, you can quite quickly get to highly probable events as I'm going to show in an example. I've plotted the probability for candidate A for various leads (spreads) in the polls. Most polls nowadays tend to have about 800 or so respondents (some are more and some are a lot less), so I've taken 800 as my poll size. Obviously, if the spread is zero, the election is 50%:50%. Note how quickly the probability of victory increases as the spread increases.

What about the size of the poll, how does that change things? Let's fix the spread to 2% and vary the size of the poll from 200 to 2,000 (the usual upper and lower bounds on poll sizes). Here's how the probability varies with poll size for a spread of 2%.

Now imagine you're a cynical and seasoned poll analyst working on candidate A's campaign. The young and excitable intern comes rushing in, shouting to everyone that A is ahead in the polls! You ask the intern two questions, and then, like the Oracle at Delphi, you predict happiness or not. What two questions do you ask?

What's the spread?
What's the size of the poll?

What's missing

There are two elephants in the room, and I've been avoiding talking about them. Can you guess what they are?

All of this analysis assumes the only source of error is random noise. In other words, there's no systemic bias. In the real world, that's not true. Polls aren't wholly based on random sampling, and the sampling method can introduce bias. I haven't modeled it at all in this analysis. There are at least two systemic biases:

Pollster house effects arising from house sampling methods
Election effects arising from different population groups voting in different ways compared to previous elections.

Understanding and allowing for bias is key to making a successful election forecast. This is an advanced topic for another blog post.

The other missing item is more subtle. It's undecided voters. Imagine there are two elections and two opinion polls. Both polls have 1,000 respondents.

Election 1:

Candidate A chosen by 20%
Candidate B chosen by 10%
Undecided voters are 70%
Spread is 10%

Election 2:

Candidate A chosen by 55%
Candidate B chosen by 45%
Undecided voters are 0%
Spread is 10%

In both elections, the spread from the polls is 10%, so candidate A has the same higher chance of winning in both elections, but this doesn't seem right. Intuitively, we should be less certain about an election with a high number of undecided voters. Modeling undecided voters is a topic for another blog post!

Reading more

The best source of election analysis I've read is in the book "Introduction to data science" and the associated edX course "Inference and modeling", both by Rafael Irizarry. The analysis in this blog post was culled from multiple books and websites, each of which only gave part of the story.

If you liked this post, you might like these ones

Forecasting the 2020 election: a retrospective
What do presidential approval polls really tell us?
Fundamentally wrong? Using economic data as an election predictor - why I distrust forecasting models built on economic and other data
Can you believe the polls? - fake polls, leading questions, and other sins of opinion polling.
President Hilary Clinton: what the polls got wrong in 2016 and why they got it wrong - why the polls said Clinton would win and why Trump did.
Poll-axed: disastrously wrong opinion polls - a brief romp through some disastrously wrong opinion poll results.
Who will win the election? Election victory probabilities from opinion polls
Sampling the goods: how opinion polls are made - my experiences working for an opinion polling company as a street interviewer.
The electoral college for beginners - how the electoral college works

Monday, August 10, 2020

Serial killer! How to lose business by the wrong serial numbers

Serial numbers and losing business

Here's a story about how something innocuous and low-level like serial numbers can damage your reputation and lose you business. I have advice on how to avoid the problem too!

(Serial numbers can give away more than you think. Image source: Wikimedia Commons. License: Public Domain.)

Numbered by design

Years ago, I worked for a specialty manufacturing company, its products were high precision, low-volume, and expensive. The industry was cut-throat competitive, and commentary in the press was that not every manufacturer would survive; as a result, customer confidence was critical.

An overseas customer team came to us to design a specialty item. The company spent a week training them and helping them design what they wanted. Of course, the design was all on a CAD system with some templated and automated features. That's where the trouble started.

One of the overseas engineers spotted that a customer-based serial number was automatically included in the design. Unfortunately, the serial number was 16, implying that the overseas team was only the 16th customer (which was true). This immediately set off their alarm bells - a company with only 16 customers was probably not going to survive the coming industry shake-out. The executive team had to smooth things over, which included lying about the serial numbers. As soon as the overseas team left, the company changed its system to start counting serial numbers from some high, but believable number (something like 857).

Here's the point: customers can infer a surprising amount from your serial numbers, especially your volume of business.

Invoices

Years later, I was in a position where I was approving vendor invoices. Some of my vendors didn't realize what serial numbers could reveal, and I ended up gaining insight into their financial state. Here are the rules I used to figure out what was going on financially, which was very helpful when it came to negotiating contract renewals.

If the invoice is unnumbered, the vendor is very small and they're likely to have only a handful of customers. All accounting systems offer invoice generation and they all number/identify individual invoices. If the invoice doesn't have a serial number, the vendor's business is probably too small to warrant buying an accounting system, which means a very small number of customers.

Naive vendors will start invoice numbering from 1, or from a number like 1,000. You can infer size if they do this.

Many accounting systems will increment invoice numbers by 1 by default. If you're receiving regular invoices from a vendor, you can use this to infer their size too. If this month's invoice is 123456 and next month's is 123466, this might indicate 10 invoices in a month and therefore 10 customers. You can do this for a while and spot trends in a vendor's customer base, for example, if you see invoices incrementing by 100 and later by 110, this may be because the vendor has added 10 customers.

The accounting tool suppliers are wise to this, and many tools offer options for invoice numbering that stop this kind of analysis (e.g. starting invoices from a random number, random invoice increments, etc.). But not all vendors use these features and serial number analysis works surprisingly often.

(Destroyed German Tank. Image source: Wikimedia Commons. License: Public Domain)

The German tank problem

Serial number analysis has been used in wartime too. In World War II, the allied powers wanted to understand the capacity of Nazi industry to build tanks. Fortunately, German tanks were given consecutive serial numbers (this is a simplification, but it was mostly true). Allied troops were given the job of recording the serial numbers of captured or destroyed tanks which they reported back. Statisticians were able to infer changes in Nazi tank production capabilities through serial number analysis, which after the war was found to be mostly correct. This is known as the German tank problem and you can read a lot more about it online.

My advice

For invoices, it's simple:

Always number your invoices and start the numbering from some high, but believable, number.
Increment your invoices using an offset that's not 1. Use your accounting package's features to obfuscate how many invoices you're sending out by setting the invoice number appropriately.

For customer IDs, it's a bit harder:

Use customer IDs that do not give away how many customers you have (so there's no customer 1, customer 2 etc.). Review every artifact a customer might conceivable see and ensure there's nothing that gives away how many customers you have (this means checking folder numbers, checking any generated URLs etc.).

Simple things say a lot

The bottom line is simple: serial numbers can give away more about your business than you think. They can tell your customers how big your customer base is, and whether it's expanding or contracting; crucial information when it comes to renegotiating contracts. Pay attention to your serial numbers and invoices!

Monday, August 3, 2020

Sampling the goods: how opinion polls are made

How opinion polls work on the ground

I worked as a street interviewer for an opinion polling organization and I know how opinion polls are made and executed. In this blog post, I'm going to explain how opinion polls were run on the ground, educate you on why polls can go wrong, and illustrate how difficult it is to run a valid poll. I'm also going to tell you why everything you learned from statistical textbooks about polling is wrong.

(Image Credit: Wikimedia Commons, License: Public Domain)

Random sampling is impossible

In my experience, this is something that's almost never mentioned in statistics textbooks but is a huge issue in polling. If they talk about sampling at all, textbooks assume random sampling, but that's not what happens.

Random sampling sounds wonderful in theory, but in practice, it can be very hard; people aren't beads in an urn. How do you randomly select people on the street or on the phone - what's the selection mechanism? How do you guard against bias? Let me give you some real examples.

Imagine you're a street interviewer. Where do you stand to take your random sample? If you take your sample outside the library, you'll get a biased sample. If you take it outside the factory gates, or outside a school, or outside a large office complex, or outside a playground, you'll get another set of biases. What about time of day? The people out on the streets at 7am are different from the people at 10am and different from the people at 11pm.

Similar logic applies to phone polls. If you call landlines only, you'll get one set of biases. If you call people during working hours, your sample will be biased (is the mechanic fixing a car going to put down their power tool to talk to you?). But calling outside of office hours means you might not get shift workers or parents putting their kids to bed. The list goes on.

You might be tempted to say, do all the things: sample at 10am, 3pm, and 11pm; sample outside the library, factory, and school; call on landlines and mobile phones, and so on, but what about the cost? How can you keep opinion polls affordable? How do you balance calls at 10am with calls at 3pm?

Because there are very subtle biases in "random" samples, most of the time, polling organizations don't do wholly 'random' sampling.

Sampling and quotas

If you can't get a random sample, you'd like your sample to be representative of a population. Here, representative means that it will behave in the same way as the population for the topics you're interested in, for example, voting in the same way or buying butter in the same way. The most obvious way of sampling is demographics: age and gender etc.

Let's say you were conducting a poll in a town to find out residents' views on a tax increase. You might find out the age and gender demographics of the town and sample people in a representative way so that the demographics of your sample match the demographics of the town. In other words, the proportion of men and women in your sample matches that of the town, the age distribution matches that of the town, and so on.

(US demographics. Image credit: Wikimedia Commons. License: Public domain)

In practice, polling organizations use a number of sampling factors depending on the survey. They might include sampling by:

Gender
Age
Ethnicity
Income
Social class or employment category
Education

but more likely, some combination of them.

In practice, interviewers may be given a sheet outlining the people they should interview, for example, so many women aged 45-50, so many people with degrees, so many people earning over $100,000, and so on. This is often called a quota. Phone interviews might be conducted on a pre-selected list of numbers, with guidance on how many times to call back, etc.

Some groups of people can be very hard to reach, and of course, not everyone answers questions. When it comes to analysis time, the results are weighted to correct bias. For example, if the survey could only reach 75% of its target for men aged 20-25, the results for men in this category might be weighted by 4/3.

Who do you talk to?

Let's imagine you're a street interviewer, you have your quota to fulfill, and you're interviewing people on the street, who do you talk to? Let me give you a real example from my polling days; I needed a man aged 20-25 for my quota. On the street, I saw what looked like a typical and innocuous student, but I also saw an aggressive-looking skinhead in full skinhead clothing and boots. Who would you choose to interview?

(Image credit: XxxBaloooxxx via Wikimedia Commons. License: Creative Commons.)

Most people would choose the innocuous student, but that's introducing bias. You can imagine multiple interviewers making similar decisions resulting in a heavily biased sample. To counter this problem, we were given guidance on who to select, for example, we were told to sample every seventh person or to take the first person who met our quota regardless of their appearance. This at least meant we were supposed to ask the skinhead, but of course, whether he chose to reply or not is another matter.

The rules sometimes led to absurdity. I did a survey where I was supposed to interview every 10th person who passed by. One man volunteered, but I said no because he was the 5th person. He hung around so long that eventually, he became the 10th person to pass me by. Should I have interviewed him? He met the rules and he met my sampling quota.

I came across a woman who was exactly what I needed for my quota. She was a care worker who had been on a day trip with severely mentally handicapped children and was in the process of moving them from the bus to the care home. Would you take her time to interview her? What about the young parent holding his child when I knocked on the door? The apartment was clearly used for recent drug-taking. Would you interview him?

As you might expect, interviewers interpreted the rules more flexibly as the deadline approached and as it got later in the day. I once interviewed a very old man whose wife answered all the questions for him. This is against the rules, but he agreed with her answers, it was getting late, and I needed his gender/age group/employment status for my quota.

The company sent out supervisors to check our work on the streets, but of course, supervisors weren't there all the time, and they tended to vanish after 5pm anyway.

The point is, when it comes to it, there's no such thing as random sampling. Even with quotas and other guided selection methods, there are a thousand ways for bias to creep into sampling and the biases can be subtle. The sampling methodology one company uses will be different from another company's, which means their biases will not be the same.

What does the question mean?

One of the biggest lessons I learned was the importance of clear and unambiguous questions, and the unfortunate creativity of the public. All of the surveys I worked on had clearly worded questions, and to me, they always seemed unambiguous. But once you hit the streets, it's a different world. I've had people answer questions with the most astonishing levels of interpretation and creativity; regrettably, their interpretations were almost never what the survey wanted.

What surprised me was how willing people were to answer difficult questions about salary and other topics. If the question is worded well (and I know all the techniques now!), you can get strangers to tell you all kinds of things. In almost all cases, I got people to tell me their age, and when required, I got salary levels from almost everyone.

A well-worded question led to a revelation that shocked me and shook me out of my complacency. A candidate had unexpectedly just lost an election in the East End of London and the polling organization I worked for had been contracted to find out why. To help people answer one of the questions, I had a card with a list of reasons why the candidate lost, including the option: "The candidate was not suitable for the area." A lot of people chose that as their reason. I was naive and didn't know what it meant, but at the end of the day, I interviewed a white man in pseudo-skinhead clothes, who told me exactly what it meant. He selected "not suitable for the area" as his answer and added: "She was black, weren't she?".

The question setters weren't naive. They knew that people would hesitate before admitting racism was the cause, but by carefully wording the question and having people choose from options, they provided a socially acceptable way for people to answer the question.

Question setting requires real skill and thought.

(Oddly. there are very few technical resources on wording questions well. The best I've found is: "The Art of Asking Questions", by Stanley Le Baron Payne, but the book has been out of print for a long time.)

Order, order

Question order isn't accidental either, you can bias a survey by the order you ask questions. Of course, you have to avoid leading questions. The textbook example is survey questions on gun control. Let's imagine there were two surveys with these questions:

Survey 1:

Are you concerned about violent crime in your neighborhood?
Do you think people should be able to protect their families?
Do you believe in gun control?

Survey 2:

Are you concerned about the number of weapons in society?
Do you think all gun owners secure their weapons?
Do you believe in gun control?

What answers do you think you might get?

As well as avoiding bias, question order is important to build trust, especially if the topic is a sensitive one. The political survey I did in the East End of London was very carefully constructed to build the respondent's trust to get to the key 'why' question. This was necessary for other surveys too. I did a survey on police recruitment, but as I'm sure you're aware, some people are very suspicious of the police. Once again, the survey was constructed so the questions that revealed it was about police recruitment came later on after the interviewer (me!) had built some trust with the respondent.

How long is the survey?

This is my favorite story from my polling days. I was doing a survey on bus transport in London and I was asked to interview people waiting for a bus. The goal of the survey was to find out where people were going so London could plan for new or changed bus routes. For obvious reasons, the set of questions were shorter than usual, but in practice, not short enough; a big fraction of my interviews were cut short because the bus turned up! In several cases, I was asking questions as people were getting on the bus, and in a few cases, we had a shouted back and forth to finish the survey before their bus pulled off out of earshot.

(Image credit: David McKay via Wikimedia Commons. License: Creative Commons)

To avoid exactly this sort of problem, most polling organizations use pilot surveys. These are test surveys done on a handful of people to debug the survey. In this case, the pilot should have uncovered the fact that the survey was too long, but regrettably, it didn't.

(Sometime later, I designed and executed a survey in Boston. I did a pilot survey and found that some of my questions were confusing and I could shorten the survey by using a freeform question rather than asking for people to choose from a list. In any survey of more than a handful of respondents, I strongly recommend running a pilot - especially if you don't have a background in polling.)

The general lesson for any survey is to keep it as short as possible and understand the circumstances people will be in when you're asking them questions.

What it all means - advice for running surveys

Surveys are hard. It's hard to sample right, it's hard to write questions well, and it's hard to order questions to avoid bias.

Over the years, I've sat in meetings when someone has enthusiastically suggested a survey. The survey could be a HR survey of employees, or a marketing survey of customers, or something else. Usually, the level of enthusiasm is inversely related to survey experience. The most enthusiastic people are often very resistant to advice about question phrasing and order, and most resistant of all to the idea of a pilot survey. I've seen a lot of enthusiastic people come to grief because they didn't listen.

If you're thinking about running a survey, here's my advice.

Make your questions as clear and unambiguous as you can. Get someone who will tell you you're wrong to review them.
Think about how you want the questions answered. Do you want freeform text, multiple choice, or a scale? Surprisingly, in some cases, free form can be faster than multiple choice.
Keep it short.
Always run a pilot survey.

What it means - understanding polling results

Once you understand that polling organizations use customized sampling methodologies, you can understand why polling organizations can get the results wrong. To put it simply, if their sampling methodology misses a crucial factor, they'll get biased results. The most obvious example is state-level polling in the US 2016 Presidential Election, but there are a number of other polls that got very different results from the actual election. In a future blog post, I'll look at why the 2016 polls were so wrong and why polls were wrong in other cases too.

If you liked this post, you might like these ones

Forecasting the 2020 election: a retrospective
What do presidential approval polls really tell us?
Fundamentally wrong? Using economic data as an election predictor - why I distrust forecasting models built on economic and other data
Can you believe the polls? - fake polls, leading questions, and other sins of opinion polling.
President Hilary Clinton: what the polls got wrong in 2016 and why they got it wrong - why the polls said Clinton would win and why Trump did.
Poll-axed: disastrously wrong opinion polls - a brief romp through some disastrously wrong opinion poll results.
Who will win the election? Election victory probabilities from opinion polls
Sampling the goods: how opinion polls are made - my experiences working for an opinion polling company as a street interviewer.
The electoral college for beginners - how the electoral college works

Monday, July 27, 2020

The Electoral College for beginners

The (in)famous electoral college

We're coming up to the US Presidential election so it's time for pundits and real people to discuss the electoral college. There's a lot of misunderstanding about what it is, its role, and its potential to undermine democracy. In this post, I'm going to tell you how it came to be, the role it serves, and some issues with it that may cause trouble.

(Ohio Electoral College 2012. Image credit: Wikimedia Commons. Contributor: Ibagli. License: Creative Commons.)

How it came to be

The thirteen original colonies had the desire for independence in common but had stridently different views on government. In the aftermath of independence, the US was a confederacy, a country with a limited and small (federal) government. After about ten years, it became obvious that this form of government wasn't working and something new was needed. So the states created a Constitutional Convention to discuss and decide on a new constitution and form of government.

Remember, the thirteen states were the size of European countries and had very different views on issues like slavery. The states with smaller populations were afraid they would be dominated by the more populous states, which was a major stumbling block to agreements. The issue was resolved by the Great Compromise (or Connecticut Compromise if you come from Connecticut). The Convention created a two-chamber congress and a more powerful presidency than before. Here's how they were to be elected:

The lower house, the House of Representatives, was to have representatives elected in proportion to the population of the state (bigger states get more representatives).
The upper house, the Senate, was to have two Senators per state regardless of the population of the state.
Presidents were to be elected through an electoral college, with each elector having one vote. Each state would be allocated a number of electors (and hence votes) based on their seats in congress. The electors would meet and vote for the President. For example in 1789, the state of New Hampshire had three representatives and two senators, which meant New Hampshire sent five electors (votes) to the electoral college. The states decided who the electoral college electors were.

Think for a minute about why this solution worked. The states were huge geographically with low population densities and often poor communications. Travel was a big undertaking and mail was slow. It made sense to send voters to vote on your behalf at a college and these delegates may have to change their vote depending on circumstances. In short, the electoral college was a way of deciding the presidency in a big country with slow communications.

Electoral college vote allocation is and was only partially representative of the underlying population size. Remember, each state gets two Senators (and therefore two electoral college votes) regardless of its population. This grants power disproportionately to lower-population states, which is a deliberate and intended feature of the system.

Early practice

Prior to the formation of modern political parties, the President was the person who got the largest number of electoral college votes and the Vice-President was the person who got the next highest number of votes. For example, in 1792, George Washington was re-elected President with 132 votes, and the runner-up, John Adams, who got 77 votes, became Vice-President. This changed when political parties made this arrangement impractical, and by 1804, the President and the Vice-President were on the same ticket.

Electoral college electors were originally selected by state legislators, not by the people. As time went on, more states started directly electing electoral college electors. In practice, this meant the people chose their Presidential candidate and the electoral college electors duly voted for them.

By the late 19th century, all states were holding elections for the President and Vice-President through electoral college representation.

Modern practice

Each state has the following representation in congress:

Two Senators

A number of House of Representative seats roughly related to the state's population.

The size of each state's congressional delegation is their number of electoral college votes. For example, California has 53 Representatives and 2 Senators giving 55 electoral college electors and 55 electoral college votes.

During a Presidential election, the people in each state vote for who they want for President (and by extension, Vice-President). Although it's a federal election, the voting is entirely conducted by each state; the ballot paper is different, the counting process is different, and the supervision is different.

Most states allocate their electoral college votes on a winner-takes-all basis, the person with the largest share of the popular vote gets all the electoral college votes. For example, in 2016, the voting in Pennsylvania was: 2,926,441 votes for Hilary Clinton and 2,970,733 votes for Donald Trump, and Donald Trump was allocated all of Pennsylvania's electoral college votes.

Two states do things a little differently. Maine and Nebraska use the Congressional District method. They allocate one of their electoral college votes to each district used to elect a member of the House of Representatives. The winner of the statewide vote is then allocated the other two electoral college votes. In Maine in 2016, Hilary Clinton won three electoral college votes and Donald Trump one.

Washington D.C. isn't a state and doesn't have Senators; it has a non-voting delegate to the House of Representatives. However, it does have electoral college votes! Under the 23rd amendment to the Constitution, it has the same electoral college votes as the least populous state (currently 3).

In total, there are 538 electoral college votes:

100 Senators
435 Representatives
3 Electors for Washington D.C.

The electoral college does not meet as one body in person. Electors meet in their respective state capitols and vote for President.

How electoral college votes are decided

How are electoral college votes allocated to states? I've talked about the formula before, 2 Senators for every state plus the number of House of Representative seats. House of Representative seats are allocated on a population basis using census data. There are 435 seats that are reallocated every ten years based on census data. Growing states may get more seats and shrinking states fewer. This is why the census has been politicized from time to time - if you can influence the census you can gain a ten-year advantage for your party.

Faithless electors and the Supreme Court

Remember, the electors meet and vote for President. Let's imagine we have two Presidential candidates, cat and dog, and that the people of the state vote for cat. What's to stop the electors voting for dog instead? Nothing at all. For many states, there's nothing to stop electors voting for anyone regardless of who won the election in the state. This can and does happen, even as recently as 2016. It happens so often that there's a name for them: faithless electors.

In 2016, five electors who should have voted for Hilary Clinton didn't vote for her, and two who should have voted for Donald Trump didn't vote for him. These votes were officially accepted and counted.

Several states have laws that mandate that electors vote as instructed or provide punishment for electors who do not vote as instructed. These laws were challenged in the Supreme Court, which voted to uphold them.

On the face of it, faithless electors sound awful, but I do have to say a word in their defense. They do have some support from the original intent of the Constitutional Convention and they do have some support from the Federalist Papers. It's not entirely as black and white as it appears to be.

Have faithless electors ever swayed a Presidential election? No. Could they? Yes.

Gerrymandering

In principle, it's possible to gerrymander electoral college votes, but it hasn't been done in practice. Let me explain how a gerrymander could work.

First off, you'd move to Congressional District representation. Because the shape of congressional districts are under state control, you could gerrymander these districts to your heart's content. Next, you'd base your senatorial electoral college votes on the congressional district winner on a winner-takes-all basis. Let's say you had 10 congressional districts and you'd gerrymandered them so your party could win 7. Because 7 of the 10 districts would be for one candidate, you'd award your other two votes to that candidate. In other words, a candidate could lose the popular vote but still gain the majority of the electoral college votes for a state.

The electoral college and representative democracy

Everyone knows that Hilary Clinton won the popular vote but Donald Trump won the electoral college and became President. This was a close election, but it's theoretically possible for a candidate to lose the popular vote by a substantial margin, yet still win the presidency.

Bear in mind what I said at the beginning of this piece, electoral college votes are not entirely representative of the population, by design. Here's a chart of electoral college votes per 1,000,000 population for 2020. Note how skewed it is in favor of low-population (and rural) states. If you live in Wyoming your vote is worth 5 times that of a voter in Texas.

Obviously, some states are firmly Democratic and others firmly Republican. The distribution of electoral college votes pushes candidates to campaign more heavily in small swing states, giving them an outsize influence (for example, New Hampshire). Remember, your goal as a candidate is to win electoral college votes, your goal is not to win the popular vote. You need to focus your electoral spending so you get the biggest bang for your buck in terms of electoral college votes, which means small swing states.

Nightmare scenarios

Here are two scenarios that are quite possible with the current system:

A candidate substantially loses the popular vote but wins the electoral college.
Faithless electors reverse the result.

Neither of these scenarios is good for democracy or stability. There is nothing to prevent them now.

Who else uses an electoral college?

Given the problems with an electoral college, it's not surprising that there aren't many other cases in the world of its use. According to Wikipedia, there are several other countries that use it for various elections, but they are a minority.

Could the electoral college be changed for another system?

Yes, but it would take a constitutional change, which is a major undertaking and would require widespread cross-party political support. Bear in mind, a more representative system (e.g. going with the popular vote) would increase the power of the more populous states and decrease the power of less populous states - which takes us all the way back to the Great Compromise and the Constitutional Convention.

What's next?

I hope you enjoyed this article. I intend to write more election-based pieces as November comes closer. I'm not going to endorse or support any candidate or party; I'm only interested in the process of democracy!

If you liked this post, you might like these ones

Forecasting the 2020 election: a retrospective
What do presidential approval polls really tell us?
Fundamentally wrong? Using economic data as an election predictor - why I distrust forecasting models built on economic and other data
Can you believe the polls? - fake polls, leading questions, and other sins of opinion polling.
President Hilary Clinton: what the polls got wrong in 2016 and why they got it wrong - why the polls said Clinton would win and why Trump did.
Poll-axed: disastrously wrong opinion polls - a brief romp through some disastrously wrong opinion poll results.
Who will win the election? Election victory probabilities from opinion polls
Sampling the goods: how opinion polls are made - my experiences working for an opinion polling company as a street interviewer.
The electoral college for beginners - how the electoral college works

Monday, July 20, 2020

Sad! How not to create a career structure: good intentions gone wrong

Sad!

I worked for a company that tried to do a good thing for people's careers and promotion prospects, but it back-fired horribly and ended up hurting more than it helped. I'm going to tell you the story and draw some lessons from it. Of course, I've changed and obscured some details to protect the guilty.

(Sometimes doing the right thing badly can turn even placid software developers into a mob. Image source: Wikimedia Commons, License: Public Domain.)

The company

The company was a large corporation with several hundred software developers working in different divisions on different projects. There was a formal grade level for developers: a developer fresh out of college might be level 20 and an experienced and senior developer might be level 30. Each level had a pay band with a rigid cap; once you reached the cap, you could only get more money by promotion. The pay bands overlapped and people knew them. Everyone was keenly aware of their level and there was open discussion of who was on what level.

The problem was, the standards and levels were inconsistent across and within departments. Some departments and managers were known to be 'generous' and others were known to be 'mean'. Some developers moved departments as the only way to get promoted, but there were problems with this approach too. Some departments had one set of rules for pay and levels, while others had different rules. In some cases, developers tried to move to other groups to get a promotion, but they were offered jobs at the bottom of the pay band at that level, which unfortunately was less money than they were currently on. Managerial inconsistencies had bad consequences for individuals too. In one case, a person got a big promotion and their manager left not long after. Their new manager felt they were underperforming and that they had been over-promoted. Some managers promoted based on performance alone, but others insisted on time served at a level.

The bottom line was, there were substantial inconsistencies and inequities and the software engineers were unhappy.

The learned organization

Fortunately, there was an active learned organization that helped software companies. One of the things this very learned organization did was produce career guidance, specifically, it produced a career hierarchy showing what software developers would be expected to do at different levels of seniority. As with all management solutions, it was a matrix. There were different skills in different areas and some areas were only applicable to more senior positions. To put it simply, they'd mapped out a grading structure and the skills and responsibilities of each grade.

On the face of it, this sounded like a perfect solution for my employer: why not roll out this grading structure and use it to decide people's levels?

The roll-out

My employer took the skills/grades matrix from the learned society and mapped it to the internal levels 20-30. The idea was, you could see what was expected of you at your level and what skills you needed to develop and the responsibilities you needed to take on to get a promotion. Sounds great!

They published this guidance to all developers.

Light the blue touch paper...

The fall-out

Here's what happened.

The people on high grades (around 30) knew they weren't going to be demoted and their salary wasn't going to go down. Most of them noticed that they weren't doing many of the things the skills/grades matrix said they should be doing. They found the whole thing hilarious. They openly joked about the grade/skills matrix and freely talked about all the things they weren't doing that the matrix said they should be.

The majority of people were on middling grades (23-27) and this was incendiary to them. Many of the people were doing more than what was expected of them at their grade level. In many cases, they were meeting the requirements two levels above their current level. The net results were urgent discussions with their managers about immediate promotions.

The middling grades people were furious, while the high-grade people were laughing about it. It wasn't a happy mix and management had to step in to stifle debate to calm people down.

Senior management soon realized this was explosive, so they backpedaled quickly. The roll-out program was canceled due to unexplained 'deficiencies in the process' and it 'not meeting business needs'. A review was promised, which as you might expect, never happened. The subject became taboo and senior management didn't attempt any further reform.

Of course, this had its impact. The middling grades realized they had to leave to get career advancement and the senior people realized further advancement was unlikely. The people with get up and go got up and left.

What about the people on the lower grades? When the roll-out happened, they were very happy. For the first time, they had clear goals for promotion, and consistency meant they wouldn't have to leave their group to get promoted. They were upset when it was canceled because it was back to the old opaque system. When the review never happened, it dawned on many of them that the company wasn't serious about their futures.

Instead of being a positive change for the future, it set the company back several years.

Why did it go so badly wrong?

Solving the wrong problem

What problem was senior management trying to solve? There were two broad problems we faced as a company:

Current inequities and inconsistencies
Future inequities and inconsistencies

The skill/grade matrix was a solution to the second problem, but not the first. The first problem was the most serious and the skills/grade matrix just made it more visible. The senior management team had no plan to address the first problem.

The leadership applied the right solution to the wrong problem.

Overnight fix to a long-standing problem

Inconsistencies had built up over the course of about a decade, and there were reasons why things developed the way they did. Partly, it was company culture.

As a general rule of thumb, entrenched problems take time to solve. Senior leadership plainly expected a magic wand, but that was never going to happen. They didn't put in enough effort post-launch to make the project successful.

Didn't listen/lack of inclusion

This was the main problem. Senior leadership should have talked to software developers to understand what their concerns really were. In short order, listening would have brought to the surface the issues with current inequities and it would have shown senior leadership that the plan wasn't going to work. The fact that developers weren't involved in discussions about their own future was telling.

Pretend it never happened

You can't put the genie back in the lamp. Senior management tried to brush the whole thing under the rug very quickly when it was apparent it was going wrong, which was the worst response they could have made.

How should they have dealt with it?

Hindsight is a wonderful thing, but knowing what I know now, here's how I would have dealt with this situation differently.

Understand what the problem really is by talking to people. By people, I mean the people directly affected. I've learned never to rely on intermediaries who may be incentivized to mislead or tell you what you want to hear.
Deal with the worst cases up-front and as one-offs. A one-size fits all process can flounder when dealing with outliers, so it's better to deal with the most extreme cases individually. In this case, people who should be on much higher levels.
Trial the solution with a small group. Trying it on a group of about 20-30 software developers would have flushed out most of the problems.
Stick with it. Put the effort in to make it work, but have clear criteria for exiting if it doesn't work.

More specifically, I might have followed these policies.

Assuming no or very little money for promotions, I would have extended the pay bands at the bottom end. This would allow me to promote people without incurring extra costs. In the ideal world, you promote everyone who deserves it and give them the appropriate pay rise, but the ideal world doesn't exist.
I might reform the whole 20-30 levels, for example, I might call the levels something else, and then I would map people to new levels, including mapping some people to higher levels. An exercise like this allows you to correct all kinds of oddities, but of course, some people will be upset.
For people who were on higher levels than they should be, I would leave them there for a while. It's too damaging for an organization to make corrections on this scale - there were just too many people who were over-promoted. But I wouldn't promote these people further until their performance warranted it.

Saturday, July 11, 2020

The art of persuasion: pathos, logos, and ethos

The art of persuasion is ancient

In this blog post, I'm going to talk about three high-level approaches to persuasion that people have used for thousands of years: pathos, logos, and ethos. Aristotle codified them, hence their Greek names, but despite their ancient origins, they pop up in modern sales methods, in the speeches given by politicians, and in preachers' sermons. Understanding these basics will let you craft more effective speeches and emails.

(Martin Luther King was one of the best rhetoricians of the 20th century. King used pathos enormously effectively in his 'I have a dream speech'. Image source: Wikimedia Commons. Photographer: Rowland Scherman.)

Logos

Sadly, this is the weakest of the three, it's the appeal to logic; the use of facts and figures, and inference and deduction to make an argument. For example, if I were trying to make the case that the economy was great under my leadership, I might talk about high employment numbers, pay rises, and business growth. My chain of reasoning would be: the numbers are great due to my leadership, so you should vote for me. Let's look at a couple of real examples.

My first example is Winston Churchill's "Their finest hour speech" (you can read more analysis of this speech here):

"‘Hitler knows that he will have to break us in this Island or lose the war. If we can stand up to him, all Europe may be free and the life of the world may move forward into broad, sunlit uplands.’"

Note there are no numbers here, it's a chain of logic linking one idea to another, in this case, an if-then piece of logic.

John F. Kennedy also used logos. Here's an excerpt from his famous "We choose to go to the moon..." speech:

"Within these last 19 months at least 45 satellites have circled the earth. Some 40 of them were made in the United States of America and they were far more sophisticated and supplied far more knowledge to the people of the world than those of the Soviet Union."

Note the use of numbers to make the point that the United States was ahead of, and more advanced than the Soviet Union. Logos isn't just about making cold scientific claims, it can be about bashing your opponents with logic.

Margaret Thatcher was a master speaker, and she used logos to batter her opponents:

"If the Labour Government is no longer able to act in the national interest, is there no alternative to the ruin of Britain? Yes, indeed there is - and that alternative is here at Brighton today."

(Leader's Speech, Brighton, 1976)

The point is, logos isn't about a mathematical deduction, it's about creating a chain of arguments to lead the audience to the conclusion that your point is true.

In sales situations, logos is used in several ways, for example, a salesperson might say something like: "Our solution costs less on a yearly basis than competitor Y but has features A, B, and C that Y does not."

Ethos

This is an appeal from authority, the idea is that the speaker has some kind of position that gives them special knowledge. The most obvious example is the old quote: "Trust me, I'm a doctor". Speakers use this device a lot to establish credibility, for example, you might hear people talk about years of government service, or their qualifications, or their awards.

One of the best examples of ethos I've come across is Steve Jobs Stanford Commencement address of 2005. Pretty much, the whole speech is ethos, establishing his credibility with the audience. Here's a small extract:

"I was lucky — I found what I loved to do early in life. Woz and I started Apple in my parents’ garage when I was 20. We worked hard, and in 10 years Apple had grown from just the two of us in a garage into a $2 billion company with over 4,000 employees. We had just released our finest creation — the Macintosh — a year earlier, and I had just turned 30."

There are several ways speakers establish ethos. Sometimes, they talk about their education, or they talk about their experiences, or they talk about their awards. But how do you establish ethos if you have none of those things? The answer is, you can co-opt it from other people and organizations.

A salesperson might co-opt ethos by talking about the awards and success of their company or product, for example, "Product X was the biggest selling product in category Y, it's the industry standard." They also use external references from trusted sources. Gartner's magic quadrant analysis is widely used for a reason, it lends external credibility and is a great source of ethos. Success stories serve similar purposes too; you can use a quote from an industry figure to boost your credibility.

Pathos

This is an emotional appeal to do something. It's is one of the most effective persuasive techniques, but it has to be used carefully; it's a blunt instrument that can undermine your argument. One of the clearest examples of pathos also illustrates the problem: "do as I ask or I kill this kitten".

I'm going to return to Winston Churchill and another of his famous speeches for my first example, here's a famous extract:

"I would say to the House, as I said to those who have joined this government: “I have nothing to offer but blood, toil, tears and sweat.”"

(Blood, toil, tears, and sweat)

Churchill's goal was to get the politicians and the people to join him in an existential struggle against fascism. He used vivid imagery to illustrate the struggle ahead. If logos is the use of facts and figures, pathos is the use of adjectives and adverbs.

One of my other favorite speakers is Martin Luther King, here's an extract from his most famous speech:

"I have a dream that one day even the state of Mississippi, a state sweltering with the heat of injustice, sweltering with the heat of oppression, will be transformed into an oasis of freedom and justice."

King is using evocative language to appeal to his audience, to motivate them to appeal for change.

I've seen salespeople use pathos in a clever way when they ask a prospect what the success of product would mean for them. The crude example here is telling someone they would look good in a sports car. Less crudely, the success of a product could lead to promotions. This has to be artfully done, but I've seen it used very effectively.

Using them in a speech

Not every speech uses all the elements of persuasion, but many of them do. Let's imagine you're promoting your company's product at a convention and had a short slot to do it. Here's how you might use pathos, logos, and ethos.

Speech	Commentary
In 2020, Gartner placed us in the leader quadrant of their X magic quadrant for the fifth year running. Ms A, Vice-President for Important Things at Big Company B said that Product Y was a game-changer for them.	Ethos.
She said it reduced design costs by 20% and reduced time-to-market by 15%. Her experience is common, which is why we have the largest market share and we're growing more rapidly than any other company in the sector. We're becoming the industry standard.	Logos.
But it's not just about money. Because we reduce design flow issues, we reduce low-level re-work and rote work, so we free up engineers' time to be more creative and use their skills in more engaging and interesting ways. Our software enabled Dr S to design H, which won the industry design award and got him on the cover of Trade Publication.	Pathos

This is a little crude, but you get the general idea.

Churchill, Kennedy, and King did not just make up their speeches. They knew very well what they were doing. Churchill and King in particular were master rhetoricians, adept at using words to change the world. Knowing a little of rhetoric can pay dividends if you're trying to be more persuasive.

Reading more

A lot of books on rhetoric are unreadable, which is odd given that rhetoric is about communication. Here are some of the ones I like:

Thank You for Arguing, Fourth Edition: What Aristotle, Lincoln, and Homer Simpson Can Teach Us About the Art of Persuasion - Jay Heinrichs
Words Like Loaded Pistols: Rhetoric from Aristotle to Obama - Sam Leith

Rhetoric for managers series

This post is part of a series of posts on rhetoric for managers. Here are the other posts in the series:

Thursday, July 2, 2020

Managers must actively listen: foundational skills

Managers need to actively listen

Almost every MBA or business school degree has units on communications, including spoken communication. A great deal of attention is lavished on learning how to give speeches and how to communicate ideas, but no time is given to a more important management skill: how to actively listen.

(If a dog can listen attentively, why can't you? Image credit: Wikimedia Commons. License: Public Domain.)

If you're thinking of listening as a passive skill, you've got it wrong. If you think of it just as letting someone speak, you're mistaken. If you think of it as something you can just do, you've totally misunderstood it.

By actively listening, I mean understanding the true meaning of what someone says and not just the superficial content. I mean actively probing for the key underlying messages and feelings. I mean communicating back to the speaker that you've heard and understand what they have to say by your choice of gestures, noises, and responses.

By actively listening, you can understand what's really going on and earn trust as a manager. You can demonstrate empathy, respect, and the value you place on people.

In the last few years, the 'fake it until you make it' business meme has done the rounds, meaning pretending to be something you're not so you can get promoted to be what you were pretending to be. Active listening is the antithesis of this approach; it's all about communicating genuine human warmth. You can't fake caring.

Listen without distraction

Our first listening lesson is the simplest: listen with undivided attention and communicate that your attention is undivided. This means no ringing phones and no beeping devices. Turn off or silence your phone or watch. Turn off or silence notifications on your computer - better still, close up your laptop or turn your screen away.

If at all possible, you should sit close to someone without distractions or obstructions. If you have an office, get up from behind your desk so you can sit with the person without diversion. Sitting behind your desk communicates power, and it's even worse if your monitor even partially obscures your view. If you don't have an office, go to a meeting room where you won't be distracted. On Zoom or video conferencing, close down any distracting windows and look at the person only (not other computer windows or something else in your home office).

These things sound simple, but it's surprising how many people don't understand the basics of turning off distractions. If you want people to talk, you need to give them the right environment - and that means you have to take action.

Let's imagine you wanted to talk to your manager about a difficult situation at work. Perhaps you suspect you're the victim of sexual or racial harassment, or you've seen something in another department you think you should report, or you're unhappy with a colleague. Each of these situations is difficult for you to talk about and requires you to trust your manager. Now imagine when you finally screw up the courage to see them, they sit behind their desk fiddling with their phone, their computer pings every minute, and they keep on turning away to check their monitor. Do you think you would tell a manager who behaves like this something that's risky and that requires trust? Do you want to be the kind of manager your team trusts, or do you just want to just fake it? Being authentic requires effort on your part.

Just making time for people isn't enough. You have to understand what's expected of you and what you mustn't do.

Listening isn't problem-solving

One of the big mistakes managers make is slipping into problem-solving mode as soon as someone starts discussing a problem or issue. For people from a technical background, this is a comforting response to a difficult situation, after all, solving problems is what technical people are trained to do. But this might be the exact opposite of what the person wants. They might just want to talk through a difficult or disturbing situation and be heard. The other risk is, by offering solutions too soon, you might block more serious content. Sometimes, someone might come to see you with a 'presenting problem' that's relatively innocuous; they're seeing if they can trust you before disclosing the bigger issue. Before suggesting any solution, you should make sure you've heard the totality of what someone has to say. Your first goal should be making sure the person feels heard.

The biggest risk of active listening for managers

I do have to address one of the big risks of listening: role confusion. You are a manager, not a counselor. The art of listening as a manager is to know your limits and not try to be a cheap therapist. Remember, as a manager, you may have to use the performance management process on someone which may be next to impossible if the person is relying on you as a counselor. If you've allowed an unhealthy pseudo-counseling relationship to develop, you've put yourself in a career-limiting situation. At all times, you must remember you are a manager.

Listening leads to acting - sometimes

There's another key difference between a therapist and a manager. As a manager, you need to act on what people tell you. You don't need to act all the time, sometimes people need to vent and your role is solely to listen and communicate you've heard what they have to say. But on other occasions, you have to act - in fact, acting is the ultimate in active listening: 'I've heard what you have to say and I'm going to do something about it'. If you do nothing in response to what you hear, ultimately people will do something themselves: they'll leave and go somewhere else.

What's next?

I'm going to blog a bit more about listening. I'm going to focus on some micro-skills you can use to communicate you're listening, and provide examples and exercises you can follow. These blog posts are not about faking listening, they're about being human and about your role for your team - demonstrating why you're a manager.

References

The best single-volume book I've come across that covers the skills of active listening is "Swift to hear" by Michael Jacobs (ISBN: 978-0281052608). The book is aimed at people involved in pastoral care, but it focuses on technical skills, making it a great resource for anyone who needs to listen attentively.