Thursday, February 27, 2020

Pie charts are lie charts

There are lots of chart types, but if you want to lie or mislead people, the best chart to use is the pie chart. I’m going to show you how to distort reality with pie charts, not so you can be a liar, but so you know never to use pie charts and to choose more honest visualizations.

Let's start with the one positive thing I know about pie charts: they're called camembert charts in France and cake charts in Germany. On balance, I prefer the French term, but we're probably stuck with the English term. Unlike camembert, pie charts often leave a bad taste in my mouth and I'll show you why.


(Camembert cheese - image credit: Coyau, Wikipedia - license : Creative Commons)

Take a look at the pie chart below. Can you put the six slices in order from largest to smallest? What percentages do you think the slices represent?



Here’s how I’ve misled you:

  • Offset the slices from the 12 o’clock position to make size comparison harder. I've robbed you of the convenient 'clock face' frame of reference.
  • Not put the slices in order (largest to smallest). Humans are bad at judging the relative sizes of areas and by playing with the order, I'm making it even harder.
  • Not labeled the slices. This ought to be standard practice, but shockingly often isn't.
The actual percentages are:
Gray        20.9
Green       17.5
Light blue  16.8
Dark blue   16.1
Yellow      15.4
Orange      13.3

How close were you? How good was my attempt to deceive you?

Let’s use a bar chart to represent the same data.



Simple, clear, unambiguous.
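If you want to experiment yourself, here's a minimal matplotlib sketch (my library choice for illustration; the charts in this post weren't necessarily made this way) that draws the same data both ways:

```python
import matplotlib.pyplot as plt

labels = ["Gray", "Green", "Light blue", "Dark blue", "Yellow", "Orange"]
values = [20.9, 17.5, 16.8, 16.1, 15.4, 13.3]
colors = ["gray", "green", "lightblue", "darkblue", "gold", "orange"]

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# The misleading version: unlabeled slices, rotated away from 12 o'clock
ax_pie.pie(values, colors=colors, startangle=37)

# The honest version: labeled bars, sorted largest to smallest
ax_bar.bar(labels, values, color=colors)
ax_bar.set_ylabel("Percentage")

plt.tight_layout()
plt.show()
```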

I've read guidance that suggests you should only use a pie chart if you're showing two quantities that are obviously unequal. This gives the so-called Pac-Man pie chart. Even here, I think there are better representations, and our old friend the bar chart would work better (albeit less interestingly).


Now let’s look at the king of deceptive practices, the 3D pie chart. This one is great because you can thoroughly mislead while still claiming to be honest. I’m going to work through a short deceptive example.

Let’s imagine there are four political parties standing in an election. The percentage results are below.
Dog     36
Cat     28
Mouse   21
Bird    15

You work for Bird, which unfortunately got the lowest share of the vote. Your job is to deceive the electorate into thinking Bird did much better than they did.

You can obscure the result by showing it as a pie chart without number labels. You can even mute the opposition colors to fool the eye. But you can go one better: you can create a 3D pie chart with shifted perspective and 'point explosion' using the data I gave above, like so.

Here's what I did to create the chart (a code sketch follows the list):

  • Took the data above as my starting point and created a pie chart.
  • Rotated the chart so my slice was at the bottom.
  • Made the pie chart 3D.
  • Changed the perspective to emphasize my party.
  • Used 'point explosion' to pull my slice out of the main body of the chart to emphasize it even more.
  • Used shading.
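Matplotlib won't do true 3D pies with shifted perspective (the original chart was probably made in a spreadsheet tool), but a sketch of the rotation, explosion, and shading steps looks something like this:

```python
import matplotlib.pyplot as plt

parties = ["Dog", "Cat", "Mouse", "Bird"]
votes = [36, 28, 21, 15]
explode = [0, 0, 0, 0.15]  # 'point explosion': pull Bird's slice out of the pie
colors = ["#c0c0c0", "#d0d0d0", "#e0e0e0", "#d62728"]  # muted opposition, vivid Bird

plt.pie(votes, labels=parties, explode=explode, colors=colors,
        startangle=-90,  # rotate so Bird's slice sits at the bottom, nearest the viewer
        shadow=True)     # shading fakes depth
plt.axis("equal")
plt.show()
```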

This now makes it look like Bird was a serious contender in the election. The fraction of the chart area taken up with the Bird party’s color is completely disproportionate to their voter share. But you can claim honesty because the slice would still be the correct proportion if the chart were viewed from above. If challenged, you can turn it into a technical/academic debate about data visualization that will turn off most people and make your opponents sound like they’re nit-picking.

You don’t have to go this far to mislead with a pie chart. All you have to do is increase the cognitive burden to interpret a chart. Some, maybe even all, of your audience might not spot what you’re trying to hide because they’re in a hurry. You can mislead some of your audience all of the time.

I want to be clear, I'm telling you about these deceptive practices so you can avoid them. There are good reasons why honest analysts don’t use pie charts. In fact, I would go one stage further; if you see a pie chart, be on your guard against dishonesty. As one of my colleagues used to say, ‘friends don’t let friends use pie charts’.

Tuesday, February 25, 2020

What I learned from a day in the woods

Leadership in the woods

A few years ago, I went on a one-day leadership course outdoors. We did various team-building and leadership activities, all based on outdoor exercises. I gained something from the experience, but there were positives and negatives. I would send people on a similar course, but with reservations. Here’s what happened, what I got out of it, and what I would do differently.

The woods
(Image credit: Mike Woodward, copyright Mike Woodward)

What happened

The group of us that did this course were all employees of the same company, company X, but from different departments. Some of us worked together occasionally, others did not. We knew one another, just not very well. The goal of the course was to prepare us for leadership positions.

The course took place at a facility in the countryside, not too far from civilization. This definitely wasn't an extreme survival course; the worst survival hardship was getting the wrong sandwich at lunchtime.

Once we'd all arrived, we were briefed on the day, split into groups, and given exercises to do on outdoor equipment. Before each exercise, we were informed of its purpose, and instructors helped us through it. 

One exercise was walking across a log suspended about 5m in the air; it was all perfectly safe because we wore harnesses and helmets. The instructor said the point of the exercise was to face fear and uncertainty and move forward with the support of the team. The team was encouraged to shout positive things to the person walking across (but nothing that might cause them to fall!). However, the main instructor left halfway through, leaving us with more junior staff who focused on safety and didn’t reinforce the message about teamwork and support.

In another exercise, we were led around blindfolded to build trust. 

The rest of the day went on in a similar vein, with similar exercises, and we had a debriefing session at the end of it.

What happened afterwards

I liked the program a lot and I took the message about teamwork and support to heart. Some months later, I faced a business decision that made me very nervous. I thought about my experience walking the log 5m in the air, and I went forward with my decision with more confidence (if I can do that, I can do this). However, the boost wore off; overall, the gains I made from the course lasted only about a year.

Smaller, but not insignificant benefits of the whole thing for me were a day out of the office and the sense that my employer was investing in me.

There was a major problem with the day, though: some people absolutely hated it. They hated the exercises, they hated having people watch them fail, and they hated the whole idea of running around in the woods; they got nothing out of the day. Bear in mind that this kind of group physical activity can bring back painful school memories, and that's what happened here; there was just too much baggage for some people to learn anything. Although I felt good about the course, some people felt worse about the company for making them go, and I'm sympathetic to their position. Did this make them less suitable as managers? I don’t think so.

Lessons learned

Would I spend corporate money to run an event like this again? Yes, but with reservations.
  • I would consider very carefully people’s objections to these kinds of events. I would not coerce anyone to go. If there were a number of people firmly opposed, I would try and find something else. 
  • I would consider disabled access very carefully. If someone on the team was a wheelchair user, for example, I would execute the program in an inclusive way, if at all.
  • I would make sure the teamwork and leadership message was reinforced all the time. There would be a briefing before the day, and afterward. 
  • I would schedule something like this once every year or two to reinforce the learning.

There are things to learn from a day in the woods, but maybe not everyone learns the same things.

Saturday, February 22, 2020

The Monty Hall Problem

Everyone thinks they understand probability, but every so often, something comes along that shows that maybe you don’t actually understand it at all. The Monty Hall problem is a great example of something that seems very counterintuitive and teaches us to be very wary of "common sense".

The problem got its fame from a 1990 column written by Marilyn vos Savant in Parade magazine. She posed the problem and provided the solution, but the solution seemed so counterintuitive that several math professors and many PhDs wrote to her saying she was wrong. The discussion was so intense, it even reached the pages of the New York Times. But vos Savant was indeed correct.



(Monty Hall left (1976) - image credit: ABC Television - source Wikimedia Commons, no known copyright, Marilyn vos Savant right (2017) - image credit: Nathan Hill via Wikimedia Commons - Creative Commons License.  Note: the reason why the photos are from different years/ages is the availability of open-source images.)

The problem is loosely based on a real person and a real quiz show. In the US, there’s a long-running quiz show called ‘Let’s make a deal’, and its host for many years was Monty Hall, in whose honor the problem is named. Monty Hall was aware of the problem's fame and had some interesting things to say about it.

Vos Savant posed the Monty Hall problem in this form:

  • A quiz show host shows a contestant three doors. Behind two of them is a goat and behind one of them is a car. The goal is to win the car.
  • The host asks the contestant to choose a door, but not to open it.
  • Once the contestant has chosen a door, the host opens one of the other doors and shows the contestant a goat. The contestant now knows that there’s a goat behind that door, but he or she doesn’t know which of the other two doors the car’s behind.
  • Here’s the key question: the host asks the contestant "do you want to change doors?".
  • Once the contestant has decided whether to switch, the host opens the contestant's chosen door and the contestant wins either the car or a goat.
  • Should the contestant change doors when asked by the host? Why?

What do you think the probability of winning is if the contestant does not change doors? What do you think the probability of winning is if they do?

Here are the results.

  • If the contestant sticks with their choice, they have a ⅓ chance of winning.
  • If the contestant changes doors, they have a ⅔ chance of winning.

What?

This is probably not what you expected, so let’s investigate what’s going on.

I’m going to start with a really simple version of the game. The host shows me three doors and asks me to choose one. There’s a ⅓ probability of the car being behind my door and ⅔ probability of the car being behind the other two doors.

Now, let’s add in the host opening one of the other doors I haven’t chosen, showing me a goat, and asking me if I want to change doors. If I don’t change doors, the probability of me winning is ⅓ because I haven’t taken into account the extra information the host has given me.

What happens if I change my strategy? When I made my initial choice of doors, there was a ⅔ probability the car was behind one of the other two doors, and that can't change: whatever happens, there are still three doors and the car must be behind one of them. So there's still a ⅔ probability the car is behind one of the two doors I didn't choose.

Here’s where the magic happens. When the host opens a door and shows me a goat, there’s now a 0 probability that the car’s behind that door. But there was a ⅔ probability the car was behind one of the two doors before, so this must mean there’s a ⅔ probability the car is behind the remaining door!

There are more formal proofs of the correctness of this solution, but I won’t go into them here. For those of you into Bayes’ theorem, there’s a really nice formal proof.

I know some of you are probably completely unconvinced. I was at first too. Years ago, I wrote a simulator and ran 1,000,000 simulations of the game. Guess what? Sticking gave a ⅓ probability and changing gave a ⅔ probability. You don’t even have to write a simulator anymore; there are many websites offering simulations of the game so you can try different strategies.
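My original simulator is long gone, but a minimal version takes only a few lines of Python. Here's a sketch (a reconstruction, not my original code) that plays both strategies:

```python
import random

def play(switch: bool, n_trials: int = 1_000_000) -> float:
    """Simulate the Monty Hall game; return the fraction of games won."""
    wins = 0
    for _ in range(n_trials):
        doors = [0, 1, 2]
        car = random.choice(doors)
        choice = random.choice(doors)
        # The host opens a door that hides a goat and isn't the contestant's pick
        opened = random.choice([d for d in doors if d != choice and d != car])
        if switch:
            # Switch to the single remaining unopened door
            choice = next(d for d in doors if d != choice and d != opened)
        wins += (choice == car)
    return wins / n_trials

print(f"Stick:  {play(switch=False):.3f}")   # ~0.333
print(f"Switch: {play(switch=True):.3f}")    # ~0.667
```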

If you want to investigate the problem in depth, read Rosenhouse's book. It's 174 pages on this problem alone, covering the media furor, basic probability theory, Bayes' theorem, and several variations of the game. It pretty much beats the problem to death.

The Monty Hall problem is a fun problem, but it does serve to illustrate a more serious point. Probability theory is often much more complex than it first appears and the truth can be counter-intuitive. The problem teaches us humility. If you’re making business decisions on multiple probabilities, are you sure you’ve correctly worked out the odds?

References

  • The Wikipedia article on the Monty Hall problem is a great place to start.
  • New York Times article about the 1990 furor with some background on the problem.
  • Washington Post article on the problem.
  • 'The Monty Hall Problem', Jason Rosenhouse - is an entire book on various aspects of the problem. It's 174 pages long but still doesn't go into some aspects of it (e.g. the quantum variation).

Wednesday, February 19, 2020

Urban myths are poor motivators

Getting people to work harder by lying to them

I like motivational stories. Hearing about how people overcame adversity and achieved success or redemption can be inspiring. But there can be a problem with using stories as motivators; some of them aren't true. I'm going to look at one such motivational story that's common on the internet, describe its old and modern forms, and take it to pieces. Let's start with the original version of the story.

The original fake story - Christopher Wren and the bricklayers


(Sir Christopher Wren. Image credit: Wellcome Images Wikimedia Commons, Creative Commons License)

Sir Christopher Wren was one of the greatest English architects. He was commissioned to design a replacement for St Paul's Cathedral, which had burned to the ground in the devastating Great Fire of London in 1666. So far, all of this is well-established history.

The story goes that Sir Christopher was inspecting the construction work one day when he met three bricklayers. He asked them what they were doing.

The first bricklayer said, "I'm laying bricks. I'm doing it to feed my family."

The second bricklayer said, "I'm a builder. I'm doing it because I'm proud of my work."

The third bricklayer said, with a gleam in his eye, "I'm building a cathedral that will last a thousand years and be a wonder for the ages".

Some versions of the story stop here, other versions make the third bricklayer a future manager, or the most productive, or give him some other desirable property.

The story is meant to inspire people to see the bigger picture and feel motivated about being something larger than themselves. Plainly, the listener is expected to identify with the third bricklayer. But there are two problems with the story: internal and external.

Children's stories are for children

In many versions of the story, it doesn't say who was the better bricklayer. Even if the third bricklayer was the best, or a future manager, or some other good thing, was this because of his vision, or was it a coincidence? Was the third bricklayer being inspirational or was he trying to curry favor with Sir Christopher?

It seems astonishingly unlikely that in 1671, on the basis of a single conversation, anyone would record bricklayer productivity or future career trajectory. Maybe Sir Christopher was so motivated by the third bricklayer that he did both.

The most important problem with this story is its veracity. I couldn't find this story in any biography of Sir Christopher Wren or any academic writing about his life. With some internet sleuthing, I found what appears to be the first occurrence of this story in a 1927 religious inspirational book ("What Can a Man Believe?" [Barton]). The book gives no reference for the story's provenance.

To put it bluntly: this story was probably made up and doesn't stand up to any scrutiny.

A made-up story about a janitor

There's a more modern version of this story, this time set in the 1960s. President Kennedy was visiting a NASA establishment where he saw a janitor sweeping the floor. The President asked the janitor what he was doing and the janitor said "I'm helping put a man on the moon". Once again, there's no evidence that this ever happened.

The odd thing about the moon landing story is that there are well-documented examples of NASA staff commenting on how motivated they felt [Wiseman]. The flip side is that many were so motivated to work long hours to achieve the moon landing that their marriages ended or they turned to alcohol [Rose]. Leaving aside the negative effects, it's easy to find verifiable quotes that tell the same story as the fake janitor story, so why use something untrue when the truth doesn't take much more effort?

'I can motivate people by telling them fairy stories'

Most of the power of motivational stories relies on their basis in truth. If I told you stories to motivate you and then admitted that they probably weren't true, do you think my coaching would be successful? What if I told you a motivational story that I told you was true, but you later found out was made up, would it undermine my credibility?

There are great and true stories of success, redemption, leadership, and sacrifice that you can use to inspire yourself and others. I'm in favor of using true stories, but I'm against using made-up stories because it undermines my leadership, and frankly, it's an insult to the intelligence of my team.

References

[Barton] "What can a man believe?", Bruce Barton, 1927
[Rose] https://historycollection.jsc.nasa.gov/JSCHistoryPortal/history/oral_histories/RoseRG/RoseRG_11-8-99.htm
[Wiseman] "Moonshot: What Landing a Man on the Moon Teaches Us About Collaboration, Creativity, and the Mind-set for Success", Richard Wiseman

Sunday, February 16, 2020

Coin tossing: more interesting than you thought

Are the books right about coin tossing?

Almost every probability book and course starts with simple coin-tossing examples, but how do we know that the books are right? Has anyone tossed coins several thousand times to see what happens? Does coin-tossing actually have any relevance to business? (Spoiler alert: yes it does.) Coin tossing is boring, time-consuming, and badly paid, so there are two groups of people ideally suited to do it: prisoners and students.


(A Janus coin. Image credit: Wikimedia Commons. Public domain.)

Prisoner of war

John Kerrich was an English/South African mathematician who went to visit in-laws in Copenhagen, Denmark. Unfortunately, he was there in April 1940 when the Nazis invaded. He was promptly rounded up as an enemy national and spent the next five years in an internment camp in Jutland. Being a mathematician, he used the time well and conducted a series of probability experiments that he published after the War [Kerrich]. One of these experiments was tossing a coin 10,000 times. The results of the first 2,000 coin tosses are easily available on Stack Overflow and elsewhere, but I've not been able to find all 10,000, except in outline form.

We’re going to look at the cumulative mean of Kerrich’s data. To get this, we’ll score a head as 1 and a tail as 0. The cumulative mean after any toss is the mean of all the scores up to and including that toss; if after 100 tosses there have been 55 heads, it’s 0.55, and so on. Of course, we expect it to go to 0.5 ‘in the long run’, but how long is the long run? Here’s a plot of Kerrich’s data for the first 2,000 tosses.

(Kerrich's coin-flipping data up to 2,000 tosses.)
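If you have toss-by-toss data, the cumulative mean is a one-liner. Here's a sketch with numpy (the tosses are simulated stand-ins, not Kerrich's actual sequence):

```python
import numpy as np

rng = np.random.default_rng(1)
tosses = rng.integers(0, 2, size=2000)  # 1 = heads, 0 = tails; stand-in data

# Running total of heads divided by the number of tosses so far
cumulative_mean = np.cumsum(tosses) / np.arange(1, len(tosses) + 1)

print(cumulative_mean[9], cumulative_mean[-1])  # mean after 10 and after 2,000 tosses
```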

I don’t have all of Kerrich’s tossing data for individual tosses, but I do have his cumulative mean results at different numbers of tosses, which I’ve reproduced below.

Number of tosses    Mean      Confidence interval (±)
10                  0.4       0.303
20                  0.5       0.219
30                  0.566     0.177
40                  0.525     0.155
50                  0.5       0.139
60                  0.483     0.126
70                  0.457     0.117
80                  0.437     0.109
90                  0.444     0.103
100                 0.44      0.097
200                 0.49      0.069
300                 0.486     0.057
400                 0.498     0.049
500                 0.51      0.044
600                 0.52      0.040
700                 0.526     0.037
800                 0.516     0.035
900                 0.509     0.033
1000                0.461     0.030
2000                0.507     0.022
3000                0.503     0.018
4000                0.507     0.015
5000                0.507     0.014
6000                0.502     0.013
7000                0.502     0.012
8000                0.504     0.011
9000                0.504     0.010
10000               0.5067    0.009

Do you find anything surprising in these results? There are at least two things I constantly need to remind myself of when I’m analyzing A/B test results, and simple coin-tossing serves as a good wake-up call.

The first is how many tosses you need to get reliable results. I won’t go into probability theory too much here, but suffice to say, we usually quote a range, called the confidence interval, to describe our level of certainty in a result. So a statistician won’t say 0.5, they’d say 0.5 ± 0.04. You can unpack this to mean “I don’t know the number exactly, but I’m 95% sure it lies in the range 0.46 to 0.54”. It’s quite easy to calculate a confidence interval for the proportion of heads at different numbers of tosses; I’ve put the confidence intervals in the table above.
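For reference, the interval in the table is the usual normal approximation, 1.96 × √(p(1−p)/n), calculated from the observed proportion p after n tosses. A quick sketch:

```python
import math

def confidence_interval(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the 95% normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

print(confidence_interval(0.4, 10))        # ~0.30, the first row of the table
print(confidence_interval(0.5067, 10000))  # ~0.01, the last row of the table
```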

The second is the structure of the results. Naively, you might have thought the cumulative mean would smoothly approach 0.5, but it doesn’t. The chart above shows a ‘blip’ around 100 tosses where the results seem to change, and this kind of ‘blip’ happens very often in simulation results.

Both of these have big implications. A/B tests are similar in some ways to coin tosses. The ‘blip’ reminds us we could call a result too soon, and the number of tosses needed reminds us that we need to carefully calculate the expected duration of a test. In other words, we need to know what we're doing and we need to interpret results correctly.

Students

In 2009, two Berkeley undergraduates, Priscilla Ku and Janet Larwood, tossed a coin 20,000 times each and recorded the results. It took them about one hour a day for a semester. You can read about their experiment here. I've plotted their results on the chart below.

The results show a similar pattern to Kerrich’s. There’s a ‘blip’ in Priscilla's results, but the cumulative mean does tend to 0.5 in the ‘long run’ for both Janet and Priscilla.

These two are the most quoted coin-tossing results on the internet, but in textbooks, Kerrich’s story is told more often because it’s so colorful. However, others have spent serious time tossing coins and recording the results; they’re less famous because they only quoted the final number and didn’t publish the entire dataset. In 1900, Karl Pearson reported the results of tossing a coin 24,000 times (12,012 heads), following on from Count Buffon, who tossed a coin 4,040 times (2,048 heads).

Derren Brown

I can’t leave the subject of coin tossing without mentioning Derren Brown, the English mentalist. Have a look at this YouTube video, where he flips ten heads in a row with a fair coin. It’s all one take and there’s no trickery. Have a think about how he might have done it.

Got your ideas? Here’s how he did it: the old-fashioned way. He recorded himself flipping coins until he got ten heads in a row. It took hours.

But what if?

So far, all the experimental results match theory exactly and I expect they always will. I had a flight of fancy one day that there’s something new waiting for us out past 100,000 or 1,000,000 tosses - perhaps theory breaks down as we toss more and more. To find out if there is something there, all I need is a coin and some students or prisoners.

More technical details

I’ve put some coin tossing resources on my Github page under the coin-tossing section.

  • Kerrich contains the Kerrich dataset: the first 2,000 tosses in detail and all 10,000 tosses in summary. The Python code kerrich.py displays the data in a friendly form.
  • Berkeley is the Berkeley dataset. The Python code berkeley.py reads in the data and displays it in a friendly form. The file 40000tosses.xlsx is the Excel file containing the Berkeley data.
  • coin-simulator is some Python code that shows multiple coin-tossing simulations. It's built as a Bokeh app, so you'll need to install the Bokeh module to use it.

References

[Kerrich] "An Experimental Introduction to the Theory of Probability", John Kerrich. Kerrich’s 1946 monograph on his wartime exploration of probability in practice.

Wednesday, February 12, 2020

Presentation advice from a professional comedian

In 2019, I heard a piece of presentation advice I'd never heard before. It shook me out of my complacency and made me rethink some key and overlooked aspects of giving a talk. Here's the advice:

"You should have established your persona before you get to the microphone."

There's a lot in this one line. I'm going to explore what it means and its consequences for anyone giving a talk.

(Me, not a professional comedian, speaking at an event in London. Image credit: staff photographer at Drapers.)

Your stage persona is who you are to the audience. People have expectations for how a CEO will behave on stage, how a comedian will behave, how a developer will behave, and so on. For example, if you're giving a talk on a serious business issue (e.g. bankruptcy and fraud) to a serious group (lawyers and CEOs), then you should be serious too. You can undermine your message and credibility by straying too far from what people expect.

One of the big mistakes I've seen people make is forgetting the audience doesn't know them. You might be the funniest person in your company, or considered the most insightful, or the best analyst, but your audience doesn't know that; all they know about you is what you tell them. You have to communicate who you are to your audience quickly and consistently; you have to bend to their expectations. This is a lesson performers have learned very well, and comedians are particularly skilled at it.

If they think about their stage persona at all, most speakers think about what they say and how they say it. For example, someone giving a serious talk might dial back the jokes, a CEO rallying the troops might use rhetorical techniques to trigger applause, and a sales manager giving end-of-year awards might use jokes and funny anecdotes. But there are other aspects to your stage persona.

Let's go back to the advice: "you should have established your persona before you get to the microphone". In the short walk to the microphone, how might you establish who you are to the audience? As I see it, there are three areas: how you look, how you move, and how you interact with the audience.

How you look includes your shoes and clothing and your general appearance (including your haircut). Audiences make an immediate judgment of you based on what you're wearing. If I'm speaking to a business audience, I'll wear a suit. If I'm talking to developers, I'll wear chinos and a shirt (never a t-shirt). Shoes are often overlooked, but they're important; you can undermine an otherwise good choice of outfit by a poor choice of shoes (usually, wearing cheap shoes). Unless you're going for comic effect, your clothes should fit you well. Whatever your outfit and appearance, you need to be comfortable with it. I've seen men in suits give talks where they're clearly uncomfortable with a suit and tie. Being obviously uncomfortable undermines the point of dressing up - to be blunt, you look like a boy in a man's clothes.

How you move is a more advanced topic. You can stride confidently to the microphone, walk normally, or walk timidly with your head down. If you're stressed and tense, an audience will see it in how you move. By contrast, more fluid movements indicate that you're relaxed and in control. What you do with your hands also conveys a message. Audiences can see if you have a death grip on your notes or if you're using your hands to acknowledge applause. Think about the stage persona you want to create and think about how that person moves to the microphone. For example, if you're a newly appointed CEO, you might want to establish comfortable authority, which you might try to do through the way you walk, what you do with your hands, and how you face the audience.

The moment the audience first sees you is where communication between you and the audience begins. How and when you look at an audience sends a message to them. For example, your gaze acknowledges you've seen your audience and that you're with them. Your facial expression tells the audience how you're feeling inside - a comfortable smile communicates one message, a blank expression communicates something else, and a scowl warns the audience the talk might be uncomfortable. You can also use your body to acknowledge applause, telling them that you hear them, which is an obvious piece of non-verbal communication between you and your audience.

In short: think about the seconds before you begin talking. How might you communicate who and what you are without saying a single word?

Where did I hear this advice? I've been listening to a great podcast, 'The Comedian's Comedian' by Stuart Goldsmith, a UK comic. Stuart interviews comics about the art and business of comedy. In one podcast, a comedian told the story of advice he'd received early in his career. The comedian was saying that after five years of performing, he'd managed to establish his stage persona within a minute or so of speaking. The advice he got was: "You should have established your persona before you get to the microphone."

A good presentation is a subtle dialog between the presenter and the audience. The speaker does or says something and the audience responds (or doesn't). Comedians are the purest example of this, they respond in the moment to the audience and they live or die by the interaction. If communicating your message to your audience is important to you, you have to interact with them - including the moments before you begin speaking. Ultimately, all presentations are performance art.

Saturday, February 8, 2020

The Anna Karenina bias

Russian novels and business decisions

What has the opening sentence of a 19th-century Russian novel got to do with quantitative business decisions in the 21st century? Read on and I'll tell you what the link is and why you should be aware of it when you're interpreting business data.

Anna Karenina

The novel is Leo Tolstoy's 'Anna Karenina' and the opening line is: "All happy families are alike; each unhappy family is unhappy in its own way". Here's my take on what this means. For a family to be happy, many conditions have to be met, which means that happy families are all very similar. Many things can lead to unhappiness, either on their own or in combination, which means there's more diversity in unhappy families. So how does this apply to business?

Leo Tolstoy's family
(Leo Tolstoy's family. Do you think they were happy? Image source: Wikimedia Commons. License: Public Domain)

Survivor bias

The Anna Karenina bias is a form of survivor bias, which is, in turn, a form of selection bias. Survivor bias is the bias introduced by concentrating on the survivors of some selection process and ignoring those that did not survive. The famous story of Wald and the bombers is, in my view, the best example of survivor bias: the returning bombers showed damage in places where an aircraft could be hit and still fly home, so the fatal spots were precisely the areas where the survivors were unscathed. If Wald had focused on the surviving bombers' damage, he would have recommended putting armor in the wrong place.

When we look at the survivors of some selection process, they will necessarily be more alike than the non-survivors because of the selection process (unhappy families vs. happy families). Let me give you an example: buying groceries on the web. Imagine a group of people browsing a grocery store's website. Some won't buy (unhappy families), but some will (happy families). To buy, you have to find an item you want, you have to have the money, you have to want to buy now, and so on. This selection process gives a group of people who are very similar along a number of dimensions; they will exhibit less variability than the non-purchasers.

Some factors will be important to a purchaser's decision and other factors might not be. In the purchaser group, we might expect to see more variation in factors that aren't important to the buying decision and less variation in factors that are. To quote Shugan [Shugan]:

"Moreover, variables exhibiting the highest levels of variance in survivors might be unimportant for survival because all observed levels of those variables have resulted in survival. One implication is a possible inverse correlation between the importance of a variable for survival and the variable’s observed variability"

In the opinion poll world, the Anna Karenina bias rears its ugly head too. Pollsters often use robocalls to try and reach voters. To successfully record an opinion, the call has to go through, it has to be answered, and the person has to respond to the survey questions. This is a selection process. Opinion pollsters try and correct for biases, but sometimes they miss them. If the people who respond to polls exhibit less variability than the general population on some key factor (e.g. education), then the poll may be biased.

In my experience, most forms of B2C data analysis can be viewed as a selection process, and the desired outcome of most analyses is figuring out the factors that lead to survival (in other words, what made people buy). The Anna Karenina bias warns us that some of the observed factors might be unimportant for survival, and gives us a way of trying to understand which factors are relevant.



Leo Tolstoy in 1897. (Image credit: Wikipedia. Public domain image.)

The takeaways

If you're analyzing business data, here's what to be aware of:

  • Don't just focus on the survivors, you need to look at the non-survivors too.
  • Survivors will all tend to look the same - there will be less variability among survivors than among non-survivors. 
  • Survivors may look the same on many factors, only some of which may be relevant.
  • The factors that vary the most among survivors might be the least important.

References

[Shugan] "The Anna Karenina Bias: Which Variables to Observe?", Marketing Science, Vol. 26, No. 2, March–April 2007, pp. 145–148

Wednesday, February 5, 2020

Reverse engineering a sensor

A few years ago, I was working at an excellent company (BitSight). The only problem was, I felt the office was either too hot or too cold. I bought a temperature/humidity sensor and reverse engineered the control system to build my own UI to control the device. Here's the story of how I did it. If you want to learn how to reverse engineer a sensor interface, this story might be useful to you.


(Image credit: Uni-Trend)

I bought a UT330B temperature and humidity sensor from AliExpress for $35. It's a battery-powered data logger that can record 60,000 temperature and humidity measurements. The device has a USB port for downloading the recordings to a computer, and it came with control software to read the device data over USB. The price was right and the accuracy was good, but there was a problem: the control software worked on Windows only, and my work and home computers were all Macs. So what could I do?

My great colleague, Philip Gladstone, suggested the way forward. He said, why not reverse engineer the commands and data going back and forth over the USB port so you can write your own control software? So this is what we did. (Philip got the project started, did most of the USB monitoring, and helped with deciphering commands.)

The first step was monitoring the USB connection. We got hold of a Windows machine and ran some USB monitoring software on it. This software reported the bytes sent back and forth to the UT330B. The next step was to load up and run the UT330B control software. Here's the big step; we used the control software to get the device to do things and monitored the USB traffic. From that traffic, we were able to reverse engineer the commands and responses.

The first thing we found was that every command to the device had a particular form:

[two-byte header] [two-byte command] [two-byte Modbus CRC]

For example, the command to delete data from the device is:

0xab 0xcd 0x03 0x18 0xb1 0x05

From the traffic, we were able to infer how the command bytes related to actual commands (e.g. 0x03 0x18 means delete data). In some cases, we found that pressing one button on the control software resulted in several commands to the device.

The next step was figuring out the CRC. Given the data we had and our knowledge that the CRC was two bytes, I set about looking for CRC schemes that would give us the observed data. I found that the UT330B used a Modbus CRC check.
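The standard Modbus CRC-16 is only a few lines of Python. As a sanity check, running this sketch over the header and command bytes above reproduces the trailing 0xb1 0x05 (Modbus sends the CRC low byte first):

```python
def modbus_crc(data: bytes) -> bytes:
    """Modbus CRC-16: initial value 0xFFFF, polynomial 0xA001, little-endian result."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc.to_bytes(2, "little")

command = bytes([0xAB, 0xCD, 0x03, 0x18])     # header + 'delete data' command
print((command + modbus_crc(command)).hex())  # abcd0318b105
```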

Now I could replicate every command to the UT330B.

Figuring out the device data format came next. The device data was what the UT330B sent back in response to commands. Some of it was straightforward, e.g. dates and times, but some was more complex. To get a full range of device temperatures, I put the UT330B in a freezer (to get below 0°C) and held it over a toaster (to get above 100°C). Humidity data was a little harder: I managed to get very high humidities by holding the device in the steam from a kettle, but I had to wait for very dry days to get low humidities. It turned out the data was coded using two's complement (which caused me some problems with negative temperatures until I realized what was going on).
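Decoding two's complement values is one line in Python. This is only an illustration; the field width, byte order, and scaling below are my assumptions, not the UT330B's documented format:

```python
def decode_reading(raw: bytes) -> float:
    # Two bytes as a signed (two's complement) 16-bit integer,
    # assumed little-endian and scaled to tenths of a degree.
    return int.from_bytes(raw, "little", signed=True) / 10

print(decode_reading(b"\x0a\x01"))  # 266  ->  26.6
print(decode_reading(b"\xce\xff"))  # -50  ->  -5.0 (a negative temperature)
```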

So now I knew every command and how to interpret the results.

Writing some Python code was the next order of business. To talk to the UT330B via the USB port, we needed the PySerial library. The UT330B used a Silicon Labs chip, so we also needed to download and install the correct driver. With some trial and error, I was able to write a Python class that sent commands to the UT330B and read the responses. Now I was in complete control of my device and I didn't need the Windows control software.
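The send/receive core of that class is small. Here's a hedged sketch; the port name, baud rate, timeout, and response size are illustrative values, not the UT330B's actual settings:

```python
import serial  # the PySerial library

def send_command(port: str, payload: bytes, response_size: int = 64) -> bytes:
    """Write a framed command to the device and read back the raw response."""
    with serial.Serial(port, baudrate=115200, timeout=2) as conn:
        conn.write(payload)
        return conn.read(response_size)

# Example: the 'delete data' command from earlier, on a typical
# Silicon Labs device name on a Mac (yours may differ)
# response = send_command("/dev/cu.SLAB_USBtoUART", bytes.fromhex("abcd0318b105"))
```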

A Python class is fine, but the original command software had a nice UI where you could plot charts of temperature and humidity. I wanted something like that in Python, so I used the Bokeh library to build an app. Because I wanted a full-blown app that was written properly, I put some time into figuring out the software architecture. I decided on a variant of the model-view-controller architecture and built my app with a multi-tab display, widgets, and of course, a figure.

Here are some screenshots of what I ended up with.





The complete project is on Github here.

So what did I learn from all of this?

  • It's possible to reverse engineer (some) USB-controlled devices using the appropriate software, but it takes some effort and you have to have enough experience to do it.
  • You can control hardware using Python.
  • Bokeh is a great library for writing GUIs that are focused on charts.
  • If I were ever to release a USB hardware device, I would publish the interface specification (e.g. 0x03 0x18 means delete data) and provide example (Python or C) code to control the device. I wouldn't limit my sales to Windows users only.

Saturday, February 1, 2020

Company culture and the hall of mirrors

Company culture: real and distorted

When I was a child, my parents took me to a funfair and we walked through the hall of mirrors. The mirrors distorted reality, making us look taller or smaller, fatter or thinner than we really were. I've read a number of popular business books that extol the virtues of company culture, but I couldn't help feeling they were a hall of mirrors, distorting what's really there to achieve higher book sales.

(Reading about company culture in popular business books is like entering the hall of mirrors. Image credit: Wikimedia Commons, License: Creative Commons, Author: SJu)

I was fortunate enough to spend time researching the academic literature on company culture to understand what's really there, and the results surprised me. I took a course at the Harvard Extension School on Organizational Behavior which was taught by the excellent Carmine Gibaldi. Part of this course was a research project and I chose to look at the relationship between company culture and financial performance.

What books claim vs. evidence

Given the long list of popular business books making very definitive claims about company culture, you might expect the research literature to show some clear results. The reality is, the literature doesn't. In fact, when I investigated the support for claims made in books, I couldn't find much in the way of corroborating evidence. Going back to the popular books, in many cases I found no references, in other cases I found evidence based on the author's work alone, and in others, the evidence was just unverified anecdotes. What I did find in the research literature was a more complex story than I expected.

Company culture and financial performance

Despite what the popular business books said, it wasn't true that a strong company culture of itself led to financial success; in fact, many companies with strong cultures went bankrupt. It wasn't true that there was a magic set of values that led to company success either; different successful companies had different values and different cultures. It wasn't true that company values are immutable; even a large company like IBM could and did change its values to survive.

What I found is that there are mechanisms by which a strong and unified company culture can lead to better performance; coordination and control, goal alignment, and enhanced employee effort. For these reasons, under stable market conditions, a strong culture does appear to offer competitive advantages. However, under changing market conditions, a strong culture may be a disadvantage because strong cultures don't have the variability necessary to adapt to change. There also seemed to be a link between culture and industry, with some cultures better suited to some industries.

I read about DEC, Arthur Andersen, and Enron: all companies with strong cultures, and all held up at the time as examples of the benefits of a strong culture. DEC's culture had been studied, extensively written about, and emulated by others. Enron had several glowing case studies written about it by adoring business schools. In all these cases, company culture may well have been a factor in the company's demise. Would you take DEC or Enron as your role model now? Are the companies you're viewing as role models now going to be around in five or ten years?

Where's the beef?

Am I falling into the trap of not providing you with references for my assertions? No, because I'm going to go one stage further. I'm going to give you a link to what I wrote that includes all my references and all my arguments in more detail. Here's the report I wrote which lays out my findings - you can judge for yourself.

Popular books are mostly wrong

During my research, I read some detailed critiques of popular business books. I must admit, I was shocked at how completely the Emperor has no clothes, and at the extent of the post hoc fallacy and survivor bias in the popular books. Many of the great companies touted in these books have fallen from grace, disproving many of the assertions the books made. If you claim that company culture leads to long-lasting success, and your champion companies fail while maintaining their values, your arguments don't look great. Given the lack of research support for many business books' claims, this should come as no surprise.

What to do?

Here's my advice: any popular business book that purports to reveal a new underlying truth about business success is probably wrong. Before you start believing the claims, or extolling the book's virtues, or implementing policy changes based on what you've read, check that the book stands up to scrutiny and ideally, that there's independent corroboration.

Tuesday, January 28, 2020

Future directions for Python visualization software

The Python charting ecosystem is highly fragmented and still lags behind R; it also lacks some of the features of paid-for BI tools like Tableau or Qlik. However, things are slowly changing and the situation may be much better in a few years' time.



Theoretically, the ‘grammar of graphics’ approach has been a substantial influence on visualization software. The concept was introduced in 1999 by Leland Wilkinson in a landmark book and gained widespread attention through Hadley Wickham’s development of ggplot2. The core idea is that a visualization can be represented as different layers within a framework, with rules governing the relationships between layers.

Bokeh was influenced by the 'grammar of graphics' concept as were other Python charting libraries. The Vega project seeks to take the idea of the grammar of graphics further and creates a grammar to specify visualizations independent of the visualization backend module. Building on Vega, the Altair project is a visualization library that offers a different approach from Bokeh to build charts. It’s clear that the grammar of graphics approach has become central to Python charting software.
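To give a flavor of the grammar-of-graphics style, here's a minimal Altair sketch (the data and column names are made up): you declare the data, a mark, and encodings as separate layers rather than calling a chart-type function.

```python
import altair as alt
import pandas as pd

sales = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [120, 135, 128, 151],
})

# Data -> mark -> encodings: each layer is declared separately
chart = alt.Chart(sales).mark_bar().encode(x="month", y="revenue")
chart.save("revenue.html")  # rendered in the browser via Vega-Lite
```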

If the legion of charting libraries is a negative, the fact that they are (mostly) built on the same ideas offers some hope for the future. There’s a movement toward convergence by providing an abstraction layer above the individual libraries like Bokeh or Matplotlib. In the Python world, there’s precedent for this: the database API provides an abstraction layer above the various Python database libraries. Currently, the Panel project and HoloViews offer abstraction layers for visualization, though there are discussions of a more unified approach.

My take is, the Python world is suffering from having a confusing array of charting library choices which splits the available open-source development efforts across too many projects, and of course, it confuses users. The effort to provide higher-level abstractions is a good idea and will probably result in fewer underlying charting libraries, however, stable and reliable abstraction libraries are probably a few years off. If you have to produce results today, you’re left with choosing a library now.

The big gap between Python and BI tools like Tableau and Qlik is the ease of deployment and speed of development. BI tools reduce the skill level needed to build apps, deploy them to servers, and manage tasks like access control. Projects like HoloViews may evolve to make chart building easier, but there are still no good, easy, and automated deployment solutions. However, some of the component parts for easier deployment exist, for example, Docker, and it’s not hard to imagine the open-source community moving its attention to deployment and management once the various widget and charting issues of visualization have been solved.

Will the Python ecosystem evolve to be as good as R’s and be good enough to take on BI tools? Probably, but not for a few years. In my view, this evolution will happen slowly and in public (e.g. talks at PyCon, SciPy etc.). The good news for developers is, there will be plenty of time to adapt to these changes.

Saturday, January 25, 2020

Clayton Christensen, RIP

I was saddened today to read of the death of Clayton Christensen. For me, he was one of the great business thinkers of the last 100 years.


(Image credit: Remy Steinegger, Wikimedia Commons, Creative Commons License)

I remember finding his wonderful book, "The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail", on the shelf at City University in London and devouring it on the way home on the tube. I only recommend a handful of business books and Christensen's book is number one on my list.

Christensen was fascinated by technology disruption, especially how industry leaders are caught out by change and why they don't respond in time. The two big examples in his book are steam shovels and disk drives.

The hydraulic shovel was a cheap and cheerful innovation that didn't seem to threaten the incumbent steam shovel makers. If anything, it was good for their profitability because hydraulic technology addressed the low end of the market where margins were poor, leaving the high-end more profitable market niches to steam shovels. Unfortunately for the steam shovel makers, the hydraulic shovel makers kept on innovating and pushed their technology into more and more niches. In the end, the steam shovel makers were relegated to just a few small niches and it was too late to respond.

The disk drive industry is Christensen's most powerful example. There have been successive waves of innovation in this industry as drives increased in capacity and reduced in size. The leaders in one wave were mostly not the leaders in subsequent waves. Christensen's insight was, the same factors were at work in the disk drive industry as in the steam shovel industry. Incumbents wanted to maximize profitability and saw new technologies coming in at the bottom end of the market where margins were low. Innovations were dismissed as being low-end technologies not competing with their more profitable business. They didn't respond in time as the disruptive technologies increased their capabilities and started to compete with them.

Based on these examples, Christensen teases out why incumbents didn't respond in time and what companies can do to not be caught out by these kinds of disruptive innovations.

Company culture is obviously a factor here, and Christensen poked into this more with a very accessible 2000 paper where he and Overdorf discussed the factors that can lead to a company culture that ignores disruptive innovation.

I never met Christensen, but I heard a lot of good things about him as a person. My condolences to his family.

Further reading

"The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail" - Clayton Christensen

"Meeting the Challenge of Disruptive Change", Clayton M. Christensen and Michael Overdorf, Harvard Business Review, https://hbr.org/2000/03/meeting-the-challenge-of-disruptive-change

How to lie with statistics

I recently re-read Darrell Huff's classic text from 1954, 'How to lie with statistics'. In case you haven't read it, the book takes a number of deceitful statistical tricks of the trade and explains how they work and how to defend yourself from being hoodwinked. My overwhelming thought was 'plus ça change'; the more things change, the more they remain the same. The statistical tricks people used to mislead more than 60 years ago are still being used today.



(Image credit: Wikipedia)

Huff discusses surveys and how very common methodology flaws can produce completely misleading results. His discussion of sampling methodologies, and the problems with them, are clear and unfortunately, still relevant. Making your sample representative is a perennial problem as the polling for the 2016 Presidential election showed. Huff's bias comments especially rang true to me, given my prior experiences as a market research street interviewer. In my experience, even people with a very good statistical education aren't aware of survey flaws and sources of bias.

The chapter on averages still holds up. Huff shows how the mean can be distorted and why the median might be a better choice. I've interviewed people with Master's degrees in statistics who couldn't explain why the median might be a better choice of average than the mean, so I guess there's still a need for the lesson.

One area where I think things have moved in the right direction is the decreasing use of some types of misleading charts. Huff discusses the use of images to convey quantitative information. He shows a chart where steel production was represented by images of a blast furnace (see below). The increase in production was 50%, but because the height and width of the image were both increased, the area consumed by the images increased by 150%, giving the overall impression of a 150% increase in production [1]. I used to see a lot of these image-based charts, but their use has declined over the years. It would be nice to think Huff had some effect.



(Image credit: How to lie with statistics)

Staying with charts, his discussion about selecting axis ranges to mislead still holds true and there are numerous examples of people using this technique to mislead every day. I might write a blog post about this at some point.

He has chapters on the post hoc fallacy (confusing correlation and causation) and has a nice explanation of how percentages are regularly mishandled. His discussion of general statistical deceitfulness is clear and still relevant.

Unfortunately, the book hasn't aged very well in other aspects. 2020 readers will find his language sexist, the jokey drawings of a smoking baby are jarring, and his roundabout discussion of the Kinsey Reports feels odd. Even the writing style is out of date.

Huff himself is tainted: he was funded by the tobacco industry to cast doubt on smoking as a cause of cancer. He even wrote a follow-up book, 'How to lie with smoking statistics', to debunk anti-smoking data. Unfortunately, his source of authority was the widespread success of 'How to lie with statistics'. 'How to lie with smoking statistics' isn't available commercially anymore, but you can read about it on Alex Reinhart's page.

Despite all its flaws, I recommend you read this book. It's a quick read and it'll give you a grounding in many of the problems of statistical analysis. If you're a business person, I strongly recommend it - its lessons about cautiously interpreting analysis still hold.

Correction

[1] Colin Warwick pointed out an error in my original text. My original text stated the height and width of the second chart increased by 50%. That's not quite what Huff said. I've corrected my post.

Wednesday, January 22, 2020

The Python plotting ecosystem

Python’s advance as a data processing language has been hampered by its lack of good-quality chart visualization libraries, especially compared to R with its ggplot2 and Shiny packages. By any measure, ggplot2 is a superb package that can produce stunning, publication-quality charts. R’s advance has also been helped by Shiny, a package that enables users to build web apps from R, in effect allowing developers to create Business Intelligence (BI) apps. Beyond the analytics world, the D3 visualization library in JavaScript has had an impact well beyond the JavaScript community; it provides an outstanding example of what you can do with graphics in the browser (if you get time, check out some of the great D3 visualization examples). Compared to D3, ggplot2, and Shiny, Python’s visualization options still lag behind, though things have evolved in the last few years.


(An example Bokeh application. Multi-tabbed, widgets, chart.)

Matplotlib is the granddaddy of chart visualization in Python; it offers most of the functionality you might want and is available with almost every Python distribution. Unfortunately, its longevity is also its problem. Matplotlib was originally based on MATLAB’s charting features, which were in turn developed in the 1980s. That longevity has left it with an awkward interface and some substantially out-of-date defaults. In recent years, the Matplotlib team has updated some of the visual defaults and offered new templates that make Matplotlib charts look less old-fashioned; however, the library is still oriented towards non-interactive charts, and its interface still leaves much to be desired.

Seaborn sits on top of Matplotlib and provides a much more up-to-date interface and visualization defaults. If all you need is a non-interactive plot, Seaborn may well be a good option; you can produce high-quality plots in a rational way and there are many good tutorials out there.

Plotly provides static chart visualizations too, but goes a step further and offers interactivity and the ability to build apps. There are some great examples of Plotly visualizations and apps on the web. However, Plotly is a paid-for solution; you can do most of what you want with the free tier, but you may run into cases where you need to purchase additional services or features.

Altair is another plotting library for Python based on the 'grammar of graphics’ concept and the Vega project. Altair has some good features, but in my view, it isn’t as complete as Bokeh for business analytics.

Bokeh is an ambitious attempt to offer D3 and ggplot2-like charts plus interactivity, with visualizations rendered in a browser, all in an open-source and free project. Interactivity here means having tools to zoom into a chart or move around in the chart, and it means the ability to create (browser-based) apps with widgets (like dropdown menus) similar to Shiny. It’s possible to create chart-based applications and deploy them via a web server, all within the Bokeh framework. The downside is, the library is under active development and therefore still changing; some applications I developed a year ago no longer work properly with the latest versions. Having said all that, Bokeh is robust enough for commercial use today, which is why I’ve chosen it for most of my visualization work.
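As a taste of the library, here's a minimal Bokeh sketch (made-up data; written against the API current at the time of this post) that writes an interactive HTML chart with pan and zoom tools:

```python
from bokeh.plotting import figure, output_file, show

# Made-up data for illustration
months = [1, 2, 3, 4, 5, 6]
revenue = [120, 135, 128, 151, 160, 158]

output_file("revenue.html")
p = figure(title="Monthly revenue", x_axis_label="Month", y_axis_label="Revenue ($k)")
p.line(months, revenue, line_width=2)
p.circle(months, revenue, size=8)
show(p)  # opens the interactive chart in a browser
```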

HoloViews sits on top of Bokeh (and other plotting engines) and offers a higher-level interface for building charts with less code.

It’s very apparent that the browser is becoming the default visualization vehicle for Python. This means I need to mention Flask, a web framework for Python. Although Bokeh has a lot of functionality for building apps, if you want to build a web application that has forms and other typical features of web applications, you should use Flask and embed your Bokeh charts within it.

If you’re confused by the various plotting options, you’re not alone. Making sense of the Python visualization ecosystem can be very hard, and it can be even harder to choose a visualization library. I looked at the various options and chose Bokeh because I work in business and it offered a more complete and reliable solution for my business needs. In a future blog post, I'll give my view of where things are going for Python visualization.