Monday, January 17, 2022

Cultural add or fit?

What does cultural fit mean?

At a superficial level, company culture can be presented as free food and drink, table tennis and foosball, and of course company parties. More realistically, it means how people interact with each other, what behavior is encouraged, and crucially what behavior isn't tolerated. At the most fundamental level, it means who gets hired, fired, or promoted.

Cultural fit means how well someone can function within a company or team. At best, it means their personality and the company's way of operating are aligned so the person thrives within the company, performs well, and stays a long time. In this case, everyone's a winner.

For a long time, managers have hired for cultural fit because of the benefits of getting it right.

The unintended consequences

Although cultural fit seems like a good thing to hire for, it has several downsides.

Hiring for cultural fit over the long term means that you can get groupthink. In some situations that's fine, for example, mature or slowly moving industries benefit from a consistent approach over time. But during periods of rapid change, it can be bad because the team doesn't have the diversity of thought to effectively respond to threats; the old ways don't work anymore but the team still fights yesterday's battles.

For poorly performing teams, hiring for cultural fit can mean more of the same, which can be disastrous on two levels: it cements the existing problems and blocks new ways of working.

(Monks in a monastery are a great example of cultural fit. But not everyone wants to join a monastery. Abraham Sobkowski OFM, CC BY-SA 3.0, via Wikimedia Commons)

Cultural add

In contrast to cultural fit that focuses on conformity, cultural add focuses on what new and different things an employee can bring to the team.

Cultural add is not (entirely) about racial diversity; in fact, I would argue it's a serious error to view cultural add solely in racial terms. I've worked with teams composed of individuals from different races, but they all went to the same universities and all had the same middle-class backgrounds. The team looked and sounded diverse but their thinking was strikingly uniform.

Here are some areas of cultural add you might think about:

Someone who followed a non-traditional path to get to where they got. This can mean:

Military experience
Non-university education
They transitioned from one discipline to another (e.g. someone who initially majored in law now working in machine learning).

Single parents. Many young tech teams are full of young single people. A single parent has a radically different set of experiences. They may well bring a much-needed focus on work-life balance.
Older candidates. Their experience in different markets and different companies may be just what you need.
Working-class backgrounds. Most people in tech come from middle-class backgrounds (regardless of country of origin). Someone whose parents were very blue-collar may well offer quite a different perspective.

I'm not saying anything new when I say a good hiring process considers the strengths and weaknesses of a team before the hiring process starts. For example, if a team is weak on communication with others, a desirable feature of a new hire is good communications skills. Cultural add takes this one stage further and actively looks for candidates who bring something new to the table, even when that new thing isn't well-defined.

Square pegs in round holes

The cultural add risk is the same as any hiring risk: you get someone who can't work with the team or who can't perform. Even with cultural add, you still need to recruit someone the team can work with. Cultural add can't be the number one hiring criteria, but it should be a key one.

What all this means in practice

We can boil this down to some don'ts and dos.

Don'ts

Hire people who went to the same small group of universities.
Assume racial diversity = cultural add.
Add people who are exactly the same as the current team.
Rely on employee referrals (people tend to know people who are like them).

Do:

Look for people with non-traditional backgrounds.
Be aware of the hiring channels you use and try and reach out beyond the usual channels.
Look for what new thing or characteristic the candidate brings. This means thinking about the interview questions you ask to find the new thing.
Think about your hiring process and how the process itself filters candidates. If you have a ten-stage process, or a long take-home test, or you do multiple group interviews, this can cause candidates to drop out - maybe even the candidates you most want to attract.

Cultural add goes beyond the hiring process, you have to think about how a person is welcomed. I've seen teams unintentionally (and intentionally) freeze people out because they were a bit different. If you really want to make cultural add work, management has to commit to making it work post-hire.

An old joke

Two men become monks and join a monastery. One of the men is admitted because he's a cultural fit, the other because he's a cultural add.

After dinner one evening, the monks are silent for a while, then one monk says "23" and the other monks smile. After a few minutes, another monk very loudly says "82", and the monks laugh. This goes on for a while to the confusion of the two newcomers. The abbot whispers to them: "We've been together so long, we know each other's jokes, so we've numbered them to save time". The new monks decide to join in.

The cultural fit monk says "82" and there's polite laughter - they've just heard the same joke. The cultural add monk thinks for a second and says "189". There's a pause for a second as the monks look at one another in wonder, then they burst out in side-splitting laughing. Some of the monks are crying with laughter and one looks like he might need oxygen. The laughter goes on for a good ten minutes. The abbot turns to the cultural add monk and says: "they've never heard that one before!".

If you want more of the same, go for cultural fit, if you want something new, go for cultural add.

Friday, January 7, 2022

Prediction, distinction, and interpretation: the three parts of data science

What does data science boil down to?

Data science is a relatively new discipline that means different things to different people (most notably, to different employers). Some organizations focus solely on machine learning, while other lean on interpretation, and yet others get close to data engineering. In my view, all of these are part of the data science role.

I would argue data science generally is about three distinct areas:

Prediction. The ability to accurately extrapolate from existing data sets to make forecasts about future behavior. This is the famous machine learning aspect and includes solutions like recommender systems.
Distinction. The key question here is: "are these numbers different?". This includes the use of statistical techniques to decide if there's a difference or not, for example, specifying an A/B test and explaining its results.
Interpretation. What are the factors that are driving the system? This is obviously related to prediction but has similarities to distinction too.

(A similar view of data science to mine: Calvin.Andrus, CC BY-SA 3.0, via Wikimedia Commons)

I'm going to talk through these areas and list the skills I think a data scientist needs. In my view, to be effective, you need all three areas. The real skill is to understand what type of problem you face and to use the correct approach.

Distinction - are these numbers different?

This is perhaps the oldest area and the one you might disagree with me on. Distinction is firmly in the realm of statistics. It's not just about A/B tests or quasi-experimental tests, it's also about evaluating models too.

Here's what you need to know:

Confidence intervals.
Sample size calculations. This is crucial and often overlooked by experienced data scientists. If your data set is too small, you're going to get junk results so you need to know what too small is. In the real world. increasing the sample size is often not an option and you need to know why.
Hypothesis testing. You should know the difference between a t-test and a z-test and when a z-test is appropriate (hint: sample size).
α, β, and power. Many data scientists have no idea what statistical power is. If you're doing any kind of statistical testing, you need to have a firm grasp of power.
The requirements for running a randomized control trial (RCT). Some experienced data scientists have told me they were analyzing results from an RCT, but their test just wasn't an RCT - they didn't really understand what an RCT was.
Quasi-experimental methods. Sometimes, you just can't run an RCT, but there are other methods you can use including difference-in-difference, instrumental variables, and regression discontinuity. You need to know which method is appropriate and when.
Regression to the mean. This is why you almost always need a control group. I've seen experienced data scientists present results that could almost entirely be explained by regression to the mean. Don't be caught out by one of the fundamentals of statistics.

Prediction - what will happen next?

This is the piece of data science that gets all the attention, so I won't go into too much detail.

Here's what you need to know:

The basics of machine learning models, including:

Generalized linear modeling
Random forests (including knowing why they are often frowned upon)
k-nearest neighbors/k-means clustering
Support Vector Machines
Gradient boosting.

Cross-validation, regularization, and their limitations.
Variable importance and principal component analysis.
Loss functions, including RMSE.
The confusion matrix, accuracy, sensitivity, specificity, precision-recall and ROC curves.

There's one topic that's not on any machine learning course or in any machine learning book that I've ever read, but it's crucially important: knowing when machine learning fails and when to stop a project. Machine learning doesn't work all the time.

Interpretation - what's going on?

The main techniques here are often data visualization. Statistical summaries are great, but they can often mislead. Charts give a fuller picture.

Here are some techniques all data scientists should know:

Heatmaps
Violin plots
Scatter plots and curve fitting
Bar charts
Regression and curve fitting.

They should also know why pie charts in all their forms are bad.

A good knowledge of how charts work is very helpful too (the psychology of visualization).

What about SQL and R and Python...?

You need to be able to manipulate data to do data science, which means SQL, Python, or R. But plenty of people use these languages without being data scientists. In my view, despite their importance, they're table stakes.

Book knowledge vs. street knowledge

People new to data science tend to focus almost exclusively on machine learning (prediction in my terminology) which leaves them very weak on data analysis and data exploration; even worse, their lack of statistical knowledge sometimes leads them to make blunders on sample size and loss functions. No amount of cross-validation, regularization, or computing power will save you from poor modeling choices. Even worse, not knowing statistics can lead people to produce excellent models of regression to the mean.

Practical experience is hugely important; way more important than courses. Obviously, a combination of both is best, which is why PhDs are highly sought after; they've learned from experience and have the theoretical firepower to back up their practical knowledge.