Friday, August 8, 2025

Yes, the referee might be biased. Discipline in English football.

Red cards, yellow cards, and fouls

There are all kinds of theories surrounding red cards, yellow cards, and fouls in English football. Here are a few:

  • Referees are biased against away teams and award more red and yellow cards to away teams than they deserve.
  • More risk-taking leads to more goals, but also more red and yellow cards.
  • More red and yellow cards are being issued after the introduction of VAR (Video Assistant Referee).

In this blog post, I’m going to put these theories to the test. As we’ll see, some stand up to scrutiny while others don’t.

(Ardfern, CC BY-SA 4.0, via Wikimedia Commons)

The data set

I took this data from www.football-data.co.uk, who have a set of data going back 20 or so years for red cards, yellow cards, and fouls. The data is for the top four tiers of English league football.

Red cards have been a steady feature of English football since 1987, so in principle, we could get another decade’s worth of data if it were available. As always with football, the problem is the availability of data.

What we know already

I’ve been analyzing a data set on English league football from 1888 onwards. Here’s what I’ve learned so far:

With all this in mind, let’s start to look at the data.

Total red cards, yellow cards, and fouls

The three charts below show the mean number of red cards, yellow cards, and fouls for each match for each league for each season. The charts are interactive; click the legend to turn the league tier lines on and off, and use the tools to move around the chart. The tooltips will show the values for each point on the line.
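For anyone who wants to reproduce the per-match means, here's a minimal sketch of the aggregation. It is not the code behind the charts. It assumes each match is a row with home and away counts in columns named HR/AR (reds), HY/AY (yellows), and HF/AF (fouls), plus hypothetical `season` and `tier` fields; the rows below are made-up toy values, not real results.

```python
from collections import defaultdict

# Toy match rows; every value here is made up for illustration.
# Assumed column names: HR/AR = home/away reds, HY/AY = yellows, HF/AF = fouls.
matches = [
    {"season": "2019-20", "tier": 1, "HR": 0, "AR": 1, "HY": 2, "AY": 3, "HF": 10, "AF": 12},
    {"season": "2019-20", "tier": 1, "HR": 0, "AR": 0, "HY": 1, "AY": 2, "HF": 9, "AF": 11},
    {"season": "2019-20", "tier": 2, "HR": 1, "AR": 1, "HY": 2, "AY": 2, "HF": 12, "AF": 14},
]


def mean_per_match(rows, home_col, away_col):
    """Mean of (home + away) counts per match, keyed by (season, tier)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        key = (row["season"], row["tier"])
        totals[key] += row[home_col] + row[away_col]
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


mean_reds = mean_per_match(matches, "HR", "AR")
mean_yellows = mean_per_match(matches, "HY", "AY")
mean_fouls = mean_per_match(matches, "HF", "AF")
```

Each dictionary maps a (season, tier) pair to one point on the corresponding chart line.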

The red card chart shows a decline over time, with no real differences between leagues. Notably, there’s no COVID effect. 

The yellow card data is more interesting. There does appear to be a COVID effect. The data shows a decline and then a significant uptick across all leagues starting in 2021. This can’t be a VAR thing, as VAR isn’t used in the Championship (tier 2) or the lower leagues. Could this be due to a change in guidance given to referees?

The foul data shows no COVID effect. There is a decline, then around 2012, it starts to increase for all leagues except the Premier League, before dropping in 2021. It could be that referees became stricter and started awarding yellow cards for infringements that would previously have earned only a free kick.

One of the initial theories was that more risk-taking = more cards = more goals. This just doesn’t seem to be the case. The trend in red cards is downwards, but the goals per match trend is steady, except for the Premier League where it’s up a bit, so we really can’t claim more red cards = more goals. The yellow card and foul story is similar; we just don’t see the same trends in goals.

What about the VAR issue? VAR was introduced in the Premier League in 2019. If it had a dramatic impact, we would expect to see it in the data, but we just don’t. Remember, VAR is about a few calls in a few matches. A bad referee decision gets way more press than a good call. So while VAR may result in fewer bad decisions, it doesn’t seem to touch the overall picture very much.

Home team bias

Let’s turn now to look at home team bias. Because each club plays all the other clubs in the league at home and away, we would expect the number of red cards, yellow cards, and fouls to be about the same for home and away clubs. I’ve defined an away bias metric, which for red cards is:

\[\text{away red card fraction} = \dfrac{\text{count of away club red cards}}{\text{count of all red cards}}\]

The away yellow card and away foul fractions are defined in the same way. If there were no home bias, we would expect the fraction to be 0.5. If referees were completely biased in favor of the home team, every card would go to the away club and the fraction would be 1; if they were completely biased in favor of the away team, it would be 0.
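As a quick illustration of the metric, here's a small sketch. The function name `away_fraction` and the counts in the example are my own illustrative choices, not real data.

```python
def away_fraction(home_count, away_count):
    """Fraction of all cards (or fouls) that went against the away club.

    Returns None when there were no cards at all, because the
    fraction is undefined in that case.
    """
    total = home_count + away_count
    if total == 0:
        return None
    return away_count / total


# Made-up example: 45 home reds and 55 away reds in a season.
example_fraction = away_fraction(45, 55)
```

With these toy counts the fraction is 0.55, which sits above the no-bias value of 0.5.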

Here are the charts for away red card fraction, away yellow card fraction, and away foul fraction. The values are almost all above 0.5. I could do a z-test to show that they’re statistically significantly above 0.5, but there’s no need. You can clearly see it in the data over 24 years.
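For completeness, here's a rough sketch of what a one-sided, one-proportion z-test against 0.5 could look like. It treats every card as an independent trial, which is an assumption that probably doesn't hold exactly (cards within the same match are likely correlated), and the counts in the example are made up.

```python
import math


def one_proportion_z_test(away_count, total_count, p0=0.5):
    """One-sided z-test of H0: p = p0 against H1: p > p0.

    Assumes each card is an independent trial (a simplification).
    Returns (z, p_value).
    """
    p_hat = away_count / total_count
    # Standard error under the null hypothesis.
    se = math.sqrt(p0 * (1 - p0) / total_count)
    z = (p_hat - p0) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Made-up example: 550 of 1,000 cards went to away clubs.
z_stat, p_val = one_proportion_z_test(550, 1000)
```

A small p-value would say that a fraction this far above 0.5 is unlikely under the no-bias hypothesis; it wouldn't say anything about the cause.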

The red card data shows no strong trends over time. I could do a linear least-squares fit which would probably show a slight decline, but I’m reluctant to because this is a very noisy data set and I have no reason to believe the data is linear. There’s no COVID effect.

There is a noticeable drop over time in the yellow card data. The drop seems to be steady and consistent across all leagues. There is a COVID effect here.

There are no strong discernible trends in the foul data. There’s a strong COVID effect.

Why there is a COVID effect for yellow cards and fouls but not red cards I don’t know. A possible explanation is that yellow cards and fouls are more open to judgement and therefore more open to being swayed by home supporters. No supporters means no crowd influence on referees. 

Why does the yellow card data show a decline over time and not the other data? Maybe it’s the discretion thing and referees becoming immune to crowd influence over time?

All the fraction data shows a home bias. I have a couple of possible explanations for this:

  • Away teams play more aggressively and therefore get punished more.
  • Referees are biased.

Do teams play tactically differently at home and away? I’m sure it happens, but I just don’t buy it as the explanation. Biased referees seem more likely to me, with the bias coming from home supporters’ influence, but this explanation is problematic too. Attendance is up over the last twenty years, so if more supporters = more influence, we might expect to see an increase in away bias, but we don't.

The wrong analysis

For this work, I did a literature survey and I came across a paper that attempted to quantify the impact of VAR in the Premier League by comparing red cards before and after its introduction. The analysis used various statistical methods and came to a conclusion that there was an effect. Trouble is, they only used two seasons' worth of data and they didn't look at the lower leagues. My take is, there may well be an effect, but it's a lot less than the season-to-season variance and it's almost impossible to pick out from the data, regardless of what the statistical analysis says. 
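To make the "effect versus season-to-season variance" point concrete, here's a toy sketch that compares a before/after shift in the mean with the spread between seasons. The function name and the numbers are hypothetical, chosen only to illustrate the comparison; they are not taken from the data set or from the paper.

```python
import statistics


def shift_vs_spread(pre_seasons, post_seasons):
    """Compare the change in mean with the pre-period seasonal spread.

    Returns (shift, spread, ratio), where ratio = |shift| / spread.
    A ratio well below 1 means the shift is smaller than the typical
    season-to-season variation.
    """
    shift = statistics.mean(post_seasons) - statistics.mean(pre_seasons)
    spread = statistics.stdev(pre_seasons)  # sample standard deviation
    return shift, spread, abs(shift) / spread


# Made-up mean red cards per match for each season.
pre_var = [0.16, 0.12, 0.18, 0.14]
post_var = [0.15, 0.13]

shift, spread, ratio = shift_vs_spread(pre_var, post_var)
```

With only two seasons after the change, as in these toy numbers, the shift is well inside the seasonal spread, which is why I'm wary of reading much into it.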

Where are we?

Home advantage is real and manifests itself in the disciplinary data. It may be caused in part by the influence of home supporters, but the picture isn’t clear. The idea that more risk-taking = more disciplinary issues = more goals (and the inverse) doesn’t seem to be supported by the data. Lastly, there’s no clear VAR effect in the data.

It’s still a puzzle why home field advantage exists to the extent that it does and why it's declining.
