Breaking down … Expected goals

What are Expected Goals?

Expected goals, or xG (as I will call it for the rest of this breakdown), has attracted plenty of both negative and positive press recently following its appearance on Match of the Day, a leading British TV programme that shows highlights of all the Premier League games. Comments from pundits and former players have been very mixed: some argue that the use of these statistics has ruined the essence of the game, whereas others argue that it is another significant step towards quantifying the intricacies of football.

So, what does the term expected goals actually mean? It is essentially a tool that helps you understand a game better by measuring the quality of the shots each team takes. The simplest xG models do this by looking only at the location of each shot and the likelihood that a shot from that position leads to a goal. To find this probability, the model looks at a historical data set of many shots from many positions and records which of those shots led to goals.
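The location-only idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any published xG model is built: the coordinate system (x as distance from the goal line and y as sideways offset from the centre of the goal, both in metres), the 5-metre zones, the function names and the toy shot list are all assumptions made up for the example.

```python
from collections import defaultdict

ZONE_SIZE = 5.0  # assumed width of each square pitch zone, in metres


def zone_of(x, y):
    """Map a shot location to the square zone that contains it."""
    return (int(x // ZONE_SIZE), int(y // ZONE_SIZE))


def fit_zone_xg(shots):
    """Estimate xG per zone as goals / shots from historical data.

    shots: iterable of (x, y, is_goal) tuples.
    """
    tallies = defaultdict(lambda: [0, 0])  # zone -> [goals, shots]
    for x, y, is_goal in shots:
        tally = tallies[zone_of(x, y)]
        tally[0] += int(is_goal)
        tally[1] += 1
    return {zone: goals / total for zone, (goals, total) in tallies.items()}


def shot_xg(model, x, y):
    """Look up xG for a new shot; None if the zone has no history."""
    return model.get(zone_of(x, y))


# Made-up historical shots: four close to goal, four from about 22 metres out.
history = [
    (3, 1, True), (4, 2, False), (2, 3, True), (1, 1, False),
    (22, 3, False), (23, 1, False), (21, 4, False), (24, 2, True),
]
model = fit_zone_xg(history)
print(shot_xg(model, 2, 2))   # close-range zone
print(shot_xg(model, 22, 2))  # long-range zone
```

With real data the history would contain many thousands of shots, and a zone with no recorded shots simply has no estimate, which is why the lookup can return None.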

This simple shot location diagram forms the backbone of the xG model and shows that the angle from which a shot is taken greatly influences its xG.
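One common way to put a number on "angle" is the angle between the two goalposts as seen from the shot location. The sketch below is plain geometry rather than any particular provider's method, and the coordinate system (x as distance from the goal line, y as sideways offset from the centre of the goal, in metres) is an assumption for illustration. The 7.32-metre goal width is the standard distance between the posts.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts


def goal_angle_deg(x, y):
    """Angle in degrees between the two posts as seen from the shot.

    x: distance from the goal line, y: sideways offset from the centre
    of the goal (both in metres; assumed coordinate system).
    """
    # Cross and dot product of the vectors from the shot to each post.
    cross = GOAL_WIDTH * x
    dot = x**2 + y**2 - (GOAL_WIDTH / 2) ** 2
    return math.degrees(math.atan2(cross, dot))


central = goal_angle_deg(11, 0)   # straight in front of goal
wide = goal_angle_deg(11, 15)     # same depth, 15 m out to the side
print(round(central, 1), round(wide, 1))
```

The shot from the wide position sees a much narrower slice of the goal than the central one, even though both are the same distance from the goal line.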

Why are Expected Goals important?

xG shows us how many goals you would expect a team to score from the shots they have taken. However, shot location is not the only factor that contributes to the final xG value of a shot. Other factors include whether the shot was taken with the foot or the head (headers have a lower xG as they are statistically less likely to result in goals), the type of pass leading to the shot (e.g. cutback, cross or through ball), whether the player went around the keeper, and the speed of the attack that led to the shot. One big advantage of xG is that it can help to counteract a player’s often-misleading shot numbers. For example, here we will compare Jonjo Shelvey and Romelu Lukaku’s shot locations for the 2018-19 season.

Here you can clearly see that the large majority of Lukaku’s shots are taken from inside the box, whereas Shelvey is the opposite and often shoots from outside the box. Last season Lukaku was taking 2.34 shots per 90 minutes and Shelvey was not far behind on 2.22. In previous years, without xG data, scouts could have been misled into believing that both players provide a similar attacking output because of their comparable shot numbers. However, the xG model recognises that Lukaku is taking shots that are statistically much more likely to result in a goal: Shelvey has an xG of only 0.07 per 90 compared to Lukaku’s 0.56. Comparing shot numbers with xG in this way can give scouts valuable insight into a player. For example, high shot numbers with low xG numbers could suggest that:

  1. The player lacks match intelligence and persists with long shots despite them being a low-percentage option even for players with the best finishing ability.
  2. The team around the player is weak at chance creation, so a deeper-lying player may feel obliged to shoot from further away from goal.
  3. The player may lack the skills to get inside the box and shoot.
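The Lukaku and Shelvey comparison above can be boiled down to a single number, xG per shot, by dividing xG per 90 by shots per 90. The snippet below is a small sketch of that arithmetic using only the per-90 figures quoted in the text; the dictionary layout and function name are just illustrative choices.

```python
# Per-90 figures quoted in the text for the 2018-19 season.
per_90 = {
    "Lukaku": {"shots": 2.34, "xg": 0.56},
    "Shelvey": {"shots": 2.22, "xg": 0.07},
}


def xg_per_shot(stats):
    """Average chance quality: xG per 90 divided by shots per 90."""
    return stats["xg"] / stats["shots"]


for player, stats in per_90.items():
    print(f"{player}: {xg_per_shot(stats):.3f} xG per shot")
```

On these numbers Lukaku’s average shot is worth roughly 0.24 xG against roughly 0.03 for Shelvey, which is the gap that raw shot counts hide.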

Harry Maguire’s shot location diagram (he is known to score almost exclusively from headers) shows how the xG model is also able to differentiate between headed and footed shots. Despite taking a considerably high 1.04 shots per 90 for a centre back, his xG per 90 is only 0.06, showing that these chances are usually of low quality. Maguire is known as a real threat in the air, however, and last season he outperformed his xG, scoring 3 goals where xG had him at 1.86. Expected goals can be used not only in an attacking sense but also in a defensive sense, to show the quality of a defence or goalkeeper.

I have chosen the game between Arsenal and Manchester United on December 2nd 2017 to highlight the importance of xG as a defensive statistic. In this game David De Gea was insane, giving one of the best performances of his career and making a monumental 14 saves, equalling the Premier League record. To watch De Gea’s heroics, use this URL: https://www.youtube.com/watch?v=tH4I8qM_4Eo. Despite United winning 3-1, Arsenal dominated the game, taking 33 shots with 16 on target. The xG model in this case almost acts like a football fan, understanding that Arsenal were unlucky to lose and going as far as to suggest that a more fitting scoreline, based on the quality of the chances, would have been 4-2 to Arsenal. This indicates clearly to a scout that the goalkeeper severely overperformed xG in this game and won it for his side. These statistics can also be used to critically evaluate your own team: this result shows that the defence in front of him was poor in this game and is an area to improve, either in the upcoming transfer window or by promoting promising youth squad members to the first team.

De Gea vs Arsenal

xG is in fact better at telling you the quality of a player or a team than goals themselves. Imagine a scenario where, in the first 10 games of the season, Team A goes on a hot streak, scoring several 40-yard screamers to clinch games, and finds itself top of the league. Team B may be placed second having scored far fewer goals, but its xG could very easily be much higher than Team A’s. Expected goals rewards consistent, high-quality chance creation, and this is almost always indicative of a much higher-quality side. In this scenario, logic suggests that by the end of the season you would expect Team B to finish with the most goals, as they are the side able to build attacks to the point of a tap-in in the six-yard box.
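The Team A and Team B scenario can be illustrated with a small simulation. This is a sketch under made-up assumptions: the per-game shot profiles below are invented for the example, and each shot is treated as an independent coin flip that goes in with probability equal to its xG. A team’s total xG is then just the sum of its shot probabilities, and over enough games the average goals scored settles close to that total.

```python
import random


def expected_goals(shot_xgs):
    """Total xG is the sum of the individual shot probabilities."""
    return sum(shot_xgs)


def simulate_goals(shot_xgs, rng):
    """Play the shots out once: each goes in with probability equal to its xG."""
    return sum(rng.random() < xg for xg in shot_xgs)


# Hypothetical per-game shot profiles (all values are made up).
team_a = [0.03] * 12   # twelve long-range efforts
team_b = [0.30] * 5    # five clear chances near goal

rng = random.Random(0)  # fixed seed so the run is repeatable
games = 20_000
avg_a = sum(simulate_goals(team_a, rng) for _ in range(games)) / games
avg_b = sum(simulate_goals(team_b, rng) for _ in range(games)) / games

print(expected_goals(team_a), avg_a)
print(expected_goals(team_b), avg_b)
```

Over a handful of games the long-range team can still come out ahead if a few screamers fly in, but the long-run averages track the xG totals, which is the sense in which a hot streak should not be expected to last.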

Looking at the top scorers from the Premier League and comparing their xG with their actual goals, you can see that the two are not far apart at all, showing how well xG matches real-life results. The top scorers list does tend to feature the players who have overperformed xG the most, as they were the ones able to finish difficult chances in order to make the list. For example, Sadio Mané, who was an extremely prolific finisher this season, overperformed his xG by 5.24. For most other players, goals and xG are much closer, as the top scorers tend to have had supercharged seasons with purple patches in order to reach the top of the list.
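Overperformance here simply means goals scored minus xG. As a minimal sketch, the function below applies that subtraction to the Harry Maguire figures quoted earlier in the text (3 goals from 1.86 xG); the function name is just an illustrative choice.

```python
def xg_overperformance(goals, xg):
    """Goals scored minus expected goals; positive means overperformance."""
    return goals - xg


# Harry Maguire's figures quoted earlier in the text: 3 goals from 1.86 xG.
maguire = xg_overperformance(goals=3, xg=1.86)
print(round(maguire, 2))
```

A negative value would indicate a player scoring fewer goals than the quality of their chances suggested.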

Limitations of the Expected Goals metric

As with all statistics, the xG model has some limitations. The first is that it does not take into account or differentiate in any way the finishing ability of the player, as introducing player identity reduces the accuracy of the comparison. It is difficult to put a quantitative measure on a player’s finishing ability other than by looking at goals, which, as explained before, can be misleading. If a player scores many goals but the majority of them are tap-ins, the goal tally gives no indication of how prolific the player really is in front of goal. Undeniably, finishing ability does exist, and this can clearly be observed when watching Luis Suarez compared to Jozy Altidore. On the other hand, through the xG model, many scouts have been able to realise what truly makes a forward great. In the grand scheme of things, it is the ability to get high-quality chances close to goal that makes them special, not finishing those chances at an above-average rate.

Most xG models do not take traffic data into account either, i.e. how many defenders are between the player taking the shot and the goal. Some models, such as Stratabet’s, do contain this data, but most, including the most well-known xG model, created by Opta, do not. Through balls and counter-attacks are built into the model to loosely adjust for this, as they indicate how set the defence is. Sean Dyche’s and Lucien Favre’s teams often overperform xG defensively, as they employ tactics that get several men behind the ball, and their teams average some of the highest numbers of blocks in Europe’s top five leagues. Unfortunately the xG metric is not able to factor this data in yet, but steps are being taken to incorporate it in newer models.

How Expected Goals are being used

Football fans are already using the idea of xG to some extent when they assess whether their team was lucky or unlucky to win a game based on the quality of chances it had. In a way, the video game FIFA uses its own xG model, with players much more likely to score close to the goal than far away from it. Top-tier football clubs are using the xG metric to help their scouting and to make judgements about their own team. Some clubs that have openly advocated the use of xG include:

  • Arsenal
  • Liverpool
  • Borussia Dortmund
  • Paris Saint-Germain

The xG statistics help scouts to whittle down a large dataset of many exciting prospects into a shortlist of players who can then be scouted further in person. Betting companies use xG data to help set odds, and professional bettors use xG data to exploit more naive companies that do not use the metric.

Whether you like the idea of expected goals or not, the metric will inevitably be used more and more given how simple yet effective it is. Most importantly it continues to improve, factoring in more and more variables with newer and improved models without tarnishing the spirit of the beautiful game.

By Anmol Gupta
