Expected Goals in Football
Shots on goal and total shots have often been the yardstick by which punters judge the likelihood of one football team beating another. Yet there is a better statistical tool, known as ‘expected goals’.
If you want to try to predict the score in a football match, there are two things you will need: statistical information and a ‘model’. Shots-to-goal ratios, average goals per match and other historical data, together with an understanding of how to apply them, are the kind of information you will need to become more successful in your betting.
Goals are quite a rare occurrence in football compared with other sports. Over the last five seasons of matches played in the Premiership, the average has been 2.72 goals per match. Be aware, too, that around 50% of a match result can be down to dumb luck! This could be because of unlikely deflections, or officials’ decisions such as offsides and penalties wrongly given or not given.
It makes sense, therefore, to use a larger data sample. You could use shots on goal or total shots, but the problem with this data is that, although all goals are worth the same, the chance of a shot resulting in a goal can vary greatly. This is where expected goals (‘xG’) can be invaluable.
Expected goals is the number of goals a team (or both teams) would be expected to score in a match. It is calculated by assigning a value to each shot, taking into account factors such as shots on goal, total number of shots, the location of shots, the in-game position and the proximity of opposition defenders.
Over the last five seasons in Premier League matches, shots have been converted around 9.6% of the time, but if you place those shots into categories you can see just how much the conversion rate can vary.
By checking historical data, we can calculate the average likelihood of each type of shot being scored, using as many factors as we like. Some models include whether a goal is scored with a player’s feet or with a header, as well as the build-up or situation that led to the shot. Naturally, the latter requires advanced data and analysis skills, but simpler systems exist. Here is an example:
Penalties. In the five Premier League seasons from 2011/12 to 2015/16, a total of 443 penalties were awarded and 347 of those were scored. This means that 78.3% of penalties were converted, so we can give penalties an expected goal value of 0.783.
Big Chances. 2,579 of a total of 6,213 ‘big chances’ were converted (including penalties). With penalties removed, big chances were converted 38.7% of the time, which gives these types of shots an xG value of 0.387.
(Opta defines ‘big chances’ as “a situation where a player should reasonably be expected to score (usually in a one-on-one scenario or from very close range)”.)
Shots in Area. Leaving aside penalties and big chances, there were 22,822 ‘non-big chances’ within the box, of which 1,587 were converted (7.0%), giving an average expected goal value of 0.070.
Shots Outside Area. Lastly, there are long-range shots from outside the box. There were 22,318 of these during the five-year period, with 809 of them being converted (3.6%). These types of shots, therefore, have an expected goal value of 0.036.
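The arithmetic behind these four values can be reproduced in a few lines. The sketch below is only an illustration using the counts quoted above; the variable and function names are made up for this example, and the non-penalty big chance figure is obtained by subtracting the penalty counts from the big chance totals.

```python
# Counts quoted in the article for the 2011/12 to 2015/16 Premier League seasons,
# stored as (scored, attempts). Names are illustrative only.
PENALTIES = (347, 443)
BIG_CHANCES_INCL_PENALTIES = (2579, 6213)
SHOTS_IN_AREA = (1587, 22822)
SHOTS_OUTSIDE_AREA = (809, 22318)


def conversion_rate(scored, attempts):
    """Share of attempts that ended in a goal."""
    return scored / attempts


# Remove penalties from the big chance totals to get non-penalty big chances.
non_penalty_big_chances = (
    BIG_CHANCES_INCL_PENALTIES[0] - PENALTIES[0],
    BIG_CHANCES_INCL_PENALTIES[1] - PENALTIES[1],
)

# The xG value of a shot type is simply its historical conversion rate.
XG_VALUES = {
    "penalty": round(conversion_rate(*PENALTIES), 3),
    "big_chance": round(conversion_rate(*non_penalty_big_chances), 3),
    "shot_in_area": round(conversion_rate(*SHOTS_IN_AREA), 3),
    "shot_outside_area": round(conversion_rate(*SHOTS_OUTSIDE_AREA), 3),
}

for shot_type, value in XG_VALUES.items():
    print(f"{shot_type}: {value:.3f}")
```

Each value is just goals divided by attempts for that category, so the same approach would work for any finer set of categories you have data for.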
Expected goals per team
Shot data is widely available online, and with this information you can quickly find the xG for each team in a game. Take a recent sample of over 200 Premiership games, covering most of the 2017 season. Of the matches that were ‘won’ (own goals excluded), the team that had the most shots prevailed in 151 games (72%), whilst the team that had the higher xG won 170 games (81%). So the advantages of xG are clear.
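As a rough illustration of how a team's xG for a single game might be totalled, the sketch below multiplies the number of shots in each category by that category's xG value and adds the results. The shot counts are hypothetical, not taken from a real match, and the function and variable names are invented for this example.

```python
# Per-shot xG values from the four categories described in the article.
XG_VALUES = {
    "penalty": 0.783,
    "big_chance": 0.387,
    "shot_in_area": 0.070,
    "shot_outside_area": 0.036,
}


def team_xg(shot_counts):
    """Total xG for a team; shot_counts maps a shot category to a number of shots."""
    return sum(XG_VALUES[category] * count for category, count in shot_counts.items())


# Hypothetical shot counts for one match (made up for illustration).
home_shots = {"penalty": 1, "big_chance": 2, "shot_in_area": 6, "shot_outside_area": 5}
away_shots = {"penalty": 0, "big_chance": 1, "shot_in_area": 4, "shot_outside_area": 9}

home_xg = team_xg(home_shots)
away_xg = team_xg(away_shots)
print(f"Home xG: {home_xg:.2f}, Away xG: {away_xg:.2f}")
```

In this made-up game both teams had 14 shots, yet the home side's xG is more than double the away side's, which is exactly the difference in shot quality that a raw shot count hides.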
Using xG data and the Poisson Distribution, you can calculate your own odds for football matches, which you can then compare with the wider market to look for value. You can also predict the most likely score and the chances of a home win, a draw or an away win.
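One simple way to turn xG into odds is sketched below. It rests on assumptions the article does not spell out: each team's xG is used directly as the mean of a Poisson distribution, the two teams' goal counts are treated as independent, and scorelines above 10 goals per team are ignored. The function names and the example xG figures (1.6 and 1.1) are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(goals, mean):
    """Probability of scoring exactly `goals` when the expected number is `mean`."""
    return mean ** goals * exp(-mean) / factorial(goals)


def score_matrix(home_xg, away_xg, max_goals=10):
    """Probability of every scoreline up to max_goals each, assuming independence."""
    return {
        (h, a): poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


def match_probabilities(home_xg, away_xg, max_goals=10):
    """Return (home win, draw, away win) probabilities."""
    home_win = draw = away_win = 0.0
    for (h, a), p in score_matrix(home_xg, away_xg, max_goals).items():
        if h > a:
            home_win += p
        elif h == a:
            draw += p
        else:
            away_win += p
    return home_win, draw, away_win


def most_likely_score(home_xg, away_xg, max_goals=10):
    """Scoreline with the highest probability."""
    matrix = score_matrix(home_xg, away_xg, max_goals)
    return max(matrix, key=matrix.get)


# Hypothetical xG figures for the two teams.
home_win, draw, away_win = match_probabilities(1.6, 1.1)
for label, p in (("Home", home_win), ("Draw", draw), ("Away", away_win)):
    # Fair decimal odds are 1 / probability, with no bookmaker margin.
    print(f"{label}: {p:.1%} (fair odds {1 / p:.2f})")
print("Most likely score:", most_likely_score(1.6, 1.1))
```

If your fair odds for an outcome are shorter than the price on offer in the market, that is the kind of value the comparison is meant to uncover.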
The limitations of the xG model
There are, of course, limitations to any model or system. xG, for instance, cannot know about ‘subjective factors’ such as unrest in the squad, a new manager having just taken over, a top player being out injured, or a team having had a demanding recent schedule. Such information is, however, easily available across the web or comes out in news reports.
Also, an xG system like the one detailed above will not predict a ‘spike’, or unusually high-scoring game.
So, in conclusion, by making use of xG stats and the Poisson Distribution, you can assess the quality of a team’s attack and defence much more accurately, and make better score predictions for football matches.