Our team of Hall of Famers and guest writers are offering regular contributions throughout the 2023/24 Fantasy Premier League (FPL) campaign. Here, former champion Simon March discusses expected goals (xG).
Whether you’re listening to an FPL podcast, perusing a forum or scrolling through social media, xG (‘expected goals’) will invariably be mentioned at some point. As a general rule in modern-day FPL, whatever the discussion started out being about, it will somehow end up being about xG.
Given its ubiquity, it’s amazing how often we feel let down by xG. Post-Gameweek analysis frequently focuses on players who have under- or overperformed their xG; we remark on how surprising this is to us, and then we invoke xG again when planning for the next Gameweek. xG is inevitable.
This, of course, is not to say xG is useless, meaningless or that it should be discarded onto the FPL scrapheap with rotating keepers, but it is very possible that many of us are often not looking at xG in its proper context and, consequently, we are ignoring its limitations, sometimes to our detriment.
The focus of this article will be xG, how we can understand it better and, in particular, how two essential contextual factors, its composition and its timeframe, can help us get more out of it.
For the sake of simplicity, I’m focusing on xG but some of the principles discussed will also apply to other ‘X’ stats such as expected assists (xA) or expected goal involvement (xGI).
What is xG?
Opta explains xG as follows: “Expected goals (or xG) measures the quality of a chance by calculating the likelihood that it will be scored by using information on similar shots in the past.”
So, when we hear or read ‘xG’ we can just think of it as a stat that describes the quality of an opportunity to score, represented numerically on a scale of 0 to 1.
A relatively weak opportunity to score might be awarded an xG of 0.10 (10% likelihood of being converted) and a strong chance might receive a higher number, such as 0.80 (80% likelihood of being converted). The closer the number is to 1 (100%), the better the chance in question was determined to be.
xG rarely tells us the full story
The first problem when we look at xG is that we are almost always looking at it in aggregated form. For example, we might say ‘Player Y had an xG of 1.6 in his last match’ or ‘Player Y has an xG of 6.8 for the season so far’. What we don’t tend to look at are the individual xG stats per shot and, consequently, we often lose some of the context that might make xG more useful to us when assessing and trying to predict player performance.
For example, an xG score of 1.6 in a match could reflect two good-quality goalscoring chances of xG 0.8 (0.8 x 2 = 1.6) or eight relatively poor-quality chances of xG 0.2 (0.2 x 8 = 1.6). Add up either set and you get the same xG number, 1.6, but we’ve lost a lot of the story behind it.
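To make the difference between those two routes to 1.6 xG concrete, here is a minimal sketch. It assumes every shot is independent and is converted with a probability equal to its xG, which is a simplification for illustration only. The function name `goal_distribution` and the two shot profiles are hypothetical.

```python
def goal_distribution(shot_xgs):
    """Return a list where entry k is the probability of exactly k goals.

    Assumption: each shot is independent and scores with probability equal
    to its xG. This is an illustrative simplification.
    """
    probs = [1.0]  # before any shots: certain to have zero goals
    for xg in shot_xgs:
        updated = [0.0] * (len(probs) + 1)
        for goals, p in enumerate(probs):
            updated[goals] += p * (1 - xg)   # this shot is missed
            updated[goals + 1] += p * xg     # this shot is scored
        probs = updated
    return probs


# Hypothetical shot profiles, both adding up to 1.6 xG
profiles = {
    "two big chances": [0.8] * 2,
    "eight half chances": [0.2] * 8,
}

for label, shots in profiles.items():
    dist = goal_distribution(shots)
    print(
        f"{label}: total xG {sum(shots):.1f}, "
        f"P(blank) {dist[0]:.1%}, P(2+ goals) {1 - dist[0] - dist[1]:.1%}"
    )
```

Under those assumptions, the two big chances leave roughly a 4% chance of a blank, while the eight half chances leave roughly a 17% chance, even though the aggregated xG is identical.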
In general, one player might require high quality chances to score (e.g. tap-ins) while another player might be capable of scoring from weaker chances (e.g. tight angle or long-distance shots) but require a high volume of them in order to convert. These players could have exactly the same aggregated xG, and even the same number of goals over that period, but they both got there in very different ways.
If we know this, we might reasonably hypothesise that a player will continue to need, at minimum, equivalent or better chance conditions in order to at least sustain their performances going forward.
The clues as to what the individual players need are in the nature of their performances, but we lose this detail if we only look at their aggregated xG score. As a result, we often cannot reliably judge performance sustainability by looking at aggregated xG alone.
So how do we get around this? We could break down each individual shot xG and make our conclusions based on this, but this could get very labour intensive. Alternatively, we may find it valuable to ‘sense-check’ a player’s aggregated xG score with other relevant qualitative factors in order to assess how sustainable their performance might be.
These factors might include, among others, watching the player, examining the types and volume of chances they get, their shot accuracy, the level of opposition (past and future) and systemic factors such as their role in their team, who supplies them with their chances and the availability or status of those team-mates, how significant a role the player plays in their own chance creation and so on.
These factors can help us judge whether the player in question will continue to get the type or volume of chances they need and, consequently, how predictive their xG score might be of their future performances.
An aggregated xG stat can tell us whether a player is getting chances to score, but without additional context, it tells us little about the nature of those chances or their sustainability for that particular player.
“Time isn’t the main thing, it’s the only thing” – Miles Davis about xG (probably)
Time, or matches/minutes, is arguably the most vital context of all when assessing aggregated xG stats and player performance against xG.
To demonstrate this, if you learned that a player had two goals from 2 xG in two matches, you might reasonably assume that you were looking at a high-performing attacker, Erling Haaland (£14.4m) perhaps. If you then learned that the same player had two goals from 2 xG over 10 matches, you’d soon realise that you probably weren’t.
These xG stats are the same, the number of goals scored is the same, but the time period changes everything with respect to assessing how the player has performed.
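One simple way to build the time period into the numbers is to divide by matches played. The sketch below does exactly that for the two scenarios above. The helper name `per_match` is hypothetical, and a per-90-minutes version would work the same way if minutes were available.

```python
def per_match(total, matches):
    """Average of a cumulative stat per match played."""
    return total / matches


# (label, goals, xG, matches): the two scenarios described above
scenarios = [
    ("two matches", 2, 2.0, 2),
    ("ten matches", 2, 2.0, 10),
]

for label, goals, xg, matches in scenarios:
    print(
        f"{label}: {per_match(goals, matches):.2f} goals "
        f"and {per_match(xg, matches):.2f} xG per match"
    )
```

The same totals come out at 1.00 per match in the first case and 0.20 per match in the second.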
Many will be familiar with the ‘Law of Small Numbers’, a cognitive fallacy describing a tendency to incorrectly draw broad conclusions from a small sample size. If our sample size of matches/minutes is too small, xG as a stat will likely become far less representative of a player’s broader performance and, therefore, potentially very misleading.
When looking at xG over too short a period, say three matches, there is a real danger that a single match or event might skew the data and paint a misleading picture. We might, for example, think that a high xG is a good indicator of a high potential goalscorer, but we might also be looking at one exceptionally high xG match (probably against Sheffield United) which has skewed the overall sample.
The reverse can also be true: one or two low or very low xG matches in a small sample could suggest that a player doesn’t get good chances and we might rule this player out of our thinking as a result. Broaden the sample to, say, ten matches, however, and we could see a very different story altogether.
The more time in our sample size, the less influential these outliers are likely to be in our overall data set and the more reliable, and predictive, that data set should be as a result.
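As a rough illustration of how much one outlier can dominate a short sample, the sketch below compares a three-match window with a ten-match window. The per-match xG figures are invented for the example, and `window_summary` is a hypothetical helper rather than anything from a stats provider.

```python
from statistics import mean

# Hypothetical xG per match, oldest first; the 2.1 is a single outlier match
match_xg = [0.3, 0.2, 0.4, 0.1, 0.3, 0.2, 0.3, 0.2, 2.1, 0.3]


def window_summary(values, window):
    """Return (average xG per match, share of xG from the biggest match)
    over the most recent `window` matches."""
    recent = values[-window:]
    return mean(recent), max(recent) / sum(recent)


for window in (3, 10):
    avg, top_share = window_summary(match_xg, window)
    print(
        f"last {window} matches: {avg:.2f} xG per match, "
        f"{top_share:.0%} of it from one match"
    )
```

With these made-up numbers, the three-match window shows about 0.87 xG per match with roughly 81% of it coming from one game, while the ten-match window shows 0.44 xG per match with the same game accounting for roughly 48%.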
Time is essential context when assessing both xG and performance against xG; we need to know the timeframe, and we need that timeframe to be substantial enough before we can begin to trust what it is telling us.
How much time is enough? Truth is, in FPL, we rarely have data sets in volumes that would satisfy a scientist, but the season is only so long, so we must be practical.
Speaking for myself, I’d be most wary of samples of fewer than 8-10 matches and, even above that, I’d still want to consider some of the types of additional qualitative context discussed earlier.
As a rule, however, wherever you set the bar yourself, your confidence in an aggregated xG stat and performance against it should increase with the length of the time period over which you are examining it, and vice versa.
Summary
Like other stats, xG can be useful when trying to assess player performance but only provided it is viewed in its proper context. It is important to remember that when we see xG, we are usually viewing an aggregated statistic compiling cumulative chances. Unless we dig into the data, we often cannot make an assessment as to the individual quality or sustainability of the chances that comprise it.
Equally, when we see an aggregated xG stat, we often neglect to consider the significance of the time period over which it was gathered, yet this should probably be among our first concerns. This is because, when assessing xG and player performance, a short period is significantly less reliable and less predictive than a longer one.
Time permitting, it is significantly to your advantage to sense-check aggregated xG stats by digging into more granular stats like xG per shot or other individual stats such as volume of chances, shots in the box, shots on target and shot accuracy, as these may add some important context. Just make sure you’re always looking at enough data.
Equally, it’s often the non-statistical and more tangible or systemic factors that prove the most useful when it comes to making FPL decisions (especially early in the season when data is at its most scarce). These might include a player’s role in the team, whether he is likely to get into positions to score and whether he has the correct support to do so. Layered on top of these, xG can help inform decisions and, provided the right contextual factors are present, this is arguably one of the more effective ways to use it.
The gold standard when assessing player scoring performance is, of course, goals over time, but we should never underestimate the importance of time as vital context in whatever FPL stat we’re looking at.