Introduction to xFpts: Forwards
As a casual draft fantasy premier league player (using togga rules on fantrax.com) who dabbles in statistics, I’m always trying to find a better way to evaluate players. I scroll through the waiver wire each week looking for the next gem, but am often left feeling like the information provided on Fantrax’s player tab doesn’t tell the full story. As a result, I set out to create a new metric of player fantasy performance.
Limits of points per 90 minutes and points per game
The default measure of player value on Fantrax, points per game (ppg), is unfortunately not a reliable gauge of a player's worth. Unless a player starts every game (or substitutes into every game) and ultimately plays approximately the same number of minutes every outing, this metric is easily skewed by playing-time fluctuations. All it takes is a couple of fluky goals or a few short-lived substitute appearances to throw off a player's ppg value for the season.
Savvier fantasy managers may turn to points per 90 minutes (pp90) to solve the ppg conundrum. This statistic smooths out some of the inconsistencies of ppg and is often a better metric for assessing player value. Yet pp90 has its own quirks, most notably involving substitute appearances. Take, for example, everyone's favorite hypothetical super-sub, Player 1. Player 1 plays 15 minutes in each of his first two games off the bench and nets a goal and an assist! With a whopping average of 60 pp90 (20 fantasy points across 30 minutes), Player 1 earns himself a starting spot. Congratulations, Player 1! But don't get too excited, fantasy owners; unfortunately, Player 1's performances take a turn for the worse from here. His speed isn't quite as effective against fresh defenses, and when the opponent isn't chasing a game there just aren't the same open spaces. After 5 scoreless 90-minute matches, averaging just 5 points each, Player 1 is returned to the bench, banished to play out the rest of his days as a super-sub.
Tallying all of his games, our intrepid Player 1 has a respectable 8.4 pp90. Sure, he's not a superstar, but he appears to be a consistent starter for your team. However, we, as smart and data-savvy fantasy owners, realize this is simply an illusion, as most of his output reflects his highly effective sub minutes. It turns out this is a real phenomenon, not just something I've made up to sell my model. Research shows substitute minutes are inherently different from starting minutes, and thus shouldn't be included without an abundance of caution in any analysis determining the worth of a starter. Furthermore, this bizarro pandemic season notwithstanding, preliminary numbers suggest that it is practically never advisable to knowingly start a substitute in your Fantrax lineup (stay tuned for a forthcoming article!). Time = opportunities for points.
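The Player 1 arithmetic can be sketched in a few lines of Python. The appearance data is invented to match the hypothetical above, and the 20 cameo points are split evenly across the two sub appearances for simplicity:

```python
# Player 1's appearances as (minutes, fantasy_points) pairs:
# two productive sub cameos, then five scoreless 90-minute starts.
appearances = [
    (15, 10), (15, 10),                            # 20 points across 30 sub minutes
    (90, 5), (90, 5), (90, 5), (90, 5), (90, 5),   # 5 points per scoreless start
]

def pp90(apps):
    """Fantasy points per 90 minutes across a list of appearances."""
    total_minutes = sum(m for m, _ in apps)
    total_points = sum(p for _, p in apps)
    return 90 * total_points / total_minutes

print(round(pp90(appearances[:2]), 1))  # 60.0 as a super-sub
print(round(pp90(appearances), 1))      # 8.4 overall, inflated by the cameos
```

Because pp90 pools all minutes together, the 30 hyper-productive sub minutes drag the season-long figure well above what Player 1 produces as a starter.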
So, how do we evaluate scoreless streaks like this? Considering how short a season is and how infrequently scoring events occur, a sample size of 5 games is inadequate to accurately determine a player’s value entirely off of goal or assist tallies. There is too much randomness in a given football season, let alone a few matches, to judge performances with just these statistics and we don’t want a few lucky or unlucky performances to unduly influence our final valuation of a player for our team.
This is a problem that doesn't just plague fantasy owners, but also real clubs looking to exploit undervalued players. To solve our fantasy manager problem, I will borrow what they used to solve theirs. While I probably don't need to explain expected goals (xG) and expected assists (xA), for those in the back: every shot or potential assist is assigned a value from 0 to 1, where 0 indicates the attempt will result in a goal/assist 0% of the time and 1 indicates it will always produce the expected outcome. You can think of it as the probability of success for that attempt. By adding up a player's xG and xA for a game, you get a reasonable estimate of how many goals or assists a player should achieve, on average, given those opportunities.
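As a quick illustration of that summation (with entirely invented per-attempt values), game totals are just sums of the per-attempt probabilities:

```python
# Hypothetical per-attempt values for one player's game.
shot_xg = [0.08, 0.35, 0.12]   # probability each shot becomes a goal
pass_xa = [0.05, 0.22]         # probability each key pass becomes an assist

game_xg = sum(shot_xg)   # expected goals from these chances
game_xa = sum(pass_xa)   # expected assists from these passes
print(round(game_xg, 2), round(game_xa, 2))  # 0.55 0.27
```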
So, let's look at a concrete example. Imagine hypothetical Player 2 has himself a brilliant game, scoring 3 goals. I present two scenarios that conceivably could have played out. Scenario A: each goal Player 2 scored was close to a tap-in, with an xG of 0.8 per goal, giving him a grand total of 2.4 xG for these three attempts on goal. While it was unlikely he scored 3 goals in this game, at least 2 goals were to be expected. Scenario B: each goal Player 2 scored was from a tough angle or from distance, with an xG of 0.1 for each attempt, meaning he accumulated 0.3 total xG from these three attempts on goal. It was thus highly improbable for him to score even a single goal, let alone 3, a feat even Lionel Messi would have struggled to achieve. However, as a result of this one performance, his ppg and even pp90 will overestimate his likely points return for the rest of the season.
While both scenarios result in the same number of fantasy points for the player, I contend scenario A is a lot more indicative of this player’s value moving forward. We want players who consistently perform well, regardless of how lucky they got. The degree of luck is thus crucial information that just isn’t captured with normal point-tallying methods such as ppg or pp90. This is especially relevant in fantasy soccer because the statistics that are arguably the most significantly impacted by luck—goals and assists—are assigned the most points.
A new tool for evaluating players: xFpts
Given the limitations of ppg and pp90 described above, I propose a new metric, termed xFpts, that uses xG and xA totals to estimate the number of fantasy points a player would be expected to score, rather than what they actually score. The process is simple: instead of awarding points for goals, assists, goals against, and clean sheets, I give a prorated total based on the expected number (Fig 1).
In our previous example, Player 2 would score 21.6 xFpts (from 2.4 xG) in scenario A, whereas in scenario B, he only manages 2.7 xFpts (from 0.3 xG). This is a far more accurate representation of this player’s potential fantasy points in the game in question and serves you much better in evaluating their skill moving forward. Note that for the final xFpt total, all non-goal-or-assist-related points (i.e., ghost points) are added as normal. For defenders, in addition to converting goals to xG and assists to xA, I also use the total xG against the defender’s team to scale clean sheets (worth 6 points) and goals against (worth -2 points) points.
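A minimal sketch of the conversion for a forward. The 9-points-per-goal value is implied by the Player 2 example above (2.4 xG yields 21.6 xFpts); the 6-point assist value is an assumed Togga-style number used only for illustration:

```python
GOAL_PTS = 9     # implied by the Player 2 example (2.4 xG -> 21.6 xFpts)
ASSIST_PTS = 6   # assumed Togga-style assist value, for illustration only

def xfpts(xg, xa, ghost_pts=0.0):
    """Prorate goal/assist points by xG/xA, then add ghost points as scored."""
    return GOAL_PTS * xg + ASSIST_PTS * xa + ghost_pts

# Player 2's two scenarios, ignoring ghost points for clarity:
print(round(xfpts(2.4, 0.0), 1))  # 21.6 -- scenario A
print(round(xfpts(0.3, 0.0), 1))  # 2.7  -- scenario B
```

Identical actual point totals, very different expected totals: exactly the luck signal that ppg and pp90 discard.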
As discussed earlier, to avoid the bias of substitutes versus starters, I only consider player statistics from weeks in which they start the game and play a minimum of 60 minutes. I use this minute threshold because (1) defenders are not eligible for clean sheet points unless they play 60 minutes and (2) it helps control for events unrelated to the player's inherent abilities, such as injuries or other uncontrollable incidents.
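The sample filter itself is simple; a sketch with hypothetical field names and records:

```python
# Keep only appearances where the player started and played >= 60 minutes.
# Record structure and names are hypothetical, not the article's actual data schema.
MIN_MINUTES = 60

appearances = [
    {"player": "A", "started": True,  "minutes": 90},
    {"player": "A", "started": False, "minutes": 25},  # sub cameo: excluded
    {"player": "B", "started": True,  "minutes": 55},  # early exit: excluded
]

qualifying = [a for a in appearances
              if a["started"] and a["minutes"] >= MIN_MINUTES]
print(len(qualifying))  # 1
```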
Using this system, I first wanted to assess how my xFpts estimates compared to the actual observed points players scored as starters. If a player averages more xFpts than the actual fantasy points they've scored, then this is a sign they've been under-performing. If the opposite is true, and they average more actual fantasy points than xFpts, then this would indicate they've been over-performing. Given enough games, we would expect each player to average the same number of actual fantasy points as xFpts as luck evens out. To visualize how Premier League players have been performing so far this season, I plotted the average fantasy point and xFpt output of each forward with at least 5 starts (Fig 2).
The black line indicates where the xFpt average equals the actual fantasy points average. In order to help identify outliers, I’ve added a dotted red line that represents the threshold where a player is outperforming their xFpt average by more than 1.5 points. The dotted blue line represents under-performance by at least 1.5 xFpts per week. Therefore, players above the red line are candidates for lowered expectations moving forward, while those players beneath the blue line should start seeing an uptick in performance.
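The classification behind Fig 2 boils down to a simple threshold rule; a sketch with invented averages:

```python
# Flag players by the gap between actual points and xFpts per start.
# The 1.5-point threshold matches the dotted lines in Fig 2; averages are invented.
THRESHOLD = 1.5

def performance_flag(mean_actual, mean_xfpts):
    diff = mean_actual - mean_xfpts
    if diff > THRESHOLD:
        return "over-performing"    # above the red line: expect regression
    if diff < -THRESHOLD:
        return "under-performing"   # below the blue line: expect an uptick
    return "on expectation"

print(performance_flag(12.0, 9.8))  # over-performing
print(performance_flag(7.0, 9.1))   # under-performing
```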
It's important to note here that although it appears some superstars, such as Sadio Mane or Sergio Aguero, are over-performing, it is just as likely that they are in fact much better than xG and xA models give them credit for. Expectation models are not calibrated to what a superstar is expected to score, but rather to what your average top-flight player would be expected to achieve. Therefore, on the fringes, where we don't really need an xFpts model anyway, my model probably loses some reliability.
Comparing players with xFpt models
As discussed in this excellent post, a player's standard deviation gives you a good sense of the consistency of his performances. Not only is it important to consider how many points a player scores on average each week, but also how reliable those points are. Players with low standard deviations dependably produce roughly the same number of points each week. The beauty of using xFpts instead of actual points is that by reducing some of the randomness surrounding goals and assists, you also tend to reduce the variance in a player's expected point totals, giving you more precise estimates of their standard deviations. This in turn gives a better sense of a player's true worth, and how he might perform in the future.
Often, as a fantasy manager I’m interested in the minimum and maximum score I can reasonably expect from a given player. In order to make player comparisons and visualize these ranges, I’ve created charts where I plot each player’s mean xFpt score +/- 1 standard deviation for each position group. Embedded here is the resulting plot for forwards (Fig 3), but the equivalents for defenders and midfielders will be posted in the next few days. Adding one standard deviation in either direction makes some assumptions about the distribution of the data that I won’t get into here, but it does give a high-level view of all players in a given position’s upper and lower expectations that can be instructive for making managerial decisions.
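The plotted ranges are just mean xFpts plus or minus one standard deviation. As a sketch, here is the band computation using the reported Benteke and Calvert-Lewin means, with SDs back-derived from their reported ceilings:

```python
def xfpts_band(mean_xfpts, sd):
    """(floor, ceiling) as mean xFpts +/- one standard deviation."""
    return mean_xfpts - sd, mean_xfpts + sd

# SDs below are back-derived from the article's means and ceilings, not raw data.
benteke = xfpts_band(14.1, 3.2)        # ceiling 17.3: high floor, very steady
calvert_lewin = xfpts_band(10.7, 8.2)  # ceiling 18.9: lower mean, higher ceiling
print(round(benteke[1], 1), round(calvert_lewin[1], 1))  # 17.3 18.9
```

The same comparison drives the lineup logic: protect a lead with the high-floor player, chase a game with the high-ceiling one.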
As you can see, a player like Christian Benteke is a very consistent performer (14.1 xFpts/game) and would prove very useful if protecting a lead due to his high floor. However, if you’re chasing a game, a player with a considerably lower average number of points might actually be preferred, such as Dominic Calvert-Lewin (10.7 xFpts/game), because his ceiling is higher (18.9 xFpts vs 17.3). Dominic Calvert-Lewin has simply shown a greater capacity for high xFpt totals, which should translate to a higher probability of scoring more actual fantasy points.
Finally, I’ve embedded a table of the top 25 forwards ranked by mean xFpts. A few interesting observations:
Gabriel Jesus is insanely good. He’s performing almost exactly as expected, which makes it all the more impressive. He puts himself into better positions more consistently than any other player in fantasy.
Harry Kane, Pierre-Emerick Aubameyang, Wilfried Zaha, where you at?? Consensus top picks who are not even sniffing the top 25, let alone the top 10.
Pedro is putting in impressive work during his limited appearances, with a top-ten ranking. Another must-start when in Chelsea’s lineup.
You can find our defender data here, containing a fascinating comparison of the opposite fortunes of Marcos Alonso and Emerson. Our midfielder post is still under construction! Please don’t hesitate to get in touch with any questions or comments!
Below we've linked to the full forward xFpt table. All code and data can be found on our GitHub. xG and xA data were pulled from www.fbref.com and Fantrax statistics from fantrax.com.
NOTE: Data is only included through week 29 of the 2019/2020 season. For up-to-date ranks, please refer to our xFpts Ranking page.
DISCLAIMER: We will not be held liable if you use this information to make bad fantasy decisions.
Special thanks to Andrew Gear, Jill Winter, Ben Kuder, Matt Gear, and Stephan Adams for helpful comments.