My experience in sports analytics has come primarily in two sports: tennis and football. In football, Expected Points Added is one of the most popular metrics for evaluating the level of success or failure of a play. Essentially, it is a translation of the “Field Situation” – down, distance, spot, etc. – into points. In football, the team that scores the most points wins every time. Therefore, framing every situation in terms of points – the currency that defines wins and losses – makes sense. In tennis, on the other hand, “Court Situation” is much more difficult to nail down than field situation. Not every point in tennis is worth the same. In fact, the losing player occasionally wins the most points. There are two barriers to creating such a metric: defining the situation and determining a currency equivalent to points. Our new Clutch Factor Over Expected metric solves these challenges.
Points in football determine victory or defeat, but they do not control the duration of the game. That is why they are such an effective response variable, allowing supervised machine learning algorithms to be used to create Expected Points. Tennis has three structures that are candidates to be a currency: points, games, and sets. Points do not truly determine victory or defeat. Games have the same issue, with the player who wins the most games occasionally losing the match. Tiebreakers are an additional complication to using games as a currency since they count as “games” in the score. Sets do determine victory or defeat, but they also control the duration of the match. The tennis scoring system provides no single good candidate for a currency, so to create our metric, we combined the three. To do that, we examined how the subdivisions interact with each other, which heavily informed how we defined the situation and led to the creation of Clutch Factor as a new metric and reliable currency.
The first level is from a point to a game. Let’s take a look at a point from the sixth game of the second set of Noma Noha Akugue’s victory over Emma Lene in the first round of qualifying at Roland Garros 2023. If you are in the USA and have a TC+ account, you can watch it here. Noha Akugue is serving at 40-30, trailing 2-3 in the second set after winning the first (1-0 2-3 40-30). At this stage, we only want to know whether the next game score will be 3-3 or 4-2. If we are only modeling the outcome of the game, then points do determine victory or defeat. They also control the duration of the game, but that is true of each layer and is the reason why we are combining them. At the heart of this analysis is an algorithm we developed to determine the amount of information that each point provides about the outcome of the game.
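The algorithm itself is not spelled out in this post, so the sketch below is not it. As a stand-in, it uses a common textbook definition of point importance: the gap between the server's chance of winning the game if she wins the point and if she loses it. It assumes every point is won independently with a fixed probability `p`, which is a simplification. The names `game_win_prob` and `point_importance` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def game_win_prob(p: float, a: int = 0, b: int = 0) -> float:
    """Chance the server wins the game with a points won by the server and b by the returner.

    Assumption: each point is won by the server independently with probability p.
    """
    q = 1.0 - p
    if a >= 4 and a - b >= 2:
        return 1.0  # server has already won the game
    if b >= 4 and b - a >= 2:
        return 0.0  # returner has already won the game
    if a >= 3 and b >= 3:
        # Deuce and advantage scores repeat forever, so use the closed form.
        deuce = p * p / (p * p + q * q)
        if a == b:
            return deuce
        if a > b:
            return p + q * deuce  # advantage server
        return p * deuce  # advantage returner
    return p * game_win_prob(p, a + 1, b) + q * game_win_prob(p, a, b + 1)


def point_importance(p: float, a: int, b: int) -> float:
    """Stand-in point-to-game value on a 0 to 1 scale (not the article's algorithm)."""
    return game_win_prob(p, a + 1, b) - game_win_prob(p, a, b + 1)


# 40-30 on serve corresponds to a=3, b=2; p=0.6 is an illustrative serve win rate.
value_at_40_30 = point_importance(0.6, 3, 2)
```

With these assumptions, the 40-30 point from the example comes out at roughly 0.31: winning it ends the game, while losing it drops the server back to deuce.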
The second level is game to set. Now, we want to know whether the next set score would be 2-0 or 1-1. There are two different parts to the value of the game to the set. For the first 12 games of a normal set, the value of the game to the set can be determined using the same algorithm as the value of the point to the game. Because that algorithm generates values on a scale from 0 to 1, values can be multiplied all the way down the line to get the value of the point to the set. The 13th game, the tiebreaker, is designed to entirely determine the winner of the set. Therefore, its value is 1. But since the scoring format of the tiebreaker is entirely different, we do not yet have any point values down the line to multiply. So, we apply that algorithm yet again to each possible score in a tiebreaker and generate their point values.

This is where we run into our first issue: the tiebreaker format. The tiebreaker was first introduced at the Grand Slam level at the 1970 US Open. For the first few years, the US Open played a tiebreaker to 5 points, sometimes known as a 9-point tiebreaker. In 1975, the format changed to what we now see: the tiebreaker goes to the first player to reach 7 points with a margin of 2, otherwise known as a 12-point tiebreaker. Because the length of the tiebreaker matters for the point values, and because 9-point tiebreakers are played very differently, any match with a 9-point tiebreaker was excluded from our dataset.

The other Grand Slams introduced tiebreakers at different times, so we made sure to specify whether a tiebreaker was possible in each set. This also changed the values of the games: some games, especially deep into sets, are played differently if a tiebreaker is coming, so the values of the games were calculated from both the score and the tiebreaker possibility. Before 2019, the Grand Slams did not all use the same method to determine the match winner when the final set reached 6-6.
Then, the Grand Slams began to introduce a final set tiebreaker to 10, better known as a “super,” or “match,” or 18-point tiebreaker. They were adopted by all 4 Grand Slams in 2022. But because they have only been around for a few years, are played infrequently, and have so many possible point scores, there have not been enough of them to definitively say anything about the values of those points. Therefore, the super tiebreakers themselves have also been excluded from our analysis for now.
Now that we have navigated the complex web of options for getting from the point to the set, the third level is the value of the set to the match. To calculate this, we use the same algorithm, but with two complications. First, which player is leading matters. Second, the set situations differ based on the number of sets that can be played: men’s Grand Slam main draw matches, plus the third round of men’s qualifying at Wimbledon, are played best 3 out of 5 sets, while all other matches are best 2 out of 3. So, instead of using the exact set scores, we named the scenarios and charted out the potential paths through the match from there. Now that we have the value of every potential currency to the next one up, we can calculate the value of each point to the match.
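As a minimal sketch of how the three levels chain together, the snippet below multiplies a point-to-game value, a game-to-set value, and a set-to-match value, each on the 0 to 1 scale described above. The function name `point_value_to_match` and the numbers in the usage lines are hypothetical, chosen only for illustration; they are not values from our dataset.

```python
from math import prod


def point_value_to_match(point_to_game: float,
                         game_to_set: float,
                         set_to_match: float) -> float:
    """Multiply the three level values to get the value of a point to the match.

    Each input is assumed to be on a 0 to 1 scale. For a point inside a
    tiebreaker, game_to_set is 1.0 because the tiebreaker decides the set.
    """
    values = (point_to_game, game_to_set, set_to_match)
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"level value {v} is outside the 0 to 1 scale")
    return prod(values)


# Illustrative numbers only (not taken from the real data).
regular_point = point_value_to_match(0.5, 0.5, 0.5)
tiebreak_point = point_value_to_match(0.5, 1.0, 0.5)
```

Because every factor is at most 1, a point can never be worth more to the match than the set it sits in, which matches the intuition behind multiplying "all the way down the line".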
This leaves us two additional tennis-specific factors to define the “Court Situation” and calculate Clutch Factor Over Expected. First, men’s and women’s tennis are different. Therefore, we separated out the calculations by draw. Second, surface distinctions have a more significant impact on tennis than on football. In football, teams play on sod or turf fields. While the two are slightly different, the difference has not been large enough for surface to be included in common public Expected Points models to date. In tennis, however, the combination of surfaces, balls, and conditions forces us to separate out each event. We combined data scraped from multiple websites of nearly every match from 50 Grand Slams since 2012 with Jeff Sackmann’s Match Charting Project to create a dataset large enough for this to work, leaving each draw of each tournament with hundreds of thousands of points.
Now that we have all our values attached to scores, we can attach them to real points. In our Clutch Factor statistics, we are using the value of the point to the match. Each point has a certain score attached to it. Whichever player wins the point gets that score and their opponent gets 0. Taking each player’s average score over a set, match, or tournament, we generate their Clutch Factor over that period. The final step is to establish the expectation. In tennis, leads and deficits are referenced in terms of breaks, rather than games. The expectation that players will win points at different frequencies as the server versus as the returner informs the calculation of the Expected Clutch Factor. We calculated the point win percentage of servers and of returners (one minus the servers’ percentage) over each subdivision and multiplied it by the value of each point to create our expectation. Then, by comparing each player’s score to their expected score, we generate Clutch Factor Over Expected.
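The scoring steps above can be sketched as follows. The `Point` record, the function names, and the sample numbers are hypothetical. The sketch also makes two assumptions that the text leaves open: a single server win percentage is used for the whole period, and "comparing" the actual score to the expected score means subtracting one from the other.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Point:
    value: float  # value of this point to the match, on a 0 to 1 scale
    server: str   # name of the player serving
    winner: str   # name of the player who won the point


def clutch_factor(points: Sequence[Point], player: str) -> float:
    """Average score: the point's value when the player wins it, 0 otherwise."""
    if not points:
        raise ValueError("need at least one point")
    return sum(pt.value for pt in points if pt.winner == player) / len(points)


def expected_clutch_factor(points: Sequence[Point], player: str,
                           server_win_pct: float) -> float:
    """Average of value times the chance of winning each point, by serve or return."""
    if not points:
        raise ValueError("need at least one point")
    total = 0.0
    for pt in points:
        win_chance = server_win_pct if pt.server == player else 1.0 - server_win_pct
        total += pt.value * win_chance
    return total / len(points)


def clutch_factor_over_expected(points: Sequence[Point], player: str,
                                server_win_pct: float) -> float:
    """Assumed comparison: actual Clutch Factor minus Expected Clutch Factor."""
    return (clutch_factor(points, player)
            - expected_clutch_factor(points, player, server_win_pct))


# Illustrative points only (not real match data).
sample = [
    Point(0.2, server="A", winner="A"),
    Point(0.4, server="A", winner="B"),
    Point(0.1, server="B", winner="A"),
    Point(0.3, server="B", winner="B"),
]
cfoe_a = clutch_factor_over_expected(sample, "A", server_win_pct=0.6)
```

In this made-up sample, player A lost the most valuable point on serve, so A's Clutch Factor Over Expected comes out negative.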
Rankings
Over two years of development, we pulled and scraped data and developed and modified an algorithm to find the influence that each individual point has on its segment of the match. In tennis, Clutch Factor is as close as we could come to establishing a currency that would mirror football’s Expected Points by applying the match situation to a currency that determines the outcome of the match without directly defining its length. We then used it to evaluate players over a match and tournament. To evaluate a player’s performance in Grand Slams over a full season, we used an average of their Clutch Factor Over Expected rankings in the four Grand Slam tournaments. But that average alone could not produce an effective season-long ranking, because as you win more matches in a tournament structure like the one in tennis, opponents tend to get better and more difficult to beat. So, to create a season-long ranking, we multiplied our average by a number generated from the length of the player’s median Grand Slam run. We separated main draw and qualifying matches because of the difference in their level of difficulty. The Top 10 main draw rankings for 2023 are below.
![](https://static.wixstatic.com/media/480eaf_60b30abca6ed4e2cb4e3f8023a6dafa5~mv2.png/v1/fill/w_373,h_422,al_c,q_85,enc_auto/480eaf_60b30abca6ed4e2cb4e3f8023a6dafa5~mv2.png)
![](https://static.wixstatic.com/media/480eaf_effc667f60ef40bebd120e612fca89dd~mv2.png/v1/fill/w_391,h_422,al_c,q_85,enc_auto/480eaf_effc667f60ef40bebd120e612fca89dd~mv2.png)