Uncover the Regular Pattern of Momentum Behind Tennis Match

Yaping Yu; Yuhui Zhou; Zhipeng Hou

doi:10.54097/skkt5s45

Keywords:

Performance Score, Momentum, Softmax Regression Model, SHAP, PCA.

Abstract

In sports, a team or player may feel a surge of strength or force during a match, which is often attributed to "momentum". We defined a set of indicators and built several models to explore the role of momentum in a tennis match and the factors that influence it. We used data from Wimbledon 2023 men’s matches after the second round. After preprocessing the data, we defined six indicators, such as cumulative scoring advantage, success rate of breaking serve and ability to control the ball, which together make up a player's performance score in each round. The performance score was calculated from the normalized indicator values and the weights of the six indicators, which were determined through Principal Component Analysis (PCA). Next, we applied a tumbling window of 3 rounds to the performance scores to calculate each player's momentum in each period, and visualised the match flow for the match "2023-wimbledon-1701". We then took the performance score difference between the two players in each round as the independent variable and the point victor in each round as the dependent variable, and built a Logistic Regression Model to analyze the causal relationship between the two. We found that swings in play and players' runs of success are not random; they are affected by the players' performance scores during the match. Each unit increase in the performance score difference multiplies the odds of Player 1 winning by approximately 5.087. Next, we defined the performance score turning point and used it as the dependent variable. After normalizing and resampling the data, we built a Softmax Regression Model and used the SHAP method to explain it.
We found that some indicators can help determine when the flow of play is about to change from favoring one player to the other, such as cumulative rate of unforced errors, quality of serve and cumulative rate of scoring at the net. Of these, the ability to control the ball is the most relevant indicator. Based on these results, we made suggestions for how players can approach different opponents. We then tested the model on data from the match "2023-wimbledon-1311", and the predictions were favourable: the confusion matrix and four evaluation metrics (accuracy, precision, recall and F1-score) showed that the model performed generally well. We also proposed six factors that may need to be included in the model, such as direction of serve, depth of return and length of break before the round. We further fitted the model to data from the Sebastian Ofner vs. Stefanos Tsitsipas men's singles round-of-16 match at the 2023 French Open (Roland Garros) and found that the model also generalizes to this match. Finally, we evaluated and refined the model and reported the findings in a memo to tennis coaches.
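
Two steps in the abstract can be illustrated with a minimal Python sketch: the 3-round tumbling window and the logistic regression effect size. This is not the authors' implementation. It assumes that momentum in a window is the mean performance score (the abstract does not state the aggregation) and that the reported 5.087 is an odds ratio per unit of score difference, the usual reading of a logistic regression coefficient. The function names and the sample scores are hypothetical.

```python
def tumbling_momentum(scores, window=3):
    """Aggregate per-round performance scores over non-overlapping windows.

    Assumption: momentum in a window is the mean score; the abstract does
    not say which aggregation the authors used.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    chunks = [scores[i:i + window] for i in range(0, len(scores), window)]
    # The last chunk may hold fewer than `window` rounds; it is averaged as is.
    return [sum(chunk) / len(chunk) for chunk in chunks]


def shifted_probability(p, odds_ratio, delta=1.0):
    """Win probability after the score difference rises by `delta`.

    Assumption: `odds_ratio` is the multiplicative change in odds per unit
    increase in the score difference (5.087 is read this way here).
    """
    odds = p / (1.0 - p) * odds_ratio ** delta
    return odds / (1.0 + odds)


# Made-up performance scores for seven consecutive rounds.
scores = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9, 0.3]
momentum = tumbling_momentum(scores)      # one value per 3-round period
p_after = shifted_probability(0.5, 5.087)  # starting from an even contest
```

Under these assumptions, a one-unit increase in the score difference moves an even 0.5 win probability to roughly 0.84. The probability itself is not multiplied by 5.087; only the odds are.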
