We are delighted to report that Harry Spearing, our PhD student collaborator at Lancaster University, has recently had a paper on Ranking and Prediction in Elite Swimming published in the Journal of the Royal Statistical Society. In October, Harry also delivered a remote presentation on his work at Harvard University, in a session jointly organised by Harvard’s Sports Analytics Lab and the American Statistical Association’s Section on Statistics in Sports.
The paragraph below gives a brief summary of the project, and the full journal paper can be read online at https://rss.onlinelibrary.wiley.com/doi/full/10.1111/rssa.12628 (or https://eprints.lancs.ac.uk/id/eprint/144385/).
A recording of Harry’s recent Harvard presentation (50 minutes long) can also be viewed on YouTube at https://www.youtube.com/watch?v=w7hFyTqliAU
“Ranking and Prediction in Elite Swimming using Extreme Value Theory”
Harry Spearing & Jonathan Tawn [Lancaster University] / David Irons, Tim Paulden & Grace Stirling [ATASS Sports]
The International Swimming Federation’s simple points system aims to rank swimmers across all swimming events. The points awarded for a particular swim are a function of the recorded time and the current world record for that event, which introduces bias between events because world records differ in “quality”. A model based on extreme value theory is introduced in which swim times are modelled through their rate of occurrence, with the best times treated as “extreme” events. Instead of modelling each swim event separately, a single model pools information across all 34 swim events, allowing for unbiased comparison between events, including para-swimming. From this model, it is possible to estimate other features of interest, such as the ultimate possible time and the distribution of new world records, and to correct swim times for the effect of full body suits. If multiple observations from each swimmer are available, the swimmers’ times are treated as longitudinal data, capturing each swimmer’s natural progression over their career and the dependence between swims. The model provides a novel, unified description of swim quality over all events and time and, more generally, an innovative approach to the longitudinal analysis of extreme data that is relevant to a wide range of applications.
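To make the world-record bias concrete, here is a minimal Python sketch of a points rule of the kind described in the summary. It assumes the commonly quoted cubic form, points = 1000 × (base time / swim time)³, which the summary does not state. The function name `fina_points` and all times are hypothetical and chosen purely for illustration. It is not the model from the paper.

```python
def fina_points(swim_time_s: float, base_time_s: float) -> float:
    """Points under the ASSUMED cubic rule: 1000 * (base / time) ** 3.

    base_time_s stands in for the current world record of the event.
    """
    if swim_time_s <= 0 or base_time_s <= 0:
        raise ValueError("times must be positive")
    return 1000.0 * (base_time_s / swim_time_s) ** 3


# Hypothetical numbers: the same 52.0 s swim scored against two
# possible world records for the same event.
swim = 52.0
strong_record = 50.0  # an exceptionally fast ("high quality") record
soft_record = 51.0    # a slower ("lower quality") record

points_strong = fina_points(swim, strong_record)
points_soft = fina_points(swim, soft_record)

# The identical swim earns fewer points when the record happens to be strong.
print(f"vs strong record: {points_strong:.1f} points")
print(f"vs soft record:   {points_soft:.1f} points")
```

Because the score depends only on the ratio to a single record, an event whose record is unusually fast penalises every swimmer in it. This is the between-event bias that the pooled extreme value model is designed to avoid.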