This is a review of the win probability research conducted by Sujoy Ganguly and Nathan Frank.
In the NBA, the win probability statistic uses game time, possession, and point differential to predict which team will win the game. There are two main issues with this statistic. First, it leaves out information such as injuries and fouls, which can have a major impact on the outcome of a game. Second, winning or losing is never certain: no team is guaranteed to win or lose, and the statistic does not take that uncertainty into account. There are two ways to deal with these issues. The first is to include each team's lineup of players in the model, and the second is to model the score difference distribution, in other words the range of possible differences in the points scored by the two teams and how likely each one is.
To build this model, data from the 2002-2017 seasons was used, covering more than 8.7 million play-by-play events. Each event included game time, ball possession, and score difference. To improve the model, information about the teams playing was added to the data set: lineup information covering player identity, number of games played, plus/minus per game, and minutes played. Put together, each play-by-play event is described by 352 features. Then, rather than predicting who will win the game, the model predicts the score difference, expressed as the home team's score relative to the away team's score.
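To see why predicting a score difference is more flexible than predicting a winner, it helps to note that a win probability can always be recovered from a score difference distribution, but not the other way around. The snippet below is only a minimal sketch of that idea, not the researchers' implementation: the function names and the example distribution are hypothetical, and it assumes the distribution is given as a simple table of final margins (home score minus away score) and their probabilities.

```python
def home_win_probability(margin_dist):
    """Return the home team's win probability.

    margin_dist is a hypothetical dict mapping a final margin
    (home score minus away score) to its predicted probability.
    """
    total = sum(margin_dist.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # NBA games cannot end in a tie, so any probability placed on a
    # margin of 0 is dropped and the rest is renormalized.
    decided = {m: p for m, p in margin_dist.items() if m != 0}
    mass = sum(decided.values())
    if mass == 0:
        raise ValueError("distribution has no non-zero margins")
    return sum(p for m, p in decided.items() if m > 0) / mass


def expected_margin(margin_dist):
    """Return the probability-weighted average final margin."""
    return sum(m * p for m, p in margin_dist.items())


# Illustrative numbers only; they do not come from the research.
example_dist = {-8: 0.10, -3: 0.20, 2: 0.30, 5: 0.25, 12: 0.15}
print(home_win_probability(example_dist))
print(expected_margin(example_dist))
```

The same table answers questions a single win probability cannot, such as how likely the home team is to win by ten or more points.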
When tested, this model was 88% accurate, while the previous model was only 75% accurate, an improvement of 13 percentage points.
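The size of that improvement can be expressed in more than one way, and the choice changes the headline number. As a rough illustration, the sketch below works through the arithmetic for the two reported accuracies. The function name is hypothetical, and it assumes that accuracy means the fraction of games predicted correctly, which the review does not spell out.

```python
def accuracy_gain(old_accuracy, new_accuracy):
    """Compare two accuracies given as fractions between 0 and 1.

    Returns a tuple of:
      - the gain in percentage points,
      - the relative gain over the old accuracy, in percent,
      - the reduction in error rate, in percent.
    """
    points = (new_accuracy - old_accuracy) * 100
    relative = (new_accuracy - old_accuracy) / old_accuracy * 100
    old_error = 1 - old_accuracy
    new_error = 1 - new_accuracy
    error_reduction = (old_error - new_error) / old_error * 100
    return points, relative, error_reduction


points, relative, error_reduction = accuracy_gain(0.75, 0.88)
print(round(points, 1), round(relative, 1), round(error_reduction, 1))
```

Under that assumption, going from 75% to 88% is a gain of 13 percentage points, a relative gain of roughly 17%, and roughly half as many wrong predictions as before.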
With this improved model, analysts will be better able to predict the outcome of upcoming games. The predictions will change as team lineups change, making the analysis more fluid and dynamic and responsive to the ever-changing compositions of the teams. Predicting a final score difference rather than simply who will win gives analysts the flexibility to explore alternative scenarios for that differential, such as what happens if a certain player is injured or teams make a major trade.
Because this model now takes into account the lineup of each team, teams will be able to look at the expected outcome of their games and test how it changes with different players in the lineup. This will allow them to be proactive rather than reactive in working to improve their team’s performance.