This is a review of the baseball Pythagorean Theorem research conducted by Jay Heumann.
In baseball, the general belief is that a team’s ratio of runs scored to runs allowed is a better predictor of its future performance than its win-loss record. This belief rests on the baseball Pythagorean Theorem, which uses the number of runs a team has scored and the number of runs it has allowed to estimate the probability that the team will win a future game.
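For readers who want to see the arithmetic, here is a minimal sketch of the traditional formula in its usual form: runs scored raised to an exponent, divided by the sum of runs scored and runs allowed each raised to that same exponent. The function name and the sample team (800 runs scored, 700 allowed, 162 games) are made up for illustration and do not come from the research.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    Assumes at least one of the two run totals is positive; with both
    at zero the ratio is undefined and this raises ZeroDivisionError.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team, not real data: 800 runs scored, 700 allowed.
win_pct = pythagorean_win_pct(800, 700)
expected_wins = 162 * win_pct  # scale by the length of the season
print(f"{win_pct:.3f} -> {expected_wins:.1f} expected wins")
```

With these invented totals the formula gives a winning percentage of about .566, or roughly 92 wins over 162 games.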
However, many feel that the formula can be improved to give a more accurate probability of a team winning. One suggestion is changing the exponent from 2 to 1.83. Heumann’s research demonstrates a different way to improve the Pythagorean wins statistic, called the pairwise Pythagorean formula. In this formula, Pythagorean wins are calculated opponent by opponent rather than cumulatively against all teams. The original formula first sums a team’s runs over all opponents and then applies the calculation once. The pairwise method instead determines each team’s Pythagorean win total against every other individual team and then sums those totals over all opponents.
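The difference between the two orderings is easier to see in code. The sketch below is one reading of the description above, not necessarily Heumann’s exact formulation: it assumes a team’s season is stored as a dictionary mapping each opponent to a tuple of games played, runs scored, and runs allowed against that opponent. That layout, the function names, and the sample numbers are all hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def cumulative_wins(matchups, exponent=2.0):
    """Traditional ordering: pool games and runs over all opponents first."""
    games = sum(g for g, _, _ in matchups.values())
    scored = sum(rs for _, rs, _ in matchups.values())
    allowed = sum(ra for _, _, ra in matchups.values())
    return games * pythagorean_win_pct(scored, allowed, exponent)


def pairwise_wins(matchups, exponent=2.0):
    """Pairwise ordering: expected wins against each opponent, then summed."""
    return sum(
        g * pythagorean_win_pct(rs, ra, exponent)
        for g, rs, ra in matchups.values()
    )


# Made-up numbers for one team: opponent -> (games, runs scored, runs allowed).
matchups = {"B": (10, 50, 40), "C": (10, 30, 45)}
print(round(cumulative_wins(matchups), 2), round(pairwise_wins(matchups), 2))
```

On this invented split the two orderings disagree (about 9.39 expected wins cumulatively versus about 9.17 pairwise), which is the kind of gap the pairwise formula is designed to capture.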
The pairwise Pythagorean formula was then tested against actual data. Using data from the 1960-1990 Major League seasons, both the traditional Pythagorean Theorem and the pairwise Pythagorean Theorem were calculated for each year, along with their root mean square error. Both formulas were calculated with exponents of 2 and 1.83. The resulting chart indicates that changing the exponent from 2 to 1.83 makes the original formula more accurate, but the pairwise formula is more accurate still, whether it uses 2 or 1.83 as the exponent. In 36 of the 60 seasons, the error of the two pairwise formulas was lower than that of both traditional formulas. In 22 seasons, both pairwise methods were more accurate than their traditional counterparts.
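Root mean square error is the yardstick in that comparison: for each season, square the gap between every team’s predicted and actual win total, average the squares, and take the square root. A minimal sketch follows; the function name and the input layout (two equal-length lists of win totals) are assumptions for illustration, not details taken from the research.

```python
import math


def rmse(predicted_wins, actual_wins):
    """Root mean square error between predicted and actual win totals."""
    if not predicted_wins or len(predicted_wins) != len(actual_wins):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [
        (predicted - actual) ** 2
        for predicted, actual in zip(predicted_wins, actual_wins)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the formula’s predicted win totals sat closer to what the teams actually did, so comparing this number across the four formula variants for a season shows which one fit that season best.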
To improve the formula further, the research suggests testing other variations. One variation is changing the exponent to something other than 2 or 1.83. It is also possible that the formula should not use a fixed exponent at all, but rather an exponent that is a function of some other variable and would therefore vary from team to team.
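One simple way to explore the first of those variations is to try a list of candidate exponents and keep the one with the lowest root mean square error. The sketch below does this for the traditional formula only. The function names, the team layout of (runs scored, runs allowed, games), and the idea of scanning a fixed candidate list are all assumptions made here for illustration. This is not the procedure used in the research.

```python
import math


def predicted_wins(team, exponent):
    """Traditional Pythagorean wins for one (runs scored, runs allowed, games) tuple."""
    runs_scored, runs_allowed, games = team
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


def season_rmse(teams, actual_wins, exponent):
    """Root mean square error of the predictions for one exponent."""
    errors = [
        (predicted_wins(team, exponent) - actual) ** 2
        for team, actual in zip(teams, actual_wins)
    ]
    return math.sqrt(sum(errors) / len(errors))


def best_exponent(teams, actual_wins, candidates):
    """Return the candidate exponent with the lowest error on this data."""
    return min(candidates, key=lambda x: season_rmse(teams, actual_wins, x))
```

Calling `best_exponent(teams, actual_wins, [1.7, 1.83, 2.0])` on real season data would show which of those three values fit best. An exponent that varies from team to team would need a different design, with the exponent computed per team from whatever variable is chosen.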
Analysts and coaches can use the pairwise Pythagorean formula to get a more accurate estimate of the number of games their team is expected to win over the course of a season. Analysts would be able to predict more accurately which teams will make the playoffs. Coaches would be better placed to make trades that improve their chances of winning.
There are many possible variations on the Pythagorean Theorem. It is conceivable that one of them will produce more accurate results than the traditional theorem currently in use. Further exploration is needed to determine which variation ultimately gives the best prediction of a team’s expected win-loss totals.