Today’s blog takes a look at frequency distributions and how this analytics method is applied in sports. Once data has been collected, it needs to be organized in a manner that makes it easy to analyze. The simplest way to do this is to count how often each outcome occurs and enter the counts into a frequency table. A typical win/loss chart for any team or league is a frequency table. These tables are easy to read: find the row for the team you want and read across to the column of interest. Frequency tables provide information such as number of games played, number of wins and losses, and various score statistics.
When looking at the table, the number of wins is the frequency of wins and the percentage of wins is the relative frequency. Relative frequency is found by dividing a team's total number of wins by the total number of games it has played. This adjusts for differences in the number of games played per team, making it easier to compare the teams.
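As a minimal sketch of the calculation described above, the following Python snippet builds a small frequency table and computes each team's relative frequency of wins. The team names, the game results, and the `frequency_row` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical game results for two made-up teams (illustration only).
results = {
    "Team A": ["W", "L", "W", "W", "L", "W", "W", "L", "W", "W"],
    "Team B": ["W", "W", "L", "W", "L", "W", "W", "W"],
}


def frequency_row(outcomes):
    """Return games played, wins, losses, and the relative frequency of wins."""
    counts = Counter(outcomes)
    games = len(outcomes)
    wins = counts["W"]
    losses = counts["L"]
    # Relative frequency = wins / games played (guard against zero games).
    win_pct = wins / games if games else 0.0
    return {"games": games, "wins": wins, "losses": losses, "win_pct": win_pct}


table = {team: frequency_row(outcomes) for team, outcomes in results.items()}

for team, row in table.items():
    print(
        f"{team}: {row['games']} GP, {row['wins']} W, "
        f"{row['losses']} L, {row['win_pct']:.3f}"
    )
```

In this made-up data, Team A has more wins (7 against 6) but has played more games, so Team B has the higher relative frequency (0.750 against 0.700). This is the kind of comparison that relative frequency makes possible.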
Frequency tables can summarize qualitative (non-numerical) or quantitative (numerical) data. Qualitative, or categorical, data fall into a limited set of named categories, such as a player's position or a win/loss result, and many of these categories have no natural ranking. Quantitative variables can be either discrete or continuous. Discrete variables take only separate, countable values. Number of wins is discrete, as you cannot have 1.5 wins; it must be a whole number. Continuous variables are those that can take on any value within a range. A fighter’s weight is a continuous variable, as a fighter can weigh 185 or 185.5 pounds.
Frequency tables can be viewed in graph form, called histograms. Histograms make it easier to compare data and look for patterns. However, if you are looking for specific numbers, they are more easily found in a frequency table.
Histograms can take on many different shapes. One shape is the normal or bell curve – a curve that looks like the shape of a bell. The normal curve is a theoretical shape that describes what happens under ideal conditions, so real data will only ever approximate a bell curve rather than match it exactly.
One method of comparing histograms is to look at their symmetry. Normal distributions are completely symmetrical with the mode, or highest point, in the middle. When the highest point is towards the left side of the graph, with a longer tail stretching to the right, the graph is said to be positively skewed. When the highest point is towards the right side, with a longer tail stretching to the left, it is negatively skewed.
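Skew can also be checked numerically rather than by eye. The sketch below uses the moment coefficient of skewness (the third central moment divided by the 1.5 power of the second), which is one common measure and not the only one. The sample data, the `describe_skew` helper, and its `tolerance` cutoff are assumptions chosen for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical, so there is no skew to measure
    return m3 / m2 ** 1.5


def describe_skew(values, tolerance=0.1):
    """Label a data set; `tolerance` is an arbitrary cutoff for 'near zero'."""
    g = skewness(values)
    if g > tolerance:
        return "positively skewed"
    if g < -tolerance:
        return "negatively skewed"
    return "roughly symmetrical"


# Hypothetical data sets (illustration only).
peak_left = [1, 1, 1, 2, 2, 3, 4, 6, 9]    # most values low, long right tail
peak_right = [9, 9, 9, 8, 8, 7, 6, 4, 1]   # most values high, long left tail
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]     # mirror image around 3

for name, data in [("peak_left", peak_left),
                   ("peak_right", peak_right),
                   ("balanced", balanced)]:
    print(f"{name}: {describe_skew(data)} ({skewness(data):+.2f})")
```

A positive coefficient goes with a longer right tail and a negative one with a longer left tail, matching the description above for single-peaked data.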
Histograms with only one peak are called unimodal while those with two peaks separated by a valley are bimodal.
Frequency tables and histograms are a quick, easy way for analysts and teams to compare individual players or entire teams. They can also be used to break down the stats of a single player. For example, the number of yards per pass for a quarterback can be grouped into categories such as 0 to 5 yards, 6 to 10 yards and so forth. Histograms are also a helpful visual aid when analysts discuss their findings with players or other stakeholders.
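The quarterback example can be sketched in a few lines of Python. The pass yardages below are invented, the `bin_label` helper is hypothetical, and the bins follow the 0 to 5, 6 to 10 pattern from the text, continuing in five-yard steps. The sketch assumes non-negative whole-yard gains.

```python
from collections import Counter

# Hypothetical gains on completed passes, in whole yards (illustration only).
pass_yards = [3, 7, 12, 5, 0, 8, 22, 6, 4, 15, 9, 31, 2, 11, 7]


def bin_label(yards):
    """Map a non-negative whole-yard gain to its bin: 0-5, 6-10, 11-15, ..."""
    if yards <= 5:
        return "0-5"
    low = ((yards - 1) // 5) * 5 + 1  # lower edge of the five-yard bin
    return f"{low}-{low + 4}"


# Frequency table: how many passes fall into each bin.
counts = Counter(bin_label(y) for y in pass_yards)
total = len(pass_yards)

# Text histogram, ordered by each bin's lower edge.
# Bins with no passes do not appear, because Counter only stores observed bins.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    n = counts[label]
    print(f"{label:>6} | {'#' * n} {n} ({n / total:.0%})")
```

Each row pairs the frequency (the count) with the relative frequency (the percentage), so the same output serves as both a frequency table and a rough histogram.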