February 23, 2020
Form Ratings: The Method Behind our Power Rankings
An explanation of the models we use to estimate the current form of PGA Tour players based on their playing history and how we use them to make our power rankings.
By Jeremy Phillips

There's no doubt about it... golfers go in and out of form all the time. Anybody who has played the game can relate; you think you have it all figured out one month and are searching for it the next month. Professional golfers are faced with this same challenge. While their “bad” is still good, it's not uncommon for their scoring average to fluctuate a stroke or two here or there and this can lead to big fluctuations in outcomes over four-round tournaments.

Consider Phil Mickelson as an example. The chart below shows all his PGA Tour scores going back to January 2015. The black line is a 24-round moving average.

You can see from the chart that Mickelson’s scoring average hovered around 70 from 2015 through the end of 2018. He had some periods of lower scores and higher scores, but for the most part he was pretty steady. His average started heading in the wrong direction near the end of 2019, peaking at nearly 73 before settling back down near 70 as of February 2020.
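For reference, a trailing moving average like the black line described above could be computed along the following lines. This is only a sketch: the function name `moving_average` and the sample scores are made up for illustration, and only the 24-round window comes from the chart.

```python
from collections import deque


def moving_average(scores, window=24):
    """Trailing moving average of the last `window` scores.

    Returns None for positions where fewer than `window` scores exist.
    """
    out = []
    buf = deque(maxlen=window)  # holds the most recent `window` scores
    total = 0.0
    for s in scores:
        if len(buf) == window:
            total -= buf[0]  # the oldest score is about to drop out
        buf.append(s)
        total += s
        out.append(total / window if len(buf) == window else None)
    return out


# Hypothetical scores, with a short window so the output is easy to follow.
sample = [70, 72, 68, 74]
print(moving_average(sample, window=2))  # [None, 71.0, 70.0, 71.0]
```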

While a moving average does a decent job of illustrating how Mickelson’s scoring average has evolved over time, it is a relatively naïve way to estimate form. Moving averages are susceptible to outliers and do not tell us anything about our relative uncertainty in where a player’s game is at any given point in time. Our Form Ratings model takes a more sophisticated approach to estimate a player’s form over time. The following lays out some concepts that motivate our model:

  • Scoring conditions can change significantly from one round or tournament to the next. For example, shooting even par in the third round of the PGA Championship is very different from shooting even par in the third round of the RSM Classic. Our model factors in these relative differences in scoring conditions across rounds.
  • There is a lot of randomness in golf scores. A player’s score in any individual round is only a noisy measurement of the player’s form. If you have played golf, you know you can show up to the course two days in a row with the same swing but score two very different numbers. Our model’s objective is to see through the noise and measure the signal.
  • Our estimate of a player’s form must not overreact to individual high or low scores. Suppose we think a player is currently a scratch golfer and they go out and shoot one over par in their next round. Does this mean our prior estimate of scratch was wrong? Not really. Shooting one over par is a perfectly reasonable score for a scratch golfer. Our model should update our estimate of a player’s form after a player posts new scores, but our update should consider how likely the new scores were given our prior belief. For example, if we believe the player is a scratch golfer, but we start seeing them shoot 64s and 65s consistently, then it would make sense for us to adjust our form rating downward to some degree.

Gross Scores vs. Adjusted Scores

Understanding differences in scoring conditions from round to round is essential for estimating players’ form ratings. Consider two players with the same scoring average, but suppose one of them has only played in low-scoring events while the other has only played in high-scoring events. It follows that the player who played only in high-scoring events is the better player, and we would expect that player to score better on average should these two players ever tee it up in the same tournament.

Our model controls for the scoring conditions from round to round by estimating round-level adjustment factors. These adjustment factors are subtracted from the gross scores to compute adjusted scores, which can then be compared directly across rounds and players.

The table below shows two examples of adjustment factors from the 2019 season. The 4th round of the PGA Championship at Bethpage Black was one of the toughest rounds of the year on the PGA Tour and our model estimates an adjustment factor of 4.8. The 4th round at the RSM Classic was one of the easiest rounds of the year with an adjustment factor of -2.8.

Tournament              Course                           Round   Adj Factor
2019 PGA Championship   Bethpage Black                   4       4.8
2019 RSM Classic        Sea Island Golf Club (Seaside)   4       -2.8

We can use these adjustment factors to compare scores across rounds. For example, we can compare shooting a 70 in each of the two rounds above. A 70 in the fourth round of the PGA Championship translates into an adjusted score of 70 – 4.8 = 65.2, while a 70 in the fourth round of the RSM Classic translates into an adjusted score of 70 – (-2.8) = 72.8. Our model estimates that a 70 in the fourth round of the PGA Championship is 72.8 – 65.2 = 7.6 adjusted strokes better than a 70 in the fourth round of the RSM Classic.
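The arithmetic above is simple enough to express in a few lines. The sketch below uses the two adjustment factors from the table; the function name `adjusted_score` is just an illustrative label, not something taken from the model's actual code.

```python
def adjusted_score(gross, adj_factor):
    """Adjusted score = gross score minus the round's adjustment factor."""
    return gross - adj_factor


# Adjustment factors quoted in the table above.
PGA_R4_ADJ = 4.8    # 2019 PGA Championship, round 4 (hard conditions)
RSM_R4_ADJ = -2.8   # 2019 RSM Classic, round 4 (easy conditions)

pga_70 = adjusted_score(70, PGA_R4_ADJ)
rsm_70 = adjusted_score(70, RSM_R4_ADJ)
gap = rsm_70 - pga_70  # positive: the PGA Championship 70 was the better round

print(round(pga_70, 1), round(rsm_70, 1), round(gap, 1))  # 65.2 72.8 7.6
```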

We have these adjustment factors for all rounds in our database which includes rounds on the PGA Tour dating back to January 2015. Let’s take another look at Phil Mickelson’s scores by comparing them to his adjusted scores. The chart below shows how each gross score is adjusted to become an adjusted score. These adjusted scores can be compared across time and across players and give us a cleaner read on Mickelson’s performances over time.

Randomness in Scores

There is a lot of randomness in golf scores. In the chart below we look at two groups of golfers. The first group consists of all adjusted scores in our database where our model estimated the player’s form rating to be between 68 and 70, while the second group consists of adjusted scores where the form rating was between 70 and 72. The bars show the distribution of the adjusted scores for both groups. The higher the bars, the more scores we have observed in that range from that group.

Since form rating is an estimate of skill, it is no surprise we see a higher percentage of scores in the 60-70 range from group one than we do from group two. An interesting takeaway, however, is that the distribution is quite wide for each group. 95% of the scores from group one fall between 63 and 75, which is a 12-stroke difference. When comparing group one’s distribution to group two’s distribution, we can see there is a lot of overlap. This means that even though a player from group one should be considered a favorite in a match against a player from group two, there is still a pretty good chance the player from group two will beat the player from group one. This is because of how much randomness there is in golf scores, and it is important to account for this randomness or we risk overreacting and letting the noise destabilize our form ratings.
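To put a rough number on "a pretty good chance," here is a back-of-the-envelope sketch. It rests on assumptions that are not part of the model described here: adjusted scores are treated as normally distributed, the two players' rounds are independent, both players share the same round-to-round standard deviation (backed out from the 63-to-75 range quoted above), and the group midpoints of 69 and 71 stand in for the two players' form ratings.

```python
from math import sqrt
from statistics import NormalDist

# 95% of a normal distribution lies within about 1.96 standard deviations.
Z_95 = NormalDist().inv_cdf(0.975)

# Assumption: the 63-to-75 range is a symmetric 95% interval of a normal curve.
round_sd = (75 - 63) / (2 * Z_95)  # roughly 3 strokes


def prob_upset(favorite_rating, underdog_rating, sd):
    """Chance the underdog posts the lower adjusted score in a single round.

    The score difference (underdog minus favorite) is normal with mean equal
    to the rating gap and standard deviation sd * sqrt(2).
    """
    diff = NormalDist(mu=underdog_rating - favorite_rating, sigma=sd * sqrt(2))
    return diff.cdf(0.0)


p = prob_upset(69.0, 71.0, round_sd)
print(f"single-round upset probability: {p:.2f}")
```

Under these assumptions the underdog wins a single round roughly one time in three, which is consistent with the heavy overlap between the two distributions.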

Seeing Through the Randomness

As we see in the example above, we expect to see a pretty wide range of adjusted scores from players with a given form rating. We have designed our model in a way that it will not make significant adjustments to a player’s form rating until it sees enough evidence of a change. That is, if the adjusted scores we observe seem unlikely for a player with a given form rating, then an adjustment will be made.

Consider the following example from Phil Mickelson’s scoring history. Here we are looking at Mickelson’s scores from the 2019 season. The solid black line shows our model’s estimate of Mickelson’s form rating through the end of May 2019. His last form rating during that time period was 68.2. The shaded region shows the range of adjusted scores we would expect to see from a player with a form rating of 68.2. As we start to observe Mickelson’s scores beyond May 2019, we see his adjusted scores begin to show up more in the upper part of the shaded region than in the lower part. We even see a few adjusted scores that come in above the shaded region altogether. We would not expect to see this pattern of scores from someone with a form rating of 68.2. As we observe these unusually high scores, our model begins to adjust Mickelson’s form rating up so that the scores we observe are more in line with those we would expect to see.

This is the basic mechanism by which our model adjusts players’ form ratings over time. Each time a player posts a new score, we determine how likely observing such a score was given our previous estimate of their form rating. If the score seems out of line with our expectations, then our model will make the necessary adjustments to bring the form rating into better alignment with the new scores. You will notice that the model does not overreact to any one score. This is by design and is because the model knows the scores it sees will have a lot of noise.
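One common way to get this kind of behavior is a one-dimensional Kalman-style update, sketched below. To be clear, this is a stand-in technique chosen for illustration and not necessarily how the Form Ratings model is built. The starting rating of 68.2 is the value quoted above; the starting variance, the per-round drift variance, the round-score variance of 9.4 (about 3 strokes squared), and the list of later scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FormState:
    rating: float    # current estimate of form (adjusted scoring average)
    variance: float  # uncertainty about that estimate


def update_form(state, adjusted_score, round_var=9.4, drift_var=0.01):
    """One Kalman-style update of a form estimate from a single new score.

    round_var: assumed variance of a single round around true form (noise).
    drift_var: assumed variance of how much true form moves between rounds.
    """
    prior_var = state.variance + drift_var        # form may have drifted
    gain = prior_var / (prior_var + round_var)    # small when rounds are noisy
    surprise = adjusted_score - state.rating      # how far off expectations
    new_rating = state.rating + gain * surprise
    new_var = (1.0 - gain) * prior_var
    return FormState(new_rating, new_var)


start = FormState(rating=68.2, variance=1.0)

# A single high score nudges the rating only slightly.
after_one = update_form(start, 71.0)

# A sustained run of high scores (hypothetical) moves it much further.
state = start
for score in [71.0, 73.0, 70.0, 74.0, 72.0, 75.0]:
    state = update_form(state, score)

print(round(after_one.rating, 2), round(state.rating, 2))
```

Because the assumed round-to-round noise is large relative to the uncertainty in the rating, each individual score gets a small weight, so the estimate shifts meaningfully only when several scores point the same way.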

Bringing it All Together

We began this article with a view of Phil Mickelson’s scoring history and computed a moving average to demonstrate how his performance had evolved over time. We are now able to provide a much more valuable perspective on Mickelson’s form. The chart below ties together everything we talked about above and shows Mickelson’s form rating over time.

Our model estimates the historical and current form ratings for every player on the PGA Tour. The model is updated after every PGA Tour event and the results have several very useful applications.

The first application that we have already deployed on the site is using every player’s most recent form rating to rank order players based on how they score in our model. These results are summarized every week in our Power Rankings.

Other applications that we will be exploring in the future include using our form ratings to simulate upcoming tournaments. Our model can be used not only to estimate a player’s current form based on their historical scores, but also to simulate future tournament results. For example, we can (1) estimate the probability of each player winning or finishing in the top 5, top 10, and top 20, (2) estimate the probability of each player making the cut, and (3) estimate the probability of players winning head-to-head matchups. All of these applications can aid us in finding value in betting markets and in daily fantasy sports competitions. Stay tuned.
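As a rough idea of what such a simulation might look like, here is a minimal Monte Carlo sketch. It is an illustration under simplifying assumptions and not the planned implementation: the player names and ratings are hypothetical, every round is drawn from a normal distribution around the player's form rating with the same standard deviation, and cuts, ties, and course conditions are ignored.

```python
import random


def simulate_win_probs(ratings, round_sd=3.0, rounds=4, n_sims=20000, seed=0):
    """Estimate each player's win probability by simulating tournaments.

    ratings: dict of player name -> form rating (lower is better).
    """
    rng = random.Random(seed)  # seeded so the results are repeatable
    wins = {name: 0 for name in ratings}
    for _ in range(n_sims):
        # Each player's total is the sum of `rounds` noisy round scores.
        totals = {
            name: sum(rng.gauss(rating, round_sd) for _ in range(rounds))
            for name, rating in ratings.items()
        }
        winner = min(totals, key=totals.get)  # lowest total wins
        wins[winner] += 1
    return {name: count / n_sims for name, count in wins.items()}


# Hypothetical three-player field.
field = {"Player A": 68.5, "Player B": 69.5, "Player C": 70.5}
probs = simulate_win_probs(field)
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

The same simulated totals could be reused to count top-5 finishes or head-to-head results by tallying different events inside the loop.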