Saturday, August 6, 2011

Differentiating between Luck and Skill Part III

In my last two posts on luck in pitching, I first defined a regression model and then used individual pitchers to determine if different types of pitchers were lucky or unlucky. In this post, I am going to use all of the pitchers in the data set (those pitchers that had enough innings to qualify for the ERA title from 2002-2010), separate them into different types of pitchers, and determine whether those types were lucky or not.

The first step is to determine what a ground ball, fly ball, and strikeout pitcher looks like. To do this, I found the 90th percentile in the data set for GB%, FB%, and K/9. I then set this number as the lowest value a pitcher could have in a given year to qualify as that type of pitcher. For example, the 90th percentile for strikeouts per nine innings was 8.63, so only 10% of pitcher seasons reached that rate. Any time a pitcher struck out at least that many hitters per nine innings in a season, he qualified as a "strikeout pitcher". I also did this for ground balls (90th percentile = 52.8%) and fly balls (90th percentile = 43.3%), and found that there were 78 qualifying ground ball pitcher seasons, 75 qualifying fly ball pitcher seasons, and 77 qualifying strikeout pitcher seasons. The graph below summarizes the results.


In any season, there have been between 4 and 15 pitchers who qualify as one of the types. The rest of the pitchers were set as "No Type", and will be ignored for the study. It is interesting to see how the number of groundball, flyball, and strikeout pitchers fluctuates over the past decade. One would think that it would remain relatively constant, as it is mostly the same pitchers pitching year to year.
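The classification step can be sketched in a few lines of Python. This is only an illustration of the idea, not the code I used: the field names (`year`, `gb_pct`, `fb_pct`, `k_per_9`) are made up for the example, the percentile uses simple linear interpolation (other definitions give slightly different cutoffs), and a season is allowed to qualify as more than one type.

```python
from collections import Counter


def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def label_types(seasons, stats=("gb_pct", "fb_pct", "k_per_9"), pct=90):
    """Tag every pitcher season with the types it qualifies for.

    `seasons` is a list of dicts; the keys in `stats` are hypothetical names.
    The cutoff for each stat is computed over all seasons pooled together.
    """
    cutoffs = {s: percentile([row[s] for row in seasons], pct) for s in stats}
    labeled = []
    for row in seasons:
        # A season qualifies when it is at or above the pooled cutoff.
        types = {s for s in stats if row[s] >= cutoffs[s]}
        labeled.append({**row, "types": types})
    return cutoffs, labeled


def count_by_year(labeled):
    """Count qualifying seasons for each (year, type) pair."""
    counts = Counter()
    for row in labeled:
        for t in row["types"]:
            counts[(row["year"], t)] += 1
    return counts
```

Seasons with an empty `types` set correspond to the "No Type" group, and `count_by_year` gives the per-season counts that the graph plots.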

One reason I can think of for the fluctuation is the offensive environment in each year. If more runs are scored, then there is better hitting, which means worse pitching, which should lead to fewer pitchers classified as types (this is because I classified based on the 90th percentile of all years, so if there is better hitting, there should be fewer pitchers above the 90th percentile in that specific year). This is especially true with strikeout pitchers. When we add runs per game to the graph, we get the following:


We see that the inclusion of R/G partly explains the fluctuations, as generally the higher the R/G, the lower the number of pitchers qualified, especially strikeout pitchers. I believe the remaining fluctuations are due to luck, though determining whether that is true is not the purpose of this post.
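One way to put a number on "the higher the R/G, the lower the count" would be a simple correlation between the yearly R/G and the yearly number of qualifying pitchers. I did not do this for the post, so the sketch below is just a suggestion; `pearson_r` is a name I made up, and with only nine seasons any correlation should be read loosely.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Passing the league R/G for each year as `xs` and the strikeout-pitcher count for each year as `ys` would give a negative value if the pattern in the graph holds.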

The first group of pitchers I am going to look at is groundball pitchers. The graph below shows both the predicted and actual ERAs by year for groundball pitchers. Although the points are not paired, which makes the graph less meaningful, we can still use it to figure out some important things.


We can see the range of ERAs for groundball pitchers, as well as any outliers, which would show luck. Most years, the average ERA tends to be slightly below 4.00, but in some years (2003) it is much lower, and in some (2004) it is much higher. Overall, the actual ERA for groundball pitchers was 3.62 and the predicted ERA was 3.63, showing just the slightest bit of luck. The model seems to predict the ERA of a groundball pitcher fairly well, but we cannot yet make any significant conclusions about the luck of a groundball pitcher.

Since the mean ERAs don't tell us much about the luck of a groundball pitcher, another measure might. If we look at each individual season (78 in total), we find that there were 34 cases where the actual ERA < predicted ERA, showing that the pitcher was lucky. The other 44 cases showed the pitcher to be unlucky (actual ERA > predicted ERA), meaning that only 43.6% of the time did a groundball pitcher have a lower ERA than expected. This agrees with the conclusion in my last post, which showed that Derek Lowe, representing groundball pitchers, was unlucky.
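The season-by-season comparison is just a count over paired values. Here is a minimal sketch, assuming the actual and predicted ERAs for one group are held in two lists in the same order (`luck_summary` is an illustrative name, not something from my original analysis):

```python
def luck_summary(actual_eras, predicted_eras):
    """Compare paired actual and predicted ERAs for one group of seasons."""
    if len(actual_eras) != len(predicted_eras) or not actual_eras:
        raise ValueError("need two non-empty lists of equal length")
    n = len(actual_eras)
    pairs = list(zip(actual_eras, predicted_eras))
    lucky = sum(a < p for a, p in pairs)    # beat the model
    unlucky = sum(a > p for a, p in pairs)  # did worse than the model
    return {
        "seasons": n,
        "lucky": lucky,
        "unlucky": unlucky,
        "lucky_share": lucky / n,
        "mean_actual": sum(actual_eras) / n,
        "mean_predicted": sum(predicted_eras) / n,
    }


# Check against the groundball counts reported above: 34 lucky seasons of 78.
print(f"{34 / 78:.1%}")  # 43.6%
```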

The next group of pitchers I am going to look at is flyball pitchers. The graph below is the same as the one above, but for flyball pitchers, and although it does look very similar, there are some slight differences.


The average ERA for a flyball pitcher looks to be slightly higher than that of a groundball pitcher. Also, there seem to be fewer outliers for flyball pitchers, showing that the regression is probably slightly more accurate for flyball pitchers (probably because FB% was included in the regression but not GB%). The actual ERA for flyball pitchers was 4.08, and the predicted ERA was 4.10, again showing the slightest bit of luck for flyball pitchers.

Again, since the mean actual and predicted ERAs are so similar that they don't tell us much, we turn to individual cases. For flyball pitchers, 54.7% of the time (41 out of 75) a pitcher had an actual ERA < predicted ERA, showing that he was lucky. This seems to agree with the statement made above from comparing the mean actual and predicted ERAs, and we can conclude that flyball pitchers are most likely lucky, or at least luckier than groundball pitchers.

The final group of pitchers to look at is strikeout pitchers. The graph below again shows ERAs by year for strikeout pitchers. The most interesting thing to note is how there is often an actual ERA outlier, showing that strikeout pitchers more often have their predicted ERA further from their actual ERA.


The mean actual ERA for strikeout pitchers was 3.32, while the predicted ERA was 3.28, showing that strikeout pitchers were actually slightly unlucky. There is a slightly bigger gap between the actual and predicted ERA here, reflecting what I mentioned above about actual ERA outliers. For strikeout pitchers, 42.9% of the time (33 out of 77) the pitcher had an actual ERA < predicted ERA, showing that the pitcher was lucky. So just like groundball pitchers, strikeout pitchers tended to be more unlucky than lucky.

Now that we have looked at all of the different types of pitchers, we can see which pitchers tend to be lucky and which tend to be unlucky. The luckiest pitchers seem to be flyball pitchers. More often than not, their actual ERAs were lower than their predicted ERAs. Groundball and strikeout pitchers were more unlucky, as their actual ERAs tended to be higher than their predicted ERAs.

This is by no means significant proof that different types of pitchers tend to have different luck, but it does show that based on the regression I ran, flyball pitchers are slightly luckier than any others. However, they did also have by far the highest ERA, showing that while it may be good to be lucky, it is much better to be skilled.
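For anyone who wants to check how far these splits are from a coin flip, one option is a two-sided sign test on the lucky counts reported above. This was not part of my analysis, so take it as a rough sketch: it assumes no season had an actual ERA exactly equal to its predicted ERA, and it treats every pitcher season as independent, which is questionable since the same pitchers appear in multiple years.

```python
from math import comb


def sign_test_p_value(lucky, total):
    """Two-sided exact binomial p-value for H0: lucky and unlucky are 50/50."""
    if total <= 0 or not 0 <= lucky <= total:
        raise ValueError("need 0 <= lucky <= total and total > 0")
    # Under H0 the distribution is symmetric, so double the smaller tail.
    k = min(lucky, total - lucky)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    return min(1.0, 2 * tail)


# Lucky seasons and total seasons for each type, as reported in this post.
reported = {"groundball": (34, 78), "flyball": (41, 75), "strikeout": (33, 77)}
for group, (lucky, total) in reported.items():
    p = sign_test_p_value(lucky, total)
    print(f"{group}: {lucky}/{total} = {lucky / total:.1%} lucky, p = {p:.2f}")
```

Under these assumptions, none of the three splits is distinguishable from 50/50 at the usual 5% level, which fits the caution above about not reading too much into the differences.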