Saturday, July 23, 2011

Differentiating between Pitching Luck and Skill Part II

In my first post on differentiating luck and skill for pitchers, I defined a regression model for determining a pitcher's skill. In this post, I want to look at individual pitchers as examples of certain types of pitchers to see if some are luckier than others, or if some are more skilled than they seem.

To review: a pitcher's career ERA is the defined baseline, or average skill of the pitcher. His predicted ERA is taken from the regression model, and its difference from career ERA defines his skill for that particular year. His actual ERA is observed yearly, and any difference from predicted ERA is attributed to luck. One noticeable thing is that the predicted values in these graphs are much closer to the actual values than they were in the hitters' AVG graphs. This is because the R-squared of the pitching regression is .807, while in the hitting regression it was only .344, so much more of the variability is explained by the independent variables.
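The decomposition described above can be sketched in a few lines. This is only an illustration: the function name and the sign convention (negative means fewer runs allowed) are assumptions of mine, and the inputs are Derek Lowe's 2005 figures quoted later in this post.

```python
# Split one season into "skill" and "luck" as defined in the review above.
# Assumed sign convention: negative values mean fewer runs allowed.

def decompose(career_era, predicted_era, actual_era):
    skill = round(predicted_era - career_era, 2)  # season vs. career baseline
    luck = round(actual_era - predicted_era, 2)   # what the model cannot explain
    return skill, luck

# Derek Lowe, 2005: career ERA 3.87, predicted ERA 3.96, actual ERA 3.61
skill, luck = decompose(3.87, 3.96, 3.61)
print("skill:", skill, "luck:", luck)
```

A negative luck value here means the actual ERA came in below the prediction, which is what this series calls being lucky.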

The first pitcher I want to look at is Roy Halladay. Halladay has been remarkably consistent over the past few years, so it will be interesting to see if the model follows his ERA or jumps around each year. As we can see in the graph below, there has been an interesting split to Halladay's career. Until 2008, his actual ERA was always higher than his predicted ERA, showing that he was unlucky. However, from 2008 through today, his actual ERA has been lower than his predicted ERA, showing that he has been lucky.


There are a couple of possible reasons behind the change in "luck" for Halladay. Since 2008, he has had a much lower WHIP than before, has induced more fly balls, thrown many more first pitch strikes, and has had a much higher swinging strike percentage. The lower WHIP played a huge role in cutting his ERA by almost a full run (from 3.71 to 2.78) between 2007 and 2008, and it would also lead the model to predict a much lower ERA.

The other factors are much more interesting, and can explain the difference in luck much better. A higher FB% should result in more home runs and a higher ERA, but Halladay offset this by having a much lower HR/FB ratio. He was allowing many more fly balls but only slightly more home runs. And since fly balls result in lower AVG and OBP, Halladay was basically mitigating the bad result of more fly balls (home runs), and simply using them to his advantage.

The other two differences, throwing more first pitch strikes and a higher swinging strike percentage, go hand-in-hand in explaining the luck factor. Because he increased both variables, the model predicted that his ERA would increase, at least from these variables. In fact, he had a much lower ERA, and this is probably due in large part to the extra strikes he threw and the swings and misses he generated. Getting ahead of more hitters probably led to a lower WHIP, which would decrease his ERA. This really goes back to the previous post and the controversial regression model. I believe that in Halladay's case, more strikes led to a lower ERA and not a higher ERA as the regression model predicted. This would explain his apparent luck.

Now that I have discussed some factors of luck and skill, I want to look at different types of pitchers, using one pitcher as an example of each type. I am going to define the types by GB%, FB%, and K%, but I am also going to consider the experience of each pitcher. I want pitchers who have pitched in MLB for a decent amount of time so their results are more stable.

The first type of pitcher I want to look at is a ground ball pitcher. I am going to use Derek Lowe as an example, as in 2010 he had the third highest ground ball rate in MLB at 58.8%. He has had a long career, and his reputation as a ground ball pitcher has only grown with time as he has relied more and more on his sinker as his career has progressed.


Lowe's ERA has fluctuated a lot since 2002. In five seasons he has had a predicted ERA below his career ERA of 3.87, and in five seasons he has had a predicted ERA above his career ERA. In nine of ten seasons his actual ERA has been higher than his predicted ERA, showing that he has been unlucky. Only in 2005 did he have any kind of good luck, when his predicted ERA was 3.96 and his actual ERA was 3.61. A much lower WHIP (1.61 in 2004, 1.25 in 2005) led to a lower predicted ERA, but he also had career highs in first pitch strike %, HR/9, and HR/FB in 2005. All of these were predicted to increase his ERA, but Lowe actually managed to post an ERA almost two runs lower than the year before while allowing more home runs. So these factors kept his predicted ERA from falling nearly as far as his actual ERA did.

2005 seems like an outlier, and that's why the model shows that he was much luckier in 2005 than in any other year. The important takeaway is that Lowe seems to be an unlucky pitcher overall, at least from the regression's standpoint. This is an interesting point, because the regression says that the higher the FB%, the higher the ERA, so a pitcher with a low FB% should have a lower ERA. But this is not the case. I am very curious now as to what the result for a fly ball pitcher will be.

I am going to look at Ted Lilly as an example of a fly ball pitcher. He posted by far the highest FB% of any pitcher in 2010 at 52.6%. He has always been a fly ball pitcher, but has become more so as his career has progressed.


Lilly has an interesting chart: he fluctuated between being lucky and unlucky from 2003 to 2007, and since then he has performed almost exactly as predicted. Between 2008 and 2010, the model predicted his ERA to be within 3 points of his actual ERA every season (including dead on in 2009), and this year it is only off by about 16 points so far. Lilly may not be the best example of a fly ball pitcher, as his FB% has fluctuated over the years. A follow-up study on all fly ball pitchers, and not just Lilly, will be required to determine if they are lucky or not, because Lilly seems to have performed just as predicted (although that may be the case for all fly ball pitchers).

The final type of pitcher I want to look at is the strikeout pitcher. I am going to use Justin Verlander as an example, even though he only finished 11th in baseball in K/9 in 2010 with 8.79 K/9. All of the pitchers above him were younger and had less experience, which may show how pitchers change over time. Young pitchers may be able to get by with mostly a fastball, but once they age and their fastball loses some speed they have to rely on other pitches and craftiness to get hitters out.


Verlander's actual ERA and predicted ERA seem to mirror each other in the graph, except that his actual ERA is always slightly higher (except for 2006) than his predicted ERA, showing that he is unlucky. In his worst year for luck (2008), Verlander had a predicted ERA of 4.15 but an actual ERA of 4.84. Much of the difference in skill for 2008 seems to be due to a career high WHIP, but the difference in luck is less clear. One reason, which I haven't talked about yet, may be Verlander's left on base % (LOB%). This variable measures the number of runners a pitcher leaves stranded out of the total number of runners on base (so 1 - LOB% would be the percentage of runners who score). In 2008, Verlander had a LOB% of only 65.4%, which is by far the lowest percentage in his career (his next lowest percentage is 72.0%). Although his increase in WHIP showed that he was allowing more runners, and thus more runs (which was predicted by the model), he was also allowing more of those baserunners to score, which would not be predicted by the model. Thus the model would show him to be unlucky.
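To put the LOB% gap in concrete terms, here is a rough sketch that uses the simplification above (1 - LOB% as the share of baserunners who score). The function name is hypothetical; the two percentages are the ones quoted for Verlander.

```python
def scoring_share(lob_pct):
    """Share of baserunners who score, using the simplification 1 - LOB%."""
    return 1.0 - lob_pct

share_2008 = scoring_share(0.654)          # Verlander's 2008 LOB%
share_next_lowest = scoring_share(0.720)   # his next lowest LOB%

# Additional runners scoring per 100 baserunners in 2008
extra_per_100 = (share_2008 - share_next_lowest) * 100
print(round(extra_per_100, 1))
```

By this rough measure, about six or seven more runners out of every 100 came around to score in 2008 than in his next worst season, none of which the model's inputs would capture.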

It will be interesting to see if this conclusion holds up for all strikeout pitchers. They have higher swinging strike % and should have higher first pitch strike %, but also probably have lower strike zone swing % and strike zone contact %. These factors oppose one another, and it will be interesting to see whether they cancel out, or if one effect dominates another and the pitchers are shown to be either lucky or unlucky.

In this post, I used individual pitchers to represent different types of pitchers. This was not an especially effective method, but it helped explain some of the reasons behind the difference in luck and skill for certain pitchers. In my next post, I am going to actually separate pitchers into different types based on GB%, FB%, and K% to see if there are any differences by type. This will make the differences between types much clearer than the differences between individual pitchers can. This post concluded that ground ball and strikeout pitchers are shown to be unlucky, while no conclusion can be made for fly ball pitchers. It remains to be seen if those conclusions will hold up in the next post.

Thursday, July 14, 2011

Differentiating between Pitching Luck and Skill Part I

A few weeks ago, I did a couple of posts on differentiating between luck and skill for hitters. I want to now look at it from the other side: how to differentiate between luck and skill for pitchers. This time, instead of using batting average like I did for hitters, I am going to use ERA. Batting averages against pitchers have been shown to be wildly inconsistent, so a better dependent variable is the runs that a pitcher gives up, because it is much more consistent and definitive over time. I don't want to simply look at counting variables such as strikeouts, walks, and home runs, but also at batted ball statistics and detailed pitching statistics.

Just like last time, I want to introduce a bunch of variables and figure out which of them are important using Mallows's Cp and p-values. I tried running a stepwise regression on all of the variables together, but there were too many variables, so I ran two separate regressions and combined the results.
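For readers who have not met Mallows's Cp, here is a minimal sketch of the statistic itself. It uses the standard formula Cp = SSE_p / s^2 - n + 2p, where s^2 is the mean squared error of the full model and p counts the subset's coefficients including the intercept. The sample size and error sums below are hypothetical numbers for illustration, not values from my data.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows's Cp = SSE_p / s^2 - n + 2p.

    s^2 is the full model's mean squared error; p counts the subset
    model's coefficients, including the intercept.
    """
    return sse_subset / mse_full - n_obs + 2 * n_params

# Hypothetical numbers, for illustration only
n_obs = 500
full_params = 10                               # 9 predictors + intercept
sse_full = 98.0
mse_full = sse_full / (n_obs - full_params)    # 0.2

cp_full = mallows_cp(sse_full, mse_full, n_obs, full_params)
cp_subset = mallows_cp(99.0, mse_full, n_obs, 7)   # 6 predictors + intercept

print(cp_full, cp_subset)
```

The full model always scores Cp equal to its own parameter count, so a smaller model is attractive when its Cp comes in lower, meaning the dropped variables cost little in fit.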

The first stepwise regression I ran used counting and batted ball stats. The candidate variables for predicting ERA were K/9, BB/9, HR/9, WHIP, GB/FB, LD%, GB%, FB%, and HR/FB. Running the stepwise regression, the best model for predicting ERA is ERA = K/9 + BB/9 + HR/9 + WHIP + GB/FB + FB%.

The second stepwise regression involves the "plate discipline" variables. These variables deal with things such as how often a batter swings and makes contact or the percentage of pitches in the strike zone. I collected nine of these variables from FanGraphs, divided into three categories. Swinging includes O-Swing %, the percentage of pitches a batter swings at outside of the strike zone, Z-Swing %, the percentage of pitches a batter swings at inside of the strike zone, and Swing %, which is the total percentage of pitches swung at. Contact includes O-Contact %, the percentage of pitches a batter makes contact with when swinging at pitches outside the strike zone, Z-Contact %, the percentage of pitches a batter makes contact with when swinging at pitches inside the strike zone, and Contact %, which is the total percentage of contact made when swinging at all pitches. Finally, accuracy includes Zone %, the percentage of pitches inside of the strike zone, F-Strike %, which is the first pitch strike percentage, and SwStr %, which is the percentage of all pitches that were swung at and missed.
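The swing, contact, and zone definitions above can be sketched as simple ratios of pitch counts. The function name, argument names, and pitch counts here are all hypothetical; FanGraphs publishes the finished rates, so this only shows how the pieces relate to each other.

```python
def plate_discipline(zone_pitches, out_pitches, zone_swings, out_swings,
                     zone_contact, out_contact):
    """Swing, contact, and zone rates as fractions (0.65 means 65%)."""
    total_pitches = zone_pitches + out_pitches
    total_swings = zone_swings + out_swings
    return {
        "O-Swing%": out_swings / out_pitches,
        "Z-Swing%": zone_swings / zone_pitches,
        "Swing%": total_swings / total_pitches,
        "O-Contact%": out_contact / out_swings,
        "Z-Contact%": zone_contact / zone_swings,
        "Contact%": (zone_contact + out_contact) / total_swings,
        "Zone%": zone_pitches / total_pitches,
    }

# Hypothetical season of 3,000 pitches, half of them in the strike zone
rates = plate_discipline(zone_pitches=1500, out_pitches=1500,
                         zone_swings=975, out_swings=450,
                         zone_contact=858, out_contact=270)
for name, value in rates.items():
    print(name, round(value, 3))
```

Note that Swing % and Contact % are weighted blends of their zone and out-of-zone versions, which is one reason the overall rates can turn out to be redundant once the zone-specific rates are in a model.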

When I ran a stepwise regression with all of these variables, the best regression output was ERA = Z-Swing % + Swing % + Z-Contact % + First Strike % + SwStr %.

Now that I have run two separate stepwise regressions, I can combine the results and run one more stepwise regression to make sure the model is the best it can be. When I do that, I find that Swing % is no longer needed. Another change I need to make concerns redundant variables. Since WHIP includes walks in its calculation, having both WHIP and BB/9 in the regression would count walks twice. Because WHIP also includes hits, which could be important in predicting ERA, I am keeping WHIP and removing BB/9 from the regression.
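The overlap between WHIP and BB/9 is easy to see from their definitions: WHIP is walks plus hits per inning, and BB/9 is walks per nine innings, so WHIP can be rebuilt from BB/9 and H/9. A small sketch with a hypothetical season line (the helper names and the numbers are mine, not from the data set):

```python
def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings

def per_nine(count, innings):
    """Rate of an event per nine innings."""
    return 9 * count / innings

# Hypothetical season line
walks, hits, innings = 60, 190, 200.0

bb9 = per_nine(walks, innings)
h9 = per_nine(hits, innings)

# WHIP rebuilt from the two per-nine rates
whip_from_rates = (bb9 + h9) / 9
print(whip(walks, hits, innings), whip_from_rates)
```

Since every walk moves both BB/9 and WHIP, keeping both would split the effect of walks between two coefficients and make each harder to interpret.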

The regression now looks like this: ERA = K/9 + HR/9 + WHIP + GB/FB + FB% + Z-Swing % + Z-Contact % + First Strike % + SwStr %. When I run a linear regression on this model, I find that the p-value for GB/FB rate is an astronomically high 0.864, so it is clearly not as important as I first thought. K/9 also has a very high p-value of 0.594, so that can also be taken out. We are left with seven variables that should, with relatively high confidence, predict a pitcher's ERA. The final regression model is as follows: ERA = WHIP + HR/9 + FB% + Z-Swing % + Z-Contact % + First Strike % + SwStr %. The output table from R is below.

Coefficients:
               Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)    -6.89402   0.92749      -7.433    2.83e-13   ***
WHIP            3.82090   0.12118      31.532    < 2e-16    ***
HRper9          0.95049   0.05343      17.791    < 2e-16    ***
Fbperc          0.49971   0.24764       2.018    0.043950   *
Zswingperc      0.95022   0.42860       2.217    0.026914   *
Zcontactperc    3.55909   0.80202       4.438    1.04e-05   ***
Fstrikeperc     1.32946   0.39701       3.349    0.000851   ***
Swstrperc       2.44227   1.26266       1.934    0.053452   .
R-Squared = 0.8071

The R-squared value for the regression is actually quite good, showing that over 80% of the variation in ERA can be explained by the seven independent pitching variables.
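To show how the fitted equation turns a stat line into a predicted ERA, here is a minimal sketch using the coefficients from the table. The stat line is a hypothetical, roughly average pitcher, not a real one. I am also assuming the percentage variables are entered as fractions (0.37 rather than 37), since that is the scale on which the equation returns realistic ERAs.

```python
# Coefficients copied from the R output above
INTERCEPT = -6.89402
COEFFICIENTS = {
    "WHIP": 3.82090,
    "HRper9": 0.95049,
    "Fbperc": 0.49971,
    "Zswingperc": 0.95022,
    "Zcontactperc": 3.55909,
    "Fstrikeperc": 1.32946,
    "Swstrperc": 2.44227,
}

def predict_era(stats):
    """Predicted ERA: intercept plus the sum of coefficient * value."""
    return INTERCEPT + sum(COEFFICIENTS[name] * stats[name]
                           for name in COEFFICIENTS)

# Hypothetical pitcher; percentages assumed to be fractions between 0 and 1
example = {
    "WHIP": 1.30,
    "HRper9": 1.00,
    "Fbperc": 0.37,
    "Zswingperc": 0.65,
    "Zcontactperc": 0.88,
    "Fstrikeperc": 0.58,
    "Swstrperc": 0.085,
}

predicted = predict_era(example)
print(round(predicted, 2))
```

This hypothetical line comes out at a predicted ERA of roughly 3.94. Entering the same percentages as whole numbers (37, 65, 88 and so on) would produce a predicted ERA in the hundreds.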

An increase of one in WHIP is associated with an increase of 3.821 runs in ERA. This one makes a lot of logical sense. Giving up one more baserunner per inning should really hurt your ERA. Since ERA is based on 9 innings, we can see that the one extra baserunner per inning would increase the runs allowed per inning by 0.425 runs. This number makes a lot of sense both intuitively and statistically. Looking at the expected runs matrix from The Book, we can see the effect of one extra baserunner per inning. If you take the expected runs for a given base/out situation, subtract the expected runs for the same situation with one fewer runner, and combine all of the possibilities, you get a good estimate of the effect of extra baserunners. For example, the expected runs for no outs and no runners is 0.555, and the expected runs for no outs and a runner on first is 0.953. The difference between those is 0.398. If we calculate all of the differences, we find that the expected increase in runs per inning is 0.4129 with one more baserunner per inning. Now this is not rigorous math, but it is a simple way of showing that the coefficient for WHIP makes a lot of sense.
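The arithmetic in this paragraph, spelled out as a short sketch. Only the figures quoted above are used, and the variable names are mine.

```python
whip_coef = 3.82090                 # runs of ERA per unit of WHIP

# ERA is runs per 9 innings, so divide by 9 for runs per inning
runs_per_inning = whip_coef / 9

# Run expectancy with no outs, as quoted from The Book
re_bases_empty = 0.555
re_runner_on_first = 0.953
one_state_difference = re_runner_on_first - re_bases_empty

print(round(runs_per_inning, 3), round(one_state_difference, 3))
```

The regression's 0.425 runs per inning lands between the single-state difference of 0.398 and within about 0.01 of the 0.4129 obtained from all of the differences.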

An increase of one HR/9 is associated with an increase of 0.9505 runs in ERA. Clearly, giving up more home runs is the fastest way to increase your ERA. However, this value does seem somewhat low. In January, I found that a home run hit in 2010 was worth 1.406 runs. So giving up one more home run per nine innings should be worth about 1.41 runs, which means your ERA should increase by about 1.41. Unearned runs are playing a part in decreasing that value, but shouldn't decrease it by close to half a run.
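One possible accounting for the gap, offered as an untested assumption rather than a result of the regression: a home run is also a hit, so it raises WHIP as well as HR/9. With innings held fixed, one extra home run per nine innings adds 1/9 to WHIP, and part of the home run's cost would then show up under the WHIP coefficient. A sketch of that arithmetic:

```python
hr9_coef = 0.95049       # runs of ERA per unit of HR/9
whip_coef = 3.82090      # runs of ERA per unit of WHIP
home_run_value = 1.406   # run value of a 2010 home run, as quoted above

# Assumption: one more HR per 9 innings is also one more hit per
# 9 innings, which is 1/9 of a baserunner per inning in WHIP terms.
whip_share = whip_coef / 9
combined_effect = hr9_coef + whip_share
remaining_gap = home_run_value - combined_effect

print(round(combined_effect, 3), round(remaining_gap, 3))
```

Under that assumption the model's total charge for an extra home run per nine innings is about 1.375 runs, which leaves a gap of only about 0.03 runs to the 1.406 figure.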

The coefficient on a pitcher's fly ball percentage is 0.4997, so a higher FB% is associated with a higher ERA. That figure is per full unit of FB%. The rates appear to have been entered as fractions (0.348 rather than 34.8), since that is the only scale on which the fitted equation returns realistic ERAs, so one percentage point of FB% works out to roughly 0.005 runs. As I found in my post on hitting luck vs. skill, a higher fly ball percentage leads to a lower batting average for hitters, which seems to contradict this result. However, fly balls are associated with a much higher slugging percentage than ground balls, and the chance of a fly ball becoming a home run is a great risk to ERA. As an example, Javier Vazquez had a great season in 2009, with a 2.87 ERA and a FB% of only 34.8%. When he moved to the Yankees in 2010, his fly ball rate jumped to 47% and his ERA blew up to 5.32. So far in 2011, he has a 48.1% FB% and an ERA of 5.23, pretty much in line with his 2010 stats. Although FB% is clearly not the only reason why his ERA jumped, it certainly contributed, especially because his HR/FB rate jumped from 10.1% in 2009 to 14.0% last year.

The coefficient on a pitcher's swing percentage in the strike zone is 0.95, or about 0.0095 runs of ERA per percentage point with the rate expressed as a fraction between 0 and 1. If those extra swings were misses, this would be a good thing, but it is possible that hitters are simply hacking more often because the pitches look much better to hit. Pitchers that are truly successful will be able to get outs by pitching to the corners and making the batter swing only at a good "pitcher's pitch". A pitcher constantly painting corners will make the batter take more pitches as he looks for better pitches to hit, before being forced to swing with two strikes.

The coefficient on a pitcher's contact percentage in the strike zone is 3.559, or about 0.036 runs of ERA per percentage point with the rate expressed as a fraction between 0 and 1. Obviously, if hitters are hitting a higher percentage of pitches, then they are most likely seeing the ball better and hitting it more squarely. This would definitely lead to a higher ERA. Although the coefficient may seem very high, contact percentages are pretty consistent, so a big jump is rare and would lead to a much higher ERA.

The coefficient on a pitcher's first strike percentage is 1.33, or about 0.013 runs of ERA per percentage point with the rate expressed as a fraction between 0 and 1. This is really the first debatable result in the regression. One would think that throwing more first pitch strikes would lead to a lower ERA, but that is not what the model says. One plausible explanation can be found in this table, which shows the hitting splits for all of MLB in 2010 on different counts. The slash line for hitters on the first pitch of an at-bat is a robust .334/.340/.534. That is well above league average, so a hitter who hits the first pitch is going to have more success overall. Throwing more first pitch strikes leads to more hittable pitches and thus a higher ERA. However, throwing fewer first pitch strikes leads to pitchers getting behind in the count, and when that happens hitters hit .302/.473/.498. Although BA and SLG are lower, the OBP is much higher (mostly due to walks), and it is the statistic that is most important in creating runs. This coefficient needs to be looked at more in-depth, but right now the regression says that more first pitch strikes lead to a higher ERA, so we are going to take that as a given.

The coefficient on a pitcher's swinging strike percentage is 2.442, or about 0.024 runs of ERA per percentage point with the rate expressed as a fraction between 0 and 1. The explanation is almost identical to the one for swing percentage in the strike zone. The more strikes a batter swings at, the fewer strikes he is simply taking. Strikes that aren't swung at have no negative consequences (other than maybe stolen bases) because the ball has no chance of being put in play, so by the model's logic pitchers should want the swinging strike percentage to be lower, because it is associated with a lower ERA. However, having a lower swinging strike percentage does not necessarily lead to a lower ERA. A pitcher must have great control in order to take advantage of a hitter.

In my next post, I will explore different pitchers' luck and skill, just like I did for hitters. Now that the model has been defined, it will again show a pitcher's predicted ERA, and the difference between career ERA and predicted ERA will show the improvements the pitcher has made that season and will be defined as skill. The difference between predicted ERA and actual ERA will show the pitcher's luck. It will be interesting to look at certain examples of pitchers who are lucky or not, and whether they fit a certain stereotype. Maybe ground ball pitchers have, on the whole, a lower ERA than their fly ball counterparts. I will show examples of certain pitchers, and we will be able to figure out whether they are truly good pitchers, or have simply gotten lucky.