In my last post on the topic of winning and staying profitable, I found that over the last 3 years (2007-2009) there has been a clear pattern in $/WAR for players from age 22 to 32. Today, I want to take that macro-level result and test it at the micro level, on individual players.
I have picked 5 players who fit a few criteria from the model. The first criterion is that the player must be born no earlier than 1981, so that we will be able to predict their next couple of seasons (we can only predict until age 32; after that a player's performance becomes erratic). The second is that the player must have played in the league for at least 3 years (so mostly born no later than 1987), so that we have some data points to fit a model to. Finally, the player must be under contract until at least 2013, so that we can turn our $/WAR estimates into WAR using the player's salary. We could predict a player's $/WAR for the next few years regardless, but without a salary we would not be able to predict their WAR. The whole point of this exercise is to predict how a player will perform in the next few years, and in our case the performance variable is WAR, so we do need a salary. So what we are looking for are young players locked into long contracts.
The five players I have picked are as follows: Curtis Granderson, Evan Longoria, Nick Markakis, Dustin Pedroia, and Mark Reynolds. I am sure there are other players we could have evaluated, but these were the first five players that I came across that fit the criteria. I picked five because it will show varying results, but having many more than five would be too much work to do for a simple study. We will analyze them player-by-player.
What I have done for each player is find a second-order polynomial (in other words, a quadratic model) that fits the player's $/WAR for each year from the beginning of their career to 2010. I then extrapolated the function to the last year that they are under contract (or in Granderson's case, until he turns 33), and compared the player's individual function with the league function. The league function (which I modified slightly from my last post) is also quadratic, with the equation y = 37720x^2 - 144374x + 328152, where x is an index of the player's age starting at 1 (so x = 1 is age 22, the youngest age in the model) and y is the $/WAR for that year. For each future season this gives two estimates of $/WAR, one from the league function and one from the player's own function. I took the arithmetic mean of the two and set that value as the player's predicted $/WAR. Finally, I divided the player's salary by each of the three $/WAR values to get a predicted minimum, maximum, and average WAR for each season. Because a higher $/WAR means fewer wins for the same salary, the larger of the two estimates gives the minimum WAR and the smaller gives the maximum; the average WAR comes from the mean $/WAR. If this all sounds confusing, the examples, along with the graphs and tables, should hopefully clear things up.
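For readers who prefer code to prose, here is a minimal Python sketch of that procedure. It only illustrates the arithmetic and is not the tool used to produce the graphs. The function names are made up for this example, and two indexing choices are assumptions: the player's trendline is evaluated at a season number (1 = first season), and the league trendline at an age index where 1 = age 22.

```python
# League trendline from the post: y = 37720x^2 - 144374x + 328152.
LEAGUE_COEFFS = (37720, -144374, 328152)


def dollars_per_war(coeffs, x):
    """Evaluate a quadratic trendline a*x^2 + b*x + c at x."""
    a, b, c = coeffs
    return a * x * x + b * x + c


def predict_war(player_coeffs, x_player, x_league, salary):
    """Return (minimum, maximum, average) predicted WAR for one season.

    x_player: the player's season number (assumed: 1 = first season).
    x_league: the league age index (assumed: 1 = age 22).
    """
    player = dollars_per_war(player_coeffs, x_player)
    league = dollars_per_war(LEAGUE_COEFFS, x_league)
    average = (player + league) / 2  # arithmetic mean of the two estimates

    # A higher $/WAR buys fewer wins for the same salary, so the larger
    # estimate yields the minimum WAR and the smaller one the maximum.
    low, high = min(player, league), max(player, league)
    return salary / high, salary / low, salary / average


# League $/WAR for ages 22 through 32 (x = 1 through 11).
league_curve = {21 + x: dollars_per_war(LEAGUE_COEFFS, x) for x in range(1, 12)}
```

Note that the average WAR is the salary divided by the mean $/WAR, which is not the same as the midpoint between the minimum and maximum WAR.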
First up is Curtis Granderson. The graph below shows the league trendline (green), Granderson's trendline (red), and his observed $/WAR values up to 2010 and predicted $/WAR values until 2013 (blue). He has played in the league since 2006, tied with Markakis for the longest of the five players, so we have a lot of data for him. His trendline has the equation y = 369618x^2 - 1441988x + 1285573, which we can see trends upward much faster than the league average. The trendline has an R^2 value of 0.98354 for Granderson's first five years in the league. Next year (2011), he has a salary of $8,250,000, and the graph predicts a $/WAR of $4,011,999.50, which means that his predicted WAR is 2.1. In 2012, his predicted WAR is 1.7 (salary of $10 million) and in 2013 it is 1.6 (salary of $13 million).
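As a check on the Granderson numbers, the short Python sketch below redoes the arithmetic from the two trendline equations and the salaries quoted above. The x values are my reading of the setup rather than something stated outright: 2011 is treated as Granderson's sixth season (x = 6 on his own trendline) and as age 30 (x = 9 on the league trendline, with x = 1 at age 22). Under that reading the quoted $/WAR for 2011 is reproduced exactly.

```python
LEAGUE = (37720, -144374, 328152)          # league trendline from the post
GRANDERSON = (369618, -1441988, 1285573)   # Granderson's trendline from the post


def quad(coeffs, x):
    """Evaluate a*x^2 + b*x + c."""
    a, b, c = coeffs
    return a * x * x + b * x + c


# (year, season index, age index, salary); both indices are assumptions.
seasons = [
    (2011, 6, 9, 8_250_000),
    (2012, 7, 10, 10_000_000),
    (2013, 8, 11, 13_000_000),
]

predictions = {}
for year, x_player, x_league, salary in seasons:
    # Predicted $/WAR is the mean of the player and league estimates.
    mean_dpw = (quad(GRANDERSON, x_player) + quad(LEAGUE, x_league)) / 2
    predictions[year] = (mean_dpw, round(salary / mean_dpw, 1))
    print(year, mean_dpw, predictions[year][1])

# prints:
# 2011 4011999.5 2.1
# 2012 5979675.5 1.7
# 2013 8354689.5 1.6
```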
Next up is Evan Longoria. 2010 is only his third year in the majors, so his data is relatively limited. His trendline, with the equation y = 44144x^2 - 180679x + 268113, has an R^2 value of exactly 1.00, but that is guaranteed with only 3 data points: a quadratic has three coefficients, so it can pass through any three points exactly. Longoria's graph is interesting because it is the only one where the league trendline and the player trendline actually converge. This means that as we predict further out, through 2014, the gap between the two estimates narrows and the predictions should become more and more precise. One of the reasons for this is that Longoria has a very reasonable contract, which means that his $/WAR trendline will not climb steeply until far in the future. Another reason is that Longoria's production has so far been very close to the league average (in terms of $/WAR), so the two trendlines are very similar.
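To see why three data points say so little about fit quality, here is a small Python sketch that fits a parabola through three points and computes R^2. The three (x, y) values are made up for illustration and are not Longoria's actual $/WAR figures; any three points with distinct x values give the same result.

```python
def fit_quadratic_through(points):
    """Return (a, b, c) of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    # Newton divided differences.
    d1 = (y2 - y1) / (x2 - x1)
    d2 = (y3 - y2) / (x3 - x2)
    a = (d2 - d1) / (x3 - x1)
    b = d1 - a * (x1 + x2)
    c = y1 - a * x1 * x1 - b * x1
    return a, b, c


def r_squared(points, coeffs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    a, b, c = coeffs
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot


# Hypothetical $/WAR values for three seasons (not real data).
sample = [(1, 130000), (2, 85000), (3, 125000)]
coeffs = fit_quadratic_through(sample)
r2 = r_squared(sample, coeffs)
```

Here `r2` comes out as 1 (up to floating-point rounding) even though the sample bounces down and back up, so a perfect R^2 on three seasons should not be read as evidence of a reliable trend.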
Next is Nick Markakis. Like Granderson, 2010 is his fifth year in the majors, so we have more data points, which should lead to more accurate results. His trendline has the equation y = 295244x^2 - 1131898x + 1000953, and it has an R^2 value of 0.98564. One interesting point is that Markakis' $/WAR stays almost constant from this year to next, and then increases. This is because so far Markakis has outproduced the league average, and our predictions take the league average into account. So next year the league trendline brings down his predicted $/WAR.
Next-to-last is Dustin Pedroia. Pedroia has played in MLB since 2007, so we have a decent amount of data on him. His trendline y = 114641x^2 - 328223x + 297908 has an R^2 value of 0.99907. Pedroia's $/WAR has increased steadily over his first four years, and there is only a slightly smaller increase between 2010 and 2011 before it rises steadily again. This means that we can be fairly confident that he will continue to have a slightly higher $/WAR every year (barring injury, of course).
The last player I looked at is Mark Reynolds. Reynolds is a unique case, as his career has been very up and down, starting with a 2.0 WAR in his rookie season (2007), down to 0.9, back up to 2.2, and finally this year his WAR will end around 1.5. So fittingly, his trendline y = 28239x^2 - 56377x + 273693 has an R^2 value of only 0.38815, which is by far the weakest fit of the five players. However, the trendline is predicting a big jump from 1.5 WAR this year to 6.6 next year. We will see if Reynolds can bounce back and become a consistently great player.
Finally, I have put together a table that summarizes each player's minimum, maximum, and average predicted WAR each year through 2014. The first number in each year is the minimum WAR, which means the player should at least reach this value for that year. The second number is the maximum WAR, the most the player should be expected to reach during that year. The third number is the average WAR, which is the WAR implied by the arithmetic mean of the league $/WAR and the player's own $/WAR. One quick note on the table: Granderson turns 33 before the 2014 season, and Mark Reynolds' contract only runs to 2013, so we cannot predict the 2014 values for those two players.
One interesting thing to look at in the table is the range of each player's predicted WAR (the range is the difference between the minimum and maximum values). Some player-seasons, like Nick Markakis next year, have a large range between the minimum and maximum (a range of 10.4). Others, like Evan Longoria in 2014, have no range, as the minimum and maximum values are the same. We always need to keep in mind the R^2 value of each player's trendline, as a small range (meaning a more precise prediction) doesn't mean anything if the trendline does not fit the player's data well.
So we started with an idea: the way to run a successful MLB team is to have players who are good at helping the team win, while also being cost-effective. We worked out the statistic $/WAR, and found that there is a strong correlation between age and $/WAR. From there, we used the league trendline to try to predict individual player performances for the next few years. We will have to wait and see how correct these predictions are; hopefully they are at least moderately right.
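The zero range for Longoria in 2014 can be traced back to the two trendlines themselves. The Python sketch below compares his trendline with the league trendline for 2011 through 2014. It assumes the two x scales line up for Longoria, that is, x = 1 is both his first season (2008) and age 22, so 2014 is x = 7 on both curves. The variable names are only for this illustration.

```python
LEAGUE = (37720, -144374, 328152)    # league trendline from the post
LONGORIA = (44144, -180679, 268113)  # Longoria's trendline from the post


def quad(coeffs, x):
    """Evaluate a*x^2 + b*x + c."""
    a, b, c = coeffs
    return a * x * x + b * x + c


# Gap between the player and league $/WAR estimates, by season.
# Assumption: x = 4 is 2011 and x = 7 is 2014 on both trendlines.
gaps = {}
for year, x in [(2011, 4), (2012, 5), (2013, 6), (2014, 7)]:
    gaps[year] = quad(LONGORIA, x) - quad(LEAGUE, x)
    print(year, gaps[year])

# prints:
# 2011 -102475
# 2012 -80964
# 2013 -46605
# 2014 602
```

By 2014 the two estimates differ by only about $600 on a $/WAR of roughly $1.17 million, so the minimum and maximum WAR round to the same number. The sign change also shows that the lines cross at that point rather than staying together afterwards.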
I am going to do one more post on $/WAR and some other observations, mostly about the correlation between different teams' $/WAR and how successful they have been in the past few years.