Wednesday, October 03, 2007
What stats should we use to represent how good a player IS?
As in evaluating teams for the post-season. It is a long-standing bugaboo of mine - using one season stats as evidence of how good someone is, comparing players, teams, etc., not in the past, but for the future or the present.
It came to light again when reading BP’s comparison of the Rockies and Phillies, by Christina Kahrl. As usual, it gave all the players’, or at least the starters’ at each position, 07 stats. The first thing I think when I see “analyses” like that is, “Is that the best you can do - give us one year stats?” Why one year? Why the current year? Why not the last 3 months? Why not the last month? 5 months? How about a year and a half? How about last year, but not this year?
I realize that giving this year’s stats in a comparison of teams or players has two things going for it: One, it is convenient and recognizable. Two, it is not a bad proxy for true talent or future performance, depending on the stats of course. And I am not quarreling with the stats themselves. It could be EQA, VORP, WARP, lwts, whatever. I am quarreling with the relevance of one year stats to the discussion at hand.
I am claiming that you simply cannot and should not use one year stats in a rigorous or serious analysis or discussion. And it is done ALL THE TIME - by serious analysts (like the aforementioned BP article - what you think of BP or Kahrl as serious analysts notwithstanding), by the media, and by casual and serious fans. And I don’t like it.
It can obviously give you all kinds of misleading information and lead you to bad conclusions and inferences. One year of stats, even for a whole team (a whole team is made up of lots of individuals, which is not the same thing as a whole lot of stats from ONE individual), is simply not enough information for a serious or rigorous discussion of which teams or players are better than others, as in most of the post-season analyses you read, even in the sabermetric venues.
ALL serious discussions about who is better should begin and end with projections, which have two basic ingredients, assuming that we are working with park and context neutral stats so that everyone is on a level playing field to begin with (we obviously don’t want to compare raw stats from Rockies players with those of SD players, for example). The two basic and critical ingredients are: One, multi-year weighted averages, and two, regression toward the mean.
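To make those two ingredients concrete, here is a minimal sketch of a projection: a weighted average of several seasons, then regression toward the league mean. The function name, the season weights, and the regression amount are all assumptions chosen for illustration, not numbers from this post, and a real system would also weight each season by playing time.

```python
def project(rates, weights, regression, league_mean=0.0):
    """Sketch of a projection from park/context-neutral rate stats.

    rates       -- per-season rate stats, most recent season first
    weights     -- season weights, same order (hypothetical values)
    regression  -- fraction to regress toward the mean, 0.0 to 1.0 (hypothetical)
    league_mean -- the mean to regress toward (0.0 for a stat like lwts)
    """
    if len(rates) != len(weights):
        raise ValueError("need one weight per season")
    # Ingredient one: multi-year weighted average.
    weighted = sum(r * w for r, w in zip(rates, weights)) / sum(weights)
    # Ingredient two: regression toward the mean.
    return league_mean + (1.0 - regression) * (weighted - league_mean)


# Hypothetical player: +20, +10, and 0 runs over the last three seasons.
print(round(project([20.0, 10.0, 0.0], [5, 4, 3], regression=0.5), 2))
```

The point of the sketch is only the shape of the calculation: no single season is quoted at face value, and the final number is pulled toward average.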
Here are some interesting numbers:
If you use one year stats to represent future stats or true talent (essentially the same thing), the best case scenario is when you have at least 80 games of stats in that one year. I am arbitrarily defining this as the best case scenario. Obviously it could be more. Or less, if you don’t want a best case scenario. Anyway, I looked at year to year correlations from 98 to 07 for all players who had at least 80 full time games in back to back seasons. I used park and context neutral lwts as the stat of choice.
The y-t-y “r” was .636. OK, that makes for a decent projection and a decent proxy for current true talent and future performance. Not great, but decent. .636 would certainly make us prone to lots of errors when comparing teams or players, especially if their true talent were close in the first place. And this is just for players who have at least 80 games to work with.
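For anyone who wants to run this kind of check on their own data, a year-to-year correlation can be sketched as below. The data layout, the function names, and the `gap` parameter are assumptions for illustration; this is not the code behind the .636 figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def year_to_year_r(seasons, min_games=80, gap=1):
    """seasons maps (player, year) -> (games, lwts); a hypothetical layout.

    Pairs each qualifying season with the same player's season `gap`
    years later and correlates the two sets of lwts.
    """
    first, second = [], []
    for (player, year), (games_1, lwts_1) in seasons.items():
        later = seasons.get((player, year + gap))
        if later is None:
            continue
        games_2, lwts_2 = later
        # Both seasons must clear the playing-time cutoff.
        if games_1 >= min_games and games_2 >= min_games:
            first.append(lwts_1)
            second.append(lwts_2)
    return pearson_r(first, second)
```

Calling `year_to_year_r(seasons)` pairs back-to-back seasons; a larger `gap` pairs seasons that are further apart.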
What about if we used last year’s stats in these analyses? IOW, what if I wrote an article comparing the Rockies and Phillies, just like the BP article and countless others, and I used last year’s VORP, EQA, or lwts? Readers would think I was crazy, right? Not so fast. The y-t-y correlation between seasons separated by a one-year gap is .588! Not too much different from the .636. You might as well quote last year’s stats as this year’s stats! That is my point. Sure, this year’s is better than last year’s, but the difference is not all that great, and we can, of course, easily find something better than either of those two alternatives. Heck, if we just used last year’s stats for players who had lots more PA last year than this year, we could probably get that .588 up to close to the .636.
What about the last two years weighted, with the same requirement, 80 games per year? .651. Better than one year. How come you never see the last 2 years quoted in these articles or analyses? 3 years? .666, even better.
And of course if you included some regression, you would do even better, especially if you were using one year stats (where you would want to regress almost 50% or so). Why not take everyone’s lwts and cut them in half when making one of those lists? Again, the readers would think you were crazy, but you would do even better with that than by printing the full lwts.
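A back-of-the-envelope check shows why cutting the numbers in half can beat printing them in full. The sketch below assumes a deliberately simple model that is not from this post: this year's and next year's stats are both measured from a league average of zero, have equal variance, and correlate at r. Under those assumptions, predicting next year as k times this year has expected squared error proportional to 1 - 2kr + k^2.

```python
def relative_mse(k, r):
    """Expected squared error of predicting next year's stat as k * this year's,
    in units of the stat's variance.

    Assumes (for illustration only) zero league mean, equal variance in both
    seasons, and a year-to-year correlation of r.
    """
    return 1.0 - 2.0 * k * r + k * k


r = 0.636  # the one-year y-t-y correlation quoted above
for label, k in [("full stats", 1.0), ("cut in half", 0.5), ("scaled by r", r)]:
    print(f"{label:12s} k={k:.3f}  relative MSE={relative_mse(k, r):.3f}")
```

With r = .636 the relative errors come out to about 0.73 for the full stats, 0.61 for the halved stats, and 0.60 for scaling by r itself. In this simplified model, halving beats the unregressed numbers whenever r is below .75.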
mgl: I was wondering what you thought of the decision to take out Zambrano for the bottom of the 7th in game 1 of the Cubs/D-Backs. I realize Marmol has been fantastic this year, but that is only one year’s worth of stats (as you point out above). Is he really projected to do better than Z in that situation? The answer had better be a resounding yes, as the game was still tied and Zambrano had only thrown 85 pitches (and looked very good in doing so FWIW). It should be noted that Piniella left Z in to hit for himself just an inning earlier… with 2 outs and the bases loaded.