Monday, June 23, 2008
Which has more predictive value for a player: last year’s stats or this year’s stats so far?
The issue being sample size versus recency, of course. I don’t know off the top of my head, but I would guess it is last year’s stats. So, here is what I propose (as King of the World):
Take every article you read about who HAS the best offense, the best pitching, who IS the best player on whatever team, who should play, who should be benched, who should be sent down to the minors, who should be traded, who should be signed, who should not have been signed, who should bat where in the lineup, etc., etc. You will always see their 2008 stats as support for whatever statement or argument the author is making. Then substitute last year’s stats for this year’s stats. Since the former are likely at least as predictive as the latter, they should support the author’s arguments at least as well.
If you want to have even more fun, take full season stats from 2 years ago and combine them with the first half of last year. My guess is that these would also be more predictive of future performance than 85 games of 2008 stats.
I understand the fans’ and media’s obsession with current stats as a proxy for a player’s true talent, but an analyst should NEVER, EVER, EVER (did I say NEVER?) support an argument about how good someone IS or a team IS with current season stats. EVER.
Commenting on this article on Statistically Speaking, I wrote this (the first paragraph in quotes is from the article):
“The problem with Volquez is that we simply don’t know enough about him as he’s only done this for 1/2 of a season. If he keeps it up and shows he can dominate anyone at any time for a whole year, or more, he would definitely shoot up the list.”
You can’t have your cake and eat it too. In the entire above discussion, you continually conflate two very different things. One is how players have done so far this season, which, while contributing to how good they “are”, does not necessarily indicate, one way or another, how good they “are.”
Two is how good players “are,” meaning what their true talent level currently is and how we expect them to perform in the near and distant future (with distant future adjusted for age, chance of injury, etc.).
I REALLY wish people would stop quoting current season stats when asked a question like how good someone “is” or who they would like to have batting/pitching for them in one particular game (or something like that), which is essentially the same thing.
Here is a question for you guys: Which is more predictive of future performance, last year’s stats (the entire season) or this year’s stats so far (85 games)? By “stats”, let’s just say VORP or something similar.
If the answer is “last year’s stats” (which I don’t know that it is off the top of my head, but I suspect it is), then why not take all of the stats you quoted above to support your opinions, replace them with last year’s stats, and see if the opinions still make sense?
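For anyone who wants to actually check this rather than guess, here is a rough sketch of one way to do it. It assumes you have, for each player, three numbers in the same stat (VORP or whatever): last year's full-season figure, this year's figure so far, and the rest-of-season figure you are trying to predict. The function names and the row layout are hypothetical, and plain correlation is only one of several reasonable measures of "predictive."

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def compare_predictors(rows):
    """Correlate each candidate predictor with rest-of-season performance.

    rows: one (last_year, this_year_so_far, rest_of_season) tuple per player.
    This layout is an assumption made for the sketch.
    """
    last_year, so_far, rest = zip(*rows)
    return {
        "last_year": pearson(last_year, rest),
        "this_year_so_far": pearson(so_far, rest),
    }
```

Whichever key comes back with the higher correlation is the better single predictor for that set of players. You would want a large group of players and several seasons before reading much into the result.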
The best pitcher in baseball IS (not WAS, this year) still Johan Santana and it ain’t even close. Of course, I don’t KNOW that for a fact, but show me a reliable current forecast for all pitchers and if Santana is not in the top 3, I’ll eat my spreadsheet.
At some point in the season, recency should trump sample size. Marcel suggests it would be some time in late August (130 games or so).
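Here is one way a number like 130 can fall out. This is a minimal sketch, not necessarily how Marcel itself treats a season in progress. It assumes the commonly cited 5/4/3 season weights for hitters, gives every game of the current season the "5" weight and every game of last season the "4" weight, and assumes a 162-game season. The function name and parameters are mine, for illustration only.

```python
def crossover_games(prev_weight=4, curr_weight=5, season_games=162):
    """Smallest number of current-season games whose total weight
    strictly exceeds the total weight of last year's full season.

    Assumes integer per-game weights (defaults follow a 5/4 weighting,
    which is an assumption of this sketch).
    """
    # Last season's total weight: prev_weight * season_games.
    # This season's total weight after n games: curr_weight * n.
    # We want the smallest integer n with
    # curr_weight * n > prev_weight * season_games.
    return (prev_weight * season_games) // curr_weight + 1


if __name__ == "__main__":
    print(crossover_games())
```

With the defaults this returns 130, since 162 × 4/5 = 129.6. Change the weights and the crossover moves: the more heavily you discount last year, the earlier in the season this year's stats take over.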