THE BOOK--Playing The Percentages In Baseball


Wednesday, May 14, 2008

Another article about when current season stats become “real” (or something like that)

By , 03:58 PM

Ron Shandler chimes in about the “reliability” of statistics at any point in the season. While he makes some good points about how different measurements require different sample sizes (e.g., PA) to be equally reliable, the piece is littered with phrases (and I am paraphrasing, despite the quotation marks) like “become meaningful,” “reliable,” “taken seriously,” etc.

As I mentioned in another thread, I don’t like those terms being thrown around with respect to this issue. They are misleading. And dangerous. One man’s reliable is another man’s unreliable. More importantly, the more a short-term statistic strays from a career one and/or from a population mean, the less “reliable” it is, given identical sample sizes (of the current performance). Not to mention the fact that you cannot say anything about a statistic’s reliability based on the sample size of that statistic without knowing the prior history of the player or the mean of the population. A player with no history hitting .260 on May 15 is probably around a .260 hitter (at least that is our best estimate, albeit without a great deal of certainty). A player hitting .260 on May 15 who has been a .300 hitter his whole career is probably a .295 hitter.

So how in the world can we say that May 15, or any other date, is the date at which a statistic becomes reliable, without knowing the prior history of the player and the mean of the population?

Similarly, if a player is hitting .300 on May 15 with no history, his true BA is probably around .270, since our best estimate gets pulled back toward the population mean. So one player with no history who hits .260 on May 15 has a true BA of around .260. Another player with no history who is hitting .300 on May 15 has a true BA of around .270. In one case, his short-term BA is likely his true BA. In the other case, his short-term BA is likely nowhere near his true BA. Again, how can we talk about a date or a current sample size, in isolation, that makes a player’s statistic reliable or not? We can’t!
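
To make the arithmetic behind those three players concrete, here is a rough sketch in Python. It treats the estimate of true BA as a weighted average of the current-season BA and a prior (the population mean for a player with no history, the career BA for a veteran). Everything numeric in it is an assumption picked for illustration so that the output lands on the figures quoted above: 150 AB by May 15, a population mean of .260, and prior weights of 450 and 1,050 AB. The function name and the weights are hypothetical, and this is not necessarily how any particular projection system does it.

```python
def regressed_estimate(observed_avg, at_bats, prior_avg, prior_weight):
    """Weighted average of the observed BA and a prior BA.

    The observed BA is weighted by the current sample size (at_bats);
    the prior is weighted by prior_weight, expressed in the same units.
    """
    total = at_bats + prior_weight
    return (observed_avg * at_bats + prior_avg * prior_weight) / total


# All constants below are illustrative assumptions, not measured values.
AT_BATS = 150             # assumed at-bats through May 15
POPULATION_MEAN = 0.260   # assumed mean BA of the player population
NO_HISTORY_WEIGHT = 450   # assumed weight on the population mean
VETERAN_WEIGHT = 1050     # assumed weight on a long career record

cases = [
    # (description, current BA, prior BA, weight on the prior)
    ("no history, hitting .260", 0.260, POPULATION_MEAN, NO_HISTORY_WEIGHT),
    ("no history, hitting .300", 0.300, POPULATION_MEAN, NO_HISTORY_WEIGHT),
    ("career .300, hitting .260", 0.260, 0.300, VETERAN_WEIGHT),
]

for description, current, prior, weight in cases:
    estimate = regressed_estimate(current, AT_BATS, prior, weight)
    print(f"{description}: estimated true BA {estimate:.3f}")
```

Same sample size in all three cases, yet the gap between the current BA and the estimate is zero, 30 points, and 35 points respectively. The sample size alone tells you nothing about how far to trust the number.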


(12) Comments • 2008/05/22 • Sabermetrics • Forecasting • Statistical_Theory