Friday, February 10, 2012
Reader Mail of the Day: Why do we need X years of fielding data? And what about outliers?
Pete:
When it’s said 3 years’ defensive data is needed to judge a player, what does that mean? I’ll use Biggio as an example (since they’re talking about him at Baseballthinkfactory) - from ’92-’02, he was worth about -5 runs/year fielding except for ’97, where he was +19. Was he (1) a generally poor fielder who had a good/lucky year, or (2) still a poor fielder in ’97 who looks good only because of the noise in the numbers?
Me:
You never throw data away, unless you have a REALLY REALLY good reason to do so. And even then, it better be REALLY REALLY REALLY good.
The more data you have, the less you need to regress toward the mean. Fielding data is less informative per season than hitting data: you need two years of fielding data to tell you as much as one year of hitting data. Would you draw conclusions from one year of hitting data? No? Then you need more than two years of fielding data.
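To make the regression idea concrete, here is a rough sketch in Python using Pete's Biggio numbers (about -5 runs in each season from 1992 to 2002, except +19 in 1997). It uses a generic shrinkage weight of n / (n + k), which is a common way to regress toward the mean but is not necessarily the exact method behind the numbers in this post. The function name, the league-average mean of 0, and the constants `K_HITTING` and `K_FIELDING` are all assumptions chosen for illustration, not measured values.

```python
def regress_to_mean(observed_mean, n_seasons, k_seasons, population_mean=0.0):
    """Shrink an observed average toward the population mean.

    k_seasons is the number of seasons at which the observed data and the
    population mean get equal weight. Its value here is an assumption.
    """
    weight = n_seasons / (n_seasons + k_seasons)
    return weight * observed_mean + (1.0 - weight) * population_mean


# Hypothetical constants. The only constraint taken from the post is that
# fielding needs twice as many seasons as hitting for the same reliability.
K_HITTING = 1.5
K_FIELDING = 2.0 * K_HITTING

# Approximate fielding runs per season, as described in the reader mail.
seasons = {year: -5.0 for year in range(1992, 2003)}
seasons[1997] = 19.0

# Looking at 1997 alone: most of the +19 is regressed away.
estimate_1997_only = regress_to_mean(seasons[1997], 1, K_FIELDING)

# Using every season, including 1997: nothing is thrown away.
overall_mean = sum(seasons.values()) / len(seasons)
estimate_all_years = regress_to_mean(overall_mean, len(seasons), K_FIELDING)

# Two years of fielding data carry the same weight as one year of hitting.
weight_fielding_2y = 2 / (2 + K_FIELDING)
weight_hitting_1y = 1 / (1 + K_HITTING)

print(f"1997 alone:  {estimate_1997_only:+.2f} runs")
print(f"1992-2002:   {estimate_all_years:+.2f} runs/year")
print(f"weights:     {weight_fielding_2y:.2f} vs {weight_hitting_1y:.2f}")
```

Under these assumed constants, the 1997 season on its own regresses to +4.75 runs, while all eleven seasons together, with 1997 kept in, regress to about -2.2 runs per year. The outlier stays in the sample and nudges the estimate upward without dominating it.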