Monday, October 24, 2011
Small sample sizitis
Someone asked me about small sample size. I answered as follows:
==========================
It’s a fair question to ask. Basically, the choice is presented as follows:
1. Octavio Dotel has faced Ian Kinsler 8 times in his career (and got him out 100% of the time).
2. Octavio Dotel has faced 3800 MLB batters in his career (and got them out 70% of the time).
Therefore, how much weight do we place on the 8 Kinsler PA, compared to the 3800 non-Kinsler PA? Ian Kinsler does have something in common with everyone else: he’s an MLB player. That is a huge commonality. In that group of players are guys who are better hitters than Kinsler, but also guys who are quite a bit worse.
So, if you want, we can limit the 3800 batters faced down to, say, the 1000 who are about as good as Kinsler.
Now our choices become:
1. 8 PA against Kinsler
2. 1000 PA against guys who are about as good a hitter as Kinsler
The choice, however, is not either/or. You can overweight the Dotel-Kinsler matchups, and I have NO PROBLEM with doing that. How much do we want to overweight them? Two times? Five times? Ten times? Give me a number.
So, let’s say that each Dotel-Kinsler PA tells us 10 times as much as a Dotel-GoodHitter PA does in terms of giving us an estimate. The 8 actual PA become 80 weighted PA. You still have to add those 80 weighted PA to the 1000 other PA in the pool. Those 8 actual PA are still only about 7% of the conversation (80 out of 1080) when weighted 10 times.
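The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the weighting idea, not a method from The Book. The function names are made up for the example, and the 70% out rate used for the 1000-PA pool is an assumption borrowed from Dotel's career figure (his real rate against Kinsler-quality hitters is not given here).

```python
def weighted_share(matchup_pa, pool_pa, weight):
    """Fraction of the total weighted sample that comes from the matchup PA."""
    weighted_matchup = matchup_pa * weight
    return weighted_matchup / (weighted_matchup + pool_pa)


def blended_out_rate(matchup_pa, matchup_out_rate, pool_pa, pool_out_rate, weight):
    """Weighted average of the matchup out rate and the pool out rate."""
    share = weighted_share(matchup_pa, pool_pa, weight)
    return share * matchup_out_rate + (1 - share) * pool_out_rate


# 8 PA against Kinsler (all outs) versus a pool of 1000 PA.
# ASSUMPTION: the pool out rate of 0.70 is a placeholder, not a measured value.
for weight in (1, 2, 5, 10):
    share = weighted_share(8, 1000, weight)
    estimate = blended_out_rate(8, 1.00, 1000, 0.70, weight)
    print(f"weight {weight:>2}: matchup share {share:.1%}, blended out rate {estimate:.3f}")
```

Even at a weight of 10, the matchup supplies 80 of 1080 weighted PA, so under the assumed pool rate the blended estimate moves only a couple of percentage points away from the pool.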
Not to mention that if you study it, as we have in The Book, these batter-pitcher matchups are simply not predictive. This is not a matter of opinion. It’s a matter of fact.
If someone ignores fact because they believe in their gut that they are right, Colbert coined a word for that: “truthiness.” I have no argument against truthiness. By definition, those who argue based on truthiness can never be wrong.
Tom

