THE BOOK--Playing The Percentages In Baseball

Saturday, March 08, 2008

Primer on Small Sample Size effects

By Tangotiger, 09:50 AM

Sal does a great job.


#1    2008/03/08 (Sat) @ 15:42

Ah, music to my ears!  We talk about this all the time on this blog and of course we go into detail about this concept in The Book.

It cannot be emphasized enough and I would like to see at least a brief discussion of it every so often on every sabermetrically (or not)-inclined blog or mainstream site/column!

From the blog entry:

f you remember nothing, remember this: we must always regress to the mean when figuring true player skills, and the amount we regress is based on how much performance data we have and what the spread in skills are among the general MLB population.


#2    2008/03/08 (Sat) @ 15:44

Sal, a typo:

The spread in skills are among the general MLB population

There should be no “are.”


#3    2008/03/08 (Sat) @ 16:16

Sorry if I am repeating myself, but great, great article that all “newbies” on this site (and most others as a refresher) should read!

Another typo:

The following chart shows how our estimate of his true OBP skill changes with the number of plate appearances over which we make observe him to have a .377 OBP.

I am not exactly sure how this sentence is supposed to read.  Even without the word “make,” it is awkward.

Also, this sentence should be qualified/explained because it is so important:

What you find is that the many splits that are touted as important ("he hits .312 after the All-Star Break!") are actually meaningless.

You (Sal) should add:

...even for large samples of performance, let alone small ones.  That is because there may be little or no spread in skill in the population of major league players.  Remember I told you that if there is no variance in skill in the population, then we regress all the way to the mean, no matter how large our sample size is.  In fact, that is true for most splits, even the ones that commentators and the entire “conventional wisdom crowd” think are meaningful, or speak as if they are (e.g. day/night, home/road, first half/second half). (Keep in mind that for many of these splits, there may be some very small spread in skill, which for all practical purposes is the same as no spread, since we would need an enormous sample size in order to have any meaningful regression less than 100%.)

Or something like that!

I would also love to see you add something to the effect of:

Since how much to regress a sample stat or “split” (or any sample measurement or series of measurements) varies with (is a function of) both sample size and the spread of skill in the population, there is no “magic point” at which we believe or don’t believe a sample number (like BA, OBP, HR rate, etc.).  There is no point at which a “small unreliable sample” all of a sudden turns into a “large reliable sample.” It is all relative.  And it is sort of a smooth, continuous (but not linear as you can see from the “Buck” graph) function.  The larger the sample size, the more “believable” the sample stat is (assuming there is SOME variance in skill in the population), and the larger the spread of skill, the quicker the “believability” increases as the sample size increases.  At zero sample size all sample stats are 0% believable (of course), and at an infinite sample size they are 100% believable (again, assuming SOME spread of skill).  As I said, how quickly the “believability” rises between a sample size of zero and infinity depends on the spread of skill.  So, whether a certain sample stat is “believable or not” (whether you consider the sample size adequate or not) on its face (without actually doing the regression) depends on your definition of “believable” (and “adequate”).
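
The “smooth, continuous” relationship described above can be made concrete with a minimal Python sketch. It assumes the common shortcut in which the weight on the observed rate is var_skill / (var_skill + var_rand), with var_rand taken as the binomial variance at the league rate. The function names are hypothetical. The .330 league OBP and .025 skill SD are the article's illustrative figures, and .377 is the observed OBP from the article's chart.

```python
def regression_weight(pa, league_mean, skill_sd):
    """Share of the observed rate to keep: 0 = regress fully, 1 = believe fully.

    Assumes random variation is binomial at the league rate (an approximation).
    """
    if pa <= 0 or skill_sd <= 0:
        return 0.0  # no data, or no spread in skill: regress all the way
    var_skill = skill_sd ** 2
    var_rand = league_mean * (1 - league_mean) / pa
    return var_skill / (var_skill + var_rand)


def estimate_true_rate(observed, pa, league_mean, skill_sd):
    """Observed rate pulled toward the league mean by the weight above."""
    w = regression_weight(pa, league_mean, skill_sd)
    return league_mean + w * (observed - league_mean)


if __name__ == "__main__":
    # Illustrative inputs only: .377 observed OBP, .330 league, .025 skill SD.
    for pa in (0, 50, 200, 600, 5000):
        w = regression_weight(pa, 0.330, 0.025)
        est = estimate_true_rate(0.377, pa, 0.330, 0.025)
        print(f"PA={pa:5d}  weight={w:.2f}  estimate={est:.3f}")
```

The weight rises from 0 toward 1 as PA grows and never jumps, which is the “no magic point” idea. A larger skill_sd makes it rise faster.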

Great, great primer!


#4    salb918    2008/03/08 (Sat) @ 18:05

Thanks for the link, corrections, and kind words.  Mitchel, I’ve posted your expansions onto the discussion thread.


#5    Sky    2008/03/08 (Sat) @ 20:07

I think that graph comparing playing time to


#6    MGL    2008/03/08 (Sat) @ 22:48

In #1 above, “f you” does not mean “fu** you!” (although in many of my posts it could be).  The “I” got cut off. ;)


#7    john    2008/03/09 (Sun) @ 21:13

The league-average OBP is .330.  Through some complicated statistics, we know that the variation in true skill (the standard deviation) across all major league hitters is 0.025.  That is, 68% of all major leaguers have a true OBP skill between .305 and .355.

Just a question: can you tell us how you come up with the SD for league-average OBP?  That 0.025?  I’m sure it’s in The Book (Appendix), but maybe you can explain that?


#8    salb918    2008/03/09 (Sun) @ 22:41

John, it is in The Book.  Essentially (and mgl/tango/andy, please correct me), you look at the variance in observed performance among all hitters.  Call this var_total.

var_total is the sum of other variances.  The two most important ones to consider are the variance due to random fluctuation (var_rand) and the variance due to skill (var_skill).  Variance is additive, so you can say

var_total = var_rand + var_skill

You can compute var_rand from the binomial distribution (for a rate p over n PA, it is p(1-p)/n), and subtracting it from var_total leaves var_skill.  The standard deviation in skill is simply the square root of var_skill.

As for computing it in practice, the appendix in The Book provides step-by-step instructions.
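
The subtraction described in this comment can be sketched in Python for the simplest case, where every player has roughly the same number of PA. This is an illustration under that assumption, not the procedure from The Book's appendix. The function name is hypothetical, and the binomial variance is taken at the group's average rate.

```python
from math import sqrt
from statistics import mean, pvariance


def skill_sd_equal_pa(rates, pa):
    """Estimate the SD of true skill from observed rates.

    rates: observed rates (e.g. OBP) for players who each have about `pa` PA.
    Uses var_total = var_rand + var_skill, with var_rand from the binomial.
    """
    league = mean(rates)
    var_total = pvariance(rates)            # observed spread between players
    var_rand = league * (1 - league) / pa   # spread expected from luck alone
    # Clamp at zero: noise can push the difference slightly negative.
    var_skill = max(var_total - var_rand, 0.0)
    return sqrt(var_skill)


if __name__ == "__main__":
    # Made-up observed OBPs for five hypothetical 500-PA hitters.
    sample = [0.290, 0.315, 0.330, 0.350, 0.380]
    print(f"estimated skill SD: {skill_sd_equal_pa(sample, 500):.3f}")
```

If the observed spread is no larger than what luck alone would produce, the estimate is zero, which is the "regress all the way to the mean" case.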


#9    tangotiger    2008/03/09 (Sun) @ 23:16

I’ll have to look at The Book again, but I thought the 1 SD = .025 was only an example.  I’ve always used something like 1 SD = .030 or .033.  That would imply that r=.50 when PA = 200 to 240 or so.

There’s simply no way that it’s PA = 350 or so.  For pitchers, sure.  But, not for hitters.
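
The PA figures in this comment are consistent with setting var_rand equal to var_skill, which gives r = .50 at PA = p(1-p) / SD^2. A minimal Python sketch, assuming a .330 league OBP and binomial random variance (the function name is hypothetical):

```python
def pa_for_half_regression(league_mean, skill_sd):
    """PA at which observed and league-average rates get equal weight (r = .50).

    Assumes binomial random variance at the league rate.
    """
    return league_mean * (1 - league_mean) / skill_sd ** 2


if __name__ == "__main__":
    for sd in (0.025, 0.030, 0.033):
        pa = pa_for_half_regression(0.330, sd)
        print(f"skill SD {sd:.3f} -> r = .50 at about {pa:.0f} PA")
```

Under these assumptions, an SD of .025 puts the halfway point near 350 PA, while .030 to .033 puts it between roughly 200 and 250 PA.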


#10    john    2008/03/10 (Mon) @ 07:12

Thanks.

I was just sitting here wondering if I could do that with other stats besides OBP and wOBA.  So basically I’d need to calculate the variances of ALL players in the population.


#11    Tangotiger    2008/03/10 (Mon) @ 09:29

I’d recommend reading the Appendix in The Book.


#12    john    2008/03/10 (Mon) @ 10:04

I’ll take a look at it.  It seems a bit complex.


#13    MGL    2008/03/10 (Mon) @ 16:15

I’d have to refresh myself by reading the appendix in The Book, but it depends on the number of players in your population of various PA per season.  For example, if you look at all players with around 500 PA per season and you compute the variance, you will get the variance in skill plus the random binomial variance for 500 PA (.025 squared).  But…

That only gives you the variance in skill for players with around 500 PA.  What about part time players who have around 200 PA?  You can do the same thing for them separately and find the skill variance for them, which is probably different.

Then you have the issue of selective sampling.  Players with 500 PA and 200 PA overlap in skill in that some players with only 200 PA only have 200 because they were bad for those 200 PA and vice versa for some players with 500 PA, despite the fact that they may have the same actual skill.  I am not sure how that affects the variance though.

You can use everyone in the population, regardless of the number of PA, and weight in your variance formula by the number of PA or something like that.  Again, I’ll have to reread that part in the book.  I forgot how Andy does it, which I assume is the right way. 

In any case, you can’t just talk about taking the actual variance among players and then subtracting out the expected random binomial variance, without addressing the PA (per player) issue, I don’t think.  Let’s say you have 100 players with 500 PA and another hundred with 10 PA.  Obviously the observed player to player variance is going to be a lot higher than if you had 200 players with 500 PA each.  So will the expected variance by chance.  I am just not sure how to compute each, given each player’s different number of PA.
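
One possible way to handle differing PA, sketched in Python, is to weight each player by his PA in both the observed variance and the expected random variance, then subtract. This is only an assumption about how the weighting might be done, not the method in The Book's appendix. It also ignores the selective-sampling issue raised above. The function name and inputs are hypothetical.

```python
from math import sqrt


def skill_sd_unequal_pa(times_on_base, pas):
    """PA-weighted estimate of the SD of true skill.

    times_on_base, pas: per-player counts; every entry in pas must be > 0.
    Each player's expected random variance is league*(1-league)/PA, so a
    10-PA player adds far more expected noise than a 500-PA player.
    """
    total_pa = sum(pas)
    league = sum(times_on_base) / total_pa          # PA-weighted league rate
    rates = [ob / pa for ob, pa in zip(times_on_base, pas)]

    # PA-weighted observed variance around the league rate.
    var_total = sum(pa * (r - league) ** 2
                    for r, pa in zip(rates, pas)) / total_pa
    # PA-weighted average of each player's binomial variance.
    var_rand = sum(pa * (league * (1 - league) / pa) for pa in pas) / total_pa

    # Clamp at zero: noise can push the difference slightly negative.
    var_skill = max(var_total - var_rand, 0.0)
    return sqrt(var_skill)


if __name__ == "__main__":
    # Made-up counts: three regulars and two part-timers.
    on_base = [150, 180, 170, 3, 5]
    pa = [500, 500, 500, 10, 12]
    print(f"estimated skill SD: {skill_sd_unequal_pa(on_base, pa):.3f}")
```

With PA weights, the handful of 10-PA players barely moves the observed variance, and the expected random variance is adjusted by the same weights, so the two sides of the subtraction stay comparable.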


#14    salb918    2008/03/10 (Mon) @ 16:22

MGL, Andy (I assume he wrote the appendix) goes through how to deal with the PA issue in The Book (I think, I don’t have it in front of me).


#15    MGL    2008/03/11 (Tue) @ 00:07

Sal, yeah, I’ll have to re-read it.

