THE BOOK--Playing The Percentages In Baseball

Friday, July 21, 2006

Gory Details of the Gory Details

By Tangotiger, 08:22 AM

Andy expands on the calculations of pages 366-367 in The Book:


Let’s start with the variance of the variance, which is what is actually directly calculated.  To measure this, you need to compute the mean squared difference between the variance from any one observation, ( x - < x > )^2, and the expectation value of the variance, < x^2 > - < x >^2.  Thus the variance of the variance equals:
< ( ( x - < x > )^2 - ( < x^2 > - < x >^2 ) )^2 >

Expanding the terms, this becomes:
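< ( x^2 - 2 x < x > - < x^2 > + 2 < x >^2 )^2 >
= < x^4 + 4 x^2 < x >^2 + < x^2 >^2 + 4 < x >^4 - 4 x^3 < x > - 2 x^2 < x^2 > + 4 x^2 < x >^2 + 4 x < x > < x^2 > - 8 x < x >^3 - 4 < x^2 > < x >^2 >
= < x^4 > - 4 < x^3 > < x > + 8 < x^2 > < x >^2 - < x^2 >^2 - 4 < x >^4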
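
The last step uses the fact that an expectation value is a constant, so averaging a product such as x^2 times the mean squared just gives the second moment times the mean squared. As a sketch of the bookkeeping (writing \langle \cdot \rangle for the expectation value, which is notation assumed here rather than taken from the post), the ten terms of the expanded square collect as follows:

```latex
\begin{align*}
\langle x^4 \rangle &: \quad 1 \\
\langle x^3 \rangle \langle x \rangle &: \quad -4 \\
\langle x^2 \rangle \langle x \rangle^2 &: \quad 4 + 4 + 4 - 4 = 8 \\
\langle x^2 \rangle^2 &: \quad 1 - 2 = -1 \\
\langle x \rangle^4 &: \quad 4 - 8 = -4
\end{align*}
```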

The expectation values here for a Gaussian centered at x0 with standard deviation sigma equal:
< x > = x0
< x^2 > = sigma^2 + x0^2
< x^3 > = 3 x0 sigma^2 + x0^3
< x^4 > = 3 sigma^4 + 6 x0^2 sigma^2 + x0^4

Substituting these into the above equation, one gets that the variance of the variance equals
2 sigma^4
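
The substitution is easy to get wrong by hand, so here is a minimal sketch that checks it with exact rational arithmetic. The function names and the particular values of x0 and sigma are hypothetical, chosen only for illustration; the moment formulas are the Gaussian ones quoted above.

```python
from fractions import Fraction


def gaussian_raw_moments(x0, sigma):
    """Raw moments of x, x^2, x^3, x^4 for a Gaussian (closed forms quoted in the post)."""
    s2 = sigma * sigma
    m1 = x0
    m2 = s2 + x0**2
    m3 = 3 * x0 * s2 + x0**3
    m4 = 3 * s2**2 + 6 * x0**2 * s2 + x0**4
    return m1, m2, m3, m4


def variance_of_variance(x0, sigma):
    """Plug the raw moments into the expanded expression for the variance of the variance."""
    m1, m2, m3, m4 = gaussian_raw_moments(x0, sigma)
    return m4 - 4 * m3 * m1 + 8 * m2 * m1**2 - m2**2 - 4 * m1**4


# Arbitrary illustrative values; Fractions keep the arithmetic exact.
x0, sigma = Fraction(3, 2), Fraction(5, 7)
result = variance_of_variance(x0, sigma)
print(result == 2 * sigma**4)
```

All the terms involving x0 cancel, which is why the answer depends only on sigma.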

Naturally, this means the standard deviation of the variance, which is the square root of the variance of the variance, is sqrt(2) sigma^2.

To get to the standard deviation of the standard deviation, you use the standard error-propagation rule: if y is an arbitrary function of x, f(x), then to first order the standard deviation of y equals the absolute value of the derivative of f(x) times the standard deviation of x.

In this case, abbreviating standard deviation and variance as SD and VAR, respectively, we have:
SD = sqrt(VAR)

so the standard deviation of the standard deviation is given by
sd(SD) = sd(VAR)/(2 sqrt(VAR))

since sd(VAR) is sqrt(2) sigma^2, and sqrt(VAR) is sigma, this becomes:
sd(SD) = sigma/sqrt(2).
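
As a rough numerical illustration of this propagation step, the sketch below applies the first-order rule to f(VAR) = sqrt(VAR). The helper name propagate_sd and the value of sigma are assumptions made for the example, not anything from The Book.

```python
import math


def propagate_sd(derivative, sd_x):
    """First-order error propagation: sd of f(x) is |f'(x)| times sd of x."""
    return abs(derivative) * sd_x


sigma = 0.04                          # hypothetical value, for illustration only
var = sigma**2
sd_var = math.sqrt(2) * sigma**2      # standard deviation of the variance
d_sqrt = 1 / (2 * math.sqrt(var))     # derivative of sqrt(VAR) with respect to VAR
sd_sd = propagate_sd(d_sqrt, sd_var)

print(sd_sd, sigma / math.sqrt(2))    # the two numbers should agree
```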

Finally, the “N” comes into play since the uncertainty from repeated measurements equals 1/sqrt(N) times the standard deviation.  So the uncertainty in VAR equals sqrt(2/N)sigma^2, and the uncertainty in SD equals sigma/sqrt(2N).
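
One way to sanity-check these two results is a small simulation: draw many samples of size N from a Gaussian, compute the sample variance and sample SD of each, and see how much those estimates scatter. This is only a sketch under assumed settings (N = 100, sigma = 1, 2000 trials, a fixed seed), and the function name is hypothetical. It uses the usual N - 1 divisor for the sample variance, so the agreement with the formulas is approximate rather than exact.

```python
import math
import random
import statistics


def spread_of_estimates(n, sigma, trials, seed=0):
    """Return the scatter (sd) of the sample variance and of the sample SD across trials."""
    rng = random.Random(seed)
    variances, sds = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        v = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
        variances.append(v)
        sds.append(math.sqrt(v))
    return statistics.stdev(variances), statistics.stdev(sds)


n, sigma = 100, 1.0
sd_of_var, sd_of_sd = spread_of_estimates(n, sigma, trials=2000)

print("uncertainty in VAR:", sd_of_var, "predicted:", math.sqrt(2 / n) * sigma**2)
print("uncertainty in SD: ", sd_of_sd, "predicted:", sigma / math.sqrt(2 * n))
```

With these settings the simulated scatter should land within a few percent of sqrt(2/N) sigma^2 and sigma/sqrt(2N); the match tightens as the number of trials grows.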

-- Andy

#1 John Beamer, 2006/07/21 (Fri) @ 09:39

Andy,

Thanks for that—it isn’t nearly as simple as I imagined! I now see why you didn’t do the full derivation in The Book.

Anyway, I appreciate the proof. I certainly need to revisit my stats text - all the expectation values / raw moments are something I last did eons ago.

Thanks again. It is great you guys have the time for follow up on mundane questions like this.

John


#2 2006/07/24 (Mon) @ 06:58

Just for ease of reading, you might want to try using superscript instead of the “^” to indicate exponents.  If your posting mechanism allows HTML, I think it’s <sup>2</sup> to make an exponent of 2.  There’s also HTML for symbols like square root out there as well; they look kind of funny to type but they produce the correct symbol to the eye.  Sorry if you already know this, I’m just learning about HTML and thought I’d share on the off chance it helps.  Though whoever understands what the heck you wrote up there in the first place will probably be able to read it just fine as is anyways :)

