Friday, November 16, 2007
Reliability of statistics
Print, read, put aside, re-read then come back here.
I really wish Pizza had included the mean, not just the minimum, for each stat, but he said he’d get back to that later. The “intraclass correlation”-type equation I use is:
r = PA/(PA+x), where x is unique to each metric.
For things like OBP or wOBA, x is 200. This means that to get an r=.50, you need 200 PA. Pizza likes to use r=.70, which means you need a mean of 467 PA. Pizza showed us that you need a minimum of 350 PA (which, in the context that he chose, means a range of 350 to 700-odd PA), so his result likely supports the standard equation that I use.
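Here is a quick sketch of that arithmetic, for anyone who wants to play with it. Solving r = PA/(PA+x) for PA gives PA = r*x/(1-r). The function names are mine, purely for illustration.

```python
def reliability(pa, x):
    """r = PA / (PA + x), where x is the constant unique to the metric."""
    return pa / (pa + x)


def pa_needed(r, x):
    """Invert r = PA / (PA + x) to get the PA needed for a target r."""
    return r * x / (1 - r)


# x = 200 is the value quoted above for OBP or wOBA.
print(reliability(200, 200))         # r at 200 PA
print(round(pa_needed(0.70, 200)))   # PA needed for r = .70
```

With x = 200, this gives r = .50 at 200 PA and about 467 PA for r = .70, matching the figures above.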
It’s a great post that he did, and a great service that he’s doing. But, I take exception to this part:
Context Neutral wins (sum (WPA/LI)) - never did. at 650 PA, it was at .588
The implication here is that you get an r=.588 at a mean of around 675 PA or so, meaning you’d get a correlation equation of roughly r=PA/(PA+480). And that’s ridiculous. The sum of WPA/LI should be virtually identical to wOBA or OPS or LWTS or anything else in terms of reliability.
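To see where a constant like that comes from, you can back it out of an observed r and a mean PA: x = PA*(1-r)/r. This is only a sketch; the function name is mine, and the mean PA is a guess, since Pizza did not report it.

```python
def implied_x(r, mean_pa):
    """Back out x from r = PA / (PA + x), given an observed r and a mean PA."""
    return mean_pa * (1 - r) / r


# r = .588 is Pizza's figure; the mean PA values are assumptions.
for mean_pa in (675, 685):
    print(mean_pa, round(implied_x(0.588, mean_pa)))
```

A mean of 675 PA gives an x of about 473, and a mean nearer 685 PA gives about 480. Either way, that is far above the 200 used for OBP or wOBA.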
Here are standard year-to-year correlations from Fangraphs, as shown in the main blog entry, plus the 4th comment (data from 2005/2006):
AVG: .12
WPA: .27
BRAA: .35
OBP: .36
OPS: .36
WPA/LI for 2005 to 2006 was .36. For Clutch, it’s .01, as suspected.
SLG: .38
Those numbers are r-squared, not r. As you can see, WPA/LI was at the same level as OPS. I definitely think that Pizza made a calculation goof somewhere.
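To put those figures on the same scale as the r values discussed earlier, take the square root. A minimal sketch, using only the r-squared numbers listed above (the dictionary name is mine):

```python
import math

# Year-to-year r-squared values quoted above (2005 to 2006).
r_squared = {
    "AVG": 0.12,
    "WPA": 0.27,
    "BRAA": 0.35,
    "OBP": 0.36,
    "OPS": 0.36,
    "WPA/LI": 0.36,
    "SLG": 0.38,
    "Clutch": 0.01,
}

# r is the square root of r-squared (taking the positive root).
r_values = {stat: math.sqrt(value) for stat, value in r_squared.items()}

for stat, r in r_values.items():
    print(f"{stat}: r = {r:.2f}")
```

An r-squared of .36 corresponds to an r of .60, so WPA/LI and OPS come out identical on this scale too.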
That rant aside, great work.
As for calculating the mean, most people will just take the straight mean. But, as Andy has shown me, what you really want to do is take the average of 1/PA, and then take 1/that (in other words, the harmonic mean). This has to do with the variance, and if you play around with it, you’ll see that it makes sense. That may be the reason Pizza didn’t present the mean: he may want to describe something like that.
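If you want to play around with it, here is a small sketch comparing the straight mean with the "average of 1/PA, then 1/that" version. The PA values are made up for illustration; they are not from Pizza's data.

```python
import statistics


def harmonic_mean_pa(pa_values):
    """Average the 1/PA values, then take 1 over that average."""
    mean_inverse = sum(1 / pa for pa in pa_values) / len(pa_values)
    return 1 / mean_inverse


# Hypothetical player PA totals, inside the 350 to 700-odd range.
sample_pa = [350, 500, 700]

straight = statistics.mean(sample_pa)
harmonic = harmonic_mean_pa(sample_pa)

print(f"straight mean: {straight:.1f}")
print(f"harmonic mean: {harmonic:.1f}")
```

The harmonic mean always comes out at or below the straight mean, because the low-PA players pull it down more. Python's standard library also has statistics.harmonic_mean, which does the same calculation.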
Tom, what do you think of my suggestions for this topic that I wrote in the comments section? (same username)