THE BOOK--Playing The Percentages In Baseball


Wednesday, August 30, 2006

Forecasters: How Accurate Can They Possibly Be?

By Tangotiger, 07:24 AM

0.73

Here’s how we can tell:


Suppose you have thousands of subjects, say students, and each takes several hundred tests in one session, say 550.  Then those same students take that exact same number of tests again, but new ones.  We can determine the correlation coefficient (r) between the two sessions in two ways.

1 - A sample-to-sample regression.
2 - Using only the results from the first session.

The second one is based on taking the true variance of the students in question (the spread in their underlying ability) and dividing it by the observed variance (the spread in their actual test results).  That’s your “r”.

The problem of course is that we don’t know var(true).  We could estimate it by rearranging this equation:
var(obs) = var(true) + var(random)

However, we don’t know var(random) either.

Enter the binomial.  Let’s forget about students, and look at baseball players, and their OBP.  Fortunately, OBP is simply the safe plays divided by the safe plus out plays, that is, a success rate over a known number of trials.  So, we can determine the random standard deviation using the binomial:

sqrt(OBP*(1-OBP)/PA)

Remember also that SD^2 = variance
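
To make the binomial step concrete, here is a minimal Python sketch.  The .340 OBP is an assumed, league-typical figure chosen for illustration (the post does not state one), and the function name is made up.  With those inputs the random SD comes out at roughly .020.

```python
import math

def binomial_sd(rate, trials):
    """Random (binomial) standard deviation of an observed rate over `trials` chances."""
    return math.sqrt(rate * (1 - rate) / trials)

# Assumption: a league-typical OBP of .340 (illustrative, not taken from the post).
sd_random = binomial_sd(0.340, 550)
var_random = sd_random ** 2  # SD^2 = variance

print(round(sd_random, 4), var_random)
```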

If you select a few hundred ballplayers every year with 550 PA, the binomial gives a var(random) of about .020 ^ 2.  It’s also easy enough to observe their OBP and get the var(obs) as around .039 ^ 2, depending on which years you select.  Since var(true) = var(obs) - var(random), var(true) works out to about .033 ^ 2.

Our “r” is .033^2 / .039^2 = .72
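
The arithmetic can be sketched in a few lines of Python.  The function name is hypothetical; the inputs are the rounded SDs quoted above.  Because this sketch carries var(true) unrounded instead of rounding the true SD to .033 first, it lands at about .74 rather than .72; the gap is rounding only.

```python
import math

def max_correlation(sd_obs, sd_random):
    """Return (sd_true, r), using var(true) = var(obs) - var(random)
    and r = var(true) / var(obs)."""
    var_obs = sd_obs ** 2
    var_random = sd_random ** 2
    var_true = var_obs - var_random
    return math.sqrt(var_true), var_true / var_obs

# Rounded OBP figures from the text: observed SD .039, random SD .020.
sd_true, r = max_correlation(0.039, 0.020)
print(f"sd_true={sd_true:.3f}  r={r:.2f}")
```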

What does this mean?  You can take several hundred ballplayers, give them 550 PA one year, give them 550 PA another year, make sure that these guys’ true talent level in OBP does not change, make sure they play in the same parks, make sure they face the same quality of pitchers, and their year-to-year correlation will still be only 0.72.  That is, the absolute maximum year-to-year r you can hope for, given a large number of ballplayers, is .72.
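
One way to see this is to simulate it.  The sketch below is an illustration under stated assumptions, not the method used in the post: true OBP talent is drawn from a normal distribution with an assumed mean of .340 and the .033 SD from above, talent is held fixed, and each player gets two independent 550-PA seasons.  The player count, seed, and function names are arbitrary choices.

```python
import random

def simulate_season(true_obp, pa, rng):
    """Observed OBP over `pa` plate appearances, each an independent on-base/out trial."""
    times_on_base = sum(1 for _ in range(pa) if rng.random() < true_obp)
    return times_on_base / pa

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2006)  # fixed seed so the run is repeatable

# Assumptions: 2000 players, normally distributed talent, mean .340, SD .033.
talents = [rng.gauss(0.340, 0.033) for _ in range(2000)]

# Same talent, same conditions, two separate seasons of 550 PA each.
year1 = [simulate_season(t, 550, rng) for t in talents]
year2 = [simulate_season(t, 550, rng) for t in talents]

r_sim = pearson_r(year1, year2)
print(round(r_sim, 2))
```

Even with nothing changing between the two seasons except luck, the simulated correlation should come out in the low .70s rather than anywhere near 1.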

How about instead of OBP, we look at wOBA (which is analogous to OPS)?  Here, our var(true) is .036^2, var(random) is .022^2, and var(observed) is .042^2.  Our r is .73.

So, when you evaluate forecasters by the correlation coefficient between their forecasts and the actual results, anything close to .73 means that they did as good a job as possible.  (They could actually go over that level, since the number of players in their sample is still small enough that the uncertainty around that r will be a bit high.  But, given thousands of players over several years, that uncertainty will drop quite a bit.)

The other key question is: how does Marcel The Monkey do?  A few years ago, when I ran it, I think the r was .65.  I’ll have to redo that to see what it actually is after several years of results.  But, that’s what everyone is fighting for, to get from the .65 level to the impossible .73 level.

And remember, I used 550 PA for each player.  Drop that down, and the maximum r will drop down as well.
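
To put rough numbers on that, the sketch below recomputes the ceiling for a few PA levels.  It assumes the spread in true talent stays at .033 and the typical OBP at .340 regardless of playing time, which is a simplification (players with fewer PA are likely a different population), and the function name is made up.

```python
def max_r(pa, sd_true=0.033, obp=0.340):
    """Ceiling on year-to-year r: var(true) / (var(true) + var(random)),
    with var(random) taken from the binomial at the given PA."""
    var_true = sd_true ** 2
    var_random = obp * (1 - obp) / pa
    return var_true / (var_true + var_random)

# Assumed PA levels, chosen only to show the trend.
for pa in (100, 250, 400, 550):
    print(pa, round(max_r(pa), 2))
```

Under these assumptions the ceiling at 100 PA is less than half of what it is at 550 PA.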
