THE BOOK--Playing The Percentages In Baseball


Thursday, August 06, 2009

More regression from Colin

By Tangotiger, 10:22 AM

Some very lovely work.

I’m going to be picky here, because it’s important to make the huge distinction between the uncertainty around the true talent level, and the spread in observed performance centered around that true talent level.

For example, Colin says this:

In the case of Pujols, we would estimate his OBP going forward as .424, with an uncertainty (expressed in standard deviation) of .006. In other words, about 68 percent of the time, Pujols’ OBP should be between .418 and .430.

The uncertainty that Colin suggests is the uncertainty of his true talent level. That is, he’s got Pujols pegged as a true .424 OBP hitter, with one sigma of .006. So, he’s 68% certain that Pujols’ true talent is between .418 and .430. That is a statement about how sure we are of the talent estimate. It’s not “68% of the time.”

Indeed, in order to say something like “68% of the time”, you have to know what the “time” is. Is it 100 PA, 1000 PA, or 10000 PA? If you have a true .424 hitter, he will PERFORM at .419 to .429 68% of the time if given 10000 PA. If you only give him 100 PA, then the 68% range is +/- .049, or .375 to .473.
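Those ranges come from the ordinary binomial standard deviation, sqrt(p * (1 - p) / PA). Here is a minimal sketch of that arithmetic, assuming each PA is an independent trial with the same on-base probability and that one sigma covers roughly 68% (normal approximation). The function names are mine, purely for illustration.

```python
import math


def binomial_sd(true_rate: float, pa: int) -> float:
    """One standard deviation of the observed rate over `pa` independent PA."""
    return math.sqrt(true_rate * (1.0 - true_rate) / pa)


def one_sigma_range(true_rate: float, pa: int) -> tuple[float, float]:
    """Roughly-68% range of observed performance for a KNOWN true rate."""
    sd = binomial_sd(true_rate, pa)
    return true_rate - sd, true_rate + sd


if __name__ == "__main__":
    # A true .424 hitter at the three sample sizes mentioned in the post.
    for pa in (100, 1000, 10000):
        low, high = one_sigma_range(0.424, pa)
        print(f"{pa:>5} PA: sd = {binomial_sd(0.424, pa):.3f}, "
              f"range {low:.3f} to {high:.3f}")
```

The spread shrinks with the square root of PA, which is why the question “68% of what time?” has no answer until you name the number of PA.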

However, we don’t know if Pujols is a true .424 hitter. We estimate he is a true .424, with a certain amount of uncertainty. Colin has that pegged at .006 for Pujols. So, even over 10000 PA, Pujols’ observed range will be a bit bigger than .419 to .429, because we aren’t sure he’s a true .424 to begin with.
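One common way to put the two pieces together is to add the variances: the uncertainty in the talent estimate and the binomial spread around that talent. The sketch below assumes the two sources are independent and roughly normal, and it plugs the point estimate (.424) into the binomial term, which is an approximation. The function name is hypothetical, and this is my illustration, not necessarily how Colin computed anything.

```python
import math


def observed_sd(est_rate: float, talent_sd: float, pa: int) -> float:
    """Approximate one-sigma spread of observed performance over `pa` PA.

    Combines uncertainty about the true talent (talent_sd) with the
    binomial sampling spread around that talent, assuming independence.
    """
    sampling_var = est_rate * (1.0 - est_rate) / pa  # binomial variance of the rate
    return math.sqrt(talent_sd ** 2 + sampling_var)


if __name__ == "__main__":
    for pa in (100, 1000, 10000):
        sd = observed_sd(0.424, 0.006, pa)
        print(f"{pa:>5} PA: combined sd = {sd:.4f}, "
              f"range {0.424 - sd:.3f} to {0.424 + sd:.3f}")
```

At 10000 PA the talent uncertainty is the larger of the two terms; at 100 PA the binomial term swamps it and the .006 barely matters.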

As for Colin’s league average hitter having an OBP of .356, that can only be true in a certain era, or if he only considers regular players. Since Pujols is a decidedly regular player, that mean belongs to the population he is being drawn from. It’s not the “league” as in everyone in MLB, including pitchers and callups. Pujols’ population is made up of whoever most closely resembles him WITHOUT looking at his performance numbers. (The problem is that for some players, their performance numbers do drive their playing time, as opposed to their scouting driving their playing time. I have no idea what category Francoeur falls into.)

(2) Comments • 2009/08/07 • Sabermetrics, Statistical_Theory