THE BOOK--Playing The Percentages In Baseball


Thursday, March 12, 2009

Normalizing the Gameday hit location data

By Tangotiger, 09:25 AM

In part 1, Peter rolls up his sleeves and shows us how to normalize the data at each park.

In part 2, he gives us some juicier details, including specifics on Carlos Beltran:

[My model, BZM, Big Zone Metric] has Beltran saving 16.7 runs in 2008. ...  UZR has Beltran saving seven runs in 2008. Interestingly, UZR shows 377 expected outs and 418 put outs. I have him at 385 expected outs with 414 plays made. The slightly fewer plays made is due to my ignoring line drive outs OOZ and also ignoring plays for which Gameday has no hit location data. How MGL interprets a +41 plays made over expected as only being worth 7 runs will have to be explained by him. ... Plus/Minus gives Beltran 404 expected outs and +14 plays made. This is adjusted to +24 in the “enhanced” model and 14 runs saved (not including runs saved with his arm).

So, Peter and Dewan are almost identical at 14 to 17 runs saved, while MGL (via bUZR) has him at only 7. Peter is saying there may be a disconnect in MGL’s expected outs, similar to what another reader reported for Brandon Phillips. What’s missing on Fangraphs is the “Plays made” figure that MGL considers: the “Putouts” listed there is not necessarily what MGL is looking at, which is only a subset of it. In sUZR, MGL has Beltran at +17 runs.
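One way to see the disconnect is to back out the runs credited per play made above expected in each system. The sketch below uses only the figures quoted above; the dictionary layout and the function name are illustrative and are not part of BZM, UZR, or Plus/Minus.

```python
# Beltran's 2008 figures as quoted above: (expected outs, plays made, runs saved).
# For bUZR, 418 is the listed putouts total, which may not be the set of plays
# MGL actually evaluates.
systems = {
    "BZM": (385, 414, 16.7),
    "bUZR": (377, 418, 7.0),
}

def runs_per_extra_play(expected, made, runs):
    """Runs saved per play made above expected."""
    return runs / (made - expected)

for name, (expected, made, runs) in systems.items():
    extra = made - expected
    rate = runs_per_extra_play(expected, made, runs)
    print(f"{name}: {extra:+d} plays, {rate:.2f} runs per play")

# Plus/Minus: 14 runs saved on the "enhanced" +24 plays.
print(f"Plus/Minus: +24 plays, {14 / 24:.2f} runs per play")
```

BZM and Plus/Minus both come out near 0.58 runs per play, while the listed bUZR figures imply about 0.17, which is consistent with the putouts total not being the number MGL works from.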

Anyway, in WOWY, I have Beltran at +21 plays, which would be roughly 18 runs saved. I have him at 415 plays made (very close to Peter’s 414). Interestingly, the batters he faced historically did not like to hit to CF. If we focus only on the batters he saw, he was at +29 plays. But, using his pitchers, he was at +18. His parks show him as being +23, and if you look at the batted ball distribution, he was at +15. The batted ball distribution “should” be in the middle of all that, since, ideally, the batters he faced, the pitchers he had, and the parks he played in all combine to create a specific batted ball distribution. But the RECORDED batted balls show him to be pretty good, just not as good as we’d have expected using the historical data of his batters, pitchers, and parks. I would not be surprised if there was less-than-ideal scoring in his games in 2008. The same pattern shows up in 2007. But in 2006, the numbers are much tighter. Anyway…
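The comparison of baselines can be laid out as a quick check. This is only a sketch using the numbers above; the variable names are made up for illustration, and it does not reproduce how WOWY itself combines the baselines.

```python
# Beltran's 2008 plays above average under each WOWY baseline, as given above.
baselines = {"batters": 29, "pitchers": 18, "parks": 23}
batted_ball = 15

low, high = min(baselines.values()), max(baselines.values())
inside = low <= batted_ball <= high
print(f"batters/pitchers/parks range: +{low} to +{high}")
print(f"batted ball estimate +{batted_ball} inside range: {inside}")

# Conversion implied by the headline figures: +21 plays is roughly 18 runs.
runs_per_play = 18 / 21
print(f"runs per play: {runs_per_play:.2f}")
```

The batted ball figure falls below all three of the other baselines rather than in the middle of them, which is the gap described above.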

Looking at Rally’s numbers: he’s at +11 runs. But, as we’ve noted in the past, Rally’s numbers are spread out less than UZR’s and mine. It’s likely that, in terms of standard deviations, he’s the same as we are.
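The post does not give the actual spread of each system, so the following is only a sketch of the rescaling idea. It assumes, purely for illustration, that +16 runs is the figure Rally's +11 would need to match; the function names are hypothetical.

```python
def implied_spread_ratio(reference_runs, compressed_runs):
    """Ratio of standard deviations needed for two ratings to match in SD units."""
    return reference_runs / compressed_runs

def rescale(runs, spread_ratio):
    """Stretch a rating from a compressed scale onto a wider one."""
    return runs * spread_ratio

# Assumption: +16 is used as the reference rating on the wider scale.
ratio = implied_spread_ratio(16, 11)
print(f"implied spread ratio: {ratio:.2f}")
print(f"Rally rescaled: {rescale(11, ratio):+.1f}")
```

In other words, the other systems would need a standard deviation roughly 1.45 times Rally's for +11 to sit at the same number of standard deviations as +16. Whether the real spreads differ by that much is not something this post establishes.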

It seems to me that we are extremely close, even though we all have different systems. The outlier is bUZR, as sUZR, Dewan, Peter, me, and (adjusted) Rally are all right around +16 runs.
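As a rough check on the "+16" figure, the runs-saved estimates quoted in this post can be averaged directly. Rally's adjusted number is not stated here, so it is left out; the labels are just shorthand for this sketch.

```python
from statistics import mean

# Runs saved for Beltran in 2008, as quoted in this post.
# Rally's adjusted figure is not given, so it is excluded.
estimates = {"sUZR": 17, "Dewan": 14, "Peter (BZM)": 16.7, "Tango (WOWY)": 18}
buzr = 7

consensus = mean(estimates.values())
print(f"mean without bUZR: {consensus:+.1f}")
print(f"bUZR gap: {buzr - consensus:+.1f}")
```

The four stated estimates average about +16.4 runs, and bUZR sits a little more than 9 runs below that.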

(9) Comments • 2009/03/17 • Sabermetrics, Batted_Ball