THE BOOK--Playing The Percentages In Baseball


Saturday, March 24, 2007

UZR for hitters

By Tangotiger, 03:44 PM

Protrade checks in with the UZR version for hitters:
http://www.protrade.com/content/DisplayArticle.html?sp=Sfd06ae48-d89d-11db-8683-5577a9d16e8f
http://mlb.mlb.com/news/article.jsp?ymd=20070322&content_id=1854282&vkey=news_mlb&fext=.jsp&c_id=mlb

The main issue is getting enough parameters to distinguish a Beltran hit to location x,y that was “hard hit” from a Neifi Perez hit to location x,y that was “hard hit”.  On top of which, you really need to know the fielding alignment.  In short, it’s a good first try, but the set of parameters needs to be extended.


#1    MGL      (see all posts) 2007/03/24 (Sat) @ 17:16

I agree with Tango. It is a good start, but much more work needs to be done.  In general, I am not a big fan of “UZR for batters” for the reasons Tango mentions.  A hard hit ball to section X,Y by Bonds is NOT the same as a hard hit ball by Neifi, even if it has the same STATS (or BIS or whoever) parameters.  As well, because the defense plays differently (sometimes VERY differently) for each batter, the baseline “caught percentages” can be way off and sometimes meaningless.

And of course, you will find that most of the lucky players have high BA and most of the unlucky ones, a low BA.  That is why they have a high or low BA in the first place - they likely got lucky or unlucky.  That is why we regress all sample stats to the mean.  IOW, we know that Mauer probably won’t hit .347 again without knowing his batting UZR.

However, the beauty of the batting UZR, if it is done well, is that it lets us see which high BA guys got lucky and to what extent, and which ones perhaps did not, and then adjust the regressions accordingly.  Same thing for the low BA guys.

As far as using pitcher UZR (or PZR), that is a lot better, as we expect a “hard hit” ball to be about the same no matter who the pitcher is, on the average, and we don’t expect a whole lot of differences in terms of fielder positioning among different pitchers, other than whether they are L/R.

I did not read the protrade article very closely, but I assume that their baselines at least are separated by whether the batter and pitcher are L/R (the handedness of the batter is the most important thing), as that is somewhat a proxy for fielder positioning and to some extent speed of the batted ball.

What I would like to see in the data, for example, would be the effect of UZR for batters and pitchers, holding a traditional stat constant.  For example, we could look at 2005 data, take all batters with a BA of around .300, break them into lucky and unlucky, and then look at each group’s BA the next year.  How much more regression to the mean do we see in the lucky group?  That tells us how much the UZR data is actually helping us.
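The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `luck_score` sign convention (positive means batter UZR rated the hitter as lucky), the .010 band around .300, and the function name are all hypothetical, and the toy numbers are invented purely to show the mechanics, not taken from any real season.

```python
from statistics import mean

def compare_luck_groups(players, target_ba=0.300, band=0.010):
    """Hold year-1 BA roughly constant, then split by luck.

    players: iterable of (ba_year1, luck_score, ba_year2) tuples.
    luck_score > 0 is assumed to mean "lucky" per batter UZR
    (a hypothetical convention for this sketch).
    """
    # Keep only hitters whose year-1 BA is near the target.
    matched = [p for p in players if abs(p[0] - target_ba) <= band]
    lucky = [p for p in matched if p[1] > 0]
    unlucky = [p for p in matched if p[1] <= 0]

    def summarize(group):
        if not group:
            return None
        return {
            "n": len(group),
            "ba_year1": mean(p[0] for p in group),
            "ba_year2": mean(p[2] for p in group),
        }

    return {"lucky": summarize(lucky), "unlucky": summarize(unlucky)}

# Invented toy records: (BA in year 1, luck score, BA in year 2).
toy_players = [
    (0.302, +1.5, 0.280),
    (0.298, +0.8, 0.284),
    (0.305, -1.2, 0.296),
    (0.295, -0.6, 0.292),
    (0.250, +2.0, 0.255),  # outside the .300 band, so excluded
]

result = compare_luck_groups(toy_players)
```

If the luck measure carries real information, the lucky group should show a lower year-2 BA than the unlucky group even though both started from about the same year-1 BA. If the two groups land in the same place, the measure is adding nothing beyond BA itself.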

What you sometimes see with (bad) studies is something like the following:

We have 2 groups of players - one the lucky group and the other, the unlucky group (according to batter UZR).  The lucky group lost 20 points in BA and the unlucky group gained 20 points next year.  Voila, we must have a great measure of luckiness!  Not so fast.  If the lucky group had an average BA of .310 the first year and the unlucky group, .230, then we can explain the 20 point gain and loss by “normal” regression to the mean alone.  We don’t necessarily need the UZR data.  But, if we find that the lucky and unlucky .310 hitters regressed significantly differently the next year, and ditto for the lucky and unlucky .230 hitters, then we are on to something.  Or, if we simply see more regression than we expect when we divide the players into lucky and unlucky baskets, then we also are on to something.
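To make the arithmetic concrete, here is a minimal Python sketch of “normal” regression to the mean. The league BA of .270 and the regression factor of 0.5 are assumptions picked only so the numbers match the example above; neither value comes from the comment, and the function names are hypothetical.

```python
def expected_next_ba(ba, league_ba=0.270, r=0.5):
    """Expected next-year BA from regression to the mean alone.

    league_ba and r (the fraction of the deviation from the mean
    that carries over) are assumed values for illustration.
    """
    return league_ba + r * (ba - league_ba)

def excess_regression(ba_year1, ba_year2, league_ba=0.270, r=0.5):
    """Observed year-2 BA minus what plain regression predicts.

    A value near zero means the luck grouping explained nothing
    beyond ordinary regression to the mean.
    """
    return ba_year2 - expected_next_ba(ba_year1, league_ba, r)

# The .310 "lucky" group and the .230 "unlucky" group from the example.
lucky_expected = expected_next_ba(0.310)    # roughly .290, a 20 point drop
unlucky_expected = expected_next_ba(0.230)  # roughly .250, a 20 point gain

# If the groups actually came in at .290 and .250, the excess is about zero.
lucky_excess = excess_regression(0.310, 0.290)
unlucky_excess = excess_regression(0.230, 0.250)
```

Under these assumed values, both 20 point moves fall out of regression alone, which is the point of the warning. The luck measure earns its keep only when the observed change differs from this expectation, for instance a lucky .310 group that falls well below .290.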


#2    tangotiger      (see all posts) 2007/03/25 (Sun) @ 20:13

Greg the genius behind:
http://hittrackeronline.com/

Has collected information on all HR for 2006.  He will expand this to include all batted balls for a select few teams.

With this data, he will tell us the angle of flight of the ball, the speed of the batted ball, and landing spot.  Now, this is what you want.  A ball launched at 35 degrees, coming off the bat at 100 mph, with a high point of 100 feet and landing at point x,y means that you don’t care if it came from Bonds or Neifi.  Even so, we still need the fielding alignment.  But, what he’s doing is how the data recording should be done.


#3    MGL      (see all posts) 2007/03/26 (Mon) @ 14:15

Where does he get that kind of data from?


#4    Peter Jensen      (see all posts) 2007/03/26 (Mon) @ 14:44

MGL - He developed his own program that can take it off HDTV images of the games.  Visit his website.  It’s fascinating what he does.

Tango - I thought your contact at MLB.com said we might be able to purchase much of this information from them with their new expanded Gameday format.  Any more updates on that?  Maybe it’s time to get in touch with him again and see whether MLB is still trying to implement this in all the stadiums this year.

I wish you weren’t so hung up on having the fielder’s positioning.  Although it would be nice to know, it is much less important than all the other information and would require additional technology and many extra man-hours of inputting data.  Everything else can be done with the existing technology and no more man-hours than what MLB.com is already planning for the expanded Gameday.  What is most important is that they also offer the Gameday info in a PBP format that can be coordinated with Retrosheet PBP data (or replace the Retrosheet data for current years altogether).  I’d happily pay a premium on the scale of MLB.TV for a statistics package like that even without the positioning data.


#5    tangotiger      (see all posts) 2007/03/26 (Mon) @ 14:47

He compiles it by hand:
http://www.hittrackeronline.com/howitworks.php

We also had a recent blog entry on it here:
http://www.insidethebook.com/ee/index.php/site/comments/hittracker_needs_you/


#6    tangotiger      (see all posts) 2007/03/26 (Mon) @ 14:56

UZR for hitters would require fielder positioning.  For UZR for fielders, positioning wouldn’t be as important, as you could infer it based on the batted ball spray patterns of the hitters and pitchers and game state (any combination of inning, score, base, out, count).

I’ll ask my mlb.com guy what it is that he can offer us, both for free and for pay, and how parsed the data will be.

