THE BOOK--Playing The Percentages In Baseball


Saturday, November 07, 2009

PZR

By Tangotiger, 05:36 PM

I came up with the PZR blueprint at the old Fanhome boards six years ago; I linked to it here and recopied it to post #1 here.

MGL provided some PZR data, and we see Alex at BtB gives us some leaderboards.


#1    Detroit Michael      (see all posts) 2009/11/09 (Mon) @ 17:50

I’ve been looking for this kind of a statistic for a couple years, thinking it’s got to be available at least to the clubs.  The Beyond the Boxscore leader boards are great for 2001-06 and it links to an MGL spreadsheet.  Any chance of getting the data for 2007-09 too?


#2    Alex Krolewski      (see all posts) 2009/11/11 (Wed) @ 01:03

I only have the data for 2001-2006.  MGL, would you be able to make public the 2007-2009 PZR data as well?


#3    Alex Krolewski      (see all posts) 2009/11/14 (Sat) @ 01:44

I’m just bumping up this post as I am afraid MGL may have missed my question due to Veterans Day.


#4    Matthew Cornwell      (see all posts) 2009/11/28 (Sat) @ 23:19

Quick question:

On one of the threads a few years back, Tom said that Greg Maddux had been the “luckiest” pitcher in terms of PZR in the 2001-2006 data, with nearly 60 runs worth of “luck”, and that Glavine and Moyer were high up there too. Does “lucky” mean that their defenses saved them those 40-60 runs, or that they gave up 40-60 fewer runs than what they “should have” based on trajectory, location, etc., and the reasons why could be a combination of skill, luck, etc.? Tom did mention that he suspected that Maddux, Moyer, etc. were “smart” pitchers and inferred that they were doing something out there which “helped” their PZR.  Just looking for some expert clarification.  Thanks!


#5    MGL      (see all posts) 2009/11/29 (Sun) @ 03:22

Alex, I sent you the 07 and 08 PZR data.  Did you get it?  I sent it about 4 or 5 days ago.

Matthew, PZR, at least the way I define it and present the data, is the difference between what a pitcher should give up (singles, doubles, triples and ROE), based on the exact batted balls they allow, given the defense they have behind them, and what they actually give up.

Clearly, and according to the limited research in this area, that difference is due to a combination of luck and some kind of pitcher skill (such as all the ground balls that Maddux allows that are classified as soft are softer than those of the average pitcher), as well as the non-perfect way that we evaluate defense and in general the methodology that we use to produce PZR.  I have found a y-t-y correlation on PZR of somewhere around .2 or so depending on the sample sizes (IP per year).  That is for pitchers who change teams.  We expect the correlation to be zero if there were no pitcher skill involved.

I hope that makes sense.


#6    Matthew Cornwell      (see all posts) 2009/11/29 (Sun) @ 03:28

Thanks!  That does make sense.  That was my hunch - but so many people use the term “luck” interchangeably with defensive support, that I was not 100% sure.


#7    MGL      (see all posts) 2009/11/30 (Mon) @ 01:53

Just to be clear, there are other ways to define PZR.  To tell you the truth, Tango was the one who came up with that acronym and the concept and I am not sure how he defines it.  Another way would be to look at the difference between the distribution of batted balls that a pitcher gives up given his G/F ratio and compare that to what distribution he “should” allow, assuming that he were an average pitcher.  That would also likely be some combination of luck and skill.  The way I defined it in my last post, there is really no way to distinguish between fielder and pitcher skill.  If, for example, with average fielders, a certain pitcher should have allowed 87 singles on his ground balls but allowed 89, we have no idea whether that was because of defense or some ability by the pitcher to allow harder to field ground balls, even given what we know about those ground balls, or luck (or some combination).  Of course, like I originally explained, we can take an estimate of each fielder’s long-term defensive talent and adjust those ground balls for them and then attribute what is left over to luck and pitcher skill.  For example, if in the last 3 years, those infielders behind that pitcher allowed an extra 2 singles for every 87 singles that an average infielder allows, then we would assume that the difference between the 89 and 87 singles was defense and that this pitcher had a PZR of zero.
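The 87-versus-89 singles example can be worked through in a few lines. This is only a sketch of the arithmetic, not MGL's actual code: the function name and the `fielder_rate` parameter (singles allowed by these fielders per single allowed by average fielders) are made up for illustration, and the sign convention (expected minus actual) is an assumption.

```python
def defense_adjusted_pzr(expected_avg_defense, actual, fielder_rate=1.0):
    """Expected minus actual singles, after scaling the expectation for defense.

    fielder_rate is a hypothetical parameter: singles allowed by the fielders
    behind the pitcher per single allowed by average fielders (1.0 = average).
    """
    expected = expected_avg_defense * fielder_rate
    return expected - actual


# With average fielders assumed, the pitcher looks 2 singles worse than expected.
raw = defense_adjusted_pzr(87, 89)

# If his infielders allow 89 singles for every 87 an average infield allows,
# the whole gap is assigned to defense and nothing is left for the pitcher.
adjusted = defense_adjusted_pzr(87, 89, fielder_rate=89 / 87)

print(raw, round(adjusted, 6))
```

Whatever remains after the defensive adjustment is still an unseparated mix of luck and pitcher skill.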

An example of the other definition I mentioned above of PZR is when a RH pitcher allows 100 ground balls and the number of hits is 30 on the average with average fielders and an average pitcher on the mound, but he allows a distribution of ground balls that would allow 35 hits with average fielders.  His PZR would then be 5 singles.  IOW, he is allowing a harder (more likely to be a single) distribution of ground balls than an average RH pitcher would allow, assuming that we controlled for the batters he faced and even the park. So this is a completely different definition of PZR and has nothing to do with the fielders. It is merely the difference between the distribution of ground balls and air balls that a pitcher allows as compared to the average distribution of ground balls and air balls that an average pitcher of the same handedness would allow, after adjusting/controlling for the batters and park.
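A small sketch of this second definition, using the 30-versus-35 hits example. The three ground-ball buckets and their hit rates are invented purely so the numbers match the example; real zone and speed classifications would be much finer.

```python
# Hypothetical hits per 100 ground balls in each bucket, with average fielders.
HITS_PER_100 = {"soft": 10, "medium": 30, "hard": 60}


def expected_hits(distribution):
    """Expected hits for a count of ground balls per bucket, average fielders."""
    return sum(count * HITS_PER_100[bucket] for bucket, count in distribution.items()) / 100


# Made-up distributions of 100 ground balls each.
average_rhp = {"soft": 30, "medium": 50, "hard": 20}
this_pitcher = {"soft": 20, "medium": 50, "hard": 30}

# The fielders never enter: both numbers assume an average defense.
pzr_singles = expected_hits(this_pitcher) - expected_hits(average_rhp)
print(pzr_singles)  # 5.0
```

A real version would also control for the batters faced and the park before comparing the two distributions.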

This might be the PZR that Tango refers to.  I always forget.

I don’t like even using the term PZR because, as you can see, I don’t know that there is only one definition and even if we define it precisely, I don’t really know what it means, other than the definition itself. IOW, almost any definition of PZR reflects some combination of pitcher skill, defensive skill, and luck.


#8    Matthew Cornwell      (see all posts) 2009/11/30 (Mon) @ 02:01

Isn’t it easier (albeit not quite as accurate) to just see how many hits on BIP a pitcher has allowed compared to his own teammates?  Compare his DER to the team’s DER?  You don’t weed out luck, but that should take care of most of the defense issues.


#9    Brian Cartwright      (see all posts) 2009/11/30 (Mon) @ 03:02

Comparing one pitcher’s DER to his team’s DER, minus his own stats, is basically WOWY (With Or Without You).
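As a rough sketch of that comparison, assuming a simplified DER of 1 minus hits per ball in play (ignoring reached-on-error and other details), with made-up totals:

```python
def der(bip, hits):
    """Simplified defensive efficiency: share of balls in play turned into outs."""
    return 1 - hits / bip


def wowy_der(p_bip, p_hits, team_bip, team_hits):
    """DER with the pitcher on the mound versus the rest of the staff."""
    with_pitcher = der(p_bip, p_hits)
    without_pitcher = der(team_bip - p_bip, team_hits - p_hits)
    return with_pitcher, without_pitcher, with_pitcher - without_pitcher


# Made-up totals: 600 BIP and 168 hits for the pitcher, 4400 and 1308 for the team.
with_p, without_p, diff = wowy_der(600, 168, 4400, 1308)
print(round(with_p, 3), round(without_p, 3), round(diff, 3))
```

The pitcher's own balls in play are removed from the team totals so he is not compared against himself.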

Last off-season I ran WOWY for defense, holding for ballpark and batted balltype.

Now knowing the quality of the defenders on each play for each pitcher, my next step will be to run WOWY for the pitchers, holding for ballpark and fielders.

Another approach I will try is to find the GB%, LD%, FB% and PU% of all batted balls of each pitcher, find the population means and variances, and assign each player a t-score on each of the four. Then, by the Pythagorean distance formula, find which other pitchers (or batters) are the closest to each, summing the aggregate BABIP for the x number of closest players, or however many other players fall within a given distance, or until a minimum sample size is reached.
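One way that comps search might look, as a sketch only: the player labels and numbers are invented, the scores are plain standardized scores using the population standard deviation, and the "x closest players" variant is the one shown (the distance-cutoff and minimum-sample variants are not).

```python
import math
import statistics


def standardize(profiles):
    """Convert each (GB%, LD%, FB%, PU%) profile to standardized scores."""
    names = list(profiles)
    columns = list(zip(*(profiles[n] for n in names)))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.pstdev(c) for c in columns]
    return {
        n: tuple((x - m) / s for x, m, s in zip(profiles[n], means, sds))
        for n in names
    }


def comps_babip(target, profiles, hits, bip, k=2):
    """Aggregate BABIP of the k players closest to target in batted-ball mix."""
    z = standardize(profiles)
    others = sorted(
        (n for n in profiles if n != target),
        key=lambda n: math.dist(z[target], z[n]),  # Euclidean distance
    )
    nearest = others[:k]
    return nearest, sum(hits[n] for n in nearest) / sum(bip[n] for n in nearest)


# Invented batted-ball profiles: (GB%, LD%, FB%, PU%).
profiles = {
    "P1": (0.55, 0.18, 0.22, 0.05),
    "P2": (0.53, 0.19, 0.23, 0.05),
    "P3": (0.40, 0.18, 0.32, 0.10),
    "P4": (0.38, 0.17, 0.34, 0.11),
    "P5": (0.45, 0.20, 0.28, 0.07),
}
hits = {"P1": 160, "P2": 150, "P3": 135, "P4": 130, "P5": 145}
bip = {"P1": 500, "P2": 500, "P3": 500, "P4": 500, "P5": 500}

nearest, babip = comps_babip("P1", profiles, hits, bip, k=1)
print(nearest, babip)
```

Summing hits and balls in play across the comps, rather than averaging each comp's BABIP, weights the aggregate by sample size.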


#10    Matthew Cornwell      (see all posts) 2009/11/30 (Mon) @ 09:11

Brian - what if your method says that pitchers have a worse BABIP on GB than league average BABIP, but Greg Maddux’s BABIP on GB is better than league average?  Will he be unfairly “hurt”?


#11    dave smyth      (see all posts) 2009/11/30 (Mon) @ 10:36

I always thought that PZR was the expected result of the balls in play (based on UZR zones, etc.) assuming an average defense.


#12    Brian Cartwright      (see all posts) 2009/11/30 (Mon) @ 11:25

Matthew/10 - The methods I discussed are ones I will be doing very shortly for my possible improvements to my Oliver projections. Currently I have only completed the defense WOWY (and am at this moment sitting here transferring the code from Access to MySQL).

For finding comparable players, I would use that as the population mean to regress to. Instead of regressing batters and pitchers to the overall league mean, regress them to the players most like them. I believe that it’s the mix of all four categories of batted balls that is important, and I would prefer this method instead of a regression analysis.

I think the key is in the vertical angle of the batted ball, and thus the horizontal speed of the ball and the reaction time of the fielder. There’s no one LD or FB description; it’s a continuum best described by the vertical angle and speed off the bat (and probably backspin, but that’s not easily available yet).

I was fascinated by the Andruw Jones/Derek Jeter comparison. Jeter hits a lot of GB and a higher than average rate of LD, has the lowest rates of FB and PU, and has the highest BABIP. He has higher than average BABIP on both LD and FB. I believe this is because his LD and FB have a lower than average vertical angle, hinted at by his high GB rate. Jeter’s comps have a similarly high BABIP of around .350. Andruw hits a higher than average number of FB and PU, and his comps have a BABIP near .250.

There’s a bias towards HR hitters in minor league promotion, leading to increasingly lower overall BABIPs on fly balls as you get to higher minor leagues. Therefore, a pitcher will face more flyball batters as he is promoted. Disco Hayes is a high GB pitcher who saw his BABIP go way up on promotion. I need to verify, but I believe that when these FB hitters hit off a GB pitcher, it lowers the average vertical angle, resulting in a higher BABIP for both. It has been known for a while that FB hitters have better BABIPs off GB than FB pitchers.

I’ve been mulling the PZR style approach for a few months now. I thought that if I can measure defense based on the types of balls hit to each fielder in each park, given the play by play, why can’t I do the same for batters and pitchers, holding for ballpark and fielders? Instead of a regression term, this is an expected value that the observed value of the fielders, pitchers and batters are compared to. The difference between expected and observed can be expressed as a plus/minus, a ratio, or a normalized rate. Then in any hypothetical environment, calculate the expected value and apply the player’s true talent estimate.
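The three ways of expressing the gap between expected and observed can be sketched as below. The numbers are made up, "normalized rate" is assumed here to mean plus/minus per opportunity, and the unregressed ratio stands in for a true talent estimate only to keep the example short.

```python
def compare(observed, expected, opportunities):
    """Express observed versus expected as a plus/minus, a ratio, and a rate."""
    plus_minus = observed - expected
    ratio = observed / expected
    rate = plus_minus / opportunities  # assumed meaning of "normalized rate"
    return plus_minus, ratio, rate


def apply_talent(expected_in_new_env, talent_ratio):
    """Scale the expected value in a hypothetical environment by a talent ratio."""
    return expected_in_new_env * talent_ratio


# Made-up sample: 89 hits observed, 87 expected, on 300 balls in play.
plus_minus, ratio, rate = compare(observed=89, expected=87, opportunities=300)

# Hypothetical new park and fielders where the expected value is 100 hits.
projected = apply_talent(100, ratio)
print(plus_minus, round(ratio, 4), round(rate, 4), round(projected, 2))
```

In practice the ratio would be regressed toward some mean before being applied to a new environment.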


#13    Colin Wyers      (see all posts) 2009/11/30 (Mon) @ 12:57

MGL, how much does pitcher handedness affect the PZR y-t-y correlations?

As far as BABIP - BABIP and UZR measure two different (but similar) things. BABIP measures the performance of the defense relative to an AVERAGE distribution of BIP, UZR measures the performance relative to the performance of an average defense with the OBSERVED distribution of BIP.


#14    Matthew Cornwell      (see all posts) 2009/11/30 (Mon) @ 21:46

So back to my main question, when Greg Maddux ends up with +60 PZR runs from 2001-2006, what that means is that he prevented 60 runs due to some combination of skill and luck compared to what he “should have” given his defense and batted ball distribution?  Correct?


#15    MGL      (see all posts) 2009/11/30 (Mon) @ 22:09

"I always thought that PZR was the expected result of the balls in play (based on UZR zones, etc.) assuming an average defense.”

Again, everyone has their own definition. I was assuming it was the difference between that (what you said) and what the pitcher actually gives up, which is luck, defense, and pitcher skill combined.  Which doesn’t really tell you anything.

The only interesting question is whether in evaluating pitchers or constructing a projection, we should assume that a pitcher “really” should have given up whatever an average pitcher and an average defense would allow given that pitcher’s distribution of BIP.  The fact that I find a .2 or .3 y-t-y correlation when pitchers change teams, between what they should give up and what they do give up, suggests that we should NOT do that, at least not completely.

When I do my pitcher projections, I convert their BIP results to somewhere in the middle of what they actually gave up minus the defense and what they should have given up if they were an average pitcher with average defense.
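"Somewhere in the middle" amounts to a weighted average of the two figures. The sketch below assumes a simple linear blend; the 0.5 weight is a placeholder and not a value given in this thread.

```python
def blended_bip_hits(actual_minus_defense, expected_average, weight_actual=0.5):
    """Weighted average of defense-neutral actual hits and expected hits.

    weight_actual is a hypothetical knob: 1.0 trusts the pitcher's actual
    results fully, 0.0 treats him as an average pitcher with average defense.
    """
    if not 0.0 <= weight_actual <= 1.0:
        raise ValueError("weight_actual must be between 0 and 1")
    return weight_actual * actual_minus_defense + (1 - weight_actual) * expected_average


# Made-up inputs: 89 hits after removing defense, 87 expected for an average pitcher.
blended = blended_bip_hits(89, 87)
print(blended)  # 88.0
```

A larger year-to-year correlation would argue for a larger weight on the pitcher's actual results.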

Again, what you want to call “PZR” differs depending on whom you are talking to, which is one reason why I never liked that term.  Tango talks about it like everyone knows and agrees what it is.  I don’t think that is the case, like it is with UZR.

