THE BOOK--Playing The Percentages In Baseball

Thursday, May 31, 2007

PZR

By Tangotiger, 05:23 PM

PZR is one side of the defense coin, with UZR on the flip side.  You can find discussions on it in the archives:

The description of PZR is in post #1 here:
http://www.tangotiger.net/archives/stud0191.shtml

Extensive discussion on PZR:
http://www.tangotiger.net/archives/stud0014.shtml

In a nutshell, PZR doesn’t care about whether a ball was caught or not.  It only cares about the distribution of balls allowed.  How many were hit to point x,y, how hard was the ball hit there, what’s the tendency of the pitcher/batter to allow GB, what’s the park, and what’s the handedness of the batters/pitchers.  Given all that, then we don’t need to know if Mike Cameron or Wily Mo Pena is playing CF.  Or in the case of Roger Clemens last year and this: Everett or Jeter.

If 95% of balls allowed by RHP to angle 27 degrees by a RHH who has a tendency to hit GB on a grass surface are outs, then *all* RHP who give up said ball get 0.95 outs for it.  If Clemens gave up 100 such BIP, and Everett gets 100 outs and Jeter gets 90 outs, we don’t care (for Clemens).  Clemens gets credit for 95 outs.  Everett gets +5 and Jeter gets -5.

Add up all the “virtual” outs, virtual singles, virtual doubles, virtual HR, and you get a park and fielder-independent stat line for a pitcher.
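
As a rough sketch of that bookkeeping (not the actual PZR code), the Clemens example can be written out in a few lines of Python. The 0.95 out rate is the one used above; treating the other 0.05 as singles, and the names `BUCKET`, `virtual_line` and `fielder_credit`, are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical league-average outcome rates for one bucket of batted ball.
# The 0.95 out rate is the figure used in the post; treating the remaining
# 0.05 as singles is an assumption made only for this illustration.
BUCKET = {"out": 0.95, "single": 0.05}


def virtual_line(balls_in_play):
    """Sum outcome probabilities over every ball in play a pitcher allowed."""
    line = Counter()
    for outcome_probs in balls_in_play:
        for outcome, prob in outcome_probs.items():
            line[outcome] += prob
    return dict(line)


def fielder_credit(actual_outs, virtual_outs):
    """Outs the fielders made above (+) or below (-) the virtual total."""
    return actual_outs - virtual_outs


clemens = virtual_line([BUCKET] * 100)         # 100 such balls in play
everett = fielder_credit(100, clemens["out"])  # fielders turned all 100 into outs
jeter = fielder_credit(90, clemens["out"])     # fielders turned 90 into outs

print(round(clemens["out"], 1), round(everett, 1), round(jeter, 1))
```

The pitcher's line depends only on the buckets of balls he allowed, so it comes out the same (about 95 virtual outs) whichever shortstop is behind him; the +5 and -5 land on the fielders.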

***

This really won’t work for hitters, since there’s a quality to hitting a ball that is simply not captured by the data recorders.  While Wade Boggs has some control over whether a ball is hit at 22 degrees or 24 degrees, a pitcher doesn’t have anywhere near the same control.  A hitter can hit based on the fielder positioning far more than a pitcher can make a hitter hit the ball based on fielder positioning.  That’s why HZR won’t really work.


#1    David Gassko      2007/05/31 (Thu) @ 18:58

Tom,

Have you seen this:

http://whartonball.blogspot.com/2007/01/fielding-adjusted-pitcher-eras.html


#2    tangotiger      2007/05/31 (Thu) @ 19:03

I wouldn’t do it that way.

Say you have Brandon Webb (GB heavy) and Barry Zito (FB heavy) on the same team.  The team features Rolen, Everett, and Pujols in the infield, with Manny, Bernie, and Junior in the OF.  See where I’m going with this?  The overall fielding may be league average, but Webb is the one getting the benefit, and Zito is the one getting killed on it.

Since PZR is just the flip-side of UZR, you might as well run virtually the same process for the pitchers.


#3    David Gassko      2007/05/31 (Thu) @ 19:18

My bad. I didn’t read carefully enough: I thought that’s what they were doing. I agree with Tom completely here; David Pinto does do something similar with PMR.


#4    dcj      2007/05/31 (Thu) @ 19:29

Another issue with HZR is the speed of the batter. Ichiro and Giambi could hit the same ground ball, but Ichiro beats it out for a hit. And for hitters like Giambi and Ortiz who get a defensive shift, HZR would be next to useless.

If we had better BIP data, I think HZR could be useful for fly balls. A guy gets a lot of doubles one year. Is that because he hit the ball better or because the outfielders didn’t make the plays? I’m speculating that the adjustments to fielder positioning that you mention are a bigger deal in the IF than the OF.

With the data we have now, even HZR for fly balls might have the same problem as PrOPS. That is, Eckstein and Pujols hit two balls that are classified the same in the stats, but in fact Pujols’ ball was more difficult to field.

--

PZR is an excellent idea. Whenever a pitcher has a very low BABIP, I wonder whether in fact he gave up balls that were easier to field, or whether it was his fielders bailing him out. PZR will answer that question.

David Gassko’s pitching runs created makes an adjustment for the pitcher’s K-rate. (See here and scroll down to the graph.) I think this is just an approximation, and PZR would tell exactly how much credit the pitcher deserves. Of course if you want to apply it to previous eras, something like that graph is necessary.

Along those lines, a PZR-based stat would be ideal for MVP considerations.


#5    tangotiger      2007/05/31 (Thu) @ 19:39

Actually, UZR, PZR, and HZR should all include the speed of the batter and the tendency for the batter to pull, spray, or go the other way.

In short, whatever you can think of, it all gets included.


#6    tangotiger      2007/05/31 (Thu) @ 19:42

Looks like neither dcj, nor I, read the original link, as I already listed the speed of the batter (and runners) as a parameter.  I didn’t have the sprayability though.


#7    MGL      2007/05/31 (Thu) @ 22:10

I think that PZR suffers from some of the same problems as HZR.  I am not sure though.  I have a PZR version of my UZR and I’ll present some numbers.

Also I am not sure what PZR tells you.  Presumably it is the pitcher’s true run value of all balls in play independent of fielding, official scoring, park effects, etc.  But, here are some of the problems.  First of all, for some parks (like LF at Fenway) I think that the actual result of the ball may be better than the virtual result adjusted for park factors.  That may be true at other parks as well (e.g., Colorado).  Of course, given “perfect” park factor adjustments that is not true.

The other thing is that there is still plenty of “luck” in PZR, namely the types of batted balls that a pitcher allows.  Now, we think most of that is luck, but there is still some skill in there.  So what does PZR tell us?  I am not sure.  It is some intermediate result, not all skill, not all luck.  I guess it is skill plus luck without defense and park influences.  I suppose that it is a good place from which to do regressions in order to estimate pure skill and do projections.  I suppose that it is a better starting point than ERC (which is PZR plus defense, park effects, etc.).  But I am not convinced that it is actually better than ERC, because of park adjustments, the HZR effect, and other things.  As an example of the HZR effect, GB pitchers tend to give up easier-to-field GB and fly ball pitchers tend to give up easier-to-catch fly balls, even given the same recorded batted ball characteristics, so if you don’t do a G/F adjustment, you might get spurious results.

If I look at my list of pitchers and their UZR’s, which I can make available, many pitchers have a large difference between their ERC and PZR and I have a hard time believing that the differences are defense and park influences.  I think other things are going on.

Basically I am not very keen on PZR, at least until we get better defensive raw data.  Eventually when we do, we can do HZR and PZR (and UZR of course) with impunity and tease out virtually all of the luck in the disconnect between how a ball was actually hit and the result.

I just remembered one more problem with PZR.  Similar to HZR, if fielders have different positioning with different pitchers, it will screw everything up, in the same way as if you tried to do HZR with Bonds or Giambi, the results will be all screwed up because of the shift.


#8    tangotiger      2007/06/01 (Fri) @ 08:47

MGL sent me his files, and I’ll post them when I get to the office…


#9    tangotiger      2007/06/01 (Fri) @ 10:56

Here are MGL’s PZR, 2001-06:

http://www.tangotiger.net/mgl/PZR0106.zip

(He also sent me his UZR, 2003-2007, so watch out for those later today!)


#10    tangotiger      2007/06/01 (Fri) @ 11:04

At the top of the list of guys who have the biggest disparity between their actual performance and their PZR is Greg Maddux.  Tom Glavine is #3.

DIPS-hater Glendon Rusch is third from the bottom.

PZR is saying that Maddux got very lucky, to the tune of +62 runs over 913 IP (+14 runs per 200 IP).

As for Rusch, who has some of the worst BABIP around, PZR says that was very bad luck, to the tune of 12 runs per 200 IP.

I think there are a few parameters that could be added.  I consider Maddux and Moyer (#16 in good luck) to be two of the smartest pitchers around.  Maybe we need to add the pitch count to the mix as a parameter.


#11    Tangotiger      2007/06/01 (Fri) @ 14:32

Here are the team totals, per 162 G.  “diff” is simply the difference in runs between the pitcher’s actual performance and his PZR-based performance.

Team diff
sln 95
phi 58
cha 42
ana 38
sdn 35
col 32
lan 27
atl 26
sea 19
mil 17
bal 15
chn 14
ala 3
cin -4
nyn -5
mon -9
oak -11
kca -13
tor -15
hou -18
tex -19
cle -23
was -24
det -30
nya -32
sfn -33
min -35
flo -36
pit -41
bos -50
ari -56
tba -57

If this were based purely on luck, that is, if PZR told us nothing at all, we’d expect 1 SD = 10 runs.  In fact, 1 SD = 35 runs.

So, PZR tells us a heckavalot.  It doesn’t necessarily tell us about the pitcher’s skill, since there may be other parameters (the Maddux/Rusch parameters) unaccounted for.
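
For anyone who wants to check the spread, here is a small Python sketch that recomputes the SD from the team diffs listed above. The last step, subtracting the luck variance from the observed variance, is an assumption added here (it only holds if the luck component is independent of everything else), and `luck_sd = 10` is simply the figure quoted above, not something the code derives.

```python
import math
import statistics

# Team "diff" values (runs per 162 G), copied from the table in this comment.
team_diff = [
    95, 58, 42, 38, 35, 32, 27, 26, 19, 17, 15, 14, 3, -4, -5, -9,
    -11, -13, -15, -18, -19, -23, -24, -30, -32, -33, -35, -36, -41,
    -50, -56, -57,
]

# Observed spread across the 32 team entries (population SD).
observed_sd = statistics.pstdev(team_diff)

# Spread expected from luck alone, as quoted in the comment.
luck_sd = 10.0

# Assumption: if luck is independent of the rest, variances add, so the
# non-luck spread is what remains after removing the luck variance.
non_luck_sd = math.sqrt(observed_sd ** 2 - luck_sd ** 2)

print(f"observed SD: {observed_sd:.1f} runs")
print(f"implied non-luck SD: {non_luck_sd:.1f} runs")
```

Under that assumption, almost all of the observed spread is something other than luck, which is the point of the comparison.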


#12    Tangotiger      2007/06/01 (Fri) @ 14:38

Here’s team UZR for seasons 2003, 2005-2007, per 162 games:
Team runs_162team
cha 53
sdn 52
chn 50
ana 48
col 39
det 28
phi 25
mon 24
atl 22
cle 21
sln 21
hou 17
sfn 13
min 8
nyn 5
oak 4
sea 4
tor 4
bal 2
kca 1
ari -2
ala -8
mil -18
tex -24
lan -25
flo -30
was -31
pit -34
nya -48
bos -50
tba -51
cin -66

1 SD = 32.

PZR tells us as much as UZR.

However, the correlation between the two stats is an incredibly high r=.55.

Tampa’s UZR and PZR, and the Red Sox’ UZR and PZR, are each around -50 runs.

They should ideally have an r=.00.  After all, after you control for everything, your fielders should not be linked to your pitchers.
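
The correlation can be recomputed from the two team tables as posted. This is only a sketch: the `pearson_r` helper is written out by hand for illustration, and the two tables cover different sets of seasons (2001-06 for the PZR diffs, 2003 and 2005-07 for UZR), so the figure is a rough check and nothing more.

```python
import math

# PZR "diff" per team (post #11) and team UZR (post #12), runs per 162 G.
pzr_diff = {
    "sln": 95, "phi": 58, "cha": 42, "ana": 38, "sdn": 35, "col": 32,
    "lan": 27, "atl": 26, "sea": 19, "mil": 17, "bal": 15, "chn": 14,
    "ala": 3, "cin": -4, "nyn": -5, "mon": -9, "oak": -11, "kca": -13,
    "tor": -15, "hou": -18, "tex": -19, "cle": -23, "was": -24, "det": -30,
    "nya": -32, "sfn": -33, "min": -35, "flo": -36, "pit": -41, "bos": -50,
    "ari": -56, "tba": -57,
}
uzr = {
    "cha": 53, "sdn": 52, "chn": 50, "ana": 48, "col": 39, "det": 28,
    "phi": 25, "mon": 24, "atl": 22, "cle": 21, "sln": 21, "hou": 17,
    "sfn": 13, "min": 8, "nyn": 5, "oak": 4, "sea": 4, "tor": 4,
    "bal": 2, "kca": 1, "ari": -2, "ala": -8, "mil": -18, "tex": -24,
    "lan": -25, "flo": -30, "was": -31, "pit": -34, "nya": -48, "bos": -50,
    "tba": -51, "cin": -66,
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


teams = sorted(pzr_diff)  # same 32 team codes appear in both tables
r = pearson_r([pzr_diff[t] for t in teams], [uzr[t] for t in teams])
print(f"r = {r:.2f}")
```

Matching on team code keeps ana/ala and mon/was as separate entries, the same way the tables list them.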


#13    dcj      2007/06/01 (Fri) @ 17:10

Wow, this is great!

Team ala = ana? It appears that ana is years 01-04 and ala is years 05-06, so it coincides with the name change. There’s also mon and was of course.

Shouldn’t the UZR and PZR numbers in posts 11,12 be almost exactly the same? Take Derek Lowe 2002. He had a BABIP of .238 that year and sure enough, the difference between actual and PZR is +14 runs for him. That should mean that when he was on the mound, his fielders racked up +14 in UZR. So rather than wondering why the correlation is so high, I am wondering why the lists are not identical!

Okay, now that I look back, post 11 is 2001-06 and post 12 is 2003 and 2005-07. That would certainly explain it. Can you go year by year and see if the lists come out the same?


#14    dcj      2007/06/01 (Fri) @ 17:46

If the data perfectly captured how difficult a ball was to field, then all deviations (actual - PZR) would be due to the defense. As it is there are park issues and G/F issues, as MGL mentions. That can be fixed with better data. Positioning to my mind is a harder problem to straighten out. A pitcher pitches a certain batter outside, making him hit the ball to the opposite field. The fielders shade him to hit it the other way. So when the ball is hit, they are in better position to make a play than the UZR figure for that type of ball would suggest.

Still, that’s more of a minor problem. Things to adjust for:

1. park
2. pitcher characteristics: handedness, G/F tendency
3. batter characteristics: handedness, speed, pull/spray tendency, how hard he hits the ball
4. game situation: count, base-out state, maybe even inning/score for “guarding the lines,” “no-doubles defense” etc?

If we put all these together with the batted ball data (plus other adjustments, what am I forgetting?) then the basis for PZR should be solid enough.


#15    ubelmann      2007/06/02 (Sat) @ 01:51

Great stuff.

Wouldn’t it make sense to also list an expected BABIP (in front of a neutral defense) for each pitcher given his batted ball types?  (Though on second thought, I suppose it’s not too hard to get that by hand by adjusting the number of hits on BIP accordingly.)


#16    Los Angeles Waterloo of Black Hawk      2007/06/17 (Sun) @ 00:54

A bit late to the party here, but ...

“A hitter can hit based on the fielder positioning far more than a pitcher can make a hitter hit the ball based on fielder positioning.”

Though there are certainly many concerns with HZR (and PZR), I don’t think that this is one of them.  Pitchers often tailor their pitching style to the defensive alignment (pitching inside to a LHB defended by an extreme shift, for example) and vice versa (a SS moving a step one way or the other just as the pitch is thrown, as he knows whether it’s a fastball or off-speed, and thus which way the ball is more likely to go).

***

Very off-topic, but the College World Series coverage on ESPN is featuring Win Probability!

