Thursday, October 18, 2007
Converting Zone Data
This time, it’s Eric Van’s turn.
Aren’t OOZ plays for one position often BIZ for an adjacent position?
“Something’s funny in CF. I have Ichiro near the top and Sizemore near the bottom, and my eyes tell me that that’s closer to reality than the opposite.”
The guy with the Reds blog, forgot his name, had similar results through his translations of the THT zone data, and I agree that looks closer to what I’ve seen than what UZR says about the two players.
By the way, can you explain the differences between your translations and the Reds guy’s in simple terms?
Eric, I see what you did for the team OOZ opportunities, but how did you apportion those to individual fielders?
I had the same question as Rally. I like the idea of using total balls in play - BIZ - non-zone-outs to get a better measure of opportunities for OOZ plays, but I don’t quite see how you can distribute that to individual fielders beyond looking at BIZ distribution around the field and basing on that.
Maybe doing something like that is still better than just using BIZ or (as Rally does) innings in the denominator for OOZ, but I’d have to play around with the data.
-j
http://sonsofsamhorn.net/index.php?showtopic=23972
In post #12, he lays it out, but the English words in his equations get in the way.
I’m going to guess the following (and if it’s not what Eric is doing it’d be what I’d do):
1. At the league level
a. figure out the number of out of zone plays as Eric is doing, or as logic would dictate
b. figure out, for each position, the share of outs, per total out of zone plays
If for example all positions get 30% outs on all out of zone plays, then the sum of each position will come out to .30. So, maybe SS is .05, CF is .08, etc.
Call this: “lgPosOuts per lgOOZ”
2. At the team level
a. figure out the number of out of zone plays as Eric is doing, or as logic would dictate (i.e., do the same as 1a, but at the team level)
b. take this figure of 2.a, and multiply by “lgPosOuts per lgOOZ”
That’ll give you the number of outs you should get on out of zone plays, for that team position.
3. At the player level
a. figure out the percentage of BIZ your player has to your team’s BIZ for that position.
b. take the figure of 3.a and multiply it by the figure of 2.b
lgOOZ should be clearer as lgBOZ
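The three steps above can be sketched in a few lines of Python. This is only an illustration of the guessed procedure, not Eric's actual code; the function names and every number are made up, and "lgBOZ" follows the renaming suggested above.

```python
# Illustrative sketch only: function names and all numbers are made up.

def league_shares(lg_pos_outs, lg_boz):
    """Step 1b: outs by each position per league ball out of zone (lgPosOuts per lgBOZ)."""
    return {pos: outs / lg_boz for pos, outs in lg_pos_outs.items()}


def team_expected_outs(team_boz, shares):
    """Step 2b: team balls out of zone times each position's league share."""
    return {pos: team_boz * share for pos, share in shares.items()}


def player_expected_outs(player_biz, team_pos_biz, team_pos_expected):
    """Step 3: prorate the team-position figure by the player's share of that position's BIZ."""
    return player_biz / team_pos_biz * team_pos_expected


# Hypothetical league: 1000 balls out of zone, SS converts 50 of them, CF converts 80.
shares = league_shares({"SS": 50, "CF": 80}, lg_boz=1000)
# Hypothetical team that also faced 1000 balls out of zone.
team = team_expected_outs(team_boz=1000, shares=shares)
# Hypothetical CF who saw 300 of his team's 400 BIZ at that position.
cf_player = player_expected_outs(player_biz=300, team_pos_biz=400,
                                 team_pos_expected=team["CF"])
print(shares, team, cf_player)
```

The player's actual OOZ plays would then be compared against `cf_player`, the number of out-of-zone outs an average fielder would be expected to make given that share of the team's chances.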
I like this plan. It’s pretty much what I theorized doing with OOZ opps from my first THT article on their defensive stats, but I never got around to doing all the math.
One more step I would suggest: control for groundball/flyball mix when dividing up your OOZ chances for each position.
I have an article ready to go for THT on figuring OOZ opportunities (though I do it differently from Eric), and my basic finding is that it doesn’t really change much from simply estimating OOZ based on BIZ. Seems that Eric is actually showing the same result, though of course that’s not what he says.
It’s really not much different using innings instead of BIZ, but I’ve found that innings give you slightly more conservative results.
Either one is a lot less work.
Conceivably Eric took too early a snapshot of the THT data (since he took it very close to the end of the season), or perhaps he made a mistake. [As a side note, the THT team fielding totals for OOZ consistently understate the summed individual totals, by an average of 6 per team.] He says there were 723 non-zone outs (NZO) per team (essentially plays made on popups, bunts, and groundouts to the pitcher, which are not counted in the THT stats as either in or out of zone). I count 679, so his calculations that depend on this number are off a little.
Also, I think Anthony (#2)’s question is a problem for this method. It makes assumptions about “consistent” difficulty of BIZ vs out of zone opportunities. But I assume that, because of more aggressive defensive shifts or rangy players, some teams have more plays made in another fielder’s official zone, increasing their OOZ plays and decreasing their BIZ. I stepped through Eric’s method [as I understand it] with an extreme made-up case where 2 teams had the same balls in play against them and the same total plays made, but broken up differently between plays made in zone and out of zone by each individual fielder. At the team level, the team with more plays out of zone and fewer in zone had a better rating than the team with more balls recorded as BIZ.
That would be fine if the team recording more outs out of zone was facing tougher opportunities. If that’s the case, then RZR is doing exactly what it should be doing and what these efforts to convert to runs assume it’s doing.
I wonder though how it handles plays like this:
High pop fly to medium left center. Easy out, even for a DH type like Raul Ibanez. He camps under it; if he catches it, it’s an in-zone opportunity converted. Ichiro! runs over and calls him off. Does Ichiro get an OOZ play credited for this?
I don’t see why he wouldn’t, under the BIS definition.
I see what you are saying. That play *IS* a BIZ, in that it’s hit to a zone that is deemed “fieldable”. But, a guy from some other position is getting that ball.
So, if Ibanez makes the play, the Mariners get a BIZ. But, if Ichiro makes the play, the Mariners get a BOZ!
Actually, if anyone other than Ibanez makes an out, it’s a BOZ. If Ibanez makes an out, or if no one makes an out, it’s a BIZ.
That’s a limitation of the accounting system, which is why UZR and PMR are better.
Actually, if anyone other than Ibanez makes an out, it’s a BOZ. If Ibanez makes an out, or if no one makes an out, it’s a BIZ.
***
Just noticed this, but I think Tango is wrong here. I would presume it would be an OOZ for Ichiro AND a BIZ for Ibanez. In which case that’s probably fine. Ichiro made a play center fielders usually don’t make and Ibanez didn’t. Ichiro gets credit and Ibanez gets a demerit. Now it may be that any center fielder COULD make that play, but just doesn’t, but since the center fielder generally gets to call off the corner outfielders if they’re both going after a ball, I don’t know how likely that is to occur.
David,
This thread was directed at the idea of estimating the “implicit opportunities” for OOZ plays. From that point of view it would be unfortunate double counting if the THT ZR counts regular opportunities as you suggest.
My assumption was the same as Tango’s, but I think in my case that assumption is based solely on how Dewan’s antecedent STATS’ ZR was defined [Ibanez would not have been charged an opportunity], in conjunction with Dewan’s explicit statements about what he intended to revise.
THT’s ZR seems to be essentially the same as the “revised zone rating” published in Dewan’s Fielding Bible (pp 227-38), so Dewan’s description there is relevant. Unfortunately it is silent on this point.
Probably evidence can be gathered to decide this question next month, although it may be a bit tedious, because I don’t know how common these plays really are. If THT is again going to publish daily updates to its fielding statistics, I will try to capture daily deltas; from those we will know what opportunities were charged to each player at the game level. By reviewing the video of games with OOZ plays, eventually there should be at least a few games in which there is no ambiguity about which plays were in zone, out of zone [in no man’s land], or actually “poached” from the neighbors’ zone between a pair of adjacent fielders. We can match that up to the plays made and opportunities from the daily delta. Hopefully it will quickly become clear which theory is correct.
Any other ideas?
David pointed me to this thread. I asked John this question a while ago. Here was his response:
“Dave, Ichiro catching a ball in front of Ibanez in LF does not count against Ibanez. Only uncaught balls in the LF zone would count against Ibanez.”
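The accounting rule implied by that response can be written out as a small Python sketch. The function name and return format are made up for illustration, and crediting the catch to the other fielder as an OOZ play is an assumption taken from the discussion above, since the quoted reply only says that the catch does not count against Ibanez.

```python
# Illustrative sketch only: charge_play and its return format are hypothetical.

def charge_play(zone_owner, out_made_by=None):
    """Book one ball hit into zone_owner's zone.

    Returns who is charged a BIZ, who is credited an in-zone play,
    and who is credited an OOZ play (None where nobody is).
    """
    if out_made_by is None:
        # Uncaught ball in the zone: an opportunity charged to the zone's owner.
        return {"biz": zone_owner, "play": None, "ooz": None}
    if out_made_by == zone_owner:
        # The owner makes the out: a BIZ and an in-zone play for him.
        return {"biz": zone_owner, "play": zone_owner, "ooz": None}
    # Someone else makes the out: nothing charged to the owner.
    # ASSUMPTION: the fielder who made the out gets an OOZ play.
    return {"biz": None, "play": None, "ooz": out_made_by}


print(charge_play("Ibanez", "Ichiro"))   # Ichiro calls Ibanez off
print(charge_play("Ibanez", "Ibanez"))   # Ibanez makes the catch
print(charge_play("Ibanez"))             # ball drops in
```

Under this rule the same fly ball raises the team's BIZ when the left fielder catches it or nobody does, and raises its OOZ count when the center fielder takes it, which is the limitation discussed earlier in the thread.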
Been really applying myself to this question recently, and I’ve come to the conclusion that OOZ/BIZ is unproductive at best. For outfielders, it seems to actually have an (insignificant) negative correlation with defensive ability. (I’ve aggregated ZR/RZR data at the team level, and am comparing it to DER compiled from Retrosheet data, separated into infield and outfield plays.)
Dividing by innings seems to work much better, even though I still get better correlations leaving out the OOZ data from outfielders at this point. (I suspect some of my problem comes from excluding line drives, because I have no way to separate balls that hit the grass just behind the basepaths and ropes hit off the walls at this point.)
Here’s the best fit I was able to establish so far for team level DER/RZR data. (Bear in mind that this only has been tested on 2006/2007 RZR data; the way RZR measures outfielders changed substantially after the 2005 season, and I’m starting to dislike my previous method of compensating for this.) I used trial and error and some intuition, not a regression model.
Infielders first. Calculate RZR as per THT (Plays/BIZ). Then you need to find team ground balls allowed. “Out of zone rating,” expressed as plays per “chances”:
OZR: OOZ /(OOZ/(OOZ+Plays)*GB)
To combine into a single zone rating:
0.84*RZR+0.16*OZR
(Those weights, incidentally, are simply the number of IZ compared to OOZ plays. This holds up both in the IF and the OF, which makes me think that this is how BIS decides on what zones to use. I suppose you could attempt a best-fit model to improve on those a bit.)
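As a rough Python sketch of the two formulas above: the function names and the team totals are made up, and the OZR denominator is read as (OOZ / (OOZ + Plays)) * GB, which is an assumption about how the expression is meant to be grouped.

```python
# Illustrative sketch only: names and numbers are made up.
# ASSUMPTION: "OOZ/(OOZ+Plays)*GB" is grouped as (OOZ / (OOZ + Plays)) * GB.

def rzr(plays, biz):
    """Revised zone rating as per THT: in-zone plays per ball in zone."""
    return plays / biz


def ooz_chances(ooz, plays, batted_balls):
    """Estimated out-of-zone chances: the OOZ share of all plays, times GB (or FB)."""
    return ooz / (ooz + plays) * batted_balls


def ozr(ooz, plays, batted_balls):
    """Out of zone rating: OOZ plays per estimated out-of-zone chance."""
    return ooz / ooz_chances(ooz, plays, batted_balls)


def combined_zone_rating(plays, biz, ooz, batted_balls):
    """Weighted blend from the comment: 0.84 * RZR + 0.16 * OZR."""
    return 0.84 * rzr(plays, biz) + 0.16 * ozr(ooz, plays, batted_balls)


# Hypothetical team infield: 1600 plays on 2000 BIZ, 300 OOZ plays, 2200 ground balls.
team_rzr = rzr(1600, 2000)
team_ozr = ozr(300, 1600, 2200)
team_zr = combined_zone_rating(1600, 2000, 300, 2200)
print(round(team_rzr, 3), round(team_ozr, 3), round(team_zr, 3))
```

Under that grouping the OOZ terms cancel, so OZR works out to (OOZ + Plays) / GB, which is total plays per ground ball.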
The procedure for outfielders is identical, simply substituting fly balls allowed instead. GB and FB numbers should be available from THT’s teams page or from the splits page on Baseball Reference, although it’d probably be a bit unwieldy to derive both. (I might end up getting that data from the Gameday XML files.)
To apply this to individual fielders, I’m still musing over whether it would be better to use BIZ or Innings Played to divide up the available OOZ chances.
I’ll go ahead and point out how much of an idiot I am before anyone else does.
GB-BIZ = OOZ chances for infielders
However, the OZR equation above seems to work better for outfielders than FB-BIZ does, probably because of line drives.
I should note that when I figured team GB and FB DER, I excluded certain plays (FB fielded by an infielder, GB fielded by pitcher/catcher). So I probably should parse GB/FB team data from Gameday for use with 2008 RZR data.
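The simpler infield estimate, GB - BIZ, can be sketched the same way. The function names and team totals below are made up, and dividing OOZ plays by these chances to get a rating is an assumption carried over from the "plays per chances" wording earlier in this comment.

```python
# Illustrative sketch only: names and numbers are made up.

def infield_ooz_chances(team_gb, infield_biz):
    """Ground balls that were not charged as a ball in zone to any infielder."""
    return team_gb - infield_biz


def infield_ozr(ooz_plays, team_gb, infield_biz):
    """ASSUMPTION: out of zone rating as OOZ plays per (GB - BIZ) chance."""
    chances = infield_ooz_chances(team_gb, infield_biz)
    if chances <= 0:
        raise ValueError("GB must exceed BIZ for the estimate to make sense")
    return ooz_plays / chances


# Hypothetical team: 2200 ground balls allowed, 2000 infield BIZ, 300 OOZ plays.
chances = infield_ooz_chances(2200, 2000)
rating = infield_ozr(300, 2200, 2000)
print(chances, rating)
```

With these invented totals the infield records more OOZ plays than it has GB - BIZ chances, so the GB count and the BIZ count need to cover the same set of plays (for example, excluding grounders fielded by the pitcher or catcher from both) before the subtraction means anything.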
Good stuff, Colin. I haven’t noticed too much of a difference between what I post (which uses OOZ/BIZ) and what Rally posts (OOZ/Inn), but maybe I haven’t checked closely enough. If both you and he favor that approach, it might be time for me to switch over to Rally’s approach for individual fielders.
-j
At a team level there’s no practical difference between the two approaches (and the correlation between OOZ/INN and outfield DER is still negative, although insignificant). At an individual player level, BIZ is probably better because a 2B with x number of innings played has a different distribution of chances than a SS with x number of innings, and I think BIZ would capture that a little better.
Well, it all comes down to the average plays per unit estimate of opportunities, where unit is either BIZ or innings. You (or I, at least) calculate this separately for each position, so the decreased number of chances for a 2B will be reflected in the decreased average plays per unit no matter which unit I use.
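A minimal sketch of that point in Python, with made-up league totals: because the average rate is computed per position, the same number of innings yields a different expected OOZ count at 2B than at SS. The function names, and the step of subtracting the expectation from the player's actual plays, are assumptions for illustration.

```python
# Illustrative sketch only: names and league totals are made up.

def league_rate(lg_ooz_plays, lg_units):
    """Average OOZ plays per unit (innings or BIZ) at one position."""
    return lg_ooz_plays / lg_units


def expected_ooz(player_units, rate):
    """OOZ plays an average fielder at that position would make in the same units."""
    return player_units * rate


def ooz_above_average(player_ooz, player_units, rate):
    """ASSUMPTION: rate the player as actual minus expected OOZ plays."""
    return player_ooz - expected_ooz(player_units, rate)


# Hypothetical league totals by position, using innings as the unit.
rate_2b = league_rate(600, 40000)   # 0.015 OOZ plays per inning
rate_ss = league_rate(800, 40000)   # 0.020 OOZ plays per inning

exp_2b = expected_ooz(1000, rate_2b)
exp_ss = expected_ooz(1000, rate_ss)
ss_plus = ooz_above_average(25, 1000, rate_ss)
print(exp_2b, exp_ss, ss_plus)
```

Swapping innings for BIZ only changes what `player_units` and `lg_units` hold; the structure of the calculation stays the same.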
My rationale for using BIZ has always been that it seemed more likely to reflect the gb/fb or left/right peculiarities of a team’s pitching staff than simply using innings played. But it also might be the case that you get an inverse correlation, because a ball in zone will no longer be available to be a ball out of zone (if that makes sense).
-j
But it also might be the case that you get an inverse correlation, because a ball in zone will no longer be available to be a ball out of zone (if that makes sense).
I don’t think that’s true - like the Ichiro example, above, the ball would have been a BIZ for Ibanez and thus an OOZ opportunity for Ichiro, except for the fact that Ichiro caught it. Maybe something like team BIZ - Plays - OOZ would work, or perhaps you could count a player’s teammates’ IZ opportunities as OOZ opps.
For example, let’s say you want to figure OOZ opps for a CFer. Add up the LF and RF BIZ for the team, prorate out by innings (or BIZ, either way), and use that as the denominator for OOZ plays. I’m at work right now so unfortunately I can’t test any of this.
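Since the idea is untested, here is only a rough Python sketch of the center field example, prorating by innings. The function names and all numbers are made up.

```python
# Illustrative sketch only: names and numbers are made up.

def cf_ooz_opps(team_lf_biz, team_rf_biz, player_cf_innings, team_innings):
    """Neighbors' BIZ (LF + RF), prorated by the share of team innings the CF played."""
    return (team_lf_biz + team_rf_biz) * player_cf_innings / team_innings


def cf_ooz_rate(player_ooz, team_lf_biz, team_rf_biz, player_cf_innings, team_innings):
    """OOZ plays per estimated OOZ opportunity."""
    opps = cf_ooz_opps(team_lf_biz, team_rf_biz, player_cf_innings, team_innings)
    return player_ooz / opps


# Hypothetical team: 300 LF BIZ and 320 RF BIZ over 1440 innings;
# the center fielder played 1000 of those innings and made 60 OOZ plays.
opps = cf_ooz_opps(300, 320, 1000, 1440)
rate = cf_ooz_rate(60, 300, 320, 1000, 1440)
print(round(opps, 1), round(rate, 3))
```

Prorating by BIZ instead would replace the innings ratio with the player's share of the team's CF BIZ.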
Not as good as including BIP data to fill the gaps, but at the team level, BIZ-Plays works better than BIZ or IP for an estimate of OOZ chances.
Like a number of others, I’ve been busy converting the THT data into rankings. However, I believe I’ve come up with a superior treatment of OOZ: the assumption that OOZ opportunities are proportional to BIZ is demonstrably wrong.
Team defensive rankings, and an explanation of the methodology, are at
http://sonsofsamhorn.net/index.php?showtopic=23646
Individual player rankings are at
http://sonsofsamhorn.net/index.php?showtopic=23972
Something’s funny in CF. I have Ichiro near the top and Sizemore near the bottom, and my eyes tell me that that’s closer to reality than the opposite.