Wednesday, January 11, 2012
Umpire strike zone size
Josh takes a look.
I must say, this looks like a bit of a step back from others that I’ve seen. Suppose that an umpire has pitchers that throw a lot of pitches in the middle of the strike zone. The way Josh calculates it, those pitches are included in his metric.
The way the other guys have done it, they looked for a “contour” of x-inches of width where the called strike to called ball ratio was 1. Basically, the wide pitches and the center pitches tell us nothing about the umpire.
Or, do they? Perhaps, rather than a step back, Josh took a step to the side. If a pitcher is throwing a lot of pitches in the middle of the strike zone, he may be doing it in ANTICIPATION of a smaller strike zone from the umpire. So, if an umpire happens to see a lot of down-the-middle pitches, then that may tell us something. Except... Josh removed any pitches that the batter swung at.
So, I think there are a lot of different considerations here, as to exactly what it is that Josh (or the interested reader) actually wants. I’m not sure whether Josh answered his intended question or some other question. And it’s quite possible that the huge sample size mitigates any of this anyway, since umpires are not paired with pitchers.
Anyway, lots of things to consider, and reformulate…
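For readers who want to see the contour idea from above in concrete terms, here is a minimal sketch in Python. It is not Josh's code or anyone else's; the function name, the 2-inch bin width, and the toy data are all made up for illustration. It bins taken pitches by horizontal distance from the center of the plate and, walking outward, reports the start of the first bin where called balls outnumber called strikes (that is, where the strike-to-ball ratio drops below 1).

```python
from collections import defaultdict

def edge_where_ratio_is_one(pitches, bin_width=2.0):
    """Rough horizontal edge of the called zone, in inches from center.

    pitches: iterable of (distance_from_center_inches, was_called_strike)
    for taken pitches only. Returns the inner boundary of the first bin
    (walking outward) where balls outnumber strikes, or None if there is
    no such bin. Hypothetical helper, for illustration only.
    """
    strikes = defaultdict(int)
    balls = defaultdict(int)
    for dist, is_strike in pitches:
        bin_index = int(abs(dist) // bin_width)
        if is_strike:
            strikes[bin_index] += 1
        else:
            balls[bin_index] += 1
    for bin_index in sorted(set(strikes) | set(balls)):
        if balls[bin_index] > strikes[bin_index]:
            return bin_index * bin_width
    return None

# Toy data (invented): (inches from center, called strike?)
toy_pitches = (
    [(1, True)] * 40
    + [(5, True)] * 30 + [(5, False)] * 2
    + [(9, True)] * 20 + [(9, False)] * 10
    + [(11, True)] * 8 + [(11, False)] * 12
    + [(15, False)] * 25
)
toy_edge = edge_where_ratio_is_one(toy_pitches)

# Pile on extra down-the-middle strikes and the edge does not move.
more_center = toy_pitches + [(0.5, True)] * 1000
center_heavy_edge = edge_where_ratio_is_one(more_center)
```

In this toy setup the extra center pitches leave the estimated edge where it was, which is the sense in which center (and very wide) pitches say nothing about the umpire under the contour approach.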


Having experience with the method that Josh is using to measure the zone, I think his methodology for zone measurement itself is perfectly fine. He essentially does what you say above: uses the contour at which Pr(Strike)=Pr(Ball), given the location. This is the methodology I use for my zone comparisons, and it is superior to other methods I’ve seen, IMHO. It is precisely the technique I use in two academic papers I have under construction.
The advantage of the method is that it doesn’t really matter what the pitcher anticipates from the umpire with regard to zone size. We still find the 50% contour that you describe with his method. This is the case even if 100% of the pitches are thrown either down the middle or well outside the zone (though the imputation would be less accurate; for most umpires, there isn’t any serious issue with having a full distribution of pitch locations).
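A toy simulation can illustrate the point. This is only a sketch under my own assumptions, not the estimation actually used by Josh or in my papers: it works in one dimension (distance from the center of the plate, in feet), assumes the probability of a called strike falls off smoothly with distance, and estimates the 50% point as the cutoff that minimizes miscalls (balls inside plus strikes outside). When the strike probability decreases with distance, that cutoff sits where Pr(Strike)=Pr(Ball). The names TRUE_EDGE, STEEPNESS, estimate_edge and simulate are all hypothetical.

```python
import math
import random

TRUE_EDGE = 0.83   # assumed true half-width of the called zone, in feet
STEEPNESS = 10.0   # assumed rate at which strike probability falls off

def called_strike(dist, rng):
    """Simulate one umpire call for a taken pitch at the given distance."""
    p_strike = 1.0 / (1.0 + math.exp(STEEPNESS * (dist - TRUE_EDGE)))
    return rng.random() < p_strike

def estimate_edge(calls):
    """Cutoff minimizing miscalls: balls inside it plus strikes outside it."""
    calls = sorted(calls)
    # With the cutoff at 0, every called strike counts as a miscall.
    errors = sum(1 for _, is_strike in calls if is_strike)
    best_errors, best_edge = errors, 0.0
    for dist, is_strike in calls:
        # Move the cutoff just past this pitch, so it is now inside.
        errors += -1 if is_strike else 1
        if errors < best_errors:
            best_errors, best_edge = errors, dist
    return best_edge

def simulate(draw_distance, n=20000, seed=1):
    """Draw n pitch distances, simulate the calls, and estimate the edge."""
    rng = random.Random(seed)
    calls = []
    for _ in range(n):
        dist = draw_distance(rng)
        calls.append((dist, called_strike(dist, rng)))
    return estimate_edge(calls)

# Same simulated umpire, two very different pitch distributions.
spread_out = simulate(lambda rng: rng.uniform(0.0, 2.0))
down_the_middle = simulate(lambda rng: abs(rng.gauss(0.0, 0.5)))
```

Both estimates should land close to TRUE_EDGE even though the second set of pitchers lives in the middle of the plate. The estimate only gets shaky when there are very few pitches near the edge, which is the accuracy caveat mentioned above.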
If pitchers are throwing to the middle of the plate more because of a smaller strike zone, the hypothesis is that we would pick it up in the kwERA (easier pitches to hit, and/or possibly more walks if pitchers aren’t changing their behavior, since fewer pitches are being called strikes). As you mention, there are important considerations here with non-called pitches as well that may or may not be accounted for.
I agree it would also be interesting to look at the overall average pitch location by umpire, but as for calculation of the strike zone size itself (leaving aside its impact on the game), I find this method to be the most appropriate one out there.
Of course, the contour chosen is up to the researcher, but I also use the same choice that Josh does here (the 50% contour, i.e. more likely to be called a strike than a ball means “in the strike zone” and less likely to be called a strike than a ball means “outside the strike zone”).
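As a rough illustration of how that rule turns into a zone size, here is a minimal Python sketch. It assumes taken pitches have already been counted on a grid of square cells; the grid, the 2-inch cell size, the function name, and the counts are all invented for the example, and a real analysis would smooth the call probabilities rather than use raw cell counts.

```python
def called_zone_area(cells, cell_size=2.0):
    """Area, in square inches, of the cells classified as in the zone.

    cells: dict mapping a (column, row) grid index to a
    (called_strikes, called_balls) pair for taken pitches in that cell.
    A cell is in the zone when strikes outnumber balls, i.e. when the
    observed called-strike rate is above 50%. Ties count as outside.
    Hypothetical helper, for illustration only.
    """
    in_zone = [
        key for key, (strikes, balls) in cells.items()
        if strikes > balls
    ]
    return len(in_zone) * cell_size * cell_size

# Invented counts for four cells moving outward from the middle.
toy_cells = {
    (0, 0): (50, 1),
    (1, 0): (30, 10),
    (2, 0): (9, 11),
    (3, 0): (0, 40),
}
toy_area = called_zone_area(toy_cells)

# Many more pitches in the middle cell leave the area unchanged.
center_heavy = dict(toy_cells)
center_heavy[(0, 0)] = (5000, 100)
center_heavy_area = called_zone_area(center_heavy)
```

Only the cells where the call flips from mostly strikes to mostly balls decide the area, so the number of pitches in any one cell matters only for how reliably that cell is classified.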