

THE BOOK--Playing The Percentages In Baseball


Wednesday, July 11, 2007

Where is the strike zone exactly?

By Tangotiger, 01:23 PM

Yet another in a long series of great research pieces by John Walsh.

He points out some obvious data quality issues, like:


“...all I can say is that one pitch whose recorded location was right in the heart of the strike zone, was actually an intentional ball that was thrown two feet off the plate!”

Generally speaking, the textbook strike zone should be the 17 inches of the plate, plus the inside/outside pitches where the ball nicks the plate.  Since each data point is measured at the center of the ball, the center-of-ball to center-of-ball left/right strike zone would total almost 20 inches.  Umpires, however, are calling the left/right center-of-ball strike zone as 24.1 inches for RH and 24.5 inches for LH.  Dan Fox once pointed out that home plate also has the outside rubber, which could technically be considered part of the plate.  That would bridge some of that gap, but not all.  As well, since John is considering all umpires, it would be interesting to see the left/right range of each umpire.  After all, the good ones would call a 20- or 22-inch left/right strike zone, while the bad ones would call a 25- to 27-inch strike zone.  You won’t have, I don’t think, umpires who call a 17-18 inch strike zone to balance it out.
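
As a rough check on that arithmetic, here is a minimal Python sketch. It assumes a regulation ball circumference of 9 to 9.25 inches (so a diameter of about 2.9 inches); the function name is made up for illustration.

```python
import math

PLATE_WIDTH_IN = 17.0  # width of home plate, as stated in the post
# Assumption: regulation ball circumference is 9 to 9.25 inches; use the midpoint.
BALL_DIAMETER_IN = 9.125 / math.pi


def center_to_center_width(plate_width=PLATE_WIDTH_IN, ball_diameter=BALL_DIAMETER_IN):
    """Width of the zone measured between ball centers.

    A ball that just nicks either edge has its center one radius off the
    plate, so the zone grows by one radius per side (one diameter in total).
    """
    return plate_width + ball_diameter


rulebook = center_to_center_width()
print(f"rulebook center-to-center width: {rulebook:.1f} in")

# Observed widths quoted in the post: 24.1 in (RH) and 24.5 in (LH).
for label, observed in (("RH", 24.1), ("LH", 24.5)):
    print(f"{label}: umpires call {observed - rulebook:.1f} in wider than the rulebook")
```

Under that assumption the gap works out to a bit over four inches, or roughly two inches per side.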

John also shows a skew depending on the handedness of the batter.  I have to confess I don’t know exactly where the umpire positions himself.  In yesterday’s All-Star game, the old man got knocked heavily twice on his left-side, with righties at bat.  I didn’t pay attention as to where he was positioned when lefties were at bat.  I’m guessing the ump positions himself away from the catcher’s throwing hand, so, he would be farther away from a LHB than a RHB.

Fun stuff…

#1    Fargo      (see all posts) 2007/07/11 (Wed) @ 15:35

Pretty interesting stuff.  One thing to pursue for sure, given the asymmetrical relation between the handedness of the batters and the width of the zone, is the handedness of the pitchers, or more particularly the relationship between the shape of the zone and the combination (or interaction) between hitter and pitcher handedness.


#2          (see all posts) 2007/07/11 (Wed) @ 16:34

I think the All Star ump (Fremming, or Froeming, or something) is definitely an outlier in terms of positioning.  I’ve never seen someone get hit with as many foul tips as he does.  Jerry Remy, during Red Sox broadcasts, points this out every time he’s behind the plate… and sure enough, he’ll get tagged once or twice during the game.  I think he lines up significantly off-center, relative to the rest of the umpiring community.  I mean, he is lined up farther “outside” relative to the hitter, than most umps, who generally line up their torso almost directly behind the catcher.


#3    John Walsh      (see all posts) 2007/07/11 (Wed) @ 16:52

Regarding the “black”—it’s definitely not part of the strike zone and shouldn’t be considered such. The expression “on the black” makes you think otherwise, but don’t be fooled. So, it doesn’t really bridge any gap.

As for variance among umpires, that can be inferred indirectly from the sharpness of the ball/strike transition. I hope to have something about that in a future piece.


#4    Tangotiger      (see all posts) 2007/07/11 (Wed) @ 17:35

Sharpness: oooh, great idea. 

Ideas for umpire splits:
- handedness of batter
- handedness of pitcher
- amount of left/right break (little/a lot)
- amount of up/down break (little/a lot)

So, present the ball/strike transition for each split (meaning 8 different results), and size permitting, the ball/strike transition for the 16 possible combinations.
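
To make the counting concrete, here is a small Python sketch that enumerates the splits. The dictionary keys and level labels are illustrative names, not fields from any actual data set.

```python
from itertools import product

# The four binary splits listed above; the labels are illustrative.
SPLITS = {
    "batter_hand": ("L", "R"),
    "pitcher_hand": ("L", "R"),
    "horizontal_break": ("little", "a lot"),
    "vertical_break": ("little", "a lot"),
}

# One result per level of each split: 4 splits x 2 levels = 8 results.
single_split_results = [
    (name, level) for name, levels in SPLITS.items() for level in levels
]

# Every combination of levels across the four splits: 2**4 = 16 cells.
combinations = [dict(zip(SPLITS, levels)) for levels in product(*SPLITS.values())]

print(len(single_split_results), len(combinations))
```

Each of the 16 cells holds only a fraction of the called pitches, which is why the full breakdown depends on sample size.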

And, if Froemming does set himself outside a lot, then capture each umpire, and note his position.  Where is his head relative to the catcher’s head?  Is there a bias there?


#5    MGL      (see all posts) 2007/07/12 (Thu) @ 08:32

Fascinating stuff and great article.  I think that data integrity is indeed a significant problem and will continue to be.  How much it will affect various analyses depends on what is being analyzed and the methodology used.

BTW, brilliant method for determining the “average” strike zone width and height!  I am not sure I would have thought of that (looking at pitches in the middle of the zone) myself!

John, you are not explaining how you arrive at the rule book zone for each batter.  I assume you are normalizing each batter’s data (so for example, a pitch that is right at the top of Richie Sexson’s rule book zone is recorded by you as 3.66 feet above the ground, even though it might be 4 feet above the ground?).  Does the data include each batter’s rule book zone in height (top and bottom obviously)?  IOW, the operators look at video of each batter just before they swing, mark “just below the knees” and that “rule book top” on a computer screen and the computer figures out the height in feet?

In any case, there is NO WAY IN HECK that the average umpire calls a rule book strike at the top of the zone for RHB!!!!!!!!!!

So there is something else wrong with the data!  I don’t know what else to say about that.  Some umps still call the old pre-2001 strike at the top of the zone, which was approximately the belt.  Some umps call it higher than that, one to two balls above the belt.  Virtually no umps call the rule book high strike.

So, there is no way in heck that the average ump calls a rule book high strike! 

So someone please tell me what is going on with John’s results of the top of the zone for RHB?????


#6    John Walsh      (see all posts) 2007/07/12 (Thu) @ 08:55

Thanks, MGL.

I mentioned that I mapped the individual’s sz onto a standard one, but I didn’t explain the details. The data provided by Gameday is 1) the height of the pitch, 2) the lower limit of the particular batter’s sz and 3) the upper limit of same. An mlb operator sets those sz limits at the beginning of the AB. What I do with that data is to map it onto a “standard” vertical sz that goes from 1.6 to 3.56 feet. Explicitly:

zprime = 1.6 + [(z - sz_bot)/(sz_top-sz_bot)]*(3.56-1.6)

So, yes, Sexson’s pitch might actually be 4 feet above the ground, but my zprime value might come out 3.6 feet.
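
A minimal Python sketch of this mapping follows. The function name and the example batter's zone limits (1.9 to 4.0 feet) are made up for illustration and are not taken from the Gameday data.

```python
STD_BOT = 1.6   # bottom of the standard zone, in feet
STD_TOP = 3.56  # top of the standard zone, in feet


def standardize_height(z, sz_bot, sz_top):
    """Map pitch height z (feet) from a batter's own zone onto the standard zone."""
    if sz_top <= sz_bot:
        raise ValueError("sz_top must be greater than sz_bot")
    fraction = (z - sz_bot) / (sz_top - sz_bot)  # 0 at the bottom, 1 at the top
    return STD_BOT + fraction * (STD_TOP - STD_BOT)


# Hypothetical tall batter whose zone runs from 1.9 to 4.0 feet.
print(standardize_height(4.0, sz_bot=1.9, sz_top=4.0))  # pitch at the top of his zone
print(standardize_height(1.9, sz_bot=1.9, sz_top=4.0))  # pitch at the bottom of his zone
```

A pitch at the top of any batter's zone lands at 3.56 feet and a pitch at the bottom lands at 1.6 feet, whatever the batter's actual height.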

I too was surprised at the high-strike result, but as we always say, if the results were never surprising, there’d be no reason to do the analysis. (grin)

Seriously, I can think of a few reasons: there could be some problem with the data; I might have made some mistake in the analysis, but since I did exactly the same thing for RH and LH batters, that seems unlikely. Maybe the mlb operators are underestimating where the upper limit of the strike zone should be.

Another possibility, and one that I judge most likely, is that the result is correct and our perception is off. We see pitches from the CF camera which is elevated with respect to the playing field. We also tend to judge the pitch from where the catcher catches it, which is typically a few feet behind the front of home plate. These two effects will tend to give the illusion of a much lower pitch.  If I knew how high CF cameras were situated off the ground, I could attempt a calculation of the apparent pitch height.


#7    Fargo      (see all posts) 2007/07/12 (Thu) @ 11:33

One more question, related to front vs. back of the plate (or front of plate to catcher’s mitt), do the data reflect whether the ball ever passed through the SZ or do they capture where it is at the leading edge of the plate?

What happens to “back door” sliders, etc.?

That may be one reason that Tango wants to see breaking ball splits.


#8    John Walsh      (see all posts) 2007/07/12 (Thu) @ 11:48

The location data is defined for when the ball crosses the front plane of home plate. Pitches that enter the sz from the side or top will appear to be out of the strike zone.


#9    tangotiger      (see all posts) 2007/07/12 (Thu) @ 11:59

zprime = 1.6 + [(z - sz_bot)/(sz_top-sz_bot)]*(3.56-1.6)

Another “ooooh”.  Ok, so what John is doing is better expressed by rearranging the terms this way:
(z - sz_bot)
*
(3.56-1.6)/(sz_top-sz_bot)

So, that last line simply rescales the batter’s actual strike zone (sz_top minus sz_bot) into a fixed distance of 1.96 up/down feet.

So, a guy with a true up/down of 4 feet, and another guy with a true up/down of 1 foot simply gets rescaled to 2 feet.  So, a “down the middle” pitch to the first batter is 2 feet above his lower limit, and the second batter’s down the middle is 0.5 foot above the lower limit.

John simply scales it so that they each show, on the graph, as being 1 foot above the lower limit.  In essence, stretching up/down to fit to the “average”.  Great idea, in theory.

***

Suggestion: I think it might be beneficial to break down the hitters into those with tight strike zones and long strike zones.  After all, an umpire might not appreciate a strike zone like Rickey Henderson the way an operator would.  So, that’s a further split to show: do umpires have a bias against short/tall strike zones?

Also note that we really don’t know what the operator is doing in terms of recalibrating.  What if he recalibrates after each batter’s PA, based on how the umpire is calling the balls/strikes?


#10    Fargo      (see all posts) 2007/07/12 (Thu) @ 13:39

My hunch is that umps adjust for the type of pitch(er). One way that we all recognize (or think we recognize) is that certain pitchers are “given” the outside of the plate (“the black” and beyond?) and others aren’t.

Another way is that an experienced ump knows if a certain pitcher likes to throw backdoor strikes. Even the catcher may cue the ump in, not just by framing the pitch but by a comment.

Not to take you too far off the main course, but it seems to me that some pitcher-specific analysis of sz’s for pitchers with “known” tendencies, could be useful. Do they get “significantly” more called strikes in a given location than average?


#11    John Walsh      (see all posts) 2007/07/12 (Thu) @ 16:32

Tango: regarding recalibrating, it appears from the data that the SZ for any batter is set only for their 1st PA and then is retained for the rest of the game.  Makes sense, as long as the operator gets it right the 1st time! In any case, he is not influenced by the umpire’s calls.


#12    Tangotiger      (see all posts) 2007/07/12 (Thu) @ 16:50

I had another thought.... a lot of focus on pitchers… how about hitters?  Say, Vlad and Bonds?

Do hitters who get the ball call in the “50%-70% league zones of ball calls” at say a 75-100% clip get a higher ball call in the “20%-40% league zones of ball calls”?

That is, how much stretching of strike zones is done at the hitter level?

Note again, this should be controlled for fastball/breaking pitches.


#13    MGL      (see all posts) 2007/07/12 (Thu) @ 22:50

Sorry I don’t buy the angle of the CF camera theory.  No way.  I have seen enough replays where they give you the side angle.

There are surprises and there are surprises.  As I said before, this one is no way and I’ll tell you how I can prove it in a roundabout way (among many).

Clearly there are umps who call a high strike and umps who do not.  But…

there are no umps who call a high strike higher than the rule book.  None.  If they did, batters would revolt like you’ve never seen a revolt.  In order for the AVERAGE top of the strike zone to be equal to the rule book, there would have to be roughly half the umps who call a strike HIGHER than that.  No way!

Something is wrong.  I have watched 300 games a year for 20 years.  The average top of the sz is well below the rule book.  This is almost unequivocal.  I do not think John did anything wrong, but if he tells me that according to the data and their methodology in reporting it, that the average top of the sz is equal to the rule book, I’m sorry but I can’t believe any of the data because that is just flat out wrong.

In fact, in 2001 when Alderson sent out that edict, here is what happened which is supported by the ball, strike, rpg, and HR data:  When the edict came out many umpires stubbornly refused to follow it and continued to call anything above the belt a ball.  The next year or two, more umpires got in line and called a higher strike.  After a few years passed everyone, including the league (and of course Alderson is now with the Pads I think), pretty much stopped enforcing the “edict” and the sz slipped down a little.  Now, there is still some variability among umps as to the height of the sz, but in general most umps don’t call a strike much above the belt and if they do they get glares from the batters.

The chance that I am wrong and that the average ump’s top of the zone is equal to the rule book - less than 1/2 of one percent.


#14    tangotiger      (see all posts) 2007/07/12 (Thu) @ 23:54

John marked it as 1.96 feet, which I guess would be his standard for the average batter of 6 feet. 

I’m 5’10, and my standard strike zone would be fairly close to that.  If I just go to my belt, it comes in at 20 inches (1.67 feet). 

It could very well be that John’s redbox here:
http://www.hardballtimes.com/images/uploads/sz_results.png

Is an inch or two too low.

However, maybe it’s also poor data.

John, can you present the frequency of all pitches for this chart:
http://www.hardballtimes.com/images/uploads/vertical_sz.png

Where pitchers are throwing will show you where THEY think the strike zone tops off at.


#15    tangotiger      (see all posts) 2007/07/13 (Fri) @ 00:04

John, as well, remind me exactly where the pitch is measured on the x-axis?  Where it crosses the plate?  In the catcher’s mitt?  This might explain it too.


#16    Nathaniel Dawson      (see all posts) 2007/07/13 (Fri) @ 01:53

So you have the strike zone presented as an exact rectangle. But watching the strike zone as it’s actually called on the field, I’ve always thought of it more like an oval, where pitches that are medium-high but a bit outside are often called strikes, but a pitch in the same horizontal location but either above or below medium height (like at the corners of that rectangular strike zone) will more often be called a ball.

If I’m understanding your methodology correctly, to look at horizontal calls, you are selecting out for pitches above and below a certain zone. And of course, the same would be true in the opposite axis for vertical location.

So doesn’t that leave us with no information on the corners of the strike zone? You can say that the strike zone is 24” wide, but could it not be, say, 26” wide at medium-height, and 20” wide at the top or bottom of the zone?


#17    tangotiger      (see all posts) 2007/07/13 (Fri) @ 07:04

Another “ooooh”.  (Aside: this is why sabermetrics has to be practiced by baseball-loving fans.  You need to appreciate these nuances.)

I think Nathaniel has a great point.  A pitch at thigh level just outside (rulebook ball) might look more like a strike than a similar outside pitch at bottom-knee or above-belt level.

While I like John’s basic rescaling process, when it comes to testing this oval-based strike zone theory, that would be best studied with guys with the same height, bat/pit handedness, and fastballs only.  The rest of the data would be noise that would cover up the exact points of the strike zone (the four corners) that we’re interested in.


#18    John Walsh      (see all posts) 2007/07/13 (Fri) @ 07:51

Agreed Tango, lots of great food for thought here. Here are some answers to questions posed above:

Tango: the pitch location is measured at the front of home plate.

Regarding stretching of the strike zone, see Dan Fox’s recent article on this very subject. He doesn’t find much of an effect.

Fargo: good point about the 3D sz. I (necessarily) make the approximation of a 2D sz placed at the front edge of the plate. Pitches can indeed enter the strike zone from the side or (more frequently) from above. This effect, though, would be pretty hard to incorporate in the measurement.

Nathaniel: yes, I agree. I looked at this in a qualitative way and find a sz that does have the corners rounded off. It’s not so straightforward (from a tech standpoint) to define that contour, though.

MGL: I’ve been doing some checks and have come up with a couple of things: 1) I made a mistake in the vertical rulebook sz: I did not add a ball radius to the top and bottom of the zone. That will expand the red box, but will not shrink the green box. 2) Some pitches will enter the 3D strike zone from above, not through the front plane. If you could compensate for this, the net effect would be to expand the rulebook sz slightly. 3) I’m finding a fair amount of variation in the lower and upper limits of the sz as provided in the data. As an example, the lower limit of the sz for Jeter ranges from 1.59 to 2.06 feet, with a median of 1.66. Now unless Jeter’s shins are changing length from one AB to another, his lower sz should be very constant. BTW, I estimated Jeter’s sz myself from a photo I found and I got 1.64 feet, matching well with the median value from mlb.  The upper limit of the sz varies quite a bit, also.

I’m thinking a better way to do this is to measure each batter’s sz based on all his ABs and then apply that to every pitch, regardless of what the mlb data has for any given pitch.  This would assume that he takes the same stance every time, which is probably a reasonable assumption in most cases.
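
A rough Python sketch of that idea follows, assuming the median is used as the per-batter summary (a mean with outliers trimmed would be another option). The function name is made up, and the sample records are invented: only the 1.59 to 2.06 range of the lower limit comes from the numbers above.

```python
from collections import defaultdict
from statistics import median


def batter_zone_limits(pitches):
    """Return {batter: (median sz_bot, median sz_top)} over all of his pitches.

    `pitches` is an iterable of (batter, sz_bot, sz_top) tuples in feet.
    The median is used so a few bad operator settings do not move the result.
    """
    bottoms, tops = defaultdict(list), defaultdict(list)
    for batter, sz_bot, sz_top in pitches:
        bottoms[batter].append(sz_bot)
        tops[batter].append(sz_top)
    return {b: (median(bottoms[b]), median(tops[b])) for b in bottoms}


# Hypothetical records: individual values and all upper limits are invented.
sample = [
    ("Jeter", 1.59, 3.40),
    ("Jeter", 1.66, 3.45),
    ("Jeter", 1.64, 3.50),
    ("Jeter", 1.70, 3.48),
    ("Jeter", 2.06, 3.90),
]
limits = batter_zone_limits(sample)
print(limits["Jeter"])
```

The fixed per-batter limits would then replace the per-pitch sz_bot and sz_top values when rescaling each pitch.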


#19    Tangotiger      (see all posts) 2007/07/13 (Fri) @ 08:10

Those are more data quality issues.

However, what if the operator correctly marked it at “2.06 feet”, meaning that it’s really 1.64 human feet.

If a pitch comes in at “2.16” feet, you can convert that to 1.74 human feet.

That is, the operator is simply calibrating his scale to whatever he’s working with, and if he shows “2.06 feet”, I would consider that “2.06 quatlus at Fenway, Apr 7, 2007”.  You then have to scale that to human feet.

Treat the game itself as the universe.

So, if you do that, if you look at that Jeter game at 2.06 quatlus, does it look like the ball/strike calls make more sense?  Or do the ball/strike calls look right as if the rest of the data were in human feet, and Jeter’s 2.06 was simply wrong?
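
Here is a minimal Python sketch of that per-game conversion. It assumes the operator's scale is off by a constant additive offset, which is what the 2.16 to 1.74 example implies; a multiplicative rescaling would be a different assumption. The function and parameter names are hypothetical.

```python
def to_human_feet(recorded, recorded_sz_bot, true_sz_bot):
    """Shift a recorded height by the game's offset in the batter's lower limit.

    Assumes the whole game is off by one constant amount (additive offset).
    """
    offset = recorded_sz_bot - true_sz_bot
    return recorded - offset


# Lower limit recorded as 2.06 when it is really 1.64; a pitch recorded at 2.16.
print(round(to_human_feet(2.16, recorded_sz_bot=2.06, true_sz_bot=1.64), 2))
```

If the converted heights line up better with the umpire's calls in that game than the raw ones do, that would point toward a calibration offset and away from a one-off operator error.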


#20    John Walsh      (see all posts) 2007/07/13 (Fri) @ 08:57

BTW, judging the vertical strike zone on TV using the CF camera can be very misleading. Because the ball is going down and there is a sizable distance between the front edge of home plate, the position of the batter and the point where the catcher receives the pitch, pitches tend to look lower than they actually are. Quite a bit lower.

I took some measurements off my Jeter photo and found that Jeter’s position (middle of chest) is about 31 inches back of the front of home plate. He stands at the back of the box, like most batters. The catcher’s mitt is 58 inches back from the plate, or 27 back of the batter. 

Now, when we judge a pitch on TV, we necessarily compare where it hit the catcher’s glove to the position of the batter’s belt/shoulder mid-point. Actually, a typical pitch will drop from 4 to 8 inches while it travels from batter to catcher, so our judgment would be off by that much.

Even the side view doesn’t fully compensate. When we see a pitch from the side (which is fairly rare, they don’t show very many of those replays), we judge where it passed the batter—across the letters, belt, whatever. But the sz is defined at the front of home plate, around 2.5 feet towards the pitcher. Again, the downward-moving pitch drops around 3-6 inches between those two points.

I’m not saying this explains everything, but it’s something to be aware of.


#21    MGL      (see all posts) 2007/07/13 (Fri) @ 17:48

If an umpire were to actually follow the rule book to a “tee” would the top of the strike zone be the “midpoint” extended from the batter to the front of home plate along a line parallel to the ground?  If yes, then if a ball crosses the batter’s belt I guess it would cross the front of the plate 6 inches or so higher, according to what John says above.  Or is the rule book zone supposed to be when the ball passes the batter such that it will be higher when it crosses the front of the plate?


#22    John Walsh      (see all posts) 2007/07/13 (Fri) @ 17:59

MGL:

Here’s the definition, which I quoted in my article:

The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the knee cap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.

“Area over home plate” implies the first thing you wrote above. Of course, whether umps are actually attempting to call the sz that way is another question.


#23    Tangotiger      (see all posts) 2007/07/25 (Wed) @ 08:26

John is back for more:

http://www.hardballtimes.com/main/article/the-eye-of-the-umpire/

***

I like what he does in using the average operator definition of each batter’s strike zone, rather than game-by-game.  But of course, that might cause its own problems, since each system is tweaked per park.  Perhaps we should only look at home parks for each player (or parks where they played at least 6 games).  And even at that, any road park that is off by more than 1 inch from the home park should probably be discarded altogether.


#24    John Walsh      (see all posts) 2007/07/25 (Wed) @ 09:53

Tango,

I believe the variation is due to operator error and not to a hardware/software problem at particular parks.  So, I think taking the average over all pitches (with some outliers removed, which I did) is the right way to go.


#25    David Smyth      (see all posts) 2007/07/25 (Wed) @ 10:09

That the actual strike zone is wider horizontally but narrower vertically shouldn’t be a surprise to anyone who watches games. But J Walsh seems to be implying that it’s this way because of the lack of eyesight of the umps. It seems to me that it’s more likely a conscious or subconscious effort by the umps to stick to the ‘spirit’ of the strike definition--a pitch which can be struck well with a normal swing. A normal swing in today’s power game is different than it was 30 years ago.

So I don’t have a problem with it. The batters and pitchers are supposed to understand what’s going on, and that a pitch an inch or two outside is going to be a strike, and a high strike is going to be a ball.

It would be better, of course, if they just widened the plate, and changed the verbal description of the vertical limits.


#26    Guy      (see all posts) 2007/07/25 (Wed) @ 10:48

John:
A suggestion for future research:  we know that a big part of home-field advantage is a better K/BB ratio for home hitters (greater in some parks than others).  It would be very interesting to see how much of this, if any, results from umpires calling a tighter strikezone for home hitters than visiting hitters, presumably an unconscious response to fan pressure.  If there’s no difference, then we know it’s entirely a function of the home hitters and/or visiting pitchers, such as hitters’ greater familiarity with hitting background, better rest at home, etc.


#27    tangotiger      (see all posts) 2007/07/25 (Wed) @ 11:24

Great idea Guy.

You may also find this useful:
http://www.insidethebook.com/ee/index.php/site/comments/run_impact_in_parks#13

The VISIBILITY column is K minus NIBB per PA between home and away.


#28    Guy      (see all posts) 2007/07/25 (Wed) @ 11:34

Thanks.  But if SD is at the top, and CO at the bottom, shouldn’t you call it “invisibility?”


#29    John Walsh      (see all posts) 2007/07/26 (Thu) @ 03:17

David,

I didn’t mean to imply that the size of the strike zone was due to umpires’ eyesight. In fact, I sort of implied the opposite when I noted that the level of accuracy did not affect the width of the zone (see 3rd plot in the article).

I agree with your comments on why umps call the zone they do, but I wonder what is the cause and what is the effect:  Do batters stand on top of the plate so they can reach that “strike” 6 inches outside, or do umps call that pitch a strike because batters are standing on the plate?

I also think it’s very bad to have a rule on the books that isn’t enforced—that leaves too much power to the umpires (or the police, if we’re talking about real life) to choose when, and against whom, to enforce the rule. Bill James wrote about this issue with respect to the George Brett pine tar incident, I believe.

Of course, I don’t expect a change in the size of home plate any time soon.


#30    Sam      (see all posts) 2007/11/06 (Tue) @ 01:31

Take a look at the strike zone revealed.

http://3dkzone.com

