THE BOOK--Playing The Percentages In Baseball

Monday, April 06, 2009

Umpire strike zones by count

By Tangotiger, 10:28 AM

Dave Allen continues to make his mark:

In fact, all else equal, each strike in the count decreases the likelihood of a pitch being called a strike by the same amount as a pitch being one inch further away from the center of the zone (roughly equal estimates). The number of balls is also significant, but the effect is less than half of that of the number of strikes (you can see in the image of strike zone area above, area decreases more as you increase strikes than it increases as you increase balls). The length of break is also significant: pitches with lots of break are slightly more likely to be called a strike. Once we control for break and count, there is no significant difference in how the strike zone is called for different pitch types.


#1    MGL      (see all posts) 2009/04/06 (Mon) @ 13:48

I posted this comment on the site:

Great, great stuff.  I am more than a little skeptical of the swing rates on pitches outside of the zone. Swinging at more pitches at a 3-2 count than at a 2-2 count just seems implausible, as does the .568 itself (can batters possibly be swinging at so many bad pitches - balls - when they are 1 ball away from being walked?).

Dave, in calculating those swing rates on balls outside of the strike zone at various counts, which strike zone did you use?  The overall one (at all counts) or the one for that count only?

Another explanation for the smaller strike zone as the number of strikes increases is this (other than the umpire making a conscious decision to change his zone with the count):

When a batter takes a pitch with more strikes, he tends to be fooled by the pitch, either because he was expecting something other than what he got, or because of the pitch itself (a very big breaking curve, for example).  The umpire will tend to be fooled as well.  And of course, if a batter takes a borderline pitch with, say, an 0-2 count, the umpire often thinks, “It must have been a ball for the batter to take that pitch with 2 strikes...”


#2    Tangotiger      (see all posts) 2009/04/06 (Mon) @ 13:58

From 1993-2008, batters swung at 73% of all pitches thrown on a 3-2 count.  At 2-2, that number is 65%. 

Of the pitches they didn’t swing at, almost the same share, about 85%, were balls in either count.

I agree with MGL’s basic position that batters are not able to properly value the benefit of getting ball 4.


#3          (see all posts) 2009/04/06 (Mon) @ 14:04

MGL, I can imagine a reasonable explanation for a higher swing rate at 3-2.  Namely, you expect the pitcher to be more likely to throw you a strike.  At 2-2, batters may feel like pitchers have one more to waste, and are still trying to get them to chase.


#4          (see all posts) 2009/04/06 (Mon) @ 14:05

By the way, awesome, awesome article.


#5          (see all posts) 2009/04/06 (Mon) @ 14:35

MGL,

For the swing rate in/out of the zone by count I used the 50% strike contour for that count, not the overall one, as the boundary.  The higher swing rate at 3-2 than 2-2 is surprising.  I think one reason is pitch type frequency.  At 3-2 a fastball is much more likely than at 2-2, and batters swing at fastballs more often than other pitches.  I should have broken up the analysis by pitch type.

I am not sure if there is any way we could know, but I tend to agree with your suggestion that the change in strike zone size is not a conscious decision of the umpire.


#6    MGL      (see all posts) 2009/04/06 (Mon) @ 15:05

For the swing rate in/out of the zone by count I used the 50% strike contour for that count, not the overall one, as the boundary.

So when you say, “swing at a pitch outside of the strike zone,” you mean, “a pitch that has a 50% chance of being called a strike or a ball?”

That is completely different and you need to clarify that.  That means that only 50% of those pitches that batters are swinging at that you call “outside the zone” are really balls.  So, those numbers need to be cut in half.

So do I have that right?


#7          (see all posts) 2009/04/06 (Mon) @ 15:25

A pitch right on the boundary has a 50% chance of being called a strike.  All of the pitches outside of that contour will have less than a 50% chance of being called a strike (with a decreasing percentage the further away they are).  The overall percentage of pitches outside of that boundary that are called strikes is much less than 50%.

No matter where you put the boundary, there will be some strikes outside of it and some balls inside of it.  I chose 50% as a compromise.

When Prof. Pepper defined his zone in the bad balls swingers he used 10%, so a much larger strike zone.  Maybe that would have been better here?
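
To see how much the choice of contour matters, here is a minimal sketch, not the code used for the article. It assumes the plate area has already been cut into bins, each with an estimated called-strike probability, and counts a bin as "outside the zone" when that probability falls below the chosen threshold. The function name, the dictionary keys, and every number in `bins` are made up for illustration.

```python
def out_of_zone_swing_rate(bins, threshold=0.5):
    """Swing rate over bins whose called-strike probability is below threshold.

    bins: list of dicts with keys 'p_strike', 'swings', 'pitches'
    (hypothetical layout, chosen only for this sketch).
    """
    swings = 0
    pitches = 0
    for b in bins:
        # A bin exactly on the contour is treated as inside the zone.
        if b["p_strike"] < threshold:
            swings += b["swings"]
            pitches += b["pitches"]
    return swings / pitches if pitches else float("nan")


# Invented bins, ordered from the middle of the plate outward.
bins = [
    {"p_strike": 0.95, "swings": 70, "pitches": 100},
    {"p_strike": 0.60, "swings": 55, "pitches": 100},
    {"p_strike": 0.30, "swings": 40, "pitches": 100},
    {"p_strike": 0.05, "swings": 10, "pitches": 100},
]

rate_50 = out_of_zone_swing_rate(bins, threshold=0.5)  # 50% contour
rate_10 = out_of_zone_swing_rate(bins, threshold=0.1)  # 10% contour, larger zone
print(rate_50, rate_10)
```

With these invented bins, the 10% contour leaves only the farthest bin outside the zone, so the measured out-of-zone swing rate drops. That is the direction you would expect whenever batters swing less at pitches further from the plate.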


#8    MGL      (see all posts) 2009/04/06 (Mon) @ 16:16

I think you want to define the strike zone such that the final tally of pitches swung at “outside of the zone” is the percentage of pitches swung at that are the equivalent of 100% outside the zone.

I don’t know if that makes any sense, but here is what I mean:

For every pitch swung at, use the chance that pitch is in the strike zone as the multiplier.

For example, pitch #1 is swung at (at X count). If that pitch is normally called out of the zone 80% of the time, that is .8 pitches swung at “out of the zone.” If that same pitch is not swung at the next time, that is .8 pitches not swung at “out of the zone.” Add that all up and then present the percentages.

I think that will work better.  I may have got that wrong, but the general idea is right. Those numbers are misleading and that is why they are so high.
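
One way to read the weighting idea above, as a rough sketch only: every pitch carries a weight equal to its chance of being called a ball, swings and takes are tallied with that weight, and the rate is weighted swings over weighted pitches. The function name and the sample pitches are hypothetical, and the ball probabilities are assumed to come from some separate estimate of the called zone.

```python
def weighted_swing_rate(pitches):
    """Fractional out-of-zone swing rate.

    pitches: iterable of (p_ball, did_swing) pairs, where p_ball is the
    assumed probability that the pitch would be called a ball if taken.
    """
    swung_weight = 0.0
    seen_weight = 0.0
    for p_ball, did_swing in pitches:
        seen_weight += p_ball          # fractional "out of zone" pitches seen
        if did_swing:
            swung_weight += p_ball     # fractional "out of zone" pitches swung at
    if seen_weight == 0:
        raise ValueError("sample carries no out-of-zone weight")
    return swung_weight / seen_weight


# Invented sample: the same 80%-ball pitch swung at once and taken once,
# plus a swing at a pitch that is almost always a strike.
sample = [(0.8, True), (0.8, False), (0.1, True)]
rate = weighted_swing_rate(sample)
print(round(rate, 3))
```

The near-certain strike barely moves the result, which is the point of weighting rather than drawing a hard boundary.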


#9    Tangotiger      (see all posts) 2009/04/06 (Mon) @ 16:21

MGL, if I read you correctly, I love the idea.

Basically, if a batter swings at, say, 52% of all pitches, construct the strike zone such that the number of PITCHES thrown in this strike zone is 52% of all pitches.  Brilliant way to set it, without coming up with arbitrary points.

The zone itself won’t necessarily grow and contract evenly around a center point, but could elongate or retract based on wherever the highest percentage of swings are actually made.
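
A loose sketch of that construction, under one possible interpretation: rank location bins by how often batters swing in them, then keep adding bins until the share of pitches thrown inside the growing zone reaches the overall swing rate. The bin labels and counts are invented, every bin is assumed to contain at least one pitch, and with coarse bins the zone can overshoot the target share.

```python
def swing_matched_zone(bins):
    """Pick the highest-swing bins until they hold as many pitches as there were swings.

    bins: dict mapping a bin label to (swings, pitches); labels are hypothetical.
    Returns (list of bin labels in the zone, overall swing rate).
    """
    total_pitches = sum(pitches for _, pitches in bins.values())
    total_swings = sum(swings for swings, _ in bins.values())
    overall = total_swings / total_pitches
    # A zone holding `overall` of all pitches holds total_swings pitches.
    target = total_swings

    ranked = sorted(bins, key=lambda k: bins[k][0] / bins[k][1], reverse=True)
    zone, covered = [], 0
    for label in ranked:
        if covered >= target:
            break
        zone.append(label)
        covered += bins[label][1]
    return zone, overall


# Invented counts: (swings, pitches) per bin.
bins = {
    "middle": (90, 100),
    "edge": (60, 100),
    "off": (35, 100),
    "far": (15, 100),
}
zone, overall = swing_matched_zone(bins)
print(zone, overall)
```

Because bins are added by swing frequency rather than by distance from a center point, the resulting zone can stretch in whatever direction batters actually chase.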


#10          (see all posts) 2009/04/06 (Mon) @ 17:32

MGL, great idea.  That is clearly the best way to do it rather than creating an arbitrary cutoff.  I went back through my data and did just as you said and I get an even HIGHER swing rate outside of the zone.

3-2: 0.58
2-2: 0.53
1-2: 0.49
0-2: 0.43
2-1: 0.42
1-1: 0.38
3-1: 0.37
0-1: 0.35
1-0: 0.27
2-0: 0.25
0-0: 0.19
0-3: 0.05

I think part of the reason these numbers seem high, compared to, say, FanGraphs O-Swing%, is that the rulebook zone, from which they calculate that stat, is so much bigger than the called zone.


#11    MGL      (see all posts) 2009/04/06 (Mon) @ 22:06

What is an 0-3 count? Is that when the umpire forgets the count?

Seriously, though, I am still skeptical of those numbers.

Did you do it this way?

Pitch one: 80% in that zone are called a ball.  Batter swings. 1 pitch thrown, batter swung at .8 pitches outside of the zone.

Pitch 2: 90% called ball, batter does not swing.  Batter has seen 2 pitches so far and has swung at .8, for a percentage of 40% so far.  In your old system, it would have been 50%, assuming that those two pitches were in your “out of zone” area.

Wait, that is not correct!  If we have pitches that are called balls 10% or 20% and the batter does not swing, we can’t keep including them in the denominator.

I’m not sure how to do it off the top of my head. I’d have to think about it a little more. It is something like UZR, where each ball is caught x percentage of the time and you have to give the player credit for catching or not catching it.

OK, let’s say a pitch is called a ball 80% of the time.  Batter takes.  He gets credit for .2 balls.

Then the same pitch comes and he swings. He gets docked .8 balls.

Hmmm…

So for every 10 pitches he swings at 2 and takes 8, which he is supposed to, and gets a net score of zero.

Let’s say that he swings at all of them.  He gets docked .8.  So that is an 80% swing rate??

Let’s say that in your old system, the average batter swings at 50% of your balls “out of zone” but those balls are really only called balls 80% of the time.  In 10 balls he’ll get docked 4 balls (.8 times 5) for swings and he’ll get credit for 1 ball (.2 times 5) for takes, for a net of 3 balls out of 10. Is that an effective swing rate of 30%?

I’m not sure, but that may be one way to do it.
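
The credit-and-dock arithmetic in this comment can be checked with a few lines. This is only a sketch of the scoring as described, with a hypothetical function name: a swing costs the pitch's ball probability, a take earns its strike probability, and the net is divided by the number of pitches.

```python
def net_docked_rate(p_ball, swings, takes):
    """Net 'balls docked' per pitch for pitches sharing one ball probability."""
    docked = p_ball * swings           # penalty for swinging at a likely ball
    credited = (1 - p_ball) * takes    # credit for taking a possible strike
    return (docked - credited) / (swings + takes)


# The three cases from the comment, all with pitches called balls 80% of the time.
ideal = net_docked_rate(0.8, swings=2, takes=8)      # swings 2 of 10
always = net_docked_rate(0.8, swings=10, takes=0)    # swings at all 10
half = net_docked_rate(0.8, swings=5, takes=5)       # swings at 5 of 10
print(round(ideal, 3), round(always, 3), round(half, 3))
```

Up to floating-point rounding, these reproduce the figures in the comment: a net of zero for the batter who swings at 2 of 10, 0.8 for the batter who swings at everything, and 0.3 for the batter who swings at half.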


#12          (see all posts) 2009/04/06 (Mon) @ 23:16

Here is what I did. First separate by count, then I broke up the space into small bins and put each pitch in a bin.  For the i-th bin I have three numbers:

swing_i: the percentage of pitches in that bin swung at

strike_i: percentage of taken pitches in that bin called strikes

num_i: number of pitches in that bin.

Then the out of zone swing percentage is the sum of swing_i*(1-strike_i)*num_i divided by the sum of (1-strike_i)*num_i. 

Is that what you were thinking?
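
For readers who want to try the calculation, here is a minimal sketch of the binned formula as described, for one count at a time. The record layout `(bin_id, swung, called_strike)` and the sample data are assumptions made for illustration, not the actual data format. Bins with no taken pitches are skipped because strike_i is undefined there, which is also an assumption.

```python
from collections import defaultdict


def out_of_zone_swing_pct(pitches):
    """sum(swing_i * (1 - strike_i) * num_i) / sum((1 - strike_i) * num_i).

    pitches: iterable of (bin_id, swung, called_strike); called_strike is
    ignored when swung is True.
    """
    num = defaultdict(int)      # num_i: pitches in bin
    swings = defaultdict(int)   # swings in bin
    takes = defaultdict(int)    # taken pitches in bin
    called = defaultdict(int)   # taken pitches called strikes
    for bin_id, swung, called_strike in pitches:
        num[bin_id] += 1
        if swung:
            swings[bin_id] += 1
        else:
            takes[bin_id] += 1
            if called_strike:
                called[bin_id] += 1

    numer = denom = 0.0
    for b in num:
        if takes[b] == 0:
            continue  # strike_i undefined; skipping the bin is an assumption
        swing_i = swings[b] / num[b]
        strike_i = called[b] / takes[b]
        weight = (1 - strike_i) * num[b]
        numer += swing_i * weight
        denom += weight
    if denom == 0:
        raise ValueError("no out-of-zone weight in sample")
    return numer / denom


# Invented sample: bin A is borderline, bin B is always called a ball.
sample = [
    ("A", True, None), ("A", True, None), ("A", False, True), ("A", False, False),
    ("B", True, None), ("B", False, False), ("B", False, False), ("B", False, False),
]
pct = out_of_zone_swing_pct(sample)
print(round(pct, 3))
```

In this sample bin A gets weight 2 and bin B gets weight 4, so the result leans toward the swing rate in the bin that is more reliably called a ball.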

But I did make a mistake: the numbers I posted above are just for RHB vs. RHP, which might partially explain why they are high.  Here are the numbers for all batters:

3-2: 0.57
2-2: 0.52
1-2: 0.48
0-2: 0.42
2-1: 0.41
1-1: 0.37
3-1: 0.35
0-1: 0.33
1-0: 0.25
2-0: 0.23
0-0: 0.17
3-0: 0.04

Not that much lower, so I am sure you are still skeptical.  Hopefully my explanation of the process is clear and it helps to clear things up.


#13    MGL      (see all posts) 2009/04/07 (Tue) @ 01:32

i’ll have to think about it when i get some time…

