THE BOOK--Playing The Percentages In Baseball


Thursday, November 05, 2009

Shifted Strike Zone

By Tangotiger, 06:02 PM

Nice job.  He does what I say, and looks for a zone such that the number of called strikes equals the number of strikes inside this “called” strike zone.  However, rather than going with an elliptical zone, he decides to go for a “cross” strike zone.  Basically, take a square strike zone, and remove the four small corners.  That’s what you are left with.  Not crazy about it, but I can see the simplicity of it, and how it helps with the coding.  Here is where the strike and ball calls fit in this “cross” strike zone.
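A minimal sketch of the “cross” zone described above: a rectangle with a small square removed from each corner. The edge values and the corner size are placeholders, not the dimensions from the linked article, and the balance check reflects one reading of the criterion (total called strikes equal to the count of taken pitches that land inside the zone).

```python
def in_cross_zone(px, pz, left, right, bottom, top, corner):
    """True if (px, pz) lies in the rectangle with its four corners cut off.

    A square of side `corner` is removed from each corner of the rectangle,
    which leaves a plus-shaped ("cross") region. All lengths are in feet.
    """
    if not (left <= px <= right and bottom <= pz <= top):
        return False
    near_side = px < left + corner or px > right - corner
    near_end = pz < bottom + corner or pz > top - corner
    return not (near_side and near_end)


def zone_is_balanced(pitches, **zone):
    """Compare called strikes with the number of taken pitches inside the zone.

    `pitches` is an iterable of (px, pz, call) tuples, call being "strike" or "ball".
    Returns (balanced, called_strikes, inside_count).
    """
    called_strikes = sum(1 for _, _, call in pitches if call == "strike")
    inside = sum(1 for px, pz, _ in pitches if in_cross_zone(px, pz, **zone))
    return called_strikes == inside, called_strikes, inside
```

To calibrate a zone this way, one would widen or narrow the edges until the two counts match.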


#1          (see all posts) 2009/11/05 (Thu) @ 18:40

Tom,

Thanks for the compliment. 

I am pretty sure I could get a rounder zone using a program like R, but I want a zone that can be queried in SQL.  If anyone has a way to get a circular zone in SQL, I am all ears.
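One possible answer, sketched here rather than taken from the thread: an ellipse is a single inequality, so it fits in an ordinary WHERE clause. The example runs the query through Python's built-in sqlite3 module. The table name, the sample rows, and the ellipse parameters are all made up for illustration; px and pz are assumed to be plate-crossing coordinates in feet.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pitches (px REAL, pz REAL, call TEXT)")
conn.executemany(
    "INSERT INTO pitches VALUES (?, ?, ?)",
    [
        (0.0, 2.5, "strike"),
        (0.9, 3.4, "ball"),    # inside the bounding square, outside the ellipse
        (-0.8, 2.5, "strike"),
        (1.5, 2.5, "ball"),
    ],
)

# Ellipse with centre (cx, cz), horizontal semi-axis a and vertical semi-axis b.
ELLIPSE_SQL = """
SELECT px, pz, call
FROM pitches
WHERE ((px - :cx) * (px - :cx)) / (:a * :a)
    + ((pz - :cz) * (pz - :cz)) / (:b * :b) <= 1.0
"""
params = {"cx": 0.0, "cz": 2.5, "a": 1.0, "b": 1.0}
inside = conn.execute(ELLIPSE_SQL, params).fetchall()
```

Only multiplication, division, and a comparison are used, so the same WHERE clause should carry over to other SQL dialects with their own parameter syntax.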


#2    Tangotiger      (see all posts) 2009/11/05 (Thu) @ 19:01

Should be easy enough to do.  I did it for Mike Fast in Excel for his spin chart, and that same code can be reused… let me go find it.


#3    Tangotiger      (see all posts) 2009/11/05 (Thu) @ 19:04

Ok, I found the file… little more involved than I remembered, but it should be doable.  I’ll do it at the office tomorrow, unless someone beats me here tonight.


#4    Peter      (see all posts) 2009/11/05 (Thu) @ 19:44

To Jeff at #1: I’ve done similar analysis to this by bringing pitchf/x data from my SQL database into R and then piping the pitch locations through a simple logistic regression that predicts the probability of a pitch being called a strike based on its location. That way you can avoid the trial-and-error method of estimating the zone and directly estimate an ellipse such that a ball that’s on the border of the ellipse has a 50% chance of being called a strike. Click my link for an example.
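To make the 50% border concrete: if the fitted model has a linear and a squared term in each coordinate, the border is the set of points where the linear predictor equals zero, and completing the square gives the ellipse's centre and semi-axes. The sketch below assumes that model form; the coefficient names are hypothetical, not output from any particular fit.

```python
import math


def ellipse_from_logit(b0, bx, bxx, by, byy):
    """Centre and semi-axes of the p = 0.5 contour of

        logit(p) = b0 + bx*x + bxx*x**2 + by*y + byy*y**2

    Both squared-term coefficients must be negative, so that the strike
    probability falls off away from the centre.
    Returns (centre_x, centre_y, semi_axis_x, semi_axis_y).
    """
    if bxx >= 0 or byy >= 0:
        raise ValueError("squared-term coefficients must be negative")
    cx = -bx / (2 * bxx)
    cy = -by / (2 * byy)
    # Value of the linear predictor at the centre, where it peaks.
    peak = b0 - bx ** 2 / (4 * bxx) - by ** 2 / (4 * byy)
    if peak <= 0:
        raise ValueError("no location reaches a 50% strike probability")
    return cx, cy, math.sqrt(peak / -bxx), math.sqrt(peak / -byy)
```

For example, the coefficients (-21, 0, -4, 20, -4) describe a circle of radius 1 centred at x = 0, y = 2.5.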


#5    Tangotiger      (see all posts) 2009/11/05 (Thu) @ 20:10

Peter, some nice stuff.  How come you don’t tell me about this?

I need to set up a better system of sharing.  Too many shy readers around here…


#6          (see all posts) 2009/11/05 (Thu) @ 20:17

#4 Peter: I am trying to stay away from R so the zone can be used without people installing R on their computers and learning it.  I guess I want a simple, yet better, solution.


#7    Nick      (see all posts) 2009/11/05 (Thu) @ 20:29

Could you not do a logistic regression in Excel or SQL?  I know I’ve done it in Excel before.  I guess you would need to do a multiple logistic regression to account for vertical and horizontal location.


#8    Peter      (see all posts) 2009/11/05 (Thu) @ 20:44

Nick,

The model I use is:

logit(pr(strike)) = x + x^2 + y + y^2

Where x is the horizontal and y is the vertical location of the pitch.

I don’t think it can be done in SQL though, at least not without some kind of add-ons installed.
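For readers without R, the fit itself is only a short loop. The following is a rough sketch in plain Python that fits the model above by gradient ascent; it is not Peter's code. The training data are synthetic (a made-up elliptical zone on a grid), and the learning rate, iteration count, and the centring of the vertical coordinate at 2.5 feet are arbitrary choices.

```python
import math


def features(x, y, y_mid=2.5):
    """Model terms 1, x, x^2, y, y^2, with height centred on y_mid (assumed)."""
    yc = y - y_mid
    return [1.0, x, x * x, yc, yc * yc]


def sigmoid(v):
    v = max(-30.0, min(30.0, v))  # clamp so exp() stays well behaved
    return 1.0 / (1.0 + math.exp(-v))


def fit_logistic(rows, labels, lr=0.3, iters=500):
    """Plain gradient ascent on the mean log-likelihood; returns the weights."""
    n, k = len(rows), len(rows[0])
    w = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        for f, label in zip(rows, labels):
            err = label - sigmoid(sum(wi * fi for wi, fi in zip(w, f)))
            for j in range(k):
                grad[j] += err * f[j]
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w


def strike_prob(w, x, y):
    """Predicted called-strike probability at horizontal x, vertical y (feet)."""
    return sigmoid(sum(wi * fi for wi, fi in zip(w, features(x, y))))


def synthetic_calls():
    """Made-up taken pitches on a 17 x 17 grid, called by a made-up ellipse."""
    rows, labels = [], []
    for i in range(17):
        for j in range(17):
            x, y = -2.0 + 0.25 * i, 0.5 + 0.25 * j
            rows.append(features(x, y))
            labels.append(1 if (x / 0.95) ** 2 + (y - 2.5) ** 2 <= 1.0 else 0)
    return rows, labels
```

Calling `fit_logistic(*synthetic_calls())` returns five weights; `strike_prob` then gives the predicted probability at any location. A statistics package will converge faster and report standard errors, so this is only meant to show that nothing exotic is involved.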


#9          (see all posts) 2009/11/05 (Thu) @ 22:45

Why not break up the area into a grid of reasonable sized squares (large enough to get enough data in each, small enough to get decent granularity) and check whether the called strike % is >50% or <50% in each one. You’ll still end up with a jagged edge, but it doesn’t restrict the results to a particular shape and it should be more accurate than the cross.

(If you wanted to be really clever, you could even try to subdivide the squares near the edge for better precision.)
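A rough sketch of the grid idea, assuming each taken pitch is an (px, pz, is_strike) tuple with locations in feet. The 0.25-foot cell size and the minimum-count filter are placeholder choices. The per-cell rates it returns are also the raw material for a heat map.

```python
import math
from collections import defaultdict


def called_strike_rates(pitches, cell=0.25, min_count=1):
    """Map each (column, row) grid cell to (called-strike fraction, pitch count)."""
    tallies = defaultdict(lambda: [0, 0])  # cell -> [strikes, total]
    for px, pz, is_strike in pitches:
        key = (math.floor(px / cell), math.floor(pz / cell))
        tallies[key][0] += 1 if is_strike else 0
        tallies[key][1] += 1
    return {k: (s / n, n) for k, (s, n) in tallies.items() if n >= min_count}


def zone_cells(rates, threshold=0.5):
    """Cells where more than `threshold` of the taken pitches were called strikes."""
    return {k for k, (rate, _) in rates.items() if rate > threshold}
```

Raising `min_count` drops cells with too few pitches to trust, at the cost of holes near the edges of the data.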


#10          (see all posts) 2009/11/05 (Thu) @ 23:41

If you’re going to do it that way, you may as well make a heat map of each grid section such that we know the % chance of a strike called in each zone, so that it’s a bit more fuzzy. If you’re going to go through the trouble of adding grids that is.


#11    MGL      (see all posts) 2009/11/06 (Fri) @ 16:28

This is a little bit of a nitpick, and great work, but please, please, anyone who does work with strike zones, always tell us before the very first chart whether the perspective is from the catcher’s or pitcher’s view!


#12    Tangotiger      (see all posts) 2009/11/06 (Fri) @ 17:10

I’d prefer the standard all be by the catcher/batter/ump perspective.  That will be the standard for HIT f/x right?  Home Plate will be at the bottom, and the OF at the top?  Seems only natural to keep the same perspective the whole time.

But, no big deal, as long as the chart does show where left and right is.


#13    Nick      (see all posts) 2009/11/07 (Sat) @ 06:00

Mike Fast used to have a graphic of the batter in his posts - it made it a lot easier to visualize.


#14    Jonas Fester      (see all posts) 2009/11/07 (Sat) @ 14:00

Alright, so I have had this concern about the pitch f/x strike zone for a while.  According to the rule book, and any umpire you ask (and I have, to try and figure it out), the strike zone is a 3D box, not a 2D box as pitch f/x shows.  Hopefully I am wrong here, b/c I love the data that f/x gives us, but here is an example that concerns me…

A righty curveball passes through the front left corner of the 3D strike zone (looking from the mound) then drops out of the zone and is caught at the ankle.  “By the book” this is a strike.  Where is this plotted on Pitch f/x?  At the point it enters the strike zone?  In the middle of the plate?  Back of the plate?

Any knowledge about pitch fx that I don’t know that could help answer this would be great.  Thanks!


#15    Tangotiger      (see all posts) 2009/11/07 (Sat) @ 18:03

pitch f/x has enough parameters that you can replot the trajectory in 3-d (4-d actually, if you include the time component).

If you see something in 2-d, this is ONLY because of the person presenting the charts, and NOT a failing of pitch f/x.


#16    MGL      (see all posts) 2009/11/08 (Sun) @ 03:38

#15, that doesn’t really answer his question, which is a good one.  When you see these 2D charts, where in the 3D “box” are they captured?  Front of the plate, middle of the plate, back of the plate?

For example, if I see a chart/picture which shows the “ball” 2 inches to the right of the “strike zone,” is that when it passed the front of the plate?  Back of the plate?  Middle of the plate?


#17    Nick      (see all posts) 2009/11/08 (Sun) @ 03:44

MGL, if I’m not mistaken, Pitch f/x reports the location of each pitch as it crosses the front of the plate.  I’m sure Mike Fast or someone could clear this up for us.


#18    SirKodiak      (see all posts) 2009/11/08 (Sun) @ 05:04

According to both Mike Fast’s glossary and Alan Nathan’s tutorial, it is “as it crosses the front of home plate”


#19    Tangotiger      (see all posts) 2009/11/08 (Sun) @ 08:52

I think I did answer it. “These” 2d charts are as they are designed by the user.  That one or all of them (currently) are showing it “as crossed plate” is completely irrelevant.

The user CAN design it in such a way as they show, in 2D, the closest point to the 3D strike zone box.

This is not a pitch f/x limitation, but purely a user limitation.


#20    Nick      (see all posts) 2009/11/08 (Sun) @ 09:07

“The user CAN design it in such a way as they show, in 2D, the closest point to the 3D strike zone box.”

I’m not sure I understand what you mean by this Tom.


#21    Tangotiger      (see all posts) 2009/11/08 (Sun) @ 10:11

There are enough parameters in the pitch f/x data to draw a trajectory, in 4-d, of the pitched ball.  That is, you can recreate the pitch.

The question being asked is if we can represent in 2-D the pitch as it crossed any part of the 3-D plate (the box over the plate, between the knees and belt/letters).  If a pitch is, at the front of the plate, 2 inches inside, but it’s curving in such a way that it ends up going through the box at some point, then the CORRECT thing to do is to show it, in 2D as entering the strike zone.

The reference point is not the 2D area above the front of the strike zone, but the 3D area around the strike zone.  You plot the pitch closest to the edge of the 3D box.

As you can imagine, not the easiest thing to do.  So, the pitch f/xers are typically drawing the point in 2D.  That’s a user limitation, not a data limitation.
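A sketch of what recreating the pitch could look like, assuming the usual nine-parameter constant-acceleration fit (x0, y0, z0, vx0, vy0, vz0, ax, ay, az, in feet and seconds, with y measured from the back tip of the plate toward the mound). For simplicity the zone here is a rectangular box running from the front edge to the back tip, the ball's radius is ignored, and the half-width and height limits are inputs rather than official values.

```python
import math

FRONT_Y = 17.0 / 12.0  # front edge of the plate, in feet from the back tip
BACK_Y = 0.0           # back tip of the plate


def time_at_y(p, y):
    """Time at which the ball is y feet from the back tip of the plate."""
    dy = p["y0"] - y
    if abs(p["ay"]) < 1e-12:
        return -dy / p["vy0"]
    disc = p["vy0"] ** 2 - 2.0 * p["ay"] * dy
    return (-p["vy0"] - math.sqrt(disc)) / p["ay"]


def position_at_y(p, y):
    """(x, z) of the ball when it is y feet from the back tip of the plate."""
    t = time_at_y(p, y)
    x = p["x0"] + p["vx0"] * t + 0.5 * p["ax"] * t * t
    z = p["z0"] + p["vz0"] * t + 0.5 * p["az"] * t * t
    return x, z


def crosses_box(p, half_width, bottom, top, steps=17):
    """True if the ball is inside the box at any sampled depth over the plate."""
    for i in range(steps + 1):
        y = FRONT_Y - (FRONT_Y - BACK_Y) * i / steps
        x, z = position_at_y(p, y)
        if abs(x) <= half_width and bottom <= z <= top:
            return True
    return False
```

A pitch that is off the plate at `FRONT_Y` but for which `crosses_box` returns True is exactly the case under discussion: a ball on the usual 2D chart, a strike by the 3D definition.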


#22    Jonas Fester      (see all posts) 2009/11/08 (Sun) @ 12:23

Thanks for the responses guys.

So it looks like what we need is someone to create a graph that shows the strike zone as a 3-d/4-d box, and then run the pitch f/x data through it?

It’s good to know that this is not a limitation of the data, only of the user.  But when looking at the 2-d strike zones, how trustworthy are the corners of the strike zone?  I have been looking at all of this data about evaluating umpires, and it does them such a disservice to view the zone from a 2-d graph.  It seems like we are looking at these graphs with a bit too much certainty.


#23          (see all posts) 2009/11/08 (Sun) @ 19:23

I thought that Tango was talking about generating the graph so that it shows the pitch on a 2D plot at the closest to the textbook strikezone it passed.

So for instance, let’s say someone threw a ball from third base. At the front of the plate it would be shown as “just a bit inside” to a RHB. If you followed the trajectory, it would cross right through the middle of the plate.

Therefore, rather than showing a far inside pitch as a strike, it should show a pitch through the center of the zone.

At least for determining called strikes/balls.


#24    Tangotiger      (see all posts) 2009/11/08 (Sun) @ 21:13

Sal: right.


#25    MGL      (see all posts) 2009/11/08 (Sun) @ 21:56

Yes, at least for determining balls and strikes, but if a pitch is in the 3D strike zone, you still have to make an arbitrary decision as to where to capture it on a 2D graph, which makes things a little complicated.

But, as Nick and Sir Kodiak say, apparently the pitch f/x guys (or at least Mike Fast) do NOT do it the way Tango says (and in any case, his “it” only applied to pitches close to the edges), so that pitches outside the zone, but close to it, at the front of the plate, are NOT necessarily balls which is indeed not fair to the umpires.


#26          (see all posts) 2009/11/08 (Sun) @ 23:18

I am finally getting around to answering some of the questions/comments and giving some more background.

#11: My mistake for not stating whether the perspective was the pitcher’s or the catcher’s.

#12: The data is nice to work with from the pitcher’s perspective.  It might change once hitfx is completely available, but almost everyone will have pitch fx data from the pitcher’s perspective.

#13: Maybe I will get a pic of the Babe to add.

#14 and rest: Actually, the point of the series (this piece is part one) is to find which pitches are called strikes to which hitters.  I needed to start with the basics (hitter handedness) and move from there (pitcher handedness and pitch type).

On the idea of a 3D box, I have had the idea of making the points look like “sperm”: a head at the point where the pitch crosses the front of the plate, then a tail in the direction the ball is moving, ending where it crosses the back of the plate.  I have not found a good way to create this representation, so that is why I am still in a 2D world for now.

Harry P has done some work on flight paths at BtB, and you could plot all the flight paths, which might get messy.


#27    Mike Fast      (see all posts) 2009/11/08 (Sun) @ 23:43

There are two reasons most/all of the PITCHf/x strike zone graphs you see report the data at the front of the plate only.

1) The Gameday data contains those coordinates already, so it’s easiest to present them rather than calculating anything else.

2) It doesn’t make much practical difference at any edge of the zone except the top edge, and most of the ball/strike controversies don’t occur at the top edge of the zone.

If I were doing umpire grading, of course I would want to present the data in the plane at the front of the plate (y=17 inches) and the data at the “back” of the rectangular portion of the plate (y=8.5 inches).  And for the high strike over the middle part of the plate, you’d need a third view that somehow encompassed that triangular portion of the back of the plate. 

I’m not often/ever doing umpire grading of high strike calls, so I don’t present such complicated views of the strike zone.  It wouldn’t add much except confusion to most people.

Here’s a random example I pulled to demonstrate.  Some pitchers might be a little different, but you will see basically the same thing for most of them, that the difference between front of the plate and back of the plate is mostly in the vertical dimension.

lidge_ball_strike_11012009.png


#28    Jonas Fester      (see all posts) 2009/11/08 (Sun) @ 23:52

Thanks for the info, Mike. It’s pretty interesting actually that the main difference is only in the vertical portion.  Do you think this is also true for someone who throws a more side-to-side slider, or a cutter?  I’m thinking of a guy like Jeff Weaver.

- No need to make a separate triangular part of the zone for the back end of the plate; umpires consider the strike zone a box, going from the front to the back.  It’s only triangular for the foul line.


#29    Davor      (see all posts) 2009/11/09 (Mon) @ 08:04

I’ve seen someone at THT showing 2D graph picturing data for the closest position to the middle of the strike-zone. It’s still easy-to-see 2D graph, but there is no problem with slider-through-the-zone. It was several months ago, and I can’t remember who it was.
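One way such a chart could be produced, as a sketch only and not the method from the THT article: sample the trajectory over the depth of the plate and keep the point nearest the centre of the zone. It assumes the nine-parameter constant-acceleration fit (feet and seconds, y measured from the back tip of the plate); the zone centre is an input.

```python
import math

FRONT_Y = 17.0 / 12.0  # front edge of the plate, in feet from the back tip


def state_at_y(p, y):
    """(x, z) when the ball is y feet from the back tip, constant acceleration."""
    dy = p["y0"] - y
    if abs(p["ay"]) < 1e-12:
        t = -dy / p["vy0"]
    else:
        t = (-p["vy0"] - math.sqrt(p["vy0"] ** 2 - 2.0 * p["ay"] * dy)) / p["ay"]
    x = p["x0"] + p["vx0"] * t + 0.5 * p["ax"] * t * t
    z = p["z0"] + p["vz0"] * t + 0.5 * p["az"] * t * t
    return x, z


def closest_to_centre(p, centre_x, centre_z, steps=17):
    """Sample the path over the plate; return the (x, z, y) nearest the centre."""
    best = None
    for i in range(steps + 1):
        y = FRONT_Y * (1.0 - i / steps)
        x, z = state_at_y(p, y)
        d2 = (x - centre_x) ** 2 + (z - centre_z) ** 2
        if best is None or d2 < best[0]:
            best = (d2, x, z, y)
    return best[1:]
```

Plotting the returned (x, z) in place of the front-of-plate location keeps the chart two-dimensional while giving a sweeping pitch credit for its best position over the plate.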


#30    Mike Fast      (see all posts) 2009/11/09 (Mon) @ 11:13

I can think of several issues around called strike zone presentations.

The biggest one is whether you are trying to define and/or measure against the rulebook zone or the zone as the umpire(s) actually calls it.

The second is how to define the top and bottom of the zone.  You can use the same fixed values for all batters (e.g., 1.6 and 3.5 feet).  You can use the sz_top and sz_bot for each pitch as defined by the PITCHf/x operator.  You can use an average (mean, median, or mode) value of sz_top and sz_bot for each batter.  You can use a fixed percentage of the batter height.  You can use the 50% called ball-strike line as actually called by the umps.  Each approach has advantages and disadvantages.

The third issue is error in the PITCHf/x measurements.  This should be less than 0.5 inches on average.

The fourth issue is that of sweeping pitches, which is in my mind the most minor of the four issues.  For sidearm pitchers, you can see left-right movement through the zone of as much as one inch.  For typical three-quarters slot pitchers, the left-right movement is going to be small for most pitches.  Lester, Arroyo, Hochevar, and the Weaver brothers are among the non-sidearm pitchers with the most sweeping movement.  I’ll see if I can toss up a graph for one of those guys later like I did for Lidge.
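Three of the options listed above for the top and bottom of the zone can be sketched in a few lines. The row layout (a dict with "batter", "sz_top" and "sz_bot" keys) is an assumption about how the data has been loaded, and the height fractions are left as inputs because the thread does not give values for them.

```python
from collections import defaultdict
from statistics import median

# Fixed limits quoted in the comment above, in feet.
FIXED_BOTTOM, FIXED_TOP = 1.6, 3.5


def batter_zone_limits(pitches):
    """Median operator-set (sz_bot, sz_top) for each batter."""
    tops, bots = defaultdict(list), defaultdict(list)
    for row in pitches:
        tops[row["batter"]].append(row["sz_top"])
        bots[row["batter"]].append(row["sz_bot"])
    return {b: (median(bots[b]), median(tops[b])) for b in tops}


def height_based_limits(height_ft, bottom_frac, top_frac):
    """Zone limits as fixed fractions of batter height (fractions supplied by caller)."""
    return height_ft * bottom_frac, height_ft * top_frac
```

Using the median instead of the mean keeps a single mis-set operator value from shifting a batter's zone.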


#31    Mike Fast      (see all posts) 2009/11/09 (Mon) @ 11:17

“No need to make a separate triangular part of the zone for the back end of the plate, umpires consider the strike zone a box, going from the front to the back.  it’s only triangular for the foul line.”

Jonas, do you mean the umps consider the strike zone a 17 x 17 inch square, or a rectangle 17 inches wide and 8.5 inches deep?

Irrespective of what the umps do, the rulebook defines the zone as a pentagonal column, does it not?  I’m not suggesting an umpire could actually have any way to perceive the zone that way, though.
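For reference, the footprint of that pentagonal column is easy to test. The sketch below uses inches, with y measured from the back tip of the plate (y = 0) to the front edge (y = 17). It treats the two angled edges as 45-degree lines, which matches the plate's nominal 17-inch front, 8.5-inch sides, and 12-inch angled edges to within a small fraction of an inch.

```python
def over_plate(x_in, y_in):
    """True if the point (x, y), in inches, lies over the pentagonal home plate.

    x is measured from the centre line of the plate; y runs from the back
    tip (0) to the front edge (17).
    """
    if not 0.0 <= y_in <= 17.0:
        return False
    # Full width over the rectangular front half, tapering to a point behind it.
    half_width = 8.5 if y_in >= 8.5 else y_in
    return abs(x_in) <= half_width
```

A point 8 inches off centre is over the plate at y = 10 but not at y = 4, which is the gap between the pentagon and the box that umpires are said to use.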


#32    Jonas Fester      (see all posts) 2009/11/09 (Mon) @ 11:52

Mike,

I have a friend who is an umpire, and they are taught to treat the strike zone as a 3 dimensional box, and the ball can cross through that box at any point and be called a strike.  It extends to the back tip of the plate.  They are told to ignore the triangle in the back, b/c it is just there to make the fair/foul calls easier.  In talking to him today, he was pretty positive the rule book described it as a box, not a pentagon.  Thanks for all the work you do btw, amazing stuff.  It has helped my college team I coach quite a bit too.


#33    Tangotiger      (see all posts) 2009/11/09 (Mon) @ 12:28

Without reading the rule, I would say I’m 100% positive it says the area above the home plate and not “home plate if you extend it to be the minimum rectangle required to encompass the entirety of the plate”.


#34    Peter Jensen      (see all posts) 2009/11/09 (Mon) @ 12:34

“The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.”

From the definitions section Rule 2.00, page 23.


#35    Mike Fast      (see all posts) 2009/11/09 (Mon) @ 12:51

I think anyone who posts in this thread ought to first clarify whether they are talking about the rulebook zone or the zone as actually called by the umpires.

“Without reading the rule, I would say I’m 100% positive it says the area above the home plate and not ‘home plate if you extend it to be the minimum rectangle required to encompass the entirety of the plate’.”

The rulebook zone is a pentagonal column (or prism).  But we all know that no umpire can even attempt to call such a zone in reality.  Where would your reference points be for such a shape?  It’s impossible.  So it is interesting to me to know what umpires attempt to do.

Dan Brooks and I have had several conversations about the theory of umpire signal detection in regards to reference points that the umpires use.

The Bruce Weber book As They See ‘Em has an interesting passage from umpire Alfonso Marquez.

“I think when I came up to the big leagues, I hadn’t really gotten to know the inside corner that well,” Alfonso Marquez told me.  “It gets crowded in there. The catcher’s set up inside and the batter’s up on the plate.  You’re taught to set up in the slot and follow the ball all the way into the glove, but when the catcher crowds you--when he moves into your slot--you lose the plate, and if the batter is crowding the plate, too, you lose the pitcher’s release point. So if you stay where you are, you’re only going to see the ball”--Marquez snapped his fingers--“for that long. In the minor leagues I was kinda guessing.” I asked him how he solved the problem, and he said just about every umpire made the same discovery--I confirmed this with others--that you have to shift your position counterintuitively, not getting down lower or squeezing your head in a narrower crevice of space closer to the plate but by easing off just a bit, creating a different angle from which to look at the pitch. It requires practice; you need to train your eyes to appreciate a different perspective on the plate, but at least you get a longer look--you see the ball from the time it leaves the pitcher’s hand to the time it hits the catcher’s glove. “What I do now when they crowd me is back up and rise up,” Marquez said. “I set up higher so I can look over the catcher’s head. That way, if he does shift and block me, I’ve already made sure my nose is right on the middle part of the plate right on the corner.” He made his hand into a divider and placed it lengthwise along his nose. “If the ball comes in here”--that is, anywhere to the right of his nose--“it’s a strike.”

To which Dan noted, “Basically, Alfonso Marquez doesn’t actually call balls and strikes by judging whether or not something was in an imaginary box because that’s impossible.  Instead he judges whether things were to the right or left of his nose.” Which Dan noted was an egocentric reference cue rather than an external reference cue.  He argued that umpires are much more likely to use egocentric references than the external references they are supposed to use.

Anyway, my point is that this discussion about the niceties of the 3-D shape of the zone is completely theoretical when it comes to umpires actually calling a zone.  They need cues like how their nose lines up with the edge of the plate, where the catcher’s head is, and things like that.  A human being can’t conceptualize a 3-D floating object and use that to call balls and strikes.


#36    MGL      (see all posts) 2009/11/09 (Mon) @ 21:34

If we are only talking about an inch of horizontal movement at the most between the front and back of the plate, then it is indeed no big deal.  I thought it might be more.  So the idea of a ball going “around the plate” is ridiculous?

What about the maximum vertical movement, Mike, from the front to the back?

BTW, I really like the idea of a “sperm-like” ball indicating what direction it was moving.  Those “double markers” by Mike are also nice, although obviously none of those would work too well when you have charts with hundreds or thousands of pitches.


#37    Mike Fast      (see all posts) 2009/11/09 (Mon) @ 22:30

Rule of thumb: pitches are moving roughly 10 times faster toward the catcher than their maximum speed left or right.  So in 10 inches of travel it might move at most an inch to the left or to the right.  You can also think of it this way: the ball travels 55 feet from the pitcher’s hand to the plate, but if it moves 5.5 feet left-right that’s a lot.

Here are a couple more images that I promised earlier.  First is the zone location for Mitch Stetter, who has the most left-right movement in the majors.  Second is the similar plot for Luke Hochevar, who is among the pitchers with the most left-right movement for more typical 3/4 slot deliveries.  These are at the extreme of what you will see with pitches sweeping across the plate.

stetter_ball_strike.png

hochevar_ball_strike.png


#38    Mike Fast      (see all posts) 2009/11/10 (Tue) @ 00:04

“What about the maximum vertical movement, Mike, from the front to the back?”

A really big or slow curveball (like one from Chris Tillman or Doug Davis) will drop a little more than 2 inches over the 8.5 inches of travel from the front of the home plate rectangle to the back of the home plate rectangle.  (Double that figure to 4 inches if you want to consider the drop across 17 inches of travel.)

Any sort of offspeed pitch from anyone is going to drop around an inch or more.  Unlike with left-right movement, gravity plays the big role in the vertical movement of pitches, so all pitches will see it to some extent, the slower ones more so.


#39          (see all posts) 2009/11/11 (Wed) @ 21:41

Couple questions. 
1. Who gets the most vertical movement on their fastball?

2. How much does the ball typically move as it travels from the plate to the glove for a fastball or a curve?  Many umpires call the glove as that keeps them from looking bad and I’m guessing that it isn’t that far off.


#40          (see all posts) 2009/11/19 (Thu) @ 23:49

Is there any way you can do an overhead view of the plate?

That would give the entire movement across the plate.

If it could be automated, you could check by at-bat so that it doesn’t get crowded, and have both views to help make an informed judgment.

You could also use color (gradients of some sort) to show height for each pitch.


#41    Mike Fast      (see all posts) 2009/11/20 (Fri) @ 00:54

Sal, like this?

ichiro_topview.jpg

Taken from this article:
http://www.hardballtimes.com/main/article/ichiro-strike-three/


#42    Mike Fast      (see all posts) 2009/11/20 (Fri) @ 12:33

Re Mike T./#39,

1. Who gets the most vertical movement on their fastball?

The answer to this depends a lot on what you mean by vertical movement.  Do you mean vertical spin deflection (i.e., who has the most hop) or vertical spin deflection + gravity (i.e., who has the most sink)?

2. How much does the ball typically move as it travels from the plate to the glove for a fastball or a curve?  Many umpires call the glove as that keeps them from looking bad and I’m guessing that it isn’t that far off.

From the point of the plate to the glove, a typical distance is about 4 feet, making it about 5-6 feet of travel from the front of the plate to the glove.  That means that at the extremes, a pitch could move up to 6 inches left or right or drop up to 15 inches during that time.
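These figures can be checked against two rules of thumb from earlier in the thread: sideways speed of at most about one tenth of forward speed, and a big curveball dropping a little more than 2 inches per 8.5 inches of travel. The sketch below scales both linearly with distance. That is a crude assumption for the drop, since the ball's downward speed keeps increasing, so the drop figure is better read as a floor.

```python
def lateral_shift(travel_in, speed_ratio=10.0):
    """Maximum sideways movement (inches) over travel_in inches of forward travel."""
    return travel_in / speed_ratio


def scaled_drop(travel_in, drop_in=2.0, over_in=8.5):
    """Drop (inches) over travel_in inches, scaled linearly from drop_in per over_in."""
    return drop_in * travel_in / over_in
```

For 5 feet of travel (60 inches) this gives 6.0 inches sideways and about 14.1 inches of drop; for 6 feet (72 inches), 7.2 inches and about 16.9 inches. Both are in line with the extremes quoted above.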


#43          (see all posts) 2009/11/20 (Fri) @ 14:51

Who has the most “hop”?  I’m really trying to see if there’s anyone, not a submariner, with a relatively low release point who might be able to throw a pitch that is still rising (relative to the ground) on a pitch close enough to the top of the zone to be swung at fairly often.  Too bad there wasn’t pitch fx data in Seaver’s heyday.


#44    Mike Fast      (see all posts) 2009/11/21 (Sat) @ 00:08

Okay, you’re asking yet a third distinct thing.  You’re talking about who has the highest final vertical velocity.

I’ll answer the other two things first.  Most hop is Chris Young and Clayton Kershaw.  Nobody’s thrown a pitch yet recorded by PITCHf/x that accelerated upwards.  Others with a lot of hop include Brandon Morrow, Grant Balfour, Brad Penny, Cole Hamels, and Matt Thornton.

The guys who get the most sink on their fastballs are the low-release guys:  Shouse, Bradford, Ziegler, Meredith, Moylan, Sean Green, Joe Smith, O’Day, Feliciano, Masterson.  Among the guys with “normal” release points, Brandon Webb, Chad Qualls, Sergio Mitre, and Derek Lowe get the most sink.

Now to your question...nobody has thrown a pitch recorded by PITCHf/x that had positive vertical velocity when it crossed the plate anywhere near the strike zone.  The closest was by Gil Meche, the first pitch to Alex Rios in the 1st inning on April 27, 2008, which was moving up less than 1 ft/sec.  Other guys that have come close fairly often are Chad Bradford, Javier Vazquez, Matt Morris, and Mark Hendrickson.
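The three quantities distinguished in this exchange can all be derived from the fitted trajectory. The sketch below assumes the nine-parameter constant-acceleration fit in feet and seconds, assumes that the vertical acceleration az includes gravity, and measures deflection over the whole flight from y0 to the front of the plate, which is a simplification of how published movement numbers are normally defined.

```python
import math

G = 32.174             # gravitational acceleration, ft/s^2
FRONT_Y = 17.0 / 12.0  # front edge of the plate, in feet from the back tip


def flight_time(p, y=FRONT_Y):
    """Time for the ball to travel from y0 to y under constant acceleration."""
    dy = p["y0"] - y
    if abs(p["ay"]) < 1e-12:
        return -dy / p["vy0"]
    return (-p["vy0"] - math.sqrt(p["vy0"] ** 2 - 2.0 * p["ay"] * dy)) / p["ay"]


def vertical_summary(p):
    """Spin-only deflection ("hop"), spin plus gravity ("sink"), and final vz."""
    t = flight_time(p)
    return {
        "t": t,
        "hop_ft": 0.5 * (p["az"] + G) * t * t,   # gravity removed
        "with_gravity_ft": 0.5 * p["az"] * t * t,
        "final_vz": p["vz0"] + p["az"] * t,      # positive would mean still rising
    }
```

A pitch that was still rising at the plate would show a positive `final_vz`, which is the case reported above as never having been recorded near the strike zone.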

