THE BOOK--Playing The Percentages In Baseball

Thursday, February 11, 2010

Framing the framing debate

By Tangotiger, 10:37 PM

Matt at Lookout Landing.


#1          (see all posts) 2010/02/13 (Sat) @ 20:23

I did some preliminary work on this earlier in the offseason, using the rulebook strike zone (which is why it’s preliminary).

http://docs.google.com/Doc?docid=0AaCkF5_yZqcZZGc4OWdtZGhfNGNtbTh2cXRq&hl=en&pli=1

Interestingly enough, when I used the same run values as Matt, I got a difference of about 1.5 runs per 120 games, in favor of Johjima - almost identical to his result.  However, both of them came out near the bottom of the league in framing ability. 

I guess this is the kick in the pants to finish revising this study, with some kind of localized regression.


#2    Nick Steiner      (see all posts) 2010/02/13 (Sat) @ 21:48

I’d like to see some more discussion about my original comment in that article (before I was jumped on for my “tone”).

In my opinion, if you are going to create a catcher framing metric, there are a few necessary things you have to do:

1) Adjust for batter height.  This is crucial, obviously, as a pitch at 1.5 PZ is much more likely to be a called strike to David Eckstein vs. Kyle Blanks.

2) Adjust for pitch type (or even better, pitch movement).  As Josh and Mike have shown, curveballs can bend around the strike zone and catch the plate at some point that isn’t captured in the PZ/PX (which just measure where the ball crosses the front of the plate, IIRC).

3) Adjust for batter and pitcher hand.  This one is obvious. 

4) Adjust for count.  As Jonathan and others have shown, umpires have vastly different strike zones based on the count.

5) Adjust for umpire. 

Basically, this should be figured out like we figure out UZR, or like I proposed we figure out Pitch f/x ERA.  Either by using Bins or a localized regression (and I prefer bins in this case) figure out the probability of each pitch being called a strike and compare that to what it actually was called.  Attribute the difference to the catcher (although luck will obviously play a part in it) and sum the results.
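A minimal sketch of the binned approach described above. It assumes each called pitch is reduced to a hypothetical tuple of (bin key, catcher, called strike); the field names and toy data are illustrative and not taken from any actual Pitch f/x file. One simplification to note: the strike probability for a bin is computed from all pitches in that bin, including the catcher's own.

```python
from collections import defaultdict

def extra_strikes_by_catcher(pitches):
    """pitches: list of (bin_key, catcher, called_strike) tuples.

    Returns {catcher: actual called strikes minus expected called strikes}.
    """
    # Called-strike rate within each bin, pooled over all catchers.
    totals = defaultdict(lambda: [0, 0])  # bin_key -> [strikes, pitches]
    for bin_key, _, called_strike in pitches:
        totals[bin_key][0] += int(called_strike)
        totals[bin_key][1] += 1
    strike_prob = {k: s / n for k, (s, n) in totals.items()}

    # Credit each catcher with (actual - expected) on every called pitch.
    extra = defaultdict(float)
    for bin_key, catcher, called_strike in pitches:
        extra[catcher] += int(called_strike) - strike_prob[bin_key]
    return dict(extra)

# Toy data (made up): one borderline bin, two catchers.
sample = [
    ("edge", "A", True), ("edge", "A", True),
    ("edge", "B", True), ("edge", "B", False),
]
result = extra_strikes_by_catcher(sample)
```

In the toy data the bin's pooled strike rate is 0.75, so catcher A comes out at +0.5 strikes and catcher B at -0.5. Luck is not separated out here, so small samples would still need regression.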

The way that Matthew did it was to simply compare how many extra strikes Johjima got over Johnson.  He didn’t adjust for batter height, pitch type, umpire, or count (although he said he looked at all of those things and found little difference).  His method is a good start, but there definitely needs to be some more rigor in adjusting for possible biases before we can take the results of it seriously.  It would be like MGL computing UZR only using the location data and nothing else.

Bill, I’ll read your study in more detail when I get home tonight.  It looks very interesting at first glance!  Did you publish it anywhere?


#3    MGL      (see all posts) 2010/02/14 (Sun) @ 02:54

This is in fact a great start.  If you are comparing two catchers on the same team, both of whom have a roughly equal number of innings, such as Kenji and Rob Johnson, especially if the pitchers they catch are pretty randomly distributed between them, then it probably is not that important to adjust for all the things that Nick mentions. But once you start analyzing all catchers, it becomes especially important to do all those adjustments; otherwise your results will be fraught with noise, I would think, especially since you are presumably looking for a fairly weak signal.


#4    MGL      (see all posts) 2010/02/14 (Sun) @ 03:01

BTW, it would be a little helpful if people used more than their first names on their blogs.  Which Matthew is this?


#5    Nick Steiner      (see all posts) 2010/02/14 (Sun) @ 04:14

Matthew Carruth of FanGraphs.


#6          (see all posts) 2010/02/14 (Sun) @ 19:28

For what it’s worth, VEP (admittedly not a whole lot), I’m totally in your corner here. If you don’t like someone questioning your article...answer the questions. Which Matthew did to a degree, but he also diverted the conversation to one of whether your tone was nice enough or whatever it was, and quite frankly, that ruined what was shaping up to be an interesting discussion on that thread.


#7          (see all posts) 2010/02/15 (Mon) @ 02:32

Nick, I’m with you on some, but not all of these.

1) Absolutely correct.  Tango has a good method of normalizing here

http://www.insidethebook.com/ee/index.php/site/article/scaling_pitchf_x/

If you are using the rulebook zone, it’s not necessary to do this since pitchfx has the sz_top and sz_bot parameters, but any kind of constructed “called” strike zone needs it.

2) This one’s tough.  It probably makes a difference, especially if you are using a small grid.  But you run the risk of sample size issues if you overdo it with this parameter.

3) 100% agreed on batter handedness; it’s been demonstrated and its cause explained.  Pitcher handedness should probably be absorbed into “adjust by pitcher” - I know Dan Turkenkopf found differences between pitchers.

4) The big differences in count are due at least in part to survivor bias, as Hale points out himself in the comments section of his original post.

http://bjays.wordpress.com/2009/01/03/strikeouts-are-fascist-walks-too/

There are more extra strikes in hitter’s counts because pitches are more in the middle and more good pitches are taken. In pitcher’s counts, nothing much better than borderline is taken.  When you remove all the easy calls over the heart of the plate due to swings, of course it becomes less likely that a called pitch will be a called strike.

Might still be something there, but it’s not as big as originally made out to be.

5) Agreed.  There is definitely some consistency to umpire zones. 
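The survivor-bias argument in 4) can be illustrated with a toy calculation. All of the numbers below are invented for illustration: the umpire's strike probability in each region is held identical across counts, and only the mix of pitches that batters take changes.

```python
def called_strike_rate(taken_by_region, strike_prob_by_region):
    """Share of taken pitches called strikes, given a fixed umpire zone."""
    called = sum(taken_by_region.values())
    strikes = sum(n * strike_prob_by_region[region]
                  for region, n in taken_by_region.items())
    return strikes / called

# Hypothetical umpire zone, the same in every count.
zone = {"heart": 1.0, "edge": 0.5, "outside": 0.0}

# Hypothetical taken pitches: batters take many hittable pitches in
# hitter's counts and swing at nearly all of them in pitcher's counts.
hitters_count = {"heart": 40, "edge": 40, "outside": 20}
pitchers_count = {"heart": 5, "edge": 40, "outside": 20}

rate_hitters = called_strike_rate(hitters_count, zone)
rate_pitchers = called_strike_rate(pitchers_count, zone)
```

With these made-up inputs the called-strike rate is 60% in the hitter's count and roughly 38% in the pitcher's count, even though the zone itself never changed. That is why raw called-strike rates by count overstate any real change in the umpire's zone.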

I “published” that study on the SABR stats analysis list.  I don’t write for anyone.


#8          (see all posts) 2010/02/15 (Mon) @ 03:03

Bill/7, you have lots of good input.  Is there any chance you would be willing to share your study with a wider distribution (i.e., me, but I’m sure others would be interested also)?

One note that I have--I don’t trust sz_top and sz_bot for anything other than entertainment purposes.  I use a fixed percentage of the batter’s height if I’m doing any sort of serious analysis.  Graph the sz_top and sz_bot over time for a few players if you want to see what a mess they are. 

Taking the average sz_top and sz_bot over a batter’s career gets rid of the majority of the problems, but it’s still more accurate to use a percentage of batter height from what I’ve seen.


#9    Nick Steiner      (see all posts) 2010/02/15 (Mon) @ 03:19

Mike/8

I usually just use the average sz_bot and sz_top for each player, grouped by year.  But you say that a percentage of batter height is better than averaging out the sz_bot and sz_top?  That’s interesting, especially given that batter height isn’t always accurate (I assume you’re using Lahman)?  How did you determine this?


#10    Nick Steiner      (see all posts) 2010/02/15 (Mon) @ 03:56

Bill/7

I agree that there is probably some selection bias in the numbers that Jon found; however, I find it very hard to believe that an umpire’s strike zone will remain static for each count.  I think that’s still something that needs to be adjusted for.

I agree that adding too many variables may result in weird numbers for some of the bins, or for the localized regression; however, I think that too few variables would introduce too much error into the final output.  Obviously, we need to find a midpoint.

So here is my revised list of what should be done:

1) Adjust each pitch type by each player’s estimated strike zone (by using a percentage of the player’s height or the average sz_bot/sz_top)

2) Adjust each pitch by the movement. This could be done without creating separate bins, which might make the results too granular, by using a methodology similar to what Josh Kalk did here:

http://www.hardballtimes.com/main/article/that-was-a-strike/

3) Make some adjustments for umpires.  I can’t really think of a way to do this right now, but I do think that it’s important. 

4) After you’ve made those adjustments, bin it by batter hand, pitcher hand and count.  Substituting those adjustments for separate bins would result in a much larger sample size for each bin, and more accurate results I think.
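A rough sketch of steps 1) and 4) from the list above: rescale pitch height to the batter's estimated zone, then build a bin key from location, handedness, and count. The linear rescaling, the cell sizes, and the function names are all assumptions for illustration; the movement and umpire adjustments in 2) and 3) are not covered.

```python
import math

def scaled_height(pz, sz_bot, sz_top):
    """Rescale pitch height so the zone bottom is 0.0 and the zone top is 1.0."""
    if sz_top <= sz_bot:
        raise ValueError("sz_top must be greater than sz_bot")
    return (pz - sz_bot) / (sz_top - sz_bot)

def make_bin_key(px, pz, sz_bot, sz_top, bat_hand, pit_hand, balls, strikes,
                 x_cell=0.25, z_cell=0.125):
    """Return a hashable bin key for one called pitch.

    x_cell is in feet; z_cell is a fraction of zone height (both hypothetical).
    """
    z = scaled_height(pz, sz_bot, sz_top)
    return (math.floor(px / x_cell), math.floor(z / z_cell),
            bat_hand, pit_hand, balls, strikes)

# Example: a pitch at mid-zone height to a right-handed batter on a 1-2 count.
key = make_bin_key(0.3, 2.5, 1.5, 3.5, "R", "L", 1, 2)
```

Because height is expressed as a fraction of the zone, pitches to short and tall batters can share bins, which is the sample-size benefit mentioned in 4).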


#11          (see all posts) 2010/02/16 (Tue) @ 15:52

Mike,

The study is in the first post.  I’d be happy to show my data as well, but the study is flawed enough that it’s probably not worth looking at that deeply.

What is your calculation from height, if you don’t mind sharing?

Nick,

I don’t mean to claim that count doesn’t matter, only that Hale’s study isn’t conclusive.  You’d have to see how count affected things.  (I’ve read Dave Allen’s piece too, and though his model does show a small but significant effect from count, I think there are some problems with his model construction, namely linear parameters for location.  This makes his model less accurate near the edges of the zone where ball/strike likelihood is most sensitive to location, and more pitches end up in high strike counts.)


#12    Tangotiger      (see all posts) 2010/02/16 (Tue) @ 16:34

Mike/8, regarding the sz_top, sz_bot:

Even if you don’t trust those figures literally, do you still trust them relatively speaking?

Suppose that a pitch was thrown exactly at the letters to the same batter in two different games.  Which do you think is more likely in terms of recording:

1. the pitch is going to show as sz_top+4 inches
2. the pitch is going to show as 48 inches

That is, do you think the recording of the pitch is relative to where sz_top and sz_bot are set, or do you think the recording of the pitch is made independent of how sz_top and sz_bot are recorded?


#13    Mike Fast      (see all posts) 2010/02/16 (Tue) @ 16:42

Nick/9 and Bill/11, I didn’t mean to imply that I’ve done an exhaustive study of the strike zone top/bottom issue.  I took a dozen or so of the tallest and shortest players and looked at them in detail.

For that small sample of players, the best estimate of the bottom of the zone was 0.5 ft + 21% of batter height, and the best estimate of the top of the zone was 1.75 ft + 28% of batter height.  The height I used was the height given by MLBAM.

By no means would I consider that the definitive answer or incredibly accurate.  However, the sz_top and sz_bot give some results that are definitely poorer than that, even when you average them.
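As a sketch, the small-sample estimates quoted above could be written as a function. It assumes batter height is given in feet, so that the result is in the same units as the 0.5 ft and 1.75 ft constants; the function name is made up for illustration.

```python
def estimated_zone(height_ft):
    """Return (bottom, top) of the estimated strike zone in feet.

    Uses bottom = 0.5 ft + 21% of height and top = 1.75 ft + 28% of height.
    height_ft is the batter's listed height in feet (an assumption about units).
    """
    bottom = 0.5 + 0.21 * height_ft
    top = 1.75 + 0.28 * height_ft
    return bottom, top

# Example: a batter listed at 6 ft 0 in.
bottom, top = estimated_zone(6.0)
```

For a 6-foot batter this gives a zone from about 1.76 ft to about 3.43 ft.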

Take a look at this hastily-prepared chart for Polanco, showing the sz_top and sz_bot in red vs. the actual borders of the 50% called zone in the black lines.

[Chart: polanco_sz.png]

Polanco’s sz_top and sz_bot at least have the benefit of being pretty consistent over time, compared to some other players, but they’re just inaccurate.


#14    Mike Fast      (see all posts) 2010/02/16 (Tue) @ 16:45

“That is, do you think the recording of the pitch is relative to where sz_top and sz_bot are set, or do you think the recording of the pitch is made independent of how sz_top and sz_bot are recorded?”

The latter.  The center-field camera view that is used for setting the sz_top and sz_bot is not used for tracking the pitches.  (It is used for calibrating the other two cameras but not in computing the path of the baseball that is reported by PITCHf/x.)


#15    Nick Steiner      (see all posts) 2010/02/16 (Tue) @ 16:52

Very interesting Mike.  So were there any systematic biases in the sz_bot/sz_top estimates?


#16    Mike Fast      (see all posts) 2010/02/16 (Tue) @ 17:07

Nick, I would guess that there is in fact operator-dependent bias, but it didn’t correlate with park for the one player (Jayson Werth) for whom I investigated that avenue in detail.

For my sample, both sz_top and sz_bot were biased too low for short players.  For tall players, sz_bot was biased slightly too low and sz_top was biased too high.


#17    Mike Fast      (see all posts) 2010/02/16 (Tue) @ 17:28

Bill/11, I just read your study.  The method plus your suggested improvements show a lot of promise.


#18    Nick Steiner      (see all posts) 2010/02/17 (Wed) @ 03:04

Mike, do you have a Gameday to Lahman ID mapping?


#19    Tangotiger      (see all posts) 2010/02/17 (Wed) @ 08:53

Gameday is MLBAM.  I have a recent thread on that.


#20          (see all posts) 2010/03/26 (Fri) @ 16:50

So, I finally got a version of this done. It’s up on BTB here:

http://www.beyondtheboxscore.com/2010/3/26/1360581/a-first-pass-at-a-catcher-framing


#21    MGL      (see all posts) 2010/03/27 (Sat) @ 00:07

Great stuff Bill!  Any chance you can put up on Google docs or email me the umpire strike zone file? I’ve been trying to incorporate umpires into my projections (to “umpire adjust” the raw stats) and I’ve been using umpire data from Jeff Z and a couple of other guys. Maybe your umpire data is the same as Jeff’s. Thanks!

Your idea for testing is great.  There are of course numerous ways to test the numbers.  Why not use the 08 numbers to test on 09?  Also, why didn’t you use the 07 pitch f/x data?


#22    Nick Steiner      (see all posts) 2010/03/27 (Sat) @ 00:26

MGL, 2007 Pitch f/x data is incomplete and it’s been known to have a lot of calibration errors and whatnot.  Most Pitch f/x researchers avoid 07 data for those reasons.


#23          (see all posts) 2010/03/27 (Sat) @ 19:24

MGL,

http://spreadsheets.google.com/ccc?key=0AqCkF5_yZqcZdEY1NXcyUGZ4UjZEaUJtaUdLZVlPenc&hl=en

http://spreadsheets.google.com/ccc?key=0AqCkF5_yZqcZdEY1NXcyUGZ4UjZEaUJtaUdLZVlPenc&hl=en

One for each year. It’s just strikes above/below average.

Nick hit it on the head about the 07 data.  I could do 08 to 09, that’s a good idea.


#24    MGL      (see all posts) 2010/03/27 (Sat) @ 20:55

Bill, great, thanks!


#25    MGL      (see all posts) 2010/03/27 (Sat) @ 20:57

As far as doing 08 on 09, although it will be a small sample (catchers who switch teams), the effect may be so large that it will still show up even in a small sample of data.


#26    MGL      (see all posts) 2010/03/27 (Sat) @ 21:00

Different pitcher/catcher combos, that is, not necessarily catchers who change teams.

Bill, is the umpire data based on pitch f/x locations, or just on average strikes called per called pitch (ball or called strike), with no regard for the location?


#27          (see all posts) 2010/03/28 (Sun) @ 20:35

It’s post-model, so location, count, and batter handedness adjusted.

