
THE BOOK--Playing The Percentages In Baseball


Friday, March 14, 2008

Catcher Blocks

By Tangotiger, 09:45 PM

Goalies have save percentages.  Now, catchers have block percentage.  How real is it?  Varitek and Ausmus’s performances are 4 standard deviations from the mean (on the good side), while IRod is 4 SD on the bad side.  The standard deviation of all these z-scores is 1.93. This would imply a year-to-year correlation of r=1-1/1.93^2=.73.  (If it were not a skill at all, then the SD of the z-scores would have been 1.00.)
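A minimal sketch of that arithmetic in Python, assuming the z-scores are scaled so that pure chance alone would give them an SD of 1.00. The function name is illustrative and not from the original article.

```python
def reliability_from_z_sd(z_sd):
    """Share of observed variance that is not random noise.

    Assumes random variation alone would give the z-scores an SD of 1.00,
    so var(random) = 1 and r = 1 - 1 / var(obs).
    """
    return 1.0 - 1.0 / z_sd ** 2


print(round(reliability_from_z_sd(1.93), 2))  # 0.73
print(reliability_from_z_sd(1.00))            # 0.0, i.e. no skill at all
```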

The average number of opportunities was 190.  This would imply an r=.50 level at 70 balls in the dirt.  (That is, 190 divided by 190+70 equals 0.73.) Dan didn’t provide the number of innings for each catcher.  If he can tell us that, we can figure out how many games 70 balls in the dirt represents.
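The 70 comes from solving r = n / (n + k) for k. A short Python sketch of that step, with function names chosen here for illustration:

```python
def regression_constant(n_opps, r):
    """Solve r = n / (n + k) for k, the sample size at which r would be .50."""
    return n_opps * (1.0 - r) / r


def reliability_at(n_opps, k):
    """Expected correlation for a sample of n_opps opportunities."""
    return n_opps / (n_opps + k)


k = regression_constant(190, 0.73)
print(round(k))                 # 70
print(reliability_at(70, 70))   # 0.5
```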


#1          (see all posts) 2008/03/14 (Fri) @ 22:25

I did just a quick visual scan comparing this to THT’s (WP+PB)/G.

Both systems had Varitek and Ausmus as the best.

THT’s 5 worst were AJP, Molina, Rodriguez, Posada, Olivo.
Block percentage 6 worst were Olivo, Posada, Mauer, AJP, Ross (not qualified for THT), and Rodriguez.

Quite a bit of overlap at the top and bottom. I don’t have time right now to look at the middle, but I’m always interested in new ways to tease out catcher contributions.


#2          (see all posts) 2008/03/14 (Fri) @ 23:49

>This would imply a year-to-year correlation of r=1-1/1.93^2=.73.

Tom, I think I’ve asked this before, but is there a post where you explain the math for this?  I never really understood where that formula (and some of the other similar ones) comes from.


#3    MGL      (see all posts) 2008/03/14 (Fri) @ 23:53

Tango, how do these numbers compare with yours?  Where is the THT data?  Does the pitch f/x data include whether there were runners on base?  If not, then there could be a lot of noise generated by whether certain catchers care a lot or not about blocking pitches with runners on base.  In fact, I am not sure which I would rather have - blocked pitches per opp with runners on, or blocked pitches per opp in all situations (larger sample).


#4    tangotiger      (see all posts) 2008/03/15 (Sat) @ 00:13

I’d only look at it with runners on base, as the article does.  Who cares about blocking a pitch if there’s no one on base?

Phil, It’s basically
var(obs) = var(true) + var(random)

r = 1 - var(random)/var(obs)

The SD of a purely random z-score is 1.00, so var(random) is 1.0.  So, if the observed SD is 2.0, then var(obs) is 4.0 and r = 1 - 1/4 = .75.
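One way to check the decomposition is a small simulation. This is a sketch under stated assumptions, not the method used in the article: every catcher gets the same 190 opportunities, the league block rate is set to 0.86, and the spread in true talent (talent_sd) is a made-up value chosen only for illustration.

```python
import random
import statistics


def simulate_z_scores(n_catchers, opps, mean_rate, talent_sd, rng):
    """Z-score of blocks for each simulated catcher against the league rate."""
    binom_sd = (opps * mean_rate * (1 - mean_rate)) ** 0.5
    z_scores = []
    for _ in range(n_catchers):
        # True talent: league rate plus a normal deviation, clipped to [0, 1].
        p = min(max(rng.gauss(mean_rate, talent_sd), 0.0), 1.0)
        blocks = sum(rng.random() < p for _ in range(opps))
        z_scores.append((blocks - opps * mean_rate) / binom_sd)
    return z_scores


rng = random.Random(2008)
no_skill = statistics.pstdev(simulate_z_scores(1000, 190, 0.86, 0.0, rng))
skill = statistics.pstdev(simulate_z_scores(1000, 190, 0.86, 0.04, rng))

# With no spread in talent the SD of the z-scores should land near 1.0;
# with a real spread in talent it should be clearly larger.
print(round(no_skill, 2), round(skill, 2))
print("implied r:", round(1 - 1 / skill ** 2, 2))
```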


#5    Pizza Cutter      (see all posts) 2008/03/15 (Sat) @ 01:57

Tango, be careful with the var(obs) = var(true) + var(random) formula here.  This is actually a case of a multi-level model with catchers being nested within pitching staffs.  The Mirabelli/Wakefield example is key here.  Put Mirabelli on a much different staff (one where he doesn’t exclusively catch Wakefield or some other knuckleballer) and suddenly he looks like a much different catcher.  (And since the Red Sox just released him, he may get his wish.) Certainly, nothing changed about him or his ability, simply the context in which he was catching.  Before talking about whether or not it’s a skill, you need to properly partial out the variance between the levels.  How much of the performance is individual catchers, and how much is the fact that their pitching staffs are kind to them?


#6    tangotiger      (see all posts) 2008/03/15 (Sat) @ 07:23

You are right, and a WOWY approach is what is needed:
http://tangotiger.net/catchers.html

Luckily, Mirabelli wasn’t in the sample.
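A rough Python sketch of the with-or-without-you idea applied to block rates. The records are invented, and weighting each pitcher by the smaller of his two opportunity counts is an assumption made here; the linked article may weight differently.

```python
def wowy_block_diff(records, catcher):
    """Weighted average of (block rate with catcher) - (block rate without),
    taken pitcher by pitcher.

    records: iterable of (pitcher, catcher, opps, blocks) tuples (hypothetical).
    """
    with_c, without_c = {}, {}
    for pitcher, c, opps, blocks in records:
        bucket = with_c if c == catcher else without_c
        o, b = bucket.get(pitcher, (0, 0))
        bucket[pitcher] = (o + opps, b + blocks)

    num = den = 0.0
    for pitcher, (o_w, b_w) in with_c.items():
        o_wo, b_wo = without_c.get(pitcher, (0, 0))
        if o_w == 0 or o_wo == 0:
            continue  # pitcher lacks one side of the comparison
        weight = min(o_w, o_wo)  # assumed weighting, see note above
        num += weight * (b_w / o_w - b_wo / o_wo)
        den += weight
    return num / den if den else None


# Invented data: catcher "A" shares two pitchers with catcher "B".
records = [
    ("P1", "A", 50, 45), ("P1", "B", 40, 32),
    ("P2", "A", 20, 16), ("P2", "B", 30, 27),
    ("P3", "A", 10, 9),  # P3 never threw to anyone else, so he is skipped
]
print(wowy_block_diff(records, "A"))
```

Because each pitcher is compared only with himself, a knuckleballer paired with one catcher does not drag that catcher's number down unless other catchers also caught him.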


#7    Dan Turkenkopf      (see all posts) 2008/03/15 (Sat) @ 09:43

Thanks Tango for linking this.

I understand this is undeniably tied into pitching staffs, although in general the effect should be less than you might see with Wakefield.  Unfortunately, I don’t think there’s really enough data to go on to do a WOWY study, since the pitch data only goes back one season.  We might get a little closer after 2008, but I’d guess we need a few more seasons to really be able to separate out the pitchers effectively.

I’ll also take a look at the defensive innings - but it really should represent the total innings at catcher since the “Ball In Dirt” designation doesn’t require the full PITCHf/x system.


#8    Mike Fast      (see all posts) 2008/03/15 (Sat) @ 09:53

In addition to pitching staffs, there may be a dependence on the PITCHf/x stringer, too.  Dan Fox mentioned on his blog last year that not all PITCHf/x stringers were consistent about how they designated balls in the dirt in the system.

Dan T., are you also counting Swinging Strikes (Blocked) for the catchers?


#9    Dan Turkenkopf      (see all posts) 2008/03/15 (Sat) @ 10:23

Mike:  I wasn’t including those.  That will definitely change things somewhat.  I’ll rerun the numbers and post the updated information.

And I agree that there’s some strangeness about the “Ball In Dirt” characterization.  I first tried to look at pitches where the height at the plate was less than 6 inches but that didn’t match up well to the balls in the dirt at all.


#10    Dan Turkenkopf      (see all posts) 2008/03/15 (Sat) @ 11:00

Ok, new numbers are now posted


#11    Colin Wyers      (see all posts) 2008/03/15 (Sat) @ 12:21

Pretty cool.

The average catcher has an 86% save percentage. Using that we can figure out how many blocks a catcher has above/below the average, and then convert it to runs. I averaged out the values for PB and WP from http://www.tangotiger.net/bsrexpl.html - it was a .002 difference anyway.

Looks like Varitek saved about 5.22 runs with his ability to block balls in the dirt; Pudge cost his team about 7.90 runs with his inability to do the same.
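The conversion can be sketched in Python as below. The run value per block and the catcher's totals are placeholders chosen for illustration; the thread does not state the exact run value used, so these numbers will not reproduce the figures above.

```python
LEAGUE_BLOCK_RATE = 0.86  # average "save percentage" quoted in this comment


def runs_saved(blocks, opps, run_value, league_rate=LEAGUE_BLOCK_RATE):
    """Blocks above the league-average expectation, converted to runs.

    run_value is the assumed cost in runs of one unblocked pitch (WP or PB).
    """
    blocks_above_avg = blocks - league_rate * opps
    return blocks_above_avg * run_value


# Hypothetical catcher: 180 blocks in 200 opportunities, 0.28 runs per block.
print(round(runs_saved(180, 200, run_value=0.28), 2))  # 2.24
```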


#12    Bobby Swift      (see all posts) 2008/03/15 (Sat) @ 15:01

Very interesting stuff.

This is an aside, but has there ever been a more overrated player than Ivan Rodriguez?


#13    MGL      (see all posts) 2008/03/15 (Sat) @ 15:26

Well, Pudge used to hit and run real well for a catcher and did indeed have a great arm to the extent that he saved at least several runs a year in SB/CS.  So why is/was he so overrated?  I agree that he got most of his reputation as a great player because of his throwing ability, and if that is canceled out by his inability to block pitches well, then yes, there is a problem.

Tango, the reason to look at pitches in the dirt with no runners on is to increase the sample size.  Of course that assumes that all catchers use the same ability to block with runners not on as they do with runners on, which is probably not the case.  Because of that, I agree that we probably should not be using the no runners on data.

+- 5-8 runs a year is a lot, depending on what the true talent level is (the “r”).  If the regression is 50% per season, then +-3 runs a year in talent is still on the order of outfield arms and baserunning.
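As a quick illustration of that regression step in Python, using the run estimates from #11 and assuming r = .50 for one season (the function name is made up for this sketch):

```python
def regress_to_mean(observed_runs, r, league_mean=0.0):
    """Estimate true talent by keeping a share r of the observed deviation."""
    return league_mean + r * (observed_runs - league_mean)


print(regress_to_mean(5.22, 0.5))    # 2.61
print(regress_to_mean(-7.90, 0.5))   # -3.95
```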


#14    Colin Wyers      (see all posts) 2008/03/15 (Sat) @ 16:31

MGL, we’d really need more than one year of data to confidently say that Pudge is as bad at this as our numbers say, and even then Pudge in his prime was probably better at blocking pitches than Pudge at his age.

Looking at the THT data, it would seem that 2007 is the worst Pudge has done at blocking pitches in the past four seasons.

http://www.hardballtimes.com/main/stats/players/index.php?playerId=1275&firstName=Ivan&lastName=Rodriguez

But what the BIS data (and Retrosheet, for that matter) lack is the “balls in dirt” designation that gives us the opportunities. In the specific case of Pudge, we’d need WOWY to have a good idea of what’s going on.


#15    tangotiger      (see all posts) 2008/03/15 (Sat) @ 17:01

The WOWY in my article in THT08 had IRod as one of the best fielding catchers for baserunner events in the Retrosheet years.  His entire value came from holding the running game down (an enormous 10 runs saved per season).  The rest of his game, including blocking pitches, was league average.  There’s certainly no reason to think he was other than that, even given this data, seeing that this data is (a) for only one season, and (b) for him at the tail-end of his career.


#16    joe arthur      (see all posts) 2008/03/15 (Sat) @ 18:23

The data is described as pitch f/x data, but it’s really just derived from the mlb.com XML files. [Pitch f/x is an extension of that XML - essentially extra fields added to the usual XML structure when the pitch f/x system was used for a particular game.] It looks like these XML files are still available back through 2005, for example:
http://gd2.mlb.com/components/game/mlb/year_2005/month_08/day_07/gid_2005_08_07_houmlb_sfnmlb_1/inning/inning_1.xml

Unfortunately, as Colin pointed out, it doesn’t look like retrosheet or fangraphs make this distinction on pitch outcomes. (I haven’t looked at other sources.) But Dan T’s work could be extended back to 2005 with more mlb.com data…


#17    Billy      (see all posts) 2008/03/15 (Sat) @ 19:22

I think there’s one minor problem with only using data with runners on base.  What about a swinging strike three that gets past the catcher and the runner gets to first?


#18    MGL      (see all posts) 2008/03/15 (Sat) @ 20:12

#17, you would certainly want to include that, but it rarely happens anyway…


#19    Dan Turkenkopf      (see all posts) 2008/03/15 (Sat) @ 21:06

Joe/#16:

If they score pitches in the dirt, then I could easily do the same thing back through 2005.  I’ll work on grabbing that data tomorrow and see what I can do.  Thanks for the info.

Billy/#17:
Agreed, but as MGL says, it doesn’t really happen much.  It looks like it might have only happened once in 2007.


#20    MB      (see all posts) 2008/03/15 (Sat) @ 21:48

Dan, this is pretty cool stuff.

I think it may be possible to break it down by pitch type, no? (although it may require a bunch of work). Maybe some catchers face a lot more fastballs, sliders, etc. in the dirt than other catchers and maybe those pitches are harder or easier to stop. Would be neat to look at (at least to me).


#21    MB      (see all posts) 2008/03/16 (Sun) @ 01:16

Doh ... read through this thread more closely and see that you guys have already discussed what I just mentioned. My apologies. Anyway, does the data you have include pitch speed, break, and all that other stuff?


#22    Dan Turkenkopf      (see all posts) 2008/03/16 (Sun) @ 10:28

MB/21:

Some of the data includes the pitch details.  I’m planning on drilling down deeper on pitch type, etc, over the next week or two.  I’ve got some concerns on the sample sizes at that point, but it’s definitely still worth doing.


#23    greenback06      (see all posts) 2008/03/16 (Sun) @ 13:48

Would a WOWY study have a problem with pitcher behavior? If I’ve got a catcher behind the plate with a reputation as a slug, then I’m not trying to throw the ball 6” off the ground. I don’t know if this nitpick would be relevant outside the laboratory, but it is something you hear during baseball games when a pitch gets thrown in the dirt.


#24    Colin Wyers      (see all posts) 2008/03/16 (Sun) @ 14:49

greenback - You can probably account for that sort of thing, if you have pitch location data in your WOWY sample. Now that obviously limits the time period in which you can do this sort of research, but it could provide some interesting results nonetheless.


#25    Dan Turkenkopf      (see all posts) 2008/03/20 (Thu) @ 07:05

For those who are interested, I posted numbers for 2005 and 2006 here.

Looking at the results I have some concerns about data quality, especially for 2006, and the impact of pitchers.

It looks like Matt Clement was enough to push Varitek from one of the top catchers to one of the worst.  So my next step is to follow Tango’s WOWY method.


#26    Tangotiger      (see all posts) 2008/03/26 (Wed) @ 09:54

Another update from Dan:
http://beyondtheboxscore.com/story/2008/3/25/135121/550

