THE BOOK--Playing The Percentages In Baseball

Monday, March 08, 2010

Pre-Introducing Batted Ball FIP

By Tangotiger, 05:26 PM

A pitcher’s performance can be split between what includes his fielders and what doesn’t.  The latter is captured by the component called FIP (fielding independent pitching).  The former includes hits and outs, putting the pitcher at the mercy of his fielders and his parks. We need a separate metric to capture this component of pitching. 

A few years ago, I planted the seed of what a batted ball FIP would look like: rather than looking at the result of the at bat in terms of hits and outs, we’d instead look at whether the batted ball was a groundball or flyball.

Many analysts have done many good things with batted ball based component ERA.  I’m here to set the bar.  Not high, but low.  As with Marcel, I’m not interested in creating the best possible system.  I’m instead interested in creating the least acceptable system that is relevant.  FIP was one such system.  It works fantastically well because:
1. it uses a limited amount of data (K, BB, HR, IP)
2. it combines them in a simple, easy-to-remember fashion
3. produces such good results that it gets us most of the way there

DIPS and BaseRuns-DIPS are better systems than what I have done.  And if you can, choose the better system.  But FIP is what is ubiquitous, not DIPS. Everyone else can fight and claw their way from my low point to the top.  I’ll be happy to stay one cut below them.

And so, I now introduce Batted Ball Fielding Independent Pitching (bbFIP).  I’ll also show you how you too can create a metric as easily as possible.  Your first step is to get hold of data.  Metrics are created to try to capture existing historical data and try to make sense of it.  We’re not here to invent a wheel: we’re here to explain how a wheel turns.

I have in my possession pitcher data from 2002-2009, totalled by pitcher, that includes these results:
- strikeouts, walks, hit batters
- groundballs, outfield flyballs, infield flyballs, line drives
- runs allowed, plate appearances

I’ve got 311 pitchers with at least 1500 plate appearances.

The question on the table is: what is the relationship between all those events in the first two lines and runs allowed?  Well, we look at the data and try to figure it out.  The best thing to do is run a regression, but presuming most of you are like I am, we want to get our hands dirty and try to see the results for ourselves.

Let’s start with the first line.  I’m going to exclude intentional walks and include hit batters.  Then, I’m going to classify each of my 311 pitchers into 3 groups of walk rates (low, normal, high) and 3 groups of strikeout rates (low, normal, high).

The walk groups are based on two boundaries: .077 and .096 walks per PA.  If you have fewer than .077, you are in the low walk group.  If you have more than .096, you are in the high walk group.  We end up seeing this:

  n  avgBB  avgRA  BBclass
107  0.065   4.45  1_Low
102  0.087   4.78  2_Norm
102  0.111   4.76  3_High
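
A minimal Python sketch of the binning step, assuming each pitcher is a record with non-intentional walks plus hit batters (`bb`), plate appearances (`pa`), and runs allowed per 9 IP (`ra9`). The three sample pitchers and the names `walk_class` and `summarize` are made up for illustration. Only the .077 and .096 boundaries come from the text, and giving every pitcher equal weight in the averages is an assumption.

```python
from collections import defaultdict

LOW_BB, HIGH_BB = 0.077, 0.096  # walk-rate boundaries from the text

def walk_class(bb_per_pa):
    """Label a walk rate as low, normal, or high."""
    if bb_per_pa < LOW_BB:
        return "1_Low"
    if bb_per_pa > HIGH_BB:
        return "3_High"
    return "2_Norm"

def summarize(pitchers):
    """Group pitchers by walk class; return {class: (n, avgBB, avgRA)}."""
    groups = defaultdict(list)
    for p in pitchers:
        rate = p["bb"] / p["pa"]
        groups[walk_class(rate)].append((rate, p["ra9"]))
    summary = {}
    for label, rows in groups.items():
        n = len(rows)
        avg_bb = sum(rate for rate, _ in rows) / n  # unweighted mean (assumption)
        avg_ra = sum(ra for _, ra in rows) / n
        summary[label] = (n, avg_bb, avg_ra)
    return summary

# Hypothetical pitchers, not real data.
sample = [
    {"bb": 100, "pa": 2000, "ra9": 4.10},  # .050 BB/PA
    {"bb": 170, "pa": 2000, "ra9": 4.80},  # .085 BB/PA
    {"bb": 220, "pa": 2000, "ra9": 5.00},  # .110 BB/PA
]
for label, (n, avg_bb, avg_ra) in sorted(summarize(sample).items()):
    print(f"{n:3d} {avg_bb:.3f} {avg_ra:.2f} {label}")
```
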

We have 107 pitchers in the low walk group, who averaged .065 walks per batter and allowed 4.45 runs per 9 IP.  It’s not terribly interesting other than how uninteresting it is.  The question is whether there is BIAS in the groupings.  Is there something uncontrolled that is linked to walks to such an extent that the results we are seeing are not just about walks, but about something else?  Let’s also include strikeout rates:
  n  avgBB  avgSO  avgRA  BBclass
107  0.065  0.170   4.45  1_Low
102  0.087  0.168   4.78  2_Norm
102  0.111  0.189   4.76  3_High

Ah-ha, so the high-walk group also contains a lot of high-strikeout pitchers.  So perhaps the reason their runs allowed did not skyrocket is that the group includes a lot of good strikeout pitchers.  Our next step is clear: break down by strikeout groups instead.

We’ll use boundaries of .152 and .192 strikeouts per PA.  This is what we get:
  n  avgBB  avgSO  avgRA  SOclass
103  0.081  0.131   5.12  1_Low
102  0.091  0.169   4.89  2_Norm
106  0.089  0.225   4.00  3_High

Now we see that high strikeout pitchers give up very few runs.  This is a huge indicator.  Walks and strikeouts are always used together, so let’s include BOTH groups: those that break down by walks and by strikeouts.  We get this:
  n  avgBB  avgSO  avgRA  SOclass  BBclass
 34  0.064  0.223   3.70  3_High   1_Low
 29  0.087  0.221   4.03  3_High   2_Norm
 43  0.111  0.230   4.22  3_High   3_High

 30  0.066  0.167   4.60  2_Norm   1_Low
 32  0.087  0.169   4.84  2_Norm   2_Norm
 40  0.112  0.170   5.14  2_Norm   3_High

 43  0.065  0.129   4.95  1_Low    1_Low
 41  0.086  0.130   5.28  1_Low    2_Norm
 19  0.108  0.137   5.16  1_Low    3_High

The best group is, obviously, the pitchers who strike out the most and walk the least.  Those 34 pitchers averaged .064 walks per PA, .223 strikeouts per PA, and allowed 3.70 runs per 9 IP.  The worst pitchers either gave up a lot of walks or struck out few batters.
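
The two-way breakdown can be sketched the same way: give each pitcher a strikeout class and a walk class, then count pitchers per cell. This is only an illustration. The boundaries are the ones quoted in the text, but treating the strikeout boundaries exactly like the walk boundaries (below .152 is low, above .192 is high) is an assumption, and the sample rates and the names `three_way` and `cell` are hypothetical.

```python
from collections import Counter

BB_BOUNDS = (0.077, 0.096)  # walk-rate boundaries from the text
SO_BOUNDS = (0.152, 0.192)  # strikeout-rate boundaries from the text

def three_way(rate, bounds):
    """Classify a per-PA rate as low, normal, or high."""
    low, high = bounds
    if rate < low:
        return "1_Low"
    if rate > high:
        return "3_High"
    return "2_Norm"

def cell(bb_rate, so_rate):
    """Return the (SOclass, BBclass) cell for one pitcher."""
    return three_way(so_rate, SO_BOUNDS), three_way(bb_rate, BB_BOUNDS)

# Hypothetical (BB/PA, SO/PA) pairs, not real pitchers.
sample = [(0.064, 0.223), (0.111, 0.230), (0.065, 0.129), (0.087, 0.169)]
counts = Counter(cell(bb, so) for bb, so in sample)
for (so_class, bb_class), n in sorted(counts.items(), reverse=True):
    print(n, so_class, bb_class)
```
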

Before we go off and run a regression, whose results most of you might not even believe, let’s look at the numbers.  Specifically, let’s look at the first three rows: they are all from high strikeout pitchers, and their strikeout rates are around .22 to .23 per PA.  The range in runs allowed goes from 3.70 to 4.22, and that’s based on the range of walks allowed of .064 to .111.  So, we can roughly say that, for this group of high-K pitchers, giving up an extra .047 walks per PA (.111 minus .064 equals .047) leads to an extra 0.52 runs per 9 IP (4.22 minus 3.70).  Or, more generally speaking, each .1 walk per PA adds about 1.1 runs per 9 IP (.52/.047*.1).

Let’s look at the second group of pitchers, those with a K-rate of around .170.  For those pitchers, the walk rates go from .066 to .112 (difference of .046 walks) and that leads to a change in runs from 4.60 to 5.14 (difference of .54 runs).  Well, these numbers are fantastically similar to the previous group, so we end up with a similar result: each .1 walk per PA adds 1.17 runs per 9 IP.

Finally, in the third group, .043 walks per PA adds, well, it’s not as clear because we have a little blip in there.  We’ll get back to that in a second.

We can repeat this with the walk groups.  If you look at the first row of each group, you will see they are all the low-walk groups, and they all have around the same number of walks (.064 to .066).  The differentiator is the strikeouts (.223 down to .129) and the runs allowed move the other way (3.70 to 4.95).  A difference of nearly .100 strikeouts leads to 1.25 runs per 9 IP (or more precisely 1.32 runs per 9IP for each 0.1 K per PA).

If we look at each of the second rows, we see the BB rates are stable, while the K rates go from .221 to .130, for a change in runs allowed of 4.03 to 5.28, or a rate of 1.37 runs per 9IP for each 0.1 K per PA.

Finally, the last row will give us 1.00 runs per 9IP for each 0.1 K per PA change.

Overall, we can see that, generally speaking, each 0.1 K per PA moved the runs allowed by around 1.0 to 1.3 runs per 9IP and each 0.1 BB per PA moved the runs allowed by around the same amount but in the opposite direction.
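
The by-hand differences above can be reproduced from the nine-row table. This is a rough sketch, not the author's code: it uses the rounded values printed in the table, so a result can differ from the figures quoted in the text in the last digit, and the function names are made up. Strikeout slopes come out negative because more strikeouts go with fewer runs.

```python
# (SOclass, BBclass) -> (avgBB, avgSO, avgRA), copied from the nine-row table
TABLE = {
    ("3_High", "1_Low"): (0.064, 0.223, 3.70),
    ("3_High", "2_Norm"): (0.087, 0.221, 4.03),
    ("3_High", "3_High"): (0.111, 0.230, 4.22),
    ("2_Norm", "1_Low"): (0.066, 0.167, 4.60),
    ("2_Norm", "2_Norm"): (0.087, 0.169, 4.84),
    ("2_Norm", "3_High"): (0.112, 0.170, 5.14),
    ("1_Low", "1_Low"): (0.065, 0.129, 4.95),
    ("1_Low", "2_Norm"): (0.086, 0.130, 5.28),
    ("1_Low", "3_High"): (0.108, 0.137, 5.16),
}

def per_tenth(delta_runs, delta_rate):
    """Change in runs per 9 IP for each 0.1 change in a per-PA rate."""
    return delta_runs / delta_rate * 0.1

def walk_slope(so_class):
    """Hold the strikeout class fixed; compare high-walk to low-walk rows."""
    lo_bb, _, lo_ra = TABLE[(so_class, "1_Low")]
    hi_bb, _, hi_ra = TABLE[(so_class, "3_High")]
    return per_tenth(hi_ra - lo_ra, hi_bb - lo_bb)

def strikeout_slope(bb_class):
    """Hold the walk class fixed; compare high-K to low-K rows."""
    _, lo_so, lo_ra = TABLE[("1_Low", bb_class)]
    _, hi_so, hi_ra = TABLE[("3_High", bb_class)]
    return per_tenth(hi_ra - lo_ra, hi_so - lo_so)

for cls in ("3_High", "2_Norm", "1_Low"):
    print(f"SO {cls}: {walk_slope(cls):+.2f} runs per 0.1 BB/PA")
for cls in ("1_Low", "2_Norm", "3_High"):
    print(f"BB {cls}: {strikeout_slope(cls):+.2f} runs per 0.1 SO/PA")
```
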

And this is where we run our first regression.  Rather than doing everything that I just did, in terms of binning and finding differences, etc, we run a regression.  This is what the regression was designed for.  When we run a regression of the two independent variables (strikeouts and walks per PA) against the dependent variable (runs per 9 IP), we end up with this regression equation:

RA = 5.84 + 11.8*BB - 12.6*SO

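As you can see, this is exactly what we should have expected: each 0.1 strikeouts per PA moves runs allowed by around 1.3 runs per 9 IP (or each 1.0 strikeouts per PA would move it by about 13 runs), with strikeouts lowering runs and walks raising them by nearly the same amount.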
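
For readers without the pitcher-level data, here is a hedged sketch of what such a regression does, fitted to the nine group averages from the table and weighted by group size. That is an assumption made purely for illustration. The article's equation comes from all 311 pitchers, so the coefficients from this sketch should land in the same neighborhood but will not match exactly. `weighted_fit` is a made-up helper that solves the two-variable least-squares problem in closed form.

```python
# (n, avgBB, avgSO, avgRA) for the nine groups in the table
GROUPS = [
    (34, 0.064, 0.223, 3.70),
    (29, 0.087, 0.221, 4.03),
    (43, 0.111, 0.230, 4.22),
    (30, 0.066, 0.167, 4.60),
    (32, 0.087, 0.169, 4.84),
    (40, 0.112, 0.170, 5.14),
    (43, 0.065, 0.129, 4.95),
    (41, 0.086, 0.130, 5.28),
    (19, 0.108, 0.137, 5.16),
]

def weighted_fit(rows):
    """Weighted least squares for RA = a + b*BB + c*SO (weights = group size)."""
    total = sum(n for n, _, _, _ in rows)
    mean_bb = sum(n * bb for n, bb, _, _ in rows) / total
    mean_so = sum(n * so for n, _, so, _ in rows) / total
    mean_ra = sum(n * ra for n, _, _, ra in rows) / total
    # Weighted sums of squares and cross-products around the means.
    s_bb = sum(n * (bb - mean_bb) ** 2 for n, bb, _, _ in rows)
    s_so = sum(n * (so - mean_so) ** 2 for n, _, so, _ in rows)
    s_bs = sum(n * (bb - mean_bb) * (so - mean_so) for n, bb, so, _ in rows)
    s_br = sum(n * (bb - mean_bb) * (ra - mean_ra) for n, bb, _, ra in rows)
    s_sr = sum(n * (so - mean_so) * (ra - mean_ra) for n, _, so, ra in rows)
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s_bb * s_so - s_bs ** 2
    b = (s_br * s_so - s_bs * s_sr) / det
    c = (s_bb * s_sr - s_bs * s_br) / det
    a = mean_ra - b * mean_bb - c * mean_so
    return a, b, c

a, b, c = weighted_fit(GROUPS)
print(f"RA = {a:.2f} + ({b:.1f})*BB + ({c:.1f})*SO")
```
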

As interesting as it is that one walk roughly cancels one strikeout, the more interesting finding is that the correlation coefficient is r=0.75.  This is a very high number, as you will soon see.

Now, as I said, I’m a simple guy.  I don’t want to see “11.8” and “12.6” in there.  They are close enough that perhaps we should just look at BB-SO (treating them both equally) and let the regression figure out the proper weight.  And this is what we get:

RA = 5.76 + 12.5 * (BB - SO)
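
As a quick illustration, the simplified equation is easy to turn into a function. The constants 5.76 and 12.5 are from the equation above, and the 0.92 earned-run factor is the one the article gives for converting to an ERA scale. The function names are made up, and the example input is the best group from the table (.064 walks and .223 strikeouts per PA, 3.70 observed).

```python
def ra_from_bb_so(bb_per_pa, so_per_pa):
    """Estimated runs allowed per 9 IP from walk and strikeout rates per PA."""
    return 5.76 + 12.5 * (bb_per_pa - so_per_pa)

def era_scale(ra, earned_share=0.92):
    """Rescale runs allowed to an ERA scale (92% of runs are earned)."""
    return ra * earned_share

estimate = ra_from_bb_so(0.064, 0.223)
print(f"estimated RA: {estimate:.2f}")            # observed group average: 3.70
print(f"on an ERA scale: {era_scale(estimate):.2f}")
```
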

(If you need it in the form of ERA, multiply the resultant answer by 0.92, because 92% of runs allowed are earned.) And there you have it, the first step in figuring out bbFIP.  And, incredibly, we could actually stop right here, and consider NOTHING ELSE about a pitcher, and we’d be really close to capturing his performance.  Really, you ask?

Here is a chart that plots a pitcher’s strikeouts minus walks on the x-axis and runs allowed per 9 IP on the y-axis.
[Chart: strikeouts minus walks (x-axis) vs. runs allowed per 9 IP (y-axis)]

That’s what an r=.75 chart looks like.  And this was done by looking only at a pitcher’s walks and strikeouts and nothing else.  I know, pretty cool.  Anything we do beyond this point is going to be gravy. I’ll get to part 2 next time.

#1          (see all posts) 2010/03/08 (Mon) @ 22:40

This is *great*.

I take it runs allowed was chosen because you are explicitly looking at the component of pitcher performance that includes fielding?


#2    Tangotiger      (see all posts) 2010/03/08 (Mon) @ 22:59

Actually, ER is stupid and silly and ridiculous.  And I’m being nice.  I see no point to its existence in this day and age.

When we look at runs allowed, it’s a combination of pitching and fielding.  When we look at ER, it’s a combination of pitching and fielding and capriciousness and illogic, and bias in that a GB pitcher allows more unearned runs than a FB pitcher.

When we want to do our mappings of inputs to outputs, the inputs (hits, walks, SO, HR, batted ball, etc) all lead to RUNS.  ER is something made up that means nothing.

Finally, if you look at year-to-year stats of pitchers who change teams, the fielding component will cancel out (given enough pitchers).  And so, the flimsiest of reasons to have ER goes away, and again, all we care about is runs.

I can’t tell you the distaste I have for ER.


#3    Nick Steiner      (see all posts) 2010/03/08 (Mon) @ 23:14

So why the hell did you scale FIP to it? ;)

I should wait for part two, but I’m curious as to why you don’t consider tRA the Marcels of batted ball DIPS.  It just takes the 8 outcomes (K, BB, HR, HBP, FB%, GB%, LD%, PU%), and divides the sum of the expected runs by the sum of the expected outs and multiplies by 27.  It doesn’t mess around with regression and a constant, because it is already scaled perfectly to RA because it uses Linear Weights.


#4    MGL      (see all posts) 2010/03/08 (Mon) @ 23:37

Great stuff so far.  Really well explained.  And yes, taking out errors to get ER is like taking out singles and calling the result “non-singles runs allowed.” Errors are merely another offense event.  It’s not like singles, doubles, and triples have nothing to do with fielding.  On top of that, as Tango says, we have the bias associated with a pitcher’s GB and FB tendencies as well as the bias associated with batted ball percentages (of TBF).  If a pitcher allows a lot of batted balls, he will give up more runs.  Part of giving up runs via batted balls is giving your defense a “chance” to make errors.  Why in the world would you want to factor that out when assessing or rewarding a pitcher?  Part of what makes a high K pitcher a good pitcher is that he does not allow a lot of balls in play and one of the reasons that balls in play are bad is because fielders can and do make errors on them!

ERA is a throwback to when everyone thought that the only reflection of fielding talent is fielding percentage.

While ERA needs to be tossed into the trash bin, it never will be, because on a game-by-game basis, pitchers who allow a bunch of runs in any one inning because of an error or two will think that it is grossly unfair to be charged with those runs.  Fans, commentators, and pretty much everyone else would agree with that.  Of course, the answer (to a pitcher complaining about that) is two-fold: One, what about when a fielder fails to get to a ground ball but doesn’t actually make an error.  Two, well, every time you allow the batter to put a ball in play, that is one of the risks you take!


#5    Josh      (see all posts) 2010/03/09 (Tue) @ 00:12

Great points Tango and MGL. lol at non-singles runs allowed.

Laziness on my part that I’ve never looked to see if there was any actual fielding information extracted from runs allowed for the pitcher universe.

Love the passion, and thanks for the detailed exposition from you both.


#6    Sunny Mehta      (see all posts) 2010/03/09 (Tue) @ 11:45

“One, what about when a fielder fails to get to a ground ball but doesn’t actually make an error.  Two, well, every time you allow the batter to put a ball in play, that is one of the risks you take!”

Three, you DON’T get charged with anything when you let up that scorching line drive down the line that your third baseman makes a diving catch on to save extra bases.

It’s funny how pitchers (and commentators, fans, etc) find it acceptable to take credit for the process and ignore the result when it benefits them, and then do the exact opposite when it doesn’t.  :)


#7    Sunny Mehta      (see all posts) 2010/03/09 (Tue) @ 11:50

Tom,

Great article. Well written and very well explained.

That main conditional distribution table is so telling. I do perhaps think it’s important to note that once you slice and dice the categories appropriately, we’re dealing with sample sizes between 19 and 43 - small enough to likely include a shit ton of randomness.


#8    Guy      (see all posts) 2010/03/09 (Tue) @ 12:29

Tango/MGL:
This is only tangentially related, but:  Have you guys (or anyone else) ever studied whether high-K pitchers do a better job of striking out hitters at all quality levels equally, or whether they are particularly successful at striking out certain types of hitters (e.g. good hitters, or weak hitters).  What I’m wondering is whether the pool of hitters to whom high-K pitchers give up BIP is any better or worse—in terms of the hitters’ BABIP skill—than low-K pitchers.  We know that high-K pitchers tend to have a slightly better BABIP.  Is this in part because their BIP are actually hit by lower-BABIP hitters (because they struck out a lot of the high-BABIP hitters)?  Or conversely, is their ability to suppress hits on BIP understated because their BIP are actually hit by above-average hitters?


#9          (see all posts) 2010/03/09 (Tue) @ 15:47

This is an interesting argument about ERA that I never considered before.

But as a counterfactual, what if ERA never existed?  What if the dominant pitching statistic was RA?  We just count how many runs occur, per nine innings, when a pitcher is on the mound.  Writers look at wins and RA when voting for awards.

Would sabremetricians then invent a measure to take errors out of the calculation because, after all a run caused by a fumbled play by another fielder is not a reflection of the pitcher’s true talent level?


#10    Tangotiger      (see all posts) 2010/03/09 (Tue) @ 15:52

No, a saberist would not have created ERA.  That’s because all these stats reflect things that actually happened.

An ERA is a figment of someone’s imagination.  And a saberist has a better imagination than that.  In no way would a saberist create a rule that says: “all runs after two-out errors are discarded, but all HITS and all WALKS after two-out errors count”.  The illogical is irrational.


#11    Bill      (see all posts) 2010/03/15 (Mon) @ 15:47

I cannot wait for Part 2!  Any idea when it will come?


#12    Tangotiger      (see all posts) 2010/03/15 (Mon) @ 16:22

I was working on and off it for several days.  I’m trying to find the right hook.  It’s on the backburner for now.

It’s going to focus on HR as a skill or not (reality: something in-between).


#13    Bill      (see all posts) 2010/03/17 (Wed) @ 21:16

It’ll be worth the wait, I’m sure.

Thanks, Tom.


#14          (see all posts) 2010/03/18 (Thu) @ 01:31

To answer Ed’s question above (which I didn’t see when he wrote it, for some reason) - if a saberist (yes, that word is a lot easier to write than sabermetrician, and both of them fail spellcheck anyway!) was trying to isolate pitching talent, he would try and adjust for or normalize errors in the same way he does for all other offensive events.  Now, as it turns out, ERA would probably better capture a pitcher’s true talent than RA, and it might even correlate better with future RA, but then again, so would FIP or DIPS ERA.  I’ll disagree with Tango and say that a saberist might invent ERA, sort of like a poor man’s FIP, which it is…


#15          (see all posts) 2010/03/22 (Mon) @ 02:00

sort of like a poor man’s FIP, which it is…

With apologies to Bill Simmons, it’s a homeless man’s FIP.


#16    Tangotiger      (see all posts) 2010/03/22 (Mon) @ 07:27

Kevin: nice.

MGL: no saberist will, at the same time, do this:

1. Given there are two outs and an error occurred
a. ignore all runs following the error
b. continue to include all hits and walks following the error

The most glaring example is counting a HR as a hit, but not a run, in a pitcher’s performance line.

If a saberist invents “earned” runs, he also invents “earned” hits and “earned” walks at the same time.  He won’t do one without the other.

I fail to see why we think we’re doing a good thing by counting all the hits after the 2-out error, but not counting all the runs.

It was a stupid and silly practice enacted by data recorders, who acted for a minute as data analysts.

Data recorders should record data.
Data analysts should analyze data.


#17          (see all posts) 2010/03/22 (Mon) @ 11:11

Ok...I’m kind of new to the saber world. I’ve loved following baseball stats since I was a kid, but never really delved deeply into it prior to a few months ago. I’ve learned a lot, but have a ridiculous amount of learning still ahead of me.

With that said, Tom...I think you just reached through the interwebs and sucker punched me. I’ve long accepted that FIP is more indicative of pitching ability than ERA. But I had never considered how illogical ERA is before. In one post you turned my world upside down.

Whoa.


#18    Sky      (see all posts) 2010/03/22 (Mon) @ 11:23

I think a saberist would invent “ERE24”.  RE24 solves the ERA problem with the start and stop points, and the Earned nature of it would ignore any specific plays involving errors, but not stuff afterwards.

Kerry Whisnant did that over at dugoutcentral.com


#19    Tangotiger      (see all posts) 2010/03/22 (Mon) @ 11:31

Rob, it’s even worse.  Brandon Webb is a GB pitcher.  Johan Santana is a FB pitcher.  When do you think errors get recorded more often?  On groundballs, not flyballs.  Errors therefore will happen to Webb BECAUSE he’s a GB pitcher.  They are unavoidable because of his style of pitching.

Brandon Webb: 557 runs allowed (78 “unearned”)
Johan Santana: 552 runs allowed (45 “unearned”)

(Since 2002 for Santana, Webb’s career started in 2003.  Santana gets an extra year, but I just want to focus on runs allowed, not on who’s better.)
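
To put those two lines on the same footing, here is a small sketch that expresses the unearned runs as a share of all runs allowed, using only the totals quoted above. The helper name is made up.

```python
def unearned_share(runs_allowed, unearned):
    """Fraction of a pitcher's runs allowed that were scored as unearned."""
    return unearned / runs_allowed

# Totals as quoted in the comment above.
for name, runs, unearned in [("Brandon Webb", 557, 78), ("Johan Santana", 552, 45)]:
    print(f"{name}: {unearned_share(runs, unearned):.1%} of runs allowed were unearned")
```
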

I’ll bet you if you look at errors made per GB and errors made per FB, both pitchers would probably be pretty close.  Heck, they could even be less for Webb for all I know, as yet another example of Simpson’s Paradox.

The idea is that “unearned” runs is the causative result of bad fielding, when in fact, groundballs may be the causative agent.

But GB is a skill!

And just the whole idea of “reconstructing” an inning is ridiculous to me.  First, because we are only reconstructing some of the bad fielding plays, and not all of them.  Secondly, we don’t reconstruct good fielding plays.  When Gutierrez makes a diving play that basically turns a 3-out inning into a 2-out inning, why don’t we continue the reconstruction by adding in whatever Jarrod Washburn did in the following inning?  He didn’t “earn” the three outs in the Gutierrez inning did he?

It’s a big joke.  The worst part is that ERA is so ubiquitous, far more than batting average, that we’re almost stuck with it.


#20    Ken      (see all posts) 2010/03/22 (Mon) @ 11:37

One of the worst things with ERA is that, after the two-out error, the pitcher can give up an unlimited number of ‘unearned’ runs as long as he does not give up a home run, which then makes any following runs earned.

It might make sense if only runs that scored on the error and the runner that got on base due to the error were included.

P.S., Welcome to the Jays!


#21    Tangotiger      (see all posts) 2010/03/22 (Mon) @ 12:10

Ken, thanks.

However, I don’t think you are correct about the HR resetting the earned clock in that particular scenario.  I think the two-out error makes everything after that unearned, regardless of what happens afterwards.  That’s because you reconstruct the inning as if the error is an out, and so, you are now at 3 outs.

I’d be glad to be corrected, and, in the end, it doesn’t matter, because of the silly premise to begin with.


#22          (see all posts) 2010/03/22 (Mon) @ 12:36

Ken, after there were supposed to be 3 outs in an inning, all runs are currently considered unearned - even HR.

I have never understood or liked ERA as a stat. Perhaps it makes sense to adjust for an actual bad play the pitcher had no hand in but how about the 4 hits ending in a HR he gives up after that? Aren’t those all his?

I always thought that - at the least - ERA should be adjusted to only call unearned those runs which got on via error and those which scored on the error when it was the third out of the inning. So 2 runs scored on an error that was the 3rd out and the guy who reached on the error are UER but not the next 4 guys who got hits and all scored.

Tango - fascinating stuff (with a nod to MGL for the assist and lively argument). I have always been fascinated by the batter-pitcher dynamic, especially of the concept of who is in control of the AB.

Since a hitter can hit the pitcher’s best pitch placed exactly where he wanted it and conversely miss the pitcher’s worst pitch set right down the middle, I’d argue control over an outcome in that situation moves around quite a bit.

For example the pitcher is in control 0-2 count and decides to throw a slider low and away but it doesn’t break and hangs over the plate, leaving the hitter in control. But the hitter fails to make solid connection and makes an out - I’d say that even if that is a strikeout, it was still the batter in control and failing.

However, if the pitcher had always come inside and the hitter was looking there and unprepared for the hanger away, perhaps the pitcher was in more control than we think - even if the result is a ball hit hard to the SS for an out.

Whether it can be measured or not and whether BABIP is the best way we can try and account or not, the AB one on one will always remain the most interesting dynamic in the world of sports to me.


#23    Sky      (see all posts) 2010/03/22 (Mon) @ 12:39

I “love” how runs can be earned for a pitcher but unearned for a team.


#24          (see all posts) 2010/03/22 (Mon) @ 17:52

Tom:

Interesting article.  I, however, am skeptical that hits allowed (including home runs) have such a relatively small impact on runs allowed.

To that end, what does this analysis suggest about the relative impact of defense (or lack thereof) on runs allowed?

Thanks,

Brad


#25    Ken      (see all posts) 2010/03/22 (Mon) @ 22:05

Tango, you’re right.

I looked it up and it seems that the home run rule for unearned runs does not exist.  I remember that from somewhere so maybe it used to exist and was changed, or maybe someone was pulling my leg when I was a lot younger.

Still, even without that, a pitcher can give up a lot of ‘unearned’ runs because of one puny little error such as a dropped foul ball.


#26          (see all posts) 2010/03/22 (Mon) @ 23:02

Tom I remember learning about that 2-out error nonsense during the ‘82 world series when I was a kid, and my dad being unable to explain the logic. This is easiest expressed with the HR example - hit but not run - but your writing here really fleshes out how ER is a fallacy overall, in the sense that it was designed to be better, more insightful than recording actual outcomes then breaking them down in specific detail. OK as a quick indicator or shortcut but as a pitcher’s lead stat? ... quite arrogant when you think about it. Thanks for articulating that with your usual forceful clarity!
Good incentive to keep Mr Morrow’s fastball on the rise ;)


