
THE BOOK--Playing The Percentages In Baseball


Thursday, July 28, 2011

Tango’s Lab: Deconstructing FIP

By Tangotiger, 11:22 AM

Bill James recently wrote an article called “Abe Lincoln Scores”, where he focused on 4 scores (BB+HB, SO, HR, BIP).  He set the score for BIP to a “1”, and floated the other numbers around that.  SO was 0, HR was 4, BB+HB was 2.  (HR is undervalued in his metric.)

At this point, you should be thinking two things:
a. wOBA
b. FIP

The wOBA equation is this:
0.0: SO, other outs
0.7: BB, HB
0.9: 1B, ROE
1.3: 2B, 3B
2.0: HR

What James did was to focus just on those FIP things.  So, we can come up with a FIP equation based on wOBA fairly easily:
wOBAfip = (0*SO + 0.7*BB + 2.0*HR + something*BIP) / PA

So, this made me think.  Whereas in the FIP equation, the “3.2” is a constant for all pitchers, the “something*BIP/PA” is specific for each pitcher.  That is, his wOBA will be affected based on the percentage of his PA that are BIP.  To take an extreme view, if 100% of his PA are BIP, his FIP will equal 3.20. The wOBA for such a pitcher will be 0.300. 
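To make the extreme case concrete, here is a minimal Python sketch of both metrics. The classic FIP weights (13, 3, -2, plus 3.2) are the ones quoted in this post. The BIP weight of 0.3 is not given explicitly; it is inferred from the ".300 wOBA" remark above, so treat it as an assumption. The function names and the stat lines are made up for illustration.

```python
def classic_fip(hr, bb, so, ip, constant=3.2):
    """Classic FIP with the 13 / 3 / -2 weights and a fixed constant."""
    return (13 * hr + 3 * bb - 2 * so) / ip + constant


def woba_fip(so, bb, hr, bip, bip_weight=0.3):
    """wOBA restricted to the FIP events.

    bip_weight stands in for the "something" in the equation above.
    The 0.3 default is an assumption inferred from the all-BIP example.
    """
    pa = so + bb + hr + bip
    return (0.0 * so + 0.7 * bb + 2.0 * hr + bip_weight * bip) / pa


# Hypothetical pitcher whose every plate appearance is a ball in play.
all_bip_fip = classic_fip(hr=0, bb=0, so=0, ip=200)
all_bip_woba = woba_fip(so=0, bb=0, hr=0, bip=800)
print(all_bip_fip, all_bip_woba)
```

The point of the sketch is that the 3.2 in FIP is the same for everyone, while the BIP term in wOBAfip shrinks or grows with each pitcher's share of balls in play.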

Is a .300 wOBA actually a 3.20 ERA?  Not exactly.  I mean, it’s pretty close.  A .300 wOBA is more like a 3.30 ERA.  But, still, there’s a bit of bias to account for.

The other thing is if the 13, 3, -2 weights are correct.  FIP complicates matters by having IP, not PA, as its denominator.  Since we are trying to remove the fielders from the equation, the existence of IP implicitly includes them.

***

The linear weights run values are these for the 4 scores:
Runs above average =
-.28 SO
-.03 BIP
+.32 BB
+1.40 HR

To convert to runs, we add +.12 runs (per PA).  So, we get:
Total Runs =
-.16 SO
+.09 BIP
+.44 BB
+1.52 HR

Applied to a game’s worth of events, this gives us runs scored per game.

Since FIP likes to keep the BIP “fixed”, then we remove .09 runs per PA from each event, and spin it off into its own.  Now we have:
Total Runs =
-.25 * SO
+.00 * BIP
+.35 * BB
+1.43 * HR

+.09 * PA

Since there are 38.5 PA per game, we get:
-.25 * SO
+.00 * BIP
+.35 * BB
+1.43 * HR
+.09 * 38.5

Note: there are an average of 38.5 PA per game.  Great pitchers see fewer batters.  Hence, the reason we have a bias here.

With 9 IP per game, we get:
-.25*9 * SO/IP
+.00*9 * BIP/IP
+.35*9 * BB/IP
+1.43*9 * HR/IP
+.09 * 38.5

Which is:
(
-2.25 * SO
+3.15 * BB
+12.9 * HR
) / IP
+ 3.47

Since this is on a runs scale, and not an earned runs scale, we can multiply everything above by 0.923 to get the ERA scale:
(
-2.1 * SO
+2.9 * BB
+11.9 * HR
) / IP
+ 3.20

Hence, we see where the -2, +3, +13, 3.20 figures from FIP come from.  Based on this deconstruction, we see that the HR value in FIP may be too high, that I should be using 12, not 13.
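The steps above can be sketched in a few lines of Python. This is only a restatement of the arithmetic in this post, not an official FIP implementation; the function name `derive_fip` and the dictionary layout are my own choices for illustration.

```python
ERA_SCALE = 0.923  # runs -> earned runs, as quoted above


def derive_fip(raa, runs_per_pa, pa_per_game, ip_per_game=9.0):
    """Turn runs-above-average weights into FIP-style coefficients.

    raa: dict with keys "SO", "BIP", "BB", "HR" (runs above average).
    Returns (coefficients per event over IP, constant), on the ERA scale.
    """
    # Step 1: add runs per PA to get total runs per event.
    total = {event: value + runs_per_pa for event, value in raa.items()}
    # Step 2: pull the BIP value out of every event; it becomes a per-PA term.
    bip = total["BIP"]
    per_event = {e: v - bip for e, v in total.items() if e != "BIP"}
    # Step 3: express per 9 IP, then rescale from runs to earned runs.
    coefs = {e: v * ip_per_game * ERA_SCALE for e, v in per_event.items()}
    constant = bip * pa_per_game * ERA_SCALE
    return coefs, constant


average_raa = {"SO": -0.28, "BIP": -0.03, "BB": 0.32, "HR": 1.40}
coefs, constant = derive_fip(average_raa, runs_per_pa=0.12, pa_per_game=38.5)
print({e: round(c, 1) for e, c in coefs.items()}, round(constant, 2))
```

With the average-pitcher inputs, the rounded results land on the -2.1, +2.9, +11.9 and 3.20 shown above.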

***

However, remember that the run value of a HR is fairly static at +1.40 runs, while the run value of the walk moves with the run environment.  As the run environment goes down, so does the run value of the walk.  So, relatively speaking, the HR value compared to the walk increases as the run environment goes down, and decreases as the run environment goes up.

Indeed, if the run value of the walk is +.30, then the FIP component for HR becomes 13.  If the run value of the walk is +.33, then the FIP component for the HR becomes 12.
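One reading that reproduces those two figures: hold the walk coefficient at 3 and scale the HR coefficient by the ratio of the two net run values. That interpretation is my assumption, as is the function name; the +.12 and .09 adjustments are the ones from the derivation above.

```python
def hr_coefficient(walk_raa, hr_raa=1.40, runs_per_pa=0.12,
                   bip_total=0.09, walk_coefficient=3.0):
    """HR coefficient implied when the walk coefficient is pinned at 3.

    Assumes each event's net value is (runs above average + runs per PA
    - BIP value), as in the derivation earlier in the post.
    """
    walk_net = walk_raa + runs_per_pa - bip_total
    hr_net = hr_raa + runs_per_pa - bip_total
    return walk_coefficient * hr_net / walk_net


for walk_raa in (0.30, 0.33):
    print(walk_raa, round(hr_coefficient(walk_raa), 1))
```

A walk worth +.30 gives an HR coefficient of about 13, and a walk worth +.33 gives about 12, which matches the two cases above.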

***

We of course have another bias, and that is that runs are not linear when dealing with pitchers.  But we’ve taken a decidedly linear approach.  So, there are two things that conspire to bias a great pitcher’s FIP score too high:

1. We give him 38.5 batters per 9IP, when it should be a bit lower.
2. Each event has less impact the fewer runners on base.

However, one thing that shifts the balance is the use of IP, not PA, in the denominator.  So, it kind of sets the balance back the other way.

***

What am I saying?  I don’t know.  Maybe change the “13” to “12” for HR?  Maybe try to focus on PA and not IP?  Maybe look at percentage of PA that are BIP?  Maybe have a FIP equation that is better tuned to the run environment than simply floating the 3.2?  I don’t know yet.

That’s why this is a lab thread.


#1    Mike Rogers      (see all posts) 2011/07/28 (Thu) @ 13:14

I really enjoyed this. I was wondering the other day how the run values of things have changed with the lower run environment. So, a more ‘flexible’ FIP equation would be something I’d like to see.

I wonder if Colin would implement a ‘flexible’ version at B-Pro.


#2    Pierre      (see all posts) 2011/07/28 (Thu) @ 13:15

Floating the 3.2 always struck me as “wrong” in that what’s actually changing must be the run values of k/bb/hr.  I assume you did it this way because re-calculating the coefficients is a lot of work and wouldn’t really make much difference.  Otherwise, yearly FIP equations would seem like the way to go.  I doubt anyone would freak out because they have to change 4 variables instead of 1 on their spreadsheet.  Although there’s probably an issue around the sample size required to calculate the linear weights that would require you to use some sort of rolling average, now that I think about it.

Same with PA (i.e. why not?).  It depends where you come down on simplicity v accuracy, I guess.


#3    Mike Rogers      (see all posts) 2011/07/28 (Thu) @ 13:29

It wouldn’t be that difficult. Matt Klaassen has published year-by-year wOBA weights going back to 1871 based on a script that I think Tom published. The same could be done with FIP, no?


#4    Tangotiger      (see all posts) 2011/07/28 (Thu) @ 13:36

The year-by-year wOBA are on the right-side here:

http://www.tangotiger.net/bdb/lwts_woba_for_bdb.txt

I suppose that year-by-year FIP can be handled like year-by-year wOBA.  When I just want something quick, I rely on this for wOBA:
0.0: SO, other outs
0.7: BB, HB
0.9: 1B, ROE
1.3: 2B, 3B
2.0: HR

Keeps me happy and sane.  When it becomes important to calibrate and get more precision, then I have more work to do (hence, the above link).

So, yeah, I suppose I could sell FIP the same way.  There’s “Classic FIP” for those who just want something quick.  And after all, FIP itself was presented as a quick alternative to DIPS anyway.

A more robust FIP that would be more useful analytically might make sense too.

Bill James did “Tech Runs Created”, the more technical versions of Runs Created.  I suppose that’s the model here.


#5    Tangotiger      (see all posts) 2011/07/28 (Thu) @ 14:07

Now, what would a FIP equation look like for Mariano Rivera, or others in a very low run environment?

First we start here:
Runs above average =
-.17 SO
-.02 BIP
+.21 BB
+1.40 HR

That’s right around what Mo would give you.  Now we need runs per PA.  Figure about 36 batters per 9IP, and 2.5 runs per game, so .07 runs per PA.  So, continuing, we get:
Total Runs =
-.10 SO
+.05 BIP
+.28 BB
+1.47 HR

So we move .05 runs out, which gives us:
Total Runs =
-.15 * SO
+.00 * BIP
+.23 * BB
+1.42 * HR

+.05 * PA

We already said there are 36 PA, so:
Total Runs =
-.15 * SO
+.00 * BIP
+.23 * BB
+1.42 * HR
+1.80

With 9 IP per game, we get
Total Runs =
-.15*9 * SO/IP
+.00*9 * BIP/IP
+.23*9 * BB/IP
+1.42*9 * HR/IP
+1.80

Which is:
(
-1.35 * SO
+2.07 * BB
+12.8 * HR
) / IP
+ 1.80

And putting it on an ERA scale:
(
-1.2 * SO
+1.9 * BB
+11.8 * HR
) / IP
+ 1.66

Ok, so now we see an interesting pattern.  The coefficient for HR stays essentially constant, while the other three float, and are dependent on the run environment.
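A compact way to see why: when you add runs per PA to every event and then subtract the BIP value, the runs-per-PA term cancels. Each coefficient reduces to 9 * 0.923 * (event value minus BIP value), both in runs above average, and only the constant depends on runs per PA and batters faced. The Python sketch below just restates the two derivations in this thread in that form; the function names are mine.

```python
ERA_SCALE = 0.923   # runs -> earned runs
IP_PER_GAME = 9.0


def coefficient(event_raa, bip_raa):
    """Per-IP coefficient: the runs-per-PA shift cancels out."""
    return IP_PER_GAME * ERA_SCALE * (event_raa - bip_raa)


def constant_term(bip_raa, runs_per_pa, pa_per_game):
    """The constant is the only piece that uses runs per PA and PA per game."""
    return ERA_SCALE * pa_per_game * (bip_raa + runs_per_pa)


# Inputs copied from the average-pitcher and Rivera examples in this thread.
environments = {
    "average": dict(SO=-0.28, BIP=-0.03, BB=0.32, HR=1.40,
                    runs_per_pa=0.12, pa_per_game=38.5),
    "rivera": dict(SO=-0.17, BIP=-0.02, BB=0.21, HR=1.40,
                   runs_per_pa=0.07, pa_per_game=36.0),
}

results = {}
for name, env in environments.items():
    results[name] = {
        "SO": coefficient(env["SO"], env["BIP"]),
        "BB": coefficient(env["BB"], env["BIP"]),
        "HR": coefficient(env["HR"], env["BIP"]),
        "constant": constant_term(env["BIP"], env["runs_per_pa"],
                                  env["pa_per_game"]),
    }
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

The HR coefficient comes out near 12 in both environments because the HR run value is pinned at +1.40 and the BIP value barely moves, while the SO and BB run values themselves shrink a lot in the low run environment.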

It should be easy enough, if a bit long in time, to come up with a run-environment based FIP after all is said and done.


#6    Tangotiger      (see all posts) 2011/07/28 (Thu) @ 17:12

By the way, the above two examples (average pitcher and Rivera) are a perfect example of where regression fails.

I laid out above a logical process as to how you can determine the coefficients for FIP.  And, as you can see, they are not only much different, but the changes are not even across the board (the HR is static).

Regression would never have picked this up.  Hence, the reason that regression is a first step, not a last step.


#7    Pierre      (see all posts) 2011/07/28 (Thu) @ 19:56

the logic is circular, isn’t it?  Mariano’s great, therefore he gets his own FIP equation that’s pretty certain to confirm his greatness.  Maybe I’m confused or not following…


#8    Tangotiger      (see all posts) 2011/07/28 (Thu) @ 21:11

You start with BaseRuns or a Markov chain.  That confirms Mariano’s greatness (his runs allowed).  Then, you find the marginal impact of adding or removing a single, walk, HR, K, out.  That gives you the run value of each event.

Then you try to turn that into a FIP equation.

Now, if you do all that, you don’t need to figure out his FIP because you did all the work already.
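For anyone who wants to try the marginal-impact step, here is a rough Python sketch using the commonly cited basic version of BaseRuns (A*B/(B+C) + D). The choice of that particular BaseRuns version is an assumption, not necessarily the one used for the numbers in this thread, and both per-game stat lines are made up for illustration.

```python
def base_runs(singles, doubles, triples, hr, bb, outs):
    """Basic BaseRuns: A * B / (B + C) + D."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    a = hits + bb - hr                                          # baserunners
    b = (1.4 * total_bases - 0.6 * hits - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = outs                                                    # batting outs (AB - H)
    d = hr                                                      # automatic runs
    return a * b / (b + c) + d


def marginal_value(line, event):
    """Change in BaseRuns from adding one more of the given event."""
    bumped = dict(line)
    bumped[event] += 1
    return base_runs(**bumped) - base_runs(**line)


# Hypothetical per-game lines: a typical environment and a very low one.
average_line = dict(singles=6.0, doubles=1.8, triples=0.2, hr=1.0, bb=3.3, outs=25.5)
low_line = dict(singles=5.0, doubles=1.2, triples=0.1, hr=0.4, bb=1.8, outs=26.0)

for name, line in (("average", average_line), ("low", low_line)):
    print(name,
          "R/G", round(base_runs(**line), 2),
          "BB", round(marginal_value(line, "bb"), 2),
          "HR", round(marginal_value(line, "hr"), 2))
```

With these made-up lines, the marginal walk loses a sizable share of its value in the low run environment while the marginal HR moves only a little, which is the pattern described earlier in the post.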


#9    Pierre      (see all posts) 2011/07/29 (Fri) @ 09:46

Ah.  It’s FIP’.  Getting at the non-linearity of the coefficients. 

Analogous to the squared terms in a regression, isn’t it?  For example, you’d find out that HR^2 had no explanatory power, and you’d throw that term out.


#10    Tangotiger      (see all posts) 2011/07/29 (Fri) @ 10:00

Basically, the coefficients are going to be a multiple of runs per game.

As a for instance, instead of “3.2”, maybe it’s “runs per game * 0.67”.

Maybe instead of “3” for the walk, it’s:
(runs per game + 1) * 0.5

And the HR would stick at 12.
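As a quick Python sketch of that idea, using only the "maybe" multipliers floated here (0.67 for the constant, (R/G + 1) * 0.5 for the walk, HR fixed at 12). No strikeout rule is proposed in this comment, so the sketch leaves it out rather than inventing one. The function name is hypothetical.

```python
def run_environment_fip_terms(runs_per_game):
    """Tentative FIP terms as a function of a pitcher's runs per game.

    The multipliers are the illustrative guesses from this comment,
    not fitted values. The SO coefficient is not specified there.
    """
    return {
        "constant": runs_per_game * 0.67,
        "BB": (runs_per_game + 1) * 0.5,
        "HR": 12.0,
    }


# Runs per game implied by the two examples in this thread:
# 0.12 * 38.5 for the average pitcher, 2.5 for the Rivera case.
for rpg in (0.12 * 38.5, 2.5):
    terms = run_environment_fip_terms(rpg)
    print(round(rpg, 2), {k: round(v, 2) for k, v in terms.items()})
```

These land in the neighborhood of the derived values (3.20 and 2.9 for the average pitcher, 1.66 and 1.9 for Rivera) without matching them exactly, which fits the "as a for instance" framing.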

The problem of course is that you need runs per game.  You can’t use the league number, because each pitcher is his own universe.  And since we don’t know what a pitcher’s runs per game actually is, then you’ve got circular reasoning. 

Hence, the need to establish his runs per game another way (simulator, BaseRuns, or Markov). 

But, if you go to the trouble of doing all that, then you don’t need FIP to begin with!

Sooooo… the idea is to try to infer the runs per game looking only at the three components.  And for that, we need to figure out some sort of shortcut method.


#11    dave smyth      (see all posts) 2011/08/10 (Wed) @ 08:54

Tango, have you ever checked to see if 13/3/-2.5 or even 13/3/-3 correlates better with RA than does 13/3/-2 ?


#12    Tangotiger      (see all posts) 2011/08/10 (Wed) @ 09:32

I’ve done several of those tests.  It all depends on the sample of my pitchers.  As noted, since the coefficients depend on the run environment, then how I select my pitchers will determine the coefficients.  I ran one a few weeks ago where the coefficient of the HR was 11.


#13    dave smyth      (see all posts) 2011/08/10 (Wed) @ 10:14

I guess I meant all pitchers since the offensive era began in 1993.


#14    Tangotiger      (see all posts) 2011/08/10 (Wed) @ 11:57

Right, but even so.  Don’t forget that a pitcher provides his own run environment as well.  If I were to separate the 1993-2010 pitchers between those with high BaseRuns and low BaseRuns, I’m going to get two different FIP equations (for reasons elaborated at the top of this thread).

That’s why I say that it all depends on which pitchers are in my sample.  It’s not helpful to say “average”, and then we end up talking about Clemens+RJ+Maddux+Pedro.


#15    Tangotiger      (see all posts) 2012/03/02 (Fri) @ 14:18

Bumping for fun.

