THE BOOK--Playing The Percentages In Baseball

Wednesday, February 17, 2010

Tango’s Lab: Batted Ball FIP

By Tangotiger, 05:59 PM

Just fooling around here.  I ran a regression of all plate outcomes (BB, SO, GB, outfield FB, infield FB, line drive, bunts) per PA against ER per PA.  And then I ran a regression of the same plate outcomes against Outs per PA.  I get these coefficients:

   ER    outs   PlateOutcomes
 0.35    0.08   rBB
-0.12    0.99   rSO
 0.04    0.80   rGB
 0.39    0.39   rLD
 0.19    0.70   rOFFB
-0.26    1.21   rIFFB
-0.18    0.88   rBunts

And then I applied that to a pitcher’s career total (2002-09, min 1500 PA) to get their estimated ER, estimated Outs, and calculated ERA as ER/Outs*27.  The correlation between this estimate and ERA is r=.81.  For SIERA, it was r=.77.  FIP was r=.86 (but that’s because it uses HR). 
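
A minimal sketch of the calculation described above, assuming the regressions have no intercept (so the per-PA coefficients can be applied directly to career counts). The dictionary keys and the function name are illustrative, not from the original post.

```python
# Coefficients copied from the table above: (ER per event, outs per event).
COEFFS = {
    "BB":   (0.35, 0.08),
    "SO":   (-0.12, 0.99),
    "GB":   (0.04, 0.80),
    "LD":   (0.39, 0.39),
    "OFFB": (0.19, 0.70),
    "IFFB": (-0.26, 1.21),
    "Bunt": (-0.18, 0.88),
}


def estimated_era(counts):
    """Estimate ERA from a dict of plate-outcome counts.

    Assumes no intercept term, so estimated ER and outs are plain
    weighted sums of the counts. ERA is then ER / Outs * 27.
    """
    est_er = sum(COEFFS[event][0] * n for event, n in counts.items())
    est_outs = sum(COEFFS[event][1] * n for event, n in counts.items())
    return est_er / est_outs * 27
```

Feeding in a pitcher's counts for each of the seven outcomes returns the estimate on the usual ERA scale.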

Anyway, just something quick I did.


#1    Sean      (see all posts) 2010/02/17 (Wed) @ 18:20

How different is this from when you did this a few months ago?

“If you want to introduce GB and FB, you simply end up with:
x - 12*(K-BB)/PA - 3*(GB-FB)/PA

Set “x” so that it matches the league-year in question.

This is the batted-ball version of FIP, which really, will probably be pretty close to xFIP.”
http://www.insidethebook.com/ee/index.php/site/comments/the_best_of_this_week_at_bpro/#22


#2          (see all posts) 2010/02/17 (Wed) @ 18:32

Tango, on a similar note (well not so similar but on a FIP note!) have you considered looking at three true outcomes for hitters like you have with pitchers and FIP?

The coefficients would need to change but it could be interesting. Obviously the concept behind FIP is clear, that pitchers only have total control over these three elements.

I’m also thinking of a different type of projection system, not an in-season qualifier (of course, we don’t need one; we have wOBA/WPA). What elements do we know batters to have significant control over year after year and correlate highly with wOBA? We could use those to project future performance.

Just a thought that I’ve been having lately.


#3    Nick Steiner      (see all posts) 2010/02/17 (Wed) @ 18:35

Isn’t this the exact same thing as tRA, except instead of empirical Linear Weights, you are using a regression?

http://www.statcorner.com/tRAabout.html


#4    Nick Steiner      (see all posts) 2010/02/17 (Wed) @ 18:43

JD/2

Projection systems like CHONE and ZIPS regress each stat differently.  So K’s will be regressed much less than singles, BB’s much less than doubles, etc.  That makes sense also.  Players don’t have 100% control over any stats, nor 0% control over others.  They are all somewhere in between.  The only reason we use the three true outcomes in FIP is to try to eliminate defense.  If you were striving for a skill metric that neutralized ALL luck, not just defensive luck, you would regress each stat appropriately.

Incidentally, that’s why I dislike xFIP (or any metric that substitutes fly ball percentage for home runs).  By doing that, you are changing it from a DIPS metric to a skill metric; however, you’re not doing it all the way.  K’s, BB’s, and batted ball rates all have a reasonable amount of luck in them too, and HR/FB has a lot of skill as well.


#5    dan      (see all posts) 2010/02/17 (Wed) @ 18:59

Nick,

I don’t understand your reason why you dislike xFIP. Yea, it’s no longer DIPS, and HR/FB is a skill, but it’s still one of the better pitching stats out there (in my opinion). The strikeouts actually happened, the walks actually happened, and the fly balls actually happened (most of them, at least). Of course, it depends what exactly you’re trying to measure...but if you’re trying to determine how good a pitcher is/was, you could do a lot worse than xFIP.


#6    dan      (see all posts) 2010/02/17 (Wed) @ 19:01

”...understand THE reason...”


#7          (see all posts) 2010/02/17 (Wed) @ 19:23

Nick, I understand the concept behind the DIPS metrics. Defense affects hitters just like it affects pitchers, so I was saying it would be cool to have a DIHs metric. Sorry if that was lost in translation.


#8    Tangotiger      (see all posts) 2010/02/17 (Wed) @ 20:42

Sean: I forgot about that thread.  It’s not going to be much different.

Ok, I just ran my regression, and I get an r=.80 with this equation:

ERA = 11*bigs + 3*smalls + 4.2

bigs = (BB+LD) - (SO+iFB)
smalls = (oFB - GB)
4.2 = whatever you need to align to league

That’s it.  That’s my semi-official battedBall FIP (bbFIP).

A line drive is like a walk, an infield fly is like a strikeout, and the gap between an outfly and a groundball is about one-fourth the gap between BB and SO.

Any questions, comments?


#9    Tangotiger      (see all posts) 2010/02/17 (Wed) @ 20:52

Oh, BB is walks minus IBB plus hit batters.

And this should be:
bigs = [(BB+LD) - (SO+iFB)] / PA
smalls = (oFB - GB) / PA
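
Putting the two posts together, bbFIP can be sketched as a single function. The argument names are my own; the default constant of 4.2 is the value quoted above and would need adjusting to align with the league in question.

```python
def bbfip(bb, ibb, hbp, so, ld, ifb, ofb, gb, pa, constant=4.2):
    """Batted-ball FIP as described in the thread.

    'BB' here means walks minus intentional walks plus hit batters.
    bigs and smalls are both expressed per plate appearance.
    """
    walks = bb - ibb + hbp
    bigs = ((walks + ld) - (so + ifb)) / pa
    smalls = (ofb - gb) / pa
    return 11 * bigs + 3 * smalls + constant
```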


#10    Brian Cartwright      (see all posts) 2010/02/17 (Wed) @ 21:22

Using SIERA’s set of pitchers, rmse weighted by IP, comparing stats in year0 to ERA in year1

1.010 SIERA
1.040 qERA
1.076 FIP
1.241 bbFIP


#11    Nick Steiner      (see all posts) 2010/02/17 (Wed) @ 21:46

Dan,

Well, unless pitchers literally have zero control over how many fly balls leave the yard, a home run is not the same as a fly ball.  It’s just a home run. 

xFIP is a good little predictive stat because it eliminates the three things pitchers have the least control over: timing, BABIP and HR/FB.  However, because it’s no longer DIPS, it gives the impression that those are the only 3 stats with luck in them.  Does that make sense to you, cause I don’t think I’m explaining it well? 

I don’t have a problem with DIPS regressing Linear Weights on BIP 100% and keeping everything, because the point of the stat is to explicitly eliminate defense.  And sacrificing timing is a necessary result of that.  However, once you add HR/FB into the mix, you are changing it from a defense independent stat to some sort of weird skill stat, that regresses HR/FB, BABIP and timing 100%, while regressing everything else 0%.

I would like to see a stat that regresses each component properly, using Pizza Cutter’s values or something similar to that in place of xFIP.  xFIP is a powerful and simple stat, but I just don’t like the fundamentals behind it.  Ditto with really any other stat that substitutes FB% for home runs.


#12    Zach      (see all posts) 2010/02/17 (Wed) @ 23:02

Nick/11: “I would like to see a stat that regresses each component properly”

If you’re going that far, why not just use an updated projection?


#13    Kincaid      (see all posts) 2010/02/17 (Wed) @ 23:04

I don’t think xFIP is necessarily morphing into some kind of skill stat more than other DIPS stats are.  The general idea strikes me as the same as the idea behind neutralizing defense in other DIPS stats.  Regressing BIP value 100% is a simple way to cut defense out of the picture.  Regressing FB value 100% is a simple way to cut park and opposing hitters out of the picture when looking at whether or not a FB becomes a home run.  Not entirely properly, of course, and it’s no substitute for regressing everything properly if that’s what you want to do (I agree that I’d like to see a stat like that), but xFIP is like other DIPS stats in that that’s not what it purports to do.  It just goes a step further in isolating the results of the pitcher from outside elements.

That’s not explicitly defense-independent, but I still think it is in the same spirit.  BIP value was regressed to neutralize defense because it appears that the result of a BIP is heavily dependent on factors outside the pitcher’s influence.  Similarly, FB value can be regressed because the outcome of a FB (HR or BIP) also appears to be heavily dependent on factors outside the pitcher’s control (like what park he is in or whether the batter he is facing has a good AB or locks in on a pitch-independent of how good the pitch itself is-or whatever else).  I think that has a place in the spectrum of DIPS stats and doesn’t really go beyond the scope of what DIPS purports to do.


#14    Kincaid      (see all posts) 2010/02/17 (Wed) @ 23:10

On a similar note, isn’t tRA* pretty close to that type of stat (where it gives you the various components regressed based on sample size, not just 100% or 0%)?  Unless you want something that looks at careers as larger sample sizes than season and regresses differently based on past results or something like that, anyway.  I’ve looked at it as that general idea in the past, at least, and I like to look at tRA* to supplement the basic DIPS views out there.


#15    Tangotiger      (see all posts) 2010/02/17 (Wed) @ 23:49

Brian: can you send me your data?  That simply seems way too far off.

Otherwise, the suggestion is that the LD and iFB terms simply should not be there.


#16    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 00:20

Brian, I think you made a coding error somewhere.

I did a year-to-year test this way:
- at least 250 PA in back-to-back years
- unweighted difference in ERA
- figure the std dev of the differences

My results:
1.05 bbFIP
1.05 SIERA
1.11 FIP

Even if I weighted as you did, there’s simply no way that I’d get the huge difference that you are getting.

So, I’d say that bbFIP is a worthy addition here.  Not to mention that it’s in the same spirit as FIP (linear and simple coefficients).
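
The year-to-year test listed above can be sketched as follows. The row layout is hypothetical, and using the population standard deviation (rather than the sample version) is an assumption, since the post does not say which was used.

```python
from statistics import pstdev

MIN_PA = 250  # at least 250 PA in each of the back-to-back years


def year_to_year_spread(rows, min_pa=MIN_PA):
    """Std dev of (next-year ERA minus this-year estimate), unweighted.

    rows: iterable of (pa_year1, pa_year2, estimate_year1, era_year2).
    Pitchers below the PA minimum in either year are dropped.
    """
    diffs = [
        era_y2 - est_y1
        for pa_y1, pa_y2, est_y1, era_y2 in rows
        if pa_y1 >= min_pa and pa_y2 >= min_pa
    ]
    return pstdev(diffs)
```

Running the same rows through each estimator (bbFIP, SIERA, FIP) gives one spread per metric, which is what the comparison above lines up.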


#17    Rally      (see all posts) 2010/02/18 (Thu) @ 00:34

I like bbFIP.  Simple enough that I could memorize it and plug numbers into a calculator (I’d never have a chance with the coefficients of Siera).  And if I want anything more advanced I might as well update the projections.


#18    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 00:48

You want a worthy challenger?  How about plain ole szERA?

ERA = 11*(BB-SO)/PA + constant

RMSE?  1.04

That’s better than anything out there by a hair.


#19    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 01:02

As a sanity check, I included also a straight ERA-to-ERA RMSE, and it came in as 1.27.

And when I did a correlation of the individual components of Year1 to ERA of Year2, regardless of how ridiculous the weights are, the RMSE was 1.01.

So, that’s your range: the best possible is 1.01, and the worst possible is 1.27.

bbFIP, szERA, SIERA are at 1.04-1.05.  FIP is at 1.10.  Basically, there is just not that much room for improvement here.


#20    Nick Steiner      (see all posts) 2010/02/18 (Thu) @ 02:05

Zach 12/

Because we’re not trying to predict future performance, we’re trying to isolate skill in past performance.  Regression is one easy way to do that. 

Kincaid

Okay, that’s a fair enough defense of xFIP.  As I said, it’s a nifty little stat that can be very useful; however, I just think it might be giving people the wrong idea about luck and the role it plays in stats.  You cannot imagine how many times I hear something like this:

“in 40 innings last year, his xFIP was 3.50, meaning with normal luck, he would have been a 3.50 ERA pitcher”

If people are getting the idea that HR/FB is almost 0% skill, and strikeouts and walks are 100% skill, that’s a direct consequence of using xFIP, I think.

And I have looked into tRA*.  I’ve even emailed Graham asking for clarification on the regression values he uses.  He said they were self derived, but similar to Pizza’s.  However, I still don’t think it’s exactly what I want, because, IIRC, it assumes 0% control over the outcomes of batted balls.  Maybe I’m just being pedantic.


#21    Brian Cartwright      (see all posts) 2010/02/18 (Thu) @ 11:21

I changed the bbFIP constant to 4.8 to zero the mean error. These are the rmse of ERA estimates for pitchers with 40+ IP, weighted by IP. Errors become smaller with larger IP - about 0.75 at 200+ IP in a season

Year    FIP  bbFIP   qERA bsrERA  SIERA
   0  0.743  0.840  0.898  0.924  0.883
   1  1.076  1.026  1.040  1.037  1.010
   2  1.136  1.070  1.090  1.087  1.055
   3  1.153  1.098  1.122  1.116  1.085
Mean -0.017 -0.021 +0.173 +0.180 -0.062
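
For anyone wanting to reproduce this kind of table, here is a rough sketch of an IP-weighted rmse and mean error. The function names are hypothetical, and whether the "Mean" row above was itself IP-weighted is an assumption on my part.

```python
from math import sqrt


def weighted_rmse(estimates, actuals, weights):
    """Root mean squared error of estimate vs actual ERA, weighted (e.g. by IP)."""
    total_weight = sum(weights)
    squared = sum(w * (e - a) ** 2 for e, a, w in zip(estimates, actuals, weights))
    return sqrt(squared / total_weight)


def weighted_mean_error(estimates, actuals, weights):
    """Weighted average of (estimate - actual); zero means the metric is centered."""
    total_weight = sum(weights)
    return sum(w * (e - a) for e, a, w in zip(estimates, actuals, weights)) / total_weight
```

Each row of the table would come from pairing year0 estimates with ERA in year0, year1, year2 or year3 for the same pitchers.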


#22    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 12:16

Great work Brian.  Clearly, the batted ball info at Fangraphs and Retrosheet is quite different, since you needed a “4.8” for your constant and I needed “4.2”.

bbFIP does a pretty good job, for a linear construction.  It also doesn’t throw away the LD information, even if it might be better to do so in future years.  It obviously helps it for current year.  That’s really the tradeoff when you look at FIP, bbFIP, and SIERA.

FIP to bbFIP drops the HR, but takes in all the batted ball types, including LD.  SIERA drops the LD and adds interdependent terms.  In each case, you LOSE reliability for same-year data and GAIN it on next-year data.


#23    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 12:19

Brian, can you add this to your test:

ERA = 11*(BB-SO)/PA + constant


#24    Brian Cartwright      (see all posts) 2010/02/18 (Thu) @ 13:40

ERA = 11*(BB-SO)/PA + 5.3 AS Tango
It’s the best! (of those on the list in predicting future performance)

wERA is my formula for converting wOBA allowed to ERA. It is the most accurate for same season, but the least accurate in predicting the future, as it relies on BABIP and other stats that are not as predictive. I will run these same measures on my projections instead of raw stats to let the projections do the predicting and then see how wERA compares for y1.

Year    FIP  bbFIP   qERA bsrERA  SIERA  Tango   wERA
   0  0.743  0.840  0.898  0.924  0.883  0.908  0.694
   1  1.076  1.026  1.040  1.037  1.010  1.003  1.180
   2  1.136  1.070  1.090  1.087  1.055  1.035  1.249
   3  1.153  1.098  1.122  1.116  1.085  1.059  1.273
Mean -0.017 -0.021 +0.173 +0.180 -0.062 -0.021 -0.066


#25    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 14:20

You can call it GuyTango.  Guy was the first one to point out to me that the differential of walks and K per PA would be better than the ratio.  I simply added the best-fit coefficient.

***

I love how you went out to year2, year3 as well.  You are showing that walks and K are really the only thing you need.  Everything else starts to wash out.  Fantastic job.

***

One thing in defense of all the others I am not involved in: are you sure you have them calibrated?

What you should do is make sure that the same-year estimators and ERA are an exact match.  Then, you can do your next-year correlation.
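
One way to read "calibrated" is that each estimator's constant is chosen so its same-year mean matches the same-year mean ERA. A minimal sketch under that assumption, using the walk/strikeout estimator as the example (function names are hypothetical):

```python
def raw_kw(bb, so, pa):
    """Uncalibrated part of the walk/strikeout estimator: 11*(BB-SO)/PA."""
    return 11 * (bb - so) / pa


def calibrated_constant(raw_values, actual_eras, weights=None):
    """Constant that makes the (weighted) mean of raw + constant equal mean ERA.

    weights is optional (e.g. IP or PA); equal weights are used if omitted.
    """
    if weights is None:
        weights = [1.0] * len(raw_values)
    total = sum(weights)
    mean_raw = sum(w * r for w, r in zip(weights, raw_values)) / total
    mean_era = sum(w * e for w, e in zip(weights, actual_eras)) / total
    return mean_era - mean_raw
```

With the constant fixed on same-year data, the next-year comparison is no longer penalizing a metric just for being off-center.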

***

You should write up your findings in an article.


#26    Matt K. (d_f)      (see all posts) 2010/02/18 (Thu) @ 14:38

Brian/#24: Is your wOBA/ERA conversion something you keep to yourself (which, if so, is understandable), or something you could share? I’d be interested to see how you do it. Is it wOBA converted to “absolute” runs created, divided by outs, then put on a 9-inning scale?

Indeed, this discussion is fascinating in general, but it would be helpful if someone could just match each acronym with the formula used; I got a bit lost.

GuyTango is very exciting, I must say. Simplicity is always enjoyable.


#27    Tangotiger      (see all posts) 2010/02/18 (Thu) @ 14:43

Matt: right.  It’s always good to have a minimally-involved metric to serve as the baseline that others should be compared against.  It’s like with The Marcels.

The GuyTango, which I’ve called szERA or kwERA… I like kwERA, would seem to fit that bill.  If someone is constructing a metric like bbFIP or SIERA, etc, it’s good to have such a simple baseline to compare against.


#28    mulkowsky      (see all posts) 2010/02/18 (Thu) @ 15:27

Brian, great work, and that’s amazing about GuyTango. 

BTW, any chance of adding xFIP to your chart?  Thanks!


#29    Brian Cartwright      (see all posts) 2010/02/19 (Fri) @ 21:06
         y0     y1     y2     y3
xFIP  0.859  1.000  1.043  1.074

Lags a little in y0, then is the best in y1+

I agree with Tango’s conclusion that there’s a limit on how well you can do. All these measures can only do about a one run rmse at y1, whether looking at BB&SO or batted balls (or some combination).

I’m working up an article; one of the main points to be explained is describing the past vs predicting the future. Past runs scored is best modeled by including everything as it actually happened. The future is best predicted by looking at the repeatable skills.

wERA (wOBA to ERA) is best at describing the past because it looks at everything that actually occurred, but it’s worst at future prediction. If my projections are good enough, then hopefully a two step process (project wOBA, convert to ERA) will give me a reliable ERA predictor for y1.

Another thing to explain is how the pitching environment is not linear - i.e. home runs don’t cost as much in a low-OBP setting.

The wERA formula uses a quadratic of wOBA. I got it by doing a regression of wOBA’s allowed to ERA’s, and the terms came out to match what we already knew.

(((wOBA-LGwOBA)/1.15+(LgER/LgPA)/LGwOBA)*wOBA*PA) AS ER,

(wOBA-LGwOBA)/1.15 is the standard conversion to runs. If wOBA=LGwOBA the term zeroes out.

Add to that league ER per PA divided by league wOBA, typically .114/.332=.343.

If wOBA=LGwOBA, then (0+.343)*.332=.114, so we have the league average of ER per PA - by definition average wOBA will produce average ERA.

Having (((wOBA-a)/b)+c)*wOBA gives the non-linearity, allowing it to not give ERAs too high for good wOBAs, or too low for bad wOBAs. With sufficient sample size, the ERA rmse is consistent for input wOBAs allowed from the low 200s up through the 400s
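
A small sketch of the wERA calculation, following the worked arithmetic in this post (.114/.332 = .343), i.e. taking the league term as league ER per PA divided by league wOBA. The defaults are the "typical" values quoted here; real use would plug in each league-year's figures, and the function names are illustrative.

```python
def wera_earned_runs(woba, pa, lg_woba=0.332, lg_er_per_pa=0.114):
    """Estimated earned runs from wOBA allowed.

    (wOBA - LGwOBA)/1.15 is the standard runs conversion; the league
    term is league ER per PA divided by league wOBA, so an average
    wOBA returns exactly the league ER per PA.
    """
    run_term = (woba - lg_woba) / 1.15
    league_term = lg_er_per_pa / lg_woba
    return (run_term + league_term) * woba * pa


def wera(woba, pa, ip, lg_woba=0.332, lg_er_per_pa=0.114):
    """Put the estimated earned runs on a per-9-innings scale."""
    return wera_earned_runs(woba, pa, lg_woba, lg_er_per_pa) / ip * 9
```

Because wOBA appears twice in the product, the estimate is quadratic in wOBA, which is where the non-linearity comes from.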


#30    Tangotiger      (see all posts) 2010/02/19 (Fri) @ 21:54

Brian:
R = ( (OBP/(1-OBP)) ^1.5)*14

Proof:
http://www.insidethebook.com/ee/index.php/site/comments/converting_obp_or_woba_to_runs/

You can play around with the “14” to match the league average.  Use wOBA instead of OBP.

It works spectacularly well.  Try it out.  If you want, break up the pitchers like this:
wOBA under .300
.300-.320
.320-.340
.340-.360
.360+

Report the average wOBA in each group, report the ERA, and then report what the above equation says.

Again, remember to change the “14” to calibrate for the whole league.  That 14 might be 13 for ERA or something.
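
Here is a rough sketch of that check. The tuple layout, the PA-weighting of the group's average wOBA, and the function names are my own assumptions; the multiplier defaults to 14 and is meant to be re-fit to the league.

```python
from collections import defaultdict

EDGES = [0.300, 0.320, 0.340, 0.360]
LABELS = ["under .300", ".300-.320", ".320-.340", ".340-.360", ".360+"]


def runs_from_on_base(on_base, scale=14.0, exponent=1.5):
    """R = ((OBP/(1-OBP))^1.5) * scale; pass wOBA in place of OBP."""
    return (on_base / (1 - on_base)) ** exponent * scale


def bucket(woba):
    """Assign a wOBA to one of the five groups listed above."""
    for edge, label in zip(EDGES, LABELS):
        if woba < edge:
            return label
    return LABELS[-1]


def summarize(pitchers, scale=14.0):
    """pitchers: iterable of (woba, pa, ip, er).

    Returns {label: (average wOBA, actual ERA, formula estimate)}.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])  # woba*pa, pa, ip, er
    for woba, pa, ip, er in pitchers:
        t = totals[bucket(woba)]
        t[0] += woba * pa
        t[1] += pa
        t[2] += ip
        t[3] += er
    report = {}
    for label, (woba_pa, pa, ip, er) in totals.items():
        avg_woba = woba_pa / pa
        report[label] = (avg_woba, er / ip * 9, runs_from_on_base(avg_woba, scale))
    return report
```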


#31    Brian Cartwright      (see all posts) 2010/02/19 (Fri) @ 22:27

Here’s my wERA test results

1998-2009 MLB, pitcher’s wOBA allowed rounded to nearest .01, sum actual IP, PA, ER, ERA, calculated wERA for group

wOBA      IP      PA     ER     ERA    wERA
0.180     82     306     11    1.20    1.15
0.190    191     715     27    1.27    1.29
0.200    255     941     39    1.38    1.40
0.210    873    3295    148    1.53    1.58
0.220    596    2308    120    1.81    1.77
0.230   1945    7551    443    2.05    1.94
0.240   2947   11555    693    2.12    2.13
0.250   5122   20317   1328    2.33    2.33
0.260   8152   32619   2315    2.56    2.54
0.270   9316   37689   2842    2.75    2.76
0.280  16786   68631   5694    3.05    3.00
0.290  23660   98051   8648    3.29    3.26
0.300  36558  152899  14291    3.52    3.51
0.310  46922  198473  19605    3.76    3.79
0.320  49085  209680  22231    4.08    4.07
0.330  52074  225385  25495    4.41    4.38
0.340  46942  206024  24317    4.66    4.71
0.350  38977  172757  21643    5.00    5.04
0.360  27843  125240  16693    5.40    5.41
0.370  22902  103756  14444    5.68    5.75
0.380  13156   60654   8912    6.10    6.17
0.390   7334   34172   5274    6.47    6.56
0.400   3988   18846   3061    6.91    6.99
0.410   2077    9885   1683    7.29    7.39
0.420   1248    6072   1096    7.90    7.92
0.430    701    3395    606    7.78    8.27
0.440    239    1196    226    8.50    8.92
0.450    260    1303    277    9.58    9.34
0.460    114     583    121    9.58   10.00


#32    Tangotiger      (see all posts) 2010/02/19 (Fri) @ 22:55

Great data, Brian.  I appended my equation to your data, using this equation:

ERA = ( (OBP/(1-OBP)) ^1.5)*12.62

The standard deviation of our differences is 0.149 for you and 0.142 for me.  Pretty much, you can use either one.  Mine is pretty easy to remember for whatever that is worth.

wOBA      IP      PA     ER     ERA    wERA   Tango
0.180     82     306     11    1.20    1.15    1.30
0.190    191     715     27    1.27    1.29    1.43
0.200    255     941     39    1.38    1.40    1.58
0.210    873    3295    148    1.53    1.58    1.73
0.220    596    2308    120    1.81    1.77    1.89
0.230   1945    7551    443    2.05    1.94    2.06
0.240   2947   11555    693    2.12    2.13    2.24
0.250   5122   20317   1328    2.33    2.33    2.43
0.260   8152   32619   2315    2.56    2.54    2.63
0.270   9316   37689   2842    2.75    2.76    2.84
0.280  16786   68631   5694    3.05    3.00    3.06
0.290  23660   98051   8648    3.29    3.26    3.29
0.300  36558  152899  14291    3.52    3.51    3.54
0.310  46922  198473  19605    3.76    3.79    3.80
0.320  49085  209680  22231    4.08    4.07    4.07
0.330  52074  225385  25495    4.41    4.38    4.36
0.340  46942  206024  24317    4.66    4.71    4.67
0.350  38977  172757  21643    5.00    5.04    4.99
0.360  27843  125240  16693    5.40    5.41    5.32
0.370  22902  103756  14444    5.68    5.75    5.68
0.380  13156   60654   8912    6.10    6.17    6.06
0.390   7334   34172   5274    6.47    6.56    6.45
0.400   3988   18846   3061    6.91    6.99    6.87
0.410   2077    9885   1683    7.29    7.39    7.31
0.420   1248    6072   1096    7.90    7.92    7.78
0.430    701    3395    606    7.78    8.27    8.27
0.440    239    1196    226    8.50    8.92    8.79
0.450    260    1303    277    9.58    9.34    9.34
0.460    114     583    121    9.58   10.00    9.92


#33    Tangotiger      (see all posts) 2010/02/19 (Fri) @ 23:26

I updated the equation so that it shows 12.62.  That’s the best fit.  Above post has been updated.


#34    Tangotiger      (see all posts) 2010/02/19 (Fri) @ 23:33

FWIW, my equation has a much better fit at wOBA .275-.355, while Brian has a much better fit outside those ranges.

74% of the innings occur at the .275-.355 level, so I’ll encourage you to use the equation of my form for most cases.

Again Brian, great work, and exactly the kind of stuff I love to see.


#35    NLBB15      (see all posts) 2010/02/19 (Fri) @ 23:57

I can’t wait to see Brian’s article on predicting future ERA using xFIP, SIERA and others. I wonder how it will compare to the article at BP and any new xFIP findings from Eric. Should I be checking Fangraphs or will it be on here? I hope it’s prominent as I expect it to be very good. 

It looks like these wOBA converters are best at predicting y0? I’m excited to see Brian and others attempt to project y1 wOBA allowed. The marginal utility of all these metrics is small but I find it fascinating and worthwhile even if it’s of no interest to the “stupid(er)”


#36    Brian Cartwright      (see all posts) 2010/02/20 (Sat) @ 07:02

A wOBA converter should be the best at y0, as it’s using linear weights to represent all the events that actually occurred. However, a lot of what occurred was luck or the defense. A year or more into the future, these start evening out and the predictive value falls sharply.

FIP, SIERA, et al are skill estimators. They deliberately leave out things like BABIP and in some cases even HRs to keep only those that are most directly controlled by the pitcher, and thus are more consistent years into the future. They do worse in y0 because of the missing data, but better in years y1+ because they are based on rates which persist.

Another approach is as MGL says, you regress everything, just some things more than others. I did my Oliver pitching projections as a mirror of the process that I did for batters, but with differing amounts of regression. For example, pitcher’s BABIP is the most heavily regressed. I will be doing further variance and accuracy tests to tweak the regressions into producing projections with the lowest total error.
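
To make "regress everything, just some things more than others" concrete, here is one common way to shrink an observed rate toward the league rate. The `k` values a real system would use are not given in this thread; any number passed in is purely a placeholder, not a value from Oliver, CHONE, ZiPS or Pizza Cutter.

```python
def regress_rate(successes, trials, league_rate, k):
    """Shrink an observed rate toward the league rate.

    Equivalent to adding k trials of league-average performance:
    a larger k means heavier regression. A luck-heavy stat such as
    pitcher BABIP would get a large k, a skill-heavy stat such as
    strikeout rate a small one (the actual values are an assumption).
    """
    return (successes + k * league_rate) / (trials + k)
```

With `k` equal to the number of trials, the result sits halfway between the observed rate and the league rate.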

Tango long ago established the formula to convert batter’s wOBA to runs as wOBA/1.15*PA, as he had artificially scaled the linear weights based wOBA up 15% to match its mean to that of OBP. Batter’s runs above average was then
((wOBA-LGwOBA)/1.15)*PA.

This would not work for pitchers, as they produce their own scoring environments. I did a regression analysis of ERA vs pitcher’s wOBA allowed, and constructed a formula of the same form
((wOBA-LGwOBA)/1.15+(LgER/LgPA)/LGwOBA)*wOBA*PA

Tango’s formula in #32 looks to do just as well, which makes me feel good in that I must be doing something right. In the range of wOBA closer to the mean, Tango’s might be a little better, but wERA is never off by more than .05 runs per 9 IP (as shown in the chart above) and I can accept that as close enough. Using wOBA’s .270 and below, Tango’s is biased high, every result is higher than the actual ERA, by a weighted mean of .088, while wERA has some high, some low, for a weighted mean of -.001

By the time I get an article written, I probably will have already said everything here, but at least it will be well vetted!


#37    Tangotiger      (see all posts) 2010/02/20 (Sat) @ 08:30

Brian: well said overall.

You are correct about the bias for wOBA below .275 for my formula.  I think I know the reason, since the wOBA equation itself should change based on run environment.  Basically, wOBA is simply not the way to go for pitchers.  wOBA is simply linear weights, and linear weights breaks down for pitchers at the extremes.  BaseRuns is, by far, what you really want for pitchers and teams, especially at the extremes.

Your formula and my formula are attempts at fudging their way to the right point.

But, my equation needs exactly one parameter (wOBA), while for yours, you need, well, a lot.  So, mine is more accurate overall, more accurate for the 75% of pitchers around the mean, and uses far fewer parameters.  In any case, it’s good to have both as companion-stats.


#38    Guy      (see all posts) 2010/02/20 (Sat) @ 08:50

Great stuff.  I wonder, though, about using year 2 and year 3 data to evaluate a skill metric.  What we’re trying to measure is the players’ skill at that moment in time (year 0).  The only way to establish a skill is real is to see if it’s repeatable, so we have to look forward.  But, the farther forward you look, the more the skills change/deteriorate.  So by years 2 and 3 you’re asking a slightly different question, which is what metrics best predict a player’s aging curve?

Another possible problem is survivor bias:  you have many fewer pitchers at years 2 and 3 than at 1 or zero.  These are not randomly selected: they are the better pitchers in general, and the pitchers who improved most over the intervening years.  Is it necessarily true that the metric which best predicts the performance of the (relatively small) subset of pitchers still performing in 3 years is giving us the best measure of their skill TODAY?  I’m not sure.  (Also not sure it isn’t.)

It might be that evaluating these by their ability to predict something like the average of year 1 and year 2 would be the best approach.  (And does Brian’s “mean” include year 0?  I think it should only include years 1 and later, as predicting future performance really is different for the reasons he states.)


#39    Brian Cartwright      (see all posts) 2010/02/20 (Sat) @ 09:53

Good points by Guy.

The mean tests were on y0. It was something quick to see how well the metrics were centered on the mean.

I just checked the means for y1+, wERA went to -0.15, but all the skill based measures barely moved, generally under abs .07.

When I was playing with the numbers I threw in y2 and y3 just to see how things trended. I thought that a smaller error in y2 or y3 might indicate the metric was better at measuring persistent skills. wERA was not trying to measure those things, so I was not surprised it was worst at y2 and y3. SIERA appeared to have the smallest error 2 or 3 years down the road, but the differences are small.

Of course the error was going to get larger, and yes there would be a survivor’s bias. But remember this is total error (rmse) so it could just as well be estimating a better ERA than a worse.


#40    Zach      (see all posts) 2010/02/20 (Sat) @ 15:06

I’ve done separate tests and found that a standard base runs formula and OTSE (click name) are the most accurate for y0. Can you test these on your sample, Brian?


#41    Tangotiger      (see all posts) 2010/12/16 (Thu) @ 21:53

Bumping because I loved the work Brian put in here on years 2 and 3…

