THE BOOK--Playing The Percentages In Baseball

Wednesday, August 18, 2010

Understanding DIPS

By , 01:06 AM

Even though everyone who has ever written about DIPS in a responsible fashion explains it in terms of something like this:

“Pitchers have little or no control over their BABIP (emphasis mine)...”

Many people construe that as “no” (rather than “little or no") control, which leads to criticisms like (I am making these up to illustrate a point) this:

“Come on, one of the greatest pitchers in the history of baseball, Mariano Rivera, has a career BABIP of .277.  Any idiot can see that he induces weak contact, therefore he surely has a lot of control over his BABIP.”

Well, here’s the thing:

First of all, when someone says, “Little or no control,” they are not referring to any individual pitcher.  What they mean is simply that the spread of true BABIP talent among MLB pitchers is small.  Whether that wording, “Little or no control” is to your liking is up to you I suppose.  That is always one of the problems in trying to use “English” to describe or explain a mathematical concept.

I suppose that one could easily say, “All pitchers absolutely have control over their BABIP.  It is just that there is also a lot of noise or random variation in any sample of BIP that is less than enormous.”

The words, “Little, a lot, good, bad, great, etc.” have very little meaning without context and can generate an awful lot of controversy when used to describe or explain mathematical concepts.  One could on one hand say that Jason Kendall is a great baseball hitter, as compared to all baseball players at all adult levels, while on the other hand one could say that Kendall is an awful hitter, as compared to the population of major league baseball players.

Anyway, I digressed a little.

OK, let’s assume that DIPS is correct in that the spread of true talent is very small.  (For those of you who don’t know what “true talent” means: it is the “fill in the blank” that a player will accomplish if given an infinite number of opportunities, thus eliminating any random fluctuation, or luck if you will.  That number is also exactly equal to the best estimate of what that player will accomplish at any time in the future, assuming that his true talent “X” does not change with age, injury, etc.  For example, the “true talent batting average” for a fair coin flip is .500.)  Again, using the word “small” in that context is almost meaningless, since we are not comparing it to anything in particular.  But, we’ll stick with that characterization.

In fact, let’s say that by “small” we mean that the standard deviation (SD) of true talent BABIP for pitchers is 7 points or .007.  That basically means that virtually all pitchers have a true talent BABIP of .279 to .321 (three SDs on either side), assuming that the mean is .300.  (We can assume that the talent distribution is normal, and technically a normal distribution has no limits at either end, but when it comes to human endeavors and characteristics there likely are practical limits.)

Now, because this talent distribution is so “small” (say, compared to batting average, which probably has a SD of around 20 points or so), what this means is that for most samples of BIP, even multi-year ones for starting pitchers, we must regress their sample (actual) BABIP a lot - probably in the 90+% range - in order to estimate their true talent BABIP.
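To put a rough number on "regress a lot", here is a minimal sketch. It assumes the only noise in a BABIP sample is binomial luck and uses the 7-point talent SD from above; the function names are made up for illustration. Real samples also carry noise from fielders and parks, which this toy model ignores, so it will understate how much regression is needed in practice.

```python
def regression_fraction(n_bip, talent_sd=0.007, mean=0.300):
    """Fraction of the way to pull an observed BABIP back toward the mean.

    Assumption: sample noise is purely binomial, so the noise variance
    of an observed rate over n_bip balls in play is mean * (1 - mean) / n_bip.
    """
    noise_var = mean * (1 - mean) / n_bip
    talent_var = talent_sd ** 2
    return noise_var / (noise_var + talent_var)


def estimate_true_babip(observed, n_bip, talent_sd=0.007, mean=0.300):
    """Regress the observed BABIP toward the mean by the fraction above."""
    r = regression_fraction(n_bip, talent_sd, mean)
    return mean + (1 - r) * (observed - mean)


# A .270 BABIP over samples of increasing size
for n in (500, 1500, 5000):
    print(n, round(regression_fraction(n), 3), round(estimate_true_babip(0.270, n), 4))
```

Under these assumptions, a .270 BABIP over 500 BIP gets regressed roughly 90% of the way back to .300, and the fraction shrinks only slowly as the sample grows.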

And that is where the trouble often starts.

Let’s say that we have some really good pitchers in a season or two and they are “consistently” posting BABIP’s of .270.  The sabermetrician will quickly tell you that those .270 BABIP’s are likely mostly luck and he will estimate those pitchers’ true BABIP at something close to league average, or .300.  And that is when the ***t starts to hit the fan!  People start screaming, “What about Mariano Rivera, or Billy Wagner?  Or Nolan Ryan or even Tim Wakefield?  Surely great pitchers have low BABIP!  Why are you assuming that pitchers X, Y, and Z are really .293 or .295 pitchers when they have posted .270 BABIP’s for 2 straight years?”

Well, here is the deal. There ARE some pitchers who are true .280 pitchers.  And some that are even true .275 pitchers.  And possibly even .270 pitchers.  We already told you that when we told you that the SD of BABIP true talent among MLB pitchers was 7 points!  That means by definition that there likely are some of these pitchers in existence (not necessarily at the present time of course).  It is just that we don’t know who they are!  And there are so few of them, as compared to the many, many more who are near average, that if we find a pitcher who posts a low BABIP in 1 or even 2 or 3 seasons, it is much more likely that he is near average and got lucky than he is a true low BABIP guy.  So we just automatically assume that he is somewhere in between, but much closer to average.
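How few is "so few"? A quick sketch, assuming the talent distribution really is normal with a mean of .300 and an SD of .007 (the helper name is hypothetical):

```python
from statistics import NormalDist

# Assumed talent distribution: normal, mean .300, SD .007
talent = NormalDist(mu=0.300, sigma=0.007)


def share_at_or_below(cutoff):
    """Share of pitchers whose TRUE BABIP is at or below the cutoff."""
    return talent.cdf(cutoff)


for cutoff in (0.290, 0.280, 0.275, 0.270):
    print(f"true BABIP <= {cutoff:.3f}: {share_at_or_below(cutoff):.5f}")
```

Under that assumption only a fraction of one percent of pitchers are true .280 or better, which is why a low sample BABIP is much more likely to come from a near-average pitcher who got lucky.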

Even though we might call him a .293 pitcher (say, .270 heavily regressed toward .300), what we really mean is something like: there is a 20% chance he is a .300 pitcher who got lucky, a 15% chance he is a .299 pitcher who got lucky, a 10% chance he is a .298 pitcher who got lucky ... all the way down to a 1% chance he is a true .280 pitcher, and a 0.1% chance he is a true .270 pitcher, exactly equal to his sample BABIP.

So yes, he could be another Mo Rivera, who is likely 2 or 3 SD’s from the mean in BABIP.  But we simply don’t know that yet until we have 15 or 20 years from the guy at a .270 or .280 clip, and even then we are not 100% sure what he is.  At that point, the numbers will change to, “20% chance he is a .275 pitcher, 15% chance he is a .280 pitcher who got lucky, 10% a .285 pitcher who got lucky, etc.”

One final thing.  If we know something else about the pitcher other than his BABIP numbers, we can and should certainly change the way we do the math - at least change the mean we are regressing toward.  If he throws 95 mph like a Nolan Ryan, maybe the mean BABIP for all pitchers like that is .295 rather than .300 (I don’t know).  If he throws a knuckleball, like Wakefield, the mean BABIP is probably lower too.


#1    2010/08/18 (Wed) @ 04:15

Well, that’s the problem, isn’t it.  Looking at the population as a whole gives you the illusion that randomness rules the kingdom.

However, for truly good or bad individual players, that BABIP is high or low for a reason, and it’s not all luck (good or bad).

“So we just automatically assume that he is somewhere in between, but much closer to average”

My ex-boss would tell you that assuming anything makes an ASS U ME.  The devil lies in the assumptions.  The best way to wake any CEO up in a meeting is to mention you are assuming something.


#2    Kincaid    2010/08/18 (Wed) @ 05:01

You have to assume something, because you can’t know for sure what anyone’s true talent is.  What would the alternative be if you had to try to describe a pitcher’s true BABIP talent?

If your assumption is that the pitcher’s true talent is whatever his observed BABIP is, you’ll be wrong way more than if you make your assumptions the way MGL is talking about.


#3    2010/08/18 (Wed) @ 07:46

As with any and all statistics, context is key.  When speaking of a player’s value, no sole statistic should ever be used in an argument.  Baseball stats often contradict each other in small ways that lead to big arguments.

One of my biggest problems with BP, Fangraphs, what have you - in their drive for the “Single Best” statistic, they overlook the nuances described above for DIPS.

Take everything in context.  If 3 stats all claim someone is the best - believe it.  If 1 does, well, it is open to discussion.


#4    Tangotiger    2010/08/18 (Wed) @ 07:58

Lou: if what you get out of Fangraphs and BPro is the drive for a single-best statistic, then you are definitely not looking at it the right way.  There are a million bajillion component breakdowns at Fangraphs and B-R.com.

***

“However, for truly good or bad individual players, that BABIP is high or low for a reason, and it’s not all luck (good or bad).”

I just can’t believe you would say that after what MGL wrote.  Truly people will read whatever they want to read to get whatever point they want across.

***

You’ve got a weighted die.  You know it’s weighted because you built it.  It lands on “1” 25% of the time.  You also have nine unweighted dice.  You built those too.

You put all 10 in a pouch.  You roll each die 36 times.  You get these counts for the number of times you roll “1” for the 10 dice:

11-9-8-7-6-6-6-5-5-4

Which one is the weighted die?  YOU DON’T KNOW!!!

What is the CHANCE that it’s the one that rolled 11 ones?  MORE than the chance that it’s the one that rolled 4 ones.  But EACH of the 10 has a chance to be the weighted die.

It’s all based on probability, a number that is GREATER than zero and LESS than one. 

If anyone says “all luck” or “all skill”, leave this blog, and never come back.  WE DON’T KNOW.  All we can do is make a best estimate as to the mean, and a best estimate as to the uncertainty of that mean.  And, if you like, a best estimate of the uncertainty of the uncertainty of that mean.  And so on.
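The dice puzzle above can be worked through with Bayes’ rule. A minimal sketch, assuming exactly one of the ten dice is weighted, each die is equally likely to be that one before any rolls, and a fair die shows “1” one time in six (the function name is made up):

```python
def weighted_die_posterior(counts, rolls=36, p_weighted=0.25, p_fair=1 / 6):
    """Chance that each die is the weighted one, given its count of ones.

    With exactly one weighted die and an equal prior on each, the posterior
    for a die is proportional to the likelihood ratio (weighted vs. fair)
    of its own count. The binomial coefficients cancel in the ratio.
    """
    ratios = [
        (p_weighted / p_fair) ** k * ((1 - p_weighted) / (1 - p_fair)) ** (rolls - k)
        for k in counts
    ]
    total = sum(ratios)
    return [r / total for r in ratios]


counts = [11, 9, 8, 7, 6, 6, 6, 5, 5, 4]
for k, prob in zip(counts, weighted_die_posterior(counts)):
    print(f"{k:2d} ones: {prob:.3f}")
```

Under those assumptions the die that rolled 11 ones comes out a bit under 50% to be the weighted one, and even the die that rolled 4 ones keeps a chance of a little over 1%. Greater than zero, less than one.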


#5    2010/08/18 (Wed) @ 08:07

Tango

Practice what you preach.  I get MUCH more out of those sites than a drive for a single statistic; it was one issue with the writers on those sites.  One writer falls in love with one statistic, and bases an argument on it, without diving into alternate viewpoints provided by another statistic.  Perhaps MGL’s post wasn’t the place for the comment.

I really screwed up communicating my intent.  It was not a criticism of MGL’s post - just further agreement that DIPS is great, it just may not tell the whole story for every player.


#6    2010/08/18 (Wed) @ 08:51

This might be my favorite sentence ever:
“The words, “Little, a lot, good, bad, great, etc.” have very little meaning without context and can generate an awful lot of controversy when used to describe or explain mathematical concepts.”

Little meaning and lots of controversy indeed.

This is an excellent reminder, and a concept that anyone seeking to learn more about baseball stats should be required to read.


#7    2010/08/18 (Wed) @ 09:39

MGL,

Great stuff. I hope you can help me out with something tangential, though. Every time I read one of your articles on regression (and when I read the Book), I always have one nagging question: why normalize to league average? Mathematically, of course, I get it. And it makes perfect sense to me to assume that Pujols, Bonds, etc., are likely not as good as their career numbers.

Where I struggle is with the idea that Yuni Betancourt is likely better than his career numbers. Why do we assume that? Isn’t it actually just as likely that his true talent is that of a minor leaguer that has no business playing in the majors? After all, if we compare him to the average baseball player (not the average major leaguer), he’s already astronomically above average.

I guess my question is this: should we be more skeptical of a player’s projections if his observed performance is well below average than if his observed performance is well above average?


#8    minesweeper    2010/08/18 (Wed) @ 09:56

#1 did you read the post?


#9    2010/08/18 (Wed) @ 10:03

One important thing to remember is that true pitcher BABIP talent is relative to his team defense (and park, etc.). 

Mid-season this year I created a PITCHf/x-based pitcher BABIP predictor, and it turned up some of the normal suspects like Rivera and Wakefield among the predicted leaders, but mostly the leaderboard was populated by pitchers from the good defensive teams like Tampa Bay and San Diego.  Team fielding is a larger factor in actual pitcher BABIP than the pitcher’s own BABIP skill.

That’s not a new finding, of course, but it’s something people tend to forget.


#10    2010/08/18 (Wed) @ 10:34

Great post, MGL.  I haven’t followed things as closely as I should since the days of reading people debate these topics (and more interesting ones, like BaseRuns) on Fanhome or whatever those old message boards were.

In any event, the original studies by Voros (and Tom Tippett’s follow-ups) were limited to that universe of pitchers who threw at least 150 innings in back-to-back seasons in the major leagues.  I’ve never thought of DIPS as something appropriate in other contexts - i.e., looking at Rivera’s BABIP, or minor leaguers, etc.  Has the theory expanded such that we should think of pitchers like Mo in the same category as pitchers like Glavine?


#11    2010/08/18 (Wed) @ 10:43

Josh/12, I made a summary list of DIPS-related research in the “References and Resources” section at the end of my article in 2009:
http://www.hardballtimes.com/main/article/confessions-of-a-dips-apostate/

In that list are the following:

Clay Davenport examined the BABIP allowed by minor league pitchers and found that those pitchers who were eventually promoted to the major leagues allowed a lower BABIP than those who did not make the majors. One possible explanation is that, while differences in BABIP skill among major league pitchers may be small, they may nonetheless demonstrate BABIP skill that is not common among all pitchers at lower levels because major league pitchers are selected partially for this skill.

http://www.baseballprospectus.com/article.php?articleid=3946

Findings by Gassko and Dave Studeman indicated that DIPS might not apply as well to closers as it does to starting pitchers.

http://www.hardballtimes.com/main/article/dips-again/
http://www.hardballtimes.com/main/article/pictures-of-batted-balls/


#12    Matthew Cornwell    2010/08/18 (Wed) @ 10:49

Mo vs. Glavine?  In what terms?  Glavine is about .006 better than his mates (maybe a little closer to .007 if we extract out Maddux and Smoltz) but with 14,000 BIP.  Rivera has to be near .030 BABIP better than his mates but is closer to 3,000 BIP.  I am guessing that puts Glavine around 1.5-2 SD away from his mates and Mo closer to 3?  So we can be confident that Glavine’s BABIP vs. mates is pretty locked in and he is a solid BABIP reducer (maybe a 5% regression needed or so?), and we can look at Rivera as a very good BABIP reducer even after the much bigger regression.  So Rivera’s BABIP is so low that it is likely that his true BABIP ability is better than Glavine’s despite the BIP deficiency?


#13    2010/08/18 (Wed) @ 10:52

“Mo vs. Glavine?  In what terms?”

He was asking if DIPS applies equally well to relief pitchers as it has been shown to apply to starting pitchers.


#14    Matthew Cornwell    2010/08/18 (Wed) @ 11:05

#13

Instead of saying that pitchers have no true talent outside of the .007 spread, maybe we should say that all tenured MLB pitchers are talented enough to produce BABIP’s close to .300. If they couldn’t keep their BABIP close to .300, they would be out of the league pronto. In other words, near .300 (in the modern game) is the best humans are capable of. It’s not that pitchers are so bad at it that they can’t get away from .300. So a .310 true BABIP pitcher is still exhibiting plenty of BABIP skill...enough to stay in the league, but not as much skill as a .290 BABIP pitcher. Maybe if it was phrased in those terms it wouldn’t turn off so many people who only hear “your favorite pitcher was lucky, not good.”


#15    Matthew Cornwell    2010/08/18 (Wed) @ 11:07

#15

Well, we know that relievers have a better BABIP as a whole anyway, so we would have to adjust for that, of course.


#16    Rally    2010/08/18 (Wed) @ 11:11

"Clay Davenport examined the BABIP allowed by minor league pitchers and found that those pitchers who were eventually promoted to the major leagues allowed a lower BABIP than those who did not make the majors.”

That is something that deserves a further look.  I am really skeptical that you can find anything from a minor league babip rate.

1. It takes us many years of fulltime pitching data to determine that Glavine or someone actually has an ability here and not just lucky.

2. Minor leaguers don’t pitch as many innings.  The top prospects often don’t spend a full year at one level - 65 innings in AA, 50 in AAA, then a major league trial.

3. Defensive support in the minors shifts more than in the majors, a team might have excellent up the middle fielders like Tyson Auer and Alexi Amarista in A+, and then move them to AA for the second half.  This complicates comparing a minor leaguer’s partial season to his mates.


#17    2010/08/18 (Wed) @ 11:12

Matthew/17, I’m not aware of anyone who has studied that issue outside of the two links I provided in #13 and Tango’s “Rule of 17”.


#18    2010/08/18 (Wed) @ 11:16

Mike/13 - Thanks, yes that is what I am asking, and those links are appreciated.


#19    2010/08/18 (Wed) @ 11:19

Rally/18, Clay looked at large groups of pitchers, with 100k+ balls in play.

He looked at all pitchers in a given minor league together.  I don’t know if the defensive variations would wash out, but I’d think they would to some extent.

My main concern with his study is that there is selective sampling that is hard to avoid.  If you have two pitchers of equal BABIP talent and equal talent in other ways, the pitcher with better (i.e., lucky) sample BABIP performance will tend to be promoted to the majors ahead of the other pitcher.


#20    mettle    2010/08/18 (Wed) @ 11:32

Really great and clear explanation. Even though I already knew all the concepts, I couldn’t stop reading till the end to hear how *you* explained it. A sign of good writing.

It also reminds me of my favorites on regression to the mean for hitters:
http://www.hardballtimes.com/main/article/why-does-pujols-regress-to-the-mean/


#21    Colin Wyers    2010/08/18 (Wed) @ 12:02

mcsnide: What you’re referring to is a bimodal distribution - where you have two potential “means” to regress to.

The reason you would regress a below-average player to the MLB mean is that players who continue to play in MLB tend to regress to that mean. Those players who are below-average both in performance and talent tend not to play - they’re either benched or sent down to the minors.


#22    Colin Wyers    2010/08/18 (Wed) @ 12:16

Looking only at pitchers who both started and relieved in the same season from ‘93-’09, I took the weighted average of their BABIP as starters and relievers (using the number of BIP in the role where they had the fewest BIP as the weight).

BABIP_START: 0.297
BABIP_RELIEF: 0.285

So yeah, pitching in relief means a lower BABIP.


#23    Detroit Michael    2010/08/18 (Wed) @ 12:33

Thanks, Colin.  Very illuminating.


#24    Colin Wyers    2010/08/18 (Wed) @ 12:41

Bah, ignore that. Totally a matter of selective sampling - pitchers who move from starting to relieving tend to have above-average BABIPs, and then regress to the mean when they continue pitching in relief.

I broke it down to starters, short relievers (seventh through ninth innings) and long relievers. Nothing significant:

LONG: .286
SHORT: .282
START: .284


#25    Rally    2010/08/18 (Wed) @ 13:02

Mike #21,

Yeah, put it that way and selective sampling seems to be the only conclusion you can reach from it.

The study I would be interested in is if you take minor leaguers with matched pairs in bb%, k%, and hr%, do the differences in babip tell you anything, and how much, about future MLB success?


#26    Rally    2010/08/18 (Wed) @ 13:02

Or even future minor league success.


#27    MGL    2010/08/18 (Wed) @ 13:05

For those of you who teach statistics classes at the high school or college level, or even if you don’t, if you want to practice your skills, try/present this problem, which actually sums up DIPS.  And BTW, DIPS is not the least bit unique.  It is exactly the same as any other statistic which reflects a skill.  The “discovery,” if you will, was merely that the spread of skill was much less than was previously assumed (I didn’t say, “known,” as I am not aware that anyone had tried to quantify it before Voros did).

Anyway, the “problem” is this:

You have a normal distribution of pitcher BABIP skill with a mean of .300 and a SD of .007.

You have 3 pitchers who have had 500, 1000, and 2000 BIP respectively, with a BABIP of .270.

1) What is the (weighted) mean estimated true BABIP of each of those pitchers?

2) What is the standard error (uncertainty) of that estimate?

3) What are the respective chances that each pitcher is a true .270 or less pitcher?  .270 to .280?  .280 to .290?  .290 to .300?  Greater than .300?

This is a Bayesian problem of course. The normal distribution defines/describes the prior probabilities.  And the binomial distribution for each of the 500, 1000, and 2000 BIP defines/describes the “regular” probabilities.

In reality, of course, we don’t know the exact prior distribution.  We can try and estimate it from the data, but selective sampling (and the heterogeneous nature of the population of MLB baseball players) makes that difficult.
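For anyone who wants to try the problem numerically, here is one possible sketch, not a definitive solution. It assumes the prior SD is 7 points (.007, the figure from the article), takes the prior as exactly known, and approximates the posterior on a grid of true BABIP values from .200 to .400. The function name and the grid are illustrative choices.

```python
import math


def babip_posterior(n_bip, observed=0.270, prior_mean=0.300, prior_sd=0.007):
    """Grid posterior for true BABIP: normal prior times binomial likelihood.

    Returns (posterior mean, posterior SD, bucket probabilities) where the
    buckets are <=.270, .270-.280, .280-.290, .290-.300 and >.300.
    """
    hits = round(observed * n_bip)
    ks = range(2000, 4001)  # true BABIP in units of .0001 (assumed grid)
    log_post = []
    for k in ks:
        p = k / 10000
        log_prior = -0.5 * ((p - prior_mean) / prior_sd) ** 2
        log_like = hits * math.log(p) + (n_bip - hits) * math.log(1 - p)
        log_post.append(log_prior + log_like)
    top = max(log_post)  # subtract the max so exp() does not underflow
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    probs = [w / total for w in weights]

    mean = sum(k / 10000 * pr for k, pr in zip(ks, probs))
    var = sum((k / 10000 - mean) ** 2 * pr for k, pr in zip(ks, probs))

    edges = [2700, 2800, 2900, 3000]
    buckets = [0.0] * 5
    for k, pr in zip(ks, probs):
        buckets[sum(k > e for e in edges)] += pr
    return mean, math.sqrt(var), buckets


for n in (500, 1000, 2000):
    mean, sd, buckets = babip_posterior(n)
    print(n, round(mean, 4), round(sd, 4), [round(b, 4) for b in buckets])
```

The posterior SD answers the "standard error" question and the bucket list answers the "chances" question. With an unknown or heterogeneous prior the real answers would be less tidy.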


#28    Tangotiger    2010/08/18 (Wed) @ 13:22

There is a selection bias issue that you really have to look at.  In The Book, we see a huge difference in wOBA based on how you pool your pitchers.  For those guys who are starters / emergency relievers, you see the wOBA not change (even gets worse in relief I think).  But for those guys who are relievers / emergency starters, the wOBA improves significantly in relief.

As it pertains to BABIP, I posted a quick study on this blog that shows a drop for relievers of 17 points.  Again, it all depends on what you select in your pool.


#29    Tangotiger    2010/08/18 (Wed) @ 13:24

I use the Rule of 17 for the starter-relief conversion:
- 17% more K in relief
- 17% fewer HR in relief
- 17 lower BABIP points in relief
- flat walks
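As a rough illustration, the Rule of 17 can be written as a tiny conversion function. This is a sketch only: the function name is invented, the rates are per plate appearance (BABIP per ball in play), and the starter line in the example is made up.

```python
def starter_to_relief(k_rate, bb_rate, hr_rate, babip):
    """Apply the Rule of 17 to a starter's rates to get a relief estimate."""
    return {
        "k_rate": k_rate * 1.17,   # 17% more strikeouts in relief
        "bb_rate": bb_rate,        # walks stay flat
        "hr_rate": hr_rate * 0.83, # 17% fewer home runs in relief
        "babip": babip - 0.017,    # 17 points lower BABIP in relief
    }


# Hypothetical starter: 18% K, 8% BB, 2.8% HR, .300 BABIP
print(starter_to_relief(k_rate=0.18, bb_rate=0.08, hr_rate=0.028, babip=0.300))
```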


#30    Tangotiger    2010/08/18 (Wed) @ 13:27

Proof that MGL is making too much sense is that those few primates that are ready to pounce on him like a lion pounces on a zebra are nowhere to be found:
http://www.baseballthinkfactory.org/files/newsstand/discussion/the_book_blog_mgl_understanding_dips/


#31    2010/08/18 (Wed) @ 14:05

Is there a link anywhere showing the BABIP year-by-year? Either for all of MLB or broken down by leagues?


#32    Alex    2010/08/18 (Wed) @ 14:15

I’d love to hear your take on Tim Hudson.  Obviously some amount of luck is coming into play for him to have a .231 BABIP, but I wonder how much of what he’s doing this year is truly due to his ability to create weak contact.  He’s got a 65.5% GB rate (third best of fangraphs batted ball era) and an 11.8% LD rate (lowest of the FG batted ball era by 1.5%; same difference between him and second this year and second and 50th).  What do we make of him?  I assume he’ll regress some the rest of this season (due to the luck influenced part) and even more next year (chances are his sinker won’t be this good again), but how much of his current BABIP do you think is luck?


#33    2010/08/18 (Wed) @ 14:35

So far this year, I come up with the following figure for the NL: 69.3% of plate appearances are a ball in play. If pitchers don’t vary much in their ability to prevent hits on balls in play, does this give us some idea of how important pitching is? Suppose pitchers control the remaining 31% of outcomes. Could we say that “baseball is 31% pitching” (or some number close to that)? Also, this % has changed over time.

Then there is the issue that the batters, too, play a role in HRS, BBs, etc. So it could be less than 31%.


#34    Tangotiger    2010/08/18 (Wed) @ 14:49

Alex: the answer is easy: tell me what his BABIP will be over his next 30 starts, and put up 100$ to take the under.

***

Cy: that’s not the right way to approach this.  You do it based on the spread in talent times the frequency times the win (or run) value of each spread point.

Though the spread in BABIP is low, it’s still not zero.  And the frequency is very high.

Otherwise, you will conclude like Berri that there is no skill in goalies in the NHL because the save % spread is low.  Yeah, but they face 1800 shots a season.


#35    2010/08/18 (Wed) @ 14:58

Alex/36, Atlanta has a slightly above average group of fielders, and Tim Hudson’s career BABIP allowed is .283, so we’d expect him to be among the best pitchers in BABIP on that basis.  I don’t think his performance on ~500 balls in play this year changes that expectation much.


#36    2010/08/18 (Wed) @ 15:01

Tom

What % of a pitcher’s value do you think comes from his ability on BIP? Suppose a pitcher is truly 2 SDs better than average. So he allows a .014 lower BABIP. Suppose there are 27 BIP per game. That means he would prevent .378 hits per game. Suppose we give those a run value of .55. That would mean that a very good pitcher can save .21 runs per game by preventing hits on BIP. That does not seem like a lot.

I admit that the % could be higher than 31%. How much higher would you estimate? 40%? But the batters also have something to say about the nonBIP. That would lower the importance of pitching. How much I don’t know. I am not saying pitching is zero %.

And all I know about hockey is that Blackhawks finally one.

Cy


#37    2010/08/18 (Wed) @ 15:05

I mean won.


#38    Guy    2010/08/18 (Wed) @ 15:08

Very nice overview by MGL.  I think the question of whether the variance in a skill is “small” or “large” is actually straight-forward, but often gets discussed in unhelpful ways.  The obvious (to me) standard is how many runs/wins separates those who are good/bad in a given skill, and how does that compare to other relevant skills.  So the relevant question is how many wins does BABIP skill account for, compared to Ks, BBs, and HRs.  (As best I can tell, the answer is “a lot less than Ks, less than HRs, a little bit less than BBs.")

But instead, the discussion usually focuses on y-t-y correlation rates. Which only tells us how much the skill varies relative to the noise in our data.  Why people think that is more important than wins and losses, I’ll never understand.  The use of year-to-year correlation to assess the importance of skills is, I think, a really unfortunate outgrowth of Voros’ work (and a method that he himself continues to rely on far too much.) Of course, the sports economists (Berri, JC) often make the same mistake, so we can’t put all the blame on Voros....


#39    Tangotiger    2010/08/18 (Wed) @ 15:48

Right, what Guy is saying is what I’m trying to say.

Say that one SD in BABIP is .008 hits per BIP, and each hit to out is worth .80 runs.  And you have 27 BIP per 9 IP.  So, one SD is 0.17 runs.

How about BB?  Let’s say that each walk to average PA is worth .32 runs.  And you have 39 PA, and one SD is .02 walks per PA (just making numbers up).  One SD is 0.25 runs.

As you can see, we can make the BABIP skill be similar to the BB skill. 

What Guy is saying is correct, that what we should care about is how much skill there is per game, not per BIP or per PA.
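The arithmetic above is just spread times frequency times run value. A small sketch using the same numbers, which were partly made up for illustration in the comment itself (the function name is hypothetical):

```python
def runs_per_game_per_sd(sd_per_opportunity, opportunities_per_game, run_value):
    """Runs per 9 innings that one SD of talent is worth in a given skill."""
    return sd_per_opportunity * opportunities_per_game * run_value


# BABIP: one SD = .008 hits per BIP, 27 BIP per 9 IP, .80 runs per hit-to-out
babip_runs = runs_per_game_per_sd(0.008, 27, 0.80)

# Walks: one SD = .02 walks per PA, 39 PA per 9 IP, .32 runs per walk
bb_runs = runs_per_game_per_sd(0.02, 39, 0.32)

print(round(babip_runs, 2), round(bb_runs, 2))
```

Putting both skills on a per-game run scale is what makes them comparable; the per-BIP or per-PA spread alone is not.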


#40    Alex    2010/08/18 (Wed) @ 15:50

Like I said, I’m not sure it’s something Hudson can maintain going forward because his sinker has never been this good before.  I was just looking to get opinions on Hudson’s insane batted ball profile from people smarter than myself.  As far as Fangraphs data goes back, we’ve never seen anything like this, so I was hoping you guys might be able to make more of it than I can.


#41    Alex    2010/08/18 (Wed) @ 15:57

I guess I’m more interested in looking back than looking forward.  Based on his batted ball profile, what should we expect his BABIP to be?  Maybe someone could run his xBABIP (I don’t have access to Excel right now)?  I just think what he’s done so far this year makes for a really interesting case study with regards to a pitcher’s ability to create weak contact.  His season this year is probably the biggest outlier we’ve ever seen in those terms.


#42    2010/08/18 (Wed) @ 16:02

Alex, Derek Lowe posted a .237 BABIP in 2002.  His career BABIP is .296.  His lowest season BABIP since is .283 in 2002 and again in 2008.  I don’t see that an extremely low BABIP from a pitcher in one season tells us very much.  The evidence strongly indicates that it does not.  Why would Hudson be an exception?


#43    Jordan    2010/08/18 (Wed) @ 16:03

Alex/46

Based on the formula here:

http://www.fangraphs.com/blogs/index.php/expected-babip-for-pitchers/

0.15 * FB% + 0.24 * GB% + 0.73 * LD%

and the numbers you provided, Hudson’s expected BABIP would be 0.277.
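For anyone checking the arithmetic, a minimal sketch of the formula quoted above. Alex gave only the GB and LD rates, so the fly ball share is assumed here to be whatever is left over; that assumption and the function name are not from the linked article.

```python
def expected_babip(fb, gb, ld):
    """Expected BABIP from batted ball shares, using the coefficients quoted above."""
    return 0.15 * fb + 0.24 * gb + 0.73 * ld


gb = 0.655         # Hudson's ground ball rate, from Alex's comment
ld = 0.118         # Hudson's line drive rate, from Alex's comment
fb = 1 - gb - ld   # ASSUMPTION: fly balls are the remainder

hudson_xbabip = expected_babip(fb, gb, ld)
print(round(hudson_xbabip, 3))
```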


#44    2010/08/18 (Wed) @ 16:04

Re Hudson’s expected BABIP, my PITCHf/x-based xBABIP tool, which takes into account pitch type and location and team fielding, predicts Hudson for a .300 BABIP going forward.


#45    Alex    2010/08/18 (Wed) @ 16:15

Mike/47

Like I’ve said, I’m not as interested in looking forward as I am in looking back.  I fully accept that Hudson will almost certainly revert to a more normal form next season.  I just think this season is incredibly interesting.  The Lowe season you mentioned is one of the few I’ve seen that’s all that close and his LD rate is still 1.5% higher than Hudson’s.

As for your Pitchf/x-based xBABIP tool, does it also take into account the relative movements of the pitches?  It sounds interesting, but if you don’t give credit for an exceptionally good pitch type that one pitcher has, it seems that they would be underrated.  Have you written anything up on it?  It sounds very interesting.


#46    Alex    2010/08/18 (Wed) @ 16:21

Just looking at the raw Pitch f/x data at fangraphs, Hudson does appear to be getting a lot more downward movement (or less upward movement if you prefer) on all his pitches than he has previously.  Who knows if he can keep it up, but perhaps that’s part of the reason why he’s creating so much weak contact.  I wish we had Pitch f/x data for Lowe’s season in 2002...I wonder if his sinker was working especially well that year.


#47    2010/08/18 (Wed) @ 16:44

Alex/50, I didn’t mean next season.  I meant from today until the end of 2010.

I believe there are features of weak contact, and of pitcher control thereof, that we have yet to uncover.

However, I also believe that (1) it’s tough for a pitcher to pitch in situations/locations that favor weak contact on any consistent basis from game to game, and (2) that looking at outlier performers in pitcher BABIP has not proven particularly helpful in identifying those situations/locations.  It’s not that it’s impossible, I just haven’t seen it yet.  Thus, I’m not optimistic that Hudson is showing us a real skill that we can identify.  But I won’t rule it out because I don’t know, so don’t take my response as throwing cold water on your idea.  I’m skeptical, but open to learning.

“As for your Pitchf/x-based xBABIP tool, does it also take into account the relative movements of the pitches?”

No, other than grouping them together in a big bucket by pitch type.  Sinkers are grouped separately from four-seam fastballs, for instance.

“It sounds interesting, but if you don’t give credit for an exceptionally good pitch type that one pitcher has, it seems that they would be underrated.”

Pitch location matters a lot more for BABIP than pitch movement.  That’s why I chose to bin on location rather than movement.  But movement is not irrelevant, of course.

“Have you written anything up on it?”

No.


#48    Alex      (see all posts) 2010/08/18 (Wed) @ 16:57

I agree that looking at outliers in terms of pitcher BABIP isn’t that exciting because luck and defense play such a big role, but I think a real extreme outlier (Hudson’s got to be quite a few SD outside the mean to have a spread between him and second similar to the spread between second and average) in terms of LD percentage might be a little more interesting.  Regardless of fielding and luck of getting balls hit at fielders, we should expect a really low BABIP due to his extremely low LD rate (though a question still remains of how much luck is involved in that).

Perhaps an interesting study would be to compare the results of his pitches this year based on movement to his results previously based on movement, since his pitch movement profile has changed along with the batted ball profile.  I just feel like we could possibly learn more about the things you talk about us not knowing by studying an outlier as opposed to the more general population.


#49    Detroit Michael      (see all posts) 2010/08/18 (Wed) @ 17:06

Cyril#35,
I don’t know of a website that lists BABIP by year, with or without the leagues combined.  I used the individual pitchers’ data from baseball-reference.com (which means indirectly from Retrosheet), downloading it one league/year at a time, and computed it within Excel.  It’s fairly stable from 1989-onward but certainly fluctuates somewhat.  I’ve read elsewhere that the balls in play data before 1988 is suspect, but I’d place the dividing line one year later.


#50          (see all posts) 2010/08/18 (Wed) @ 17:21

Michael

Thanks. But I wonder how the ball in play data could be suspect. Isn’t BIP = BFP - HR - K - BB - HBP? My guess is that there might be some occasional mistakes year to year on that stuff, but nothing too big.
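The subtraction identity in the comment above can be sketched in a few lines of Python. This is only an illustration: the function names and the season line are hypothetical, and treating hits on balls in play as H - HR is an assumption that the comment does not spell out. The identity also ignores rare events such as catcher's interference.

```python
def balls_in_play(bfp, hr, k, bb, hbp):
    """BIP by the identity quoted in the thread: BFP - HR - K - BB - HBP."""
    return bfp - hr - k - bb - hbp


def babip(hits, hr, bfp, k, bb, hbp):
    """Hits on balls in play divided by balls in play.

    Assumption (not stated in the thread): hits on BIP = H - HR.
    """
    bip = balls_in_play(bfp, hr, k, bb, hbp)
    if bip <= 0:
        raise ValueError("no balls in play")
    return (hits - hr) / bip


# Hypothetical season line, invented purely for illustration.
example_bip = balls_in_play(bfp=900, hr=20, k=150, bb=60, hbp=10)
example_babip = babip(hits=200, hr=20, bfp=900, k=150, bb=60, hbp=10)
print(example_bip, round(example_babip, 3))
```

Any bookkeeping error in one of the five inputs feeds straight into BIP, which is one way older play-by-play records could plausibly be "suspect" even though the identity itself is simple.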

Here is what I came up with for the AL from 1989-2010

1989 0.285
1990 0.284
1991 0.284
1992 0.282
1993 0.291
1994 0.296
1995 0.295
1996 0.301
1997 0.300
1998 0.298
1999 0.301
2000 0.300
2001 0.294
2002 0.290
2003 0.291
2004 0.296
2005 0.291
2006 0.300
2007 0.302
2008 0.297
2009 0.298
2010 0.291

Is that stable? I really don’t know. I found that every year from 1941-1976 in the AL was below .280 with many being below .270.

Cy


#51          (see all posts) 2010/08/18 (Wed) @ 17:49

Tom

Per #38 & #44, I looked at all pitchers in 2009 with 500+ BFP. I got an SD of about .02 for walk rate (I did not try to weight it). The SD for HR rate was .0074. For SO rate it was .04477.

So HR) 39*.0074*1.4 = .404. A 1 SD improvement in HR rate saves about .4 runs per game.

For SO) 39*.04477*.22 = .384. So a 1 SD improvement in SO rate saves .384 runs a game.

So the value of preventing hits on BIP is much lower than that of HR, SO, or BB. But it does seem to count for something. It is equal to about 40% of the value of preventing HRs.
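The arithmetic in this comment can be reproduced with a short Python sketch. Reading the factor 39 as plate appearances per game is an assumption on my part, and the function name is hypothetical; the SDs (.0074, .04477), the run values (1.4, .22), and the .17 BIP figure attributed to Tom all come from the thread.

```python
PA_PER_GAME = 39  # assumed meaning of the "39" used in the comment


def runs_per_game(sd_rate, run_value, pa_per_game=PA_PER_GAME):
    """Runs saved per game by a 1 SD improvement in a per-PA rate."""
    return pa_per_game * sd_rate * run_value


hr_runs = runs_per_game(0.0074, 1.4)    # home runs
so_runs = runs_per_game(0.04477, 0.22)  # strikeouts
bip_runs = 0.17                         # figure attributed to Tom in the thread

print(round(hr_runs, 3), round(so_runs, 3), round(bip_runs / hr_runs, 2))
```

The last printed value is the ratio behind the "about 40%" statement, under the assumption that .17 is the comparable per-game figure for hits on balls in play.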

But one more question. If in finding the value of preventing hits on BIP, you also say that you have to take into account the value of the out that is created (combining the +.55 with the -.25 or so), do we have to do something similar with HR, BB and SO? Should we compare them to what happens otherwise? If you don’t give up a HR, you save 1.4 runs. But what happens instead? Should we assume some kind of average of all other events? Would then that mean you actually save less than 1.4 runs?

Cy


#52    Matthew Cornwell      (see all posts) 2010/08/18 (Wed) @ 18:04

Looking at the batted-ball type BABIP predictions, I can’t help but think about Maddux.  He was a RH GB pitcher with below average K’s. Nothing outrageous in terms of infield flies. How was he about 2 SD from his mates on BIP?  He doesn’t seem to have many of the BABIP reducing indicators on his side.  Has to be something with location/movement/pitch type. Glavine (about 1.7 SD away from mates) seems a little more obvious.  He threw tons of change-ups low and outside and caused a vastly disproportionate number of balls to not be pulled. Same with Jamie Moyer (near 1.5 SD from mates).  I can’t think of many other guys that make me say “huh” for the bulk of their careers regarding how their BABIP reductions happened.  Most of the rest have the high K or FB thing going.


#53    Colin Wyers      (see all posts) 2010/08/18 (Wed) @ 18:34

Matthew, how much of that is Maddux’s own ability to field? I’ve revised my fielding system substantially (and I think I have a good part of the play stealing problem, although I haven’t tested) but still have him as the leading fielding pitcher, with 156 plays above average in his career.


#54    Tangotiger      (see all posts) 2010/08/18 (Wed) @ 18:41

Since the denominator of BABIP is BIP, then converting a hit to a non-hit means converting it to an out.

Converting a BB (.33 times 3) to a non-BB PA (something times 37) means that the something is around 1/37, or -.03 runs.

I’m not sure that you would do it exactly like that, but that probably works.


#55    Matthew Cornwell      (see all posts) 2010/08/18 (Wed) @ 18:59

Colin -

Well, wouldn’t 156 plays above average be about 120 runs?  How is that possible when he is only about 100 hits or so on BIP better than his mates according to his .284 (or nearby) BABIP compared to his team’s .291 (or nearby).  How does that work?  It looks like I am comparing apples to oranges.


#56          (see all posts) 2010/08/18 (Wed) @ 19:13

Tom

Sorry, but I just don’t follow what you are doing in #59. Why multiply .33 times 3? Why are you multiplying by 37?

Suppose the typical PA is worth .11 runs on average. Then what is the typical nonBB worth? Something a little less. If walks are 10% of PAs (not really, but just rounding for simplicity), then the other 90% would have a value combined of about .0855.

That makes me wonder that if you prevent a walk, you get some other outcome that allows about .0855 runs. So not walking someone saves around .24 runs. I don’t really know if it should work this way. But I just did not see what you were doing.
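Cy's back-of-the-envelope reasoning can be written out as a small Python sketch. The .11 average PA value and the 10% walk share are the rounded figures from the comment; the .33 run value of a walk is taken from Tango's comment #54, and assuming Cy used the same value is my inference. The function name is hypothetical.

```python
def value_of_other_outcomes(avg_pa_value, event_share, event_value):
    """Average run value of all PAs that are NOT the given event.

    Solves: avg = share * event_value + (1 - share) * other_value
    """
    return (avg_pa_value - event_share * event_value) / (1.0 - event_share)


bb_value = 0.33  # run value of a walk, from comment #54
non_bb_value = value_of_other_outcomes(0.11, 0.10, bb_value)
runs_saved = bb_value - non_bb_value

print(round(non_bb_value, 4), round(runs_saved, 2))
```

Under these rounded inputs, replacing a walk with an average non-walk outcome saves roughly a quarter of a run, which matches the figure in the comment.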

Cy


#57    Guy      (see all posts) 2010/08/18 (Wed) @ 19:57

Cy:  I think you are comparing the observed SD for K, BB and HR—which includes a lot of luck—to Tango’s estimate of the true talent SD for BABIP.  So it’s apples to oranges.  If you use true talent for all variables, I think you will find that BABIP variance is not that much less than for BB and HR rates.

Rally/29:  I don’t think selective sampling explains all of Clay’s finding.  IIRC, he compared players who made it to the majors to those who didn’t at each minor league level.  So even if a guy gets promoted from A to AA based on BABIP luck, that doesn’t explain why he makes it to MLB.  Basically, the guys who eventually make it to MLB are better at every level, which suggests to me they really do have more hit prevention skill.


#58    MGL      (see all posts) 2010/08/18 (Wed) @ 20:37

Alex, a starting pitcher has 500 BIP per season, or around 400 in mid August.  The binomial SD for N=400 is around 24 points in BABIP.  If there are 100 full time starters, and all of them had exactly the same true BABIP (say, .290), by chance alone we expect 2-3 of them to be at less than .242 or so.
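As a rough check on this argument, here is a minimal Python sketch that uses a normal approximation to the binomial. The inputs (.290 true BABIP, 400 BIP, 100 starters, a .242 cutoff) are the ones in the comment; the helper names are hypothetical, and the normal approximation is my choice rather than anything stated in the thread.

```python
import math


def binomial_sd_rate(p, n):
    """SD of an observed rate over n independent trials with success chance p."""
    return math.sqrt(p * (1.0 - p) / n)


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


true_babip = 0.290
bip = 400
starters = 100
threshold = 0.242

sd = binomial_sd_rate(true_babip, bip)
z = (threshold - true_babip) / sd
expected_below = starters * normal_cdf(z)

print(round(sd, 4), round(z, 2), round(expected_below, 1))
```

With these inputs the SD comes out a little under 23 points of BABIP and the expected count is in the neighborhood of two pitchers, in line with the comment's estimate that a few identical-talent starters would sit near .240 by chance alone.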

How “interesting” can a pitcher with a BABIP of .230 or .240 (Hudson) be?  To me, not very.


#59    Matthew Cornwell      (see all posts) 2010/08/18 (Wed) @ 20:47

Given Hudson’s career BABIP (which is about .010 better than mates), the good defense he is playing in front of, and the sizable distance from league mean - doesn’t it suffice to say that there is probably some skill, some luck, and some defensive help going on there?  Probably in order of quantity going from 1. luck, 2. skill, and 3. defense?  Maybe switch 2 and 3?


#60    BWV 1129      (see all posts) 2010/08/18 (Wed) @ 21:09

To me, one of the most intriguing avenues for future research is alluded to in the last part of MGL’s write-up:

“One final thing.  If we know something else about the pitcher other than his BABIP numbers, we can and should certainly change the way we do the math - at least change the mean we are regressing toward.  If he throws 95 mph like a Nolan Ryan, maybe the mean BABIP for all pitchers like that is .295 rather than .300 (I don’t know).  If he throws a knuckleball, like Wakefield, the mean BABIP is probably lower too.”

One of the most obvious places to start with is batted-ball type; this is in essence what the likes of xBABIP and xFIP are trying to do.  I don’t know how accurate they are.

I wonder if strikeout rate, HR rate, etc. might also be something we could use.  I would imagine that some of these possibilities have been explored in articles I have not seen.


#61          (see all posts) 2010/08/18 (Wed) @ 21:10

Guy

You’re right. But, as you say, “If you use true talent for all variables, I think you will find that BABIP variance is not that much less than for BB and HR rates” then I think that raises the relative value of preventing hits on balls in play. I had it 40% for HRs. So it seems like it might even be higher. So being able to prevent hits on balls in play, although less important than the other events, is actually pretty important.

Cy


#62          (see all posts) 2010/08/18 (Wed) @ 21:48

For all the guys with 500+ BFP, the cumulative HR rate was .02638. The SD of HRs allowed for 750 PAs using the binomial distribution is 4.389. Dividing by those 750 PAs, we get a rate SD of 0.00585. That is a little less than the .0074 I reported before.

So redoing HRs gives us 39*.00585*1.4 = .32. Now being just as good at preventing hits on balls in play is about half as valuable as preventing HRs (Tom had come up with .17 for BIP).

My SD came from the square root of

750*.02638*.97362

Which was the square root of 19.263 or 4.389
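The binomial calculation in this comment can be checked with a few lines of Python. The HR rate (.02638), the 750 PA sample, the factor 39, and the 1.4 run value are all from the thread; the variable names are mine.

```python
import math

n_pa = 750
hr_rate = 0.02638  # cumulative HR rate stated in the comment

sd_count = math.sqrt(n_pa * hr_rate * (1.0 - hr_rate))  # SD of HR count
sd_rate = sd_count / n_pa                               # SD of HR rate
runs = 39 * sd_rate * 1.4                               # runs per game per 1 SD

print(round(sd_count, 3), round(sd_rate, 5), round(runs, 2))
```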


#63    Jeff Z      (see all posts) 2010/08/18 (Wed) @ 22:06

This article needs to be added to the required reading at the top of the blog.  Nice work as always MGL.


#64    Brian Cartwright      (see all posts) 2010/08/18 (Wed) @ 22:36

I’m doing research for a piece in the upcoming THT Annual on how groundball rates of batters and pitchers affect babip outcomes. As the groundball rate allowed by the pitcher increases, the mean vertical angle of each type of batted ball decreases. This results in the babip of groundballs decreasing, with the babip on LD’s increasing slightly, but the rate goes up nearly 50% for FB. Likely no effect on fbhr/fb, but there is a large reduction in ldhr/ld.

Here’s a table from MLB Gameday data, 2005-2010. I’ll likely use Retrosheet for MLB so I can get more years and a larger sample size at the extreme values.

gbrate    bc     babip     _ifh     _gbh      _gh     _ldh     _fbh    _ldhr     _fbhr     _hr
0.30    22765    0.284    0.071    0.202    0.273    0.720    0.144    0.026    0.125    0.091
0.35    79161    0.289    0.068    0.185    0.253    0.726    0.161    0.023    0.114    0.081
0.40   210321    0.295    0.066    0.181    0.247    0.722    0.166    0.023    0.115    0.080
0.45   230619    0.301    0.064    0.174    0.238    0.730    0.177    0.020    0.115    0.077
0.50   124738    0.302    0.065    0.168    0.233    0.729    0.185    0.020    0.111    0.073
0.55    52333    0.303    0.062    0.167    0.230    0.732    0.195    0.018    0.113    0.071
0.60    39143    0.300    0.063    0.160    0.223    0.735    0.209    0.015    0.123    0.072


#65    Alex      (see all posts) 2010/08/18 (Wed) @ 23:57

MGL/63,

That’s why I keep saying I’m a lot more interested in the LD rate than the BABIP.  No one has been close to Hudson’s LD rate up until last night (not sure how many LD he got charged with) over the course of a whole season.  There’s a similar difference between him and second place this season as there is between second place and 50th place.  He was 1.5% lower than the second best pitcher as far as fangraphs goes back.  That’s a pretty huge spread.  That’s got to be multiple SD from the mean.  It’s far more interesting than his BABIP, which by itself could be completely attributed to luck.  When you add in the LD rate, I think it’s clear something more interesting is going on here.


#66    Alex      (see all posts) 2010/08/19 (Thu) @ 00:04

I mean I’m perfectly willing to accept that part of Hudson’s LD rate is luck, but luck doesn’t seem to explain Hudson having an 11.8% LD rate while no one else since 2002 has been better than 13.3%.  That’s a huge outlier.  People keep focusing on his BABIP and ignoring the fact that his batted ball profile seemingly supports an extremely low BABIP.


#67          (see all posts) 2010/08/19 (Thu) @ 00:25

Alex/71, that’s an interesting point.  I was starting to pick up that that was what you were saying in your last few posts.  Earlier on it seemed like you were mainly talking about BABIP.  I don’t know why Hudson’s line drive rate this season is so low.  It may bear further investigation.  Without more data, it’s hard to say what to make of it, if anything.


#68          (see all posts) 2010/08/19 (Thu) @ 00:29

What I mean by “more data” is Hudson’s detailed batted ball data and his associated PITCHf/x data, and ideally, and though we don’t have it available to us, his HITf/x data.  Also, a more complete picture of how much of an outlier his season really is, statistically, in terms of line drive rate.

I know much less about line drive rate than I do about BABIP.  I’m not a big fan of the “line drive” batted ball classification.  It is extremely subjective, and thus subject to a lot of problems.  However, it’s not completely worthless, and it may still be telling us something in Hudson’s case.


#69    MGL      (see all posts) 2010/08/19 (Thu) @ 00:57

Let me say one thing that is not really in response to anyone’s post above, but that many people seem to misunderstand…

Part of the “luck” in a low (or high) BABIP is not only where the batted ball bounces, but also in the quality of the pitches thrown by the pitcher.  The exact location of every pitch has a lot of random noise in it.

For example, if a pitcher with a true BABIP has a BABIP of .240 over some period of time, it is likely that not only has he gotten some favorable bounces and some good defense, but he has also had excellent location overall, and of course, the batters have not squared up many pitches for whatever reasons.

So if you look at the data from a pitcher like Hudson, of course you will find that he has been above average in just about everything you can think of that is relevant to BABIP - line drive rate, quality of pitches, etc.

Now, it is also probably true that if you had two pitchers who were at .240, and one had most of their success in the “bounce of the ball” and the other in the quality of his pitches, the latter would likely have a better true talent BABIP than the former.


#70    Colin Wyers      (see all posts) 2010/08/19 (Thu) @ 02:05

As regards Hudson - well, I won’t talk about Hudson at all. But I am interested in discussing what we know about BABIP that happened, as opposed to what BABIP will happen or BABIP skill, etc.

What true score theory tells us is that observed variation in a population can be explained like so:

Obs_Var = True_Var + Rand_Var + Bias

In this case, “bias” may not be the right term - consider it things like environment (park, mainly). Anyway, we can pretty much leave it aside for the rest of this.
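To make the decomposition concrete, here is a minimal Python sketch that backs out a "true" SD by subtracting binomial (random) variance from observed variance, with the bias term set aside as the comment suggests. It borrows Cy's home run figures from earlier in the thread (observed SD .0074 in comment #51, HR rate .02638 over 750 PA in comment #62) purely as an illustration. Treating all pitchers as having the same 750 PA is a simplifying assumption, and the function name is hypothetical.

```python
import math


def true_sd(observed_sd, mean_rate, n):
    """Estimate the true-talent SD as sqrt(Obs_Var - Rand_Var).

    Rand_Var is the binomial variance of a rate over n trials.
    The bias/environment term is ignored in this sketch.
    """
    rand_var = mean_rate * (1.0 - mean_rate) / n
    true_var = observed_sd ** 2 - rand_var
    if true_var < 0:
        return 0.0  # observed spread is no larger than chance alone predicts
    return math.sqrt(true_var)


hr_true_sd = true_sd(observed_sd=0.0074, mean_rate=0.02638, n=750)
print(round(hr_true_sd, 4))
```

The same subtraction applies to BABIP, except that what is left over after removing random variance still has to be shared between the pitcher and the fielders.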

If we are interested in PROJECTION, we are only interested in the “true” variance, to the exclusion of the random variance. But if we are interested in DESCRIPTION, like Alex suggests, we care about the random variance. In the case of hitters, this is very straightforward - we simply credit the hitter with their observed BABIP and proceed from there. But if we can recast that equation in terms of pitching, we can see where it gets messy:

Obs_Var = True_Var_Pit + Rand_Var_Pit + True_Var_Field + Rand_Var_Field

Now what we can say is that generally speaking, the true variance of fielding is higher than the true variance of pitching - over time, fielders (as a unit) have a greater impact on BABIP than pitchers. So we can look at any pitcher’s BABIP (which is a term I hate - it’s not solely that pitcher’s BABIP, but the BABIP that occurred while that pitcher was working, by some combination of effort between the fielders and pitcher) and say that based upon that, we think the “true” BABIP skill of that pitcher/fielder unit is so-and-so, and then we can subdivide that up between the pitcher and fielders (with fielders as a unit ending up with a larger share of presumed skill than the pitcher).

And once we’ve accounted for the spread of “true” BABIP skill, we’re left with our random variation in BABIP. The crux of the question, I think, is how to split random variation in pitching from random variation in fielding.

And what we know is this:

1) The distribution of batted balls affects the likelihood that batted balls will be fielded, and
2) The act of fielding a batted ball affects the perception of where it was hit.

So you have a nice little cycle of curse and recurse where our recording of batted ball distribution tells us something about BABIP, but the actual BABIP tells us something about the nature of our recording of batted ball distribution. (Emphasis on curse.)


#71          (see all posts) 2010/08/19 (Thu) @ 11:33

I’m curious what people think about what % of responsibility for BIP should be assigned to the pitcher and what % should be assigned to the fielders.

Are pitchers responsible for 10%? 20%? Is it or can it be different for different pitchers?

If pitchers have some ability, however slight, to affect BABIP, how much of their overall value does it make up? Can 10% of a pitcher’s value come from it? 20%? Higher?


#72    Matthew Cornwell      (see all posts) 2010/08/19 (Thu) @ 20:14

#76

Wasn’t there a 40-30-20-10 breakdown suggested a few years back with 40% being luck and 10% being park with pitching skill and defense there in the middle?  Is that seasonal?  I guess we could take out more of the luck portion and attribute it more to pitching skill as more BIP accrued?


#73    studes      (see all posts) 2010/08/19 (Thu) @ 20:57

Based on work I did a long time ago at Baseball Graphs (dang if I can find it now), I thought a 50/50 split between pitching and fielding was generally appropriate. I think I was looking at DER, which is pretty much the same as BABIP.

I think DER was park-adjusted, and I was basically trying to spread the “luck” part evenly between the two.  So long ago… I’m sure people have had more insightful analyses since.

BTW, I definitely believe that several/many elite relievers have a true relatively low BABIP talent.


#74          (see all posts) 2010/08/19 (Thu) @ 21:07

Matthew

Thanks. I probably did not see that breakdown, so it was good to see.

studes

Thanks.

Cy


#75    Matthew Cornwell      (see all posts) 2010/08/19 (Thu) @ 21:13

Cyril - don’t take my word for it.  I seem to remember it, but it could be something else.


#76          (see all posts) 2010/09/29 (Wed) @ 17:04

Since this thread, Hudson’s BABIP? .324

And Trevor Cahill’s BABIP since then? .305


#77    dave smyth      (see all posts) 2011/01/17 (Mon) @ 18:51

For a long time I have wondered whether it was most correct for the DIPS insight to have been put in terms of hits per non-HR ball in play. I mean, why was the defense-independent characteristic assumed to be the main source of the phenomenon? Maybe it’s something else. I’m more interested in how hard a pitcher allows balls to be hit, with respect to evaluating his batted ball talent. HR are obviously a part of this approach.

From the baseball I’ve watched over many years, it seems like the luck in this area is mostly in 1B per fair batted ball (meaning that HR are included). The ball is going to somehow go for a single 21% or 22% of the time, despite the advantage or disadvantage in the count, and despite the quality of the pitcher or hitter, except in the uncommon case of a specialist like an Ichiro. Even pitchers batting are around 18 to 19%, and given that they don’t practice hitting as much and are slower runners, that’s acceptable in my theory.

So, why not limit DIPS to 1b per fair batted ball? And applied to 95% of ballplayers?


#78    Arvin      (see all posts) 2011/01/18 (Tue) @ 04:01

Since this thread has been resurrected…

#70-73:  This is what Tom, Erik Allen, and myself did back in 2003.  Although, we were using single season numbers, iirc, so with the data we have now, we have much better numbers.

#69: “Part of the “luck” in a low (or high) BABIP is not only where the batted ball bounces, but also in the quality of the pitches thrown by the pitcher.  The exact location of every pitch has a lot of random noise in it.”

Yes, and some pitchers create a lot more noise than others (e.g. some pitchers have a lot more variance in their hittability).

#77: “I’m more interested in how hard a pitcher allows balls to be hit, with respect to evaluating his batted ball talent. HR are obviously a part of this approach.”

I’ve been looking at a lot of LD-rate data recently, and I’m starting to think that some pitchers have A LOT of control over their line drive rate.  Some of the numbers I see for game-to-game seasonal performance are astonishingly consistent.  For other pitchers, their within season game-to-game LD rate fluctuates A LOT.

BABIP has been barely detectable on a year-to-year correlative basis.  Line Drive rate should be more detectable. 

A model (Bayesian) that takes into account pitcher’s predictability of LD-rate should be able to make much better predictions for future LD-rates.  Regress a lot when the pitcher has little control over his own LD rate, and regress just a little when the pitcher has a lot of control over his own LD rate.
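The "regress a lot or a little" idea can be sketched with a simple shrinkage estimator in Python. This is not a full Bayesian model, only the weighted-average form such models often reduce to. The 11.8% LD rate is Hudson's figure from comment #66 and the 400 BIP is MGL's mid-August estimate from comment #58; the .19 population mean and both ballast values are hypothetical numbers chosen for illustration.

```python
def regress_to_mean(observed_rate, n, population_mean, ballast):
    """Shrink an observed rate toward the population mean.

    `ballast` is the number of average-rate trials added to the sample.
    A small ballast means little regression (the rate is treated as mostly
    skill); a large ballast means heavy regression.
    """
    weight = n / (n + ballast)
    return weight * observed_rate + (1.0 - weight) * population_mean


observed_ld = 0.118  # from comment #66
bip = 400            # from comment #58
league_ld = 0.19     # hypothetical population mean

high_control = regress_to_mean(observed_ld, bip, league_ld, ballast=200)
low_control = regress_to_mean(observed_ld, bip, league_ld, ballast=2000)

print(round(high_control, 3), round(low_control, 3))
```

In a per-pitcher model of the kind the comment describes, the ballast would itself be estimated from how stable each pitcher's game-to-game LD rate has been, which this sketch does not attempt.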

As for HR, I think I remember reading something about LD rate being one of the key predictors of HR rate, which makes sense.

Obviously, if we have Fliner data, incorporating that into a “strength of hit-ball” given up would provide more robust data.


#79    Colin Wyers      (see all posts) 2011/01/18 (Tue) @ 11:57

“I’ve been looking at a lot of LD-rate data recently, and I’m starting to think that some pitchers have A LOT of control over their line drive rate.  Some of the numbers I see for game-to-game seasonal performance are astonishingly consistent.  For other pitchers, their within season game-to-game LD rate fluctuates A LOT.

BABIP has been barely detectable on a year-to-year correlative basis.  Line Drive rate should be more detectable.”

To the extent that BABIP is not well correlated y-t-y but LD% is, that just tells us that line drive percentage is not a good indicator of pitcher BABIP skill. So why do we care? (Especially since we can explain LD% persistence through things like observational biases.)


#80    Arvin Hsu      (see all posts) 2011/01/18 (Tue) @ 15:44

"To the extent that BABIP is not well correlated y-t-y but LD% is, that just tells us that line drive percentage is not a good indicator of pitcher BABIP skill. So why do we care? (Especially since we can explain LD% persistence through things like observational biases.)”

How can you say that?  Out of the three types of BIP’s, LD’s fall for the greatest percentage of hits.  IIRC, FB’s have like a 15% BABIP, GB’s have 18% BABIP, and line drives have 70% BABIP.  It’s a causal connection.  Just because there’s noise and it makes it hard to detect doesn’t mean it doesn’t exist, especially for something that’s clearly causal.


#81    Tangotiger      (see all posts) 2011/01/18 (Tue) @ 15:55

Colin is correct in pointing out the LD scorer or park biases, as we’ve seen in the past from him and Harry.

As for his other point, it seems that he’s saying that because of the low correlation in BABIP, that breaking it down into components is not going to get us very far?  I think he’s saying that.


#82    Arvin      (see all posts) 2011/01/18 (Tue) @ 16:21

I guess it comes down to whether or not the variance introduced by persistent LD-scoring biases outweighs the variance induced by converting LD/GB/FB rates into BABIP. 

The issue with BABIP has always been the high degree of noise that is associated with the statistic, which is why most current implementations regress to league average.  As we’ve seen, pitchers do have some degree of control over BABIP, it’s just really hard to detect. 

Let’s say we assume that LD/GB/FB rates have no persistent scorer biases.

BABIP = H/BIP
= (H_LD + H_GB + H_FB)/(BIP)
= ((H_LD/LD)*LD + (H_GB/GB)*GB + (H_FB/FB)*FB)/(BIP)
= (BA_LD)*(LD/BIP)+(BA_GB)*(GB/BIP)+(BA_FB)*(FB/BIP)
= BA_LD*LDrate+BA_GB*GBrate+BA_FB*FBrate,

where H_LD, H_GB, H_FB are the # of hits from LD, GB, FB and BA_LD, BA_GB, BA_FB are the batting averages for LD, GB, and FB, respectively.

We have good league estimates for BA_LD, BA_GB, BA_FB.  Since pitcher LDrate, GBrate, and FBrates have season to season correlation, and can be used predictively (albeit with regression, as always), wouldn’t expressing predicted BABIP as the sum of league_BA * predictedrate for each of LD, GB, and FB result in a better prediction?

What this does is it removes the binomial variance from BA_battedballtype for each batted ball type.  If you wish, you can adjust BA_LD, BA_GB, and BA_FB for both defense and park effects, which would decrease variance even further.

I’m arguing that this should be a better predictor of expected BABIP.
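The decomposition above translates directly into code. In this Python sketch the league hit rates are the rough values recalled in comment #80 (about .70 for LD, .18 for GB, .15 for FB), which appear to understate overall league BABIP, so only the structure of the calculation should be taken seriously. The two batted-ball profiles are hypothetical, apart from the 11.8% LD rate quoted for Hudson earlier in the thread.

```python
import math

# Rough league hit rates by batted-ball type, as recalled in comment #80.
LEAGUE_BA = {"LD": 0.70, "GB": 0.18, "FB": 0.15}


def expected_babip(rates, league_ba=LEAGUE_BA):
    """Sum of league BA per batted-ball type times the pitcher's rate of that type."""
    if not math.isclose(sum(rates.values()), 1.0, abs_tol=1e-9):
        raise ValueError("batted-ball rates should sum to 1")
    return sum(league_ba[kind] * share for kind, share in rates.items())


# Hypothetical profiles; the rates would normally be regressed first.
typical = expected_babip({"LD": 0.18, "GB": 0.45, "FB": 0.37})
low_ld = expected_babip({"LD": 0.118, "GB": 0.60, "FB": 0.282})

print(round(typical, 4), round(low_ld, 4))
```

Because line drives carry by far the largest hit rate, a few points of LD rate move the estimate much more than the same shift between ground balls and fly balls, which is also why scorer bias in the LD label matters so much.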

Now, if LD/GB/FB scoring biases introduce more variance than this process removes, it’s useless.  I’m not familiar with Colin’s work documenting these biases.  Is there a way to systematically extract and adjust for scorer bias much like we extract and adjust for park effects?  If so, then that would increase predictive accuracy even further.


#83    Tangotiger      (see all posts) 2011/01/18 (Tue) @ 16:38

Arvin, I guess you have been out of it for a while.  What you are doing, multiply the hit rate per event type by the frequency of each event type is standard for some systems out there.

Running a regression on the frequency, and on the hit rates, was done at least as far back as MGL’s DIPS Revisited article from 2003 or so.

There’s a long number of articles since then, at Hardball Times in particular, that focuses on this.  I’m thinking from Gassko, Colin, Bendix, and several others.

***

I will also add the following: we don’t care about batting average so much as run values. The run value of a GB and a FB is identical (after discarding HR).

The second thing is: HR.  A HR on a line drive is removed if you look at “balls in park”, but kept if you look at “contacted balls”.  So, you’ve got to be very clear as to when and why a HR is being removed or kept, especially if the focus is on line drives specifically.  We can’t just treat the removal of the HR from the dataset on the basis that no fielders were involved.  That may be correct for some specific application, but it has to be done carefully and with justification.


#84    Colin Wyers      (see all posts) 2011/01/18 (Tue) @ 16:40

The key findings:

http://www.hardballtimes.com/main/article/when-is-a-fly-ball-a-line-drive/

http://www.baseballprospectus.com/article.php?articleid=10523

And since we know there is bias in terms of batted ball types, the question is whether or not persistence in batted ball types reflects primarily persistence of bias or persistence of skill. When we look at OUTCOMES, we see evidence of very little persistence. I see no evidence that batted ball types provide evidence of persistence that we simply haven’t been able to detect in outcomes; instead, I see evidence that the batted ball data is unduly influenced by bias.


#85    Tangotiger      (see all posts) 2011/01/18 (Tue) @ 16:50

Arvin: one of the possibilities is that a ball is marked a line drive if it falls for a hit and is marked a fly ball if it is caught, in a disproportionate manner.

The extent of this bias is what we need to get a handle of, so we can understand the implications of it.


#86    Arvin      (see all posts) 2011/01/18 (Tue) @ 17:20

"Arvin, I guess you have been out of it for a while.”
Yes. You could say that again.  Since 2003, basically.  I didn’t even find out that you had wrapped up our Primer discussion regarding fielding, park, and pitching variance components of BABIP in a tidy little pdf until years later when I googled my name randomly.

“Running a regression on the frequency, and on the hit rates, was done at least as far back as MGL’s DIPS Revisited article from 2003 or so.”
Seems like it appeared in 2004.  I’ll take a look at it.

“There’s a long number of articles since then, at Hardball Times in particular, that focuses on this.  I’m thinking from Gassko, Colin, Bendix, and several others.”
I read the Gassko articles from 2005 and 2006. He is a proponent of discarding LD-rate and using league-avg LD rate instead.  At least that’s what he proposed for DIPS 3.0.  I disagree with that, as some pitchers seem to have a large degree of control over their LD rate.

“I will also add the following: we don’t care about batting average so much as run values. The run value of a GB and a FB is identical (after discarding HR).”

I guess it depends on what your goal is.  Only run values matter for wOBA.  However, for predicting counting stats like HR, 2B, 1B, wouldn’t having the actual BA_FB and BA_LD numbers be useful?

Agreed on the caution with regards to how HR are used in the analysis.

Thanks for the links Colin.  I’ll take a look.


#87    Arvin      (see all posts) 2011/01/18 (Tue) @ 19:52

DIPS Revisited, MGL: http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2004-02-29_0/

So MGL found that pitchers have a degree of control over IFFB, OFFB, Pop Flys, Ground Balls, and Line Drives, as well as BA_LD.  The fangraphs data subdivides differently, iirc.  It has GB/FB/LD/IFFB/Fliner, but Fliner may be OFFB, and FB may be Pop Flys.

If he found these in 2004, why do most of the popular metrics still support throwing out BABIP entirely?  Shouldn’t we be taking these 5 stats and BA_LD, adjust for defense and park, regress, then predict BABIP?

Colin: nice work on the first article.  Since that scorer who works @ Pittsburgh said he doesn’t use TV feeds, do you still think throwing out those 6 ballpark data points is valid?  Also, since you have an LD-rate effect by ballpark(press-box height), wouldn’t it be appropriate to then adjust observed LD-rates by that, rather than throwing out the LD statistic as unreliable?

On the second article, the images are broken, but I got the gist: BIS consistently scores certain ballparks differently than MLBAM, so we not only have ballpark effects in the scoring of batted ball data as per your first article, we also have ballpark-data gatherer interaction effects. 

I wonder if we could evaluate BIS and MLBAM in terms of which set of batted ball observations better predicts actual measured outcomes.  Alternatively, we could just average the two sets of observations, doubling our observation sample size so to speak.

“I see no evidence that batted ball types provide evidence of persistence that we simply haven’t been able to detect in outcomes; instead, I see evidence that the batted ball data is unduly influenced by bias.”

Isn’t this essentially what MGL showed in 2004?  He looked at year-to-year correlations for pitchers who switched ballparks/teams, and found detectable correlations for all batted ball event types and for BA_LD.

