THE BOOK--Playing The Percentages In Baseball


Saturday, August 13, 2011

BABIP primer

By Tangotiger, 06:58 PM

A great primer on BABIP, and just a good read all-around even if you’ve had your share of BABIP.

I’d ignore the talk of “Nasty Factor”, because the stringer can be biased.  Is he going to give a high Nasty Factor if a hitter hit a liner in the gap?  Probably not.

And check out the great splits chart at the end of the article.


#1    danmerqury    2011/08/13 (Sat) @ 20:32

This is indeed a fantastic article. But one thing: “Nasty Factor” is just a number that Gameday spits out, on a scale of 1-100. As far as I know, it’s a formula that the guys at MLBAM created that uses pitch location, sequencing, movement, etc. Has nothing to do with the batter or any contact made on the pitch.


#2    2011/08/13 (Sat) @ 22:21

Dan/1, correct.

It was a good article, long, and I’m not sure I digested it all, but good.


#3    2011/08/14 (Sun) @ 05:39

Thanks Dan and Mike.

“I’m not sure I digested it all”
Heh, I do tend to get long-winded at times :) Thanks for making it through.


#4    MGL    2011/08/14 (Sun) @ 10:39

I am not thrilled with this article, to say the least.

The premise is reflected in this statement:

“The above data does indicate that if pitchers would have a different defense behind them every 5 at bats, we might see the same influence on BABIP as we do with the batters.”

That is absolutely untrue.

The reason that there is little predictive value in BABIP for pitchers as compared to batters is that the spread of BABIP talent among pitchers is much smaller than that of batters.  If you had the same defenses behind all pitchers every day, you would still see little predictive value and little spread among pitchers over large samples. The reason is that all major league pitchers have good stuff and they all throw nasty stuff when they are ahead in the count.  The ability to hit the ball hard obviously varies tremendously for batters because they are not necessarily selected for how hard they hit the ball.  They are also selected based on defense, speed, walking ability, etc.


#5    2011/08/14 (Sun) @ 10:55

MGL/4, I agree with your contention, but your explanation in your final paragraph makes no sense.

Of course major league batters are selected for their ability to hit the ball hard.  That’s one of the primary things they are selected for.  It’s absurd to state otherwise.

I thought Bojan’s explanation in the article was very good for why quality of the fielders and attributes of the park affect the pitchers much more than the batters.  And that is one of the major findings of Voros with DIPS, one that is far too often overlooked.  Matt Cain is falsely seen as some anti-DIPS poster boy largely because people don’t understand this part of DIPS that Bojan captured very well.

But yes, once you remove that, batters still have a disproportionate impact on BABIP, something like five times as much as the pitchers.  The reason for that is quite simply that the batter is the one who swings the bat.  Bat speed and where in the swing the batter contacts the ball are what determine how hard the ball comes off the bat.  The pitcher has some influence because he can try to deceive the batter into swinging at the wrong place and time, or put the ball in a location where it is hard for the batter to put a good swing on it.  But the batter is ultimately the one with the control over where the bat goes and how fast, and that is what controls how fast the ball comes off the bat.


#6    2011/08/14 (Sun) @ 11:04

MGL/4
Thanks for your input.

The highlighted statement is a poor choice, you are right: while the data seem to imply that “poorer” pitches are hit with less success, the implied 1:1 relation is pure speculation. I will fix that, thanks.


#7    2011/08/14 (Sun) @ 11:10

I’d like to retract my statement in #5 about MGL’s claim being absurd.  That’s a poor choice of words.

If what MGL was claiming is that fielding + baserunning is an important part of the game for hitters, much more so than fielding + hitting + baserunning is for pitchers, and that this has the effect of broadening the BABIP skill we observe among hitters as compared to pitchers, I agree.  I don’t see how we have any way of quantifying the impact of that on the spread of BABIP, though.


#8    Tangotiger    2011/08/14 (Sun) @ 11:42

There are many reasons for getting a low correlation.  But one reason is if the spread in talent is low to begin with.  If, let’s say, all the strikeout pitchers were Verlander, Weaver, Felix, Lincecum, and 5 or 6 others, and you look at the K/PA that you observe after 100 PA in one pool and then again in another pool, then your correlation is going to be much lower than if you used all MLB pitchers.

This is why for example if you look at save percentage for the top 10 goalies, their correlation coefficient is going to be much lower (practically close to r=0) than if you looked at the top 50 or top 100 or top 1000 goalies.

(One way to combat this is to increase the number of opportunities.  So, even if you have 10 goalies or 10 pitchers, if you give them 2000 PA or give them 10,000 shots, then the correlation is going to shoot up.)

I would say that 90% of academicians ignore this when it comes to sports.

Basically, the larger the spread in talent, the larger the correlation coefficient.
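
To make the point about spread and opportunities concrete, here is a minimal Python sketch. It assumes each pitcher's K/PA talent is drawn from a normal distribution and that every PA is an independent trial at that talent level. The function names, the 20% average K rate, and the talent SDs of .01 and .04 are illustrative assumptions, not figures from this thread. Under those assumptions, the expected correlation between two samples of n PA each is the talent variance divided by the talent variance plus the binomial noise variance.

```python
import random
from statistics import mean


def expected_r(talent_sd, p, n):
    """Expected correlation between two independent n-trial samples,
    assuming binomial noise around a talent level with SD talent_sd."""
    talent_var = talent_sd ** 2
    noise_var = p * (1 - p) / n
    return talent_var / (talent_var + noise_var)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulated_r(talent_sd, p, n, players, rng):
    """Simulate two n-PA samples per pitcher and correlate the observed rates."""
    first, second = [], []
    for _ in range(players):
        # Hypothetical talent model: normal around p, clamped to [0, 1].
        talent = min(max(rng.gauss(p, talent_sd), 0.0), 1.0)
        first.append(sum(rng.random() < talent for _ in range(n)) / n)
        second.append(sum(rng.random() < talent for _ in range(n)) / n)
    return pearson(first, second)


rng = random.Random(0)
for label, sd, n in [("narrow pool, 100 PA", 0.01, 100),
                     ("wide pool, 100 PA", 0.04, 100),
                     ("narrow pool, 2000 PA", 0.01, 2000)]:
    print(label, round(expected_r(sd, 0.20, n), 3))
print("simulated wide pool:", round(simulated_r(0.04, 0.20, 100, 2000, rng), 3))
```

With these made-up inputs, the formula gives about r = .06 for the narrow pool and r = .50 for the wide pool at 100 PA. Raising the narrow pool to 2000 PA brings it to about r = .56, which is the "increase the number of opportunities" effect.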

As for determining how much talent there is, this is also fairly straightforward (to a point anyway): you look at the observed spread, and you compare it to what you expected from random.

Now, what you expected from random has to also include things like park and defense and whatnot.

Then, whatever is left over is your spread in talent.
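
The subtraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the random component is modeled as binomial noise, everything else expected "from random" (park, defense, and so on) is passed in as a single hypothetical `other_var` term, and the function name and sample numbers are invented for the example.

```python
from statistics import mean, pvariance


def estimate_talent_sd(rates, opportunities, other_var=0.0):
    """Estimate the spread in talent as observed variance minus expected
    random variance (binomial) minus any other non-talent variance."""
    # League-wide rate, weighted by each player's opportunities.
    total = sum(opportunities)
    p = sum(r * n for r, n in zip(rates, opportunities)) / total
    observed_var = pvariance(rates)
    # Average binomial variance of an observed rate over n opportunities.
    random_var = mean([p * (1 - p) / n for n in opportunities])
    # Floor at zero: noise alone can exceed the observed spread.
    talent_var = max(observed_var - random_var - other_var, 0.0)
    return talent_var ** 0.5


# Hypothetical observed BABIPs and balls in play for four pitchers.
rates = [0.285, 0.310, 0.295, 0.305]
bip = [6000, 5500, 7000, 6500]
print(estimate_talent_sd(rates, bip))
```

If the result comes back as zero, the sample cannot separate talent from noise at that number of opportunities, which is not the same as showing there is no talent.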

For BABIP, the spread in talent is roughly 1 SD = .010.  (It could be 1 SD = .007, but it’s going to be somewhere in this range.)

Of course, it depends WHO is in the sample: is it all pitchers, is it all starters, is it just the 30 best starters, etc.
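
As a rough worked example of what a talent spread of this size implies, assume binomial noise only (no park or defense variation) and a league BABIP of about .300, which is an assumption not stated in this thread. With talent SD $\sigma_T$, the expected correlation $r$ between two samples of $n$ balls in play each, and the $n$ at which $r = 1/2$ (talent and luck contributing equally), are:

```latex
r = \frac{\sigma_T^2}{\sigma_T^2 + \dfrac{p(1-p)}{n}}
\qquad\Longrightarrow\qquad
n_{1/2} = \frac{p(1-p)}{\sigma_T^2}

\sigma_T = .010:\quad n_{1/2} = \frac{(.300)(.700)}{(.010)^2} = \frac{.21}{.0001} = 2100

\sigma_T = .007:\quad n_{1/2} = \frac{.21}{.000049} \approx 4286
```

So under these simplifying assumptions it takes on the order of 2,000 to 4,000 balls in play before a pitcher's observed BABIP is half talent and half noise.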


#9    Matthew Cornwell    2011/08/14 (Sun) @ 20:03

Anytime someone out in “common baseball-fan world” asks questions or is confused about BABIP, I always direct them to the following two nuggets of gold:

“Career DIPS Numbers” from Tom and “Understanding DIPS” from MGL.

I bet these two selections can answer a large majority of BABIP questions people have had and will continue to have regarding spread of true BABIP talent, regression, and sample size. 

Personally, because of articles like those, my mind is mostly at ease with the questions of “Do pitchers have influence over BABIP?” and “How much can pitchers influence BABIP?” I feel like most of my questions come in the “how do pitchers...” and “why do pitchers...” realm.

Bojan’s article from above (and chart especially) is one of many that I can look at to answer those questions.

