THE BOOK--Playing The Percentages In Baseball


Monday, January 09, 2012

Sharp and soft contacted balls

By Tangotiger, 11:47 AM

Derek shows that while the rate of “sharpies” and “softies” is very consistent year-to-year, it’s virtually meaningless in its relationship to hits and outs.

This is a perfect example of why it’s important to remember why we do year-to-year correlations: we don’t care about how much something is real, but rather how much that thing is related to the thing we care about.  So, while it’s very interesting to learn that sharpies and softies are consistent (or at least consistently recorded), it’s ultimately useless. 

Side note to Derek: there’s no reason to remove SF and E from the denominator.
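The point about year-to-year correlations can be illustrated with a toy simulation. This is only a sketch on made-up numbers, not Derek's data or method: `label_tendency`, the spreads, and the player count are all hypothetical. It assumes each player carries a persistent tendency to be labeled "sharp" that has nothing to do with his hit rate, then compares how well the label rate predicts itself with how well it predicts next year's BABIP.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_players=2000, seed=0):
    """Hypothetical league: the 'sharp' label rate is persistent but unrelated to BABIP."""
    rng = random.Random(seed)
    sharp_y1, sharp_y2, babip_y2 = [], [], []
    for _ in range(n_players):
        # Assumption: a stable per-player labeling tendency (made-up scale).
        label_tendency = rng.gauss(0.05, 0.03)
        sharp_y1.append(label_tendency + rng.gauss(0, 0.01))
        sharp_y2.append(label_tendency + rng.gauss(0, 0.01))
        # Assumption: next year's BABIP is drawn independently of the label rate.
        babip_y2.append(rng.gauss(0.300, 0.025))
    return sharp_y1, sharp_y2, babip_y2


sharp_y1, sharp_y2, babip_y2 = simulate()
r_self = pearson(sharp_y1, sharp_y2)      # consistency of the label itself
r_outcome = pearson(sharp_y1, babip_y2)   # relationship to what we care about
print(f"sharp rate, year 1 vs year 2: r = {r_self:.2f}")
print(f"sharp rate year 1 vs BABIP year 2: r = {r_outcome:.2f}")
```

Under these assumptions the first correlation comes out high and the second sits near zero, which is the "consistent but useless" case described above.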


#1    Derek Carty      (see all posts) 2012/01/09 (Mon) @ 13:40

Thanks, Tom, and you’re totally right about SF and E.  I was trying to replicate AB-K-HR without thinking that, for this, it makes much more sense to include SF and E.  Wrote this after a long plane ride.


#2    Tangotiger      (see all posts) 2012/01/09 (Mon) @ 13:46

Obviously, overall it doesn’t matter.  It just stood out.

Great piece nonetheless.


#3    Derek Carty      (see all posts) 2012/01/09 (Mon) @ 13:49

Thanks, Tom.  I really appreciate it.


#4    Matthew Cornwell      (see all posts) 2012/01/09 (Mon) @ 14:39

I am not a BP subscriber. Did you calculate the BABIP of the two groups?  If so, how did they fare? If you are allowed to say, of course.


#5    Tangotiger      (see all posts) 2012/01/09 (Mon) @ 14:47

He did calculate it.  But, given that he found almost no correlation, the results should be self-evident: almost no difference between the two groups.


#6    Matthew Cornwell      (see all posts) 2012/01/09 (Mon) @ 15:45

I have a few questions:

So if the hardness of contact is not one of the key contributors in BABIP, then what are the most important factors for those hitters who do routinely have consistently below or above average BABIP?  I am sure leg-speed is a factor.  I know that LD% vs. GB% vs. FB% is too, but I was always under the assumption that how hard the ball was hit factored into those rates as well.  Location of batted ball, i.e. opposite field or pulled?  I thought that there was a correlation between pulled vs. opposite field and how hard the ball is hit.

So what are the biggest factors?


#7    Tangotiger      (see all posts) 2012/01/09 (Mon) @ 15:48

To this:

“So if the hardness of contact is not one of the key contributors in BABIP”

Add this:

“… as recorded by the stringers.”

This is a HUGE difference from what you are saying.


#8          (see all posts) 2012/01/09 (Mon) @ 15:51

“So if the hardness of contact is not one of the key contributors in BABIP”

That’s not true.  Hardness of contact as denoted by sharp/soft labels from the MLB stringers is not predictive of future BABIP.  Hardness of contact as measured by HITf/x initial batted ball velocity is predictive of future BABIP.

I wrote a couple articles on that topic.  I think Derek linked at least one of them in his article.

http://www.baseballprospectus.com/article.php?articleid=15532
http://www.baseballprospectus.com/article.php?articleid=15562


#9    Matthew Cornwell      (see all posts) 2012/01/09 (Mon) @ 16:06

Got it.  Thank you.


#10    MGL      (see all posts) 2012/01/09 (Mon) @ 17:46

There are several issues here relating to the results that Derek got:

One, since the fractions of sharp and soft batted balls are so low, they are going to be somewhat fluky.  Imagine this:  Say you are only going to record soft batted balls.  And say that players who hit batted balls more softly do indeed have lower BABIP.  If I only record really soft batted balls, like ones that are really mishit or squibbed or cued, I am not really capturing the differences between players.  IOW, player A might have an average speed off the bat of 80 mph and player B might have 100, but both players might have 5% of their batted balls be “really soft” (squibs).  Same thing for really hard hit balls, to a lesser degree. So the thing that you want to try and correlate with BABIP is overall speed of batted balls, or at least expand your definition of hard and soft so as not to only include the “freakish” plays.  Again, all players may have the same number of “freakish plays” regardless of their overall speed of batted balls.

In fact, the players who hit the balls the hardest overall might have a lot of squibs because they swing and miss a lot.
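The threshold argument above can be sketched with a small simulation. Everything here is hypothetical: the 50 mph cutoff for "really soft", the 5% squib rate, and the speed distributions are invented for illustration and are not taken from stringer or HITf/x data. The idea is that two hitters whose well-struck balls differ by 20 mph can still show nearly the same share of "really soft" balls if squibs happen at the same rate for both.

```python
import random
from statistics import mean

SOFT_CUTOFF_MPH = 50.0   # hypothetical line below which a ball is labeled "really soft"
SQUIB_RATE = 0.05        # hypothetical share of mishits, identical for every hitter


def simulate_speeds(typical_mph, n_balls=20000, seed=0):
    """Batted-ball speeds: mostly normal around typical_mph, plus a few squibs."""
    rng = random.Random(seed)
    speeds = []
    for _ in range(n_balls):
        if rng.random() < SQUIB_RATE:
            # Mishit: speed has nothing to do with the hitter's usual contact.
            speeds.append(rng.uniform(20.0, 40.0))
        else:
            speeds.append(rng.gauss(typical_mph, 8.0))
    return speeds


def soft_fraction(speeds):
    """Share of batted balls that fall under the 'really soft' cutoff."""
    return sum(s < SOFT_CUTOFF_MPH for s in speeds) / len(speeds)


player_a = simulate_speeds(80.0, seed=1)    # softer hitter
player_b = simulate_speeds(100.0, seed=2)   # harder hitter

for name, speeds in (("A", player_a), ("B", player_b)):
    print(f"player {name}: mean speed {mean(speeds):.1f} mph, "
          f"really soft {soft_fraction(speeds):.1%}")
```

With these made-up settings the two "really soft" percentages land close together even though the average speeds are far apart, so a label that only captures the extreme tail cannot separate the two hitters.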

That, however, is not the main reason why correlating hard and soft batted ball percentage to BABIP is not going to get you anywhere.

Players that hit the ball harder have lots more HR and doubles (and triples).  Once you eliminate the HR, you are going to automatically reduce the BABIP.  If I consistently smash the ball such that 1 out of 4 of my batted balls is a HR, what is left in terms of BABIP after you remove my HR?

Also, who cares about BABIP?  We care about run value of the batted ball.  I’m pretty sure that hard and soft batted ball percentages correlate very well with run value.

In fact, the balls which have the highest chance of becoming a hit are actually the very soft ones.  So you have countervailing correlations when you regress soft and hard percentages on BABIP, when your hard and soft batted balls are such a small fraction of all batted balls.  Really hard balls fall for a hit quite often, although, again, once you eliminate the HR, you artificially reduce the BABIP (imagine this - every hard hit long fly ball and line drive that is caught at the warning track gets included in the denominator and those same balls that go over the fence are not included in either the numerator or denominator), and really soft balls also often go for a hit.

It is no wonder that there is little or no correlation!
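The effect of dropping home runs can be worked through with one invented stat line. This is a sketch under stated assumptions: the slugger below is hypothetical (one in four balls put in play or over the fence is a home run, as in the example above), BABIP uses the common (H - HR) / (AB - K - HR + SF) form, and batting average on contact is assumed here to be H / (AB - K + SF).

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: home runs leave numerator and denominator."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


def contact_average(hits, at_bats, strikeouts, sac_flies=0):
    """Batting average on contact (assumed definition): home runs stay in."""
    return hits / (at_bats - strikeouts + sac_flies)


# Hypothetical slugger: 400 batted balls, 100 of them home runs, 60 other hits.
line = {"at_bats": 500, "strikeouts": 100, "hits": 160, "home_runs": 100}

slugger_babip = babip(line["hits"], line["home_runs"],
                      line["at_bats"], line["strikeouts"])
slugger_contact = contact_average(line["hits"], line["at_bats"], line["strikeouts"])

print(f"BABIP: {slugger_babip:.3f}")                 # 60 / 300
print(f"Average on contact: {slugger_contact:.3f}")  # 160 / 400
```

For this made-up hitter, 60 of the 300 balls that stayed in the park were hits while 160 of all 400 batted balls were hits, so the same hard contact looks poor by BABIP and excellent once the home runs are counted.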


#11          (see all posts) 2012/01/09 (Mon) @ 18:17

Obviously it would be better to use batting average on contact (BACON) here than to use BABIP.

Mmmmmm, BACON!


#12          (see all posts) 2012/01/10 (Tue) @ 01:25

I’ll take the group in the sharpy leaders over the sharpy trailers any day.

Could not read the whole thing and maybe I am misreading this, but logically, harder hit balls would seem to go for hits more frequently than those which travel at lower speeds (e.g. FB vs LD).  Of course, speedsters like Ichiro do well with softly hit GB, but there are not many like him around.

Given a player who hits the ball consistently hard vs one who does not, I will always choose the player who hits it harder (all else being equal).  Opposite is true of pitchers.

When and if SOB data are publicly available to all, I expect this will be clear.


#13    MGL      (see all posts) 2012/01/10 (Tue) @ 02:05

pft, it is absolutely true that harder hit balls go for more AND BETTER hits than softly hit balls, if you include all or most balls in your soft and hard buckets. The problem with this analysis is that HR are not included in BABIP, and one of the benefits of harder hit balls is HR, of course.

The other problem is that the type of hit is not reflected in BABIP. One of the benefits of harder hit balls is that they go for extra base hits more often.

The third problem is that VERY soft balls go for base hits quite often, albeit mostly singles, and lots of infield singles.  So if your “softly hit balls” bucket is only the very softest, which is apparently the case with this data set, then sure, that soft bucket will show a high BABIP.

So very misleading results, as you point out…


#14    joe arthur      (see all posts) 2012/01/10 (Tue) @ 09:39

The “softly” and “sharply” labels are not in fact consistently inserted in the MLB.com play descriptions, so Derek’s analysis is based on very noisy data, and as Mike says, when aggregated it is not indicative of what actually happens on the field.

