THE BOOK--Playing The Percentages In Baseball

Friday, October 09, 2009

Clutch skill DOES exist

By Tangotiger, 09:29 AM

JC argues against:

Identifying clutch hitting is practical problem that requires a decision involving real costs. Should a team factor in clutch ability when choosing between free agents. Should it matter for the manager choosing among pinch hitters? Should a historically big-game pitcher start the playoff series over your regular season ace? Based on the available evidence, if I had to decide between Jeter or A-Rod it’s not even close: Alex Rodriguez is a far superior player to Derek Jeter, and that’s what is relevant.

Actually, clutch hitting is mostly something for fans to talk about among themselves.  It is NOT a practical problem that teams rely on.  To the extent that it is, if teams behave the way fans believe, then the tradeoff is .020 in wOBA.  That is similar to the tradeoff in the platoon advantage.  And when I asked Yankee fans who they wanted, they did choose Jeter over ARod.  And when I did this for all 30 teams, asking fans whether they prefer the better hitter or the clutchier hitter, they chose the clutchier hitter, whose wOBA was 20 points worse than the better hitter’s.  They were (partly) exonerated when I tracked their decision and the clutchier hitter ended up only 10 points worse than the better hitter.  That is, the clutchier hitters did in fact perform better than expected, but not well enough to overcome the gap in talent.  (That study did not show a statistically significant difference, even though it was based on nearly 2000 clutch PA.)

Anyway, as for actually finding a clutch skill, Andy did in fact find it, and the results are published in The Book.  On p.103:

Batters perform slightly differently when under pressure. About one in six players increases his inherent “OBP skill” by eight points or more in high-pressure situations; a comparable number of players decreases it by eight points or more.

But as Andy concludes later on p.108:

For all practical purposes, a player can be expected to hit equally well in the clutch as he would be expected to do in an ordinary situation.

And the reason is as Andy noted on the previous page:

...that normalizing factor of 7600 clutch plate appearances is simply too large to ever predict a specific player to have a significant clutch hitting skill. Put differently, the fact that one of three players performs at least .006 [wOBA] better or worse in the clutch doesn’t mean that we can tell which players have this skill, even when looking at several seasons’ worth of data.

So, the entire problem rests on the fact that the spread of hitting talent in MLB is so narrow to begin with.  Even though we have determined that clutch skill exists in that population of players, it is simply too hard to identify the specific players who have it for it to make any practical difference.
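To put a rough number on that, here is a minimal sketch of what a 7600-PA normalizing factor implies. It assumes the factor works as a regression weight of the form n / (n + 7600), which the quoted passage does not spell out; the function names and the +.030 observed split are hypothetical, chosen only for illustration.

```python
def regression_weight(clutch_pa, normalizing_pa=7600):
    """Share of an observed clutch split credited to skill.

    Assumed form: n / (n + k), with k = 7600 clutch PA as quoted above.
    """
    return clutch_pa / (clutch_pa + normalizing_pa)


def regressed_clutch_estimate(observed_diff, clutch_pa, normalizing_pa=7600):
    """Shrink an observed clutch-minus-normal difference toward zero."""
    return observed_diff * regression_weight(clutch_pa, normalizing_pa)


# Hypothetical player: hits .030 better in the clutch over 500 clutch PA.
weight = regression_weight(500)
estimate = regressed_clutch_estimate(0.030, 500)
print(f"weight = {weight:.3f}, regressed estimate = {estimate:+.4f}")
```

Under those assumptions, about 6% of the observed split survives, so even a large-looking clutch record shrinks to a couple of points. Only at 7600 clutch PA would half of it be kept.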

To conclude: yes, clutch skill exists.  No, it’s not that big a deal (at best, half as wide as than the platoon advantage).  Correct, teams should not rely on clutch skill in their decision-making process, other than as a tie-breaker.

Ass-slap: Repoz.


#1    Tom N.      (see all posts) 2009/10/09 (Fri) @ 11:04

I’m really glad you guys did the study on clutch hitting that you did, because I was always uncomfortable with previous methods of proving/disproving clutch ability.

I hear all the time that if clutch ability was a true skill, the same players would be clutch year over year. And people point to the low year-to-year correlations in “clutchness” as evidence that clutch hitting does not exist.

So, I did an experiment. Everyone agrees that getting on base is a true skill, and different players have different OBP ability. So, a few weeks ago I took it upon myself to look at two random months from this season (I think it was June and July) and see what the month-to-month correlation in OBP was. The correlation I got was less than .10 if I remember correctly (I don’t remember the exact numbers, and I can’t find the spreadsheet with the data).

I figured that however you define “clutch” plate appearances, it can’t reasonably account for more than 1/6th the total number of PAs in a season, or about a month’s worth of PAs. Thus, a low year-to-year correlation for clutchness only disproves the existence of clutch ability if the low month-to-month correlation of OBP disproves the existence of differences in OBP skill.
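That argument can be checked with a small simulation. This is only a sketch under stated assumptions: true OBP talent is drawn from a normal distribution with mean .330 and standard deviation .033, a "month" is 100 PA, and every PA is an independent coin flip. None of those numbers come from the experiment described above, and the function names are made up for the example.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_split_correlation(n_players, pa_per_sample,
                               mean_obp=0.330, talent_sd=0.033, seed=2009):
    """Correlate observed OBP across two independent samples per player.

    Every player has a real, fixed OBP talent (assumed normal), so any
    shortfall from r = 1 is pure sampling noise.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_players):
        true_obp = min(max(rng.gauss(mean_obp, talent_sd), 0.0), 1.0)
        for sample in (first, second):
            times_on_base = sum(rng.random() < true_obp
                                for _ in range(pa_per_sample))
            sample.append(times_on_base / pa_per_sample)
    return pearson_r(first, second)


r_month = simulate_split_correlation(2000, 100)   # about a month of PA
r_season = simulate_split_correlation(2000, 600)  # about a full season
print(f"100 PA per sample: r = {r_month:.2f}")
print(f"600 PA per sample: r = {r_season:.2f}")
```

The skill is real by construction in both runs, yet the 100-PA correlation comes out far lower than the 600-PA one. A low correlation in a small sample therefore says more about the sample size than about whether the skill exists.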

I thought that the methodology you guys used in The Book was way more analytically sound than previous studies.


#2    Tangotiger      (see all posts) 2009/10/09 (Fri) @ 11:34

The correlation is dependent on many things; two of the most important are the underlying population skill spread and the sample size.

Let’s look at OBP.  r=.50, when you have 200 PA in each of your “before-and-after” groups.  So, if you take say April/May and compare to June/July (or better, take 200 random PA and 200 other PA), and do a correlation of OBP, you will get r=.50 (more or less).  If you don’t get it for 2009, then do it for more years.

The shortcut equation becomes r = PA / (PA+200)

So, if all you have is 100 PA in each of your groups, then your correlation of OBP will be roughly r=.33.  If you have 22 PA, you will get r=.10.
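The shortcut is easy to tabulate. The sketch below reproduces the three figures above and also inverts the equation to get the PA needed for a target r; the function names are mine, and the 200 is the OBP constant given above.

```python
def split_half_r(pa, pa_to_add=200):
    """Shortcut correlation: r = PA / (PA + pa_to_add)."""
    return pa / (pa + pa_to_add)


def pa_needed(target_r, pa_to_add=200):
    """Invert the shortcut: PA required in each group to reach target_r."""
    return pa_to_add * target_r / (1 - target_r)


for pa in (200, 100, 22):
    print(f"{pa:>4} PA -> r = {split_half_r(pa):.2f}")

print(f"PA needed for r = .50: {pa_needed(0.50):.0f}")
```

Note that the PA needed for r=.50 is always equal to the constant itself, which is why the constant is a handy way to compare how hard different skills are to measure.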

For clutch, you need around 5000 (total PA, of which 500 will be clutch PA) in order to get r=.50. 

That’s the reason you can’t really find it, or make any decisions on it.  If you need 10 years of performance data to know clutch as much as two MONTHS of performance tells you about a player’s OBP talent, you can see why it’s pretty tough to rely on the numbers to figure out clutch.

Basically, throw the numbers out the window, for all practical purposes.  If you think you know clutch, only base it on your guts.  The problem is that managers and fans will look at the numbers, the very numbers that are clouding the issue.


#3    Tangotiger      (see all posts) 2009/10/09 (Fri) @ 11:47

Btw, BPro’s BBTN (Woolner/Silver) also did an excellent job of finding a correlation.  r was .33 or something.  But, they split a player’s entire career, so their PA total was huge, probably 3000 PA in each group.

So, again, this would be similar to getting an r=.50 if you had 6000 PA.

Clutch skill is there, it is detectable, it exists.  But, it’s hard to find for any individual, and it’s not actionable.

Bill James can call it “fog” and JC can say whatever he wants to say.  But, the reality is what I said.


#4    Pizza Cutter      (see all posts) 2009/10/09 (Fri) @ 14:29

My own contribution to the cause is linked off my name.  Basic findings: At a sample size of 5000 PA, I got a split-half correlation of about .65.  So, we can probably talk about clutch careers, but for practical purposes of picking who will be most clutch next year, the current WPA - WPA/LI definition is inadequate.  Now, maybe if we looked at clutch from a different vantage point, or with a new statistical formula, it might stabilize, but as of right now, the proper thing to say is that given our current conceptualizations, clutch hitting is not stable ("skill based") from year to year.


#5    Tangotiger      (see all posts) 2009/10/09 (Fri) @ 14:45

Pizza was marked for moderation.

***

His key table is this:

number of PA      N    split-half
        1000    869      .174
        2000    429      .304
        3000    186      .431
        4000     74      .489
        5000     20      .656

And if we use my handy-dandy shortcut, we get this as the “PA to add” to get that r.

number of PA      N    split-half    PA_to_add
        1000    869      0.174         4747
        2000    429      0.304         4579
        3000    186      0.431         3961
        4000     74      0.489         4180
        5000     20      0.656         2622

So, if you do:
r= PA / (PA + 4747), you get r=.174 when PA=1000.

As you can see, the value is pretty much around PA_to_add of 4000-5000 (depending which dataset of players you’d use).
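For anyone who wants to reproduce the PA_to_add column: solving r = PA / (PA + PA_to_add) for the constant gives PA_to_add = PA * (1 - r) / r. A minimal sketch using Pizza's five rows (the variable and function names are mine):

```python
# (number of PA, split-half r) from Pizza Cutter's table
pizza_rows = [(1000, 0.174), (2000, 0.304), (3000, 0.431),
              (4000, 0.489), (5000, 0.656)]


def pa_to_add(pa, r):
    """Solve r = PA / (PA + k) for k."""
    return pa * (1 - r) / r


implied = [round(pa_to_add(pa, r)) for pa, r in pizza_rows]
for (pa, r), k in zip(pizza_rows, implied):
    print(f"{pa:>5} PA  r = {r:.3f}  PA_to_add = {k}")
```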

With only 20 players in the 5000 PA camp, that .656 is hardly reliable.

If we make our regression equation as:
r= PA / (PA+4500) across the board, we get this level of r:

number of PA      N    split-half    estimated r
        1000    869      0.174         0.18
        2000    429      0.304         0.31
        3000    186      0.431         0.40
        4000     74      0.489         0.47
        5000     20      0.656         0.53

Compare Pizza’s empirical data from his given sample, to my quick estimate.  That pretty much nails it no?
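The comparison can be run directly. This sketch applies the single constant of 4500 to every row and prints the gap against the empirical split-half value; the names are mine, and no weighting by N is attempted.

```python
# (number of PA, N players, empirical split-half r) from Pizza Cutter's table
clutch_rows = [(1000, 869, 0.174), (2000, 429, 0.304), (3000, 186, 0.431),
               (4000, 74, 0.489), (5000, 20, 0.656)]


def estimated_r(pa, pa_to_add=4500):
    """Shortcut estimate with one constant for every row."""
    return pa / (pa + pa_to_add)


estimates = [round(estimated_r(pa), 2) for pa, _, _ in clutch_rows]
gaps = [abs(r - estimated_r(pa)) for pa, _, r in clutch_rows]

for (pa, n, r), est, gap in zip(clutch_rows, estimates, gaps):
    print(f"{pa:>5} PA  N = {n:>3}  empirical = {r:.3f}  "
          f"estimated = {est:.2f}  gap = {gap:.3f}")
```

The only row that misses by more than about .03 is the 5000 PA row, which rests on just 20 players.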

Considering that all such tests get 4000-7000 of league average PA to add, that’s pretty much the limit of what you’ll EVER find.  Pizza’s probably got the best correlation to date, vis-a-vis sample size used.  That is, his might be the most sensitive of the attempts so far.


#6    Guy      (see all posts) 2009/10/09 (Fri) @ 15:20

Pizza may get a higher correlation because he’s measuring performance across the full LI spectrum.  A hitter is considered clutch if he performs equally well in mid- and hi-LI PAs, but worse in low-LI PAs (if I understand the clutch definition).  Most clutch studies just compare hi-LI (or an alternative definition of clutch situation) to all other PAs, which of course is how most fans think about clutch (not “yay, my favorite player hit .190 in blowouts!”).

One question is whether this picks up some aspects of how pitchers approach different types of hitters, as well as any hitter talent that may exist.  For example, are power hitters more likely to face good pitchers, or platoon disadvantage, in high-LI PAs (compared to their low-LI PAs)?  This could be a problem for any clutch study that doesn’t control for pitcher handedness/quality, but may be more so with this approach. 

If Pizza’s study covers many years, that can paradoxically be a problem too:  over the last 3 decades clutch hitting has been in a secular decline (more use of relievers), so that will create some correlation among the split-samples based on a player’s era. 

I also wonder about LHH vs. RHH.  I would think LHHs might benefit some in the clutch because there’s often a baserunner on 1B, and because so many closers are RHP they may enjoy the platoon edge more often (then again, use of LOOGYs may offset that).  Probably not a big issue.


#7    MGL      (see all posts) 2009/10/09 (Fri) @ 15:33

I agree with Guy that you have some “non-clutch” factors which may contribute to “clutch-looking” data in these studies when you don’t control for platoon, handedness, and relievers.  So the likely clutch skill is probably less than all of these studies suggest.  Again, as Guy says…

Let’s say that there were no clutch skill.  In high leverage situations, managers will tend to bring in a same-side reliever against the better hitters only. That will make the better hitters look less clutch than the worse hitters, thus a “clutch skill” will be found when none exists.  Same thing for good lefty hitters. Managers will tend to bring in LOOGYs against them in high leverage situations, making them look even less clutch than good RH batters.


#8    MGL      (see all posts) 2009/10/09 (Fri) @ 15:35

Tango, did you think about that or control for that in any way in the Fan clutch experiment? 

IOW, if you choose the best hitters and the fans choose who they think are “clutch” players but not necessarily the best hitters, did the best hitters face more good relievers in high leverage situations and did they lose the platoon advantage more often in high leverage situations?  And are there more LH hitters in your team than in the fans’ team?


#9    Tangotiger      (see all posts) 2009/10/09 (Fri) @ 15:49

No, I did not control for the quality/handedness of opposing pitchers.


#10          (see all posts) 2009/10/09 (Fri) @ 16:07

If we can identify (after 7000 PA, or whatever), specific players as clutch or not clutch, how well do the conclusions observers jump to after one or two seasons line up with the eventual results?  Is there a tell that observers are keying on that can be seen if you are watching the approach rather than the outcomes?  Or is the reputation purely outcome driven?

If clutch is really a skill, how would we identify where that skill peaks in the 10 year plus window we use to identify it?

How long does it take for reputation to catch up with change?


#11    Terry      (see all posts) 2009/10/09 (Fri) @ 17:05

This is good stuff!!!!!!


#12          (see all posts) 2009/10/11 (Sun) @ 00:07

Tango, how did you come up with the .020 figure in the post, and what does it mean?  Is .020 the observed *belief* in the SD of clutchness?


#13    Tangotiger      (see all posts) 2009/10/11 (Sun) @ 00:48

Right, that is the belief, by the fans.  Fans behave in such a way that they will give up 20 points of wOBA, passing over the better hitter, in order to get the clutchier player at bat.  That’s the same tradeoff as the platoon advantage.

That’s the extent of how much clutch fans think there is.


#14    Matt Swartz      (see all posts) 2009/10/11 (Sun) @ 10:52

Discovering clutch hitting by looking at individual hitters is naturally going to be something that takes a lot of plate appearances to judge, and by the time we actually discover that someone has been clutch, he may not even be clutch anymore. 

I have thought for a while that certain *types* of hitters are more or less likely to be clutch, though.  I did an article a while back when I wrote for The Good Phight blog (http://www.thegoodphight.com/2009/1/29/741980/there-is-clutch-or-the-cas) in which I discussed a byproduct of the shift.

Lefty sluggers tend to get shifted against so that defenses can cut down on their BABIP.  With runners on base (which are naturally higher leverage situations), defenses are not able to do this as they need to hold on runners, be able to turn double plays, etc.  Thus, my theory was that lefty sluggers would have better BABIP with runners on base than righty sluggers.  I looked at the top 20 hitters in SLG with 3000 career PA or more (as of a year ago when I did the study), and split the group into 8 lefties and 12 righties.  They had similar HR and SO differences with runners on vs. bases empty, but the LHB had much higher BABIP.  The BABIP gap with runners on was only .009 for the 12 righties and .021 for the 8 lefties.  The t-stat on this difference was 2.20, which is pretty damn significant. 

To me, this seems like a far more practical way to determine clutchness.  You can wait until Ryan Howard and Prince Fielder are 40 to look at their clutch stats, or you can say that lefty sluggers rip a lot of hard hit balls between first and second base, and that will be relatively tougher to defend with a couple guys on and expect higher BABIP as a result. 

I’m sure there are many other contextual reasons that some people will have structural clutchness rather than other clutchness.  Lefties’ ability to hit with a larger hole on the right side in general with men on first can be considered a structural clutchness too.  Hitters who hit pitchers with hard fastballs (more typical of closers who pitch in higher leverage situations) will probably be clutch too.  That seems like a way to determine clutchness scientifically without waiting until someone has retired to retroactively crown them.


#15    Guy      (see all posts) 2009/10/11 (Sun) @ 12:30

Tango:  have you ever created a situational hitting metric, analogous to “clutch?” I guess it would be something like RE24 minus (RE24 divided by RE-LI)?  It might be interesting to examine how much of a repeatable skill situational hitting is.  It seems likely to me that most if not all clutch skill is really a situational hitting skill.  Perhaps that skill could be identified more effectively if looked at in isolation from the rest of clutch (though you probably still need very large samples).

