THE BOOK--Playing The Percentages In Baseball

Saturday, October 31, 2009

Some actual data and research on hot and cold pitchers (for one game), rather than just “opinion”

By , 07:27 PM

This is directly from The Book, but I will paraphrase:

We looked at all pitchers who retired the first 9 hitters faced.  In fact they struck out 29% of them.

You can’t get much better than that for being “on.” If you believe anything about a pitcher being “on or off” as a predictor for the rest of the game or an indicator as to whether you should take a pitcher out or leave him in, surely you HAVE to believe that on the average a pitcher who retires the first 9 batters with 29% K is “on,” is having a “good day,” has his “good stuff” for the most part, at least for the first 3 innings.

If you are going to complain about this sample, there is really nothing to discuss with you - you are going to complain about and reject a finger right in front of your face, if you know what I mean.

Anyway, surely a pitcher who retires the first 9 batters in a row with 29% K’s will pitch better than expected for the next 9 batters, considering the identities of the next 9 batters.

Well, that would be right.  He did pitch better, as any of you “pitchers are on and off on any given day” people would surely expect.
However, even though they pitched like Cy Young reincarnated for the first 3 innings, they pitched only 7 points of wOBA better for the next 9 batters - less than one SD.  IOW, the difference was statistically insignificant. Plus, we did not control for park, weather, and home/away. When a pitcher retires the first 9 batters in a row, the weather and park will tend to be pitcher-favorable and the pitcher will tend to be at home, so that 7 points all but disappears.

So there is simply zero evidence that a pitcher who pitches the very best that he possibly can, other than striking out everyone for 3 innings straight, is likely to pitch any better for the next 9 batters than he would on any random day. 

In fact, the evidence is exactly the opposite.  A pitcher who pitches as well as he possibly can through the first 9 batters will pitch exactly as he would on any random day from that point on, at least for the next 9 batters.

If we look at the third time through the order (after a pitcher retires the first 9 guys in order), we find the same small effect, which can also be explained away by weather, park, and home/road status, and is not statistically significant anyway - 6 points in wOBA.

What about the flip side?  The 6 runs and 6 hits in 4 innings that Greg references from King Felix. Yes, every manager in baseball takes their pitcher out after or before that because they ALL believe that how a pitcher is pitching, runs, hits and walks-wise, tells us how he is going to pitch at any point subsequent in the game and is ALWAYS used, along with pitch count, to determine when a pitcher comes out of the game.

Every person in the world believes that.

In The Book, we looked at 6000 PA of pitchers who got absolutely hammered to the tune of allowing a wOBA of .701 the first time through the order.  What about the second and third times?

We did find a much larger effect, again without considering park, weather and home/road.

We found an effect of almost 20 points in wOBA.

Unfortunately, I don’t know how much controlling for those other things might reduce that effect.  I wouldn’t imagine that it would be more than 5 or 10 points, but I’m not sure.

Interestingly, when we split those pitchers who got hammered the first time through the order into experienced and not-experienced (for their careers), we find that the experienced ones see an effect of 9 points (1.5 SD), and for the inexperienced ones it is 67 points (4.8 SD).  You can speculate any way you want as to why that might be.
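
To put those SD counts in more familiar terms, here is a minimal sketch that backs out the size of one SD from each quoted effect and converts the SD count to an approximate two-sided p-value. It assumes that "SD" means standard deviations of the sampling distribution of the difference and that a normal approximation is adequate; neither is confirmed by The Book. The function name two_sided_p is made up for this example.

```python
import math

def two_sided_p(z):
    """Two-sided tail probability for a standard normal z-score (assumed model)."""
    return math.erfc(abs(z) / math.sqrt(2))

# (label, effect in wOBA points, number of SDs), as quoted in the post
reported = [
    ("experienced, after getting hammered", 9, 1.5),
    ("inexperienced, after getting hammered", 67, 4.8),
]

rows = []
for label, points, z in reported:
    implied_sd = points / z  # wOBA points per one SD, backed out from the quote
    p_value = two_sided_p(z)
    rows.append((label, implied_sd, p_value))
    print(f"{label}: 1 SD is about {implied_sd:.1f} points, p is about {p_value:.2g}")
```

Under those assumptions, 1.5 SD is the kind of gap that shows up by chance fairly often, while 4.8 SD essentially never does. The much larger implied SD for the inexperienced group is what you would expect if that group had far fewer PA.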

So, what do we have so far?

If a pitcher is pitching fantastically the first time through the order, that has very little predictive value.  If he is getting absolutely hammered, it does, though mostly (and by a lot) for inexperienced pitchers.

That suggests of course that our pitchers that get hammered may be populated by a significant percentage of injured pitchers or, perhaps less likely, that it is hard to pitch above your normal level and a lot easier to pitch below it.  I favor mostly the former explanation, but there might be something to the latter one.

Now, here is one more scenario which is the one you see a lot and complain about one way or the other:

We know that typically a pitcher gets significantly worse each time through the order, which is why I am always advocating replacing most starters as early as possible (including the fact that short relievers are much better pitchers than starters in general).

But what about if that starter is starting to face the lineup the 4th time through the order, typically in the 7th or 8th inning or so, but he has pitched brilliantly, at least for the last 9 batters?  A similar example would be when Burnett allowed 4 runs in the first inning versus the Angels and then pitched very well the rest of the way, no runs and only a few hits and walks I think.

That is a typical scenario where a manager often leaves in his starter if his pitch count is not that high and eschews his relievers, even his good short relievers.  That is also a typical situation where a manager will be criticized if he takes his starter out when his pitch count is not that high.

Again, people, including the managers themselves, can have all the opinions they want and can muster, but they mean NOTHING without evidence.  Well, once again, here is some evidence:

In The Book, we looked at pitchers who were facing the order the 4th time.  They had pitched very well up until that point.  We used a measure of how hard a pitcher gets hit called Hammer Points, and these pitchers had at most -1 Hammer Points.  What that means is not so important.  These pitchers had been pitching very well thus far.

Based on who these pitchers were, we expected them to have a wOBA of .350 and their actual wOBA against was .321, 2.6 SD below expected.

But, remember that the 4th time through the order, we expect that a pitcher will be around 8 points worse than his overall wOBA allowed or around .25 runs in ERA.  So our gain of 29 points above for a pitcher who has been pitching well, is only a net gain of 20 points or so, which is still a lot - around .70 in ERA.

So, we might expect a pitcher who has been pitching well to be .70 runs in ERA better the 4th time through the order than he would be on just a random day.

Again, I’d like to re-do these studies adjusting for park, weather, and home/road status, but until I or someone else does, I will go on record as saying that I have been wrong in declaring indiscriminately that most starters should come out as early as possible when they have been pitching well.

The data so far seems to suggest this:

When pitchers start out pitching fantastically the first time through the order, it means nothing.

When pitchers get absolutely hammered the first time through the order, they continue to pitch badly at least the second time through the order. This effect seems to be muted for experienced pitchers and greatly enhanced for inexperienced ones.

When pitchers pitch well the first 3 times through the order, they appear to continue to pitch well the 4th time through the order.

Keep in mind that all these numbers are obviously based on pitchers who were allowed to stay in the game past a certain point, whether they were pitching well or not. They are a selectively sampled group, so the numbers and conclusions may not necessarily apply to a pitcher who has not yet been left in or taken out by his manager.


#1    Sunny Mehta      (see all posts) 2009/10/31 (Sat) @ 23:18

fantastic post


#2    MGL      (see all posts) 2009/11/01 (Sun) @ 00:06

Thank you!  We aim to please!


#3    Greg Rybarczyk      (see all posts) 2009/11/01 (Sun) @ 01:55

Interesting…

What were the criteria for selecting the events to include in the 6000 PA’s that constituted the “got hammered” set? 

I was thinking if you could find the 50/50 point in wOBA (regardless of number of PA’s, i.e. not always looking at the first time through the order), where half of the pitchers were taken out and half were left in, that would be a good way to divide up the data.  Then you could look at the half that stayed in, and see if the rest of their game was better or worse than that dividing line wOBA…

Anyway, it is interesting to see that there is apparently an asymmetry here, in that superior performance is not predictive, but subpar performance is (or possibly is?)

One last thing, I’m wondering why you figure that controlling for things like park, weather and home/road would reduce the magnitude of the observed difference?  Can you give a quick rundown of why you think that would be the case?


#4    MGL      (see all posts) 2009/11/01 (Sun) @ 01:34

Tango did the research, so he’d have to answer most of those questions.

As I say in the post, pitchers who have pitched well at any point that you look at in a game will tend to have pitched in “pitcher’s conditions” (cold weather, pitcher’s parks and at home).  So they will also tend to pitch better than expected (if you are not controlling for those things) in any other time period during the same game.

I am actually not sure that Tango did not control for these things. He doesn’t say he did in The Book so I assume that he didn’t, but it is important to do so.  How important, I am not sure. It could end up being only 1 or 2 points in wOBA, which would make it not that important.

Just like you have to control for the opponents, which he did of course.  Any group of pitchers that pitch well, for example, the first time through the order will have tended to have pitched against weak opponents, so if you look at the second time through the order, they will look like they pitched well again, if you don’t control for the opponent.

Any time you look at players who performed well or poorly in one “split” and you want to look at how they performed in another “split”, you have to control for every external (outside of the player’s ability) factor that can bias performance otherwise you are always going to get a positive correlation having nothing to do with the player per se.
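
The bias described above can be illustrated with a toy simulation, offered as a rough sketch and not as a reconstruction of the study in The Book. Every constant is hypothetical, and wOBA over a stretch of batters is crudely modeled as a baseline plus normal noise. The pitcher here has no "on" or "off" days at all; the only thing the two splits share is the game's context (park, weather, home/road).

```python
import random
import statistics

random.seed(2009)  # fixed seed so the run is repeatable

# All three constants are hypothetical, chosen only for illustration.
BASELINE_WOBA = 0.340   # expected wOBA against, given pitcher and opponents
CONTEXT_SD = 0.020      # game-level park/weather/home effect, shared by both splits
NOISE_SD = 0.150        # random variation in wOBA over roughly 9 batters

def simulate_game():
    """Return (first-time wOBA, second-time wOBA) for one start.

    There is no carryover of "stuff" in this model: the two splits
    are linked only through the shared game context.
    """
    context = random.gauss(0, CONTEXT_SD)
    first = BASELINE_WOBA + context + random.gauss(0, NOISE_SD)
    second = BASELINE_WOBA + context + random.gauss(0, NOISE_SD)
    return first, second

games = [simulate_game() for _ in range(400_000)]

# Treat the best 5% of first-time-through results as the "on" group.
cutoff = sorted(first for first, _ in games)[len(games) // 20]
hot = [(first, second) for first, second in games if first <= cutoff]

all_second = statistics.fmean(second for _, second in games)
hot_second = statistics.fmean(second for _, second in hot)
gap_points = (all_second - hot_second) * 1000  # positive = "on" group looks better

print(f"second time through, all games:  {all_second:.4f}")
print(f"second time through, 'on' games: {hot_second:.4f}")
print(f"apparent carryover: {gap_points:.1f} points of wOBA")
```

With these made-up settings the "on" group comes out a few points better the second time through, even though nothing about the pitcher carried over. How large the real bias is depends entirely on how much the context actually varies from game to game, which this sketch does not try to estimate.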


#5          (see all posts) 2009/11/01 (Sun) @ 14:21

This is what I’m talking about. You are asking and answering two different questions.

You ask if a pitcher is pitching well, will he continue to pitch well? Then you plug in the wrong data. The data you are using will answer, if a pitcher is lucky, will he continue to get lucky? We all know the answer to the second question, but nobody seems to know the answer to the first.

Everybody on this website and some managers know whether or not a pitcher gets nine up and nine down has very little to do with how well he’s pitching. Some managers might not, and that might lead them to making some bad decisions. In that case it is their misread of the previous data that causes them to make the mistake. They are using the wrong input.

As for this study, I would much rather see it done in xFIP or QERA or anything that limits the luck factor, because you have to adjust for that as well as weather/opponent/etc. If you use a luck adjusted stat, your subset of pitchers will have a much higher percentage of pitchers that are actually pitching well. From the looks of it, even more importantly, you will have a much smaller percentage of pitchers who are pitching poorly.


#6    J. Cross      (see all posts) 2009/11/01 (Sun) @ 14:44

This is excellent stuff but #5 is right too.  I might go one further (b/c over such a short time frame xFIP is mostly luck too) and say that we should ask questions like - If a pitcher has had above average velocity (for him) is he likely to continue to have above average velocity?  If a pitcher has had above average movement is he likely to continue to have above average movement?


#7    Ian      (see all posts) 2009/11/01 (Sun) @ 15:40

Re: #5 “If you use a luck adjusted stat, your subset of pitchers will have a much higher percentage of pitchers that are actually pitching well”

Are you suggesting that xFIP or QERA should be used to determine which pitchers are pitching well?  You don’t think that pitchers who have retired the first 9 batters with 29% K rates will have good xFIPs?

This post is obviously not targeted at people who have already read and believe the studies in the book, since it’s basically a summary of a study in the book. 

The pitch f/x stuff would be interesting, if a little time consuming.  ‘Good’ pitch movement vs. ‘Bad’ pitch movement would be especially tough to define without manually reviewing each pitcher.  Velocity would be easier, but is there evidence that increased velocity helps?


#8    J. Cross      (see all posts) 2009/11/01 (Sun) @ 16:14

“Velocity would be easier, but is there evidence that increased velocity helps?”

There’s evidence that guys who throw harder have more success (link).  Not exactly the same, I know.


#9    MGL      (see all posts) 2009/11/01 (Sun) @ 16:26

Yes, of course ideally you want to look at pitch f/x data or something like that.  But at the same time, there is the question of whether a pitcher who is pitching well “to the naked eye” will continue to pitch well (or vice versa).  That is a legitimate question since that is what fans and many managers think.  You don’t get too many managers who look at a pitcher who has given up 3 hits, 1 walk, and zero runs in 5 innings and say, “I’m taking him out.  Despite those numbers, he’s not really pitching well.” Not too many at all…


#10    J. Cross      (see all posts) 2009/11/01 (Sun) @ 16:46

True, the manager can’t (and maybe even the catcher can’t) see the difference between a good movement/velocity day for a pitcher and an average one.  The catcher can definitely tell whether a pitcher is hitting the glove and a manager can definitely see if he’s falling behind in the count, getting swings and misses or giving up hard line drives and deep fly balls.  So, they do have more info than ERA/wOPS provides.  Do they use it?  That would be an interesting study in and of itself.

btw, in addition to controlling for weather, opponent and park, you’d have to control for home plate umpire, I think.


#11    Jamesian      (see all posts) 2009/11/01 (Sun) @ 16:56

Parts of this post come close to my curiosity about the 100-pitch limit now used almost across the board.

Is it your opinion that managers might be helping their team with this strategy, even if inadvertently, by jerking a pitcher throwing well out after the sixth inning when they seem to be sailing along versus letting them pitch another inning? Or even, God forbid, a complete game?

I think the 100-pitch limit is primarily used to limit bad arms but could it also be an improvement versus pitching a guy until the manager begins to think he is getting hit or tiring?

Basically I’m wondering if you think the use of the 100-pitch limit is a step forward in terms of strategy or if the approach of the older days was better? This obviously depends on the quality of the bullpen and the manager’s decision making. But all things being equal, what’s your take based on the numbers like these?


#12    Jack Klompus      (see all posts) 2009/11/01 (Sun) @ 17:04

Great stuff. The outlook for pitchers who get initially hammered might be even bleaker than your numbers suggest: the manager might have had some reason for leaving in those starting pitchers who got hammered first time through the order. That is, the data presumably necessarily exclude pitchers who got hammered immediately and were taken out thereafter; these pitchers might have performed even more terribly in later innings than did their counterparts who were left in. Of course, perhaps managers have no (statistically significant) idea what they’re doing in this regard. I get that feeling when I look at most of them. I virtually know it when I hear Charlie Manuel, Dusty Baker, etc. But perhaps worth taking into account.


#13    shawndgoldman      (see all posts) 2009/11/01 (Sun) @ 17:21

Great post!

I’m confused about this part:
But, remember that the 4th time through the order, we expect that a pitcher will be around 8 points worse than his overall wOBA allowed or around .25 runs in ERA.  So our gain of 29 points above for a pitcher who has been pitching well, is only a net gain of 20 points or so, which is still a lot - around .70 in ERA.

If a pitcher typically does worse the 4th time through the order, doesn’t that make the improvement the 4th time through the order even more impressive? Maybe I’m missing something here, but it seems to me the effect should be 37 points of wOBA or so, which is an ERA difference over 1.
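
The arithmetic in this comment can be checked with a few lines, shown here as a minimal sketch. The .350, .321, and 8-point figures come from the post. The linear wOBA-to-ERA conversion is an assumption, extrapolated from the post's own "8 points is around .25 runs" ratio.

```python
expected_overall = 0.350     # expected wOBA against for these pitchers (from the post)
actual_fourth = 0.321        # their actual wOBA against, 4th time through (from the post)
fourth_time_penalty = 0.008  # typical decline the 4th time through (from the post)

# Assumed: ERA scales linearly with wOBA at the post's ratio of .25 runs per 8 points.
era_per_point = 0.25 / 8

# Gain measured against the pitcher's overall expectation.
raw_gain = round((expected_overall - actual_fourth) * 1000)

# The fair baseline for the 4th time through is WORSE than the overall
# expectation, so the penalty is added to the baseline, not subtracted
# from the gain.
expected_fourth = expected_overall + fourth_time_penalty
net_gain = round((expected_fourth - actual_fourth) * 1000)

era_gain = net_gain * era_per_point

print(f"gain vs. overall expectation:  {raw_gain} points")
print(f"gain vs. 4th-time expectation: {net_gain} points")
print(f"approximate ERA equivalent:    {era_gain:.2f} runs")
```

Measured against the 4th-time baseline of .358, the gain is 37 points, which at the assumed conversion rate is a bit over a run in ERA.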


#14    rwperu34      (see all posts) 2009/11/01 (Sun) @ 17:55

#7, As a group, of course they are going to have better xFIP than normal, however, you will still have plenty in there that don’t. By using a luck adjusted stat, you will definitely have a higher percentage that are pitching better than normal within your subset and likely increase the sample size as well.

#8, All that shows is that most managers don’t know who is playing well and who isn’t. No need to argue that one. To use their faulty decision making in a study to determine if someone who is playing better/worse continues to play better/worse is just as bad as them making the decision in the first place.


#15    rwperu34      (see all posts) 2009/11/01 (Sun) @ 18:12

A couple of other questions that pop up for me.

If inexperienced pitchers are more likely to pitch below their true talent level the second time through the order after getting shelled, they must pitch better than normal when they don’t get shelled. How much can we carry that from start to start? Is there a similar effect with inexperienced hitters?


#16    MGL      (see all posts) 2009/11/01 (Sun) @ 18:49

Good questions and comments.

Jack, yes, that is why I said we have a selective sample.  Obviously when a manager leaves in a pitcher who was getting shelled, he has some reason to have left him in.  Unfortunately, we can only look at pitchers who were left in and not the ones who were taken out.

#14, yes there is the issue of symmetry.  If a young pitcher pitches really badly after getting shelled, he must pitch exceptionally well after pitching well.

Same problem with Tango finding that 9 batters of great pitching does not have predictive value but 9 batters of getting shelled does.  There is no symmetry there, which doesn’t make much sense. There has to be symmetry.  IOW, if you find that one subset of performance is better (or worse) than average, the remaining subset HAS to be worse (or better) than average. That is what I call the “law of symmetry.”

shawn, yup, I definitely blew that one.  37 points, not 20, which is over a run in ERA!

I’d like to do some research over again to answer some of these questions.

Sometimes when you are looking at sample data and your samples are not all that large, it is difficult to answer questions like these with any certainty. As well, if you attempt to ask and answer too many questions, you will likely end up making Type I and II errors.
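
The “law of symmetry” mentioned in this comment is just the weighted-average identity: if the whole group averages X and one subset is on one side of X, the rest must sit on the other side. A minimal sketch, with a made-up helper name (remainder_mean) and made-up numbers:

```python
def remainder_mean(overall, subset_mean, subset_share):
    """Mean of everything outside the subset, given the overall mean.

    Rearranges: overall = share * subset_mean + (1 - share) * remainder.
    subset_share is the fraction of all PA that fall in the subset.
    """
    if not 0 < subset_share < 1:
        raise ValueError("subset_share must be strictly between 0 and 1")
    return (overall - subset_share * subset_mean) / (1 - subset_share)

# Hypothetical example: a group that averages .340 overall, where the 10% of
# PA that follow a shelling come in at .360 (20 points worse than average).
overall = 0.340
after_shelling = 0.360
rest = remainder_mean(overall, after_shelling, 0.10)

print(f"remaining 90% of PA: {rest:.4f}")
print(f"offset from average: {(rest - overall) * 1000:+.1f} points")
```

The offsets balance only after weighting by playing time. A small subset that is far from average pulls the large remainder just a little the other way, so the mirror-image effect can be real and still be too small to show up in a modest sample.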


#17    rwperu34      (see all posts) 2009/11/01 (Sun) @ 21:32

Another question about pitchers. How did the experienced guys do when they were inexperienced compared to after they were experienced? In other words, is being able to regroup after getting shelled a skill that is learned with experience or is it something that does not improve once they get to The Show?


#18          (see all posts) 2009/11/04 (Wed) @ 12:55

"including the fact that short relievers are much better pitchers than starters in general”

I was under the impression that starters as a group were much better pitchers than relievers?  If short relievers are better than starters, why not make them starters to get them more IP?


#19    Colin Wyers      (see all posts) 2009/11/04 (Wed) @ 13:21

To try and put this as clearly as possible:

The average starter has a higher ERA/RA/FIP/whatever than the average relief pitcher - when each is used in those roles.

The average starter, used as a relief pitcher, would perform better than the average reliever.

The average reliever, used as a starter, would perform worse than the average starter.


#20    MGL      (see all posts) 2009/11/04 (Wed) @ 13:24

To further Colin’s comments, that really only applies to “short relief.” When pitchers are used in long relief, they are about as good as when they start.


#21          (see all posts) 2009/11/04 (Wed) @ 13:44

Thanks for the clarification, I just misinterpreted the meaning of the statement.  Do you think a more optimal use of pitchers might involve using a lot of pitchers in short stints rather than a starter in a long stint with the bullpen to finish it up in short relief (at least as a replacement for the back end of the starting rotation)?


#22    jsolid      (see all posts) 2009/11/04 (Wed) @ 13:53

@colin#19/mlg#20
bit of a tangent, but…
is the difference between a short reliever and a starter completely attributed to the fact that a short reliever never sees a batter more than once (in a game)? in which case, you could say that the reliever’s statistics/total performance is better, but only due to a selection bias. (see also: LOOGY)


#23    MGL      (see all posts) 2009/11/04 (Wed) @ 14:05

jsolid, it is presumably that, fatigue, and the fact that they can go all out, knowing that they are only going to pitch 1-2 innings.  For example, you see a difference in fastball speed, although I don’t know what that average difference is - have to ask one of the pitch f/x guys.

“Thanks for the clarification, I just misinterpreted the meaning of the statement.  Do you think a more optimal use of pitchers might involve using a lot of pitchers in short stints rather than a starter in a long stint with the bullpen to finish it up in short relief (at least as a replacement for the back end of the starting rotation)?”

Yes, exactly.


#24    Tangotiger      (see all posts) 2009/11/04 (Wed) @ 15:03

jsolid: see The Book, plus a recent thread, I think in October.

