THE BOOK--Playing The Percentages In Baseball


Tuesday, November 29, 2011

How often is an above-average pitcher’s best days behind him?

By , 02:23 AM

On XM radio’s MLB channel the other day, the talking heads were discussing Papelbon (right after his signing).  One of them asked the question, “Do you think that Papelbon’s best days are behind him?”

Here’s a news flash, and an important statistical concept for you newbies:

For any overall pitching stat, like ERA, FIP, or ERC (component ERA), if a pitcher has been above average for that stat in the past, his better days are always behind him, regardless of his age or experience, assuming that we know nothing else about him.  Of course when I say “always,” I mean that our projection for him going forward is always going to be worse than his past performance, using a weighted average of his last 3 years (say, 1, 2, 3 weights) to represent his past performance.

If you don’t believe me, I challenge you to give me any parameters that you think would defy that proclamation, and we’ll test it using historical data. 


#1    Sang      (see all posts) 2011/11/29 (Tue) @ 08:18

I disagree Tom. Think about a projection system that will look at a twenty-year-old who managed to go slightly below average in the majors in his first full season. Tell me PECOTA won’t love him.


#2    DSMok1      (see all posts) 2011/11/29 (Tue) @ 08:41

@ #1
There have been a lot of discussions here about pitcher aging curves; in general, it appears that they are mostly downhill.

And MGL’s point is that regression to the mean will probably outweigh even the slight positive aging early in the career.


#3          (see all posts) 2011/11/29 (Tue) @ 09:03

I mean that our projection for him going forward is always going to be worse than his past performance

Well if that’s how your projection system works, then of course you’re right.

Isn’t the question whether or not his *actual* performance next year is better or worse than the weighted average of his prior seasons?  (and by “his”, I mean “a meaningful number of pitchers who meet some criteria”).

I’ll propose the following, to get the ball rolling:

Before age 23:
200+ IP
K/9 > 8.0
BB/9 > 4.0

The idea being, young guys with good stuff who have a high (but not catastrophically high) walk rate.  Hoping they might be able to curtail the walks a bit and become more effective.

I was originally thinking of a much younger criteria, envisioning guys like Gooden and Felix and Feller, who come into the league so young that they’d be likely to be well before their prime.  But those guys in particular (and perhaps any young pitcher) had what I’d consider lucky first seasons, so of course they hit a “sophomore slump”.  If they weren’t so amazing in their first season, they might have been sent down for a little while to mature.


#4    Peter Jensen      (see all posts) 2011/11/29 (Tue) @ 10:02

MGL - I am sure that your projection system is correct that the majority of pitchers that are above average at any age will project to have lower performance at year plus one than the current year projection.  You have mentioned this before, but as Mike points out above that is not exactly what you are implying in this post.  And if it is, then isn’t that a tautology?  Aren’t all players whose true talent is projected as more than slightly above average for year one expected to perform less well in year plus one because of regression to the mean?

My understanding of what you are saying above is that it will never be correct to project an above average pitcher higher in any future year than his estimated true talent in the current year.  I don’t think that is true, as many above average pitchers have had runs of 3 or more future years at a higher level of performance, which should result in higher projections.  Greinke and Halladay seem to meet that criterion among current pitchers, Koufax and Ryan among past pitchers.


#5    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 10:27

My understanding of what you are saying above is that it will never be correct to project an above average pitcher higher in any future year than his estimated true talent in the current year.

That’s not quite what he is saying. 

If you have his estimated true talent in the current year, say Strasburg or Kershaw, then their future forecasts (i.e., estimated true talent in future years) WILL be higher than in the current year.  That’s a given.

What MGL is saying is that given the observed OUTCOMES in the current year (outcomes that are better than average mind you), then your estimated true talent in future years will always be worse than his outcome in the current year (for outcomes that are better than average).

For example, say you have a young pitcher who we observed at a(n effective) .600 win%.  That is, translate his runs allowed into a win%.  Could be anyone… Bumgarner, Pineda, whoever.  What’s your forecast for this pitcher in the next year?

Well, MGL is implying that first you have to regress that observation into a true talent level.  Say, just for illustration purposes, you regress that down to a .550 win%.  Now, since it’s a young pitcher, we expect the following year for him to be better.  Say, .570 win%.

Hence, MGL’s proclamation: our estimate of a young player’s true talent level in the next year is going to be worse than the observed outcomes in the current year.

MGL also added a disclaimer that we are limited to observed outcomes (FIP, ERA, etc), and that we can’t add tools (fastball speed, “hitting his spots”, number of pitches he commands, etc).

(All numbers for illustrative purposes only.)
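The regress-then-age arithmetic in the illustration above can be sketched in a few lines of Python. This is only a sketch: the function name, the `keep` fraction (how much of the gap from .500 survives regression), and the aging bump are assumptions picked to reproduce the illustrative .600, .550, .570 sequence. They are not estimates from real data.

```python
LEAGUE_AVERAGE = 0.500  # league-average win%


def forecast_win_pct(observed, keep, aging_delta):
    """Regress an observed win% toward league average, then apply aging.

    keep        -- fraction of the gap from average retained after regression
                   (illustrative assumption, not an estimate)
    aging_delta -- expected change in true talent by next year
                   (positive for an improving young pitcher)
    """
    true_talent = LEAGUE_AVERAGE + keep * (observed - LEAGUE_AVERAGE)
    return true_talent + aging_delta


# Young pitcher observed at .600, regressed to .550, then aged up to .570.
projection = forecast_win_pct(0.600, keep=0.5, aging_delta=0.020)
print(f"{projection:.3f}")  # 0.570
```

Even with a positive aging term, the forecast lands below the .600 that was observed, which is the claim being made.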

Anyway, to beat MGL is quite simple: take a young pitcher that pitched a lot but that is barely above .500.  Say you have a .505 pitcher.  You regress him down to say .502 or .503, but then you age him forward to .515 or something like that.

Boom, you win.

Though, presumably, MGL is talking about guys that at least clear some hurdle of above average.  Probably .525 would be the point where regression and aging would cancel out for all pitchers.  Something like that.
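For what it's worth, the break-even point can be worked out under a simple linear model. If the forecast is `average + keep * (observed - average) + aging`, it equals the observed value when `observed - average = aging / (1 - keep)`. The sketch below assumes `keep = 0.5` and an aging bump of .0125, which are the values implied by the illustrative .505, .5025, .515 example; both are assumptions, not measured quantities.

```python
def break_even_win_pct(keep, aging_delta, league_average=0.500):
    """Observed win% at which regression and aging exactly cancel.

    Assumes a linear model: forecast = avg + keep * (obs - avg) + aging_delta.
    Pitchers observed above this level are forecast to decline.
    """
    return league_average + aging_delta / (1.0 - keep)


def forecast(observed, keep, aging_delta, league_average=0.500):
    """Regress toward league average, then add the aging bump."""
    return league_average + keep * (observed - league_average) + aging_delta


threshold = break_even_win_pct(keep=0.5, aging_delta=0.0125)
print(f"{threshold:.3f}")  # 0.525

# The .505 pitcher sits below the threshold, so his forecast beats his outcome.
print(forecast(0.505, 0.5, 0.0125) > 0.505)  # True
# A .550 pitcher sits above it, so his forecast is worse than his outcome.
print(forecast(0.550, 0.5, 0.0125) < 0.550)  # True
```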

Anyway, it’s a good example that he brings forth, and worthy of an aspiring saberist to tackle and research.


#6    BrianK      (see all posts) 2011/11/29 (Tue) @ 10:28

I suppose the corollary is true...if a pitcher has been below average for a stat in the past, his best days are ahead, according to your projection system.

True?


#7    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 10:51

Brian: definitely true for a young player, but possibly not true for a guy in his mid to late 30s.

That’s because at that point, the aging curve starts to become steep downhill.  So, for example, a guy who is observed to be .480 might regress to .490, but when you apply aging, he might be at .470.

MGL’s claim wouldn’t be true if you happen to have a 16-yr old in MLB as a .530 pitcher.  He’d regress to .515, but then he’d age to .540 as a 17-yr old.  It just so happens that among the players we are looking at pre-peak, there’s not that many ages, and so, the hill is not as steep.  But for post-peak, there’s a huge number of late 30s players still around, and the hill is pretty steep at that point.

It depends where you are in the aging curve:


#8    Matt Swartz      (see all posts) 2011/11/29 (Tue) @ 11:16

They aren’t talking about FIP on XM Radio. They are talking about their own interpretation of performance. Papelbon had a 1.58 SIERA and 1.53 FIP and a 2.16 xFIP last year, but a 2.94 ERA. Maybe his projection should be north of 2.94, but you can imagine that a pitcher with a significantly higher ERA than peripherals in one season is probably likely to improve by their standards if they’re just focusing on SV/BS and ERA. And when a scout sees good “stuff” and says the pitcher will improve, that often overlaps with our pitching metrics besting ERA. Other times our pitching metrics don’t reveal something, and I’m confident that sometimes a good scout can see what is likely to change for an above-average pitcher to improve.

You’re focused on whether a projection should show improvement or not, and they could even be talking about changing talent level. If they are talking to scouts, that may be exactly what they are envisioning. It’s not the same thing you’re arguing.


#9    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 11:28

Matt, excellent point.

Matt is talking about being allowed to look at all available outcome stats, when forecasting a player’s true talent level.

MGL on the other hand is focused on using a particular outcome stat when forecasting a player’s true talent level.

So, it is a bit disingenuous of MGL, based on what the talking heads were actually referring to.  But, it’s accurate in and of itself.

For example, to support Matt, a guy happens to give up a high BABIP and performs much worse with men on base than bases empty.  This will lead to a very high ERA.  We wouldn’t even need his ERA to forecast his ERA.  So his future ERA will be much better than his observed ERA.


#10    Peter Jensen      (see all posts) 2011/11/29 (Tue) @ 11:33

Anyway, to beat MGL is quite simple: take a young pitcher that pitched a lot but that is barely above .500.  Say you have a .505 pitcher.  You regress him down to say .502 or .503, but then you age him forward to .515 or something like that.

Hasn’t MGL claimed in a past thread that pitchers’ aging pattern is always a decline, even for young pitchers?  I thought this was part of what he was implying above.


#11    MGL      (see all posts) 2011/11/29 (Tue) @ 11:38

I am talking about actual outcome, as Tango said.  It doesn’t matter what my projection says.

You can’t know any peripherals.  Only the overall stat in question (ERA, FIP, ERC, etc.).

And yes, I’m sure you can find some group that is slightly above average who will beat that going forward, so, as Tango said, I should have mentioned a threshold.  But, in fairness to me, the threshold does not have to be much better than average.

So, someone prove me wrong! Give us some parameters and we’ll look at ACTUAL outcomes.  Your parameters can be age, role, experience, IP.  Just not K, BB, or HR rate, BABIP, pitch speed, repertoire, injury history, park, etc.


#12    MGL      (see all posts) 2011/11/29 (Tue) @ 11:48

"Hasn’t MGL claimed in a past thread that pitchers aging pattern is always a decline, even for young pitchers?  I thought this was part of what he was implying above.”

Yes I have, and that is part of the claim, as you say…


#13    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 11:52

Hasn’t MGL claimed in a past thread that pitchers’ aging pattern is always a decline, even for young pitchers?  I thought this was part of what he was implying above.

Yes, but I’m showing you a situation where MGL can be proved wrong.

So, if someone wants to take MGL up on his challenge, I’m giving you a narrow set of parameters where you can actually (possibly) beat him.

Try this:
- must be no older than 23
- must have had at least 20 starts
- his ERA must be barely better than league average (in B-R.com, say his ERA+ is between 101 and 105).

The GS parameter is important as well, since giving a pitcher that young that many starts is also a “scouting” parameter.  It’s kind of cheating, but, what the heck, that’s how you beat a forecaster at his own game!

Take Roger Clemens’ rookie year for example.  He had 20 starts, he turned 22 in August, and his ERA+ juuuuuust missed the cut at 97 (though with a bigger park factor, you can sell that he was 101).

You don’t give a pitcher that many starts at that young an age unless you believe he’s an above average pitcher.

A recent example is Pineda (103 ERA+, 28 starts, 22 years old).
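The selection rule proposed above (no older than 23, at least 20 starts, ERA+ between 101 and 105) could be expressed as a simple filter. The `Season` record and the `in_challenge_group` function are hypothetical names for illustration, and the only data shown are the two pitcher-seasons quoted in this comment. Clemens' age is entered as 22 as an assumption, since he turned 22 during that season.

```python
from dataclasses import dataclass


@dataclass
class Season:
    name: str
    age: int
    starts: int
    era_plus: int


def in_challenge_group(s, max_age=23, min_starts=20, era_plus_range=(101, 105)):
    """True if a pitcher-season is young, heavily used, and barely above average."""
    low, high = era_plus_range
    return s.age <= max_age and s.starts >= min_starts and low <= s.era_plus <= high


seasons = [
    # Age 22 is an assumption: he turned 22 in August of his rookie year.
    Season("Clemens (rookie year)", age=22, starts=20, era_plus=97),
    Season("Pineda", age=22, starts=28, era_plus=103),
]

selected = [s.name for s in seasons if in_challenge_group(s)]
print(selected)  # ['Pineda']
```

Clemens drops out only because of the 97 ERA+, which matches the "just missed the cut" remark.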

Anyway, if you have to take that narrow set of parameters in order to barely beat MGL, I think MGL made his point.


#14    MGL      (see all posts) 2011/11/29 (Tue) @ 11:53

"So, it is a bit disingenuous of MGL, based on what the talking heads were actually referring to.”

I was merely using the XM radio thing to make a point (I don’t know or care what “they” were referring to), but I also think it applies to exactly what they were talking about.  Papelbon has a 2.33 career ERA in the AL.  When someone is that much better than average, you can use any peripherals you want, and they are still going to be worse going forward…


#15    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 12:07

Sure, but what if they are talking about just one year. 

Let’s say the talking head says this: “Listen, I love that this guy saved 30 games, and I know he had a dreadful ERA for a closer, but his best days are still ahead of him.”

Say we look at these parameters:

- reliever is a bit better than league average in terms of ERA in year T
- reliever has at least 30 saves in year T

What’s your forecast for his ERA in year T+1?

I just described Papelbon in 2010.  Now, is he representative, or an anomaly of the population of pitchers that meet my criteria?

I’m guessing he’s fairly representative, in that I’m “cheating” by including his saves totals.  Clearly, the manager must think he’s great, and, if he’s given the job of closer, he must have had some good history as well, not only in ERA but his peripherals too.


#16    Zach Randolph      (see all posts) 2011/11/29 (Tue) @ 14:07

What about if you have a pitcher who has improved over the course of his career? Say an ERA+ of 95, 100, 105. If you used a 3,2,1 weighting scale you would have a projection of 101.7 before any aging or regression. The next year your seasons to weight would be 100, 105, 101.7 giving you 102.5. This could change when you factor in aging and regression but I doubt it.
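The two weighted averages quoted here check out if the most recent season gets the weight of 3. A quick sketch (the helper name is just for illustration, and no aging or regression is applied, as in the comment):

```python
def weighted_average(values, weights):
    """Weighted mean of values; values and weights are listed oldest to newest."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


weights = [1, 2, 3]  # newest season weighted most heavily

first = weighted_average([95, 100, 105], weights)
second = weighted_average([100, 105, first], weights)

print(round(first, 1), round(second, 1))  # 101.7 102.5
```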


#17    MGL      (see all posts) 2011/11/29 (Tue) @ 14:42

Zach, the whole point is aging and regression. As I said, pick some parameters and we’ll check the historical records…


#18    Zach Randolph      (see all posts) 2011/11/29 (Tue) @ 14:59

Ah, ok. I think I see what you’re saying. So it basically just comes down to what Tango said. Pick a group where the aging effect will be equal or greater than the regression component.


#19    Guy      (see all posts) 2011/11/29 (Tue) @ 15:09

I suppose the corollary is true...if a pitcher has been below average for a stat in the past, his best days are ahead, according to your projection system. True?

Brian: definitely true for a young player, but possibly not true for a guy in his mid to late 30s.

No, this is not true.  Or rather, it’s only true if you already knew that a pitcher will continue to be allowed to pitch in the future.  Many below-average pitchers will subsequently be demoted, sent to the pen, or retire.


#20    MGL      (see all posts) 2011/11/29 (Tue) @ 20:09

"Pick a group where the aging effect will be equal or greater than the regression component.”

My past research has suggested that there is no age group where pitchers improve…


#21    Tangotiger      (see all posts) 2011/11/29 (Tue) @ 21:15

Before MGL is quoted out of context: he surely means no age group where his talent level is expected to exceed his past outcome level, where his past outcome level is better than average.

***

In any case, if someone wants to bet MGL, I’m suggesting there IS an age/outcome group that WILL exceed its past outcome level: the Pineda and Derek Jeter group of players, rookie full time players who were at 100-105 of league average.

I can’t promise you will win, but someone bet $20 or $100, and let’s get this party started.


#22    MGL      (see all posts) 2011/11/29 (Tue) @ 22:22

Pitchers only Tango!  And I’m not taking any bets on this one....


#23    MGL      (see all posts) 2011/11/29 (Tue) @ 22:25

Also, you can’t weight the pitchers in year x+1 for obvious reasons. The ones that turn out to be bad or the scouts know are bad will get little playing time.  Even the exclusion of the ones that get no playing time in year x+1 might be enough to show improvement in the entire group.


#24    MGL      (see all posts) 2011/11/29 (Tue) @ 22:32

Here’s how the selective sampling issue can throw a monkey wrench in the testing procedure. Say you choose all rookie pitchers with only 10 IP in year 1 and an ERA+ of 105.  Presumably their true talent as a group is a lot worse than that 105. Say there are 10 of them and half have a true ERA+ of 110 and half 80 (for an average of 95 of course).  Now what if the teams know which half is the 80 and they are not allowed to pitch at all in year x+1?  We have problems....
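The hypothetical above is easy to put into numbers. The sketch below uses only the made-up figures from this comment (ten pitchers, half with a true ERA+ of 110 and half with 80, a group ERA+ of 105 in the first year) and assumes, for simplicity, that the survivors perform exactly at their true talent in year x+1.

```python
def average(values):
    return sum(values) / len(values)


observed_first_year = 105           # group ERA+ over a handful of innings
true_talent = [110] * 5 + [80] * 5  # the hypothetical split from the comment

# Teams know who the 80s are and never let them pitch in year x+1,
# so only the 110s show up in the follow-up sample.
survivors = [t for t in true_talent if t > 100]

print(f"{average(true_talent):.0f}")  # 95
print(f"{average(survivors):.0f}")    # 110
```

The group's true talent (95) is worse than what was observed (105), yet the pitchers who actually appear in year x+1 average 110, so the sample seems to have improved.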


#25    Tangotiger      (see all posts) 2011/11/30 (Wed) @ 00:30

I dunno MGL… I think you can win the war here by accepting to possibly lose the battle.

If you can accept a challenge like Pineda/Clemens, then you can rightfully say to what extremes the conditions have to be for you to be proved wrong, and barely at that.

That vindicates your entire position really.


#26    MGL      (see all posts) 2011/11/30 (Wed) @ 02:08

Heck, I’ll accept a wager, as long as the parameters of the test are quite clear beforehand.  As I stated, things like the weights for the year x+1 (and year x) performance can bias/influence the results.

The question we are asking of course, is always in real time.  If we have this pitcher (group of pitchers for testing purposes) and all we know about him are age, experience (and perhaps IP in year x), and some measure of his overall performance, what is his most likely (mean) result in that same overall performance in the very next year?

To test that, in order to ensure a large enough sample size to reduce sample error to something that we can all live with, we have to find a large group of similar pitchers in the past.

Unfortunately, not all of them will pitch in the subsequent year, which is a problem.  We also have to determine the weights to use for year x and year x+1 to figure the mean of both years.  That can be problematic too because of selection bias…


#27    Tangotiger      (see all posts) 2011/11/30 (Wed) @ 10:48

Here’s what I’ll propose:

1. Select pitchers in year T that had at least x number of starts, and y percentage of their games as starters.

2. In year T+1, he has to meet a threshold of between x/2 and 1.5x, and y-.10 to 100%.

3. You can only select on the stat you are going to measure. (Meaning, select on ERA in year T and see results of ERA in year T+1.  Or, FIP in T and FIP in T+1, etc.  I would exclude walks as a possible metric to use, though.)

So, if MGL says x=20 and y=.90, then that means the threshold in year T is 20 starts and 90% of games are starts, and in year T+1, it’s 10-30 starts and 80% to 100% of games as starts.
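As a rough sketch, the year T and year T+1 thresholds above could be coded like this. The function and argument names are hypothetical, and the defaults are just the x=20, y=.90 example; the actual parameters would be whatever MGL sets.

```python
def qualifies(starts_t, start_share_t, starts_t1, start_share_t1, x=20, y=0.90):
    """Check a pitcher against the proposed playing-time thresholds.

    starts_*      -- games started in year T and year T+1
    start_share_* -- fraction of games that were starts (0.0 to 1.0)
    x, y          -- minimum starts and minimum start share in year T
    """
    in_year_t = starts_t >= x and start_share_t >= y
    # Year T+1: between x/2 and 1.5x starts, and a start share of at least y - .10.
    in_year_t1 = (x / 2 <= starts_t1 <= 1.5 * x) and (start_share_t1 >= y - 0.10)
    return in_year_t and in_year_t1


print(qualifies(25, 1.00, 12, 0.85))  # True: full-time starter both years
print(qualifies(25, 1.00, 5, 1.00))   # False: too few starts in year T+1
print(qualifies(25, 1.00, 15, 0.40))  # False: mostly a reliever in year T+1
```

Capping year T+1 starts at 1.5x follows the proposal as written; with x=20 that excludes pitchers who make more than 30 starts the following year.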

We do have to be fair that we don’t get too many relievers in here, since that’s going to tremendously bias the results.

SOMETHING like that.

The researcher can then use age to see if there’s any age group in which he can prove MGL wrong. And a talent level group as well.  Obviously, the minimum talent level must be above league average.  So, you can choose say ERA+ of 101 to 110 for example.

I suppose we need a minimum number of pitchers in the study.  Say you need at least z=50 pitchers.  Obviously, the lower the number, the more chance you have of luck overwhelming the signal.

So, we just need MGL to set the parameters of this bet, and then all you youngsters will have the privilege to prove MGL wrong.  Heck, how about we even create a badge that says: “I proved MGL wrong!” and you can use that as bragging rights for a year.  Anytime MGL questions you, you can throw it back in his face!

Sounds like a fun competition to me…


#28    Peter Jensen      (see all posts) 2011/11/30 (Wed) @ 11:16

Tango - It may be an interesting competition but it is getting pretty far afield from what MGL claimed in his original post.

For any overall pitching stat, like ERA, FIP, or ERC (component ERA), if a pitcher has been above average for that stat in the past, his better days are always behind him, regardless of his age or experience, assuming that we know nothing else about him.

That statement to me means that if you take any single above average pitcher’s performance for the past 3 years (x-1, x-2, x-3), weighted as he mentions in the next sentence (“using a weighted average of his last 3 years (say, 1, 2, 3 weights) to represent his past performance”), you will never project that player in ANY future year to be better than that past performance, not just in year x.  Now maybe MGL didn’t mean his statements to be interpreted in this way.  But he hasn’t explicitly stated that he only meant the projection for year x.  In fact, his response in post #12 supports my interpretation. My contention is that not every above average pitcher has a continuous downward progression of projections.  Sometimes, like the 4 pitchers that I named in post #4, a pitcher will put together 3 consecutive future years that are superior to the weighted average of x-1, x-2, and x-3 and would therefore cause you to project him in say x+6 to be better than he was in year x, hence his best days would not be all behind him as MGL claimed above.


#29    Tangotiger      (see all posts) 2011/11/30 (Wed) @ 12:00

Well, if that’s the case, then MGL sounds wrong.  I mean, a guy has a(n effective) win% observed of .510, we regress to .505, and he’s 23 years old, so our forecast for him will be .515 at age 23, .525 at age 24, .530 at age 25, .535 at age 26, .540 at age 27.

Is MGL arguing against that?  I hope not, and I would guess not as well.

Now, is he setting the threshold higher, such that we’re observing him at a(n effective) win% of .550, which we regress to .525, and he’s 25, not 23, so that his peak forecast is going to be .550?  Well, then he’s on much more solid footing there.

So, we’re going to have to wait to see what kind of qualifications MGL wants to put on his statement.  I’m guessing he didn’t intend to mean Pineda/Clemens as rookies, that it’s guys more established than that.


#30    Peter Jensen      (see all posts) 2011/11/30 (Wed) @ 12:49

I mean, a guy has a(n effective) win% observed of .510, we regress to .505, and he’s 23 years old, so our forecast for him will be .515 at age 23, .525 at age 24, .530 at age 25, .535 at age 26, .540 at age 27.

MGL’s argument is that a pitcher’s true talent decreases with age so he would not be forecasting the increases that you are stating unless, of course, the pitcher’s performance is actually getting better each year.


#31    MGL      (see all posts) 2011/11/30 (Wed) @ 13:09

My argument is predicated upon the fact that I don’t think that pitchers get better with age or experience regardless of the initial age or experience.

I can probably be proved wrong on two fronts. One, in my previous research I may not have looked at very young pitchers and two, survivor bias in that you only get to look historically at those pitchers who actually had a year x+1 which probably eliminates at least a few pitchers whom the scouts think were lucky in year x.

If anyone wants to get their hands dirty I would set the performance threshold at at least 10% better than league average.  After all, if a talking head is speaking of “better days” surely he’s not talking about a pitcher who is 1% better than average.

