THE BOOK--Playing The Percentages In Baseball

Monday, October 20, 2008

Panic mode in Red Sox Nation?

By Tangotiger, 10:17 AM

Are Red Sox fans in danger of becoming as hard to please as Yankees fans? If you have a team that was expected to be one of the best teams in the league, played like one of the best teams in the league, and was a hair away from playing in the World Series, then panic, wholesale changes, and throwing hundreds of millions of dollars at the “problem” are not the answer. (A problem that would not even exist if they had won one more game.) This is the same kind of story I used to hear every day for a few years: how the Yankees needed to shore up their bullpen because they were weak in the 7th inning. Imagine that. Some teams don’t even have a quality ace, or more than five quality position players, and Yankees fans used to think their problem was that the 23rd best player on the team wasn’t good enough! Talk about a jaded perspective. All this means is that we will always have fans of 29 teams complaining about something, all saying they need big changes, not realizing that all 30 teams are in a similar position.

Ideally, you would like for your team to have a true talent level of around 86 wins. That should be the goal for any team. If you have a team that is good enough to win, on average, 86 games, then you ought to be in an extremely good mood. If you have a team that has a true talent level of 76 wins, then you can beg the team to spend big bucks for at least one huge player, if not two. Otherwise, be prepared to be miserable, and plan for hope. Teams that are average should go out and spend the big bucks for one huge player.

So, Red Sox fans: list the teams from 1 to 30 in terms of talent level, and figure out where your team ranks. Then be glad that you are not in the position of the 25 or more teams below you.


#1    MGL      (see all posts) 2008/10/20 (Mon) @ 17:20

In my article in the upcoming THT Annual, I give a rough win/loss projection for each team in 2009.  Boston has the best, followed by the Yankees.

I don’t think that the Boston organization, or even their fans, think that they need any changes whatsoever for next year, other than perhaps at catcher if they don’t re-sign Varitek. The Yankees, on the other hand, who the heck knows what the fans and the organization think. They are both nuts.

The Mets are a picture perfect example of what Tango is talking about. They had the best team in the NL this year (yes, that is correct kids, a team that didn’t even make it into the playoffs can be the best team - what is this world coming to?) and I have them projected to be the best team next year as well, with roughly the same players.  However, most fans think that the team needs major re-structuring of players and the management may even think the same.  Ridiculous.

And BTW, the area of a team that, by far and away, should be the least of a team’s concerns, but isn’t, is the bullpen, for a variety of reasons. One, much of the bullpen does, or at least should, pitch in low leverage situations. In fact, I have always advocated having a couple of terrible relievers (not intentionally of course) to pitch in the lowest leverage situations, on average. Two, most relievers are fungible. Three, there is so much random fluctuation in bullpen performance (simply because you are talking about around 500 IP split up among 10 or 20 pitchers), that much of what people think they know about bullpens (whether they are good or bad) is noise.

Here are some interesting numbers.  The difference between the best and worst bullpens in baseball in true talent is probably around 1 run per 9.  Again, that is the absolute best versus the absolute worst.  If we exclude the closers, we are left with around 430 IP of relief work with average leverage of around .9.  That is a difference of 39 runs or less than 4 wins.  So the difference between the absolute best and worst bullpens in baseball, not including the closers, is less than 4 wins.  Which also means the worst bullpens in baseball cost their teams less than 2 wins a season!  Who is going to believe that?  How about that for the headline in tomorrow’s sports section?  “If you put together the worst bullpen in baseball, you only cost your team 2 wins!” (Story to follow.)


#2    birtelcom      (see all posts) 2008/10/20 (Mon) @ 19:36

Of course the Mets have fallen short of the postseason the last two years by one game. So arguing to a Mets fan that he or she shouldn’t worry about the pre-closer bullpen because it’s probably not worth more than a couple of games is not going to get very far.  Plus, the Mets in fact have no closer at all in 2009, at the moment, with Wagner out the entire upcoming season.

I do get, however, that the small samples for bullpen pitchers make it very difficult to predict who the heck is actually going to be an improvement over the current group. The scariest thing about wanting to turn over the bullpen is having so little to go on to pick who is and isn’t worth signing/trading for (or letting go).

I also get that fans will often focus on demanding improvements in small weaknesses, sometimes forgetting that the absence of infinite resources means filling small needs can risk opening bigger holes elsewhere. The Mets clearly would have been better this season with stronger performances from 2B, LF, Pedro’s spot in the rotation and the bullpen. But as you point out, every team has holes, always, and it takes care to decide when filling them puts at risk the powerful parts of the team, of which the Mets have many.


#3    Rally      (see all posts) 2008/10/20 (Mon) @ 22:21

“If you put together the worst bullpen in baseball, you only cost your team 2 wins!”

A lot of people probably won’t get this.  Two wins is of course just the average.  If you put together the worst true talent bullpen, you might get lucky and have them appear average.  Or you might get unlucky and wind up with the Baltimore Orioles.


#4          (see all posts) 2008/10/21 (Tue) @ 00:55

“The Yankees, on the other hand, who the heck knows what the fans and the organization think. They are both nuts.”

Hey!

Actually, you’re probably right.


#5    MGL      (see all posts) 2008/10/21 (Tue) @ 01:14

Yeah, and if we knew for a fact the spread in talent between the worst and best pens in baseball were 2 wins in either direction, it is almost guaranteed that some team or teams would be horrendous in performance and some team or teams would be lights out because of the fluctuations in bullpen performance.  That gives the illusion that pens are so important.
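A toy simulation gives a feel for how much spread pure luck can produce. The sketch below is an illustration under stated assumptions, not anything taken from the study that follows: every one of 30 bullpens is given the same true talent (a 4.30 ERA), each throws 450 innings, and runs per inning are drawn from a Poisson distribution. Real run scoring is lumpier than Poisson, so this probably understates the noise.

```python
import math
import random

def poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_pen_eras(teams=30, innings=450, true_era=4.30, seed=2008):
    """Observed ERA for each team when every pen has identical true talent.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    runs_per_inning = true_era / 9
    eras = []
    for _ in range(teams):
        runs = sum(poisson(rng, runs_per_inning) for _ in range(innings))
        eras.append(9 * runs / innings)
    return eras

eras = simulate_pen_eras()
spread = max(eras) - min(eras)
print(f"best {min(eras):.2f}, worst {max(eras):.2f}, spread {spread:.2f}")
```

Even with zero difference in talent, the best and worst simulated pens end up well apart, which is the illusion described above.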

#2, it is not that it is so hard to project pen performance, true talent-wise.  It is not.  It is that there is so much random fluc in performance for 400 some innings among 15 or 20 or so pitchers.  I could easily go back over the last 20 years and project which pens would be poor, average, or good, and would nail it right on the money (in the aggregate).  Easily.

There really is no such thing as a terrible pen.  There just aren’t that many bad relievers.

Check this out:

I looked at all bullpen performance (including closers), from 1990 to 2000.  I looked at first half versus second half performance.  It is not necessarily the same personnel in the first half as in the second, and I did not control for that, but I think that we can safely assume that a team’s bullpen is usually comprised of around the same pitchers in the first half as in the second half, or at least that there should not be too much of a change in true talent from the first half to the second half for most teams.

In the first half, I put all teams (330 team seasons worth) whose pen had an ERA of less than 3.4 in one bucket.  3.4 to less than 3.9 in the second bucket.  3.9 to 4.3 in the 3rd bucket. 4.30 to 4.67 in the fourth.  4.67 to 5.2 in the 5th and 5.20 or greater in the 6th.

Basically we are looking at the best to worst bullpen performances in the first half of the season.  Of course, again, assuming that personnel and IP for each pitcher remain around the same, we can easily check the true talent of each of those collective pens.  That is their second half performance of course.

(Important point for people doing or reading baseball research:  A random selection of a player’s performance is always an unbiased estimate of his true talent.  If you have a bunch of players so that the sample size is pretty large, you will get a nice and pretty reliable measure of those players’ collective true talent.)

Anyway, I looked at that second half performance to get an indication of each of those buckets’ true talent.  Here is what I got:

Group       1st half IP   2nd half IP   1st half ERA   2nd half ERA

Group I          9741         11005          3.10           3.89
Group II        11331         12584          3.65           4.14
Group III       10618         11492          4.09           4.13
Group IV        10479         10748          4.47           4.30
Group V         11830         12836          4.88           4.51
Group VI        11351         11600          5.72           4.59

The difference between the best and worst teams in the first half was 2.61 runs in ERA.  Their true talent only differed by .7 runs when the smoke cleared.  That is an enormous difference.  Now, granted part of that regression could be that the teams with bad bullpens got better personnel and the teams with good ones lost players to injury and had to replace them with worse relievers, but I think that if we controlled for personnel we would get very similar results.
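For anyone who wants to check the arithmetic, here is a small sketch that works directly from the rounded figures in the table. It computes the innings-weighted second-half ERA across all six groups and how much of the first-half gap between Group I and Group VI disappeared in the second half. The variable names are mine, and because the inputs are rounded the results can differ slightly in the last digit from numbers computed on the raw data.

```python
# (group, 1st-half IP, 2nd-half IP, 1st-half ERA, 2nd-half ERA), from the table
groups = [
    ("I",   9741,  11005, 3.10, 3.89),
    ("II",  11331, 12584, 3.65, 4.14),
    ("III", 10618, 11492, 4.09, 4.13),
    ("IV",  10479, 10748, 4.47, 4.30),
    ("V",   11830, 12836, 4.88, 4.51),
    ("VI",  11351, 11600, 5.72, 4.59),
]

# Innings-weighted second-half ERA across all groups
ip_2nd = sum(g[2] for g in groups)
mean_2nd = sum(g[2] * g[4] for g in groups) / ip_2nd

# Gap between the worst and best buckets in each half
spread_1st = groups[-1][3] - groups[0][3]
spread_2nd = groups[-1][4] - groups[0][4]

# Share of the first-half gap that vanished in the second half
regression = 1 - spread_2nd / spread_1st

print(f"2nd-half mean ERA: {mean_2nd:.2f}")
print(f"spread: {spread_1st:.2f} -> {spread_2nd:.2f}")
print(f"regression toward the mean: {regression:.0%}")
```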

Imagine telling the fans of a team with a bullpen that allows a 5.72 ERA in the first half, “Don’t worry, if you do absolutely nothing, that pen will regress to a slightly below average pen.” They would think you were crazy.

The mean in the second half is around 4.26.  So it looks like teams are regressing around 70% toward the mean from the first to second half.  That would imply that over a full season, a team’s pen should regress around 54% toward the mean.  So take a team’s pen performance and regress it 54% to get an estimate of its true talent.  That would imply that if in a typical year, the best pen or two were 2 runs and change better than the worst pen or two, that the true talent would indeed be only a run or so different.
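The step from 70% for a half season to about 54% for a full season can be reproduced with the standard Spearman-Brown formula for doubling a sample. That is my assumption about how the conversion works; the comment does not name a method. If a half season has reliability r (here 1 - 0.70 = 0.30), a sample twice as long has reliability 2r / (1 + r).

```python
def full_season_regression(half_season_regression):
    """Regression toward the mean for a sample twice as long.

    Uses the Spearman-Brown formula for a doubled sample; applying it
    here is an assumption, not something stated in the thread.
    """
    r_half = 1 - half_season_regression
    r_full = 2 * r_half / (1 + r_half)
    return 1 - r_full

full = full_season_regression(0.70)
print(f"full-season regression: {full:.1%}")

# Hypothetical observed gap of 2.2 runs ("2 runs and change")
observed_gap = 2.2
print(f"implied true-talent gap: {observed_gap * (1 - full):.2f} runs")
```

With those inputs the implied true-talent gap comes out at about one run, in line with the estimate above.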


#6    Aaron      (see all posts) 2008/10/21 (Tue) @ 01:38

I certainly agree with MGL’s belief that the bullpen is not an area where a team should spend big bucks to improve. In fact, I wouldn’t be surprised if there is an inverse relationship between money spent on relievers and bullpen performance. However, I think his calculations are misleading and understate the real impact a bullpen has on a team’s record.

First, I don’t see any reason to be excluding the closer from the analysis. Bad bullpens usually have bad closers and vice versa.

Second, since the leverage of the situations relievers find themselves in varies enormously, I don’t think determining their impact on win expectancy is simply a matter of summing their aggregate talent. Let me give an example to make my point.

Say you have two teams that each have bullpens with combined true talents of a 4.00 ERA. However, one team’s relievers all are 4.00 ERA talents while the other one has pitchers whose true talents range from 2.00 to 5.50. If the team with a dominant closer and bad long-man used its personnel wisely, wouldn’t it have a much greater positive impact on win expectancy than the balanced team? The reason being that the difference between a 4.00 guy and a 5.50 guy in a blow out is trivial, but the difference between a 2.00 guy and a 4.00 pitcher in a 1 run ballgame in the 9th is a big deal, especially over the course of a season.

Looking at FanGraphs’ WPA seems to suggest that the impact that bullpens have on winning is far greater than the basic numbers suggest:

Top 5
Rays: 9.30
Phillies: 8.37
Yankees: 8.33
Angels: 6.74
Astros: 5.54

Bottom 5
Mariners: -5.60
Indians: -5.11
Orioles: -4.70
Tigers: -4.35
Mets: -2.55

If I’m interpreting that right, there was a 15 win difference between the best and worst bullpens in 2008, which would mean a 7 or 8 win difference between the worst and average. Yes, part of that is the highly volatile nature of individual pitchers, but the asymmetrical distribution of those individual performances (which is mostly a conscious decision by the manager) also plays a huge part.


#7          (see all posts) 2008/10/21 (Tue) @ 08:16

I’m a big Mets fan. I too believe the Mets had the best talent in the NL. After the last two years, though, of watching countless names from the bullpen come out and not get the job done, I want to see changes.

Some people are insane, though. If you listen to WFAN, you hear some crazy talk like trading a David Wright or a Jose Reyes.

I don’t know. I’m just not so sure I buy the 2 wins argument. It might not be much more than that, but we only missed the playoffs by one game.


#8    Tangotiger      (see all posts) 2008/10/21 (Tue) @ 10:52

Should the Mets go after some new relievers like Heath Bell and Dan Wheeler and Dave Weathers?


#9    MGL      (see all posts) 2008/10/21 (Tue) @ 12:50

Aaron, #6, my numbers in that little “study” did include closers.

While yes it is true that a team can and does leverage their pen, you are way overstating the impact of that.  My basic point still stands.

Looking at WPA will give you enormous fluctuations in “performance impact.” If you guaranteed that the best pens were only 2 wins better than the worst pens, as I said, there might be a 4 win difference in ERA, but a 6 or 8 win difference in WPA (or maybe more - I don’t know).  And all but 2 wins of that WPA difference would be noise (since we guaranteed a 2 win difference in true talent).

Basically, looking at players’ actual OPS or ERA as compared to their true talent ERA or OPS, you get some noise. Looking at those same players’ sample WPA as compared to their true talent WPA, you get a gigantic amount of noise. That is because WPA is so dependent on random events, such as the runners on base and the game situation, most of which have little or nothing to do with the player himself.

Imagine that every player had the same true talent.  What do you think the WPA’s would look like?  They would be all over the map.

The “illusion” (that pens are so critical, or that there is a large difference between good and bad pens) comes from 2 things: One, the large number of IP comes from a lot of pitchers. I have said many times that 300 IP from one pitcher is a totally different animal than 300 IP from 10 pitchers, in terms of random fluctuation around the mean true talent for those 300 IP. Two, anything that happens at the end of the game will be perceived as way more critical than things that happen at the beginning of the game.

While it is important to have a good closer because the average leverage for him is or should be around 2.0 or more, the average leverage of a pen is around 1.0 I think, which means that pens are no more important than starters and certainly less since they only throw 1/2 as many IP as the starters.


#10    Tangotiger      (see all posts) 2008/10/21 (Tue) @ 13:05

I’m basically going to ditto everything MGL said here.

WPA is the right way to look at it, but you need to do the split-season as MGL is doing it.  Take the top 5 bullpens for the first 81 games, and bottom 5, by WPA.  Then, tell us what they did in WPA for the following 81 games.

Perhaps most readers here can’t accept this, but the fact is that the out-of-sample data is what establishes the true talent level (for the group), and the in-sample data is completely irrelevant in this regard.


#11          (see all posts) 2008/10/21 (Tue) @ 13:07

I don’t believe the Mets should have given up on Heath Bell in the first place.


#12    MGL      (see all posts) 2008/10/21 (Tue) @ 19:11

I also should add that another reason that you will get tons of noise in WPA is that the measure itself has such a large range even in reasonable situations.

But, as Tango said, see for yourself.  Take the best and worst bullpens (or offenses, or whatever you want) in the first half and see how they do in the second.  That will tell you how much WPA gets regressed.  It won’t tell you the spread in WPA in true talent, because the best and worst in the first half are not (close, but not quite) the best and worst in true talent.

Same thing for my study. The pens in the first half are not separated according to true talent, but according to performance. That is why the spread in true talent in the second half was only .7 runs, while I am positing that the spread between the best and worst pens in true talent is around 1 run (just a guess, which is supported by my data).

I was just thinking.  If team front office personnel did nothing but read this blog, it would turn their entire operations upside down. That is presuming that they understood and appreciated (as credible) somewhere between 50 and 100% of the material herein.

Seriously, shouldn’t this be mandatory reading for at least one or two people in the front offices of all the “progressive” teams?  And it’s free!


#13    Aaron      (see all posts) 2008/10/21 (Tue) @ 19:31

MGL, with regards to closers, I was referring to the last paragraph in your first post where you said “So the difference between the absolute best and worst bullpens in baseball, not including the closers, is less than 4 wins.”

Why closers would be excluded from those calculations doesn’t make sense to me. And if they are included then that probably means that the worst bullpen costs its team more than two wins, even if using a simplistic approach.

********

Let me be clear, I understand all of what you and Tango are saying. I understand the difference between observed performance and true talent. I understand that small samples lead to highly volatile results. I understand that WPA is hypersensitive to the outcome of a handful of at bats. I get that. That’s why I am not overstating anything because I am not arguing that bullpens have a true, expected impact of +/- 8 wins. I am simply saying that it is more than +/- 2 wins.

The reason that is so is because the situation that a particular reliever finds himself in is not a random occurrence like it is for hitters. The manager of the team makes a very conscious and deliberate decision about when to deploy each of his relievers, and that means that we can be confident about what their average leverage will be BEFORE the season starts. Aren’t you almost 100% certain that Mariano Rivera is going to have a leverage of at least 2 next year? Don’t you KNOW that, at least as much as you can know any future event? And don’t you know that a team’s long man/mop-up guy is going to have a leverage of less than 1? If that is the case, then there is absolutely no reason to treat the innings of pitchers with completely different roles as if they have the same value.

Again, we are talking about things we can be certain of before the fact. We know the way that bullpens are run means that the impact of the closers and setup man is exaggerated while the value of guys at the bottom of the hierarchy is reduced. So while an after-the-fact metric like WPA may overstate the impact a bullpen should be expected to have, lumping all the relievers together is going to understate things. It isn’t enough to determine the bullpen’s overall true talent, the specific leverage of each individual pitcher also needs to be taken into account.


#14    Aaron      (see all posts) 2008/10/21 (Tue) @ 20:04

Let me use a different example to make my point.

Stealing a base is a positive event for an offensive player. How do we put a concrete value on those steals, though? The simple way is to just multiply his total number of steals by the league average value of a stolen base. But the value of a base varies greatly by the situation (and whether it’s 2nd or 3rd), and the runner gets to decide when to try to steal. So shouldn’t we take that into account in order to give credit to the guys who steal when it really matters instead of those who pad their numbers when it’s easy? WPA is one way to do that, but the outcome of a few attempts will have a huge impact on the numbers and thus cloud how good/smart the runner really is.

Those are two extremes, but there is a third way that takes into account the specific situations a runner tries to steal in without adding a ton of noise. That is to weight each attempt by its leverage, but multiply all attempts by the player’s overall success rate (or whatever you think his true talent success rate should be) instead of the actual results for each play.

That’s what I have in mind when talking about bullpens. Weight each player’s innings by his leverage but use his “true” talent projection to determine his impact rather than the actual results.
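A minimal sketch of that idea, with every input hypothetical: two bullpens that both add up to a 4.00 true-talent ERA over 450 innings, one flat and one with talent concentrated in the high-leverage roles. The role leverages of 2.0, 1.5 and 0.9 are the weights MGL describes using in his team projections later in this thread; the innings, the ERAs and the 10-runs-per-win rule of thumb are assumptions made up for illustration.

```python
RUNS_PER_WIN = 10.0  # common rule of thumb, assumed here

def aggregate_era(pen):
    """Innings-weighted ERA of a pen given (IP, true ERA, leverage) rows."""
    return sum(ip * era for ip, era, _ in pen) / sum(ip for ip, _, _ in pen)

def leveraged_runs_saved(pen, baseline_era):
    """Runs saved versus the baseline, with each inning weighted by leverage."""
    return sum(ip * li * (baseline_era - era) / 9 for ip, era, li in pen)

# (IP, true-talent ERA, average leverage): closer, set-up men, middle/long
spread_pen = [(70, 2.00, 2.0), (140, 3.50, 1.5), (240, 4.875, 0.9)]
flat_pen = [(70, 4.00, 2.0), (140, 4.00, 1.5), (240, 4.00, 0.9)]

for name, pen in (("spread", spread_pen), ("flat", flat_pen)):
    runs = leveraged_runs_saved(pen, baseline_era=4.00)
    print(f"{name}: ERA {aggregate_era(pen):.2f}, "
          f"leveraged runs saved {runs:.1f}, wins {runs / RUNS_PER_WIN:.1f}")
```

Because the method uses projected talent rather than actual results, the gap between the two pens reflects only how the talent is distributed across roles, not the luck of individual outings.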


#15    tangotiger      (see all posts) 2008/10/21 (Tue) @ 20:46

How about a real-world view as to how much impact the bullpen has.

Teams pay their nonpitchers around 57% of the payroll, their starters get around 33% and relievers get 10%.

We also know that the standard deviation (SD) of team true talent in MLB is around .060 in winning percentage. This means that 95% of all teams have a true talent within +/- 2 SD, or between .380 and .620, or 62 to 100 wins. (Note: these days, the range is a bit tighter.)

Anyway, with some 38 wins in true talent gap between the best and worst teams, 10% of that is the bullpen, or 3.8 wins.
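The arithmetic behind those figures, as a quick sketch. It assumes a 162-game season and a .500 league average, and rounds to whole wins the way the text does.

```python
GAMES = 162          # assumed season length
SD_WIN_PCT = 0.060   # SD of team true talent, in winning percentage
BULLPEN_SHARE = 0.10 # relievers' share of payroll

low_pct = 0.500 - 2 * SD_WIN_PCT
high_pct = 0.500 + 2 * SD_WIN_PCT

low_wins = round(low_pct * GAMES)
high_wins = round(high_pct * GAMES)
talent_gap = high_wins - low_wins

bullpen_gap = talent_gap * BULLPEN_SHARE

print(f"{low_pct:.3f} to {high_pct:.3f} -> {low_wins} to {high_wins} wins")
print(f"gap {talent_gap} wins, bullpen share {bullpen_gap:.1f} wins")
```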

That’s exactly what MGL said:

That is a difference of 39 runs or less than 4 wins.  So the difference between the absolute best and worst bullpens in baseball, not including the closers, is less than 4 wins.

If teams really believed bullpens were more valuable, they’d pay for it.  They don’t because they aren’t.

***

Can we come up with the 10% figure in different ways?  Sure. 

I have the replacement level for starters as .380, and for relievers as .470.  The average starter is .490, and the average reliever is .520.  Starters get 65% of the innings.

So, the WAR for starters, per team game, is:
(.490-.380)*.65 = .0715

The WAR for relievers per team game:
(.520-.470)*.35 = .0175

Because of leverage, let’s make it .07 wins for starters and .02 wins for relievers.

So, out of .090 pitching WAR per team game, relievers get almost 22% of the WAR (.02 / .09).

Teams pay relievers 10% of payroll out of the 43% that goes to all pitchers, or 23% of pitcher pay.

As you can see, teams pay relievers, as a group, what they are supposed to.  We know the value of the bullpen is 10% of team value.  And that means that the best bullpen would be roughly worth 4 more wins than the worst bullpen.
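The same comparison in code, using only the numbers quoted above. The variable names are mine; the leverage adjustment is taken as given from the text (.07 and .02) rather than derived.

```python
# Wins above replacement per team game, before any leverage adjustment
starter_war = (0.490 - 0.380) * 0.65
reliever_war = (0.520 - 0.470) * 0.35

# Leverage-adjusted values as rounded in the text
starter_adj, reliever_adj = 0.07, 0.02
reliever_war_share = reliever_adj / (starter_adj + reliever_adj)

# Relievers' share of the money paid to pitchers
reliever_pay_share = 0.10 / 0.43

print(f"starters {starter_war:.4f}, relievers {reliever_war:.4f}")
print(f"reliever share of pitching WAR: {reliever_war_share:.1%}")
print(f"reliever share of pitcher pay:  {reliever_pay_share:.1%}")
```

The two shares land within about a percentage point of each other, which is the point being made.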


#16    MGL      (see all posts) 2008/10/21 (Tue) @ 21:05

Everything you said is true.  That doesn’t mean that pens are “worth” more or less than 2 wins or 3 wins or 10 wins.  I was just guessing about the value of bullpens in terms of true talent.  And I just showed the 1st half/2nd half data to illustrate how much regression there is in bullpen performance. I realize that if you want to look at pen value you have to weight the contributions of the various classes of relievers properly.  No disagreement there.  In fact, when I do my team projections, I give expected closers 2.0 weight (simply weight their innings by 2), set-up men, 1.5, and middle and long relievers .9.  If you did that, I am still guessing that you won’t find more than 2 wins either way between the best and worst in true talent.

I misspoke when I first said, “not including closers” etc. and then said, “therefore, there is about a 4 win difference between the best and worst bullpens.”

IOW, yes, of course the true talent value of a bullpen is the true talent of a closer prorated by about twice his innings pitched, etc., yet, I still think that the overall true talent of the best and worst pens, using that (correct) methodology, is plus or minus 2 wins, and Tango’s salary analysis supports that.


#17    MGL      (see all posts) 2008/10/21 (Tue) @ 21:06

Interestingly, while teams talk a big game about how important a bullpen is, they don’t really mean it, as evidenced by Tango’s numbers.


#18    Rally      (see all posts) 2008/10/21 (Tue) @ 23:41

“Anyway, with some 38 wins in true talent gap between the best and worst teams, 10% of that is the bullpen, or 3.8 wins.”

I don’t think that’s a good example.  Would you say that the gap between the best and worst 1st basemen is only 2.7 wins? (38 * .57)/8 It suffers from the same problem as win shares.


#19    tangotiger      (see all posts) 2008/10/22 (Wed) @ 07:07

The split is 57% nonpitchers, 33% starters, 10% bullpen. 

You CANNOT split that any further. You can’t say that the starters are 33%, divide by 5 or 6, and get 5%, and then take 5% of 38 wins, and get 2 wins as the gap between the best and worst starters.

