THE BOOK--Playing The Percentages In Baseball


Tuesday, December 04, 2007

Starters going deep

By Tangotiger, 10:10 AM

Discussions centered on the impact to the team, bullpen, player, on what happens when a starter does or does not go deep, and how to evaluate the player.

Here’s the post from MGL that is kicking this off:

(Note: the first few posts below were moved here from another thread.)


MGL:

BTW, how do we resolve that issue of the replacement pitcher not throwing 7 IP per game?  Doing it that way, I get a 6.0 WAR for Santana.

If the replacement pitcher only throws 5 IP and then a replacement reliever comes in for the other 2 IP that Santana would pitch, his WAR is only 5.4, a big difference.

IOW, here are the two ways to figure Santana’s WAR:

Method 1

Santana pitches 30 games, 7 IP per game, and allows 1.2 runs per 9, or .9333 per game (7 IP), less than the league average RA.  For the remaining 2 IP per game, we assume league average RA.  That is a .61 wp in a league where RA=4.7 and we use Pythagoras.  If a replacement pitcher pitches those same 7 IP per game, he will allow 1.2 runs per 9 more than league average for those 7 IP, or .9333 runs per game worse than league average.  Again, the other 2 IP are league average RA.  That is a .41 wp.  For 30 games (210 IP for Santana or the replacement starter), that is a difference of .2 * 30, or exactly 6 wins.

Method 2

This is what happens in reality.  The replacement pitcher only pitches 5 IP, for a total runs allowed of 1.2 * 5/9 worse than league average, or .67 runs per game. For the next 2 IP, a replacement reliever pitches .3 runs per 9 worse than average or another .067 runs per game worse than league average.  The final 2 IP are pitched by league average pitchers (just like with Santana or with the replacement starter going 7 IP per game).  That is a total of .733 rpg worse than league average, which is a wp of .428 rather than .41 when the replacement starter was going 7 IP.  That is a WAR of 5.46 for Santana.  That is with Santana going 210 IP but his “replacement” only going 150 IP, which as I said, is closer to reality.

So is Santana 6 WAR or 5.46 WAR?  I submit that it is clearly #2.  I think that Tango’s method would put him at 6 WAR, using something like method #1.  Tango, what say you?  Santana is 1.2 runs per 9 better than league average RA.  He pitches 7 IP per game and 210 total IP.  How many WAR is he if replacement is also 1.2 runs worse than league average RA?  IOW, Santana allows 2.4 fewer runs per 9 than the replacement.  And league average RA per game is 4.7.
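
A minimal Python sketch of the two methods, as a rough check on the arithmetic. It assumes the basic Pythagorean formula with an exponent of 2 (the post does not say which exponent was used) and the 4.7 league RA quoted above; the function and variable names are invented for illustration.

```python
LEAGUE_RA = 4.7       # league-average runs allowed per game, from the post
PYTH_EXPONENT = 2.0   # assumption: the post does not state the exponent
STARTS = 30

def win_pct(runs_scored, runs_allowed, exponent=PYTH_EXPONENT):
    """Basic Pythagorean winning percentage."""
    rs = runs_scored ** exponent
    return rs / (rs + runs_allowed ** exponent)

def runs_vs_average(runs_per_9, innings):
    """Convert a per-9 rate relative to average into runs over some innings."""
    return runs_per_9 * innings / 9.0

# Santana: 1.2 runs per 9 better than average over 7 IP, average after that.
santana_wp = win_pct(LEAGUE_RA, LEAGUE_RA - runs_vs_average(1.2, 7))

# Method 1: replacement starter is 1.2 per 9 worse than average for 7 IP.
repl_7ip_wp = win_pct(LEAGUE_RA, LEAGUE_RA + runs_vs_average(1.2, 7))

# Method 2: replacement starter for 5 IP, replacement reliever
# (0.3 per 9 worse than average) for the next 2 IP.
repl_split_runs = runs_vs_average(1.2, 5) + runs_vs_average(0.3, 2)
repl_split_wp = win_pct(LEAGUE_RA, LEAGUE_RA + repl_split_runs)

war_method_1 = (santana_wp - repl_7ip_wp) * STARTS
war_method_2 = (santana_wp - repl_split_wp) * STARTS

print(f"Method 1 WAR: {war_method_1:.2f}")
print(f"Method 2 WAR: {war_method_2:.2f}")
```

Under these assumptions the two results land close to the 6 and 5.46 wins quoted above; any small gap comes from the rounded win percentages used in the post.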

#1    tangotiger      (see all posts) 2007/12/02 (Sun) @ 13:23

I reject the notion of 15%. 

Frankly, I think the Buehrle study I did is the best of its kind.  And Santana is the exact same age as Buehrle.  Players of that age pitched to 2700 batters over ages 29-32, across the history of MLB (outside of war years).  For players of that type (top end, young pitchers), that works out to 10% dropoff in the present times.

As that study shows, it’s irrelevant how many innings the top-end pitchers threw.  They ended up throwing to 2700 batters afterwards.

The study does not apply to older pitchers, or lesser quality pitchers.  It’s a study that exactly fits Santana.


#2    tangotiger      (see all posts) 2007/12/02 (Sun) @ 13:45

As for the replacement-chaining effect, it certainly has a lot of layers to it.  Not only does Santana prevent a lesser pitcher from coming up, but he saves pitcher usage too.

Patriot has a great and long explanation of the issues here (toward the middle):
http://gosu02.tripod.com/id77.html

I try to follow the K.I.S.S. approach.

***

As for the notion about breaking the 20MM barrier with a 25MM deal, isn’t that what ARod did 7 years ago?  Santana is to the rest of the league what ARod was to the rest of the league.

I think I called for 6/150 deal as well.

It’s just too bad he didn’t have his typical great year last year, what with all those HR.


#3    MGL      (see all posts) 2007/12/02 (Sun) @ 16:34

Again, if you are talking about the difference between Santana pitching and a replacement pitcher pitching, there really is no chaining effect.  In my “method 2” above, if you replace Santana’s 7 IP per game with 5 IP of a replacement starter and 2 IP of a replacement reliever, nothing else changes.

So I would have to say that the regular way of computing WAR for a starter will overstate the value of that starter.  One has to replace a starter with a replacement starter for 5 IP per game and a replacement reliever for the difference between that starter’s expected IP per game and 5.

All that really means is that although a replacement starter may be 1.2 runs per game worse than league average, his overall impact is mitigated by the fact that he only pitches 5 IP per game and then is replaced by a much better pitcher (for an inning or two), so that our replacement level for a starter is in essence not that low.

If you want to talk about chaining, then you have to say that the 2 IP between Santana and a replacement starter is pitched by one of a team’s regular relievers, which is not what I am saying.


#4    Tangotiger      (see all posts) 2007/12/03 (Mon) @ 10:57

If you use up the replacement reliever, you are tapping into your bullpen.  Let’s say the typical bullpen throws say 500 innings a year (6 relievers x 83, or 7 x 71).  Now, you need an extra 70 innings to pick up the slack of 2 IP x 35 starts for the Santana-less games.  Where are they going to come from?  If all the relievers pick up the slack, that’s an extra 10 or 12 innings each.  If those good relievers could perform at that level, presumably they would.  Presuming that they already pitch their maximum innings for their maximum effort, then something has to give (reduced effort per game).
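
The workload arithmetic in this paragraph can be laid out as a short Python sketch. The figures are the ones quoted in the post; the function name is invented for illustration.

```python
EXTRA_INNINGS = 2 * 35  # 2 IP per start over 35 Santana-less starts

def extra_innings_per_reliever(relievers, extra_innings=EXTRA_INNINGS):
    """Extra innings each reliever absorbs if the slack is shared evenly."""
    return extra_innings / relievers

for relievers, innings_each in ((6, 83), (7, 71)):
    baseline = relievers * innings_each  # roughly the 500-inning bullpen
    extra = extra_innings_per_reliever(relievers)
    print(f"{relievers} relievers: {baseline} IP baseline, "
          f"{extra:.1f} extra IP each")
```

Even sharing is itself an assumption; the post only uses it to show the scale of the extra load.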

The counter to that is that I think relievers aren’t being maxed out, and it would do them some good to throw 90 IP each.  But, that’s a different model that needs discussing.

In any case, I’m not going to say that replacing Santana with a starter at a rate 6.00 R/9 (for 5 innings) and a reliever at a rate of 5.00 R/9 (for 2 innings) is the comparison point.  Why not then say that he can be replaced by 4 relievers at a rate of 5.00 R/9 (for 7 innings)?

I can’t count his longevity to go 7 innings a game as a minus. (i.e., I can’t compare his 6th and 7th inning against a better level, without considering the fact that he’s saving his bullpen).


#5    Rally      (see all posts) 2007/12/03 (Mon) @ 15:52

On Santana’s replacement level:

It really depends on the makeup of your pitching staff.  If he’s one of 4 ‘horses’ in the rotation then your relievers are probably fresh most of the time, and might even benefit from a little more work if you replace him with a 5 inning starter.

If he’s the only durable pitcher you have, then you lose him and you’ll be going to the bullpen all the time, and things will get ugly.


#6    dcj      (see all posts) 2007/12/03 (Mon) @ 17:10

I like the game-by-game method for a pitcher like Santana, but of course every start is not exactly 7 IP, 2 ER. I wonder if we adjusted for the inconsistency, if Santana’s value wouldn’t go up.

In Sal Baxamusa’s HBT article today about the Weibull distribution he says that consistency in scoring runs is good and inconsistency in allowing runs is good.

http://www.hardballtimes.com/main/article/consistency-is-key/

I’m not sure that applies as a blanket statement. Consider a thought experiment where a team scores exactly 5 runs per game. If they allow an average of 4 runs per game, then the more consistent they are at allowing runs, the better the record. On the other hand if they allow an average of 6 runs per game, then the more *inconsistent* they are at allowing runs, the better the record. At least, I think it should work that way.
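
The thought experiment can be checked with a tiny Python sketch. The runs-allowed distributions below are hypothetical two-point examples chosen only to have the right averages, and counting a tie as half a win is an assumption.

```python
from fractions import Fraction

def win_pct(runs_scored, allowed_dist):
    """Win percentage for a team that always scores `runs_scored`.

    `allowed_dist` maps runs allowed -> probability.
    Ties count as half a win (an assumption for this sketch).
    """
    total = Fraction(0)
    for allowed, prob in allowed_dist.items():
        if allowed < runs_scored:
            total += prob
        elif allowed == runs_scored:
            total += prob / 2
    return total

half = Fraction(1, 2)

# Hypothetical distributions: same mean, different spread.
steady_4 = {4: Fraction(1)}       # always allows 4
streaky_4 = {2: half, 6: half}    # averages 4
steady_6 = {6: Fraction(1)}       # always allows 6
streaky_6 = {4: half, 8: half}    # averages 6

for name, dist in [("steady 4", steady_4), ("streaky 4", streaky_4),
                   ("steady 6", steady_6), ("streaky 6", streaky_6)]:
    print(name, float(win_pct(5, dist)))
```

In this toy setup the team scoring exactly 5 does better with consistency when it allows 4 on average, and better with inconsistency when it allows 6 on average, which is the direction the paragraph argues for.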

Bringing it back to Santana, I suppose the direction of the adjustment for inconsistency might depend on the strength of the team’s offense. Though, probably not for an ace like him. The right thing to do for purposes of looking forward, I guess, is to run the (theoretical) distribution of runs allowed in a game Santana starts against an average team’s distribution of runs scored.

Then there is the issue that runs scored and runs allowed are not independent even after correcting for the lack of ties. Steven J. Miller assumes that away when he derives the Pythagorean formula from the Weibull distribution

http://www.math.brown.edu/~sjmiller/math/papers/PythagWonLoss_Paper.pdf

but we might not be able to get away with that for this purpose.

Of course after all that calculation it may turn out that the adjustment is negligible. But right now I don’t have a good idea of how big it might be, or even in which direction. I guess the SNVAR stats from BP go towards answering this question, although those are backward-looking.


#7    dcj      (see all posts) 2007/12/03 (Mon) @ 18:09

Actually now that I think about it, all the extra work might be unnecessary. Experimentally it seems that a team’s runs allowed distribution over the full season approximates the Weibull. If the same is true when only looking at games started by Santana, then we could model “Twins-with-Santana” as a team scoring 4.5 RPG and allowing 3.5 RPG, or whatever. So the Pythagorean formula would give the right answer on a game-to-game basis.

There is the question of whether the Weibull distribution is a good approximation for games started only by one pitcher, like it is for whole teams. Theoretically speaking I think the answer has to be no. Imagine that a team has 5 starters as follows:

Starter 1: 3.50 ERA
Starter 2: 4.00 ERA
Starter 3: 4.50 ERA
Starter 4: 5.00 ERA
Starter 5: 5.50 ERA

So they will give up something like 4.5 RPG. (I don’t know if the lower bullpen ERA fully offsets the unearned runs.) According to Miller, the overall distribution of runs allowed follows the Weibull pattern.

Now if we isolate only games started by the third starter, the average runs allowed is still 4.5, but the spread in the distribution should be narrower. So if the Weibull is a good match for a whole team’s runs allowed distribution, it has to be too wide for a “team-with-pitcher’s” runs allowed distribution at the same overall average.
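
One way to put a number on the narrowing is the law of total variance: the team-wide variance equals the average within-starter variance plus the variance of the starters' means. The Python sketch below assumes each starter gets an equal share of starts and treats the ERA figures as stand-ins for mean runs allowed; the within-starter variance of 9.0 is a hypothetical placeholder, not a measured value.

```python
from statistics import mean, pvariance

# Mean runs allowed per starter (ERA used as a stand-in, an assumption).
starter_means = [3.50, 4.00, 4.50, 5.00, 5.50]

# Spread that comes purely from differences between starters.
between_starter_variance = pvariance(starter_means)

def team_variance(within_variances):
    """Law of total variance, assuming equal starts per starter."""
    return mean(within_variances) + between_starter_variance

# Hypothetical game-to-game variance for each starter.
hypothetical_within = [9.0] * 5

print("between-starter variance:", between_starter_variance)
print("team variance:", team_variance(hypothetical_within))
```

If the game-to-game variance is much larger than the between-starter term, the difference between a whole team's distribution and a single starter's would be small, which is consistent with the suspicion that random variation may wash the effect out.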

The game-to-game random variation may wash out all of this effect. After all, the Weibull is supposed to be a good fit for teams no matter what the range of ability among their starting pitchers is. Then again, no team ever has five starting pitchers all with exactly the same ability, which is what a “team-with-pitcher” would look like.

It may also be the case that some pitchers are more consistent from game to game than others, even at the same overall ability level. Intuitively, a guy who tends to give up more HR might be less consistent.

I guess that despite these issues, I’ve convinced myself that the straight pythag should be a good approximation. Still not sure of how to properly account for increased/decreased bullpen innings, though.


#8    dcj      (see all posts) 2007/12/03 (Mon) @ 21:38

On the question of bullpen usage.

I think there are three ways that a well-rested bullpen can perform better. First, with fewer overall innings, a greater fraction of them is pitched by the better pitchers. Second, each individual pitcher might improve his performance since he is not being worked quite so hard. Third, more flexibility gives the manager leeway to leverage the pen more effectively.

Of these I would say the toughest to quantify is the second effect. It gets even trickier if part of the improvement in performance is a decline in the frequency of injuries.

--

I’ll take a stab now at the third effect. Fangraphs lists for each team the total IP of the bullpen, the total WPA, and the total BRAA. BRAA if I understand correctly is lwts by base-out state, so it should add up to total runs allowed (accounting properly for runners inherited from starting pitchers). Do I have it right?

There’s also a total pLI listed, which is the average leverage of a situation faced by the pen. So if we take

WPA - pLI * BRAA / 10

then that should give a measure of how effectively the bullpen was leveraged. I would hope that every team in MLB should come out positive in this measure, because a monkey throwing darts would get zero.

Now, it will be interesting to see if there is any relationship between total bullpen IP and “effectiveness of leverage.” Of course there are a lot of things going on here, not least blind luck, managers’ bullpen philosophies, etc. But the hypothesis at least is that the lower the bullpen IP, the higher the effectiveness of leverage, because the manager has more room to maneuver.

--

All right, collecting data…
Here it is. There are 2 sheets, the first with the data, the second with correlations and scatterplots.

http://spreadsheets.google.com/ccc?key=pso80GqIAQP3yHg80yCg3Nw&hl=en

--

So. In 2007 we have 15292 bullpen IP across both leagues at an ERA of 4.19. The average pLI is 1.04, the only major outlier being the Cardinals at 0.82. The high end is the Blue Jays at 1.21 followed by the Giants at 1.17.

Because pLI * BRAA / 10 is awkward, I’ll call it ExpWins for expected wins. I will refer to WPA - ExpWins as Effectiveness.

Totals for all teams: ExpWins = +35, WPA = +77, Effectiveness = +41. (Some rounding error.) ExpWins is positive because a 4.19 ERA is better than league average. The good news is that teams leveraged their bullpens to the tune of +1.4 wins apiece. (1.4=41/30) The leader: St. Louis at +6.2 wins. The monkey awards go to the A’s (-1.3), the Cubs (-1.4), the Rockies (-3.5), the Giants (-4.0), and the Blue Jays (-4.4).

The Jays are an interesting case. Their top three relievers, Accardo, Janssen, and Downs, had excellent seasons (each had an ERA between 2.10 and 2.40), but their WPA numbers were merely pedestrian. Accardo, the closer, had a mean pLI of 2.02 and +19 BRAA in 67.3 innings. You would expect something like +3.8 WPA, but his WPA was only +1.9. Add similar results from Janssen and Downs and an abysmal 4.1 innings of 12.46 ERA from B.J. Ryan and you have the most underachieving bullpen, relative to its ERA-type stats, in baseball.

This case illustrates a flaw in my analysis. The hypothesis is that overworked bullpens fail to be effective because the manager can’t leverage them properly. In the Jays’ situation, John Gibbons got his best pitchers in the game in the most crucial spots, but the pitchers simply didn’t deliver.

The Jays’ pen pitched 450 innings, far less than the average of 510. So maybe Gibbons *did* take advantage of his starters’ long outings in order to leverage his pen, but his top relievers failed to execute. This would indicate that “Effectiveness” as I’ve defined it is not quite what we want to be measuring.
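
The ExpWins and Effectiveness definitions can be written as a short Python sketch, using the Accardo figures quoted above as the worked example. The function names are invented for illustration, and the divisor of 10 is the runs-per-win factor from the formula in the post.

```python
RUNS_PER_WIN = 10.0  # divisor used in the post's formula

def expected_wins(pli, braa, runs_per_win=RUNS_PER_WIN):
    """ExpWins: leverage-weighted runs above average, converted to wins."""
    return pli * braa / runs_per_win

def effectiveness(wpa, pli, braa):
    """Effectiveness: actual WPA minus ExpWins."""
    return wpa - expected_wins(pli, braa)

# Accardo, 2007: pLI 2.02, +19 BRAA, +1.9 WPA (figures from the post).
accardo_exp = expected_wins(2.02, 19)
accardo_eff = effectiveness(1.9, 2.02, 19)

print(f"ExpWins: {accardo_exp:.2f}")
print(f"Effectiveness: {accardo_eff:.2f}")
```

The negative result for a closer who was used in the highest-leverage spots shows why the measure mixes usage with execution.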

--

In any case, on to some correlations.

ExpWins against bullpen ERA: r=-.97

This suggests that despite the flaws of ERA for relief pitchers, it’s reasonably accurate when applied to the whole bullpen (say, within 0.25 or so, according to my scatterplots).

WPA against ExpWins: r=0.80

As expected, the better the bullpen is in ERA-like stats, the more wins it produces.

Effectiveness against ExpWins: r=-0.30

I am not sure of the cause of this negative correlation. Ideally, “effectiveness” should be uncorrelated with the overall ability of the bullpen. Another strike against this definition.

ExpWins against IP: r=0.18

So bullpens with more innings pitched rack up a higher total number of expected wins. This isn’t too much of a surprise as the average relief pitcher is “above average” according to the overall totals. In any case, the correlation is very weak.

Effectiveness against IP: r=-0.13

This is what we were after all along: overworked bullpens have lower “effectiveness.” On the other hand the relationship is extremely weak, and I’ve already noted problems with the measure of effectiveness. Well, at least the sign of the correlation is negative like we expected.

--

Clearly this little study would benefit from expansion to years other than 2007. Fangraphs has this data going back to 2002 and 6 years is much more robust than one. As well, “effectiveness” might be replaced with a metric that rewards bullpens in which the highest-leverage innings go to the pitchers with the best ERA (or component stats, etc). On the other hand, how to deal with teams such as this year’s Indians, who purposely gave one of their worst relievers the ball in the most crucial situations?

To conclude: it seems plausible that having starters who pitch deep into games enables a team to leverage its bullpen more fully. All this wrangling with numbers has given the idea at least a little bit of support.


#9    MGL      (see all posts) 2007/12/04 (Tue) @ 00:11

RE: #4

The reason you don’t compare Santana or any pitcher to 4 relievers at 5 rpg for 7 IP is that that is not reality.  Figuring a pitcher’s WAR is supposed to be an exact model of reality comparing your starter in question to a “literal replacement starter.”

And in reality that replacement starter will NOT pitch 7 IP per game, but only 5.  So THAT is the baseline you HAVE to compare to if you want to tell a team, “If you get a replacement starter and pay him $500,000, here is how many wins your team will have.  And if you DON’T get that replacement starter, but instead, you get Santana (or whomever), here is how many wins MORE than that you will have.”

Isn’t that the EXACT question you are answering when computing WAR? If it is, you cannot use your “replacement pitcher for 7 IP per game” model.  It is NOT a negative that Santana goes deep into the game.  It simply mitigates his overall value.  Like your closer model (where you assign double weight only after a certain threshold), for a starter, he has one value for the first 5 IP per game and then another value for the next X IP per game, simply because his replacement will only pitch 5 IP per game and then get replaced by a better pitcher (for 2 IP), a replacement, or 5 rpg, reliever.

You may be right in that there is some chaining. I am really not sure.  If there is, I don’t think it makes much difference (it won’t make up the .5 win difference between my model and yours). In any case, my model may not be exactly correct, but it is MUCH closer to reality than your model (which is clearly incorrect).  It has to be because it IS reality (5 IP by the repl. starter and 2 IP by a reliever) whereas yours is definitely not (a repl. starter going 7 IP).

If you have that extra 60 IP (2 IP per game times 30 starts for Santana) per year to make up by your pen, what happens?  Well, I am already assuming in my model that a replacement reliever pitches those extra 60 IP.  So the question is, if a bullpen has to pitch 60 more IP a year and you are already assuming that it is by a replacement reliever, are any of the other pitchers in the pen, or the other bullpen innings affected by chaining?  Perhaps a tiny bit, but I am not sure if at all.  Either way, I don’t think it is going to make much difference.  Couldn’t you just literally let everyone else pitch as many IP as they would if there weren’t those extra 60 IP to absorb and just bring up a replacement reliever from the minor leagues every once in a while to take up that slack?

Or, what if we just assume that we take any bullpen and we add an extra 60 IP of workload.  What would those extra 60 IP look like no matter how you restructured the pen (give everyone an extra few IP, give your most underworked and probably worst reliever a lot of the extra IP, give your 5th starter a few relief outings, etc.)?  They can’t be much worse than reliever replacement level can they?

If you think that your clearly “artificial” model comes up with the right answer, all I ask is that you tell me what happens in reality if you truly replace Santana with a replacement pitcher.  I am willing to listen to see if any chaining causes those extra 60 IP to be worse than reliever replacement level.  I guess all you need is an extra .08 runs per 9 for those 60 IP to make up the extra half win, so I am willing to listen to how that might be.  Or perhaps we just can’t know how much an extra 60 IP affects a bullpen. 

Again, I am already conceding in my model that the extra 60 IP is much worse than the average talent of an entire bullpen.  I am assigning those extra 60 IP at 5 rpg or replacement reliever level.  I just don’t think you can get any worse than that.  IOW, no matter how many IP you add to a pen, the worst you can do is assign those extra IP to replacement reliever level (5 rpg).  No?


#10    MGL      (see all posts) 2007/12/04 (Tue) @ 00:30

dcj, interesting analyses and concept, although I did not quite follow it with a quick read.

BTW, there has been a lot of research on whether consistency or inconsistency is better or worse, not that there is any evidence that a pitcher or batter has control over their consistency so I don’t think it matters other than in a retrospective look at value.  Anyway, I think, IIRC, that a pitcher worse than league average benefits from inconsistency and that a better than average pitcher benefits from being consistent.  I think it is the reverse for batters, but I am not sure about that.  We can easily test it using sims (the sim can even model the fact that a pitcher’s true talent may change a little from game to game, due to dynamic physical and psychological effects).  We don’t have to figure out the best theoretical run distribution model, whether it be Weibull or whatever.

As you surmised, I don’t think it matters whether we use a static rpg model or a “real” one in terms of figuring WAR.  I have used my sim to compute a pitcher’s WAR and I have used the “same rpg” model, and it basically comes out the same.  I suppose that if a pitcher has a skill for some weird distribution, it might be different, but I doubt that anyone does.

Rally, remember that in my method for computing Santana’s WAR (or any pitcher’s, although it only makes a big difference if the pitcher has a high average IP per game), I am assuming the “worst” which is that those extra IP the pen has to absorb somewhere are going to be at reliever replacement level (5 rpg in a 4.7 rpg envir.).

Tango’s method assumes that those extra IP (60 or so in Santana’s case) are at starter replacement level (6.0 rpg or so), even though we KNOW that a replacement starter will NOT pitch those extra 2 IP per game - it will be a reliever.

So the question is, which is closer to reality?  I can’t believe that it is even close to Tango’s method, but I could be mistaken.  And I think it is entirely possible that the floor for how bad ANY number of extra IP for a pen is, no matter how it affects the pen (flexibility, chaining, fatigue, etc.), is replacement reliever level for those extra IP, be it 20 or be it 100.

Let’s say that we added 1000 extra IP to the pen.  Couldn’t we just constantly bring in replacement level relievers from the minors any time we wanted?  Isn’t fatigue on a pen only costly because managers want to let their replacement (worst) relievers pitch as little as possible?


#11    David Gassko      (see all posts) 2007/12/04 (Tue) @ 00:48

Mickey,

If you bring up an extra reliever, you lose one bench player, giving you less flexibility and forcing your regulars to play more. I think it is not an unreasonable assumption that the cost of this would be equal to .08 runs per game. And if you don’t bring up the extra reliever, the rest of your relievers are a lot more stretched out (and more likely to get injured)—which might again reasonably cost .08 runs a game.


#12    Anthony      (see all posts) 2007/12/04 (Tue) @ 01:07

Couldn’t we look at bullpen performance based on how long the starting pitcher goes? That is, what’s the bullpen’s RA when the starter gets 15 outs, 21 outs, etc.

Or even identifying the replacement starters first and then measuring bullpen performance in their starts through seven innings (or whatever). Maybe even check bullpen performance in the next few days to see if the added innings have a longlasting impact. It just seems like something that can be studied (and probably already has, for all I know).


#13    tangotiger      (see all posts) 2007/12/04 (Tue) @ 08:36

MGL is right that I am suggesting that the other two extra innings that Santana is pitching are being replaced by 6RPG/9IP, as opposed to 5 RPG/9IP.  We have some 60 to 70 extra innings here, which means there’s an extra 7 runs over 162 games that the bullpen will allow because of “tiring”.

Given that a bullpen allows say 250 runs, it will be rather difficult for me to prove that each reliever will allow only 3% more runs, especially since bullpens are not some fixed set of pitchers.

Therefore, I’m just presuming that the two will match up, that the replacement reliever for the Santana games, plus the overall loss of effectiveness for the bullpen over a season, will work out to a replacement pitcher as starter.
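
The 7-run figure follows from the one-run-per-9 gap between the two replacement levels. A minimal Python sketch, with an invented function name and the 250-run bullpen total taken from the post:

```python
def extra_runs(innings, starter_rpg=6.0, reliever_rpg=5.0):
    """Runs added by charging innings at 6 R/9 instead of 5 R/9."""
    return (starter_rpg - reliever_rpg) * innings / 9.0

BULLPEN_RUNS = 250  # typical bullpen runs allowed, from the post

for innings in (60, 63, 70):
    runs = extra_runs(innings)
    share = runs / BULLPEN_RUNS
    print(f"{innings} IP: {runs:.1f} extra runs, {share:.1%} of bullpen runs")
```

Across the 60 to 70 inning range the gap works out to roughly 7 runs, or about 3% of what the bullpen allows.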

***

As for the non-reality I brought up, why don’t managers only use relievers?  After all, say you have your 5th starter, who has an RA of 5.5 RPG.  Clearly, any reliever you bring in will be better.  By the time this 5th starter comes in to face the opponents the third time, he will be dreadful.  But still, there he is.

Why not give each of these starters negative salary value?  After all, they are pacing themselves for 5 or 6 innings, when they should be going all-out for 1 or 2 innings.

That’s what they are being paid for: pacing themselves.  Imagine that Santana is told: only give me 5 innings.  Isn’t it now possible that his effectiveness will go up? 

Instead of being a 3.5 RPG pitcher, he will be a 3.0 RPG pitcher?

Being a 3.0RPG compared to a 6RPG for 5 innings means he’s worth +1.67 runs per start.

Being 3.5 compared to 6.0 for 7 innings means he’s worth 1.94 per start.

And what if he goes 37 starts the first way and 32 the second?  62 runs saved either way.
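
The two scenarios can be compared directly in a short Python sketch. The rates, innings, and start counts are the ones in the post; the function name is invented for illustration.

```python
def runs_saved_per_start(pitcher_rpg, replacement_rpg, innings):
    """Runs saved versus replacement over the innings pitched in one start."""
    return (replacement_rpg - pitcher_rpg) * innings / 9.0

# Scenario A: 3.0 RPG for 5 innings, 37 starts.
short_per_start = runs_saved_per_start(3.0, 6.0, 5)
short_season = short_per_start * 37

# Scenario B: 3.5 RPG for 7 innings, 32 starts.
long_per_start = runs_saved_per_start(3.5, 6.0, 7)
long_season = long_per_start * 32

print(f"5 IP x 37 starts: {short_per_start:.2f} per start, "
      f"{short_season:.1f} per season")
print(f"7 IP x 32 starts: {long_per_start:.2f} per start, "
      f"{long_season:.1f} per season")
```

Both season totals come out at about 62 runs, which is the equivalence the post relies on.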

Basically, I don’t like taking the position of only looking at half the effect (which goes against him), when we ignore the other half that goes with him.

We allow bad starters to intentionally pitch less effectively, because we need them to go 5 innings to save the bullpen. 

Therefore, I think it’s reasonable to say that the taxing of Santana preserves 7 runs over a season for the bullpen.  It’s certainly at least 1 run.  It’s something.  7 seems reasonable enough that we can use the 6RPG model throughout a Santana start.


#14    MGL      (see all posts) 2007/12/04 (Tue) @ 14:04

I’ll buy that.  Anthony, sure it could be looked at (I don’t think it ever has), but I am not sure you could find the kind of small effects you are looking for amid all the noise.  Your sample sizes would not be very large, I don’t think.  Whenever I am pondering a study, I always think to myself, “What magnitude of effect am I looking for?” and, “What size sample do I expect to get?” That often gives me some idea as to how fruitful my study is likely to be in terms of getting any reliable results.


#15    tangotiger      (see all posts) 2007/12/04 (Tue) @ 15:39

Ooh, that’s a very pithy way of saying it.  Good job.


#16    salb918      (see all posts) 2007/12/06 (Thu) @ 02:16

7: “Actually now that I think about it, all the extra work might be unnecessary. Experimentally it seems that a team’s runs allowed distribution over the full season approximates the Weibull. If the same is true when only looking at games started by Santana, then we could model “Twins-with-Santana” as a team scoring 4.5 RPG and allowing 3.5 RPG, or whatever. So the Pythagorean formula would give the right answer on a game-to-game basis.”

That’s a decent approximation.  I doubt that pitchers can control their distribution of performance.  Given a large enough sample size, I think that a pitcher’s performance distribution is Weibullian.  Anecdotally, we have some evidence from the careers of Kevin Appier and Pedro Martinez:
http://www.hardballtimes.com/images/uploads/dist_ape_pedro.jpg

One season is of course too small a sample to see the Weibull curve, but if a career is Weibullian then we might as well assume that an individual season *would* take this shape *if* the sample size were larger than just ~30 starts.

One way to figure this:

1. Say Santana is 3.00 RA pitcher (or whatever) working in a 9 RPG run environment (or whatever).  That’s all the information you need to generate the Weibull curve.  Then you generate the curve for a replacement pitcher (RA = 6.00 or whatever) in the same run environment. 
2. Subtract the replacement’s distribution from Santana’s.
3. Figure a league-average offense (or whatever you like) to determine the Pyth win% for each run value, and then take the sum, weighting by the numbers in step 2.  This tells you the difference between Santana’s win% and the replacement’s.

Of course, I don’t know that that is correct or that it makes a big difference anyway.  And now that I think about it, I think it’s the same as taking Santana’s pyth win% (based on his RA) and subtracting the replacement’s pyth win% (based on replacement RA).  So maybe I said nothing at all.
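
The three steps can be sketched in Python. This is a rough illustration under stated assumptions, not the method from the THT article: it uses a shifted Weibull with shift -0.5 and shape 1.8 (illustrative values not taken from this thread), bins runs so that k runs covers [k - 0.5, k + 0.5), treats a 9 RPG environment as 4.5 runs per team, and splits ties evenly.

```python
import math

BETA = -0.5    # assumed shift, so bin k covers [k - 0.5, k + 0.5)
SHAPE = 1.8    # assumed Weibull shape (illustrative value)
MAX_RUNS = 40  # truncation point; the tail beyond this is negligible

def weibull_run_dist(mean_runs):
    """Step 1: discrete runs distribution with the given mean."""
    alpha = (mean_runs - BETA) / math.gamma(1 + 1 / SHAPE)

    def cdf(x):
        if x <= BETA:
            return 0.0
        return 1.0 - math.exp(-((x - BETA) / alpha) ** SHAPE)

    probs = [cdf(k + 0.5) - cdf(k - 0.5) for k in range(MAX_RUNS + 1)]
    total = sum(probs)
    return [p / total for p in probs]

def win_prob_by_runs_allowed(offense):
    """Step 3a: chance of winning when allowing exactly k runs."""
    return [sum(offense[k + 1:]) + 0.5 * offense[k]  # ties split evenly
            for k in range(len(offense))]

def pyth(runs_scored, runs_allowed):
    """Closed-form win% for the same shifted-Weibull assumptions."""
    a = (runs_scored - BETA) ** SHAPE
    b = (runs_allowed - BETA) ** SHAPE
    return a / (a + b)

offense = weibull_run_dist(4.5)       # average offense in a 9 RPG environment
santana = weibull_run_dist(3.0)       # 3.00 RA pitcher
replacement = weibull_run_dist(6.0)   # 6.00 RA replacement
win_given = win_prob_by_runs_allowed(offense)

# Steps 2 and 3b: weight win% by the difference of the two distributions.
diff_dist = sum((s - r) * w
                for s, r, w in zip(santana, replacement, win_given))
diff_pyth = pyth(4.5, 3.0) - pyth(4.5, 6.0)

print(f"distribution method: {diff_dist:.3f}")
print(f"Pythagorean shortcut: {diff_pyth:.3f}")
```

If the two printed differences agree closely, that supports the remark that, under these assumptions, the steps reduce to subtracting one Pythagorean win percentage from the other.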


#17    Guy      (see all posts) 2007/12/06 (Thu) @ 11:56

Sal:
A question:  When you say a 9 RPG “environment” do you mean the league as a whole, or in Santana’s games?  And more importantly, why should the “environment” have any effect on the distribution of Santana’s RA?  I would think the distribution should always be the same for any given mean RA, regardless of “environment.” I could see how a very different OBP/SLG balance might create different distributions for identical means, but still the overall RPG doesn’t tell us much about that.


#18    salb918      (see all posts) 2007/12/06 (Thu) @ 16:00

Guy, that’s a question I’ve wrestled with myself.  I *think* (now) that we need to use the league run environment, which is in opposition to what I said at THT last week.  Oops.

The distribution does change as a function of run environment.  Think of a league where the run env is 1 RPG.  Then a 4.5 RA pitcher is still going to have mostly 1 or 2 runs allowed per game, but the distribution will have a long tail that pushes the average up to 4.5.  I have a somewhat messy picture here:
http://web.mit.edu/baxamusa/Public/tango_dist_weibull_dist.JPG

Anyway, that’s an admittedly extreme example, and I think the Pyth method works fine (with the caveat that you have to properly choose your “replacement” pitcher [reliever]).


#19    Guy      (see all posts) 2007/12/06 (Thu) @ 18:17

Sal:  I guess I don’t see the logic.  Why would you expect a 4.5 pitcher to have many more 1- and 2-run games in a 1 RPG “environment” than in a high-scoring one?  I’d think it would be the same.  Or to pose a more realistic version, why would the distribution for a 3RPG pitcher look different in 1968 (when he was slightly above average) than in 2000 (when he was an All-Star)?  Again, I’d expect HIS distribution to be the same (but of course he’d win a lot more games in 2000).

The exception would be if the run environment was telling us something about the OBP/SLG ratio at the time, which might have a small impact on the distribution.


#20    MR      (see all posts) 2008/01/01 (Tue) @ 23:01

So, what was ultimately decided between tango and MGL on figuring Santana’s WAR? Method #1, with the other 2 innings as a replacement level pitcher?  And I think it does make sense to configure WAR with your replacement level starter by making him only go 5 innings...did you guys decide on doing that when determining every other starter’s (i.e., Santana) WAR?


#21    Tangotiger      (see all posts) 2008/01/02 (Wed) @ 10:16

Not speaking for MGL, but I haven’t changed my position.


#22    MR      (see all posts) 2008/01/02 (Wed) @ 13:40

So you have Santana at 6 WAR like his method 1? I didn’t exactly see a post where you valued Santana compared to a replacement pitcher pitching 5 innings (if you did that).


#23    Tangotiger      (see all posts) 2008/01/02 (Wed) @ 14:02

MR, actually my position is best described in post 13, and it looks like MGL agreed with it (at least to some extent) in post 14.


#24    MR      (see all posts) 2008/01/03 (Thu) @ 15:37

Okay, so you have the replacement level pitcher going 5 innings and a reliever with a 6RPG/9IP going the other two that Santana would normally pitch.  So what does 6 RPG/9 translate to in win %?  And most starters average like 6.5 IP per start (or something a little below 7), so if you were using an average starter and not Santana, you would have to assume an extra 1.5 innings or so and not a full 2 innings.


#25    Tangotiger      (see all posts) 2008/01/03 (Thu) @ 16:13

Replacement level pitchers as starters are .380 and as relievers are .470.

The average starter goes 5.9 IP.

