Friday, August 19, 2011
Replacement level and MVP
I did a two-post followup to the poll at Fangraphs.
Part1, Part2.
Linear Weights for pitchers.
It requires using a conditional clause. This is what you say:
IF you intend to look at numbers, THEN this is one of the best ways you would look at them.
If they come back with:
But what about what’s not in the numbers? What about heart? How can you rely only on numbers?
Your response is:
I’m not disagreeing with you. All I said is IF.... IF you intend to look at numbers. Whether the numbers tell 100% of the story or 10% of the story, I’m not suggesting either way. The only thing I’m saying is IF… IF. You can decide for yourself how much weight the numbers should get, and how much weight non-numbers should get.
And if they say:
What about wins?
Your response is:
That’s a number.
Excellent little article by Schoenfield. Once you see that, then you have no choice but to move on to something better. You can’t look at all the holes in RBIs and then… be content to use RBIs. Eventually, you make your way to this article by Ruane. And then finally, you get to Linear Weights and RE24.
As we’ve found out, FIP does not work in every run environment. The coefficients work for pitchers who give up around a league-average number of runs. But the further away from average a pitcher is, the more the coefficients need to change. If you wanted to do FIP the right way, that’s what you’d have to do. However, the appeal of FIP is that we get a quick look by using nice constant coefficients. If we had to figure out new weights for each pitcher and each year, it would lose a great deal of that appeal.
wOBA was never intended for mainstream use. It was conceived for The Book. And wOBA, done right, would be FIP done right: proper coefficients for each run environment. So, some years, the coefficient for the HR is 1.90 and others it’s 2.10 and so on.
But, the appeal of FIP is the non-changing coefficients, and we calibrate by using a constant (we’d add +3.20 or +3.00, etc., as the case warrants).
Indeed, when I use wOBA as a quick calculation, I use this:
wOBA
= 0.7 * (BB + HB)
+ 0.9 * (1B + ROE)
+ 1.3 * (2B + 3B)
+ 2.0 * (HR)
If I need to align it to some league level, or if I want to make it cross-era useful, I’ll just apply some overall constant to line them up. That is, I use the same principle behind FIP: keep the coefficients, and apply an overall fudge factor.
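As a rough sketch of the quick calculation above, here it is in Python. Two things are assumptions, not something stated in the formula itself: the weighted sum is divided by plate appearances (as in the wOBAfip line further down), and the "overall constant" is treated as additive, by analogy with FIP's constant. A multiplicative scale would be an equally plausible reading. The batting line is made up.

```python
# Quick wOBA with the fixed coefficients quoted above.
# Assumption: the weighted sum is divided by plate appearances (PA).
# Assumption: the alignment constant ("fudge") is additive, as in FIP.

def quick_woba(bb, hb, singles, roe, doubles, triples, hr, pa, fudge=0.0):
    """Fixed-coefficient wOBA, plus an optional alignment constant."""
    weighted = (0.7 * (bb + hb)
                + 0.9 * (singles + roe)
                + 1.3 * (doubles + triples)
                + 2.0 * hr)
    return weighted / pa + fudge

# Hypothetical batting line, for illustration only.
line = dict(bb=60, hb=5, singles=100, roe=8, doubles=30, triples=3, hr=25, pa=650)
print(f"quick wOBA: {quick_woba(**line):.3f}")
```

The coefficients never move; only `fudge` would change from league to league or era to era.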
My questions:
1. Do you prefer to see a non-changing wOBA formula, like FIP?
2. If so, do you prefer to align to that particular year, or do you want the league average to always be 0.330?
Good job by Dave. But I like his side note here:
We would not suggest that anyone look at 2011 WAR as a definitive ordered list of who the best players in the game are at this time – it’s not even trying to make that claim. It’s talking about past performance only, not what we expect going forward.
In fact, Buster’s criticism of WAR could be applied to any stat you want, traditional or advanced. If you interpret it literally, ERA currently says that Ryan Vogelsong is the best pitcher in the National League. That’s crazy, of course, but no one interprets single-season ERA that way. Single season batting average gives you Casey Kotchman as the third best hitter in baseball. It’s not just the advanced stats that produce results that “don’t pass the smell test”.
Bill James recently wrote an article called “Abe Lincoln Scores”, where he focused on 4 scores (BB+HB, SO, HR, BIP). He set the score for BIP to a “1”, and floated the other numbers around that. SO was 0, HR was 4, BB+HB was 2. (HR is undervalued in his metric.)
At this point, you should be thinking two things:
a. wOBA
b. FIP
The wOBA equation is this:
0.0: SO, other outs
0.7: BB, HB
0.9: 1B, ROE
1.3: 2B, 3B
2.0: HR
What James did was to focus just on those FIP things. So, we can come up with a FIP equation based on wOBA fairly easily:
wOBAfip = (0*SO + 0.7*BB + 2.0*HR + something*BIP) / PA
So, this made me think. Whereas in the FIP equation, the “3.2” is a constant for all pitchers, the “something*BIP/PA” is specific for each pitcher. That is, his wOBA will be affected based on the percentage of his PA that are BIP. To take an extreme view, if 100% of his PA are BIP, his FIP will equal 3.20. The wOBA for such a pitcher will be 0.300.
Is a .300 wOBA actually a 3.20 ERA? Not exactly. I mean, it’s pretty close. A .300 wOBA is more like a 3.30 ERA. But, still, there’s a bit of bias to account for.
The other question is whether the 13, 3, -2 weights are correct. FIP complicates matters by having IP, not PA, as its denominator. Since we are trying to remove the fielders from the equation, the existence of IP implicitly includes them.
***
The linear weights run values are these for the 4 scores:
Runs above average =
-.28 SO
-.03 BIP
+.32 BB
+1.40 HR
To convert to runs, we add +.12 runs (per PA). So, we get:
Total Runs =
-.16 SO
+.09 BIP
+.44 BB
+1.52 HR
Summed over a game’s worth of plate appearances, this gives us runs scored per game.
Since FIP likes to keep the BIP “fixed”, then we remove .09 runs per PA from each event, and spin it off into its own. Now we have:
Total Runs =
-.25 * SO
+.00 * BIP
+.35 * BB
+1.43 * HR
+.09 * PA
Since there are 38.5 PA per game, we get:
-.25 * SO
+.00 * BIP
+.35 * BB
+1.43 * HR
+.09 * 38.5
Note: there are an average of 38.5 PA per game. Great pitchers see fewer batters. Hence, the reason we have a bias here.
With 9 IP per game, we get:
-.25*9 * SO/IP
+.00*9 * BIP/IP
+.35*9 * BB/IP
+1.43*9 * HR/IP
+.09 * 38.5
Which is:
(
-2.25 * SO
+3.15 * BB
+12.9 * HR
) / IP
+ 3.47
Since this is on a runs scale, and not an earned runs scale, we can multiply everything above by 0.923 to get the ERA scale:
(
-2.1 * SO
+2.9 * BB
+11.9 * HR
) / IP
+ 3.20
Hence, we see where the -2, +3, +13, 3.20 figures from FIP come from. Based on this deconstruction, we see that the HR value in FIP may be too high, and that I should be using 12, not 13.
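The whole deconstruction above can be replayed in a few lines of Python. This is only a sketch of the arithmetic as written in this thread; the constant names are mine, and the inputs (.12 runs per PA, 38.5 PA per game, 9 IP per game, the 0.923 runs-to-earned-runs factor) are the ones quoted above.

```python
# Replays the steps above: linear weights -> FIP-style coefficients.
RUNS_ABOVE_AVG = {"SO": -0.28, "BIP": -0.03, "BB": 0.32, "HR": 1.40}
RUNS_PER_PA = 0.12    # added to each event to go from "above average" to total runs
PA_PER_GAME = 38.5
IP_PER_GAME = 9
RA_TO_ERA = 0.923     # runs scale -> earned runs scale

def fip_coefficients(lw=RUNS_ABOVE_AVG):
    # Step 1: total runs per event.
    total = {ev: v + RUNS_PER_PA for ev, v in lw.items()}
    # Step 2: pull the BIP value out of every event, so that BIP becomes zero
    # and the removed amount is charged once per PA instead.
    bip = total["BIP"]
    relative = {ev: v - bip for ev, v in total.items() if ev != "BIP"}
    # Step 3: express per IP (9 IP per game) and move to the ERA scale.
    coefs = {ev: v * IP_PER_GAME * RA_TO_ERA for ev, v in relative.items()}
    constant = bip * PA_PER_GAME * RA_TO_ERA
    return coefs, constant

coefs, constant = fip_coefficients()
for ev, c in coefs.items():
    print(f"{ev}: {c:+.1f} per IP")
print(f"constant: {constant:.2f}")
```

Changing any one input (say, the PA per game for a great pitcher) shows how quickly the constant drifts away from 3.20.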
***
However, remember that the run value of a HR is fairly static at +1.40 runs, while the run value of the walk moves with the run environment. As the run environment goes down, so does the run value of the walk. So, relatively speaking, the HR value compared to the walk increases as the run environment goes down, and decreases as the run environment goes up.
Indeed, if the run value of the walk is +.30, then the FIP component for HR becomes 13. If the run value of the walk is +.33, then the FIP component for the HR becomes 12.
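The thread does not spell out how the 13 and 12 are reached, so the following is only one way to reproduce them, under assumptions of my own: the HR stays at +1.40, the BIP stays at -.03, each event is measured relative to the BIP (as in the deconstruction above), and the whole equation is rescaled so that the walk coefficient is pinned at exactly 3.

```python
# One possible reading of the 13-vs-12 claim. Assumptions (not stated in the post):
#   - HR run value fixed at +1.40, BIP fixed at -0.03
#   - every event is measured relative to the BIP
#   - the equation is rescaled so the walk coefficient is exactly 3
HR_RUNS = 1.40
BIP_RUNS = -0.03
BB_COEFFICIENT = 3.0

def hr_coefficient(bb_runs):
    """HR coefficient implied by a given walk run value, with BB pinned at 3."""
    hr_vs_bip = HR_RUNS - BIP_RUNS
    bb_vs_bip = bb_runs - BIP_RUNS
    return BB_COEFFICIENT * hr_vs_bip / bb_vs_bip

for bb_runs in (0.30, 0.33):
    print(f"walk = +{bb_runs:.2f} runs -> HR coefficient {hr_coefficient(bb_runs):.1f}")
```

Under those assumptions, a cheaper walk pushes the HR coefficient up, which is the direction described above.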
***
We of course have another bias, and that is that runs are not linear when dealing with pitchers. But we’ve taken a decidedly linear approach. So, there are two things that conspire to bias a great pitcher’s FIP score too high:
1. We give him 38.5 batters per 9IP, when it should be a bit lower.
2. Each event has less impact the fewer runners on base.
However, one thing that shifts the balance is the use of IP, not PA, in the denominator. So, it kind of sets the balance back the other way.
***
What am I saying? I don’t know. Maybe change the “13” to “12” for HR? Maybe try to focus on PA and not IP? Maybe look at percentage of PA that are BIP? Maybe have a FIP equation that is better tuned to the run environment than simply floating the 3.2? I don’t know yet.
That’s why this is a lab thread.
I understand the reason for the existence of OPS. There’s the OBP pillar over here, there’s the SLG pillar over there, and we need something to keep the house from falling, so, let’s use the OBP + SLG pillars together.
But, why does OPS+ need to exist? No person actually calculates OPS+ by hand or by computer even. B-R.com calculates it for you, so you are basically taking it on faith. If you are going to take a metric on faith, why not take one that is not biased? That’s why I support RC+ (though not the James version of Runs Created). OPS+ doesn’t even mean anything. It just happens, by luck, to approximate RC+.
Bill James recently noted when asked about OPS+:
I don’t much like OPS. OPS is an approximation which has gained favor over better measurements because of its simplicity. A mathematical derivation based on a convenient approximation doesn’t strike me as a best option.
And he’s right. OPS is a nice shortcut, which will ensure its survival. But, to add the level of complexity required to get it to OPS+ is not the best option. It may be an ok option, it may be a passable option. It may even be a half-decent option.
OPS+ is nowhere near the best option, and there’s no point in debating for it on that basis. The argument for OPS+ requires you to concede that you are not interested in the best. And if you want to argue for OPS+ the way you’d argue that you’re happy at your crappy job because it pays the bills, then so be it. It gets the job done.
I’ve been meaning to do this for a few years now.
In 2010, with Cliff Lee on the mound, his team allowed 84 runs to 842 batters. Tommy Hanson’s team gave up 86 runs to 845 batters. As you can see, a pretty solid match.
(Note by the way that I didn’t say Cliff Lee gave up 84 runs. The defense has 9 fielders on the field. While the pitcher may be the pivotal player in allowing runs, he’s not the only one. This is why we should always say “the team allowed with the pitcher on the mound X number of runs”. This is not only accurate, it keeps us from giving too much credit to the pitcher.)
Hanson’s slash line (BA / OBP / SLG) was: .239/.301/.347
Cliff Lee on the other hand: .240/.255/.363
That works out to an estimated wOBA of:
Cliff Lee
= .277
Tommy Hanson
= .300
How is it that Cliff Lee ended with much better results than Hanson overall, but gave up a similar number of runs? While Hanson’s slash line with runners on base and bases empty was consistent with the league, Cliff Lee was on the mound when bad things happened with men on base:
Cliff Lee
.214/.230/.333 Bases Empty
.288/.302/.420 Runners on Base
The entire difference is basically BABIP driven, but we’re not concerned about this for now.
So, the question is: can we come up with a BaseRuns equation that is dependent on the base-out situation, such that the total runs estimated will be the same for Hanson and Lee? I don’t know the answer to that question yet.
I do want to present a general wOBA equation for bases empty and runners on base. For bases empty, we have:
0.85: 1B, BB
1.10: 2B
1.50: 3B
2.25: HR
Obviously, a single and walk are identical with bases empty. A shortcut to get the above, using only the slash line would be:
wOBAe = (2 * OBPe + SLGe - BAe ) * .42
The little e denotes performance with bases empty.
A general equation for runners on base would be:
0.50: BB
0.95: 1B
1.40: 2B
1.60: 3B
1.75: HR
With runners on base, there’s simply little to distinguish the various extra base hits. So, a shortcut equation would be:
wOBAr = (3 * OBPr + 2 * SLGr + BAr) * .16
The little r denotes performance with runners on base.
Also note that the Leverage Index with runners on base is 1.4, while it’s 0.7 with bases empty. And the bases are empty 55% of the time. (Yes, I know that the better you are, the more often the bases are empty. These are quick shortcuts.)
So, to combine the above two equations into an overall wOBA, we get:
wOBA
= wOBAe * 0.7 * .55
+ wOBAr * 1.4 * .45
So, if we take Cliff Lee:
.214/.230/.333 Bases Empty
.288/.302/.420 Runners on Base
We can convert that as:
wOBA
= (2 * .230 + .333 - .214 ) * .42 * 0.7 * .55
+ (3 * .302 + 2 * .420 + .288) * .16 * 1.4 * .45
= .299
Tommy Hanson:
.233/.289/.349 Bases Empty
.249/.319/.343 Runners on Base
We can convert that similarly to:
wOBA
= .303
As you can see, a wOBA based on looking at performance by men on base and bases empty makes Cliff Lee and Tommy Hanson equivalent.
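For anyone who wants to check the arithmetic, here is a short Python sketch of the two shortcut equations and the leverage-weighted combination, applied to the split slash lines quoted above. The function and constant names are mine; the numbers are the ones in this post.

```python
# Shortcut wOBA by base state, combined with Leverage Index and frequency.
LI_EMPTY, LI_ON = 0.7, 1.4     # Leverage Index: bases empty, runners on
SHARE_EMPTY = 0.55             # share of PA with the bases empty

def woba_empty(ba, obp, slg):
    return (2 * obp + slg - ba) * 0.42

def woba_runners_on(ba, obp, slg):
    return (3 * obp + 2 * slg + ba) * 0.16

def split_woba(empty, runners_on):
    """Each argument is a (BA, OBP, SLG) tuple for that base state."""
    return (woba_empty(*empty) * LI_EMPTY * SHARE_EMPTY
            + woba_runners_on(*runners_on) * LI_ON * (1 - SHARE_EMPTY))

lee = split_woba(empty=(.214, .230, .333), runners_on=(.288, .302, .420))
hanson = split_woba(empty=(.233, .289, .349), runners_on=(.249, .319, .343))
print(f"Cliff Lee:    {lee:.3f}")
print(f"Tommy Hanson: {hanson:.3f}")
```

Note that the two weights (0.7 * .55 and 1.4 * .45) sum to about 1.015 rather than exactly 1, which is in keeping with the quick-shortcut spirit of the thing.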
Lee does a conversion from wOBA to ERA. However, he does a linear conversion, which is wrong. Since wOBA is Linear Weights, wOBA should not be used so simply, for pitchers. Pitchers should use BaseRuns.
If you insist on using wOBA, then you need to do something more multiplicative. My method is to do the following:
1. wOBA / (1-wOBA)
2. Take the result, and raise it to the power of 1.5
3. Take that result, and multiply by a constant (12 for ERA or 13 for RA9)
You can play around with the 1.5 and 12 and 13 to match to that season.
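The three steps can be sketched in Python as follows. The function name and its defaults are just for illustration; as noted, the 1.5, 12 and 13 are knobs to be tuned to the season, not fixed truths.

```python
def woba_to_runs(woba, exponent=1.5, scale=12.0):
    """Non-linear wOBA -> runs per 9 conversion.

    scale=12 targets the ERA scale, scale=13 the RA9 scale.
    """
    odds = woba / (1.0 - woba)      # step 1
    return scale * odds ** exponent  # steps 2 and 3

for w in (0.280, 0.300, 0.330):
    print(f"wOBA {w:.3f}: ERA ~ {woba_to_runs(w):.2f}, RA9 ~ {woba_to_runs(w, scale=13.0):.2f}")
```

Because of the exponent, each extra point of wOBA costs a bad pitcher more runs than it costs a good one, which is exactly what a straight linear conversion misses.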
But, the BETTER thing to do is to use BaseRuns. And, theoretically, you should not require any fudge factor on a seasonal level, since giving up 15 HR, 50 walks, and getting 200 K on 220 IP should give you the same answer, whether in 2010 AL, 1985 NL, or 1969 high school. Now, what is missing is all the little things, like base stealing, advancing the extra base, etc. The fudge factor would change to account for the missing parameters. So, in that respect, you could have different results in these various leagues.
Anyway, if you want to get it right, use a simulator. If you want to get it fast but be in great shape, use BaseRuns. Everything else is a trick or shortcut, meaning it will have limitations.
Fangraphs thread, my response:
A lot of the issues being raised, very legitimate issues, are already handled by other stats.
You can’t blame OBP (a SABR 101 stat) for valuing a HR and BB identically.
You can’t blame wOBA (a SABR 201 stat) for treating the run value of a HR the same across all 24 base/out states.
You can’t blame RE24 (a SABR 301 stat) for not distinguishing between a close and late game and an early inning blowout.
You can’t blame WPA/LI (a SABR 401 stat) for, well, for not understanding it.
Rather than saying that a stat doesn’t cover it, the best thing to do is to ask: “which stat DOES cover this scenario?” And then I can tell you which one does.
To just out and out presume that a scenario isn’t covered because a SABR 201 stat doesn’t cover it, without knowing that there’s a SABR 301 stat out there, just keeps us going in circles.
Ask where you can find it.
This fellow seems to have his heart in the right place. However, he’s all over the place in terms of trying to get a grasp of WAR, what it means, why Fangraphs and B-R.com are different, and a host of other puzzling statements.
I’ll try to get to these in the morning. I am thankful that he made his post, because I think there must be tons of people as confused as he is, and it gives me something to work with.
Plus, awesome blog name.
UPDATE:
I have successfully deleted this post twice already because of the amazing functions of undo and autosave. Regardless, I hope to incite a forum on some of the sabermetrics that are becoming more ubiquitous as time passes. I have read Tom Tango’s book showing how wOBA is better than AVG, OBP, OPS, etc.
Kind of an odd takeaway from The Book. But, that doesn’t seem to be the issue at hand, so let’s skip that.
Of course there are the constant stream of intermediaries that people use to calculate these statistics, but the one that I’m most hesitant of is WAR. For those that don’t know, WAR (Wins Above Replacement) is an all-encompassing statistic that essentially determines how much a given player is worth. This includes offensive and defensive analyses.
Rather than say how much a given player “is worth”, let’s say “WAR is the number of wins attributed to the player for his past performance”.
I don’t like the idea of how “manufactured” the stat is because it’s essentially an average of an average of an average, etc.
I have no idea what average of an average means, nor the “etc” part. Let’s throw this sentence out the window. The blogger is trying to learn, but I think he’s reaching here for something.
And each statistic that is used in its calculation has limitations and assumptions, which aren’t usually discussed.
EVERY metric has limitations and assumptions, which aren’t usually discussed. OBP values a walk and HR equally. No one talks about this either. SLG has HR at 4 and single at 1, and that’s not discussed. Let’s not set a higher standard for WAR.
I see how it can describe how “valuable” a player was to his team last year, but can it really help when it comes to a player being traded or picked up?
Ah, excellent. Now, we have something to talk about. Can’t we say the same thing about OBP or ERA? By definition, every performance metric measures past performance. That’s what the stat is. If you want to know about the future value of the player, we need to INTERPRET that metric, be it WAR or any other metric.
First thing you have to figure out is: what is the metric actually trying to do.
Or can you simply add the WAR of each player on a team and predict the playoffs for the following year (and maybe the World Series teams)? I don’t think it can stretch that far.
No, you can’t do that.
The data I have below (which I can’t format well for the life of me) are total WAR for each time last year. Now, of course the better teams have better WARs since they were better. The reasoning is a bit circular which I think makes it robust for past analyses but not as useful for the future.
Right, if you are stuck on an unadjusted metric, it’s hard for it to be useful for the future. Same as any other metric.
Anyway, let’s look at them and see how well it did. The first table is from Baseball-Reference.com, and the second is from Fangraphs.com (WAR is also calculated differently at different places, another reason I’m not too high on it).
How well they “did”? Did at what?
As for different calculations: that’s why I call them rWAR and fWAR to show that they are in fact different calculations. They are part of the WAR family. Is it that hard to get past it?
I have them listed as batWAR, pitWAR, and Team WAR. These are the sum of the WARs for each individual position player (batWAR), pitcher (pitWAR), and collective team (Team WAR), respectively.
I can’t seem to write below these, so I apologize for any scrolling that’s necessary. If you look closely, there are some discrepancies. First off, the Fangraphs.com values are higher in general than the Baseball-Reference.com ones.
fWAR is higher than rWAR because fWAR uses a lower replacement level. There’s nothing wrong in either case. Just a reasonably justifiable choice by both systems.
And Fangraphs had the Twins as the best team in baseball. Baseball-Reference had them 5th. Seems to be a decent drop.
Here is probably where the big difference rests: rWAR tries to account for all runs scored and allowed. fWAR does not do that. Basically, rWAR tries to apportion the luck to the players involved, while fWAR largely ignores the luck aspect.
It’s a choice.
Anyway, as a comparison sake, I would say that Baseball-Reference better encompassed the results of last year so I’ll talk about it mainly. I just wanted to show the difference between the sites.
To the extent that luck is a result, and you need to see that luck somewhere somehow, then rWAR would be the better choice. In this particular instance.
Something that first strikes me as interesting is that the Yankees had a better RAR than the Rays in both systems, but Tampa won the division. That seems to be interesting. I can see how WAR would fail when comparing teams that didn’t have much of an effect on the other, but to me, it seems odd the Tampa was not 1st in it’s division’s WAR from either site.
I don’t find that interesting at all, nor is it even a requirement of anything really. The Rays scored 23.6% more runs than they allowed. The Yankees were at 24.0%.
If rWAR or fWAR were more interested in capturing the luck of wins, then, sure, you’d have a case to make. But, that’s not what they are about.
Something impressive from BR (Baseball-Reference) is that the 8 playoff teams were in the top 9 in WAR. Only Boston (who was impressively 4th, meaning the AL East had 3 of the top 4 WAR teams last year) didn’t make the playoffs within the top 9 WAR teams. So, this measure pretty well “predicted” the playoff teams. FG (Fangraphs) didn’t do as well.
The use of predict here is very wrong. When you “predict”, you are making an estimate of a future event. In this case, fWAR is simply representing the runs scored and allowed by the team, and distributing it to the players. Obviously, the teams that make the playoffs will be predisposed to be those teams that score a lot more runs than they allow.
This is another instance of the blogger wanting to learn, but he is stuck on something that he should get out of.
Something else that’s interesting is that of the 8 playoff teams, the Giants had the best pitching WAR according to BR. Seems to coincide with the old belief that pitching is everything in the playoffs.
Again, he’s grasping. n=1.
Actually, if you look closer, within each series from the playoffs, the team with the better pitching WAR won the series. That makes me feel more comfortable about the statistic, but again, these calculation included the successful pitching of those teams so it’s circular. However, it does seem promising.
No, you should forget about all this. None of this is relevant in discussing WAR. It’s fun trivia, but ultimately meaningless in validating WAR.
But I would like to have people talk about these context-neutral statistics. WAR is normalized based on the replacement player of that year, so it’s supposed to comparable across time and leagues.
Eh, sorta-kinda. It compares players to that year’s baseline. Whether that baseline player is identical across time and leagues is debatable.
However, wouldn’t the context change if that player were to change teams? They would play around different defenders which can take away plays from them or cause problems. The new pitching staff could affect a players defense. The ballpark obviously has an effect. And you see different pitching more than likely changing your ability to hit to some degree. Does this not seem to matter?
The exact same thing can be said of any metric. Again, the blogger is grasping here, looking for chinks in an armor.
Also, WAR takes into account some form of fielding statistic, and all of the fielding statistics seem to be a bunch of magic.
Granted, they “seem” like a bunch of magic. But, they have a logical, rational basis.
I’m not saying I know a better way, but not much can be quantitative. Anyway, please respond with thoughts about these statistics and what you feel is successful and appropriate in many discussions. I just feel a bit hesitant, but maybe someone can help ease my discomfort.
Take care.
That first sentence is the key: if you want to discard WAR, and you STILL want to have an opinion, then what do you do? Well, you come up with your own flimsy, half-rational metric, without any internal consistency. You’ll look at someone’s OBP and SLG, maybe his SB, look at his park, apply some visual observations of his fielding and how he looks at bat, see how his team did and say “Yeah, Ryan Howard is pretty good.” That’s really all you are going to do. And the more you try to do, the more rigid you make your system, the more consistent you try to make your ideas, the more logic you apply, well… congratulations, because you are on the path to WAR.
It’s almost like you don’t want to go to WAR, and are trying to figure out how to do it your own way. When your way is simply a circumventing of WAR. And eventually, the more you do the work, the more you realize that, “yup, that WAR is what I’ve been doing all along”.
Really, it’s not like I just came in and said: “This is WAR and this is how it’ll work.” This was a long process to get to where we are. And, if we have to change things, we will. This is not some religion. It’s a result.
And if you don’t want to use WAR, then use whatever else you want to use. But when you are challenged on logic and rationality, then, please, be kind enough to explain yourself. Don’t just say “this sucks” without offering an alternative. That’s what politicians do. Challenge the logic, and the rationale. That we can talk about.
You have to answer this question: I win a million dollars in a lottery. My __ x __ bank account has a million dollars. My __ y __ bank account has zero dollars.
Are you talking about x, or about y? What is it you want? And don’t say both!
***
Let me say something to those who are 100% in the “value” camp. I’ll present an extreme example (but one which nonetheless has happened, however rarely).
A player virtually never comes to bat, spending all his time as a pinch runner. He scores a ton of runs (more runs than times on base).
In terms of “value”, does he get ALL the credit for the run scored, or does some of the credit for the run scored go to the guy who reached base? In a pure and total value (i.e., I won the lottery) argument, I’d think it’s bad luck to the guy who reached base, and the guy who scored the run is the guy who scored the run. It’s his indivisible run.
Indeed, in terms of pure value, the ONLY events that can possibly count are those where a run actually scored. That means the guy who scored, the guy who drove him in, and the guy who moved him over (batter assists).
A leadoff triple where no run scores in the inning is, in effect, of no value whatsoever.
It of course portends real skill. But, it has no value. You can’t cash in that lottery ticket. It’s like you have half a dollar bill, and your friend with the other half of the dollar bill is refusing to tape it with you into one dollar bill.
***
Therefore, it would seem to me, that if you truly believe in “value”, you’ve got to be way on board the runs, rbi, and batter assist train.
But, you probably don’t like that idea, so then you start to couch things into theoretical terms, like run expectancy and linear weights, and leadoff triples where no run scores starts to accrue “value”. When, indeed, no value exists.
***
So, are you really about skill or about value?
Or do you dislike the answers to those two questions so much that you end up making up your own question to fit the answer you are looking for?
I’m having a discussion with Bill James on the virtue of the K minus BB differential, compared to the ratio. Bill took the challenge (sub only). There are 3 things he did there that would cause bias, because he looked at all data from 1947-2010:
1. Because K rates have been on a steady rise, and there’s been a shift in batting average on balls in play between 1992 and 1994 (.280 pre-1992, .300 post-1994), the groupings that Bill has done may have an era bias. This is why I find it helpful to look at the data from 1993-present, because there’s been a pretty strong line drawn there, in terms of looking at unadjusted data.
2. I had said it should be K minus BB per PA, not per IP. It’s not that big a deal, but it still has some bias to it.
3. And of course, HR exploded at the same time BABIP jumped (between 1992-1994). And so, if you have more guys in the high K minus BB per PA group, you’d also have higher HR rates as well.
***
I did my own study. I looked only at 1993-2010. I grouped all data by player, so we have 2513 pitching lines (no PA limit). I calculated a “walk” as BB-IBB+HB. I calculated each pitcher’s strikeout minus walk and divided by plate appearances.
I then grouped them into 5 categories:
Group 4: differential of at least .14 per PA
Group 3: differential of .10 to .14 per PA
Group 2: differential of .06 to .10 per PA
Group 1: differential of .03 to .06 per PA
Group 0: differential of less than .03 per PA
I also calculated their runs allowed per 27 outs (i.e., per 9 IP):
Grp  IPouts  diff/PA  R/27  6-14*diff/PA
4    206352  0.169    3.66  3.63
3    436797  0.116    4.33  4.37
2    866050  0.079    4.80  4.90
1    530103  0.047    5.26  5.34
0    208194  0.009    5.94  5.88
Grp is the groups I noted above.
IPouts is IP*3.
diff/PA is the strikeout minus walk differential per PA
R/27 is runs allowed per 27 outs (or per 9IP)
6 - 14*diff/PA is simply six minus 14 times the diff/PA in the previous column
As you can see, a very strong relationship.
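To put a number on "very strong", here is a small Python check of the five groups against the six-minus-14-times rule. The rows are copied from the table above. Because diff/PA is only shown to three decimals, the recomputed estimates can differ from the printed last column by about a hundredth of a run.

```python
# (group, diff/PA, observed R/27), copied from the table above.
ROWS = [
    (4, 0.169, 3.66),
    (3, 0.116, 4.33),
    (2, 0.079, 4.80),
    (1, 0.047, 5.26),
    (0, 0.009, 5.94),
]

def predicted_r27(diff_per_pa):
    """Rule of thumb from the table: six minus 14 times the differential."""
    return 6 - 14 * diff_per_pa

for grp, diff, observed in ROWS:
    est = predicted_r27(diff)
    print(f"Group {grp}: observed {observed:.2f}, estimated {est:.2f}, gap {observed - est:+.2f}")
```

Every group lands within a tenth of a run of the estimate.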
Bill however upped the ante, and looked at within each differential group at the K/BB ratio.
So, what I did was flag every pitcher with at least a 3 K/BB ratio as “high ratio”, anyone with under a 1.25 K/BB ratio as “low ratio”, and the rest as “average ratio”.
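The bucketing rules can be sketched in Python like this. A few details are my assumptions, since the post doesn't pin them down: a value exactly on a boundary goes to the higher group, the K/BB ratio uses the same "walk" definition (BB-IBB+HB) as the differential, and a pitcher with zero walks is flagged "high" if he has any strikeouts. The two pitching lines are hypothetical.

```python
def walks(bb, ibb, hb):
    """'Walk' as defined in this study: BB - IBB + HB."""
    return bb - ibb + hb

def diff_per_pa(so, bb, ibb, hb, pa):
    return (so - walks(bb, ibb, hb)) / pa

def diff_group(diff):
    # Assumption: boundary values fall into the higher group.
    if diff >= 0.14:
        return 4
    if diff >= 0.10:
        return 3
    if diff >= 0.06:
        return 2
    if diff >= 0.03:
        return 1
    return 0

def ratio_flag(so, bb, ibb, hb):
    # Assumption: the ratio uses the same walk definition as the differential.
    w = walks(bb, ibb, hb)
    if w == 0:
        # Assumption: no walks at all counts as "high" if there is any strikeout.
        return "high" if so > 0 else "average"
    ratio = so / w
    if ratio >= 3:
        return "high"
    if ratio < 1.25:
        return "low"
    return "average"

# Two hypothetical pitching lines, for illustration only.
ace = dict(so=200, bb=40, ibb=2, hb=3)
scrub = dict(so=50, bb=45, ibb=0, hb=5)
print(diff_group(diff_per_pa(pa=850, **ace)), ratio_flag(**ace))
print(diff_group(diff_per_pa(pa=500, **scrub)), ratio_flag(**scrub))
```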
First up is for Group 4:
Ratio  IPouts  diff/PA  R/27  6-14*diff/PA
Aver.   70668  0.157    3.76  3.80
High   135684  0.176    3.61  3.54
As we can see, the high ratio pitchers do indeed get a better value than their differential would suggest, but the difference is pretty small.
Group 3:
Ratio  IPouts  diff/PA  R/27  6-14*diff/PA
Aver.  372113  0.115    4.36  4.40
High    64684  0.126    4.16  4.24
In this case, both of them are pretty much off by the same amount relative to expectations of using differentials only.
Group 2:
Ratio  IPouts  diff/PA  R/27  6-14*diff/PA
Aver.  862868  0.079    4.80  4.90
High     3182  0.078    4.90  4.91
In this case, the lower ratio pitchers are actually the better pitchers.
Group 1:
Ratio  IPouts  diff/PA  R/27  6-14*diff/PA
Low      8204  0.033    5.46  5.54
Aver.  521899  0.047    5.25  5.34
Here, both groups, the Low and the Average ratios, are off by a similar amount from expectations using differentials only.
Group 0:
Ratio  IPouts  diff/PA  R/27  6-14*diff/PA
Low    166582  0.004    6.09  5.94
Aver.   41612  0.027    5.31  5.62
The biggest gap is here, with the differentials suggesting that the Low Ratio and Average Ratio pitchers should be off by only 0.32 runs, but are instead off by 0.78 runs. The worse performing group is the low K/BB ratio pitchers.
Overall, we see that while the ratio may have some additional information for us, a simple and straight strikeout minus walk differential per PA is a great indicator of performance.
Mike gives us the league leaders over the past few seasons, with Carl Crawford leading the way. To answer Mike: he got on base, so definitely it should count. We don’t care “why”, just “what”.
Phil Hartman:
Ladies and gentlemen of the jury, I’m just a caveman. I fell on some ice and later got thawed out by some of your scientists. Your world frightens and confuses me! Sometimes the honking horns of your traffic make me want to get out of my BMW.. and run off into the hills, or wherever.. Sometimes when I get a message on my fax machine, I wonder: “Did little demons get inside and type it?” I don’t know!
My primitive mind can’t grasp these concepts. But there is one thing I do know - when a man like my client slips and falls on a sidewalk in front of a public library, then he is entitled to no less than two million in compensatory damages, and two million in punitive damages. Thank you.
***
Blogger says:
The reason that batting average was the predominant statistic in the first hundred years of baseball was that it involved such a simple calculation: it’s easy enough to go back to the dugout after a 2-for-5 day and say, “Hey, I’m batting .400!” On the other hand, could I ever sit in the dugout without pen or calculator and figure out my wOBA?
Instead of “1” for a walk or hit batter, think “0.7”.
Instead of “1” for a single or reaching on error, think “0.9”.
Instead of “1” for a double or triple, think “1.3”.
Instead of “1” for a HR, think “2.0”.
(Notice how everything revolves around “1”, and the scale of it seems at least reasonable, and easy enough to remember?)
So, say you go 2 for 5, with a HR. In the dugout, you are thinking “I’m 2 for 5, but with a HR!”. Well, now you can think you are 2.9 for 5. Is it really that hard?
What if you are 1 for 3, with a walk? In the dugout, you are thinking “I’m 1 for 3, but with a walk too!”. Well, now you can think you are 1.6 for 4. Again, is it really that hard to do?
What if you went 0 for 4, but you got on base twice on error?(*) In the dugout, you are thinking “I may be 0 for 4, but I was really 0 for 2, and twice I got on base via error.” Well, now you can think that you were 1.8 for 4. Again, that hard to do?
(*) And yes, who cares if you got on base via error. If you got on base via a bloop single or a single off the wall, you don’t make the distinction in the dugout do you?
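If you would rather let a machine do the dugout math, the same tally looks like this in Python. The event labels and the function name are mine; the weights are the ones listed above, and the three lines are the three examples just given.

```python
# "Dugout wOBA": credit each plate appearance with its weight instead of 0 or 1.
DUGOUT_WEIGHTS = {
    "BB": 0.7, "HB": 0.7,
    "1B": 0.9, "ROE": 0.9,
    "2B": 1.3, "3B": 1.3,
    "HR": 2.0,
    "OUT": 0.0,
}

def dugout_line(events):
    """Return (credit, plate appearances), i.e. the 'X for Y' you'd say in the dugout."""
    credit = sum(DUGOUT_WEIGHTS[e] for e in events)
    return round(credit, 1), len(events)

examples = {
    "2 for 5 with a HR": ["1B", "HR", "OUT", "OUT", "OUT"],
    "1 for 3 with a walk": ["1B", "BB", "OUT", "OUT"],
    "0 for 4, twice on error": ["ROE", "ROE", "OUT", "OUT"],
}
for label, events in examples.items():
    credit, pa = dugout_line(events)
    print(f"{label}: {credit} for {pa}")
```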
So, what is it that people want? I’m giving you a very simple way to go from OBP to wOBA. You can do the calculation in your head. But it seems that people still want OTHER people to do the calculation, to figure out what is an official at bat and not. To figure out if SF should appear in the denominator (OBP) or not (SLG, BA). To figure out OBP and SLG. Heck to even add OBP and SLG. And they won’t be able to tell you all of the rules for how to calculate OBP, SLG, or BA anyway.
It sounds like some baseball fans want inertia in their rules and stats. And they are “frightened and confused” when presented with something that is, at its heart, pretty straightforward, because it requires them to take a small tangent.
OPS is fine, if that’s all you have. But, don’t be too serious about it. Think of OPS as your f-buddy.
Nothing new for the regulars here, but, it seems there’s always some fresh blood, so, a little update is always good. My comments:
1. The coefficients for SB and CS are roughly 0.25 and -0.50. You use them if you need them.
2. The data for all the events through 2008 is here:
http://tangotiger.net/bdb/lwts_woba_for_bdb.txt
The blog post that describes the details of wOBA, including SQL, is here:
http://www.insidethebook.com/ee/index.php/site/article/woba_year_by_year_calculations/
3. Hit batters are mostly random, regular walks are somewhat random, and intentional walks are not random. Hence, HBP are more likely to happen with a runner on 1B, a regular walk might happen with a runner on 1B, and IBB almost never occurs with a runner on 1B.
4. When you reach base on error, you can reach 2B, 3B, or even home plate. A single where you stretch to 2B is actually a double (d’uh). And we don’t care about who was responsible. We are just counting “what happened”. Otherwise, we’ll be talking about a batter being very responsible for a K, mostly responsible for a BB, largely responsible for a HR, somewhat responsible for a 2B, and a flip of the coin on a 1B.
That’s not the discussion wOBA is having, any more than OBP or SLG is having that discussion either. wOBA simply represents WHAT happened, without asking WHY.
5. 1.7*OBP + SLG is going to be very close to wOBA.
6. Correlation? Well, if most players are around average, then all correlation is going to do is tell you that, yup, a good wOBA means a good OPS. Again, d’uh. But, what does it mean for players at the extreme, with a lopsided OBP/SLG view? Nothing, because you can’t extrapolate.
A few years ago, Patriot wrote an article about the proliferation of breathless inventions of the same thing, all centered around the bases per out or bases per PA idea (which you can extend to bases per hit, or really just about dividing any two related numbers). Colin followed that up with essentially a part 2. Indeed, on the top of this page is a “required reading” that links to those two articles. No one should attempt any kind of mathematics until they read those two articles.
I just saw yet another article on dividing two things, and how the presentation was somehow novel. Basically uttering Ernie Banks’ non-saber credo: “let’s divide two!”
This post is just a reminder that there is required reading. You can start with the link at the top of any page here. And then pick up Panas’ Beyond Batting Average. And maybe move up to Eric’s book. Let those readings inspire you to build on what’s been built, or create something new.
***
In terms of bases per PA, the “ultimate” version is wOBA:
= 0.7 * (BB+HB)
+ 0.9 * (1B + Errors)
+ 1.3 * (2B+3B)
+ 2.0 * HR
+ 0.25 * SB
- 0.50 * CS
all divided by PA
That’s the basic model. You can tweak things a bit, and separate out the 2B and 3B if you want. And adjust by run environment. But, it basically comes down to that. If you see something different, it’s just a way of trying to get to that.
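As a rough sketch, the basic model above translates directly into a few lines of Python. The function name `basic_woba` and its argument names are hypothetical, and the season line in the usage example is invented purely to show the arithmetic; it is not any real player’s data. The coefficients are the rounded, fixed ones from the formula, with no run-environment adjustment.

```python
def basic_woba(bb, hbp, singles, roe, doubles, triples, hr, sb, cs, pa):
    """Basic wOBA with the rounded, fixed coefficients from the post.

    roe = times reached on error. No adjustment is made for run
    environment, and 2B and 3B share one weight, as in the basic model.
    """
    numerator = (
        0.7 * (bb + hbp)
        + 0.9 * (singles + roe)
        + 1.3 * (doubles + triples)
        + 2.0 * hr
        + 0.25 * sb
        - 0.50 * cs
    )
    return numerator / pa


# Invented season line, for illustration only.
woba = basic_woba(bb=60, hbp=5, singles=100, roe=8, doubles=30,
                  triples=3, hr=25, sb=10, cs=4, pa=650)
print(f"{woba:.3f}")
```

Dropping the SB and CS arguments to zero gives the batting-only version.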
His takedown of batting average is a re-run, but worth re-reading. Imagine being in court, having the prosecutor read Poz’s post on batting average, and you on the defense have no choice but to say: “we accept that statement as factual”. And then after accepting the facts, you argue that it makes sense that batting average is one of the prime stats to use.
***
One of the coolest stats out there is WPA, which stands for Win Probability Added, which is a name that I don’t think helps the cause much. There are certain words that scare the bejeebers out of people. Linear Weights were like that for me. I would see anything mashing those words together—“linear” and “weights”—and I would kind of freak out.
I’d be happy to hear of alternatives. WPA is the change in win expectancy assigned to the players involved in the play. I agree that the description may sound off-putting. Suggestions?
***
I cracked up here:
The stat wOBA looks scary because any word where you make the first letter lower case and the rest upper case is scary. It doesn’t matter how harmless or happy the word really is. Look:
eLMO
bABY
fARVE
The one thing I want to say about wOBA is about the number “1”. The numerator of wOBA is positive events. An average positive event is “1”, just like in OBP. Actually in OBP, EVERY positive event is a “1”, be it a walk or a HR.
In wOBA, we see that a single is just a bit worse than an average positive event. That’s why you give it 0.9. A walk is a lot worse than an average positive event, so it gets 0.7. A HR is far better than an average positive event, so it gets 2.0. Basically, the whole thing is centered around “1”. An average positive event is 1, and then you simply use that as the centering point.
I know way back when wOBA was first introduced in The Book, there was a lot of pushback that I should have used batting average as the scale. Except I couldn’t do that. If I did that, then the average positive event would have to be 0.8. A single would be 0.7, a HR would be 1.6. See? There’s just nothing really to center everything. Indeed, a double would be worth 1.0. Not to mention that OBP, not batting average, is what baseball is all about. OBP already has plate appearances (PA) as the denominator. If I scaled wOBA to batting average scale, how can I explain that the denominator of this metric would be PA? Anyway, that’s why wOBA makes sense to me.
FIP is a wonderfully simple stat that owes its existence to DIPS. Its raison d’être was SOLELY to represent a subset of a pitcher’s current season performance. It made no claim as to its predictability. That’s why the HR is weighted so high: because HR generate a lot of runs.
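To see the centering problem in numbers, here is a small sketch. It assumes that putting wOBA on a batting-average scale amounts to multiplying every weight by 0.8 (the “average positive event would have to be 0.8” figure above); that is one reading of the paragraph, not a documented procedure. The names `OBP_SCALE_WEIGHTS` and `rescale` are made up.

```python
# Rounded wOBA weights on the OBP scale, where an average positive event is 1.
OBP_SCALE_WEIGHTS = {"BB": 0.7, "1B": 0.9, "2B/3B": 1.3, "HR": 2.0}


def rescale(weights, average_event):
    """Multiply every weight so the average positive event equals average_event."""
    return {event: round(w * average_event, 1) for event, w in weights.items()}


# Assumption: batting-average scale means an average positive event of 0.8.
ba_scale = rescale(OBP_SCALE_WEIGHTS, 0.8)
for event, weight in ba_scale.items():
    print(event, weight)
```

On that scale the single lands at 0.7, the double at 1.0 and the HR at 1.6, matching the numbers in the paragraph, and nothing obvious sits at the center anymore.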
Now, some people would like to know how “real” each of those components is, so that they can use FIP for predictive purposes. Well, good ole FIP is not going to do the job (even though it does a pretty good job in any case). No, what we need is FutureFIP. And that’s what I’ll attempt to do.
***
Note: when I say BB, I mean BB-IBB+HBP.
Let’s say we want to keep to the basics, and use IP as the denominator. How do we change the classic FIP:
ERA = (13*HR + 3*BB - 2*SO)/IP + 3.2
Here then is my first stab at…
FutureFIP = (6*HR + 2*BB - 2.5*SO)/IP + 5.12
(Note: FutureFIP is on a scale of RA9 and NOT ERA.)
So, what do we see here? Well, SO become more predictive, BB less predictive, and HR much less predictive. But, and this is important, HR still count. You don’t give it a weight of 0.
***
Now, we really don’t like IP in that denominator, right? Let’s present a FutureFIP equation with PA in the denominator. So, we have:
FutureFIP = 4*(6*HR + 2*BB - 2.5*SO)/PA + 5.10
***
As always, that 5.1ish floats to match the league RA9.
That’s my first volley. Someone want to do better?
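For anyone who wants to play along, here is a minimal Python sketch of the three equations in this post. The function names are hypothetical, `bb` is assumed to already be BB - IBB + HBP as per the note above, and the default constants are the ones quoted in the post; in practice they would float to match the league. The pitcher line in the usage example is invented purely to exercise the arithmetic.

```python
def classic_fip(hr, bb, so, ip, constant=3.2):
    """Classic FIP, on an ERA scale. bb is BB - IBB + HBP."""
    return (13 * hr + 3 * bb - 2 * so) / ip + constant


def future_fip_per_ip(hr, bb, so, ip, constant=5.12):
    """First-stab FutureFIP with IP in the denominator, on an RA9 scale."""
    return (6 * hr + 2 * bb - 2.5 * so) / ip + constant


def future_fip_per_pa(hr, bb, so, pa, constant=5.10):
    """First-stab FutureFIP with PA in the denominator, on an RA9 scale."""
    return 4 * (6 * hr + 2 * bb - 2.5 * so) / pa + constant


# Invented pitcher line, for illustration only.
hr, bb, so, ip, pa = 20, 50, 180, 200, 820
print(round(classic_fip(hr, bb, so, ip), 2))
print(round(future_fip_per_ip(hr, bb, so, ip), 2))
print(round(future_fip_per_pa(hr, bb, so, pa), 2))
```

Keeping the constant as a parameter makes it easy to recalibrate to whatever league RA9 (or ERA, for classic FIP) you are matching.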