Thursday, April 23, 2009
Historical WAR for MVP analysis
One of my favorites, John Walsh, using the work of one of my favorites, Rally, and my favorite measure ever, WAR, to look at the 1974 MVP race.
Tom Hanrahan (pdf) seems to think so:
In summary, all Run Estimators overstate leadoff batters contributions to team runs scored. By about 10%. That ain’t small potatoes when figuring out what makes teams win.
He reports what happened when he inserted a great hitter as the leadoff hitter (+74 runs), the #3 hitter (+77 runs), the cleanup hitter (+76 runs), the #6 hitter (+69 runs), and the #9 hitter (+63 runs). The same hitter, in different slots, led his team to score that many more runs. This is his run impact. So, generally speaking, this great hitter should come out at roughly +74 runs or so.
Now, let’s say this hitter’s standard Linear Weights is +.11 runs per PA. As a leadoff hitter, he gets a lot more PA, so he might have, say, 750 PA, while as a #5 hitter he’d have 72 fewer PA (or 678 PA). He ends up with .11*72 = 8 more runs if you use his performance stats as a leadoff hitter.
Tom is totally right. Indeed, in Table 51 of The Book, I present the run values by event, by batting order. Per PA, the run values in the leadoff slot are about 5% below the average, while those in the #5 slot are 5% higher.
If, however, you look at it per game (rather than per PA), the run values of the hitters in lineup slots 1 through 5 are fairly constant (which explains almost totally why you want your five best batters in the top 5 slots, and why mixing them up in there doesn’t really make much difference).
So, the solution is to look at things per game rather than per PA. When you look at Grady Sizemore and the other leadoff hitters with 750 PA, they do get an advantage in standard linear weights.
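As a rough illustration of why the per-PA discount and the extra PA roughly cancel, here is a minimal sketch using the numbers above. It assumes the -5% / +5% slot effect scales the hitter's per-PA run value directly, which is a simplification and not how Table 51 is built (that table is by event).

```python
# Numbers from the example above: a +.11 runs/PA hitter, 750 PA leading off,
# 678 PA batting fifth.
LWTS_PER_PA = 0.11
PA_LEADOFF = 750
PA_FIFTH = 678

# Assumption for illustration only: the slot effect is a flat multiplier
# on the hitter's per-PA run value.
SLOT_FACTOR = {"leadoff": 0.95, "fifth": 1.05}


def season_runs(per_pa, pa, factor=1.0):
    """Season runs above average for a given per-PA value, PA count, and slot factor."""
    return per_pa * pa * factor


# Standard Linear Weights: the leadoff hitter is credited with extra runs
# purely because he comes to the plate more often.
unadjusted_gap = season_runs(LWTS_PER_PA, PA_LEADOFF) - season_runs(LWTS_PER_PA, PA_FIFTH)

# With the slot factors applied, the two season totals land close together.
adjusted_leadoff = season_runs(LWTS_PER_PA, PA_LEADOFF, SLOT_FACTOR["leadoff"])
adjusted_fifth = season_runs(LWTS_PER_PA, PA_FIFTH, SLOT_FACTOR["fifth"])

print(f"unadjusted gap: {unadjusted_gap:.2f} runs")
print(f"adjusted: leadoff {adjusted_leadoff:.2f}, fifth {adjusted_fifth:.2f}")
```

Under that assumption the unadjusted gap is about 8 runs, while the adjusted season totals differ by less than a tenth of a run.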
However, a play-by-play metric, like WPA / LI * boLI does not have this problem. And I agree with Tom that we should make the necessary adjustment. However, it will not come out to 10%. It’ll be +/- 5%.
A reader wrote, in full:
I recently came across this article at The Hardball Times: http://www.thehardballtimes.com/main/article/the-great-run-estimator-shootout-part-1/ , and it got me thinking a bit about wOBA. I was initially sold on wOBA because of its really solid conceptual background (especially when compared to EqA, which I can’t make any sense of) and the way it helps correct for the shortcomings of OPS and distinguishes between players who may have the same OPS but different OBPs and SLGs.
The above article, however, confused me because it seems to indicate that wOBA correlates worse (albeit very, very slightly) with run scoring than does OPS, OPS+ and RC. The mean average error and RMSE were also slightly larger for wOBA than for OPS and OPS+.
This got me thinking about how to measure how “good” of a stat wOBA is. Since it definitely seems to be a stat geared towards individuals, I’m not sure it makes sense to correlate it to runs scored on the team level (as I think was done in the article, though it was done per inning). Also, I’m not even sure it is intended to correlate with runs at all, considering that it doesn’t claim to generate the actual runs a player was worth, but rather how many runs he would be worth if all of his offensive contributions took place in an average setting. If this is the case, and it isn’t actually meant to correlate with runs scored, how do we test the quality of the stat, or really, the quality of any stat which focuses on individual performance?
Does the strength of wOBA lie entirely in its intuitive derivation and the way it modifies and complements OPS, or is there some other way to measure its quality? Or, is it the case that even if wOBA makes perfect intuitive sense, at the end of the day it isn’t as important a stat as some of its seemingly inferior counterparts, such as OPS, if it does not correlate as well with run scoring as they do?
Sorry, this e-mail was pretty long and I know you’re very, very busy. Hopefully you get a chance to read it and respond. Thanks in advance for your response and for all the great content you post up on the web.
My response:
If you start with Linear Weights, you will get to exactly 100% wOBA. They are identical stats. At the same time, if you start with wOBA, you can get to Linear Weights exactly 100% of the time. They are identical. Whereas wOBA presents data in two dimensions (the rate portion of wOBA, and the number of PA), Linear Weights presents it in one dimension.
The conversion of Linear Weights (denominated in runs, but based on runs above/below “average”) to total Runs Created (denominated in runs, but based on absolute number of runs) requires two separate processes, depending on whether you care about total team runs scored, or whether you care about total “player” runs scored. The second dimension of Linear Weights requires either the use of PA or outs, depending on what translation you are making. The allocation of how many runs an “average” hitter creates requires great care to ensure that the value of the outs (the inning killer) is handled properly.
I do not know what Colin did in order to convert wOBA to total runs created. Seeing that wOBA is Linear Weights, I expect an r = .999 between any results coming out of wOBA and Linear Weights. If you don’t see that, then there’s a gap somewhere in the translation.
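To make the round trip concrete, here is a minimal sketch of a commonly used form of the conversion, where runs above average equal (wOBA minus league wOBA) divided by a scale factor, times PA. The league wOBA of .330 and the scale of 1.15 are assumed values for illustration; the real figures vary by season.

```python
# Assumed constants for illustration; actual values change year to year.
LEAGUE_WOBA = 0.330
WOBA_SCALE = 1.15


def woba_to_runs_above_average(woba, pa):
    """Collapse the two dimensions (rate, PA) into one number of runs above average."""
    return (woba - LEAGUE_WOBA) / WOBA_SCALE * pa


def runs_above_average_to_woba(runs, pa):
    """Invert the conversion: recover the rate from runs above average and PA."""
    return runs / pa * WOBA_SCALE + LEAGUE_WOBA


# Hypothetical hitter: .400 wOBA over 700 PA.
runs = woba_to_runs_above_average(0.400, 700)
woba_back = runs_above_average_to_woba(runs, 700)

print(f"runs above average: {runs:.1f}")
print(f"recovered wOBA: {woba_back:.3f}")
```

Because each function is the exact inverse of the other, nothing is lost going in either direction, which is the sense in which the two stats are identical.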
A related thread can be found here.
Apparently, Mario Mendoza now comes in at replacement-level, or worse, every year of his career. Just picking out a name at semi-random, Grady Sizemore has a 24.5 WARP, a 26.1 WAR on Fangraphs, or 27.2 WAR as per Rally.
Rally and Fangraphs come closest to what I do, so if Clay has managed to pretty much align himself to them, then I approve. It’s still up in the air how he handles the positional adjustments, so until Clay explains himself, or addresses the issues I brought up in the other thread, WARP still has to be taken with a grain of salt (which is better than the pound of salt you needed before today).
What to make of those people who were defending (the old) WARP? This is the same issue I have with defenders of say Runs Created. Once Bill James finally comes around and admits he was wrong with RC, then what are all his disciples going to do? This is just like 1984, where people defend whatever they are told by people they respect or were bewitched by, even if they don’t understand the underlying logic. People will quote PECOTA percentiles, but what happens when Nate disavows how he created them (which he’d have to if he was being honest with us)?
Be critical with what you see, and if the creators won’t explain themselves enough to defend their work, then don’t preach in their choir.
Colin gives it a go (part 1).
Surprise! BaseRuns demolishes the field.
The other linear estimators are all pretty similar. I’m not sure how he tuned each equation on an annual basis (team, game, or inning basis?). wOBA, reg, and House should all come out pretty much the same. It also depends on how he handled IBB, since wOBA explicitly ignores it as an event.
As Homer says of donuts, I say of Linear Weights: “Is there anything they can’t do?” Dave Allen gives us Field Blobs:
My comment:
Dave, fantastic. One request please: show those maps based on the handedness of the batter (or even batter/pitcher). Ideally, we should see the red blobs shifted over, with patches of green more clearly separating them.
I have spent a lot of computing time (it takes forever to “run”) trying to figure out what the league average rates for the various offensive categories are as a function of the base/out state. Any differences from overall rates would be due to batters and pitchers changing their approach and the fielders being in different positions and perhaps doing different things when the ball is hit to them.
It appears from the data that the batters trump the pitcher in terms of their approach (although, as you will see, there are times when the pitchers trump the batters, as with BB and HP rates), or perhaps the pitchers don’t “care” as much as the batters, and perhaps rightfully so. For example, with a runner on second base and no outs, batters in general (lefties and righties, BTW), do in fact hit more balls to the right side, even though to some extent the pitcher is trying to “force” them to hit to the left side. We mostly see that effect with no outs, BTW, which is pretty much what we would expect. If a commentator ever tries to say that batters try and hit to the right side with a runner on second and ONE out, don’t believe him (it is certainly possible that that happens SOME of the time - but on average, there is not much evidence of that).
For all situations other than at least a runner on second and no outs, batters hit fly balls to the left side .51 or .52 of the time and ground balls .55 or .56. This is RHB and LHB combined. With no outs, however, and a runner on second only, it is .47 and .48 (FB and GB), respectively. With first and second and no outs, it is .53 and .52, so fly balls are normal and ground balls are hit more to the right side, I guess to help stay out of the DP, or maybe as a function of the hit and run. With 2nd and 3rd, it is .5 and .55, so ground balls are normal and fly balls are a little to the right side. With a runner on third in general, we see more fly balls to the right side. My guess is that with zero and one out and a runner on 3rd, the batter is trying to hit a fly ball, which results in more fly balls to the opp field for both lefties and righties. In fact, with a runner on 3rd only, we see .48 for FB. With one out, we see .5 and with 2 outs, we see .51.
Sky gives us the breakdown. And, I must say, I love the presentation.
2002-2007, WAR. 1st number from Fangraphs, second number from Chone:
21, ?? Beckett (70)
14, 13 Nick Johnson (80)
8, 6 D’Angelo Jimenez (67)
Win Shares in parens.
Does anyone believe at all that the contributions of Jimenez could approach that of Beckett?
List inspired by John Sickels and football Stu.
The answer isn’t that WS hates Beckett. It just hates starting pitchers.
Good post by Patriot. As he notes, it all depends. I think about it a bit as well, and I flip around a bit as to what I’m trying to do when I’m trying to measure “power”. There are a lot of right answers here.
Devil Fingers looks at what the Yankees have, had, and could have.
Rally does what I’ve been meaning to do since forever. I’m glad he’s spent the time to do it for all of us.
Devil Fingers gets his hands dirty and looks at the differences between what VORP does and what a tweaked version offers. He concludes with an open letter to Rany:
Dear Rany, If this does get to you, please advocate VORP being fixed/put out to pasture in its present form. The tools with which to do so are there, as has been pointed out. Even if Dan Fox can’t give you guys Simple Fielding Runs anymore (which would fix the embarrassing issues regarding defense and BP), you could at least use RARP (based on EqA/R) in your team audits and as your flagship stat, since it is a stand-in for the correct linear weights (albeit, in the case of EqA, an overly complicated one).
Rany is also another fine gentleman over at BPro, willing to listen. Though at this point, I have to believe that Clay will be moving and shaking things, no matter how esoteric things may seem to those not knee-deep in this.
Finally! Clay has been bombarded by me on this blog regarding his low use of replacement level. Based on his article in BP09, he’s acknowledged that he’s basically the last man standing in terms of belief in his low replacement level.
Once he took the first step (moving replacement level from around -3.5 or -4 wins to -2 wins), I knew what his second step would lead to. And I felt his frustration, and he was totally right that he didn’t want to force the issue that the difference between average and replacement level was the same for every position, every year. He even cited my oft-made remark that in the 1950s, the CF outhit the corner OF, and so, it would be foolish to make the average CF be equal to the average corner OF, given the fact that the average CF both outhit, and (must have) outfielded them.
I felt it. You could really feel his frustration on the matter in his words.
He finally gave in. He accepted that the average non-pitcher is 22.11 runs per full season better than replacement-level. (This is virtually identical to what I use). He accepted that the average starting pitcher should be compared to a baseline that is 1.25 runs above the league average, which is again virtually identical to what I use. I don’t think Clay has the correct replacement-level for relievers (based on his text, he is definitely wrong about it, but all he needs is to read a couple of pages from The Book to realize his error).
So, kudos to Clay for embracing what the rest of us are accepting. Now, he joins the issue of how to balance the baseline levels across positions. This is something he thinks about a lot, as you can see by how he wrote his chapter. However, I don’t think he’s got it yet. Here are the technicals of why I think he needs more tweaking done:
Here’s John’s work on avoiding the double play.
Another possible reason for those sluggers appearing on his list is that they bat left-handed. LHH with a man on 1B have the advantage of the hole (p.323 The Book):
How about lefties and righties? Left-handed batters have a 20-point [wOBA] advantage, while righties have a 10-point advantage. In this case, it’s easy enough to understand: the hole on the right side is there for the taking, and lefties are able to take advantage of that more than righties.
No surprise if this also includes avoiding the DP.
Little technical note: DP should include TP, since a TP is a DP-plus. Given the rarity of the TP, it’s understandable not to include it.
It’s very simple: record the number of times that a batter moved over a runner who will eventually score, but for whom he did not get credit for an RBI. The leader in 2008 is: Justin Morneau, with 73 Batting Assists. (Can you guys think of a better name?) There were 65 runners that he moved into scoring position (by hit or out) that scored in a subsequent at bat. And there were 8 runners that scored while he was batting (by out), for which he did not get an RBI.
How about if a batter does NOT advance a runner even one base (or worse, gets him doubled off)? The MLB leader in 2008 was: Jeff Francoeur with 319 runners who were blocked. I’ll call these Batting Blocks.
Finally, the leader in the ratio of Batting Assists to Batting Blocks is: Joe Mauer. The league average ratio is 1 assist per 5 blocks. Mauer had 69 assists and 170 blocks. In “Linear Weights speak”, he was +29, followed by Ichiro and Carlos Guillen at +22. On the bottom side, we have Francoeur and Corey Hart at -21.
Since 1993: the leaders are Barry Bonds at +258 and Derek Jeter at +231. At the bottom of the pile is Tony Batista at -128.
And a special shout-out to John Smoltz, who moved 99 runners who eventually scored and blocked only 261.
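The post does not spell out how the “Linear Weights speak” figure is computed. One reading that reproduces Mauer’s +29 is assists above what a league-average batter would collect in the same number of chances, where a chance is an assist or a block. The sketch below is that assumption, not a confirmed formula, and the function name is made up for illustration.

```python
# From the post: the league averages 1 Batting Assist per 5 Batting Blocks.
LEAGUE_ASSISTS_PER_BLOCK = 1 / 5


def assists_above_average(assists, blocks):
    """Assists minus the league-average expectation over the same chances.

    Assumption: a 1-to-5 assist/block ratio means 1 assist per 6 chances.
    """
    chances = assists + blocks
    league_assist_rate = LEAGUE_ASSISTS_PER_BLOCK / (1 + LEAGUE_ASSISTS_PER_BLOCK)
    return assists - chances * league_assist_rate


# Joe Mauer, 2008: 69 assists and 170 blocks (figures from the post).
mauer = assists_above_average(69, 170)
print(f"Mauer: {mauer:+.1f}")
```

With 239 chances, a league-average batter would be expected to collect just under 40 assists, so Mauer’s 69 come out at about +29.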
Colin brings it together.
I posted this on USSM, but I figure a few newcomers might want to have a quick reason to eschew RBIs:
You have to ask yourself what do RBIs represent. They are a combination of: a player’s hitting talent, the number of runners he has on base, and the timing to match these two.
wOBA, EqA, Linear Weights (LWTS), etc, all give you the player’s hitting talent. The number of runners a hitter sees on base has nothing at all to do with his talent level. So, you try to remove that aspect, and you get RE24 (which you can find at Fangraphs), which is Linear Weights by the 24 base/out states.
So, if you have RE24, then RBIs become useless, as they contain no information about the player’s hitting talent and his timing that you cannot find in RE24. RBIs are a subset of RE24.
The question then becomes: LWTS or RE24? You can make a case for either.
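For newcomers, here is a minimal sketch of how RE24 credits a single play: the change in run expectancy between the base/out state before and after the play, plus any runs that scored. The run-expectancy values below are hypothetical placeholders, not taken from any published table, and only a handful of the 24 states are filled in.

```python
# Hypothetical run-expectancy values keyed by (bases, outs).
# "---" = bases empty, "1--" = runner on first, "-2-" = runner on second.
RUN_EXPECTANCY = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("-2-", 0): 1.15,
    ("---", 1): 0.27,
    ("1--", 1): 0.53,
    ("-2-", 1): 0.70,
}


def re24_for_play(start_state, end_state, runs_scored):
    """Run value of one play: RE after, minus RE before, plus runs scored.

    end_state is None when the play ends the inning (run expectancy of zero).
    """
    re_end = 0.0 if end_state is None else RUN_EXPECTANCY[end_state]
    return re_end - RUN_EXPECTANCY[start_state] + runs_scored


# The same single in two situations, both with one out.
# Runner on second scores, batter ends up on first: 1 RBI.
single_with_runner = re24_for_play(("-2-", 1), ("1--", 1), runs_scored=1)
# Bases empty, batter ends up on first: 0 RBI.
single_bases_empty = re24_for_play(("---", 1), ("1--", 1), runs_scored=0)

print(f"single, runner on 2nd: {single_with_runner:+.2f}")
print(f"single, bases empty:   {single_bases_empty:+.2f}")
```

The RBI count sees only the first single. RE24 credits both, and it credits them differently according to the situation the batter was handed.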
Devil Fingers looks at Situational Wins (WPA/LI), and compares that to Linear Weights (wRAA) to see if there’s anything there.
Great job. I’m looking forward to seeing multi-year totals.