Thursday, August 14, 2008
Fixing VORP
UPDATE: This blog entry (not an article) has been linked from several places, and there are questions from non-LWTS followers about what the weights should be. One place to find them is the last line of this page. Had I intended to write this as an article, I would have been more complete in my description. I apologize to those who stumbled along here for the first time. Basically, anyone coming to this blog is walking into the middle of a conversation. Feel free to interrupt and ask a question.
***
Baseball Prospectus undervalues walks. By how much? As much as OPS. Don’t believe me? Let me walk you through the steps:
1. Start with a standard batting line, like here. That gives you a .269/.333/.430 line. That’s your baseline.
2. Figure out the new batting line, if you add 1 PA, 1 AB, 1 H. Figure out the new batting line if you add 1 PA, 1 AB, 1 H, 1 2B. Figure out the new batting line if you add 1 PA, 1 BB. And so on. This is what we typically call the Plus1 Method.
3. For each batting line, figure out the VORP or MLV. I’m using what I find on Woolner’s site, which he’s kind enough to distill into one formula:
MLV_FULL =GAMES*OUTS*(1/9 * (8*L_OBP+P_OBP) / (9-8*L_OBP-P_OBP) *((8*L_SLG*(1-L_OBP))/(1-L_AVG) + P_SLG*(1-P_OBP)/(1-P_AVG)) - L_OBP*L_SLG/(1-L_AVG))
4. Subtract the MLV_FULL for each batting line from the baseline one. You’ll get a differential for the batting line where you added one single, the differential for the batting line for doubles, for walks, HR, and outs.
5. In order to align it to something meaningful, multiply all these differentials by a constant, such that the resultant outs value is -.28. It doesn’t really matter much, but at least it gives you something reasonable to look at.
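For anyone who wants to replay the five steps, here is a minimal Python sketch. The counts in `BASE` are hypothetical, picked only to land near the .269/.333/.430 baseline; they are not the counts behind the link in step 1. PA is simplified to AB + BB, and GAMES*OUTS is folded into a single `scale` argument held constant, which is an assumption (it washes out anyway once step 5 rescales everything). The function and variable names are made up for illustration.

```python
# Hypothetical league totals, chosen only to land near .269/.333/.430.
BASE = {"ab": 5500, "h": 1480, "tb": 2364, "bb": 525}

# Step 2: what one extra event adds to each counting stat.
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}


def rates(c):
    """Return (AVG, OBP, SLG), treating PA as AB + BB (a simplification)."""
    pa = c["ab"] + c["bb"]
    return c["h"] / c["ab"], (c["h"] + c["bb"]) / pa, c["tb"] / c["ab"]


def mlv_full(player, league, scale=1.0):
    """Step 3: the quoted MLV_FULL formula; `scale` stands in for GAMES*OUTS."""
    p_avg, p_obp, p_slg = rates(player)
    l_avg, l_obp, l_slg = rates(league)
    lineup = (1 / 9) * (8 * l_obp + p_obp) / (9 - 8 * l_obp - p_obp) * (
        8 * l_slg * (1 - l_obp) / (1 - l_avg)
        + p_slg * (1 - p_obp) / (1 - p_avg)
    )
    return scale * (lineup - l_obp * l_slg / (1 - l_avg))


def plus1_weights(out_value=-0.28):
    """Steps 2, 4 and 5: add one event, take the difference, rescale."""
    base_mlv = mlv_full(BASE, BASE)
    diffs = {}
    for event, add in EVENTS.items():
        line = {k: BASE[k] + add.get(k, 0) for k in BASE}
        diffs[event] = mlv_full(line, BASE) - base_mlv
    k = out_value / diffs["out"]  # constant that forces the out to -.28
    return {event: d * k for event, d in diffs.items()}


if __name__ == "__main__":
    for event, value in plus1_weights().items():
        print(f"{value:6.2f} {event}")
```

Because the baseline here is a stand-in, expect the output to land near the published values rather than match them to the last digit.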
I did all that. And this is what I get:
0.51 1b
0.80 2b
1.09 3b
1.37 hr
0.22 bb
(0.28) out
One of these things is not like the other, one of these things doesn’t belong. Can you tell which thing is not like the other, before I finish this song?
Fans of Linear Weights are pretty familiar with the above numbers. A couple are slightly off, but I won’t get too worked up about a .02 run difference here or there. But the run value of the walk is .10 runs off. A guy who walks 100 times when the league average is 60 will be undervalued by 4 runs (40 extra walks times .10 runs).
Now, you may say “4 runs, big deal”. Well, ok, if that’s the case, you may as well leave.
But, if you are going to go to the complicated formula to create your run estimate, shouldn’t you be rewarded for that complexity with extra accuracy? Why not simply use Linear Weights then?
Here is what OPS says using the Plus1 Method:
0.45 1b
0.83 2b
1.22 3b
1.60 hr
0.23 bb
(0.28) out
The same low run value on the walks. OPS is even worse because it overvalues the HR by a very large margin. (See why I hate OPS? It doesn’t represent how runs are produced. OPS: Begone.)
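The OPS numbers can be approximated the same way. This is a sketch under the same assumptions: hypothetical counts in `BASE` chosen to sit near .269/.333/.430, PA treated as AB + BB, and illustrative function names.

```python
# Hypothetical league totals near .269/.333/.430 (not the original data).
BASE = {"ab": 5500, "h": 1480, "tb": 2364, "bb": 525}

# What one extra event adds to each counting stat.
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}


def ops(c):
    """OBP + SLG, with PA simplified to AB + BB."""
    obp = (c["h"] + c["bb"]) / (c["ab"] + c["bb"])
    slg = c["tb"] / c["ab"]
    return obp + slg


def ops_plus1_weights(out_value=-0.28):
    """Plus1 differences in OPS, rescaled so the out is worth out_value."""
    base = ops(BASE)
    diffs = {}
    for event, add in EVENTS.items():
        line = {k: BASE[k] + add.get(k, 0) for k in BASE}
        diffs[event] = ops(line) - base
    k = out_value / diffs["out"]
    return {event: d * k for event, d in diffs.items()}


if __name__ == "__main__":
    for event, value in ops_plus1_weights().items():
        print(f"{value:6.2f} {event}")
```

The walk only moves OBP while a home run moves OBP and adds four total bases to SLG, which is where the stretched HR value comes from.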
What does Baseball Prospectus say about MLV (which is the core of VORP)? All they say is this:
Marginal Lineup Value, a measure of offensive production created by David Tate and further developed by Keith Woolner. MLV is an estimate of the additional number of runs a given player will contribute to a lineup that otherwise consists of average offensive performers. Additional information on MLV can be found here.
That link in the quote is the same link I used in step 3.
Does anyone at Baseball Prospectus know about the bias? Are they going to change the basis of VORP? It’s very simple. VORP is based on MLV, and MLV is based on this Runs Created formula:
OBP*SLG/(1-BA)
In fact, here are the values of the above using the Plus1 Method:
0.51 1b
0.80 2b
1.09 3b
1.38 hr
0.22 bb
(0.28) out
Except for the slight difference in HR, the rest of the values are identical.
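To check this against the bare Runs Created core, the same procedure can be run on OBP*SLG/(1-BA) by itself. A minimal sketch, again with hypothetical counts near .269/.333/.430, PA treated as AB + BB, and made-up function names:

```python
# Hypothetical league totals near .269/.333/.430 (not the original data).
BASE = {"ab": 5500, "h": 1480, "tb": 2364, "bb": 525}

# What one extra event adds to each counting stat.
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}


def rc_rate(c):
    """The Runs Created core quoted above: OBP * SLG / (1 - BA)."""
    ba = c["h"] / c["ab"]
    obp = (c["h"] + c["bb"]) / (c["ab"] + c["bb"])
    slg = c["tb"] / c["ab"]
    return obp * slg / (1 - ba)


def rc_plus1_weights(out_value=-0.28):
    """Plus1 differences in the RC rate, rescaled so the out is out_value."""
    base = rc_rate(BASE)
    diffs = {}
    for event, add in EVENTS.items():
        line = {k: BASE[k] + add.get(k, 0) for k in BASE}
        diffs[event] = rc_rate(line) - base
    k = out_value / diffs["out"]
    return {event: d * k for event, d in diffs.items()}


if __name__ == "__main__":
    for event, value in rc_plus1_weights().items():
        print(f"{value:6.2f} {event}")
```

The match with MLV is not a coincidence: when the player's line equals the league's, a small change in MLV_FULL works out to one ninth of the same change in OBP*SLG/(1-BA), so after rescaling the two sets of weights agree to first order.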
As you can see, you start off with a poor construction (the original RC equation), make that a core part of each subsequent framework, and the wrongs of the original get perpetuated. And because of all the mathematical gymnastics, no one knows about it.
Murray Chass is right to deride VORP. We are so g-dd-mn smug about what we do that we can’t even properly defend the thing that Chass is holding up as exhibit A. Now, Chass will be wrong to deride VORP once (if?) it’s finally fixed. But, at the very least, present the correct thing first.
How can BP fix this? Use BaseRuns as the core. BaseRuns is easily tweakable so that you can get the Linear Weights values that you want.
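As a rough illustration, here is the same Plus1 exercise run on BaseRuns. Everything specific in it is an assumption: the coefficients are one commonly cited simple form of BaseRuns (A*B/(B+C) + D) and not necessarily the version anyone would want BP to adopt, the counts are hypothetical, and dividing by outs (AB - H) to get a per-out rate comparable to the RC core is a choice made for this sketch.

```python
# Hypothetical league totals near .269/.333/.430 (not the original data).
BASE = {"ab": 5500, "h": 1480, "tb": 2364, "hr": 178, "bb": 525}

# What one extra event adds to each counting stat.
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4, "hr": 1},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}


def base_runs(c):
    """One commonly cited simple BaseRuns form (coefficients assumed)."""
    a = c["h"] + c["bb"] - c["hr"]            # baserunners
    b = 1.02 * (1.4 * c["tb"] - 0.6 * c["h"]  # advancement factor
                - 3 * c["hr"] + 0.1 * c["bb"])
    outs = c["ab"] - c["h"]                   # outs, simplified to AB - H
    return a * b / (b + outs) + c["hr"]       # HR always score themselves


def bsr_per_out(c):
    """BaseRuns per out, used as the rate for the Plus1 comparison."""
    return base_runs(c) / (c["ab"] - c["h"])


def bsr_plus1_weights(out_value=-0.28):
    """Plus1 differences in BaseRuns per out, rescaled to out_value."""
    base = bsr_per_out(BASE)
    diffs = {}
    for event, add in EVENTS.items():
        line = {k: BASE[k] + add.get(k, 0) for k in BASE}
        diffs[event] = bsr_per_out(line) - base
    k = out_value / diffs["out"]
    return {event: d * k for event, d in diffs.items()}


if __name__ == "__main__":
    for event, value in bsr_plus1_weights().items():
        print(f"{value:6.2f} {event}")
```

With these inputs the walk comes out noticeably higher than .22, and changing the 1.4, 0.6, 3 and 0.1 coefficients in the advancement factor is the kind of tweak that moves the implied weights around.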
Everyone who studies BaseRuns loves BaseRuns. Consider Dan Fox, who would appear to be the most respected and least biased saberist out there by all concerned, and who has studied the issue as intimately as Patriot and myself, if not more so. He said this on BP’s own site:
Dan Fox: On the defensive side of the ball I would take input from SFR, UZR, and Plus/Minus to make a case for one over the other. On the offensive side the metrics are much better and more equivalent but BaseRuns makes the most sense to me.
Then what exactly is stopping BP from making the necessary changes here and advancing its metric from the 1970s version that it’s using at its core?
Retorting for Murray Chass, here is an Open Letter to Baseball Prospectus:
Hi BP,
I write to you as one analyst to another. Fix VORP.
Sincerely,
Tom
Baseball Schlub
Tango,
thanks for that analysis. I have always thoroughly loved VORP. A lot of that is due to my respect for Keith. If he is aware, and has the time, he’d certainly address it, with the possible exception that there is something you are missing (really, that is possible).
I’d like to use BaseRuns but I haven’t seen a spreadsheet that isn’t too convoluted. Anyone got one?