THE BOOK--Playing The Percentages In Baseball

Thursday, August 14, 2008

Fixing VORP

By Tangotiger, 12:31 PM

UPDATE: This blog entry (not an article) has been linked from several places, and there are questions from non-LWTS followers about what the weights should be.  One place to find them is the last line of this page.  Had I intended to write this as an article, I would have been more complete in my description.  I apologize to those who stumbled along here for the first time.  Basically, anyone coming to this blog is walking into the middle of a conversation.  Feel free to interrupt and ask a question.

***

Baseball Prospectus undervalues walks.  By how much?  As much as OPS.  Don’t believe me?  Let me walk you through the steps:


1. Start with a standard batting line, like here.  That gives you a .269/.333/.430 line.  That’s your baseline.

2. Figure out the new batting line, if you add 1 PA, 1 AB, 1 H.  Figure out the new batting line if you add 1 PA, 1 AB, 1 H, 1 2B.  Figure out the new batting line if you add 1 PA, 1 BB.  And so on.  This is what we typically call the Plus1 Method.

3. For each batting line, figure out the VORP or MLV.  I’m using what I find on Woolner’s site, which he’s kind enough to distill into one formula:

MLV_FULL =GAMES*OUTS*(1/9 * (8*L_OBP+P_OBP) / (9-8*L_OBP-P_OBP) *((8*L_SLG*(1-L_OBP))/(1-L_AVG) + P_SLG*(1-P_OBP)/(1-P_AVG)) - L_OBP*L_SLG/(1-L_AVG))

4. Subtract the baseline MLV_FULL from the MLV_FULL for each new batting line.  You’ll get a differential for the batting line where you added one single, and likewise for doubles, walks, HR, and outs.

5. In order to align it to something meaningful, multiply all these differentials by a constant, such that the resultant outs value is -.28.  It doesn’t really matter much, but at least it gives you something reasonable to look at.

I did all that.  And this is what I get:
0.51 1b
0.80 2b
1.09 3b
1.37 hr
0.22 bb
(0.28) out
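
For anyone who wants to retrace steps 1 through 5, here is a minimal Python sketch. It is not Woolner's or BP's code. The counting stats in BASE are hypothetical, picked only so the line works out to .269/.333/.430, PA is treated as AB + BB, and the GAMES*OUTS factor is held at 1 on the assumption that it washes out once everything is rescaled to make the out worth -.28.

```python
# Hypothetical counting stats that produce a .269/.333/.430 line.
BASE = {"ab": 5500, "h": 1480, "tb": 2365, "bb": 527}

# Counting-stat changes for one extra event (the Plus1 step).
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}

def rates(line):
    """Return (AVG, OBP, SLG), treating PA as AB + BB."""
    avg = line["h"] / line["ab"]
    obp = (line["h"] + line["bb"]) / (line["ab"] + line["bb"])
    slg = line["tb"] / line["ab"]
    return avg, obp, slg

def mlv_full(player, league, games=1.0, outs=1.0):
    """The MLV_FULL formula from step 3, transcribed term by term."""
    l_avg, l_obp, l_slg = rates(league)
    p_avg, p_obp, p_slg = rates(player)
    lineup = (
        1 / 9 * (8 * l_obp + p_obp) / (9 - 8 * l_obp - p_obp)
        * ((8 * l_slg * (1 - l_obp)) / (1 - l_avg)
           + p_slg * (1 - p_obp) / (1 - p_avg))
    )
    return games * outs * (lineup - l_obp * l_slg / (1 - l_avg))

def add_event(line, event):
    """Copy the batting line and add one event to it."""
    new = dict(line)
    for stat, n in EVENTS[event].items():
        new[stat] += n
    return new

# Step 4: differential of each Plus1 line against the baseline.
baseline = mlv_full(BASE, BASE)
diffs = {e: mlv_full(add_event(BASE, e), BASE) - baseline for e in EVENTS}

# Step 5: rescale so that the out is worth -.28.
scale = -0.28 / diffs["out"]
weights = {e: d * scale for e, d in diffs.items()}

for event, value in weights.items():
    print(f"{value:5.2f} {event}")
```

With this assumed baseline the walk comes out in the low .20s, in line with the list above; a different baseline will move the second decimal a little.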

One of these things is not like the other, one of these things doesn’t belong. Can you tell which thing is not like the other, before I finish this song?

Fans of Linear Weights are pretty familiar with the above numbers.  A couple are slightly off, but I won’t get too worked up about a .02 run difference here or there.  But, the run value of the walk is .10 runs off.  A guy who walks 100 times where the league average is 60 will be undervalued by 4 runs (40 extra walks at .10 runs each).

Now, you may say “4 runs, big deal”.  Well, ok, if that’s the case, you may as well leave.

But, if you are going to go to the complicated formula to create your run estimate, shouldn’t you be rewarded for that complexity with extra accuracy?  Why not simply use Linear Weights then?

Here is what OPS says using the Plus1 Method:
0.45 1b
0.83 2b
1.22 3b
1.60 hr
0.23 bb
(0.28) out

The same low run value on the walks.  OPS is even worse because it overvalues the HR by a very large margin.  (See why I hate OPS?  It doesn’t represent how runs are produced.  OPS: Begone.)
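
The OPS numbers can be checked the same way. This is a rough sketch, assuming a made-up baseline (the counting stats in BASE are hypothetical and only chosen to give .269/.333/.430) and treating PA as AB + BB.

```python
# Hypothetical counting stats that produce a .269/.333/.430 line.
BASE = {"ab": 5500, "h": 1480, "tb": 2365, "bb": 527}

# Counting-stat changes for one extra event (the Plus1 step).
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}

def ops(line):
    """OBP + SLG for a batting line, with PA = AB + BB."""
    obp = (line["h"] + line["bb"]) / (line["ab"] + line["bb"])
    slg = line["tb"] / line["ab"]
    return obp + slg

def plus1_weights(metric, base=BASE, out_value=-0.28):
    """Change in `metric` per added event, rescaled so an out equals out_value."""
    baseline = metric(base)
    diffs = {}
    for event, changes in EVENTS.items():
        line = dict(base)
        for stat, n in changes.items():
            line[stat] += n
        diffs[event] = metric(line) - baseline
    scale = out_value / diffs["out"]
    return {event: d * scale for event, d in diffs.items()}

ops_weights = plus1_weights(ops)
for event, value in ops_weights.items():
    print(f"{value:5.2f} {event}")
```

Because the out is pinned at -.28, only the relative sizes matter: the HR gets stretched well past the MLV figure while the walk stays where it was.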

What does Baseball Prospectus say about MLV (which is the core of VORP)?  All they say is this:

Marginal Lineup Value, a measure of offensive production created by David Tate and further developed by Keith Woolner. MLV is an estimate of the additional number of runs a given player will contribute to a lineup that otherwise consists of average offensive performers. Additional information on MLV can be found here.

That link in the quote is the same link I used in step 3.

Does anyone at Baseball Prospectus know about the bias?  Are they going to change the basis of VORP?  It’s very simple.  VORP is based on MLV, and MLV is based on this Runs Created formula:
OBP*SLG/(1-BA)

In fact, here are the values of the above using the Plus1 Method:
0.51 1b
0.80 2b
1.09 3b
1.38 hr
0.22 bb
(0.28) out

Except for the slight difference in HR, the rest of the values are identical.
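
A quick way to see this for yourself is to run the Plus1 Method directly on OBP*SLG/(1-BA). The sketch below assumes a hypothetical baseline (the counting stats are invented to give .269/.333/.430) and PA = AB + BB, so treat the output as illustrative rather than as BP's numbers.

```python
# Hypothetical counting stats that produce a .269/.333/.430 line.
BASE = {"ab": 5500, "h": 1480, "tb": 2365, "bb": 527}

# Counting-stat changes for one extra event (the Plus1 step).
EVENTS = {
    "1b": {"ab": 1, "h": 1, "tb": 1},
    "2b": {"ab": 1, "h": 1, "tb": 2},
    "3b": {"ab": 1, "h": 1, "tb": 3},
    "hr": {"ab": 1, "h": 1, "tb": 4},
    "bb": {"bb": 1},
    "out": {"ab": 1},
}

def rc_rate(line):
    """The basic Runs Created rate: OBP * SLG / (1 - BA)."""
    ba = line["h"] / line["ab"]
    obp = (line["h"] + line["bb"]) / (line["ab"] + line["bb"])
    slg = line["tb"] / line["ab"]
    return obp * slg / (1 - ba)

def plus1(line, event):
    """Copy the batting line and add one event to it."""
    new = dict(line)
    for stat, n in EVENTS[event].items():
        new[stat] += n
    return new

baseline = rc_rate(BASE)
diffs = {e: rc_rate(plus1(BASE, e)) - baseline for e in EVENTS}

# Rescale so that the out is worth -.28, as in step 5.
scale = -0.28 / diffs["out"]
rc_weights = {e: d * scale for e, d in diffs.items()}

for event, value in rc_weights.items():
    print(f"{value:5.2f} {event}")
```

The walk again lands in the low .20s, which is the point: the low walk value is already baked into the Runs Created core.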

As you can see, you start off with a poor construction (the original RC equation), make that a core part of each subsequent framework, and the wrongs of the original get perpetuated.  And because of all the mathematical gymnastics, no one knows about it.

Murray Chass is right about deriding VORP.  We are so g-dd-mn smug about what we do, that we can’t even properly defend the thing that Chass is holding as exhibit A.  Now, Chass will be wrong about deriding VORP once (if?) it’s finally fixed.  But, at the very least, present the correct thing first.

How can BP fix this?  Use BaseRuns as the core.  BaseRuns is easily tweakable so that you can get the Linear Weights values that you want.

Everyone who studies BaseRuns loves BaseRuns.  If we consider Dan Fox to be the saberist who would appear to be the most respected and least biased of all saberists out there by all concerned, and who has studied the issue as intimately as Patriot and myself, if not more so, and who said on BP’s own site:

Dan Fox: On the defensive side of the ball I would take input from SFR, UZR, and Plus/Minus to make a case for one over the other. On the offensive side the metrics are much better and more equivalent but BaseRuns makes the most sense to me.

Then what exactly is stopping BP from making the necessary changes here and advancing its metric from the 1970s version that it’s using at its core?

Retorting for Murray Chass, here is an Open Letter to Baseball Prospectus:

Hi BP,

I write to you as one analyst to another.  Fix VORP.

Sincerely,
Tom
Baseball Schlub

#1    Chris Dial      (see all posts) 2008/08/14 (Thu) @ 13:57

Tango,
thanks for that analysis.  I have always thoroughly loved VORP.  A lot of that is due to respect for Keith.  If he is aware, and has the time, he’d certainly address it - with the possible exception there is something you are missing (really, that is possible).

I’d like to use BaseRuns but I haven’t seen a spreadsheet that isn’t too convoluted.  Anyone got one?


#2          (see all posts) 2008/08/14 (Thu) @ 14:15

Wow, I always thought that VORP and MLVR were lwts with the appropriate values. I guess not.  Another one bites the dust!

I think that the problem is that everyone wants to invent their own stat even if that means taking a near perfect one and making it worse.

I am just the opposite.  I want a perfect stat whether it is mine, someone else’s or a combination.  For example, I use lwts with a little extra, such as separating K outs from batted ball outs, separating fly ball and ground ball outs, including ROE, and separating infield singles from outfield singles, etc.

I do this NOT so that I can claim something as my own, but simply to make something good just a little bit better.

I can’t say I blame (too much) a commercial site like BP for just wanting to invent their own stat even if it is far less perfect than someone else’s.  Everyone (who is interested in ego and money) does that.


#3    Tangotiger      (see all posts) 2008/08/14 (Thu) @ 14:35

Chris: I share your admiration for Keith.  But, since Keith is a FTE of the Indians, and since VORP is an IP (to the extent that it can be) of BP, it’s up to BP to fix it.

MGL: I’m with you mostly.  I trumpet BaseRuns even if it’s not mine and is better than what I can come up with.  However, I will blame anyone, commercial or not, for putting something that has deficiencies, especially if they don’t put a disclaimer on it.


#4    Colin Wyers      (see all posts) 2008/08/14 (Thu) @ 14:45

BP actually does have their own “brand” of linear weights - EqR. In fact, check the EqA/DT reports:

http://www.baseballprospectus.com/statistics/eqa2008.php

RARP is Runs Above Replacement, Position-Adjusted. If that’s not the definition of VORP I don’t know what is. So BP already has a linear weights version of VORP.


#5    Patriot      (see all posts) 2008/08/14 (Thu) @ 15:14

Colin is right in #4, of course, and that’s the problem.  BP has at least 4 measures of value against replacement floating around:

1. VORP, fueled by the flawed RC model
2. RARP, based on EqA, which is probably the best, at least of the first 3
3. WARP, which takes RARP a step forward, but adds in fielding and uses a ridiculously low baseline
4. SuperVORP, which I think includes fielding but I don’t know how it differs from the others

They can’t even come to a decision about which to use.  Small wonder their average customer is confused.


#6    Tangotiger      (see all posts) 2008/08/14 (Thu) @ 15:20

Chris: I’ve used this in the past when I just have limited data:

A = BB+HBP+H-HR
B = .8 x 1B + 2.1 x 2B + 3.4 x 3B + 1.8 x HR + .1 x BB
C = AB-H+SF
D = HR

scoreRate = B/(B+C)

Runs = A*scoreRate + D
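
In case a spreadsheet is not handy, the formula above translates almost line for line into Python. This is only a sketch: the function and argument names are mine, singles are derived as H - 2B - 3B - HR, and the sample team line is hypothetical.

```python
def base_runs(ab, h, doubles, triples, hr, bb, hbp=0, sf=0):
    """Simple BaseRuns, following the A/B/C/D factors given above."""
    singles = h - doubles - triples - hr
    a = bb + hbp + h - hr                      # baserunners
    b = (0.8 * singles + 2.1 * doubles + 3.4 * triples
         + 1.8 * hr + 0.1 * bb)                # advancement
    c = ab - h + sf                            # outs
    d = hr                                     # automatic runs
    score_rate = b / (b + c)
    return a * score_rate + d

# Hypothetical team line, roughly .269/.333/.430.
team = dict(ab=5500, h=1480, doubles=296, triples=30, hr=176, bb=527)

runs = base_runs(**team)

# Plus1 on the walk: add one BB and see how many runs the estimate gains.
walk_value = base_runs(**{**team, "bb": team["bb"] + 1}) - runs

print(f"runs: {runs:.1f}")
print(f"value of one extra walk: {walk_value:.2f}")
```

The extra walk here is worth about a third of a run in absolute terms, which can be set against the walk values listed in the post.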

***

You can see other examples depending on what kind of dataset you have:
http://www.tangotiger.net/wiki/index.php?title=Base_Runs

***

Colin: BP confuses matters by having two of these things.  EqA is very close to LWTS.  Then also, as we’ve talked about in the past, EqA is LWTS, made worse, and more complicated.  Clay is aware of Patriot’s great article on the subject, but I don’t know how much it bothers him.


#7    Tangotiger      (see all posts) 2008/08/14 (Thu) @ 15:29

They can’t even come to a decision about which to use.  Small wonder their average customer is confused.

Yes, BP, why can’t you choose one?  It’s not like it’s a question of using the best tool for the proper job.  There are fundamental differences here as Patriot is showing.

Ideally, they would do the following:
1. Use SFR for their fielding
2. Use BaseRuns or LinearWeights (EqA if you really have to) as their core
3. Use Replacement-Level as Keith describes it, with minor tweaks

So, merge Dan Fox, Clay Davenport and Keith Woolner, and you get the “perfect” metric, one that I would support and endorse.

Why do I care about BP more than BP does on this issue?


#8    MGL      (see all posts) 2008/08/14 (Thu) @ 16:22

"We” care because other than plain old OPS or OPS+ which are readily available, the single most quoted source for player value/performance is BP.

If it is flawed and everyone is using it, then there is a lot of misinformation floating around, which is probably not a good thing.


#9          (see all posts) 2008/08/14 (Thu) @ 18:11

SuperVORP is just regular VORP plus FRAA. There’s an implication there by Nate or Keith that Clay’s concept of a replacement level fielder is dead wrong. It is, of course, and yet BP who clearly recognizes it keeps putting it out there.


#10          (see all posts) 2008/08/15 (Fri) @ 01:27

I believe Justin (On Baseball and the Reds) showed in his player value series that replacement level fielders were just about average fielders.


#11    tangotiger      (see all posts) 2008/08/15 (Fri) @ 08:55

Yes, this has been shown a few times in this blog as well.


#12    tangotiger      (see all posts) 2008/08/15 (Fri) @ 19:37

I provide additional commentary at BTF:
http://www.baseballthinkfactory.org/files/newsstand/discussion/the_book_blog_tango_fixing_vorp/

***

http://insider.espn.go.com/espn/print?id=3536623&type=blogEntry
Rob Neyer says:

You know you’re a baseball geek if … you want to know why TangoTiger challenges Baseball Prospectus to fix VORP. I’m not smart enough to know if BP should fix VORP, but BP’s reluctance to respond to criticism has long been a source of frustration to me, as a big fan.

Well said.


#13    tangotiger      (see all posts) 2008/08/15 (Fri) @ 21:31

I put an update at the top of this thread for those who stumbled onto this thread from elsewhere.


#14    Colin Wyers      (see all posts) 2008/08/16 (Sat) @ 14:05

I think Neyer is selling himself short there (if I’m smart enough to understand what’s going on here, surely he has to be, right?), but other than that he has a point.

Christina Kahrl’s chat yesterday on BP does a pretty good job of explaining why BP is so silent on issues like this, though:

collins (greenville nc): Of BP’s three distinct ways of calculating playoff odds, is one of them better than the others? I love TA: I always read it first.

Christina Kahrl: Oh, see, that’s mean, because they’re all cool. Maybe I’m just way too vanilla, but I favor the elegant simplicity of the simple Odds Report. Nate’s ELO is really very complicated and perhaps a bit too experimental for me, and I’m not exactly clear as to the extent that the PECOTA-flavored variant adapts to injuries and roster turnover. Besides, vanilla with some nice berries is really very tasty, but it’s all a matter of taste.

Seriously? Outside of Will Carroll and Joe Sheehan she’s arguably one of the most visible BPers around, she edits their annual, she’s the managing editor of the website… and she doesn’t know which of the three Odds Reports is best. Doesn’t seem to know how they work. And thinks it’s “mean” to ask her to choose, because it’s a “matter of taste.”

(Also amusing, and perhaps relevant:

rawagman (Toronto): Christina - Do you grade transactions over the course of the year? It would be very cool (if also very daunting) to see an end-of-year scorecard of the moves teams made, both in terms of adjusting to the inevitable unpleasantnesses (injuries) and in how they were proactive (promotions, trades, etc.) Thanks for the great work!

Christina Kahrl: Thanks for the compliment… actually, I don’t grade moves, and I guess I’ve never been tempted, because my concern would be that I’d be doing something dumb, like numericizing a studied opinion to give it math-y weight. Sort of like Win Shares, and just as useless.

Is there any way to read the above and not come to the conclusion that Kahrl has no clue about how Win Shares works?)


#15    david smyth      (see all posts) 2008/08/16 (Sat) @ 17:09

Are we getting close to dissing C Kahrl because she’s (let’s not mention it) female? :)

But, on the thread topic, I say, why ‘fix VORP’? Let VORP be what it is, as well as all of the other offensive value stats.

Let them not be fixed, let them be accepted or rejected or replaced, according to their merits.


#16    Colin Wyers      (see all posts) 2008/08/16 (Sat) @ 22:49

That really wasn’t the spirit in which the criticism was intended - all of the Odds Reports are, well, odds reports - they purport to predict probabilities of future events, something clearly intended as a statement of fact. One of them has to be more correct than the others (presumably it’s the PECOTA one although I haven’t tested this); I don’t see how you can simply say it’s a “matter of taste.”

As far as “why fix VORP,” well, it’s probably the most popular “sabermetric” stat in existence, other than OPS+. And it’s one that seems to stand for the community as a whole - when people like Murray Chass look to mock sabermetrics, what do they mention? Win Shares? OPS+? wOBA?

And it’s not like OPS, where to try and fix it you end up fundamentally changing the stat. A fixed VORP would be just as VORPy as it was before, just more accurate.


#17    dave smyth      (see all posts) 2008/08/17 (Sun) @ 07:39

-----"That wasn’t really the spirit in which the criticism was intended...”

I realize that, and was half joking. But I have seen quite a few C Kahrl ‘bad analyst’ posts recently, and I wonder if ‘girls can’t do saber’ might be in the back of some minds.


#18    David Cameron      (see all posts) 2008/08/17 (Sun) @ 20:43

Occam’s Razor - she really is just a horrible analyst.  No underlying secondary reasons needed.


#19          (see all posts) 2008/08/18 (Mon) @ 02:37

Oh I’m sure that girls can do saber just as well as guys, I’ll never be one to jump to prejudicial analysis...but remember that Chris Kahrl isn’t one to fit a stereotype


#20    Rally      (see all posts) 2008/08/18 (Mon) @ 09:15

When it comes to good analysts at BPro, there are two categories:

1. Nate Silver
2. Those who left to work for baseball teams (Woolner, Fox, Click)

The value of anyone else there is in their writing.  I consider Davenport a number cruncher, not an analyst.  I’ll reconsider when he reconsiders replacement level.


#21    Mark Thompson      (see all posts) 2008/08/18 (Mon) @ 23:25

Basic OPS, which basically assigns values of 0,1,2,3,4,5 isn’t all that far off from the other formulas as far as relative values of events are concerned. Except for outs.


#22    tangotiger      (see all posts) 2008/08/19 (Tue) @ 00:58

Mark, I already posted what those values are, and here they are again:

0.45 1b
0.83 2b
1.22 3b
1.60 hr
0.23 bb
(0.28) out


#23    Sean D      (see all posts) 2008/08/20 (Wed) @ 19:44

Before delving into ‘Girls can’t do saber’ you may want to stop looking like a fool and read this (click name):

or this: http://www.baseballprospectus.com/unfiltered/?p=345


#24    Tangotiger      (see all posts) 2008/08/21 (Thu) @ 10:05

Sean’s post was marked for moderation and has been unqueued.


#25    Tangotiger      (see all posts) 2008/08/21 (Thu) @ 10:28

Sean: I would guess that most of the regulars are well-aware of Christina’s life choices, and I don’t see how it applies here.


#26    david smyth      (see all posts) 2008/08/21 (Thu) @ 15:00

I had no idea about that. I thought she was the wife or ex-wife of the male Kahrl.


#27    nickojohnson      (see all posts) 2008/08/22 (Fri) @ 15:20

Dudes, you’ve seriously lost focus here.

Did anyone else notice Derek Jacques’ article the other day?

(Click name)

There is some discussion of BP’s policy with regard to outsider criticism, and a bit of discussion on the accuracy of various metrics.  He doesn’t address VORP, but his article seemed to be more than just a coincidence…


#28    Tangotiger      (see all posts) 2008/08/22 (Fri) @ 17:07

Sometimes, we’re criticized at Baseball Prospectus for not responding to outside critiques. It’s a conscious decision, made at the management level—we’d rather talk about baseball than about ourselves or about our colleagues in the world of baseball analysis. While we don’t respond to every broadside in every blog, we also by and large don’t spend much time critiquing others’ work—with the exception of a situation like this, where someone has basically written in asking for feedback.

That’s the part Nick is talking about.


#29          (see all posts) 2008/08/22 (Fri) @ 17:28

Sorry, didn’t know if I was allowed to post the text.  There’s also this:

“After all, some metrics have enough variables that they’re almost infinitely modifiable, but not every modification is worth the trouble. Successful metrics should ideally bring something new to the table, something that’s an improvement over the status quo. For example, that something can be simplicity—someone takes a stat that required fifteen steps and proprietary data to calculate, and does it in four steps with the kind of information one could find on the back of a bubblegum card, and without an extreme loss of accuracy. Or it can be an element of context—league, ballpark, era adjustments—or a closer relationship to game events like runs scored or games won.”

But fixing VORP as suggested here wouldn’t be very complicated, right?  And I DO think it would be “worth the trouble” when small differences are regularly cited in MVP arguments, etc.


#30    tangotiger      (see all posts) 2008/08/22 (Fri) @ 18:47

Nick, you are absolutely correct. This is a question of fixing something wrong, not a philosophical difference.

All they have to do is use a tech version of Runs Created over the basic version of Runs Created.

OR

Use Equivalent Runs over the basic version of Runs Created.

OR

Use BaseRuns

OR

Use Linear Weights

BP on the other hand chooses to do nothing.  I cannot believe that they’d rather do nothing, spin this as a philosophical issue, rather than moving forward and fixing something so obviously wrong.

Say it ain’t so, BP.


#31    Colin Wyers      (see all posts) 2008/08/22 (Fri) @ 19:32

VORP - Inaccurate out to one-tenth of a run!


#32    Colin Wyers      (see all posts) 2008/08/24 (Sun) @ 00:06

I’ve written up a rather long and (sadly) meandering piece on VORP for StatSpeak:

http://mvn.com/mlb-stats/2008/08/23/the-trouble-with-vorp/

A lot of it’s probably going to be old hat for most of the regulars here. The one thing I discovered that still blows my mind - VORP apparently uses linear weights for basestealing runs.


#33    tangotiger      (see all posts) 2008/08/24 (Sun) @ 01:10

Colin, you have a link wrong, as you need this:
http://www.baseballprospectus.com/statistics/eqa2008.php

Since the contention is that the bias is focused on walks, it would be good for someone to look at the walk leaders per PA, and compare their RARP and VORP.  Ideally, we’ll see a bias.

You should also do it for guys who are doubles leaders or some such, to use as a control, to make sure that it’s not a general overall bias.

Good job Colin.


#34    jay gibbons      (see all posts) 2008/08/24 (Sun) @ 06:05

Unlike Baseball Prospectus, you do generally respond to people’s questions.  I’ve sent questions to bp just asking them about the methodology and they usually don’t respond (and I’m not asking for private formulas to stats).


#35          (see all posts) 2008/08/24 (Sun) @ 08:06

BP is a strange (actually a more frustrating than strange) bird.  I think it’s a perfect storm of some of the personalities there and BP’s proprietary nature.


#36    tangotiger      (see all posts) 2008/08/24 (Sun) @ 08:36

It’s really weird at BP. There is this entity named BP, and there’s the people who work at BP.  Lots of good people at BP.  Whatever it is, it’s not close to a democracy there.

They think it works.  And, as long as they have 10K+ subscribers, they are satisfying some hole in the marketplace.


#37    Colin Wyers      (see all posts) 2008/08/24 (Sun) @ 16:27

Okay, I’ve reimplemented MLV and BRAA (EqR above average) using the Baseball Databank, for the sake of comparison. (I did some spotchecking, and my figures match up pretty well with what BP is using; there are some differences, which I think are largely a product of the lack of park adjustments.) All players 1954 - 2007, which are the years which BP calculates VORP for.

Avg. Error: 3.96 runs
Avg. Error, qualified starters: 5.79
Avg. Error, top 300 qualified starters in BB/PA: 8.32


#38    Colin Wyers      (see all posts) 2008/08/24 (Sun) @ 16:48

Avg. Error, top 300 qualified starters in 2B/H: 6.57

Looks like a bias against walks to me.
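
The check itself is simple enough to sketch. The rows below are made-up toy data, not real players or BP figures, the field names are hypothetical, and "Avg. Error" is assumed here to mean the mean absolute difference between the two run estimates.

```python
from statistics import mean

def avg_gap(players, metric_a, metric_b, rank_by=None, top_n=None):
    """Mean absolute difference between two run metrics.

    If rank_by is given, only the top_n players by that key are used.
    """
    pool = players
    if rank_by is not None:
        pool = sorted(players, key=rank_by, reverse=True)[:top_n]
    return mean(abs(p[metric_a] - p[metric_b]) for p in pool)

# Toy rows for illustration only; every number here is invented.
players = [
    {"name": "A", "pa": 650, "bb": 110, "h": 150, "doubles": 25, "mlv": 30.0, "braa": 40.0},
    {"name": "B", "pa": 640, "bb": 95, "h": 160, "doubles": 30, "mlv": 22.0, "braa": 30.0},
    {"name": "C", "pa": 660, "bb": 45, "h": 180, "doubles": 45, "mlv": 18.0, "braa": 20.0},
    {"name": "D", "pa": 630, "bb": 40, "h": 170, "doubles": 40, "mlv": 5.0, "braa": 9.0},
]

overall = avg_gap(players, "mlv", "braa")

# The group under suspicion: the heaviest walkers per PA.
walkers = avg_gap(players, "mlv", "braa",
                  rank_by=lambda p: p["bb"] / p["pa"], top_n=2)

# The control group: the heaviest doubles hitters per hit.
doublers = avg_gap(players, "mlv", "braa",
                   rank_by=lambda p: p["doubles"] / p["h"], top_n=2)

print(overall, walkers, doublers)
```

A gap that is clearly larger for the walk leaders than for both the overall pool and the doubles control is what a walk-specific bias would look like.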


#39    tangotiger      (see all posts) 2008/08/24 (Sun) @ 19:28

To whoever at BP is listening: what Colin is saying is that if you compare your two measures, VORP and RARP (which both are offense runs above replacement position… and there’s no reason that you should have TWO things to do exactly the same thing), you get an 8.3 run difference among the guys with the most walks.

The average error among all regulars is 5.8 runs and among regulars who hit a lot of doubles it’s 6.6 runs.

There is a bias against walks.

Big thanks to Colin for doing all the verification work here.


#40    Tangotiger      (see all posts) 2008/08/25 (Mon) @ 14:10

You guys may also remember a few months ago, where I took issue with Pete Palmer’s run value of the double (0.85 runs).  I said it should be closer to .77 runs or so.

I wrote to Palmer and explained to him why I thought he was wrong.  He looked into it, told me why he made the mistake, and said that he’d correct it in the next edition and thanked me for bringing it to his attention.

No fuss, no muss.


#41    Tangotiger      (see all posts) 2008/08/28 (Thu) @ 11:11

I wrote to BP a few days ago, asking for comments on the VORP construction.

No reply yet.


#42    Colin Wyers      (see all posts) 2008/08/28 (Thu) @ 17:29

I asked about it in Silver’s latest BP chat. Nothing there, either.


#43    tangotiger      (see all posts) 2008/08/28 (Thu) @ 18:13

Well, that’s disappointing.


#44    Tangotiger      (see all posts) 2008/08/29 (Fri) @ 13:56

Joe Sheehan replied and suggested I contact Clay or Nate.  I had already contacted Clay, and Clay replied in the meantime, so I think we can say that BP is finally responsive on this issue.

As with WARP in the other thread, it’s a similar issue with VORP, that looking into making changes at this point is a question of balancing time with other efforts.

Anyway, I’d like to thank Joe and Clay for being so forthcoming in their responses.

