
THE BOOK--Playing The Percentages In Baseball


Tuesday, December 02, 2008

RARP v VORP, take 2

By Tangotiger, 11:55 AM

Three months ago, I did a comparison of RARP and VORP.  I presented my findings to Clay, and he said he was going to make at least one change (as it relates to pitchers-as-hitters).  I don’t know yet if he made any other changes.  (As I write these words, I haven’t looked at the data.) You can click on the above link to get all the particulars.  I made no changes to anything I’m about to analyze, other than to update the data.

Let’s get on with it:


VORP v RARP

BP has two measures that do similar things.  One is called VORP, by Keith Woolner, and another is called RARP, by Clay Davenport.

The total VORP for 2008 is 5352 runs.  And for RARP, it is 5257 runs.  This is based on around 187631 PA, which means that per 700 PA, we have this:

VORP: +20.0 runs
RARP: +19.6 runs

As you can see, both have a very similar replacement baseline.  Indeed, this is a very common baseline.  MGL uses +19.4 runs, and I use (roughly) +19.8 runs.  We can feel very confident that overall, we’re talking about the same currency and the same scale.
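The per-700 PA baselines above are just the league totals rescaled. A minimal sketch of that arithmetic, using only the totals quoted in the post (the function and variable names are illustrative, not anything BP publishes):

```python
# League totals for 2008 as quoted in the post.
TOTAL_PA = 187631
TOTAL_RUNS = {"VORP": 5352, "RARP": 5257}


def per_700_pa(total_runs, total_pa, pa=700):
    """Scale a league-wide runs-above-replacement total to a per-`pa` rate."""
    return total_runs / total_pa * pa


# Round to one decimal, matching the precision used in the post.
rates = {name: round(per_700_pa(runs, TOTAL_PA), 1) for name, runs in TOTAL_RUNS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:+.1f} runs per 700 PA")
```

Run as written, this should reproduce the +20.0 and +19.6 figures listed above.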

The question is whether the two measures differ much for individual players.  These are the 10 guys that VORP likes more than RARP:

diff    RARP    VORP    Name
14.6     47.3      61.9     MATT HOLLIDAY
14.1     41.5      55.6     AUBREY HUFF
13.9     16.5      30.4     DERREK LEE
13.3     70.8      84.1     MANNY RAMIREZ
13.2     59.3      72.5     LANCE BERKMAN
12.4     22.9      35.3     JOEY VOTTO
11.9      9.0      20.9     JAMES LONEY
10.7     18.2      28.9     CONOR JACKSON
10.7     88.0      98.7     ALBERT PUJOLS
10.4     43.0      53.4     KEVIN YOUKILIS

The concern is on the high-end guys here.  Ramirez, Berkman, Pujols, Holliday, and Youkilis were all MVP candidates, and these two measures see them as 10 to 15 runs different.

And these are the guys that RARP likes more than VORP.

diff    RARP    VORP    Name
-10.4    -1.6    -12.0    JEFF MATHIS
-9.8     7.7     -2.1    JASON VARITEK
-9.2     6.5     -2.7    JOHN BUCK
-9.0     3.7     -5.3    BRANDON INGE
-7.4     7.5      0.1    JASON KENDALL
-7.4    22.3     14.9    KURT SUZUKI
-7.2    -4.2    -11.4    JOSE MOLINA
-7.1     5.3     -1.8    BOBBY CROSBY
-7.1    11.9      4.8    MARK ELLIS
-7.0    -2.9     -9.9    JACK HANNAHAN

The issue is best exemplified here:

diff    RARP    VORP    Name
-4.5    61.6    57.1    JOE MAUER            
13.2    59.3    72.5    LANCE BERKMAN

RARP sees Mauer and Berkman as being 2.3 runs apart (with Mauer being better), while VORP sees them 15.4 runs apart the other way (Berkman better).  That’s a 17.7 run difference.
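The Mauer/Berkman comparison can be checked directly from the two rows above. A small sketch (the dictionary layout and names are just for illustration):

```python
# RARP and VORP for the two players, copied from the rows above.
players = {
    "JOE MAUER": {"RARP": 61.6, "VORP": 57.1},
    "LANCE BERKMAN": {"RARP": 59.3, "VORP": 72.5},
}


def mauer_edge(metric):
    """Mauer minus Berkman under the given metric (positive means Mauer is ahead)."""
    return round(players["JOE MAUER"][metric] - players["LANCE BERKMAN"][metric], 1)


rarp_edge = mauer_edge("RARP")  # Mauer ahead under RARP
vorp_edge = mauer_edge("VORP")  # Berkman ahead under VORP, so this is negative
swing = round(rarp_edge - vorp_edge, 1)  # total disagreement between the two metrics

print(f"RARP: Mauer {rarp_edge:+.1f}, VORP: Mauer {vorp_edge:+.1f}, swing: {swing:.1f} runs")
```

The swing is the same number you get by subtracting the two players' diff columns (13.2 minus -4.5).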

I think it would be nice for someone at BP to respond to why we see such an enormous gap here, and which numbers we should trust.  Or at least give us enough information for us to decide.

Here are the positional totals:

Pos    RARP    VORP    diff    n
1B    670.3    846.9    -176.6    59
2B    627.0    667.4     -40.4    66
3B    562.7    584.0     -21.3    59
C     563.8    352.4     211.4    104
CF    591.4    673.7     -82.3    52
LF    650.1    736.6     -86.5    82
Ot    103.2    213.9    -110.7    100
P     214.1     13.3     200.8    341
RF    663.7    668.3      -4.6    53
SS    611.0    595.7      15.3    67

We see an enormous gap among catchers and pitchers, both of whom RARP really loves, while VORP is much higher on 1B and DH.

We’d like the runs over replacement to be somewhat similar across all the positions.  They don’t need to be identical, as that would imply things that are not true, but neither can they be so far different from each other that it would lead to absurd results.  While RARP paints the even-ish picture, VORP does not: 1B have over twice the value of catchers.  This explains why we have such a difference between Mauer and Berkman.
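To make the positional comparison concrete, here is a rough sketch that recomputes the diff column (RARP minus VORP) and the 1B-to-catcher ratio from the positional totals. The data is copied from the table; the variable names are mine:

```python
# (RARP, VORP, n) by position, copied from the positional totals table.
POSITION_TOTALS = {
    "1B": (670.3, 846.9, 59),
    "2B": (627.0, 667.4, 66),
    "3B": (562.7, 584.0, 59),
    "C": (563.8, 352.4, 104),
    "CF": (591.4, 673.7, 52),
    "LF": (650.1, 736.6, 82),
    "Ot": (103.2, 213.9, 100),
    "P": (214.1, 13.3, 341),
    "RF": (663.7, 668.3, 53),
    "SS": (611.0, 595.7, 67),
}

# diff here is RARP minus VORP, so positive means RARP likes the position more.
diffs = {pos: round(rarp - vorp, 1) for pos, (rarp, vorp, _n) in POSITION_TOTALS.items()}

# Sanity check: the positional totals should add up to the league totals.
total_rarp = round(sum(rarp for rarp, _vorp, _n in POSITION_TOTALS.values()), 1)
total_vorp = round(sum(vorp for _rarp, vorp, _n in POSITION_TOTALS.values()), 1)

# How much more total value each metric hands to first basemen than to catchers.
ratio_1b_to_c_rarp = POSITION_TOTALS["1B"][0] / POSITION_TOTALS["C"][0]
ratio_1b_to_c_vorp = POSITION_TOTALS["1B"][1] / POSITION_TOTALS["C"][1]

for pos, diff in sorted(diffs.items(), key=lambda item: item[1]):
    print(f"{pos:<3}{diff:>8.1f}")
print(f"1B/C ratio: RARP {ratio_1b_to_c_rarp:.2f}, VORP {ratio_1b_to_c_vorp:.2f}")
```

Note that these ratios use position totals, not per-player rates, so the different player counts (n) at each position are not adjusted for.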

I don’t know why BP writers are using the VORP numbers, when the more reasonable RARP numbers are never cited.  Will someone at BP explain this?

Also, as other readers have pointed out: how are IBB treated?  As regular walks?  Or something else?

***

The range of pitchers according to RARP is Zambrano (+12 runs) to Bergman (-3 runs).  VORP’s range is +17 for Zambrano to -6 for Bergman and Jimenez.  That’s 24 runs (after rounding) for VORP to 15 for RARP.  Which is right?

Let’s look at Linear Weights, via Pete Palmer, via b-r.com, for pitchers with at least 70 PA, which makes Zambrano and Jimenez the leader and trailer for both VORP and RARP.  The gap is +13 runs using RARP and +24 runs using VORP.

Palmer says Zambrano is +2 runs above average and that Jimenez is -17 runs.  That range is 19 runs.  Well, that doesn’t help us!  Maybe Zambrano is too much of an outlier.

Looper, Haren, Arroyo, Cook are the next leaders in both RARP and VORP (at least they agree on that), with RARP at +6.5 for them, and VORP at +6.0. 

At the bottom, if we look at the bottom 7 for each, we have these six common pitchers: Jimenez, Sheets, Myers, Lohse, Billingsley, Pelfrey.  That’s an average of 0 for RARP and -4 for VORP. 

So, the gap is +6.5 for RARP and +10 for VORP. 

What does Palmer say?  The top 4 averaged -5.7: Looper (-3.2), Haren (-7.2), Arroyo (-5.8), Cook (-6.6).  The bottom 6 averaged -13.6: Jimenez (-17.1), Sheets (-14.5), Myers (-12.6), Lohse (-12.7), Billingsley (-12.6), Pelfrey (-11.8).  That gives us a gap of 7.9 runs.  Once again, that doesn’t help us either as Palmer is right in the middle.

Let’s try to infer the replacement level for each.  The top pitchers, after Zambrano, were +6.5 RARP, while Palmer said they were -5.7 RAA.  That sets the replacement level at 12.2 runs below average.  The bottom pitchers were 0 for RARP, compared to Palmer’s -13.6 runs: that sets the replacement level at 13.6 runs.  So, Clay sets the replacement level at around 13 runs, for around 75 PA.

The top pitchers were +6.0 VORP, which sets the replacement level at 11.7 runs.  The bottom pitchers were -4 VORP, which sets the replacement level at 9.6 runs.  So, Woolner sets the replacement level at around 10.5 runs for 75 PA.

With 6000 PA for pitchers, this 2.5 runs per 75 PA gap will lead to 200 runs difference between the two measures.
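The inference above can be sketched as follows. It assumes that runs above replacement equals Palmer's runs above average plus a fixed replacement gap, so the gap is simply the difference between the two. The group averages for RARP and VORP are the rounded ones quoted in the post; the helper names are hypothetical:

```python
from statistics import mean

# Palmer's Linear Weights runs above average, as quoted in the post.
PALMER_TOP = {"Looper": -3.2, "Haren": -7.2, "Arroyo": -5.8, "Cook": -6.6}
PALMER_BOTTOM = {
    "Jimenez": -17.1, "Sheets": -14.5, "Myers": -12.6,
    "Lohse": -12.7, "Billingsley": -12.6, "Pelfrey": -11.8,
}

# Group averages for each BP metric, as quoted in the post (already rounded).
GROUP_AVG = {
    "RARP": {"top": 6.5, "bottom": 0.0},
    "VORP": {"top": 6.0, "bottom": -4.0},
}


def implied_replacement(runs_above_replacement, runs_above_average):
    """Runs below average at which the metric places replacement level."""
    return runs_above_replacement - runs_above_average


palmer_top = mean(PALMER_TOP.values())
palmer_bottom = mean(PALMER_BOTTOM.values())

# Average the estimate from the top group with the estimate from the bottom group.
implied = {
    metric: mean([
        implied_replacement(avg["top"], palmer_top),
        implied_replacement(avg["bottom"], palmer_bottom),
    ])
    for metric, avg in GROUP_AVG.items()
}

PITCHER_PA = 6000   # approximate pitcher PA, as in the post
PA_PER_PITCHER = 75  # approximate PA behind each estimate

gap_per_75_pa = implied["RARP"] - implied["VORP"]
league_gap_unrounded = gap_per_75_pa * PITCHER_PA / PA_PER_PITCHER
league_gap_rounded = 2.5 * PITCHER_PA / PA_PER_PITCHER  # using the post's rounded 2.5

for metric, level in implied.items():
    print(f"{metric}: replacement about {level:.1f} runs below average per 75 PA")
print(f"league-wide gap: {league_gap_unrounded:.0f} (unrounded), {league_gap_rounded:.0f} (rounded)")
```

Working from the unrounded group averages gives a slightly smaller per-75 PA gap than the rounded 2.5, so the league-wide figure comes out a bit under 200. Either way it is in the same range as the 200.8-run pitcher difference in the positional table.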

All to say that I really don’t know which one is better.  I can see what Clay did, and I think I’d lean towards his numbers.

#1    Patriot    2008/12/02 (Tue) @ 14:01

According to the appendix of “Baseball Between the Numbers” (obviously a few years old now, so it is possible this has changed), the replacement levels used by BP (it never specifies VORP or RARP; based on my recollection and the data above, I have to believe it’s VORP), their replacement percentages in terms of R/O are:

1B/DH 75
2B/3B/SS/OF 80
C 85

That’s the Berkman/Mauer issue, in a nutshell.


#2    Tangotiger    2008/12/02 (Tue) @ 14:49

If we exclude pitchers, there is 5043 RARP and 5339 VORP.  Per 700 PA, that’s 19.4 RARP and 20.6 VORP.

I don’t see how 80% is even the baseline for replacement level.  If you exclude pitchers, the runs created per 700 PA would be around 85 to 90 runs.  To get 20.6 VORP per 700 PA would set the replacement level at 1 minus 20.6/87.5 or so, or around 76%.  The RARP replacement level can be similarly calculated at 78%.

So, I will question the 80% figure.
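A quick sketch of the calculation in this comment. It assumes a league-average hitter creates about 87.5 runs per 700 PA (the midpoint of the 85 to 90 range mentioned), and expresses replacement level as a fraction of that:

```python
# Assumed league-average runs created per 700 PA (non-pitchers), midpoint of 85-90.
LEAGUE_RUNS_PER_700 = 87.5

# Runs above replacement per 700 PA with pitchers excluded, as quoted above.
RAR_PER_700 = {"VORP": 20.6, "RARP": 19.4}


def replacement_fraction(rar_per_700, league_runs_per_700):
    """Replacement-level production as a fraction of league-average production."""
    return 1 - rar_per_700 / league_runs_per_700


fractions = {
    metric: replacement_fraction(rar, LEAGUE_RUNS_PER_700)
    for metric, rar in RAR_PER_700.items()
}

for metric, frac in fractions.items():
    print(f"{metric}: replacement level is about {frac:.0%} of average")
```

Moving the assumed league figure anywhere within the 85 to 90 range shifts these percentages by roughly a point, so both stay below the published 80% figure.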


#3    Colin Wyers    2008/12/02 (Tue) @ 16:22

Everything I’ve read Woolner say about replacement level is that he uses OPS to figure replacement: 70 points below average OPS is rep-level, is how I think he phrased it in his early writings. We know that OPS doesn’t scale correctly with run scoring. Could that be the issue here?


#4    Tangotiger    2008/12/02 (Tue) @ 16:30

Woolner does have some inconsistencies.  I remember the 80% figure from his excellent Replacement article in a BP from a few years back.  And, on his site, the team replacement level he mentions doesn’t jibe with the player replacement level.  (See my wiki for the reference.)

Basically, I can’t take for granted that what I read is what is being implemented.  I have the results, and that’s what counts, frankly, as to what has been implemented.


#5    Patriot    2008/12/02 (Tue) @ 17:59

Tango’s point about the discrepancies between explanation and implementation raises the question: who is implementing these metrics now that Woolner is not with BP?  I have always assumed that Davenport maintains his own stuff (the EQA family of measures), and Silver his stuff, etc.  Who is now in charge of the Woolner family of measures (MLV/VORP), and are they empowered to tinker?


#6    Tangotiger    2008/12/08 (Mon) @ 14:35

If anyone has had any contact with anyone at BP on any of this, please post their replies.

As it stands, I got nuthin’.


#7    Colin Wyers    2008/12/08 (Mon) @ 15:15

I asked Sheehan in his chat last week about it, and he ignored it. I may try again in his next chat on Wednesday.


#8    cannatar    2008/12/09 (Tue) @ 15:38

On the general topic of WARP’s replacement level (and BP’s defensive stats), from Jay Jaffe’s article today:

“Clay Davenport has been hard at work revising the Wins Above Replacement Player system, our player valuation metric that covers the entirety of baseball history. Namely, he’s incorporating two major changes; first, he’s raising the replacement-level floor significantly beyond that of the bottom-of-the-barrel 1899 Cleveland Spiders or a current Double-A player to conform to a more modern definition of the major league replacement level, and second, he’s adding a play-by-play based fielding component for the years where it is available.”

later in the article…

“Roughly speaking, each full-season player loses about 2.0 WARP in the transition to the new methodology with a tougher definition for replacement level.”


#9    Tangotiger    2008/12/09 (Tue) @ 15:50

http://baseballprospectus.com/article.php?articleid=8353

***

I feel like Andy Dufresne when he was writing a letter a week to the State to ask for books for the library.  And after 6 years, they finally delivered, with a request that he now consider the matter closed, so stop writing.

And his response?  That he’s now going to write TWO letters a week.

Clay: you are about to get more emails from me…

This is a very good day.


#10    Tangotiger    2008/12/09 (Tue) @ 15:57

This is really cool.  I offered Clay my services in whatever way he likes.

***

As Rally pointed out a few months ago, he thought that it would be too hard for BPro to own up to the error in judgement, as it would be like Emily Litella’s “nevermind”:

http://en.wikipedia.org/wiki/Never_mind_(Saturday_Night_Live)

A made up dialogue that precedes Clay’s relaunch of WARP, where Clay simply retorts with “Nevermind”, would be simply fantastic.

