THE BOOK--Playing The Percentages In Baseball

Monday, March 05, 2007

SuperVORP

By Tangotiger, 03:40 PM

I haven’t seen this anywhere on their site, except in a recent chat, where Nate Silver says:

SuperVORP is basically VORP + defense, so a player with a VORP of 20 and a +3 defensive rating will have a SuperVORP of 23, all else being equal… Basically, it’s trying to combine the best features of VORP and WARP. WARP sets the replacement level bar a little bit too low for my tastes.

Finally.  Nate is being rather kind in stating that WARP’s replacement level is “a little bit too low”.  It doesn’t make much sense for BP to have two different measures that try to describe the same thing (or will end up being used the same way).  I’m not sure why Clay is as insistent as he is.  In any case, big kudos to Nate for going ahead and using the better measure.  Eventually, as soon as they incorporate Dan Fox’s baserunning, they’ll have SuperDuperVORP, and will essentially have a competing measure to MGL’s SuperLWTS.  Their biggest difference will be in MGL’s use of UZR for fielding, as opposed to the use of non-play-by-play data by BP.

Now, if we can get Keith to change the basis of VORP:

Team PA:    T_PA = 162*25.5/(1-T_OBP)
Team AB:    T_AB = T_PA*(1-T_OBP)/(1-T_AVG)
Team Runs:  T_RUNS = T_OBP*T_SLG*T_AB

Which is:
T_RUNS = T_OBP*T_SLG*(162*25.5/(1-T_OBP))*(1-T_OBP)/(1-T_AVG)
Which, if I’m doing this right, reduces itself to:
constant * T_OBP * T_SLG / (1-T_AVG)
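
As a quick numerical check of that reduction, here is a minimal Python sketch. The slash line is just an illustrative input and the function names are made up for this example; nothing here is BP's actual code.

```python
# Sketch: check that the step-by-step team-runs formula collapses to
# constant * OBP * SLG / (1 - AVG). All names here are illustrative.

GAMES = 162
OUTS_PER_GAME = 25.5


def team_runs_stepwise(avg: float, obp: float, slg: float) -> float:
    """Follow the three steps: team PA, then team AB, then team runs."""
    t_pa = GAMES * OUTS_PER_GAME / (1 - obp)
    t_ab = t_pa * (1 - obp) / (1 - avg)
    return obp * slg * t_ab


def team_runs_reduced(avg: float, obp: float, slg: float) -> float:
    """The reduced form, with the constant equal to 162 * 25.5."""
    return GAMES * OUTS_PER_GAME * obp * slg / (1 - avg)


# Illustrative slash line (an assumed input, not any particular team).
avg, obp, slg = 0.268, 0.339, 0.423
stepwise = team_runs_stepwise(avg, obp, slg)
reduced = team_runs_reduced(avg, obp, slg)
print(f"stepwise: {stepwise:.1f}  reduced: {reduced:.1f}")
```

The (1-T_OBP) terms cancel, so the two functions agree up to floating-point rounding for any slash line.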

I’ll be back later to prove that this is wrong.  Keith should use BaseRuns as the basis. 


#1    Patriot      (see all posts) 2007/03/05 (Mon) @ 16:24

And of course Tango is doing it right.

OBA*SLG/(1-BA) is by definition runs per out from basic RC.  Multiplying this by the average of 25.5 outs/game and 162 games/season makes it the basic RC estimate of team seasonal runs scored.

MLV also introduces the issue that a player’s value is now expressed in terms of his impact on a theoretical team (including the extra PA that he will create for himself that may not have actually existed).

What I mean is that a good OBA player, say Jeter on the Yankees, will wind up with more PA in real life than he will on the theoretical team.  But a good OBA player on a poor OBA team, like Jason Bay, will end up with more PA on the theoretical team.

In one sense, this can be seen as a good thing, neutralizing the effect of team quality on player opportunity.  But if you want to look at the player’s actual real life performance and determine its value to his actual team, this is not something that you want to do.  So you have to think very carefully about what exactly it is that you are trying to measure, and I’m not sure that BP has done this in incorporating MLV into VORP.


#2    Tangotiger      (see all posts) 2007/03/05 (Mon) @ 16:58

My first concern was the use of basic RC.  Why that is done, I don’t know.  Using EqR (i.e., Linear Weights) I think would be better than using the theoretical-team (TT) construct with RC as the basis.  That is, using the TT approach is good, but using the RC as the basis is not good.  TT and BsR is what you want.

My second concern is echoed by Patriot.  I’m not sure the PA is handled properly, though I’ve yet to put pen to paper in analyzing this issue.  I will soon though.


#3    Tangotiger      (see all posts) 2007/03/05 (Mon) @ 17:25

Similar to the way I evaluated EqA, I’ll do the same for VORP.  I start with the same standard hitting line (.268/.339/.423), and figure out what happens when I add 1 single or 1 walk, etc. 

Remember that the weights, relative to the single are: 0.7 for walk, 1.6 for double, 2.2 for triple, 2.9 for HR, and -0.6 for an out.  This is true whether you use Linear Weights, wOBA, EqA, or 1.8*OBP+SLG.

And VORP?  1.6 for double, 2.2 for triple, 2.7 for HR, all ok.  But 0.4 for the walk (and -0.55 for the out, which is necessary to compensate for the low value of the walk).  Right away, we can see that this is wrong.  RC is notorious for undervaluing the walk, and VORP perpetuates this problem.  But, with all the mathematical gymnastics that it goes through (similar to the obfuscation of EqA), no one can tell.  Except for Patriot and a few other people.

I tried different stat lines by playing with the number of outs, or walks, or HR, and I always end up with the same thing.
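
To give a feel for the plus-1 method, here is a rough Python sketch that applies it to basic RC directly, RC = (H+BB)*TB/(AB+BB). This is not the full VORP/MLV calculation, and the team totals are hypothetical, picked only so that they land near a .268/.339/.423 line.

```python
# Sketch of the plus-1 method applied to basic Runs Created,
# RC = (H + BB) * TB / (AB + BB). The team totals are hypothetical.

def basic_rc(ab, h, bb, tb):
    return (h + bb) * tb / (ab + bb)


BASE = dict(ab=5500, h=1474, bb=590, tb=2327)  # hypothetical team totals

# What one extra event of each type adds to the counting stats.
EVENTS = {
    "single": dict(ab=1, h=1, tb=1),
    "double": dict(ab=1, h=1, tb=2),
    "triple": dict(ab=1, h=1, tb=3),
    "hr": dict(ab=1, h=1, tb=4),
    "walk": dict(bb=1),
    "out": dict(ab=1),
}


def plus_one(event):
    """Change in RC from adding exactly one of the given event."""
    bumped = dict(BASE)
    for stat, delta in EVENTS[event].items():
        bumped[stat] += delta
    return basic_rc(**bumped) - basic_rc(**BASE)


single = plus_one("single")
relative = {name: plus_one(name) / single for name in EVENTS}
for name, weight in relative.items():
    print(f"{name:7s} {weight:5.2f}")
```

With these made-up totals the walk comes out well under 0.7 of a single. The out weight from raw RC is probably not on the same baseline as the figures quoted above, so only the hit and walk ratios are meant to be compared.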

My next step is to implement Keith’s exact methodology, step-by-step, and then use the plus-1 method to show exactly how much a change in walks or HR affects the overall value.  I’m quite confident that it won’t work.  But, let me do the exercise anyway.


#4    Chris Long      (see all posts) 2007/03/06 (Tue) @ 06:39

Tango, why not just calculate the partial derivatives to find the rate of change of the individual components?  It should be simpler and close enough.

-Chris


#5    tangotiger      (see all posts) 2007/03/06 (Tue) @ 09:05

I say “partial derivatives”, and I lose almost my whole audience.  I need to include “Jessica Alba” in the same sentence to get them back.

However, Patriot uses partial derivatives to great results in various articles here:
http://gosu02.tripod.com/id76.html

Those who are really into the technicals will enjoy all those articles.


#6    Patriot      (see all posts) 2007/03/06 (Tue) @ 11:01

Trust me, you don’t want to try to differentiate the MLV formula.  I did it for about 5 minutes once and concluded “Let’s see what the +1 method has to say”.  It’s a real pain in the butt. 

In practical terms, there should be virtually no difference between the partial derivative and the +1 method.  If you have a very simple function, like RC or BsR, then the derivatives are easy to find and it’s nice to be able to give a precise formula rather than the nebulous “add one of the event, then...” but the end results will be nearly identical, and we really don’t care whether MLV values a single at .5812 or .5816 runs; .58 will serve us just fine.


#7    Guy      (see all posts) 2007/03/06 (Tue) @ 13:36

Whatever the RC vs. BsR problems may be, this is clearly a huge improvement over WARP (as Tango said in the first post).  But I think the name is unfortunate:  “VORP” is already used by anti-stat baseball writers as a kind of strawman for statistical analysis, simply because it sounds funny.  Imagine the fun they’ll have with “SuperVORP.” I hope Nate will consider a new name.


#8    Rally      (see all posts) 2007/03/06 (Tue) @ 13:59

Aren’t they the ones that also came up with MORP?

I don’t even remember what it is, but it sounds funnier than VORP.


#9    tangotiger      (see all posts) 2007/03/06 (Tue) @ 14:51

MORP is the non-linear version of WARP, in dollar terms. 

free agent salary = MORP = 1.2 * (WARP^1.5) + 0.4

Contrast that with mine: 4 * WAR

The average team has 57 WARPs, so that makes the average full-time player at around 3.4 I guess.  So, the average player would have a MORP (free agent salary) of 8 MM.

The average team has 34 WARs, making the full-time player at around 2.  That makes the average player 8 MM as well.

In short, the WARP guy gets around 1.4 more “wins” than a WAR guy.

Now that we are in sync, consider a great player: someone who would be a +8 WAR guy would be a +9.4 WARP guy.  In my system, he gets 32 MM, and in MORP he gets 35 MM.

A +5 WAR (+6.4 WARP), it’s 20 MM for me and ... uh, 20 MM for MORP.

A +3 WAR (+4.4 WARP), it’s 12 MM for me and ... uh… 11.5 MM for MORP.

Seems to me that making it exponential doesn’t really do anything:

 WAR    WARP    Me$   MORP$
1.00    2.40    4.0     4.9
2.00    3.40    8.0     7.9
3.00    4.40   12.0    11.5
4.00    5.40   16.0    15.5
5.00    6.40   20.0    19.8
6.00    7.40   24.0    24.6
7.00    8.40   28.0    29.6
8.00    9.40   32.0    35.0
9.00   10.40   36.0    40.6
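
For anyone who wants to reproduce that table, here is a minimal Python sketch. It assumes the WARP = WAR + 1.4 conversion used in this comment; the function names are invented for the example.

```python
# Sketch: compare the linear 4 * WAR valuation with
# MORP = 1.2 * WARP**1.5 + 0.4, assuming WARP = WAR + 1.4.

WARP_OFFSET = 1.4  # extra "wins" a WARP player gets relative to WAR


def morp(warp):
    """Free agent salary in MM from the MORP formula."""
    return 1.2 * warp ** 1.5 + 0.4


def linear_salary(war, per_win=4.0):
    """Free agent salary in MM at a flat rate per win."""
    return per_win * war


rows = []
for war in range(1, 10):
    warp = war + WARP_OFFSET
    rows.append((war, warp, linear_salary(war), morp(warp)))

print("WAR  WARP   Me$  MORP$")
for war, warp, me, mo in rows:
    print(f"{war:3d}  {warp:4.1f}  {me:4.1f}  {mo:5.1f}")
```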


#10    tangotiger      (see all posts) 2007/03/06 (Tue) @ 14:56

I guess this really points something out in WARP: making it exponential with MORP simply aligns it to something… linear!

The whole non-linear trend that MORP is showing about WARP is only because of the poor construction of WARP.

I wouldn’t be surprised, at all, if Nate recalculates MORP using SuperVORP instead of WARP to find a completely linear equation, one which will essentially match mine.


#11    tangotiger      (see all posts) 2007/03/06 (Tue) @ 15:03

I just noticed that I never added the “0.4” to my measure.  To align the average player, I should be using 3.8MM per win, not 4.0.  Here then is the new chart:

 WAR    WARP    Me$   MORP$
1.00    2.40    4.2     4.9
2.00    3.40    8.0     7.9
3.00    4.40   11.8    11.5
4.00    5.40   15.6    15.5
5.00    6.40   19.4    19.8
6.00    7.40   23.2    24.6
7.00    8.40   27.0    29.6
8.00    9.40   30.8    35.0
9.00   10.40   34.6    40.6

The divergence happens at around the 6 WAR level.  Other than Pujols, no one exceeds that.  (THT has Santana and ARod at around 5.5, which is less than 1 MM difference between me and MORP$).

And, since most of you thought I was crazy to consider Pujols worth as much as I said he was in $ terms (that if a player is tooooo good, there’s some sort of “diminishing returns”), you guys are going to hate the ever-rising value of his MORP$.

So, let’s all agree that linear is the easiest, safest, cleanest way to work and think about this, and anything else will make you go through so many mathematical gymnastics that it makes you see something (like MORP/WARP do) that simply isn’t there.


#12    tangotiger      (see all posts) 2007/03/06 (Tue) @ 21:26

Nate pointed out that it’s at the lower-end that you’ll see the biggest difference, and he’s right:

   WAR    WARP     Me$   MORP$
(1.00)    0.40   (3.4)     0.7
(0.50)    0.90   (1.5)     1.4
  0.00    1.40     0.4     2.4
  0.50    1.90     2.3     3.5
  1.00    2.40     4.2     4.9
  1.50    2.90     6.1     6.3
  2.00    3.40     8.0     7.9
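
The size of those low-end gaps can be checked with a short Python sketch. It assumes the 3.8 MM per win plus 0.4 alignment from the earlier comment and the same WARP = WAR + 1.4 conversion; the names are illustrative only.

```python
# Sketch: gap between MORP and the linear valuation (3.8 * WAR + 0.4)
# at the low end. Assumes WARP = WAR + 1.4.

def morp(warp):
    return 1.2 * warp ** 1.5 + 0.4


def linear_salary(war):
    return 3.8 * war + 0.4


low_end = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
gaps = {war: morp(war + 1.4) - linear_salary(war) for war in low_end}

for war, gap in gaps.items():
    print(f"WAR {war:5.2f}: MORP$ - Me$ = {gap:5.1f} MM")
```

The gap is about 4 MM at -1 WAR and shrinks to roughly zero by 2 WAR, which is where the two scales were aligned.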

Those are huge gaps.  And all that MORP does is show us how you have to decide whether to trust the superlow replacement level of WARP or mine in WAR.

I don’t see the benefit of the exponent in MORP, and just see the downside of the superlow replacement level in WARP.


#13    salb918      (see all posts) 2007/03/06 (Tue) @ 21:49

A note on the partial derivative/+1 method: if I understand correctly, the +1 method is basically a numerical partial derivative and will give the same answer as the +X method as X-->0.  If more accuracy is desired without resorting to actual differentiation then one might consider a +1/2 or +1/4 method.


#14    Patriot      (see all posts) 2007/03/06 (Tue) @ 23:41

Salb is correct.  With a spreadsheet, you can use .000001 or whatever, and you will have virtually the same answer as the partial derivative.

But adding 1 is a lot easier to explain to people who don’t understand the notion of the derivative, and the results are close enough adding 1.  For example, in the 06 NL, the basic RC HR value is 1.579087977 runs using the +1 method.  Using the partial derivative, it is 1.579063133.
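
Here is a small Python sketch of that comparison for the basic RC home run value. The team totals are hypothetical (not the 2006 NL), so the numbers will not match the ones quoted here; the point is only how the +X result approaches the derivative as X shrinks.

```python
# Sketch: the +X method as a numerical partial derivative of basic RC
# with respect to home runs. Team totals are hypothetical, not the 2006 NL.

AB, H, BB, TB = 5500.0, 1474.0, 590.0, 2327.0  # hypothetical totals


def basic_rc(ab, h, bb, tb):
    return (h + bb) * tb / (ab + bb)


def hr_value_plus_x(x):
    """Per-HR run value from adding x home runs (x AB, x H, 4x TB)."""
    base = basic_rc(AB, H, BB, TB)
    bumped = basic_rc(AB + x, H + x, BB, TB + 4 * x)
    return (bumped - base) / x


def hr_value_derivative():
    """Analytic derivative of RC = A*B/C along the HR direction."""
    a, b, c = H + BB, TB, AB + BB
    return (b + 4 * a - a * b / c) / c


exact = hr_value_derivative()
for x in (1.0, 0.5, 0.25, 1e-6):
    approx = hr_value_plus_x(x)
    print(f"+{x:<8g} {approx:.9f}  (error {approx - exact:+.2e})")
print(f"derivative {exact:.9f}")
```

With team-sized totals the +1 error is larger than with league-sized totals, since one extra HR is a bigger relative bump, but it is still well below anything that matters in practice.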


#15    salb918      (see all posts) 2007/03/07 (Wed) @ 00:35

I think the non-linearity of MORP is appealing because it confirms a notion that teams pay a premium for star-level talent.  If it turns out - as seems to be the case - that the non-linearity is due to a problem with the definition of replacement, then perhaps teams don’t pay a premium for star-level talent.


#16    tangotiger      (see all posts) 2007/03/07 (Wed) @ 02:35

While I say “plus 1”, I really end up doing “plus 1/1,000,000”, which is the same thing as the derivative.


#17    Tangotiger      (see all posts) 2008/12/22 (Mon) @ 12:18

Bumping to focus on posts #9 and later…

