THE BOOK--Playing The Percentages In Baseball

Wednesday, January 13, 2010

Mail: rWAR v fWAR

By Tangotiger, 05:23 PM

> From a completely unrelated area… can you explain to me (or point to
> some piece which does the same) how it is that Rally’s WAR figures differ
> from FanGraphs WAR figures? Ken Davidoff used Rally’s historical WAR data
> (and a 60-WAR threshold) when completing his HoF ballot. As a mental
> exercise, I went to see what this criterion would suggest about those
> already enshrined and those not yet eligible.

Three major differences:
1. Pitchers: fWAR is 100% DIPS-compliant, while rWAR teases out the fielding part with his team fielding stats

2. UZR v TotalZone

3. rWAR has baserunning

Otherwise, the differences would probably be pretty minor, like positional adjustments are a bit different, or something.

The 60 WAR baseline is pretty good.  The 50/50 mark for making the Hall of Fame via BBWAA is at around rWAR of 55 or so.  So, anyone at 60 WAR should be seriously considered, and 70 WAR is a shoo-in.  Edgar, Alomar, Larkin, Raines are in the 65-68 range.  Blyleven is 92!  The 50-59 range is where the arguments take place: Dawson, McGriff, etc.  This really gets to the heart of the “type” of HOF you want: small or big, etc.
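The thresholds above can be written down as a rough lookup. This is only a sketch: the cutoffs (50, 60, 70) come from the post, while the function name `hof_tier` and the tier labels, including the label for anything under 50, are assumptions made for illustration.

```python
def hof_tier(career_rwar: float) -> str:
    """Map a career rWAR total to the rough tiers described in the post.

    Cutoffs are from the post; the labels themselves are illustrative.
    """
    if career_rwar >= 70:
        return "shoo-in"
    if career_rwar >= 60:
        return "serious consideration"
    if career_rwar >= 50:
        # The post puts the BBWAA 50/50 mark at roughly 55 rWAR.
        return "argument zone"
    # The post says nothing about totals under 50; this label is an assumption.
    return "below the usual debate"


if __name__ == "__main__":
    for name, war in [("Blyleven", 92), ("65-68 group", 66), ("50/50 mark", 55)]:
        print(name, war, hof_tier(war))
```

A straight cutoff like this ignores peak value, so it works better as a first filter than as a ballot.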


#1    philly      (see all posts) 2010/01/13 (Wed) @ 17:32

Is there still a catcher’s defense difference?  fWAR assumes all catchers are average defensively and rWAR uses the basic available catcher related data to come up with a defensive component.

Has anybody looked to see how much of a difference that makes to catchers in the overlapping era (2002-2009)?


#2    Tangotiger      (see all posts) 2010/01/13 (Wed) @ 17:50

Yowza, excellent point.


#3    Rally      (see all posts) 2010/01/13 (Wed) @ 17:54

I was going to say catcher defense too.  This shouldn’t be an issue too much longer: David Appelman has told me he plans on adding a catcher defense statistic next season.

Other than that, I’m floored.  Someone used my numbers as a guide for an actual HOF ballot?  Just..Wow.  That’s kind of like having my own vote.

My site has the top 500 hitters and pitchers, with a column for “HOF” and “HOM” if the player was voted into either hall.  So it should be pretty easy to compare WAR totals to the voting.


#4    jonm      (see all posts) 2010/01/13 (Wed) @ 19:02

Thanks for your site, Rally. I’ve been looking at it a lot lately and have derived much enjoyment from it.

I noticed the catching defense issue; I think that your system does lead to an undervaluation of catchers. Out of the 500 position players, there are only 37 catchers and, intuitively, it just seems that Bench and Berra should be higher.


#5    Blackadder      (see all posts) 2010/01/14 (Thu) @ 06:13

Of course, the next step after looking at total rWAR is to make some consideration of peak.  Pujols and Rose are very close right now in career WAR, but obviously (even setting aside gambling issues) Pujols has a vastly better HOF case.  Still, voting straight WAR is light years ahead of ignoring Rickey Henderson because you weren’t a “Rickey guy”.


#6    Nick Steiner      (see all posts) 2010/01/14 (Thu) @ 06:24

Blackadder - I’m working on a Pennants Added metric right now - basically it’s just a modified version of David Gassko’s study a couple of years ago.  I think it will end up weighting great seasons exponentially higher, so a guy like Pujols will be better than Rose.


#7    Ryan JL      (see all posts) 2010/01/14 (Thu) @ 06:44

Select catchers, 2002-2009:

fWAR, rWAR
Posada:  36.6, 31.4
Pudge:  22.9, 21.4
Varitek:  22.3, 18.0
Kendall:  20.3, 16.1
AJP:  18.9, 11.4
B Molina: 14.8, 7.3
Y Molina:  8.5, 7.8
Ausmus:  5.2, 3.1


#8    Nick Steiner      (see all posts) 2010/01/14 (Thu) @ 07:02

I’m not sure how Pudge and Yadi could possibly lose WAR from the addition of catcher defense.  Let’s just take a look at Yadi. 

BProj has him with 2413 PA, -57 batting runs, -15 baserunning runs, -9 GIDP runs, -7 ROE runs, +59 catching runs, +40 positional adjustment and +69 replacement for a total of 80 RAR. 

FanGraphs has him 2458 PA, -44.7 batting runs, 0 baserunning runs, 0 GIDP runs, 0 ROE runs, 0 catching runs, +49.3 positional adjustment and +81.9 replacement for a total of 85.9 RAR. 

So the difference is pretty big here.  Rally includes baserunning, catcher defense, GIDP and ROE, while FanGraphs doesn’t; so when you consider all of those, Yadi should be +28 more on BProj vs. FanGraphs. 

However, FanGraphs has higher values for batting runs, positional and replacement level adjustments.  Positional and replacement level, I could see a reason for a discrepancy, as there are different interpretations of what those should be (although, don’t Rally and David both use Tango’s positional adjustments?).  But I don’t see what could explain the 12 run difference in batting runs.  Is it just because of the custom team linear weights Rally uses?
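The arithmetic in this comment can be checked with a short sketch. The component values are the ones quoted above; the dictionary keys and variable names are illustrative. One caveat: the FanGraphs components as quoted add up to 86.5 rather than the stated 85.9, so one of the quoted figures may be rounded or mistyped. The comment does not say which.

```python
# Yadier Molina, 2002-2009, runs by component as quoted in the comment.
bproj = {
    "batting": -57, "baserunning": -15, "gidp": -9, "roe": -7,
    "catching": 59, "positional": 40, "replacement": 69,
}
fangraphs = {
    "batting": -44.7, "baserunning": 0, "gidp": 0, "roe": 0,
    "catching": 0, "positional": 49.3, "replacement": 81.9,
}

# Total runs above replacement for each system.
bproj_rar = sum(bproj.values())
fangraphs_rar = sum(fangraphs.values())

# Components that rWAR measures and fWAR (at the time) set to zero.
only_in_rwar = ("baserunning", "gidp", "roe", "catching")
extra_runs = sum(bproj[k] for k in only_in_rwar)

# The unexplained gap in batting runs between the two systems.
batting_gap = fangraphs["batting"] - bproj["batting"]

if __name__ == "__main__":
    print("BProj RAR:", bproj_rar)
    print("FanGraphs RAR (sum of quoted components):", round(fangraphs_rar, 1))
    print("Runs from rWAR-only components:", extra_runs)
    print("Batting-run gap:", round(batting_gap, 1))
```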


#9    Kincaid      (see all posts) 2010/01/14 (Thu) @ 08:04

I’m not sure that the system undervalues catchers just because they are less well represented than other positions among the WAR leaders.  Because of the wear and tear of the position, it makes it hard for catchers to have long careers, and it makes it hard for them to keep their value as long as players at other positions.  Because of that, career value totals will dwindle and stop accumulating more quickly at catcher.  That could be a point in favour of increasing the position adjustment.  Do you want to only look at the value provided within a season to a team, or do you want to add in a bonus for long-term destruction to the player’s career?  Is it more important to capture a catcher’s seasonal contributions on the same scale as other positions, or to capture his career contributions on the same scale as other positions?  I’m not sure which is more right, and I think you would get different answers depending on how you feel about that.  Maybe catchers aren’t getting enough credit because of that.

I think there is another issue that will ensure that catchers are likely to be less well-represented among WAR leaders, though.  Because of the value-sapping nature of the position, a lot of the best players who are capable of playing catcher probably play other positions in amateur ball or move to other positions by the time they are professionals.  I suspect players with the ability to play catcher in addition to other positions tend to choose other positions, and I suspect there is some tendency of teams (even at the amateur level) to move really good prospects with good bats and decent mobility away from catcher to preserve their health.  It’s a selection bias that results from enough people thinking playing catcher sucks (what with it destroying your knees and giving your body a beating you don’t have to endure at other positions and all) that those who can make it without having to play catcher tend to not play catcher, and those who do play catcher tend to be of more marginal talent, accepting the beating of the position to increase their chances of playing.

Because of the selection bias, if catchers were as well represented as other positions among the top however many in WAR, it would probably mean they are being overvalued.


#10    Tangotiger      (see all posts) 2010/01/14 (Thu) @ 08:22

I’ll add a 5th big difference: park factors.


#11    Rally      (see all posts) 2010/01/14 (Thu) @ 10:49

Two things on my catcher ratings vs Fangraphs:

I use a slightly higher replacement level, so a player getting +22 on Fangraphs might get +20 on mine.

My catcher position adjustment is lower, I use +10 instead of +12.5.  That comes from observations made in a replacement level article:
http://www.hardballtimes.com/main/article/replacement-level-article/


#12    Rally      (see all posts) 2010/01/14 (Thu) @ 11:02

One thing to consider in the future, regarding position adjustments, is baserunning.  Most catchers will be well below average, positions like SS and CF above average.  If the hitting stats of freely available catchers indicate they are 10 runs worse than average, and they are all slow runners in addition, maybe I should use +12.5 instead.


#13    Rally      (see all posts) 2010/01/14 (Thu) @ 13:37

Does Fangraphs WAR account for the difference in league quality?  Mine do, and you can see it in the replacement column.  On Fangraphs the replacement column does not have an obvious league difference, it’s in the same order as total plate appearances.  But I don’t know if David takes care of the league quality in the WOBA to runs calculation or something.

It does appear like the NL leaders are consistently ahead of the AL leaders, so maybe league is not considered.


#14    Kincaid      (see all posts) 2010/01/14 (Thu) @ 14:45

David said the other day on FG that he uses the same replacement level for both leagues.  I’m pretty sure the wOBA to wRAA is not league dependent, because the numbers on the site match up with what I get from the Lahman database without a league adjustment.  Unless part of the translation from wRAA to batting runs involves a league adjustment (and the only adjustment I’ve heard of them using there is a park adjustment, so I doubt it does), I don’t think there is one.


#15    Tangotiger      (see all posts) 2010/01/14 (Thu) @ 14:56

Good stuff.  Ok, so we have these additional differences:

4. catcher defense missing in fWAR

5. park factors treated differently

6. league differences treated differently (fWAR makes no adjustment)


#16          (see all posts) 2010/01/15 (Fri) @ 00:28

The offense is somewhat different, with Fangraphs using wRAA based off of wOBA (with a broad park adjustment) and Rally using a custom BaseRun formula to make sure that the sum of the individual players’ contributions on a team equal the team’s actual runs scored.

I’m not sure how big of a difference that is though.


#17    Nick Steiner      (see all posts) 2010/01/15 (Fri) @ 00:54

Rally - how hard would it be to add a quality of batters/pitchers faced adjustment?  I think that would be pretty useful.


#18    Rally      (see all posts) 2010/01/15 (Fri) @ 09:55

I do that for pitchers - at least on the teams faced level.  It’s not easy.  It adds a few extra steps to the whole process, which was a bear to update this past winter.  I had forgotten how many programs I had used to put all that stuff together.


#19    Matthew Cornwell      (see all posts) 2010/08/16 (Mon) @ 23:07

Another quick question - do both rWAR and fWAR set the same replacement levels?  At one point, wasn’t Rally doing .420 for starting pitchers and FG about .380?


#20    Tangotiger      (see all posts) 2010/08/17 (Tue) @ 10:02

They have different replacement levels.  The total number of WAR for Fangraphs is around 1000 for current seasons, and it’s around 870 I think for Rally.
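A league-wide WAR total translates directly into an implied winning percentage for a team made up entirely of replacement players. The sketch below assumes 30 teams and a 162-game schedule, and that the average team wins exactly half its games; the totals of 1000 and 870 are the rough figures quoted above, and the function name is illustrative.

```python
TEAMS = 30    # assumed number of MLB teams
GAMES = 162   # assumed games per team


def replacement_win_pct(total_league_war: float) -> float:
    """Winning percentage of an all-replacement team implied by a league WAR total."""
    average_team_wins = GAMES / 2              # wins equal losses league-wide
    war_per_team = total_league_war / TEAMS    # wins above replacement on an average team
    return (average_team_wins - war_per_team) / GAMES


if __name__ == "__main__":
    print("fWAR-style total of 1000:", round(replacement_win_pct(1000), 3))
    print("rWAR-style total of 870: ", round(replacement_win_pct(870), 3))
```

Under these assumptions the two totals imply replacement teams a little over four wins apart per season, which is the league-wide view of the gap in replacement level.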


#21    kds      (see all posts) 2010/08/17 (Tue) @ 13:21

Fangraphs does now go back historically past 2002, so they must be using something other than UZR for those WAR calculations.

Tom, I think you want to add different replacement level to your list.


#22    Rally      (see all posts) 2010/08/17 (Tue) @ 14:59

Before 2002, Fangraphs is using TZ in the WAR calculation, though I think David is not using my numbers for the rest, such as batting and pitching runs.


#23    Matthew Cornwell      (see all posts) 2010/08/17 (Tue) @ 23:26

So rWAR has a higher replacement level but fWAR does not adjust for AL vs. NL.  So which one of those has a bigger impact, the gap between the replacement levels or the difference between the league adjustments. 

Rally said that a pitcher with +20 in rWAR might be +22 in fWAR.  How would that number change for a modern day pitcher in the NL vs. AL?


#24          (see all posts) 2010/09/02 (Thu) @ 13:16

so i’ve been checking out some of the phillies starters. 

when looking at Cole Hamels on fangraphs, he has a 3.0 WAR.  i thought that was kinda low because he’s having an absolutely great year.  so i go to bbref and check out his WAR over there and he’s at 3.9. almost a full win more than fangraphs.

i checked out Oswalts and he’s roughly .6 more on rwar than fwar. 

whats the differences that effect pitchers? 

which one do you guys put more stock in?  it seems that rwar really goes the extra mile while fangraphs is fine with broad sweeping adjustments.


#25    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 13:54

Jaime: can you tell me what you learned from this thread regarding pitcher WAR?


#26          (see all posts) 2010/09/02 (Thu) @ 14:01

TZ vs UZR
Batter Quality
park factors
league differences
100% dips vs. something i don’t quite understand

that about sums it up.  i just wasn’t sure if there was a whole different framework between the two.  i wouldn’t expect the differences from the two to possibly equal 1 win or more.


#27    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 14:16

It seems that you understand it well-enough to have answered your own question:

“whats the differences that effect pitchers?”

And, as for this: “which one do you guys put more stock in? “, it’s really a matter of personal philosophy.  There’s not much wrong or right (at the seasonal level).  I just split the difference.

At the career level, the 100% DIPS philosophy doesn’t work as well, so fWAR loses some ground there.


#28          (see all posts) 2010/09/02 (Thu) @ 14:26

I have an idea of whats going on.  but a 1 win difference between the two is pretty big in my opinion.

i guess the correct question would be “what are the philosophical differences between the two?”

if i choose rwar to be my preferred pitcher WAR, then do i have to bring the batter WAR with it to stay consistent?


#29    Rally      (see all posts) 2010/09/02 (Thu) @ 14:49

That seems like a matter of personal choice.  WAR is a framework.  It’s very easy for anyone to substitute one component for another.  Since both Fangraphs and BB-ref have John Dewan’s fielding numbers, it would be very easy to use WAR and replace the fielding part with Dewan’s if you wanted to.  Or use Chris Dial’s, although not quite as convenient.

I wouldn’t have an issue with using rWar for pitchers and fWar for batters, or something like that.  The only consistency you need to follow is be consistent with yourself. 

For example, say you choose rWar in general as your metric for batters, but happen to be a Ranger fan, and all of a sudden you decide you like how fWar rates Josh Hamilton and switch to make an MVP argument.  That would be cherry picking.  Try to avoid that.


#30    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 15:02

“but a 1 win difference between the two is pretty big in my opinion.”

This is the Ricky Nolasco issue from 2009, which if I remember had something like a 3 win difference.

Nolasco had two things going against him last year: a terrible BABIP (.336) and a horrible performance with men on base (only 61% of runners stranded).  FIP of course focuses on everything about pitching except those two things.  A 100% DIPS-based WAR system will presume an average BABIP, and an average sequencing of events.  So, Nolasco looked great in fWAR in 2009, and terrible in rWAR.

In rWAR, Rally counts the fact that Nolasco was on the mound when runs were being scored.  Regardless of the opinion of whether he “deserved” it, he was the general on the battlefield.  So, Rally dings him heavily for being on the mound when lots of batted balls fall in, and being on the mound when he sequenced the events in a bad order.

It’s a PERSONAL opinion as to what you want to do.

Now, this year, his BABIP is still high (.328, career .314), which gives SOME weight that maybe he is indeed partly responsible for giving up all those hits.  His strand rate is 72% this year, which means he’s been much better at sequencing.

In the end, I’ll guess that 2010 for Nolasco will be similar for fWAR and rWAR, and that each will be half-way between the fWAR and rWAR of 2009.

Basically, understand why things happen, and then save yourself the headache of choosing and just split the difference.


#31          (see all posts) 2010/09/02 (Thu) @ 15:10

tango/#30

that makes sense.  i guess the part that didn’t make the most sense to me is responsible for most of the difference!  i didn’t understand what you meant by:

“1. Pitchers: fWAR is 100% DIPS-compliant, while rWAR teases out the fielding part with his team fielding stats”

i think i fall more in line with the rwar than the fwar in respect to that.


#32          (see all posts) 2010/09/02 (Thu) @ 15:17

Use both, as much as you can.  If you’re just curious where a player ranks on the career lists, either one is roughly as good as the other.  But if you want to argue for MVP reasoning, or just, “my 2nd baseman is better than yours”, I’d use both.  And anything else you can get and reasonably explain what it means.


#33    NaOH      (see all posts) 2010/09/03 (Fri) @ 00:47

Sean Forman has a new post up at the NY Times baseball blog providing a basic description of the factors used for his/Rally’s WAR for pitchers.

http://nyti.ms/94fjcI


#34    Tangotiger      (see all posts) 2011/09/08 (Thu) @ 15:23

Ricky Nolasco was 4.3 fWAR in 2009 and -0.3 in rWAR (average of 2.0).

To no surprise, in 2010, his fWAR and rWAR coalesced to something in between those two: fWAR of 2.5, and rWAR of 1.4.

Basically, each metric regressed about 75% to the midpoint.  (Hence, my rule of just taking both metrics and splitting it in two.)

However, in 2011, Nolasco continues to be a bane, with an fWAR of 3.3 and an rWAR of 1.1.  Once again, he’s close to that 2.0 WAR midpoint.

Nolasco’s a fun case, in trying to figure out which WAR version you prefer, and whether you might prefer one version for single year and another for multi-years.
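The "regressed about 75% to the midpoint" figure can be reproduced from the numbers in this comment. A minimal sketch, with illustrative function and variable names:

```python
def share_regressed(previous: float, current: float, midpoint: float) -> float:
    """Fraction of the distance from `previous` to `midpoint` covered by `current`."""
    return (previous - current) / (previous - midpoint)


# Nolasco's 2009 values and the midpoint between them.
fwar_2009, rwar_2009 = 4.3, -0.3
midpoint_2009 = (fwar_2009 + rwar_2009) / 2

# How far each metric moved toward that midpoint in 2010.
fwar_share = share_regressed(fwar_2009, 2.5, midpoint_2009)
rwar_share = share_regressed(rwar_2009, 1.4, midpoint_2009)

if __name__ == "__main__":
    print("2009 midpoint:", round(midpoint_2009, 1))
    print("fWAR share regressed:", round(fwar_share, 2))
    print("rWAR share regressed:", round(rwar_share, 2))
```

Both shares land in the mid-to-high 70s as a percentage, which matches the "about 75%" in the comment.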


#35    rempart      (see all posts) 2011/09/08 (Thu) @ 17:22

I was thinking about the same thing today. How about Seamheads sWAR (Base Runs/DIPS/WS defense)? Then form a triumvirate (see below). Average the three, or throw out the high and low, or weight them 30%/30%/40% or something. Love this stuff. The framework for WAR is great. Wish I had it when I was a kid.

 avg   sWAR   rWAR   fWAR   Player
 7.9    7.7    8.1    8.0   Jose Bautista
 7.4    6.8    8.6    6.8   Matt Kemp
 7.0    7.0    7.7    6.4   Justin Verlander
 7.0    6.4    6.5    8.1   Jacoby Ellsbury
 6.5    5.9    6.2    7.4   Roy Halladay
 6.4    5.9    6.9    6.3   Ryan Braun
 6.3    5.5    6.4    7.0   Dustin Pedroia
 6.1    5.5    6.3    6.6   Joey Votto
 6.0    6.2    5.7    6.2   Clayton Kershaw
 6.0    5.4    5.9    6.7   C.C. Sabathia
 6.0    5.4    6.4    6.1   Adrian Gonzalez
 5.9    5.8    5.2    6.8   Curtis Granderson
 5.9    5.8    6.1    5.9   Cliff Lee
 5.9    5.5    5.7    6.6   Troy Tulowitzki
 5.7    5.9    5.7    5.5   Miguel Cabrera
 5.6    5.4    6.0    5.4   Jered Weaver
 5.4    4.9    5.3    6.0   Alex Gordon
 5.3    5.8    5.1    5.1   Alex Avila
 5.3    4.9    5.4    5.7   Andrew McCutchen
 5.3    4.5    5.1    6.3   Shane Victorino
 5.2    5.4    5.2    5.1   Cole Hamels
 5.2    5.3    4.3    5.9   Ben Zobrist
 5.1    4.3    4.4    6.7   Justin Upton
 5.1    3.9    5.1    6.3   Shane Victorino
 5.1    4.3    4.7    6.3   Ian Kinsler
 5.0    5.9    4.3    4.8   Robinson Cano
 5.0    4.4    5.2    5.4   Felix Hernandez
 4.9    4.7    4.5    5.4   Jose Reyes
 4.9    5.2    3.7    5.7   Dan Haren
 4.8    4.2    6.1    4.1   Josh Beckett
 4.7    4.2    5.1    4.8   James Shields
 4.7    4.1    5.0    5.0   Albert Pujols
 4.6    5.0    3.8    5.1   Matt Cain
 4.6    4.1    4.2    5.4   Justin Masterson
 4.5    3.9    4.4    5.3   Matt Holliday
 4.4    4.1    4.3    4.8   C.J. Wilson
 4.2    4.1    4.0    4.6   Jhonny Peralta
 4.2    4.4    4.1    4.2   Prince Fielder
 4.1    4.5    4.0    3.8   Lance Berkman
 4.0    4.5    3.9    3.6   Paul Konerko

Notes:
Kemp has a huge difference due to the Fielding. One says he is +10 the other -4 or -5 I forget.

Some interesting pitchers as well with fairly wide disparity. Of course this is due to DIP v actual RA etc.

sWAR seems to correlate better with rWAR.

I just wanted to close by saying I don’t think it is a bad thing to look at it different ways. If something really is true you should be able to prove it different ways! In spite of what was written earlier in the week, these results certainly seem to pass the sniff test.
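The three ways of combining the metrics suggested in this comment (plain average, throw out the high and low, and a 30%/30%/40% weighting) can be sketched as follows. The comment does not say which metric would get the 40% weight, so giving it to fWAR here is purely an assumption, as are the function and key names.

```python
from statistics import mean, median


def combine(swar: float, rwar: float, fwar: float,
            weights: tuple = (0.3, 0.3, 0.4)) -> dict:
    """Combine three WAR versions in the three ways suggested in the comment.

    `weights` applies to (sWAR, rWAR, fWAR) in that order; which metric gets
    the 40% is an assumption, since the comment leaves it open.
    """
    values = (swar, rwar, fwar)
    return {
        "mean": mean(values),
        # With three values, dropping the high and the low leaves the median.
        "trimmed": median(values),
        "weighted": sum(w * v for w, v in zip(weights, values)),
    }


if __name__ == "__main__":
    # Matt Kemp's line from the table: sWAR 6.8, rWAR 8.6, fWAR 6.8.
    kemp = combine(6.8, 8.6, 6.8)
    for method, value in kemp.items():
        print(method, round(value, 2))
```

For a player like Kemp, where one system disagrees sharply because of fielding, the trimmed version simply discards the outlier, while the mean and weighted versions let it pull the total up.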


#36    Darren      (see all posts) 2011/09/08 (Thu) @ 20:45

You could also include BPro WARP. The only issue you have to consider is that they all have a different Replacement Level, which would require some adjustment before simply averaging them.


#37    Tangotiger      (see all posts) 2011/09/08 (Thu) @ 20:58

You don’t have to, because every player would be affected pretty much equally.  We are talking about guys with 500+ PA to begin with.

