Tuesday, October 03, 2006
Baseball Prospectus’ WARP1 is wrong
Let’s start off with the definition of WARP-1:
WARP-1
Wins Above Replacement Player, level 1. The number of wins this player contributed, above what a replacement level hitter, fielder, and pitcher would have done, with adjustments only for within the season.
Then, let’s look at a .500 team. When I need a .500 team, pretty much without fail, I look for the Houston Astros, and they satisfy my needs. They scored about as many runs as they allowed, and won about as many games as they lost. Let’s take a look at their team:
BP does a great job in presenting their stats, making my job very easy:
http://www.baseballprospectus.com/dt//2006HOU-N.shtml
If you go down to the “Advanced Batting Statistics” section (a misnomer, since the data there covers batting, fielding, and pitching), the team WARP-1 total is 58.5 wins. The Astros won 82 games, which is pretty much what their RS/RA numbers would have predicted. 82 minus 58.5 is 23.5 wins. 23.5 / 162 = .145. Another perennial .500 team I like is the Seattle Mariners. Their team WARP-1 is 52.1, and they won 78. Their RS/RA would have predicted around that as well. 78 minus 52.1 = 25.9 wins. 25.9 / 162 = .160. The Yanks won 97, which is also around their pythag record. Their team WARP-1 is 71.9. 97 minus 71.9 = 25.1 wins, or a .155 record.
What do we learn here? That BP’s WARP-1 treats the replacement level as a team with a .150 record.
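The arithmetic above is easy to reproduce. Here is a minimal Python sketch, using only the win totals and team WARP-1 figures quoted in this post; the function and variable names are my own, picked for illustration.

```python
GAMES = 162  # length of the regular season

# (actual wins, team WARP-1 total) as quoted in the post
teams = {
    "Astros": (82, 58.5),
    "Mariners": (78, 52.1),
    "Yankees": (97, 71.9),
}


def implied_replacement_pct(wins, team_warp, games=GAMES):
    """Winning percentage left over once the team's WARP-1 is subtracted from its wins."""
    return (wins - team_warp) / games


results = {
    name: implied_replacement_pct(wins, warp)
    for name, (wins, warp) in teams.items()
}

for name, pct in results.items():
    print(f"{name}: {pct:.3f}")

average = sum(results.values()) / len(results)
print(f"Average implied replacement level: {average:.3f}")
```

All three teams land between .145 and .160, which is where the .150 figure comes from.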
I’ve shown elsewhere on this site (click on the “Talent Distribution” category at the bottom of this entry) how the most likely team replacement level is a .300 record. This can be shown in many many ways. And, that pretty much is the number most analysts would use. To use anything else is, frankly, just plain wrong. Or needs a ton of explaining.
The replacement levels that I use are .380 for a position player or a starting pitcher, and .470 for a reliever. A team of such players will play .300 ball.
So, why does BP calculate WARP-1 the way they do? The likelihood is that it treats a “replacement-level” position player as both a replacement-level fielder and a replacement-level batter. But such a player is not the 420th best position player in the world. He’s probably not even in the top 1000 players in the world. Why is this the benchmark? What does it tell us?
I know all about the 1899 Spiders, and the recent Tigers. It doesn’t matter. Even if an MLB team posts a .140 or .250 record, our best estimate of the true talent level of these teams is nowhere close to those records. They probably need to be regressed 25-50% towards the mean.
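To put numbers on that, here is a rough Python sketch of regressing an observed record toward the mean. It assumes a simple linear pull toward a .500 league average, with the 25% and 50% figures taken from the paragraph above; the function name is my own, and this is an illustration rather than a precise true-talent estimate.

```python
def regress_to_mean(observed_pct, fraction, mean=0.500):
    """Move an observed winning percentage `fraction` of the way toward `mean`."""
    return observed_pct + fraction * (mean - observed_pct)


estimates = {}
for observed in (0.140, 0.250):
    low = regress_to_mean(observed, 0.25)   # regress 25% toward .500
    high = regress_to_mean(observed, 0.50)  # regress 50% toward .500
    estimates[observed] = (low, high)
    print(f"Observed {observed:.3f}: true talent roughly {low:.3f} to {high:.3f}")
```

Under these assumptions, a .140 record comes out at about .230 to .320, and a .250 record at about .313 to .375, both well above the .150 implied by WARP-1.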
Clay Davenport, the creator of WARP, is aware of this. It’s been brought to his attention many times. I’ve talked to him about it personally. He’s never bothered to fix it, for reasons no one can really understand.
It’s also why VORP doesn’t appear on the BP player cards: Keith Woolner actually calculates replacement level correctly, so VORP and WARP use different replacement level values, and they don’t line up with one another, even though both are “BP stats”.
Between the incorrect replacement level and the major flaws of FRAA and FRAR as fielding metrics, WARP is totally useless.