Saturday, November 14, 2009
The Sauer/Hakes Moneyball spreadsheet
I put it on Google Docs. Here is my first example:
Ok, so what do we see? The data on the left uses the 2001 coefficients. The data on the right uses the 2004 coefficients.
Player 1 is an OF/1B, player 2 is a C, and player 3 is an IF. I made each of them a free agent, with 600 PA, a .400 OBP and .500 SLG. Under the 2001 coefficients, his salary as a free agent is 4.5MM$ or 4.8MM$, depending on position. An infielder with those stats is a huge star, and an OF with those stats is a borderline one. And a catcher who can hit like that would be Mike Piazza. There should be more of a differentiation in salary than we see here.
Now, look at 2004. Notice the NEGATIVE coefficient for the infielder? That’s right, while the OF is earning 6.7MM$ for his performance, the infielder is earning 6.0MM$ for the same performance.
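To put a rough size on that infielder penalty, here is a minimal sketch. It assumes the regression is log-linear in salary with a position dummy, which is my reading of the setup and not something the spreadsheet output states. Under that assumption, two otherwise identical players differ in log salary by exactly the dummy's coefficient, so the two quoted salaries imply a coefficient of about -0.11. The salaries are rounded, so this is only approximate.

```python
import math

# Salaries in millions of dollars, as quoted above for the 2004 coefficients.
of_salary = 6.7  # outfielder
if_salary = 6.0  # infielder with the identical batting line

# Assumption (not confirmed by the post): log(salary) includes a term
# b_infield * is_infielder, so the log-salary gap equals b_infield.
implied_b_infield = math.log(if_salary / of_salary)

# The same gap expressed as a percentage of the outfielder's salary.
pct_gap = (if_salary / of_salary - 1) * 100

print(f"implied infielder coefficient: {implied_b_infield:.3f}")
print(f"infielder salary relative to OF: {pct_gap:.1f}%")
```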
In this case, I made all three players free agent outfielders. Player 1 is a slugger, player 3 is an on-base machine. Player 2 is in-between. In all cases, their 1.8*OBP + SLG is the same. As we know, this metric tracks Linear Weights (or wOBA) very well. We see that in 2001, the slugger would earn 6.3MM$, while the on-base heavy guy would earn 4.0MM$.
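For anyone who wants to build their own trio of players with the same 1.8*OBP + SLG, here is a small sketch. The OBP values below are hypothetical, picked only for illustration, and the baseline .400/.500 line is borrowed from the first example; they are not necessarily the lines I typed into the spreadsheet.

```python
def run_value_proxy(obp, slg):
    """The 1.8*OBP + SLG metric used in the post."""
    return 1.8 * obp + slg


def slg_for_target(obp, target):
    """SLG needed for a given OBP to hit a target 1.8*OBP + SLG."""
    return target - 1.8 * obp


# Baseline: the .400 OBP / .500 SLG hitter from the first example.
target = run_value_proxy(0.400, 0.500)

# Hypothetical OBP values for the three player types (illustration only).
hypothetical_obps = {
    "slugger": 0.350,
    "in-between": 0.400,
    "on-base machine": 0.450,
}

players = {
    name: (obp, slg_for_target(obp, target))
    for name, obp in hypothetical_obps.items()
}

for name, (obp, slg) in players.items():
    proxy = run_value_proxy(obp, slg)
    print(f"{name:16s} OBP {obp:.3f}  SLG {slg:.3f}  1.8*OBP+SLG {proxy:.3f}")
```

All three lines land on the same value of the metric, so any salary gap the model produces between them comes purely from how it weights OBP against SLG.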
Fast-forward to 2004, and we have equilibrium! All three players earn between 5.7MM$ and 5.8MM$. The 2004 numbers are pretty believable and argue that the market has corrected itself. I can also believe in the 2001 numbers.
So, in a limited sense, we can see how there are some rational results. But, does it seem possible that the sluggers, after 3 years, dropped in salary? That would be an interesting finding, but I don’t know if it’s true.
We know there are big problems, like the ones I’ve noted already (negative value for the infielders in 2004). What if we now change it to 400, 500, and 600 PA? Here’s what you get:
For the players in 2001, they all come out with a 3.35MM$ salary. In order to do that, I have to give the guy with 500 PA a .415 OBP and .498 SLG and the guy with 600 PA a .333 OBP and .400 SLG. And for the guy with 400 PA? Forget it. He’d have to be Barry Bonds. Does this make any sense? A 1B with a .333/.400 line is practically out of baseball. A guy with a .415/.498 line and 500 PA is a very solid outfielder. How can those two guys both earn the same salary? And then match them to a 400 PA Barry Bonds?
The 2004 data is more reasonable here. The .333/.400 hitter with 600 PA gets his 4.2MM$, the same total as the .381/.457 hitter in 500 PA. And the same as the .429/.515 hitter with 400 PA. I don’t agree with it still, but at least it’s more reasonable.
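To see what trade-off the 2004 coefficients are implying, here is a quick sketch that runs the three equal-salary lines through the same 1.8*OBP + SLG metric from earlier. Using that metric as the yardstick is my choice for illustration; the regression itself does not use it.

```python
# (PA, OBP, SLG) for the three hitters who all come out at 4.2MM$ in 2004.
equal_salary_lines = [
    (600, 0.333, 0.400),
    (500, 0.381, 0.457),
    (400, 0.429, 0.515),
]


def run_value_proxy(obp, slg):
    """The 1.8*OBP + SLG metric used in the post."""
    return 1.8 * obp + slg


# Use the 600 PA hitter as the baseline for comparison.
base = run_value_proxy(equal_salary_lines[0][1], equal_salary_lines[0][2])

rows = []
for pa, obp, slg in equal_salary_lines:
    proxy = run_value_proxy(obp, slg)
    rows.append((pa, proxy, proxy / base))
    print(f"{pa} PA  1.8*OBP+SLG {proxy:.3f}  relative to 600 PA hitter {proxy / base:.3f}")
```

Each 100 PA given up has to be bought back with roughly a 14% jump in the rate metric to hold salary constant, which is the trade-off I find hard to agree with.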
In short, I don’t see that their regression model reasonably models the reality of 2001 or 2004. Sauer/Hakes did identify the important parameters, but the way those parameters are used in the model is not reasonable, nor are some of their coefficients even plausible.