Wednesday, February 23, 2011
Linear wins
Dave brings up the comparison of Pujols against two good players: each side contributes the same amount and is (or will be) paid the same amount. So all the extra things (scarcity, risk, non-linearity) cancel out.
One of the things these discussions always bring out is the 2-for-1 comparison, and so we also have to talk about roster management. Let’s agree that this is an issue, and in the future, make it a 2-for-2. I suggest that next time you want to do a 2-for-1 comparison, make the 4th player someone like Lastings Milledge (or Melky Cabrera, etc.), a player young enough and seemingly with enough potential to be given a chance to play 120 games every year, but who always produces 0.5 WAR a season. So: Pujols+Milledge v. two good players (say Rios/Dunn).
***
Also note that if you put the replacement level too low, you won’t get a linear wins-to-dollars conversion. We talked about this way back when Nate Silver had a very low replacement level. He and I ended up with the SAME dollar values for the same (above-average) players, except he had an exponential growth for wins to dollars, while I had a linear fit. How is that? Because where I’d have 6 WAR, he’d have 7.5 WARP. And where I had 3 WAR, he’d have 4.5 WARP. We both agreed that the better guy should get paid twice as much. And in my case, it was simple enough: 6/3 = 2 times. For him, it was 7.5/4.5 = 1.67. So, he needed to add an exponent to turn that into 2 times. Where we disagreed was on the very low end (the guys I had below 0 WAR but that he still had as a positive WARP). Anyway, all that is moot now because BPro has increased their replacement level to around the level we’ve espoused. And as a result, they’ve gone to a linear model as well.
Still, the critical point is that where you set the baseline determines whether you get a linear fit or not.
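To put a number on that exponent, here is a minimal sketch. It assumes pay scales as a simple power of wins above replacement (pay ratio = wins ratio raised to some exponent); that functional form and the helper name `exponent_for_ratio` are illustrative assumptions, not the actual model anyone used.

```python
import math

def exponent_for_ratio(high, low, target_ratio):
    """Exponent k such that (high / low) ** k == target_ratio.

    Hypothetical helper: assumes pay is proportional to wins ** k.
    """
    return math.log(target_ratio) / math.log(high / low)

# Higher replacement level: 6 WAR vs 3 WAR, better player paid 2x.
k_linear = exponent_for_ratio(6.0, 3.0, 2.0)

# Lower replacement level: 7.5 WARP vs 4.5 WARP for the same two players.
k_low_baseline = exponent_for_ratio(7.5, 4.5, 2.0)

print(round(k_linear, 2))        # 1.0
print(round(k_low_baseline, 2))  # 1.36
```

Under that assumption, the 6-versus-3 pair needs no exponent at all, while the 7.5-versus-4.5 pair needs an exponent of roughly 1.36 to reach the same 2-to-1 pay ratio.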
Let’s presume that’s still not clear, and I’ll give you a better example. Think of win percentage for a pitcher (or better, the win% implied by a pitcher’s RA9).
Let’s say I ask you this question: how much do you pay for these three pitchers?
win% IP
0.600 205
0.500 150
0.380 90
The first pitcher is your standard #1 guy, and you’re giving him 20MM$. The second pitcher is your standard #3 pitcher, and you’re giving him 8MM$. The third pitcher is your #6 pitcher, and you are giving him the league minimum (effectively 0$).
So, you look at that and think “well, obviously wins are not linear”. The .600 pitcher only gets 1.2 times as many wins per game, and he’s only pitching 205/150 (1.37 times) as many innings as the .500 pitcher. Their W/L records look like this:
win% IP W L
0.600 205 13.7 9.1
0.500 150 8.3 8.3
0.380 90 3.8 6.2
So, on that view, I’m only going to pay the first pitcher 13.7/8.3, or about 1.64 times as much before rounding the wins (which you can also get from 1.2 x 1.37).
Ah, but we are not saying that ALL wins are linear. We are saying the MARGINAL wins at the level we are interested in are linear. More accurately, all the wins below a certain level are worth ZERO dollars, and then each win above that level is worth 4MM$.
Let’s remove .380 wins per game from each pitcher. So, we now have this:
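The W/L table and the 1.64 ratio can be reproduced with a short sketch. It assumes one "game" equals 9 innings, as the IP/9 step later in the post does; the function name `win_loss` is made up for illustration.

```python
def win_loss(win_pct, innings):
    """Convert a win% and innings pitched into a W/L record.

    Assumes 9 innings per game (hypothetical helper for illustration).
    """
    games = innings / 9.0
    wins = win_pct * games
    return wins, games - wins

pitchers = [(0.600, 205), (0.500, 150), (0.380, 90)]
records = [win_loss(p, ip) for p, ip in pitchers]

for (p, ip), (w, l) in zip(pitchers, records):
    print(f"{p:.3f} {ip} {w:.1f} {l:.1f}")

# Ratio of total wins, first pitcher to second pitcher (unrounded).
ratio = records[0][0] / records[1][0]
print(round(ratio, 2))  # 1.64
```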
win% IP
0.220 205
0.120 150
0.000 90
Our first pitcher is now +.220 wins per game, the second one is +.120 wins per game, and the third one is +zero wins per game. All the wins that we took out (the .380 wins per game) are worth zero dollars.
Now, let’s do the same thing and multiply the marginal win% by IP/9 to give us wins above the baseline. We get this:
win% IP WAR
0.220 205 5.0
0.120 150 2.0
0.000 90 0.0
And if we multiply each win by 4MM$? You get 20MM$, 8MM$, and 0MM$, respectively. And that’s exactly what you (through me) said you would pay.
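Here is the same arithmetic as a minimal sketch. The .380 baseline and the 4MM$ per win come from the example above; the names are illustrative, and wins are rounded to one decimal before pricing, as in the table (an assumption about how the table was built).

```python
BASELINE_WIN_PCT = 0.380  # win% at which a win is worth zero dollars
MM_PER_WIN = 4.0          # millions of dollars per win above the baseline

def wins_above_baseline(win_pct, innings, baseline=BASELINE_WIN_PCT):
    """Marginal win% times games (IP / 9), rounded to one decimal."""
    return round((win_pct - baseline) * innings / 9.0, 1)

pitchers = [(0.600, 205), (0.500, 150), (0.380, 90)]
war = [wins_above_baseline(p, ip) for p, ip in pitchers]
salaries_mm = [w * MM_PER_WIN for w in war]

print(war)          # [5.0, 2.0, 0.0]
print(salaries_mm)  # [20.0, 8.0, 0.0]
```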
Basically, rather than having two lines as we do (0$ per win up until a win% of .380, and 4MM$ per win for each win past the .380), a typical statistician would try to have ONE line that best-fits through all the data, and he’d have some exponential or quadratic equation to try to describe the data.
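One way to see why a single curve through all the wins ends up needing curvature: under the two-line model, dollars per TOTAL win is not constant, even though dollars per MARGINAL win is. The sketch below is only an illustration with made-up names, using the three pitchers from the example and unrounded wins.

```python
BASELINE = 0.380
MM_PER_MARGINAL_WIN = 4.0

def value_mm(win_pct, innings):
    """Two-line model: 0$ per win up to the baseline, 4MM$ per win past it."""
    marginal_wins = max(0.0, win_pct - BASELINE) * innings / 9.0
    return marginal_wins * MM_PER_MARGINAL_WIN

pitchers = [(0.600, 205), (0.500, 150), (0.380, 90)]

# Dollars (in MM$) divided by ALL of a pitcher's wins, not just the marginal ones.
per_total_win = []
for win_pct, innings in pitchers:
    total_wins = win_pct * innings / 9.0
    per_total_win.append(value_mm(win_pct, innings) / total_wins)

print([round(x, 2) for x in per_total_win])  # [1.47, 0.96, 0.0]
```

The price per total win climbs as the pitcher gets better, which is the pattern a one-line fit would have to bend to follow.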
That’s why, when I say the wins-to-dollars conversion is linear, it’s only linear above a certain point. And I set that point at around .380 win%.

