Friday, December 17, 2010
Fantasy valuation
David asks, in part:
How do you choose the player pool for averages and standard deviations? Do you use last year’s stats? Do you use projected stats? Do you use iterations? Do you use empirical data from similar fantasy leagues?
What you cannot do is just use the projected stats on their own to figure out the standard deviation. Imagine, for example, that every pitcher is forecasted for between 8 and 12 wins. That would set one standard deviation to be pretty low, say 0.5 wins, and so the top guy will be, say, 3 or 4 SD from the mean. But he’s only 2 wins above average. And that guy forecasted for 12 wins will actually win, say, between 6 and 18 games. Now, imagine all base stealers are forecasted for between 8 and 12 steals. And, if we presume stealing is far easier to forecast (for this illustration… if you can’t get past that, then call it quatlus), then the guy with 12 steals will also be 3 or 4 SD from the mean. But in reality, he will end up with between 11 and 13 steals. Same z-score in both cases, yet the steals edge is close to a sure thing and the wins edge is mostly noise.
So, that’s why you can’t use the standard deviation of the forecasted stats. You need to include the standard deviation from the uncertainty of the forecast and the random variation in the stat.
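To put rough numbers on the wins-versus-steals illustration, here is a minimal sketch in Python. It assumes the forecast error is independent of the forecast itself, so the variances add. It also reads the "6 to 18 wins" and "11 to 13 steals" ranges as roughly plus or minus 2 SD, which gives an outcome SD of 3 wins and 0.5 steals. Those two figures, and the function names, are my assumptions for illustration, not measured values.

```python
import math


def observed_sd(forecast_sd, outcome_sd):
    """SD of actual results, assuming forecast spread and outcome noise are independent."""
    return math.sqrt(forecast_sd ** 2 + outcome_sd ** 2)


def z_scores(edge, forecast_sd, outcome_sd):
    """Return (naive z, adjusted z) for a player `edge` units above the mean forecast."""
    naive = edge / forecast_sd
    adjusted = edge / observed_sd(forecast_sd, outcome_sd)
    return naive, adjusted


# Both players are forecasted 2 units above average, with a forecast SD of 0.5.
# ASSUMPTION: outcome SD of 3.0 for wins (range 6 to 18 read as +/- 2 SD).
wins = z_scores(edge=2.0, forecast_sd=0.5, outcome_sd=3.0)
# ASSUMPTION: outcome SD of 0.5 for steals (range 11 to 13 read as +/- 2 SD).
steals = z_scores(edge=2.0, forecast_sd=0.5, outcome_sd=0.5)

print("wins:   naive z = %.2f, adjusted z = %.2f" % wins)
print("steals: naive z = %.2f, adjusted z = %.2f" % steals)
```

Under those assumptions both players look like 4 SD guys on the forecasts alone, but once the outcome noise is included the pitcher drops to well under 1 SD while the base stealer stays close to 3.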
The easiest thing to do is just look at the empirical data from prior seasons, and take the standard deviation of those observed data. Then calculate your z-scores. It’s going to be pretty stable each year. For example, I pretty much stick with a 3/3/1/1 model for RBI/R/HR/SB in terms of weighting. Things change every year of course, and you can feel free to create a model that uses the forecasted data to determine the expected standard deviation. That would be fun to do. Until then, take the easy way out, day tripper.
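The easy way out might look something like the sketch below: take each category's mean and standard deviation from last season's observed totals, then score this year's projections against them. All of the numbers, player names, and function names here are made up for illustration, and the choice of which players go into the prior-season pool is left to you.

```python
from statistics import mean, stdev

# HYPOTHETICAL prior-season totals for the player pool (not real data).
prior_season = {
    "HR": [35, 28, 22, 18, 15, 12, 9, 5],
    "SB": [40, 25, 15, 10, 8, 5, 3, 1],
}

# HYPOTHETICAL projections for the coming season.
projections = {
    "Player A": {"HR": 30, "SB": 5},
    "Player B": {"HR": 12, "SB": 30},
}


def category_params(observed):
    """Mean and sample SD of each category, taken from observed (not projected) data."""
    return {cat: (mean(vals), stdev(vals)) for cat, vals in observed.items()}


def total_z(projection, params):
    """Sum of per-category z-scores for one player's projection."""
    return sum((projection[cat] - mu) / sd for cat, (mu, sd) in params.items())


params = category_params(prior_season)
values = {name: total_z(proj, params) for name, proj in projections.items()}

# Print players from most to least valuable.
for name, z in sorted(values.items(), key=lambda item: item[1], reverse=True):
    print("%s: %.2f" % (name, z))
```

The point of the design is that the projections only supply the numerator. The spread in the denominator comes from what actually happened, so it already contains both the forecast uncertainty and the random variation.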