Sunday, February 06, 2011
Testing the 2007-2010 Forecasting Systems for Batting, part 1
Just a heads up. Brian compiled the data for Chone, Marcel, Oliver, Pecota, and Zips for 2007-2010. He sent it to me so I could analyze it and publish my findings. I found some gaps in the data he provided, so I have to wait for a resend. I will publish the dataset I end up using.
In order to test, this is what I’m going to do. If you have a suggestion or recommendation, please post it here. (I haven’t done this yet, so I’m as interested as you are in seeing the results.)
1. Figure out the population mean for each of the 5 forecasters each season, by weighting their forecasted wOBA by the actual playing time (for common players).
Reasoning: I need a common baseline, and the best way to get one is to simply use the same players for all 5 systems. It doesn’t really matter whether this matches the league mean each of the 5 systems actually used. I just need to baseline the common players so that all 5 systems are speaking the same language for those players.
2. Compare each player’s forecast to that forecaster’s population mean. (And compare the actual result to its population mean.)
Reasoning: Once I have a standard baseline, everyone being compared against that standard baseline will be comparable across forecasting systems.
3. Square the difference between the adjusted forecast and the actual result. Multiply by actual PA.
Reasoning: We don’t want to weight the guy with 1 PA as much as the guy with 600 PA. We also don’t want to discard the guy with 75 PA altogether. So, you weight by playing time.
4. That’s it.
5. I’m also going to do “head-to-head”, forecaster v forecaster. Whichever forecaster comes closest to a player’s actual wOBA gets the win. A win is worth that player’s number of PA. This way, winning on 600 PA is worth more games than winning on 1 PA.
6. That’s it.
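Here is a minimal sketch of how the steps above could be coded up, just to make the arithmetic concrete. It is not the actual analysis. The row layout, the system names "A" and "B", and the numbers are all made up. Three things are my assumptions rather than anything stated above: the actual result is also measured against its own PA-weighted mean before differencing, head-to-head uses those same adjusted numbers, and a tie awards nothing. It also assumes every group has some PA.

```python
from collections import defaultdict

# Made-up rows for illustration only:
# (system, season, player, forecast_wOBA, actual_wOBA, actual_PA)
ROWS = [
    ("A", 2010, "p1", 0.350, 0.360, 600),
    ("A", 2010, "p2", 0.310, 0.300, 200),
    ("A", 2010, "p3", 0.330, 0.320, 100),  # only system A forecast p3
    ("B", 2010, "p1", 0.340, 0.360, 600),
    ("B", 2010, "p2", 0.320, 0.300, 200),
]

def common_only(rows):
    """Keep only the player-seasons that every system forecast."""
    systems = {r[0] for r in rows}
    seen = defaultdict(set)
    for system, season, player, *_ in rows:
        seen[(season, player)].add(system)
    return [r for r in rows if seen[(r[1], r[2])] == systems]

def adjust(rows):
    """Steps 1-2: subtract the PA-weighted means from forecast and actual."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row)
    out = []
    for (system, season), grp in groups.items():
        total_pa = sum(r[5] for r in grp)
        fc_mean = sum(r[3] * r[5] for r in grp) / total_pa
        act_mean = sum(r[4] * r[5] for r in grp) / total_pa
        for _, _, player, fc, act, pa in grp:
            out.append((system, season, player, fc - fc_mean, act - act_mean, pa))
    return out

def weighted_squared_error(adjusted):
    """Step 3: per system, sum of PA * (adjusted forecast - adjusted actual)^2."""
    totals = defaultdict(float)
    for system, _, _, fc, act, pa in adjusted:
        totals[system] += pa * (fc - act) ** 2
    return dict(totals)

def head_to_head(adjusted, sys_a, sys_b):
    """Step 5: the closer system wins the player's PA (ties award nothing)."""
    miss = {(s, yr, p): (abs(fc - act), pa) for s, yr, p, fc, act, pa in adjusted}
    wins = {sys_a: 0, sys_b: 0}
    for (s, yr, p), (miss_a, pa) in miss.items():
        if s != sys_a:
            continue
        miss_b = miss[(sys_b, yr, p)][0]
        if miss_a < miss_b:
            wins[sys_a] += pa
        elif miss_b < miss_a:
            wins[sys_b] += pa
    return wins

adjusted = adjust(common_only(ROWS))
errors = weighted_squared_error(adjusted)
wins = head_to_head(adjusted, "A", "B")
print(errors, wins)
```

On these made-up rows, p3 drops out because only one system forecast him, system A totals about 0.06 against about 0.24 for system B (lower is better), and A takes all 800 PA worth of head-to-head wins.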
***
Anything else you want me to do?

