Thursday, January 05, 2012
Hard to beat Marcel’s pitching forecasts
Matt shows this:
Estimator (N = 1,576 pitchers): RMSE of Statistic with Next Year’s ERA (2006-2011)
SIERA 1.126
Marcel 1.132
PECOTA 1.141
ZiPS 1.143
xFIP 1.148
FIP 1.212
tERA 1.236
ERA 1.387
First of all, there’s no need to go to three decimal places. We show ERA to two decimal places, so why bother showing the RMSE to three? As far as I’m concerned, there’s virtually no difference among the top five.
Secondly, I can’t tell if the future ERA is park-adjusted or not. It MUST be unadjusted. NO ONE is trying to estimate a pitcher’s park-neutral ERA in terms of testing. The only test is how he actually did. So, we don’t adjust the actuals for park, strength of schedule, or innings per start.
(MGL for example only cares about park-neutral. And that’s fine. But then, we can’t test his results. SIERA is park-neutral I think, but FIP is not. All the forecasting systems in fact are park-specific. You can’t turn everything to park-neutral first.)
You COULD make the case that we should throw out any unexpected starter-relief switches, for reasons we’ve learned about over the years. But, we need to be careful here, as we may end up with a selection bias.
In the comments, Matt notes that it was park-adjusted. Again, I completely disagree here. The test is against actual performance, not adjusted performance. He notes it didn’t make a difference. Well, given that the test is slanted toward SIERA, and SIERA is Matt’s baby, then, I’d REALLY like to see the results the right way.
Now, Matt may decide to introduce a park-specific SIERA, so that we can all make the apples-to-apples comparison. Until then, SIERA will simply have to compete with one hand tied behind its back.
Thirdly, for the RMSE test, you MUST calibrate it so the league average for the forecast equals the league average for the actuals. It should be clear that if you treat the forecasting system as its own universe, it’s irrelevant if the expected ERA was set to 3.9 or 4.3 and the actual ended up at 3.7 or 4.8 or whatever. I’m not sure if Matt handled this.
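To make the calibration step concrete, here is a minimal sketch. It assumes an additive, unweighted shift (add the same constant to every forecast so the two league averages match). That is one reasonable reading, not necessarily what Matt did; a multiplicative or innings-weighted adjustment would be just as defensible. The function names and the three-pitcher numbers are made up for illustration.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between paired forecasts and actuals."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))

def calibrate(forecast, actual):
    """Shift every forecast by a constant so its mean equals the actual mean.

    Assumption: additive, unweighted shift (hypothetical choice).
    """
    shift = sum(actual) / len(actual) - sum(forecast) / len(forecast)
    return [f + shift for f in forecast]

def calibrated_rmse(forecast, actual):
    """RMSE after forcing the forecast's league average to match the actuals."""
    return rmse(calibrate(forecast, actual), actual)

# Illustrative numbers only: every forecast is low by the same 0.40 runs.
forecast_era = [3.5, 4.0, 4.5]
actual_era = [3.9, 4.4, 4.9]

raw = rmse(forecast_era, actual_era)                    # penalized for the league-wide miss
calibrated = calibrated_rmse(forecast_era, actual_era)  # the league-wide miss is removed
```

In this toy case the raw RMSE is entirely due to the run environment moving, which is the part that should not count against the forecasting system.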
As we know, RMSE, not correlation, is the correct test.
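One way to see the difference, as a sketch with made-up numbers: Pearson correlation is unchanged when you stretch or shift the forecasts, so a system that gets the ordering right but the spread wrong still scores a perfect correlation, while RMSE charges it for every run it missed by. The helper names below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmse(forecast, actual):
    """Root mean squared error between paired forecasts and actuals."""
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative numbers only: same ordering and same mean, twice the spread.
actual_era = [3.0, 4.0, 5.0]
stretched_forecast = [2.0, 4.0, 6.0]

corr = pearson(stretched_forecast, actual_era)  # correlation cannot see the extra spread
error = rmse(stretched_forecast, actual_era)    # RMSE can
```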
Having said all that: great job to Matt!