THE BOOK--Playing The Percentages In Baseball


Thursday, January 05, 2012

Hard to beat Marcel’s pitching forecasts

Matt shows this:

Estimator (N = 1,576 pitchers)    RMSE of statistic with next year’s ERA (2006-2011)
SIERA                             1.126
Marcel                            1.132
PECOTA                            1.141
ZiPS                              1.143
xFIP                              1.148

FIP                               1.212
tERA                              1.236
ERA                               1.387

First of all, no need to go to three decimal places.  We show ERA to two decimal places, so why bother showing the error to three?  As far as I’m concerned, there’s virtually no difference among the top five.

Secondly, I can’t tell if the future ERA is park-adjusted or not.  It MUST be unadjusted.  NO ONE is trying to estimate a pitcher’s park-neutral ERA in terms of testing.  The only test is how he actually did.  So, we don’t adjust for park, strength of schedule, or innings per start.

(MGL for example only cares about park-neutral.  And that’s fine.  But then, we can’t test his results.  SIERA is park-neutral I think, but FIP is not.  All the forecasting systems in fact are park-specific.  You can’t turn everything to park-neutral first.)

You COULD make the case that we should throw out any unexpected starter-relief switches, for reasons we’ve learned about over the years.  But, we need to be careful here, as we may end up with a selection bias.

In the comments, Matt notes that it was park-adjusted.  Again, I completely disagree here.  The test is against actual performance, not adjusted performance.  He notes it didn’t make a difference.  Well, given that a park-adjusted test is slanted toward SIERA, and SIERA is Matt’s baby, I’d REALLY like to see the results the right way.

Now, Matt may decide to introduce a park-specific SIERA, so that we can all make the apples-to-apples comparison.  Until then, SIERA will simply have to have one hand tied behind its back.

Thirdly, for the RMSE test, you MUST calibrate it so the league average for the forecast equals the league average for the actuals.  It should be clear that if you treat the forecasting system as its own universe, it’s irrelevant if the expected league ERA was set to 3.9 or 4.3 and the actual ended up at 3.7 or 4.8 or whatever.  A system shouldn’t be penalized or rewarded for its guess at the league run environment.  I’m not sure if Matt handled this.
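
To make the calibration step concrete, here is a minimal Python sketch. It is an illustration under my own assumptions, not Matt's method: the function names are hypothetical, the ERAs are made up, the "league average" is taken as the unweighted mean over the pitchers in the sample (the post doesn't say whether to weight by innings), and the calibration is a simple additive shift.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between paired forecasts and actuals."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((f - a) ** 2 for f, a in zip(forecast, actual))
    return math.sqrt(squared / len(forecast))

def calibrate(forecast, actual):
    """Shift every forecast by one constant so the forecast mean
    equals the actual mean (assumption: unweighted, additive)."""
    shift = sum(actual) / len(actual) - sum(forecast) / len(forecast)
    return [f + shift for f in forecast]

# Made-up ERAs for four pitchers, for illustration only.
forecast = [3.50, 4.00, 4.50, 5.00]   # system assumed a 4.25 league ERA
actual   = [3.20, 3.90, 4.00, 4.90]   # league actually came in at 4.00

raw_rmse = rmse(forecast, actual)
calibrated_rmse = rmse(calibrate(forecast, actual), actual)

print(f"raw RMSE:        {raw_rmse:.2f}")
print(f"calibrated RMSE: {calibrated_rmse:.2f}")
```

In this toy sample the calibrated RMSE comes out lower than the raw one, because the part of the error that came purely from the wrong league-wide level has been removed. What's left is how well the system placed each pitcher relative to the league.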

As we know, RMSE, not correlation, is the correct test.
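
The post doesn't spell out the reasoning here, but one commonly cited property is that correlation ignores the scale of a forecast while RMSE does not. The sketch below, with made-up numbers and hypothetical function names, takes a forecast and triples its spread around the league average: the correlation with the actuals is unchanged, but the RMSE gets much worse.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between paired forecasts and actuals."""
    squared = sum((f - a) ** 2 for f, a in zip(forecast, actual))
    return math.sqrt(squared / len(forecast))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up ERAs; both forecasts and the actuals average 4.00,
# so the league-average calibration is already satisfied.
actual   = [3.20, 3.90, 4.00, 4.90]
sensible = [3.50, 3.90, 4.10, 4.50]

# Same ordering of pitchers, but the spread around 4.00 is tripled.
stretched = [4.00 + 3 * (f - 4.00) for f in sensible]

for name, fc in (("sensible", sensible), ("stretched", stretched)):
    print(f"{name:9s}  r = {pearson(fc, actual):.3f}  RMSE = {rmse(fc, actual):.3f}")
```

A correlation test would call these two forecasts equally good. An RMSE test would not, since the stretched forecast misses the actual ERAs by a wider margin.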

Having said all that: great job to Matt!


(7) Comments • 2012/01/06 • SabermetricsForecasting