Tuesday, October 06, 2009
Incorporating guts into a forecast
David says:
My expectation was that a computer would be much better at assimilating a lot of statistical information into one final prediction than the human brain, and while I still do believe that to be the case, it does appear that we humans can see something computers do not.
But then he goes on to say:
Looking at the hitters I thought would beat their projections, I saw a lot of special skills, most of them young, but all very talented.... The hitters I thought our projections overrated were mostly some combination of old, fat and strikeout-prone. ... The only thing that really jumps out at me is that I liked a lot of high-strikeout guys, while a lot of the pitchers I didn’t like are below-average at whiffing hitters.
The “computer” that David references is the algorithm he designed to create the forecasts; the computer simply sped up the process. That’s the only thing the computer did: speed. The algorithm was designed by a human. Furthermore, that human chose to leave out information that would have helped his algorithm. So, he “knew” (or suspected anyway) that high-K pitchers have an extra oomph (something more real, or a better ERA to regress toward). But he didn’t put that in his algorithm. This is the kind of thing that PECOTA would implicitly account for. For example, it would look at the high-K pitchers, find the comparable pitchers, and use that as an extra regression point.
Anyway, all David has to do is create an additional parameter for his algorithm. He can set a “1” for anyone his baseball guts say will improve, and a “-1” for anyone they say will decline. That gives us an extra parameter for the regression equation. If his baseball guts are worth 50 points of OPS and 0.50 ERA, then he can include that in his equation. Basically, if he has a reason to suspect that a player’s 2008 or 2007 stats are not representative of that player, he can fudge that data by introducing a Guts parameter.
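Here is a rough sketch of what a Guts parameter could look like in code. It is not David's actual algorithm. The coding (+1 for "will beat the projection", -1 for "will fall short", 0 for no opinion), the names `Projection`, `fit_guts_weight` and `adjust_ops`, and the numbers are all assumptions made up for illustration. The weight is estimated by a one-variable least-squares fit of past misses (actual minus projected OPS) on the guts flag.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    name: str            # hypothetical player label
    baseline_ops: float  # what the existing algorithm projects
    guts: int            # assumed coding: +1 better, -1 worse, 0 no opinion


def fit_guts_weight(guts, residuals):
    """Least-squares weight for one predictor with no intercept.

    residuals are (actual OPS - projected OPS) from past seasons.
    """
    denom = sum(g * g for g in guts)
    if denom == 0:
        return 0.0  # no gut calls were made, so there is nothing to fit
    return sum(g * r for g, r in zip(guts, residuals)) / denom


def adjust_ops(p, weight):
    """Shift the baseline projection by the fitted weight times the flag."""
    return p.baseline_ops + weight * p.guts


# Made-up history: three gut calls and one player with no opinion.
past_guts = [1, -1, 0, 1]
past_residuals = [0.050, -0.050, 0.010, 0.050]
weight = fit_guts_weight(past_guts, past_residuals)

player = Projection("Hypothetical Slugger", 0.800, -1)
adjusted = adjust_ops(player, weight)
```

For ERA the sign flips, since a pitcher expected to improve should get a lower number. If the fitted weight comes out near zero, the guts were not adding anything the algorithm did not already have.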
I think MGL has said that he manually makes park factor changes, as he thinks appropriate. It’s the same deal here.
Kudos to David for showing that he’s got baseball guts. Now he just needs to include that in his algorithm, so that next year he can’t beat his own algorithm.


I think it’s useful to distinguish between finding systemic improvements to his model (or uncovering systemic flaws in his model—same thing), that apply across players with similar statistics, vs. situations where his “gut” or direct observation of a player tells him something that the statistics cannot. Looking at his successful calls, you might want to explore theories like:
1) very young players given lots of playing time (e.g. Upton) should be regressed to a higher mean;
2) players in their mid-30s need to be projected lower (aging curve is off);
3) for dramatic changes in performance after age 32, give a lot of weight to declines (Ortiz) but less weight to improvements (Chipper);
4) give more weight to performance more than 3 years prior (Chipper, Ichiro);
5) lower projections for players with very low BA relative to OPS (Cust);
6) include weight or body mass as a predictor (Ortiz, Howard);
7) whatever else you do, don’t forecast Ichiro to be 70 points below his career OPS, and don’t project Danny Haren to have a 4.22 ERA (seriously, how hard was it to call these players right?).
I’m not saying any of these are right, and maybe none are. But testing them might allow for improvements in the model.
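One simple way to test any of these theories is to apply the rule to past projections and see whether the error shrinks on seasons that were not used to build the rule. The sketch below compares root-mean-square error before and after an adjustment. The function names and the sample numbers are invented for illustration and do not come from any real projection set.

```python
import math


def rmse(projected, actual):
    """Root-mean-square error between projections and actual results."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


def rule_helps(baseline, adjusted, actual):
    """True if the adjusted projections miss by less than the baseline ones."""
    return rmse(adjusted, actual) < rmse(baseline, actual)


# Made-up held-out OPS figures for two players.
baseline = [0.800, 0.750]   # original projections
adjusted = [0.820, 0.740]   # same players after applying a candidate rule
actual = [0.830, 0.735]     # what they actually did

improved = rule_helps(baseline, adjusted, actual)
```

A rule that only wins on the same players who inspired it has not shown much, so the comparison is worth running on a different set of seasons.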
Then, there may still be things that can be seen by an observer but are not (yet) captured by the data. But I’d put that in a separate category.