THE BOOK--Playing The Percentages In Baseball

Monday, June 08, 2009

David Gassko versus THT Forecasting system

By

In an article written before the season started, David presented 29 hitters and pitchers who he thought would over- or under-perform their THT projections.  This is interesting, since I think that David plays a big part in these forecasts.  As with most forecasting systems, I think that a computer essentially spits out each player’s projection, and I don’t think that a human being, including David, tweaks them before they are released.  I could be wrong, and David can correct me if I am.

Anyway, I either told David or I made a mental note that I was going to check his “intuition” at season’s end.  I am always skeptical of these “I can beat a good forecast system just by looking at the forecasts” claims.  They are similar to the “players who I think will break out or fizzle” articles, although it depends on who is authoring these articles or proclamations.  David is a very smart guy and a very good sabermetrician - he should be working for a team someday, if he wants to of course.  He has already done some work for some teams.

On the other hand, maybe it is not so hard for a good analyst, or even a good scout or insider, to take a list of projections and pick out the ones that are not very good.  Being able to do that is very different from coming up with good projections of your own.  In other words, a scout might easily be able to identify 10 or 20 bad projections, but he would likely get killed by a good projection system if he had to project every player.  I wish we had done some kind of survey where we had analysts look at a good projection system like THT, Pecota, Chone, Oliver, ZIPS, etc., and try to identify the 10 or 20 projections they think are “wrong.”  Because most projection systems are mechanical, that may not be all that hard to do.  Of course, I am saying that after I have already checked on David’s picks.  It would also probably depend on whether the projection system just spits out the numbers from a computer, as I think THT (and most of the others) does, or whether someone goes through and tweaks them before they are released.  For example, if David had tweaked the THT projections, presumably he would not have any players left to choose as being over- or under-rated.

Anyway, here is the good news and the bad news for David:

The good news is that out of the 14 hitters on his list (I did not include Alex Gordon, as he only has a few PA), he was correct on 10 of them and wrong on 4 of them.  For each player, he said whether they would likely over- or under-perform their THT projection.

For pitchers, he went 8-3.

So overall, he was 18-7, which is quite impressive.

Or is it?

He says in the article that he didn’t look at any other projection system before he chose these players.  I have no reason not to believe him.

On the other hand, you have to wonder whether David is so smart or perhaps the THT system is so bad, or at least obviously bad on some of the players, that it was easy for David to come up with 25 players that it was likely to get wrong.  Keep in mind that I think that the THT projections are very good, and they have probably fared well in the various “projection evaluations.”  But, as I said, for various reasons, it is probably fairly easy and commonplace for even a good projection system to get some players obviously wrong, if that projection system is “automated.”  If the projection system includes tweaks by knowledgeable human beings, then by definition, it is not so easy to get any players obviously wrong (if it is obvious, they would be corrected by those human beings, right?).  Again, I think that THT is an “automated” projection system with little if any “human tweaking.”

Anyway, to see if David was really smart or the THT projections on his 25 players were just “bad,” I compared the THT projections to my projections for those 25 players.  I have an independent projection system which is basically a “Marcel” with a lot of normalization (park, age, and opponent adjustments).  There is nothing special about my projection system whatsoever.  As I said, it is just a Marcel which is more finely tuned than a basic Marcel.  In fact, the strength of my projections is my park adjustments, but for most projections you don’t need any park adjustments unless a player changes teams, which only happens 10-15% of the time or so (just a guess).  In fact, in David’s list, almost none of the players changed teams from 2008 to 2009, which makes it even easier to project them.

So here is what I did:

If David thought that a player would outperform or under-perform his THT projection and my projection pointed the same way (that is, my projection was also better or also worse than THT’s), I called that a “cheat” for David.  If his subjective evaluation and my projection disagreed with one another (for example, he thought a player would do better than THT’s projection and I thought he would do worse), then I called that a “non-cheat.”  To be clear, I don’t think that David cheated in any way by looking at anyone else’s projection, and he didn’t have access to my projections anyway, as they are not publicly available (for no particular reason, BTW).
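
The cheat/non-cheat rule above can be sketched in a few lines of Python.  This is only an illustration of the rule as described, not the code actually used: the function name, its parameters, and the `higher_is_better` flag (True for something like a hitter’s rate stat, False for something like a pitcher’s ERA) are all assumptions, as is the “no-call” label for the case where the two systems project exactly the same number.

```python
def classify_pick(tht, other, analyst_call, higher_is_better=True):
    """Label an analyst's pick as a 'cheat' or 'non-cheat'.

    tht              -- THT's projected value for the player (hypothetical metric)
    other            -- a second system's projection on the same scale
    analyst_call     -- 'over' or 'under': how the analyst expects the player
                        to perform relative to the THT projection
    higher_is_better -- True for hitter-style stats, False for ERA-style stats
    """
    if analyst_call not in ("over", "under"):
        raise ValueError("analyst_call must be 'over' or 'under'")
    if other == tht:
        # The two systems agree exactly, so the second system makes no call.
        return "no-call"
    # Does the second system expect a better performance than THT does?
    other_is_better = (other > tht) == higher_is_better
    system_call = "over" if other_is_better else "under"
    return "cheat" if system_call == analyst_call else "non-cheat"


# Made-up numbers, purely for illustration (not real projections):
print(classify_pick(0.350, 0.362, "over"))                        # cheat
print(classify_pick(4.00, 4.30, "over", higher_is_better=False))  # non-cheat
```

The first call is a “cheat” because the second system is also higher than THT on a higher-is-better stat; the second is a “non-cheat” because a higher ERA means the second system expects the pitcher to do worse, not better.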

So here is the tally:

Hitters

Cheats: 10
Non-cheats: 4

Of the 4 players on whom David and I differed, as compared to the THT projections of course, David was right on 3 of them (Howard, Ichiro, and Cano), and I was right on one of them, Miggie Cabrera.

Pitchers

Cheats: 9
Non-cheats: 2

Of the non-cheats, David was right on one of them, Greinke, and I was right on one of them, Volquez.
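
For anyone who wants to check the arithmetic, here is a small Python sketch that adds up the two tallies.  The counts come straight from the numbers above; the dictionary layout and variable names are just one assumed way to organize them.

```python
# Counts taken from the hitter and pitcher tallies in this post.
tally = {
    "hitters":  {"cheats": 10, "non_cheats": 4, "david_right": 3, "mine_right": 1},
    "pitchers": {"cheats": 9,  "non_cheats": 2, "david_right": 1, "mine_right": 1},
}

agreed = sum(group["cheats"] for group in tally.values())
disagreed = sum(group["non_cheats"] for group in tally.values())
total = agreed + disagreed
agreement_rate = agreed / total

# Among the picks where the two of us disagreed, who turned out to be right?
david_right = sum(group["david_right"] for group in tally.values())
mine_right = sum(group["mine_right"] for group in tally.values())

print(f"Agreed on {agreed} of {total} players ({agreement_rate:.0%})")
print(f"On the {disagreed} disagreements: David {david_right}, me {mine_right}")
```

That works out to agreement on 19 of the 25 players, with David going 4-2 on the 6 players where we disagreed.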

So while I still think that David is a really smart guy, I am going to conclude that it is probably not that hard to identify bad projections from any system.  Of the 25 players that David identified as “bad” projections from THT, my projections agreed with him on 19.  That suggests that those were truly bad projections on THT’s part and not necessarily great insight on David’s part.

On the other hand, being able to look at THT’s projections alone, without consulting any other projection system, and come up with 25 or so players that THT got “obviously” wrong may be an impressive feat after all.

If someone has some spare time, maybe they can do the same thing with a few other projection systems like Chone or Marcel.  If they mostly agree with David, as mine did, then we can probably take any projection system and assume that if the other projection systems all disagree with one of its player projections, then that projection is likely to be wrong.

Tango has a boatload of projections.  He can probably come up with a list of 20 or so projections for each system that the other systems disagree with.  I would guess that we would find that a majority of those players had “bad” projections in that one system.
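
Here is a rough Python sketch of how that kind of list could be generated.  Everything in it is an assumption made for illustration: the function name, the nested-dictionary layout, the `min_gap` threshold, the system names, and the sample numbers (which are made up, not real projections).  It flags a player only when every other system sits on the same side of the target system’s projection by more than `min_gap`.

```python
def flag_outliers(projections, target, min_gap=0.0):
    """Find players whose projection in `target` is contradicted by all others.

    projections -- {system_name: {player_name: projected_value}}
    target      -- name of the system being checked
    min_gap     -- how far every other system must be from the target value
    Returns {player_name: 'others higher' or 'others lower'}.
    """
    flagged = {}
    for player, value in projections[target].items():
        # Collect the same player's projection from every other system.
        others = [
            values[player]
            for system, values in projections.items()
            if system != target and player in values
        ]
        if not others:
            continue  # nobody else projects this player, so nothing to compare
        if all(o > value + min_gap for o in others):
            flagged[player] = "others higher"
        elif all(o < value - min_gap for o in others):
            flagged[player] = "others lower"
    return flagged


# Hypothetical systems, players, and values, for illustration only.
sample = {
    "SystemA": {"Player 1": 0.300, "Player 2": 0.340},
    "SystemB": {"Player 1": 0.330, "Player 2": 0.335},
    "SystemC": {"Player 1": 0.325, "Player 2": 0.350},
}
print(flag_outliers(sample, "SystemA", min_gap=0.010))
```

For a lower-is-better stat such as ERA the same function works, but “others higher” then means the other systems are more pessimistic rather than more optimistic.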

If I remember, I will revisit David’s picks at the end of the season, as obviously we will have larger sample sizes to work with.

