THE BOOK--Playing The Percentages In Baseball


Tuesday, November 22, 2011

Tango’s Lab: forecasting players who switch teams

By Tangotiger, 04:57 PM

KJOK was nice enough to send me a Marcel file that includes the 2009 forecasts, the 2009 actuals, and if the player played for the same team in 2008 and 2009.

I took his data and calculated the wOBA for each player (his forecast and his actual).  I computed each player's weighted error as the difference between these two figures, multiplied by his actual PA.

I limited the forecasts to only those players with a reliability of at least .50, and who, naturally, played in 2009.  This gave me a total of 382 hitters, with 151,897 PA. 

I then simply split the 382 players by whether they played for the same team in both years or not.  For those who stayed with the same team, the average error was .025.  For those who switched teams, the average error was .027.
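A minimal sketch of that calculation, under some loud assumptions: the post does not give the wOBA weights, so the linear weights below are a commonly cited set and could be swapped for any other; PA is approximated as AB + BB + HBP because those are the only fields listed; and "error" is taken to mean the absolute difference between forecast and actual wOBA. The names `Line`, `weighted_avg_error`, and `error_by_group` are hypothetical.

```python
from dataclasses import dataclass

# Assumed linear weights (not stated in the post); substitute your own.
W_BB, W_HBP, W_1B, W_2B, W_3B, W_HR = 0.72, 0.75, 0.90, 1.24, 1.56, 1.95


@dataclass
class Line:
    """One batting line, forecast or actual."""
    AB: float
    H: float
    D: float   # doubles (2B)
    T: float   # triples (3B)
    HR: float
    BB: float
    HBP: float

    @property
    def pa(self) -> float:
        # Assumption: PA approximated as AB + BB + HBP (no SF/SH available).
        return self.AB + self.BB + self.HBP

    @property
    def woba(self) -> float:
        singles = self.H - self.D - self.T - self.HR
        num = (W_BB * self.BB + W_HBP * self.HBP + W_1B * singles
               + W_2B * self.D + W_3B * self.T + W_HR * self.HR)
        return num / self.pa


def weighted_avg_error(pairs):
    """pairs: list of (forecast Line, actual Line).

    Returns the absolute wOBA error, weighted by actual PA.
    """
    total_err = sum(abs(f.woba - a.woba) * a.pa for f, a in pairs)
    total_pa = sum(a.pa for _, a in pairs)
    return total_err / total_pa


def error_by_group(rows):
    """rows: iterable of (forecast Line, actual Line, same_team bool)."""
    groups = {True: [], False: []}
    for forecast, actual, same_team in rows:
        groups[same_team].append((forecast, actual))
    return {k: weighted_avg_error(v) for k, v in groups.items() if v}
```

Calling `error_by_group(rows)` returns a dict keyed by `True` (same team) and `False` (switched teams), which is the split reported above.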

Since Marcel does not make a park adjustment, that could be a source of error.  Then again, it could be that the change in context simply forces an error.  In order to understand whether park adjustment is the reason or not, we need to compare to a forecasting system that does an explicit park adjustment.  (It may be that the team switchers are simply harder to forecast because there was a reason they switched teams to begin with.)

If someone has forecasts for 2009 that were made with a park adjustment, then please supply me with such a file.  It MUST contain at least the following information:
bdbID, AB, H, 2B, 3B, HR, BB, HBP

If you don’t have at least exactly that, don’t send me anything!

I will then have something to compare against.


#1    Tangotiger      (see all posts) 2011/11/22 (Tue) @ 18:01

I had the Steamer forecasts lying around in usable format.

The problem is that they forecasted a lot fewer players, especially among the team-switchers.

Anyway, looking at just those players they forecasted, and for whom I had a reliability of at least 0.50, we have this:

Average Marcel Error for same team players:
.0252
Steamer:
.0261

And for the team switchers:
Marcel: .0235
Steamer: .0255

So, in this group of players, Marcel did better with the team switchers than with the same-team players.  In either case, it bested Steamer, and it bested Steamer by a wider margin with the team-switchers.


#2          (see all posts) 2011/11/22 (Tue) @ 19:04

Did the new-team players underperform their projections relative to the same-team players?


#3    Bill Waite      (see all posts) 2011/11/22 (Tue) @ 20:13

It seems to me that it would be simpler to look at the average performance of those who switch to a significantly better hitter’s park (where the difference in park factor for runs has to be at least 0.1 or some minimum threshold) compared to the average Marcel forecast, and the same thing for those who switch to a better pitcher’s park.

If one group noticeably overperforms the average forecast as a group, and the other group underperforms the average forecast as a group, then park factors are probably worth something. If not, they’re probably not.
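A rough sketch of the comparison described in this comment, assuming each team switcher comes with a forecast wOBA, an actual wOBA, actual PA, and run park factors for the old and new home parks. The function name, the input layout, and the PA weighting are all assumptions for illustration; only the 0.1 threshold comes from the comment.

```python
def park_switch_bias(rows, threshold=0.1):
    """rows: iterable of (forecast_woba, actual_woba, pa, old_pf, new_pf).

    Returns the PA-weighted mean of (actual - forecast) for players who
    moved to a clearly better hitter's park and for players who moved to
    a clearly better pitcher's park. Smaller moves are ignored.
    """
    sums = {"to_hitter_park": [0.0, 0.0], "to_pitcher_park": [0.0, 0.0]}
    for forecast, actual, pa, old_pf, new_pf in rows:
        change = new_pf - old_pf
        if change >= threshold:
            key = "to_hitter_park"
        elif change <= -threshold:
            key = "to_pitcher_park"
        else:
            continue  # park change too small to count
        sums[key][0] += (actual - forecast) * pa
        sums[key][1] += pa
    return {k: (err / pa if pa else None) for k, (err, pa) in sums.items()}
```

If the first group comes out clearly positive and the second clearly negative, that is the pattern this comment says would show park factors are worth something.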


#4    Tangotiger      (see all posts) 2011/11/22 (Tue) @ 22:37

Bill: ok, so now you’ve got it isolated down to a couple of guys.

I’m not saying that park factors have zero influence.  I’m just saying it’s no big deal, in the grand scheme of things, and even in the smaller scheme of just the team switchers.  Making the adjustment for all players simply because it’s needed for a couple of guys every year runs the risk of over-adjusting all the team-switchers.

I’m reminded of when MGL would do “virtual HR”, which is basically the HR based on flyball distance, superimposed on each park.  And what did he find?  He’d have to regress the virtual HR 50% toward the player’s observed HR.  Why?  Because you can’t just transplant a player’s stats onto a park like that.  A player’s stats are (partly) a result of his response to that particular park.

Aren’t people surprised at how little Matt Holliday’s stats have changed after leaving Coors?  There was a big whoop-di-do about how Galarraga’s stats would take a nose dive after he left Coors.

It’s just presumptive to think a one-size-fits-all park adjustment is somehow necessarily beneficial overall.

Anyway, back to the matter at hand.  We can have all the theoretical discussions we want.  But what we need are empirical results.  The floor is open to researchers to present their results.

I am not 100% or even 80% convinced I am right.  I think I’m 50/50 on the matter of needing park factors for forecasts, be it for team-switchers or same-teamers.

Indeed, even consider the possibility that park factors are done so poorly as to benefit the team-switchers (i.e., better than Marcel), but to the detriment of the same-teamers (i.e., worse than Marcel).  That’s a very real possibility, especially if you use UNREGRESSED PARK FACTORS.


#5    MGL      (see all posts) 2011/11/22 (Tue) @ 22:51

I’m not sure what question you are asking, Tango, and I’m not sure of the point of comparing one system to another. How do we even know that these systems are using good park adjustments (e.g., as you say, using un-regressed 1 or even 3-year factors would be terrible)?

Do we really need to test whether a projection for a player going from SEA to TEX (or vice versa) or ARI to SDN (or the reverse) would be better or worse with or without park adjustments?

Now, the question of how much you gain is another story.  Obviously overall, not much since most players don’t switch teams and even the ones who do likely go from one park to a similar park (since not that many parks are extreme), and only half the games are affected.

So, again, we have the situation where overall there is not going to be a great difference, but for certain categories, albeit a small subset of the overall, there is.

Tango, if you want to do some testing, take the Marcels and compare players who have switched teams using a park adjustment and no park adjustment (the regular Marcels, I guess).  You can do that for any system.  I just don’t see the point of comparing across systems.

Here are regressed run factors for 2009, using prior data:

ARI 1.1
ATL .97
CHN 1.07
CIN 1.03
COL 1.16
FLO .95
HOU 1.03
LAN .97
MIL 1.01
NYN .95
PHI 1.04
PIT 1.01
SDN .84
SFN .97
WAS .93

ALA .99
BAL .94
BOS 1.06
CHA 1.06
CLE .99
DET 1
KCA 1.01
MIN 1
NYA 1.02
SEA .94
TBA 1
TEX 1.09
TOR .99

Tango,


#6    Tangotiger      (see all posts) 2011/11/23 (Wed) @ 07:54

My point is what you said at the beginning: using an unregressed park adjustment may be worse than no park adjustment.

My other point is that even if you use a good park adjustment, the gain will be limited, because the adjustment you make will be tiny for all but a handful of players.

My final point is that if a guy goes from a 1.02 to a 0.98 park, do we REALLY need to adjust that guy at all?  You’d think we do, but, isn’t it also possible that players adjust to their surroundings such that there’s a high rate of players that won’t be affected, while it’s really a smaller set of players that do get affected?  And the net effect is that no adjustment is better than some adjustment.
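To put a number on the 1.02 to 0.98 case, here is a small sketch. It assumes the quoted figures are full home-park run factors that apply to only the home half of a player's games, which is an assumption and not something stated in the thread; `switch_multiplier` and `home_share` are hypothetical names.

```python
def switch_multiplier(old_pf, new_pf, home_share=0.5):
    """Ratio of expected run environment after vs. before a team switch.

    Assumes old_pf and new_pf are full home-park run factors and that
    only home_share of a player's games are played in his home park.
    """
    old_eff = 1 + (old_pf - 1) * home_share
    new_eff = 1 + (new_pf - 1) * home_share
    return new_eff / old_eff


# 1.02 -> 0.98 gives 0.99 / 1.01, i.e. roughly a 2% drop in run environment.
small_move = switch_multiplier(1.02, 0.98)

# A larger move, SEA (.94) -> TEX (1.09) from the list above: 1.045 / 0.97.
big_move = switch_multiplier(0.94, 1.09)
```

Under these assumptions the small move shifts the run environment by about 2%, while the SEA to TEX move shifts it by roughly 8%, which is the "handful of players" distinction being drawn here.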

Since I won’t be improving Marcel myself, the next best test is to see the ACTUAL application of park factors in action (which may be the best ones, or which may be crappy ones).

I’m not asking whether Marcel can be improved upon with a park adjustment, but rather whether Marcel is just as good as other systems that go out of their way to park adjust.

