Friday, December 21, 2007
Dan Fox’s Manny
I don’t believe there’s an issue with the sample being biased, though (for example, Manny is a large part of the data set above), since we’re comparing all performances at home with those on the road. If he stinks at Fenway he should probably stink on the road, and the two would cancel each other out.
Maybe Dan hasn’t read my With or Without You (WOWY) in THT08 Annual. But there’s simply no reason to create a park factor for Manny that actually uses his own performance. You’re not going to create a HR park factor for LHH at Yankee Stadium by using Babe Ruth, and then applying it to Babe Ruth. You need to compare how Manny does against others in the same contexts. You can’t compare Manny to half of Manny.
Normally, for park factors, a guy might be 5% of the sample, which is why we don’t care too much about removing the offending player. But, in this case, he makes up 50% of the sample. The shortcut we allowed in the 5% case can no longer apply here. You may still get the same conclusion in this particular case, but that’s beside the overall point.
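To make the 5%-versus-50% point concrete, here is a minimal sketch in Python of a pooled home-minus-road fielding effect, computed with and without one player. Everything in it is an assumption for illustration: the `Split` record, the `park_effect` and `sample_share` helpers, and the numbers are made up, and this is not how Dan, MGL, or anyone else actually builds their park factors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Split:
    """Hypothetical home/road fielding line for one player (or pool of players)."""
    player: str
    home_runs: float    # runs vs. average in the home park
    home_chances: int
    road_runs: float    # runs vs. average everywhere else
    road_chances: int


def sample_share(splits: List[Split], player: str) -> float:
    """Fraction of all home chances that belong to `player`."""
    total = sum(s.home_chances for s in splits)
    mine = sum(s.home_chances for s in splits if s.player == player)
    return mine / total


def park_effect(splits: List[Split], exclude: Optional[str] = None) -> float:
    """Pooled home-minus-road runs per chance, optionally leaving one player out."""
    kept = [s for s in splits if s.player != exclude]
    home_rate = sum(s.home_runs for s in kept) / sum(s.home_chances for s in kept)
    road_rate = sum(s.road_runs for s in kept) / sum(s.road_chances for s in kept)
    return home_rate - road_rate


# Made-up numbers: one regular who supplies half of the home sample.
splits = [
    Split("regular", home_runs=-10.0, home_chances=500, road_runs=-2.0, road_chances=500),
    Split("others", home_runs=-4.0, home_chances=500, road_runs=0.0, road_chances=500),
]

share = sample_share(splits, "regular")
with_him = park_effect(splits)
without_him = park_effect(splits, exclude="regular")
print(f"share={share:.2f} with={with_him:.4f} without={without_him:.4f}")
```

With these invented inputs the regular is half the sample, and the pooled effect moves noticeably once he is left out. At a 5% share the two versions would barely differ, which is why the shortcut is usually harmless.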
How does MGL calculate his UZR park factors?
I see it as a necessary evil. If you don’t include Manny in the calculation, then you’ve got to use different park factors for every player the Red Sox put out there, or for any other team/outfield spot.
You could use visiting players only, but some teams, like the Angels, hit extremely well at home for no obvious reason. I think you need to balance it out by using both home and road data.
It’s a lot easier, and I checked my Totalzone numbers: Manny comes out, for 2007, at -4.2 runs. If I take him out of the park factor calculation, which uses 5 year data (keeping other Red Sox LF in), it changes to ... -4.1 runs.
All of that difference for an hour of work. From a cost/benefit standpoint, doesn’t seem worth it to me.