Monday, April 21, 2008
WOWY Ripken
One of my issues under consideration is how to handle the situation when the “without” is a lot smaller than the “with”. WOWY(*), as you know, looks at a player’s performance with some parameter and without it. Let’s take the case of Cal Ripken and some of his pitchers (guys who pitched predominantly, or totally, with the Orioles, with Ripken as their shortstop). One such pitcher is Jeff Ballard. He had some 70 innings without Ripken as his shortstop and 700 with. Normally, I scale the “without” data to the “with” because I want to keep my “with” untouched. I want his innings, outs, and balls in play to reflect his actual totals. But in cases like this, I’m very uncomfortable scaling up the “without” data by a factor of 10.
I have some options here:
1. Break with what I want, and scale the “with” down to the “without”. That is, allow either one to scale to the other’s lower sample size. Now, instead of adjusting just the without, I’m adjusting both. And the “totals” line of the “with” will not match his actual overall totals. I don’t necessarily like this.
2. Allow the scaling up to a certain factor. I’m thinking 4 times. Why 4? I don’t know. I’m thinking at least 2. And I don’t want to go 8 or 10 times. Four sounds reasonable. So, in this case, I can scale the “without” up to 280 innings. That still leaves me 420 innings short of the 700. So, I either scale the “with” down to 280, or do something else.
Andy likes to scale things as 2/(1/70+1/700), so I’d scale both to 127. He’s got statistical theory on his side. That still leaves me with the same basic problem.
3. Fill in whatever I’m not scaling (that is, the difference between 700 and 127, or between 700 and 280, or between the unscaled 700 and 70) with some fixed average. That average could be the average for the average pitcher that shortstop had that year, or that he had in his career, or the average pitcher in the league that year, or whatever.
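To put numbers on the options above, here is a rough sketch of the arithmetic for the Ballard case (70 innings without, 700 with). The function names and the max_factor parameter are made up for illustration; this is not anyone's actual WOWY code, just the scaling targets described in the list.

```python
def harmonic_scale(n_with, n_without):
    """Andy's suggestion: scale both samples to 2 / (1/with + 1/without)."""
    return 2.0 / (1.0 / n_with + 1.0 / n_without)


def capped_scale(n_with, n_without, max_factor=4.0):
    """Option 2: scale the 'without' up, but by no more than max_factor.

    max_factor=4.0 is the arbitrary cap floated in the post, not a derived value.
    """
    return min(n_with, max_factor * n_without)


def fill_in_innings(n_with, n_scaled):
    """Option 3: innings left over that would be filled with a fixed average."""
    return n_with - n_scaled


n_with, n_without = 700, 70

harmonic = harmonic_scale(n_with, n_without)
capped = capped_scale(n_with, n_without)

print(round(harmonic))                                 # 127
print(capped)                                          # 280.0
print(fill_in_innings(n_with, capped))                 # 420.0
print(round(fill_in_innings(n_with, harmonic)))        # 573
```

Whichever target you pick, the gap between it and the 700 actual innings is what option 3 would have to fill with the fixed average.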
What do you guys think?
(*) With Or Without You, or since MGL is a stickler about it: Without or With You ... dude, I need a cool sounding name, and WOWee sounds better than WWOU
I am doing a lot of WOWY (but I call it WWOY so I can call it my own!), thanks to you (I love the idea). I always use either the Andy method (that is the harmonic mean, I think) or the short-cut, which is to weight by the lesser of the two values. There is little doubt in my mind that that is the best way to do it (either one).
I know you’d LIKE to be able to use the “with” numbers to do the scaling, but that is just not correct.