Tuesday, December 23, 2008
Dual Positions, using bUZR
With so much of this being discussed at Primer and Fangraphs, I figured it was time to update my process to use the latest data from MGL’s bUZR. So, sit back, and prepare to be numbed by numbers:
As I have done in the past few years, I will compare the UZR of players who have played multiple positions.
My datasource this time will be bUZR (UZR using BIS data, as published on Fangraphs). The data is from 2002-2008, and is NOT age adjusted. The “games” I will be citing are the UZR games, which is the number of effective games a player has played, based on his opportunities. For all intents and purposes, it’s games played.
As always, we start in the outfield.
There are 419 players that have played both LF and RF in that time frame. This ranges from Geoff Jenkins playing 431 in LF and 333 in RF, to Junior who played 1 in LF and 239 in RF, to a whole bunch of players who played 1 game in each position, and everything in-between. The stats of each player are weighted by the lesser of his two games totals.
Like I said, I have 419 players, and those players totalled 11,728 matching games. This is one-third of all games played. This is A LOT of games. Around 70% of players who played one corner outfield also played the other corner outfield. We have prime evidence that teams treat the corner outfield as interchangeable.
Their UZR was -0.1 per 162 G in LF and +0.9 in RF. Remember, this is the same group of players, playing in two different positions. (You perform better, relative to your competition, if your competition is worse, by definition.) So, in this case, we say that the average LF is 1 run better than the average RF, in 2002-2008 (excluding arm). If we included UZR arms, I would presume that it’s pretty much a wash.
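One way to sketch the matched-games weighting in Python. The `DualLine` record, the `matched_rates` function, and any numbers you feed it are hypothetical names for illustration, not the actual process or real bUZR data. The sketch assumes each player's UZR is turned into a per-game rate at each position, and that both rates are then weighted by the lesser of his two games totals.

```python
from dataclasses import dataclass


@dataclass
class DualLine:
    """One player's totals at two positions (hypothetical layout)."""
    name: str
    games_a: float  # UZR games at position A
    runs_a: float   # UZR runs at position A
    games_b: float  # UZR games at position B
    runs_b: float   # UZR runs at position B


def matched_rates(lines, per=162):
    """Return (matching games, rate at A, rate at B), rates per `per` games.

    Each player's per-game rate at both positions is weighted by
    min(games_a, games_b), so the same weight applies to both sides.
    """
    total_w = 0.0
    sum_a = 0.0
    sum_b = 0.0
    for p in lines:
        # Only players with time at both positions contribute.
        if p.games_a <= 0 or p.games_b <= 0:
            continue
        w = min(p.games_a, p.games_b)
        total_w += w
        sum_a += w * p.runs_a / p.games_a
        sum_b += w * p.runs_b / p.games_b
    if total_w == 0:
        raise ValueError("no players with games at both positions")
    return total_w, per * sum_a / total_w, per * sum_b / total_w
```

Because the weight is the same on both sides, a guy with 400 games at one spot and 1 game at the other counts as a single matching game, which keeps the full-timers from swamping the comparison.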
How did each corner OF do in CF? We have 280 players who played both CF and LF, with 8391 matching games (roughly 25% of all games played). This too is A LOT. Those players were -2.9 runs in CF and +4.8 in LF, for a difference of 8 runs.
Comparing CF to RF: 7741 matching games, -5.1 in CF, +3.4 in RF, also for a difference of about 8 runs.
There we have it: a pretty good match. The average CF, using this sample, is 8 runs better with the glove than the corner OF, and the corner OF are equal to each other.
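The arithmetic behind those gaps can be sketched in a few lines of Python, using only the rates and matching games quoted above. Pooling the two pairings by matching games is an assumption made for illustration (players who appear in both pairings get counted twice), and the names `pairings`, `gap`, and `pooled_gap` are hypothetical.

```python
# Per-162 rates and matching games as reported in the text above.
pairings = {
    "CF vs LF": {"games": 8391, "cf": -2.9, "corner": 4.8},
    "CF vs RF": {"games": 7741, "cf": -5.1, "corner": 3.4},
}


def gap(row):
    """Runs by which the same players look better in the corner than in CF."""
    return row["corner"] - row["cf"]


def pooled_gap(rows):
    """Average the gaps, weighting each pairing by its matching games."""
    total = sum(r["games"] for r in rows)
    return sum(r["games"] * gap(r) for r in rows) / total


cf_gap = pooled_gap(list(pairings.values()))
print({name: round(gap(row), 1) for name, row in pairings.items()})
print(round(cf_gap, 1))
```

The two individual gaps come out to 7.7 and 8.5 runs, and the pooled figure lands at about 8, in line with the round number used in the text.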
What kind of biases could we have? Well, not every player plays the dual positions. Then again, I do have some 70% of the players that did. Because of the “minimum games of the two positions”, I only get 25% or 30% of the matching games. Nonetheless, it would be fairly hard to argue that we’ve got too strong a bias here.
The other place is aging. Since this sample looks at seven years in total, a guy who is a CF in 2002-04 and a corner OF in 2005-08 looks like the same player playing two different positions. In reality, he got worse because he’s older. And since players will likely move more from CF to the corners than the other way around, then we have a bias here. Maybe I’ll handle this later.
I suspect that 8-run gap would turn into a 9 or 10 run gap if we handle the aging bias. And perhaps if we handle the bias that not all players will play in dual positions (Adam Dunn in CF?), then only players who don’t get exposed too badly get to compete in the dual positions. So, maybe we have another 1 or 2 run bias here. This could turn our 8-run observed difference (with bias) to an unbiased estimate of say 12 runs.
***
Let’s do one more that I’ve never done in all the years I’ve done this. Since we suspect that LF and RF are interchangeable (both in lack of bias, and in talent levels being the same), let’s compare the dual “positions” of CF to LF/RF.
Of the 380 players who have played CF, 303 of them have played at least one game in the corners. The total number of games is 12,386, which is over one-third of all games. These players were +4.4 runs in the corner, and -3.4 in CF, for a difference of 8 runs.
We didn’t learn anything new here, though we did manage to increase our sample size. Remember this, as I may use this later.
***
Now, how about SS/2B? In the outfield, we saw that there was a lot of movement, and therefore we were not that concerned about potential talent bias. That concern does exist here. We have 211 players that played both positions (around half the middle infielders), totalling 6243 games, roughly 20% of all games. So, yeah, we may have an issue, especially since we know that guys who play SS are not the bad 2B (for the most part).
Players at these dual positions were +1.3 at 2B and -2.5 at SS, for a gap of 3.8 runs. A possible age-bias and a likely talent-bias probably make this gap look too small. A 5 or 6 run difference would probably be more likely in an unbiased estimate.
What about the SS/3B? We have 170 players (one-third of all players who played SS or 3B), totalling 4518 games (15% of all games). In this case, they were both zero! Even if we have the talent-bias (only the really good 3B play SS), why were they a zero at 3B to begin with?
Finally, 2B/3B: 203 players, totalling 5915 games, -0.8 at 2B, -1.1 at 3B. Even! I don’t think we have an age-bias to contend with, but we may have a talent-bias. Not necessarily in terms of quality of talent, but perhaps in breadth of skills.
There is one more thing to note here, which is probably the overriding factor, and that is the “experience bias”. A guy who is a full-time 2B, if he plays 3B every now and then, will simply not be as good. That is, there is a certain amount of benefit in having experience at a position. If this experience is worth say 5 or 10 runs, and if all of the movement goes from the full-time 2B playing part-time 3B, and if they end up with equal UZR, that doesn’t mean that the baseline is the same. That’s because we have an experience-bias to account for.
That said, we know that, while there may be a disproportionate number of players moving between 2B/3B one-way based on experience level, it’s not all of them.
We can try to account for this in a few ways. One is to separate players into “primarily 2B”, “primarily 3B”, “primarily SS”, and see how they do at 2B and 3B. If for example we have that the primarily 2B are a +5 at 2B and +5 at 3B, while the primarily 3B are a +5 at 3B and -5 at 2B, perhaps the best interpretation is that the +5 primarily 2B would be +10 at 3B, but because of lack of experience at +5. And, the primarily 3B of +5 at 3B would be +0 at 2B, but because of lack of experience are -5.
I don’t know yet. I’ll put that on my check-the-bias list.
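The hypothetical split by primary position can be worked through in Python. This is only a sketch under two loud assumptions that are not established above: that both groups pay the same experience penalty at their off-position, and that both groups face the same true gap between the positions. The function names are made up for illustration.

```python
def implied_experience_penalty(home_1, away_1, home_2, away_2):
    """Penalty p that makes both groups imply the same positional gap.

    Group 1 is at home at position X and away at position Y.
    Group 2 is at home at position Y and away at position X.
    The gap is (rate at Y) - (rate at X) for the same player:
      group 1: (away_1 + p) - home_1
      group 2: home_2 - (away_2 + p)
    Setting the two equal and solving for p gives the line below.
    """
    return (home_1 + home_2 - away_1 - away_2) / 2


def adjusted_gap(home, away, penalty):
    """Gap (away position minus home position) with the penalty added back."""
    return (away + penalty) - home


# Numbers from the hypothetical: primarily 2B are +5 at 2B and +5 at 3B,
# primarily 3B are +5 at 3B and -5 at 2B.
p = implied_experience_penalty(5, 5, 5, -5)
g = adjusted_gap(5, 5, p)
print(p, g)
```

With those inputs the implied penalty is 5 runs, which reproduces the reading above: the primarily 2B would be +10 at 3B with experience, the primarily 3B would be 0 at 2B, and both groups point to the same 5-run gap.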
As it stands, the SS/3B/2B issue is fairly unresolved. I suspect we’ll end up at something like SS is 5 runs ahead of 2B/3B as a group, while there may be a gap of 1 or 2 runs between 2B/3B.
***
Why don’t we look at SS compared to 2B/3B as a group, similar to what we did with CF compared to LF/RF? So, we’ll compare a player who played SS and played either (or both) of 2B, 3B.
We have 231 such players, with 8707 games, performing -1.2 at SS and +0.8 otherwise, for a 2 run difference. As noted, we have the potential talent-, age-, and experience-bias. I can see that 2 run difference going up to 5 or 6 runs. I don’t think much higher than that.
***
To complete this installment, let’s take our first infield/outfield look. How do 2B/3B compare to LF/RF? Since we think that players in each position group are roughly the same, we get to increase our sample size by combining positions. We have 177 players involved here, totalling 5875 games. They were -1.3 in the corner outfield, while these same players were -5.1 in the infield. The infield is therefore 4 runs ahead of the corners.
In this particular case, the experience-bias will be stronger, since almost all IF/OF movements occur IF to OF. So, there is a very strong disproportionate movement here.
In addition, we have an age-bias, where likely older infielders are moving out.
We also have the talent-bias, which makes it so that the lesser infielders are the ones going to the OF. It’s possible that much better infielders would perform much much better in the outfield.
Finally, we have a handedness-bias: lefty throwing corner outfielders would make for very terrible infielders. How bad? We estimate there is a 30-run bias for such players.
All told, we can see how a 4-run gap can balloon to a 10 or even higher (15?) run gap between 2B/3B and LF/RF.
How much higher needs to be studied.
***
So, if we are reasonably sure that we have SS being around 5 runs ahead of 2B/3B, and CF being 10 runs ahead of LF/RF, the question is how to align the infielders and outfielders. We think that the 2B/3B need to be at least 10 runs ahead of the corner outfielders. In that case, we’d get this:
7.5 SS
2.5 2B, 3B, CF
-7.5 LF, RF
These are consistent with the expectations of the above paragraph.
But, what if the gap between 2B/3B and the corner outfielders is closer to 15 runs? Then we get this:
10 SS
5 2B, 3B
0 CF
-10 LF, RF
As you can see, the question is how to center 2B/3B/CF. The other positions will be dependent on how those three are treated.
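Both charts can be reproduced from the three gaps with a small Python sketch. The centering rule here (the six positions average to zero) is an assumption inferred from the charts, not something stated above, and `position_scale` and its parameter names are hypothetical.

```python
def position_scale(if_minus_corner, ss_minus_if=5.0, cf_minus_corner=10.0):
    """Build a run scale from three gaps, centered so the six positions sum to zero.

    if_minus_corner: how far 2B/3B sit ahead of LF/RF (10 or 15 above)
    ss_minus_if:     how far SS sits ahead of 2B/3B
    cf_minus_corner: how far CF sits ahead of LF/RF
    """
    # Start with the corner outfield as the zero point.
    raw = {
        "SS": if_minus_corner + ss_minus_if,
        "2B": if_minus_corner,
        "3B": if_minus_corner,
        "CF": cf_minus_corner,
        "LF": 0.0,
        "RF": 0.0,
    }
    # Shift everything so the average position is zero.
    mean = sum(raw.values()) / len(raw)
    return {pos: value - mean for pos, value in raw.items()}


print(position_scale(10))
print(position_scale(15))
```

Changing only the 2B/3B-to-corner gap moves every line of the chart, which is why the whole scale hinges on where 2B/3B sit relative to CF.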
Until we can get more questions answered, I prefer this chart:
7.5 SS
2.5 2B, 3B, CF
-7.5 LF, RF
I’d rather say that the 2B, 3B, CF are sorta-kinda around the same value than create a chart that puts the CF 5 runs below the two infielders. Given the uncertainty in the process, and the biases still to account for, I think it is more justifiable to keep the three in the same group than to split them out at this moment. I’ll find it easier to break them up later, if I have to, than to end up in a situation where the CF might be anywhere from 5 runs behind the 3B to 3 runs ahead of him. By keeping CF=3B for now, I’m hedging my bets.
And seeing how both positions have had around league-average hitters since 1980, there’s a certain amount of natural selection that centers me on that position.