Friday, April 03, 2009
Starting from scratch
A reader was asking me about starting a baseball league from scratch. All you have is:
1. A population of 5000 nonpitchers
2. They have a mean offense runs per PA of 0, with one SD = .150 per PA
3. They have a mean fielding runs per PA of 0, with one SD = .075 per PA (presumes 75% of PA are BIP)
As I told him:
I meant that if you take the 5000 players in pro ball, the range in fielding would come in at +/- .10 runs per BIP, regardless of position. Or, more accurately, on average.
So, the range might be +/- .15 runs per BIP at SS, +/- .13 at CF, +/- .12 at 3B and 2B, and so on down to +/- .05 at 1B, so that on average it would be +/- .10.
On average, I’m saying that it’s +/- .15 runs per PA as a batter, and +/- .10 runs per BIP as a fielder. Since BIP make up 75% of PA, +/- .10 per BIP works out to +/- .075 per PA (.10 x .75), so the range is twice as wide for hitting as for fielding.
The plan therefore is to be able to find the right balance in terms of distributing your talent across positions. Basically, given these conditions, how much hitting and fielding talent should you expect to find in MLB at each position? I guess I should have also noted that we have a handedness constraint at the three IF positions, and we’d have to accept a constraint at C.
The question being asked is this: if my assumptions are valid, what kind of distribution of hitting and fielding talent should we expect to exist among these 5000 players?


So I created a little simulator to run this scenario. It randomly generates 5000 players with scores in hitting and fielding that average 0 and have an SD of 0.15. The program then goes through each position in order of defensive difficulty (SS,2B,CF,3B,RF,LF,1B,DH) and picks the best player at each position based on adding the hitting score plus an adjusted fielding score depending on position.
These were the adjustments:
'SS':0.75f, '2B':0.60f, 'CF':0.55f, '3B':0.50f, 'RF':0.35f, 'LF':0.30f, '1B':0.25f, 'DH':0.00f
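For anyone who wants to play along, here is a rough Python sketch of the procedure just described. It is not the program that produced the runs below. The names (FIELD_WEIGHTS, make_players, greedy_pick) are made up for illustration, and it assumes that the scores are independent normal draws and that a player is removed from the pool once he has been picked.

```python
import random

# Fielding weights from the list above, in the order the positions are filled.
FIELD_WEIGHTS = [
    ("SS", 0.75), ("2B", 0.60), ("CF", 0.55), ("3B", 0.50),
    ("RF", 0.35), ("LF", 0.30), ("1B", 0.25), ("DH", 0.00),
]


def make_players(n=5000, sd=0.15, seed=None):
    """Return n (hit, fld) pairs with mean 0 and the given SD.

    Assumption: both scores are independent normal draws.
    """
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sd), rng.gauss(0.0, sd)) for _ in range(n)]


def greedy_pick(players):
    """Fill each position in order with the best player still available.

    Assumption: a player who has been picked cannot be picked again.
    """
    available = set(range(len(players)))
    team = []
    for pos, weight in FIELD_WEIGHTS:
        def score(i):
            hit, fld = players[i]
            return hit + weight * fld

        best = max(available, key=score)
        available.remove(best)
        hit, fld = players[best]
        team.append((best, pos, hit, fld, score(best)))
    return team


if __name__ == "__main__":
    print("Top Players:")
    for pid, pos, hit, fld, score in greedy_pick(make_players(seed=1)):
        print(f"{pid:04d} Pos: {pos} Hit: {hit:.2f} Fld: {fld:.2f} Score: {score:.2f}")
```

Changing the seed gives a different league each time, which is a quick way to see how much the chosen team varies from run to run.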
Here are some runs of the sim with my comments:
Top Players:
2198 Pos: SS Hit: .41 Fld: .46 Score: .75
3098 Pos: 2B Hit: .40 Fld: .43 Score: .66
1651 Pos: CF Hit: .35 Fld: .41 Score: .57
0852 Pos: 3B Hit: .46 Fld: .06 Score: .49
0777 Pos: RF Hit: .45 Fld: .07 Score: .47
4666 Pos: LF Hit: .45 Fld: .03 Score: .46
4607 Pos: 1B Hit: .43 Fld: .06 Score: .45
4416 Pos: DH Hit: .45 Fld: -.15 Score: .45
This team looks fairly reasonable and has a lot of good hitters. Both the SS and 2B are great all-around players. The best hitter is the 3B, who looks more like a Ryan Braun type and probably wouldn’t be a 3B on another team.
Top Players:
3300 Pos: SS Hit: .37 Fld: .31 Score: .61
0545 Pos: 2B Hit: .44 Fld: .17 Score: .55
3563 Pos: CF Hit: .32 Fld: .38 Score: .53
4771 Pos: 3B Hit: .45 Fld: .11 Score: .51
3156 Pos: RF Hit: .40 Fld: .22 Score: .47
2865 Pos: LF Hit: .39 Fld: .22 Score: .45
4519 Pos: 1B Hit: .40 Fld: .12 Score: .43
1619 Pos: DH Hit: .43 Fld: -.43 Score: .43
This group shows the problem with a simple-minded assignment of players. The CF would be much better used at 2B, the 2B should probably be the RF, and likewise the 3B should be the LF. The problem is that those players scored so high on offense that each was the most valuable remaining player at his position. Unfortunately, the result doesn’t make sense once the players are viewed together as a final team. The DH is also funny: I wonder whether a player with a fielding score that low could even run the bases.
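One rough way to check a reshuffle like this is to take the eight players already chosen and try every way of handing out the eight positions, keeping the arrangement with the highest team total. The Python sketch below does that by brute force (8! = 40,320 orderings) using the rounded numbers from the second run and the weights listed earlier. The names WEIGHTS, TEAM, team_total and best_lineup are made up for illustration, and scoring a lineup by the sum of hit + weight x fld is an assumption carried over from how individual players were scored.

```python
from itertools import permutations

# Fielding weights by position, as listed earlier.
WEIGHTS = {"SS": 0.75, "2B": 0.60, "CF": 0.55, "3B": 0.50,
           "RF": 0.35, "LF": 0.30, "1B": 0.25, "DH": 0.00}

# (id, hit, fld) for the eight players in the second run, in the order printed.
TEAM = [
    ("3300", 0.37, 0.31), ("0545", 0.44, 0.17), ("3563", 0.32, 0.38),
    ("4771", 0.45, 0.11), ("3156", 0.40, 0.22), ("2865", 0.39, 0.22),
    ("4519", 0.40, 0.12), ("1619", 0.43, -0.43),
]


def team_total(players, positions):
    """Sum of hit + weight * fld when players[i] plays positions[i]."""
    return sum(hit + WEIGHTS[pos] * fld
               for (_, hit, fld), pos in zip(players, positions))


def best_lineup(players):
    """Try every ordering of the positions and keep the highest team total."""
    best = max(permutations(WEIGHTS),
               key=lambda order: team_total(players, order))
    lineup = {pid: pos for (pid, _, _), pos in zip(players, best)}
    return lineup, team_total(players, best)


if __name__ == "__main__":
    print("as picked:", round(team_total(TEAM, list(WEIGHTS)), 3))
    lineup, total = best_lineup(TEAM)
    print("reshuffled:", round(total, 3))
    for pid, pos in lineup.items():
        print(pid, pos)
```

The result need not match the swaps suggested above, and it only reshuffles the eight players already on the team rather than reconsidering the whole pool of 5000.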
I’m interested in hearing any ideas about a better way to assign players based on these scores.
P.S. I can post the program code if anybody wants to see it.