I don’t remember. It may have been 2007, or a weighted average of the last 3 years.
I’d suggest either:
a) take everyone with a reliability figure of .05 and below, and figure out the average
b) sort all players by PA, and take the top n players, such that the sum of their PA equals around 180,000 (total PA of nonpitchers in a typical year), and figure out the average
Note that the last column shows the wOBA. Combined with the league average wOBA you are seeking (from either a or b above) and the PA, it will give you LWTS.
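For anyone who wants to script option b, here is a minimal Python sketch. It assumes each player is a (PA, wOBA) pair and that "the average" means a PA-weighted average; the thread does not say whether the average is weighted, so treat that as my assumption. The function name and the sample numbers are made up for illustration.

```python
def league_average_woba(players, target_pa=180_000):
    """PA-weighted wOBA of the highest-PA players, taken until their
    combined PA reaches target_pa (assumption: weighting by PA)."""
    # Sort by PA, largest first, as described in option b.
    ranked = sorted(players, key=lambda p: p[0], reverse=True)
    total_pa = 0
    weighted_sum = 0.0
    for pa, woba in ranked:
        if total_pa >= target_pa:
            break
        total_pa += pa
        weighted_sum += pa * woba
    if total_pa == 0:
        raise ValueError("no plate appearances to average")
    return weighted_sum / total_pa


# Made-up (PA, wOBA) pairs and a tiny PA target, just to show the cutoff.
sample = [(600, 0.400), (400, 0.300), (100, 0.200)]
print(league_average_woba(sample, target_pa=1000))
```

With a real projection file you would leave target_pa at 180,000 and pass in every nonpitcher.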
***
Doing what I said in step b (396 players), I get a wOBA of .338. So Pujols, with a .421 wOBA and 603 PA gives us:
runs = (.421-.338)/1.15*603= +44 runs
Step a gives us an average of .334, which makes Pujols +46 runs.
When I repeat this with the pitchers, I get Runs/G of 4.80. That should be the environment.
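The Pujols arithmetic above can be checked with a short Python sketch. The 1.15 divisor and the inputs are taken straight from the formula in this comment; the function and constant names are my own.

```python
WOBA_DIVISOR = 1.15  # divisor from the formula in this thread


def runs_above_average(woba, league_woba, pa, divisor=WOBA_DIVISOR):
    """Runs above average: (wOBA - league wOBA) / divisor * PA."""
    return (woba - league_woba) / divisor * pa


# Pujols: .421 wOBA in 603 PA, against each league baseline.
step_b = runs_above_average(0.421, 0.338, 603)  # baseline from step b
step_a = runs_above_average(0.421, 0.334, 603)  # baseline from step a
print(round(step_b), round(step_a))  # 44 46
```

A four-point change in the league baseline moves a full-time hitter by about two runs, which is why the choice between a and b matters a little.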
Thanks, Tango. Mind if we post them on THT again?
Sure, go ahead!
Hardball Times has posted the Marcels on its site:
http://www.hardballtimes.com/main/blog_article/2008/01/05/
Fangraphs has the forecasts of Bill James, Chone, and Marcel:
http://www.fangraphs.com/statss.aspx?playerid=1070&position=OF
Obviously, none of us have incorporated known league suspensions.
I’m curious as to how the ages are calculated. The Marcel google doc shows Ryan Howard as age 29. His birthdate is 11/19/79 - so wouldn’t he turn 29 after the 2008 season is over?
Thanks in advance.
Ah, my chance to showcase the wiki, so I don’t have to repeat myself:
http://www.tangotiger.net/wiki/index.php?title=Seasonal_Age
Cool, thanks for the opportunity.
Thanks!
Question #2 - has anything changed in the calculation since the 2004 introduction?
I ran the exact Beltran example and got the same numbers, but when I tried it on 2008, the results did not match exactly for everyone.
No change.
OK - one more question:
What does step #6 “Rebaseline the results against an assumed league average of 2003.” mean?
If you don’t mind, would you be able to run through the 2008 calc for David Wright?
I think that’s the step where I have to convert the performance line that was based on 2001-2003 data into a 2003 context. That is, I assume that the run environment in 2004 will equal that of 2003. I could have assumed that 2004 = average of 2001 to 2003 at the league level, I suppose.
Are you saying this?
04yrAdj = [r/g03] / [(3*r/g01+4*r/g02+5*r/g03)/12]
Do you know what your factor ended up being for 2008?
Thanks
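The adjustment proposed in the question above can be sketched in Python as follows. This is only the formula as the question states it (latest season's R/G over the 3/4/5-weighted R/G of the three seasons); nothing in the thread confirms it is exactly what Marcel does. The function name is hypothetical and the R/G values are placeholders, not real league figures.

```python
def rebaseline_factor(rpg_oldest, rpg_middle, rpg_latest, weights=(3, 4, 5)):
    """Latest-season R/G divided by the weighted three-season R/G."""
    w_old, w_mid, w_new = weights
    weighted_rpg = (
        w_old * rpg_oldest + w_mid * rpg_middle + w_new * rpg_latest
    ) / sum(weights)
    return rpg_latest / weighted_rpg


# Placeholder run environments (NOT actual league data).
factor = rebaseline_factor(4.8, 4.6, 4.7)
print(f"{factor:.4f}")
```

If the run environment was flat over the three years, the factor is 1 and the step changes nothing.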
For the PA regression you use 5/4/3 and 1200.
For the pitching IP you use 3/2/1 but what league amount do you use?
Thanks.
Hmmm… good question. I wrote this 4 years ago:
http://www.tangotiger.net/archives/stud0346.shtml#1028
But I didn’t mention what it was. Weird. Anyway, it’s 800/3 IP.
So, a guy with 200 IP in 2007, 150 in 2006, and 100 in 2005 would be regressed as follows:
totalIP = 200*3 + 150*2 + 100*1 = 1000
leagueIP = 800/3 = 266.67
regression toward league mean = 266.67 / (1000+266.67) = 21%
r = 1 - .21 = .79
Basically, a batter with 1200 PA and a .333 OBP would be on base 400 times, and out 800 times. So, that’s where the 800 comes from: 800 outs at 3 outs per inning is 800/3 IP.
That is, the 266.67 IP for a pitcher roughly corresponds to 1200 PA for a batter.
So, the batter’s performance (the 5/4/3 = 12 weights) is more indicative than the pitcher’s performance (3/2/1 = 6 weights).
A pitcher would have to face twice as many batters as a hitter would to get the same regression amount.
A pitcher with say 266.67 IP each year for 3 years would have an r=.857.
A hitter to have that level of r would need 600 PA each year for 3 years.
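The regression numbers above can be reproduced with a small Python sketch. The weights (3/2/1 for IP, 5/4/3 for PA) and the league constants (800/3 IP, 1200 PA) come from this thread; the function names are mine, and I am assuming seasons are listed most recent first.

```python
def weighted_total(seasons, weights):
    """Weighted playing time; seasons are listed most recent first."""
    return sum(s * w for s, w in zip(seasons, weights))


def reliability(weighted_time, league_constant):
    """r = weighted time / (weighted time + league constant)."""
    return weighted_time / (weighted_time + league_constant)


LEAGUE_IP = 800 / 3   # about 266.67 IP
LEAGUE_PA = 1200

# Pitcher with 200, 150, 100 IP in 2007, 2006, 2005.
example = reliability(weighted_total([200, 150, 100], (3, 2, 1)), LEAGUE_IP)

# Pitcher with 800/3 IP a year versus hitter with 600 PA a year.
pitcher = reliability(weighted_total([LEAGUE_IP] * 3, (3, 2, 1)), LEAGUE_IP)
hitter = reliability(weighted_total([600] * 3, (5, 4, 3)), LEAGUE_PA)

print(f"{example:.2f} {pitcher:.3f} {hitter:.3f}")  # 0.79 0.857 0.857
```

The amount regressed toward the league mean is 1 - r, so the 200/150/100 pitcher is regressed about 21%.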
Thanks for these.
I’d like to calculate linear weights. What did you use as the 2008 league batting line? Can I just use Chris Basak’s projection to figure out the league outs multiplier?