Thursday, December 18, 2008
Chone has its own site
Rally is rolling out his forecasting system site, and it’s still in beta. Here’s a preview.
Rally, how do you come up with your percentile projections?
At some point I’ll put the defensive projections on this site. So far, development has been the priority.
Victor - based on an estimate of the variance in all the component rates and a normal distribution. Beyond that, I’d just have to say it’s a trade secret.
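For readers wondering what "variance plus a normal distribution" could look like in practice, here is a minimal sketch for a single rate stat. It is not Rally's method, which is undisclosed. The function name, the .340 mean, and the .025 standard deviation are all hypothetical values picked for illustration.

```python
from statistics import NormalDist


def rate_percentiles(mean, sd, percentiles=(10, 25, 50, 75, 90)):
    """Return {percentile: rate}, assuming the rate is normally distributed."""
    dist = NormalDist(mu=mean, sigma=sd)
    return {p: dist.inv_cdf(p / 100) for p in percentiles}


# Hypothetical on-base percentage projection: mean .340, sd .025 (made-up inputs)
for p, value in rate_percentiles(0.340, 0.025).items():
    print(f"{p}th percentile: {value:.3f}")
```

A full system would need a variance estimate for every component rate and some way to combine them, which is the part described above as a trade secret.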
The “simple player valuation” table is fantastic.
Can team wins be approximated by summing the wins over replacement, plus 40 wins (or whatever the best guess for a replacement level season is)?
Only if you add your own assumptions for playing time. Once I get the pitchers done, I’ll probably update the front page with team projected records.
Rally, how do you project playing time? Something Marcel-like? If so, how do you handle it for team projections? Use a sim?
It’s a complicated process to estimate playing time. It involves age and recent playing time data, but I’ve tweaked it a bit to limit the effect one wiped out season might have.
For team projections, it’s all manual. On a team by team basis I estimate how much playing time people will get. I never go above the original projection, but for pretty much everyone other than the starters, I go below quite a bit.
I’ll do it only once, after the bulk of the top free agents find new homes.
I’ve done the 2009 Dodgers team WAR calculations based on a combination of projections from Marcel/Zips/Chone/BJ, throwing out the outlier(s). Having to estimate playing time is a bit tricky, but being off by 5-10% here or there doesn’t make too big of a difference. Where roster spots have openings, I put in a replacement level player or something close to it until a player is signed to fill that role. I tried my best to follow Tango’s WAR calculation methodology (there may be some minor errors). The worksheet is on my blog (linked to in my name). And I believe the team replacement level mentioned in #4 should be 50 wins. I admire anyone who does this for every team!
vr, Xei
The defensive projections have been restored to the new site; they are at the bottom. For those of you watching from the beginning, my html skills are growing quickly. I’m now a state of the art 1997 programmer.
(When I first did the defensive stats I just used the excel html output.)
First of all, Rally, this is great. I have one request: that you add a column for PA (which I assume is AB+BB+HBP). But this isn’t crucial.
As to Zack’s question, it occurs to me that there’s a nonlinearity problem. Sky brings up the issue in the “valuing relievers” thread, post #9 and continuing in post #17. (Click my name for link.) I agree with Tango’s response in #18: the numbers don’t need to add up, and it’s not appropriate to think in terms of an all-replacement level team.
Maybe in practice things are close enough to linear that it doesn’t matter. This should be easy enough to figure out using BaseRuns.
Rally/9—Thanks for getting the defensive projections back up there so quickly!
-j
When I do the team projections, I won’t even look at the R150 column (runs+ per 625 PA/150G). I’ll assign playing time, make sure that outs add up to a predetermined level for every team, total up the component stats, and run that through baseruns.
Then add up the runs and innings for pitchers, assign playing time, and get pythag record.
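The last step, turning team runs scored and allowed into a record, can be sketched with the classic Pythagorean formula. The exponent of 2 is an assumption (the thread does not say which exponent is used), and the 800/700 run totals are made-up inputs.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the Pythagorean winning percentage to a full season."""
    return games * pythag_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 800 runs scored, 700 allowed
print(round(projected_wins(800, 700), 1))
```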
Player pages are up.
http://lanaheimangelfan.blogspot.com/2008/12/it-is-fully-operational.html
Rally, any chance we can get league- and park-neutral FIP or ERA on the team pages, like you do with R150 for hitters? Thanks for all the great work like always.
thanks for the great work!
was anyone else surprised by Brett Myers’ projection?
I’ll have to adjust him. I’m projecting pitchers who started more than 50% of their games as starters, and the rest as relievers. There are a few players (Joba, Dempster, Myers) that I have to go in and force the program to project as starters.
I’ve been using the site for estimating team runs. Here’s my process:
1) copy team page into excel
2) add a column for PA (AB+BB+HBP). I’ll heed the suggestion to add this to the report next time I regenerate the pages
3) add another column for PA%. The rule is that it can go below 100, but never over. Many of the deep minor leaguers will get a zero here. Assume that the projected playing time is the most CHONE says a guy can play
4) create new columns for all the offensive stats, and prorate them by PA%.
5) Adjust your PA% to make sure outs (ab-h+cs) equals 4100 (average for last year)
6) make sure that positions are not lopsided, i.e. 900 PA for 1B and 400 for catchers, unless the positions you are over on can cover other positions - it’s OK for your SS to be at 800 and 2B at 600 if your backup SS normally is the backup for both spots
7) Run team totals through baseruns
I’ve done 3 teams so far, and even without Tex the Angels look in OK shape. I’ve got:
Angels 758
A’s 745
Rangers 805
Mariners ? do tomorrow
And the Angel pitching looks to be the class of the division. Not 20 games better, but favorites by a few.
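The seven-step spreadsheet process above could be sketched in code roughly as follows. This is a sketch under stated assumptions: the BaseRuns formula here is one common simple variant, not necessarily the version used for these numbers; the dict keys and function names are hypothetical; and the uniform rescale to 4100 outs stands in for the manual PA% adjustment in step 5.

```python
def baseruns(ab, h, tb, hr, bb):
    """One common simple BaseRuns variant (assumed, not confirmed by the thread)."""
    a = h + bb - hr                                      # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h                                           # outs
    d = hr                                               # automatic runs
    return a * b / (b + c) + d


def team_totals(players, target_outs=4100):
    """Prorate each player by pa_pct (0..1), then scale so outs hit target_outs.

    players: list of dicts with keys ab, h, tb, hr, bb, cs, pa_pct (hypothetical layout).
    """
    for p in players:
        if not 0 <= p["pa_pct"] <= 1:
            raise ValueError("PA% can go below 100 but never over")
    stats = ("ab", "h", "tb", "hr", "bb", "cs")
    totals = {s: sum(p[s] * p["pa_pct"] for p in players) for s in stats}
    outs = totals["ab"] - totals["h"] + totals["cs"]     # outs as defined in step 5
    scale = target_outs / outs                           # uniform stand-in for manual tweaks
    return {s: v * scale for s, v in totals.items()}


def team_runs(players, target_outs=4100):
    t = team_totals(players, target_outs)
    return baseruns(t["ab"], t["h"], t["tb"], t["hr"], t["bb"])
```

Note that a uniform rescale can implicitly push players past their projected playing time, which the process forbids, so it is best treated as a final nudge after PA% has been set by hand. The positional balance check in step 6 is not modeled here.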
Looking at the Cubs page, I see... Craig, 387 AB; Lalli, 352 AB; Scales, 398 AB; Reynolds, 378 AB. And on and on for a couple dozen guys I’ve never heard of (yet). Why not just give these players a standard 600 PA projection and make it cleaner and easier on the eyes?
Rally, do your pitching projections take team defense into account?
Yes, team defense is accounted for. It will need to be updated when 2009 lineups become clearer, and pitcher projections as well.
Something else to consider when forecasting team runs scored (based on player projections) is the effect of lineup position on plate appearances. It’s quite significant. Here are the average number of plate appearances at each lineup slot for each AL team last year:
Slot  Avg PA
1     766
2     748
3     729
4     713
5     697
6     680
7     660
8     643
9     623
That’s an extra 143 PAs for the leadoff hitter vs #9. One shift that comes to mind is Tony Fernandez in 1985 and 1986. In 1985, mostly batting 9th, Fernandez accumulated 618 PAs while playing 161 (batting) games and 1416 defensive innings. In 1986 he mostly hit leadoff, and while receiving a similar amount of playing time (163 batting games and 1443 defensive innings), came to bat 727 times—an increase of more than 100 plate appearances.
So ... in 1986, Fernandez created about 99 runs (baseball ref), or .136 per PA. Even if prior to the 1986 season you accurately forecast the .136/PA rate, if you relied on his 1985 PAs, your forecast would be 84 RC (instead of 99)—a shortfall of 15 runs.
Here’s a simple model to forecast PAs by lineup position:
Slot   ob%   outs/pa  pa/g   outs/g  Pa/162  Actual
1      .347  .653     4.75   3.10    769     766
2      .330  .670     4.64   3.11    751     748
3      .356  .644     4.53   2.91    733     729
4      .356  .644     4.41   2.84    715     713
5      .334  .666     4.30   2.87    697     697
6      .334  .666     4.19   2.79    679     680
7      .325  .675     4.08   2.75    661     660
8      .319  .681     3.97   2.70    643     643
9      .311  .689     3.86   2.66    625     623
TOTAL                 38.73  25.74   6274
~ Ob% is the actual on-base percentage by slot in the AL in 2008.
~ Outs/pa is 1-ob%.
~ Pa/g is plate appearances per game, and for every slot below leadoff, it is the pa/g of the slot above minus 1/9 (eg pa/g for #2 = 4.75 - (1/9) = 4.64).
~ Outs/g is outs/pa * pa/g.
~ Pa/162 is pa/g * 162 games.
~ Actual is the actual number of PAs per slot in the 2008 AL.
The key to the model is pa/g for the leadoff hitter. It is set so the sum of the outs/g equals 25.74. Why 25.74 and not 27.00? Outs on base (caught stealing and GIDP) are not reflected in ob%, and so they are ignored. Plate appearances = AB + BB + HP + SH + SF. Times reached base = H + BB + HP. So, outs should equal plate appearances where the hitter did not reach base—(AB + BB + HP + SH + SF) - (H + BB + HP), or AB - H + SH + SF. So ... using actual numbers from the AL last year: outs = (78120 ab) - (20901 h) + (477 sh) + (686 sf) = 58382 outs, which divided by 2268 games = 25.74.
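Since outs/g is linear in the leadoff pa/g, the leadoff value can be solved for directly rather than found by trial and error. A minimal sketch, using the 2008 AL ob% by slot and the 25.74 target quoted above (the function and constant names are my own):

```python
OBP_BY_SLOT = [.347, .330, .356, .356, .334, .334, .325, .319, .311]  # 2008 AL, slots 1-9
TARGET_OUTS_PER_GAME = 25.74
STEP = 1 / 9  # each slot gets 1/9 fewer PA per game than the slot above


def leadoff_pa_per_game(obp, target_outs=TARGET_OUTS_PER_GAME, step=STEP):
    """Solve sum((1 - ob_i) * (x - i * step)) = target_outs for x."""
    out_rates = [1 - o for o in obp]
    offset = sum(rate * i * step for i, rate in enumerate(out_rates))
    return (target_outs + offset) / sum(out_rates)


def pa_per_season(obp, games=162):
    """Projected PA per lineup slot over a season."""
    lead = leadoff_pa_per_game(obp)
    return [(lead - i * STEP) * games for i in range(len(obp))]


print(round(leadoff_pa_per_game(OBP_BY_SLOT), 2))
print([round(pa) for pa in pa_per_season(OBP_BY_SLOT)])
```

Swapping in a different team's ob% by slot, or a different outs-per-game target, gives slot-by-slot PA estimates for that lineup.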
Good job.
The shortcut is that the difference between each slot is 1/9 (.111).
***
Also, Fernandez may “create” more runs going from 9 to 1, but the guy going from 1 to 9 offsets some of that.
So, presume you have one guy creating .13 runs per PA and another creates .10 per PA.
The gap between the number of PA in the 1st and 9th slots is .8889 per game (8 slots at 1/9 each), or 144 PA in a season.
The better guy gets an extra .03 runs per PA on those extra 144 PA, or about 4 extra runs. (See why lineup order has less impact than we’d otherwise think. Of course if someone is ideally suited for leadoff, then there’s extra leverage there to take advantage of, so maybe that’ll bump to 6 or 8 or 10 runs.)
All to say that unless a team has an egregious setup, I wouldn’t bother too much with the batting order.
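The arithmetic behind that estimate, as a small sketch (the function name is hypothetical; the .13 and .10 rates are the ones presumed above):

```python
def extra_runs_from_swap(better_rate, worse_rate, games=162, slots_apart=8):
    """Runs gained by batting the better hitter higher in the order.

    Each slot is worth 1/9 PA per game, so 1st vs 9th is 8/9 PA per game.
    """
    pa_gap = slots_apart / 9 * games
    return pa_gap * (better_rate - worse_rate)


pa_gap = 8 / 9 * 162
print(round(pa_gap))                                  # PA gap between 1st and 9th
print(round(extra_runs_from_swap(0.13, 0.10), 1))     # runs gained over a season
```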
Agreed. I think I’m worried more about off-season acquisitions being dropped into a lineup. The PAs will change (sometimes by more than 100) if the batting order position changes or the new team has much stronger (or weaker) offence. As a hypothetical example, if Jack Cust (598 PAs) were dropped into the middle of the Texas offence, he’d probably pick up another 40 PAs based on the Rangers’ higher ob% alone, and if he batted 3rd exclusively then another 40-50 PAs on top of that. Also, if he replaces a player with a lower ob%, then he’ll increase the PAs for his teammates as well.
Another Fernandez-like example is Alfonso Soriano—614 PAs in 2001 batting mostly 8th and 9th, then 741 in 2002 batting leadoff (almost identical gp and defensive innings each year). If you accurately forecast his .300 average and assumed he’d pick up 574 ABs again, your forecast error would be more than 35 hits, even though you nailed the batting average!
I’ve added a user guide to explain the stats that may not be self-explanatory.
http://www.baseballprojection.com/userguide.html
And put Teixeira on the Yankees. At this point Bradley is the only free agent position player who’s really good.
Rally, you missed a couple transactions for the Pirates.
Jose Bautista traded to the Blue Jays for Robinzon Diaz
Michel Hernandez sold to Tampa Bay
Thanks. I’m sure I missed a lot more than that.
He also has defensive projections on that comcast site that isn’t working anymore. He might want to provide a link to that somewhere on the site as well...Unless I just don’t see it.