Unless you’re a White Sox fan, whose team is consistently being projected to perform the first-to-worst feat.
Anyone know how DM handles defense? Any method which tries to forecast team standings without incorporating player defensive projections is going to be sorely lacking in accuracy. Although pitching forecasts basically factor in defense, or at least they should…
Seeing that Diamond Mind is a creation of Tippett, and Tippett had a fielding system to match UZR, I’d say it was likely quite good.
However, since Tippett sold, I don’t know what DMB’s fielding system is. I’d think he sold that along with the whole company…
James Holzhauer has written about the problem with these sims:
http://playoffodds.blogspot.com/2009/03/2009-season-simulated-intro.html
As I have commented on before, some systems come up with updated versions as the depth charts change during the preseason.
The DM projection for BP (PECOTA) is obviously based on an early version of the numbers and perhaps also not even using a BP-based depth chart but instead only the early individual player projections as published in the annual or the first weighted means spreadsheet.
It looks like SG said he is using the SAME depth charts for all the forecasters, and he uses mostly the BPro depth charts.
Anyone know how DM handles defense? Any method which tries to forecast team standings without incorporating player defensive projections is going to be sorely lacking in accuracy. Although pitching forecasts basically factor in defense, or at least they should…
It depends on the projection system. If the pitching projections already incorporate defense (CAIRO, CHONE, Hardball Times, PECOTA), I set all the defenders to average. If they don’t (Marcel), I used a combination of standard ZR/bUZR and Diamond Mind’s fielding ratings for whatever players had ratings on their 2008 projection disk. Unfortunately, since Tippett sold, they did not do a 2009 projection disk. They have five categories for defensive range, then an error rating.
1 - Excellent (around +15 per full season)
2 - Very good (around +7)
3 - Average (0)
4 - Fair (around -7)
5 - Poor (around -15)
Error ratings are based off 100 being average. Lower than 100 means the player makes fewer errors per chance than the average player, higher than 100 means they make more.
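To make the rating scale concrete, here is a minimal sketch of how those categories could be turned into numbers. It is not Diamond Mind's actual method: the function names are made up, treating the range values as runs is an assumption (the list above gives no unit), and scaling errors linearly with the rating is also an assumption.

```python
# Approximate value per full season for each range rating, from the list above.
# Treating these as runs saved is an assumption; the source gives no unit.
RANGE_VALUE = {1: 15.0, 2: 7.0, 3: 0.0, 4: -7.0, 5: -15.0}


def range_value(rating, share_of_season=1.0):
    """Prorate a range rating by the fraction of a full season played."""
    return RANGE_VALUE[rating] * share_of_season


def expected_errors(error_rating, chances, league_error_rate):
    """Scale the league error rate by the rating, where 100 is average.

    Linear scaling is an assumption for illustration only.
    """
    return chances * league_error_rate * error_rating / 100.0


# A "very good" fielder playing half a season, and a sure-handed one (rating 80)
# given 400 chances in a league where 2% of chances become errors.
half_season_value = range_value(2, 0.5)
errors = expected_errors(80, 400, 0.02)
```
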
ZiPS does incorporate defense into its projections, although Dan is conservative with it, so I think he also includes a fielder rating based on ZR.
The DM projection for BP (PECOTA) is obviously based on an early version of the numbers and perhaps also not even using a BP-based depth chart but instead only the early individual player projections as published in the annual or the first weighted means spreadsheet.
I am using a combination of BPro’s depth charts and MLB.com with my own assumptions. Like Tango said, the depth charts should be roughly the same for each system.
My simulations don’t match PECOTA’s (and other systems’) for a few reasons:
1) I have to reverse engineer the run environment since I don’t have access to theirs
2) There are likely differences in park factors that will skew some of the runs for/runs against for various teams
3) Diamond Mind uses a lot of stats that are not provided by most of the projections (GDP, GB%, 2B/3B allowed by pitchers, WP, BK). So I have to use a rough estimate for those based on the projections that do provide that data.
JB H: right on.
Holzhauer is correct that what the sims are using is a fixed true talent level, and a fixed depth chart.
In reality, we don’t have such perfect god-given information.
This is a limitation, but I don’t think it’s that bad. Taking a guess, I’d say this bumps the SD from the 6.4 that we’d expect with perfect information (sqrt(162)*.5) to, what, 1 SD = 8?
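For anyone who wants to check the arithmetic, a small sketch. The 6.4 is the binomial SD of wins for a .500 team over 162 games. The last step assumes the luck and the uncertainty about talent and playing time are independent, so that their variances add. That independence is an assumption, not something established in this thread.

```python
import math

GAMES = 162

# Binomial SD of wins for a true .500 team: sqrt(n * p * (1 - p)) = sqrt(n) * 0.5
luck_sd = math.sqrt(GAMES) * 0.5

# The guessed total SD once talent and depth charts are not known perfectly.
guessed_total_sd = 8.0

# If the two sources are independent, variances add, so the implied extra SD is:
extra_sd = math.sqrt(guessed_total_sd ** 2 - luck_sd ** 2)

print(f"luck SD: {luck_sd:.2f} wins, implied extra SD: {extra_sd:.2f} wins")
```
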
Holzhauer is correct that what the sims are using is a fixed true talent level, and a fixed depth chart.
Something I wanted to do was figure out a way to use a random # generator to modify player stats (0 to 1 times the standard deviation of their component stats) as well as to manipulate the playing time. Unfortunately, DMB’s interface isn’t really conducive to it.
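Outside of DMB, the idea might look something like the sketch below. Everything in it is hypothetical: the stat names, the rates, and the standard deviations are placeholders, and choosing the direction of the adjustment with a coin flip is my assumption about how the "0 to 1 times the standard deviation" draw would be applied.

```python
import random


def perturb_player(rates, sds, rng):
    """Shift each component rate by 0 to 1 SDs in a random direction."""
    sampled = {}
    for stat, rate in rates.items():
        scale = rng.uniform(0.0, 1.0)      # 0 to 1 times the SD
        sign = rng.choice((-1.0, 1.0))     # direction: an assumption
        sampled[stat] = max(0.0, rate + sign * scale * sds[stat])
    return sampled


rng = random.Random(2009)  # fixed seed so a run can be repeated

# Hypothetical per-plate-appearance rates and their standard deviations.
hitter = {"HR": 0.045, "BB": 0.100, "SO": 0.180}
hitter_sd = {"HR": 0.010, "BB": 0.015, "SO": 0.020}

sampled = perturb_player(hitter, hitter_sd, rng)
```

Playing time could be drawn the same way before each simulated season, so that every run starts from a slightly different talent level and depth chart.
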
This is a limitation, but I don’t think it’s that bad. Taking a guess, I’d say this bumps the SD from the 6.4 that we’d expect with perfect information (sqrt(162)*.5) to, what, 1 SD = 8?
Standard deviations are a little bit higher than 6.4 (6.7 when looking at a combination of all the projections) because I kept the random injury function on and because of the differences in the projections, but yeah, it’s a limitation.
I think that I once mentioned to SG in ATL that these don’t factor in midseason trades, but he said that that doesn’t make that much of a difference. OOTP lets you mimic a GM, correct? Is there a way to autoplay that?
This is a limitation, but I don’t think it’s that bad.
Click on the homepage of his blog. Increasing the SD roughly doubles the playoff chances for the lower third of the league.
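A rough way to see why the SD matters so much for weak teams is a normal approximation to the win total. This is only a sketch: the 75-win projection and the 90-win playoff cut are made-up numbers, and the normal curve is an assumption rather than what any of the sims actually do.

```python
import math


def prob_at_least(target_wins, projected_wins, sd):
    """P(wins >= target) under a normal approximation."""
    z = (target_wins - projected_wins) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical team projected for 75 wins that needs 90 to make the playoffs.
narrow = prob_at_least(90, 75, 6.4)   # SD with fixed talent and depth chart
wide = prob_at_least(90, 75, 8.5)     # a larger SD for real-world uncertainty

print(f"SD 6.4: {narrow:.3f}  SD 8.5: {wide:.3f}  ratio: {wide / narrow:.1f}")
```

How big the jump is depends on how far below the cut the team sits: the further out in the tail, the more a wider SD helps.
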
I looked at the 2008 projection compared to the actual 2008 results and found:
14 teams RF more than 1 SD below projection
1 team RF = 1 SD below projection
13 teams RF within 1 SD of projection
2 teams RF more than 1 SD above projection
13 teams RF more than 1 SD below projection
14 teams RF within 1 SD of projection
3 teams RF more than 1 SD above projection
RF=??
2008 was a bad year for any good forecast system. SD was off the charts. CLE was almost off the charts. ANA once again greatly exceeded almost anyone’s forecast. The Yankees were pretty bad.
While doing something in the model, like having random injuries, trades, personnel moves, etc, or even using randomly fluctuating player talent may change the post-season chances quite a bit, it is not going to change the projected w/l totals much. A lot of that also depends on the playing time forecasts. If you use overly liberal ones, you are going to at least overvalue the good teams (which probably ends up undervaluing the other teams).
Good stuff by James H on the web site above. I wonder though if you should use a symmetrical standard error for all teams, as James is doing. If a team is already a very good team, wouldn’t a really bad spate of injuries hurt it more than a team that isn’t that good? To some extent the baseline playing time forecasts already take that into consideration, but I’m not sure that enough of that is accounted for…
Tango/10: I’ve yet to see a projection system that I believe can produce a standard deviation of less than 8.5 games when projecting the standings. I believe the correct number is somewhere between 8.5 and 9.
mgl/16: I share your concerns in this regard. Of course, this effect is not exclusive to the very good teams: if Albert Pujols suffers a serious injury, that’s going to cripple the Cardinals.
A “very bad” spate of injuries is unlikely to strike any given team. We shouldn’t ignore this possibility, but it’s built into the model.
I would say a bigger weakness with a symmetrical standard error is that it doesn’t account for inherently high-variance players. Who really knows what we’ll see out of Alex Rodriguez this year, or Rich Harden, or Chris Carpenter? If one team had several of these players, that would be good cause for re-evaluating the standard deviation for that team.
RF (runs for) is what the author of the original article uses instead of RS (runs scored). I just used it to be consistent with the author. I’ll switch to RS as I should have to start with.
The simulations had 4.8% more runs scored than actually occurred.
For outliers:
(predicted RS - actual RS) / StDev :
SD(2.1), WAS(2.8), OAK(2.6), NYY(2.8), MIN(-2.4), TEX(-2.0)
(predicted RA - actual RA) / StDev :
TOR(2.3), TB(2.1), DET(-2.4), CWS(2.6), PHI(3.0), FLA(2.3), MIL(2.0), LAD(2.2)
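For reference, the outlier figures above come from the formula (predicted - actual) / StDev. Here is a minimal sketch of that calculation. The team labels and run totals are placeholders and not the 2008 numbers, and the cutoff of 2 SD is an assumption based on the values listed.

```python
def z_score(predicted, actual, sd):
    """(predicted - actual) / StDev, as in the lists above."""
    return (predicted - actual) / sd


def outliers(rows, threshold=2.0):
    """Keep teams whose miss is at least `threshold` SDs in either direction."""
    flagged = {}
    for team, (predicted, actual, sd) in rows.items():
        z = z_score(predicted, actual, sd)
        if abs(z) >= threshold:
            flagged[team] = round(z, 1)
    return flagged


# Placeholder data: team -> (predicted runs, actual runs, SD across sim runs)
rows = {
    "AAA": (780, 700, 30),
    "BBB": (720, 735, 30),
    "CCC": (690, 760, 30),
}

flagged = outliers(rows)
print(flagged)
```

A positive value means the sims predicted more runs than the team actually put up, a negative value means fewer.
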
Diamond Mind appears to be dead.
Approaching the whole problem of handicapping MLB from a “better data” but not requiring a statistical programmer perspective…
Who can you recommend that’s combined your WPA & LI event-based data into a replay simulator that allows for customizable teams/lineups?


Those standings look pretty good.
vr, Xei