THE BOOK--Playing The Percentages In Baseball


Tuesday, March 31, 2009

Diamond Mind baseball + leading forecasting systems = standings

By Tangotiger, 10:50 AM

Here’s the AL and the NL.


#1    Xeifrank      2009/03/31 (Tue) @ 11:13

Those standings look pretty good.
vr, Xei


#2    weskelton      2009/03/31 (Tue) @ 13:27

Unless you’re a White Sox fan, whose team is consistently being projected to go from first to worst.


#3    MGL      2009/03/31 (Tue) @ 15:48

Anyone know how DM handles defense? Any method which tries to forecast team standings without incorporating player defensive projections is going to be sorely lacking in accuracy.  Although pitching forecasts basically factor in defense, or at least they should…


#4    Tangotiger      2009/03/31 (Tue) @ 15:51

Seeing that Diamond-Mind is a creation of Tippett, and Tippett had a fielding system to match UZR, I’d say it was likely quite good.

However, since Tippett sold, I don’t know what DMB’s fielding system is.  I’d think he sold that along with the whole company…


#5    JB H      2009/03/31 (Tue) @ 15:55

James Holzhauer has written about the problem with these sims:

http://playoffodds.blogspot.com/2009/03/2009-season-simulated-intro.html


#6    Fargo      2009/03/31 (Tue) @ 16:38

As I have commented on before, some systems come up with updated versions as the depth charts change during the preseason.

The DM projection for BP (PECOTA) is obviously based on an early version of the numbers and perhaps also not even using a BP-based depth chart but instead only the early individual player projections as published in the annual or the first weighted means spreadsheet.


#7    Tangotiger      2009/03/31 (Tue) @ 16:41

It looks like SG said he is using the SAME depth charts for all the forecasters, and he uses mostly the BPro depth charts.


#8    SG      2009/03/31 (Tue) @ 16:43

Anyone know how DM handles defense? Any method which tries to forecast team standings without incorporating player defensive projections is going to be sorely lacking in accuracy.  Although pitching forecasts basically factor in defense, or at least they should…

It depends on the projection system.  If the pitching projections already incorporate defense (CAIRO, CHONE, Hardball Times, PECOTA), I set all the defenders to average.  If they don’t (Marcel), I used a combination of standard ZR/bUZR and Diamond Mind’s fielding ratings for whatever players had a rating on their 2008 projection disk.  Unfortunately, since Tippett sold, they did not do a 2009 projection disk.  They have five categories for defensive range, then an error rating.

1 - Excellent (around +15 per full season)
2 - Very good (around +7)
3 - Average (0)
4 - Fair (around -7)
5 - Poor (around -15)

Error ratings are based off 100 being average.  Lower than 100 means the player makes fewer errors per chance than the average player, higher than 100 means they make more.
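The rating scheme described above can be sketched as a simple lookup. This is only an illustration: the function names are made up, the range values are assumed to be runs per full season, and scaling the error rate linearly with the rating is an assumption rather than Diamond Mind’s documented behavior.

```python
# Approximate value of each Diamond Mind range rating, per full season,
# as quoted in the comment above (assumed here to be runs).
RANGE_RUNS = {1: 15, 2: 7, 3: 0, 4: -7, 5: -15}


def range_runs(rating: int, season_fraction: float = 1.0) -> float:
    """Rough defensive value for a range rating, prorated by playing time."""
    return RANGE_RUNS[rating] * season_fraction


def error_rate(rating: int, league_rate: float) -> float:
    """Errors per chance for an error rating where 100 is league average.

    Linear scaling is an assumption made for illustration only.
    """
    return league_rate * rating / 100


print(range_runs(1))           # full season at "Excellent"
print(range_runs(4, 0.5))      # half a season at "Fair"
print(error_rate(120, 0.02))   # hypothetical league rate of 2% per chance
```

A rating below 100 gives an error rate below the league rate, matching the description that lower is better.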

ZiPS does incorporate defense into its projections, although Dan is conservative with it, so I think he also includes a fielder rating based on ZR.


#9    SG      2009/03/31 (Tue) @ 16:47

The DM projection for BP (PECOTA) is obviously based on an early version of the numbers and perhaps also not even using a BP-based depth chart but instead only the early individual player projections as published in the annual or the first weighted means spreadsheet.

I am using a combination of BPro’s depth charts and MLB.com with my own assumptions.  Like Tango said, the depth charts should be roughly the same for each system.

My simulations don’t match PECOTA’s (and other systems’) for a few reasons:
1) I have to reverse engineer the run environment since I don’t have access to theirs
2) There are likely differences in park factors that will skew some of the runs for/runs against for various teams
3) Diamond Mind uses a lot of stats that are not provided by most of the projections (GDP, GB%, 2B/3B allowed by pitchers, WP, BK).  So I have to use a rough estimate for those based on the projections that do provide that data.


#10    Tangotiger      2009/03/31 (Tue) @ 16:48

JB H: right on.

Holzhauer is correct that what the sims are using is a fixed true talent level, and a fixed depth chart. 

In reality, we don’t have such perfect god-given information. 

This is a limitation, but I don’t think it’s that bad.  Taking a guess, I’d say this bumps the SD from the 6.4 that we’d expect with perfect information sqrt(162)*.5, to, what 1 SD = 8?
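As a rough check on that arithmetic, here is a minimal sketch. It assumes that game-to-game luck and uncertainty about true talent are independent, so their variances add; the function names are hypothetical.

```python
import math

GAMES = 162


def binomial_sd(games: int = GAMES, p: float = 0.5) -> float:
    """SD of season wins when the per-game win probability is known exactly."""
    return math.sqrt(games * p * (1 - p))


def implied_talent_sd(total_sd: float, games: int = GAMES) -> float:
    """SD (in wins) left over for talent uncertainty.

    Assumes independence, so total_var = luck_var + talent_var.
    """
    luck_var = binomial_sd(games) ** 2
    return math.sqrt(total_sd ** 2 - luck_var)


luck = binomial_sd()             # sqrt(162) * .5
extra = implied_talent_sd(8.0)   # what a total SD of 8 would imply
print(round(luck, 2), round(extra, 2))
```

Under these assumptions, a total SD of 8 wins would correspond to a bit under 5 wins of uncertainty in true talent on top of the 6.4 from luck alone.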


#11    SG      2009/03/31 (Tue) @ 16:59

Holzhauer is correct that what the sims are using is a fixed true talent level, and a fixed depth chart. 

Something I wanted to do was figure out a way to use a random # generator to modify player stats (0 to 1 times the standard deviation of their component stats) as well as to manipulate the playing time.  Unfortunately, DMB’s interface isn’t really conducive to it.

This is a limitation, but I don’t think it’s that bad.  Taking a guess, I’d say this bumps the SD from the 6.4 that we’d expect with perfect information sqrt(162)*.5, to, what 1 SD = 8?

Standard deviations are a little bit higher than 6.4 (6.7 when looking at a combination of all the projections) because I kept the random injury function on and because of the differences in the projections, but yeah, it’s a limitation.
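To see how much an uncertain talent level widens the spread, here is a toy Monte Carlo. It is not DMB and not the player-level perturbation described above: it jitters a single team-level win probability around .500 each season. The function name and the 4.85-win talent SD are assumptions; 4.85 is roughly what a total SD of 8 would imply if variances add (8^2 = 6.36^2 + x^2).

```python
import random
import statistics


def simulate_wins(n_seasons, talent_sd_wins=0.0, games=162, seed=0):
    """Simulate season win totals for a nominally .500 team.

    talent_sd_wins is the SD of the team's true talent, expressed in wins
    per season; 0.0 means the talent level is fixed and known.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_seasons):
        # Draw this season's true per-game win probability.
        p = 0.5 + rng.gauss(0.0, talent_sd_wins) / games
        p = min(max(p, 0.0), 1.0)
        # Play out the season one game at a time.
        wins = sum(rng.random() < p for _ in range(games))
        totals.append(wins)
    return totals


fixed = statistics.pstdev(simulate_wins(5000))
noisy = statistics.pstdev(simulate_wins(5000, talent_sd_wins=4.85))
print(round(fixed, 2), round(noisy, 2))
```

With fixed talent the spread should land near the binomial 6.4; with the jitter it should land near 8. Injuries, trades and playing-time changes are not modeled here.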


#12    Gary Geiger Counter      2009/03/31 (Tue) @ 17:31

I think that I once mentioned to SG in ATL that these don’t factor in midseason trades, but he said that that doesn’t make that much of a difference.  OOTP lets you mimic a GM, correct?  Is there a way to autoplay that?


#13    JB H      2009/03/31 (Tue) @ 17:31

This is a limitation, but I don’t think it’s that bad.

Click on the homepage of his blog.  Increasing the SD roughly doubles the playoff chances for the lower third of the league.
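A back-of-the-envelope way to see why the SD matters so much for weak teams is a normal approximation to the tail. The 78-win projection and the 90-win playoff threshold below are hypothetical, and a fixed threshold is a simplification. The 6.7 SD is the figure SG reports for the sims; 8.75 is the midpoint of the 8.5 to 9 range Holzhauer argues for elsewhere in this thread.

```python
import math


def prob_at_least(mean_wins: float, sd_wins: float, threshold: float) -> float:
    """P(wins >= threshold) under a normal approximation."""
    z = (threshold - mean_wins) / sd_wins
    return 0.5 * math.erfc(z / math.sqrt(2))


narrow = prob_at_least(78, 6.7, 90)    # SD from the sims
wide = prob_at_least(78, 8.75, 90)     # larger SD
print(round(narrow, 3), round(wide, 3), round(wide / narrow, 2))
```

In this toy setup the wider SD a little more than doubles the chance of reaching the threshold, and the ratio grows the further the projection sits below it.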


#14    SirKodiak      2009/03/31 (Tue) @ 22:18

I looked at the 2008 projection compared to the actual 2008 results and found:

14 teams RF 1 SD below projection
1 team RF = 1 SD below projection
13 teams RF within 1 SD of projection
2 teams RF 1 SD above projection

13 teams RF 1 SD below projection
14 teams RF within 1 SD of projection
3 teams RF 1 SD above projection


#15    MGL      2009/03/31 (Tue) @ 22:37

RF=??

2008 was a bad year for any good forecast system.  SD was off the charts.  CLE was almost off the charts.  ANA once again greatly exceeded almost anyone’s forecast.  The Yankees were pretty bad.

While doing something in the model, like having random injuries, trades, personnel moves, etc, or even using randomly fluctuating player talent may change the post-season chances quite a bit, it is not going to change the projected w/l totals much.  A lot of that also depends on the playing time forecasts. If you use overly liberal ones, you are going to at least overvalue the good teams (which probably ends up undervaluing the other teams).


#16    MGL      2009/03/31 (Tue) @ 22:50

Good stuff by James H on the web site above.  I wonder though if you should use a symmetrical standard error for all teams, as James is doing.  If a team is already a very good team, wouldn’t a really bad spate of injuries hurt it more than a team that isn’t that good?  To some extent the baseline playing time forecasts already take that into consideration, but I’m not sure that enough of that is accounted for…


#17    James Holzhauer      2009/04/01 (Wed) @ 04:13

Tango/10: I’ve yet to see a projection system that I believe can produce a standard deviation of less than 8.5 games when projecting the standings.  I believe the correct number is somewhere between 8.5 and 9.

mgl/16: I share your concerns in this regard.  Of course, this effect is not exclusive to the very good teams: if Albert Pujols suffers a serious injury, that’s going to cripple the Cardinals.

A “very bad” spate of injuries is unlikely to strike any given team.  We shouldn’t ignore this possibility, but it’s built into the model.

I would say a bigger weakness with a symmetrical standard error is that it doesn’t account for inherently high-variance players.  Who really knows what we’ll see out of Alex Rodriguez this year, or Rich Harden, or Chris Carpenter?  If one team had several of these players, that would be good cause for re-evaluating the standard deviation for that team.


#18    SirKodiak      2009/04/01 (Wed) @ 04:46

RF (runs for) is what the author of the original article uses instead of RS (runs scored).  I just used it to be consistent with the author.  I’ll switch to RS as I should have to start with.

The simulations had 4.8% more runs scored than actually occurred. 

For outliers:
(predicted RS - actual RS) / StDev : 
SD(2.1), WAS(2.8), OAK(2.6), NYY(2.8), MIN(-2.4), TEX(-2.0)

(predicted RA - actual RA) / StDev :
TOR(2.3), TB(2.1), DET(-2.4), CWS(2.6), PHI(3.0), FLA(2.3), MIL(2.0), LAD(2.2)
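The outlier figures above are z-scores: the projection miss divided by the simulation’s standard deviation. A minimal sketch of the calculation follows. The team labels and run totals are placeholders, not the 2008 data, and the function names are hypothetical.

```python
def z_score(predicted: float, actual: float, sd: float) -> float:
    """(predicted - actual) / StDev, as in the lists above."""
    return (predicted - actual) / sd


def outliers(teams: dict, cutoff: float = 2.0) -> dict:
    """Return teams whose miss is at least `cutoff` SDs in either direction.

    `teams` maps a label to a (predicted, actual, sd) tuple.
    """
    return {
        name: round(z_score(*vals), 1)
        for name, vals in teams.items()
        if abs(z_score(*vals)) >= cutoff
    }


# Placeholder numbers for illustration only.
sample = {
    "AAA": (800, 740, 25),
    "BBB": (750, 760, 25),
    "CCC": (700, 755, 25),
}
flagged = outliers(sample)
print(flagged)
```

A positive value means the team fell short of the projection; a negative value means it beat the projection.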


#19          2010/10/08 (Fri) @ 18:55

Diamond Mind appears to be dead.

Approaching the whole problem of handicapping MLB from a “better data” but not requiring a statistical programmer perspective…

Who can you recommend that’s combined your WPA & LI event based data into a replay simulator which allows for customizable teams/lineups?

