THE BOOK--Playing The Percentages In Baseball

Thursday, November 09, 2006

Baseball Prospectus readers know what they are doing

By Tangotiger, 04:23 PM

Chalk up yet another win to the Wisdom of the Crowds, this time courtesy of Baseball Prospectus.

If we…


...count a “win” as any pick that was within 3 wins of actual and a loss as any pick that was at least 8 wins away from actual, BP readers are 10-10, PECOTA is 13-12, Vegas is 10-9, ESPN is 13-13, and DMB only 9-12.
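The scoring rule in that excerpt can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not BP's actual method: the name `score_picks` is made up for illustration, "within 3 wins" is assumed to include a miss of exactly 3, and the sample rows are a handful of Predictatron picks copied from the team table in the comments below, not all 30 teams.

```python
def score_picks(actual, predicted, win_within=3.0, loss_beyond=8.0):
    """Return (wins, losses) for a set of preseason picks.

    A pick is a "win" if it misses the actual total by at most win_within
    (assumed inclusive) and a "loss" if it misses by at least loss_beyond.
    Anything in between counts as neither.
    """
    wins = losses = 0
    for team, actual_wins in actual.items():
        miss = abs(predicted[team] - actual_wins)
        if miss <= win_within:
            wins += 1
        elif miss >= loss_beyond:
            losses += 1
    return wins, losses


# Illustrative subset only: actual wins and Predictatron picks for seven teams.
ACTUAL = {"ARI": 76, "ATL": 79, "BAL": 70, "CLE": 78, "DET": 95, "HOU": 82, "KCA": 62}
PREDICTATRON = {"ARI": 77.0, "ATL": 88.6, "BAL": 74.7, "CLE": 89.8,
                "DET": 78.3, "HOU": 81.4, "KCA": 60.6}

print(score_picks(ACTUAL, PREDICTATRON))
```

On this subset, ARI, HOU and KCA count as wins, ATL, CLE and DET as losses, and BAL (off by 4.7) as neither.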

The actual standard deviation of wins was 10.1, and all forecasts were (correctly) below that.  PECOTA at 7.4, DMB at 8.1, BP readers at 8.5, Vegas at 8.7, and ESPN at 9.5.  When you guess, you are guessing on talent, and you are not trying to guess the noise as well (unless of course you are competing with a group, where a wild guess actually helps). 

In this case, 10.1/162 = .062.  ObservedSD^2 = TrueSD^2 + LuckSD^2.  LuckSD = .039, making our TrueSD = .048 wins per game, or 7.8 wins per 162 games.  PECOTA and DMB were the only ones that were really close to this, though BP readers weren’t that bad.
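For anyone who wants to redo that arithmetic, here is a small sketch in Python. It assumes LuckSD is the binomial standard deviation of winning percentage for a .500 team over 162 games, sqrt(.5 × .5 / 162); the post does not spell that out, so treat it as an assumption. The function name `split_talent_sd` is made up for illustration.

```python
import math


def split_talent_sd(observed_sd_wins, games=162, p=0.5):
    """Split the observed spread in wins into luck and true talent.

    Uses ObservedSD^2 = TrueSD^2 + LuckSD^2, with LuckSD assumed to be the
    binomial SD of winning percentage, sqrt(p * (1 - p) / games).
    Returns (observed_sd, luck_sd, true_sd) per game and true_sd in wins.
    """
    observed_sd = observed_sd_wins / games
    luck_sd = math.sqrt(p * (1 - p) / games)
    true_sd = math.sqrt(observed_sd ** 2 - luck_sd ** 2)
    return observed_sd, luck_sd, true_sd, true_sd * games


observed_sd, luck_sd, true_sd, true_sd_wins = split_talent_sd(10.1)
print(round(observed_sd, 3), round(luck_sd, 3), round(true_sd, 3), round(true_sd_wins, 1))
# -> 0.062 0.039 0.048 7.8
```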

(Weirdly, the Vegas odds predicted 81.4 wins per team.  Seems to my uneducated gambling mind that if you picked the under for every single team, you’d have won more than you lost.  In fact, 16 teams won less, and 14 won more than the Vegas line.)

***

I also recommend this study by me a few years ago:
http://www.tangotiger.net/forecastFinal2.html

As well as this one by David Cameron and Greg Spira:
http://www.baseballprospectus.com/article.php?articleid=229

#1    David Cameron      2006/11/09 (Thu) @ 17:34

Huh - that’s weird.  I had nothing to do with that piece that Greg Spira wrote.  At least, if I did, I have no memory of it whatsoever, and the only time my name appears anywhere is in the byline. 

It’s a good article, though, so maybe I should take credit for Greg’s work and just keep my mouth shut.


#2    MGL      2006/11/10 (Fri) @ 01:47

The standard deviation by luck should really be considerably more than the .039 (which I assume is that expected by the binomial variance, where p =.5).

One, each team’s true p is something other than .5, which raises the SD.  Two, each team competes against different teams, pitchers, etc. (IOW, their p varies from game to game), which also raises the SD.  Finally, and this is the most significant, injuries, personnel changes, etc., significantly raise the SD. So I think the true SD BEFORE the fact should be considerably less than 7.8.  After the fact, only the first two factors above should apply, so that the SD of true talent should be a little more than 7.8.  Since the projections are before the fact (not knowing the injuries, personnel changes, etc.), they should have a SD of quite a bit less than 7.8.

It is no surprise that ESPN and Vegas have the highest (and most incorrect) SD. The average fan (and even some sophisticated ones) is going to think that you SHOULD predict teams to have the high and low win totals that you typically see among the teams with the best and worst records (e.g., 100 wins, 62 wins), when in fact you should rarely project anything close to that of course.

Interestingly, my own projections, which are based strictly on the individual projections of all the players on the 25-man roster, projected starting lineups, pitching rotations, and bullpen roles, and “generic” playing times for starters, backups, and pitchers (IOW, all players in all starting lineups had the same amount of PA, all 1-3 starters and 4-5 starters had the same IP, etc.), had a SD of only 6.2 wins.  I think this is closer to what it should be.

BTW, don’t put too much stock in or read into the quotes for the Vegas lines. First of all, there are many biases in Vegas lines.  For example, if you bet all dogs in NFL and baseball, you would have almost no juice (come close to breaking even) as the public “likes” the favorites in sports and the line is biased to reflect that.  Second, even at 81.4, you probably lose money betting on all the overs.  The juice on these pre-season over/unders is typically 20 or 30 cents and sometimes even more, depending on the sports book.  That means you have to lay 11 or 11.5 to win 10.  Finally, at any given time, a sports book can have an over/under a lot different from another one.  I have no idea where BP gets the numbers from and at what time.  Probably when they “open” the average win total is 82 (maybe not).  But after the public starts betting them and they move, they can be all over the place.
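To put the juice in concrete terms: if you lay 11 (or 11.5) to win 10, you need to win more than half your bets just to break even. A quick sketch in Python, assuming equal-sized bets that either win or lose outright with no pushes; the function name `break_even_probability` is made up for illustration.

```python
def break_even_probability(lay, to_win=10.0):
    """Win rate needed to break even when risking `lay` to win `to_win`.

    Break-even means p * to_win == (1 - p) * lay, so p = lay / (lay + to_win).
    Assumes every bet is the same size and there are no pushes.
    """
    return lay / (lay + to_win)


for lay in (11.0, 11.5):
    print(lay, round(break_even_probability(lay), 4))
# -> 11.0 0.5238
# -> 11.5 0.5349
```

So winning 16 of 30 bets (53.3%) would clear the bar at 11-to-10 but fall just short at 11.5-to-10.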

Now, if you did nothing else but bet under on all their highest totals and over on all their lowest ones, you would almost certainly win in the long run, as their spread of win totals is typically too high, as we see above (8.7), reflecting what the public thinks (that good teams “should” win 97 and 98 games and bad teams “should” win 62 and 63 games).


#3    MGL      2006/11/10 (Fri) @ 02:32

BTW, here is my assessment of “why” the various forecasters, especially “Predictatron” and Pecota, were “right” or “wrong.” This is off the top of my head, but I think it is pretty accurate and certainly interesting.

Team Actual Predictatron PECOTA Vegas ESPN DMB

ANA 89 86.6 81 88.5 90 79 The Predictatron guys (PG) clearly go more by past record (and then adjust for personnel changes) and public perception than Pecota (Pec).  For whatever reasons or no reason at all, ANA has “outperformed” a good projection the last few years. ESPN nails this one because all they care about is past record and “perception.” They couldn’t analyze their way out of a paper bag (well maybe Neyer could).

ARI 76 77.0 77 74.0 74 76
ATL 79 88.6 85 89.0 91 85 Here the PG simply made a mistake in giving ATL more credit than they deserved (given their actual talent) presumably because they win the division every year.  Sort of like expecting red to come up in Roulette just because it has come up red 10 times in a row.  The Pec guys were definitely more accurate with this one for good reason. Here you see that ESPN makes a ridiculous prediction for ridiculous yet obvious reasons.
BAL 70 74.7 77 76.5 72 74 I’m guessing they got a little unlucky, despite ESPN’s good prediction.
BOS 86 91.9 93 90.5 91 86 While I think Pec had their projection too high for some reason, probably injuries and maybe a little bad luck deflated their actual record.
CHA 90 88.0 82 91.5 92 86 Again, ESPN’s prediction was ridiculous and obviously based on last year’s w/l and the fact that they had essentially the same team, or at least one that was not worse.  The PG guys also had a ridiculous prediction for essentially the same reasons.  As you can see from Pec and DM, they did not have tremendous talent and way overperformed for some reason.  I’ll take the under blind next year for any amount of money.
CHN 66 81.9 85 85.0 81 85 Obviously pitching injuries completely derailed them this year and maybe some bad luck.  Maybe ESPN and PG figured that there would be more injury and poor performance from Wood and Prior than did Pec and DM (and me).
CIN 80 73.3 78 74.0 70 77 The PG and ESPN simply did not analyze this team properly.  They were a pretty good team that should have won around 80 games.
CLE 78 89.8 88 90.0 90 88 This team simply got really unlucky this year.  If the same team played a million games, I’ll wager everything on over 85 games.
COL 76 69.2 74 68.5 70 67 Mistake by almost everyone, but especially PG and ESPN.  They failed to consider the effect of the humidor on the Coors Hangover effect (essentially eliminating it) as well as the effect of the humidor on the pitching staff in general.  Also, when doing projections for the players, you have to adjust their road stats for the hangover effect, which most analysts did not do.  I thought they had a pretty decent team this year and had them projected at 74 wins.
DET 95 78.3 83 77.0 77 79 As you can see, if you do a proper analysis on them, you will see that they are better than PG, ESPN, etc. thought from their “rudimentary analyses.” I had them at 83 wins also.  As to why they won 95 games - a lot of luck and unexpected performances, which could be true talent, from Verlander and Zumaya and even Bonderman.
FLO 78 67.2 71 67.0 61 69 Good luck and a lot of great performances by rookies and young players that may or may not have been true talent but could not have been anticipated by anyone.
HOU 82 81.4 81 83.5 84 78
KCA 62 60.6 61 64.0 61 62
LAN 88 85.6 87 85.5 87 86
MIL 75 84.1 84 80.5 83 79
MIN 96 84.3 84 80.0 84 90 It seems like they have outperformed expectations, both by the public and the analysts for the last few years.  I don’t know why.  This year probably luck and unexpected great performance by Liriano and even above-expected performance by Santana.  Then again, Radke and Silva sucked, so I don’t know where that record came from (Mauer and Morneau didn’t hurt it of course).
NYA 97 93.2 94 100.0 94 93 Projecting 100 wins for just about any team is ridiculous of course.  Chance of injury alone should make it less than 100.  The Vegas line reflected extreme public bias towards the Yankees, which is THE most bet on team in baseball the last 5 years or more.
NYN 97 88.7 88 91.0 90 87 The “public oriented” forecasters had the Mets win total higher than the “analysts” because of all the hype about the Mets.  I think the analysts were “right” and I am not sure why the Mets won so many games.  Probably fluke career years for a lot of the offensive players and probably a lot of runs scored given the raw stats of the offensive players.

TBC....


#4    MGL      2006/11/10 (Fri) @ 02:33

cont. from previous post…

OAK 93 93.3 93 88.5 92 96 For some reason everyone knew how good a team they had except Vegas. Yes, they got lucky given their actual personnel during the season, but the picks were based on Harden pitching all season, as well as Crosby and Chavez being healthy.  So everyone (but Vegas) got this one right, in a screwy way.  Or something like that.
PHI 85 86.1 86 81.5 84 86
PIT 67 72.5 79 75.5 76 75 Well, I had PIT winning 77 games.  79 by Pecota seems awfully high.  Somehow the PG had a better handle on the team, as I don’t think they got too unlucky this year.  Maybe a little.  They were probably a 70-72 win team.
SDN 88 81.8 78 77.5 80 77 I had them as a 78 win team also (like the other analysts - DM and Pec).  Somehow the PG and ESPN knew better.  I don’t think that SD got that lucky this year.  In fact, I kind of liked them in the post-season so somehow my projection for them got a lot better at the end of the season.
SEA 78 74.5 77 75.5 71 80 I had them at 76 wins.  Analytically, they were not such a bad team.  I’m not sure why the “public” (ESPN and PG) had them rated so low, but they were wrong before the season started and after the smoke cleared.
SFN 76 81.4 80 81.5 84 86 I had them at 86 wins also (like DM).  I think everyone’s projection hinged on Bonds.  I think I assumed like 130 or 140 games and I definitely overrated his performance as I used a regular Marcel-style projection which even though he had a great season would have tremendously overrated his impact by probably 5 wins.
SLN 83 92.3 86 94.0 93 95 The projections by everyone but Pec were ridiculous.  Obviously they were influenced by last year, when they had a much better team and got really lucky to post the win total they did.  They were NOT a 90+ win team going into the season.  I don’t know where DM got their number from (95) but that was absurd, even given their talent at the start of the season.  I had them at 8 wins, which was probably about right.  Their actual 83 wins reflected a little bit of bad luck, injuries, and personnel changes.  The “consensus” that they were a “bad” team that won the WS is ridiculous.  The team that played in the post-season was probably a 90-92 win team.
TBA 61 71.0 69 67.5 70 70 Hey, I had them at 72 wins.  I think everyone but Vegas was optimistic about some young players.  The Vegas line which turned out to be “right” merely reflected their history and what the casual fan thinks of TB.
TEX 80 79.1 80 80.5 80 81
TOR 87 82.9 79 86.0 85 83 All the hype jacked up everyone’s totals but Pec and DM.  I’m still not sure why Pec was so low.  I had them at 81 wins.
WAS 71 70.7 70 77.0 73 75 Hey, I had them at 77 wins also!  I think they got unlucky this year and were actually closer to a 77 win team.  Of course, I did not consider the impact of who I consider the worst manager in baseball.  Maybe he cost them 3-4 wins.  The 77 win total from Vegas seems odd, although I agreed with it.


#5    tangotiger      2006/11/16 (Thu) @ 18:23

mgl, prorating Marcel’s forecast to Bonds’ actual PA, Marcel had Bonds at 86R, 72RBI, 31HR (total of 127 runs produced).  His actual was 125 RP.

Pro-rating to his PA minus IBB: he actually had 49 XBH, compared to Marcel’s 55.  His actual NIBB+singles was 127, and Marcel said 130. 

I don’t think Bonds’ actual was too much worse than his Marcel forecast.  Add in the fact that Bonds was clutch (+0.8 wins to his team, above his random performance line), and Bonds impacted as forecasted. 

He also played far more games than expected.  His 2002-2004 GP average was 140, which means we’d forecast something like 130 for him in 2005.  He was hurt in 2005, which means that in 2006, injured and older, we’d have to expect much lower than 130.  120 would have been the max reasonable estimate for his GP.


#6    tangotiger      2006/11/28 (Tue) @ 18:53

Thanks to Dan Fox for the reminder, Tippett once again publishes the annual results:
http://www.diamond-mind.com/articles/tmpred06.htm

Interesting how Nate Silver did not look at his own system, and paid the price, once again showing not to listen to any single person, and trust the crowd and/or system.

A neat look at how the Vegas line, the “ABP” crowd, and Tippett simulator all do a great job.


#7    Tangotiger      2007/12/12 (Wed) @ 14:58

The Wisdom of the BP reader crowd does it again in 2007:

http://baseballprospectus.com/article.php?articleid=6987

