THE BOOK--Playing The Percentages In Baseball


Friday, December 15, 2006

What are the chances of the worst team in the league winning the World Series?

By Tangotiger, 05:21 PM

According to this, it seems it’s only 150:1.  I’m guessing it’s more like 4 million to 1.  Here’s what I did:


1. Come up with a true talent distribution.  That number is 1 SD = .060 wins per game.

2. Create a league of 30 teams that satisfies that.  I did.  Here it is:
101 97 95 93 92 90 89 88 87 86 85 84 83 82 81 81 80 79 78 77 76 75 74 73 72 70 69 67 65 61
Best team in the league has a true talent level of 101 wins, and the worst has 61 wins.  Standard deviation is 9.83 wins, or .0607 wins per game.

3. Use the binomial distribution to figure out how often each team will win at least 88 games (and make the playoffs). The true 101 win team will make it 98% of the time.  The true 61 win team will reach that level once every 168,000 years.

4. Given that a team has made the playoffs, what are its chances of winning the World Series?  I just created a simple function, so that it goes from 17% for the 101-win team to 4% for the 61-win team, with the average at 12.5%.

5. Multiply 3 by 4.

6. Express as Odds.  Here they are, for each team:
5 6 7 9 10 13 15 18 22 27 33 42 54 72 96 96 131 183 261 381 567 865 1,349 2,156 3,529 10,189 17,988 60,679 228,224 4,559,310

So, the 101-win team has 5:1 odds of winning the World Series.  The 61-win team has 4.5 million to 1 odds of winning the World Series.
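
Steps 1 through 6 can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet behind the numbers above: the playoff threshold (88 wins) and the talent list come from the post, but the function in step 4 is not given, so `ws_given_playoffs` below assumes a straight line between the two quoted endpoints (4% at 61 wins, 17% at 101 wins). Because of that assumption, the printed odds will be in the same neighborhood as the list in step 6 without matching it exactly.

```python
from math import comb
from statistics import stdev

GAMES = 162
PLAYOFF_WINS = 88  # threshold used in step 3

# True-talent wins for the 30 teams listed in step 2.
TALENT = [101, 97, 95, 93, 92, 90, 89, 88, 87, 86, 85, 84, 83, 82, 81,
          81, 80, 79, 78, 77, 76, 75, 74, 73, 72, 70, 69, 67, 65, 61]

def prob_at_least(wins_needed, true_wins, games=GAMES):
    """Binomial chance that a team with this true talent wins at least
    `wins_needed` games in a season."""
    p = true_wins / games
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins_needed, games + 1))

def ws_given_playoffs(true_wins, lo=(61, 0.04), hi=(101, 0.17)):
    """ASSUMPTION: linear interpolation between the two endpoints quoted
    in step 4. The post does not say what the real function was."""
    (w0, p0), (w1, p1) = lo, hi
    return p0 + (true_wins - w0) * (p1 - p0) / (w1 - w0)

def odds_against(prob):
    """Convert a probability into 'X to 1' odds against."""
    return (1 - prob) / prob

print(f"SD of talent: {stdev(TALENT):.2f} wins")
for w in TALENT:
    p_title = prob_at_least(PLAYOFF_WINS, w) * ws_given_playoffs(w)
    print(f"{w:3d} wins: {odds_against(p_title):,.0f}:1")
```

Swapping in a different step-4 function only requires replacing `ws_given_playoffs`; the binomial part is unaffected.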

***

The reality is that we don’t know which team is the true 101-win team.  Is it the Yanks, Mets, Redsox, Whitesox, Angels, Tigers, Cards?  Who knows.  So, our estimate for the true talent level cannot have a standard deviation of .060 wins.  If we change the estimated true talent levels, so that the four extreme teams are estimated at 81 wins, our new estimated true talent standard deviation is .044 wins.  Repeating this exercise, the odds of the best team (estimated at 95 wins) making the playoffs (90%) and winning the World Series if in the playoffs (14%) are now 7:1.  At the other end, the estimated true 67-win team has 25,000:1 odds of winning the World Series.

***

When I line up my odds (assuming true known distribution) against the Vegas odds, we see the following:
a. the total Vegas odds assumes 1.58 World Series winners… obviously, a healthy vig (if that’s the right term)… markup or commission for the rest of us
b. I’ve got the “true” top 4 teams winning 52% of the time, while Vegas has it as 56%.  The next 7 teams have a 38% chance of winning, while Vegas has them at 53%.  Finally, the rest of the teams have a 10% chance of winning, while Vegas has them at 49%.

It seems pretty clear that if you are going to bet on someone to win the World Series, bet on a team that is at least in the top third.  Otherwise, you are buying a lottery ticket.

#1    Jeff Sackmann      (see all posts) 2006/12/15 (Fri) @ 18:22

Not only is it a more than healthy vig, but you have to consider the present value of your winnings.  They get your cash for the next ten months, which ups their take even more.


#2          (see all posts) 2006/12/15 (Fri) @ 18:31

Sure, but you can’t count on any team, no matter how bad it was last year, to be a 61-win team or whatever.  Last year it might have been somewhat unlucky.  Its young players might suddenly improve next year.  It might pick up a free agent or promote a player who becomes a star.

Suppose the Nationals have a 1 in 20 chance of actually being a .490- to .510- skill team next year, for any or all of these reasons.  Then, they have at least a 1/96 times 1/20 chance of winning next year, which is 1 in 1920.

If you assume they also have a 1 in 20 chance of becoming a .450-.490 skill team, and a 1 in 50 chance of becoming a .510 to .530 skill team, and so on, you have to bump the odds up even higher.


#3    tangotiger      (see all posts) 2006/12/15 (Fri) @ 23:17

Sure, absolutely.  I know, almost for sure, that there is one team out there that is actually a true 61-win team.  But, you probably have a reasonable chance that any 10 teams are actually one of them.  The average true wins of the true bottom 10 teams is 70.5 wins.  Assuming that the Nats (or whoever) has a 1 in 10 chance at being any of those 10 teams, then their cumulative World Series chances are 2000 to 1.  Still a very long way from the Vegas odds of 150:1.

***

I’m trying to model the Vegas odds, and I think this is how it works:

1. Come up with an estimated true talent distribution.  While the true standard deviation should be .060, they use .036, since we can never know who is really the best team, etc.

2. To get that, you start at 92 wins, and go down by 1 win, until you get to 70.  That gives you 23 teams.  You put in an extra 81-win team.  That’s 24 teams.  You fill in the other 6 teams by giving them between 80 and 82 wins.  That gives you 1 SD = .036

3, 4, 5 is as my original post.

6. Multiply those odds by 1.43, and add .005.  That’ll give you the total odds of 1.58.  (That .005 effectively introduces a floor, which makes the longest possible odds about 200:1.)

This is the result:
Wins Odds Vegas
92.0 4 4
91.0 5 6
90.0 6 8
89.0 6 8
88.0 7 10
87.0 9 10
86.0 10 10
85.0 12 12
84.0 15 15
83.0 18 15
82.0 23 15
81.6 25 15
81.3 27 20
81.1 28 20
81.0 28 22
81.0 28 25
80.9 29 28
80.7 30 30
80.4 33 33
80.0 36 45
79.0 45 50
78.0 57 50
77.0 72 60
76.0 89 70
75.0 108 70
74.0 127 80
73.0 145 80
72.0 161 90
71.0 174 100
70.0 183 150

The “Odds” column is my quick model here.  The “Vegas” column is the actual odds using actual teams.

As you can see, a pretty fair model. Don’t be fooled by the “magnitude” of 127:1 and 80:1 odds.  They are both around 1% chances (.008, .012).
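
A rough sketch of the markup step and of how the total "number of World Series winners" is computed from the Vegas column. It assumes that X:1 odds correspond to a probability of 1/(X+1), which is consistent with the 5:1 figure for the 101-win team in the original post; the function names are made up for illustration. Since the Vegas column in the table is rounded, the total comes out close to, but not exactly at, the 1.58 quoted earlier.

```python
# "Vegas" column from the table above, as X:1 odds against.
VEGAS = [4, 6, 8, 8, 10, 10, 10, 12, 15, 15, 15, 15, 20, 20, 22,
         25, 28, 30, 33, 45, 50, 50, 60, 70, 70, 80, 80, 90, 100, 150]

def implied_prob(odds):
    """ASSUMPTION: X:1 odds against means a probability of 1/(X+1)."""
    return 1 / (odds + 1)

def odds_from_prob(prob):
    """Inverse of implied_prob."""
    return 1 / prob - 1

def vegas_style(fair_prob, scale=1.43, floor=0.005):
    """Step 6 of the model: scale the fair probability and add a floor."""
    return fair_prob * scale + floor

# Total implied probability; anything above 1.0 is the bookmaker's margin.
overround = sum(implied_prob(o) for o in VEGAS)
print(f"Implied World Series winners: {overround:.2f}")

# Even a team with no chance at all is quoted at roughly 200:1.
print(f"Longest possible odds: {odds_from_prob(vegas_style(0.0)):.0f}:1")
```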

Would be interesting to see what kind of model is built for football, basketball and hockey.  That is, how much “uncertainty” do they build in the model, so that the estimated true talent levels of the teams conform to that.


#4    tangotiger      (see all posts) 2006/12/15 (Fri) @ 23:24

This is similar to the idea of forecasting the RBI leader.  Last year, I called the over/under for the 2006 RBI leader as around 150.  The actual leader, as it turned out, had 149. 

However, I’d have no idea it would have been Ryan Howard.  It could have been any number of players, be it Ortiz, Morneau, Pujols, Howard, Andruw, Tejada, etc.  So, to play it safe, you would need to forecast each guy at no more than 110-120 RBIs.  That is, one SD in true-talent RBI could be say 20, but I’d have to assume 10. (Or something like that.)

Same thing here.  While I “know” that there is a true talent team out there with 101 wins, my best guess is to assume 92 wins or so.  I know that the teams are distributed with 1 SD = .060, but I’ll have to assume .036, because I have that much uncertainty.


#5    Rob McQuown      (see all posts) 2006/12/16 (Sat) @ 18:01

What fun!  Nice work.  I remember reading in an old Bill James Abstract that he suggested if anyone ever offered you 250:1 on a team to win the W.S., you should take it, because no team is ever that bad.  I hadn’t ever really looked into it, but it didn’t sound right then, and sounds even less right now.

That said, I believe there are a lot of not-so-bad teams out there which get lumped in with the Nats, et al, which are truly awful.  (Now, watch Snelling win the MVP as he leads the Nats to a pennant!)


#6          (see all posts) 2006/12/16 (Sat) @ 20:44

Bill James did an article in the first of the Brock Hanke-led Baseball Abstracts (1989?) about “How Often Does the Best Team Win the World Series?”, using Monte Carlo simulations.  The “real” (unknown) SD had to be smaller than .060, and he invented a randomizing formula for generating “real” abilities that, ranging in theory from .325 to .675, would hover around .5 and generate the right SD of wins.

I ran a 120-season simulation using his random-win generation and, as he of course did not, a thirty-team, six-division, eight-playoff-teams model.  (Best win-lost record during simulation: 117-45; lowest, 46-116; average SD about 9.0.)

104 of the 120 World Series were won by one of the eight best teams.  As for the others:

9th (3)
10th (2)
11th (2)
12th (1)
13th (2)
14th (2)
15th (1)
16th (1)
17th (1)
18th (0)
19th (1)

Not a single World Series was won by the 20th through 30th best teams in a given year.


#7    tangotiger      (see all posts) 2006/12/16 (Sat) @ 22:14

Great work Brian!

I’ve got the top 8 teams, using an SD of .036, winning .685.  With an SD of .060, the top 8 win .785.  Your 104 of 120 is .867.

***

Why do you say that the true SD had to be less than .060?  Historically, I have the observed SD as being around .072.  That implies a true SD of .060.  That is, sqrt(.072^2 - .5*.5/162) = .060.

***

Rob, that James quote definitely sounds like he was completely wrong.


#8          (see all posts) 2006/12/16 (Sat) @ 23:17

Oh, I take your point:  .060 SD real to create .072 observed SD.

Since 120 seasons isn’t a huge sample, I will assume, now that we’re on the same page, that my study’s results were a little fluky/exaggerated.  One difference: your system gives the exact same ability distribution every year, James’s makes every year different.  But on our basic point—the worst third of teams practically never win—we’re reaching the same conclusions.

***
Defending Bill James:  I think his logic was that a team’s “real” quality is itself hard to predict, due to injuries, sudden developments/collapses, team chemistry changes (just because team chemistry is “intangible” doesn’t mean it doesn’t produce statistical results if only we could measure them), etc.  So if we think the Royals are a .400 team, they could easily turn out to be a .475 team based on real ability, and then the 250-to-1 odds against them lucking into a series title don’t look so bad.

Problem is:  yes, the odds do still look bad, after all.  Oh well.  *is Royals fan; is used to this*


#9          (see all posts) 2006/12/17 (Sun) @ 01:42

I agree with Brian Brock (#8).

The problem is that we don’t know how good the team will be next year.  Tango, you gave a hypothetical where the Nats might be any one of the bottom ten teams.  But there is a non-zero probability of them being one of the MIDDLE ten teams, and that’s where most of their world series probability lies.

In 1983, the Cubs were 71-91.  In 1984, they were 96-65.  That’s more than a 25-game increase.  Increase the Royals by 25 games and what have you got? 

In fact, the Bill James comment referred to above was for the ‘84 Cubs.  In the ‘84 Abstract, page 56, James wrote,

“If anybody offers you 100 [sic] odds against the Chicago Cubs winning the [division] in 1984, take it… I have serious doubts that any team should even be considered a 200-1 shot to win the division ... Anyways, if I have ever seen a dead giveaway set-up for a miracle, this [’84 Cubs] is it.”

James was prescient.  Of course, there were indicators, like Pythagoras and so on, but still.

And, 200-1 to win a division in 1984 is about 800-1 to win the World Series.  That’s still substantially more than the 150-1 offered on the Nationals.


#10    tangotiger      (see all posts) 2006/12/17 (Sun) @ 02:12

Right, an 800:1 to win the W.S. is certainly different than what was being quoted earlier.

***

Right again on the distribution of what the worst team could be.  I suppose we can create a distribution of the perceived worst team being, such as .20 of being the worst, .15 being the second worst, .10 being the 3rd worst, .09 being 4th worst, .... .0001 being the best, and then figuring the odds of that.


#11    tangotiger      (see all posts) 2006/12/17 (Sun) @ 02:24

I made the perceived worst team 20% chance of actually being the worst team, 16% of being 2nd worst, and so on down the line (almost 80% of the previous level), up to a 1 chance in 3200 of being the best team.

The odds of such a team winning the WS are 600:1.

I repeated, this time making them a 30% chance of being worst team, 21% chance of being 2nd worst (and so on, using 70% this time).  Odds of winning the WS is 3000:1.

Finally, started the team as being 50% chance of being worst, 25% of being 2nd worst and so on down the line (until 1 chance in a billion of actually being the best).  Such a team is a 30,000:1 shot of winning the WS.

This compares against the 4 million:1 shot against winning the WS if you were 100% certain they were the worst.

(All done using 1 SD = .060).

If we take the 1st scenario as the most likely, then we’re 89% sure that the perceived worst team is one of the 10 worst teams in the league, and a 1% chance that the perceived worst team is one of the 10 best teams in the league.
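
A sketch of the blending described above, using the per-team odds from the original post (best team first) and again assuming that X:1 odds mean a probability of 1/(X+1). The helper names are invented for illustration, and the weights are left un-normalized, as in the description (they sum to a shade under 1).

```python
# Odds against winning the WS for each true rank, best team first,
# copied from step 6 of the original post.
ODDS_BEST_TO_WORST = [5, 6, 7, 9, 10, 13, 15, 18, 22, 27, 33, 42, 54, 72,
                      96, 96, 131, 183, 261, 381, 567, 865, 1349, 2156,
                      3529, 10189, 17988, 60679, 228224, 4559310]

def rank_weights(p_worst, ratio, n=30):
    """Chance that the perceived worst team is really the worst (index 0),
    2nd worst (index 1), and so on, each a fixed ratio of the one before."""
    return [p_worst * ratio**k for k in range(n)]

def blended_odds(p_worst, ratio):
    """Weight each rank's title probability by the chance of holding it."""
    weights = rank_weights(p_worst, ratio)
    # ASSUMPTION: X:1 odds against corresponds to probability 1/(X+1).
    probs_worst_first = [1 / (o + 1) for o in reversed(ODDS_BEST_TO_WORST)]
    p_title = sum(w * q for w, q in zip(weights, probs_worst_first))
    return 1 / p_title - 1

w = rank_weights(0.20, 0.80)
print(f"Bottom 10: {sum(w[:10]):.0%}, top 10: {sum(w[20:]):.0%}")
print(f"First scenario: about {blended_odds(0.20, 0.80):,.0f}:1")
```

The other two scenarios correspond to `blended_odds(0.30, 0.70)` and `blended_odds(0.50, 0.50)`.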

If that’s what Bill James was talking about, then those odds, 600:1, are what we are talking about, and James seems to have been dead-on about that.


#12    tangotiger      (see all posts) 2007/02/10 (Sat) @ 00:30

I’m always annoyed when someone reads something I write here, and then needs to blast me somewhere else.  I have a comments section right here!  Why not write your thoughtful comments here?

Anyway, one of the comments was:

I’m not sure if that is the proper percentages, but that should be done game by game, opponent by opponent, and not with some numbers taken out of his ***. This is a classic example of a crappily-conceptualized and crappily-executed study.

That was in reference to my point #4 above.  It was in fact done on a game-by-game type basis, and wasn’t taken out of any part of my body.  I just wanted to create a simple function.

While the reader may think it was crappy, I was quite satisfied with the study.

That same reader also said:

He assumes, using current data (variation of team wins) that the distribution we see DOES in fact reflect true quality,

That was in reference to #1 above.  I’m one of the few analysts that actually regresses team wins.  The observed team winning percentage is 1 SD = .072, which implies a true team winning percentage of 1 SD = .060.  I don’t know why he says I assume anything.  In fact, it was the reader himself who assumed what I did, since I wasn’t explicit in what I said (though regular readers of this blog have seen me discuss the .072 / .060 issue several times).

If he assumed it, I would have been ok.  But, for him to say that I assumed it? 

Problem there (two problems see below) is that a win for one team is a loss for another: beat up enough potential contenders and your team will increase its chances of winning AND decrease the chances of the teams you are beating. He assumes the 88 wins will happen in a vacuum.

He’s correct about this, but for all practical purposes, it doesn’t matter.

88 wins to get into the playoffs huh? Since we are discussing the chances of marginal teams sneaking in and then wreaking havoc in the postseason, I think that floor is too high-I mean a fricking 83 win team won it just last year! Nevermind something like 1994 in a weak 4 team division where a losing team gets in.

He’s right, and I could have created a distribution for that too.  I don’t know how much my conclusions would have changed.  My guess is hardly any difference.


#13    Tangotiger      (see all posts) 2008/12/17 (Wed) @ 17:24

This week’s bumping…


#14    MGL      (see all posts) 2008/12/17 (Wed) @ 22:41

Every year I do my team w/l projections and their chances of winning the division, WC, pennant, WS, etc., by simulating 100,000 seasons, using a static w/l% for each team and each team’s actual schedule (and a log5 for each “game”).
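
For readers who haven't seen it, here is a minimal sketch of the log5 step, using the standard log5 formula for the chance that a team with winning percentage `a` beats a team with winning percentage `b`. This is not MGL's simulator; `simulate_games` and its parameters are hypothetical and only show how a static w/l% feeds a game-by-game simulation.

```python
import random

def log5(a, b):
    """Chance that a team with true win% `a` beats a team with true win% `b`.
    Undefined when both are 0 or both are 1 (the denominator is zero)."""
    return (a - a * b) / (a + b - 2 * a * b)

def simulate_games(a, b, games, rng):
    """Hypothetical helper: number of wins for team A over `games`
    independent games against team B."""
    p = log5(a, b)
    return sum(rng.random() < p for _ in range(games))

rng = random.Random(2008)  # fixed seed so reruns give the same result
wins = simulate_games(66 / 162, 0.500, 162, rng)
print(f"A true 66-win team went {wins}-{162 - wins} in one simulated season")
```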

Just looking at one of those years, KC was a 66 win team and won the WS around 1 in 10,000 times.

I don’t know how typical that is, but I would think that 1 in 10,000 is a reasonable number for an average “worst team in the league.”

Now, if we use .06 as one SD of talent, then in a typical year there will be one team that is around a true 63 win team or so.  They probably are closer to 100,000 to 1 to win the WS.  So maybe it is somewhere between 1 in 10,000 and 1 in 100,000.  I doubt it is much lower than that on the average.

But, I can certainly run some sims using a team talent distribution which is normal with a .060 SD and see what it yields.  That should be close to the “real” answer.

