THE BOOK--Playing The Percentages In Baseball

Monday, November 02, 2009

Probability illusions

By , 11:25 PM

Before Game 5, if you asked baseball fans if they thought that the Phillies might still win the Series (being down 3-1), most of them would say something like, “No way.”

If you asked a sabermetrician or even a fan who was a statistician the same question, their response would likely be, “I don’t know, but they have around a 9% chance, so it is unlikely but obviously possible.”

If the Phillies do come back and win, again, from the perspective of before Game 5, people will be shocked and they will be talking about the comeback for many, many years.

But....

Is it shocking when something that has a 9% chance of occurring happens?  Typically, no, it isn’t.  9 and 10% things happen all the time.  Sports fans see them ALL the time.  But as far as public perception goes, it depends.  What does it depend on?  One thing is the importance of the event.  A team, even an inferior one, winning 3 games in a row during the regular season?  No big deal.  Happens all the time.  When the Pirates win 3 games in a row, people don’t talk about that for years, do they (well, maybe Pirates fans do)?  But in the WS?  Different story, even though the percentages and likelihood are the same.

Another thing it depends on?  How many opportunities there have been in history.  Why is that?  Because the human mind remembers how many times the unlikely event occurred and not necessarily the number of opportunities.  An inferior team winning 3 games in a row during the regular season?  Happens all the time.  That is because there are hundreds of opportunities.  But we are not necessarily aware of the number of opportunities, only that it happens a lot.  Yet it only happens 10% of the time or so, exactly the same as a team coming back from a 3-1 deficit in the WS.  That too will happen 9 or 10% of the time, but we have had so few opportunities for that in history that people only recall how infrequently it has actually occurred (4 times in history, I think).

I’ll give you one more example of how our “shocked meter” is nowhere near congruent or commensurate with probabilities.  Barry Bonds steps up to the plate.  How many people would be shocked if he hit a HR, or would say, “No way he hits a HR”?  The Phillies come back to win the WS after being down 3 games to 1.  How many people would be shocked, or would say, “No way!”?

Chance of Bonds hitting a HR?  6%.  Chance of Phillies winning the Series?  9%.  Which should be WAY more shocking from a purely statistical perspective?


#1    Greg Rybarczyk      (see all posts) 2009/11/03 (Tue) @ 01:01

You’re right, a 3-1 comeback should be more surprising than a homer in any particular at-bat, but psychologically, the fact that a hitter gets hundreds of opportunities in a season to hit a homer has to subconsciously factor in.  If a hitter gets 5% homers, and has 500 PA’s, he is guaranteed to get many homers, therefore getting a homer seems less a longshot and more of an inevitability…

Contrast that with 3-1 deficit opportunities.  Those situations are rare, and teams (aside from the Red Sox) don’t get them very often.  A team that fails to rally from 3-1 down might not get another chance for decades, so when they do come through, it seems more remarkable.

It’s not quite the same, but I like to think about how when you cut a card out of a deck, there is a 1 in 13 chance of getting any particular ranked card, which works out to a 7.7% chance.  By the “shocked” logic, any card that appears should be shocking, but in fact none is, because it is inevitable that the card is one of the 13 types…
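The opportunity-count argument above can be put into numbers. The following is only a rough sketch, assuming independent trials with a fixed per-trial chance (real plate appearances and series are not quite like that); the function name `prob_at_least_once` is made up for illustration. It shows why a 5% event feels inevitable over 500 tries while a 9% event stays rare over a handful of tries.

```python
def prob_at_least_once(p: float, n: int) -> float:
    """Chance that an event with per-trial probability p happens
    at least once in n independent trials."""
    return 1.0 - (1.0 - p) ** n


# Figures from the comment: a 5% homer rate over 500 plate appearances.
hr_rate = 0.05
plate_appearances = 500
expected_homers = hr_rate * plate_appearances

print(f"Expected homers: {expected_homers:.0f}")
print(f"P(at least one homer): {prob_at_least_once(hr_rate, plate_appearances):.6f}")

# A 9% comeback chance, with only a few 3-1 deficits to work with
# (the opportunity counts here are illustrative, not historical).
comeback_rate = 0.09
for opportunities in (1, 5, 10):
    p = prob_at_least_once(comeback_rate, opportunities)
    print(f"{opportunities:2d} opportunities -> P(at least one comeback) = {p:.3f}")
```

The per-trial odds are similar in both cases; what differs is how many trials the observer gets to see.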


#2    Xeifrank      (see all posts) 2009/11/03 (Tue) @ 02:18

~12.6% now.  Swag on the Game #7 WP.  About the same chances of flipping a coin three times and having it come up heads every time.

0.36 x 0.35 = 12.6%

vr, Xei


#3    KY      (see all posts) 2009/11/03 (Tue) @ 02:30

rolling craps (2,3,12 with two dice) in craps is only slightly more likely at 11.1% ... yet it seems to happen every time I roll!
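For anyone who wants to check the 11.1% figure, here is a small sketch that enumerates all 36 outcomes of two fair dice and counts the totals of 2, 3, and 12. Fair, independent dice are the only assumption.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# "Craps" on the come-out roll: a total of 2, 3, or 12.
craps_rolls = [roll for roll in rolls if sum(roll) in (2, 3, 12)]

p_craps = Fraction(len(craps_rolls), len(rolls))
print(f"{len(craps_rolls)} of {len(rolls)} rolls = {p_craps} = {float(p_craps):.1%}")
```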


#4          (see all posts) 2009/11/03 (Tue) @ 02:38

I wonder what answer you’d get if you asked people (beforehand) how likely the Phillies were to win Game 5.  I don’t think the wisdom of the crowd would put it at 45% - probably 10-20%.  It’s more that people can’t even properly estimate the likelihood for one game, let alone three. 

Obviously if you actually thought the Phils had just a 10% chance to win game 5, then it would be entirely rational to be surprised if they won the series…


#5    DH Phils      (see all posts) 2009/11/03 (Tue) @ 02:43

Hawerchuk: Vegas had the Phillies as the favorites in Game 5 at -157.  I don’t know where you get your 10-20% number from, but money-backed public opinion said the Phillies’ chances were approximately 60%.
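As a rough check on that 60% figure, a moneyline can be converted to an implied win probability. This sketch uses the usual American-odds convention and ignores the bookmaker's margin, so the result slightly overstates the favorite's true chance; the function name `implied_probability` is just an illustrative choice.

```python
def implied_probability(moneyline: int) -> float:
    """Implied win probability from an American moneyline,
    with no adjustment for the bookmaker's margin (vig)."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100.
        risk = -moneyline
        return risk / (risk + 100)
    # Underdog: risk 100 to win moneyline.
    return 100 / (moneyline + 100)


phillies_game5 = implied_probability(-157)
print(f"Phillies at -157: {phillies_game5:.1%} implied")
```

The raw figure comes out a little above 61%; once the margin is taken out, something close to 60% is plausible.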


#6    MGL      (see all posts) 2009/11/03 (Tue) @ 03:16

yes, 60% for Game 5, 36% for Game 6 and probably 32% for Game 7.

So, actually 7% before Game 5, but 11.5% now. 

Either way greater than a Bonds HR in any given PA!
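The arithmetic behind those two numbers is just a product of the single-game estimates. A minimal sketch, assuming the games are independent and taking the 60%, 36%, and 32% figures above at face value:

```python
import math

# Phillies' single-game win estimates for Games 5, 6, and 7 (from the comment above).
game_probs = [0.60, 0.36, 0.32]

# Down 3-1, the Phillies must win all three remaining games.
before_game5 = math.prod(game_probs)

# After winning Game 5, only Games 6 and 7 remain.
after_game5 = math.prod(game_probs[1:])

bonds_hr_per_pa = 0.06  # figure quoted in the post

print(f"Before Game 5: {before_game5:.1%}")
print(f"After Game 5:  {after_game5:.1%}")
print(f"Bonds HR in a given PA: {bonds_hr_per_pa:.0%}")
```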


#7    Sunny Mehta      (see all posts) 2009/11/03 (Tue) @ 03:40

probability of “witnessing a team come back from down 3-1 in a world series” is different than the probability of “a team coming back from down 3-1 in a world series.” while the latter probability might be 9 or 10 percent, the former is much lower because 1) we have to factor in the condition of the series getting to 3-1 in the first place and 2) we have to factor in the number of opportunities to witness it (say, the number of WS a person watches in a lifetime also probably combined with some factor for period in his/her life where watching it is impactful/meaningful)

say a person has a span of 50 solid years of meaningful world series watching. (average is probably less than this given average life span combined with average viewing life span combined with average meaningful viewing life span). only about a quarter of those series will get to 3-1. so you’re looking at an expected value of witnessing about 1 WS 3-1 comeback in life, and with considering the variance about that 1, many people will never see it.
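Here is that estimate as a back-of-the-envelope calculation. It assumes 50 independent World Series, a one-in-four chance that a series reaches 3-1, and a 9% comeback chance from there; all three inputs are the rough figures from this thread, not measured values.

```python
years_watching = 50        # seasons of meaningful viewing
p_series_reaches_3_1 = 0.25
p_comeback = 0.09

# Chance that any single World Series contains a 3-1 comeback.
p_per_series = p_series_reaches_3_1 * p_comeback

# Expected number of comebacks witnessed in a viewing lifetime.
expected_witnessed = years_watching * p_per_series

# Chance of never witnessing one, treating each series as independent.
p_never = (1 - p_per_series) ** years_watching

print(f"Expected comebacks witnessed: {expected_witnessed:.2f}")
print(f"Chance of never seeing one:   {p_never:.0%}")
```

Under these assumptions the expected count is a little over one, yet roughly a third of fans would never see a single 3-1 comeback.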


#8    Spike      (see all posts) 2009/11/03 (Tue) @ 04:02

In fairness, if Bonds batted once every October, we’d be shocked the one time every 12-13 years he hit a home run. smile


#9    Spike      (see all posts) 2009/11/03 (Tue) @ 04:04

Make that 16-17 years.


#10    cdm      (see all posts) 2009/11/03 (Tue) @ 10:55

MGL,

We’re actually very good at updating our conditional probabilities in most settings and modalities.  In language (http://www.jstor.org/stable/40063345?cookieSet=1), vision (http://eprints.kfupm.edu.sa/59233/) and motion (http://www.iop.org/EJ/article/1741-2552/2/3/S04/jne5_3_s04.pdf). Low-level decision making structures update conditional probabilities well (http://www.mitpressjournals.org/doi/abs/10.1162/neco.2007.19.2.442). Even in high-level, deliberative decision making processes, we are surprisingly good (http://www.cogsci.northwestern.edu/Bayes/Shanks_1995.pdf).

Last year I looked at how accurate the Vegas odds were, and found that Vegas deviated from the actual outcomes less often than you would expect by chance 70% of the time. 

We’re actually very good at this; I think the anecdotal fans who are saying “No way!” would give you a different answer if there was any cost to getting it wrong.


#11    MGL      (see all posts) 2009/11/03 (Tue) @ 12:07

"I think the anecdotal fans who are saying “No way!” would give you a different answer if there was any cost to getting it wrong.”

Yes, I have always said that people will often give you 2 answers to a question. One, their first answer, and two, their answer if they had to put their money where their mouth is…


#12          (see all posts) 2009/11/03 (Tue) @ 12:40

Not sure what happened to my comment, but I think that the average fan has no idea what the probability of one team winning one game is.  KC-Yankees, for example - 33% KC?  37% KC?  I think the average fan would tell you something vastly lower, like 10%.  The lines get it right.  But the guy who believes in 10-AB sample sizes and “ownage” would not.


#13    MGL      (see all posts) 2009/11/03 (Tue) @ 15:53

Well the average betting fan knows the correct percentages more or less.

As cdm says, though, even fans who, when pressed, will admit that team B has a 20% or 30% chance of winning (if they had to put their money where their mouth is) will still say, “No way they win,” as if the chances were near zero.

When people talk, they rarely think (rationally) first.  And even when they think, they often think with incomplete and irrational thoughts interspersed with rational ones. If you can help them in their thinking process just a little, they will often easily come up with the correct answer (which they actually knew all the time, but it was clouded by irrational thoughts brought on by emotion and other things).


#14    Matt Lentzner      (see all posts) 2009/11/03 (Tue) @ 18:18

If 60% of the fans are sure the Yankees will win game 6 and 40% are sure the Phillies will win then they can come up with the correct prediction (as a group) without anyone having any statistical savvy whatsoever.

If I were a betting man I would be taking the Phillies right now. Not that I think they will win, but I think the Vegas line is too low. Pitcher fatigue and poor management (Girardi) even things up a bit.


#15    MGL      (see all posts) 2009/11/03 (Tue) @ 18:28

"If 60% of the fans are sure the Yankees will win game 6 and 40% are sure the Phillies will win then they can come up with the correct prediction (as a group) without anyone having any statistical savvy whatsoever. “

First of all, it is not that 60% are “sure the Yankees will win,” etc.  It is just a reflection of the opinion that a large group of fairly informed people has of each team’s chances of winning. That is a little different from what you said.

And that does not necessarily mean that their estimate is going to be better than that from one or more more-knowledgeable persons or entities (like a computer programmed by a knowledgeable person), either for any one game or for all games in general.  That is one reason why there are some successful sports bettors!  They are able to identify at least some of the events where the oddsmaker and public combined do not have as good an estimate as the bettor, or at least the bettor can recognize when the oddsmaker and public might be biased or simply have a bad estimate.

Let me give you an example of that latter phenomenon:  It is not so much true anymore, but 20 years ago, if a baseball team won 5 games in a row, the public and the oddsmaker would set the line for the next game such that it favored the team on a streak more so than if the team had not won 5 games in a row.  A statistician who looked over history and found that team streaks had no predictive value could blindly bet against those teams and probably have a small edge.

So, successful sports betting is not always or necessarily a matter of a person or a statistician making a better line than the public/oddsmakers - it can also be simply exploiting known biases in the betting lines. Again, not nearly so much these days as many years ago.


#16    Eric Hanson      (see all posts) 2009/11/04 (Wed) @ 16:25

@ cdm (post #10)

You wrote:  “Last year I looked at how accurate the Vegas odds were, and found that Vegas deviated from the actual outcomes less often than you would expect by chance 70% of the time. “

Could you restate this conclusion in another way for me please?  Or, if it’s easier, share the process by which you reached this conclusion?  I’m not sure what the exact claim is.

Thanks.


#17          (see all posts) 2009/11/04 (Wed) @ 19:27

On a related note, I was initially very amused today by the ESPN poll that, after about 100,000 respondents, had the following consensus expectation about the relative likelihood of the potential WS outcomes:

Yankees in 6 = 48%
Yankees in 7 = 10%
Phillies in 7 = 41%

My first thought was that these results were very silly, in that they implied that (a) the Yankees have a 48% chance of winning Game 6, but also that (b) the Yankees have only about a 20% chance of winning Game 7, conditional on their having lost Game 6. 

However, as I thought about it more, I realized that there’s a selection bias at work here.  A more accurate way of stating (b) would be to say that, of the set of people who expect the Yankees to lose Game 6, 80% of them also expect the Yankees to lose Game 7.  However, the poll didn’t ask the people who thought the Yankees would win Game 6 what would happen if there were a Game 7.

So really, maybe the results of the ESPN poll are reasonable, and consistent with a model in which the polling universe consists of 3 types of people:  (a) people that always select the Yankees over the Phillies to win any single game; (b) people that always select the Phillies over the Yankees to win any single game; and (c) people who form their view on which team will win any single game by mentally flipping a coin.  The Type A people vote Yankees in 6; the Type B people vote Phillies in 7; the type C people get split among the 3 outcomes.
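To see whether that three-type model can actually reproduce the poll, one can solve for the mix of voters. This sketch assumes the Type C people flip an independent fair coin for each game, so they split 1/2 to Yankees in 6, 1/4 to Yankees in 7, and 1/4 to Phillies in 7; that split is an assumption added here, not something the poll reports.

```python
from fractions import Fraction

# ESPN poll shares as quoted above (they sum to 99%, presumably from rounding).
poll = {
    "Yankees in 6": Fraction(48, 100),
    "Yankees in 7": Fraction(10, 100),
    "Phillies in 7": Fraction(41, 100),
}

# Only coin-flippers can pick "Yankees in 7", and a quarter of them do.
type_c = 4 * poll["Yankees in 7"]

# Yankees in 6 = all of Type A plus half of Type C.
type_a = poll["Yankees in 6"] - type_c / 2

# Phillies in 7 = all of Type B plus a quarter of Type C.
type_b = poll["Phillies in 7"] - type_c / 4

for label, share in (("A (always Yankees)", type_a),
                     ("B (always Phillies)", type_b),
                     ("C (coin flippers)", type_c)):
    print(f"Type {label}: {float(share):.0%}")
```

All three shares come out positive, so under this assumption the poll is at least internally consistent with the model.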

Any thoughts?


#18          (see all posts) 2009/11/04 (Wed) @ 19:52

My thoughts are that if those same people had to put their money where their mouths are, the results of the poll would be very different.

My other thought is, “So much for wisdom of the crowds!”


#19          (see all posts) 2009/11/04 (Wed) @ 19:57

Let’s say that you polled reasonable, rational, knowledgeable, people, who were to give objective responses.

The result should be close to 100% “Yankees in 6.” The answer to the question, “What do you think the outcome of the Series will be,” is the same as the answer to the question, “Which outcome is most likely of the 3 possible outcomes?”

And anyone who believes that the Yankees have a greater than 50% chance of winning today’s game, which should be everyone (given the criteria above), has to answer, “Yankees in 6.”


#20          (see all posts) 2009/11/04 (Wed) @ 21:31

I disagree that each person’s answer to the question “What do you think the outcome of the Series will be” should be that person’s estimate of the mode of the distribution.  I think each person should develop a view as to the entire probability distribution, then pick a random number, and convert that random number into an outcome. 

(At least, that’s what I’m arguing today.  If you asked me tomorrow maybe I’d argue something different.)


#21    Think Blue Crew      (see all posts) 2009/11/09 (Mon) @ 14:11

This reminds me of Fooled By Randomness and The Black Swan.


#22    Vidor      (see all posts) 2009/11/09 (Mon) @ 19:43

Um...comebacks from 3-1 deficits…

1903 World Series, Red Sox (best of nine)
1925 World Series, Pirates
1958 World Series, Yankees
1968 World Series, Tigers
1979 World Series, Pirates
1985 ALCS, Royals
1985 World Series, Royals (!)
1986 ALCS, Red Sox
1996 NLCS, Braves
2004 ALCS, Red Sox (down 3-0)
2007 ALCS, Red Sox


#23    Corey      (see all posts) 2009/11/10 (Tue) @ 11:11

But the figure for Bonds hitting a HR would be over his whole career. How about you shorten that to just post-lockout?  That was when he really became the HR threat that everyone remembers.

