THE BOOK--Playing The Percentages In Baseball

Wednesday, March 31, 2010

Poll: Chances for Redsox and Royals to surprise

By Tangotiger, 01:12 PM



SabermetricsPoll
#1    Tangotiger      (see all posts) 2010/03/31 (Wed) @ 13:55

After 20 votes, the Redsox are at 84.5.  This means the chance that they will win at least 100 = chance they will win at most 84.5.  That sets the midpoint as 92 wins.

The Royals are at 100 losses, setting the midpoint at 90.5 losses (or 71.5 wins).

It seems to me that fans have a good grasp of the role that probability plays.  A fan may SAY that the Redsox are a 98-win team, but clearly he won’t believe that.  Because if that’s true, then the chance that they win at least 100 is going to be the same as the chance that they win at most 96.

And the same for the Royals.
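
The midpoint arithmetic in this comment can be checked with a short sketch. The helper names `midpoint` and `mirror` are made up for illustration, and the 81-loss figure for the Royals is inferred from the quoted numbers rather than taken from the poll wording.

```python
GAMES = 162  # regular-season games per team


def midpoint(a, b):
    """Center implied by two tail thresholds judged equally likely."""
    return (a + b) / 2


def mirror(center, threshold):
    """Threshold the same distance from center, on the other side."""
    return 2 * center - threshold


# Redsox: "at least 100" judged as likely as "at most 84.5".
redsox_mid = midpoint(100, 84.5)        # quoted in the thread as 92

# Royals: midpoint of 90.5 losses, expressed as wins.
royals_mid_wins = GAMES - 90.5

# Inferred (not stated in the thread): the fixed end of the Royals question.
royals_other_end = mirror(90.5, 100)

# A true 98-win team: "at least 100" would pair with "at most" this many.
claimed_98 = mirror(98, 100)

print(redsox_mid, royals_mid_wins, royals_other_end, claimed_98)
```

The same `mirror` call shows why a fan who answers 84.5 cannot really believe in a 98-win team: that belief would require an answer of 96.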

I think that if you do NOT ask the fans the question as I ask it, and instead just ask for one number of wins, they’re going to think “let’s see… some team is going to win 98-100 games, and I think the Redsox are the best team, so… Redsox at 99 wins”.  They don’t really believe that, which is the point of these questions I ask.

This is just like the Strasburg v Lincecum poll.  Yes, Strasburg can have a sub-2.50 ERA, and so can Lincecum.  But, we’ll be far more shocked if Lincecum posts a 4.00 ERA than if Strasburg were to post a 4.00 ERA.  And so, there’s no way you can call their midpoint the same.


#2    Xeifrank      (see all posts) 2010/03/31 (Wed) @ 14:21

Is the expected win probability a true (symmetrical) bell curve?  Does an 81 true talent team have the same chances of winning 96 games as they do of winning 66 games?  Does a true talent 50 win team have the same chances of winning 70 games as they do of winning 30 games?  Just curious.
vr, Xei


#3    Tangotiger      (see all posts) 2010/03/31 (Wed) @ 14:39

It’s going to be very close to symmetrical.
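
One way to look at the symmetry question is a plain binomial model: every game is an independent trial with a fixed win probability equal to the team's true talent. That model is an assumption made here for illustration (it ignores opponent strength and roster changes), and the function names are hypothetical.

```python
from math import comb

GAMES = 162


def pmf(k, p, n=GAMES):
    """Binomial probability of exactly k wins in n games."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(wins, p):
    """P(team wins `wins` or more games)."""
    return sum(pmf(k, p) for k in range(wins, GAMES + 1))


def prob_at_most(wins, p):
    """P(team wins `wins` or fewer games)."""
    return sum(pmf(k, p) for k in range(0, wins + 1))


# The two cases from the question: (true-talent wins, high total, low total).
for talent, high, low in [(81, 96, 66), (50, 70, 30)]:
    p = talent / GAMES
    print(talent, prob_at_least(high, p), prob_at_most(low, p))
```

At exactly .500 talent the two tails match by construction. Away from .500 the binomial is slightly skewed, so the two tails are no longer identical, which fits "very close to symmetrical" rather than exactly symmetrical.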


#4    Tangotiger      (see all posts) 2010/03/31 (Wed) @ 14:52

After 40 votes, it’s still the same results:

Redsox 92 wins as midpoint
Royals 72 wins as midpoint


#5          (see all posts) 2010/03/31 (Wed) @ 15:04

Um, really?  I don’t see how it would be symmetrical.

Regression to the mean, right?

If Josh Beckett is projected for a 3.75 ERA, does that mean he has an equal chance of putting up a 4.75 as a 2.75?  I don’t think so.

If something changes in the true talent of a good player, isn’t it likely that it would be a BAD thing?

Heck, even for average MLB players… far more things (mostly injuries) will make them bad than will make them better.


#6    Tangotiger      (see all posts) 2010/03/31 (Wed) @ 15:44

Mike, we’re talking about team wins, not player ERA.


#7    Matt K. (d_f)      (see all posts) 2010/03/31 (Wed) @ 15:54

Just a suggestion—I know I’m a bit slow on the uptake, but I had to read the questions several times, and finally scroll down to your first comment to figure out how to vote properly (I hope!). Maybe I’m just particularly stupid (that certainly can’t be discounted), but a better explanation up top might help.


#8    Jon      (see all posts) 2010/03/31 (Wed) @ 16:09

Matt K., I had to read it multiple times as well.  I sorta think Tango wanted to obfuscate the question so that you didn’t feel like you were just predicting actual wins.  FWIW, saying “at most” and “at least” makes me think more than “no more than” and “no less than”.


#9          (see all posts) 2010/03/31 (Wed) @ 16:10

What’s the difference between a player who’s projected to be above-average, and a team that’s projected to be above-average?  Each of them has many smaller components that make up the whole.  And again, as I see it, anything that occurs to change the projection is more likely to be something that lowers expected performance, right?


#10    Sky      (see all posts) 2010/03/31 (Wed) @ 16:16

Matt, I, too was perplexed by the wording.  Would have favored something like “The Red Sox have as good a chance of winning 100 or more games as winning X or fewer games.” Royals version was even tougher—I don’t often think in terms of losses, just wins.  For future reference, Tango.


#11    Steve Sommer      (see all posts) 2010/03/31 (Wed) @ 16:19

Mike, it seems to me that it is less about actually changing the projection (i.e. changing underlying talent) than it is about realized performance around the projection (the error bars of the projection).  Personally I see a difference between the two.


#12    Tangotiger      (see all posts) 2010/03/31 (Wed) @ 16:25

How about:

The REDSOX have as much chance at winning at least 100 games as they have at winning at most ___ games.

You guys can put up clearer questions as well, and we’ll go with the one that we can agree on.


#13    Matt K. (d_f)      (see all posts) 2010/03/31 (Wed) @ 16:49

I think #12 is easier to understand than what is posted up top. I’m not sure how I would phrase it.


#14          (see all posts) 2010/03/31 (Wed) @ 19:34

The team you start the season with is not always the team you finish with.  That’s what makes such projections difficult.

Health and injury can be unpredictable (not in JD’s or Daisuke’s case, but say a Pedroia going down would be unexpected).  If a lot of things go right, the Red Sox could win 100 games, but if things go bad, well, just look at 2006.

I think you would be best to project 95 +/- 5 for the Red Sox.  Anything more or less would be surprising (improbable), although not impossible.


#15    Jon      (see all posts) 2010/03/31 (Wed) @ 22:06

How about:

“The REDSOX are as likely to win MORE than 99 games as they are to win FEWER than ___ games.”


#16    dan      (see all posts) 2010/03/31 (Wed) @ 23:01

I was far more confused by the wording in the Royals question, FWIW


#17    KY      (see all posts) 2010/03/31 (Wed) @ 23:38

Isn’t there a selection bias on the voters in this blog?  The readers of this blog aren’t what I’d call the average American baseball fan.


#18    Colin Wyers      (see all posts) 2010/04/01 (Thu) @ 00:35

“If Josh Beckett is projected for a 3.75 ERA, does that mean he has an equal chance of putting up a 4.75 as a 2.75?  I don’t think so.”

That’s what it SHOULD mean, yes. That’s the whole point of regressing a forecast to the mean - so that one SD above the forecast is as likely as one SD below. If that’s not the case, the forecast is not being regressed properly.


#19          (see all posts) 2010/04/01 (Thu) @ 00:55

Colin #18

I think people get confused, because counting stats tend not to fall in a symmetrical curve.  If someone is projected to hit 30 HR, he is much more likely to hit 0 than 60.

But, like you said, if the projection is regressed correctly, then RATE stats should be pretty much symmetrical. 

At the team level, it ends up being symmetrical because with so many players, your chances of complete failure (replacement level) end up at basically zero, while there is a chance any one player ends up at this level.  The Red Sox most likely have a better chance at being a 50 win team than a 130 win team (plane crash?), but both probabilities are so close to zero that it doesn’t matter.


#20    Scott M      (see all posts) 2010/04/01 (Thu) @ 01:11

KY,

Did you check the results of the Royals poll? If there was a bias the results should be a lot closer together.

In fact I don’t think the Royals poll tells us anything at all. The average of the poll is 99.5 with an SD of 4.6. If each choice was selected once you’d get an average of 99.5 with an SD of 5.2.
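
The "each choice selected once" baseline is the mean and SD of a flat distribution over the poll's options. The thread does not list those options, so the range below (91 through 108 losses in steps of one) is purely an assumption, picked because it happens to reproduce the quoted 99.5 and 5.2. Other option lists could do the same.

```python
from statistics import mean, pstdev

# Hypothetical option list: NOT taken from the poll itself.
choices = list(range(91, 109))  # 91..108 inclusive

baseline_mean = mean(choices)
baseline_sd = pstdev(choices)  # population SD: every option counted once

print(baseline_mean, round(baseline_sd, 1))
```

If the observed SD of the votes (4.6) is close to this flat-distribution SD, the votes are spread almost as widely as random picks would be, which is the point being made about the Royals poll.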


#21    Tangotiger      (see all posts) 2010/04/01 (Thu) @ 01:22

For ERA, it’s a little bit more confusing.  Had you said .270 wOBA is as likely as .350 wOBA, with a mean of .310 wOBA, then yeah.

But, those numbers translate to say 2.80, 3.75, 4.95 ERA.  That’s because of the multiplying nature of runs.
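
A rough sketch of why a symmetric wOBA range becomes a lopsided ERA range: if runs grow faster than linearly with wOBA, equal steps up and down in wOBA give unequal steps in ERA. The power-law form and the exponent of 2 below are assumptions for illustration only. They are not the conversion used in this comment, and they do not reproduce its 2.80 and 4.95 exactly.

```python
EXPONENT = 2.0  # assumed curvature, chosen only for illustration


def era_from_woba(woba, base_woba=0.310, base_era=3.75, exponent=EXPONENT):
    """Hypothetical convex mapping from wOBA allowed to ERA."""
    return base_era * (woba / base_woba) ** exponent


low, mid, high = (era_from_woba(w) for w in (0.270, 0.310, 0.350))

# The wOBA inputs are 0.040 either side of the mean, but the ERA gaps differ.
print(round(mid - low, 2), round(high - mid, 2))
```

Any convex mapping has this property: the gap above the middle ERA is larger than the gap below it.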


#22    Colin Wyers      (see all posts) 2010/04/01 (Thu) @ 01:37

The issue at the team level is that the total number of wins (as opposed to hits/runs/etc.) is fixed. If you have 30 teams and 81 home games scheduled per team there are a total of 2430 wins and 2430 losses - you are absolutely set at one win per game.

So for the Red Sox to win, say, six more games than their projection, some combination of the other teams has to win six fewer games. That puts additional pressure toward the mean that you don’t get with individual players.
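
The fixed-total point can be illustrated with a toy season. The schedule here is made up (each team hosts 81 games against randomly chosen opponents, every game a coin flip), so only the league-wide totals mean anything.

```python
import random

TEAMS = 30
HOME_GAMES = 81

rng = random.Random(0)  # fixed seed so the run is repeatable
wins = [0] * TEAMS

for home in range(TEAMS):
    opponents = [t for t in range(TEAMS) if t != home]
    for _ in range(HOME_GAMES):
        away = rng.choice(opponents)
        # Toy assumption: every game is a 50/50 coin flip.
        winner = home if rng.random() < 0.5 else away
        wins[winner] += 1

total_games = TEAMS * HOME_GAMES
league_average = total_games / TEAMS
deviations = [w - league_average for w in wins]

print(total_games, sum(wins), sum(deviations))
```

However the individual games fall, the wins add up to the number of games played, so the deviations from the league average always net to zero.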


#23    Colin Wyers      (see all posts) 2010/04/01 (Thu) @ 01:43

Does it work the same with RA, or just ERA?


#24    Colin Wyers      (see all posts) 2010/04/01 (Thu) @ 01:56

I should note that there is a technical name for what I described above, the hypergeometric distribution:

mathworld.wolfram.com/HypergeometricDistribution.html

You end up with a random variance for team wins that’s a bit less than what you’d expect if you simply assumed that everything was binomial.
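
To get a feel for "a bit less", the standard variance formulas for the two distributions can be compared. The mapping used below is one possible reading of the comment and is an assumption: a population of 4860 team-game results (2430 wins, 2430 losses) from which one team draws 162 without replacement.

```python
def binomial_var(n, p):
    """Variance of a binomial count: n * p * (1 - p)."""
    return n * p * (1 - p)


def hypergeom_var(n, successes, population):
    """Variance of a hypergeometric count (sampling without replacement)."""
    p = successes / population
    finite_correction = (population - n) / (population - 1)
    return n * p * (1 - p) * finite_correction


TEAM_GAMES = 162
LEAGUE_WINS = 2430       # one win per game played
LEAGUE_RESULTS = 4860    # wins plus losses across all teams (assumed population)

var_binom = binomial_var(TEAM_GAMES, 0.5)
var_hyper = hypergeom_var(TEAM_GAMES, LEAGUE_WINS, LEAGUE_RESULTS)

print(var_binom, var_hyper, var_hyper / var_binom)
```

Under these assumptions the only difference is the finite-population correction, (N - n) / (N - 1), which is slightly below one.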


#25    MGL      (see all posts) 2010/04/01 (Thu) @ 02:44

“If something changes in the true talent of a good player, isn’t it likely that it would be a BAD thing?

Heck, even for average MLB players… far more things (mostly injuries) will make them bad than will make them better.”

Depends on how you came up with your team projection.  If a team is truly a 90 win team, and the assumption is that that does not change, then the distribution of actual wins is going to be symmetrical.

If you assumed that everyone stays healthy, etc. then, of course fewer wins are more likely than greater wins.  But that would be a bad projection.

You are supposed to already account for chance of injury, etc., in your projection, even for a player projection, so that the chances of a player or team ultimately having a true talent better or worse than your projection are the same. Of course a median and mean projection are not necessarily the same, although that is more true for players than for teams. And yes, if your median and mean are not the same, that automatically means that your distribution of possible results is not symmetrical…


#26    Tangotiger      (see all posts) 2010/04/01 (Thu) @ 07:53

Right, ERA or RA… same deal.


#27          (see all posts) 2010/04/01 (Thu) @ 09:35

Isn’t the best way to settle the question of whether the fans think the distribution is symmetrical to ask them?

Have them assign a probability to each win total.  Restrict it to units of X%, or group the wins, or both to make it manageable.  Say, groups of 5 wins, and units of 5%. 

You might still run into people not being able to follow directions, but it should answer the question.


#28    Tangotiger      (see all posts) 2010/04/01 (Thu) @ 10:06

Bill, I set up my polls so that the reader doesn’t have to think too much.  If I did it your way, which is the correct way, I’d get one-third the voters, and I’d still get the same results.

I just find the way I do these polls, like with Lincecum and Strasburg, to convey the same thing, without making the reader sit down and come up with probability numbers.

You Blink, and you get an answer.  It works, basically.  That’s why I like it.  And that no one else does it or thinks to ask a question in this way (forecasting the upper and lower boundaries) makes it the kind of quirky poll that I like.


#29    Jon      (see all posts) 2010/04/01 (Thu) @ 11:06

Tango, have you played with the starting number to see if that affects the results at all?  I wonder if the midpoint would move around much (or at all) if you had asked “The REDSOX are as likely to win MORE than 115 games as they are to win FEWER than ___ games.”

Just curious if that 100 figure is shaping people’s answers or not…


#30    Tangotiger      (see all posts) 2010/04/01 (Thu) @ 12:17

I wanted to use some reasonable number that was at 1 or 2 SD from the mean, so that I give people a choice of 10 or 12 answers.  If I say 115, I’d have to give them numbers from 81 down to 60 or something.

