THE BOOK--Playing The Percentages In Baseball


Thursday, May 24, 2007

Odds of making the playoffs

By Tangotiger, 04:02 PM

At least two sites track each team’s chances of making the playoffs on a daily basis.  CoolStandings.com even offers two flavors: one “smart”, and one “dumb” (which presumes a prior of .500 for each team).

The Yanks, as far behind as they are, have a 40% chance using the BP prior (i.e., whatever PECOTA thinks they are), a 30% chance using the Smart Cool Standings prior, and a 15% chance if they were a true .500 team in a league of only .500 teams ("dumb", or more accurately, clueless, prior).  The Houston Astros, virtually in the same spot as the Yanks, have a 4% chance according to BP, 6% according to the Smart Cool Standings, and a 15% chance according to the clueless Cool Standings.

It’s remarkable how much the true talent of a team can affect its chances of making the playoffs, given that both teams are equally far behind.  Looks like Clemens picked the right team of the two.


#1    MGL      (see all posts) 2007/05/24 (Thu) @ 18:55

I really like the BP odds using their updated estimates of each team’s true talent level (w/l% going forward) in order to compute playoff odds. 

No matter how much we (analysts) show how current performance affects performance going forward (not much as compared to past performance, unless we have little past performance of course), EVERYONE, including people like Neyer and Law, and of course, every mainstream fan and commentator on the planet, attaches WAY too much importance to a team’s performance thus far in the season.  People cannot imagine that the Yankees may be a true .575 team or that the Mets may be a true .550 team or the Nats a true .450 team, the Cards a true .520 team, etc.  That just boggles their mind.

And as I showed in a thread a few weeks ago, it is easy to get fooled by looking at what seems like a large sample of performance by a team.  500 PA by a team is NOT the same as 500 PA by a player as far as projecting future performance.  This is a critical point which very few people understand.


#2    Gary_      (see all posts) 2007/05/24 (Thu) @ 18:57

1. Is the Cool Standings site based just on pythag?

2. Doesn’t BPro always underestimate the Astros?

3. Do you honestly think Clemens’ decision was motivated by anything other than cash?


#3    tangotiger      (see all posts) 2007/05/24 (Thu) @ 19:51

I’m not sure that Clay/Nate get a new prior before each game.  I get the feeling it’s fixed at the start of the season.  So, if the Cards lose Pujols, or the Twins lose Santana, they still “count”.

But, maybe Clay or Nate will stop by to explain.

The CoolStandings fellows use the current RS/RA.

We had a good discussion over at BattersBox two years ago on this topic.  I’ll see if I can dig it up…


#4    tangotiger      (see all posts) 2007/05/24 (Thu) @ 19:51

http://www.battersbox.ca/article.php?story=20040923122101999


#5          (see all posts) 2007/05/24 (Thu) @ 20:43

I think MGL is referring to the ELO adjusted standings, which do try to take into account a team’s current strength, but do not take into account things like injuries.


#6    MGL      (see all posts) 2007/05/25 (Fri) @ 01:27

Do they use those (ELO) in their updated playoff odds?  They should of course.  Using the pre-season team wp projections is real bad mainly because of injuries.  Of course that is a lot better than using actual wp so far.  In fact, that tells us exactly what we DON’T want.  The whole point of a good playoff odds model is to tell us, for example, when a good team has been unlucky (e.g., the Yankees) and therefore still has a decent chance to make the playoffs, and when a not so great team has been very lucky (e.g., the Brewers) and therefore is not as likely to make the playoffs as it might seem.


#7    David Gassko      (see all posts) 2007/05/25 (Fri) @ 01:38

Mickey,

There are three different reports:

(1) The original report is based on “third-order winning percentage,” which is Pythagorean record using EqR scored and allowed to try to weed out as much luck as possible.

(2) The PECOTA-adjusted version, which uses each team’s projected winning percentage.

(3) The ELO version, which I frankly don’t like at the start of the season because it places weight on the previous season and is based on the team’s performance this year (IOW, it’s no better than and possibly worse than the original).

Until around August, I prefer the PECOTA version, and after that, I think the ELO is most interesting (and probably most accurate).


#8    HarryAbles      (see all posts) 2007/05/25 (Fri) @ 10:31

Noticed something - the BP page says they add 2% to the home team and subtract same from the away to account for HFA, but the DMB article they link to says it should be plus and minus 4%, right?


#9    tangotiger      (see all posts) 2007/05/25 (Fri) @ 10:44

Matter of perspective.

A true .520 team facing a true .480 team will result in a .540 win percentage for the better team (or .460 for the worse).
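To make the arithmetic concrete, here is a minimal sketch using the odds-ratio (log5) method. Treating the .520 and .480 figures as each team's strength against a .500 opponent is an assumption for illustration, and the function name `log5` is hypothetical; nothing here claims to be how BP or DMB actually compute it.

```python
def log5(p_a: float, p_b: float) -> float:
    """Chance that team A beats team B, where p_a and p_b are each
    team's win% against a .500 opponent (odds-ratio / log5 method)."""
    a_wins = p_a * (1.0 - p_b)
    b_wins = p_b * (1.0 - p_a)
    return a_wins / (a_wins + b_wins)

# Home team bumped to .520, away team dropped to .480 (the "2%" view).
home_wp = log5(0.520, 0.480)
print(f"home team expected win%: {home_wp:.3f}")
```

Under that assumption the home team comes out at about .540, i.e. roughly 4 points above .500, which is the other way of stating the same home-field edge.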


#10          (see all posts) 2007/05/25 (Fri) @ 12:06

I agree that “reversion to preseason expectations” is seriously underestimated by most people.  Even most statheads were saying that the Tigers were a far better team than the Cardinals going into last year’s Series ... and that the Cardinals were somewhat better than the Red Sox in 2004.  I think they reached these opinions because they didn’t adjust enough for the information we knew about the players prior to the season in question.

I think it’s true in other sports as well.  To give a couple more cherry-picked examples, the Saints were probably over-estimated by people going into the most recent NFL playoffs (regular season said they were great, pre-season expectations said they were horrible).  And the Spurs were probably under-estimated by people going into the NBA playoffs this year (regular season said they were inferior to Mavs and Suns, pre-season said they were about as good).

Compared to the opinions of typical fans, the odds on postseason series adjust more for pre-season expectations.  But I think there’s even some inefficiency there: they still don’t adjust as much as they should.


#11    Tangotiger      (see all posts) 2007/05/25 (Fri) @ 12:23

An interesting case was just this past week.

Nadal had something like an 80-match winning streak on clay, a record.  Federer had his worst drought leading up to the match.  And in a “friendly”, Nadal beat Federer in a half-clay, half-grass match.

And what happened?  Federer beat Nadal.  I don’t know what the actual odds were (33%, 67% according to one site, and it looks like their head-to-head was 4:7).

I don’t know what the odds should have been.  It’s possible that Nadal’s streak and Federer’s recent drought contributed to the actual odds.

Then again, if it’s the case that this gives you an arbitrage, you can bet that this gap would get closed pretty fast. 

However, there is no arbitrage with sports writers.  I’d guarantee you that far more than 67% of sports writers had taken Nadal to win.  And, if you gave them odds, they’d have wanted 3:1 or more to take Federer.

When it comes to money, the market is efficient.  When it comes to loudmouths....


#12    dfan      (see all posts) 2007/05/25 (Fri) @ 14:50

I have a page (click on my name) on which I keep up-to-date graphs charting the progression of every team’s chance of making the playoffs based on BP’s calculations.  Check it out if you like that sort of thing.


#13    MGL      (see all posts) 2007/05/25 (Fri) @ 14:51

Tango, last sentence above, you got that right!  As I always say, and I paraphrase my own rule, everyone has an opinion until you ask them to put their money where their mouth is.

For example, on BTF, there is a link to some moronic article about how Smoltz is a “big game pitcher” and therefore there was “no way” that he was not going to win against the Mets the other day (200th win, playing against division rival, yada, yada, yada).  First, I pointed out that the article was conveniently written AFTER Smoltz already won (how many gamblers have said, after the fact, “I just KNEW that so-and-so was going to win/lose”?) and second, the Vegas odds on the game were 57/43 Braves.  I would have gladly given the writer of that article 1-2 odds (he wagers 2 dollars to win 1 dollar) on Smoltz for any amount of money and I am quite sure he would NOT have taken the bet, even though he was apparently completely sure (90%, 99%, 100%) that Smoltz was going to win the game!
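A rough sketch of why that bet is a good test of conviction: at 1-2 odds the bettor has to be right about two times in three just to break even. The function names below are hypothetical, and the 57% figure is simply the Vegas line quoted above, taken at face value with no vig adjustment.

```python
def break_even_prob(stake: float, payout: float) -> float:
    """Win probability at which risking `stake` to win `payout` has zero expected value."""
    return stake / (stake + payout)

def expected_value(p_win: float, stake: float, payout: float) -> float:
    """Expected profit for the bettor per bet."""
    return p_win * payout - (1.0 - p_win) * stake

# The writer risks 2 dollars to win 1 dollar on Smoltz.
print(f"break-even win probability: {break_even_prob(2, 1):.3f}")
print(f"EV at the 57% market price: {expected_value(0.57, 2, 1):+.2f}")
print(f"EV if he is truly 90% sure: {expected_value(0.90, 2, 1):+.2f}")
```

Someone who really believed 90% or more should jump at the bet; at the market's 57% it is a clear loser for him.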


#14    MGL      (see all posts) 2007/05/25 (Fri) @ 15:00

Yeah, I was just reading the way BP does the Pecota post-season odds.  For the remainder of the season wp for each team, they take the current wp for each team and regress towards their pre-season Pecota-based team wp projection.  Not a very good scheme.  One, although they don’t say how the regression works, they probably put too much weight on current season wp, and two, the pre-season wp projections are not very good for teams that have had and will continue to have substantial personnel changes, due to injury or what have you.  But, I guess they did not want to put a whole lot of effort into these post-season odds reports as the rigorous method would require constantly updating each team’s projected wp for the remainder of the season, using updated Pecota projections for each player and updated playing time projections for each player.  I guess their Pecota method is probably not too bad, as to some extent the current season wp reflects injuries and other personnel changes (and lots of luck of course), and to some extent the pre-season Pecota-based wp projections reflect the true talent of the players on the team (at least the ones they thought would be playing X amount of games before the season started) absent any current season luck.


#15    MGL      (see all posts) 2007/05/25 (Fri) @ 15:08

The Cardinals W3 and L3 current wp is around .395 (17-26).  Their pre-season projection by BP was .530.  They have them as a .442 team for the remainder of the season, so it looks like they are regressing around 1/3 towards the pre-season wp projections now.  That does not seem like nearly enough of a regression to me after 45 games, but I could be wrong.
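The "around 1/3" figure can be backed out directly from the three numbers above. This is only a sketch of that arithmetic (the function name is made up); it assumes BP's forecast is a straight linear blend of current and pre-season wp, which they do not actually document.

```python
def implied_regression(current_wp: float, preseason_wp: float, forecast_wp: float) -> float:
    """Fraction of the gap from current wp to pre-season wp that the forecast has closed."""
    return (forecast_wp - current_wp) / (preseason_wp - current_wp)

share = implied_regression(current_wp=0.395, preseason_wp=0.530, forecast_wp=0.442)
print(f"implied regression toward pre-season: {share:.0%}")
```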


#16    dcj      (see all posts) 2007/05/25 (Fri) @ 18:13

Interesting discussion. Another thing to consider is league strength. I guess that would be reflected in the actual records after interleague play is over, but I wonder if the Pecota preseason projections were centered at .500 for both leagues.

The Marcel regression for batters is 5/4/3/2 for the past 3 years and the league average. I wonder, could you use the same method for in-season forecasting, maybe with a weight of 6.5 or something for the current season? Would the current-season weight change as the season goes on (maybe going down)?

A Bayesian approach might also work. For that we would need not just a preseason estimate of true talent, but a full prior distribution. I guess Pecota and Zips try to do that with the percentile forecasts, but I remember Tango has criticized the Pecota percentiles in the past.

I think of ELO as a stab at the Bayesian method. I can’t see the details since they are behind the paywall, but if I understand correctly it is based only on team W-L. Clearly there’s room for improvement there.


#17    Anthony      (see all posts) 2007/05/25 (Fri) @ 19:27

BPro’s ELO multiplies the game’s rating by the cube root of the margin of victory.


#18    MGL      (see all posts) 2007/05/25 (Fri) @ 21:20

The method for computing each team’s wp from now on is different depending upon the knowledge you have (and trust).  It is pretty straightforward though.  If it is me I simply use updated projections and playing time for each player and I ignore team w/l records thus far. Obviously the player projections include current season performance and the team w/l records thus far reflect that current season performance.  The teams’ w/l records thus far tell us nothing we don’t already know and even if they did (intangibles?) there is too much noise to be able to do much of anything with it anyway.

Let’s say that a third order (using updated player projections prorated for their actual playing time this year and whether they have been playing injured or not) w/l gives us 25 and 20 for a team and they are only 22 and 23.  I guess I don’t mind subtracting .5 win from my projection for them to account for “intangibles” but I would not be comfortable putting any more weight on their actual record than that.


#19    tangotiger      (see all posts) 2007/05/25 (Fri) @ 22:10

If you follow the BattersBox thread, one thing that the current W/L record tells us is the likelihood that a team will try to improve itself.  In essence, is a team a buyer or a seller?  And you can even add more parameters, because a team like the Yanks will always be a buyer, if they have even a 5% chance to make it.  Some other team on the other hand might need to be a 15%-20% chance to make it to be considered a buyer.


#20    mgl      (see all posts) 2007/05/26 (Sat) @ 03:19

Of course most people overestimate the impact of players acquired.  At the trade deadline even an impact player is worth maybe 1 to 1.5 wins.


#21          (see all posts) 2007/05/27 (Sun) @ 20:44

At this time of year, only going 1/3 of the way towards a pre-season prediction is definitely not enough.

For hitters, even at the end of the season, I think it’s appropriate to regress slightly more than half-way to the pre-season prediction.  So at this time of year, you’d barely move off your pre-season predictions at all.

Pitchers are more complicated, because different stats should be handled differently.  The BABIP stat should be regressed almost all the way to the pre-season prediction (which, in turn, should be close to the league average).  With strikeouts, I would grant that you can make adjustments pretty quickly.  But that’s an exception.  Assuming no major injuries, at this time of year, your opinion of a team’s ability level should be more influenced by your pre-season prediction than by their performance.


#22    MGL      (see all posts) 2007/05/28 (Mon) @ 01:23

I couldn’t agree more (21)!


#23    MGL      (see all posts) 2007/05/28 (Mon) @ 01:28

If you use a 5/4/3/2 Marcel, at the end of the year, you have each player basically regressed 64% toward the pre-season projection.

As I said, if you want to project teams going forward, simply update each player’s projection and add everything up based on playing time projections.  The player projections are NOT going to change much.  What will change are the playing time projections.  Still, I would think that at this time something like 15% current team w/l records plus 85% pre-season projected records would be right, assuming you know nothing about playing time other than your estimated pre-season playing time projections.  Heck, it might be 90/10.  How many people are going to believe that?  Not too many.
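For anyone who wants to check the 64%, here is a minimal sketch. It assumes the current season carries the weight of 5, that the 4/3/2 weights (two prior seasons plus league average) together make up the "pre-season projection", and that playing time is equal across seasons so the weights alone set the shares. The team in the second example is hypothetical.

```python
def prior_share(current_weight: float, prior_weights: list[float]) -> float:
    """Share of the final estimate that comes from everything except the current season."""
    prior_total = sum(prior_weights)
    return prior_total / (current_weight + prior_total)

def blend(current_wp: float, preseason_wp: float, current_share: float) -> float:
    """Linear mix of current record and pre-season projection."""
    return current_share * current_wp + (1.0 - current_share) * preseason_wp

# 5 for the current season vs 4 + 3 + 2 for everything known before it.
print(f"regression toward pre-season: {prior_share(5, [4, 3, 2]):.0%}")

# Hypothetical team: playing .400 so far, projected .530 before the season.
print(f"15/85 blend: {blend(0.400, 0.530, 0.15):.3f}")
print(f"10/90 blend: {blend(0.400, 0.530, 0.10):.3f}")
```

The point of the second example is how little a 15/85 or 10/90 split moves the estimate off the pre-season number, even for a team that has played badly.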


#24    tangotiger      (see all posts) 2007/05/28 (Mon) @ 13:32

BP still has the Yanks at a 29% chance of making the playoffs.

The AL wild card team is expected to average around 92 wins, meaning the Yanks would need to go around 71-43 (.623) the rest of the way.

They need to play “around” .623.  Since BP has them as a true .573, that target level is around 1 SD higher than their “true” mean.  While they will play at least .623 about 16% of the time, that’s not exactly the target level.

That’s not to say they need to play at least that.  After all, if they win 65 games the rest of the way, they have a certain chance of making the playoffs, and at 66 more wins, they have a slightly higher chance, etc.
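A quick sketch of where "around 1 SD" comes from, using a normal approximation to the binomial. The 114 remaining games is inferred from the 71-43 target, each game is treated as an independent trial at the team's true wp, and the function name is hypothetical; this is not BP's simulation.

```python
import math

def tail_probability(true_wp: float, target_wp: float, games_left: int) -> tuple[float, float]:
    """Return (z, P(win% >= target_wp)) over games_left games,
    using a normal approximation with binomial standard deviation."""
    sd = math.sqrt(true_wp * (1.0 - true_wp) / games_left)
    z = (target_wp - true_wp) / sd
    upper_tail = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))
    return z, upper_tail

z, p = tail_probability(true_wp=0.573, target_wp=0.623, games_left=114)
print(f"target is {z:.2f} SD above the true mean; P(at least .623) = {p:.1%}")
```

The z-score comes out slightly above 1, so the tail is a bit under the 16% that goes with exactly 1 SD. As noted above, this is still not the playoff probability itself, since fewer wins can be enough in some seasons and more can fall short in others.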


#25          (see all posts) 2007/05/28 (Mon) @ 14:50

Before yesterday’s games (where Boston won and the Yankees lost) the THT projections said 22%, and the market says 18% (this is for winning the division, and doesn’t factor in the wild card)

http://www.hardballtimes.com/main/article/the-state-of-the-al-east/


#26    tangotiger      (see all posts) 2007/05/28 (Mon) @ 17:39

No offense John, but this statement makes every single Expos fan sick every time we hear it:
“Last year we saw one of the great streaks in sport come to an end: namely the Atlanta Braves’ run of 14 consecutive division championships”

Just because Bud Selig didn’t declare a division winner doesn’t mean that 1994 didn’t exist.  The Expos won that division.

In any case, what the Braves did was not so “great” in “sport”.  Just in baseball.


#27          (see all posts) 2007/05/28 (Mon) @ 17:50

Actually offense intended! My excuse is that I’m a Braves fan, you see, and I was being deliberately provocative and jestful when I wrote that sentence.

I don’t want to rehash old arguments on the 1994 division. All I’ll say is I agree that you can’t call it a Braves win.

In seriousness, I don’t want you (or anyone else) to take offense so sorry if that was the case.


#28    Anthony      (see all posts) 2007/05/28 (Mon) @ 19:06

This thread made me very curious so I wanted to check out first half/second half splits for 2006. I didn’t know an easy way of extracting that from Baseball-Reference in one shot, so I just copied the Pre-All Star top 40 in OPS from ESPN’s stat page.

I also have the 2006 PECOTA spreadsheet, so I compared first half/second half/PECOTA PA, AVG, OBP, SLG, OPS (actually for the 2006 stats from ESPN I did AB+BB instead of PA). I weighted each player by the lesser of the PA, and came up with this:

Pre-All Star: .311/.400/.565/.965
Post-All Star: .295/.382/.522/.904
PECOTA: .282/.366/.498/.865

So that would mean 60% regression to the projection at the All Star Break, (if I’m doing this correctly). Does that sound right?
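Here is the same calculation spelled out for each component of the slash line, as a sketch under the assumption that "regression" means the fraction of the first-half-to-PECOTA gap that the second half gave back. The function name is hypothetical; the numbers are the ones listed above.

```python
def regression_fraction(first_half: float, second_half: float, projection: float) -> float:
    """Fraction of the way the second half moved from the first-half
    level back toward the pre-season projection."""
    return (first_half - second_half) / (first_half - projection)

# (Pre-All Star, Post-All Star, PECOTA) for the top-40 OPS group.
lines = {
    "AVG": (0.311, 0.295, 0.282),
    "OBP": (0.400, 0.382, 0.366),
    "SLG": (0.565, 0.522, 0.498),
    "OPS": (0.965, 0.904, 0.865),
}

for stat, (pre, post, proj) in lines.items():
    print(f"{stat}: {regression_fraction(pre, post, proj):.0%} regression to projection")
```

OPS lands at about 61%, with the individual components falling somewhat above and below that. Keep in mind the group was selected for a hot first half, so the sample is cherry-picked by construction.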


#29    Tangotiger      (see all posts) 2007/05/29 (Tue) @ 10:11

John: I take offense as an Expos fan, but my “no offense” statement was intended to say that it wasn’t directed to you directly.  If the 1994 season didn’t exist at all, your “consecutive” division title statement would be fine. 

It is unreasonable to have two-thirds of a season played to be equivalent to zero games.  The season did exist, the writers handed out MVP awards, and some 50-million fans went to games.  What Bud Selig decides for record-keeping purposes to call the season is irrelevant. Observers record history, not the generals (though they can influence it).

Montreal 74-40, +131 runs
Atlanta 68-46, +97 runs


#30          (see all posts) 2007/05/29 (Tue) @ 10:19

Compared to what I found when I looked at this, 60% All-Star break sounds a little low.  But it’s certainly not too far off, and I admit I haven’t looked at this in a couple of years.

It might be simplest to combine the player’s current stat line with N plate appearances at the projection level.  Then, the question becomes, what’s a good value for N?  Obviously, it depends on the quality of the original projection.  To be consistent with what I’ve posted upthread, I’ll say that if you have a good opening projection, I think N should be in the neighborhood of 800.
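A minimal sketch of that idea: pad the player's current line with N plate appearances at the projected level. The function names and the example player are hypothetical; N = 800 is the value suggested above.

```python
def updated_projection(current_stat: float, current_pa: int,
                       projection: float, n: int = 800) -> float:
    """PA-weighted mix of the current-season stat and n PA at the projected level."""
    return (current_stat * current_pa + projection * n) / (current_pa + n)

def regression_share(current_pa: int, n: int = 800) -> float:
    """Share of the updated estimate that still comes from the projection."""
    return n / (current_pa + n)

# Hypothetical hitter: projected .850 OPS, hitting .950 through 200 PA.
print(f"updated OPS: {updated_projection(0.950, 200, 0.850):.3f}")
print(f"regression at 200 PA: {regression_share(200):.0%}")
print(f"regression at 650 PA: {regression_share(650):.0%}")
```

With N = 800, a full season of roughly 650 PA still leaves the estimate slightly more than half-way to the projection, which matches the claim upthread.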


#31          (see all posts) 2007/05/29 (Tue) @ 10:55

Tango

I think it is very difficult for even the most ardent Braves fan to argue their corner.

The Expos were the better team and I guess had a 70-80% probability of winning when play was halted.

As far as I am concerned the Expos should be the 1994 division champions. The fact that they are not is just a convenient fact for Braves fans pointing at the supposed streak.


#32    Rally      (see all posts) 2007/05/29 (Tue) @ 12:39

Give the 94 Expos their due, and just call it consecutive playoff berths for the Braves.  They may not have played one, but they would have had the wild card.


#33    MGL      (see all posts) 2007/05/29 (Tue) @ 17:35

You can come up with an average value for N I suppose, but it completely depends on the number of prior PA the projection is based on, although…

The projection is also a proxy for the regression to the mean, so maybe the N IS pretty stable for all players.

IOW, if we have a player who had a .950 OPS in only 100 PA prior to the season, we would project him at only a little above average anyway, so if he has a .950 OPS again after half a season, our updated projection is only going to be a little more above average than the pre-season one, so we might in fact use 800 for our N.

Let’s say we had a player with a few thousand pre-season PA, and he was a career .950 hitter.  We might project him at, say, .930.  If he is also .950 after a half season, our new projection is going to be a shade over .930, so using 800 for our N might work as well.

Where you would want to use much different N’s, depending on the number of the career PA of the player, is when you combine current season stats with raw prior stats (not pre-season projections) and THEN do the regression toward the mean.  This is of course the better (cleaner?) way to do the updated projections.


#34    Ken Roberts      (see all posts) 2007/05/30 (Wed) @ 22:54

I’ve created yet another Monte Carlo site:
(Click on my name)
It only uses the flip a coin algorithm, but I think it presents the results in an interesting way.  I’m trying to answer the questions: “How well do I need to finish out the season to make the playoffs” and “What games matter most today”
I’d love any thoughts you might have.
Thanks, Ken.
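For readers unfamiliar with the coin-flip approach, here is a minimal sketch of the idea for a single team. It is not Ken's code: the function name, the trial count, the fixed seed, and the example team (60 games left, needing 35 wins) are all assumptions, and a real site would simulate the full league schedule rather than one team against a fixed win target.

```python
import random

def coin_flip_odds(games_left: int, wins_needed: int,
                   trials: int = 20000, seed: int = 1) -> float:
    """Estimate the chance of winning at least wins_needed of games_left
    games when every game is a 50/50 coin flip."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        wins = sum(1 for _ in range(games_left) if rng.random() < 0.5)
        if wins >= wins_needed:
            hits += 1
    return hits / trials

# Hypothetical team: 60 games left, needs 35 more wins to reach its target.
print(f"chance of reaching the target: {coin_flip_odds(60, 35):.1%}")
```

Swapping the 0.5 for a per-team win probability is what turns this "dumb" version into the "smart" flavors discussed earlier in the thread.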


#35    Tangotiger      (see all posts) 2007/06/01 (Fri) @ 13:04

A good post that goes through some of the nuts and bolts to see why the Yanks are still in it:

http://jeteupthemiddle.blogspot.com/2007/05/odds-of-making-playoffs.html


#36    dackle      (see all posts) 2007/06/01 (Fri) @ 14:02

Ken, great site. Do you keep a log of the impact of games already played? It would be interesting to see a list of a season’s 10 biggest games.


#37    Ken Roberts      (see all posts) 2007/06/07 (Thu) @ 09:59

Tangotiger, I read jeteupthemiddle’s post, thanks.  I get the gist but not the math.  It’s time for me to buy your book :)

Dackle, I’ll add your great idea.


#38    Tangotiger      (see all posts) 2007/06/07 (Thu) @ 14:17

“The average number of wins for the Wild Card since xxx is yy.”  You hear that all the time.

Along comes this genius
http://sonsofsamhorn.net/index.php?s=&showtopic=19331&view=findpost&p=796335

...let’s look at the minimum number of wins it would have taken to win the WC in those years, in other words, the number of wins by the team that finished 2nd in the WC + 1.
...
The average number of wins it takes to win the WC in the last 6 years is 91.8 wins, so let’s say 92 wins, and it has never taken more than 94 wins to get the WC in any of those years.

I wish I could have thought to say all that.  If you check out Clay’s latest, he shows this as his forecast:

Average wins by AL Wild Card:  92.4

So, that’s what we’re really after.  92 wins.  Thanks Ananti!


#39          (see all posts) 2007/06/13 (Wed) @ 17:09

Does anyone know where I can find the thread MGL refers to in comment #1?



#41    rluzinski      (see all posts) 2007/06/14 (Thu) @ 12:51

I know I tried to figure out how much BP was regressing the results of W3 towards their preseason PECOTA projections and it didn’t make much sense.  At one point (haven’t checked lately), the Brewers’ future expected winning % was greater than both their W3 win% and their preseason PECOTA projection!  Could future SOS completely explain that?  I know the NL Central is relatively weak but…

The fact that the Brewers’ offense has so many young hitters in prominent roles (Hart, Prince, Braun, Hardy, Weeks) illustrates just how wrong it is to regress all teams the same amount, anyway.  Heck, trying to project the Brewers offense right now is pretty much a crap shoot.

