Monday, September 22, 2008
MLB Playoff Race
One more site for odds of making the playoffs. His help page doesn’t describe the method of figuring the true talent level of the team, which is of course where the whole ball of wax is.
Tom, thank you for this post. Who knows why I’m not more specific on the help page. Here is how I compute the weighted MLB:
short version:
Log5 using pythagenpat for team record instead of actual team record (goals for/against, with an exponent of total goals per game raised to the .287 power; I don’t use the more accurate version based on number of singles, doubles, etc. because I don’t have that data handy). 4% home field advantage. I do not throw in a regression to the mean (or to anything else) like BP does, simply because I have not gotten around to learning how. (I trust adding in a regression matches the historic data better? I was confused about that because it seems using pythagenpat might “overlap” that correction and cause you to overcompensate.)
And MGL, that is a good point, the kind of detail you don’t think about until you sit down to do it. I do recompute all this after each simulated game. It is slower but makes more sense to me. I’m not sure how much this matters either, or what BP or coolstandings do.
That’s it, nothing original. I hope I’m adding value with the way I present the data.
exact version:
public double PythagenPat
{
    get
    {
        // No games yet: treat the team as average.
        if (GamesPlayed == 0)
        {
            return .5;
        }
        if (GoalsFor == 0)
        {
            return 0;
        }
        // PythagenPat exponent: (total goals per game) ^ .287
        double exponent = Math.Pow((GoalsFor + GoalsAgainst) / (double)GamesPlayed, .287);
        return 1 / (1 + Math.Pow(GoalsAgainst / (double)GoalsFor, exponent));
    }
}
public const int Log5Const = 10000000;
public int Log5ChanceHomeWins
{
    get
    {
        // To simulate the normal 4% home-field advantage, the home team
        // gets a .020 point bonus, while the visitors take a 0.020 penalty.
        double h = Home.PythagenPat + .02;
        double a = Away.PythagenPat - .02;
        double ha = h * a;
        // Log5: h(1 - a) / (h(1 - a) + a(1 - h)), scaled to 0..Log5Const
        double denominator = h + a - (2 * ha);
        if (denominator == 0)
        {
            return Log5Const / 2;
        }
        else
        {
            return (int)((Log5Const * (h - ha)) / denominator);
        }
    }
}
protected override void Play(Game g)
{
    switch (PlayType)
    {
        case PlayType.CoinFlip:
        {
            int t = Program.Rand.Next(100);
            g.HomeScore = t / 10;
            g.AwayScore = t % 10;
            while (g.HomeScore == g.AwayScore)
            {
                g.Overtime = 1;
                if (Program.Rand.Next(2) == 0)
                {
                    g.HomeScore++;
                }
                else
                {
                    g.AwayScore++;
                }
            }
            break;
        }
        case PlayType.Log5:
        {
            int t = Program.Rand.Next(Game.Log5Const);
            // Log5ChanceHomeWins will be 0 - Log5Const inclusive
            if (t < g.Log5ChanceHomeWins)
            {
                // Home wins
                // 1 - 10
                g.HomeScore = (t % 10) + 1;
                // HomeScore minus number between 1 and HomeScore (inclusive)
                g.AwayScore = g.HomeScore - ((t % g.HomeScore) + 1);
            }
            else
            {
                // Away wins
                // 1 - 10
                g.AwayScore = (t % 10) + 1;
                // AwayScore minus number between 1 and AwayScore (inclusive)
                g.HomeScore = g.AwayScore - ((t % g.AwayScore) + 1);
            }
            break;
        }
    }
}
Yes, you’ve got a pretty good presentation for sure, and I also like the “look-ahead” feature.
Sorry, I’m not very statistically savvy. What is the log 5/odds ratio method? How often will a true .550 team beat a true .500 team, and is the probability of winning a seven-game series greater than winning a single game for the better team? Thanks.
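For anyone else wondering the same thing, here is a minimal sketch of the arithmetic behind both questions. It assumes the standard Log5 formula, independent games, and no home-field adjustment; the class and method names are hypothetical and are not taken from Ken's site.

```csharp
using System;

public static class Log5Example
{
    // Log5 / odds-ratio estimate of the chance that a team with true
    // winning percentage a beats a team with true winning percentage b.
    public static double Log5(double a, double b)
    {
        double num = a * (1 - b);
        double den = num + b * (1 - a);
        // Degenerate matchups (both 0 or both 1) are treated as a toss-up.
        return den == 0 ? 0.5 : num / den;
    }

    // Chance of winning a series that needs winsNeeded wins (4 for a
    // best-of-seven) when each game is won independently with probability p.
    public static double SeriesWinProbability(double p, int winsNeeded)
    {
        double q = 1 - p;
        double total = 0;
        // Number of orderings with the clinching win last:
        // C(winsNeeded - 1 + losses, losses), starting at losses = 0.
        double ways = 1;
        for (int losses = 0; losses < winsNeeded; losses++)
        {
            total += ways * Math.Pow(p, winsNeeded) * Math.Pow(q, losses);
            ways = ways * (winsNeeded + losses) / (losses + 1);
        }
        return total;
    }

    public static void Main()
    {
        double single = Log5(0.550, 0.500);
        double series = SeriesWinProbability(single, 4);
        Console.WriteLine($"single game: {single:F3}, best-of-seven: {series:F3}");
    }
}
```

Under those assumptions a true .550 team beats a true .500 team 55% of the time, and wins a best-of-seven about 61% of the time, so yes, the longer series favors the better team more than a single game does.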
Ken, updating each team’s wp after each simulated game adds nothing to the model (if that is what you mean; IOW, if a team’s pythag wp is .550 and in the sim they win the next game, the wp you use for the following game is a little higher than .550). In fact, it may introduce noise such that the results are less accurate than just using a fixed wp for every game that you are simming.
BTW, Ken, it would not be all that difficult to regress the pythag wp for each team to get closer to a team’s true wp. One line of code.
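One common way to do that kind of regression, sketched here as an illustration only, is to add some number of games of .500 ball to the team's record before computing its wp. The method name and the 70-game figure below are assumptions for the example; the thread does not give the constant, and this is not necessarily what BP does.

```csharp
using System;

public static class RegressionExample
{
    // Regress an observed (or Pythagorean) winning percentage toward .500 by
    // blending in regressionGames of .500 ball.
    // regressionGames is a hypothetical tuning constant, not a value from the thread.
    public static double RegressToMean(double wp, int gamesPlayed, double regressionGames)
    {
        double weight = gamesPlayed + regressionGames;
        if (weight <= 0)
        {
            return 0.5; // no information at all: assume an average team
        }
        return (wp * gamesPlayed + 0.5 * regressionGames) / weight;
    }

    public static void Main()
    {
        // A .550 team after 100 games, regressed with an assumed 70 games of .500 ball.
        Console.WriteLine(RegressToMean(0.550, 100, 70).ToString("F3"));
    }
}
```

The fewer games a team has played, the harder its wp gets pulled toward .500, which is the usual reason for regressing early-season records more than late-season ones.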
Ken has a nice site, a personal favourite, particularly the “big games” section—very handy if you like to watch or listen to a lot of out-of-town contests.
The coin-flip approach to the sim is nice in the sense of quantifying the value of the lead itself. A bit like the way a WPA table of two theoretical .500 teams is useful. It would be interesting if visitors to the site could input their own winning percentages.
Ken, if you’re reading this post, I hate to keep bugging you, but any chance of a list of the “big games” of the year? For example, here are the 10 biggest games of last season based on the total change in BP’s playoff odds after the game (along with estimated average leverage index). “Vis chg” of 43.0 means the visitor’s playoff odds increased or decreased by 43.0 following the game:
Date    Visitor    Home      Score  Av LI  Vis chg  Hom chg  Total
Oct 1   Padres     Rockies   8-9    1.73   43.0     43.0     86.1
Sep 30  Marlins    Mets      8-1    0.38    0.0     56.3     56.3
Sep 28  D-Bags     Rockies   4-2    1.20   17.1     29.5     46.7
Sep 30  Nationals  Phillies  1-6    0.80    0.0     44.3     44.3
Sep 30  D-Bags     Rockies   3-4    1.32    0.0     44.1     44.1
Sep 28  Marlins    Mets      7-4    0.99    0.0     37.5     37.5
Sep 29  Marlins    Mets      0-13   0.29    0.0     35.3     35.3
Sep 30  Padres     Brewers   6-11   0.78   32.1      0.0     32.1
Sep 28  Padres     Brewers   6-3    1.19   25.0      7.1     32.1
Sep 29  Nationals  Phillies  4-2    0.99    0.0     28.2     28.2
The best game before September (and interestingly it was also the best American League game but only ranked 17th overall) was Yankees/Tigers on August 24 (http://www.baseball-reference.com/boxes/DET/DET200708240.shtml). Yanks fell three off the wild card, Tigers moved within 1-1/2 of the division lead. 2-0 Detroit after 1, Yankees scored in the second and third to take a 3-2 lead, Tigers went ahead 4-3 in the 3rd and 6-3 in the 4th, Yanks tied it 6-6 in the 5th. Game remained scoreless until Carlos Guillen hit a 3-run shot with two out in the bottom of the 11th. Average LI = 1.71.
I should also note that the average “total change” for every game played last year was 4.46. So the Padres/Rockies playoff game at 86.1 would have a mind-blowing “leverage index” on a game level of 86.1/4.46 = 19.3!
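That game-level figure is just a game's total change divided by the season-average total change. A minimal sketch of the arithmetic, with hypothetical method names and the two numbers quoted above as inputs:

```csharp
using System;

public static class GameLeverageExample
{
    // Total swing in playoff odds for one game: the size of the visitor's
    // change plus the size of the home team's change.
    public static double TotalChange(double visitorChange, double homeChange)
    {
        return Math.Abs(visitorChange) + Math.Abs(homeChange);
    }

    // Game-level "leverage": this game's total change relative to the
    // average total change over all games.
    public static double GameLeverage(double totalChange, double averageTotalChange)
    {
        return totalChange / averageTotalChange;
    }

    public static void Main()
    {
        // 86.1 and 4.46 are the figures quoted in the comment above.
        Console.WriteLine(GameLeverage(86.1, 4.46).ToString("F1"));
    }
}
```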
Thanks Tom.
Mitchel, you are following me. If a team’s wp is .550 now and they win the first sim game, it will be higher than .550 for their second simmed game of that simmed season (each simmed season resets back to .550, of course). I’d have to take your word on that being a bad idea. And you’re teasing me about the single line; when I figure it out I might run it by you. I should have paid more attention in stats class.
Dackle, you’re not bugging me at all, I just have not gotten around to showing the biggest games. I did not realize BP broke down the odds by game also.
As far as picking a wp, you can sort of use a team’s “what if” section to do that. But you can’t tweak more than 1 team at a time. That would be nice.
From the website:
SportsClubStats.com calculates each teams odds of making the playoffs, how each upcoming game will impact those odds, and how well they have to finish out to have a shot. It knows the season schedule and scores for past games. Each night it grabs any new scores from the internet and simulates the rest of the season by randomly picking scores for each remaining game. The weighted version for baseball takes the opponents record into account when randomly picking scores, so the better team is more likely to win. All other sports use 50/50, which gives each opponent an equal chance of winning (or tying if the sport allows it) each game.
So, it has two options. In one option, it assumes all remaining games are a coin flip. Whether it adjusts for home/away, it does not say.
The other option uses each team’s record to date as their true wp (I assume that that number is static in the sim - i.e., it does not change as the sim plays out the rest of the season, not that that should matter much). Again, I don’t know whether a HFA is added to that and I also don’t know whether he uses a (proper) log 5/odds ratio method for the team matchups to determine each team’s chances of winning the game given each of their true wp.
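To make the two options concrete, here is a rough sketch of that kind of simulation, not the site's actual code. It assumes each team's remaining games are decided independently with a fixed per-game win probability (0.5 for the coin-flip option), ignores the real schedule and head-to-head matchups, and breaks ties for first at random. All names and inputs are hypothetical.

```csharp
using System;

public static class SeasonSimExample
{
    // Estimate how often each team finishes with the most wins.
    // wins, gamesLeft and winProb are hypothetical inputs, one entry per team.
    public static double[] FirstPlaceOdds(int[] wins, int[] gamesLeft, double[] winProb,
                                          int seasons, Random rand)
    {
        int n = wins.Length;
        double[] firsts = new double[n];
        for (int s = 0; s < seasons; s++)
        {
            int best = -1;
            int bestWins = -1;
            int tied = 0;
            for (int t = 0; t < n; t++)
            {
                // Play out this team's remaining games.
                int total = wins[t];
                for (int g = 0; g < gamesLeft[t]; g++)
                {
                    if (rand.NextDouble() < winProb[t])
                    {
                        total++;
                    }
                }
                if (total > bestWins)
                {
                    bestWins = total;
                    best = t;
                    tied = 1;
                }
                else if (total == bestWins)
                {
                    // Random tie-break stands in for a one-game playoff.
                    tied++;
                    if (rand.Next(tied) == 0)
                    {
                        best = t;
                    }
                }
            }
            firsts[best]++;
        }
        for (int t = 0; t < n; t++)
        {
            firsts[t] /= seasons;
        }
        return firsts;
    }

    public static void Main()
    {
        // Two teams one game apart with six games left each, coin-flip option.
        double[] odds = FirstPlaceOdds(new[] { 88, 87 }, new[] { 6, 6 },
                                       new[] { 0.5, 0.5 }, 100000, new Random(1));
        Console.WriteLine($"{odds[0]:F3} {odds[1]:F3}");
    }
}
```

Swapping the 0.5 entries for each team's estimated wp gives the weighted flavor, although a real version would compute a per-game probability for each matchup rather than one number per team.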