THE BOOK--Playing The Percentages In Baseball


Monday, August 02, 2010

God and .500

By Tangotiger, 01:19 PM

Suppose that God herself came to you and told you that she was going to do something devious: for the 2011 baseball season, every team would have 25 players of identical talent, with all 30 teams being equals.  That no player would learn from each plate appearance, and no player would get hurt.  That no player will even interact on the most basic level with each other, and if they did, it would be indistinguishable from any other interaction. No player would even age, having been frozen in time.  All the games would be played in the same city, at the same time, in 15 identical ballparks, domed, and staffed by identical groundskeepers and HVAC guys.  This is the most controlled science experiment ever: nothing can possibly ever change.

And after each team plays 162 games, you will get ten teams winning between 78 and 84 games, with ten winning fewer than 78 and 10 winning more than 84.  All that would happen based purely on the only thing that differentiates each of the thirty teams and 750 ballplayers: luck.

We expect the distribution of wins to have one standard deviation equal to 6.4 wins.  Imagine, one team will have 91 wins, and another will have 71 wins, and they are identical in all respects!  They are identical because God told you so: if they played each other one million times, they’d each have a .500 record.  But 162 games is not a million.
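The 6.4 figure can be checked in a few lines. This is a minimal sketch, assuming every game is an independent coin flip (the binomial model that comes up in the comments); the name `luck_sd` is only for illustration.

```python
import math

def luck_sd(games: int, p: float = 0.5) -> float:
    """SD of wins when each game is an independent win with probability p."""
    return math.sqrt(games * p * (1.0 - p))

sd_162 = luck_sd(162)          # about 6.36 wins
z_91 = (91 - 81) / sd_162      # how many SDs a 91-win season sits above 81
print(round(sd_162, 2), round(z_91, 2))   # 6.36 1.57
```

So 91 and 71 wins are each only about 1.6 standard deviations from 81, which is why a 30-team league of identical teams can plausibly produce both.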

***

As it turns out, in reality, you don’t have ten of thirty teams that win between 78 and 84 games.  In the 12 seasons from 1998 to 2009, you had an average of ten teams that won fewer than 75 games, ten teams that won more than 87 games, and ten teams that won between 75 and 87 games.  The distribution of actual wins is one standard deviation of 11.8 wins.  INCLUDED in that 11.8 wins is the luck portion of 6.4 wins.  The difference between the two is (11.8^2-6.4^2)^0.5, or 10.0 wins.
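The subtraction works on squared spreads because independent sources of variation add as variances. A small sketch, assuming luck and talent are independent; `talent_sd` is a hypothetical helper name.

```python
import math

def talent_sd(observed_sd: float, luck_sd: float) -> float:
    """Back out the non-luck spread from the observed spread (variances add)."""
    return math.sqrt(observed_sd**2 - luck_sd**2)

spread = talent_sd(11.8, 6.4)
print(round(spread, 1))   # 9.9, i.e. roughly 10 wins
```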

That is, when you look at the won-lost records of baseball teams, 60% of that is the talent and other vagaries of the participants, and 40% of that is luck.  If you have a team that wins 91 games (+10 wins above average), that could be because it was +6 because of talent and +4 because of luck, or +12 because of talent and -2 because of bad luck, or -5 because of talent and +15 because of fantastic luck.

For any one team, we don’t know how to work backwards from the W/L record to its talent.  All we can do is make a best guess, and have a huge uncertainty level around that guess.  Since we already know that two identical god-told teams can win 91 games and 71 games while both being true 81-win teams, do you see how hard it would be to take a team that won 86 games and another that won 76 games and say that the 86-win team is in fact better than the 76-win team?
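One way to put numbers on "best guess" and "huge uncertainty" is standard regression toward the mean. This is only a sketch under assumptions that are not in the post: talent and luck are both treated as normal and independent, the league mean is 81, and the function names are made up for illustration.

```python
from statistics import NormalDist

TALENT_VAR = 10.0**2   # talent spread from the post (1 SD = 10 wins), squared
LUCK_VAR = 6.4**2      # luck spread from the post (1 SD = 6.4 wins), squared

def best_guess(wins: float, mean: float = 81.0):
    """Return (estimated true-talent wins, SD of the uncertainty around it)."""
    shrink = TALENT_VAR / (TALENT_VAR + LUCK_VAR)   # weight given to the observed record
    estimate = mean + shrink * (wins - mean)
    uncertainty = (TALENT_VAR * LUCK_VAR / (TALENT_VAR + LUCK_VAR)) ** 0.5
    return estimate, uncertainty

def prob_first_is_better(wins_a: float, wins_b: float) -> float:
    """Chance team A's true talent exceeds team B's, treating the two as independent."""
    est_a, sd = best_guess(wins_a)
    est_b, _ = best_guess(wins_b)
    gap = NormalDist(est_a - est_b, sd * 2**0.5)    # distribution of the talent gap
    return 1.0 - gap.cdf(0.0)

print(best_guess(91))                 # roughly 88 wins, give or take about 5
print(prob_first_is_better(86, 76))   # a bit above 0.8
```

Under these assumptions the 86-win team is probably the better one, but far from certainly.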

Sports, life actually, is played by unique persons.  But luck, timing, and good/bad breaks play a huge role in the outcomes.  Not everything is luck.  But not everything is talent either.


#1    lincolndude      (see all posts) 2010/08/02 (Mon) @ 14:33

Thanks Tango.  This is fantastic and so easy to understand.


#2    sharpie      (see all posts) 2010/08/02 (Mon) @ 14:46

How did you get that 60% of win-loss record is talent and 40% is luck? I followed up until that sentence.


#3    Devon & His 1982 Topps blog      (see all posts) 2010/08/02 (Mon) @ 14:48

Aaahhhh, I don’t believe in luck being “by chance”. I agree with Branch Rickey, that luck is just the residue of design. That being so… then an 81 win team could become a 91 win team given the right design by the manager & coaches....or they could become a 71 win team based on it.


#4          (see all posts) 2010/08/02 (Mon) @ 14:53

I presume the variance from “luck” is simply the binomial distribution for a game where one team has to win, the odds are 50-50 for each game, and there are 162 trials.

Tango then looked at the actual distribution of wins and found that the variance was larger.

Using the fact that
Variance (total) = Variance (luck) + Variance (things other than luck)
and the fact that variance adds quadratically, he found that luck was 40% of the total variance, thus everything else was 60%.


#5    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 14:54

Sharpie/2: the spread in talent is one SD = 10, and the spread in luck is one SD = 6.4.  That puts the talent spread about 1.5 times wider than the luck spread, or 60/40.


#6    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:01

Devon/3: you completely ignored the first half of my post.


#7    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:03

Mike: close.  The binomial accounts for 30% of the variance, but we want to measure wins, not wins-squared.
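The two percentages in this exchange measure different things, which a quick calculation makes visible. A sketch using only the figures quoted in the post; the variable names are illustrative.

```python
talent_sd, luck_sd, total_sd = 10.0, 6.4, 11.8   # figures quoted in the post

sd_share_luck = luck_sd / (talent_sd + luck_sd)   # share of the spread in wins: the "40%"
var_share_luck = luck_sd**2 / total_sd**2         # share of the variance: the "30%"
print(round(sd_share_luck, 2), round(var_share_luck, 2))   # 0.39 0.29
```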


#8    David Pinto      (see all posts) 2010/08/02 (Mon) @ 15:04

I wrote a simulator that presents a nice demonstration of these points.


#9          (see all posts) 2010/08/02 (Mon) @ 15:13

In a post about luck you tell us that we WILL see ten teams below 78 wins, ten teams between 78 and 84, and ten teams above 84.

How does that work out?

Mightn’t we see every team get 81 wins, every team between 78-84, or 15 teams below 78 and 15 teams above 84?


#10          (see all posts) 2010/08/02 (Mon) @ 15:13

I don’t buy it. The Orioles would still find a way to win fewer than 70 games.


#11    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:25

Matt/9: Ok, if we had one million of such leagues (rather than just one league), then we will see the one-third, one-third, one-third breakdown, give or take a tiny bit.  I’m not sure if you were being pedantic or precise.
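A rough way to see the difference between one league and many is to simulate it. This is a sketch, assuming each team's 162 games are independent coin flips and ignoring that one team's win is another's loss; `bucket_fractions` is a made-up name. Because win totals are whole numbers, the exact fractions depend on whether the endpoints 78 and 84 are counted in the middle bucket (they are counted there in this sketch).

```python
import random

def bucket_fractions(leagues: int, teams: int = 30, games: int = 162, seed: int = 0):
    """Fractions of team-seasons below 78, from 78 to 84 inclusive, and above 84 wins."""
    rng = random.Random(seed)
    low = mid = high = 0
    for _ in range(leagues * teams):
        # Each of the `games` random bits is one coin-flip game; the ones are wins.
        wins = bin(rng.getrandbits(games)).count("1")
        if wins < 78:
            low += 1
        elif wins <= 84:
            mid += 1
        else:
            high += 1
    total = leagues * teams
    return low / total, mid / total, high / total

print(bucket_fractions(leagues=1))      # a single league can stray from an even split
print(bucket_fractions(leagues=2000))   # many leagues settle toward stable fractions
```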


#12    B      (see all posts) 2010/08/02 (Mon) @ 15:26

I thought this was on topic, so I just thought I’d share the link:

http://contenta.mkt1710.com/lp/26966/115068/Untangling%20Skill%20and%20Luck.pdf


#13    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:30

David/8: excellent.


#14    Miss Awgennist      (see all posts) 2010/08/02 (Mon) @ 15:30

This post is obviously incorrect because God is a man (or, alternatively, if God were a female, she would not care about baseball).


#15    Nick Steiner      (see all posts) 2010/08/02 (Mon) @ 15:35

Well since God is Albert Pujols he must be a man, or half man half unicorn or something else awesome like that.


#16    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:36

B/12: beautiful, excellent piece.  I encourage all to read it.  (I’m just on page 6, and I really like it.)


#17    Tangotiger      (see all posts) 2010/08/02 (Mon) @ 15:39

Whoah, they actually cite one of my threads (page 7, footnote 15).  Thank you Michael Mauboussin.


#18          (see all posts) 2010/08/02 (Mon) @ 16:14

@#15: Obviously if God is a half man half unicorn, then God is Alex Rodriguez. He has the painting to prove it. http://outofbounds.nbcsports.com/A-Rod%20Centaur.jpg


#19    Jonathan Sher      (see all posts) 2010/08/02 (Mon) @ 16:17

Interesting post and a thoughtful way to quantify the importance of luck in outcomes.

I have one suggestion - perhaps you can define what you mean by the flip side of luck, which you have labeled talent. It seems to me that label doesn’t entirely describe the flip side of luck.

Let’s imagine a league with 20 identical rosters, the only difference being their managers, who use their rosters differently. Compare the variation with the expected distribution with identical managers.  If there is some difference, then managers have some effect on outcomes through their roster and game management, and that effect would not fall neatly on the side of luck or talent.

None of that takes from your central point about the importance of luck.


#20          (see all posts) 2010/08/02 (Mon) @ 16:20

Tango/7, yes.  I did leave out taking the square root of the variance, didn’t I?


#21          (see all posts) 2010/08/02 (Mon) @ 16:21

I’ve always thought the two most obvious and therefore least discussed things about baseball are these:  the average of all teams’ records will always be .500, and the most likely result of a plate appearance is nearly always an out.


#22    Nick      (see all posts) 2010/08/02 (Mon) @ 18:00

As usual, Jurassic Park puts it best: “I’m simply saying that life, uh… finds a way.”


#23    intricatenick      (see all posts) 2010/08/02 (Mon) @ 18:03

This is great stuff. I remember a statistics course in college where we did this basic example using loaded dice. You could measure the effect of loading the dice (talent) vs not loading the dice (luck).

We then had to calculate the expected value and error bars obtained by using the loaded dice (after rolling them 1000 times) in craps betting a hard eight (they were loaded to give preferential fours).


#24    Dingers      (see all posts) 2010/08/02 (Mon) @ 19:04

Cool post. 

Also wanna add I always appreciate your stuff, Mr. Tango.


#25          (see all posts) 2010/08/02 (Mon) @ 19:29

As you’ve built the model, try this:

Trial #1:  Stay true to your original model, but instead of making every team equal, disperse the talent normally amongst the 30 teams, with a standard deviation of 3% winning ability in the 30 team population ... run your model.

Trial #2:  Stay true to your original model, but instead of making every team equal, make 26 teams identical, two teams awful, and two teams terrific.  Do it so that a standard deviation of 3% winning ability exists within the 30 team population ... run your model.

....

Now use sabermetric methods to determine the value of each team using just the results; analysis of variance, ANOVA, Z-scores .. pick your term.  That math will see both of these cases as being identical. And, if we split the season, it will go for a spectacular poop when we use it to predict future results in the latter case (not in the former case, it works very well there ... if the world ever starts behaving like that, I’ll be sure to call you :D )

After ruminating on that, you might want to take a sledgehammer to “Solving DIPS”.  When you get back from that, think about Bill James.  Specifically, why is his early stuff so terrible and as naive as the assumption in trial #1?  And why is his recent stuff implicitly like an especially cynical look at trial #2?  Leopards don’t usually change their spots, but dude is off the hook in recent years.


#26    Chris Long      (see all posts) 2010/08/02 (Mon) @ 20:12

In the .500 universe Tom Tango would be writing articles where he hypothesizes a universe where the players and teams are all unequal.


#27    Sharpie      (see all posts) 2010/08/03 (Tue) @ 00:20

Tango/5: I get it now. thanks for clarifying. great stuff.


#28    Steve Marino      (see all posts) 2010/08/03 (Tue) @ 01:17

Post #10 made an excellent point which needs to be addressed (I was going to post the same kind of thing about the Astros).

Incidentally I’m a big Tango fan at Fangraphs and I never knew you had a book out, consider that another Amazon sale.


#29    heyheylbj      (see all posts) 2010/08/03 (Tue) @ 10:03

"I returned, and saw under the sun, that the race is not to the swift, nor the battle to the strong, neither yet bread to the wise, nor yet riches to men of understanding, nor yet favour to men of skill; but time and chance happeneth to them all.”

-- Ecclesiastes 9:11


#30    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 10:20

Doctors have their oath.  The mob has their oath.  The Green Lanterns have their oath.  That one should be the saberist oath.


#31    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 10:27

Tango, you are overestimating luck in real major league baseball, because the binomial distribution is not an accurate model.  The easiest way to see this in the data is the following: the binomial distribution predicts a std dev of 6.4 wins, but quality win estimators (like Pythagenpat) can predict wins to within an rms error of about 4 wins.

So what’s going on?  How do we understand this theoretically?  Basically, individual games are varying quite a lot in winning percentage, based on pitching and team matchups.  This acts to bring the uncertainty down relative to the binomial distribution for a .500 team.  To understand the principle, consider the extreme case of a .500 team with a 4-man rotation made up of two perfect pitchers who win each time out, and two awful pitchers who lose each time out.  This team will hit .500 on the nose, binomial distribution be damned.

Obviously in real baseball, it’s not so extreme, but the concept still applies.  And as such, the win variation due to luck is not well estimated by sqrt(162)/2.  The actual randomness is quite a bit smaller.
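The principle can be sketched numerically. Assuming games are independent but each has its own win probability, the season variance is the sum of p*(1-p) over games; the `mixed` schedule below is a hypothetical in-between case, not something measured.

```python
import math

def season_luck_sd(game_probs):
    """SD of season wins when game i is won with probability game_probs[i], independently."""
    return math.sqrt(sum(p * (1.0 - p) for p in game_probs))

coin_flips = [0.5] * 162                            # every game a toss-up
extreme = [1.0, 1.0, 0.0, 0.0] * 40 + [1.0, 0.0]    # two unbeatable and two hopeless starters
mixed = [0.65, 0.35] * 81                           # hypothetical milder version of the same idea

for name, probs in [("coin flips", coin_flips), ("extreme", extreme), ("mixed", mixed)]:
    # Every schedule expects 81 wins; only the luck SD differs.
    print(name, round(sum(probs), 1), round(season_luck_sd(probs), 2))
```

All three schedules average out to a .500 team, yet the luck SD runs from about 6.4 down to zero.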


#32          (see all posts) 2010/08/03 (Tue) @ 10:46

Ben/31, when you look at runs scored and allowed, you’re already incorporating some of the luck.  Yes, there is some additional luck in how those runs scored and allowed are distributed, but that is not the total amount of luck.  There is also luck involved in getting the runners on base and sequencing the offensive events to get them to come around to score, and you’re not measuring that luck in Pythagenpat.


#33    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 10:57

Ben: if I were to run a thousand DMB simulations of thirty equal teams playing each other for 162 games, are you saying we will, or will not, witness a distribution of 1 SD = 6.4?  If not, what will we see?

Now, are you also suggesting that if you look at real-life type of teams that the part that is luck will not be explained by the above distribution?


#34    mettle      (see all posts) 2010/08/03 (Tue) @ 10:59

Cross posted at fangraphs (sorry!), but I had a question:

Would the wins SD be the same if it were calculated from an at-bat level or a runs scored level? That is, if the probability of an out, single, HR, etc were simulated per at bat for each game for each match up, following the rules of baseball (extra innings, etc) would the SD on wins still be 6.4? Or if you randomly generated runs scored by each team and assigned wins accordingly, would the SD =6.4? Is that SD dependent on a normal dist of runs?

My intuition says no, so I did the following:
Simulate a 162 game season 100 times with a normal dist of runs:
avg=81.3, stdev=6.4, as expected.
Simulate a 162 game season 100 times with a chi-sq dist of runs (more realistic since you can’t score negative runs):
avg=81.2, stdev = 7.6!!!!!!

This seems like a crucial distinction with important implications for that 60/40 split (is it really 75/25?) Can someone tell me what’s right or wrong here?


#35    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 11:07

mettle: intriguing.

This may be what Ben was talking about, and it may explain other things I’ve been thinking about (such as, for example, the NFL).

We talked about this in the past at Fanhome, where the example is runners.  You have one runner who runs the 100m at 10 sec, +/- 0.20.  And another who runs the 100m at 10.1 sec, +/- 0.20, and so on… and another who runs the 100m at 11 sec, +/- 0.20.

Now, if you were to put these 11 runners in a league, you might, say, end up with the 10 sec runner with a 90% win% and the 11 sec runner with a 10% win%.  But if the two faced head-to-head, the results would not be 99% win% for the 10 sec runner, but much higher.

So, if that’s what you are talking about, then, yes, I can definitely see the point.
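The gap between the two answers can be sketched as follows. The assumptions here are mine: the "+/- 0.20" is read as one standard deviation, race times are treated as independent normals, and the 90% and 10% league win rates are simply the hypothetical figures given above.

```python
from statistics import NormalDist

def odds_ratio(a: float, b: float) -> float:
    """Odds Ratio estimate of A beating B, using only their league win rates."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))

def head_to_head(mean_a: float, mean_b: float, sd: float) -> float:
    """Chance runner A posts the lower time if both times are independent normals."""
    diff = NormalDist(mean_b - mean_a, sd * 2**0.5)   # B's time minus A's time
    return 1.0 - diff.cdf(0.0)

print(odds_ratio(0.90, 0.10))            # roughly 0.988
print(head_to_head(10.0, 11.0, 0.20))    # well above 0.999
```

The same `odds_ratio` function gives about .599 for a .550 team against a .450 team.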

***

Btw, you may prefer the Tango Distribution to model your run scoring (last two links on home page):

http://www.tangotiger.net


#36    Colin Wyers      (see all posts) 2010/08/03 (Tue) @ 11:09

Ben is right that the binomial distribution is not the correct model for this question (although I don’t think Pythag is evidence of this, necessarily).

The binomial distribution assumes that outcomes will be independent. With OBP, for instance, whether or not one hitter gets on base has no effect on whether or not another hitter gets on base. (Okay, so this is not EXACTLY so - we know that whether or not a hitter gets on base has an effect on the next hitter in the lineup, due to intentional walks and such. But that’s not a serious concern.)

Whereas for wins, random variation in one team’s win percentage has to show up in another team’s win percentage. That’s because the total quantity of wins is fixed at the league level. So random variance will be smaller than the binomial distribution predicts. The “correct” model (or at least a better one) is the hypergeometric distribution:

http://mathworld.wolfram.com/HypergeometricDistribution.html


#37    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 11:19

Mike/32, that’s a very good point.  But there is still a non-binomial aspect to the variation in wins (more precisely, the actual distribution is well estimated by a convolution of binomial distributions with different means).  How much this lowers the variation in wins is a good question, and I don’t know the answer.

Tango/33: your question isn’t yet well posed.  Are these equal teams each consisting of essentially 5 equivalent starters?  Or do the starters for these equal teams vary in quality from ace to 5th starter?  In the latter case, are the teams going to do perfect matchups of ace against ace down to 5th starter against 5th starter?  Or will the matchups be somewhat randomized?  Is the ability gap between ace and 5th starter the same or different on different teams?

Whether the std dev will be 6.4 depends on the answers to these kinds of questions.  You’ll have to make the situation highly symmetric, basically insisting that each individual game is a coin toss, to get to 6.4.


#38    mettle      (see all posts) 2010/08/03 (Tue) @ 11:21

I guess the practical question, though, is should the SD be lower (Ben’s example, which makes perfect sense) or higher (the 7.6 I got in my little simulation)?

Thanks for the runs link, btw. Very cool (even though it’s for Windows) and over 10 years old, to boot. I think the discussion is adequate to back-generate the code.


#39    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 11:29

mettle/34: if your simulation parameters are equal for the two teams, then each game outcome is still a 50/50 proposition, and so you must get a std dev of 6.4 once you’ve done enough runs.


#40    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 11:35

mettle/38: the 6.4 std dev is an upper bound.  The std dev of a binomial distribution as a function of the mean is sqrt(p*(1-p)), which is a concave function of p.

This means: if the season average p is composed of some games with a probability higher than p and some games with a probability lower than p (which it is in any remotely realistic scenario), then the convolution std dev is lower than the value right at p.
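Written out, the argument looks like this. A sketch, assuming the n games are independent with win probabilities p_1, ..., p_n and season average p-bar (notation introduced here, not used elsewhere in the thread):

```latex
\operatorname{Var}(W) = \sum_{i=1}^{n} p_i (1 - p_i)
  = n\,\bar{p}\,(1 - \bar{p}) - \sum_{i=1}^{n} (p_i - \bar{p})^2
  \le n\,\bar{p}\,(1 - \bar{p}),
  \qquad \bar{p} = \frac{1}{n} \sum_{i=1}^{n} p_i .
```

With n = 162 and p-bar = .500 the bound is 40.5, an SD of about 6.4, and it is reached only when every single game is a coin toss.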


#41    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 11:44

Ben, from the initial thread:

“every team would have 25 players of identical talent, with all 30 teams being equals”

For your purposes, presume that every player hits, fields, and runs like Carlos Beltran, and every pitcher throws like Justin Verlander, and no one learns, and no one ages.

(Or put in whatever two players you like.)


#42    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 11:49

Tango: then it’s 6.4.  But you can also do simulations of equal teams that are equal in a different way and get less than 6.4, as long as the game-to-game winning probability varies.  Which becomes relevant as soon as you try to estimate how much of MLB winning percentage is luck.


#43    Rally      (see all posts) 2010/08/03 (Tue) @ 11:53

Some people want to take a simple exercise and make it much more complicated.  If all the players are equal, and games are independent, then it matters not what your best distribution is to model run scoring.  Every single game represents exactly a 50/50 chance for either team to win.

I put a little code into a quick excel macro:

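Sub Macro1()

    ' Simulate 10,000 seasons of 162 coin-flip games for a single team.
    ' Column A accumulates wins, column B accumulates losses.
    Dim i As Long, j As Long

    Range("a1:b10000").Clear
    Randomize
    For i = 1 To 10000
        For j = 1 To 162
            If Rnd() > 0.5 Then
                Cells(i, 1) = Cells(i, 1) + 1
            Else
                Cells(i, 2) = Cells(i, 2) + 1
            End If
        Next j
    Next i

End Sub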

This will simulate one team-season 10,000 times.  The average is 81 and the SD I got running it is 6.3 - close enough.

If I have more time I could put together one that does 30 teams through a schedule, such that when one team wins another has to lose.  That might change the SD a bit, but I’m not sure.
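A rough version of that 30-team idea, as a sketch only: instead of a real schedule it assumes the teams are paired at random each day, so every team plays 162 games and every win is somebody else's loss. The function name and the 200-season run length are arbitrary choices.

```python
import random
from statistics import pstdev

def simulate_league(teams: int = 30, days: int = 162, rng=None):
    """One season: each day the teams are paired at random and every game is a coin flip."""
    rng = rng or random.Random()
    wins = [0] * teams
    order = list(range(teams))
    for _ in range(days):
        rng.shuffle(order)
        for k in range(0, teams, 2):
            a, b = order[k], order[k + 1]
            wins[a if rng.random() < 0.5 else b] += 1   # one team's win is the other's loss
    return wins

rng = random.Random(1)   # fixed seed so the run is repeatable
all_wins = [w for _ in range(200) for w in simulate_league(rng=rng)]
league_sd = pstdev(all_wins)   # spread of wins across all simulated team-seasons
print(round(league_sd, 2))
```

Each individual game is still a 50/50 proposition for both teams, so any change from the single-team figure should be small.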


#44    mettle      (see all posts) 2010/08/03 (Tue) @ 12:04

@Ben/40: Then why, if you use a chi-sq dist for runs for and runs against and simulate 100 162-game seasons do you get a SD of 7.6? Are you saying that if I did infinite runs, it would asymptote to 6.4? If so, you seem to be saying that it would asymptote from below, but this is above.


#45    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 12:08

Ben, ok, so we agree that in my baseline scenario, we have luck being 1 SD = 6.36 wins.

Now, you are suggesting that I can’t just use that figure as the luck portion in real-life baseball.  And that’s because in baseball, it’s not always a p=.500 team facing a p=.500 team.  In some cases, it might be a .550 v .450, so we have the mean for the better team as being p=.600.

So, let’s start off with these as the true win% of each team, such that had they played each of the other 29 teams a million times, they would have these results:

0.650
0.620
0.590
0.570
0.550
0.530
0.520
0.510
0.510
0.505
0.505
0.505
0.500
0.500
0.500
0.500
0.500
0.500
0.495
0.495
0.495
0.490
0.490
0.480
0.470
0.450
0.430
0.410
0.380
0.350

The standard deviation is .060.  Now, if they were to play 162 games instead, what would you expect to be the observed standard deviation?
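For reference, here is the baseline answer, sketched under the assumption that luck is pure coin-flip binomial and simply adds in quadrature to the talent spread. Whether that assumption holds is exactly what is being debated, so treat the result as the number to compare against, not as the answer.

```python
import math
from statistics import pstdev

# True win percentages listed above.
TRUE_WIN_PCT = [
    0.650, 0.620, 0.590, 0.570, 0.550, 0.530, 0.520, 0.510, 0.510, 0.505,
    0.505, 0.505, 0.500, 0.500, 0.500, 0.500, 0.500, 0.500, 0.495, 0.495,
    0.495, 0.490, 0.490, 0.480, 0.470, 0.450, 0.430, 0.410, 0.380, 0.350,
]

talent_sd_pct = pstdev(TRUE_WIN_PCT)                         # about .060
talent_sd_wins = 162 * talent_sd_pct                         # about 9.7 wins
naive_observed = math.sqrt(talent_sd_wins**2 + 162 * 0.25)   # adds coin-flip luck in quadrature
print(round(talent_sd_wins, 1), round(naive_observed, 1))    # 9.7 11.6
```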

And, more importantly, would the rules of baseball give you a different result than hockey or football?  That is, since scoring in hockey follows Poisson and baseball follows something else (let’s say the Tango Distribution), then perhaps the Odds Ratio matchup (limited only to win%) won’t apply, and so, we’re not going to get the same observed distribution in win% in each of the sports?  (Similar to say the runner example provided earlier.)


#46    Kincaid      (see all posts) 2010/08/03 (Tue) @ 12:41

mettle/44, because taking the standard deviation over 100 observations can give you pretty much anything, be it above or below the expected value.  Simulating 100 seasons is not nearly enough to give you a precise estimate of the expected SD.  If you keep running simulations of 100 seasons, you’ll end up with an average SD for all your runs of about 6.4, which means you’ll sometimes get SDs higher than that as well as lower than that.  That is the same whether you select runs from a normal distribution or a chi-square distribution.  Either way, sometimes you’ll get 7.x, and sometimes you’ll get 5.x, and on average you’ll end up with around 6.4.

If all you’re doing is picking out a pair of random numbers from a distribution and seeing which is greater, it doesn’t matter what distribution you’re choosing from, there will be a 50/50 chance for either number being bigger, and that’s the only thing that really matters in this case.


#47    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 12:47

What I did was run a league of 150 games, where each team played 5 games against the 30 teams (including themselves).

The binomial would have said 1 SD = sqrt(150)/2 = 6.12 wins.

The true talent was distributed at 1 SD = 9.00 wins.

The observed, if this was just pure Odds Ratio matchup based on the win% would have estimated sqrt(6.12^2+9^2) = 10.89 wins.

I ran 100 sims, and I got 1 SD = 10.50 wins.

So, this supports Ben’s position.

Working backwards then, the luck portion was really sqrt(10.5^2-9^2) = 5.41 wins, and not 6.12 wins.

Whether this ratio (5.41/6.12, or 88%) is a good one to use for research purposes, I don’t know.
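A cross-check that avoids simulation noise is to compute the luck variance directly from the matchup probabilities. This is a sketch under my reading of the setup: 5 games against each of the 30 teams (a team "facing itself" is a .500 game), Odds Ratio matchups, independent games, and the talent list from #45.

```python
import math
from statistics import pstdev

TRUE_WIN_PCT = [
    0.650, 0.620, 0.590, 0.570, 0.550, 0.530, 0.520, 0.510, 0.510, 0.505,
    0.505, 0.505, 0.500, 0.500, 0.500, 0.500, 0.500, 0.500, 0.495, 0.495,
    0.495, 0.490, 0.490, 0.480, 0.470, 0.450, 0.430, 0.410, 0.380, 0.350,
]
GAMES_PER_OPPONENT = 5

def odds_ratio(a: float, b: float) -> float:
    """Odds Ratio estimate of a team with win rate a beating one with win rate b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))

expected_wins, luck_vars = [], []
for a in TRUE_WIN_PCT:
    probs = [odds_ratio(a, b) for b in TRUE_WIN_PCT]   # includes the mirror-image opponent
    expected_wins.append(GAMES_PER_OPPONENT * sum(probs))
    luck_vars.append(GAMES_PER_OPPONENT * sum(p * (1 - p) for p in probs))

avg_luck_sd = math.sqrt(sum(luck_vars) / len(luck_vars))   # luck SD averaged over teams
expected_sd = pstdev(expected_wins)                        # spread of expected wins
print(round(avg_luck_sd, 2), round(expected_sd, 2))
```

Comparing these two numbers with the 5.41 and 9.00 above shows how much of the gap comes from the matchups and how much could be sampling noise in a 100-sim run.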

Also, I’m not sure that my original talent distribution is necessarily a reasonable approximation or not.  Changing that would change things.

Furthermore, on a game-by-game basis (because of the starting pitcher), you are not really facing a .600 team 5 times, but say a .700, .650, .600, .550, and .500 team once each.

Fun and games…


#48    mettle      (see all posts) 2010/08/03 (Tue) @ 12:59

@Kincaid/46. Duh, that makes sense.

So, what I take from this is that the only way the SD will differ from 6.4 is because of difference and deviations from .500 in true talent levels.

Therefore, the only difference between baseball and football or hockey is what the distribution of talent is and what the distribution in expected W% would be - the formula is the same.


#49    Ben V-L      (see all posts) 2010/08/03 (Tue) @ 13:22

What Kincaid said.  (And mettle agrees, so we’re all good there).

Tango, I think there’s an aspect of this that you’re still not getting.  I’m already disagreeing at the point where you want to define things by a .500 team versus a .550 team, etc.

My point is that a .500 team is likely on some days a .650 team and on other days a .350 team, depending on who’s pitching.  A different .500 team might vary only between 0.550 and .450, because their pitchers are more equally skilled.  In this case, the first .500 team would have a lower random variation in their expected wins than the second .500 team.  Even if they play the same schedule.

Additionally, there is the phenomenon that you’re picking up on, that the opponent quality is also affecting the expected winning percentage.

Both are significant effects, and I don’t know offhand which is larger (in affecting the scale of random variation in wins).  But if I had to guess, I would go with the pitchers.


#50    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 14:01

This new thread may be of some interest:

http://www.insidethebook.com/ee/index.php/site/article/odds_ratio_method_track_runners_and_baseball/


#51    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 14:14

Steve was trying to post the following:

====================================
I did what Rally mentioned at the end of his post 43 and ran a 100000 trial sim where the teams played each other according to this year’s MLB schedule.

When teams each had 0.500 true talent the overall SD fell out at 6.36. The max intra-team SD was 6.39 and the min was 6.34.

When I used the CHONE preseason projection true talent I get an overall SD of 9.12 (i.e. the SD of all datapoints, probably not a useful number). The max intra-team SD is 6.38 and the min 6.20.  The average intra-team SD across the 30 teams was 6.32.
============================================


#52    Rally      (see all posts) 2010/08/03 (Tue) @ 15:32

Interesting.  Did the extreme teams (really good or really bad) have the smaller variance?

Binomial calculations would suggest a 6.36 SD for a .500 team, and a 6.24 SD for a .600 (or .400) team.


#53    Rally      (see all posts) 2010/08/03 (Tue) @ 15:35

Adding in a home/road adjustment would also reduce the random variance, as instead of .500 teams facing each other all the time you’d have .525/.475 matchups.


#54    Steve Sommer      (see all posts) 2010/08/03 (Tue) @ 15:47

@Rally #52,

Yep. The Yankees had the lowest SD, with TOR and SLN among the other lows.


#55    mettle      (see all posts) 2010/08/03 (Tue) @ 16:41

As established, if you take .500-level teams and they play, you get a 6.4 SD in Ws.

For a second, I was confused as to how you could then ever get SD>6.4 since the function is concave, whereas SD_baseball = 11.8.

It took me a second to realize (perhaps obvious to everyone else) that when you calc the SD for, let’s say a .666 team, while the SD for one team is lower (6.0) the combined SD of the .666 and .333 team is greater (27.7).
So, in case anyone is curious, the W% to get an SD =11.8 is .562.
That is, a .562 and a .438 team playing each other 162 times will yield a baseball-like SD. I may play around with 30 teams and figure out the SD of true talent required to yield our present 11.8 SD. This seems like useful knowledge - you can calculate that for other sports, and compare the true talent distributions and also know how distributed your other metrics need to be if they represent true talent.

If someone knows, please share. It may end up being a simple equation involving 6.4 and 11.8
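For the two-team version there is a closed form. A sketch, assuming a p team and a (1 - p) team with binomial luck, pooled into one set of win totals; both function names are made up.

```python
import math

GAMES = 162

def pooled_sd(p: float) -> float:
    """SD of wins, pooled over a p team and its mirror-image (1 - p) opponent."""
    gap = GAMES * (p - 0.5)              # each team's expected distance from 81 wins
    luck_var = GAMES * p * (1.0 - p)     # binomial variance, identical for both teams
    return math.sqrt(gap**2 + luck_var)

def win_pct_for(target_sd: float) -> float:
    """Invert pooled_sd for p >= .500 by solving the quadratic in (p - 0.5)."""
    d_squared = (target_sd**2 - GAMES * 0.25) / (GAMES**2 - GAMES)
    return 0.5 + math.sqrt(d_squared)

print(round(pooled_sd(2 / 3), 1))     # 27.7
print(round(win_pct_for(11.8), 3))    # 0.562
```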


#56    Tangotiger      (see all posts) 2010/08/03 (Tue) @ 16:52

Mettle, right, that’s what I’ve been doing:

sqrt(11.8^2-6.4^2)

The distribution of actual wins is one standard deviation of 11.8 wins.  INCLUDED in that 11.8 wins is the luck portion of 6.4 wins.  The difference between the two is (11.8^2-6.4^2)^0.5, or 10.0 wins.

That 10.0 wins is the true talent spread for baseball teams.

I did this for all the sports a couple of years ago:

http://www.insidethebook.com/ee/index.php/site/comments/true_talent_levels_for_sports_leagues/

Check it out.  I think you’ll like it.


#57    mettle      (see all posts) 2010/08/03 (Tue) @ 17:10

Yes, liked it very much.
Now you just need to include soccer


#58          (see all posts) 2010/08/03 (Tue) @ 17:53

The first paragraph in the article linked in #12 is great.


#59    Scott Segrin      (see all posts) 2010/08/03 (Tue) @ 19:37

In football they play only 16 games.  If you ran a similar simulator, assuming a .500 world, you would find that about 3 times in 4 at least one team would win 12 or more games.  In fact, the random standings would not look all that dissimilar to what actually occurs in the NFL.  (Someone did this a few years back but I’m not finding it.  They were trying to make the point that there was perfect parity in the NFL.)
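The "3 times in 4" figure can be approximated without a simulator. A sketch, assuming 32 teams whose 16-game seasons are independent coin-flip sequences, which ignores that one team's win is another's loss.

```python
from math import comb

def prob_at_least(wins: int, games: int = 16) -> float:
    """Chance a .500 team wins at least `wins` of `games` independent coin-flip games."""
    return sum(comb(games, k) for k in range(wins, games + 1)) / 2**games

p_team = prob_at_least(12)                  # 2517 / 65536, about 3.8%
p_any = 1.0 - (1.0 - p_team) ** 32          # treats the 32 teams as independent
print(round(p_team, 4), round(p_any, 2))    # 0.0384 0.71
```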

But if I told you that this season, one team would win 12 games and another would lose 12 games, and that the two teams involved were the Indianapolis Colts and the Detroit Lions, it would be extremely easy to predict which were which; unlike what it would be in the random model.

My point is - I think - that just because distributions appear to be randomly distributed does not necessarily mean that they are.  Or to put it another way, differences that are of the same magnitude as one would expect by luck do not necessarily mean that the observed difference is due entirely to luck.


#60          (see all posts) 2010/08/04 (Wed) @ 06:01

In recent years, the Lions would have loved to lose just 12 games.


#61          (see all posts) 2010/08/04 (Wed) @ 07:52

Just one more point on this because I’ve thought it through some more.  According to Tango’s model:

“[W]hen you look at the won-lost records of baseball teams, 60% of that is the talent and other vagaries of the participants, and 40% of that is luck.”

This is true *if* every team is a .500 team.  In reality, not every team is a .500 team.  There are some .550 teams and some .450 teams.  This set of non-.500 teams could generate final standings that look *exactly* like the random standings generated by the model.

Suppose that by some quirk every team played the season exactly to its ability.  The (true) .569 team won 92 games, the .519 team won 84 games, and so on.  How much of the variance in the standings would be attributed to luck then?  None.  But the standings could still look exactly the same as those generated by a .500 random model.

So my final point is this:  If we can’t measure skill while ignoring the effect of luck, then we also can’t measure luck while ignoring the effect of skill.


#62          (see all posts) 2010/08/05 (Thu) @ 03:55

If God had made all the umpires equal as well I would question the premise that luck would cause 2 equal teams to have 20 wins difference between them over 162 games.

But I suspect umpires may account for at least 50% of luck (good or bad), so this is possible.


#63          (see all posts) 2010/08/05 (Thu) @ 03:57

And the other 50% of luck could be managers and coaches.


#64    Tangotiger      (see all posts) 2010/08/05 (Thu) @ 07:32

pft: you question the entire discipline of statistics and probability then.

