Thursday, August 24, 2006
True Talent Levels for Sports Leagues
I’m engaged in a discussion on True Talent Levels across sports leagues. The question on the table is how many games do you need to get the truly better team to have the better record. In that thread, I give you a couple of useful equations to use.
I am reprinting all my comments from that thread here (which may look somewhat bizarre without the accompanying context), but I encourage all to also follow the discussion there.
var(observed) = var(true) + var(random)
where var=variance
If we look at OBP, var(true) in MLB is around .030^2.
For any single PA, it’s either safe or out. That makes our var(random) = .474^2.
var(observed) therefore will be .475^2.
Your regression toward the mean therefore will be over 99%.
That’s for one PA. But, there are 80 PAs per game, more or less (hitters and pitchers). The var(random) drops down to .053^2.
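Here is a minimal sketch of that arithmetic in Python. The league OBP of .340 is my assumption (it is the value that reproduces the .474 figure), and the function names are just for illustration. The regression toward the mean is taken as var(random) / var(observed).

```python
import math

LEAGUE_OBP = 0.340   # assumed league-average OBP; reproduces the .474 figure
SD_TRUE_OBP = 0.030  # spread in true OBP talent, from the text


def random_sd(p, trials):
    """Binomial standard deviation of an observed rate over `trials` trials."""
    return math.sqrt(p * (1 - p) / trials)


def regression_to_mean(sd_true, sd_random):
    """Share of the observed variance that is random: var(random) / var(observed)."""
    var_true = sd_true ** 2
    var_random = sd_random ** 2
    return var_random / (var_true + var_random)


for pa in (1, 80):
    sd_rand = random_sd(LEAGUE_OBP, pa)
    regress = regression_to_mean(SD_TRUE_OBP, sd_rand)
    print(f"{pa:>3} PA: sd(random) = {sd_rand:.3f}, regression = {regress:.1%}")
```

Changing the number of PAs in the loop shows how quickly the random part shrinks as the trials pile up.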
So, it all depends on the number of “trials”. In football, you probably have around 150 possessions? Basketball is what, 200? Hockey likely in the 100+ neighborhood? Tennis, 4 sets x 9 games x 6 to 8 points = 250?
The fewer the trials, and the closer the var(true) is to zero, the more luck plays a role. My guess is that tennis has far fewer upsets simply because the number of trials is so high, and the spread in talent is so much wider.
===================
I don’t see how it can be close to 100%. It’s not like we always see players ranked #1 through #16 in every tournament.
In any case, the exact answer can be determined either empirically (we have enough data), or through the process I explained.
===================
Ah, 60% is huge! I guess the simple question is: if a guy who wins 60% of the points faces a guy who wins 40% of the points, how often will the first guy win more than 50% of the points, over 250 trials? I get 99.9%.
===================
If the probability we expected was simply 51% to 49% for any single point, the better guy will win 62% of the time. If, let’s say, this was Sampras and Agassi’s head-to-head record, it shows you how very close they are, and it’s only the setup in tennis that allows Sampras to stand out much more.
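For anyone who wants to check these numbers, here is a rough sketch using the exact binomial distribution. It assumes a match is nothing more than 250 independent points with a fixed chance of winning each one, which ignores how tennis actually groups points into games and sets. The function name is made up for this example.

```python
from math import comb


def prob_wins_majority(p, n):
    """Exact binomial chance of winning strictly more than half of n points.

    Assumes every point is independent and won with the same probability p.
    """
    threshold = n // 2  # need strictly more than this many points
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(threshold + 1, n + 1)
    )


for p in (0.60, 0.51):
    print(f"p = {p:.2f}: wins the majority {prob_wins_majority(p, 250):.1%} of the time")
```

With an even number of points a 125-125 split is possible, and this sketch counts that as not winning, so the result for the 51% player can differ by a point or two from a figure that uses a normal approximation or splits the ties.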
===================
For tennis, this is likely the case, with women. The spread in talent in women’s tennis is likely far wider than in men’s tennis. To ensure that the same women don’t always win, you need fewer games per match.
As for baseball, var(true) for a baseball team is about .060^2 (which can be calculated in many ways).
var(random) reaches .060^2 when the number of games played is 69. That is, after 69 games, the “r” is .50.
I don’t know what the var(true) for a football team is. I’m sure it’s quite a bit higher. Just taking a quick stab at it now, let’s say var(true) is .150^2 for football. To get var(random) to be .150^2, you need 11 games. That is, after 69 baseball games, you’ll know as much about the true talent of teams as you would after 11 NFL games.
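The game counts come from setting the random standard deviation, sqrt(.5*.5/n), equal to the true-talent standard deviation and solving for n. A small sketch, with a function name that is only for illustration, and with the .150 football figure being the quick stab from above rather than a measured value:

```python
def games_for_half_regression(sd_true):
    """Games needed for var(random) to equal var(true), i.e. for "r" to be .50.

    Treats each game as a coin flip between evenly matched teams, so that
    var(random) = .5 * .5 / games.
    """
    return 0.5 * 0.5 / sd_true ** 2


for league, sd_true in (("MLB", 0.060), ("NFL (guess)", 0.150)):
    print(f"{league}: about {games_for_half_regression(sd_true):.0f} games")
```

This gives roughly 69 games for baseball and 11 for the football guess.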
===================
Here is one way to figure out the var(true) for any league.
Step 1 - Take a sufficiently large number of teams (preferably all with the same number of games).
Step 2 - Figure out each team’s winning percentage.
Step 3 - Figure out the standard deviation of that winning percentage.
I just did it quick, and I took the last few years in the NFL, and the SD is .19, which makes var(observed) = .19^2
Step 4 - Figure out the random standard deviation. That’s easy: sqrt(.5*.5/16)
16 is the number of games for each team.
So, var(random) = .125^2
Solve for:
var(obs) = var(true) + var(rand)
var(true), in this case, is .143^2
Knowing that var(true) is .143^2, to get an “r” of .50, you need var(rand) to also be .143^2. For that to happen, the number of games played equals 12. That is, sqrt(.5*.5/12) = .144
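The steps above can be sketched in Python as follows. The function names are hypothetical, and using the population standard deviation (rather than the sample version) in Step 3 is my choice, not something the steps specify.

```python
import math
import statistics


def observed_sd(win_pcts):
    """Step 3: standard deviation of the teams' winning percentages."""
    return statistics.pstdev(win_pcts)  # population SD; an assumption


def true_talent_sd(sd_obs, games):
    """Step 4 and the solve: var(true) = var(obs) - var(rand)."""
    var_rand = 0.5 * 0.5 / games
    var_true = sd_obs ** 2 - var_rand
    if var_true < 0:
        # The observed spread is smaller than luck alone would produce.
        raise ValueError("observed variance is below the random variance")
    return math.sqrt(var_true)


# NFL figures from the text: observed SD of .19 over 16 games.
print(f"sd(true) = {true_talent_sd(0.19, 16):.3f}")
```

With real data you would pass the list of team winning percentages to observed_sd and feed the result into true_talent_sd.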
In baseball, var(true) is .060^2.
I haven’t figured out what it is in NHL, or NBA, but perhaps someone wants to look at it?



I also added the following to that thread:
I’ll take a quick guess with the NHL. The number of points (adjusted for ties/OT) parallels somewhat the number of wins in baseball.
So, if the observed SD win% in baseball is .072, then it would be around .080 for the NHL, I’ll guess.
The random variation for 82 games is .055^2
which makes var(true) = .058^2
Therefore, in the NHL, r=.50 comes at about 75 games, which is pretty much the full 82-game season. That is, 12 NFL games, 69 MLB games, and 75 NHL games are all equivalent.
MLB decides to play 162 games instead. The NHL decides to allow 16 teams in the playoffs.
===================
Hmmm… should have checked first. var(obs) in the NHL is .100^2, making var(true) = .083^2. To match var(rand) of .083^2, you need to play 36 games.
So, 12 NFL games, 36 NHL games, and 69 MLB games are equivalent.
In the NFL, with only 16 games, luck plays a huge role. In the NHL and MLB, those numbers of games are both 43-44% of their respective seasons. There’s no “true talent” reason for the NHL to have all those playoff games.
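To put the three leagues side by side, here is a rough sketch that starts from the observed SDs quoted above (.19 for the NFL, .100 for the NHL, .072 for MLB) and the season lengths of 16, 82, and 162 games. It reads “r” as var(true) / var(obs), which is my interpretation of how it is used here, and the names in the code are made up for the example.

```python
# (observed SD of win%, games per season), figures taken from the text
LEAGUES = {
    "NFL": (0.190, 16),
    "NHL": (0.100, 82),
    "MLB": (0.072, 162),
}


def summarize(sd_obs, games):
    """Return (r over a full season, games for r = .50, share of the season)."""
    var_obs = sd_obs ** 2
    var_rand = 0.5 * 0.5 / games       # luck, treating each game as a coin flip
    var_true = var_obs - var_rand      # var(obs) = var(true) + var(rand)
    r_full_season = var_true / var_obs
    games_for_half = 0.5 * 0.5 / var_true
    return r_full_season, games_for_half, games_for_half / games


for league, (sd_obs, games) in LEAGUES.items():
    r, half, share = summarize(sd_obs, games)
    print(f"{league}: r = {r:.2f} over {games} games, "
          f"r = .50 at about {half:.0f} games ({share:.0%} of the season)")
```

The lower the full-season r, the more of the final standings is luck, which is the sense in which the 16-game NFL schedule stands apart from the other two.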