Tuesday, August 03, 2010
Odds Ratio Method: track runners and baseball
Let’s say you have a runner who runs the 100m, and we observe his running time as 10.0 seconds, with one SD = 0.2 seconds. Let’s call this runner Aurora. And we have a second runner who averaged 10.1 seconds, with one SD = 0.2 seconds. This runner is called Bolt.
How often will Aurora beat Bolt? Running a simulator 10,000 times, I get .640. I’m sure there’s a nice probability calculator that will give us the same result (if so, please share).
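A simulator along these lines could look like the sketch below. It assumes each race time is an independent draw from a normal distribution with the stated mean and SD (the post does not say which distribution the simulator used), and the function name and seed are made up for illustration.

```python
import random

def simulate_win_rate(mean_a, mean_b, sd=0.2, trials=10_000, seed=0):
    """Estimate how often runner A posts a lower time than runner B.

    Assumption: times are independent normal draws with a common SD.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = 0
    for _ in range(trials):
        time_a = rng.gauss(mean_a, sd)
        time_b = rng.gauss(mean_b, sd)
        if time_a < time_b:  # lower time wins the race
            wins += 1
    return wins / trials

# Aurora (10.0) against Bolt (10.1)
print(simulate_win_rate(10.0, 10.1))
```

With only 10,000 trials the third decimal will wobble from run to run, so a result a few thousandths away from .640 is nothing to worry about.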
Let’s say we have a third runner, called Canuck, and he runs in 10.2 +/- 0.2. Bolt will beat him .640 of the time.
What are our expectations of Aurora beating Canuck? Well, the Odds Ratio suggests the following:
.640/.360 = 1.778 = odds ratio of Aurora to Bolt
.640/.360 = 1.778 = odds ratio of Bolt to Canuck
1.778 x 1.778 = 3.16 = odds ratio of Aurora to Canuck
Therefore, under these assumptions, Aurora wins .760.
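The arithmetic above, as a small sketch: convert each win probability to odds, multiply the odds, and convert back to a probability. The helper names are hypothetical.

```python
def to_odds(p):
    """Win probability -> odds (wins per loss)."""
    return p / (1 - p)

def to_probability(odds):
    """Odds -> win probability."""
    return odds / (1 + odds)

def chain_odds_ratio(p_a_over_b, p_b_over_c):
    """Odds Ratio estimate of A beating C, given A vs. B and B vs. C."""
    return to_probability(to_odds(p_a_over_b) * to_odds(p_b_over_c))

# Aurora over Bolt is .640, Bolt over Canuck is .640
print(round(chain_odds_ratio(0.640, 0.640), 3))
```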
And in reality, how often would a runner with a 10.0 +/- 0.20 time beat a runner with 10.2 +/- 0.20 time? The simulator says: .760.
So far, so good. Let’s keep going and line up these competitors against Aurora, with the following expectations using the Odds Ratio:
ORM Aurora facing
0.500 10.0 Aurora
0.640 10.1 Bolt
0.760 10.2 Canuck
0.849 10.3 Doppler
0.909 10.4 Echo
0.947 10.5 Flash
0.969 10.6 Gonzalez
0.982 10.7 Hermes
0.990 10.8 Iceman
0.994 10.9 Jaguar
0.997 11.0 Kwicksilver
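The table can be reproduced by treating each 0.1-second gap as one more multiplication by the same 1.778 odds. A sketch, with a made-up function name and the .640 step probability taken from the simulator result above:

```python
def orm_table(step_probability=0.640, steps=10):
    """Odds Ratio estimates for Aurora against opponents 0..steps
    tenths of a second slower, assuming each tenth multiplies the
    odds by the same factor."""
    step_odds = step_probability / (1 - step_probability)
    table = []
    for k in range(steps + 1):
        combined_odds = step_odds ** k  # k = 0 is Aurora against himself
        table.append(round(combined_odds / (1 + combined_odds), 3))
    return table

names = ["Aurora", "Bolt", "Canuck", "Doppler", "Echo", "Flash",
         "Gonzalez", "Hermes", "Iceman", "Jaguar", "Kwicksilver"]
for k, (name, p) in enumerate(zip(names, orm_table())):
    print(f"{p:.3f} {10.0 + 0.1 * k:.1f} {name}")
```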
So, when Aurora faces Doppler and his 10.3 +/- 0.2 time, Aurora will win .849 of the time, according to the Odds Ratio method. But what would happen in reality?
Sim ORM Aurora facing
0.500 0.500 10.0 Aurora
0.640 0.640 10.1 Bolt
0.760 0.760 10.2 Canuck
0.852 0.849 10.3 Doppler
0.922 0.909 10.4 Echo
0.962 0.947 10.5 Flash
0.983 0.969 10.6 Gonzalez
0.994 0.982 10.7 Hermes
0.998 0.990 10.8 Iceman
0.9995 0.994 10.9 Jaguar
0.9999 0.997 11.0 Kwicksilver
As you can see, right around the 10.4 mark or so, the Odds Ratio method breaks down. Whether the breakdown point can be thought of as 2 * SD is something the bright folks out there can tell us.
So, what about baseball? Well, in baseball, matchups are not in win% but in OBP. Let’s say that the average hitter has an OBP of .340, +/- .030, and the average pitcher+fielders have an OBP of .340 +/- .030. It would seem, therefore, that the Odds Ratio Method would work as long as the difference in the mean of the players is within… well, some number. (And if you think about it some more, a player doesn’t have an OBP talent level, since OBP is really a sort of win%. Each player would have some number, similar to a track runner’s time number. So, in order to model baseball, you would need to figure out how to give each hitter and pitcher a number such that we can try to figure out how much the Odds Ratio Method can be used and when it breaks down.)


Since you are basically subtracting 2 normal distributions, you can do the following to get a distribution for the difference between the 2 times:
Subtract the two means. This will be the mean of the new distribution:
10.0 - 10.1 = -0.1
Add the variances of the two distributions. This will be the variance of the new distribution.
.2^2 + .2^2 = .08
Convert to SD:
sqrt(.08) = .28
So your new distribution has a mean of -0.1 and a SD of .28. Plug that into a normal distribution, find the cumulative probability for x<0 (negative difference means Aurora was faster), and you get .638. Doing the same thing for your whole table (all you have to do is change the mean of the new distribution, since the SD will be the same for all match-ups), you get Aurora’s odds against each opponent:
10.1 - 0.6382
10.2 - 0.7602
10.3 - 0.8556
10.4 - 0.9214
10.5 - 0.9615
10.6 - 0.9831
10.7 - 0.9933
10.8 - 0.9977
10.9 - 0.9993
11.0 - 0.9998
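The same calculation as a short sketch, using only the standard library. It assumes, as above, that both times are independent and normally distributed; the function name is made up for illustration.

```python
import math

def win_probability(mean_a, mean_b, sd_a=0.2, sd_b=0.2):
    """Probability that runner A's time is lower than runner B's,
    assuming independent normal times."""
    diff_mean = mean_a - mean_b                 # subtract the means
    diff_sd = math.sqrt(sd_a**2 + sd_b**2)      # add variances, take the root
    z = (0 - diff_mean) / diff_sd               # standardize the x < 0 cutoff
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

for tenths in range(1, 11):
    opponent = 10.0 + 0.1 * tenths
    print(f"{opponent:.1f} - {win_probability(10.0, opponent):.4f}")
```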