Friday, April 22, 2011
Bayes and regression, again
Great post from Kincaid.
Note: .060 is a better estimate than .050 for the spread in talent (one standard deviation of true W%). That’s why I use 69 games as the regression amount, the number of games at which you regress 50% toward the mean. (Kincaid’s example, using .050 as the spread, implies 100 games as the regression amount.)
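As a rough sketch of where 69 and 100 come from: assuming the regression amount is the number of games at which random (binomial) variance equals talent variance, it works out to p(1-p)/SD² with p = .500. The function name here is just for illustration.

```python
def regression_games(talent_sd, mean=0.5):
    """Games at which observed W% gets regressed 50% toward the mean.

    Assumption: per-game binomial variance mean*(1-mean), divided by
    the variance of true talent.
    """
    return mean * (1 - mean) / talent_sd ** 2

print(f"{regression_games(0.060):.1f}")  # 69.4
print(f"{regression_games(0.050):.1f}")  # 100.0
```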
One easy way to test that 69 is a better number than 100 is to split each team's season in two: put games #1, 3, 5… 137 in pool 1 and games #2, 4, 6… 138 in pool 2, then run a correlation between each team's W% in the two pools. With 69 games in each pool, you should get r=.50.
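The real test uses actual team game logs, but a simulation shows what the split should look like under each talent spread. This is only a sketch: it assumes true W% is normally distributed around .500, that every game is an independent coin flip at the team's true W%, and the team count and seed are arbitrary.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_r(talent_sd, n_teams=4000, n_games=138, seed=1):
    """Simulate team seasons and correlate odd-game W% with even-game W%."""
    rng = random.Random(seed)
    odd_wpct, even_wpct = [], []
    for _ in range(n_teams):
        talent = rng.gauss(0.5, talent_sd)  # assumed true W%
        results = [rng.random() < talent for _ in range(n_games)]
        odd = results[0::2]   # games 1, 3, 5, ... 137
        even = results[1::2]  # games 2, 4, 6, ... 138
        odd_wpct.append(sum(odd) / len(odd))
        even_wpct.append(sum(even) / len(even))
    return pearson_r(odd_wpct, even_wpct)

print(f"SD .060: r = {split_half_r(0.060):.2f}")
print(f"SD .050: r = {split_half_r(0.050):.2f}")
```

If the spread is .060, the simulated r should land near .50. With .050 it should come in noticeably lower, around 69/(69+100), or about .41.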


Using .060 as the SD for the prior distribution gives an expected true W% of .450 for a team that goes 2-10.
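Here is a minimal sketch of that calculation. It assumes the prior is a beta distribution with mean .500, with its parameters fit to the chosen SD by matching moments. The function names are mine, for illustration only.

```python
def beta_prior(mean, sd):
    """Beta(a, b) parameters with the given mean and SD (method of moments)."""
    # For a beta distribution, variance = mean*(1-mean) / (a + b + 1).
    total = mean * (1 - mean) / sd ** 2 - 1
    return mean * total, (1 - mean) * total

def expected_true_wpct(wins, losses, mean=0.5, sd=0.060):
    """Posterior mean of true W% after observing a wins-losses record."""
    a, b = beta_prior(mean, sd)
    return (a + wins) / (a + b + wins + losses)

print(f"{expected_true_wpct(2, 10, sd=0.060):.3f}")  # 0.450
print(f"{expected_true_wpct(2, 10, sd=0.050):.3f}")  # 0.464
```

This is close to the shortcut of adding 69 games of .500 ball to the record: (2 + 34.5) / (12 + 69) is about .451.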
From 2000-2010, there were 1255 stretches where an MLB team went 2-10 over 12 games (per Retrosheet gamelogs). The average W% of those teams in games outside that 12-game span (from the same year) is .451. So that fits reality much better than using .050, which would put the same 2-10 team at about .464.