Tuesday, March 24, 2009
Being Behind is a Good Thing (Part III)?
Using King Yao’s data, here is how the home team does if you look at their first half scores:
half homeScore homeWins n
1 -7 0.413 1103
1 -6 0.419 1265
1 -5 0.455 1478
1 -4 0.463 1518
1 -3 0.515 1618
1 -2 0.539 1844
1 -1 0.575 1865
1 0 0.607 1918 <--
1 1 0.600 1996 <--
1 2 0.629 1923
1 3 0.646 1908
1 4 0.690 1756
1 5 0.704 1680
1 6 0.739 1542
1 7 0.754 1440
We see the discontinuity between the home team being tied at the half (.607) and being up by 1 (.600).
Now, let’s look at how the home team does if you ONLY look at the second half scores. That is, assume that the game starts at the 3rd quarter. Here is how often the home team won the game, based on their scoring margin in the second half alone:
half homeScore homeWins n
2 -7 0.451 1164
2 -6 0.445 1329 <--
2 -5 0.486 1431 <--
2 -4 0.486 1513
2 -3 0.522 1680
2 -2 0.552 1795
2 -1 0.558 1884
2 0 0.597 1946
2 1 0.614 1963
2 2 0.654 2001
2 3 0.657 1886
2 4 0.687 1665
2 5 0.725 1703
2 6 0.728 1627
2 7 0.770 1331
We have discontinuities at different points, but also a lot of close calls. For example, if they win the second half by 5 or 6 points, their chances of winning the game are virtually identical. Same for outscoring the opponent by 2 or 3 points in the second half.
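To make the "close calls" easier to spot, here is a minimal Python sketch that takes the second half rows above and prints the change in win rate from one margin to the next. The .003 cutoff for a "near tie" is my own arbitrary choice for illustration, not something from the data.

```python
# Second-half rows copied from the table above: (margin, home win rate)
second_half = [
    (-7, 0.451), (-6, 0.445), (-5, 0.486), (-4, 0.486), (-3, 0.522),
    (-2, 0.552), (-1, 0.558), (0, 0.597), (1, 0.614), (2, 0.654),
    (3, 0.657), (4, 0.687), (5, 0.725), (6, 0.728), (7, 0.770),
]

# Change in win rate between each pair of adjacent margins
steps = [
    (m0, m1, round(w1 - w0, 3))
    for (m0, w0), (m1, w1) in zip(second_half, second_half[1:])
]

# Assumed cutoff: a step of .003 or less counts as a "near tie"
near_ties = [(m0, m1) for m0, m1, d in steps if abs(d) <= 0.003]
# Steps where an extra point goes with a LOWER win rate
drops = [(m0, m1) for m0, m1, d in steps if d < 0]

for m0, m1, d in steps:
    print(f"{m0:+d} -> {m1:+d}: {d:+.3f}")
print("near ties:", near_ties)
print("drops:", drops)
```

The same loop works on the first half rows if you swap in that table.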
Remember, we didn’t look to see how well they did in the first half. There’s no reason that winning the second half by 5 or 6 points should be biased based on the first half score, should it?
I’ll repeat the first half chart, this time adding a straight line regression, and the difference between the empirical win rate and the regression line:
half homeScore homeWins regression diff
1 -7 0.413 0.407 0.006
1 -6 0.419 0.432 -0.013
1 -5 0.455 0.457 -0.002
1 -4 0.463 0.482 -0.019
1 -3 0.515 0.508 0.007
1 -2 0.539 0.533 0.006
1 -1 0.575 0.558 0.017
1 0 0.607 0.583 0.024
1 1 0.600 0.608 -0.008
1 2 0.629 0.634 -0.005
1 3 0.646 0.659 -0.013
1 4 0.690 0.684 0.006
1 5 0.704 0.709 -0.005
1 6 0.739 0.734 0.005
1 7 0.754 0.760 -0.006
The standard deviation of the differences is .012.
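For anyone who wants to check the numbers, here is a rough Python sketch of the calculation. It assumes an ordinary unweighted least-squares line through the 15 rows (each row counted equally, ignoring n) and a sample standard deviation (n-1 in the denominator). Those are my guesses at the method, but they do land on the fitted values and the .012 shown above.

```python
from statistics import mean, stdev

# First-half rows from the chart above
margins = list(range(-7, 8))
home_wins = [0.413, 0.419, 0.455, 0.463, 0.515, 0.539, 0.575, 0.607,
             0.600, 0.629, 0.646, 0.690, 0.704, 0.739, 0.754]


def fit_line(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(margins, home_wins)
fitted = [intercept + slope * x for x in margins]
residuals = [y - f for y, f in zip(home_wins, fitted)]

print(f"slope={slope:.4f} intercept={intercept:.3f}")
for x, y, f, r in zip(margins, home_wins, fitted, residuals):
    print(f"{x:+d} {y:.3f} {f:.3f} {r:+.3f}")
# Sample standard deviation of the differences (assumed n-1 denominator)
print(f"sd of differences = {stdev(residuals):.3f}")
```

Swapping in the second half win rates gives the second chart the same way.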
Now, here it is for the second half scores:
half homeScore homeWins regression diff
2 -7 0.451 0.431 0.020
2 -6 0.445 0.454 -0.009
2 -5 0.486 0.478 0.008
2 -4 0.486 0.501 -0.015
2 -3 0.522 0.525 -0.003
2 -2 0.552 0.548 0.004
2 -1 0.558 0.572 -0.014
2 0 0.597 0.596 0.001
2 1 0.614 0.619 -0.005
2 2 0.654 0.643 0.011
2 3 0.657 0.666 -0.009
2 4 0.687 0.690 -0.003
2 5 0.725 0.713 0.012
2 6 0.728 0.737 -0.009
2 7 0.770 0.760 0.010
The standard deviation of the differences is .011.
It looks to me that the deviations are noise, and not related to anything beyond that. Certainly, there’s nothing really distinguishing the 1st half from the 2nd half.
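One way to sanity-check the "noise" reading is to ask how much random wobble you'd expect in each row from sample size alone. The sketch below uses the usual binomial standard error, sqrt(p(1-p)/n), on the first half rows. It treats every game in a row as an independent trial, which is an assumption on my part. On these numbers the per-row standard error runs from roughly .011 to .015, the same ballpark as the .012 standard deviation of the differences.

```python
from math import sqrt

# First-half rows from the first table: (margin, home win rate, games)
first_half = [
    (-7, 0.413, 1103), (-6, 0.419, 1265), (-5, 0.455, 1478),
    (-4, 0.463, 1518), (-3, 0.515, 1618), (-2, 0.539, 1844),
    (-1, 0.575, 1865), (0, 0.607, 1918), (1, 0.600, 1996),
    (2, 0.629, 1923), (3, 0.646, 1908), (4, 0.690, 1756),
    (5, 0.704, 1680), (6, 0.739, 1542), (7, 0.754, 1440),
]


def binomial_se(p, n):
    """Standard error of a win rate p observed over n independent games."""
    return sqrt(p * (1 - p) / n)


ses = [binomial_se(p, n) for _, p, n in first_half]
typical_se = sum(ses) / len(ses)

for (margin, p, n), se in zip(first_half, ses):
    print(f"{margin:+d} p={p:.3f} n={n} se={se:.4f}")
print(f"average standard error = {typical_se:.4f}")
```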