THE BOOK--Playing The Percentages In Baseball


Tuesday, March 24, 2009

Being Behind is a Good Thing (Part III)?

By Tangotiger, 12:40 PM

Using King Yao’s data, here is how the home team does if you look at their first half scores:


half homeScore homeWins n
1 -7 0.413 1103
1 -6 0.419 1265
1 -5 0.455 1478
1 -4 0.463 1518
1 -3 0.515 1618
1 -2 0.539 1844
1 -1 0.575 1865
1 0 0.607 1918 <--
1 1 0.600 1996 <--
1 2 0.629 1923
1 3 0.646 1908
1 4 0.690 1756
1 5 0.704 1680
1 6 0.739 1542
1 7 0.754 1440

We see the discontinuity when the home team is up by 1 or tied at the half.

Now, let’s look at how the home team does if you ONLY look at the second half scores.  That is, assume that the game starts at the 3rd quarter.  Here then is how the home team does, based on their score in the second half of the game, and how often they won:

half homeScore homeWins n
2 -7 0.451 1164
2 -6 0.445 1329 <--
2 -5 0.486 1431 <--
2 -4 0.486 1513
2 -3 0.522 1680
2 -2 0.552 1795
2 -1 0.558 1884
2 0 0.597 1946
2 1 0.614 1963
2 2 0.654 2001
2 3 0.657 1886
2 4 0.687 1665
2 5 0.725 1703
2 6 0.728 1627
2 7 0.770 1331

We have discontinuities at different points, but also a lot of close calls too.  For example, if they win the second half by 5 or 6 points, their chances of winning the game are virtually identical.  Same for scoring 2 or 3 more points in the second half.

Remember, we didn’t look to see how well they did in the first half.  There’s no reason that scoring 5 or 6 points in the second half should be biased based on the first half score, should it? 

I’ll repeat the first half chart, this time adding a straight line regression, and the difference between the empirical and the regression line:

half homeScore homeWins regression diff
1 -7 0.413 0.407 0.006
1 -6 0.419 0.432 -0.013
1 -5 0.455 0.457 -0.002
1 -4 0.463 0.482 -0.019
1 -3 0.515 0.508 0.007
1 -2 0.539 0.533 0.006
1 -1 0.575 0.558 0.017
1 0 0.607 0.583 0.024
1 1 0.600 0.608 -0.008
1 2 0.629 0.634 -0.005
1 3 0.646 0.659 -0.013
1 4 0.690 0.684 0.006
1 5 0.704 0.709 -0.005
1 6 0.739 0.734 0.005
1 7 0.754 0.760 -0.006

The standard deviation of the differences is .012.

Now, here it is for the second half scores:

half homeScore homeWins regression diff
2 -7 0.451 0.431 0.020
2 -6 0.445 0.454 -0.009
2 -5 0.486 0.478 0.008
2 -4 0.486 0.501 -0.015
2 -3 0.522 0.525 -0.003
2 -2 0.552 0.548 0.004
2 -1 0.558 0.572 -0.014
2 0 0.597 0.596 0.001
2 1 0.614 0.619 -0.005
2 2 0.654 0.643 0.011
2 3 0.657 0.666 -0.009
2 4 0.687 0.690 -0.003
2 5 0.725 0.713 0.012
2 6 0.728 0.737 -0.009
2 7 0.770 0.760 0.010

The standard deviation of the differences is .011.
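The straight-line fits above can be reproduced from the win rates in the tables. The sketch below assumes an ordinary, unweighted least-squares fit and a sample standard deviation (n-1 denominator); the post does not say which variant was used, so treat both as assumptions. The function names are illustrative only.

```python
# Home win rates copied from the two tables above, for differentials -7..+7.
DIFFS = list(range(-7, 8))
FIRST_HALF = [0.413, 0.419, 0.455, 0.463, 0.515, 0.539, 0.575, 0.607,
              0.600, 0.629, 0.646, 0.690, 0.704, 0.739, 0.754]
SECOND_HALF = [0.451, 0.445, 0.486, 0.486, 0.522, 0.552, 0.558, 0.597,
               0.614, 0.654, 0.657, 0.687, 0.725, 0.728, 0.770]


def fit_line(xs, ys):
    """Unweighted least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_sd(xs, ys):
    """Sample standard deviation of (observed - fitted)."""
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residuals of a fit with an intercept average to zero, so no re-centering.
    return (sum(r * r for r in residuals) / (len(residuals) - 1)) ** 0.5


for label, rates in (("first half", FIRST_HALF), ("second half", SECOND_HALF)):
    intercept, slope = fit_line(DIFFS, rates)
    print(f"{label}: intercept={intercept:.3f} slope={slope:.4f} "
          f"residual SD={residual_sd(DIFFS, rates):.3f}")
```

Under these assumptions each point of differential is worth roughly .025 of win% in the first half and .024 in the second.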

It looks to me that the deviations are noise, and not related to anything beyond that.  Certainly, there’s nothing really distinguishing between the 1st half or 2nd half.
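One way to check the "noise" reading is to ask how much scatter binomial sampling alone would produce in each row. This is a rough sketch, assuming the n column is the number of games at exactly that differential and that games are independent; neither is stated in the post.

```python
import math

# (home win rate, games) for first-half differentials -7..+7, from the first table.
FIRST_HALF = [
    (0.413, 1103), (0.419, 1265), (0.455, 1478), (0.463, 1518), (0.515, 1618),
    (0.539, 1844), (0.575, 1865), (0.607, 1918), (0.600, 1996), (0.629, 1923),
    (0.646, 1908), (0.690, 1756), (0.704, 1680), (0.739, 1542), (0.754, 1440),
]


def binomial_se(p, n):
    """Standard error of an observed win rate p over n independent games."""
    return math.sqrt(p * (1 - p) / n)


row_ses = [binomial_se(p, n) for p, n in FIRST_HALF]
# Root-mean-square of the per-row standard errors: a "typical" sampling error.
typical_se = math.sqrt(sum(se ** 2 for se in row_ses) / len(row_ses))
print(f"per-row SE ranges from {min(row_ses):.3f} to {max(row_ses):.3f}")
print(f"typical SE = {typical_se:.3f}")
```

The typical sampling error comes out at about .012, the same size as the reported standard deviation of the differences, which is what you would expect if the deviations from the line were mostly sampling noise.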

#1    MGL      (see all posts) 2009/03/24 (Tue) @ 22:17

There’s no reason that scoring 5 or 6 points in the second half should be biased based on the first half score, should it? 

Not exactly sure what you are asking here ("scoring 5 or 6 points?"), but in basketball what the score and score differential is in the first half is very much related to what happens in the second half.

Funny, how people that follow basketball and know intimately how it “works” seem to be in almost unanimous agreement that there is something going on, which is not surprising, since basketball does not nearly have the “independence” that baseball has, and people who are just “numbers people” attribute all the anomalies to noise (or at least say that that is most likely).

Guess what?  I am 95% sure that the numbers guys are wrong.  Let this be a lesson.  At least it should be. The numbers people are being hoisted by their own petard!  They are the ones that know about Bayesian probabilities, but because they are ignorant of the unique characteristics of basketball and the potential for real effects to be causing these anomalies, they overlook, ignore, or understate the a priori probabilities.

The basketball guys get the conclusion right (most likely) because they unknowingly are using proper Bayesian analysis which the number guys are not.

I repeat (and repeat and repeat and repeat), you can’t just look at the numbers in a vacuum!  I don’t know how to stress that any more than I am trying.

It is just like the odd/even days and day/night example I gave earlier. You CANNOT look at a 2.5 SD anomaly in pitcher ERA for power and finesse pitchers and conclude that “there must be something going on” and you cannot look at a 1.5 SD in day/night splits and conclude that it is just noise without knowing or estimating some probability that day/night splits for finesse and power pitchers might be “real.” Same thing for odd/even day splits.

What if you knew nothing about baseball and started looking at all kinds of splits for players:  day/night, versus lefty/righty opponents, parks, etc.  Could you just look at the numbers and reach good conclusion and make reliable inferences about what is likely “real” or not “real?” No, no and no!  You COULD but you would do a hell of a lot better if you applied what you knew about baseball or just applied common sense. It would change those inferences and conclusions dramatically.

Tango, I am afraid you are doing the same thing with this data.  Operating in a vacuum as if you know nothing about the way basketball potentially works (I don’t know whether you do or don’t).  That is NOT a good way to do an analysis and come up with reliable conclusions. 

I’ll say it again.  From what I know about basketball, it is extremely unlikely that these anomalies are occurring by random chance even though that might be the conclusion you have to reach without using a Bayesian analysis (which is fine if you know nothing else).


#2    King Yao      (see all posts) 2009/03/24 (Tue) @ 22:25

The 2H result has a lot to do with the 1H result.  For example, Oklahoma City was a 7.5 point underdog to the Lakers tonight.  OKC was down by 24 at the half.  They were favored in the 2H by 2.5 points.  Before the game, you may have expected the 1H line to be about Lakers -4 and the 2H line (assuming you didn’t know what happened in the 1H) to be about Lakers -3.5.  So OKC -2.5 for the 2H means there was a 6-point swing due to the 1H result.
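The arithmetic behind the 6-point swing can be written out as a small sketch. The sign convention (lines from one team's perspective, negative meaning that team is favored) and the function name are assumptions made for illustration.

```python
def line_swing(expected_line, actual_line):
    """Points the line moved toward a team; negative lines mean that team is favored."""
    return expected_line - actual_line


# From Oklahoma City's perspective, using the numbers quoted above.
okc_full_game = 7.5      # 7.5-point underdog before the game
okc_expected_1h = 4.0    # Lakers -4 in the first half
okc_expected_2h = 3.5    # Lakers -3.5 in the second half, ignoring the first half
okc_actual_2h = -2.5     # OKC favored by 2.5 after trailing by 24

# The two half lines add up to the full-game line.
print(okc_expected_1h + okc_expected_2h == okc_full_game)
print(line_swing(okc_expected_2h, okc_actual_2h))
```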

As for the numbers guys vs the basketball guys...I don’t know if there really is such a division.  At least I’ve never actually seen it anywhere.  Most numbers guys in basketball actually do understand basketball...same as the numbers guys in baseball actually do understand baseball also.


#3    Tangotiger      (see all posts) 2009/03/24 (Tue) @ 23:58

There’s no reason that scoring 5 or 6 points in the second half should be biased based on the first half score, should it?

Not exactly sure what you are asking here ("scoring 5 or 6 points?"), but in basketball what the score and score differential is in the first half is very much related to what happens in the second half.

What I was saying about the +5 / +6 is that when the home team, IN the second half, score 5 more points than their opponent, they win exactly the same as when they score 6 more points than their opponent.

This can ONLY be true if:
1. random variation
2. bias in the data

I am NOT looking at the first half, and seeing what is happening in the second half.  I’m doing the opposite.  I am looking only at what happened in the second half: the home team outscored the opponent by 5 or 6 points, and in either case, they ended up winning the same amount of games (73%).

Now, should we expect bias if you know that the home team outscored the opponent by either 5 or 6 points?  No.  I don’t see why you should.  Why would one particular set of games, where the home team outscored their opponents by 5 IN the second half have come from a different population of games than when they outscored their opponents by 6 in that same half?

This is unlike saying that you outscore your opponent in the first half: the play in the second half is dependent on the play of the first half.  But, I’m not looking at that.  I’m starting with the second half.  The bias shouldn’t be there.

That leads me to random variation.

And indeed the spread in the first-half differentials is the same as the spread in the second-half differentials.

If the first-half drives the play in the second-half, we should expect to see different distributions of scoring in each half.  Instead, we see the same distributions in scoring.
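The +5 / +6 comparison can be sized against sampling error. The sketch below uses the second-half rows from the post (.725 over 1703 games, .728 over 1627 games) and the regression values for those rows (.713 and .737); it assumes independent games, and the function name is illustrative.

```python
import math


def diff_se(p1, n1, p2, n2):
    """Standard error of the difference between two independent win rates."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


p5, n5 = 0.725, 1703   # home team +5 in the second half
p6, n6 = 0.728, 1627   # home team +6 in the second half

expected_gap = 0.737 - 0.713   # gap implied by the regression line
observed_gap = p6 - p5
se = diff_se(p5, n5, p6, n6)
shortfall_in_se = (expected_gap - observed_gap) / se

print(f"SE of the gap = {se:.4f}")
print(f"observed gap falls {shortfall_in_se:.2f} SEs short of the line")
```

A shortfall of well under 2 standard errors is unremarkable when 15 rows are being scanned for oddities, which is consistent with the random-variation reading.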


#4    Tangotiger      (see all posts) 2009/03/25 (Wed) @ 00:06

I’ll try to be clearer.  Look at the data IN the SECOND-half:

2 -7 0.451 0.431 0.020
2 -6 0.445 0.454 -0.009

2 -5 0.486 0.478 0.008
2 -4 0.486 0.501 -0.015

2 -3 0.522 0.525 -0.003

2 -2 0.552 0.548 0.004

2 -1 0.558 0.572 -0.014

2 0 0.597 0.596 0.001

2 1 0.614 0.619 -0.005

2 2 0.654 0.643 0.011
2 3 0.657 0.666 -0.009

2 4 0.687 0.690 -0.003

2 5 0.725 0.713 0.012
2 6 0.728 0.737 -0.009

2 7 0.770 0.760 0.010

Those columns are:
- second half
- home score differential in second-half
- eventual win%
- expected win% using regression
- difference between the last two

See where I have them grouped?  When the home team outscores its opponent by 2 points IN the SECOND half, they end up winning 65.4% of the time.  But, if they outscore them by 3, they win 65.7% of the time.  The expectation was 64.3% and 66.6%, respectively.

So, these particular groups of teams came from a disproportionate group in the FIRST half.  How else to explain ending with the same winning percentages, even though you know in the second half one group outscored the opponent by 2, and the other outscored the opponent by 3?

We know they came from two different pools.  The question is why?  And, there’s no reason to think that a group of teams that outscores its opponents by 2 would have come from a different pool than those that outscored their opponents by 3. 

This is unlike starting with a team with a lead of 2 and a lead of 3 entering the second-half.  In THOSE cases, you could have psychology impacting you.  In my case, it doesn’t apply, since I’m looking at the second-half scores.

Hope that’s clear…


#5    King Yao      (see all posts) 2009/03/25 (Wed) @ 00:55

Is this your point?: look at the variability and seeming randomness in the 2H numbers ... and it compares similarly to the 1H numbers (as far as variability from the regression expectation), thus it’s a sign the 1st half result (where the home team down by 1 wins the game so often) is likely to be just noise similar to these 2H numbers


#6    MGL      (see all posts) 2009/03/25 (Wed) @ 01:27

King, when I said “numbers guys” I meant guys like the authors of the Wharton study and baseball sabermetricians who know little about basketball.  You CAN’T, and I am sure YOU will agree, reach accurate conclusions and make reliable inferences from the numbers without knowing a lot about the game, at least in the type of analysis we are talking about here.  Determining whether the data in the study meet some level of statistical significance one way or another is NOT enough.  For fear of sounding like a broken record, it is important to do a Bayesian analysis, which requires knowing A LOT about the game, otherwise you are just data mining.

I don’t really get Tango’s point either.  Plus, Tango, it IS true that in basketball a team that outscores its opponents by 5 in the second half comes from a different population of teams than a team that outscores its opponents by 6 (or any other differential).  They are not quite as good a team.  But I don’t know what that has to do with the discussion at hand.

Yes, we all realize that these data points will be all over the map because of randomness.  At the same time, the task is to look at those anomalous data points and try and see if they fit into any plausible models that we can construct about basketball that comports with what we know or suspect about the game.  If there is some suggestion that one or more do, then it behooves us to look into that more closely to see if we can find evidence for or against.  The operative idea there is “models that comport with what we know or suspect about basketball.” That is critical and that is what is missing from your analysis and discussion and hence what makes for dubious or at least incomplete conclusions.  We CAN’T just look at real-world phenomena that we know a lot about and look at the data in a vacuum and say, “that looks like a random data point, but that doesn’t.” That is not the way to do it.  We certainly don’t do that in baseball and we certainly should not in basketball.  I truly think you (Tango) are forgetting about that.  We usually start with very plausible hypotheses in baseball, as the authors of the original study did, and then look for statistical evidence to support them or not.  You are looking at the data blindly and then using a blind (non-Bayesian) conclusion to debunk the hypothesis.  That would be like hypothesising that left-handed batters do better against RHP than LHP (and there is evidence in the experimental sciences that that “kind of thing” is true), then looking at a small sample of players’ splits, finding a 1.5 SD result (in favor of the hypothesis) and then declaring that, “That is too small an anomaly to take seriously. It is likely random variation. I’ll discard my hypothesis.” No, no, no!


#7    Tangotiger      (see all posts) 2009/03/25 (Wed) @ 07:27

King/5: yes.

MGL/6: no, I’m not forgetting anything.

I’ll be back with more later today to answer this question:

Plus, Tango, it IS true that in basketball a team that outscores its opponents by 5 in the second half comes from a different population of teams than a team that outscores its opponents by 6 (or any other differential).


#8    King Yao      (see all posts) 2009/03/25 (Wed) @ 09:28

#6: You CAN’T, and I am sure YOU will agree, reach accurate conclusions and make reliable inferences from the numbers without knowing a lot about the game, at least in the type of analysis we are talking about here.

I agree.


#9    Guy      (see all posts) 2009/03/25 (Wed) @ 15:39

Setting aside the probability lesson, let’s review the bidding:

The original study finds that -1 teams (down one at half) win about 51% of the time.  They find this is mainly a road team effect—road teams win 8% more games than expected when down 1. 

King’s large NBA dataset, using Tango’s results above, has -1 teams winning 48% of the time. (This isn’t consistent with MGL’s earlier post, and Tango’s samples are about twice as big, so I don’t know who’s right—I’m going with Tango’s #s until that’s clarified.) There appears to be a discontinuity around 0, though hard to pin down at which score/s, and here the home team appears to overperform when down 1.

Two other NCAA data sets, one quite a bit larger than the authors, find that teams down by one win about 48% of the time. 

Where does this leave us?  I’m inclined to agree with MGL that there may be something odd going on in games that are very close at half time.  A one point lead often doesn’t seem to be worth as much as it “should” be.  But I don’t think we can be nearly as sure of that as he is, given the contradictory results we’re seeing. 

And I think we can be fairly sure that teams down one at the half do not really win more often than teams tied at the half.  It’s worth considering how much attention the original paper would have received if they had used a larger NCAA dataset.  Would anyone have cared—and would NYT have published the op-ed—if their conclusion were “teams down by one point at half win slightly more than they should”?

One complication here is that the relative talent of the two teams varies by halftime score, but not in a neat, consistent way.  So you have multiple sources of noise in this data.  I have another NBA dataset that includes team win%, so we can calculate the log5 for each matchup (note that this overlaps with King’s data, so is not really a new source).  When I do that and run a regression controlling for team strength, each point at halftime is worth about .019 in win%.  Then I project a home win%:  log5(home) + .103 + .019*HalftimeLead.  Note that halftime lead is measured as above/below the mean of +2.1 points.

Q2-H Log5 Proj.  Hwin% Diff
4 0.519 0.658 0.653 -0.005
3 0.529 0.649 0.644 -0.005
2 0.489 0.590 0.598 0.008
1 0.492 0.574 0.582 0.008
0 0.474 0.536 0.545 0.009
-1 0.460 0.503 0.554 0.052
-2 0.463 0.487 0.503 0.016
-3 0.464 0.468 0.481 0.013
-4 0.461 0.445 0.436 -0.009

(sorry about formatting).  Here again the -1 result stands out at +.052—it’s about 2 SDs away from projected given sample size.  -2 and -3 home teams also do a bit better than expected.
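The projection in this comment can be checked against the table row by row. This is a minimal sketch using the .019 per-point value and the +2.1 mean halftime lead given in the text; the constant and function names are made up for illustration.

```python
HOME_ADVANTAGE = 0.103    # league-wide home advantage in win%
PER_POINT = 0.019         # win% per halftime point
MEAN_HALFTIME_LEAD = 2.1  # average home halftime lead


def projected_home_win(log5_home, halftime_lead):
    """Projected home win% from the matchup log5 and the halftime lead."""
    return log5_home + HOME_ADVANTAGE + PER_POINT * (halftime_lead - MEAN_HALFTIME_LEAD)


# (halftime lead, log5, projection as printed in the table above)
TABLE = [(4, 0.519, 0.658), (3, 0.529, 0.649), (2, 0.489, 0.590),
         (1, 0.492, 0.574), (0, 0.474, 0.536), (-1, 0.460, 0.503),
         (-2, 0.463, 0.487), (-3, 0.464, 0.468), (-4, 0.461, 0.445)]

for lead, log5_home, printed in TABLE:
    print(f"{lead:+d}: projected {projected_home_win(log5_home, lead):.3f} "
          f"(table {printed:.3f})")
```

The recomputed values agree with the printed projections to within a few thousandths, with the small gaps presumably due to rounding in the published log5 column.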

+8


#10    Guy      (see all posts) 2009/03/25 (Wed) @ 16:46

In looking at this data, I also compared win% to the log5 result (i.e., home court advantage), stratified by strength of home team.  Dividing games into 6 equal groups (2034 games per cell), you get this:

Log 5 HWin% Home Adv.
0.820 0.905 0.085
0.673 0.794 0.121
0.558 0.706 0.147
0.443 0.558 0.115
0.328 0.431 0.103
0.180 0.227 0.047

Home court advantage appears greater when teams are more evenly matched.  (Unless there’s some reason to think log5 understates the favorite’s advantage in certain ranges.)


#11    MGL      (see all posts) 2009/03/25 (Wed) @ 19:22

In basketball, unlike in baseball at least, home court advantage appears to be a fixed number of points.  This is a known thing in “basketball circles” as well (again, emphasizing the point that “knowing” about the sport helps in interpreting and making sense of the data).  Accordingly, it makes the most difference in close games.

Guy, how are you getting the strength of teams?  What database are you using?  Do you have Vegas lines that you are using as a proxy for team strength?


#12    Guy      (see all posts) 2009/03/25 (Wed) @ 19:37

MGL: the dataset I’m working with has season Win% for both the home and away team. So the “Log5” column is the log5 matchup from home team’s perspective.  The projection is Log5 plus .103 (league-wide home advantage) + .019 for each halftime point above/below average. 

Adding .103 is probably not the best home court estimate possible, as you indicate.  But since the range I’m looking at here is pretty narrow --log5 of .461 to .529—I would think it works OK.  I’m certainly open to suggested improvements.

I think the log5 pattern is interesting.  It appears step-like, though that may just be an illusion.  From -4 to -1 at halftime the matchups are basically identical, then home team is a bit better when tied, then +1 and +2 are basically the same matchups, as are the +3 and +4 games.

I do think this additional source of variation makes the approach of the original study problematic.  They draw a line through the data points, and then ask how far over/under “expectation” the outcome is for each halftime score.  But their dataset isn’t very big, so the curve itself has a lot of noise built into it.  For example, they say that a +1 home team in the NCAA “should” win 65.6% of the time, but “only” win 57.5%.  But 65.6% is much too high an estimate.  I think the average home team in NCAA wins about 62%, and should lead by about 3 at halftime.  Clearly, a home team up by only one—probably a below average team, and with a smaller than usual lead—should then win 60% of the time or less, not 65%.  They need a more solid foundation for their “expected” win%.


#13    King Yao      (see all posts) 2009/03/25 (Wed) @ 19:53

Usually, in NBA playoff series, when a team is a 4-point home favorite, they are around a 4-point underdog at the other team’s court.  Being up 2-0, down 0-2 or 1-1 going into the 3rd game has something to do with it (the team behind in the series justifiably gets a little extra boost), but typically it’s an 8-point swing.

However, when it is a lopsided matchup, the home court advantage is less.  A team that is an 8-point favorite at home is not a pick’em on the road...rather they are about a 2-point favorite. 

So, I do agree in general that big disparities in talent may show less HCA than when the two teams are about equal.  At least it is clearly that way in the NBA playoffs.
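The two playoff examples can be turned into an implied home-court value. This sketch assumes the home-court effect is symmetric (the same number of points at each venue), so it is half the swing between the two lines; the function name is illustrative.

```python
def implied_home_court(line_at_home, line_on_road):
    """Per-venue home-court value in points.

    Both arguments are the same team's expected margin (positive = favored).
    """
    return (line_at_home - line_on_road) / 2


# Evenly matched series: 4-point favorite at home, 4-point underdog on the road.
print(implied_home_court(4, -4))
# Lopsided series: 8-point favorite at home, 2-point favorite on the road.
print(implied_home_court(8, 2))
```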


#14    MGL      (see all posts) 2009/03/25 (Wed) @ 22:14

There are clearly a lot of psychological, fatigue, strategy, and personnel issues that we see in basketball that we don’t see in, for example, baseball, which renders certain points in a game very much dependent on other points in a game and very much dependent on the relative quality of the teams.

Yao, are Tango and I using the same data set?  I don’t have the numbers in front of me, but I think we got different outputs from presumably the same data.  If that be the case, one of us made a programming error.  Let me make sure I interpret the file headings correctly (they are not 100% unequivocal).  T1 is generally the home team (T1 location is usually “H”).  T1Sc is generally the home team score for the entire game, and T2Sc is the road (team 2) team final score.  Q1F is T1 (usually home) team score in the first quarter and Q1A is the road team (usually) score in the first quarter.  IOW, “F” is “for” and “A” is “against” from the perspective of team 1, which is usually the home team.  Same thing for 1H and 2H and OT.

Also, some of the dates in your file, Yao, are messed up.  For some of the season, you have the wrong year in the year field.  For example, look at 2000, 2, 3 (year, month and day), game between ATL and PHI that ended at 98-80.  That is actually from the 00-01 season and occurred in 2001 and not 2000.  You have that mistake in the file for some seasons and not for others.

It doesn’t change any of the numbers, but I was trying to match up my logs of Vegas lines with your games using the date, teams, and score and I found the problem.


#15    King Yao      (see all posts) 2009/03/25 (Wed) @ 22:17

14: Yao, are Tango and I using the same data set?

You guys should have the exact same data.  I sent him the exact same file I sent to you, unless some odd error occurred.


#16    MGL      (see all posts) 2009/03/25 (Wed) @ 22:18

OK, I got it (I am thick).  Your first field is the “season” and not the actual year.  “1993” is the 93-94 season.  As I said, the headings and the data were not 100% unequivocal.
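When matching this file against another source by date, the season field has to be converted to a calendar year. A minimal sketch, assuming a season never starts before August; that cutoff and the names below are assumptions, not something taken from the file.

```python
SEASON_START_MONTH = 8  # hypothetical cutoff: months before August belong to the next year


def calendar_year(season, month):
    """Calendar year of a game, given the season's first year and the game's month."""
    return season if month >= SEASON_START_MONTH else season + 1


# The ATL-PHI example: season 2000, played in February.
print(calendar_year(2000, 2))
# A November game in the 93-94 season.
print(calendar_year(1993, 11))
```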


#17    King Yao      (see all posts) 2009/03/25 (Wed) @ 22:28

Your interpretation of the columns is accurate.

The ATL-PHI game --- that was 2000, 1, 3, right?

The season column is not the year, but rather the first year of that season.  2000 = 2000/2001 season regardless of month. 

let me know if there are still errors or differences.


#18    King Yao      (see all posts) 2009/03/25 (Wed) @ 22:29

oops, did not see your post #16.  ignore my post #17.


#19    MGL      (see all posts) 2009/03/25 (Wed) @ 23:36

I matched up King Yao’s database with a database of Vegas lines for each game, which is a good proxy for the relative strength between the two teams.  I only have Vegas lines for the 99-06 (8) seasons, so I’ll only use that data.

orig. ps is the pre-game point spread, which is a proxy for the relative strength between both teams.  The average home team point spread is -3.2 points in this sample.

Home team

pt diff at half ---- orig. ps ----- final wp --- N

0 ----------------- -2.9 --------- .539 --- 304
+1----------------- -3.2 --------- .565 --- 347
+2----------------- -3.2 --------- .604 --- 336
+3----------------- -3.6 --------- .682 --- 314
+4----------------- -3.9 --------- .684 --- 342
+5----------------- -3.8 --------- .697 --- 320
+6----------------- -4.6 --------- .782 --- 308
+7----------------- -4.4 --------- .818 --- 314
+8----------------- -4.2 --------- .798 --- 238
+9----------------- -4.7 --------- .797 --- 281
+10 or more ------- -5.6 --------- .922 --- 1844

0 ----------------- -2.9 --------- .539 --- 304
-1----------------- -2.2 --------- .603 --- 345
-2----------------- -2.4 --------- .551 --- 272
-3----------------- -1.9 --------- .465 --- 273
-4----------------- -2.2 --------- .522 --- 268
-5----------------- -1.5 --------- .404 --- 255
-6----------------- -1.3 --------- .339 --- 245
-7----------------- -1.9 --------- .303 --- 231
-8----------------- -1.2 --------- .339 --- 224
-9----------------- -0.4 --------- .225 --- 178
-10 or more ------- -0.4 --------- .164 --- 983

In this data set (99-06), the home team wins more than 50% of the time even when down by 2 pts! Down by 1, and they win over 60% of the time.  I think that is a fluke as they are only outscoring their opponent, the road team, by 2.90 points, which gives them an average margin of victory of 1.9 points, which I don’t think is equivalent to a 60% win rate.  However, when down by 2 or 3, they are outscoring their opponent by almost 3 points even though they were only a 3.2 full game favorite, which suggests that they should be outscoring their opponent by only 1.5 or so in the second half.

In a tie game, the home team is a 2.9 pt fave before the game starts and outscores the road team in the second half by only 1 point.

When the road team is down by 1, 2, or 3 points, they are being outscored by 1 to 1.5 points in the second half, and they were a 3.2 to 3.6 pre-game dog.

Put another way, when the home team is up by a small margin, they outscore the road team in the second half by only 1 to 1.5 points, about what they should, maybe a little less.  This (out-scoring their opponent less in the 2nd half) is probably because they more often have big leads and pull their starters late in the game (the “pt shrinkage” I spoke about with either a big fave before the game or a big lead early).

However, when that same home team is behind by a small margin, they outscore the road team in the second half by around 3 points, more than they “should.” I don’t think the difference is due to shrinkage but it could be.

I think that clearly when the home team is down by a small margin at the half, they out-score the road team in the second half more than they should, based on the initial point spread and adjusting for shrinkage.

Again, I have no idea why, other than ref bias, or some psychological effect.

BTW, Guy, what kind of regression did you run to come up with the home court advantage coefficient?  A regular or logit?

Have you (or could you) ever done the same thing with a large baseball data set?  I don’t recall anyone ever doing any research on how HFA relates to a team’s overall wp.  For example, if a team is a .580 team overall, what should we expect them to do at home and on the road?  What about a .430 team?  Etc.  I’ve seen some estimates of what the function or equation should look like, but I have never seen, or I don’t recall, any good regression or other work to come up with it.  I don’t think it is obvious that it is linear, so I think it requires playing around with different best fit curves and I don’t know how to do that.


#20    Guy      (see all posts) 2009/03/26 (Thu) @ 00:08

MGL:
There was no regression involved.  I just calculated the home team’s projected win% for each game using log5.  Then grouped the games into 6 tiers based on home team’s projected win%, and compared that to their actual win%.
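The grouping described here can be sketched as follows. It uses the standard log5 formula on season win percentages and splits games into equal-sized tiers by projected home win%. The input layout (home win%, away win%, home won as 1 or 0) is a hypothetical stand-in for the actual dataset, and the toy games are made up purely to show the call.

```python
def log5(p_home, p_away):
    """log5 estimate that the home team wins, from the two season win%s.

    No home-court term is included.
    """
    home_edge = p_home * (1 - p_away)
    return home_edge / (home_edge + p_away * (1 - p_home))


def tier_table(games, n_tiers=6):
    """Rows of (mean log5, actual home win%, difference), strongest tier first.

    games: iterable of (home_wpct, away_wpct, home_won), home_won being 1 or 0.
    """
    scored = sorted(((log5(h, a), won) for h, a, won in games),
                    key=lambda row: row[0], reverse=True)
    if len(scored) < n_tiers:
        raise ValueError("need at least one game per tier")
    size = len(scored) // n_tiers
    table = []
    for t in range(n_tiers):
        # The last tier absorbs any leftover games.
        stop = (t + 1) * size if t < n_tiers - 1 else len(scored)
        tier = scored[t * size:stop]
        projected = sum(p for p, _ in tier) / len(tier)
        actual = sum(won for _, won in tier) / len(tier)
        table.append((projected, actual, actual - projected))
    return table


# Made-up games, only to show the shape of the output.
toy_games = [(0.6, 0.4, 1), (0.6, 0.4, 1), (0.4, 0.6, 0), (0.4, 0.6, 1)]
for projected, actual, home_adv in tier_table(toy_games, n_tiers=2):
    print(f"log5={projected:.3f} actual={actual:.3f} home adv={home_adv:.3f}")
```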

I know log5 works well for baseball, but the range of win% (or OBP, or ERA) is of course much narrower.  Does anyone know how well it works at more extreme values?  There are NBA games in which log5 gives one team a 95% chance of victory, and that’s without accounting for home court advantage!  So I don’t know if the smaller home court advantage I’m seeing at the extremes is real, or if log5 is perhaps exaggerating the strength of favorites in lopsided matchups.


#21    Tangotiger      (see all posts) 2009/03/26 (Thu) @ 00:21

I think that clearly when the home team is down by a small margin at the half, they out-score the road team in the second half more than they should, based on the initial point spread and adjusting for shrinkage.

Again, I have no idea why, other than ref bias, or some psychological effect.

MGL, can you redo your cool chart in post 19, but instead of looking at the score in the first half, look at the score in the second half.  That is, add up the scores for Q3+Q4.


#22    Guy      (see all posts) 2009/03/26 (Thu) @ 00:41

"I think that clearly when the home team is down by a small margin at the half, they out-score the road team in the second half more than they should, based on the initial point spread and adjusting for shrinkage.”

Shouldn’t we expect this?  The home team will have a lot of games in which it has “point shrinkage”, but relatively few in which it gains points because the road team has a blowout win and eases off.  Given that, I would expect home teams to outscore opponents by more than average in close games.  And the games most likely to stay close to the end are those in which the home team slightly trails at the half (since we expect home team to be +1 or +2 in second half).  Isn’t that what we’re seeing here?


#23    MGL      (see all posts) 2009/03/26 (Thu) @ 02:37

Guy, yes, part of the explanation is probably shrinkage, but I don’t think that explains nearly all of the difference.  I don’t think that the difference in blowout percentage between up 1 and down 1 is all that great, even including the fact that the up 1 team is a better team than the down 1 team.  But I could be wrong.  And in any case, the shrinkage only applies to the point differential and NOT the final wp.  So even if we see a larger point differential in the second half when the home team is behind (due to shrinkage), we will not see any anomalies in terms of wp due to shrinkage. Shrinkage (pulling starters) does not affect wp, otherwise they wouldn’t do it - or at least it doesn’t affect it all that much.

The problem of course, is that there is a lot of random noise in wp and less so in point differentials.  But looking at the point differentials is problematic due to shrinkage.

Ah, you used the poor man’s regression, which I am always referring to!  I love it.  I always say that you should do that first before ever doing a regression because regressions can be so opaque to both the person who does them and to the person who sees the results. And in many cases, you don’t want to even do a regression.  I agree, that log5 could easily break down in basketball, thus screwing up the home court advantage calculation for high wps.

Tango, I can do that, I guess, but I am not sure what it is going to show - but I’m sure you’ll let us know!  You want the same chart, but the first column, the point differential, to be from the second half?


#24    Tangotiger      (see all posts) 2009/03/26 (Thu) @ 07:08

Right!


#25    Guy      (see all posts) 2009/03/26 (Thu) @ 11:30

MGL:  Where do you stand at this point on the original question?  Are you still convinced that home teams down 1 at the half win more than they “should?” If the NCAA data also found a home effect, I’d be inclined to agree, but it shows the reverse (or perhaps none at all, depending on which NCAA dataset you want to look at).  Are you still convinced there’s something “real” here?


#26    MGL      (see all posts) 2009/03/26 (Thu) @ 16:03

Guy, look at the NBA data again to answer your question.  There is little doubt in my mind that something is going on.

I had a bug in the part of my program that got the second half differentials for each first half pt differential.

Here are the new numbers for 93-08 NBA, plus I am including 2nd half pt differentials by quarter.

D1=pt diff after 1st half
N=games
WP=final wp for home team
WP ALL=final wp for Home and Road
D2=pt diff for 2nd half
DQ3=pt diff for 3rd quarter
DQ4=pt diff for 4th quarter

Home Team

D1---N---WP---WP ALL----D2----DQ3----DQ4

-10 : 2202 : .163 : .109 :  3.5 : 1.6 : 1.9
-9 : 369 : .236 : .204 :  2.6 : 1.9 :  .8
-8 : 462 : .331 : .264 :  3.4 : 1.9 : 1.5
-7 : 519 : .324 : .269 :  2.7 : 1.3 : 1.4
-6 : 511 : .356 : .283 :  2.3 : 1.4 : 1.0
-5 : 568 : .389 : .333 :  2.2 : 1.2 : 1.0
-4 : 589 : .445 : .384 :  2.4 : 1.3 : 1.1
-3 : 614 : .482 : .412 :  2.2 : 1.1 : 1.2
-2 : 656 : .527 : .448 :  2.6 : 1.5 :  .9
-1 : 710 : .582 : .504 :  2.5 : 1.3 : 1.1
0 : 707 : .546 : .500 :  1.5 : 1.1 :  .4
1 : 739 : .571 : .496 :  1.2 : .8 :  .3
2 : 717 : .625 : .552 :  1.6 :  .9 :  .6
3 : 727 : .646 : .588 :  .6 :  .6 :  0
4 : 777 : .662 : .616 :  .2 :  0 :  .2
5 : 674 : .714 : .667 :  .9 :  .7 :  .1
6 : 683 : .772 : .717 :  1.1 :  .8 :  .2
7 : 653 : .775 : .731 :  .4 :  .3 :  .1
8 : 553 : .792 : .736 :  -.2 :  .2 : -.5
9 : 543 : .818 : .796 :  -.5 :  0 : -.6
10 : 4087 : .919 : .891 :  -.6 :  .1


#27    MGL      (see all posts) 2009/03/26 (Thu) @ 16:19

As Tango requested, here are the same numbers, but the first column is now the point differential for the second half rather than the first, as if the game were reversed and they played the 2nd half first. I am not sure what this shows us.

Home Team

D1---N---WP---WP ALL

-10 : 2475 : .148 : .093
-9 : 420 : .262 : .190
-8 : 480 : .252 : .203
-7 : 533 : .315 : .239
-6 : 577 : .374 : .280
-5 : 591 : .411 : .220
-4 : 598 : .420 : .356
-3 : 628 : .500 : .380
-2 : 698 : .539 : .432
-1 : 751 : .573 : .485
0 : 745 : .566 : .500
1 : 689 : .611 : .515
2 : 736 : .670 : .568
3 : 734 : .723 : .620
4 : 702 : .698 : .644
5 : 695 : .757 : .680
6 : 683 : .799 : .720
7 : 600 : .828 : .761
8 : 572 : .837 : .797
9 : 526 : .867 : .810
10 : 3627: .944 : .907

Seems to be a little bit of an anomaly when the home team is outscored by 1 in the second half, but not nearly as large as when the pt differential is -1 (or -2) in the first half.

If Tango is saying that the results should be smooth when looking at score diff in the second half (since teams don’t have crystal balls), then his point is well-taken, I think, and the data above indicate that something is going on with -1 and -2 games in the first half, but the data also indicate that something fundamental is going on when a team is up (or down) by 1 in either half. I think. I say fundamental in this case because it can’t be psychological since a team does not know the pt differential in the second half until the end of the game.


#28    MGL      (see all posts) 2009/03/26 (Thu) @ 16:43

I think I’ve got some more insight.  Though I am still not sure of the reason.

In tied games at the half, OT occurs 10.2% of the time.  The home team scored .1 pts more than the road team in OT.  Sounds about right I guess, maybe a little low. Single OT is around 10% of a full game, time-wise.

Here are the OT point differentials for all pt differentials at the half, again, from the perspective of the home team and how often OT occurs:

D1---OT pt diff---OT freq

-10+ .3 .049
-9 -.7 .073
-8 -.7 .097
-7 -.7 .067
-6 -.8 .082
-5 -.5 .070
-4 -.1 .087
-3 -.6 .085
-2 1.4 .072
-1 1.6 .085
0 .1 .102
1 .6 .083
2 1.2 .091
3 .5 .076
4 -.7 .066
5 0 .071
6 1.3 .054
7 .7 .063
8 1.3 .065
9 .9 .039
10+ .4 .028

For some reason, when the home team is down by 1 or 2 points at the half and then the game is tied in reg, they just kick the road team’s **s in OT!


#29    Tangotiger      (see all posts) 2009/03/26 (Thu) @ 16:45

EXACTLY 100% my point!


#30    Tangotiger      (see all posts) 2009/03/26 (Thu) @ 16:46

(29 was in reference to 27)


#31    King Yao      (see all posts) 2009/03/26 (Thu) @ 16:51

Interesting twist to look at OTs.
Can you rerun the table in post #26 with only non-OT games?


#32    Guy      (see all posts) 2009/03/26 (Thu) @ 17:05

MGL:  Why “little doubt” something is going on?  If you take your +1 teams in post 26 and knock it down 2SDs, you get a .48 and everything looks pretty smooth and there’s nothing much to talk about.  Now, don’t get all Bayesian on me and remind me of prior probabilities--I get that.  But the NCAA data are quite mixed, and show the opposite pattern in terms of home/road effect.  So I’m not being argumentative (or stubborn):  what is so convincing here?

To me the biggest sign of an anomaly is the change in second half scoring from -1 to tied (for home team).  It suddenly drops a full point and stays down, and seems like the curve should be much smoother there.  Maybe OT plays a role?
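
The “2SDs” arithmetic can be sketched as follows. This is only a rough check, and it rests on two assumptions that are not stated in the post: that the figure in question is the pooled WP ALL for teams trailing by 1 at the half (.504 in post 26), and that a simple binomial standard error is appropriate. The row values come from post 26; the variable names are made up for illustration.

```python
import math

# Rows from post 26 (home team's perspective).
n_home_down1, wp_home_down1 = 710, 0.582   # home team down 1 at the half
n_home_up1, wp_home_up1 = 739, 0.571       # home team up 1, so the road team is down 1

# Wins by the team trailing by 1, home and road combined.
wins = n_home_down1 * wp_home_down1 + n_home_up1 * (1 - wp_home_up1)
n = n_home_down1 + n_home_up1
pooled_wp = wins / n

# Binomial standard error of a proportion (assumes independent games).
se = math.sqrt(pooled_wp * (1 - pooled_wp) / n)
lower = pooled_wp - 2 * se

print(f"pooled WP = {pooled_wp:.3f}, SE = {se:.4f}, WP - 2*SE = {lower:.3f}")
```

Under those assumptions the pooled figure matches the .504 in the WP ALL column, and two standard errors below it lands at roughly .48.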


#33    MGL      (see all posts) 2009/03/26 (Thu) @ 22:30

When I say “little doubt” I mean like 10% doubt.  So I am not THAT sure. I am a little biased because I know some half-time basketball bettors that have known about this for years, but that still doesn’t mean it is “real.” Other than that, I don’t know anything more than anyone else.  The data is the data.  I am also influenced by the fact that basketball is fraught with all the things I have spoken of, like dependence, psychological factors, fatigue, etc.  As well, in basketball there appear to be things going on at home and on the road other than just a 3 or 4 pt difference.  At least that is what most basketball people think. That is all part of the Bayesian equation.  As I said, it doesn’t really matter what I think. Sometimes we just don’t know.  I mean, with sample data, we never know anything for sure, but sometimes we know things 99.9%, sometimes we have no idea, and everything in between. This is one of those in between things, I think.  I mean it is silly to say we “know” anything from sample data, unless the Bayesian prior probabilities are really strong.


#34    MGL      (see all posts) 2009/03/27 (Fri) @ 02:10

Same chart, but with all OT games removed. There is still a disconnect at -1 point for the home team.  They win more than in a tie game.

Home Team

D1---N---WP---WP ALL

-10: 2094: .144 : .095
-9 : 342 : .228 : .194
-8 : 417 : .317 : .251
-7 : 484 : .316 : .255
-6 : 469 : .350 : .272
-5 : 528 : .386 : .324
-4 : 538 : .441 : .375
-3 : 562 : .489 : .412
-2 : 609 : .519 : .440
-1 : 650 : .578 : .501
0 : 645 : .551 : .500
1 : 678 : .574 : .499
2 : 652 : .633 : .560
3 : 672 : .652 : .588
4 : 726 : .674 : .625
5 : 626 : .728 : .676
6 : 646 : .785 : .725
7 : 612 : .794 : .745
8 : 517 : .803 : .749
9 : 522 : .828 : .806
10 : 3974: .931 : .905

Guy, in both data sets - the NBA I am using from King Yao, and the NCAA in the authors’ data set, there is a disconnect at 1 point down for both home and away teams combined.  That is powerful evidence.  The fact that that is only true for the home team in the NBA and it is not true for the NCAA data set does not mean all that much especially since NO ONE is hypothesising, certainly not the original authors, that this effect, whatever it is, should be true only for the home team.  The authors are hypothesising that a small deficit at the half in a basketball game creates extra effort in the second half which leads to the losing team winning the game more often than expected, and in fact, more than in a tie game.  That is exactly what they found!  We looked at exactly the same hypothesis in the NBA, and we found exactly the same thing!  The fact that we found it only for the home team in the NBA data means almost nothing, since home/road was not part of the original hypothesis.  What if we found that the “-1 effect” was only true for teams west of the Mississippi or that wear red at home?  Would that invalidate or qualify the conclusion as well?  No!

All it does is give us another piece of data to work with.  If both in the NBA and in the NCAA, the effect was mostly with the home team, then perhaps the hypothesis would need to be modified or even changed.  But that is not the case, is it?

If we knew more about this effect with respect to basketball, and if the effect is real (again, we don’t know that with 100% or near 100% certainty), then perhaps we could begin to investigate if the “home/road stuff” in the NBA is meaningful and why it might not show up in the NCAA data.  For all we know, it could be only or mostly a home effect and the fact that it wasn’t only a home effect in the NCAA was a fluke.  Or it could turn out that the home/road differences in the NBA are a fluke.  But no matter what, the authors formulated an hypothesis and we found strong evidence to support that hypothesis in two independent data sets.  Why do you (Guy) continue to have a problem with that?  That is the way scientific inquiry generally works, as far as I know - or at least ONE way in which it works.


#35    Guy      (see all posts) 2009/03/27 (Fri) @ 07:23

MGL:
I am persuaded that a one point lead in basketball is less valuable than it “should” be, if should means assuming each additional point has the same effect on win%.  I don’t think +1 and tied teams truly win equally, as happens to be the case in your dataset (but not in a large NCAA dataset), but it does appear a +1 team wins just 51% to 52% of the time.

Now, is the cause structural or the result of effort?  I don’t know, but lean toward structure.  The fact that -1 teams in the second half also win “too many” games supports this. 

* *

Back to the log5 issue, if I look at all teams (home and road), log5 provides pretty good predictions.  Favorites win a bit more than log5 anticipates, but not by a lot.  So I think it is true that home court advantage is greatest when teams are closely matched.  The six groups (sextiles?) are each 1/6 of the games, divided by log5 expectation.  Last column is home win% minus win% for all teams in that cohort.

Log 5 / W% / Home edge
0.820 0.839 0.066
0.673 0.682 0.113
0.558 0.574 0.132
0.443 0.426 0.132
0.328 0.318 0.113
0.180 0.161 0.066


#36    King Yao      (see all posts) 2009/03/27 (Fri) @ 09:27

I agree with Guy’s first paragraph in post #35.


#37    MGL      (see all posts) 2009/03/27 (Fri) @ 13:26

Guy, any chance you can do that for baseball (log5, home and road)?


#38    MGL      (see all posts) 2009/03/27 (Fri) @ 21:56

Without doing the math myself, are those numbers in 35 consistent with just adding a fixed number of points for the HCA, like 3 or 4?  I’m not sure exactly how to test that, but it could be done.

Does anyone know of any studies in baseball that look at the HFA for different kinds of matchups, like Guy does above?


#39    Guy      (see all posts) 2009/03/27 (Fri) @ 23:14

MGL:  I don’t have any comparable data on baseball games.  Tippett and Ruane look at some of these issues in baseball context here: http://www.diamond-mind.com/articles/playoff.htm.  However, they are using a simplified version of log5—.500 + Awin% - Bwin%—rather than log5. 
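
To see how the two formulas mentioned here differ, a minimal sketch follows. It assumes the standard Bill James form of log5 and takes the simplified version exactly as written above (.500 + Awin% - Bwin%). The sample win percentages are hypothetical and are not taken from any dataset in this thread.

```python
def log5(a: float, b: float) -> float:
    """Log5 estimate that a team with win% a beats a team with win% b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def simplified(a: float, b: float) -> float:
    """The linear shortcut described above: .500 + Awin% - Bwin%."""
    return 0.5 + a - b


# Hypothetical matchups, from closely matched to very lopsided.
for a, b in [(0.55, 0.45), (0.65, 0.35), (0.75, 0.25)]:
    print(f"A={a:.2f} B={b:.2f}  log5={log5(a, b):.3f}  simplified={simplified(a, b):.3f}")
```

The two agree closely for evenly matched teams and drift apart as the gap widens, with the linear shortcut giving the favorite the larger number.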

As for home court being worth +3 points or something like that, pythag shows a similar pattern:  average teams gain more from a 3-point gain than either weak or strong teams.  If we assume opponent scores 95 points, and use an exponent of 14 (I gather there’s some debate in basketball circles about the best exponent), you get this:

Pts Pythag Delta
86 0.199 *
89 0.286 0.087
92 0.389 0.103
95 0.5 0.111
98 0.607 0.107
101 0.702 0.095
104 0.78 0.078

It should be easy to see if very strong and weak teams actually have smaller home court advantages.
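
The Pythagorean table above can be reproduced with a few lines. This sketch uses the stated assumptions (opponent scores 95, exponent 14); the function name is made up for illustration, and the third decimal may differ from the table by .001 because of rounding.

```python
def pythag(points_for: float, points_against: float, exponent: float = 14.0) -> float:
    """Pythagorean win expectation: PF^x / (PF^x + PA^x)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


OPPONENT_POINTS = 95  # assumption stated in the post

previous = None
for pts in range(86, 105, 3):  # 86, 89, ..., 104
    wp = pythag(pts, OPPONENT_POINTS)
    delta = "*" if previous is None else f"{wp - previous:.3f}"
    print(f"{pts:>3} {wp:.3f} {delta}")
    previous = wp
```

The delta column peaks around the .500 row, which is the pattern described: a fixed 3-point gain is worth the most to an average team.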


#40    MGL      (see all posts) 2009/03/28 (Sat) @ 01:34

Guy, what is “delta” in your table? Delta what?


#41    MGL      (see all posts) 2009/03/28 (Sat) @ 01:44

All the Ruane and Tippett article does is compare the modified log5 outcome with the true outcome for various strength teams (they found that as the gap between the two teams widens, the modified log5 overestimates the favored team’s expected wp - which they attribute - with no evidence - to teams resting key players versus poorer teams).

There is nothing in the article that looks at the relationship between HFA in baseball and a team’s overall wp and/or the gap in strength between the 2 teams.

Surely there must be some research in this area.


#42    Guy      (see all posts) 2009/03/28 (Sat) @ 07:16

Sorry, it’s just the change from prior win%.  So +3 is worth 87 points to a .199 team, 111 points to a .389 team, etc.


#43    MGL      (see all posts) 2009/03/28 (Sat) @ 16:07

Guy, oh, OK, I see. My bad....


#44    King Yao      (see all posts) 2009/03/28 (Sat) @ 20:17

One thing I am pretty sure of ... the authors of the NCAA paper did a poor job of getting data.  There are thousands if not tens of thousands of games that they could have easily gotten 1st Half and 2nd Half results.  But they only got data from 2005 to 2008.  They could have gotten data from 1985 to 2008.  I’m sure they have research assistants they can ask to do the data-gathering.  So why did they jeopardize writing a paper that could have odd results due to small sample sizes?  The cynic in me thinks it is because otherwise they wouldn’t have a paper to write at all. 

Am I being too negative on the academic field?


#45    Tangotiger      (see all posts) 2009/03/28 (Sat) @ 22:27

Without necessarily painting everyone with the same brush, you seem to capture the mood of many on the matter…


#46    MGL      (see all posts) 2009/03/28 (Sat) @ 22:52

#44, who knows, but I would not go so far as to say that they limited the data in order to increase their chances of finding anomalous results.  That doesn’t really even make sense. I don’t even know how they would accomplish that.  They probably just didn’t have the time or money or energy to collect more data.  I’m sure they realize that the more data they have, the more reliable results they will get, one way or another.

I think that I mentioned this in a previous post, but when doing research on sample data or doing controlled experiments on random subjects, you should always temper your conclusions by saying something like, “More research needs to be done in this area for us to have more certainty.” That certainly would have been appropriate in this case, given the relatively small sample sizes and given the “surprise” nature of the results (maybe the researchers weren’t surprised, given their initial hypothesis).

By the way, that brings up an important point that we mention from time to time on this blog.  And that is of publishing bias.  Let’s say that out of every 100 studies like this one, even with plausible initial hypotheses, we find one or two, by chance alone, where the data support the hypotheses.  And what if only those one or two ever get published?  That is “publishing bias.” That happens all the time and has to be taken into consideration when considering the merits of these kinds of studies and the conclusions that can be drawn from them.


#47    BC      (see all posts) 2009/04/09 (Thu) @ 21:48

#46-I attended a philosophy talk a few years ago where the author gave a serious argument against the very intuitive & classical-statistics-endorsed Bonferroni Correction, which also undermines the “publishing bias” worry (the Bon. Correction demands a higher level of significance for any single hypothesis when multiple hypotheses are tested against a single set of data.)

I think this is it, if you’re interested (click name)

There’s also a bunch of other stuff on the Bonferroni Correction out there.


#48    Tangotiger      (see all posts) 2009/04/10 (Fri) @ 10:49

Post/47: marked for moderation and now open.


#49    MGL      (see all posts) 2009/04/10 (Fri) @ 17:15

BC, thanks.  Yes, I have talked about that many times, but I am not familiar with the concept, mathematically.  Sounds like this is it.  I have always wanted to quantify it (when testing multiple hypotheses on the same data, what the equivalent significance levels are), but never knew how.  Maybe this article will help, if I can understand it.


#50    MGL      (see all posts) 2009/04/11 (Sat) @ 05:43

BC, the link does not work - at least I cannot get it to work…


#51    BC      (see all posts) 2009/04/11 (Sat) @ 12:54

MGL--here’s a link (name) to the google html version of the pdf. Hope that works. If not, googling the author, “Matthew Kotzen” should take you to his webpage, where there should be a link to the article, “Multiple Studies and Evidential Defeat.”

Since I last wrote, I’ve been checking out a lot of the other articles criticizing the Bonferroni Correction out there, but those are mostly geared toward tweaks of it and problems related to specific fields of research--so this is as yet the best article I’ve seen on the general issue of whether a correction should be made at all.


#52    MGL      (see all posts) 2009/04/11 (Sat) @ 17:02

I have read a few pages of the study.  I am impressed with the clarity of this guy’s writing. I wish that more academics would write like this.  That is the style (unfortunately, often boring) I try to use - words and sentence structures that make one’s communication crystal clear.

However, I am intrigued by what he is going to say in terms of criticizing or refuting the “defeatist” point of view.  As far as I know and as far as I can surmise from pure mathematical logic, the “defeatist” paradigm is 100% correct. “Defeatist” is the term he uses when one (rightfully) discounts the (statistical) significance level of a relationship when several relationships were tested, but those multitudes of tests were not necessarily reported and are not factored into the computed significance level.  Exactly what we have been talking about with regard to this basketball study. I call it data mining.  Basically, the idea is that if you look at enough relationships, you will find X (depending on how many you look at) that are rare, yet occurred by chance.  You report that you found such a relationship and that the chance of that relationship (or one more severe, of course) occurring by chance is very low (for example, .01, which is considered statistically significant).  So you conclude that this relationship is likely to be “real.” The mistake, of course, which this author (curiously) calls “defeatism” (the knowledge that you made a mistake, that is), is that you looked at so many relationships that one or more rare ones were VERY likely to have occurred by chance, therefore your conclusion is either wrong or at the very least must be modified, depending upon the p value of the relationship you found, and how many relationships you looked at in the first place.

Anyway, we’ll see what he has to say. I am suspecting that he is a bit of a crackpot (if he says that this “defeatist” paradigm is wrong - which it isn’t), but that remains to be seen…
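
The data-mining point above can be put in numbers with a small sketch. It assumes the tests are independent and that none of the relationships is real, which is a simplification. The function names are made up for illustration. The second function is the Bonferroni correction as described in post 47: divide the significance level by the number of tests.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one 'significant' result among m independent null tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / m


alpha = 0.01
for m in (1, 10, 100):
    fwer = familywise_error_rate(alpha, m)
    corrected = bonferroni_alpha(alpha, m)
    print(f"{m:>3} tests: P(at least one fluke) = {fwer:.3f}, "
          f"Bonferroni per-test level = {corrected:.5f}")
```

With 100 relationships tested at the .01 level, the chance that at least one comes up “significant” by luck alone is about 63% under these assumptions.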


#53    MGL      (see all posts) 2009/04/11 (Sat) @ 20:12

I was way too hasty in saying that this guy writes clearly. His intro was clear and well-written, but the rest of the article is not.  He is also dead wrong and I don’t have the time right now to read the whole thing or to comment on why he is wrong.  The big problem is that he is not even remotely a statistician, I don’t think, and therefore he is not qualified to comment on the Bonferroni correction, yet he seems to think he has all the answers - which he doesn’t.

Here is an example of his faulty thinking/logic, from his article:

Fifth, it is at least somewhat natural to think that, as long as we’re sure that the data hasn’t been falsified or altered in some other uncontroversially misleading manner, an evaluation of the evidential impact of a study shouldn’t require an investigation into the goals or intentions or hopes or histories of the researchers who conducted that study. If we find a researcher’s laboratory notes from a particular study but can’t locate her for some reason, defeatism seems to entail that we should ask her friends, family, and colleagues whether she ran other studies besides that one or not; if she ran others, Defeatism entails that the study is less significant than if she didn’t. Moreover, Defeatism looks to be committed to the evidential relevance even of counterfactual scenarios.

Just the fact that he calls this whole thing “defeatism” suggests that he is a philosopher/psychologist/sociologist type academician, and this issue is strictly mathematical and statistical.


#54    MGL      (see all posts) 2009/04/12 (Sun) @ 05:09

Here is another thing he gets dead wrong:

Under ordinary circumstances where only one study is performed, I don’t think that a hypothesis about the researcher’s private reasons for conducting that study plays a crucial role in grounding the conclusions that we draw from the results of the study; after all, these reasons typically aren’t cited in medical or scientific journals, and classical statisticians insist that we can evaluate the probative force of a study without having to know anything about independent evidence for or against some connection.

I don’t know what he means by “under ordinary circumstance,” but, as I have been saying for years, because these problems are Bayesian ones, it very much matters what the researchers’ reasons for choosing a study are because it very much affects the a priori probabilities.

I don’t know who these “classical statisticians” are, but he is misinterpreting them.  We don’t HAVE TO know anything about the researchers’ motivations for the study (their independent belief as to the existence or strength of the relationship or effect being tested), but it sure helps if we do!  And in many cases it behooves us to try and figure out what if any beliefs or evidence as to that relationship existed before the study was conducted (independent of the study actually).  In fact, one of the first things a person should do when he is evaluating a study is investigate the researchers’ motivation for the study, the past histories of similar studies, and other independent evidence of any relationships or effects being tested in the study.

At the beginning of this article he mockingly says that if we should be concerned about the number of tests being conducted by the researchers (and the fact that in a small percentage of them they find an anomalous result), we should also be concerned with tests being conducted by other researchers - in fact, all tests conducted in all of history.  Again, he mocks that idea. Well guess what?  It is true that we very much should be concerned with those things!  That is publication bias in a nutshell! 

I’ll give my classic example again from baseball:  This author would want you to believe (I think) that a 2 sigma result if we tested the differences between power and finesse pitchers in odd and even days yields the same conclusion as a 2 sigma result when testing the difference between power and finesse pitchers in the day or night.  It doesn’t!

Just as I rail against statisticians and econometricians who know nothing about sports doing sports research, this guy appears to be a philosopher sticking his nose into statistics, in which it clearly does not belong…
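
The odd/even versus day/night example above is an argument about prior probabilities, and a toy calculation shows why the same 2 sigma result can lead to different conclusions. Every number below is hypothetical: the two priors and the likelihood ratio are picked only to illustrate the mechanics and do not come from any study.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability that an effect is real, via Bayes' rule in odds form."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical: how much more likely the observed split is if the effect is real.
LIKELIHOOD_RATIO = 10.0

# Hypothetical priors: odd/even days should not matter; day/night plausibly could.
priors = {"odd/even days": 0.001, "day/night": 0.30}

for label, prior in priors.items():
    print(f"{label}: prior = {prior:.3f}, "
          f"posterior = {posterior(prior, LIKELIHOOD_RATIO):.3f}")
```

With these made-up inputs the odd/even split stays at about 1% likely to be real, while the day/night split moves to about 81%, even though the evidence is identical.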


#55    BC      (see all posts) 2009/04/13 (Mon) @ 10:59

MGL--I hadn’t seen the earlier thread/your comment about the day/night vs. odd/even splits, which helps. A couple of notes about this:

Am I correct in making the same conclusion about both anomalous splits I found since they were both 2 SD from the norm?  Not even close!  Why? If we do a Bayesian analysis where the a priori probability is almost zero and then after looking at a sample, it is 2SD from the norm, guess what?

1. Minor thing: I think you’re misusing the term “a priori probability” here. The prior probabilities that you’re adverting to are, presumably, our commonsense (or expert) background knowledge about the game of baseball, which are subjective probabilities. I believe the term “a priori probability” is usually reserved for other ways (e.g., non-empirical) of deducing prior probabilities over a probability space (e.g. the principle of indifference).

2. Now that I think I understand both of you, it’s clear that the issue that Kotzen is raising in the article, then, is (in large part) different than the one you’re concerned with. He would not disagree with your point that prior probabilities matter. Apologies for sending you on a semi-wild-goose chase.

3. The non-wild-goose chase part: what the day/night vs odd/even example in part obscures, by introducing different priors, is the issue that Kotzen is on about: should one use the simple fact that you’ve conducted 100 studies to automatically discount (to some degree or other) (both) the day/night and odd/even results?

4. I think there is something to Kotzen’s “generality worry”, the worry (that you dismiss) that we need to be concerned with “all tests conducted in all of history.” Viz., it has the counter-intuitive consequence that our confidence in, say, your day/night study should be diminished (to some degree) by, say, the fact that some other guy conducted 1,000,000 statistical studies on, say, earthquakes.

5. Like you, I don’t find Kotzen’s arguments persuasive--I don’t think he’s putting much weight on the particular passages you cited. If there’s a place where his argument lives or dies, it’s in his “Toy Case.” I haven’t had the time/dedication to tackle that one yet, but I think it’ll show whether there’s merit to his position or not.


#56    BC      (see all posts) 2009/04/13 (Mon) @ 11:27

Okay, here’s Kotzen’s “Toy Case”

Consider a jar containing 1,000,000 well-mixed dice, 99% (i.e., 990,000) of which are fair and 1% (i.e., 10,000) of which are biased in such a way that they always land 6. Now, suppose that a die (call it “Harry”) is collected at random and rolled three times in a row, landing 6 all three times. How confident should we be that Harry is biased?

If we’re Defeatists, then we’ll think that the case is crucially under-described. For a Defeatist, knowing merely that Harry was collected at random in some sense of the term “random” isn’t enough to know how we should set our credence that Harry is biased. If Harry was the only die that was selected from the jar and rolled, then perhaps we would have good reason to believe that Harry is one of the biased dice (after all, the probability that any particular fair die will land 6 three times in a row is 1/6^3 ≈ 0.00463, whereas a biased die that is tossed three times is certain to land 6 all three times). But if some large number of dice (say, 1,000, including Harry) were all randomly collected from the jar at once, and they were all tossed three times, and all we know about Harry is that it was one of the dice that landed 6 all three times, then we would have less reason to believe that Harry is biased. After all, given that we collected 1,000 dice and rolled them each three times, the probability is quite high (approximately .99) that at least one of the collected dice would land 6 three times in a row, even if every single one of them is in fact fair… Thus...the Defeatist would claim that we should be less confident that Harry is one of the biased dice if 1,000 dice (including Harry) were drawn from the jar and rolled than if just Harry was.
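
The “approximately .99” figure in the quoted passage is easy to check. The sketch below follows the passage’s own hypothetical, in which all 1,000 drawn dice are fair; the variable names are made up for illustration.

```python
# Probability that one fair die lands 6 three times in a row.
p_triple_six = (1 / 6) ** 3          # 1/216, about 0.00463

# Probability that at least one of 1,000 fair dice does so.
n_dice = 1000
p_at_least_one = 1 - (1 - p_triple_six) ** n_dice

print(f"P(6-6-6 for one fair die)    = {p_triple_six:.5f}")
print(f"P(at least one among {n_dice}) = {p_at_least_one:.3f}")
```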

1. The posterior probability following the non-Defeatist claim is .69, and Kotzen asserts (but doesn’t show) that a Monte Carlo simulation will show that betting with less confidence than .69 (as Defeatism suggests) would be a losing proposition.  But I haven’t had the chance to run the simulation myself.

2. MGL: do you think drawing 999 other dice should affect our credence in Harry being biased? (If not, in what way do you think this scenario is disanalogous to the scenario of publication bias)?
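
A simulation along these lines might look like the sketch below. It is not Kotzen’s simulation, which is not shown in the article excerpt. It estimates only one thing: among dice that land 6-6-6 when drawn in batches of 1,000, what share are biased. As a simplifying assumption, each drawn die is treated as independently biased with probability 0.01, which approximates drawing without replacement from the 1,000,000-die jar. The batch count and seed are arbitrary.

```python
import random


def simulate(n_batches: int = 500, batch_size: int = 1000,
             p_biased: float = 0.01, seed: int = 0) -> float:
    """Share of 6-6-6 dice that are biased when dice are drawn in large batches."""
    rng = random.Random(seed)
    triple_six = 0          # dice that landed 6 three times
    triple_six_biased = 0   # ...of which actually biased
    for _ in range(n_batches):
        for _ in range(batch_size):
            biased = rng.random() < p_biased
            if biased:
                all_six = True  # a biased die always lands 6
            else:
                all_six = all(rng.randint(1, 6) == 6 for _ in range(3))
            if all_six:
                triple_six += 1
                triple_six_biased += biased
    return triple_six_biased / triple_six


estimate = simulate()
print(f"Share of 6-6-6 dice that are biased: {estimate:.3f}")
```

The estimate should land near .69, the same figure as for a single die drawn alone. The sketch does not model the betting argument itself.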


#57    Gary Geiger Conter      (see all posts) 2009/04/13 (Mon) @ 11:37

Post/47: marked for moderation and now open.

Tom, what does this mean when I read this?  IS there a FAQ somewhere?


#58    Tangotiger      (see all posts) 2009/04/13 (Mon) @ 12:19

The software sometimes flags posts for whatever reason.  The reasons I have found are:
- using links preceded by http
- using a lot of CAPS
- using html tags, with greater than, less than signs, rather than using square brackets

So, it goes into the moderation pile, which I need to open.


#59    MGL      (see all posts) 2009/04/13 (Mon) @ 17:57

If we choose one die at random and roll a 6 3 times in a row, the chance that we have a biased die is completely given by the information we have.

Chance of a fair die coming up 6 3 times in a row is of course 1/216. 

Chance of a biased die, 1.00.

Chance that our die is indeed biased is:

.01/(.99*(1/216)+.01), or .686, as you say.

Why does “defeatism” suggest a lower number than that?  That number is 100% correct if we draw one die at random.

I’m not sure what you are asking in terms of drawing 999 other dice. You have to be specific as far as what the experiment is and what the question is based on the experiment.

If we draw 1000 dice one at a time, at random, we will on average get 14.58 of them to come up with a 6 three times in a row of course.  Out of 1000 dice, 990 will be fair and 10 will be biased.  Those 10 will all come up with a 6 three times in a row, and 1/216 of the 990 fair dice will come up with a 6 three times in a row, for a combined total of 10 plus 990/216 or 14.58.

If I pull out 1 of those 14.58 dice and ask what the chances are that it is biased, it is 10/14.58 or .686, same as before.  So I don’t understand the significance of the “1000 selections” question.
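
Both routes to .686 can be checked with exact fractions. This is only a sketch of the arithmetic already given above, using the numbers from the toy case; the variable names are made up for illustration.

```python
from fractions import Fraction

prior_biased = Fraction(1, 100)     # 1% of the dice in the jar are biased
p_666_fair = Fraction(1, 216)       # fair die lands 6 three times in a row
p_666_biased = Fraction(1)          # biased die always lands 6

# Route 1: Bayes' rule for a single die drawn at random.
posterior = (prior_biased * p_666_biased) / (
    prior_biased * p_666_biased + (1 - prior_biased) * p_666_fair
)

# Route 2: expected counts when 1000 dice are drawn.
n = 1000
expected_biased = n * prior_biased                       # 10
expected_fair_666 = n * (1 - prior_biased) * p_666_fair  # 990/216
expected_666 = expected_biased + expected_fair_666       # about 14.58
share_biased = expected_biased / expected_666

print(f"Bayes posterior:      {posterior} = {float(posterior):.3f}")
print(f"Expected 6-6-6 dice:  {float(expected_666):.2f}")
print(f"Share that is biased: {share_biased} = {float(share_biased):.3f}")
```

Both routes reduce to the same fraction, 24/35.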

Publication bias would be if I did the one die experiment 1000 times and I only reported one of the 14.58 times it came up 6 three times in a row, and then concluded that there was a 68.6% chance that I had a biased die.

I think I have that right, but I am not sure.


#60    BC      (see all posts) 2009/04/13 (Mon) @ 18:51

MGL--thanks for going through the math on all that. That’s all correct, so far as I can tell. One key point, one key question:

1. The key point (which you show):

In the scenario described, selecting 1000 dice and rolling them 3 times each should leave you with 14.58 “6-6-6” results, and the credence that we should have that any of those 14.58 dice are biased is 10/14.58 = .686.

That’s the same credence that we should have had we pulled one die, rolled it 3 times, and gotten the “6-6-6” result.

That is, selecting the 1000 dice is irrelevant to how informative the 6-6-6 result is for any particular die: the 6-6-6 result always gives us a .686 credence that the die is biased.

2. The key question is then whether selecting 1000 dice and rolling them 3 times is relevantly similar to the conditions that are supposed to create publication bias.

The analogy, I take it, is supposed to be this:

a. The 1,000,000 well-mixed dice: stand in for all possible subjects of statistical study (e.g., whether ERA is correlated to odd/even days; whether hair color is correlated to power hitting; etc. etc.)

b. The 10,000 biased dice stand in for the (different) sets of variables that are really truly correlated (e.g., weight & HRs).

c. The 1,000 dice drawn are the studies we actually performed.

Don’t know if the analogy holds or breaks down--have to think about it more. But that’s the suggestion.

And if it’s right, then the conclusion is: just as we don’t have to consider how many dice we drew to determine that the 6-6-6 result for this die gives us a .686 credence that this die is biased, we don’t have to look at how many other studies are performed to figure out what confidence to have in this statistical study.

(Sorry if this is pedantic--really just trying to make it clear for myself.)


#61    BC      (see all posts) 2009/04/13 (Mon) @ 19:00

Just to be perfectly clear for those who might be trying to follow along (of which there are likely not many!)--

c. The 1000 dice drawn and their three-throw trials represent studies on different topics, not 1000 studies of the same subject.


#62    MGL      (see all posts) 2009/04/13 (Mon) @ 23:16

I follow what you are saying about the 1000 dice selection and the one die selection, but that is NOT the same as publication bias or “data mining” or having to use a Bonferroni correction.  It can’t be, because the assumption underlying the Bonferroni correction is correct.  The only (legitimate) controversy surrounding the Bonferroni correction is whether it is the exactly correct solution to the problem.  There is no controversy regarding why a correction is needed.  That is the issue that I have with this article, although in all fairness, I did not understand it after the introduction and I lost interest in it as well.  He seems to suggest that no correction is needed and that is clearly incorrect. 

All of this makes my head spin, so I am not sure I can articulate why there is a difference between publication bias and/or “data mining” and the dice example, but it is 100% clear to me that there is a difference. If the author is suggesting that his dice example is a good analogy for why no “correction” is needed when an experimenter performs multiple tests on the same data, or when different researchers perform tests on the same data, then he doesn’t know what he is talking about. Of that, I am certain.

To be honest, the part that I have not yet wrapped my own head around is the notion that if we perform all kinds of experiments on all kinds of data for all kinds of things (sports, medicine, etc.), then by chance alone we will find many anomalous relationships and thus make many Type I (and Type II) errors. Yet we don’t need a correction for every experiment based on the entire history of all experiments ever done. I’m not sure why that is.

I think that this author has a hard time wrapping his head around these tricky things too, and rather than admitting that he just doesn’t know the answers, he comes up with answers of his own that he is not qualified to give. He should be asking statisticians these questions or combing the published research to find the answers, not making them up himself, as he has little (or at least not enough) expertise in this area.

I must admit it is a fascinating topic. Unfortunately, I don’t have the time right now to think too much about it. If BC or anyone else has further insight, I’d be happy to hear from them on this thread. I don’t think there is much more for me to offer.

