Wednesday, November 10, 2010
Brothers stealing
The authors responded to Phil about that article. It seems that it’s a question of definitions, as noted on page 10:
Ironically, although odds ratios are often used in an attempt to clarify complex statistical findings, people who are not familiar with them sometimes misinterpret what odds ratios do and do not mean. In our own data for major league brothers, for example, an odds ratio of 10.58 to 1 in favor of younger brothers attempting to steal more bases per opportunity does not mean that younger brothers attempted 10.58 times the number of steals as did their older brothers. Similarly, this statistic also does not mean, as Schwarz (2010) mistakenly reported in the New York Times, that more than 90 percent of younger brothers attempted more steals per opportunity than their own older brothers.
Only 59 percent of younger brothers in our sample attempted more steals per opportunity, although this statistic, uncontrolled for call-up sequence, considerably underestimates the overall effect for this measure, just as computing an odds ratio without regard to call-up sequence underestimates the effect. For example, among the 10 brothers in our study who were called up during the same year—where there is no possible bias owing to callup sequence—80 percent of younger brothers (4 out of 5) attempted more stolen bases per opportunity than their older brothers, yielding a relative risk ratio 4.00 to 1 (80%/20%), and an odds ratio of 16.00 to 1.
I have no idea what the heck they are talking about, other than that they may have double-counted. Say the Yankees, with a true-talent .667 win%, face the Royals, with a true-talent .333 win%. The Odds Ratio matchup method divides .667/.333 (the Yankees’ 2:1 odds) by .333/.667 (the Royals’ 1:2 odds) to give a 4:1 odds ratio, implying a Yankees win% of .800.
If, however, your universe of teams is ONLY the Yankees and Royals, and you observe one million games in which the Yankees have a .667 win% and, by default, the Royals have a .333 win%, the Odds Ratio would imply a win% of .667 for the Yankees.
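To make the arithmetic concrete, here is a minimal Python sketch of the Odds Ratio matchup for the Yankees and Royals example. The function names (odds, matchup_win_pct) are my own labels for illustration, and the .667 and .333 figures are treated as exactly 2/3 and 1/3.

```python
def odds(p):
    """Convert a probability (win%) into odds in favor, e.g. 2/3 -> 2.0 (2:1)."""
    return p / (1.0 - p)


def matchup_win_pct(p_a, p_b):
    """Odds Ratio matchup: divide A's odds by B's odds, then turn the
    resulting odds ratio back into a win% for A."""
    ratio = odds(p_a) / odds(p_b)
    return ratio / (1.0 + ratio)


yankees, royals = 2 / 3, 1 / 3  # assumed exact values behind .667 and .333

print(round(odds(yankees) / odds(royals), 2))      # odds ratio, about 4.0
print(round(matchup_win_pct(yankees, royals), 3))  # implied win%, about 0.8
```

Note that the 4:1 figure only makes sense when each team's win% was earned against the rest of the league. If the .667 was observed head-to-head, the odds are already 2:1 and there is nothing left to divide.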
So, going back to the authors, it seems to me that when they report an Odds Ratio of 10:1 for younger brothers stealing more, what they really mean is 3.16:1 (the square root of 10), or 76% as the likelihood of a younger brother stealing more than an older brother. Indeed, why not just report that figure instead? Give us the “win%” of younger v older. We all understand win%.
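A rough sketch of that conversion in Python, assuming (as I am guessing above) that the reported odds ratio double-counts, so that its square root gives the head-to-head odds. The helper name implied_win_pct is hypothetical, not something from the paper.

```python
import math


def implied_win_pct(odds_ratio):
    """Take the square root of a (presumed double-counted) odds ratio to get
    head-to-head odds, then convert those odds into a probability."""
    head_to_head_odds = math.sqrt(odds_ratio)
    return head_to_head_odds / (1.0 + head_to_head_odds)


# 10.0 is the rounded figure, 10.58 and 16.00 are the odds ratios quoted
# from the authors.
for reported in (10.0, 10.58, 16.0):
    head_to_head = round(math.sqrt(reported), 2)
    win_pct = round(implied_win_pct(reported), 3)
    print(reported, head_to_head, win_pct)
```

The 16.00 case is the authors' own same-year example: it converts back to 80%, which matches their 4 out of 5 and is consistent with the double-counting reading.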
The authors do a nice job in their response. It would have been nicer if they had actually referenced Phil and Guy or whoever by name or handle or even site.


I don’t know if these guys are deeply confused, or deliberately muddying the waters to hide their errors. But the analysis is just hokum.
I’m doing this from memory (Phil knows the data 100x better than I do), but the bottom line here is that younger brothers only attempt steals a little bit more than older brothers, as indicated by the fact that just 59% of younger bros. steal more often than their sibling. But even this overestimates the real difference, for at least 2 reasons:
1) older bros have longer careers, and SBA drop off sharply with age, so their career rate will be lower if you don’t adjust for age;
2) the authors define opportunities as BB, 1B, 2B, and 3B, but since older brothers have much more power on average, a larger share of their estimated opportunities are not true SB opportunities (because it’s an XBH).
I am reasonably sure that once you correct for these, there is no statistically significant difference.
The way the authors invent a big difference is by adjusting for what they call “call-up sequence,” meaning which brother is called up first chronologically. So, for example, they compare younger brothers called up 1st to older bros called up first. Guess what? A younger bro who gets called up before his older brother is usually a very good player, and thus steals much more (same thing if called up the same season). Now reverse it and compare younger bro called up 2nd (basically average, as almost all younger bros are called up 2nd) to an older bro called up 2nd (a weak player on average)—again, the younger brother is a much better player on average. This is just a whole lot of words to say nothing of value.
But in reviewing the data, Phil did discover something very interesting: younger brothers are significantly worse players on average. That suggests that players whose older brother played/plays in the majors are given more of a chance to play in MLB than their talent level alone would usually provide. Cool finding....