THE BOOK--Playing The Percentages In Baseball

Tuesday, March 16, 2010

Simpson’s Paradox

By Tangotiger, 01:44 PM

Jack Morris v. Bert Blyleven on BRef.  When given 0-2 runs of support, Bert has a much lower ERA and much better win% than Jack Morris.  When given 3-5 runs of support, Bert has a slightly better ERA, and barely worse win%.  Given 6+ runs of support, Bert’s got a way better ERA, and somewhat better win% (though it’s hard to get much better win% when Morris is “bad” at 139-10).

Jack Morris overall, however, has a better win%.  And that’s because he had more games where he got more run support than Bert.  And so, even though Bert has the better win% at most run support levels, when you add it all up, Morris has the better win%.
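
One way to write that arithmetic down, as a sketch only: the bucket index i, the game counts G_i, and the per-bucket win rates p_i are notation introduced here for illustration, not anything taken from BRef.

```latex
\text{win\%}_{\text{overall}}
  = \frac{\sum_i G_i \, p_i}{\sum_i G_i}
  = \sum_i \frac{G_i}{G} \, p_i ,
\qquad G = \sum_i G_i
```

The overall win% is a weighted average of the bucket win rates, weighted by each bucket's share of games G_i / G. Since p_i climbs with run support for both pitchers, the pitcher with a bigger share of his games in the high-support buckets can come out ahead overall even where his p_i is lower bucket by bucket.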

I don’t know if this makes sense to everyone, but perhaps Poz can give us a 2000-word blog post that hammers this point home for everyone else.

Glove-slap: BtB


#1    Sunny Mehta    2010/03/16 (Tue) @ 15:15

Jim Albert explains this well in his “Workshop Statistics” book (which, incidentally, is an excellent intro stats book written from a Bayesian perspective, and is available for free download as an e-book on his website).

He says something like: Player A had a better batting average than Player B in the first half and the second half of the season, yet ended up with a worse overall batting average for the total season.

An easy illustration:

First Half:  Player A goes 50 for 100 (.500) and Player B goes 100 for 300 (.333)

Second Half: Player A goes 50 for 300 (.167) and Player B goes 10 for 100 (.100)

Total:  Player A is 100 for 400 (.250) and Player B is 110 for 400 (.275)
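
The illustration above can be checked with a few lines of Python. This is only a sketch: the hits and at-bats are the ones from the comment, while the `splits` layout and the helper names `average` and `total_average` are made up here for the example.

```python
from fractions import Fraction

# (hits, at-bats) per half, copied from the illustration above.
# The dictionary layout is an assumption made for this sketch.
splits = {
    "A": {"first": (50, 100), "second": (50, 300)},
    "B": {"first": (100, 300), "second": (10, 100)},
}


def average(hits, at_bats):
    """Batting average as an exact fraction."""
    return Fraction(hits, at_bats)


def total_average(player):
    """Season average: pool the hits and at-bats, then divide once."""
    hits = sum(h for h, _ in splits[player].values())
    at_bats = sum(ab for _, ab in splits[player].values())
    return average(hits, at_bats)


for half in ("first", "second"):
    a = average(*splits["A"][half])
    b = average(*splits["B"][half])
    print(f"{half} half: A {float(a):.3f}  B {float(b):.3f}")

print(f"total: A {float(total_average('A')):.3f}  B {float(total_average('B')):.3f}")
```

Player A wins each half, but most of A's at-bats come in the half where both players hit poorly, so the pooled total flips in B's favor.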


#2    Hizouse    2010/03/16 (Tue) @ 18:20

a WSJ article from December on the paradox, using unemployment statistics as a launching point, but also using a Jeter-Justice example:

http://online.wsj.com/article/SB125970744553071829.html


#3    2010/03/17 (Wed) @ 00:20

Baseball provides so many terrific examples to help learn statistics.  Despite having a formal education in mathematics, I feel I developed most of my intuition about statistics from baseball stats.  I think my first exposure to Simpson’s Paradox might have been in a presentation at a math conference where the example was, you guessed it, batting averages.

