THE BOOK--Playing The Percentages In Baseball


Monday, October 04, 2010

Phil nails it again about .300 hitters

Read.

Now, how would you do this study?  Very, very easy.  Look at every team’s last game of the season, and select each PA that the batter entered with a batting average such that a hit would put him at .2995 or above.  Then just count whether he got a hit or not.
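As a rough sketch of that count, here is one way it could look in Python. The `PlateAppearance` record and its field names are hypothetical, not taken from any real play-by-play format. The sketch also assumes two things the description above leaves open: that the batter must currently be below .2995, and that every selected PA is treated as an at-bat.

```python
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    """One PA from a team's last game of the season (hypothetical layout)."""
    hits_before: int     # season hits entering this PA
    at_bats_before: int  # season at-bats entering this PA
    is_hit: bool         # did this PA end in a hit?


def on_the_cusp(pa, threshold=0.2995):
    """True if one more hit in one more at-bat would reach the threshold.

    Assumption: the batter must also be below the threshold right now.
    """
    if pa.at_bats_before == 0:
        return False
    current = pa.hits_before / pa.at_bats_before
    with_hit = (pa.hits_before + 1) / (pa.at_bats_before + 1)
    return current < threshold <= with_hit


def count_cusp_results(last_game_pas, threshold=0.2995):
    """Return (hits, plate appearances) over the selected PAs."""
    selected = [pa for pa in last_game_pas if on_the_cusp(pa, threshold)]
    hits = sum(pa.is_hit for pa in selected)
    return hits, len(selected)
```

The selection uses only what was known when the batter stepped in, never what happened afterward, which is the whole point of setting the study up this way.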

You cannot, as Phil explains so clearly, start with the batter’s last PA of the season as if it were some random data point.  It may have been his last PA of the season precisely because he got a hit on it and then stopped playing.  Never start with the end result and presume that there’s no bias.  This is probably the biggest mistake made in baseball research (academic or otherwise).
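To see how large this bias can get, here is a toy simulation. It rests on a deliberately extreme assumption: every batter is a true .300 hitter, gets up to four PA in the last game, and sits down the moment he gets a hit. The function name and all parameters are made up for illustration.

```python
import random


def simulate(true_avg=0.300, max_pa=4, batters=100_000, seed=1):
    """Compare the hit rate on each batter's final PA with the rate on all PA.

    Assumption: a batter stops playing as soon as he gets a hit.
    """
    rng = random.Random(seed)
    last_pa_hits = 0
    all_hits = 0
    all_pas = 0
    for _ in range(batters):
        for _ in range(max_pa):
            hit = rng.random() < true_avg
            all_pas += 1
            all_hits += hit
            if hit:
                break  # he got his hit, so he is done for the year
        last_pa_hits += hit  # outcome of the PA that turned out to be his last
    return last_pa_hits / batters, all_hits / all_pas


last_pa_rate, all_pa_rate = simulate()
print(f"final PA: {last_pa_rate:.3f}   all PA: {all_pa_rate:.3f}")
```

Under these assumptions the expected rate on the final PA is 1 - 0.7^4, or about .760, while the rate over all PA stays near .300. Nobody hit any better; the "last PA" sample was simply chosen by its outcome.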

This mistake is very popular with “at count” splits.  “Hey look, batters who ended their PA at 0 balls and 2 strikes got a lot more strikeouts than those who did not end their PA at 0 balls and 2 strikes.”  That’s because if they ended their PA there, they got a strike most of the time.  If they didn’t end their PA there, they got a ball most of the time.  So, yeah, the best way to not strike out on 0-2 is to make sure the pitcher throws you a called ball.
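The same effect can be sketched with a small simulation of PAs that reach 0-2. The per-pitch probabilities below are invented for illustration, not measured from real data, and they are held fixed at every two-strike count. Each simulated PA is labeled by whether it ended at 0-2 or moved on to a later count.

```python
import random

# Hypothetical per-pitch outcome probabilities with two strikes (made-up numbers).
P_STRIKE, P_BALL, P_FOUL, P_IN_PLAY = 0.25, 0.40, 0.20, 0.15


def play_out_from_0_2(rng):
    """Return (ended_at_0_2, struck_out) for one PA that reaches an 0-2 count."""
    balls = 0
    while True:
        r = rng.random()
        if r < P_STRIKE:
            return balls == 0, True       # strike three
        if r < P_STRIKE + P_BALL:
            balls += 1
            if balls == 4:
                return False, False       # walk
        elif r < P_STRIKE + P_BALL + P_FOUL:
            continue                      # foul with two strikes: count unchanged
        else:
            return balls == 0, False      # ball in play ends the PA


def strikeout_rates(trials=200_000, seed=1):
    """Strikeout rate keyed by whether the PA ended at 0-2 (True) or later (False)."""
    rng = random.Random(seed)
    tallies = {True: [0, 0], False: [0, 0]}  # ended_at_0_2 -> [strikeouts, PAs]
    for _ in range(trials):
        ended_at_0_2, struck_out = play_out_from_0_2(rng)
        tallies[ended_at_0_2][0] += struck_out
        tallies[ended_at_0_2][1] += 1
    return {ended: ks / n for ended, (ks, n) in tallies.items()}
```

Every simulated batter is identical, yet the "ended at 0-2" group strikes out more often than the group that went on to a later count. The only way into the second group is for the pitcher to throw a ball, so the split reflects the pitch that was thrown rather than anything about the batters.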

Same kind of thing happens all the time in research, and I would bet Phil is right that the researchers introduced a bias.

Listen up, all researchers: if it doesn’t pass the sniff test, your result is almost certainly wrong.  It points to a methodology issue.  We would have been happy to see these hitters hit .320 or .330.  Maybe .350?  That starts to stretch it.  But .463?  You made a mistake, or your sample size is terribly low.


(96) Comments • 2011/02/03 • Sabermetrics • Statistical_Theory