THE BOOK--Playing The Percentages In Baseball

Friday, October 21, 2011

A Warning from MGL!!

By , 11:32 PM

Now that I have your attention…

I just read a decent article on FG written by David Cameron about TLR’s bullpen choices in the 9th inning of game 2, which we have discussed on this blog.

In the comments section, I kind of lectured Dave, who has done much excellent analysis and written many fine articles on FG and other sites and publications, about using samples properly. I think it is important enough to warrant a thread on this blog. Essentially I said (I’ll reprint some things from my comments on his article):

I am very uncomfortable when an analyst gets to choose which sample he wants to present to support his point or his opinion. This year only? Last 2 years? 3 years? Career? Lately, as in last half season? The last 10 games? You should not be allowed to do that, for obvious reasons (cherry picking your evidence makes your arguments intellectually dishonest, or misleading at best).

For example, Dave said this:

“While Hamilton’s strikeout rate against LHPs jumps to 22.1%, Rhodes K% against LHBs this year was just 16.1%. His career numbers are much better, but he’s not the same pitcher he was a few years ago, and Hamilton had hit an outfield fly against him the night before.”

Yes, he is not the same pitcher, but if this year his K% was higher than his career numbers, Dave would probably be quoting us his career numbers (heck, I would too if I had the choice!). The analyst should NOT have the choice. He should always be quoting a projection which is some kind of weighted career average or whatever the accepted standard is!

And for the last part of that last sentence, about Hammy hitting a fly ball the night before, David should get immediately thrown into the MGL jail. I can’t believe he even said that in that context. Shame on you Dave!

I followed up with this:

If you are allowed to split the samples up any way you want, you can probably support just about any thesis from one end of the spectrum to the other. Which is why a standard must be used. As in all scientific fields, in sabermetrics there is a generally accepted standard in the industry – weighted career (or, to simplify, last 3 or 4 years).

That is not an arbitrary method, mind you. When we are trying to answer a question such as, “Who should be used in an upcoming situation?” we are essentially asking, “How do we expect so-and-so to perform at some time in the future?” In most cases, as in this one, that means the immediate future, such as the next PA or tomorrow.

To do that, again, the accepted standard in the industry, after years of very thorough research and analysis, is to use a “Marcel-like” projection for component rates, GB and FB frequency, platoon splits, etc. It is also the accepted standard to ignore things like clutch, home/road splits (other than the normal one of course), day/night, pitcher/batter historical matchups, hot and cold streaks, etc. Not because we KNOW that these don’t exist, but because we find, again, after years of thorough and extensive research, that even if they exist, they have little predictive value.
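To make “Marcel-like” concrete, here is a minimal sketch of a weighted multi-year rate projection that is regressed toward the league average. It is an illustration, not anyone's official method: the function name, the 5/4/3 season weights, and the 1200 opportunities of regression are assumptions loosely patterned on the commonly described Marcel approach, and the inputs in the usage note are made-up numbers, not any real player's stats.

```python
from typing import Sequence, Tuple


def weighted_rate_projection(
    seasons: Sequence[Tuple[int, int]],
    league_rate: float,
    weights: Sequence[float] = (5.0, 4.0, 3.0),  # assumed weights, most recent season first
    regression_opps: float = 1200.0,             # assumed amount of league-average "ballast"
) -> float:
    """Project a component rate (e.g. K per PA) from recent seasons.

    seasons: (events, opportunities) pairs, most recent season first.
    league_rate: league-average rate for the same component.
    """
    weighted_events = 0.0
    weighted_opps = 0.0
    # More recent seasons count more; seasons beyond the weights are ignored.
    for (events, opps), weight in zip(seasons, weights):
        weighted_events += weight * events
        weighted_opps += weight * opps
    # Regress toward the mean by adding league-average opportunities.
    weighted_events += regression_opps * league_rate
    weighted_opps += regression_opps
    return weighted_events / weighted_opps
```

For example, `weighted_rate_projection([(20, 100), (30, 100), (40, 100)], league_rate=0.25)` blends a hypothetical pitcher's 20%, 30%, and 40% strikeout seasons with the league rate instead of letting the analyst pick whichever single year suits the argument. The point of the design is that the weights are fixed before anyone looks at the player.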

So I implore all analysts, including David, who is a fine one, to use these standards when presenting a thesis. If time or other constraints exist, which I understand, then some semblance of these standards should be used, or some qualifications issued, rather than disingenuously using one year, half year, or other similarly small and/or misleading samples (such as un-weighted career) in order to support an argument.


#1    Dan      (see all posts) 2011/10/22 (Sat) @ 13:29

It’s been a long time since I took statistics classes, but when you cherry pick your data set from a large pool of possibilities you greatly decrease the chances of finding something actually true. What you need to do is predetermine which stats you will go looking for (ideally in all situations) and then present them as you find them.

You are correct that if you comb through enough data you can find something to support almost any conclusion.
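A rough way to put a number on that: if each candidate sample has some small chance of showing an apparently meaningful result by luck alone, the chance that at least one of them does grows quickly with the number of samples you get to choose from. The sketch below assumes the samples are independent and uses a hypothetical 5% false-positive rate per sample. Real splits such as “this year” and “last 2 years” overlap, so treat this as an illustration of the direction of the effect, not an exact figure.

```python
def chance_of_spurious_hit(num_samples: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_samples independent looks
    turns up a chance result, given a per-look false-positive rate alpha.
    Both the independence and the 5% default are simplifying assumptions.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Complement of "every single look comes up empty".
    return 1.0 - (1.0 - alpha) ** num_samples
```

With six samples to pick from (this year, last 2 years, 3 years, career, last half season, last 10 games), `chance_of_spurious_hit(6)` is a bit over one in four under these assumptions, compared with one in twenty for a single sample chosen in advance.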


#2    Geoff Buchan      (see all posts) 2011/10/22 (Sat) @ 17:07

I agree with the overall point - if you go data mining, you’ll find what you want to find. So cherry-picking data to support your hypothesis is indeed bad.

But in certain cases it may make sense to look at a different data set, for example if some major external factor likely changes a player’s ability.

If a pitcher has added a new, very different pitch, or has returned after major surgery radically altering his stuff, looking at data only from after that change may be more illustrative than weighted career numbers or a Marcel-like projection.

I’d leave out changing teams/home parks, because park adjustments already should, in theory, address that. So the general rule is good, but it should be okay to break it under unusual circumstances, so long as you acknowledge both that you’re deviating from standard practice, and why.


#3          (see all posts) 2011/10/22 (Sat) @ 19:42

“But in certain cases it may make sense to look at a different data set, for example if some major external factor likely changes a player’s ability.”

Sure, but one has to be very careful, as it is still too easy to manipulate the presentation of the data in order to support a thesis.

In addition, we often make assumptions about a change in talent with no evidence (or contrary evidence) to back it up. For example, we talk about breakout years for hitters and then put a lot of emphasis on recent performance (and make up narratives which sound plausible about WHY the player’s true talent has changed dramatically) when research that Tango and I have done suggests that “breakouts” have little predictive value, beyond how they change a player’s normal (weighted last 3 or 4 years) projection…


#4    Geoff Buchan      (see all posts) 2011/10/22 (Sat) @ 19:58

I agree - one should be careful, and simply having a “breakout year” is not sufficient to justify a shorter sample.

I had in mind mainly injury comebacks, where the shorter sample might well indicate a worse performance, or, say, switching to become a knuckleball pitcher (as R.A. Dickey did a few years back) or some other very major change in approach to the game.

And this is probably much more a theoretical objection than a practical one: such major transitions are quite rare.

