THE BOOK--Playing The Percentages In Baseball


Sunday, December 02, 2007

Do Baseball Insiders Really Understand Baseball (and Statistics)?

By , 01:39 AM

I did not really know how to title this entry, but…

Here is an article that, in my opinion, is a good example of how baseball “insiders” are woefully inadequate at understanding the confluence of baseball and statistics, to the point where it can and will lead to bad decision-making.


I wrote this on BTF in response to the article:

Any pitcher’s bad year will tend to be followed by a better year (relative to the bad year), and vice versa, because of regression toward the mean. I doubt that Fregosi has any idea what that means. Of course, that does not necessarily mean that there are not some relievers who are overused or underused.

The REAL problem with reliever utilization is that managers tend to give more work to relievers who are not that good but are having a good season, and less work to good relievers who are having a bad one, but that is another (related) story.

Anyway, back to the Fregosi thing, I am willing to bet that if you looked at reliever usage in terms of one-year IP or pitches thrown, you would NOT find a significant cause/effect relationship with respect to performance the next year. Of course you have to control for selective sampling. Here is what generally takes place, which gives the illusion that Fregosi is right. This is simplifying things of course, but it helps to do that to explain the phenomenon.

Say all your relievers were “true” 4.50 ERA pitchers. That is their true talent level, just as a fair coin’s true rate is .5 heads. Whatever you see in any time period, like flipping a coin x number of times, is merely a random sample around that 4.50 ERA. It can be just about anything, but it tends to be around 4.50 and will approach 4.50 as the sample of pitching (in IP or whatever) gets larger.
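The coin-flip analogy can be made concrete with a small simulation. This is only a rough sketch: it assumes earned runs in each inning are independent Poisson draws with mean 4.50/9, which is a simplification of real run scoring, and the function names are made up for this example.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def observed_era(rng, true_era, innings):
    """ERA observed over `innings` innings for a pitcher with a fixed true ERA.

    Assumption: runs per inning are independent Poisson draws with mean
    true_era / 9. Real run scoring is lumpier than this.
    """
    runs = sum(poisson(rng, true_era / 9) for _ in range(innings))
    return 9 * runs / innings


def era_spread(innings, n_pitchers=300, true_era=4.50, seed=1):
    """Mean and standard deviation of observed ERA across identical pitchers."""
    rng = random.Random(seed)
    eras = [observed_era(rng, true_era, innings) for _ in range(n_pitchers)]
    return statistics.mean(eras), statistics.stdev(eras)


if __name__ == "__main__":
    for innings in (20, 75, 200, 1000):
        mean, spread = era_spread(innings)
        print(f"{innings:5d} IP: mean ERA {mean:.2f}, spread (SD) {spread:.2f}")
```

Under these assumptions the average stays near 4.50 at every sample size, while the spread between identical pitchers shrinks as the innings pile up. A reliever-sized season leaves a lot of room for luck.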

In year X, roughly half of those pitchers will pitch “badly” and have an ERA above 4.50, and half will pitch well with an ERA below 4.50. Since we are specifying that whenever they pitch, their true ERAs are 4.50, the bad and good pitching is, by definition, bad and good pitching luck. Presumably, the pitchers having good luck will get more playing time and the ones having bad luck will get less playing time. The good luck pitchers may average an ERA of 4.00 and the bad luck ones, 5.00. What happens next year?

In year X+1, all the pitchers will average 4.50 ERA of course and the fluctuations (luck) will be randomly distributed among the pitchers and within the two groups that were lucky or unlucky last year. The previously lucky group will average 4.50 in year X+1 and so will the unlucky group.

So, the data will look something like this:

Year X ERA/IP              Year X+1 ERA/IP
5.00/60 (100 pitchers)     4.50/75 (same 100 pitchers)
4.00/90 (100 pitchers)     4.50/75 (same 100 pitchers)
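The equal-talent scenario in the table above can be reproduced with a few lines of simulation. This is a sketch under stated assumptions, not real data: luck is modeled as normally distributed with a made-up standard deviation of 0.8 runs, and the manager is assumed to give 90 IP to the better half of year X ERAs and 60 IP to the worse half.

```python
import random
import statistics


def simulate_equal_talent(n_pitchers=200, true_era=4.50, noise_sd=0.8, seed=7):
    """Every pitcher has the same true ERA; each season adds independent luck.

    Assumptions: luck is normal with SD `noise_sd` (an invented value), and
    usage in year X is decided purely by year X results.
    Returns {group: (mean year X ERA, mean year X+1 ERA)}.
    """
    rng = random.Random(seed)
    year_x = [rng.gauss(true_era, noise_sd) for _ in range(n_pitchers)]
    year_x1 = [rng.gauss(true_era, noise_sd) for _ in range(n_pitchers)]
    cutoff = statistics.median(year_x)

    groups = {"heavy use (90 IP)": [], "light use (60 IP)": []}
    for era_x, era_x1 in zip(year_x, year_x1):
        key = "heavy use (90 IP)" if era_x < cutoff else "light use (60 IP)"
        groups[key].append((era_x, era_x1))

    return {
        name: (
            statistics.mean([x for x, _ in rows]),
            statistics.mean([x1 for _, x1 in rows]),
        )
        for name, rows in groups.items()
    }


if __name__ == "__main__":
    for name, (era_x, era_x1) in simulate_equal_talent().items():
        print(f"{name}: year X ERA {era_x:.2f}, year X+1 ERA {era_x1:.2f}")
```

The heavily used group looks like it got worse and the lightly used group looks like it got better, even though nothing in the simulation lets workload affect anyone.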

In reality the numbers would be more like this, since not all pitchers are true 4.50 pitchers of course. But the pattern is still the same: each group moves back toward the mean the next year.

Year X ERA/IP              Year X+1 ERA/IP
5.50/60 (100 pitchers)     5.00/75 (same 100 pitchers)
3.50/90 (100 pitchers)     4.00/75 (same 100 pitchers)

(The pitchers who pitched badly in year X are indeed worse pitchers who also got unlucky and the ones who pitched well in year X are better pitchers who also got lucky.)

(Numbers for illustration purposes only.)
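The mixed-talent case can be sketched the same way. The talent spread (SD 0.5) and luck spread (SD 0.8) below are invented values chosen only to illustrate the shape of the result, and the function name is hypothetical.

```python
import random
import statistics


def simulate_mixed_talent(n_pitchers=200, talent_sd=0.5, noise_sd=0.8, seed=11):
    """Pitchers differ in true ERA, and each season adds independent luck.

    Assumptions: talent and luck are both normal, with invented spreads.
    Returns {group: (mean true ERA, mean year X ERA, mean year X+1 ERA)}.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pitchers):
        talent = rng.gauss(4.50, talent_sd)
        era_x = rng.gauss(talent, noise_sd)
        era_x1 = rng.gauss(talent, noise_sd)
        rows.append((talent, era_x, era_x1))

    # Split on what was observed in year X, not on (unknowable) true talent.
    cutoff = statistics.median([row[1] for row in rows])
    groups = {
        "good year X": [row for row in rows if row[1] < cutoff],
        "bad year X": [row for row in rows if row[1] >= cutoff],
    }
    return {
        name: tuple(statistics.mean([row[i] for row in group]) for i in range(3))
        for name, group in groups.items()
    }


if __name__ == "__main__":
    for name, (talent, era_x, era_x1) in simulate_mixed_talent().items():
        print(f"{name}: true {talent:.2f}, year X {era_x:.2f}, year X+1 {era_x1:.2f}")
```

Each group's year X+1 ERA lands near its true talent, which sits between its year X ERA and 4.50. The good group really is better and the bad group really is worse, but both were also helped or hurt by luck that does not carry over.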

This is EXACTLY what happens every year in baseball. Not as simplified as that, but pretty much the same, more or less. To the unenlightened eye, it looks like pitchers who pitch more innings will get a lot worse and pitchers who pitch fewer innings will get much better, but that is an illusion because of selective sampling and regression toward the mean.
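One way such a check could be set up is sketched below on synthetic data. Everything here is an assumption made for illustration: the talent and luck spreads, the rule tying innings to year X ERA, and the hypothetical `wear_per_ip` parameter that adds a real workload penalty when it is nonzero. The "control" is a partial correlation between year X innings and year X+1 ERA, holding year X ERA fixed.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def workload_check(wear_per_ip=0.0, n_pitchers=5000, seed=3):
    """Return (naive, controlled) correlations between year X IP and results.

    wear_per_ip is a hypothetical true workload effect: runs of ERA added
    in year X+1 per inning above 75 pitched in year X. Zero means no effect.
    """
    rng = random.Random(seed)
    innings, era_x, era_x1 = [], [], []
    for _ in range(n_pitchers):
        talent = rng.gauss(4.50, 0.5)            # assumed talent spread
        this_year = rng.gauss(talent, 0.8)       # assumed luck spread
        # Assumed usage rule: better year X results earn more innings.
        ip = 75 - 20 * (this_year - 4.50) + rng.gauss(0, 10)
        next_year = rng.gauss(talent, 0.8) + wear_per_ip * (ip - 75)
        innings.append(ip)
        era_x.append(this_year)
        era_x1.append(next_year)

    change = [after - before for before, after in zip(era_x, era_x1)]
    naive = pearson(innings, change)
    controlled = partial_corr(innings, era_x1, era_x)
    return naive, controlled


if __name__ == "__main__":
    for wear in (0.0, 0.02):
        naive, controlled = workload_check(wear)
        print(f"wear={wear}: naive r={naive:.2f}, controlled r={controlled:.2f}")
```

With no workload effect built in, the naive correlation between innings and ERA change is still strongly positive, while the controlled one sits near zero. A real study would need more than this, since managers know things about their pitchers beyond one year of ERA.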

It is like the A/F fighter pilots in training. The pilots who did badly on a mission were scolded and the ones who did well were praised. On the next mission, the pilots who did badly the first time did a lot better, and the ones who did well did worse. The A/F brass concluded that scolding was better than praising as a training style. Of course, the subsequent results had little to do with the training method; they were simply regression toward the mean. (If I got that story a little wrong, I apologize, as it is from memory.)

#1    David Smyth    2007/12/02 (Sun) @ 08:50

It’s interesting (but not surprising) that none of the baseball people mentioned small sample size IP and regression to the mean as the likely driving force. Not that they have to use those terms, but why not simply say, “Relievers’ ERAs bounce up and down a lot because they don’t pitch many innings, and things don’t get a chance to even out.”

Perhaps the reason is that they DON’T WANT to believe it, because it makes them (the pitching coaches, GMs, etc.) seem more powerless, since there’s no way to coach a player out of the regression to the mean effect.


#2    2007/12/02 (Sun) @ 14:45

In my experience, people HATE to think that things happen randomly.  In baseball, that makes sense.  If Ozzie Smith makes an uncharacteristic error, and he says, “well, you know, I make fewer errors than anyone, but occasionally I’ll misplay a ball just because it’s luck and I’m not perfect,” he looks like a jerk.  Same for a GM—if he admits that there’s a lot of luck in performance statistics, it looks like he’s making excuses instead of doing his job.

More importantly, NOT believing that it was luck—believing that it’s something that can be fixed—is what makes great players great.  They don’t say, “well, it’s gonna happen sometimes.” They say, “I’m damn well going to make sure it doesn’t happen again.” And they work twice as hard on their fielding, and they get better.  And they still make occasional errors, but fewer.

In real life ... well, I know people who don’t feel well one day, and they go on and on about why it may have happened.  Maybe it was something they ate, or someone at work may have been ill, or they went out in the rain without an umbrella ... hey, sometimes things just happen, and you randomly catch a virus.  Accept that there are lots of things in life that just happen, that nobody can control!

But baseball insiders should know better.  The ones who don’t understand regression to the mean, and the impact of luck in baseball, are simply not as qualified for their job as they should be.


#3    2007/12/02 (Sun) @ 23:07

Fundamental Attribution Error is what social psych calls that, Phil.  I think you’re definitely correct.


#4    Tangotiger    2007/12/03 (Mon) @ 10:48

I’ve read many times from players where they say luck (or “the breaks”) is a key component in their or their team’s play.  I think players do get that luck plays a big part in this.

As an example, they asked Burnitz, who was having a bad year, what he thought of his play, and the adjustments he needs to make.  He said that he was hitting the same way he always hits, he was doing the exact same thing as he always does.  And the ball is just not falling for him.  But, that he was not going to change, that he was going to continue to do things the way he always did.

Plenty of times you hear players say that they have the better team, but they didn’t get the breaks they needed.

I think it’s the media that tries to not talk about luck. If they talked about luck (timing, breaks) being the overriding factor in any single game, why would anyone pay them to talk?


#5    Pizza Cutter    2007/12/03 (Mon) @ 14:42

FWIW, Fregosi’s hypothesis is testable.  I’m not putting a whole lot of money on it, but it could be tested.

As to luck, people evaluating their own performance often credit their own amazing abilities when they do something right and blame extraneous factors, including luck, when they do something wrong.  They do the reverse when evaluating other people, especially those they don’t like.

Baseball, as I’ve come more and more to understand, is more of a game of chance than perhaps even any of us would like to admit.  There’s a cultural ethos around baseball that it is, in some way, the manifestation of some giant poetic and amazing plan for the universe.  People don’t generally feel this way about a game of Monopoly.  We like it when sportscasters say “Team of Destiny” because it sounds cool.  (Right Rockies fans?)


#6    2007/12/03 (Mon) @ 17:34

The phenomenon discussed here in comments 2/3/5 goes far beyond baseball of course; people in general have a tendency to extrapolate causality from very small samples and to ascribe to skill what is really luck. I think I’ve heard it argued somewhere (though I can’t remember where) that this is partly an evolutionary trait - the ability to recognize patterns and make decisions on a small amount of data can give a huge survival advantage, while the down side, which is superstition, hurts only a little bit except in extreme cases.

In running a baseball team (or any other business) the equation is much different, and it pays to understand statistics. A lot of the success of Beane and the other stat-minded GMs has come not from being brilliant, but merely from being a bit more rational than their peers.

