Sunday, December 02, 2007
Do Baseball Insiders Really Understand Baseball (and Statistics)?
I did not really know how to title this entry, but…
Here is an article that, in my opinion, is a good example of how baseball “insiders” are woefully inadequate in understanding the confluence of baseball and statistics, such that it can and will lead to bad decision-making.
I wrote this on BTF in response to the article:
Any pitcher's bad year will tend to be followed by a better year (relative to the bad year), and vice versa, because of regression toward the mean. I doubt that Fregosi has any idea what that means. Of course, that does not necessarily mean that there are not some relievers who are over- or under-used.
The REAL problem with reliever utilization is that managers tend to give more work to relievers who are not that good but are having a good season, and less work to good relievers who are having a bad season, but that is another (related) story.
Anyway, back to the Fregosi thing, I am willing to bet that if you looked at reliever usage in terms of one-year IP or pitches thrown, you would NOT find a significant cause/effect relationship with respect to performance the next year. Of course, you have to control for selective sampling. Here is what generally takes place, which gives the illusion that Fregosi is right. This is simplifying things, of course, but it helps to do that to explain the phenomenon.
Say all your relievers were “true” 4.50 ERA pitchers. That is their true talent level, like for a fair coin it is .5 heads or tails. Whatever you see in any time period, like flipping a coin x number of times, is merely a random sample of that 4.50 ERA, and can be just about anything, but obviously tends to be around 4.50 and will approach exactly 4.50 the larger the sample of pitching (in IP or whatever).
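The coin-flip analogy can be checked with a quick simulation. This is only a rough sketch under an assumption that is mine and not part of the original comment: runs allowed in each inning are drawn from a Poisson distribution with mean 4.50/9. The helper names (`poisson`, `season_era`, `era_spread`) are made up for illustration.

```python
import math
import random
import statistics

TRUE_ERA = 4.50  # true talent level, as in the text


def poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def season_era(innings, rng, true_era=TRUE_ERA):
    """Observed ERA over `innings` innings for a pitcher with the given true ERA.

    Assumption (not from the post): runs per inning ~ Poisson(true_era / 9).
    """
    runs = sum(poisson(true_era / 9.0, rng) for _ in range(innings))
    return 9.0 * runs / innings


def era_spread(innings, n_seasons=300, seed=0):
    """Mean and standard deviation of observed ERA across simulated samples."""
    rng = random.Random(seed)
    eras = [season_era(innings, rng) for _ in range(n_seasons)]
    return statistics.mean(eras), statistics.stdev(eras)


if __name__ == "__main__":
    for ip in (30, 75, 300, 1500):
        mean, sd = era_spread(ip)
        print(f"{ip:>5} IP: mean ERA {mean:.2f}, spread (SD) {sd:.2f}")
```

In this toy model the average observed ERA stays near 4.50 at every sample size, while the spread around it shrinks as the number of innings grows.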
In year X, half of those pitchers will pitch “badly” and have an ERA above 4.50 and half will pitch well with an ERA below 4.50. Since we are specifying that whenever they pitch, their true ERAs are 4.50, the bad and good pitching is, by definition, bad and good pitching luck. Presumably, the pitchers having good luck will get more playing time and the ones having bad luck will get less playing time. The good luck pitchers may average an ERA of 4.00 and the bad luck ones, 5.00. What happens next year?
In year X+1, all the pitchers will average 4.50 ERA of course and the fluctuations (luck) will be randomly distributed among the pitchers and within the two groups that were lucky or unlucky last year. The previously lucky group will average 4.50 in year X+1 and so will the unlucky group.
So, the data will look something like this:
Group (100 pitchers each)    Year X ERA/IP    Year X+1 ERA/IP (same pitchers)
Unlucky in year X            5.00/60          4.50/75
Lucky in year X              4.00/90          4.50/75
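The pattern in that table can be reproduced with a toy simulation. This is a sketch under assumptions of my own, not a model anyone actually uses: every reliever is a true 4.50 pitcher, runs per inning are Poisson, and usage is decided after the first 30 innings of year X (the better-looking half gets 60 more innings, the worse-looking half gets 30 more). The function names and the 30-inning cutoff are hypothetical.

```python
import math
import random

TRUE_ERA = 4.50   # every reliever's true talent in this toy model
EARLY_IP = 30     # assumed innings before the manager "decides" on usage
NEXT_YEAR_IP = 75


def poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def runs_allowed(innings, rng):
    """Runs allowed over `innings` innings by a true-4.50 pitcher (assumed Poisson)."""
    return sum(poisson(TRUE_ERA / 9.0, rng) for _ in range(innings))


def simulate(n_pitchers=200, seed=1):
    """Return {group: (year X ERA, year X+1 ERA)} using pooled runs and innings."""
    rng = random.Random(seed)
    early = [runs_allowed(EARLY_IP, rng) for _ in range(n_pitchers)]
    # Rank by early-season luck: fewest runs allowed first.
    order = sorted(range(n_pitchers), key=lambda i: early[i])
    half = n_pitchers // 2
    plan = (("heavy use", order[:half], 60), ("light use", order[half:], 30))
    results = {}
    for name, members, extra_ip in plan:
        runs_x = ip_x = runs_next = ip_next = 0
        for i in members:
            runs_x += early[i] + runs_allowed(extra_ip, rng)
            ip_x += EARLY_IP + extra_ip
            runs_next += runs_allowed(NEXT_YEAR_IP, rng)
            ip_next += NEXT_YEAR_IP
        results[name] = (9.0 * runs_x / ip_x, 9.0 * runs_next / ip_next)
    return results


if __name__ == "__main__":
    for name, (era_x, era_next) in simulate().items():
        print(f"{name}: year X ERA {era_x:.2f}, year X+1 ERA {era_next:.2f}")
```

Nothing in the code makes heavy use hurt a pitcher or light use help him, yet the heavily used group looks better than 4.50 in year X and comes back to about 4.50 the next year, and the lightly used group does the reverse.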
In reality the numbers would be more like this, since not all pitchers are true 4.50 pitchers of course. But the results are still the same.
Group (100 pitchers each)    Year X ERA/IP    Year X+1 ERA/IP (same pitchers)
Pitched badly in year X      5.50/60          5.00/75
Pitched well in year X       3.50/90          4.00/75
(The pitchers who pitched badly in year X are indeed worse pitchers who also got unlucky and the ones who pitched well in year X are better pitchers who also got lucky.)
(Numbers for illustration purposes only.)
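The more realistic case, where pitchers differ in true talent, can be sketched the same way. The assumptions here are mine: true ERAs are spread normally around 4.50 with a standard deviation of 0.50 (a made-up figure), runs per inning are Poisson, and usage is set by how each pitcher looks after his first 30 innings. All names are hypothetical.

```python
import math
import random

LEAGUE_ERA = 4.50   # average true talent, as in the text
TALENT_SD = 0.50    # assumed spread in true talent; not from the original post
EARLY_IP = 30       # assumed innings before usage is decided
NEXT_YEAR_IP = 75


def poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def runs_allowed(innings, true_era, rng):
    """Runs allowed over `innings` innings (assumed Poisson per inning)."""
    return sum(poisson(true_era / 9.0, rng) for _ in range(innings))


def simulate(n_pitchers=200, seed=1):
    """Return {group: (year X ERA, year X+1 ERA, mean true ERA)}."""
    rng = random.Random(seed)
    # Floor at 1.00 only to keep the Poisson rate positive.
    talent = [max(1.0, rng.gauss(LEAGUE_ERA, TALENT_SD)) for _ in range(n_pitchers)]
    early = [runs_allowed(EARLY_IP, t, rng) for t in talent]
    order = sorted(range(n_pitchers), key=lambda i: early[i])
    half = n_pitchers // 2
    plan = (("heavy use", order[:half], 60), ("light use", order[half:], 30))
    results = {}
    for name, members, extra_ip in plan:
        runs_x = ip_x = runs_next = ip_next = 0
        for i in members:
            runs_x += early[i] + runs_allowed(extra_ip, talent[i], rng)
            ip_x += EARLY_IP + extra_ip
            runs_next += runs_allowed(NEXT_YEAR_IP, talent[i], rng)
            ip_next += NEXT_YEAR_IP
        mean_talent = sum(talent[i] for i in members) / len(members)
        results[name] = (9.0 * runs_x / ip_x, 9.0 * runs_next / ip_next, mean_talent)
    return results


if __name__ == "__main__":
    for name, (era_x, era_next, true_era) in simulate().items():
        print(f"{name}: year X {era_x:.2f}, year X+1 {era_next:.2f}, true {true_era:.2f}")
```

In this sketch the heavily used group really is somewhat better than the lightly used group, but both groups still move back toward their true level in year X+1, which is the same shape as the second table.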
This is EXACTLY what happens every year in baseball. Not as simplified as that, but pretty much the same, more or less. To the unenlightened eye, it looks like pitchers who pitch more innings will get a lot worse and pitchers who pitch fewer innings will get much better, but that is an illusion because of selective sampling and regression toward the mean.
It is like the Air Force fighter pilots in training. The pilots who did badly on a mission were scolded and the ones who did well were praised. On the next mission, the pilots who did badly the first time did a lot better and the ones who did well did worse. The Air Force brass concluded that scolding was better than praising as a training style. Of course, the subsequent results had little to do with the training method; they were simply regression toward the mean. (If I got that story a little wrong, I apologize, as it is from memory.)