Monday, March 29, 2010
Pre-Introducing Batted Ball FIP, part 2
QUESTION
Is there a HR pitching talent?
Some batted ball metrics include the homerun because they
(1) want to account for events that occurred and/or
(2) treat the homerun as persistent enough at the pitcher level that it indicates some degree of skill
Some batted ball metrics exclude the homerun because they
(1) are only interested in the flight trajectory of the ball (and not distance) and/or
(2) treat the homerun as having little persistence at the pitcher level
What you are about to read is going to be non-light reading.
INTERLUDE 1
Let me take a brief interlude to explain what observational data is. Observational data is a sample of something real. If someone is a true .400 OBP player, then we will observe him to be between .399 and .401 95% of the time if he came to bat one million times in the span of less than one second, all the while facing each pitcher an equal number of times (proportionate to the number of times they pitch) and batting an equal number of times in each park.
The less all that is true, the larger our confidence interval becomes, in order for us to maintain the same confidence level of 95%. Let’s see, a player cannot come to bat one million times in less than a second. He can’t even come to bat once in less than a second. And, as a human being, he is subject to aging. What we see therefore are observations of a person at different points in his life. He also does not face the same pitchers at each point in his life, nor does he play in the same park each time we collect data. Basically, the only common thread we have is that we have the same DNA for each data point. And we only have a few hundred data points, maybe a few thousand. So now, instead of observing our player at .399 to .401, we will observe him at, say, .360 to .440 over one thousand plate appearances, 95% of the time.
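To put rough numbers on that, here is a minimal sketch of the pure binomial 95% range around a true .400 OBP. The function name and the 1.96 multiplier are my own choices for illustration. It only captures random binomial variation, so at 1000 PA it gives something closer to .370 to .430; the wider .360 to .440 range above also leaves room for aging, opponents and parks.

```python
import math

def obp_interval(true_obp, plate_appearances, z=1.96):
    """Approximate 95% range of observed OBP from binomial noise alone."""
    sd = math.sqrt(true_obp * (1 - true_obp) / plate_appearances)
    return true_obp - z * sd, true_obp + z * sd

for pa in (1_000, 1_000_000):
    low, high = obp_interval(0.400, pa)
    print(f"{pa:>9,} PA: {low:.3f} to {high:.3f}")
```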
***
INTERLUDE 2
Let’s take another interlude to discuss priors and Bayes. Suppose that you observe someone with a .400 OBP over 1000 PA. Does this mean this player is a true .400 OBP player? It all depends on the underlying population spread. Say that the league average OBP is .600, and that 95% of the players have a true rate between .550 and .650, but you actually don’t know who is the .650, who is the .570 and who is the .530. You just know you have them. And you are told that one player in the league batted .400 over 1000 PA. What are the chances that this was done by the .650 guy? By the .640 guy? .630? .600? .550? .500? .450? .350? etc, etc. You figure out the chance that each of those guys did it, you figure out how many of those guys there are at each of those levels, and that will give you the best guess as to who hit .400 over 1000 PA. Let’s say the best guess becomes .450, with one standard deviation of .020. So, you are 95% sure that the guy who hit .400 over 1000 PA was a true .410 to .490 player.
Suppose that the 95% population range was not .550 to .650, but .590 to .610. If you witness .400 over 1000 PA, then you know that your best guess is that this guy is a .405 hitter +/- .010. All these numbers are for illustration purposes.
And suppose the league mean is .340 +/- .030? Well, now your best guess might be .380 +/- .020 or something.
In order for you to know what your best guess is, you have to know the underlying spread in the true talent, as well as the population mean, and the number of trials (plate appearances) for your player. All that will tell you how good your hitter is, and how certain you are of your guess.
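Here is a rough sketch of that calculation, using a normal approximation instead of walking through every talent level one by one. That shortcut, the function name, and reading "+/- .030" as one standard deviation are all my assumptions. The outputs are only as illustrative as the numbers in the text.

```python
import math

def posterior_talent(obs_rate, trials, prior_mean, prior_sd):
    """Normal-approximation best guess of true talent and its uncertainty."""
    # Binomial sampling variance of the observed rate, evaluated at the league mean.
    sample_var = prior_mean * (1 - prior_mean) / trials
    prior_var = prior_sd ** 2
    # Weight on the observation: the share of observed variance that is talent.
    weight = prior_var / (prior_var + sample_var)
    mean = prior_mean + weight * (obs_rate - prior_mean)
    sd = math.sqrt(prior_var * sample_var / (prior_var + sample_var))
    return mean, sd

# League at .600 with 95% of players between .550 and .650 (SD of .025).
print(posterior_talent(0.400, 1000, 0.600, 0.025))
# League at .340, assuming a standard deviation of .030.
print(posterior_talent(0.400, 1000, 0.340, 0.030))
```

The three inputs are exactly the ones named above: the population mean, the spread in true talent, and the number of trials.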
***
PART 1
The general equation for variance looks like this:
variance(observed) = variance(true1) + variance(true2) + ... + variance(trueN) + variance(binomial)
There are also covariance terms, but let’s presume that our parameters are all independent. And for the purposes of this discussion, let’s presume we only have one parameter (say the pitcher’s HR skill), so we have this:
variance(observed) = variance(true) + variance(binomial)
So, the variance we observe is directly linked to how much variance there is from the binomial (which is tied to the number of trials) and how much underlying true skill there is to begin with. You can get variance(true) to approach variance(observed) by getting variance(binomial) to approach 0. You can do that by increasing your sample size to infinity. And, since the coefficient of determination, r-squared, is the share of observed variance that can be explained, you get variance(true)/variance(observed) approaching 1.
Therefore, knowing the strength of the correlation coefficient is MEANINGLESS without also knowing the number of trials for your samples. This is the statistical equivalent of a lawyer being able to indict a ham sandwich.
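A small sketch of why the number of trials matters so much. The true-talent spread of .006 and the mean rate of .067 are placeholder values I picked purely for illustration, and the function name is hypothetical. The same underlying skill explains almost none or almost all of the observed variance depending only on n.

```python
def explained_share(true_sd, mean_rate, trials):
    """variance(true) / variance(observed) for a given number of trials."""
    var_true = true_sd ** 2
    var_binomial = mean_rate * (1 - mean_rate) / trials
    return var_true / (var_true + var_binomial)

# Illustrative values only: the skill is fixed, only the sample size changes.
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(explained_share(0.006, 0.067, n), 3))
```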
***
PART 2
And so, we come to HR per batted balls.
Let me give you some data. From 2002-2009, there were 318 pitchers that had at least 600 air balls (outfield flies, infield flies and line drives; or, equivalently, all contacted plate appearances excluding groundballs and bunts). The average was .067 HR per air ball, on an average of 1274 air balls.
Francisco Cordero has given up only 31 HR on 825 air balls, for a HR rate of .038. This figure is -3.4 standard deviations from the mean of .067. This is his z-score. On the other end of the scale is HR machine Brett Myers, with 178 HR on 1874 air balls, for a HR rate of .095, and a z-score of +4.9. I repeated this process for all 318 pitchers, getting their z-scores.
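The z-score here is just the gap between the pitcher's rate and the league rate, divided by the binomial standard deviation for his number of air balls. A minimal sketch follows; the function name is mine, and it uses the rounded .067 league rate, so the results can differ slightly in the second decimal from the figures quoted above.

```python
import math

def hr_z_score(home_runs, air_balls, league_rate=0.067):
    """Standard deviations between a pitcher's HR rate and the league rate."""
    rate = home_runs / air_balls
    sd_binomial = math.sqrt(league_rate * (1 - league_rate) / air_balls)
    return (rate - league_rate) / sd_binomial

print(round(hr_z_score(31, 825), 2))    # Francisco Cordero
print(round(hr_z_score(178, 1874), 2))  # Brett Myers
```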
I then took the standard deviation of their z-scores. If there was no such thing as a HR skill, we would expect the standard deviation of their z-scores to be 1.00. But our 318 pitchers have a standard deviation of 1.35. This means that there is a definite HR skill present (the further above 1.00, the more skill there is in the metric).
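One way to convince yourself of that 1.00 baseline is to simulate a league with no HR skill at all. This is only a sketch under simplifying assumptions of mine: every pitcher gets the same 1274 air balls and the same .067 true rate, and the seed and function name are arbitrary. The result should land close to 1.00, give or take sampling noise.

```python
import math
import random
import statistics

def simulate_z_sd(n_pitchers=318, air_balls=1274, hr_rate=0.067, seed=0):
    """SD of z-scores when every pitcher shares the same true HR rate."""
    rng = random.Random(seed)
    sd_binomial = math.sqrt(hr_rate * (1 - hr_rate) / air_balls)
    z_scores = []
    for _ in range(n_pitchers):
        # Each air ball is an independent trial: pure luck, no skill.
        hr = sum(rng.random() < hr_rate for _ in range(air_balls))
        z_scores.append((hr / air_balls - hr_rate) / sd_binomial)
    return statistics.pstdev(z_scores)

print(round(simulate_z_sd(), 2))
```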
***
PART 3
What can we do with this data? Let’s bring it all together.
We can also divide all those terms in the original variance equation by variance(binomial) to get this:
variance(observed)/variance(binomial) = variance(true)/variance(binomial) + 1
Also note that a variance is simply the standard deviation squared. And, we said we got a standard deviation of the z-scores of 1.35. That means this:
1.35^2 = variance(observed)/variance(binomial) = variance(true)/variance(binomial) + 1
The variance(binomial) also follows easily enough from the 1274 air balls and mean of .067, as (.067*(1-.067)/1274)= .007^2
And so:
1.35^2 = variance(true)/(.007^2) + 1
variance(true) = .0063^2
And there we have it. The spread in HR skill per air ball is one standard deviation equal to .0063 HR per air ball.
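The same arithmetic as a short sketch, so you can plug in your own numbers. The function name is hypothetical; the inputs are the ones from the text.

```python
import math

def true_talent_sd(z_sd, mean_rate, avg_trials):
    """Solve z_sd^2 = variance(true)/variance(binomial) + 1 for the true SD."""
    var_binomial = mean_rate * (1 - mean_rate) / avg_trials
    var_true = (z_sd ** 2 - 1) * var_binomial
    return math.sqrt(var_true)

# 1.35 SD of z-scores, .067 HR per air ball, 1274 air balls on average.
print(true_talent_sd(1.35, 0.067, 1274))
```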
Also note that the coefficient of determination, r-squared, is the amount of variance that can be explained by the parameter we are studying. So, it would be variance(true)/variance(observed). 1-r^2 would be variance(binomial)/variance(observed). So, we can update the above as:
1/(1-r^2) = 1.35^2 = variance(observed)/variance(binomial) = variance(true)/variance(binomial) + 1
And so, r=.67, when n=1274
We also have a general equation that says:
r = n / (x+n)
And in this case:
.67 = 1274 / (x+1274)
That makes x = 627
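Those two steps can be sketched as follows, with function names of my own choosing. Note that the 627 comes from using r rounded to .67; carrying the unrounded r through gives a slightly smaller x, which is well within the precision of this kind of estimate.

```python
import math

def r_from_z_sd(z_sd):
    """Solve 1/(1 - r^2) = z_sd^2 for r."""
    return math.sqrt(1 - 1 / z_sd ** 2)

def regression_constant(r, n):
    """Solve r = n / (x + n) for x."""
    return n * (1 - r) / r

r = r_from_z_sd(1.35)
print(round(r, 2))
print(regression_constant(round(r, 2), 1274))  # using the rounded r, as in the text
print(regression_constant(r, 1274))            # using the unrounded r
```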
Therefore, our correlation equation is:
r = AirB / (AirB + 627)
Our regression equation is 1-r. And so:
regression rate = 627 / (AirB + 627)
If you have 627 air balls, you regress the HR rate 50% toward the mean. If you have 1274 air balls, you regress 33% toward the mean. Simple enough?
With Brett Myers’ 1874 air balls, we regress 25% toward the mean. And so, his .095 rate compared to the .067 mean gives us a regressed HR rate of .088. And that becomes our best estimate of Myers’ HR skill.
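As a sketch, the whole regression step fits in one small function. The name and default arguments are mine; the .067 mean and the 627 constant are the career-level values derived above.

```python
def regress_hr_rate(observed_rate, air_balls, league_rate=0.067, x=627):
    """Pull an observed HR-per-air-ball rate toward the league mean."""
    regression = x / (air_balls + x)  # share of the gap to give back
    return observed_rate + regression * (league_rate - observed_rate)

print(round(regress_hr_rate(0.095, 1874), 3))  # Brett Myers
```

A pitcher with very few air balls ends up close to .067 no matter what he has done so far, while a pitcher with several thousand keeps most of his observed rate.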
Ideally, you would do all this by also including a park adjustment. I said the spread was a standard deviation of 1.35 times the luck-based spread. But that 1.35 also includes the spread of the park factors for the HR. The spread is not that great for the HR factor, say 1.02 or 1.03 or something, which has the effect of making the observed spread larger than it should be. Also, since I’m looking at a pitcher’s career (2002-2009), it will include some aging in there, which has the opposite effect of making the spread look smaller than it is.
***
PART 4
Indeed, let’s look at actual seasonal data, rather than this career-level data. From 2002-09, I have 1055 pitcher-seasons with at least 200 air balls, with an average of 313 air balls. The mean rate is .067 HR per air ball, and the standard deviation of their z-scores was 1.13.
Let’s put it in our equation:
1/(1-r^2) = 1.13^2
This gives us an r=.47
In our general equation:
r = n / (x+n)
And in this case:
.47 = 313 / (x+313)
That makes x = 353
Therefore, our correlation equation is:
r = AirB / (AirB + 353)
No matter how you do it, there is a HR skill, and we can see it based on a certain number of air balls. When you have 353 air balls, the HR per air ball is about half skill and half luck. The more air balls you have, the more the skill portion overwhelms the noise component.
So, when deciding whether to include HR in a batted ball ERA metric, you really need to know how many air balls you have. From 2002-09, there were 285 pitcher-seasons with more than 353 air balls, or about 35 pitchers per season. That is, for about one pitcher a team, the HR per air ball metric contained more skill than noise.
And so, if your choice is HR per air ball or not, and it’s an either/or case, then you need to choose not. But as you add career data to a pitcher’s performance, HR per air ball becomes a definite yes: include it. As I said, I have 318 pitchers with at least 600 air balls, and to throw that away would be foolish.
***
CONCLUSION
In the end, the more data you have, the more actual outcomes matter. Our job, as saberists, is to tell you, the reader, the point at which a metric starts to show more signal than noise. And for a pitcher’s HR per air ball rate, that line is around 353 air balls.
***
(Note: I should include park as a parameter, but it won’t affect the results much. It’ll move the point where r=.50 to around 400 air balls. Someone else can pick it up from here if they want.)


Very interesting results. It definitely sounds plausible. I have two questions.
1) What if you only looked at oFB or oFB+iFB, but excluded all LD and HR on LD from your analysis? How much does that change the sample for r=.50?
2) How much is HR% of oFB or (oFB+iFB) or (oFB+iFB+LD) correlated with K, BB, oFB, iFB, LD, or GB rates?
The reason I ask the first question is that if all pitchers have the same line drive skill (~.19 LD/Batted Ball), and a smaller fraction of line drives leave the yard, then the skill is really going to be ((oFB+iFB)/Batted Ball). If 5% of line drives leave the yard and 11% of fly balls do (these are all made up numbers), then a pitcher with .525 GB/.190 LD/.285 (oFB+iFB) skill is going to give up 8.6% HR/AirBall while a pitcher with .430 GB/.190 LD/.380 (oFB+iFB) skill is going to give up 9.0% HR/AirBall. It’s a difference, but it’s related to batted ball skill.
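For anyone who wants to try other mixes, here is a quick sketch of that blended rate. The function name is hypothetical, and the 5% and 11% HR rates are the same made-up numbers as in the paragraph above.

```python
def hr_per_air_ball(ld_share, fb_share, hr_per_ld=0.05, hr_per_fb=0.11):
    """Blend line-drive and fly-ball HR rates into one HR per air ball rate."""
    home_runs = ld_share * hr_per_ld + fb_share * hr_per_fb
    return home_runs / (ld_share + fb_share)

print(hr_per_air_ball(0.190, 0.285))  # groundball-heavy pitcher
print(hr_per_air_ball(0.190, 0.380))  # flyball-heavy pitcher
```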
The reason for the question about correlations is that the answer tells you how useful knowing a pitcher’s HR/FB skill is. My article a couple of weeks ago talked about how BABIP is correlated with K/PA, BB/PA and (GB-oFB-iFB)/PA, enough so that these explain a large fraction of the BABIP variance not attributable to luck, defense or park. If HR/AirBall (or whatever the skill is) is correlated with other skills, it’s possible that it adds information to a constructed run estimator like FIP, but that nothing extra would be needed in a regression.