THE BOOK--Playing The Percentages In Baseball

Monday, April 20, 2009

Another article by Eric Walker with which I have some disagreements (imagine that!)

By , 04:48 AM

Here is the URL:

http://baseballanalysts.com/archives/2009/04/precisely_inacc.php

While he makes some good points, in his usual pedantic and cock-sure (I guess it takes one to know one) fashion, he draws some incorrect conclusions.  Here is the comment I posted on the site:


While the “primer” by Eric is instructive and caution should always be taken when adjusting team or player stats using PF’s…

There are several problems with his thesis:

First of all, all of the issues he speaks of can be handled, statistically, with no problem whatsoever, if one has the know-how and takes the time to do so.  So let us not throw the baby out with the bath water by declaring that “All PF’s are next to worthless, don’t work, cannot and should not be used to adjust player and team stats, etc.”

More importantly, even a “bad” park factor can be useful and can be used appropriately.

In fact, consider this statement by Eric:

“The end results are not totally meaningless: we can say with fair credibility that San Diego’s is a considerably more pitcher-friendly park than Colorado’s, and that the Mets and the Marlins were playing in parks without gross distorting effects. But to try to numerically correct any team’s results--much less any particular player’s results--by means of “park factors” is very, very wrong.”

He is 100% correct in the first part: obviously there is SOMETHING even in “bad” park factors that is able to give us useful information.  Even a “bad” PF generally tells us that COL is more of a hitter’s park than SD or WAS and that TEX is more of a hitter’s park than OAK.  So there must be SOME good information in PF’s, which there is of course.  It is good information combined with noise (and perhaps some biases).

The last part of his statement is flat out wrong, considering the first part.  If the first part is correct, which it is - that even a sloppy PF gives us some useful information, then it HAS to be correct that we can use those PF’s (in some way, shape or form) to adjust player and team stats so that our adjusted stats more accurately reflect a player’s or team’s performance in a park-neutral environment.  HAS TO!

That does not mean, however, that we have to use a “bad” park factor as the purveyor of that PF may or may not want us to use it.  For example, let’s say that our “bad” PF tells us that Coors inflates runs (as compared to the whole league) by 20% and that Petco deflates it by 10%.  Even Eric would agree that while he does not trust those numbers exactly, that they probably are on the right track and are somewhat in the ballpark, no pun intended.

But he, and other “PF naysayers” don’t want us to use those PF’s at all, at least not quantitatively, to adjust player and team stats, at least according to the last part of the statement from him I quoted above.

Poppycock!

Would it be better to, say, take a run scored in Coors (by some team, player, or whatever) and do nothing, or would it be better to, say, adjust it by 1% (even though our “bad” PF says to adjust it by 20%)?  If you said, “Better to adjust (by 1%) than not adjust at all,” you would be correct!

What about Petco and adjusting by 1% versus not adjusting at all? Same answer.  I hope everyone gets my point so far.

So, it is not that these “bad” PF’s are not useful, it is that it is not necessarily correct (although it could be, at least as opposed to not using them at all, as you will see in a minute) to use the exact numerical adjustments that the purveyors of these PF’s might want you to use - for example 20% for Coors and 10% for Petco.

So the answer is not to “not use them at all” which is (incorrectly) throwing the baby out with the bathwater.  The answer is for YOU to figure out how much of the actual PF that some system comes up with to use.  You do that by evaluating the rigor of the system and how much data it uses.  For example, there is absolutely NO reason not to use, for example, at least half of a run factor derived from a system that takes into consideration each team’s schedule and year to year changes in parks and uses 10 years worth of data.  (And by the way, using 10 years of data in just about any PF “system” is going to yield a number that is more accurate than the same system using 1 year numbers almost no matter what, and it is not going to even be close, as long as that system is halfway decent).

Getting back to the Coors and Petco example above and the “bad” (but still reasonable, like the ESPN one) PF system that gives Coors a 120 rating and Petco a 90 rating: if most of us are in agreement that it would still be correct to adjust Coors and Petco numbers by 1% (in opposite directions of course), then what about 2%?  How about 5%?

Where do we cross the line from improving our numbers (by doing SOME adjustments) to doing worse than nothing at all?  I don’t know the answer to that as it depends on, as I said, how the particular system is constructed (is it a good one or not), but I can tell you that for almost any halfway decent system you can always use a little bit of a park factor to neutralize performance and do better than nothing at all.  And even if 120 and 90 are not correct, it STILL might be better to use the entire numbers than do nothing at all!

Let’s say that the real numbers are 115 and 95, which would not be unreasonable for a system that gets 120 and 90 (there is always going to be regression towards 100).  Do you think using 120 and 90 to adjust player and team numbers would be better or worse than using nothing at all?  Again, if you said, “Better,” you would be correct!
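
The Coors and Petco example can be checked with a few lines of arithmetic. What follows is only a sketch: the "true" values of 115 and 95 are the hypothetical ones from the paragraph above, and the function name and the idea of measuring error in PF points are assumptions made for illustration, not anything from the original comment.

```python
# Hypothetical numbers from the example: a "bad" system says Coors = 120 and
# Petco = 90, while the (unknowable) true values are assumed to be 115 and 95.

def adjustment_error(estimated_pf, true_pf, fraction_used):
    """Absolute error, in PF points, left after applying part of a park factor.

    fraction_used = 0.0 means no adjustment (the park is treated as 100);
    fraction_used = 1.0 means the published number is used in full.
    """
    applied_pf = 100 + fraction_used * (estimated_pf - 100)
    return abs(true_pf - applied_pf)

# park name -> (published PF, assumed true PF)
parks = {"Coors": (120, 115), "Petco": (90, 95)}

for name, (estimated, true) in parks.items():
    for fraction in (0.0, 0.05, 0.5, 0.75, 1.0):
        err = adjustment_error(estimated, true, fraction)
        print(f"{name}: use {fraction:.0%} of the PF -> error {err:.2f} points")
```

Under these assumed numbers, using the full 120 cuts the Coors error from 15 points to 5, while for Petco the full 90 and no adjustment at all are both 5 points off. In both parks a partial (regressed) adjustment does better than either extreme.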

#1    Patriot      2009/04/20 (Mon) @ 10:06

Eric’s response to MGL’s critique is disappointing.  It consists of a lot of yelling and a declaration that “this is not a religion”.

No Eric, it isn’t.  Which is why maybe you should take another look at your own arguments. 

He doesn’t even address MGL’s point about regressing the factors.  It’s similar to the hand-waving dismissal of multi-year factors that he gives in the original post.


#2    Tangotiger      2009/04/20 (Mon) @ 10:18

My response, posted here and there:

==============================================
MGL is correct that something is better than nothing.  He is also correct that if you use too much, it might be worse than using nothing.

Take the case of Barry Bonds.  He has had substantial playing time at SF’s home park.  The number of HR he hit at home and on the road in his SF years are virtually identical.  But, if you look at all other LHH, you will find that SF depresses HR by 33%.

If you look at Juan Pierre at Coors, you will find that his HR rate does not change.  If you look at Dante Bichette, you will see that it changes substantially.

The real danger is using a one-size-fits-all park impact number, when each player is unique.  In the case of Bonds and Pierre, no park impact number is better than using any.  In the case of Bichette, you come to the opposite conclusion.

The real position to take is that current park impact numbers get you from step A to some intermediate step, when we still have to get to step Z.  Some people believe that that intermediate step is step B or C, and others believe it’s step X or Y.


#3    Adam B.      2009/04/20 (Mon) @ 11:27

Also, if you use his arguments you basically never use math to adjust any numbers ever again. As in, he wouldn’t adjust minor leaguers’ numbers at all to try and predict their major league performance, when it’s likely there is a substantial amount of regression needed. And by his argument, shouldn’t we not have aging curves, either? Yeah, he just doesn’t get it.


#4          2009/04/20 (Mon) @ 11:59

Am I missing something?

If you use a small sample size, you get something that’s not reliable.  If you average the results from a bunch of independent park effects from small sample sizes, you get something more reliable.  And, even then, there’s some random error and you can’t expect perfect accuracy.

That’s just elementary statistics.  But the conclusion that park effects shouldn’t be used ... WTF?
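
The "elementary statistics" here is the usual square-root rule for averaging independent estimates. A minimal sketch follows, assuming (purely for illustration) that a single sample's park factor carries random noise with a standard deviation of 8 PF points; that figure and the function name are hypothetical and do not come from the thread.

```python
import math

def standard_error(single_sample_sd, n_samples):
    """Standard error of the mean of n independent, equally noisy estimates."""
    return single_sample_sd / math.sqrt(n_samples)

# Assumption for illustration only: one small sample's park factor estimate
# has a noise standard deviation of 8 PF points.
one_sample_sd = 8.0

for n in (1, 3, 5, 10, 15):
    se = standard_error(one_sample_sd, n)
    print(f"{n:>2} samples averaged -> noise SD about {se:.1f} points")
```

The noise keeps shrinking as samples are added but never reaches zero, and each extra sample helps less than the one before it.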


#5          2009/04/20 (Mon) @ 12:13

I posted there:

==================

If your point is that the ESPN numbers are not perfectly accurate, I agree with you.

How accurate do you think they are? Saying that they’re “comically insufficient” may be true, but that statement is not useful unless it’s quantitative. In numerical terms, how useful or accurate are they? Are you saying that the 95% confidence intervals for parks are so wide that they all embrace 1.000? Are you saying they’re even wider?

Can you be a little more quantitatively precise so that we can evaluate your claim?


#6    ubelmann      2009/04/20 (Mon) @ 14:16

My main complaint is similar to Phil’s.  In my reading of Eric’s article, I don’t see anywhere where he sets forth a clear test for whether or not a PF is useful.  That seems to be what we need.  Once we have a test for whether or not a PF is useful, then we can start to sort through which PF methodology gives the best results and then we can also get a handle on what the variance is around a specific PF (which ought to vary from year-to-year thanks to weather and other factors.)

Some of the problems that Eric bemoans are sources of systematic uncertainty (moving fences, use of the DH rule, etc.) and then there is obviously going to be some statistical variance thrown in that we can’t ever completely get rid of, but we can reduce with larger sample sizes.  Obviously by increasing the sample size we put ourselves at risk of increasing the systematic uncertainties in the PF, but I’m sure that we can adjust away a fair amount of the systematic uncertainty anyway.


#7    ubelmann      2009/04/20 (Mon) @ 14:28

Phil/4:

Increasing the sample size *will* reduce the statistical uncertainty but it won’t reduce the systematic uncertainty, and could potentially increase the systematic uncertainty.

In a sense, we are trying to analyze a poorly-constructed experiment.  If we were specifically interested in how parks impact offensive levels, our experiment probably wouldn’t look very much like the current MLB regular season.  So we have lots of systematic biases to adjust for (weather, different home/road lineups, use of the DH, platoon effects, etc.), and those biases don’t go away by increasing the sample size (or worse, they change and appear to go away when they do truly exist.)

For instance, if it is somewhat cold at a park in years Y-1 and Y+1, but particularly warm at that park in year Y, then we wouldn’t necessarily want to average them all together.  If we hold all other things equal, the PF in years Y-1 and Y+1 ought to be different than the PF in year Y.  Ideally, we would understand how the temperature affects offense, adjust each year (or even better, each game) individually and then average over everything to kill the statistical noise.

In my mind, the way to do things is to list out all the sources of systematic uncertainty and then to figure out which are the largest and try to adjust them out.  Essentially, that’s why we have park factors in the first place--park effects are a source of systematic uncertainty in player performance that we can’t get rid of by taking a larger sample size.


#8          2009/04/20 (Mon) @ 14:39

Ubelmann/4:

Sure, you have a bit of variation in the observed park effect from year to year because of weather and so forth.  So what?  Almost any measurement of anything has a certain amount of extraneous noise.  And you can estimate that noise by taking the variance of the observed year-to-year effects.

The conventional wisdom is that the noise is fairly small compared to the consistent effects.  Eric Walker is saying that the noise factor is so large that all park effects are good for is distinguishing Colorado from San Diego.

I say that it’s not enough to point out there might be extraneous effects.  Given the observation that park effects are reasonably consistent, critics have to show us some numbers.  Show the width of the confidence interval, and maybe we’ll agree with you.  But just saying “there are other factors involved” isn’t much of a criticism.


#9    MGL      2009/04/20 (Mon) @ 14:52

Tango, I’ve said this before:

While it may be true that for some individual players you might not want to use ANY traditional park factors, that would actually mean that the traditional park factors are not extreme enough for other players (for example, if Coors is 120 and it does not affect Pierre, Bonds, and Bichette at all, then for everyone else it must be something like 121).  If you don’t know those players going in, then it is still better to use some kind of park factor for EVERYONE!

Yes of course, you are making a mistake on a few players, but, as Adam B. correctly points out above, that is true of ANY adjustment.  Some minor leaguers for whatever reason do not translate as everyone else does from minors to majors.  We still want to do MLE’s for everyone!

I just read Eric’s response to my post. It is atrocious.  I don’t want this thread to turn into the one on his Steroids article.  At least I am not going to get involved in that.  I’ll just say that I was kind to Eric in pointing out that he gives a lot of useful information in the above-referenced article concerning how to improve upon computing useful park factors, such as controlling and adjusting for park changes, schedule, and the like.  But it is clear to me that he cannot engage in any kind of rational, cogent discussion with anyone who disagrees with anything he says (and there is lots to disagree with).

BTW, if we NEVER used park factors and the numbers they produce are near worthless, as Eric seems to imply, how would we ever know that Coors is a hitters’ park and Petco is a pitcher’s park, and to take any numbers at those parks with a grain of salt?  Even looking at a player or team’s home/road splits (like looking at all Rockies’ players splits over their career) is a “park factor” isn’t it?  His thesis is nonsense plain and simple, unless he wants to admit that his thesis is simply that any numbers that are produced by a park factor “system” need to be regressed and that some are better than others.  That’s all that really needs to be said.


#10    ubelmann      2009/04/20 (Mon) @ 15:07

Phil/8:

The problem is not that some of these biases (like weather) are noise, the problem is that they are signal.

We are trying to measure the non-player effects that change the offensive environment, so in a perfect world, we’d rather not wash them out by averaging over a number of years, but we would study each bias, measure it, adjust for it, and then average the adjusted numbers over a period longer than one season to reduce statistical noise.  If we average before we adjust for a systematic bias, we risk losing real information about the park factor and that loss of information could outweigh the benefit of reduced statistical uncertainty. 

I haven’t studied this in-depth enough to say how much I agree with the conventional wisdom on PF (though I certainly am closer to CW than Mr. Walker.) I suspect that there is a certain degree to which researchers on the subject want PFs to be consistent from year-to-year and they may be averaging over too much data in an effort to get consistent PFs.  I would think that a good litmus test for the worth of a PF would be how much the PF helps to explain changes in performance when a player moves from one team to another within a season or from one season to the next.  That sort of a litmus test would allow for the possibility that PFs are inherently inconsistent from year-to-year, even if measured perfectly.

I don’t agree with Eric’s stance that we ought to just throw our hands up and give up--I just meant to point out that increasing the sample size does not always lead to a better estimate.


#11    Tangotiger      2009/04/20 (Mon) @ 15:09

Barry Bonds, since 2000:
home: 160 HR / 1217 contacted PA (66 per 500 contacted PA)
road: 157 HR / 1247 (63 per 500)

I don’t have my “other LHH on Giants or their opponents”, but I seem to remember it was something on the order of 12 per 500 at home and 18 per 500 on the road.

One SD per 500 contact PA for Bonds is 4.7 HR.  If we presume that he should hit 63/18 * 12 HR at home, then he should have 42 HR per 500, and his actual 66 is 5.1 SD above that.  If we use the odds ratio method (which is my preferred method), it bumps the expected figure up to 43 HR.

Clearly, Bonds is not affected by his home park like other LHH hitters.

At the same time, there’s no reason for us to think that Bonds’ 66/63 reverse split is legitimate.  If I had to guess, I’d say it should probably be 61 HR per 500 PA at home, and 68 on the road.

While most LHH hit 40% of their HR at Pac Bell, and Bonds hit 51% of his at Pac Bell, his true split is probably around 47% (after accounting for home field advantage).
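
The arithmetic in this comment can be reproduced roughly as follows. This is a sketch, not the commenter's own calculation: the "other LHH" rates of 12 and 18 per 500 were quoted from memory, and the comment does not say how the 4.7 HR standard deviation was derived, so the binomial formula used below is an assumption that happens to land close to it. All function and variable names are made up for illustration.

```python
import math

# Figures quoted in the comment (Bonds since 2000, per contacted PA).
home_hr, home_pa = 160, 1217
road_hr, road_pa = 157, 1247
# Recalled from memory in the comment, so treat as rough:
# other LHH home runs per 500 contacted PA, at home and on the road.
other_home_per_500, other_road_per_500 = 12, 18

def per_500(hr, pa):
    """Home runs per 500 contacted PA."""
    return 500 * hr / pa

def to_odds(p):
    return p / (1 - p)

def from_odds(odds):
    return odds / (1 + odds)

bonds_home = per_500(home_hr, home_pa)
bonds_road = per_500(road_hr, road_pa)

# Simple ratio method: scale the road rate by the other hitters' home/road ratio.
expected_ratio = bonds_road * other_home_per_500 / other_road_per_500

# Odds ratio method: apply the same park effect to odds rather than raw rates.
park_odds_ratio = to_odds(other_home_per_500 / 500) / to_odds(other_road_per_500 / 500)
expected_odds = 500 * from_odds(to_odds(bonds_road / 500) * park_odds_ratio)

# Assumption: binomial noise over the actual home sample, rescaled to per-500.
p = bonds_road / 500
sd_per_500 = math.sqrt(home_pa * p * (1 - p)) * 500 / home_pa

print(f"Bonds per 500: home {bonds_home:.1f}, road {bonds_road:.1f}")
print(f"Expected at home, ratio method: {expected_ratio:.1f}")
print(f"Expected at home, odds ratio method: {expected_odds:.1f}")
print(f"Assumed SD per 500: {sd_per_500:.2f}")
print(f"Gap in SDs (ratio method): {(bonds_home - expected_ratio) / sd_per_500:.1f}")
print(f"Share of HR at home: Bonds {home_hr / (home_hr + road_hr):.0%}, "
      f"other LHH {other_home_per_500 / (other_home_per_500 + other_road_per_500):.0%}")
```

The odds ratio version gives a slightly higher expectation than the plain ratio because it accounts for the rate being bounded, which matters more for a hitter with a home run rate as high as this one.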


#12          2009/04/20 (Mon) @ 15:14

Ubelmann/8:

Anything that is consistent from year to year, like warmer or colder weather, is legitimately part of the park factor!  Park X may be partly a hitter’s park because it’s warm, and partly because of its dimensions.  But the park factor itself doesn’t care how much of each is involved.

But, sure, any changes, like moving the fences, don’t allow you to average over multiple years.  And I agree with you that the constancy of players over the years might introduce a systematic bias, if the players are particularly tailored for the park. 

But, again, you have to use numbers to show what size effect we’re talking about here.  Arguments of the type “you didn’t correct for everything, and therefore your analysis is worthless” are themselves worthless, unless they make a good argument that the missing corrections are significant.

(BTW, in “averaging” I wasn’t referring to multiple years, but to the averaging of the multiple parks that Eric did in his essay.)


#13    ubelmann      2009/04/20 (Mon) @ 15:28

Of course, even if Bonds’ reverse split is legitimate, there are a couple of ways to view that.  One is that we shouldn’t expect his HR totals to increase if he moves to a neutral park. 

The other is that since he is not affected while other hitters are (apparently) affected, using the generic park factor adjustment would still be useful for improving the estimate of how much Bonds contributed to Giants wins.  A HR in a low-scoring environment is worth more than a HR in a high-scoring environment and if our goal is to study how many wins Bonds added to the Giants in a particular year, then he ought to be credited with more wins/HR in SF than he would be in a neutral park.

That’s not any kind of a new observation, but I always found it to be an important observation.  The correct park factor depends not just on the data in front of us; it also depends on why we’re making the adjustment in the first place.


#14    Tangotiger      2009/04/20 (Mon) @ 15:33

Right.  For the purposes of this discussion, it’s about inferring true talent, not on the impact of his HR in the context of the run environment.


#15    MGL      2009/04/20 (Mon) @ 15:34

Ideally, you want to isolate the yearly fluctuations in weather within a park, which can be done but is usually not of course.  But if you don’t or can’t, so what?  It is just one of many things that don’t get adjusted!  For example, if you were to park adjust player A’s stats for the last 3 years, but it was particularly cold in his park for one of those years, it is simply not going to make that much of a difference and the likelihood that it is particularly warm or cold (or windy in one direction or another) in any one park in any one year is not that great, especially as compared to other parks (if it is warm everywhere in one year, your “normalizations” to league averages, assuming you do that, should take care of that).

Again, ideally you want to either use long term park factors that take into consideration changes in that park and changes in the whole year, as well as schedule differences, and THEN weather adjust player and team stats as well, OR you want to use those same long-term park factors, with all the above corrections/adjustments and then incorporate any yearly differences in weather into those park factors.  If you do all that, you have some pretty darn good park factors!  Why is that so hard to accept?  Whether they apply equally to all players (which I am sure they don’t) does not bother me in the least.  Again, why should I be particularly bothered?  That is the case with almost any adjustment we do.  And BTW, there are statistical methods to tailor park adjustments to individual players.  For example, if we have an overall HR PF of .90 in SF, and Bonds’ 10 year (or 5 year, or whatever) splits are 1.00, then we can “combine” the two (using Bayesian probability) and use something like .95 (I don’t know what the correct number is).  In fact once we can estimate what the true “spread” of “player unique” PF’s are, we can determine how much to combine the two numbers…
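
One common way to "combine" a park-wide factor with a player's own split is a shrinkage-style weighted average. The sketch below is an assumption about what such a combination could look like, not MGL's actual method; the function names and the variance figures are hypothetical, and as the comment says, the correct weight is not known here.

```python
def combine_park_factors(league_pf, player_pf, player_weight):
    """Weighted blend of a park-wide factor and a player's own observed split.

    player_weight is between 0 and 1. How large it should be depends on the
    player's sample size and on the true spread of player-specific park
    effects, neither of which is given in the comment.
    """
    if not 0.0 <= player_weight <= 1.0:
        raise ValueError("player_weight must be between 0 and 1")
    return (1 - player_weight) * league_pf + player_weight * player_pf

def weight_from_variances(noise_variance, true_spread_variance):
    """Standard shrinkage weight: true spread / (true spread + noise)."""
    return true_spread_variance / (true_spread_variance + noise_variance)

# Hypothetical variances chosen so the weight comes out to one half, which
# reproduces the ".95" figure floated in the comment.
weight = weight_from_variances(noise_variance=0.01, true_spread_variance=0.01)
print(f"weight on the player's own split: {weight:.2f}")
print(f"combined HR PF: {combine_park_factors(0.90, 1.00, weight):.2f}")
```

If players truly differ very little in how a park affects them, the weight falls toward zero and the park-wide .90 dominates; the larger the true spread (and the larger the player's sample), the more his own split counts.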


#16    Tangotiger      2009/04/20 (Mon) @ 15:53

Right, just like Andy did to figure out the true platoon splits of a player.


#17    ubelmann      2009/04/20 (Mon) @ 15:58

Phil/12:

The average weather is certainly a legitimate part of the park factor that you will capture by averaging over many years.  However, the variations around the average weather are *also* a legitimate part of the park factor and you will underestimate them if you average over multiple seasons and you may completely miss them if you average over too many seasons.  (The weather is only an example here, there could be other things that vary from year-to-year that ought to be adjusted for.)

I’m just trying to point out that there’s a trade-off in averaging over larger sample sizes: we lower the statistical uncertainty but we may raise the systematic uncertainty. 
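
The trade-off can be made concrete with a toy model. Everything in it is assumed for illustration: the one-year noise of 8 PF points, the idea that a park's true factor drifts steadily by half a point per year, and the function name. Real systematic effects need not behave like a linear drift at all.

```python
import math

def total_error(n_years, one_year_noise_sd, drift_per_year):
    """Rough RMS error of an n-year average park factor for the current season.

    Assumptions (illustrative only): each season has independent noise with
    the same SD, and the park's true factor drifts linearly by drift_per_year,
    so an n-year window is centred (n - 1) / 2 years in the past.
    """
    statistical = one_year_noise_sd / math.sqrt(n_years)
    systematic = drift_per_year * (n_years - 1) / 2
    return math.sqrt(statistical ** 2 + systematic ** 2)

for years in (1, 3, 5, 10, 15, 30):
    err = total_error(years, one_year_noise_sd=8.0, drift_per_year=0.5)
    print(f"{years:>2} years -> combined error about {err:.1f} points")
```

In this toy setup the combined error falls as years are added, bottoms out somewhere around a decade, and then rises again as the systematic term takes over. Where the minimum falls depends entirely on the assumed sizes of the two terms.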

Since you clarified that you meant his averaging over the multiple parks, I think I see your point more clearly.  Eric is claiming that it has nothing to do with the methodology (therefore, it seemingly has nothing to do with adjusting for systematic biases) and everything to do with inadequate sample size (which presumably means that the statistical uncertainty is too large for his taste.) And if his claim is that the statistical uncertainty is the only problem, then averaging over many small samples should improve our estimate.


#18    ubelmann      2009/04/20 (Mon) @ 16:08

MGL/15:

I would certainly agree that if you do all of those adjustments you would have a pretty darned good park factor.  I didn’t mean to focus the discussion so much on the weather, though, that was just one example of a systematic bias. 

I don’t think that any of these adjustments are impossible--some would just be more difficult/time-consuming than others and some of the systematic biases are small enough that it might not be worth our time to bother with them.  If there’s a list out there of systematic biases in PFs with estimates of their impact on the accuracy of the PFs, that would be extremely useful in deciding which things to worry about and which things to ignore.


#19    MGL      2009/04/20 (Mon) @ 19:16

BTW, that is EXACTLY how I do my park factors (making adjustments for all those things), and I still regress quite a bit even with 10 or 15 years worth of data.


#20    terpsfan101      2009/04/21 (Tue) @ 00:35

MGL,

What is the maximum number of years that you would recommend using for a park factor? When I calculated historical park factors, I followed a suggestion by Tango about using as many years as possible (as long as the park isn’t significantly altered). After reading some of the comments in this thread, I now see that this is a very flawed approach. I guess I took Tango’s suggestion to the extreme, as I have a 68 year PF for Fenway Park and a 71 year PF for Wrigley Field.


#21    MGL      2009/04/21 (Tue) @ 02:52

I think that the more years the better, within reason.  Obviously if you go too far back, the other parks can and will change significantly, not to mention other fundamental things about the game that could easily affect PF’s.  Plus, when you are trying to increase sample size to reduce your error, you get diminishing returns as that sample gets larger.  I don’t know the answer to your question.  I don’t think there IS an answer.  It is one of those things where we know that 5 years is better than 3 and 71 years is too much.  But where in the middle is optimal, I have no idea.  I use up to around 15 years, but I adjust for all parks and for changes in a park.  Almost all parks change.  Boston added luxury boxes about 20 years ago which significantly changed how it plays (much less of a hitter’s park), and Wrigley got rid of some foul territory a few years ago and did some other renovations to their bleachers which also changed how it plays.


#22    terpsfan101      2009/04/21 (Tue) @ 05:52

15 years sounds like a good limit. I will go back and redo the 15+ yr PF’s. The PF’s I use are Runs divided by adjusted games, meaning that they are essentially R/O. I correct them for other parks in the league and regress them using MGL’s equations that Patriot has posted on his blog.

As far as park changes are concerned, I only look at the dimensions. The only data I have on foul territory is the distance between home-plate and the backstop. But if there is information on foul territory changes I do utilize it. For example, both Dodger Stadium and Kauffman Stadium have recently reduced their foul territory. I also take into account the humidor at Coors beginning in 2002. I don’t think Wrigley Field has altered their dimensions since 1938. What year is MGL talking about when he says Wrigley got rid of some foul territory a few years ago?

