THE BOOK--Playing The Percentages In Baseball

Thursday, January 10, 2008

Wanna do some peer review?

By Tangotiger, 12:31 PM

Some academics don’t like us.  Others though appreciate our no holds barred approach.  I firmly believe that any punch above the belt is a fair punch.  Of course, once the guy is down, you back off.  (I hate those old hockey brawls where the guy would pin someone to the ice, and still keep punching the guy.) If you can offer criticism in an honest manner, without personally insulting someone, I think it’s fair.  You might come off as arrogant, or rough around the edges, but those are things that the reader can get past.  Respect, and quality of content.  That said, I got an email:


Tom,

My name is Bobby Swift. I am a senior at Harvard College and a member of the Harvard Sports Analysis Collective. Our paper, “Improving Major League Baseball Park Factor Estimates” was recently accepted for publication by the Journal of Quantitative Analysis in Sports. The paper will be published in April.

We develop a new estimator for Park Factors using an ANOVA weighted fixed-effects model for run generation. Our estimates are significantly less variable and more consistent on a year to year basis than ESPN’s published estimates. We are currently in the process of receiving feedback and reviews of the paper.

I am contacting you because we would appreciate, if you are willing, some feedback on our paper. I thought the feedback you and MGL gave Shane Jensen on his SAFE system was insightful and spot on. I think that comments from you, MGL, and other readers of your blog would be incredibly helpful.

Here is a link for our paper:
http://www.hcs.harvard.edu/%7Ehsac/Blog/wp-content/uploads/2007/12/pfpaper.pdf
Here is a link for our blog:
http://www.hcs.harvard.edu/~hsac/Blog/

Best,
HSAC

I haven’t looked at it yet, but you guys know I have strong feelings on the matter of park factors.  Let’s check it out, and see what we can do to help.

#1    ElBonte      (see all posts) 2008/01/10 (Thu) @ 13:51

I understand you’re probably looking more for a review/critique than a proofread, but there is a spelling error on page 5:

“Sheehan, unfortunately, does not disclose
Davenport[’]s methodology.”


#2    will      (see all posts) 2008/01/10 (Thu) @ 14:38

"We then used the actual 2006 season schedule,
and treated each game as a draw from a normal distribution centered at the
expected run value...”

My impression was that using a Normal distribution for run scoring is a problem (not sure how it affects their paper; is the CLT relevant?).
Salb has suggested and worked with Weibull distributions, has he not?


#3    jinaz      (see all posts) 2008/01/10 (Thu) @ 14:41

I’m going to try to read the paper over lunch.  I did want to respond to this comment though, as a card-carrying academic:

Some academics don’t like us.  Others though appreciate our no holds barred approach.

Academics are trained to be the harshest and most critical people in the world (or, so we like to think).  Speaking at least from personal experience, reviews that I’ve seen on my own papers, grants, etc, have been far harsher than much of what goes on here.  Perhaps the tone is more cordial.  For example, I’ve never been called “stupid” in a review, which is a word that gets thrown around here from time to time (and, I’d add, isn’t a particularly effective criticism).  But in my experience, reviewers have no problem reporting criticisms that they feel completely undermine a multi-year study and make a paper unpublishable. smile

Therefore, if you’ve met with resistance in the past from academics with respect to your criticisms, keep in mind that we tend to be quick to defend our work given the rampant criticism we often encounter.  Also, it wouldn’t surprise me if some academics are more prone to brush off the critiques of “amateurs” simply because they’re not also academics. 

In the case of the folks that frequent this board, of course, you’re probably more qualified to critique a study like this--at least with respect to the baseball elements of it--than the academics who actually reviewed this paper prior to its acceptance in this journal.  I’d also add that we should keep in mind that, given that the paper has already been accepted, our comments here are very unlikely to affect the paper itself.  But that doesn’t mean that they won’t consider changes for future work.
-j


#4          (see all posts) 2008/01/10 (Thu) @ 15:11

I guess the first thing to note is that the PF they present in this model is additive and not multiplicative.  Does that make sense?  I see that they “translate” the PF to a multiplicative one at the end of the day, but if the model is that the PF is additive then they should present the additive factor.  A discussion justifying the use of an additive versus a multiplicative PF would be useful as well.  I am agnostic as to which one is appropriate.

Yes, the Weibull is the appropriate distribution from which to draw runs, since there is a lower limit on runs scored but no upper limit.  I don’t have the time at this moment, but if the authors are interested in generating the appropriate Weibull distributions I would be happy to discuss that with them.  My email is linked on my username.
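
A minimal sketch of the point about the lower limit, using Python's standard library. The mean, standard deviation, and Weibull shape and scale below are illustrative assumptions, not values fitted to any run-scoring data; the only thing shown is that a normal draw can go negative while a Weibull draw cannot.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

MEAN_RUNS = 4.5      # assumed runs per team-game
NORMAL_SD = 3.0      # assumed spread for the normal draw
WEIBULL_SCALE = 5.0  # assumed; not fitted to any data
WEIBULL_SHAPE = 1.8  # assumed; not fitted to any data

# Draw simulated single-game run totals from each distribution.
normal_draws = [random.gauss(MEAN_RUNS, NORMAL_SD) for _ in range(10_000)]
weibull_draws = [
    random.weibullvariate(WEIBULL_SCALE, WEIBULL_SHAPE) for _ in range(10_000)
]

# Count impossible (negative) run totals under each distribution.
negative_normal = sum(1 for x in normal_draws if x < 0)
negative_weibull = sum(1 for x in weibull_draws if x < 0)

# The Weibull count is always 0; the normal count is not.
print(negative_normal, negative_weibull)
```

Whether the negative tail matters for the paper's simulation is a separate question; the sketch only shows that the tail exists.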


#5    will      (see all posts) 2008/01/10 (Thu) @ 15:50

"We assume that the runs a team scores in a game are generated by a
linear process. Controlling for home field advantage and the offensive and
defensive strength of the home and away teams, the Park Factor adds or
subtracts runs from the number that would have been scored in a neutral
park.”

I’m fairly confident that additive is wrong - on the ranges encountered in MLB the approximation may be acceptable - but I definitely think multiplicative is a better approach. Even if it takes extreme scenarios I prefer my model never to predict negative run scoring!


#6    David Gassko      (see all posts) 2008/01/10 (Thu) @ 16:27

This is a very interesting paper, and well-done, IMO. I do have a few comments, however:

- First of all, I don’t get the “inflationary bias.” The example given in the paper (with a two-team league where one park doubles run scoring and the other halves it) clearly shows the bias introduced by not controlling for a team’s opponents’ parks, and not anything else as far as I can tell. I’m not sure this actually affects the results, though.

- Assuming that run scoring is linear is also incorrect, since it is clearly governed by some kind of odds-ratio type process. I think this does have some effect on the results because of the authors’ use of detailed game-by-game data (which is great, by the way).

- The way to test which method is better is to apply those park factors to the RC/27 for players who have switched parks and see which better predicts their performance in a new ballpark.

- Obviously, runs scored are not distributed normally, but as a Weibull distribution. However, if I recall an old e-mail conversation I had with Andy Dolphin, who knows 1,000 times more about statistics than I ever will, that may not actually matter.

- And finally, park factors are most definitely closer to being additive than multiplicative. I have no problem with presenting them in multiplicative form, since that is a much more familiar method, but again, correct park factors would be applied with the odds-ratio method, which is closer to being additive than multiplicative.

Again, overall, this is a very good and interesting paper.


#7    Guy      (see all posts) 2008/01/10 (Thu) @ 16:42

"And finally, park factors are most definitely closer to being additive than multiplicative. I have no problem with presenting them in multiplicative form, since that is a much more familiar method, but again, correct park factors would be applied with the odds-ratio method, which is closer to being additive than multiplicative.”

So if I’ve got a park that is +0.5 R/G or 111, you’re saying we would expect a 2.50 pitcher to give up 3.00 R/G rather than 2.78 R/G in that park. Correct?  Interesting.  Why do you think that’s the case?  And have you seen empirical data consistent with that?
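
For readers following the arithmetic, here is a small sketch of the two adjustments being compared. The function names are hypothetical, and the league average of 4.5 R/G is an assumption chosen so that +0.5 R/G and a factor of 111 describe the same park.

```python
def additive_adjust(rate, park_runs_added):
    """Additive view: the park adds the same number of runs for everyone."""
    return rate + park_runs_added


def multiplicative_adjust(rate, park_factor):
    """Multiplicative view: the park scales everyone's rate by the same ratio."""
    return rate * park_factor


league_rpg = 4.5        # assumed league average, so +0.5 R/G equals a 111 park
park_runs_added = 0.5
park_factor = (league_rpg + park_runs_added) / league_rpg  # about 1.111

pitcher_rate = 2.50
print(round(additive_adjust(pitcher_rate, park_runs_added), 2))    # 3.0
print(round(multiplicative_adjust(pitcher_rate, park_factor), 2))  # 2.78
```

The two views agree for a league-average pitcher and diverge as the pitcher moves away from average, which is why extreme players are the natural test case.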


#8    tangotiger      (see all posts) 2008/01/10 (Thu) @ 17:08

Actual runs per game should be applied multiplicatively, since it is very similar to an odds ratio!  It is runs per 27 outs.

When I showed how a guy giving up 2.5 runs in a 5.0 run league would do in a 3.0 run league, it was a straight ratio.  In both cases, he gives up runs at 50% of the league average.

(You can prove this by either using my Markov model, or converting the runs per game figure to an OBP, use Odds Ratio to convert OBP into the new environment, then turn that OBP back into a runs per game figure.)

HOWEVER, if you do something like OBP, or HR/PA or whatnot, then you MUST use Odds Ratio.


#9    MGL      (see all posts) 2008/01/10 (Thu) @ 17:09

I have always thought that park (run) factors were closer to being multiplicative than additive, but I think it has been shown to be some combination.  Didn’t Tippett present a paper at one of the SABR conferences, looking at that?  Shouldn’t it be easy to test, using extreme, but aggregate, data for players?


#10    David Gassko      (see all posts) 2008/01/10 (Thu) @ 17:49

Maybe run factors are. Sean Smith has shown that factors for strikeouts certainly are not:

http://mvn.com/mlb-stats/2007/04/18/to-add-or-to-multiply-part-i-of-many/


#11    Mike Flatt      (see all posts) 2008/01/10 (Thu) @ 20:30

I thought the additive/multiplicative debate was settled.  When using component park factors (and measuring ability) it is best to not use the multiplicative method (that is, if a player hits 40 HRs in a neutral stadium and is traded to a team with a HR park factor of 1.15, you say he’ll now hit 40*1.15 = 46 HRs).  They are based on the fact that every player is affected the same, which is of course not true.  I remember MGL saying that he used different HR park factors for different parts of a ballpark and then applied those numbers as an additive effect to opportunities.  Seems to make sense…

So, for a park run factor, which is based on runs per out, it should be okay to use the multiplicative method because you’re simply measuring value.


#12    Bobby Swift      (see all posts) 2008/01/10 (Thu) @ 21:10

Everyone - thanks for the feedback. We are in the middle of our final exam period right now, but we will try to respond as soon as possible. Thanks again.


#13          (see all posts) 2008/01/11 (Fri) @ 00:10

What about testing other types of splits? By adjusting for team, you’re solving for 3 parameters per team (O, D, park) plus a generic home-field advantage. You guys know better than I do whether you run into standard error problems, but maybe you can check whether you can get tighter park factor estimates with other splits or with fewer dummies.

I guess the best example I could think off the top of my head is handedness, as it is about the biggest obvious split there is out there (and hence less likely to be clouded by data randomness). Have dummy variables for lefties and righties (or if you want to do more work: the four possible L/R pitcher-batter matchups) and check their RC (or other good offensive stat) in different parks. This gets you only four handedness parameters plus one per park. Of course the flip side is that you’re controlling for fewer things with fewer dummies.

I don’t know if this is better than your team-by-team split method or not. But regardless, if both methods are reasonable they should yield similar answers for your park factors. It’s another good way of testing whether your numbers are reasonable.


#14    MGL      (see all posts) 2008/01/11 (Fri) @ 05:18

I am reading the paper now.  I am going to be harsh, because that is what I do.  I apologize in advance.  Also, I have to break this up into two posts, as it is too long for one.

Here is the first paragraph I am confused by:

To illustrate this point, consider a simple two-team league. Team A plays their games in a park with PF = 2.0 and Team B plays their games in a park with PF = 0.5. The teams are equally balanced in every other way, so that both teams would be expected to score four runs per game in a neutral park (PF = 1.0). When they play one game in team A’s park, the expected score is 8-8, and when they play one game in team B’s park, the expected score is 2-2. If we use the ESPN model to recalculate the park factors, the Park Factor for park A = 4.0, and the Park Factor for park B = 0.25. The actual Park Factor values are thus biased further from neutral, and the effects of each park appear much more severe than they actually are.

That scenario is not possible.  A league cannot consist of two parks where one has a PF of 200 and the other 50.  All park factors in a league have to average 100!

If in one park, 16 total runs are scored and in the other, 4 total runs are scored, then the average runs scored in the league is 10, which means that park A has a PF of 160 and B, 40, for an average of 100.  Why give an example that cannot exist?

The reason you have this asymptotic problem is that the ESPN equation they cite is wrong (it is not rigorous - it is a truncated version or whatever you want to call it).  You need to include a portion of the runs scored and allowed at home, and of the home games, in the denominator in order to generate a PF which is “home park divided by ALL parks combined” and not just “home park divided by road parks.” If you do it the right way, then the asymptotic problem disappears.  That needed to be explained!

The correct formula of course, not counting adjusting for innings played at home and on the road, which also needs to be done of course, is:

PF = ((RShome+RAhome)/Gameshome) / ((RSroad+RAroad + (RShome+RAhome)/X) / (Gamesroad + Gameshome/X))

Where X is the number of parks in the league minus 1.

This is the so-called “other park correction factor” in the rigorous formula.

Of course, the ESPN formula works pretty well if there are lots of parks in the league.  When you cite an example (even a correct or plausible one, which was not done in the paper) using 2 parks, OF COURSE you will be way off if you don’t use the “other park correction factor.”

In the 2-team example where 16 runs are scored in one park and 4 in the other, the PF for park A is:

(8+8)/((2+2+8+8)/2)

which is 1.6 and not 4.0!
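
The two-park arithmetic can be checked with a short sketch. The function and argument names are hypothetical; the formulas are the home/road ratio described in the paper's example and the corrected formula given above in this comment, ignoring the innings adjustment.

```python
def espn_style_pf(rs_home, ra_home, g_home, rs_road, ra_road, g_road):
    """Home runs per game divided by road runs per game (no correction)."""
    return ((rs_home + ra_home) / g_home) / ((rs_road + ra_road) / g_road)


def corrected_pf(rs_home, ra_home, g_home, rs_road, ra_road, g_road, n_parks):
    """Home runs per game divided by a league rate that includes the home park."""
    x = n_parks - 1  # X in the formula: number of parks minus 1
    home_runs = rs_home + ra_home
    league_rate = (rs_road + ra_road + home_runs / x) / (g_road + g_home / x)
    return (home_runs / g_home) / league_rate


# Team A's view of the two-park league: 8-8 at home, 2-2 on the road.
print(espn_style_pf(8, 8, 1, 2, 2, 1))    # 4.0
print(corrected_pf(8, 8, 1, 2, 2, 1, 2))  # 1.6

# Team B's view: 2-2 at home, 8-8 on the road.
print(corrected_pf(2, 2, 1, 8, 8, 1, 2))  # 0.4
```

With the correction, the two factors (1.6 and 0.4) average 1.0, as they must.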

A simple PF is simply the number of expected runs scored in a park divided by the number of expected runs scored in ALL PARKS, assuming that league-average teams played in all parks.

Again, the “ESPN formula” is NOT that!  It is a ratio of one park to “all other parks” and not to the league as a whole.  That should have been clearly mentioned.

To be honest, it is not like you (the authors) are breaking any ground by noting that the ESPN is wrong.  Anyone who does any serious PF’s, at the very least, uses the “other park correction factor.”

The next thing, the biased or imbalanced schedule, is a separate issue.  Yes, it biases the PF’s (again, not to a large degree, unless the “unbalancedness” is large and the difference among parks is also large), and certainly a method that adjusts for that would be better.  Nothing groundbreaking there though.  I have been doing park adjustments for almost 20 years in which I account for every park that a player plays in, and I compute my park factors using a method which accounts for an unbalanced schedule.  I use an iterative method.  First I compute PF’s as if everyone played everyone else the same number of times (a pure balanced schedule).  Then I redo the PF’s using each team’s previously computed (as if a balanced schedule) PF’s.  I keep going several more times.  I think I will come out with a pretty good PF using that method.
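
One way an iteration of this kind could look is sketched below. This is an assumption about the general shape of such a method, not MGL's actual procedure: the function name, the data layout, the renormalization step, and the synthetic three-park league are all made up for illustration, and team strength is held equal so that only the schedule effect is visible.

```python
def iterate_park_factors(home_rate, road_rate, road_games, n_iter=100):
    """Repeatedly re-estimate PFs, deflating each road rate by the parks visited."""
    parks = list(home_rate)
    pf = {p: 1.0 for p in parks}  # first pass treats every road park as neutral
    for _ in range(n_iter):
        new_pf = {}
        for team in parks:
            games = road_games[team]
            total = sum(games.values())
            # Average current factor of the parks this team actually visited.
            road_mix = sum(n * pf[p] for p, n in games.items()) / total
            # Convert the road rate to a neutral-park rate, then compare to home.
            new_pf[team] = home_rate[team] / (road_rate[team] / road_mix)
        mean = sum(new_pf.values()) / len(new_pf)
        pf = {p: v / mean for p, v in new_pf.items()}  # keep the average at 1.0
    return pf


# Synthetic league (assumed): equal teams, 9 total runs per game in a neutral park.
true_pf = {"A": 1.2, "B": 1.0, "C": 0.8}
neutral = 9.0
road_games = {
    "A": {"B": 10, "C": 6},
    "B": {"A": 6, "C": 10},
    "C": {"A": 10, "B": 6},
}
home_rate = {t: neutral * true_pf[t] for t in true_pf}
road_rate = {
    t: sum(n * neutral * true_pf[p] for p, n in g.items()) / sum(g.values())
    for t, g in road_games.items()
}

estimates = iterate_park_factors(home_rate, road_rate, road_games)
print({p: round(v, 3) for p, v in estimates.items()})
```

On this noise-free toy data the iteration settles on the factors used to generate it; real data would add sampling noise and differences in team strength that this sketch does not handle.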

Their figures have been so widely referenced by sportswriters and statisticians alike that it is astounding such errors have persisted.

I don’t know about their computational errors, but…

I have NEVER, EVER seen or heard ANY sportswriters or statisticians reference ESPN’s PF’s!  I was not even aware that ESPN computed their own PF’s.  Now, I am sure that someone has referenced them, but to say, “so widely referenced...” is just not right, I don’t think.

To give an example for how severe these
errors, coupled with the intrinsic biases of the model, are, ESPN still shows that Oakland’s park factor was 1.357 in 2001, 0.703 in 2002, and 0.515 in 2003, despite there having been no alterations to McAfee Coliseum in this
century.

Huh?  How is that an example of computational error?  How do we know that there are computational errors there?  And when you say, “despite there having been no structural alterations...” you are implying that a park’s run factor should be constant each year!  Huh?  We all know that a properly computed PF can fluctuate widely from year to year, due to random error, weather conditions, and structural changes in OTHER parks.

I believe you if you say that some or all of these numbers were computed incorrectly, but the fact they are different is NOT evidence that they were computed incorrectly, as you are implying (although it is unlikely that a PF would fluctuate that widely if they were computed correctly).

Such numbers are totally unrealistic...

I am belaboring the point, but I don’t know what you mean by that.  None of those numbers by themselves are “unrealistic” other than perhaps the .515.  The other ones are entirely plausible, although perhaps not together (it is very unlikely, but not impossible, that a park would have a PF of 1.4 one year and .7 another).

By adding variables for park dimensions to
the analysis, Click (2005) was unable to improve on his estimates for next year’s Park Factors...

I am sure that I read Click’s study, but I don’t remember it explicitly.  However, if someone cannot improve a PF (an estimate of a subsequent year PF) by including park dimensions (or weather, altitude, wall heights, etc.) in their model, they are doing something wrong!

That would be like saying that I cannot improve my Marcel player projection model for hitters by including their weight, height, and speed.  Of course I can!

Now, I realize that you are not explicitly agreeing with Click or have not duplicated his study and his findings, but you are implicitly agreeing with that notion - that one cannot improve a PF model by including park dimensions.  That is nonsense!  Of course you can, even though there are clearly many other things that go into a PF, some things changing from year to year (like weather and team composition) and other things not (like wall height and altitude), not the least of which is random fluctuation.

Several sabermetricians have acknowledged the flaws in the standard model for calculating Park Factors and have tried to correct for these biases. Sheehan (2001), for instance, describes the scheduling bias and its effect on Park Factors. Sheehan asserts that his colleague, Clay Davenport, uses a method for calculating Park Factors that is “actually weighted by games played in each individual park.” Sheehan, unfortunately, does not disclose Davenport’s methodology.

OK, it is nice that you mention that there are other, better PF formulas than ESPN’s (which is the most basic and a bad one at that).

And I have just revealed my method for accounting for scheduling bias (using an iterative process).  Of course there are plenty of other, fairly easy ways of doing that, including a rigorous method (mine is an approximation).  Either way, it is no great secret (devising a method for accounting for scheduling bias), I don’t think.

Several other sabermetricians have noted the variability of the results, but none has proposed an adequate solution for solving the problem.

I don’t know what you mean here, especially, “solving the problem.” The primary reason for variability (I assume that you mean year to year within each park, although you do not explicitly state that, and you should) is that we are dealing with one-year samples and lots of random fluctuation, independent of the park.  Accounting for imbalanced schedules ain’t gonna change that year to year variability much, assuming that teams play pretty much the same schedule each year, which they do.  And you ain’t going to get rid of the random variability.  So I don’t know what you mean by “solving the problem.” You ain’t gonna solve the problem, if by “problem” you mean year to year variability.  If you mean “estimating a true PF” or coming up with a PF that is best for subsequent years, then the only way to “solve the problem” is to use as many years as possible to reduce the noise (sample error) and to make sure that you constantly account for park changes and schedule changes. 

I will read on and cont. in next post…


#15    MGL      (see all posts) 2008/01/11 (Fri) @ 05:21

Click correctly explains this, but also notes that even by adding several years to the calculations, much of the between-year variability remains. He asserts that Park Factors are not as constant as they should be, but does not propose any methodological improvements to correct for the inherent biases.

I don’t get this either.  If each year is done correctly and is an unbiased estimate of a park’s true PF, adding years is not going to decrease the between-year variability!  I don’t get why you (or Click) say(s), “much of the between-year variability remains.”  How is adding years supposed to change the between-year variability?  The between-year variability will always remain constant (it will fluctuate, of course) and will be entirely a function of the random sample error and measurement errors.  The latter we can try to reduce and the former we can do nothing about.  But in either case, adding years will not change anything as far as the year-to-year or “between-year” variability is concerned.  It will only (presumably and hopefully) get our multi-year PF estimate closer to the true PF, in terms of our confidence in that estimate.  But we all know that last part.

Your discussion of Thorn’s (and others) improvements seem out of place.  They should have been discussed at the beginning.  You should have presented the ESPN simple formula, explained why it is not very good (what it is lacking) and then simply explained how to improve it.  You also need to explain more clearly the distinction between two concepts.  One, using more years to estimate a park’s true PF and two, how to compute a sample PF more rigorously in any one year (by basically adding two things - one a more rigorous runs per inning rather than game, and two, by adding the other park correction factor - which you did not explain very well, BTW).  Those are two completely different concepts and you seem to conflate the two.

Because there are relatively few interleague games, and because the designated hitter rule differs in the National League (NL) and the American League (AL), but will be confounded with Park Factors, we estimated the AL and NL separately.

I am sorry to be snarky here, but “Really?”

We assume that the runs a team scores in a game are generated by a linear process.

I honestly don’t know what that means.

Controlling for home field advantage and the offensive and defensive strength of the home and away teams, the Park Factor adds or subtracts runs from the number that would have been scored in a neutral park.

That has been discussed here already a little and the consensus seems to be that we are not sure that this is correct and/or to what extent.  I don’t know how the level or correctness of that statement/model would affect the results.  I also don’t know whether if the first statement is correct (that run scoring is a linear process) that means that the second (that a PF is additive) is also correct.  The reason I don’t know is that I don’t know what “run scoring is a linear process” means.  Is that statistical jargon or just English/semantics?

Now you are getting into the nuts and bolts of your methodology which is above me (ANOVA). I have a few questions though:

Why would a decrease in within year PF SD (I assume the SD of the differences between the parks’ PF) suggest a “better model” and one that decreases bias?  What if the larger differences among parks that one model finds is genuine?  And what bias is being decreased?  I have no doubt that your method is much better than the ESPN formula of course (since the ESPN formula is trash), but couldn’t your method just as well come up with a larger within year SD?

We ran a separate regression for each year, and assessed the estimators on two criteria: First, on within year standard deviation to check the standard estimator’s inflationary bias, and second, on between year standard deviation to see which estimator produced more consistent results. The seven-year averages for each of the estimators and the relevant standard deviations are shown in Tables 1, 2, and 3.

As I said, before, I don’t see why a “better” model would produce more consistent year to year results (reliability).  Quite the opposite could easily occur.  A bad model is often very reliable. What if our model was so bad that it computed a PF of 1.0 for every team.  Wouldn’t that show a low (zero) year to year variability (variance).  Same thing for within year variability among the parks.  Couldn’t a bad model show a low variance as well?
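
The constant-PF thought experiment is easy to put in numbers. In the sketch below the "true" park factor and both sets of yearly estimates are made up for illustration; the point is only that year-to-year spread and accuracy are different measurements.

```python
import statistics


def rmse(estimates, truth):
    """Root mean squared error of yearly estimates against the true value."""
    return (sum((e - truth) ** 2 for e in estimates) / len(estimates)) ** 0.5


true_pf = 1.10                                     # assumed true park factor
noisy_estimates = [1.02, 1.18, 1.07, 1.15, 1.09]   # made-up unbiased, noisy model
constant_estimates = [1.00] * 5                    # "every park is neutral" model

# Year-to-year spread: the constant model looks perfectly consistent.
print(statistics.pstdev(constant_estimates))
print(round(statistics.pstdev(noisy_estimates), 3))

# Accuracy against the truth: the constant model is the worse of the two.
print(round(rmse(constant_estimates, true_pf), 3))
print(round(rmse(noisy_estimates, true_pf), 3))
```

A comparison based on how well each estimator predicts out-of-sample results would separate the two models where standard deviation alone cannot.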

I am actually shocked at how similar your and ESPN’s means (PF’s) are!  Doesn’t that tell us that even a basic, shitty PF formula is pretty darn good, or at least almost as good as your rigorous one?

We were able to estimate this for all of the AL parks since none has changed since 2000...

1) In 05, SkyDome changed the type of turf they use (NexTurf, more like grass).

2) In 04, Kauffman (KC) moved their fences back to pre-95 levels, severely changing the PF.

3) In Comerica in 2003, LC was moved in 25 feet.

4) In 2004, I think that Comerica lowered some of their fences, but I am not sure.

5) In 2001, at Camden Yards they moved home plate back 7 feet (and thus the OF dimensions increased).  In 2002, they moved it back.

6) In 2001, they significantly changed the OF dimensions in U.S. Cellular (CWS).

7) 2004, the Metrodome replaced their old turf with Fieldturf, also a grass-like material.

Guys, that was just really bad research/fact-checking!

I have no doubt that your method is a good one for computing rigorous PF’s which correct the three major flaws in ESPN’s formula - not using an “OPC” factor, not adjusting for unbalanced schedules, and using games played rather than IP.

I also have no doubt that your method is going to far “outperform” that of ESPN in any analysis.

What I would have really liked to see was how your ANOVA method would fare against a basic “real” (not ESPN’s truncated one) PF formula which uses the OPC factor, uses IP rather than games, and uses some kind of method to correct for unbalanced scheduling, like my iteration method.  My guess is that the results would be essentially identical.  Any thoughts?


#16    tangotiger      (see all posts) 2008/01/11 (Fri) @ 08:40

If we use the Markov model:
http://www.tangotiger.net/markov.html

At these levels:
AVG / OBP / SLG
0.270 / 0.341 / 0.405

You get this:
4.905 : Runs Scored per Game

(It doesn’t matter that they might not be exact.  We are establishing a baseline.)

If you apply an Odds Ratio factor of 1.25 to OBP ratio

... That is, take OBP of .341, convert that to an odds ratio of .341/.659 or .517, multiply that by 1.25 to get an odds ratio of .6468, which implies an OBP of .393…

you will get an OBP of .393.  If you go back to my Markov model, setting AB to 31.65 and hitting CALCULATE gives you:

AVG / OBP / SLG
0.316 / 0.393 / 0.474
7.049 : Runs Scored per Game

So, the RPG went up by 44%.

Now, repeat the process but start your baseline way lower.  Set the AB to 45, and you get this baseline:
AVG / OBP / SLG
0.222 / 0.286 / 0.333
3.183 : Runs Scored per Game

Repeat the OBP ratio conversion by multiplying the ratio by 1.25

... .286/.714, or .401, becomes .5007 as a ratio, or a OBP rate of .333…

we get an OBP of .333.  Set AB to 38, and you get this Markov:
AVG / OBP / SLG
0.263 / 0.333 / 0.395
4.618 : Runs Scored per Game

And 4.618 divided by 3.183 is 1.45, or a 45% increase.

As you can see, virtually identical to the first case.

When it comes to park factors, for RUNS per OUT, you use the multiplicative method.  It’s the reason ERA+ works so well.
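
The OBP conversions in this comment follow a simple recipe: rate to odds, multiply, odds back to rate. A sketch is below (the function name is hypothetical). The Markov run values are copied from the comment, not recomputed, since the Markov model itself is not reproduced here.

```python
def apply_odds_ratio(rate, factor):
    """Convert a rate to odds, scale the odds, and convert back to a rate."""
    odds = rate / (1.0 - rate)
    new_odds = odds * factor
    return new_odds / (1.0 + new_odds)


# First baseline: OBP of .341 with an odds-ratio factor of 1.25.
print(round(apply_odds_ratio(0.341, 1.25), 3))  # 0.393

# Second baseline: 14 times on base per 49 PA, which is the .286 OBP above.
low_obp = apply_odds_ratio(14 / 49, 1.25)       # one third, up to rounding

# Run-scoring ratios, using the Markov outputs quoted in the comment.
print(round(7.049 / 4.905, 2))  # 1.44
print(round(4.618 / 3.183, 2))  # 1.45
```

The same odds-ratio bump to OBP produces nearly the same percentage change in runs per game at both baselines, which is the case for treating the run factor as multiplicative.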


#17    studes      (see all posts) 2008/01/11 (Fri) @ 09:28

To reiterate one of MGL’s points, there have been some major changes in the structure of ballparks during the time in question.  I don’t know how much this impacts the results, but it’s a big flaw in the methodology.

And, like MGL, I’m dumbfounded that they consider ESPN to be “the” source of park factors.  And Thorn, Palmer’s method isn’t really new.  It was explained back in 1984, in The Hidden Game of Baseball.  Theirs is the same method used by BRef.


#18    John      (see all posts) 2008/01/11 (Fri) @ 09:35

Are there any sites online besides ESPN that give Park Factors?  I do have some from the Bill James Handbooks.  I’m looking for the most accurate ones.


#19    Mike Flatt      (see all posts) 2008/01/11 (Fri) @ 13:57

MGL, when doing park adjustments for linear weights and your other stats, do you weigh the RPG at home against the RPG on the road or compare home context to league context?


#20    Tangotiger      (see all posts) 2008/01/11 (Fri) @ 14:01

Mike, MGL already said it:

The correct formula of course, not counting adjusting for innings played at home and on the road, which also needs to be done of course, is:

PF = ((RShome+RAhome)/Gameshome) / ((RSroad+RAroad + (RShome+RAhome)/X) / (Gamesroad + Gameshome/X))

Where X is the number of parks in the league minus 1.

This is the so-called “other park correction factor” in the rigorous formula.


#21    jinaz      (see all posts) 2008/01/11 (Fri) @ 14:10

John/18,

I like Patriot’s:
http://gosu02.tripod.com/id103.html
-j


#22    john      (see all posts) 2008/01/11 (Fri) @ 14:24

Looks good to me.

Thanks


#23    MGL      (see all posts) 2008/01/11 (Fri) @ 15:44

There are plenty of sites that print PF’s.  Just google it.  B-Ref is the best known I would assume.


#24    jinaz      (see all posts) 2008/01/11 (Fri) @ 15:54

I like Patriot’s because they include regression (using MGL’s coefficients, apparently), they’re well thought-out, and seem reasonably rigorous without going overboard in trying to control for every little thing.  And, they’re super-easy to apply to players. -j


#25    MGL      (see all posts) 2008/01/11 (Fri) @ 15:56

The Patriot linked article above (#21) is a good primer.  And of course, in my long post I forgot to include X/(X+1) as a multiplier for the road data.  (Where X is the number of parks in the league minus 1; this is the “other park correction factor,” which Patriot explains well.) The best explanation is that the road data should include the home park as well, since your PF is the ratio of the home park to the league and not the home park to all other parks.  So for the league data, you obviously use 13/14 of the road data and 1/14 of the home data in a league with 14 parks, making sure you normalize (scale) everything to per game or per out or per inning or whatever before you weight the road data by the 13/14 and the home data by 1/14.


#26    David Gassko      (see all posts) 2008/01/11 (Fri) @ 17:27

“We assume that the runs a team scores in a game are generated by a linear process.

I honestly don’t know what that means.”

Mickey, I think that means that the number of runs a team will score in a game can be estimated linearly, in this case, using team RS, opponent RA, and a dummy for HFA.


#27    tangotiger      (see all posts) 2008/01/11 (Fri) @ 17:47

If that’s the case, that’s also wrong.  The runs scored is multiplicative, since runs per game (i.e., runs per out) is very close to an odds ratio.

A 5RPG team facing a 4RPG team in a 3RPG league will score 6.67 runs (5*4/3).

Again, using Markov:
set AB to 36.7, and you get: RPG = 5.00, OBP = .344 (14 safe, 26.7 outs)
set AB to 40.5, and you get: RPG = 4.00, OBP = .315 (14 safe, 30.5 outs)
set AB to 46.25 and you get: RPG = 3.00, OBP = .279 (14 safe, 36.25 outs)

Now, apply Odds Ratio method to those OBP:
14/26.7 * 14/30.5 / (14/36.25)
= ratio of .623 or OBP rate of .384

To get that OBP rate, you need 14 safe and 22.46 outs.  So, set AB to 32.46, and you get:
6.64 runs scored per game

My estimate above was 6.6667.

So, you see, it works.
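The Odds Ratio step above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the function and variable names are made up, and the Markov model that turns an OBP into runs per game is not reproduced here, so the 6.64 figure is taken as given.

```python
def odds_ratio_matchup(odds_a, odds_b, odds_league):
    """Combine two rates, expressed as odds (safe per out), against a league baseline."""
    return odds_a * odds_b / odds_league


safe = 14.0
# Outs per 14 times safe for the 5, 4 and 3 RPG environments above.
outs_5rpg, outs_4rpg, outs_3rpg = 26.7, 30.5, 36.25

combined_odds = odds_ratio_matchup(safe / outs_5rpg,
                                   safe / outs_4rpg,
                                   safe / outs_3rpg)
combined_obp = combined_odds / (1.0 + combined_odds)
outs_needed = safe / combined_odds

print(round(combined_odds, 3))  # 0.623
print(round(combined_obp, 3))   # 0.384
print(round(outs_needed, 2))    # 22.46

# The shortcut that treats runs per game as multiplicative.
shortcut_rpg = 5.0 * 4.0 / 3.0
print(round(shortcut_rpg, 2))   # 6.67
```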


#28    M23      (see all posts) 2008/01/11 (Fri) @ 23:04

When Patriot says “To sum for a team, (T-1)*Road+Home((T-1)/T% is road, 1/T% home),” is the *Road+Home filled in with the RPG on the road and at Home?


#29    M23      (see all posts) 2008/01/11 (Fri) @ 23:06

Also, how does his formula account for games played at home and on the road like MGL’s?


#30    KJOK      (see all posts) 2008/01/12 (Sat) @ 03:03

In addition to Patriot’s explanation, Mr. Walker also has a pretty good explanation of one of the better ways to calculate park effects here:

http://highboskage.com/stat-corrections.shtml


#31    MGL      (see all posts) 2008/01/12 (Sat) @ 04:58

M23, no matter how you express it, you simply want to use home stats per some unit, such as outs, IP, games, PA, etc., and league stats, also per the same unit, and create a ratio.  League stats are going to be comprised of 13/14 of the road stats plus 1/14 of the home stats in a 14 team (park) league.


#32    Alex D'Amour      (see all posts) 2008/01/12 (Sat) @ 14:53

Hi, I’m one of the authors on the paper and I really appreciate the criticism we’re getting here. I did most of the methodology and implementation for the paper. My background is mostly in statistics, and baseball is a very new area of application for me, so forgive any naivete in the terminology that I use and for anything obvious that I’m clearly missing. I haven’t had time to look over the Patriot PFs or tangotiger’s Markov model, but I’ll be doing that shortly (like Bobby said earlier, it’s finals period here). This post is in response to MGL’s long one.

I just want to address a couple of issues that jumped out at me. MGL said that our simple 2-team example was impossible because the park factors need to average to (i.e. have arithmetic mean) 100 in a league. This is a little strange to me since the values we’re generating are multiplicative factors, so shouldn’t we instead be demanding that the park factors have a geometric mean = 100?

I think that we have different ideas of how the park factor estimate should be used—in our case, we’re trying to find a number by which we can scale statistics generated in a particular park (I guess whether or not this is a valid goal is debatable). This at least seems to be how the ESPN PFs are marketed. How would the park factors that you generate (1.6, 0.4) be applied to recover the fact that park A scales results by 2x and park B scales them by 0.5? In our case, we assume that a park scales the “base number” of runs (which is a different number from the average runs per game) with respect to a “neutral park” whose PF is assumed to be 1.

Also, at least popularly, it seems that the truncated ESPN park factors are the most commonly cited—perhaps not in the sabermetrics community, but definitely on the web as a whole. They are the first hit on Google for “park factors”, and the second for “park factor”, so a lot of people are linking to them. I think this still makes our paper relevant, but clearly not as relevant as we thought it was at first. For future papers we’ll definitely consult this community first before making blanket statements like the ones we did in this paper.

But on the subject of ESPN estimators, I think it’s pretty clear that the ones that were posted (and are still posted) are pretty ridiculous. Fluctuating between multiplicative park factors of 1.4 and 0.7 does seem a little extreme (if the factors are applied as I mentioned above). Our suspicions that they were way too variable were actually confirmed when we recalculated the ESPN estimates following the formula published on their website: the numbers that they’ve published are VERY DIFFERENT from what that formula gives, and much further away from 1.

That being said, we were clearly comparing our new estimator to a straw man. When I have some more time, I’ll definitely run a similar simulation-based comparison between our estimator and the ones mentioned here to see how similar or different they are. It’ll be good to see whether the assumptions that we’ve made (many incorrect) have a large effect on the accuracy of our estimator compared to the ones you mention. I especially like the idea of the iterative MCMC style method that MGL mentioned, and was throwing a similar idea around a few days before submitting the paper for publication.

The final thing I want to touch on now is the question of within-year and between-year variability. Those measures were just developed as first-order comparisons to see if we were on the right track. Given our discovery of severe inflationary bias in the truncated park factors, we assumed that some “tucking in” of the PF estimates within each year would indicate some sort of bias correction. Similarly, we assumed (perhaps incorrectly) that minus some large structural changes the park factor should be relatively constant between years (assuming we controlled for the right things). These clearly aren’t measures of absolute quality, they were just simple non-rigorous questions we tried asking to see if our estimates were on the right track. The main results that we relied on were the compelling simulation-based tests that showed the sampling distributions of the two estimators.

Anyway, thanks again for peer reviewing. The comments are far more helpful than the ones we got from the journal, and we’ve clearly got a lot of catching up to do before we can make a real cutting edge contribution to this subject.

-- Alex


#33    MGL      (see all posts) 2008/01/13 (Sun) @ 00:01

Alex, I don’t know if the arithmetic or geometric mean of all PF’s should be 1.00, if we assume that the PF’s are multiplicative - I’ll have to defer to the statisticians on that one.

However, regardless, it is still impossible to have the scenario you presented in the article.  You say that park A doubles run scoring and park B halves it and that average run scoring is 4.0 rpg per team.  Well, if park A is 16 rpg and park B is 4 rpg, then average rpg in the league is 5 rpg per team and not 4.  You can’t have it both ways!  So the example is still impossible, whether all the PF’s have to have a geometric, arithmetic, harmonic, or whatever, mean of 1.00.  You simply can’t have a 2-team league where one park doubles run scoring and the other halves it.  That would require the following formula:

r = (2*r + .5*r)/2

where r = runs per game, which solves to zero!

The stuff about the Oakland PF’s was pretty superfluous.  If I saw numbers like that, I would probably redo the calcs also, but the fact that those numbers were all over the place did NOT mean that they were incorrectly computed.  As you know, sample error has unlimited boundaries.  And for 81 home and 81 road games, you have lots of random fluctuation (sample error) in run scoring!  One SD in 81 games is more than half a run a game, I think!

As far as my iterative method, although I think it gives you a near perfect answer (perhaps after an infinite number of iterations, it gives you a perfect answer), I think that the article in HBH, linked to in #30 above, gives you the rigorous method for adjusting for schedule and it is not very difficult or complicated.


#34    Tangotiger      (see all posts) 2008/01/13 (Sun) @ 00:39

If you only have two parks, and in one park the average runs scored is 8 and in the other it’s 4, then it’s a true statement to say that one park scores runs at twice the rate and that another park scores runs at half the rate.

However, 2.0 and 0.5 are not the park factors.

The average runs scored in a park is 6.  And therefore, the park factors are 8/6 (1.333) and 4/6 (0.667).

You can’t do park factors where the denominator is different for each park (8/4=2 and 4/8=0.5).  Of course the PF won’t average out to 1.0.  But they shouldn’t, since you are trying to add numbers with different denominators!  Only someone who likes OBP+SLG would think of adding numbers with different denominators.
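A small Python sketch of the two-park example above, comparing factors built on a common denominator (the league average) with the mismatched-denominator ratios. The function name and the park labels are made up for the illustration.

```python
def park_factors_vs_league(runs_by_park):
    """Divide each park's average runs by the same league-wide average."""
    league_avg = sum(runs_by_park.values()) / len(runs_by_park)
    return {park: runs / league_avg for park, runs in runs_by_park.items()}


runs = {"A": 8.0, "B": 4.0}
factors = park_factors_vs_league(runs)
mean_factor = sum(factors.values()) / len(factors)
print({p: round(f, 3) for p, f in factors.items()})  # {'A': 1.333, 'B': 0.667}
print(round(mean_factor, 3))                         # 1.0

# Park-to-park ratios use a different denominator for each park,
# so their average is not 1.
ratios = {"A": runs["A"] / runs["B"], "B": runs["B"] / runs["A"]}
mean_ratio = sum(ratios.values()) / len(ratios)
print(ratios, mean_ratio)  # {'A': 2.0, 'B': 0.5} 1.25
```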


#35    Alex D'Amour      (see all posts) 2008/01/13 (Sun) @ 00:58

MGL, just wanted to clarify again, we don’t assume that the “base number” of runs is the same as the mean number of runs. It’s actually the geometric mean of the number of runs. Since we’re assuming park factors are multiplicative in the paper, we would assume that the arithmetic mean of runs (5) would be higher than the geometric mean of runs, which is actually 4 (sqrt(16*4)/2).

The idea is that to have these be actual comparators against a neutral park, since they’re multiplicative, we have to scale them multiplicatively around 1. When you normalize an additive constant, you subtract out the mean, and similarly when you normalize a multiplicative constant, you divide out the geometric mean. Since we’re working in a multiplicative paradigm in the paper, we set the geometric mean to 1 for our imaginary league. I’m pretty sure that in a world of multiplicative park factors, our scenario is possible.

Now I’m not sure how the odds ratio thing is supposed to work, but I’ll look into it.
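To make the geometric-mean normalization in this comment concrete, here is a minimal Python sketch using the two-park numbers from the discussion (16 and 4 runs per game, both teams combined). The helper name and park labels are made up, and this only illustrates the arithmetic; it does not settle which normalization is the right one for park factors.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


runs = {"A": 16.0, "B": 4.0}  # runs per game, both teams combined

arith = sum(runs.values()) / len(runs)  # 10 per game, i.e. 5 per team
geo = geometric_mean(runs.values())     # 8 per game, i.e. 4 per team

# Dividing out the geometric mean gives factors whose geometric mean is 1.
factors = {park: r / geo for park, r in runs.items()}
print(round(arith / 2, 3), round(geo / 2, 3))        # 5.0 4.0
print({p: round(f, 3) for p, f in factors.items()})  # {'A': 2.0, 'B': 0.5}
```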


#36    KJOK      (see all posts) 2008/01/13 (Sun) @ 03:24

In addition to HBH (Walker), another person who does a nice job on park factor calculations is Boyd Nation, who calculates NCAA Park Factors.  College Park Factors have the challenge of both non-standard schedules and differing levels of competition thrown in.

He gives an overview of his methods here:

http://www.boydsworld.com/breadcrumbs/realpf.html

and here’s some more details on his method:

Get all your scores identified.  (Because very few pairs of college teams play home-and-home in the same year, I have to use multi-year sets.)

For each pair of teams who played games at both sites, calculate the runs per game scored at each site, then calculate the relative park factor for those two teams by dividing one park’s average by the other (it’s never actually happened in one of my data sets, but be sure not to divide by zero here).

Set everyone’s park factor to 100 (or whatever your base scale is).

Do this next loop until done:

    Set the sum for each team to 0.

    For each pair, do these steps:

        Set the available pool of points to the sum of the two park factors.

        For the first team, add to their sum the percentage of the pool represented by the relative park factor divided by the relative park factor plus 1 (in mathematical terms, part1 = pool * (relpf(pair) / (relpf(pair) + 1))).

        For the second team, add the remaining portion of the available pool to their sum.

    Set each team’s park factor to their sum divided by the number of pairs that they’re part of.

    If nobody’s park factor changed by more than .01 since the last time through, you’re done.
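The loop described above can be sketched in Python roughly as follows. This is one reading of the steps, not Boyd Nation's actual code: the function name, the dictionary layout (relative park factors keyed by a pair of team names, first park's average divided by the second's), and the max_iter safety cap are all assumptions added for the illustration.

```python
def iterate_park_factors(rel_pf, base=100.0, tol=0.01, max_iter=1000):
    """rel_pf maps (first_team, second_team) -> first park's runs per game
    divided by the second park's runs per game."""
    teams = sorted({team for pair in rel_pf for team in pair})
    # Number of pairs each team is part of.
    n_pairs = {team: 0 for team in teams}
    for first, second in rel_pf:
        n_pairs[first] += 1
        n_pairs[second] += 1

    pf = {team: base for team in teams}
    for _ in range(max_iter):  # max_iter is a safety cap, not in the description
        sums = {team: 0.0 for team in teams}
        for (first, second), rel in rel_pf.items():
            pool = pf[first] + pf[second]
            part1 = pool * (rel / (rel + 1.0))
            sums[first] += part1
            sums[second] += pool - part1  # remainder of the pool
        new_pf = {team: sums[team] / n_pairs[team] for team in teams}
        biggest_change = max(abs(new_pf[t] - pf[t]) for t in teams)
        pf = new_pf
        if biggest_change <= tol:
            break
    return pf


# Two parks, one scoring at twice the rate of the other.
two_parks = iterate_park_factors({("A", "B"): 2.0})
print({t: round(v, 1) for t, v in two_parks.items()})  # {'A': 133.3, 'B': 66.7}

# Three parks whose relative factors are consistent with 120 / 100 / 80.
three_parks = iterate_park_factors(
    {("A", "B"): 1.2, ("B", "C"): 1.25, ("A", "C"): 1.5})
```

On the two-park case this lands on 133.3 and 66.7, the same 8/6 and 4/6 split given in #34.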

