THE BOOK--Playing The Percentages In Baseball


Tuesday, January 29, 2008

MGL Component Park Factors

By Tangotiger, 11:19 AM

Google Docs.


#1    MGL      (see all posts) 2008/01/29 (Tue) @ 19:45

For clarification, the way I did the year to year changes in each component (the second two charts) was to simply subtract the old park’s PF (for each component) from the new one’s PF.  If it is an additional park/team, then of course I subtracted 1.00 from the new park’s PF.

For the total changes from 93-07, I simply added up all the yearly changes.
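As a minimal sketch of that arithmetic (Python, with made-up HR factors; the function names are hypothetical and not taken from the spreadsheet):

```python
def yearly_change(new_pf, old_pf=1.00):
    """Change in one component's PF when a park is replaced.

    For an added park/team there is no old park, so old_pf defaults to 1.00.
    """
    return new_pf - old_pf


def total_change(yearly_changes):
    """Accumulated change over a span of seasons: the plain sum of the yearly changes."""
    return sum(yearly_changes)


# Hypothetical HR factors, for illustration only.
changes = [
    yearly_change(1.10),        # an added park with a 1.10 HR factor
    yearly_change(0.95, 1.02),  # a team moves from a 1.02 park to a 0.95 park
]
accumulated = total_change(changes)
```

With these invented numbers the two yearly changes are about +0.10 and -0.07, and the accumulated change is about +0.03.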

To compute the “other park factors” to see what the unbalanced schedule does to each team, here is what I did:

Every time a team plays in its home park, I add 1/15 of its home PF (for each component of course) to the “tally” for that team.  (Technically, it should be 1/14 for the AL and 1/16 for the NL.)  Each time a team plays on the road, I add the full PF of the park it is visiting (the home team’s PF) to the “tally”.  Then I divide the tally by the number of games played on the road plus 1/15 of the number of games played at home.  Again, I exclude all inter-league games.

If a team played all other teams in the league equally, this number should always equal 1.00 for all components.  But they do not play each team equally, so the number represents the total, composite PF (for each component) of the “league” that each team plays in, where “league” means the actual teams they play against and the number of games played against each team.
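The tally can be sketched in a few lines of Python. This is an illustration only: the function name, the toy 16-team league, and every PF and game count below are invented, and each road game is taken to add the PF of the park being visited.

```python
def other_park_factor(home_pf, home_games, road_games, home_weight=1 / 15):
    """Composite PF of the "league" a team actually plays in, for one component.

    home_pf     -- the team's own home PF
    home_games  -- number of intra-league home games
    road_games  -- list of (games_played_there, host_park_pf) pairs
    home_weight -- fraction of the home PF added per home game (1/15 in the post)
    """
    tally = home_games * home_weight * home_pf
    road_total = 0
    for games, host_pf in road_games:
        tally += games * host_pf
        road_total += games
    return tally / (road_total + home_weight * home_games)


# Hypothetical 16-team league whose PFs average exactly 1.00.
pfs = [1.0 + 0.01 * (i - 7.5) for i in range(16)]

# Balanced schedule: team 0 plays 90 home games and 6 games in each other park.
balanced = other_park_factor(pfs[0], 90, [(6, pf) for pf in pfs[1:]])

# Unbalanced schedule: 12 games in each of the 5 highest-PF parks,
# 3 games in each of the remaining 10 parks (still 90 road games).
unbalanced = other_park_factor(
    pfs[0],
    90,
    [(12, pf) for pf in pfs[11:]] + [(3, pf) for pf in pfs[1:11]],
)
```

In this toy league the balanced schedule comes out at 1.00, while the schedule tilted toward the hitter-friendly parks comes out above 1.00.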

Also, keep in mind that since these “OPF’s” affect a team’s PF, I used a recursive process to first compute each team’s (multi-year) PF’s, with no OPF adjustments, then adjusted them for the unbalanced schedule (using the first set of OPF’s), then redid the OPF’s, then redid the adjusted PF’s, then redid the OPF’s, etc.  In reality, you only have to do that a few times, before everything settles in.
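A rough Python sketch of that back-and-forth follows. The post does not say how the OPF adjustment is applied to a team's PF, so the update rule here is purely an assumption: adjusted PF = raw PF times OPF, rescaled so the league averages 1.00. The team names, schedule, and raw PFs are invented.

```python
def opf(team, pfs, road_schedule, home_games, w):
    """Schedule-weighted composite PF of the parks `team` plays in."""
    tally = home_games[team] * w * pfs[team]
    road_total = 0
    for host, games in road_schedule[team].items():
        tally += games * pfs[host]
        road_total += games
    return tally / (road_total + w * home_games[team])


def settle(raw_pfs, road_schedule, home_games, w, tol=1e-6, max_iter=100):
    """Alternate between OPFs and adjusted PFs until the PFs stop moving."""
    adjusted = dict(raw_pfs)  # first pass: no OPF adjustment yet
    for passes in range(1, max_iter + 1):
        opfs = {t: opf(t, adjusted, road_schedule, home_games, w) for t in raw_pfs}
        # ASSUMPTION (not from the post): adjusted PF = raw PF * OPF,
        # then rescaled so the league mean is 1.00.
        scaled = {t: raw_pfs[t] * opfs[t] for t in raw_pfs}
        mean = sum(scaled.values()) / len(scaled)
        new = {t: v / mean for t, v in scaled.items()}
        delta = max(abs(new[t] - adjusted[t]) for t in raw_pfs)
        adjusted = new
        if delta < tol:
            break
    # opfs are the ones used for the final update (within tol of settled).
    return adjusted, opfs, passes


# Hypothetical 4-team league: A and B share a division, as do C and D.
raw = {"A": 1.10, "B": 0.95, "C": 1.00, "D": 0.95}
road = {
    "A": {"B": 5, "C": 2, "D": 2},
    "B": {"A": 5, "C": 2, "D": 2},
    "C": {"D": 5, "A": 2, "B": 2},
    "D": {"C": 5, "A": 2, "B": 2},
}
home = {t: 9 for t in raw}
adjusted, opfs, passes = settle(raw, road, home, w=1 / 3)
```

The loop stops once no PF moves by more than `tol` between passes, which is the "settles in" condition described above.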

Remember that the first two charts (NL and AL) represent the composite PF’s for the “league” that each team played in in 2007.  They do NOT contain each team’s own PF’s.

The second two charts (NL and AL) represent the changes from year to year (when parks are added or changed) to the whole league, with the last line being the total accumulated changes since 1993.  Again, they do not contain any team PF’s.


#2    Mike Flatt      (see all posts) 2008/01/31 (Thu) @ 18:57

Awesome, I’ve been waiting for something like this for a while.  I’m going to read it over the next few days when I have some time…


#3    fifth of      (see all posts) 2008/04/30 (Wed) @ 14:00

I was poking around the 2008 Texas Rangers page on BB-Ref, and things didn’t seem to be adding up for me. I looked at the top of the page and figured out why:

Park Factors
(multi-year): Batting - 100, Pitching - 100
(one-year): Batting - 97, Pitching - 98
Over 100 favors batters, under 100 favors pitchers.

What in the world!? My initial reaction was maybe Sean’s formulae are a bit too far from the state of the art. But then I looked for this thread and found that MGL’s park factors for The Ballpark have it as basically neutral. David Gassko’s recent spreadsheet doesn’t look like it agrees, but it would take me a while to add up the components to get a better idea.

Has this been discussed somewhere at length? The Ballpark in Arlington, or whichever of its subsequent names it’s on now, was, in my mind, a pretty extreme hitter’s park.


#4    MGL      (see all posts) 2008/04/30 (Wed) @ 22:35

No, no, the park is still a very good hitter’s park!  HR’s to left and doubles to right are the biggest culprits.

I basically take my component park factors and then do a component overall run factor (like a component ERA).  The run factor for Arlington is 1.04, which is the third best in the AL behind the White Sox and Fenway.

Park factors are so sensitive to the data, as well as to weather factors, not to mention team personnel, that 3-year factors are almost useless.  Let’s not even talk about one-year factors.

I use every year a park has been in existence, up to 15 years.

I adjust for the schedule and for the “other parks” (e.g., when COL came into the league in 93, all other parks became more of a pitcher park), so that is not a problem.

Plus I regress each of the components appropriately (the best I can), and I regress them towards a different mean.  For example, I regress each park’s HR factors to right and left to a number which is commensurate (more or less) with the size of the field, height of the fence, altitude, and wind and weather patterns (I use the average fly ball distance factors as a proxy for weather and altitude).

For foul ball terr., which is important, I use the actual size (in square feet) of the foul territory to adjust (regress) the sample data.
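The exact regression amounts are not given here, so the following is only a generic shrinkage sketch in Python. The function name, the formula, and the constant `k` are assumptions for illustration; the point is simply that each component is pulled toward a park-specific mean instead of toward 1.00.

```python
def regress(observed, prior_mean, sample_size, k):
    """Shrink an observed component factor toward a park-specific prior mean.

    observed    -- factor measured from the sample data
    prior_mean  -- the mean to regress toward (e.g. an estimate from dimensions,
                   fence height, altitude, weather), NOT necessarily 1.00
    sample_size -- amount of data behind the observed factor
    k           -- hypothetical constant: the sample size at which the observed
                   value and the prior get equal weight
    """
    weight = sample_size / (sample_size + k)
    return prior_mean + weight * (observed - prior_mean)


# Made-up example: observed LF HR factor of 1.20, prior of 1.05 from the
# park's physical characteristics, equal weight on data and prior.
example = regress(1.20, 1.05, sample_size=1000, k=1000)
```

With equal weights the made-up 1.20 is pulled halfway back toward 1.05, to about 1.125; with no data at all the result is just the prior.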

Basically almost any park factor you see in any source is crap.

Except mine, of course!


#5    MGL      (see all posts) 2008/04/30 (Wed) @ 22:45

Oh, and of course, I adjust for changes in each park.  For example, in 06 SD shortened RC a little and PHI moved LF out and raised the fence a little; Coors starting in 06 is different from before then, and completely different from the pre-humidor days; and Dodger Stadium has been removing foul terr. since 05.  Etc.

I consider these to be the best “true” run park factors (as of 07) you will find for each park.  Remember that each park is compared to only those parks in their league.  So a park in the NL with a run factor of 1.0 is NOT necessarily the same as one in the AL with the same run factor, although it might be.

ARI 1.08
ATL 1.00
CHN 1.05
CIN 1.04
COL 1.10
FLO 1.00
HOU 1.06
LAN .99
MIL 1.02
NYN .96
PHI 1.07
PIT .98
SDN .92
SLN .98
SFN .96
WAS ??  (I have an estimate based on dimensions and the like. I just don’t have it in front of me right now.)

ALA 1.00
BAL .99
BOS 1.05
CHA 1.06
CLE .99
DET .97
KCA 1.00
MIN .99
NYA .97
OAK .98
SEA .97
TBA .99
TEX 1.04
TOR 1.02


#6    fifth of      (see all posts) 2008/04/30 (Wed) @ 22:46

OK, I was waaay stupid on this. I forgot that the Google Doc on this thread was showing the OTHER parks, not the actual park! Just skipped forward to the data because I for some reason thought it was basically the same as DSG’s that came out a month later. I know 1.04 ain’t neutral at all.

What are the chances Sean changes his PF’s in the near future with a little nudging?


#7    MGL      (see all posts) 2008/05/01 (Thu) @ 14:45

You’d have to ask him (Sean).  Change to what?  He is using the traditional formulas and 3-year factors, no?


#8    fifth of      (see all posts) 2008/05/01 (Thu) @ 16:08

Change to a more rigorous methodology/formula. I guess I should say he should update or enhance his formula, rather than simply change it. If people everywhere are looking at Rangers’ OPS+’s that are based on them playing in a neutral park, I think that’s an issue. (Putting aside whether OPS+ and ERA+ are themselves worthwhile.) I don’t have anything particular in mind. I think he’s just using the same formula he’s had in place for what, five plus years? The body of literature on park factors since he implemented his is pretty substantial. His 3-yr formula has Texas as a neutral park in 2007, not just 2008.

http://www.baseball-reference.com/about/parkadjust.shtml

I’m not sure I want to suggest to Sean that he completely overhaul his park factors. He has, I’m sure, hundreds or thousands of other things on his plate. But some regression and use of component data instead of or in addition to runs wouldn’t hurt. To his credit, he does have an innings pitched correction, which I had forgotten about (I thought he just used R/G instead of R/Out, when the latter is what we want).


#9    MGL      (see all posts) 2008/05/01 (Thu) @ 18:49

I don’t know that he is using anything worse than anyone else is using, and there is nothing particularly wrong with what he is using.  It is just that using 3-year park factors (at most) created by runs scored alone, and even then, splitting that into batters and pitchers only (I think, see my discussion below), is going to lead to lots of mistakes.  And not regressing whatever numbers you come up with is REALLY going to lead to a lot of mistakes.

Plus, I am not sure what he means by batter and pitcher park factors.  You DON’T want to compute those separately and it seems like that is what he is doing.  There is no such thing as a separate batter and pitcher park factor.

A batter and pitcher park factor, as originated by Palmer (I think), was/is a misnomer.  What he did was to combine a park factor and an opponent factor (the fact that a team’s batters don’t hit against their own pitchers and vice versa).

The “batter and pitcher” park factors Palmer computes have NOTHING to do with the park factors per se, at least the “batter/pitcher” part.  The batter park factor is a combination of a park factor and adjusting for a team’s batters’ opponents and vice versa for the pitcher park factor.

It seems like Sean is doing something else with his “batter and pitcher” park factors, but I am not sure.  As I said, if he is just somehow splitting up the batter and the pitcher data (I am not even sure how you would do that, and certainly there is NO reason to do that), that is completely wrong.


#10    Tangotiger      (see all posts) 2008/05/02 (Fri) @ 06:47

Right, they (Sean, Pete) are doing “park + opponent”.  Park factor is a bad name, but, chalk that up to a whole list of bad names out there.

