THE BOOK--Playing The Percentages In Baseball


Thursday, July 27, 2006

Colorado Park Factor

By Tangotiger, 10:24 AM

Here is Dan Fox looking at park factors, for the minor league Colorado team.

The park configuration also plays a role.  I once did an (unpublished) study on park configuration and found a relationship between distance and runs.  I can’t remember exactly what it was: something like every 10 feet added increased run production by 1%, or by .01 runs per game.  I can’t remember which, so don’t quote me.

The key was to figure out the amount of surface area.  Using the 5 distance markers (LF, alley, CF, alley, RF), I squared the values, averaged them, and took the square root.  That gave me a smoothed out radius for the field.  Pretty basic, right?  Since the area of the field is pi * r-squared / 4 (if it was a slice of a perfect circle), I was on my way.
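As a rough illustration of that calculation, here is a minimal Python sketch. The function names and the five fence distances are made up for the example (they are not any real park's dimensions), and it assumes the field really is a 90-degree slice of a circle.

```python
import math

def rms_radius(distances_ft):
    """Square the distances, average them, and take the square root."""
    return math.sqrt(sum(d * d for d in distances_ft) / len(distances_ft))

def quarter_circle_area(radius_ft):
    """Area of a 90-degree slice of a circle: pi * r-squared / 4."""
    return math.pi * radius_ft ** 2 / 4

# Hypothetical markers in feet: LF, alley, CF, alley, RF.
example_park = [330, 375, 400, 375, 330]

radius = rms_radius(example_park)
area = quarter_circle_area(radius)
print(f"smoothed radius: {radius:.1f} ft, fair territory: {area:,.0f} sq ft")
```

Because the distances are squared before averaging, the deep markers count for a bit more than they would in a plain average, which is what you want when the end goal is an area.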

Of course, not all fields are even close to being that nicely shaped.  Since we do have access to graphical representation of all the parks, we don’t even need to figure out a mathematical representation of them: just count the pixels on the screen.  Until then…
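The pixel-counting idea could look something like the sketch below. It assumes the park diagram has already been reduced to a text mask where '#' marks fair territory, and that the scale (feet per pixel) is known; the mask, the scale, and the function name are all invented for illustration.

```python
def fair_territory_area(mask_rows, feet_per_pixel):
    """Count the '#' cells in the mask and convert the count to square feet."""
    pixels = sum(row.count("#") for row in mask_rows)
    return pixels * feet_per_pixel ** 2

# Hypothetical 5x5 mask: '#' is fair territory, '.' is everything else.
toy_mask = [
    "..#..",
    ".###.",
    "#####",
    ".###.",
    "..#..",
]

# Assumed scale: each pixel covers 50 ft x 50 ft.
area = fair_territory_area(toy_mask, feet_per_pixel=50)
print(f"fair territory: {area:,} sq ft")
```

A real diagram would need far finer resolution than this toy grid, but the arithmetic stays the same: pixel count times the square of the scale.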

#1    David Gassko 2006/07/27 (Thu) @ 16:43

I did some similar work with this here:

http://stats.mostvaluablenetwork.com/general/expected-park-factors/

I have since been working on something that does a much better job of predicting park factors. The problem is that there are variables we simply don’t understand that have a pretty large effect on how a park plays. Also, some data is tough to find. Your equation correlates pretty well (r = .69) with actual fair territory, though it consistently underpredicts (I had to add 4,868 square feet so that they would match up) the actual amount. I may use it to fill in my missing data points (nine of thirty). Still, I have my doubts that we can ever predict how a park will actually impact play until major league teams actually play games in it.


#2    MGL 2006/07/28 (Fri) @ 00:32

I have fooled around with some regression equations as well for predicting overall park factors (run scoring factors).

The relevant variables, in rough order of importance (I think), are size of fair territory, altitude, average temperature, size of foul territory, and average fence height.  Other minor variables are average prevailing wind, lighting (visibility), and “smoothness” of outfield walls.

I do think that we can come up with some pretty good park factors before having any actual run scoring data.


#3    MGL 2006/07/28 (Fri) @ 00:43

David, since you did not use altitude as one of your prediction variables, how in the world did you come up with a predicted HR factor of 1.08 for Coors Field, which has the LARGEST dimensions in ML baseball?  In fact, many of your numbers do not seem right, assuming that you used outfield dimensions as your only independent variables.


#4    dan 2006/07/28 (Fri) @ 02:50

One of the things I’ve been a bit disappointed by over the course of this season is the lack of (what I perceived to be building) progress in turning lefty-righty split data into split park factors.  The apex seems to have been an article at Prospectus that was fairly eye-opening to me.  It examined Minute Maid’s HR factor, which was something like 40 points higher for righties than lefties (around 120/80).  Then you look at Clemens and Pettitte and the fact that Pettitte faced a very high percentage of righties, far more than Clemens.  I forget if it was put out there explicitly, but certainly the implication was that Pettitte’s season may have been as impressive as Clemens’, a statement you could only make by breaking into these split factors.

We’ve got loads of split data nowadays.  More importantly, we’ve got a fair number of asymmetrical parks.  It doesn’t just seem natural to move to split park factors; it almost seems silly not to.


#5    David Gassko 2006/07/28 (Fri) @ 16:45

Mickey,

My coefficients were probably way off. I can’t find any other way to explain it. I’ll see if I can find the actual numbers I used for the study. Maybe I used some other variables; I don’t remember doing so…


#6    Joe Arthur 2006/07/29 (Sat) @ 05:13

As a point of historical interest, Earnshaw Cook in Percentage Baseball (1964, 2nd ed 1966) used a similar approach for devising an equation for park home run factors (computing surface area of field, treating as equivalent to pi x r-squared / 4, then solving for r to get an average effective distance to the fence).


#7    Mike 2006/08/05 (Sat) @ 22:52

I believe what Dan is talking about is multiplicative park factors…

