THE BOOK--Playing The Percentages In Baseball

Monday, May 12, 2008

Quick Park Factors

By Tangotiger, 01:33 PM

This idea stemmed from the original Historical Abstract.  Here’s what I wrote to David at Fangraphs, and I’ll provide further commentary:


I believe that you have adjusted the WE based on the run environment for the year, correct?  Is that league/year or just year?  And do you have plans to go to park at some point? 

A quick stand-in for park is “team”, in that you simply look at the number of runs scored in that park, regardless of how many runs they scored on the road.  The implication here is that the offense/defense split will be exactly 50/50, regardless of how good or bad a team’s hitting or pitching is.  In one sense, it’s “ok”, since the universe is that park, and as far as the competitors are concerned, those hitters playing are the only ones in their universe.  And of course, in another sense, it’s not ok for the obvious reasons.  However, it is super quick to figure out the number of runs scored in a team’s park per 27 outs, each year.  This version of a park factor might make it better than no park factor at all.  I hate seeing the older Coors hitters getting too much of a boost.

So, this is how it works.  Imagine you have a team, let’s call them Nos Amours, that is equal in its offense and defense.  The league scores 4.5 runs per 27 outs.  If at Nos Amours’ home park (let’s call it Big Owe) there are 5.0 runs per 27 outs scored, then we know, for sure, that this is a hitter’s park.  How did I figure that out, without knowing how many runs are scored on the road?  Well, I said that the team had a defense that is equal to the offense.  If there are 5.0 RPG at Big Owe, that would mean that the team scores 5 and allows 5, or scores 6 and allows 4.  What would they do on the road?  They’ll score 4.5 and allow 4.5, naturally (or a bit less, since Big Owe is not on the road, and the overall league average is 4.5).
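As a rough sketch of that arithmetic in Python: the function names and the Big Owe season totals (810 runs over 4374 outs, both teams combined) are made up for illustration and chosen so the park comes out at 5.0 runs per 27 outs.  Only the 4.5 league rate and the 5.0 park rate come from the example above.

```python
def runs_per_27_outs(runs, outs):
    """Runs scored per 27 outs, i.e. one team's share of a nine-inning game."""
    return 27.0 * runs / outs


def quick_park_factor(park_runs, park_outs, league_rp27):
    """Park run rate (both teams combined) divided by the league run rate."""
    return runs_per_27_outs(park_runs, park_outs) / league_rp27


# Hypothetical totals: 81 home games x 2 teams x 27 outs = 4374 outs.
big_owe_rate = runs_per_27_outs(810, 4374)
big_owe_factor = quick_park_factor(810, 4374, league_rp27=4.5)

print(f"Big Owe: {big_owe_rate:.2f} runs per 27 outs, factor {big_owe_factor:.3f}")
```

A factor above 1.0 reads as a hitter’s park.  Note that nothing about road games goes into it.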

But, what if you had a team that had a dismal offense and fantastic defense?  And at Big Owe, there are 5.0 RPG?  This would therefore be a hugely hitter-friendly park.  If they score 3 and allow 3 in a road park where the average team scores 4.5 RPG, and at Big Owe they score and allow 5 runs, that’s a huge park effect.

HOWEVER, by using the “team” park-based approach, regardless of the team makeup, Big Owe is a 5.0 RPG park in a league of 4.5.  So, a team that is really a 4.5 offense, 4.5 defense on the road is a 5.0 / 5.0 at Big Owe.  And a team that is 3.0 offense and 3.0 defense on the road is evaluated at Big Owe as if 5.0 RPG is the norm.  And therefore, the offense and defense will each get equal credit, even if we know better.
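To make the contrast concrete, here is a small sketch comparing the quick “team” factor with a plain home/road ratio.  The home/road ratio is a deliberately simplified stand-in for a conventional park factor and is my assumption, not something spelled out above; the function names are hypothetical, and the run rates are the ones from the two Big Owe examples.

```python
def quick_factor(park_rp27, league_rp27):
    """Quick method: park run rate relative to the league, team makeup ignored."""
    return park_rp27 / league_rp27


def home_road_factor(home_rp27, road_rp27):
    """Simplified conventional method: the same team's run rate, home vs. road."""
    return home_rp27 / road_rp27


LEAGUE = 4.5
# (label, runs per team per 27 outs at Big Owe, same rate on the road)
teams = [
    ("balanced, average", 5.0, 4.5),
    ("weak offense, strong defense", 5.0, 3.0),
]

results = {
    label: (quick_factor(home, LEAGUE), home_road_factor(home, road))
    for label, home, road in teams
}

for label, (quick, conventional) in results.items():
    print(f"{label}: quick={quick:.3f}, home/road={conventional:.3f}")
```

The quick factor is the same for both teams, while the home/road ratio is much larger for the 3.0 / 3.0 team.  That gap is exactly the information the quick method throws away.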

I hope that made sense.

Maybe one of you guys can give the one-line summary of this.

Anyway, this approach at least gives you a halfway decent park factor, and as mgl is fond of saying, some park adjustment is better than no park adjustment.

#1    KJOK      (see all posts) 2008/05/21 (Wed) @ 19:14

One-line summary?

Use the run environment of the PARK vs. other parks in the league to calculate the park effect, ignoring that the PARK run environment could be heavily influenced by the home team’s offense/defense.


#2    Tangotiger      (see all posts) 2008/10/03 (Fri) @ 15:28

http://www.fangraphs.com/blogs/index.php/win-probability-update/

Read the comments.

I really don’t disagree with anything there.


#3    Tangotiger      (see all posts) 2008/10/03 (Fri) @ 15:54

The BlueJays scored 359 runs at home and 355 on the road.  They allowed 289 runs at home and 321 runs on the road.

In all:
home games: 648 runs
away games: 676 runs

I can’t tell how many innings they played in all.  Let’s say it was 1450 innings at home and 1400 on the road.  So, per 9 innings, it’s 4.0 at home and 4.35 on the road.
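A quick check of that arithmetic, as a sketch.  The run totals are the ones quoted above; the inning counts are the guesses from this comment (1450 and 1400), not actual totals, and the function name is made up.

```python
def runs_per_9(runs, innings):
    """Runs per nine innings pitched or batted."""
    return 9.0 * runs / innings


home_runs = 359 + 289  # Blue Jays runs scored + allowed at home
road_runs = 355 + 321  # runs scored + allowed on the road

HOME_INNINGS = 1450  # guessed, both teams' innings combined
ROAD_INNINGS = 1400  # guessed, both teams' innings combined

home_rate = runs_per_9(home_runs, HOME_INNINGS)
road_rate = runs_per_9(road_runs, ROAD_INNINGS)

print(f"home: {home_runs} runs, {home_rate:.2f} per 9 innings")
print(f"road: {road_runs} runs, {road_rate:.2f} per 9 innings")
```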

So, what happens is that all games played at the Skydome are treated with a run environment of 4.0.

Where’s the problem?  Well, the long-term park factors imply that the Skydome is really run-neutral.  And since there are, I dunno, 4.6 runs scored per game this year in the league, the Skydome should be evaluated as a 4.6 runs per game park.

However, by only looking at how many runs are scored in the park (4.0), we are treating that as the universe.  This means that it looks like it’s a pitcher’s park.  And Halladay will get nicked here a good deal (0.6 runs per 9 IP, or .06 wins per 9 IP; if he has 120 IP at home, that’s a total of .8 wins by which he’s being undervalued).  On the other hand, all the Jays hitters will go up in value, since they all look like they are playing in a pitcher’s park.  (And for argument’s sake, let’s say they are truly playing in a run-neutral park.)
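The Halladay number can be sketched like this.  The 10 runs per win conversion is an assumption inferred from 0.6 runs being treated as .06 wins; the 4.6 and 4.0 run environments and the 120 home IP are the rough figures from this comment, and the function name is hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumed conversion, implied by 0.6 runs ~ .06 wins


def undervaluation_wins(true_env, assumed_env, innings, runs_per_win=RUNS_PER_WIN):
    """Wins of credit a pitcher loses when his park is rated too run-suppressing."""
    runs_gap_per_9 = true_env - assumed_env
    return runs_gap_per_9 / runs_per_win * (innings / 9.0)


halladay_wins = undervaluation_wins(true_env=4.6, assumed_env=4.0, innings=120)
print(f"undervalued by about {halladay_wins:.1f} wins")
```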

As I noted in my original example, this is exactly what was expected.  Treating the park as its own universe means that all the players are expecting to see 4 runs per game for each team.  And, we are FORCING the offense and defense to be equally responsible for that.

It is this last assumption that may be dangerous.  We know it’s not true in the sense we are most used to.  But it may be true in the sense of treating the park as its own universe, where everything adds up, and the offense and defense are equal.


#4    The Edge      (see all posts) 2008/10/03 (Fri) @ 17:03

So few parks deviate enough from the mean to make a serious dent in a player’s line that it’s hardly even worth making park adjustments for maybe two thirds of the teams. It seems to me that it would be easy enough to take each league’s R/G and apply a simple park factor.


#5    Tangotiger      (see all posts) 2008/10/03 (Fri) @ 17:06

The one concern is always Coors, and since Fangraphs goes back to 1972, and eventually back to 1954, there will be issues with the Astrodome among many other parks.

There’s a “best” way to do it, and a “quick” way to do it, and the bridge between the two is simply programming time.  So, it’s a matter of where along the quick-to-best scale David wants to go, and the closer he gets to “best”, the more time it takes.


#6    The Edge      (see all posts) 2008/10/03 (Fri) @ 20:19

Oh yeah, I forgot about the other 34 years. :-\


#7    The Edge      (see all posts) 2008/10/04 (Sat) @ 13:50

34? 36. I combined the two years to get 1974.


#8    Tangotiger      (see all posts) 2009/03/04 (Wed) @ 16:45

Here’s an interesting scenario that you may find fascinating.

Say you have two teams that are heavily overbalanced toward the pitching side.  Each has a +200 run differential.  Both play in neutral parks and both have league-average offense, but both have standout pitching.  So, in the MLB universe of 30 teams, it’s the pitchers that absorb the lion’s share of the value.

For illustrative purposes, the league scores 4.5 RPG, these two teams score 4.5 RPG, and these two teams allow 3.25 RPG.

These two teams then enter the playoffs, and play each other.  They end up scoring 3.25 RPG each.  And naturally, allowing 3.25 RPG.

In this universe of the two-team playoff, we have no idea at all about the balance of power.  Are their parks both pitcher-friendly or hitter-friendly?  Do we split the credit toward the defense or offense?

Indeed, it is only because we know how these two teams operate in the 30-team universe that we wish to give extra credit to the defense in allowing 3.25 RPG when they face each other.

But suppose they play each other 50 times, or 100 times.  Why do we necessarily care how they do in some other universe?

So, in one respect, the universe is only that in which the game is played.  And to that end, we should always give the credit 50/50.
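Here is a small sketch of how the credit shifts with the choice of universe.  Measuring offense and defense as runs per game against a baseline is my own simplification for illustration, and the function name is made up; the 4.5 and 3.25 figures are the ones from the scenario.

```python
def runs_vs_baseline(scored, allowed, baseline):
    """Return (offense, defense) runs per game better than the baseline."""
    return scored - baseline, baseline - allowed


# 30-team universe: league average is 4.5; each team scores 4.5, allows 3.25.
offense_30, defense_30 = runs_vs_baseline(scored=4.5, allowed=3.25, baseline=4.5)

# Two-team universe: the only "league average" is what is scored in the series.
offense_2, defense_2 = runs_vs_baseline(scored=3.25, allowed=3.25, baseline=3.25)

print(f"30-team universe: offense {offense_30:+.2f}, defense {defense_30:+.2f}")
print(f"two-team universe: offense {offense_2:+.2f}, defense {defense_2:+.2f}")
```

In the 30-team universe the whole surplus lands on the defense.  In the two-team universe there is no surplus to assign, so neither side gets more credit than the other.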

Clearly, once we take them out of their own universe, and put them in a 30-team universe, the balance won’t remain 50/50 in terms of talent.  But, for the game in question, when all we have is two teams, does it really matter what happens to the other 28 teams 3 months earlier or in 2 weeks?

This was the Bill James argument from 25 years ago.  I like the argument, which is why I like the idea of the quick park factors.


#9    Patriot      (see all posts) 2009/03/04 (Wed) @ 17:48

I agree that it is an interesting argument.  As Bill said, he started using it because he didn’t want to mess with park factors for all history, but then it grew on him.  Obviously he doesn’t use the approach anymore, but I think he did use it in the contemporary Abstracts for at least a year.

I think of it a lot in regard to Win Shares, since Win Shares will zero out an entire offense if it is below 50% of the league average.  From the perspective of what the metric is trying to do, this is fine, as it’s a necessary constraint to enable the pitchers’ values to remain equal regardless of how bad their offense is, and is no different conceptually than giving out negative absolute runs to awful hitters.

But from another perspective, it is bothersome, which is that it assumes that the pitcher’s value should be constant.  From a certain perspective, though, it doesn’t really matter how your offense and defense rate relative to the league.  The crude way I’ve always thought about Tango’s “two-team universe” scenario is: If you score one run and allow two, you’re going to lose, even though your defense has been good and your offense has been bad.  One can say, “Well, the offense was -4 runs relative to the league and the defense was +3”, but one can just as easily imagine taking that run allowed off the board and changing the whole game.

If a team takes a 1-0 lead into the bottom of the ninth and Joe Borowski comes into the game and blows it, he’s the goat, even though the offense has been worse relative to the league than the pitching has (of course, as an individual, Borowski’s performance is the most harmful, whether you measure that context-neutrally or, certainly, by WPA).  The pitching unit can still be said to have outperformed the batting unit, yet the first reaction will be to lament the two runs allowed, not the four additional runs that would have been scored with average output.

