THE BOOK--Playing The Percentages In Baseball

Friday, February 23, 2007

Run Expectancy by Run Environment

By Tangotiger, 12:05 PM

Ever wanted to have the run expectancy chart for a 3.5 RPG environment, or 2.4 or 6.7?  Here you go:


I published it on Google Docs.

Here’s how to read the first line
In a 2.00 runs per game environment, with the bases empty and 0 outs, this state will occur 25.9% of the time.  The run expectancy (RE) is 0.222 (this is the only “duh” part, as it’s 2.00/9).  You will be held scoreless from this point onward to the end of the inning 84.9% of the time.  You will score exactly 1 run 10.4% of the time.
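The arithmetic behind that first line can be checked in a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, and the three figures are the ones quoted above.

```python
def start_of_inning_re(runs_per_game, innings=9):
    """Run expectancy with the bases empty and 0 outs: runs per game spread over the innings."""
    return runs_per_game / innings


re_start = start_of_inning_re(2.00)

# Figures quoted from the first line of the chart.
p_scoreless = 0.849
p_exactly_one = 0.104

# Whatever is left over is the chance of scoring 2 or more runs.
p_two_or_more = 1 - p_scoreless - p_exactly_one

print(f"RE at bases empty, 0 outs: {re_start:.3f}")
print(f"Chance of 2+ runs: {p_two_or_more:.1%}")
```

So the two quoted percentages leave about 4.7% of innings in which 2 or more runs score.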

I suggest you “ctrl-A” to select all the data, and copy it to Excel.  Delete the header and footer.  In Excel, you can do “Data / Filter / AutoFilter”, and it’ll set up drop-down filters to make this thing real easy to use.

Method
Now, how did I do all this?  I started with all games from 1999-2002 that had at least 8 full innings.  I selected only the first 8 innings.  I classified each game as A, B, C, D, or E by its game wOBA.  All the games with a low wOBA were placed in the A group, and all the games with a high wOBA were placed in the E group.

Then, it was simply a matter of weighting each group as I needed it.  For example, the A group was weighted at 98%, the B group at under 2%, and the rest accordingly.  What this allowed me to do was use actual games with actual results, but weight certain games more than others, so that I end up with a real 2.00 runs per game.
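To make the weighting idea concrete, here is a minimal Python sketch. It is not the actual procedure used for the charts. It assumes each group has a known average runs per game (the numbers in GROUP_RPG are invented for illustration), and it simplifies by mixing only the two groups that bracket the target, giving every other group a weight of zero.

```python
# Hypothetical average runs per game for each wOBA group (illustrative numbers only).
GROUP_RPG = {"A": 1.97, "B": 3.60, "C": 4.90, "D": 6.30, "E": 8.80}


def bracket_weights(group_rpg, target):
    """Weight the two groups that bracket `target` so the blended RPG equals it.

    Assumes every group has a distinct average. All other groups get weight 0.
    """
    names = sorted(group_rpg, key=group_rpg.get)
    if not group_rpg[names[0]] <= target <= group_rpg[names[-1]]:
        raise ValueError("target is outside the range covered by the groups")
    weights = {name: 0.0 for name in names}
    for low, high in zip(names, names[1:]):
        low_rpg, high_rpg = group_rpg[low], group_rpg[high]
        if low_rpg <= target <= high_rpg:
            # Linear interpolation: share of games taken from the higher-scoring group.
            share_high = (target - low_rpg) / (high_rpg - low_rpg)
            weights[low] = 1 - share_high
            weights[high] = share_high
            break
    return weights


weights = bracket_weights(GROUP_RPG, 2.00)
blended_rpg = sum(weights[g] * GROUP_RPG[g] for g in GROUP_RPG)
print({g: round(w, 3) for g, w in weights.items()}, round(blended_rpg, 2))
```

With these made-up averages, a 2.00 target lands almost entirely on the A group with a sliver of B. The same weights would then be applied to every base-out state observed in each group's games when tallying the chart.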

You will of course notice that I removed all 9th and extra innings, which means a lot of smallball-9thInning-style data is not represented in the RE charts.  This may actually be a good thing, since one-run strategies are best evaluated in terms of “what if I don’t play for 1-run?”.  Therefore, these charts are not polluted with such events.

In any case, I’ve already provided on my site and in The Book the RE matrix that included those events.  The reader is free to use whichever is appropriate.

Next?

I’m hoping that Fangraphs.com or Baseball-Reference.com or Retrosheet.org uses this file, or applies my methodology to create their own files.  And from that, you can generate value-added performance results.  Anyone wanting to do so should make sure to let their readers have free access to the results.  In return, I grant you a perpetual, non-exclusive licence.

I’m also going to be using this file to generate WE and LI charts by run environment.  I have already arranged to provide these charts to Fangraphs.com.  I’d be happy to extend that offer to whoever comes calling, with the same provision as the previous paragraph.

#1    John Beamer      (see all posts) 2007/02/24 (Sat) @ 05:18

Tango

This is pretty cool stuff. Quick question though: What is stopping you from generating the RE matrix from a Markov? I would have thought that would have been a lot easier?


#2    tangotiger      (see all posts) 2007/02/24 (Sat) @ 09:59

I could have generated the five base RE matrices from Markov, and then extrapolated from them as I have done.

However, with Markov, I force the state-to-state transitions as a constant, and force the frequency of each positive event to be the same, relative to the other positive events, and simply modify the frequency of the batting outs.  While this is a quick and cool way to generate the Markov (as I did in The Book, for the 3.2 RPG table), I’m not sure that I have such solid ground to do that.
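For readers who haven't seen the Markov approach, here is a toy Python sketch of the idea just described: hold the relative mix of positive events fixed, and vary only the out rate until the chain produces the run environment you want. It is not the model from The Book. The event mix and the baserunning rules (runners advance exactly as many bases as the batter, outs never move runners, walks only force) are simplifying assumptions made up for illustration.

```python
from itertools import product

# Hypothetical relative mix of positive events (sums to 1); held constant throughout.
POSITIVE_MIX = {"walk": 0.26, "single": 0.48, "double": 0.14, "triple": 0.02, "homer": 0.10}
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}
BASE_STATES = list(product((0, 1), repeat=3))  # (1B, 2B, 3B) occupancy flags
EMPTY = (0, 0, 0)


def advance(bases, event):
    """Return (new_bases, runs_scored) under the toy advancement rules."""
    first, second, third = bases
    if event == "walk":  # only forced runners move
        if first and second and third:
            return (1, 1, 1), 1
        if first and second:
            return (1, 1, 1), 0
        if first:
            return (1, 1, third), 0
        return (1, second, third), 0
    gained = BASES_GAINED[event]
    runs, new = 0, [0, 0, 0]
    # Position 0 is the batter; positions 1-3 are the runners on 1B, 2B, 3B.
    for position, occupied in enumerate((1, first, second, third)):
        if occupied:
            destination = position + gained
            if destination >= 4:
                runs += 1
            else:
                new[destination - 1] = 1
    return tuple(new), runs


def run_expectancy(p_out, sweeps=200):
    """RE for all 24 base-out states, found by sweeping the chain until it settles."""
    re = {(b, o): 0.0 for b in BASE_STATES for o in range(3)}
    for _ in range(sweeps):
        for outs in (2, 1, 0):
            for bases in BASE_STATES:
                # An out leaves the runners in place; the third out ends the inning.
                total = p_out * (re[(bases, outs + 1)] if outs < 2 else 0.0)
                for event, share in POSITIVE_MIX.items():
                    new_bases, runs = advance(bases, event)
                    total += (1 - p_out) * share * (runs + re[(new_bases, outs)])
                re[(bases, outs)] = total
    return re


def solve_out_rate(target_rpg, innings=9, low=0.40, high=0.99):
    """Bisect on the out rate until the start-of-inning RE matches target_rpg / innings."""
    target = target_rpg / innings
    for _ in range(40):
        mid = (low + high) / 2
        if run_expectancy(mid)[(EMPTY, 0)] > target:
            low = mid  # too many runs: raise the out rate
        else:
            high = mid
    return (low + high) / 2


out_rate = solve_out_rate(3.2)
re_table = run_expectancy(out_rate)
print(f"out rate {out_rate:.3f}, RE at bases empty, 0 outs: {re_table[(EMPTY, 0)]:.3f}")
```

Note what the sketch forces: the state-to-state transitions for each event are the same in every run environment, which is exactly the assumption questioned above.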

You can compare the 3.2 from the Google Docs to what I have in The Book, or the 5.0 as well.

The process I’ve outlined here however can now be used to go through all the Retrosheet years, and create a larger sample for the various base RE tables.

However, even that is not good.  If you think about why a runner goes 1B to 3B on a single, that’s based on EXPECTATION of the run environment, and not on the after-the-fact knowledge of the game wOBA.  Therefore, more work needs to be done to figure out how the state-to-state transitions are affected by the EXPECTED run environment.


#3    Tangotiger      (see all posts) 2007/02/28 (Wed) @ 13:48

I changed my process slightly (most won’t notice the difference), so here’s the latest Google Docs.

I’ve also sent Fangraphs a custom WE chart by run environment.  The lack of one was the biggest issue with WPA on Fangraphs (it wasn’t summing to zero at the hitter and pitcher level).  This will no longer be an issue.

More minor issues will be a custom LI chart by run environment, and park factors. 

An even more minor issue is the HFA.  As I wrote on Fangraphs:

The “HFA” is made up of 4 things:
1 - getting to play at home
2 - getting 3 more outs than your opponent when you are at bat
3 - getting opponent to play you differently because they have 3 more outs to get through
4 - use of relievers

The charts I provided Fangraphs only consider #2. As I’ve repeatedly said, I don’t care too much about the other 3, since it’ll come out in the wash over a season, and even for a game the impact is limited. And I have much bigger fish to fry.

