THE BOOK--Playing The Percentages In Baseball


Friday, February 23, 2007

Run Expectancy by Run Environment

By Tangotiger, 12:05 PM

Ever wanted to have the run expectancy chart for a 3.5 RPG environment, or 2.4 or 6.7?  Here you go:


I published it on Google Docs.

Here’s how to read the first line
In a 2.00 runs per game environment, with the bases empty and 0 outs, this state will occur 25.9% of the time.  The run expectancy (RE) is 0.222 (this is the only “duh” part, as it’s 2.00/9).  You will be held scoreless from this point onward to the end of the inning 84.9% of the time.  You will score exactly 1 run 10.4% of the time.
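The arithmetic behind that first line can be checked in a few lines of Python. This is only a sketch: the inputs are the rounded figures quoted above, and the function and variable names are mine, not anything in the Google Docs file. It also backs out the two numbers the example does not quote: how often you score 2 or more runs, and roughly how many runs you average when you do.

```python
def first_line_check(rpg, p_zero, p_one, innings=9):
    """Return (RE at bases empty/0 outs, P(2+ runs), mean runs when 2+ score)."""
    # Every inning starts with the bases empty and 0 outs, so RE = RPG / 9.
    re_start = rpg / innings
    # Whatever is not "0 runs" or "exactly 1 run" must be "2 or more runs".
    p_two_plus = 1.0 - p_zero - p_one
    # RE = 0 * p_zero + 1 * p_one + mean_two_plus * p_two_plus, solved for the mean.
    mean_two_plus = (re_start - 1 * p_one) / p_two_plus
    return re_start, p_two_plus, mean_two_plus


# Figures quoted for the 2.00 RPG, bases empty, 0 outs line (already rounded).
re_start, p_two_plus, mean_two_plus = first_line_check(2.00, 0.849, 0.104)
print(f"RE = {re_start:.3f}, P(2+ runs) = {p_two_plus:.3f}, "
      f"mean runs when 2+ score = {mean_two_plus:.2f}")
```

Because the percentages are rounded to one decimal, the last figure is only approximate.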

I suggest you “ctrl-A” to select all the data, and copy it to Excel.  Delete the header and footer.  In Excel, you can do “Data / Filter / AutoFilter”, and it’ll set up drop-down filters to make this thing real easy to use.

Method
Now, how did I do all this?  I started with all games from 1999-2002 that had at least 8 full innings.  I selected only the first 8 innings.  I classified each game as A, B, C, D, or E by its game wOBA.  All the games with a low wOBA were placed in the A group, and all the games with a high wOBA were placed in the E group.

Then, it was simply a matter of weighting each group as I needed it.  For example, for the 2.00 environment, the A group was weighted at 98%, the B group at under 2%, and the rest accordingly.  What this allowed me to do was use actual games with actual results, while weighting certain games more than others, so that I end up with a real 2.00 runs per game.
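To make the weighting idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the five group scoring rates are made up, the weights are treated as shares of the blended sample of innings, and the weights are found by mixing only the two groups that bracket the target (the actual weights described above also put something on the remaining groups).

```python
# Hypothetical runs per 9 innings for groups A..E (not the real 1999-2002 figures).
GROUP_RPG = [1.95, 3.40, 4.60, 6.00, 8.50]


def two_group_weights(group_rpg, target_rpg):
    """Weights that mix the two adjacent groups bracketing target_rpg."""
    weights = [0.0] * len(group_rpg)
    for i in range(len(group_rpg) - 1):
        low, high = group_rpg[i], group_rpg[i + 1]
        if low <= target_rpg <= high:
            weights[i] = (high - target_rpg) / (high - low)
            weights[i + 1] = 1.0 - weights[i]
            return weights
    raise ValueError("target RPG is outside the range the groups can reach")


def weighted_re(weights, state_per_inning, re_by_group):
    """Blend one base/out state's RE across groups.

    Each group's RE counts in proportion to its weight times how often
    the state comes up in that group.
    """
    runs = sum(w * f * r for w, f, r in zip(weights, state_per_inning, re_by_group))
    freq = sum(w * f for w, f in zip(weights, state_per_inning))
    return runs / freq


weights = two_group_weights(GROUP_RPG, 2.00)
blended_rpg = sum(w * r for w, r in zip(weights, GROUP_RPG))

# Bases empty, 0 outs: assume it comes up once per inning, with RE = RPG / 9.
start_freq = [1.0] * len(GROUP_RPG)
start_re = [r / 9 for r in GROUP_RPG]
blended_start_re = weighted_re(weights, start_freq, start_re)

print([round(w, 3) for w in weights], round(blended_rpg, 2), round(blended_start_re, 3))
```

The point of the sketch is only that every number in the blended chart still comes from real games; the weights just decide how loudly each group speaks.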

You will of course notice that I removed all 9th and extra innings, which means a lot of smallball-9thInning-style data is not represented in the RE charts.  This may actually be a good thing, since one-run strategies are best evaluated in terms of “what if I don’t play for 1-run?”.  Therefore, these charts are not polluted with such events.

In any case, I’ve already provided on my site and in The Book the RE matrix that included those events.  The reader is free to use whichever is appropriate.

Next?

I’m hoping that Fangraphs.com or Baseball-Reference.com or Retrosheet.org uses this file, or applies my methodology to create their own files.  And from that, you can generate value-added performance results.  Anyone wanting to do so should make sure to let their readers have free access to the results.  In return, I grant you a perpetual, non-exclusive licence.

I’m also going to be using this file to generate WE and LI charts by run environment.  I have already arranged to provide these charts to Fangraphs.com.  I’d be happy to extend that offer to whoever comes calling, with the same provision as the previous paragraph.

#1    John Beamer      (see all posts) 2007/02/24 (Sat) @ 05:18

Tango

This is pretty cool stuff. Quick question though: What is stopping you from generating the RE matrix from a Markov? I would have thought that would have been a lot easier?


#2    tangotiger      (see all posts) 2007/02/24 (Sat) @ 09:59

I could have generated the five base RE matrix from Markov, and then extrapolated that as I have done. 

However, with Markov, I hold the state-to-state transitions constant, force the frequency of each positive event to stay the same relative to the other positive events, and simply modify the frequency of the batting outs.  While this is a quick and cool way to generate the Markov (as I did in The Book, for the 3.2 RPG table), I’m not sure that I have such solid ground to do that.
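For readers who have not seen the Markov shortcut, here is a toy Python sketch of the idea in the paragraph above: keep the mix of positive events fixed, keep the runner-advancement rules fixed, and change only the out rate to move the run environment. It is not the model from The Book. The event mix is made up, every runner is assumed to advance exactly as many bases as the batter, and outs never move runners (no double plays, no sacrifice flies).

```python
# Relative mix of positive events (hypothetical numbers, held fixed throughout).
POSITIVE_MIX = {"walk": 9.0, "single": 15.0, "double": 4.5, "triple": 0.5, "hr": 3.0}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "hr": 4}

# Bases are a 3-bit mask: bit 0 = runner on 1B, bit 1 = 2B, bit 2 = 3B.


def walk(bases):
    """Batter takes first; only forced runners move. Returns (bases, runs)."""
    for bit in (1, 2, 4):
        if not bases & bit:
            return bases | bit, 0
    return bases, 1  # bases were loaded, so a run is forced in


def hit(bases, n):
    """Simplifying assumption: every runner advances n bases, like the batter."""
    shifted = (bases << n) | (1 << (n - 1))
    return shifted & 0b111, bin(shifted >> 3).count("1")


def run_expectancy(p_out, sweeps=500):
    """RE[outs][bases] for a given probability of an out per plate appearance."""
    total = sum(POSITIVE_MIX.values())
    probs = {ev: (1.0 - p_out) * w / total for ev, w in POSITIVE_MIX.items()}
    re = [[0.0] * 8 for _ in range(4)]  # row 3 = inning over, always 0
    for _ in range(sweeps):  # repeated sweeps until the values settle
        for outs in (2, 1, 0):
            for bases in range(8):
                value = p_out * re[outs + 1][bases]  # an out: runners hold
                for ev, p in probs.items():
                    nxt, runs = walk(bases) if ev == "walk" else hit(bases, HIT_BASES[ev])
                    value += p * (runs + re[outs][nxt])
                re[outs][bases] = value
    return re


# Only the out rate changes between these three run environments.
for p_out in (0.72, 0.68, 0.64):
    table = run_expectancy(p_out)
    print(f"p_out={p_out:.2f}  RE(empty, 0 outs)={table[0][0]:.3f}  "
          f"runs per 9 innings={9 * table[0][0]:.2f}")
```

The fixed transitions and the fixed event mix are exactly the assumptions being questioned in this comment, which is why the charts in the post were built from weighted real games instead.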

You can compare the 3.2 from the Google Docs to what I have in The Book, or the 5.0 as well.

The process I’ve outlined here however can now be used to go through all the Retrosheet years, and create a larger sample for the various base RE tables.

However, even that is not good.  If you think about why a runner goes 1B to 3B on a single, that’s based on EXPECTATION of the run environment, and not on the after-the-fact knowledge of the game wOBA.  Therefore, more work needs to be done to figure out how the state-to-state transitions are affected by the EXPECTED run environment.


#3    Tangotiger      (see all posts) 2007/02/28 (Wed) @ 13:48

I changed my process slightly (most won’t notice the difference), so here’s the latest Google Docs.

I’ve also sent Fangraphs a custom WE chart by run environment as well.  This was the biggest issue with WPA on Fangraphs (wasn’t summing to zero at the hitter and pitcher level).  This will no longer be an issue.

More minor issues will be a custom LI chart by run environment, and park factors. 

An even more minor issue is the HFA.  As I wrote on Fangraphs:

The “HFA” is made up of 4 things:
1 - getting to play at home
2 - getting 3 more outs than your opponent when you are at bat
3 - getting opponent to play you differently because they have 3 more outs to get through
4 - use of relievers

The charts I provided Fangraphs only consider #2. As I’ve repeatedly said, I don’t care too much about the other 3, since it’ll come out in the wash over a season, and even for a game the impact is limited. And I have much bigger fish to fry.

