Friday, February 23, 2007
Run Expectancy by Run Environment
Ever wanted to have the run expectancy chart for a 3.5 RPG environment, or 2.4 or 6.7? Here you go:
I published it on Google Docs.
Here’s how to read the first line:
In a 2.00 runs per game environment, with the bases empty and 0 outs, this state will occur 25.9% of the time. The run expectancy (RE) is 0.222 (this is the only “duh” part, as it’s 2.00/9). You will be held scoreless from this point onward to the end of the inning 84.9% of the time. You will score exactly 1 run 10.4% of the time.
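As a quick sanity check on that first line, here is a small Python sketch using only the figures quoted above. The variable names are mine, and the "implied average" at the end is just arithmetic on those rounded figures, not a number taken from the chart.

```python
# Figures quoted in the post for a 2.00 runs-per-game environment,
# bases empty, 0 outs (the start of an inning).
runs_per_game = 2.00
innings_per_game = 9

p_zero_runs = 0.849  # held scoreless for the rest of the inning
p_one_run = 0.104    # score exactly 1 run

# Run expectancy at the start of an inning: runs per game spread over 9 innings.
re_start = runs_per_game / innings_per_game

# Whatever probability is left over is the chance of scoring 2 or more runs.
p_two_plus = 1.0 - p_zero_runs - p_one_run

# One-run innings contribute p_one_run runs to RE; the remainder must come
# from innings with 2+ runs, which gives their implied average size.
avg_runs_in_big_innings = (re_start - p_one_run) / p_two_plus

print(f"RE at start of inning: {re_start:.3f}")
print(f"P(2+ runs): {p_two_plus:.3f}")
print(f"Implied average runs in 2+ run innings: {avg_runs_in_big_innings:.2f}")
```

The same check works on any bases-empty, 0-out row of the chart: the RE should be the run environment divided by 9, and the scoring probabilities should leave a plausible remainder for the bigger innings.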
I suggest you “ctrl-A” to select all the data, and copy it to Excel. Delete the header and footer. In Excel, you can do “Data / Filter / AutoFilter”, and it’ll set up drop-down filters to make this thing real easy to use.
Method
Now, how did I do all this? I started with all games from 1999-2002 that had at least 8 full innings. I selected only the first 8 innings. I classified each game as A, B, C, D, or E. All the games with a low wOBA were placed in the A group, and all the games with a high wOBA were placed in the E group.
Then, it was simply a matter of weighting each group as I needed it. For example, for the 2.00 environment, the A group was weighted at 98%, the B group at under 2%, and the rest accordingly. What this allowed me to do was use actual games with actual results, but weight certain games more than others, so that I end up with a real 2.00 runs per game.
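To illustrate the weighting idea, here is a minimal Python sketch. It is not the procedure actually used for the published file. It assumes the target is hit by blending only the two groups whose scoring rates bracket it, and the per-group runs-per-game figures are made up for the example. The function name blend_weights is likewise hypothetical.

```python
def blend_weights(group_rpg, target_rpg):
    """Return per-group weights (summing to 1) whose weighted mean RPG hits the target.

    Assumption: only the two groups bracketing the target get nonzero weight.
    """
    # Order the groups from lowest-scoring to highest-scoring.
    names = sorted(group_rpg, key=group_rpg.get)
    weights = {name: 0.0 for name in names}
    for low, high in zip(names, names[1:]):
        lo_rpg, hi_rpg = group_rpg[low], group_rpg[high]
        if lo_rpg <= target_rpg <= hi_rpg:
            # Linear interpolation between the two bracketing groups.
            w_high = (target_rpg - lo_rpg) / (hi_rpg - lo_rpg)
            weights[low] = 1.0 - w_high
            weights[high] = w_high
            return weights
    raise ValueError("target is outside the range covered by the groups")


# Hypothetical runs-per-game averages for the five groups (NOT real data).
hypothetical_rpg = {"A": 1.9, "B": 3.5, "C": 4.8, "D": 6.2, "E": 8.4}

weights = blend_weights(hypothetical_rpg, 2.00)
blended = sum(weights[g] * hypothetical_rpg[g] for g in hypothetical_rpg)

for group in sorted(weights):
    print(f"{group}: {weights[group]:.1%}")
print(f"Blended runs per game: {blended:.2f}")
```

With weights like these in hand, every count in the chart (how often a state occurs, how many runs follow it) would be accumulated using each game's group weight, so the results still come from real games.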
You will of course notice that I removed all 9th and extra innings, which means a lot of smallball-9thInning-style data is not represented in the RE charts. This may actually be a good thing, since one-run strategies are best evaluated in terms of “what if I don’t play for 1 run?”. Therefore, these charts are not polluted with such events.
In any case, I’ve already provided on my site and in The Book the RE matrix that included those events. The reader is free to use whichever is appropriate.
Next?
I’m hoping that Fangraphs.com or Baseball-Reference.com or Retrosheet.org uses this file, or applies my methodology to create their own files. And from that, you can generate value-added performance results. Anyone wanting to do so should make sure to let their readers have free access to the results. In return, I grant you a perpetual, non-exclusive licence.
I’m also going to be using this file to generate WE and LI charts by run environment. I have already arranged to provide these charts to Fangraphs.com. I’d be happy to extend that offer to whoever comes calling, with the same provision as the previous paragraph.
Tango
This is pretty cool stuff. Quick question though: What is stopping you from generating the RE matrix from a Markov? I would have thought that would have been a lot easier?