THE BOOK--Playing The Percentages In Baseball


Wednesday, September 14, 2011

Markov 2: Gaussian Elimination

By Tangotiger, 02:45 PM

I created the Markov calculator, which models how baseball works… sans the outs on base. It works perfectly. Bill updated it to include the run frequency matrix. Great. The source code for both can be accessed by doing View/Source in your browser.

Well, now comes along Bill White:

I’d like to introduce (and provide some basic, over-simplified Java source code for) a new runs estimator that CAN perfectly model every event in the game (although the provided source code DOESN’T model everything).

One advantage of this new computational strategy is that it can be made to perfectly model baserunning events. The closest existing analogue—Tango’s Markov model—is similar in concept but doesn’t model double plays or any other events in which a baserunner could make an out on the basepaths.

The strategy is simple: set up a system of equations with 24 equations and 24 unknowns, with each unknown being the expected runs scored from a particular game state (e.g. runner on 1st w/ 1 out). Each equation models the expected runs scored for that particular state as a function of the expected runs scored in all other states we could transition into. For example, the equation for no runners on, 2 out:

e2out0on = homeRunPct*(1+e2out0on) + (singlePct+walkPct)*e2out1st + doublePct*e2out2nd + triplePct*e2out3rd
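To make the "system of equations" part concrete, here is how that one example equation could be rearranged into a row of the linear system, with the unknowns on the left and the constant on the right. This is only a sketch using the same names as the equation above. There is no term for an out because, with 2 outs, an out ends the inning and adds zero runs.

```latex
\begin{aligned}
\text{e2out0on} &= \text{homeRunPct}\,(1 + \text{e2out0on})
  + (\text{singlePct} + \text{walkPct})\,\text{e2out1st}
  + \text{doublePct}\,\text{e2out2nd}
  + \text{triplePct}\,\text{e2out3rd} \\[4pt]
(1 - \text{homeRunPct})\,\text{e2out0on}
  - (\text{singlePct} + \text{walkPct})\,\text{e2out1st}
  - \text{doublePct}\,\text{e2out2nd}
  - \text{triplePct}\,\text{e2out3rd}
  &= \text{homeRunPct}
\end{aligned}
```

Doing the same for each of the 24 states would give a 24-by-24 coefficient matrix and a right-hand side holding the runs that score directly on each transition.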

Once the system of equations is set up, we simply solve using Gaussian elimination:
http://en.wikipedia.org/wiki/System_of_linear_equations
http://en.wikipedia.org/wiki/Gaussian_elimination
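Here is a minimal Java sketch of the set-up-and-solve step. It is not Bill's code, and it is not the 24-state model. To keep it short it assumes a toy game where every plate appearance is either a home run or an out, so the only states are 0, 1, and 2 outs with the bases empty. The class name GaussianRunsSketch and the methods solve and toyExpectedRuns are made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; not taken from the posted source code.
final class GaussianRunsSketch {

    /** Solves A x = b by Gaussian elimination with partial pivoting. Inputs are left unmodified. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        // Work on an augmented copy [A | b].
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            // Partial pivoting: move the row with the largest entry in this column up.
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            if (Math.abs(m[pivot][col]) < 1e-12) {
                throw new ArithmeticException("singular system");
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;
            // Eliminate this column from every row below the pivot row.
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        // Back substitution on the upper-triangular system.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int c = i + 1; c < n; c++) s -= m[i][c] * x[c];
            x[i] = s / m[i][i];
        }
        return x;
    }

    /**
     * Toy assumption: each plate appearance is a home run (homeRunPct) or an out.
     * Unknowns e[0..2] are the expected runs with 0, 1, 2 outs and nobody on.
     */
    static double[] toyExpectedRuns(double homeRunPct) {
        double outPct = 1.0 - homeRunPct;
        double[][] a = new double[3][3];
        double[] b = new double[3];
        for (int outs = 0; outs < 3; outs++) {
            // e[outs] = homeRunPct*(1 + e[outs]) + outPct*e[outs+1], with e[3] = 0,
            // rearranged to (1 - homeRunPct)*e[outs] - outPct*e[outs+1] = homeRunPct.
            a[outs][outs] = 1.0 - homeRunPct;
            if (outs + 1 < 3) a[outs][outs + 1] = -outPct;
            b[outs] = homeRunPct;
        }
        return solve(a, b);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toyExpectedRuns(0.25)));
    }
}
```

With homeRunPct = 0.25 the toy system works out by hand to about 1 run with 0 outs, 2/3 with 1 out, and 1/3 with 2 outs, which is a handy sanity check on the solver. A real version would build all 24 rows from the event and baserunning probabilities and pass them to the same kind of solve routine.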

The same computational strategy (solving a system of linear equations) can also be used to compute the odds of scoring exactly X runs in an inning and to find the expected percentage of plate appearances that occur during a particular game state.

I provided some source code to illustrate how this strategy works. However, there is a lot of room for improvement. Even though this computational strategy is powerful enough to perfectly model stolen bases, caught stealing, baserunners being thrown out at the plate, etc., the provided source code does not model any of those things. It does model double plays, and crudely models ordinary runner advancement (e.g. the odds of scoring from 2nd base on a single), but there’s a lot of work to be done. Also, the code hasn’t been thoroughly tested.

And he was nice enough to post his full source code.  It would be great if someone took that code and enhanced it to include the baserunner movements.

StateRunsDescription.txt

(19) Comments • 2011/09/16 • Sabermetrics • Statistical_Theory