Wednesday, September 14, 2011
Markov 2: Gaussian Elimination
I created the Markov calculator, which models how baseball works… sans the outs on base. Works perfectly. Bill updated that to include the run frequency matrix. Great. The source code for both is available through View/Source in your browser.
Well, now comes along Bill White:
I’d like to introduce (and provide some basic, over-simplified Java source code for) a new runs estimator that CAN perfectly model every event in the game (although the provided source code DOESN’T model everything).
One advantage of this new computational strategy is that it can be made to perfectly model baserunning events. The closest existing analogue—Tango’s Markov model—is similar in concept but doesn’t model double plays or any other events in which a baserunner could make an out on the basepaths.
The strategy is simple: set up a system of equations with 24 equations and 24 unknowns, with each unknown being the expected runs scored from a particular game state (e.g. runner on 1st w/ 1 out). Each equation models the expected runs scored for that particular state as a function of the expected runs scored in all other states we could transition into. For example, the equation for no runners on, 2 out:

e2out0on = homeRunPct*(1+e2out0on) + (singlePct+walkPct)*e2out1st + doublePct*e2out2nd + triplePct*e2out3rd
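To hand an equation like that to a linear solver, every unknown has to move to the left side: (1 - homeRunPct)*e2out0on - (singlePct+walkPct)*e2out1st - doublePct*e2out2nd - triplePct*e2out3rd = homeRunPct. Here is a minimal sketch of that one row in Java. The class name, the column ordering, and the sample rates are assumptions made up for illustration; none of them come from Bill's source code.

```java
import java.util.Arrays;

public class TwoOutEmptyRow {
    // Hypothetical column order for four of the 24 unknowns.
    static final int E_2OUT_0ON = 0;
    static final int E_2OUT_1ST = 1;
    static final int E_2OUT_2ND = 2;
    static final int E_2OUT_3RD = 3;
    static final int RHS = 4; // last slot holds the constant term

    /**
     * Builds the row for "no runners on, 2 out" with all unknowns moved
     * to the left side. Any out ends the inning, so outs add nothing.
     */
    public static double[] buildRow(double homeRunPct, double singlePct,
                                    double walkPct, double doublePct,
                                    double triplePct) {
        double[] row = new double[5];
        row[E_2OUT_0ON] = 1.0 - homeRunPct;         // e2out0on appears on both sides
        row[E_2OUT_1ST] = -(singlePct + walkPct);   // single or walk: runner on 1st
        row[E_2OUT_2ND] = -doublePct;               // double: runner on 2nd
        row[E_2OUT_3RD] = -triplePct;               // triple: runner on 3rd
        row[RHS] = homeRunPct;                      // the one run a homer scores
        return row;
    }

    public static void main(String[] args) {
        // Made-up per-plate-appearance rates, for illustration only.
        double[] row = buildRow(0.03, 0.15, 0.08, 0.05, 0.005);
        System.out.println(Arrays.toString(row));
    }
}
```

Build 24 rows like this, one per base/out state, and you have the full system ready to solve.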
Once the system of equations is set up, we simply solve using Gaussian elimination:
http://en.wikipedia.org/wiki/System_of_linear_equations
http://en.wikipedia.org/wiki/Gaussian_elimination
The same computational strategy (solving a system of linear equations) can also be used to compute the odds of scoring exactly X runs in an inning and to find the expected percentage of plate appearances that occur during a particular game state.
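For anyone who wants to see the solving step spelled out, here is a minimal sketch of Gaussian elimination with partial pivoting in Java. It is a generic textbook version, not Bill's code; the class name, method name, and singularity tolerance are assumptions. For the baseball problem the input would be the 24x24 coefficient matrix and its right-hand side.

```java
public class GaussianElimination {
    /** Solves a*x = b for x. The inputs are copied, not modified. */
    public static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        // Augmented matrix [a | b].
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            // Partial pivoting: pick the row with the largest entry in this column.
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(m[pivot][col]) < 1e-12) { // tolerance is an arbitrary choice
                throw new ArithmeticException("Singular or nearly singular system");
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;
            // Zero out this column in every row below the pivot row.
            for (int r = col + 1; r < n; r++) {
                double factor = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) {
                    m[r][c] -= factor * m[col][c];
                }
            }
        }
        // Back substitution on the upper-triangular system.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = m[i][n];
            for (int c = i + 1; c < n; c++) {
                sum -= m[i][c] * x[c];
            }
            x[i] = sum / m[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        // Toy system: 2x + y = 3, x + 3y = 5 (exact solution x = 0.8, y = 1.4).
        double[] x = solve(new double[][] {{2, 1}, {1, 3}}, new double[] {3, 5});
        System.out.println(x[0] + " " + x[1]);
    }
}
```

With only 24 unknowns, speed is a non-issue, so a plain dense solver like this is plenty.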
I provided some source code to illustrate how this strategy works. However, there is a lot of room for improvement. Even though this computational strategy is powerful enough to perfectly model stolen bases, caught stealing, baserunners being thrown out at the plate, etc., the provided source code does not model any of those things. It does model double plays, and crudely models ordinary runner advancement (e.g. the odds of scoring from 2nd base on a single), but there’s a lot of work to be done. Also, the code hasn’t been thoroughly tested.
And he was nice enough to post his full source code. It would be great if someone took that code and enhanced it to include the baserunner movements.

