THE BOOK--Playing The Percentages In Baseball


Monday, September 04, 2006

Quirks in the Win Expectancy Tables

By Tangotiger, 07:03 PM

A reader found a problem with the win expectancy (WE) tables here: http://www.sportsmogul.com/vbulletin2/showthread.php?t=119234 . I responded:

When I create the WE tables, I use a basic run frequency table, like this one:
http://www.tangotiger.net/RE9902score.html

I know how often to expect a certain number of runs to the end of the inning, from each of the 24 base/out states. If we focus on the 1B/3B, 0 out line and the 2B/3B, 0 out line, we see the chance of a scoreless inning is only 12% with the trailing runner on 1B, but 14% with him on 2B. However, the chance of a multi-run inning is much higher with the runner on 2B.

So, in this case, on average, small-ball is played much more often with runners on 1B and 3B than with runners on 2B and 3B. Again, on average.

However, when I run my numbers, I assume this applies to every inning. Clearly, this can’t be true in the 9th inning of a tie game, where teams play completely differently.

Therefore, the “human element” is missing from the basic WE charts that I have published. It would definitely be doable to incorporate it, so that you have something more logical.
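To see how the quirk arises, here is a minimal sketch in Python. It uses only the two scoreless-inning chances quoted above (12% and 14%) and applies them to the bottom of the 9th of a tie game, where any run ends the game. The function name and the 0.5 win chance in extra innings are assumptions for illustration, not the actual process behind the published tables.

```python
def walkoff_win_expectancy(p_scoreless, extra_innings_we=0.5):
    """Home team's WE in the bottom of the 9th of a tie game.

    p_scoreless: chance that no run scores in the rest of the inning.
    extra_innings_we: assumed chance of winning if the game goes to
        extra innings (0.5 is an assumption, not a measured value).
    """
    p_score = 1.0 - p_scoreless
    # Any run wins the game; otherwise the game continues to extras.
    return p_score * 1.0 + p_scoreless * extra_innings_we


# Scoreless-inning chances from the average run frequency table.
we_1b3b = walkoff_win_expectancy(0.12)  # runners on 1B and 3B, 0 out
we_2b3b = walkoff_win_expectancy(0.14)  # runners on 2B and 3B, 0 out

print(f"1B/3B, 0 out: {we_1b3b:.3f}")
print(f"2B/3B, 0 out: {we_2b3b:.3f}")
```

Under these assumptions the 1B/3B state comes out ahead of the 2B/3B state, even though a runner on 2B is never worse than the same runner on 1B. The average-inning frequencies carry average-inning strategy into a situation where teams play differently.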


#1    studes      (see all posts) 2006/09/05 (Tue) @ 03:54

That was a great catch.  FWIW, that doesn’t happen in the WPA spreadsheet.  I must be applying your math differently.


#2    Peter Jensen      (see all posts) 2006/09/05 (Tue) @ 08:33

You can generate the WE tables empirically.  Although that can result in some quirks as well with states with very small N, it still is useful to have for comparison purposes.
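As a rough sketch of what "empirically" means here: tally, for each game state, how often the batting team went on to win. The record layout (half-inning, score difference, bases, outs) and the toy observations below are hypothetical, made up only to show the bookkeeping. Returning N alongside each rate keeps the small-sample states visible.

```python
from collections import defaultdict


def empirical_we(observations):
    """Build an empirical WE table.

    observations: iterable of (state, batting_team_won) pairs, where
        state is any hashable description of the game situation.
    Returns {state: (win_rate, n)}.
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [wins, n]
    for state, won in observations:
        counts[state][0] += int(won)
        counts[state][1] += 1
    return {state: (wins / n, n) for state, (wins, n) in counts.items()}


# Hypothetical toy data: (half-inning, score diff, bases, outs), result.
state_a = ("B9", 0, "1B_3B", 0)
state_b = ("B9", 0, "2B_3B", 0)
sample = [
    (state_a, True),
    (state_a, True),
    (state_a, False),
    (state_b, True),
]

table = empirical_we(sample)
for state, (rate, n) in table.items():
    print(state, f"WE={rate:.3f}", f"N={n}")
```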


#3    tangotiger      (see all posts) 2006/09/05 (Tue) @ 08:55

It is useful, but you will get a lot of quirks, especially if you use a single year only.

Click on my name for empirical tables that span multiple years.

This link is also useful:
http://www.tangotiger.net/innwin2.html


#4    Peter Jensen      (see all posts) 2006/09/05 (Tue) @ 09:18

No, a single year is not very useful.  But if the run environment doesn’t change drastically, it is possible to get multiyear runs that generate sufficient N for almost all states.  I think it is important not to assume that all differences between empirically generated tables and simulation-generated tables are due to the sample size problems of the empirically generated tables.  Some (like the example above) have strategy-related logical reasons for occurring.


#5    studes      (see all posts) 2006/09/05 (Tue) @ 10:14

Personally, I think there are too many problems comparing empirical tables to tables generated by math.  The sample sizes in cells vary too much, even with five years’ worth of data.  You can certainly assume that not all the differences are due to sample size, but which results should you assume that for?

To me, the “sanity check” is the best approach, as in the linked thread, or perhaps checks between tables generated two different ways (such as my spreadsheet and Tango’s tables).


#6    tangotiger      (see all posts) 2006/09/05 (Tue) @ 10:20

Few people, least of all me, would assume that “all differences” applies to anything.

It’s rather clear what the differences are, with first and foremost being the sample size.  Other differences are the quality of players in each batting slot, the use of relievers in close/late situations, and the strategic use of small-ball weapons.

When I run my sims, I need to run 1 million trials to get the win percentage to within .001.  This can be easily shown: one standard deviation is sqrt(.5*.5/1,000,000) = .0005.  So, 95% of the time (two standard deviations), you’ll get it to within .001.  And with my process, I only need it to generate the 24 base/out states.

If you go to “multi-years” with a similar run environment, you are still at only 20,000 games played, but certainly not each of the thousands of states will have 20,000 observations.  You’re talking more like 1000 observations, using 5-year data, if you are lucky.  That gives us a 95% band of 2*sqrt(.5*.5/1000) = +/-.032 wins.

Clearly what needs to happen is to get a better understanding of when/how relievers and small-ball strategies are used, then to incorporate those as states and figure out the appropriate event-frequencies and state-to-state transitions.
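The sample-size arithmetic in this comment can be checked with a few lines of Python. This is only a sketch of the binomial standard-error formula used above, with the 95% band taken as two standard errors, as in the comment. The function names are mine.

```python
import math


def win_pct_standard_error(n, p=0.5):
    """Standard error of an observed win percentage over n trials."""
    return math.sqrt(p * (1 - p) / n)


def band_95(n, p=0.5):
    """Approximate 95% band (+/-), taken as two standard errors."""
    return 2 * win_pct_standard_error(n, p)


for label, n in [("simulation", 1_000_000), ("empirical state", 1_000)]:
    se = win_pct_standard_error(n)
    print(f"{label}: n={n:,}  SE={se:.4f}  95% band=+/-{band_95(n):.3f}")
```

With 1,000,000 trials the band is about +/-.001; with 1000 observations it is about +/-.032, which matches the figures quoted above.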


#7    Guy      (see all posts) 2006/09/05 (Tue) @ 10:57

I’ve long thought that a systematic comparison of empirical WE results with mathematically based WE results could be very interesting.  To avoid the small-n problem, you’d want to focus on inning/score categories, not specific states. For example, how often does the home team win when up by 1 run at the end of 8, vs. how often ‘should’ the home team win?  Up by 2 after 7?  The first thing you would need to correct for is the quality of the teams, as teams with a lead would tend to be better.  But after that, I would think it could shed some light on how effective modern bullpen usage is.  It might also be interesting to see if there have been changes over time.


#8    tangotiger      (see all posts) 2006/09/05 (Tue) @ 11:09

Guy,

You can answer the question yourself, as I provided a link to answer that very question.

http://www.tangotiger.net/innwin2.html

If you click that, you will get not only the empirical, but also the Markov/probability numbers as well.

I should also mention that batting order plays a role.  In the second inning, it’s the bottom of the order that is due up, so when you do the comparisons, you have to be aware of that.

