THE BOOK--Playing The Percentages In Baseball

Friday, February 16, 2007

Custom Win Values

By Tangotiger, 12:26 PM

Many of you are familiar with Custom Run Values.  In a recent post at BTF, someone was calculating Win Values.  It’s an easy step from one to the other, so here you have it at Google Docs.

Here’s an explanation:


The sheet named “Run Values” is simply a copy of what’s on my site.  The “Win Values” sheet is a translation of Runs to Wins, using PythagenPat. 

You’ll note that in the 3 to 7 RPG environment, the win value of the CS doesn’t change, and those of the out, K, and walk are fairly stable.
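For readers who want to try the Runs-to-Wins step themselves, here is a minimal sketch of one way to do it with PythagenPat. It is not the spreadsheet's actual formula. It assumes that RPG means runs per team per game, that the PythagenPat exponent is (RS + RA) raised to 0.287 (a commonly quoted value), and that the conversion uses the marginal runs per win for an average team. The function names are made up for illustration.

```python
def pythagenpat_win_pct(rs, ra, power=0.287):
    """Winning percentage from runs scored and allowed per game.

    The exponent floats with the run environment: x = (rs + ra) ** power.
    The 0.287 power is an assumed, commonly quoted value.
    """
    x = (rs + ra) ** power
    return rs ** x / (rs ** x + ra ** x)


def runs_per_win(rpg, delta=1e-4):
    """Marginal runs per win for an average team that scores and allows
    `rpg` runs per game, estimated with a central finite difference."""
    up = pythagenpat_win_pct(rpg + delta, rpg)
    down = pythagenpat_win_pct(rpg - delta, rpg)
    return (2 * delta) / (up - down)


def win_value(run_value, rpg):
    """Convert an event's run value to wins in the given run environment."""
    return run_value / runs_per_win(rpg)


for rpg in (3, 5, 7):
    # 0.30 is a placeholder run value, not a figure from the spreadsheet.
    print(rpg, round(runs_per_win(rpg), 2), round(win_value(0.30, rpg), 4))
```

The placeholder keeps the run value fixed across environments. In the sheet the run value of each event changes with the environment as well, so the two effects can partly cancel.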

The “wOBA weights” sheet gives the coefficients to use to properly scale all the positive events.  This is a good one to look at, as it shows how the events change relative to each other as the run environment changes.  The impact of the single goes down slightly as the run environment goes down, while the impact of the double goes up slightly.  The impact of the walk follows the single, except in very low run environments, where it really drops in value.  The impact of the triple increases somewhat as the run environment decreases, while that of the HR increases substantially.

There’s a good reason why these weights in the 5 RPG environment are a bit different from the ones in The Book, but it’s not really worth getting into unless someone really wants to see the innards.
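As a rough illustration of how wOBA-style weights relate to run values, the sketch below shifts each run value so that the out is worth zero and then applies a scale factor. The 1.15 scale is an assumption (in practice the scale is chosen so the league figure lines up with OBP), and the function name is hypothetical. The inputs are the 1916 and 1968 run values quoted in comment #7 of this thread, not the spreadsheet's own numbers.

```python
def woba_style_weights(run_values, out_value, scale=1.15):
    """Shift run values so the out equals zero, then rescale.

    `scale` is an assumed constant, not a value taken from the sheet.
    """
    return {event: (value - out_value) * scale
            for event, value in run_values.items()}


# Run values as quoted in comment #7 of this thread.
rv_1916 = {"BB": 0.351, "1B": 0.465, "2B": 0.768, "3B": 1.080, "HR": 1.483}
rv_1968 = {"BB": 0.337, "1B": 0.445, "2B": 0.739, "3B": 1.047, "HR": 1.470}

w_1916 = woba_style_weights(rv_1916, out_value=-0.228)
w_1968 = woba_style_weights(rv_1968, out_value=-0.220)

for event in rv_1916:
    # Also show each weight relative to the single, which does not depend on the scale.
    print(event,
          round(w_1916[event], 3), round(w_1916[event] / w_1916["1B"], 3),
          round(w_1968[event], 3), round(w_1968[event] / w_1968["1B"], 3))
```

Dividing by the single's weight removes the assumed scale, so the relative columns are the safer ones to compare across environments.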

#1          2007/02/16 (Fri) @ 13:00

“The impact of the walk follows the single, except in very low run environments where it really drops in value.”

One of the very difficult parts of Hall evaluation arguments is comparing eras.  I wonder if the impact of a walk is different in a low run environment caused by high strikeout rates (the late 60s) and consequently low hit rates, compared with a low run environment caused by very low home run rates (most years in the teens).  Eddie Collins and Joe Morgan come to mind immediately (although Morgan’s environment in Houston was nasty on both fronts).


#2    Tangotiger      2007/02/16 (Fri) @ 13:11

Easy enough to check:

http://www.tangotiger.net/markov.html

The run value of the walk using Markov is:
.403

Then, increasing AB to 45, so that RPG is now 3.2, we get:
.320

Keeping AB at 45 and raising K to 12, we get:
.313

Keeping AB at 45 and K at 12, increasing hits to 11 and walks to 5, and dropping HR to zero, we get:
.363

So, the number of K does hardly a thing to the run value of the walk.  But reducing the number of HR does increase the value of the walk a lot.

***

Note that all the run values of the walk are too high, because of the self-imposed limitation of the Markov model.
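For anyone who wants to see the mechanics, here is a minimal sketch of a 24-state Markov run-value calculation. It is not the code behind the calculator linked above. It assumes crude advancement rules (runners move exactly as far as the batter, and only forced runners move on a walk), treats every out the same, and uses two made-up per-PA profiles, so its numbers will not match the figures quoted in this comment.

```python
from itertools import product

EVENTS = ("BB", "1B", "2B", "3B", "HR", "OUT")


def advance(bases, event):
    """Simplified advancement (an assumption): runners move as far as the batter."""
    b1, b2, b3 = bases
    if event == "BB":  # only forced runners move
        if b1 and b2:
            return (1, 1, 1), b3
        if b1:
            return (1, 1, b3), 0
        return (1, b2, b3), 0
    if event == "1B":
        return (1, b1, b2), b3
    if event == "2B":
        return (0, 1, b1), b2 + b3
    if event == "3B":
        return (0, 0, 1), b1 + b2 + b3
    return (0, 0, 0), 1 + b1 + b2 + b3  # HR


def step(state, event):
    """Return (next_state, runs). A state with 3 outs ends the inning."""
    outs, bases = state
    if event == "OUT":
        return (outs + 1, bases), 0
    new_bases, runs = advance(bases, event)
    return (outs, new_bases), runs


def linear_weights(probs, sweeps=500):
    """probs: per-PA probability of each name in EVENTS (should sum to 1).
    Returns (runs per inning, {event: average change in run expectancy})."""
    states = [(o, b) for o in range(3) for b in product((0, 1), repeat=3)]
    trans = {(s, e): step(s, e) for s in states for e in EVENTS}
    start = (0, (0, 0, 0))
    rexp = dict.fromkeys(states, 0.0)  # run expectancy of each state
    freq = dict.fromkeys(states, 0.0)  # expected visits to each state per inning
    for _ in range(sweeps):
        rexp = {s: sum(probs[e] * (trans[s, e][1] + rexp.get(trans[s, e][0], 0.0))
                       for e in EVENTS) for s in states}
        nxt = dict.fromkeys(states, 0.0)
        nxt[start] = 1.0
        for s in states:
            for e in EVENTS:
                t = trans[s, e][0]
                if t in nxt:
                    nxt[t] += freq[s] * probs[e]
        freq = nxt
    total = sum(freq.values())
    weights = {e: sum(freq[s] * (trans[s, e][1] + rexp.get(trans[s, e][0], 0.0) - rexp[s])
                      for s in states) / total for e in EVENTS}
    return rexp[start], weights


# Hypothetical per-PA profiles, not taken from the calculator or from any season.
baseline = {"BB": .08, "1B": .16, "2B": .045, "3B": .005, "HR": .03, "OUT": .68}
no_hr = {"BB": .08, "1B": .19, "2B": .045, "3B": .005, "HR": .00, "OUT": .68}
for name, profile in (("baseline", baseline), ("no HR", no_hr)):
    runs, w = linear_weights(profile)
    print(name, "runs per 9 innings:", round(9 * runs, 2), "walk value:", round(w["BB"], 3))
```

Because strikeouts and other outs are identical in this sketch, swapping one for the other cannot move any run value at all, which is an even stronger version of the small K effect noted above.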


#3    Peter Jensen      2007/02/16 (Fri) @ 14:18

I am not sure that you can alter the run environment by just changing the number of at bats in your Markov model and have the run values for events be calculated correctly.  I would think you would have to model the exact transition rates for each state to state transition.


#4    tangotiger      2007/02/16 (Fri) @ 14:25

“I would think you would have to model the exact transition rates for each state to state transition.”

Agreed.  Which is why the transition rates are specified at the bottom of the page.

If you want to argue that the transition rates themselves will change in response to the new run environment, I agree.

In any case, the model itself is perfect (under its assumptions), and it’s up to the reader to use it as best as possible.


#5    Peter Jensen      2007/02/16 (Fri) @ 15:15

At the bottom of what page?  I am not talking about the base advance rates.  I am talking about the transition rates for each base-out to base-out transition in the Markov.  In other words, not just changing the ABs to lower the runs/game but also the BBs, Hits, 2Bs, 3Bs, HRs, and Ks.  It makes a big difference in what you end up with for the event values.


#6    tangotiger      2007/02/16 (Fri) @ 15:24

I still don’t think I follow you.

Are you talking about say reducing the number of HR (to zero) and increasing the number of singles, to end up with the same number of runs per game?

Perhaps you can give a concrete example so I can follow.

The source code is also available in the program (View/Source), if you want to take a gander.


#7    Peter Jensen      2007/02/16 (Fri) @ 16:25

Mike’s question in post #1 concerned the impact on run values of the walk in low run environments that were caused by either low home run totals or high SO totals.  In your answer in post #2 you started by lowering the run environment by increasing the number of ABs.  This method of altering the out rate seems to be the way you created your differing run environments for your “Custom Run Values” chart.  I am saying that this is wrong, both for answering Mike’s question and for investigating any real world run environment.

Example: to correctly answer Mike’s question you need to create a custom Markov that recreates the actual state to state transition rates for the low run environments that existed in the teens and in the late 60s.  For example, 1916 had a runs/game of 3.45.  1968 had a runs/game of 3.43.  But if you plug in the actual ABs, Hits, 2Bs, 3Bs, HRs, BBs, and Ks for each of those years, the resulting run values for BBs, 1Bs, 2Bs, 3Bs, HRs, and outs are quite different.

Event    1916     1968

BB       .351     .337
1B       .465     .445
2B       .768     .739
3B      1.080    1.047
HR      1.483    1.470
OUT     -.228    -.220

Your chart of “Custom Run Values” is virtually useless because the run values will vary substantially depending on the factors that create a particular run environment.


#8    tangotiger      2007/02/16 (Fri) @ 16:51

I agree that the Custom Run Values was based on keeping the “profile” of the events the same.  That is, the percentage of positive events is a constant.  You could very well create a virtually unlimited number of profiles that will lead to 3.5 runs per game.  As you state, the 1968 profile is different from the 1916 profile, even though they both lead to the same number of runs.

And, once you alter the frequency of each event, that impacts the individual run values of all events.

Furthermore, the way a team or an era approaches baserunning would also have a substantial effect.  Make everyone a speedster in 1916, and make the OF deeper, and now it’s a lot easier to go 1B to 3B on a single, making the run value of a single that much more (both in terms of getting on, and in moving runners over).  And Coors would be a lot different than the Astrodome, too, even if you have the exact same players in both parks.

I’m offering two things:
1 - the Markov calculator
2 - illustrations as to what happens when you put in different inputs

However, nothing is stopping anyone from simply plugging numbers into the Markov calculator to figure all that out (with the large proviso that it doesn’t allow SB or runners out on base).

I think you and I both completely understand the model and how sensitive it is to the inputs.  Hopefully, others are now more aware of this.


#9    tangotiger      2007/02/16 (Fri) @ 17:29

Plugging in the 1916 NL data into the Markov calculator gets us to 3.35 runs, as opposed to the actual 3.44 runs.

They were great percentage basestealers back then (77%, if the recording of that data is to be trusted).  That would account for about .06 runs per game.  The rest might be better baserunning.  Just adding +.02 to each of the rates at the bottom (on hits only) is enough to account for the rest of the difference.  Or, more batters reached on errors than normal.  That is probably the safer bet.

The 1968 NL, at 3.42 actual runs, gives us a Markov figure of 3.48.  They had horrible SB/CS numbers, enough to bring it down to the 3.42 level.

***

What is interesting in the run values of these two leagues is that the positive and negative events have more impact in the 1916 season than 1968.  That is, even though they both scored the same number of runs, it wasn’t that one positive amount was offset by another.  Instead, the positive numbers did more good, and the negative numbers did more harm, in 1916.


#10    Los Angeles Waterloo of Black Hawk      2007/02/16 (Fri) @ 19:09

“What is interesting in the run values of these two leagues is that the positive and negative events have more impact in the 1916 season than 1968.  That is, even though they both scored the same number of runs, it wasn’t that one positive amount was offset by another.  Instead, the positive numbers did more good, and the negative numbers did more harm, in 1916.”

I think this reflects a combination of better baserunning and more fielding errors.  A base hit in 1916 probably had more advancement value than in 1968 for these reasons.  Of course, this makes an out even more costly.


#11    tangotiger      2007/02/16 (Fri) @ 21:24

It has to do with more runners being on base, per PA.  The more runners there are on base, the more an out costs.  And in 1916, with fewer HR, that means more runners on base.

You can try out some various combinations in the Markov program, and you’ll see this.


#12    Joe Arthur      2007/02/16 (Fri) @ 21:53

Tango -

re 1916 basestealing: I suspect you’ve been misled by Retrosheet, which gives team totals that year without clearly indicating that the CS totals are incomplete. Look at individual team roster pages…

CS for AL 1914 and 1915 is “complete” and the SB% those years was 55% and 58%.


#13    tangotiger      2007/02/17 (Sat) @ 00:51

Thanks for reminding me.  Normally Ruane puts an “i” when something is incomplete.

I also noticed that the pitching and batting runs scored/allowed don’t match.  Since we have the complete gamelogs, that would be a curious error.

Those totals might be sums over the individual player records, which would explain the mismatch.


#14    tangotiger      2007/02/17 (Sat) @ 00:55

Everyone should try this, to appreciate how run scoring works.  Try the Markov calculator, and copy/paste the LWTS values at the bottom.

Then, change the batting lines so you have no extra-base hits, and instead have 12 (or so) hits, so that you have 5 RPG.  Check out the run values at the bottom.

Now, go back and put in 5 hits, all HR (still giving you around 5 RPG), and check out the run values.

As you can see, the “leverage” of each event is far greater the more runners you have on base.

It’s possible that Ty Cobb is being undervalued if we stick to the traditional measures.
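The singles-only versus HR-only experiment described above can be mimicked with a much smaller chain. This is a toy sketch, not the calculator: it assumes only two events (a hit or an out), that a single moves every runner up exactly one base, and it bisects on the hit probability until nine innings produce 5 runs. With station-to-station singles the required hit rate is higher than the 12 or so hits mentioned above. All names are hypothetical.

```python
def out_and_hit_values(p_hit, hit_is_hr, sweeps=300):
    """Two-event chain: a hit with probability p_hit, otherwise an out.

    State is (outs, runners on base). Singles advance every runner one base
    (a simplifying assumption), so a run scores only with the bases full.
    Returns (runs per inning, run value of the hit, run value of the out).
    """
    states = [(o, n) for o in range(3) for n in range(4)]

    def hit(n):
        if hit_is_hr:
            return 0, n + 1               # bases cleared, everyone scores
        return min(n + 1, 3), int(n == 3)

    rexp = dict.fromkeys(states, 0.0)  # run expectancy of each state
    freq = dict.fromkeys(states, 0.0)  # expected visits per inning
    for _ in range(sweeps):
        rexp = {(o, n): p_hit * (hit(n)[1] + rexp[o, hit(n)[0]])
                        + (1 - p_hit) * rexp.get((o + 1, n), 0.0)
                for o, n in states}
        nxt = dict.fromkeys(states, 0.0)
        nxt[0, 0] = 1.0
        for o, n in states:
            nxt[o, hit(n)[0]] += freq[o, n] * p_hit
            if o < 2:
                nxt[o + 1, n] += freq[o, n] * (1 - p_hit)
        freq = nxt
    total = sum(freq.values())
    hit_value = sum(f * (hit(n)[1] + rexp[o, hit(n)[0]] - rexp[o, n])
                    for (o, n), f in freq.items()) / total
    out_value = sum(f * (rexp.get((o + 1, n), 0.0) - rexp[o, n])
                    for (o, n), f in freq.items()) / total
    return rexp[0, 0], hit_value, out_value


def p_for_runs_per_game(target, hit_is_hr):
    """Bisect on p_hit until nine innings yield `target` runs."""
    lo, hi = 0.0, 0.9
    for _ in range(40):
        mid = (lo + hi) / 2
        if 9 * out_and_hit_values(mid, hit_is_hr)[0] < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_1b = p_for_runs_per_game(5.0, hit_is_hr=False)
p_hr = p_for_runs_per_game(5.0, hit_is_hr=True)
_, single_value, out_singles = out_and_hit_values(p_1b, False)
_, hr_value, out_hr = out_and_hit_values(p_hr, True)
print(f"singles only: p={p_1b:.3f} single={single_value:+.3f} out={out_singles:+.3f}")
print(f"HR only:      p={p_hr:.3f} HR={hr_value:+.3f} out={out_hr:+.3f}")
```

Both profiles score 5 runs per nine innings, yet the out is clearly more costly in the singles-only profile, where it strands runners, than in the HR-only profile, where the bases are always empty.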


#15    tangotiger      2008/08/18 (Mon) @ 15:15

Bumping thread.

