THE BOOK--Playing The Percentages In Baseball

Thursday, December 09, 2010

Bases, Outs by Event

By Tangotiger, 12:05 AM

[Chart: bases and outs gained per event, for 1993-2010, 1969-1992, and 1950-1968]

This is how many bases are gained on each event.  This is as pure as it gets.  No adjustments, no fancy-schmancy divisions or multiplications.  A straight addition of all bases gained per event.

For a HR for example, you add up all the bases the runners took from 1B and 2B and 3B, and you get 1.4 bases gained, and the hitter himself gains 4.0 bases, for a total of 5.4 bases.  For the inning-ending plays (outs), I subtract the bases, hence -0.5 bases for strikeouts.

If you add up all the bases, divide by 4, you get exactly the total number of runs scored.  (That should make David Smyth happy.)

Most of the event categories are obvious.  The less obvious ones: FCSA is “fielder’s choice, safe”.  ROE is “reached on error”.  SAFE is “safe somehow”.  XI is interference.  NIBB is unintentional walks.  DI is “defensive indifference”.  OA is “other advance”.  The line separates the good events from the bad ones.  The top part is batter events, and the bottom part is runner events.

ROUGHLY speaking, if you take bases and divide by 4, you get close to Linear Weights.  For outs, subtract an extra .18 runs per out for the current time period, and .16 runs per out for the previous time periods.
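A minimal sketch of that rule of thumb, using only the two figures quoted in the post (5.4 bases for a HR, -0.5 bases for a strikeout). The function name `approx_run_value` is hypothetical, and treating a strikeout as exactly one out is a simplifying assumption.

```python
BASES_PER_RUN = 4
EXTRA_OUT_PENALTY = 0.18  # runs per out, the "current time period" figure from the post


def approx_run_value(bases_gained, outs_made=0):
    """Rough run value: bases / 4, minus the extra penalty for each out made."""
    return bases_gained / BASES_PER_RUN - EXTRA_OUT_PENALTY * outs_made


hr_value = approx_run_value(5.4)              # home run: no out made
k_value = approx_run_value(-0.5, outs_made=1) # strikeout: assumed to be exactly one out
print(round(hr_value, 3), round(k_value, 3))
```

This is only the rough conversion described above, not a derivation of actual Linear Weights.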


#1    NaOH      (see all posts) 2010/12/09 (Thu) @ 00:48

Kind of weird having the event listed on the right. Seems like it should be on the left since we read left to right and each row of data is predicated on knowing the event basis.

To me, the data ordering should translate to a sentence. So, the first row of data would correspond to saying, “A home run from 1993-2010 was worth 5.41 bases, the same was true from 1969-1992, and it was just a touch higher at 5.43 from 1950-1968.”

That doesn’t happen easily when the event is on the right. On top of that, for the sake of chronology, it seems like all of the columns are the reverse of where they should be, but that’s a separate question of whether forward or reverse chronology is desirable.


#2    mettle      (see all posts) 2010/12/09 (Thu) @ 01:37

Wait, so does SB=1.11 bases mean that 11% of SBs lead to errors and an extra base? That seems high, so I’m thinking I’m misunderstanding.

Also, why are PK and CS worth less than 1.00 out?


#3    J-Doug      (see all posts) 2010/12/09 (Thu) @ 01:46

does SB=1.11 bases mean that 11% of SBs lead to errors and an extra base?

I’m guessing double steals have something to do with this.


#4    tangotiger      (see all posts) 2010/12/09 (Thu) @ 08:45

It is mostly errors for SB.

As for pickoffs and CS being less than 1 out: well, that’s the way MLB likes things, not me.  If you can believe it, you don’t have to be out to be picked off!  The pitcher can “pick you off”, but then the runner can make it to 2B safely.  Like being struck out, but reaching 1B safely.  “Strikesafe”?

A lot of gobbledygook like that, stuff like “he should have been safe, if the fielder would have just made the play”.  Earned runs, etc, all that stuff.

Don’t get me started.


#5    dave smyth      (see all posts) 2010/12/09 (Thu) @ 09:12

---"If you add up all the bases, divide by 4, you get exactly the total number of runs scored.  (That should make David Smyth happy.)"
------

As a pig in $hit. :)

Nice chart. Can you do it for individuals?


#6    Joshua Maciel      (see all posts) 2010/12/09 (Thu) @ 10:31

And can you do it by base-out state?


#7    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 11:00

Yes, I can do it for anything (park, base/out, batter, pitcher, count, etc).  Here’s where it’ll start to look less useful:

In order for the bases/4 = runs to hold, I have to make sure that I “clear the bases” for the end of inning.  If you leave a runner on 3B to end the inning, those +3 bases have to have a matching -3 bases in order to balance out to 0 bases gained for the inning (since in baseball, the rule is that you lose the bases you gained after the third out).

The way I’ve done it here is to give the negative bases to the final out of the inning.  In the grand scheme of things, it doesn’t matter how I did it, because, well, I only break down outs in limited ways (batting out, K, CS basically).  So, if you have an average of 1.5 bases left for each inning, I assign 0 to the first two outs, and 1.5 to the third out, for an average of 0.5 bases lost per out.
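To see why clearing the bases on the final out makes bases/4 equal runs, here is a toy inning ledger. It is a sketch under loud assumptions: the function `score_inning` and its event format are hypothetical, every runner advances the same number of bases on a play, and only the batter can make an out.

```python
def score_inning(events):
    """events: list of (batter_base, runner_advance) tuples.

    batter_base: 0 if the batter is out, otherwise the base he reaches (4 = home).
    runner_advance: bases that every runner already on base advances on the play.
    Returns (net bases credited to each event, runs scored).
    """
    runners = []   # base currently held by each runner (1, 2 or 3)
    outs = 0
    runs = 0
    ledger = []    # net bases credited to each event
    for batter_base, runner_advance in events:
        gained = 0
        still_on = []
        for base in runners:
            new_base = min(base + runner_advance, 4)
            gained += new_base - base
            if new_base == 4:
                runs += 1
            else:
                still_on.append(new_base)
        runners = still_on
        if batter_base == 0:
            outs += 1
        else:
            gained += batter_base
            if batter_base == 4:
                runs += 1
            else:
                runners.append(batter_base)
        if outs == 3:
            # Charge the stranded bases to the final out, as described above.
            gained -= sum(runners)
            runners = []
        ledger.append(gained)
        if outs == 3:
            break
    return ledger, runs


# Single, double (runner to 3B), out, home run, single, out, out.
example = [(1, 0), (2, 2), (0, 0), (4, 4), (1, 0), (0, 0), (0, 0)]
ledger, runs = score_inning(example)
print(ledger, sum(ledger) / 4, runs)
```

Every run is four bases gained by someone, and every stranded base is cancelled on the third out, so the ledger total divided by 4 has to equal the runs scored in this toy model.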

What I possibly could do is take the 1.5 bases left and distribute that evenly to the three outs.  This is the “WPA” issue.  After all, we know we have three outs, and it’s not necessarily that each out is more or less valuable than another (though you can make the case that it could be if you have a runner on 3B, less than 2 outs, and you strike out).

Once you start to get into it more, and try to work it out, you will eventually get to the point that you are going to look at it from a runs perspective, and not a bases/outs perspective.  And that leads you to Linear Weights.

So, I *can* do the breakdown in any way you guys like.  I might be left unsatisfied with how things end up because of the outs issue I noted above.  And so, I’m not sure if we are better off if I do that.

I’m ready to be convinced either way.


#8    DSMok1      (see all posts) 2010/12/09 (Thu) @ 11:39

It would probably be interesting to see just the “Base State” results, without splitting out by out state.


#9    dave smyth      (see all posts) 2010/12/09 (Thu) @ 11:49

Yes, once you start apportioning the left on base to all 3 outs, you are into probabilities instead of just counting bases. So you might as well just use LWts.

So this seems unfair to the guy who makes the final out. OTOH, I believe there are more runners on base with 2 outs, so this batter also has more opportunities.

Can you check to see what are the avg opps with 0, 1, and 2 outs (baserunner bases available to the batter), and then plug an avg batter into each situation, to see how it balances out.


#10    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 11:57

I can only do stuff like that when I’m home.

But… we know there are 1.5 bases when the third out occurs (data in chart above shows that).

At the bottom of this page, we see the frequency of each base/out state:
http://tangotiger.net/re24.html

We see when there are no outs, we have .24 times that there are 0 bases, .061 times there is 1 base, .015 there are 2 bases, .015+.002 there are 3 bases, .006 times there are 4 bases, .003 there are 5 bases and .004 there are 6 bases.  That gives us .346 PA (about one third of all PA) for a total of .205 bases, or 0.60 bases per PA when there are 0 outs.

You can use that chart there to work it out for 1 out as well (and 2 outs if you like, to confirm that the 1.5 is accurate).
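The 0-out arithmetic can be sketched as a weighted average. The frequencies are the rounded ones quoted in the comment; the variable names are illustrative only.

```python
# (bases already owned by the runners, share of all PA) for each 0-out base state
zero_out_states = [
    (0, 0.240),
    (1, 0.061),
    (2, 0.015),
    (3, 0.015 + 0.002),
    (4, 0.006),
    (5, 0.003),
    (6, 0.004),
]

pa_share = sum(freq for _, freq in zero_out_states)
bases_total = sum(bases * freq for bases, freq in zero_out_states)
bases_per_pa = bases_total / pa_share
print(round(pa_share, 3), round(bases_total, 3), bases_per_pa)
```

With these rounded inputs the ratio lands a shade under the 0.60 quoted, which is what you would expect if the quoted figure came from unrounded frequencies.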


#11    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 12:07

Number of bases owned by runners, by out:
0 outs: 0.60
1 outs: 1.14
2 outs: 1.43

Number of potential bases that can be moved:
0 outs: 1.01
1 outs: 1.62
2 outs: 1.93

The first set of numbers is how many bases baserunners currently “gained”.

The second set of numbers is how many bases baserunners have left to gain to score a run.

So, at 0 outs, the baserunner(s) (if any) have gained 0.60 bases, and have left 1.01 bases to score.

Roughly speaking they are somewhat proportional (either by division or by subtraction).

So, yes, David makes a good point that you can’t just divide the left on base equally for the three outs, because the guys with 0 outs didn’t have as much potential to move runners over.

In the end, we’re just going to end up at linear weights.

I’m even more convinced now that it would be fruitless.

Great stuff David…


#12    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 12:11

I guess what you can try to do, if you want to push it, is give 20% of the remaining bases at the end of the inning to the first out, 35% to the 2nd out, and 45% to the 3rd out, to match how many bases are being gained at those three out states.

This way, you are in balance, on average, by out states.
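One way to arrive at a split close to 20/35/45 is to normalize the "bases owned by runners" figures from the earlier comment (0.60, 1.14, 1.43). That this is where the percentages come from is an assumption, not something the comment states outright.

```python
# Bases owned by runners at each out state, from the earlier comment.
bases_owned = {0: 0.60, 1: 1.14, 2: 1.43}

total = sum(bases_owned.values())
shares = {outs: owned / total for outs, owned in bases_owned.items()}

# Round each share to the nearest 5 percentage points.
rounded_pct = [round(shares[outs] * 20) * 5 for outs in (0, 1, 2)]
print(shares, rounded_pct)
```

Normalizing the "potential bases" figures instead (1.01, 1.62, 1.93) gives shares in the same neighborhood, so the choice between the two does not change the picture much.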


#13    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 12:36

It brings up some interesting thoughts.  Say you have a runner on 1B, and you move him to 2B but make an out.  And the next guy moves him to 3B but makes an out.  The final guy makes the final out.  So, you now have -3 bases to account for.  Under the 20/35/45 plan, it looks like this:

batter1: +1 base
batter2: +1 - .20*3 = +0.40
batter3: +1 - .35*3 = -0.05
batter4: -.45*3 = -1.35

Total: 0 bases, 3 outs
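The same ledger as a short sketch, so the split is easy to change and re-check. The data layout (`plays` as tuples) is illustrative only.

```python
SHARES = (0.20, 0.35, 0.45)  # share of stranded bases charged to the 1st, 2nd, 3rd out
stranded = 3                 # the runner left on 3B when the inning ends

# (bases gained on the play, index of the out made on the play or None)
plays = [
    (1, None),  # batter1 reaches 1B
    (1, 0),     # batter2 moves him to 2B, first out
    (1, 1),     # batter3 moves him to 3B, second out
    (0, 2),     # batter4 makes the third out
]

net = [
    gained - (SHARES[out] * stranded if out is not None else 0.0)
    for gained, out in plays
]
print([round(x, 2) for x in net], round(sum(net), 2))
```

Any split that sums to 100% keeps the inning total at 0 bases; only the distribution across the three outs moves.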

What would linear weights give?  Well, by base/out state, it would be:
batter1: +.40 runs
batter2: -.22 runs
batter3: -.34 runs
batter4: -.384 runs

Total: -.544 runs (which matches our starting +.544 level).  You can add +.136 runs to each if you like, to set it back to 0.  So:
batter1: +.536 runs
batter2: -.084 runs
batter3: -.204 runs
batter4: -.248 runs

According to linear weights, the last two guys who made outs had roughly the same impact.  But according to bases/outs, the guy who made the third out was the huge problem.
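The re-centering step above is just an equal shift applied to each batter. A quick sketch with the run values quoted in this comment (the list names are illustrative):

```python
lw = [0.40, -0.22, -0.34, -0.384]   # run values by base/out state, batters 1-4

total = sum(lw)                      # net change over the inning
shift = -total / len(lw)             # equal share added back to each batter
centered = [value + shift for value in lw]

# Gap between the last two out-makers under linear weights.
lw_gap = centered[2] - centered[3]
print(round(total, 3), round(shift, 3), [round(v, 3) for v in centered], round(lw_gap, 3))
```

Under linear weights the last two out-makers differ by a few hundredths of a run, while under the 20/35/45 bases ledger they differ by well over a full base.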


#14    dave smyth      (see all posts) 2010/12/09 (Thu) @ 13:16

---"In the end, we’re just going to end up at linear weights."
------------

Not necessarily. The idea is to use factual information as much as possible (bases per out), and only use probabilities when forced (such as the attribution of credit on a 1st to 3rd extra base between the batter and runner). IIRC, that’s Tuttle’s “Base Production” system. I’m assuming you have seen his awesome series of articles, Tango, which unfortunately has vanished from the internet, AFAICT.


#15    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 13:42

There was a website no?  Codell?

***

Suppose that you have a superhigh run environment (OBP=.900).  Basically, just getting to 1B is almost a guarantee to score.  Why then do we care about counting base1 to base2 and base2 to base3?  Those bases are not equal.  Giving up an out so that you can go from 1B to 2B would be insane in this case.  And instead, you will be credited with “+1 base”, just as you would in a league with .300 OBP.

So, by counting each base up as we are doing, we are implying that each base is equal.  Instead, we should really be counting like this:
home to first
first to second
second to third
third to home

And counting bases like that.  So, for a HR, we would have something like the following:
home to first: 1 (the batter)
first to second: 1 + .3 (there’s 0.3 runners on 1B)
second to third: 1 + .5 (there’s another 0.2 runners on 2B)
third to home: 1 + .6 (there’s another 0.1 runners on 3B)

You add it up, and you get 5.4 bases, but now it’s broken down as:
HP to 1B: 1.0
1B to 2B: 1.3
2B to 3B: 1.5
3B to HP: 1.6

Each base gained doesn’t count the same. In the .900 OBP league, HP to 1B gets almost all the value, and the rest get little value.  In our current environment of .340 OBP, it would be something like this for the value of each base:
HP to 1B: 0.26
1B to 2B: 0.17
2B to 3B: 0.17
3B to HP: 0.40

So, you multiply the two charts:
HP to 1B: 0.26 x 1.0 = 0.26 runs
1B to 2B: 0.17 x 1.3 = 0.22 runs
2B to 3B: 0.17 x 1.5 = 0.26 runs
3B to HP: 0.40 x 1.6 = 0.64 runs

You add it up, and you get 1.38 runs, which is pretty much what we expected (1.40).
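The multiplication of the two charts, as a sketch using the numbers given in this comment (the per-leg values are the commenter's "something like this" estimates, not measured weights):

```python
legs = ["HP to 1B", "1B to 2B", "2B to 3B", "3B to HP"]
hr_bases = [1.0, 1.3, 1.5, 1.6]        # bases gained on each leg by a HR
leg_value = [0.26, 0.17, 0.17, 0.40]   # rough run value of one base on each leg

runs_by_leg = [bases * value for bases, value in zip(hr_bases, leg_value)]
total_bases = sum(hr_bases)
total_runs = sum(runs_by_leg)

for leg, runs in zip(legs, runs_by_leg):
    print(f"{leg}: {runs:.3f} runs")
print(round(total_bases, 1), round(total_runs, 3))
```

Swapping in a different `leg_value` list is how you would model another run environment, such as the .900 OBP league where nearly all the value sits on the HP to 1B leg.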


#16    Patriot      (see all posts) 2010/12/09 (Thu) @ 14:39

David, there are a series of SABR-L posts (30 some) by Tuttle about Base Production.  I don’t know if it’s the same articles you are referring to, but send me an email at the address above if interested.

Across all of the posts, it looks as if Mr. Tuttle would have been better off writing a book.


#17          (see all posts) 2010/12/09 (Thu) @ 15:08

I’m sure Tango will point out something obvious, but I always wondered why pickoffs hurt less than CS. Being a Nationals fan (agreed on the overpay, by the way, but just wanted to note how bad their outfield situation was), I have seen Nyjer Morgan get PKed 10 times too many, and always felt it damaged us as much as when he was caught stealing.

(Speaking of which, why does FanGraphs list only a random sprinkling of PKs in their player WPA pages? BP lists all the CS they account for, but many, many more PKs than are listed. Would be interesting, because it is shocking how stupid certain players are “above average” insofar as you’ll find players who average over half a run lost when CS but only gain a tenth of a run when successful).


#18    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 15:11

More “pickoffs” result in the runner being safe than a “caught stealing”.

If you look at the numbers above, you see that each pickoff results in 0.71 outs, while each CS results in 0.99 outs.

It’s a horrible and terrible accounting system, one that we would not invent if starting from scratch.


#19    dave smyth      (see all posts) 2010/12/09 (Thu) @ 15:26

yeah, patriot, those must be what i was referring to. the email off your name doesn’t work. i don’t know anything about sabr-l


#20          (see all posts) 2010/12/09 (Thu) @ 15:33

Sorry David, try this one.


#21    Peter Jensen      (see all posts) 2010/12/09 (Thu) @ 15:46

It’s a horrible and terrible accounting system, one that we would not invent if starting from scratch.

And what makes it more strange is that Retrosheet had originally built into the system Event_Type 7, Pickoff Errors, used it for a while and then abandoned it.


#22          (see all posts) 2010/12/09 (Thu) @ 16:00

Thanks, that makes a lot of sense.

In terms of your point using the high run environment example, I think that’s spot on, and gets me thinking about how teams could nuance their strategy based on what run environment each team they’re going to face presents. Actually, it seems that Toronto has done just that, insofar as their shift towards the long ball makes sense in the AL East. But beyond that, why don’t teams alter, say, their baserunning when facing the Mariners as opposed to the Yankees (an RE difference on par with the dead-ball era versus when Barry Bonds’s head grew)? The value of risking an out to move a runner over must be significantly different, period, but so should the relative value of doing so from 1st to 2nd versus 2nd to 3rd…

(sorry to ramble, I’m done now!)


#23    RMR      (see all posts) 2010/12/09 (Thu) @ 16:28

Sorry to be the lay man in the room, but what are the analytical applications for this—e.g. what “new” knowledge have we gained or might we gain by looking at production in this way?

I’m not at all implying the answer is none, I’m just not clear on what this helps us do—give us a more factual measure of run production than using the LW approach?

And if at the player level we’re still applying an average value, what’s the philosophical advantage whether it’s average bases or average runs?  (or would we use real values, perhaps compared against the averages to get a luck/leverage measure?)

Thanks!


#24    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 16:35

William: If you are facing Mariano Rivera, you should steal and bunt A LOT more than against an average pitcher.

***

Peter: I think they had some inconsistent use of it, so they abandoned it.  Maybe I should fix that.

For example, you will note that I have codes “80” and “97”.  Those are from 99-event_cd (meaning 19, and 2 respectively) that I created, to differentiate between an “out” where everyone is safe or not.  In looking at that, I should have converted the remaining FC (19) into the “2” bucket (regular out).  I’ll fix that in the future.

But, more to the point, I can do the same thing with PK (and to a lesser extent CS), and use PKE for all the PK where no out is recorded.


#25    Tangotiger      (see all posts) 2010/12/09 (Thu) @ 16:38

RMR: I think this helps the lay man on his way to accepting linear weights.

In terms of “selling”, you have to go from bases and outs, to runs, to wins.  A lot of us here are happy with wins and WAR and runs above average and linear weights.  But, for those who haven’t gone all-in yet, keeping things in terms of bases and outs would likely be the starting point.  Think of it as SABR101.


#26    RMR      (see all posts) 2010/12/09 (Thu) @ 17:35

Sounds good, Tango.  That’s what I assumed.  That said, it would be great to see some version of each approach laid out to help tell the story.  Method A (Process -> Result), Method B (Process -> Result), etc.  Look, the results are all pretty much the same!  We didn’t just make this stuff up!

There’s a great teaching opportunity here between this community and the rest of the baseball analysis world.  It sounds like it is happening in fits and starts as individuals opt in.  I know this is a conversation for a different thread, but I wonder how much active outreach this community is doing.  (not that it is anybody’s responsibility to do so, just that I wonder if the evolution could be sped up a bit...)


#27          (see all posts) 2010/12/11 (Sat) @ 07:10

Great work, Tango.

Like a good mlb player, you keep making adjustments to improve your game.

One suggestion, to make your more enlightened analysis more relevant to those whom the angry primates perceive as sabermetric neanderthals: please segment the data since 2006, as it’s a different ballgame post PED testing/penalties enforcement.

I believe that with your statistical acumen & that change, your readers would find even more interesting & useful applications of this analysis to real world baseball decision making.

Thanks & keep up the great work.

