THE BOOK--Playing The Percentages In Baseball

Sunday, March 27, 2011

Base running lwts

By , 04:20 AM

I recently revamped my base running linear weights program.  The results will soon be available on Fan Graphs, along with their other stats of course.

I am still working out any bugs which might be present and tweaking the methodology.  I want to vet some results and methodologies with you guys so you can offer any comments and suggestions.

I don’t include SB/CS/PO numbers.  I also don’t include advancing or getting thrown out on WP or PB.

Basically, I keep track of all base runner advances and outs (and not advancing of course) on batted balls.

Right now, the only thing I keep track of on ground balls to the infield is this: on a ground ball hit to the SS or 3B with less than 2 outs, how often a runner on second (with no runners on 1st or 3rd) advances to third (or home) or gets thrown out at third (or home).

I could track how often the runner on first stays out of a DP with a runner on second, less than 2 outs, and a ground ball to the IF (keeping separate track of balls to the left and right side).

For air balls, I keep track of all advances, holds, and outs on hits and outs, with each situation treated separately, and batted balls to each outfield position treated separately as well.

For air balls, I don’t distinguish the fielding location of a batted ball other than which fielder fields it (in other words, left side, right side or middle).  And like I said, on ground balls with a runner on second, I only look at balls hit to the SS and 2B, and I don’t distinguish between the two.

I do a simple park adjustment for all batted balls, based on where in the OF (L,C,or R) it is hit at each park.

That is pretty much it.
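
A minimal sketch of the bucketing idea described above, not the actual program: each opportunity is keyed by its situation, each outcome carries a run value, and a runner is credited with the difference between his outcome and the league-average outcome in the same bucket. The bucket fields, player names, and run values here are made up for illustration, and the park adjustment is left out.

```python
from collections import defaultdict

def baserunning_runs(opportunities):
    """Return runs above average per runner.

    opportunities: iterable of (runner, bucket, run_value) tuples, where
    bucket identifies the situation and run_value is the (hypothetical)
    run value of what the runner actually did.
    """
    opportunities = list(opportunities)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, bucket, value in opportunities:
        totals[bucket] += value
        counts[bucket] += 1
    # League-average run value of the outcome in each bucket.
    league_avg = {b: totals[b] / counts[b] for b in totals}

    runs = defaultdict(float)
    for runner, bucket, value in opportunities:
        # Credit or debit the runner relative to the average for that bucket.
        runs[runner] += value - league_avg[bucket]
    return dict(runs)

# Hypothetical bucket: runner on first, 1 out, single fielded by the RF.
bucket = ("runner_on_1B", 1, "single", "RF")
sample = [
    ("Fast Guy", bucket, 0.25),   # first to third
    ("Slow Guy", bucket, 0.0),    # stopped at second
    ("Fast Guy", bucket, 0.25),   # first to third
    ("Slow Guy", bucket, -0.5),   # thrown out at third
]
print(baserunning_runs(sample))
```

Because every outcome is measured against its own bucket's average, the totals across all runners sum to zero by construction.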

So I might add in “staying out of the DP” for runners on first with less than 2 outs and a runner on second.

I also might add in advances and outs on WP and PB, although I am worried that the numbers will reflect more of how often a pitcher threw a WP (or the catcher committed a PB) than how often a runner advances on a potential WP or PB.  IOW, if player A advanced 5 times on a WP and player B advanced no times, I am afraid that the difference is mostly because the pitcher happened to throw more pitches that got away with player A on the bases than the fact that player A attempted an advance more often than player B.  But I am not sure.  It is always a judgment call on what things to include based on whether there is much of a skill element in the numbers.  I suppose I can look at the y-t-y correlations in the WP and PB numbers for base runners, and if it is real low, I won’t include them.
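
A minimal sketch of the year-to-year (y-t-y) check mentioned above, assuming each season is stored as a mapping from player to a rate (for example, WP/PB run value per game). The data layout and function names are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_to_year_r(year1, year2):
    """Correlate the same players' rates across two seasons.

    year1, year2: dicts of player -> rate. Only players who appear in
    both seasons are used.
    """
    common = sorted(set(year1) & set(year2))
    return pearson_r([year1[p] for p in common],
                     [year2[p] for p in common])

# Made-up rates; an r near zero would suggest little repeatable skill.
season_a = {"A": 0.4, "B": -0.1, "C": 0.2, "D": -0.3}
season_b = {"A": -0.2, "B": 0.3, "C": 0.1, "D": -0.1}
print(round(year_to_year_r(season_a, season_b), 3))
```

In practice the player pool would be filtered by a minimum number of games in each season before correlating.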

Anyway, here are some preliminary results: 


It looks like on a yearly basis, one SD for runs per 150 games is around 2 runs.  IOW, the worst and best are plus or minus 5-6 runs per 150.  For true talent, one SD is probably 1.5 runs (plus or minus 4 runs or so).  I am assuming that the entire population comprises 3 SD.  So, there isn’t a whole lot of value in base running.  If we were to add in things that I am not measuring, we can probably add maybe 1 more run in SD.  I think that plus or minus 5 runs (per 150) is a reasonable estimate for the limits on base running talent.  Again, this is excluding base stealing.  I am pretty sure that if you asked a baseball person, they would say that a great base runner is worth several wins.  It looks to me like .5 wins (in talent, not observed performance) is reasonable.

One interesting thing is that on a team level, differences among players’ base running performance/talent really tends to even out.  This is not surprising as almost all teams have a mixture of slow and fast players.  It is almost impossible for a team to be great or terrible in base running.

Stuff you hear about a team like the Rays or Angels being worth 4 or 5 wins in base running is nonsense, I believe.  Here are the observed team base running totals for 2010.  Keep in mind that these are un-regressed numbers.  As with UZR, they do NOT tell you what a player or a team has actually done.  If you want to estimate that, you would still need to do some regressing.  If you wanted to estimate team (or player) talent or do a projection, you would have to regress even more.  The reason you have to regress even to get what a team actually did is that if a team has good numbers, it tended to have had easier opportunities that my methodology is not picking up, and vice versa for teams with bad numbers.

These are absolute runs above or below average and not “per anything.” Again, they would have to be regressed to have any meaning in terms of what actually happened.  Don’t ask me how much, because I don’t know.

TBA +18
OAK +11
COL +10
FLO +9
TEX +8
DET +6
CIN +6
ALA +5
NYN +5
LAN +5
ATL +4
SLN +4
SDN +2
CHA +2
MIN +2
PIT 0
NYA 0
PHI -3
HOU -3
CHN -3
CLE -4
BAL -6
ARI -8
SFN -8
WAS -9
BOS -9
TOR -10
KCA -10
SEA -11
MIL -11

Here are some of the best and worst base runners in 2010.  There are few surprises.  These are in total runs above or below average, and not a rate (like per 150).

Best

Elvis Andrus +7
Drew Stubbs +5.8
Carl Crawford +5.8
Carlos Gonzalez +5.2
Colby Rasmus +5.0
Angel Pagan +5.0
Juan Pierre +4.9
Chone Figgins +4.6
Brett Gardner +4.5

(Joe Mauer is +4.3!  Is he that good a base runner or is that just a fluke?)

Worst

Adam Laroche -7.3
Prince Fielder -7.2
Billy Butler -6.1
David Ortiz -5.7
Aramis Ramirez -5.1
Ryan Howard -5
Vlad Guerrero -5
Jason Kubel -4.7
Jorge Posada -4.5
Casey Kotchman -4.5

That’s it for now!

#1    Jon Shepherd      2011/03/27 (Sun) @ 08:20

Sorry if I missed it, but what is the reason to exclude stolen bases?


#2    David      2011/03/27 (Sun) @ 09:33

I would say, from watching him, that Joe Mauer is an excellent baserunner.  Cautious on the paths (so he rarely gets thrown out), but good enough to know when he can take an extra base here and there.

The results overall are unsurprising - which is a good thing.  It pretty much matches what the “eye test” would tell anyone, with a couple little surprises that could help our analysis - like I really didn’t think ANYONE would be worse than Prince Fielder (he’s slow as molasses and on base ALL THE TIME), yet lo and behold.  There’s Adam LaRoche.  Huh.  Wouldn’t have guessed it.


#3    Tangotiger      2011/03/27 (Sun) @ 10:32

Was Laroche the guy a few years ago who called for a bunt by himself?  We had a spirited thread about it.


#4    Sky      2011/03/27 (Sun) @ 11:04

#1—We already have pretty good methods of measuring the value of SB/CS. This analysis adds to that. For a total analysis, you’d include both.


#5    Matt Swartz      2011/03/27 (Sun) @ 11:39

Good stuff. This is obviously very similar to BP’s EqBRR but developed separately, which makes it worth comparing. The results are pretty similar interestingly enough.

I took EqBRR-EqSBR to get the non-SB runs in 2010 EqBRR for an equivalent comparison. Here’s what I get.

Team--
.83 correlation overall
EqBRR SD = 6.6
Baserun lwts SD = 7.6
Stdev(EqBRR - Baserun lwts) = 4.3
Most extreme differences: MIL -7.6 worse in baserun lwts & OAK 7.4 better in baserun lwts.

Individuals--
Of the top 9 guys MGL named, 6 are in BP’s Top 10 by EqBRR-EqSBR. Of the bottom 9, also 6 in BP’s Bottom 10. Same top guy and bottom guy for both stats.


#6    Peter Jensen      2011/03/27 (Sun) @ 11:45

MGL - As we have discussed before, running on the pitch and game score are other factors that determine whether an extra base is taken or not. So is the number of outs in the inning, but I assume that you are already adjusting for that and just didn’t mention it above.  Are you also counting extra bases taken on GBs that are fielded by outfielders?  Where the air balls are hit in the outfield will also be an important factor, if and when we get reliable information on hit locations.  I think whether the outfielder makes a running catch or has time to set himself before the catch and throw is the important part of the hit location.  None of these factors will be important enough to make much of a change in the ordinal rankings of the best and worst runners, but they might make some adjustment of the overall run values.


#7    Jon Shepherd      2011/03/27 (Sun) @ 11:48

#4 . . . trying to see how this might affect how stolen bases are fit into the whole WAR tool.  It might make sense for a reorganization.

Regarding the lists . . . they pass the sniff test for me and are similar to BP.  That probably is the best validation one will get as the effect is somewhat small.


#8    Matt Swartz      2011/03/27 (Sun) @ 11:49

Actually Russell (Pizza Cutter) looked at the responsibility of moving from 1st to 3rd on a single in this article last year, with a really interesting conclusion.  Here’s a relevant clip:

http://www.baseballprospectus.com/article.php?articleid=10533

“But let’s take a look at what the numbers say. Maybe some hitters are good at placing the ball in such a way that the runner can more easily make it to third. I figured out the percentage of times that a runner had gone first-to-third safely on a single to the right, how often the right fielder had been so victimized, and how often it had happened on the batter’s watch and the pitcher’s watch. Which actor had the most to say about such situations? The results might surprise you:

Batter 9.2%
Pitcher 39.4% (sic)
Right Fielder 14.0%
Baserunner 26.2%
Noise 11.2%”


#9    MGL      2011/03/27 (Sun) @ 15:46

Peter, yes of course I include outs as one of the parameters.  And ground ball base hits to the outfield. I forgot to mention that.

Yes, running on the pitch (whether it be a hit and run or stolen base attempt) contributes to those extra bases. It is part of a player’s base running ability, so I don’t need to treat it separately.  IOW, if a player is fast and/or smart enough to take the extra base OR he was running on the pitch, in either case, he gets credit for good base running.  We just need to make sure that we don’t add to my numbers the value of a stolen base which includes the value of getting the extra base when the ball is put into play.  That would be double counting.

All the other things are important of course, but I agree that it won’t make that much difference in these numbers (because they are relatively small to begin with) and of course they will even out in the long run, so it is not worth it for me to put them in as parameters (although I could, at the risk of decreasing sample size in each “bucket” - like a 1 out ground ball single to short RC field with a runner on 1st and the game tied in the 9th inning).

Of course the authors of “Short Hops” would say that because my methodology does not include all relevant parameters and the parameters I use are not divided into enough buckets, the entire result is invalid! ;)

What do you guys think about trying to include advances (and outs) on WP and PB?

What about staying out of the DP when on first base on ground balls?

Whoever asked about SB/CS, that can be added in easily and separately.  We don’t need PBP data for that.


#10    tangotiger      2011/03/27 (Sun) @ 16:24

SB/CS: well, you would need PBP if you adjust for grass/turf and handedness of pitcher.  Also if you look at whether he’s stealing 2B or 3B (or home!), and if there are other runners on base (lead or trailing runner), and to a lesser extent number of outs.

Regardless, I’d keep everything separate.


#11    Sky      2011/03/27 (Sun) @ 16:44

Matt, you might want to remove the runs from advancing on WP/PB, too, since MGL didn’t include those.


#12    Matt Swartz      2011/03/27 (Sun) @ 17:05

Few changes.

Same .83 correlation for teams, 6.5 instead of 6.6 stdev for EqBRR, same 4.3 stdev for the difference, same top and bottom differences but now it’s +7.7 for OAK and -7.1 for MIL.

Juan Pierre now on top, with 6.6 instead of Andrus with 6.0. LaRoche still on the bottom for both. Six of the top nine and six of the bottom nine are the same for both metrics.


#13    Peter Jensen      2011/03/27 (Sun) @ 17:39

“...and of course they will even out in the long run, so it is not worth it for me to put them in as parameters”

Score will even out as a parameter, but hit ball locations may be slow to even out due to a runner’s position in the batting order.  A leadoff or number 2 hitter should have a lot more advancement opportunities on air balls hit to the outfield at a greater distance than runners in the number 4, 5, or 6 lineup positions.  Since the leadoff and number 2 hitters are faster runners anyway, it may be exaggerating the effect of their speed not to account for where the balls are hit.  But as you point out, pretty small potatoes to worry about.

Running on the pitch as a designed hit and run or steal varies a lot from team to team as a strategic choice, so it is not only a product of a runner’s speed.  And running on the pitch as a result of a 3-2 count, both with 2 outs and with one out, is used without much regard to a runner’s speed and is also biased by the batting style of the batters following a player who gets on base.

I guess my point is that all of these factors would tend to make a runner’s value in taking an extra base even less important than the numbers you cite in your post, and those numbers are already pretty small.


#14    2011/03/27 (Sun) @ 19:24

MGL@9: Yes on PB/WP. Also, how well do these ratings correlate with other accepted matters of speed, namely SB, CS, GDP and 3B?


#15    MGL      2011/03/27 (Sun) @ 23:59

Great input guys!  I am going to check out the y-t-y correlation for PB and WP to see how much base running skill that is capturing.  I’ll probably add staying out of the DP and a force, for runners going from 1st to 2nd on a ground ball, although much of that comes from running on the pitch I would think.

This is why I love vetting these things on this blog!


#16    2011/03/28 (Mon) @ 03:06

MGL, probably a very small effect, but did you include advancing on foul flies?  Aren’t WP and PB defined so that if there is no base runner advancement then it does not show up?  We want something like advancements (and outs) per ball in the dirt, or per ball getting away from the catcher, which we don’t have.  I guess this leaves at least 2 questions: does the additional noise make us much more uncertain about the signal?  Does the additional noise have some sort of bias that would move the results in a particular direction?


#17    Rally      2011/03/28 (Mon) @ 09:41

I looked at two batters for an example of how often runners advance on doubles:

In 2004, Chone Figgins hit 7 doubles with a runner on first.  Only one scored, the others held up at third.

In 2006, Frank Thomas hit 11 doubles, 3 with runners on first.  All three scored.  Amazing that Thomas hit only 11 doubles in what was a great year for him.  Looking at 2007, he hit 30 doubles, 11 with runners on 1st.  9 scored, so he’s 12/14 over the two year period.

What I see is this:  If Figgins hits a ball where the runner on first has time to score, then Figgins doesn’t get a double, he gets a triple (17 that season).

If Thomas hits a ball far enough that he has time to lumber into 2B, then any normal runner should have time to run 3 bases.

If you only look at the location of the ball and the time baserunners have to advance, a Thomas double is a Figgins triple.


#18    2011/03/28 (Mon) @ 10:46

What is the y-t-y for going from first to third? I’d imagine it’s low; Brooks Robinson and Tim Raines have near-identical advancement rates for their careers. (I’d also gather it’s more a function of baseball sense rather than speed.)


#19    MGL      2011/03/28 (Mon) @ 15:12

After fixing some bugs and adding base running from first base with less than 2 outs on a ground ball (staying out of the GDP or force play), here are some new numbers:

Teams

TBA +23
DET +14
FLO +12
TEX +11
COL +10
CIN +9
SLN +9
ALA +8
LAN +8
OAK +7
SDN +6
NYN +3
CHA +2
NYA +1
CLE +1
MIN 0
ATL -1
PIT -1
PHI -4
CHN -5
HOU -6
WAS -8
MIL -8
BAL -10
ARI -12
SFN -12
BOS -12
KCA -12
TOR -16
SEA -17

Leaders

Andrus 9.8
Rasmus 7.7
Pierre 7.6
Stubbs 7.4
Crawford 6.9
Damon 6.6
C Gonzalez 6.2
Longoria 5.9 (is there anything this guy can’t do?)
Hamilton 5.8 (see above, Longoria)
Prado 5.7
Gardner 5.6
A Ramirez 5.6

Trailers

Ad Laroche -10.1 (wow!)
Fielder -8.8
D Ortiz -7.4
Vlad -7.1 (how the mighty have fallen)
Aramis Ramirez -6.7 (should a guy that slow be allowed to play anywhere but 1B or DH?)
Howard -6.7 (how is that contract looking?)
McCann -6.1
H Matsui -5.9
Butler -5.9 (the only guy on KC who can hit can’t run!)


#20    Tangotiger      2011/03/28 (Mon) @ 15:54

Yes, it was Adam Laroche that did that brain-dead bunt:

http://www.insidethebook.com/ee/index.php/site/comments/stupid_ballplayers/


#21    MGL      2011/03/28 (Mon) @ 16:21

Interestingly, if we combine the last 4 years of data from my updated base running lwts program, with no weighting, and regress (250 games is the 50% regression point), the SD per 150 games is only 1.95 (min 100 games).  So we expect talent to be only somewhere on the order of plus or minus 6 runs per 150.

The best, 4 years combined and regressed

Figgins 5.2
Bonifacio 4.0
Rasmus 4.0
Andrus 3.8
Pierre 3.8
Ryan 3.7
Bartlett 3.6
Kinsler 3.5
Andres Torres 3.4

The worst

Posada -5.5
B Molina -5
Lowell -4.9
Fielder -4.8
C Lee -4.5 (How bad was his contract?)
Thome -4.5
Garko -4.3
D Ortiz -4.2
Kotchman -4
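
A minimal sketch of what a “50% regression point” at 250 games could look like, assuming the usual shrinkage form in which the observed rate is weighted by games / (games + 250) and the rest of the weight goes to the league mean (zero here). The comment does not spell out the formula, so the function and its parameters are assumptions.

```python
def regress_to_mean(observed_per_150, games, half_point=250.0, mean=0.0):
    """Shrink an observed rate toward the mean.

    At games == half_point the observed rate gets exactly half the weight;
    with more games it gets more, with fewer games less.
    """
    weight = games / (games + half_point)
    return mean + weight * (observed_per_150 - mean)

# Hypothetical player: +8 runs per 150 games observed over 250 games.
print(regress_to_mean(8.0, 250))   # half the observed rate is kept
# Same observed rate over 750 games keeps three quarters of it.
print(regress_to_mean(8.0, 750))
```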


#22    2011/03/28 (Mon) @ 22:51

I’m curious about the groundball to the infield thing.  First of all, in one place you say you only track groundballs to ss/3b, then later you say only 2b/ss.  Probably just a typo.  Assuming you mean only 2b/ss, I’m pretty interested in why you chose not to distinguish between balls to 2b (or right side of the diamond) and balls to ss (or left side of the diamond).  Any runner with a left hander at the plate is going to have many more opportunities to go from 2nd to 3rd on a ground ball to the infield as the left hander is going to pull the ball a higher percentage of the time.  The percentage of times a runner advances on a ball to the right side of the infield is probably 75% higher than when it goes to the left side.  A player who hits in front of left handers is going to advance in that scenario more frequently but not as a result of better base running, per se.


#23    MGL      2011/03/29 (Tue) @ 00:54

There may be a typo in my description.  For a runner on second (and no one on first or third), I only keep track of advances and non-advances on ground balls hit to the SS and 2B.  The reasoning is that everyone advances on ground balls to the right side.  On balls hit to the left side, skill and speed are a factor.


#24    2011/03/29 (Tue) @ 10:23

Maybe I’m missing something, but isn’t the 2nd baseman on the right side?  Not sure I quite follow the logic.


#25    MGL      2011/03/29 (Tue) @ 20:13

Sorry, I keep writing 2B when I mean 3B of course.  SS and 3B.  Left side…


#26    MGL      2011/03/30 (Wed) @ 02:45

Well, I figured WP and PB run values and did a y-t-y correlation for players with at least 100 games in each season.  The “r” is .018.  I am not going to include them in the numbers.  Apparently it is much more about the pitcher and catcher (and luck) than the base runner, at least in a season or two.

BP should not be including it either.

A word to the wise:

When you are developing a metric, often times that one metric includes a number of things that reflect different levels of control or luck with respect to the player who is being evaluated.  Make sure that you check out the “control” factor for all of those things that are lumped together in the stat. If you don’t and one or two components are mostly luck, at best you are not gaining anything by including them, and at worst, you are introducing noise to the metric - how much depends on the magnitude of the component that is mostly noise (little control by the player being evaluated).

An overlooked example is an un-regressed pitcher stat which includes HR, BB, SO, and BABIP, like ERC (component ERA).  Including BABIP is an example of where you are lumping together a bunch of components some of which have very different elements of noise/luck/control.  Not a good thing to do.

With UZR, for example, easy chances are an example of components that reflect little skill on the part of the fielder. They should not be included if possible. 

Of course the proper solution is to regress all of the components of these stats individually and THEN you can lump them or add them together.
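
A minimal sketch of “regress the components individually, then add,” assuming a simple shrinkage of the form n / (n + k) for each component. The component names and the regression constants k below are invented for illustration; the point is only that a mostly-luck component (large k) ends up contributing very little to the total.

```python
def regress(observed_runs, opportunities, k):
    """Shrink one component toward zero; k is that component's
    hypothetical 50% regression point, in opportunities."""
    return observed_runs * opportunities / (opportunities + k)

def combined_estimate(components):
    """components: list of (observed_runs, opportunities, k) tuples.
    Each component is regressed on its own, then the results are summed."""
    return sum(regress(obs, n, k) for obs, n, k in components)

components = [
    (4.0, 300, 300),    # e.g. advances on batted balls: a fair amount of skill
    (3.0, 300, 2700),   # e.g. advances on WP/PB: mostly noise, so a large k
]
print(combined_estimate(components))   # the noisy component adds only a little
```

Adding the raw numbers first (4.0 + 3.0) and applying a single regression would treat both components as equally reliable, which is the problem described above.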


#27    Rally      2011/03/30 (Wed) @ 09:25

MGL,

What is your denominator?  WP+PB advances per time on base?  Or something else?


#28    Tangotiger      2011/03/30 (Wed) @ 10:12

I would not do y-to-y because the numbers are so low.  The best way is to compare the observed distribution at a career level to the random one, and determine the difference between the two.

That said, MGL is correct that the proper way is to regress first, and then combine the components, rather than adding everything first, and then doing one regression.
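
A minimal sketch of one way to compare an observed distribution to a random one, assuming each player's opportunities can be treated as binomial trials at the league-wide success rate. This is an interpretation of the suggestion above, not a description of anyone's actual code; the function name and inputs are hypothetical.

```python
from statistics import pvariance

def talent_variance(successes, opportunities):
    """Estimate how much spread in player rates is left after removing
    the spread expected from binomial luck alone.

    successes, opportunities: per-player career counts (same order).
    Returns (estimated_true_variance, observed_variance, random_variance).
    """
    league_rate = sum(successes) / sum(opportunities)
    rates = [s / n for s, n in zip(successes, opportunities)]
    observed_var = pvariance(rates)
    # Average binomial variance of a rate, given each player's sample size.
    random_var = sum(league_rate * (1 - league_rate) / n
                     for n in opportunities) / len(opportunities)
    # A negative difference just means no detectable spread in talent.
    return max(observed_var - random_var, 0.0), observed_var, random_var

# Made-up career counts: advances and chances for four runners.
print(talent_variance([12, 8, 15, 5], [100, 100, 100, 100]))
```

If the observed variance is barely larger than the random variance, most of the differences among players are luck, which is the same conclusion a very low y-t-y correlation points to.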

