
THE BOOK--Playing The Percentages In Baseball


Monday, September 21, 2009

Doug Glanville: Times On Base

By Tangotiger, 02:39 PM

He says:

Yes, 40 percent is a great on-base percentage, but there are ways you can get on base without statistically being “on base.” There’s a fielder’s choice, which you can force with your speed (or maybe the infielders didn’t quite make a clean play); or you can reach by error, and errors often result from the pressure you put on the infield by running well. So you may have gotten on base a few more times than the stats show, and with that comes more potential for scoring runs.

It is a monumental joke that MLB classifies reaching base on an error the same as grounding out: you get a tally in the AB column, none in the H, none in the BB, and there is no official Reached on Error category for the tally.  So, by default, it gets treated like a ground out.  Stupid.  Well, I count it.  And more importantly, b-r.com and Retrosheet also count it.

On b-r.com, we see that Glanville reached base a total of 69 times, either due to error, or by fielder’s choice (and no out recorded).  He also never reached base on catcher’s interference.  Retrosheet shows him with 68 “ROE”, which has the same definition as B-R.com (and which is also my definition).  I don’t have my db handy at the office, otherwise I could tell you which one is correct (or why the two differ).

As Francona said recently, “You put a guy in the leadoff spot and tell him to be patient — we did that with Doug Glanville in Philadelphia. He went from having 200-and-something hits to about 150 hits and he ended up with 13 more walks. It just didn’t work.”

Not true.  When Glanville got his career high in hits (204), he also got a career high in walks (48).  Why would Glanville quote what someone said about him that wasn’t even true!  Looks like the NY Times fact-checks as much as ESPN does.  (Or are Op-Ed pieces free of fact-checking?)


#1          (see all posts) 2009/09/21 (Mon) @ 15:24

missing the link.


#2    Tangotiger      (see all posts) 2009/09/21 (Mon) @ 15:41

Fixed.


#3          (see all posts) 2009/09/21 (Mon) @ 16:07

Tom,

Regular columnists don’t usually get fact-checked.  Of all the stuff I’ve had in print, only two pieces have been fact-checked - one by the WSJ, and a letter to the editor to the Toronto Star.  And The Star only fact-checked because they found my claims so surprising.


#4    Tangotiger      (see all posts) 2009/09/21 (Mon) @ 16:10

Right, WSJ fact-checked my stuff pretty well, and they caught a couple of esoteric, though real, errors.

I figured what WSJ did was the norm.


#5    Guy      (see all posts) 2009/09/21 (Mon) @ 16:23

I would guess the NYT does fact-check its columnists.  However, the fact checker would probably only confirm that Francona did make the statement, NOT that Francona’s claim was true.  Which may answer Tango’s question:  “Why would Glanville quote what someone said about him that wasn’t even true!”

BTW, I notice that Glanville was +8.5 WAR over his career.  Which made me wonder:  who are the replacement-level players with the most PA over their career?  If Rally stops by, maybe he can tell us.  (To be clear, Glanville himself was a notch better than replacement.  Though it’s crazy that 85% of his PAs were batting leadoff or second.)


#6    Tangotiger      (see all posts) 2009/09/21 (Mon) @ 16:30

Among “recent” non-pitchers with a career WAR between -1 and 1, Neifi Perez has 5365 PA.  John Mabry has also played a lot.


#7    David Pinto      (see all posts) 2009/09/21 (Mon) @ 16:33

I remember when Mickey Rivers became a Yankees player that the NY announcers would talk about how Mickey’s speed caused fielders to make errors on rushed throws.  I really couldn’t find any evidence of that from the retrosheet data.  Mickey seemed to reach on an error at about the same rate as everyone else.


#8    Tangotiger      (see all posts) 2009/09/21 (Mon) @ 16:38

According to Rally’s database, the leaders who got more than their fair share of reaching base on error:

+36 runs, Derek Jeter
+36 Jimmy Wynn
+32 Craig Biggio
+30 Hank Aaron
...
-35 Pete Rose
-36 Yaz


#9    Mark R      (see all posts) 2009/09/21 (Mon) @ 16:39

If you’re familiar with George Will’s illustrious body of climate change opinion writing, you’ll know op-eds are sometimes allowed to skate by with glaring factual errors.


#10    Guy      (see all posts) 2009/09/21 (Mon) @ 16:55

Neifi is a good choice, with a WAR of exactly 0.0.  But Alfredo Griffin is even more impressive, with -2.2 WAR over an astounding 7,330 PA.  Honorable mention to Tim Foli with 1.2 WAR over 6,573 PA.

And as a Cardinals fan, I had to look up Ken Reitz:  4996 PA to get -4.1 WAR.  Sigh.


#11    MGL      (see all posts) 2009/09/21 (Mon) @ 17:07

I don’t see anything wrong with treating ROE’s as just another batted ball out, unless there is evidence that a significant portion of it is skill, and even then it is dicey.  From the standpoint of the guys who make the rules about scoring, an ROE is simply a ball that should have been turned into an out with ordinary effort by the fielder, and has nothing to do with the skill of the batter.  Now, if you want to start talking about the fact that ground ball hitters or RH hitters or fast runners get more ROE’s, then the discussion gets a little more complicated, but by no means is the issue settled.  But even then I still don’t see a compelling reason why they shouldn’t be considered just another out, even though the batter did in fact reach base (presumably through the “fault” of the fielder). 

If you want to make a case that it should be treated as a “times reached on base” because he did in fact reach base, then you have to argue that a sac bunt or a sac fly should be treated as an out because he did in fact make an out (regardless of the value of the out), and that IBB should be treated as a time on base (which is officially true, but Tango doesn’t count them as such).

At least ROE get recorded as a separate entry (at least in modern times) so that if someone wants to treat them other than an out, they can.  I don’t even like them in wOBA because it implies that an ROE has almost exactly the same skill component as a single or a walk, which it clearly doesn’t. 

The question is in recording stats for batters do we want to at least capture what we think is predominately a batter skill or do we just want to record what happened?  If the latter, then there can’t be any such thing as fielder indifference on an uncontested steal - that has to go as a SB.

If you have evidence that ROE are a significant batter skill, then by all means do something other than treat it as an out, but even then, giving it the same weight as a BB or a hit is ridiculous, at least if you want to use stats like OBP or wOBA to represent player skill, which is usually the case, rather than just “what happened.” If a player in yesterday’s game hits 3 routine ground balls to SS that are booted and strikes out his 4th AB, are you really satisfied saying that he went “3 for 4” or had a .750 OBP?  I’m not.  If you want to say that he went 0-4 and reached base on an error 3 times, I’m a lot happier with that. I can do with that what I want.  I don’t want to hear, “He reached base 3 times in 4 AB.”


#12          (see all posts) 2009/09/21 (Mon) @ 17:25

Tom Ruane wrote an article called:

“Do Some Batters Reach on Errors More Than Others?” It is at

http://www.retrosheet.org/Research/RuaneT/error_art.htm

It looks like speed matters, but I can’t tell how much by skimming it. I have not read it for a few years.


#13    Nick      (see all posts) 2009/09/21 (Mon) @ 18:01

I think ROE should just count as a hit.  Is there really a difference between Jimmy Rollins booting a ball and Derek Jeter not having the range to get to it?


#14    MGL      (see all posts) 2009/09/21 (Mon) @ 18:29

"Is there really a difference between Jimmy Rollins booting a ball and Derek Jeter not having the range to get to it?”

I don’t know if you are being serious, but there is a gigantic difference.  In one instance, the ball gets caught 98% of the time, or whatever the average fielding % is. In the other instance (a hit), the ball gets fielded 5% of the time, or whatever the percentage is for an average hit. If you really wanted to capture batter skill, which as I said, is usually the point of all these metrics, then you would give like 10% credit for a ROE and 95% for a hit, or something like that.  Since we don’t like to do that, the next best thing, by far, is 0% for a ROE and 100% for a hit, which is exactly what we do.


#15    Nick      (see all posts) 2009/09/21 (Mon) @ 18:35

I was thinking about value to your team, but if we are trying to measure ability, then you’re right.


#16    Peter Jensen      (see all posts) 2009/09/21 (Mon) @ 20:08

MGL - I don’t think it is as simple a decision as you are making it out to be.  Yes, there are some plays where a fielder completely boots an easy ball and the batter had absolutely zero input in creating the ROE.  But there are other plays where a fielder makes a small bobble and recovers enough where he throws out a slow batter, but can’t throw out a fast batter.  Shouldn’t the batter receive some credit for the time on base that he created because of his speed even though the fielder is charged with an error?  If a speedy left handed batter chooses to try to hit to the left side of the infield rather than pull the ball, does it really make any difference whether he gets on base because he hits the ball perfectly and is given a hit or whether his speed forces the fielder to rush a throw when only a perfect throw would have gotten him out, but the scorer gives the fielder an error because the throw beat the runner to first but pulled the 1B off the bag?

When you think about the way linear weights is calculated, the batter is not getting positive credit for every ROE.  He is really only getting credit for the number of ROE that he gets above an average batter.  And like every other small sample size stat, ROE only becomes meaningful if a batter can repeat the skill year after year.


#17    Terry      (see all posts) 2009/09/21 (Mon) @ 20:56

It’s fascinating that players can be so divorced from their own realities.


#18    MGL      (see all posts) 2009/09/21 (Mon) @ 21:17

Peter, I mentioned several times that there is a skill element to the ROE (obviously). But it is not enough to give full credit to every extra ROE that a player gets.  Not even close, I don’t think.


#19    Tangotiger      (see all posts) 2009/09/21 (Mon) @ 22:15

MGL, we talked about this several years ago, and I presented my research:

http://www.tangotiger.net/archives/stud0274.shtml#1006

You regress ROE about 75%, given 600PA.  That means you add 1800 PA of league average performance for regression.

For someone with a 10,000 PA career, that means regressing only 15%.
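The arithmetic behind those two figures can be sketched as follows. This is only an illustration of the "add K plate appearances of league average" rule described above; the function names and the example rates passed to `regressed_rate` are made up for the sketch, and only the constant 1800 comes from the comment.

```python
ROE_K = 1800  # PA of league-average performance to add (figure from the comment above)


def regression_fraction(pa, k):
    """Share of the estimate that comes from league average after `pa` plate appearances."""
    return k / (k + pa)


def regressed_rate(observed_rate, league_rate, pa, k):
    """Blend the observed rate with the league rate (hypothetical helper)."""
    r = regression_fraction(pa, k)
    return r * league_rate + (1 - r) * observed_rate


print(round(regression_fraction(600, ROE_K), 2))     # 0.75
print(round(regression_fraction(10_000, ROE_K), 2))  # 0.15
```

Adding 1800 league-average PA to 600 observed PA means 1800 of every 2400 PA in the blend are league average, which is the 75% figure; at 10,000 PA the same 1800 is only about 15% of the total.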


#20    Colin Wyers      (see all posts) 2009/09/22 (Tue) @ 00:24

I don’t get MGL’s stance. Why do we care how much of a “skill” it is, for the purposes of a hitter’s value? Particularly distinguishing between the “skill” of being a GB hitter, versus the skill of getting more ROE per GB?

Not everything is a projection system, and we shouldn’t try to treat our value metric like it should be.


#21    nick      (see all posts) 2009/09/22 (Tue) @ 03:06

+36 runs, Derek Jeter
+36 Jimmy Wynn
+32 Craig Biggio
+30 Hank Aaron
...
-35 Pete Rose
-36 Yaz

Wow, Pete Rose, the most hustling of all ballplayers ever to hustle, is among the worst in history in reaching on errors?  This makes no sense, I thought--

except when you remember scoring decisions it makes perfect sense.  Dude hustles--there couldn’t have been a play.....Pete would get those balls scored hits!


#22    MGL      (see all posts) 2009/09/22 (Tue) @ 03:13

Yes, Tango, I realize that virtually everything that has even a modicum of skill eventually has close to zero regression.  The issue is always at what point are you better off ignoring it or treating it with no regression whatsoever (giving it 100% weight).  Why do we use FIP and DIPS in favor of ERA or even ERC?  BABIP for pitchers eventually gets regressed 0% or pretty close to it, just as ROE does.

Colin, let’s assume for a second that there is virtually no skill to an ROE (which we all agree is not true). And I am including whether a player is a GB or FB hitter or RH or LH in his “skill,” as it should be.  It is ridiculous for virtually any reason whatsoever to include ROE in a stat for that player unless you explicitly want to report what happened.  Whenever ANYONE looks at or references a stat, they are almost always implicitly doing so in order to represent something that the player had control over.  We report a player’s OBP, or BA, or OPS, or wOBA in order to talk about how good that player is, not what that player did by virtue of a mistake by a fielder (again, I am assuming for the sake of this argument that an ROE has nothing to do with the player whatsoever).  That is true for casual fans as well as sabermetricians.  That is why even the casual fan has no problem with defensive indifference and why the sabermetrician dismisses ERA in favor of FIP or DIPS.  That is my stance and I am sticking to it 100%.

The ONLY issue, as far as I am concerned is the fact that there is SOME skill in ROE, thus you can make an argument for dismissing it in the short run, giving it 100% weight in the long run, and flipping a coin in between.

If ROE were treated as a hit or simply included in OBP, as in hits+walks+ROE divided by PA, I really would have no problem with that whatsoever.  But they aren’t and there really isn’t any compelling reason why they should be. Obviously the people that made up the scoring rules did not realize that ROE was much of a skill and they weren’t that far from being right.  Given that they thought that there was little or no skill to an ROE, is it really unreasonable for them to think, “Well, the fielder should have made the play, and a relatively easy one at that, therefore we are not going to give any credit to the hitter just because the fielder happened to make a blunder on that play.” I can’t for the life of me see how that is not logical thinking or reasoning, and at the same time, as I said, if they chose to go in the other direction, which would involve thinking, “Well, the player reached base even through no great accomplishment on his part, therefore we’ll give him credit, even though he doesn’t deserve much,” I wouldn’t have much of a problem either.
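To make the two bookkeeping choices concrete, here is a rough sketch using the "hits+walks+ROE divided by PA" version mentioned above, applied to the earlier example of three booted grounders and a strikeout. The function name and the `count_roe` flag are invented for illustration, and this simplified rate leaves out the HBP and sacrifice-fly terms of official OBP.

```python
def on_base_rate(hits, walks, roe, pa, count_roe):
    """Simplified on-base rate: (H + BB [+ ROE]) / PA. Not official OBP."""
    reached = hits + walks + (roe if count_roe else 0)
    return reached / pa


# Three routine grounders booted by the shortstop, then a strikeout.
print(on_base_rate(hits=0, walks=0, roe=3, pa=4, count_roe=True))   # 0.75
print(on_base_rate(hits=0, walks=0, roe=3, pa=4, count_roe=False))  # 0.0
```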

I guess you and Tango are not on board with awarding no RBI when a batter hits into a DP and a run scored.  After all, a run batted in is a run batted in, regardless of whether the batter deserved one or not.  Like it or not, even though the original rule makers were not as enlightened as we are today, they tried at least, to give batters credit when they thought they deserved it and no credit when they thought they didn’t. That is the reasoning behind the ROE, the sac bunt, the sac fly, the defensive indifference, the no RBI on a DP, the non-error on the back end of a DP, etc.  They weren’t perfect, but at least they were consistent…


#23    Kincaid      (see all posts) 2009/09/22 (Tue) @ 03:31

I think we should care to what degree ROE reflects a hitter’s skill even for the purpose of measuring value.  If a hitter reaches on an error 5 times more than average in a year, but not because of anything of his doing and just because by chance fielders happened to mess up more on balls he hit, then why should we give the hitter full credit for creating that value?  If you give full credit of the value of the play to the hitter, what credit do you dock from the fielder?  If we don’t care to try to separate what value a hitter produces from the influence of the other players involved in a play, then why do we bother to do that for pitching (by using defense-independent measures to determine their value)?

A hit and a ROE might have the same value in a given situation, but we are going to pin much more responsibility on the fielder for the error. That leaves less of the play’s value to assign the hitter.  Like MGL was saying, the hitter might be responsible for providing 5% or 10% or whatever it is of the value on the play for an error.  Ideally, you would know what percentage to give to each party, and I don’t see why you should just not care how the responsibility divides up and just give full credit to the batter for creating that value.


#24    Kincaid      (see all posts) 2009/09/22 (Tue) @ 07:39

Regarding the discrepancy between B-R and Retrosheet on Glanville’s ROE, there are at least three plays they differ on in case anyone still cares:

Sept 4, 1998:  B-R lists 1 ROE, Retrosheet 2

First inning, Glanville reached 2nd on an error by the first baseman.  Fourth inning, Glanville reached first on FC on sac bunt attempt, no out recorded.

I don’t know why B-R doesn’t have both, but it seems Retrosheet is right on this one.

Sept 13, 2000:  B-R lists 1 ROE, Retrosheet 0

Seventh inning, Glanville reached first on an error by the third baseman but was thrown out at second trying to stretch the play into a 2-base error.

By the rules of counting hits, it would be a ROE, but as far as being on base, he wasn’t.  I guess neither one is officially right or wrong, but as far as Glanville’s intent, we’d go with Retrosheet.

April 25, 2001:  B-R lists 1 ROE, Retrosheet 0

Ninth inning, one out, bases loaded, Glanville grounds to the pitcher, force out at home, first baseman credited with error on the catch at first.  Glanville’s presence on first is accounted for by ROE and not FC only because the error was on the catch on the tail end of the near-double-play.

Technically he reached on an error, but only because of a scoring quirk that turned a botched double play from FC to ROE.  He still made an out.

For the purposes of this discussion, at least, I would go with Retrosheet in every instance, although if you’re going to count ROE as hits and not take out all the hits where the batter-runner was thrown out taking an extra base, you might as well be consistent and count the second one.


#25    Guy      (see all posts) 2009/09/22 (Tue) @ 08:41

I would base the decision on whether to include ROE in a single year’s stats on whether it makes our assessment of the hitter’s contribution more or less accurate.  Just using some back-of-envelope calculations, it looks like the very best ROE hitters are reaching base +4 times a season.  So let’s guess the SD of true ROE talent is 2 ROE per 150 games (probably less).  If the ROE rate is about 1%, then the SD(error) is about 2.5 ROE.  So we’re adding more noise than signal by counting them in a single season of data.  I’m not sure what the right cutoff should be for including them, but I’m inclined to agree with MGL on excluding for a single season.
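The back-of-envelope numbers can be checked with a binomial approximation. In this sketch the 650 PA per season is an assumption (the comment only says "150 games"), and the names are illustrative; the 1% ROE rate and the talent SD of 2 are the guesses given above.

```python
import math

PA_PER_SEASON = 650  # assumption: rough PA total for 150 games
ROE_RATE = 0.01      # about 1% of PA, per the comment
SD_TALENT = 2.0      # guessed SD of true ROE talent, in ROE per season


def binomial_sd(n, p):
    """Standard deviation of a count with n trials and success probability p."""
    return math.sqrt(n * p * (1 - p))


def signal_share(sd_talent, sd_noise):
    """Fraction of observed variance that comes from true talent."""
    return sd_talent**2 / (sd_talent**2 + sd_noise**2)


sd_noise = binomial_sd(PA_PER_SEASON, ROE_RATE)
print(round(sd_noise, 2))                           # 2.54
print(round(signal_share(SD_TALENT, sd_noise), 2))  # 0.38
```

Under these assumptions talent accounts for less than half of the single-season variance, which is the "more noise than signal" conclusion.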

I think this is another situation like BABIP, where a valid finding for pitchers (they should be held responsible for unearned runs allowed) is being misapplied to hitters.


#26    Tangotiger      (see all posts) 2009/09/22 (Tue) @ 09:24

Since the regression point is 50% when the PA is 1800, that would be the point as to whether to include them or not for evaluation purposes (if you need a threshold line, and you want to NOT apply regression).

So, after three years, reaching on error tells you more than if you didn’t have that count.

If you want to exclude it for 1 year or 2 years, I’m ok with that.  Once you get to 4 years and more, you need to include it, because it tells you more than it doesn’t.  The signal is greater than the noise once the PA level reaches 1800 (more or less).

Same deal with BABIP for a pitcher.  The 50% mark is around, I think, 3500-4000 BIP (or roughly 7-8 seasons).  If you want to completely exclude it (i.e., regression = 100%), that’s fine for a career of up to 6 years.  If you want to completely include it (i.e., regression = 0%), that’s fine for a career of at least 9 years.

Again, if the choice is to count or not-count, you simply figure the point where the signal is greater than the noise.

AT THE SAME TIME, the regression point is 50% for a batter’s doubles per PA when the PA is 1200 (based on my previous link)!  By the same token then, you have to exclude a batter’s number of doubles if the number of PA is under 1200.  Singles is at PA = 300.  If you have fewer than 300 PA, exclude singles altogether.

Again, only if you are in a count / no-count mode.
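In count / no-count mode the rule reduces to comparing the sample size with the PA level at which regression is 50%. A minimal sketch, using only the three batter thresholds quoted above; the dictionary, the function names, and the 600 PA per season conversion are illustrative choices, not anything from the linked research.

```python
# PA at which regression toward the mean is 50% (figures quoted in this comment)
HALFWAY_PA = {"ROE": 1800, "doubles": 1200, "singles": 300}


def count_it(stat, pa):
    """True once the signal is at least as large as the noise for this stat."""
    return pa >= HALFWAY_PA[stat]


def seasons_to_threshold(stat, pa_per_season=600):
    """Seasons needed to reach the 50% point, assuming 600 PA per season."""
    return HALFWAY_PA[stat] / pa_per_season


print(count_it("ROE", 600))         # False
print(count_it("ROE", 2400))        # True
print(seasons_to_threshold("ROE"))  # 3.0
```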


#27    Peter Jensen      (see all posts) 2009/09/22 (Tue) @ 09:26

Guy - Jeter has 47 ROE from 2005-2008.  The top ten in ROE rate are averaging between 7 and 8 ROE per 500 AB for the 4 years.  The average for all batters who had more than 1200 AB for the 4 years is 5 ROE per 500 AB.  The bottom ten in ROE averaged 2.2 per 500 AB for the 4 years.  It is definitely a skill, just not a very important one.  If you are trying to find a player who will help you make the playoffs in the next year you can probably safely ignore it.  But if you are negotiating a contract for a player who has the skill and you ignore it, you are leaving at least $400,000 a year on the table.  Thank goodness MGL is not a player agent.


#28    Tangotiger      (see all posts) 2009/09/22 (Tue) @ 09:37

Kincaid: the programming to figure out whether to count a ROE (or especially FC) as a “safe” play is fairly involved.

Since we count a single-out-at-2B as a “time on base”, then clearly we need to count ROE and FC where there were no outs made by the time the batter reached 1B, but there were outs after that point (runners run at their own risk) as a “time on base”.  It is not the most straight-forward thing to program, especially if there’s no real indication of which base the runner made the out (was it a non-force, or force?).

What I do seems to be what Retro does: if by the time the play is over, there were no outs recorded, then the FC counts as a time on base.  All other plays count as outs.

So, reaching 1B safely, and then have a runner thrown out on a non-force play I do count as an out, when I should count it as a time on base. 

The differences of 1 or 2 plays for someone’s career is reason enough to not get too worked up about getting it right.
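A stripped-down version of that rule might look like the sketch below. The `Play` record and its fields are hypothetical (real play-by-play parsing is far more involved), and applying the same "no outs by the end of the play" test to ROE as well as FC is an assumption made here for illustration. The sample plays are the Glanville cases Kincaid listed.

```python
from dataclasses import dataclass


@dataclass
class Play:
    kind: str          # "ROE" or "FC" (hypothetical labels)
    outs_on_play: int  # outs recorded by the time the play is over


def is_time_on_base(play):
    """Count the play only if the batter reached and nobody was put out."""
    return play.kind in {"ROE", "FC"} and play.outs_on_play == 0


plays = [
    Play("ROE", 0),  # 1998-09-04, 1st inning: reached 2nd on 1B error
    Play("FC", 0),   # 1998-09-04, 4th inning: FC on sac bunt attempt, no out
    Play("ROE", 1),  # 2000-09-13: reached on error, thrown out at second
    Play("ROE", 1),  # 2001-04-25: force out at home, error on the catch at first
]
print([is_time_on_base(p) for p in plays])  # [True, True, False, False]
```

As noted above, this simple rule undercounts the case where the batter reaches safely and a runner is later retired on a non-force play.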


#29    Guy      (see all posts) 2009/09/22 (Tue) @ 09:38

Thinking a bit more on this, my calculations (and Tango’s, I think) assume that the two possible outcomes on these plays are ROE and out.  But that’s not quite right:  some of these errors might be scored as hits by another scorer, on another day, or for a different fielder.  So there are really 3 outcomes.  What we really want to know, I think, is which has the higher y-t-y correlation for a hitter:  hits, or hits+ROE?  It could be the latter.....

(Similarly, in Tango’s doubles example, we wouldn’t want to throw out the doubles info because knowing TB/SLG gives us much more information, even in a single year, than BA alone.)


#30          (see all posts) 2009/09/22 (Tue) @ 13:46

What about BABIP?  Is there a connection between BABIP and ROE?  Do people who have a ‘talent’ for a high BABIP have a ‘talent’ for a high ROE?


#31    Tangotiger      (see all posts) 2009/09/22 (Tue) @ 14:28

crack: PLEASE, do me a favor (if you think I’ve earned it), and change your handle (and don’t put your email address if it’s going to have crack in it).  My software keeps trapping it as spam, and I’d rather not always have to keep moderating the posts because of it.


#32    MGL      (see all posts) 2009/09/22 (Tue) @ 15:51

This whole discussion (about what to do with ROE) underscores the notion (that I have been preaching for a long time) that when you look at a player’s stats (any stat) with an eye on making a judgment about how good that player is or about anything going forward (which is why we look at or talk about stats, 95% of the time), we always need to think in terms of the regressed version of those stats.  I know that sounds nerdy, but it is very true.

As most of you know, I absolutely HATE when anyone says or writes, “So-and-so is hitting X this year or this month, therefore...”

Basically, if you want to give us a number with no regard to the sample size of that number, give us a regressed number, and then we can talk.

I realize that that isn’t particularly practical, but at least when you look at a number, mentally include some kind of regression, given the sample size and given the “skill component” (spread of skill in the population) of the underlying data (the more “skill,” the less the regression).  And of course, don’t ignore other relevant numbers or cherry pick numbers.  For example, in telling us how someone did this year or the last 3 months (as a proxy for how “good” they are, or what we expect them to do in the near or far future), the crime is not so much not regressing those numbers, the crime is ignoring all the other history of that player…


#33    SirKodiak      (see all posts) 2009/09/23 (Wed) @ 03:23

MGL,

Have you ever given any thought to doing a sort of reverse UZR for batters?  What I mean is using the data gathered from UZR and looking at batter performance rather than fielder performance?  For example, a batter’s batted ball distribution actually resulted in x runs, but should have resulted in y runs (his batting skill).


#34    MGL      (see all posts) 2009/09/23 (Wed) @ 22:08

#33, sure.  I’ve thought about it.  Someone else may have already done it.  I’m not sure.  What you would find is what you would expect.  Half (not exactly half; in fact, I don’t know the proportion) of the difference between expected and actual would be luck (e.g., a line drive caught or a pop fly that drops in) and half would be skill (e.g., all of Pujols’ “hard” GB or line drives, according to BIS, would actually be harder than Eckstein’s).  The biggest problem that would sort of make the entire effort fruitless is that you don’t know the position of the fielders.  So, for example, it would look like lots of those short fly balls by Eckstein should have dropped but were actually caught, and it would look like some of Pujols’ long fly balls and line drives should have fallen over the heads of the OF’ers, but of course they would generally be playing deeper against him and other power hitters.  Not to mention where the fielders are playing laterally, shifts, and the like.

So, while you could do “something” like a reverse UZR using the PBP data, you would have to really tweak the methodology to come up with something useful.

That is one of the problems with PZR for pitchers as well.  Not as problematic as with pitchers, but definitely problematic along the same lines.

Once we know the positions of all the fielders and real objective data on the batted balls, then we can pretty much dispense with actual stats for all players and substitute what I like to call “virtual stats.”


#35    Peter Jensen      (see all posts) 2009/09/23 (Wed) @ 22:59

SirKodiak - I think you are describing something like what I wrote about here. http://www.hardballtimes.com/main/article/using-hitf-x-to-measure-skill/

