
THE BOOK--Playing The Percentages In Baseball


Wednesday, March 11, 2009

The Fielding Bible

By , 12:16 AM

I am starting this thread for us to review and critique the new (second edition I think) Fielding Bible by John Dewan (with Bill James).  I just started reading it.  Much of it is data on the players but there is plenty of commentary.  It is great stuff.  Some of it is brand new.  There is way more stuff in it than in the first edition.

I do have some problems with it, which I will articulate after I read some more of it and think about and digest what I have read.  I’ll start by saying that the section by Bill James, as much as I love his writing, is not the model of clarity and consistency that I would have liked.  I’ll give you a brief example which I’ll explain in more detail in another post in this thread. In the second section, he talks about his Defensive Misplays system, which is a very good concept.  He spends a lot of time telling us how objective it is, as opposed to say, errors, which he does not like, partly because they are subjective.  However, as you read more about his Defensive Misplays (DM’s), it becomes clear that many of them are very subjective, or at least not as objective as he claims DM’s are in general.  He even becomes sort of apologetic at times about the subjectivity of some of the DM’s and keeps trying to make a case for them not really being that subjective.  I find much of those “cases” unconvincing.  Not a big deal though.

Another thing which is confusing to me is that some of the DM’s seem to already be recorded as errors, officially, and others are not.  Yet at some point he talks about combining DM’s and errors.  Since some of the DM’s are already errors, I don’t know why you would or could combine them, since you would be double counting some of the errors.

One more thing:  At the beginning of his first essay explaining some of the metrics in the book, he gives us a list of the run value of each plus minus point for the various fielding positions.  It appears that the value of a point is simply the sum of the average value of a hit plus error at that position and the average value of an out, taking both as positive numbers.  In other words, for a SS, the average value of a hit plus error is around .5 runs and the average value of an out is around .27 runs (ignoring the minus sign).  So the sum is .77, which is around what he has for most of the IF positions - and that makes sense.  For the OF positions, though, he uses .56 and .58 runs.  Apparently those numbers are NOT the sum of a hit (plus error) and an out, and I have no idea how they are derived.
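A quick arithmetic sketch of the reading above, using only the approximate SS values quoted in the post (.5 runs for a hit plus error, .27 runs for an out). The function name is made up for illustration, and this is one interpretation of the book's numbers, not a method the book confirms.

```python
def runs_per_point(hit_value, out_value):
    """Run swing between a play not made (hit/error) and a play made (out).

    Both values are taken as magnitudes, i.e. the minus sign on the
    out value is ignored, as described in the post.
    """
    return abs(hit_value) + abs(out_value)


# Approximate SS values from the post: hit plus error ~ .5, out ~ -.27
ss_point_value = runs_per_point(0.50, -0.27)
print(round(ss_point_value, 2))
```

On this reading an infield point is worth roughly .77 runs, so the .56 and .58 figures for outfielders have to come from somewhere else.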

Then he also says, “for the corner infielders and the outfielders, we use enhanced plus minus.” As opposed to the regular plus minus for the other infielders, I guess.  I thought I knew the difference between regular and enhanced plus minus, but I have no idea what he means with that sentence in the context in which it is written. And nowhere in the book prior to this does he even talk about the difference between regular and enhanced plus minus, I don’t think.

OK, I wasn’t going to yet, but I’ll give you one example of what I mean by being “subjective” with the DM’s.  There are 54 DM’s.  One of them is “wild pitches that the catcher should have caught.” As I said, he talks a lot about how the DM’s are hardly subjective at all.  Well, you can’t get much more subjective than deciding whether a catcher “should have” caught a ball that was scored a wild pitch.  Certainly not much less subjective than whether a batted ball “should have been” turned into an out by a fielder, which is what OS’s do when awarding an error or not.  As I said, he really emphasizes that errors are too subjective to be particularly meaningful, but that his DM’s are not.  Now, it is not a big deal, and I think that DM’s are great, but I don’t like arguments that are not sound and I don’t like someone I admire and respect being less than 100% intellectually honest.  I’ll leave it at that for now.

As you will see in a later post, one of the problems with all of his categories (DM’s, great plays, plus minus, etc.) is that they often overlap (like DM’s and errors) to some degree and he does not put them together properly for us to get a comprehensive, accurate picture of a fielder’s overall defensive ability or performance.  You obviously can’t just add everything up when those things overlap, otherwise you end up double or triple (or more) counting some things.


#1    David Gassko      (see all posts) 2009/03/11 (Wed) @ 01:09

Hey Mickey,

“Enhanced” plus/minus is the number of bases (instead of plays) that a player saved or cost his team, which is why the run values for the outfielders are lower. Regular plus/minus says a hit saved is a hit saved, but enhanced plus/minus recognizes that a hit saved is sometimes a single, sometimes a double, and sometimes even a triple.


#2          (see all posts) 2009/03/11 (Wed) @ 01:19

Just to clear things up a little bit… is Plus/Minus park-adjusted?


#3          (see all posts) 2009/03/11 (Wed) @ 01:27

In my system, if there’s a ball down the line past 1b or 3b, I give them the average number of bases for that type of base hit, the cost of a typical groundball down the line. The actual number of bases +/- the average I assign to the outfielders, as I give them the responsibility for getting to the ball and getting it back in, once the ball is out of the infield. Is that how TFB does it, or do they credit all the actual bases to the infielders?


#4    MGL      (see all posts) 2009/03/11 (Wed) @ 03:11

No disrespect to James and Dewan, but if you are publishing a book, and really smart people on a blog have to explain things in the book to other really smart people…

Anyway, I am midway through the section on catcher ERA.  Bill James writes a long essay about McCarver and Carlton.  I can’t say that much of it was really necessary (although some of it is interesting in and of itself), and I take exception to some of the things that Bill writes about using sample data to “prove” an assertion or to make a conclusion.  To wit (Carlton’s ERA was much better with McCarver than with other catchers, for 1676 IP with and 3541 IP without):

OK, so Carlton was much better with McCarver than other catchers.  So what?  What does that prove?
Actually, it doesn’t prove anything.  I was hoping that the data would shake out so that it could be analyzed to reach a definitive conclusion as to whether Carlton was better with McCarver or not.  In fact, though, it doesn’t…

There are a lot of things wrong with those sentences.  One, Carlton WAS better with McCarver than without.  That is a given.  What we want to know is whether there is any statistical evidence as to whether that was by chance or not.

Unfortunately, you can rarely study one set of sample data and get evidence of anything.  You have to look at lots of sample data and then make inferences about individual players.  Even if the difference between Carlton with and without McCarver were “off the charts” (in terms of the chances that the difference could occur randomly), we still really wouldn’t know anything - for two reasons.  One, this combo was likely cherry picked by James.  It was probably not a random combo in which he had no prior knowledge as to whether Carlton pitched better with McCarver or without.  Two, first you have to (if you can) determine whether calling a game is likely a skill among ALL major league catchers - if it is not, then no matter how unlikely any given split is for any catcher/pitcher combo, you HAVE TO conclude that it is random!

“to reach a definitive conclusion...” (That is what Bill wrote.) Huh?  Since when can you reach a definitive conclusion from sample data?  You can’t.  Ever.  You can reach conclusions that have a very high degree of reliability, but even that is atypical in these kinds of analyses.  That is why when I analyze sample results I am careful to use words like “suggest”, or perhaps “strongly suggest.” That is about all you can do from sample data. Bill really needs to understand this better, I think.

Bill also writes:

There are two possibilities: a) that Carlton was better because McCarver caught him, and b) that McCarver happened to catch Carlton when Carlton was having good years...I think, in all honesty, that the better argument is for a)…

I don’t know what “in all honesty” and “I think” mean, but these words should not be in a statistical analysis, which is what Bill is attempting to do.  Again, he would have no idea as to the answer to that question unless he at least looks at the spread of catcher ERA skill in the whole population of catchers.  (Or at least does an analysis of how likely the Carlton with and without split would be by chance - as it is possible that, even with no evidence of a spread in catcher ERA true talent, there might be a few catchers who possess such a skill while everyone else does not, although I doubt that is the case.)


#5          (see all posts) 2009/03/11 (Wed) @ 04:52

I don’t mean to be quite so negative about the Fielding Bible. It is a great book, with great research in it.  We should all be grateful that John (and Bill) is generous enough to take a body of work that was extremely time consuming and expensive to collect and make the results available to the public.  I highly recommend the book to anyone who is interested in learning more about advanced and creative ways to evaluate and measure defense.


#6    Darren      (see all posts) 2009/03/11 (Wed) @ 08:58

I thought it was a very good book. I agree there is a lot of subjectivity around the Misplay Events. My problem with it was Dewan converting it to Misplays per Touches ratio. To me this makes no sense, as certain misplays did not require the fielder to even touch the ball. Perhaps they should amend it to Misplays per Expected Out. Other than that I think it was the best book to come out this offseason


#7    Tangotiger      (see all posts) 2009/03/11 (Wed) @ 09:34

We also have this thread on The Fielding Bible:
http://www.insidethebook.com/ee/index.php/site/comments/fielding_bible_excerpt/

In there, we have a longer discussion on the “enhanced” plays as for why the run values are what they are, which David has already summarized in the current thread.


#8          (see all posts) 2009/03/11 (Wed) @ 12:16

I understand Tom and mgl’s points, but in defense of the book (and I have no stake one way or another), it isn’t intended to be a scholarly presentation of the material.  In fact, I bet the authors would not claim to be scholars at all.  Bill James has often said he is not a real statistician.

mgl is right that words like “definitive conclusions,” “I think” and “in all honesty” don’t belong in a scholarly work, but these are not things that James (or his editors) are likely to put as much stock in.  They want to sell some of these books at Barnes and Noble to people who don’t spend a chunk of time on the sabermetric (or “quants”) blogs.  There’s going to be some looser, more common language in there, and arguably there should be when it is intended for a broader audience.

Now if he was publishing in a statistical journal, that’s a different ballgame.

Also, the “writing for the masses” excuse doesn’t work for the subjectivity of the DMs, the cherrypicking of Carlton/McCarver and certain other points made by Tom and mgl.  Those are just shortcomings of the system or the analysis.


#9    Tangotiger      (see all posts) 2009/03/11 (Wed) @ 12:27

I actually made no comment whatsoever on the matter of the contents or words of the book, seeing that I have yet to read it.  So, you should strike anything that you said with regards to me.  MGL is MGL, and we have no editorial relationship in the least, outside of The Book itself.

I also don’t think it’s a question of being “scholarly” as I read MGL.  It’s standing behind the words you write, as you have written them.

Bill is quite capable of defending himself, if he thinks MGL’s point-of-view contradicts Bill’s words.  Bill is a big boy.  No need to try to defend Ray Liotta from Joe Pesci.


#10    MGL      (see all posts) 2009/03/11 (Wed) @ 17:41

Chapel, fair points.

Tango said this:

I also don’t think it’s a question of being “scholarly” as I read MGL.  It’s standing behind the words you write, as you have written them.

Yes, there is a difference between being intellectually honest, i.e., standing behind your words, and being “technical” or not.  I probably wrote that wrong, but I agree with Tango 100%, and that WAS my point.

I think that James and Dewan would be the first ones to claim that anything they say is supposed to be honest and correct and that they should stand 100% behind their words.

So, I have no quarrel with a book like that being tailored toward a certain kind of audience, but I do have a problem when words contradict facts and known concepts.  I will admit, however, that when you are writing in a more casual style, that there is a potential for some questionable wording in a technical sense.  No big deal though.  As I said, it is a terrific (in a good way) book. 

One more thing.  I really think that a casual baseball fan would have a really hard time getting through a lot of the details in the book.  That is not a reason, I don’t think, for them not to buy it. I just want to make it clear that it is not devoid of a lot of fairly technical and hard to understand stuff.

And, as I already said, one of my general criticisms is that some of the explanations are a little disjointed and not so easy to understand, such as in one of the very first sections, it is not clear AT ALL what those run values for each position represent.  I don’t think that any casual baseball fan is going to understand them.  Some of the brilliant people on this web site are still trying to figure out how they were derived.

As far as the DM and GFP per touch, I agree, and that is confusing as well.  Also, there is apparently some disagreement between James and Dewan as to whether they should be presented as per inning or per touch.  Actually they end up presenting both, I think, which is fine.  For something like that, if you want to be able to compare one player to another, of course you want to normalize everything to something like “per touch” so a player does not get penalized or rewarded for happening to have more or fewer opportunities than another player or than average.  And certainly if you want to take their numbers and turn them into “true talent” (by aggregating several years, weighting and regressing, etc.), you want to make sure that everyone is on the same “scale.” On the other hand, if you just want to present what someone DID and what theoretical value they had to their team, in a retrospective analysis, then something like per inning or game is fine.  IOW, you are allowing them to get credit or blame if they happened to have a lot or few opps as compared to another or the average player.

I’ll mention another thing which is not clear at all in the beginning sections, especially the one on DM’s.  They talk a lot about awarding a DM or not if there is “no harm or foul.” IOW, if a player makes a mistake but nothing bad happens, like a batter hits a single, the outfielder makes a mistake, but the runner does not advance.  I have no idea whether they are giving the fielder a DM or not.  It seems like sometimes they say they are, regardless of what actually happened, and sometimes they say that they are not. Granted, I could probably clear up my confusion by re-reading the material once or twice more, but I should not have to do that for something as basic as that.

As well, when you are doing this kind of analysis, you really should NOT base whether you credit or dock a player with what actually happened, result-wise.  That should be obvious.  Now it is OK to present two versions of a metric - one which represents what actually happened (for retrospective value purposes - as in “MVP type” analyses), and one which has predictive value.  Or at the very least, you should make it clear that one approach has more predictive value than the other.  The way they handle this sort of thing is VERY unclear and confusing in the book. Anyone else have the same issue?


#11    Brian Cartwright      (see all posts) 2009/03/11 (Wed) @ 17:52

OK, the batter hits a single and the outfielder bobbles it. This time, there was no advance, but on all the similar bobbles by all the outfielders, there is a certain probability that the batter or a runner will get an extra base. We record the actual result in order to calculate the expected result, and can report the actual results per player, but it’s the mean expected result that tells us about the outfielder’s true talent. Did I interpret you correctly? If so, I do think this is the proper way to do it.
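To make the actual-versus-expected distinction concrete, here is a minimal sketch. The league sample and the player's line are invented numbers, and the function name is hypothetical; it only illustrates charging a fielder the league-average cost of a bobble rather than what happened to follow it.

```python
def expected_extra_bases(league_outcomes):
    """Mean extra bases taken across all comparable bobbles in the league."""
    return sum(league_outcomes) / len(league_outcomes)


# Made-up league sample: extra bases taken on each of 10 similar bobbles
league_outcomes = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
cost_per_bobble = expected_extra_bases(league_outcomes)

# Made-up player: 4 bobbles, and by luck nobody advanced on any of them
player_bobbles = 4
player_actual = 0
player_expected = player_bobbles * cost_per_bobble

print("actual extra bases allowed:", player_actual)
print("expected extra bases allowed:", round(player_expected, 2))
```

The actual figure describes what happened; the expected figure is the one that says something about the outfielder going forward.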


#12          (see all posts) 2009/03/11 (Wed) @ 19:38

Sorry Tom, I don’t know how to edit prior posts, so I’ll have to redact your name this way.  I inadvertently lumped you in.

I know mgl didn’t use the term “scholarly”—I just think that’s his approach to analysis.  I don’t think of it as a negative.  I actually like that approach, as do most of the readers here.  Though I think it might cause us to set the bar a little high for some publications.  And for politicians, of course.  :)

Many people are familiar with the term “rainmaker” as it applies to lawyers.  The rainmaker is a lawyer who is smart, and is a generally good and wise lawyer, but now armed with some experience spends lots of time getting clients and conceptualizing.  Rarely does he/she do the actual drafting of complaints and documents where technical skill is required and some deeper issues emerge.  There are people in his/her firm better qualified for such work, for many reasons.  Some of them are also excellent idea generators in their own right.

I think of Bill James as a sort of sabermetric rainmaker at this stage of his career.  Not that he is the sole source of ideas, or that anyone on this blog (or elsewhere) is bereft of new ideas. Nor that he is unskilled. It’s just that there are younger, more qualified soldiers to do some of the heavy lifting.  James asks good questions.  His methods of resolving them are usually interesting, and provoke discussion, but may not in some cases resolve the issues to everyone’s satisfaction once the soldiers start poking around.

His thoughts do, however, serve as a jumping off point for all kinds of deeper analysis—arguably, the entire proliferation of sabermetrics.  Tom, and mgl, and everyone else in this business or quasi-business dig deeper. 

In no way do I mean this as disrespect for Bill James. Nor did my earlier comment (or this one) seek to stifle any other speech.  I just think there are different audiences, different approaches, different skill sets (etc.) and they all have their own merit in one way or another.


#13    MGL      (see all posts) 2009/03/11 (Wed) @ 19:58

I think of Bill James as a sort of sabermetric rainmaker at this stage of his career.  Not that he is the sole source of ideas, or that anyone on this blog (or elsewhere) is bereft of new ideas. Nor that he is unskilled. It’s just that there are younger, more qualified soldiers to do some of the heavy lifting.  James asks good questions.  His methods of resolving them are usually interesting, and provoke discussion, but may not in some cases resolve the issues to everyone’s satisfaction once the soldiers start poking around.

His thoughts do, however, serve as a jumping off point for all kinds of deeper analysis—arguably, the entire proliferation of sabermetrics.  Tom, and mgl, and everyone else in this business or quasi-business dig deeper. 

Very well put!

Brian, of course that is the correct way to do it, and is exactly how it is done with his basic plus/minus.  How he does it with his other metrics (like DM and GFP), I am not sure.  I’d have to re-read the book, and even then, I am not sure how clear it is going to be.  To be honest, some of the writing in the book is a cluster****.


#14    MGL      (see all posts) 2009/03/13 (Fri) @ 17:20

Dewan has pitchers like Maddux and Rogers saving 10 to 15 runs in some seasons.  The best SS and CF in their best seasons save 30 runs (which is just a sample - not true talent).  That is in around 1200 innings.  A starting pitcher pitches 200 innings or so.  Saving 10 runs in 200 innings would be equivalent to a full-time player saving 60 runs.  How is it possible for a pitcher to save 10 runs in a season in 200 IP?  I have not looked at the underlying numbers, but that does not seem possible.  Dewan has Maddux saving 25 runs in the last 3 years. That would be the greatest fielding performance (per inning) for any position of all time I would think.  The equivalent would be a position player saving 150 runs in a 3 year period.
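The scaling in this comment is simple pro-rating by innings. A small sketch, assuming (as the comment does) about 200 innings a season for a starter and about 1200 for a full-time position player; the function name is made up for illustration.

```python
def prorate_runs(runs_saved, innings, target_innings=1200):
    """Scale runs saved to a different number of innings at the same rate."""
    return runs_saved * target_innings / innings


# 10 runs saved in a 200-inning season, scaled to 1200 innings
one_season = prorate_runs(10, 200)

# 25 runs saved over three 200-inning seasons, scaled to three 1200-inning seasons
three_seasons = prorate_runs(25, 3 * 200, target_innings=3 * 1200)

print(one_season, three_seasons)
```

That gives 60 runs for the single season and 150 for the three-year stretch, which are the equivalents quoted above.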


#15    BenJ      (see all posts) 2009/03/13 (Fri) @ 18:10

MGL,

My theory:  the difference between the best (Maddux, Rogers) and worst (Randy Johnson, Daniel Cabrera) pitchers defensively is extreme compared to the range at other positions. 

The way Plus/Minus is calculated, this would increase the spread between the best and worst pitchers, above/below average, and rightfully so.

We might even have observational confirmation of this.  Look at the Gold Glove winners.  Maddux and Rogers dominate the GGs, and they also dominate Dewan’s leaderboards (particularly if you remove Maddux’s horrible baserunner holding).  The other recent GG winners, Santana and Mussina, also fare well in the Plus/Minus Runs Saved component (though Mussina takes a hit like Maddux in holding runners).

What is clear to the voters is also clear in Dewan’s numbers.  How often does that happen?!?  (See:  McLouth, Nate)


#16    MGL      (see all posts) 2009/03/13 (Fri) @ 19:58

BenJ, I would have to look at the numbers.  I would not doubt that the difference between the best and worst pitchers is greater than at other positions, however, I question whether there are enough opportunities to make that much difference in only 200 innings.  But, as I said, I have to look at the numbers myself.

What I suspect is happening is that the good pitchers are snagging a lot of ground balls (comebackers) that other fielders, primarily the 2B and SS, would get to anyway, which is distorting the numbers.


#17    Tangotiger      (see all posts) 2009/03/13 (Fri) @ 22:08

The average pitcher makes .030 to .035 outs per BIP.  Maddux and Rogers are at around double that rate.

If you give them 600 or 700 BIP, that makes them +20 plays per season.

HOWEVER… if they don’t make the play, there’s a good chance their infielders will.
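The "+20 plays" figure can be checked with the ranges given in this comment. A rough sketch, assuming "double that rate" means exactly twice the average; the function name is hypothetical, and it deliberately ignores the caveat that an infielder often makes the play anyway.

```python
def extra_plays(bip, avg_rate, multiple=2.0):
    """Plays made above an average pitcher over a given number of balls in play."""
    return bip * avg_rate * (multiple - 1.0)


# Ranges from the comment: .030 to .035 outs per BIP, 600 to 700 BIP
results = {
    (bip, rate): round(extra_plays(bip, rate), 1)
    for bip in (600, 700)
    for rate in (0.030, 0.035)
}
for (bip, rate), plays in sorted(results.items()):
    print(bip, rate, plays)
```

The results run from about 18 to about 24.5 extra plays, so "+20 per season" sits in the middle of the range before any discount for plays the infielders would have made.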


#18    Brian Cartwright      (see all posts) 2009/03/13 (Fri) @ 22:10

mgl, I think that is an excellent point. For most fielders, the assumption is that if they fail to make a play, the batter reaches, and that is the linear weight attached to the event. In the case of the pitcher, some of the balls that one pitcher fails to get to, and that Maddux or Rogers may get, end up being outs anyway. These probabilities need to be derived, and an appropriate linear weight generated.


#19    MGL      (see all posts) 2009/03/13 (Fri) @ 22:55

That is why my system is perfect for pitchers (even though I don’t derive UZR for pitchers, for some reason - maybe I will).  If a pitcher fields a ball that is always fielded by someone, he does not get any credit, I don’t think (maybe they get some credit - I forgot).  He shouldn’t of course (get full credit - maybe a little).  He might be a great fielder and that is why he is snagging these balls, but if there is no reason to snag them, then we should not give a pitcher any credit for those catches.  A pitcher should only be given credit for those balls that would otherwise be hits, primarily like those up the middle, although I’m sure there are a few squibblers that the good fielding pitchers get to but other fielders can’t. And I would think that the better fielding pitchers turn more DP’s.  But overall, I don’t think any pitcher is going to save 15 runs in 200 IP.

Now, if a pitcher moves to another position, like SS or 2B, then we might want to revisit how we treat pitcher fielding metrics, if we want to use them to predict how they would do at another position, but I don’t think that happens very often (pitchers going to another infield position).

I also wonder if under Dewan’s system if the infielders behind the Maddux’ and Rogers’ are getting shortchanged.  Again, in UZR, a fielder does not ever get docked if someone else makes the play.  This is a good example of why I do that (although there is probably a better way of doing it, which is a combination of the two methods).


#20    Brian Cartwright      (see all posts) 2009/03/14 (Sat) @ 04:16

How about grouping ground balls by one of 18 vectors, how hard hit (sharp, soft, unspecified), and batter handedness? What pct of the time does the pitcher field the ball, and to what result? Then, for balls that get fielded by another infielder (3, 4, 5, or 6), how many are outs and how many are hits? That should establish the baseline odds and thus weights for each batted ball type that could involve the pitcher. That way, failing to get the one hopper right at the mound will cost a lot, because those will mostly be hits up the middle, while similar balls to the left or right of the pitcher shouldn’t hurt nearly as much if the pitcher doesn’t field them, as they are more likely to be fielded by the SS or 2B.
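A minimal sketch of that bucketing idea. The record layout, the convention that fielder 0 means the ball got through the infield, and the sample plays are all assumptions made for illustration; nothing here comes from Dewan's or anyone else's actual data.

```python
from collections import defaultdict

PITCHER = 1  # standard scoring position number for the pitcher


def bucket_baselines(plays):
    """Summarize ground balls by (vector, speed, batter hand) bucket.

    plays: iterable of (vector, speed, bat_hand, fielder, is_out), where
    fielder is a position number (1 = pitcher, 3-6 = other infielders)
    or 0 when no infielder fielded the ball (an assumed convention).
    """
    # per bucket: [all balls, fielded by pitcher, not fielded by pitcher, outs among those]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for vector, speed, bat_hand, fielder, is_out in plays:
        c = counts[(vector, speed, bat_hand)]
        c[0] += 1
        if fielder == PITCHER:
            c[1] += 1
        else:
            c[2] += 1
            c[3] += int(is_out)

    baselines = {}
    for key, (n, by_pitcher, not_pitcher, outs) in counts.items():
        baselines[key] = {
            "pitcher_share": by_pitcher / n,
            # how often the ball becomes an out when the pitcher does not field it
            "out_rate_without_pitcher": outs / not_pitcher if not_pitcher else None,
        }
    return baselines


# Made-up sample: vector 9 is right at the mound, vector 7 is off to one side
sample = [
    (9, "sharp", "R", 1, True), (9, "sharp", "R", 0, False),
    (9, "sharp", "R", 0, False), (9, "sharp", "R", 0, False),
    (9, "sharp", "R", 6, True),
    (7, "sharp", "R", 1, True), (7, "sharp", "R", 6, True),
    (7, "sharp", "R", 6, True), (7, "sharp", "R", 6, True),
    (7, "sharp", "R", 0, False),
]
baselines = bucket_baselines(sample)
for key in sorted(baselines):
    print(key, baselines[key])
```

In the invented sample, a ball at the mound that the pitcher misses becomes an out only a quarter of the time, while the one to the side is an out three quarters of the time, so missing the first should cost the pitcher far more.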


#21    MGL      (see all posts) 2009/03/15 (Sun) @ 00:38

Brian, sure, it should not be that hard to solve or at least greatly mitigate this potential problem, by looking at the data more closely.  As closely as Dewan looks at the data (a lot more closely than I), you would think that he already has this figured out (whether he is giving too much credit to certain pitchers for fielding balls that would be caught by other fielders).

Now that I have had time to think about it, when I give credit to a fielder for catching a ball, the most credit they can get is the difference between one complete out and the percentage of time that any fielder makes the play, which I think is the correct way to do it.  For example, if a ball in a certain bucket (speed, location, handedness of batter, etc.) is caught 100% of the time by someone, no one gets any credit for catching it.  If a ball is caught 90% of the time by someone, the most credit anyone can get is 10% of a catch, and only that much if they virtually never make that play themselves.  So if a certain ball up the middle is going to be caught by either the pitcher or another fielder 90%+ of the time anyway, and the pitcher makes the catch, he is not going to get much credit, again, which is the correct thing to do.  So if Maddux and Rogers et al. catch twice as many balls as the average pitcher, and lots of those balls were going to be caught anyway a large % of the time, they would not receive much credit in UZR.
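The ceiling described here can be written down directly. This sketch covers only the cap (one out minus the rate at which anyone makes the play); how much of that cap a given fielder actually receives is not spelled out in this comment, so it is left out. The function names and the 20% pitcher-only rate are made up for illustration.

```python
def max_catch_credit(p_anyone):
    """Most credit any fielder can get for a catch in a bucket.

    p_anyone: fraction of balls in the bucket turned into outs by ANY fielder.
    """
    if not 0.0 <= p_anyone <= 1.0:
        raise ValueError("p_anyone must be between 0 and 1")
    return 1.0 - p_anyone


def position_only_credit(p_position):
    """Credit if the play is measured only against the fielder's own position."""
    return 1.0 - p_position


# Ball always caught by someone: no credit available
print(round(max_catch_credit(1.00), 2))
# Ball caught 90% of the time by someone: at most a tenth of a catch
print(round(max_catch_credit(0.90), 2))
# Same ball judged against pitchers alone, assuming pitchers get it 20% of the time
print(round(position_only_credit(0.20), 2))
```

The gap between the last two numbers is the concern raised earlier in the thread: judged against pitchers alone, a routine comebacker looks like most of a play saved, while judged against the whole infield it is worth very little.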

I have no idea how Dewan does it however.  He may have explained exactly how he does it and he may have not.  I don’t know.  As I have indicated, I don’t think his and James’ strong suit, at least in the book, is explaining their various methodologies, among other things.


#22    Tangotiger      (see all posts) 2009/03/15 (Sun) @ 12:41

Somebody page SABRMatt and give him MGL/21’s response, because that’s about as clear as it gets in how we handle the zone-issue, without being tied to the zone-player, which as I understand it is his biggest problem with zones.


#23    Peter Jensen      (see all posts) 2009/03/15 (Sun) @ 14:11

Somebody page SABRMatt and give him MGL/21’s response, because that’s about as clear as it gets in how we handle the zone-issue

That’s as clear as it gets?  Are you kidding me?  Let me list the things I still don’t know about how UZR handles zones.

1. Just how many “buckets” does UZR use? For infielders ground balls is it the 22 angular zones that STATS has for between the foul lines times 3 different hit ball speeds times 2 types of handedness for the batter?  What is the etc. in (speed, location, handedness of batter, etc.)?  Is there any Park Adjustment for infield ground balls?

2. How are the league averages for each bucket calculated?  Over one year, multiple years?  Are hit balls that are in the shortstop’s angular zone but are fielded by the pitcher, catcher or third baseman deducted from the denominator in calculating the league percentage of plays that a shortstop should make in that zone?  If yes, is it all hit balls fielded by the pitcher, catcher or third baseman, or just those fielded for outs, that are removed?

3. If a ball is hit up the middle in the “bucket” where the shortstop makes the play 75% of the time and the pitcher makes the play 20% of the time and the ball ends up being a hit how does UZR score that ball for both the pitcher and the shortstop?  Are those numbers different in UZR if the ball is fielded by the pitcher for an infield hit?

4. In the sentence “If a ball is caught 90% of the time by someone, the most credit anyone can get is 10% of a catch, and only that much if they virtually never make that play themselves.”, what does the clause “and only that much if they virtually never make that play themselves” mean?  Could you illustrate with an example?

5. Are there additional buckets in the outfield based on distance?  Do you do separate buckets for fly balls and line drives?  Do you also do buckets for fliners?  How does the Park Adjustment for outfielders work? Do all outfield buckets get adjusted or just some?  Are all parks adjusted or just some?

6. How do you convert the plus/minus scores of plays made in UZR to runs?  Does the conversion differ between outfielders and infielders?  Is it different for each infield position and/or for each outfield position?

Those questions are just off the top of my head.  There are probably more.  Perhaps MGL has answered them in the past but I have not found where he has. 

I have no idea how Dewan does it however.  He may have explained exactly how he does it and he may have not.  I don’t know.  As I have indicated, I don’t think his and James’ strong suit, at least in the book, is explaining their various methodologies, among other things.

I agree with this, but given the number of questions that I have about UZR I find it a little ironic to have the statement come from MGL.


#24    Colin Wyers      (see all posts) 2009/03/15 (Sun) @ 14:39

Peter, I don’t know if you’ve seen it before, but I think MGL answered a lot of those questions in his series on UZR for Primer:

http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-14_0/

http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-21_0/

The comments are a horror show, as a lot of them have apparently been removed at some point.


#25    Peter Jensen      (see all posts) 2009/03/15 (Sun) @ 15:17

Colin - Thanks, I had not seen those articles.  Six years is a long time though and it would still be interesting to know if the methodology is the same today.


#26    MGL      (see all posts) 2009/03/15 (Sun) @ 15:59

Peter, there is no irony, as I did not write a book on UZR.  I am under no obligation whatsoever to explain my methodology, although I don’t mind.  And certainly much of it has already been explained somewhere and at some time.  I’m not sure why you think that my posts in this thread are supposed to be an exhaustive explanation of the UZR methodology.

BTW, here is how Dewan explains the plus/minus methodology in The Fielding Bible:

The Plus/Minus system gives us the percentage of time that a hard groundball to vector 197 was successfully converted to an out by the SS last year: 85.6%. We determine that the average SS will contribute .856x.242 + (1-.856)x -.379 = .153 runs on the play.  If Drew converts the out, his play was worth .242 - .153 = .091 runs above the average SS.  (The .242 is the RE after converting an out with no one on base and no outs, and the .379 is the difference between that and the RE if no out is converted - this is just an example of how he figures the average value of an out for each bucket.)
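For anyone who wants to check the arithmetic in the quoted passage, here is a minimal sketch in Python. The three inputs come straight from the quote; the variable names are only illustrative and are not Dewan's. Note that subtracting the rounded .153 from .242 gives about .089, so the .091 in the quote presumably reflects rounding or a slightly different intermediate figure in the book.

```python
# Inputs taken from the quoted passage; names are illustrative assumptions.
p_out = 0.856             # share of these balls the average SS converts
value_if_out = 0.242      # run value credited when the out is made
value_if_no_out = -0.379  # run value when no out is made

# Average value of a ball in this bucket for an average SS.
expected_value = p_out * value_if_out + (1 - p_out) * value_if_no_out

# Credit or debit relative to that average.
credit_for_out = value_if_out - expected_value
debit_for_miss = value_if_no_out - expected_value

print(round(expected_value, 3))  # 0.153
print(round(credit_for_out, 3))  # 0.089
print(round(debit_for_miss, 3))  # -0.532
```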

Now, here is the part which is really confusing:

After repeating this procedure for every play, we found that there was still plenty of “noise” in the run values that had nothing to do with the fielder’s defensive ability.  For example, converting a difficult play with 2 outs and the bases loaded could be worth 2 full runs, but the same play with 2 outs and nobody on was worth a tenth of that, at no fault of the fielder…

To adjust for this bias, we took a look at how many runs each plus/minus point was worth...

I understand what he is saying, but the explanation to any lay person is an abomination.

He appears to be saying that they use the average run minus out value of each batted ball in each bucket and merely multiply that by the number of balls a fielder makes above or below average for each ball in each bucket, which is what I always thought.

And guess what?  That is dead wrong for the reasons I have been discussing.  It will give too much credit to players who field balls disproportionately that are always or usually caught.  And it will dock players too much for balls that are not caught by them but are caught by someone else.  That is especially true of pitchers, as we discuss above, and for discretionary balls in the OF, which are quite a few of all fly balls.  If Dewan is not making adjustments for these (and maybe he is and is just not telling us), then I am afraid his numbers in the OF are next to worthless.


#27    MGL      (see all posts) 2009/03/15 (Sun) @ 17:23

Peter, and everyone else, my “buckets” are 80 x 6 for starters, which is 80 sections of the field, broken down by the STATS slices (around 3 angular degrees each and around 25 feet in distance per slice).  The 80 is just an arbitrary number which represents the maximum number of 25 feet by 3 degree sections of the field that can be used for any particular fielding position.  Essentially think of the field as being sectioned into X number of sections, with each section being around 3 angular degrees by 25 feet.

The 6 (in the 80x6) is the speed of the batted ball (soft, medium, hard, as reported by STATS or BIS) times the handedness of the batter.

Those are the basic buckets.  I do it separately for fly balls and pop flies (although pop flies to the IF are not included) and then for line drives (including LD to the IF), and then for ground balls and separately for bunts.  So the 80x6 is actually multiplied by 4, so there are over 1600 potential buckets.
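As a rough illustration of the bucket structure just described, the sketch below enumerates every combination of zone, speed, batter handedness, and batted-ball type. The labels are made up for the example and are not the STATS or BIS codes; the point is only that 80 x 6 x 4 gives an upper bound of 1,920 potential buckets per fielding position.

```python
from itertools import product

# Hypothetical labels; only the counts (80, 3, 2, 4) come from the post.
ZONES = range(80)                    # field sections, ~3 degrees by ~25 feet
SPEEDS = ("soft", "medium", "hard")  # batted-ball speed
BATTER_HANDS = ("L", "R")            # batter handedness
BALL_TYPES = ("fly_or_pop", "line_drive", "ground_ball", "bunt")

buckets = list(product(ZONES, SPEEDS, BATTER_HANDS, BALL_TYPES))
print(len(buckets))  # 1920
```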

I don’t actually create separate buckets for them, but I do make adjustments for the bases/outs state and the 4 year G/F ratio of the pitcher.  Those adjustments are “across the board” for each fielding position.  So, for example, in a DP situation (less than 2 outs, and a runner on first), the adjustment for 2B might be .946, which means that the 2B in that bases/out state only catches 94.6% of the balls he catches overall, so I multiply that “out rate” for the second baseman for all buckets by .946 with less than 2 outs and a runner on first.  Each IF position has their own adjustments for certain base/out states (not all of them).  There are no OF adjustments for bases/outs.

Park effects are handled as an “across the board” adjustment as well.  Park adjustments are like the bases/outs.  I simply multiply out rates for each fielding position by a number like .987 for all balls in that park.  Each park has one IF park adjustment and 3 OF ones.  Those numbers are used regardless of where the ball is hit.  So, for example, the park adjustment in LF at Fenway might be .830.  All balls to left field get that adjustment.
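A minimal sketch of how such "across the board" multipliers could be applied to a bucket's baseline out rate. The function name and the baseline rates of .60 and .90 are hypothetical. The .946 and .830 multipliers are the illustrative figures from the posts above, and treating the two adjustments as simply multiplying together is an assumption of this sketch, not something the post spells out.

```python
def adjusted_out_rate(bucket_rate, bases_outs_adj=1.0, park_adj=1.0):
    """Scale a bucket's baseline out rate by position-wide multipliers.

    bucket_rate    -- league out rate for this position in this bucket
    bases_outs_adj -- position-wide multiplier for the bases/outs state
    park_adj       -- position-wide multiplier for the park
    (Stacking the two by multiplication is an assumption of this sketch.)
    """
    return bucket_rate * bases_outs_adj * park_adj

# 2B in a DP situation, hypothetical bucket baseline of .60
print(round(adjusted_out_rate(0.60, bases_outs_adj=0.946), 4))  # 0.5676

# LF at Fenway, hypothetical bucket baseline of .90
print(round(adjusted_out_rate(0.90, park_adj=0.830), 4))        # 0.747
```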

For the pitcher GF adjustments, what I do is keep track of the average GF ratio of all the pitchers behind each fielder.  So if, when Jeter was on the field, his average pitcher in 2008 had a GF ratio of 1.50 and the league average GF ratio of a pitcher was 1.30, he would get an overall adjustment such that we assume that his average ground ball was a little easier to field than an average SS’s.  The adjustment might be to multiply his own personal out rates by .980, again, across the board.

The park factors are based on 10 years of data or so, and are regressed.

The baseline for everyone is currently 5 years.  So the average out rate for every bucket is based on 5 years of data.  I can use more or less if I want.  Usually the more the better.  And I always zero out each position for each year.  One of the weaknesses of that is that if in one year a particular position is better overall than in an average year, the players at that position will get shortchanged and vice versa.

I think Dewan uses only one year, which is a big mistake, in my opinion.  When you use a lot of buckets, you have to have as much data as possible.

As to how I compile each player’s run values, and I have explained this many times before on this blog:

An example is the best way to explain it.  Let’s say that in a certain bucket, 80% of the balls are turned into outs by someone.  And let’s say that 60% are made by the SS and 20% by the 2B.  So, 20% fall for hits of course.

If a ball falls for a hit in that bucket, both the 2B and SS will get docked of course.  A ball in that bucket is overall “worth” 80% of an out, which is around .8 x -.25 (let’s call all outs -.25 and all hits .5 to make the math easy) or -.20 and 20% of a hit, or .2 x .5, or .1, for a total of -.1.

So if a ball falls for a hit, the total demerit is .5 minus -.1 or .6 runs.  The 2B and SS must be docked a total of .6 runs.  Since of the outs made, the SS fields 3/4 of them and the 2B 1/4, the SS gets 3/4 of that total demerit, or .45 runs and the 2B gets 1/4 of that demerit or .15 runs.  So, after that one ball, the SS is -.45 runs in UZR and the 2B is -.15 runs.  Simple.

If the SS fields a ball in that bucket, then he simply gets credit for the difference between an out, -.25 runs, and the average value of a ball in that bucket, -.1 run, or .15 positive runs.  When he fields a ball, the 2B does not get docked at all.

Now, we should check that everything adds up to zero and each fielder is at zero, if everyone fields everything at league average rates.  So 100 balls are hit, 20 are hits, the SS fields 60 and the 2B fields 20.  That is league average.

So under my system, the 60 that the SS fields, he gets credit for 60 x .15 or 9 runs.  For the 20 that fall for hits, the SS gets docked 20x.45 or 9 runs.  Zero!  The second baseman gets docked .15 run per hit times 20 hits, or 3 runs.  He fields 20 balls at .15 runs per ball, or +3 runs. A total of zero!

Let’s say that in a particular time period, 80% of the balls in that bucket are still fielded, but 50% are fielded by a particular SS and 30% by a particular 2B, rather than 60 and 20 (but the baselines are still 60 and 20).  Both get docked the same number of runs per 100 balls, -9 for the SS and -3 for the 2B.  But what about the plus side of the ledger?  The SS will now field only 50 balls, and get .15 runs credit per ball, for a total of 7.5.  So his net UZR is -1.5 now.  The 2B will field 30 for a total of 4.5 runs.  His net UZR is +1.5.
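The bookkeeping in this worked example can be sketched in a few lines of Python. This is only an illustration of the example as described here (one bucket, the simplified -.25 per out and +.5 per hit), not the actual UZR code; the function and variable names are made up.

```python
OUT_VALUE = -0.25  # simplified run value of an out, as in the example
HIT_VALUE = 0.50   # simplified run value of a hit, as in the example

def uzr_bucket(baseline_out_share, outs_made, hits):
    """Net runs per position for one bucket.

    baseline_out_share -- {position: league share of balls it turns into outs}
    outs_made          -- {position: outs this fielder actually made}
    hits               -- balls in the bucket that fell for hits
    """
    total_share = sum(baseline_out_share.values())
    avg_value = total_share * OUT_VALUE + (1 - total_share) * HIT_VALUE
    credit_per_out = avg_value - OUT_VALUE   # .15 in the example
    demerit_per_hit = HIT_VALUE - avg_value  # .60 in the example
    net = {}
    for pos, share in baseline_out_share.items():
        blame = share / total_share          # 3/4 for SS, 1/4 for 2B
        net[pos] = outs_made[pos] * credit_per_out - hits * demerit_per_hit * blame
    return net

baseline = {"SS": 0.60, "2B": 0.20}

# League-average fielders: both should net out to (approximately) zero.
print(uzr_bucket(baseline, {"SS": 60, "2B": 20}, hits=20))

# The 50/30 split from the example.
result = uzr_bucket(baseline, {"SS": 50, "2B": 30}, hits=20)
print({pos: round(runs, 2) for pos, runs in result.items()})  # {'SS': -1.5, '2B': 1.5}
```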

What about under Dewan’s system?  The SS is supposed to field 60. He fields 50.  That is 10 fewer balls.  The difference between an out and a hit is .75 runs.  He gets a net plus/minus runs of -7.5 runs!  Ouch!  The 2B gets +7.5 runs of course.  The difference between mine and Dewan’s is that the 2B is getting a lot of credit for fielding extra balls even though 80% of the balls in that bucket are being fielded no matter what.  He is getting the same credit as if no balls in a bucket are typically fielded, but the 2B now catches 10 (out of 100).  Surely that can’t be right.  If you take it to the extreme, if 100% of the balls in a bucket are fielded by either the 2B or SS, under my system, no one gets any credit or demerits no matter who fields it.  In Dewan’s system if typically those outs are split 50/50 between the SS and 2B, if it ends up 60/40 the player with the 60 gets 7.5 runs of credit!  That can’t be right!  Since we know that 100% of the balls in that bucket are caught by someone, they have to be easy balls (like a lazy fly ball) and that the percentage caught by each fielder is purely discretionary.
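To make the contrast concrete, here is a small sketch of the "expected minus observed outs, times the out/hit swing" calculation as it is read in this post. Whether Dewan actually computes it this way is exactly what is in question, so treat the function below as a hypothetical reading and not as his method. The second function shows the per-out credit under the accounting described earlier, which shrinks to zero as the bucket approaches 100% caught.

```python
OUT_VALUE = -0.25
HIT_VALUE = 0.50

def naive_runs(expected_outs, observed_outs):
    """Hypothetical reading: every extra out is worth a full out-vs-hit swing."""
    return (observed_outs - expected_outs) * (HIT_VALUE - OUT_VALUE)

def credit_per_out(total_out_share):
    """Per-out credit when measured against the bucket's average value."""
    avg_value = total_out_share * OUT_VALUE + (1 - total_out_share) * HIT_VALUE
    return avg_value - OUT_VALUE

# 80%-caught bucket, baselines 60/20, observed 50/30
print(naive_runs(60, 50), naive_runs(20, 30))  # -7.5 7.5

# 100%-caught bucket, baseline 50/50, observed 60/40
print(naive_runs(50, 60))                      # 7.5
print(credit_per_out(1.0))                     # 0.0
```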

Now, I don’t know that Dewan is doing it that way, but if he is, it is clearly wrong and will lead to erroneous numbers on all lazy fly balls in the OF and for the pitchers (and the 2B and SS behind pitchers who particularly hog balls or don’t).

BTW, that is about it for UZR.  The complete explanation, more or less. Did I leave out anything?


#28    Peter Jensen      (see all posts) 2009/03/15 (Sun) @ 17:27

Peter, there is no irony, as I did not write a book on UZR.  I am under no obligation whatsoever to explain my methodology, although I don’t mind.  And certainly much of it has already been explained somewhere and at some time.  I’m not sure why you think that my posts in this thread are supposed to be an exhaustive explanation of the UZR methodology.

MGL - I never said or meant to imply that you were under any obligation to explain your methodology.  I was responding to Tango’s assertion that your explanation in Post 21 sufficiently clarified how the “zone-issues” are handled in UZR.  I listed questions that I had not found answers to about UZR.  Part of that was my failure to search diligently enough for answers and find the two articles referenced by Colin.  Having now read those, I have a better understanding of what your methodology was in 2003.  How much of your current methodology you choose to reveal is certainly up to you.

And as I stated before I agree with you that much of what is written in The Fielding Bible Volume 2 is very confusing. And as you point out much of what they are doing just seems wrong.  Plus, having three separate evaluation systems within a single book with no resolution as to what one should do when the different systems don’t agree seems silly.  I was not impressed with the people I talked to when I called to ask questions after the first Fielding Bible was published. I got the impression that they thought that they had invented a fielding system that was much superior to anything that had appeared before.


#29    Peter Jensen      (see all posts) 2009/03/15 (Sun) @ 17:30

MGL - Our last 2 posts crossed.  Thanks for the extra information on UZR.


#30    MGL      (see all posts) 2009/03/15 (Sun) @ 22:13

Peter, sure no problem.

Plus, having three separate evaluation systems within a single book with no resolution as to what one should do when the different systems don’t agree seems silly.

I agree 100%.  They have the basic plus minus, then they have the FM’s and GFP’s.  The FM’s and the basic plus minus clearly overlap.  Yet there is no reconciliation and no attempt to combine the 3 systems, as you say.  That is crazy.  They have all the data and tools to give us a total number for each player, yet they don’t.

Not to mention the fact, as I said, that they go out of their way to make it clear that errors are subjective but their FM’s are not (or at least not very much), yet clearly and obviously there is MUCH subjectivity in the FM’s.  As much as an official scorer is subjective when awarding an error or not (the problem with OS’s is not so much that awarding an error or not is subjective, but that they - not all of them I assume - are so embarrassingly BAD at making the judgments).

Here is another example (I gave one already on the “wild pitches that catchers SHOULD have caught” - if that is not subjective, what is?):

He talks about an outfielder going to the wall to make a catch and then the ball bounces off the wall and over his head.  He goes out of his way, again, to explain how this is NOT a subjective record. It is 100% objective (my own words).  Outfielder goes to wall (no subjectivity) and ball bounces over head (no subjectivity).  OK, putting aside the fact that I can clearly imagine a scenario where you have to make a judgment even on those things that James and Dewan think are 100% objective, he also goes on to say:

It is not to be classified as a FM if it is “perfectly and absolutely clear that it made no difference” as when the runners do not advance an extra base.  Putting aside, again, the fact that that is probably NOT the correct way to do it (especially since the plus/minus, the backbone of the whole system, does NOT do that - be results-oriented), he gives another example of where it (presumably it is absolutely and perfectly clear) did not make a difference - when Jose Reyes hits a ball 360 feet off the wall, ends up on second, and would have been on second anyway, even without the ball bouncing over the fielder’s head.

So he expects us to believe that there is not going to be any judgment involved in this FM?  Is he kidding?  What about a 310-foot line drive off the wall on which Jose Reyes ends up on second after it bounces over the fielder’s head?  How is the person doing the recording supposed to know whether Reyes would have made it to second anyway?

If Dewan wants there to be no FM on plays where the runners would have ended up where they ended up if not for the ball bouncing over the fielder’s head, then obviously there HAS to be tons of judgment and subjectivity involved.  Sheesh!

And as I also mentioned, at some point they talk about not giving a FM when it makes no difference such as when the runners do not advance and then at other points they talk about NOT being result oriented, such as when a fielder makes an out with the bases loaded and 2 outs, and the swing is several runs.  There is no reconciliation of the two approaches (clearly the “non-result oriented” approach is the only one that has predictive value).  Even among the 54 (or whatever the number is) different FM’s, some of them ARE result oriented and some of them are not.

Basically, and in conclusion, I love the work that Dewan and James did, especially with the FM’s and the GFP’s, which are sorely missing (some of them at least) in systems like UZR and their basic plus minus, but the book leaves a lot to be desired as far as the criticisms I have levied in this thread and in others.


#31    BenJ      (see all posts) 2009/03/16 (Mon) @ 18:33

MGL/26-27:

A couple clarifications about Dewan’s Plus/Minus system, going back to the first book. 

Dewan’s “buckets” are smaller, and he uses more batted ball categories (he also uses fliners).  Therefore, it sounds like UZR has a larger sample for each bucket (and more “accurate” in theory), but Dewan’s will be more precise.  A matter of preference, I suppose.

I’m going to use your example above to demonstrate how Dewan handles the groundballs up the middle.

“Let’s say that in a certain bucket, 80% of the balls are turned into outs by someone.  And let’s say that 60% are made by the SS and 20% by the 2B.  So, 20% fall for hits of course.”

Dewan’s system looks at each position individually, removing balls fielded by other defenders.  For example, of the 80% of balls not fielded by the second baseman, the average shortstop converted 60/80 = 75% to outs.  Dewan calls this the “Ratio” for the shortstop on the play.  For the second baseman, we have 20/40 = 50% or .50.  This is the “Ratio” for the second baseman on the same play.
Dewan gives the shortstop +.25 plus/minus points (aka “plays”) for each play made, and -.75 for each play not made by any fielder.  The second baseman gets +.50 and -.50, respectively.

“Let’s say that in a particular time period, 80% of the balls in that bucket are still fielded, but 50% are fielded by a particular SS and 30% by a particular 2B, rather than 60 and 20 (but the baselines are still 60 and 20).”

If this was a sample of 100 balls, the shortstop gets 50*.25 + 20*(-.75) = -2.5 plays.  The second baseman gets 30*.50 + 20*(-.50) = 5.00 plays.  Then Dewan applies the run factor (he gives .76 runs per plus/minus point for both positions) to get -1.9 runs for the shortstop and 3.8 runs saved for the second baseman. 
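The calculation just described can be sketched as follows. This is only a restatement of the example in this post (per 100 balls in one bucket, with the .76 runs per plus/minus point given above); the function and argument names are made up for the illustration and are not taken from the book.

```python
RUNS_PER_PLAY = 0.76  # runs per plus/minus point, as given in the post

def plus_minus(own_baseline, others_baseline, own_outs, hits, balls=100):
    """Plus/minus for one position in one bucket, per the example above.

    own_baseline    -- outs the average fielder at this position makes (per `balls`)
    others_baseline -- outs the average other fielders make (per `balls`)
    own_outs        -- outs this fielder actually made
    hits            -- balls not fielded by anyone
    """
    # "Ratio": conversion rate on balls not fielded by other defenders.
    ratio = own_baseline / (balls - others_baseline)
    plays = own_outs * (1 - ratio) - hits * ratio
    return ratio, plays, plays * RUNS_PER_PLAY

ss = plus_minus(own_baseline=60, others_baseline=20, own_outs=50, hits=20)
second = plus_minus(own_baseline=20, others_baseline=60, own_outs=30, hits=20)

print([round(x, 2) for x in ss])      # [0.75, -2.5, -1.9]
print([round(x, 2) for x in second])  # [0.5, 5.0, 3.8]
```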

Comparing this result to MGL’s excellent UZR explanation (much appreciated, by the way), we see that Dewan’s system is rewarding the second baseman more because he was making more difficult plays.  It’s not that the play itself was more valuable than any other groundball, but the play itself was more difficult for a second baseman so we give him more credit (above average) for making the out.

Follow that?  Hopefully that clears some things up.

There are a few things, like pitcher fielding, that are by default treated differently between the two systems.  I’m not sure one way is better; I think both UZR and Plus/Minus have strengths and weaknesses. 

And lastly, I think Dewan will be the first to admit that his book is not written with readers of this blog as the primary target audience; he knows you guys don’t need convincing of the importance of defensive analysis.  He writes the book for casual fans, sportswriters, and team executives themselves who don’t want to be tied up in a mind-numbing explanation of every detail of his system.


#32    MGL      (see all posts) 2009/03/16 (Mon) @ 19:57

Thanks for the explanation.  My criticism of a book or article is NEVER that it is written for the masses rather than for the expert analyst.  In fact, quite the opposite.  If a book or mainstream article is not written such that an average person can understand what the heck is going on, then the author is making a mistake in my opinion.

I think that the standard defense of criticism that “the book/article is not written for experts” is apologetic, general and vague, and usually misplaced.  In no way, shape or form am I criticizing his book on that level.  Many other levels, but not that one.

As I said, the work is great, but the book is confusing, disjointed, and contradictory in some places.  In my opinion.

I’ll say one more time, though, that I think it is great work and I highly recommend the book.


#33    Peter Jensen      (see all posts) 2009/03/17 (Tue) @ 17:22

BenJ - Thanks for your very clear explanation of Dewan’s scoring system.  It does make one wonder.  Your explanation took only 5 paragraphs.  Fielding Bible 2 is 399 pages long and took 3 years to prepare.  It seems like sometime during those 3 years someone could have found the time to include those 5 paragraphs of explanation within those 399 pages.  It would have made everything so much clearer for both “expert analysts” and novices.

Am I correct that Dewan does not adjust for the handedness of the batter?  If that is true, then in your example let’s say that the hit balls are normally divided 70% hit by right handers and 30% hit by left handers and that the average shortstop normally fields the ball for an out on 6 out of every 7 balls hit by right handers but on none of the balls hit by left handers.  Conversely the average 2nd baseman fields 2 out of every 3 balls hit by left handers for outs, but none of the balls hit by right handers.  That results in the same 60% SS outs, 20% 2B outs, 20% hits average values as in your example above.

Now let’s say that instead of the usual 70% righthanded and 30% lefthanded distribution of hit balls, the distribution is 55% righthanded and 45% lefthanded.  The shortstop fields 50 balls for outs and the 2nd baseman fields 30 and 20 are hits, as in your example above.  The SS gets -1.9 runs and the 2B gets +3.8, also as in your example above.  BUT, the 2nd baseman has actually fielded the balls at an exactly average 2nd baseman rate (66.7%) and the shortstop has fielded his 55 balls at a BETTER than average SS rate, 90.9% instead of 85.7%.  The 2B is being rewarded not for being a better fielder, but simply for being on the field for a non-normal distribution of hit balls.  And the SS, who actually IS a better than average fielder, gets fewer runs because of the hit ball distribution.
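A short sketch of the numbers in this hypothetical, comparing each fielder's observed outs with what a league-average fielder would be expected to make given the actual mix of right- and left-handed batters. The names are illustrative; the rates (6/7 and 2/3) and ball counts are the ones from the example.

```python
SS_RATE_VS_RHB = 6 / 7  # average SS out rate on balls hit by right-handers
B2_RATE_VS_LHB = 2 / 3  # average 2B out rate on balls hit by left-handers

def expected_outs(rhb_balls, lhb_balls):
    """Outs a league-average SS and 2B would make for this handedness mix."""
    return rhb_balls * SS_RATE_VS_RHB, lhb_balls * B2_RATE_VS_LHB

# Usual 70/30 mix reproduces the 60/20 baseline.
print([round(x, 1) for x in expected_outs(70, 30)])  # [60.0, 20.0]

# Observed 55/45 mix: SS made 50 outs, 2B made 30.
exp_ss, exp_2b = expected_outs(55, 45)
print(round(50 - exp_ss, 2))  # 2.86 outs above an average SS
print(round(30 - exp_2b, 2))  # 0.0, exactly an average 2B
print(round(50 / 55, 3), round(30 / 45, 3))  # 0.909 0.667
```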


#34    MGL      (see all posts) 2009/03/17 (Tue) @ 17:35

I would hope that Dewan at least adjusts for the handedness of the batter. That is critical on two levels.  One, it completely changes the positioning of the fielders, which completely changes the out rates for each fielding position and for each batted ball bucket, as Peter explains in his example above.  Secondly, even if the fielders were in the exact same position for LH and RH batters, the batter handedness changes the speed of each batted ball (even within the speed categories that BIS provides) for each location, which also changes the out rates for each fielder.  For example, when a RHB hits a ground ball down the first base line, it tends to be harder than one by a LHB.  I guess the LHB “rolls some of those over” and the RHB does not.  So even though BIS (and STATS) provides a “speed” rating for each batted ball (soft, med and hard only), within each category, I find that there are systematic differences in out rates (based on batter handedness, the pitcher’s G/F ratio, etc.) and thus, actual speed of the ball.

I also agree with Peter that it is almost incomprehensible that in the entire Fielding Bible, Volume II, that nowhere is the methodology of the backbone of their system (the plus minus) explained.  On the other hand, maybe they did that intentionally because they did not want to make the methodology too available.


#35    Tangotiger      (see all posts) 2009/03/17 (Tue) @ 19:02

In the article at Hardball Times about Utley, it looks like he adjusts for handedness, but in a more roundabout way.

It would be easier if everyone were to show two primary zones for each position: one for LHH and one for RHH.  I think Peter does that, and Walsh in those cool graphs from last year.

Dewan instead uses one zone, and then compares the results to the average.  It’s a weird way that gives you the same results (I think).


#36    Peter Jensen      (see all posts) 2009/03/17 (Tue) @ 19:26

MGL - Again, thank you for the detailed explanation explaining UZR in answer to my questions.  But now a hypothetical for you, partly to make sure I understand the math of UZR well enough and partly to point out an area where we may differ.

A zone bucket between the SS and 3B, ground balls hit medium sharply by right handed batters on a neutral surface with a normal base out distribution.  The average 3B fields 60% for outs, the SS 30%, with 10% singles to the outfield.  Typical number of hit balls in this bucket is 30 per team per year, so on average 18 3B outs, 9 SS outs and 3 OF singles.

A team has a third baseman with very poor range but the SS is above average in range.  The 2B and pitching staff are also above average.  The zone bucket described above gets 30 hit balls one year but the 3B only fields 40%.  The SS with his above average range is also able to field 40%, but 20% are hits.  The SS is able to get to 3 of the hits and keep them as infield singles.  So the distribution for this team is 12 3B outs,12 SS outs, 3 OF singles and 3 IF singles.

This is the math for UZR as I understand it. I am using the -.25 runs for an out and .5 runs for a hit as in your example above.  A hit ball is then worth (.9 * -.25) + (.1 * .5), or -.225 + .05, or -.175.  The SS, because he successfully fielded 3 extra balls, gets a bonus of (-.175 - (-.25)) * 3, or (.075) * 3, or .225.  However, since there are a total of 3 additional hits and the average SS is responsible for 1/3 of the hits, he is responsible for 1 additional hit whose value is ((-.175) - .5), or -.675.  So the SS ends up at -.45 runs even though he fields 3 more balls in the zone for outs than an average SS and keeps all the additional hits from going through to the outfield.
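The same arithmetic, written out as a quick Python check. It follows the simplified -.25 per out and +.5 per hit values used in this thread and counts all six hits (infield and outfield) alike. The names are illustrative, and the 3B line is an extension of the same logic rather than something worked out in the post above.

```python
OUT_VALUE, HIT_VALUE = -0.25, 0.50

# League baseline for the bucket: shares of all balls turned into outs.
baseline = {"3B": 0.60, "SS": 0.30}
observed_outs = {"3B": 12, "SS": 12}
hits = 6  # 3 OF singles + 3 IF singles, treated identically here

total_share = sum(baseline.values())
avg_value = total_share * OUT_VALUE + (1 - total_share) * HIT_VALUE
credit_per_out = avg_value - OUT_VALUE   # .075
demerit_per_hit = HIT_VALUE - avg_value  # .675

net = {}
for pos, share in baseline.items():
    blame = share / total_share          # 2/3 for 3B, 1/3 for SS
    net[pos] = observed_outs[pos] * credit_per_out - hits * demerit_per_hit * blame

print(round(avg_value, 3))                         # -0.175
print({pos: round(v, 2) for pos, v in net.items()})  # {'3B': -1.8, 'SS': -0.45}
```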

Is my math correct?  I know you don’t think that fielders can successfully field balls missed by another fielder, but in my research for the BZM fielding metric it seems to be a fairly common occurrence.


#37    Brian Cartwright      (see all posts) 2009/03/17 (Tue) @ 20:26

The average for this bucket is 30 balls - 18 outs for the 3b, 9 outs for the ss. We can split the OF hits along the same ratio, giving 2 to 3b and 1 to ss. The average 3b had fielded 18 of 20, the average ss 9 of 10.

In the study case, 3b fields 12, ss 12. There are still 3 OF hits - we can split them 2/1 as per league average, or 1/1 as per the outs with these two fielders. Since you say “SS is responsible for 1/3 of the hits”, I will give 2 OF hits to 3b, 1 to ss. Now there’s 3 infield hits, all fielded by the ss.

      observed      expected       delta
Pos   GB  OF  IF |  GB  OF  IF |  GB  OF  IF
3b    12   2   0 |  18   2   0 |  -6   0   0
ss    12   1   3 |   9   1   0 |  -3   0  +3

3b -6(.25) = -1.5
ss -3(.25)+3(-.4) = -.75 + .12 = -.63
(minus being bad for the defense)

As a group, they turned 3 gb outs into infield singles


#38    Peter Jensen      (see all posts) 2009/03/18 (Wed) @ 01:15

Brian - I don’t understand.  Is this scenario something that you are proposing as a possible solution?  I was trying to follow UZR as it was explained by MGL above, not make up something new.


#39    Brian Cartwright      (see all posts) 2009/03/18 (Wed) @ 01:41

that’s how I would have done it - expected minus observed times value. Sorry if I went off in another direction


#40    Brian Cartwright      (see all posts) 2009/03/18 (Wed) @ 02:00

I should have labeled clearly what I was doing in 37.

But this is the question I am trying to answer - even if we all have the same set of data, Peter, mgl, Dewan, I and others appear to have different ways of calculating the run values, even if it seems as simple to me as sum((exp-obs)*lw) - with each of us constructing slightly different mathematical models, do we come up with the same answer (run value) - or can we determine which ways are right and which are wrong? I don’t want to be wrong!


#41    Peter Jensen      (see all posts) 2009/03/18 (Wed) @ 02:45

But this is the question I am trying to answer - even if we all have the same set of data, Peter, mgl, Dewan, I and others appear to have different ways of calculating the run values, even if it seems as simple to me as sum((exp-obs)*lw)

Finally someone has grasped the whole point of why I wrote my articles on BZM.  I was not interested in showing that BZM was better than other metrics.  I just wanted to show that there were many small decisions that go into the planning of any defensive metric with each decision having incremental small consequences on the resulting rankings.  Figuring out what is “right” for each decision is sometimes easy, but often near impossible to “prove”. But I had hoped that having a clear idea about how those decisions are made in each metric and an open discussion about the differences might lead to improvements in all the metrics.


#42    MGL      (see all posts) 2009/03/18 (Wed) @ 03:06

I had a long post which got lost I guess.  Putting in the IF hits really confuses the issue.  In UZR, all hits are counted the same.  I should probably have a different way of handling IF hits. Of course they are not always, in fact, usually not, balls that an IF “saves” from going into the OF.  Some of them are basic IF slow roller hits. Some of them are actually bobbled by the fielder but he does not get an error. Some of them could be turned into an out by a fielder with a strong throwing arm.  Etc.  We have no way of knowing from the data except perhaps to make some inferences from the parameters of the batted ball.  Even then all we can do is treat them separately and come up with a matrix of the probabilities of what happened and who was responsible, bad or good, based upon the location and speed (and other things) of the IF hit.

To be honest, I had never thought of that before, and I thank you for bringing it up.

Anyway, UZR does not make any distinction for IF and OF hits, as I said, so it confuses the issue if you want to discuss how to handle hits and outs in any system. As I also said, handling IF hits is a completely separate issue.

Peter, you are 100% correct in how UZR handles that scenario.  It gives a minus UZR to both the SS and 3B even though the SS has more range in your scenario and fields more balls.

As you know, no matter how a metric handles outs and hits, you can always construct a scenario, like yours, where it would be wrong (yield the wrong result).  That does NOT tell you that it is a bad way of handling the data.  The reason is that the data itself does not give you enough information to construct a perfect methodology.  You can only hope that the method you choose works the best, or works well, overall. I have thought about using various methodologies over the years, and I think mine is the best, but I am by no means 100% (or even 90%) sure of that.  I have not given it enough thought.

As you said there are several different methodologies, and yes, they will produce different results.  While one is probably better than the other, it may be difficult to figure out which one is the best.  I have never, in all these discussions, come across someone who HAS figured it out. Peter, maybe you can be the first. I can tell you right now, that coming up with scenarios such as yours, where one method works better than another is an exercise in futility.  For every scenario where method A works better than method B, I can come up with another one where method B works better than method A.  Again, the trick is to come up with the method that works best given all possible scenarios, their likelihood of occurring and the limitations of the data. I also want to say that because there are different methods which produce different results, that does not necessarily mean that one or more of them is “wrong” and will produce bad results.  As I said, I think that there are several reasonable methods and even if they produce vastly different sample results over the short run, they could all be good.


#43    MGL      (see all posts) 2009/03/18 (Wed) @ 03:20

Oh, and Brian, your methodology (if I am interpreting it correctly), which you seem to think is the obvious one, completely breaks down in buckets where most or all balls are caught, which is frequently the case in the OF, where many buckets contain easy to catch, discretionary balls.  That is one reason I don’t use that method.

Let’s say there is a bucket in which 50% are caught by LF and 50% by CF.  Now, for one team (or whatever), there are 100 of such balls, and the LF catches 60 and the CF 40:

LF expected 50, observed 60 +10
CF expected 50 observed 40 -10

Do you really want to give the LF credit for all those extra balls and the CF demerits? I would hope not.  The fact that 100% of the balls are caught strongly suggests that they are easy catches for one, the other, or both. That also strongly suggests that who catches at least some of them is discretionary.  UZR ignores that bucket.  I think that any system that does not is treading on thin ice.  Now, the answer MIGHT be somewhere in between.  I have never spent enough time trying to figure that out (by writing some simulations maybe).  IOW, even if all balls are catchable in a certain bucket and there are lots of discretionary catches in that bucket, the fact that one fielder catches more and the other one less, in some data set MIGHT indicate that one fielder has more range than the other.  But certainly not +10 and -10 in the example above.  In any bucket where there is a certain high percentage of catches, we have to assume that a certain portion of them are discretionary, so that when one fielder catches more than his fair share in those buckets, we have to assume that some of them could easily have been caught by the other fielder, and that it took no extra skill for the other fielder to have caught it.  How much, if any, I don’t know.  We can also assume that in a bucket where ALL balls are caught, that has to mean that NO balls in that bucket could be hard to catch for the closest fielder.  I think that is a good assumption, otherwise someone would have missed a ball in that zone, right?

So if that is the case, when someone catches 60% even though the league average is 50% (and we want to give him credit for at least some of that extra 10%), we have to ask ourselves: if any of those extra 10% catches were difficult for him (which is why we want to give him credit, right?), why didn’t the other fielder catch that ball?  Remember, we already said that in a bucket in which all balls are caught, there are NO balls which can be difficult for ALL fielders.  So if one of the balls in that bucket is difficult for one fielder (the one you want to give credit to), it must have been easy for the other one, so why didn’t he catch it? I suppose we do see plays where one fielder makes a diving catch on a ball that the other fielder could have gotten easily, especially in the IF, so maybe my system, which never gives anyone credit as long as 100% (or close to it) of the balls in that bucket are caught around the league, is not quite right either.
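To make the arithmetic in the LF/CF example concrete, here is a minimal Python sketch. The function names and the 0.98 cutoff are purely illustrative assumptions (this is not UZR's actual code or threshold); it only contrasts the naive observed-minus-expected credit with a rule that skips buckets where nearly everything is caught.

```python
def naive_plus_minus(balls, caught, league_rate):
    """Observed catches minus the league-average expectation, per position."""
    return {pos: caught[pos] - league_rate[pos] * balls for pos in caught}


def skip_easy_buckets(balls, caught, league_rate, cutoff=0.98):
    """Same as above, but give nobody credit or demerits when the league
    catches essentially every ball in the bucket (cutoff is hypothetical)."""
    if sum(league_rate.values()) >= cutoff:
        return {pos: 0.0 for pos in caught}
    return naive_plus_minus(balls, caught, league_rate)


# The bucket from the example: league splits it 50/50, all 100 balls are caught.
league_rate = {"LF": 0.5, "CF": 0.5}
caught = {"LF": 60, "CF": 40}

print(naive_plus_minus(100, caught, league_rate))   # {'LF': 10.0, 'CF': -10.0}
print(skip_easy_buckets(100, caught, league_rate))  # {'LF': 0.0, 'CF': 0.0}
```

An "in between" answer would replace the hard zero with some partial credit, but how much is exactly the open question.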

As you can see, coming up with the “correct” method is neither obvious nor easy.


#44    Peter Jensen      (see all posts) 2009/03/18 (Wed) @ 03:28

MGL - I agree with almost everything you say in the previous post.  But you misunderstand my motives.  I didn’t design the previous hypothetical to demonstrate that UZR was doing something “wrong”.  Nor do I think I know what is “right” or even necessarily think there can be one “right” method.  As I said in my previous post, all I was hoping for was to generate discussion.  Sometimes, through discussion, we can come to a consensus that a certain methodology is “wrong”, as we seem to have done with certain of the plus/minus practices.  However, as I said before, it is often nearly impossible to prove that a certain methodology is the best one possible.


#45    Brian Cartwright      (see all posts) 2009/03/18 (Wed) @ 04:04

I think it’s a great discussion.

mgl - I understand your point on the discretionary balls. I will have to let that stew in my brain to see how it is implemented.

I would weight infield hits differently because they do not have as much ability to advance runners. I used some LWs Colin sent me. If the inf bobbles the ball and it’s called a hit, then that’s not as good as getting the out. The LW of the inf hit should probably be set the same as an error on a groundball, to eliminate any bias from the scorer. If the ball is headed for the OF and the inf knocks it down, then he has saved a little bit and should be rewarded.


#46    Brian Cartwright      (see all posts) 2009/03/18 (Wed) @ 04:43

I confess that I had not been giving enough credit to the concept of split zones or discretionary balls.

I did my first defensive metrics 25 years ago in the amateur league that I was the statistician of. I had the advantage of training and instructing the official scorers in how to code batted balls. I worked on a top-down principle of extending team DER to an individual level, so that the individual defensive stats would add to the team’s. Each ball was coded as to which fielder had the best chance of making the play - so there were no split zones. There were of course discretionary plays (why we don’t usually do popups) but I figured with a large enough sample size that didn’t matter.

Fast forward to now, and I still find myself of the opinion that split zones only exist because of a deficiency in scorekeeping, so I tried to avoid dealing with them. In doing my first recent work on RetroSheet data, balls in the outfield, hit or out, are coded as to the fielder who retrieved the ball - so the 9 OF buckets are GB, LD, FB to LF, CF, RF. Again, no need to share responsibility for a ball. Now that Peter has done his work with GameDay data, which I plan on replicating in order to do minor league defensive metrics, we have to incorporate vector data. Which fielder threw the ball back in is no longer the first piece of data - two identically hit balls can be fielded by different positions, and so I will be forced to consider what mgl described in 43, which so far I have managed to not think about.

That said, some of the reason that different positions get to balls in the same bucket might not be discretionary; it could be positioning or something else. I presume that by looking at video we’d get a better idea of which balls could truly be made outs by more than one fielder.

And when I was talking about “not wanting to do it wrong” this is what I had in mind: if we all shared ideas, at some point I might say “Oh, I didn’t think of that”


#47          (see all posts) 2009/03/18 (Wed) @ 05:53

Just to try and re-focus the discussion and the issue at hand, and for any readers that are a little lost, and to simplify the issue (which it really is - simple that is, but for the limitations of the data), we are really trying to do two things when a ball is put into play:

1) If it drops, we are trying to figure out exactly which fielders, playing in a league average position given that particular batter and game situation, had a chance to field the ball, and what percentage of the time they do field that exact type of ball - i.e. how often a league average fielder playing in a typical position for that batter and game situation fields it.  (BTW, I don’t adjust for the power and speed of the batter, which I should, as a proxy for the position of the fielder - another source of noise.)  Almost always, we can narrow that down to two fielders.  That seems pretty simple, especially since for many balls, it is either 100% or zero % even for the 2 fielders we narrow it down to.

2) For a ball that is caught, we do exactly the same thing.  The only issue there is that if a ball is caught, should we do anything with or to any other fielder? Probably not, but maybe.  Keep in mind I am talking about the perfect system here - not a ball that is recorded on a computer or a piece of paper, where we really don’t know exactly where the ball landed, how hard it was hit, where the fielders “should” have been playing (where they typically play given all the game parameters), etc.

That is it.  Pretty simple.
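The two steps above can be sketched in a few lines of Python. This is only one possible accounting, and it is an assumption rather than a description of any published system: on a dropped ball each fielder with a chance is debited his league-average rate for that kind of ball, and on a caught ball only the catcher is credited, by how often that ball normally drops. The `league_rates` numbers are made up.

```python
def score_ball(league_rates, caught_by=None):
    """Credit/debit per position for a single ball in play.

    league_rates: hypothetical league-average share of this exact type of ball
                  fielded by each position that has a chance at it.
    caught_by:    position that made the play, or None if the ball dropped.
    """
    if caught_by is None:
        # Step 1: the ball dropped, so each fielder is debited his usual rate.
        return {pos: -rate for pos, rate in league_rates.items()}
    # Step 2: the ball was caught; credit the catcher with the usual drop rate
    # and do nothing to the other fielder.
    return {caught_by: 1.0 - sum(league_rates.values())}


hole_grounder = {"SS": 0.5, "3B": 0.25}   # drops 25% of the time league-wide
print(score_ball(hole_grounder))                    # {'SS': -0.5, '3B': -0.25}
print(score_ball(hole_grounder, caught_by="SS"))    # {'SS': 0.25}

lazy_fly = {"CF": 0.5, "RF": 0.5}         # always caught by someone
print(score_ball(lazy_fly, caught_by="CF"))         # {'CF': 0.0}
```

With this accounting a league-average fielder nets out to zero over many balls, and a ball that is always caught is worth nothing to whoever catches it.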

If we had 2 things, we could do that perfectly and easily and there would be nothing left to do:

One, the exact location of the ball, and two, the hang time of the ball.  That is all we need to have a perfect system and there would be no need for this discussion.  There would be one more tricky part, which would be figuring out the average position of all the fielders for every given batted ball.  To do that, we would have to figure out what factors affect that, such as the bases/outs, the park, the power of the batter, his pull or opposite field tendencies, the weather, etc.  So, we really would never know the exact average location of all fielders given the game parameters when a batted ball was hit, so we really wouldn’t have a perfect system, but it would be pretty close.
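As a rough illustration of why location plus hang time would be enough, here is a minimal sketch, assuming we somehow knew the fielder's average starting spot. The coordinates (in feet), the hang times, and the 15 ft/s "routine" cutoff are all hypothetical, and the sketch ignores reaction time, routes, and walls.

```python
import math


def required_speed(fielder_xy, landing_xy, hang_time_s):
    """Average speed (ft/s) needed to get from the starting spot to the
    landing spot before the ball comes down."""
    return math.dist(fielder_xy, landing_xy) / hang_time_s


def is_routine(fielder_xy, landing_xy, hang_time_s, easy_speed=15.0):
    """True if the play needs no more than the (hypothetical) easy_speed."""
    return required_speed(fielder_xy, landing_xy, hang_time_s) <= easy_speed


cf_start = (0.0, 300.0)     # assumed average CF position for this batter
landing = (30.0, 340.0)     # 50 feet away from cf_start

print(required_speed(cf_start, landing, 5.0))   # lazy fly: 10 ft/s
print(is_routine(cf_start, landing, 5.0))       # True
print(is_routine(cf_start, landing, 2.5))       # liner to the same spot: False
```

The same landing spot is a routine play or a tough one depending only on hang time, which is the piece the current data lacks.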

So why are our current systems so imperfect?  Only due to the limitations of the data.  The locations are not very good, even though they seem like they should be, and the lack of “hang time” on batted balls is simply inexcusable, and critical of course.  Calling something a hard line drive or a soft fly ball or fliner just doesn’t cut it.

So that is what we need for a perfect system or at least a near-perfect one.  So given the limitations of our data, what is the best way to crunch the data such that we simply determine how often each of usually only 2 fielders fields a ball in any given bucket?  That is all we need to know! It really should be that simple.

For example, on a lazy fly ball to right center field that gets caught, what is the absolute correct way to handle that piece of data?  Well, someone catches that ball 100% of the time (less a few tenths of a percent for wind driven balls or where they call each other off and let it drop in, and errors of course), so no one gets any credit for that catch or not making that catch.

So from the data, how do we know that a ball was a “lazy fly ball?” As I said, if a bucket has a 100% catch rate, it is a good bet that it was a lazy fly ball.  That is why I don’t give anyone credit or demerits for a ball that is always caught!  You wouldn’t if you saw the ball. As several people have said many times, most balls are either caught all the time or none of the time. Those balls should be ignored.  The trick is to determine which ones they are from the data. Not so easy.  A lot easier if you see them with your own eyes.
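A crude way to pick out those buckets from the data alone is to compute each bucket's catch rate and keep only the ones that are neither (almost) always nor (almost) never caught. The sketch below assumes play-by-play records reduced to hypothetical `(bucket, was_caught)` pairs; the bucket labels, the sample counts, and the 0.02/0.98 cutoffs are invented for illustration, and a real version would also have to worry about small samples.

```python
from collections import Counter


def bucket_catch_rates(plays):
    """Catch rate per bucket from (bucket, was_caught) pairs."""
    seen, caught = Counter(), Counter()
    for bucket, was_caught in plays:
        seen[bucket] += 1
        if was_caught:
            caught[bucket] += 1
    return {bucket: caught[bucket] / seen[bucket] for bucket in seen}


def buckets_to_score(plays, low=0.02, high=0.98):
    """Buckets worth scoring: not always caught and not always a hit."""
    rates = bucket_catch_rates(plays)
    return sorted(b for b, rate in rates.items() if low < rate < high)


# Made-up sample: one always-caught bucket, one mixed, one never caught.
plays = (
    [("lazy fly, right-center", True)] * 200
    + [("gap liner", True)] * 40
    + [("gap liner", False)] * 60
    + [("off the wall", False)] * 100
)

print(bucket_catch_rates(plays))
print(buckets_to_score(plays))   # ['gap liner']
```

A fixed cutoff like this cannot tell a truly routine bucket from one that just happens to have a high catch rate in a small sample, which is part of why it is not so easy.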

I’ll leave it at that for now…

