THE BOOK--Playing The Percentages In Baseball

Thursday, September 02, 2010

The two uncertainties of UZR

By Tangotiger, 02:01 PM

Pat says:

UZR liked him for 31 runs above replacement. ... Still, UZR wasn’t alone; DRS said he was even better, at 32 runs saved, and Total Zone liked him for 27 runs. It was truly an incredible year for Guti.... Finally, the glove hasn’t played like it did last year. UZR still thinks he’s been good for 7.5 runs, while DRS is even more bullish with 16 runs…

It’s VERY POSSIBLE that Guti’s glove DID play exactly the same last year as it has this year.  That’s because UZR has an uncertainty in classifying the degree of difficulty of a play.

Then, there’s simply random variation.  For example, Pujols can be a true .440 wOBA hitter, and he swings and approached each PA exactly the same, and in the first 600 PA he has a .420 wOBA and in his next 600 PA he has a .460 wOBA, and this does NOT mean that Pujols played better.  He played the same, and good/bad luck explains the difference.

I’m not saying that this is what happened in Guti’s case, or any player’s case.  But, it’s important to understand that seeing a UZR number is not like seeing a wOBA number.  wOBA has one uncertainty (random variation), while UZR has a second uncertainty (classification of batted balls).
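As a rough sketch of what a second uncertainty does: if the random-variation error and the classification error are assumed to be independent, their variances add. The run values below are hypothetical, picked only for illustration, and are not estimates of UZR's actual error.

```python
import math

def combined_sd(sd_random: float, sd_classification: float = 0.0) -> float:
    """Standard deviation of the total error when two independent
    error sources are stacked (their variances add)."""
    return math.sqrt(sd_random ** 2 + sd_classification ** 2)

# Hypothetical values in runs per season (assumptions, not measurements).
one_source = combined_sd(sd_random=6.0)                           # wOBA-like
two_sources = combined_sd(sd_random=6.0, sd_classification=5.0)   # UZR-like

print(f"one uncertainty:   +/- {one_source:.1f} runs")
print(f"two uncertainties: +/- {two_sources:.1f} runs")
```

Under these assumed numbers the second source widens the error bar without changing what the player actually did.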

***

You also hear about how a park “played” like a hitter’s park, even though it is a pitcher’s park.  If players at PETCO for example generated more runs there than away, that doesn’t mean PETCO is now a hitter’s park.  It simply means that random variation reared its head (you flipped 10 straight heads).  It happens.


#1    David Cameron      (see all posts) 2010/09/02 (Thu) @ 15:08

I would bet that Guti will score worse on the FSR this year.  He simply hasn’t been as good as he was last year, which is fine, because last year he was crazy good.  He’s had a few tremendous catches this year, but last year they were nearly a weekly thing.  It has been easily noticeable watching the team that he hasn’t had the same kind of impact defensively this year. 

It’s probably more opportunity than lack of skill - just fewer line drives in the gap for him to run down and look amazing on.  But, either way, he’s made a lot less “oh my god” plays this year.


#2    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 15:21

The first line is 2010, and the second one is 2009:

4.8 4.7 4.3 4.7 4.2 3.9 4.1
4.8 4.8 4.6 4.7 4.2 3.9 4.0

Very close.  His Velocity/Sprint is down a tad, and the rest of his numbers are a match. 

Last year, he had a 4.60 rating (on a scale of 1-5) in CF (2nd in the league), and this year it’s 4.49 (3rd in the league).

***

My point is solely about using UZR to come to conclusions like what Pat did, and not to debate the merits of Guti in particular (I’m just using him as an illustration).  And, because of the uncertainty of classifying plays, you can’t come to a conclusion like Pat did based on using the fielding metrics.


#3    KJOK      (see all posts) 2010/09/02 (Thu) @ 15:38

"You also hear about how a park “played” like a hitter’s park, even though it is a pitcher’s park.  If players at PETCO for example generated more runs there than away, that doesn’t mean PETCO is now a hitter’s park.”

BUT, BUT, BUT it’s still true that it ‘played’ like a hitter’s park, in that runs were then ‘less valuable’ - it took more runs to create a win.  So, unless that quote was in context of specifically applying the ‘hitter’s’ park factor to some future prediction, it is a true statement.


#4    Jamie      (see all posts) 2010/09/02 (Thu) @ 15:44

aren’t these stats above average?  so while his glove could still be considered great, the baseline has been raised by the other centerfielders playing in the league?


#5    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 15:54

KJOK: the question is how much of the value to split to offense and how much to defense.  And, regardless of how it “played”, the baseline remains that of a pitcher’s park (albeit considering all data for all years), and therefore, if lots of runs were scored at PETCO this year, you give the lion’s share of the credit to the offense, and almost none to the defense.

If you were to treat PETCO suddenly as a hitter’s park in this illustration, you would try to split the credit between offense and defense equally.


#6    KJOK      (see all posts) 2010/09/02 (Thu) @ 16:39

Tom:

So, you’re saying that if in 2009 PETCO has 6 rpg scored on average, and then in 2010 PETCO has 10 rpg scored on average, and Adrian Gonzalez has a .400 wOBA both years, that his value would be the same both years, because we ‘know’ PETCO is a pitcher’s park?

Again, I can see that if you’re forecasting or trying to find Gonzalez’ ‘true’ talent, but I’m fairly certain that his .400 wOBA in 2009 was MUCH more valuable in terms of wins to the Padres than his .400 wOBA in 2010, when PETCO ‘played as a hitter’s park’.


#7    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 16:43

If we could forecast the hitter/pitcher-yness of a park based on characteristics of configuration and climate and whatnot, and had no need to get sample data from performance, such that we “know” that PETCO should get 7 RPG scored in it whereas all other parks get 9 RPG scored in them, then…

- if Adrian has a .400 wOBA in any year where PETCO is deemed to be a 7 RPG park, then yes, his value is identical in both seasons, regardless of how many runs were scored in that park


#8    Rally      (see all posts) 2010/09/02 (Thu) @ 16:52

"It’s probably more opportunity than lack of skill - just fewer line drives in the gap for him to run down and look amazing on.  But, either way, he’s made a lot less “oh my god” plays this year.”

He made 2 amazing catches last night.  In the games against the Angels I don’t recall him missing anything, or even failing to make an amazing play when the opportunity presented.  I don’t watch more than a small number of his games though.

Maybe it’s just about the opps.  The estimate from the field f/x guys is that 95%, or 85-90%, or whatever, of plays are ones that anyone will make or nobody will make.  It could be that in a given season, an outstanding CF might be given enough variable chances to be +35 if he’s lucky, but if he’s unlucky and has fewer of those variable chances, he might only have the opportunity to be +15, even while catching the same number of balls overall and not having any change in ability.

If we went through a stretch where half of the balls hit to right were high lazy flies to medium center, and the other half were line drives just over the head of the infielder or right down the chalkline, both Adam Dunn and Ichiro would rank as average fielders.


#9    MGL      (see all posts) 2010/09/02 (Thu) @ 17:24

Right, there are many ways for the numbers to be different than a player’s true talent, which is why I don’t get bogged down in them.  I simply do a mental regression and that is the end of it. I don’t care if Guti’s +35 last year was actually playing great defense or some mis-classified batted balls, or both (likely a little of both).  I don’t concern myself with that.  If that was all the data I had on Guti, then I would probably assume he was a true +15 and that is the end of it.  Now, in 2010, if he is a +7.5, that is reasonably close to what I expected (+15). And of course, now his true talent estimate is maybe +10 (or whatever) rather than the +15 I thought it was.  Our estimate of a player’s true talent in any category is always a work in progress - a moving average.  Again, I don’t concern myself with how or why it is +7.5 this year and was +35 last year (unless a player is or was injured or something like that - even then, an injured player is going to fluctuate randomly a lot as well).
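A minimal sketch of that kind of regression, assuming a simple shrinkage toward a league mean of zero. The function name and the prior weight are hypothetical: the weight of 4/3 "seasons" is chosen only so that a lone +35 regresses to +15 as in the comment, and 2010 is treated as a full season for simplicity. This is not the actual method behind any published projection.

```python
def true_talent_estimate(season_runs, prior_weight=4 / 3, league_mean=0.0):
    """Shrink observed seasonal run totals toward the league mean.

    prior_weight is expressed in seasons of league-average data
    (an assumed value, tuned only to match the +35 -> +15 example).
    """
    n = len(season_runs)
    total = sum(season_runs) + prior_weight * league_mean
    return total / (n + prior_weight)

after_2009 = true_talent_estimate([35.0])        # one season observed
after_2010 = true_talent_estimate([35.0, 7.5])   # estimate moves down

print(f"after 2009: {after_2009:+.1f}")
print(f"after 2010: {after_2010:+.1f}")
```

With these assumptions the estimate drops from about +15 to somewhere in the low teens, which is the "moving average" behavior described: each new season nudges the estimate rather than replacing it.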


#10    mettle      (see all posts) 2010/09/02 (Thu) @ 17:30

Re: Pujols, I guess it depends on what you mean by “exactly the same” in “he swings and approached each PA exactly the same” and the same goes for fielding and Gutierrez. Part of what we call ‘random variation’ is probably things like expecting fastball, getting curve, expecting LD, getting fly. Those are things that even out over time, but are part of approaching an AB.


#11    MGL      (see all posts) 2010/09/02 (Thu) @ 17:48

Something I was thinking about the other day with respect to the Mariners and defense:

Obviously the Mariners offense has stunk this year and many of their players have performed way below their projections, for whatever reasons or none at all.

However, there is a PR danger in constructing a team with defense in mind, if you are ultimately not successful in the win/loss column.

Let’s say that you don’t have a large payroll and you are trying to put together an average lineup, which would be quite admirable given your payroll limitations.  If you have an average defensive team or even below average, your lineup’s offensive stats are likely going to look decent or even pretty good, and you will be praised for putting together such a good lineup on a limited budget, even if you are not doing so well in the w/l department.

But, let’s say you decide to look for good defensive players, and you put together that same average lineup but 30 or 40 runs (per 150) is tied up in defense.  What do you think the public and the media are going to say if you are not winning games?  They are going to look at your lineup’s offensive stats (and your RPG) and criticize you for putting together such a lousy lineup.  Even if your team UZR (or DEF) is good, not too many people will know or care about that.  It will simply look like a bad lineup.

Basically what is happening is that the defense of your lineup shows up in the pitching stats!  Which is why to some extent defense will be an undervalued commodity for some time hence.  That is one of the issues with the M’s this year.  Their team UZR is not great, but it is still +11 runs, which prorates to around +13 runs for the full season.  And it is likely better than that in true talent (if you look at each player’s UZR projection prorated for their playing time).  So really, the M’s lineup is not quite as bad (overall) as most people think it is. And of course, their pitching is worse than most people think it is…


#12    Tangotiger      (see all posts) 2010/09/02 (Thu) @ 19:29

I’m going to move posts 12 to 15 to the other thread David linked to.

This thread should be reserved to the “plays like” concept.


#13    Phil D      (see all posts) 2010/09/02 (Thu) @ 20:05

MGL/11

This year’s Oakland A’s are a prime example of that.


#14    Pat Andriola      (see all posts) 2010/09/02 (Thu) @ 23:32

I think you are overextending my point from the phrase I used. I understand the inherent biases in UZR; I’ve written about them before. The difference is that when I see universally outstanding 2009 reports from UZR, DRS, and TZ, and then more human scores for 2010 from the same group of stats, saying his glove hasn’t played like it did last year is a fair argument. I haven’t made any sweeping generalization about his true talent level, just the relatively smaller sample size that is this season thus far. Sure, I could put an asterisk and note that there is a chance due to error and whatnot that my statement isn’t necessarily 100% accurate, but we live in probabilities. Besides, the post was specific to value via WAR, which uses UZR, so according to UZR his glove really wasn’t the same, and that’s all that’s germane.


#15    MGL      (see all posts) 2010/09/03 (Fri) @ 00:10

What Tango is trying to say, one of the things at least, is that UZR does NOT necessarily tell us how a player’s glove has been, irrespective of true talent.  And he is correct.  That is one of the ways in which it differs from offensive (and other) stats.  If a player has a .300 BA, regardless of what his true talent BA is, 30% of his AB have indeed been hits. Not so for UZR.  Just because a player’s UZR is +10 that does not mean that he has been good with the glove. In fact, it is possible that he has been awful with the glove and he just happened to have lots of batted ball opportunities that were misclassified by the data or the UZR engine, or by both.  Or some other reasons. For example, perhaps he happened to have some odd (though correct) positioning that the UZR engine did not infer properly.  Etc.

So basically, at the risk of being redundant, what Tango is saying, and he is correct, is that UZR does NOT necessarily tell us how well a player played defense. It only implies it.


#16    greenback      (see all posts) 2010/09/03 (Fri) @ 01:02

"that UZR does NOT necessarily tell us how well a player played defense”

You could say the same thing about offensive stats though. I don’t think anybody really knows if a given hitter has been given a disproportionate number of meatballs by opposing pitchers, although I guess that’s something pitch f/x could come close to answering.


#17    MGL      (see all posts) 2010/09/03 (Fri) @ 09:06

#16, right that is a similar thing.  Even though a batter may have gotten a hit, it might have been on a mistake pitch right down the middle or it might have been a soft ground ball that snuck through the infield.  But the difference is that the offensive stat likely measured what actually “happened” which is that the batter got the single, even though it might not have been much to his credit. We do, however, say that a batter “hit well” when he gets lots of hits, regardless of how those hits occurred.  Interestingly, if a fielder catches a lot of balls but they were all easy, we don’t say that he “fielded well.” Funny how that is.

I agree with you that one of the (many) things we will eventually use pitch f/x for is to adjust batter performance to account for the types of pitches he gets, sort of like a “pitcher factor.”


#18    Tangotiger      (see all posts) 2010/09/03 (Fri) @ 10:18

The difference is that the batter himself influences the kinds of pitches he gets.  A slap hitter will see far more fastballs (and down the middle) than Pujols would.


#19    mettle      (see all posts) 2010/09/03 (Fri) @ 10:40

15-17: Is FLD% the closest we have to BA for fielders if we pretend scorers use some more objective way of scoring an E?
Do you think there is even a conceivable way to get a number that reflects what a fielder did in the same way?


#20    MGL      (see all posts) 2010/09/03 (Fri) @ 17:57

mettle, right. What you are wondering is pretty much what simple ZR does.  It defines a pre-determined area for each fielder.  If a ball is caught by that fielder in that area, he gets credit for an out. If not, he doesn’t.
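A minimal sketch of that idea, assuming a rectangular zone and made-up field coordinates. The zone boundaries, the `BattedBall` type, and the sample data are all hypothetical, and balls outside the zone are simply ignored here, which real ZR implementations may handle differently.

```python
from dataclasses import dataclass

@dataclass
class BattedBall:
    x: float       # hypothetical field coordinates, in feet
    y: float
    caught: bool   # True if this fielder turned the ball into an out

# Hypothetical pre-determined zone: (x_min, x_max, y_min, y_max).
ZONE = (-50.0, 50.0, 280.0, 400.0)

def in_zone(ball, zone=ZONE):
    x_min, x_max, y_min, y_max = zone
    return x_min <= ball.x <= x_max and y_min <= ball.y <= y_max

def zone_rating(balls, zone=ZONE):
    """Outs made in the zone divided by balls hit into the zone."""
    chances = [b for b in balls if in_zone(b, zone)]
    if not chances:
        return None
    return sum(b.caught for b in chances) / len(chances)

balls = [
    BattedBall(0, 300, True),
    BattedBall(30, 390, False),
    BattedBall(-40, 350, True),
    BattedBall(120, 330, True),   # outside the zone, so not counted
]
zr = zone_rating(balls)
print(f"zone rating: {zr:.3f}")
```

Note that the difficulty of each play never enters the calculation: a ball in the zone is a chance, and it is either an out or it is not.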


#21    Brian Cartwright      (see all posts) 2010/09/04 (Sat) @ 00:54

MGL 17 - totally agree with your analysis here, but I don’t think defense has to be that way.

I first started doing defense about thirty years ago after reading Bill James describe DER (defense efficiency ratio). It was simple - count the number of batted balls on the field, regardless of the location, and find what percent of them were made into outs. Out or not. The not outs can then be divided into hits or errors, but either way, the batter reached base.

My one enhancement was to add which fielder was responsible for each batted ball. At that time, we didn’t yet have Project Scoresheet or Retrosheet, but I was a summer league statistician who hired and trained the official scorers for each game, so I had them provide me with the appropriately recorded play by play.

While Jones was playing left field, there were 100 balls hit to left, 70 outs, 29 hits and 1 error, a .700 individual DER. I reported defense stats as the numbers of opportunities, hits, reached on errors, and other errors, plus DPs for infielders. I figured there would be a much higher degree of agreement between observers on whether the opportunity belonged to 3B or SS than if a ball was a hit or an error. We can make the same arguments that MGL has in #17, some balls are too hard, some are too soft, but the record is the record. In a large enough sample, it would even out.
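The Jones line can be written out as a small sketch. The `FielderLine` type and its field names are hypothetical; only the counts (70 outs, 29 hits, 1 error) come from the example above.

```python
from dataclasses import dataclass

@dataclass
class FielderLine:
    outs: int
    hits: int
    errors: int   # batter reached on an error

    @property
    def opportunities(self) -> int:
        # Every ball assigned to the fielder counts, wherever it was hit.
        return self.outs + self.hits + self.errors

    @property
    def der(self) -> float:
        # Share of the fielder's opportunities that became outs.
        return self.outs / self.opportunities

jones = FielderLine(outs=70, hits=29, errors=1)
print(f"{jones.opportunities} opportunities, DER = {jones.der:.3f}")
```

Splitting the non-outs into hits and errors changes nothing in the ratio, which is the point of the design: the only judgment call left is which fielder the ball belonged to.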

The difficulty comes when we want to take these raw stats and put them in context, comparing them to a baseline and calculating plus/minus and run values.

Back to Gutierrez, I have him at +12, +18, +25, +12 for each of the past four seasons. We have measured year to year random variations in batting and pitching stats. We can accept that a hitter may have a .300 batting average one year and .260 the next. We have to measure and convey those same variations on defense and hopefully avoid having the users of the data question the validity of any defensive metrics. As Tango has said concerning projections, measure and then know the uncertainties of the source data and the random variations.
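To put a rough number on the batting average swing, here is a sketch that treats every at-bat as an independent trial with the same hit probability, which is a simplifying assumption. The .280 true average and 550 at-bats are hypothetical values chosen so that .300 and .260 sit on either side.

```python
import math

def batting_average_sd(true_avg: float, at_bats: int) -> float:
    """Binomial standard deviation of an observed batting average."""
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)

true_avg = 0.280   # hypothetical true talent
at_bats = 550      # hypothetical full season
sd = batting_average_sd(true_avg, at_bats)

print(f"one SD over {at_bats} AB: {sd:.3f}")
print(f".300 is {(0.300 - true_avg) / sd:.1f} SD above the true average")
```

Under those assumptions a .300 season and a .260 season are each about one standard deviation from the same true talent, so the swing needs no change in skill to explain it.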


#22          (see all posts) 2010/09/04 (Sat) @ 03:45

"wOBA has one uncertainty (random variation), while UZR has a second uncertainty (classification of batted balls). “

This is untrue.  wOBA has a huge uncertainty in that its run values for each event are the same for every team, lineup position, and park (in practical usage anyways).  And the park adjustments, when used, have huge uncertainties when applied equally to individual hitters with vastly different hit charts.

While I agree classification of batted balls is an uncertainty, has anyone bothered to estimate what that might be?  And is that uncertainty not simply a random process, assuming the interns have no particular bias?

They are equally likely to misclassify a batted ball on the plus side as well as the negative side. Just like an official scorer may rule an error as a hit or vice versa, and an umpire may call a runner out at 1B rather than safe.

I personally don’t give a darn about trying to measure a player’s true ability, which is uncertain and which with defense means having to wait 3 years, by which time the number means little given the player is 3 years older.

I care mainly about what the player has done in a given season.  In other words, how much has a player contributed to his team’s run production or run prevention?  A walk which does not help create a run is worth nothing. An RBI hit or walk is always worth something.

And if a fielder is lucky in being able to field more balls than his peers because of his positioning or good luck, then he has had a good year in preventing runs, even if it does not mean he is as good in terms of ability as the numbers say he was.

Now you say a metric that says Player A was 30 runs above average in 2009, and only 7 runs above average in 2010, that this is not evidence that he has been less productive in the field in 2010 due to uncertainty in the metric.  Then I must say such a metric may have little value after 3 years, since it implies an uncertainty of +/-10 runs in a given year.

Example.  Say player X has the following 3 year UZR

10 (0 to 20)
5 (-5 to 15)
15 (5 to 25)

This means his 3 year average is

10 (0 to 20)

So you can only say he was somewhere between a league average 3Bman and a Gold Glove 3Bman.  How helpful is that?
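For comparison, a sketch of how the range on a multi-year average behaves under one specific assumption: that each season's error is independent and the same size. In that case the half-width shrinks with the square root of the number of seasons rather than staying at +/-10. The function name is hypothetical, and any error that is systematic from year to year would not shrink this way.

```python
import math

def average_with_interval(values, half_width):
    """Mean of the seasons plus the range on that mean, assuming
    independent, equal-sized errors on each season (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    shrunk = half_width / math.sqrt(n)
    return mean, mean - shrunk, mean + shrunk

mean, low, high = average_with_interval([10, 5, 15], half_width=10)
print(f"{mean:.1f} ({low:.1f} to {high:.1f})")
```

With three seasons at +/-10 each, the range on the average comes out closer to +/-6 under the independence assumption.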


#23    MGL      (see all posts) 2010/09/04 (Sat) @ 10:10

Brian, what you are describing is simply ZR, with floating zones based on observation - which is fine.


#24    Brian Cartwright      (see all posts) 2010/09/04 (Sat) @ 10:33

mgl, yes, and I was doing it ten years before ZR.

My point was that we are capable of describing defense in factual terms as you described offense in #15. That’s where I started back then and since have added WOWY to account for park factors, batted ball types, etc.

We are currently suffering a crisis in confidence of the users of defensive metrics, and we need to be looking for ways to regain that trust. Colin is trying to step way back and see how well he can do with the most basic assumptions. I’ve just described my general method. The users might not need to know the start to finish process of making the sausage, but at least know we’re not just pulling numbers out of our butts.


#25    MGL      (see all posts) 2010/09/04 (Sat) @ 19:30

It is just that UZR or Dewan or PMR is MUCH better than just assigning responsibility to an out or a hit (or error).  Is it as transparent or as easy to understand or reconcile (or believe)?  No.  Do they converge in the long run?  Yes.

For me, I could not care less what the public thinks of a metric.  I am interested in making it as accurate as possible.  If a fielder does or does not make an out on an easy or a hard play, I want the metric to reflect that (how hard the play was to make).  Not that he just made the play or he didn’t, regardless of how easy or difficult the play was.  That seems an awfully gross way of measuring fielding.  Which it is of course.


#26    Brian Cartwright      (see all posts) 2010/09/04 (Sat) @ 19:51

We each have our ideas on how to make the metric as accurate as possible, of course that is the goal.

But I am concerned with the current atmosphere in the consuming public. Jinaz told me the other day about the hate he was getting at BtB for suggesting that fielding metrics have any validity. We know that’s not true, but to some degree we have to educate the people who read our metrics that they do have value, do represent the truth, but within limits. It’s accepted that batting and pitching stats for a player can jump around from one season to the next. Explain that it’s also true for fielding, and to what degree.

In the middle of each of us posting our thoughts, I am running some queries to display the data in some new formats, and I’m getting insights into my own process that I didn’t have before. So I’m still learning, and I like that.


#27          (see all posts) 2010/09/05 (Sun) @ 02:24

"I care mainly about what the player has done in a given season.  In other words, how much has a player contributed to his team’s run production or run prevention.  A walk which does not help create a run is worth nothing. An RBI hit or walk is always worth something.”

Where do we draw the line?  If an RBI comes in a loss is it still worth something?  If the RBI comes in a game that is won by more than 1 run, is it still worth something?  If a win comes but the team fails to make the postseason, was the win worth something?  Should we only count runs produced in wins of 1-run games by playoff teams (affectionately known as RPIWOORGBPT) as worth something?

If that’s what you want that is fine, but let’s try to be consistent.


#28    MGL      (see all posts) 2010/09/05 (Sun) @ 18:44

”...but to some degree we have to educate the people who read our metrics that they do have value, do represent the truth, but within limits.”

Maybe you do, but I don’t.  As I said, I don’t care at all whether the public likes or believes my metrics.

#27, right, I have argued that for years. Which is why there is NO right answer for an MVP award or any other award.

One person might think that a run or a hit in a loss is worthless.  Some people think it is worth the same as in a win.  Some people think that a hit that does not score a run is worth the same as one that does. Etc.  It is all a matter of perspective and personal taste.

To me, the idea that WAR or some such thing should be the metric of choice for an award is patently ridiculous. I’m not saying that that is wrong.  I am saying that there is no more reason to use WAR than there is to use RBI or even wins for a pitcher.  Or Win Shares.  Or whatever you want that makes some reasonable sense.

