THE BOOK--Playing The Percentages In Baseball

Wednesday, October 28, 2009

Teixeira, take 2

By Tangotiger, 03:31 PM

I don’t think Sean is accounting for the measurement error, which I talked about in the 2008 THT Annual.  We simply don’t have enough parameters being tracked to distinguish between a play that should be marked as .94 outs, but is actually interpreted as .97 or .90, depending on what parameters are being included.  Obviously, sample size is our friend.  But, a few hundred plays is not our friend.

If we go by observers, it’s no contest: Teix is one of the best fielding 1B, and Ryan Howard is average.


#1    Tangotiger      2009/10/28 (Wed) @ 16:24

My comment at Primer:

Ron is 100% right about the precision. And MGL has said as much as well. I think it’s more than ridiculous to quote something as 1.4 runs.

If you ask me, or ask MGL, this question:
“Which is a better indicator of his performance in 2009: his UZR in 2009 or his UZR for his career?”

I will presume that MGL will answer the latter, and I will wholeheartedly answer the latter. I know, it sounds ridiculous.

This has everything to do with the lack of parameters needed to distinguish a .92 out play from a .97 out play. Now, you may think “well, a .05 outs per play error… big deal”. Well, if you have 400 such plays, you’ve got 20 runs. And that’s the difference between the best fielder and an average one. More reasonably, we have, say, a .10 outs per play error on half the plays. You still get the same problem.

It’s not as big as that. It’s close to what Ron is saying (measurement error of 5 to 10 runs). But, that’s the problem.
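
A minimal sketch of the back-of-envelope arithmetic in this comment, under loud assumptions: every misjudged play errs in the same direction (a worst case, which is why the comment then says the real figure is smaller), and the total is counted in outs, which the comment treats loosely as runs. The function name `total_error` is hypothetical.

```python
# Worst-case accumulation of a per-play measurement error.
# Assumption: all errors share the same sign, so nothing cancels out.

def total_error(plays: int, error_per_play: float, share_affected: float = 1.0) -> float:
    """Accumulated error, in outs, when a share of plays is misjudged by error_per_play."""
    return plays * share_affected * error_per_play

# .05 outs of error on each of 400 plays
every_play = total_error(400, 0.05)

# .10 outs of error on only half of the 400 plays
half_plays = total_error(400, 0.10, share_affected=0.5)

print(round(every_play, 1), round(half_plays, 1))
```

Both scenarios accumulate to the same total, which is the point of "you still get the same problem". With errors that partly cancel, the total would be smaller.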

Teix is one of the best fielding 1B in baseball. That’s what the Fans consistently say on my site every year. If UZR misses that, then that’s one of its misses. It’s going to be wrong 20% of the time. Teix is one of those misses.

The Fans also were blind to Junior for the longest while. They finally accepted his lack of range years after UZR saw the lack in range. The Fans are going to be wrong 20% of the time too.


#2    Guy      2009/10/28 (Wed) @ 17:23

Let me take my life in my hands here and raise a few questions about UZR.  According to UZR, Philly was 47 runs better than NY on defense.  But NY had a DER of .699, a shade better than Phil’s .695.  UZR clearly includes more than just whether BIP become outs, but that’s the central element.  So I assume UZR is telling us the Phi. fielders faced much tougher chances.  How much confidence should we have in one year of UZR, vs. the team’s total DER?  Which do we think is closer to the true story?  And what if the discrepancy continued for another season? 

And I’m very skeptical of the claim that Jeter has suddenly improved the last two seasons.  UZR says he has.  But as best I can tell, Jeter isn’t actually making more outs than he ever did before.  What’s changed is UZR’s lowered expectations—Jeter’s expected outs:
2006 369
2007 357
2008 337
2009 309

Let’s compare 2009 Jeter to career Jeter and the average SS. 

Assists/9:
2009 Jeter:  2.42
Career Jeter: 2.61
Avg SS:  2.82

DP/9
2009 Jeter:  .48
Career Jeter: .55
Avg SS:  .64

PO/9
2009 Jeter:  1.47
Career Jeter: 1.54
Avg SS:  1.54

In each case, Jeter made fewer outs this year than over his career (a very low standard), and except for putouts he remains well below league average. 
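
To put the gaps above on a common scale, here is a small sketch that turns the quoted per-nine-inning rates into percentage differences. The numbers are the ones listed in this comment; the dictionary keys and the helper `pct_diff` are illustrative names only.

```python
# Per-nine-inning rates as quoted in the comment.
rates = {
    "Assists/9": {"jeter_2009": 2.42, "jeter_career": 2.61, "avg_ss": 2.82},
    "DP/9":      {"jeter_2009": 0.48, "jeter_career": 0.55, "avg_ss": 0.64},
    "PO/9":      {"jeter_2009": 1.47, "jeter_career": 1.54, "avg_ss": 1.54},
}

def pct_diff(value: float, baseline: float) -> float:
    """Percentage difference of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline

# For each stat: (2009 vs. career, 2009 vs. average shortstop)
gaps = {
    stat: (pct_diff(r["jeter_2009"], r["jeter_career"]),
           pct_diff(r["jeter_2009"], r["avg_ss"]))
    for stat, r in rates.items()
}

for stat, (vs_career, vs_avg) in gaps.items():
    print(f"{stat:10s} vs career {vs_career:+.1f}%   vs avg SS {vs_avg:+.1f}%")
```

Every gap comes out negative. The putout gap is the smallest, and the double-play gap against the average shortstop is the largest, at roughly a quarter.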

Now, I understand that different pitchers, and pure chance, can change a fielder’s opportunities.  But when a player keeps making the same # of outs he always has (or fewer), but a system claims he’s getting better (at an age few players improve), I’m going to be skeptical.  On top of that, we know Jeter’s UZR ratings changed dramatically when the data source switch was made.  So I confess my confidence in the UZR ratings is not what it once was.  But I assume someone here will set me straight.


#3    Anthony      2009/10/28 (Wed) @ 17:43

Given the fact that the Yankees added a bunch of strikeout pitchers this year, it would make sense for Jeter to have fewer opportunities.


#4    MGL      2009/10/28 (Wed) @ 18:17

I’ve said this many times before, and with all due respect to Sean, any discussion about talent or that implies a projection that starts with “Let’s look at this year’s stats,” I want no part of.  None, whatsoever.  Not even if the person presenting the data mentions previous years in the discussion.  Once they start showing us charts of this year’s stats, I stop reading. I mean if, for example, Tex has a career UZR of +5 per 150, why would you even WANT to show us his one year of -2.4 if you are discussing his true talent defense?  Isn’t that being misleading, disingenuous, or almost lying to our face?  I can understand if you don’t know any prior history or you don’t understand why you use more than one year of data (plus regression) if you want to talk about “talent,” but surely Sean is not in that category of persons.  I’m sorry, but I would pay no mind whatsoever to an article or analysis like this, other than perhaps to point out how weak it is.

You don’t start with one year stats, particularly with defensive metrics, and then talk about prior years as if it is an afterthought or a qualifier. You start with multi-year stats, then you regress, and then you go from there (then, and only then, can you talk about this year’s stats and what it might mean if they are different than the career numbers).  And since when is UZR spelled U.Z.R.?  I’ve never seen that before in my life.

Guy, even though DER seems like a clean way to assess defense on a team level, I really don’t think you can take it seriously if you know team UZR.  I could be wrong, but I don’t think you can.  DER is such a crude stat.  If it were at least split up into GB and FB DER or it did something with SLG percentage, then you might be able to take it seriously.  But just BABIP (which is what DER is)?  It’s crap.

As far as Jeter’s numbers go, who knows what his exact true talent level is now or before.  Like any other record of a performance, numbers bounce around all over the place over time, and if your statistical record is a reasonable reflection of the talent you are trying to capture and you have few biases in the data or methodology, then the running average will tend to converge on that player’s average talent over whatever time period you are recording.  If a player’s true talent changes by more than a little over time, you kind of get screwed, because you will never have a large enough sample to be able to estimate that true talent at any point in time with any certainty.  That’s the bottom line with all stats.  I simply don’t concern myself with stats jumping around for a player and I don’t worry about whether they used to be bad and now they are good or they used to be good and now they are bad, or any possible permutation of good/bad.  At any point in time, my best guess as to their talent is simply the weighted and regressed stats, regardless of whether they have been consistent or all over the board if we happen to break them down into time periods (which I don’t even like to do other than for weighting purposes).  I have no idea whether Jeter is in fact better now than he was before (which is not that likely because of aging, but is possible) or we simply underrated his defense before, or if we overrate it now, or again, any combination thereof that would “explain” why his UZR went from good to awful to decent.  Frankly, again, I don’t care a hoot. Not one iota.  I’ll stick with his weighted average regressed as my current estimate of his true talent on defense, as I do with every other player, and I’ll let everyone argue as to what he REALLY is or why his or any other player’s stats have been all “over the place” or not.  I don’t care. I really don’t.

And BTW, when I say “regressed,” I mean towards some scouting report (and/or other data that might reflect defense, such as a speed score), such as that of the Fans, if it is available and if I think it is accurate.


#5    JK      2009/10/28 (Wed) @ 19:28

When doing a weighted average, should you use UZR/150 or the raw numbers?  How much weight do you give to each year?  How much weight do you give to scouting reports?  Thanks.


#6    King Yao      2009/10/28 (Wed) @ 21:30

MGL: “his weighted average regressed as my current estimate of his true talent on defense.”

I’m not sure why you are so adamant about not caring, but it does seem you have an estimate.  Whether you care emotionally or not, I don’t care :) ... but you do have an opinion in the argument.


#7    MGL      2009/10/29 (Thu) @ 01:10

I am adamant about not caring for the following reason:

Let’s say that there is some information out there that adds 5% to my conclusion.  Not much, but a little.  And let’s also say that I am not an entirely rational person and because of that, if I avail myself of that information, I am going to lose 10% of my conclusion because I may process that information irrationally.

That is the reason.

JK, I don’t have a number for how much to weight a scouting report.  Since I want to give it more weight the less data I have, a good rule of thumb is to merely use the scouting report as the number to regress to.  To do that, you have to attach a number to the scouting report.  That is obviously going to be a subjective ballpark number.

It doesn’t matter whether you use UZR/150 or UZR period, as long as you do the math properly.

Say, you have 3 years of data:

year 1 100 games +10 per 150
year 2 140 games +3 per 150
year 3 120 games 0 per 150

Scouting thinks player is a little below average.

Say we use a weighting of 3/4/5.

We have a weighted average of:

5 * 120 * 0 plus
4 * 140 * 3 plus
3 * 100 * 10

divided by (5 *120 + 4 * 140 + 3 * 100)

= 3.21 runs per 150.  That is our 3-year weighted average using the above data.

We have 360 games, but we have to discount them for the weighting (because of the weighting, we have a smaller effective sample size).

We discount them the same way we weight them, but in reverse.

100 + 140 * 3/4 + 120 * 3/5 = 277

Now, we regress the 3.21 weighted UZR/150, using 277 games as our sample size.  Using a rough rule of thumb of regressing 50% per 150 games, we regress 150/(150+x), where x is the sample size in games, which is 150/(150+277), or 35% towards the scouting report.

If we call “a little below average” -3, we regress +3.21 35% toward -3, which is:

3.21 - (3.21 - -3) * .35, or

3.21 - (6.21) * .35, or

3.21 - 2.17, or

1.04.

Voila!
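
The steps in this worked example (weight, discount, regress) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the discount is read as "games times the smallest weight divided by that season's weight" to match the 100 + 140 * 3/4 + 120 * 3/5 line, and the 150-game constant and the -3 scouting value are the ballpark figures from the comment.

```python
# Sketch of the weighted-average / effective-sample / regression steps.
# Each season is (weight, games, uzr_per_150), listed oldest first.

def weighted_rate(seasons):
    """Weighted UZR/150, weighting each season by weight * games."""
    numerator = sum(w * g * r for w, g, r in seasons)
    denominator = sum(w * g for w, g, _ in seasons)
    return numerator / denominator

def effective_games(seasons):
    """Discounted sample size: games * (smallest weight / season weight).

    Assumption: this generalizes the 3/4 and 3/5 factors in the example.
    """
    base = min(w for w, _, _ in seasons)
    return sum(g * base / w for w, g, _ in seasons)

def regress(rate, games, prior, half_point=150):
    """Move rate toward prior by half_point / (half_point + games)."""
    amount = half_point / (half_point + games)
    return rate - (rate - prior) * amount, amount

seasons = [(3, 100, 10.0), (4, 140, 3.0), (5, 120, 0.0)]
rate = weighted_rate(seasons)            # about 3.21
games = effective_games(seasons)         # 277
estimate, amount = regress(rate, games, prior=-3.0)

print(f"{rate:.2f} per 150, {games:.0f} games, regress {amount:.0%}, estimate {estimate:.2f}")
```

Carrying full precision through every step gives an estimate of about 1.03 rather than 1.04; the small gap comes from rounding 3.21 and 35% at the intermediate steps in the hand calculation.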


#8    Colin Wyers      2009/10/29 (Thu) @ 11:34

“Guy, even though DER seems like a clean way to assess defense on a team level, I really don’t think you can take it seriously if you know team UZR.  I could be wrong, but I don’t think you can.  DER is such a crude stat.  If it were at least split up into GB and FB DER or it did something with SLG percentage, then you might be able to take it seriously.  But just BABIP (which is what DER is)?  It’s crap.”

In this case the crudeness of DER/BABIP is, I think, its key feature. What we, after all, are trying to accomplish with zone-like defensive metrics is an individual player’s DER (or at least something akin to that).

And so we build a lot of complexities into our individual player defensive metrics to handle the issue of splitting credit for balls in play. At the team level, we don’t really care about that so much - for a ground ball up the left field side, in UZR we care very deeply about assigning responsibility to the 3B or SS. If we only want to measure TEAM defense, we don’t care so much.

It’s possible to refine DER further (ground ball and air ball, for instance, as you suggest). But at the team level a lot of what’s done with UZR is simply unnecessary to measure team defense.


#9    Guy      2009/10/29 (Thu) @ 11:54

I agree that DER is crude.  I suppose what you want is something like wOBA against on BIP.  Adjust for park, and maybe adjust for the distribution of GB/LD/FB (or just GB and non-GB, if you distrust the LD%).  I don’t remember if wOBA includes DPs, but if not I would incorporate DPs.  That should be a pretty good measure of team defense (although you still have a small potential impact by pitching staff).

My question is then, which would better predict next year’s wOBA-BIP on a team with little turnover of fielders:  current UZR or current wOBA-BIP? And what if you used career UZRs—how much better does your prediction become? 

*

I’m going to backtrack a little on Jeter:  the last two years, NY pitchers have allowed about 3% fewer GBs (than 2006-07).  And they have faced about 3% fewer RHHs.  So Jeter’s opportunities could have declined quite a bit.  At the same time, I’ll be a lot more convinced of his alleged improvement if/when he starts making more outs for his team!


#10    Tangotiger      2009/10/29 (Thu) @ 12:10

Guy, when I run my WOWY, we’ll see what it gives us, since it controls for the identity of the pitchers and batters.  We’ll know for sure how much his playing environment changed (without knowing, however, if the number of actual balls hit his way actually changed).


#11    Steve Sommer      2009/10/29 (Thu) @ 13:41

I was trying to do something like MGL mentions in #7, only using the Fans’ scouting report in lieu of an actual scouting report.  While compiling UZR numbers from fangraphs I noticed that the defensive games number appears higher this year than years past.  Does anyone have any insight on that?  Thanks.


#12    MGL      2009/10/29 (Thu) @ 15:39

“While compiling UZR numbers from fangraphs I noticed that the defensive games number appears higher this year than years past.  Does anyone have any insight on that?  Thanks.”

I’m working on figuring that out with Dave Appelman of Fangraphs.  We’ll have an answer soon.

Colin, you are right, but really you should be referring to ZR and not UZR.  DER is better than adding up the individual ZR, for the reasons you say, but there is too much other information in UZR that is not included in DER to take DER seriously when you know team UZR.

And remember that UZR is NOT really a zone based system.  Fielders do not have areas assigned to them or not.  Their area of responsibility is the whole field.  And also remember that if a ball is caught, no one gets docked any runs in UZR (and if a ball falls for a hit, every fielding position that has ever fielded a ball in that bucket gets docked) unlike some other PBP systems, so in that sense, it is really a team level system.

Because UZR is using how hard the ball is hit, what kind of ball it was, the base runners and outs, the park, the handedness of the batters, the G/F ratio of the pitchers, and a few other things, and DER does not, you simply can’t compare the two, I don’t think. Of course, those things will tend to even out in the long run, so it may be that DER and UZR tend to converge and it may even be that DER (or a more rigorous version of it) surpasses team UZR at some point because some of the noise is eliminated in DER, as you point out.


#13    Steve Sommer      2009/11/05 (Thu) @ 18:32

Looks like they fixed the DG problem

http://www.fangraphs.com/blogs/index.php/uzr-update-dg

