
THE BOOK--Playing The Percentages In Baseball


Tuesday, January 05, 2010

If you have UZR and TZ, should you EVER use FRAA?

By Tangotiger, 12:20 PM

Steven in the BPro comments wrote:

Two things at work in the player comments as far as the fielding metrics: Christina and I encourage the writers to debate with the stats (all the stats) where appropriate. Second, I’m not really satisfied with any one fielding metric and consult a range of them, and I know the others do the same. Until a grand unified theory of fielding stats comes along, that is going to be an area where what the writers see and what the stats say aren’t going to match up every time.

There is nothing, nothing at all, that FRAA adds to the discussion, if you have UZR and TZ.  FRAA is a subset of these two.  It would be like using ZR and UZR.  ZR is a subset of UZR.  Once you have UZR, you no longer need ZR.  A similar analogy would be to still use OBP or SLG, if you have wOBA or EqA.  wOBA and EqA already combine OBP and SLG.  Therefore, you don’t need those two, unless you specifically need them for their individuality.  FRAA has no such individuality.

So, guys, please, stop using FRAA.  It adds nothing to the discussion. 

The question is TZ v UZR.  While accepting that UZR is better overall (because it uses more granular information), if there is something that TZ does differently from UZR, then that’s your case for using TZ.  That’s my case for WOWY: I actually look at the identity of the batters and pitchers, and I rely on the fact that over a few years, the stringer bias will overcome some of the signal.


#1    Rally    2010/01/05 (Tue) @ 14:00

The reason I use TZ in the WAR calculation is simple: It’s mine and I can use it.

Other than that, if it has value it’s because it’s a different source of data.  UZR now uses BIS data.  TZ uses Retrosheet, and they get their data, like BBtype classifications, from the MLB Gameday files.  I think zone rating still has value as well, because it’s the only currently available system using the STATS data.

We know that STATS and BIS data can be different.  I don’t think anyone has come close to telling us which is more accurate.  It would be a tough project, I think you’d need full access to both datasets and time to watch a lot of game replays to reconcile them.


#2    Tangotiger    2010/01/05 (Tue) @ 14:25

Rally, excellent point on the data source.


#3    Colin Wyers    2010/01/05 (Tue) @ 14:50

The data provider issue is interesting. Chris Dial and I went through Mark Teixeira’s STATS ZR rating and compared it with BIS-sourced data (like UZR and RZR). It’s hard to say for sure the source of all the disagreements, but STATS ZR has Tex as the top defensive first baseman in the AL last year, Chris tells me.

We also took a look at Retrosheet data - Tex didn’t have a lot of line drive snags inflating his ZR data, for instance. A quick conversion of THT’s RZR/OOZ data to a plus/minus system seems to agree pretty well with UZR, so I don’t think this is a case where the difference is in the extra adjustments UZR makes.

So it really comes down to - how much can you trust the data? And what can you do when you don’t trust the data all that much? Because that’s where I’m at right now.

And that really does tie into the original point here. I do have some ideas for ways to control for some of the data issues we see. I’m testing those ideas and working on applying them to a fully armed and operational play-by-play defensive metric, with hit location data. Once I’m done with that, there’s your new PBP metric for Baseball Prospectus.

Is FRAA going away at that point? Probably not entirely. For the Retroera, we can probably do a simpler PBP metric - I already have SZR, so it’s not like this is a particularly burdensome cross to bear.

But then you have the pre-Retroera. At that point, I don’t know that FRAA is particularly different from JAARF or Fielding Win Shares or what have you. Retrosheet is doing some fantastic work with the box score event files for the 20s and 30s and I have some ideas for how to turn that into a defensive metric, but that’s probably a little further off on the horizon.


#4    Tangotiger    2010/01/05 (Tue) @ 15:04

Colin, if you mean that FRAA can be used when talking about Ty Cobb, then, yes, that’s perfectly legitimate.  Otherwise, I don’t want to see FRAA used for anyone born since Mays/Mantle were born.

***

“And what can you do when you don’t trust the data all that much? Because that’s where I’m at right now.”

Right, that’s EXACTLY where I was with WOWY.  Once I’ve identified the hitter, pitcher, park, and base/out situation my next step was to look at the batted ball type (number of GB v airballs, and subdivide the airballs into FB, Pop, LD), and then furthermore into spray-slices. 

But a funny thing happened: my leaders based on purely factual information, on a career-level, made so much sense that I had to ask myself: do I *really* want to add subjective information?  Yes, Joel Pineiro becomes a GB machine in 2009.  I’m going to flub that one.  At the same time, is Roger Clemens’ spray pattern going to change much? 

The next factual piece of information I’d add is age, so that rather than looking at all of the “without”, I would limit myself to weighting the recent “withouts” more than the older withouts.  I’m not sure about that.

***

But, right, I hear you.  I think for 1-2 seasons, UZR does it the right way, even with all the subjective data.  For under 1 season, I prefer the Fans’ Scouting Report.  And for 6+ seasons, I prefer WOWY.

It all becomes a scale, based on the signal/noise ratios.  The Andruw Jones thing is what really shocked my foundation (he’s either +112 runs in 7 years, or he’s 0, depending on the data source).  That’s super-bothersome to me.

So, the Fans’ Scouting Report gets say a weight of “1” all the time.  UZR say starts at 0 and tops off at say 3 if you have 2 years of data, and then goes back down to 2 for a career.

And WOWY say starts at 0 and tops off at say 3 for a career.

The intersecting point for Fans/UZR would be say 1 year.  The intersecting point for Fans/WOWY might be say 2 years.  And the intersecting point for UZR/WOWY might be say 5 years.

Something like that.  If you graph it in your mind, you’ll see what I mean.
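For anyone who would rather see the graph than picture it, here is a rough sketch of that weighting scale in Python. Only the shape comes from the comment above: Fans fixed at 1, UZR rising to 3 at two years and settling at 2, WOWY rising to 3, and the crossings at 1, 2 and 5 years. The straight-line interpolation, the choice of 10 years as "a career", the value of 2.5 at the 5-year crossing, and the names `UZR_KNOTS`, `WOWY_KNOTS`, `weights` and `blend` are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical (years of data, weight) knots. The 10-year "career" endpoint
# and the 2.5 value at the 5-year crossing are assumptions, not from the post.
UZR_KNOTS = [(0, 0.0), (1, 1.0), (2, 3.0), (5, 2.5), (10, 2.0)]
WOWY_KNOTS = [(0, 0.0), (2, 1.0), (5, 2.5), (10, 3.0)]
FANS_WEIGHT = 1.0  # the Fans' Scouting Report gets a weight of 1 all the time


def interp(knots, x):
    """Piecewise-linear interpolation, held flat outside the knot range."""
    xs = [k[0] for k in knots]
    if x <= xs[0]:
        return knots[0][1]
    if x >= xs[-1]:
        return knots[-1][1]
    i = bisect_right(xs, x)
    (x0, y0), (x1, y1) = knots[i - 1], knots[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def weights(years):
    """Return (fans, uzr, wowy) weights for a given number of years of data."""
    return FANS_WEIGHT, interp(UZR_KNOTS, years), interp(WOWY_KNOTS, years)


def blend(years, fans_runs, uzr_runs, wowy_runs):
    """Weighted average of three per-season run estimates."""
    w = weights(years)
    estimates = (fans_runs, uzr_runs, wowy_runs)
    return sum(wi * e for wi, e in zip(w, estimates)) / sum(w)


if __name__ == "__main__":
    for y in (0, 0.5, 1, 2, 5, 10):
        print(y, weights(y))
```

With zero years of play-by-play data the blend is just the Fans number; by year 5 UZR and WOWY carry equal weight, and past that WOWY leads.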


#5    Rally    2010/01/05 (Tue) @ 15:24

“At that point, I don’t know that FRAA is particularly different from JAARF or Fielding Win Shares or what have you.”

I agree.  I have no idea which would give you a better picture if you’re trying to evaluate Cobb vs Speaker or something like that.  And I don’t think it would be worth the time to find out, or if such would even be possible.

Good luck on the new metric.


#6    Brian Cartwright    2010/01/05 (Tue) @ 16:19

I use a WOWY-based system, also borrowing from Fox’s SFR, utilizing Gameday data, including minor leagues. As with UZR and other pbp metrics, basically each ball is put in a bucket with its own expected probability of success. Create an expected value and compare it to the observed. With WOWY the expected value is how everyone else did in this situation (park, location, bathand, etc).
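The bucket idea can be sketched in a few lines of Python. This is a toy version, not the system described above: the `(fielder, bucket, out_made)` tuple layout, the bucket labels and the function name `plays_above_expected` are all hypothetical, and real buckets would be built from park, location, batter hand and so on.

```python
from collections import defaultdict


def plays_above_expected(plays, fielder):
    """Observed outs minus expected outs for one fielder.

    plays: iterable of (fielder_id, bucket, out_made) tuples (assumed layout).
    The expected rate in each bucket is how everyone *else* did in it.
    """
    mine = defaultdict(lambda: [0, 0])    # bucket -> [outs, chances]
    others = defaultdict(lambda: [0, 0])  # same, for all other fielders
    for who, bucket, out_made in plays:
        tally = mine if who == fielder else others
        tally[bucket][0] += int(out_made)
        tally[bucket][1] += 1

    total = 0.0
    for bucket, (outs, chances) in mine.items():
        other_outs, other_chances = others.get(bucket, (0, 0))
        if other_chances == 0:
            continue  # nobody to compare against in this bucket, so skip it
        total += outs - chances * other_outs / other_chances
    return total


if __name__ == "__main__":
    sample = [
        ("A", "gb_left", True), ("A", "gb_left", True),
        ("B", "gb_left", True), ("B", "gb_left", False),
        ("C", "gb_left", False), ("C", "gb_left", False),
    ]
    print(plays_above_expected(sample, "A"))
```

Leaving the fielder out of his own baseline is what makes this "with or without you"; a plain league-average baseline would include his own plays in the expected rate.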

In the Mike Silva fielding thread, I posted Teixeira’s numbers. My analysis had him as consistently a little above average on infield hits and errors each of the last four seasons, below average on preventing ground ball hits to the outfield in 2006-2007, great in 2008, slightly above average in 2009. When there are only 30 or 35 plays not made each season, it’s real hard to get the noise out. Maybe he had 5 more hot grounders in 2009 than 2008.

What I also showed is that when you take those seasonal results and Marcel them, things look a lot more like what is expected - my top 3 fulltime MLB 1b defense projections for 2010 are Pujols, Kotchman and Tex. Jeff Larish and Steve Pearce, who each split time between minors and majors, rated above Tex.


#7    Jamesian    2010/01/05 (Tue) @ 16:56

Here’s a question for you regarding WOWY. We had a little discussion about UZR on Fangraphs a while back and you compared Adam Everett to Royce Clayton, who I thought was a comparable defensive player as a younger player. In doing so, you determined that Everett was 32 runs per year better defensively than Clayton (or at least that was what I took away).

I’ve come to understand that Everett was a little better than I gave him credit for, but he doesn’t seem that much better than Clayton was. I’d still say Everett was closer to Clayton than Ozzie Smith.

Since then, I’ve discovered TZ numbers and Clayton in his 20s was about a +8 run defensive player whereas Everett was about a +15 run defensive player in his four years as an everyday player in his prime. Outside of an outlandish +30 run season, Everett averaged to +8 in the other years.

So Everett and a young Clayton seem pretty comparable. Everett was probably a notch above Clayton, although Clayton also had a four-year run as a +15 run defensive player.

But 32 runs different? What accounts for such a difference between the numbers?

The numbers you used might have been greatly reduced by Clayton’s play in his 30s, but even then he was a roughly average defensive player.


#8    Peter Jensen    2010/01/05 (Tue) @ 17:12

“The Andruw Jones thing is what really shocked my foundation (he’s either +112 runs in 7 years, or he’s 0, depending on the data source).  That’s super-bothersome to me.”

It should be bothersome to MGL, STATS and BIS as well.  It shouldn’t be difficult for MGL to compare the data from STATS and BIS and identify where the problem is.  From what MGL has written in various posts I am not entirely certain, and I am not sure that he is entirely certain, that UZR programming is exactly the same for STATS and BIS.  But whether it is biased data, or difference in programming, or a programming error, MGL is the only one who can actually find out and I wish he would.  The data providers also have an interest in finding out.  If it turns out to be even partially a UZR problem then they are being unfairly maligned.


#9    Tangotiger    2010/01/05 (Tue) @ 17:27

Well, I think I’m really the only one doing the maligning.  And MGL is operating under an implied as-is agreement with us.

That said, yes, for sure, I’d be highly interested in that.  Given your (Peter) work in the comparison of BIS and STATS, I think it’s fairly reasonable to believe that there is a bias somewhere.  Even just a straight ZR to Plus/Minus comparison would be good enough to start with to show that.  Convert ZR into plus/minus, and we’re on our way.
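A bare-bones version of that conversion might look like the Python below. It is a sketch of the simplest approach, plays made above what a league-average fielder would make on the same chances, and not the method anyone in this thread actually used. The function name `zr_to_plus_minus` and the default of 0.8 runs per play are assumptions; the run value in particular is a placeholder that would need to be set by position.

```python
def zr_to_plus_minus(zone_rating, chances, league_zone_rating, runs_per_play=0.8):
    """Convert a zone rating into plays and runs above average.

    zone_rating:        fielder's plays made / chances in zone
    chances:            fielder's balls in zone
    league_zone_rating: league rate at the same position
    runs_per_play:      assumed run value of one extra play (placeholder)
    """
    plays_above_avg = (zone_rating - league_zone_rating) * chances
    return plays_above_avg, plays_above_avg * runs_per_play


if __name__ == "__main__":
    plays, runs = zr_to_plus_minus(0.850, 400, 0.800)
    print(round(plays, 1), round(runs, 1))
```

Run the same conversion on a STATS-based ZR and a BIS-based RZR for the same player-seasons, and the gap between the two plus/minus figures is a first look at how much the data sources disagree.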


#10    Peter Jensen    2010/01/05 (Tue) @ 17:33

It will be interesting to see when Colin gets his new PBP+Hit Location metric up and working whether the player ratings will be significantly different than those of my BZM fielding metric.  UZR has one problem of a single metric giving different values from different data sets, but right now we don’t have two publicly available PBP fielding metrics that use the same data.  When we do, Colin and I will both gain from seeing whether the different assumptions we have made turned into different results.  Having that ability to compare and perhaps learn better techniques is why I published as much detail as I did about BZM, but not every detail, so that others might make different choices from the ones that I made.


#11    Rally    2010/01/05 (Tue) @ 17:55

Peter, are you planning on publishing the results of your BZM work?


#12    Peter Jensen    2010/01/05 (Tue) @ 18:45

“That said, yes, for sure, I’d be highly interested in that.  Given your (Peter) work in the comparison of BIS and STATS, I think it’s fairly reasonable to believe that there is a bias somewhere.”

The work I did showed that the observational data has limits on its possible accuracy that were greater than most people had assumed.  But that is not the same as showing bias.  It is quite possible that a full year’s worth of fielding data would be enough so that 2 data sources that differ greatly on individual plays would be pretty similar when the year’s data is taken as a whole.  And I would have assumed that 7 years of aggregated data surely would be similar.  Even if one of the Turner Field stringers was biased it is highly unlikely that he or she would have been there the entire 7 years, and even so it still would account for only half the data.


#13    Peter Jensen    2010/01/05 (Tue) @ 18:51

Rally - I published the numbers for 2008 last year. I just finished inputting the data for 2009 and am preparing two articles for THT, one on the general results and one on the Teixeira question.


#14    MGL    2010/01/05 (Tue) @ 19:27

Did everyone get their THT Annual?  I think I ordered mine a long time ago but I have not received it yet.


#15    2010/01/05 (Tue) @ 20:04

I got mine right away, MGL.  I ordered mine about 3 weeks before it came out and it came a day or two after it was released.


#16    Zach    2010/01/05 (Tue) @ 20:25

Rally, how do you calculate TZ for pre-Retrosheet years? Is it a linear weights calculation similar to FRAA?


#17    2010/01/05 (Tue) @ 21:58

MGL, who did you order it from?  I can follow up with ACTA if you ordered it from them.


#18    Rally    2010/01/05 (Tue) @ 22:52

I’m not sure what FRAA does.  It’s pretty much just trying to estimate opportunities and plays made and come up with a run value on it.


#19    MGL    2010/01/06 (Wed) @ 00:42

Thanks Studes.  I thought I ordered it from Acta, but in searching my e-mail box I didn’t find anything from them, so I just ordered it again (or for the first time).  It’s all good.


#20    Tangotiger    2010/01/06 (Wed) @ 00:57

FRAA does more by estimating the handedness splits, and maybe even the GB/FB tendency of the pitching staff, and who knows what else.

A perfectly respectable stat, pre-1952.


#21    Rally    2010/01/06 (Wed) @ 10:03

Pitcher handedness is a big part of it.  Thanks to the Lahman database I know how many balls in play were allowed by lefties and righties (except for a few cases in the 1800s with unknown pitcher throwing hand).  I think I have a GB/FB estimator in there as well, but that part isn’t so easy.  If we see a team has a larger than normal share of assists to putouts, we don’t know how much is due to a groundball tendency or if the infielders were just more efficient fielders than the OF’s.  I don’t remember how I handled that one.

