THE BOOK--Playing The Percentages In Baseball


Friday, July 16, 2010

State of fielding

By Tangotiger, 12:42 PM

Colin is wondering exactly where we are:

I’m not really sure that we’ve gotten any further than where we were when Zone Rating and Defensive Average were proposed in the ‘80s. And if we have gotten further, I’m not sure how we would really tell.
...
Are we being objective about fielding analysis? In other words, do we know what we think we know?
...
So our metrics don’t do a very good job of agreeing. We don’t know which methods are “better,” only which ones we like more. And our data hasn’t been validated against some objective standard. To me, this opens up a simple question—how good are our defensive metrics? Are they useful? How useful?  And if we go back to the beginning, where we talked about what sabermetrics is about, it doesn’t seem to me to be good or valid sabermetrics to accept these metrics without some sort of evidence, some objective facts that show they measure what we think they measure. And I think the burden of proof is on those who are making claims based upon these metrics to provide that evidence. 

My response to him, and later readers:

I had already responded to Colin’s point regarding the comparison of correlation for offense and defense. I will reproduce it here:

What is the relevance here? You are taking known hits, known extra base hits, and known outs, and you have one system that arranges it one way and another that arranges it another way. The correlation would have to be in the high r=.9x. With fielding systems, you are taking known outs (in some systems), estimated outs (in another), and estimated hits (for all systems), and trying to find the correlation.

***

Otherwise, I share Colin’s general skepticism of subjective data being treated as objective data. He fairly asks legitimate and nuanced questions regarding the advancement level of fielding stats.

However, I am bothered that a reader, after reading Colin’s piece, would come to a conclusion like:

“Seeing as we haven’t made much progress in 20 years with defensive metrics...”

The only fair conclusion to make is that we don’t know how much progress we have made, and not that we haven’t made much progress.

We’ve made “some”. Is that a little? A lot? You can’t say “not much”. This is part of the nuance in Colin’s piece that may be glossed over, if said reader is representative of a portion of the readership.

===

I guess this is part of the nuance. Colin said this:

“The short version: I’m not really sure that we’ve gotten any further than where we were when Zone Rating and Defensive Average were proposed in the ‘80s. And if we have gotten further, I’m not sure how we would really tell.”

So, he’s asking two questions:
“Have we gotten any further?”
“How can we tell if we have?”

He doesn’t pose them as explicit questions; he’s wondering. But he’s not concluding.

His actual conclusion was itself a set of questions:
“To me, this opens up a simple question—how good are our defensive metrics? Are they useful? How useful?”

And that’s where we are. We’re in the investigation stage. And in the noted thread, I said this:

“Room for improvement and discussion, as long as you start with the [recorded, not estimated] data. ”


#1          (see all posts) 2010/07/16 (Fri) @ 12:54

I’m a bit amused that Tom and I were commenting on Colin’s article twice at the same time with related thoughts.

My more extensive (philosophical) response to Colin’s piece is at THT: http://www.hardballtimes.com/main/blog_article/can-we-objectively-evaluate-advanced-fielding-data/


#2          (see all posts) 2010/07/16 (Fri) @ 13:04

For offensive statistics, I have the impression that you have some basic building blocks: hits, walks, outs, strikeouts, home runs, doubles, triples, stolen bases, caught stealing.  Everyone sees these, they exist, and the trick is combining them in meaningful ways.  The only thing really subjective is errors, and they don’t come up much.

For defensive statistics, I don’t have the impression that you have the basic building blocks.  I see attempts to come up with sophisticated fielding statistics, without defining the basic core inputs.

I would like scorers to start crediting outs.  When an out is made, assign responsibility to one offensive player and one defensive player.  It would be subjective, and the defensive player would usually be the pitcher (but not always), but it would go some way to providing a building block.  Maybe later someone could go back and see whether the outs are hard outs, like strikeouts, or soft outs involving some combination of players, one of whom was awarded credit for the out.


#3          (see all posts) 2010/07/16 (Fri) @ 14:57

If we trust FIP to be an accurate indicator of a pitcher’s contribution to runs-against, why not use that to gauge the accuracy of UZR or DRS or whatever defensive metric you choose?  This would only work, of course, if we use raw FIP… removing the normalization to league-average ERA. 

I just ran a quick estimate using only 2008 stats.  I don’t know what the raw FIP is for each team, so I just re-normalized to RA instead of ERA.  I came up with r=0.77 for FIP-estimated defense vs. team UZR.  Think this method has any merit?
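A minimal sketch of the kind of check described here, not the commenter's actual calculation. It assumes the whole gap between FIP (rescaled to RA) and actual RA/9 is fielding, which is a strong assumption. The team rows are made-up placeholders, and the names `fip_implied_defense` and `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def fip_implied_defense(fip_on_ra_scale, ra9, innings):
    """Runs saved by the fielders if ALL of the FIP-vs-RA gap were fielding (assumption)."""
    return (fip_on_ra_scale - ra9) * innings / 9.0

# Hypothetical rows: (FIP rescaled to RA, actual RA/9, innings, team UZR)
teams = [
    (4.60, 4.30, 1450.0, 40.0),
    (4.50, 4.70, 1440.0, -25.0),
    (4.80, 4.75, 1455.0, 5.0),
    (4.40, 4.20, 1448.0, 20.0),
]
implied = [fip_implied_defense(f, ra, ip) for f, ra, ip, _ in teams]
uzr = [row[3] for row in teams]
r = pearson_r(implied, uzr)  # compare with the r the comment reports on real data
```

With real team data in place of the placeholder rows, `r` is the figure to compare against the one quoted above.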


#4    Tangotiger      (see all posts) 2010/07/16 (Fri) @ 16:12

Ben, you are really talking about DER or BABIP.  And, the assumption of BABIP being 0% pitching doesn’t hold.


#5    Nick Steiner      (see all posts) 2010/07/16 (Fri) @ 16:33

Here is my question regarding this whole thing.  Where should the burden of proof lie?  Should the default assumption be that UZR is better than Zone Rating because it uses more granular data and a more rigorous methodology, or should it be that the two are equal until it’s proven that UZR is actually better?

IMO, the first one sounds much more reasonable.  The only claim that UZR supporters are making is that it’s better than the basic defense estimators, not any specific claims to its accuracy.


#6    Colin Wyers      (see all posts) 2010/07/16 (Fri) @ 16:48

Nick, how do we know the methodology is more “rigorous?” It’s more complicated, certainly. But until you test it - go through the assumptions and validate them, in other words - you haven’t done anything rigorously (and if you have, then we’re using the word differently).

As for the granular data - again, we know the data has flaws. In the case of STATS ZR to STATS-based UZR, we know the data has the same flaws. Does a granular treatment of the data make those flaws more or less relevant to the final analysis? Again - we don’t know.

My feeling, as I expressed in the article, is that the burden of proof lies with those who want to make positive claims based upon these fielding metrics. I mean, I can prove my position pretty easily - the burden of proof to say “I don’t know X” is really, really low. I think it’s on the people who say “I *do* know X” to provide that proof.


#7          (see all posts) 2010/07/16 (Fri) @ 16:52

Nick/5, if you must use some fielding system and you want to choose between UZR and zone rating, your reasoning is fine.  If you have to use one, choose the one that seems aesthetically better constructed to you.  I do the same.  We might be wrong, but there’s no way to know. 

But that’s a very limited and somewhat hypothetical scope.  For one, when are you forced to use a fielding metric?  But more to the point, UZR supporters are making much more far-reaching claims than that in practice.

People claim to actually be able to assign a run/win value to a player’s fielding.  They claim to be able to tell you whether one player is a better fielder than another.  They claim to be able to tell you whether a given team has improved defensively over the offseason.  They claim to tell you whether a team has made a good trade or free agent signing partially on the basis of the purported fielding skills of the players involved.

Where is the evidence that supports those claims?

In fact, where is the data that backs up this claim that you made:

[UZR is] better than the basic defense estimators

Unsubstantiated opinion in the sabermetric world isn’t worth much.


#8    Nick Steiner      (see all posts) 2010/07/16 (Fri) @ 17:09

As I said, my opinion on UZR is based on the fact that it includes all of the same information as Zone Rating, as well as additional information.  Therefore it should be considered better until proven otherwise.  Just because Colin has shown that there is bias in the batted ball data and in the location data or what have you, does *not* mean anything in itself.  We don’t know how large that bias is and how it affects the metrics. 

If the whole point of Colin’s article is to say “we don’t objectively know anything more about how good our defensive metrics are now than we did 20 years ago”, well that’s fine, but it doesn’t really help us much.  I don’t think it should be taboo to make a subjective judgement of value on a metric - we do it all the time with the various DIPS stats out there. 

Anyway, I’ve proposed this test of UZR before, and I haven’t heard a response yet:

Take all players with x UZR’s, grouped by 2 or 3 runs or what have you and compare to WOWY.  So if you look at all players over the past 8 years who were rated between 5 and 8 runs in an individual season, you could see how many runs their team gave up when they were on the field compared to when they weren’t.  Over some 350,000 innings during that span, all sample size issues with the WOWY would be wiped away and the only differences in the measurement would be the accuracy of UZR. 
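A rough sketch of how the proposed test could be set up, under stated assumptions. Each row is a hypothetical player-season with the team's runs allowed and innings with the player on the field and off it. The 1400-inning season length and the function name `wowy_by_uzr_bin` are illustrative choices, not part of the proposal.

```python
from collections import defaultdict

SEASON_INNINGS = 1400.0  # assumption: rough full-season defensive innings

def wowy_by_uzr_bin(rows, width=3.0):
    """Group player-seasons into UZR bins and compare mean UZR with pooled WOWY runs.

    rows: iterable of (uzr, inn_on, ra_on, inn_off, ra_off), where ra_* are the
    team's runs allowed with the player on / off the field.
    Returns {bin_floor: (mean_uzr, wowy_runs_per_season)}.
    """
    bins = defaultdict(lambda: {"uzr": 0.0, "n": 0, "inn_on": 0.0,
                                "ra_on": 0.0, "inn_off": 0.0, "ra_off": 0.0})
    for uzr, inn_on, ra_on, inn_off, ra_off in rows:
        b = bins[width * (uzr // width)]  # floor of the bin this season falls in
        b["uzr"] += uzr
        b["n"] += 1
        b["inn_on"] += inn_on
        b["ra_on"] += ra_on
        b["inn_off"] += inn_off
        b["ra_off"] += ra_off
    result = {}
    for floor, b in sorted(bins.items()):
        rate_on = b["ra_on"] / b["inn_on"]     # runs allowed per inning with him
        rate_off = b["ra_off"] / b["inn_off"]  # runs allowed per inning without him
        result[floor] = (b["uzr"] / b["n"], (rate_off - rate_on) * SEASON_INNINGS)
    return result
```

If UZR is well calibrated, the second number in each bin should track the first as the pooled innings grow. How close is close enough is left open here.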

I unfortunately do not have the database skills (or database) to construct such a test.  I have no doubt it would be a breeze for Colin or Tango or MGL, and I think it would go a long way towards validating the metrics.


#9    Guy      (see all posts) 2010/07/16 (Fri) @ 17:16

I think the “burden of proof” dispute is just a distraction at this point.  What we all want to know is 1) how much more accurate are UZR/DRS than basic estimators (if at all), 2) are there any systemic biases in the PBP data that can be corrected to improve those metrics, and 3) to the extent UZR and DRS differ, is one better than the other (and why)? Let’s just focus on answering those questions. (And if we expand the discussion a little, we also need to know if there are types of BIP being ignored by most metrics that should be considered, like LDs and foul popups.)

Colin:  do you have the ability to look at those correlations you ran for team-switchers?  I’d be interested to know if opportunities correlate at all.  They shouldn’t, unless it’s being influenced by the player himself.

What we’d really like to know is whether there’s any correlation between opportunities and some objective measure of range.  Not easy to do.  But might be interesting to look at OF’s speed score (or some fielding-independent measure of speed) and their opportunities.  Any correlation would mean their ability to get to or near FBs is influencing perceived opportunities.


#10    Tangotiger      (see all posts) 2010/07/16 (Fri) @ 17:17

Should the default assumption be that UZR is better than Zone Rating because it uses more granular data and a more rigorous methodology, or should it be that the two are equal until it’s proven that UZR is actually better?

Ideally, MGL/Fangraphs would present a UZR-basic and UZR-advanced.  And the UZR-basic would be along the lines of ZR (with the handedness and park adjustments).  And UZR-advanced is the current UZR.

***

Colin: I was just thinking: couldn’t we say that the state of forecasting is in similar form?  You saw the Forecasting threads, right?  Marcel is once again above average.  Marcel was better than PECOTA last year, and a bit better than ZiPS, though Chone beat them all.

Again, can’t we say the burden of proof is on the forecasters to prove themselves?  And in this case, we’ve got the baseline test: Marcel.  I haven’t touched the code since I first created it, and it takes just seconds to run.


#11          (see all posts) 2010/07/16 (Fri) @ 17:25

Nick/8, I have no problem with subjective information in sabermetrics.  I have a problem with subjective information masquerading as objectively quantified information.

If people want to say, “In my opinion, Ichiro Suzuki’s fielding is about 8 runs above average per season”, I have no problem with that.  That’s their opinion.  If it’s based on UZR, good for them.  But if they want to claim that there is evidence that Ichiro’s fielding is actually and objectively +8 runs, and that teams should be acting accordingly, I want to see that evidence.  I think that’s a fair expectation.

I’m actually quite happy if people want to say that Ichiro’s fielding is +8 runs +/- 20 or so runs.  That’s fair.  But that’s not what I see people claiming.  They denominate things in increments of 0.5 WAR, implying that they’ve nailed Ichiro’s fielding skill or performance down to within 5 runs or better.  I’m very skeptical of that.


#12          (see all posts) 2010/07/16 (Fri) @ 17:29

I think the “burden of proof” dispute is just a distraction at this point.  What we all want to know is 1) how much more accurate are UZR/DRS than basic estimators (if at all), 2) are there any systemic biases in the PBP data that can be corrected to improve those metrics, and 3) to the extent UZR and DRS differ, is one better than the other (and why)? Let’s just focus on answering those questions.

I agree completely.  To the extent that this turns into a battle of personalities and the focus turns away from those questions, we are all the lesser.  Your reminder is helpful to me.

Regarding correlation of opportunities, there is what I reported here:
http://www.hardballtimes.com/main/blog_article/from-twitter-uzr-and-plus-minus/

CW: Can we look at year-to-year correlation for ExO in players who DON’T switch?
MF: I can’t find ExO on Fangraphs any more, is it okay to use BIZ?
CW: Hrm. Looks like they pulled DG as well. I don’t think BIZ quite gets at what I’m measuring, but it could work.
MF: Team-switchers BIZ/inn for outfielders y-t-y correlation R^2=0.06, for non-switchers R^2=0.18.
MF: Team-switchers BIZ/inn for infielders y-t-y correlation R^2=0.55, for non-switchers R^2=0.65.
MF: The difference between BIZ and ExO being ExO includes a measure of difficulty, e.g. how hard the scorer thought the ball was hit? Is that what you’re saying?
CW: Yeah. Still, I wonder how much of the BIZ correlation is pitcher tendencies and how much is scorer bias.

Is that what you were asking for, Guy?


#13    Guy      (see all posts) 2010/07/16 (Fri) @ 17:46

Mike: 
Thanks.  Are those really R^2, or “r”?  I can’t believe infielders’ BIZ has an r of .75 when they change teams.  But even if r =.55, that’s still very troubling.  And this is BIZ normalized for position, right?  (obviously, SSs will have more opportunities than other IFs).
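For reference, a small sketch of the conversion behind the ".75" figure: with a single predictor, the magnitude of r is the square root of R^2. The four inputs are the R^2 values reported in the exchange above; the function name is hypothetical, and the sign of r is not recoverable from R^2 alone.

```python
from math import sqrt

def r_from_r_squared(r2):
    """Magnitude of the correlation coefficient implied by an R^2 value."""
    return sqrt(r2)

# R^2 values as reported in the thread (BIZ/inn, year to year)
reported = {
    "OF team-switchers": 0.06,
    "OF non-switchers": 0.18,
    "IF team-switchers": 0.55,
    "IF non-switchers": 0.65,
}
implied_r = {group: r_from_r_squared(r2) for group, r2 in reported.items()}
```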

Looks like coding balls to the outfield is not much of a problem at the zone level, which makes sense.  But there could still be a lot of bias in the BIS location coding, as well as how hard ball was hit rating.


#14          (see all posts) 2010/07/16 (Fri) @ 17:58

One thing that I would like to see is a zone-based fielding metric tied into a zone-based pitching metric to see how the runs allowed by a team tie out.  This is essentially what Ben was getting at with post 3 above, where he asks if we might be able to tie the fielding metrics into FIP.

I would guess that for those with the databases and knowledge, this shouldn’t be too difficult to calculate.  Say a pitcher allows a “routine” groundball to short that would be an out 95% of the time and would be a single 100% of the time it was not fielded. Then the pitcher gets credited:

1 BIP x .95 out x .47 single = .4465 runs

And the shortstop gets credited:

1 BIP x .05 out x .47 single = .0235 runs.

This is, as I recall, essentially what UZR does for the fielder, but I’ve never seen the pitcher side of it, and as such have never seen a tie-out at the team level.
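The arithmetic in this comment can be written as a two-line function. This is a sketch that follows the split exactly as the comment states it, not a published UZR formula. The name `split_credit` is hypothetical, and the .47 is the run value the comment attaches to the single.

```python
def split_credit(out_prob, hit_run_value):
    """Split one ball in play's run value between pitcher and fielder.

    Follows the comment's arithmetic: the pitcher gets the share tied to the
    expected out rate, the fielder gets the remainder.
    """
    pitcher = out_prob * hit_run_value
    fielder = (1.0 - out_prob) * hit_run_value
    return pitcher, fielder

# The "routine" grounder to short from the comment: 95% out rate, .47 runs
pitcher_runs, fielder_runs = split_credit(0.95, 0.47)
```

By construction the two shares always add up to the full run value, which is the property a team-level tie-out would rely on.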

Example:  I have an old UZR spreadsheet that shows that the Angel players totaled +95 runs in 2002.  (NOTE:  I think UZR may have been modified since then, and some players did not have enough playing time to appear on the sheet.  As this is for illustrative purposes only, I don’t think that matters—we can just assume that the contributions of any unlisted players totals zero.) The Angels allowed 851 in what appears to be an essentially neutral ballpark in a league where the average team allowed 776 runs—they were +75 runs between pitching and fielding.

This would mean that the pitchers, with the BIP they allowed and their strikeouts and HR and BB and all the rest of it—were -20 runs, as a group!  That is certainly possible—but does that all tie out in the end?  Can we go through pitcher by pitcher and see where that leaves them?

Coming at it from the pitcher side might give us some insight into what proper adjustments might be on the fielder side.


#15          (see all posts) 2010/07/16 (Fri) @ 18:20

Guy/13, yes, those are really R-squared.  It’s BIZ as reported on Fangraphs.

Colin has a very plausible explanation for the bias there: that it is a range bias.  For a video scored data set, such as is the basis for the BIS data underlying UZR and +/- on Fangraphs, balls will tend to be marked as closer to a fielder with better range than to a fielder with poorer range, even though they are hit on an identical trajectory.  This is because of the restricted view of the play offered on TV, which often only shows the fielder and the ball and no other distinguishing landmarks and only after the fielder is already in motion and near to the ball.  This is particularly true on the infield and particularly true on balls hit toward the SS/3B hole. 

With that limited frame of reference, how can the scorer tell whether the ball is close to a fielder because it was hit closer to him to begin with or because the fielder moved more quickly to the ball than another fielder might have been able to move?


#16    Colin Wyers      (see all posts) 2010/07/16 (Fri) @ 18:27

A bit of recursion here - that’s actually Guy’s theory, and I thought I attributed it to him.

And yes, some of the high r-squared Mike reports is certainly a positional bias. The values I reported have been “normalized” by BIZ per inning for that position, and are far lower. I have not looked separately for park switchers for the normalized values.


#17    Tangotiger      (see all posts) 2010/07/16 (Fri) @ 19:38

Colin: not sure why you are doing per inning (i.e., per 3 outs including strikeouts).  You should do per BIP.


#18    Colin Wyers      (see all posts) 2010/07/16 (Fri) @ 19:56

I used Innings because it’s what’s available on Fangraphs. It’s the same reason I didn’t do several other things I would have liked to have done - I don’t have the data readily available. I’m working on building a better database for the purposes of comparing fielding metrics, including Dial’s DRS data and an account of BIP while a fielder is on the field, as you suggest.


#19    Colin Wyers      (see all posts) 2010/07/16 (Fri) @ 20:11

(Separate post in case you decide to move this discussion to its own thread at some point.)

As for projections - I think it’s a bit of a different case, in that we can readily test what a projection system is doing and how well it’s doing it. We actually have a very robust dataset to test forecasting systems against that everyone can agree upon - the statistics from the following season.

But certainly, the burden of proof is on the forecasters to show that they’re doing what they claim they are doing. (Probably even more so, since there is that data to validate against.)

And the question that logically follows from this is - are we going to see that level of testing when PECOTA is published after this season? And, for whatever it’s worth, I’m promising that it will.


#20          (see all posts) 2010/07/17 (Sat) @ 00:31

Unlike offensive stats where there is play by play data recorded, UZR and DRS are hidden behind a wall. You get a single number that says this is it for the whole season.  Hard to validate it.

If UZR and DRS had play by play recorded, fans and peers could evaluate how UZR and DRS credited a given play, and judge for themselves on a play by play basis how credible it is.

All estimators and models should be distrusted unless what is being estimated can be determined with some degree of certainty.  For example, there are models that provide forecasting of typhoons and hurricanes, and these can be evaluated against the actual track, which is known with some certainty.  The forecasting tends to be more accurate when you average the models’ forecasts and throw out outliers.

UZR says Player A is +10.  DRS may say he is +5.  The actual number is unknown.  There are only 2 models, so you cannot throw out the outliers.  Why not average them?  Nobody can say with any certainty that UZR is better than DRS or vice versa.

The big problem with fielding metrics like UZR is that they cannot judge ability without knowing the actual positioning, even with 3 years of data.  Positioning is more a function of team/coaching/scouting/etc. than of an individual player deciding where to position himself against a given hitter/situation, etc.  At best they can indicate how many balls get fielded for outs compared to league average, and this is affected by fielding ability, pitching (hit locations, command), and positioning.

If someone wants to claim that a team’s or player’s good or bad UZR is due to the fielder’s ability and not any other reason, that claim must rest on subjective evaluation.  The data is not meaningless, even in small samples; it just needs to be used with extreme caution and not taken as gospel.


#21    Brian Cartwright      (see all posts) 2010/07/17 (Sat) @ 01:23

As Colin stated well in his article, the major problem is we don’t know for sure how many opportunities the fielder had, it’s like trying to judge a batter on his hits but with an unknown number of at bats.

There are more than two models, perhaps two based on BIS data. Rally’s Total Zone and my Oliver Fielding Runs at THT are based on Gameday, as is Peter Jensen’s Big Zone Metric (which I believe is unpublished). Colin is also working with Gameday.

A year is likely not long enough to get a stable rating. I use a three year weighted mean, the same as for batting and pitching, to get the runs saved per opportunity (then regressed) in order to produce one of seven letter grade ratings, but that is not yet published at THT. That is a lot more stable year to year. Guys might be Gd one year and Vg the next, but there’s not much jumping of several grades, and I would think that if these grades were generated from the various systems there would be more agreement.
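A minimal sketch of a three-year weighted mean followed by regression toward the population mean, in the spirit of what is described here. The 5/4/3 weights, the ballast of 4800 opportunities, and both function names are placeholders for illustration. The comment does not state the actual weights or regression amount.

```python
def weighted_rate(seasons, weights):
    """Weighted runs saved per opportunity.

    seasons: list of (runs_saved, opportunities), most recent first.
    Returns (rate, weighted_opportunities).
    """
    num = sum(w * runs for w, (runs, _) in zip(weights, seasons))
    den = sum(w * opps for w, (_, opps) in zip(weights, seasons))
    return num / den, den

def regress(rate, weighted_opps, ballast, mean_rate=0.0):
    """Shrink the observed rate toward mean_rate by adding `ballast` average opportunities."""
    return (rate * weighted_opps + mean_rate * ballast) / (weighted_opps + ballast)

# Hypothetical fielder: +10, +5, 0 runs over three seasons of 400 opportunities each
seasons = [(10.0, 400.0), (5.0, 400.0), (0.0, 400.0)]
WEIGHTS = (5.0, 4.0, 3.0)  # assumption: illustrative weights only
rate, weighted_opps = weighted_rate(seasons, WEIGHTS)
regressed = regress(rate, weighted_opps, ballast=4800.0)  # assumption: ballast value
```

The regressed rate is what would then be mapped onto a letter grade; the grade cutoffs are not given in the comment, so they are left out.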


#22    Tangotiger      (see all posts) 2010/07/17 (Sat) @ 06:41

“And, for whatever it’s worth, I’m promising that it will.”

Great, spoken like a true scientist!

Also, please address / test the percentile ranges.  Specifically, do the ranges work for rookies as well as veterans in their 20s and veterans in their 30s?  For batters and pitchers.  I’ve been asking for this to get done since the day they were first published.


#23    Tangotiger      (see all posts) 2010/07/17 (Sat) @ 06:44

“Positioning is more a function of team/coaching/scouting/etc than an individual player deciding where to position himself against a given hitter/situation, etc.”

MGL assigns positioning to the fielder in UZR.  This is either a bug or feature depending on your POV.


#24    MGL      (see all posts) 2010/07/17 (Sat) @ 07:54

“MGL assigns positioning to the fielder in UZR.”

Not sure what you mean.


#25    Tangotiger      (see all posts) 2010/07/17 (Sat) @ 11:14

I mean that if a player positions himself down the line, and he catches 100% of those balls, and he happens to position himself in the hole and catches 100% of those balls, then this 3B is going to look like a UZR superstar.

Even if the manager is the one who told him to stand there.

Implicit in UZR is that positioning is attributed to the fielder.


#26    Guy      (see all posts) 2010/07/17 (Sat) @ 12:15

Colin/Mike (or Chris):  Have you ever looked at the correlation between BIZ/inning and Zone Rating?  Just playing around with the last 3 years of data on Fangraphs, in CF I get an r of -.35 for BIZ and ZR.  At SS the relationship is weaker but still negative, r = -.1.  There should be no relationship.  If these results are typical, that suggests a potential problem for zone rating—that players are receiving a lower rating only because more balls are being coded as in their zone.  Or alternatively, I guess it could suggest that each additional marginal ball in a zone is more difficult than average to field (which ZR should then account for, if true).


#27    MGL      (see all posts) 2010/07/17 (Sat) @ 16:35

#25, right, it would be nice to know how much of a fielder’s talent or lack thereof is positioning and how much is range, but alas, we are not there yet.

FWIW (not much), I find this discussion (mostly, but not completely) lacking in quality (as you might think).  I would not even know where and how to participate, so I won’t.


#28    d      (see all posts) 2010/07/17 (Sat) @ 16:49

Well, maybe this here thread is not generating tons of ‘quality’ discussion, but I certainly find Colin’s general points about fielding metrics very interesting…


#29          (see all posts) 2010/07/18 (Sun) @ 00:40

I am admittedly naive when it comes to defensive metrics. But when so much of a player’s value or WAR or whatever comes from defense, would it be possible for us to combine observation, game spray/hit charts with game stats to achieve a really good estimate for how many runs a player saves his team?

For example, we know how many runs a team on average scores when the leadoff man gets on, so when a fielder does not get to a ball that he should, allowing the leadoff man to reach first, couldn’t this count as “-x.x” fielding runs for that player? Likewise, if Ichiro robs a HR with 2 outs and the bases full couldn’t this count as “+4 runs saved”? Or something like that?

If this is how it is actually done, then please forgive my ignorance. But it seems that if it were a focus, observation and data and knowledge of game state situations could result in IMO highly accurate defensive metrics in terms of actual or probable fielding runs ... Even in cases of infielders not getting the trail runner on DP’s and things of that nature.

Just wondering and thinking out loud.


#30          (see all posts) 2010/07/18 (Sun) @ 00:44

Note: “game stats” in paragraph 1 should be “game states”.


#31    Brian Cartwright      (see all posts) 2010/07/18 (Sun) @ 01:55

In my system (THT Forecasts) I do not attempt to classify individual plays as to whether or not they should have been made.

The basic concept is to count observed plays, calculate an expected value based on the overall mean values of the opportunities (this is where the art is) and then normalize the observed value in the context of the expected and the population mean.

Gameday reports which fielder retrieved each ball. This may sometimes be different from the responsible fielder (always on ground ball hits to the outfield, which I treat as a separate category). Gameday does also label deflected balls, which I assign to the fielder who deflected.

Therefore I can count ground balls each infielder gets his hands on (outs, hits or errors), count how many balls each outfielder got to (outs, hits or errors) and count the ground ball hits to the outfield and assign each to an infielder based on my own criteria.

From this data I have what should be a fairly accurate count of plays made, but have to estimate plays not made (the hits). I believe every hit should be assigned to a fielder, so that the fielders’ outs and opportunities sum to the team totals.

On most fly ball hits to the outfield the retrieving fielder is also the responsible one, but sometimes an outfielder is credited with retrieving a ball which in reality another outfielder had a better play on.

Unless noted as deflected, I will assign infield hits and errors to the fielder of record. Should be accurate.

Most ground ball hits to the outfield go between infielders, and without making judgement calls by watching video, these are going to be rougher estimates of which infielder the hit should be assigned to.

For infielders I also count observed and calculate expected double plays started and pivoted.

I then take the observed total in each category (si, do, tr, hr, roe), subtract the expected total, then multiply each difference by its run value.
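The last step can be sketched as below. This is only an illustration of the observed-minus-expected arithmetic as described, not the actual THT implementation. The run values and the function name `runs_vs_expected` are placeholder assumptions, since the comment does not list the values it uses.

```python
# Assumption: illustrative run values per event, NOT taken from the comment
RUN_VALUES = {"si": 0.47, "do": 0.77, "tr": 1.04, "hr": 1.40, "roe": 0.50}

def runs_vs_expected(observed, expected, run_values=RUN_VALUES):
    """Sum of (observed - expected) * run value over the event categories.

    Positive means the fielder allowed more runs than expected; negative
    means runs saved.
    """
    return sum((observed[k] - expected[k]) * run_values[k] for k in run_values)

# Hypothetical fielder who allowed 10 fewer singles than an average fielder would have
observed = {"si": 100, "do": 20, "tr": 2, "hr": 0, "roe": 5}
expected = {"si": 110, "do": 20, "tr": 2, "hr": 0, "roe": 5}
delta = runs_vs_expected(observed, expected)
```

Under this sign convention, flipping the sign of `delta` gives runs saved.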


#32    Chris Dial      (see all posts) 2010/07/18 (Sun) @ 23:13

And the UZR-basic would be along the lines of ZR (with the handedness and park adjustments).

How is this not my work?


#33    Chris Dial      (see all posts) 2010/07/18 (Sun) @ 23:17

the major problem is we don’t know for sure how many opportunities the fielder had, it’s like trying to judge a batter on his hits but with an unknown number of at bats.

I don’t think this is completely accurate.  We have a VERY good idea of chances.  How many of each hitter’s hits should have been ROEs?  I don’t think we have a very good idea of that.

There are more than two models, perhaps two based on BIS data. Rally’s Total Zone and my Oliver Fielding Runs at THT are based on Gameday, as is Peter Jensen’s Big Zone Metric (which I believe is unpublished). Colin is also working with Gameday.

Um, thanks?

A year is likely not long enough to get a stable rating

I am not sure this is true.


#34    Chris Dial      (see all posts) 2010/07/18 (Sun) @ 23:19

Guy @26,
I looked at this years ago, and got a zero (or 0.00xx) varying between positions, depending on the chances.


#35    Colin Wyers      (see all posts) 2010/07/19 (Mon) @ 00:25

I don’t think this is completely accurate.  We have a VERY good idea of chances.  How many of each hitters hits should have been ROEs?  I don’t think we have a very good idea of that.

There are only three things that a fielding metric does, in essence:

1) Figure plays made,
2) Figure opportunities (implicitly or explicitly),
3) Convert plays to runs.

The first and third points are, for the most part, areas of agreement. The results are different. If we know how to measure fielding chances, why are our metrics as different as they are?


#36    Tangotiger      (see all posts) 2010/07/19 (Mon) @ 00:55

There’s a 4th one, or a subset of the 2nd one: the quality of the opportunity.

There should be a great deal of overlap between what UZR, +/-, and PMR consider an opportunity, seeing that they all use BIS.  Indeed, each of those systems considers ALL BIP as an opportunity for each fielder (as does WOWY, FWIW).

Where they differ is in assigning a quality of opportunity.  They all have an identical zone location and distance.  So, that’s common. 

But, UZR might say “that ball was thrown by Derek Lowe to an RHH on grass… the out rate for SS in that case goes up by +.06 for balls hit to angle -17 degrees, over and above the .85 standard rate if we only used the zone”.  +/- might stop and only consider the RHH and come out with .87.  PMR might ignore that it’s Derek Lowe, but include that it’s Dodger Stadium and come out with .84.

So, all 3 systems look at the same BIP and make three different conclusions as to the quality of that BIP to be turned into an out by an average SS.

It all depends on what parameters each is using, and how those parameters behave with the other parameters.

WOWY, on the other hand, simply looks at the identity of the batter and pitcher and park and says “.14 outs for SS, .10 for 3B, .07 for RF...”

As you can see, WOWY has problems on the low end, as it ignores possibly valuable information.  Any system that ignores the actual, or estimated, location of a batted ball is not good in the short-run.  In the long-run, random biases wash away, and you are left with systematic biases.  That’s why WOWY over a long period, say 6+ years, should hold up pretty well with the other PBP systems.  We don’t expect Derek Lowe to have a much different batted ball distribution for his various SS.  Well, it will be different, but when you look at all 100 pitchers a SS has had, that will pretty much match their career rates overall.
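
To make the "quality of opportunity" point concrete, here is a minimal Python sketch using only the illustrative numbers from this comment (.85 from the zone alone, +.06 for the UZR-style adjustments, .87 and .84 for the other two). The names `expected_out_rate` and `credit_for_play` are hypothetical, and scoring a play as "out made minus expected out rate" is a common convention assumed here, not a description of any system's actual engine.

```python
# Hypothetical illustration: one ball in play, three systems, three baselines.
# The numbers are the examples from the comment, not real system parameters.

ZONE_ONLY_RATE = 0.85  # out rate for an average SS using the zone alone

# Expected out rate each system assigns to the same BIP after its own adjustments.
expected_out_rate = {
    "UZR": ZONE_ONLY_RATE + 0.06,  # pitcher, batter hand, surface, angle
    "+/-": 0.87,                   # batter hand only
    "PMR": 0.84,                   # park, but not the pitcher
}

def credit_for_play(made_out: bool, expected: float) -> float:
    """Outs above average credited to the fielder for a single BIP."""
    return (1.0 if made_out else 0.0) - expected

for system, expected in expected_out_rate.items():
    made = credit_for_play(True, expected)
    missed = credit_for_play(False, expected)
    print(f"{system}: out made {made:+.2f}, out missed {missed:+.2f}")
```

The same play is worth a slightly different amount in each system, which is one way identical plays-made data can still produce different ratings.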


#37    Brian Cartwright      (see all posts) 2010/07/19 (Mon) @ 01:20

Chris, there was no slight intended when I did not list DRS. I started with BIS based systems, then did a list of those I know to be based on Gameday. Went blank after that. I did state last week that when I saw people here reference DRS I thought they were talking about you and not Dewan, as I refer to his as plus/minus.

Colin’s step 2, with Tango’s addendum - I’ve stated that’s where the ‘art’ is, 1 and 3 are pretty straightforward, but everyone has different ideas on how to model how many opportunities and their expected values.

I do a WOWY based system, where I count how many balls to each fielder and the actual results, then calculate the expected results for a typical fielder given the same distribution of balls in the same ballpark with the same handedness of batters, etc.

Right now, like Peter, I only have one zone for each fielder. I have a hit location table, but have yet to exploit it. I’m likely to give each outfielder nine zones for flyballs (front, even, over for depth, left, even, right for angle) and I expect the angles of ground ball hits to the outfield to help a lot in assigning those to infielders, which I believe is pretty close to what Rally is doing with Total Zone, although I’ll still claim I’ve had those ideas since Bill James first published the Project Scoresheet chart Tango recently reproduced. (I put a version of it on the scoresheets I gave to my official scorers back in the 80’s as a guide for them to label hit location).
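
A rough Python sketch of the WOWY-style comparison described above, under stated assumptions: the data is invented, the context is reduced to park and batter handedness, and the "typical fielder" baseline is taken from every other fielder's results in the same bucket. The function name `outs_above_expected` is hypothetical and this is not the actual Oliver Fielding Runs code.

```python
from collections import defaultdict

# Each record: (fielder, park, batter_hand, out_made). Data is made up.
plays = [
    ("A", "PIT", "R", 1), ("A", "PIT", "R", 1), ("A", "PIT", "L", 0),
    ("B", "PIT", "R", 1), ("B", "PIT", "R", 0), ("B", "PIT", "L", 0),
    ("C", "PIT", "R", 0), ("C", "PIT", "L", 1), ("C", "PIT", "L", 1),
]

def outs_above_expected(plays, fielder):
    """Actual outs minus the outs other fielders made on the same mix of chances."""
    others = defaultdict(lambda: [0, 0])  # bucket -> [outs, chances] by other fielders
    for name, park, hand, out in plays:
        if name != fielder:
            others[(park, hand)][0] += out
            others[(park, hand)][1] += 1

    actual = expected = 0.0
    for name, park, hand, out in plays:
        if name != fielder:
            continue
        outs, chances = others[(park, hand)]
        if chances == 0:
            continue  # no comparison sample for this bucket; skip the play
        actual += out
        expected += outs / chances
    return actual - expected

for name in ("A", "B", "C"):
    print(name, round(outs_above_expected(plays, name), 2))
```

Adding hit location later would only mean widening the bucket key, which is why the single-zone version is a reasonable starting point.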


#38    fra paolo      (see all posts) 2010/07/19 (Mon) @ 07:31

I wonder if Colin thought about fielding in cricket, and then about fielding in baseball, whether it would help him in his quest for objectivity. Because a batsman has almost the entire 360 degrees of the field to place the ball, fielders get moved around a lot more in cricket than they do in baseball.

Which led me to the conclusion some years ago that any defensive system in baseball which defined the zones in which plays should be made, is probably as good as one should expect. Because the batter has a much more restricted arc in which to put the ball in play, and because fielders occupy relatively rigid locations (compared to cricketers) the vast majority of the time, I’m of the view that the optimum fielding positions have been identified. So zones where balls are caught n percent of the time or more are probably the ‘average’ against which we can measure ability. Finding those zones will become easier with something like hit tracker, and over time we’ll accumulate the kind of data that will give ‘through a glass darkly’ insight into yesteryear.

I’m not sure that worrying about the difference between fly balls and line drives actually gets us anywhere, again based on my cricket experience. A ball hit to the slips can be a ‘sitter’ that gets dropped or a tough chance that is still a chance. In either case, there’s a subjective judgment that has to be made, including by the player, judging whether to go for the catch or letting the ball drop and stopping the run.


#39    Rally      (see all posts) 2010/07/19 (Mon) @ 09:12

"And the UZR-basic would be along the lines of ZR (with the handedness and park adjustments).”

“How is this not my work?”

Are you telling us your ZR work has handedness and park adjustments now?


#40    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 09:44

Chris, there was no slight intended when I did not list DRS.

Brian, I apologize - I was only teasing.  I know you don’t mean a slight.  Sometimes I act silly in the middle of a serious discussion.


#41    Tangotiger      (see all posts) 2010/07/19 (Mon) @ 09:47

The basic construction of any metric, of any kind, of any sport is this:

(playerRate minus leagueRate) times playerOpps

If a goalie saves 92% of shots and the league average is 90% and he had 2000 shots, then he’s saved +40 shots above average.

If a pitcher allows 3 runs per 9 IP and the league allows 4.5, and the pitcher has pitched 20 effective games (180 IP), this pitcher is 30 runs better than average.

If a batter gets on base at a .400 clip and the average is .340 and the batter has 600 PA, the batter got on base +36 times above average.

And if a fielder got 90% outs in a zone that the league got 85%, and this fielder got 500 balls hit into that zone, then he’s +25 outs better than average.

That is the basic construction for any and every stat around.  That’s the basis.  From there, you try to figure out what the “true” league average was, based on that player’s specific qualities of opportunities.
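
As a quick check on that construction, here is a small Python sketch that reruns the goalie, batter, and fielder examples. The function name `above_baseline` is just an illustrative label.

```python
def above_baseline(player_rate: float, baseline_rate: float, opps: float) -> float:
    """(playerRate minus baselineRate) times playerOpps."""
    return (player_rate - baseline_rate) * opps

# Goalie: 92% saves vs 90% league, 2000 shots
print(round(above_baseline(0.92, 0.90, 2000), 1))    # 40.0
# Batter: .400 OBP vs .340 league, 600 PA
print(round(above_baseline(0.400, 0.340, 600), 1))   # 36.0
# Fielder: 90% outs vs 85% league, 500 balls in the zone
print(round(above_baseline(0.90, 0.85, 500), 1))     # 25.0
```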

A pitcher pitches a lot at Coors and little at Petco?  Well, the league average is higher.  A goalie faces a lot of PP time?  Then the league save average is lower.  A batter faces a disproportionate number of opposite-handed pitchers (relative to his batting hand)?  Then the league average is higher.

And if an infielder has a disproportionate number of GB pitchers, and a disproportionate number of RHH, and a disproportionate number of runners on 1B with less than 2 outs, and a disproportionate number of hard hit balls, and a disproportionate number of grass fields, and a disproportionate number of balls hit at -17 degrees, ad infinitum, then that affects the league average outs.

And that’s what our job is: to identify all those parameters that could affect the league average rate, and figure out how much impact those parameters have.
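
One way to picture the "true" league average is as an opportunity-weighted average over the contexts a player actually faced. The Python sketch below assumes three made-up context buckets and invented out rates; the bucket keys and function names are hypothetical, and a real system would have many more parameters and would have to estimate each rate.

```python
# League out rate by context bucket (hypothetical buckets and values).
league_out_rate = {
    ("GB", "RHH", "grass"): 0.86,
    ("GB", "LHH", "grass"): 0.82,
    ("GB", "RHH", "turf"): 0.84,
}

# One fielder's chances and outs in each bucket (made-up numbers).
fielder = {
    ("GB", "RHH", "grass"): {"opps": 300, "outs": 265},
    ("GB", "LHH", "grass"): {"opps": 150, "outs": 126},
    ("GB", "RHH", "turf"): {"opps": 50, "outs": 43},
}

def adjusted_baseline(fielder, league_out_rate):
    """League out rate re-weighted to this fielder's mix of opportunities."""
    opps = sum(line["opps"] for line in fielder.values())
    expected = sum(league_out_rate[b] * line["opps"] for b, line in fielder.items())
    return expected / opps

def outs_above_average(fielder, league_out_rate):
    """(playerRate minus adjusted leagueRate) times playerOpps."""
    opps = sum(line["opps"] for line in fielder.values())
    outs = sum(line["outs"] for line in fielder.values())
    return (outs / opps - adjusted_baseline(fielder, league_out_rate)) * opps

print(round(adjusted_baseline(fielder, league_out_rate), 3))
print(round(outs_above_average(fielder, league_out_rate), 1))
```

The formula is unchanged from the basic construction; only the baseline moves with the player's mix of chances.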


#42    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 09:51

Are you telling us your ZR work has handedness and park adjustments now?

No.  I am saying that adjustment is negligible to my work.

I like the handedness, and park factors for infield (IIRC, does anyone use one?) are simple enough to apply and likely *barely* move the needle, a few OF slots excepted.

When I first made my rating I adjusted the rating for a pitching factor which was SLG based.  It didn’t make much difference. 

So take DDRS (Dial DRS), and apply a “team L/R ratio/league L/R ratio” to my numbers.  Then regular park adjust (but I cannot say I know what a correct PF is).  Personally, I don’t think a PF should be used, but rather a specific adjustment to BIP that were high off the wall for Fenway/HOU.

I don’t see a significant factor in any other park, particularly as more parks have these features, it comes out in the wash.

I am not as database-skilled as the rest of you guys, so quickly dumping L/R BIP ratios takes me more time.  I can probably ask SG to add it to the spreadsheet he has created, though.  If any of you have a quickie way, let me know.

I’d also like to see a list of defensive park factors just like I can see them for offense.


#43    dq      (see all posts) 2010/07/19 (Mon) @ 11:16

41/great post

but to be picky
(playerRate minus leagueRate) times playerOpps

It’s probably not leagueRate, but more generically BaselineRate or StandardRate, as in many cases ReplacementRate is used in place of leagueRate.

You are measuring against a standard. That standard may be league average, but it may be replacement level, or even a different standard.


#44    Rally      (see all posts) 2010/07/19 (Mon) @ 11:51

"I’d also like to see a list of defensive park factors just like I can see them for offense.”

Step one is you have to have the zone rating broken down by home and road.  The spreadsheet you have could be used to get daily fielding lines (just subtract yesterday’s numbers from today’s, and repeat x162).  Obviously that would be a ton of work, and I wouldn’t ask anyone to do it.

But unless you have that, you can’t calculate park factors, and any statement about how much or how little park factors would affect the player ratings is nothing but a pure guess.
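
The differencing step is mechanical, so here is a small Python sketch of it. The cumulative snapshots, the home/road flags, and the function names are all invented for illustration; the real spreadsheet layout is not known here.

```python
# Cumulative season-to-date (chances, outs) for one fielder after each game.
# All values and the home/road flags are made up for illustration.
cumulative = [(4, 3), (9, 7), (12, 10), (18, 13)]
is_home = [True, True, False, False]

def daily_lines(cumulative):
    """Difference consecutive cumulative snapshots to recover per-game lines."""
    lines, prev = [], (0, 0)
    for chances, outs in cumulative:
        lines.append((chances - prev[0], outs - prev[1]))
        prev = (chances, outs)
    return lines

def home_road_split(lines, is_home):
    """Sum per-game (chances, outs) into home and road totals."""
    split = {"home": [0, 0], "road": [0, 0]}
    for (chances, outs), home in zip(lines, is_home):
        key = "home" if home else "road"
        split[key][0] += chances
        split[key][1] += outs
    return split

lines = daily_lines(cumulative)
print(lines)
print(home_road_split(lines, is_home))
```

With home and road totals in hand, a park factor can at least be estimated rather than guessed.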


#45    Brian Cartwright      (see all posts) 2010/07/19 (Mon) @ 12:00

What Tango said in 41

dq - you develop some baseline based on the population

Chris - major league parks are +/- about 15% on rate of ground ball hits to the outfield. I use that in defense, but haven’t yet for batters or pitchers.


#46    Tangotiger      (see all posts) 2010/07/19 (Mon) @ 12:19

"I’d also like to see a list of defensive park factors just like I can see them for offense.”

MGL published it in his UZR, part 2 article, back in 2003 or 2004.


#47    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 13:18

MGL published it in his UZR, part 2 article, back in 2003 or 2004

Yes, but those were pre-data he has now, and so I would be SHOCKED if he still used those.  In addition, he only had a season or two there, and now he doesn’t use any of that data.


#48    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 13:19

But unless you have that, you can’t calculate park factors, and any statement about how much or how little park factors would affect the player ratings is nothing but a pure guess.

So you don’t use park factors?


#49    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 13:22

major league parks are +/- about 15% on rate of ground ball hits to the outfield. I use that in defense, but haven’t yet for batters or pitchers.

the *parks* are or the construct of pitching staffs?


#50    Tangotiger      (see all posts) 2010/07/19 (Mon) @ 14:05

Yes, but those were pre-data he has now, and so I would be SHOCKED if he still used those.

Obviously he updates them. 

The point is that he’s published them, and you can get a feel for how much the parks, the Gb/fb tendency, the runners on base, etc affect the ratings.


#51    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 14:07

The point is that he’s published them, and you can get a feel for how much the parks, the Gb/fb tendency, the runners on base, etc affect the ratings.

Well, based on how Andruw Jones and Ichiro’s ratings changed, I can’t agree with your claim, so I’ll go back to my original comment: if you claim to park adjust something, publish the park factors.


#52    Tangotiger      (see all posts) 2010/07/19 (Mon) @ 14:20

Are those the ratings that changed between bUZR and sUZR (i.e., different data sources)?  What does that have to do with anything?

I still think my point is valid that you can get a feel for things by looking at the adjustments he’s published in the past.  You think not, so, that’s that.


#53    Chris Dial      (see all posts) 2010/07/19 (Mon) @ 15:00

Are those the ratings that changed between bUZR and sUZR (i.e., different data sources)?  What does that have to do with anything?

I DON’T KNOW WHAT ELSE CHANGED.  That’s part of the problem.  You like to say “It’s different data through the same engine”.  Well, they are two different dataset constructs (coordinates vs pie shapes), and thus PFs calculate differently.  I mean, it’s got to be different to change the values that much.

I still think my point is valid that you can get a feel for things by looking at the adjustments he’s published in the past.  You think not, so, that’s that.

I have a feel for it - that’s not the point.  I cannot believe 1. you don’t think PFs should be published, and 2. that you think the conversation should end with something published 7 years ago off data not used anymore because “we can get a feel for it”.

If no PFs were published for 1993-2010, and someone said “just go with the 1980s PFs”, you’d have a cow.


#54    Brian Cartwright      (see all posts) 2010/07/19 (Mon) @ 15:14

Chris #49 -

It’s a measure of the parks

For example, I would take what the Padres and DBacks did in Petco, and compare to what the same two teams did in Phoenix. (In this case, ground ball hits to the outfield divided by total ground balls). Repeat for every team & ballpark combination with weighted matched pairs, and calculate the ratio.
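
A minimal Python sketch of the matched-pairs idea, with loud assumptions: the counts are invented, the park codes are placeholders, and weighting each pair by the smaller of its two ground-ball samples is a guess at what "weighted" means, not the actual query. The result is a ratio against the paired parks only; turning it into a league-relative factor would need a further normalization step that is not shown.

```python
# (park, paired_park) -> (gb_hits_to_outfield, total_ground_balls) for games
# between the two home teams, played in `park`. Numbers are invented.
samples = {
    ("PETCO", "PHX"): (90, 500),   # Padres vs DBacks games in Petco
    ("PHX", "PETCO"): (110, 500),  # the same two teams in Phoenix
    ("PETCO", "LAD"): (85, 450),
    ("LAD", "PETCO"): (100, 450),
}

def park_ratio(park, samples):
    """Weighted average of (rate in park) / (rate in paired park) over matched pairs."""
    num = den = 0.0
    for (here, there), (hits, gbs) in samples.items():
        if here != park or (there, here) not in samples:
            continue
        away_hits, away_gbs = samples[(there, here)]
        ratio = (hits / gbs) / (away_hits / away_gbs)
        weight = min(gbs, away_gbs)  # assumed weighting: the smaller sample
        num += weight * ratio
        den += weight
    return num / den if den else float("nan")

print(round(park_ratio("PETCO", samples), 3))
```

Holding the two teams constant is what lets the ratio be read as a park effect rather than a roster effect, at least to a first approximation.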

Artificial turf is of course faster than grass. I believe Veteran’s Stadium had the highest GBH factor. For grass, there’s likely several factors in play - how humid is the turf (drier is harder), how long is the grass, etc.

1.10 Newest Busch Stadium
1.09 Miller Park
1.09 Comerica Park
1.08 Dodger Stadium
1.08 Marlin's Park in Miami
1.05 Cleveland
1.05 Phoenix
1.04 Camden Yards
1.03 Anaheim
1.03 Safeco
1.03 Wrigley
1.03 Minute Maid
1.02 Old Yankee Stadium
1.01 Citizen's Phi
1.01 Shea
1.01 Fenway
1.01 Great American
1.00 Kauffman
0.99 Humphrey
0.98 White Sox
0.97 Rangers
0.97 Coors
0.96 RFK
0.96 Oakland
0.96 PNC
0.95 Turner
0.94 Nats Park
0.94 Target
0.92 Giants
0.91 Tampa
0.91 Rogers
0.83 Petco

1.34 Cd. Obregon
1.31 Syracuse
1.30 Lowell MA
1.25 Columbus GA
....
0.73 Ft Myers FL
0.59 Billings MT
0.59 San Luis Potosi
0.56 Poza Rica


#55    Brian Cartwright      (see all posts) 2010/07/19 (Mon) @ 15:26

one goof, when I did the above query I grouped by name instead of park_id and it merged the two Yankee Stadiums - Old was 1.07 in 127k gb, New is 0.90 in 52k gb.


#56    MGL      (see all posts) 2010/07/19 (Mon) @ 16:20

Remember that even though turf is faster than grass, the ground ball hit rate is around the same because there are more IF and bunt hits on grass (and of course the infielders play deeper on turf).


#57    bf      (see all posts) 2010/07/19 (Mon) @ 16:26

54/It’s a measure of the parks

Do you adjust for pitchers as well?

And do you regress it?


#58    MGL      (see all posts) 2010/07/19 (Mon) @ 16:29

I don’t think that true IF park factors are anywhere near plus or minus 15%.  For one thing, as I said, faster turfs have fewer IF hits, so the total hits per GB are pretty much the same at all parks.

In any case, I take multi-year GB park factors and regress them.  How much I regress them is based on the year to year correlations (which are low).  I don’t have the data in front of me (I am cruising to Bermuda as we speak), but the regressed IF park factors are like .98 to 1.02.  TRUE IF (GB) park factors, plus or minus 15%?  No chance.
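
For readers who want to see what regressing a park factor looks like, here is a bare-bones Python sketch. The shrinkage weight of 0.2 and the raw factors are hypothetical, and using a single fixed weight is a simplification; how the weight is derived from the year-to-year correlations and the sample size is not specified in this thread.

```python
def regress_to_mean(raw_factor: float, weight: float, mean: float = 1.0) -> float:
    """Shrink a raw park factor toward the mean; weight=0 gives the mean, 1 keeps raw."""
    return mean + weight * (raw_factor - mean)

# With a low weight, even a raw factor of 1.10 or 0.90 lands near 1.00.
for raw in (1.10, 1.00, 0.90):
    print(raw, round(regress_to_mean(raw, weight=0.2), 3))
```

A low weight is how unregressed factors that look like plus or minus 10% can end up in a .98 to 1.02 range after regression.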

OF park factors are extremely tricky for obvious reasons.

I don’t know what PF I published in the past.  They are probably very different now.  I completely changed the way I do the OF park factors for one thing.

Are they that important?  Eh, not really…


#59    Brian Cartwright      (see all posts) 2010/07/19 (Mon) @ 16:41

The factors I listed are only for 2005-2010, source Gameday.

These factors are ground ball hits to the outfield, and do not include infield hits. Those are in another list.

They are regressed, but do not take into account the pitcher. By holding the teams constant, it’s not a perfect fit, but I am seeing how the same groups of players do in different parks.

By using WOWY for defense, I do not apply park factors in the classic manner, but instead I am measuring how each fielder did in each park, and comparing that to how every other fielder did in the same parks in the same situations.

I do intend in the near future to use these kind of factors for batters and pitchers. A pitcher who throws a lot of grounders would see a disproportionate change in hits allowed when going from a slow to fast turf, compared to using one overall hits factor not broken down by grounders and flys.


#60    terpsfan101      (see all posts) 2010/07/19 (Mon) @ 21:04

Turf fields increase the on-field temperature a few degrees. At least this was true for astro-turf. I don’t know about the new grassy turfs. Well it really doesn’t matter anymore anyway, since no team plays all their home games outside on artificial turf anymore.

When I was doing simple run park factors, a team would lose around .02, .03 runs on their park factors (including dome teams) when they switched from astro-turf to grassy turf.


#61          (see all posts) 2010/07/19 (Mon) @ 23:35

In June of 1991 I got to pitch in a high school game at Busch Stadium (turf). The on-field thermometer read 118 degrees. It was, literally, “blistering” for a bunch of naive HS players that showed up just wearing a single pair of sanitary socks.

Anyway, my rambling point is that turf seems to increase the on-field temp by more than a few degrees, if that matters. FWIW, the game was played at 10:30 AM. At 3:15 the field temp had to be unbearable.

I also learned that major leaguers make diving on turf look much easier than it actually is. Think diving on asphalt.


#62    studes      (see all posts) 2010/07/20 (Tue) @ 18:58

Sometimes I act silly in the middle of a serious discussion.

Sometimes?


#63    Chris Dial      (see all posts) 2010/07/22 (Thu) @ 13:35

Okay, all of the time.  I can’t add smarts, so I go with levity…

My piece on pitching allowed runs in an effort to get the idea rolling.


#64    Tangotiger      (see all posts) 2010/07/22 (Thu) @ 14:24

I created a new thread.

