THE BOOK--Playing The Percentages In Baseball

Sunday, August 16, 2009

Poz getting it wrong again (or at least overstating his case)

By , 12:44 PM

Poz talks about Betancourt again.  He cites his 2009 Dewan plus/minus and his UZR to make the case that he is a terrible defender, which irks him because Dayton Moore and some of Betancourt’s new teammates (namely Bloomquist) think that he is a very good defender.  Poz also says that if you watch him play and you look at his error totals (2) in the last 28 games, you might think he is around an average defender.  Once again, he is complaining bitterly about the signing.

He makes some mistakes in the article, and again, I don’t think he should be making these kinds of mistakes if he is to be revered as much as he is by some sabermetric folk.

From the Poz article:

John Dewan and his people at Baseball Info Solutions have come up with a way to measure defense. What they do, using video technology, is chart every single batted ball on a computer. And then, after entering all the data (how fast the ball’s going, the direction, the height, etc.), they determine how often that exact ball is turned into an out. For instance, a hard ground ball 6 inches to the right of third base — how often does the third baseman come up with that play and throw out the runner. How about a high chopping ground ball that is just over the pitcher’s glove? How about a slicing line drive that would hit the chalk in right field?

He is really overstating the precision with which the data is recorded and I think he knows that, or at least should.  There is no way that they can differentiate between a ball hit 6 inches from the base line and 3 feet from the base line.  And there is NO category that I am aware of that is a “high chopping ground ball just over the pitcher’s glove.” Come on!

Which is one reason why there is so much measurement error in these metrics in the short run.  A high chopping ground ball over the pitcher’s mound (that could easily be fielded by either the SS or 2B) could just as easily have been a ground skinner up the middle that no one could possibly have fielded.  They could easily fall into the same bucket, in which case, the fielder who catches the first one will be overcompensated on that ball and both the SS and 2B will be overly penalized on the second one.  In the long run all of these things will even out, but in the short run, they create all kinds of problems due to measurement error and even bias.

According to the system, an average shortstop — not a good one, mind you, just an average one — would have made 15 more plays than Betancourt did in his first month.

Or another way: If you stretched that out over, say, 150 games, the system says he would make EIGHTY-THREE fewer plays than the average shortstop. It’s unheard of. It’s lunacy. Nobody compares.

No, no, no and no!  That does NOT mean that he would be 83 plays below average after 150 games any more than if Adam Everett was -1 play after one game, that would mean that he would be 150 plays below average after 150 games!  I HATE when people do this!

If Betancourt is -15 plays so far, it is likely that he was somewhat bad on defense, but worse than his true talent, and it is also likely that there is measurement error in that number such that he was probably only half as bad as that number.  That is why if all we knew was that a player was -15 plays after 100 games, we would call his true talent like -8 per 150 or something like that, which is bad but not horrible.
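A minimal Python sketch of the two calculations being contrasted here: straight-line extrapolation versus regressing toward the league average. The 27-game figure is an assumption (it is the number of games that turns -15 plays into roughly -83 per 150), and `k_games` is a hypothetical regression constant picked only so the result lands near the "-8 per 150" ballpark above. It is not an actual UZR or plus/minus parameter.

```python
def naive_per_150(plays, games, target_games=150):
    """Straight-line extrapolation: treats the observed rate as the true rate."""
    return plays / games * target_games


def regressed_per_150(plays, games, k_games, target_games=150):
    """Shrink the observed rate toward league average (zero).

    k_games is a hypothetical constant: the number of games at which the
    observed rate and the league average would get equal weight.
    """
    weight = games / (games + k_games)
    return weight * naive_per_150(plays, games, target_games)


if __name__ == "__main__":
    # Extrapolating one month as if it were true talent.
    print(round(naive_per_150(-15, 27), 1))
    # Regressing a 100-game sample; k_games=180 is illustrative only.
    print(round(regressed_per_150(-15, 100, 180), 1))
```

The smaller the sample, the smaller the weight on the observed number, which is why a one-month figure should move the estimate far less than the raw extrapolation suggests.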

And if Poz really wants to talk about how bad of a SS that KC got in the trade, he needs to STOP quoting numbers from 100 games only.  Again, he knows that. His total career UZR at SS is -7.2 per 150.  If we weight the individual years, it is probably -10 or -11, which is not good, but let’s not mislead people by quoting things like, “That’s 83 plays below average (per season)...”


#1    Terry      (see all posts) 2009/08/16 (Sun) @ 13:24

If I had a dime for every time someone argued against the usefulness of fielding stats by stating that it’s physically impossible for a player to be 83 plays below average, i’d be rich.

Hyperbole can actually get in the way of an argument.

You’re dead on righteous to argue that measurement error and true talent must always be an important part of the prism one uses to interpret any metric.


#2    devil_fingers      (see all posts) 2009/08/16 (Sun) @ 13:49

MGL:

You are completely correct on the facts of the matter. Posts like his are also good reminders to wanna-bes like myself. But…

At the risk of repetition, I think I should emphasize that the reason people like me like Posnanski so much is that, even though he sometimes gets it wrong, he is at least trying to take new approaches seriously and to use them in writing to a “mainstream” audience (in this case, the KC Star, which is usually even less sabermetrically-oriented than his blog).

That doesn’t mean that he shouldn’t be called out when he makes a mistake. I often comment on his blog when I wish I didn’t. I’m sure some people who read just that think I don’t like him, which is the opposite of the case. I get worked up because it bugs me more when Poz does something like the piece on Ricciardi or other stuff because I actually have higher expectations for him. If Olney or Rosenthal was doing this stuff, I’d probably just sigh and leave it alone.

The fact that I (we?) hold JoPo/Poz to a higher standard is just evidence of a (rightly) higher standard.

Just my two cents.


#3    JD      (see all posts) 2009/08/16 (Sun) @ 13:55

"In the long run all of these things will even out, but in the short run, they create all kinds of problems due to measurement error and even bias.”

MGL, I think you missed the most important thing here (though perhaps it’s so obvious and shouldn’t need to be said): Even if Betancourt SHOULD have gotten to that particular ball - even if every other SS in baseball history gets to that ball and Betancourt didn’t - it doesn’t mean he doesn’t get to that particular ball EVERY OTHER TIME. It’s using one play to make an argument when in no other instance in baseball (one at bat, one pitch, one stolen base attempt, etc.) would anybody outside of the very stupid/ignorant even consider doing such a thing.

And for what it’s worth, when KC played Oakland recently, Betancourt made a handful of pretty fantastic plays at short. I’m starting to think that he’s better than the UZR numbers would indicate when he’s trying, and he seems to be mostly trying with his new team. We’ll see how long that lasts.


#4    MGL      (see all posts) 2009/08/16 (Sun) @ 14:24

JD, I didn’t miss that point.  I have said this a million times:  There are 2 sources of error in any sample metric. One is the fact that a player may have actually played better or worse than he would in the long run for whatever reason or no reason at all.  The example you give is correct.  Let’s say there is a ball that few SS other than the best get to on a regular basis.  Betancourt may have gotten to that ball once in one opportunity or even twice in 2 opportunities (or 2 out of 3 or whatever).  In the long run, he may get to only 20% of them, but in one opportunity, he may have succeeded 100% of the time.  That is sample error.  That is a guy hitting 1 HR in his first game or 10 HR in his first 100 PA or whatever.  Sample error due to the random nature of sample results around a population mean.
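To put numbers on the sampling-error example, here is a small Python sketch using the binomial formula. It assumes each opportunity is independent and that the fielder's long-run success rate on this kind of ball really is the 20% used above; `prob_exact` is a name made up for this illustration.

```python
from math import comb


def prob_exact(successes, chances, p):
    """Binomial probability of exactly `successes` plays made in `chances` tries."""
    return comb(chances, successes) * p**successes * (1 - p) ** (chances - successes)


if __name__ == "__main__":
    p = 0.20  # long-run rate on this type of ball, taken from the example above
    for made, tries in [(1, 1), (2, 2), (2, 3)]:
        print(f"{made} of {tries}: {prob_exact(made, tries, p):.3f}")
```

Even a fielder who truly converts only one in five of these will look perfect on them a meaningful share of the time when the sample is one or two balls.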

The second source of error is measurement error and that is a completely different animal.  In sampling statistics it may not even occur.  When we flip a coin X number of times, there is no measurement error.  All we have is sampling error.  In the case of defensive metrics we have large and numerous measurement errors.  For example, that one ball that Yuni got to that is only gotten to 20% of the time by the average fielder, may NOT have had the characteristics we thought it did.  While all hard hit ground balls hit in area X up the middle may be fielded 20% of the time, there are obviously lots of different types of balls in that bucket, from an easy, high chopper (albeit hard hit, if that is our bucket) to a ground skinner.  As well, some of those balls in that bucket are hit directly up the middle and some are hit a little to the SS or 2B side.  In addition to that, sometimes the SS is playing in position A and sometimes in position B because of the batter, pitcher, base runners and outs.  Lots and lots of potential measurement error.  So that ball that Yuni fielded may have been an easy chopper a little towards the SS side, even though it went into the “hard hit ball up the middle” bucket which is only fielded 20% of the time on the average by all SS.
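The bucket problem can be sketched in a few lines of Python. Everything numeric here is hypothetical: the bucket is assumed to hold two kinds of ball (an easy chopper and an unplayable skinner) in a mix chosen so the bucket as a whole is fielded 20% of the time, and the simple "1 minus out rate" credit rule is a stand-in, not the actual UZR or plus/minus formula.

```python
def bucket_rate(ball_types):
    """Average out rate of a bucket given (share_of_bucket, true_out_rate) pairs."""
    return sum(share * rate for share, rate in ball_types)


def credit(made_play, assumed_out_rate):
    """Plays above average for one ball: 1 - rate if fielded, -rate if not."""
    return (1.0 if made_play else 0.0) - assumed_out_rate


# Hypothetical contents of the "hard hit ball up the middle" bucket.
CHOPPER = (0.25, 0.80)  # 25% of the bucket, fielded 80% of the time
SKINNER = (0.75, 0.00)  # 75% of the bucket, never fielded

if __name__ == "__main__":
    rate = bucket_rate([CHOPPER, SKINNER])
    print("bucket out rate:", round(rate, 2))
    # Fielding the chopper: what the data credits vs. what it was really worth.
    print("chopper fielded:", round(credit(True, rate), 2), "vs", round(credit(True, CHOPPER[1]), 2))
    # Missing the skinner: what the data debits vs. what it really cost.
    print("skinner missed: ", round(credit(False, rate), 2), "vs", round(credit(False, SKINNER[1]), 2))
```

Under these made-up numbers the fielder is credited with far more than the chopper was worth and debited for a ball nobody could have reached, which is the over- and under-compensation described above.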

So, to summarize, we have two sources of error in fielding metrics:  One, when a player actually performs good or bad, but better or worse than he would in the long run. Two, when according to the data, the fielder performed like “X” but he really performed like “Y” because the data is not precise.

When we see an above or below average number for a fielder, we regress that number because of both types of error (sample and measurement).  So if Yuni is -15 plays after 28 games, it is more likely than not that he played worse than he would play in the long run (than his true talent level) AND it is more likely than not, that he didn’t actually play as bad as those numbers indicate, because of measurement error (e.g., some of those balls that he didn’t get to were actually harder than their “buckets” suggested and/or some of the balls that he did get to were not as hard as their buckets suggested).

So what exactly did I miss?  I am not following you.


#5          (see all posts) 2009/08/17 (Mon) @ 02:25

No Offense MGL, but sometimes you need to take things with a little grain of salt and not be so perturbed.

I truly don’t mean to insult you here, but you seem to be missing the point here on why people like Poz, and you seem to have way-idealized views on how baseball media should be.

Poz is great for two key reasons:
1.  His asides are funny and entertaining
2.  He embraces new statistics and tries to understand them and use them to enhance his own understandings (and columns)

The first, is a personal thing.  The second, is naturally not going to be perfect....because no baseball media guy is really into it right now....

Most ignore all sabermetrics, a lot of them (particularly older newspaper columnists) publicly rant against older stats, or just make statements that are just stupid.  FireJoeMorgan (RIP sadly) is basically the great example of this.

Now Poz is slightly misusing UZR here.  But as you yourself agree, the main point (Betancourt is a bad to terrible fielder...despite what Dayton Moore claims) is correct.  He’s obviously correct on offensive stats regarding Betancourt as well.  I understand you’re touchy on UZR since it’s your baby, but well...it’s not like he misused it in a way that isn’t understandable.  100 games is something in most stats which people consider to be a decent sample size...it just happens UZR needs really large sample sizes to approximate true talent levels.

Poz doesn’t need to get his stats used correctly all the time to be “revered so highly.” He’s not being put on a pedestal for being up to date on all stats, using them correctly at all times, and therefore being the greatest writer in modern times.  He’s revered for being a guy who’s funny, witty and tries to understand the newest stats and apply them in his columns.


#6    Nick      (see all posts) 2009/08/17 (Mon) @ 04:36

garik - While people like Poz, who I love, have good motives, misuse of statistics is a problem.  In this case, it doesn’t really matter because his point is correct, but what happens if it isn’t so cut and dry? 

I’ll use an example of our good friend Mike Silva.  In one of his latest columns he wrote:

Why did I ask Mientkiewicz this question? Because I looked up Mark Teixeira’s UZR rating. Apparently Tex has a negative UZR rating this season. Anyone who has watched him play knows that is complete nonsense. He very well could be the best defensive 1B in baseball. As Tyler Kepner of the NY Times said at his twitter last night, “UZR says Teixeira is below average at 1B, which completely negates that stat for me”. I concur Tyler.

That’s where somebody misusing statistics can take you.  Similarly to what Poz is doing, Silva is making conclusions based off of small sample size errors in a stat.  You can obviously see that his post is much more dangerous.


#7    MGL      (see all posts) 2009/08/17 (Mon) @ 10:23

qarik16, as I said, I like and respect Poz.  (Maybe not as much as some folks like Tango and Neyer, but that’s a personal thing - Neyer loves the show “The Office” and I think it is unwatchable.)

I was just pointing out some misleading and inaccurate things in one of his articles.  That’s all.  I wouldn’t even do that if it were a typical MSM writer.  That should be a feather in Poz’ hat. 

He is clearly one of the best MSW’s (mainstream writer) out there.  Sometimes if you are good, you get held to a higher standard, like Tiger Woods.  That’s all I am doing.

As well, I like to discuss and clarify the nuances of certain things.

It is also a pet peeve of mine when anyone, especially someone who should know better, quotes a single season number as a proxy for someone’s true talent or how they are expected to play in the future.  That happens ALL the time and it bothers me.  Neyer does it.  People on BBTF who should know better do it, and of course the average fan and MSW does it.  PED’s aside, how about, “Look at David Ortiz’ OPS+ this year - why is he even playing and why is he DH’ing? He’s terrible!  They should release him.”

The fact that UZR is “my baby” means nothing.  Absolutely nothing.  I don’t care about those things.  I may harp on UZR if only because I know exactly how it works.  Heck, it wasn’t even my idea - it was STATS and Dewan’s 15 years ago.  It is not rocket science or a breakthrough.  It is just the application of simple and obvious concepts as it relates to defensive value and talent and given the data that we’ve had for 15 or 20 years now.  Anyone with a modicum of programming skill can put together a UZR or plus/minus program in a short time.


#8          (see all posts) 2009/08/17 (Mon) @ 11:01

You’re comparing apples and oranges here.

Person A (Poz) : Trying to apply a statistic correctly, misuses it.

Person B (Silva): Person who uses a shortcoming of a stat to disregard the stat completely.

The former person (Poz) is making an honest mistake here, and my point above is that MGL shouldn’t have an angry reaction to it (He’s perfectly fine at posting a correction here....and personally, i’d think it’d be nice if he e-mailed Joe Poz to correct him).

But remember, this is a guy making an honest mistake here, and Poz IS not the only one making this mistake.  Mind you, if we can’t rely on UZR numbers on a per-year basis due to sample size (basically), then people doing WAR calculations using these #s are making a similar error too, right?  Unless they decide to use careers to estimate true talent levels, which i’m pretty sure they’re not.

Mind you once again, my point isn’t that we should be free to misuse a stat...but that you shouldn’t take it personally when they misuse it, and that MGL should chill off when it comes to Poz and others.


#9    Mike Fast      (see all posts) 2009/08/17 (Mon) @ 11:45

Mind you, if we can’t rely on UZR numbers on a per-year basis due to sample size (basically), then people doing WAR calculations using these #s are making a similar error too, right?  Unless they decide to use careers to estimate true talent levels, which i’m pretty sure they’re not

We can’t really rely on a one-year sample size of offensive numbers to estimate true talent levels, either.  Just like UZR or Plus/Minus, offensive numbers are subject to sample error and measurement error if you’re trying to estimate true offensive talent. 

The defensive stats do a better job than the offensive stats at minimizing measurement error, but because there are fewer fielding events per fielder than offensive events per hitter, particularly at some positions, the sample error for defensive stats is a little worse.  On the whole, though, I wouldn’t be surprised if the sum of sample + measurement error is smaller for good defensive stats (UZR, Plus/Minus, etc.) than for offensive ones (LWts, etc.).


#10    Michael      (see all posts) 2009/08/17 (Mon) @ 14:42

From what I gather from MGL’s various explanations on the topic, there seems to be a good deal of measurement error based on certain types of balls hit that we can’t qualify exactly, as there would be too few samples in a given bucket to be accurate. Thus we’re missing qualities of balls in play that are different within each bucket.

Based on what Mike is saying, it’s possible that the measurement+sampling error of offensive and defensive statistics are similar. My question would be how much difference is there between measurement error for linear weights models like wOBA on offense and similar models like UZR on defense, qualitatively (can’t expect a quantitative answer on that one)? I kind of have a grasp on the differences in sampling between UZR and wOBA, but not so much on the measurement error.

Also, FanGraphs has a hot topic on Teixeira’s supposedly average defense in terms of UZR. Tyler Kepner tells us UZR is no good because he sees Teixeira being amazing at first. (sigh)


#11    MGL      (see all posts) 2009/08/17 (Mon) @ 16:24

I’ve talked about this before.  One of the differences between offensive and defensive stats is that we have neat little buckets for offensive stats that we don’t have for defense - namely the events themselves - singles, doubles, etc.  We COULD have those for defense, but we don’t. NOT having them is better!

How much measurement error there is depends on your reference point.  If you are measuring singles, doubles, triples, walks, etc., then offensive stats have virtually no measurement error.  However, those neat little packages - the singles, doubles, etc., are not the greatest tools for estimating offensive true talent or how those neat little buckets will play out in the future (offensive projections).  A seeing eye ground ball single and a screaming line drive out are “mis-measured” from the stand point of offensive true talent and projecting future offensive value.  An easy example will suffice.  We have one data point on player A. He hits a screaming line drive for an out.  We have one data point on player B, and that is a bloop single to short RC field.  Who is likely the better offensive player (by a tiny amount of course because of the sample size of data we have)?  All of the offensive metrics that use those neat little buckets (wOBA, lwts, RC, BaseRuns, etc.), which most of them do, will get the answer dead wrong.

That is what Mike means when he says that we actually have smaller measurement error with defense than with offense.  He is right. But again, that is from the reference point of true defensive and offensive talent.  From the standpoint of those neat little (and often inaccurate) offensive buckets, we have much less measurement error on offense, but those neat little buckets are NOT the best thing to use, by far, for estimating offensive talent and predicting future offensive performance.  We use them for convenience.  Some day we won’t.  Some day we’ll do the same thing on offense that we do on defense, when we get better data.  The reason we don’t care on defense is that what we are trying to measure - namely the ability of fielders to turn batted balls into outs - does not affect the quality of the batted ball.

On offense, unless we get better data, we are going to be biased in our data crunching.  For example, a hard hit ground ball to location X from Ryan Howard is probably not the same thing as a hard hit ball to location X from Jason Kendall, even though the current data tells us it is.  Plus, in order to evaluate offense using only batted ball data, we need to incorporate fielder positioning.  Not so with defense, assuming we have not much bias in pool of batters or pitchers, and base runners and outs, which we shouldn’t in the long run (and that is why we make adjustments for those in the short run).

So basically for offense, right now, because of the limitations in our data, we are actually helped by those neat little buckets - the singles, doubles, etc.  And of course, in the very long run, that is all we need (singles, doubles, etc.) because those neat little buckets exactly describe the type of ball hit plus the location of the fielders.

Anyway, getting back to these silly discussions about whether one year samples of metrics are “reliable” or “accurate..”

They are silly because they are what they are.  A sample is a sample is a sample is a sample.  The best way to explain it is this: suppose a metric is perfect.  Absolutely perfect.  I mean perfect as in G-d perfect.  It had better, and I mean it had BETTER, be wrong for a certain exact percentage of players, otherwise someone is cheating.

That is the answer to the, “Why does Teixeira have a league average UZR when we observe him to be a great defensive first baseman (assuming that is true, which it may not be).” And until people grasp the concept in the above paragraph, these discussions are always going to be silly and meaningless.

Let me repeat that for ALL the fans and MSM writers who point to Teixeira or any other player that UZR MAY have gotten wrong (not that many of them are reading this...):

Let’s start with the premise that a certain metric - any metric - is perfect.  Again, I mean STONE COLD PERFECT.  We are starting with that premise.  That metric MUST BE WRONG for a certain percentage of players over any finite period of time!  Again, for that perfect metric, how much wrong and for what percentage of players, depends on the sample size.  But, it MUST BE WRONG for some players.  So how could the fact that someone picks a player for whom it appears to be wrong (and I’ll concede that it is dead wrong for this hypothetical exercise) be evidence that the metric is anything less than perfect?  I’ll answer that question.  It can’t be!  There has to be some other method of analysis to determine how good that metric is.  It CAN’T be by picking one or a few players for whom the metric appears to be wrong.  Why?  Because the PERFECT METRIC HAS TO BE WRONG FOR A CERTAIN PERCENTAGE OF PLAYERS.  THAT IS AN INTRACTABLE LAW OF NATURE OR MATHEMATICS OR WHATEVER YOU WANT TO CALL IT.
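A rough Python sketch of why even a perfect metric has to miss on some players. It models sampling error only (no measurement error at all), uses a normal approximation to the binomial, and every number in it is hypothetical: a fielder whose true out rate is 80%, and "wrong" defined as an observed rate more than 2 percentage points away from that.

```python
from math import sqrt
from statistics import NormalDist


def share_missing_by(rate_gap, chances, true_out_rate):
    """Approximate share of players whose observed out rate lands more than
    `rate_gap` from their true rate after `chances` opportunities."""
    std_error = sqrt(true_out_rate * (1 - true_out_rate) / chances)
    z = rate_gap / std_error
    return 2 * (1 - NormalDist().cdf(z))


if __name__ == "__main__":
    for chances in (400, 1600):
        share = share_missing_by(0.02, chances, 0.80)
        print(f"{chances} chances: {share:.1%} of players off by more than 2 points")
```

The share shrinks as the sample grows but never reaches zero over a finite sample, so pointing at one player the metric appears to miss says nothing about how good the metric is.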


#12    Davor      (see all posts) 2009/08/18 (Tue) @ 03:09

MGL,

there may be another reason for the way Poz said it; how would it look in mainstream article to say: he is worth roughly -0.5 runs per game, which is around -20 per 162 games, because nobody can really be as bad as the metric says… and then the explanation of regression is put in. I don’t think many readers would read the article, and quite a few of those who would would start thinking “well, those metrics can’t be trusted at all, even statheads say that!”. And while putting -80 instead of -20 or so overstates the case, both numbers mean terrible fielder.

As for the question of measurement error, when you try to predict true talent, or future value, discrete measured offensive results may introduce larger error than defensive measurements. But when you are analysing MVP candidates, advanced offensive metrics should be accurate enough, but defensive metrics contain large uncertainty, especially at 1B and C, where most metrics (as far as I know) have problems. So, simply adding offensive and defensive results, like WAR does, is not optimal. And too many people who use stats in discussions regard such metrics as absolutely accurate. To me, it sometimes looks like (hyperbole) weighing your truck (15 700 kg), adding your sleeping powder weighed at the drug-store (10 g) and saying that the mass is exactly 15 700 010 g.
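The truck analogy can be sketched in Python by carrying an uncertainty along with each measurement. The error sizes are assumptions made up for illustration (a truck scale good to about 50 kg, a pharmacy scale good to about 0.1 g), and independent errors are combined in quadrature.

```python
from math import hypot


def combined(value_a, err_a, value_b, err_b):
    """Sum two independent measurements; their errors add in quadrature."""
    return value_a + value_b, hypot(err_a, err_b)


if __name__ == "__main__":
    # Everything in grams. Both uncertainties are hypothetical.
    truck_g, truck_err_g = 15_700_000, 50_000
    powder_g, powder_err_g = 10, 0.1
    total_g, total_err_g = combined(truck_g, truck_err_g, powder_g, powder_err_g)
    print(f"total: {total_g} g, give or take about {total_err_g:.0f} g")
```

The precise measurement adds nothing to the precision of the sum; the total is only as good as its noisiest component.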

There is another problem with stats, which is rarely mentioned: talent level is mostly considered constant, adjusted for aging. But, looking at the individual players, it often fails. For example, Derek Jeter played last year with leg injury, this year he is mostly healthy. His true talent level on D should be at least 5 - 10 runs higher this year than last year. I’d say that the injury effect is more pronounced in defensive than in offensive stats, because managers are more likely to keep in the lineup injured player who still hits, but can’t field, than one who can’t hit.


#13          (see all posts) 2009/08/18 (Tue) @ 08:43

Is current Dewan +/- available to the public? if so, where?


#14          (see all posts) 2009/08/18 (Tue) @ 10:05

Is current Dewan +/- available to the public? if so, where?

It’s available at Bill James’s paysite ($3 a month for a site subscription).  Last time I checked there (and it’s been a while), you were limited by the format to looking at only one player at a time and couldn’t, for instance, call up a list containing all shortstops, etc.


#15    MGL      (see all posts) 2009/08/18 (Tue) @ 11:17

Davor, yes to everything you said, although I don’t think that Poz is deliberately couching things a certain way for his audience.  I think he truly does not have a firm grasp on how these things work.  Again, he has a firmer grasp than 99.7% of the mainstream writers out there.

Using context neutral stats for MVP candidates is ridiculous whether it be offense or defense.  That is a bugaboo of mine when I see analysts and saber-friendly writers constantly quoting OPS+ or wOBA or WAR for MVP type of awards. That is absurd.  You don’t create ANY value with a context neutral stat unless those context-neutral stats happen to turn into runs and wins which they often, but not always, do. One player can have an OPS+ of 110 and another 140, and the former can easily have produced more value in terms of runs and wins.

You are right about the offense versus defense and value thing too.  While defensive metrics use more granular data, it is true that what offensive metrics use are EXACTLY what the player did so that you are closer to MVP type value when you use offensive metrics than when you use defensive ones.  To some extent.  Then again, the defensive metrics do in fact tell you whether a fielder turned a batted ball into an out or he didn’t (and the value of the hit when an out was not made), which is all you really want to know and is essentially the same thing as whether a hitter made an out or single, double, etc.  The difference between the offense and the defense is that a single is a single is a single, and an out is an out is an out, and a double is a double is a double, etc., from the perspective of retrospective value (IOW, we don’t care whether it was bloop single or a line drive out - some people might of course, even in retrospect).  But with defense, while a hit is a hit is a hit and an out is an out is an out, the problem is that darn measurement error and bias in the data.  Some of those outs and hits might have been easier or harder than we thought because of the limitations of the data.  The same is true of those offensive categories (some of those hits may have been hit hard and some not - same for the outs), but not too many people pay attention to that.

Here is the interesting/ironic part though.  With offense and retrospective value, if a guy gets a single, it is what it is, and we don’t care what that single looked like.  For defense, for some reason when a guy makes a play (turns a batted ball into an out), we DO care how hard it was to field!  Why is that?  I don’t know.  I really don’t.

So, for example, let’s say that Teixeira has a zero UZR (let’s PLEASE stop calling -.8 runs negative!) because he caught exactly the same number of balls as an average fielder would catch given the number of balls hit in his area.  Well, if it happened to be that more of those balls that he caught or didn’t catch were more difficult than average (and he actually was VERY good rather than just average, over those same games), people would scream bloody murder, “Hey, I watched him play, and he makes great plays day in and day out!  What’s up with the -.8 runs in UZR?  That can’t be right!”

But what if Teixeira were batting .210 after 50 games but that was only because a lot of his line drives were being caught and no bloops were landing for hits?  Heck, the difference between .210 and .300 in 50 games is only 20 hits or so.  Not too many people would be screaming bloody murder.  Some people might notice that he was unlucky on BIP, but most people would say, “Don’t worry, he’s a good hitter, he’ll get un-tracked soon.” No one would be saying, “.210, are you kidding me?  That batting average stat is worthless!  I see him every day, and he constantly hits the ball hard!”
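A quick Python check of the arithmetic, plus a rough sense of how unlucky such a stretch would be. It assumes 4 at-bats per game (so 200 AB in 50 games) and treats every at-bat as an independent trial with a fixed .300 chance of a hit, which is a simplification.

```python
from math import comb


def hits_gap(avg_low, avg_high, at_bats):
    """How many hits separate two batting averages over the same at-bats."""
    return (avg_high - avg_low) * at_bats


def prob_at_most(hits, at_bats, true_avg):
    """Binomial chance that a hitter with `true_avg` gets `hits` or fewer."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits + 1)
    )


if __name__ == "__main__":
    at_bats = 50 * 4  # assumed 4 AB per game
    print("hits between .210 and .300:", round(hits_gap(0.210, 0.300, at_bats)))
    # 42 hits in 200 AB is a .210 average.
    print("chance a true .300 hitter is at .210 or worse:", prob_at_most(42, at_bats, 0.300))
```

The gap comes out close to the "20 hits or so" quoted above. Such a stretch is rare for any one hitter, but with hundreds of hitters and many 50-game windows some of them will land there.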


#16    Davor      (see all posts) 2009/08/19 (Wed) @ 03:05

MGL, the basic difference between your Tex examples is sample size. Good hitters hit into prolonged bad-luck periods, but it rarely lasts more than 1/3 of the season, and by the end of the season their stats look reasonable (not “normal” for them, but good enough). But the required sample size for defensive stats is much larger. So, if everything Tex had to field this year averaged to 3/4 of the difficulty of the appropriate defensive segment (zone on the field, speed, hard-hit,...; depending on the stat in question) instead of the average 1/2 of the difficulty of the segment, it would be equivalent to hitting line drives right at the fielders for 50 games. But, on offense, he would have 100+ games to get his stats back to normal, while on defense the equivalent time would be the next two years. And when people see second-half statistics, they won’t say “well, he had some hard luck, it should all even out during the next two seasons”.
I think that defensive stats are still at the stage where more different stats should be compared (at the minimum UZR and +/-) and where scouting component should be included (either by professionals, if possible, or knowledgeable fans) - something like Tango’s project. And if there is discrepancy between data points, it should be open-mindedly analyzed.

When we are talking about 1B, I remember you did a study on the effect on scooping throws. I can’t remember if it was just the effect on preventing errors, or was the possibility to get the hitter out if 1B is great at scooping and quick, vs. preventing an error, but allowing a hit included.


#17    studes      (see all posts) 2009/08/19 (Wed) @ 08:57

It’s available at Bill James’s paysite ($3 a month for a site subscription).  Last time I checked there (and it’s been a while), you were limited by the format to looking at only one player at a time and couldn’t, for instance, call up a list containing all shortstops, etc.

Leaderboards are now available, showing the top ten at each position.


#18    Luke Gofannon      (see all posts) 2009/08/19 (Wed) @ 10:45

Leaderboards are now available, showing the top ten at each position.

Thanks for the heads-up.  I will check it out.


#19    MGL      (see all posts) 2009/08/19 (Wed) @ 12:05

Davor, the scoop study is on FanGraphs.  All I can do is look at the error rate differences among fielders with different 1B.  That’s all that matters anyway. If a fielder makes a throw that is difficult to catch, either the 1B makes the catch in which case the batter is usually out or he doesn’t make the catch and someone gets an error.  If the batter was going to be safe anyway (and no error is given), then it doesn’t matter whether the 1B made the catch or not, so we don’t really care about that (although technically we’d love to know about ALL scoops, just like we’d like to know when a catcher doesn’t block a bad pitch but the runner doesn’t advance or the catcher blocks a bad throw with no runners on base).


#20          (see all posts) 2009/08/19 (Wed) @ 13:48

"For defense, for some reason when a guy makes a play (turns a batted ball into an out), we DO care how hard it was to field!  Why is that?  I don’t know.  I really don’t.”

This is an interesting thing I’ve never really given deep thought to.  My initial instincts are there has to be a difference as to why, though thinking through it I’m having a hard time coming up with a good reason.  My thought process right now is if, using your example with Teix, you replaced him with an average defender, that defender wouldn’t have made the same number of plays defensively as Teix did and would in fact produce below average results.  Offensively, a player may be unlucky to only produce average results when their true talent is better than that, but if you replace them with an average hitter the results stay the same - average production.  I don’t feel confident I’ve addressed enough of the issue though, so I’d like to get your thoughts on this MGL (or anyone else that cares to weigh in)?

Given that I look at things through WAR (since it’s freely available at fangraphs, even if it has its faults) - this seems to me it goes along the lines of production vs. a hypothetical replacement which is the basic concept of WAR…


#21          (see all posts) 2009/08/19 (Wed) @ 14:00

Just how difficult a ball was to field seems to matter because there are simply so many more trials for batting than for fielding; I would have to assume that the assumption is that over the course of a season, 600 PAs or so, the level of difficulty of pitching that a batter faces is very very likely to average out so that any random Player A will face very similar quality to any random Player B. How many chances are there in a fielding season however? Far fewer, and the distribution therefore seems to be far more likely to be skewed one way or another for any random Player A or B. Teixeira, for instance, averages just about 100 Assists per season, to use a weak but probably still descriptive statistic for my purposes. I would have to think that it’s hard to assume that ~100-150 chances will necessarily be a standard distribution of difficulty amongst batted balls in his zone.

This is just my thinking, FYI, so I’d appreciate any criticism/responses…


#22    MGL      (see all posts) 2009/08/19 (Wed) @ 16:16

By the way, there are not necessarily many fewer opportunities for fielding than for hitting.  For example, a SS or CF gets around 4 (or more, if you consider the ones that are not caught but possibly could be) chances per game, nearly the same as for hitting.  But…

All PA are created equal.  Not so with fielding chances.  Probably 90% of all chances to and around a fielder are either routine or not catchable.  There are very few balls that can distinguish a good from a bad fielder.  THAT is why the effective sample size of fielding is so much smaller than for hitting.  IOW, when we actually record the number of chances or opportunities for fielding, if we looked at video, we could probably just discard 90% of the batted balls from the data and NOT include them in the UZR (or plus/minus or whatever) engine.  That is an interesting point.

What is really interesting is that it may in fact be that fielding metrics are MORE accurate than offensive ones (because the data used is much more granular), but for the effective sample size, as some people have recently suggested…
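To put rough numbers on the effective sample size point, here is a minimal sketch in Python. The 4 chances per game and the 90% figure come from the comment above; the 150-game season and the 50% out rate on the remaining balls are assumptions made purely for illustration, and the function names are hypothetical. It uses the ordinary binomial standard error, not anything taken from the UZR or plus/minus engines.

```python
import math


def effective_chances(chances_per_game, games, discriminating_share):
    """Return (total chances, chances that can separate good from bad fielders)."""
    total = chances_per_game * games
    return total, round(total * discriminating_share)


def binomial_se(p, n):
    """Standard error of an observed rate p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)


# Assumed season: 150 games at about 4 chances per game (hypothetical).
# If 90% of chances are routine or uncatchable, only 10% discriminate.
total, effective = effective_chances(4, 150, 0.10)

# Assumed 50% out rate on the balls that matter (illustrative only).
se_all = binomial_se(0.5, total)
se_effective = binomial_se(0.5, effective)

print("total chances:", total)
print("effective chances:", effective)
print("noise ratio (effective vs. all):", se_effective / se_all)
```

Under these assumptions the noise on the balls that actually distinguish fielders is larger by a factor of the square root of 10, which is one way to read the claim that the effective sample is much smaller than the raw count of chances suggests.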


#23          (see all posts) 2009/08/19 (Wed) @ 17:36

"That is a bugaboo of mine when I see analysts and saber-friendly writers constantly quoting OPS+ or wOBA or WAR for MVP type of awards. That is absurd.  You don’t create ANY value with a context neutral stat unless those context-neutral stats happen to turn into runs and wins which they often, but not always, do. One player can have an OPS+ of 110 and another 140, and the former can easily have produced more value in terms of runs and wins.”

This is another point I’d like to discuss.  Isn’t the point of using a context neutral stat that if you have a large enough sample size, these issues should just about go away, and now you’re measuring what actually happened in how it directly relates to runs produced?  There is still the issue of the lineup around a player - put Barry Bonds at his peak with 8 of me in the lineup and the only times his production turns into runs is when he hits a HR.  The point, though, is that the player is doing everything in their power to create those runs and you don’t want to factor others’ contributions into said player’s contributions.  Maybe one season isn’t a big enough sample size, I could see that, but once you achieve a large enough sample, what actually happened in terms of singles, doubles, etc. are the exact events that lead to runs, and are measured accurately even if the process for producing them is not (but the process doesn’t matter since we’re just trying to describe the results).

This also goes to your point that maybe fielding metrics are more accurate than batting ones - it would depend on what you’re trying to measure accurately, correct?  It’s certainly logical that they’d be a more accurate measurement of true ability, but when looking at a player’s accomplishments, do we really care about their true talent level as much as what they actually accomplished?  And measuring the events that directly lead to runs offensively (singles/doubles, etc.) should be a very accurate portrayal of their offensive production.


#24    Mike Fast      (see all posts) 2009/11/18 (Wed) @ 15:18

Christina Kahrl made a comment in her article explaining her rookie of the year vote that made me think of this thread.

Maybe it’s because of the limitations of defensive metrics, which are more suggestive than absolutely descriptive…

And offensive metrics ARE absolutely descriptive?  Any more descriptive than defensive stats?

It was actually a very good piece by Christina, and I don’t mean to suggest otherwise by the nit I am picking from it.


#25    Mike Fast      (see all posts) 2009/11/18 (Wed) @ 15:19

Oops, forgot the link to Christina’s article:
http://baseballprospectus.com/article.php?articleid=9775


#26    Tangotiger      (see all posts) 2009/11/18 (Wed) @ 15:33

It’s typical politics isn’t it?

There’s 50-60 good things and 40-50 bad things about everything.  If you want to talk about who and what you are supporting, you focus on the 50-60 good things.  And if you want to talk against it, you focus on the 40-50 bad things.

There is uncertainty in all the metrics, fielding more so than others.  And that statement is pure politics.

And who references Rate2 anyway?


#27    Mike Fast      (see all posts) 2009/11/18 (Wed) @ 15:37

I liked the article simply because of the detail in which she went through her thought process.  Maybe also a little because her vote agreed with my preference.  And I liked to see that she gave fielding significant credit.  That has to be an improvement over most of the voting pool.

And if you read Will’s article on his NL ROY vote for comparison, Christina’s thought process comes out smelling like a rose.
http://baseballprospectus.com/article.php?articleid=9773

