THE BOOK--Playing The Percentages In Baseball

Friday, December 05, 2008

The state of fielding sabermetrics in MLB

By Tangotiger, 02:40 PM

Will Carroll:

There’s a lot of discussion about defense and defensive statistics coming into vogue, especially with Peter Gammons stating that teams are realizing that the negatives of defense take away from even big positives on offense. Manny Ramirez, he’s looking at you. Adam Dunn? You too.

But while people are crowing about Zone Rating or even Dave Pinto’s Probabilistic Model, they’re missing something. If you think teams don’t read the studies and stats out here, you’re wrong. They’ve done that for years. What most people don’t realize is that while there’s some great minds writing about baseball, there’s some great minds inside the game too. There are the names you know — Eddie Epstein, Ben Baumer, Keith Woolner — and more names that you don’t.

There have been occasions where I’ve been privy to some of that work that’s going on in front offices and it simply blows away things you’d call the state of the art in sabermetrics. Not a little ahead — a lot. Some of it leaks out, some doesn’t. Some is absolutely horrible too, while some is ignored. Just don’t think that teams are using the same things you are. Anyone that thinks that, isn’t thinking.

One of his readers replied:

Your post just said, “There are some smart people in baseball, TRUST me. I can’t say anymore, but it would BLOW YOUR MIND.”

One of the best sabermetricians out there, if not the best, is Tom Tippett (with the Red Sox).  And his fielding work is on par with MGL’s.  Another of the best is Dan Fox (Pirates), and his published fielding work is below that of MGL (and only because of the limited data he uses; presumably he now has access to the same data as MGL and Tom, and so his fielding work will soon be, or already is, on par with theirs).

Fielding systems, once they all have access to the same data, are all pretty simple.  It’s all a matter of the assumptions you make about that data, and how you decide to apply your adjustments.  I’ve always advocated the “SAFE” approach (continuous function), similar to what Gabe did at ProTrade, combined with a WOWY approach.  You put all of us in a room, and in a matter of days, we’ll all come out with practically the same system.  And when HITf/x comes out, we’ll be in even more agreement (after two weeks).  Really, half our job is trying to infer something from the data that hasn’t been recorded.  HOW that inference is done is what differentiates MGL from Pinto from Jensen et al.  But, with the eventual HITf/x (and GPS of fielders), practically ALL that inference goes away.
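To make the “continuous function” idea concrete, here is a minimal sketch in Python. It is not the SAFE model or anyone’s published system; it is a generic Gaussian-kernel smoother, and the function name, the `bandwidth` parameter, and the (distance, was_out) sample format are all assumptions made for illustration. The point is only that the out rate is estimated as a smooth function of distance rather than inside fixed zones.

```python
import math

def smoothed_out_rate(samples, x, bandwidth=10.0):
    """Kernel-weighted out rate at distance x (feet).

    samples: iterable of (distance_ft, was_out) pairs (hypothetical format).
    bandwidth: kernel width in feet (an assumed tuning value).
    """
    num = 0.0
    den = 0.0
    for distance, was_out in samples:
        # Nearby batted balls count more than distant ones.
        weight = math.exp(-0.5 * ((distance - x) / bandwidth) ** 2)
        num += weight * (1.0 if was_out else 0.0)
        den += weight
    if den == 0.0:
        raise ValueError("no usable samples near this distance")
    return num / den
```

Every batted ball contributes to the estimate at every distance, weighted by how close it was, so there is no cliff at a zone boundary.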

So, I don’t accept Will’s perspective on this issue at all.  Sorry Will.


#1    Rally      2008/12/05 (Fri) @ 15:21

Maybe there is great work being done on fielding in MLB offices, but is it being ignored?

I’m talking about simple things, like being able to value the relative worth of Carlos Lee and Mark Ellis.  J.D. Drew got a pretty good contract in 2007, but only one team was really after him, and he got less $ than inferior players like Soriano and Lee, who had multiple bidders.

Too many teams can’t even get the obvious things right, so I’m skeptical that they are doing this advanced work that will blow my mind.


#2          2008/12/05 (Fri) @ 15:28

Do you believe that there are guys working for teams that you’ve never heard of?  Or do you think they all start out publishing stuff publicly and then are hired?


#3          2008/12/05 (Fri) @ 15:48

Oh geez, three of the latter comments are taking BP to task for not implementing their own PBP fielding metric or using someone else’s. I’d like to see BP’s response to that.


#4    Mike Fast      2008/12/05 (Fri) @ 15:54

I don’t know nearly as many front office guys as Will does, and I don’t talk to them regularly, but I am skeptical of his perspective based on how I’ve seen some of the supposedly sabermetrically advanced front offices handle the PITCHf/x data.

I wouldn’t be surprised if there are a handful of clubs out there that have defensive metrics that are on par with what is publicly available but have user interfaces that make the information much more useful.

I know that there are great minds in baseball.  I also know that one person is very limited in what he or she can analyze across the whole scope of the baseball spectrum, and the list of teams that have more than two top-flight people analyzing something is very short.

I’ve been more impressed with some of the things I’ve seen here or at THT or StatSpeak than most of the things I’ve seen from the clubs.

Does Will mean to tell me that the clubs are following what Brian Cartwright is doing over at StatSpeak?  I didn’t get contacted by clubs until I moved to THT.  Not to mention, there is some pretty impressive analysis going on in the public domain that is still unpublished, and if you only follow the main sites (THT, BP), you’re going to miss it until it really becomes mainstream.  That’s not to say that some leading-edge stuff isn’t published at THT.  It is.  But my point is that I wonder if Will is up to speed with what goes on in the public sabermetric world enough to make an accurate comparison between that and the private world.

I recognize that my perspective is limited, maybe more so than Will’s, but like Tango, I just haven’t seen what Will has seen.


#5          2008/12/05 (Fri) @ 17:11

Unless you’re Bill James and this is 1981, I really don’t see how one person would be capable of *consistently* doing groundbreaking work while working for an MLB team.  I think there might be occasional bursts of genius, but it takes a community of colleagues building on each other’s work to make real progress.

I may be wrong ...


#6    Tangotiger      2008/12/05 (Fri) @ 17:57

Will responded “what he said” to this reader:

as someone with a little bit of insight into this (maybe not as much as will), the proprietary stuff that one team i know about has access to is far more in-depth than stuff that’s available to the rest of us, even on pay sites such as BP. i don’t necessarily know how much more useful this information is (or even if the team in question takes it into account or whether the other teams have similar stuff). but my mind was blown away by what i saw. for example, the analysis of an individual player and what he did on the field on every pitch and play over the course of his entire (not insubstantial career), taking into account contexts that i wouldn’t even have conceived of, was something that i simply don’t think an amateur “seamhead” would be able to do one of, let alone what i believe were hundreds of similar analyses in their database. one guy wouldn’t have the manpower, the technology, access to the same information and, perhaps most importantly, the time to do what i saw with the same quality.

Obviously, we are not disparaging anyone’s work here.  And, I think we’re all thrilled for any saberist’s work being done for a team, especially being given the rope (time) to do whatever he has to do.

But I believe that this is mostly a perspiration issue (time and effort), not inspiration (intelligence and insight).

What contexts could we be discussing here?  The identity of the batter, runners, pitcher and teammates?  The game state (inning, score, base, out, count)?  The park?  Climate (temperature, sunlight)?  The speed and angle of the contact?  The kind of pitch thrown, spin, angle of deflection, speed, and location?  What exactly is being considered that one could not conceive that would make that reader say that and Will concur?

***

And the other point being made is that this analysis may not even translate over for a proper decision-analysis.  Carlos Lee, and all the other contracts that we’ve ripped up here, are cases in point.


#7    larry      2008/12/05 (Fri) @ 18:39

well, that’s my post will quoted. i’m not sure how much of this kind of stuff that i can really reveal (as may be implied, this isn’t my own work that i’m referring to and i don’t want to get anyone in trouble). but the most interesting “context” that i saw considered in this quite exhaustive analysis was what i guess i would call “health” variables. i didn’t get all the details about what these variables were supposed to be covering. but i guess what i found interesting about it from an outsider’s perspective was that they were variables that took into account how the player was, for lack of a better term, feeling that game. like, what niggling injury a player was carrying that wouldn’t necessarily be revealed to the media and fans. also, for pitchers, there was some kind of throwing arm strength/fatigue testing component (i’m not up on all the health stuff that goes on so i’m not sure what the technical terms for that stuff are). i don’t really know how this stuff was quantified, either, as there seems to be a subjective component to some of these variables. but, thinking out loud, i suppose one could mitigate some of that by creating benchmarks, either overall or for a particular player.

there were lots and lots of other variables available in these (some of which i’d question what use they would actually have in evaluating player performance but others which i thought were novel or, in a couple cases, things i recall, and to this day still read, people writing “i wish i could quantify x...").

like i said in the post, i’m not sure how much utility this stuff has over and above what the “rest of us” already have access to. but more information is usually better, right?


#8    MGL      2008/12/05 (Fri) @ 22:19

I can’t say what teams are or are not doing, although to some extent we can infer which teams are the “are not doing” ones, from their trades/acquisitions and other strategies.  The “are doing” ones are more difficult to identify.

That being said, to state flatly that teams are doing things that would “blow us away” is a meaningless statement.  One, I don’t think anything a team could possibly be doing would “blow me away,” and two, if there are teams that are doing advanced analysis, it is a small minority.

The principal “problem,” I think, with teams and sabermetrics/analysis, is the disconnect between the work that may be done by analysts in the FO and the GM and manager.  Doing the work and utilizing it are two different things.

I also agree that the amount of work that could potentially be done requires a staggering amount of time.  Are there any teams that currently employ a team of full-time analysts?  If there are not, then don’t tell me about “work that would blow me away.”

Finally, what might blow Will away is not necessarily the same thing that would blow Tango away.  Will is NOT a sabermetrician, although he obviously knows a lot more about sabermetrics than the average fan or sports writer.

I consider Will’s post a “throwaway post.”


#9          2008/12/06 (Sat) @ 00:13

Mike, how much do I owe you for that?

As I have written, I think most of the problem with defense is what’s still missing from the freely available data, leaving us to infer.

I had some correspondence with Dan Fox while he was working on SFR. It’s true his data set wasn’t as precise as some, but his intent was to be able to use it with historical pbp, and as I suggested, minor league pbp.

One thing I like a lot about SFR is that the ability to turn a batted ball into an out is not the only thing being measured - Fox also includes bases allowed to batters stretching hits and runners taking extra bases.


#10          2008/12/06 (Sat) @ 13:39

I can’t imagine that any fielding analysis that uses the Stats Inc historical PBP database (that is, with zone and distance for each batted ball) deviates much from UZR.  If Stats Inc included player positioning prior to the pitch and prior to the swing, we could have a vastly better algorithm - though not one that’s computationally more “mind-blowing”.

What might “blow my mind” is the outcome of (though not necessarily the computation inherent in) work done by some team that made a conscious decision 5 or 6 years ago to use scouts/stringers to observe MLB and minor-league games and generate a much better dataset than Stats.  I remember seeing a mention that the Red Sox started something like this.

Ignoring Will’s hyperbole, there is an important point here, which is that a lot of teams have gotten the message that fielding might have some value to it.  Whereas before they might have valued it, at a “gut” level, at some infinitesimal amount relative to how they valued hitting, now they might actually understand that bad fielding costs one or two wins.  That’s a lot of “mights” - and notice that Will doesn’t make reference to players with seemingly mediocre bats in the corners who have higher contracts because their fielding is perceived to be good.


#11    sam      2008/12/06 (Sat) @ 15:33

Thanks for this post.  My reaction to Carroll’s post was much of the same, without the insider knowledge to support my reactions.  It’s also worth noting Carroll’s brief example:

“Scotty—I’m not enough of a stat guy to comment on why, if I was allowed to. The things I was shown—which I’m sure wasn’t the “good stuff”—were done on condition of secrecy. The best I saw was a graph of defense, rather than a number, but I really can’t say much more about it without violating trust.”


#12    Rick      2008/12/06 (Sat) @ 19:35

In light of the current financial crisis, we would be quite irresponsible to fail to emphasize the enormous difference between having analysis done and actually using it to guide decisions. MGL made the point, but it really can’t be overemphasized.

Most large organizations have an in house group of analysts.  It is the rare organization that has executive leadership willing to not just let the analysts sit at the table, but rely on their advice.

I’d be shocked if an organization didn’t have somebody, and likely multiple people, doing sabermetric work.  But how many GMs, scouting directors, etc. understand it and utilize it as a core part of their decision making process?  Many fewer, I’m sure.  Juan Pierre.  Luis Castillo.  Tom Glavine.  Livan Hernandez.  Barry Zito.  Carlos Silva.

I’m sure there’s some amazing work being done in offices across the league.  There seem to be many fewer GMs who are willing to listen to their employees.


#13    Pizza Cutter      2008/12/06 (Sat) @ 22:40

I think there’s a distinction to be made on the issue of what might be “mind-blowing” about what the MLB teams/consultants have done.  Consider for a moment that guys who work for MLB teams probably have much better data to work with.  I love Retrosheet, and good Lord knows I’d be nowhere without it (I suppose it’s debatable if I’m anywhere with it), but it’s a limited data set.  I’m sure all of us who do analyses would recognize the moment where you stop and say “If only I had (insert wish here) available...” Plus, most of us are hobbyists.  I can put a few hours a week into this stuff (some weeks, no time at all).  I assume many of you are in the same position.

A thought experiment that I am perhaps uniquely unqualified to offer.  No one’s ever shown me any defensive graphs and I have no idea what exactly some of the teams are working with.  I can hazard a few guesses, and I’d probably be right.  But let’s level the playing field, so to speak.  Suppose that someone set you up with the same data sources and a full time income so that your job was simply do Sabermetrics.  I’ll let you collect yourself and wipe the drool off the keyboard…


#14    MGL      2008/12/07 (Sun) @ 00:16

I’ll play the other side of the net for a minute.  We are in the middle of a sabermetric revolution/evolution in baseball.  Of course some teams are doing good work, while other teams are not - yet.  And everything in between.  This year, more teams are doing better work than last year.  Next year, more teams will be doing better work than this year.  Nothing “mind blowing” about that.  And yes, Will’s post appears to be hyperbolic. One man’s “mind blowing” is another man’s, “that’s nice.” And when someone says, “I saw something really cool, but I can’t tell you what it is,” well…


#15    Peter Jensen      2008/12/07 (Sun) @ 00:18

It certainly would be possible for a team to have set up a video photogrammetric analysis system that would do for their own home park what the proposed Fielding f/x system would do for the league.  That would give them more accurate data on hit locations and hit ball speeds than any of the observation based data gathering systems that both MGL and I have shown to be lacking in agreement.  If a team had done so it could have a “mindblowing” fielding metric for their own players and some information about their opponents from games played in their home park. A team could also be gathering similar information from their minor league clubs.  In addition to providing more accurate numerical information for sabermetric type analysis, frame by frame observation of video would also aid pitching and batting coaches in diagnosing mechanical flaws of players.  There would be no reason for any of us amateurs to know if such systems exist.


#16    Mike Fast      2008/12/07 (Sun) @ 02:22

Peter, I agree that a team could have done that in theory.  I’d give about 20:1 odds that it hasn’t happened.  It’s not just a simple matter of setting up a couple cameras and collecting the video.  It takes a lot of engineering expertise to make those cameras work and then to get anything useful out of the resulting video. 

I have seen no signs that teams have anything approaching this level of expertise.  Dan Fox is a sharp guy.  Tom Tippett is a sharp guy.  I don’t think either one of them would have the expertise to make such a system work.  Could a club have tracked down people with the expertise to make it happen?  Sure.  But would any club be willing to put in the multi-hundred thousand dollar investment to make it work?  I seriously doubt it when even the sabermetric savvy clubs are depending heavily on interns and cheap help to do their number crunching rather than shelling out $100k or higher salaries to get experts to do their work. 

Maybe that cost landscape is starting to shift for a few clubs so that they are willing to put more bucks into analysis.  Cleveland is well known for their comprehensive database, and everyone believes that’s what Huntington hired Dan Fox to put in place in Pittsburgh, and Boston is known to be doing some of the things that Larry mentioned in post #7.  But those are a step below the level of investment and expertise needed to make a good PITCHf/x or FieldF/x system work, IMO.

Also, you mention that there is no reason for amateurs to know if such systems exist.  Several industry professionals have already commented in this thread.  Maybe none of them have worked with Boston or Cleveland, the two teams I’d finger as most likely to be the subject of Will’s mind-blowing encounter, (or maybe they have, for all I know) but that kind of thing doesn’t stay secret for too long in an industry this small.


#17    Mike Fast      2008/12/07 (Sun) @ 02:36

To belabor my point even further, it’s not that I think a team couldn’t have done something really, really cool that none of us expected and a level above what we could do.  It’s possible.

But what I have not gotten a sense for in my interaction with the guys who work for the clubs is that they are any different than the professionals I have worked with elsewhere on a day-to-day basis.  There are some ordinary people, a few smart people, a lot of hard-working people, and a genius here or there.  Put several of them together and they can come up with some pretty cool stuff. 

But, like the rest of us, they have to deal with things like budget realities, having more things on their plate than they can handle, having to postpone the important for the urgent, management who doesn’t understand why they want to do something, etc.

In some ways, the insiders/outsiders contrast reminds me of industry versus academia.  On the one hand, the really coolest stuff happens in academia, where all sorts of esoteric ideas can be pursued.  The same is true in sabermetrics.  The stats guys in the industry don’t have the luxury of sitting around thinking about a tenth of the stuff that we do.

On the other hand, what happens in industry is almost more impressive than what happens in academia.  Taking an esoteric idea, making it manufacturable, and finding a market to sell it profitably, takes a heck of a lot more dedication and concerted effort and not a little bit of genius at times, too, for that matter.  So what the stats guys in front offices do that impresses me is not primarily the mind-blowing nature of the analysis they conceive but the ability to integrate with scouts and players and businessmen and figure out how to make the data useful to all those parties.


#18    larry      2008/12/07 (Sun) @ 11:50

Mike,

just to play a devil’s advocate on your point about whether a team would be willing to make a multi-hundred thousand dollar investment (which is probably a good estimate of what doing such more advanced work costs). one would think that, considering the cost of a single win on the free agent market, a rational team (assuming they thought there could be almost any value added by such an investment) would make such an investment.


#19    dan      2008/12/07 (Sun) @ 13:08

To expand on what Larry said in #18, if a FA costs about $4.8M per win, then the cost of one extra run would be $480K. That might not be how you’re supposed to think about it--I’m not sure. But it seems that virtually any amount of runs that can be added to a team would be worth the investment. If a team spends $480K per year, and adds on average 10 runs every 10 years, then with salary inflation, they have done better than break even.
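A rough sketch of that arithmetic in Python, for anyone who wants to play with the numbers. The $4.8M per win and $480K per year come from the comment above; the 10 runs per win conversion is the usual rule of thumb implied by those two figures, and the function names and the inflation rate you pass in are assumptions for illustration only.

```python
def run_value(cost_per_win: float, runs_per_win: float = 10.0) -> float:
    """Market price of one run, given the market price of one win."""
    return cost_per_win / runs_per_win

def cumulative_net(cost_per_win: float, annual_spend: float,
                   runs_per_year: float, years: int,
                   inflation: float) -> float:
    """Total value of added runs minus total spending over `years`.

    Assumes the analysis budget stays flat while the market price of
    a win grows by `inflation` per year (a hypothetical rate).
    """
    net = 0.0
    for year in range(years):
        inflated_win_cost = cost_per_win * (1.0 + inflation) ** year
        net += runs_per_year * run_value(inflated_win_cost) - annual_spend
    return net
```

With zero inflation, one added run per year against $480K of spending nets out to zero over any horizon. Any positive growth in the price of a win pushes the total above zero, which is the "better than break even" claim.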


#20          2008/12/07 (Sun) @ 13:11

Larry, that makes sense to some people.

Funny that we’re talking about installing a photogrammetric system, because that’s what I do for a living.

For years, we have worked viewing actual film aerial photographs, making maps from those. We have been using a mapping software for almost 20 years that everyone in the department knows very well. Now everything is moving over to digital photography. The software to view the digital images doesn’t support our old mapping software, so everyone needs to be trained how to use a drafting software, jerry-rigged for mapping. Even when and if we learn how to use it, it’s not as fast and not as accurate. I have a quote from the mapping software vendor that for $20k they will write the drivers necessary to use the software we all know and love with our digital photos. But, of course, the boss says that’s too much money. But how much money are we wasting on using what we have now? Now that so many people are retraining and they are screwing up, the boss is getting mad, so it might be time to bring the $20k up again.


#21    Tangotiger      2008/12/07 (Sun) @ 14:14

You may think that teams would value the wins this way.  They simply don’t.  Saberists are far more considered an expense than an investment.

When the first team signs two full time employees to work on PITCHf/x full time and pays them the corporate America rate (not the 20-40% discount workers give to MLB teams), then we’ll know that there’s at least one team that sees this as an investment.

I don’t know when that first team will take that plunge, but I’ll assume that the second team will be, optimistically, for the 2011 season.


#22    MGL      2008/12/08 (Mon) @ 07:26

While we (the hard-core persons on this site) know (at least we think) that an expenditure of 1 million per year or more on a sabermetric department in the FO is a bargain of an investment, it takes a lot of pieces to fall into place for a team to think/know, and actually implement, it.

Put yourself in the place of an owner or a GM (or other high level-exec in a baseball franchise) - smart, but knows almost nothing about the value of sabermetrics.  Maybe reads and hears a lot about it.  Does not doubt that it might have some great value.  Certainly hears it denigrated by someone - probably more than one person - in his organization.  Probably by at least one person who is also smart and well-respected in the organization.  The manager, especially if he is a veteran one, wants no part of it, perhaps.

Are you going to “blindly” give the OK to budget one or two million for a sabermetric/statistics department, even though that is a drop in the bucket compared to payroll and expenses in general?  Nope.  You might take a flyer on one guy at $50,000, maybe even 75.  Eventually, if you think it is paying dividends - and that is going to be really hard to determine - you might expand that department and budget more and more money.

Or the perfect combination of ownership and management might see the value right away and think very little of spending a mil or two per year.  Either way, though, I think it takes a “perfect storm”, at least in 2009, to start paying out a mil+ per year for sabermetrics.  In 3 or 5 or 10 years, it might (probably will) be a different story.


#23          2008/12/08 (Mon) @ 14:03

MGL/22 - I often think back to a World Series broadcast I saw on ESPN Classic.  The announcer says of Joe Rudi: “He’s one of the few ballplayers who will work out with weights.” The time from a few players having success with it to everyone doing it was about 15-20 years...Hopefully saber work will move faster…


#24    Tangotiger      2008/12/09 (Tue) @ 17:50

Pinto’s charts of all players.  Here’s Utley:
http://www.baseballmusings.com/cgi-bin/DisplayCharts.py?PlayerID=1679&fpos=4&year=2008


#25          2008/12/10 (Wed) @ 19:59

I’ll just suggest a couple things:

1. Yes, there are teams doing things that would blow you away. Just the use of SportVu is pretty insane.

2. I probably overstated my case and said it badly.

3. ... but I still think there’s a lot more stuff going on inside than out. When there’s good solid work outside, it’s pulled inside quick. (Dan Fox comes to mind ...)


#26    Tangotiger      2008/12/10 (Wed) @ 20:13

Will, thanks for stopping by.  You certainly take your share of heat, so it’s big of you to come into the fire.

As for point 3, Dan is an exception, not the rule.  The “scouting” of sabermetricians is done on a sporadic basis, and with some fortuitous timing.  When you have MGL and John Walsh and other heavyweights sitting on the sidelines for the most part, you know it’s not very efficient.  And certainly, even those saberists that are tapped are not being used to their full potential (lots of simply part time work).


#27    Mike Flatt      2008/12/10 (Wed) @ 20:45

If I understand SportVu correctly, couldn’t it be used to determine all defensive players’ initial starting position?  I think STATS, INC. acquired SportVu and I’m interested to see how they use it for MLB.


#28    Mike Fast      2008/12/11 (Thu) @ 00:09

Is SportVu doing anything with baseball?  I thought they were mainly a soccer company.


#29    Tangotiger      2008/12/11 (Thu) @ 08:40

And, given the same data, the public will do a better job than the teams, even if the teams have all 30 of the top saberists.


#30    Colin Wyers      2008/12/11 (Thu) @ 12:21

The problem you have working for teams (I presume, at least) is that there’s, what, maybe three of you? And one or two of those are interns?

The great thing about us public saberists is that there really isn’t just one of us or two of us. We talk, we collaborate, we share ideas and we correct each other’s mistakes. Maybe MLB does have the top 30 analysts, but I still think that the 30 top analysts working alone don’t have a significant edge on the next 30 working together.

Teams do have the edge in resources, of course, and there are probably some teams out there that are more dedicated than most to utilizing that edge. But that’s no guarantee of better work - try reading the Cathedral and the Bazaar at some point:

http://www.catb.org/~esr/writings/cathedral-bazaar/

By this logic, Microsoft should be able to lap open-source software. And in some cases that’s true. And in some cases? It really isn’t.

[Caveat: I would absolutely love to work for an MLB team (or if I’m being honest with myself, BP), so you can try to read into this any sour-graping you’d like.]


#31    J      2008/12/14 (Sun) @ 01:57

I don’t know if anyone is still reading this thread, but I thought this was interesting with regard to the discussion of organizational structure.

http://blog.seattletimes.nwsource.com/mariners/2008/12/12/no_more_advance_scouts_for_ms.html


#32    Fargo      2008/12/15 (Mon) @ 03:38

To add to a possibly defunct thread, one thing that occurred to me is a term from football teams. After a game, the coaches “grade” each player. I don’t think they do this by reviewing a lot (or even any) pbp film. But it’s SOP in football. 

Imagine in baseball if a team could do this for each defensive player. Maybe not every game. But imagine that they literally watched a player play an entire game—where he positions himself, what he does on every pitch and batted ball.  Do the infielders move to cover the base, do they move to the right cutoff location, how quickly do they get off the throw, do they throw to the right base, how accurate is the throw, how strong?

Imagine that by truly tracking a player’s every move on every play you could learn whether he’s doing the things that are going to put him in a position to make a play, to back up the play, etc.

Would a baseball player accept the idea of coaches “grading” him based on this—giving him a score for a game even on the kinds of things that Tom has asked us fans to do (but only) for a player’s entire season?

The player wouldn’t be terribly receptive, I imagine, if he hadn’t made any errors, hadn’t been out of position on any play he was expected to make, and so on.  But the smartest team players are going to be in position to make a play on every pitch, and many of the things they do will not show up in any of the standard stats. And a smart team might evaluate and train their players by doing such careful grading.

Why couldn’t this be part of the assessment of how well players play defense?


#33    Tangotiger      2008/12/15 (Mon) @ 11:33

They do this in hockey, grading each player from 1 to 5.  I think it’s a perfectly fine system, and I agree with it as a way to present an additional data point.

This is why I support giving a 1 to 5 grade for every batted ball, and you can further extend that for every fielding play.  So, Jimmy Rollins is faced with a “4” play (not a routine play, but should get a lot of outs on it), and he performs at a “4” level (he puts in a little extra effort or moves a bit faster, to make sure to get the out).

I believe in human observation, and indeed, it is THAT human observation to which we are going to be judged against.  Human observation, properly disseminated, may not always be right, but if you go against it, you better have pretty good evidence.


#34    Peter Jensen      (see all posts) 2008/12/15 (Mon) @ 12:24

Tango - I couldn’t disagree with you more.  Human observation is what has gotten Derek Jeter 3 Gold Gloves.  Human observation is what has given us three data gathering systems on hit balls that don’t even come close to agreeing on where the ball was hit, let alone how hard it was hit.  Having people grade a play on how difficult it was will muddy what imperfect information we have even further.  Did a play appear difficult because a player was out of position? Because he reacted slowly?  Because of bad footwork?  You can’t judge positioning and reaction time from TV and it would still be difficult at the park.  You can’t watch all 8 fielders at once. 

The fielding metrics we have are not terrible when analyzing 3 or more years of aggregated data.  They are not particularly good either.  Not because of bad metric design, but because of the inconsistency of observational data.  The only way we are going to improve our projections for fielding AND hitting and pitching is if we have electronically produced data on where a ball lands, at what speed, and at what horizontal and vertical angles it was hit.  Until that day comes we will have to live with the relatively large margins of error that our projections have now.


#35    Tangotiger      (see all posts) 2008/12/15 (Mon) @ 12:45

"Human observation, properly disseminated,”

My Fans’ Scouting Report was never close to giving Jeter the best fielder award.  I’m not in favor of ALL human observation, but human observation, within guidelines, and with qualified observers.

I wouldn’t call 3 STATS stringers, or 1 or 2 BIS observers per game using TV as necessarily a great way to do it.  It’s one step along the way.  If I had a hundred observers spread out at the park, acting as human triangularists, I’d be super happy with that.  And having each of them telling me how well the fielders did, I’d be happy with that too.


#36    Peter Jensen      (see all posts) 2008/12/15 (Mon) @ 13:57

Your 100 observers MIGHT do better than the current STATS, BIS and MLB observers at locating the hit ball position and they might not.  Having them judge the difficulty of the play is a very bad idea for the reasons I gave above.  It is just not something that can be discriminated by observation.  Three or four feet of bad positioning can make an easy play look hard.  An extra couple of hundredths of a second in reaction time can make an easy play look hard.  How can a fan judge correct positioning if he doesn’t know what pitch is going to be thrown and the defensive player on the field does?  Does your average fan note whether an outfielder’s first step is a crossover step or not? 

It’s part of a defensive player’s job to use every tool that he has to make every play look easy.  But ultimately what counts is whether the play is made or not.  Should the smartest players be downgraded because they are successful at making the plays look easy, and the more physically able but not as smart fielders receive more credit when they make the same number of plays but make them look hard?  Very dumb idea.


#37    Tangotiger      (see all posts) 2008/12/15 (Mon) @ 14:33

The observer could include the positioning of the fielder as well.  If an observer is biased such that the plays he calls tough tend to be the ones where the fielder is positioned farther away than other fielders would be, then the data would reflect that.

It is a certainty that more data, intelligently collected and disseminated (note the condition here), is better than less data.  The question is always the degree to which this data impacts things.

As it stands, players are drafted the world over based on visual observation in all kinds of sports, even if there is limited or just basic scoring data.

Our job is to try to quantify and infer as much as we can from all our observations, however objective or subjective they are.

The analysts will look for the biases in this data.


#38          (see all posts) 2008/12/15 (Mon) @ 14:52

As we attempt to look closer, get a more granular view, we find that with current information we just don’t have the resolution - there is too much discrepancy between the observers.

I agree with Tango that increasing the number of observers for any given event will help us average out the measurements, converging towards the true value.

I also agree with Peter that we have to stay away from subjective opinions, such as difficulty of play. Let’s list the facts. Here’s where the ball was hit, who was the fielder with the best chance of fielding it, and did he.

I have always disagreed with relying on LD% to build other measures, such as BABIP or UZR. LD% is much more subjective between observers, and the expected values of LD vs non-LD are so different that it ends up giving very high leverage to a stat with a small sample size and high subjectivity.


#39    Tangotiger      (see all posts) 2008/12/15 (Mon) @ 15:02

"who was the fielder with the best chance of fielding it,”

Why is that a “fact”? 

And, what exactly do you lose by having the observer quantify: “Man, that was such a tough play, I can’t believe he actually got that runner out”?  So, you can call that a “1” hit (meaning a routine hit), and a “5” play (meaning fielder went out of his way to make an out).

If you line up all the 1/5 hit/field plays, you’ll probably find a lot of Ozzie Smith types, and very few Felipe Lopez.

The question is if you will find non-flashy players in there.  Is that data being collected biased against anyone?

And if it is, then you can adjust for that.  Indeed, when scouting such players, then you know that perhaps flashy players are overvalued, and we can try to account for that.

There is value in recording the 1/5 hit/field plays.  The question is if the payoff is there.


#40    Pizza Cutter      (see all posts) 2008/12/15 (Mon) @ 17:30

Peter, you are right that there are some inherent (and somewhat obvious) problems in observational data.  However, as someone who relies a lot on observational data in my day job, I can honestly say that there are ways that you can make that data significantly less biased.  For example, having a good coding scheme/rubric, training the observers, and getting some good inter-rater reliability is a good start.  Writers fit none of these criteria.  Specifically trained observers can, if it’s done right.

Would the trained observers be any better than the usual fielding stats?  I doubt that it would be soooo much better that it would justify the expense (if better at all.) But that’s an open question.

However, there is something to be said for a granular look at specific pieces of fielding.  Coaches do this all the time informally.  If a guy is a good natural athlete and could be a better defender if he took better routes to the ball or if he was better at positioning himself, it’s good for coaches to know that they should focus on those particular areas.  If it’s a throwing problem, then we need to strengthen the guy’s arm.  Worth a look.


#41          (see all posts) 2008/12/15 (Mon) @ 17:55

Tom, yes there is judgement on which fielder was closest to a ball, but I think there’d be much more consensus than on whether a ball was a line drive or not, or a 3 or 4.

We can code things like that, but let’s examine our data so that we can rank them in terms of reliability. I don’t think LD% is reliable, therefore I don’t want to base critical decisions on it. If we can empirically determine which measurements are less reliable, we can lessen their importance, while promoting those which are more reliable. By reliable, I mean how well different trained observers agree when looking at the same thing.
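One rough way to put a number on "how well different trained observers agree" is plain pairwise agreement: for every pair of observers and every play, count how often they gave the same label. The sketch below is only an illustration. The function name and the sample labels are made up, and a real study would likely use a chance-corrected statistic instead.

```python
from itertools import combinations

def pairwise_agreement(labels_by_observer: list[list[str]]) -> float:
    """Share of (observer pair, play) combinations with identical labels.

    labels_by_observer[i][j] is observer i's label for play j.
    """
    agree = 0
    total = 0
    for a, b in combinations(labels_by_observer, 2):
        for x, y in zip(a, b):
            agree += x == y   # True counts as 1
            total += 1
    return agree / total

# Hypothetical labels from three observers on the same four plays.
closest_fielder = [
    ["SS", "SS", "3B", "2B"],
    ["SS", "SS", "3B", "2B"],
    ["SS", "SS", "SS", "2B"],
]
batted_ball_type = [
    ["LD", "FB", "LD", "GB"],
    ["FB", "FB", "LD", "GB"],
    ["LD", "FB", "FB", "GB"],
]

print(round(pairwise_agreement(closest_fielder), 2))   # 0.83
print(round(pairwise_agreement(batted_ball_type), 2))  # 0.67
```

Measurements could then be ranked by this score, with the lower-scoring ones given less importance.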


#42    Tangotiger      (see all posts) 2008/12/15 (Mon) @ 18:08

Brian, it’s a given that the “closest fielder” would have more consensus than the “3/4” system, since the “3/4” system would, as ONE of its requirements, be that we identify all the fielders who could have made the play.

Basically, rather than saying “0..5” in your system, you are saying “yes/no”.

***

Line Drives v Flyballs: this is an important one.  LD outs happen say 25% of the time, while FB outs happen 75% of the time.  This means that if you make an out on a LD, you get +.75 outs, while making an out on a FB gives you +.25 outs (all on average).  That is a .50 out difference based on a classification of a play that could go either way, say, 10 or 20% of the time.
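The arithmetic above can be spelled out in a few lines. This is only a sketch using the round 25%/75% out rates from the comment; the function names are invented for illustration, and the 50/50 case at the end is just one example of a play that "could go either way".

```python
# Illustrative out rates from the comment above (not measured values).
OUT_RATE = {"LD": 0.25, "FB": 0.75}

def out_credit(ball_type: str, out_made: bool) -> float:
    """Credit relative to average: actual outcome minus expected out rate."""
    return (1.0 if out_made else 0.0) - OUT_RATE[ball_type]

def expected_credit(p_line_drive: float, out_made: bool = True) -> float:
    """Credit when the label is uncertain: weight each label by its probability."""
    return (p_line_drive * out_credit("LD", out_made)
            + (1.0 - p_line_drive) * out_credit("FB", out_made))

# The swing caused purely by the LD/FB label on a converted out.
swing = out_credit("LD", True) - out_credit("FB", True)
print(swing)                 # 0.5

# A true coin-flip classification lands halfway between the two credits.
print(expected_credit(0.5))  # 0.5
```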

How about with a 1...5 classification system?  If we say that the average out rates are:
1 - 5%
2 - 25%
3 - 50%
4 - 75%
5 - 95%

If we, as baseball fans, look at a play, will it be possible, at all, for a “5” play from one guy to be marked as a “3” play from another guy?  That’d be pretty hard to do, wouldn’t it?  But suppose it is possible, if you only have 2 observers.  What happens if you have 100 observers?  Aren’t you going to get say 70 people who think something is a “5” play, 25 who think it’s a “4”, and 5 who think it’s a “3”?  And won’t you be able to weight each observer more, based on their historical observations?  There’s policing going on here whereby if you are in the majority, then your votes in the other observations will weigh more.  It’s a great way to weed out bad observations.
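One possible reading of that "policing" idea, as a sketch only: give each observer a weight equal to the share of past plays on which he matched the most common grade, then tally new votes using those weights. The observer names, the history, and both function names are hypothetical, and a real system would need a rule for ties and for observers with little history.

```python
from collections import Counter

def agreement_weights(past_grades: dict[str, list[int]]) -> dict[str, float]:
    """Weight = share of past plays where the observer matched the modal grade.

    past_grades maps observer -> list of grades, one per play, same play order.
    """
    observers = list(past_grades)
    n_plays = len(past_grades[observers[0]])
    hits = {o: 0 for o in observers}
    for i in range(n_plays):
        counts = Counter(past_grades[o][i] for o in observers)
        modal_grade = counts.most_common(1)[0][0]
        for o in observers:
            hits[o] += past_grades[o][i] == modal_grade
    return {o: hits[o] / n_plays for o in observers}

def weighted_votes(grades: dict[str, int], weights: dict[str, float]) -> dict[int, float]:
    """Tally one play's grades, counting each observer by his weight."""
    tally: dict[int, float] = {}
    for observer, grade in grades.items():
        tally[grade] = tally.get(grade, 0.0) + weights[observer]
    return tally

# Hypothetical history: three observers, four past plays.
history = {
    "ann": [5, 4, 3, 5],
    "bob": [5, 4, 3, 4],
    "cal": [3, 2, 3, 5],
}
weights = agreement_weights(history)
print(weights)  # {'ann': 1.0, 'bob': 0.75, 'cal': 0.5}
print(weighted_votes({"ann": 5, "bob": 5, "cal": 3}, weights))  # {5: 1.75, 3: 0.5}
```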

So, in this particular play, the unweighted average is 88%.  As we can see, we have some uncertainty on this play (was it a .95 play?  was it a .75 play?).  The uncertainty level could be lower than a single stringer saying “line drive”.
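For anyone who wants to check the 88%, here is a minimal sketch that turns a set of votes into an average out rate plus a spread. The grade-to-out-rate table is the one listed above; the function name is made up, and using the standard deviation across votes as the "uncertainty" is just one assumption about how to measure it.

```python
from math import sqrt

# Average out rate per grade, as listed in the comment above.
GRADE_OUT_RATE = {1: 0.05, 2: 0.25, 3: 0.50, 4: 0.75, 5: 0.95}

def play_estimate(votes: dict[int, float]) -> tuple[float, float]:
    """Return (mean out rate, standard deviation across votes).

    votes maps grade -> number of observers (or total weight) choosing it.
    """
    total = sum(votes.values())
    mean = sum(GRADE_OUT_RATE[g] * n for g, n in votes.items()) / total
    variance = sum(n * (GRADE_OUT_RATE[g] - mean) ** 2
                   for g, n in votes.items()) / total
    return mean, sqrt(variance)

# 70 observers say "5", 25 say "4", 5 say "3".
mean, spread = play_estimate({5: 70, 4: 25, 3: 5})
print(round(mean, 2))    # 0.88
print(round(spread, 2))  # 0.12
```

Because the counts can be fractional, the same function also accepts weighted vote totals.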

Furthermore, if you have say 100% of the people calling something a “5” play, then that probably means it sits in the higher part of the 90-100% range, rather than at the mean of 95%.

***

What are you going to do if someone says “Ichiro is a great outfielder”?  Will you really be a slave to the numbers, or are you going to say, “Well, there is a certain amount of uncertainty here.  While I’m 90% sure that Andruw Jones is an average outfielder, I will concede that 10% of the time I’m wrong.”

***

Human observation does nothing but help us, if we process that data in an intelligent fashion.


#43    Tangotiger      (see all posts) 2008/12/15 (Mon) @ 18:11

And as I wrote in the 2008 Hardball Times Annual:

Even in “sure out” situations, where the average shortstop makes the play 99% of the time, UZR and its sister systems (David Pinto’s PMR, John Dewan’s Plus/Minus), would only be able to classify the play to say around 95%. That is, there is enough uncertainty in the classification of a ball whereby a play that you and I, as fans, would be able to spot the ball as a “99%” play, the advanced fielding systems, with their limited data input, can only classify it as a “95%” play. That’s pretty good with only a .04 error range here. But, as we saw, the difference between a bad fielder and an average one is .03 outs per play.


#44    Fargo      (see all posts) 2008/12/15 (Mon) @ 18:55

I agree with the idea that at base the grading has to be observational (but some combination of this and “objective” metrics is also reasonable).

But what I would propose as a way to reduce the costs would be a sample of trained observers. 100 seems reasonable, but that gives a +/- of about 10% for any given observation/play. However, if they observe all plays in a game the event log grows a lot, so your number of observers can probably be much smaller.  Also I’d have those observers grade the fielder’s actions even when they aren’t “in” on a play (nowhere near the ball, for example, but are expected to move to cover a base, e.g., the pitcher covering the plate when the catcher has to go after a nubber or a PB/WP).
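The +/- 10% figure matches the usual worst-case margin of error for a proportion at 95% confidence, assuming that is where it comes from. A quick sketch under that assumption (independent observers, a simple yes/no judgment, p = 0.5 as the worst case; the function names are invented):

```python
from math import ceil, sqrt

Z_95 = 1.96  # normal critical value for a 95% confidence level

def margin_of_error(n: int, p: float = 0.5, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a proportion from n observations."""
    return z * sqrt(p * (1.0 - p) / n)

def observers_needed(margin: float, p: float = 0.5, z: float = Z_95) -> int:
    """Smallest n whose margin of error is at or below the target."""
    return ceil(z ** 2 * p * (1.0 - p) / margin ** 2)

print(round(margin_of_error(100), 3))  # 0.098
print(round(margin_of_error(10), 2))   # 0.31
print(observers_needed(0.10))          # 97
```

If grades from many plays are pooled, n becomes observers times plays, which is why far fewer observers might do. That only holds to the extent the pooled observations are independent, which is an assumption worth checking.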

The “grading” can be done on the basis of video, which could be rerun, and the video could even be enhanced by putting electronic markings on the ground to indicate location on a grid.

The “observers,” then, could grade one player for a game, or one position for an entire game—that player’s movements would be captured in video, with a “pic in pic” perhaps for larger simultaneous views of what’s happening on the given pitch or play (not just what that player is doing). (An occasional cutaway to a particularly luscious fan in the stands could be a welcome tension breaker! Sorry guys, no beaver shots!)

Grade each player for ca. 16 random games per year, i.e., about 10% of the regular season games—but grade them on every pitch/play for those games.  The observer panel can do this at their leisure and submit their grades online.

It would be a heckuva interesting experiment.  You’d definitely need to develop rubrics for grading and to train the observers.  I can’t calculate the statistical “power” you’d need—you might not need 100 observers of each play(er) to get what you want if the different plays over the entire game are pooled together as observations.  Maybe 10 trained observers would be needed for each position. A competent sampling design expert would be able to figure this out.


#45    Colin Wyers      (see all posts) 2008/12/15 (Mon) @ 22:24

I’d rather have data that’s less, not more, subjective. Instead of telling me whether or not a ball was a ground ball, line drive, fly ball or popup, just give me speed and distance and I can figure out the rest from there. If you figure speed and distance using HitF/X or whatnot, fine. If you figure it from a stopwatch and a stringer with a diagram of the field, that’s fine too. Instead of telling me who had the best chance of fielding the ball, just tell me where everyone was on the field at the time the ball was hit. Again, I can take it from there.


#46          (see all posts) 2009/01/03 (Sat) @ 15:29

Mike Fast, could you drop me a private email (click my name)...thanks

