THE BOOK--Playing The Percentages In Baseball

Wednesday, March 21, 2007

Creating a Fielding System

By Tangotiger, 08:24 AM

Joe Arthur reports interesting data on Manny Ramirez:


Extending what Misterdirt did with retrosheet and including 2006 I get 4 year totals by hit type of:

Fenway
F: Manny 363/554 .655 (312 TB allowed, 1.63 per hit); all others 645/948 .680 (501 TB allowed, 1.65 per hit)
L: Manny 42/334 .126 (404 TB allowed, 1.38 per hit); all others 70/646 .108 (780 TB allowed, 1.35 per hit)
P: Manny 0/0; all others 1/1 1.000 (0 TB allowed)

Red Sox Away Games
F: Manny 370/445 .831 (115 TB allowed, 1.53 per hit); all others 774/899 .861 (194 TB allowed, 1.55 per hit)
L: Manny 47/365 .129 (414 TB allowed, 1.30 per hit); all others 106/694 .153 (747 TB allowed, 1.27 per hit)
P: none

If you assumed that the specific difficulty of the batted balls evened out, and if you assumed that these other left fielders (red sox opponents and red sox backups for Manny) were collectively average, and if you assumed retrosheet hit types were consistently judged and recorded without errors, then you could conclude from all this that Manny was -14 plays on flys at fenway and +6 on line drives at fenway, and -13 plays on flys in away games and -9 on line drives, in his actual opportunities. There is no sign that he is better at preventing extra base hits at Fenway, and no sign that he is worse at allowing them away from Fenway. His total bases allowed seems completely comparable in both. Cumulatively over 4 years, he would be -8 plays at Fenway and -22 plays away. Accounting for the hit value of those missing plays, that would be about -10 runs and -23 runs. So that’s about -8 runs per year (actual) in an average of 1104 defensive innings. Casting that into the 150G = 1350 inning basis quoted for UZR, you’d get -10 runs a year, (in terms of performance, not talent).
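The play counts in the quote can be checked with a few lines of Python. This is a minimal sketch, assuming (as Joe does) that "all others" stands in for the average LF; the function and variable names are made up for illustration.

```python
def plays_vs_baseline(outs, chances, base_outs, base_chances):
    """Outs made minus the outs the comparison group's rate would give
    on the same number of chances (positive = better than baseline)."""
    baseline_rate = base_outs / base_chances
    return outs - chances * baseline_rate

# (Manny outs, Manny chances, others' outs, others' chances), 2003-2006,
# copied from the quoted Retrosheet totals. F = fly ball, L = line drive.
splits = {
    ("Fenway", "F"): (363, 554, 645, 948),
    ("Fenway", "L"): (42, 334, 70, 646),
    ("Away", "F"): (370, 445, 774, 899),
    ("Away", "L"): (47, 365, 106, 694),
}

plays = {key: plays_vs_baseline(*row) for key, row in splits.items()}
fenway_total = plays[("Fenway", "F")] + plays[("Fenway", "L")]
away_total = plays[("Away", "F")] + plays[("Away", "L")]

# Rescale the quoted four-year run total (-10 and -23) from an average of
# 1104 defensive innings per year to the 1350-inning (150 game) basis.
runs_per_year = (-10 + -23) / 4
runs_per_150 = runs_per_year * 1350 / 1104

for key, value in plays.items():
    print(key, round(value, 1))
print("Fenway:", round(fenway_total, 1), "Away:", round(away_total, 1))
print("Runs per 150 G:", round(runs_per_150, 1))
```

Rounded to whole plays, the four splits and the two totals agree with the figures in the quote.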

In short, Manny is about .025 to .030 outs per play worse than the average LF (whether in Fenway or not), and his line-drive advantage in Fenway is offset by his line-drive deficit on the road.

There’s about 350 FB plays in LF, so Manny is 10 outs per 162 GP worse than the average LF, which would be around 8 or 9 runs.

***

In post #102, MGL in the same thread has Manny as being -20 runs, just on the road.  If we focus only on road performance, Manny didn’t do well on LD.  Joe’s data from above shows that he’s -.03 on FB and -.025 on LD.  Work it out, and that puts Manny at -17 outs per 162 GP, or about -14 or -15 runs.
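One way to reproduce the "work it out" step is sketched below. The chance counts are assumptions pieced together from the comments further down: 356 fly balls per 162 GP (comment #13) and roughly 600 total chances (comment #12), which leaves about 244 line drives. The run costs per missed play are the ones Peter Jensen gives in comment #2. The post may well have used slightly different inputs.

```python
# Road rate gaps quoted above (outs per play relative to the average LF).
FB_GAP = -0.03
LD_GAP = -0.025

# Assumed chances per 162 games (see the lead-in; not stated in the post).
FB_CHANCES = 356
LD_CHANCES = 600 - FB_CHANCES

# Run cost of turning an out into a hit, from comment #2.
FB_RUNS_PER_PLAY = 0.92
LD_RUNS_PER_PLAY = 0.83

fb_outs = FB_GAP * FB_CHANCES
ld_outs = LD_GAP * LD_CHANCES
total_outs = fb_outs + ld_outs
total_runs = fb_outs * FB_RUNS_PER_PLAY + ld_outs * LD_RUNS_PER_PLAY

print(round(total_outs, 1), round(total_runs, 1))
```

Under these assumptions the total lands at about -17 outs and a little under -15 runs, in line with the range given in the post.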

Pretty much, we see that Joe’s results and MGL’s results match.

What’s in question is Fenway.  I do not agree with MGL’s method of park adjustment. In post 104 he says:

Presently, I use one single number to adjust ALL of a player’s stats in each park, regardless of the zone it was hit in. For example, in Fenway, I think the LF park adjustment is 81, meaning that LF’ers catch 81% of the total balls caught in all parks in the AL. So if Manny catches 60% in Zone A in Fenway, he gets credit for catching 60% divided by .81. Same in zone B, etc. That is not a great way to do it of course, but neither would having an adjustment factor for each of the dozens of zones in LF.

Why am I not crazy about this?  In Fenway, Manny (and his opponents) will be playing, say, 20 feet closer to the plate than usual.  If the out rate in non-Fenway parks in that zone is .70, it’ll be, say, .90 in Fenway.  But perhaps in the zone that is 40 feet to his left, the out rate will be much closer between Fenway and everywhere else.  The adjustment can’t possibly be even across zones.

The way to do this is to start with the normal position for a LF at Fenway (against a LHH and RHH).  You don’t have to create an adjustment “per zone”, but you would just create a function of distance (and direction) from that normal starting point.

So, the normal starting point at Fenway is zone “265 from home”, and the out rate is .85.  The normal starting point at non-Fenway is “285 feet from home” and the out rate is .90.  (All numbers for illustration).

For every 10 feet forward, the out rate drops by .05.  For every 10 feet backward, the out rate drops by .10.  For every 10 feet to left or right, the out rate drops by .03.

It doesn’t have to be linear obviously.  But, that’s how you get around the “per zone adjustment”.  And, this has the double advantage of modeling reality.  This is how fielders are positioned, knowing that moving up 10 feet means a tradeoff somewhere.

(Again, all numbers for illustration only.)
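The distance-based idea can be sketched as a small function. This uses the illustrative numbers from the post; the linear form, the way the depth and lateral penalties are simply added together, the floor at zero, and all names are assumptions made for the sketch.

```python
# Illustrative starting points and out rates from the post.
FENWAY = {"start_ft": 265, "base_rate": 0.85}
OTHER_PARKS = {"start_ft": 285, "base_rate": 0.90}

def out_rate(base_rate, feet_forward=0.0, feet_lateral=0.0):
    """Out rate for a ball landing away from the fielder's starting point.

    feet_forward > 0 means the ball is in front of the fielder,
    feet_forward < 0 means it is behind him.
    """
    if feet_forward >= 0:
        depth_penalty = 0.05 * feet_forward / 10      # .05 per 10 ft forward
    else:
        depth_penalty = 0.10 * -feet_forward / 10     # .10 per 10 ft backward
    lateral_penalty = 0.03 * abs(feet_lateral) / 10   # .03 per 10 ft sideways
    # Assumption: penalties add, and the rate cannot go below zero.
    return max(0.0, base_rate - depth_penalty - lateral_penalty)

def out_rate_at(park, ball_distance_ft, feet_lateral=0.0):
    """Out rate for a ball landing ball_distance_ft from home plate."""
    feet_forward = park["start_ft"] - ball_distance_ft
    return out_rate(park["base_rate"], feet_forward, feet_lateral)

# The same spot on the field gets a different baseline in each park.
print(out_rate_at(FENWAY, 265), out_rate_at(OTHER_PARKS, 265))
```

The park adjustment then falls out of the two baselines for each landing spot, rather than from one number applied to every zone.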

#1    MGL    2007/03/21 (Wed) @ 11:11

I don’t disagree that my park adjustment methodology is poor for Fenway.  However, in the long run, it should be just fine.  It is not biased (assuming that everyone plays LF starting at around the same location), which is important.

Your method is one that I have toyed around with for UZR (for all the data, not just at quirky parks), but I have not deployed it yet (and probably never will).  It is probably the best way to handle the problem of using such small zones (which I do now - I use STATS slices and distances rather than Retro zones).  Essentially you are smoothing out the per zone baselines and interpolating from park to park.

In any case, in the same thread I used my basic methodology and crunched the data at Fenway only, essentially comparing Manny to everyone else at Fenway, similar to what Joe did, though I treated each zone separately as I usually do.

I forgot how the numbers came out, but they were considerably more conservative than what came out using my basic .81 park adjustment for LF at Fenway.

However, as I have said many times, no matter how you slice it, Manny is likely one of the worst fielders in baseball.  Fenway probably mitigates his deficiencies.  Given all the data and the different ways we have seen of parsing and analyzing it, I would now guess that Manny is around -15 (per 150) going into this year.

Even if Joe (or Misterdirt or whoever is doing the analysis) gets -10 per year over the last 4 years (with no weighting I assume), after age adjusting, you get around -15 going into this year.

Heck one of the reasons why we regress sample data in the first place is to “account” for the mistakes we make in analyzing it.  I may have gotten -25 over the last 4 years with a “flawed methodology” or something like that, but after I get done regressing it, it is more like -18 or -20 which is not that far off from -15.


#2    Peter Jensen    2007/03/21 (Wed) @ 12:13

I am Misterdirt.  Thanks to Joe for extending my analysis.  However, I don’t agree with his translation of plays not made (-30) to runs cost (-32) in the last two sentences of his post.  The LD hit value for LF is .58 and the out value is -.26, so turning a LD from an out into a hit is .83 runs.  For FBs the numbers are .66 hit, -.26 out, .92 runs hit to out.  This changes Joe’s -8 actual runs cost by Manny per year to -6.5.


#3    tangotiger    2007/03/21 (Wed) @ 13:21

Good catch.

http://www.tangotiger.net/scouting/pos2006_LF.html

If we don’t consider the arm portion of his Fan evaluation, Manny comes in 47th out of 55 LF, 17 “points” behind the average LF.  Roughly speaking, that’s about -9 runs.

Crawford is +34 points ahead of the average, which is about +18 runs, and Dunn is at the bottom, with 27 points below average, or -14 runs.

I think calling Manny anything between -10 and -20 runs is reasonable.  It’s hard to justify *anyone* as being worse than -25 runs (unless Frank Thomas is playing SS).


#4    tangotiger    2007/03/21 (Wed) @ 13:24

And Manny does have a plus arm, so we shouldn’t forget that.  That’s worth a couple of runs.


#5    Rally    2007/03/21 (Wed) @ 14:39

Peter, can you explain a little how you came up with that data?

I’m assuming those numbers count all flys and line drives hit towards LF, whether they were right at him, off the wall, or 50 feet in front of the OF.  Is this correct or are there some hit locations that don’t make it into the denominator?

Where on retrosheet can you find this and for how many years is this data available?

Very interesting stuff, but I’m quite a novice when it comes to crunching retrosheet.


#6    2007/03/21 (Wed) @ 15:30

Re: ‘worst’ fielders.  Tom Tippett’s article about fielding evaluation, available on his website, says

“Most of our work at the player level uses zone-based data. We compare the rate at which each fielder turned batted balls into outs in each zone with the overall averages. If a player made more than the normal number of plays, he gets a plus score for that zone. If he fell short of the overall average, he gets a minus score. By computing a weighted average of all of his zones, we get a figure that tells us how many more (or fewer) plays he made than the average defender. We call this figure “net plays”.

“In a typical season, the top fielders at each position make 25-30 more plays than the average. Exceptional fielders have posted marks as high as 40-60 net plays, but those are fairly uncommon. Recent examples include Darin Erstad in 2002, Scott Rolen just about every year, and Andruw Jones in his better seasons. The worst fielders tend to be in the minus 25-40 [net plays made] range.”

As mentioned in a prior post, Jeter was -39 ground ball plays under Dewan’s Plus/Minus for 2005.  So there are some fielders who might be -30 runs in a season, though it’s unlikely that that represents a true talent level.  For that, yes, I agree on -20 or -25 runs.


#7    tangotiger    2007/03/21 (Wed) @ 16:12

I did mean true talent, and not sample data. 

I operate on the principle of players being -.04 to +.05 outs per play.  From 1B (3 plays per game) to SS (5 plays per game), that works out to -19 plays at 1B to -32 plays at SS, or -15 to -25 runs as the lower-limit.  On the high-end, that’d be +20 to +30, as the upper-limit.
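The arithmetic behind those limits, as a rough sketch. The 162-game season and the 0.8 runs per play are assumptions, chosen because they roughly reproduce the numbers quoted above; they are not stated in the comment.

```python
GAMES = 162            # assumption: full season
RUNS_PER_PLAY = 0.8    # assumption: rough run value of one play

def season_plays(outs_per_play, plays_per_game, games=GAMES):
    """Plays above or below average over a season at a given talent level."""
    return outs_per_play * plays_per_game * games

PLAYS_PER_GAME = {"1B": 3, "SS": 5}   # from the comment
LOW, HIGH = -0.04, 0.05               # talent range in outs per play

for pos, per_game in PLAYS_PER_GAME.items():
    low_plays = season_plays(LOW, per_game)
    high_plays = season_plays(HIGH, per_game)
    print(pos,
          round(low_plays), round(low_plays * RUNS_PER_PLAY),
          round(high_plays), round(high_plays * RUNS_PER_PLAY))
```

The lower limits come out at about -19 plays at 1B and -32 at SS, matching the comment; the run totals depend on the assumed run value per play.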

Of course, it’s possible that Ozzie Smith at his peak was better than +30 runs above average as his true talent level.

But, my guidelines are decent enough to carry around.  So, if you see Erstad as +50 runs one year, you really have to ask: How’s that possible?  It could very well be that his ball distribution was not being recorded well, or that the analyst was not getting enough granularity out of it.

Scott Rolen is a SS playing 3B, so I can believe that he reached the upper boundary for 3B (+25 runs).


#8    Peter Jensen    2007/03/21 (Wed) @ 17:36

Rally - Most recent years of Retrosheet have complete data for “batted ball type” (Event field #47) and “fielded by” (Event field #46).  The “fielded by” field is the player who catches or picks up the ball, not the player whose zone the ball was in.  I use every play where the player is listed as the “fielded by” player, along with the batted ball type shown in Retrosheet. I make the assumption that Joe states in his post: that the distribution of balls hit to a player, or “fielding difficulty”, will even out over several years.  So essentially it is a zone system with one big zone, i.e. everything hit to left field.

For the run values you have to create an “Expected Runs Matrix”.  I then assign before and after run values from the expected runs matrix to every play and calculate the difference as the “Run Value Added” for that play.  From there it is easy to query the average Run Value Added for outs to LF on FBs, hits to LF on FBs, etc.  Hope that helps.
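A toy sketch of that procedure follows. The run expectancy numbers are placeholders invented for the example, not real values, and the three plays are made up; adding the runs scored on the play to the before/after difference is also an assumption about how the matrix is applied.

```python
from statistics import mean

# PLACEHOLDER run expectancies keyed by (outs, bases occupied).
# These are invented for illustration and are not real values.
RUN_EXPECTANCY = {
    (0, "---"): 0.50,
    (1, "---"): 0.27,
    (0, "1--"): 0.90,
    (0, "-2-"): 1.15,
}

def run_value_added(before, after, runs_scored=0):
    """Change in expected runs across a play, plus runs scored on it."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored

# Hypothetical plays fielded by LF; "F" = fly ball, "L" = line drive.
plays = [
    {"ball": "F", "result": "out", "before": (0, "---"), "after": (1, "---"), "runs": 0},
    {"ball": "F", "result": "hit", "before": (0, "---"), "after": (0, "-2-"), "runs": 0},
    {"ball": "L", "result": "hit", "before": (0, "---"), "after": (0, "1--"), "runs": 0},
]

def average_rva(plays, ball, result):
    """Average Run Value Added for one batted ball type and outcome."""
    values = [run_value_added(p["before"], p["after"], p["runs"])
              for p in plays
              if p["ball"] == ball and p["result"] == result]
    return mean(values)

# Cost of a fly ball falling for a hit instead of being caught.
fb_hit_to_out = average_rva(plays, "F", "hit") - average_rva(plays, "F", "out")
print(round(fb_hit_to_out, 2))
```

With real Retrosheet events and a real run expectancy table, the same query gives the hit, out, and hit-to-out values per batted ball type.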


#9    Rally    2007/03/21 (Wed) @ 19:22

Looks like you’d miss a play like Ramirez diving for and missing a flyball, the ball rolling to the wall, and Coco picking it up.  It would show fielded by CF.  But plays like that are few and far between so that should be a decent overview of outfield defense.

I guess you wouldn’t be able to apply it to the infield, if a ground ball is fielded by CF you wouldn’t know if it was missed by the 2b or the ss.


#10    Joe Arthur    2007/03/21 (Wed) @ 20:02

Tango,
thanks for continuing the discussion; Manny’s defense is almost as popular a topic as Barry Bonds and steroids, and even more popular than Derek Jeter and his girlfriends ...

I’ll explain what I did (or tried to do) for run values: I thought I was using the values from The Book for out, single and double, but I did so from memory. I used -.29 for the out, +.46 for a single, and +.80 for a double; it looks like The Book has -.299, +.475, +.776, derived from ‘99-’02(?).
Anyway, as I computed it, the run “cost” of a single became .75, with 1.09 for a double. [If hits allowed were evenly divided between singles and doubles, that would be 1.5 total bases per hit, and an average run value of .92.] As Tango’s original quotation illustrates, I found Manny and other left fielders to be allowing a little over 1.5 bases per hit on fly balls, even away from Fenway, i.e. more doubles in the mix than singles, and a run value above .92…
For Manny at Fenway, to get 1.63 TB/hit on fly balls, 63% of the hits should be doubles (assuming no triples). So (.63)(1.09) + (.37)(.75) is an average run value of .96 runs per missed fly ball at Fenway. It would be .88 for missed line drives at fenway, .93 for missed flies on the road and .85 for missed line drives on the road. So I think by my own approach I should have arrived at -8 actual runs at Fenway and -20 on the road over 4 years, not -10 and -23. Prorated to a per 150G basis, that would be -10 and -24, or -11 and -26 per 162G. roughly -9 runs per year in either case.
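The per-play run values in this comment can be reproduced directly. A minimal sketch, using Joe's from-memory run values and his no-triples assumption; the play counts are the rounded ones from the original quotation, and the names are illustrative.

```python
OUT, SINGLE, DOUBLE = -0.29, 0.46, 0.80   # Joe's from-memory values

COST_SINGLE = SINGLE - OUT   # run cost of a single instead of an out
COST_DOUBLE = DOUBLE - OUT   # run cost of a double instead of an out

def run_cost_per_hit(tb_per_hit):
    """Average run cost of a missed play, assuming only singles and doubles.

    With no triples, the share of doubles is simply tb_per_hit - 1.
    """
    share_doubles = tb_per_hit - 1.0
    return share_doubles * COST_DOUBLE + (1.0 - share_doubles) * COST_SINGLE

# Manny's total bases per hit allowed, from the quoted Retrosheet data.
fenway_fb = run_cost_per_hit(1.63)
fenway_ld = run_cost_per_hit(1.38)
road_fb = run_cost_per_hit(1.53)
road_ld = run_cost_per_hit(1.30)

# Four-year run totals from the rounded play counts (-14, +6, -13, -9).
fenway_runs = -14 * fenway_fb + 6 * fenway_ld
road_runs = -13 * road_fb + -9 * road_ld

print(round(fenway_fb, 2), round(fenway_ld, 2),
      round(road_fb, 2), round(road_ld, 2))
print(round(fenway_runs), round(road_runs))
```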

Since that average is the centerpoint of the 2003-2006 period, it would fit with a progression in performance of -6,-8,-10,and -12 over these years, and a projection of -14 for 2007.  I did not take any account of the cost of advancement errors on misfielded hits or bad throws, so this isn’t a complete estimate ...

I’d just emphasize that at the aggregate level there is no sign that Manny has any special skill at “playing the wall” and preventing extra bases.


#11    Joe Arthur    2007/03/21 (Wed) @ 20:38

One of the assumptions I made was that the difficulty of chances basically evened out. But there may be something interesting going on with balls in play to left at Fenway. No adjustment for quality or handedness of batters here, but from retrosheet I counted 2482 balls in play in 5701 1/3 innings at fenway, vs 2403 in 5822 1/3 in the road games (both teams). That’s .435 per inning vs .413. That may just reflect more batters per inning, but the ratio of fly balls to line drives is also different. At Fenway 1502 Flys vs 980 linedrives (ratio of 1.53), and 1344 flies vs 1059 linedrives in the road games (ratio of 1.27).  Some of that (all of it??) has to be home runs kept in play by the wall, but possibly batters at Fenway adjust their approach to “aim for” the wall. If so, perhaps the scatter of balls in play isn’t typical there, which could affect fielding difficulty. This sort of high-level, aggregate view possible from retrosheet can’t address that possibility…
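The rates in this comment follow from the raw counts. A short sketch with illustrative names:

```python
# Balls in play to left field, both teams, from the comment.
fenway = {"flies": 1502, "liners": 980, "innings": 5701 + 1 / 3}
road = {"flies": 1344, "liners": 1059, "innings": 5822 + 1 / 3}

def bip_per_inning(split):
    """Flies plus line drives to left per inning."""
    return (split["flies"] + split["liners"]) / split["innings"]

def fly_to_liner_ratio(split):
    """Fly balls per line drive."""
    return split["flies"] / split["liners"]

for name, split in (("Fenway", fenway), ("Road", road)):
    print(name,
          round(bip_per_inning(split), 3),
          round(fly_to_liner_ratio(split), 2))
```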

I think there’s room for Manny to be found to be rather worse than my analysis suggested, though of course he could also improve ...


#12    Joe Arthur    2007/03/22 (Thu) @ 07:35

whoops!  I said “prorated” in #10 above, but instead I extrapolated to totals that represented 4 seasons’ worth of 162 games. Per 162 games, from his performance over 2003-2006, I’d have Manny at -3 runs in 81 games at home and -6 runs in 81 on the road, for a total of -9 runs. 

“There’s about 350 FB plays in LF, so Manny is 10 outs per 162 GP worse than the average LF”

Tango - I’m guessing, but I think you’re thinking of the value Chris Dial uses for opportunities. Those are opportunities as measured by stats ZR, which disregards the majority of line drives hit to left, and a small portion of fly balls. Using “opportunity” as Peter and I were measuring it from retrosheet, I think you get close to 600 opportunities per 162G for the average LF. I’d have Manny at about -16 plays/162 on this basis…


#13    tangotiger    2007/03/22 (Thu) @ 08:25

I actually used *your* FB (as opposed to FB+LD), which is 356 per 162 GP.

