THE BOOK--Playing The Percentages In Baseball

Friday, March 06, 2009

How much is a great fielder worth?  25 runs.

By Tangotiger, 10:05 AM

Here’s one way to tell.

In 2008, with Mark Ellis on the field (perhaps the best-fielding 2B of our generation), the A’s allowed 457 runs while recording 3035 outs.  This means they allowed 4.07 runs per game.  When Ellis was not on the field, they allowed 233 runs on 1270 outs, or 4.95 runs per game.  That difference is a whopping 0.89 fewer runs per game when Ellis is on the field.  In 2008 anyway.
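The arithmetic behind those rates can be sketched as follows. This is only an illustration: the function name is made up here, and it assumes a "game" means 27 outs, which is what the quoted figures imply.

```python
OUTS_PER_GAME = 27  # assumption: one game = nine innings of three outs


def runs_per_game(runs_allowed, outs_recorded):
    """Scale runs allowed per out up to a 27-out game."""
    return runs_allowed / outs_recorded * OUTS_PER_GAME


# Figures quoted in the post for the 2008 A's.
with_ellis = runs_per_game(457, 3035)
without_ellis = runs_per_game(233, 1270)
gap = without_ellis - with_ellis

print(f"with: {with_ellis:.2f}  without: {without_ellis:.2f}  gap: {gap:.2f}")
```

Working per out rather than per calendar game is what lets the "on the field" and "off the field" samples be compared even though they are different sizes.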

Now, this by itself doesn’t prove anything.  There are many reasons why the A’s could have given up fewer runs when Ellis was on the field, by coincidence.  It depends who is pitching, who the other fielders are, which team is batting, and who his backups were.  And, just some lucky bounces.  This is the reason we would like a large sample size.  The games Ellis played and games he missed just won’t cut it.  By increasing the number of players, we hope that all these other externalities work themselves out in the wash, leaving us with one major input (the great fielder) and the one output (runs allowed).

So, using WOWY (With Or Without You) based on balls in play, I selected the best twenty or so infielders (2B, SS, 3B) since 1993.  It’s mostly the names you know: Everett, Sanchez, Bartlett, Rolen, Reese, Hudson, Inge, etc.  By looking at a large enough number of great infielders, the idea is that all the noise around them will cancel out.  My only additional constraint was that he must have been on the field for at least 1000 outs for a given team-season, and must have been off the field for at least 1000 outs for that same team-season.
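The playing-time constraint is simple to express in code. The sketch below is not the actual query used for the post; the record layout, the field names, and the three sample rows are all hypothetical.

```python
from dataclasses import dataclass

MIN_OUTS = 1000  # threshold stated in the post, applied to both samples


@dataclass
class TeamSeason:
    # Hypothetical record: one fielder on one team in one season.
    player: str
    year: int
    outs_with: int      # team outs recorded with the fielder on the field
    outs_without: int   # team outs recorded with him off the field


def qualifies(ts, min_outs=MIN_OUTS):
    """Keep a team-season only if both samples are large enough."""
    return ts.outs_with >= min_outs and ts.outs_without >= min_outs


# Made-up rows, for illustration only.
seasons = [
    TeamSeason("Fielder A", 2001, 2800, 1500),  # qualifies
    TeamSeason("Fielder B", 2004, 4100, 250),   # rarely off the field
    TeamSeason("Fielder C", 1998, 900, 3400),   # rarely on the field
]

kept = [ts for ts in seasons if qualifies(ts)]
print([ts.player for ts in kept])
```

The second row is the Ichiro case mentioned later: a player who barely misses any games leaves no "without" sample to compare against.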

I came up with 68 such seasons since 1993.  The total number of games played was 5386 games on the field and 5400 games off the field.  You have to admit that that’s a lot of games.  When the star fielders were on the field, their team allowed 4.60 runs per game, and when they weren’t on the field, they allowed 4.83 runs per game.  Per 162 games, this difference comes out to 37 runs.
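The per-season conversion is just the gap in runs per game multiplied by 162. A minimal sketch, with a helper name invented for this example:

```python
GAMES_PER_SEASON = 162


def runs_saved_per_season(rpg_without, rpg_with, games=GAMES_PER_SEASON):
    """Convert a runs-per-game gap into runs per full season."""
    return (rpg_without - rpg_with) * games


# Infield rates quoted in the post: 4.83 without, 4.60 with.
infield = runs_saved_per_season(4.83, 4.60)
print(round(infield))
```

Note that the published rates are rounded to two decimals, so recomputing from them can land a run or two away from a figure that was calculated from unrounded data.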

I repeated this exercise with the outfielders: Erstad, Beltran, Cameron, Endy Chavez, etc.  Ichiro was part of my initial group, but since he barely misses any games, he’s not included.  I have 60 outfielder seasons.  Their teams allowed 4.89 runs with them, and 5.00 without them, for a difference of 19 runs per 162 game season.

And I did the same for first basemen: Minky, Derrek Lee, Tex, etc.  I have 96 such seasons.  They allowed 4.89 runs with these great fielding 1B, and 4.99 without, for a difference of 15 runs.

Roughly speaking, that gives us 26 fewer runs allowed (3 parts 37, 3 parts 19, 1 part 15) when you have one great fielder on the field than when you don’t.
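The "3 parts, 3 parts, 1 part" blend is a weighted mean, with the weights read here as the number of positions in each group (2B/SS/3B, the three outfield spots, and 1B). A quick sketch, with names chosen for this example:

```python
def weighted_mean(values, weights):
    """Weighted average of values, using the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Runs saved per 162 games, as stated in the post.
runs_saved = {"IF": 37, "OF": 19, "1B": 15}
# Assumed reading of the weights: positions per group.
positions = {"IF": 3, "OF": 3, "1B": 1}

groups = list(runs_saved)
overall = weighted_mean(
    [runs_saved[g] for g in groups],
    [positions[g] for g in groups],
)
print(round(overall, 1))
```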

Here’s another way to tell.

UZR is shown on Fangraphs.com for seasons 2002-2008.  The best fielder by far is Andruw Jones.  Including his abysmal 2008 season, he was +139.4 runs in 883 games, or an average of… 26 runs per 162 games.
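The same kind of scaling applies here, this time from a career total to a 162-game rate. A sketch using only the two numbers quoted above (the function name is illustrative):

```python
def per_162(total_runs, games):
    """Prorate a runs total over `games` to a 162-game season."""
    return total_runs / games * 162


# Andruw Jones, 2002-2008, as quoted in the post.
jones_rate = per_162(139.4, 883)
print(round(jones_rate, 1))
```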

Sweet, I know. 

If I take the top 20 fielders in UZR, minimum 162 games, their average is +18 runs per 162G.  So, we should feel quite confident that a great fielder adds some 20-ish or so more runs than an average fielder, per 162 games. 

I know most readers here see the range for fielding as +/- 20 runs.  That’s pretty much spot on.


#1    Guy      (see all posts) 2009/03/06 (Fri) @ 12:26

Great stuff.  I don’t disagree with the general finding, but I do see the potential for a small bias, mainly in the IF.  Weak hitters will be pinch-hit for late in games in which their team is losing.  So if Pokey comes out, his replacement is probably facing a better-than-average offense, maybe playing behind a below-average pitcher, and other good fielders (like the catcher) may also have been pulled to maximize offense.  Conversely, when Pokey comes off the bench to play defense in the late innings, it’s a game his team is winning, so he’s got a closer on the mound, the best fielding team the manager can put on the field, and a below-average offense at bat.

I’m sure this isn’t a huge deal:  most of the backup innings are games started by that player.  And some of your good glove guys can also hit, so he plays all 9 or not at all—he averages 8.7 IP per game at 3B.  But Pokey averaged 8.1 IP/G at 2B, and Rey Sanchez only averaged 7.8 IP per game at SS, so they got pulled some and also came into games late.  This might magnify your IF #s just a bit.


#2          (see all posts) 2009/03/06 (Fri) @ 12:33

Hah, Guy stole my post (except the part about pinch-hitting for these guys, which I never thought of) :)  Maybe eliminate games from the sample in which the fielder didn’t play all 9 innings, and instead of using runs per game, use runs per ball in play?  That might help correct for the high-strikeout closers who will be disproportionately on the mound with these defensive wizards in the late innings.


#3    Rally      (see all posts) 2009/03/06 (Fri) @ 12:45

You could look at how many hits they allow with/without, and then convert these to runs.

Then you remove any effect of pitcher walks, strikeouts, and homers, and the possibility that these guys are more likely to be on the field for late inning relievers.


#4    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 12:51

Thanks for the comments.

There is no doubt that there are biases, specifically for the identity of the batters and pitchers, the other fielders, and backups.  That is a given, and I’m not really worried about trying to control for it.  The reason is that this is a first-pass grand look at the issue.  Just taking in the view.

In order to do what we want, we go to the ball in play level to see how many outs and hits are recorded, while controlling for those things.  That’s WOWY.

And you can try to get even more granular.  That’s PMR or UZR.

But, the idea behind what I just did was to give a taste test for how much fielding impact could there be.

Clearly, once I go on the road of trying to account for more biases, and do the adjustments properly, I’m just going to end up at WOWY anyway.  After all, there’s no reason to count the number of HR allowed, with Ellis on or off the field! 

So, if we think in terms of a dinner menu, this is strictly the appetizer.


#5    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 12:55

You could look at how many hits they allow with/without, and then convert these to runs.

Then you remove any effect of pitcher walks, strikeouts, and homers, and the possibility that these guys are more likely to be on the field for late inning relievers.

Right, that’s the basis of WOWY, which I’ve shown saves some 30 or 40 plays per 162 G at the high-end, which is in-line with what we’ve got here.
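One way to see that the two findings line up is to ask what run value per play they jointly imply. The sketch below uses only numbers from this thread (about 25 runs, and 30 to 40 plays per 162 games); it does not bring in any outside run-value table, and the variable names are made up.

```python
HEADLINE_RUNS = 25          # runs per 162 games, from the post
PLAYS_SAVED = (30, 40)      # plays per 162 games, from the WOWY estimate

# Implied runs per extra play made, at each end of the range.
implied = {plays: HEADLINE_RUNS / plays for plays in PLAYS_SAVED}

for plays, value in implied.items():
    print(f"{plays} plays -> {value:.2f} runs per play")
```

Whether roughly 0.6 to 0.8 runs is a sensible value for turning a hit into an out is a separate question that the thread does not settle.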

The real appeal in what I’ve done here is I speak at the most basic level: runs allowed.  I don’t have to explain hits and BIP, and outs to run conversions, etc.  You guys appreciate it because you spend the time to appreciate it.

But, for those guys who are just looking out the store window, not realizing there is a fantastic sabermetric party going on, they need a 5-second look to make an assessment on it.

That’s what this is.  This is my hook.  I’m showing 25 runs.  Once they come into the party, then I can bathe them with WOWY and UZR.


#6          (see all posts) 2009/03/06 (Fri) @ 13:11

That’s a great point.  My mom watches probably 100 Red Sox games each year, but I’m guessing has never heard of OPS.  If I want to explain why Manny isn’t that great, or why Pedroia is REALLY great, that’s a simple but convincing argument.


#7    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 13:22

Yes, exactly.  Everyone understands ERA.  And this is what this is: a player’s “personal” ERA (or RA actually), and compared to his backups, exactly as we’ve seen with “catcher’s ERA”.

In no way should we look at a single player’s “personal ERA”, for the reasons noted.  Not unless he splits his time a lot every year, a perpetual 40-100 game guy. And at that, only look at the career level. This is why it’s kinda ok for a catcher.  But for a catcher, the pitcher-bias will be enormous (“personal catchers” most notably).

But, give me 20 players, each with a few seasons, and it starts to look good.  Appetizing anyway :)


#8    Rally      (see all posts) 2009/03/06 (Fri) @ 14:55

This suggests that the range of fielding impact is much greater in the infield.

TotalZone has similar ranges for infield vs outfield, and for UZR, I think the OF ranges are a bit more extreme than infield.

Maybe we’re overestimating the spread in outfield because of all the noise, wider areas with potential for miscoding, FB vs liners, lack of hang time, etc.

It makes perfect sense that the impact of the top 1B is less, as they field fewer plays than the other infielders.


#9    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 15:03

Rally, I had a somewhat similar thought regarding the IF/OF, but hadn’t considered the “noise” aspect regarding the wider range in UZR.

My next step is to cast a wider net by:
- not forcing the min 1000 out rule, and simply take all comers, and weight their WOWY runs allowed by the lesser of the with/without total outs
- adding more good fielders (reduces quality a bit, but increases sample size)
- going back to more years, say to 1969

Hopefully, I’d get something that will have a substantially reduced level of uncertainty.
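The weighting idea in the first bullet could look something like this. It is one possible reading of "weight by the lesser of the with/without total outs", not a description of any existing implementation; the tuple layout is an assumption, and the second sample row is invented.

```python
OUTS_PER_GAME = 27


def weighted_wowy_gap(player_seasons):
    """Average the without-minus-with gap in runs per 27 outs,
    weighting each season by min(outs_with, outs_without)."""
    weighted_sum = 0.0
    total_weight = 0
    for runs_with, outs_with, runs_without, outs_without in player_seasons:
        gap = (runs_without / outs_without - runs_with / outs_with) * OUTS_PER_GAME
        weight = min(outs_with, outs_without)  # the smaller sample limits precision
        weighted_sum += gap * weight
        total_weight += weight
    return weighted_sum / total_weight


sample = [
    (457, 3035, 233, 1270),  # Ellis 2008, from the post
    (500, 4000, 30, 200),    # made-up season with a tiny "without" sample
]
print(round(weighted_wowy_gap(sample), 3))
```

Using the smaller of the two out totals means a season with only 200 outs "without" counts for little, however many outs it has "with", so no hard 1000-out cutoff is needed.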


#10    Peter Jensen      (see all posts) 2009/03/06 (Fri) @ 16:07

This suggests that the range of fielding impact is much greater in the infield.

TotalZone has similar ranges for infield vs outfield, and for UZR, I think the OF ranges are a bit more extreme than infield.

I think the wider range that Tango found for infielders than for outfielders is a direct result of his methodology.  His method presumes that the replacement fielder will be an average fielder in the aggregate.  If you look at the composition of 25 man rosters, the 4th outfielder is much more likely to be an above average fielder. This is because part of his role is to be a late inning replacement for a poor fielding outfielder in close games. 

This is not true for a utility infielder.  He is more likely to have some ability to play all infield positions but is less likely to be above average at any of them. 

Hence the run differential by Tango’s method will be greater for the infield than the outfield.


#11    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 16:33

Counterpoint to Peter:
http://www.insidethebook.com/ee/index.php/site/comments/uzr_positional_adjustments_revised_with_2008_uzr/#2

I then took the next 30 players who played the most games (players 31 through 60) at each position, each year.  Here’s how the backups did:

pos  Rper150_weighted  class
3    -2.8              2_Backups
4    -0.7              2_Backups
5    -3.0              2_Backups
6    -3.2              2_Backups
7    +4.4              2_Backups
8    -5.4              2_Backups
9    -1.4              2_Backups

Peter may be right about LF, but other than Crawford and some years for Endy, I think all the other OF I had were CF or RF.

So, these guys were probably being compared to -2 or -3 fielders.

Furthermore, it’s likely that the backup to a great fielder is not himself a great fielder, but a poor fielder and decent hitter.  (Just a guess.)


#12    Brian Cartwright      (see all posts) 2009/03/06 (Fri) @ 17:31

I recently ran WOWY on catchers, for GB 2003-2008, WP+PB 1953-2008 and base stealing 2003-2008.

Ground balls only have a range of about +/- 2 runs. Over the past 6 seasons Jason Kendall in 2008 had the most chances, 74.

WP and PB combined are +/- 10 to 12 runs, while stolen bases go up to about 15 runs.

Carter and Freehan maxed out best season combined at +28


#13    MGL      (see all posts) 2009/03/06 (Fri) @ 17:44

Tango, how did you choose the best fielders?  That makes a huge difference!  You can’t choose them by the numbers and then look at the numbers to estimate the spread in true talent. Even if I choose the best by UZR and then do WOWY, I won’t come up with the correct spread in true talent.  By choosing the best fielders based on UZR, I am choosing those players who were partially lucky in UZR.  That luck will carry over into WOWY.  So how did you choose the best fielders?


#14    Tangotiger      (see all posts) 2009/03/06 (Fri) @ 17:59

I used WOWY by balls in play, just like described in my Jeter article in THT08.

I did NOT choose by WOWY by runs per game, for exactly the reason you cite.

Now, if you are arguing that the two methods are similar enough, then I will agree with you, but only very slightly.  There is a tinge of bias there.

However, I’d be happy to select based on UZR or Rally’s TZ or the Fans Scouting Report or any other method of your choosing.  Just give me a list of retroid, and I can run it.


#15          (see all posts) 2009/03/06 (Fri) @ 18:16

What if you did the worst at each position?  I wonder if the magnitude would be the same. 

I also wonder if there is any difference between CF and the corner positions, and among the three (non-1B) infield positions.


#16          (see all posts) 2009/03/06 (Fri) @ 18:51

"I used WOWY by balls in play, just like described in my Jeter article in THT08.”

So you just chose the best players from the players with the best WOWY?

If that is the case, then you still have to regress (albeit not that much) to get the true value of a great fielder, no?

My main issue is that you are not really increasing your sample size (in order to not have to regress or to only regress a small amount) if you combine players that are good based on their sample numbers.

For example, say you have 20 players who all have great WOWY numbers and they average 2000 BIP each (around 3 or 4 seasons).  Combining all those players into one does NOT give you a larger sample size.  You still have an entity with a sample size of 2000 BIP.  Now, if you sampled those players randomly or using some independent (of the WOWY value) criteria, then you can treat the group as if it were a sample size of 40000.
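A toy simulation makes the point concrete. Everything in it is an assumption made for illustration: the normal distributions, the talent and noise spreads, and the pool sizes (larger than 20 players so the result is stable). It only shows the direction of the effect, not its real size for fielders.

```python
import random
from statistics import mean

rng = random.Random(2009)  # fixed seed so the run is repeatable

TALENT_SD = 5.0    # assumed spread of true talent (runs)
NOISE_SD = 10.0    # assumed sampling noise in the observed number (runs)
N_PLAYERS = 2000
TOP_N = 100

talent = [rng.gauss(0, TALENT_SD) for _ in range(N_PLAYERS)]
observed = [t + rng.gauss(0, NOISE_SD) for t in talent]

# Select the "best" players using the observed numbers themselves.
ranked = sorted(range(N_PLAYERS), key=lambda i: observed[i], reverse=True)
top = ranked[:TOP_N]

observed_mean = mean(observed[i] for i in top)
talent_mean = mean(talent[i] for i in top)

print(f"observed average of selected group: {observed_mean:.1f}")
print(f"true-talent average of same group:  {talent_mean:.1f}")
```

Pooling the selected players does not shrink the gap between the two averages, because every player in the pool was picked partly for having good luck in the same direction.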

That is what I was driving at.


#17    Rally      (see all posts) 2009/03/06 (Fri) @ 20:25

TotalZone is not the same, but it’s using retrosheet data, so I think you’d have the same players regardless.

Selecting by fan scouting report might be best.


#18    MGL      (see all posts) 2009/03/06 (Fri) @ 22:47

Selecting by the Fan Scouting report is only going to answer the question, “What is the max number of runs saved for fielders that the fans think are the best.” These are obviously not necessarily the best fielders.

The only way to do this is to use the numbers themselves and then to regress appropriately.  You can do that with UZR, TotalZone, WOWY, or whatever you want. I think that is what Tango was doing, which is fine, other than the fact that the numbers must be regressed and I don’t think he did that, but I could be wrong.


#19    Tangotiger      (see all posts) 2009/03/07 (Sat) @ 07:46

If I wanted to know the relationship between (K-BB)/PA and runs allowed, I would select the best at (K-BB)/PA, and worst at that same metric, and compare what their runs allowed is.

I would not need to do regression.

It’s the same process here.  I am selecting (basically) on which CF had the lowest hits/BIP, and then looking to see what their runs/out is.  No regression needed.


#20    Peter Jensen      (see all posts) 2009/03/07 (Sat) @ 11:34

It’s the same process here.  I am selecting (basically) on which CF had the lowest hits/BIP, and then looking to see what their runs/out is.  No regression needed.

Did you select your best fielders on the basis of their career numbers and then look for seasons for those fielders where they were on and off the field 1000 outs each, regardless of how they performed in that particular season?  If so, I agree with you that no regression is necessary. Although, the fact that the player was not on the field for 40 games in a particular season may be introducing some bias that influences your results.

I think MGL may be assuming you looked at the highest player performance seasons and then selected from them the seasons where the player was on and off the field for a 1000 outs.  In that case regression would be necessary.


#21    Tangotiger      (see all posts) 2009/03/07 (Sat) @ 12:44

Your first paragraph is what I did and I agree with.

***

MGL’s point is valid *only* if I looked at the lowest runs per out for each fielder (or lowest compared to his backups).

If I selected particular seasons where I looked for the lowest hits per BIP among all CF, then I would not need regression, just like if I looked for the lowest FIP scores in any particular season would not require any regression when comparing to runs per 9IP of that same season.

If you take the 50 best FIP seasons since 1993, they will pretty much match those pitchers’ runs per 9IP.

And if you run a regression of all FIP seasons since 1993, the slope will be right around 1.

As long as you don’t select on the thing you are measuring, you are not introducing a sampling bias.

I cannot, for example, select the 50 best Runs per 9IP, and expect that to match FIP, since Runs per 9IP is a variable dependent on the components within FIP.


#22    Peter Jensen      (see all posts) 2009/03/07 (Sat) @ 13:25

MGL’s point is valid *only* if I looked at the lowest runs per out for each fielder (or lowest compared to his backups).

If I selected particular seasons where I looked for the lowest hits per BIP among all CF, then I would not need regression, just like if I looked for the lowest FIP scores in any particular season would not require any regression when comparing to runs per 9IP of that same season.

I am not sure why you would conclude this.  If you are hypothesizing a strong relationship between hits per BIP and runs saved per out, then selecting on one is going to bias the other.  This is not a problem if your intent is to show the strength of the relationship between the two.  But if your intent is to show what a great fielder saves a team in runs over a season, you have a selection bias from having chosen the best seasons because they don’t represent the best fielders, they represent the best fielders PLUS the best luck over a single season.  In this case I would agree with MGL that the best single seasons need to be regressed in order to remove the variation due to luck for that season.  Moot point since you used the better method in the first place.


#23    MGL      (see all posts) 2009/03/07 (Sat) @ 17:02

Tango, I don’t get your point.  Unless I am not reading your methodology correctly. As Peter says, if you are looking at all the best fielders combined, you are going to come up with a number that represents their talent and their good fortune.  If you want to tell us the number that approximates their talent, you have to regress the number that you came up with, as we always do.

Since the title of the thread is “How much are the best fielders worth?,” I assume you mean “in true talent” since that is the question that people want to answer.  If that is the case, then no matter what methodology you use, you have to regress something somewhere along the way.

If I use UZR or WOWY and I look at the 5 or 10 or 20 best fielders, according to the numbers, at any position, even while limiting my players to a large minimum number of BIP (or outs or whatever), then whatever number I come up with has to be regressed.  I don’t see any other alternative.


#24    Tangotiger      (see all posts) 2009/03/07 (Sat) @ 18:53

I am selecting on hits per BIP.  I am looking at runs per out.

This is no different than selecting the best pitchers by (K-BB)/PA, and using runs per 9IP to determine how much a group of great pitchers is worth.

Thinking about it, then yes, I would need some regression.  I’m not doing the egregious thing of selecting by runs per out first, and then looking at runs per out to see how good they are. 

As Peter said, it’s a mostly moot point since I looked at career totals.

Thinking about it even more, the perfect way would be to look at hits per BIP for Mark Ellis excluding 2008 season. And so on.
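The "select on everything except the season being measured" idea can be checked with a small simulation. As before, the distributions, spreads, and sample sizes are assumptions for illustration, and the two noisy samples stand in loosely for "other seasons" and "the held-out season".

```python
import random
from statistics import mean


def holdout_check(n_players=5000, top_n=500, talent_sd=5.0, noise_sd=10.0, seed=2008):
    """Select on one noisy sample, then measure on an independent one."""
    rng = random.Random(seed)
    talent = [rng.gauss(0, talent_sd) for _ in range(n_players)]
    other_years = [t + rng.gauss(0, noise_sd) for t in talent]  # used to select
    held_out = [t + rng.gauss(0, noise_sd) for t in talent]     # used to measure
    top = sorted(range(n_players), key=lambda i: other_years[i], reverse=True)[:top_n]
    return (
        mean(talent[i] for i in top),
        mean(other_years[i] for i in top),
        mean(held_out[i] for i in top),
    )


true_mean, selection_mean, holdout_mean = holdout_check()
print(f"true talent: {true_mean:.1f}")
print(f"selection sample: {selection_mean:.1f}")
print(f"held-out sample: {holdout_mean:.1f}")
```

In this setup the held-out average sits close to the group's true talent while the selection sample overshoots it, which is why measuring on data that played no part in the selection removes the need to regress.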

Ok, I’m with you guys.  There were two things going on here.

