THE BOOK--Playing The Percentages In Baseball

Saturday, May 31, 2008

Do speedy outfielders (and those with good arms) save extra-base hits?

By , 03:12 AM

I’m talking about by cutting off hits in the gap (or wherever) and by the threat of a strong and/or accurate arm.  As far as I know, none of the defensive systems takes this into consideration.  UZR and my “arm lwts” do not.


I looked at 5 years of data.  I grouped all outfielders into 3 groups, according to speed scores, slow, average and fast, where the slow and fast groups are each about 20% of all players.  In each park and in each field, LF, CF, and RF (in order to control for park and field), I summed the difference between the lwts value of all hits for each group and the average lwts value of a hit to that field in that park.

IOW, let’s say that in Safeco, all hits to LF had an average lwts value per hit of .60 (where s=.47, d=.78, and t=1.03).  If hits with speedy outfielders in LF at Safeco had an average lwts value per hit of .59, then the speedy outfielder group would get credit of .01 runs per hit, weighted by the number of hits to left field with speedy outfielders in the field.  So on and so forth for all fields, all parks, and all three groups of outfielders.
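A minimal Python sketch of that bookkeeping, under stated assumptions: the single, double, and triple weights are the ones quoted above, but the hit records, the counts, and the function name `group_credit` are invented for illustration and are not the data or code behind the study.

```python
from collections import defaultdict

# Linear weights quoted in the post: single, double, triple.
LWTS = {"1B": 0.47, "2B": 0.78, "3B": 1.03}

def group_credit(hits):
    """hits: iterable of (park, field, speed_group, hit_type) tuples.

    Returns {speed_group: runs saved relative to the park/field average}.
    Positive means the group allowed less valuable hits than average.
    """
    # Average lwts value of a hit in each (park, field) bucket, all groups pooled.
    total = defaultdict(float)
    count = defaultdict(int)
    for park, field, _group, hit_type in hits:
        total[(park, field)] += LWTS[hit_type]
        count[(park, field)] += 1
    baseline = {key: total[key] / count[key] for key in total}

    # Summing (baseline - value) over hits is the same as weighting each
    # bucket's per-hit difference by the group's number of hits there.
    credit = defaultdict(float)
    for park, field, group, hit_type in hits:
        credit[group] += baseline[(park, field)] - LWTS[hit_type]
    return dict(credit)

# Invented sample: hits to LF at Safeco with fast and slow left fielders.
sample = (
    [("Safeco", "LF", "fast", "1B")] * 7
    + [("Safeco", "LF", "fast", "2B")] * 3
    + [("Safeco", "LF", "slow", "1B")] * 6
    + [("Safeco", "LF", "slow", "2B")] * 4
)
credit = group_credit(sample)
```

Because the baseline is pooled over every group in the same park and field, the credits across groups sum to zero within each bucket.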

The final result and answer is that yes, speed in the OF results in fewer extra base hits, but not by a whole lot.

If I break it down by field:

LF
Difference between fast and slow players (the 20% fastest and 20% slowest) is around 2 runs.

CF
3.9 runs

RF
2.4 runs

If I isolate the fastest and slowest 5%, the difference is around 4.8 runs in CF, 2.7 in LF, and 3.6 in RF.

So I think it is fair to say that the difference between the fastest and slowest players in terms of cutting off extra base hits in the gaps is plus or minus 3 runs per 150 in CF, and plus or minus 1.5 to 2 runs at the corners, which isn’t wood I guess.

UZR does not include this, and I don’t think any of the other defensive metrics do either.  Of course, another way to do this for individual players is to look at the average hit value in each section of the field and compare that to the average hit value for each section when that player is on the field.  I think you are going to get too much noise using that method, and I am not sure that all of the databases track where the ball lands as opposed to where it is fielded.  I like the idea of just using an outfielder’s speed rating and adding a constant to his UZR or other defensive metric.

What about the same thing using “arms?” Players with better arms should be able to hold players to fewer extra base hits.  Again, my “outfield arm linear weights” does not keep track of this, only of “holds” and “kills.”

For the best and worst 8% of the arms in CF, the difference is 1 run (per 150 games).  In LF, it is also 1 run.  In RF, it is 2.4 runs.  Again those are differences between the best and worst 8% in “arm”.  So, in CF and LF, it is plus or minus .5 run per 150, and in RF, it is plus or minus 1.2 runs per 150.
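The plus-or-minus figures are just half of each best-to-worst gap. A small sketch of that arithmetic, assuming the best and worst groups sit symmetrically around average (the function name is made up):

```python
# Best-to-worst gaps in runs per 150 games, as quoted above for arms.
arm_gap = {"CF": 1.0, "LF": 1.0, "RF": 2.4}

def plus_minus(gap):
    """Half the best-to-worst gap: how far either extreme sits from average,
    assuming the two extremes are symmetric around the mean."""
    return gap / 2

arm_plus_minus = {field: plus_minus(gap) for field, gap in arm_gap.items()}
```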

As I said, I think this is the first time anyone has quantified this, but I am not sure.

#1 2008/05/31 (Sat) @ 10:16

Dan Fox’s SFR does.

Step 1 is to calculate how many fly ball hits per fly ball the OF allowed. Most systems stop here.

Step 2 is total bases on those fly ball hits.

Step 3 is total bases on ground ball hits.

Step 4 is advancement of baserunners on hits and fly ball outs.

Then add it all up.
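A rough Python sketch of that four-step sum, for illustration only. This is not Dan Fox's actual SFR code: the component names, the observed and expected counts, and the run value per unit are all placeholder assumptions.

```python
from dataclasses import dataclass

@dataclass
class Component:
    observed: float   # what the fielder actually allowed
    expected: float   # what an average fielder would allow on the same balls
    run_value: float  # runs per unit; placeholder, not a published figure

    def runs_saved(self):
        # Allowing less than expected is good, so expected minus observed.
        return (self.expected - self.observed) * self.run_value

def total_runs_saved(components):
    """Add up the runs saved across all components (steps 1 through 4)."""
    return sum(c.runs_saved() for c in components.values())

# Invented season line for one outfielder.
outfielder = {
    "fly_ball_hits": Component(observed=48, expected=50, run_value=0.8),
    "total_bases_on_fly_hits": Component(observed=30, expected=33, run_value=0.3),
    "total_bases_on_ground_hits": Component(observed=12, expected=11, run_value=0.3),
    "runner_advancement": Component(observed=20, expected=22, run_value=0.2),
}
runs = total_runs_saved(outfielder)
```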

Dan has a set of spreadsheets available for download on his blog site. I have loaded some of them into Access.

I did a similar formula back in my amateur days (One of my old projects which I have yet to resurrect.) I credited ground ball hits as hits against the infielders, but any extra bases for doubles or triples against the outfielders.

If the batter singles, think of this as a hold, a double as an advance. Once on second, going for third is another advance. Of course, measure this against the expected value of the plays that the defender is presented with. And there will definitely be park factors for the outfield.


#2 2008/05/31 (Sat) @ 10:23

Instead of testing the existing hypothesis, try running the numbers on everyone, and then see if there’s any grouping of players by speed, etc.

Positioning, speed and judgement (how soon to react, getting proper read) will go towards converting flys into outs.

Same thing plus arm for keeping batters and runners from advancing.

I believe speed would be most important in cutting off balls in the gap, increasingly so the shorter the outfield distance, and also in catching balls hit over the OF.


#3 MGL 2008/05/31 (Sat) @ 12:43

I really don’t like doing it by individual fielders.  Too much noise I think.  And if you did, you definitely have to do it by location of the batted ball and park by park.


#4 Peter Jensen 2008/05/31 (Sat) @ 13:47

MGL - I am not sure why you would expect there to be more noise for this than there is for an individual hitter’s extra base hits, and we don’t feel the need to adjust those to league average rates.  As you have already noted you would have to adjust to each park and to the particular outfield position of each player, but that would probably be enough.  You would also want to look at any tendencies that a team’s pitching staff might have (also adjusted for park and field) on a with you or without you basis, but I doubt if the variations would be sufficient to require adjustment.  I use the actual run value of all hit in the air balls and the extra run value of ground balls (like Dan Fox does) in my own unpublished fielding metric.  It should give a more accurate assessment of a fielder’s total fielding skills.  But only when the sample size is sufficient, just like with pitching and hitting metrics.  A single year’s hitting statistics for a batter is not very useful for establishing his “true” offensive value, and neither is a single year’s fielding statistics.  I use the same three year aggregates for each.

I don’t like your idea of adding a standard adjustment based on speed and throwing ability because as Brian correctly points out in his post there are other fielding skills (positioning and reaction time) that are also important in preventing extra base hits.


#5 2008/05/31 (Sat) @ 14:33

I see what your method is now, starting with groups of players to increase the sample sizes and better control biases. I haven’t done a study that way yet, but I get the concept now.

Dan Fox did do his minor & major SFR using play by play data. I already had a similar idea, and plan on developing my own formula to pretty much replicate what Dan did.


#6 2008/05/31 (Sat) @ 14:41

#4 Peter - I think I agree with everything you said.

There are different reasons why a particular player may be a better or worse fielder. First let’s figure out, on as broad a scale as possible, who are the better and worse, then see if we can correlate or otherwise figure out which skills (speed, arm, etc) were responsible. I think this is similar to SABR Matt’s matrix approach.

On a performance level, what I really want to know is if the player is good or not. It seems more academic to then know why, but that doesn’t mean it’s not important to know why. In MGL’s example, if I run a team with a small park, which kind of outfielders are best suited to playing defense for my team? Leverage every advantage you can.


#7 MGL 2008/05/31 (Sat) @ 17:17

The analogy between a hitter’s extra base hits and a fielder’s is not a good one.  They are completely different as far as noise goes.  Completely.

There are basically 2 sources of noise in baseball.  One is when the inputs are not what you think they are.  An example of that would be a single for a batter might be a 23 hopper through the SS hole or it might be a screaming line drive.  One should clearly be treated differently than the other, but absent any descriptive information about each single, we don’t treat them any differently, and even if we have access to batted ball data, we still tend to just use whatever a hit was scored as and treat them all the same.

Anyway, the second source of sample error is when we know exactly what the input is but we also know that because we are sampling a player’s true talent, the results are not necessarily commensurate with what we would expect if we had an infinite sample of performance.  So, for example, if a player has lots of screaming line drive hits, it STILL does not necessarily mean that he has that kind of true talent.  He may have just gotten “lucky” and hit lots of screaming line drives during the period of time we did our sampling. It is VERY important to keep that concept in mind when analyzing player performance.  Some people think that once you reconcile the input, that whatever is left is true talent.  It is not, of course.  You have just reduced, but not eliminated, your sample error.

Anyway, the reason I brought that up (and got a little sidetracked in the meantime) was that we can easily reconcile the inputs for a hitter’s singles and extra base hits and pretty much get rid of “one-half” of the noise or sample error. In fact, in most cases, just knowing that a hit was an extra base hit or a single automatically does that for us. The reason is that for hitters, the skill we are trying to isolate is whether he hits balls hard and/or far!  Most extra base hits are hit hard and/or long.  In fact, many of them are “almost home runs.” So by knowing the ratio of singles and extra base hits for a hitter, we automatically get a pretty good idea of his power to hits ratio, which is the “skill” or “true talent” we are looking for.

Not so with fielders in the context of what we are trying to look at!  Simply using the ratio of singles to extra base hits for an outfielder does NOT give us a whole lot of information about the skill we are looking at!

Let me couch it this way, and I think (hope) you will get my point (and it is an important point):  Let’s say that you observed 30 singles, doubles, and triples for a batter.  It would be pretty clear that most of the doubles were hit harder and/or farther than the singles AND you would get a pretty good idea of the power of the batter. 

Now, imagine the same thing for outfielders.  Imagine watching 30 singles and extra base hits.  Remember, the skill we are trying to find is the ability of the OF to cut off balls in the gaps because of their speed and the “ability” to keep baserunners from stretching singles into doubles and doubles into triples because of the threat of their good arms.

Well, in 30 hits, you are simply not going to see that many hits that were in the gap or singles that could have been doubles or doubles that could have been triples.  Most of those hits will be routine and the same regardless of the fielder.  Every once in a while, you will see a play by an outfielder (good or bad) that might have been different had it been fielded by an average outfielder.  Every once in a while.  Yet, each outfielder in 30 hits will have all kinds of singles to extra base hit ratios.  Most of that will be noise.  You simply cannot make sense of those differences without at the very least tracking exactly where the ball lands.
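To put a rough number on that noise, here is a sketch that treats each hit as an independent coin flip between single and extra base hit. The 25% true extra-base share is an assumed figure for illustration, not one taken from the data, and the function name is made up.

```python
import math

def xbh_share_sd(true_share, n_hits):
    """Binomial standard deviation of the observed extra-base-hit share
    in n_hits hits, given a fixed true share."""
    return math.sqrt(true_share * (1 - true_share) / n_hits)

TRUE_SHARE = 0.25  # assumed share of hits going for extra bases

sd_30 = xbh_share_sd(TRUE_SHARE, 30)      # a handful of games' worth of hits
sd_3000 = xbh_share_sd(TRUE_SHARE, 3000)  # "gobs and gobs" of data
```

Under these assumptions, random variation alone moves the observed share by about 8 percentage points (one standard deviation) in a 30-hit sample, and it takes 100 times as many hits to cut that by a factor of 10.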

While I agree (of course) that there are other factors besides speed that contribute to the skill that we are looking at, I submit that there is NO WAY to come close to isolating that skill by looking at the actual singles/extra base hit ratio for each fielder.  The noise will overwhelm the skill and you will end up regressing most of what you find in a year or even a couple of years almost all the way to the mean.

That being said, it is fine to do that if you are first going to figure out how much to regress based on how many hits there are, and then do the regression where the means are based on speed.  Yet, if you are going to regress almost all sample performances less than 5 or 10 years almost all the way to the mean, well you might as well just use the mean (where the population is based on speed scores).
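A sketch of that kind of regression, using the common n / (n + k) shrinkage weight. The weighting form and the constant `k` are assumptions chosen for illustration (the right `k` would have to be estimated from the data), and the example numbers are invented.

```python
def regress_to_group_mean(observed, n_hits, group_mean, k):
    """Shrink an observed per-hit value toward the mean of the player's
    speed group. k is the number of hits at which the observed value and
    the group mean get equal weight (an assumed constant here)."""
    weight = n_hits / (n_hits + k)
    return group_mean + weight * (observed - group_mean)

# Invented example: a fast outfielder allowing .62 lwts per hit over 300 hits,
# in a speed group whose mean is .59, with an assumed k of 2000 hits.
estimate = regress_to_group_mean(observed=0.62, n_hits=300, group_mean=0.59, k=2000)
```

With a large `k` relative to the sample, the estimate lands close to the group mean, which is the point being made above: the speed-group mean does most of the work.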

The last thing I would want to see is sample numbers for outfielders that looked at their actual singles/extra base hits allowed ratios. Those numbers will be meaningless unless you have gobs and gobs of data (large sample sizes).  Again, completely different than with hitters.  Completely.


#8 2008/06/01 (Sun) @ 19:08

So the first source of noise is granularity, how precisely the event is described, and the second is sample size.

If most of the hits to an outfielder are routine, then isn’t the observed variance from fielder to fielder more likely to be true value, as this would be a reduction of noise #1?

The better we can describe an event (which will only be getting better in the future) the better we can calculate the expected value. The 23 hop single has a different expectancy than the screaming line drive. Especially now with RDBMS and SQL, we can look at ballpark, outs, score, LI, WE, bathand, vector, speed off bat (even if only soft, avg, sharp) and might come out with literally hundreds of combinations, but a more precise expected value for each, and then the weighted mean of all for the overall expected to compare to the observed. The more events we can classify, the more we can resolve sample size.
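A minimal sketch of that comparison, bucketing only by speed off the bat. The per-bucket expected values and the event list are invented for illustration; a real version would key on many more fields (park, outs, vector, and so on).

```python
def expected_vs_observed(events, bucket_expectation):
    """events: list of (bucket, observed_value) pairs for one fielder.

    Returns (expected, observed) mean value per event, where the expected
    mean is weighted by how often the fielder saw each bucket.
    """
    n = len(events)
    expected = sum(bucket_expectation[bucket] for bucket, _ in events) / n
    observed = sum(value for _, value in events) / n
    return expected, observed

# Assumed league-average value of a hit in each bucket (placeholders).
bucket_expectation = {"soft": 0.48, "avg": 0.55, "sharp": 0.70}

# Invented hits against one outfielder: (bucket, lwts value of the hit).
events = (
    [("soft", 0.47)] * 5
    + [("avg", 0.47), ("avg", 0.47), ("avg", 0.78)]
    + [("sharp", 0.47), ("sharp", 0.78)]
)
expected, observed = expected_vs_observed(events, bucket_expectation)
runs_saved_per_hit = expected - observed
```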

Dan Fox’s basic approach with SFR was measuring hits and total bases for the batters, and advances by the base runners for each fielder, compared to the expected value based on the mix of batted balls hit to each fielder. This is a probabilistic model which also incorporates the extra base advances of both the batter and baserunners, which goes back to your original question in this thread.

If batters are stretching hits and baserunners are taking the extra base more often against some fielders than others, then I do think it quite likely that the fielder is not getting to the ball as quickly (lack of speed or poor jumps) or lacks a strong arm once he gets to the ball, or all of the above. It may very well be difficult to determine the exact mix of reasons why the performance turns out so, assuming this is a true talent level, but it should not be difficult to weed out the good from the bad (for whatever reasons) in this particular type of performance.

