
THE BOOK--Playing The Percentages In Baseball


Monday, November 05, 2007

Pinto’s PMR

By Tangotiger, 10:53 AM

The Yankees led the league by making 62 more plays than expected, given their context of pitchers and park and whatever else David includes.  According to The Fans, that was probably the result of their 1B, 3B, and CF fielding, and not having a big hole anywhere else.  It’s easy to see how you can be +20 plays at three positions, and overall average in the other 5 (or 6), and end up with the best fielding team in the league.  It’s a lot easier to believe the Cubs according to The Fans.

I look forward to one of you hardworkers out there to summarize the team-level fielding of PMR, Plus/Minus, UZR, Hardball Times, STATS ZR, BIS ZR, and The Fans. 


#1    David Smyth      (see all posts) 2007/11/05 (Mon) @ 14:53

I never fail to be surprised by the results of these advanced fielding systems. I saw most of the Cubs’ games, and I never had the sense that I was watching an impressive team defense.

I realize that what I am doing, to some degree, is rejecting ‘objective’ data because it doesn’t agree with my visual impressions. I understand that trap. But still.....


#2    David Pinto      (see all posts) 2007/11/05 (Mon) @ 20:06

Thanks for the link!


#3          (see all posts) 2007/11/05 (Mon) @ 21:17

"I look forward to one of you hardworkers out there to summarize the team-level fielding of PMR, Plus/Minus, UZR, Hardball Times, STATS ZR, BIS ZR, and The Fans. “

I volunteer. I’m starting it out now, but I would appreciate it if Tangotiger or someone could give me advice on how to take the fans’ reports and convert it to runs (which is just plays * .75 or .80). I have an idea but want to be sure before I do the legwork.


#4    tangotiger      (see all posts) 2007/11/05 (Mon) @ 21:34

Take the average for the player and subtract the average for the position.

Multiply that by 0.7.

Then, for each position, multiply by:
1.25 SS, 2B
1.00 CF, 3B
0.75 LF, RF, 1B
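
The recipe above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the comment, not Tango's own code: the function name and the sample ratings are hypothetical, and the result is assumed to be in runs because that is what comment #3 asked for. Catcher is not in the multiplier list, so the sketch does not handle it.

```python
# Positional multipliers exactly as listed in the comment above.
POSITION_MULTIPLIER = {
    "SS": 1.25, "2B": 1.25,
    "CF": 1.00, "3B": 1.00,
    "LF": 0.75, "RF": 0.75, "1B": 0.75,
}


def fans_rating_to_runs(player_avg, position_avg, position):
    """(player average - position average) * 0.7 * positional multiplier."""
    return (player_avg - position_avg) * 0.7 * POSITION_MULTIPLIER[position]


# Hypothetical ratings, for illustration only.
print(fans_rating_to_runs(70.0, 50.0, "SS"))
print(fans_rating_to_runs(45.0, 50.0, "LF"))
```

A player rated exactly at his position's average comes out at zero, whatever the position.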


#5          (see all posts) 2007/11/06 (Tue) @ 13:23

Tango: That formula works fine for a straight player-to-player comparison, but on a team level, don’t we need to account for playing time? Is it better to try and estimate that from innings played, or to cheat and use the total chances from one (or more) of the zone rating systems? Is there a similar list of average chances per innings played for the various positions we can use?


#6    Tangotiger      (see all posts) 2007/11/06 (Tue) @ 14:41

If you have multiple players at the same position, then you should weight them by the number of team BIP, or positional BIZ.  But, using innings played, which you can easily get at b-r.com, is just as good.
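
As a rough sketch of the innings-weighted option, assuming each player already has a per-position rating converted to runs: the function name and the sample numbers are hypothetical, and weighting by team BIP or positional BIZ would use the same shape with a different weight.

```python
def position_rating_weighted(players):
    """Innings-weighted average rating for one position on one team.

    players: list of (rating_in_runs, innings_played) tuples.
    """
    total_innings = sum(innings for _, innings in players)
    if total_innings == 0:
        return 0.0
    return sum(rating * innings for rating, innings in players) / total_innings


# Hypothetical example: a +10 regular with 1000 innings and a -5 backup with 500.
print(position_rating_weighted([(10.0, 1000), (-5.0, 500)]))
```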


#7          (see all posts) 2007/11/06 (Tue) @ 22:14

What should we do about innings from players who aren’t in the Scouting Report? Just looking at the Cubs, they’ve got Henry Blanco and Geovanny Soto catching 231 innings between them, and there’s no data on either of them. (And Bowen’s on the ballot for the Padres, but I’m going to go ahead and assume that the fan ratings don’t have a significant park/league factor.)

And I’m guessing the answer for players who shift positions is to just apply the adjustment separately for each position and use innings played to divvy it up for each player.


#8    Phil D.      (see all posts) 2007/11/07 (Wed) @ 00:53

I’m using defensive innings to go from the fan ratings to runs saved as Tango described in post six. Colin raised a good question in post seven and so far I have just been ignoring the guys who played and do not have a rating.

I just started it yesterday, but for those who want to see my results so far, click my name. I’ll be adding more metrics and a glossary/explanation soon. I’d appreciate any suggestions anyone may have.


#9    tangotiger      (see all posts) 2007/11/07 (Wed) @ 08:42

For the missing guys, I’d go with “0”.  There is a decent way to test it: find out the UZR, PMR, RZR’s of all the missing Fans players.  I would be shocked if they weren’t something very close to 0 per 162 games.  I’ll guess they would be a shade below, like -1 or -2.


#10    Tangotiger      (see all posts) 2007/11/07 (Wed) @ 16:23

David continues his look, this time at DER of fielders, by pitcher:
http://www.baseballmusings.com/archives/023901.php

I made a comment in that thread, and it basically comes down to: I don’t know what’s being measured.


#11          (see all posts) 2007/11/07 (Wed) @ 17:30

If Tango’s right that Wang deserves some or all of the credit for the “extra” outs, then the original ranking of the Yankees as the #1 defense is undermined.
According to PMR, the Yankees defense made 61.5 more actual outs than predicted outs. 33 of those were made with Wang on the mound. If we take the possibly drastic step of giving Wang all the credit for those 33 extra outs, then the Yanks fall back to something like 7th place overall.


#12    Guy      (see all posts) 2007/11/07 (Wed) @ 17:30

As I wrote over at baseballmusings, I think one issue David needs to wrestle with is the impact of home team hitters on a visitor-based model. If the home hitters put tougher-to-field balls into play, then the expected DER will be low, and vice-versa.  For example, MLB teams had a BABIP of .306 against all groundball pitchers, but the Yankees were .331.  So NY infielders are likely judged against a very low standard:  the model assumes many GBs in Yankee Stadium are hard to field, but that may only (or largely) be true when Yankee hitters hit them.  In theory, the parameters control for this, but in reality I doubt they can.  Of the 8 teams with the lowest predicted DER, every single one of them had hitters with above-average BABIP.

This might be mitigated by adjusting each park factor based on the home team’s overall BABIP on GB, LD, and OFs.  Or maybe park factors aren’t necessary on GBs at all?


#13    Rally      (see all posts) 2007/11/07 (Wed) @ 17:39

I did park factors for groundballs in my totalzone measure, and used multiyear factors in every case, and used an overall groundball factor instead of a position specific one.  I think every team was in the range of .98 to 1.02.

Besides turf, this can be affected by groundskeeping, and that’s something that’s easy for a team to change and impossible for an analyst to know (unless retrosheet wants to add average grass blade height to their files).

Not using a groundball park factor at all is probably a safe and defensible choice.


#14    Tangotiger      (see all posts) 2007/11/07 (Wed) @ 17:51

Guy: I saw your comments, but it depends.  Say that Jeter the hitter does have high BABIP (he might be the active leader), and therefore it makes it look like it’s tough to get an out off the visiting pitchers.  But, if that BABIP can be estimated based on his location and hardness of hit and his speed, then we don’t have a problem.

The solution would be to include the identity of the pitchers and hitters into the model.  I believe David tried this once, and removed it because it added little.  But, it may add a whole lot in some cases, like here.


#15    Guy      (see all posts) 2007/11/07 (Wed) @ 23:17

Tango:
The problem I see is that ALL (or virtually all) of the out probabilities on balls hit in Yankee Stadium are determined by what happens to balls hit by Yankee hitters (since all balls handled by visiting fielders are of course hit by home hitters).  So almost all of these probabilities are determined largely by outcomes on balls hit by 9 or 10 specific hitters.  Now, if high-BABIP and low-BABIP hitters differed only in their GB/LD/FB proportions, there’s no problem.  But that isn’t true:  hitters vary considerably in their out% for different types of balls.

So that leaves us with the six parameters to try to control for quality of hitters.  Maybe the parameters are specific enough to do that, but that seems unlikely.  It would be easy for Pinto to find out:  look at some specific combinations of the six factors (across all parks), and then see if the DER for those hit by hitters with high BABIP differs from those hit by hitters with low BABIP.  My guess is the balls struck by good hitters—despite the 6 parameters being identical—are turned into outs less frequently.

Another easy check:  if I’m right, teams with high-BABIP hitters should show a very large home/away fielding split (actual vs. predicted DER) with fielders enjoying a larger-than-average home advantage; low-BABIP teams should be the opposite, with fielders enjoying a small or no home advantage.
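
The split in that last check could be computed along these lines. This is a hedged sketch only: the function name and every DER value are made up for illustration, and nothing here comes from PMR data.

```python
def fielding_split(home_actual, home_predicted, away_actual, away_predicted):
    """Home-minus-away difference in (actual DER - predicted DER)."""
    return (home_actual - home_predicted) - (away_actual - away_predicted)


# Hypothetical DER values, for illustration only.
teams = {
    "high-BABIP hitters": fielding_split(0.700, 0.685, 0.692, 0.690),
    "low-BABIP hitters": fielding_split(0.695, 0.694, 0.693, 0.692),
}
for label, split in teams.items():
    print(f"{label}: home/away split = {split:+.3f}")
```

If the hypothesis holds, the split would be noticeably larger for the teams whose own hitters have a high BABIP.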


#16    Tangotiger      (see all posts) 2007/11/07 (Wed) @ 23:42

The solution therefore is to include the identity of the batter.  This way, if Jeter bumps down the out rate by .05 per GB, we can use this for all of his opposing fielders, etc, etc.

Like I said, David did do this in one implementation of PMR, the first year it came out.  You can check back in his blog.  Do a search for “identity”.  Can’t be too many of those.


#17    Rally      (see all posts) 2007/11/08 (Thu) @ 11:04

Shortstops are up now.

I agree with Guy that there’s a problem that will overrate the Yankee defense, but even with that PMR has Jeter as the worst shortstop in the league.


#18    Tangotiger      (see all posts) 2007/11/08 (Thu) @ 11:09

Here’s the link for all future posts by David:
http://www.baseballmusings.com/archives/cat_probabilistic_model_of_range.php


#19    David Smyth      (see all posts) 2007/11/08 (Thu) @ 20:16

I realize that much of this blog is over my head, but still, my reaction to Phil’s tables is that, after playing around with the numbers…

Of the 3 objective stats (PMR, THT ZR, FRAA), the THT ZR probably comes closest to the correct rank order for the teams. But, I suspect that the spread is too great. So, the THT ZR regressed 33% towards zero gets my vote. I think that this is probably even better than averaging all 3 metrics together.
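
Regressing 33% towards zero just means keeping 67% of each team's figure. A minimal sketch, with a hypothetical function name and a made-up input:

```python
def regress_toward_zero(value, amount=0.33):
    """Shrink a rating toward zero by the given fraction (0.33 = 33%)."""
    return value * (1.0 - amount)


# Hypothetical team rating of +30 runs.
print(round(regress_toward_zero(30.0), 1))
```

The rank order of the teams is unchanged; only the spread narrows.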

Of course, the best metric for teams does not necessarily translate into the best metric for individual fielders.


#20    Tangotiger      (see all posts) 2007/11/12 (Mon) @ 11:12

Pinto has CF out… Ichiro is #1.  It is incredible to me that two PBP-based systems can have the same guy at the very top and the very bottom.

I’m trying to get my hands on the BIS and STATS data.  If I do, looking at the BIP when Ichiro was on the field is job 1.

***

http://www.baseballthinkfactory.org/files/newsstand/discussion/fanhouse_chipper_jones_is_a_tad_bitter/

MGL said:

Before, even if all 3B were particularly bad this year (as compared to the last 5 years), I would set the average UZR for all 3rd basemen to zero, so you would not know that they were all bad collectively unless you looked at the pre-adjusted numbers. Now, I don’t adjust every position to zero, which I think is better. This year seems to be a bad year for defense in general, other than at CF, and especially for 2B and SS. I am not sure why. Last year 3B was good and all the rest were bad. In 05, defense was great other than at 2B and RF, and they were only marginally bad. Interesting. 04 was around neutral.

I don’t agree with this sentiment, in general.  Clearly, you can’t zero things out every year, at each position.  But, you must zero things out at the league level.  I’m not sure what MGL does.

The reason you can’t zero out at the position level is that you can have a great fielder not play at a particular position, and all of a sudden, everyone ends up looking better.  A team could legitimately go from a +20 fielder to a -20 fielder, for a 40 run switch, which averages out to 1.3 runs per team.  I wouldn’t move the rest of the CF up by 1.3 runs to compensate.  I know that if Barry Bonds as a hitter and his +100 runs go away, I’m not going to then make adjustments of 3 runs to the rest of the LF.
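
The 1.3 and 3 figures are a run swing spread over the league. A small sketch of that arithmetic, assuming 30 teams with one slot per position; the function name is hypothetical:

```python
def league_average_shift(run_swing, n_teams=30):
    """Change in the positional average when one team's fielder changes by run_swing."""
    return run_swing / n_teams


print(round(league_average_shift(40), 1))   # +20 fielder replaced by a -20 fielder
print(round(league_average_shift(100), 1))  # a +100 hitter leaves the position
```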

So, kudos to MGL for realizing that zeroing out at a position is not the right thing to do, all the time.

However, if you had to choose one or the other, I’d go with zeroing out every year, at each position.

What you really want to do is control for everything.  If you have the same pitcher with the same fielder in the same park and the same “marked” ball in play, then they should turn that play into an out at the same rate, in 06 or 07.  Obviously, you would like a large enough number of combinations of players and parks to do this.  And, you might want to apply some aging.  But, that would tell you better if the “classification” of the balls hit is consistent or not.

We certainly zero-out at the league level for hitting, and we should do so for fielding as well, unless you have evidence otherwise.


#21    Tangotiger      (see all posts) 2007/11/14 (Wed) @ 11:46

3B are up at Pinto’s.  Zimmerman and Feliz get the love, while Atkins and Braun don’t.  Fans probably do the best job with 3B.


#22    Tangotiger      (see all posts) 2008/01/22 (Tue) @ 12:33

Pinto’s got his cool charts.  Here’s Tulo and Pedro Feliz:

http://www.baseballmusings.com/cgi-bin/DisplayCharts.py?PlayerID=3531&fpos=6&year=2007

http://www.baseballmusings.com/cgi-bin/DisplayCharts.py?PlayerID=1112&fpos=5&year=2007

Has Feliz been signed yet?  Has anyone traded for Inge yet?  What a joke.


#23    Sky      (see all posts) 2008/01/22 (Tue) @ 21:14

Looking at Tulo’s GB chart, he’s basically average going to his left, but excellent to his right.  Garrett Atkins is awful to his left and average to his right.

http://www.baseballmusings.com/cgi-bin/DisplayCharts.py?PlayerID=1790&fpos=5&year=2007

Are both Atkins and Tulo cheating towards 3B so that they end up covering the same ground as the average SS/3B combo, but with Tulo doing more of the work and covering for Atkins’ poor glove?


#24    tangotiger      (see all posts) 2008/01/23 (Wed) @ 08:01

Don’t forget that Tulo has one of the best SS arms out there.  He can be covering the same ground as the average SS (i.e., gets to the same number of balls), but he simply throws out more.

What’s interesting is that the data DOES show you that.  I would prefer that the “range” part be split into “did he get to the ball”, and “did he throw the runner out”.

