THE BOOK--Playing The Percentages In Baseball

Wednesday, July 12, 2006

Isolating Pitchers From Batters

By Tangotiger, 10:38 AM

I received an email from Benjamin Alamar, PhD, Editor, Journal of Quantitative Analysis in Sports, directing me to a research paper posted here:

http://www.bepress.com/jqas/vol2/iss3/4/

It discusses the Run Expectancy matrix.  I invite the authors of that paper to discuss or refute my points here. 


Disclosure: I have corresponded a few times with two of those authors, and therefore, you may consider my comments here biased towards them.  They are not.  As evidence, I have written a book with mgl, and we don’t pull punches towards each other.  I just call ‘em as I see ‘em.

The paper properly surmises that:
The quantitative problem then becomes splitting the change in NERV in such a way that batters who put a ball in play that is hard to field gets a positive credit, even if a defensive player makes an unlikely play on the ball and the change in NERV for the entire play is negative.

The next statement however is a falsehood:
There is no way, for example, to determine using a NERV analysis whether a great pitcher or a great batter has more total effect on a team over the course of a season. Splitting NERV between batters and pitchers objectively, allows for the direct comparison of total NERV between pitchers and batters to determine which will have a greater impact on a team.

I’ll get back to this after I read the rest of the paper.

I was afraid they’d be going here:

Finally, the expected change NERV change is calculated and split between the batter and pitcher according to our estimated split (batter 62% and pitcher 38%).

This is also wrong, for reasons I will elaborate at the end.

Previous work that has utilized NERV has used tables that calculate the expectancy based upon only the number of outs and the configuration of men on base.

While this is for the most part true, I don’t like the word “only”.  RE and WE matrices are based on whatever states you want to include in them.  The 24-state matrix is the easiest to use, but it is certainly not the “only” one ever used.

ForwardRuns was used as the dependent variable in a series of regressions to determine the best fit.

I don’t like this, but I understand that it’s close enough, and easier to program.  Woolner did the same in BP 2005.

Table 1 looks a bit like what John Jarvis may have done.  Again, I oppose regression being used here.  A basic-Markov, or full-Markov, calculation should be implemented.  Again, I understand the attraction to best-fit models. 

For example, if I am reading it right, the lineup position’s starting RE is always going progressively lower.  This is another falsehood, since the RE with the 9th batter (AL parks at least) is *higher* than the RE with the 8th batter, simply because RE is about runs to end of inning, and the 9th hitter is followed by the 1-2-3 hitters.

The charts in Figure 1 are beautiful.  I’ve done similar charts, based on the 2004 BIS data I have.  They look really cool when you split it up by batter handedness, as you can see the shift in positioning.  You can almost infer positioning based on charts like this, for every player.  I haven’t researched it yet, but I would bet Ichiro would be a good case study as a guy who is an exceptional baseball athlete who scores relatively low on fielding metrics.  I wouldn’t be surprised that it’s his positioning that makes him end up worse than he should be based on his toolset alone.  (Of course, positioning is a skill, but whether that skill is the manager or the fielder is up for debate.)

Section 4 I believe is where the problem lies.  As I discussed in The Book, a .300 OBP pitcher facing a .400 OBP hitter results in the same performance expectation as a .400 OBP pitcher facing a .300 OBP hitter.  It’s this whole “splitting up” issue that is the problem, something that plagues Win Shares, and I think is the problem in this paper.
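
The symmetry of those two matchups can be checked numerically.  What follows is a minimal sketch, assuming the odds-ratio method for combining batter and pitcher rates and a league OBP of .330; the league value and the function names `odds` and `matchup_obp` are illustrative assumptions, not anything taken from the paper.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def matchup_obp(batter_obp, pitcher_obp, league_obp=0.330):
    """Expected OBP of one matchup under the odds-ratio method.

    league_obp=0.330 is an assumed value, used only for illustration.
    """
    o = odds(batter_obp) * odds(pitcher_obp) / odds(league_obp)
    return o / (1.0 + o)


first_matchup = matchup_obp(0.400, 0.300)   # .400 OBP hitter vs .300 OBP pitcher
second_matchup = matchup_obp(0.300, 0.400)  # .300 OBP hitter vs .400 OBP pitcher
print(round(first_matchup, 3), round(second_matchup, 3))
```

Because the batter and pitcher odds are simply multiplied together, swapping the two rates cannot change the result under this method, whatever league average is plugged in.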

It is unclear to me how the results in Tables 3a and 3b could have come about.  As the paper discusses on page 9:

If NERV had not been split according to the 62%/38% estimated in this paper, but rather using the arbitrary 50%/50% split used in standard analysis, the scores of the best pitchers would have been relatively higher and the scores of the best batters would have been relatively lower. The conclusion from that analysis would be that the best pitchers are relatively more important than the best batters. It is therefore important to make the estimated split in order to more accurately compare batters with pitchers.

I am certainly missing something, because I give 100% of the credit to the hitter, and 100% of the credit to the pitcher.  (Remember, this is not the same as 50-50, and it looks like I’m double-counting.) And I get results where the top 10 hitters are pretty much ahead of the top 10 pitchers.

Ok, now let me tell you why this splitting-issue is not the right way to proceed.  And I’m going to use hockey as an example.  In hockey, everyone on the ice for the scoring team gets a “plus 1”, and the players on the ice for the opposing team get a “minus 1”.  This means that every goal has five pluses and five minuses.  If we follow the logic in this paper, this would entail taking this one goal, and somehow splitting it up among the players on the ice.  Perhaps giving the five guys on the scoring team a total of +.5 goals, and the five guys on the opposing team a total of -.5.  And then, among the scoring team, deciding who gets the share of the +.5, like maybe +.25 for the goal scorer, +.10 for the playmaker, and +.05 for the other guys on the ice.

Assume you have a player, whom I’ll call Obby Borr.  He’s a +120 for the Bruins, and when he’s not on the ice, his teammates are +0.  Also assume that Borr plays with everyone on the team.  If you proceed with a “splitting” arrangement, Borr will end up being credited for something like +60 goals, if he’s lucky.  Likely, under the splitting-system, he’ll be even lower.  However, in my system, he gets +120.

You see, Borr plus his teammates is +120.  All of his teammates are zero.  Therefore, Borr plus zero is +120, making Borr equal to +120.
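
The Borr arithmetic can be laid out as a short sketch.  The 0.5 side share and the even five-way split are assumed shares, chosen only to mirror the hypothetical splitting scheme described above; the variable names are illustrative.

```python
# Hypothetical numbers taken from the Borr example.
on_ice_with_borr = 120   # goal differential when Borr is on the ice
teammates_alone = 0      # goal differential of his teammates without him

# Full-credit view: whatever the on-ice result exceeds the teammates'
# own level by is attributed to Borr.
borr_full_credit = on_ice_with_borr - teammates_alone

# Splitting view (assumed shares): each goal is worth 0.5 to the scoring
# side, and that 0.5 is then divided among the five skaters.
side_share = 0.5
best_case_split = on_ice_with_borr * side_share    # Borr keeps the whole side share
even_split = on_ice_with_borr * side_share / 5     # shared equally with four teammates

print(borr_full_credit, best_case_split, even_split)  # 120 60.0 12.0
```

Only the full-credit figure matches what the team actually loses when Borr comes off the ice.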

You have to treat each player as if he’s his own universe, and you adjust for the extra parameters.  The same logic applies to strength of schedule, or how to credit the DP between the 2B and SS, and several other concepts. 

Getting back to splitting up run expectancy (or win expectancy), you just don’t do it that way.  Having the leading hitters at +25, as Table 3a shows, is simply wrong.  If you add the performance of those players into any team (and take out an average hitter), that team will certainly score more than 25 more runs than otherwise.  Splitting doesn’t work.

#1    MGL      (see all posts) 2006/07/12 (Wed) @ 16:17

I must admit that most of the paper is over my head, as is Tango’s brief analysis herein.  (OK, I just admitted to the world that I am not as smart as I make myself out to be sometimes - or at least as readers think I make myself out to be...).

However, I agree with Tango that the 100%/100% model is correct.  The relative “responsibilities” of the batter and pitcher in the batter/pitcher matchup are simply based on the difference in the spread of skill among batters and then again among pitchers.  If all pitchers were throwing BP fastballs down the middle, as in the HR derby, it is intuitively obvious that all of the responsibility in the outcome can be attributed to the batter.  The reason is that the spread of talent among the pitchers is zero.

Now, we know that the spread of talent among pitchers in almost (maybe all, I don’t recall OFTOMH) all of the components (HR, K, BB, etc.) is smaller than among batters.  HR rate is a good example.  Batters’ HR rates go from around 3 a year to 40 a year (true rates).  Pitcher HR rates go from around 10 to 30 (again, true rates, not observed rates).

Anyway, when we assign 100% of the “responsibility” to pitchers and to batters when we compile the NERV changes, the “real” relative rates of responsibility “come out in the wash” because we are always comparing or normalizing each player to the average of his group (batters or pitchers).

So in the example of the batting practice pitcher, in the long run (getting rid of random fluctuation) the sum of the NERV changes for all of the pitchers will automatically equal zero.  IOW, all pitchers will be league average pitchers.  If we have a league where there is some spread of pitcher skill, but not much - a lot less than batter skill - we will find that the spread of total NERV change among all the pitchers will again automatically be small - at least smaller than the spread of batter NERV change.

If I do it the traditional way, using 100% for batters and 100% for pitchers, I also get a larger spread for batters than for pitchers - IOW, the best batters will be better than the best pitchers.
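
A toy round-robin illustrates the “comes out in the wash” point.  This is a sketch under stated assumptions: the talent values are made up, run changes are treated as simply additive (batter talent minus pitcher talent, in runs per PA relative to average), and random fluctuation is ignored.  The function name `full_credit_totals` is hypothetical.

```python
def full_credit_totals(batter_talent, pitcher_talent, pa_per_matchup=100):
    """Give 100% of each expected run change to the batter AND to the pitcher.

    Talents are runs per PA relative to league average (assumed additive).
    Positive batter talent helps the offense; positive pitcher talent helps
    the defense.
    """
    batter_totals = [0.0] * len(batter_talent)
    pitcher_totals = [0.0] * len(pitcher_talent)
    for i, b in enumerate(batter_talent):
        for j, p in enumerate(pitcher_talent):
            change = (b - p) * pa_per_matchup  # expected run change for the offense
            batter_totals[i] += change         # batter gets all of it
            pitcher_totals[j] -= change        # pitcher gets all of it, sign flipped
    return batter_totals, pitcher_totals


def spread(values):
    """Range between the best and worst totals."""
    return max(values) - min(values)


# Made-up talents: batters vary twice as much as pitchers.
batters = [-0.04, -0.02, 0.0, 0.02, 0.04]
pitchers = [-0.02, -0.01, 0.0, 0.01, 0.02]
bp_pitchers = [0.0] * 5  # everyone throws batting practice

bat_real, pit_real = full_credit_totals(batters, pitchers)
bat_bp, pit_bp = full_credit_totals(batters, bp_pitchers)

print("real pitchers:", spread(bat_real), spread(pit_real))
print("BP pitchers:  ", spread(bat_bp), spread(pit_bp))
```

With no split applied anywhere, the pitcher totals come out with a narrower spread than the batter totals, and the spread collapses to zero when every pitcher has the same talent.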

So I don’t really get where these guys are coming from either, although I am pretty sure that at least one of them is smarter than I.

Maybe what we are missing is that if you use the batted ball types and characteristics (distance, speed, etc.) to determine the NERV change rather than the actual result of the play (s,d, etc.), you have to somehow use this 62/38 split or you won’t come up with the correct answer.  I am not sure.


#2    Ben Alamar      (see all posts) 2006/07/12 (Wed) @ 22:53

I certainly appreciate the careful reading of our work as well as the opportunity to respond to the thoughtful criticism.

When the idea of using an expected run value calculation was first introduced, it was an elegant and balanced accounting system of the events that occurred during a game.  This type of system certainly has its allures and its uses, but what I argue in the paper is that a pure accounting system does not uncover the relative value that pitchers and hitters add to their teams.  Adding up the points earned in the system described will not net out to zero, and when it comes to evaluating players across positions, I do not believe that that is necessary.

First, the expected runs calculated in the paper, are just the expected runs created from the game between the hitter and the pitcher (we strip out the effects of defense and running through the use of expected outcomes of balls in play), so there is not a one to one correlation between total runs and the points that hitters and pitchers earn in this system.

Second, and on a more theoretical point, consider a pitcher/hitter game (an at bat) that results in a homerun.  It is not clear to me why the pitcher and the hitter would receive the same number of points in a system that was concerned principally with determining the relative value of these two players.  The two have obviously different tasks and because of this, one of them may exert more control on the outcome.  What we find in the paper is that the skill of the hitter, on average, has a greater effect on the outcome of an at bat than the skill of the pitcher.  If this result is correct, then it would seem that the hitter should be rewarded more for a homerun than a pitcher is penalized, because the hitter has more control.

Third, on MGL’s point regarding the relative variation of skill between pitchers and hitters, the model specifically controls for this, by normalizing the skill variables used.

I believe this is an interesting debate and I do hope that it will continue.

-Ben Alamar


#3    tangotiger      (see all posts) 2006/07/13 (Thu) @ 06:57

The splitting process is one that I’ve tried, and then realized that it had holes.  Which is why I do it the way I do it.

I understand why you want to do what you do, and it sounds right.  But it breaks down.  If you can provide your aggregate team totals for hitting, pitching, fielding (or offense, defense), I will prove to you that the splitting process is wrong.  Just something like:
NYY,+86,+21,-61
or some such, should be fine.


#4    Ben Alamar      (see all posts) 2006/07/13 (Thu) @ 09:27

The guys at protrade.com just sent me the most current numbers for this season.  These numbers split NERV, but do not use the 62/38 split for pitching and hitting, so I would suggest that the hitting numbers are understated and the pitching numbers are overstated.  In order they are Hitting, Running, Pitching, Fielding, Total.

{List edited by Tom to round numbers to one decimal place}

team,Bat,Run,Pitch,Field,NES
White Sox,101.8,2.1,62.2,-48.6,117.4
Tigers,30.6,-8.9,69.8,6.3,97.9
Mets,93.6,0.9,-1.8,-13.0,79.7
Red Sox,53.6,-4.6,64.1,-42.6,70.6
Dodgers,114.3,-2.3,-13.3,-29.8,68.9
Yankees,54.8,1.5,24.9,-18.7,62.5
Blue Jays,41.0,-3.9,45.8,-37.2,45.8
Padres,26.9,2.9,6.8,3.2,39.8
Cardinals,64.8,-5.6,-13.5,-11.5,34.2
Rockies,-13.6,-9.2,48.5,6.8,32.4
Giants,48.1,-7.3,-14.1,5.6,32.3
Indians,76.6,-1.7,-17.5,-28.2,29.2
Rangers,-11.8,-1.3,75.5,-35.2,27.2
Twins,13.5,2.8,16.0,-7.9,24.4
Reds,50.7,2.8,-29.7,-10.5,13.4
Braves,59.7,-5.1,-45.5,-7.6,1.6
Marlins,51.3,-4.8,-16.6,-32.0,-2.1
Diamondbacks,23.2,-1.3,0.8,-30.1,-7.3
Mariners,-1.7,2.2,-12.5,-1.1,-13.1
Athletics,-38.9,2.7,5.0,14.5,-16.7
Astros,9.7,-0.9,-24.0,-3.8,-19.0
Angels,-9.6,5.9,40.8,-57.8,-20.8
Phillies,8.2,0.0,0.1,-33.2,-24.9
Brewers,22.9,1.3,-10.3,-69.8,-55.9
Nationals,45.0,-10.6,-32.7,-64.6,-62.9
Pirates,3.0,-5.0,-30.1,-39.9,-72.0
Orioles,8.4,4.7,-73.5,-11.7,-72.0
Cubs,-26.6,0.3,-67.6,16.8,-77.1
Devil Rays,-58.3,-2.1,13.3,-38.0,-85.1
Royals,-34.4,-7.3,-74.1,-24.8,-140.6


#5    Tangotiger      (see all posts) 2006/07/13 (Thu) @ 10:28

This data suggests that the numbers use the run expectancy (RE) matrix (NERV, as you are calling it) the way I use it.  That is, with no splitting done at all.  And this makes sense.

If you do a simple runs scored minus runs allowed, you will get a distribution with a standard deviation of 60 runs.  The SD of your last column is also 60 runs.  If you were to give any side any lesser credit than the full 100%, that side will be shortchanged.

Combining your two hitting measures, you get an SD of 42, and you get an SD of 43 for your pitching+fielding.  This compares to the SD of 37 for runs scored and 40 for runs allowed.  Again, everything looks perfectly in sync.

Your top 5 NES teams (White Sox, Tigers, Mets, Red Sox, Dodgers) are +434, while those teams’ actual run differential is +429.

The bottom 5 NES are -447, and their run differential is -425.

In short, trying to do any splitting will simply misalign the NES numbers, which are currently in line with the runs scored and runs allowed numbers.
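
The spread and top/bottom figures quoted in this comment can be recomputed from the table in comment #4.  A minimal sketch, assuming the sample standard deviation is the SD intended; the rows are copied from that table, and actual runs scored and allowed are not included because they are not in the thread.

```python
import statistics

# Rows copied from comment #4: (team, Bat, Run, Pitch, Field, NES).
ROWS = [
    ("White Sox", 101.8, 2.1, 62.2, -48.6, 117.4),
    ("Tigers", 30.6, -8.9, 69.8, 6.3, 97.9),
    ("Mets", 93.6, 0.9, -1.8, -13.0, 79.7),
    ("Red Sox", 53.6, -4.6, 64.1, -42.6, 70.6),
    ("Dodgers", 114.3, -2.3, -13.3, -29.8, 68.9),
    ("Yankees", 54.8, 1.5, 24.9, -18.7, 62.5),
    ("Blue Jays", 41.0, -3.9, 45.8, -37.2, 45.8),
    ("Padres", 26.9, 2.9, 6.8, 3.2, 39.8),
    ("Cardinals", 64.8, -5.6, -13.5, -11.5, 34.2),
    ("Rockies", -13.6, -9.2, 48.5, 6.8, 32.4),
    ("Giants", 48.1, -7.3, -14.1, 5.6, 32.3),
    ("Indians", 76.6, -1.7, -17.5, -28.2, 29.2),
    ("Rangers", -11.8, -1.3, 75.5, -35.2, 27.2),
    ("Twins", 13.5, 2.8, 16.0, -7.9, 24.4),
    ("Reds", 50.7, 2.8, -29.7, -10.5, 13.4),
    ("Braves", 59.7, -5.1, -45.5, -7.6, 1.6),
    ("Marlins", 51.3, -4.8, -16.6, -32.0, -2.1),
    ("Diamondbacks", 23.2, -1.3, 0.8, -30.1, -7.3),
    ("Mariners", -1.7, 2.2, -12.5, -1.1, -13.1),
    ("Athletics", -38.9, 2.7, 5.0, 14.5, -16.7),
    ("Astros", 9.7, -0.9, -24.0, -3.8, -19.0),
    ("Angels", -9.6, 5.9, 40.8, -57.8, -20.8),
    ("Phillies", 8.2, 0.0, 0.1, -33.2, -24.9),
    ("Brewers", 22.9, 1.3, -10.3, -69.8, -55.9),
    ("Nationals", 45.0, -10.6, -32.7, -64.6, -62.9),
    ("Pirates", 3.0, -5.0, -30.1, -39.9, -72.0),
    ("Orioles", 8.4, 4.7, -73.5, -11.7, -72.0),
    ("Cubs", -26.6, 0.3, -67.6, 16.8, -77.1),
    ("Devil Rays", -58.3, -2.1, 13.3, -38.0, -85.1),
    ("Royals", -34.4, -7.3, -74.1, -24.8, -140.6),
]

nes = [row[5] for row in ROWS]
offense = [row[1] + row[2] for row in ROWS]  # Bat + Run
defense = [row[3] + row[4] for row in ROWS]  # Pitch + Field

# Sample standard deviations across the 30 teams.
sd_nes = statistics.stdev(nes)
sd_offense = statistics.stdev(offense)
sd_defense = statistics.stdev(defense)

ranked = sorted(nes, reverse=True)
top5 = sum(ranked[:5])
bottom5 = sum(ranked[-5:])

print(f"SD NES={sd_nes:.1f} offense={sd_offense:.1f} defense={sd_defense:.1f}")
print(f"top 5 NES={top5:+.1f} bottom 5 NES={bottom5:+.1f}")
```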


#6    Ben Alamar      (see all posts) 2006/07/17 (Mon) @ 11:43

Your analysis is correct, but you are still assuming that the accounting is important.  I am not trying to create a system that actually credits specific players with specific runs, but rather trying to determine the relative value of different players.  I could just as easily set the average of my system to 1 or 100 or 54 and not change the information that the system provides, which is that, relatively speaking, hitters have more control of the outcome of any specific at bat than a pitcher and that over the course of a season, a great pitcher and a great hitter add approximately the same value to their team.  Note that I said value and not runs.


#7    Tangotiger      (see all posts) 2006/07/17 (Mon) @ 12:34

Let me ask a simple question: a hitter with a “true rate” of .400 OBP faces a pitcher with a “true rate” of .300 OBP.  What’s the expected outcome?  (Use whatever league average you want.)

Now, a hitter with a “true rate” of .300 OBP faces a pitcher with a “true rate” of .400 OBP.  What’s the expected outcome?

My contention is: same thing.

By your reasoning, it seems that you would argue that the first matchup will result in a higher outcome than the second.  Is that correct?


#8    Ben Alamar      (see all posts) 2006/07/17 (Mon) @ 15:59

The split that is determined in the paper does not actually depend upon the skills of the hitter and pitcher in a given matchup.  It is possible to use the same regression equations to produce this type of system that adjusts for competition faced, but we do not do that.

Instead, the results say that in a typical plate appearance, the outcome is influenced more by the hitter than the pitcher.  Assuming that we accurately measured that, then we say that given the hitter’s greater influence on the outcome, he deserves a greater share of the credit/blame of the outcome than the pitcher.

In essence, we are saying that when Pujols strikes out, it is more his fault than the pitcher’s.


#9    tangotiger      (see all posts) 2006/11/08 (Wed) @ 17:21

I’m going through this thread, and wanted to maybe come to a common point.

Ben’s last statement is correct, that when Pujols does anything, in that one PA, he is more responsible for it than the pitcher.  This is simply a function of the talent distribution for hitting being wider than the talent distribution for pitching.  That is, if you have a very very tight distribution, and you have another distribution that is very very wide, and you randomly select one player from each distribution, it’s clear that the player chosen from the wider distribution will be more responsible for that one particular outcome than the other player.

And we agree on this.

But it does not follow that the NERV should also be split as 62/38, or 160/100, or whatever non-equal split you want to make.  I tackled this splitting issue already.

I think this is what the problem may be, that the authors linked one (variance in talent distribution) to the other (overall effect to outcome).


#10    dq      (see all posts) 2006/11/09 (Thu) @ 13:16

I read this thing, and have a few questions from my old mind:

1. They did a good job in trying to make the fielding neutral, by looking at results based on type of ball hit and location, not actual results. So I think they mean it’s batters 62, pitchers 38 with neutral fielding impact. So, I think that fielding is not part of the 62 or 38, they are doing batting versus pitching with no fielding. If fielding is 2/3 of pitching (24), does that make it 62 versus 62?

2. They used all types of RE data. I think that most everyone assumes they cancel each other out in a large population. Their data should be better, but I’m not sure how much precision they build in here. Any idea what that might be?

3. Not sure how the 62 versus 38 is finally derived. They compute batter and pitcher rate coefficients - but are those impacted because there is a greater spread of talent in hitting than pitching?


#11    Tangotiger      (see all posts) 2008/02/07 (Thu) @ 12:22

bumping

(related discussion elsewhere on “splitting”)

