THE BOOK--Playing The Percentages In Baseball

Tuesday, February 02, 2010

Various implementations of WAR; talking WPA/LI

By Tangotiger, 10:30 AM

The framework of Wins Above Replacement (WAR) was developed in this blog over a period of months or years. 
Nonpitchers:
- Offense relative to average
- Fielding relative to average
- Positional adjustment
- Common replacement level adjustment
You add those up, and multiply by expected/deserved playing time.

Pitchers:
- Pitching relative to average
- Role adjustment (starter / reliever)
- Leverage adjustment (reliever)
- Common replacement level
You add those up, and multiply by expected/deserved playing time.

That’s the framework of WAR.  There are different implementations of this framework, as Fangraphs has it (fWAR) and Rally’s Baseball Projection has it (rWAR).  They each make some decision as to how to count each component.  The great thing about the framework is how you can slide one thing in or out without affecting anything else.  Prefer UZR to TZ, but like everything else Rally did?  No problem, slide one out, slide the other one in.
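As a rough sketch of that "slide one out, slide the other one in" idea, here is the nonpitcher side of the framework in Python. The units (wins per full season for each component, playing time as a fraction of a season) and all the numbers are assumptions made for illustration; they are not taken from fWAR or rWAR.

```python
from dataclasses import dataclass


@dataclass
class PositionPlayerRates:
    """Component rates in wins per full season (units are an assumption)."""
    offense: float       # offense relative to average
    fielding: float      # fielding relative to average
    position: float      # positional adjustment
    replacement: float   # common replacement level adjustment


def position_player_war(rates: PositionPlayerRates, playing_time: float) -> float:
    """Add the components, then multiply by playing time (1.0 = full season)."""
    total_rate = rates.offense + rates.fielding + rates.position + rates.replacement
    return total_rate * playing_time


# Hypothetical player; every number here is made up.
with_one_fielding_metric = PositionPlayerRates(
    offense=2.0, fielding=0.5, position=0.25, replacement=2.0
)
# Swap in a different fielding estimate; nothing else has to change.
with_other_fielding_metric = PositionPlayerRates(
    offense=2.0, fielding=1.0, position=0.25, replacement=2.0
)

war_a = position_player_war(with_one_fielding_metric, playing_time=0.8)
war_b = position_player_war(with_other_fielding_metric, playing_time=0.8)
print(round(war_a, 2), round(war_b, 2))
```

The only thing that moves between the two results is the fielding term scaled by playing time, which is the point of keeping the components separate.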
Jeremy doesn’t like some of the choices made by some of these implementations.

“My main philosophical problem with Fangraphs’ WAR (fWAR) is that relievers are given extra value for having pitched in high-leverage situations. Personally, I don’t understand why we use a pitcher’s actual leverage index and chain from there. Why not just start and end with the deserved leverage index?”

In terms of forecasting, I definitely go with deserved LI.  But, in terms of accounting for the past season, I don’t know that I would do that.  After all, imagine if Mariano Rivera was used in mop-up duty all season.  His actual win impact was muted.  Imagine Ozzie Smith at DH or 1B.  Imagine Adam Dunn in CF.  So, in terms of accounting for actual wins and losses, the actual usage is what we care about, and not the optimal / deserved usage.

But, that’s fine.  We can disagree on it.  Jeremy can have his jgWAR if he likes.  As long as we adhere to the basics of the framework, then 90% of the disagreement goes away.  Now the conversation moves to the periphery.  And that’s a good thing.  It’s no longer an RC v BsR debate, but a debate as to the specific component of the “B” variable in BsR.  That’s good.

Now, Jeremy also brings up WPA/LI for pitchers.  In this particular case, there’s a bit of a problem.  WPA/LI is great for hitters because it neutralizes all the PA so that all the PA are equally impactful.  It simply recalibrates each of the components.  Basically, it’s like having a game-state specific version of wOBA.  Sometimes the coefficient for the HR is 1.5 and walk is 0.9, or sometimes the coefficient for the HR is 3.0 and the walk is 0.4.  There’s always some sort of recalibration based on how much impact the walk and HR will have on that particular game state.  But, for pitchers, it’s not so easy.  Prime-Pedro has many fewer men on base specifically because Prime-Pedro is pitching.  So, we don’t want to neutralize each PA so that they count equally and then add them up in a linear fashion, which is what WPA/LI does.  WPA/LI is the first step, but then it needs to be BaseRuns-ized in order to get it into the right scale.
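To illustrate why a linear sum is not quite right for pitchers, here is a small sketch using one simple, commonly cited form of BaseRuns (BsR = A*B/(B+C) + D). The specific B coefficients and both stat lines are assumptions chosen for illustration, not anything derived from WPA/LI itself. It compares what one extra walk is worth against an average pitcher and against a pitcher who allows far fewer baserunners.

```python
def base_runs(ab, h, tb, hr, bb):
    """One simple BaseRuns variant; the B coefficients are an assumed version."""
    a = h + bb - hr                                    # baserunners
    b = (1.4 * tb - 0.6 * h - 3.0 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h                                         # outs
    d = hr                                             # automatic runs
    return a * b / (b + c) + d


def marginal_walk_value(line):
    """Runs added by one extra walk on top of the given stat line."""
    with_walk = dict(line, bb=line["bb"] + 1)
    return base_runs(**with_walk) - base_runs(**line)


# Hypothetical stat lines allowed over the same number of at-bats.
average_line = dict(ab=800, h=210, tb=330, hr=25, bb=80)
stingy_line = dict(ab=800, h=150, tb=220, hr=12, bb=40)

walk_vs_average = marginal_walk_value(average_line)
walk_vs_stingy = marginal_walk_value(stingy_line)
print(round(walk_vs_average, 3), round(walk_vs_stingy, 3))
```

Under these assumptions the same walk costs the stingy pitcher fewer runs, because there are fewer runners around for it to interact with. A linear method that prices every walk at the league rate misses that.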

How big a deal is this?  Interestingly, not big at all.  When I look at the big 4 of our generation, their career WPA/LI and career WPA show almost no difference.  In any case, to the extent that we have an issue, the bias would work the same way for all pitchers of the same quality.  Basically, this is not worth worrying about unless you are someone who really enjoys digging into an issue like this.  So, yeah, WPA/LI (for pitchers) gets you at least 95% of the way there, if not 99%.

I love WPA/LI because it balances every PA to be equal to any other PA, while also adjusting for the particular vagaries of the situation.  A runner on 3B and less than 2 outs and down by 1 run is not the same as with 2 outs and up by 5.  WPA/LI handles it properly, and is the only metric to do so, other than its cousin WPA.
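For anyone who wants the mechanics, here is a minimal sketch of the calculation: each PA's WPA is divided by the LI at the start of that PA, and the results are summed. The three plays are invented numbers, not real play-by-play data.

```python
def total_wpa(plays):
    """Plain WPA: add up the change in win expectancy for each play."""
    return sum(wpa for wpa, _li in plays)


def total_wpa_li(plays):
    """WPA/LI: divide each play's WPA by its leverage index, then add up."""
    return sum(wpa / li for wpa, li in plays)


# Hypothetical (WPA, LI) pairs for three plate appearances.
plays = [
    (0.10, 2.0),   # big hit in a high-leverage spot
    (-0.02, 0.5),  # out made in a blowout
    (0.03, 1.0),   # ordinary situation
]

wpa = total_wpa(plays)
wpa_li = total_wpa_li(plays)
print(round(wpa, 3), round(wpa_li, 3))
```

The high-leverage hit is scaled down and the low-leverage out is scaled up, so each PA ends up on the same footing while still reflecting what the event meant in that base-out-score state.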


#1    Jeremy Greenhouse      (see all posts) 2010/02/02 (Tue) @ 13:30

Tango, thing is, we’re not “accounting for actual wins and losses.” If I’m trying to find a player’s value, I’m placing him under optimal usage and removing the manager entirely. A run is a run is a run, no? Then why do you think a 6th inning Mo is less valuable than a 9th inning Mo?

I agree about not adding up WPA/LI in a linear fashion, and I wasn’t trying to suggest that. Just that the weights of WPA/LI make more sense in a WAR metric than regular linear weights.


#2    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 14:23

“If I’m trying to find a player’s value, I’m placing him under optimal usage and removing the manager entirely.”

I have no quarrel with what YOUR implementation would be.  I have my own, Rally has his own and Fangraphs has its own.  As long as you state your intentions, and are consistent, then your system is justifiable, and therefore, fine.

In your case, you say you want to find his optimal usage.  But, are you also relying on his observed performance? So, are you saying that you are going to treat Jason Bartlett’s 2009 season as-is, and then try to find the optimal usage for that known performance?  And same for Brad Lidge’s 2009 season?

As for a run is a run is a run, again, it all depends on the question you are asking, and how you are implementing your version of WAR. 

I can see why Rally handles pitchers the way he does and I can see why Fangraphs does as well.


#3    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 14:36

If a run is a run, then why use LI at all? Either you buy into the WPA and LI framework and its assumptions or you don’t. If you do, then the LI for a pitcher (in retrospect) is what it is. If you don’t, then why are you trying to apply it to a pitcher at all?


#4    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 14:44

The way I see it is “what would an average player have done here”, as the comparison point.

For example, Mattingly had 145 RBIs.  But, what would an average player have done given Rickey and the other guys Mattingly had.  So, the “145” means nothing unless I know what I can compare it to.

It’s one thing to say that when Mo is involved, his WPA/LI is 2.5 and his WPA is 5.0, and the bullpen WPA is 6.0.  But, I need to know what happens if you have an average pitcher in the bullpen.  Roles get shifted, and the baseline comparison point starts to change.

Basically, I don’t treat the RBIs or WPAs or anything actually as something that “belongs” to the player like money in his bank account. Part of that money belongs to the broker who lent it to him, and we’re trying to figure out how much he actually borrowed.


#5          (see all posts) 2010/02/02 (Tue) @ 14:49

Tom,

Isn’t the problem with WPA/LI for pitchers that it completely ignores the DIPS theory?  Isn’t there a way to create a FIP-WPA/LI?

I wrote you an email about this a while ago.  Do you think that WPA/LI is good enough as it is, and making a fielding-independent version would cause more problems than it resolves?  Or would that just be impossible to do? 

Just curious, as I’ve been thinking about this a lot lately, but don’t have the knowledge to work on it myself to see how it comes out.


#6    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 14:56

Well that’s a problem I have with bullpen chaining LI - as far as I can tell, it ignores half of the story.

Right now, when people adjust for chain LI, they do so in a downward direction, but not upward. In other words, if you have a good setup man, the closer’s LI goes up, because the setup man keeps the score closer. So a good setup man should in theory get credit for the LI win difference between the observed LI and the LI that an average setup man would provide, under the assumption of an average closer.


#7    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 15:27

Steve/5:

If you only credit the pitcher on FIP events, you are forgetting about the sequencing of events.  Basically, a guy who is terrible with men on base is saved by a FIP-based metric.  Even WPA/LI won’t help, because it treats each PA equally.

From your perspective, you’d prefer what Rally does, since he starts with the events, and then removes the “average team defense” impact on the BIP.  It’s crude in that respect (doesn’t try to actually look at each BIP), but it at least doesn’t pretend that fielding is neutral for all pitchers.

***

Colin/6: hmmm… ok, so let’s take the 1996 Yankees

http://www.fangraphs.com/winss.aspx?team=Yankees&pos=all&stats=rel&qual=0&type=3&season=1996&month=0

They had 4 relievers all season (Wetteland, Rivera, Wickman, Nelson), plus the equivalent of 1.5 more relievers.

The LI:
2.4 Wetteland
1.5 Rivera

0.8 Nelson
0.8 everyone else
0.7 Wickman

Rivera was so good, he ended up with a +5.3 WPA to Wetteland’s +4.1 WPA.

In terms of “deserved LI”, based on performance of that season, Rivera should have gotten more than Wetteland.

In terms of “chained actual LI”, you are suggesting that taking out Rivera from the mix means not only impact to the relievers below him, but also to Wetteland above him.  That Wetteland wouldn’t have found himself with a 2.4 LI.

I don’t necessarily disagree with you.  But boy, that sounds like a tough thing to handle.


#8    David Cameron      (see all posts) 2010/02/02 (Tue) @ 15:30

“Then why do you think a 6th inning Mo is less valuable than a 9th inning Mo?”

Closers do not get selectively used to maximize platoon advantages.  9th inning specialists have to face whoever is due up, regardless of their handedness. 

Relievers used earlier in the game are selected based on which types of hitters are coming up.  If you’re a RH middle reliever, odds are pretty good that your manager will put you into the game in an advantageous position, where you can face multiple RH hitters and be removed if a tough LH hitter comes up. 

There’s a degree of difficulty that is inherently greater in pitching the 9th inning consistently as opposed to being selectively used when the match-ups are in your favor. 

So, if you’re not going to pay attention to leverage, you then have to adjust for these differences in another way.  You can’t compare closers’ numbers to middle relievers’ numbers without some kind of adjustment.


#9    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 15:46

If you’re trying to adjust pitcher WPA the way Rally adjusts pitcher RA, you need to figure a pitcher’s leveraged defensive support. I can only think of about a dozen ways that gets messy.

================

Yes, figuring out bullpen LI that way is probably obscenely difficult. That’s one of the big reasons I haven’t done anything with it yet.

But yes, if you’re trying to assign credit for LI, what you probably should be doing is giving Wetteland credit for whatever LI he would have had if Rivera had been replaced by an average reliever. Then, Rivera should get credit for the difference between the two LIs for Wetteland, based on the assumption that Wetteland was average. (Of course, now you run into the question of average what - average for all pitchers? All relievers? Relievers in similar situations?)


#10    Jeremy Greenhouse      (see all posts) 2010/02/02 (Tue) @ 16:32

Tango/2, “Are you saying that you are going to treat Jason Bartlett’s 2009 season as-is, and then try to find the optimal usage for that known performance?  And same for Brad Lidge’s 2009 season?”

Yes. I believe that is what I would do.

Colin/3, you misinterpret what I mean when I say a run is a run. You say “then the LI for a pitcher (in retrospect) is what it is.” But what I’m saying is that two pitchers with the same amount of innings and runs above replacement, but different LIs, should end up with the same WAR. We should scale their respective LIs to be the same. This is not saying all pitchers’ deserved LI should be the same and we should disregard the WPA and LI framework. Your #6 seems to be similar to what I’m saying.

Dave/8, I’m talking in theory, not in reality. I understand the current (flawed) structure of the bullpen has defined roles for the 6th inning and 9th inning that can’t be directly compared. And anyway, I really don’t think that incorporating actual LI is the correct way to adjust for quality of opposition.


#11    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 16:39

I don’t think my number six says the same thing at all. The idea still looks at a pitcher’s context - a closer will still have a higher LI than other relievers. It just reallocates some of the LI credit based upon the performance of other pitchers.

If you want two pitchers with the same innings and RAR to have the same WAR, then just ignore LI. That’s really all you have to do.


#12    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 17:17

“But what I’m saying is that two pitchers with the same amount of innings and runs above replacement, but different LIs, should end up with the same WAR.”

Then you would have to give Lidge an LI of close to 0, so that he ends up with a WAR of 0.

“If you want two pitchers with the same innings and RAR to have the same WAR, then just ignore LI. That’s really all you have to do.”

I think he must also mean that 70 IP of Mo performance as a starter must be less than 70 IP of Mo performance as a reliever.


#13    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 17:29

“I think he must also mean that 70 IP of Mo performance as a starter must be less than 70 IP of Mo performance as a reliever.”

Well, I guess. But in that case, a run really isn’t a run, is it? Else 70 IP of Mo as a starter should be equal to 70 IP of Mo as a reliever (sort of - that depends on how you imagine Mo would pitch as a starter versus the typical starter and reliever baselines we use, but that’s a pedantic issue that confuses things).


#14    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 17:37

Right, a run is not a run is not a run.

No one acts like it is, no one pays as if it is, and so we don’t need to pretend that it is, just because we are predisposed to believing that it is.


#15          (see all posts) 2010/02/02 (Tue) @ 18:34

Tom/#7,

But couldn’t we create something that doesn’t get rid of sequencing while still getting rid of fielder effects?  Basically the same as WPA/LI, except that the pitcher gets the same credit for every ball in play: the average WPA/LI of balls in play for that situation.  Ks, BBs, and HRs are treated exactly as they are now.  That way, if someone is good at getting a strikeout when he needs one, he’ll get credit, but he won’t get any more credit for a lineout than a single. 

You could apply this to the theories behind xFIP or tRA also.  For xFIP, the player gets the WPA/LI for the average fly ball whenever a fly ball is given up.  This way, if someone is good at keeping the ball on the ground in situations where a HR would be very detrimental, he’ll get credit for that.

For tRA, pitchers who can induce ground balls in double play situations would get credit, whether or not a double play actually happened. 

I personally think something like this would be the best of both worlds, and better than just using RA and subtracting out defense after the fact.  It includes sequencing and “pitching to the situation” but doesn’t depend on if a ball squeaks through the infield or not.
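To make the idea concrete, here is a rough sketch of how the substitution might work. The event labels, the game-state keys, and every number are hypothetical, and the league table of average ball-in-play values per state is assumed to exist already; building that table is the real work.

```python
# Events the pitcher is credited for as they actually happened.
FIELDING_INDEPENDENT = {"K", "BB", "HR"}


def fielding_independent_wpa_li(pitcher_plays, league_bip_avg):
    """Sum WPA/LI, replacing each ball in play with the league-average
    ball-in-play value for that game state (pitcher's perspective)."""
    total = 0.0
    for state, event, actual_value in pitcher_plays:
        if event in FIELDING_INDEPENDENT:
            total += actual_value           # keep the real outcome
        else:
            total += league_bip_avg[state]  # ignore what the fielders did
    return total


# Hypothetical league averages for a ball in play, by game state.
league_bip_avg = {"bases_empty": 0.004, "runner_on_3b_1_out": -0.010}

# Hypothetical plays: (game state, event, actual WPA/LI for the pitcher).
pitcher_plays = [
    ("bases_empty", "K", 0.020),
    ("bases_empty", "1B", -0.030),          # single: gets the BIP average
    ("runner_on_3b_1_out", "LINEOUT", 0.040),  # lineout: also the BIP average
    ("runner_on_3b_1_out", "BB", -0.020),
]

fi_total = fielding_independent_wpa_li(pitcher_plays, league_bip_avg)
print(round(fi_total, 3))
```

A single and a lineout in the same state come out identical here, while the strikeout and the walk keep their situation-specific values, so sequencing on the fielding-independent events is preserved.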

Anyway, like I said, I do not have the knowledge to create something like this, or even to judge whether it would be practical, so I thought I would pitch the idea to people who do.


#16    Colin Wyers      (see all posts) 2010/02/02 (Tue) @ 19:38

You can’t use WPA/LI for pitching. Okay, okay, let me rephrase this: you can, but it leads to incorrect assumptions. A pitcher has more control over his run environment than WPA/LI, which is really a sort of LWTS, supposes.

So you’re stuck with doing WPA. Which, you can do, I guess. But then you have the reverse problem. As an example:

If a pitcher allows a ground ball, you dock him the average change in WPA in that situation. Okay, that works. But now for the next play, do you credit him with the LI for the actual play, or the average follow-LI for a ground ball in that spot? Now chain this over four or five plays in an inning.

What you either end up with is a system where you’re not really being fielding independent, because you’re letting fielders influence LI, or you’re tracking multiple LIs - one for the pitcher assuming average fielding support, one (or eight!) for the rest of the defense assuming average pitching…


#17    Tangotiger      (see all posts) 2010/02/02 (Tue) @ 20:56

WPA/LI is dynamic linear weights, in that each PA is equally weighted, but the run value of each event changes based on the particular game state.  It’s best to think of it as a custom wOBA equation.

So, if you allow a walk, the average wOBA would be .720, but in some game states, the wOBA for that walk would be .640 and in another it’s .830 and in another it might be .910 and in another it might be .440. 

You do this for all walks, and the average wOBA, FOR ANY PITCHER, will be around .720.
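As a quick illustration of that averaging, the sketch below takes the four state-specific walk values mentioned above and weights them by how often each state comes up. The frequencies are invented so that the example lands near .720; real game-state frequencies would come from play-by-play data.

```python
# (walk value in that game state, share of walks occurring in that state).
# The values are the ones quoted above; the shares are hypothetical.
walk_value_by_state = [
    (0.640, 0.30),
    (0.830, 0.35),
    (0.910, 0.18),
    (0.440, 0.17),
]


def average_walk_value(table):
    """Frequency-weighted average of the state-specific walk values."""
    total_share = sum(share for _value, share in table)
    return sum(value * share for value, share in table) / total_share


avg_walk = average_walk_value(walk_value_by_state)
print(round(avg_walk, 3))
```

Any single walk can be worth well more or less than the average, but across enough walks the state mix washes out and the figure settles near the context-neutral weight.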

You repeat this for HR, and K and singles and doubles, and whatnot.  You get the same kind of thing going on.

The average wOBA for all the events will match a pitcher’s OBP, more or less.  That’s the whole point of wOBA.  And by extension, WPA/LI. 

So, all we are doing is trying to make more sense of a pitcher’s OBP and SLG and whatnot.

WPA/LI is a step up from any of the component stats.

If you have an argument against WPA/LI, you will have the same argument against ANY component method.  So, it won’t be a WPA/LI issue, but a component-based issue.

***

As for looking at runs allowed, well, now you are into a sequencing issue AND a fielder-issue, since you are now linked to the fielder making plays or not, and that will drive everything.

***

No matter what you do, you’re going to cut corners somewhere.  So, it’s not an issue against WPA/LI, but really that you need to define exactly what it is that you want to measure, and then construct the metric to match what you want.

Give me the question first, and I’ll give you the right answer.


#18          (see all posts) 2010/02/02 (Tue) @ 22:21

So let me see if I understand all this.  Colin doesn’t like WPA/LI because it changes the weights given to events based on game state, and the pitcher is partially responsible for his game state, so he shouldn’t get more or less credit based on a situation he created.

Tom is saying that since WPA/LI gives equal weight to each PA, but just changes the weights of the specific outcomes, you aren’t giving someone extra credit for getting out of a situation he created.  It is better than something like wOBA for pitchers because the pitcher is only partially responsible for changing the game state, and should get credit for pitching to the situation.  A pitcher won’t get credit for pitching into a situation with a higher LI and getting an out, and won’t get hurt by the fielders putting him in a situation with a higher LI and not getting an out, because the LI is taken out.

Is that about right?

So, if someone were to decide to take the side of WPA/LI, and want to put that into a WAR formula, what would be the best way to do it?  I realize that “best” could mean a lot of things, depending on what you think is important, but I think we have a lot of the same goals in mind when it comes to creating a value metric.

Would it be best to just leave WPA/LI and “base-runsize” it like you said?  Or possibly doing something more like Rally does, and subtract out defense after the fact?  Or would it be better to create a metric that uses the DIPS stats and changes the weights based on the game state?


#19    Dan Turkenkopf      (see all posts) 2010/02/03 (Wed) @ 16:19

I’m in the midst of trying to create a WPA/LI based WAR which is turning into a really big task. 

My conceptual approach is to take WPA and assign values to batter / baserunner(s) / pitcher / fielder(s) based on average outcomes and Colin’s SZR so that everything sums to 0 per play.  From there, divide by LI.  Sum everything and subtract replacement value.

That approach contains at least one big assumption beyond guessing on some of the hit location information missing from Retrosheet.  I’m assuming that defensive performance is unleveraged - in other words, defensive players perform the same regardless of the game situation - so that conversion % is the same no matter the LI.

My biggest concern is how to treat LI for relievers.  Without some concept of LI, every reliever will be less valuable than any starter (pretty much), but I’m struggling on the best way to include it.  I sort of like the idea of anchoring the LI at the point of entry into the game / beginning of the inning so that the pitcher has minimal impact in changing his own leverage.

I’m also having to do a bunch of little studies to determine rate of advancement based on a lot of factors, and figuring how to split credit between pitcher and catcher on stolen bases and wild pitches / passed balls.

Thoughts?


#20    Tangotiger      (see all posts) 2010/02/03 (Wed) @ 17:06

You have to be careful with this “splitting” business.  Read this blog post:

http://www.insidethebook.com/ee/index.php/site/comments/isolating_pitchers_from_batters/

And especially follow the hockey example.


#21          (see all posts) 2010/02/04 (Thu) @ 02:00

Dan,

couldn’t you treat the reliever as a starter, using WPA/LI, then after you’re done, multiply by the average LI?  It would probably be simpler than anchoring the LI of their entry, and it would probably work out similarly.


#22          (see all posts) 2010/02/04 (Thu) @ 02:03

One more thing, if you do the anchoring, you wouldn’t be taking into account chaining of LI.  If you just multiply it afterwards, you can multiply it by (LI+1)/2 (or better formula, if you have one) to take that into account.
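A minimal sketch of that multiply-afterwards approach is below. The (LI+1)/2 chaining is the formula suggested above; the two LI values are the 1996 Wetteland and Rivera figures quoted earlier in the thread, and the WPA/LI total of 2.0 is a made-up number used for both so the leverage effect is easy to see.

```python
def chained_li(avg_li: float) -> float:
    """Regress a reliever's actual LI halfway toward 1.0 to allow for chaining."""
    return (avg_li + 1.0) / 2.0


def reliever_value(wpa_li_total: float, avg_li: float) -> float:
    """Treat the reliever like a starter via WPA/LI, then scale by chained LI."""
    return wpa_li_total * chained_li(avg_li)


# Same hypothetical WPA/LI for both, different actual leverage.
closer_value = reliever_value(wpa_li_total=2.0, avg_li=2.4)
setup_value = reliever_value(wpa_li_total=2.0, avg_li=1.5)
print(round(closer_value, 2), round(setup_value, 2))
```

With full actual LI the closer would be credited 4.8 against 3.0 for the setup man; chaining narrows that to 3.4 against 2.5.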


#23    Dan Turkenkopf      (see all posts) 2010/02/04 (Thu) @ 09:01

Tango: Thanks, I’ll take a look at that in more depth.  I think what it implies is that allocating credit for a single shared play is pretty tough and may require a WOWY and odds ratio to do correctly.  Am I on the right track?

Steven: Ah yes, I forgot to mention that I was planning on using the chained leverage at entry time.  It’s not much harder to calculate the leverage at entry time than it is to calculate for each play and this way you eliminate the impact the pitcher has on his own leverage.


#24    Tangotiger      (see all posts) 2010/02/04 (Thu) @ 10:42

Dan, the thing to remember about the allocating of shared plays is that the way you do it for a single random play will be very different than for thousands of plays with a common thread (say all plays involving Derek Jeter).

For example, for a single play, you would have the batter, pitcher, park, runners, game situation, on deck batter, etc.  Lots of things going on.  And let’s say Jeter gets a single, where the change in win expectancy was +.04 wins for the batting team.  Who gets credit for that +.04?

But, if you have one thousand singles for Derek Jeter, then ALL of the non-Jeter parameters listed are pretty much random.  That is, they are noise.  And therefore, the sum of ALL of the non-Jeter parameters will cancel out to zero.  All of those non-Jeter parameters are essentially the neutral environment Jeter found himself in (to the extent that we are presuming, for this illustration, that after 1000 Jeter singles, all the non-Jeter parameters are the same for Jeter and other players).  And so, if the sum total WPA of the 1000 Jeter singles is +42 wins, then we have to give 100% of the credit to Jeter.

Now, what if in that one single we are talking about, Jeter hit the single off Maddux.  We repeat the process and look at the 1000 singles hit off Maddux and similarly conclude that we give 100% of the credit to Maddux.

And we repeat with Yankee stadium.

And with Soriano on 1B when a single is hit.

And, so on.

Given enough data where you have one non-changing variable, and all the other parameters basically cancel out (to the extent that our illustration presumes it will cancel out), each parameter, IN ISOLATION, is going to get 100% of the credit for the play.

This is why you have to be very careful in the “sharing” of events like DP, and whether to give to 2B or SS or half-half or whatnot.  In a single isolated play, you may as well give each player 50% credit.  But, over a career, each is going to get 100% of the credit (TO THE EXTENT that each player plays with a random teammate).

You can see therefore why it would break down with Trammell/Whitaker, but makes perfect sense with Ripken/whoever.

And that’s why I say you have to be careful.
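Here is a small simulation of that point. It assumes a purely additive model (play value = player effect + partner effect + noise), which is a simplification chosen for illustration, and every effect size is invented. One player is paired with randomly drawn partners, and then with a single fixed partner.

```python
import random


def average_play_value(player_effect, partner_effects, n_plays, rng):
    """Average observed value per play under an assumed additive model."""
    total = 0.0
    for _ in range(n_plays):
        partner = rng.choice(partner_effects)  # who else was involved
        noise = rng.gauss(0.0, 0.05)           # everything else in the play
        total += player_effect + partner + noise
    return total / n_plays


rng = random.Random(0)  # fixed seed so the run is repeatable
true_effect = 0.03

# Partners drawn at random from a pool whose effects average to zero.
random_partners = [-0.02, -0.01, 0.0, 0.01, 0.02]
with_random_partners = average_play_value(true_effect, random_partners, 10_000, rng)

# Always the same above-average partner.
with_fixed_partner = average_play_value(true_effect, [0.02], 10_000, rng)

print(round(with_random_partners, 3), round(with_fixed_partner, 3))
```

With random partners the per-play average sits close to the player's own effect, so crediting him with all of it is reasonable. With one fixed partner the average absorbs that partner's contribution too, and the two cannot be separated from this data alone.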

