THE BOOK--Playing The Percentages In Baseball


Wednesday, November 26, 2008

The History of the wOBA, part 1

By Tangotiger, 03:35 PM

Why does wOBA exist?

Here’s an extremely long post that hopefully will answer all your questions.  And if it doesn’t, well, I guess that’s why I called this thread “part 1”.


When we were working on The Book, it became clear very quickly that one thing we’d need to do is establish the statistical significance of the data we were seeing.  And the best tool available for us was using the binomial distribution of a rate stat.  The binomial distribution requires that events occur as a binary (i.e., safe or out).  While you may think of batting average (BA), the better rate stat is on base percentage (OBP). 

BA takes the nonsensical view that a walk is a non-event.  Well, perhaps for what it tries to accomplish, it might be perfectly legitimate.  What we need is something that has plate appearances (PA) in the denominator, and that includes walks.  So, OBP becomes the stat of choice.  It describes each PA rather clearly: safe or out.  And that’s what baseball is about at its core.  OBP, while not as popular as BA yet, has staying power.  While the continued existence of BA is iffy (a large part of its being is simply inertia), OBP will always exist.  If it didn’t exist, it would have been invented.  BA enjoys no such fundamental truth. 

Anyway, while OBP initially satisfied our objective for The Book, it became quite clear very early that we simply couldn’t always treat a walk and a HR equally.  In OBP, each safe event counts as “1”.  So, we needed something else, something that could align itself to OBP to make comparisons straightforward in The Book, be binomial-like, and better weight each event.  We needed something that was almost interchangeable with OBP.

Enter Linear Weights.  Linear Weights contains the simple truths of baseball run creation.  I knew that I needed to somehow get Linear Weights onto a “rate” scale, so that I could do what I needed to do for The Book.  The overriding constraint is that the average weight of each safe event must be exactly equal to 1.  Since OBP has each safe event as “1”, and since my new rate stat will have the same denominator as OBP (i.e., PA), then the numerators had to match.

This means that if I underweight the walk, then I need to overweight the extra base hits.  Overall, the average of these coefficients had to be exactly 1.  With a bit of work, I came up with the logical basis to convert Linear Weights into a rate stat.

This is why the stat is called a WEIGHTED On Base Average.  It keeps the basis of OBP, which is safe divided by PA, except it tweaks the weight of each safe event to better match its actual impact to scoring runs, all while being centered, overall, to a weight of exactly 1.
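
To make the constraint concrete, here is a minimal Python sketch. It is not the SQL referred to elsewhere in this post. The unscaled run values and the league event counts are made-up illustration numbers, and `rescale_to_obp` is a hypothetical helper. The only thing taken from the text is the requirement that, league-wide, the weighted count of safe events equals the plain count of safe events.

```python
def rescale_to_obp(raw_weights, league_counts):
    """Scale weights so the league's weighted safe events equal its safe events.

    After scaling, the count-weighted average coefficient is exactly 1, so the
    league mean of the weighted stat equals league OBP (same denominator, PA).
    """
    safe_events = sum(league_counts.values())
    weighted = sum(raw_weights[e] * league_counts[e] for e in league_counts)
    factor = safe_events / weighted
    return {e: w * factor for e, w in raw_weights.items()}, factor


# Hypothetical inputs, for illustration only (not real league data).
raw_weights = {"BB": 0.62, "1B": 0.77, "2B": 1.08, "3B": 1.37, "HR": 1.70}
league_counts = {"BB": 16000, "1B": 28000, "2B": 8500, "3B": 900, "HR": 5000}

weights, factor = rescale_to_obp(raw_weights, league_counts)
average_weight = (
    sum(weights[e] * league_counts[e] for e in league_counts)
    / sum(league_counts.values())
)
print(round(factor, 3), round(average_weight, 6))
```

With these made-up inputs the walk ends up with a coefficient below 1 and the home run with one well above 1, while the count-weighted average stays at 1.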

So, weighted On Base Average (wOBA) exists to serve a particular purpose, one which was leveraged as best we could, in The Book.  We were able to use wOBA and OBP at the same time, treat them almost in the same manner, and see results along the same scale.  This is the reason for its initial existence.

Since then, it has gained a bit of traction.  And we get the natural questions, the first being: why not park adjust it?  Well, most of the time we didn’t need to.  For the purpose we needed the stat in The Book, we were dealing with large groups of data, and there the park effects cancel out (the data is not biased, so applying adjustments really just obfuscates things).  Sometimes, we DID need to park adjust it.  And, in those times in The Book, we did so.  There’s nothing inherently difficult about park adjusting a stat.  Everyone does it.  For the purposes of The Book, the park adjustments didn’t need to be that strict, since, even in the times we needed them, the bias was not strong enough for us to worry whether a park factor should really be 1.04 and not 1.03.

But, when wOBA is released into the wild, these minor adjustments become more important, especially when you are dealing with individual hitters, and Coors or Petco hitters in particular.  In The Book, we were worried about groups of data, not really about individual hitters.  For individual hitters, these adjustments are no longer minor.  There is nothing inherent about wOBA that would prevent you from making the park adjustment.  Just because I haven’t done it yet doesn’t make it a design flaw; it simply means that someone out there is free to do it.  As Bill James once said, “I can’t do all this myself”.  This is why most of my work is fairly open and reproducible, so that some people can roll up their sleeves.

Responding to Rob

Rob Neyer, friend of The Book Blog, wrote a blog entry about wOBA, and his readers posted their thoughts and questions.  I will highlight some parts of that thread in the hopes of casting some light on the matter.

According to wOBA, Albert Pujols was 67 runs better than average; according to BRAA, he was 82 runs better.

Clay Davenport at Baseball Prospectus shows Albert Pujols with:
102 Runs above replacement, 72 Runs Above Position, and 88 Runs above replacement position.  He also shows him with 98 Batting runs above replacement, 82 batting runs above average.  I’m not entirely sure exactly what each one does, nor why the two “runs above replacement” figures would differ by 4 runs.

Keith Woolner, at the same site shows Pujols with 99 runs above replacement, 75 runs above positional average, and 91 runs above league average.

Clearly, just within one site, there are some head-scratchers.  Baseball-Reference has Pete Palmer’s park-adjusted Linear Weights, and Pujols is +77 runs above average.

What does unadjusted wOBA say?  Actually, it doesn’t say anything, since I haven’t published those numbers.  You can try to work them out, and the answer is +72 runs. (The figure Rob cites, +67, I can get close to as +68, if I treat the IBB as a non-event.)

There are two main issues with Pujols: how to treat the IBB (which is enormous for great hitters), and (as it applies to everyone) how to handle the park and league adjustment.

In any case, you have five different results from four different sources, all using different methodologies, and applying different adjustments.  Mine is completely open, and indeed, I published the exact SQL on my site.  I would hope that showing my work is actually a good thing, and that the black box systems don’t get a pass.

It’s weighted on-base average … except it’s not, at all. It’s really linear weights on a scale that looks like on-base percentage

I’ll disagree slightly, but not vehemently.  I explained at the top of this thread why it can be considered a weighted on-base average.  Rejecting the argument basically says that it is impossible to even have a weighted on-base average, that the very idea is nonsensical.  Again, I’ll disagree slightly, but not vehemently.

Now, let’s jump ahead and say that two or three years down the line, the big mistake was discovered internally. Would BP announce to the world that all those numbers over the previous three years had been wrong? Or would the guys running the show decide that the loss of credibility (and potentially, revenues) isn’t balanced by the loss of integrity?

Indeed, I did find problems.  I announced them to the world via my blog.  Clay Davenport was receptive to the arguments and agreed that changes needed to be made.  He’s made bug fixes that he announced, and he made other changes, but certainly not all of them.  Others at BP were not so receptive to my arguments.

What I usually want to know isn’t how good a hitter someone is. What I really want to know is how good a player he is, and WARP, by combining hitting and fielding, tells us this.

This is really outside the wOBA issue.  You can’t criticize a metric for not doing what it wasn’t supposed to do.  “Paris Hilton isn’t funny.”  Guess what, she’s not supposed to be!  While Rob wasn’t necessarily criticizing wOBA, his argument is out of place in the wOBA discussion.

Responding to Rob’s readers

I know it’s become chic to criticize BP for this kind of stuff, but it really seems like nitpicking.

If you study the differences between Woolner and Davenport, it is not nitpicking at all.  Over the summer, I showed a 15-run gap between how Woolner sees ARod/Mauer and how Davenport sees the same two players at the same time.

Trust but verify.

Rob, while I know you didn’t develop wOBA, can you nonetheless explain to me the 1.15 factor involved in the computation? What is the significance?

In the “logical basis” and “exact SQL” links above, I describe it in detail.

Is there actually a peer reviewed journal for baseball (or sports in general) analysis?

No offense, but of the peer-reviewed journals I have read (and even done peer-review for), I was less than overwhelmed.  The best peer-review (for sabermetrics) is done by bloggers and readers of blogs.

I like the sound of wOBA, but if it’s not park-adjusted, I’m not sure I’d really be able to rely on it.

Agreed, which is why I support use of EqA or Palmer’s “BtnRuns” at Baseball-Reference.

Why not just translate into runs and have done? Also, I think the lack of park-adjustment is a serious flaw. I’d almost rather stick with OPS+.

This was answered in the main thread.  Palmer’s Linear Weights at Baseball-Reference has the batting runs, park-adjusted.  OPS+ should die sooner rather than later.

Enough already with another stat! We are acting like ivory tower intellectuals trying to analyze and objectify everything.

I take it this is a bad thing to do?

Also, since wOBA isn’t park-adjusted… isn’t the next step for someone to figure out how to do that? It was just introduced—nobody expects it to be perfect yet.

Right, exactly.

wOBA, huh? Every time I see this stat, or even think about it, I will always think of this:

http://www.youtube.com/watch?v=maYnqbdo2jw&feature=related

Busted. 

You are the first person to mention this to me.  I watched a lot of Sesame Street with my boy, and that song was *definitely* a reason that I named it wOBA.  I wanted to give the name something light.  I used to call it lwtsOBA or lwtsOBP (for linear weights), and I was also thinking of wOBP.  But, wOBA worked for the name, and making it match the song was my little secret.

...so scaling this stat to OBA just seems like a foolish effort. I can’t think why they didn’t scale it to batting average, except perhaps to avoid comparison to EqA, which would not be a good reason.

I hope now you know the real reason for the scale.

As to what’s already been probably said, wOBA looks like a poor man’s EqA.

It would be better described as an unadjusted, and far less complicated, EqA.

At its core, EqA is this:
(H + TB + 1.5*(W + HB + SB) + SF + SH)/(AB + W + HB + SH + SF + SB + CS)

If we can strip out all the extras, and focus on the big elements, we are left with this:
(H + TB + 1.5*W)/PA

So, the weights are:
1.5 BB
2.0 1B
3.0 2B
4.0 3B
5.0 HR

The denominator of wOBA is similar to EqA, so all we want to do is compare the numerators:
.72*BB + .90*1B + 1.24*2B + 1.56*3B + 1.95*HR

If we multiply my numbers by 2.4, we get:
1.7 BB
2.2 1B
3.0 2B
3.7 3B
4.7 HR

So, we see that “generally” they agree.
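
The comparison above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the simplified EqA numerator (H + TB + 1.5*W), the wOBA coefficients, and the 2.4 multiplier all come from the text, while the variable names are mine.

```python
# Simplified EqA numerator from the text: H + TB + 1.5*W.
# A hit counts once in H and once per base in TB.
BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
eqa_weights = {"BB": 1.5}
eqa_weights.update({event: 1 + bases for event, bases in BASES.items()})

# wOBA coefficients quoted in the text.
woba_weights = {"BB": 0.72, "1B": 0.90, "2B": 1.24, "3B": 1.56, "HR": 1.95}

SCALE = 2.4  # the multiplier used in the text
scaled_woba = {event: round(w * SCALE, 1) for event, w in woba_weights.items()}

for event in ("BB", "1B", "2B", "3B", "HR"):
    print(f"{event:>2}  EqA {eqa_weights[event]:.1f}  wOBA x 2.4 {scaled_woba[event]:.1f}")
```

The two columns line up with the two lists above: EqA is a little lighter on the walk and the single, and a little heavier on the triple and the home run.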

Clay does have some construction issues, and the conversion of EqR to EqA is fairly intensive.  I’ve spoken to him about it.  I think if Clay (and Bill James) were more involved with the grass-roots work that the sabermetric community is doing, we’d get closure on this pretty quickly.

...but reading the comments it appears to me that people who took the “OBA” part of the name seriously didn’t realize this. I think it’s too misleading.

It worked for what we needed.

I’ve never really understood the necessity of having one single metric that defines a player’s value.

You can’t fault something for not being what it wasn’t supposed to be.

...why sabermetricians don’t use a stat that would simply measure total bases per plate appearances? Something like (TB + BB) / (AB + BB). Not quite Boswell’s total average, but similar.

We do.  We discussed it and it’s described here.

 

#1    (name not shown)      (see all posts) 2008/11/27 (Thu) @ 04:29

“I like the sound of wOBA, but if it’s not park-adjusted, I’m not sure I’d really be able to rely on it.” (Neyer reader)

Agreed, which is why I support use of EqA or Palmer’s “BtnRuns” at Baseball-Reference. (Tango)

What exactly do you mean by this? You use wOBA all the time here, I don’t remember you ever using EqA or BtnRuns. You’ve obviously used LWTS, which I think is the basis of BtnRuns, but I don’t recall you advocating the use of EqA anywhere.


#2    (name not shown)      (see all posts) 2008/11/27 (Thu) @ 06:07

I have park adjusted wOBA’s. Simply adjust the counting stats before applying the wOBA formula.


#3    Tangotiger      (see all posts) 2008/11/27 (Thu) @ 06:48

I don’t use EqA, but I don’t have a problem with others who do.  Every stat has some limitation, and I highlight those limitations when others don’t.


#4    david smyth      (see all posts) 2008/11/27 (Thu) @ 09:00

I’m curious about a couple of details about the decisions Tango made in constructing wOBA.

First, the ‘raison d’etre’ of having the stat as a percentage (like actual OBA) is to help with the binomial and significance testing stuff, but wOBA isn’t like that (it varies between 0 and 1.95). How does that impact these technical things? And, is it a bit misleading to name this stat after a true percentage rate, when it is not a percentage stat at all? And, this scale (0 to 1.95) means that it is simply a coincidence that a Pujols-level .430 wOBA matches up well with his actual OBA. If you made wOBA into a true percentage stat (via an exponent or whatever), Pujols’s wOBA would be quite a bit lower than his actual OBA. Which is more esthetically displeasing in spite of being more realistic.

Second, the ‘units’ of wOBA have no intrinsic meaning (such as runs, baserunners, wins, etc.).

Given these considerations, I might have chosen to go with a scale which also has no intrinsic meaning, to acknowledge that fact, and to provide maximum ease of interpretation. Such a scale is avg = 100.

I assume Tango considered all of this, and had good reasons for the construction that was chosen.  But, you never know until you ask.


#5    Tangotiger      (see all posts) 2008/11/27 (Thu) @ 09:28

It is called wOBA not wOBP.  An average is not bounded by 0 and 1, unlike a percentage.

You are quite correct that since the (potential) range is 0 to 1.95, it is not a true binomial and can’t be used as a binomial.  In the Appendix, we show that 1 SD = sqrt(wOBA * (1.1 - wOBA) / PA)

The “1.1” is the key, and it is that because of the range of 0 to 1.95.
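
As a small worked illustration, here is the SD formula above next to the ordinary binomial SD, sketched in Python. The formula and the 1.1 constant are from this comment; the function names, the `upper` parameter, and the .340 over 600 PA hitter are assumptions made up for the example.

```python
import math


def binomial_sd(rate, pa):
    """Standard deviation of a true binomial rate such as OBP."""
    return math.sqrt(rate * (1.0 - rate) / pa)


def woba_sd(woba, pa, upper=1.1):
    """Binomial-like SD for wOBA, using the 1.1 constant from the Appendix."""
    return math.sqrt(woba * (upper - woba) / pa)


# Hypothetical hitter: .340 over 600 PA (illustration values only).
print(round(binomial_sd(0.340, 600), 4))
print(round(woba_sd(0.340, 600), 4))
```

Setting `upper=1.0` collapses the second function into the first, which is the sense in which wOBA is binomial-like rather than a true binomial.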

I couldn’t have scaled it like an index (which I have done in the past with Linear Weights Ratio) because I need to have something that is binomial-like.  So, the denominator must, absolutely must, have PA (number of opps).

The question then is how to create the numerator.  I could have made it so that the “1.1” would indeed have been “1.0”, and then be used exactly like a binomial.  I don’t know what the scale would have been for that.

For the purposes of The Book, I found it easier for the mean of the metric to be identical to the mean of OBP, because we kept using both metrics.  It was a nice “check” for our calculations when the two metrics produced means that were very close to each other.  It meant there was no bias in walks or extra-base hits.


#6    Ben R      (see all posts) 2008/11/27 (Thu) @ 10:49

I hope now you know the real reason for the scale.

I have wondered about this as well.  I understand that in the context of The Book it was helpful to have wOBA scaled to OBP so that we understand the results without having to adjust to a different scale.

However, taken on its own, wouldn’t it be simpler to just divorce the two?  I mean, now we can sort descending on fangraphs, do we really need to peg it to a more familiar reference?  Wouldn’t it be easier at this point to tell readers at espn that wOBA is “the average run value per plate appearance” (or something) without trying to relate it to OBP?


#7    Tangotiger      (see all posts) 2008/11/27 (Thu) @ 11:05

Sure, that’s LWTS per PA, or runs above average per PA.  Nothing wrong with that at all.  In fact, that’s what you want.  Indeed, I always talk about wins per 700 PA.

No need to use the term “wOBA” though.  Just call it something else.  And, in the “How to Calculate wOBA” thread, I include the manner to calculate LWTS and PA, and so, that’s there already.  And Fangraphs has “BRAA” which is Linear Weights by the 24 base/out state.  And baseball-reference has BtnRuns, which is Linear Weights.  So, it’s there in two places already.  Both sites provide the number of PA (though we have the IBB and SH issue).

What you will find, without question, is having to answer all those people who say “how can you have negative value?”. No one has ever said this about wOBA, even though wOBA is Linear Weights.


#8    (name not shown)      (see all posts) 2008/11/30 (Sun) @ 07:06

I watched a lot of Sesame Street with my boy, and that song was *definitely* a reason that I named it wOBA.

Alright, that did it for me.  Now I’m a wOBA fan.


#9    Ben R      (see all posts) 2008/12/02 (Tue) @ 03:11

What you will find, without question, is having to answer all those people who say “how can you have negative value?”

Yes, but if you left wOBA the same (with linear weights scaled against an out), but removed the ~15% bump to peg the mean to league average OBP, you wouldn’t get a negative result.  (Or am I missing something?)

Wouldn’t this be easier to explain to the reader at ESPN because it has more inherent “meaning?”


#10    Tangotiger      (see all posts) 2008/12/02 (Tue) @ 03:17

Again, understand the reason I created wOBA in the first place.  It had nothing to do with explaining anything to an ESPN reader, and everything to do in trying to use the binomial of OBP, and in similar spirit, the binomial-like of an overall measure.  Being able to switch between OBP and wOBA in The Book was a necessity, and having one scale helped.

Ben, are you making your point as someone who has read The Book or from some other point?


#11    Ben R      (see all posts) 2008/12/02 (Tue) @ 03:37

I loved The Book, and I understand the necessity of the scale in that context. 

My critique is a minor one and that is: when advanced stats are presented to the general baseball audience, additional layers can lead them to throw up their hands.  Baseball fans love their averages, and it seems to me that something like wOBA could really take off if the wider baseball audience saw it as simply “runs” per plate appearance.  (As opposed to counting stats like VORP or BRAA).  I could be wrong, though.

This is not really a critique of your stat or of your use of it, as I enjoy both.


#12    Tangotiger      (see all posts) 2008/12/02 (Tue) @ 04:10

Ben, I don’t disagree with anything you say in that post.

Making it “runs per PA”, which is essentially Linear Weights divided by PA plus 0.12, is a fine metric on its own.  You can even make it as runs per game, by taking that figure and multiplying it by 4.3. 

MLVr is runs above average per game: so that’s LWTS/PA*4.3 (without that .12 getting in the way there).

All fine ideas on their own.  And all irrelevant to the existence of wOBA.
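
For readers who want to see the three variants side by side, here is a short Python sketch. The 0.12 shift and the 4.3 multiplier are the figures given in this comment; the function names and the sample hitter (+35 Linear Weights runs in 700 PA) are hypothetical.

```python
RUNS_PER_PA_SHIFT = 0.12  # the "plus 0.12" from the comment above
PER_GAME_FACTOR = 4.3     # the multiplier from the comment above


def runs_per_pa(lwts_runs, pa):
    """Linear Weights runs above average per PA, shifted by 0.12."""
    return lwts_runs / pa + RUNS_PER_PA_SHIFT


def runs_per_game(lwts_runs, pa):
    """The runs-per-PA figure multiplied by 4.3."""
    return runs_per_pa(lwts_runs, pa) * PER_GAME_FACTOR


def runs_above_average_per_game(lwts_runs, pa):
    """LWTS/PA*4.3, without the 0.12 shift (the MLVr-style figure)."""
    return lwts_runs / pa * PER_GAME_FACTOR


# Hypothetical hitter: +35 Linear Weights runs in 700 PA.
lwts_runs, pa = 35, 700
print(round(runs_per_pa(lwts_runs, pa), 3))
print(round(runs_per_game(lwts_runs, pa), 3))
print(round(runs_above_average_per_game(lwts_runs, pa), 3))
```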


#13    Doug Dennis      (see all posts) 2008/12/02 (Tue) @ 05:04

TT—

I have been a fan of yours (and MGL) for quite some time.

I am curious what advantage wOBA has over RC/9.  I am sure there must be one, but I cannot figure out on my own what it is.

thanks~

Doug D.


#14    Tangotiger      (see all posts) 2008/12/02 (Tue) @ 05:16

What implementation of RC are you asking for exactly?

Presuming it’s the one where it’s the “player makes up his whole team”, then RC suffers from the basic problem that it doesn’t match reality at all.  Bonds doesn’t get to drive himself in after walking.  Rather, he sets himself up for others to drive in, or he drives in others he sees.  That’s baseball.  Even James concedes to this by approaching it on a “Theoretical Team” aspect by (now) figuring RC “with/without” the player.

In short, the RC you are (probably) referring to is not in the same league as the RC from the Theoretical Team version, much less one compared to Linear Weights.


#15    Peter Jensen      (see all posts) 2008/12/02 (Tue) @ 10:01

I have finally gotten around to comparing linear weights to wOBA.  wOBA consistently underestimates the linear weights of players with positive linear weights and consistently overestimates the linear weights of players with negative linear weights.


#16    terpsfan101      (see all posts) 2008/12/02 (Tue) @ 10:42

Peter,

You need to use the exact multiplier when converting wOBA into runs above average. The multiplier scales the initial wOBA weights to the league OBP.

Let’s say that your initial wOBA weights are .280 and the league OBP is .335. The multiplier for the initial wOBA weights is then 1.2 (.335/.280).

Now, you need to use 1.2 when converting wOBA into runs above average. If you use 1.15, which is just a rule of thumb, then you will not get the same runs above average figure that you calculated with Linear Weights. If you use 1.2, you will get the exact same number that you calculated with Linear Weights.
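
A minimal Python sketch of this point follows. The .335 and .280 league figures and the 1.15 rule of thumb are from the comments above. Everything else is an assumption for illustration: the function names, the hypothetical hitter, and the premise that the unscaled weights are Linear Weights run values measured against the out, so that the gap in unscaled rate times PA is the Linear Weights runs above average.

```python
def woba_scale(league_obp, league_unscaled_woba):
    """Exact multiplier that pegs the league mean of wOBA to league OBP."""
    return league_obp / league_unscaled_woba


def runs_above_average(woba, league_woba, pa, scale):
    """Undo the OBP scaling, then convert the per-PA gap to runs."""
    return (woba - league_woba) / scale * pa


league_obp = 0.335
league_unscaled = 0.280
scale = woba_scale(league_obp, league_unscaled)  # about 1.196

# Hypothetical hitter: unscaled rate of .330 over 650 PA.
unscaled, pa = 0.330, 650
woba = unscaled * scale
lwts_runs = (unscaled - league_unscaled) * pa  # Linear Weights runs above average

exact = runs_above_average(woba, league_obp, pa, scale)
rule_of_thumb = runs_above_average(woba, league_obp, pa, 1.15)
print(round(lwts_runs, 1), round(exact, 1), round(rule_of_thumb, 1))
```

With the exact multiplier the conversion returns the Linear Weights figure; with 1.15 it is off by the ratio of the two multipliers.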


#17    Tangotiger      (see all posts) 2008/12/02 (Tue) @ 13:29

Peter, you are not doing what I am doing.  In the thread “how to calculate wOBA” from last week, I provided (I think) the exact SQL, and the match was perfect.

Check it out…


#18    Peter Jensen      (see all posts) 2008/12/02 (Tue) @ 19:54

Tango - Thanks. I looked at the other thread and I think I see where I went wrong.


#19    Ben R      (see all posts) 2009/05/26 (Tue) @ 11:53

Tango,

Two questions.

1) When fangraphs and firstinning list wOBA for minor league players, are they using linear weights tailored to each particular league or just ML lwts?

2) Do you use any particular rule of thumb to relate changes in run environment to the value of certain types of hitters? (Something like: an elite hit-for-average guy becomes 5% more valuable per half a run increase in the environment.)  Or is the reality too complex for this type of breakdown?


#20    terpsfan101      (see all posts) 2009/05/26 (Tue) @ 13:18

1. I believe Fangraphs uses different sets of custom linear weights for each minor league. I just checked the 2008 International League, and the runs above average (RAA) summed to -9.5, and the PCL summed to -44 runs. Due to decimal truncation and special treatment of SH and IBB, they won’t sum to exactly zero.

2. I’ll let Tango answer this.


#21    Tangotiger      (see all posts) 2009/05/26 (Tue) @ 20:34

I believe the formula I posted and the one that Fangraphs uses both have the run value of each event move up and down in concert and to the same extent, except for the HR (fixed at 1.4) and SB (fixed at 0.2).

Nonetheless, the runs per win value changes with the run environment, so the event whose win value is most stable will be the out.

