THE BOOK--Playing The Percentages In Baseball

Tuesday, December 19, 2006

Why is EqA so complicated?

By Tangotiger, 12:31 PM

Thanks to Patriot, here is how EqA works:


1. Start off with:
H + TB + 1.5 (BB+HBP) + SB + SH + SF
-----------------------------------------------
AB + BB + HBP + SH + SF + CS + SB/3

call that RAW

2. Then, figure the same for the league, and call that lgRAW.

3. You also need to know the league Runs per PA. 

4. EQR = (2*RAW/LgRAW - 1)*(LgR/LgPA)*PA

5. EQA = (EQR/Out/5)^.4
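The five steps translate almost line for line into code. What follows is a minimal sketch of one reading of them, not Clay's own implementation. The definitions of PA and Out are assumptions (the steps do not spell them out), the function name is made up, and the batting line and league constants are the ones from the illustration further down this post.

```python
def eqa_steps(ab, h, doubles, triples, hr, bb, lg_raw, lg_runs_per_pa,
              hbp=0, sb=0, cs=0, sh=0, sf=0):
    """Return (RAW, EQR, EQA) for one batting line, following steps 1 to 5."""
    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr

    # Step 1: RAW
    raw = (h + tb + 1.5 * (bb + hbp) + sb + sh + sf) / (
        ab + bb + hbp + sh + sf + cs + sb / 3
    )

    # Assumed definitions: the post does not define PA or Out explicitly.
    pa = ab + bb + hbp + sh + sf
    outs = ab - h + cs + sh + sf

    # Step 4: EQR (steps 2 and 3 are the league inputs lg_raw, lg_runs_per_pa)
    eqr = (2 * raw / lg_raw - 1) * lg_runs_per_pa * pa

    # Step 5: EQA
    eqa = (eqr / outs / 5) ** 0.4
    return raw, eqr, eqa


raw, eqr, eqa = eqa_steps(ab=560, h=150, doubles=30, triples=3, hr=17, bb=60,
                          lg_raw=0.769, lg_runs_per_pa=0.123)
print(f"RAW={raw:.3f}  EQR={eqr:.1f}  EQA={eqa:.3f}")
```

With those inputs the sketch lands on an EQR of about 76.3 and an EQA of .268, the same baseline reported in the illustration below.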

There’s no one out there that can tell me what that is doing, nor why.  Except Clay and maybe Patriot.

Now, let’s also remember wOBA:
wOBA = (.72*BB+.90*1B+1.24*2B+1.56*3B+1.95*HR)/PA

In order to get wOBA onto the EQA scale, multiply it by some constant.  In the illustration that follows, wOBA will be multiplied by .798 to create wBA (weighted batting average).
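For comparison, here is the same kind of sketch for wOBA and wBA. The weights and the .798 scale are the ones quoted above. Taking PA as AB + BB is an assumption that only holds for a line with no HBP, SH or SF, and the function names are made up for the example.

```python
def woba(ab, h, doubles, triples, hr, bb):
    """Linear weights per PA, using the weights quoted in the post."""
    singles = h - doubles - triples - hr
    pa = ab + bb  # assumption: no HBP, SH or SF in the line
    weighted = (0.72 * bb + 0.90 * singles + 1.24 * doubles
                + 1.56 * triples + 1.95 * hr)
    return weighted / pa


def wba(ab, h, doubles, triples, hr, bb, scale=0.798):
    """wOBA rescaled to the batting-average scale."""
    return scale * woba(ab, h, doubles, triples, hr, bb)


line = dict(ab=560, h=150, doubles=30, triples=3, hr=17, bb=60)
print(f"wOBA={woba(**line):.3f}  wBA={wba(**line):.3f}")
```

One weighted sum and one division, against five steps for EqA.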

Anyway, here’s my illustration:
ab 560
h 150
2b 30
3b 3
hr 17
bb 60

This gives us a BA/OBP/SLG of .268/.339/.423.  Pretty standard.

In Clay’s equations, I set the lgRAW to .769, and the Runs per PA to .123.  In the end, I get a batting average, wBA, and EqA all as .268.

We’re baselined.

Now, let’s do something very simple.  Change the AB from 560 to 460.  The batting average is now .326, the wBA is .320, and Clay’s EqA is .318.  Change AB to 360, and we have .417, .396, .392, respectively.  Go the other way, and change AB to 660, to give us .227, .231, .229.  Make AB 1000, and we have .150, .157, .122.

Change AB to 1181, and we have .127, .134, and.... EqA breaks.
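The AB sweep is easy to replay. Going by the formulas as given, EqA breaks once 2*RAW/LgRAW - 1 goes negative (RAW below half of LgRAW): EQR is then negative, and a negative number raised to the .4 power has no real value (Python would hand back a complex number). The sketch below guards that case by returning None. The function name and the guard are mine, and PA = AB + BB and Out = AB - H are assumptions that fit this simplified line.

```python
def rates(ab, h=150, doubles=30, triples=3, hr=17, bb=60,
          lg_raw=0.769, lg_runs_per_pa=0.123, wba_scale=0.798):
    """Return (BA, wBA, EqA); EqA is None when EQR is not positive."""
    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    pa = ab + bb      # assumption: no HBP, SH, SF
    outs = ab - h     # assumption: no CS, SH, SF

    ba = h / ab
    wba = wba_scale * (0.72 * bb + 0.90 * singles + 1.24 * doubles
                       + 1.56 * triples + 1.95 * hr) / pa

    raw = (h + tb + 1.5 * bb) / pa
    eqr = (2 * raw / lg_raw - 1) * lg_runs_per_pa * pa
    eqa = (eqr / outs / 5) ** 0.4 if eqr > 0 else None
    return ba, wba, eqa


for ab in (560, 460, 360, 660, 1000, 1180, 1181):
    ba, wba, eqa = rates(ab)
    eqa_text = f"{eqa:.3f}" if eqa is not None else "undefined"
    print(f"AB={ab:<5} BA={ba:.3f}  wBA={wba:.3f}  EqA={eqa_text}")
```

With these league constants, AB=1180 still yields a (tiny) positive EQR, and AB=1181 is the first value where it turns negative.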

Let’s continue, and this time, I’ll play with the HR values.  Jump the HR from 17 to 37: wBA is .295, and EqA is .293.  How about a double-shot of keeping the HR to 37, and doubling the walks to 120?  wBA is .320 and EqA is .318.  And what if we also bring the total hits down to 100?  wBA is .267, and EqA is .... .267.
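The three HR/BB/H scenarios can be checked the same way. One assumption worth flagging: the quoted numbers only come out if AB stays at 560 and total hits stay at 150 when the HR go up (so the extra HR replace singles). That is my inference from the results, not something stated outright. The function name is made up.

```python
def wba_and_eqa(ab, h, doubles, triples, hr, bb,
                lg_raw=0.769, lg_runs_per_pa=0.123, wba_scale=0.798):
    """Return (wBA, EqA) for a line with no HBP, SB, CS, SH or SF."""
    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    pa = ab + bb
    outs = ab - h
    wba = wba_scale * (0.72 * bb + 0.90 * singles + 1.24 * doubles
                       + 1.56 * triples + 1.95 * hr) / pa
    raw = (h + tb + 1.5 * bb) / pa
    eqr = (2 * raw / lg_raw - 1) * lg_runs_per_pa * pa
    return wba, (eqr / outs / 5) ** 0.4


scenarios = {
    "HR 17 -> 37":          dict(ab=560, h=150, doubles=30, triples=3, hr=37, bb=60),
    "plus BB 60 -> 120":    dict(ab=560, h=150, doubles=30, triples=3, hr=37, bb=120),
    "plus H 150 -> 100":    dict(ab=560, h=100, doubles=30, triples=3, hr=37, bb=120),
}
for label, line in scenarios.items():
    wba, eqa = wba_and_eqa(**line)
    print(f"{label:<20} wBA={wba:.3f}  EqA={eqa:.3f}")
```

Under those assumptions the pairs come out at .295/.293, .320/.318 and .267/.267.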

So, let’s recap.  EqA and wBA track each other very well.  Extremely close.  Except at the extremes, where EqA breaks.  EqA is extremely complicated.  Not only is it complicated, but it’s not even technically sound.  wOBA on the other hand is extremely simple, and provides results that are similar to EqA, when EqA works.

wOBA is nothing more than Linear Weights, expressed on the OBP (or BA) scale.

My question: why in the world would anyone use the machinations of EqA, if you can get there in much smaller steps?

#1    Tangotiger      (see all posts) 2006/12/19 (Tue) @ 13:05

Oh, I also should report modified OPS as:
(1.8*OBP + SLG), all times some multiplier.  In my illustrations above, that multiplier is set to .2595.

Anyway, when EqA is .318, modified OPS is .322.  When EqA is .392, the other is .404.

When EqA is .229, the other is .229.  When EqA is .122, the other is .154 (if you remember, wBA is .157, and the actual BA is .150). 

And when EqA breaks, modified OPS is .131, which compares to wBA of .134, and actual BA of .127.
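A quick sketch of the modified OPS figures, for anyone who wants to replay them. It assumes OBP = (H + BB) / (AB + BB), which fits the sample line because it has no HBP or SF. The function name is made up.

```python
def modified_ops(ab, h=150, doubles=30, triples=3, hr=17, bb=60,
                 multiplier=0.2595):
    """(1.8 * OBP + SLG) * multiplier for the sample batting line."""
    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    obp = (h + bb) / (ab + bb)  # assumption: no HBP or SF
    slg = tb / ab
    return (1.8 * obp + slg) * multiplier


for ab in (560, 460, 360, 660, 1000, 1181):
    print(f"AB={ab:<5} modified OPS={modified_ops(ab):.3f}")
```

Unlike EqA, nothing in this expression can go negative, so it keeps producing a number at AB=1181.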


#2    David Cameron      (see all posts) 2006/12/19 (Tue) @ 13:51

My question: why in the world would anyone use the machinations of EqA, if you can get there in much smaller steps?

Real world answer? BP has worked very hard to get their branded stats into the forefront of the statistical lexicon so that they can continue to drive traffic and subscribers to their site and make a lot of money.

It’s the same reason you still see them using WARP in their books and articles, even though they all know it’s broken to the point of near uselessness.  BP wants to make money, and pushing BP stats, regardless of their effectiveness relative to other open-source metrics, is good for their bottom line.


#3    Tangotiger      (see all posts) 2006/12/19 (Tue) @ 15:27

Maybe that’s what BP believes, but I doubt that this belief is true.  At the other end is b-r.com which pretty much is open source.  You could in fact re-create b-r.com, character by character (but who’d want to?).

b-r.com has its own problems, with its reliance on Bill James and Pete Palmer equations that are severely outdated.  One example is the sim scores.  I have my own way to do it, and it’s very very close to the way studes did it in, I think, the first THT annual.  And it’s the way a true mathematician (like Sean) would want to do it anyway. 

Pitcher *and* hitter park factors?  Yeesh.  Given that we actually have PBP for 1957-2006, is there any reason whatsoever not to have LH and RH park factors?  Or park factors for strong and weak hitters?  Fast and slow?  GB and FB hitters?  I’d hate to see a park factor that applies the same to Barry Bonds and Neifi Perez.

MGL actually opened the UZR black box with his 2-part article on Primer a few years ago.  Again, you can recreate his UZR, if you have the PBP data, with hit locations.  No one’s done it.  I even released the Marcel equations, and I don’t know how many people bother to code it.  (Takes a max of three hours, and 30 minutes if you are fast.)

Opening up the black box will not cause a single dent on anyone’s bottom line.


#4    Patriot      (see all posts) 2006/12/19 (Tue) @ 16:11

I agree with Tango that “opening up the black box” won’t hurt anybody’s profit, but I also am not sure it’s fair to characterize BP as such in this case.  For one thing, they have accepted Pythagenpat rather than their own Pythagenport, despite the fact that for the normal major league contexts, the two will give almost identical results.

Additionally, EQR is not that bad.  I’ve said this many times, but it’s really just a linear weight formula, with a bizarre construction that obfuscates that (and makes SB and CS the only source of non-linearity, which is clearly wrong, but not fatal because the distortions aren’t big).  Now it is of course very awkward when you can do a simple and clear LW formula and get the same results.  But if someone offered me a list of EQR or a list of RC (as B-R does), I’d take the EQR list every time.

EQA on the other hand, is atrocious, because it distorts the scale of everything.  A .200 EQA is not to a .400 EQA as .2 runs/out is to .4 runs/out, and that’s a real pain in the butt.  And even if they want the BA scale, wOBA or GPA or what have you can get the scale down without distorting the relationship between the rate stat and run creation.


#5    tangotiger      (see all posts) 2006/12/19 (Tue) @ 16:45

Patriot, agreed that BP has made strides.  They have accepted your PythagenPat, since it is just as accurate in the normal range, better than theirs at the extreme, and a much simpler construction.  Woolner did the same in accepting the Tango Distribution (fits all three criteria).

I agree that the “list” of EqA is of course superior to b-r.com’s RC.  My point is simply that if you can go from “a” to “b”, why take a detour to c,d,e,f,g, if it doesn’t get you there any closer, faster, or better?  EqA is Linear Weights, and there’s no reason to hide it.

***

As for the scaling/distortion, I’m not sure I agree with you.  Going back to my illustration, when I have AB set to 560 (or 410 outs, 620 PA), EqR is 76.3, and EqA is .268.  If I change AB to 460 (or 310 outs, 520 PA), EqR is 88.6 and EqA is .318.

wOBA is .336 in the first case, and .400 in the second case.  The difference in wOBA is .064, which, if you divide by 1.15, gives us .056 runs per PA.  At 620 PA, that’s 34 runs.

EqR was 76.3 per 620 PA (and 410 outs) and 88.6 per 520 PA (and 310 outs), or 105.6 per 620 PA (and 370 outs).  That difference is 29.3 runs, and 40 fewer outs.  (Those outs would be worth around .186, or 76/410 runs each, so 29.3+7.4 = 36.7).  That +36.7 is about 8% higher than the +34 of wOBA.
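The proration in the last two paragraphs is simple arithmetic, sketched here using only the rounded figures quoted in this comment. The variable names are mine. Because the comment rounds along the way, the unrounded results differ from the quoted ones by a few tenths.

```python
# Figures as quoted above
eqr_a, pa_a, outs_a = 76.3, 620, 410   # AB = 560
eqr_b, pa_b, outs_b = 88.6, 520, 310   # AB = 460

# Put the second line on the same 620 PA as the first
scale = pa_a / pa_b
eqr_b_620 = eqr_b * scale              # about 105.6 runs
outs_b_620 = outs_b * scale            # about 370 outs

# EqR side: run gap plus the value of the outs not used
run_gap = eqr_b_620 - eqr_a            # about 29.3
runs_per_out = eqr_a / outs_a          # about .186
outs_value = (outs_a - outs_b_620) * runs_per_out
eqr_side = run_gap + outs_value

# wOBA side: wOBA gap divided by 1.15, times PA
woba_side = (0.400 - 0.336) / 1.15 * pa_a

print(f"EqR side:  +{eqr_side:.1f} runs per {pa_a} PA")
print(f"wOBA side: +{woba_side:.1f} runs per {pa_a} PA")
```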

***

The conversion of EqA to runs is pretty close to PA * (EqA - league average).  In my illustration, that’s 620 * (.318 - .268) = +31


#6    tangotiger      (see all posts) 2006/12/19 (Tue) @ 16:52

Actually, multiply that final EqA to runs equation by 1.10 (that is, add 10%), and we’re done.

Why?  Because EqA to wOBA is multiplied by 1.27.  wOBA to runs divides by 1.15.  1.27/1.15 = 1.10.
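Put together, the shortcut reads as a one-liner. This is a sketch with a made-up function name; the 1.27 and 1.15 factors are the ones quoted above, and .268 is the league-average EqA in the illustration.

```python
def runs_above_average(eqa, pa, lg_eqa=0.268,
                       eqa_to_woba=1.27, woba_to_runs_divisor=1.15):
    """Approximate runs above average from EqA: PA * (EqA - lgEqA) * ~1.10."""
    return pa * (eqa - lg_eqa) * eqa_to_woba / woba_to_runs_divisor


print(f"{runs_above_average(0.318, 620):+.1f} runs")
```

For the .318 hitter over 620 PA this gives a little over +34 runs, in line with the wOBA-based figure in the previous comment.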


#7    Patriot      (see all posts) 2006/12/20 (Wed) @ 11:25

Tango said:

I agree that the “list” of EqA is of course superior to b-r.com’s RC.  My point is simply that if you can go from “a” to “b”, why take a detour to c,d,e,f,g, if it doesn’t get you there any closer, faster, or better?  EqA is Linear Weights, and there’s no reason to hide it.

I agree with this 100%.

My point about EQA distorting related to the rate part itself, not to EQR.  You may remember when Chris Dial published a replacement level article on Primer in which he found that his replacement group hit at 90% of the league average EQA, and how this was surprisingly high compared to the 75% of league average hitting or so usually given for replacement level.  But in fact, because of the EQA scale distortion, hitting at 90% of the league average EQA is equivalent to hitting at 77% of the league average r/o, so there was nothing surprising about the findings at all.  The scale just made it easy to misinterpret them.
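The 90%-versus-77% figure falls straight out of step 5 of the formula at the top of the post: if EQA = (R/O / 5)^.4, then R/O = 5 * EQA^2.5, so a ratio of EqAs turns into that ratio raised to the 2.5 power. A small sketch (function name is mine; the .260 league EqA is only a placeholder, since the ratio does not depend on it):

```python
def runs_per_out(eqa):
    """Invert step 5: EQA = (R/O / 5) ** 0.4  =>  R/O = 5 * EQA ** 2.5."""
    return 5 * eqa ** 2.5


lg_eqa = 0.260  # placeholder league level; it cancels out of the ratio
ratio = runs_per_out(0.90 * lg_eqa) / runs_per_out(lg_eqa)
print(f"90% of league EqA -> {ratio:.0%} of league runs per out")

# Going the other way: 75% of league runs per out, expressed on the EqA scale
print(f"75% of league r/o -> {0.75 ** 0.4:.0%} of league EqA")
```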


#8    Tangotiger      (see all posts) 2006/12/20 (Wed) @ 11:39

Patriot, agreed.

In my illustration a .268 EqA is league average, or 76.3 RC per 620 PA.  The .318 EqA is +34 runs, which is 110.3 RC per 620 PA (with the “outs” adjusted, if you know what I mean).  So, the “magnitude” is 110.3/76.3 or 1.45x higher.

The differential in runs is 34 per 620, or .055 runs per PA (which is not too terribly different from the .318 minus .268).

In order to keep both of these alive (the differential of .055 runs per PA, and the 1.45x magnitude), you need to set the baseline to .123.

That .123 is no accident, as it represents the absolute runs per PA.  So, if you want to express it as a “rate”, you need to show it as:
.123 runs per PA
.178 runs per PA

This preserves the differential (.055), and the magnitude (1.45).

All Clay did, basically, was simply add +.137 to this number to force it to .260.  (Not exactly correct, since the differential doesn’t represent runs per PA on a 1:1 basis.)
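A short sketch of why the baseline has to be .123, and of what a constant shift does to the scale. The run totals are the ones quoted in this comment. The +.137 shift is the rough description given above, not Clay's actual construction, and the variable names are mine.

```python
pa = 620
avg_runs, better_runs = 76.3, 110.3      # RC per 620 PA, from the comment

baseline = avg_runs / pa                  # absolute runs per PA, about .123
better = better_runs / pa                 # about .178
differential = better - baseline          # about .055
magnitude = better / baseline             # about 1.45

# Adding a constant keeps the differential but changes the ratio
shift = 0.137
shifted_ratio = (better + shift) / (baseline + shift)

print(f"baseline={baseline:.3f}  better={better:.3f}")
print(f"differential={differential:.3f}  magnitude={magnitude:.2f}x")
print(f"after +{shift}: ratio={shifted_ratio:.2f}x, differential unchanged")
```

Only the unshifted runs-per-PA scale carries both the .055 gap and the 1.45x ratio. Once the constant is added, the gap survives but the ratio shrinks to roughly 1.2x.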


#9    Tangotiger      (see all posts) 2006/12/20 (Wed) @ 11:47

Going back to the “80% replacement level”, this is what it means:

Let’s assume that runs per PA is .125, so that 80% of that is .100.  So, the replacement level is .025 runs per PA below average.

We also know that:
(EqA - .260) * 1.10 = runs per PA = -.025

That makes EqA replacement level .237.  And .237/.260=91%.

So, roughly speaking in this illustration, and as you are echoing, 80% replacement level in RC is 90% replacement level in EqA.
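The same conversion as a sketch, using the numbers in this comment (.125 league runs per PA, .260 league EqA, and the 1.10 factor from earlier in the thread). The function and parameter names are made up for the example.

```python
def replacement_eqa(lg_runs_per_pa=0.125, repl_share=0.80,
                    lg_eqa=0.260, runs_per_eqa_point=1.10):
    """Solve (EqA - lg_eqa) * 1.10 = runs-per-PA deficit for EqA."""
    deficit = lg_runs_per_pa * (repl_share - 1)   # -.025 runs per PA
    return lg_eqa + deficit / runs_per_eqa_point


eqa = replacement_eqa()
print(f"replacement EqA={eqa:.3f}, or {eqa / 0.260:.0%} of league EqA")
```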


#10    Los Angeles Waterloo of Black Hawk      (see all posts) 2007/01/02 (Tue) @ 20:28

Just coming back to this for a second:

My question: why in the world would anyone use the machinations of EqA, if you can get there in much smaller steps?

On the consumer end, of course, EqA is far easier to access than wOBA.  All I have to do to get EqA is click.  What’s more, it’s park-adjusted. 

I certainly accept that wOBA works better than EqA.  But if I just want a quick-and-dirty on how good a hitter/basestealer is, EqA is easy to find and will generally be right.

Pitcher *and* hitter park factors?  Yeesh.  Given that we actually have PBP for 1957-2006, is there any reason whatsoever not to have LH and RH park factors?  Or park factors for strong and weak hitters?  Fast and slow?  GB and FB hitters?  I’d hate to see a park factor that applies the same to Barry Bonds and Neifi Perez.

Well, that depends on what you want to use your park factors for, I believe.  If you want park factors to aid you in being predictive, all of these other factors will be helpful.  But if you’re trying to understand what a player has already contributed, in his run environment, then all you want is the park factor for runs scored.  The separate park factors for hitters and pitchers reflect that, as they (in theory, anyway) remove the biases of pitchers and hitters on the same team not facing each other (e.g. not only do the Chavez Ravine opponents in 1968 have to deal with an extreme pitchers’ park, they also have to deal with Sandy Koufax and Don Drysdale, which their teammates do not).


#11    tangotiger      (see all posts) 2007/01/02 (Tue) @ 23:22

I’m not arguing about doing away with EqA as a concept, nor with using wOBA to supplant it.  Simply to not make EqA so darn complicated.  EqA is linear weights.  Why not simply reflect that, rather than doing the 5-step process? 

Most of EqA is smoke and mirrors.  Go from step A to step B, without going to steps c,d,e,f,g.  That’s all I’m calling for.  Remove the smoke and mirrors, remove all the middle steps.  Simply make it… simple.


#12    Tangotiger      (see all posts) 2007/01/03 (Wed) @ 15:23

As for the “park” factors.

Obviously, this points to a misunderstanding on my part of what a “park” factor means, and to a general complaint I have when everything is rolled into one.  So, we have a park factor, a “not playing against my own teammates” factor, a strength of schedule factor, and a strength of league factor.  Calling any combination of those a “park” factor is silly.

Even if you want to apply a “not playing against my own teammates” factor, that factor needs to be based on the true talent level of the players, and certainly not on a single-season performance of the teammates. 

And, if you are going to do a “strength of team” adjustment, why not “strength of opponent” as well, especially with as unbalanced a schedule as we have?

They should all be kept separate, as individual factors.  My main point stands that we should be calculating everything better, and since we have the data, we can.

I’ll point to my article on parks here, so as to not repeat anything else I want to say:
http://www.tangotiger.net/parks.html


#13    Rally      (see all posts) 2007/01/03 (Wed) @ 16:38

When was that parks article written?

If anything, McGwire hitting 60 HR at Coors seems conservative, not ridiculous.


#14    Tangotiger      (see all posts) 2007/01/03 (Wed) @ 16:42

Don’t forget the title of that section: “Practical Example”.  That was an illustration. 

Regardless, if McGwire hits 30 HR in 250 AB-K at the Astrodome, how many do you expect him to hit at Coors?  The article is pretty clear as to why you can’t just double the number.


#15    Joe Dimino      (see all posts) 2007/01/13 (Sat) @ 05:11

If BPro has accepted PythagenPat, why haven’t they applied it to WARP?

Within WARP they are still using the 2.0 exponent when comparing a pitcher’s runs allowed to league average. You can see this because guys like Koufax at the extremes are still being underrated.

This bothered me so much that I’ve reworked the entire thing for my own pitcher ratings for the Hall of Merit. IMO it’s a fatal flaw in WARP’s pitcher ratings.


#16    Joe Dimino      (see all posts) 2007/01/13 (Sat) @ 05:15

Tango (or anyone else) if you were going to use one out of the box easily accessible ‘context’ factor (as opposed to park, for the reasons you mentioned above), what would it be? Going back to 1871.

I’ve been using the baseball-reference factors because they break out pitching and hitting and adjust for not facing your own team. But if there’s something better that can be applied to pitchers, I’m all ears.

I don’t have the time to figure them myself, I just need something that I can take and run with . . .

Thanks for the help.


#17    David Gassko      (see all posts) 2007/01/13 (Sat) @ 12:16

Joe,

Do you have Michael Schell’s “Baseball’s All-Time Best Sluggers”? It’s a great book (a must-own, in fact), and in one of the appendices he lists component park factors for every park and year ever.


#18    Tangotiger      (see all posts) 2007/01/15 (Mon) @ 23:58

Joe, I don’t think I can help you on this one.

Btw, I noticed your comments on BTF about your disagreement with the replacement level being different for starters and relievers.  I’m assuming you read the chapter in The Book on how the same pitcher gets much better numbers when relieving than starting.  Therefore, if you create a “replacement level pitcher” (which I agree with… I don’t know, Jose Lima at some point in his career), you then have to assess how this pitcher would perform as both a starter and reliever.  And it is that level (same pitcher, different numbers) that becomes your replacement level.

Obviously, it’s not just Jose Lima.  But, get yourself a group of replacement level pitchers, and those pitchers will get say a .350 OBP as a reliever, and the same pitchers will get a .380 OBP as a starter. 

(All numbers for illustration only.)


#19    tangotiger      (see all posts) 2007/01/16 (Tue) @ 09:16

Let me also add that you do have “different” replacement levels for the Astrodome as you would for Coors.

My point is that all the starter/reliever thing is an adjustment factor, just like for parks.

And the fact remains that the replacement pitcher is exactly the same, be it the replacement pitcher as a reliever at the Astrodome, or the replacement pitcher as a starter at Coors.  The sole question is: what kind of numbers would a replacement level pitcher put up in a particular role, against a particular competition, at a particular park.


#20    Joe Dimino      (see all posts) 2007/01/16 (Tue) @ 14:38

Tango, my big issue is quality leakage. I read the study, back when I bought The Book, but haven’t had a chance to re-read it since your post.

But the way it seems all of these studies have been set up is to compare guys who did both. The problem is, since just about everyone begins as a starter, and only relieves if they aren’t good enough, I think we end up with serious quality leakage. Those studies tend to only look at failed starters, who then make it as relievers. If they failed as a starter and failed as a reliever they wouldn’t be in the study. I think this distorts the overall effect.

Also, if you just add the BPro stat “Bequeathed Runs Prevented” to a reliever’s run total, this makes up a significant percentage of the difference for the reliever as opposed to a starter (this number is positive lifetime for just about every reliever I’ve looked at).

So what I do is add the BRP stat to pitcher run totals (I also adjust for the Bullpen Support stat and Inherited Runners Prevented). I adjust for leverage too.

But the way I see it, even if there were an advantage besides the one that comes from pitching partial innings (which is what BRP stat measures) . . .  In deciding whether a pitcher should be a starter or a reliever, managers make a conscious decision to trade innings for leverage and effectiveness. So, when ranking pitchers for historical purposes, I think it should all be considered pitching in terms of a replacement level. The starters have the advantage of more innings, the relievers have higher effectiveness (mostly because of the partial innings they pitch) and greater leverage.

I could be completely missing something here, if I am, please help me see the light.


#21    Tangotiger      (see all posts) 2007/01/16 (Tue) @ 15:07

You should re-read the relevant part of The Book (Table 84).  I definitely tackled the selective sampling issue you described head-on, though perhaps not to a firm conclusion.  However, the process laid out is good enough that someone who wanted to take an even more serious look at it can do so.

You are wrong that the bulk of the issue is the stranded/inherited runners.  In The Book, I didn’t even bother to look at that, as I focused simply on the peripherals.

Without question, the average starter is better than the average reliever.  Giving a common replacement rate of numbers simply makes no sense.  You however hide that fact because of all the extra innings a starter throws. 

That is, I have the replacement win% of a pitcher as a starter at .380, and the same pitcher as a reliever is .470.  But, because you use say 180 IP for a starter and 70 IP for a reliever, you can have a .410 replacement level for both roles, and it still looks pretty good for the starter overall.
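To see how the innings hide the difference, here is a rough sketch. The wins-above-replacement formula, (win% - replacement win%) * IP / 9, is the usual back-of-envelope version and not something taken from The Book, and the .500 pitcher is purely hypothetical. The .380, .470 and .410 levels and the 180 and 70 IP are the ones in this comment.

```python
def wins_above_replacement(win_pct, repl_win_pct, ip):
    """Back-of-envelope WAR: win% margin over replacement, per 9 innings."""
    return (win_pct - repl_win_pct) * ip / 9


pitcher_win_pct = 0.500  # hypothetical: performs at .500 in whichever role
roles = {"starter": (0.380, 180), "reliever": (0.470, 70)}
common_level = 0.410

for role, (role_level, ip) in roles.items():
    by_role = wins_above_replacement(pitcher_win_pct, role_level, ip)
    common = wins_above_replacement(pitcher_win_pct, common_level, ip)
    print(f"{role:<9} role-specific: {by_role:+.2f}   common .410: {common:+.2f}")
```

In this toy case the common level shaves the starter from about 2.4 wins to 1.8, which still looks fine because of the 180 innings, while the reliever's credit roughly triples.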


#22    Tangotiger      (see all posts) 2007/01/16 (Tue) @ 15:08

Oh, and the average leverage for a starter is around .98 and for a reliever is around 1.04.  Hardly a difference that makes much impact.


#23    Joe Dimino      (see all posts) 2007/01/16 (Tue) @ 15:16

I’ll reread it.

The average leverage for a 1970s - 1980s relief ace, from what I’ve seen is about 1.3-1.4, with some guys being much higher (Sutter/Fingers/Gossage). I haven’t really looked at anyone that didn’t at least have a few seasons as an ace.


#24    Joe Dimino      (see all posts) 2007/01/16 (Tue) @ 15:21

OK - let’s say I give in completely, and assume this is the proper way to handle it - how far back does this apply? I assume (again, not about to grab The Book right now) that this was studied 1999-2002 like most of the studies in The Book.

Do the same rates apply back into the 1970s? The 1950s? The deadball era? Some guys, like 3-Finger Brown, have significant relief innings, even back then.

Do you think this is universal, or only something that describes the late 90s?

Reliever ERA+’s are as good now as they’ve ever been - since they were lower in the earlier years (and closer to starter ERA+’s) does that mean the replacement level was also lower?


#25    Joe Dimino      (see all posts) 2007/01/16 (Tue) @ 15:25

“Without question, the average starter is better than the average reliever.  Giving a common replacement rate of numbers simply makes no sense.  You however hide that fact because of all the extra innings a starter throws.”

Then why does the average reliever give up fewer runs? If by better you mean more durable, I definitely agree. But better in terms of being able to prevent runs on the mound?

“I have the replacement win% of a pitcher as a starter at .380, and the same pitcher as a reliever is .470.”

Again, that tells me the relievers are better, not the starters. Wouldn’t the same apply to a catcher’s offense, vs. a CFs (just using an example, I know CF hit a little better than .470). And we know CF are better hitters than catchers.


#26    Joe Dimino      (see all posts) 2007/01/16 (Tue) @ 15:26

“But better in terms of being able to prevent runs on the mound?”

Should say WHEN on the mound . . . oops.


#27    Tangotiger      (see all posts) 2007/01/16 (Tue) @ 16:09

The relief ace LI is 1.7 to 2.0.  The bullpen overall is close to 1.0.

***

I only looked at a few players in the 70s.  My guess is that you may see the same pattern.  I wrote this on another site:

There is a mountain of difference in reliever/starter performance for the same pitcher these days. We are talking about 1 run per 9 IP, or about .090 wins per game. That is, a pitcher who is a “.450” pitcher as a starter would be a “.540” pitcher as a reliever.

This is true for 1999-2002, and likely true for post 2002 as well, and probably true for post 1986.

However, I do not know what that figure is for the 60s-80s.

Just picking at some random players:
Bob Stanley
http://www.retrosheet.org/boxesetc/Lstanb0010.htm
Struck out 3 times as many batters as a reliever, even though he only faced twice the number of batters. ERA difference of 1.12.

Woodie Fryman:
http://www.retrosheet.org/boxesetc/Lfrymw1010.htm
Struck out 14% more batters in relief than as a starter, even though over half his relief games were as a 39+ year old. ERA difference of 1.08.

These two prove nothing of course. But, it should certainly mean that someone should expand the study I did in The Book.

If you missed it, you may want to digest this:
http://www.insidethebook.com/ee/index.php/site/comments/john_smoltz/

***

As for the reason they give up fewer runs: that’s what I’m trying to say.  They have an easier job of it.  Essentially, a starter has one hand tied behind his back.

Something similar happens with DH and PH.  It’s so much harder to do that, than to be a position player.  There’s a *huge* PH penalty.

The average reliever has an LI of 1.0.  The average starter has an LI of 1.0.  The average starter has 2 to 3 times more innings.  Why is that?  Because he’s better.

Model that reality.

