THE BOOK--Playing The Percentages In Baseball


Thursday, January 10, 2008

Linear Weights Ratio, Total Average, Estimated Runs Produced…

By Tangotiger, 11:49 AM

Studes checks in with a quick way to evaluate a seasonal batting line.  For you Thomas Boswell fans, it’s basically a weighted version of Total Average.  It also mimics my Linear Weights Ratio, but uses easier-to-remember weights (if you take my numbers and multiply by 3, you get close to Studes’ numbers).  And, for you Paul Johnson fans, Studes’ weights are the same as those in Estimated Runs Produced.

Basically, all of us offer some form of Linear Weights, in about as easy a form as we can.  You have bases on one side and outs on the other.


#1    david smyth      2008/01/10 (Thu) @ 18:47

The interesting thing to me is that, if you have enough data, counting the bases (and outs) is all you need. IOW, run scoring only appears to be non-linear because of lack of data. If you have all of the bases gained by a team, and all of the bases lost (from OOB or stranded at end of inning), find the difference, divide by 4, and you have runs scored.

So, it’s not really BaseRuns that has the correct equation, it’s Total Average. BsR simply does a better job of compensating for lack of full data.
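As a rough sketch of the bookkeeping described above: the inning below is made up purely for illustration, and the function name `runs_from_bases` is a hypothetical helper, not anything from the post. It only shows that net bases divided by 4 matches the runs scored in this one toy case.

```python
def runs_from_bases(bases_gained, bases_lost):
    """Net bases divided by 4, as described in the comment above."""
    return (bases_gained - bases_lost) / 4

# Toy inning (invented for illustration):
#   single:   batter +1
#   double:   batter +2, runner goes first to third +2
#   home run: batter +4, runner on third +1, runner on second +2
#   single:   batter +1, then stranded on first when the inning ends (-1)
bases_gained = 1 + (2 + 2) + (4 + 1 + 2) + 1
bases_lost = 1  # the stranded runner gives back the one base he held

# Three runners crossed the plate on the home run.
print(runs_from_bases(bases_gained, bases_lost))
```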


#2    david smyth      2008/01/10 (Thu) @ 19:09

I should mention that my fave batter stat of the moment is a simple variation of Tango’s wOBA:

(2.5*H + 2*BB + XB)/(AB + BB)

is the basic version. (XB of course is TB-H).

It nails an individual batter’s contribution to a team (since there is no self-interaction). It also nails and delineates the 3 major categories of production (hits, walks, and extra bases).
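A minimal sketch of the basic version in code. The function name `smyth_rate` and the batting line are assumptions made up for illustration, not real player data.

```python
def smyth_rate(h, bb, tb, ab):
    """(2.5*H + 2*BB + XB) / (AB + BB), where XB = TB - H."""
    xb = tb - h  # extra bases
    return (2.5 * h + 2 * bb + xb) / (ab + bb)

# Hypothetical batting line, invented for illustration only.
rate = smyth_rate(h=150, bb=60, tb=240, ab=540)
print(round(rate, 3))  # 0.975
```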


#3    studes      2008/01/11 (Fri) @ 08:31

That’s pretty interesting, David.  The “geometric” argument, to me, has always been something like this: If a team gets, say, five total bases in a game, they’re likely to score zero runs.  But if they get ten, they’re likely to score, maybe, two.  And if they get 15, they’re likely to score five.  (made-up numbers!) IOW, the natural grouping of events drives a geometric increase in scoring.  It’s always made sense to me.

However, James himself (who pushed the geometric point of view) seemed to back off that position in Win Shares, when he admitted that a straight line works within all “normal” major league cases, i.e., when he used the 52% minimum and formed a straight line from there.

Anyway, I’d be interested in understanding more about your thinking and research.


#4    Tangotiger      2008/01/11 (Fri) @ 10:27

Cool David.

This is what David did:
starts with
0.72 BB
0.90 1B
1.26 2B
1.62 3B
1.98 HR

All these numbers are very close to the wOBA coefficients.
The above is EXACTLY the same as:
0.72 BB
0.90 H
0.36 2B
0.72 3B
1.08 HR

Which is exactly the same as:
0.72 BB
0.90 H
0.36 (TB-H)

Dividing all those numbers by 0.36 gets you:
2 BB
2.5 H
1 (TB-H)

Or
2.0 BB
1.5 H
1.0 TB

Interestingly, the league average is close to 1.000, which is useful to know.
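The chain of rewrites above can be checked numerically. This is only a sketch: the function names and the batting line are hypothetical, and the check simply confirms that the per-event weights and the collapsed 0.36 * (2 BB + 1.5 H + TB) form give the same total.

```python
import math

def per_event_weights(bb, singles, doubles, triples, hr):
    """Starting point: one weight per event."""
    return (0.72 * bb + 0.90 * singles + 1.26 * doubles
            + 1.62 * triples + 1.98 * hr)

def collapsed_weights(bb, singles, doubles, triples, hr):
    """Final form: 0.36 * (2.0*BB + 1.5*H + 1.0*TB)."""
    h = singles + doubles + triples + hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    return 0.36 * (2.0 * bb + 1.5 * h + 1.0 * tb)

# Hypothetical batting line, invented for illustration only.
line = dict(bb=60, singles=100, doubles=30, triples=5, hr=15)
a = per_event_weights(**line)
b = collapsed_weights(**line)
print(math.isclose(a, b))  # True
```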


#5    david smyth      2008/01/11 (Fri) @ 18:13

“Anyway, I’d be interested in understanding more about your thinking and research.”

Thanks, studes, but I wouldn’t call anything I do ‘research’ by current standards. I just play with numbers on a yellow pad with a calculator.

Anyway, to get a run you need 4 bases. Count the net bases, divide by 4, and you have runs, exactly. The only reason that these ‘geometric’ formulas were needed, and also the weighted linear formulas like XR, is lack of complete data. With PBP, the complete base data is available. So, why are we still messing around with things like 1b=.475? Because, while counting bases on the team level is straightforward, on the individual level the attribution of each base (primarily baserunner advancement) is a bit problematic. I’m not sure if you have seen it, but an analyst named (help me here) Bill Hubble? (or something like that) published a 13 (or so) article series on this, called Base Production, in the 1990s. Hopefully his fine, ahead-of-its-time work is still available on the net somewhere. Another early analyst who was onto this was Travis Jackson in the mid-eighties in his 2 “The Last Word in baseball statistics” books.


#6    david smyth      2008/01/11 (Fri) @ 19:08

Oh yeah, it was Tuttle, not Hubble. I think he used overall avg rates to divide the credit for runner advancement between hitter and runner. But with the PBP, couldn’t this be refined so that if it’s a med. hard single to RF slice x by a lefty batter against a righty pitcher, and the RF has X arm rating, etc., the credit for going to 3rd is much more fairly divided between runner and batter, on an individual play basis?

Not that I really care for this level of micro-analysis.


#7    Tangotiger      2008/09/10 (Wed) @ 12:23

By the way, I like David’s equation.  You can also express it as:

Plus/Minus
= (TB+1.5*H+2*BB-PA) * .36

The term within the parens is close to zero for the league.  You could also include 0.5*SB-CS.

***

This is somewhat similar to EqR which if you work it out is something like:
Total Runs
= (2.5*TB+2.5*H+3.75*BB-PA)*.12

Which if you divide the inner terms by 2.5 and multiply the .12 by 2.5 you get:
Total Runs
= (TB+H+1.5*BB-PA*.4)*.3

In order to get it as Plus/Minus, you need to subtract .18 runs per out
-.18Outs
= -.18*(PA-H-BB)
= .3*(-PA*.6+H*.6+BB*.6)

So, EqR as Plus/Minus
= (TB+H+1.5*BB-PA*.4)*.3 -.18Outs
= (TB+H+1.5*BB-PA*.4)*.3 + .3*(-PA*.6+H*.6+BB*.6)
= (TB+1.6*H+2.1*BB-PA)*.3

Compare that to David’s equation:
= (TB+1.5*H+2*BB-PA) * .36

Pretty much g-dd-mn the same thing.
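For anyone who wants to check the algebra, here is a small sketch. The function names and the batting line are hypothetical; the code only confirms that subtracting .18 runs per out from the total-runs form of EqR gives the collapsed Plus/Minus form, and then prints David’s version next to it for comparison.

```python
import math

def eqr_total_runs(tb, h, bb, pa):
    """EqR as total runs: (2.5*TB + 2.5*H + 3.75*BB - PA) * .12"""
    return (2.5 * tb + 2.5 * h + 3.75 * bb - pa) * 0.12

def eqr_plus_minus_long(tb, h, bb, pa):
    """Total runs minus .18 runs per out, with Outs = PA - H - BB."""
    return eqr_total_runs(tb, h, bb, pa) - 0.18 * (pa - h - bb)

def eqr_plus_minus_short(tb, h, bb, pa):
    """Collapsed form: (TB + 1.6*H + 2.1*BB - PA) * .3"""
    return (tb + 1.6 * h + 2.1 * bb - pa) * 0.3

def smyth_plus_minus(tb, h, bb, pa):
    """David's equation: (TB + 1.5*H + 2*BB - PA) * .36"""
    return (tb + 1.5 * h + 2 * bb - pa) * 0.36

# Hypothetical batting line, invented for illustration only.
line = dict(tb=240, h=150, bb=60, pa=600)
print(math.isclose(eqr_plus_minus_long(**line), eqr_plus_minus_short(**line)))
print(eqr_plus_minus_short(**line), smyth_plus_minus(**line))
```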

The issue with Clay’s equation is how he adjusts it for the run environment.  As I showed in another thread, it doesn’t work in extreme cases.  Otherwise, what Clay does with EqR is exactly wOBA.

I think his translation between EqR and EqA is unnecessarily complicated, since I’ve shown that you can get from wOBA to runs in a simple linear relationship.


#8    Tangotiger      2008/09/10 (Wed) @ 14:21

We can also try to compare EqA to OPS+.

The formula courtesy of Patriot’s blog:
EqR = (2*RAW/LgRAW - 1) * PA * Lg(R/PA)

where RAW = (H + TB + 1.5*(W + HB + SB) + SH + SF)/(AB + W + HB + SH + SF + SB + CS)

If we strip out all the extra parameters, you basically get:
RAW = SLG + 1.5*OBP

The denominator of course makes this a bit tricky, but it’s something like that.

OPS = SLG+OBP
but OPS+ uses something like this numerator:
OPSnum = SLG+1.25*OBP

So, RAW and OPSnum are very similar.

OPS+ = 2*OPSnum/lgOPSnum-1

EqR
= (2*RAW/LgRAW - 1) * PA * Lg(R/PA)
= (2*RAW/LgRAW - 1) * stuff

See how similar EqR and OPS+ are?
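A rough sketch of the shared shape, using the stripped-down approximations above rather than the full formulas. The helper name `relative_index` and all of the OBP/SLG values are assumptions invented for illustration.

```python
def relative_index(value, league_value):
    """The common form: 2 * value / league_value - 1."""
    return 2 * value / league_value - 1

def raw_approx(obp, slg):
    """Stripped-down RAW from the comment: SLG + 1.5*OBP."""
    return slg + 1.5 * obp

def ops_num_approx(obp, slg):
    """Stripped-down OPS+ numerator from the comment: SLG + 1.25*OBP."""
    return slg + 1.25 * obp

# Hypothetical player and league rates, invented for illustration only.
player = dict(obp=0.350, slg=0.450)
league = dict(obp=0.330, slg=0.420)

eqr_like = relative_index(raw_approx(**player), raw_approx(**league))
ops_plus_like = relative_index(ops_num_approx(**player), ops_num_approx(**league))
print(eqr_like, ops_plus_like)
```

Multiplying `eqr_like` by PA and the league R/PA would give the EqR form quoted above; that step is left out here because it needs league data.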

EqR, OPS, wOBA are all different variations of Linear Weights.  And the least complicated of the bunch is good ole Linear Weights.

