THE BOOK cover
The Unwritten Book is Finally Written!
An in-depth analysis of: The sacrifice bunt, batter/pitcher matchups, the intentional base on balls, optimizing a batting lineup, hot and cold streaks, clutch performance, platooning strategies, and much more.
Read Excerpts & Customer Reviews

Buy The Book from Amazon


2013 Bill James Handbook

Advanced


THE BOOK--Playing The Percentages In Baseball

<< Back to main

Monday, November 24, 2008

wOBA year-by-year calculations

Here is the full specs for calculating wOBA for each season, based on the database from the Baseball Databank:


=======
Step 1
=======

Create a VIEW (or QUERY) named LeagueRunsPerOut based on this SQL:

SELECT
Pitching.yearID
, Sum([R])/Sum([IPouts]) AS RperOut
, Sum(Pitching.R) AS totR
, Sum(Pitching.IPouts) AS totOuts

FROM
PrimPos
INNER JOIN
Pitching
ON PrimPos.yearID = Pitching.yearID
AND PrimPos.playerID = Pitching.playerID

WHERE PrimPos.PosPrim=“P”

GROUP BY Pitching.yearID
;

The purpose here is simply to create a run environment for each season.  I exclude all nonpitcher’s pitching numbers.


=======
Step 2
=======

Create a VIEW (or QUERY) named RunValues based on this SQL:

SELECT
Batting.yearID
, RperOut
, [RperOut]+0.14 AS runBB
, [runBB]+0.025 AS runHB
, [runBB]+0.155 AS run1B
, [run1B]+0.3 AS run2B
, [run2B]+0.27 AS run3B
, 1.4 AS runHR
, 0.2 AS runSB
, 2*[RperOut]+0.075 AS runCS

FROM
LeagueRunsPerOut

INNER JOIN
(
Batting
INNER JOIN
PrimPos
ON (Batting.yearID = PrimPos.yearID)
AND (Batting.playerID = PrimPos.playerID)
)
ON LeagueRunsPerOut.yearID = Batting.yearID

WHERE PrimPos.PosPrim <> "P"

GROUP BY
Batting.yearID
, RperOut

;

(The “FROM” clause can be rewritten clearer, but then it won’t work in Access.)

I set the run value of the walk as +.14 runs above the value of runs per out.  While it is not necessarily exactly that all the time, it’s basically that for various run environments in MLB over the last fifty years.  You can see the evidence here:
http://www.insidethebook.com/ee/index.php/site/article/linear_weights_by_run_environment/

In each run environment, the difference between the run value of the walk and runs per out (or RperI divided by 3) is between .134 and .143.  Close enough for us.

The other batting run values work similarly.  They are further double-checked here:
http://www.insidethebook.com/ee/index.php/site/comments/actual_wins_retrosheet_years/#4

The run value of the SB is fixed at .20, and the CS is set with a bit of a fudge, but works fairly well.

So, that’s we have the Linear Weights values for each event, for each season.

You could do the same thing with BaseRuns.  I chose not to, only for simplicity’s sake.  You could try to do it yourself.

=======
Step 3
=======

Create a VIEW (or QUERY) named RunValues2 based on this SQL:

SELECT
RunValues.yearID
, RunValues.RperOut
, RunValues.runBB
, RunValues.runHB
, RunValues.run1B
, RunValues.run2B
, RunValues.run3B
, RunValues.runHR
, RunValues.runSB
, RunValues.runCS
, Sum([runBB]*([BB]-nz([ibb]))+[runHB]*nz([HBP])+[run1B]*([H]-[2b]-[3b]-[HR])+[run2B]*[2b]+[run3B]*[3b]+1.4*[HR]+[runSB]*nz([SB])-[runCS]*nz([CS]))/Sum([ab]-[h]+nz([SF])) AS runMinus
, Sum([runBB]*([BB]-nz([ibb]))+[runHB]*nz([HBP])+[run1B]*([H]-[2b]-[3b]-[HR])+[run2B]*[2b]+[run3B]*[3b]+1.4*[HR]+[runSB]*nz([SB])-[runCS]*nz([CS]))/Sum([BB]-nz([IBB])+nz([HBP])+[H]) AS runPlus
, Sum([BB]-nz([IBB])+nz([HBP])+[H])/Sum([AB]+[BB]-nz([IBB])+nz([HBP])+nz([SF])) AS wOBA
, 1/([runPlus]+[runMinus]) AS wOBAscale
, ([runBB]+[runMinus])*[wOBAscale] AS wobaBB
, ([runHB]+[runMinus])*[wOBAscale] AS wobaHB
, ([run1B]+[runMinus])*[wOBAscale] AS woba1B
, ([run2B]+[runMinus])*[wOBAscale] AS woba2B
, ([run3B]+[runMinus])*[wOBAscale] AS woba3B
, ([runHR]+[runMinus])*[wOBAscale] AS wobaHR
, [runSB]*[wOBAscale] AS wobaSB
, [runCS]*[wOBAscale] AS wobaCS

FROM
RunValues
INNER JOIN
(
Batting
INNER JOIN
PrimPos
ON Batting.playerID = PrimPos.playerID
AND Batting.yearID = PrimPos.yearID
)
ON RunValues.yearID = Batting.yearID

GROUP BY
RunValues.yearID
, RunValues.RperOut
, RunValues.runBB
, RunValues.runHB
, RunValues.run1B
, RunValues.run2B
, RunValues.run3B
, RunValues.runHR
, RunValues.runSB
, RunValues.runCS

ORDER BY
RunValues.yearID DESC
;


I could have merged these last two VIEWS, but that’s not important.

Notes:
- runMinus sets the run value for the missing events, which is AB minus H plus SF; if we had reached base on error, then we’d update these last two views accordingly; it is basically the run value of the batting out

- runPlus determines the average run value of the safe batting events (walks, hitbatters, hits)

- there is a wOBA calculation, which you will see is actually an OBP calculation; they are interchangeable at the league level

- the wOBAscale is the multiplier that we will be applying to get the run values into a wOBA scale; it also lets you convert from wOBA to runs per PA (while The Book says to use 1.15, you actually use whatever this value is for the season in question)

- for all the batting events, we take the run value of each event, add in the run value of the outs, and then multiply by the wOBAscale factor; play around with why and how I am doing this to see if this makes sense to you; if this makes total sense to you, then make a short post for your fellow readers; otherwise, I’ll have to make a long boring post to that effect

- for the running events, we only apply the multiplier; once you understand the previous point, you will understand the reason for this point

***

Since 1956, the weighted average is:
wobaBB 0.71
wobaHB 0.74
woba1B 0.90
woba2B 1.28
woba3B 1.63
wobaHR 2.10
wobaSB 0.25
wobaCS 0.51

Remember, this is what is used for the Baseball Databank.  If you use different events, like Reaching base on error for example, things will change a bit.  It also depends whether you do, or do not, want to include SB/CS.  In The Book, we were almost always interested only in the batter/pitcher matchup, and so, the SB/CS numbers would not make sense.  Here is the full output of the above:
http://tangotiger.net/bdb/lwts_woba_for_bdb.txt

Anyway, I hope this makes the entire LWTS ~ wOBA relationship clearer.  If not as clear-as-crystal, more than clear-as-mud.

<< Back to main