THE BOOK--Playing The Percentages In Baseball

Wednesday, February 16, 2011

Runs Per Win

By Tangotiger, 04:44 PM

Matt.  I did not know this: (lgRPG^(1-z)) * 2, where z is between 0.27 and 0.29.  Can you show me how this is derived, because I was doing it the brute force way.  Sh!t, had I known it was this easy, I would never have proposed my method.


#1          (see all posts) 2011/02/16 (Wed) @ 17:52

I don’t have the mathematical chops to solve this anymore, but I assume this is how you’d set it up:

[R^x/(R^x+RA^x)]-[R^x/(R^x+{RA+y}^x)] = 0.0061728395

Where:

R = lg R/G
RA = lg RA/g (same as R)
x = exponent
y = lg R/W <--- what you’re solving for
0.0061728395 = .500 - (80/162)

Basically a league average team minus a league average offensive team that allows “y” more runs is the difference in 1 win.

Once you set RA = R, it simplifies to:

R^x/(2*R^x) - R^x/(R^x+(R+y)^x) = 0.0061728395

Which simplifies to:

1/2 - R^x/(R^x+(R+y)^x) = 0.0061728395

Or:

R^x/(R^x+(R+y)^x) = 0.49382716049

And I’m stuck. Does anyone know how to finish this off?
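One way to finish this off, as a sketch: if the exponent x is treated as a fixed constant (an assumption, since in Pythagenpat it would move slightly as y changes the run environment), the last equation can be solved for y directly. The function name and the sample values R = 5 and z = 0.28 below are illustrative only.

```python
def extra_runs_for_one_loss(r, x, games=162):
    """Extra runs allowed per game (y) that cost an average team one win.

    Solves R^x / (R^x + (R+y)^x) = 0.5 - 1/games for y,
    holding the exponent x fixed (an assumption).
    """
    target = 0.5 - 1.0 / games  # 80/162 = 0.49382716... for a 162-game season
    # Rearranged: ((R + y) / R)^x = (1 - target) / target
    ratio = ((1.0 - target) / target) ** (1.0 / x)
    return r * (ratio - 1.0)


r = 5.0              # assumed league runs per game for one team
x = (2 * r) ** 0.28  # assumed Pythagenpat exponent with z = 0.28
y = extra_runs_for_one_loss(r, x)

print("y per game:", y)
print("runs per win over 162 games:", y * 162)
```

Because this takes a full one-win step on one side only, rather than an infinitesimal one, y * 162 comes out close to, but not exactly equal to, 2*RPG^(1-z).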


#2    Tangotiger      (see all posts) 2011/02/16 (Wed) @ 18:12

The way I was doing it was:

Runs=R=10
exponent=e=.28
really small number=m=.000001

10^.28=1.905

R/2+m=5.000001
R/2-m=4.999999
5.000001/4.999999 = 1.0000004
1.0000004^1.905=1.0000008
1.0000008/(1.0000008+1)=.5000002
.5000002-.5=.0000002
5.000001-4.999999=2m=.000002
Divide the last 2 numbers (at full precision) to get 10.49615

(Apologies if I have the number of decimal places wrong… not important if you follow along.)

And this is exactly the same thing as:
R^(1-e)*2

So, I’d love for someone to work that out to see how you get from what I just did to this.  Presumably, you are going to take the derivative, so that you don’t have to contend with the “really small number” I was using as a brute force.
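The brute-force steps above can be sketched in a few lines and compared with R^(1-e)*2. The function names are made up for illustration, and the defaults (e = .28, m = .000001) are the values from this comment.

```python
def pythagenpat_wpct(r, ra, z=0.28):
    """Pythagenpat W% with exponent (R + RA)^z."""
    x = (r + ra) ** z
    return r ** x / (r ** x + ra ** x)


def brute_force_rpw(rpg, z=0.28, m=1e-6):
    """Nudge an average team by a really small number m and measure runs per win."""
    r, ra = rpg / 2 + m, rpg / 2 - m   # total RPG is unchanged
    dw = pythagenpat_wpct(r, ra, z) - 0.5
    return (r - ra) / dw               # run differential divided by win differential


def closed_form_rpw(rpg, z=0.28):
    """The shortcut: 2 * RPG^(1 - z)."""
    return 2 * rpg ** (1 - z)


print(brute_force_rpw(10.0))
print(closed_form_rpw(10.0))
```

Both lines should print roughly 10.496 for RPG = 10.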


#3    Kincaid      (see all posts) 2011/02/16 (Wed) @ 18:32

It is just the derivative of PythagenPat.  If you set R and RA to both be RPG/2 (an average team), then the derivative will simplify to RPG^z/(2*RPG).  Since the derivative of PythagenPat is wins per run, take the reciprocal of that, which is 2*RPG/RPG^z.  Simplify the exponents, and you get RPG^(1-z)*2.
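Written out as a worked sketch, treating the exponent x as a constant while differentiating (an assumption, in line with holding RPG fixed):

```latex
W = \frac{R^x}{R^x + RA^x}
\qquad\Longrightarrow\qquad
\frac{\partial W}{\partial R} = \frac{x\,R^{x-1}\,RA^x}{\left(R^x + RA^x\right)^2}

% Average team: R = RA = RPG/2
\left.\frac{\partial W}{\partial R}\right|_{R = RA}
  = \frac{x\,R^{2x-1}}{4\,R^{2x}}
  = \frac{x}{4R}
  = \frac{x}{2\,RPG}

% Pythagenpat: x = RPG^z
\frac{\partial W}{\partial R} = \frac{RPG^{z}}{2\,RPG}
\qquad\Longrightarrow\qquad
RPW = \frac{2\,RPG}{RPG^{z}} = 2\,RPG^{\,1-z}
```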


#4    Patriot      (see all posts) 2011/02/16 (Wed) @ 18:44

See the above link for a more detailed description.

The formula only works for a run ratio of 1 (that is R = RA), and you have to assume that the RPG is constant (which is fine if you are trying to determine RPW at some specific RPG value).

I am sure that there has to be a more elegant way to do the calculus than this, but:

First differentiate Pythagenpat W% with respect to Run Ratio (RR = R/RA):

W% = RR^x/(RR^x + 1)

dW%/dRR = ((RR^x + 1)*(x*RR^(x-1))-RR^x*(x*RR^(x-1)))/(RR^x + 1)^2

Run Differential = RD = (R - RA) per game, and if we fix RPG (R + RA per game) at some constant value, we can write RR in terms of RD:

RR = (RD/(2*RPG) + .5)/(.5 - RD/(2*RPG)).

Now differentiate RR with respect to RD:

dRR/dRD = 1/(2*RPG*(.5-RD/(2*RPG))^2)

Then multiply dW%/dRR*dRR/dRD to get dW%/dRD, and take the reciprocal to get dRD/dW%:

(2*RPG*(RR^x + 1)^2*(.5 - RD/(2*RPG))^2)/(x*RR^(x-1))

dRD/dW% is Runs Per Win, by definition. 

If you assume that the team scores as many runs as it allows, then RD = 0 and RR = 1:

RPW = (2*RPG*(1^x + 1)^2*(.5 - 0/(2*RPG))^2)/(x*1^(x - 1))
= (2*RPG*(2)^2*(.5)^2)/x
= (2*RPG)/x

This result works with any Pythagorean exponent--the most interesting result is that with a Pythagorean exponent of 2, RPW = RPG.

For Pythagenpat, x = RPG^z, so:

RPW = (2*RPG)/(RPG^z) = 2*RPG^(1 - z)
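As a sanity check on the calculus, the general RPW expression (before setting RD = 0) can be compared with a numerical derivative of W% with respect to RD at fixed RPG. This is only a sketch: the function names and the step size h are arbitrary choices, and RD is run differential per game.

```python
def run_ratio(rpg, rd):
    """RR = R/RA written in terms of RD with RPG held fixed."""
    u = rd / (2 * rpg)
    return (0.5 + u) / (0.5 - u)


def wpct_from_rd(rpg, rd, x):
    """Pythagorean W% with a fixed exponent x."""
    rr = run_ratio(rpg, rd)
    return rr ** x / (rr ** x + 1)


def rpw_formula(rpg, rd, x):
    """Reciprocal of dW%/dRR * dRR/dRD, as derived above."""
    rr = run_ratio(rpg, rd)
    u = rd / (2 * rpg)
    return (2 * rpg * (rr ** x + 1) ** 2 * (0.5 - u) ** 2) / (x * rr ** (x - 1))


def rpw_numeric(rpg, rd, x, h=1e-5):
    """Central-difference estimate of dRD/dW%."""
    dw = wpct_from_rd(rpg, rd + h, x) - wpct_from_rd(rpg, rd - h, x)
    return 2 * h / dw


for rpg, rd, x in [(9.0, 0.0, 2.0), (9.0, 1.0, 1.83), (10.0, -0.5, 10.0 ** 0.28)]:
    print(rpg, rd, x, rpw_formula(rpg, rd, x), rpw_numeric(rpg, rd, x))
```

The first case (exponent 2, RD = 0) reproduces RPW = RPG.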

I have published this a few different times (in addition to the page linked above, see the blogposts listed below), so I’m glad that somebody remembered it.  The first link below shows how if you linearize 2*RPG^(1-z) to get rid of exponents, you get something very close to Tango’s .75*RPG + 3 formula.

http://walksaber.blogspot.com/2009/01/runs-per-win-from-pythagenpat.html

http://walksaber.blogspot.com/2006/01/runs-per-win.html
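A rough sketch of the linearization idea: take the tangent line to 2*RPG^(1-z) at some reference RPG. The reference point of 9 and z = .28 are assumptions made here for illustration; the linked posts may expand around a different point, so the coefficients need not match theirs exactly.

```python
def linearize_rpw(rpg0, z=0.28):
    """Tangent line (slope, intercept) to 2*RPG^(1-z) at RPG = rpg0."""
    value = 2 * rpg0 ** (1 - z)
    slope = 2 * (1 - z) * rpg0 ** (-z)  # derivative of 2*RPG^(1-z)
    intercept = value - slope * rpg0
    return slope, intercept


slope, intercept = linearize_rpw(9.0)
print("linearized: %.3f*RPG + %.3f" % (slope, intercept))
print("at RPG = 9: tangent %.3f vs .75*RPG + 3 = %.3f"
      % (slope * 9 + intercept, 0.75 * 9 + 3))
```

With these assumptions the slope lands a bit above .75 and the intercept a bit below 3, which is the "very close" relationship described above.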


#5    KJOK      (see all posts) 2011/02/16 (Wed) @ 18:55

I had Patriot’s formula written down as:

RPG * .7509 + 2.7598

Pete Palmer and Total Baseball use 10 * SQRT(RPG/9)

BenL used the simple RPG * 1.099

Based on historical empirical runs data and some brute force, I came up with:

(2 * RPG^.71) + (RPG-1) * .026

I’m not sure what a good way to measure ‘most accurate’ would be?
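Before settling on an accuracy measure, it can help to see the candidates side by side over a range of run environments. This is only a sketch: the function names are made up, the Pythagenpat line assumes z = .28, and the RPG range is an arbitrary choice.

```python
import math


def rpw_pythagenpat(rpg, z=0.28):
    return 2 * rpg ** (1 - z)


def rpw_patriot_linear(rpg):
    return 0.7509 * rpg + 2.7598


def rpw_palmer(rpg):
    return 10 * math.sqrt(rpg / 9)


def rpw_benl(rpg):
    return 1.099 * rpg


def rpw_kjok(rpg):
    return 2 * rpg ** 0.71 + (rpg - 1) * 0.026


estimators = [rpw_pythagenpat, rpw_patriot_linear, rpw_palmer, rpw_benl, rpw_kjok]
print("RPG  " + "  ".join(f.__name__[4:].rjust(14) for f in estimators))
for rpg in range(6, 13):
    print("%3d  " % rpg + "  ".join("%14.3f" % f(rpg) for f in estimators))
```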


#6    Patriot      (see all posts) 2011/02/16 (Wed) @ 19:12

If you want to test most accurate based on historical teams, a RMSE test would probably be sufficient.  If you’re more concerned about theoretical purity (as I am), you’ll want whichever formulation can be directly related to your W% estimator of choice (which for me is Pythagenpat, as a completely unbiased observer...er...)

Ben’s RPG/.91 = RPG*1.099 is equivalent to using a Pythagorean exponent of 1.82 (which was James’ “ideal” value, either that or 1.83) according to the above formula for RPW based on Pythagorean:

RPW = 2*RPG/x
= 2*RPG/1.82 = RPG/.91


#7    Tangotiger      (see all posts) 2011/02/16 (Wed) @ 19:30

Since we mostly, I think, care about pitchers and not teams, then we should be careful about trying to best-fit to teams and extrapolating that to pitchers.


#8    Patriot      (see all posts) 2011/02/16 (Wed) @ 19:34

Agreed.  My comment was assuming that teams were the focus and that accuracy was to get a best fit with no other considerations.


#9    KJOK      (see all posts) 2011/02/16 (Wed) @ 19:53

I think I was trying to tie to Pythagenpat - my convoluted formula seems to track 2*RPG^(1 - z) almost exactly!? (where z = .28)


#10    Kincaid      (see all posts) 2011/02/16 (Wed) @ 21:04

I have never seen much use for a runs to wins conversion for pitchers.  Since we generally rate pitchers in terms of runs allowed per game, you can just plug their numbers directly to PythagenPat and get a W% directly, and then you can get wins from W% and IP.  It’s simpler than dealing with converting runs to wins, and it forgoes the approximations/assumptions involved in deriving a runs to wins conversion.  If you need a runs to wins conversion for pitchers for some reason, I would just calculate runs and wins separately and get the conversion factor by dividing the two.

This type of factor seems much more useful for non-pitchers to me.


#11    Matt Klaassen      (see all posts) 2011/02/16 (Wed) @ 22:24

This was my first chance to comment here since the post went up.

I’m really sorry I couldn’t be the first comment here, since I was going to keep up the charade of me actually being able to figure something like this out as “revenge” for all the Tango/MGL posts that say something like “here’s some easy aging curves I whipped out in 20 minutes with bdb,” which leads to me spending hours/days in MySQL without being able to replicate the results, then giving up in frustration/humiliation. wink heh. heh?

But seriously… I didn’t mean to give anyone the impression that I derived this myself. I read Patriot’s post when it first came out and assumed that’s how everyone did it. I couldn’t ever derive it, my math “skills” are even more embarrassing than my programming “skills.”

Unless I explicitly say otherwise, you can pretty much assume I never come up with anything on my own. Maybe someday I’ll hit my head and luck into something.


#12    Matt Klaassen      (see all posts) 2011/02/16 (Wed) @ 22:31

Kincaid #10: When I do pitcher WAR for myself, I figure out win% to derive runs, then convert runs-to-wins dynamically… am I inadvertently “double dipping” on the runs-to-wins conversion?


#13    Matt Klaassen      (see all posts) 2011/02/16 (Wed) @ 22:36

Sorry to emulate a spambot… Tango, didn’t you also have this formula in your wiki for a long time? I thought it was, but maybe you just added it.

http://www.tangotiger.net/wiki/index.php?title=Runs_Per_Win


#14    Tangotiger      (see all posts) 2011/02/16 (Wed) @ 22:52

I never noticed it!


#15    Kincaid      (see all posts) 2011/02/16 (Wed) @ 23:01

Matt/12, what are you doing to go from W% to runs?  It is possible for a process like that to work out fine, depending on how you do it.

At the very least, that sounds like an extraneous process, though.  If you have W%, all you need to do is subtract out the replacement level W% (i.e. .390, or whatever you are using) to find wins above replacement per game, then calculate the number of full games from IP, and then multiply those two together.  For example, if a player has a .590 W% based on PythagenPat, and you are using .390 as replacement level, he would be .2 WAR per game.  If he pitched 180 innings, that is 20 full games, so he would be 20*.2 = 4 WAR.
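A minimal sketch of that calculation. The way the pitcher's W% is formed here (Pythagenpat with the exponent taken from the pitcher's RA9 plus league RA9, z = .28) is an assumption for illustration, and the function names are hypothetical.

```python
def pitcher_wpct(ra9, lg_ra9, z=0.28):
    """W% of an average offense backed by this pitcher (assumed setup)."""
    x = (ra9 + lg_ra9) ** z  # run environment: pitcher RA9 plus league RA9
    return lg_ra9 ** x / (lg_ra9 ** x + ra9 ** x)


def war_from_wpct(wpct, ip, repl_wpct=0.390):
    """Wins above replacement per game, times full games pitched (IP / 9)."""
    return (wpct - repl_wpct) * ip / 9.0


# The example from the comment: .590 W%, .390 replacement, 180 IP.
print(war_from_wpct(0.590, 180))

# Going from run rates instead (3.50 RA9 in a 4.50 league are made-up inputs).
print(war_from_wpct(pitcher_wpct(3.5, 4.5), 180))
```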


#16    Matt Klaassen      (see all posts) 2011/02/16 (Wed) @ 23:23

Kincaid/15:

Let’s see… been a while since I put the SQL together, it’s a mess. I’ll just look at my spreadsheet version (it’s from a couple years ago, so I used Tango’s RPW). I cobbled it together from different sources and made some of my own choices…

I start with RA/FIP/whatever (it doesn’t matter for this, right? Let’s not worry about parks either, that’s easy enough). Then using it and lgRA I generate the PythagPat exponent then a winning percentage relative to league average.

The next step is the conversion to runs (saved), not sure why I did it this way: I take the pitcher’s extrapolated win% and subtract replacement level (or whatever baseline) from it, then multiply the result by innings pitched to get runs above replacement. Ex: for .485 pitcher X, (.485 - .380) * 140 IP = ~14.7 RAR.

Then using Tango’s formula modified for dynamic RPW conversion, I get a RPW for the pitcher. I think I originally got this from the FanGraphs value series:

((((Pitcher X’s RA * Average IP of Pitcher X appearance) + ((18 - Average Pitcher X IP per G) * LgRA)) / 18) + 2) * 1.5

Then divide the RAR by that.

[I guess I could switch that to PythagenPat now, but I’ll figure that out later.]

Seems kinda convoluted, but it made sense at the time.
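For reference, a sketch of the dynamic RPW formula quoted in this comment. It assumes the weighted run total is divided by 18 innings before adding 2, which is how the parentheses appear intended; the function name and sample inputs are hypothetical.

```python
def dynamic_rpw(pitcher_ra, ip_per_game, lg_ra):
    """Pitcher-specific runs per win.

    Blends the pitcher's RA (for his innings) with league RA (for the rest
    of the 18 half-innings), then applies (env + 2) * 1.5.
    Assumption: the blend is divided by 18.
    """
    blended = (pitcher_ra * ip_per_game + (18 - ip_per_game) * lg_ra) / 18
    return (blended + 2) * 1.5


# Made-up inputs: 3.50 RA, 6 innings per appearance, 4.50 league RA.
print(dynamic_rpw(3.5, 6, 4.5))

# A league-average pitcher reduces to .75*RPG + 3 with RPG = 2 * 4.5.
print(dynamic_rpw(4.5, 6, 4.5), 0.75 * 9 + 3)
```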


#17    Kincaid      (see all posts) 2011/02/17 (Thu) @ 00:11

I take the pitcher’s extrapolated win% and subtract replacement level (or whatever baseline) from it, then multiply the result by innings pitched to get runs above replacement. Ex: for .485 pitcher X, (.485 - .380) * 140 IP = ~14.7 RAR.

That step just gives you wins above replacement times 9, because it is wins per 9 innings (wins per game) times innings.  It doesn’t necessarily have anything to do with runs, but it can be a decent approximation of runs since it assumes a runs/wins conversion factor of 9.  As long as the custom runs per win factor is close to 9, that will be ok.  With good pitchers, that is going to over-count the runs to wins conversion, though.  In effect, you are multiplying wins by 9 to get runs, and then dividing runs by the conversion factor to go back to wins.  For good pitchers, that factor will be less than 9, so you will end up over-counting wins a bit.  If the conversion factor is higher than 9, it will be the opposite.

(.485 - .380) is already on the scale of wins (per game), so converting it to runs would require using the runs to wins conversion factor, not just multiplying by innings.  Since you are just going to use the same factor to convert back, that step is unnecessary.  You can convert (.485 - .380) directly to WAR by multiplying by IP/9.
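To make the size of the over-count concrete, here is a small sketch comparing the two routes. The function names are hypothetical, and the RPW value of 8.5 is a made-up stand-in for a good pitcher's custom factor.

```python
def war_direct(wpct, repl_wpct, ip):
    """Wins above replacement per game times games (IP / 9)."""
    return (wpct - repl_wpct) * ip / 9.0


def war_via_pseudo_runs(wpct, repl_wpct, ip, rpw):
    """The two-step route: multiply by IP (implicitly wins * 9), then divide by RPW."""
    pseudo_runs = (wpct - repl_wpct) * ip
    return pseudo_runs / rpw


direct = war_direct(0.485, 0.380, 140)
two_step = war_via_pseudo_runs(0.485, 0.380, 140, rpw=8.5)
print(direct, two_step, two_step / direct)  # the ratio is 9 / RPW
```

When RPW is exactly 9 the two routes agree; below 9 the two-step route inflates WAR by a factor of 9/RPW.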

A better way to get RAR if you want it would be to just take the difference of RA9/FIP/etc and league runs per game, and scale that to the pitcher’s innings pitched the same way you do with wins.  You don’t have to worry about dealing with converting one scale to the other at all since both are directly calculable.


#18    Kincaid      (see all posts) 2011/02/17 (Thu) @ 01:49

The last paragraph in #17 is for calculating runs above average, not replacement.  For RAR, you would need to figure out what the replacement level RA9 is.
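Both versions are the same calculation with a different baseline, roughly as sketched below. The function name and all sample numbers (including the 5.50 replacement-level RA9) are made up for illustration.

```python
def runs_above_baseline(ra9, baseline_ra9, ip):
    """Runs saved relative to a baseline RA9, scaled to innings pitched."""
    return (baseline_ra9 - ra9) * ip / 9.0


# Runs above average: baseline is league runs per game (4.50 assumed here).
print(runs_above_baseline(3.5, 4.5, 180))

# Runs above replacement: baseline is a replacement-level RA9 (5.50 assumed here).
print(runs_above_baseline(3.5, 5.5, 180))
```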


#19          (see all posts) 2011/02/17 (Thu) @ 03:26

If I remember correctly, there was a Statspeak article a few years ago that discussed this topic (I think it was either Colin, Brian or Pizza—that narrows it down a bunch, ha!).

The old link is this: http://statspeak.net/2009/01/when-ten-runs-isnt-a-win.html


#20    Dan Novick      (see all posts) 2011/02/17 (Thu) @ 21:18

Aaron:

That article was by me...good memory! I think it was my first or second ever post at statspeak.

The link is broken however, and probably for the best. That piece wasn’t anything close to the level of sophistication the gentlemen above me are working at. It was a mere illustration.


#21          (see all posts) 2011/02/18 (Fri) @ 18:47

I’ve always used 3.33*((R/G)^.71), which I guess is almost the same, just counting R/G for one team instead of two.  I must have posted that somewhere a long time ago.


#22    Tangotiger      (see all posts) 2011/02/18 (Fri) @ 19:00

Dan, right, it would be virtually the same thing.  But you have to admit:

2*(RPG^.72)

(with RPG being both teams)

Certainly is cleaner…
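A quick sketch of how close the two forms are, reading the first as 3.33 times one-team R/G raised to the .71 power (an assumption about the intended grouping). The function names and the range of run environments are illustrative.

```python
def rpw_one_team(runs_per_game):
    """3.33 * (R/G)^.71, with R/G for one team."""
    return 3.33 * runs_per_game ** 0.71


def rpw_two_team(rpg):
    """2 * RPG^.72, with RPG for both teams combined."""
    return 2 * rpg ** 0.72


for rg in (3.5, 4.0, 4.5, 5.0, 5.5):
    print(rg, round(rpw_one_team(rg), 3), round(rpw_two_team(2 * rg), 3))
```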

