
THE BOOK--Playing The Percentages In Baseball


Friday, February 18, 2011

Ratios to rates

By Tangotiger, 11:30 AM

Carson has an equation to convert out ratios into contact rates:

rate = .18ln(ratio) + .38

I’ll repeat my comments over there:

==================================
Carson, a groundout to airout ratio means:

g/a

A groundball percentage means:
g/(g+a)

So, in order to convert a ratio into a percentage, you do:
ratio/(ratio+1)

A g/a of .5 means a gb% of 0.33. A g/a of 2 means a gb% of 0.67, and so on.
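As a quick check of the conversion above, here is a minimal sketch; the function name `ratio_to_rate` is just a label chosen for illustration.

```python
def ratio_to_rate(ratio):
    """Convert a g/a ratio into g/(g+a), assuming both use the same g and a."""
    return ratio / (ratio + 1)

# The ratios quoted above: .5 and 2, plus the break-even ratio of 1.
for ratio in (0.5, 1, 2):
    print(f"g/a = {ratio}: gb% = {ratio_to_rate(ratio):.3f}")
```

This prints .333, .500 and .667 for the three ratios.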

However, in MLB, they exclude lineouts from the numerator and denominator in the g/a ratio. But they are included in the gb%. So, a gb% is actually:
g / (g + a + l)

Furthermore, g/a refers only to outs, while gb% refers to all contacted balls. So, you’d have to convert the groundouts (go) to groundballs (gb) by doing go/.75 = gb. And so on.
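To make those two adjustments concrete, here is a small sketch. The .75 comes from the go/.75 = gb step above; the counts (150 groundouts, 200 air balls, 100 line drives) are made up purely for illustration, as are the function names.

```python
def ground_balls_from_groundouts(groundouts, out_rate=0.75):
    """Scale groundouts up to all groundballs: go/.75 = gb."""
    return groundouts / out_rate

def gb_pct(ground_balls, air_balls, line_drives):
    """gb% over all contacted balls: g / (g + a + l)."""
    return ground_balls / (ground_balls + air_balls + line_drives)

# Hypothetical counts, not real player data.
gb = ground_balls_from_groundouts(150)   # 150 groundouts -> 200 groundballs
print(gb_pct(gb, 200, 100))              # line drives included in the denominator
print(gb_pct(gb, 200, 0))                # same batter if line drives were ignored
```

With these made-up counts, including the line drives pulls the gb% from .50 down to .40, which is the gap the plain ratio/(ratio+1) conversion does not see.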

***

All to say: I don’t doubt the best-fit of the equation you found.

I do think that we can come up with a different equation that is grounded (no pun intended) in logic. And you can then do a best-fit against that equation.

***

Right.

If you have a g/a ratio of .500, 1, or 2, the ln of that is going to give you -.69, 0, +.69. So, perfectly symmetrical around 0. Which matches what g/(g+a) would give you: .333, .500, .667, respectively, symmetrical around .500.
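The symmetry can be checked directly. This is only an illustrative sketch; `ratio_to_rate` is a name chosen here, and the three ratios are the ones quoted above.

```python
import math

def ratio_to_rate(ratio):
    """g/(g+a) expressed in terms of the g/a ratio."""
    return ratio / (ratio + 1)

for ratio in (0.5, 1, 2):
    log_ratio = math.log(ratio)            # natural log of the ratio
    offset = ratio_to_rate(ratio) - 0.5    # distance of the rate from .500
    print(f"g/a = {ratio}: ln = {log_ratio:+.2f}, rate - .5 = {offset:+.3f}")
```

Flipping the ratio (2 versus .5) flips the sign of both columns and leaves their size unchanged, which is why ln(g/a) is a natural core for the conversion when both measures use the same g and a.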

But, the actual equation for gb% is g/(g+a+l). Would the ln(g/a) still necessarily hold as a core part of the conversion?

I don’t know, I’m asking.

***

Following up:

To convert the ratio to a rate, if we had the exact same parameters in both, we’d do:

g% = g/(g+a) = x*ln(g/a) + .5

That x would approach 0.25 as g/a approaches 1. And in MLB, x would range from .24 to .25.

So, if we used all contacted balls, then a best-fit equation would come in at something like .25*ln(g/a) + .5.
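One way to see where the .24 to .25 comes from is to solve for x at a few ratios. This is a sketch only; the span of ratios from .5 to 2 is an assumption about what a typical MLB spread looks like, not something taken from data.

```python
import math

def implied_coefficient(ratio):
    """x such that ratio/(ratio+1) = x*ln(ratio) + .5 (undefined at ratio = 1)."""
    return (ratio / (ratio + 1) - 0.5) / math.log(ratio)

# Assumed spread of g/a ratios; 1 itself is skipped because ln(1) = 0.
for ratio in (0.5, 0.8, 0.99, 1.01, 1.25, 2):
    print(f"g/a = {ratio}: x = {implied_coefficient(ratio):.4f}")
```

Near a ratio of 1, x sits just under .25, and at ratios of .5 or 2 it drops to about .24.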

But, as noted, the ratio actually uses only outs, and excludes lineouts. The rate uses all contacted balls.

Carson’s best-fit, using observed data, changes that .25 coefficient to .18. It changes the intercept from .5 to .38.

My question is whether someone here would like to try to come up with an equation without relying on individual data, simply applying some logic to the process: presume that 20% of batted balls are line drives, that 25% of those are lineouts, and so on.
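As one possible starting point, here is a rough sketch of that kind of logic-only conversion. It is not a worked-out answer. The 20% line drive share and the .75 groundball out rate come from the discussion above. The out rate on non-line-drive air balls (`AIR_OUT_RATE`) is a placeholder assumption with no source behind it, and every name in the block is made up for illustration.

```python
import math

LINE_DRIVE_SHARE = 0.20  # from the post: 20% of batted balls are line drives
GROUND_OUT_RATE = 0.75   # from the post: go/.75 = gb
AIR_OUT_RATE = 0.75      # ASSUMPTION: placeholder out rate on non-liner air balls

def gb_pct_from_ratio(ratio, ld_share=LINE_DRIVE_SHARE,
                      go_rate=GROUND_OUT_RATE, ao_rate=AIR_OUT_RATE):
    """Solve for G, the groundball share of all contacted balls.

    groundouts = go_rate * G
    airouts    = ao_rate * (1 - ld_share - G)   # lineouts excluded, as in MLB's ratio
    ratio      = groundouts / airouts
    """
    k = ratio * ao_rate
    return k * (1 - ld_share) / (go_rate + k)

def best_fit(ratio):
    """Carson's observed best fit, quoted above: .18*ln(ratio) + .38."""
    return 0.18 * math.log(ratio) + 0.38

for ratio in (0.5, 1, 2):
    print(f"g/a = {ratio}: logic = {gb_pct_from_ratio(ratio):.3f}, "
          f"best fit = {best_fit(ratio):.3f}")
```

Because lineouts sit on neither side of the ratio, the 25% lineout figure drops out of this particular version; only the line drive share matters. With the placeholder air out rate equal to the groundball out rate, a ratio of 1 comes out at .40, against the .38 intercept of the best fit. A different assumed air out rate would move that number.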

(13) Comments • 2011/02/22 • Sabermetrics • Statistical_Theory