THE BOOK--Playing The Percentages In Baseball

Wednesday, September 16, 2009

More math nonsense

By Tangotiger, 03:40 PM

In addition to my problem with “seasonal age”, I have a problem with “GB/FB ratio”.  Dave points out how ridiculous it is to ignore line drives.  That is a great point.  I wish I had thought about that part more.

My bigger problem is the non-symmetry of a ratio.  For example, if the “average” GB/FB ratio is 1.1, then a linear equation that uses GB/FB ratio will give equal (and opposite) impact to a ratio of 2.0 and a ratio of 0.2 (both are 0.9 “ratios” from the average).  As you can see, this is mathematical nonsense.  If the average GB/FB ratio is 1.1, then the GB/(GB+FB) percentage is 52.4%.  And a 2.0 GB/FB ratio is a 66.7% percentage, which is 14.3 points above the average.  Symmetrically, 14.3 points below the average is a GB/(GB+FB) percentage of 38%, or a GB/FB ratio of 0.6.  That is, a 2.0 GB/FB ratio is as far away from 1.1 as 0.6 is from 1.1.
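As a rough sketch of that arithmetic (the function names are made up for illustration and are not from any library), the conversion between GB/FB and GB/(GB+FB), and the "mirror image" of a ratio around the average, could be written like this:

```python
def ratio_to_share(gb_fb):
    """Convert a GB/FB ratio into the GB/(GB+FB) share."""
    return gb_fb / (1.0 + gb_fb)


def share_to_ratio(share):
    """Convert a GB/(GB+FB) share back into a GB/FB ratio."""
    return share / (1.0 - share)


def mirror_ratio(gb_fb, average_gb_fb):
    """Ratio that sits as far from the average share as gb_fb does, on the other side."""
    avg_share = ratio_to_share(average_gb_fb)
    share = ratio_to_share(gb_fb)
    return share_to_ratio(2.0 * avg_share - share)


if __name__ == "__main__":
    print(round(ratio_to_share(1.1), 3))     # 0.524, the average share
    print(round(ratio_to_share(2.0), 3))     # 0.667
    print(round(mirror_ratio(2.0, 1.1), 2))  # 0.62, i.e. roughly the 0.6 ratio
```

The mirror is taken in share space, which is the whole point: distance from average only makes sense once the ratio has been turned into a rate.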

Indeed, imagine instead of using GB/FB ratio, you use FB/GB ratio.  If you have an equation that is based on using the ratio, you should be able to simply change the coefficient, and get back the exact same results from your equation.  This cannot happen if you use a ratio, but it does work with a rate.  Suppose, for example, instead of on base percentage, you use outs percentage.  That is, 1-OBP, which I’ll call OP.  If you have a metric that is 1.8*OBP+SLG, then the equivalent using OP is SLG-1.8*OP+1.8.
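A small sketch of both halves of that argument follows. The batting line and the three GB/FB values are made-up inputs, and the helper names are hypothetical; the only thing taken from the text is the 1.8*OBP+SLG metric and its OP form.

```python
def metric_from_obp(obp, slg):
    """The metric written in terms of on base percentage."""
    return 1.8 * obp + slg


def metric_from_op(op, slg):
    """The same metric rewritten in terms of outs percentage, OP = 1 - OBP."""
    return slg - 1.8 * op + 1.8


def steps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    obp, slg = 0.340, 0.450  # made-up batting line, for illustration only
    print(metric_from_obp(obp, slg), metric_from_op(1.0 - obp, slg))

    gb_fb = [0.5, 1.0, 1.5]            # evenly spaced GB/FB ratios (made up)
    fb_gb = [1.0 / r for r in gb_fb]   # the same pitchers expressed as FB/GB
    print(steps(gb_fb))  # equal steps
    print(steps(fb_gb))  # unequal steps: linear in GB/FB is not linear in FB/GB
```

The rate version survives the flip from OBP to OP with nothing more than a new coefficient and a constant. The ratio version does not, because evenly spaced GB/FB values stop being evenly spaced once inverted.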

I talked about it three years ago.


#1    wcw    2009/09/16 (Wed) @ 16:28

I dislike G/F ratio myself, but on the grounds that the variables of interest are denominated by AB: GB%, LD% and FB%.  Those are ratios, too, though.  Just normalize them.


#2    Tangotiger    2009/09/16 (Wed) @ 16:36

How is GB% a ratio? 

I guess the way you are saying it is that A/(A+B) is a ratio.  But, it’s ALSO a rate.  A/B is NOT a rate. If you are going to get semantical like that, then I mean a non-rate.  If I can’t use the word ratio, then give me something else I can use.


#3    Guy    2009/09/16 (Wed) @ 16:45

Let’s add SO/BB ratio to the list of bad ideas (though one intended to capture an important relationship).  A 9K/3BB pitcher is in no way comparable to a 6K/2BB pitcher.


#4    Tangotiger    2009/09/16 (Wed) @ 17:01

Guy, totally with you.  Guy was the first one who alerted me to the possibility that K-BB per PA would be better.

And he’s totally right.  A 6/2 K/BB ratio is equivalent to a 10/6 ratio (given the same number of batters faced).  I provided empirical results as proof a year or two ago.
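To make the K/BB point concrete, here is a minimal sketch comparing the ratio with K-BB per PA. The 30 batters faced is an assumed figure chosen only so the pitchers share a denominator, and the function names are hypothetical.

```python
def k_bb_ratio(k, bb):
    """Strikeouts divided by walks."""
    return k / bb


def k_minus_bb_rate(k, bb, pa):
    """Strikeouts minus walks, per plate appearance."""
    return (k - bb) / pa


if __name__ == "__main__":
    pa = 30  # assumed batters faced, the same for every pitcher below

    # Same K/BB ratio, different K-BB per PA.
    print(k_bb_ratio(9, 3), k_bb_ratio(6, 2))
    print(k_minus_bb_rate(9, 3, pa), k_minus_bb_rate(6, 2, pa))

    # Different K/BB ratio, same K-BB per PA.
    print(k_bb_ratio(6, 2), k_bb_ratio(10, 6))
    print(k_minus_bb_rate(6, 2, pa), k_minus_bb_rate(10, 6, pa))
```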


#5    SirKodiak    2009/09/17 (Thu) @ 05:14

Ratio: comparison of two numbers that are either both ‘true numbers’ (like 5/3) or both ‘like denominate numbers’ (like 5ft/3ft) and no longer has a unit of measurement.

Rate: comparison of two numbers that are ‘unlike denominate numbers’ (like 5km/3hr) and create a new unit of measurement.

For whatever that is worth to you.


#6    2009/09/17 (Thu) @ 17:39

I find GB%, FB%, LD%, K%, BB%, etc to be the most helpful in understanding the pitcher’s value. 
GB/FB or K/BB don’t tell the whole picture, but these do.  They are essentially ratios of GB/AB, FB/AB, etc. 

Then we can find a value for each outcome and sum up the percentages to determine the expected value of the AB, for that pitcher.


#7    Kincaid    2009/09/17 (Thu) @ 19:48

I’m not so sure I see that much advantage to GB% over GB/FB in determining a pitcher’s value, going forward at least, beyond just the non-linear issue that Tango talks about.  But that’s more or less an issue of how you present the data.  GB/FB can easily be turned into a rate that looks just like GB%.  Once you do that, how much do you gain by adding line drives into the equation?  As far as the difference in the information provided by each, one basically regresses LD rate 100% and the other not at all.  For one season’s worth of data, I’m not convinced looking at the one that doesn’t regress LD rate at all (GB%) is any better than the one that regresses it 100% (GB/FB).  I’d have to see some sort of data that suggests that GB% is a better estimate of GB-inducing talent than GB/FB (or the form of it turned into a rate), particularly for pitchers with large differences in LD rate like the article cites.

If all you care about is how many ground balls were actually allowed in the sample, then I guess that is different.


#8    Tangotiger    2009/09/18 (Fri) @ 00:25

In order to regress LD “fairly”, I use (GB-FB)/BIP.

This way, you keep the relative value between GB and FB constant.  And you get the denominator right.

It’s so darn simple, too.
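A quick sketch of (GB-FB)/BIP, assuming for illustration that balls in play are just GB + FB + LD (a real BIP count may include other batted-ball types) and using made-up counts:

```python
def gb_minus_fb_rate(gb, fb, ld):
    """(GB - FB) / BIP, with BIP assumed here to be GB + FB + LD."""
    bip = gb + fb + ld
    return (gb - fb) / bip


if __name__ == "__main__":
    # Made-up batted-ball counts for one pitcher.
    print(gb_minus_fb_rate(gb=220, fb=200, ld=80))  # 0.04
    # Swap GB and FB: same size, opposite sign.
    print(gb_minus_fb_rate(gb=200, fb=220, ld=80))  # -0.04
```

Swapping grounders and flies only flips the sign, which is the symmetry that GB/FB lacks, and line drives stay in the denominator.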


#9    Jim P    2009/09/18 (Fri) @ 00:30

You could try analyzing the ratios in logspace (I do that sometimes for work when the ratios vary by more than a factor of about 5).  It’s possible that the GB/FB ratio is a lognormal distribution.  This would take care of the symmetry, since a 1.0 would be equally spaced between 0.5 and 2.0, and it doesn’t matter if you do GB/FB or FB/GB as far as calculating averages or variances.
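The logspace idea can be checked with a few lines. The sample ratios below are invented for illustration, and nothing here tests whether GB/FB actually follows a lognormal distribution; it only shows the symmetry being described.

```python
import math
import statistics


def log_ratios(ratios):
    """Natural log of each ratio."""
    return [math.log(r) for r in ratios]


if __name__ == "__main__":
    # 1.0 sits halfway between 0.5 and 2.0 once you take logs.
    print(math.log(2.0) - math.log(1.0), math.log(1.0) - math.log(0.5))

    gb_fb = [0.6, 0.9, 1.1, 1.5, 2.0]   # made-up GB/FB ratios
    fb_gb = [1.0 / r for r in gb_fb]    # the same pitchers as FB/GB

    # The mean flips sign and the variance is unchanged.
    print(statistics.mean(log_ratios(gb_fb)), statistics.mean(log_ratios(fb_gb)))
    print(statistics.pvariance(log_ratios(gb_fb)), statistics.pvariance(log_ratios(fb_gb)))
```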


#10    Davor    2009/09/18 (Fri) @ 02:23

While LD% is extremely important for a pitcher’s success, pitchers who cannot prevent line drives don’t pitch in the majors. So, pitchers in the majors should be at the very top of line drive prevention, with differences in natural ability small enough that GB/FB isn’t a problem from the LD perspective. The fact that GB/FB isn’t symmetrical is a problem and should be taken into account in formulas - so, no straight linear equations.

