THE BOOK--Playing The Percentages In Baseball


Tuesday, December 19, 2006

Ratios or Rates?

I am trying to convince JC over at Sabernomics that there is a huge difference between using the GB/FB ratio, the FB/GB ratio, and GB/(GB+FB), the GB rate.  Head on over there.  Below is a summary of my posts.


We talked about this at the time. Are you still doing GB to FB *ratios*, or GB rates? Whether you do GB per GB+FB, or FB per GB+FB, you should end up with the same result. Rates do that. Ratios don’t, because they are not symmetrical.

***

As for using the ratio, how do you justify using GB/FB instead of the reverse? What you are saying, by using GB/FB, is that a high GB count is more important than a low FB count. That is, let’s say the GB/FB has a mean of 1.00. A GB/FB of 2.0 is the same as a FB/GB of 0.50. But the first sits 1.00 away from the mean and the second only 0.50 away, so using GB/FB as the ratio has double the impact of FB/GB, even though they are describing the exact same thing.
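To make the asymmetry concrete, here is a minimal Python sketch. It assumes, as in the paragraph above, a GB/FB mean of 1.00 (and so a GB rate centered at 0.50). The function name `deviations` and the two mirror-image hitters are hypothetical, chosen only for illustration.

```python
def deviations(gb, fb):
    """Distance from the assumed mean for each way of expressing GB vs. FB.

    Assumes a mean of 1.00 for both ratios and 0.50 for the GB rate,
    which is an illustrative simplification, not a measured league mean.
    """
    return {
        "gb_fb": gb / fb - 1.0,
        "fb_gb": fb / gb - 1.0,
        "gb_rate": gb / (gb + fb) - 0.5,
    }

# Two hypothetical mirror-image hitters: one hits twice as many GB as FB,
# the other twice as many FB as GB.
ground_ball_hitter = deviations(200, 100)
fly_ball_hitter = deviations(100, 200)

for name, dev in (("GB hitter", ground_ball_hitter), ("FB hitter", fly_ball_hitter)):
    print(name, {k: round(v, 4) for k, v in dev.items()})
```

With GB/FB, the ground-ball hitter sits 1.0 above the mean while his mirror image sits only 0.5 below it. With the GB rate, the two are the same distance from the mean in opposite directions.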

Just because something fits the sample better doesn’t mean that it’s the right thing. A best-fit analysis would put the run value of a double at .66 and a single at .52 (instead of the truer .77 and .47).

***

Ah, but the coefficient will not change accordingly. What will happen is this: now the guys with the highest FB/GB ratio will move *more* than the high GB/FB ratio players.

Think of it in an extreme situation: you have a guy with 100 GB and 1 FB. In your current PrOps, this guy has a 100.00 value, which you multiply by some coefficient, say “.002”. So, he moves +.20 points up. If on the other hand you used the FB/GB ratio, your coefficient may be “-.002”, which multiplied by 1/100 (or .01) will be essentially zero.
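The arithmetic of that extreme case can be sketched in a few lines of Python. The coefficients .002 and -.002 are the illustrative values from the paragraph above, not fitted PrOps coefficients, and the mirror-image player (1 GB, 100 FB) is added here only to show the other side of the bias.

```python
# Illustrative coefficients from the text, not actual regression output.
COEF_GB_FB = 0.002
COEF_FB_GB = -0.002

def shifts(gb, fb):
    """Return the adjustment under the GB/FB version and the FB/GB version."""
    return COEF_GB_FB * (gb / fb), COEF_FB_GB * (fb / gb)

# The extreme ground-ball hitter from the text: 100 GB, 1 FB.
gb_guy_gbfb, gb_guy_fbgb = shifts(100, 1)

# Hypothetical mirror image: 1 GB, 100 FB.
fb_guy_gbfb, fb_guy_fbgb = shifts(1, 100)

print(f"GB hitter: GB/FB version {gb_guy_gbfb:+.5f}, FB/GB version {gb_guy_fbgb:+.5f}")
print(f"FB hitter: GB/FB version {fb_guy_gbfb:+.5f}, FB/GB version {fb_guy_fbgb:+.5f}")
```

Under the GB/FB version only the ground-ball hitter moves by a meaningful amount. Under the FB/GB version only the fly-ball hitter does. Which extreme gets the large adjustment depends entirely on which way the ratio is written.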

From where I sit, using GB/FB taints your process: a high GB count ends up with more impact than an equally high FB count.

If you create a FB/GB version of PrOps, show your results both ways (old PrOps, new PrOps) for Frank Thomas and Derek Jeter, and you will see the impact of this bias.

***

I just ran three different regressions, using the GB/FB ratio, the FB/GB ratio, and GB/(GB+FB), the GB rate.  Each was run against GPA from the THT site.  (The use of GPA, or OPS, etc., doesn’t really matter.) I used 2004-2006 data for all players with at least 502 PA.

The 2006 Frank Thomas is the most extreme, with a FB/GB of 2.44.  The three regressions (GB/FB, FB/GB, GB rate) estimate his GPA at: .287, .313, .298.

At the other end is the 2004 Ichiro, with a GB/FB of 3.55.  His estimates, in the same order, are: .247, .261, .255.

The sample standard deviations are: .0057, .0072, .0071.

In all cases, the mean was .276.

GPA is analogous to batting average.  Those are some HUGE differences, don’t you think? 

***

The correlation coefficients (r) were .21, .26, .26. And, it should go without saying, using FB/(GB+FB) produced the exact same estimated GPA for each player as the GB rate, as well as the exact same r.
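Why the two rates must agree while the two ratios need not can be checked with a small Python sketch. The rows below are made-up (GB, FB, outcome) numbers, not the 2004-2006 data used above, and `ols_fit` is a hypothetical helper doing plain one-variable least squares. The FB rate is just 1 minus the GB rate, a linear transformation, so the fitted values cannot change. FB/GB is the reciprocal of GB/FB, which is not linear, so the fitted values do change.

```python
def ols_fit(x, y):
    """One-variable least squares; returns the fitted value for each x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi for xi in x]

# Made-up rows for illustration only: (GB, FB, GPA-like outcome).
rows = [
    (200, 100, 0.250),
    (150, 150, 0.275),
    (100, 200, 0.300),
    (180, 140, 0.262),
    (120, 160, 0.281),
    (90, 220, 0.310),
]
outcome = [y for _, _, y in rows]

fit_gb_rate = ols_fit([g / (g + f) for g, f, _ in rows], outcome)
fit_fb_rate = ols_fit([f / (g + f) for g, f, _ in rows], outcome)
fit_gb_fb = ols_fit([g / f for g, f, _ in rows], outcome)
fit_fb_gb = ols_fit([f / g for g, f, _ in rows], outcome)

def same(a, b, tol=1e-9):
    """True when two lists of fitted values agree within tol."""
    return all(abs(ai - bi) < tol for ai, bi in zip(a, b))

same_rates = same(fit_gb_rate, fit_fb_rate)
same_ratios = same(fit_gb_fb, fit_fb_gb)
print("GB rate vs FB rate fits identical:", same_rates)
print("GB/FB vs FB/GB fits identical:", same_ratios)
```

The correlation between the outcome and the FB rate has the same magnitude as the one with the GB rate, with the sign flipped, which is why the fit is unchanged.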

(30) Comments • 2007/05/20 • Sabermetrics • Statistical_Theory