THE BOOK--Playing The Percentages In Baseball


Thursday, March 08, 2007

The Four Horsemen

By Tangotiger, 12:17 PM

Studes follows the Voros approach in describing some players.  It is in fact Voros’ approach that allowed me to create aging charts.  (See Legend at the bottom)

As you can see, each rate describes something specific.

Now, there’s no reason that you must look at things this way.  It assumes a certain independence that perhaps is not warranted.  You could, for example, look at things in other ways.  Rather than removing HBP from the denominator first, then the BB, then the K, then the HR, you can remove all four right away.
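To make the "remove one event at a time" idea concrete, here is a minimal Python sketch of the sequential denominators described above. The function name and the sample stat line are made up for illustration; they are not from Voros or from any real player.

```python
def sequential_rates(pa, hbp, bb, k, hr):
    """Rate of each event over the PA left after removing the earlier events.

    Order follows the post: HBP first, then BB, then K, then HR.
    """
    rates = {}
    remaining = pa
    for name, count in (("HBP", hbp), ("BB", bb), ("K", k), ("HR", hr)):
        # Denominator is whatever has not been removed yet.
        rates[name] = count / remaining if remaining else 0.0
        remaining -= count
    # What is left over are the balls the fielders can make a play on.
    rates["fielded_balls"] = remaining
    return rates


# Hypothetical stat line, for illustration only.
line = sequential_rates(pa=600, hbp=5, bb=60, k=100, hr=25)
for name, value in line.items():
    print(name, round(value, 3) if isinstance(value, float) else value)
```

Here BB is measured per (PA - HBP), K per (PA - HBP - BB), and so on, so each rate describes one specific thing.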

So…


You have your PA split by HBP, BB, K, HR in one group (the unfielded balls), then fielded balls in another.

From the unfielded balls, you can create a rate of HR per unfielded ball.  Then, remove the HR, and you can get a rate of K per uncontacted ball.  Then, with the K removed, walks per unswung ball.

For the fielded balls, you would do H+RBOE per fielded ball.  Then 2B+3B per safe-fielded ball, and 3B per extrabase-safe-fielded ball.
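The two-pool version can be sketched the same way. This is only one reading of the breakdown above: it assumes the H in "H+RBOE" means non-HR hits (singles, doubles, triples), and the function name, argument names, and sample numbers are all hypothetical.

```python
def ratio(num, den):
    """Return num/den, or 0.0 when the denominator is empty."""
    return num / den if den else 0.0


def pooled_rates(hbp, bb, k, hr, singles, doubles, triples, rboe, fielded_outs):
    """Split PA into unfielded and fielded pools, then nest rates inside each."""
    unfielded = hbp + bb + k + hr
    safe = singles + doubles + triples + rboe   # H + RBOE, HR excluded (assumption)
    fielded = safe + fielded_outs
    extra_base = doubles + triples
    return {
        # Unfielded pool: peel off HR, then K, leaving BB vs. HBP.
        "HR_per_unfielded": ratio(hr, unfielded),
        "K_per_uncontacted": ratio(k, k + bb + hbp),
        "BB_per_unswung": ratio(bb, bb + hbp),
        # Fielded pool: safe or not, then extra bases, then triples.
        "safe_per_fielded": ratio(safe, fielded),
        "XB_per_safe": ratio(extra_base, safe),
        "3B_per_XB": ratio(triples, extra_base),
    }


# Hypothetical stat line, for illustration only.
profile = pooled_rates(hbp=5, bb=60, k=100, hr=25,
                       singles=90, doubles=30, triples=5, rboe=5,
                       fielded_outs=280)
for name, value in profile.items():
    print(name, round(value, 3))
```

Swapping the order in which events are peeled off only means changing which counts go into each denominator.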

Or, you can first break up the fielded balls into GB or air ball.  In the air ball, break them up into pop, liner, fly.  For each of those, follow the pattern in the previous paragraphs for hits.

Or, you can remove K, BB, HBP first, and then remove HBP from that group, as HBP per non-contacted ball.  Then do BB per K + BB.  Perhaps this approach makes more sense for pitchers, and the other approach makes more sense for hitters.

The key, as Voros showed us when he introduced DIPS, is to keep things as independent as possible.  But, just because we do this, doesn’t mean it’s right.  But, it sure seems right.  And, at the very least, it creates a profile of a player, one that can be used for comparison purposes.

And it was this kind of approach that allowed me to identify comparable players in this old article.

#1    studes      (see all posts) 2007/03/08 (Thu) @ 19:33

Hey Tango, I like the idea of similarity “scores” based on the “four horsemen.” Mind if I use that for an article someday?


#2    tangotiger      (see all posts) 2007/03/08 (Thu) @ 20:27

You actually have used something like that already!

In one of the THT annuals, you looked for sim scores based on the batted ball data.  Remember?  You simply have to extend the exact line of reasoning to the other rate stats.  Just figure the z-score for each stat.  It’s brilliant in its simplicity.

On top of which, like I did for the Speedster article, you can “reverse” the sign on one category to find the most similar in all categories, and most dissimilar in the other.
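A rough Python sketch of that idea, under some assumptions: the comment only says to figure the z-score for each stat, so combining the z-scores into a Euclidean distance is my choice here, not necessarily what either article did. The player names and rates are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(players):
    """players: name -> {stat: rate}. Returns the same shape, as z-scores."""
    stats = next(iter(players.values())).keys()
    out = {name: {} for name in players}
    for s in stats:
        values = [p[s] for p in players.values()]
        mu, sd = mean(values), pstdev(values)
        for name, p in players.items():
            out[name][s] = (p[s] - mu) / sd if sd else 0.0
    return out


def distance(za, zb, reverse=()):
    """Euclidean distance in z-space (an assumed metric).

    For any stat in `reverse`, the target's sign is flipped, so the closest
    match is similar everywhere else and opposite in that stat.
    """
    total = 0.0
    for s in za:
        target = -za[s] if s in reverse else za[s]
        total += (target - zb[s]) ** 2
    return sqrt(total)


def most_similar(players, name, reverse=()):
    z = z_scores(players)
    others = [n for n in players if n != name]
    return min(others, key=lambda n: distance(z[name], z[n], reverse))


# Invented rates, for illustration only.
players = {
    "A": {"BB": 0.10, "K": 0.20, "HR": 0.05},
    "B": {"BB": 0.11, "K": 0.21, "HR": 0.05},
    "C": {"BB": 0.05, "K": 0.10, "HR": 0.01},
    "D": {"BB": 0.10, "K": 0.20, "HR": 0.01},
}
print(most_similar(players, "A"))                   # closest overall
print(most_similar(players, "A", reverse=("HR",)))  # same BB and K, opposite HR
```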

Neat, right?


#3    David Smyth      (see all posts) 2007/03/08 (Thu) @ 20:42

I don’t really buy this line of approach, with these ‘made-up’ denominators. Batters simply produce a certain frequency of outcomes, per PA. There are all sorts of interactions between the outcomes, and the ‘average’ interaction profile is simply not strongly adaptable to individuals. If two players have the same HR per batted ball, but a 15% difference in the frequency of batted balls per PA, what are we really gaining by analyzing it in this way?


#4    Guy      (see all posts) 2007/03/09 (Fri) @ 10:27

David:
I think you would agree that the utility of measuring BABIP separately from the impact of Ks on BA, as in DIPS or FIP, is pretty clear.  Beyond that, this approach is useful for trying to see if pitchers tend to “package” certain skills, i.e. measuring correlation.  If we want to know whether high-K pitchers give up fewer HRs, we want to know how often they give up HRs APART from striking out a lot of hitters.  However, I guess I can see a case for using BB/PA, K/PA, HR/PA and BABIP as the four “foundation” rates.


#5    tangotiger      (see all posts) 2007/03/09 (Fri) @ 11:12

On the pitcher’s side, BB, K and HR are highly correlated, and we all agree that there is value in BABIP.  So, I don’t really see the issue in at least creating two pools: non-fieldable PA and fieldable PA.  The RJs of the world have something like 60-65% fieldable PA, while the Radkes of the world have over 80% fieldable PA.  From there, it seems somewhat logical to continue the breakdown, by separating the HR from BB, K, HBP.

I agree with David’s basic sentiment that before we go ahead and do all this, we should have a somewhat strong foundation for doing so.  Otherwise, we are really just creating a model to make our lives easier, rather than actually modeling reality.

But, hey, the idea is out there. Hopefully it spurs the bright minds out there to tackle the issue.  Until then, I won’t be forcefully fighting that it does make sense, just prima facie.

