THE BOOK--Playing The Percentages In Baseball


Thursday, April 10, 2008

Batted Ball DIPS

By Tangotiger, 12:57 PM

Voros got the ball rolling with DIPS.  You can see what he’s doing, step-by-step: essentially recreating a pitching line, presuming all batted balls in the park resulted in the same percentage of hits.  And you end up with a dERA.  (This has a high correlation with FIP, which is why I prefer FIP.) Then The Hardball Times, over the past few years, has been breaking down batted balls by FB, GB, and LD, so that each type has its own hit and out rate, or basically its own run value, culminating in a fantastic resource, and indeed, a staple of the THT Annuals.
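
As a rough illustration of the "same percentage of hits" step, here is a minimal sketch. It is not Voros's full method, which has more steps; the function name, the argument names, and the league rate passed in are assumptions made for this example.

```python
def dips_expected_hits(batters_faced, strikeouts, walks, hit_by_pitch,
                       home_runs, league_babip):
    """Recast a pitcher's hits allowed, assuming every ball in play
    becomes a hit at the same league-wide rate (league_babip)."""
    # Balls in play: plate appearances that did not end in a K, BB, HBP or HR.
    balls_in_play = (batters_faced - strikeouts - walks
                     - hit_by_pitch - home_runs)
    # The pitcher's own hit rate on balls in play is ignored on purpose.
    expected_hits_in_play = balls_in_play * league_babip
    # Home runs are not fielded, so the pitcher's actual total is kept.
    return expected_hits_in_play + home_runs


# Made-up pitching line and an illustrative league rate of .300.
print(dips_expected_hits(800, 150, 50, 5, 20, 0.300))
```

The recast hit total would then feed whatever run estimator you like to arrive at a DIPS-style ERA.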

Essentially, a LD has the same run value as a walk, an infield fly has the same run value as a K.  You could throw those into the FIP equation, if you like.  Now, completing the circle, Graham basically follows the Voros method of recasting a pitching line but looking at the batted ball type.  I’d like to call it DIPS 3.0, but Gassko used the name three years ago, and it looks like he did something very similar to Graham.  I am glad that someone as seemingly resourceful as Graham (I don’t think I know him) is in our midst.
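
To make the "throw those into the FIP equation" idea concrete, here is a hedged sketch. The first function is the commonly cited form of FIP; the second simply adds line drives at the walk weight and infield flies at the strikeout weight, as the paragraph above suggests. The second function is an illustration of that suggestion, not an established stat, and the default constant of 3.2 is a placeholder: the additive constant is league-dependent and would need to be re-derived for the extended version.

```python
def fip(hr, bb, hbp, k, ip, constant=3.2):
    """Commonly cited FIP: weights of 13 (HR), 3 (BB, HBP), -2 (K) per inning."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def fip_with_batted_balls(hr, bb, hbp, k, ld, iffb, ip, constant=3.2):
    """Illustrative extension: line drives count like walks,
    infield flies count like strikeouts. The constant is an assumption."""
    return (13 * hr + 3 * (bb + hbp + ld) - 2 * (k + iffb)) / ip + constant


# Made-up pitching line, for illustration only.
print(fip(hr=20, bb=50, hbp=5, k=150, ip=200))
print(fip_with_batted_balls(hr=20, bb=50, hbp=5, k=150, ld=100, iffb=30, ip=200))
```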

I’m having a hard time keeping all these different flavors of Batted Ball Run equations straight.  I hope now you guys are in the same boat as I am.  Maybe Graham can write something up for Hardball Times (without those Greek letters).


#1    David Cameron      (see all posts) 2008/04/10 (Thu) @ 15:06

Graham (last name MacAree) is the resident British genius of the Mariner blogosphere.  He’s been posting at LL for a while, and he’s a moderator at USSM as well.  He’s also quite snarky.  He’s kind of like a European mgl.


#2    studes      (see all posts) 2008/04/10 (Thu) @ 15:46

I just can’t tell what’s different between what he’s done and what David already did.  Maybe David and Graham can put their heads together to figure it out.


#3    Graham MacAree      (see all posts) 2008/04/11 (Fri) @ 03:38

Hi everyone, thank you for the interest. As Dave Cameron has said, my name’s Graham MacAree and I’m the M’s blogosphere’s resident snarky Englishman.

There are a few key differences between my stuff and Gassko’s…

1) I use out values per ball type as well as run values to generate an expected outs profile, which gets around the nasty dependency of having to use innings pitched rather than batters faced. (this is the biggest one).

2) Park effects and regression.

3) I don’t assume HR/FB is static across all pitchers.

4) Presentation and generation of player cards, etc. I’ve tried to make everything accessible and stuffed with information that is generally unavailable to the public (i.e. me). I don’t know if anyone’s keeping up with my diaries on LL, but I’m just releasing player cards now.

{Editor’s note: link was being treated as spam, and removed}

5) Mine is probably riddled with typos on the input data (since they’re all entered by hand, which was fun), and probably uses quite questionable statistics on the regression analysis. Matthew Carruth and I are working on regression based on his work with pitch-by-pitch, as well as automating data collection. Because I’m sick of going to Fangraphs and typing all the data in by hand.

And yeah, I think that’s it. There are a few differences in methodology, some of which are quite significant, but the main difference is that DIPS 3.0 seems to have fallen by the wayside (I certainly wasn’t aware of it when I first started work on my stuff in early 2007) and tRA is still being worked on, and is currently proving very popular amongst M’s blogodenizens.

Forgive me if that was a trifle incoherent, I’ve just got out of bed and have only had one cup of tea so far…
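
For readers trying to picture point 1 in the list above (out values alongside run values, so that expected outs replace innings pitched), here is a minimal sketch. It is an assumption about the general shape of such a calculation, not the actual tRA implementation. The function name is hypothetical, run values are assumed to be absolute expected runs per event, and the numbers in the example are toy values invented purely for illustration.

```python
def expected_run_average(counts, run_values, out_values, outs_per_game=27):
    """Expected runs per 27 expected outs, built from per-event values.

    counts      maps an event label to how often it happened
    run_values  maps the same labels to expected runs per event
    out_values  maps the same labels to expected outs per event
    """
    expected_runs = sum(n * run_values[event] for event, n in counts.items())
    expected_outs = sum(n * out_values[event] for event, n in counts.items())
    # Dividing by expected outs means innings pitched is never needed.
    return expected_runs / expected_outs * outs_per_game


# Toy numbers, invented for illustration; NOT real run or out values.
toy_counts = {"K": 10, "LD": 5}
toy_runs = {"K": 0.0, "LD": 0.4}
toy_outs = {"K": 1.0, "LD": 0.25}
print(expected_run_average(toy_counts, toy_runs, toy_outs))
```

The denominator depends only on the event mix, so the result does not inherit whatever luck or defense went into the pitcher's actual innings total.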


#4    Tangotiger      (see all posts) 2008/04/11 (Fri) @ 10:55

Graham’s note was marked for moderation, and has been opened up.  He provided a link, but my blog software kept thinking it was spam, so I had to remove it.


#5    Graham MacAree      (see all posts) 2008/04/11 (Fri) @ 10:59

Hmm. That’s very strange, sorry about that.


#6    Colin Wyers      (see all posts) 2008/04/11 (Fri) @ 13:16

Let me see if I’m understanding correctly. Inputs are:

K/BB/HBP/GB/OFB/IFB/HR

If that’s all you need, you should be able to parse that out of the Retrosheet event logs. (I think you can even get pitcher batters faced; I’ll check on that.) Obviously that won’t help for 2008 (or 1999, for that matter) but it’s probably the easiest way to get large amounts of data on the topic.
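
A minimal sketch of that tally, assuming the event logs have already been parsed into (pitcher, outcome) pairs. The pair format, the outcome labels, and the function name are hypothetical; mapping Retrosheet's raw event and batted-ball codes onto these labels is a separate upstream step not shown here.

```python
from collections import Counter

# The seven inputs listed above, used as outcome labels (an assumption).
CATEGORIES = ("K", "BB", "HBP", "GB", "OFB", "IFB", "HR")


def tally_inputs(plate_appearances):
    """Count the inputs and batters faced (BFP) for each pitcher.

    plate_appearances: iterable of (pitcher_id, outcome) pairs, one per
    plate appearance. Outcomes outside CATEGORIES still count toward BFP.
    """
    totals = {}
    for pitcher_id, outcome in plate_appearances:
        line = totals.setdefault(pitcher_id, Counter())
        line["BFP"] += 1
        if outcome in CATEGORIES:
            line[outcome] += 1
    return totals


# Made-up sample; "LD" is not one of the seven inputs, so it only adds to BFP.
sample = [("pitcher_a", "K"), ("pitcher_a", "GB"),
          ("pitcher_a", "LD"), ("pitcher_b", "HR")]
for pitcher, line in tally_inputs(sample).items():
    print(pitcher, dict(line))
```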


#7    Colin Wyers      (see all posts) 2008/04/11 (Fri) @ 13:39

I did some quick spot-checking with B-Ref and I think I have PBF done right.

http://www.editgrid.com/user/cwyers/Batted_Ball_DIPS_Inputs,_2004-2007

Let me know if that’s any help to you.


#8    Graham MacAree      (see all posts) 2008/04/11 (Fri) @ 17:05

Colin, thank you very much for the offer of help, but I don’t think I need to automate the data gathering right now - Matthew and I are working on generating up-to-date tRA and the associated stats by spidering MLB’s Gameday (which we’re then going to try to make available on the net), and he’s also going to be providing me with past seasons by using Retrosheet logs. We’ve already started doing this; it’s now on me to get around to finding a neat way of importing all of these data into my spreadsheet.

I do very much appreciate the effort though.


#9    David Gassko      (see all posts) 2008/04/11 (Fri) @ 22:37

Hey Graham,

I don’t think you’ve read my newer articles on DIPS 3.0/LIPS. Neither uses actual IP, and DIPS 3.0 uses a pitcher’s actual HR/F. Park effects can also be included in those statistics if you want to, as I think Derek Carty (the THT fantasy blogger, who, by the way, uses LIPS all the time) does now or will soon.


#10    Guy      (see all posts) 2008/04/12 (Sat) @ 18:41

David:
Can you post link(s) to your most recent work?  And if it isn’t addressed there, how much do the batted ball variables increase y-t-y correlation, compared to original DIPS?


#11    David Gassko      (see all posts) 2008/04/12 (Sat) @ 19:15

Here you go: http://www.hardballtimes.com/main/article/dips-lips-and-hips/

The key paragraph:

“I take each player’s batted ball line, and based on league averages, I convert it into expected single, doubles, triples, home runs, reached on error, outs, and grounded into double plays. For DIPS 3.0, I use actual home runs instead of expected. For LIPS, I first transform the player’s batted ball line by substituting a league average line drive rate. Then I add in the pitcher’s actual walk, strikeout, and hit batter numbers, and calculate his expected run average (as opposed to earned run average) using BaseRuns.”

No idea as to the answer to your second question, but it probably improves the correlation a little bit.
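
The two core steps in that paragraph (batted-ball line to expected events via league averages, then events to runs via BaseRuns) might be sketched as below. This is an assumption about the general shape, not the article's code. The BaseRuns function is one commonly cited simple version of the formula, which may differ from the version used in the article. The league rates in the example are invented placeholders, and the DIPS 3.0 and LIPS substitutions (actual home runs, league-average line drive rate) are left out.

```python
def expected_events(batted_balls, league_rates):
    """Turn counts per batted-ball type into expected event totals.

    batted_balls  maps a ball type (e.g. "GB") to a count
    league_rates  maps a ball type to {event: league-average probability}
    """
    totals = {}
    for ball_type, count in batted_balls.items():
        for event, rate in league_rates[ball_type].items():
            totals[event] = totals.get(event, 0.0) + count * rate
    return totals


def base_runs(singles, doubles, triples, home_runs, walks, outs):
    """One commonly cited simple BaseRuns: A * B / (B + C) + D."""
    hits = singles + doubles + triples + home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    a = hits + walks - home_runs                      # baserunners
    b = (1.4 * total_bases - 0.6 * hits
         - 3 * home_runs + 0.1 * walks) * 1.02        # advancement
    c = outs                                          # at-bats minus hits
    d = home_runs                                     # runs that always score
    return a * b / (b + c) + d


# Invented placeholder rates, for illustration only.
toy_rates = {"GB": {"1B": 0.2, "OUT": 0.8}}
print(expected_events({"GB": 100}, toy_rates))
print(base_runs(singles=150, doubles=40, triples=5,
                home_runs=20, walks=60, outs=600))
```

A quick sanity check on the BaseRuns piece: with nothing but home runs and outs, the A term is zero, so the estimate is exactly the number of home runs.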


#12    Guy      (see all posts) 2008/04/13 (Sun) @ 08:02

Thanks, David.

Have you become a Wang believer yet?  Or is he the luckiest pitcher in baseball history? :>) Seriously, I wonder if an adjustment for extreme GB pitchers might improve the model, since they will tend to ‘overachieve’ on BABIP....


#13    David Gassko      (see all posts) 2008/04/13 (Sun) @ 15:33

Well, there are a million adjustments that would improve the model, but then we would have ourselves a projection system! The projections I developed for THT, by the way, have Wang posting a 3.75 ERA in ‘08. But a DIPS model should be simple, and if that means it misses on certain players, I’m fine with that.

