
THE BOOK--Playing The Percentages In Baseball


Thursday, January 11, 2007

Quality of Pitching in NL and AL

By Tangotiger, 12:49 PM

John Walsh wrote a great piece about OF arms in the Hardball Times Annual.  It’s one of those great “well-presented, intensely-researched, easy-to-grasp” articles that I like.  You don’t have to get the math-mumbo-jumbo to understand what’s going on.  John is one of my favorite sabermetricians, and for that reason, I hold him to a higher standard.  In his article:


http://www.hardballtimes.com/main/article/americans-defeat-nationals-in-pitchers-duel/

He says:

Another thing to note is a key assumption: this method assumes that the average park in the NL does not favor hitters over the average AL park, and vice versa. Standard park factors compare parks within a league—we have very little information on how AL parks compare to NL parks, simply because of the low number of interleague games. My intuitive feeling is that the assumption is a reasonable one, but it should be kept in mind, nonetheless.

Again, this is a sticking point, and it really sticks out.  And, I’m a stickler.  What John showed, with 62% of dual-leagued players having a higher OPS in the NL, is caused by three things: pitchers, fielders, parks.  Yet, because he assumes the parks are equal, without any justification, he concludes it’s because of the pitchers and fielders.  This may be the case, but this has not been demonstrated.

What bothers me is when he further justified it by citing the “low number” of interleague games.  There have been, I believe, over 2000 interleague games since 1997.  That’s more than enough to ascertain the run-generating effect of a typical AL and NL park.  You still have to be careful of course, with parks being added/removed, and altered.  But, Yankee Stadium and Shea Stadium haven’t changed much, if at all.  There’s no reason that we can’t use all interleague games since 1997.

As well, though a likely minor point, how are AL and NL relievers used?  And what’s the gap between the starter and reliever in each league?  As we know, there is a huge gap between a pitcher performing as a starter and the same pitcher performing as a reliever.  This would need to be addressed.

***

As an aside: the average AL fastball pitcher is 1 to 2 MPH faster than the average NL fastball pitcher.  That is likely fairly good evidence that AL pitchers are better.  Or of course, there is a “gun” bias in AL parks, and we need a “gun” park factor.

#1          (see all posts) 2007/01/11 (Thu) @ 16:55

What do you mean by gun bias?  Employees in AL parks are more speed conscious than NL ones?  Is there actual data on this?


#2    Guy      (see all posts) 2007/01/11 (Thu) @ 17:31

Tango:  Using the data from MGL’s AL-NL articles in THT, I think you can do a comparison of NL and AL parks.  Using 2000-2005 interleague data, I found the difference was tiny, with NL parks adding about 7 runs per season compared to AL parks.  See this thread:  http://www.insidethebook.com/ee/index.php/site/comments/why_did_the_al_cream_the_nl_this_year_in_interleague_play/.  So I don’t think that would change John’s analysis much.  His 60 run estimate is also very close to the estimate I came up with there of an AL pitcher advantage of about 50 runs.


#3    tangotiger      (see all posts) 2007/01/11 (Thu) @ 18:23

Tools: not more speed conscious, but at what point they pick up the ball.  If one park picks up the ball when it leaves the pitcher’s hand, and another one picks up the ball when it crosses the plate, you’ve got a 7-8 MPH difference.  So, every 7-8 feet or so, you are talking about 1 MPH here or there.

If this was random, no one would care.  But, if you have a few AL parks that pick up the ball closer to the pitcher, and a few in NL parks that pick up the ball closer to the batter, then that’s a systematic bias.  And that’s what needs to be corrected.

And, it’s easy enough.  Figure out RJ’s fastball at Yankee Stadium and on the road.  Do the same to Mussina, et al.  Do it to the visiting pitchers.  Voila: a gun park factor, just like you’d have a HR park factor or a LH/RH park factor, or a GB/FB park factor.

***

Guy, is that 7 runs per 162x40 PA (.001 runs per PA)?


#4    Tango      (see all posts) 2007/01/11 (Thu) @ 18:37

The difference was .6 runs per 500 PAs (which was how MGL did his analysis).  That is, a hitter with 500 PAs generated 0.6 more runs in an NL park vs. an AL park.  (Of course, both leagues’ hitters do better in their own league, because you’re capturing homefield advantage). 

MGL’s data was only presented to one decimal and per 500 PAs, so I could be off a little due to rounding, but at least for those 6 years it appears the parks were pretty equivalent.
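For anyone who wants to check the unit conversions in this exchange, here is a small Python sketch.  It only uses figures already quoted in the thread (0.6 runs per 500 PA, 7 runs per season, 162 games at 40 PA per game); the function name is just for illustration.

```python
def runs_per_pa(runs, plate_appearances):
    """Convert a run total over some number of PA into runs per PA."""
    return runs / plate_appearances

# 0.6 runs per 500 PA, the figure derived from MGL's data
per_500 = runs_per_pa(0.6, 500)

# 7 runs per season, using the 162 games x 40 PA per game from comment #3
per_season = runs_per_pa(7, 162 * 40)

print(f"{per_500:.4f} runs per PA from the per-500-PA figure")
print(f"{per_season:.4f} runs per PA from the per-season figure")
```

Both routes land at roughly a thousandth of a run per PA, so the two ways of stating the park difference agree.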


#5    tangotiger      (see all posts) 2007/01/11 (Thu) @ 19:04

I guess that was Guy posting #4.

Ok, so that’s .0012 runs per PA.  1 point of wOBA is .0009 runs per PA.  OPS is 2.2 times wOBA.  So, 1 point of OPS is .002 runs per PA (if I’m doing all this correctly).

So, agreed, less than half a point of OPS difference between the parks.  John was proved right here (though my objections stand with merit).

The other issue is fielders.  Again, are fielders equal or not in both leagues?  Dual-leagued fielders is one way to do this.  Another is to look at the Fans’ Scouting Report.


#6    MGL      (see all posts) 2007/01/11 (Thu) @ 22:45

A technical note:  A lot of people use the wrong “uncertainty calculation” when estimating the uncertainty of a “difference between (or sum of) two sample values” such as NLOPS-ALOPS.

Remember that the standard deviation of the sum of or difference between two values is the square root of the sum of the variances and NOT just the square root of one of the variances and definitely not the standard deviation of the two samples combined (in fact, it is exactly twice that I think).

I am assuming that John did it the right way, but many otherwise competent researchers forget the “rule.”
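A minimal Python sketch of the rule, under assumptions that are not in John's article: two independent samples of equal size with the same per-observation spread.  The numbers (an SD of .100 OPS and 400 observations per sample) are made up.  In that equal-size, equal-variance case the uncertainty on the difference does come out to twice the uncertainty on the combined sample's mean.

```python
import math

def se_of_mean(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def se_of_difference(se_a, se_b):
    """Standard error of (a - b) or (a + b) for independent a and b:
    the square root of the sum of the two variances."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

# Hypothetical inputs, for illustration only.
sd = 0.100   # per-observation SD, in OPS units
n = 400      # observations in each of the two samples

se_nl = se_of_mean(sd, n)
se_al = se_of_mean(sd, n)

se_diff = se_of_difference(se_nl, se_al)   # the right calculation
se_combined = se_of_mean(sd, 2 * n)        # the "two samples combined" mistake

print(f"SE of each sample mean:   {se_nl:.5f}")
print(f"SE of the difference:     {se_diff:.5f}")
print(f"SE of the combined mean:  {se_combined:.5f}")
print(f"ratio diff / combined:    {se_diff / se_combined:.3f}")
```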


#7    John Walsh      (see all posts) 2007/01/12 (Fri) @ 06:10

Tom—Thanks much for the kind words. 

Regarding your point about AL v. NL parks, you’re perfectly right of course.  I tried to make it clear the assumption I was making about the equivalence of NL and AL parks, an assumption I felt intuitively was reasonable.  Although, as you say, I did not present any data to support my view.  I had not seen the thread here on MGL’s articles from last summer, so I was not aware of Guy’s conclusion that the park differences in the two leagues were small. I’m glad my assumption appears to be valid, although this seems like a topic that could use a dedicated study.

As for the starter/reliever issue that is a good point. I agree with you that it’s probably minor, but should be checked.  Another issue that I thought about, but did not study, is that of older players moving preferentially to the AL where they can DH.  (I did not make any age corrections, nor did I exclude DH’s.)

MGL—yes, I calculated the uncertainty on the OPS difference correctly: var(a-b)= var(a)+var(b), where var is the square of the 1 SD uncertainty. Doesn’t everybody do it this way?


#8    Tangotiger      (see all posts) 2007/01/12 (Fri) @ 13:04

John, thanks for stopping by, and taking the critique in the spirit in which it was delivered.

Yes, the aging issue is also interesting.  Because you limited your study to a 3-year gap, that curtails a lot of it.  And, maybe only 60% of players switch to the AL as they get older, as opposed to 50%?  So, the expectation should be a very low change in OPS as a result of the aging issue, in your sample.


#9    Guy      (see all posts) 2007/01/12 (Fri) @ 14:37

In terms of fielding, I think you could do this by looking at DER, excluding PAs by pitchers and DHs.  Calculate these 8 DERs:
INTRA-LG
ALF vs. ALH on Rd (AL parks)
ALF vs. ALH at Hm (AL)
NLF vs. NLH on Rd (NL)
NLF vs. NLH at Hm (NL)
INTER-LG
ALF vs. NLH on Rd (NL Parks)
ALF vs. NLH at Hm (AL)
NLF vs. ALH on Rd (AL)
NLF vs. ALH at Hm (NL)

Weighting each one equally to control for Hm/Rd, hitter quality, and park effects, just calculate an average DER for ALF and for NLF.  Of course, the one thing this doesn’t control for is pitchers.  However, it seems pretty unlikely that there could be anything more than a very small league difference in pitcher BABIP ability, once you control for park and defense (and I believe in BABIP ability as much as anyone!).  So any significant difference in league DER would probably be a real measure of different fielding ability.

That said, my guess is you’d do a lot of work to find no difference (or a tiny difference).
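The equal-weighting step could be sketched like this in Python.  The counts are invented purely to show the mechanics, and DER is taken here in its simplest form (share of balls in play turned into outs), which is an assumption; published DER formulas differ in the details.

```python
from statistics import mean

def der(outs_on_bip, balls_in_play):
    """Simplified DER: fraction of balls in play converted into outs."""
    return outs_on_bip / balls_in_play

def league_der(cells, fielding_league):
    """Equal-weighted average DER over the four cells for one league's fielders.

    cells maps (fielding league, hitting league, venue) -> (outs, balls in play).
    Each cell counts the same regardless of how many balls in play it holds.
    """
    ders = [der(outs, bip)
            for (fld, _hit, _venue), (outs, bip) in cells.items()
            if fld == fielding_league]
    if len(ders) != 4:
        raise ValueError("expected four cells per fielding league")
    return mean(ders)

# Hypothetical counts, for illustration only.
cells = {
    ("AL", "AL", "Rd"): (695, 1000),   # ALF vs. ALH on Rd (AL parks)
    ("AL", "AL", "Hm"): (705, 1000),   # ALF vs. ALH at Hm (AL)
    ("NL", "NL", "Rd"): (694, 1000),   # NLF vs. NLH on Rd (NL)
    ("NL", "NL", "Hm"): (704, 1000),   # NLF vs. NLH at Hm (NL)
    ("AL", "NL", "Rd"): (690, 1000),   # ALF vs. NLH on Rd (NL parks)
    ("AL", "NL", "Hm"): (710, 1000),   # ALF vs. NLH at Hm (AL)
    ("NL", "AL", "Rd"): (689, 1000),   # NLF vs. ALH on Rd (AL)
    ("NL", "AL", "Hm"): (709, 1000),   # NLF vs. ALH at Hm (NL)
}

al_der = league_der(cells, "AL")
nl_der = league_der(cells, "NL")
print(f"ALF DER {al_der:.3f}, NLF DER {nl_der:.3f}, gap {al_der - nl_der:+.3f}")
```

Averaging the cells equally, rather than pooling all balls in play, is what keeps the lopsided intra-league sample from swamping the much smaller interleague one.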

