THE BOOK--Playing The Percentages In Baseball

Wednesday, July 05, 2006

Why did the AL cream the NL this year in interleague play?

By , 09:25 AM

A 3-part article, by MGL, over at The Hardball Times investigating whether the AL is in fact the better league (no duh?), and if so, whether their advantage is in hitting, pitching, or both.


Check out my (MGL) 3-part article at The Hardball Times looking at the difference in talent between the NL and AL this year and over the years of interleague play (1997-2006 - yes it has been that long!).

Part 1

Part 2

Part 3

#1    Guy    2006/07/05 (Wed) @ 12:50

MGL:
Interesting analysis.  I look forward to parts 2 and 3.

Quick question:  in calculating league-specific linear weights for these players, how do you handle the DH/pitcher difference?  I assume pitchers-as-hitters are excluded in NL.  Are DHs included or excluded in AL? (and does it matter?)


#2    MGL    2006/07/05 (Wed) @ 18:19

Yes, I eliminated DH’s and pitcher hitting from the analysis.


#3    tangotiger    2006/07/07 (Fri) @ 07:39

I’ve updated this page with the three links to mgl’s articles.

The Part 3 framework is the one that I prefer to use.  In fact, that’s pretty much the way all league and positional and age adjustments are made.  Take one thing, and put him in two distinct environments, and see how it responds.  The difference in result is attributed to the thing you added.  You take a whole bunch of 23 year olds, and see how they do against the league, and then see how they do at age 24.  That difference in performance is the age (and year, which is usually close to zero) factor.

Same thing with position adjustments.  How does Nomar do at SS?  How does he do at 1B?  How does Erstad do at CF and 1B?  That difference is attributed to the position.  As MGL noted in Part 3, and similarly here with positions: you have a familiarity/experience factor to consider.
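The "same thing, two environments" framework described above can be sketched in a few lines. This is only an illustration: the function name `environment_effect` and the three players' numbers are made up, and the unweighted average is a simplification (a real study would presumably weight each player by playing time).

```python
from statistics import mean

def environment_effect(pairs):
    """Average change for the same players measured in two environments.

    pairs: list of (value_in_env_a, value_in_env_b), one tuple per player.
    Returns mean(b - a), the effect attributed to moving from A to B.
    """
    if not pairs:
        raise ValueError("need at least one matched pair")
    return mean(b - a for a, b in pairs)

# Hypothetical runs above average per 500 PA for the same three players
# at age 23 and again at age 24 (made-up numbers, for illustration only).
age_23_vs_24 = [(-2.0, 0.0), (1.0, 2.5), (4.0, 4.5)]

aging_factor = environment_effect(age_23_vs_24)
print(f"age 23 -> 24 effect: {aging_factor:+.2f} runs per 500 PA")
```

The same function would apply to a position change (SS vs. 1B) or a league change, since only the meaning of the two environments differs.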

Since Part 3 is very overwhelming, I’d like to re-read it a few times, to see if I would do something differently.


#4    Guy    2006/07/07 (Fri) @ 10:00

Part 3 is indeed very interesting, and a bit overwhelming.  But it seems the same data can be organized to support a somewhat different conclusion.  Combining some of MGL’s Hm and Rd results, I get these 2000-2005 matchups, in R per 500 PA/BF:
ALH v. ALP 4.5
ALH v. NLP 4.8
NLH v. ALP 1.8
NLH v. NLP 3.9

With this approach, ALH’s do look stronger, though the gap is not that large and is much larger against ALPs (2.7) than NLPs (.9). NLPs perform worse than ALPs, though the gap comes mainly when facing NLHs (2.1, compared to .3 against ALHs).  Assuming I’ve done these calculations correctly, it would seem to suggest that the AL advantage is nearly as great in pitching as in hitting (though neither is that large).  And what really stands out here is AL pitchers dominating NL hitters. 

Also, if AL and NL pitchers are roughly equal, shouldn’t we expect AL hitters to put up better offensive numbers than NL hitters in intra-league play? However, AL hitters intra-league are +4.5 R/500 PA, while NL hitters intra-lg are +3.9 R—so the edge for AL hitters is just .6/500 or .05 R/G (assuming equally talented pitchers).
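For readers who want to check the arithmetic, the gaps quoted above fall out of the four matchup figures as follows. This is a minimal sketch using only the numbers in this comment; the dictionary layout and variable names are illustrative, not anything from the article.

```python
# Guy's 2000-2005 figures, in runs per 500 PA/BF, keyed by
# (hitters' league, pitchers' league).
runs = {
    ("AL", "AL"): 4.5,
    ("AL", "NL"): 4.8,
    ("NL", "AL"): 1.8,
    ("NL", "NL"): 3.9,
}

# Hitter gap: AL hitters minus NL hitters, holding the pitchers' league fixed.
hitter_gap = {
    p: round(runs[("AL", p)] - runs[("NL", p)], 1) for p in ("AL", "NL")
}

# Pitcher gap: runs against NL pitchers minus runs against AL pitchers,
# holding the hitters' league fixed (positive means AL pitchers did better).
pitcher_gap = {
    h: round(runs[(h, "NL")] - runs[(h, "AL")], 1) for h in ("AL", "NL")
}

# Intra-league comparison: AL hitters vs. AL pitchers against
# NL hitters vs. NL pitchers.
intra_gap = round(runs[("AL", "AL")] - runs[("NL", "NL")], 1)

print("hitter gap by pitchers faced:", hitter_gap)
print("pitcher gap by hitters faced:", pitcher_gap)
print("intra-league hitting gap:", intra_gap)
```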

Perhaps there is more of a balanced AL edge—in both hitting and pitching—than MGL has estimated?  (Or, more likely, MGL will tell me why this is all wrong.)


#5    MGL    2006/07/07 (Fri) @ 17:51

Let me mull over what you did and said.  The numbers and reasoning in part III are pretty straightforward.  I think you are again mixing up pitching with hitting (or thinking that they are interchangeable, which they are not).  As Tom said, once you compare one thing (say NL batters) in two different environments, the difference in results MUST be the difference in the environments.  That is all I am doing with the approach in Part III.  The results are unequivocal, although the interpretation (and possible sample error) are not.  I think you are misinterpreting the results, Guy, but I have to think it over for a while.

OK, I looked over your chart and your logic.  Your chart, first of all, does not adjust for the pool of hitters and pitchers (which are slightly different) in all the matchups.  That is no big deal.  Your numbers for comparing NL hitters with AL hitters are otherwise correct.  One way, it looks like the AL hitters are .9 better and the other way, it looks like they are 2.7 better.  One way is not better than the other way (and the reason there is probably a difference is the “familiarity factor”), so you simply average the differences, which comes out to 1.8 in favor of the AL in hitting.

Here is where you went wrong:  There is no pitching comparison in the numbers you cite!  You can’t compare the pitching with those numbers because the pitchers are not pitching in the same parks!  Take the first two rows, in which you seem to be able to compare NL pitching with AL pitching, since they are both facing AL hitting.  No!  One is in the AL parks and the other is in the NL parks!  That does not tell you anything (unless you knew exactly how to compare the parks).

If you want to compare pitching, go to the next section in the article (Part III) where the pitchers are on the road and are facing the same batters IN THE SAME PARKS. That is the only way to compare the pitchers.

One thing that the article did not print (THT forgot to add it to the article when I edited it) is that if we compare a “core” group of pitchers (neither young nor old) from last year to this year, who did not change leagues, we find that the NL core pitchers “got better” and that the AL core pitchers “got worse,” which strongly suggests that the AL overall pitching indeed got a lot better this year and that the NL overall pitching got worse.  I don’t think this happened from talent migration.  I think it was just a fluke, such that there are lots of good young pitchers in the AL this year and lots of bad young and old pitchers in the NL this year.  In fact, I estimated, in the part of the article that did not get printed, that the pitching advantage was close to .5 runs per game for the AL this year, to go along with the .4 (or so) advantage in hitting, which would explain the AL’s dominance in IL play this year.


#6    Guy    2006/07/07 (Fri) @ 18:40

“There is no pitching comparison in the numbers you cite!  You can’t compare the pitching with those numbers because the pitchers are not pitching in the same parks!”

But the analysis does work if there is no significant aggregate difference btwn AL and NL parks, which is what I think your figures indicate.  The interleague 2000-2005 figures:
          ALH    NLH    Total
AL Pks   +7.3   -1.3   +3.0
NL Pks   +2.2   +4.9   +3.6
It looks like the NL parks increase offense by just .046 R/G, not enough to really invalidate the interpretation of the comparisons I made.
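The per-game figure can be reproduced from the Total column with a per-500-PA to per-game conversion. The thread does not state the conversion, so the 38.3 PA per team-game used below is an assumption, chosen because it reproduces the .046 figure; the function name `per_500_to_per_game` is also just illustrative.

```python
# Assumed plate appearances per team per game; not stated in the thread.
PA_PER_TEAM_GAME = 38.3

def per_500_to_per_game(runs_per_500, pa_per_game=PA_PER_TEAM_GAME):
    """Convert a runs-per-500-PA rate into runs per team-game."""
    return runs_per_500 * pa_per_game / 500

# "Total" column from the interleague 2000-2005 table (runs per 500 PA).
park_totals = {"AL": 3.0, "NL": 3.6}

park_gap_per_500 = park_totals["NL"] - park_totals["AL"]
park_gap_per_game = per_500_to_per_game(park_gap_per_500)

print(f"NL parks vs. AL parks: {park_gap_per_500:+.1f} R per 500 PA")
print(f"NL parks vs. AL parks: {park_gap_per_game:+.3f} R/G")
```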

And if the parks are comparable, then it becomes hard to sustain the idea that AL hitters--but not pitchers--are superior, given the rough parity in intra-league hitting (AL hitters intra-league are +4.5 R/500 PA; NL hitters are +3.9 R).

The “core pitcher” analysis sounds interesting. 
The idea that young AL pitchers (and bad NL pitchers) have created an AL pitching advantage seems plausible.


#7    MGL    2006/07/07 (Fri) @ 18:58

Yes, as I said, if the parks are equivalent, then you can compare pitching as well both ways.  But even if you did, one (the hitting comparison) has nothing to do with the other (the pitching comparison).  Plus we really have no idea that the parks are equivalent.  There is too much random variance in the numbers you cite.  Why not just compare pitchers using the same batters in the same parks.  The less variables we have to control for, the better, right?  Anyway, as I said in the article, there is some evidence that the AL pitching has been a little better over the many years of IL play, along with the hitting.  However, one or the other is probably somewhat incorrect, as the IL win/loss records suggest more parity.

The whole analysis is a little tricky, but it does allow us to make some inferences I think.


#8    Guy    2006/07/07 (Fri) @ 20:08

“Plus we really have no idea that the parks are equivalent.  There is too much random variance in the numbers you cite.”

We use park factors all the time that are based on 3 or 5 seasons (240 or 400 games).  Here we’re essentially calculating an aggregate “NL (or AL) park factor”, and we have about 1500 games 2000-2005, 750 in each league.  Seems like a pretty good sample (though it is true that the parks are not identical in all seasons).

“Why not just compare pitchers using the same batters in the same parks.  The less variables we have to control for, the better, right?”

In general, yes, but the price of the controls is using only interleague games, which greatly reduces sample size.  In one respect, interleague games are the best data to answer this question: you have the two leagues facing each other on the same field.  But it’s also limited, in size and because every game involves one team being forced to play a different game than it plays 90% of the time.

“The whole analysis is a little tricky”

Definitely!


#9    Guy    2006/07/08 (Sat) @ 06:38

To follow up, if you look at 2005 using the combined Hm/Rd and intra/inter-lg approach, you get something like this (some rounding errors, I’m sure):
ALH v ALP +2.2*
ALH v NLP +6.4
NLH v ALP -.3
NLH v NLP +3.5*
(*includes park adjustment of +.3 in AL pks, -.3 NL pks)
Then:
ALHs are +2.7 compared to NLHs (avg of 2.5, 2.9)
ALPs are +4.0 compared to NLPs (avg of 4.2, 3.8)
Total 2005 AL advantage:  +6.7, or .51 R/G (slightly higher than your estimate). This would indicate that about 60% of AL edge is pitching, as opposed to 0% using intra-league data only.
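The 2005 breakdown above can be reproduced from the four matchup figures. This is a sketch of the averaging only, using the numbers as quoted (park adjustment already included); the 38.3 PA per team-game used for the runs-per-game conversion is an assumption, since the thread does not state one.

```python
from statistics import mean

# 2005 figures, in runs per 500 PA, keyed by
# (hitters' league, pitchers' league).
runs_2005 = {
    ("AL", "AL"): 2.2,
    ("AL", "NL"): 6.4,
    ("NL", "AL"): -0.3,
    ("NL", "NL"): 3.5,
}

# AL hitters minus NL hitters against the same pitching (2.5 and 2.9).
hitter_edges = [
    runs_2005[("AL", p)] - runs_2005[("NL", p)] for p in ("AL", "NL")
]
# Runs against NL pitchers minus AL pitchers, same hitters (4.2 and 3.8).
pitcher_edges = [
    runs_2005[(h, "NL")] - runs_2005[(h, "AL")] for h in ("AL", "NL")
]

hitting_edge = mean(hitter_edges)
pitching_edge = mean(pitcher_edges)
total_edge = hitting_edge + pitching_edge
pitching_share = pitching_edge / total_edge

PA_PER_TEAM_GAME = 38.3  # assumed conversion factor, not from the thread
total_edge_per_game = total_edge * PA_PER_TEAM_GAME / 500

print(f"hitting {hitting_edge:+.1f}, pitching {pitching_edge:+.1f}, "
      f"total {total_edge:+.1f} per 500 PA")
print(f"pitching share of AL edge: {pitching_share:.0%}")
print(f"total AL edge: {total_edge_per_game:.2f} R/G")
```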

One advantage of this approach is it should provide more stable estimates for individual seasons.


#10    MGL    2006/07/11 (Tue) @ 17:27

A reader of the THT article wrote to ask:

Mitchel,

Hi there.  I just wanted to tell you how much I enjoyed your article over at the Hardball Times.  But I wonder if you took into account the fact that the NL has two more teams than the AL.  I think the inclusion of an additional 50 borderline major leaguers would have to dilute the pool of talent in the NL, and might account for a large portion of the difference between the two leagues.

Obviously, this wouldn’t change your conclusion that the AL is a tougher league than the NL, but I wonder what the results would be if you eliminated the worst 50 players in the NL?

Anyway, keep up the good work.

yrs,

Jeff Mathews

To which I responded:

Jeff,

Thanks for the question and comments Jeff.

The fact that one league has more teams than the other should have no effect on the overall talent in each league since teams and players are free to go to whichever team in whichever league they want (more or less).  Imagine that one league had one team and the other league had 29 teams.  Would the league with one team be the best league?  Why?

What about the divisions?  Are divisions with fewer teams better than divisions with more teams?  Again, why would this be, and how can this be, when we can set the number of teams in each division to whatever we want and whenever we want?

The pool of talent available to the major leagues is split among all major league teams, presumably evenly; it doesn’t matter how many teams are in each league.  Of course, in reality, the teams that spend the most (assuming equally talented front offices), not including random, year-to-year fluctuations, should have the most talent. 

One reason why the AL is the better league is that they spend more per team.  In fact, that may be the only, or at least, the primary, reason.

BTW, your proposition would be true if we started out with 14 teams in each league and then added 2 teams to the NL and did not have an expansion draft.  Even with an expansion draft, the league with the 2 extra teams would in fact be weaker, at least for a while.  Eventually though, the leagues would even out in talent, no matter how many teams were in each league.  The last year of expansion, 1998, one team was added to each league, so we would not have expected to see either league get weaker with respect to the other league. In 1993, when 2 teams were added to the NL, we definitely would have expected the NL to get weaker overall, however, only for a while.  Eventually the talent would even out again.

Mitchel


#11    Joe Arthur    2006/07/13 (Thu) @ 01:29

sorry to be late to the party ...

Mickey, you may have accounted for these factors in the HBT series and just not said so to keep the total discussion to a manageable size (or I may just have missed the mentions).

If I’m interpreting the argument correctly (a big IF) I surmise a few factors (first one probably extremely mild) which ought to soften the measured advantage for the AL at least on the hitters’ side. All are connected to the different strategies of the DH-less league.

1) It seems plausible that NL reserves need to be more fungible defensively so that double-switches don’t end up too costly on defense. Imagine two players, one who is -7 runs offensively and -3 runs defensively, the other who is -11 runs offensively and +1 defensively. Both are -10 runs total, and in that sense equally talented, equally capable of winning games. The 2nd player ought to be more attractive to an NL team because he ought to be able to fill in over a wider range of the defensive spectrum, and increase the options for a double-switch and how long teams can delay a pitcher coming to bat again or making another substitution. The team would get back on defense the value it gave up on offense, plus have extra flexibility, for whatever that’s worth. If this tradeoff actually is made consistently in putting together NL rosters, looking at offense only to compare players would bias toward the AL (they’d have a higher offensive talent and baseline even though the players’ total value was equal).

2) NL teams pinch hit a lot more (4.1% of PA vs 1.6% for AL 2000-2005), and pinch hitting seems to be ‘harder’ (as concluded in “The Book” pp.111-3: 34 points of wOBA). If you don’t remove pinch hitting, this would depress the baseline offensive performance per 500 PA in the NL, and make regulars in the NL who never pinch hit look relatively better. (non-regular players presumably get a greater proportion of these PA as PH in the NL.) To restate that, strategic need to use pinch hitters distorts how well the league’s baseline reflects the league’s true offensive talent.
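A rough sense of the size of this pinch-hitting effect can be had from the figures quoted above. The sketch below assumes the 34-point wOBA penalty applies uniformly to every pinch-hit PA and that pinch hitters are otherwise average hitters; both are simplifying assumptions, not claims from The Book or the article.

```python
# Pinch-hitting penalty cited from The Book, in wOBA (34 points).
PH_PENALTY_WOBA = 0.034

# Share of plate appearances taken by pinch hitters, 2000-2005.
ph_share = {"NL": 0.041, "AL": 0.016}

# Drag on each league's baseline wOBA if every pinch-hit PA carries
# the full penalty (simplifying assumption).
baseline_drag = {lg: share * PH_PENALTY_WOBA for lg, share in ph_share.items()}

# Extra drag on the NL baseline relative to the AL baseline.
extra_nl_drag = baseline_drag["NL"] - baseline_drag["AL"]

for lg, drag in baseline_drag.items():
    print(f"{lg} baseline lowered by {drag * 1000:.2f} points of wOBA")
print(f"NL minus AL: {extra_nl_drag * 1000:.2f} points of wOBA")
```

Under these assumptions the distortion is on the order of a point of wOBA, which is consistent with treating it as one of the milder factors.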

3) In a like vein, collectively AL hitters seem to get the platoon advantage more often than NL hitters (56.3% vs 51.9% 2000-2005, though 2004 and especially 2005 show the NL getting closer to the AL rate), perhaps because NL teams are sometimes ‘forced’ to pinch hit without the platoon advantage. This imbalance in platoon advantage would raise the offensive baseline in the AL, as a function of strategy, not talent. Note, however, that the platoon percentages are derived from the Retrosheet league split pages; pitcher (and DH) batting are not removed there, as you do remove them in your establishment of baselines. Perhaps the platoon discrepancies would turn out to vanish if those adjustments were made…

Finally, another (more laborious and indirect) way to a get a view of the relative talent in the AL and NL would be to look at triple-A players. Here we have NL organization players competing directly with AL-organization players, with a large sample size. There is also substantial movement between AAA and the majors, so there’d be empirical basis for comparing up to the majors ... Adding this to the overall analysis would involve more assumptions, even more adjustments, certainly a lot more work, but the reward would be probably much bigger sample sizes.

