THE BOOK--Playing The Percentages In Baseball


Saturday, October 25, 2008

Is the AL still better than the NL, and if yes, is it pitching, hitting, or both?

By , 11:47 PM

I wrote an article two years ago that detailed how we can determine which league is better in hitting, pitching, and overall (of course defense and baserunning should be included as well).  Basically, there are two ways. 

One, we can look at how players do when they switch leagues from one year to the next compared to players who don’t switch leagues.
For example, let’s say there is a league-average batter (league-normalized linear weights of 1.00) in the NL and he goes to the AL the next year, where his normalized lwts is 1.10.  Meanwhile, a player who stays in the AL has a league-average lwts in year I and again in year II.  In that case we can assume that the batting is a lot better in the NL (an average batter in the NL becomes a better batter, relative to the league, when he goes to the AL).  That assumes that both players are the same age.  Obviously we can’t tell anything from one or even a few batters, but since there are usually around 30 batters who switch leagues in any given year and they get a total of over 5000 PA (using the lesser of the PA for each player), we have a decent (not great) sample size to work with.

Anyway, the other method is to look at how, for example, NL hitters do against AL pitchers in AL parks (in inter-league games of course), as compared to AL hitters in AL parks in non-inter-league games.  Assuming the same pool of pitchers and batters (we can control for that), if the NL batters do better than the AL batters, then we can assume that NL batters are better.  Again, sample size issues are present.

When we do these two kinds of analyses, we have to do it “both ways.” In other words, we have to check NL players who go to the AL and AL players who go to the NL and “split the difference.” The reason is this:  Let’s say that NL pitchers go to the AL and get better.  That suggests that NL pitchers are better.  But what if the reason they get better is that AL batters are not familiar with them?  In other words, even if the league quality were equal, ANY pitcher who switches leagues would get “better.”

Well, if NL pitchers going to the AL get 10% better and AL pitchers going to the NL also get 10% better, obviously we can’t assume that one league is better than the other (they are about equal).  But if an NL pitcher goes to the AL and gets 10% better and an AL pitcher goes to the NL and remains the same, then if we “split the difference,” we can assume that NL pitchers are 5% better than AL pitchers.  The NL pitcher “improves” by 5% because of talent disparity between the two leagues, and then improves another 5% because he has never been seen in his new league.  The AL pitcher who goes to the NL improves by 5% because he has never been seen in the NL, but gets worse by 5% because the NL is the better pitching league, for a net gain of zero.
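The “split the difference” arithmetic can be sketched in a few lines of Python. This is only an illustration of the reasoning above: the function and variable names are made up for the example, and it assumes the unfamiliarity boost is the same size in both directions.

```python
def split_the_difference(nl_to_al_gain, al_to_nl_gain):
    """Separate a talent gap from a shared 'unfamiliarity' boost.

    Inputs are the average improvement (in percent, relative to league)
    of pitchers moving in each direction. ASSUMPTION: the unfamiliarity
    boost is identical for both groups.
    """
    # Model: nl_to_al_gain = familiarity + nl_edge
    #        al_to_nl_gain = familiarity - nl_edge
    nl_edge = (nl_to_al_gain - al_to_nl_gain) / 2
    familiarity = (nl_to_al_gain + al_to_nl_gain) / 2
    return nl_edge, familiarity

print(split_the_difference(10, 10))  # both improve equally: no talent gap
print(split_the_difference(10, 0))   # the 10% / 0% example from the text
```

In the second case the result is a 5% NL pitching edge plus a 5% unfamiliarity boost, which is the breakdown described above.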

Using the “switched leagues” method, I am going to recap the numbers from the last few years and then I’ll give you the 2008 numbers (players who switched leagues from 07 to 08).

I’ll do that for hitting and pitching.  Using the numbers from this year, we can get some idea as to whether and by how much the AL still has superiority.

One other piece of data is relevant to that question this year:  In inter-league games this year, the AL won 59.4% of the games, suggesting that it is the much better league.  Then again, at plus or minus 2 SD, that is 53 to 65%.  Last year it was 54% and the year before it was 61%.  Since you would not expect any differences in the leagues to shift dramatically from one year to the next (they could I suppose), that is more evidence that the AL is better this year, as well as during the last few years.
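As a rough check on that plus-or-minus 2 SD range, here is a minimal sketch using the binomial standard deviation. The game count is an assumption (roughly 250 inter-league games); only the 59.4% figure comes from the numbers above.

```python
import math

def win_pct_range(win_pct, games, n_sd=2):
    """Return (low, high): an observed win percentage +/- n_sd binomial SDs."""
    sd = math.sqrt(win_pct * (1 - win_pct) / games)
    return win_pct - n_sd * sd, win_pct + n_sd * sd

# ASSUMPTION: about 250 inter-league games in the season; the exact
# count is not given in the text.
low, high = win_pct_range(0.594, 250)
print(f"{low:.3f} to {high:.3f}")
```

Under that assumption the range comes out at roughly 53% to 66%, in line with the range quoted above.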

Let’s look at some of those “league-switch” numbers.


Let’s start with batters.  I am using batting lwts which are “zero’d out” for each year in each league.  In order to avoid issues of batters in each league having different regression amounts from year I to year II, I “forced” the lwts for batters in both leagues in year I to be equal.  For example, if the batters who moved from the AL to the NL had a lwts in year I (in the AL) of -2.0 (per 150 games) collectively and those from the NL to the AL were -5 in year I, I removed some of the really bad batters in the NL in year I in order to “force” the -5 to -2 (same as the AL to NL batters).  If you don’t follow that, don’t worry about it.
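For anyone who wants to see the “forcing” step spelled out, here is a minimal sketch. The lwts values are hypothetical (not the real data), the function name is made up, and it uses a simple unweighted mean rather than weighting by PA, which is an assumption on my part.

```python
def trim_to_target(lwts_values, target_mean):
    """Drop the worst batters until the group's mean lwts reaches target_mean.

    lwts_values: year-I lwts (runs per 150 games), one value per batter.
    ASSUMPTION: unweighted mean; no PA weighting.
    """
    kept = sorted(lwts_values, reverse=True)  # best first, worst last
    while len(kept) > 1 and sum(kept) / len(kept) < target_mean:
        kept.pop()  # remove the current worst batter
    return kept

# Hypothetical year-I lwts for NL-to-AL switchers; the mean is -5.0
nl_to_al = [12.0, 4.0, 0.0, -3.0, -8.0, -15.0, -25.0]
kept = trim_to_target(nl_to_al, target_mean=-2.0)
print(len(kept), round(sum(kept) / len(kept), 2))
```

Dropping whole players rarely hits the target exactly, so the trimmed group ends up at or slightly above it.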

Batters who switched leagues from 04 to 05:

Switch   N   “min” PA   age   lwts difference between year II and year I
NL to AL   36   8671   31.4   -11.39 runs (they got worse)
AL to NL   34   9768   32.3   +.8 runs (got better)

This implies that the AL is better offensively by around 6 runs per 150 games per player.  That is around .34 rpg.

We also want to look at players who do not change leagues, to see if either league got better or worse relative to itself the year before.

Players who did not switch leagues

League   N   age   lwts difference
NL   205   30.1   -2.61
AL   166   29.7   -2.60

This suggests that neither league changed much from 04 to 05, although maybe the NL got a little worse as we would actually expect these players to lose more in the NL because they are a little older (.4 years).

05 to 06

Switch   N   “min” PA   age   lwts difference between year II and year I
NL to AL   34   8782   30.6   -2.15 runs (they got worse)
AL to NL   24   7813   31.0   +2.36 runs (got better)

This implies that the AL is now 2.26 runs per player per 150 games better, which is around .13 rpg.

What about players who did not switch leagues?  We are hoping that the NL got a little better relative to the AL, since the gap appears to have shrunk since the last year.

League   N   age   lwts difference
NL   230   30.3   -3.10
AL   157   29.6   -1.26

And yes, this does indeed suggest that the NL got a little better and/or the AL got a little worse, hitting-wise, as players from 05 to 06 in the NL got 3.1 runs “worse” and players in the AL got only 1.26 runs “worse.”
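The comparison of players who stayed put boils down to a single subtraction. Here is a minimal sketch with a made-up function name; it ignores the age difference between the two groups, which is a simplifying assumption.

```python
def relative_league_shift(nl_stayers_change, al_stayers_change):
    """Positive: the NL got tougher relative to the AL; negative: the AL did.

    Each input is the average year-to-year lwts change (runs per 150 games)
    of batters who stayed in that league. A bigger drop among stayers
    suggests the league improved around them.
    ASSUMPTION: no adjustment for age differences between the groups.
    """
    return al_stayers_change - nl_stayers_change

print(round(relative_league_shift(-2.61, -2.60), 2))  # 04 to 05
print(round(relative_league_shift(-3.10, -1.26), 2))  # 05 to 06
```

The first value is essentially zero (no relative change from 04 to 05) and the second is clearly positive (the NL closed some of the gap from 05 to 06).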

06 to 07

Switch   N   “min” PA   age   lwts difference between year II and year I
NL to AL   30   6145   30.8   -12.43 runs (they got worse)
AL to NL   16   6116   31.1   -.12 runs (same)

This implies that the AL is now 6.15 runs per player per 150 games better, which is around .35 rpg, similar to 04 to 05.

What about players who did not switch leagues?  We are hoping that the AL got a little better this time, relative to the NL, since the gap between the leagues appears to have gotten bigger since the last year.

League   N   age   lwts difference
NL   233   30.1   -3.43
AL   167   29.6   -2.23

No, this time the NL appears to have gotten even a little better and/or the AL got a little worse, hitting-wise.

We really don’t expect everything to match up, as we are dealing with relatively small sample sizes.

How about this year?

07 to 08

Switch   N   “min” PA   age   lwts difference between year II and year I
NL to AL   27   5649   29.9   +4.64 runs (they got better)
AL to NL   23   5357   32.1   -.57 runs (a little worse)

Wow, this implies that the NL is now the better hitting league by 2.6 runs per 150 games per player, or .15 rpg!

What about players who did not switch leagues?  We are hoping that the NL got a little better, relative to the AL, since the gap between the leagues appears to have reversed.

League   N   age   lwts difference
NL   228   29.7   -1.47
AL   169   29.7   -2.10

Nope.  This suggests that the AL actually got a little better, hitting-wise.

So, here is what the gap looks like, from 05-08, using this “players who switched leagues” methodology:

05 .34 AL

06 .13 AL

07 .35 AL

08 .15 NL
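To make the run-level gaps easy to re-check, here is a minimal sketch that recomputes the “split the difference” figure for each year from the switcher numbers above. The function name is made up for the example.

```python
# (NL-to-AL change, AL-to-NL change) in lwts runs per 150 games, from the post
switchers = {
    "04 to 05": (-11.39, 0.80),
    "05 to 06": (-2.15, 2.36),
    "06 to 07": (-12.43, -0.12),
    "07 to 08": (4.64, -0.57),
}

def al_batting_edge(nl_to_al_change, al_to_nl_change):
    """AL hitting edge in runs per 150 games per player (negative = NL edge)."""
    return (al_to_nl_change - nl_to_al_change) / 2

for years, (nl_to_al, al_to_nl) in switchers.items():
    print(years, round(al_batting_edge(nl_to_al, al_to_nl), 2))
```

The results land near 6.1, 2.3, 6.2 and -2.6 runs, which matches the per-player figures quoted for each year. The conversion from runs per 150 games to rpg is not spelled out above, so it is left out of the sketch.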

I’ll have to look at IL play to see how those numbers would match up to these.  Next post, I’ll give you the pitching numbers.

#1    2008/10/26 (Sun) @ 00:57

Here are the pitching numbers.  I am using normalized component ERA (NERC).

05 to 06

NL to AL

35 30.0 9196 .13 (got worse)

AL to NL

41 32.4 11142 -.22 (got better)

Implies that the AL has better pitching by .175 rp9.

No change of leagues:

NL

208 29.7 .24 (got worse)
AL

140 28.9 .34 (got worse)

Implies little change, maybe the AL got a little better from 05 to 06.

06 to 07

NL to AL

30 31.3 7864 .25 (got worse)

AL to NL

22 30.7 5731 .01 (got a hair worse)

Implies that the AL has better pitching by .12 rp9.

No change of leagues:

NL

227 29.7 .04 (got a little worse or same)

AL

166 28.3 .01 (basically same)

Implies little relative change, maybe both leagues got a little worse, since we expect pitchers to get worse by around .2 each year.

07 to 08

NL to AL

23 30.7 3113 .07 (got a hair worse)

AL to NL

31 28.2 5731 -.41 (got a lot better)

Implies that the AL has better pitching by .24 rp9.

No change of leagues:

NL

213 29.8 .31 (got worse)

AL

175 28.7 .13 (a little worse)

Implies that the NL got tougher, which is the opposite of what we get when we compare the players who switched leagues this year to the players who switched leagues last year.

To recap:

06 .17 AL

07 .12 AL

08 .24 AL
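The pitching gaps use the same “split the difference” idea, but the sign flips because a positive change in an ERA-style stat like NERC means the pitcher got worse. A minimal sketch, with a made-up function name:

```python
def al_pitching_edge(nl_to_al_change, al_to_nl_change):
    """AL pitching edge in runs per 9 innings, from changes in NERC.

    A positive change means the pitcher got worse, so the sign is
    flipped relative to the batting calculation.
    """
    return (nl_to_al_change - al_to_nl_change) / 2

# (NL-to-AL change, AL-to-NL change) in NERC, from the numbers above
for years, changes in {
    "05 to 06": (0.13, -0.22),
    "06 to 07": (0.25, 0.01),
    "07 to 08": (0.07, -0.41),
}.items():
    print(years, round(al_pitching_edge(*changes), 3))
```

These work out to about .175, .12 and .24 rp9, the same figures as in the recap.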


#2    Xeifrank    2008/10/26 (Sun) @ 02:04

2008: NL batters by .15
2008: AL pitchers by .24

Is the conclusion the AL was more talented by .09 runs this year?  Doesn’t seem like much.  Would the rest come from fielding and baserunning?  Or the DH?  .09 doesn’t seem like enough to support the actual difference in performance between the two leagues.  Thanks for running this analysis.
vr, Xei


#3    MGL    2008/10/26 (Sun) @ 03:23

The DH has nothing to do with anything.  The .09 is the “suggestion” from the data.  I am more confident in the .24 (more or less) pitching advantage for the AL than the .15 batting for the NL.  As I said, I want to look at the inter-league data to see how it compares to the “switch leagues” data.  The reason I trust the pitching data more than the hitting data is that the .24 pitching advantage for the AL is in line with what it has been the last 5 years or so.  An NL advantage this year would be quite a turnaround from the last few years.  As I said, an abrupt turnaround from one year to another is unlikely, but possible.

“.09 doesn’t seem like enough to support the actual difference in performance between the two leagues.”

What makes you say that?  How do you know what the “actual difference in performance” is?

From the inter-league games?  As I said, while it was 59% AL, that could easily be 53%.  Well, not easily, but that would be 2 SD less.

Anyway, I do actually agree with you and suspect that the difference is closer to the .24 total and maybe more.  I doubt there is much difference in base running. There could be in defense.  I could check UZR since I baseline everything to the last 5 years, AL and NL combined.  If there is a substantial difference between AL and NL defense, it should show up in the NL and AL UZR.  I’ll check it out.


#4    MGL    2008/10/26 (Sun) @ 03:58

Actually, the defense is already included in the pitching numbers so we don’t have to look at defense.

Another reason I trust the pitching and defense numbers more than hitting, using this analysis is that it is “cleaner.” If the leagues are equal, then an average pitcher in one league should definitely be an average pitcher in the other league as well.

With hitting, this kind of analysis (switching leagues) is messy. In the AL, I measure a batter against all batters including the DH’s.  In the NL, against all batters not including the pitchers.

The AL SHOULD be a better hitting league because of the DH.  So any batter who switches from, say, the AL to the NL, and is now measured against all batters but no DH’s, should probably fare better comparatively.  That probably means that when they play one another in inter-league play in the AL parks, when the NL teams have to put a backup player on the field to cover the DH, they probably have poorer hitting than the AL, overall.  But in the NL parks, when the DH sits or occasionally plays in the field (with probably bad defense), the offense for both the AL and NL may be the same or the NL may even be better (since often the best hitters on the AL teams are on the bench).

So really the best way to look at the differences in hitting is the “inter-league” game method. I’ll do that tomorrow night.

In addition to everything else, it is possible for one league to be better than the other league, but when they play each other in IL games, the relative talent may not manifest itself (like, for example, the AL may have better offense, but in half the IL games, some of that good offense, the DH, is sitting on the bench).


#5    MGL    2008/10/26 (Sun) @ 04:00

Or the NL may have a better (or equal) offense, but in all IL games in the AL parks, they may have to take batters who are normally backups and play them as regulars to cover the DH. That actually may be much of what is going on in terms of the AL superiority even though the NL looks like the better hitting league this year.  It is a tricky issue.


#6    MGL    2008/10/27 (Mon) @ 01:49

OK, I looked at inter-league game results.  Keep in mind that to some extent this is going to parallel inter-league win/loss results, or inter-league pythagorean win/loss results, even though I am looking at component results during inter-league play and not runs scored and runs allowed.

Let’s start with the batters, since I was not happy with the “league switch” results which suggested that all of a sudden the NL got .15 runs better than the AL.  If that were the case, and the relative pitching remained the same, that would suggest that rpg would be substantially higher this year in the NL as compared to the AL.  Although that started out to be the case big time this year (remember that whole “Joe Sheehan” thing about NL and AL run scoring), it did not end up that way.  The AL scored around .4 runs more than the NL this year, about the same as it has been for a while.

Anyway, I looked at two groups of data, as I said I would do:

AL batters versus NL pitchers in NL parks (in inter-league games of course).

They had a lwts of 8.18 runs per 500 PA, after adjusting for the pool of pitchers they faced.

We want to compare that to NL batters facing the same NL pitchers in the same parks (roughly).

So I looked at all NL road stats in NL parks.  They were 4.84 after adjusting for the pitchers they faced.

When I say “adjusted for the pitchers they faced” for both groups, I made sure that I compared each group of batters (AL on the road in NL parks and NL on the road in NL parks) while facing the same pool of pitchers.  In other words, I wanted to make sure that in IL games, there just didn’t happen to be a particularly good or bad crop of pitchers in the NL parks.

AL batter versus NL pitchers in NL parks

8.18

NL batter versus NL pitchers in NL parks

4.84

So it looks like AL batters did a lot better than NL batters facing the same pitchers in the same parks, by around 3.34 runs per 500 PA, or around .24 runs per game.  If AL pitchers were that much better, as it seemed they were from the last (league switch) analysis, that would be a 55.3% advantage for the AL, less than the 59% win rate in IL games this year, but a substantial advantage nonetheless.

But…

Just because the AL batted a lot better than the NL in IL games in NL parks to the tune of .24 rpg (based on underlying component performance), does not necessarily mean that the AL batters were that much better than the NL batters.  After all, AL teams are not using their DH’s in the NL parks, they are shuffling their lineups, etc.

So I adjusted for the hitting pool of the AL batters in the NL parks. It turns out, for some reason, that this group of hitters was a little better than the average hitter in the AL in general.

That reduced the 3.34 runs per 500 PA (in lwts) AL advantage to an AL advantage of 2.45 runs per 500 PA.

But…

The AL batters are hitting in unfamiliar parks facing unfamiliar pitchers, so maybe they should hit a lot worse even though they are better hitters than in the NL (and the NL batters will REALLY hit badly in AL parks).

To see if that is true, we have to look at NL batters in AL parks versus AL batters in AL parks, and do the same pitcher and batter “pool” adjustments.

NL batters in AL parks (in IL games)

-7.17 runs per 500 PA

AL batters in AL parks (in non-IL games)

-3.77 runs per 500 PA

So in this case, the AL has the advantage by 3.40 runs per 500 PA.  Combine that with the results in the NL parks, and we have a:

2.90 runs per 500 PA advantage for the AL hitters, or .22 rpg.

If we average that with the .15 advantage we came up with for the NL, using the switch league method, we get .035 rpg batting advantage for the AL.
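Here is a minimal sketch of the batting arithmetic from the inter-league method, using only the figures quoted above. The function and variable names are made up, and the pool-adjusted 2.45 is taken as given because the adjustment itself is not spelled out.

```python
def park_gap(al_lwts, nl_lwts):
    """AL minus NL batting lwts (runs per 500 PA) in the same parks."""
    return al_lwts - nl_lwts

nl_parks_raw = park_gap(8.18, 4.84)   # AL vs. NL batters in NL parks
nl_parks_adj = 2.45                   # after the batter-pool adjustment (from the post)
al_parks = park_gap(-3.77, -7.17)     # AL vs. NL batters in AL parks

combined = (nl_parks_adj + al_parks) / 2  # AL edge, runs per 500 PA
print(round(nl_parks_raw, 2), round(al_parks, 2), round(combined, 2))

# Averaging the two methods in rpg: +.22 AL (inter-league), -.15 AL (switchers)
print(round((0.22 + -0.15) / 2, 3))
```

The combined gap comes out a little over 2.9 runs per 500 PA, and averaging the two methods gives the .035 rpg AL batting edge.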

Let’s do the same thing with pitchers.  Remember, using our other method, we came up with a .24 rpg advantage for the AL.  Let’s see what the “inter-league” game method comes up with.

NL pitchers in AL parks

allow 9.34 runs per 500 PA

AL pitchers in AL parks

4.57 runs per 500 PA

AL pitchers in NL parks

allow 2.48 runs per 500 PA

NL pitchers in NL parks

5.21 runs per 500 PA

This is a very pronounced pitching difference as we expected.

AL advantage in pitching of 3.75 runs per 500 PA, or around .28 per game.  If we average that with the .24 we got from the other method, we get:

AL pitching advantage = .26 rpg.

AL batting advantage = .035 rpg.

.533 win percentage for the AL.
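The pitching numbers and the final win percentage can be re-checked with a short sketch. The runs-to-wins conversion is an assumption: the comment does not say which one was used, so a simple linear rule of thumb (about 9 to 10 runs per win) stands in for it here. The function names are made up.

```python
def pitching_gap(nl_runs_allowed, al_runs_allowed):
    """AL pitching edge in runs per 500 PA (fewer runs allowed is better)."""
    return nl_runs_allowed - al_runs_allowed

al_parks = pitching_gap(9.34, 4.57)  # NL vs. AL pitchers in AL parks
nl_parks = pitching_gap(5.21, 2.48)  # NL vs. AL pitchers in NL parks
per_500_pa = (al_parks + nl_parks) / 2

pitching_rpg = (0.28 + 0.24) / 2     # inter-league and switcher estimates
batting_rpg = 0.035

def win_pct(run_edge_per_game, runs_per_win=10.0):
    """ASSUMPTION: linear runs-to-wins conversion, not taken from the post."""
    return 0.5 + run_edge_per_game / runs_per_win

print(round(per_500_pa, 2))
print(round(win_pct(pitching_rpg + batting_rpg), 3))
print(round(win_pct(pitching_rpg + batting_rpg, runs_per_win=9.0), 3))
```

The pitching gap averages out to 3.75 runs per 500 PA, and with 9 to 10 runs per win the AL win percentage lands at about .530 to .533.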


#7    Guy    2008/10/27 (Mon) @ 11:10

Great data, MGL.  But I’m not persuaded that your best final estimate is reached by just averaging your two methods.  It seems to me that the method you use here in post #6 is much more robust than the league-switcher approach.  It gives you a much larger sample size, and avoids some of the additional complicating factors introduced when looking at switchers—aging, strength of opposition, etc.  And the result of the second method—roughly a .550 AL win%—is also much more consistent with the AL’s actual performance over the past three years.


#8    Tangotiger    2008/10/27 (Mon) @ 11:19

I agree that the league-switcher methodology is ripe for selection bias, and sample size issues.  Are the players that move league-to-league representative of the players in the league, or are they biased in some form?  What is the uncertainty level?

Looking at the 11% of games played inter-league likely (a) dwarfs the sample size of the league-switchers, and (b) definitely is representative of the players of the leagues.

League-switchers is a good way to do it for the pre-inter-conference games.  Even then, you still have bias in handedness, age, player profiles that needs to be taken care of.


#9    MGL    2008/10/27 (Mon) @ 12:32

You guys are probably right, although I do think you should use the league switch data to some extent.  It is independent data which should theoretically add to your knowledge.

“League-switchers is a good way to do it for the pre-inter-conference games.”

What does that mean?  What are “pre-inter-conference games”?  I assume you are using “conference” for “league” (making a dig at baseball), but what are “pre-interleague” games?


#10    Tangotiger    2008/10/27 (Mon) @ 13:00

I meant before 1996 (or whenever inter-conference games started).  Before then, we actually had two leagues.


#11    MGL    2008/10/27 (Mon) @ 16:21

Oh, OK, makes perfect sense.  I think IL started in 97.

