THE BOOK--Playing The Percentages In Baseball


Sunday, February 22, 2009

“True” aging patterns

By Tangotiger, 08:07 AM

I looked at the performance of batter/pitcher matchups, year to year, where the batter and pitcher are the same age and face each other in the same park and in the same role (see chart below).

Age is the age of both the hitter and pitcher in the first year.  wOBA1 is the wOBA at “age”, and wOBA2 is the wOBA at “age”+1.  diff is the difference between the two wOBA.  PA is the minimum of the PA in year 1 and year 2, for the matchup of the batter/pitcher.  I not only matched on making sure it’s the same batter facing the same pitcher, but in the same park and in the same role (starter/sub-relief).
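A minimal sketch of that matching step, assuming each record is a (year, batter, pitcher, park, role, PA, wOBA) tuple. The rows and function names are made up for illustration; the only things taken from the post are the match keys and the use of the lesser PA as the weight in both years.

```python
# Hypothetical rows: (year, batter, pitcher, park, role, PA, wOBA)
ROWS = [
    (2007, "B1", "P1", "ParkA", "starter", 6, 0.350),
    (2008, "B1", "P1", "ParkA", "starter", 4, 0.300),
    (2007, "B2", "P1", "ParkB", "starter", 3, 0.400),
    (2008, "B2", "P1", "ParkB", "relief", 5, 0.320),  # role changed, so no match
]


def matched_pairs(rows):
    """Pair each matchup with the same matchup (same park, same role) a year later."""
    by_key = {(y, b, p, park, role): (pa, woba)
              for y, b, p, park, role, pa, woba in rows}
    pairs = []
    for (y, b, p, park, role), (pa1, woba1) in by_key.items():
        following = by_key.get((y + 1, b, p, park, role))
        if following is not None:
            pa2, woba2 = following
            # The lesser PA is used as the weight for both years.
            pairs.append((min(pa1, pa2), woba1, woba2))
    return pairs


def weighted_woba(pairs):
    """Return (wOBA1, wOBA2, PA), with both years weighted by the same PA."""
    total = sum(pa for pa, _, _ in pairs)
    woba1 = sum(pa * w1 for pa, w1, _ in pairs) / total
    woba2 = sum(pa * w2 for pa, _, w2 in pairs) / total
    return woba1, woba2, total


print(weighted_woba(matched_pairs(ROWS)))
```

Only the first matchup survives here, because the second one changed role between the two years.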

What this shows is how much of an advantage one has over the other.  Presumably the selective sampling cancels out for both sides?  I’d only look at the data between ages 22 and 34.  It would seem that the batter always has the advantage until at least his early 30s.  From age 33 to the end of their careers, the matchups were at a cumulative .344 in the first year and .339 in the next year (on 5,872 PA).  This at least points to the possibility that pitchers peak later than hitters.  However, if I bring it back to age 32 (I only selected age 33 because I looked at the data), it’s .341 in year 1, .342 in year 2, on 9,883 PA.  That data suggests they age the same in their 30s.

What say you?


age wOBA1 wOBA2 diff PA
19 0.650 1.350 0.700 2
20 0.291 0.226 (0.065) 34
21 0.304 0.320 0.016 271
22 0.325 0.323 (0.002) 1,457
23 0.335 0.336 0.001 4,212
24 0.332 0.337 0.005 8,564
25 0.330 0.333 0.003 13,154
26 0.332 0.342 0.010 16,043
27 0.335 0.333 (0.002) 16,557
28 0.336 0.342 0.006 13,455
29 0.334 0.334 0.000 10,979
30 0.332 0.343 0.011 7,806
31 0.340 0.346 0.006 5,955
32 0.336 0.346 0.011 4,011
33 0.347 0.337 (0.010) 2,606
34 0.335 0.357 0.022 1,574
35 0.364 0.326 (0.039) 819
36 0.341 0.352 0.011 462
37 0.349 0.300 (0.048) 223
38 0.295 0.304 0.009 106
39 0.231 0.357 0.126 43
40 0.293 0.244 (0.049) 24
41 0.580 0.180 (0.400) 5
42 0.300 0.200 (0.100) 10
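The cumulative figures quoted in the post can be checked from the table with a PA-weighted average. This is a sketch only; the rows are copied from the table above (ages 32 and up) and the function name is mine.

```python
# (age, wOBA1, wOBA2, PA) copied from the same-age table, ages 32 and up
TABLE = [
    (32, 0.336, 0.346, 4011),
    (33, 0.347, 0.337, 2606),
    (34, 0.335, 0.357, 1574),
    (35, 0.364, 0.326, 819),
    (36, 0.341, 0.352, 462),
    (37, 0.349, 0.300, 223),
    (38, 0.295, 0.304, 106),
    (39, 0.231, 0.357, 43),
    (40, 0.293, 0.244, 24),
    (41, 0.580, 0.180, 5),
    (42, 0.300, 0.200, 10),
]


def cumulative(table, min_age):
    """PA-weighted wOBA in year 1 and year 2 for all rows at or above min_age."""
    rows = [r for r in table if r[0] >= min_age]
    pa = sum(r[3] for r in rows)
    woba1 = sum(r[1] * r[3] for r in rows) / pa
    woba2 = sum(r[2] * r[3] for r in rows) / pa
    return round(woba1, 3), round(woba2, 3), pa


print(cumulative(TABLE, 33))  # (0.344, 0.339, 5872)
print(cumulative(TABLE, 32))  # (0.341, 0.342, 9883)
```

Moving the cutoff by one year flips the sign of the difference, which is why the choice of age 33 matters so much.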

#1    Tangotiger      (see all posts) 2009/02/22 (Sun) @ 08:34

This is when the batter is 10 years older than the pitcher, where presumably the batter is in his decline phase:
BAT_AGE PIT_AGE WOBA1 WOBA2 DIFF PA
28 18 0.650 0.294 (0.356) 13
29 19 0.347 0.385 0.038 121
30 20 0.333 0.386 0.053 476
31 21 0.342 0.348 0.006 1,396
32 22 0.343 0.343 (0.000) 2,795
33 23 0.341 0.333 (0.008) 3,759
34 24 0.328 0.343 0.015 3,914
35 25 0.340 0.342 0.001 3,358
36 26 0.349 0.353 0.004 2,543
37 27 0.357 0.342 (0.015) 1,729
38 28 0.337 0.361 0.024 978
39 29 0.353 0.325 (0.028) 604
40 30 0.329 0.322 (0.007) 252
41 31 0.393 0.317 (0.077) 91
42 32 0.239 0.269 0.030 67
43 33 0.257 0.293 0.036 11
44 34 0.233 0.450 0.217 2
46 36 - - - 1

And the pitcher is 10 years older:
BAT_AGE PIT_AGE WOBA1 WOBA2 DIFF PA
19 29 0.276 0.379 0.103 45
20 30 0.322 0.314 (0.009) 283
21 31 0.313 0.337 0.024 965
22 32 0.312 0.331 0.020 1,944
23 33 0.332 0.333 0.000 2,804
24 34 0.317 0.325 0.008 3,200
25 35 0.326 0.349 0.023 3,299
26 36 0.328 0.357 0.029 3,094
27 37 0.331 0.336 0.005 2,214
28 38 0.332 0.336 0.003 1,734
29 39 0.309 0.317 0.008 1,164
30 40 0.343 0.353 0.010 781
31 41 0.358 0.339 (0.019) 340
32 42 0.297 0.338 0.041 162
33 43 0.319 0.356 0.037 108
34 44 0.348 0.360 0.012 44
35 45 0.332 0.346 0.014 35
36 46 0.391 0.247 (0.144) 11
37 47 0.495 0.478 (0.017) 9

This last chart is the one that makes the most sense.  When the batter is 10 years younger than the pitcher, the wOBA is .325 the first year and .338 the second year.  So, the combination of the batter improving and the pitcher declining (or the batter declining a bit and the pitcher declining a lot in the older years) shows a gap of 13 wOBA points.

So, perhaps a batter gains 6 or 7 wOBA points in his 20s and a pitcher loses 6 or 7 in his 30s?


#2    Tangotiger      (see all posts) 2009/02/22 (Sun) @ 08:44

Using the process of only matching as I described it (matching on batter-pitcher year-to-year, park, role), the 10-yr peak age for pitchers is 19-28, and it’s 28-37 for hitters.

And perhaps the reason we find hitters actually peaking much earlier is because they take advantage of the pitchers that are outside my sample when they are younger.

Similarly, it’s the older pitchers who can take advantage of the out-of-sample hitters the most.

This would support the idea that the more a batter sees a pitcher, the more of an advantage he has.  And a pitcher gets his value by his “newness”.  This would also support the idea of why pitchers have much more success as relievers.


#3    Guy      (see all posts) 2009/02/22 (Sun) @ 09:25

Cool stuff.  I don’t think the first table necessarily indicates a later peak for pitchers.  If anything, the hitter gains at ages 24-26 suggest to me that pitchers are peaking earlier, maybe a lot earlier.  After age 32 this kind of data becomes very hard to interpret, both because of shrinking sample size and the large (IMO) selective sampling problem that these are likely players with unusually flat post-peak curves (many of those who decline more quickly are out of the game).

Is there a reason you can’t isolate hitters by tracking them against opposing pitchers of all ages (and vice-versa)?


#4    Tangotiger      (see all posts) 2009/02/22 (Sun) @ 11:45

“Is there a reason you can’t isolate hitters by tracking them against opposing pitchers of all ages (and vice-versa)?”

I should have been clearer that when I said this:
“Using the process of only matching as I described it (matching on batter-pitcher year-to-year, park, role), the 10-yr peak age for pitchers is 19-28, and it’s 28-37 for hitters.”

I meant all batter/pitcher matchups, not just those of the same age.  So, for each batter, I looked at the pitchers in back-to-back years, regardless of age.  (Average age was 28.3 years old or so.)

You get this for batters:
BAT_AGE WOBA PA
18 - 3
19 0.015 824
20 0.027 3,675
21 0.011 16,068
22 0.016 39,337
23 0.006 70,369
24 0.010 106,948
25 0.006 137,518
26 0.007 155,080
27 (0.001) 158,640
28 0.003 148,991
29 0.000 134,658
30 0.004 116,962
31 (0.002) 101,745
32 0.001 82,293
33 (0.006) 66,533
34 0.000 49,343
35 0.004 36,659
36 (0.001) 25,362
37 (0.001) 17,300
38 (0.009) 11,176
39 (0.009) 6,958
40 (0.013) 3,283
41 (0.026) 1,790
42 0.031 890
43 (0.056) 414
44 (0.090) 184
45 0.007 37
46 (0.090) 43
47 0.410 7
48 (0.142) 3
49 - 16

You get a continual increase until the mid 30s.

For pitchers you get this:
PIT_AGE WOBA PA
18 (0.048) 186
19 0.001 1,725
20 0.010 6,775
21 0.003 20,961
22 (0.002) 50,039
23 0.001 83,935
24 0.001 116,512
25 (0.003) 141,365
26 0.003 155,543
27 0.003 153,770
28 0.006 132,880
29 0.003 123,540
30 0.004 102,234
31 0.003 87,478
32 0.005 75,859
33 0.006 60,389
34 0.007 47,633
35 0.010 35,853
36 0.006 28,970
37 0.001 19,666
38 0.003 15,322
39 0.009 11,144
40 (0.002) 8,711
41 0.007 4,464
42 (0.012) 3,110
43 0.006 2,024
44 0.024 1,214
45 0.005 1,252
46 0.016 211
47 (0.027) 301
48 (0.057) 11
49 - 16

This means they get continually worse since at least their mid 20s.


#5    Guy      (see all posts) 2009/02/22 (Sun) @ 18:30

Overall, the pitchers decline 3 points between years Y and Y+1 (i.e. wOBA is +.003).  So if we adjust the hitters accordingly, assuming they gain 3 points from the decline in opposing pitcher ability, we get this result (last column is a smoothed 3-yr running average):

AGE wOBA Smooth
19 0.012
20 0.024 0.011
21 0.008 0.012
22 0.013 0.007
23 0.003 0.007
24 0.007 0.004
25 0.003 0.004
26 0.004 0.001
27 -0.004 0.000
28 0 -0.002
29 -0.003 -0.001
30 0.001 -0.002
31 -0.005 -0.002
32 -0.002 -0.005
33 -0.009 -0.005
34 -0.003 -0.005
35 0.001 -0.002
36 -0.004 -0.002
37 -0.004 -0.006
38 -0.012 -0.008
39 -0.012 -0.013
40 -0.016 -0.016
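A sketch of that adjustment, using the first few batter rows from comment #4. The comment only says "3-yr running average"; weighting the window by PA is my assumption, chosen because it reproduces the Smooth values shown for ages 20 to 22.

```python
# (age, year-to-year wOBA change, PA) for batters, copied from comment #4
BATTERS = [
    (19, 0.015, 824),
    (20, 0.027, 3675),
    (21, 0.011, 16068),
    (22, 0.016, 39337),
    (23, 0.006, 70369),
    (24, 0.010, 106948),
]
PITCHER_DRIFT = 0.003  # overall year-to-year pitcher decline cited in the comment


def adjust(rows, drift=PITCHER_DRIFT):
    """Remove the gain hitters get simply because opposing pitchers got worse."""
    return [(age, change - drift, pa) for age, change, pa in rows]


def smooth(rows):
    """Centered 3-year running average, weighted by PA (assumed); endpoints skipped."""
    out = {}
    for i in range(1, len(rows) - 1):
        window = rows[i - 1:i + 2]
        pa = sum(r[2] for r in window)
        out[rows[i][0]] = sum(r[1] * r[2] for r in window) / pa
    return out


adjusted = adjust(BATTERS)
for age, value in smooth(adjusted).items():
    print(age, round(value, 3))
```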

This looks like a plausible curve.  And it means the hitters as a group really don’t change from Y to Y+1, so we can take the pitcher results at face value. 

Tango:  remind me, do you weight the hitter/pitcher matchups in the 2 samples—so a hitter has same # of Maddux PAs in both samples?


#6    Tangotiger      (see all posts) 2009/02/22 (Sun) @ 20:01

Yes, always.  The “PA” you see in the columns is identical at age X and X+1.  I would never consider not doing that.


#7    Tangotiger      (see all posts) 2009/02/22 (Sun) @ 20:15

The overall average of matching the hitters and pitchers is a wOBA of .334 at age X and .337 at age X+1.

This means that the hitters are 3 points ahead of pitchers in the second year. 

Remember, we really don’t have a selection sampling issue because we have the same hitters AND the same pitchers in both samples.  As it turns out, both groups are 28.3 years old at age X and 29.3 (obviously) at age X+1.

It’s interesting that Guy is saying to compare the hitter’s gains to the average 3 point gain that they get overall.  If I do that, the peak age is 25-31 for the batters, which makes a lot more sense, with a peak at age 27.

If I do the same for pitchers, I get a peak age of 25-33, with a peak at 26-28.

Wow.  Thanks Guy.  You’ve given me the last bit I needed to solve my problem.

The slope is also bigger for hitters than pitchers.  I think the reason is that there is a wider disparity in hitting talent than pitching talent.  So, I can see why you’d see more improvement/decline with hitters.

Anyway, very fascinated.


#8    Guy      (see all posts) 2009/02/22 (Sun) @ 21:52

The way I’m looking at it, you don’t need to adjust the pitchers at all.  They appear to basically hold steady from 21-25, then decline steadily after that.  That is, most pitchers at any point in time are already declining.  Thus, your WOWY hitters get a 3-point boost regardless of their own aging gain/loss.  But the distribution of hitters is pretty symmetrical around their peak, so there’s no corresponding aggregate change in hitter talent complicating your pitcher results. At least, that’s my theory.

*

Re #6, do you weight by lesser PAs?  While that usually makes sense, I wonder whether it does for aging studies.  To some extent, PAs reflect performance.  For young players, those making the biggest gains would often have far fewer PAs in year one (like Y=300PA/Y+1=600PA), while older players in steep decline might be the reverse.  I would think these players may be underweighted in comparison to steady (600/600) players, tending to flatten your age curves.


#9    Tangotiger      (see all posts) 2009/02/23 (Mon) @ 08:25

I don’t see why I would not subtract 3 wOBA points on both sides, since that would be the baseline to compare against.  The 3 wOBA points adjustment would be for the “familiarity” factor that hitters get over pitchers, year over year.


#10    Guy      (see all posts) 2009/02/23 (Mon) @ 09:37

I think we just have two different theories about what’s going on.  You’re saying the 3 pts is a function of familiarity.  If so, adjusting both sides does make sense.  I’m speculating it’s different aging curves for pitchers and hitters, not familiarity:  most pitchers in your samples are over 25 and thus declining, so your hitters are facing easier opponents in year Y+1.  But the reverse is not true:  your hitter pools are balanced around the peak, so their talent is (roughly) unchanged.

I think you could test the familiarity theory by examining whether a higher number of PAs in a pitcher-hitter matchup, holding age constant, produces larger gains for the batter.

If familiarity is that big a factor, presumably hitters enjoy a steady benefit up to a certain age, as their proportion of PAs against familiar pitchers grows and then plateaus, and the reverse is true for pitchers.  And I would think there would be diminishing returns here (i.e. a hitter isn’t better in his 51st than 50th PA).


#11    Tangotiger      (see all posts) 2009/02/23 (Mon) @ 10:36

When I subtract 3 wOBA points across the board, I get a peak age of 27 for hitters and pitchers as I said here:

It’s interesting that Guy is saying to compare the hitter’s gains to the average 3 point gain that they get overall.  If I do that, the peak age is 25-31 for the batters, which makes a lot more sense, with a peak at age 27.

If I do the same for pitchers, I get a peak age of 25-33, with a peak at 26-28.

The age of the hitters and pitchers are identical (28.3 in year 1 and 29.3 in year 2).

Considering that the overall careers of hitters and pitchers are roughly the same (they enter at the same time, the best pitchers and best hitters are at their best at the same age and they exit MLB at the same time, more or less), then I think a model is more supportable if its presumptions hold to what we actually see.

And, removing 3 wOBA points year-over-year gives us results that support this.

If this finding is real, that we have a 3 wOBA point for “familiarity”, this will be a pretty powerful finding.  Future research would try to answer the following:
- how much does the familiarity effect grow the more PA the batter and pitcher have had against each other (I seem to remember Dave Smith of Retrosheet might have looked at this… maybe it was Pizza Cutter)
- does the familiarity extend a lot more in-year than from the previous year?  And how many years back does it go?

So, what I had believed to be selective sampling when I got some improbable results a few years back may instead have been mostly about familiarity:
http://www.tangotiger.net/adjacentPitching.html


#12    Guy      (see all posts) 2009/02/23 (Mon) @ 12:14

Agreed that familiarity is considerably more plausible on its face—but some combination could be at work.  To see familiarity, suppose you went back to your original sample of pitcher/hitters of same age, and divided each year in two based on # of PA in year 1 (or maybe combined PA over 2 years).  If familiarity is main factor, wouldn’t we see hitters gaining more in the high-PA samples? 

And I’m still not sold on the lesser-PA approach for aging studies.  Maybe it won’t make a difference in practice, but I think it could have the effect of artificially flattening the curves.  Certainly, matchups between good hitters and good pitchers will be overweighted, and we don’t know that they age the same as lesser players.


#13    Tangotiger      (see all posts) 2009/02/23 (Mon) @ 13:07

Yes, good point on trying to break down the # of PA.  I’ll also have to look at “times through the order”, since that has an 8 point wOBA effect.  So, a good pitcher that sticks around to face a good hitter more often in the same game will have more of a wOBA impact.

Also good point that the sample of players in my pool is not necessarily representative, because I force the hitters/pitchers to match, and that will skew toward really good players.  Whether that means a different aging curve hasn’t been established.


#14    Guy      (see all posts) 2009/02/24 (Tue) @ 08:29

Overweighting the good players, while a potential problem, probably doesn’t introduce a systemic bias.  I’m more concerned that weighting by lesser BFs will tend to understate improvement by young players, and understate decline by older players.  I just took a quick look at pitchers who were 23/24, 24/25, 31/32, or 32/33 the past two seasons, comparing ERA+ the two seasons.  Pre-peak, the lesser-BF produces a smaller estimate of improvement than weighting by combined total BF (or straight average).  Post-peak, lesser-BF yields a smaller performance decline than weighting by total BF (or straight average).

I’m not sure what the best method here is.  But I do think weighting by the lesser-BF will tend to flatten the age curve, perhaps quite a bit.  And I assume the reason for that approach is to minimize the impact of small-N outliers, but if you’re aggregating enough players that may not be a big problem.


#15    Tangotiger      (see all posts) 2009/02/24 (Tue) @ 10:20

There are only two real choices to weighting the samples: lesser of two PAs or harmonic mean.

Otherwise, if someone has a .800 wOBA in 3 PA at age 23 and a .300 wOBA in 600 PA at age 24, I can’t possibly presume a .500 drop in wOBA, weighted as much as someone with a .300 wOBA in 600 PA at age 23 and .305 wOBA in 600 PA at age 24.

A “lesser” approach would give the former a weight of “3” and the latter a weight of “600”.

A harmonic mean approach would give the former a weight of “6” and the latter a weight of “600”.
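A small sketch of the two weighting schemes applied to the two players in this example. The function names are mine, and the harmonic mean is taken as 2ab/(a+b).

```python
def lesser_weight(pa1, pa2):
    """Weight a year-to-year pair by the smaller of its two PA totals."""
    return min(pa1, pa2)


def harmonic_weight(pa1, pa2):
    """Weight a year-to-year pair by the harmonic mean of its two PA totals."""
    return 2 * pa1 * pa2 / (pa1 + pa2)


def weighted_change(players, weight):
    """players: list of (wOBA1, PA1, wOBA2, PA2); returns the weighted wOBA change."""
    total = sum(weight(pa1, pa2) for _, pa1, _, pa2 in players)
    return sum(weight(pa1, pa2) * (w2 - w1) for w1, pa1, w2, pa2 in players) / total


# The two players from the comment above
PLAYERS = [
    (0.800, 3, 0.300, 600),
    (0.300, 600, 0.305, 600),
]

print(round(harmonic_weight(3, 600), 2))                  # 5.97, i.e. about 6
print(weighted_change(PLAYERS, lesser_weight))            # small positive change
print(weighted_change(PLAYERS, harmonic_weight))          # very close to zero
print(sum(w2 - w1 for w1, _, w2, _ in PLAYERS) / 2)       # unweighted: about -.25
```

With only these two players, the harmonic-mean weights leave the overall change almost exactly at zero, while an unweighted average is dominated by the 3-PA player.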

If I understand the harmonic mean approach well enough, this one would approximate the z-score approach, such that a .500 drop in 3/600 PA would be as many SD to the left of the mean, as a .005 change in 600/600 PA would be to the right of the mean, so that, overall, we can say that there was no change.

Perhaps one of the statistical minds here can clear it up for me.

Regardless, it’s clear that one needs to be severely overweighted relative to the other, simply because of the noise in the sample.


#16    Guy      (see all posts) 2009/02/24 (Tue) @ 11:19

For your WOWY approach, this noise is less of a problem because no hitter-pitcher matchup in one season is more than, what, 15-20 PAs?  So you won’t have huge distortions like your example in any case.  Also, with a large sample your .800 wOBA will be offset by another 3-PA hitter with .000 wOBA.

Still, I don’t disagree that small-N extreme values are problematic.  But these methods are only dealing with that problem, while ignoring another challenge:  your PAs aren’t independent of the change you’re trying to measure.  There’s an intra-season correlation between performance and PA/BF.  So both of these methods will give less weight to players with big performance changes.  Yes, some of that big change is just noise, but not all of it. 

Take these three pitchers, all 32 in 2007 and posting similar OPS+:
Millwood 87
Looper 89
M. Morris 90
The following year Morris fell off a cliff (44 OPS+), while Millwood was flat (86) and Looper improved (102).  As a result, Morris only faced 118 batters the next year, while the others pitched full seasons.

Morris represents 37% of our year 1 sample, and 24% of the combined BF, but just 7% if we use lesser-BF.  Now, I readily concede that Morris was not a 44 OPS+ true talent in 2008.  On the other hand, he had pitched poorly the last third of the 2007 season and is now out of baseball, which is pretty good evidence that he is also no longer a 90 OPS+ pitcher.  If you give the Matt Morrises of the world only 1/7th the weight of successful pitchers, he basically vanishes and I think you understate the real rate of decline.  (And the reverse happens with pre-peak players, as those posting large gains often have a small # of PA/BF in year 1, and so receive little weight.)

If we assume that all players age the same, and all the variance is just noise, then the lesser-PA (or harmonic mean) approach works fine.  But I think we have to assume that players actually age differently.  And in that case, we’re systematically undercounting the players making the biggest changes at each age.


#17    Tangotiger      (see all posts) 2009/02/24 (Tue) @ 11:35

Except..... this applies as much to hitters as to pitchers.  This is the fantastic part about this selection bias issue that we’ve never done before: I’m controlling for both the hitter AND pitcher.

So, I would like you to rewrite your post taking into account this fact.


#18    Tangotiger      (see all posts) 2009/02/24 (Tue) @ 11:37

At the very least, while the year-to-year may be somewhat flatter than we’d otherwise expect, the peak would not change, would it?


#19    Guy      (see all posts) 2009/02/24 (Tue) @ 13:24

Re #17:  Sorry, I don’t follow this.  How does the fact this applies to both hitters and pitchers tell us how much weight to give Matt Morris? 

#18:  I think that’s right:  the peak should be same.  Basically, using lesser-PA will give less weight to big performance changes in both directions, and give more weight to consistent performers.  Pre-peak, that means understating improvement, while post-peak it means understating decline.

It occurs to me the impact may not be the same pre-peak as post-peak.  Pre-peak, plenty of players post big declines as well as big gains in year two, because MLB is still trying to figure out their true talent.  Since both gainers and losers get underweighted (because they receive fewer PAs in their bad season), the total impact may be smaller.  But post-peak, players tend to either maintain their PAs in year two (because they are maintaining or improving performance) or decline.  So the main group getting underweighted by the lesser-PA method post-peak are those whose performances decline.


#20    MGL      (see all posts) 2009/02/24 (Tue) @ 13:55

Guy is right about the method we often use (for other things as well as aging curves) to weight year to year differences (using the lesser of the two opportunities).  It can and does introduce a bias.  I have recently been in favor of NOT doing this kind of weighting.  To simply do no weighting at all (treating each pair equally) or to “fill in” the “missing” PA with an estimate of performance.  If you use the “fill in” method, I think you have to do a recursive computation, since filling in requires a knowledge of what you are trying to figure out, in this case, the difference that one year makes in performance.

What we want to do is to somehow create a data set that allows every player to play a certain number of PA in every single year, just like we are conducting a laboratory experiment which enables us to compute precise aging curves.  In order to do that, we would, of course, take every player who debuts in the majors at whatever age, and then make sure that they play in the majors for at least one more year, preferably a number of years until they are 40 or so.


#21    Guy      (see all posts) 2009/02/24 (Tue) @ 13:59

In my post #16, I used “OPS+” throughout when I should have said “ERA+” (duh).  The #s are correct, AFAIK.....


#22    Tangotiger      (see all posts) 2009/02/24 (Tue) @ 17:24

Take these three pitchers, all 32 in 2007 and posting similar OPS+:
Millwood 87
Looper 89
M. Morris 90
The following year Morris fell off a cliff (44 OPS+), while Millwood was flat (86) and Looper improved (102).  As a result, Morris only faced 118 batters the next year, while the others pitched full seasons.

Morris represents 37% of our year 1 sample, and 24% of the combined BF, but just 7% if we use lesser-BF.

How much should his sample count?  Obviously he is not a true 44 pitcher.  Indeed, if you give Morris 7% and you split the other two guys evenly, you get a weighted OPS of 90 in 2008.

Indeed, if you wanted to fill-in another 180 innings for Morris, you’d have to presume he was around an 85 OPS pitcher for those innings, giving him an overall estimated OPS of 80 in 200 innings. 

Does it necessarily help us to count Morris at 200 innings of 80 OPS in 2008, or count him as 30 innings of 44 OPS?
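The arithmetic behind those two options can be sketched as below. The 2008 values (44, 86, 102), the 7% weight and the 85 fill-in level come from the thread; the 20 actual innings (so that the total is 200) and the linear averaging of an index stat are simplifying assumptions of mine.

```python
def weighted_average(values_and_weights):
    """Plain weighted mean of (value, weight) pairs."""
    total = sum(w for _, w in values_and_weights)
    return sum(v * w for v, w in values_and_weights) / total


# Lesser-BF weighting: Morris at 7%, the other two pitchers split the rest evenly.
lesser_bf = weighted_average([(44, 0.07), (86, 0.465), (102, 0.465)])

# Fill-in approach: assume 20 actual innings at 44 plus 180 estimated innings at 85.
filled_morris = weighted_average([(44, 20), (85, 180)])

# With Morris filled in, weight all three pitchers equally.
filled_group = weighted_average([(filled_morris, 1), (86, 1), (102, 1)])

print(round(lesser_bf, 1))      # 90.5
print(round(filled_morris, 1))  # 80.9
print(round(filled_group, 1))   # 89.6
```

Under these assumptions the group figure moves by about one point between the two approaches.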

This seems to be what MGL might be suggesting, to fill-in a best-estimate for the missing innings.  And, this may be what Guy is suggesting in order to give Morris more weight.

In the end, I would guess that doing this filling in gives us the same answer.


#23          (see all posts) 2009/02/24 (Tue) @ 20:51

If you were to guess at a level for innings not pitched because of attrition, you could possibly do a Marcel type calc to estimate the “true talent level”, assuming the sample in year n is too small.

One of the things I’m working on is an aging analysis using a Marcel approach, but instead of the actual values for the last 3 seasons, I’m going to look at the last 3 intra-season changes, to see which way a player is trending, then calculate how much of a regression of expected change at that age is needed in order to best predict the performance in year n+1.


#24    Tangotiger      (see all posts) 2009/02/25 (Wed) @ 00:53

I broke up the 1.5 million matching PA on whether the batter/pitcher matchup occurred a little (1 or 2 PA), a lot (4 or more), or average (3 PA) for each season.  Each group had roughly half a million PA.

I was first of all surprised how low I had to put the threshold.

Anyway, the wOBA back-to-back for the pitcher/batter matchups with just 1 or 2 PA was .339 in year 1 and .346 in year 2.

For those with 3 PA in back-to-back years, it was .334 in year 1 and .335 in year 2.

And for those with at least 4 PA in back-to-back years, it was .329 in each year.  Basically, the familiarity factor that I was expecting did not come into play at all.

However, if I bump it up to at least 10 PA in back-to-back years, it was .334 in year 1 and .349 in year 2.  This however was based on only 497 matches and 5798 PA.  1 SD is 7 points, so all I’ve shown here is that we have a non-zero difference, and not that the gap is anywhere close to that.

If I just look at those with 1 PA in back-to-back years, it’s .331 to .341 for improvement.  And with 2 PA, it’s .347 to .351 for improvement.  At 3 PA, it’s .334 to .335 in back-to-back years. At 4 PA, it’s .328 to .328.  At 5 PA, it’s .341 to .343.

***

It seems that we have a clear bias if the batter/pitcher only happened once or twice in back-to-back years.  If I set it so that you needed a minimum of 4 PA in back-to-back years, I get .329 as the wOBA in back-to-back years.  That seems to take care of this bias.

If I do this, then I get a peak age of wOBA for pitchers at age 26-28 (or 23-34 if you want something wider).

For hitters, it’s 26-29, or 25-33 for a wider peak.

So, I think I’m reasonably confident in saying that between 1954-2008, that the peak age for hitters and for pitchers is 27, and that this can be verified by looking at all samples of data (totalling 500,000 PA) by matching in back-to-back years on the identity of the pitcher, batter, park, role, and that they each faced each other under those conditions at least 4 times in each year.

I don’t know about you guys, but that’s pretty exciting for me.


#25    Guy      (see all posts) 2009/02/25 (Wed) @ 01:49

Tango:  Just to clarify, is this using the matchups with hitter/pitcher both the same age, or all matchups as in post #4?  And can you run the new numbers for us, based on 4+ PA?  Thanks....


#26    Tangotiger      (see all posts) 2009/02/25 (Wed) @ 08:57

All numbers, not just same age.  I’ll post the numbers a bit later…


#27    Tangotiger      (see all posts) 2009/02/25 (Wed) @ 13:57

Here is the data, smoothed, and making the .330 point the peak for pitcher and batter.  You see pitcher is flat from age 26 to 33. 

It seems to me that by limiting it as I did (min 4 PA in matchups back-to-back years), I’m basically knocking out any bad pitcher.

Age Pitcher Batter
22 0.335 0.307
23 0.334 0.314
24 0.334 0.320
25 0.332 0.324
26 0.330 0.328
27 0.330 0.329
28 0.330 0.330
29 0.330 0.329
30 0.331 0.328
31 0.330 0.327
32 0.330 0.324
33 0.330 0.321
34 0.333 0.318
35 0.337 0.315
36 0.341 0.310
37 0.344 0.303
38 0.345 0.293
39 0.348 0.281

Here is the chart using all data (1.5 million PA), but normalized (by removing .003 wOBA in each year-to-year comp).  Data has NOT been smoothed:

Age Pitcher Batter
22 0.344 0.300
23 0.340 0.312
24 0.338 0.316
25 0.336 0.323
26 0.330 0.326
27 0.330 0.330
28 0.330 0.326
29 0.332 0.326
30 0.333 0.324
31 0.333 0.325
32 0.333 0.320
33 0.335 0.318
34 0.339 0.309
35 0.343 0.306
36 0.350 0.306
37 0.353 0.303
38 0.352 0.298
39 0.352 0.287
40 0.359
41 0.353
42 0.357

As you can see, the hitter path is pretty much the same as the first chart, but the pitcher one is less flat.  That’s because a lot of my bad pitchers are still here.

So, back to Guy’s point that yes, you need to be careful with selection bias as it may lead to a flatter curve.
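The unsmoothed chart can be approximately rebuilt by chaining the year-to-year changes from comment #4, after removing .003 from each step. This is a sketch: anchoring both curves at .330 at age 27 is my assumption, and because the inputs are rounded to three decimals a few ages come out .001 away from the chart.

```python
BASELINE = 0.003                 # removed from every year-to-year comparison
PEAK_AGE, PEAK_WOBA = 27, 0.330  # assumed anchor point for both curves

# age -> change in wOBA from that age to age+1, copied from comment #4
PITCHER_DIFF = {22: -0.002, 23: 0.001, 24: 0.001, 25: -0.003, 26: 0.003,
                27: 0.003, 28: 0.006, 29: 0.003, 30: 0.004, 31: 0.003,
                32: 0.005, 33: 0.006, 34: 0.007, 35: 0.010}
BATTER_DIFF = {22: 0.016, 23: 0.006, 24: 0.010, 25: 0.006, 26: 0.007,
               27: -0.001, 28: 0.003, 29: 0.000, 30: 0.004, 31: -0.002,
               32: 0.001, 33: -0.006, 34: 0.000, 35: 0.004}


def chain(diffs, baseline=BASELINE, peak_age=PEAK_AGE, peak=PEAK_WOBA):
    """Build an age curve by accumulating normalized changes out from the anchor age."""
    curve = {peak_age: peak}
    # Walk forward from the anchor age.
    for age in range(peak_age, max(diffs) + 1):
        curve[age + 1] = curve[age] + (diffs[age] - baseline)
    # Walk backward from the anchor age.
    for age in range(peak_age - 1, min(diffs) - 1, -1):
        curve[age] = curve[age + 1] - (diffs[age] - baseline)
    return dict(sorted(curve.items()))


pitchers = chain(PITCHER_DIFF)
batters = chain(BATTER_DIFF)
for age in pitchers:
    print(age, round(pitchers[age], 3), round(batters[age], 3))
```

The pitcher column is wOBA allowed, so a rising value means decline; the batter column is wOBA produced, so a falling value means decline.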


#28    Guy      (see all posts) 2009/02/25 (Wed) @ 15:20

Cool stuff.  Might be worth writing up for THT. 
Am I right in thinking these tables are chained results? 

The pitching curve is still too flat for my taste.  I don’t buy the idea that hitters decline twice as much as pitchers by age 35, for example. This probably reflects that teams have more options for changing the role of a pitcher as his skill declines, especially the move from starting to bullpen.  The pitchers who remain as starters and rack up large IP totals are those who succeed, i.e. decline more slowly.  And even within the pen, a pitcher can be used fewer innings, and with more platoon edge, as he ages.

My quibbles aside, this is great work—a very interesting new way to look at the issue. WOWY is definitely one of your most important discoveries…


#29    Tangotiger      (see all posts) 2009/02/25 (Wed) @ 17:00

Cool, thanks.

Good point about the starter/relief transition.  I am treating the starter/relief as distinct, so that Mulholland-the-starter and Mulholland-the-reliever are two separate entities.  But perhaps I should just add an adjustment (say .025 or .030 wOBA) to equalize them (*).  Otherwise, I do have a selection bias issue to contend with.

(*) I’ve been meaning to improve on my starter/relief translation since I did the work in The Book, and now seems like the perfect time to do so.  Age bias, era-bias, handedness-bias, strikeout-bias.  Anything else you guys think I should look at?

I agree that the aging pattern can’t be what it is for pitchers, compared to hitters.  After all, of the 1.5MM PA in my sample, hitters aged 34 and older accounted for 10% of the sample and the pitchers accounted for 12%.  (And for pitchers, it’s probably due to specialization.) So, we should see similar aging.

***

Something you said made me realize that I should also match on batter handedness, since JT Snow the lefty and JT Snow the righty won’t be the same guy, if each of those is facing Greg Maddux.


#30    Tangotiger      (see all posts) 2009/02/25 (Wed) @ 17:15

Then again, maybe it should.  After all, if a switch hitter decides to only hit as a LHH even against a LHP, that must mean he thinks he’s better as a LHH than as a RHH. 

If I treat him as a typical switch hitter, I will be giving him a standard platoon advantage that simply won’t apply here.

That makes me think of another project: guys who quit platooning, and how they perform against the platoon advantage subsequent to that.  So much to do, so little time…

