THE BOOK--Playing The Percentages In Baseball

Tuesday, June 05, 2007

Another Aging Study

By Tangotiger, 10:38 AM

Guest-blogger Steve Walters over at the Wages of Wins blog links to an aging study by Ray Fair (PDF).  I was asked to review the paper a few months ago; I have reprinted my review in the comments section there, and will repost it here:


“but surprisingly there seems to have been no rigorous attempt to estimate them”

There are several articles noted here:
http://www.tangotiger.net/#Forecast

The choice of “10 years” has its own selective sampling issues, as only certain quality of players will be in that pool. That pool of players is hardly representative of the population of MLB players. See this mini-study for more information:
http://www.tangotiger.net/archives/artAging.shtml#1013

The effect is for the curve to be flatter than it should be.

You acknowledge injury as a possible outcome, but ignore it afterwards. It’s a big deal, especially for pitchers.

A 9% increase in ERA from age 26 to 37 (3.50 to 3.81) is pretty much impossible to believe. That is extremely flat. The sampling issues noted above need to be addressed.

You talk about the 90s (steroid-era), but you also acknowledge that you did not adjust for the change in run environment. The 1993-2006 time period is a huge offensive era. Without adjusting for the context, it’s going to look like hitters are getting better in those years. As well, the new parks are more offense-friendly these days.

While the overall model by age won’t change if you adjust for parks and year, you can’t then ignore these issues when looking at 1990s players in particular. They need to be adjusted, if you are going to focus on them.

While you choose 10+ year periods for each pitcher, you don’t have the same pitchers in each age group. You can have a 10yr period from ages 23-32 or 25-34, or a 15yr period from 22-36, etc. As noted in one of my linked articles, you need to pair the age groups to have the same pitchers and the same weights in each age group. So, select the same pitchers at age 23 and 24, and weight them equally. Then, select the same pitchers at age 24 and 25, etc. The pitchers of 23/24 do not need to be the same as 24/25.

And the comparison of ERA to chess is not appropriate. Losing 15% in ERA would be the equivalent of losing 7.5% in OBP. It’s just not a 1:1 comparison.

OPS is not a good measure to use for ranking players, in addition to the other problems I noted earlier. Linear Weights would have been a better choice.

ERA also has its own problems (being fielder-dependent as well as sequence-of-events dependent), and a “component ERA,” similar to Linear Weights, would have been more appropriate.

#1          (see all posts) 2007/06/05 (Tue) @ 14:42

Tango, did the author change his paper any in the light of your comments?


#2    tangotiger      (see all posts) 2007/06/05 (Tue) @ 14:48

The paper on the site is dated Mar, 2007, and my comments were done on Mar 14, 2007.  I believe that the paper on the site is the one I reviewed.


#3          (see all posts) 2007/06/05 (Tue) @ 16:40

Tango,

Pleased to see some criticism from someone who read the paper; half the comments from the Wages of Wins site came from people who didn’t even open the PDF.

1. Wouldn’t the sampling problem only hurt his stats?  If his sampled aging curve is flatter than the “true” aging curve, then his model inflates the likelihood of deviant performance curves: it thinks Bonds is more likely to perform better late in his career than the “true” aging curve would suggest.  That should just make his conclusions overly conservative, not suspect.

It might call into question the “peak” age, in that you might expect that players who succeed in the MLB for 10 years “peaked” later than the rest.  Maybe.  But the point is that just because there’s selective sampling doesn’t mean there’s a confound.

2. How would you have him correct for the run environment? He allows each player to have their own pre- and post-peak mean performance vary (variables A1i and A2i). Thus, in effect, he not only allows the model to fit variance between years, he’s letting it fit variance between players.  Doesn’t that capture run-environment variance?

Seems like you are nit-picking, without giving your standard kudos for interesting research.


#4    Tangotiger      (see all posts) 2007/06/05 (Tue) @ 17:23

When I post on my blog, I’m typically complimentary to the effort, but I’ll call out holes.

The post I made was an as-is, unedited version of the review I sent the author.  In this case, my audience was not all of you guys reading about an unsolicited article, but the specific author, whose editor asked me for a critical review. I’m not sure I was supposed to say “good job” on specific points, but I’m an amateur reviewer.  This would be similar to when MGL or someone else would ask my specific opinion, and I’ll blast away.

***

I would try to separate the author’s aging curve (which is flatter than it should be) from his steroids conclusions.  So, yes, Bonds’ performance is farther from a true typical aging curve.

However, an aging curve is not some single fixed curve.  Some players, or types of players, age differently.  For example, we know that a player’s walk rate increases well into his late 30s.  A guy like Bonds can leverage his wisdom with his power.  Not only is he getting smarter, like most hitters, in learning to take a walk, but, now that he’s got that extra skill in his pocket, he can really leverage it.  So, it’s possible that for a guy like Bonds, the aging curve should be even flatter than we expect.

A guy like Juan Pierre, who relies on his speed, may get killed on beating out IF singles in his mid-30s, can no longer stretch singles into doubles, and won’t steal as much.  And, learning to take a walk probably won’t help a guy who can’t make the pitcher pay the price for it.

So, it’s a tough call to make on trying to compare an individual aging curve to the “typical” curve.

***

As for correcting for the differing league means every year: simply calculate his Linear Weights or RC or OPS relative to league for that season.  The variance is irrelevant.
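
The league-relative adjustment described above can be sketched in a few lines of Python. This is only an illustration: the OPS figures and the function name `relative_to_league` are hypothetical and do not come from the paper or from any real season.

```python
# Hypothetical numbers for illustration only (not real player or league data).
seasons = [
    # (year, player_ops, league_ops)
    (1992, 0.800, 0.700),
    (2000, 0.880, 0.770),
]


def relative_to_league(player_value, league_value):
    """Express a player's stat as a ratio to the league mean for that season."""
    return player_value / league_value


relative = [(year, round(relative_to_league(p, lg), 3)) for year, p, lg in seasons]
print(relative)  # [(1992, 1.143), (2000, 1.143)]
```

In this made-up case the raw OPS rises by 80 points, but the league rose in step, so the league-relative figure does not move at all.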


#5    MGL      (see all posts) 2007/06/05 (Tue) @ 22:51

Maybe I am off-base here, but I just can’t bring myself to read a study that uses statistical techniques that I think may be unnecessary to answer a particular question.  Sort of a corollary to Occam’s Razor.

Look at ALL players’ differences between age X and age X+1 and weight them by a certain number of PA (what is it, 1/(1/N1 + 1/N2) or something like that?).  That is all you need and want to do.

Even then, you still have 2 problems which are difficult to overcome. One, there is ALWAYS a selective sampling problem associated with the fact that players who play in any given year have been a little lucky in the previous year.  That is true of all years, and especially true of the last year, when players tend to have gotten very unlucky.  That tends to make declines after peak age look larger than they are and ascents before peak age look flatter than they really are.

Two, the quality of play in the whole league may get a little better each year, which screws up the “deltas” as we have discussed before, also making it look like declines are bigger than they are and ascents flatter than they are.

I DNRTFS, for the reason I initially stated, but at the very least, using only players who played for 10 years is obviously extremely problematic unless you want to state that your conclusions about aging only apply to “players who have played (or will play) for 10 years” and even then you have the same selective sampling problem that I discuss above.

One of the reasons they have played that long is that they have gotten a little lucky along the way (not to mention the fact that they are elite players, who might age differently than non-elite players).

And of course you HAVE TO use some measure of pitching and offense that is relative to the league.  Things change each year that cause the league averages to change, such as new parks, different strike zones, change in the overall quality of the pitching (which will screw up the hitting stats) and the batting (which will screw up the pitching stats), the weather, the baseball, etc.  I don’t know how you can generate reliable data and conclusions without using statistics that are relative to the league.

Anyway, maybe I will read the article, but I am not sure that I will be able to understand the techniques used and how they relate to the question (aging curves).

BTW, I can’t stand having to give automatic props to an author or a study/article.  It is OK if you are reviewing a piece and want the readers to know whether it is worth reading or not and what your overall assessment is, or what the good points and bad points are (which is what you do anyway when you critique something), but if your task is simply to critique, I say go ahead and fire away.

When I do research, I don’t want someone to pat me on the back.  I simply want to know what is right and what is wrong, more or less, mostly the latter, as I can usually assume that something is right when it is not mentioned.


#6          (see all posts) 2007/06/06 (Wed) @ 02:12

MGL—the article isn’t that difficult a read. You can pretty much skip over the methodology section and not miss that much ... there are a couple of assumptions that are important but that is it

As far as academic papers go I thought it was reasonably readable.


#7          (see all posts) 2007/06/06 (Wed) @ 02:25

MGL, you’re right:  you’re off base.  Half of what you said is reasonable and valid, and the other half is a rant that you would likely take back if you read the paper (reminiscent of Joe Morgan’s rant against Moneyball, actually).

Your first criticism is that the analysis is more sophisticated than it needs to be.  Your second criticism is that your simple version of the analysis doesn’t work.  The funny thing is that the sophisticated version doesn’t have the same problems as your dumbed-down version. You essentially refuted your first argument by presenting the second, and your second argument is moot once you refute the first. :)

You don’t need to give props or pat anyone on the back. But without even reading it, you’re suggesting that not only is it a poor paper, but that it is almost beneath you to read it. And that’s full-on ridiculous, Joe Morgan style.

This paper isn’t perfect. I think the model could use some tweaking to address a few problems--one that you mention, and some others that you don’t.  But you should read it and make an effort to decipher it, because it’s a relatively simple but powerful framework that does NOT fall prey to all of the problems that you just listed.

Happy reading


#8    MGL      (see all posts) 2007/06/06 (Wed) @ 03:21

I’ll read it.  For the record, I am NOT suggesting it is beneath me to read it.  Quite the opposite.  Did you not read this?

“I am not sure I will be able to understand the techniques used.”

And I do apologize.  I have a real bug for academic papers that err on basic things (relating to baseball research).  I certainly should not assume that is the case here without reading the paper.

As I said, I’ll read it and if I can understand it, I’ll give my opinion.  Maybe even a pat on the back or two. ;)


#9    MGL      (see all posts) 2007/06/06 (Wed) @ 04:12

They get better because they gain experience, and they get worse because of the human aging process

They get better because of the human aging process as well.  In fact, we don’t know (at least I don’t) how much of the improvement is due to the “human aging process” and how much is due to experience, but I think that it is clear that it is both (for example, players get stronger as they age up until some peak).

…but surprisingly there seems to have been no rigorous attempt to estimate them.

I guess it depends on the author’s definition of “rigorous,” but Tango, I and others have done some pretty good research on aging curves in baseball.  Or perhaps the author was only referring to published papers in academic venues.

I have little problem with using OBP, OPS, or ERA.  Of course, linear weights (or something like it) and component ERA (which is essentially lwts against) would probably be better and certainly I would use something relative to the league, although there are some problems with that.  For example, if the overall quality of the league declined, say in offense, a player who actually stays the same would appear as if he got better.

It should also be noted that if the improvement of a player up to the peak-performance age is interpreted as the player gaining experience (as opposed to, say, just getting physically better), this experience according to the assumptions of the model comes with age, not with the number of years played in the major leagues. A player coming into the major leagues at, say, age 26 is assumed to be on the same age profile as an age-26 player who has been in the major leagues for 4 years. In other words, minor league experience is assumed to be the same as major league experience.

I am not sure what the author is trying to say here.  Again, why is improvement interpreted as players gaining experience rather than getting physically better, or both?  And if it is experience, where does the experience start?  If in the minors, not all players come into the minors at the same age.  If it includes college, and college experience is roughly the same as minor league experience, then you have roughly the same experience for all players from age 18 onward I guess.

There is no straightforward way to adjust for this, but fortunately the fraction of pitchers in the sample who are potentially affected in a large way by it is small.

I agree that it is not a big deal if most players are not affected by this, but why is there “no straightforward way to adjust for this?” Using a relative (to the league) ERA, like ERA+ (which also accounts for park effects), is not a “straightforward way?”

No attempt has been in this study to adjust for different ball parks.

No attempt has been “made”?  I think that “ball park” is usually spelled “ballpark” (one word).

Also, for OBP and OPS, I would not include IBB just in case there is an age bias there (for example, older players may be more likely to be issued IBB, given the same talent level), and also because they are NOT the same as non-IBB.  Perhaps in the database there is no distinction in the early years.  Or maybe the author did not include IBB. I don’t know. It is probably no big deal though.

The study seems fine to me, given my limited understanding of the methodology.

Question for the author:  What is wrong with the “simple” method of computing peak age and improvement and decline rates (which Tango and I have used), which is, as I mentioned, simply summing, weighting, and averaging all players’ “differences” between year X and year X+1?


#10          (see all posts) 2007/06/06 (Wed) @ 04:15

CDM,

Do you know the author of the paper or something? This whole exercise is so flawed, it’s pointless. My favorite part is the all-time player rankings by age-adjusted OPS.

Seriously, estimating age effects is not that hard, which is why it’s amazing that Fair has bungled it so badly. Yes, there are difficult, if not impossible, problems to solve like adjusting for improving quality of play (though I have posted my solution here), but those are really outside of the scope of such a study. Regress to the mean, find your deltas, go home.


#11          (see all posts) 2007/06/06 (Wed) @ 14:00

MGL,

My apologies if I misread your post, or if I was uncharitable in mine.  Perhaps I shouldn’t have invoked the name of Joe Morgan :)

David,

No, I have no relation to the author.  The first time I saw the paper was when it was posted here.  If I’ve come out in its defense, it is only because I felt the criticism was unfair.  People think so highly of their own work, and yet are so quick to deride and dismiss the work of others.

I’ve seen a lot of research that could be accused of being so flawed as to be pointless (e.g., http://www.hardballtimes.com/main/article/how-much-is-matsuzaka-worth/ ).  If you read the article fairly, and think it falls into that category, and can make a coherent argument to that effect (so far, you’ve made none), that’s great. Then you can call it a day and feel smarter than an Ivy League economist.  That’s always a good feeling. :)


#12    Tangotiger      (see all posts) 2007/06/06 (Wed) @ 14:23

If I’ve come out in its defense, it is only because I felt the criticism was unfair.  People think so highly of their own work, and yet are so quick to deride and dismiss the work of others.

Is your last statement in context with the rest of your paragraph?  Or was it an add-on to talk about “people” in general?  Exactly who and what are you talking about?

***

The rest of this post applies to everyone, and is not necessarily directed at cdm, though it was inspired by his post.

I suggest that if one wants to criticize someone specifically, do so, rather than saying “people”, at which point one may be pointing the finger at me (since I’m the main critic here) or MGL or maybe anyone on this thread or linked threads.

(Nate Silver for example, a person who I respect, did this kind of painting when discussing forecasting systems.  Without singling any of them out, he casts a shadow over all of them.  I’m guessing he meant THT, but he could have meant Marcel, or Shandler, or who knows what.)

As long as we can all be respectful of each other’s opinions, there’s no reason to pull punches, as long as they are above the belt. I’m happy to give Clay, Woolner, MGL, or whoever, a bloody nose, if they don’t protect their face.  If you keep your elbows up, that wouldn’t happen.

And, same back to me.  If what I said is b-llsh-t, then call me out on it.  But, articulate the reason.  “Are you going to bark all day, little doggie, or are you going to bite?”

Also, it’s more productive to critique an article rather than critique a critique of an article.  At this point, I feel I should be critiquing a critique of the critiques of the article!  Let’s stick to the merits, if we can.


#13    Guy      (see all posts) 2007/06/07 (Thu) @ 11:12

I guess I come down somewhere between CDM and DSG.  I think the Fair paper has some fairly serious flaws, especially in terms of identifying late-career “overperformers,” as I said in posts over at WoW.  On the other hand, David’s dismissal was a little glib.  The delta method has its own problems, in that it’s hard to know what the right amount of regression is, and how to weight samples, and even small differences resulting from those decisions have a huge impact once you chain the results.

CDM:  What do you think Fair has added to what Jim Albert did earlier (http://personal.bgsu.edu/~albert/papers/career_trajectory.pdf)?  Is it mainly the allowance for different pre- and post-peak curves, or has Fair made other improvements as well?

The quadratic method, relying on players with a long history, obviously is limited to very good players (Albert requires 5,000 PAs, Fair 10 FT seasons).  But that’s only a problem IF good players have a distinct aging curve, which is possible but not necessarily so.  I’m more concerned about a kind of selective sampling problem that the delta method also faces:  if players have different curvatures (as Albert seems to show), then players with long careers will not just be above-average in talent, but will also tend to have flatter than average aging curves.  That is, a player who reaches 90% of peak at age 20 is more likely to play in the majors than a player who is only at 75% of his peak.  Similarly, a player who declines slowly is much more likely to be playing at age 37 than a rapid-decline player (given equal peak talent). 

I see two ways to try to deal with this.  One is to use MLEs for older players still in the minors.  There aren’t a lot of 39-yr-old minor leaguers, but there may be enough 31-35 yr. olds to substantially improve our estimates of decline in those years.  Then, having done that, we can extrapolate from the 27-to-35 decline rates to make better estimates of post-35 decline, rather than relying only on the tiny number of (mostly flat-curve) players who are still playing at those ages.

A related point:  I’d like to see age studies look at total productivity (such as RAR) as well as rate stats.  If we look at RAR, we’ll see far sharper declines due to reduced PT.  I don’t think either approach is “right” per se, but together they would tell the aging story better than either approach by itself.


#14    MGL      (see all posts) 2007/06/07 (Thu) @ 19:47

Guy (and others), why is regressing necessary when using the “delta” approach?  Are you guys (no pun intended) talking about regressing every yearly performance toward the mean, using the appropriate regression amount given only the number of PA in that year?  For example, if a player hits 30 HR in 1999 (in 600 PA) at age 28 and then 5 HR (in 300 PA) in 2000 at age 29, you would use the 30 regressed to maybe 22 per 500 PA (using only the 600 PA as the sample size to figure out how much to regress) and then maybe 11 per 500 PA for the 5 regressed, using the 300 PA to do the regression?  Why not just use the actual performance numbers, non-regressed?


#15    tangotiger      (see all posts) 2007/06/07 (Thu) @ 20:42

If you don’t regress, you end up with an impossible situation like here:

http://www.tangotiger.net/adjacentPitching.html


#16    Guy      (see all posts) 2007/06/07 (Thu) @ 22:53

The problem is that some of the players who were unlucky in year 1 don’t appear in the year 2 sample.  Essentially, your year 1 sample always consists of more lucky than unlucky players.  Then, regression to mean creates an exaggerated decline. 
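
The selection effect described above can be illustrated with a small simulation. This is a sketch under deliberately artificial assumptions: every player has identical, unchanging true talent (so the real aging effect is zero), observed performance is talent plus standard normal luck, and anyone whose year 1 falls below a hypothetical cutoff never gets a year 2. None of these numbers come from real data.

```python
import random


def simulate_survivor_delta(n_players=20000, cutoff=-0.5, seed=1):
    """Average year-2 minus year-1 change among players who kept their jobs.

    True talent is 0.0 for everyone in both years, so any non-zero
    average change is produced purely by who is allowed into the sample.
    """
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_players):
        year1 = rng.gauss(0.0, 1.0)  # talent (0) plus luck
        if year1 < cutoff:
            continue  # unlucky in year 1: no year 2, drops out of the pairing
        year2 = rng.gauss(0.0, 1.0)  # fresh luck, same talent
        deltas.append(year2 - year1)
    return sum(deltas) / len(deltas), len(deltas)


mean_delta, survivors = simulate_survivor_delta()
print(survivors, round(mean_delta, 2))
```

Even though nobody aged at all in this toy world, the surviving pairs show an average decline of roughly half a standard deviation, because the year 1 sample is tilted toward the lucky.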

However, I would think you could eliminate a lot of that just by setting a minimum career PA high enough to get rid of the flotsam and jetsam who have one decent season and then decline, but not so high that it becomes a study only of long-career players (maybe 2000 PAs?). 

* *

A related issue is what to do with a player’s final season.  I know Tango thinks these should be omitted, but I’m not sure that’s right.  The final season is not itself a problem: if a player has a lousy season at age 35 and then retires, he belongs in the 34-35 comparison.  Leaving him out will skew your sample toward players who were lucky in year 2, and thus understate overall decline.  Where he creates a problem is in the 35-36 comparison (in which he doesn’t appear), because it’s likely he would have rebounded somewhat at age 36 were he allowed to play.  Perhaps this could be remedied by taking all the age 35 final season players, creating a Marcel-like projection for their age 36 season, and see how much including them changes your 35-36 results.


#17    MGL      (see all posts) 2007/06/08 (Fri) @ 01:38

Why would you ignore the final season when you are regressing toward the mean?  And if you are ignoring the last season because it is likely an unlucky one (that is why the player retires), then you must ignore the penultimate season for players who are old or not that good because it is likely a lucky one otherwise the player would have retired.  And so on.

Any season that has a subsequent season is a lucky season and regressing to the mean is NOT going to solve that problem.  In fact, I still don’t see why you must regress to the mean.  Tango’s “impossible” results in post 15 above are because he is not using the “delta” method, right?  I use the delta method all the time in my aging research without doing any regressing and I come up with absolutely normal results.  I do come up with a lower peak age than I think the real peak age is, but I think that is due to two things - one, the selective sampling problem above, and two, the problem of the overall league quality getting better each year (on the average of course).


#18    MGL      (see all posts) 2007/06/08 (Fri) @ 06:00

I’ve read this article a bunch of times and I can’t really tell if you are using the “delta” method and if you are weighting by the lesser of the PA:

Since that probably made no sense, let’s take an example. Rick Wise, at the age of 21, in 1966, faced 413 batters, and hit 3 of them. (It’s actually 416, but I removed the 3 batters he IBB.) That’s a rate of .0073. In the AL in 1966, that rate was .0053. I also computed Wise’s rate at the age of 22 in 1967.

I repeated this step for all pitchers aged 21/22. I took the simple average of all the pitchers and league rates. This gave me a HBP rate of .0058 at age 21 and .0059 at age 22. The league rates were .0057 and .0058.

I don’t know what you are doing here.  You have to give a more detailed example.  The delta method is simply:

Take all pitchers from age 21 to 22, regardless of how many PA’s they had in year X or year X+1.  Take the difference in K rates (or whatever stat) and weight by the lesser of the two PA.  Get a weighted average for all pitchers. These are K’s per 500 PA and the actual number of PA are in parentheses.

           Age 21     Age 22
Pitcher A  70 (200)   80 (10)
Pitcher B  60 (100)   65 (300)
Pitcher C  80 (500)   75 (200)

For pitcher A, we have +10 weighted by 10 PA, for pitcher B, we have +5, weighted by 100 PA, and for pitcher C, we have -5 weighted by 200 PA.  I know there is a more rigorous way of doing the weighting, but I forgot what it is.  I think it might be 1/((1/PA1 + 1/PA2)/2), so that if PA1 is 200 and PA2 is 10, the weighting is 1/((.005+.1)/2), or 19.

So for age 21 to age 22, we have -400/310, or -1.29.  So pitchers decline by 1.29 K per 500 PA from age 21 to age 22.
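
The arithmetic above can be reproduced with a short Python sketch. The three pitchers and their numbers are the ones in the example; the function name `weighted_delta` and the data layout are just choices made for this illustration.

```python
# (K per 500 PA, actual PA) at age 21 and at age 22, from the example above.
pitchers = {
    "A": ((70, 200), (80, 10)),
    "B": ((60, 100), (65, 300)),
    "C": ((80, 500), (75, 200)),
}


def weighted_delta(pairs):
    """Average year-to-year change, weighting each pitcher by his lesser PA."""
    total = 0.0
    weight_sum = 0.0
    for (rate1, pa1), (rate2, pa2) in pairs:
        weight = min(pa1, pa2)
        total += (rate2 - rate1) * weight
        weight_sum += weight
    return total / weight_sum


delta = weighted_delta(pitchers.values())
print(round(delta, 2))  # -1.29
```

The numerator is 10*10 + 5*100 - 5*200 = -400 and the total weight is 310, which gives the -1.29 K per 500 PA quoted above.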

No need to regress (I don’t think).  No need to chain.

Given that data, how would you (Tango) do the calcs using the method in your article with and without regression?

As I said in my previous post, I think it might be better if we regress because we do have the problem of any pitcher with a year X+1 being lucky in year X.  So let’s do the same computations as above and regress 50% (for 500 PA) toward a league mean of 80, with the regression equation being 1-PA/(PA+500):

           Age 21     Age 22
Pitcher A  77 (200)   80 (10)
Pitcher B  77 (100)   74 (300)
Pitcher C  80 (500)   79 (200)

(3*10 - 3 * 100 - 1 * 200)/310, or -1.5, more of a decrease in this case because the pitchers at age 21 were quite a bit below the mean.  Not a realistic example, of course, since the reason we are regressing in the first place is because all of the players in year X will presumably be above average (lucky) if they also have some PA in year X+1, and certainly some of the unlucky ones will drop off and not have a year X+1.  Either way (lucky ones get another year or unlucky ones don’t - actually these are the same thing) we are left with a luckier than average sample in all year X’s.
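
A sketch of the regressed version of the calculation, using the league mean of 80 and the regression amount 1-PA/(PA+500) stated above. The function names and the `whole_numbers` switch are assumptions added for illustration; the switch exists only to show that rounding the regressed rates to whole numbers, as the table does, gives the roughly -1.5 figure, while carrying full precision gives about -1.57.

```python
LEAGUE_MEAN = 80.0    # K per 500 PA, the mean used in the example
REGRESSION_PA = 500   # at 500 PA a rate is regressed 50% toward the mean

# (K per 500 PA, actual PA) at age 21 and at age 22, before regression.
pitchers = {
    "A": ((70, 200), (80, 10)),
    "B": ((60, 100), (65, 300)),
    "C": ((80, 500), (75, 200)),
}


def regress(rate, pa, whole_numbers=False):
    """Pull an observed rate toward the league mean by 1 - PA/(PA + 500)."""
    shrink = 1 - pa / (pa + REGRESSION_PA)
    value = rate + shrink * (LEAGUE_MEAN - rate)
    return round(value) if whole_numbers else value


def regressed_delta(pairs, whole_numbers=False):
    """Lesser-PA weighted average change, computed on regressed rates."""
    total = 0.0
    weight_sum = 0.0
    for (rate1, pa1), (rate2, pa2) in pairs:
        weight = min(pa1, pa2)
        change = regress(rate2, pa2, whole_numbers) - regress(rate1, pa1, whole_numbers)
        total += change * weight
        weight_sum += weight
    return total / weight_sum


exact = regressed_delta(pitchers.values())
rounded = regressed_delta(pitchers.values(), whole_numbers=True)
print(round(exact, 2), round(rounded, 2))  # -1.57 -1.52
```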

One of the problems with doing any regressing at all is that we really don’t know exactly how much to regress and we don’t really know the mean to regress toward.  I would think that we want to use the means for each age.  And where do we get those means from?  We can’t use the mean of all 21 year old pitchers who also pitch at age 22. That would be tautological.  I guess we would use the mean of all 21 year old pitchers whether they pitched at age 22 or not.  That mean should be lower than the mean of all 21 year old pitchers who also pitched at 22.  If it is not, then our assumption of selective sampling is wrong.


#19    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 10:15

I do use the “delta” method, but rather than a differential, I compare the ratios.  Let’s take HBP.

The numerator is HBP
The denominator is (PA - IBB - SH) minus (HBP).

I compute the ratio for each season, and then divide the ratios for the two adjacent seasons.

For walks, the numerator is NIBB.  The denominator is (PA - IBB - SH) minus (HBP + NIBB).

So, this will give me the ratio of walks to non-walks.  Say, 100 walks to 500 non-walks (ratio is 0.20 to 1) in year 1, and 90 walks to 600 non-walks in year 2 (ratio of 0.15 to 1).

The “aging” therefore is 0.15/0.20 = 0.75.  This means that the walk-to-non-walk ratio was reduced by 25% between those two years.  The “weight” I used is the lesser of the numerator+denominator.
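
The walk example above, written out as a minimal Python sketch. The numbers are the ones in the post; the helper name `odds` and the tuple layout are assumptions made for this illustration.

```python
def odds(events, non_events):
    """Ratio of events to non-events (e.g. walks to non-walks)."""
    return events / non_events


year1 = (100, 500)  # walks, non-walks
year2 = (90, 600)

aging = odds(*year2) / odds(*year1)
# Weight is the lesser of numerator + denominator across the two seasons.
weight = min(sum(year1), sum(year2))

print(round(aging, 4), weight)  # 0.75 600
```

Here the two seasons cover 600 and 690 countable plate appearances, so the pairing carries a weight of 600.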

When I do SO, in order to keep the binomial alive, and not have competing parameters, the denominator will now be:
(PA - IBB - SH) minus (HBP + NIBB + SO).

So, that’s what that first chart shows.  And, if you work it out, whether you do it my way, or your way, you will get a peak age of 23 or 24 for pitchers.


#20    MGL      (see all posts) 2007/06/08 (Fri) @ 12:43

I’m still not sure that you are using the “delta method” (it can be a ratio or a difference - it does not matter), and I DON’T think that I will come up with a peak age of 23 or 24 using the true delta method.  Do you mean peak age for K or for a measure of total pitching production (like ERC or ERA+ or lwts or OPS against)?

If you can use the data for the 3 pitchers, A, B, and C, I have in my post above, and tell me how you would do an aging curve for the K rates given (without doing any regressions), I can then tell exactly what you are doing.  It is NOT clear what you are doing from the description in your article.


#21    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 13:16

These are K’s per 500 PA and the actual number of PA are in parentheses.

           Age 21     Age 22
Pitcher A  70 (200)   80 (10)
Pitcher B  60 (100)   65 (300)
Pitcher C  80 (500)   75 (200)

I rescale each pitcher:

Pitcher A (10 PA)
70/500*10 = 1.4 K
80/500*10 = 1.6 K

Pitcher B (100 PA)
60/500*100 = 12 K
65/500*100 = 13 K

Pitcher C (200 PA)
80/500*200 = 32 K
75/500*200 = 30 K

310 PA
Age 21: 1.4 + 12 + 32 = 45.4 K
Age 22: 1.6 + 13 + 30 = 44.6 K

Ratio of K to non-K:
Age 21: 45.4 / (310-45.4) = .172
Age 22: 44.6 / (310-44.6) = .168

Age21 to Age22 transition ratio: .168/.172 = .977

Do this for every pair of adjacent seasons, and chain.
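
The steps above can be sketched in Python as follows. The rescaling, the K to non-K ratio, and the transition ratio follow the worked numbers; the function names are assumptions, and the extra transition values passed to `chain` are hypothetical placeholders, not results. Carrying full precision gives a transition of about .979; the .977 above comes from dividing the already rounded .168 by .172.

```python
# (K per 500 PA, actual PA) at age 21 and at age 22, from the example above.
pitchers = [
    ((70, 200), (80, 10)),   # Pitcher A
    ((60, 100), (65, 300)),  # Pitcher B
    ((80, 500), (75, 200)),  # Pitcher C
]


def transition_ratio(pairs, per=500):
    """Ratio of K:non-K odds at the older age to the odds at the younger age.

    Each pitcher is rescaled to the lesser of his two PA totals, so both
    age groups contain the same pitchers with the same weights.
    """
    pa_total = 0.0
    k_young = 0.0
    k_old = 0.0
    for (rate1, pa1), (rate2, pa2) in pairs:
        pa = min(pa1, pa2)
        k_young += rate1 / per * pa
        k_old += rate2 / per * pa
        pa_total += pa
    odds_young = k_young / (pa_total - k_young)
    odds_old = k_old / (pa_total - k_old)
    return odds_old / odds_young


def chain(transitions, start=1.0):
    """Multiply successive transition ratios into a curve relative to the first age."""
    curve = [start]
    for ratio in transitions:
        curve.append(curve[-1] * ratio)
    return curve


t_21_22 = transition_ratio(pitchers)
print(round(t_21_22, 3))  # 0.979

# The 22->23 and 23->24 values below are hypothetical, for illustration only.
curve = chain([t_21_22, 1.01, 0.99])
print([round(v, 3) for v in curve])
```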


#22    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 13:21

(I updated post 21.  Disregard anything else you may have read in its place.)

And, for those new around here, the reason you use ratios (x/y) and not rates (x/(x+y)) is well-documented elsewhere in my blog or archives.  In short, that’s how you work Odds (by ratios).  Using rates will invariably lead to impossible situations, like rates above 1.00, depending on how high your starting mean is.  This is clear if you look at things like ZR or fielding percentage.
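
A small sketch of why the adjustment is applied to odds rather than to rates. The starting fielding percentage of .980 and the 3% improvement factor are hypothetical numbers chosen to make the point; the function names are likewise assumptions for this example.

```python
def rate_to_odds(rate):
    """Convert x/(x+y) into x/y."""
    return rate / (1.0 - rate)


def odds_to_rate(odds):
    """Convert x/y back into x/(x+y)."""
    return odds / (1.0 + odds)


start_rate = 0.980   # hypothetical fielding percentage
factor = 1.03        # hypothetical 3% improvement

scaled_rate = start_rate * factor  # applying the factor directly to the rate
via_odds = odds_to_rate(rate_to_odds(start_rate) * factor)  # applying it to the odds

print(round(scaled_rate, 4))  # 1.0094, an impossible rate
print(round(via_odds, 4))     # 0.9806, still a valid rate
```

Multiplying the odds can never push the converted rate past 1.00, no matter how high the starting mean is.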


#23    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 13:24

MGL is correct that Andy uses a more rigorous method for weighting.  In essence, if you have 200 PA and 10 PA, you would have a weight of:
2/(1/200+1/10) = 19

No big deal, though I suppose we should stick with Andy’s method.


#24    Guy      (see all posts) 2007/06/08 (Fri) @ 13:43

It seems to me that weighting by lesser PA (either Andy’s method or the shortcut) potentially creates a problem, given that PT and performance are highly correlated.  A player with relatively consistent performance will tend to get weighted more (both Ns will be high) than a player whose performance markedly improves or declines (which will cause one of the Ns to be small).  I would think this will flatten the age curve, by underweighting young guys who post big improvements and older guys in sharp decline (though partially offset by low weights also being given to young guys declining and old guys improving). 

And of course, weighting by PA will in general give more weight to better players (may be unavoidable).

I don’t know what the right solution is—you can’t weight all players the same if you’re going to include guys with 15 PA.  Maybe setting a reasonable minimum (200 PAs), then weight all players equally?

* *

MGL:  are you at liberty to post some of your age findings?


#25    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 13:44

With Andy’s method, the weights would be 19, 150, 286, meaning relatively, they are: 4%, 33%, 63%.

Compare this to the lesser-of-two approach of 10, 100, 200, or 3%, 32%, 65%.
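To check those numbers yourself, something like the sketch below works. The PA pairs for B and C are the ones in the example; the (200, 10) pair for A is taken from the 200 PA / 10 PA case in #23, which is an assumption on my part about which pitcher that was. The function names are illustrative only.

```python
def harmonic_weight(pa1, pa2):
    """Harmonic mean of the two PA totals."""
    return 2 / (1 / pa1 + 1 / pa2)

def lesser_weight(pa1, pa2):
    """Shortcut: the smaller of the two PA totals."""
    return min(pa1, pa2)

def shares(weights):
    """Each weight as a percentage of the total."""
    total = sum(weights)
    return [100 * w / total for w in weights]

PAIRS = [(200, 10), (100, 300), (500, 200)]   # pitchers A, B, C

harmonic = [harmonic_weight(a, b) for a, b in PAIRS]
lesser = [lesser_weight(a, b) for a, b in PAIRS]
print([round(w) for w in harmonic], [round(s) for s in shares(harmonic)])
print(lesser, [round(s) for s in shares(lesser)])
```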

Obviously, things are not always going to be so similar.  You really feel the effect when you have a big gap and a tight gap between the two data points (say lots of 10/500 and 300/300 cases).


#26    Tangotiger      (see all posts) 2007/06/08 (Fri) @ 14:17

I believe the reason that Andy weights the way he does is because the lesser number of PA means that the performance data has more uncertainty. 

Therefore, it would be wrong to set a threshold of say 200 PA, and then count all the year-to-year changes of each pairing the same, simply because those from lesser PA have larger uncertainty.

However, your point is valid that since PA is not an independent parameter (the better you perform, the more PA you get), it would seem that you still need to do *something*.

The answer is to do it both ways, and see what difference you get.

The absolute requirement however is to ensure that you have the exact same players with the exact same weights in both pools.  And then you need to chain the results, to, in essence, guarantee you have the exact same players throughout.  The Fair paper did not do that.


#27    Tangotiger      (see all posts) 2007/06/13 (Wed) @ 22:21

MGL, do you concur that if you don’t regress, then you end up with a peak age in the early 20s using the delta method (whether your way or mine)?


#28    Tangotiger      (see all posts) 2008/01/11 (Fri) @ 10:18

The published paper is here:
http://www.bepress.com/jqas/vol4/iss1/1

I only looked at the section where he critiques the delta approach (section 5).  I agree with that critique, as I’ve discussed here:
http://tangotiger.net/aging.html

It is strange that he actually links to my article (which makes the observation about the retiring effect) yet presents it as something seemingly new.  In any case, this article drives home the point that you need regression toward the mean to counteract the selective sampling issue:
http://tangotiger.net/adjacentPitching.html

I will re-read his paper, and see whether the comments from my review (at the top of this blog) were addressed.


#29    Tangotiger      (see all posts) 2008/01/11 (Fri) @ 13:55

(Note: when I refer to a page number, it’s the PDF page number, not what’s on the actual printed paper.)

***

I am reading the new version of the Fair paper.  The parts I have in quotes below were my ORIGINAL review comments to Fair.  I will be commenting on the comments, to see if Fair applies what I had said.

“but surprisingly there seems to have been no rigorous attempt to estimate them”

There are several articles noted here:
http://www.tangotiger.net/#Forecast

Good job for Fair for acknowledging.

The choice of “10 years” has its own selective sampling issues, as only certain quality of players will be in that pool. That pool of players is hardly representative of the population of MLB players. See this mini-study for more information:
http://www.tangotiger.net/archives/artAging.shtml#1013

The effect is for the curve to be flatter than it should be.

You acknowledge injury as a possible outcome, but ignore it afterwards. It’s a big deal, especially for pitchers.

Fair acknowledges by saying “The aim of this paper is to estimate aging effects for injury-free, career baseball players, and the sample was chosen with this in mind.”

With that rather large provision, the conclusions of the paper, whatever they may be, will ONLY apply to the population of players that fits the sampling criteria.  In effect, it is an optimistic forecast compared to the average ballplayer, for whom you can’t know in advance whether he will stay injury-free.  That is, if Tulowitzki plays for 10 years (implying that he was injury-free AND good enough to play for 10 years), his aging pattern can be based on Fair’s numbers.

A 9% increase in ERA from age 26 to 37 (3.50 to 3.81) is pretty much impossible to believe. That is extremely flat. The sampling issues noted above need to be addressed.

His new paper on page 13 says: If a pitcher’s peak ERA is 3.50 (the mean of ERA in the sample is 3.50), then the 0.314 value for R37 means that his predicted ERA at age 37 is 3.814, an increase of 9.0 percent.

That is, he has the same conclusion, which is absurd on its face.  Who are the 37 year old pitchers in his study?  It surely is a subset of the 27 year old pitchers.  And, that subset is all the good pitchers.  What a terrible choice he made here.

You talk about the 90s (steroid-era), but you also acknowledge that you did not adjust for the change in run environment. The 1993-2006 time period is a huge offensive era. Without adjusting for the context, it’s going to look like hitters are going better in those years. As well, the new parks are more offense-friendly these days.

While the overall model by age won’t change if you adjust for parks and year, you can’t then ignore these issues when looking at 1990s players in particular. They need to be adjusted, if you are going to focus on them.

He acknowledges the Larry Walker / Coors issue, but doesn’t give the necessary weight to the explosion in runs per game that occurred between 1992-1994.  Plain and simple, there was a sudden and dramatic change that occurred somewhere around 1993.  It was a one-time kind of event, something far more explainable with a sudden imbalance of hitters/pitchers due to expansion or to a juiced ball.  The steroids effect, whatever that might be, would have a prolonged effect.

While you choose 10+ year periods for each pitcher, you don’t have the same pitchers in each age group. You can have a 10yr period from ages 23-32 or 25-34, or a 15yr period from 22-36, etc. As noted in one of my linked articles, you need to pair the age groups to have the same pitchers and the same weights in each age group. So, select the same pitchers at age 23 and 24, and weight them equally. Then, select the same pitchers at age 24 and 25, etc. The pitchers of 23/24 do not need to be the same as 24/25.

His 10-yr period is per player, and therefore, you would not have the exact same number of players at each age level.  For example, someone’s 10-yr period could start at age 25, and another at age 21.  So, when you compare the age 24 and age 25 players, it’s not the same players.

I am again disappointed.

And the comparisons of ERA to chess is not appropriate. Losing 15% in ERA would be the equivalent of losing 7.5% in OBP. It’s just not a 1:1 comparison.

He says: For OBP in line 1, the percent lost is .020 divided by .354, which is 5.6 percent. Finally, for ERA in line 1, the percent lost is .520 divided by 3.50, which is 14.9 percent.

And thereby completely ignoring what I said.  Another disappointment.  A 5.6% change in OBP would actually imply an 11-14% drop in runs scored, a number fairly close to his pitcher numbers!

And he further makes it worse on page 20 when he still talks about chess and the other sports.  I hate to sound like Mad Dog Russo, but Terrible Job, Right There (© ).

OPS is not a good measure to use for ranking players, in addition to the other problems I noted earlier. Linear Weights would have been a better choice.

A very poor job at the end of page 3, top of page 4, in terms of trying to get a good hitting measure.  OPS is definitely inferior to Linear Weights, and his brushoff of this is inexcusable, especially since I alerted him.  He said this: … there are no rigorous ways of testing whether one measure is better than another.  Oh, really?

He also dismisses methods to adjust for the run environment, believing that the changes in run environment might apply to only some players.  While this is true for some variables (like his example of some guys being on the juice), it is far more significant that all players are affected in most cases (1987, change in strike zone, juiced ball, etc).

He said this: This work is based on the assumption that the 15-year-or-so period that a player plays is stable for that player. This assumption is obviously only an approximation, since some changes clearly take place within any 15-year period, but it may not be a bad approximation.

I am quite disappointed in his statements here.

ERA also has its own problems (being fielder-dependent as well as the sequence-of-events dependent), and a “component ERA” similar to Linear Weights, would have been more appropriate.

Ignored.

***

This is a further review of his paper, and not a review of him applying my previous review:

To get a sense of magnitudes, if a player’s peak OPS is 0.800 (the mean of OPS in the sample is 0.793), then the -0.045 value for R37 means that his predicted OPS at age 37 is 0.755, a decrease of 5.6 percent.

A change in OPS of .045 implies around 1 win per 162 games (700 PA).  A 1 win change in hitting, from his peak to age 37, is fairly low.  This simply proves how flat his curve is.  In my article here:
http://tangotiger.net/aging.html
I get a 1.5 to 2.5 win difference per 162 G, depending which of the two columns of the table at the end of the article you prefer.


#30    Guy      (see all posts) 2008/01/11 (Fri) @ 14:57

Very disappointing indeed.  We also had a good thread on this paper over at Wages of Wins, which I have to think Fair saw, and which raised several of your concerns and others:  http://dberri.wordpress.com/2007/06/04/rocket-science-clemens-and-‘roids/

The decision to use percentage changes in different statistics, as you say, is a mistake.  And to use % change to compare across sports is just silly; he could at least have used SDs. But his refusal to normalize hitter data for run environment, after being warned of his mistake, is deeply irresponsible.  The paper clearly implicates a number of players as potential PED users, simply because their career happened to begin before 1994 and end after 1994.  This creates the illusion of extraordinary late-career performance.  If he was skeptical of our claims on this issue, a quick look at pitchers active in the same period would have shown him the problem—that group would have tended to greatly underperform in their later years. 

Maybe Fair just didn’t understand the importance of normalizing to league offense. But it wouldn’t have been that hard to do, and at worst would prove to be redundant.  Given the relevance of this work to the PED issue, I think Fair had an obligation to get it right. And he didn’t.


#31    Tangotiger      (see all posts) 2008/01/11 (Fri) @ 15:12

Here’s the link Guy was posting:
http://dberri.wordpress.com/2007/06/04/rocket-science-clemens-and-‘roids/

All that gobbledygook is because of the apostrophe in the URL.


#32    Tangotiger      (see all posts) 2008/01/11 (Fri) @ 15:14

Okay, that didn’t work.  Cut/paste this:
http://dberri.wordpress.com/2007/06/04/rocket-science-clemens-and-

then put an apostrophe like this:

Then type this:
roids/


#33    Tangotiger      (see all posts) 2008/08/01 (Fri) @ 11:42

Bumping in response to Phil’s post here:
http://sabermetricresearch.blogspot.com/2008/07/batters-improve-when-young-but-it-looks.html

And here:
http://sabermetricresearch.blogspot.com/2008/07/wanted-pointers-to-baseball-aging.html

Phil: I hope you give the Fair paper its deserved kick in the a$$, as my post 29 does.


#34          (see all posts) 2008/08/01 (Fri) @ 18:06

Just finished rereading the Fair study, and all these comments.

Tango, I agree with your criticisms, but my main problem with the Fair study isn’t actually any of those.  My main problem is that the assumptions are so incorrect that even with perfect logic and math after that, the conclusions are still doubtful.

Why assume that every player has the same ascent and descent, when we know that fast players age quickly and sluggers age more slowly?  Why limit the study to 10-year players, when you’re selectively sampling players who probably age more slowly than most?

Here’s a question: “if you retrospectively try to fit a couple of quadratics to the careers of players who had successful careers, which players don’t fit the curves very well?” The Fair paper answers that question, but who cares?  It’s not a baseball question.

Here are some real baseball questions:

1.  If a 25-year-old has a wOBA of .350 this year, what is his expected wOBA next year?

2.  What kinds of players age more gracefully than others?

3.  At what age does a very skilled player peak?  What about just a good player?  What about a mediocre player?

These are hard questions, mostly because of selective sampling issues.  The hard work is figuring out a method to resolve those issues.  The Fair paper just assumes them away, and does some fancy math to get very specific equations that may or may not bear any resemblance to reality.


#35          (see all posts) 2008/08/01 (Fri) @ 18:13

re: #27: is there a simple explanation of why, if you don’t regress, you get a too-early peak?


#36          (see all posts) 2008/08/01 (Fri) @ 18:17

OK, I got it.  Selective sampling always works in one direction: making declines seem bigger (and improvements seem smaller).  So the small improvement from 26 to 27 (say) winds up showing up as a decline, and the peak shows at 26 instead of 27.


#37          (see all posts) 2008/08/01 (Fri) @ 18:51

Hang on a minute ... why is regressing to the mean the right way of correcting the (adjacent-year method) selective sampling problem?  I think it REDUCES the problem, but only by coincidence.

Three guys, 35 years old.  They each have .250 talent.  Next year, they’ll have .240 talent.

But this year, just by luck, A hits .300.  B hits .250.  C hits .200.  So, next year, A gets to play full-time.  B plays half-time.  C has to retire.  A and B hit .240 next year.

The reality is they all dropped 10 points due to aging.  But it looks like A dropped 60 points, and B 10 points.  Since A is weighted twice as heavily as B (because of playing time), it looks like decline of 43 points on average.

Now, suppose you regress the first season, halfway back to .250.  A is now .275, B .250, C .225.  Now, A appears to have dropped 35 points, B 10 points.  The apparent drop is now 27 points.  That’s less than 43 (without regressing), but still not the actual value of 10.

Regressing provides a counter-effect to the selective sampling effect, which makes the answer a bit less wrong.  But I don’t see why there is a theoretical reason for doing it.  All I see is an arbitrary way of maybe getting closer to the right answer, for no particular reason.

Am I wrong?


#38          (see all posts) 2008/08/01 (Fri) @ 19:03

Okay, I see it.  My example is wrong because the 50% regression doesn’t get back to the real talent.  In a world where everyone was a .250 hitter, like my example, you’d regress 100% instead of 50%, and get the right answer. 

So regressing IS the theoretically correct way of doing it.
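Here is the three-player example from #37 as a small sketch, with the amount of regression as a parameter. The function name and layout are just for illustration. C is left out because he retires, and A counts double because he plays full-time against B’s half-time.

```python
TALENT = 0.250   # everyone's true talent in year 1 of the example

# (year 1 average, year 2 average, year 2 playing-time weight)
PLAYERS = {
    "A": (0.300, 0.240, 2),
    "B": (0.250, 0.240, 1),
}

def apparent_decline(players, regress):
    """PT-weighted drop after regressing year 1 toward TALENT by `regress`."""
    total = 0.0
    weight_sum = 0
    for year1, year2, weight in players.values():
        regressed = year1 + regress * (TALENT - year1)
        total += weight * (regressed - year2)
        weight_sum += weight
    return total / weight_sum

for amount in (0.0, 0.5, 1.0):
    print(amount, round(1000 * apparent_decline(PLAYERS, amount)))
```

With no regression the apparent drop is 43 points, with 50% it is 27, and with 100% it is the true 10.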

Still, it’s hard to regress to the true talent without knowing what the true talent is for that kind of ballplayer.  You don’t want to regress Wade Boggs to the league-average of .250.

P.S.  Sorry about asking questions and then answering them myself two minutes later.

But, in general, regressing to the average is probably good enough for practical purposes.


#39    tangotiger      (see all posts) 2008/08/01 (Fri) @ 19:10

Right, you need to regress to the talent level at that age group.  Or, at whatever group you are drawing the players from.  Your example is perfect.


#40          (see all posts) 2008/08/02 (Sat) @ 01:40

It is funny that this thread was revived, because I was thinking about starting a new thread with some thoughts I had about aging studies.

As we all know and has been mentioned herein and in other forums, aging studies are fraught with selective sampling problems that tend to make it look like players age worse than they do and tend to shift the peak age downward.

I don’t like regression as a solution because determining the peak age and proper curve is highly dependent on the exact method for regressing to the mean (how much to regress for any given PA), as Tango once showed in one of his pitcher aging curve studies (which is on his web site I think).

We definitely do not ever know exactly what those regressions are, so using them to “correct” these improper aging curves we get when we don’t do any regressing at all, is not the perfect solution, by any means.

Here is what I have been thinking about:

Why are we doing any weighting at all of the “deltas?” A “delta” is the difference in performance between one age and another for each player in our sample.  The weighting (including players who do not play at all in year 2) is what causes the selective sampling problem of course.

Players who have been unlucky in year 1 tend to get less playing time in year 2 (and in year 1), and vice versa for players who got lucky in year 1.  Phil gives a nice example above where it looks like the aging curve is a weighted 43 points of decline when it is only 10 points in reality.

I don’t think that any of the “deltas” should be weighted!  No matter what the number of PA in year 1 and year 2, the differences (deltas) should be weighted equally.  The only reason we are weighting is to smooth out the fluctuations we get when some of the players have ridiculous deltas because of small sample sizes in year 1 or year 2 (or both).  But the price we pay for that weighting and smoothing out of the data is the selective sampling problem I mentioned above, which causes the aging curve to be shifted to the left - a lot.

But, I don’t think there is any reason to weight the deltas!  After all, if we have 3 players in year 1 who hit .290 in 600 PA, .270 in 300 PA, and .180 in 10 PA, we simply want to see how each of them does in the next year and take a simple average of those differences. No weighting.  If the .180 player happens to hit .348 in 23 PA the next year, so what?  If we have enough players in our sample and enough years, those crazy deltas for the low PA players will even themselves out anyway.  We don’t have to do that by weighting the samples.  In fact, we should NOT weight the samples.

In fact, I believe that if we don’t weight the deltas - we treat each “player season pair” equally as long as they had at least 1 PA in year 1 and in year 2, we completely get rid of the selective sampling problem.

Now, some of you may be thinking, what about the players who do poorly in year 1 and have ZERO PA in year 2 (they retire, get hurt, sent to the minors, etc.)?  You are on the right track!  We have to include these guys as well.  Our goal is to “conduct” an experiment whereby we have every player in the major leagues at age 21 (and other ages) in year 1 all play again in year 2 and then we see how much their performance improved or declined on the average.

We don’t care how many PA’s they get in year 1 or in year 2, but we DO care whether they get a chance to play in year 2.  In order to conduct our experiment properly, we have to let all our players play in year 2.  We can’t just let the good and lucky ones play in year 2 and exclude the bad and unlucky ones, which is what happens in reality.

So what do we do?  Good question!  We let those players who played in year 1 but not in year 2 play!  How do we do that?  Simple. If they played in the minors in year 2, we use their MLE’s.  Not perfect, but not too bad either.  If they do not play at all, for whatever reason, we assume that they played a certain number of PA (actually it does not matter how many PA, because remember we are weighting each delta or player season pair equally) at a certain rate.  What rate is that, you ask, since that is what we are trying to determine?  Easy.  We do a Marcel or other projection and pencil that in as our player’s performance in year 2, even though he did not actually play!  How much do we adjust him for aging?  Another good question, since that is what we are trying to determine.  We start with a reasonable age adjustment, then we do the whole thing and come up with an aging curve.  Then we use that “new” age adjustment and do the whole thing again (an iterative or recursive process like we do with “strength of opponent” adjustments) a few more times.

Voila, we have a proper aging curve with no selective sampling problems.

Is anyone up to doing this?  I really feel like this is the correct way to do it and I don’t think anyone has ever done it this way.  I think that it is imperative to use all players, and to weight them equally.

There are only two problems which come up, but I think they are minor compared to the selective sampling problems that occur with weighting (by the harmonic mean or lesser of the two PA pair).  One, if players with few PA (bad players, part time ones, etc.) at any one age have a significantly different true aging curve than players who get more PA, they will be overrepresented in the resultant overall curve, since we are weighting everyone equally this time.

Two, and this is a problem with all aging research, how do we separate aging from leagues getting better over time?  I am not sure how to do this.

If we use absolute stats (rather than league normalized ones) like OPS, if the entire pool of pitchers gets slightly better every year, it will look like batters get slightly worse every year and will also shift the aging curve a little to the left (like the selective sampling does).  If we use league normalized stats (like OPS+ or lwts), it will also look like players get worse every year, since the league as a whole would get better and a player who stays the same will look like he got worse.

Yet in order to see if leagues get better each year, we probably have to know what the aging curve looks like in the first place.  Maybe there is a way to know whether and by how much leagues as a whole change every year with time and then we can incorporate that into my method for doing an aging curve.

So, again, anyone want to take a preliminary shot at this?

Just do the delta method for aging picking any age interval (22-35?), use all players in year 1, use MLE’s or projections in year 2 for players who played in year 1 but did not play in the majors in year 2, and above all, do not weight the deltas.  Take a simple average.

This is Phil’s example of players:

But this year, just by luck, A hits .300.  B hits .250.  C hits .200.  So, next year, A gets to play full-time.  B plays half-time.  C has to retire.  A and B hit .240 next year.

So for these players, we have -.060 for player A and -.010 for player B.  For player C, we estimate that he would hit .225 before aging (.200 regressed 50% toward .250 - the regression amount could be different depending on his PA in year 1).  Our preliminary aging decline is 10 points (wherever we got that - it doesn’t matter - we will eventually home in on it and redo the calcs), so he hits .215 in year 2, which is an increase of .015, or 15 points.

So we take the simple average of -.060, -.010, and .015, which is -.0183 points or a decline of 18 points from age 35 to age 36.

Now we would re-do that using 18.3 points of decline for player C rather than 10 points.  Now we have -.060, -.010, and +.0067.  So the new average decline is 21.1 points.  We do that again a few more times and we stabilize our average delta for age 35/36.
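The loop is easy to sketch in Python. This keeps A at -.060 and B at -.010 on every pass and only re-projects C, who is regressed 50% toward .250 and then aged by the current estimate of the decline. The function name and the fixed count of 50 passes are my own choices for illustration.

```python
def average_delta(assumed_decline):
    """Simple (unweighted) average delta for A, B and the filled-in C."""
    delta_a = 0.240 - 0.300
    delta_b = 0.240 - 0.250
    c_regressed = 0.200 + 0.5 * (0.250 - 0.200)   # .225 before aging
    c_projected = c_regressed - assumed_decline   # pencilled-in year 2
    delta_c = c_projected - 0.200
    return (delta_a + delta_b + delta_c) / 3

decline = 0.010            # preliminary aging adjustment
history = []
for _ in range(50):
    decline = -average_delta(decline)
    history.append(decline)

print([round(1000 * d, 1) for d in history[:4]], round(1000 * decline, 1))
```

The first two passes give 18.3 and 21.1 points, and in this toy example the estimate settles at 22.5 points.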


#41    Colin Wyers      (see all posts) 2008/08/02 (Sat) @ 01:56

I was thinking about this earlier as well, and came to a radically different conclusion than you did. Instead of using Marcels for year 2, I think the key is to use them for year 1. By using three years of PAs, properly weighted, you get rid of the selective sampling issue - a “lucky” player in year X will have his performance accounted for when you include year X-1 and X-2 in it, putting it on equal footing with X+1.


#42          (see all posts) 2008/08/02 (Sat) @ 03:13

I screwed that up.  You have to weight the deltas by the PA in the first year, but you don’t care about the second year.  As long as everyone plays the second year, the stats in the second year for everyone are an unbiased estimate of their true talent (at that age and point in time).  So the mistake is weighting by the min or harmonic mean of the PA for both years, and not using anything if a player does not play in year 2.

I did an aging curve for lwts, OPS, OBA, BA, and SA for players from 01 to 06 only, using the method above.  Only lwts is normalized to the league.  If a player changed leagues from one year to the next, I did not use his delta for that age pair (but I did “fill in the blank” for him as if he did not play at all in year 2).  If a player skipped a year or more, I also “filled in the blanks.”

I should have “filled in the blanks” (done a projection) using multi-year data for a player, but I only used year 1 data and regressed according to the number of PA in year 1.

The 50% regression PA I used were:

lwts 300
OPS 350
OBA 400
BA 800
SA 350
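As a sketch of how a “50% regression PA” is typically applied: the weight on the mean is k/(PA+k), so a player with exactly k PA is regressed halfway. That functional form is the usual convention and an assumption here, since the comment does not spell it out. The .260 league mean and the 800 PA line in the usage are hypothetical.

```python
# PA at which each stat is regressed 50% toward the mean (from the list above)
REGRESSION_PA = {"lwts": 300, "OPS": 350, "OBA": 400, "BA": 800, "SA": 350}

def regress(observed, pa, mean, stat):
    """Regress an observed rate toward `mean`, assuming weight k/(PA+k) on the mean."""
    k = REGRESSION_PA[stat]
    weight_on_mean = k / (pa + k)
    return observed + weight_on_mean * (mean - observed)

# Hypothetical hitter: .300 BA in 800 PA, regressed toward a .260 mean
estimate = regress(0.300, 800, 0.260, "BA")
print(round(estimate, 3))
```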

After a couple of iterations, I got:

Age pair   lwts    OPS    OBA     BA     SA
20-21       6.4   .040   .010   .016   .029
21-22       1.0   .010   .006  -.002   .005
22-23       3.3   .023   .008   .004   .015
23-24       2.3   .016   .006   .003   .010
24-25       1.2   .011   .005   .002   .006
25-26       1.1   .009   .004   .002   .006
26-27       0.8   .005   .003   .000   .003
27-28      -1.3  -.010  -.001  -.004  -.008
28-29       0.1   .000   .001  -.001  -.001
29-30      -1.4  -.011  -.002  -.004  -.009
30-31      -1.8  -.013  -.004  -.005  -.010
31-32      -1.3  -.009  -.004  -.003  -.005
32-33      -3.4  -.023  -.007  -.009  -.019
33-34      -4.2  -.029  -.010  -.009  -.019
34-35       0.5  -.005   .000   .000  -.005
35-36      -4.6  -.030  -.011  -.008  -.020
36-37      -0.7  -.015  -.007  -.008  -.009
37-38      -3.9  -.036  -.011  -.006  -.025
38-39      -5.0  -.046  -.018  -.009  -.028
39-40       1.4   .007  -.002  -.004   .009

Overall production (lwts and OPS) peaks sharply at 27.  Actually, so does everything else (OBA, BA, and SA).

For age, I am using year of season minus year of birth.
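Chaining the deltas is just a running sum from an arbitrary baseline. Here is a sketch using the lwts and OPS columns; the 27-28 OPS entry is read as -.010, and the function names are illustrative.

```python
from itertools import accumulate

# Year-to-year deltas for age pairs 20-21 through 39-40
LWTS = [6.4, 1.0, 3.3, 2.3, 1.2, 1.1, 0.8, -1.3, 0.1, -1.4,
        -1.8, -1.3, -3.4, -4.2, 0.5, -4.6, -0.7, -3.9, -5.0, 1.4]
OPS = [.040, .010, .023, .016, .011, .009, .005, -.010, .000, -.011,
       -.013, -.009, -.023, -.029, -.005, -.030, -.015, -.036, -.046, .007]

def chain(deltas, first_age=20):
    """Cumulative level at each age, relative to first_age = 0."""
    levels = [0.0] + list(accumulate(deltas))
    return {first_age + i: level for i, level in enumerate(levels)}

def peak_age(curve):
    """Age with the highest chained level."""
    return max(curve, key=curve.get)

lwts_curve = chain(LWTS)
ops_curve = chain(OPS)
print(peak_age(lwts_curve), peak_age(ops_curve))
```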


#43          (see all posts) 2008/08/02 (Sat) @ 03:22

Colin, that is what we are talking about when we say to “regress” year 1.  We really mean establish a true talent level which means doing a Marcel using any past data you can.  Obviously if year 1 is the first year of a player’s career, you are just regressing the stats in year 1.

As I said, I don’t like that method as it makes everything too contingent upon the projection methodology.  We know that Marcels are only an approximation.

As I also said, I don’t think we need to mess around at all with year 1 data or year 2 data unless the player did not play in year 2.  It is much cleaner.  There is no need to do a Marcel for year 1 as you are proposing.  Using your method, not only do you have to do a Marcel for year 1, but if the player did not play in year 2, you have to do a projection for that year also, which is going to be exactly the same as year 1 plus an aging adjustment.

My method is EXACTLY the same as you would do if I asked you to conduct a perfect experiment in order to find aging curves. You would simply take everyone who played in year 1, no matter how many PA, and force them to play a lot in the next year, and you would get a perfect answer.  That is what I am doing, except that I can’t force them to play a lot, so I use whatever they happen to play in year 2 (which is fine) and if they don’t play at all, I am estimating what they would have done if they did play.  And since I am not weighting anything by the number of PA in year 2, I am not creating a biased sample.  If I were using PA in year 2 to weight something, the better (luckier) players in year 2 would be over-represented since they tend to get more playing time.  And there are not many player seasons that fall into that category (of not playing in year 2), so I am only “monkeying” with a little bit of the data.

You are monkeying with ALL the data, which is not a good thing when you are trying to find nuances in the aging curves.


#44          (see all posts) 2008/08/02 (Sat) @ 08:10

I’m not sure equal weighting would work, because of year 2 selective sampling.

Two guys, expected to hit .250.  One guy hits .100 in April, and retires.  The second guy hits .300 in April, plays the rest of the year, and winds up hitting .280.

The simple average winds up being .190, which again skews your results.  The weighted average is about .250, which is correct.
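To put numbers on it, suppose the first guy gets 100 AB before retiring and the second gets 500 AB on the year. Those AB totals are made up for illustration (the example only gives the averages), and the function names are hypothetical.

```python
# (batting average, at-bats); the AB totals are assumptions, not from the example
LINES = [(0.100, 100), (0.280, 500)]

def simple_average(lines):
    """Every player counts the same, regardless of playing time."""
    return sum(avg for avg, _ in lines) / len(lines)

def weighted_average(lines):
    """Players count in proportion to their at-bats."""
    hits = sum(avg * ab for avg, ab in lines)
    at_bats = sum(ab for _, ab in lines)
    return hits / at_bats

print(round(simple_average(LINES), 3), round(weighted_average(LINES), 3))
```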

However, using the MLEs is a great idea, if they exist. 

Let me think about MGL’s iterative method a little more ... I think you still have the equal weighting problem, but what if you used Marcels not just for retired players, but also for players with few AB?  The guy who hits .100 in April, pad him out to 400 PA or something.

But I’m still thinking about this ...


#45          (see all posts) 2008/08/02 (Sat) @ 08:14

As for separating aging from the league getting better ... I don’t think that can be done, as I argued here.

All I think you can do is measure the COMBINED effects of aging and league changes.  But isn’t that what you really want to know?  The question is how the player will do in a baseball context, not what’s happening inside his body. 

Sure, it would be interesting to separate the two, but the main question is the combined effect, in my opinion.


#46          (see all posts) 2008/08/02 (Sat) @ 08:19

I’ve tried to summarize some of the arguments here in my presentation ... if anyone wants to look at the slides and make suggestions, they’re here.

Please don’t link to them or distribute.  I’ll post them to my website permanently after the presentation.

Any comments would be appreciated.  The audience is statisticians who aren’t necessarily big baseball fans, but who are interested in the steroids issue.


#47    Peter Jensen      (see all posts) 2008/08/02 (Sat) @ 11:35

Phil - I looked at your slides.  The idea that pitchers peak at 18 is also due to selective sampling since the only pitchers that reach the majors at that age are those that are thought to already possess major league level pitching skills.  If it is actually true that the aggregate of pitchers at any age declines in the following year then you would look for the age where the highest number of pitchers have their first full major league season as the peak age for pitching.


#48          (see all posts) 2008/08/02 (Sat) @ 11:44

Peter: thanks for taking a look, appreciate it.

I agree with you that there’s selective sampling at 18.  But why won’t regression to the mean partially correct for that, like it does at higher ages?

And I’m not sure why you’d look for the age with the most rookies.  If the most rookies are at 20, but some teams hold pitchers back until 22, won’t you still have the same problems at 20 and 21?


#49    tangotiger      (see all posts) 2008/08/02 (Sat) @ 11:55

I agree with Peter, and I seem to remember having this conversation with him and maybe David Gassko.  If one-third of the players are between the ages of 25 and 29 (just picking numbers at semi-random) and one-third are younger than 25 and one-third older than 29, then can’t we say that the peak age is right around 27 (unless you suspect a heavy skew)?

Even if you don’t do it by this process, at the very least, simply count the number of players (or number of PA and BFP) at each age class, and then create a smoothing function, and whatever is the peak is likely the peak age.


#50          (see all posts) 2008/08/02 (Sat) @ 12:10

Right, I see.  If most of the guys in the league at 27 weren’t in the league at 22, how can we assume that, at 22, they were BETTER than they are at 27?  That makes no sense.

But it still looks like the guys who did come up young didn’t improve.  This seems to be true even after regressing.

It could be that selective sampling is a bigger problem here than with batters, because the difference in innings between starter and reliever is so high.  With a hitter, even if he’s good but not great, he might be full-time.  With a pitcher, if he’s good but not great—especially at age 21—he goes to the bullpen and gets 1/3 the innings, or to the minors and gets none.

Tango, Peter: what’s your explanation?


#51    MGL      (see all posts) 2008/08/02 (Sat) @ 12:29

I’m not sure equal weighting would work, because of year 2 selective sampling.

Two guys, expected to hit .250.  One guy hits .100 in April, and retires.  The second guy hits .300 in April, plays the rest of the year, and winds up hitting .280.

I think you are right, and I screwed up again.  I have to think about the solution.

One of the problems with “filling in” to account for the selective sampling we keep talking about is this:

We are assuming that if a player gets less playing time or retires or gets sent down to the minors that what he “would have done” if allowed to play is a Marcel or a projection.  And that assumes that the teams and coaches know nothing about these players that we (the Marcels) don’t know.  That may or may not be the case, but my guess is that the Marcels are going to be at least slightly optimistic with regard to what a player “would have done” if allowed to play.

IOW, let’s say that we have two 35-year-old players, both with a career .800 OPS going into season 1.  In season 1, both players post a .600 OPS in 400 PA.  Player 1 retires or is forced to retire, and player 2 keeps playing in year 2.  Are we to assume that both players would hit the same in year 2 if both were allowed to play?  That is what we are doing when we “fill in the blanks.”

I’m not sure there is a solution to that, since we have to “fill in the blanks” to one degree or another.

OK, maybe the solution is to NOT fill in the blanks for the players who do not go on to play, but to eliminate the last year of all players (because that year is definitely going to be an “unlucky” year even if it is true that teams and coaches “know” that this player’s skills have REALLY declined), and to only adjust the stats of players who go on to play another year by regressing every year they play.

That is the basic problem that tends to shift our aging curve to the left - that any player who goes on to play another season tends to be a little lucky.  Thus, all player seasons in our sample are “lucky” seasons as long as they are followed by another season of the same player.  That seems a little counter-intuitive, but it is true - EVERY season, for all players, followed by another season is a somewhat lucky season (much less so for established and good players than for young/inexperienced and/or bad players)!
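
A small simulation makes the “lucky season” effect concrete. It is a sketch under loudly stated assumptions: every player has the same fixed true talent (no aging at all), observed OPS is talent plus normally distributed luck, and a player gets a second season only if his first clears a cutoff. The talent level, noise size, and cutoff are all hypothetical numbers chosen for illustration.

```python
import random


def simulate_survivor_bias(n_players=10000, talent=0.750, noise_sd=0.060,
                           cutoff=0.720, seed=0):
    """Return (mean year-1 OPS, mean year-2 OPS) for players who got a year 2.

    True talent never changes, so any gap between the two means is
    produced by selection on year-1 luck, not by aging.
    """
    rng = random.Random(seed)
    year1_kept, year2_kept = [], []
    for _ in range(n_players):
        year1 = rng.gauss(talent, noise_sd)
        if year1 < cutoff:
            continue  # released or retired; no year 2 is ever observed
        year2 = rng.gauss(talent, noise_sd)  # same talent, fresh luck
        year1_kept.append(year1)
        year2_kept.append(year2)
    mean1 = sum(year1_kept) / len(year1_kept)
    mean2 = sum(year2_kept) / len(year2_kept)
    return mean1, mean2


mean_year1, mean_year2 = simulate_survivor_bias()
print(round(mean_year1, 3), round(mean_year2, 3))
```

Under these assumptions the surviving players' year-1 average sits well above their true talent while their year-2 average lands close to it, so a naive paired comparison reports a decline that is entirely a selection artifact.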

Back to the weighting thing, I think you are right that we have problems in both seasons, but I am not sure how to handle that off the top of my head.  I’ll have to think about it and play around with some numbers on paper or on the computer.  I am pretty sure that the traditional method of weighting by the lesser of the two PA or the harmonic mean is not the right way to do it, but I am not even too sure about THAT anymore!


#52    MGL      (see all posts) 2008/08/02 (Sat) @ 12:51

I have also always gotten that pitchers get worse at any age, no matter how I slice it.

There could be selective sampling reasons for this that are not the traditional one (traditional one being that better players get more playing time as time goes on).

Is it possible that pitching in the major leagues causes one to get worse simply because of higher stress and more innings pitched?

Is it possible that for pitchers who do make the major leagues, that they were NOT better when they were younger but that for any given pitcher in professional baseball, this is not true?

Is it possible that there is a lag of a year or two before an organization realizes that a pitcher is good?

Is it possible that a pitcher peaks at age X, but that teams let them get seasoned for 1-2 more years after their peak before they are allowed to pitch in the majors?

Ditto above, but teams don’t want players to start getting MLB service time before a certain age, for contract reasons?

Could it be that the league getting better is what “causes” pitchers to look like they get worse each year, but that teams are evaluating pitchers based on their “context-neutral” talent.  IOW, let’s say that a pitcher gets a little better from age 18-26 and then a little worse after that.  And let’s say that teams know this and bring up pitchers when they are 24, on the average.  And let’s say that the major league gets a lot better every year.  Any pitcher in the majors at any age, will look like they get worse each year!
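
The league-improvement scenario can be worked through with toy numbers. Everything here is an assumption for illustration: the quadratic talent curve peaking at 26, the size of the annual league improvement, and the units (runs, higher is better) are hypothetical, chosen only so that the league gains more per year than any pitcher does.

```python
def talent(age, peak=26, curvature=0.02):
    """Context-neutral talent in runs (higher is better); hypothetical shape."""
    return -curvature * (age - peak) ** 2


def observed_change(age, league_gain=0.15):
    """Year-over-year change as measured against a league that improves
    by league_gain runs per year."""
    return (talent(age + 1) - talent(age)) - league_gain


# Pitchers are assumed to debut at 24, so only ages 24+ are ever observed.
for age in range(24, 30):
    true_change = talent(age + 1) - talent(age)
    print(age, round(true_change, 2), round(observed_change(age), 2))
```

In this sketch the pitcher truly improves from 24 to 26, yet every observed year-over-year change from 24 onward is negative, which is the pattern the hypothesis predicts. Whether the real league improves anywhere near that fast is a separate empirical question.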

There are probably more plausible explanations.


#53    MGL      (see all posts) 2008/08/02 (Sat) @ 12:57

Phil, are you doing a presentation on aging somewhere?


#54          (see all posts) 2008/08/02 (Sat) @ 13:01

Yup, at the American Statistical Association convention in Denver.  It’s actually on statistics and steroids, but my part is an overview of aging research.

Jim Albert, Michael Schell, and Andy Dolphin will also be presenting at that session.

http://www.amstat.org/meetings/jsm/2008/onlineprogram/index.cfm?fuseaction=activity_details&activityid=333&sessionid=203234


#55          (see all posts) 2008/08/02 (Sat) @ 13:06

Yup, I’m presenting at the ASA convention, along with Jim Albert, Andy Dolphin, and Michael Schell.

I mentioned it on my blog here.


#56          (see all posts) 2008/08/02 (Sat) @ 13:13

MGL/52: Those suggestions all sound at least reasonable to me.

“Is it possible that for pitchers who do make the major leagues, that they were NOT better when they were younger but that for any given pitcher in professional baseball, this is not true?”

This would mean a very, very steep ascent.  Anyone who makes it to the majors this year, was really bad the previous year, and had a sudden jump.  That sudden jump brought them to their peak, and they start declining.

I like this explanation in theory, but it doesn’t sound all that plausible.

How’s this: maybe there are two factors involved in pitching.  Call them velocity and intelligence.  You need both to make the majors.

Velocity starts declining from age 16.  Intelligence improves from childhood to young adulthood, then stops.

So as soon as you reach peak intelligence (and have enough velocity), you make the majors.  But you start declining instantly because velocity is constantly dropping, and intelligence doesn’t change much.

(In any case, maybe “intelligence” isn’t the best word.  Maybe “mastery” or some such.)

That would explain the data, and it’s consistent with MGL’s quoted hypothesis above.  But how plausible is it?
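
As a rough check on whether the two-factor story hangs together, here is a toy version. The linear velocity loss from 16, the mastery plateau at 23, the rates, and the additive combination are all hypothetical choices made for illustration; the comment above specifies none of them.

```python
def velocity(age, start_decline=16, loss_per_year=1.0):
    """Arbitrary-unit velocity that falls steadily after start_decline."""
    return 100.0 - loss_per_year * max(0, age - start_decline)


def mastery(age, plateau_age=23, gain_per_year=3.0):
    """Arbitrary-unit mastery that grows until plateau_age, then stays flat."""
    return gain_per_year * min(age, plateau_age)


def skill(age):
    """Overall pitching skill, assumed here to be the simple sum."""
    return velocity(age) + mastery(age)


ages = range(16, 35)
best_age = max(ages, key=skill)
print(best_age, skill(best_age - 1), skill(best_age), skill(best_age + 1))
```

As long as mastery grows faster than velocity falls, combined skill rises until the plateau and declines every year after, so a pitcher promoted at the plateau age would be at his peak on arrival.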


#57    Peter Jensen      (see all posts) 2008/08/02 (Sat) @ 17:11

Velocity probably starts declining once a pitcher fills out his height with muscle mass.  For some people that might happen at 18, for others it might not happen until the mid 20s.  But almost all 18 year old pitchers need some development to be major league ready.  Most need to master at least 2 other pitches and to develop consistency with their locations.  They need to progress through the minors where they gradually see increasingly better quality hitters and learn what they need to do to succeed against them.  By the time a team feels they are ready to be in the major leagues their fastball is probably already in decline and there is relatively little else for them to improve.  In their first year they are likely to enjoy an advantage of being a relatively unknown quantity.  In later years, after batters have faced them more often and they have been better scouted and their fastball continues to decline, their performance is more likely to decline as well.

Hitters probably also show an aging decline almost immediately after their rookie year.  But it is offset by an experience improvement as the hitter begins to learn the opposing pitchers and generally acclimates to the tougher pitching in the major leagues.  After a batter has completed his “education” in a year or two, his overall performance starts declining as well.  Pitchers have the benefit of being able to rely on experienced catchers who know the batters’ tendencies already so their performance doesn’t show as much experiential improvement.

That’s my untested theory of why the aging curves show the shapes they do for batters and pitchers.


#58    MGL      (see all posts) 2008/08/02 (Sat) @ 18:20

I vaguely recall that the studies I have done with hitters suggest that age and not experience dictates the aging process, so I don’t think that learning and experience have much to do with the aging process for hitters.  IOW, a 27 year old who has spent 5 years in the majors will see the same aging process from 27 to 28 as a rookie 27 year old will.

For pitchers, I like the theory that by the time they get into the majors there is not a whole lot left for them to master mentally (and they are already on the decline physically, although I doubt that their velocity peak is 16).  Of course this goes against EVERYTHING you hear from every insider in baseball, not that that should be surprising or even influence our thinking.  You hear a million times a year how a (good) young pitcher will only get better as he builds up experience in the majors.  But we know that that is simply not true (that young pitchers get better - they don’t - on the average of course).

I also agree with Peter that pitchers get a great advantage when they are not “known” (seen by the batters).  So that probably contributes to a decline in performance once they get into the majors and become more and more known.

Even conventional wisdom says that you don’t have to have seen batters to know how to pitch to them, but as a batter, it really helps to have faced a pitcher before.


#59    MGL      (see all posts) 2008/08/02 (Sat) @ 23:47

OK, I am back to using the harmonic mean of both PA since it is clear that you want to weight by the PA in year 1 and in year 2 because of the selective sampling problem.  Without the selective sampling problem, you probably want to use a different weighting system or no weighting at all.

I am still “filling in” for a player who does not play in year 2 though, although I am not sure how many PA to use as a weight for the year 1 plus the fill in year.  I think we should use the harmonic mean of the PA in year 1 and some average number of PA in year 2.  So I am just using the PA from year 1, which is close enough.
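
For readers unfamiliar with the weighting being discussed, here is a minimal sketch of averaging year-to-year changes with harmonic-mean PA weights. The function names and the sample players are hypothetical, and the sketch leaves out the fill-in and regression steps entirely.

```python
def harmonic_mean(pa1, pa2):
    """Harmonic mean of two PA totals; zero if either season is empty."""
    if pa1 <= 0 or pa2 <= 0:
        return 0.0
    return 2.0 * pa1 * pa2 / (pa1 + pa2)


def weighted_delta(pairs):
    """Average change in a rate stat across consecutive-season pairs.

    pairs: iterable of (pa_year1, rate_year1, pa_year2, rate_year2).
    Each pair is weighted by the harmonic mean of its two PA totals,
    so a pair with one short season counts for less.
    """
    total_weight = 0.0
    total_change = 0.0
    for pa1, rate1, pa2, rate2 in pairs:
        weight = harmonic_mean(pa1, pa2)
        total_weight += weight
        total_change += weight * (rate2 - rate1)
    return total_change / total_weight


# Two made-up hitters: a full-timer in both years, and one whose
# second season was cut short.
sample_pairs = [(600, 0.800, 600, 0.820), (600, 0.700, 200, 0.760)]
print(round(weighted_delta(sample_pairs), 4))
```

The harmonic mean is pulled toward the smaller of the two PA totals (600 and 200 give 300, not 400), which is why it behaves much like weighting by the lesser of the two.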

I am still doing an iterative process to fill in non-existent year 2’s including an aging adjustment.

Here is what I get now for the aging adjustments:

Age pair   lwts     OPS     OBA      BA      SA
20-21       9.0    .054    .018    .021    .036
21-22       2.6    .020    .010    .001    .010
22-23       5.2    .036    .012    .010    .024
23-24       4.0    .026    .010    .006    .017
24-25       3.2    .022    .009    .006    .014
25-26       2.6    .018    .007    .005    .011
26-27       2.4    .015    .007    .004    .009
27-28        .7    .001    .002    .000   -.001
28-29       1.7    .009    .005    .003    .004
29-30        .5    .000    .002    .000   -.002
30-31       -.4   -.005   -.001   -.002   -.004
31-32        .3    .000    .000    .001    .000
32-33      -1.7   -.015   -.003   -.006   -.012
33-34      -1.3   -.011   -.004   -.004   -.007
34-35       1.8    .001    .001    .002   -.001
35-36      -1.4   -.013   -.005   -.001   -.008
36-37       1.8   -.002   -.001   -.003   -.002
37-38      -1.8   -.026   -.007   -.002   -.019
38-39      -3.6   -.040   -.017   -.005   -.023
39-40       2.3    .013    .000   -.002    .014

So now the peak occurs at around 30 and is level until 33.
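
One way to read a peak off a table of age-pair changes is to chain them into a cumulative curve. The sketch below does that with the lwts column exactly as posted above, setting age 20 to zero as an arbitrary baseline (an assumption for illustration; only the shape matters).

```python
from itertools import accumulate

# lwts changes for age pairs 20-21 through 39-40, copied from the table above.
lwts_deltas = [9.0, 2.6, 5.2, 4.0, 3.2, 2.6, 2.4, 0.7, 1.7, 0.5,
               -0.4, 0.3, -1.7, -1.3, 1.8, -1.4, 1.8, -1.8, -3.6, 2.3]

ages = list(range(20, 41))
# Running total of the changes, with age 20 as the zero baseline.
curve = [0.0] + list(accumulate(lwts_deltas))
curve_by_age = dict(zip(ages, curve))

peak = max(curve_by_age, key=curve_by_age.get)
for age in range(28, 35):
    print(age, round(curve_by_age[age], 1))
print("peak age:", peak)
```

The chained curve tops out at age 30 and stays within a fraction of a run through 32 before dropping, in line with the reading given above. The late-age values bounce around, presumably because few players remain in those age pairs.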

I also seem to recall that if I do the same thing for the “non-steroid” era (the above data is from 01-06), the peak occurs much earlier.

So for one thing, I think it is possible that when we talk about aging curves, we have to be careful about what era we are talking about.  Steroid era.  Pre-steroid era.  Early or mid-20th century when they did not have the same medical care and training programs that they have now, etc.


#60    MGL      (see all posts) 2008/08/03 (Sun) @ 00:16

If I do the same thing for years 1970-1989, presumably a modern, but non-steroid, era, I get a peak age of 29 and a definite decline after that, .6 runs per year for the next 3 years, 1.3 per year for the next 3, and 1.5 per year for the next 3.

I made an error in the previous chart (01-06). I included pitcher hitting.  If I exclude that, I get a peak age of 29 as well and then the same type of decline I get with the pre-1990 data.  So there does not appear to be a difference between the steroid and pre-steroid eras as I thought.

