THE BOOK--Playing The Percentages In Baseball

Monday, July 27, 2009

Most people still don’t understand the concept of regression toward the mean…

By , 01:42 AM

There is a thread on BTF about Sabathia’s “numbers,” particularly his BB and K rates, being down this year, as compared to last year, although he is still pitching very well of course.

While the quality of the posts on BTF is nowhere near that of this blog (although they beat us easily in quantity), there are some reasonably intelligent regulars on that site (if anyone interprets that as a dig, it is).

Anyway, no one mentions the obvious so far.  Any player who posts a better than average number in any category for one year or for 100 years is EXPECTED to do worse in any other time period you look at, even if that player’s true talent never changes.  In fact, when we talk about regression toward the mean in a mathematical sense, we are talking about the exact same player - no change in true talent, just another “sample” of his performance.  Changes in true talent are another issue altogether.
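A minimal simulation sketch of the point above, assuming a made-up league in which every player has a fixed true success rate and we simply observe him twice. The talent spread (centered on 0.20), the number of trials, and all names here are hypothetical choices for illustration, not figures from any real data set.

```python
import random
from statistics import mean

def simulate(n_players=2000, n_trials=300, seed=1):
    """Return (talent, first_rate, second_rate) for players with fixed talent."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_players):
        # Hypothetical talent spread: true rates centered on 0.20, never changing.
        talent = min(max(rng.gauss(0.20, 0.02), 0.01), 0.99)
        # Two independent samples drawn from the same unchanged talent.
        first = sum(rng.random() < talent for _ in range(n_trials)) / n_trials
        second = sum(rng.random() < talent for _ in range(n_trials)) / n_trials
        rows.append((talent, first, second))
    return rows

rows = simulate()
league_mean = mean(first for _, first, _ in rows)

# Select the players who were better than average in the first sample only.
above = [row for row in rows if row[1] > league_mean]
above_first = mean(row[1] for row in above)
above_second = mean(row[2] for row in above)
above_talent = mean(row[0] for row in above)

print(f"league mean (first sample):    {league_mean:.4f}")
print(f"selected group, first sample:  {above_first:.4f}")
print(f"selected group, second sample: {above_second:.4f}")
print(f"selected group, true talent:   {above_talent:.4f}")
```

Under these assumptions the selected group's second sample should land between the league mean and its own first sample, and close to the group's average true talent, even though no one's talent changed.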

On top of that, ALL pitchers’ true talent is expected to get worse in every subsequent year, although we don’t exactly know why.

Put those two things together, and you fully expect every pitcher who has been better than average previously to get considerably worse in any subsequent time period.

Now I don’t know if Sabathia’s drop-offs in K and BB rates are exactly what we would have expected, but it couldn’t be that far off.

The thing that people don’t understand (actually one of the things) about regression toward the mean in baseball is that the reason any above or below average player will always regress, on the average, towards average, is that they were not really as good or bad as we thought in the first place, based on any of their stats.  That goes for Sabathia, Halladay, Bonds, Chipper Jones, etc., etc.  Chipper Jones is not as good as his career stats tell us, even after you do all the appropriate adjustments.  Same for Halladay.  And Sabathia.  And everyone else who has been above average and we think has true talent X.  When I say “as we think” I mean as their stats suggest, not as we think based on a credible projection which already does the regression.  And of course, there is some chance that any given player is better than his prior stats - it is just that the chances of him being worse is greater than the chances of him being better.  That is ALWAYS the case, as long as we properly define the mean for that player.

That is the KEY to understanding regression toward the mean and is what most people don’t understand, even if they think they understand the concept.


#1    greenback06      (see all posts) 2009/07/27 (Mon) @ 02:47

"below average player will always regress”

Isn’t the problem with below-average major leaguers what you regress towards? To phrase the question differently, isn’t regression to the mean essentially a shortcut for Bayesian expectations? So regression to the mean would work for players with results at level X because the population has many more players whose true talent is below level X than players with true talent above level X. But players with below average major league stats shouldn’t get that benefit because even at results level Y, there are still many more players (namely in the minors) below true talent Y than above.


#2          (see all posts) 2009/07/27 (Mon) @ 03:21

"Chipper Jones is not as good as his career stats tell us...”

Does this have to be true?  His low end of performance was still 248/362/485, which wouldn’t get him benched.  Obviously talent distribution would suggest that it’s more likely he out-performed his true talent, but can we categorically say he didn’t under-perform - perhaps as La Russa alleged about J.D. Drew?


#3          (see all posts) 2009/07/27 (Mon) @ 09:00

"below average player will always regress”

Just to be clear—and this is a great post for working to clarify something subtle and definitely almost always misunderstood—we’re talking about the *group* of below-average players. 

So maybe, “the above or below average player is expected to regress toward the mean in the stat in which he was above or below average”.  (Or “The mean of the group that was above or below average will always be closer to average next time through”)

Around that second-observation mean, the set of second measurements will still be randomly distributed: some of the people who were lucky/unlucky last year will be lucky/unlucky this year, too (and some small group may stay that way their entire career, depending on how long and how lucky).


#4    Tangotiger      (see all posts) 2009/07/27 (Mon) @ 10:05

The (colloquial) definition of “regress” is to go back to a previous state.

So, if someone plays 162, 162, 162 games, then the “previous state” from a colloquial standpoint would be 162.  That is, the state is a state that actually existed.

If someone plays 162, 162, 162 games, then the statistical definition would have that player regress a certain amount toward the population he was drawn from.  This would mean say 150 games.  In this case, the 150 represents our best guess as to his CURRENT talent level.

If you had perfect knowledge of his population (i.e., n=1, himself), you would regress his past performance 100% toward that mean (i.e., his god-known true talent level).  His future performance would be some random point around that true talent mean.
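A small sketch of "regressing a certain amount toward the population" as arithmetic. The population mean of 130 games and the fraction 0.375 are hypothetical values, picked only so the result matches the 150 used in the example; the comment does not state either number.

```python
def regress_toward_mean(observed, population_mean, fraction):
    """Move `observed` the given fraction of the way toward `population_mean`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return observed + fraction * (population_mean - observed)

# Hypothetical inputs: a 130-game population mean and 37.5% regression.
estimate = regress_toward_mean(162, population_mean=130, fraction=0.375)
print(estimate)  # 150.0

# With perfect knowledge (a population of one), regress 100% toward the
# player's own true talent; 155 here is just a placeholder value.
known_talent = regress_toward_mean(162, population_mean=155, fraction=1.0)
print(known_talent)  # 155.0
```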


#5    Peter Jensen      (see all posts) 2009/07/27 (Mon) @ 11:54

Wow! 79 comments and climbing over at BTF.  Most of them illustrate both of MGL’s points: 1. Many people don’t fully understand regression to the mean, and 2. Many, if not most, of the posts at BTF are feeble attempts at comedy rather than a dialogue about sabermetric issues.


#6    Nick      (see all posts) 2009/07/27 (Mon) @ 11:57

Isn’t the basis behind regression to the mean that talent is normally distributed?  It doesn’t seem like that is the case in baseball.


#7          (see all posts) 2009/07/27 (Mon) @ 12:10

Couple things:

Nick: the sample(s) don’t need to be normally-distributed, but they should be identically-distributed in the pre- and post- periods.  This is TT’s mention of ‘no change in ‘true’ talent’.

Re: regression on n=1, the formulation of RttM that I’ve always read is about conditional expectation for the sample mean of a group: given a ‘true’ league batting average of .280, if we look at the people who hit .340 last season, their expected BA this season will be between .280 and .340.  The single-player version I’m not clear on, though there’s the implication that many of the players in the sample will have to hit worse.
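One way to see the ".280 to .340" claim for a single hitter is a beta-binomial sketch. This assumes true batting averages follow a Beta prior with mean .280; the prior strength of 300 at-bats and the 204-for-600 season are hypothetical numbers chosen for round arithmetic.

```python
def posterior_mean(hits, at_bats, prior_mean=0.280, prior_strength=300):
    """Expected true average after a season, under a Beta prior on talent."""
    # Express the prior as pseudo-hits and pseudo-outs (hypothetical strength).
    prior_hits = prior_mean * prior_strength
    prior_outs = (1 - prior_mean) * prior_strength
    return (prior_hits + hits) / (prior_hits + prior_outs + at_bats)

# A hitter who went 204 for 600 (.340) last season.
expected_next = posterior_mean(hits=204, at_bats=600)
print(f"{expected_next:.3f}")  # 0.320
```

A weaker prior (smaller `prior_strength`) would leave the estimate closer to .340, and a stronger one would pull it closer to .280, but under this model it always stays between the two.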

And re: the other discussion --

“What you are arguing is that both Bonds and Perez had some invisible but real “True Talent” of which their actual performances are but a reflection—a pretty good reflection, but just a reflection. What is your evidence for that proposition?”

Thanks for trying to clear up RttM, TT—now if you have a minute, can you write up a quick primer on statistical inference about parameters based on sample statistics?  You could probably get that kind of thing published.  You know, as a Book, or something.


#8    Brian Cartwright      (see all posts) 2009/07/27 (Mon) @ 12:11

In MLB, the talent approximates a normal distribution.

Actually, the players in the major league, and all of professional baseball, are the right end of a normal distribution of all humans.

However, it appears normal for our purposes because there are a limited number of highly talented players available to be employed, and there are a limited number of lesser talented players permitted to be employed. The lower a player is below the mean level of MLB, the less likely it is he will play.


#9    Tangotiger      (see all posts) 2009/07/27 (Mon) @ 12:12

This is the BTF thread that is tracking MGL’s thread:
http://www.baseballthinkfactory.org/files/newsstand/discussion/the_book_blog_mgl_most_people_still_dont_understand_the_concept_of_regressi/

These were my comments:

=============================================

So that means that the quality of comments here are better than he’s observing, and the quality there is less, right?

Brilliant!

***

MGL has his public persona, however it is that it is taken.  I’d also like to point out, regarding those who say that he never explains anything in a useful manner, that they should look at his responses on our wiki.  There are some 80 responses from him.  There has to be at least one that is useful and clear, no?

***

I have never been entirely comfortable with the Platonic assumption adopted by many sabermetricians that a player has an unchanging “True Talent,”…

Not only should you not be comfortable, you should outright reject it.  Players are humans, not machines, and therefore, have an ever-changing true talent level, by the second.

This is why we can’t look at a player’s career, without weighting his more recent seasons more.  It is because they are not unchanging machines.  Aging, conditioning, etc, are all part of the human condition.

If someone wants to say “If Chipper weren’t human”, then yes, look at his career total, and presume he’s the well-oiled machine that doesn’t need any maintenance.  Otherwise, you have to:
a. presume he’s human
b. accept that a sample of observed performances, no matter how large a sample you think is enough, is never enough

***

For some reason, he doesn’t push my buttons, either. At least he’s intelligent and, seemingly, true to himself. (Also, he makes me chuckle.) It’s the idiots without a sense of humor who write in a patronizing tone that get me fired up.

That’s a well-considered point-of-view.

The opposite is, of course, just as reasonable.

***

If you have this much difficulty conveying your thoughts and findings to a general audience you might consider improving your ability to communicate to a general audience. Or stop trying to communicate outside of specialized audiences.

I believe MGL stopped posting at primer a long long time ago, and posts almost exclusively at our blog.  So, he’s done exactly as you said, and it doesn’t seem to matter.

Otherwise he can’t post anywhere as he wishes.

***

However, it’s a factor, not a law. Other factors are health, experience, and aging.

The “mean” is not the single league-average mean, but the mean of the population you are drawing the player from.  Therefore, his health, experience and aging are all part of the regression equation.

***

Funny though, we primates can dish it but we have a hard time taking it.

You think?!?

Though, I’m not sure that Primates are disproportionately ornery compared to the population at large.

***

...but he’s clearly in need of BTF acknowledgement of his work. Else he wouldn’t make the call back. And seriously, if you’re writing a book for the mass consumer market, you’re trying to communicate to a general audience. It’s not like he’s submitting his work to peer reviewed statistical analysis journals.

1. I don’t think MGL cares about any acknowledgement from anyone.

2. MGL has helped more people directly than any other saberist around.

3. The Book is a tough read, but it is readable.  I don’t see the need to presume that his blog writing style equals his book writing style.  If you want to take a passage from The Book that you think was poorly communicated, please highlight it.  Otherwise, no need to make blanket assertions.

4. The peer-review we get from the general saber audience is far better than the peer-review we’d get from academia.  When it comes to subject matter experts (SME), the hardcore baseball fan trumps the barely-a-baseball-fan academician.  Ideally, you have someone like Andy Dolphin, Ted Turocy, or Walt Davis.  But, those are the exceptions. 

Otherwise, though I’d like to hear from both of them, I’d place a lot more weight on Chris Dial than Rodney Fort.

***

“True talent” refers to a player’s ability at a particular moment in time.

Correct!

***

mgl’s was… over the top generous.

MGL is extremely generous with his time.  He also donated all his profits from The Book to Retrosheet. 

So, his “need” to write The Book (meaning spending hundreds of hours writing, the never-ending and grueling editing sessions we went through, etc to provide a tight, hopefully timeless book) was done strictly on an educational basis.

***

I must be missing something, because I don’t understand why this concept is useful or important.

The Book is available for free reading from Amazon’s Look Inside.  I suggest reading the last few pages of Chapter 1.  After you do that, let me know if you have more questions.

***

This makes me think of somebody watching Kirk Gibson’s home run off Eckersley and yelling “Small sample size! This means nothing!” at the TV. Or yelling it at the spreadsheet where he’d read about it.

The correct response is how it was called by the announcer: “I don’t believe what I just saw”.  It is not anything more than that.  It doesn’t push Gibson or Jack Morris in his game 7 above any line beyond where they already were before that point in time.

These moment-in-time performances should be enjoyed and absorbed for what they were, and not try to be explained beyond that.

***

Wouldn’t major-league players tend to regress towards replacement level?

Albert Pujols would not get regressed to that point, because replacement level players do not get the amount of playing time Pujols has gotten in his career.  You would regress say Willie Bloomquist to that point.  The number of games you play is part of the regression equation.
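A rough sketch of one way playing time can enter a regression: as the weight given to the observed rate. The `n / (n + k)` weight, the constant `k`, and every number below are assumptions for illustration; the comment itself does not specify a formula, and in practice the population mean would also differ by playing-time group.

```python
def regress_by_sample_size(observed, n, population_mean, k):
    """Weight the observed rate by n / (n + k); the rest goes to the mean."""
    weight = n / (n + k)
    return weight * observed + (1 - weight) * population_mean

# Same observed rate, very different amounts of playing time (hypothetical).
short_sample = regress_by_sample_size(0.350, n=300, population_mean=0.330, k=300)
long_sample = regress_by_sample_size(0.350, n=6000, population_mean=0.330, k=300)

print(f"300 PA:  {short_sample:.4f}")
print(f"6000 PA: {long_sample:.4f}")
```

The player with the long track record keeps almost all of his observed rate, while the player with the short one is pulled about halfway back to the population mean.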

***

I’ll bet you $50 zillion he will. If you don’t take that bet, it proves I’m right.

Well-played.


#10    David      (see all posts) 2009/07/27 (Mon) @ 12:12

Nick, sort of, though your larger point is completely right.  MGL’s post is making an assumption about the median and the mean.

MGL says, “...it is just that the chances of him being worse is greater than the chances of him being better.” This isn’t necessarily true without knowing something about the distribution of talent.  If a player posts numbers that are above the mean but below the median, then the player is actually more likely to get better.  The underlying distribution matters, as Nick points out (though the mean=median in distributions other than the normal).


#11          (see all posts) 2009/07/27 (Mon) @ 12:20

David, can you go into more detail?  I’m trying to think of distributions where RttM won’t hold. To be honest, I’ve never really thought about it before and can’t find many references.

Also, scratch the part where I said I didn’t know the n=1 version. I was thinking of n as players, but if it’s player-seasons or something, it works fine.


#12    MGL      (see all posts) 2009/07/27 (Mon) @ 13:06

Yes, I believe that regression is a shortcut for a full Bayesian computation, which is “given the exact distribution of talent for similar players, and given what player A has done, what are chances that he is a true X, a true Y, a true Z, etc.?”

If the distribution of true talent were exactly normal, then we can do a “one-step” simple regression.  If it is somewhat normal but skewed, then we might have to use a different number than the mean to regress towards, which would be a short-cut again.
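To make the "full Bayesian computation" concrete, here is a minimal sketch over a deliberately tiny and skewed talent distribution. The three talent levels, their prior probabilities, and the 96-for-300 sample are all hypothetical; a real computation would use a much finer grid and an empirically estimated prior.

```python
from math import comb

def posterior(prior, successes, trials):
    """Bayes' rule with a binomial likelihood over a discrete talent prior."""
    unnormalized = {
        rate: p * comb(trials, successes)
              * rate ** successes * (1 - rate) ** (trials - successes)
        for rate, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {rate: w / total for rate, w in unnormalized.items()}

# Hypothetical skewed prior: most similar players are true .250 hitters.
prior = {0.250: 0.60, 0.280: 0.30, 0.320: 0.10}
observed_rate = 96 / 300  # player A hit .320 in 300 chances

post = posterior(prior, successes=96, trials=300)
prior_mean = sum(rate * p for rate, p in prior.items())
post_mean = sum(rate * p for rate, p in post.items())

for rate, p in post.items():
    print(f"chance he is a true {rate:.3f}: {p:.3f}")
print(f"prior mean {prior_mean:.3f}, posterior mean {post_mean:.3f}")
```

The posterior mean falls between the prior mean and the observed .320, which is the result a one-step regression tries to approximate.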

I think the problem with Brian’s thesis that baseball talent is pretty normally distributed if you weight by playing time (which it is) is that we don’t want to use THAT distribution (the normal one, the one where player performance is weighted by playing time) for regression purposes.  We want to use the distribution of all players, say, in the minor leagues who get a chance to play in the majors.  THAT is the population we are drawing from. So it is probably true that players need to be regressed toward a number that is somewhat less than the mean.

#2, of course we don’t KNOW that Chipper (or any player who plays better than some average) got lucky.  That is the assumption only because it is more likely (how likely depends on the number of opps in that historical sample and the amount that he is above the mean) that he is worse than his stats than better than his stats.

Obviously if a player’s OPS is only 10 points more than the mean of similar players in 3000 PA, he may be only 51% likely to be worse than that and 49% likely to be better.  If a player is 50 or 100 points better than average in 300 PA, it is probably 95% likely that he is worse than that.  Etc.
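A sketch of how such probabilities could be computed, assuming both true talent and sampling luck are normally distributed. The talent spread of 60 OPS points and the per-PA noise of 1.0 are hypothetical inputs, so the outputs illustrate the direction of the effect and are not meant to reproduce the 51% and 95% figures above.

```python
from math import sqrt
from statistics import NormalDist

def prob_worse_than_stats(gap_above_mean, pa, talent_sd=0.060, per_pa_sd=1.0):
    """P(true talent < observed) under a normal prior and normal noise."""
    sampling_var = per_pa_sd ** 2 / pa      # luck shrinks as PA grows
    talent_var = talent_sd ** 2
    weight = talent_var / (talent_var + sampling_var)
    # Posterior for true talent, measured relative to the population mean.
    posterior_mean = weight * gap_above_mean
    posterior_sd = sqrt(weight * sampling_var)
    return NormalDist(posterior_mean, posterior_sd).cdf(gap_above_mean)

p_small_gap = prob_worse_than_stats(0.010, pa=3000)  # 10 points over 3000 PA
p_large_gap = prob_worse_than_stats(0.100, pa=300)   # 100 points over 300 PA

print(f"10 points above in 3000 PA: {p_small_gap:.2f}")
print(f"100 points above in 300 PA: {p_large_gap:.2f}")
```

Under these assumptions the first probability sits just above one half, and the second is much higher: the further above the mean and the smaller the sample, the more likely the player is worse than his stats.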

Regardless of how to do the regression and what mean to regress towards, and whether you do a rigorous, complete Bayesian computation to do the “regression” or you do a short-cut, which requires a somewhat normal distribution for the “a priori” probabilities, I think, my original point is still the same.  The reason we do any kind of regression at all is because the assumption is ALWAYS that any player who has performed better than some number which is probably pretty close to the mean of similar players in any number of opps, from 10 to a million, got lucky and is not as good as his stats suggest.  And the same is true in reverse for players who performed below average, again, for ANY number of PA (opps if not an offensive stat).

That is why if you have a database of historical baseball performance and choose any stat whatsoever and look at a large group of players (to reduce your sample error), you will ALWAYS find (occasionally you will not because of sample error of course) that if in one time period that group performs above average, they will perform worse than that in any other time period, and vice versa if a group performs below average in one time period.

And BTW, that is proof that using the mean as the point to regress toward is probably close enough.  For example, let’s say that the mean HR per 500 PA for all MLB’ers was 14.  And let’s say that you look at all players who were between 12 and 13 for one season and they averaged 12.7.  My guess is that in any other time period (the next season or the season before) that they would be somewhere between 12.7 and 14, suggesting that even though it is true that there are many more below average players to choose from in the minor league population and the general population of males 20-40 years old, you still are going to regress toward a number which is pretty close to the mean of MLB’ers. One reason is that if a player gets to play for a full season, it is not only because he had good numbers along the way.  It is also because the teams know that he has good skills, is a good prospect, etc.

In other words, if the fact that baseball talent is really the tail end of a normal distribution (for all 20-40 males in the world) significantly influenced the way we did our regression, we would have to allow all males to get some PA in the major leagues such that someone with a true talent of, say, 1 HR per 500 PA with no chance of ever playing professional baseball, would get a chance to get 300 PA and get really lucky.  That isn’t the case.  In other words, the players in the “a priori” distribution are players who get the chance to play in the majors only, which are the best minor league players essentially.  We are not dealing at all with the distribution of any other persons.


#13    villageidiom      (see all posts) 2009/07/27 (Mon) @ 13:17

In [4] Tango - rather succinctly - explains regression to the mean in its proper form, but also the common use of the phrase. Tango, you bring quality to this site. (If anyone interprets that as a dig, it is.)


#14    David      (see all posts) 2009/07/27 (Mon) @ 15:06

Henry, I don’t know if you’re asking for the name of a “famous” distribution or not.  But I just mean something like: 0, 1, 1, 2, 1000.  The median is 1, mean is >> 1.  If someone gets a “2” in a year, then regressing to the mean actually increases their predicted performance for the next year. 
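The arithmetic behind that example, as a quick sketch. The 50% regression amount is a hypothetical choice; any positive amount gives the same direction.

```python
from statistics import mean, median

values = [0, 1, 1, 2, 1000]
pop_mean = mean(values)      # 200.8
pop_median = median(values)  # 1

# A result of 2 is above the median but far below the mean.
observed = 2
projection = observed + 0.5 * (pop_mean - observed)  # hypothetical 50% regression

print(pop_mean, pop_median, projection)
```

Because the single outlier drags the mean far above the median, regressing a "2" toward the mean raises the projection even though most of the population sits below 2.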

I think the idea of “regression to the mean” comes from an equation like the following:

Future Performance = a + b1 * (Established Talent Level) + e

Estimate this equation using the “right” group (obviously, this is a difficult step on its own).  Using an OLS regression gets you the minimum of the squared residuals or, put differently, the “best” mean for a given Established Talent Level.  But this does not necessarily mean you can say anyone is more likely or less likely to improve.  OLS only cares about the mean while “more likely” is more of a median/quantile idea. 
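A sketch of estimating that equation by OLS on simulated data. The setup is an assumption: each player has a fixed talent, and both the established level and the future performance are that talent plus independent noise of the same size. All scales (mean 100, standard deviations of 10) are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(7)
n = 5000
talent = [rng.gauss(100, 10) for _ in range(n)]
established = [t + rng.gauss(0, 10) for t in talent]  # past performance
future = [t + rng.gauss(0, 10) for t in talent]       # next performance

def ols(x, y):
    """Return (a, b1) minimizing squared residuals of y = a + b1 * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

a, b1 = ols(established, future)
prediction = a + b1 * 120  # a player whose established level is 120

print(f"a = {a:.1f}, b1 = {b1:.2f}, prediction for 120 = {prediction:.1f}")
```

With equal talent and noise variance the fitted slope comes out near one half, so the predicted mean for an established 120 falls between 100 and 120. As the comment notes, this describes the conditional mean only and says nothing directly about how likely an individual is to improve or decline.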

Honestly, this distinction is pretty meaningless in the current context for the reasons Brian mentions - the mean and median are basically the same here - though I do think quantile analysis is a potentially valuable tool for projections.


#15    MGL      (see all posts) 2009/07/27 (Mon) @ 19:06

Amusing and lively discussion going on at BTF:

http://www.baseballthinkfactory.org/files/newsstand/discussion/the_book_blog_mgl_most_people_still_dont_understand_the_concept_of_regressi/P100/

I had absolutely no idea that I aroused such emotion from people I have never met in my entire life and have had nothing to do with whatsoever (other than a few I do know and have kind words to write).  And I can’t imagine for the life of me why.

I have answered thousands of emails, written a Book, published dozens of articles, done a few interviews, kept up with a blog and a “mailbag,” and maybe a few other things I have forgotten, all for pretty much nothing.  Never made a dime off of any of these - not interested in that.  Well, at least no one can call me a gold-digger I guess…


#16    MGL      (see all posts) 2009/07/27 (Mon) @ 19:08

Ditto for Tango (contributing tons of stuff for nothing) and of course plenty of other folks on the web…


#17    greenback06      (see all posts) 2009/07/27 (Mon) @ 19:56

MGL, I don’t mean to impose on your hospitality here, but you did just take a dig at the intelligence of a bunch of folks, and it wasn’t exactly out of character. You also have a tendency to author These People Don’t Understand posts, which in general is a fast track to hostility, even if it’s right, or perhaps especially if it’s right. Again, I don’t have any intention of telling you how to write on your blog, but the source of those emotions isn’t hard for me to see.


#18    Matthew Cornwell      (see all posts) 2009/07/27 (Mon) @ 20:01

Some might not like hearing that their favorite players are “lucky.”

Let’s say Chipper is your favorite player and you take offense at the notion that his talent might not be as good as his production would indicate because of what MGL is suggesting.  One should not fret too much, as it is very likely that those ranked above him had equal if not more “luck” on their side.  It shouldn’t change our overall perception of players since everybody regresses - not just Chipper.


#19    David      (see all posts) 2009/07/27 (Mon) @ 20:34

Taken literally, it is very difficult to reconcile these 2 things you said:

1) there are some reasonably intelligent regulars on that site (if anyone interprets that as a dig, it is).

2) I had absolutely no idea that I aroused such emotion from people I have never met in my entire life and have had nothing to do with whatsoever (other than a few I do know and have kind words to write).  And I can’t imagine for the life of me why.

Obviously, the main problem is that you made a little dig at a group of people you don’t know.  But, it’s also important that you tried to make a nuanced point about a concept and then people on that site responded to your point.  And you didn’t really respond back to them, ignoring some legitimate concerns.  Your original post had some mistakes in it.  Nuanced mistakes, definitely, but I’m not sure they’re “better” than the nuanced point you were originally harping on.

Basically, just phrase things like, “I think people are making the following mistake” and treat things like a friendly conversation instead of calling people out.  Otherwise, when you make a minor mistake in that same post, people will call you out.  Don’t act surprised when they do.


#20    MGL      (see all posts) 2009/07/27 (Mon) @ 20:59

I am surprised at the quantity of posts directed toward me, not that there are some posts that take offense to what I said or how I said it.  I really am. No big deal.  Just surprised, that’s all.  Come on.  164 and counting.  I’m sure all of you would have put the over/under at 188 ;)

I posted something which was and is pretty straightforward - absolutely nothing controversial about what I initially said in this thread.  And I made a slightly arousing comment - slightly (“reasonably intelligent people...”).  164 and counting?  Come on…


#21    Peter Jensen      (see all posts) 2009/07/27 (Mon) @ 22:11

The thing that seems strange to me about BTF is that if you hadn’t made that comment about the posters there I think that there was a 50% chance that they wouldn’t have linked to your post at all and if they had you probably wouldn’t have gotten more than about 10 comments.  It must be sad for you and Tom who used to participate in some really good discussions at the old Primer to see the level of discourse at BTF now.


#22    MGL      (see all posts) 2009/07/27 (Mon) @ 22:41

I don’t know that Primer has changed much really.  I’m not sure.  It was always a little sketchy. I don’t miss it at all.  I still read some of the threads, mostly for the articles, not too much for the comments.  It is amazing how puerile that site can be.  The comments threads that is.  I have nothing against the site of course.  It is a fine site - the home page is a little messy and the navigation difficult.  But nothing wrong with the site.  Lots of strange birds who are regular posters and contributors there.  IMO at least.


#23    Patriot      (see all posts) 2009/07/27 (Mon) @ 23:07

MGL’s description of BTF is spot-on, I think.

One particular thing that has annoyed me on the occasions that my blog has been linked there is that the posters seem to respond as if it was self-linked.  I write about something mundane, it gets linked there, and a poster goes off about how self-indulgent the post is.  Well, yeah, it’s from my own personal blog.  I didn’t ask anyone to link to it. 

The exception was Primate Studies, which always had a high level of discourse, but of course this blog is essentially the continuation of Primate Studies.


#24    MGL      (see all posts) 2009/07/27 (Mon) @ 23:15

Here is a quote from someone at BTF (seems like a reasonable person):

“In any case, the only thing I really wondered about was the hasty phrasing in mgl’s post, which made it seem like below-average players are fixing to regress toward average. That doesn’t mean that Vizquel has a good chance at an OPS of .760 next year; it just means that he’s headed (most likely down) to whatever OPS we can expect a 43-year-old shortstop who was never a great hitter to, on the average, achieve.”

Yes, all below-average players (players who have posted below-average stats) are more likely than not to regress toward average, if we know nothing else about that player.  And actually if we know other things about that player, the only thing that is generally going to change is the definition of “average” (the mean of whatever population to which the player belongs).  So, in the case of Vizquel, yes, he is expected to regress toward the mean of however a 43 year old (and whatever characteristic we want to use to define his population) SS generally performs.  If that is more than he has performed over the last 3 or 4 or 5 years (weighted), then we expect him to perform better at any point in the future (after doing whatever age adjustments we need to do).

The main thing I wanted to point out in that quote is “who was never a great hitter.” That is NOT relevant to the regression! The player’s stats do not have anything to do with the mean that we expect the player to regress towards.  43 year old?  Fine.  SS?  Fine.  Small in stature?  Fine.  Still in good shape?  Fine.  The fact that he was never that good a hitter, or that he has not hit well for the last few years?  No, no and no.

Now that I said that the player’s stats should have nothing to do with the regression, I will say that it could.  But not in the way that most people think.  I’ll give you an example.  Let’s say that a young player has some terrible stats for the first 2 months of his play.  By all rights he should not be playing anymore.  He should be on the bench or sent down to the minors.  But he is still playing.  Why?  There is some chance that the scouts recognize that he is a better player than his stats suggest or he is otherwise a great prospect with a great pedigree.  So sometimes bad play by a player when he is allowed to continue to play can actually tell us a little something about his true talent.  So while prior stats per se cannot influence our regression, sometimes those stats can allow us to infer something about his true talent that we would not know otherwise.

Anyway, I think that most people understand the basic concepts of regression toward the mean and how it works in practice.  What I really wanted to say, and I may not have said it well, was that I don’t think many people understand “why” the regression occurs.

Again, and I’ll put it succinctly, the regression occurs because the player was NEVER as good or bad as his prior stats suggested, and he is simply likely to return more or less (“towards” or however you want to say it) to his true talent level as it existed all along.  The fact that a player’s true talent can and does change all the time is a completely separate issue that I am not addressing (at this time) at all. If someone thinks or says that I HAVE to address it when discussing regression toward the mean, they are wrong...


#25          (see all posts) 2009/07/27 (Mon) @ 23:52

I think the important thing is that players should ideally get regressed to their own true talent level, and although MGL and Tango are aware of this, it was not getting through to people for some reason.  The bold part in MGL/24 should help.  Of course, being mortal beings, we do not know a player’s true talent so we must guess at it based upon the population from which we are drawing the player.  I think it’s a subtle but important point that the population mean is being used as a proxy for the player’s true talent.

Keep up the good work guys; you are first rate sabermetricians and I’m very glad to have found The Book and this site.


#26    David      (see all posts) 2009/07/28 (Tue) @ 00:22

MGL, I just think that’s too strong of a statement for reasons given previously in the BTF thread.  If we know each player’s true talent (t), then there’s really no issue here.  There would be no regressing at all - we’d just always use t as a projection and we’d observe t + e where e is noise/luck. 

But we don’t know t, so we have to use covariates (age, weight, etc.) to guess t.  Say we have group 1 which consists of a bunch of players with the exact same covariates and we know that their mean skill level is s (s=[t_1+...+t_N]/N).  But there’s heterogeneity in skill within group 1.  Take a high-skilled player in this group (t > s).  In period 1, he performs above the mean (t + e > s) but below his own talent level (e < 0).  This player should perform better in period 2, but there’s a chance he doesn’t.  Say he doesn’t and “regresses” to s in period 2.  What you’re calling “regression” here is really just random variance and skill heterogeneity.  The regressing player was actually _better_ than his original stats. 

My point is really just this: If we know t, then the issue of regression is trivial.  Everyone would know to use t.  If you don’t know t, there’s heterogeneity due to unobservable skill.  Once that’s true, you can’t say that a person regressing to the mean (based on observables) is definitely regressing because he was worse than his original stats suggested.  Likely?  Sure.  But not 100%. 
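The t + e setup above is easy to simulate. The following is a minimal sketch, not anyone's actual model: it assumes talent and luck are both normally distributed, and the mean (.330) and the two spreads (.030 each) are made-up illustration values. It counts how often a player who performed above the group mean has a true talent below his observed stat.

```python
import random


def share_not_as_good_as_stats(n_players=200_000, mean_talent=0.330,
                               talent_sd=0.030, noise_sd=0.030, seed=0):
    """Among above-mean performers, return the share whose true talent t
    is below their observed stat t + e (that is, e > 0)."""
    rng = random.Random(seed)
    above_mean = 0
    talent_below_stat = 0
    for _ in range(n_players):
        t = rng.gauss(mean_talent, talent_sd)  # unobserved true talent
        e = rng.gauss(0.0, noise_sd)           # noise / luck
        observed = t + e
        if observed > mean_talent:
            above_mean += 1
            if t < observed:
                talent_below_stat += 1
    return talent_below_stat / above_mean


share = share_not_as_good_as_stats()
print(f"{share:.3f}")
```

With equal spreads for talent and luck, the share comes out near three quarters: well above 50%, but not 100%. Changing the assumed spreads moves the number, but it stays strictly between those two bounds.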

The broader point is that someone can say Sabathia’s K% is “too low” if they believe they are using information “unobserved” by (or just unused by) the statistician.  I realize that the spirit of a lot of comments is not in this vein...but some are.


#27    MGL      (see all posts) 2009/07/28 (Tue) @ 04:03

David, the first part of my statement in bold above is wrong. I should have said, “LIKELY never as good or bad..” where likely is some number greater than 50%.  Other than that, there is no “strong” or “weak” in what I said.  What I said, with the above correction, is exactly correct, and is how regression toward the mean should be viewed.

OF COURSE we don’t know a player’s true talent level.  We never do and never will.  What is the point of mentioning that?

“I think it’s a subtle but important point that the population mean is being used as a proxy for the player’s true talent.”

Thanks for the kind words, mickey, but you have that wrong.  We don’t use a population mean as a proxy for a player’s true talent.  And we don’t “regress to a player’s true talent.” We use regression toward a mean to estimate a player’s true talent.  That is the only way to do it. (I suppose using the vernacular meaning of the word “regressed” a player “regresses” to his true talent, but if we are using the vernacular definition of “regress” we can’t really include a player whose true talent is better than his sample performance, can we?)

It is tautological that a player is most likely to play at his true talent level in any unknown period of time. Duh!  We regress toward a population mean in order to estimate a player’s true talent. Simple.  Nothing more and nothing less.  This entire thread could have been summed up in one sentence. If people want to argue about how to come up with that mean, that’s fine with me…


#28    David      (see all posts) 2009/07/28 (Tue) @ 04:31

The point was that you made a statement - multiple times - that was slightly wrong.  Your entire post was based on the fact that you don’t think people understand the nuances of the idea of regression to the mean.  But then you made similar mistakes and some small errors.  So, I was correcting.  Not a big deal (it’s hard to consistently use precise language) but if we’re going to get it right, we should get it right. Wasn’t that the point of all this?  If not, why exactly did you make the original post?  The core of your original point was that we don’t know a player’s true talent level.  But now it’s too obvious to mention?


#29          (see all posts) 2009/07/28 (Tue) @ 09:23

“Thanks for the kind words, mickey, but you have that wrong.  We don’t use a population mean as a proxy for a player’s true talent.  And we don’t “regress to a player’s true talent.” We use regression toward a mean to estimate a player’s true talent.  That is the only way to do it. (I suppose using the vernacular meaning of the word “regressed” a player “regresses” to his true talent, but if we are using the vernacular definition of “regress” we can’t really include a player whose true talent is better than his sample performance, can we?)”

Yeah upon rereading what I wrote I realize that it didn’t convey what I meant.  I agree with you here; thanks for clarifying my point.  When I combine vernacular usage with a technical discussion (or technical usage in a casual discussion) bad things often happen.


#30    birtelcom      (see all posts) 2009/07/28 (Tue) @ 18:11

I wonder if part of the problem of understanding mgl’s regression point is that it is so abstract.  Not that there is anything wrong with abstract as such, but it doesn’t necessarily promote understanding.  Abstractness is, I find, something of a recurring stylistic issue here at Book Blog. Bill James’s ideas are more successfully adopted, I suspect, because he can be so good at using actual historical or contemporary examples to vividly demonstrate the full effect of the phenomena he describes.

For example, to say that Chipper Jones is subject to a regression to the mean assumption that his true talent is probably lower than his actual career performance level, while true so far as it goes, is so limited in its practical significance that it will tend to leave many people mystified.  Given the much larger factors in determining Chipper’s likely current “true talent level” or future predicted performance (the rather large sample of his actual performance over a long career, his age) regression to the mean for him seems likely to be very small indeed.  Given that, using him as an example seems to positively discourage understanding rather than encourage it.  That’s OK if you want to be provocative rather than educational, but then one shouldn’t really complain when people are provoked rather than educated.

By the way, how much does the regression to the mean part of the Marcel formula affect evaluation of Chipper these days?  Mentioning that, and contrasting it to, say, how much of an effect there is with respect to players with much smaller actual levels of experience, might have helped reduce the confusion by illustrating the point with a bit less abstraction.


#31    Tangotiger      (see all posts) 2009/07/28 (Tue) @ 19:42

“...is so limited in its practical significance”

It is so not!  I base my entire evaluation work on its application.


#32          (see all posts) 2009/07/29 (Wed) @ 15:20

So what is your definition of “luck”? I would say that in everyone’s haste to define “what others don’t understand” we lose the ability to realize all the stuff that will never be able to be understood.

That’s why most people here need to take a lesson from Gödel on the limits of predictive analytics. You don’t want to end up like the financial quants, selling stuff as simpler and less arbitrary than it actually is.

Bayesian analytics are some of the most objective ways of looking at data, but there is ALWAYS subjectivity lurking in most anything. I mean, doesn’t it seem silly to use words to argue about math? I find it entertaining though. Silly and entertaining are not mutually exclusive especially with no “dog in the hunt”.


#33    birtelcom      (see all posts) 2009/07/29 (Wed) @ 16:55

tango #31: When I used the phrase “so limited in its practical significance” I did not mean to refer to regression to the mean generally but only to its specific application to Chipper Jones, or any other player with a similarly long career worth of real performance numbers.  If we wanted to do a best guess as to Chipper’s “true talent level”, characterized in terms of wOBA, what would the difference be between doing that without a regression to the mean in the formula and with one? Over about 9,000 MLB PAs, Chipper has a career wOBA of .406. These 9,000 MLB PAs are not an infinitely large sample, and it is true that there is a higher probability of Chipper’s true career talent level (which would be apparent with an infinitely large sample) being below .406 than being above .406. But by how much?  If we had to pick a wOBA that reflected the likeliest true talent level of wOBA for Chipper across his career, how much would it vary from his actual .406 career wOBA?  I would venture to say not very much at all, but correct me if I’m wrong.


#34    Tangotiger      (see all posts) 2009/07/29 (Wed) @ 17:05

If you are looking to see what Chipper’s “true” wOBA was over his career (as opposed to what you’d expect from the rest of his career), you just add about 200 PA to his career total.

So, with someone with 9800 career PA, you would regress 2% toward the mean.
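The arithmetic above can be sketched in a few lines of Python. This is a sketch under assumptions, not anyone's published formula: the function names are made up for illustration, the 200 PA of ballast is the figure quoted above, and the .330 population mean is a placeholder, since the thread never settles on which mean to use.

```python
def regression_fraction(pa, ballast_pa=200):
    """Share of the estimate that comes from the population mean."""
    return ballast_pa / (pa + ballast_pa)


def regressed_estimate(observed, pa, population_mean, ballast_pa=200):
    """Observed rate pulled toward the population mean by the regression fraction."""
    w = regression_fraction(pa, ballast_pa)
    return (1 - w) * observed + w * population_mean


frac = regression_fraction(9800)
# .406 is the career wOBA quoted above; .330 is an assumed mean, for illustration only.
est = regressed_estimate(0.406, 9800, population_mean=0.330)
print(f"regress {frac:.1%} toward the mean -> {est:.3f}")
# prints: regress 2.0% toward the mean -> 0.404
```

The result is not very sensitive to the choice of mean at this sample size: any plausible mean moves a .406 career figure by only a point or two of wOBA.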


#35    birtelcom      (see all posts) 2009/07/29 (Wed) @ 17:37

So incorporating the regression to the mean to get the best estimate of Chipper’s “true” wOBA over his career gives us a .404 or .405 wOBA rather than his actual career number of .406. That is what I meant when I suggested that incorporating regression to the mean for Chipper, or others of his career length, while theoretically correct is of little practical effect.  And that if writing on the subject is going to use the Chippers of the world as examples of the application of the principle, a lack of understanding might be anticipated.


#36    Tangotiger      (see all posts) 2009/07/29 (Wed) @ 18:00

The practical use for regression toward the mean is in understanding his talent level for forecasting.  So, it’s important to know since a player’s entire career won’t be equally weighted for these purposes.

Otherwise, for your specific point, I’m ok with it.


#37    Daniel      (see all posts) 2009/07/29 (Wed) @ 18:02

At what point do we have an acceptable level of confidence as to knowing a player’s “true talent”? If a hitter’s “true talent” would seriously only be known over infinite at-bats, even 10k ABs would just be a drop in the bucket. But it seems unlikely, intuitively at least, that a player with a 15 year career and a career .404 wOBA would in fact have, say, the “true talent” to hit for a .325 wOBA.

Obviously, “acceptable level of confidence” needs to be defined, but let’s say 95%. How many at bats would it take to reach that level of confidence? Do we ever reach that level of confidence? What is our level of confidence in knowing Chipper’s talent at 10,000 ABs?

All of my questions could be entirely off base, and it’s fine if you tell me that. But if that’s the case, I’d be interested in seeing an explanation as to why.


#38          (see all posts) 2009/07/29 (Wed) @ 18:31

So, the statement above, “Chipper Jones is not as good as his career stats tell us, even after you do all the appropriate adjustments,” is completely misleading.  It’s suggesting that Chipper’s previous stats aren’t indicative of his talent.  However, as I suspect most of us know and you point out, the real use of regression towards the mean is in projecting future production, not in determining previous talent level.  So what was the point of this article?  To further confuse the use of regression towards the mean?

I find this whole discussion rather uninformative.  Regression towards the mean only works if you have a meaningful mean to regress players toward.  Take for example Evan Longoria.  Last year he had just 448 PAs and posted a .373 wOBA.  Now we want to project his talent and thus production for this year.  Well he’s 22, his skill set is amazing, and he may have been a little lucky last year.  So, how much of our projection is going to be made up of regressing his performance to the .328 wOBA league average?  Now if you have a mean performance of 21 year old rookies with amazing athletic abilities that posted .373 wOBAs, I could regress him towards their mean sophomore season, no problem.  But that .328?  Sorry no take.

Now, if you’re going to regress whole groups of heterogeneous players towards the mean, no shit you’re going to be right more than not.  Bad seasons are often fueled by bad luck and vice versa.  And large groups of players are more likely to resemble the league average player, as a whole.  But individuals such as Longoria or Chipper are nothing like the league average.


#39    MGL      (see all posts) 2009/07/29 (Wed) @ 20:14

1) The regression towards the mean applies not only to a projection, but to an estimate of past true talent (career, last year, last two years, whatever) as well.  A projection is simply an estimate of current true talent (with a little bit of an age adjustment going forward).  IOW, if a player hits .300 in 3000 career AB our best estimate of his actual true BA over that entire career in the past is somewhere between .300 and whatever mean is appropriate to use.

2) As Tango said, you only regress 2% after 9600 PA or so, so even though 9600 PA is a “drop in the bucket” as compared to an infinite number of PA, the point at which exactly zero regression is appropriate, the regression curve is steep and approaches zero quickly. It is not linear.

3) A projection for Longoria is his .373 in 448 PA last year (or whatever it was) plus whatever minor league stats you want to use regressed to whatever mean you want to use.  Simple.  That applies to every person who ever picked up a baseball bat in the major leagues.  Their projection or your estimate of their true talent (which is essentially the same thing - the only difference is the time frame) is their actual performance regressed toward some mean.  Exactly what that mean is, no one knows for sure.  We usually use age and phenotypical characteristics, like height and weight.  Speed can be used, as well as draft slot and prospect status.  Scouting reports can be used too, but one has to be careful that those scouting reports are not too much a function of a player’s stats.
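Point 2 above, that the amount of regression falls off quickly rather than linearly, can be seen by tabulating the regression fraction at a few sample sizes. A minimal sketch, assuming the "add about 200 PA" rule quoted earlier in the thread holds at every sample size (the constant and function names here are illustrative only):

```python
BALLAST_PA = 200  # ballast quoted earlier in the thread; treated as an assumption


def regression_fraction(pa, ballast_pa=BALLAST_PA):
    """Fraction of the way to regress the observed rate toward the chosen mean."""
    return ballast_pa / (pa + ballast_pa)


sample_sizes = [0, 200, 448, 1000, 3000, 9600]
fractions = [regression_fraction(pa) for pa in sample_sizes]

for pa, frac in zip(sample_sizes, fractions):
    print(f"{pa:>5} PA: regress {frac:.0%} toward the mean")
```

Under this rule the fraction is 100% at zero PA, already down to 50% at 200 PA, and about 2% at 9600 PA. It only reaches exactly zero in the limit of infinite PA. Which mean to regress toward is a separate choice and does not change these fractions.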

If anyone wants to argue WHAT the mean is to regress a certain player, count me out of that discussion.  You can figure that out for yourself.  I’ve already spoken my piece on that.  As it turns out, if you use the MLB average for EVERYONE, you are going to be in pretty good shape.  For example, take all players in history just like Longoria - a big, strong, high prospect 3rd baseman with good skills who hit .373 in his first year.  Regress that toward the MLB average, even though that is probably not the best mean to use - and I guarantee that you will be in the ballpark for his wOBA in any future year (adjusted for age).

Wally, who is telling you to use .328 or whatever the MLB average is for ALL players?  Who are you arguing with?  Yourself?  How many times have I and a hundred (exaggeration warning) other people said, “the appropriate mean,” “whatever mean you (responsibly) choose,” “the mean of the population of similar players,” etc., etc., etc., etc., yet you say:

“But that .328?  Sorry no take.”

Why are you saying that?  Who are you talking to?  Who here is saying that all players should be regressed to the MLB average (I am assuming that your .328 is some MLB average)? I just said that you “couldn’t go wrong if you did that,” and that is true, but I have said hundreds of times before, and other people have said the same thing, that you regress towards the mean of the population that the player belongs to, and you do your best to identify that population and then find out its mean. That is the whole POINT of regression toward the mean!  If a 7’ 4” basketball player has 5 blocks per game, you don’t regress toward the average blocked shots per game of ALL NBA players, do you?  You don’t regress toward the mean blocked shots per game of the average player in your son’s pee-wee league do you?


#40          (see all posts) 2009/07/29 (Wed) @ 20:48

MGL, maybe I missed part of this discussion as I was reading the comments, but it’s hard to see your basic point about the player specific mean when you say stuff like: “The thing that people don’t understand (actually one of the things) about regression toward the mean in baseball is that the reason any above or below average player will always regress, on the average, towards average, is that they were not really as good or bad as we thought in the first place, based on any of their stats.”

So, if I’m arguing with myself, great, then we don’t have anything to disagree on.  But given some of the things said above, I think this is an easy point of confusion.  So, thanks for clearing that up.


#41    birtelcom      (see all posts) 2009/07/29 (Wed) @ 21:00

Wally #38: My point was only about the usefulness of the Chipper example, and you are right that the original statement “Chipper Jones is not as good as his career stats tell us, even after you do all the appropriate adjustments” was very misleading because not as good apparently here meant a .404 or .405 wOBA instead of the actual .406 wOBA.  But the concept is still good for the Longoria example.  If mgl and tango have their formulas right, your suggested regression against a group of early career stars and their subsequent performance should produce the same result as the regression to the mean of the league-wide wOBA.  The chances that a performance like Longoria’s is 100% luck and that he is actually an average wOBA player are quite low, and tango’s Marcels, with the regression to the mean included, will reflect that.  As will Marcels for others of similar age and performance level.

My only point was that the statement about Chipper in the original post was cast in a way that a reader would likely infer that there was some probable significant difference between Chipper’s performance and his career long true talent, and that is not correct - the probable difference for Chipper over his career is meaninglessly small. I don’t even mind that the provocatively misleading statement was made - provoking discussion is not necessarily a bad thing. But to make a provocatively misleading statement and then whine about how people misunderstand the concepts you are merely trying so hard to explain I found irritatingly disingenuous.  But none of this invalidates the basic concepts of regression to the mean that tango and mgl correctly apply.


#42    Brian Cartwright      (see all posts) 2009/07/30 (Thu) @ 02:02

You can argue about how much knowledge we gain of Chipper Jones’ true talent by adding 200 PAs of league average performance, but how much do we gain by using stats he compiled 16 years ago?

So we want to keep the player’s stats as recent as possible (because his talent level is changing) but as large as possible to minimize the effects of regression. If we use the past three seasons of, say, 600 PAs each, that’s 1800 total, and 200 of regression gives 2000 total: 90% player’s records, 10% league.

As MGL points out, when a player is short on MLB experience, use a translation of his minor league record. Use whatever info about the player you can get your hands on.
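The three-season arithmetic can be written out as a short sketch. The wOBA values for the three seasons and the .330 league mean are invented for illustration, the function name is hypothetical, and the seasons are weighted equally here even though a real projection system might weight recent seasons more heavily.

```python
def blend_with_league(seasons, league_mean, ballast_pa=200):
    """seasons: list of (pa, woba) pairs.
    Returns (estimate, player_weight, league_weight)."""
    total_pa = sum(pa for pa, _ in seasons)
    weighted_sum = sum(pa * woba for pa, woba in seasons)
    denom = total_pa + ballast_pa
    estimate = (weighted_sum + ballast_pa * league_mean) / denom
    return estimate, total_pa / denom, ballast_pa / denom


# Three hypothetical 600 PA seasons, most recent last.
seasons = [(600, 0.380), (600, 0.400), (600, 0.390)]
estimate, player_w, league_w = blend_with_league(seasons, league_mean=0.330)
print(f"player {player_w:.0%}, league {league_w:.0%}, estimate {estimate:.3f}")
# prints: player 90%, league 10%, estimate 0.384
```

Dropping to a single 600 PA season would raise the league share to 25%, which is the trade-off described above between keeping the data recent and keeping the sample large.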

For what to regress to, it doesn’t have to always be the league mean, but it also can’t be the player’s own stats. If you are looking at Evan Longoria, you can look at all other athletic third basemen who got drafted in the first 5 picks, and see if that has any predictive value.


#43    Tangotiger      (see all posts) 2009/07/30 (Thu) @ 09:56

Right.  When mgl talks about Chipper’s true talent level, he is talking about a specific point in time, not his average true talent level of a 15-yr time period.

As I’ve noted, you are going to regress an unweighted career performance by only 2% (and that will apply to all veteran players, thereby cancelling out the effect).  Even trying to regress a 5000 PA career performance to compare to a 10,000 PA career performance means regressing one by 4% and the other by 2%.  No one’s going to make a big deal out of this, other than for philosophical discussion of the uncertainty level of this mean, which will be very low (but not zero).

At this point, I don’t even know what questions are on the table.


#44    birtelcom      (see all posts) 2009/07/30 (Thu) @ 11:33

“When mgl talks about Chipper’s true talent level, he is talking about a specific point in time, not his average true talent level of a 15-yr time period.” That usually may be true, but it’s not what he seemed to be saying in the original post in this thread: “The thing that people don’t understand (actually one of the things) about regression toward the mean in baseball is that the reason any above or below average player will always regress, on the average, towards average, is that they were not really as good or bad as we thought in the first place, based on any of their stats.  That goes for Sabathia, Halladay, Bonds, Chipper Jones, etc., etc.  Chipper Jones is not as good as his career stats tell us, even after you do all the appropriate adjustments.” I read that to mean that regardless of all the accumulated experience we have of a career like Chipper’s, he wasn’t “really as good as his career stats tell us” - and I understood that conclusion to be based on the meaninglessly tiny 2% reduction that is applied.  If that’s how you are going to portray regression to the mean, people are going to misunderstand it and it will be harder, not easier, to have people respect the concept when it is more meaningfully applicable.


#45    MGL      (see all posts) 2009/07/30 (Thu) @ 11:57

I chose Chipper somewhat arbitrarily and on the spur of the moment.  These are blog posts, often in the middle of the night (by me) and not research papers. 

On the other hand, my point was that even an established, veteran player, who we “know” is a great player is still likely not as good as his stats suggest.  It would not have been instructive, given one of the points I was trying to make, to use a player with one year of experience as an example.  Bonds would have been another good example.

Another example I like to give along the same lines, which someone else mentioned in another thread or another site (maybe the BTF thread on regression), which I think is a good one, is that we are not 100% certain that Bonds was a better hitter, true talent-wise, than Neifi Perez…


#46          (see all posts) 2009/07/30 (Thu) @ 15:56

MGL:

“is that we are not 100% certain that Bonds was a better hitter, true talent-wise, than Neifi Perez…”

Ok, but what’s the P-value?  And statistically speaking, with real world data, can anything achieve 100% certainty?  After all there is a non-zero chance that the basketball randomly jumps out of the empty swimming pool…

And birtelcom, I’m not trying to invalidate the usefulness of regression to the mean, only saying that its usefulness and its effect are being exaggerated here.  Note the “we are not 100% certain that Bonds was a better hitter, true talent-wise, than Neifi Perez” comment.  The P-value on that has to be something ridiculously small.


#47    Tangotiger      (see all posts) 2009/07/30 (Thu) @ 16:11

Right, of course.  The point is that it is non-zero.

We are 100% certain that Bonds had better results from his performance than Neifi.  We are nine (one hundred and 9?) 9s certain that Bonds had more talent.


#48    birdo      (see all posts) 2009/08/25 (Tue) @ 23:03

This may not be the correct thread to ask this in (since it has already been noted that you can choose whatever mean you like) but what is the methodology for handling players who have only ever played in the minors?  Are people saying that once you translate the minor league stats to their MLB equivalent, then you should regress to whatever MLB-level-derived-mean you choose?

In other words, if you have a player with only 200 rookie ball ABs, you would still regress those translated stats towards some sort of MLB average.  Then, the fact that the stats would be significantly pulled towards MLB average is simply a result of the small sample and if you do not have a “more reliable” mean this is an unavoidable consequence?

Great discussion and thanks in advance.


#49    MGL      (see all posts) 2009/08/25 (Tue) @ 23:44

I am not sure what the last sentence in the last full paragraph means, but there are two answers to your question ("what mean do we regress a player’s minor league stats toward?").

If you want to project a player only if he ends up playing the major leagues, then you regress his minor league stats toward some kind of rookie major league average.  That would be the answer to the question, “What do I expect minor league player A to hit if in fact I knew in advance that he was going to get some PA in the major leagues?”

If you did that for all minor leaguers and you thought that you could use those projections to decide who should be in the major leagues, you would be wrong.

Basically, if you want to come up with projections for all minor league players as if they were guaranteed to play in the majors, or as if you were never going to know anything else about them and you had to make a decision as to who gets promoted and who does not, then you use average minor league stats (translated to major league stats) as the mean you regress towards. 

Those would be accurate (as accurate as projections on young players who only have minor league experience can be) if all of those players were guaranteed to hit in the major leagues, no matter what teams and scouts thought of them and what their tools were.

But, they would not be accurate if you compared them to players who do in fact end up hitting in the major leagues.  That is because these are two different populations.  And remember that in deciding what mean to regress towards, you are trying to estimate what population a player comes from and then you take the mean of that population.

Again, if you just want to take a random player from the minors and come up with a projection for him as if he were definitely going to get some major league AB’s no matter who he is and what teams and scouts think of him, you regress him to the major league translations of the average player at the level he is playing at.

But, if you want to compare your projections to players who are eventually allowed to play in the majors because teams think that they are major league quality or have major league potential, you better use a different mean to do your projections - the mean of rookie players or something like that.

That is because players who do eventually get some AB’s in the majors belong to a different population.  They belong to the population of players who likely were high draft picks and are considered prospects by their teams, and have good tools.

I know that is a little tricky to understand but I hope it makes some sense and I hope it answered your question.

If a team asks you to give them projections on all minor league players and they want to use those to help them decide whom to promote, you better use the mean of the average player in the league they came from. (It doesn’t really matter what mean you use anyway since they are just going to compare one player to another, plus if they go ahead and promote a player because they like that player and then they look at your projection, it is likely that you will under-project him because the team thinks that this player is better than an average minor league player regardless of his stats, or together with his stats - presumably at least.)

If you are compiling projections for minor league players who may get some major league PA’s and you want to see how well you do, you better use a higher mean - that of rookie players or something close to that.

Here is an example:  Let’s say that you have two AAA (any level - it doesn’t matter) players with exactly the same minor league stats (in the same number of PA), and they are the same age, same position, same everything. You want to project each of them and you know nothing else about them.  You correctly use the AAA average player for your mean to regress their MLE’s towards and you come up with an X projection for both players (obviously the same projection).

Now, one of these players gets promoted right away and the other one does not. In fact, the other one never gets promoted.  But, we are the master of the universe, so we promote that second player ourselves. Who do you think will likely do better in the majors?  If you said the one who was actually promoted by the team, you would be right.  So how come you had the same projection for both players, but we know that player B (the one who got promoted by his team) will do better than player A (the one promoted by you)?  Because you now know that you did the projections wrong (you didn’t know that when you did them).

If you had known that Player B was going to be promoted, you would have used a different mean for his regression - that of a rookie major league player, or so. The “reason” for this is that Player B obviously is considered to be a better player than player A and likely is - that is why he was promoted and player A was not.  So we make use of that information (of course it is too late) in our projections, namely in the mean we regressed the player’s stats towards.
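The two-player example can be checked with a small simulation. This is a sketch under invented assumptions, not a model of any real promotion process: talent, the translated minor league line, and a team's scouting read are all drawn from normal distributions with made-up parameters, and a player is "promoted" when the scouting read clears an arbitrary cutoff.

```python
import random
from statistics import mean


def promoted_vs_passed_over(n=300_000, seed=1):
    """Among players with (nearly) identical minor league stats, compare the
    mean true talent of those the team promoted with those it did not."""
    rng = random.Random(seed)
    promoted, passed_over = [], []
    for _ in range(n):
        talent = rng.gauss(0.300, 0.025)           # true talent, never observed directly
        stats = talent + rng.gauss(0.0, 0.030)     # translated minor league line (MLE)
        scouting = talent + rng.gauss(0.0, 0.020)  # team's read, independent of the stats
        if abs(stats - 0.320) > 0.002:
            continue  # keep only players with the "same" stats
        if scouting > 0.315:
            promoted.append(talent)
        else:
            passed_over.append(talent)
    return mean(promoted), mean(passed_over)


promoted_talent, passed_over_talent = promoted_vs_passed_over()
print(f"promoted: {promoted_talent:.3f}  passed over: {passed_over_talent:.3f}")
```

Both groups have the same stats, so a projection built from stats alone gives them the same number, yet the promoted group's average true talent comes out higher. That gap is the information carried by the promotion decision, and it is why the promoted group calls for a higher mean to regress toward.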

So, again, if you want to do any projections on minor league players and you only care about them if they actually ever play in the majors, then use some major league rookie mean for your regression.  Most people fall into that category.  Who cares about a player’s projection if they never make it?  BTW, that is one reason why players who languish in the minor leagues sometimes have good MLE projections.  We are regressing their stats to a mean that is too high. Obviously if they have lots of minor league data it won’t make that much of a difference.

If you really, really, really want to do an accurate projection that most of the time, NO ONE WILL EVER SEE, but actually reflects what each player would do if they were guaranteed to make the majors, then use a lower mean - namely that of the average player from the level that each player comes from!  This projection, although more “accurate,” is the quintessential tree falling in the forest when no one is around to hear it.

Interesting, huh!

