THE BOOK--Playing The Percentages In Baseball


Friday, January 09, 2009

Modeling Baseball Player Ability with a Nested Dirichlet Distribution

By Tangotiger, 11:41 AM

A forecasting/aging paper by Brad Null.  Excellent paper, and I recommend it.  The author even went out of his way to acknowledge the sampling bias, and references one of my discussion threads.  I like him just for not limiting himself to just academic papers.  A couple of quick takeaways:
- I would have preferred a smoothed function by component (see Figure 14)
- the binomial approach, first introduced (to me) by Voros, would make more sense with K/BB split away first (see Figure 7)
(I doubt it would make even a hair of a difference, but it would make more sense I think)
- even with all these fantastic breakdowns, the final result is on par with Chone, ZiPS, PECOTA, and a bit ahead of Marcel (see Figure 22)

(Hat tip: Victor)


#1    Rally      (see all posts) 2009/01/09 (Fri) @ 15:54

Looks pretty cool so far.  It will take some time to read through though.  Lack of comments does not = lack of interest.


#2    dan      (see all posts) 2009/01/09 (Fri) @ 17:09

I tried to read and my brain began to hurt, so I skipped to the references section at the end. The author credits Tango on a blog post that was written by MGL.

http://www.insidethebook.com/ee/index.php/site/article/peak_offensive_age/


#3    Pizza Cutter      (see all posts) 2009/01/09 (Fri) @ 19:56

I don’t honestly know what a Dirichlet distribution is, but from what I’m seeing, they’re using a nested approach (good), although I think it’s a little weird how they’re setting up the nesting model.  Why not branch off FB/GB, then from there out/hit, and then under hit, 1B/2B/3B/HR?  Something like that.  Maybe there are technical details I don’t know.

Still, it looks pretty good (from my skimming of it, I hope to devote a good solid reading to it this weekend).  The problem that I worry about with these types of articles is that I think that we just keep building a slightly better CD player.  Marcel is a system that is based on the assumption that players will generally do what they’ve been doing lately, with some adjustment for regression to the mean.  From what I’ve seen of the other systems, it seems that they may have more bells and whistles, but they’re set up to generate stat-lines that look suspiciously like last year’s.  Which is great, considering that most players do that, and it’s a heck of a lot better than what usually happens (uh… he was a .320 hitter last year, so he’ll probably do that next year.)

I’m longing for the new Sabermetric iPod.  Give me a system that tells me who the players are who are going to break out and NOT do what they did last year.


#4          (see all posts) 2009/01/09 (Fri) @ 22:11

Like the other commenters here, I’ve only skimmed the article.

He seems to be saying that he considered several different tree structures, and that the one he presents gave better overall results than the more intuitive approach of splitting non-contact results (BB + K) from contact results (GB + FB) higher up the tree.

But he also notes that with this structure he can’t get the positive correlation between HR rate and BB and K rates that we see in real life. I wonder if putting BB and K on the FB branch, rather than on the GB branch, would improve this.
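To make the tree talk concrete, here is a minimal sketch of how a nested model turns conditional branch probabilities into outcome rates. It uses the "contact vs. non-contact first" layout mentioned above, not the tree from the paper, and every number in it is made up for illustration.

```python
# Hypothetical tree: each entry is (conditional probability, children or None).
# The shape and the numbers are illustrative only; they are NOT the paper's.
TREE = {
    "contact": (0.72, {
        "GB": (0.45, {"GB_out": (0.76, None), "GB_hit": (0.24, None)}),
        "FB": (0.55, {"FB_out": (0.70, None), "FB_hit": (0.18, None), "HR": (0.12, None)}),
    }),
    "no_contact": (0.28, {"BB": (0.35, None), "K": (0.65, None)}),
}


def leaf_probabilities(tree, path_prob=1.0):
    """Multiply conditional probabilities down each path to get per-PA outcome rates."""
    out = {}
    for name, (p, children) in tree.items():
        if children is None:
            out[name] = path_prob * p          # leaf: a final outcome
        else:
            out.update(leaf_probabilities(children, path_prob * p))
    return out


rates = leaf_probabilities(TREE)
for outcome, rate in sorted(rates.items()):
    print(f"{outcome:7s} {rate:.4f}")
```

Each node only needs probabilities that sum to 1 among its own children, which is what lets every node carry its own distribution. The flip side is that outcomes on different branches (HR and K here) only interact through the nodes they share.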


#5    Rally      (see all posts) 2009/01/10 (Sat) @ 00:12

Does this work for a projected breakout season?
http://www.baseballprojection.com/cruz-ne1238.htm


#6    Guy      (see all posts) 2009/01/11 (Sun) @ 07:24

Ditto on Tango’s praise for Null’s willingness to draw upon non-academic sources.  Eight of his 18 references are to non-academic sabermetric works.  Quite refreshing.

I don’t have the knowledge to evaluate the nested Dirichlet model, so I’ll wait for Pizza and others to let us know what they think on that front.  But I share Pizza’s feeling that marginal improvements in these predictive metrics aren’t that interesting at this point.  Predicting breakouts and collapses is of course the holy grail, though we may never get it.  But regardless of whether that’s attainable, I do think more attention could be devoted to predicting players’ health and durability.  We have traditionally set that aside and treated it as a separate issue, but it’s really integral to player aging and thus performance.  Most metric comparisons I see answer the question: “how well does the metric predict the player’s performance next year, IF he’s healthy enough and plays well enough to make it into the study?” But the Yankees don’t care only what Sabathia does next year, they care about RAR over the next 8 years.  How will he age? Same for teams that sign Prince Fielder or Ryan Howard.  (Craig Wright’s article on Honus Wagner in this year’s THT annual implies he’s done a lot of work on aging and long-term forecasts, but unfortunately it remains proprietary).

One comment on the article:  his aging curve (Fig. 15) is much too flat after the peak (like Fair’s).  We know players decline more quickly than this.  Indeed, Null’s initial non-age-adjusted model (fig. 12), which incorporates regression to the mean, overpredicts OPS for players over age 33 by about 40 points (and those are the players still around to evaluate).  That suggests a 40-point single season decline.  But his age adjustment suggests a decline of 10 points per year or less in these ages (eyeballing the graph).  I suspect the overly flat age curve could explain the appearance of what Null speculates is regression to the mean of actual talent.


#7    Brad Null      (see all posts) 2009/01/11 (Sun) @ 17:43

Hi guys, first off, thanks to all of you for checking out the paper. If you have any questions or feedback, please email me. Just to let you know briefly why I started this research to begin with: I needed a prior distribution for player abilities to use in prediction and decision models for games, seasons, and beyond, and I couldn’t find one already out there. It isn’t intended to be strictly a forecasting engine in and of itself, but I was happy to find that its predictive power seemed at least reasonable.

Guy, do you have a source for the aging curves you reference?

Also, sorry if I mis-attributed a blog post. I will get that fixed.


#8    Guy      (see all posts) 2009/01/11 (Sun) @ 19:36

Brad: thanks for stopping by.  Nice work.

I wasn’t referencing a specific aging curve.  But I think we can use career length to provide some general constraints an aging curve should conform to.  For example, half of the players who are average at age 28 (OPS+ of 100) will play their final season by age 34.  This is consistent with David Gassko’s finding that there are about twice as many PAs by age 28 players as age 34 players: http://www.hardballtimes.com/main/article/measuring-the-change-in-league-quality-part-two/.  If replacement level is 75-80% of average, that implies a much sharper drop in performance after age 30 than your curve, or Ray Fair’s. (Coincidentally, I’m doing some research on this now, and may write something up.)

Also, don’t your results in fig. 12 suggest a much larger age decline?  Or am I misinterpreting that?

I think a sample of players with 8+ seasons of significant playing time probably creates a significant bias understating decline, as you speculate in the paper.


#9          (see all posts) 2009/01/11 (Sun) @ 20:16

Guy/6- sorry if this is verging into philosophy, but isn’t the point of a breakout the fact that you can’t predict it?  Ex- if we could predict breakouts accurately, then breakouts wouldn’t really be considered breakouts anymore.


#10    Brad Null      (see all posts) 2009/01/11 (Sun) @ 20:33

Guy, I’ll definitely take a look at what you are referencing about age effects. One thing to realize when looking at fig. 12 though, the original model uses the last 4 years of data, so for the overall effect to be on the order of .040 points, the year to year effect would probably be on the order of .012-.015 points (there are other factors at work, but bottomline, the year to year would be less). As I stated in the conclusions, I am definitely still interested in figuring out more about how these effects seem to vary from player to player, probably using a Bayesian sort of analysis similar to what Albert did (but using all players instead of a selective sample).

One other thing, regarding andeaux’s reference to different trees. Yes, I definitely think there are better tree structures out there. I limited the alternatives quite a bit, and I freely acknowledge that there is plenty of work to be done to find better trees. Finding the best tree is actually a very hard problem, but I have some ideas to push on that. It is also totally an open question whether this is the best way to segment event types. I have some ideas for partially observable event types that break down types of fly balls further, for instance, but I won’t go into that right here for fear of boring folks. Suffice it to say though, I do think there is reason to consider nesting the K and BB near the end rather than the beginning.

As I said though, please feel free to contact me with feedback or questions, in case I miss them here. This is my first cut at the problem, and there is certainly a lot of room for improvement (just a few elements of which are listed in the conclusions).


#11    Rally      (see all posts) 2009/01/12 (Mon) @ 00:33

I still haven’t grokked this, I need to learn what Dirichlet is before I can do that, but it’s very interesting.  Nice to see my work (the CHONE projections) being referenced in a Stanford academic work.

Brad, you aren’t using park effects, right?  That won’t matter for most players, as they either stay in the same park, or else most parks aren’t all that different anyway.  But it would add accuracy for a few players and improve your results at the margins. Four years of data and age adjustments does a pretty good job.

You mention that you don’t use minor league data at this point.  It is a bit tougher to get the batted ball type for minor leaguers.  That would improve your results for the guys who split major and minor league time, but it’s a double edged sword.  Once you are predicting minor leaguers, then those players are included in your forecast evaluation, and minor leaguers are a bit tougher to predict than big leaguers.

I do like splitting hits up among groundballs and flyballs - if I’m able to get the right data, I’d love to add that to CHONE next year, but it probably won’t happen.  I think Minor league Gameday would have to be my source, and I’d have to figure out how to get the data from that.


#12          (see all posts) 2009/01/12 (Mon) @ 03:00

I was a math major at one time before graduation, so although most of the math notation was over my head I think I did a fair job of following along.

A couple months ago Pizza Cutter and I discussed what we called “flowcharting” over at StatSpeak. I believe it’s the same concept called “nesting” here. If the batter doesn’t reach on CI, then if he isn’t HBP, then if he puts the ball in play, then if it’s a fly ball, then it may be a HR, etc. I have my projection algorithms set up this way when applying park and level factors.

I also wrote an article on aging patterns, and how they differ among components. http://statspeak.net/2008/10/aging-patterns-its-all-downhill-from-here.html
I will disagree with how BB & SO were nested in this study, which resulted in their aging patterns being mirror opposites of each other, which is not true. BB & SO have distinct aging patterns. I would chart it as (bifurcated), after CI & HBP, then either ball in play or not; if in play, either fly or grounder; if not in play, either BB or SO.


#13    Guy      (see all posts) 2009/01/13 (Tue) @ 13:47

Brad:  OK, so fig. 12 is giving us approx. 2.5 years of decline, not 1.  That makes sense.  However, you might still have a survivor bias issue, as players who didn’t play at all in 2007 presumably experienced a disproportionately large decline.  So fig. 12 could understate decline.

In any case, fig. 15 suggests an average decline of about 90 OPS points from age 28 to 42.  That would mean an age-28 .800 OPS player (roughly average over his career) could be expected to play until age 42 on average.  Obviously, that’s not close to the real aging process in MLB.  But finding the right answer is definitely tricky.
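The back-of-envelope arithmetic in this exchange can be sketched as below. Equal weighting of the four prior seasons is an assumption made here to reproduce the "approx. 2.5 years" figure; the paper's actual weights may differ. The .040 and 90-point inputs are the figures quoted in the thread, not new measurements.

```python
# Assumption: the four prior seasons are weighted equally (hypothetical weights).
lags = [1, 2, 3, 4]                     # seasons before the forecast season
weights = [0.25, 0.25, 0.25, 0.25]
avg_lag = sum(lag * w for lag, w in zip(lags, weights))   # average age of the data, in years

overprediction = 0.040                  # OPS overprediction for age 33+ (fig. 12, as quoted above)
per_year_from_fig12 = overprediction / avg_lag

decline_28_to_42 = 0.090                # OPS decline eyeballed from fig. 15, as quoted above
per_year_from_fig15 = decline_28_to_42 / (42 - 28)

print(f"average lag: {avg_lag:.1f} years")
print(f"implied yearly decline from fig. 12: {per_year_from_fig12:.3f}")
print(f"average yearly decline from fig. 15: {per_year_from_fig15:.4f}")
```

Under these assumptions fig. 12 implies roughly .016 per year, a bit above the .012-.015 range Brad gives, while the fig. 15 curve averages under .007 per year across ages 28 to 42. Survivor bias, as noted above, would push the true figure higher than either.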


#14    dcj      (see all posts) 2009/01/14 (Wed) @ 01:24

I read the paper and liked it a lot. Like Brian, I didn’t attempt to decipher some of the notation, but I believe I got most of the ideas.

By the way, this:

I will disagree with how BB & SO were nested in this study, which resulted in their aging patterns being mirror opposites of each other, which is not true. BB & SO have distinct aging patterns.

is misunderstanding what Brad did. Brad is saying that the aging patterns for BB/(BB+SO) and SO/(BB+SO) are mirror opposites, which is true by definition.
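A quick sketch of why that's true by definition, and why it says nothing about the per-PA rates. The counts below are hypothetical, made up purely to show the arithmetic.

```python
def conditional_shares(bb, so):
    """Return (BB share, SO share) among plate appearances that end in a BB or SO."""
    total = bb + so
    return bb / total, so / total


PA = 600
# Hypothetical counts per 600 PA for the same player at two ages (made-up numbers).
young = {"BB": 50, "SO": 120}
old = {"BB": 66, "SO": 110}

bb_y, so_y = conditional_shares(young["BB"], young["SO"])
bb_o, so_o = conditional_shares(old["BB"], old["SO"])

# Conditional shares always sum to 1, so their changes are equal and opposite.
delta_bb_share = bb_o - bb_y
delta_so_share = so_o - so_y

# Unconditional (per-PA) rates are free to move independently.
delta_bb_rate = (old["BB"] - young["BB"]) / PA
delta_so_rate = (old["SO"] - young["SO"]) / PA

print(f"conditional:   BB {delta_bb_share:+.4f}  SO {delta_so_share:+.4f}")
print(f"unconditional: BB {delta_bb_rate:+.4f}  SO {delta_so_rate:+.4f}")
```

The conditional changes cancel exactly, while the per-PA changes in this example do not, which is the distinction between the conditional and unconditional aging patterns.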

Pizza, your point is well taken. Still I think there are two great things about this paper, and two caveats.

Great thing #1: the methodology of Bayesian updating. For each player, rather than a single best guess for his true talent OPS, we have a probability distribution. As more data comes in, we refine the distribution accordingly. I think this is the best conceptual framework for a projection system.

Great thing #2: the potential for reliable estimates of variance at a granular level. We can come up with a probability that Player X raises his BB rate by >20% next year, for example. This would be a major step forward.
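Here is a minimal sketch of both ideas for a single rate, using a Beta prior (the two-outcome case of a Dirichlet) rather than the paper's full nested model. The prior strength, the player's line, and the function names are all hypothetical choices for illustration.

```python
import random


def update_beta(prior_a, prior_b, successes, trials):
    """Conjugate update: Beta prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + (trials - successes)


def prob_rate_increase(post_a, post_b, last_rate, next_pa,
                       factor=1.2, sims=2000, seed=0):
    """Monte Carlo estimate of P(next season's observed rate > factor * last_rate)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(sims):
        talent = rng.betavariate(post_a, post_b)                      # draw a true-talent rate
        walks = sum(rng.random() < talent for _ in range(next_pa))    # simulate one season
        if walks / next_pa > factor * last_rate:
            count += 1
    return count / sims


# Hypothetical inputs: a prior worth 200 PA at an 8.5% walk rate,
# and a player who walked 72 times in 600 PA (12%).
prior_a, prior_b = 17.0, 183.0
post_a, post_b = update_beta(prior_a, prior_b, successes=72, trials=600)
posterior_mean = post_a / (post_a + post_b)    # pulled from 12% toward the prior
p_jump = prob_rate_increase(post_a, post_b, last_rate=72 / 600, next_pa=600)

print(f"posterior mean BB rate: {posterior_mean:.4f}")
print(f"estimated P(BB rate up more than 20%): {p_jump:.3f}")
```

The posterior keeps the whole distribution instead of a point estimate, so a question like "how likely is a 20% jump?" becomes a matter of sampling from it. This sketch ignores aging, park, and playing-time effects.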

Caveat #1: The tree structure needs work, as others have said. I won’t trust this system over CHONE, PECOTA, ZiPS, etc. until it can reproduce the observed correlations between different batting rates. (Andeux, the way things are set up, it wouldn’t help to put BB and K on the FB branch. The system would still require K and HR/FB to be independent.)

Caveat #2: Player injuries don’t fit too well into this system. I agree with Guy/6: in order for a projection method to be head and shoulders above the rest, it will have to tackle the injury issue head-on. Didn’t Sig Mejdal do some work on this in an old BPro annual?

Bottom line, I think this is an excellent piece of work. Plus, the methodology is adaptable to more difficult tasks, like getting a better handle on the minors-to-majors transition.

I’ve also got some technical questions and comments, which I’ll send to Brad via email.


#15    Brad Null      (see all posts) 2009/01/16 (Fri) @ 13:52

Sorry it took a while, but here are a few responses to some questions.

Rally, no I am not using park effects or Minor League data. I certainly agree that both should improve performance. By how much, I don’t know. As of now for people with no MLB experience I am applying the population prior, similar to Marcel, so these players are already a part of my evaluation.

dcj, yes, I think your comments are right on. The BB and K effects here are only conditionally mirror opposites. If you pull out the unconditional effects, they will be quite different. I will compute a table of these unconditional component effects from the model at some point when I get a chance. And yes, modeling injuries is important, although I don’t think this sort of approach precludes that at all. It just isn’t incorporated at this point.

Guy, I agree there could be a survivor bias, and I have some thoughts on further aging analysis in the conclusions.

Also, yes, I agree with everyone that there is room for improvement in structuring the tree. However, in contrast to what others have said I think it makes sense both intuitively and from an examination of the data to put BB and K near the bottom.

