
THE BOOK--Playing The Percentages In Baseball


Monday, October 08, 2007

The Moneyball Anomaly and Payroll Efficiency

By Tangotiger, 05:22 PM

I haven’t read this pdf yet, but I read the previous one (which focused on just a couple of years) and found its conclusions highly… inconclusive. Guy did read both papers, and since he shared my misgivings about the previous one, I’m sure I’ll be on board with him on this one as well, once I read it. Here’s what Guy said (I’ll presume he won’t mind my copying and pasting him directly):


This is an interesting paper, and the inclusion of earlier and later years is a plus. The data are intriguing, and certainly suggest the possibility that market returns for BBs/OBP have grown. To me, your year-by-year data (table 5) suggest a post-1994 increase in valuation, together with a potentially anomalous 2004-2005 surge, much more than a sudden post-2003 Moneyball response (especially given the big 2006 drop). But either interpretation is plausible.

However, to make a convincing case either for the historic undervaluation of BB/OBP or the post-Moneyball “correction,” the productivity and valuation models need to be improved. Both contain flaws too substantial to give us much confidence in the conclusions.

PRODUCTIVITY: A lot of good work has been done on the relative importance of OBP and SLG in generating runs. All of this work suggests the ratio is 1.7:1 or 1.8:1. See this NYT article: http://www.nytimes.com/2007/02/2...r=1& oref=slogin, and Tango has done extensive work on this: http://www.insidethebook.com/ee/...slg_make_sense/ . Or you can satisfy yourself by simply running a regression on team runs scored over the past few seasons (I get 1.8:1 using 2005-07). Since the only link between these variables and winning is RS and RA (except perhaps for spurious correlations we don’t want to measure), we know this is the correct ratio for impacting wins as well. The fact that your models yield a 2.6:1 ratio shows that your current OBP/SLG model is incorrect.
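For anyone who wants to try the "run a regression on team runs scored" check, here is a minimal sketch in plain Python. It assumes you supply your own list of team seasons as (OBP, SLG, runs) tuples. The function names are made up for this example, and the demo numbers are synthetic, generated from an arbitrary linear formula so that the 1.8 ratio is baked in by construction. The demo only shows that the fitting code recovers a known ratio; it is not evidence about real teams.

```python
from typing import List, Sequence, Tuple


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares fit of y = b0 + b1*x1 + ...; returns [b0, b1, ...]."""
    rows = [[1.0, *map(float, r)] for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def obp_slg_ratio(team_seasons: Sequence[Tuple[float, float, float]]) -> float:
    """team_seasons holds (OBP, SLG, runs); returns OBP coefficient / SLG coefficient."""
    X = [(obp, slg) for obp, slg, _ in team_seasons]
    y = [runs for _, _, runs in team_seasons]
    _, b_obp, b_slg = ols(X, y)
    return b_obp / b_slg


# Synthetic, made-up inputs (NOT real team data): runs follow an arbitrary
# linear rule whose OBP:SLG coefficient ratio is 2700/1500 = 1.8.
made_up_pairs = [(0.320, 0.400), (0.330, 0.430), (0.340, 0.410),
                 (0.350, 0.450), (0.310, 0.390), (0.360, 0.440)]
demo = [(o, s, -800 + 2700 * o + 1500 * s) for o, s in made_up_pairs]
demo_ratio = obp_slg_ratio(demo)
```

With real team-season data in place of `demo`, the returned ratio is the quantity the post compares against the paper's 2.6:1.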

While I applaud your desire to deal with the multicollinearity of OBP and SLG, your new power variable (TB/H) creates new problems. The value of a hitter’s power depends on his batting average: the interaction of the two is critically important. Albert Pujols’s TB/H is far more valuable than the same TB/H from a player with an average BA, because Pujols hits .330; if you credit him only with the average power value, it won’t be accurate. This will only cause small problems at the team level (small variance in BA and SLG), but large errors at the player level.

Perhaps as a result, the coefficient you get for “eye” is clearly too high: 1.75 means that a .01 increase in BB rate will yield 2.8 additional wins. In fact, a .01 increase in BB rate translates into a gain of about 20 runs, or just 2.0 wins.
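The arithmetic behind that comparison can be checked in a few lines. This sketch assumes the paper's dependent variable is team win%, so that coefficient times change in BB rate gives a change in win%. It also uses the 162-game schedule and the post's own conversion of 1 run = .0006 in win%. The function names are just for illustration.

```python
GAMES_PER_SEASON = 162    # regular-season games per team
WIN_PCT_PER_RUN = 0.0006  # run-to-win% conversion quoted in the post (rounded)


def wins_from_coefficient(coef: float, delta: float) -> float:
    """Wins per season implied by a win% coefficient and a change in the input."""
    return coef * delta * GAMES_PER_SEASON


def wins_from_runs(runs: float) -> float:
    """Wins per season implied by a change in runs."""
    return runs * WIN_PCT_PER_RUN * GAMES_PER_SEASON


model_wins = wins_from_coefficient(1.75, 0.01)  # the paper's "eye" coefficient
run_based_wins = wins_from_runs(20)             # the post's "about 20 runs"
```

Here `model_wins` comes out near 2.8 and `run_based_wins` near 1.9 with the rounded .0006 conversion, in line with the 2.8 versus 2.0 gap described above.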

You might try using the more traditional BA, ISO, and BB rates: using 2005-07, I get coefficients that comport pretty well with real run values (again, with runs as dependent variable): 4500 for BA, 1650 ISO, 2000 BB. (Note: 1 run = .0006 in win%).
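To see what those coefficients mean in practice, here is a small sketch that turns a change in each rate into runs and wins. It assumes the rates are team-level decimals (so .010 is ten points of BA, ISO, or BB rate) and that the dependent variable is team runs per season. The dictionary and function names are hypothetical.

```python
# Coefficients as quoted in the post (runs as the dependent variable).
RUN_COEFFICIENTS = {"BA": 4500, "ISO": 1650, "BB": 2000}
WIN_PCT_PER_RUN = 0.0006  # from the post's note
GAMES_PER_SEASON = 162


def run_change(deltas: dict) -> float:
    """Runs gained for the given changes in BA, ISO and/or BB rate."""
    return sum(RUN_COEFFICIENTS[stat] * d for stat, d in deltas.items())


def runs_to_wins(runs: float) -> float:
    """Convert a run change into wins over a full season."""
    return runs * WIN_PCT_PER_RUN * GAMES_PER_SEASON


# Runs from a .010 improvement in each rate, taken one at a time.
per_ten_points = {stat: run_change({stat: 0.010}) for stat in RUN_COEFFICIENTS}
```

Under these assumptions, ten points of BB rate is worth about 20 runs, which is where the 20-run figure for a .01 increase in BB rate comes from.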

VALUATION: The problems with the productivity variables of course apply here as well. In addition, the model appears to be very unstable, yielding wildly fluctuating coefficients from one year to the next. It seems implausible that valuation of key skills changes in this manner, especially considering the large number of multi-year contracts which constrains annual shifts. A few features likely contribute to the instability:

1) The inclusion of pre-arbitration players is going to distort your model, since these players’ compensation effectively has no relationship to performance. Why pretend that Ryan Howard’s 2006 salary of $355,000 has something to do with his .923 OPS the prior year? Or that Pujols’ sub-$1M contracts before 2004 somehow reflected his performance? Including a dummy variable for arb-eligible players can only provide a crude adjustment. The much better solution is to exclude all pre-arb players. (Also, power and BB are both “old player” skills, while BA peaks at a younger age. Given the role of seniority in compensation, it will appear that BA is undervalued by teams if you don’t exclude the pre-arb players.)

2) Similarly, including players with only 130 PAs is likely to distort your findings in unpredictable ways. A .360 OBP in 175 PAs is not remotely comparable to the same performance over 650 PAs. Also, at such a small # of PAs, their performance is hardly a good measure of their true talent. It would be better to look only at regulars, something like 350+ PAs (and perhaps create a separate model for part-time players).

For that matter, why should we expect one year’s performance to determine the next year’s salary? Players should be paid based on their talent, for which any one season’s performance is a poor proxy (especially with only 130 PAs). A more reasonable model would compare salary to a weighted average of the previous 3 (or more) seasons’ performance, i.e. a reasonable projection of his future performance.

3) The high correlation between power and eye makes it hard to distinguish which skill players are being compensated for. Many of the highest-paid hitters are routinely leaders in both categories. This means the relative valuation is likely to be heavily influenced by the small number of hi-power/low-BB and low-power/hi-BB players. If one of these groups is disproportionately pre-arb and has few FAs in a given season (or the reverse), that will skew apparent valuations. This makes it even more important to set a higher PA minimum and exclude the pre-arb players.
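The sample fixes suggested in the list above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Season` and `Hitter` records, the 5/4/3 recency weights, and the treatment of plate appearances as independent trials when estimating noise. Only the 350 PA cutoff, the exclusion of pre-arb players, and the idea of a multi-year weighted average come from the text.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Season:
    pa: int      # plate appearances
    obp: float   # on-base percentage


@dataclass
class Hitter:
    name: str
    pre_arb: bool          # True if not yet arbitration-eligible
    seasons: List[Season]  # most recent season first


def obp_standard_error(obp: float, pa: int) -> float:
    """Binomial standard error of an observed OBP (PAs treated as independent)."""
    return sqrt(obp * (1 - obp) / pa)


def weighted_obp(seasons: Sequence[Season], weights=(5, 4, 3)) -> float:
    """Recency- and PA-weighted OBP over the most recent seasons.

    The 5/4/3 weights are illustrative, not taken from the post.
    """
    pairs = list(zip(seasons, weights))
    total = sum(w * s.pa for s, w in pairs)
    return sum(w * s.pa * s.obp for s, w in pairs) / total


def valuation_sample(hitters: Sequence[Hitter], min_pa: int = 350) -> List[Hitter]:
    """Keep arb-eligible or free-agent hitters with a regular's workload."""
    return [h for h in hitters
            if not h.pre_arb and h.seasons and h.seasons[0].pa >= min_pa]


# How noisy is a .360 OBP at the two workloads mentioned in the post?
se_175 = obp_standard_error(0.360, 175)
se_650 = obp_standard_error(0.360, 650)
```

Under the independence assumption, the standard error of a .360 OBP is roughly .036 over 175 PAs and roughly .019 over 650 PAs, so the part-timer's number is about twice as noisy. That is one way to quantify why the two performances are "not remotely comparable."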

Of course, it could be that a much improved model would also show a big post-Moneyball surge in valuation of BBs. But we can’t really know that based on the current model.


