
THE BOOK--Playing The Percentages In Baseball


Thursday, October 12, 2006

Forecasting Pujols’ AB

By Tangotiger, 07:41 AM

From 2001 to 2005, these were Albert Pujols’ AB totals: 590, 590, 591, 592, 591

Consistent!  That’s the word, right?  Wow, I will forecast Pujols for 591 AB in 2006, and I’m definitely going to be right!  Yes, consistent, healthy.  That’s Pujols.  We all know what happened to Pujols this year.  His final AB total was 535.

What did Marcel say before the season started? 

531.

Now, you may be thinking:

That’s cherry-picking!

But, no, that’s not true.  Marcel was built with a regression toward the mean, and that includes playing time.

Go ahead.  Go find me every hitter who had 2750 to 3150 AB over his previous 5 years, 1670 to 1870 AB over his previous 3 years, and 560 to 620 AB in the previous year.  Make sure you have somewhere between 20 and 50 hitters.  Tell me the average number of AB in the next year.  I swear I don’t know the answer.

But, that’s the whole point of Marcel, that it has built a very simple model that will tell you how many AB a hitter will get, as an over/under.  The regression on playing time is enormous, and it’s why it can forecast players as well as it does.
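The search described above can be sketched in a few lines. This is only an illustration of the filter, not Marcel itself: the function name `comparable_next_year_ab` and the input layout (five prior seasons of AB, oldest first, plus the next season's AB) are assumptions made here, and the second hitter is made up to show a row being rejected.

```python
from statistics import mean

def comparable_next_year_ab(histories):
    """Apply the post's AB filters and average the following season's AB.

    histories: list of (last5, next_ab) pairs, where last5 holds a hitter's AB
    for the five previous seasons (oldest first). This layout is an assumption.
    """
    next_abs = []
    for last5, next_ab in histories:
        five_yr = sum(last5)         # previous 5 years
        three_yr = sum(last5[-3:])   # previous 3 years
        one_yr = last5[-1]           # previous year
        if (2750 <= five_yr <= 3150
                and 1670 <= three_yr <= 1870
                and 560 <= one_yr <= 620):
            next_abs.append(next_ab)
    return len(next_abs), (mean(next_abs) if next_abs else None)

# Pujols' 2001-2005 AB from the post, followed by his 2006 total.
pujols = ([590, 590, 591, 592, 591], 535)
# A made-up part-time hitter (hypothetical numbers) who fails the filters.
part_timer = ([300, 410, 380, 450, 420], 400)

count, avg_next = comparable_next_year_ab([pujols, part_timer])
print(count, avg_next)
```

With a real season-by-season table in place of the two rows here, the returned count is the sample size and the second value is the average to compare against the forecast.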

#1    Tangotiger      (see all posts) 2006/10/12 (Thu) @ 08:10

Just using the 1yr and 3yr restrictions from above, I got 1059 players.  Limiting it to players between 1969 and 1980, and I have 139 hitters.  Average number of AB for those hitters?  532.

And if I look at hitters from 1999 to 2005?  155 hitters, and the average AB is… 531.


#2    Peter Jensen      (see all posts) 2006/10/12 (Thu) @ 11:40

So Marcel was very accurate in forecasting Pujols AB’s for 2006.  Doesn’t that also mean that it underestimated his AB’s by 60 in 2004 and 2005?


#3    tangotiger      (see all posts) 2006/10/12 (Thu) @ 12:10

In the 1995-2005 sample of 155 hitters, the average going into the year in question was 590 AB, and Marcel would have forecast around 530 for each of those hitters.  90 of those hitters had at least 560 AB, and 34 had 500 or less AB.

Pujols was simply a useful illustration that it doesn’t matter how consistent a guy’s AB have been: it’s impossible to figure whether he’s still going to be part of the 60% that will be within 30 of their average, the 20% that will miss their average by 90 AB, or somewhere in between.

If I use a range of +/- 30, Marcel will be right only 20% of the time.  But, on average, Marcel will get it right.
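The percentages quoted here follow from the counts given in this comment (155 hitters, 90 with at least 560 AB, 34 with 500 or fewer). A quick check of the arithmetic, with variable names chosen only for this sketch:

```python
# Counts reported in this comment for the 155-hitter sample.
total = 155
at_least_560 = 90
at_most_500 = 34
in_between = total - at_least_560 - at_most_500  # hitters with 501 to 559 AB

def pct(count, total):
    """Share of the sample as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)

shares = {
    "560 or more": pct(at_least_560, total),
    "500 or fewer": pct(at_most_500, total),
    "501 to 559": pct(in_between, total),
}
print(shares)
```

The middle bucket is roughly the band within 30 AB of a 530 forecast, which is where the 20% figure comes from.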

Any forecast that doesn’t adhere to these kinds of rules is bound to do a bad job.  Estimating Pujols to continue to get 590 AB is a foolish estimate, even if he were to have gotten 590 AB again.

***

Note: I would add an extra parameter, and that is the quality of the player.  After all, there are two major reasons that a player doesn’t play: injuries, and being supplanted by someone better (or perceived to be better).  Pujols doesn’t have that second problem.  So, a better forecast model would have called for Pujols to have gotten, say, 560 AB.


#4    Peter Jensen      (see all posts) 2006/10/13 (Fri) @ 07:49

"But, on average, Marcel will get it right.”

This is almost a tautology.  If you use the past average of AB’s for regular players to predict those players’ future AB’s, then of course the average of those players’ future AB’s is going to be very close to the past average.  This does not mean that it is a very good estimate.  As you admit, for 80% of the players it misses by a significant amount.  The problem is to identify the factors that would make good parameters for estimating future AB’s.  You have already mentioned the quality of the player as one parameter; age, past injury history, playing position, and weight are other possibilities.  I know Marcel is meant to be a quick, rough estimate and not an involved study, but you shouldn’t have cited its accuracy in predicting AB’s when it is not particularly accurate and could be much better with some minor changes.

In your first paragraph you mention that of the 155 hitters 90 were +30 from the 530 mean and only 34 were -30 (leaving only 31 as within + or - 30).  This shows a highly skewed distribution.  A simple improvement in prediction would be to use the median or an interval mode rather than the mean as a predicted value.


#5    tangotiger      (see all posts) 2006/10/13 (Fri) @ 07:59

I doubt it could be “much better”, but that’s semantics.  My main point is that most people, looking at Pujols’ line, would have likely predicted something close to the 590.  And you are right, that it’s almost a tautology.  But, most people wouldn’t believe it unless they study it.  Would most people use a .6*AVERAGE+180 to estimate someone’s AB?
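The .6*AVERAGE+180 shorthand is easy to check against the numbers in this thread. The sketch below is only that shorthand, not the full Marcel playing-time calculation, and the function name `simple_ab_forecast` is made up for illustration.

```python
def simple_ab_forecast(avg_ab):
    """Rule of thumb quoted in this thread: 0.6 * AVERAGE + 180."""
    return 0.6 * avg_ab + 180

# A hitter averaging 590 AB, as in the 155-hitter sample discussed above.
forecast_590 = round(simple_ab_forecast(590), 1)

# The value the formula leaves unchanged, i.e. the level it regresses toward:
# x = 0.6 * x + 180  =>  x = 180 / 0.4
regression_target = 180 / (1 - 0.6)

print(forecast_590, round(regression_target, 1))
```

For a 590 AB average this gives 534, close to the 531 Marcel forecast and the 531 to 532 group averages reported above. Put another way, the formula pulls every hitter 40% of the way toward 450 AB.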

As for the accuracy, it depends if you are looking for an over/under (median) bet, or a “differential” (mean) bet.

As for Marcel specifically, I’m not proud of it, because it’s an extremely simple model, and is the starting point for a forecasting system.  But it does give most people eye-openers as to modeling the past.


#6    dq      (see all posts) 2006/10/13 (Fri) @ 14:18

What you have in this group of 590-AB hitters is a group of players who are very good and haven’t had a major injury in the last 3 years. Chances are they will be good in the 4th year and not have an injury. I took players with 1770+ AB over 3 consecutive years, ending 1998-2004, and looked at their next-year AB. There were 130 players, whose 3-year average was 614.6. 180 + .6 * AB gives an average of 548.8; their actual AB averaged 553.3.

But, the 3-year average (3YA) was closer to the right answer 60% of the time versus Marcel’s. The average of the absolute difference for 3YA was 72.2 versus Marcel’s 82.0.

The median of the 3YA was 28.7; the median of Marcel’s was 65.2.

The players stdev for their 3 year ab was 57.7 on 1843.9 ab; their stdev for the new year was 113.9 on 553.3 ab.

Age had a big impact as far as which was closer; the 3YA was closer than Marcel’s 68.2% of the time for players 30 & under.
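For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the three measures used above: average absolute difference, median absolute difference, and how often one forecast is closer. The five hitters are made up for illustration and are not the 130-player sample; the function name and dictionary keys are likewise assumptions.

```python
from statistics import mean, median

def compare_forecasts(actual, forecast_a, forecast_b):
    """Compare two forecasts player by player against the actual AB."""
    err_a = [abs(x - f) for x, f in zip(actual, forecast_a)]
    err_b = [abs(x - f) for x, f in zip(actual, forecast_b)]
    a_closer = sum(ea < eb for ea, eb in zip(err_a, err_b)) / len(actual)
    return {
        "mae_a": mean(err_a),
        "mae_b": mean(err_b),
        "median_a": median(err_a),
        "median_b": median(err_b),
        "a_closer_share": a_closer,
    }

# Hypothetical data: four healthy seasons and one injury-shortened one.
three_yr_avg = [610, 600, 620, 615, 605]
actual = [600, 590, 630, 610, 350]
# The .6 * AB + 180 shorthand from this thread, rounded to whole AB.
regressed = [round(0.6 * x + 180) for x in three_yr_avg]

result = compare_forecasts(actual, three_yr_avg, regressed)
print(result)
```

In this toy sample the unregressed 3-year average wins on the typical player and loses badly on the one injured player, which is the same tension described in the comment.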

So, I would say that the chances of Pujols getting 592 abs were greater than the chances of him getting 535. On the other hand, over the course of x seasons, he will average 535, because in one of the years he will fall short.

It’s an interesting exercise in projections; because playing time is obviously a big driver. It’s got to be a lot harder at 300-450 abs.

It’s kind of like player Deal or No Deal; if there are 4 suitcases left: $1 million, $5,000, $10 & $1 do you take a $200,000 offer?


#7    Tangotiger      (see all posts) 2006/10/13 (Fri) @ 14:57

” The average of the absolute difference for 3YA was 72.2 versus the Marcels 82.0. “

Is this right?  You said the 3YR average is 615, and the season in question is 553, for a group difference of 62.  I find it hard to believe then that absolute difference, on a player-by-player level would average only 72.


#8    dq      (see all posts) 2006/10/13 (Fri) @ 15:32

You have an average that is near the absolute max; if you average 615 you won’t get too many to go over that, and the over/under is almost always an under.
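A small sketch of why the two numbers can sit so close together: the average absolute difference can never be smaller than the absolute value of the average difference, and the two are equal when every miss is in the same direction. The data below are made up, with the second case built so that the gaps come out at 62 and 72; the function name is an assumption.

```python
from statistics import mean

def group_and_individual_gap(forecast, actual):
    """Return (average signed difference, average absolute difference)."""
    diffs = [f - a for f, a in zip(forecast, actual)]
    return mean(diffs), mean([abs(d) for d in diffs])

forecast = [615] * 5  # everyone forecast at a near-ceiling 615 AB

# Hypothetical case 1: every hitter comes in under the forecast.
all_under = [600, 610, 590, 400, 565]
# Hypothetical case 2: one hitter beats the forecast, the rest fall short.
one_over = [640, 610, 590, 360, 565]

print(group_and_individual_gap(forecast, all_under))
print(group_and_individual_gap(forecast, one_over))
```

When nearly every miss is an under, the individual-level figure is only slightly larger than the group-level one, which is the ceiling effect described here.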


#9    tangotiger      (see all posts) 2006/10/13 (Fri) @ 16:00

Good point.

That brings up a good tweak then, to try to minimize the individual differences, rather than at the group level.  I suppose as I go away from the top-end, then either way works, but in this case, when there’s a fairly clear ceiling, then we should adjust otherwise.

(Note: I use PA, not AB, and only used AB for illustration purposes.)

Unless dq beats me to the punch, I’ll come up with a different equation to estimate PA for Marcel purposes.

Thanks dq…

