
THE BOOK--Playing The Percentages In Baseball


Thursday, August 27, 2009

Why might Marcel suck this year?

By Tangotiger, 11:21 AM

I recently reported on the current standings in the Forecasters Challenge 2009.  For those new to the whole thing, I invited 21 top forecasters, plus Marcel, to take part in an automated draft.  Not just one draft, but one thousand drafts.  This way, any lucky breaks in a single draft, because of draft position or whatnot, get wiped out once I run it a thousand times.

Cool, right?  I had also surmised that everyone would have such similar lists that we’d all end up drafting each player around the same number of times.  I really thought that since Pujols was going to be drafted 1000 times, and there are 22 of us, each of us would draft him at least 30 times, and no more than 100 times.  I reasoned that he would be a top five pick, everyone would draft roughly 40-50 times at each slot, and so eventually it would even out.  Indeed, he was ranked #1 by 5 of us, #2 or #3 by another 5, #4 by another 6, and finally #5 or #6 by the final 5.

A funny thing happened to that theory.  Of the 5 forecasters that had him ranked #1, he went to one of them 761 times out of the 1000 drafts.  The 5 forecasters that had him ranked #2 or #3 got him in 157 drafts.  The #4 rankers got him the other 82 times.  Anyone who ranked him 5th or 6th simply never drafted Pujols in 1000 drafts.

And this repeated itself with a huge number of players.  Sometimes, the rankings were so nuanced that a forecaster ended up drafting that player all 1000 times.

Let’s analyze a bit, using the Marcel The Monkey Forecasting System as the illustration.  Marcel had Chad Billingsley, an excellent pitcher who is performing well, ranked 24th on his list.  Remember, the automated draft program I wrote simply goes through everyone’s list and starts drafting (while ensuring that the position quotas are filled).  Many forecasters had Billingsley ranked pretty high.  One had him 25th.  Marcel ended up drafting Billingsley 984 times, while the other forecaster got him the other 16 times.  Quite a difference for such a small gap in ranking!  What happened is that the other forecaster had another player ranked much higher than everyone else did (Rich Harden, ranked 14th), so that player kept going to that forecaster, leaving Billingsley almost always exposed.
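
To make the mechanism concrete, here is a minimal sketch of a list-driven draft.  It is not the actual draft program: it ignores position quotas, runs a single round, and uses four made-up forecasters (f1 to f4) who differ only in where they rank one hypothetical "star" player.  Instead of 1000 random drafts, it simply enumerates every possible draft order.

```python
from collections import Counter
from itertools import permutations

def run_draft(rankings, order, rounds=1):
    """In draft order, each forecaster takes the best player still
    available on their own list. Returns {player: forecaster}."""
    taken = {}
    for _ in range(rounds):
        for name in order:
            for player in rankings[name]:
                if player not in taken:
                    taken[player] = name
                    break
    return taken

# Hypothetical lists: "star" is ranked 1st, 2nd, 3rd and 4th respectively.
rankings = {
    "f1": ["star", "a", "b", "c"],
    "f2": ["a", "star", "b", "c"],
    "f3": ["a", "b", "star", "c"],
    "f4": ["a", "b", "c", "star"],
}

# Count who ends up with "star" across all 24 possible draft orders.
owner_counts = Counter(
    run_draft(rankings, order)["star"] for order in permutations(rankings)
)

for name in rankings:
    print(name, owner_counts[name], "of 24")
```

In this toy setup, f1 gets the star in 17 of the 24 orders, f2 in 6, f3 in 1, and f4 never does, even though f4 has him only three slots lower than f1.  A small edge in ranking turns into a near monopoly on the player.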

This particular case worked out very well for Marcel.  It wasn’t always so.  Here are the 25 players that Marcel ended up drafting the most.  You can consider these guys as the guys that Marcel liked more than anyone else:
ORDER_ID  MLBAM_ID  PLAYER_TX
5         407812    Holliday, Matt
24        451532    Billingsley, Chad
38        449107    Aviles, Mike
59        493137    Matsuzaka, Daisuke
65        467055    Sandoval, Pablo
69        425794    Wainwright, Adam
82        425426    Wang, Chien-Ming
109       451482    Galarraga, Armando
112       446209    Litsch, Jesse
119       434578    Saunders, Joe
142       150217    Guzman, Cristian
157       433584    Carmona, Fausto
161       460051    Getz, Chris
182       458690    Volstad, Chris
207       434633    Baker, John
215       460003    Teagarden, Taylor
322       150317    Crede, Joe
332       407487    Rivera, Juan
361       333292    Wilson, Jack
425       457788    Schafer, Jordan
486       334393    Pierre, Juan
503       346795    Chavez, Endy
513       123107    Tatis, Fernando
635       110383    Aurilia, Rich

Let’s look at a few of these players.  You will see that Marcel thought highly of Dice-K, Chien-Ming Wang, and Fausto Carmona.  Talk about bad luck.  Was Marcel unusually excited by these players, or was it a case like Billingsley, where he had these guys just a bit higher than everyone else and so ended up with them disproportionately because, well, that’s simply the way the draft works?

Let’s start with Dice-K.  Marcel had him ranked #41.  Here is how high he was ranked by the five forecasters most high on Dice-K: #58, 59, 65, 71, 83.  So, it’s not like Marcel was crazy in love with him.  He had him ranked a bit higher, but not unusually higher.  The result though is that Marcel ends up with Dice-K in 897 drafts out of 1000!  And Dice-K’s fantasy point total is one of the worst in the whole league.

In typical correlation studies, we wouldn’t notice much with this pick.  That is, take Dice-K completely out of the correlation study, and Marcel’s correlation coefficient doesn’t change much, if at all.  However, because he was drafted by Marcel nearly 90% of the time, and because he had an enormous collapse, it is Marcel that takes almost the whole brunt here.  Those other forecasters who also had Dice-K ranked highly get off almost scot-free, because Marcel was there to bail them out!
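
The correlation half of that claim is easy to check on made-up numbers.  The sketch below uses entirely synthetic data, not the Challenge data: 400 hypothetical players whose actual points are the forecast plus random noise, with one highly ranked player forced to collapse to zero.  The names, the noise level, and the seed are all assumptions chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2009)  # fixed seed so the sketch is repeatable

# Synthetic forecast: 400 players, projected points from 500 down to 101.
forecast = [float(points) for points in range(500, 100, -1)]
# Synthetic outcome: forecast plus noise (noise level is an assumption).
actual = [f + rng.gauss(0, 60) for f in forecast]

bust = 58            # index of one hypothetical highly ranked player
actual[bust] = 0.0   # who collapses completely

r_all = pearson(forecast, actual)
r_without = pearson(
    forecast[:bust] + forecast[bust + 1:],
    actual[:bust] + actual[bust + 1:],
)
print(f"with the bust: {r_all:.3f}   without: {r_without:.3f}")
```

Dropping the bust nudges the coefficient up only slightly, because one player is a tiny share of a 400-player sample.  In the draft, that same player can land on one forecaster's roster in most of the 1000 runs.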

I have a solution here which makes more sense.  And because I have all the draft lists, I can re-run all the drafts by changing any of the assumptions I want.  This allows me to test various scenarios.  My first order of business would be to see what happens when I remove two forecasters from each draft.  That is, I run 1000 drafts with 20 of the 22 forecasters.  This way, I will be assured that Dice-K will go to someone other than Marcel, someone who also had him ranked very high, but for the grace of monkey, was spared drafting him.

Let’s continue with Wang.  We all know what happened with Wang.  Was Marcel, who drafted him 510 times as the 82nd best player, unusually excited about Wang?  Here were the 5 forecasters most in love with Wang: #68 (drafted 490 times), #135, #137, #145, #168.  So, yes, Marcel should get penalized here.  Marcel was actually saved by one other forecaster who had him ranked very high as well.  Everyone else discounted Wang heavily (though of course, not heavily enough!).  However, in drafting, you just have to discount him enough that he never gets drafted by you.  So, whether he was ranked #135 or #494 (his lowest ranking by a forecaster), it’s the same thing!  As long as there is one forecaster, just one, who is in love with him enough, you will never draft him ahead of that guy.

Finally, Carmona.  Marcel had him #157 and drafted him 452 times.  There was one forecaster that had him ranked even higher at 119 and drafted him 541 times.  The rest of the forecasters were not close.

If you had to draw up a list of pitchers who collapsed, these three pitchers would likely be among the 10 biggest pitcher busts.  And by bad luck, Marcel ended up with 3 of them.

At the same time, Marcel got Adam Wainwright 665 times by ranking him 69th.  Here’s how five forecasters ranked him: #71, 74, 76, 81, 82.  So, Marcel benefits here by ranking him just slightly higher, enough to end up drafting him two-thirds of the time.

I find these results utterly fascinating, and I can see myself spending an inordinate amount of time studying all this, running different drafts with different scenarios.  Clearly, it is not fair to those forecasters who ranked Wainwright high to not get any acknowledgement for it.  After all, it is simply their bad luck to be in a draft with Marcel.  But, Marcel will not actually be in every one of their drafts.  Marcel is not necessarily a representative forecaster in a random fantasy draft.

The only right thing to do here is to run a second draft where not all the forecasters are participants.  I mentioned earlier doing a 20-forecaster draft, but perhaps 18 or 16 would be better.  The problem there is that the forecasters provided me with their rankings on the expectation that there would be 20-24 participants per draft.  If I change the rule here to 16, even if I have excellent reasons to do so, those participants can cry foul.  On the flip-side, they’re all in the same boat.

Anyway, what I will do is treat this as a learning experience.  I’m going to let the official rules stand in terms of awarding a winner.  But I’m going to run modified versions unofficially.  I won’t report an overall winner, but I’ll highlight in more broad terms what kind of changes we have.

The major point here is that while Marcel is currently in 17th place in the official challenge, the results under this format are very volatile.  If I were to re-run this next year, he could just as well end up in 3rd or 21st place.  I just wouldn’t know.  However, by entering only 16 of the 22 participants in each draft, the results will be much more stable.  Never, by definition, will one player end up going 1000 times to the same forecaster.  This is REALLY the simulator I was trying to approximate, since this is really how you would test your forecasting skills.
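
A rough sketch of the participant sampling follows, assuming each draft simply picks 16 of the 22 forecasters at random.  The forecaster labels, the function name, and the seed are hypothetical, and the sketch only counts appearances, it does not run the drafts themselves.

```python
import random
from collections import Counter

def sample_participants(forecasters, per_draft, n_drafts, seed=0):
    """Pick a random subset of forecasters for each draft and count
    how many drafts each forecaster appears in."""
    rng = random.Random(seed)
    appearances = Counter()
    for _ in range(n_drafts):
        appearances.update(rng.sample(forecasters, per_draft))
    return appearances

# Hypothetical labels for the 22 entrants (21 forecasters plus Marcel).
forecasters = [f"forecaster_{i:02d}" for i in range(1, 22)] + ["Marcel"]

appearances = sample_participants(forecasters, per_draft=16, n_drafts=1000)

print("Marcel appears in", appearances["Marcel"], "of 1000 drafts")
print("most appearances by anyone:", max(appearances.values()))
```

Each forecaster sits out about 6 drafts in every 22, so on average appears in roughly 727 of the 1000.  Whoever likes a player most cannot draft him in the drafts they miss, which gives the next-highest rankers their share of the credit or the blame.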

LOTS more to come as I think it…

(44) Comments • 2009/08/28 • SabermetricsFantasy