THE BOOK--Playing The Percentages In Baseball


Friday, August 19, 2011

Forecaster’s Challenge - current results

By Tangotiger, 09:30 PM

As you know, we have 4 competitions: 3 unofficial ones, and one official one.

The first unofficial one puts all 22 entries in one cage match (I used to run 1000 drafts, but now I only run 100).  The 21 pro forecasters are in a league together, and the 22nd entry is a “consensus” pick of all 21.  Below are the current results.

“Value_ct” is the standings points earned, and what you should look at primarily.

“points_ct” is the average points the players earned in each draft.

Right now, it looks like a three-horse race between Consensus, RotoWorld, and Steamer.

As I get time, I’ll be running the results for the other 2 unofficial competitions, and of course, the official one.

fan_id    points_ct    wins_ct    value_ct        fan_tx
299        1135        37        577        Consensus
122        1110        27        460        RotoWorld
118        1042        20        306        Steamer
116        1053        5        223        KFFL
113        1030        2        128        FEIN
112        987        4        112        Mike Podhorzer_FantasyPros911
133        970        3        104        Rotochamp
102        1007        1        101        Ask Rotoman
106        995        0        58        CAIRO
120        888        1        51        Razzball
135        956        0        36        Pat Senechal
125        976        1        33        BigScoreSports
115        950        0        19        John Eric Hanson
131        804        0        5        Fangraphs Community
132        875        0        0        Fantistics
126        856        0        0        Bloomberg Sports
134        851        0        0        Geoff Buchan
217        830        0        0        Marcel
127        819        0        0        Future of Fantasy
111        765        0        0        Statspeakblog
119        729        0        0        PECOTA
130        721        0        0        Baseball Info Solutions


#1    Tangotiger      (see all posts) 2011/08/19 (Fri) @ 23:47

In this one, I take the consensus picks, randomly bump them up or down a random amount, randomly remove 5% of them, and create 21 teams. 

I then put ONE of the pro forecasters in this league of 21 teams.  And then, against the same 21 teams, I put a second pro forecaster.  And so on.  In this way, each pro forecaster is put against the same competition, and hopefully, some “reasonable” kind of competition.  I did this 22 times for each forecaster, once for each draft slot.
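A minimal sketch of how the 21 opposing teams described above could be generated. The size of the random bump (`max_shift`), the uniform noise, and the player names are assumptions for illustration; the post does not specify them.

```python
import random

def perturb_list(consensus, rng, max_shift=20, drop_rate=0.05):
    """Return a noisy copy of a ranked player list (best player first)."""
    # Randomly remove about 5% of the players.
    n_drop = int(len(consensus) * drop_rate)
    dropped = set(rng.sample(range(len(consensus)), n_drop))
    # Bump each surviving rank up or down by a random amount, then re-sort.
    # max_shift is a hypothetical parameter, not taken from the post.
    noisy = [
        (rank + rng.uniform(-max_shift, max_shift), player)
        for rank, player in enumerate(consensus)
        if rank not in dropped
    ]
    noisy.sort(key=lambda pair: pair[0])
    return [player for _, player in noisy]

def make_opponents(consensus, n_teams=21, seed=0):
    """Build the fixed set of opposing draft lists every forecaster faces."""
    rng = random.Random(seed)
    return [perturb_list(consensus, rng) for _ in range(n_teams)]

# Hypothetical 550-player consensus list.
consensus = [f"player_{i:03d}" for i in range(1, 551)]
opponents = make_opponents(consensus)
```

Fixing the seed keeps the 21 opponents identical for every forecaster, which is the point of the design: each pro faces the same competition.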

The results:

fan_id    points_ct    wins_ct    value_ct    fan_tx
116    1211    14    177    KFFL
112    1139    13    160    Mike Podhorzer_FantasyPros911
106    1142    10    149    CAIRO
113    1118    11    146    FEIN
135    1132    11    143    Pat Senechal
217    1128    10    130    Marcel
118    1136    8    128    Steamer
133    1097    9    128    Rotochamp
102    1124    9    123    Ask Rotoman
115    1105    8    120    John Eric Hanson
134    1088    8    113    Geoff Buchan
122    1111    7    112    RotoWorld
299    1090    7    104    Consensus
125    1075    6    99    BigScoreSports
120    1051    6    92    Razzball
132    1047    6    91    Fantistics
127    1054    5    82    Future of Fantasy
131    1022    3    54    Fangraphs Community
111    1001    2    42    Statspeakblog
130    980    3    39    Baseball Info Solutions
126    931    2    33    Bloomberg Sports
119    800    0    6    PECOTA


#2    Tangotiger      (see all posts) 2011/08/20 (Sat) @ 10:11

This one is a head-to-head, a two-person league, where each side drafts 275 of the 550 players.  Basically, your whole list now comes in play.
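A rough sketch of a two-person draft, assuming each side simply takes its highest-ranked player still available and that position slots are ignored. Both are simplifications, and the fallback for an exhausted list is hypothetical.

```python
def head_to_head_draft(list_a, list_b, pool):
    """Alternate picks between two ranked lists until the pool is empty."""
    available = set(pool)
    lists = (list_a, list_b)
    teams = ([], [])
    cursors = [0, 0]
    turn = 0  # side A picks first
    while available:
        ranking = lists[turn]
        # Skip players who have already been taken.
        while cursors[turn] < len(ranking) and ranking[cursors[turn]] not in available:
            cursors[turn] += 1
        if cursors[turn] < len(ranking):
            pick = ranking[cursors[turn]]
        else:
            # Assumption: a side whose list ran out takes an arbitrary leftover.
            pick = min(available)
        teams[turn].append(pick)
        available.remove(pick)
        turn = 1 - turn
    return teams

# Tiny illustrative pool; the real one has 550 players.
pool = list("ABCDEF")
team_a, team_b = head_to_head_draft(list("ABCDEF"), list("FEDCBA"), pool)
```

With only two sides splitting the whole pool, the bottom of each list gets drafted too, which is why the entire ranking matters in this format.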

fan_id    points_ct    wins_ct    fan_tx
122    10906    41    RotoWorld
112    10854    37    Mike Podhorzer_FantasyPros911
118    10859    37    Steamer
299    10927    37    Consensus
113    10528    35    FEIN
116    10486    32    KFFL
120    10515    29    Razzball
106    10494    26    CAIRO
132    10399    25    Fantistics
131    10266    24    Fangraphs Community
217    10309    24    Marcel
130    10124    18    Baseball Info Solutions
133    10245    18    Rotochamp
102    10151    17    Ask Rotoman
127    10257    14    Future of Fantasy
135    10134    13    Pat Senechal
125    10074    12    BigScoreSports
115    9868    10    John Eric Hanson
111    9245    6    Statspeakblog
126    9678    5    Bloomberg Sports
134    9326    2    Geoff Buchan
119    8497    0    PECOTA

***

Just looking at Marcel in the three unofficial competitions that I just ran, it’s all over the place.  This shows that the measurement system is highly dependent on the competition assumptions.

BIS for example finished near the bottom in 2 of them, and right in the middle in this one.

Eric Hanson, who won the two previous years, is in the bottom half in all three competitions.

So, really, there’s a ton of luck here, especially if you are being compared with other pro forecasters.

PECOTA was last or 2nd last in each one, which tells me something strange is going on.


#3    Tangotiger      (see all posts) 2011/08/20 (Sat) @ 15:02

This is the official competition. 

I take 21 Fangraphs readers, and randomly put them in individually against one pro forecaster (who drafts first) for one league.  Then I repeat that with a second pro forecaster (same draft spot) in a second league and so on.  Then I repeat that with 21 other Fangraphs readers, and this time, each pro forecaster drafts second, and so on.  So, everyone is in 22 leagues.
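A sketch of the league layout just described, under the assumption that a fresh random group of 21 readers is drawn for each draft slot. The reader and forecaster names are placeholders, and this sampling does not guarantee that readers never repeat across slots.

```python
import random

def official_schedule(pros, readers, league_size=22, seed=0):
    """One league per (draft slot, pro): the pro plus 21 readers."""
    rng = random.Random(seed)
    schedule = []
    for slot in range(1, league_size + 1):
        # The same 21 readers face every pro at this draft slot.
        group = rng.sample(readers, league_size - 1)
        for pro in pros:
            schedule.append({"pro": pro, "slot": slot, "readers": group})
    return schedule

pros = [f"pro_{i}" for i in range(22)]          # hypothetical names
readers = [f"reader_{i}" for i in range(500)]   # hypothetical pool size
schedule = official_schedule(pros, readers)
```

Each forecaster ends up in 22 leagues, one per draft slot, against the same reader groups as every other forecaster.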

Results so far.  It’s fairly tight at the top.  It’ll be pretty anti-climactic if the consensus wins.

Thanks to all for participating!  I learn a lot with this, and I’ll post final results when the season ends.

Tom

fan_id    points_ct    wins_ct    value_ct    fan_tx
299    1140    8    135    Consensus
118    1121    8    116    Steamer
122    1103    6    112    RotoWorld
102    1098    5    93    Ask Rotoman
106    1069    5    85    CAIRO
132    1091    2    81    Fantistics
112    1090    4    78    Mike Podhorzer_FantasyPros911
113    1081    3    78    FEIN
116    1064    4    73    KFFL
131    1049    3    52    Fangraphs Community
115    1042    2    52    John Eric Hanson
133    1033    2    46    Rotochamp
135    1044    0    39    Pat Senechal
120    995    2    32    Razzball
217    991    1    31    Marcel
125    970    0    12    BigScoreSports
111    924    0    12    Statspeakblog
126    968    1    11    Bloomberg Sports
130    931    0    2    Baseball Info Solutions
134    964    0    1    Geoff Buchan
127    851    0    0    Future of Fantasy
119    777    0    0    PECOTA


#4    LJB      (see all posts) 2011/08/20 (Sat) @ 16:26

Awesome to see how well the new PECOTA is doing. More glad than ever that I didn’t waste another 40 bucks. Just wondering - was PECOTA anything more than Nate Silver?


#5    Tangotiger      (see all posts) 2011/08/20 (Sat) @ 17:22

I wouldn’t jump to any conclusions yet.  Looking quickly at the results, we see PECOTA, BIS, and Bloomberg near the bottom, and it’s possible they didn’t submit a list tuned well enough for this competition.


#6    rwperu34      (see all posts) 2011/08/20 (Sat) @ 19:46

How was playing time determined? I’d like to see the results if everybody had projected the same PT, or even perfectly projected PT.


#7    RotoChampMike      (see all posts) 2011/08/21 (Sun) @ 08:05

Juan Uribe, Juan Rivera, Jose Valverde, and Julio Lugo are all listed at the wrong position in the spreadsheet that you sent me.  All of these players have a total points count of 0.  They are all listed as FirstName - LastName, instead of LastName-FirstName.

I know that we ended up drafting Juan Rivera all 100 times at the IF spot in the first competition and received 0 points for him, so those guys might be materially affecting some of the results.

In competition #1, we were the only ones to draft Valverde (20 times, 0 points), Rivera (100 times), and Uribe (4 times).


#8    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 08:26

Yup, I’ll be checking that out, thanks.


#9    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 08:32

I wonder if I sent you the wrong spreadsheet.

I have this:

mlbam_id    pool_fld_cd    player_tx    points_ct
346874    IF    Uribe, Juan    4


#10    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 09:16

Checking your site, I see that you made a mistake with your ID matching.  Therefore, in your draft list you submitted, you gave me the Jose Uribe from the 1980s.  I presume the rest of your list is similarly affected?


#11    RotoChamp Mike      (see all posts) 2011/08/21 (Sun) @ 10:30

I think you sent me the correct spreadsheet.  If I filter by ‘Jose Valverde’, only 1 entry comes up (ours).  If I filter by ‘Valverde, Jose’ there are 20 entries.  We messed up the MLBAMID for 4 players.

The 4 players we messed up are: Uribe, Valverde, Rivera, and Lugo.

I checked the rest of the entries by filtering the spreadsheet by points_ct = 0 and n_rank = 1

The only other mistake I found was on fan_id=111.  He had Mike Stanton with the wrong ID (122681 and should be 519317).


#12    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 11:15

To be clear: this doesn’t change anything.  What counts is the MLBAM_ID and nothing else.  That’s clear in the rules.


#13    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 14:02

Extending Rotochamp’s idea, I looked at all players that:
1. were listed by only ONE forecaster
2. that forecaster also drafted him (i.e., means he was listed high enough)

This is what I got back:

fan_id    order_id    mlbam_id    player_tx    pool_fld_cd
102    452    450689    Goedert, Jared    IF
102    744    445169    Powell, Landon    C

111    132    122681    Mike Stanton    1B
111    150    502208    Walters, P.J.    P
111    523    501647    Rosario, Wilin    C

119    172    400062    Redding, Tim    P
119    175    116414    Isringhausen, Jason    P
119    176    124604    Wright, Jamey    P
119    177    407297    Cruz, Juan    P
119    181    110683    Batista, Miguel    P
119    182    150407    Mota, Guillermo    P

122    40    501954    Martinez, Osvaldo    IF

125    455    446345    Taylor, Michael    OF
125    466    475676    Weglarz, Nick    OF

126    306    475115    Ross, Tyson    P
126    376    425487    Wellemeyer, Todd    P

127    284    502082    Chisenhall, Lonnie    IF
127    372    541618    Montero, Jesus    C

131    31    425794    Wainwright, Adam    P
131    330    120485    Pettitte, Andy    P

133    247    466357    Jose Valverde    1B
133    335    123585    Juan Uribe    1B
133    348    489564    Juan Rivera    IF

134    192    461811    Anderson, Josh    OF
134    196    425883    Willis, Dontrelle    P

fan 133 is Rotochamp, so we can see his problem there.  He used the wrong IDs.

The one with the biggest problems is fan 119, which is PECOTA.  That ordering makes no sense really.  Either I somehow made a mistake in loading, or I received a list that was generated improperly somehow.

I’ll check back to my original email…
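The check used in this comment (players listed by exactly one forecaster who were also drafted by that forecaster) can be sketched as follows. The input shapes are assumptions, and the sample rows borrow IDs from this thread with made-up ballot membership purely for illustration.

```python
from collections import defaultdict

def singleton_drafted(ballots, drafted):
    """ballots: iterable of (fan_id, mlbam_id); drafted: set of the same pairs."""
    listed_by = defaultdict(set)
    for fan_id, mlbam_id in ballots:
        listed_by[mlbam_id].add(fan_id)
    flagged = []
    for mlbam_id, fans in listed_by.items():
        if len(fans) == 1:                     # condition 1: only ONE forecaster
            (fan_id,) = fans
            if (fan_id, mlbam_id) in drafted:  # condition 2: he drafted the player
                flagged.append((fan_id, mlbam_id))
    return sorted(flagged)

# Illustrative data only: which fans listed 346874 is an assumption.
ballots = [
    (102, 346874), (118, 346874),   # the current Juan Uribe, on several lists
    (133, 123585),                  # the ID Rotochamp submitted for Uribe
    (111, 122681),                  # the ID fan 111 submitted for Mike Stanton
]
drafted = {(133, 123585), (111, 122681), (102, 346874)}
flagged = singleton_drafted(ballots, drafted)
```

A player flagged this way is not necessarily a bad ID; a forecaster may simply be alone in rating someone, so the output is a list of candidates to inspect.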


#14    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 14:09

The problem was not mine.  Looking at the original email, here are some names with their order IDs:

204,407793,"Lackey, John"
205,421685,"Harang, Aaron"
206,218596,"Hudson, Tim"
207,150404,"Lilly, Ted"
208,217096,"Zito, Barry"
209,112020,"Carpenter, Chris"
210,136880,"Halladay, Roy"
211,407193,"Lyon, Brandon"
212,276056,"Baez, Danys"
213,424324,"Lee, Cliff"
214,282332,"Sabathia, CC"

First, we notice lots of great pitchers, and they are all pitchers.

Later on:
370,430935,"Hamels, Cole"
371,433587,"Hernandez, Felix"
372,434442,"Howell, J.P."
373,448178,"Jepsen, Kevin"
374,435178,"Johnson, Josh"
375,452657,"Lester, Jon"

Really, the problem is that the list I got hiccuped.  Starting at order #169 all the way to #615, the list is all pitchers.  Indeed, there were no pitchers listed in the first 168 slots.

Basically, the list that I got is not indicative of what PECOTA (or anybody) would properly submit.

Unfortunately, we’re kinda stuck here…
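A long unbroken run of one position, like the pitcher block from order #169 to #615, is easy to detect mechanically. This is a minimal sketch, assuming each submitted list is available as a sequence of position codes in draft order; the sample list is synthetic.

```python
from itertools import groupby

def longest_position_run(positions):
    """Return (length, position, starting order number) of the longest run."""
    best = (0, None, 0)
    order = 1  # order numbers are 1-based, as in the submitted lists
    for pos, group in groupby(positions):
        length = sum(1 for _ in group)
        if length > best[0]:
            best = (length, pos, order)
        order += length
    return best

# Synthetic list shaped like the one described: 168 non-pitchers, then pitchers.
positions = ["IF", "OF", "C"] * 56 + ["P"] * 447
run = longest_position_run(positions)
print(run)  # (447, 'P', 169)
```

What counts as a suspicious run length is a judgment call; a properly interleaved list would rarely put more than a handful of same-position players in a row.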


#15    RotoChampMike      (see all posts) 2011/08/21 (Sun) @ 17:36

My suggestion is to throw out the Pecota entry.  It’s not fair that they finish last with a list that was obviously screwed up.

I’d also change our 3 messed up IDs as it’s pretty clear who we were ranking.  All our improperly labeled players were drafted really late, so their replacements won’t have high point totals. 

I’d also change Mike Stanton on fan_id=111, as it’s pretty clear who he was listing.  This is probably a more significant error, as Stanton was ranked pretty high on the list and got drafted 53/100 times.  Because of the inter-relatedness of a draft, an improper player/pick affects all subsequent picks (and another reason Pecota should be thrown out).  I’d definitely change his ID.

It’s a tough spot for you as you don’t want to set a precedent for cleaning up sloppy lists, but you want to make sure each list is authentic to achieve the highest possible accuracy in the competition.


#16    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 19:15

This is the third year I run this.  The first two years, I did MAJOR cleanup.  No more will I do this. 

The responsibility is for everyone to submit a list with the proper matching IDs.  I gave the list of everyone who played in MLB or MiLB in 2010.  That’s close to 9000 players.  If someone puts the wrong Mike Stanton or the wrong Uribe (wrong IDs of theirs that are NOT on my list, and therefore not my problem), why should I now change it? 

Furthermore, by changing those IDs, I now have to inspect every single other possibility.  This was just ONE test to see who had bad IDs.  How about other tests?

As for PECOTA: if I throw them out, then what about BIS or others who may have had a messed-up list that wasn’t as egregious?

Furthermore, by throwing out anyone, I now have to rerun the draft in its entirety.

I’m being put in an impossible spot here.  The responsibility cannot be mine to clean up the mess of others.


#17    Tangotiger      (see all posts) 2011/08/21 (Sun) @ 20:26

Here’s another example I found without looking much: Jacoby Ellsbury.  20 of the 22 forecasters had him ranked between #8 and #60.

One had him at #204 and another at #631. 

What am I to do there?

Victor Martinez was listed on 21 of the 22 ballots, at no worse than #145.  One forecaster didn’t have him on the ballot at all.

Chase Utley was listed on 20 of the 22 ballots, at between #31 and #327, and two others didn’t list him at all.

I can go through and find 50 to 100 players of questionable ranking to some degree or other.

I can’t start going through and drawing a line at how many I’m going to look at.  This will lead to bias.

In the first two years, I went to extraordinary lengths to actually look at all the ballots in this detail.

This year, I decided I can’t be responsible for the errors of others.


#18    LJB      (see all posts) 2011/08/22 (Mon) @ 00:13

So PECOTA can’t even be trusted to make a list right.


#19    Tangotiger      (see all posts) 2011/08/22 (Mon) @ 11:44

LJB: I think it’s more that the list generation was likely rushed last minute.  Over half the forecasters submitted their lists in the 24 hours leading to opening pitch.

***

For those interested in last year’s results:

http://www.tangotiger.net/forecast/results2010.html

There’s not that much correlation, though I suppose someone out there can run the numbers for us. 

If it was me, all I’d say is: use Consensus of the pro forecasters.  As an alternative, use Fangraphs Community and/or Marcel.  Anything more, and, well, I just don’t see the value-added.


#20    Geoff Buchan      (see all posts) 2011/08/22 (Mon) @ 13:07

My forecast was the one with Ellsbury at #631, but that’s a failure in my model, not of any submission error.

I recall sending in my list shortly before the deadline, basically just re-running my algorithm and sending the output without further spot-checking of particular players or numbers. But my model indeed did project Ellsbury for little playing time (overreacting to his injured 2010, no doubt), and thus low cumulative totals, so #631 was where I had him.

It’s best to rely on people’s submitted lists, warts and all. If I’m entering a competition, it’s my job to see that I’m putting forward my best submission.

One question on Marcel: how did it determine playing time? If it were simply a function of the previous 3 years’ stats, it should have had Adam Wainwright ranked quite highly (a monkey system being ignorant of his season-long injury), yet he was #1650 on the list.

I know in my case, I made a rough guess for players with expected injuries and reduced their playing time accordingly. (de facto zeroing out Wainwright, Strasburg, and a few others). Otherwise I followed an algorithm to determine playing time.


#21    Tangotiger      (see all posts) 2011/08/22 (Mon) @ 13:31

Marcel used this:

http://www.tangotiger.net/survey/

And it’s possible that other forecasters did as well.


#22          (see all posts) 2011/08/22 (Mon) @ 14:00

Looks like my Rotoworld entry (No. 122) had Osvaldo Martinez in Victor Martinez’s place.  That stings, but it was definitely my fault.


#23    Geoff Buchan      (see all posts) 2011/08/22 (Mon) @ 15:01

Tango@21 - Thanks. I just tried to rescale Ellsbury’s numbers in my own projections to use those estimates, and that moved him into the top 40 in my draft list, in the range of most others.

I noticed those projections for innings pitched are too high in aggregate. I suspect plate appearances are, too, but that’s not easy to determine at a quick glance.


#24    Tangotiger      (see all posts) 2011/08/22 (Mon) @ 15:22

It’s really irrelevant if they are too high, as long as they are too high in the same proportion for pitchers and hitters.


#25    Geoff Buchan      (see all posts) 2011/08/23 (Tue) @ 10:15

I agree - but I’d add that you also want the “errors” to be random and not systematic.

If players of some sort tend to be overestimated more than others, that might introduce some small bias.

The average per-team IP forecast was 1553, and the average PA forecast was 6491 - with IP ranging from 1408 to 1683 and PA ranging from 5949 to 7011.

I compared these to the baseball-reference per-team averages for 2010: 1443.5 IP, 6184 PA. The average innings estimate was 7.6% too high, and the PA estimate was 5.0% too high.

But the PA forecasts don’t include pitchers batting, which in 2010 accounted for a bit more than 200 PA per team on average (across all MLB teams). Adjusting for this, I estimate the PA forecasts are about 8.3% too high, actually a little more than the IP forecasts.

So while there is variation between team totals, it looks like both IP and PA estimates are inflated a similar amount.
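The arithmetic in this comment can be reproduced in a few lines. The flat 200 PA for pitchers batting is an assumption standing in for "a bit more than 200", and subtracting it from the actual total is only one way to make the adjustment, so the adjusted figure lands near, not exactly on, the 8.3% quoted.

```python
def pct_over(forecast, actual):
    """Percentage by which a forecast exceeds the actual value."""
    return (forecast / actual - 1) * 100

ip_forecast, ip_actual = 1553, 1443.5   # per-team averages from the comment
pa_forecast, pa_actual = 6491, 6184
pitcher_pa = 200  # assumption: flat per-team PA by pitchers in 2010

ip_over = pct_over(ip_forecast, ip_actual)
pa_over = pct_over(pa_forecast, pa_actual)
# Compare the forecast against actual PA by non-pitchers only.
pa_over_adj = pct_over(pa_forecast, pa_actual - pitcher_pa)

print(f"IP: {ip_over:.1f}%  PA: {pa_over:.1f}%  PA (non-pitcher): {pa_over_adj:.1f}%")
# IP: 7.6%  PA: 5.0%  PA (non-pitcher): 8.5%
```

Either way the conclusion holds: once pitchers batting are accounted for, IP and PA are inflated by a similar amount.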


#26          (see all posts) 2011/08/23 (Tue) @ 10:22

I’m just relieved I’ve improved from my first season, and I already identified my main ranking flaws. Definitely a final review would have paid off, but like many others, I finished at the last minute. It’s a shame about the mistakes out there, but I wouldn’t expect Tom to process any mulligans, especially in an exhibition with only bragging rights at stake.

