THE BOOK--Playing The Percentages In Baseball

Monday, October 26, 2009

Forecasters Challenge 2009 - Who was similar to who

By Tangotiger, 11:23 AM

In the head-to-head drafts, I pitted Marcel against each of the other 21 forecasters twice (Marcel selected first in the first draft, and second in the second draft), for a total of 42 head-to-head contests.  In 40 of those 42 drafts, Marcel ended up with Endy Chavez.  In all 42 of its drafts, ZiPS ended up with Endy Chavez.  We can say, therefore, that Marcel and ZiPS were “similar” in their evaluation of Endy Chavez.  And, naturally, these two were very different from each of the other 20 forecasters when it came to Endy.

For every player, I marked him as “very different” for a pair of forecasters if the number of times one forecaster ended up with him was at least 27 more than the number of times the other forecaster got that player in his 42 head-to-head drafts.  Endy, for example, would count as “very different” when looking at the Marcel/MGL head-to-head or the Marcel/PECOTA head-to-head, but “very similar” when looking at the Marcel/ZiPS head-to-head.  A player was noted as “very similar” if both forecasters ended up with him within 5 times of each other.  Since the gap between Marcel and ZiPS on Endy was 2, that counts as “very similar”.
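To make the two cutoffs concrete, here is a minimal Python sketch of the rule just described. The function name is mine, and treating “within 5” as a gap of 5 or less is an assumption; the post only gives the thresholds of 27 and 5.

```python
VERY_DIFFERENT_GAP = 27  # "at least 27 more", as stated in the post
VERY_SIMILAR_GAP = 5     # "within 5 times"; the inclusive bound is an assumption


def classify(count_a, count_b):
    """Label one player for one pair of forecasters.

    count_a and count_b are how many of each forecaster's 42 drafts
    ended with that player on his roster.
    """
    gap = abs(count_a - count_b)
    if gap >= VERY_DIFFERENT_GAP:
        return "very different"
    if gap <= VERY_SIMILAR_GAP:
        return "very similar"
    return "neither"


# Endy Chavez: Marcel got him 40 times, ZiPS 42 times, so the gap is 2.
print(classify(40, 42))
```

Any gap from 6 through 26 falls in neither bucket, which is why the “very similar” and “very different” counts for a pair do not add up to the full player pool.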

So, which forecaster was most similar to Marcel?  Well, seeing that MGL and ZiPS both used the playing time forecasts of the Community, it’s no surprise that those two matched up highly with Marcel.

Marcel/ZiPS were very similar on 213 picks, while very different on 51 picks.  Marcel/MGL were similar on 211 and different on 54.  Among those that didn’t make any claims to having used the Community Playing Time forecasts, the biggest similarity was Fan #109 (checking to see who that is… Gehringer), with 114 very similar matches and 51 very different.

But, I’m more interested in who was most dissimilar to Marcel+Community.  I have to figure this forecaster’s system is the most novel.  Chone comes in with 65 very similars and 190 very differents.

When I look at all the pairs (22 x 21 = 462 pairs of head-to-head), the average number of big and small differences is about 100 players each.  When I look only at Marcel, the average number of big and small differences with the other 21 forecasters is about 110 each.  This tells me that the other 21 forecasters are more similar to each other than to Marcel.
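For anyone who wants to reproduce this kind of tally, here is a rough Python sketch under a few assumptions: the forecaster names and counts are made up, the inclusive bounds are my reading of the thresholds, and I only count players that at least one of the two forecasters has an entry for. The post does not say how the actual tally was coded.

```python
from itertools import permutations

BIG_GAP, SMALL_GAP = 27, 5  # thresholds from the post; inclusive bounds assumed


def pair_tallies(counts):
    """Map each ordered pair (a, b) to (big, small) difference counts.

    counts[f][p] is how many of forecaster f's 42 drafts ended with player p.
    """
    tallies = {}
    for a, b in permutations(counts, 2):
        # Assumption: only players listed for at least one of the two count.
        players = set(counts[a]) | set(counts[b])
        gaps = [abs(counts[a].get(p, 0) - counts[b].get(p, 0)) for p in players]
        big = sum(g >= BIG_GAP for g in gaps)
        small = sum(g <= SMALL_GAP for g in gaps)
        tallies[(a, b)] = (big, small)
    return tallies


def average_for(forecaster, tallies):
    """Average (big, small) counts for one forecaster against all others."""
    rows = [t for (a, _), t in tallies.items() if a == forecaster]
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


# Hypothetical toy data: three forecasters, three players.
counts = {
    "A": {"p1": 40, "p2": 0, "p3": 20},
    "B": {"p1": 42, "p2": 30, "p3": 10},
    "C": {"p1": 2, "p2": 28, "p3": 12},
}
tallies = pair_tallies(counts)
print(tallies[("A", "B")], average_for("A", tallies))

# With 22 forecasters there are 22 x 21 ordered pairs.
print(len(list(permutations(range(22), 2))))
```

Ordered pairs double-count each matchup, which matches the 22 x 21 = 462 figure above; using combinations instead would give 231 unordered pairs with the same averages.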

How about the most famous system in our group, that being Baseball Prospectus’ PECOTA? 


They are actually fairly similar to all the other systems, with an average of 100 players each in the very different and very similar groups. Which system is the most similar to them?  Would you believe Steamer (Fan 218), which is a group of high school students?  They have 88 big differences, and 93 small ones.

Marcel is the most different from PECOTA, which is fairly shocking to me.

Finally, which pair of forecasters is the most similar?  That would be our top two finalists: Rotoworld and Hanson.  They had only 14 players that were very different, and 201 players that were very similar.  Hanson did rely somewhat on Rotoworld’s published playing time forecasts.  The next closest pair of forecasters was MGL and ZiPS, with 26 very different and 205 very similar.  Those two used the identical playing time forecasts.  After that, it was Marcel/ZiPS followed by Marcel/MGL.  All three systems used the same playing time forecasts.

The most dissimilar pair in the non-Community division was FeinSports and Cairo, with 148 big differences and 71 small ones.

Here are the overall “similarity scores”, which show the average number of big and small differences, along with the total absolute differences, for each forecaster compared to the other 21 forecasters.  Marcel is Fan ID 217.  You can find the rest of the matching key on the main website.

FAN_ID    big    small      tot
218        78      100    7,468
110        74      101    7,546
109        77      100    7,701
122        71      104    7,715
101        77      100    7,892
116        73      104    7,927
115        93       94    7,986
121        94       94    8,130
112        85      104    8,148
119       100       96    8,400
105       103       98    8,413
108       111       98    8,436
120        92      107    8,587
204       100      106    8,613
207       109       95    8,682
113       100      107    8,688
203       102      108    8,748
217       106      111    8,834
111       110       95    8,902
106       128       80    9,073
102       114      101    9,146
214       123       94    9,394

#1    J. Cross      (see all posts) 2009/10/26 (Mon) @ 12:38

Interesting.  We’ve started to look at correlations in OPS and ZiPS and CHONE are looking like similar systems while Marcel and Steamer projections look similar to each other (although Marcel looks like it had more success).  Sporting News and Pecota don’t seem that similar to any of the other systems we looked at (and didn’t do very well this year).

Maybe distinctions in playing time turn out to be more important in determining fantasy value.  We’ll check that out.

We’ll have much more on this coming up.


#2    Tangotiger      (see all posts) 2009/10/26 (Mon) @ 12:57

I’d love for you guys to do an analysis of this:
http://tangotiger.net/survey/

(Click the “All players” link at the bottom.)


#3    J. Cross      (see all posts) 2009/10/26 (Mon) @ 13:18

We’re definitely going to look at who projected playing time accurately (Marcel used those numbers for playing time, right?) and additionally, I think, who projected total fantasy value accurately.

btw, when do the community playing time forecasts come out?  That could prove the way to go for next year.  Any interest in splitting the community projections for pitchers into starting/relief innings and projecting saves?


#4    Tangotiger      (see all posts) 2009/10/26 (Mon) @ 13:40

The note on the page said Mar 25, which was actually too early.  I got a ton of submissions really fast, and had I known it was going to be that popular, I would have waited a few more days, maybe a week, until rosters were finalized.

As for the pitchers, I asked the readers this way:
200+ (Workhorse)
175-199 (Regular Starter)
150-174 (Miss several starts)
100-149
60- 99 (Typical Reliever)
30- 59
1- 29 (Callup)

The one thing I could add is a flag for “closer”.


#5    J. Cross      (see all posts) 2009/10/26 (Mon) @ 13:48

What do you think about asking for “% of team’s save opps” since that’s just a managerial decision? 

Fans could then signify which guys they think will lose the closer role or might claim it later in the year.  Then Marcel (or any projection system) could figure out the total number of team save opps as well as a pitcher’s chances of blowing a save and project saves for each pitcher.
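As a back-of-the-envelope version of that idea, here is a small Python sketch. The function name and all three input numbers are hypothetical, and multiplying opportunities by share by conversion rate is just one simple way to combine them, not anything Marcel actually does.

```python
def project_saves(team_save_opps, share_of_opps, blown_save_rate):
    """Projected saves = team opportunities x pitcher's share x conversion rate.

    team_save_opps:  projected save opportunities for the whole team
    share_of_opps:   fraction of those the fans expect this pitcher to get (0 to 1)
    blown_save_rate: pitcher's projected chance of blowing any one opportunity
    """
    return team_save_opps * share_of_opps * (1.0 - blown_save_rate)


# Hypothetical closer: 60 team opportunities, 80% share, 15% blown-save rate.
print(round(project_saves(60, 0.80, 0.15), 1))
```

A pitcher expected to take over the role midyear would simply carry a smaller share, say 0.3 instead of 0.8.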


#6    Tangotiger      (see all posts) 2009/10/26 (Mon) @ 13:55

Not a bad idea.... I’ll consider it.


#7    jinaz      (see all posts) 2009/10/26 (Mon) @ 22:02

No question that real-world knowledge of projected playing time is a huge part of success here.

As far as systems using some kind of algorithm to forecast playing time, though...is CHONE showing up the best of those, at least in the head-to-head competitions?  It looks that way to me, though I don’t know all of the systems.
-j


#8          (see all posts) 2009/10/27 (Tue) @ 01:22

My team suffered in the PT department this year. I had what seemed like an abundance of midweek injuries, had a bunch of guys get dropped in the order, and at the end of the year I had just plain ol’ crappy players. If I prorated my stats to league average playing time, I would have gained ~5 points. If I put my PT equal with the league leader, it would have meant 12 points! Needless to say, I am going to spend a lot more time and energy on PT next season.

One thing I’d like to see with the comparison of forecasters is how the systems did if they had projected the PT perfectly. For example, I don’t want to punish a system for being overly high on Rickie Weeks because he only got 162 PA. Catastrophic injuries are random as are journeymen veterans becoming full time players (Ryan Ludwick, Ben Zobrist).

Also, in the real world of fantasy baseball, owners of a player who suffers a major injury will still get production out of that position. For example, I had Weeks, but my 2b gave me a total line of .255/75/15/62/6. That’s not nearly as good as expected from Weeks, but still much better than his actual production.


#9    Eric Hanson      (see all posts) 2009/10/27 (Tue) @ 08:23

"Hanson did rely somewhat on Rotoworld’s published playing time forecasts.”

That is true, especially if you replace the word “somewhat” with the phrase “very heavily for players with limited major league experience”.

There are a number of other reasons I would be similar to Rotoworld as well, but I am surprised we were that similar . . . even though I don’t have enough of a feel for this particular metric to know how similar that is.


#10    Tangotiger      (see all posts) 2009/10/27 (Tue) @ 09:18

I go back to the MGL/ZiPS/Marcel match.  The only thing common about those three is that they (likely) use 3 years, maybe 4 years, of data, and weight the most recent seasons more (probably).  But, this is likely true of ALL forecasting systems.

The commonality among the three is using the identical playing time forecasts.  And the result is that, after Rotoworld/Hanson, they are the most similar to each other.

So, I don’t know what the exact link is between Rotoworld/Hanson, but it seems apparent (to me anyway) that playing time forecasts are the great equalizer.  That is really the number 1 thing that a forecaster wants.

And, in a matter of two days, I was able to get playing time forecasts from the community, with almost no effort on my part.  To me, this is the revelation.


#11    Eric Hanson      (see all posts) 2009/10/27 (Tue) @ 10:30

I use three years with the exception that if a hitter (under a certain age) has an outlier to the upside more than three years old I use that as well.  The rationale being that many chance things can depress performance for periods of time while relatively few will boost it. 

On another note, you could isolate the playing time by removing players below an AB/GS/G threshold from the draft pool.
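A minimal sketch of that kind of filter in Python, assuming each player record carries a role and his actual AB, GS and G. The cutoff values and field names are made up for illustration; no specific thresholds were proposed.

```python
# Hypothetical cutoffs: which stat to check for each role, and its minimum.
THRESHOLDS = {
    "hitter": ("ab", 300),
    "starter": ("gs", 20),
    "reliever": ("g", 40),
}


def in_draft_pool(player):
    """Keep a player only if he cleared the playing-time bar for his role."""
    stat, minimum = THRESHOLDS[player["role"]]
    return player.get(stat, 0) >= minimum


# Made-up records, only to show the shape of the data.
players = [
    {"name": "Hitter X", "role": "hitter", "ab": 520},
    {"name": "Hitter Y", "role": "hitter", "ab": 140},
    {"name": "Starter Z", "role": "starter", "gs": 31},
]
pool = [p["name"] for p in players if in_draft_pool(p)]
print(pool)
```

Rerunning the head-to-head drafts on the filtered pool would then compare forecasters mostly on rates rather than on who guessed the injuries.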


#12    J. Cross      (see all posts) 2009/10/27 (Tue) @ 13:09

The Steamer projections simply generated rate stats and then used the PECOTA projected playing time to generate counting stats, which explains the similarity to PECOTA in this competition.
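In code, that rate-times-playing-time step is about this simple. This is only a sketch: the per-PA rates and the PA total are invented, and the real Steamer spreadsheet may well handle it differently.

```python
def counting_stats(rates_per_pa, projected_pa):
    """Turn per-PA rates into counting stats for a given playing-time forecast."""
    return {stat: rate * projected_pa for stat, rate in rates_per_pa.items()}


# Hypothetical hitter: 4% HR rate and 10% BB rate over a 500 PA forecast.
rates = {"HR": 0.04, "BB": 0.10}
line = counting_stats(rates, 500)
print({stat: round(value, 1) for stat, value in line.items()})
```

Swapping in a different playing-time source only changes the second argument, which is why two systems that share playing time end up drafting so similarly.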

I’m hoping that the community projected playing time proves more accurate than the PECOTA playing time (or the ESPN playing time or the Sporting News playing time, etc.) so that it becomes the standard playing time to use and future competitions will be about how best to project rates.


#13    Tangotiger      (see all posts) 2009/10/27 (Tue) @ 13:24

J., I hope your kids are going to come out with the playing time study soon, comparing the Community to PECOTA to others.  Any timeline on that?  It’s one of the things I was going to do (at some point in the offseason), but I’d love for the kids to get exposure on it instead.


#14    J. Cross      (see all posts) 2009/10/27 (Tue) @ 13:40

Unfortunately these kids are seniors now and in the midst of college applications so it might take them a few weeks to get this done.  They have a bunch of systems with hitter and pitcher projections in an excel spreadsheet with names matched to lahman ID’s so they’re on their way.


#15    Tangotiger      (see all posts) 2009/10/27 (Tue) @ 13:50

Would have been better off to use MLBAM IDs, the closest thing we have to universal IDs.  After all, Lahman/Retro don’t have IDs for rookies, and MLBAM does.

Tell me what I can give them.  I can give them this:
MLBAMid, PA (or IP), RetroID, BDBid

(with the understanding that you will get nulls for pure rookies in the last two columns).  Will that help speed them up?

And, what’s with the college application?  In Canada, it’s nothing like that.  You fill out 3 or 4 pages, you authorize your transcript to be sent, you apply to the two schools in the city you want to live in (Montreal in my case), and you make a decision.  Why the heck are they putting kids through what should be a stress-free decision?


#16    J. Cross      (see all posts) 2009/10/27 (Tue) @ 15:14

Thanks and, yes, you’re right.  I just downloaded a spreadsheet you’d previously linked that has MLBAM ID’s, retro ID’s and BDB’s.  The BDB’s look like lahman ID’s so I think I might switch us over to MLBAM ID’s. 

The college application process is ridiculous.  At least some of the schools have a common application now, but for other schools they need to write a new essay or two for each school they apply to… and these kids apply to far too many schools.

