The Unwritten Book is Finally Written!
An in-depth analysis of: The sacrifice bunt, batter/pitcher matchups, the intentional base on balls, optimizing a batting lineup, hot and cold streaks, clutch performance, platooning strategies, and much more.
@Rally #52, Yep. The Yankees had the lowest SD, with TOR and SLN among the other lows.
Would you rather own Joe Mauer Properties or Domonic Brown Properties? (Steve Sommer) —
Tango #11, agreed that the PA and IP "projections" need work. I didn't age them at all, and clearly I should have. I'd also want to re-evaluate my method of coming up with the year-one estimate for each. Also, to be fair, if I had used updated projections instead of pre-season ones, I think Ubaldo, Strasburg, and Wainwright might have been included.
Bullpen Shutdowns and Meltdowns (S&M) (Steve Sommer) —
http://www.fangraphs.com/blogs/index.php/shutdowns-meltdowns/
Daily reliever usage charts (Steve Sommer) —
That one, with the dots colored by handedness, might be my favorite.
Daily reliever usage charts (Steve Sommer) —
Nice graph, Jeff. I like it. Is there any way to call out Soria without connecting the points with a line? The line kinda implies relationships, or hints at data that isn't available. Another thought, maybe not for this graph: for a LOOGY (or even a RH setup guy), for example, plot the player in question plus one dot for the highest leverage of any like-handed reliever and one for the highest overall (independent of hand).
"Those are the little things that I don't think you can see in the box score, ever." (Steve Sommer) —
I think the numbers in #1 are from the run frequency table and represent exactly 1 run. I think the corresponding "at least one run" figures are .231 (1-0.769) for 1st and 2nd and .223 (1-0.777) for 2nd only. I used http://www.tangotiger.net/RE9902score.html
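To spell out the arithmetic: "at least one run" is just the complement of "no runs", which is a different number from the "exactly 1 run" frequency. A minimal sketch, assuming the .769 and .777 zero-run frequencies quoted above; the function and variable names are made up for illustration.

```python
def prob_at_least_one_run(prob_zero_runs: float) -> float:
    """Return P(at least one run) as the complement of P(zero runs)."""
    if not 0.0 <= prob_zero_runs <= 1.0:
        raise ValueError("prob_zero_runs must be between 0 and 1")
    return 1.0 - prob_zero_runs


# Zero-run frequencies as quoted in the comment (hypothetical variable name).
zero_run_freq = {"1st and 2nd": 0.769, "2nd only": 0.777}

at_least_one = {
    state: round(prob_at_least_one_run(p), 3)
    for state, p in zero_run_freq.items()
}
print(at_least_one)  # {'1st and 2nd': 0.231, '2nd only': 0.223}
```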
Poll: Chances for Redsox and Royals to surprise (Steve Sommer) —
Mike, it seems to me that it is less about actually changing the projection (i.e. changing underlying talent) than it is about realized performance around the projection (the error bars of the projection). Personally I see a difference between the two.
WAR for catchers (Steve Sommer) —
If I remember correctly, that's why draw poker died out, right? The fish got beat up too much and the good players had no one's money to take.
Community Forecasts - Playing Time (Steve Sommer) —
I posted the link at a couple of Cardinal sites. http://playahardnine.wordpress.com/2010/03/26/playing-time-survey-the-book-blog/ and http://www.vivaelbirdos.com/2010/3/26/1391745/tangos-playing-time-survey
The Marcels takes on the field (Steve Sommer) —
Or I could have waited a couple minutes for Rally to answer himself :)
The Marcels takes on the field (Steve Sommer) —
Matt #4, it might have to do with playing time projections. I think the consensus was that Rally did rate stats the best but some version of community playing time was best... I could be way off too though as I'm going from memory.
Psst... wanna work for the Cleveland Indians? (Steve Sommer) —
To sort of echo what Mike said in #24: what little experience-related qualifications (SQL, R, etc.) I have, I picked up doing sabermetrics, not getting my degree (Operations Research). Basically, just getting a degree in OR or something similar doesn't particularly make one qualified (i.e. there are LOTS of people in the Saber community more qualified than I am, despite my degree).
ICC or Intra Class Correlation (Steve Sommer) —
Just wanted to chime in and say I've found the discussion so far very educational. It's something I've only peripherally thought of before.
HOF 2010 (Steve Sommer) —
Tango, That might be an interesting poll to re-do given the recent developments...
Mike Silva Chronicles - Part 10: Future (Steve Sommer) —
Eh, I figure it's professional development, right? Spreadsheets, databases, etc.: those are useful skills :)
Lou Whitaker - Best Player in MLB history eligible but not in the hall of fame? (Steve Sommer) —
Bill #2, Patriot makes a good case for normalizing PAs across different eras http://walksaber.blogspot.com/2009/12/caution-on-use-of-baselined-metrics-per.html
Mike Silva Chronicles - Part 10: Future (Steve Sommer) —
I enjoyed reading through the entire series, but especially enjoyed these closing two paragraphs. If anyone ever asks me why I "waste" so much time on this stuff I'm simply going to point them to those two paragraphs.
Javy Vazquez for... that's it? (Steve Sommer) —
I tried to look at playoff probabilities to get a sense of the global part discussed in #20 over at BtB http://www.beyondtheboxscore.com/2009/12/28/1221681/the-braves-off-season-playoff
Thank you Keith Law... and I acknowledge Will Carroll's contribution (Steve Sommer) —
Joel Pineiro is a prime example of the first part of Tango's #40. Hardly any K's, BB's, or HRs, and a FIP of ~3.2 (3.27 actually), good for 14th in MLB.
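To illustrate how a pitcher with few strikeouts can still post a good FIP when walks and homers are also scarce, here is a rough sketch using the commonly published FIP formula. The stat line and the 3.10 league constant below are made-up assumptions for illustration, not any real pitcher's numbers, and the constant changes from season to season. Some versions of the formula leave out HBP, so it defaults to zero here.

```python
def fip(hr: int, bb: int, k: int, ip: float,
        hbp: int = 0, constant: float = 3.10) -> float:
    """Fielding Independent Pitching:
    (13*HR + 3*(BB + HBP) - 2*K) / IP + league constant.

    The default constant is an assumed placeholder, not a real season's value.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical low-K, low-BB, low-HR season over 200 innings.
example_fip = fip(hr=10, bb=30, k=100, ip=200.0)
print(round(example_fip, 2))
```

With these inputs the HR and BB terms nearly cancel the modest strikeout credit, so the result lands close to the constant itself.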
Teixeira, take 2 (Steve Sommer) —
Looks like they fixed the DG problem http://www.fangraphs.com/blogs/index.php/uzr-update-dg
Regressing UZR toward the Fans' Scouting Report (Steve Sommer) —
Tango, MGL, et al., pardon my dense question, but if someone does what Tango proposes in #6 and finds no bias, what then are the pitfalls of regressing toward a population mean where the population (not the actual values, just the population) is determined by the fans? I'd liken it to the example MGL did a little while ago about regressing toward an actual scouting report (I believe he used an "above average" qualifier and "equated" that to runs). Thanks for putting up with a relative newbie!!
Regressing UZR toward the Fans' Scouting Report (Steve Sommer) —
MGL, Thanks for the input, especially that Method I is not correct. That method was what I set out to do at the beginning, but it didn't feel right at the end. My plan was to identify a population whose UZR mean I could regress to, and clearly I chose the level of shortstop as identified by the fans. Unfortunately I neglected to consider the idea of an unbiased population for the reasons you and Steven mentioned. Again thanks for the comments. In my next iteration I will experiment with the speed factor that you mentioned.
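For readers following along, the general idea of regressing an observed number toward a population mean can be sketched as a weighted average. This is a generic illustration, not the method from the article or the one MGL described; every number and name below (the +10 UZR, the 150 games, the +4 group mean, the regression constant of 150) is a hypothetical placeholder. It also assumes the population mean is unbiased, which is exactly the assumption that was questioned in this thread.

```python
def regress_to_mean(observed: float, sample_size: float,
                    population_mean: float,
                    regression_constant: float) -> float:
    """Shrink an observed value toward a population mean.

    The weight on the observation grows with sample size; the
    regression constant is the sample size at which the observation
    and the population mean get equal weight.
    """
    if sample_size < 0 or regression_constant <= 0:
        raise ValueError("sample_size must be >= 0 and regression_constant > 0")
    weight = sample_size / (sample_size + regression_constant)
    return weight * observed + (1.0 - weight) * population_mean


# Hypothetical shortstop: +10 UZR observed over 150 games, in a group
# whose mean is assumed to be +4, with an assumed constant of 150 games.
estimate = regress_to_mean(observed=10.0, sample_size=150,
                           population_mean=4.0, regression_constant=150)
print(estimate)  # 7.0
```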
Regressing UZR toward the Fans' Scouting Report (Steve Sommer) —
Tango, Thanks for the pub!! I have no good reason why I chose the ordinal rank (your presumption was correct) vice the actual score. Good news is I still haven't gotten around to doing other positions, so I can do them using the actual score.
Teixeira, take 2 (Steve Sommer) —
I was trying to do something like what MGL mentions in #7, only using the Fans' Scouting Report in lieu of an actual scouting report. While compiling UZR numbers from FanGraphs, I noticed that the defensive games number appears higher this year than in years past. Does anyone have any insight on that? Thanks.