THE BOOK--Playing The Percentages In Baseball


Tuesday, February 05, 2008

Tango On Demand

By Tangotiger, 04:30 PM

I’m in transition in my sports work.  You get to set my schedule for the next several weeks.  Let me tell you what I’m in the middle of, and you can decide if you want me to work on something else.  In no particular order:



  1. Community Forecast, 2007.  I’ve been putting this off since I’ve seen the data input job from a small group of people.  Depressed at people putting in OBP numbers instead of OPS.  Some people putting in OPS+ numbers.  This is totally my fault.  I should have stuck with drop-downs, and asked people to select based off that.  To think I was going to process over one thousand ballots with free-hand text in a completely automated fashion was foolish of me.  Anyway, I’m working on this now.  I really don’t want to put this off any longer, as it haunts me.
  2. Hockey Scouting Report, 2008. Similar to the Baseball Fielding one, but this is based on the NHL.  NHL trading deadline is last week of Feb, and I like to run this halfway between that date and the last game of the season (first week of April).  NHL post-season is long (16 of 30 teams), so it could pick up steam.  I’ll target mid-to-late-March, 2008.
  3. Clutch Project, 2008.  This should be a real snap to set up.  And it requires so very little effort from the reader.  I have to get this done before Apr 1, obviously.
  4. Fantasy Forecast, 2008.  I was going to write a full article explaining how to value players in fantasy dollars.  I’ve already laid it out on my blog last year, but I just want to tighten it up.  It would likely be preferable that I do this before Apr 1.  Right now, I’ve got a bit of motivation for this.
  5. College Scouting Report, 2008.  I’m actually already set up, and I am just waiting for first pitch.  I’m doing this as a favor to an MLB team, who are determining the viability of this.  The problems are obvious (hundreds of teams, and maybe a couple of ballots per team, if any) compared to MLB (30 teams, a few dozen ballots per team).
  6. With Or Without You, Retrosheet Fielding Database.  I’ve got soooo much I can do with the Retrosheet data (as you glimpsed from the THT08 Annual), maybe finally putting together a really cool fielding database (online or as a download).
  7. Baseball Database.  In similar spirit, the Baseball Databank is in need of expansion, what with all the split data available to us.  Basically, everything I did in The Book, and more, should be shoved into a database.
  8. Wiki.  I don’t know if there’s much more that I need to do.  Patriot has taken the bull by the horns, and I encourage all to participate.  It’s very easy.  Go to the page that interests you, and click EDIT.  That’s it.
  9. Tangotiger Archives.  Something I’ve been meaning to do for years.  A summary of everything I’ve done.  But, I am so not motivated for this.  I prefer just plugging along, not stopping and gathering the remains.  I really like the Moneyball quote of James: He prefers leaving an honest mess, rather than a tidy lie.  Basically, I hope that whatever I write is enough without needing to flesh everything out.  I wrote The Book, and that was rather exhausting at the end, with all the typo checks, and making everything come together.  I don’t know if I want to go through something like that again.  Perhaps with the Wiki, I might be able to put it in there.  Maybe.
  10. Quantum Leap or Wiseguy?  Prepare critical pieces as I noted on my blog here:
    http://www.insidethebook.com/ee/index.php/site/comments/quantum_leap_or_wiseguy/
    This is also a good candidate for the Wiki.
  11. PITCHf/x.  I’m late to the party, but the partygoers are doing such a fantastic job, I don’t know if it’s worth the bother at this time.
  12. Anything else?
  13. Compare BIS, STATS, HitTracker.  I’m hopeful in getting my hands on these three sources, so we can compare to see how the systems correspond to each other.
  14. Ichiro and UZR.  Is there a scorer bias at Safeco in RF?
  15. Fans Scouting Report Database.  Get the 5 years of data all together, and do more with it, like Walsh did on the arms.

Just put down in your comments the number(s) from above that interest you, and I’ll take it under advisement.
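
To illustrate the free-hand ballot problem from item 1, here is a minimal sketch of how an entry might be flagged by magnitude alone. The function name `classify_entry` and the cutoffs (below .500 reads as an OBP, .500 up to 2.0 as an OPS, 20 and up as an OPS+) are assumptions made for this example, not the rules actually used to process the ballots, and borderline values would still need a human look.

```python
import re


def classify_entry(raw: str) -> str:
    """Guess which stat a free-hand ballot entry holds, by magnitude alone.

    The cutoffs below are illustrative assumptions, not actual ballot rules.
    """
    # Pull the first number out of the text, e.g. ".850", "0.350", "115".
    match = re.search(r"\d*\.?\d+", raw)
    if match is None:
        return "unreadable"
    value = float(match.group())
    if value < 0.500:
        return "OBP-like"   # too low to be a typical OPS
    if value < 2.0:
        return "OPS-like"   # the scale the ballot asked for
    if value >= 20:
        return "OPS+-like"  # index scale centered on 100
    return "unreadable"     # between 2 and 20 fits none of the three scales


if __name__ == "__main__":
    for entry in [".850", "0.350", "115", "n/a"]:
        print(entry, "->", classify_entry(entry))
```

A check like this only sorts entries into piles for review. It cannot tell a poor hitter's OPS from a great hitter's OBP, which is why drop-downs would have been the safer design.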

SabermetricsData
#1          (see all posts) 2008/02/05 (Tue) @ 17:36

12.1 A “read more” link in the RSS feed, so I don’t have to click two links to read the entire post + comments.

12.2 Basketball. We need more smart folks. Come on in, the water’s great. Regression to the mean is a topic that needs addressing.


#2    Xeifrank      (see all posts) 2008/02/05 (Tue) @ 17:52

4,5,12a,12b

12a. Park factors primer
12b. Similarity score primer
No hockey please!!! :)

vr, Xei


#3    Tangotiger      (see all posts) 2008/02/05 (Tue) @ 17:59

Xei/2: even if I wanted to, I couldn’t… I’m Canadian.


#4    Los Angeles Waterloo of Black Hawk      (see all posts) 2008/02/05 (Tue) @ 18:20

I find 5 and 6 tremendously exciting.  7 has promise, though I’m not sure what data you’re talking about.


#5    Tangotiger      (see all posts) 2008/02/05 (Tue) @ 18:31

Hawk/4: regarding 7, basically any of the splits data you see at baseball-reference, and more Retrosheet-splits that we can think of, plus merging the Japanese data, and any minor league and Negro League data we can get our hands on.  Plus also including as many sabermetric metrics and counts as we can think of.

The end-result is that there should be little use for Retrosheet event files on a grand scale, because we’ll have captured all the information we want out of it into the database.  Any suggestions I made here, I’d incorporate:
http://www.baseball-reference.com/blog/suggestions


#6    Phil D.      (see all posts) 2008/02/05 (Tue) @ 18:51

The three I would really like to see, in order of preference, are six, two and one.

As far as number twelve goes, permit me to be greedy here. Someone somewhere has to do something with tennis. There are at least 108 ranking systems out there for college football (click my name), but none as far as I can tell for tennis beyond the two tours’ official rankings. Of course, those systems are woefully flawed. The problem with tennis is that there is absolutely nothing out there as far as data to build on, so it’s a huge burden for someone to start.


#7          (see all posts) 2008/02/05 (Tue) @ 18:56

I’d be more than willing to do some bitch-work formatting Community Forecasts if you could get a split database together by the end of the month; it’d be insanely useful.


#8    Patriot      (see all posts) 2008/02/05 (Tue) @ 19:11

#7 is really cool, as is #5. 

On #8, maybe there could be some kind of bribery scheme.  “Nobody gets the 2007 scouting report results until there are 100 articles!”...something like that. 

Just kidding, although playing on people’s guilt might not be such a bad thing :D


#9          (see all posts) 2008/02/05 (Tue) @ 19:30

Not really on topic, but since someone already brought it up ... how about eliminating the “read more” entirely?  I hate having to click twice.


#10    T-nasty      (see all posts) 2008/02/05 (Tue) @ 20:06

In order of what I would most like to see done:

1. Community Forecast - it’s a great idea, it’s something I did at my team’s forum. Are we talkin just having a ballot where people input their projections for players’ OPS, or the 3 split stats? That’s what I did, and it’s worked well. I’m excited to see how well fans’ forecasts correlate with actual performance.

6. WOWY fielding. This system is awesome. More work should definitely be done.

3. Clutch project


#11    tangotiger      (see all posts) 2008/02/05 (Tue) @ 20:11

Hmmm… the “read more” is done intentionally, so that you can navigate through the main page without going through too many clicks.  That is, the tradeoff is the two clicks for reading a long post against all the extra page-down clicks to go to a particular post further down (which you would do occasionally).

I would actually prefer that YOU change your blog so that it would behave like this!


#12          (see all posts) 2008/02/05 (Tue) @ 20:36

I’d personally like to see anything related to hockey.  Not even so much as the number-crunching, but the theory.  What data is out there, what matters, etc.  And if it’s really possible to have an intelligent analysis of players beyond goals, assists, and +/-, with the data available.


#13    Rally      (see all posts) 2008/02/05 (Tue) @ 20:51

I don’t get the two clicks complaint.  I never hit the read more anyway, just click on comments.  It gets you the whole post, and the comments.


#14          (see all posts) 2008/02/05 (Tue) @ 21:22

@6: If you’re reading through a feed reader (or at least, through my feed reader) you don’t see a comments link. You just see a “read more” link, which shows the full post, but you have to then hit a “comments” link to read the full post+comments. This two-click dance step has bugged me so much that I actually downloaded two other feed readers just to see if I could avoid at least one of those clicks, and neither helped. I know this is a minor point, but it really has bugged me in the past.

(I am reading this through GreatNews, one of the others I tried was RSS Owl, and I forget the third. Are you reading this through a feed reader? Because I would love to know which one allows a single-click -> full post+comments.)


#15          (see all posts) 2008/02/05 (Tue) @ 21:23

Of course I meant @13 not @6.


#16          (see all posts) 2008/02/05 (Tue) @ 22:02

You have no idea how much the Retrosheet Fielding Database will help. It might be the first step in my dream of a CAD database, which you acknowledged would be a bear.

Two ideas:

Groundball/Flyball. In The Book, there’s a chart of wOBA by breakdown, showing that like-against-like favors the pitcher, but it doesn’t show a breakdown of how. Is it singles or extra base hits, or strikeouts?

Hit-and-Run: How good a strategy is it?


#17    John Beamer      (see all posts) 2008/02/05 (Tue) @ 23:44

6 and 7 are the two biggies for me


#18    jinaz      (see all posts) 2008/02/06 (Wed) @ 01:56

I’ll echo #6 as something I’d love to see, though Rally’s excellent retrosheet fielding numbers make this not quite as important as it would have been a few months back.

Possible related project idea as a #12: historical position adjustments, based on either your or Rally’s retrosheet fielding data.  I’d be very interested to know the extent to which changes in defensive skill among positions have tracked changes in offensive performance among positions.  You could do it decade-by-decade and have fabulous samples to work from.

This would be a real boon for folks like me who are starting to dabble in historical player valuation.  It’d also be nice just to have the position adjustments calculated on a system other than UZR, in case UZR handles some positions differently than others.  ...  I could do this too at some point, but the night class I’m teaching right now is really killing my baseball analysis time. -j


#19          (see all posts) 2008/02/06 (Wed) @ 20:34

#11- I’d love to see what you can do with the pitch f/x data, you always seem to have useful suggestions for Mike Fast and that crew.

And I personally despise hockey, but I’d be interested in seeing a little bit of the analysis that you do with it (in addition to the scouting report).

@Xeifrank/2:

MGL just did something on park factors, although it seemed rather technical and beyond my interests (I skimmed it).

http://www.insidethebook.com/ee/index.php/site/article/mgl_component_park_factors/


#20    Sky      (see all posts) 2008/02/06 (Wed) @ 21:13

I second JinAz’s request for historical position adjustments.  1, 4, and 6 are also high on my list.  Thanks.


#21    Tangotiger      (see all posts) 2008/02/06 (Wed) @ 22:11

I forgot one. I added #13.


#22    jianfu      (see all posts) 2008/02/07 (Thu) @ 13:20

Count me in on the WOWY love. Great stuff. Thanks.

It’s too bad it’s likely impossible to track down data to study defensive strategies (historically, that is). That is, “no doubles”/guarding the lines, playing the infield in, etc. I’d love to see a “The Book”-like chapter on these tactics.


#23    Tangotiger      (see all posts) 2008/02/08 (Fri) @ 13:28

Added 14 and 15.


#24    JinAZ      (see all posts) 2008/02/08 (Fri) @ 13:32

Ooooo...#15 sounds fantastic…
-j


#25    Sky      (see all posts) 2008/02/08 (Fri) @ 16:24

#’s 13 and 15 would be at the top of my list as well.

I don’t suppose we can demand that you finish all of one through fifteen in the next month?  Nah, didn’t think so.


#26    Wildrose      (see all posts) 2008/02/09 (Sat) @ 13:40

Tom, put me down for more hockey analysis. I’d love to read your thoughts on the subject, some sort of pre-trade deadline or playoff thread would be great.


#27    studes      (see all posts) 2008/02/09 (Sat) @ 16:45

6, 7, 9 and 13.  I get frustrated when I try to find old work of yours but can’t easily.  Perhaps this is something to incorporate into the wiki?


#28    Tangotiger      (see all posts) 2008/02/13 (Wed) @ 22:23

Ed/1: the RSS feed will now give you one-click access to the comments


#29    Tangotiger      (see all posts) 2008/02/14 (Thu) @ 11:31

Ok, based on responses here, the following will be my schedule.  I’ll do my best to follow-through.  (Text in parens refers to original item number at top of blog.)

TIME-SENSITIVE
T1. By Feb 11. College Scouting Report, 2008. (Item 5)
T2. By Mar 07. Clutch Project, 2008.  (Item 3)
T3. By Mar 14. Fantasy Forecast, 2008. (Item 4)
T4. By Mar 21. Hockey Scouting Report, 2008. (Item 2)
T5. By Apr 01. Community Forecast, 2007. (Item 1)

PRIORITY OF REST
P1. With Or Without You, Retrosheet Fielding Database. (Item 6)
P2. Baseball Database. (Item 7)
P3. Primers - Tango Archives, Wiki contributions, Critical Reviews (Items 8, 9, 10)
P4. Compare Hit Location Systems - BIS, STATS, HitTracker.  (Items 13, 14)
P5. Fans Scouting Report Database.  (Item 15)
P6. PITCHf/x.  (Item 11)


#30    Tangotiger      (see all posts) 2008/03/04 (Tue) @ 14:49

This post is mainly for my purposes.

T1 done.
T5 done.


#31          (see all posts) 2008/03/04 (Tue) @ 20:55

Say, Tom, would it be possible to compile traditional fielding stats from Retrosheet by pitcher handedness, batter handedness and base/out? It’s a little late to the game, but it should be relatively easy. A side part of With or Without You.


#32    tangotiger      (see all posts) 2008/03/04 (Tue) @ 21:14

I’m already tracking that in my WOWY, and it’ll be a part of it.


#33    tangotiger      (see all posts) 2008/03/22 (Sat) @ 17:23

Post mainly for my purposes.

T2 done.
T4 Delay to playoffs.

T3 do next.

