The Unwritten Book is Finally Written!
An in-depth analysis of: The sacrifice bunt, batter/pitcher matchups, the intentional base on balls, optimizing a batting lineup, hot and cold streaks, clutch performance, platooning strategies, and much more.
If you are a media member and would like a review copy of The Book, please contact Kevin Cuddihy of Potomac Books.


THE BOOK--Playing The Percentages In Baseball


Data

Thursday, August 14, 2008

wOBA online

By Tangotiger, 09:18 AM

Stat Corner is a new site by friends of The Book, Matthew Carruth and Graham MacAree.  It looks like they are focusing on presenting stats you can’t find elsewhere, so that’s good.

Since they are just starting out, and seem to be willing to try new things, I think they’ll be open to suggestions.  Here are mine after first use:
1. The team page should list each player’s special stats for 2008 next to his hyperlinked name.  Right now, the page shows only the hyperlinked names.  There’s no reason for me to go one player at a time through the same team.

2. The team info page should also be present under a league page, so we can see those numbers in context.  Things should flow hierarchically (league, team, player), with information on each page.

3. Make the headings clickable or hoverable, so we can see what they mean.

4. On the Pitch data page, each year should be clickable for further drill-down.  In fact, Matthew at THT had presented such stats in an article.  It’s a great presentation, and I’d hope to see it here some day.

That’s all for now.  Good luck guys!

(10) Comments • 2008/08/14 • Sabermetrics, Data

Tuesday, August 12, 2008

Volunteers for Retrosheet

By Tangotiger, 10:13 AM

SABRMatt speaks:

I thought today, after completing a fourth team for retrosheet.org in its quest to convert the PDF versions of the daily summary pages provided by the HOF into a digital database, that I should speak up here and make my own call for volunteers.

David W. Smith is a generous and brilliant man and his life’s work (in the baseball sense)...to bring high quality data to the masses...has produced tremendous results thus far. The daily summary project is gaining momentum, but there’s really no limit on the number of volunteers they can use over there to get this done in a timely manner.

I should be speaking to the choir when I talk about how critical it is to the future of sabermetric research and to the integrity of our data that we get access to this treasure trove of game by game data that goes back into the 19th century.

I’m asking anyone here who would like to use this data when it becomes available to e-mail David (dwsmith~retrosheet~org, replacing the ~ with the appropriate character) and volunteer your time to enter it for him. You don’t have to spend many hours every day doing this work...you can take more time to do each team...even an hour a day would really help move the project forward (each team takes about 7-8 hours to enter if you’re at all adept with Excel and a keyboard).

(0) Comments • Sabermetrics, Data

Wednesday, August 06, 2008

Bay, Ramirez, WPA

By Tangotiger, 06:18 AM

Leaderboards as of this morning on Fangraphs:

WPA: Batters
Lance Berkman 5.29
Pat Burrell 4.98
Albert Pujols 4.47
Jason Bay 4.42
Manny Ramirez 4.18

Ramirez is also #7 in WPA/LI and Bay is #9.

That is all.

(1) Comments • 2008/08/06 • Sabermetrics, Data, Run_Win_Expectancy

Tuesday, August 05, 2008

Baseball Prospectus Minor League Translations

By Tangotiger, 09:40 AM

I have been fairly harsh toward BP these past several months, but deservedly so in my opinion.  I am nothing if not fair, and so, here are BP’s major league and minor league stats, including MLE and “peak” MLE (what you can expect the player to do if aged toward his peak).  It’s real sweet, so huge kudos to Clay for presenting the work, apparently updated daily.

(6) Comments • 2008/08/05 • Sabermetrics, Data, Minors_College

Friday, August 01, 2008

Lazy assumptions or lazy research?

By Tangotiger, 01:54 PM

Christina:

I know, sabermetric orthodoxy insists that lineup order doesn’t matter; I guess I keep forgetting to drink all of my Kool-Aid, especially when lineup-related research depends on so many lazy assumptions and/or involves redoing some of the same Markov Chain analysis that’s been done for decades, all of which ends up suggesting that… well, that Joe McCarthy or Earl Weaver or Casey Stengel or Bobby Cox are smarter than the models (or the modelers). Consider me a firm believer in the proposition that much of sabermetrics is about the documentation of already-observed phenomenon, and that the best-placed observers did not and do not need sabermetric re-educations, they need to be learned from to create historically-informed sabermetrics.

If Christina has read The Book, I am annoyed.  If she hasn’t, then she is as lazy a reader as she claims the researchers are.  And since most of The Book in fact documents empirical (i.e., real-life) data, satisfying her vision of sabermetrics, then Christina should be one of the biggest vocal supporters of The Book.

Hat tip: FifthOF

(13) Comments • 2008/08/02 • Sabermetrics, Data

Tuesday, July 08, 2008

Minor League Database

By Tangotiger, 02:09 PM

I don’t remember if I ever posted this, so here goes (maybe again):

http://minors.sabrwebs.com/cgi-bin/index.php

(0) Comments • Sabermetrics, Data

Wednesday, June 25, 2008

Retrosheet Announcement

By Tangotiger, 09:17 AM

Details:


(9) Comments • 2008/07/01 • Sabermetrics, Data

Monday, June 23, 2008

Mapping IDs

By Tangotiger, 01:38 PM

If you are looking to make a contribution to the world of sabermetrics, here is the perfect little project for you:  Create a mapping table of all player IDs out there.


  • Captain Crunch put out the CBS, MLB.com, NFBC (whatever that is), and BP IDs.
  • The BDB has its own ID (formerly Lahman), plus Baseball-Almanac (Holtz), Retrosheet, and B-r.com.
  • Mike Fast (can’t find it right now) posted the Retrosheet and MLB.com mappings.  (Mike: I found a couple of tiny mistakes.  If you can post it somewhere, I’ll give you the errors.)

So, this is what I would like:
1. Post all your mappings somewhere

That’s it.  Someone, maybe me, will then merge all of them to come up with the (current) definitive list.

Ideally, other sites will be so bothered by the job of mapping everything, as the rest of us are, that they will contribute their mappings of new players in the future to keep this up-to-date.  All those minor league IDs, college websites, Japanese websites, etc., can finally have everything linked up.  Basically the “universal ID”.  Is it possible?  Let’s see…
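For anyone wondering what the merge step might look like, here is a minimal sketch in Python.  It is only one way to do it, not a definitive method: the function name `merge_mappings`, the system labels, and every ID in the example are invented for illustration, and the rule that two records are the same player whenever they share any (system, ID) pair is my assumption.

```python
from collections import defaultdict


def merge_mappings(*sources):
    """Merge partial ID mappings into one record per player.

    Each source is a list of dicts such as {"retrosheet": "x", "mlb": "y"}.
    Assumption: two records describe the same player when they share
    any (system, id) pair.
    """
    parent = {}  # union-find forest keyed by (system, id)

    def find(key):
        # Walk up to the root, halving the path along the way.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    def union(a, b):
        parent[find(a)] = find(b)

    for source in sources:
        for record in source:
            keys = [(system, pid) for system, pid in record.items() if pid]
            for key in keys:
                parent.setdefault(key, key)
            for key in keys[1:]:
                union(keys[0], key)

    groups = defaultdict(dict)
    conflicts = []  # same system, two different IDs for one player
    for system, pid in parent:
        merged = groups[find((system, pid))]
        if system in merged and merged[system] != pid:
            conflicts.append((system, merged[system], pid))
        else:
            merged[system] = pid
    return list(groups.values()), conflicts


# Invented IDs, shaped like the three postings listed above.
crunch = [{"mlb": "100001", "cbs": "c-77", "bp": "b-9"}]
bdb = [{"lahman": "doejo01", "retrosheet": "doej001", "bref": "doejo01"}]
fast = [{"retrosheet": "doej001", "mlb": "100001"}]

players, conflicts = merge_mappings(crunch, bdb, fast)
print(players)
print(conflicts)
```

The third list is what ties the first two together: it shares a Retrosheet ID with one and an MLB.com ID with the other, so all six IDs land in a single record.  Anything that shows up in `conflicts` is a candidate for the kind of tiny mistake mentioned above and needs a human to look at it.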

(25) Comments • 2008/07/08 • Sabermetrics, Data

Thursday, June 05, 2008

Fangraphs: stop me if you heard this one…

By Tangotiger, 10:16 AM

You can now filter based on months, or recent days.

(3) Comments • 2008/06/09 • Sabermetrics, Data

Thursday, April 24, 2008

File listing dump

By Tangotiger, 12:44 PM

The main page of Tangotiger.net doesn’t necessarily list all the files on my site.  It has most, but every now and then, I forget to update it.  Here then is the complete listing:

(0) Comments • Sabermetrics, Data, Web Admin

Wednesday, April 16, 2008

Retrosheet Database, part 1

By Tangotiger, 01:12 PM

With taxes finally filed, and my (now) lack of interest in forecasting systems analysis behind me, I’ll be focusing on building a Retrosheet database.  I was envisioning releasing everything I do all at once.  But, I had second thoughts about that.  I might as well release things as I do them.

So, follow along with me, and you can build your database with me, and we can finally share our SQL code once we finish building it, since we’ll all be using the exact same names for everything.  This is what you have to do to start off:

(20) Comments • 2008/05/05 • Sabermetrics, Data

Thursday, March 20, 2008

Baseball Reference boarding the Leverage and WPA train

By Tangotiger, 01:43 PM

Sean has asked for, and received, my Win Expectancy (WE) and Leverage Index (LI) data for run environments of 3.0 to 6.5. Fangraphs has the exact same data on the exact same terms: free, as long as none of their users has to pay to see the data (via PI).  This offer is open to any information provider under the same terms.

You can see the results in the above link (still in Beta, so glossary not updated).  One interesting presentation he did was to show the win expectancy play-by-play from the perspective of the eventual winning team.  That’s what that “wWE” means (win expectancy of winning team).
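Here is a rough sketch of the flip from one team’s win expectancy to the eventual winner’s.  It assumes the play-by-play stores WE from the home team’s perspective, which is my assumption and not something Sean’s page states; the function name `winner_we` and the sample values are invented for illustration.

```python
def winner_we(home_we, home_team_won):
    """Re-express home-team win expectancy from the eventual winner's side."""
    return home_we if home_team_won else 1.0 - home_we


# Invented home-team WE after each play of a game the road team won.
home_we_by_play = [0.50, 0.42, 0.35, 0.61, 0.20, 0.0]

wwe = [winner_we(we, home_team_won=False) for we in home_we_by_play]
print([round(value, 2) for value in wwe])
```

Seen this way, every game ends at 1.0, and the low point of the series shows how close the eventual winner came to losing.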

Just as cool is the pitching summary, where he shows the result of strikes: contacted, swing&miss, called no swing.  In addition, you get to see the results of the contacted PA by GB, FB, LD. 

The payoff will be when we see this on the players’ split pages, so we can see seasonal and career totals.

(3) Comments • 2008/03/20 • Sabermetrics, Data

Thursday, March 13, 2008

Fangraphs keeps on rolling the new stats

By Tangotiger, 09:32 AM

Want to know what Jake Peavy throws?  Go to the bottom of that page: 59% fastballs, 18% sliders, 11% cutters (CU), 2% curveballs (CB), 11% changeups (CH).  In the last 3 years, he’s thrown 10,000 pitches.  (Minor note: I’d call the cutter CT, as you can easily confuse CU for curve or changeUp.)

(18) Comments • 2008/03/30 • Sabermetrics, Ball_Tracking, Data

Tuesday, February 05, 2008

Tango On Demand

By Tangotiger, 03:30 PM

I’m in transition in my sports work.  You get to set my schedule for the next several weeks.  Let me tell you what I’m in the middle of, and you can decide if you want me to work on something else.  In no particular order:


(33) Comments • 2008/03/22 • Sabermetrics, Data

Monday, February 04, 2008

The future of sabermetrics

By Tangotiger, 03:01 PM

The pinnacle of Sabermetrics is the convergence of performance analysis and scouting observations.  To that end, the future of sabermetrics will be the processing of the pitch-by-pitch data.  So, a very micro-analysis.  Bill James looks at the answer to the question from a very macro-perspective:

League-perspective decision making. Looking at decisions based from the standpoint of the league. Simple example: the wild card....

I know why what I’m saying is a candidate for the future of sabermetrics.  I don’t know why what he says is.  That’s not to say that he’s wrong, but I just don’t see what he’s seeing.

I was bothered by this statement, especially in conjunction with a later statement where he says he doesn’t keep up with what’s around, other than Retrosheet:

Then we created “profiles”… which contain all kinds of information about the teams and the players that you don’t have any other way of knowing, at least now; of course other people will rip us off, and the same information will be appearing on other sites in a matter of months.

I really wish he wasn’t so forceful about his statements here.  Especially when he’s wrong.

(17) Comments • 2008/02/12 • Sabermetrics, Data

Thursday, January 31, 2008

Sabermetric Wiki on Tangotiger.net

By Tangotiger, 02:21 PM

Come one, come all.  I just installed it last night, so the site is very bare.  And I expect to make limited contributions, so this is a call for you guys to do the heavy lifting.  I put up a couple of pages, as has Patriot.

What can you do?  Click the above link, and then click on the main link of the page, and navigate the site.  Once you’ve dipped your toe, put a search term on the left, like FIELDING or PARKS or EQA or whatnot.  Click the SEARCH button.  If there are no hits, then click on the red search word that came back, and you can start creating the page.

Registration is recommended (will be easier for you to track your edits) but not required.  Unless some yahoos derail the project, at which point it will be required.

If someone wants to be in charge of the web design, let me know.

(22) Comments • 2008/06/25 • Sabermetrics, Data

Tuesday, January 29, 2008

Fangraphs, finally

By Tangotiger, 10:36 AM

An improvement on Fangraphs, especially for you cut-and-pasters.  Finally, we can grab ALL the players for a given season in one shot, not 50 at a time.

David has also offered to partner on my Clutch project.  We’re working on the details as we speak.

(0) Comments • Sabermetrics, Data

Friday, January 25, 2008

Sabermetric Links

By Tangotiger, 02:52 PM

Every several months, someone tackles the daunting task of cataloguing sabermetric research.  James Fraser and Sylvain did a great job at one point.  Here’s a new archiver, who also happens to be Friar Forecast, that’s trying to help us out.

In my case, I’ve got my Primer blog archived and indexed on my site (including by individual poster!), as well as the self-indexing of The Book Blog.

Dan Fox labels his posts (and you can see the whole list of labels on the right, after a few page downs), as does Phil Birnbaum (except curiously, he doesn’t have the list of categories… Phil, talk to Dan… you both use blogspot).  Patriot does have a good list of categories, with an in-depth explanation or executive summary throughout.  Baseball Prospectus also has a library of their articles, but it seems to be outdated (none of Fox’s material is in there).  Finally, there’s BaseBoogle (aka Baseball via Google), which searches only baseball analytics sites.  It’s a cool new feature from Google, and I’m glad someone is minding the store on it.  You can see the sites he’s searching on the right.

If there are any other relevant sites, feel free to post them below.

(20) Comments • 2008/01/31 • Sabermetrics, Data

Friday, November 30, 2007

Google Earth meets Retrosheet

By Tangotiger, 02:25 PM

Philip (Flip) Kromer writes:


(0) Comments • Sabermetrics, Data, Parks

Monday, November 05, 2007

Bill James, Online

By Tangotiger, 09:14 AM

In the new Bill James 2008 Handbook, James refers to “Bill James Online”.  If you google that you get: Batter Profiles, among a host of profiles on the left, and other navigation on top.  It’s still in Beta mode, but this is the internet, and “Beta” means “production-unready”.  You can’t hide.  The search button is not working, but Google is.  Try this:
site:billjamesonline.net "fielding bible"
A couple of clicks gets you here and an impressive list of excerpts.  (In the 2008 Hardball Times Annual, I will also have a Jeter fielding article.) You can also register on their site, but, if you can work Google, you can probably get by.  Feels just like it did 25 years ago.

I will say that it probably would have been better if Bill James bought out Sean Forman.

(44) Comments • 2008/05/29 • Sabermetrics, Data