THE BOOK--Playing The Percentages In Baseball


Monday, November 30, 2009

Dan Fox’s MITT

By Tangotiger, 10:53 AM

I didn’t see this article from July.  It gives a very high-level view of what Dan is doing with the Pirates.


#1          (see all posts) 2009/11/30 (Mon) @ 11:55

Tango -

The Giants have a player database like this?  Are there any other mentions in the press to support that claim?


#2    Brian Cartwright      (see all posts) 2009/11/30 (Mon) @ 11:59

The article mentions doing name lookups. When I saw Dan at the BP event at PNC in September, I asked him about doing Query by Example ‘show me all the players who are above average fielders at 2b who are also above average baserunners who are also above average homerun hitters’ and then present links for each player the query returns. His response “Absolutely!”
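A query of the kind described above could look roughly like the following. This is only a sketch: the `Player` fields, the sample rows, and the league-average home run rate are all hypothetical, and nothing here reflects how MITT actually stores or evaluates players.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: str
    fielding_runs: float      # hypothetical: runs vs. average at the position
    baserunning_runs: float   # hypothetical: runs vs. average on the bases
    hr_rate: float            # home runs per plate appearance


def query_by_example(players, position, league_hr_rate):
    """Return players at `position` who are above average in all three areas."""
    return [
        p for p in players
        if p.position == position
        and p.fielding_runs > 0
        and p.baserunning_runs > 0
        and p.hr_rate > league_hr_rate
    ]


# Made-up sample rows, for illustration only.
players = [
    Player("Player A", "2B", 4.0, 1.5, 0.035),
    Player("Player B", "2B", -2.0, 3.0, 0.040),
    Player("Player C", "SS", 6.0, 2.0, 0.045),
    Player("Player D", "2B", 1.0, 0.5, 0.020),
]

matches = query_by_example(players, "2B", league_hr_rate=0.027)
names = [p.name for p in matches]
print(names)
```

In a real system each returned name would presumably link to the player's full page, as the comment suggests.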


#3    Peter Jensen      (see all posts) 2009/11/30 (Mon) @ 12:43

“In total, we have some level of information on about 85,000 players,” Fox said.

A little misleading, probably from reporter error.  I too have a database with some level of information on about 85,000 players thanks to Sean Lahman.  Most of those players won’t be of much help to the Pirates.

I met Dan at Pitch f/x in July.  He is a very smart man.  I am sure that MITT is state of the art and that however many players are in it the Pirates can easily retrieve any information they desire.  The question is, why doesn’t every MLB team have such a database?  Or perhaps they do and MITT isn’t really news except in Pittsburgh.


#4    David Cameron      (see all posts) 2009/11/30 (Mon) @ 12:56

Very few teams have something like this.  They are starting to get more popular, but there are a bunch of teams who just don’t care, or won’t budget for it.


#5          (see all posts) 2009/11/30 (Mon) @ 12:59

Peter/3, I would be shocked if your last assertion isn’t correct. Today every team has to have a system like this. How could they not? They know other teams have these systems, so it only makes sense they would be at a similar place in the learning curve or at least be trying to catch up and replicate it.*

Cartwright/2, Did you get to ask Dan what kind of fielding metrics he uses for such a query? I assume the Pirates have their own scouting data. But do you think Dan created a metric as well?

*Unless it is a team like the Mets who seem to be inept when it comes to modern analysis. :(


#6          (see all posts) 2009/11/30 (Mon) @ 13:02

Dave/4 Really? Does that surprise you?

My only guess would be that if they existed people like yourself, Tango, etc. would know the people running the systems. There aren’t that many saber-minded people out there. I assume collectively everyone knows who has jobs with major league teams.

I have to assume most teams are creating something like this, or at least something saber-based that will be helping them progress.


#7    David Cameron      (see all posts) 2009/11/30 (Mon) @ 13:34

Yeah, I just can’t get my head around how far behind the curve so many teams are.  I did a contract gig for an MLB team (not the Mariners) this summer to provide them with historical minor league data.  They didn’t have a database with the same information available on the baseball cube or b-ref. 

And this is a relatively smart team.  They at least recognized that they needed to play catchup, and were doing something about it.  But, yeah, I was pretty surprised that they didn’t already have that at a minimum.


#8    BenJ      (see all posts) 2009/11/30 (Mon) @ 13:53

I have worked with a “non-sabermetric” team that has a similar system (I forget what they call it) that houses basic historical stats and all of their scouting reports on particular players, from the amateur levels through MLB.  If you read the descriptions of MITT or the Indians’ “top-secret” system, it’s not much different.  I imagine most teams’ IT departments have set up some sort of central database of scouting reports at this point. 

Now, there’s a huge difference between having this software and making good decisions.  The team I worked for was using batting average and RBIs (combined with their scouting reports) to make decisions.


#9    Brian Cartwright      (see all posts) 2009/11/30 (Mon) @ 13:56

Peter/3 - I have a register of 35,000 pro players from 1998-2009, plus another 30,000 or so college players 2002-2009.

I can imagine that there are 85,000 professional (mlb affiliated and foreign) college and high school players of note who have played over the past two seasons.

JD/5 - The question I referred to was only on the ability to find players based on criteria. Fox had previously developed Simple Fielding Runs (SFR) while at BP, and from other conversations with him I am pretty sure what he has now is an extension of that, not something he started from scratch while at the Pirates. He allowed BP to continue using his base running metrics after he left, but not SFR.

I’ve had varying levels of contact with three teams. One is obviously Dan Fox, who I’ve corresponded with since he was at BP. A second recently asked me if I wanted to apply to be a db administrator. They appear to see the need, but don’t yet have what they want in place. Another has a smart guy with a growing set of data. I’ve sent him some of my info, and he’s given me some assistance in return. No money from any teams yet, but I’m working on it.


#10    Rally      (see all posts) 2009/11/30 (Mon) @ 14:36

When I saw the 85,000 players I assumed he meant college and high school prospects.  While such a system can be helpful, I think I could make reasonable decisions about current professionals without one.  I’d just need some info about service time/contract, his B-Ref page, and recent scouting reports.

This could be really, really useful when trying to decide who to draft, or which 16-year-old Dominican is worth a million-dollar bonus.


#11          (see all posts) 2009/11/30 (Mon) @ 14:48

Brian, thanks. I guess SFR was before my BP subscription started, so I probably missed it. It looks like Fox released SFR in 1/2008 and then left just a few months later.

Good luck with the money gigs. I’m sure they will come. I’ve read most of your stuff on FG and BP and it is great stuff, IMHO.


#12          (see all posts) 2009/11/30 (Mon) @ 15:12

Rally/10, I’m not sure that an 85,000 amateur player database is realistic. There are about 1500 kids drafted each year and 200 international FA signed (estimated). Even if draftees were put into the system in 9th grade and all international players were put in at 13, I doubt one would get close to 85,000 players. Especially players that they had an intent to pursue.
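The back-of-the-envelope estimate above can be worked through as follows. The per-year counts come from the comment; the tracking windows (eight years for draftees, four for international signees) are assumptions chosen to be generous, and the sketch counts only players who are eventually drafted or signed.

```python
DRAFTEES_PER_YEAR = 1500       # from the comment
INTL_SIGNEES_PER_YEAR = 200    # from the comment (estimated)

# Assumptions, not from the comment:
DRAFT_TRACKING_YEARS = 8       # 9th grade through a college senior year
INTL_TRACKING_YEARS = 4        # entered at 13, signed at 16, inclusive


def tracked_pool(per_year, years_tracked):
    """Players in the system at once if every yearly cohort is tracked."""
    return per_year * years_tracked


draft_pool = tracked_pool(DRAFTEES_PER_YEAR, DRAFT_TRACKING_YEARS)
intl_pool = tracked_pool(INTL_SIGNEES_PER_YEAR, INTL_TRACKING_YEARS)
total = draft_pool + intl_pool

print(draft_pool, intl_pool, total)  # 12000 800 12800
```

Even under these generous windows the total is under 13,000, well short of 85,000.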

Obviously MITT is a secret. But let’s assume that Fox is trying to quantify intangibles and scouting data into a database like Brian mentioned.

Just a silly guess of information I would want to evaluate.

Managing - Managing the below information in a database.
Information - Contract, age, height, weight, etc.
Tools - applied talent/stats, i.e. BB% K% ISO, k/TBF etc.
Talent - 20-80 scale current/future ability
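The categories listed above could be sketched as a simple record type. Every class and field name here is hypothetical and exists only to illustrate the grouping; the one outside fact relied on is that scouting grades conventionally run from 20 to 80.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Info:
    """Basic facts: contract, age, height, weight, etc."""
    name: str
    age: int
    height_in: int
    weight_lb: int
    contract: str


@dataclass
class ToolGrade:
    """Scouting grade on the 20-80 scale, current and projected."""
    current: int
    future: int

    def __post_init__(self) -> None:
        for grade in (self.current, self.future):
            if not 20 <= grade <= 80:
                raise ValueError(f"grade {grade} is outside the 20-80 scale")


@dataclass
class PlayerRecord:
    info: Info
    stats: Dict[str, float] = field(default_factory=dict)      # e.g. BB%, K%, ISO
    grades: Dict[str, ToolGrade] = field(default_factory=dict)  # e.g. "hit", "power"


# Made-up example record.
record = PlayerRecord(
    info=Info("Example Prospect", 19, 74, 195, "unsigned"),
    stats={"BB%": 0.10, "K%": 0.18, "ISO": 0.150},
    grades={"hit": ToolGrade(current=45, future=60)},
)
print(record.grades["hit"].future)
```

Keeping performance stats and scouting grades side by side in one record is the point of the exercise; how the two would be weighted against each other is left open.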


#13    Rally      (see all posts) 2009/11/30 (Mon) @ 15:29

The 1500 draftees are just a subset of all the college and high school players that a scout will look at.  My guess is that every college player has a stat line in MITT.  They can get this by looking at each school’s website, links to which are organized on Boyd Nation’s website, or by leaving the dirty work to others, and purchasing from Jeff Sackmann’s Collegesplits.com.  I believe he’s provided data to about half of MLB’s teams.

Not sure if you’d have (or even want) high school stats, but I assume that for every player looked at by a scout, a report finds its way into the database.


#14    KY      (see all posts) 2009/12/01 (Tue) @ 15:38

IMO, it is embarrassing ... EMBARRASSING .. that multi-million or billion dollar corporations do not all have this stuff down pat already.


#15    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 15:47

They are not multi-million dollar corporations.  They are 30 small businesses.

What should happen is that these 30 small businesses should hire someone (I’m available!) to create a database for everyone to use.

It’s shocking that STATS and BIS can find themselves 30 markets to buy their services, when MLBAM can do it for them.


#16    weskelton      (see all posts) 2009/12/01 (Tue) @ 17:28

One reason that some teams would not want a common database is that it would dilute the potential competitive advantage of having something that the other guy doesn’t.  Basically, if everybody has the same database, then there’s no advantage to be had from having that database.  I suspect that more teams have their own databases than we here know about.  It’s just not public knowledge.


#17    weskelton      (see all posts) 2009/12/01 (Tue) @ 17:36

I’d also suggest that it’s not just a database we’re talking about here.  The true value lies in having a tool (like MITT) that can synthesize all of the data in the database.  These days, just having the data isn’t enough.  Heck, there’s a whole boat load of people around here that have pretty good databases and most of us don’t have any budget at all.


#18    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 17:42

wes: that kind of thinking would have stopped MLBAM from offering PITCHf/x.

And STATS has “one” database that all the teams can access (for a fee).  The differentiator is still what each team does with that data.

Nothing is stopping each team from adding their own database layer on top of this central repository.


#19          (see all posts) 2009/12/01 (Tue) @ 18:10

#1) To agree with Tango, it seems to me that if you take out the Sox and Yankees (and a cool billion for running MLB, MLBAM, etc.), you likely have 28 businesses that each have on average $200M in revenues, minus the “pass-through” costs of player salaries, say $100M per team.  That leaves essentially a “normal” $100M company.  Typically a $100M company is something that is usually going to run a bare-bones business.

#2) Also, as mentioned, there are two important things.  First, there’s the basic public data and a database architecture.  Second, there’s the company’s private data, performance metrics, etc., plus customized tools and reports.  Similar to something like an SAP ERP system, there’s a market for developing a flexible system like SAP, but then another market for a specialized company (or the team’s internal IT group) to build the specific tools/reports that can be added onto the base system.


#20    KY      (see all posts) 2009/12/01 (Tue) @ 20:17

The Pirates aren’t a multi-million dollar business?  Maybe I’m ignorant, but prove it please.


#21          (see all posts) 2009/12/01 (Tue) @ 20:50

Huh?  I assume you mean multi-million on a revenue basis, correct?

And yes, the Pirates are a multi-million dollar business.  I don’t think anywhere we have said that they weren’t.  All I’ve said is that by most standards companies that are less than $100 Million in revenues are really not that big.  My point was that if you assume that the player salaries are pass-through costs, many of the smaller teams are $50M - $100M companies, and that is actually not that big.


#22    KY      (see all posts) 2009/12/01 (Tue) @ 22:32

#15/Tango said:

“They are not multi-million dollar corporations.  They are 30 small businesses.”


#23    Mike Fast      (see all posts) 2009/12/01 (Tue) @ 23:07

MLB clubs will argue over a few thousand dollars spent on analysis that would help them save a multi-million dollar investment in a player.  Any of the folks here who have worked with the clubs will confirm that.

What club do you know that employs and pays more than one recognized sabermetric expert (not low-paid interns)?  They can be counted on one hand.

So don’t be surprised that so few teams have built comprehensive information systems (although the list of such clubs in the Trib article was not exhaustive).


#24    weskelton      (see all posts) 2009/12/02 (Wed) @ 01:33

Tango/#18

I’m not quite sure what I said that implied the kind of thinking that would have stopped MLBAM from offering pitch f/x.  Pitch f/x exists primarily for the satisfaction of fans that watch the games via Gameday.  The fact that MLBAM has agreed to leave the raw data available to the public (as well as encourage its exploration) is simply a bonus. 

And yes, STATS (and BIS) have data that everyone can access for a fee.  The strength of both of these organizations is in the collection and dissemination of data.  They are merely a source of data for an MLB team, same as Gameday.  I’m sure that STATS gets the lion’s share of its revenue from providing data to the media and that MLB teams are a small piece of their pie.  I suspect that BIS has a different model though.

Bottom line though… The data is there and it’s not prohibitively expensive.  It’s also probably not that hard for a team to get that data into a structured data warehouse.  And yes, a team would also want to add in their scouting data.  The real trick remains in making the data available through a decision support system that will allow them to answer the questions they need to answer.  Data by itself is not enough.

Now if MLBAM is to provide the actual decision support system to all teams, that’s where the competitive advantage is eroded.

Just curious, what % of teams does everyone think have the type of system that MITT represents?


#25    Mike Fast      (see all posts) 2009/12/02 (Wed) @ 01:59

Just curious, what % of teams does everyone think have the type of system that MITT represents?

I’d guess somewhere between 4 and 8 teams, depending on where exactly you’d draw the similarity line.  Toward the lower end of that range if you’re talking about something as completely comprehensive as what Dan has done with the Pirates or Keith with the Indians.


#26    David Cameron      (see all posts) 2009/12/02 (Wed) @ 02:36

No disrespect to Woolner, but the Indians system was in place before he got there.  I saw it ~5 years ago, and it was really impressive.  I’m sure Keith has made it even more so. 

But, yeah, the number of teams who have something like Diamond View can be counted on one hand.


#27    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 08:02

"They are not multi-million dollar corporations.  They are 30 small businesses. “

If we are going to get picky, then I presume by “multi-million” you are not intending to mean “at least 2 million$”.  I mean, you said:

“that multi-million or billion dollar corporations do not all have this stuff down pat already”

which literally taken means between at least 2 million$ and at least 1 billion$.  That’s quite a range.  I took it to mean that you meant it as 100million$ plus.

And that’s not what MLB is.  As others have noted, the salaries of players (and GM and player development costs) are “pass-through”.  They don’t count.

What they try to control as expenses is akin to what a small business would do.  Say, for example, a RE/MAX with 20-30 employees.  Something along those lines.  Small businesses that will cut costs to the bone to make sure they can save 100$ here and there, looking for Priceline.com deals, etc.  I got this distinct impression based on someone else who worked for a team (not the ones I’m involved with).


#28    Rally      (see all posts) 2009/12/02 (Wed) @ 14:41

“They are not multi-million dollar corporations.  They are 30 small businesses.”

Tango, just replace “million” with “100 billion” and put your pinky in the corner of your mouth.


#29    KY      (see all posts) 2009/12/02 (Wed) @ 15:14

Tango, thanks for the explanation.  I know where you are coming from, but I’ll stand with my original opinion that I still think it is an embarrassment that so many clubs don’t have a system like this.  How else do they make a decision about trading, releasing, signing players?  No wonder so many of them make crappy decisions.  What’s the phrase?  Penny-wise pound-foolish?


#30          (see all posts) 2009/12/02 (Wed) @ 15:43

#29 KY) On that I would agree with you.  If you put a relatively low number out there, say that a system like this would cost $250K - $500K, it seems apparent for most of these organizations that if it stops you from making one bad minor personnel decision per year (I’m going to use the paying of Mike Jacobs $3M/year when you likely would have gotten the same production from Kila Ka’aihue), you are likely saving at minimum $2M - $3M/year, let alone something on the level of the Guillen/Farnsworth blunders.  We’re likely talking about a 10:1 ROI in the first year.
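For a rough sense of the range behind that ratio, the figures in the comment can be plugged in directly. The helper below is only illustrative, and it treats "ROI" loosely as annual savings divided by system cost, which is an assumption about how the ratio was meant.

```python
def savings_to_cost(annual_savings, system_cost):
    """Ratio of first-year savings to the cost of the system."""
    return annual_savings / system_cost


# Figures from the comment: cost $250K - $500K, savings $2M - $3M per year.
low = savings_to_cost(2_000_000, 500_000)    # least favorable pairing
high = savings_to_cost(3_000_000, 250_000)   # most favorable pairing
mid = savings_to_cost(2_500_000, 375_000)    # midpoints of both ranges

print(f"{low:.1f}:1 to {high:.1f}:1, midpoint about {mid:.1f}:1")
```

On those numbers the ratio runs from 4:1 to 12:1, so 10:1 sits toward the favorable end of the range, and the system pays for itself in the first year even in the least favorable case.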

All I was saying is that I work with a lot of $100M - $1B companies, and getting them to put up the cash for a simple $20K analysis (let alone a $500K solution) to determine potential benefit is difficult.


#31    Rally      (see all posts) 2009/12/02 (Wed) @ 16:34

(I’m going to use the paying of Mike Jacobs $3M/year when you likely would have gotten the same production from Kila Ka’aihue)

They don’t need a system to tell them that.  They could have looked the two players up on baseballprojection.com for free and known that.

What they need is decision makers that understand how to use the information available.  Otherwise you can have the greatest system in the world and it won’t do you any good.


#32    Tim Kniker      (see all posts) 2009/12/02 (Wed) @ 18:03

#31) Fair enough, that one may have been a bit blatant.  The one thing about having a “system” is that once that investment is made, sometimes there is more of a willingness to accept the recommendations (others will stone-wall though).  Fact is that many times it’s not the information that helps the decision-makers but the packaging of that information.  With a better side-by-side system that has both the projections and the scouting reports, it’s easy to see when the two are in agreement, and that packaging of the information helps make better decisions.  To me that sounds like the more powerful thing about MITT, more so than just the proprietary data/calculations that are in it.

