THE BOOK--Playing The Percentages In Baseball


Monday, February 04, 2008

The future of sabermetrics

By Tangotiger, 04:01 PM

The pinnacle of Sabermetrics is the convergence of performance analysis and scouting observations.  To that end, the future of sabermetrics will be the processing of the pitch-by-pitch data.  So, a very micro-analysis.  Bill James looks at the answer to the question from a very macro-perspective:

League-perspective decision making. Looking at decisions based from the standpoint of the league. Simple example: the wild card....

I know why what I’m saying is a candidate for the future of sabermetrics.  I don’t know why what he says is.  That’s not to say that he’s wrong, but I just don’t see what he’s seeing.

I was bothered by this statement, especially in conjunction with a later statement where he says he doesn’t keep up with what’s around, other than Retrosheet:

Then we created “profiles”… which contain all kinds of information about the teams and the players that you don’t have any other way of knowing, at least now; of course other people will rip us off, and the same information will be appearing on other sites in a matter of months.

I really wish he wasn’t so forceful about his statements here.  Especially when he’s wrong.


#1    Mike Fast      (see all posts) 2008/02/04 (Mon) @ 18:06

I guess I quit reading and started skimming by the point he was talking about his new website, and I missed the part about other people ripping him off. 

Bill James is one of my heroes, but it’s sad to read his stuff these days.  His piece in the Hardball Times annual just wasn’t up to snuff with the rest of the material.  If he would swallow his pride (or whatever the problem is) and keep up with the research in the rest of the field, maybe he would still be able to contribute at a high level.  As it is, he puts out second-rate material that can be found anywhere and barely passes for analysis, certainly not at the level he used to do, and thinks other people are ripping him off.


#2          (see all posts) 2008/02/04 (Mon) @ 18:36

In my opinion, he’s never been a particularly impressive analyst - I mean, he’s top 1%, or .1%, but with the internet these days, it’s so easy for anyone and everyone to get their ideas and thoughts out there.  So he’s in a sense competing with anyone who can figure out how to make a blog; years ago, he was competing with anyone who could publish.  Lots of good ideas, sure, but I think he’s a better writer than he is an analyst.  He might be top .01% of people at writing - communicating intelligent ideas in a way that a sizable portion of the population can understand and enjoy.  It’s things like the Cecil Fielder jab (“other foot on the scale”), or his explanation on clutch hitting (“we’re supposed to believe that they’re better people than us”, etc) that I think differentiate him from the analyst with the 160 IQ that we’ve never heard of.


#3    Mike Fast      (see all posts) 2008/02/04 (Mon) @ 19:11

I agree that a lot of the joy I got out of his Abstracts came from the quality of his writing rather than the depth and accuracy of his analysis.

I still love reading his material about historical baseball, be it his Historical Abstracts or his contributions to the Neyer/James Guide to Pitchers.  He can weave a baseball tale like the best.

However, the quality of his writing has always been undergirded by the quality of his insights and analysis.  His analysis was good enough to inform his writing in positive fashion, sometimes even in an amazing or breakthrough fashion.  It seemed like every page he wrote contained an idea that I’d never thought of before or posed a question in a way that was both blindingly obvious and completely beyond my grasp before he articulated it.

I just don’t see that level of insight from him into modern-day baseball, and his writing suffers as a result.  It’s like if Ron Luciano tried to write stories about baseball today without having been behind the plate in two decades.  It might still be a little funny, but it wouldn’t have that WOW factor that it had when the insight and the writing ability were both firing on all cylinders.


#4    Matt Mitchell      (see all posts) 2008/02/04 (Mon) @ 20:56

My thoughts on James the man come down to 4 words: A Godfather of Sabermetrics. Yes, his comments about not keeping up with the research mean he may be out of touch, but he still gave those of us who try to analyze the numbers a great foundation to work from. I like Neyer’s tree analogy here, where most of the low-hanging fruit is picked, and we’re having to climb higher to find the fruit.

As for his thoughts on league-perspective analysis, you can look at most of the analysis done by many and see how it’s all been in relation to players and teams and how they optimize their chances of winning. Heck, THE BOOK was precisely about exploring those ideas. But now put yourself in the office with Bud Selig in New York. Does anything from the field of sabermetrics help you do your job in managing and finding ways to improve the operations of the league? No. That’s what Bill wants to see if he can figure out.

And I’m not paying for his site unless someone shows me why I should.


#5          (see all posts) 2008/02/05 (Tue) @ 02:22

It’s an issue about which one could do research. One could define what constitutes a “meaningful game” in a pennant race, or, more probably, three or four levels of significance in competition, simulate a league competition 100,000 times one way and 100,000 times the other, and figure out whether you have more meaningful games played one way or the other. It might be that we’re doing it right; it might be that we’re doing it wrong. Nobody really knows.

Why hasn’t this been done? It hasn’t been done because there is no general understanding that it can be done or no confidence that the research would reach an accurate result. We’re in the process now of building confidence in our work process, up to the level at which people naturally think to ask for our input.

It strikes me as being rather easy to do that ...
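
One way the kind of study described above could be sketched, purely as an illustration. The definition of a “meaningful” game used here (a late-season game played by a team within `window` wins of the last playoff spot), the spread of team strengths, the 14-team league, and every function name are assumptions made up for this sketch, and the default number of simulated seasons is kept far below 100,000 so it runs quickly:

```python
import random


def simulate_season(n_teams, n_playoff_spots, n_rounds, rng, window=3.0):
    """Simulate one season; return the number of 'meaningful' team-games.

    Assumed definition: in the final third of the season, a team's game is
    meaningful if the team is within `window` wins of the last playoff spot.
    Use an even n_teams so every team plays in every round.
    """
    # Assumed talent spread, clamped so the win probability is always defined.
    strengths = [min(0.7, max(0.3, rng.gauss(0.5, 0.05))) for _ in range(n_teams)]
    wins = [0] * n_teams
    meaningful = 0
    late_start = 2 * n_rounds // 3

    for rnd in range(n_rounds):
        if rnd >= late_start:
            # Win total of the team currently holding the last playoff spot.
            cutoff = sorted(wins, reverse=True)[n_playoff_spots - 1]
            meaningful += sum(1 for w in wins if abs(w - cutoff) <= window)

        # Random pairings for this round.
        order = list(range(n_teams))
        rng.shuffle(order)
        for i in range(0, n_teams - 1, 2):
            a, b = order[i], order[i + 1]
            sa, sb = strengths[a], strengths[b]
            # Log5-style chance that team a beats team b.
            p_a = sa * (1 - sb) / (sa * (1 - sb) + sb * (1 - sa))
            if rng.random() < p_a:
                wins[a] += 1
            else:
                wins[b] += 1
    return meaningful


def average_meaningful(n_playoff_spots, n_seasons=200, seed=0):
    """Average count of meaningful team-games over many simulated seasons."""
    rng = random.Random(seed)
    total = sum(
        simulate_season(14, n_playoff_spots, 162, rng) for _ in range(n_seasons)
    )
    return total / n_seasons


if __name__ == "__main__":
    # Compare two hypothetical formats: 1 playoff spot vs. 4 playoff spots.
    print("1 spot :", average_meaningful(1))
    print("4 spots:", average_meaningful(4))
```

Whether one format comes out ahead depends heavily on how “meaningful” is defined, so the definition is the first thing to vary.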


#6    tangotiger      (see all posts) 2008/02/05 (Tue) @ 09:01

I just don’t see why this is the “future”.  It might be interesting in and of itself, as most things in sabermetrics are.  Other than that, I don’t see it.


#7    Mike Fast      (see all posts) 2008/02/05 (Tue) @ 09:55

I don’t see it as the “future”, either, and when I hear Bill James say that nobody has done something, I reflexively begin to think that there must be an existing body of research on the topic that James is summarily dismissing.  In fact, I really think I saw somebody write on this topic in the past year, but I can’t remember who it was.

I think it’s going to be tough to find an objective answer to the questions James proposes unless you want to accept the fact of increasing attendance as evidence that the current setup is better than the old setup.  What constitutes a “meaningful” game and whether one type of meaningful game is better than another is going to be substantially in the eye of the beholder.


#8          (see all posts) 2008/02/05 (Tue) @ 12:58

You could quantify meaningful games with some sort of pennant leverage index .... all the theory has been done by Tango. I think that Fangraphs may have something on this


#9    Peter Jensen      (see all posts) 2008/02/05 (Tue) @ 13:32

I think Bill James thinks this is the future of sabermetrics because he thinks this is something that he might have a paying client for in MLB.  I think he is conceding that there is too much freely available information on players and teams and too many analysts (and perhaps better analysts) chasing a limited amount of client dollars in the player and team area.  And because the player and team area has been studied so much in the past, any gains made in this area by new methods of analysis are likely to be relatively minor.


#10    Rally      (see all posts) 2008/02/05 (Tue) @ 14:18

Count me in those who fight the future. 

I don’t care one bit about any league wide decision that would make baseball a more popular (profitable) game.  That’s more money I’d have to spend on tickets, rising prices for MLB TV, more people to wait in line behind just to get a beer…

No thanks.  I want baseball to be successful, all the current teams except the Red Sox to continue to function, but if half the baseball fans (the non-hardcore ones) all of a sudden stopped showing up and went off to do other things, that would be just fine with me.

I could get my beer and dinner from Boog’s barbeque without missing a single pitch.


#11    Mike Fast      (see all posts) 2008/02/05 (Tue) @ 14:19

I don’t really agree with the idea that the value of sabermetrics is defined by what the clubs or MLB want to pay for.  That’s merely one facet of sabermetrics, a facet that’s typically on the trailing edge rather than the leading edge of what can be done.  The leading edge is far more fascinating to me.

I also don’t agree that player and team analysis has been nearly exhausted.  PITCHf/x is one example of something that is blowing the doors off previous analysis.  Bill James and Gary Huckabay may not give a rip about it because they can’t turn it into dollars in their pockets, but they are missing the boat for the real future of sabermetrics when they do so.  I’m not saying that PITCHf/x encompasses the whole future of sabermetrics, but it is an example of something that is ten times more exciting and full of possibility to change the game than the stuff James is talking about.


#12          (see all posts) 2008/02/05 (Tue) @ 17:46

To add to what some of you said…

If George Washington were to step into the presidency today he’d be lost, but that doesn’t change his status as one of our greatest presidents. I think that unless The Goldmine is hugely successful, it could very well be his final book.


#13    Tangotiger      (see all posts) 2008/02/05 (Tue) @ 18:10

I don’t think it will be his final book, because James is, above all, a fantastic writer.

***

The Kalk article on trying to get a handle on Bartolo Colon is what PITCHf/x was made for:
http://www.hardballtimes.com/main/article/anatomy-of-a-player-bartolo-colon/

The article had no graphics, which is a shame.  I’m an old comic book reader, and I have that as my crutch.  In any case, this is what people need to remember: regression is the most important concept in sabermetrics.

All data is nothing but sample data.  A sample of their true rate (plus sampling bias).  In order to get to a true rate, you need to regress that sample data toward some population mean.  But, what population mean?  Well, the population from which you draw your sample.

In most cases, that would mean all MLB pitchers.  But, what if you know more?  Like the guy still throws 95mph?  That his pitches still have a lot of movement?  Well, that’s a much smaller (and better) group of pitchers.  That gives you a new regression point.

If I take a quick look at Marcels, I see that Colon’s best comps are Jon Lieber, Brett Tomko, Freddy Garcia, Mark Hendricks, and Rodrigo Lopez.  That is based on their OUTPUTS, their BB, K, HR, ERA numbers.

But, if I could also control for their fastball speed, their breaking ball movements, their release points, etc, now I’ll get a totally new set of comparison points.  And, it is THOSE pitchers towards whose mean I will regress.
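
As a rough sketch of what changing the regression point does (every number below is made up for illustration, the `regress` helper is hypothetical, and the ballast value is an assumption rather than anything taken from Marcels):

```python
def regress(observed_rate, sample_size, population_mean, ballast):
    """Shrink an observed rate toward a population mean.

    `ballast` is the assumed number of population-average trials blended
    into the sample; a larger ballast means heavier regression.
    """
    weight = sample_size / (sample_size + ballast)
    return weight * observed_rate + (1 - weight) * population_mean


# Hypothetical inputs, for illustration only.
observed_k_rate = 0.150     # strikeouts per batter faced in the sample
batters_faced = 400
all_mlb_mean = 0.170        # assumed mean for all MLB pitchers
similar_stuff_mean = 0.200  # assumed mean for pitchers with similar velocity and movement
ballast = 200               # assumed; the right value depends on the stat

# Same sample, two different populations to regress toward.
vs_league = regress(observed_k_rate, batters_faced, all_mlb_mean, ballast)
vs_comps = regress(observed_k_rate, batters_faced, similar_stuff_mean, ballast)

print(f"regressed toward all MLB pitchers: {vs_league:.3f}")
print(f"regressed toward similar pitchers: {vs_comps:.3f}")
```

The sample is identical in both calls; only the population mean changes, and the estimate moves with it.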

Furthermore, you can extend this to hitting (quickness of bat, plane of bat, power of bat), and eventually to minors and college players.  OUTPUTS are nothing more than the result of INPUTS.  And the inputs are the tools of the players.  It all comes down to the skillset, mindset, and approach of the player.  The scouting trifecta if you will.

That’s why I always say that the pinnacle of sabermetrics is the convergence of the performance and the scouting.  And that’s what PITCHf/x (and eventually HITf/x) offers us.


#14    Tangotiger      (see all posts) 2008/02/12 (Tue) @ 12:40

Bill James:
http://www.billjamesonline.net/ArticleContent.aspx?AID=584&Code=James01001

That, it seems to me, could perhaps be the Next Big Thing in the world of sports research: the development of a field of actual knowledge about what it is that makes a sport work.... Instead, baseball sort of accidentally discovered the league format, which has since been successfully copied by basketball, football, hockey and almost every other team sport around the world. Two points: 1) There is a field of knowledge, somewhere, as to what works and what doesn’t work in organizing a team competition, and
2) Somebody some day is going to make a lot of money by acquiring and taking advantage of that knowledge. 

I think I’m understanding more what James is saying.  I agree that it would be very interesting, similar to the way I figured out how many NHL games would be equivalent to how many MLB, NFL, or NBA games.  But, that’s it.  I don’t think it’s any great research interest, nor that it’s the “Next Big Thing”.  Pitch-by-pitch analysis and the followup of GPS-analysis (4D analysis, the 3D plus time) will be the Next Big Things.


#15    Sky      (see all posts) 2008/02/12 (Tue) @ 15:11

I can see where Bill James is coming from, but it’s a totally different definition of sabermetrics.  It would be nice for MLB to know how adding a second wild card team will affect revenue and at what point too many playoff teams/rounds will cause the regular season to become an afterthought.  The NFL recently made a statement on touchdown celebrations that keeps the focus on the game instead of XFL-type shenanigans.  Many of these league decisions can be modeled mathematically/economically, but I don’t think anyone except the league office really cares.  Teams won’t care (no competitive advantage) and fans won’t care (except when the product improves).  And I don’t see it as being much fun for the novice statgeek.


#16    Tangotiger      (see all posts) 2008/02/12 (Tue) @ 15:47

I think fans might care if it creates some new level of competition.  Say for example that you have a 162 game season with 40 teams, but that after 54 games, only 30 survive, with each surviving team allowed to claim 3 players from the disposed teams.  Then, after another 54 games, 20 survive, and again, 3 more players can be claimed.  At the end, you end up with 8 teams, and go through the regular playoff process.

Or, what if you have a promotion/relegation system, where you have 12 teams in the premier league, another 24 in a second division, 24 more in a third division, etc.

So, James is right that there’s no reason that the current US/Canada setup is the best working one.

Even the playoffs can be changed, similar to what they do with Olympic hockey.  The way it would work is this way: you still have 4 AL and 4 NL teams in the playoffs.  But, the top AL team plays the #4 in the NL.  The #2 AL plays the #3 NL, and so on.  This way, we aren’t subjected to always having a good AL team facing a not-so-good NL team.  You could end up with a Yanks/Red Sox World Series.
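
The cross-league pairing above is simple enough to write down. This is only a sketch: the function name and the placeholder team labels are hypothetical.

```python
def cross_league_first_round(al_seeds, nl_seeds):
    """Pair AL seed k with NL seed (n + 1 - k): #1 vs #4, #2 vs #3, and so on.

    Both lists are ordered from best seed to worst.
    """
    if len(al_seeds) != len(nl_seeds):
        raise ValueError("both leagues need the same number of playoff teams")
    return list(zip(al_seeds, reversed(nl_seeds)))


# Placeholder labels, best seed first.
matchups = cross_league_first_round(
    ["AL1", "AL2", "AL3", "AL4"],
    ["NL1", "NL2", "NL3", "NL4"],
)
for al_team, nl_team in matchups:
    print(al_team, "vs", nl_team)
```

Because every first-round series crosses leagues, two teams from the same league can both keep advancing and meet in the final.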

There’s a whole set of permutations that you can try to model to see what would work in the short and long term.

But, I see this as being fascinating mostly to economists (which I suppose is what James is).  I don’t see it at all as a “Big Thing”.

Pitch-by-pitch and GPS-analysis can reshape the way players are evaluated and selected, along with providing key scouting information.  Basically, taking your favorite video game, and bringing it to life.  I can’t see anything topping that.


#17    Matt Mitchell      (see all posts) 2008/02/12 (Tue) @ 15:52

I think Bill is on to something as another great area for sabermetric research. Is his idea THE “Next Big Thing”? Absolutely not, because it is just one branch. It is A “Next Big Thing”. Pitch-by-pitch is another “Next Big Thing”, especially as more f/x data comes out. THE “Next Big Thing” simply depends on what you’re researching and how you conduct it.

