THE BOOK--Playing The Percentages In Baseball


Tuesday, June 14, 2011

Only use shortcuts if you are in a hurry

By Tangotiger, 07:02 AM

Patriot on James’ Gold book.

***

On a related note:

I worry about the future of Sabermetrics and its appeal to the masses. It seems that much of what I see today is 1 of 2 things. First, something like pitch f(x), where it seems like an attempt to impress a MLB team into a job. Or second, a mathematical exercise: binomial distributions, Bayes, regression, etc. Most of these things are god awful boring. What happened to asking an interesting question, and then figuring out how to study it?
Asked by: rempart
Answered: June 11, 2011

Well. . .I’m boring enough myself; I shouldn’t talk.  It was my theory, in the 1970s, that since sabermetrics could not sustain itself with academic funding, for the field to succeed it would have to speak directly to the public, therefore would need to avoid the kinds of language and expression that are common in academic circles.  I’ve done all that I could to discourage sabermetricians from talking to one another in a way that shuts out the public, but honestly, I don’t know that I have reached a lot of people on this point. 

Reader rempart says “god awful boring”.  Well, I agree, some of it is god awful boring, which is why I don’t read those.  (Anything with regression as the piece de resistance is god awful boring, and I will skim that article.) But, rempart, why do you read those?  I am overwhelmed by the number of interesting questions being asked and the research to support it.

I also don’t accept Bill’s conclusion.  There are tons of great articles that are public-accessible.  There’s no reason all of it has to be, but there are enough saberists out there that do a good job of reaching the public.  I personally don’t try to do that, but sometimes I will try (like the ESPN articles).  And, I hear from more than one person how they prefer that I do NOT try to do that.

***

Then another reader noted in an article:

By: ‘Monahan’
The recent question in “Hey Bill” by rempart concerning the future of sabermetrics helped me recognize how I’ve been following this discussion on both sites.  It simply boils down to… Bill’s explanations are both simpler and clearer.  I certainly respect the work put in by Tango (whose pedigree is unquestioned), but I find his explanations to be inaccessible-- I’m not a mathematician, I’m a baseball fan. While Phil seems able to fully square the two viewpoints, I see one that makes sense to me and another that does not.

This was in reference to this article I wrote.  I responded:

Bill has a different audience than I do.  Bill writes for himself (and whatever readers he wants to reach), and I write for myself (and whatever readers I want to reach, which is smaller than Bill’s audience).  And I am quite content with whoever I happen to reach.  The way I see it, I’m inviting people into my world, rather than going out to the world of others.  If that means I get 10 people, then that’s 10 more than I had a minute ago.

***

We all have different objectives.  Some write for clarity, some write for precision, some write for accuracy, and some write for entertainment.  Some write so we learn, and some write to inspire.  There’s a whole bunch of reasons.

And whether we get paid or not is a huge reason for what and how we write. Patriot in the above expected more accuracy from Bill James, especially since he paid for it.  He doesn’t want a tidy clean mess that he has to clean up (even though we are all actually better off for it).  Monahan however prefers that tidy mess, because he prefers clarity to accuracy.  Bill has often said he simply puts his ideas out there, and they live or die on their own.

Me?  I’m just a caveman.  I’m scared and frightened of cleanliness and regression.  These things confuse me.  But, if you have a chance to use Bayes theorem with a valid prior (or PythagenPat), so we can bypass the shortcuts even if it will overwhelm the reader, you do it.


#1    rempart      (see all posts) 2011/06/14 (Tue) @ 09:08

Tango, my original question was not meant to be a shot across anyone’s bow. Your final paragraph sums me up pretty well too.

“Me?  I’m just a caveman.  I’m scared and frightened of cleanliness and regression.  These things confuse me.  But, if you have a chance to use Bayes theorem with a valid prior (or PythagenPat), so we can bypass the shortcuts even if it will overwhelm the reader, you do it.”

Furthermore, “Well, I agree, some of it is god awful boring, which is why I don’t read those.  (Anything with regression as the piece de resistance is god awful boring, and I will skim that article.) But, rempart, why do you read those?”

The old change the TV channel if you don’t like the program. I get it, yeah that’s right you can do that. That doesn’t mean that you can’t comment on the content and the direction it appears headed. And, my comment was not directed at your work at all.

These days, as you say, I find myself skimming and bypassing a lot of work. The thing is, I either get or figure out enough of these things to understand them, if the article interests me.

To me, one big problem is the lack of clear writing. More concrete, real, understandable examples are needed in the explanations. And the bigger issue is the lack of interesting subjects being discussed. I realize this becomes more difficult as time goes on. I guess that is the challenge for the future, isn’t it? That and being able to reach an audience, and get them to understand it.

I would encourage some to ask themselves if you started with the Math or with an interesting question you are trying to answer.

At the end of the day, sabermetrics cannot grow beyond a certain niche if it becomes too much of an academic pursuit. The masses won’t get it if I can’t sit down with my father (older generation) or my son (younger generation) and explain why wOBA is better than BA, or why Randy Johnson is better than almost anybody. If what is required is a prerequisite stats and programming course to understand the material, it becomes academic. Don’t get me wrong, we need the people who understand these things. But more importantly we need to be able to make these things interesting and understandable.


#2    Tangotiger      (see all posts) 2011/06/14 (Tue) @ 09:32

rempart: I didn’t interpret anything you said as related to me.

But I don’t agree with your last sentence, that we need to make these things interesting and understandable.  That’s only if you intend to reach a wider audience.  To take an extreme example: do we need Alan Nathan to make his work more interesting and understandable?  If he does that, it kinda limits him.  Occasionally, it’s lovely when he does that (like his GUESTus piece shows), but to advance his view so that others pick up the pieces, he has to be more precise in what he does, which is why he writes his tech pieces that are very hard to read for most: some people, those he can inspire, will find what he writes as a launching point.

The Book, to use another example, was written with LOTS of real-life examples.  There are constant associations of the conclusion to real-world players.  But a book affords us plenty of room to do that kind of work.  In an article online, well, you have to cut corners everywhere.  I’m quite happy with the Jeter/ESPN forecasting article I did, but, that will leave plenty of researchers wanting more.

It’s a balancing act, so, you change the channel (great analogy!) when you don’t like something.

***

wOBA says to give less weight to the walk and more weight to extra base hits.  And to not exclude walks in the denominator.  This is obvious, and everyone can accept it.  So, right away, this is better than batting average as a representation of a player’s performance.  Furthermore, I set the weights to 0.7 for walks and 2.0 for HR.

The inevitable question is “why”.  Now, what am I supposed to do there?  I have to talk about run expectancy and I have to talk about linear weights.

It’s unfair to say that, unless you understand the reason for 0.7 and 2.0, the default position has to be 1 for singles and HR and null for walks.  And I have to be clear about it.

The default position must be that a single is not a HR.  And the default position must be that we don’t know the weight relationship between a single and a HR.  That means that batting average is not the default position.

I don’t have to explain why wOBA is better than BA.  I just have to explain that BA is cr-p as a metric of performance.
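To make the weighting argument concrete, here is a minimal sketch in Python. Only the 0.7 for walks and the 2.0 for HR come from the comment above; the weights for singles, doubles, and triples are assumed purely for illustration, the denominator is simplified to AB + BB, and both batting lines are made up.

```python
# Illustrative event weights. Only BB (0.7) and HR (2.0) are taken from the
# comment above; the 1B, 2B and 3B values are assumptions for this sketch.
WEIGHTS = {"BB": 0.7, "1B": 0.9, "2B": 1.25, "3B": 1.6, "HR": 2.0}


def batting_average(line):
    """Hits per at-bat: every hit counts as 1, walks are ignored entirely."""
    hits = line["1B"] + line["2B"] + line["3B"] + line["HR"]
    return hits / line["AB"]


def woba_sketch(line):
    """Weighted events per opportunity, with walks kept in the denominator.

    Simplified denominator (AB + BB); a fuller version would also account
    for events such as hit-by-pitch and sacrifice flies.
    """
    weighted = sum(weight * line[event] for event, weight in WEIGHTS.items())
    return weighted / (line["AB"] + line["BB"])


# Two made-up batting lines: a patient power hitter and a singles hitter.
slugger = {"AB": 500, "BB": 100, "1B": 70, "2B": 30, "3B": 5, "HR": 35}
contact = {"AB": 600, "BB": 40, "1B": 160, "2B": 30, "3B": 5, "HR": 5}

for name, line in (("slugger", slugger), ("contact", contact)):
    print(f"{name}: BA={batting_average(line):.3f}  wOBA-like={woba_sketch(line):.3f}")
```

In this made-up case the singles hitter wins on batting average while the power hitter wins on the weighted measure, which is the point: once a single is not treated as a HR and walks are not thrown away, the ranking can flip.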


#3    rempart      (see all posts) 2011/06/14 (Tue) @ 09:48

I don’t think I was particularly clear. I like things like wOBA, because of their simplicity and use. I can use it to explain to my dad why Mike Schmidt was better than Wade Boggs.

I get your point about Nathan, and people building upon initial research. Perhaps I’m getting old (49), and my frustration or boredom is getting the best of me. I used to see a Sabr article and couldn’t wait to read and apply. Now, there are remarkably few interesting things, like the goalie save% article you did. Great use and explanation by you. Keep up the great work.


#4    Tangotiger      (see all posts) 2011/06/14 (Tue) @ 10:11

rempart: it seems we agree much more than we disagree.  I was also in the same boat that I couldn’t wait to digest the next saber article.

And now, I’m in the position of deciding which one to read or not.  Basically, I’m at the point where there is so much good stuff, that I don’t need to read the bad stuff.  This is actually a good thing!  That I end up missing out on some good stuff because it wasn’t written with me as the intended audience is not that big a deal. 

Like you said: plenty of channels to choose from.

I’m an old man now (over 40).  I’m ok that not everything is written with clarity.  What I still haven’t accepted is things that are wrong.  I’ll accept an article that is too hard to read, but I won’t accept one that is just out and out wrong.


#5          (see all posts) 2011/06/14 (Tue) @ 10:21

I used to see a Sabr article and couldn’t wait to read and apply. Now, there are remarkably few interesting things-like the goalie save% article you did.

Rempart, is it that there are fewer interesting things being published, or that there is so much more available that it takes more effort to sort through it all to find the interesting articles?

For example, there was this, just last week:
http://itsaboutthemoney.net/archives/2011/06/09/what-makes-a-groundball/

If I think back five or ten years, I would have been drooling over the opportunity to read such an article, and I would have studied it over and over until I understood every detail.  Nowadays, it’s just one among many, and I gave it a quick read, noted that it was well done, thought about it for a couple minutes, and moved on to the next thing.  I may return to it to think more about it in the future, but there is such a fire hose of analytical baseball material available that I don’t have to return to it if I want to satisfy my curiosity about the game.

I could read Max’s piece on catcher framing instead, or I could ruminate further on the conversation that Lucas, Josh W., Josh S., and I were having last night on what pitches Roy Halladay throws, or I could think about what Derek wrote yesterday about how quickly batters’ statistics stabilize, etc. And there are 100 things inferior to those that have grabbed at least a piece of my analytic/sabermetric attention over the past week.  Many of them were not worth my time, and I’m wondering if that’s what really bugs you.  But some of them were.

Or do you not find that to be true?  Do you find that it’s only very rarely, once a month or once a year, that you find a piece that really stimulates your thinking and whets your appetite to learn more?


#6    Tangotiger      (see all posts) 2011/06/14 (Tue) @ 10:32

Mike/5 has captured my feelings on this.  There’s a saber-orgy going on, so we can’t lament that we are stuck with Amanda Seyfried because Jessica Alba is outside our grasp.


#7          (see all posts) 2011/06/14 (Tue) @ 10:39

I thought it was interesting how we concurrently wrote very similar things.

I see two threads in what we said.

1. We’re spoiled by how much good stuff is available these days.

2. The signal-to-noise ratio is lower than it used to be.  I imagine some people find this merely annoying while to others it is a huge obstacle to surmount (if they have limited time, interest, or capability to digest stuff).

I think there is a third thread in what rempart said, and maybe what John Sickels wrote last year, and that is…

3. Most of the really good research is less accessible to the casual saber-interested reader who loves the game but doesn’t necessarily understand higher-level statistical techniques or the function of the PITCHf/x system.

I don’t think this bugs Tango or me nearly to the extent that it seems to bug some people, people who are more or less “on our team”, i.e., loving baseball and wanting to understand more about how it works in a logical fashion.

I’m not sure I’m even in a good position to evaluate how true proposition #3 is, and if it’s true, how much of a problem that is.


#8    rempart      (see all posts) 2011/06/14 (Tue) @ 10:54

Mike, that is a reasonable point you make. There is obviously a ton of material to sort through. And, with more material, comes more bad material. But, you also get interesting things to digest. I like things I can play with and use in discussions. For example, I like similarity scores, families, etc. and who this guy reminds me of from the past. Hall of fame talk. Who’s better and why? New methods like Tango’s goalie save% binomial distribution article, and its application in other areas. Say, can it be used with wOBA and PAs to rate baseball players? To determine batting champs? I liked the Indi Won-loss records. I’d like to see a database of WPA won-loss records!
On the flip side, I don’t give a cr-p whether Tim Wakefield’s knuckleball location or his pitch velocity when regressed, taken exponentially means....blah blah blah....If I were going to hit against him or was a Major League team, yeah, I might be interested.

I’m willing to admit that there may be a bit of a generational thing involved. I’ll keep reading what is interesting, and ignore/skim the boring stuff. Hopefully, people will continue to produce material that is for the masses.


#9          (see all posts) 2011/06/14 (Tue) @ 11:08

On the flip side, I don’t give a cr-p whether Tim Wakefield’s knuckleball location or his pitch velocity when regressed, taken exponentially means....blah blah blah....If I were going to hit against him or was a Major League team, yeah, I might be interested.

Rempart, I’m curious about something that perhaps you can answer for me.

I don’t typically do the kind of research that it seems is in your wheelhouse.  On rare occasion I do, like my opening piece for BPro on the chances for future 300-game winners, but that’s pretty much a sidelight for me, both in terms of my personal interests and what I tend to write about for others.  So there is probably a large portion of your interests that I will never address.  Hopefully there are other writers who will.

But I’m curious, do you not care about how pitching works, or why pitchers do what they do?  If it was presented in a way that was more accessible to you, would it fascinate you?  Or is the topic basically one that is on the periphery of your interests, even if an article was written as well as it could be and with you as its intended audience?

I ask because one of the things I think a lot about is how to make the information from PITCHf/x usable--to teams and players, yes; but also especially to fans, those who would be interested if the information was more straightforward.


#10    rempart      (see all posts) 2011/06/14 (Tue) @ 11:40

Here http://www.hardballtimes.com/main/article/pitcher-similarity-scores/
is an article I found interesting: it has a lot of potential, but it is complicated and in need of further explanation. It says at the bottom that there will be more in the next column. I don’t know if that ever happened; I didn’t check.

Anyway, I am interested in how pitching works, yes. I think improving on BJ similarity scores or using a combination of sim scores with pitch f(x) data is interesting. I realize we are limited in going backwards with pitch f(x) data. However, are there any bright ideas out there to use what we have to correlate and convert this new data we have access to with the past? Maybe not. I don’t know.

I hope this makes sense.

“If it was presented in a way that was more accessible to you, would it fascinate you?” So yes is my answer. To me it’s the use of the data.


#11    Matthew Bultitude      (see all posts) 2011/06/15 (Wed) @ 03:12

Mike Fast wrote:

“2. The signal-to-noise ratio is lower than it used to be.  I imagine some people find this merely annoying while to others it is a huge obstacle to surmount (if they have limited time, interest, or capability to digest stuff).”

This is precisely why I hang out on this blog: a better signal-to-noise ratio than anywhere else in the sabr-net.


#12    Guy      (see all posts) 2011/06/15 (Wed) @ 08:37

I’ve done all that I could to discourage sabermetricians from talking to one another in a way that shuts out the public, but honestly, I don’t know that I have reached a lot of people on this point.

Why should sabermetricians strive to use accessible language when “talking to one another?” Do economists do that?  Biologists?  Real estate agents?  Grocers?  Fish and game administrators?  Anyone at all?  Every field has its own language, and can’t function without it.  If you want to fault sabermetricians’ efforts to reach a larger public (among those who choose to try), fine, but this is a ridiculous point.

I’ll always admire James for his contributions.  But nothing has diminished my respect for him like his inability to be gracious toward those who followed in his footsteps.  I know he’s been a mentor to Neyer and Voros, perhaps a few others.  But in general, I find his attitude toward others’ work to be dismissive and condescending (when he doesn’t pretend to be completely unaware of it).  Compared to James, Jorge Posada is handling his decline with grace and humility.


#13    Tangotiger      (see all posts) 2011/06/15 (Wed) @ 09:34

You can’t stifle quality content by criticizing the less-than-effective delivery.

We definitely should stifle poor content, regardless of how effective the delivery.


#14    Colin Wyers      (see all posts) 2011/06/15 (Wed) @ 10:21

Why should sabermetricians strive to use accessible language when “talking to one another?”

And how does Bill James know how sabermetricians are talking to one another?


#15    Guy      (see all posts) 2011/06/15 (Wed) @ 10:39

I assume he means discussions like the ones we have here, Fangraphs, etc.


#16    Colin Wyers      (see all posts) 2011/06/15 (Wed) @ 10:47

I assume he means discussions like the ones we have here, Fangraphs, etc.

Right. Is there any evidence James has read one of these discussions, though?


#17    Tangotiger      (see all posts) 2011/06/15 (Wed) @ 10:58

Seems appropriate parallels can be drawn:

http://www.montrealgazette.com/news/words+heats+between+Parizeau+young+MNAs/4946408/story.html


#18    Guy      (see all posts) 2011/06/15 (Wed) @ 12:03

Though James might say this is the parallel.


#19    Guy      (see all posts) 2011/06/15 (Wed) @ 15:36

Is there any evidence James has read one of these discussions, though?

I think he reads some sabermetric stuff.  Why else would he think to make this criticism?  And he’ll occasionally critique a study/article he disagrees with, suggesting he reads at least some work done by others.

However, James’ approach seems to be that he doesn’t believe something until he proves it to himself, using his own (sometimes idiosyncratic) methodology.  So while some of his work seems duplicative, I don’t think that necessarily means he was always unaware of similar work already done.  (Just to be clear, I’m NOT accusing him of plagiarism.  I just think the only research that “counts” to James is the work he does himself.)


#20    Jimmy      (see all posts) 2011/06/16 (Thu) @ 13:33

Why should sabermetricians strive to use accessible language when “talking to one another?” Do economists do that?  Biologists?  Real estate agents?  Grocers?  Fish and game administrators?  Anyone at all?  Every field has its own language, and can’t function without it.  If you want to fault sabermetricians’ efforts to reach a larger public (among those who choose to try), fine, but this is a ridiculous point.

I completely agree with Guy here. If the purpose of sabermetrics is to enlighten the general public, then yes, simplify the language. But sabermetrics has to progress just like any other field. And as it does, the increased sophistication will require a certain level of complicated terminology.

In my opinion, most sabermetricians aren’t rigorous enough. A lot of times sabermetrics is cast as an offshoot of statistics, but in reality the use of statistics in baseball has a long way to go to catch up with the use of statistics in a lot of other fields. Economics, medicine, biology, you name it. I understand that it’s a game and all, but people are confusing “building cool stats and comparing them” with “statistics.” They are not the same thing. The difference here is exactly what people are saying is “god awful boring”: mathematical sophistication. Maybe it’s boring to you, but maybe that’s because you just don’t understand it. Once you get into it, I think you’ll find that the possibilities opened up by advanced modeling and math are very interesting and go well beyond just building more acronymic stats.

Basically, my point is that sabermetricians should not be afraid to delve into statistical theory to further their insight into baseball. It just pains me to hear people blasting the academic nature of statistics when it is exactly what lies at the foundation of everything we talk about on this blog.


#21    Tangotiger      (see all posts) 2011/06/16 (Thu) @ 13:43

The blasting you mention is how the academicians rely on it without understanding the gaps in it.  That is, subject matter experts will often tell you why the conclusion an academician comes to is crap.

Academicians::regression as media guys::RBIs.


#22    Jimmy      (see all posts) 2011/06/16 (Thu) @ 14:12

“Academic” is not a profession. “Economist” is a profession. “Internist” is a profession. “Plant biologist” is a profession. All these things imply that they are subject matter experts. It just so happens that a lot of these professions also employ statistical methods--some of them more heavily than others--and “sabermetrician” should not be excluded from that category.

What I was originally getting at is that some people seem to be drawing a hard line between statisticians and baseball experts. In my view, sabermetricians are those that bridge the two. You don’t have to give up one in order to get the other. You should have a rigorous grasp of statistics, but you should also understand baseball. It seems to me that a lot of people interested in baseball shun those who try to take a more rigorous approach to analyzing the game, when really they can coexist just fine (like in any other field). In fact, I would say that it is the responsibility of sabermetricians to be as rigorous as possible, because otherwise there’s nothing separating us from the “subject matter experts.”


#23    Colin Wyers      (see all posts) 2011/06/16 (Thu) @ 16:24

It seems to me that a lot of people interested in baseball shun those who try to take a more rigorous approach to analyzing the game, when really they can coexist just fine (like in any other field).

Like who?

