THE BOOK--Playing The Percentages In Baseball


Tuesday, October 04, 2011

#bip

By Tangotiger, 09:07 AM

I tried a little experiment on Twitter last night.  I asked people to use the tag #bip, and then for every ball in play, give me your impressions.

If an out, tweet back:
1. automatic out #bip
2. decent amount of effort required by fielder #bip
3. tremendous effort required by fielder #bip

If a hit/error, tweet back:
1. automatic hit #bip
2. a bit more effort by fielder could have stopped the hit #bip
3. should not have been a hit #bip

Or something like that.  Just doing “2. hit #bip” was enough once you got the hang of it.  Twitter prevents submitting tweets with the same text, so, you have to be creative and do “2. hitt #bip”, etc.

Anyway, we did it starting from the top of the 7th (though I missed half an inning to walk the dog).  There were only a couple of us, two or three.  And the plays were all pretty straightforward, except for one (a hard shot right back at ARod, who got the out).

I’d like to do this more.  So, if you are watching any game, try to give us your impressions.  Because I don’t trust Twitter, not to mention there might be overlapping games, add the name of the batter involved.  So:
“1. auto hit Cabrera #bip”

Something like that. 

The ground rules are that you presume that positioning was already optimal (and so, off the table).  And that you are judging for that park.

I can’t say I will actually do anything with this information.  I might.  Or, if I’m luckier, someone ELSE will do something with this.  I have to say I was very surprised with Craig Glaser’s data on Verlander a few weeks back, so this kind of made me think more about this.

Of course, if you prefer to use Twitter as your personal snark outlet, don’t let me stop you from doing that.  Better there than anywhere else, that’s for sure.

***

Back to that ARod play.  If ARod couldn’t come up with that play, would it have been:
“2. hit #bip”
“3. hit #bip”

The 2 would mean some decent amount of effort from the fielder would have turned that into an out.  The 3 would mean the fielder just blew it, and it should have been an out.

So, a “2. hit” would be equivalent to a “2. out”, depending on the outcome.

A “3. hit” would be equivalent to a “1. out”.  This is one way to look at the play, without looking at the outcome of the play to bias you.  If the opposite happened, would that shot to ARod be marked as a “3. hit” or a “2. hit”?  I’m thinking probably a “2. hit”.  And so, since he did make the play, we should mark it as a “2. out”.
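The equivalence described here can be sketched in a few lines of Python. This is only an illustration, not anything the experiment actually ran: the function name `out_equivalent` is made up, and the "1. hit" to "3. out" pairing is filled in by symmetry (it matches the three-row list given later in comment #12).

```python
def out_equivalent(code: int, outcome: str) -> int:
    """Put a #bip rating on a single outcome-neutral scale.

    The result is expressed as an "out" code:
      1 = routine ball in play, 2 = decent effort needed, 3 = very tough play.
    A hit code is mirrored, so "3. hit" -> 1, "2. hit" -> 2, "1. hit" -> 3.
    """
    if code not in (1, 2, 3):
        raise ValueError(f"code must be 1, 2, or 3, got {code!r}")
    outcome = outcome.lower()
    if outcome == "out":
        return code
    if outcome == "hit":
        return 4 - code  # mirror the scale for hits
    raise ValueError(f"outcome must be 'out' or 'hit', got {outcome!r}")


# The ARod play: had it gone through, it would probably have been a "2. hit",
# so the out he actually made lands on the same point of the scale.
arod_if_missed = out_equivalent(2, "hit")
arod_as_played = out_equivalent(2, "out")
```

With a mapping like this, the same ball in play gets the same difficulty rating whether or not the fielder came up with it, which is the point of asking "what if the opposite happened?"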

Again, this is just a work in progress, and it’s kinda fun (for those predisposed like me anyway).  Doesn’t really cost you anything, and, it might give you better focus.


#1    Bobby A      (see all posts) 2011/10/04 (Tue) @ 10:03

Good luck with this! Agreed that twitter may be less than ideal, as they don’t necessarily post every tweet with every hashtag. A dedicated bip@insidethebook email address may be better.

Regardless, good luck! I would love to read any ideas this experiment conjures up.


#2    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 10:30

First off, thanks for the suggestion, and if anyone else has suggestions, please share them.  I literally just started this while watching the bottom of the 6th inning of yanks/tigers.

Now, tell me more about this dedicated email address.  Are you saying that if I create a new twitter account, that it’ll be better, because then I know that all the tweets will reference a single topic?

So, I would create a twitter account called TangoBIP, and then everyone can do
@TangoBIP 1. hit

Something like that?


#3    Sky      (see all posts) 2011/10/04 (Tue) @ 11:01

Yes, I’d go with a new account that everyone “sends” their tweets to. Has the added benefit that only people who follow that account will see the tweets (and since nobody will follow it, we won’t be spamming our followers.)


#4    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 11:10

Fantastic idea guys, thanks.  More Straight Arrow readers.

Creating various Tango accounts in the future seems like a great idea, especially since I can create an unlimited number of emails on my personal website.


#5    Bill Baer      (see all posts) 2011/10/04 (Tue) @ 12:19

Sorry for the self-promotion, but I did something similar for the hits Cliff Lee allowed in Game 2 of the STL/PHI NLDS.

http://crashburnalley.com/2011/10/04/cliff-lee-and-babip/

I didn’t see your #bip thing, but it looks like we were on a similar wavelength.


#6    Francisco Merejo      (see all posts) 2011/10/04 (Tue) @ 12:40

I know I’m stretching things a little here, but suppose there’s a fly ball hit between the CF and 2B. The play looks fairly easy for the center fielder, but the second baseman ends up making the out on what seems a rather difficult play. Should we judge the play from the CF’s perspective, even though he didn’t make the out, and mark it as 1. out #bip? Or should we always judge by the fielder who made the play, even when another fielder had a much easier chance?

Really great idea BTW. Definitely doing it.


#7    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 13:14

Look at it from the perspective of the fielding team as a unit, or of the pitcher.

What kind of ball in play did the pitcher allow?


#8    Francisco Merejo      (see all posts) 2011/10/04 (Tue) @ 13:34

For example, a weak fly ball between second base and the CF. The kind of Texas leaguer that may drop for a hit if the CF is playing deep.

So, based on the ground rules given (positioning was already optimal, and we should look at it from the fielding team’s side), I think it’s fair to mark it as 1. out #bip, since either fielder should have a play on it and the pitcher allowed weak contact.


#9    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 13:44

Right, it’s a 1. out


#10          (see all posts) 2011/10/04 (Tue) @ 14:39

It’s actually pretty easy to pull tweets from Twitter using the Twitter API, so anything with the #bip hashtag will come up easily.

If we get enough tweets with the #bip hashtag I can just use the API to make a dataset out of it and do some text mining.


#11    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 14:44

Good stuff.  I’ll create a @TangoBIP account when I get home, and I presume it should be just as easy for you to use that?

Maybe it would be better if we can follow a standard format for you.  So:

@TangoBIP 1.out Cabrera extrastuff
@TangoBIP 3.out Inge extrastuff
@TangoBIP 2.hit Avila extrastuff

The “extrastuff” is just junk or whatever you want to add to bypass Twitter’s validation to prevent multiple tweets of the same content.

Does that work?
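For what it’s worth, here is a rough Python sketch of how tweets in that space-delimited format could be split into fields. It assumes the tweets have already been collected as plain strings (pulling them from Twitter is not shown), and the names `BipReport` and `parse_bip_tweet` are hypothetical.

```python
from typing import NamedTuple, Optional


class BipReport(NamedTuple):
    code: int      # 1, 2, or 3
    outcome: str   # "out" or "hit"
    batter: str
    extra: str     # whatever filler was added to dodge the duplicate-tweet check


def parse_bip_tweet(text: str, account: str = "@tangobip") -> Optional[BipReport]:
    """Parse '@TangoBIP 1.out Cabrera extrastuff'; return None if malformed."""
    tokens = text.split()
    # Need at least: account, rating, batter.
    if len(tokens) < 3 or tokens[0].lower() != account:
        return None
    code_str, dot, outcome = tokens[1].partition(".")
    outcome = outcome.lower()
    if not dot or code_str not in {"1", "2", "3"} or outcome not in {"out", "hit"}:
        return None
    return BipReport(int(code_str), outcome, tokens[2], " ".join(tokens[3:]))


examples = [
    "@TangoBIP 1.out Cabrera extrastuff",
    "@TangoBIP 3.out Inge extrastuff",
    "@TangoBIP 2.hit Avila extrastuff",
]
reports = [parse_bip_tweet(t) for t in examples]
```

Anything that does not start with the account name, or whose second token is not of the form `N.out` or `N.hit`, is simply skipped rather than guessed at.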


#12    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 14:51

Bill/5: great!  Hopefully the twitter-verse can refine the idea.

When I tried this back in 2008, during the World Series I think, I was asking people in a chat room to mark a play as 0% out to 100% out.  The range was too wide, I think, even in steps of 5% or 10%.

The way I have it now, it’s just 3:
1. easy out or 3. undeserved hit
2. decent out or 2. decent hit
3. very tough out or 1. easy hit

I can see how we can refine that in the future to have a 4 or 5 step scale, and we don’t have to worry about the out or hit designation, and simply go back to the 0% to 100% out, but in steps of 5%, 25%, 50%, 75%, 95%.  Something like that.
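As a small illustration of that five-step idea, the Python sketch below snaps any 0% to 100% estimate to the nearest of the five steps named above. The function name `snap_to_step` is made up, and sending exact ties to the lower step is an arbitrary choice, not part of the proposal.

```python
STEPS = (5, 25, 50, 75, 95)  # percent chance the ball in play becomes an out


def snap_to_step(pct_out: float) -> int:
    """Return the step closest to pct_out (ties go to the lower step)."""
    if not 0 <= pct_out <= 100:
        raise ValueError("pct_out must be between 0 and 100")
    # min() returns the first of equally close steps, i.e. the lower one.
    return min(STEPS, key=lambda step: abs(step - pct_out))


routine = snap_to_step(98)    # a can of corn
coin_flip = snap_to_step(45)  # could go either way
```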

Anyway, let’s see how things go…


#13    Josh Weinstock      (see all posts) 2011/10/04 (Tue) @ 14:59

Yes, a standard format would be very helpful. I just need to know if it’s an out or hit first, and then a standard way of describing the legitimacy of the out or hit. The easiest way would simply be to code various outcomes, like

Hit:
1) no luck
2) some luck
3) a lot of luck

and then tweet: @TangoBIP Hit 1) #bip and include a timestamp with the date or something to bypass the validation issue. If a date is included I can find games with the most luck involved (according to our tweets).

Of course it’s kind of iffy to force it into a distribution of three outcomes (or I guess 6 total if you have 3 types of outs as well), but whatevs.

And yes, the @TangoBIP is necessary because other people use the #BIP hashtag (I’m not sure for what though, but a search for the tag returns other results).


#14    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 16:07

With multiple games, you’ll need either the batter or the team.

The batter is good, since that is going to be unique, and presumably twitter is not going to block a repeat of a tweet that is 45 minutes apart.  (Though, who knows.) You can add the inning I guess.

So:
@TangoBIP 1.hit cabrera 7th #bip

That looks clean enough I think.


#15    Josh Weinstock      (see all posts) 2011/10/04 (Tue) @ 16:44

I guess it would be easiest just to have one number in there so I don’t have to separate out innings from the assessment of luck. So if you put in the inning, spelling it out (seventh) would be preferable.


#16    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 16:50

Josh: I see.

Normally, when I parse, I look for delimiters.  In this case, I was thinking it would be a space.

It seems though that you were simply going to read a line and look for a digit.

Can you explain how the parser works, so I can develop the best structure?


#17    Josh Weinstock      (see all posts) 2011/10/04 (Tue) @ 17:11

There is no set parser - it’s not like reading a .csv into excel and making sure the delimiter separates out the fields correctly. I’m just going to use a lot of string manipulation and text mining tools to get out the desired information. It will just be really easy to do that if there’s only one number to look for.
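To make the “only one number” point concrete, here is a minimal Python sketch of that style of extraction. The regular expressions and the name `extract_rating` are assumptions for illustration; the actual tools are not specified anywhere in this thread.

```python
import re
from typing import Optional, Tuple


def extract_rating(text: str) -> Optional[Tuple[int, str]]:
    """Return (code, outcome) if the tweet holds exactly one digit, else None."""
    # Drop @mentions and #hashtags so digits inside them are not counted.
    body = re.sub(r"[@#]\w+", " ", text)
    digits = re.findall(r"\d", body)
    outcome = re.search(r"\b(hit|out)\b", body, flags=re.IGNORECASE)
    # Ambiguous (several digits) or incomplete tweets are rejected.
    if len(digits) != 1 or digits[0] not in "123" or outcome is None:
        return None
    return int(digits[0]), outcome.group(1).lower()


spelled_out = extract_rating("@tangobip 1.hit cabrera seventh")
with_numeral = extract_rating("@TangoBIP 1.hit cabrera 7th #bip")
```

Under these assumptions the spelled-out inning parses cleanly, while “7th” adds a second digit and the tweet is rejected as ambiguous, which is why a single number per tweet keeps this approach simple.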


#18    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 18:09

Ok, if that’s the case, let’s do that.

@tangobip 1.hit cabrera seventh


#19    ATennisGuy      (see all posts) 2011/10/04 (Tue) @ 18:39

Seems like a job for a Google Docs spreadsheet.


#20    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 22:23

Ignoring the bunt, I marked 11 plays, and 10 were routine.

If you add up the absolute value of UZR, of Dewan, of Colin’s metric, of any fielding metric out there, the total number of absolute plus/minus should come out to around .50 plays (.40 runs).

But, I will bet that all of them would be way higher.

Hence the need for human intelligence as valid data.


#21    Tangotiger      (see all posts) 2011/10/04 (Tue) @ 22:43

Actually, since I’m presuming positioning, we can’t make the apples to apples comparison.


#22          (see all posts) 2011/10/06 (Thu) @ 19:29

I think ATennisGuy is right.  You can real-time group-edit a Google spreadsheet. 

Also, I think any chat room format would be better than Twitter.  AOL messenger keeps a transcript.  Google chat does too (saves it as mail). Rather than e-mailing, everyone could dial you up on AOL messenger and just real-time post the bips.  I’m pretty sure the transcript has a time marker too, which could be important if someone posts an entry so late that it seems to blend in with the next at bat.

Google Buzz supports tags and has no restrictions on repeat posts. 

If you are on Google+ you might consider starting a “Hangout” at the beginning of a game.  It’s live video basically, but also has a chat feature where everyone could react real-time.  I haven’t tried it, but it seems like it would be useful. 

If anyone needs a Google+ invite, let me know.


#23    Josh Weinstock      (see all posts) 2011/10/06 (Thu) @ 19:34

Yea, a google spreadsheet seems like a better way to do this.


#24    philosofool      (see all posts) 2011/10/11 (Tue) @ 01:24

Not sure I like the collective spreadsheet idea. Wisdom of the crowds does not work if everyone can see what others are doing, or at least, it doesn’t work the same.

Also, what do you do with DPs? (routine, athletic, heads-up?) Cabrera’s heads-up 3-2-3 today is a good example of something that’s going to be tough to codify. The 6-3-5 Pujols-throws-out-Utley is another case of smart-not-athletic play (that just kicked ass) that needs a coding. Seems like heads-up is a hard category, because the “could have made a heads-up play but didn’t” cases will be missed a lot.

