THE BOOK--Playing The Percentages In Baseball


Sunday, December 05, 2010

Graphical run expectancy chart

By Tangotiger, 10:53 PM

For anyone who thought that the Run Expectancy by 24 base/out chart was just a bunch of numbers that looked like gibberish, how about if you saw it like this:

So, the darkness of the diamond shows you whether there are 0, 1, or 2 outs.  The color of the bases shows you which bases are occupied.  And the position of the diamond shows you the run expectancy (and the value itself is inside the diamond).
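For readers new to the idea, the "24" is simply 8 base configurations times 3 out counts. A minimal sketch of enumerating them is below; the function name `base_out_states` is made up for illustration, and the `xxx` / `1xx` / `x23` label style is borrowed from the comments rather than from the chart itself.

```python
from itertools import product


def base_out_states():
    """Return all (base_label, outs) pairs: 8 base states x 3 out counts = 24."""
    states = []
    # Each of the three bases is either empty (False) or occupied (True).
    for occupied in product((False, True), repeat=3):
        # Label occupied bases by number and empty bases by "x", e.g. "x23".
        label = "".join(str(i + 1) if on else "x" for i, on in enumerate(occupied))
        for outs in (0, 1, 2):
            states.append((label, outs))
    return states
```

Each pair corresponds to one diamond on the chart: the label says which bases are colored in, and the out count says how dark the diamond is.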

Credit: Josh Maciel.


#1    Joshua Maciel      (see all posts) 2010/12/06 (Mon) @ 00:47

If anyone has any suggestions on how to improve the graph, please let me know. If anyone has any ideas on other aspects of baseball to visualize (where the data is readily available), please let me know that too.


#2          (see all posts) 2010/12/06 (Mon) @ 01:35

Great stuff, just a few thoughts:
- The color on the bases is probably extraneous as we know which bases are which and don’t need them color coded.  Perhaps a bright color would help us distinguish better however (yellow maybe)
- The fading makes it a bit more difficult to read the 2 out situations.  Perhaps you can use the extra colors instead for each of those series?
- It would probably help to see the Y-axis labels darker since that’s really the big takeaway
- Seeing an x-axis label (as simple as xxx, 1xx, x23, etc.) would help provide additional context more quickly as well.


#3    Tangotiger      (see all posts) 2010/12/06 (Mon) @ 02:13

I suggested to him the colors of the bases.  He thought it was too much.


#4    J-Doug      (see all posts) 2010/12/06 (Mon) @ 02:44

- The color on the bases is probably extraneous as we know which bases are which and don’t need them color coded.  Perhaps a bright color would help us distinguish better however (yellow maybe)
- The fading makes it a bit more difficult to read the 2 out situations.  Perhaps you can use the extra colors instead for each of those series?

Agree with both of these. Especially a problem for us colorblind folk.


#5    J-Doug      (see all posts) 2010/12/06 (Mon) @ 02:44

Awesome graph though.


#6    Joshua Maciel      (see all posts) 2010/12/06 (Mon) @ 04:50

Great stuff, just a few thoughts:
- The color on the bases is probably extraneous as we know which bases are which and don’t need them color coded.  Perhaps a bright color would help us distinguish better however (yellow maybe)

I tried yellow in my first draft (most of the MLB graphics on the major networks use yellow), but with a light-colored background it really doesn’t look that good or stand out as well as you’d think. I’ve made them white in this draft so you can see what that looks like (I think it makes it obvious, but then again I made it, so of course it’s obvious to me).

- The fading makes it a bit more difficult to read the 2 out situations.  Perhaps you can use the extra colors instead for each of those series?

I bumped up the color on the 2 out, and dropped it on the 0 out, so that the range of color is smaller (but still distinct, especially in contrast with the background). I also added out markers (the little circles to the lower right of the base diagrams) to hopefully help out J-Doug and the 10% of men who have some form of colorblindness.

Even so, I have no idea if that’s going to be readable for all colorblind folk, so J-Doug, please let me know which type of colorblindness you have, and I’ll do my best to make one you can see (if you can’t see this one still).

- It would probably help to see the Y-axis labels darker since that’s really the big takeaway

Done and done. Bumped up the background color to make the white text more visible, and bumped up the grey letters appropriately.

- Seeing an x-axis label (as simple as xxx, 1xx, x23, etc.) would help provide additional context more quickly as well.

The X-axis information is in the baseout graphics. That’s what the legend is for. In reality the X-axis is arbitrary and meaningless—I just evenly spaced the 8 baseout states and arranged them so that the RE with 0 outs is ascending. I could add duplicate labels, but I think that wouldn’t add any extra information (not to mention that there isn’t really any place to put them).
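For anyone who wants to reproduce that layout, here is a rough sketch of the ordering step: sort the base states by their 0-out RE and space them evenly. The function name `arrange_base_states` and the `width` parameter are hypothetical, and the numbers in `TOY_RE_ZERO_OUTS` are made-up placeholders chosen only to exercise the sort, not the 2010 MLB values.

```python
def arrange_base_states(re_zero_outs, width=1.0):
    """Map each base state to an x position in [0, width].

    States are ordered by ascending 0-out run expectancy and spaced
    evenly, so the x-axis carries order but no numeric meaning.
    """
    ordered = sorted(re_zero_outs, key=re_zero_outs.get)
    step = width / (len(ordered) - 1) if len(ordered) > 1 else 0.0
    return {state: i * step for i, state in enumerate(ordered)}


# Placeholder values for illustration only (NOT real run expectancy data).
TOY_RE_ZERO_OUTS = {"1xx": 0.9, "xxx": 0.5, "123": 2.3, "x2x": 1.1}

positions = arrange_base_states(TOY_RE_ZERO_OUTS, width=3.0)
```

With real data you would pass all 8 base states; the 1-out and 2-out diamonds for a given base state then share that state's x position.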

Also, do you think this is something you would use for reference while watching a game? I find myself looking up RE a lot, which is why I wanted to create a chart to help me follow, but I think I’m probably in the (vast) minority there.

Here is the updated version:
http://img139.imageshack.us/img139/6812/baseoutstates20101207.png


#7    mettle      (see all posts) 2010/12/06 (Mon) @ 12:02

I agree - this is awesome.

Not to be nitpicky, but can you thin out the basepaths? It makes the bases harder to pick up. Also, my first glance presumed white=empty for a base, as opposed to the convention you use, black=empty.

I like the idea of x-axis labels for the base states, too, to disambiguate.

It seems 2 outs is still a little too light. Can you use colors for the whole thing (e.g., dark blue = 0 outs, light blue = 1 out, green = 2 outs)?

Finally, maybe make the #s in the center black and not faded for all points so it’s clearer to read?

Just suggestions - it’s awesome without it and would be great to send to Flowing Data as an example of great data visualization.
(http://flowingdata.com/)


#8    Tangotiger      (see all posts) 2010/12/06 (Mon) @ 12:40

Yes, excellent point that the RE numbers in the middle don’t need to be reduced in boldness.  That should remain constant.  (Though of course, I presume that it’s easier for the designer to do as he’s doing.)


#9    J-Doug      (see all posts) 2010/12/06 (Mon) @ 12:57

Josh, I was just using my colorblindness anecdotally. But in terms of most color blind individuals, if you avoid using red and green as contrasting colors, heavily differentiate the shadings of single colors, and keep the number of colors in use to a minimum, then you’ll be fine with us.

Yes, I know that those criteria are to a certain degree contradictory. Don’t worry too much about it.

Anyway, love your new chart. My suggestions would be to make the y-axis even darker and to abandon the concept of fading the icons for the out state. I’d either use color to differentiate the out state (although colors aren’t ordinal, and I think ordinal is what you’re going for) or just stick with the dots you’re using right now.

Again, fantastic work.


#10    J-Doug      (see all posts) 2010/12/06 (Mon) @ 12:58

Just thought of this: Fade the basepath but don’t fade the RE number, out dots or bases themselves. That way you indicate the out situation without making the important information more difficult to read.


#11    Newcomer      (see all posts) 2010/12/06 (Mon) @ 13:16

Not fading the RE number was going to be my suggestion.  I love this, and I think I’ll use it as my desktop wallpaper.  This chart should be on the wall in every MLB dugout.


#12    Joshua Maciel      (see all posts) 2010/12/06 (Mon) @ 20:12

Just to explain my choices:

  • As J-Doug pointed out, I wanted to use an ordinal scale for the out state because there is an order, and colors can’t show order (if I used arbitrary R/G/B for the out states, there would be no visual indication of which is “better” or “more important”)
  • The reason I didn’t add baseout labels or make the RE numbers too distinct is because I wanted the eye to focus on the data—for the actual RE numbers, a table is going to show the data better (everything takes less space with less ambiguity, even if it isn’t arranged to visually “show” the data)
  • In general, I want to aim toward the “average fan” to try to make a semi-complex sabermetric concept accessible visually, and in a way that makes them feel comfortable (close to graphics on TV), and more likely to use it.

Yes, excellent point that the RE numbers in the middle don’t need to be reduced in boldness.  That should remain constant.  (Though of course, I presume that it’s easier for the designer to do as he’s doing.)

That’s actually an optical illusion from using shades of grey. Both the numbers in the middle, and the original colors for the baseout states are a constant shade—they don’t fade with the baseout states—but because the surrounding gray changes shade, it looks like the color changes.

I’ll try to throw up a version with the consensus suggestions, and see what you guys think of it later. Thanks a lot for the flowing data suggestion, and J-Doug, if you really want a wallpaper of it, let me know your screen resolution and I’ll size it properly for you.


#13    J-Doug      (see all posts) 2010/12/06 (Mon) @ 22:39

That’s actually an optical illusion from using shades of grey.

Wow, that’s true. In that case, the shade should be darker or the weight should be bolder. They’re a bit difficult to read in general.

Also, it was newcomer who wanted the wallpaper, but a high-res version would be neat either way.


#14    Joshua Maciel      (see all posts) 2010/12/06 (Mon) @ 23:28

J-Doug,

I bumped up the shade from 50% black to 80% black, and made a PDF version available. It should be a lot clearer if you print it out (depending on the quality of your printer dealing with various shades of grey).

Here is the PDF:
http://www.scribd.com/doc/44806970/Run-Expectancy-by-Base-Out-States-2010-MLB

If you don’t have a Scribd account, feel free to use this version from bugmenot.com:
Name: guru1985singh
Password: gandukam

Printed quality has a MUCH higher resolution than computer screens, so in general the data should look much “sharper” if you print it out. Hopefully that will also eliminate a lot of the pesky problems with visibility and the like. Computer screens are really really low resolution in comparison to paper.


#15    Joshua Maciel      (see all posts) 2010/12/07 (Tue) @ 00:18

Here is also what you guys requested:
- Colored Base states to represent outs
- Thinner basepaths
- White bases indicate empty bases
- Darker background/Y-axis labels
- Bolder/darker RE labels

Here is a larger version (2048 x 1536 for use as wallpaper):
http://img27.imageshack.us/img27/6191/baseoutstates20101208.png


#16    J-Doug      (see all posts) 2010/12/07 (Tue) @ 01:27

Julian:

I think you pretty much nailed it. Do you have a website that you host this on? Either way, I’d like to profile this over at Beyond the Box Score.


#17    Joshua Maciel      (see all posts) 2010/12/07 (Tue) @ 01:49

I don’t have a website (though I frequent a lot of baseball sites and comment/contribute under a different name). You can reach me at my name (as above) at gmail.

Thank you for offering to put it up on Beyond the Box Score, it would be an honor. Baseball information is meant to be shared! The more eyes the better.

I’d like to request it gets attributed to me, and preferably with a link to this thread if it’s okay.

The underlying data, in case you want to use it/make your own is here:
https://spreadsheets.google.com/ccc?key=0AvhKIaAw27e_dFgteU9ya0hEZ2dIZjNPQ2lZWUdqMHc&hl=en&authkey=CJLr97gI#gid=0

If you have illustrator, or want the vector data, I can also share that.

The images are all licensed under Creative Commons, Attribution, Non-Commercial. So you can put it up on Beyond the Box Score as long as you credit me, and preferably with a link to this thread.


#18    Joshua Maciel      (see all posts) 2010/12/07 (Tue) @ 02:10

I actually created a blog real quick and threw it up there:

http://henkakyuu.blogspot.com

You can link to it if you’d like, but that’s the only thing up there right now (I will put a second up in a second).


#19    J-Doug      (see all posts) 2010/12/07 (Tue) @ 03:13

Joshua: Those other charts are pretty sweet too. I’ll put a post up on Beyond the Box Score later today.

Also, sorry for calling you Julian earlier. I was reading a piece about the WikiLeaks guy at the same time.


#20    Joshua Maciel      (see all posts) 2010/12/07 (Tue) @ 03:25

No problem J-Doug. I get far worse here in Japan.


#21    J-Doug      (see all posts) 2010/12/07 (Tue) @ 12:43

Josh:

Finished the fan shot at Beyond the Box Score: http://www.beyondtheboxscore.com/2010/12/7/1861420/it-seems-the-internet-is-rife-with-excellent-sabermetric-artists

Justin Bopp loves your charts, btw. I strongly suggested he front-page your work when we have a slot open today.


#22    J-Doug      (see all posts) 2010/12/07 (Tue) @ 13:29

You’re front-page news, Josh: http://www.beyondtheboxscore.com/


#23    RMR      (see all posts) 2010/12/07 (Tue) @ 14:31

Love the updates Josh. 

Regarding the X-axis, I realize that the X-axis is categorical, but given the aims of making the concept accessible, I think the casual user is still going to feel a bit lost and struggle to understand (initially) the base-state concept.

I think even adding a simple “# of base runners” grouping label on the x-axis to reinforce the vertical lines would help to clarify the arrangement of the base-out states.


#24    Tangotiger      (see all posts) 2010/12/07 (Tue) @ 15:21

I agree regarding the baserunner axis.  I think putting “bases empty”, “one runner”, “two runners”, “bases loaded” on the x-axis to link up to the vertical line will make it clear.

And for the run expectancy line on the y-axis, I think every 0.25 runs might work out better too.  Maybe.


#25    J-Doug      (see all posts) 2010/12/07 (Tue) @ 16:28

I personally liked the absence of an x-axis, since the data’s already in the icons.


#26    Ryan JL      (see all posts) 2010/12/07 (Tue) @ 19:23

Personally, I find it much, much easier to comprehend a chart filled with numbers than to decipher any of these graphs.  Am I in the minority? I suppose I am just not a “visual” person.  Very interesting to me that others find this easier.


#27    Joshua Maciel      (see all posts) 2010/12/07 (Tue) @ 21:40

J-Doug,

Thanks a lot. You’re a gentleman and a scholar.

RMR,

I’ll try throwing labels on there sometime today.

Ryan,

You’re not odd in that respect at all.

In my experience (working in marketing/sales at an engineering company), engineers want to know the specifics, while the “creatives” and management types would rather get the overall visual picture and let the engineers worry about the specifics.

Obviously to make these graphs I have to wade through tables of numbers, and typically speaking I can cope, but for me I can’t understand the big picture until I put it all in a visual format.

There’s no one perfect approach to data analysis—I just take the visual one. If it doesn’t suit you, that’s fine, you shouldn’t feel like you should use a different format if your results are coming out okay.


#28    Newcomer      (see all posts) 2010/12/08 (Wed) @ 14:12

I like both ways, the chart and the numbers.  I like having the visual as a reference because it makes it a little easier to internalize the relationships.


#29    Tangotiger      (see all posts) 2010/12/08 (Wed) @ 14:20

As a numbers-maven, I’ll tell you why I prefer the graph in this instance: A base/out state only has value in relation to the other 23 base/out states.  What we care about is the relationship, of what happens when we go from one state to the other.

You look at the graph, and you see the following:
1. the gap from 0 to 1 outs and from 1 to 2 outs, for each of the 8 base states; the magnitude of the drop is instantly clear from the graph, but not readily apparent from a numerical chart

2. going from runner on 1B, 0 outs to runner on 2B, 1 out is, again, quickly obvious in the graph

3. you can see quick equivalencies, such as runner on 1B, 0 outs being equivalent to runner on 3B, 1 out, or runners on 1B/2B, 1 out
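
For those who would rather compute these three comparisons than eyeball them, a small sketch follows. It assumes the RE table is a dict keyed by `(base_label, outs)`; the helper names are hypothetical, and the values in `TOY_RE` are made-up placeholders, not real run expectancy figures.

```python
def out_drops(re, base):
    """RE lost going from 0 to 1 outs, and from 1 to 2 outs, for one base state."""
    return (re[(base, 0)] - re[(base, 1)], re[(base, 1)] - re[(base, 2)])


def transition_delta(re, before, after):
    """Change in RE when moving from one base/out state to another."""
    return re[after] - re[before]


def near_equivalents(re, state, tol=0.05):
    """Other states whose RE is within `tol` runs of the given state."""
    target = re[state]
    return sorted(s for s, v in re.items() if s != state and abs(v - target) <= tol)


# Placeholder values for illustration only (NOT real run expectancy data).
TOY_RE = {
    ("1xx", 0): 0.9, ("1xx", 1): 0.5, ("1xx", 2): 0.2,
    ("x2x", 1): 0.7, ("xx3", 1): 0.92,
}

drops = out_drops(TOY_RE, "1xx")                              # item 1
bunt = transition_delta(TOY_RE, ("1xx", 0), ("x2x", 1))       # item 2
similar = near_equivalents(TOY_RE, ("1xx", 0))                # item 3
```

The tolerance `tol` is an arbitrary choice here; how close two states need to be before you call them "equivalent" is a judgment call.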

It paints the entire picture right there: the magnitude, the relationships, everything.  Your eyes are drawn to what they need to be drawn to.  With the numerical table, you have to breathe in a little to get your bearings, and your eyes are always darting left, up, and right.  It’s very unfocused.

That’s just me though…


#30    Tangotiger      (see all posts) 2010/12/08 (Wed) @ 14:27

That said, obviously, we can have both…


#31    Joshua Maciel      (see all posts) 2010/12/08 (Wed) @ 19:34

Tango, that’s exactly what I made it for. During the World Series, I was constantly referring to the chart, and it was taking me time to see where I was, where I ended up, and how big the drop was. This way I can take a glance and say, “ouch” or “nice” without as much time spent away from the game.

