THE BOOK--Playing The Percentages In Baseball

Friday, May 09, 2008

Be Jolly stats users

By Tangotiger, 03:17 PM

This is a question solely for those readers who are subscribers to Bill James Online:


Virtually the only place I go is the “Hey Bill” section.  It’s the closest thing to a blog there.  I’ll go to the articles when there’s a new one, but I rarely revisit; the lack of a date-based commenting system makes it pointless to spend any time to see if there’s a new comment.  So, I have given up on that.  And, I’ve also given up on Reader Posts.  All the other stuff is not for me.

So, that leaves me with the Statistics page:
http://www.billjamesonline.net/Statistics1.aspx
I’ve gone there a few times, and have come away, frankly, disappointed.  My initial impressions of each of those profiles have been less than positive.  And the navigation is terrible (10 years behind the times).  You’d think everyone would follow the b-r.com model.  Even “old” Retrosheet has everything linked.

However, perhaps I’m missing something.  So, what Statistics profile are you using, and why are you using it? 

#1    Luke Gofannon      (see all posts) 2008/05/09 (Fri) @ 16:38

I use the Fielding Bible Basic Data and Fielding Bible Plus/Minus stuff, because I like that system and the James site is the only place where I can find this info available to the public (at least for $3 a month, anyway).  But the one-player-at-a-time format in the stats section is frustrating and discourages my browsing around there.  I also agree with your take on the rest of the site: just not really a fun place to visit.  About re-upping a subscription, I tend to think, “eh, it’s just 3 bucks a month.” Kind of like the cost of merchandise sold at Disneyworld: priced right at the level where they make their money and you’re still willing to give it to them.


#2    MGL      (see all posts) 2008/05/09 (Fri) @ 18:02

The “one player at a time” format on several sites drives me crazy, but that is mostly because I like to copy and paste multiple-player data to create data bases for research, and that format makes it impossible.

Heck, I can’t even find minor league batter and pitcher data that is not “one team at a time.” Does anyone know of a site that displays individual player minor league data (all batters and pitchers, and not just “leaders”) that is not “one team at a time”? Even the B-R minor league site displays only the top 100 batters and pitchers, “one league at a time” I think.  I can live with “one league at a time.”

And, BTW, I would hate to see other sites use the “B-R format.” As I’ve said in the past, I think that site is a disaster, mostly because it is aesthetically a mess.


#3    dave smyth      (see all posts) 2008/05/09 (Fri) @ 18:03

I’ve looked at the stat section, batters pitch analysis quite a bit. You can do some nice combining of those numbers. For example,

Called ball/[ Called ball + Swung at (overall) - In strike zone (swung at by pitch location)]

gives the percentage of bad pitches the hitter refrained from swinging at. Pretty much the single best indicator of plate discipline.


#4    Anthony      (see all posts) 2008/05/09 (Fri) @ 21:47

You can get minor league stats “one organization at a time” at B-R:

http://minors.baseball-reference.com/affiliates.cgi?aid=NYY&yid=2008

The primary problem with that, however, is that players who split time across levels have their statistics aggregated. That’s probably not helpful at all.


#5    Eric J. Seidman      (see all posts) 2008/05/09 (Fri) @ 21:54

MGL, have you tried Fangraphs?

Go to the LEADERS tab and a popup with Major League or Minor League will come up. 

You can sort it by each league (AAA, AA, A, Rookie) and, instead of viewing only the leaders, turn off the qualified-players filter to see everyone.

Not sure if that’s what you’re looking for or not but it’s definitely a valuable resource.


#6    MGL      (see all posts) 2008/05/09 (Fri) @ 23:39

Anthony, that is the same as “one team at a time.” In order to save time pulling up each team or organization and then copying and pasting to a master document, I am looking for one league at a time, at least, for minor league stats.  Thanks though.

Eric, the Fangraphs is a good one. Thanks.  The only difficulty is that it displays the players “one page at a time” so I still have to copy and paste 5 or 7 times, but that is better than 30 times per league, one for each team.


#7    Eric J. Seidman      (see all posts) 2008/05/10 (Sat) @ 00:25

That can definitely be fixed.  David has the “Items in Grid” field on the Major League Board, wherein you can change the amount of results you see on the page.  I usually change it to “Show All” when I look at major league leaders.


#8    dkappelman      (see all posts) 2008/05/10 (Sat) @ 01:17

I’ll try and put in a grid limit doohickey this weekend for the minor league leaderboards.


#9    MGL      (see all posts) 2008/05/10 (Sat) @ 02:08

Great, thanks guys!  It’s nice to have “juice!”


#10          (see all posts) 2008/05/10 (Sat) @ 12:10

I usually only look at the “Hey Bill” section also, but yeah, the stats section is hard to browse through because you can’t even really compare other players unless you open multiple tabs/windows. The investment is still pretty worthwhile, if only for the Fielding Plus/Minus data and the questions in “Hey Bill”.


#11    Dave Smyth      (see all posts) 2008/05/10 (Sat) @ 17:10

In post #3, I mentioned the batters pitch analysis data. There seems to be a problem with this data. Let’s take M DeRosa, 2007. His # of pitches taken (for a strike) was 333. His pitches taken by pitch location (in strike zone) was also 333. And it seems to be the same for every player. It seems obvious that, of all the pitches DeRosa took for a called strike, some of them were outside the strike zone and were called strikes because of umpire misjudgement. So, why are the totals always the same? I asked B James that in “Hey Bill” a couple weeks ago, but he never answered, for whatever reason.


#12          (see all posts) 2008/05/10 (Sat) @ 18:14

The 1 batter at a time drives me nuts.  I look up 2 or 3 players that I am curious about and then I give up.  Other sections are team related; why not make the others team related too?

I have gone back and read the older articles that were posted back in Nov., Dec., and Jan.  To be honest, I forgot they were there until a couple of days ago.  Some of the stuff is fun, like the Turk Farrell Award stuff, for example.

I can’t put my finger on why it seems so dry.  There are only 2 things I can think of:

1.  He’s holding back opinion because he works for the Red Sox and I guess they get his top shelf stuff.

2.  We are used to getting really good nuggets in book form.  It’s his best stuff.  This feels like all his stuff that’s left over.

I tend to think the best stuff he’s included so far has been about things other than baseball (i.e the Jayhawk blog).  Maybe he should write about his life or subjects other than baseball.  I’ve always been fascinated about the way he thinks and goes about solving problems.


#13    studes      (see all posts) 2008/05/12 (Mon) @ 09:25

I just haven’t used it much.  Weren’t they going to send out emails when Bill posted a new article?  I got one once, but never got another.  Have other people gotten those?


#14    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 09:35

No, never received a notification mail.


#15    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 10:03

Ok, let me give Be Jolly another try, and follow David’s suggestion.  Let’s go here:
http://www.billjamesonline.net/Statistics1.aspx
then go here:
http://www.billjamesonline.net/StatisticsReport_new.aspx?Type=03&Team=0&Player=1&men=2
Do a search for Vlad to get us here:
http://www.billjamesonline.net/StatisticsReport_new.aspx?Type=03&Team=0&Player=1&men=2
And select 2007.

David suggests:
Called ball/[ Called ball + Swung at (overall) - In strike zone (swung at by pitch location)]

For Vlad, that means:
739 / (739 + 1159 - 543)

So, I like this, as it shows how often he gets a called ball on pitches outside the strike zone.

For Vlad, that’s 54.5%.
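
The arithmetic above can be written out as a minimal sketch. The function and argument names are hypothetical (nothing here comes from the BJOL site itself), and it assumes, as David's formula does, that every pitch taken outside the zone shows up as a called ball:

```python
def out_of_zone_take_rate(called_balls, swings_total, swings_in_zone):
    """Share of out-of-zone pitches the batter did not swing at.

    Assumes out-of-zone pitches = called balls + out-of-zone swings.
    """
    swings_out_of_zone = swings_total - swings_in_zone
    out_of_zone_pitches = called_balls + swings_out_of_zone
    return called_balls / out_of_zone_pitches


# Vlad, 2007, using the three counts quoted in this post.
vlad_2007 = out_of_zone_take_rate(
    called_balls=739, swings_total=1159, swings_in_zone=543
)
print(f"{vlad_2007:.1%}")  # 54.5%
```

Running the same function over a list of players would answer the "good or bad?" question without doing each one by hand, provided the counts can be pulled out of the site in bulk.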

However, that counts any swing, whether he swung and missed or made contact.  What is Vlad’s BA and SLG on pitches outside the strike zone?  I can’t tell from the data, but one would hope that you’d have it.

You remember that Ted Williams chart showing the batting average for every part of the strike zone?  Very cool.  I’d like to see the wOBA of that for every player.  Dan Fox showed the BA/SLG for the field location.  Why can’t this get done on Be Jolly?

And, is 54.5% good or bad?  Do I really have to do this for several players?

This is what’s so frustrating with that stats profile.

I really want to like it.  There’s so much good data out there.  But man, is it frustrating.  Check out the Pitch data that Fangraphs published (from BIS itself!).  What a fantastic layout, and it’s sortable across the league, and yada yada yada.

BIS should contract out the web publishing to Fangraphs.  Not only do you get a great layout, but there is no one more responsive, and quick with the turnaround, than Appleman.  (No offense intended to the BIS programming folks.)


#16    david smyth      (see all posts) 2008/05/12 (Mon) @ 11:53

---“And, is 54.5% good or bad?”

It’s bad, very bad. The avg seems to be 72% or so, with the most disciplined hitters being in the 80s, and the wild swingers in the low 60s.

And, I agree that the section is not user-friendly. Also, the search engine makes an error in 1 of 10 (or more) searches.


#17    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 12:21

Well, it’s “bad” in that he swings at half the balls out of the strike zone.

But, I would bet that of his balls out-of-zone (BOZ) swings, his BA/SLG may be the highest of any ballplayer.

A called ball is worth something like +.08 runs.  The average Vlad PA would be worth something like +.06 or +.07 runs.  (Obviously, he’ll make better contact with pitches in the strike zone.) So, to Vlad, a called ball is just not that exciting to him.  It doesn’t enhance him as a hitter, since he gets his +.06 runs per PA in spite of the walks.  Now, maybe he’s +.10 runs per PA on balls in zone (BIZ) and +.03 on BOZ, and so maybe he should take it a bit easier.  But, that might be his magic, that if he does take it easier, his BIZ would go down to +.08 and his BOZ would go up to +.05. 

In the end, Vlad is Vlad.  I’m not sure that 54% is “bad”.  I think that was a bad question on my part.  54% is low, but low is not bad.  For a guy who swings A LOT, he certainly doesn’t strike out a lot at all.


#18    dave smyth      (see all posts) 2008/05/12 (Mon) @ 13:02

I don’t think Vlad’s runs on BOZ is +.03. I think it’s around -.06 (with an avg batter being about -.13 on BOZ).

So, he’s not doing the team any favors by swinging at bad pitches. I know, I know, if he becomes more selective, he’ll be sacrificing his natural aggressiveness. So, it works for Vlad. But he’s a very unique case…


#19    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 13:29

Yes, definitely unique.

What will be interesting is how he ages.  A hitter’s walk rate increases every year until at least age 38.  So, a hitter is becoming smarter as he realizes his physical skills are diminishing.  If he could learn this in his 20s (à la Frank Thomas) perhaps he’d be better off.  Who knows.

I would be hugely disappointed if Vlad’s career path declines more rapidly than expected if he insists on maintaining his hitting approach in face of physical decline. 

Unfortunately, we’ll only know if Vlad approached his game right, after his career is over.  On the other hand, it’s hard to argue that someone who is a perennial MVP candidate could have been better by changing his hitting approach.


#20    dave smyth      (see all posts) 2008/05/12 (Mon) @ 14:33

There is a stat on fangraphs called O-Swing. When I first saw it I thought it was supposed to be the same as the BJ stat above--the percentage of pitches outside the strike zone that the batter swung at (or didn’t swing at, as I look at it). But then I came to believe that O-Swing instead refers to the percentage of swings which were on pitches out of the zone. That’s something different. Maybe someone else could take a look and check whether my interpretation is correct.


#21    dkappelman      (see all posts) 2008/05/12 (Mon) @ 14:35

MGL & others who were interested earlier in the comments:  You can set the grid limit on the minor league leaderboards and the minimum AB/IP.  Cutting/pasting full minor league stats for any individual league should be a cinch.

http://www.fangraphs.com/minorleaders.aspx?pos=all&stats=bat&lg=2&qual=y&type=0&season=2008


#22    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 15:22

Dave/20: I must be missing something.  How is it different?

You were breaking up your “swings” whether they were in the strike zone or not.  And you were adding called balls to that.  So, out of zone swings plus called balls are in the denominator.  Is that the same as yours?

Here is the Fangraphs recap:
http://www.fangraphs.com/blogs/index.php/plate-discipline-stats

Sounds like I’m needing the Be Jolly stats less and less if Fangraphs is going to do it all for me…


#23    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 15:27

The link in post #22 shows the description of David’s stats, along with league averages.

Here’s Vlad (bottom of the page):
http://www.fangraphs.com/statss.aspx?playerid=778&position=OF

Obviously, we’d like to see the BA and SLG (*) for each of those columns.

(*) This is one of the rare cases where I like BA.  However, SF needs to be counted as an AB for BA and SLG.  And, preferably, reaching on error too.


#24    David Smyth      (see all posts) 2008/05/12 (Mon) @ 16:17

The description on the fangraphs site is ambiguous to me. It’s “the percentage of pitches a batter swings at outside the strike zone”. That could mean either of the alternatives in my last post. But, going back to Appleman’s 2006 article, it appears that, indeed, this stat is supposed to be the same as the BJ stat I concocted.

But the average is surprising to me. It appears that the avg is about 23% --equivalent to 77% in the format I used--instead of the 72/73% I’ve been estimating from plodding thru many individual calculations on the BJ site.


#25    tangotiger      (see all posts) 2008/05/12 (Mon) @ 17:23

Seeing that Fangraphs uses BIS data, it would seem that this is a case of selective sampling on your part!

And I still don’t see the “alternatives”.  They all look the same to me.  Can you clarify?


#26    dkappelman      (see all posts) 2008/05/12 (Mon) @ 17:29

The Bill James site and our site have exactly the same numbers and are using exactly the same data. However they’re calculating the strike zone, we’re calculating it exactly the same way.

So Bill James has the “Swung At By Pitch Location” data table and then an “In Strike Zone” stat:  This is how many pitches he swung at inside the strike zone / how many pitches he swung at in total.

Z-Swing on FanGraphs is how many pitches he swung at inside the strike zone / how many pitches were thrown inside the strike zone in total.

They definitely tell you different things, but 1 minus the Bill James stat (pitches swung at outside the strike zone / pitches swung at overall) has an R^2 of about .6 with O-Swing.  I’m not sure which one I’d rather use, but it’s probably worth looking into the merits of the one on Bill James’ site.
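
The difference in denominators can be seen in a small sketch. The function and key names are hypothetical, and it assumes that pitches taken outside the zone equal called balls and pitches taken inside the zone equal called strikes. The in-zone take count used below is a placeholder, not Vlad's real number, so only the two values that do not depend on it are printed:

```python
def swing_rates(swings_in_zone, swings_out_of_zone, takes_in_zone, takes_out_of_zone):
    """Return the three rates discussed in this thread from four pitch counts."""
    total_swings = swings_in_zone + swings_out_of_zone
    return {
        # Bill James table: in-zone swings / all swings
        "bj_in_zone_share": swings_in_zone / total_swings,
        # FanGraphs Z-Swing: in-zone swings / all in-zone pitches
        "z_swing": swings_in_zone / (swings_in_zone + takes_in_zone),
        # FanGraphs O-Swing: out-of-zone swings / all out-of-zone pitches
        "o_swing": swings_out_of_zone / (swings_out_of_zone + takes_out_of_zone),
    }


# Vlad 2007 counts quoted in post #15: 1159 swings, 543 of them in the zone,
# 739 called balls. takes_in_zone=400 is a made-up placeholder.
rates = swing_rates(
    swings_in_zone=543,
    swings_out_of_zone=1159 - 543,
    takes_in_zone=400,
    takes_out_of_zone=739,
)
print(f"1 - BJ stat: {1 - rates['bj_in_zone_share']:.1%}")  # 53.1%
print(f"O-Swing:     {rates['o_swing']:.1%}")               # 45.5%
```

Both numbers have out-of-zone swings in the numerator, but one divides by all swings and the other by all out-of-zone pitches, which is why they correlate without matching.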


#27    David Smyth      (see all posts) 2008/05/12 (Mon) @ 17:57

“Can you clarify?”

One version starts with whether the pitch was actually in the K zone, and then whether the batter swung or not. The other starts with whether the batter swung, and whether the pitch was a strike or not. Completely different things.


#28    MGL      (see all posts) 2008/05/12 (Mon) @ 19:47

You can NEVER say whether a change in approach, no matter how bad we think the batter’s approach is, can help a batter.  All you can do with a batter is have him change his approach and see what happens over a long period of time. It is impossible to know (for us) what a batter can or will do, what he is comfortable with, not comfortable, etc.

Also, if Vlad stopped swinging at many of those outside pitches then pitchers would stop throwing many of them!  You can never say how a change in approach by a batter will affect his production since you don’t know how that will change the pitcher’s approach (although I suppose you can try and model the approach).  Of course if you changed on paper all of the pitches outside the zone that a batter swings at to balls, the batter’s overall production will change dramatically.  But that can’t be done in reality since the percent of pitches outside the zone would change as soon as the batter changed his approach.


#29    tangotiger      (see all posts) 2008/05/12 (Mon) @ 20:01

I think the default has to be that an extremely successful hitter is already hitting optimally.  Nobody would be crazy enough to tell Vlad that he needs to lay off more pitches.  What’s the payoff there?  10% chance of improving, 40% neutral, 50% decline?  Doesn’t make sense.

Certainly one change forces another change.  Clay did a very good job in BP08 annual on that front.

Any change in hitting approach has to be developed over time as something learned (and some batters, like Frank Thomas learn much faster than others).  Now, if you had a hitter who was on the cusp, that guy I would tinker with, and use him as my guinea pig, so we could have our experiment, and see what would happen.


#30    MGL      (see all posts) 2008/05/12 (Mon) @ 21:27

I’m certainly not suggesting that you can’t or shouldn’t (if you are a coach) try and work with a hitter to change his approach.

I am simply saying that no matter what a player does “right” or “wrong”, and no matter what his profile is, we have no idea what would happen if he all of a sudden tried to do something different, if he even could.

And while you would be more likely to want to change someone who was not a good hitter than someone who was, you may still be able to improve upon goodness or greatness.  Maybe Vlad would be a better hitter if he worked on changing his approach a little.  Who knows?  Maybe Maddux would be a better pitcher, even in his heyday, if he threw more waste pitches when ahead in the count.  Who knows?  Of course, you better have a helluva resume before offering suggestions to guys like that, especially when they are not in a “slump”.


#31    dave smyth      (see all posts) 2008/05/13 (Tue) @ 05:53

Yeah, it’s not a matter of whether Vlad should change, it’s a matter of how we label what he does. Two choices:

1) Plate discipline is a relative skill, dependent on the other components of a batter’s game. Using this concept, Vlad has good pitch selection. I mean, look at how successful he is!
2) Plate discipline is more of an absolute skill. Vlad succeeds in spite of having low discipline.

They’re both correct, because they’re using the term discipline to refer to different things. Forced to choose, I think the absolute version is more useful.


#32    tangotiger      (see all posts) 2008/05/13 (Tue) @ 06:14

I see it as #1.


#33    dave smyth      (see all posts) 2008/05/13 (Tue) @ 06:31

Why is that? I mean, W Boggs succeeded despite low power. It was said that he could have hit more HR if he wanted to, but felt it would detract from his ability to hit for average. But, we don’t say that Boggs had power but didn’t use it, we say he had low power. Similarly, Vlad could probably take more pitches if he wanted to, but he believes it will detract from his overall results. So, he has poor discipline. If it’s just the terminology you object to, that’s fine. Just say that Vlad swings at lots of pitches outside the zone, and draws few walks. I prefer a nice neat term like discipline.


#34    tangotiger      (see all posts) 2008/05/13 (Tue) @ 08:05

I don’t like the word “poor”, as if it’s something that detracts from his game (Jeter is a poor fielder).  I’d prefer patient and aggressive.  For someone who strikes out as little as Vlad and who takes as many swings as he does, he’s got fantastic strike zone judgement.  He’s just very aggressive.


#35    tangotiger      (see all posts) 2008/05/13 (Tue) @ 23:32

David, check your “my account” page.  My notification was turned off.  Here’s what the fine folks at BJOL said:

You did not have the check box checked.  Please check.  Go to “My Account” (found in the top right hand corner of virtually every BJOL webpage), look at the box to the left of “Notify me by email when new articles or columns are published” and make sure there is a check mark.


#36          (see all posts) 2008/05/14 (Wed) @ 00:45

Mine is checked off (I just looked) and I definitely have never received a notification of a new article or column.  For example, there is a new article dated May 12, I think.  I just double checked my inbox and my junk-mail folder.  No notification.  Has anyone been notified of any new articles/columns?


#37    studes      (see all posts) 2008/05/14 (Wed) @ 05:16

Mine has always been checked, and I received one when they first went live, but haven’t received any since.  They contacted me and suggested I change my email address.  I’ll let you know if that works.


#38    david smyth      (see all posts) 2008/05/14 (Wed) @ 07:51

---“I prefer patient and aggressive”

Agreed, although I prefer ‘selective’, over ‘patient’. Plus, even though we are using neutral terms, it should always be kept in mind that more good hitters are selective than are aggressive. Lots more, I believe.

I might as well show a few numbers I worked out. Since I had to make a couple assumptions, they are only approximate:

Event, avg batter run value, Vlad run value
bad ball in play, -.15, -.075
good ball in play, +.07, +.18
swing at bad pitch, -.11, -.085

The ‘swing’ run difference for taking vs swinging at a bad pitch is .20 for an avg batter, and .165 for Vlad. So Vlad, the best bad ball hitter, is only able to mitigate 20% of the damage from swinging at bad ones.
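
As a rough check on that last step, here is a minimal sketch that works only from the two swing-vs-take differences quoted above (.20 and .165). The function name is hypothetical and the inputs are the approximate figures from this post, not independently derived values:

```python
def share_of_damage_mitigated(avg_cost, player_cost):
    """Fraction of the average batter's swing-vs-take cost that a player avoids."""
    return (avg_cost - player_cost) / avg_cost


# Run cost of swinging at a bad pitch instead of taking it, per the post.
mitigated = share_of_damage_mitigated(avg_cost=0.20, player_cost=0.165)
print(f"{mitigated:.1%}")  # 17.5%
```

That comes out a little under the 20% quoted, which is consistent with the numbers being described as approximate.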


#39    Tangotiger      (see all posts) 2008/05/14 (Wed) @ 08:13

Selective is better than patient.  Good word.

Great job on that data.  THAT is the kind of stuff that we should be seeing on all these sites.  You’d think that Studes’ fantastic Batted Ball Library would have spurred everyone into taking a LWTS approach to making sense of all these numbers.  I’m knee-deep in this, and I always try to remember what the baselines are.  Simply making things as runs per pitch (or PA) is ideal.  That’s why I love the Linear Weights by counts, and the further refinement by pitch type.

