
THE BOOK--Playing The Percentages In Baseball


Tuesday, March 02, 2010

By The Numbers, new issue

By Tangotiger, 05:07 PM

Just posted today.  Haven’t opened it yet…


#1    Patriot    2010/03/02 (Tue) @ 17:15

I think blogs have finally killed BTN.  No offense, Phil.


#2    Tangotiger    2010/03/02 (Tue) @ 17:25

I like that the BTN has quality production standards.  It makes for better reading and referencing in the future.

I think Phil would be better off taking all those pieces nominated for the Sabermetric Awards and packaging them into one volume.  That would kick a$$.  So, you’d have a “Best in 2009 Sabermetrics”, all nicely packaged.  That’d be nice.

***

I liked Tom’s article in this issue.  I’ve thought about this topic a lot.  I’m not sold that this is the right way to do it, but I like the idea behind it.


#3    Tangotiger    2010/03/02 (Tue) @ 17:38

I do the same thing for K.  I think I worked it out that 1 HR = 6 K or something, so I compare how many K’s a hitter has relative to his HR.  So a 40 HR, 200 K guy is a plus, since 40 HR would cover 240 K.

Something like that…
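To make the arithmetic in #3 concrete, here is a minimal sketch. It assumes the rough "1 HR = 6 K" exchange rate recalled above, and it reads "a plus" as "fewer strikeouts than the home runs would account for". The function name `k_margin` and that reading are illustrative assumptions, not something stated in the comment.

```python
# Rough exchange rate recalled in comment #3 ("1 HR = 6K or something").
K_PER_HR = 6


def k_margin(hr: int, k: int) -> int:
    """Strikeouts 'covered' by home runs minus actual strikeouts.

    Positive: the hitter strikes out less than his power would account for.
    Negative: he strikes out more. (Hypothetical helper for illustration.)
    """
    return K_PER_HR * hr - k


# The 40 HR, 200 K hitter from the comment: 6 * 40 - 200 = 40, so a plus.
print(k_margin(40, 200))
```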


#4    2010/03/02 (Tue) @ 17:39

Patriot, no offense taken.  I’m thinking the same thing.  The thing is, we’re willing to take stuff that’s been published elsewhere, as Tango suggests in #2.  So maybe I’ll start more aggressively asking for reprint permission.


#5    Guy    2010/03/02 (Tue) @ 17:39

I agree Tom has an interesting idea here.  But I wish people would stop using single-season data to measure relationships between stats, as he does here for BB rate.  If he looked at career stats, I feel pretty confident he would have found a relationship with BA or SLG, and not just ISO.  Especially in this case, when he was going to work with career data in his next step anyway!  I’m just so tired of seeing reports of “weak” relationships based on single-season data—wins/payroll, BABIP, whatever—without any recognition that these correlations are telling us more about the chosen sample size than the actual relationship.


#6    2010/03/02 (Tue) @ 17:49

I’m starting to think Tango/2 is a great idea.  I’ll get on it over the next few weeks.


#7    Tangotiger    2010/03/02 (Tue) @ 18:06

“without any recognition that these correlations are telling us more about the chosen sample size than the actual relationship.”

Well-said.


#8    2010/03/02 (Tue) @ 18:32

With that many player-seasons—3,700 of them—wouldn’t you expect that the regression equation would be at least as accurate as if Tom had used careers?  The r-squared would be lower, of course, but who cares about that?  It’s the equation he uses.

You’d be more accurate with the seasonal breakdown, because power varies from year to year, and the career number would obscure some of that relationship.  So you might get more randomness, not less.

Of course, there are problems: walks vary with age independently of power, and Tom didn’t control for that.  And there are other things you can think of.  But, as far as the equation goes, are you sure it’s less accurate than you’d get by using careers?  I don’t think it is.
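The two positions in #5 and #8 can be illustrated with a toy simulation. This is only a sketch under stated assumptions: every number in it (the ISO spread, the 0.25 slope, 500 PA per season, 10-season careers) is made up, ISO is treated as known without error, and binomial sampling noise in walk rate is approximated with a normal draw. It is not Tom's data or method. Under those assumptions the correlation rises with the larger sample while the fitted slope stays about the same.

```python
import math
import random


def _moments(xs, ys):
    """Return (Sxy, Sxx, Syy), the centered sums of products and squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    sxy, sxx, syy = _moments(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(xs, ys):
    sxy, sxx, _ = _moments(xs, ys)
    return sxy / sxx


def simulate(n_players=2000, pa_per_season=500, n_seasons=10, seed=1):
    """Compare one-season and career BB rates against ISO (all values hypothetical)."""
    rng = random.Random(seed)
    iso, bb_one, bb_career = [], [], []
    for _ in range(n_players):
        x = rng.gauss(0.150, 0.040)                  # assumed ISO talent
        p = 0.05 + 0.25 * x + rng.gauss(0.0, 0.010)  # assumed true BB rate
        p = min(max(p, 0.001), 0.5)
        # Normal approximation to binomial sampling noise in BB/PA.
        sd_one = math.sqrt(p * (1 - p) / pa_per_season)
        sd_career = sd_one / math.sqrt(n_seasons)
        iso.append(x)
        bb_one.append(p + rng.gauss(0.0, sd_one))
        bb_career.append(p + rng.gauss(0.0, sd_career))
    return {
        "r_one": pearson_r(iso, bb_one),
        "r_career": pearson_r(iso, bb_career),
        "slope_one": ols_slope(iso, bb_one),
        "slope_career": ols_slope(iso, bb_career),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

If ISO itself were measured with noise, as it is in a real single season, the one-season slope would also be pulled toward zero, so the "same equation" result depends on that simplification.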


#9    Guy    2010/03/02 (Tue) @ 18:45

Phil:  I think you’re right that the regression coefficient for ISO would remain the same.  But with career data he might have found that other factors also had a significant relationship with BB rate.  I’d be surprised if using BA or SLG didn’t improve predictive power (and might then also change the ISO coefficient). In any case, it would be worth looking at.


#10    2010/03/02 (Tue) @ 18:52

True: he might have found that whatever relationship he found was statistically significant for careers, but not for seasons.

In his defense, he did say he didn’t find a positive correlation (which means he DID find a negative one).  And he might have found that the equation he discovered showed such a small effect that it wasn’t worth correcting for.  I bet that’s what happened—if there was a huge negative relationship between BA and BB/PA, I bet he would have mentioned it.

That is, if you find that every .001 of BA increases your walks by .00001, who cares?  The relationship would be the same for careers, even if the r-squared came out higher. 

In any case, I’ll tell Tom we’re here and let him argue for himself.


#11    Tom H    2010/03/02 (Tue) @ 19:32

I agree that a single season does cause ‘fog’ problems; but as Phil stated, with this MANY seasons, the correlation should show through if it existed. Using career #s would make me wonder if the increase/decrease in power with age made us “lose” data that otherwise would be there; after all, didn’t Bonds walk more when his power went up?
Maybe using sets of 2 to 4 years of data would work.


#12    dave smyth    2010/03/02 (Tue) @ 19:36

If you’re trying to distill walking ability apart from the pitchers’ fear of throwing strikes to a power hitter, what about walks per ball thrown? Using the FanGraphs plate discipline data, you can figure balls thrown from (1-Z%) times pitches. After removing the IBB, HBP, SH stuff as desired, you can figure a good estimate of walks per ball. Of course, you can’t do that for Ted Williams or Max Bishop, since the data only goes back to 2002.

And maybe I missed it, but it doesn’t look like the author took out the IBBs.
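The walks-per-ball idea in #12 can be sketched as below. The function name, its parameters, and the sample numbers are hypothetical. Note that (1 - Z%) × pitches counts every pitch outside the zone, including ones the batter swung at, so it is only an estimate of balls. This version removes IBB from the walk total but leaves the pitch count unadjusted for IBB, HBP, and SH.

```python
def walks_per_ball(bb: int, ibb: int, pitches: int, zone_pct: float) -> float:
    """Estimate unintentional walks per out-of-zone pitch seen.

    bb       -- total walks (including intentional)
    ibb      -- intentional walks, removed from the numerator
    pitches  -- total pitches seen
    zone_pct -- share of pitches in the strike zone, as a fraction (0.50 = 50%)
    """
    if not 0.0 <= zone_pct < 1.0:
        raise ValueError("zone_pct must be a fraction in [0, 1)")
    balls = (1.0 - zone_pct) * pitches  # the (1-Z%) times pitches estimate
    if balls <= 0:
        raise ValueError("no out-of-zone pitches to divide by")
    return (bb - ibb) / balls


# Made-up season line: 80 BB, 10 IBB, 2,500 pitches seen, 50% in the zone.
print(walks_per_ball(bb=80, ibb=10, pitches=2500, zone_pct=0.50))
```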


#13    Guy    2010/03/02 (Tue) @ 20:59

Tom/Phil:  you’re right that if BA was a strong predictor, it should show up in the seasonal data.  But there’s another problem, which is selection bias:  a low-BA player is more likely to stay in the majors if he can draw walks, while a high-BA player can survive with or without this ability.  So even if pitchers do tend to miss the strike zone more when throwing to high-BA hitters, it might not show up in walk totals. 

One possible solution is to use PITCHf/x data, with the percentage of pitches in the strike zone as the dependent variable.  It would be interesting to see if anything other than ISO is predictive.


#14    2010/03/03 (Wed) @ 22:13

You might find this blog entry of mine interesting: “Which Players Had The Most Surprising Walk Rates? (Part 2)”

http://cybermetric.blogspot.com/2009/06/which-players-had-most-surprising-walk.html

I took ISO and era into account, but also stealing and height. I think the guys I found are similar to the guys Tom found to have the best eye.

