THE BOOK--Playing The Percentages In Baseball

Friday, February 26, 2010

Robinson Cano’s batting spot

By Tangotiger, 09:27 AM

John Walsh and Steven Goldman.

Two points:
1. When you see an observed split that is statistically significant at p < whatever-number, that does NOT mean that the observed split is significant at that level.  It means that the NON-ZERO difference is significant at that level.  So, if let’s say you have a .400 wOBA in one split and .300 in the other, and it’s significant at p=.04, then the true splits could very well be .335 and .334.  It’s non-zero.

2. Other than leadoff hitter, if you can find me a one-position switch of a .300 to .360 wOBA hitter that will cause more than a 10-run difference, I will be impressed.
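
To make point 1 concrete, here is a minimal sketch of a two-proportion z-test next to a confidence interval on the same data. The counts are hypothetical (80 successes in 200 chances against 60 in 200), and plain success rates stand in for wOBA, which is a weighted measure rather than a simple proportion. The function name `two_proportion_summary` is also just an illustration.

```python
import math

def two_proportion_summary(x1, n1, x2, n2, z_crit=1.96):
    """Return (difference, two-sided p-value, CI low, CI high) for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Pooled standard error: used to test "the true difference is zero".
    pooled = (x1 + x2) / (n1 + n2)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_null
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Unpooled standard error: used for the interval around the observed difference.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, p_value, diff - z_crit * se, diff + z_crit * se

# Hypothetical counts: a .400 rate versus a .300 rate, 200 chances each.
diff, p_value, lo, hi = two_proportion_summary(80, 200, 60, 200)
print(f"observed difference: {diff:.3f}")
print(f"p-value for a zero difference: {p_value:.3f}")
print(f"95% interval for the true difference: ({lo:.3f}, {hi:.3f})")
```

With these made-up counts the p-value comes out near .04, yet the interval runs from well under 10 points to nearly 200 points. The test only rules out zero; it says very little about the size of the gap.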


#1          (see all posts) 2010/02/26 (Fri) @ 11:08

Tango, could you give me the “talking to a 5 year old” version of your first point.  Are you basically saying that Robbie Cano’s true talent split w/RISP versus no runners on isn’t nearly as dramatic as his performance to date would indicate?


#2    Ken      (see all posts) 2010/02/26 (Fri) @ 11:31

As tango pointed out, the true talent difference is statistically likely to be non-zero - but isn’t necessarily the 51-point difference that we see in the statistics. That said, it might be a 10-point difference, and it might be an 80-point difference. (Given the comment about a 0.5% chance of being zero - it is also very unlikely to be 1%.)

One more time when it would be nice if people would report confidence intervals on their analysis.


#3          (see all posts) 2010/02/26 (Fri) @ 12:41

For the batting averages listed (the 51-point difference mentioned), the 95% confidence interval on the observed difference in proportions is (0.017, 0.085) - so between 17 and 85 points - using the Agresti/Caffo interval (difference in population proportions).

This, of course, does not take into account quality of pitcher differences between the ‘groups’, etc, but at least puts the non-zero difference in some perspective.
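
For readers who want to try this themselves, here is a minimal sketch of the Agresti/Caffo interval for a difference in proportions: add one success and one failure to each sample, then compute the usual Wald interval on the adjusted counts. The hit and at-bat totals below are hypothetical, since the actual counts are not given in this thread, so the output is not expected to reproduce the interval quoted above.

```python
import math

def agresti_caffo_interval(x1, n1, x2, n2, z_crit=1.96):
    """Agresti/Caffo interval for p1 - p2 (z_crit=1.96 gives roughly 95%)."""
    # Adjusted proportions: one extra success and one extra failure per sample.
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    # Wald standard error computed on the adjusted sample sizes.
    se = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Hypothetical counts: 512 hits in 1600 AB (.320) versus 215 hits in 800 AB (.269).
lo, hi = agresti_caffo_interval(512, 1600, 215, 800)
print(f"95% interval for the difference: ({lo:.3f}, {hi:.3f})")
```

The width of the interval depends heavily on the smaller sample, which is why the actual at-bat totals matter as much as the size of the observed gap.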


#4          (see all posts) 2010/02/26 (Fri) @ 14:19

... and the CI for OBP is essentially the same - between 18 and 85 points.


#5    John Walsh      (see all posts) 2010/02/26 (Fri) @ 17:26

Tango,

Regarding 1), yes, of course.  I tried to state that explicitly, but perhaps I didn’t get my point across:

For example, the probability that the 51 point difference in OBP is just a statistical fluctuation (i.e. that Cano’s true OBP in the two situations are the same) is about 0.5%.

Actually, the first part of the statement is vague, but I tried to clear it up in the parenthetical remark.


#6    tangotiger      (see all posts) 2010/02/26 (Fri) @ 18:17

I was thinking of something else you wrote earlier in the post.  I agree that showing the confidence intervals (not just you, but EVERYONE) would simply make everything clearer.


#7    MGL      (see all posts) 2010/02/26 (Fri) @ 19:52

Disclaimer:  I have not read the articles.

Ok.

These are all (most of them at least) Bayesian problems, and we must know the prior (the “a priori”) before we run or analyze tests of significance.

We flip a coin from our pocket and come up with 10 heads in 10 flips.  Is there any significant chance that the true rate of heads for that coin is more than .5?  No!  Why?  Because 99.999999999999% of all coins from my pocket are pretty much fair.
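
The coin argument can be sketched with Bayes' rule. The assumptions here are mine, for illustration only: every coin is either fair or two-headed, and the share of unfair coins is 1 in 10^14 (one minus the 99.999999999999% figure above). The function name `posterior_biased` is hypothetical.

```python
def posterior_biased(prior_biased, heads, flips,
                     p_heads_biased=1.0, p_heads_fair=0.5):
    """Posterior probability that the coin is the biased kind, given the flips."""
    tails = flips - heads
    # Likelihood of the observed sequence under each kind of coin.
    like_biased = p_heads_biased ** heads * (1 - p_heads_biased) ** tails
    like_fair = p_heads_fair ** heads * (1 - p_heads_fair) ** tails
    numerator = prior_biased * like_biased
    return numerator / (numerator + (1 - prior_biased) * like_fair)

# Assumed prior: 1 coin in 10^14 is two-headed; observed 10 heads in 10 flips.
print(posterior_biased(1e-14, heads=10, flips=10))
# Same data with a 50/50 prior, for contrast.
print(posterior_biased(0.5, heads=10, flips=10))
```

Ten straight heads is about 1,000 times more likely from a two-headed coin than from a fair one, but that factor barely dents a prior of 1 in 10^14. With a 50/50 prior the same flips would be nearly conclusive, which is the whole point: the prior decides how to read the evidence.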

If Cano’s RISP splits are very large and that gap is statistically significant, but upon testing all other players we find that there is little or no RISP split “skill” (which is the equivalent of testing thousands of coins and finding out that they are all pretty much fair), guess what?  The large split we find for Cano means nothing.  We KNOW that it is pretty much a fluke, no matter how large.

There is some small chance that while we cannot find any observable skill among the population that there is indeed a small skill, in which case Cano’s split gets regressed a lot, but not quite 100%.
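
Regressing a split "a lot, but not quite 100%" can be sketched as simple shrinkage toward a population mean of zero. The numbers are hypothetical: a true-talent spread of 5 points across the league and random noise of 20 points in one player's observed split. The formula assumes a normal model for both, which is a simplification.

```python
def regress_split(observed_split, sd_true, sd_noise):
    """Shrink an observed split toward zero by its estimated reliability."""
    # Reliability: share of the observed variance that reflects real skill.
    reliability = sd_true ** 2 / (sd_true ** 2 + sd_noise ** 2)
    return reliability * observed_split

# Hypothetical spreads: 5 points of true talent, 20 points of random noise.
estimate = regress_split(0.051, sd_true=0.005, sd_noise=0.020)
print(f"regressed split: {estimate:.4f}")
```

Under these assumed spreads, a 51-point observed split shrinks to about 3 points. If the true-talent spread were exactly zero, the estimate would be zero no matter how large the observed split.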

There is some chance that a few players in the population have a large skill, but the percentage of these players is so small that it is barely noticeable.  If that is the case, that still won’t change our answer (that most of the observed split is likely a fluke) since the chance that Cano is one of these players (with a large skill) is much smaller than the chance that he is NOT one of these players and that his sample split is a fluke.

That last point is like when a person not in a high risk category for contracting AIDS gets a positive AIDS test.  He is still like a 10-1 or 100-1 dog (not likely) to have AIDS.
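
The medical-test analogy works out the same way. The figures below are assumptions chosen for illustration, not real test characteristics: a prevalence of 1 in 10,000 in the low-risk group, a test that catches 99% of true cases, and a false-positive rate of 0.1%.

```python
def prob_condition_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(condition | positive test) from Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# All three inputs are hypothetical.
p = prob_condition_given_positive(prevalence=0.0001,
                                  sensitivity=0.99,
                                  false_positive_rate=0.001)
print(f"probability given a positive test: {p:.3f}")
print(f"odds against: about {(1 - p) / p:.0f} to 1")
```

Under these assumptions a positive result still leaves the person roughly a 10-1 underdog to have the condition, because false positives from the large healthy group outnumber true positives from the tiny affected group.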

Getting back to the batting order thing. Managers and commentators spend WAY too much time thinking and talking about batting orders, especially “true lead-off hitters.” Construct a good team, even if that means 8 middle of the order guys, and then put them in a reasonable batting order.  End of story.

Does any manager or GM seriously think that a lineup of 8 or 9 Mannys or Mauers, even if you have to put one of them at leadoff, will not score a ton of runs?

