THE BOOK--Playing The Percentages In Baseball


Friday, February 17, 2012

It is human nature to attach too much significance to “consistency…”

By , 03:34 AM

Here is an example:

Which player is most likely to be a true .300 hitter? Assume that the player’s true talent never changes across AB and that there are no park or weather or any other effects (the observed BA are all on the same scale).

Player A hits .298 in 600 AB in one season.

Player B hits:

.294 in April in 100 AB.
.304 in May in 100 AB.
.301 in June in 100 AB.
.300 in July in 100 AB.
.288 in August in 100 AB.
.291 in September in 100 AB.

Player C hits:

.188 in April in 100 AB.
.216 in May in 100 AB.
.512 in June in 100 AB.
.087 in July in 100 AB.
.390 in August in 100 AB.
.401 in September in 100 AB.


#1    German dude      (see all posts) 2012/02/17 (Fri) @ 05:40

How do you hit .294 in 100 AB? Just kidding ;)


#2    MGL      (see all posts) 2012/02/17 (Fri) @ 06:21

He got 29.4 hits of course.


#3    Geoff Buchan      (see all posts) 2012/02/17 (Fri) @ 11:16

All three are about the same. But A averages .298, B averages .29633, and C averages .299, so if forced to choose, I’d say C is a little more likely.

If you consider batting average to be a normally distributed random variable around the “true” value, then the big unknown is what the variance is for hitters A, B, and C. Presumably C has the largest variance. Taken to extremes where A has 0 variance, and the computed variances of the samples for B and C match the actual variance, then we know A is definitely *NOT* a .300 hitter, B could be, but is a bit more likely not to be (low variance around the mean furthest below .300), and C is basically a 50/50 chance to be at or above .300 in “true” value (very high variance around a mean that is closest to .300).

This raises the interesting question of whether there is any “skill” to consistency, i.e. whether some players regularly show higher variance around their mean performance than others.


#4    pm      (see all posts) 2012/02/17 (Fri) @ 11:45

MGL or Tango, have you done any studies on batter hitting variance and determining whether there are guys who are more skilled at being consistent?


#5    Tangotiger      (see all posts) 2012/02/17 (Fri) @ 11:53

Variance can never be exactly 0, since you can’t get 0.298 hits in any one at bat (you either get a hit or not).

Nevertheless, the better question is not which is most likely to be the true .300 hitter, but which is the likely best hitter of the three.


#6    Geoff Buchan      (see all posts) 2012/02/17 (Fri) @ 12:03

Agreed on 0 variance in the real world. But in the Quantum Mechanical Field of Dreams…

As to the best hitter, well they’re all about the same - but if forced to answer, does anyone have a good reason to rank in an order other than the overall average (i.e. B < A < C)? If so, why?


#7    Mr. Red      (see all posts) 2012/02/17 (Fri) @ 12:30

I remember Bill James comparing the value of a high career peak vs. the value of career consistency in The Politics of Glory. He had an entire chapter devoted to this comparison using Don Drysdale and Milt Pappas (I believe) as examples of high peak and consistency respectively. He found that the Drysdale type player was more valuable. This finding might be outdated/disputed, but I seem to remember a Hardball Times article that supported this theory.

I wonder if this theory holds true in a single season. If you could add one of these three players to a pennant contender, would Player C make the team more likely to win the pennant than player A? Without any research, I would think that this effect would fall apart over 600 PAs or at least be masked by fluctuations in leverage beyond a player’s control, but maybe someone here has an answer.


#8    MGL      (see all posts) 2012/02/17 (Fri) @ 15:03

Remember I made the qualification that each batter’s true talent never changes! You can’t “force” variance if that is true. The only variance for each of these batters is going to be by chance, by (my) definition…


#9    pm      (see all posts) 2012/02/17 (Fri) @ 15:18

Wait are you saying that these batters don’t experience variance over a period of time?


#10    Geoff Buchan      (see all posts) 2012/02/17 (Fri) @ 16:31

Red/7 -

I actually see two related questions here:
1. Suppose I can choose between two players with the same expected WAR (to encompass all facets of a player’s contribution to winning). Does it matter if I pick the higher or lower variance player?

2. Suppose I can choose between two players who achieve the same total WAR in a season, but one has much higher variance in WAR computed per 10 game segment than the other. Does it matter?

For question 1, then you’re clearly better off taking the high variance player, and a simulation could confirm this. The key is that the winning team is already likely to be luckier than average in the year they win, and the high variance player offers more upside when you’re lucky. By contrast, if he’s having an off year, you were already less likely to be better than 29 other teams anyhow, so the better downside of the low variance player is less likely to matter.

For question 2, I don’t see so clear an answer, and indeed if anything it may be that the low variance player might be preferred. A season is 162 binary games, but now rather than needing to beat out the win total of the other 29 teams, you simply need to beat the one you’re playing in any given game. The higher variance player then becomes more of a liability, making it more likely you win by a lot when he’s performing above expectations, but also more likely to lose when he’s below. Again one could design a simulation to investigate this.
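A rough numerical check of the question 1 argument, offered as a sketch rather than a settled result. Instead of a full simulation it integrates directly, under assumptions that are not in the comment above: every team has the same true talent, each team's season win total varies around its expectation with a normal spread of `team_sd` wins, the extra player adds independent normal noise of `player_sd` wins, and "winning" means finishing with the best total of 30 teams. The function names and the 6-win figure are illustrative choices only.

```python
import math


def normal_pdf(x, sd):
    """Density at x of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def normal_cdf(x, sd):
    """Cumulative probability at x of a zero-mean normal with standard deviation sd."""
    return 0.5 * (1 + math.erf(x / (sd * math.sqrt(2))))


def pennant_probability(player_sd, team_sd=6.0, n_rivals=29, steps=20000):
    """Chance that our team's win total beats all n_rivals rivals.

    Assumed model (hypothetical): rivals' totals are normal with spread
    team_sd; ours has the same spread plus independent player noise.
    """
    total_sd = math.sqrt(team_sd ** 2 + player_sd ** 2)
    lo, hi = -10 * total_sd, 10 * total_sd
    dx = (hi - lo) / steps
    prob = 0.0
    # Midpoint rule: P(our total = x) * P(every rival finishes below x).
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        prob += normal_pdf(x, total_sd) * normal_cdf(x, team_sd) ** n_rivals * dx
    return prob


if __name__ == "__main__":
    for sd in (0.0, 1.0, 4.0):
        print(f"player_sd={sd}: P(best record) = {pennant_probability(sd):.4f}")
```

With no extra player noise the result should sit at 1 in 30, and it rises as `player_sd` grows, which is the direction argued above. The model says nothing about question 2, where the game-by-game structure matters.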


#11    Geoff Buchan      (see all posts) 2012/02/17 (Fri) @ 16:34

Oh, I should add that in practice you likely never have such a clear choice of equal expected value but different variance. You expect one player to have higher value, and I’d also guess that even a rather small difference in expected value should dominate over the incremental edge of different variance in some situations.


#12    B in DC      (see all posts) 2012/02/18 (Sat) @ 00:31

These guys get fractions of hits? Wild.


#13    WanderingWinder      (see all posts) 2012/02/18 (Sat) @ 00:51

Of course given the presumption that nobody has any variance in skill, it’s just who has the best average here, regardless of how they got it. That’s sort of a circular argument, though, because in assuming no variance, you’re assuming that there’s no difference in consistency either (in regards to future forecasts if not past results) except the slight difference you get from having the slightly different rates. So of course, you find that consistency has no effect, because there’s no difference in consistency.
Where consistency comes into play is when players have different variances. I should say ‘if’ they do, because I don’t know if we can predict that, or if they even do have significantly different variances. I do know that if we CAN predict that, it’s a big finding. Here’s why:
Because baseball is non-linear, teams with higher variances in basic batting rate stats score more runs overall.
Because all that matters is winning and losing and not how much you win by, being consistent in how many runs you score offensively has a significant impact in letting you win games (though still not as important as the number of runs you score).
Among teams which score the same number of runs (probably I should say have the same wOBA*PA, because I’m ignoring significant differences in baserunning here; though I didn’t directly check wOBA*PA in any of my research on this particular subject, it ought to apply), those which have higher SLG are more consistent in scoring runs than those with higher OBP.
Whether these last two points mean that teams with higher SLGs and similar wOBA*PA do better than teams with higher OBPs is actually dependent on the exact distribution and run-scoring level you have, because the higher OBPs are still going to give you more runs total, generally, due to the non-linearity. Consistency is generally more important at lower run-scoring levels. And there are some specific interactions and lineup balance issues that can have some effect as well.


#14          (see all posts) 2012/02/18 (Sat) @ 01:30

Yes, I said (in my original post) that none of these players’ true talent changes from PA to PA!  So, any and all of the variability you see is by chance alone!

The point I was trying to make is this:

“Of course given the presumption that nobody has any variance in skill, it’s just who has the best average here, regardless of how they got it.”

Thank you Mister Winder!

Many people think that Player B is more likely to be close to a .300 hitter, even though the answer is C, because C’s overall BA is closer to .300 than A’s or B’s.

A completely separate issue is whether players can have significantly different variances in true talent. I don’t know the answer to that.

I do strongly suspect that if these were real live players, most of the difference in variability you see between Player B and Player C would be due to chance.  IOW, people make WAY too much of observed variability and the “timing” of performance (as in, “Yeah, but look at how he did the last month of the season, or in the second half...”).
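A small sketch of the arithmetic behind this point, using only the numbers in the original post. Two things are assumptions added for illustration: a flat prior on true BA (so the players are compared by posterior density at .300), and the helper names themselves. Because true talent is fixed, only total hits and total AB enter the calculation; the monthly splits drop out.

```python
import math

# (batting average, at bats) splits exactly as given in the post.
SPLITS = {
    "A": [(0.298, 600)],
    "B": [(0.294, 100), (0.304, 100), (0.301, 100),
          (0.300, 100), (0.288, 100), (0.291, 100)],
    "C": [(0.188, 100), (0.216, 100), (0.512, 100),
          (0.087, 100), (0.390, 100), (0.401, 100)],
}


def totals(splits):
    """Pool the splits into total (possibly fractional) hits and at bats."""
    hits = sum(ba * ab for ba, ab in splits)
    at_bats = sum(ab for _, ab in splits)
    return hits, at_bats


def log_posterior_density(p, hits, at_bats):
    """Log density at p of Beta(hits + 1, at_bats - hits + 1).

    This is the posterior for true BA under an assumed flat prior.
    """
    a = hits + 1.0
    b = at_bats - hits + 1.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_norm


def chance_sd(p, at_bats):
    """Standard deviation of observed BA from chance alone at true BA p."""
    return math.sqrt(p * (1 - p) / at_bats)


def rank_players(target=0.300):
    """Players ordered from most to least plausible at the target true BA."""
    scores = {}
    for name, splits in SPLITS.items():
        hits, at_bats = totals(splits)
        scores[name] = log_posterior_density(target, hits, at_bats)
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for name, splits in SPLITS.items():
        hits, at_bats = totals(splits)
        print(name, round(hits / at_bats, 4))
    print("ranking at .300:", rank_players())
    print("chance SD of a 100-AB month:", round(chance_sd(0.300, 100), 4))
```

The last line is a useful yardstick: a true .300 hitter's 100-AB months vary by roughly 46 points of BA from chance alone, so Player B's months are tighter than chance would usually produce and Player C's are far wider, yet under the stated assumption neither pattern carries any information beyond the season totals.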


#15          (see all posts) 2012/02/18 (Sat) @ 02:56

“These guys get fractions of hits?”

In my world, absolutely!


#16          (see all posts) 2012/02/18 (Sat) @ 16:12

IMO, fans put more significance in the extremes whereas managers prefer consistency.

The extremes really stick out in our minds, such as any time a batter hits 15 homers in a month or bats .450 for a month (or some super lefty goes 6-0 with a 0.34 ERA in a month).

Managers prefer consistency because it makes managing “easier” or more comfortable.

When I see situations like these, I always think of cases like the one where Theriot had 4 hits in a playoff game and then didn’t play in the next 2 games.

----------------------------

Didn’t we have a good thread here about whether you’d want an inconsistent pitcher over a consistent pitcher? I recall the conclusion (hopefully with accuracy) that the occasionally great, occasionally horrible pitcher actually wins more games than the pitcher that is “steady eddie”. IIRC, the idea was that anytime they allow less than 2 runs or whatever the case may be, it was essentially a win almost by what the pitcher has done “alone”, but the team would also win some of the games where they were bad. A pitcher that was more “steady” (perhaps more 6IP 3ER QS’s) would have a W-L record (or his team would have a W-L record) more toward .500 than the other situation.

I hope I’m accurate with my recollection.

I think this was in response to discussion on QS% and/or pitchers like Edwin Jackson who had/have a rep as being hot/cold.


#17    DavidS      (see all posts) 2012/02/19 (Sun) @ 18:02

@16 - whether you want consistency in a pitcher is driven by his expected level of performance.  Generally speaking, if you have a good pitcher you are better off if he is consistency, and if you have a bad one, you’d prefer the variance.

Here’s an example: Since runs scored should roughly follow a Poisson distribution, the plot of win expectancy vs. runs allowed is roughly 1 - the cumulative Poisson distribution.  (Essentially, a backwards S curve.) The second derivative of this function is negative up until the average runs scored and positive thereafter.  This means that if you look at win expectancy for allowing 0, 1, and 2 runs, the average of the WE for 0 and 2 is lower than the WE for 1.
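The curvature argument can be checked with a few lines of Python. This is a minimal sketch that follows the comment's own simplification (win expectancy is 1 minus the cumulative Poisson, so ties are ignored); the 4.5 runs per game average and the function names are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k runs under a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def win_expectancy(runs_allowed, mean_scored=4.5):
    """P(own offense scores more than runs_allowed); ties are ignored."""
    cdf = sum(poisson_pmf(k, mean_scored) for k in range(runs_allowed + 1))
    return 1.0 - cdf


def consistency_gain(center, mean_scored=4.5):
    """WE from always allowing `center` runs, minus WE from splitting
    starts evenly between center - 1 and center + 1 runs allowed."""
    steady = win_expectancy(center, mean_scored)
    streaky = 0.5 * (win_expectancy(center - 1, mean_scored)
                     + win_expectancy(center + 1, mean_scored))
    return steady - streaky


if __name__ == "__main__":
    for center in (1, 2, 4, 5, 7):
        print(center, round(consistency_gain(center), 4))
```

A positive `consistency_gain` means the steady pitcher wins more often. Under these assumptions it is positive for a pitcher centered well below the average runs scored and negative for one centered well above it, which matches the good-pitcher and bad-pitcher cases described above.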


#18    DavidS      (see all posts) 2012/02/19 (Sun) @ 18:04

sorry, that should read “better off if he is consistent”

