THE BOOK--Playing The Percentages In Baseball


Monday, October 04, 2010

Phil nails it again about .300 hitters

By Tangotiger, 12:21 PM

Read.

Now, how would you do this study?  Very very easy.  Look at all “last TEAM game of season”, and select each PA where the batter entered with a batting average where the next hit would put him at .2995 or above.  And just count if he did get a hit or not.
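A minimal sketch of that count, in Python.  The record layout (hits and at-bats before the PA, plus whether the PA produced a hit) and the function names are assumptions for illustration, not taken from any particular database.  It also reads "on the cusp" as "currently below .2995", and ignores walks and other non-AB outcomes.

```python
from typing import Iterable, Tuple


def next_hit_reaches_300(hits: int, at_bats: int) -> bool:
    """True if the batter is below .2995 now and one more hit lifts him to .2995 or above."""
    # Integer arithmetic avoids floating-point trouble right at the threshold.
    below_now = 10000 * hits < 2995 * at_bats
    at_or_above_after = 10000 * (hits + 1) >= 2995 * (at_bats + 1)
    return below_now and at_or_above_after


def tally(pas: Iterable[Tuple[int, int, bool]]) -> Tuple[int, int]:
    """Return (hits, at_bats) over the qualifying at-bats.

    Each record is (hits_before, at_bats_before, got_hit).  Selection uses only
    the first two fields, which are known before the PA takes place.
    """
    hits = n = 0
    for h, ab, got_hit in pas:
        if next_hit_reaches_300(h, ab):
            n += 1
            hits += int(got_hit)
    return hits, n


# Hypothetical records, for illustration only.
sample = [(179, 599, True), (179, 599, False), (150, 600, True), (200, 600, False)]
print(tally(sample))
```

The outcome field is read only after a PA has already been selected, which is the whole point of the design.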

You cannot, as Phil explains so clearly, start with the batter’s last PA of the season as if that were some random data point.  It was his last PA of the season precisely because he may have gotten a hit on that last PA and then stopped playing.  Never start with the end result, and presume that there’s no bias.  This is probably the biggest mistake made in baseball research (academic or otherwise).

This is very popular with the “at count”.  “Hey look, batters who end their PA when they were at 0 balls and 2 strikes got a lot more strikeouts than those who did not end their PA when they had 0 balls and 2 strikes.” That’s because if they ended their PA, they got a strike most of the time.  If they didn’t end their PA, they got a ball most of the time.  So, yeah, the best way to not strike out on 0-2 is to make sure the pitcher throws you a called ball.

Same kind of thing happens all the time in research, and I would bet Phil is right that the researchers introduced a bias.

Listen up, all researchers: if it doesn’t pass the sniff test, your result is almost necessarily wrong.  It points to a methodology issue.  We would have been happy to see these hitters hit .320 or .330.  Maybe .350?  That starts to stretch it.  But .463?  You made a mistake, or your sample size is terribly low.


#1          (see all posts) 2010/10/04 (Mon) @ 12:42

Another way to put it:

Select only on criteria that are known BEFORE the trial.  NEVER select on criteria that are not known until AFTER the trial.

So you don’t select on “last PA” because, at the time the guy bats, you don’t know whether or not it’s his last PA.

And, to use Tango’s example, you don’t select on “ABs ending at 0-2” because you don’t know the AB actually ended until AFTER the 0-2 pitch.

Simple as that.


#2          (see all posts) 2010/10/04 (Mon) @ 12:56

I saw the study, and as Tango says, it didn’t pass the “sniff test” for me either.  If some group, or individual, can hit .463 when they try really hard, then you’d think they could muster most of that effort on a daily basis, hit .425 every year, and be the highest paid player in baseball.  I spent a couple minutes but just couldn’t figure out what was wrong with it.  Thanks Phil for clarifying.


#3          (see all posts) 2010/10/04 (Mon) @ 12:57

My best guess, and I think this could explain maybe 30 points of BA, is that there is a much higher likelihood of late September pitchers being low-quality callups or guys coming back from injury to try to get a few MLB starts in before the season ends.


#4          (see all posts) 2010/10/04 (Mon) @ 12:58

Phrasing it another way: you have to decide if the event is going to be included in your sample BEFORE IT HAPPENS.  If you only decide after it happens, you’re doing selective sampling (AKA “cherry picking”), and your study is not valid.

David Ortiz steps up to bat in game 162.  You ask the authors, “is this PA going to be in your sample?” They say, “um, I don’t know ... only if, in retrospect, it turns out to be his last one.”

VIOLATION!  You have to decide if it counts BEFORE it happens, not after.


#5          (see all posts) 2010/10/04 (Mon) @ 13:00

A commenter at my blog, David N., pointed to a version of the actual study:

http://faculty.chicagobooth.edu/devin.pope/research/pdf/new%20and%20improved%20round%20numbers.pdf


#6    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 13:11

I’ll give you another example of where it’s ludicrous:

how do relievers do when they get 3 outs in the ninth inning?

Well, guess what, if it’s the bottom of the 9th and they got three outs, they likely did great.  Because if they blew it, they would only have gotten 0, 1, or 2 outs to end the game.

***

What do you think is the ERA in games in which the pitcher gets a win?  A loss?  Yeah, so the best way to get a low ERA is to get a win!

***

Right, what Phil said: select your samples without knowing the outcomes.  And THEN look at the outcomes.


#7          (see all posts) 2010/10/04 (Mon) @ 13:11

Has anyone ever read Rob Neyer’s interview with Wes Parker?  I can’t find it in the archives anymore.

Wes Parker was a career .267 hitter.  (Some of that’s from playing with the Dodgers in the 1960s dead ball era.) In 1970, at Age 30, he played in a career high 161 games (he was frequently injured).  He hit a career high .319 (next best .279) with 47 doubles (next best 24) and various other career highs.

He attributed the career year to disconnecting his phone, quitting drinking, and not going out on dates.  He said he was so miserable that he never did it again.

Back when the incentives were much less - 30 or 40 years ago much more so than now - this element of performance was a very real thing.  So players who got a good night’s sleep and laid off the booze had an advantage against everyone else who didn’t care.  Still, .463 doesn’t pass the sniff test.


#8          (see all posts) 2010/10/04 (Mon) @ 13:15

The study itself confirms a strong “benching” effect.  Updated my post as follows:

----

If you look at the study, the authors actually show evidence that pinch-hitting is the cause!  However, they didn’t get the significance of that data.

In their “last scheduled plate appearance of the season”, the average batter was pinch hit for 7 percent of the time.

But batters with a .298 or .299 average were pinch hit for only 4.1 percent of the time.  Batters with .300 or .301 were pinch hit for 19.7 percent of the time.

And, most importantly, batters hitting exactly .300 were pinch hit for 34.3% of the time!

That basically confirms that the authors’ results are likely to be the result of cherry-picking.  If you’re hitting .299, you get a chance to get a hit in your last AB to jump to .300.  But if you’re already hitting .300, you often don’t get a chance to drop back to .299, getting an out in your last AB.

You know how when you’re looking for something you lost, you always find it in the last place you look?  Well, the same thing applies here.  When you’re looking for .300, you find it with a hit in the last AB you take.
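A small simulation can illustrate how large this benching effect could be with no change in ability at all.  The model below is a deliberate simplification and every parameter is an assumption: each batter is a true .300 hitter, enters the final game one hit short of .300, gets at most four PAs, every PA is an at-bat, and he is pulled the moment he gets a hit.

```python
import random


def simulate(n_batters=20000, true_avg=0.300, max_pa=4, seed=1):
    """Return (first-PA average, last-PA average) under a stop-after-a-hit rule."""
    rng = random.Random(seed)
    first_hits = last_hits = 0
    for _ in range(n_batters):
        results = []
        for _ in range(max_pa):
            hit = rng.random() < true_avg
            results.append(hit)
            if hit:  # he has his .300, so he is benched
                break
        first_hits += results[0]   # selected before the outcome is known
        last_hits += results[-1]   # selected only in retrospect
    return first_hits / n_batters, last_hits / n_batters


first_avg, last_avg = simulate()
print(f"first PA of the game: {first_avg:.3f}")
print(f"last PA of the season: {last_avg:.3f}")
```

Under these assumptions the last PA is a hit unless all four PAs are outs, so its expected "average" is 1 - 0.7^4, about .760, while the first PA stays at about .300.  The real effect would be smaller, since not every batter is benched after the hit.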


#9          (see all posts) 2010/10/04 (Mon) @ 14:20

It all reminds me of John Kruk, retiring right after he reaches a .300 AVG.


#10    Apologist      (see all posts) 2010/10/04 (Mon) @ 14:35

After reading the study posted, let me counter the conclusion that seems to have been reached here.

First of all, the title of the paper is “Round Numbers as Goals”.  Thus, the paper is arguing that people try to reach round numbers. 

They have clearly shown that this is true.  Baseball players do things that cause them to end the season at .300 much more often than at .299. 

I don’t think the study writers really care whether this is due to substitutions, increased effort, etc.  In fact, the authors rightly point to 3 mechanisms in the paper (increased hit rate, decreased walk rate for .299 players, and increased pinch hit rate for .300 players).

I think the study is accurate in that it shows that players do what it takes to get to .300.  While this appears to be largely driven by careful substitutions, they show that this isn’t the entire story.  For example, the fact that nobody who is batting .299 walks is clearly evidence on top of just pinch hitting that they are using a combination of tools to get to .300 by the end of the season.


#11    BrianK      (see all posts) 2010/10/04 (Mon) @ 14:37

@Phil #8:
That was absolutely the first thing that stood out for me. In fact, I think the Cubs sat S.Castro the last day of the season in order to preserve his .300 average causing Ted Williams to roll over in his, um, freezer or something.

Why not look at batters who were hitting .300 or .301 in their last at bat? They have just as much incentive to get a hit as the guy with the .299 average. You’d still have some bias in the study of course...but it’s the same idea.


#12    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 14:47

Apologist:

So the goal of reaching .300 is best reached by not doing anything once you reach that goal.


#13    Guy      (see all posts) 2010/10/04 (Mon) @ 14:59

The authors were aware of the potential bias of hitters sitting out.  They looked at all “scheduled” final PAs, and included them in the denominator even if the player walked or didn’t hit at all. 

However, the problem is that all hitters who stop after reaching .300 won’t have another “scheduled” PA.  They might be removed immediately for a pinchrunner, or in the next half inning for a defensive replacement. Or they might reach .300 in game 161, and not play at all in game 162.  So I’m sure that bias does explain most or all of the extreme performance for .299 hitters.  The problem is that the authors didn’t have the requisite baseball knowledge (where have we seen that before?) to see all the forms that selection bias could take.

The fact that there is no similar performance for hitters already at .300 should have been a tip-off to the authors.  A failure to get a hit there could lower the season average to .299, so the incentive to succeed is just as great. 

Very disappointing to me that Alan S. at the NYT didn’t immediately see the problem.  I mean, .463?


#14    Apologist      (see all posts) 2010/10/04 (Mon) @ 15:01

@Tangotiger

I’m not sure if that was meant to be sarcastic or not, but the answer is “yes”.

By the way, I think the logic given here for why we should discount the incredibly high hit rate found is wrong as well…


#15    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 15:07

I’m not sure if I meant it sarcastically or not either.

As for your conclusion that we are wrong: make your case.


#16    Guy      (see all posts) 2010/10/04 (Mon) @ 15:15

Apologist:  you are correct that the authors point to some mechanisms separate from motivation to hit.  And as best I can tell, they never even report a batting average for the .299 hitters (their metric is hits per scheduled PA), so Schwarz did them a disservice by hyping their findings. 

BUT, the authors clearly believe they have found a “motivational” factor as well, in the form of the huge surge in hits among .299 hitters.  And it is virtually certain that this is simply an error, caused by their failing to understand all the ways in which a .300 hitter can avoid further PAs. 

BTW, didn’t Bill James do a study of pitchers reaching 20 wins that raised some of the same issues?


#17    Apologist      (see all posts) 2010/10/04 (Mon) @ 15:15

Let me suggest two things (perhaps for starters).

1.  In response to guy’s comment that the study writers find no effect for .300 batters, that is wrong. 

Their figures are really weird to a baseball fan, but they are not plotting batting average.  They are plotting hit rate/scheduled plate appearances.

Thus the denominator is not outs + hits, but is actually outs + hits + walks + substitutions.

If you go through and do the proper division (taking into consideration the number of .300 players that didn’t even get a chance to hit due to substitutions), you actually find that the .300 players had an even HIGHER batting average than the .299 guys.

This goes against guy’s direct argument and also goes against the selection story.

2.  Regarding the selection story directly… 2 things.  First, the authors seem to have taken the selection effects into account (by looking at potential plate appearances), so I don’t think there are selection problems.

But second, the arguments made here have done the following thought experiment:

Imagine a .299 batter who hits the ball (reaches .300) and then decides to stop batting.  This might make it so that the “last” plate appearance would have a lot of hits.

But, think of the similar but slightly different story:

Imagine a .301 batter who misses the ball (and falls to .300) and then decides to stop batting.  This selection works in the opposite direction and should be just as prevalent in the data.

Thus, even if the authors did not take selection into account (which I think they did), there is not an a priori reason to expect selection problems.

Couple that with the great success of the .300 batters (which I indicated in point #1) and I would argue that the study has not been too readily dismissed.
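To make the denominator distinction in point 1 concrete, here is a short sketch comparing the two rates.  The counts are invented for illustration and are not from the paper; the function names are likewise assumptions.

```python
def batting_average(hits: int, outs: int) -> float:
    """Conventional average: hits over at-bats (hits + outs)."""
    return hits / (hits + outs)


def hits_per_scheduled_pa(hits: int, outs: int, walks: int, substitutions: int) -> float:
    """Hits over every scheduled PA, including walks and PAs lost to a substitution."""
    return hits / (hits + outs + walks + substitutions)


# Hypothetical group of 100 scheduled PAs, half of them lost to substitutions.
hits, outs, walks, subs = 20, 25, 5, 50

print(f"batting average:       {batting_average(hits, outs):.3f}")
print(f"hits per scheduled PA: {hits_per_scheduled_pa(hits, outs, walks, subs):.3f}")
```

With these made-up counts the same group shows .444 by batting average and .200 by hits per scheduled PA, so the two metrics cannot be compared across groups that are substituted for at very different rates.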


#18    Guy      (see all posts) 2010/10/04 (Mon) @ 15:32

Apologist:
Yes, the .300-hitter results are complicated by the authors’ odd denominator, and so you are right that they can’t be directly compared to the .299 result.  Many of the .300 hitters sit out, while those who don’t are hitting over .400.  But, those who hit at .300 have all the same selection bias issues I raised earlier:  many .300 hitters who then get a hit may be removed prior to their lineup position coming up again.  Or this final PA may not even occur in the season’s final game! 

Now, if you look at players hitting .299 who have their final PA in the 8th or 9th inning of the final game of the season, and they hit over .400 in that final PA, I’ll be impressed.

Your point about the .301 hitters falling to .300 is simply mistaken.  The authors select players based on their BA prior to their last scheduled PA.  So the hitters you describe will be in the .301 bin, not the .300 bin.


#19    Guy      (see all posts) 2010/10/04 (Mon) @ 15:34

"I would argue that the study has not been too readily dismissed.”

On that point, we can agree.....


#20    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 15:44

"who have their final PA”

that should be NEXT PA, not final PA.

So, your selection criteria are these:
- all players
- who played in their team’s last game
- who came to bat in any inning starting from the 8th inning
- who when they came to bat were one hit away from reaching .2995 to .3005

So, give me those numbers.  Tell me how many times they came to bat, how many walks, hit batters, and hits.

And run a second set of players, where for that last condition, they were one out away from dropping out of .2995 to .3005.  Show me those numbers.
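The two samples described above could be expressed as a pair of filters.  This is only a sketch: the `PlateAppearance` fields are hypothetical rather than actual RetroDB columns, and the reading of "one hit away" and "one out away" as crossing the .2995 boundary is an interpretation of the criteria, not a confirmed specification.

```python
from dataclasses import dataclass

LOW, HIGH = 0.2995, 0.3005


@dataclass
class PlateAppearance:
    hits_before: int        # season hits before this PA
    at_bats_before: int     # season at-bats before this PA
    inning: int
    is_team_final_game: bool


def in_band(hits: int, at_bats: int) -> bool:
    return at_bats > 0 and LOW <= hits / at_bats < HIGH


def eligible(pa: PlateAppearance) -> bool:
    # Everything checked here is known before the pitch is thrown.
    return pa.is_team_final_game and pa.inning >= 8


def one_hit_from_band(pa: PlateAppearance) -> bool:
    h, ab = pa.hits_before, pa.at_bats_before
    return eligible(pa) and ab > 0 and h / ab < LOW and in_band(h + 1, ab + 1)


def one_out_from_leaving_band(pa: PlateAppearance) -> bool:
    h, ab = pa.hits_before, pa.at_bats_before
    return eligible(pa) and in_band(h, ab) and h / (ab + 1) < LOW


chasing = PlateAppearance(179, 599, 8, True)     # .2988, a hit makes it .3000
protecting = PlateAppearance(150, 500, 9, True)  # .3000, an out makes it .2994
print(one_hit_from_band(chasing), one_out_from_leaving_band(protecting))
```

Neither filter looks at what happened in the PA or afterwards, so walks, hit batters, and hits can be counted for each sample without selecting on the outcome.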

Some aspiring saberist should crank their RetroDB and have fame and fortune awaiting you…


#21    Ryan JL      (see all posts) 2010/10/04 (Mon) @ 15:52

Some aspiring saberist should crank their RetroDB and have fame and fortune awaiting you…

Drat...curse this day job!


#22    BrianK      (see all posts) 2010/10/04 (Mon) @ 15:55

@Tango:

I’d also want to see a control group. Perhaps batting averages are higher for all batters in their last at bat of the season.

...which would be an interesting result as well.


#23    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 16:00

Again, please, never use the word “LAST”.  It should be “NEXT”.  You can’t start by looking at it backwards.  This is the entire point of this thread.


#24    Guy      (see all posts) 2010/10/04 (Mon) @ 16:03

"who have their final PA”
“that should be NEXT PA, not final PA.”

OK, I suppose we have to include the players who played extra-inning games or whose teams played a 163rd game that counted for season stats.  Probably not a huge factor, but I take the point.

By the way, it may be some consolation to the study authors to know that Bill James made a similar mistake when he discovered there were “too many” 20 game winners.  Phil took the trouble to break it down, and found it was mainly a question of pitchers getting extra opportunities, not actually pitching better:  http://sabermetricresearch.blogspot.com/2008/03/players-being-clutch-when-targeting-20.html.


#25    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 16:06

Right, it’s not a question of motivation to perform better, but a motivation to get more (or fewer) opportunities to achieve the goal.

It’s like picking your opponent, and then finding out that boxers who pick their opponents win 99% of the time.


#26    BrianK      (see all posts) 2010/10/04 (Mon) @ 16:12

Yes yes...next. Should have proofread.


#27    bowie      (see all posts) 2010/10/04 (Mon) @ 16:44

Thanks for posting this. This is why I come to this site. I read that NYT article yesterday morning and totally missed the selection bias problem. There was something about it that I felt didn’t add up but I couldn’t put my finger on it.


#28    Apologist      (see all posts) 2010/10/04 (Mon) @ 16:45

Where to start…

First of all, I agree with Tango that if the numbers in the study are not believed, then somebody should do it themselves.

I just think the paper is accurately taking into consideration substitution.

Let me explain.  My read of the paper is that the authors did the following:

Take the RetroDB and see what the last action that a player took in a season was.  For simplicity, let’s say that this can be a hit, walk, out, or substitution (this is listed in RetroDB when a pinch hitter is brought in).

The study writers said that the last listed action in a season was the “last scheduled plate appearance”.  Again, this can be a substitution or a walk or an at bat.

They then look at players whose batting average was .299 or .300 going into this last scheduled plate appearance (remember: their last scheduled PA can be a substitution).

They then calculate the hit rate for the action.  They find that players with a .299 AND players with a .300 (that choose to hit) have a very high batting average.  Furthermore, they find that .299 players never walk.

This takes into consideration any potential selection problems!!

Where everyone here is getting mixed up is by thinking that the authors are looking at “last plate appearance”.  They are looking at “last scheduled plate appearance”.

Thus, they are essentially giving the numbers that you, Tango, are asking for in your comment #20.


#29    Guy      (see all posts) 2010/10/04 (Mon) @ 16:47

Another simple way to see if there’s a problem is to look at the Ns.  Reportedly, there were 61 “scheduled final PAs” for .299 hitters in the sample.  Maybe Apologist can find out for us how many such PAs there were for .297 hitters, or .302 hitters?  Or perhaps the authors will wander by at some point.  My guess is that there are fewer “final” PAs at .299 and .300 than the surrounding BAs.


#30    Apologist      (see all posts) 2010/10/04 (Mon) @ 16:51

Rather than just trying to correct… let me ask two questions to the obviously skeptical crowd (e.g. guy, Tango, Phil).

1.  Does anyone have a problem with the extremely low walk rate they find for the .299 players?  This can’t be explained by the selection story being proposed (even though it is incorrect).

2.  How do you explain the higher than .463 batting average for players that had a .300 batting average coming into what ended up being their last PA?  Again, your selection story doesn’t explain this…

And again, even the selective substituting actually is the point of the original study.


#31    Guy      (see all posts) 2010/10/04 (Mon) @ 16:57

Apologist, why are you ignoring the multiple ways in which the methodology you describe fails to capture players who reach .300 (exactly), and then choose not to hit again (or their manager chooses for them)?  If a player reaches .300 in his final AB in the 161st game, he might not play the 162nd game—his final PA will be a hit.  Or if he gets a hit in his 2nd PA in game 162, he might be replaced by a pinch runner, or replaced in the field, before he is due to hit again. 

When hitters are removed because they are now hitting .300, we simply can’t assume they stay in the game until replaced by a pinch hitter.  In fact, since this would draw a lot of attention to the hitter “chickening out”, by having him stand in the on-deck circle but then fail to hit, I feel pretty sure that this is NOT how managers usually handle it.

Again, I think having the N for each group of hitters would help settle this.


#32    Apologist      (see all posts) 2010/10/04 (Mon) @ 17:05

Guy,

The point is that the methodology DOES consider this.

If a player gets a hit in his 2nd PA in game 162 to reach .300 and is replaced by a pinch runner or replaced in the field, then the study writers consider the substitution as the “last scheduled plate appearance”, not the hit that they got to make it to .300.  Thus, this person is coded as a “substitution” in their last scheduled plate appearance, not a “hit”.

Admittedly, it is awkward, but not wrong.


#33    Guy      (see all posts) 2010/10/04 (Mon) @ 17:06

"1.  Does anyone have a problem with the extremely low walk rate they find for the .299 players?”
No and yes.  No in the sense that I’m sure guys hitting .299 do swing away and the true walk rate is low.  But yes, in the sense that if you draw a walk at .299, you are much more likely to choose to bat again than if you got a hit and raised your average to .300.  So even here we do see the effect of selection bias.

“2.  How do you explain the higher than .463 batting average for players that had a .300 batting average coming into what ended up being their last PA?”

The same selection bias you don’t want to see.  A player hitting .300 who gets a hit is now hitting somewhere between .301 and .302.  So his manager pulls him from the game, or fails to start him the next day—his last PA is a hit.  But if the same player makes an out and falls to .299, he damn well hits again and tries to raise his BA back to .300.

Apologist, there is absolutely no chance in the world that this result is correct.  If .300 hitters could become .463 hitters just because they really, really like round numbers, I guarantee you we would see them do it all season long.  Because they would see a whole lot of round numbers—like $40,000,000.00 per season --if they did.


#34    Guy      (see all posts) 2010/10/04 (Mon) @ 17:19

#32:  So if a .299 hitter gets a hit and then is pulled for a pinchrunner, you’re saying that the study counts that hitter’s PA as a non-hit?  And if he’s pulled for a defensive replacement the next inning, his final PA is called a substitution (non-hit)?  But of course, in both cases he has now become a .300 hitter prior to his last PA, so he’s no longer in the .299 sample, right?  Very confusing and hard to interpret.....

And there’s still the problem of .299 hitters who reach .300 in the final PA of a game, and then don’t play again that season. 

So even assuming you’re correct, I’d like to know the sample sizes, and also how many of the .299 PAs were prior to game 162.


#35    Apologist      (see all posts) 2010/10/04 (Mon) @ 17:27

#34:

Yes, the person you described would be a .300 hitter coming into the “last scheduled PA” and the “last scheduled PA” is coded as a substitution.  This is how I read it.

The paper is unclear about the .299 hitters who reach .300 in the final PA of a game, and then don’t play again that season…

I agree about going back to the data…


#36          (see all posts) 2010/10/04 (Mon) @ 18:23

Guy/13,

>The authors were aware of the potential bias of hitters sitting out.  They looked at all “scheduled” final PAs, and included them in the denominator even if the player walked or didn’t hit at all.

Thanks, Guy, I hadn’t noticed that.  (It’s footnote 4.)

I agree with you that the authors might have missed “scheduled” PA that became “unscheduled” via pinch runners or defensive substitutions.


#37          (see all posts) 2010/10/04 (Mon) @ 18:36

Further to Guy/13, the last paragraph on page 6 explicitly mentions pinch hit appearances, but not pinch runners or defensive substitutions.  So I’d agree with Guy that those other two cases are enough to bias the results.


#38    Guy      (see all posts) 2010/10/04 (Mon) @ 19:05

Yeah, I don’t see any indication in the paper that defensive substitutions and/or pinch runners are treated as non-hit PAs, as Apologist speculates.  And if they do, a lot of “PAs” aren’t plate appearances at all (even by a pinch hitter), which would be very odd.


#39          (see all posts) 2010/10/04 (Mon) @ 23:47

Going back for a second to Hawerchuck at #7:

“Wes Parker was a career .267 hitter.  (Some of that’s from playing with the Dodgers in the 1960s dead ball era.) In 1970, at Age 30, he played in a career high 161 games (he was frequently injured).  He hit a career high .319 (next best .279) with 47 doubles (next best 24) and various other career highs.

He attributed the career year to disconnecting his phone, quitting drinking, and not going out on dates.  He said he was so miserable that he never did it again.”

That piqued my interest so I looked up Parker’s BABIP in his career:

1964--.310
1965--.274
1966--.282
1967--.298
1968--.286
1969--.282
1970--.343
1971--.297
1972--.298
Career--.297

So, either “disconnecting his phone, quitting drinking, and not going out on dates” enabled Parker to suddenly increase his BABIP by 60 points, which would be quite a finding, or he just got a bit lucky that year.
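As a rough gauge of how much luck that would take, the sketch below compares the 1970 figure with the listed career and 1969 values and expresses the gap in binomial standard deviations.  The number of balls in play is an assumed round figure, not Parker's actual 1970 total, so the result is only indicative.  Note that the 60-point jump matches the change from 1969 (.282 to .343); against the career mark the gap is 46 points.

```python
import math

# BABIP by season, copied from the list above.
babip = {1964: .310, 1965: .274, 1966: .282, 1967: .298, 1968: .286,
         1969: .282, 1970: .343, 1971: .297, 1972: .298}
career_babip = .297

ASSUMED_BALLS_IN_PLAY = 500  # assumption for illustration only


def binomial_sd(p: float, n: int) -> float:
    """Standard deviation of an observed rate over n trials with true rate p."""
    return math.sqrt(p * (1 - p) / n)


gap_vs_career = babip[1970] - career_babip
gap_vs_1969 = babip[1970] - babip[1969]
sd = binomial_sd(career_babip, ASSUMED_BALLS_IN_PLAY)
z = gap_vs_career / sd

print(f"1970 vs career: {gap_vs_career:+.3f}")
print(f"1970 vs 1969:   {gap_vs_1969:+.3f}")
print(f"one SD at {ASSUMED_BALLS_IN_PLAY} balls in play: {sd:.3f}, gap = {z:.1f} SD")
```

A different assumed number of balls in play changes the standard deviation, so the output should be read as an order-of-magnitude check rather than a test.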

I just thought that was interesting.  Now, back to the real discussion…


#40          (see all posts) 2010/10/05 (Tue) @ 01:59

With some time on my hands, I decided to try to replicate the study’s results. 

For batters hitting .299 or .300 before their last actual PA, I found 115 hitters: they went 42-for-105 (.400) with 10 walks.

The authors found 127 hitters hitting .463 (according to the NYT).  Not sure why the difference: will have to check.

For batters hitting .300, I got 11/37 with 10 walks.  The NYT says there were 66 such players, not the 47 I got.  Again, not sure why.

For batters hitting .299, I got that in their last actual PA, they went 31-for-68 (.456) with no walks.  The authors found 0 walks in 61 AB, but didn’t say what the batting average was.

For those batters hitting .299, the ones who were NOT replaced after that PA (specifically, the game ended less than 9 batters after their last PA, suggesting they played the whole game) had hit .361 (17-for-47).  The ones who WERE replaced had hit .667 (14-for-21).

This .299s result is right in line with the hypothesis that getting the .300 hit causes the batter to end his season.
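The split can be checked directly from the counts quoted in this comment; the sketch below only re-does that arithmetic and uses no other data.

```python
def avg(hits: int, at_bats: int) -> float:
    return hits / at_bats


# (hits, at-bats) for .299 hitters in their last actual PA, from the comment.
not_replaced = (17, 47)
replaced = (14, 21)

# The two groups should add back up to the overall 31-for-68.
pooled = (not_replaced[0] + replaced[0], not_replaced[1] + replaced[1])

for label, (h, ab) in [("not replaced", not_replaced),
                       ("replaced", replaced),
                       ("pooled", pooled)]:
    print(f"{label:>12}: {h}-for-{ab} = {avg(h, ab):.4f}")
```

The pooled figure is an at-bat-weighted mix of the two groups, so the overall .456 sits between them; 17/47 works out to .3617, which the comment shows as .361.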

More investigation tomorrow.  Too tired now.


#41    MGL      (see all posts) 2010/10/05 (Tue) @ 02:22

What I would like to see is this:

Batters hitting .299 (and .300 if you like) with some min number of PA (maybe 400 or 500) going into the last game of the season:

What they did their first, second, etc. PA.  Whether or not they were replaced after their 1, 2, etc. PA.

You can do this for the last few games, and not just the last game.

Doing it this way is doing it “forward” and not backward as Tango, Phil, and others (correctly) warn against.

Basically if you look at the entire last game (or the first PA) for all batters hitting .299 (or .300), that will tell you whether these batters are “trying harder” (or pitchers are doing something differently) in order to hit .300.  That is all you need to do.  That takes care of any selective sampling issues. It is not really that hard to do.

Technically, you have to look at all batters’ BA (or the quality of the pitchers they face) for the last game of a season, as your control group, just to make sure that everyone does not see a spike in BA for some reason (perhaps bad pitching)…


#42    Guy      (see all posts) 2010/10/05 (Tue) @ 07:14

Phil:
The likely reason for your sample size discrepancies is that the authors count it as a PA if the hitter is pinch hit for.

Using the authors’ metric, your .299 hitters only had a hit percentage of .250 (17/68), while the authors found .430. Pretty large discrepancy.

I wonder if the .361 is a function of hitters skipping an entire game?  How often did the .299 guys who finish with a hit (and BA of .300) do this in a game prior to 162?


#43    Bjorn      (see all posts) 2010/10/05 (Tue) @ 08:20

While I have nothing to add regarding the selection bias, and in principle agree with what’s been said in that area, I am surprised by one other thing.

Most seem to assume that whatever “motivational effect” exists on the cusp of a .300 season, regardless of whether it is .020 or .150, will come from the batter’s side.  Personally I would think that the difference might come more from unusually low motivation on the pitcher’s part rather than from the batter trying really hard.

In the extreme case and assuming the game is meaningless the pitcher might even “throw” the at bat (pun intended). Either as some kind of trade of favors or because he knows the hitter and wants to be nice to him.


#44    Tangotiger      (see all posts) 2010/10/05 (Tue) @ 09:02

Unless this is Brett Favre’s best friend, or Carl Pavano knows he’s on a national stage, there’s no way you will find any kind of systematic bias of some stranger giving in to someone looking for some ho-hum record.


#45          (see all posts) 2010/10/05 (Tue) @ 10:15

Guy/42:

If I count players who never played the last game of the season as pinch hit for (which I probably should have done in the first place), the numbers for the .299 guys are:

replaced: 23/35 .657
not: 8/33 .242

Comparing this to the previous, it means the “didn’t play another game” guys were 9-for-14 (.643) in their last AB.

That means that if you’re hitting .299, and then you get a hit near the end of the season to bring your average above .300, you get benched A LOT.  It’s not possible to say exactly the chances of that until I figure out how many batters hit .300 and stayed in (to add to the denominator).  But as for the numerator, at least 23 guys got a hit to pass .300 and wound up benched immediately after that.
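The 9-for-14 figure follows by subtraction from the two versions of the split, and the same difference should appear on both sides, since the batters who skipped the final game simply move from "not replaced" to "replaced".  A quick check using only the counts quoted in these comments:

```python
def minus(a, b):
    """Subtract (hits, at_bats) pairs component-wise."""
    return (a[0] - b[0], a[1] - b[1])


# (hits, at-bats) before and after reclassifying batters who sat out the last game.
replaced_old, replaced_new = (14, 21), (23, 35)
stayed_old, stayed_new = (17, 47), (8, 33)

moved_in = minus(replaced_new, replaced_old)   # added to "replaced"
moved_out = minus(stayed_old, stayed_new)      # removed from "not replaced"

print(moved_in, moved_out)
print(f"{moved_in[0]}-for-{moved_in[1]} = {moved_in[0] / moved_in[1]:.3f}")
```

Both differences come out the same, and the total stays at 31-for-68 under either classification.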


#46          (see all posts) 2010/10/05 (Tue) @ 10:28

Of the 47 guys who went into their last PA hitting .300, only 17 played the entire last game.  You’d think a lot of them would have dropped to .299 and would have wanted to keep playing.

The .301 guys: only 29 out of 98 played the entire last game.  (Maybe after going 0-for-1, they got pulled while still at .300.)

The .297 guys: 51 out of 72 played the entire last game.


#47    MGL      (see all posts) 2010/10/05 (Tue) @ 10:38

#43, I thought about that and then I thought, “There is no way that most of these pitchers know that a certain batter is hitting .299 or .300 or what have you.”

Also, in baseball at least, you DON’T usually do your opponent a favor unless he is a friend of yours.

In any case, I think that we are going to find (as usual) that if it looks and quacks like a duck (IOW, the authors screwed up)…


#48          (see all posts) 2010/10/05 (Tue) @ 11:45

Phil, have you simply looked at BA going forward (in any PA or group of PA where a hit in the next PA puts the batter at .2995 or better and an out puts him at less than .2995 near the end of the season - say the last game or 2 games), as I explain in #41 above?  As I said, that will easily answer the question…


#49          (see all posts) 2010/10/05 (Tue) @ 11:46

Trivia: between ‘75 and ‘08, who was the only player batting .300 in the last game of the season, who then made an out, dropping his average below .300, and then was taken out of the game?

Answer: Ryan Spilborghs, October 1, 2007.  He was replaced in the 8th as part of a double-switch.


#50          (see all posts) 2010/10/05 (Tue) @ 11:47

mgl/48: Yup, I’ll get to that.  Right now, I’m just trying to replicate the original study.  Then I’ll do the real stuff.


#51          (see all posts) 2010/10/05 (Tue) @ 11:52

One of the most important points is, “Why is it so hard to duplicate and/or get similar results for these baseball studies, like this one and the one on HFA and attendance?”

That is ridiculous.  Both the methodologies and data sets should be crystal clear when doing empirical studies like this.

As I said in the thread about the HFA study, do NOT discount the possibility that the authors completely botched culling or analyzing the data.  One of the problems when non-baseball people do studies like this (and the other ones) is this:

If I do a study like this and get the result they got, I think, “Wow, that is almost impossible.  I must have made a mistake somewhere, which is quite easy to do (make a mistake).”

And then I redo the study several times until I am satisfied that I made no mistakes.  And even then, if I still get the same results, I am skeptical and I might even ask someone else (another baseball person) to duplicate my study.

On the other hand, if a non-baseball person does the study and gets this kind of result, they think, “Wow, that is a fascinating result!  How can I get this paper published as soon as possible and have an article printed in the NYT?”


#52          (see all posts) 2010/10/05 (Tue) @ 11:52

There were 13 players who were hitting .300, then made an out in their last PA of a game that wasn’t the team’s last game, and then didn’t play any games after that.

Of those 13, 11 of them were still hitting .300 after making that out.  So that explains why 11 of them sat in the last game.  The other two were Randy Milligan in 1993, and Denny Walling in 1980.

Milligan was split between two leagues, so maybe it wasn’t obvious he was hitting .300/.299.


#53          (see all posts) 2010/10/05 (Tue) @ 11:56

mgl/51:

Agreed.  The study should absolutely make the methodology crystal clear.  I still don’t know how the authors dealt with pinch hitters. 

I suspect they might have included them in both samples (before and after).  That is, the guy’s hitting .299, gets a hit, then is pinch hit for.  I suspect he appears in the .299 as a hit, and in the .300 as a pinch-hit-for.

Any study like this has to explain clearly where all the numbers come from.

And, BTW, it’s quite possible that I made programming errors.  But, still.


#54          (see all posts) 2010/10/05 (Tue) @ 12:15

OK, bottom line: I found 69 players who were hitting .299 before their final PA of the season.  The authors found 62.  I don’t know why the difference.  I suspect our lists must be pretty close, because both of us found exactly 0 guys who walked.

As I said, I got 31-for-68 (.456).  One guy did something other than AB/walk.  HBP?  SF?  I didn’t check.

My list should actually be smaller than theirs: I looked only at September 26 and later.  So anyone whose last PA was on 9/25 is on their list but not mine.

Only 33 of these guys played the entire final game of the season.  If the original study was leaving out all guys who were replaced, they would have only about 33.  So if they did leave out replaced batters, it must have just been guys who were pinch-hit for, and not guys who were held out of the last game. 

I think there were 13 “replaced during the last game” guys, but probably not all were PHs.  As Guy suggested, they may have been PRs or defensive replacements.  And, the original study might have also excluded guys who were pinch hit for in a game that wasn’t their last.  I included those guys in the “didn’t play the last game” category.

Anyway, for the record, here’s my list.  It’s player, date, result of last PA, his final batting average for the year, and the reason that PA was his last.  Sorry about the proportional font, you may have to cut and paste.

doylede01 197509270 out 298 didn’t play last game
murcebo01 197509270 out 298 didn’t play last game
singlke01 197509281 hit 300 didn’t play last game
staubru01 197610030 out 299 played entire last game
biittla01 197710022 out 298 played entire last game
coopece01 197710020 hit 300 played entire last game
cruzjo01 197710020 out 299 played entire last game
driesda01 197710020 hit 300 played entire last game
otisam01 197810010 out 298 played entire last game
hendrge01 197909270 hit 300 didn’t play last game
lefloro01 197909300 hit 300 replaced during last game
madlobi01 197909300 out 298 played entire last game
vailmi01 198010030 out 298 didn’t play last game
kennete02 198110040 hit 301 replaced during last game
bakerdu01 198210030 hit 300 played entire last game
hernake01 198210030 out 299 replaced during last game
benedbr01 198310010 out 298 didn’t play last game
oliveal01 198309280 hit 300 didn’t play last game
lansfca01 198409290 hit 300 didn’t play last game
bucknbi01 198510060 out 299 played entire last game
brockgr01 198710040 out 299 played entire last game
jacobbr01 198710030 hit 300 didn’t play last game
santibe01 198710040 hit 300 replaced during last game
surhobj01 198710040 out 299 played entire last game
wallati01 198710040 out 298 played entire last game
wilsomo01 198710040 out 299 replaced during last game
brownje01 198910010 oth 299 played entire last game
polonlu01 198909300 hit 300 didn’t play last game
reynoha01 198909300 hit 300 didn’t play last game
frymatr01 199010030 out 297 played entire last game
javiest01 199010030 out 298 played entire last game
saboch01 199110050 hit 301 didn’t play last game
hamilda02 199210040 out 298 replaced during last game
butlebr01 199310030 out 298 played entire last game
gonzalu01 199310030 hit 300 replaced during last game
butlebr01 199510010 hit 300 replaced during last game
eusebto01 199510010 oth 299 played entire last game
gilkebe01 199509300 out 298 didn’t play last game
valenjo02 199510010 out 298 replaced during last game
coomero01 199709280 out 298 played entire last game
higgibo02 199709280 out 299 played entire last game
rodrial01 199709270 hit 300 didn’t play last game
stairma01 199709280 out 298 played entire last game
higgibo02 200010010 hit 300 played entire last game
kotsama01 200010010 out 298 played entire last game
maynebr01 200010010 hit 301 played entire last game
rolensc01 200009260 out 298 didn’t play last game
piazzmi01 200110060 hit 300 didn’t play last game
derosma01 200209290 out 297 played entire last game
garcika01 200209290 out 297 played entire last game
jonesja04 200209290 hit 300 replaced during last game
rodrial01 200209290 hit 300 played entire last game
winnra01 200209290 out 298 played entire last game
abreubo01 200309280 hit 300 replaced during last game
catalfr01 200309280 out 299 played entire last game
gilesbr02 200309280 out 299 played entire last game
phillja04 200309280 out 298 played entire last game
rodrial01 200309280 out 298 played entire last game
anderga01 200410020 hit 301 didn’t play last game
fordle01 200410030 out 299 played entire last game
overbly01 200410030 hit 301 played entire last game
catalfr01 200510020 hit 301 replaced during last game
sweenmi01 200509300 hit 300 didn’t play last game
aloumo01 200610010 hit 301 played entire last game
anderma02 200609300 out 297 didn’t play last game
loftoke01 200609300 hit 301 didn’t play last game
gomezch02 200709300 out 297 played entire last game
martivi01 200709290 hit 301 didn’t play last game
werthja01 200709290 out 298 didn’t play last game
winnra01 200709300 hit 300 replaced during last game
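
For anyone who wants to tally the list above rather than eyeball it, here is a minimal Python sketch. It assumes every line has exactly the five fields described (player ID, a nine-digit date string that looks like YYYYMMDD plus a game-number digit, result, final average times 1000, reason). The names `parse_line` and `tally` are made up for illustration and are not from the original study; the three sample lines are copied from the list.

```python
from collections import Counter

def parse_line(line):
    """Split one list line into its five fields.

    Assumed layout: player_id, date_string, result, final_avg (x1000), reason.
    """
    player, date, result, avg, reason = line.split(maxsplit=4)
    return {
        "player": player,
        "date": date,
        "result": result,              # "hit", "out", or "oth"
        "final_avg": int(avg) / 1000,  # 298 -> 0.298
        "reason": reason,
    }

def tally(lines):
    """Count results overall, and by (reason, result)."""
    rows = [parse_line(l) for l in lines if l.strip()]
    by_result = Counter(r["result"] for r in rows)
    by_reason = Counter((r["reason"], r["result"]) for r in rows)
    return by_result, by_reason

sample = [
    "doylede01 197509270 out 298 didn’t play last game",
    "singlke01 197509281 hit 300 didn’t play last game",
    "staubru01 197610030 out 299 played entire last game",
]
by_result, by_reason = tally(sample)
```

Pasting the full list in place of `sample` would give the hit/out split for each of the three "reason" categories.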


#55          (see all posts) 2010/10/05 (Tue) @ 12:21

Tango/#44

In Ball Four, Jim Bouton has a whole section about that kind of thing. I’m at work now, but if I recall correctly, when he was in the minors his catcher let an opposing player call the pitches so he could hit a home run (there was some sort of promotion at the ballpark). In the majors a catcher made his 3rd baseman play extra deep so someone (Tommy Davis?) could lay down a bunt for a single, get removed for a pinch runner, and get .300 on his record.

Not strictly related, but there’s even one about an umpire (Ed Runge?) ejecting a player in the first inning of the last game so he could catch an early flight home.

I wouldn’t be too surprised to find out this sort of thing still happens when two teams that are just playing out the season play each other.


#56    Guy      (see all posts) 2010/10/05 (Tue) @ 12:39

"If I do a study like this and get the result they got, I think, “Wow, that is almost impossible.  I must have made a mistake somewhere....And then I redo the study several times until I am satisfied that I made no mistakes....On the other hand, if a non-baseball person does the study and gets this kind of result, they think, “Wow, that is a fascinating result!  How can I get this paper published as soon as possible and have an article printed in the NYT?”

This is the heart of the problem with many of these academic sports studies.  The incentives are skewed, rewarding you if you stop digging/thinking once you have an “interesting” result. Now, that’s only true up to a point:  if you’ve made an obvious error that will likely be detected by your reviewers, or by other academics later, that will reflect badly on you.  However, the academic scrutiny is pretty weak, basically limited to “did you use the ‘correct’ statistical techniques?” Errors rooted in a lack of understanding of the game, or data errors, won’t often be detected (except by riffraff like us).  For example, there is almost no chance a referee would uncover the mistakes made in this paper. 

The perverse incentives are especially problematic when the researcher creates a regression model to determine whether sports decision makers are making correct, rational decisions (about salaries, drafting players, or whatever).  In those cases, any flaw in the model, or important omitted variable, will likely result in a finding that GMs/managers are doing it wrong.  And that’s an interesting finding!  So the researcher literally gets rewarded for their mistakes, but has no incentive to do what MGL says he would do, i.e. keep digging to try to find an error that produced the surprising result. This is how we get studies showing that football and basketball teams overvalue top draft picks, and NHL teams pay goalies too much, and pitchers throw too many fastballs, etc. etc. etc.  And almost all of it is nonsense.


#57          (see all posts) 2010/10/05 (Tue) @ 13:48

@Bobby Mueller/39 -

I’m not convinced that in 1970 it wasn’t possible to improve your BABIP.  E.g.:

“Sober” Mickey Mantle 1951-1961: .331
“Drunk” Mickey Mantle 1962-1968: .299

Talent is narrowly-distributed today, and players have incentives to stay sober and work out a lot. 

@Guy/58 - NHL goalies were overpaid :)


#58    Guy      (see all posts) 2010/10/05 (Tue) @ 14:16

#57:  I think that would be more accurate if you said Mickey Mantle “with knees” and Mickey Mantle “without knees.”

Mantle’s BABIP really fell starting in 1965.  First, league BABIP was about 10 points lower in 1965-68 than the rest of his career.  Second, Mantle lost his wheels:
1951-1964:  10.6 SB per 600 PA
1965-1968:  3.7 SB per 600 PA

1951-1964:  5.2 3B per 600 PA
1965-1968:  0.9 3B per 600 PA

Happy to be corrected on goalies if that’s what the evidence shows.  Can you provide me a link/cite?


#59          (see all posts) 2010/10/05 (Tue) @ 14:28

OK, I did the “forward” study, and, as expected, the results show nothing special.

I looked at all PA in a team’s last two games of the season.  Only batters with 200 AB overall were included.

Guys hitting .298 at that moment: .310 in 365AB, 45 walks.  I’ll repeat that and do the others too:

.298: .310 in 365 AB, 45 walks
.299: .313 in 323 AB, 23 walks
.300: .288 in 233 AB, 30 walks
.301: .289 in 277 AB, 37 walks

Now, here are guys who could move from below .300 to .300+ if they get a hit this AB:

hit to reach .300: .300 in 517 AB, 50 walks

And the other way around: guys who could drop below .300 if they make an out this AB:

out to fall below .300: .297 in 145 AB, 21 walks
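
The two "on the cusp" buckets above can be sketched in code. This is only an illustration, assuming the .2995 cutoff mentioned earlier in the thread (an average of .2995 or better rounds to .300) and season-to-date hits and AB before the PA; the function names are hypothetical.

```python
def at_least_300(hits, at_bats):
    """True if hits/at_bats is .2995 or better, i.e. it rounds to .300+."""
    # 0.2995 == 599/2000, so compare in integers to avoid float error.
    return at_bats > 0 and 2000 * hits >= 599 * at_bats

def classify_pa(hits, at_bats):
    """Label a PA by what the next AB could do to the .300 line.

    hits and at_bats are season totals BEFORE the PA.
    """
    now = at_least_300(hits, at_bats)
    after_hit = at_least_300(hits + 1, at_bats + 1)
    after_out = at_least_300(hits, at_bats + 1)
    if not now and after_hit:
        return "hit to reach .300"
    if now and not after_out:
        return "out to fall below .300"
    return "neither"
```

For example, a batter at 164-for-548 is classified as "hit to reach .300", while one at 149-for-500 (.298) is "neither", because one more hit only gets him to .2994.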

So: not much there, except that the .298/.299 guys hit slightly better than the .300/.301 guys.

I found it interesting that all the averages were around .300 ... I would have expected some regression to the mean, maybe .293 or something overall.

Will write all this up for my blog tonight.


#60          (see all posts) 2010/10/05 (Tue) @ 14:34

@Guy -

I think what I’m saying is still consistent.  Parker lived in LA, drank a lot, went out with tons of women.  He couldn’t stay in the lineup.  Then he becomes a hermit, suddenly develops health as a skill, hits a career peak in BABIP - I just don’t think it’s all luck.

On goaltenders, here’s the salary chart:

http://www.behindthenethockey.com/2010/7/7/1556692/goaltender-salary-per-win-2000-01

So they got paid $2.3M per win last season, which is way more than players got paid per win at other positions ($1.5M or so, I think). 

Couple that with an inability to evaluate goaltender true talent, and you get the following distribution of wins vs salary:

http://www.behindthenethockey.com/2010/4/16/1423992/goaltender-performance-vs-salary

Free agent goaltender salaries have been driven way down this offseason because teams have recognized that 1) goalies were overpaid; 2) they were misevaluated; 3) there’s actually a glut of talent out there.


#61    Tangotiger      (see all posts) 2010/10/05 (Tue) @ 14:43

Great job Phil!


#62    MGL      (see all posts) 2010/10/05 (Tue) @ 15:33

"Guys hitting .298 at that moment:”

Do you mean they are hitting .298 going into the last 2 games?

“I found it interesting that all the averages were around .300 ... I would have expected some regression to the mean, maybe .293 or something overall.”

Right, you should expect them to hit something like .285 or whatever their estimated true talent BA is given that they hit around .300 for almost one full season.
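
A common way to sketch that kind of true-talent estimate is to add some "ballast" of league-average at-bats to the observed line. The numbers below (a .265 league average and 375 AB of ballast) are illustrative assumptions picked so that a .300 season over 500 AB lands at .285; they are not values from the study or from this thread.

```python
def regress_to_mean(hits, at_bats, league_avg, ballast_ab):
    """Shrink an observed BA toward league_avg by adding ballast_ab
    at-bats of league-average hitting."""
    return (hits + league_avg * ballast_ab) / (at_bats + ballast_ab)

# Hypothetical inputs: 150-for-500 (.300), league .265, 375 AB of ballast.
est = regress_to_mean(hits=150, at_bats=500, league_avg=0.265, ballast_ab=375)
```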

That is why you need some kind of control group.  What do all batters hit in the last 2 games as compared to their normal BA?  IOW, how bad are pitchers in the last 2 games of the season? I wouldn’t think it is enough to make up for 15 points in BA (.300 as compared to .285) but it could be 5 points.

What is the standard error of those .300 BA?  IOW, how many PA is it based upon?  .300 may not be so statistically different from .285.

In any case, great work and another garbage study refuted. Unfortunately, the damage is done as many people will read the NYT article and almost no one will read this blog (relatively speaking).

At least, Phil, please write to the magazine that is supposed to publish this study (it has not been published yet, right), cc the authors as a courtesy, and either explain to them the issues, or refer them to this thread or to your blog.  Something needs to be done about these bad studies…


#63    Guy      (see all posts) 2010/10/05 (Tue) @ 15:34

Yes, great work.

“not much there, except that the .298/.299 guys hit slightly better than the .300/.301 guys.”

I think you ARE seeing regression when you look at the .300/.301 hitters, to about .290.  The question is:  why are the .298/.299 hitters up around .310?  My guess is it has to do with how a guy gets to be a .299 hitter during these final 2 games.  Very few are “fallers” coming down from .300+, because most of those guys sit out.  So it’s a mix of guys who entered the final games already at .299 and some who entered at .298/lower and have moved up to .299.  This second group will tend to be better hitters at this moment, because of park effects, hitting at home rather than the road, facing a weak pitching staff, etc.  That’s why they have improved their average over the past few PAs, and that likely explains why they are .310 hitters going forward.

If you just took hitters who began the 161st game at .299, and looked at all subsequent PAs, do they still hit .310?

Would also be interesting to look at ROEs for these guys.  It might be below average for the .299 hitters, giving them a little boost.  But I can’t see why that wouldn’t also be true for the .300 hitters.

Hawerchuk:  If Wes Parker said he was going to quit drinking and raise his average in the spring of 1970, then that’s interesting.  Retrospectively, I give it no weight at all.  Always forward, never backward.


#64          (see all posts) 2010/10/05 (Tue) @ 15:51

"at that moment” means from the beginning of the season, up to (but not including) that PA.

Control group ... OK, let me do a few more, for giggles:

Guys hitting .241 hit .268.
Guys hitting .257 hit .248.
Guys hitting .275 hit .276.
Guys hitting .310 hit .262.
Guys hitting .311 hit .266.
Guys hitting .312 hit .331.

Dunno, seems to me like there’s enough variation that the .300 instead of .290 (regressed, say) is just random variation.

Remember, these aren’t random samples of AB.  They’re clustered among just a few hitters (who go .298, then .299, then .298, then .299 ...).  That means the SD is a lot larger than if the hitters weren’t clustered: it could be that there were a few really good hitters in the .299 group.

Indeed, you’d expect that the good hitters would have more AB in that situation.  When Joe .250 winds up hitting .300 in game 161, you bench him so he’ll finish .300.  When Wade Boggs is hitting .300 in game 161, it’s no big deal so you let him keep playing.


#65          (see all posts) 2010/10/05 (Tue) @ 15:54

Guy/63, good point ... the .299 guys probably *are* in a better-hitting situation, for the reasons you state.

And there’s enough randomness, I think, that we don’t have to worry too much about it anyway.


#66          (see all posts) 2010/10/05 (Tue) @ 15:56

What am I thinking?  Even if they WERE random samples, the SD of a .300 hitter over 300 AB is 26 points anyway.  So why are we worrying about differences of half a standard deviation?
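
The 26-point figure is the usual binomial standard deviation, sqrt(p(1-p)/n). A minimal check, assuming independent at-bats with a fixed hit probability:

```python
import math

def binomial_sd(p, n):
    """SD of an observed batting average over n AB with true average p."""
    return math.sqrt(p * (1 - p) / n)

sd = binomial_sd(0.300, 300)   # roughly 0.026, i.e. about 26 points
```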


#67          (see all posts) 2010/10/05 (Tue) @ 16:02

One last control group: the .222 guys hit .350!  (49 for 140.)


#68    Guy      (see all posts) 2010/10/05 (Tue) @ 16:08

Well, the .298/.299 hitters combined have 685 ABs, for a SD of about .017.  And they are roughly 22 points better than the .300/.301 hitters.  True, it’s not statistically significant, but it does seem likely to be a real difference. 
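
One way to put a number on that comparison is the standard error of the difference between the two pooled groups, which is larger than the SD of either group alone. The sketch below uses the BA and AB figures from #59 (whose AB sum to 688 for the .298/.299 hitters); hits are not listed there, so the pooled averages are AB-weighted averages of rounded figures, and the at-bats are assumed independent.

```python
import math

# (BA, AB) pairs as reported in #59.
low_groups = [(0.310, 365), (0.313, 323)]    # .298 and .299 hitters
high_groups = [(0.288, 233), (0.289, 277)]   # .300 and .301 hitters

def pool(groups):
    """AB-weighted average BA and total AB for a list of (BA, AB) pairs."""
    ab = sum(n for _, n in groups)
    ba = sum(p * n for p, n in groups) / ab
    return ba, ab

def binomial_sd(p, n):
    return math.sqrt(p * (1 - p) / n)

low_ba, low_ab = pool(low_groups)
high_ba, high_ab = pool(high_groups)
diff = low_ba - high_ba
se_diff = math.sqrt(binomial_sd(low_ba, low_ab) ** 2
                    + binomial_sd(high_ba, high_ab) ** 2)
z = diff / se_diff   # the gap measured in standard errors of the difference
```

Under these assumptions the gap of roughly 23 points comes out a bit under one standard error of the difference.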

I think the key result is what you found for hitters who stand to reach .300 with a hit (they hit .300) and those who stand to fall below .300 (.297).  That tells us everything we need to know. 

I second MGL’s suggestion that you alert the authors and the editors of the journal.


#69          (see all posts) 2010/10/05 (Tue) @ 16:11

@Guy/63

http://www.tonymedley.com/Articles/Wes_Parker_One_On_One.htm

It’s hard to find contemporary evidence for anything that happened 40 years ago in baseball, but I’m inclined to believe this particular claim.


#70          (see all posts) 2010/10/05 (Tue) @ 16:13

Guy,

You’re cherry-picking the two most extreme groups from the four.  There is *some* reason those two are more important than the other two, but, still.

There are six ways to choose two groups from those four.  The chance that the most extreme division will be 1.3 SDs (or whatever) seems pretty good.
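
A rough way to check that intuition is a small simulation. The sketch below is illustrative only: it assumes all four groups from #59 are true .300 hitters with independent at-bats, and asks how often the most extreme two-versus-two split shows a pooled gap of 22 points or more. The seed and number of simulations are arbitrary.

```python
import itertools
import random

AB = [365, 323, 233, 277]   # AB for the .298, .299, .300, .301 groups in #59
TRUE_BA = 0.300             # assumed common true average (no real effect)
OBSERVED_GAP = 0.022        # the roughly 22-point gap under discussion

# The six ways to pick two groups out of four.
pairs = list(itertools.combinations(range(4), 2))

def pooled(hits, idx):
    return sum(hits[i] for i in idx) / sum(AB[i] for i in idx)

def max_gap(hits):
    """Largest pooled-BA gap between a chosen pair and the other two groups."""
    gaps = []
    for pair in pairs:
        rest = [i for i in range(4) if i not in pair]
        gaps.append(pooled(hits, pair) - pooled(hits, rest))
    return max(gaps)

def simulate(n_sims=2000, seed=1):
    """Share of simulated seasons whose most extreme split reaches the gap."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_sims):
        hits = [sum(rng.random() < TRUE_BA for _ in range(n)) for n in AB]
        if max_gap(hits) >= OBSERVED_GAP:
            count += 1
    return count / n_sims

share = simulate()
```

`share` is the estimated chance of seeing a gap that large somewhere among the splits by luck alone; clustering of AB among a few hitters, as noted earlier, would push it higher.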

And that’s *before* you bump up the SD a bit for having a non-random group of players in it.

I’m not sure it’s likely to be a real difference, but I’d agree that it wouldn’t hurt to look more closely.


#71    MGL      (see all posts) 2010/10/05 (Tue) @ 16:31

I like to keep things as large (sample-size-wise) and as simple as possible.

All players (with a min number of PA) who are right around .300 going into the last 2 games of the season have an incentive to get a hit in each and every one of their next AB’s, whether they are going to sit out or not if and when they reach .300 or above after a certain AB. If there is a large motivational effect, as the authors suggest, we will see it in all of the AB’s collectively in those 2 games, even if a player goes 3 for 3 and then can afford to not try hard in the next AB or 3.

So simply look at all players near .300 in their last 2 games.

For a control, DON’T split up batters by their BA.  That just makes for a mess and makes it look like things are going on when they aren’t.  Simply look at all other batters, or at least those with BA of less than .280 or .290 or more than .310 or .320 (who have no chance of falling below .300 or rising above .300).  That is your control group as a proxy for pitching quality over that last 2 games.

That is what I would do, at least on the first pass…
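
That first pass might look something like the sketch below. The cutoffs are assumptions for illustration: "near .300" is taken as .295 to .305 (no range is specified above), and the control group uses the outer suggestions of under .280 or over .320. The sample records are hypothetical.

```python
def assign_group(entering_ba, near=(0.295, 0.305),
                 control_below=0.280, control_above=0.320):
    """Bucket a batter by his BA going into the last 2 games."""
    if near[0] <= entering_ba <= near[1]:
        return "near .300"
    if entering_ba < control_below or entering_ba > control_above:
        return "control"
    return "excluded"

def group_ba(records):
    """records: iterable of (entering_ba, hits, at_bats) over the last 2 games."""
    totals = {}
    for entering_ba, hits, at_bats in records:
        g = assign_group(entering_ba)
        h, ab = totals.get(g, (0, 0))
        totals[g] = (h + hits, ab + at_bats)
    return {g: h / ab for g, (h, ab) in totals.items() if ab > 0}

# Hypothetical records, only to show the shape of the input.
records = [(0.299, 2, 7), (0.301, 1, 4), (0.250, 1, 8), (0.330, 3, 8), (0.290, 0, 4)]
result = group_ba(records)
```

Comparing each group's last-2-games BA with its own season BA would then give the proxy for pitching quality described above.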


#72    Guy      (see all posts) 2010/10/05 (Tue) @ 16:48

Hawerchuk:  There’s no date on that article, but it was clearly written many years after Parker retired.  So I don’t see any reason to give it much credence.  And note how self-serving Parker’s description of his decision to retire is.  He makes it sound like he chose to go out on top, before his skills deteriorated.  But he slugged .354 his last season, which is pretty terrible for a 1Bman even in the 1970s.  I doubt the Dodgers would have kept him. 

Not a slam on Parker—nearly all athletes will remember their careers in a favorable light.  You just can’t take these accounts at face value....


#73          (see all posts) 2010/10/05 (Tue) @ 17:28

Based on BA going into the last two games, regardless of what happened subsequently:

291-296—692/2434 (.284), 229 BB
297-302—593/1962 (.302), 205 BB
303-308—421/1587 (.265), 167 BB

Looks like the first group regresses right, the second not enough, and the third too much.

SDs (based on random binomial):

291-296: .284 +/- .009
297-302: .302 +/- .010
303-308: .265 +/- .011

The difference between the middle and high group is statistically significant if you assume binomial, at about 2.5 SDs.  It’s probably significant in real life too, after adjusting for clusters. 

But, it seems to me like the one that’s off is the 303-308 group.  There’s no way they should be that low.  The middle group, the one mgl is concerned with, is about 1 SD from where they should be, which is something, but not much.
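
The binomial SDs and the "about 2.5 SDs" figure can be reproduced from the hit and AB totals above. A minimal sketch, assuming independent at-bats (no adjustment for clustering):

```python
import math

# (hits, AB) for each entering-BA group, as listed above.
groups = {
    "291-296": (692, 2434),
    "297-302": (593, 1962),
    "303-308": (421, 1587),
}

def ba_and_sd(hits, ab):
    """Observed BA and its binomial SD."""
    p = hits / ab
    return p, math.sqrt(p * (1 - p) / ab)

stats = {name: ba_and_sd(h, ab) for name, (h, ab) in groups.items()}

def z_between(a, b):
    """Difference in BA between two groups, in SDs of the difference."""
    (pa, sa), (pb, sb) = stats[a], stats[b]
    return (pa - pb) / math.sqrt(sa ** 2 + sb ** 2)

z_mid_high = z_between("297-302", "303-308")   # a little over 2.4
```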


#74    MGL      (see all posts) 2010/10/05 (Tue) @ 17:39

Phil, what about all other batters?  You need them in order to get a sense of the pitching quality (and other environmental things) in the last 2 games.

I am not really concerned with any of the groups. I think the evidence is clear that there is not likely much of an “incentive effect” which would be ridiculous anyway.  If there were such a large incentive to hit .300 and batters could actually elevate their hitting talent in order to do that, surely we would see a much higher BA in clutch situations (although it could be that the pitcher and batter cancel each other out).

In any case, couldn’t the lack of walks be causing a rise in BA (at the expense of OBP), at least for the batters who need a hit?  And perhaps a loss of power as these batters are just trying for a hit.  If yes, we should see a difference between batters who need a hit and those who need to avoid an out (more walks for the latter).  And we should see a decrease in OPS for batters needing a hit, as they are altering their approach in an overall sub-optimal manner.  For example, if you need a hit and you are 3-0 and it is correct to take a pitch, you are swinging at a good pitch, right?  And if it is correct to swing at 2-0 or 3-1 and you need to avoid an out, you might take more, right?

So the magnitude of BA is not necessarily telling us the increase or decrease in “batting talent” by the batter, but perhaps only his approach (to maximize a hit or minimize an out).


#75          (see all posts) 2010/10/05 (Tue) @ 17:41

Although ... maybe mgl and Guy are onto something.  The near-300 group is the only one I tried that doesn’t regress to the mean:

200-219: .228
220-230: .259
291-296: .284
297-302: .302
303-308: .265
309-319: .291
319-329: .315

I’d say, slight evidence of an effect, at 1 SD.


#76          (see all posts) 2010/10/05 (Tue) @ 17:43

MGL, I agree that if there is a real effect, it’s very plausible that it comes from batters trading walks for hits.

But, as you say, with several AB left in the season, getting a hit and avoiding an out are probably of close to equal priority.  It’s only with 1 or 2 AB left that there’s a difference, no?


#77    MGL      (see all posts) 2010/10/05 (Tue) @ 18:30

"But, as you say, with several AB left in the season, getting a hit and avoiding an out are probably of close to equal priority.”

Well, yes, only if you look at several AB’s after a player is close to .300, since he is likely to fluctuate above and below .300.

But if a player is below .300, he is likely to avoid walks.  If he is above .300, he is likely to embrace them and go out of his way to get one.

If a player is near .300 and gets above it, he is likely to not play anymore, so I think that you will mostly see players trading walks for hits when they are near .300 at the end of the season…


#78    MGL      (see all posts) 2010/10/05 (Tue) @ 18:33

So I would look at the walk rate for all of those groups. My guess is that you will see fewer walks (and extra base hits) for those near .300 and especially just under it. The reason the .308 group has such a low BA may be that they are trading hits for walks. The ones with a higher average and a much lower average are simply doing what they always do.


#79          (see all posts) 2010/10/05 (Tue) @ 18:38

.291 to .296: walk rate 8.6 per 100 PA
.297 to .302: walk rate 9.5 per 100 PA
.303 to .308: walk rate 9.5 per 100 PA
.309 to .319: walk rate 9.6 per 100 PA

.297 to .299: walk rate 9.2 per 100 PA
.300 to .302: walk rate 9.9 per 100 PA
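
The first three rates can be reproduced from the AB and BB totals in #73, assuming PA is approximated as AB + BB (ignoring HBP and sacrifices) and that the same sample is being used:

```python
# (AB, BB) totals for each group, taken from #73.
totals = {
    "291-296": (2434, 229),
    "297-302": (1962, 205),
    "303-308": (1587, 167),
}

def walks_per_100_pa(ab, bb):
    """Walk rate per 100 PA, approximating PA as AB + BB."""
    return 100 * bb / (ab + bb)

rates = {name: round(walks_per_100_pa(ab, bb), 1) for name, (ab, bb) in totals.items()}
```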


#80    MGL      (see all posts) 2010/10/05 (Tue) @ 18:57

A little bit, but not much there.  If 25% of those walks turned into singles, we are looking at less than a 2 point difference in BA…
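
One possible reading of that arithmetic, using the .297 to .299 and .300 to .302 walk rates from #79. This is an interpretation, not necessarily the calculation MGL had in mind, and it treats the converted walks as pure extra hits, which makes it an upper bound.

```python
walk_rate_below = 9.2    # walks per 100 PA, .297 to .299 hitters (#79)
walk_rate_above = 9.9    # walks per 100 PA, .300 to .302 hitters (#79)
share_converted = 0.25   # "25% of those walks turned into singles"

missing_walks = walk_rate_above - walk_rate_below   # per 100 PA
extra_singles = share_converted * missing_walks     # per 100 PA
at_bats = 100 - walk_rate_below                     # rough AB per 100 PA
ba_bump = extra_singles / at_bats                   # just under 2 points
```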


#81    Guy      (see all posts) 2010/10/05 (Tue) @ 19:59

I still like MGL’s theory that the .298 to .300 hitters give up power in favor of singles (with .300+ hitters happier to take a BB).  Is there any drop in ISO for the .297-.302 cohort?


#82          (see all posts) 2010/10/05 (Tue) @ 21:32

Dunno about ISO ... will have to rebuild my September database to include extra bases.


#83    Guy      (see all posts) 2010/10/05 (Tue) @ 21:47

Probably not worth the effort, just to explain a (maybe) 10-15 point bump in BA....


#84    Tangotiger      (see all posts) 2010/10/06 (Wed) @ 10:36

Phil’s part II, which is a recap of what he’s said here already:

http://sabermetricresearch.blogspot.com/2010/10/do-batters-really-hit-463-when-gunning_06.html


#85          (see all posts) 2010/10/06 (Wed) @ 22:51

ISO stats, as requested by Guy and MGL:

Guys batting .298 as of right now hit .310 (as shown above), and had an ISO of .230 (.74 extra bases per hit).

I’m going right to chart form:

290-295: BA .275, ISO .161, EB/H .59
.298 ...... BA .310, ISO .230, EB/H .74
.299 ...... BA .313, ISO .152, EB/H .49
.300 ...... BA .288, ISO .103, EB/H .36
.301 ...... BA .289, ISO .202, EB/H .70
305-310: BA .273, ISO .157, EB/H .58

It is consistent with the idea that the guys on the cusp, at .299/.300, might have been trying for singles.
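
As a sanity check on the chart: ISO is extra bases per AB and EB/H is extra bases per hit, so ISO should equal BA times EB/H, up to rounding. A short sketch using the rows above:

```python
# (label, BA, ISO, EB/H) rows from the chart.
rows = [
    ("290-295", 0.275, 0.161, 0.59),
    (".298",    0.310, 0.230, 0.74),
    (".299",    0.313, 0.152, 0.49),
    (".300",    0.288, 0.103, 0.36),
    (".301",    0.289, 0.202, 0.70),
    ("305-310", 0.273, 0.157, 0.58),
]

# ISO = (extra bases / AB) = (H / AB) * (extra bases / H) = BA * EB/H
errors = {label: abs(ba * eb_per_hit - iso) for label, ba, iso, eb_per_hit in rows}
consistent = all(err < 0.003 for err in errors.values())
```

All six rows agree to within rounding, so the three columns are at least internally consistent.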


#86    MGL      (see all posts) 2010/10/06 (Wed) @ 23:12

.298 and .301 guys are also on the cusp yet their ISO are much higher.  I am not sure there is much in those numbers.  It takes a little cherry picking to conclude that my theory has some merit.  If we needed the .298 and the .301 guys to support the theory, we would have used them.  (A guy that is at .298 sees his BA go up by 1.5 points or so for every hit.) But their ISO is so high, that we don’t use them.  Not a good way to support a theory…
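
The "1.5 points or so" per hit is easy to check. The season line below (149-for-500) is a hypothetical .298 hitter, not a player from the lists:

```python
def gain_from_hit(hits, at_bats):
    """Change in batting average from one more hit in one more AB."""
    return (hits + 1) / (at_bats + 1) - hits / at_bats

gain = gain_from_hit(149, 500)   # about 0.0014, i.e. 1.4 points
```

The gain works out to (1 - BA) / (AB + 1), so it is a little larger for batters with fewer AB.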


#87    Guy      (see all posts) 2010/10/07 (Thu) @ 08:34

I wonder if the slight overperformance of the .298-.302 cohort reflects allocation of PA by talent.  A star player probably plays regardless of his BA, except perhaps the final PA of the season, while a lesser player is more likely to sit once he reaches .300/.301 (whether going up or down). 

Phil:  without doing a lot more work, could you separately list the players who had, say, 6 or more PA over the last 2 games and those who had 5 or fewer?  My guess is the high-PA guys are much better hitters, and it will be obvious just eyeballing the lists.  In contrast, in the lower and higher BA ranges I’d guess pretty much everyone will play.


#88          (see all posts) 2010/10/07 (Thu) @ 09:12

I’ll do that today.  It’s going to be two fairly long lists, though.


#89          (see all posts) 2010/10/07 (Thu) @ 09:37

OK, these are PAs in the last two games for guys going into the 2nd-last game hitting .298 to .301.

Six or more:

1975 6 stargwi01
1975 8 smithre06
1976 9 staubru01
1977 8 bonneba01
1977 8 biittla01
1977 7 driesda01
1977 7 bochtbr01
1977 6 mcraeha01
1977 6 poqueto01
1977 8 coopece01
1977 9 hislela01
1978 8 lynnfr01
1978 8 otisam01
1978 9 cromawa01
1978 9 munsoth01
1979 8 bellbu01
1980 6 murraed02
1980 8 collida02
1980 9 kempst01
1980 6 summech01
1980 6 jacksre01
1981 6 kennete02
1981 7 salazlu01
1982 9 bakerdu01
1982 8 bochtbr01
1982 8 hernake01
1982 7 mcgeewi01
1983 10 coopece01
1983 8 dawsoan01
1983 9 raineti01
1983 6 wynegbu01
1983 8 penato01
1984 8 barrema02
1984 7 puhlte01
1984 6 guerrpe01
1984 8 hatchmi01
1985 8 murraed02
1985 10 bucknbi01
1985 9 cruzjo01
1985 6 sciosmi01
1985 10 molitpa01
1986 8 cartejo01
1988 8 dawsoan01
1988 6 mullira01
1989 9 brownje01
1990 10 gantro01
1990 6 larkiba01
1990 9 frymatr01
1990 7 wallati01
1990 6 francju01
1990 9 mcgrifr01
1991 9 quintca01
1992 7 hamilda02
1992 11 knoblch01
1992 8 walkela01
1992 8 jordari02
1993 8 paglimi01
1993 7 vaughmo01
1993 6 gonzalu01
1993 7 slaugdo01
1993 8 palmera01
1995 10 coninje01
1995 11 biggicr01
1995 6 butlebr01
1995 7 karroer01
1995 8 whitero02
1995 7 finlest01
1995 6 corajo01
1995 9 carrema01
1995 9 clarkwi02
1996 7 vizquom01
1996 9 galaran01
1996 9 pridecu01
1996 9 batisto01
1997 10 nunnajo01
1997 7 reedje02
1997 8 higgibo02
1997 8 bonilbo01
1997 6 offerjo01
1997 10 coomero01
1997 8 griffke02
1997 9 lankfra01
1997 9 gonzaju03
1998 9 bagweje01
1998 6 hidalri01
1998 8 joynewa01
1998 10 kentje01
1999 8 anderga01
1999 7 kleskry01
1999 7 belleal01
1999 9 venturo01
1999 7 youngke01
1999 9 cairomi01
2000 7 cruzde01
2000 10 kotsama01
2000 9 drewjd01
2000 8 vinafe01
2001 7 singlch01
2001 8 greensh01
2001 8 kentje01
2001 8 edmonji01
2002 8 walketo04
2002 7 jonesja04
2002 6 wilsoda01
2002 10 winnra01
2002 9 rodrial01
2003 6 mientdo01
2003 7 abreubo01
2003 9 byrdma01
2003 8 gilesbr02
2004 11 crispco01
2004 6 abreubo01
2004 7 walkela01
2004 8 huffau01
2005 8 ortizda01
2005 8 mauerjo01
2005 9 suzukic01
2005 7 teixema01
2006 11 murtoma01
2006 7 abreubo01
2006 9 gonzaad01
2006 7 vizquom01
2007 8 atkinga01
2007 7 spilbry01
2007 10 grandcu01
2007 10 castilu01
2007 6 winnra01
2008 8 inglejo01

Five or fewer:

1975 1 singlke01
1975 3 doylede01
1975 1 luzingr01
1975 3 murcebo01
1975 1 mcbriba01
1976 5 fossera01
1976 5 cruzjo01
1976 5 munsoth01
1977 2 garrra01
1977 4 garvest01
1978 5 lemonch01
1978 3 conceda01
1978 4 lefloro01
1978 5 lacocpe01
1978 4 pacioto01
1979 5 adamsgl01
1980 3 trammal01
1980 1 wallide01
1980 2 staubru01
1980 5 woodsal01
1981 4 almonbi01
1981 2 howear01
1981 3 penato01
1982 4 decindo01
1982 5 lynnfr01
1982 4 brettge01
1982 4 whitefr01
1982 5 mumphje01
1983 5 benedbr01
1983 2 ramirra01
1983 1 hernake01
1985 5 murphda05
1985 4 salasma01
1985 4 hassero01
1986 4 rayjo01
1986 4 fletcsc01
1987 5 santibe01
1988 2 perryge01
1988 2 backmwa01
1988 5 wilsomo01
1989 4 polonlu01
1989 4 reynoha01
1990 2 bondsba01
1990 1 daughja01
1991 1 greenmi01
1991 4 polonlu01
1991 5 saboch01
1991 5 butlebr01
1992 2 clarkwi02
1993 3 smithdw01
1993 4 boggswa01
1995 5 devermi01
1995 3 greenmi01
1995 3 valenjo02
1995 2 vaughmo01
1995 4 merceor01
1996 4 greenmi01
1996 5 ochoaal01
1996 2 kendaja01
1997 1 glanvdo01
1997 5 castivi02
1997 4 guerrvl01
1997 3 corajo01
1997 5 rodrial01
1998 4 hubbatr01
1999 1 singlch01
1999 4 liebemi01
2000 2 younger01
2000 5 maynebr01
2000 1 spierbi01
2000 3 koskico01
2001 5 bautida01
2001 2 piazzmi01
2002 4 sheffga01
2002 4 pierzaj01
2003 5 loftoke01
2003 5 phillja04
2003 4 galaran01
2003 5 grissma02
2003 2 catalfr01
2003 5 delgaca01
2004 4 anderga01
2004 5 ortizda01
2004 4 matsuhi01
2005 3 kennead01
2005 4 ordonma01
2006 3 carroja01
2006 1 anderma02
2006 5 furcara01
2006 4 reyesjo01
2006 4 derosma01
2006 5 lairdge01
2007 4 grudzma01
2007 4 werthja01
2008 1 mccanbr01
2008 5 ethiean01
2008 5 rodrial01


#90    Guy      (see all posts) 2010/10/07 (Thu) @ 10:22

Thanks, Phil.  This is harder to “eyeball” than I anticipated.  When I sort it by PAs it does look to me like better hitters tend to get more PAs, but there are certainly some high-BA hitters getting pulled after only 2-3 PAs (Piazza, Bonds, Trammell, Will Clark).  So it’s hard to say for sure, and probably not worth the effort to run their season BAs.


#91          (see all posts) 2010/10/07 (Thu) @ 10:36

Their season BAs will all be around .300.  smile


#92    Guy      (see all posts) 2010/10/07 (Thu) @ 10:48

That was easy!  Right, would need to look at PAs and/or career BA.


#93          (see all posts) 2010/10/07 (Thu) @ 10:58

I looked at the respective Marcels for the two groups.  I weighted every player equally, by his Marcel AB, regardless of how many PA he had late in his .300ish season.

BA Marcels for that season:

5- PA: .282
6+ PA: .285
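
For anyone who wants to redo this comparison, here is a minimal sketch of the weighting described above: each player's Marcel BA counts in proportion to his Marcel AB, and the number of late-season PA is used only to sort him into a group. The function name, the tuple layout, and the sample rows are all assumptions made for illustration; the numbers are invented, not actual Marcel projections.

```python
from collections import defaultdict

def weighted_group_ba(rows, cutoff=5):
    """AB-weighted mean of Marcel BA for the two PA groups.

    rows: iterable of (late_pa, marcel_ba, marcel_ab) tuples.
    late_pa only decides the group; it is not used as a weight.
    """
    hits = defaultdict(float)      # projected hits = BA * AB
    at_bats = defaultdict(float)   # total Marcel AB per group
    for late_pa, marcel_ba, marcel_ab in rows:
        group = "5- PA" if late_pa <= cutoff else "6+ PA"
        hits[group] += marcel_ba * marcel_ab
        at_bats[group] += marcel_ab
    return {group: hits[group] / at_bats[group] for group in at_bats}

# Made-up rows for illustration only (NOT real projections).
sample = [
    (2, 0.300, 500),
    (4, 0.270, 300),
    (8, 0.290, 600),
    (10, 0.280, 400),
]
result = weighted_group_ba(sample)
for group, ba in sorted(result.items()):
    print(f"{group}: {ba:.3f}")
```

Summing projected hits and dividing by total AB is the same thing as an AB-weighted average of the individual BAs, so a player with a small projected workload moves his group's figure less than an everyday regular does.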


#94          (see all posts) 2010/10/07 (Thu) @ 11:00

The word “equally” should be deleted in post 93, along with the comma following.


#95    James      (see all posts) 2011/02/03 (Thu) @ 10:17

I know it’s way too late to weigh in on this, but I ventured over here from the Deadspin article. While I don’t disagree with the selection bias hypothesis, I have another theory (sorry if I just missed it, but I didn’t see it mentioned by anyone) to at least partially explain the higher last-plate-appearance average for .299 hitters.

What if it’s analogous to onside kicks in football? Whatever the exact numbers, I think we can all agree that an onside kick is more likely to be successful when the receiving team doesn’t see it coming. And the obvious paradox is that if teams tried surprise onside kicks all the time, they’d no longer be surprising.

Couldn’t a similar thing happen in baseball? A given .299 hitter, in all likelihood, isn’t a notorious free-swinger, but may have a higher success rate by swinging away if the pitcher doesn’t expect him to swing away. If he did it every time, though, pitchers would adjust and the advantage would disappear.


#96    Tangotiger      (see all posts) 2011/02/03 (Thu) @ 11:55

Whatever that advantage is, it is not going to be more than .010.  That would be an interesting finding if someone massively changed his approach in light of a particular incentive.  He may increase his BA, but he’s going to decrease his walk rate substantially.

But, I believe Phil showed that this is NOT the case.

Your hypothesis has been tested and found to not be true.

