THE BOOK--Playing The Percentages In Baseball

Friday, July 16, 2010

And yet more HR derby hangover effects

By Tangotiger, 12:03 PM

Matt takes a look, and he does it along the lines of what MGL did for the hot/cold streaks in The Book: focus on a very narrow time frame, on the idea that if something were to happen, it would be most noticeable immediately.

As you’d expect, little to no change.  The article’s value is in its process, not in its findings (or lack thereof).  People who read the article and say they learn nothing, well, you missed learning something.

***

I also want to point out, as I always do, that just because something is statistically significant, it does NOT mean that that particular observed difference is the estimated true difference.  All we know is that we estimate that we have a true non-zero difference. 

In short, it tells you that there is a difference, but it says nothing about how much of the observed difference is the true difference (other than that the higher the t-statistic, the more likely the difference is real).


#1    Matt Swartz      2010/07/16 (Fri) @ 13:00

Thanks for posting this. I wanted to just highlight that the one result that I did get is actually what I would think is the most intuitive result of all-- while there was minimal effect on production in the aggregate, the actual batted ball distribution was pretty clearly skewed towards fly balls.  That was exactly what my prior guess was centered on-- a 2-3 percent bump in fly balls simply because the participants worked on improving their ability to hit fly balls.  That seemed incredibly intuitive and a nice direct way of looking for an effect.


#2    Tangotiger      2010/07/16 (Fri) @ 13:33

For those who didn’t read it, there was a 2 percentage point change in GB rates (down) and air ball rates (up).  This was based on roughly 1900 BIP.

Would we expect a GB .43 to .41 change by chance on 1900 BIP?  One SD is about .01, so we’re talking about a two SD change.  So, there might be something there, and it really falls in line with all the other figures reported by Matt: that yes, there is likely some non-zero effect and in the direction we expect.
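
As a rough check on that arithmetic, here is a minimal sketch of the binomial standard deviation for a ground-ball rate of .43 over 1900 balls in play. It assumes each ball in play is an independent trial and treats .43 as a fixed baseline (if the before and after samples were each about 1900 BIP, the SD of the difference would be larger by a factor of about 1.4). The function and variable names are made up for illustration.

```python
import math

def binomial_sd(p, n):
    """SD of an observed proportion when the true rate is p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

# Figures quoted in the comment above; names are illustrative only.
gb_before, gb_after, bip = 0.43, 0.41, 1900

sd = binomial_sd(gb_before, bip)
z = (gb_before - gb_after) / sd  # size of the drop, in SD units

print(f"one SD = {sd:.4f}, change = {z:.1f} SD")
# prints: one SD = 0.0114, change = 1.8 SD
```

Unrounded, one SD is closer to .011, so the .02 drop comes out a little under two SD.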

This is a finding similar in scope to other findings we have in The Book, where there are indications that something is going on, but it’s kinda hard to find, and even when you find it, its impact is limited.

Matt’s study would really fit in perfectly in the Hot/Cold chapter.


#3    MGL      2010/07/16 (Fri) @ 23:57

Sorry, but any results from this kind of study are fatally flawed:

“This would likely remove the selection bias, since a couple weeks right before the break are unlikely to affect whether an individual is asked to participate in the Derby (especially because the participants are selected about a week before the All-Star Game).”

While any period of time AFTER the selections are made should be unbiased, ANY period of time whether it be one week or 3 months BEFORE the selections will be a biased sample of players who have gotten lucky.

IOW, you simply cannot compare ANY time period before the selections to that after it.  In Matt’s study, for the BEFORE sample, you have one week (assuming that all the selections have already been made) of an unbiased, random sample of the participants’ hitting and one week of a biased, non-random, “lucky” sample of that hitting.


#4    Matt Swartz      2010/07/17 (Sat) @ 00:16

MGL, come on: “fatally flawed”?  The players improved a little, so even if you are convinced that any of the two weeks before the all-star game are going to impact the decision of who participates at all, they didn’t regress, so if anything, the effect is going to be understated.  They weren’t lucky enough to regress.

Regardless, when do you think they select the HR Derby participants?  How many of those participants do you think were really settled in that last week?

Given the results that I got, it’s pretty hard to say there is anything flawed here at all.  Players seem to uppercut a little more after uppercutting a lot, but they don’t regress backwards...actually a bit forwards. 

I’m not seeing this issue.


#5    MGL      2010/07/17 (Sat) @ 01:06

“The players improved a little, so even if you are convinced that any of the two weeks before the all-star game are going to impact the decision of who participates at all, they didn’t regress, so if anything, the effect is going to be understated.”

That is correct.  I would have expected them to regress, since the “before” sample is biased in favor of being lucky.  Maybe HR derby improves one’s swing.  I don’t know. 

“Regardless, when do you think they select the HR Derby participants?  How many of those participants do you think were really settled in that last week?”

To answer your first question, I have no idea.  I thought you said that they were chosen one week before the ASG.  If they were, then the week before that contains exactly the same bias as any other week during the first half.

I have no idea what your second question above means.

If the participants were chosen a week before the ASG, you CAN NOT use the week before that as part of your BEFORE sample.

That is 100% true, and if they were in fact chosen when you said (I think) they were chosen, then yes, your research is fatally flawed.  Any results you get, one way or the other, are essentially meaningless.

You basically looked at before the ASG and after the ASG, for half of your “before” sample at least, which, as you explicitly say in your article, is NOT the correct way to look at the issue.

Let’s say that we knew that all participants were chosen on July 4th.

Are you aware of the fact that the week before that will have exactly the same bias as any week in the first half and in fact the entire first half?  Are you aware of the fact that the HR rate for the derby participants for the one week before July 4th will be exactly the same as their HR rate for the entire first half of the season, adjusted for the weather, parks, opponents, etc? 

Are you aware that if you were to use just one measly week before they were chosen as your “before” sample that it is exactly the same as if you were to use the entire first half as your “before” sample, which is what the media uses when they make these “Derby hurts the players’ swings” claims?

That is all I am saying.  If you chose any part of your “before” sample before they were chosen, then you f***ed up.  If you didn’t, then I am mistaken.  I don’t know when they are chosen.  I went by what you said, which was that they are generally chosen a week before the game and you used 2 weeks before the game as your “before” sample, which is in fact a 100% fatal error.  At the risk of repeating myself, you can’t use any data from before they were chosen.  Period.

I hope you can see the issue now.


#6    Matt Swartz      2010/07/17 (Sat) @ 01:37

My feeling is that they seem to all be announced a week before the all-star game, so that’s a minimum time period, but the all-stars seem to be known about two weeks before that, so most of them are probably determined two weeks before, and certainly all of them are determined one week before, since the public seems to know by then.  So, I think there may be a small amount of bias there, but I’m having a hard time seeing how there could be all that much bias.  I think that the bias is probably very small (and since the performance improved, probably very very small) and I think that doubling the sample size is worthwhile.  The HR/FB for the last two weeks of the first half was 2.6% lower than the entire first half in my sample.  (FB+PU)% was 2.5% lower in the last two weeks of the first half than in the entire first half as well.  It really does seem like the selection into the HR derby has nothing to do with performance in the last two weeks of the first half.

I did check the league stats for 2 weeks before and 2 weeks after, and they were all the same, at least rounded to the nearest percent, so there was not an issue of league-wide offense changing over those four weeks.

Weather, parks, opponents-- I didn’t break that down individually.  That’s a little more data than I could break down, and I’m not totally sure how I’d even do that efficiently.

I’m just not really troubled by a bias in a finding that seems to be in the opposite direction of the bias.  I’m always worried about bias, but it just seems so small.  I mean, can you think of any participant in any of those 6 HR Derbies who powered their way in the first week of July into the contest?  They are almost always guys who have been hitting home runs for years and were willing to participate, or young players who had enough home runs in the first month or two of the season to be in the news.  I just don’t think it’s causing a massive problem.  And if there was a bias, it would be towards more flies in the penultimate week of the first half.

I do see what you’re saying and it did occur to me before too, but it just seems like getting 1300 extra PA before and after the break lowered the noise enough to not bother getting worked up over a tiny bias in the opposite direction of the finding.  If I was pinpointing the exact jump in fly ball rate, I’d want to try and remove bias in the ‘before’ sample but to just get a positive effect, I’m pretty comfortable that I found something little and worth smirking over.  And probably re-debunked a logical fallacy that people think HR Derbies hurt you.


#7    MGL      2010/07/17 (Sat) @ 08:30

Matt, as I said (multiple times), if they were chosen all or substantially before your sample period, then there is no problem.  If not, then there IS a problem.

And you can NOT use the stats of the sample to determine whether there was bias or not!  That is irrelevant.  For example, let’s say that you chose the second week of May for your “before sample”.  We KNOW that would be an incorrect methodology since we KNOW that players in the HR derby, on the average, have had a lucky first half and will regress, on the average, in the second half.

Now, let’s say that you do that anyway - you are not a good researcher or stats person (I don’t mean you of course) - you use a week in May as your before sample.  Now, let’s say that you find that players hit fewer HR in the first week of the second half and you conclude that the derby does, in fact, adversely affect players’ ability to hit HR’s.  You would be criticized (and correctly so) of course for using a flawed methodology. But, you point out - I checked the player stats in that week in May, and they actually hit fewer HR’s than normal for that time of year!  Well, that does not change (or excuse) the fact that your “before” sample was severely biased and that your study is invalid.

Matt, you seem to think that even if the players were selected on July 7, that the week before that would not show much of a bias because players are not selected on the basis of that specific week.  That is wrong.  The week before selection is just as biased as the first, second, 3rd, 5th, or any other week in the first half, and on the average is exactly as biased (in all rate stats - presumably, in this case, WRT HR rates only) as the entire first half, unless, for some strange reason, whoever selects the players is told to ignore the week before the selections (and actually does ignore it).

So, as far as any results of the study being worthwhile to look at:

Most (95%+) or the entirety of the “before” sample after the selection process, YES.  Some or all of the “before” sample before the selection, NO!


#8    Tangotiger      2010/07/17 (Sat) @ 11:13

Matt is saying they were selected one week before the HR Derby.  And so, the performance AFTER the selection was made, but prior to the HR Derby is considered out-of-sample.

Matt however decided to expand the sample by also including the performance in the week leading up to the selection (and so is in-sample).  That’s not good.  However, he noted that if there is a bias, it would be that the players would “show” better than “true” in that in-sample data (that is, that the players are more likely to have been lucky than unlucky in that last in-sample week).

That the results show that the post-Derby out-of-sample performance is still higher than the last-week in-sample pre-Derby performance means it’s less of a concern (if anything, the bias masks some of the improvement).

All that said, if you want to do it right, you take the performance of players after they were selected to the Derby.  Do that first.

Then, you can include the in-sample and apply whatever adjustments you need to that in-sample.  And those in-sample adjustments will make the player look worse than he performed.


#9    MGL      2010/07/17 (Sat) @ 17:33

Right, I will retract my position that you can’t use in-sample data to do the analysis.  I suppose you can even though you shouldn’t, as long as you recognize and acknowledge that it is biased data in favor of a high (and lucky) HR rate.  But if you do find that HR rates do NOT decline in the second half as Matt and the others found, then indeed that is evidence that no adverse effect exists.

Of course if Matt had found the opposite (that HR rates declined) then we could legitimately question the results.

As Tango said, the in-sample data should not have been used, but as it turned out it didn’t make much difference.


#10    Tangotiger      2010/07/17 (Sat) @ 20:26

For what it’s worth, I think this is a pretty good nuanced discussion that I’m glad we’re having, if only for those people who aren’t knee-deep in this stuff as much as we are.

If we didn’t have this discussion, someone else would have.  And, we get this discussion resolved with all sides heard.

Love it.


#11    MGL      2010/07/17 (Sat) @ 21:37

#10, I agree.  I thought that the important point (that I was trying to make, but it didn’t come out too well) was that even though you wouldn’t think that one measly week before the selections would be very biased at all, it is just as biased (exactly), rate-wise, as the entire first half and in fact any randomly (or not necessarily randomly) selected time period in the first half.

That is important enough to repeat.  If a certain large sample is biased (we also often call it a selective sample), which we encounter all the time in baseball, because decisions are often made based on performance, any time period within that sample, no matter how small, will be exactly as biased as the entire sample.
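
To make that concrete, here is a small simulation sketch under made-up assumptions: every player has a fixed true HR rate, the Derby picks are simply the top first-half HR totals, and there is no Derby effect at all. Every number and name in it (league size, PA per week, the spread of true talent, `simulate`) is hypothetical, not taken from Matt's study.

```python
import random

def simulate(seasons=200, players=100, weeks=12, pa_per_week=25, picks=8, seed=1):
    """HR rates of the selected players: full first half, last first-half week, week after."""
    rng = random.Random(seed)
    hr = {"first_half": 0, "last_week": 0, "after": 0}
    for _ in range(seasons):
        league = []
        for _ in range(players):
            # Hypothetical true HR rate per PA, constant all season.
            p = max(0.005, rng.gauss(0.035, 0.008))
            # Weekly HR counts; the final entry is the week AFTER selection.
            weekly = [sum(rng.random() < p for _ in range(pa_per_week))
                      for _ in range(weeks + 1)]
            league.append(weekly)
        # Selection looks only at first-half HR totals.
        league.sort(key=lambda w: sum(w[:weeks]), reverse=True)
        for w in league[:picks]:
            hr["first_half"] += sum(w[:weeks])
            hr["last_week"] += w[weeks - 1]  # one "measly" week before selection
            hr["after"] += w[weeks]          # unbiased, out-of-sample week
    n = seasons * picks * pa_per_week        # PA in any single week, all picks pooled
    return {
        "first_half": hr["first_half"] / (n * weeks),
        "last_week": hr["last_week"] / n,
        "after": hr["after"] / n,
    }

rates = simulate()
print(rates)
```

In a run like this, the last pre-selection week should come out about as inflated as the full first half, while the week after selection falls back toward the selected players' true rates, even though nothing in the model changes anyone's swing.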

And it was a very good idea by Matt to look at only the two weeks following the ASG, since there could have been a short-term effect which might get diluted (and hence hard to find) over the entire second half.

