
THE BOOK--Playing The Percentages In Baseball


Monday, October 27, 2008

Do first_half/second_half splits mean anything?

By , 08:11 PM

I thought the results were important enough to warrant their own thread, rather than continuing the last one on first half, second half splits.  If you didn’t follow the discussion in that thread, here is the link:

http://www.insidethebook.com/ee/index.php/site/article/first_half_second_half_splits/

To summarize the discussion, a few people pointed out some unusually large first half, second half splits in performance for some players.  The question is whether, for players in general, those splits “mean” anything, which is the same thing as asking whether they have any predictive value, which is the same thing as asking whether they correlate to any degree from one year to another.  For example, we find that platoon splits for RHB have very little predictive value.  No matter what a RHB’s platoon splits are in any given time period, they will tend to revert to near league average for all RHB in any other time period.  For LHB, there is some predictive value - the larger the sample size of data we have, the more predictive those sample results are.

For RHB (since there is still some predictive value, however small), we might need 10 years of split data to “tell us anything” about that player’s true talent platoon ratio or difference.  For LHB, it might be 2 or 3 years of data.

Anyway, I was skeptical that any sample of first half and second half splits means anything, i.e., has any predictive value.  Of course, even if there is a tiny amount of predictive value in any sample data, if the sample is large enough we eventually get tremendous predictive value. But in baseball, for the data to have any practical significance, we really only get to use one year's worth at the least and maybe 5 or 10 years' worth at the most.  If we have to wait until we get 15 or 20 years of data for it to have much predictive value, that is not particularly interesting, to me at least.

Anyway, one way to see how much predictive value there is in a certain amount of data is to run a regression from one time period to another.  If the correlation is really low, then there is little or no predictive value to that particular stat for that amount of data (the number of opportunities underlying each element in the regression).  Hopefully we have enough data (both data points in the regression and a decent sample size for the underlying number of opportunities for each data point), such that the uncertainty in the resultant “r” is fairly low (a small standard error).

I did such a regression on first half, second half splits. Here is the methodology and the results: 


I used the database provided by our own “terpsfan.” First half, second half data from 1974 to 2007, for all players. I used OPS as the stat of choice.  I could have used lwts or RC. It does not make any difference.

I only looked at players who played on one team for the whole season.

I only looked at players who had at least 100 PA in the first and second halves.

There were 6427 data points (player seasons) in the regression.  I regressed the 1st half, second half OPS difference (1st half minus second half) for one year on the same thing for another year.

I overlapped years (e.g., 1974 on 1975 and 1976 on 1975), so the data points are not independent.  That is not the best thing to do, but it is no big deal. If I only used independent data pairs, I would get around the same results.

First I gave every data pair the same weight. IOW, if player A had 120 PA in the first and second half in year 1, and around the same in year 2, his data pair got the same weight as that of a player who had 200 PA in the first and second halves of years 1 and 2.

I redid the same thing, weighting each of the data pairs by the minimum number of PA in any half (out of 4 possible halves, 2 for each year).

Here is what I got:

Using the “same weight” method, I got an “r” of -.004 (again, 6427 player seasons).

Using the second method, weighting by PA, I got an “r” of .001.

Sorry guys, I see no evidence of these splits having any meaning whatsoever.  None.
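
For anyone who wants to try this kind of test themselves, here is a minimal sketch of the two correlations described above (equal weight, and weighted by the minimum PA in any of the four halves). It is not the code used for the study. The data are simulated, the helper names (`weighted_corr`, `simulate_pairs`) are made up for illustration, and the noise level is a rough assumption. Every simulated player has zero true split, so any "r" that comes out is noise.

```python
import math
import random


def weighted_corr(x, y, w=None):
    """Pearson correlation of x and y; equal weights if w is None."""
    if w is None:
        w = [1.0] * len(x)
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def simulate_pairs(n_pairs, seed=0):
    """Hypothetical data: nobody has a true 1st/2nd half split.

    Returns the (1st half minus 2nd half) OPS difference in year 1 and
    year 2 for each pair, plus the minimum PA used as the weight.
    """
    rng = random.Random(seed)
    diff_y1, diff_y2, min_pa = [], [], []
    for _ in range(n_pairs):
        pa = rng.randint(100, 350)          # assumed range of half-season PA
        sd = 1.2 * math.sqrt(2.0 / pa)      # assumed noise in an OPS difference
        diff_y1.append(rng.gauss(0.0, sd))
        diff_y2.append(rng.gauss(0.0, sd))
        min_pa.append(pa)
    return diff_y1, diff_y2, min_pa


n = 6427  # same number of data points as in the study
d1, d2, pa = simulate_pairs(n)
r_equal = weighted_corr(d1, d2)         # every pair gets the same weight
r_weighted = weighted_corr(d1, d2, pa)  # weight each pair by minimum PA

# Rough standard error of r when the true correlation is zero and the
# pairs are independent (overlapping years are not, so the real
# uncertainty is somewhat larger than this).
approx_se = 1.0 / math.sqrt(n - 1)

print("equal weight r:", round(r_equal, 3))
print("PA weighted r: ", round(r_weighted, 3))
print("approx SE of r:", round(approx_se, 4))
```

With this many data points the rough standard error is a little over .01, so an observed "r" within a couple of hundredths of zero is what pure noise looks like.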

I have an open challenge to everyone.  If anyone finds any predictive value to any splits other than the ones we KNOW have predictive value, I’ll donate 100 dollars to the charity of your choice, for each split.  Obviously we have to establish a minimum level of predictability for any given number of opportunities, say an “r” of at least .2 for one year of data.  And I (or someone else I trust) have to verify

#1    terpsfan101      (see all posts) 2008/10/27 (Mon) @ 22:23

I had a feeling that there would be no predictive value in the 1st/2nd half splits. It was fun to speculate though, as we did find some players who had pretty extreme 1st/2nd half splits.

What splits do “we KNOW have predictive value”?


#2    MGL      (see all posts) 2008/10/27 (Mon) @ 23:17

R/L platoon and G/F are the only ones I can think of off the top of my head.

One of the interesting things is that these regression and correlation tests assume you are looking for a “uniform effect.”

If G-d comes down and tells us that there are 2 players with a large true split and no other players, then these kinds of tests won’t pick that up.

I don’t know what kinds of tests can pick that up, if any (and no matter what, there is going to be uncertainty in the “conclusions”).

As an aside, I ALWAYS say, when dealing with sample data, something like, “The evidence indicates,” or, “The evidence strongly indicates,” or, “There is no evidence that...”

I NEVER say that something is “true” or “not true” (implying 100% certainty) or at least I try not to.  We do NOT know, with 100% certainty, that Bonds is a better hitter than Neifi Perez.  I know that one is hard to get our arms around. Even considering that Bonds is much bigger, stronger, has a nicer looking swing, comes from a good pedigree, I think there is a non-zero chance that, despite all of that, he is really a bad hitter, worse than Neifi.  That number might be 1 in a trillion or even less, of course.

For example, what if we find 2 players with a 10 SD split, but everyone else looks random?  Do we still regress those 2 guys 100% toward the mean?

I don’t know the answer to that.


#3    MGL      (see all posts) 2008/10/27 (Mon) @ 23:18

Whoops, I broke that post up awkwardly.


#4    Derek Carty      (see all posts) 2008/10/27 (Mon) @ 23:47

I’d be curious how this would look if you limited the players you’re looking at to rookie splits predicting sophomore performance.  I believe I’ve read somewhere (might have been the Baseball HQ Forecaster) that the second-half of a rookie season predicts the sophomore season better than the entire rookie season (or something to that effect), but I’ve never seen the actual study.  Has anyone here looked at this or does anyone know the one I’m thinking of?


#5    MGL      (see all posts) 2008/10/28 (Tue) @ 00:17

That sounds kind of sketchy and if it is true, I would suspect that it would be due to some kind of selective sampling.

For example, let’s say that we have 100 rookies with a true talent of .750 (OPS).  And let’s say that 50 of them go .650 in the first half and 50 go .850 (an oversimplification, but basically true).

Both groups will hit .750 in the second half, their true OPS, and of course both groups will hit .750 in the next year.

But the .650 first half players will get sent back down to the minors and the .850 ones will remain to see the second half and another year.

As I said, they will hit .750 the next year.  Their full year OPS would be .800, but their second half OPS was .750.  So yes, the second half OPS is a better predictor of next year, but ONLY because of this selective sampling effect.

This kind of selective sampling occurs for all players, but is most pronounced for rookies, because if you fail in the first half (or first month or whatever) of your rookie year, you often get little or no playing time in the second half (or rest of the year), and even the next year.

Gotta always watch out for selective sampling issues, especially as it relates to performance dictating future playing time.
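
The selective sampling example above can be sketched as a small simulation. This is only an illustration under assumed numbers: every rookie has the same true .750 OPS, the noise per half (`noise_sd`) is a made-up value, and anyone who hits below .750 in the first half is "sent down" and drops out of the sample. The function name `simulate_rookies` is hypothetical.

```python
import random
from statistics import mean


def simulate_rookies(n=20000, true_ops=0.750, noise_sd=0.100, seed=1):
    """Average OPS lines for the rookies who survive the first half."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        first = rng.gauss(true_ops, noise_sd)
        if first < true_ops:
            continue  # bad first half: sent to the minors, never seen again
        second = rng.gauss(true_ops, noise_sd)
        # A full season has about twice the PA of a half, so less noise.
        next_year = rng.gauss(true_ops, noise_sd / 2 ** 0.5)
        kept.append((first, second, next_year))
    return {
        "full_year": mean((f + s) / 2 for f, s, _ in kept),
        "second_half": mean(s for _, s, _ in kept),
        "next_year": mean(ny for _, _, ny in kept),
    }


result = simulate_rookies()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Among the survivors, the second half and the next year both average out near the true .750, while the full rookie year is inflated by the lucky first half that kept them on the roster. The second half "predicts better" without any player's talent having changed.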


#6    MGL      (see all posts) 2008/10/28 (Tue) @ 00:24

HFA is another “known split” to a very small degree, and it is not uniform.  There are generally a few teams that have a HFA different from the norm (BOS, MIN, and COL), because of quirks in their parks, but in any given era it could be none (although that is unlikely).


#7    JD      (see all posts) 2008/10/28 (Tue) @ 04:32

Is there any merit to month-by-month splits? Or, put another way, does the “slow starter” really exist (or any variation/opposite of that)? I’m sure there are sample size issues at work here, but it seems logical that some guys need more at bats to get going, others are fairly consistent for 6 months, and some guys wear down in August/September.


#8    MGL      (see all posts) 2008/10/28 (Tue) @ 11:52

JD, given the results of the 1st half, 2nd half regression, I’ll double my offer of $100 if anyone finds any significance to month to month splits.  In fact, I’ll make it $500.

IOW, if there were slow starters or fast starters, it would show up in the first half, second half splits.


#9          (see all posts) 2008/10/28 (Tue) @ 12:05

Derek,

I remember reading that Nate weighted the second half more heavily than the first half last year for his PECOTA projections.  I think it might have been a 52/48 split.  I’m going to look for where he said that, though it might have been in their 2008 annual which I do not have with me.


#10    VictorW      (see all posts) 2008/10/28 (Tue) @ 12:11

In the article linked, Nate implies that they weight the second half more heavily:

“This year,” Silver says, “we started looking at things like platoon splits in more detail, and first/second half splits—if a guy performs better after the All-Star break, that’s a good sign for the next season.”


#11    MGL      (see all posts) 2008/10/28 (Tue) @ 12:25

If there is only one thing I could convey in the various venues it would be the notion that all of the dozens of things that the media and fans (and most of the baseball insiders themselves - managers, coaches, GM’s, etc.) think have significance in terms of predictive value, true talent, or however you want to couch it - don’t!  None of them (other than the ones I already mentioned and maybe a couple more that I can’t think of off the top of my head).

The reason is two-fold:  One, whatever effect or talent there may be is MUCH smaller than these guys (the media et al.) think, not that they really have any idea of the magnitude of effects or even what that means.  Two, even if there is some small spread of “talent” associated with most or even all of these things, it is DWARFED and SWAMPED by the noise (random fluctuations) associated with any sample sizes that the media et al. use to illustrate or “justify” their position. 

So that even if some or all of these things exist as true effects or true spreads of talent, we can NEVER identify those differences in talent from the kinds of sample sizes we generally work with.

This is true of batter/pitcher matchups, the aforementioned HFA, first half, second half performance, slow and fast starts, certain innings that pitchers do better or worse in, home and road splits for individual players, day/night splits, clutch hitting and pitching, pitchers “pitching to the score,” performance in regular season versus post-season, hot and cold streaks, etc., etc.

The extent of the thinking of the media et al. is “Well, it makes sense that there should be differences among players with respect to X, therefore when we see differences in any given sample of data, it must mean something.”

They have no idea what the magnitude of that effect has to be or what the size of the sample has to be for the observed effects to “mean anything.” No idea.

That at least is the default position for any “split” you can come up with.  That does not mean that I can guarantee that none of them will have a significant spread of true talent observable in relatively small samples.  But I can say with some high level of certainty that that is true.  Plus, it is all a matter of degree.  There are probably very few things that are either yes or no with respect to a spread of true talent.  It is probably true that most of these things have some tiny spread of true talent associated with them - just not enough to be observable in the kinds of samples we typically look at, be it 30 PA (e.g., hot and cold streaks and batter/pitcher matchups) or 1-3 years (e.g., clutch).


#12    Tangotiger      (see all posts) 2008/10/28 (Tue) @ 12:33

If I were to do day-to-day Marcels, I weight each day as:
weight: .9994^DaysAgo

So, a game that was between 0 and 90 days ago gets weighted at between 1 and .947, for an average of .974.

A game that was between 90 and 180 days ago gets weighted at between .947 and .898, for an average of .923.

This implies a random game in the second half gets .974 weight, and a random game in the first half gets .923.  .974 divided by (.974 plus .923) is 51.3%.  So, I would give the weight as 51/49.

If I instead use .9990^DaysAgo (a number that Brian figures is closer when he tested Marcel), then I get .931 as the weight for a second half game, and .883 for a first half game.  This also gives me 51.3%.

If Nate says 52/48, I won’t argue the point, as we’re really splitting hairs here. 

A guy with a .300 wOBA in the first half and .450 in the second half (or whatever Delgado did) would give you:
50/50: .3750
51/49: .3765
52/48: .3780

Really splitting hairs at this point, for such an extreme player, to boot.
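
The 51/49 figure from the .9994 decay can be reproduced with a few lines. This is a minimal sketch that follows the shortcut used above (average the weights at the two ends of each 90-day half); the function name `half_weights` and the 90-day half length are assumptions for illustration.

```python
def half_weights(decay=0.9994, half_length=90):
    """Average daily weight for each half under weight = decay ** days_ago."""
    newest = 1.0                            # a game played today
    middle = decay ** half_length           # a game at the midseason boundary
    oldest = decay ** (2 * half_length)     # a game on opening day
    second_half = (newest + middle) / 2     # endpoint average, second half
    first_half = (middle + oldest) / 2      # endpoint average, first half
    share = second_half / (second_half + first_half)
    return second_half, first_half, share


second, first, share = half_weights()
print(f"second half weight: {second:.3f}")
print(f"first half weight:  {first:.3f}")
print(f"second half share:  {share:.1%}")
```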


#13    MGL      (see all posts) 2008/10/28 (Tue) @ 22:33

“I remember reading that Nate weighted the second half more heavily than the first half last year for his PECOTA projections.”

“...if a guy performs better after the All-Star break, that’s a good sign for the next season.”

Kind of depends on what the magnitude of a “good sign” is, but NO SHIZIT!

I am going to write an article tomorrow and the headline will read, “Famed sabermetrician finds that if a player performs really well this season, it is a good sign for next season!”

In other news, “Forecasting systems give more weight to this year’s stats than those from 10 years ago.”

Front page of “BTN” SABR’s research journal:

“Researchers find that MLB players’ Little League stats should get little weight in their projections!”


#14          (see all posts) 2008/11/04 (Tue) @ 16:53

I’d be curious to see if a big drop off in second half stats has a good correlation with the likelihood of off-season surgery. I realize this isn’t exactly the type of prediction you are looking for but it would still be interesting.


#15    MGL      (see all posts) 2008/11/04 (Tue) @ 17:06

Certainly could be.  Could be that any drop-offs, especially large and extended ones, correlate to some extent with any kind of injury.  As usual, it is hard to detect these kinds of relationships (which sort of means that for all practical purposes they don’t exist) when you have so much inherent noise…

