THE BOOK--Playing The Percentages In Baseball


Wednesday, June 25, 2008

Vote: How predictive are partial season stats?

By Tangotiger, 12:24 PM


SabermetricsPoll
#1    Tangotiger      (see all posts) 2008/06/25 (Wed) @ 14:03

(Spoiler alert: To see the results, highlight the space between the two colons.)

After 17 votes: 101 games:


#2    Tangotiger      (see all posts) 2008/06/25 (Wed) @ 15:31

After 27 votes: 102 games, and if you remove the top and bottom 5% (the extreme votes), it’s 104:

I always find these polls of mine, be it here, or the Fans’ Scouting Report, stabilize very quickly after 20 votes or so.  No exception here.


#3    Blackadder      (see all posts) 2008/06/25 (Wed) @ 16:55

I like these polls, but isn’t it a bit of a problem that you are only getting people who read your site?  I mean, I read MGL’s thread earlier today, and I suspect a lot of the other readers did too.  If you polled the general baseball watching population, it wouldn’t shock me if the results were pretty different…


#4    tangotiger      (see all posts) 2008/06/25 (Wed) @ 20:13

Can’t disagree with you.  I’m always hopeful that some other blogs out there will link here, and they can answer as well.

That said, it’s not too important that I get a “general baseball population”.  I’m happy with just the guys who like to think (and overthink) baseball.


#5    tangotiger      (see all posts) 2008/06/25 (Wed) @ 20:24

After 51 votes, it’s back to the average of what was in post #1.


#6          (see all posts) 2008/06/25 (Wed) @ 20:56

I’m wondering who clicked choice A (20 games). Anyone wanna take credit/explain his reasoning?


#7    MGL      (see all posts) 2008/06/25 (Wed) @ 20:58

More than the fact that our readers are obviously completely different than the general public or the typical fan, people will say and write stupid things all the time.  However, when forced to “put their money where their mouth is,” most of them will all of a sudden become rational.

Here is an example:  Think of all the really stupid things that TV commentators say all the time.  I guarantee you that if you gave them a survey or you interviewed them face to face, they would do an about face on many of those things.

People don’t think unless they have to.  People will also say and do stupid things if they have no accountability.  Once you force people to think or make them accountable, things change.

In this case, as I initially said, if you substituted last year’s stats for this year’s stats in all of those articles where the author tells us how good or bad someone IS, almost everyone (but the saber-friendly person) would think you were out of your mind ("Why should I care what so-and-so did LAST year?  Look at what he has done THIS year!").  However, when you have a survey like this and you force people to think and to be somewhat accountable (obviously this survey is anonymous), they often change their tune ("Of course a whole season of stats from last year has more predictive value than only 75 games from this year.  Everyone knows that!").


#8    tangotiger      (see all posts) 2008/06/25 (Wed) @ 21:13

I agree completely with MGL’s sentiment.  People yap… they love to yap.  “Yeah, the Pats will win!  No question about it!” “Really, you are 99.99% sure they will win?” “Yeah” “Wanna bet?” “Uh, what kind of odds am I getting?” Yapper.

These surveys are to cut through all the adjective bullsh-t, since some people don’t know how to express themselves qualitatively, and force a quantitative response.  Juan Pierre has 1500 PA left in his MLB career, entering 2008, Cliff Lee is worth a 4/34 deal entering May, 2008, and Arod deserved a 32MM a year deal for 7 or 8 years.  All things that make sense at the time, even though that’s not really what you were necessarily hearing.


#9          (see all posts) 2008/06/25 (Wed) @ 23:42

But you’re also talking about an average in your question.  On average, how many games does it take?  But when people yap, they yap about an individual player.  Aramis Ramirez is walking more this year.  I don’t know how much walk rates vary in season historically, but he’s jumped 65% or so over his career and recent years.  I would think that is an actual improvement one could yap about, perhaps not.  A good follow up would be which metrics can be relied upon after a certain number of games.  Not everyone yaps without some cause.  Yap yap.


#10    Colin Wyers      (see all posts) 2008/06/26 (Thu) @ 00:51

That’s where you get into trouble, Kyle - looking at a single player rather than his peer group, especially if you’re selecting a player based upon his unusual performance.

Simply given random chance, we would expect any group of players to have a few outliers over a short run of time. What you first need to look at is, given all players - or all players with a certain preseason talent estimate - how many outliers would you expect, based upon sheer random chance? The right question isn’t, “How likely is it that Cliff Lee has such a good April?” The question is, “Given the number of Cliff Lee-like players in MLB, how likely is it that one of them has a spectacular April, assuming that our projections are 100% correct?”
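
To put rough numbers on that question, here is a minimal sketch. The inputs are made up for illustration (a 2% chance per pitcher of a spectacular April by luck alone, 50 comparable pitchers), and it assumes the pitchers are independent of one another; none of this comes from an actual study.

```python
def prob_at_least_one(p_single: float, n_players: int) -> float:
    """Chance that at least one of n independent players has the hot month."""
    return 1.0 - (1.0 - p_single) ** n_players

# Hypothetical inputs, not measured values: each Cliff Lee-like pitcher has
# a 2% chance of a spectacular April by luck alone, and there are 50 of them.
p_any = prob_at_least_one(0.02, 50)
print(f"{p_any:.0%}")  # about 64%
```

Under those assumptions, an event that is a long shot for any one pitcher is more likely than not to happen to somebody in the group.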


#11    Aaron      (see all posts) 2008/06/26 (Thu) @ 01:01

I was thinking along the same lines as Kyle. Say that you have two hitters whose batting average and power have increased substantially to start the year. One of them has the same plate discipline (BB/K ratio) he’s always shown, while the other one has a much improved eye ratio (either from an increase in walks or a reduction of K’s, or both). It seems to me that you should weight the second guy’s performance from this year more than you would the first guy’s, since what pitches a player swings at is completely under his control, while that isn’t the case with what the ball does after it hits the bat.

The same would go for guys who are doing worse. A drop in performance that coincides with a plunge in plate discipline might be more indicative of a real change in approach or ability.

Of course, most of the “yappers” aren’t even considering such issues before they talk, but I’m curious if it’s meaningful.


#12    Aaron      (see all posts) 2008/06/26 (Thu) @ 01:18

#10, that doesn’t really address what Kyle is saying. The questions that should be asked in regard to his thoughts are:

1) Is K/BB ratio more or less consistent than production stats (BA/OBP/SLG)?

2) Does a change in K/BB correlate with a change in a player’s “true” talent level?

If the answer to #1 is “more consistent” and #2 is “yes”, then Kyle is on to something.


#13          (see all posts) 2008/06/26 (Thu) @ 01:51

To Cliff Lee, I was just defending the yappers.  The yapper is not trying to comment on more than one player.  And so they are not making a claim of their prognostic abilities for all players using 100 games.  They may just be claiming their ability to read that Cliff Lee is walking fewer dudes.  Basically, to bring yappers into the discussion is to move away from the player group question this thread describes.
To Aaron, exactly.  But we are pointing out the obvious.  The more sabermetrically solid stat you use, the more predictive it is from year to year, the more accurate it will be in season too probably.  To quote Dave Cameron “As Lee continues to post months with a FIP below 3.00, we’re going to have to continue to revise our estimate of his true talent level upwards.” Not all who wonder (from 100 games) are yapping, if they use the right, what’s that stuff?


#14    Colin Wyers      (see all posts) 2008/06/26 (Thu) @ 01:56

You’re starting from the premise that the start to the year is meaningful from the standpoint of assessing a player’s talent level; it really isn’t.

For a player with 0 prior plate appearances, yes, things like K rate or BB rate tend to stabilize quicker than batting average. You can read up on it here:

http://mvn.com/mlb-stats/2007/11/14/525600-minutes-how-do-you-measure-a-player-in-a-year/

Even then, though, walk rate doesn’t reach an r of .7 until 200 or so plate appearances. And even then, given the population of major league players, you would expect some of them to have an improvement or decline in BB rates over the first few months of the season simply due to random chance. 70% reliable is 30% unreliable, and when you have enough players to choose from, SOME of them are going to win the lottery.

But we aren’t talking about players with 0 PAs, we’re talking about players with thousands of prior PAs. And unless you know that there’s something actually wrong with the data, you don’t want to throw it out unless you absolutely have to - you can weight it, or regress it, but don’t just dispose of it.


#15    Muddy      (see all posts) 2008/06/26 (Thu) @ 07:53

Thinking this question over, I’ve realized that with the 35+ crowd, I tend to instinctively weight recent performance a bit more.


#16    Tangotiger      (see all posts) 2008/06/26 (Thu) @ 09:07

After 68 votes, the results are the same as in post 1.

***

As for the predictive nature of a large increase in walks: do the study!


#17    MGL      (see all posts) 2008/06/26 (Thu) @ 11:59

No question that if you referred to individual component stats (or some combination, like K/BB ratio), that the answers would be different.  Same for age of the player, etc.  And if you dug deeper, like into the pitch f/x data, you could come up with even more specific answers.

But, the question that Tango posed is very clear.  It applies to “all” or the “average” MLB player, and is based on some overall performance metric like OPS, EQA, wOBA, or lwts.

The “who is going to win X series” is a good example by Tango of “yapping.” For example, on just about any TV show, you are going to hear one of the “expert” commentators either tell us that so-and-so, a big underdog, is “going to win the series,” or that so-and-so, say, a 2-1 favorite, is “definitely going to win the series.”

Of course, then you say to the first guy, “Then I guess you wouldn’t mind taking an even money bet on that underdog,” or to the second guy, “Well, I guess you wouldn’t mind laying 5-1 on that team (since they are definitely going to win).”

Now, of course, they may not be gamblers, but all of a sudden, you are going to hear “hommena, hommena, hommena,” or however you spell that sound, which in this case, means, “I just said something really stupid, and someone called me on it...”


#18    Dackle      (see all posts) 2008/06/27 (Fri) @ 02:52

Here’s another way of figuring out the appropriate regression factors:

1. Build a database of all players from 1920 to 2006 with at least one PA in back to back seasons.

2. Add x plate appearances of league average performance to this year’s stat (eg if x = 100 and the league hits .025 HR/PA, then add 100 PAs and 2.5 HR to the player’s actual stats)

3. Calculate a revised HR/PA rate for the player with x PAs of league average play added in (eg if the player hit 10 HR in 100 PA, his revised HR/PA rate would be 12.5/200 = .0625)

4. Multiply the revised HR/PA rate by next year’s PAs to determine projected HRs (eg if he had 400 PAs next year, we would estimate 400 * .0625 = 25 HRs)

5. Subtract from actual HRs, square the difference (if he actually hit 40, the error would be (40-25)^2, or 225).

6. Repeat the same for every player in the database, and sum the squared errors.

7. Through trial and error, find the value of x (number of league average PAs) that minimizes the sum of the squared errors.
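
The steps above can be sketched in a few lines. This is only an illustration, not Dackle's actual code: the function names are made up, and it assumes a single league rate for every player, whereas a 1920 to 2006 database would need each season's own league average.

```python
def regressed_rate(events, pa, league_rate, x):
    """Steps 2-3: add x PAs of league-average performance to the player's line."""
    return (events + league_rate * x) / (pa + x)

def sum_squared_error(players, league_rate, x):
    """Steps 4-6: project next year's total and sum the squared misses.

    players: iterable of (events_this_year, pa_this_year,
                          events_next_year, pa_next_year) tuples.
    """
    total = 0.0
    for ev, pa, ev_next, pa_next in players:
        projected = regressed_rate(ev, pa, league_rate, x) * pa_next
        total += (ev_next - projected) ** 2
    return total

def best_x(players, league_rate, candidates=range(0, 3001, 5)):
    """Step 7: trial and error over candidate values of x (hypothetical grid)."""
    return min(candidates, key=lambda x: sum_squared_error(players, league_rate, x))

# The worked example from the steps: 10 HR in 100 PA, league .025 HR/PA,
# x = 100, then 40 HR in 400 PA the following year.
example = [(10, 100, 40, 400)]
print(regressed_rate(10, 100, 0.025, 100))     # 0.0625
print(sum_squared_error(example, 0.025, 100))  # 225.0
```

With one player the search is meaningless (the best x just fits that player), so `best_x` only becomes useful once the full list of back-to-back seasons is loaded.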

Following those steps, I come up with the following optimal number of league average PAs to add for various stats:

Singles: Add 375 PA of league average
Doubles: 970 PA
Triples: 140 PA
Home runs: 140 PA
Runs scored: 355 PA
RBI: 225 PA
Walks: 135 PA
Strikeouts: 95 PA
Stolen bases: 110 PA
Caught stealing: 365 PA
Runs created: 260 PA

For pitchers, and using rates per batter faced:

Strikeouts: Add 215 TBF of league average
Home runs: 1,470 TBF
Walks: 310 TBF
(Hits - HR)/Ball in play: 2,375 TBF
Earned runs/PA: 1475 TBF
Earned runs/IP: 310 IP

The NL has a 4.24 ERA this year (equal to 146 ER in 310 IP), so Edinson Volquez (95 IP, 18 ER before last night’s game) would have a regressed ERA of (146 + 18) * 9 / (95 + 310) = 3.64.
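
The same arithmetic as a small function, in case anyone wants to plug in other pitchers. The function and parameter names are made up for this sketch; the inputs are the ones quoted above (146 league ER in 310 IP, 18 ER in 95 IP).

```python
def regressed_era(er, ip, league_er, league_ip):
    """Blend a pitcher's line with a block of league-average innings."""
    return (er + league_er) * 9 / (ip + league_ip)

# Edinson Volquez: 18 ER in 95 IP, plus 310 IP of NL average (146 ER).
volquez = regressed_era(18, 95, 146, 310)
print(round(volquez, 2))  # 3.64
```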


#19    Aaron      (see all posts) 2008/06/27 (Fri) @ 20:52

Is your number for doubles correct, it looks completely out of place?


#20    Dackle      (see all posts) 2008/06/28 (Sat) @ 01:00

Yeah, I checked again and it should be correct.

Pizza Cutter’s study, referenced in #14, also showed a lack of reliability for doubles (actually 2B+3B):

A few “how often does he…” stats stabilized at:

* 1B rate - 375 PA
* 2B+3B rate - never did.  at 650 PA, it had only reached a split-half correlation of .411
* HR rate - 100 PA
* K rate - under 40 PA
* BB rate - under 40 PA


#21    Dackle      (see all posts) 2008/06/28 (Sat) @ 01:29

Aaron, thanks for pointing that out. I rechecked all of the other categories and there was an error in the triples—should be 740 and not 140. So, the doubles should make more sense next to that.

Singles: Add 375 PA of league average
Doubles: 970 PA
Triples: 740 PA
Home runs: 140 PA
Runs scored: 355 PA
RBI: 225 PA
Walks: 135 PA
Strikeouts: 95 PA
Stolen bases: 110 PA
Caught stealing: 365 PA
Runs created: 260 PA


#22    Colin Wyers      (see all posts) 2008/06/28 (Sat) @ 01:34

I presume you’re looking at 2B and 3B per PA, right? Try looking at (2B + 3B) / Hits instead. That should work a little better.


#23    Dackle      (see all posts) 2008/07/01 (Tue) @ 02:32

Colin, that would probably work better. If you expand (2B+3B)/Hits to (2B+3B)/(1B+2B+3B+HR), it looks like (2B+3B)/Hits is just a more stabilized version of (2B+3B)/(1B+HR), which looks a bit odd. Wonder if it would be better to use (2B+3B)/ball in play (ie AB-HR-SO)?

There are other ways the numbers could be improved—runs scored should be per time on base, stolen bases should be per time on first base, and so forth.


#24    Colin Wyers      (see all posts) 2008/07/01 (Tue) @ 09:17

Dackle, MGL has his component regression paper:

http://www.tangotiger.net/mgl/regression.pdf

He explains what components he uses, if not precisely why. (He does (2B+3B)/(1B+2B+3B), which probably makes more sense than (2B+3B)/H).

But essentially, what you’re trying to do is isolate each component down to one skill. Go down the list. Isolate out the TTOs. Then look at BABIP. Then look at XBH per hit on ball in play. Then look at how many triples there are per XBH.


#25    Rally      (see all posts) 2008/07/01 (Tue) @ 09:30

That is an excellent paper.  Pretty much the basis for the regression used in the CHONE projections.


#26    Dackle      (see all posts) 2008/07/02 (Wed) @ 02:48

That’s a much more robust way of regressing rate stats, particularly the batter regression constants (ie not always regressing to exactly the league average).

If my algebra is correct, the implied league average PAs to add to the player’s stats is equal to r(p)/(1-r), where r is the regression value (from batter table on page 8) and p is the number of plate appearances for the player. So, the regression value for $bb at 400PAs is .25, which implies adding 133 PAs of league average performance: .25(400)/(1-.25) = 133 (I cherrypicked that one because it is close to the #s I posted earlier). In other words, you’ve got 400 player PAs, 133 league PAs for a total of 533 PAs, of which the 133 league PAs represent 25% (regression value of .25). The implied league PAs for the other values in the table are:

                    
  PAs     $h     $e   $t  $hr  $bb  $so
  200  1,133    800  600  371  164   86
  400  1,200    933  933  327  133  100
  600  1,400    900  900  257  106  106
  800    978    978  800  200   89   89
1,100  1,100  1,100  733  122  122   58
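
The conversion used for the table can be written out as below. This is just the algebra from the paragraph above, with made-up function names: a regression value r at p player PAs means the league PAs make up a share r of the blended total.

```python
def implied_league_pa(r, p):
    """League-average PAs implied by regressing a fraction r at p player PAs."""
    return r * p / (1 - r)

def regression_fraction(league_pa, p):
    """Inverse: share of the blended line that is league average."""
    return league_pa / (league_pa + p)

# The $bb example: regression value .25 at 400 PAs.
x = implied_league_pa(0.25, 400)
print(round(x))                               # 133
print(round(regression_fraction(x, 400), 2))  # 0.25
```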


#27          (see all posts) 2008/07/02 (Wed) @ 03:07

The number of league avg events to add in order to regress should be the same, regardless of the quantity of player events actually measured. MGL said there was some variance in the results due to selective sampling at the various levels of playing time.

The formula I use to calculate the number of events to regress with is from the appendix of “The Book” and is a function of the weighted mean and standard deviation of the sample. They are in the ballpark of what you have above, but again, are the same for any number n of player events being regressed.


#28    tangotiger      (see all posts) 2008/07/02 (Wed) @ 07:20

The discussion thread for MGL’s article can be found here:
http://tangotiger.net/archives/stud0274.shtml

Here is my relevant post:

wOBA   1B     2B   3B   HR  NIBB  HBP   RBOE  SO
 209  298  1,101  571  131    96  255  1,627  62

As noted in that post, I did per PA, and not the way I’d normally do it.  As we can see here, K’s regress very little toward the league mean, followed a bit more by walks, and most of all by reaching on error.

With 1627 PA, you regress errors 50% toward the league mean.

I also made this post:

Here are the regression values for hitters and for pitchers, on a per 600 PA basis:

Bat  Pit  Event
26%  39%  All
33%  44%  1B
65%  64%  2B
49%  83%  3B
18%  56%  HR
14%  24%  NIBB
30%  57%  HBP
73%  75%  RBOE
 9%  11%  SO

“All” refers to the Linear Weights-based OBA.

Like I said, I WOULDN’T do it this way, per PA, because of the interdependency, but this is good enough for now.
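
As a quick check, the hitter percentages appear to follow from the PA figures quoted in the first post via x / (x + 600), where x is the number of league-average PA to add. The sketch below assumes that relationship (it is inferred from the numbers, not stated in the posts) and uses made-up names.

```python
# PA of league average to add for hitters, as quoted in the first post.
# "All" is the wOBA column.
pa_to_add = {"All": 209, "1B": 298, "2B": 1101, "3B": 571, "HR": 131,
             "NIBB": 96, "HBP": 255, "RBOE": 1627, "SO": 62}

def regression_pct(x, pa=600):
    """Percent regression toward the league mean after pa plate appearances."""
    return round(100 * x / (x + pa))

hitters = {event: regression_pct(x) for event, x in pa_to_add.items()}
print(hitters)
```

Under that assumption the output reproduces the hitter column, from 26% for All down to 9% for SO.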

Check out the RBOE. There’s a similar impact based on the hitter and pitcher. Now, I *know* that the distribution of batters faced from the pitcher’s perspective does NOT have a variance of zero, especially as it relates to handedness. The LH/RH split for a LP and RP are far different.

That the 3B rates regress much more for pitchers than batters is probably due to the batter’s speed. The park effect, if the variance is not zero from each of the hitter’s and pitcher’s perspective, is probably the same for both. So, the regression differentials are probably the same, but the amount of regression might be different.

Check out how much a pitcher’s HR has to regress… right in line with his doubles, and MORE than his singles. This does NOT mean that a pitcher has less control on HR than singles, or whatever. It just means that our ability to figure out how much HR skill the pitcher has is limited by the sample available.

In virtually all cases, the hitter’s performance is more indicative of his skill level than a pitcher’s performance. Again, this is not to say, necessarily, that a hitter has more influence on a PA (they probably do), but rather that the individual performance lines AND the distribution of these performance lines are such that we can tell more about a player if he’s a hitter than if he’s a pitcher.

***
Incidentally, these numbers kind of support my off-the-cuff Marcel for pitchers to be 3/2/1/2, where the last value is for regression towards the mean, compared to the hitter’s 5/4/3/2.


#29    Tangotiger      (see all posts) 2008/07/09 (Wed) @ 09:10

After 98 votes, the results:

98 games: mean
100 games: median (exactly 32.7% chose both 80 games and less, 120 games and more)

From the first time I reported the results (after 17 votes), the results hovered around 100 games the entire time.  When you give a group of intelligent people a limited number of choices (7 here, or 5 in my Fans’ Scouting Report), you get consensus built extremely quickly.  That’s why I like these polls.  You’ll be hard-pressed to find situations where the results will diverge much after even a handful of responses.

Anyway, the 100 game mark will occur around July 21.  So, this will be our little game.  We’re going to see who will forecast the players better, the readers, or Marcel.

Marcel thinks that you need 130 games for hitters and 110 games for pitchers (average of 120 games) in 2008; so he’s loving it that The Book Readers need just 100 games from the 2008 season to forecast the final 62 games of the season, while Marcel gets to use all 162 games of the 2007 season (and zero in 2008).

We’ll have about 70,000 PA, so our results should be fairly significant.

Stay tuned!


#30    Tangotiger      (see all posts) 2008/07/22 (Tue) @ 09:18

After 100 votes, the mean is 98 games, the median is 100 games, and if I remove the top and bottom 10% of the votes, the extreme votes, the mean is 99.3.

As of today, each team has played an average of 99.2 games.

Therefore, starting from games of July 22 to the end of the year, we will see what will forecast the season better: the stats accumulated in 2008 through July 21 (ignoring all previous years), or the stats accumulated only in 2007 (ignoring all previous and current years).

That is, given the choice between 100 team games of 2008 or 162 games of 2007, you, the fans, the readers, believe that it’s a breakeven proposition, that both will tell you roughly the same amount.

Marcel, the Monkey, is elated, as he thinks it’s an easy win for him.

I’m going to do it two ways:
a) take rate stats of the two time periods for each player, multiply it by the total number of PA from July 22 to end of season, add them all up and divide by the sum of the PA.

b) add to each player a league average of 200 PA to his stats of that time period, and proceed as above.  This is to take care of some crazy cases where someone might have gone 7 for 13 or had an ERA of 0.50 in 10 innings in 2007, or currently in 2008, and then from today to end of season, they end up with 250 PA or 100 innings.
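
Here is a rough sketch of how methods (a) and (b) could be computed. It is one reading of the description above, not the actual code: the function names, the two sample players, and the .330 league rate are all hypothetical.

```python
def forecast_rate(events, pa, league_rate=None, pad_pa=0):
    """Rate from one time period, optionally padded with league-average PA."""
    if pad_pa:
        events = events + league_rate * pad_pa
        pa = pa + pad_pa
    return events / pa  # with no padding, pa must be greater than zero

def weighted_forecast(players, league_rate=None, pad_pa=0):
    """Average of per-player forecast rates, weighted by rest-of-season PA.

    players: list of (events, pa, rest_of_season_pa) tuples.
    """
    total = sum(forecast_rate(ev, pa, league_rate, pad_pa) * ros_pa
                for ev, pa, ros_pa in players)
    return total / sum(ros_pa for _, _, ros_pa in players)

# Hypothetical data: a 7-for-13 player and a regular, each with 250 PA
# from July 22 to the end of the season.
players = [(7, 13, 250), (150, 500, 250)]
raw = weighted_forecast(players)                                    # method (a)
padded = weighted_forecast(players, league_rate=0.330, pad_pa=200)  # method (b)
print(round(raw, 3), round(padded, 3))
```

The padding pulls the 7-for-13 player most of the way back toward the league rate while barely moving the regular, which is the point of method (b).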

