THE BOOK--Playing The Percentages In Baseball


Wednesday, October 03, 2007

What stats should we use to represent how good a player IS?

By , 10:17 PM

As in evaluating teams for the post-season.  It is a long-standing bugaboo of mine - using one season stats as evidence of how good someone is, comparing players, teams, etc., not in the past, but for the future or the present.


It came to light again when reading BP’s comparison of the Rockies and Phillies, by Christina Kahrl.  As usual, it gave the ’07 stats for all the players, or at least the starters at each position.  The first thing I think when I see “analyses” like that is, “Is that the best you can do - give us one year stats?”  Why one year?  Why the current year?  Why not the last 3 months?  Why not the last month?  5 months?  How about a year and a half?  How about last year, but not this year?

I realize that giving this year’s stats in a comparison of teams or players has two things going for it:  One, it is convenient and recognizable.  Two, it is not a bad proxy for true talent or future performance, depending on the stats of course.  And I am not quarreling with the stats themselves.  It could be EQA, VORP, WARP, lwts, whatever.  I am quarreling with the relevance of one year stats to the discussion at hand.

I am claiming that you simply cannot and should not use one year stats in a rigorous or serious analysis or discussion.  And it is done ALL THE TIME - by serious analysts (like the aforementioned BP article - what you think of BP or Kahrl as serious analysts notwithstanding), by the media, and by casual and serious fans.  And I don’t like it.

It can give you all kinds of misleading information, obviously, and lead you to bad conclusions and inferences.  One year stats, even for a whole team (because a whole team is made up of lots of individuals, which is not the same thing as a whole lot of stats from ONE individual), are simply not enough information for a serious or rigorous discussion of which teams or players are better than other teams or players, as in most of the post-season analyses you read, even in the sabermetric venues.

ALL serious discussions about who is better should begin and end with projections, which have two basic ingredients, assuming that we are working with park and context neutral stats so that everyone is on a level playing field to begin with (we obviously don’t want to compare raw stats from Rockies players with those of SD players, for example).  The two basic and critical ingredients are:  One, multi-year weighted averages, and two, regression toward the mean.

Here are some interesting numbers:

If you use one year stats to represent future stats or true talent (essentially the same thing), the best case scenario is when you have at least 80 games of stats in that one year.  I am arbitrarily defining this as the best case scenario.  Obviously it could be more.  Or less, if you don’t want a best case scenario.  Anyway, I looked at year to year correlations from 98 to 07 for all players who had at least 80 full time games in back to back seasons. I used park and context neutral lwts as the stat of choice.

The y-t-y “r” was .636.  OK, that makes for a decent projection and a decent proxy for current true talent and future performance.  Not great, but decent.  .636 would certainly make us prone to lots of errors when comparing teams or players, especially if their true talent were close in the first place.  And this is just for players who have at least 80 games to work with.

What about if we used last year’s stats in these analyses?  IOW, what if I wrote an article comparing the Rockies and Phillies, just like the BP article and countless others, and I used last year’s VORP, EQA, or lwts?  Readers would think I was crazy, right?  Not so fast.  The y-t-y correlation between years that are one year apart is .588!  Not too much different from the .636.  You might as well quote last year’s stats as this year’s stats!  That is my point.  Sure, this year’s is better than last year’s, but the difference is not all that great, and we can, of course, easily find something better than either of those two alternatives.  Heck, if we just used last year’s stats for players who had lots more PA last year than this year, we could probably get that .588 up to close to the .636.

What about the last two years weighted, with the same requirement, 80 games per year?  .651, better than one year.  How come you never see the last 2 years quoted in these articles or analyses?  3 years?  .666, even better.

And of course if you included some regression, you would do even better, especially if you were using one year stats (where you would want to regress almost 50% or so).  Why not take everyone’s lwts and cut them in half when making one of those lists?  Again, the readers would think you were crazy, but you would do even better with that than by printing the full lwts.
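The two ingredients named above, a multi-year weighted average and regression toward the mean, can be sketched in a few lines. This is only a minimal illustration, not a real projection system: the player's season values and the 5/4/3 weights are made up for the example, and the league mean is assumed to be zero because lwts are expressed as runs above average. The 50% regression for a single season is the figure suggested above.

```python
def weighted_average(values, weights):
    """Weighted mean of per-season values, most recent season first."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def regress_toward_mean(observed, league_mean, fraction):
    """Move `observed` the given fraction of the way toward `league_mean`."""
    return observed + fraction * (league_mean - observed)


# Hypothetical player: lwts (runs above average) per season, newest first.
seasons = [20.0, 10.0, 0.0]
weights = [5, 4, 3]  # assumed weights, not taken from the post

multi_year = weighted_average(seasons, weights)
# "Cut them in half": regress the one-year figure 50% toward a mean of 0.
one_year_halved = regress_toward_mean(seasons[0], 0.0, 0.5)

print(f"{multi_year:.2f}")       # 11.67
print(f"{one_year_halved:.2f}")  # 10.00
```

A fuller projection would regress the weighted average as well, by an amount that shrinks as the number of PA grows.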

#1    auntbea      (see all posts) 2007/10/04 (Thu) @ 02:38

mgl:  I was wondering what you thought of the decision to take out Zambrano for the bottom of the 7th in game 1 of the Cubs/D-Backs.  I realize Marmol has been fantastic this year, but that is only one years worth of stats (as you point out above).  Is he really projected to do better than Z in that situation?  The answer better be a resounding yes, as the game was still tied and Zambrano had only thrown 85 pitches (and looked very good in doing so FWIW).  It should be noted that Piniella left Z in to hit for himself just an inning earlier… with 2 outs and the bases loaded.


#2    MGL      (see all posts) 2007/10/04 (Thu) @ 02:53

Marmol is a very good pitcher, projection-wise, his great stats this year notwithstanding (or at least included).  He is probably as good as Zambrano, if not better, for one inning or so, and given that Zambrano is facing hitters the third time through the order (I think).  So no problem there, other than you burn Marmol somewhat.

But, apparently they were taking him out to minimize the impact of him pitching on 3 days rest in the 4th game.  (I did not know they were using a 3-man rotation until I heard that.)  That makes sense, as on the average, pitchers do a little worse with 3 days rest than with 4 or 5, and apparently Piniella thinks the same.

As far as letting him hit with the bases juiced, that is normally a very bad thing to do in the late, or even middle, innings of a close game - one of the worst things a manager can do (and they do it all the time), but…

Zambrano is a very, very good hitter for a pitcher and a lefty.  I am not sure what lefty hitters they had on the bench off the top of my head, but you have to include the pinch hitter penalty for them.  Plus, the better a pitcher is, especially if you don’t have great relievers and/or don’t want to burn relievers, the less you would pinch hit for your pitcher as well.  Zambrano is also a great pitcher of course.

I don’t think that leaving him in to hit was a bad idea, but I am not sure.  If it was a normal or bad hitting pitcher, then it most definitely would have been a bad idea, I think.  It is easy to run the numbers though.  You would have to estimate how good a hitter Zambrano really is (.600 OPS?, .700?) and what the hitting alternative would have been (and then you burn that pinch hitter of course).  Then you have to know who comes in for him (and again, you burn your bullpen a little, but you save Z a little more for the next game).


#3    auntbea      (see all posts) 2007/10/04 (Thu) @ 03:04

I don’t know much about the Cubs, so I don’t know who could have pinch hit for Zambrano.

Another factor, of course, is that the game was tied and could easily have gone to extra innings, meaning some lesser pitchers would have pitched high leverage innings.  Even if the Cubs were to take the lead in the 8th, who were they planning to pitch in the 8th and 9th innings?  It seems to me all of these pitchers must be at least as good options as Zambrano after 85 pitches for this decision to be the right one.


#4    MGL      (see all posts) 2007/10/04 (Thu) @ 03:44

As we both agree, there are some fairly complicated “chaining” issues.  But the decision to pull him in order for him to pitch on 3 days rest is not a bad one, I don’t think.  In either case, I don’t think you can chastise him.  Heck, if I can’t figure out the right decision…

;)


#5    tangotiger      (see all posts) 2007/10/04 (Thu) @ 09:37

Zambrano has a career 580 OPS.  His career BABIP is .312, which is above league average!  30% of his hits are for extra base hits, which is almost league average.  When the dude makes contact, he’s a real MLB hitter.  Problem is that over a third of his PA are strikeouts, and that comes with almost no walks (5 in 446 PA).  There’s really no regression required here, because all his components are right around the average for the kind of hitter he is.  So, we’re safe with 580 being his true talent level.

I didn’t know that Zambrano is a switch hitter.  He has much better stats as a RHH than LHH (668 to 549 OPS).  Normally, I would look at a hitter’s K/BB ratio, but Zambrano has 5 career walks!  I won’t learn much there.  His number of PA is fairly low as a LHH, so you’d have to look at it more in-depth.

The pinch hitting penalty is about 60-70 OPS points.  Zambrano gets a bonus for facing Webb a third time, which is about 15-20 points.  So, he gets about an 85 OPS bonus against someone’s true talent off the bench.  Take off about 25 points because he hits off his probable weak side, and the guy off the bench has to hit about 640 OPS.  That’s got to be anybody in baseball (except John McDonald).
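The break-even arithmetic above can be laid out step by step. This is just a sketch of the sum as described: the specific values picked from inside the quoted ranges (65 and 20) are assumptions chosen so that they add up to the "about 85" figure, and the variable names are made up for the example.

```python
# All figures are OPS points (so 580 means a .580 OPS).
zambrano_ops = 580        # career OPS quoted above
pinch_hit_penalty = 65    # assumed midpoint of the 60-70 range
third_time_bonus = 20     # assumed upper end of the 15-20 range
weak_side_penalty = 25    # hitting from his probable weak side

# Zambrano's edge over a bench hitter of equal true talent.
edge = pinch_hit_penalty + third_time_bonus

# True-talent OPS a pinch hitter needs just to match Zambrano in this spot.
break_even = zambrano_ops + edge - weak_side_penalty

print(edge, break_even)  # 85 640
```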

But, he’d only pitched 5 innings.  Bottom of the rotation, sure.  But, not Zambrano.

And, most importantly, he got a double (a triple for some major leaguers) at the start of the game.  A double against Webb!  That changes his forecast up at least 300 OPS points, if not 400.  {Tongue-in-cheek}

***

Zambrano threw 85 pitches.  As for the 3-day rest, in his career, he has only pitched ONCE on 3 days rest, and that was two weeks ago:
http://www.baseball-reference.com/boxes/CHN/CHN200709180.shtml

It was back-to-back 100 pitch outings, and he didn’t perform well in the second game.

I dunno.... risky to take him out early, and risky to bring him in early.  This was a pure guts move, with no numbers to help him out.


#6    tangotiger      (see all posts) 2007/10/04 (Thu) @ 09:51

“The y-t-y ‘r’ was .636.”

Definitely sounds low.  Assuming the average number of PA in your pool would be 450-500, your r=PA/(PA+275) or so.  Historically, it’s been around 200, not 275.  Must have been a bad year for forecasting this year if your number is accurate.  The historical number is closer to .68-.70 or so.

The .588 would imply r=PA/(PA+330).  That makes sense, since it should be about 1.25 times higher than the “275” number.

As for the .651: for the two years weighted (presuming you weight the more recent season at 1.0, and the season before it at 0.8), I’ll guess the average weighted PA to be around 600.  And with r=PA/(PA+275), r should be .69.  I find the .65 figure pretty low, unless I made a poor guess on the PA figure.  I don’t think it’s fair for you to report the .651 figure, since you increased the number of players in your pool.  You should have selected a threshold whereby you have the same number of players in the 2-yr pool as you have in the one-year pool.
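The r=PA/(PA+k) relationship used in this comment is easy to check numerically. The sketch below assumes an average of 475 PA for the one-year pool (the midpoint of the 450-500 guess above) and 600 weighted PA for the two-year pool; the function names are made up for illustration.

```python
def expected_r(pa, k):
    """Expected correlation for a sample of `pa` PA, given constant `k`."""
    return pa / (pa + k)


def implied_k(pa, r):
    """Solve r = pa / (pa + k) for k."""
    return pa * (1 - r) / r


avg_pa = 475  # assumed midpoint of the 450-500 PA guess

print(round(implied_k(avg_pa, 0.636)))  # 272, i.e. the "275 or so"
print(round(implied_k(avg_pa, 0.588)))  # 333, i.e. the "330"
print(f"{expected_r(avg_pa, 200):.3f}")  # 0.704, the historical range
print(f"{expected_r(600, 275):.3f}")     # 0.686, the ".69" for two years
```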

Your general point is valid.


#7    Patriot      (see all posts) 2007/10/04 (Thu) @ 10:35

I have no beef with Tango’s #5 which shows it was defensible to let Carlos hit...but given that Piniella must have known ahead of time that he planned to pull him early, I have to question what he [Lou] was thinking.  He only got one more inning from Zambrano in exchange for that one crucial at bat.


#8    jinaz      (see all posts) 2007/10/04 (Thu) @ 13:08

Assuming the correlations reported are correct, we’re looking at:
r=0.588 for previous year
r=0.636 for current year
r=0.651 for last two years
r=0.666 for last three years

Granted, the three-year data are better.  I’m not really disputing that.  But how much better?  An improvement in the correlation of 0.03 seems like a minor improvement to me.  In fact, I’m surprised how small of an improvement we’re seeing here.

To put it another way, you claim that 0.588 is “not too much different” from 0.636.  So what’s the motivation to go to 0.651, or to 0.666, when those are smaller jumps than 0.588-->0.636?

Just playing devil’s advocate.  Mostly.  I have recently posted some postseason team profiles that include only ‘07 stats, mostly because I can get them quickly.  When evaluating the individual players in the profiles, I take a look at their previous season’s stats, as well as things like babip, props, fip, etc, that can give indications of lucky/unlucky performances in the current season.  It’s perhaps less objective than three-year weighted and regressed averages, but it’s less work and probably comes to similar conclusions most of the time.
-j


#9    HarryAbles      (see all posts) 2007/10/04 (Thu) @ 13:43

“Marmol is a very good pitcher, projection-wise, his great stats this year not withstanding (or at least included).”

MGL, did you really have Marmol as above average?  I’ve got 5 projections in front of me, and the lowest ERA is CHONE at 4.82.


#10    Tangotiger      (see all posts) 2007/10/04 (Thu) @ 13:50

jin: like I said, MGL shot himself in the foot by increasing the number of players in his pool.  It is on that basis, and that basis only, that he could have gotten such low correlation totals for the multiyears.


#11    MGL      (see all posts) 2007/10/04 (Thu) @ 14:28

I did some quick correlations, so I am not sure of the integrity of the whole thing.

For Marmol, I also have a bad pre-season projection for him.  I assume you are looking at pre-season projections.  But he has pitched so well this year and his past TBF are so small (only 2006 in majors, AAA, or AA) that his projection changed radically (to a good one).  Of course, the uncertainty is high, and I have not seen him pitch much (which can inform the projection).


#12          (see all posts) 2007/10/04 (Thu) @ 14:35

Is this the biggest pitcher hitting mistake (note: may not have been a mistake at all) of the day or just an example of confirmation bias?  Earlier in the day, Jeff Francis hit in the top of the 7th with a runner on first and one out (grounded out to second) and was relieved before the bottom of the 7th. 

The game was 3-2 at the time.  If the Phillies score two in the bottom of the 7th (off Francis or Hawkins), isn’t this the most egregious non-pinch hitting move ever (as far as the media is concerned)?


#13    Tangotiger      (see all posts) 2007/10/04 (Thu) @ 14:39

Just to be clear about correlations: I can take something that has the absolute minimum of a relationship and I can get a correlation to approach 1.00, if I can get enough sample for that.  For example, clutch hitting DOES exist.  But, if all you have is 100 PA, the correlation will be something like r=.02.  If I have 5000 PA, r=.50.  If I have 50,000, r=.90.  See where I’m going here?

So, in the BBTN, when they reported a correlation of clutch of r=.33 or whatever it was, I was not surprised.  They probably had something like 2500 PA per player (in each odd year and in each even year) to begin with.  The result was entirely consistent with what we know about clutch hitting.
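As a rough consistency check on the clutch numbers: the three figures quoted above (r=.02 at 100 PA, .50 at 5000, .90 at 50,000) all fit r=PA/(PA+k) with k of about 5000, since r=.50 is the point where PA equals k. That value of k is an inference from those figures, not something stated in the comment. Turning the formula around gives the sample size needed for a given r:

```python
def expected_r(pa, k):
    """Expected correlation for a sample of `pa` PA, given constant `k`."""
    return pa / (pa + k)


def pa_needed(r, k):
    """PA required to reach correlation `r`: solve r = pa / (pa + k)."""
    return r * k / (1 - r)


CLUTCH_K = 5000  # inferred from "5000 PA, r=.50"; an assumption

for pa in (100, 5000, 50000):
    print(pa, f"{expected_r(pa, CLUTCH_K):.2f}")  # 0.02, 0.50, 0.91

print(round(pa_needed(0.33, CLUTCH_K)))  # 2463
```

The last figure is in line with the guess of about 2500 PA per player behind a reported r of .33.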

The same applies here.  The average number of PA that MGL had in his single year correlation was probably something like 450-500.  But in the two-year, instead of it being 900-1000, it was probably closer to 700.  And in the three-year, probably 800.  MGL could have run a 20-yr correlation and would have STILL gotten an r under .70, simply because he kept his original threshold (350 PA or so) intact.  The average PA per player would have barely moved, since he’s letting in so many players through the “back door” (for example, PITCHERS!).

I hope that’s clear.


#14    Tangotiger      (see all posts) 2007/10/04 (Thu) @ 14:41

Ok, maybe not pitchers, since he has the threshold for both end points.  The main point still stands though.


#15    auntbea      (see all posts) 2007/10/04 (Thu) @ 14:58

Sherwood:  I can’t speak for anyone else obviously but the move to take out Francis immediately after having him bat was either so boneheaded that there really can be no debate whatsoever (making discussing it uninteresting) OR there was some mitigating factor that nobody in the media was made privy to (e.g., he stiffened up running out his grounder) because the Rockies ended up winning.


#16    MGL      (see all posts) 2007/10/04 (Thu) @ 18:32

Regarding #13, that is the WHOLE point of using more than one year to represent true talent.  It has more PA.  The entire point.  What other point is there?  For example, granted Rolen was hurt this year and still is, but to use his 300 PA of terrible hitting this year, as opposed to 5000 PA of great hitting, as representative of his true talent and performance going forward (his injuries and age notwithstanding) is ridiculous.  So many of those “Let’s look at their OPS for the year” comparisons include players with career years and players with bad years and no injuries (both due to random fluctuation).  Sure, when you have a team of 10 players or so, those things tend to even out, but they won’t even out EXACTLY and sometimes they will be a lot off.  And when you are comparing individual players (for true talent and performance going forward), using this year’s stats only is just plain ridiculous.  And that is done ALL the time.


#17    4seamer      (see all posts) 2007/10/04 (Thu) @ 18:39

MGL - Tom..

someone put up the xERA3 and xERA4 formulas floating around so I can evaluate my work please.


#18    James Holzhauer      (see all posts) 2007/10/06 (Sat) @ 05:13

I’m familiar with the Cubs bench.  They certainly had some pinch hitters who were a better bet to produce in that spot than Zambrano, led by Daryle Ward.  Of course, this has to be weighed against the cost of removing him from the game.  It’s probably a close call.

I fully support the decision to put in Marmol.  His ERA won’t be under 1.50 going forward, but his ratios this year and his scouting reports both conclude that the move to relief has done wonders for him.  I’d certainly take a fresh Marmol over a tired Zambrano, and getting Big Z fully rested for Game 4 is gravy.  It just didn’t work out.


#19    tangotiger      (see all posts) 2007/10/06 (Sat) @ 08:24

MGL/16: what I’m trying to say is that by keeping the threshold at a min of 350 for your pool of players, even if you go to multi-year, your correlation will barely go up.  That is, you didn’t prove your point enough, since the correlation will barely change if you use 2 years or 20 years, IF AND ONLY IF you keep your threshold so low.  What you end up doing is keeping the average prior PA per player very similar whether you use 2 years or 20 years.

You go from 100 players with a range of 350 to 1200 PA in the 2-yr sample to 200 players with a range of 350 to 5000 PA in the 20-yr sample, but you are letting in a lot of the extra 100 players with only 350-500 career PA.

In order to prove your point, you would need to keep 100 players in the 2-yr sample AND 100 players in the 20-yr sample.  Then, and only then, can you be assured of a r that goes much higher.

As jinaz/8 says, the change in correlation is not that much different using your process, but that’s a result of the process of letting in a lot of backdoor guys with small samples into the pool.  If Lenny Harris has 200 PA in year x-1 and 360 in year x, he doesn’t make your original pool (single year-to-year).  If he has 180 PA in year x-2, all of a sudden, he’s now part of the 2-yr pool!  That’s not a good way to increase correlation.
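The "back door" effect in the Lenny Harris example can be written out as pool-membership rules. This is a sketch of the selection process as this comment describes it (a 350 PA minimum applied to the combined prior PA), which may not match how the original correlations were actually run; the function names are made up for illustration.

```python
MIN_PA = 350  # threshold as described in this comment


def in_one_year_pool(pa_prior, pa_target, min_pa=MIN_PA):
    """Single year-to-year pool: both seasons must clear the minimum."""
    return pa_prior >= min_pa and pa_target >= min_pa


def in_multi_year_pool(prior_pa_by_year, pa_target, min_pa=MIN_PA):
    """Multi-year pool: the COMBINED prior PA must clear the minimum."""
    return sum(prior_pa_by_year) >= min_pa and pa_target >= min_pa


# The example above: 180 PA in year x-2, 200 in year x-1, 360 in year x.
print(in_one_year_pool(200, 360))           # False: 200 < 350
print(in_multi_year_pool([180, 200], 360))  # True: 180 + 200 = 380
```

A player admitted this way brings only 380 prior PA into the two-year pool, which is why the average prior PA per player barely rises.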


#20    auntbea      (see all posts) 2007/10/06 (Sat) @ 20:42

If one were to include Carlos Marmol’s 2007 post season stats into his projection going into next year, it obviously would not look quite as good, as he has so far given up 3 hits, 3 walks, 3 runs, and 2 home runs in 3 innings (with 6 strikeouts).  He only had 69 IP in the regular season this year, so 3 IP is not insignificant.

Do any projection systems use post season stats?  It seems like they would be useful in situations like this.


#21    MGL      (see all posts) 2007/10/06 (Sat) @ 21:23

I don’t use post-season stats but I could I suppose.  For one thing, you would want to make sure you adjust for the competition.  For another, no matter how many IP he had this year, 3 IP IS insignificant.  Let’s say that he had NO IP this year or in prior years.  How significant would 3 IP be?  Almost none.  So with SOME IP this year and some last year, how significant do you think 3 IP would be?  Less than almost none.


#22    auntbea      (see all posts) 2007/10/07 (Sun) @ 00:02

mgl:  yes, I understand 3 IP is not terribly significant in general.  What surprises me, then, is how a below average projection for Marmol coming into the season can change into an excellent projection after only 69 IP this season.  69 is also not an awful lot of innings.


#23    auntbea      (see all posts) 2007/10/07 (Sun) @ 04:17

Looking at Marmol’s minor league stats I see that he has also pitched 41 very good innings this year for the Iowa Cubs in AAA PCL which should at least partially answer my question above.


#24    MGL      (see all posts) 2007/10/07 (Sun) @ 13:33

yes, I include AA or AAA stats.  He had good, but not great, MLE translations for Iowa.  Too many HR and walks, typical of a young, hard thrower.

100 IP is a lot more than 3 IP (duh), but even though 100 IP is not a lot at all, if a pitcher has ridiculous numbers for 50 or 100 IP, even with regression (which is a lot for a pitcher), he is going to have a nice projection, especially if those ridiculous numbers include good or great K, BB, and HR rates (those are the components that get regressed the least).

Of course, with that few IP, a good or bad (or whatever) projection has a high uncertainty AND it is subject to radical change as that pitcher amasses more IP.


#25    auntbea      (see all posts) 2007/10/07 (Sun) @ 14:15

Part of my point about the 3 innings possibly being significant is that Marmol had only given up 3 home runs in 69 ML innings this year.  Since I was unaware of his AAA stats, his 2 in the playoffs almost doubled his home run rate!  Obviously with batters faced numbers this small it seems like a large regression is probably in order anyway, but it is easy to imagine how 3 innings could be significant if a relief pitcher were to give up a ridiculous amount of home runs (say 6) in those innings after having only given up 3 all year.

It seems like a factor in any decision making process must be the uncertainty one has for the various projections involved.  In an obviously contrived scenario, let’s say you have 2 options to start against an excellent pitcher (say Webb) in the deciding game of a series.  In option 1, you have an absolutely known quantity who is guaranteed to give up exactly 3 runs in every 7 innings of work.  In option 2 you have a mostly unknown quantity who you know averages exactly 4 runs in 7 innings of work, but sometimes gives up many more or many fewer runs.  If this uncertainty is large enough, it seems like starting option 2 would be the better option for this particular game despite the average projection of option 2 being worse than that of option 1.

I’m not at all implying that this must be a factor in the Marmol v. Zambrano discussion.  It just seems like some specific decisions require not only the averages (means) but also the standard deviations of projected performances involved for correct analysis.
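The contrived scenario above can be made concrete with a toy calculation. Everything here is assumed for illustration: option 2's run distribution is invented (chosen only so that it averages exactly 4 runs in 7 innings), your own offense is assumed to score exactly 2 runs against the excellent opposing pitcher, and ties and later innings are ignored.

```python
def mean_runs(dist):
    """Expected runs allowed, given {runs_allowed: probability}."""
    return sum(runs * p for runs, p in dist.items())


def chance_of_leading(dist, runs_scored):
    """Probability the starter allows fewer runs than his team scores."""
    return sum(p for runs, p in dist.items() if runs < runs_scored)


option_1 = {3: 1.0}                    # always exactly 3 runs
option_2 = {1: 0.25, 4: 0.5, 7: 0.25}  # invented spread, mean of 4 runs

runs_scored = 2  # assumed output of your offense against the ace

print(mean_runs(option_1), mean_runs(option_2))  # 3.0 4.0
print(chance_of_leading(option_1, runs_scored))  # 0
print(chance_of_leading(option_2, runs_scored))  # 0.25
```

Under these made-up numbers the higher-variance starter is the only one who can leave with a lead, despite the worse average. Whether any real pair of pitchers differs enough in spread for this to matter is a separate question.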


#26    Tangotiger      (see all posts) 2007/10/08 (Mon) @ 14:41

bea/25: I doubt you can find any real-life cases to support your position. 

What you are suggesting is that you might have a pitcher with say a forecast of 4.00 ERA, but based on extensive performance and scouting data, and another pitcher with a forecast of 5.00 ERA with very limited data, and that because of the very high uncertainty in the second player and very low in the first, that the guy with the higher ERA has a better chance of giving up 2 ER in 6 IP than the guy with the lower ERA.

