THE BOOK--Playing The Percentages In Baseball


Sunday, May 11, 2008

Streaks: Has the whole world lost its collective mind?

By , 07:20 PM

This is the headline and a snippet from an MLB article on the Tigers page from yesterday:
Tigers waiting for Cabrera to bust out

DETROIT—The Tigers’ new $153.3 million man has not lived up to expectations. But that’s not to say he won’t eventually.

Leyland apparently agrees with that sentiment.

I just looked at my current player database as of a few days ago, and Cabrera had a +33 per 150 park-neutral lwts.  What the hell are they talking about?


#1          (see all posts) 2008/05/11 (Sun) @ 20:10

Another person who has lost her mind is Christina Kahrl from the “sabermetric” Baseball Prospectus. From Saturday’s free article currently available on the site’s main page.

“the struggles of [A] and [B] have gotten so desperate that it’s more than time for the team to have someone to choose as an alternative instead of the equally stone-cold [C] and [D].”

I omit the players’ names because it frankly doesn’t matter who they are. Here we have a writer from a website that positions itself as the pinnacle of baseball analysis falling into the same trap that dopes in the mainstream media fall into all the time. A and B have 77 and 152 PAs, respectively, and came into the season with projected wOBAs of .352 and .345 (using ZiPS). Their poor numbers this year have dimmed those projections only a bit, and if they were the best options on April 1, they certainly still are today. It continues:

“having [E] active to alternate with them and create a more direct challenge to them for playing time might light a fire under somebody”

No evidence exists of the kind of effect suggested there. In Kahrl’s entire 1,925-word transaction analysis, there is very little, if anything, that can be called sabermetric analysis. And on most days, BP expects people to pay for dreck like this.


#2    Billy      (see all posts) 2008/05/11 (Sun) @ 20:10

From Baseball Prospectus:

Miguel Cabrera 160 .252 .338 .439

That ain’t breaking out.


#3    Steve      (see all posts) 2008/05/11 (Sun) @ 20:53

I would assume they don’t think a .252 AVG, with an OBP that has dropped by .063 from last season and a SLG that has dropped by .126, is what they paid for, especially given that he’s only 25.


#4    tangotiger      (see all posts) 2008/05/11 (Sun) @ 21:01

MGL, that doesn’t sound right.  Unless Comerica is a big pitcher-friendly park, Cabrera has been hitting around league average.

Do you mean his revised forecast is now +33?


#5    Vegas Watch      (see all posts) 2008/05/11 (Sun) @ 21:47

He has an OPS+ of 113.  In the last three years, he’s been at 151, 159, and 150, respectively.


#6          (see all posts) 2008/05/11 (Sun) @ 22:07

Phil, I agree with you 100%.  And I just posted some data that suggests that short-term BAD performance (at least by BA) in April by hitters has NO predictive value other than how it changes the Marcel.  And that jibes with the results of my “hot and cold” research in The Book.

Pitchers seem to be a little different animal.

Tango, let me check.  And of course, I always use LONG-TERM park factors (adjusted for the current parks in the league) for the park adjustments, as I should. You should NEVER care how a park is playing for the time period you are adjusting a player’s stats, unless that time period is a long one, or you are having some funky weather, and even then, you are much better off adjusting for weather AND long-term “true” park effects.  And yes, Detroit is a decently extreme pitcher’s park.  .97 is their run factor, based on the components. 

Anyway, that (park adjustments) shouldn’t matter that much.

On May 4, Cabrera was .279/.369/.520, which is pretty darn good, especially considering the low offense in the AL this year so far (for whatever that is worth). I guess he has really slumped since then.

And of course ANY team which expects a GOOD or BAD player to have the same performance as in the past, does not understand regression toward the mean! (Although, if a player is young and good, or old and bad, his age decline or improvement can actually wipe out the regression.)

Anyway, I was using May 4 data to get the +33.  His line stats on that date were:

PA=121 (AB+BB+HP-IBB)
s=17
d=5
t=1
hr=6
uibb+hp=15
so=19

That works out to a lwts per 150 of around +20.  I am not sure where I got the +33.  Maybe the park adjustment takes it up that high.
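For anyone who wants to check the arithmetic, here is a rough sketch of a linear-weights calculation on that stat line. The run values in `WEIGHTS` are round illustrative numbers, not necessarily the ones used above, and scaling to 650 PA as a stand-in for "per 150" games is also an assumption. There is no park adjustment.

```python
# Illustrative linear weights (runs per event, relative to league average).
# ASSUMPTION: round textbook-style values, not the weights used in the comment.
WEIGHTS = {"s": 0.47, "d": 0.77, "t": 1.04, "hr": 1.40, "bb_hp": 0.32, "out": -0.28}


def lwts_per_season(pa, s, d, t, hr, bb_hp, season_pa=650):
    """Linear-weights runs above average, scaled to `season_pa` plate appearances.

    ASSUMPTION: `season_pa=650` stands in for "per 150 games".
    """
    outs = pa - (s + d + t + hr + bb_hp)  # every remaining PA is treated as an out
    runs = (
        s * WEIGHTS["s"]
        + d * WEIGHTS["d"]
        + t * WEIGHTS["t"]
        + hr * WEIGHTS["hr"]
        + bb_hp * WEIGHTS["bb_hp"]
        + outs * WEIGHTS["out"]
    )
    return runs / pa * season_pa


# Stat line quoted above: PA=121, s=17, d=5, t=1, hr=6, uibb+hp=15.
cabrera = lwts_per_season(pa=121, s=17, d=5, t=1, hr=6, bb_hp=15)
print(round(cabrera, 1))
```

With these assumed weights the result lands in the same neighborhood as the +20 figure, with no park adjustment applied.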


#7    Vegas Watch      (see all posts) 2008/05/11 (Sun) @ 22:34

“On May 4, Cabrera was .279/.369/.520”

That was actually May 1.  He’s hit .171/.237/.200 in 38 PAs since.  An 0/4 really kills your SLG this early in the year- he went from .519 to .500 in one day.
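The one-day drop is easy to reproduce from the stat line quoted earlier in the thread (17 singles, 5 doubles, 1 triple, 6 home runs). A quick sketch; the at-bat total of 104 is an assumption inferred from 29 hits and a .279 average.

```python
def slg(total_bases, at_bats):
    """Slugging percentage: total bases divided by at-bats."""
    return total_bases / at_bats


# Hits by type from the stat line quoted earlier in the thread.
total_bases = 17 * 1 + 5 * 2 + 1 * 3 + 6 * 4  # 54 total bases
at_bats = 104  # ASSUMPTION: inferred from 29 hits and a .279 average

before = slg(total_bases, at_bats)
after = slg(total_bases, at_bats + 4)  # an 0-for-4 adds four at-bats, no bases
print(f"{before:.3f} -> {after:.3f}")  # prints 0.519 -> 0.500
```

Four hitless at-bats on a base of about 100 move the number by almost 20 points, which is the small-sample problem in miniature.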


#8    MGL      (see all posts) 2008/05/12 (Mon) @ 01:44

If an 0/4 can kill a stat, then, by definition, that stat cannot be worthwhile in terms of predictive value.  That should be obvious to even someone who knows nothing about what we are talking about.  IOW, to a casual fan, if a player on May 1 is doing very well and then on May 10, he is doing abysmally, it should be intuitively obvious that it is too small a sample to draw any conclusions about anything.  Of course, that is trying to introduce logic into someone’s argument that has little logic in the first place, although “logic” is not really the right word.

In any case, I stand corrected about Cabrera “not living up to his hype or his expectation,” as of May 10 or so.

I also love when people say that eventually “so-and-so” (insert player or team), who is slumping, will go on a tear, and then when they do, they say, “See, I told you so!”

Of course, eventually EVERYONE will go on a tear, whether they are good or bad, although the good ones are more likely to do so than the bad ones, and in less time.

Every year, there are several teams that are, “Good, no bad, wait, good, no, sorry bad, good, bad, nope, they’re good again...”

Last year it was the Yankees and probably a few others.  This year, so far, it is the Tigers, Mets, Royals, Seattle, and Baltimore.  Soon it will probably be Florida (who is still a shitty team with a great record - nothing has changed from before the season), maybe Arizona (decent, but not great team), and I’m sure some other teams as well.


#9          (see all posts) 2008/05/12 (Mon) @ 02:55

I’m still relatively new to statistical analysis, so what I’m asking is probably somewhat novice, but what exactly constitutes an “acceptable” sample size? 300 PAs?


#10    Colin Wyers      (see all posts) 2008/05/12 (Mon) @ 04:18

I’m still relatively new to statistical analysis, so what I’m asking is probably somewhat novice, but what exactly constitutes an “acceptable” sample size? 300 PAs?

There’s no one threshold where a sample size suddenly changes from being invalid to valid. Everything needs to be regressed to the mean - eventually you may reach a point where the amount of the regression is so small that it doesn’t noticeably affect the outputs, but that generally occurs when you’re looking at a guy’s career numbers, if at all.

Beyond that, it really depends on what skill/measurement you’re talking about.


#11          (see all posts) 2008/05/12 (Mon) @ 05:37

#10 - Thanks for clearing that up, Colin. Makes more sense than what I had figured.


#12    MGL      (see all posts) 2008/05/12 (Mon) @ 05:46

#10, good answer of course.  And #9, a good and common question.

Unfortunately, it is a little like “chaos theory.”

Two examples of that are:

1) “Why should anyone vote if none of their individual votes has more than a 1 in a billion chance of mattering?”

2) If I pulled out all of my hair one hair per day at a time, each day you would never notice the difference, but eventually I would be bald!  At what point do you notice?

The second one is a good analogy to the, “At what point does a sample size matter” question.

Colin’s second paragraph is really important, Aaron.  FIRST, you have to answer the question, “Does the skill/measurement I am looking at have a lot of variation in ‘true talent’ (like a player’s BB or K rate) or little (or even no) variation (like a pitcher’s BABIP, or a RHB platoon differential)?”  If the answer is a lot (of variation), then the “certainty” of the sample result climbs quickly as the sample size increases.  If the variability of the measure is small, then the certainty of the sample measurement climbs slowly (doesn’t climb at all, if the variability is zero) as the sample size increases.

That is why we like to give the “50% regression point” for a measurement.  It tells us that variability, at least as compared to some other measurement.  Suppose OPS needs 500 PA to be regressed 50% toward the mean.  That means that after 500 PA, a player’s OPS averaged with the league average OPS (more accurately, the mean OPS for that “kind” of player) gives us our best estimate of that player’s true OPS, or what we expect him to do going forward, not counting aging, injury, etc.  Now suppose another measurement (like maybe a RHB platoon differential) needs 3000 PA for a 50% regression.  Then there is more variability in OPS skill in the population than there is in platoon differential for RHBs, and we need a much larger number of PA for a player’s sample platoon differential (what he DID) to be close to our estimate of his true platoon differential (what we expect going forward) than we do for his OPS.
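As a rough illustration of how a 50% regression point turns into an estimate, here is a minimal sketch. It assumes the common weighting form PA / (PA + K), where K is the 50% regression point. The function name and the sample OPS numbers are hypothetical, and the 500 and 3000 PA figures are just the "suppose" values from the paragraph above.

```python
def regress_to_mean(observed, pa, population_mean, half_point_pa):
    """Blend an observed rate with the population mean.

    ASSUMPTION: weight on the observed value is pa / (pa + half_point_pa),
    so at pa == half_point_pa the two are averaged 50/50.
    """
    weight = pa / (pa + half_point_pa)
    return weight * observed + (1 - weight) * population_mean


# Hypothetical player: .900 OPS in 500 PA, population mean .750.
ops_estimate = regress_to_mean(0.900, 500, 0.750, half_point_pa=500)

# Same 500 PA on a stat with a 3000 PA half point: far less weight on the sample.
slow_weight = 500 / (500 + 3000)

print(round(ops_estimate, 3), round(slow_weight, 3))
```

At the half point the estimate sits exactly midway between the sample and the mean. With a 3000 PA half point, the same 500 PA get only about one-seventh of the weight.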

(BTW, with platoon differential, the smaller of the two PA totals, versus LH or versus RH, is the one that tells us the most about the certainty of the sample differential.  So if a LH batter has faced 1500 RHP and only 200 LHP, we use mainly that 200 to determine how much to regress his sample platoon differential toward the mean.  Even though he has faced a total of 1700 pitchers, we will regress his platoon differential a lot because only 200 of those were LHP.  Obviously, if a guy faces 5000 RH pitchers and 3 LHP, you know intuitively that his platoon differential is meaningless in terms of what it will likely be going forward (his “true” differential).  Keep that in mind when you hear commentators quote players’ platoon splits, as well as the fact that you need a jillion PA for a RHB’s platoon split to not get regressed almost all the way toward the mean.)
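One way to see why the smaller side dominates: the sampling variance of a difference between two rates is proportional to 1/n1 + 1/n2. The sketch below turns that into an "effective" PA count. It assumes the same per-PA variance against both kinds of pitcher, and it is only an illustration, not necessarily the weighting MGL uses.

```python
def effective_pa(pa_vs_rhp, pa_vs_lhp):
    """Effective sample size for a difference between two rates.

    ASSUMPTION: equal per-PA variance on both sides, so the variance of the
    difference is proportional to 1/n1 + 1/n2.
    """
    return 1 / (1 / pa_vs_rhp + 1 / pa_vs_lhp)


lopsided = effective_pa(1500, 200)  # the 1500 vs. 200 example above
extreme = effective_pa(5000, 3)     # the 5000 vs. 3 example above
print(round(lopsided, 1), round(extreme, 2))
```

Under that assumption the 1700 total PA behave like fewer than 200, and the 5003 total PA behave like fewer than 3.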


#13    tangotiger      (see all posts) 2008/05/12 (Mon) @ 07:43

Yes, I love the 50% regression point threshold.  It means that at that level, half of what you are seeing is real, and half is not.  After 200 or so PA, the r=.50, so half of what you are seeing is real.  But, if you are looking only at the K rate, then you need less, say 150 PA.  If you are looking just at GB rates, then you need even less, like 100 PA.  (Don’t remember exactly what.)
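To put those thresholds side by side, here is a small sketch that asks how many PA each stat would need to reach a given r. It assumes r grows as PA / (PA + K), with K the 50% point. The 200, 150, and 100 PA values are the approximate figures from this comment, not exact ones, and `pa_needed` is a hypothetical helper.

```python
def pa_needed(target_r, half_point_pa):
    """PA at which r = PA / (PA + half_point_pa) reaches target_r.

    ASSUMPTION: r follows the PA / (PA + K) form, with K the 50% point.
    """
    if not 0 <= target_r < 1:
        raise ValueError("target_r must be in [0, 1)")
    return half_point_pa * target_r / (1 - target_r)


# Approximate 50% points quoted in the comment (not exact values).
HALF_POINTS = {"overall": 200, "K rate": 150, "GB rate": 100}

for stat, k in HALF_POINTS.items():
    print(stat, round(pa_needed(0.5, k)), round(pa_needed(0.8, k)))
```

Under this form, getting from r=.50 to r=.80 takes four times as many PA, whatever the stat.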


#14    Lou      (see all posts) 2008/05/12 (Mon) @ 09:51

#1, Phil D - she wasn’t talking about GOOD players. When bad players struggle, I don’t understand the objection to suggesting a change.


#15    Jon      (see all posts) 2008/05/12 (Mon) @ 12:29

I completely agree that getting excited about Cabrera’s (relatively) slow start is stupid.

But…

I didn’t read the article, but from what you quoted, it just seems to be saying that he’s not hitting as well as he was expected to.  Who’d disagree with that?

I think that even his numbers on May 1 would be considered a disappointment.


#16    andy      (see all posts) 2008/05/12 (Mon) @ 12:44

Isn’t “Waiting for Cabrera to Bust Out” a non-sabermetric way of saying, “he is going to regress upwards towards the mean”?


#17    Rally      (see all posts) 2008/05/12 (Mon) @ 12:50

The Tigers are an example of what can go wrong when you have a good GM and give him too much money.

Dombrowski did a great job building the team that went to the Series in 2006.  He did overpay for a few free agents (Mags, Pudge) but built most of the team through the farm system and by finding undervalued players like Polanco, Marcus Thames, Carlos Guillen, and Kenny Rogers.  Brandon Inge was a productive player at reasonable cost despite not hitting much, thanks to his great defense.

Now the Tigers are built in the most predictable way an “overpaid mediocre team” can be built: guys who can hit but are terrible defenders (Cabrera, Renteria), expensive vets well past their prime (Sheffield), and pitchers with a track record and price tag who aren’t especially good (Willis).


#18    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 12:53

One place that Pinto has a leg up on the competition is here:
http://www.baseballmusings.com/cgi-bin/PlayerInfo.py?StartDate=01/01/2008&EndDate=05/01/2008&GameType=all&PlayedFor=0&PlayedVs=0&Park=0&PlayerID=1744

Cabrera, through games of May 1, 2008:
.279/.369/.519

And since then:
http://www.baseballmusings.com/cgi-bin/PlayerInfo.py?StartDate=05/02/2008&EndDate=05/12/2008&GameType=all&PlayedFor=0&PlayedVs=0&Park=0&PlayerID=1744
.171/.237/.200

His Marcel entering 2008 was:
.328/.409/.558

His performance through May 1 was a mild disappointment, but clearly within the bounds of random variation.  That is, we can’t tell if it was pure luck that he didn’t perform well, or if something innate about him changed.
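A back-of-the-envelope check on "within the bounds of random variation," using OBP only. This sketch treats each PA as an independent trial with the Marcel OBP as the true rate, which is a simplification. The 121 PA count is taken from the stat line quoted earlier in the thread (it excludes intentional walks).

```python
import math


def z_score(observed_rate, expected_rate, trials):
    """Standard deviations between an observed rate and an expected rate.

    ASSUMPTION: each PA is an independent binomial trial at `expected_rate`.
    """
    standard_error = math.sqrt(expected_rate * (1 - expected_rate) / trials)
    return (observed_rate - expected_rate) / standard_error


# .369 OBP through May 1 vs. a .409 Marcel OBP, over roughly 121 PA.
z = z_score(0.369, 0.409, 121)
print(round(z, 2))
```

Under those assumptions the gap is less than one standard deviation, the kind of shortfall that chance alone produces all the time.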

He’s had a horrible 10 days since.  Wake me up when this is the only great hitter this year or last who has gone through a 10 day drought where he hits like a pitcher.


#19    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 13:04

Rally/17: good observation.  It’s incredible what happens when you force a budget on someone.  The NHL’s payroll cap forces the teams to think smart.  Unfortunately for the Tigers, the savings that Ilitch has made on being forced to not spend as much as he would like on the Red Wings(*) have spilled over to the Tigers.


(*) The Red Wings would overspend like crazy, and they were routinely one of the best teams in the league for a long long time.  Now, their payroll spending has been forced down by the league cap, they spend smart, and they are in the conference finals.  Basically, they are smart.

The difference is that in hockey, you don’t have all these crazy numbers you have in baseball, where HR and RBI have a life of their own and fielding is always an afterthought.  Scouting is hugely important. 

The baseball GMs, some of them anyway, don’t know how to translate wins to dollars, even though it is fairly easy to do so.  Almost any player can be valued for dollars in under 5 minutes, and don’t listen to anyone who tells you otherwise.  Just one of those things where having too much information makes you feel that you are justified in your moves.

I have to believe that once USA Today started publishing all the NFL stats 15-20 years ago, that betting went up like crazy, as everyone now had a “secret recipe” for beating the odds.  That’s a (small? big?) portion of the baseball management for you right now. 


#20    MGL      (see all posts) 2008/05/12 (Mon) @ 14:56

#19, that is a great passage!  Is that yours or from something someone else wrote?


#21    Tangotiger      (see all posts) 2008/05/12 (Mon) @ 16:17

That’s me. (*)

(*) But the effectiveness of the format is due to the Poz-terisk, one of the great things to come out of Posnanski’s blog.


#22    MGL      (see all posts) 2008/05/12 (Mon) @ 20:22

So what is the “poz-terisk” other than an asterisk in parenths?  Does that mean that something is REALLY important?

If that is the case, ALL of my posts should have them! ;)


#23    tangotiger      (see all posts) 2008/05/13 (Tue) @ 00:28

Just an interlude or tangential thought that you can skip over.  But that thought is usually what makes Poz’s posts the best.  Right now, I’m not using it effectively, but it’s useful as a footnote for me.


#24    Bjorn      (see all posts) 2008/05/14 (Wed) @ 09:58

Sorry for a somewhat frivolous comment but…

Doesn’t “to lose one’s mind” imply that you actually were “of sound mind” at some point? Personally, I am far too cynical to make that assessment for mankind.


#25    MGL      (see all posts) 2008/05/14 (Wed) @ 15:07

Even though I was somewhat wrong about this whole Cabrera thing, that is a good point Bjorn!

