THE BOOK--Playing The Percentages In Baseball

Wednesday, September 08, 2010

Secret Sauce?

By Tangotiger, 04:26 PM

I’ve never been a fan of BPro’s Secret Sauce.  First, it’s hard to test a formula that was fit to the very universe of data we are interested in.  There’s no expectation that future years will match the regression.

Secondly, given that there are only three components to the metric, and one of the components is relief performance, and an abnormally disproportionate share of the playoff relief performance has Mariano Rivera in it, and Mariano Rivera has the most extreme performance of any post-season pitcher ever, what the secret sauce could just as well say is: fielding, starting pitcher K rates, and Playoff-Level Mariano Rivera.

And, lo and behold, now that the Yankees aren’t powerhouses since the Secret Sauce was launched, it looks like it doesn’t work any more.

This reminds me a lot of what Bill James did with his secret sauce 25 years ago, in trying to figure out which team would be favored to win in the playoffs.  Just because you can fit a metric to the data doesn’t mean that the out-of-sample data will be similarly fitted.
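To illustrate the out-of-sample point, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, not the actual method of BPro or Bill James: the playoff outcome and all 200 candidate "metrics" are pure random noise, each metric is scored on 15 hypothetical seasons, and the best-looking one is then checked on 15 new seasons. The function names and parameters are invented for this example.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_in_sample_metric(n_metrics=200, n_fit=15, n_new=15, seed=0):
    """Pick the noise metric with the largest in-sample |r|.

    Returns (best, results), where each entry is a tuple
    (in_sample_r, out_of_sample_r). All data are random by assumption,
    so no metric has any real relationship with the outcome.
    """
    rng = random.Random(seed)
    fit_y = [rng.gauss(0, 1) for _ in range(n_fit)]  # outcomes used for fitting
    new_y = [rng.gauss(0, 1) for _ in range(n_new)]  # later, unseen outcomes
    results = []
    for _ in range(n_metrics):
        fit_x = [rng.gauss(0, 1) for _ in range(n_fit)]
        new_x = [rng.gauss(0, 1) for _ in range(n_new)]
        results.append((pearson(fit_x, fit_y), pearson(new_x, new_y)))
    best = max(results, key=lambda r: abs(r[0]))
    return best, results


best, results = best_in_sample_metric()
print("best in-sample r:      %.2f" % best[0])
print("same metric, new data: %.2f" % best[1])
```

Searching through enough candidates will generally turn up something that fits the seasons you already have, even when nothing real is there. The in-sample number for the winner is a product of the search, and there is no reason for the new seasons to repeat it.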

Logic should trump rationalization.  And all these secret sauce metrics are rationalizations.  Show me otherwise.


#1    MGL      (see all posts) 2010/09/08 (Wed) @ 18:23

The “secret sauce” formula is ridiculous, whether the “trend” continued since 2002 or not.  Anything that helps a team win games in the regular season is going to help them win games in the post-season - period.  Any other suggestion is just ludicrous.  Do closers get more high leverage opps per game in the post-season?  I assume yes.  Does offense not matter in the post-season?  Of course it does. Any suggestion otherwise is not even worthy of a comment.  Is defense more important in the post-season?  I doubt it.  In fact, it should be less important since you probably have fewer BIP per game.  The whole thing is just ludicrous.  I don’t know what else to say.  Nate should be ashamed of himself.  About the only thing that might be different in the post-season is experience, at least according to some recent research (I forgot by whom), and even then, I am skeptical (although I don’t recall any major problems with the research), and that isn’t even in the formula…


#2    Phil D      (see all posts) 2010/09/08 (Wed) @ 18:32

Click on my name and there is an interesting thread from this blog about secret sauce from 2 years ago. It’s kind of cool to look back at old threads like that from time to time. Here is what I said at the time (and everyone really contributed as far as poking holes in the theory):

“Even if the main idea behind secret sauce were sound, since when is one season’s worth of WXRL an effective way to measure closer effectiveness? And since when is FRAR a desirable way to measure defense? The idea that WXRL and postseason success have a surprisingly high correlation is like saying that having a last name beginning with M has a high correlation. It’s total applesauce.”


#3    J-Doug      (see all posts) 2010/09/08 (Wed) @ 18:41

@Phil D: Right, the theory wasn’t all too sound in the first place, but one of the great things that it seemed to have going for it was that it appeared to work. That no longer seems true.

@MGL: I’d take issue with that. In October, the lineup isn’t the same, the rotation isn’t the same, I’m gonna bet that the win value of a run isn’t the same (but I could be wrong), and most importantly the competition isn’t the same. In other words, it’s a good possibility that replacement level is different in the regular season than it is in the postseason, and if that delta isn’t equal for all components of RAR then you’re not going to see the same correlations.

I felt Nate made a pretty good argument as to why the postseason is different in his Secret Sauce articles, even if the stats don’t seem to be backing him up now.


#4    J-Doug      (see all posts) 2010/09/08 (Wed) @ 18:44

*I meant roster, not lineup.


#5          (see all posts) 2010/09/08 (Wed) @ 18:55

J-Doug, all I am saying is that the team with the better offense OR the better defense, or the better closer, OR the better starters, or some combination of the above, is the team that is more likely to win, in the reg and in the post. 

Now, whether the relative weights of those things are exactly the same in the reg season and the post-season, is another story.  It probably isn’t.  As I said, closers get more IP per game and in more high leverage situations. Obviously the top 2 or 3 starters get most of the IP.  Run scoring is lower, which changes things a little. But the idea that offense isn’t important and that defense is SO important, for example, is just beyond absurd.  And the idea that those relative weights change dramatically is equally absurd.

As Tango said, logic trumps ANY kind of statistical analysis, when the thing being analyzed is not rocket science and we already understand it quite well…


#6          (see all posts) 2010/09/08 (Wed) @ 18:56

Phil, that link does not work for me.  Just takes me back to the current Book blog…


#7    J-Doug      (see all posts) 2010/09/08 (Wed) @ 19:02

@MGL: Agreed. The logic was ad hoc (and probably post hoc), at best.


#8          (see all posts) 2010/09/08 (Wed) @ 19:03

From the NYT article describing Silver and Perry’s “work.”

“...they also found no significant correlation between any measure of team offense (including bunting and stealing) and postseason success.”

Does anyone believe that?

So let’s say that the WS are playing the Yankees this year in the post-season.  And when you go through the matchups for, say, a 7-game series, you find that the pitching (starters, closers, etc.) and the defense of the two teams are almost identical.

Do you think that the proper line for the series is even money, even if the Yankees have an offense that is .5 runs per game better?  How could that possibly be?

That is EXACTLY what Perry/Silver and their regression say!  Exactly!

Can you say “Type II error...”
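The even-money question can be put in rough numbers with a short Python sketch. The assumptions are mine and not from the article or the regression: a baseline of 4.5 runs scored and allowed per game, the basic Pythagorean formula with exponent 2 to turn runs into a per-game win probability, independent games, and no home-field effect. The function names are made up for illustration.

```python
from math import comb


def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def series_win_prob(p_game, wins_needed=4):
    """Chance of winning a series (first to wins_needed wins).

    Sums over k = number of losses taken before the clinching win,
    assuming each game is won independently with probability p_game.
    """
    q = 1.0 - p_game
    return sum(
        comb(wins_needed - 1 + k, k) * p_game ** wins_needed * q ** k
        for k in range(wins_needed)
    )


# Assumed matchup: identical pitching and defense (4.5 runs allowed each),
# but one offense scores 0.5 runs per game more than the other.
p_game = pythag_win_pct(5.0, 4.5)
print("per-game win probability: %.3f" % p_game)
print("best-of-7 win probability: %.3f" % series_win_prob(p_game))
print("best-of-5 win probability: %.3f" % series_win_prob(p_game, wins_needed=3))
```

Under these assumptions the team with the better offense comes out as a clear favorite in both series lengths, not an even-money proposition. A different run environment or exponent would move the numbers somewhat, but not the direction.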


#9    J-Doug      (see all posts) 2010/09/08 (Wed) @ 19:23

@MGL: I haven’t been able to find an offensive correlation myself, although I’m sure my methods aren’t nearly as sophisticated as yours, Tango’s or Silver’s.

The issue of course is that luck plays so much more into a 5 or 7 game series than it does into a 162 game season. It’d make much more sense that nothing is significantly correlated due to the outsize effect of chance than if only a few stats have anything to do with the outcome. The hope of Silver and Perry is that something is so reliable that chance wouldn’t overpower it in October.

To me this seems very similar to the issue of using season-level components vs. game-level components to measure defense. The former method is obviously more accurate, but it requires more work and is harder to obtain. The latter is less accurate but easier to do. Silver and Perry felt they found something that was easy but also accurate enough.



#11    J-Doug      (see all posts) 2010/09/08 (Wed) @ 21:37

Thanks for the link, Phil. I wasn’t reading this blog back then. Really, if there are any stats that do better in predicting postseason success than others, it should be the ones with the smaller standard error (which, presumably, would be more powerful to overcome the noise of random chance in a short series). If offense is actually less important, we should see it as statistically significant but with a diminished beta coefficient.

Also, I liked MGL’s analogy to pharmaceutical research, although quite a few drugs are developed before we figure out why they work after accidentally discovering that they do. Early chemotherapy comes to mind, as do serotonin reuptake-inhibiting anti-depressants (we still have only vague, uncorroborated theories as to why they work).


#12    MGL      (see all posts) 2010/09/08 (Wed) @ 21:39

J-Doug, I don’t have any method!  What method do you need to know that all other things being equal (which is what a multiple regression does, right?), the team with the better offense is the favored team in one game or in one series.

And yes, it is true that because we are talking about 5 and 7-game series, nothing should have a high correlation, especially with such a small sample of series and games.  Which is why you are going to have your share of Type I and Type II errors in the regression.

And as Tango and I have said many times, these are all Bayesian probabilities, where “common sense” (actually what we already know, based on past analysis) is the “prior probability.” If I run a regression on winning regular season games with a small sample and I find that pitching talent of the starting pitcher is NOT correlated with winning at the game level, do you think I probably have a Type II error (or I made a mistake in my analysis)?  You bet I do!
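How easy it is to commit a Type II error with a small sample of series can be sketched with an exact binomial calculation in Python. The setup is hypothetical and simpler than a multiple regression: suppose the team with the better offense truly wins 60% of its series, and we test "no effect" (50%) one-sided at the 5% level. The 60% figure, the sample sizes, and the function names are all assumptions for illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n, alpha=0.05):
    """Smallest win count k such that P(X >= k) <= alpha when p = 0.5."""
    for k in range(n + 2):
        if binom_tail(n, k, 0.5) <= alpha:
            return k


def type_ii_error_rate(n, true_p, alpha=0.05):
    """Chance the test fails to reject 'no effect' when the effect is real."""
    k = critical_value(n, alpha)
    return 1.0 - binom_tail(n, k, true_p)


# Hypothetical sample sizes: number of playoff series observed.
for n in (50, 100, 400):
    print("n = %3d series: Type II error rate = %.2f" % (n, type_ii_error_rate(n, 0.6)))
```

With only a few dozen series, a real edge of this assumed size is missed more often than it is detected, so a finding of "no significant correlation" says very little by itself. That is where the prior, meaning what we already know from past analysis, has to carry the weight.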


#13    mettle      (see all posts) 2010/09/08 (Wed) @ 22:35

Can’t the Yankees question be answered by taking out Yankees games from 1995-2001 or for all years? What are the 95-09 results if you do that?
More importantly, shouldn’t this be more correctly addressed by making sure you include TEAM and YEAR as random effects in your analyses (or PLAYER where appropriate)? I only admire baseball stats analyses from afar, so I don’t know the particulars, but is that not how they are typically done?
If so, this would seem to be a pretty big oversight that leaves most analyses subject to the criticism that one team/year/player is driving the whole effect.
Maybe I should be getting my hands dirty…


#14    J-Doug      (see all posts) 2010/09/08 (Wed) @ 22:46

@MGL: Right, I’m not arguing with that.


#15          (see all posts) 2010/09/10 (Fri) @ 03:07

I’m guessing the failure to consider the possibility of a type II error is the second most common type of faulty statistical interpretation, the most common being the failure to consider the possibility of a lurking variable.

