THE BOOK--Playing The Percentages In Baseball

Sunday, July 24, 2011

My issue with regression equations

By Tangotiger, 10:35 PM

Patriot captures it right here:

Building your metric around a run estimator does not necessarily restrict you to simply plugging in the numbers in the appropriate place. Suppose you wanted to construct a metric based on batted ball types, strikeouts, and walks. One way to go about it would be to simply go through and estimate singles, doubles, triples, homers, and outs in play based on the percentage of each batted ball type that wind up as each. So, you would end up with equations that might look something like this:

Singles = .057FB + .217GB + .516LD + .017PU

However, if you believe that you have gleaned some other insights into the relationship between events that could improve your metric (such as strikeout pitchers having lower HR/FB rates), you could still build that into your formula for estimated home runs, and plug those into the run estimator.
It’s more difficult than running a regression, and a more delicate balancing act (at least in terms of developing the formula), but it allows you to stay grounded in a model that estimates runs by taking a first step of, well, estimating runs.
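
As a rough illustration of the component-estimation step described in the quote, the singles equation can be written as a small function. The four coefficients are the ones quoted above; the function name and the sample batted-ball counts are hypothetical, chosen only for illustration.

```python
# Coefficients quoted above: share of each batted-ball type that becomes a single.
SINGLE_RATES = {"FB": 0.057, "GB": 0.217, "LD": 0.516, "PU": 0.017}

def estimate_singles(batted_balls):
    """Estimate singles from counts of fly balls, grounders, liners, and pop-ups."""
    return sum(SINGLE_RATES[kind] * count for kind, count in batted_balls.items())

# Hypothetical pitcher profile (made-up counts, not real data).
profile = {"FB": 180, "GB": 260, "LD": 110, "PU": 40}
estimated_singles = estimate_singles(profile)
print(round(estimated_singles, 1))
```

Similar equations for doubles, triples, and homers would feed the same run estimator; their coefficients are not given in the quote, so they are left out here.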

He’s saying this (or if he’s not saying it, then that’s how I am reading it, and, in any case, it’s how I think about it):

1. You start with a working model of how runs are created.  This is the beauty of something like BaseRuns, because it works so darn well… GIVEN its inputs.  If you know the number of hits, HR, walks, outs, then we have a fantastically great estimate as to how many runs are expected to be scored.

2. If you don’t know the inputs, estimate the inputs… but don’t change the actual run scoring model.  So, again, if you happen to not have the number of doubles, but can estimate the number of doubles that this pitcher either gave up, deserved to give up, or was expected to give up, and it’s based on his batted ball distribution profile, and/or the number of HR he gave up, and/or his SO/BB ratio, then estimate the doubles in that manner.... but do NOT touch the run scoring model.

Once you have the estimates of all your inputs, then you can plug them into an established working model.
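
The two steps above can be sketched in code. The post does not spell out a BaseRuns formula, so the version below is an assumption: one commonly cited simple form, with A = baserunners, B = advancement, C = outs, D = home runs, and runs = A*B/(B+C) + D. The function name and the sample totals are hypothetical. The point is only structural: estimated inputs go in, and the run model itself stays untouched.

```python
def base_runs(ab, h, tb, hr, bb):
    """One commonly cited simple BaseRuns form (an assumption, not taken from the post)."""
    a = h + bb - hr                                       # baserunners, excluding HR
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs
    d = hr                                                # home runs always score
    return a * b / (b + c) + d

# Step 2: if an input is unknown, estimate it upstream (here, a made-up doubles estimate).
estimated_doubles = 280
singles, triples, hr = 940, 30, 150   # hypothetical team totals
hits = singles + estimated_doubles + triples + hr
total_bases = singles + 2 * estimated_doubles + 3 * triples + 4 * hr

# Then plug everything into the unchanged run model.
runs = base_runs(ab=5500, h=hits, tb=total_bases, hr=hr, bb=500)
print(round(runs))
```

A better doubles estimate changes only `estimated_doubles`; nothing inside `base_runs` has to move.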

Even something like FIP is basically a regression equation, because it doesn’t adhere to an actual run scoring model.  Of course, there is a tradeoff with the complexity level: a linear equation is used at the expense of a real baseball run scoring model because it’s easier to compute or understand.  But, if you’ve got a complex linear equation, or even a complex multiplicative equation, or some other form of equation, then you’ve got the worst of both worlds.

This is why I like FIP or wOBA: they are such simple metrics that their strengths and limitations are readily apparent.
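
To make the simplicity concrete, here is a minimal sketch of FIP in its commonly cited form, (13*HR + 3*BB - 2*K)/IP plus a constant. The formula is not quoted in the post, so treat the weights as an assumption; the constant in particular varies by season and league, and 3.10 is only an illustrative default. Many versions also add HBP to the walks term.

```python
def fip(hr, bb, k, ip, constant=3.10):
    """Commonly cited FIP form; the constant is an illustrative placeholder."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant

# Hypothetical pitcher line (made-up numbers).
print(round(fip(hr=20, bb=50, k=200, ip=200.0), 2))
```

Every term is visible at a glance, which is what makes the strengths and limitations easy to see.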

So, ANY pitcher metric that is not grounded in BaseRuns is immediately set up for a limitation.  The bigger your limitation, the simpler your metric must be.

SIERA is a good example of a metric that is too complex for its own good.  The insights and benefits of SIERA are hidden inside its complexity.  But, if Matt were to follow Patriot’s lead here and compute estimates for events (1B, 2B, 3B, HR, BB, SO) based on his findings about how things interact, then we would have a very helpful metric.

So, that’s my recommendation as to how you can really advance the cause: keep the logic of baseball intact if you insist on complexity.


#1    Patriot    2011/07/24 (Sun) @ 23:14

Yes, that’s exactly what I was getting at. Well said.


#2    Lee    2011/07/25 (Mon) @ 11:40

Yea, this is the best summation of the entire SIERA inkstorm that I’ve read so far.


#3    Matt    2011/07/25 (Mon) @ 13:24

Agreed, you and Patriot put into words what was lurking in the back of my mind. The “throw everything (every term, squared term, interaction) at the wall and see what sticks” method of regression analysis always seems not very well-grounded, and you’ve explained why: it’s not based on anything in reality.

Still I know that it’s tempting, because I do it myself when I’m fooling around with data!


#4    Tangotiger    2011/07/25 (Mon) @ 13:40

The thing with SIERA is that when you read Matt’s explanation, he DOES break things down, and tries to tell you why he’s using each component.  In a sense, while he tested the “let’s throw everything against the wall” approach, he then tries to figure out the reason why we might be seeing certain things.

It’s just the next step that’s the issue.  Should he simply merge it all into one formula, or break it down like Patriot is suggesting?

Take for example the regression of events to team runs scored.  The run value of the double is something like +.15 above the single, which is not only ridiculously low, but completely wrong.  But, what if the double is capturing some hidden relationship (i.e., teams that hit a lot of doubles are slow runners)?  Then what?  Well, either you do something like:

hitting = .50*singles + .80*doubles +…
baserunning = -.15 * doubles

OR, you do:
offense = .50*singles + .65*doubles +…

Now, which one is clearer?  And, what if you finally improve the baserunning equation to the point that you don’t need doubles?  Do we want to see the second equation change from .65 to .80 because some OTHER variable was introduced?

Or, do we prefer it as in the first set of equations?

This really comes down to mostly a question of presentation.
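
A quick sketch of the two presentations, using only the singles and doubles terms given above (the remaining terms are elided in the comment and are left out here). The function names and the sample counts are hypothetical.

```python
def split_presentation(singles, doubles):
    hitting = 0.50 * singles + 0.80 * doubles   # run value of the hits themselves
    baserunning = -0.15 * doubles               # doubles standing in for slow runners
    return hitting, baserunning

def merged_presentation(singles, doubles):
    return 0.50 * singles + 0.65 * doubles      # same total, one blended coefficient

hitting, baserunning = split_presentation(1000, 300)
offense = merged_presentation(1000, 300)
print(hitting, baserunning, offense)
```

The totals agree. The split form just shows where the -.15 comes from, so a better baserunning estimate could replace that term without disturbing the .80 on doubles.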

***

We can be just as harsh on Baseball Prospectus for showing its percentile forecasts.  I would guess that if I were to quiz the BPro readers, 90% of them would fail my test.

Take for example Felix’s forecast:
http://www.baseballprospectus.com/card/card.php?id=HERNANDEZ19860408A

The data under the W/L columns go from at best 16-6 to at worst 15-7.

And if I were to ask, “What are the chances that Felix will be .500 or worse, given what you see anywhere in that ‘2011 Forecast’ section?”, what do you think they will answer?

Or look at the data under HR, where it is at best 15 and at worst 15.  And if I were to say, “What are the chances he will give up at least 16 HR?”, what do you think they will answer?

I know EXACTLY why the data shown there is shown in that manner.  I don’t need the explanation.  But at least 90% of BPro readers would, and 99% of mainstream readers who stumble onto BPro would need the explanation and would fail my test.

Sometimes, the presenter has to make a choice that may not be the most appealing, but it is the easiest.  SIERA is like that, and the percentile forecasts are like that too.


#5    Lee    2011/07/25 (Mon) @ 14:29

Yea, you don’t need a PhD in Math to know those BP percentiles are completely broken - or at least the ones on Felix’s page. Who could publish those and think “yep, 15 HR… looks good!” Pretty funny actually.


#6    Tangotiger    2011/07/25 (Mon) @ 14:44

Lee: it’s not necessarily that they are “completely broken”, but that the presentation of it is indecipherable for the majority of readers.

You come in cold, to that page, and you have no choice but to laugh at what you see.  As Danny Vuko would say: “Ha.  Ha.  Ha.”

But, what if instead what you see is an implied “average” for the ERA forecast at that line?  So, what it’s really saying is this:

“The 90th percentile forecast applies ONLY to ERA.  All the other columns are the AVERAGE you would get, that would allow you to produce that ERA.”

Basically, except for ERA, everything in that chart is “reverse-engineered”.

So, you go from a chart that is completely ridiculous to a chart that is completely rational… and all you did was change your glasses.

***

If for example you wanted to know the 90th percentile forecast for Felix’s HR, you will NOT find it anywhere in that section.
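
One way to see the change-your-glasses reading is a small simulation. Everything here is hypothetical: the distributions, the 0.05 ERA-per-HR link, and the function names are made up for illustration, and this is not a claim about how the actual forecasts are produced. It compares (a) the average HR among simulated seasons sitting at a given ERA percentile with (b) the percentiles of HR itself.

```python
import random

def simulate_seasons(n=20000, seed=0):
    """Made-up seasons: HR allowed, and an ERA that depends only loosely on HR."""
    rng = random.Random(seed)
    seasons = []
    for _ in range(n):
        hr = 15 + rng.gauss(0, 4)
        era = 3.00 + 0.05 * (hr - 15) + rng.gauss(0, 0.5)
        seasons.append((era, hr))
    return seasons

def percentile(sorted_values, p):
    idx = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[idx]

def avg_hr_at_era_percentile(seasons, p, window=0.01):
    """Average HR among seasons whose ERA rank falls within +/- window of p."""
    by_era = sorted(seasons)              # tuples sort by ERA first
    n = len(by_era)
    lo = max(0, int((p - window) * n))
    hi = min(n, int((p + window) * n))
    hrs = [hr for _, hr in by_era[lo:hi]]
    return sum(hrs) / len(hrs)

seasons = simulate_seasons()
hr_sorted = sorted(hr for _, hr in seasons)

# (a) What the chart shows on this reading: average HR at the good-ERA and bad-ERA lines.
chart_good = avg_hr_at_era_percentile(seasons, 0.10)
chart_bad = avg_hr_at_era_percentile(seasons, 0.90)

# (b) What a reader coming in cold expects: percentiles of HR itself.
own_low = percentile(hr_sorted, 0.10)
own_high = percentile(hr_sorted, 0.90)

print(round(chart_good, 1), round(chart_bad, 1), round(own_low, 1), round(own_high, 1))
```

Under these assumptions, the HR column read off the ERA lines moves much less than HR's own 10th-to-90th percentile range, which is the pattern described above: a nearly flat column that says nothing about the HR percentiles.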


#7    Pierre    2011/07/25 (Mon) @ 19:05

Re pitcher metrics- it seems like if you’re going to address the interdependence and non-linearity of the inputs and how they differ by pitcher type, you’re going to end up with an equation that’s pretty complicated regardless of whether it’s derived via regression or by using a run creation model.

Personally, I like FIP for the reasons mentioned- it’s simple and its limitations are pretty clear.  But there is an argument to be made for something like SIERA.  Namely, folks are inevitably going to over-interpret and over-apply the metric, so you may as well knock yourself out trying to make it absolutely as accurate as possible.


#8    Tangotiger    2011/07/25 (Mon) @ 19:52

I agree that SIERA has a place at the buffet table.  Everybody loves the FIP chocolate, but some also like the SIERA escargots.


#9    John Beamer    2011/07/25 (Mon) @ 23:31

Can someone point me to the best place to read about the SIERA imbroglio and the criticism that the metric is unintelligible?


#10    2011/07/25 (Mon) @ 23:48

John/9,

Matt’s five-part series launching his modified version of SIERA at FanGraphs is here:
http://www.fangraphs.com/blogs/index.php/new-siera-part-one-of-five-pitchers-with-high-strikeouts-have-low-babips/
http://www.fangraphs.com/blogs/index.php/new-siera-part-two-of-five-unlocking-underrated-pitching-skills/
http://www.fangraphs.com/blogs/index.php/new-siera-part-three-of-five-differences-between-xfips-and-sieras/
http://www.fangraphs.com/blogs/index.php/new-siera-part-four-of-five-testing/
http://www.fangraphs.com/blogs/index.php/new-siera-part-five-of-five-what-didnt-work/

Colin’s article announcing that the original version of SIERA would no longer be carried at BPro is here:
http://www.baseballprospectus.com/article.php?articleid=14603

Patriot’s post on his blog on run estimators is linked above.

Graham MacAree posted an article at Lookout Landing with his thoughts about sabermetrics.  It has a broader aim than just addressing SIERA, but it’s part of the discussion:
http://www.lookoutlanding.com/2011/7/20/2285655/the-problem-with-sabermetrics

Tango’s thread on new SIERA is here:
http://www.insidethebook.com/ee/index.php/site/comments/siera_updated/

David Appelman’s defense of publishing SIERA is here:
http://www.fangraphs.com/blogs/index.php/your-sabermetric-choices/

Matt has a mailbag thread for SIERA questions here:
http://www.fangraphs.com/blogs/index.php/siera-mailbag/


#11    Tangotiger    2011/07/26 (Tue) @ 10:33

Mike/10 was automatically blocked for moderation and is now open.


#12    Tangotiger    2011/07/26 (Tue) @ 10:36

Graham’s article was devoid of any real examples.  His overall summary opinion, which seemed far-reaching, lacked evidence to support it.

Normally, I’d call it bullsh!t, but the article itself didn’t read as bullsh!t.  A weird article to read basically.


#13    2011/07/26 (Tue) @ 10:50

If you haven’t followed Graham on Twitter or read his previous work on Lookout Landing, it would probably be a hard article to interpret.  As evidenced by how people on BBTF and B-Ref totally missed what he was saying.

He does explain a few more things in the comments.

I debated whether I should even list Graham’s article as part of the broader discussion, as I don’t think it was necessarily intended that way.  But it seems to have become that, so I listed it.  Plus, I agreed with a lot of what he had to say.


#14    Tangotiger    2011/07/26 (Tue) @ 11:14

Someone else had written an article a week or three ago about how sabermetrics has never been in better shape.

The level of sabermetric signal is extremely high.  If there is a problem, it’s the lazy reader who doesn’t know exactly where to go and what to listen to, or even how to listen.

Graham’s article therefore was one-sided.  He may have made some fine general points, but his overall conclusion had no evidence to back it up.  (It reminded me a lot of Gary’s “Baseball Analysis is dead” piece.  That one was myopic.)

Like I said, it was a non-bullsh!t bullsh!t article.  The equivalent of the non-intentional intentional walk!


#15    Tangotiger    2011/07/26 (Tue) @ 11:19

I shouldn’t have said “lazy”.  The correct term is “unmotivated”.


#16    Pierre    2011/07/26 (Tue) @ 11:43

Tango, did you ever derive FIP coefficients using regression?  If so, was the equation very different, or basically the same as the one you came up with?  Thanks.  I ask because I’m curious whether regression is really an inferior method, or just different and perhaps less intuitive.


#17    Tangotiger    2011/07/26 (Tue) @ 12:00

I originally used regression.  But I derived it later on as well. 

Regression is not necessarily inferior.  It’s a first-step, if you don’t know how things work.

Basically, there’s a time and a place for regression.


#18    Pierre    2011/07/26 (Tue) @ 12:26

So I look at what Matt did, and I can see exactly the thought process.  It’s “you know, there’s no reason to think that the impact of walks (or whatever) on RA is linear or independent of the other variables, and every reason to think that it isn’t.  Let’s regress the bejeezus out of all this data and figure out which terms are significant and why”. 

Now, once you do all this, you may find that the improvement isn’t worth all the added complexity.  Or, you may think that any improvement justifies the effort.  Whatever your preference.  But it doesn’t strike me as a methodological issue.  The methodology seems just fine.


#19    Tangotiger    2011/07/26 (Tue) @ 12:32

Pierre: I agree with your first paragraph.

The next step would have been to then build the findings into a run modeler.  If you insist on complexity, at least put it into a working complex system. 

This is the point I am trying to make.

We could make Runs Created a regression equation similar to what Matt is doing, with quadratic terms, multiplying some of the components (HR x BB), etc.  You’ll have a very long and complex equation.  And, maybe you can even justify each and every component.

But, it doesn’t fit within something that we know: how runs are scored in baseball.  We know exactly how that works.  You get runners on, and they move over, until the inning is over. 

If you insist on complexity, at least work within whatever thing you are actually doing.

Hence, the reason that BaseRuns is far far far better than any regression-based runs equation you can come up with.
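
A small sketch of why a structural model can absorb an interaction such as HR x BB without adding a term for it. It uses one commonly cited simple BaseRuns form, which is an assumption (the thread does not give the formula), and two made-up team lines. The function names are hypothetical.

```python
def base_runs(ab, h, tb, hr, bb):
    """One commonly cited simple BaseRuns form (an assumption, not from the thread)."""
    a = h + bb - hr                                       # baserunners, excluding HR
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs
    return a * b / (b + c) + hr

def marginal_walk_value(ab, h, tb, hr, bb):
    """Runs added by one extra walk, holding everything else fixed."""
    return base_runs(ab, h, tb, hr, bb + 1) - base_runs(ab, h, tb, hr, bb)

# Two hypothetical teams: same hits, but 100 singles turned into home runs.
low_hr = dict(ab=5500, h=1400, tb=2200, hr=150, bb=500)
high_hr = dict(ab=5500, h=1400, tb=2500, hr=250, bb=500)

walk_low = marginal_walk_value(**low_hr)
walk_high = marginal_walk_value(**high_hr)
print(round(walk_low, 3), round(walk_high, 3))
```

In this sketch the walk is worth more on the high-HR team, with no HR x BB coefficient anywhere: the interaction comes from runners getting on and being moved over, not from a fitted term.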


#20    Pierre    2011/07/26 (Tue) @ 13:18

Could you get at some of these issues using the model that underpins FIP or wOBA?  Could you test the hypothesis “walks are less bad for pitchers with good control because they are issued more strategically” or “HRs are a bigger issue for pitchers with really good K/BB ratios because there aren’t going to be many long rallies”?  Off the top of my head, it seems like it would be really tricky.


#21    Tangotiger    2011/07/26 (Tue) @ 13:28

Right, I wouldn’t use FIP or wOBA for that.  Let’s say that this is a true statement:

walks are less bad for pitchers with good control because they are issued more strategically

The next step is: how to model that.  Well, why not model it within a model that approximates run scoring?  Now, you are correct that we just couldn’t use BaseRuns, because BaseRuns presumes random distribution of events across base/out states.

Therefore, we need a BaseRuns-type model that includes sequencing.  Colin’s FRA purports to do that.  I’m not sure that it actually does.  But in any case, that would be a good next step: to have a BaseRuns model that includes the base/out state.

Perhaps I am being too harsh on Matt’s model, now that I think more about it.  After all, having a base-out version of BaseRuns would be fantastically complex.  So, Matt’s equation, while appearing fairly complex already, may in fact be “simple” compared to the complex system that I am envisioning.

So, that is a very good example you chose, because it needs to be addressed within a model that does not keep the frequency of events static across base/out states.

Good job!


#22    Pierre    2011/07/26 (Tue) @ 13:42

Thanks.  I really like your work and appreciate your willingness to answer questions about it.


#23    John Beamer    2011/07/26 (Tue) @ 15:21

Mike/10—perfect and thank you

