THE BOOK--Playing The Percentages In Baseball

Thursday, February 11, 2010

What does SIERA think of walks?

By Tangotiger, 05:55 PM

I started with this baseline data:

PA BB SO BatBall GrnB LinB FlyB
1000 90 180 730 314 146 270

That is, there are 1000 batters faced, of which 90 end in walks, 180 in strikeouts, and 730 in batted balls.  The batted balls break down as 314 GB, 146 LD, and 270 FB.  SIERA gives this line an ERA of 4.31.  Estimating 234.7 IP and .92 ER per R, that works out to 122.31 runs allowed.
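As a rough check on that conversion, here is a minimal Python sketch. The function name runs_allowed is my own label for illustration, not something from SIERA. Plugging in the rounded 4.31 gives about 122.17 runs; the 122.31 quoted above is consistent with an unrounded ERA near 4.315, which is presumably what the actual calculation carried.

```python
def runs_allowed(era, innings, er_per_run=0.92):
    """Convert an ERA into total runs allowed.

    era        : earned runs per 9 innings
    innings    : innings pitched
    er_per_run : share of runs that are earned (.92 in the post)
    """
    earned_runs = era * innings / 9.0
    return earned_runs / er_per_run


# Rounded ERA from the post
print(round(runs_allowed(4.31, 234.7), 2))

# An unrounded ERA near 4.315 (an assumption) reproduces the quoted total
print(round(runs_allowed(4.315, 234.7), 2))
```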

I created a matching line for FIP, using the same PA, BB, SO, and IP data, and putting in 27 HR (10% of FB).  Using a constant of +3.1995 (in place of the usual "3.2"), I get an identical ERA and runs allowed of 4.31 and 122.31.
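The FIP calibration can be sketched as below, assuming the familiar simplified form (13*HR + 3*BB - 2*SO)/IP plus a constant, with HBP and IBB left out because the baseline line does not include them. The function name and argument names are mine.

```python
def fip(hr, bb, so, ip, constant=3.1995):
    """Simplified FIP: no HBP or IBB terms (an assumption; the
    baseline line in the post has neither)."""
    return (13 * hr + 3 * bb - 2 * so) / ip + constant


# Baseline line: 27 HR, 90 BB, 180 SO over an estimated 234.7 IP
baseline = fip(hr=27, bb=90, so=180, ip=234.7)
print(round(baseline, 2))  # 4.31
```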

I created a crude BaseRuns equation to give me the same result.

Finally, I put the above in the Markov calculator (http://tangotiger.net/markov.html) as:
910 AB, 235 H, 54 2B, 5 3B, 27 HR, 90 BB, 180 SO

This gets me 4.688 runs per game.  Multiplying by .92, I get an ERA of 4.31.

All equations are now calibrated to the same baseline.

I then added 1 walk, 1 PA to see what would happen:


SIERA: 122.68 runs, or +.373 runs
FIP: +.375 runs
BsR: +.340 runs
Markov: +0.3%, or +.365 runs

Some basic agreement here.  What happens if I add 10 walks and 10 PA instead?  Here are the marginal changes:
SIERA: +.365 runs
FIP: +.375 runs
BsR: +.341 runs
Markov: +.388 runs

Here’s the pattern for SIERA, FIP, BsR, as I add 10 walks each time:
 BB   SIERA   FIP     BsR
100   0.370   0.375   0.341
110   0.365   0.375   0.342
120   0.359   0.375   0.344
130   0.354   0.375   0.345
140   0.348   0.375   0.346
150   0.343   0.375   0.348
160   0.338   0.375   0.349
170   0.333   0.375   0.350
180   0.329   0.375   0.351
190   0.324   0.375   0.353
200   0.319   0.375   0.354

As you can see, FIP holds the value of the walk constant, which is wrong.  BsR goes up, and SIERA goes down.  Markov comes in at +.483.  Markov therefore suggests that the marginal run value of the walk should go up, as it should.  SIERA suggests the opposite.
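To see why a BaseRuns-type estimator pushes the walk's value up as walks pile on, here is a small sketch. It does NOT use my crude equation. It swaps in one widely published basic BaseRuns variant (A = H + BB - HR, B = (1.4*TB - .6*H - 3*HR + .1*BB)*1.02, C = AB - H, D = HR), applied to the same batting line fed to the Markov calculator. Because it is not calibrated to the 122.31 baseline, treat the exact values as illustrative only; the direction is the point.

```python
def base_runs(ab, h, doubles, triples, hr, bb):
    """One commonly cited basic BaseRuns variant (an assumption here,
    not the calibrated equation used for the table)."""
    tb = h + doubles + 2 * triples + 3 * hr          # total bases
    a = h + bb - hr                                  # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h                                       # outs
    d = hr                                           # automatic runs
    return a * b / (b + c) + d


def marginal_walk(bb):
    """Runs added by one extra walk (and one extra PA) at a given walk total."""
    line = dict(ab=910, h=235, doubles=54, triples=5, hr=27)
    return base_runs(bb=bb + 1, **line) - base_runs(bb=bb, **line)


for bb in (90, 150, 200):
    print(bb, round(marginal_walk(bb), 3))
```

More walks mean more runners on base for every later event to drive in, so each additional walk is worth a bit more, not less.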

In isolation, this means that SIERA has an issue.  However, if there is a relationship between extra walks and other events (say, a lower HR rate or a lower hit rate per batted ball), then SIERA would capture this.  This can only be shown through empirical testing, not through this kind of general testing.

If we go backwards, from 90 walks down to 0 walks, we get this:
BB   SIERA   FIP     BsR
80   0.376   0.375   0.340
70   0.382   0.375   0.338
60   0.389   0.375   0.337
50   0.395   0.375   0.336
40   0.402   0.375   0.334
30   0.409   0.375   0.333
20   0.416   0.375   0.332
10   0.423   0.375   0.330
 0   0.431   0.375   0.329

It follows the same pattern: the fewer walks you have, the more run impact each walk has.  Though this is untrue in isolation, it may be true if walks depend on other events, and presumably this is what SIERA is trying to capture.

Is this true?  Well, I guess we’ll have to test that.

#1    Brad at Cubs Stats      2010/02/11 (Thu) @ 19:18

Very interesting, Tango. Where could this problem be coming from? Is it merely the arrangement of the elements of SIERA's equation, or is it more of a theoretical fallacy behind SIERA?

Also, just to be sure I understand, you said you were reducing the PAs by the same amount as the BBs in the second section, right?


#2    Tangotiger      2010/02/11 (Thu) @ 19:38

Yes, definitely I matched the PA to the walks.

I don’t know if there is a problem.  What we need to do is see if there is a bias in the out-of-sample data with respect to walks.

It would be so much easier if we all had the same dataset to work against.  I mean, if I have to, I’ll create the dataset.  I can only do it at home, and when I’m at home, I’ve got Mariners stuff to do.

When I’m at the office, I’m limited to what I can do.  Hence, the call out to get me the data, so we can all talk to the same data.


#3    Brad at Cubs Stats      2010/02/12 (Fri) @ 13:23

I’m going to try to compile what I can today (it is sweet to be a grad student!). The early signs seem back-and-forth. I did a quick overview of the Cubs here:

http://cubsstats.blogspot.com/2010/02/siera-watch-day-3.html

...and FreeZorilla looked at the AL East here:

http://www.draysbay.com/2010/2/10/1304547/al-east-starters-according-to-siera

Some guys I’d expect to profit from SIERA in fact do (Matt Garza, Ted Lilly), others don’t make much sense (Carlos Marmol, for one).


#4    Brad at Cubs Stats      2010/02/12 (Fri) @ 15:14

Okay, using BPro’s stats, I’ve compiled all pitchers (25+ IP) and their SIERA, FIP, and ERA. I could add xFIP and tERA, buuuuut I’m not sure how to calculate those. Anyway, links are here:

http://cubsstats.blogspot.com/2010/02/siera-watch-day-4-data.html


#5    The Real Neal      2010/02/16 (Tue) @ 01:06

I haven’t tried to dig into the math behind SIERA, but didn’t the original article mention that pitchers who can get double plays can afford to walk more guys with fewer issues?  Doesn’t it make sense that if a guy is allowed to keep pitching despite higher BB rates, he must be getting people out, and consequently the mathematical disparity you’re highlighting is just an illustration of the survivor effect?

