THE BOOK--Playing The Percentages In Baseball


Monday, April 07, 2008

IBB Redux

By , 12:06 AM

Pretty much everything is a redux these days.

I was watching the Tigers, White Sox game tonight and with the Tigers down a couple of runs in the 6th, I think, they issued an IBB to AJ what’s his name to load the bases with 1 out.  In The Book:


We suggest that it is generally not a good idea to load the bases with an IBB, even with 1 out.  Obviously the decision is a complex one.  To take a page out of the crazy BTF thread about the Justice post, I’ll say again that there is no human being on earth (unless there is some savant that can do it) that can figure out if that was the correct move or not, let alone a baseball manager.  If the correct answer were a matter of life and death, he (Leyland) would have to hire a number cruncher to figure it out, and even then, hope that the number cruncher got it right, given all the variables that go into the equation.  Of course, if the person offering the “life and death test” (let’s say GOD, because only he knows the 100% correct answer) was at least a little bit generous, he would offer Leyland and the number cruncher some “slop.” If the decision were “don’t walk him”, but by a hair, and the number cruncher says “walk him, but it is close,” then Leyland lives.

If the number cruncher says, “Walk him, and it ain’t even close,” then Leyland bites the dust.  If God tells us that the correct answer is “walk him and it’s not close,” and the number cruncher says, “Don’t walk him,” the Tigers’ skipper gets axed (literally).  Get it?

Obviously a guess (or whatever you want to call what a manager does when he makes these decisions) is going to be “right” a good portion of the time, but that is obviously not what I am talking about.  I am talking about whether the manager, in any close situation (I am assuming that this and similar situations are close) like this, can KNOW that he is reducing the other team’s WE by walking the batter (whether he knows what that means or not).  How would he do/know that?  NO close and complex decisions can be figured out by intuition or experience.  That just doesn’t make any sense.

Anyway, I thought I would look at the IBB again, with runners on 2nd and 3rd and 1 out.  Since after the IBB, lots of things affect the RE (run expectancy) and WE (win expectancy), we can take a shortcut and include all of those things automatically by looking at how many runs were actually scored after such an IBB. I did that for 2003-2007.  I looked at all IBB’s that loaded the bases with 1 out, with a non-pitcher coming up and in any inning before the 9th.

In 5 years, there were 1007 such IBB’s, or around 6.7 per team per year, which is a lot, I think.  1.652 runs were scored after the IBB, on the average.  One standard deviation is 2.14 runs, which amounts to a standard error of .067 runs for N=1007.  Even though the distribution is not normal, I don’t think, we can say with 95% confidence that the true number is somewhere around 1.52 to 1.79.  That’s the last time I’ll talk about sample error.  From now on, I’ll just assume that the 1.652 runs is the “real” number after an IBB in that situation.
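The sample-error arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the post; the 30-team league size and the normal-approximation multiplier of 1.96 are assumptions added here, not something the post states.

```python
import math

# Figures quoted in the post
n_ibb = 1007          # bases-loading IBBs with 1 out, 2003-2007
mean_runs = 1.652     # average runs scored to the end of the inning
sd_runs = 2.14        # standard deviation of runs per instance

# Assumptions for this sketch: 30 teams, 5 seasons, normal approximation
teams, seasons, z = 30, 5, 1.96

per_team_year = n_ibb / (teams * seasons)
std_error = sd_runs / math.sqrt(n_ibb)
low, high = mean_runs - z * std_error, mean_runs + z * std_error

print(f"IBBs per team per year: {per_team_year:.1f}")
print(f"standard error: {std_error:.3f}")
print(f"approx. 95% interval: {low:.2f} to {high:.2f}")
```

Using a multiplier of 2 instead of 1.96 moves the upper end by less than .01 of a run, so the interval is not sensitive to that choice.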

Now, all we have to do is to compare that to how many runs we think would score if we let all those batters hit away with runners on 2nd and 3rd.  What I will do is use those batters’ collective 3 year batting stats to determine their “real” batting line, and then I will tweak those numbers based on how ALL batters change their stats with runners on 2nd and 3rd with 1 out, when they are allowed to hit of course.  I may or may not adjust for the pitchers on the mound when the IBB’s occurred (depending on how lazy or not I am).  I have no reason to think that the pitchers on the mound are not a group of average pitchers.  I am not including 9th and later innings, so closers tend not to be on the mound.

In general, when the IBB was issued, it tended to be in the later innings, but there were plenty of middle-inning IBB’s, and not a lot of early ones.  Also, the score tended to be tied, or the issuing team either up by 1 or 2 runs, or simply behind in the game, as in tonight’s Tigers game.

Of course, you want to look at WE and not RE, but for now, I am going to compare the observed (empirical) RE, which was 1.652 runs, with what we would expect if those exact batters were allowed to swing away.  I will first infer their collective stat lines and then I will use a Markov model to figure out how many runs would score to the end of the inning, given 1 out and runners on 2nd and 3rd, and then compare that to the 1.652 runs that DID score.

Any guesses as to the result?  Any suggestions for changing the model?

As Dial would say, this has already been done before, I am sure, so my disclaimer here and in everything else I do from now on, is that I automatically give full citation to everyone else’s work before me, and I fully admit that I am probably just “tweaking” the work that has already been done.  If you have not read the BTF thread, I apologize for being obtuse with that comment.

BTW, I constantly hear criticism about these kinds of models and analyses, along the lines of, “You don’t know all the variables, so we can’t know if your numbers are correct.”

I’ve said this a million times before and I’ll say it again.  What we (I) do is to create a model that is pretty good, and then come up with an answer.  If the answer is clear cut, regardless of those other variables, then we don’t care about them.  If it is not clear cut, then we can say/do one of two things:  One, try and include more variables in the model and come up with an answer that has more certainty, or two, simply say, “It is too close to call; either go with my answer (knowing that the certainty is not great) or go ahead and include those other variables ‘by the seat of your pants’ and then make your decision.”

What happens is that most of the time we include the variables that have the most effect on the model, such that if we find a clear cut answer, we don’t care much about the other variables.  If that is not the case, in that either we come up with an answer that is NOT clear cut, or there are variables that are or may be important but that for some reason we cannot model accurately (which is rare), then so be it.

Remember that good sabermetrics is NOT the search for an answer, per se.  It is the search for the truth.  Sometimes the truth is, “We don’t know.” Often, it is, “We think it is this, but we are only 80% (or 90%, or 55%) certain.”

#1    MGL      2008/04/07 (Mon) @ 05:36

I will have to do this piecemeal.

Here are some relevant and interesting stats relating to the IBB situation above:

Here is the stat line for the average player who is IBB’d.  This stat line is the players’ overall stats for the year of the IBB, the year before, and the year after:

per 500 PA (pa=ab+bb+hp+sf-ibb)

s 80
d 26
t 2.3
hr 18.7
bb+hp 49.7
so 79.4
roe 4.6

That is a wOBA of .369, which is very good but not great.  A league average wOBA is around .344.

In fact, the stat line of the average non-pitcher batter is:

s 80.3
d 24.7
t 2.54
hr 14.4
bb+hp 44.9
so 82.2
roe 5.04
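As a rough check on the two wOBA figures, the lines above can be run through the linear weights published in The Book. This is a sketch under stated assumptions: the combined bb+hp column is weighted entirely at the walk value of .72 (The Book weights HBP slightly higher), and the dictionary keys are just labels chosen here for illustration.

```python
# wOBA linear weights as given in The Book; bb+hp lumped at the walk weight
# (an assumption, since the post reports walks and HBP as one column).
WEIGHTS = {"s": 0.90, "d": 1.24, "t": 1.56, "hr": 1.95, "bb_hp": 0.72, "roe": 0.92}

def woba(line, pa=500.0):
    """wOBA for a stat line expressed per `pa` plate appearances."""
    return sum(WEIGHTS[event] * line[event] for event in WEIGHTS) / pa

# Per-500-PA lines quoted in the post (strikeouts carry no weight in wOBA)
ibb_batters = {"s": 80, "d": 26, "t": 2.3, "hr": 18.7, "bb_hp": 49.7, "roe": 4.6}
all_batters = {"s": 80.3, "d": 24.7, "t": 2.54, "hr": 14.4, "bb_hp": 44.9, "roe": 5.04}

print(f"IBB'd batters:   {woba(ibb_batters):.3f}")
print(f"all non-pitchers: {woba(all_batters):.3f}")
```

Under these assumptions the two lines come out at about .369 and .344, matching the figures quoted above.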

But, the player who is IBB’d tends to have the platoon advantage on the pitcher.

Normally, against all batters all the time, the batter has the platoon advantage 54.2% of the time.  The batter who is IBB’d has the platoon advantage 76.8% of the time.  The next batter also tends NOT to have the platoon advantage.  His platoon advantage is only 29.2%.  So pitchers tend to issue the IBB when the batter is opposite handed and the next batter is same handed.

That means of course, that you will tend to see FEWER runs scored after an IBB than “expected” (assuming that the next batter is randomly “handed") AND it means that the run expectancy if the batter who is IBB’d would have swung away is going to be HIGHER than expected, as compared to that batter’s overall stat line.

We also tend to see a slightly higher GIDP rate from the next batter than for all batters overall, but that could just be because batters who are IBB’d tend to be middle of the order guys and the batters who follow them tend to be middle or lower order guys and these guys are usually higher than average DP guys anyway.

IOW, it does not appear that a manager will tend to IBB someone when the next guy is a high DP guy and not IBB someone when the next batter is not a high GIDP guy.

I did not look at the pitcher’s GIDP rate though.  It could be that managers tend to issue IBBs with high GIDP pitchers on the mound, further reducing the run expectancy after an intentional walk.

But, as I said initially, since we are using actual runs scored after an IBB as our baseline to compare to, we don’t have to worry about the handedness of the next batter or the GIDP rate of the pitcher.  I thought it was interesting though.

OK, I’ll take a look at the pitchers on the mound when the IBB is issued.

Here is their average stat line:

s 81.7
d 24.7
t 2.64
hr 13.0
bb+hp 42.1
so 84.5
roe 4.9
gidp 11.3

This is a wOBA of .337 as compared to an average wOBA for all pitchers of .341, and an average GIDP for all pitchers of 10.8.

So the pool of pitchers who issue an IBB is slightly better than all pitchers overall (probably because the IBB tends to occur in the later innings) and their GIDP is in fact a little higher than all pitchers overall.  The latter means that managers do use the GIDP tendency of the pitchers as a factor in their decisions, albeit only marginally (.5 GIDP per 500 PA or around .005 more GIDP per opp).

I’ll be back with an estimate of how the IBB’d batters would have done had they been allowed to swing away.  As I said, it is not that hard to estimate that with reasonable certainty.  All we have to do is adjust the batters’ overall line to how batters tend to do with runners on 2 and 3 and one out, adjust their overall line for the fact that they have the platoon advantage more often than overall, and for the fact that they are facing slightly better pitchers than overall.


#2    Matt Mitchell      2008/04/07 (Mon) @ 09:26

MGL,

I was asking the same question about the Pierzynski IBB myself last night. One of things I thought of was the “hot hand” theory, as Pierzynski has a .556 OBA (mostly hits, I don’t have the wOBA offhand) through the first week, and Leyland probably didn’t want to be burned by him. Any way to account for streakiness here?


#3    Peter Jensen      2008/04/07 (Mon) @ 10:36

One method of finding out how they might have done if allowed to hit is to look at the pool of 2nd and 3rd, 1 out situations where the batter was allowed to hit. That should be about 6 times the number where the batter was intentionally walked.  From that pool you can select the batters whose average batting line is the same as that of the batters who were IBBed and then look at the actual outcomes of those plate appearances.


#4    Jacob      2008/04/07 (Mon) @ 10:42

Even though the distribution is not normal, I don’t think, we can say that the 1.652 is around 1.52 to 1.79 with a 95% confidence interval.

This is really key here, isn’t it? By issuing the intentional walk in the _23, 1 out, down 2 scenario, isn’t the manager’s goal not to decrease the average number of runs scored, but to skew the distribution?

The idea is to increase the chance at giving up 0 runs despite increasing the chance at giving up 2 or 3+ runs? At some point in the game, moving to a game state with a higher average of runs scored, but a higher frequency of zero scored is the right play, isn’t it?


#5    MGL      2008/04/07 (Mon) @ 12:04

Any way to account for streakiness here?

Yes, there is a very good way.  Ignore it.  Read our chapter on hot and cold streaks in The Book.

Peter, sure that is a good way too.  Pretty much the same way as I describe, only easier.

Jacob, #4, yes.  Which is why you should use WE (win expectancy) rather than RE, which accounts for exactly what you are talking about.  I will do that eventually.  For starters, though, I am going to look at RE.  One thing at a time.


#6    Peter Jensen      2008/04/07 (Mon) @ 12:35

MGL - Although streakiness probably doesn’t affect the outcomes, some previous research that I did on this question indicated that teams definitely weighted recent performance more heavily in deciding whether to intentionally walk or not.  Which is why I think you should not use a player’s year after stats in determining the ability of the players intentionally walked or allowed to hit.  They may be a truer indicator of the player’s true ability but they obviously couldn’t enter into the decision of whether to walk or not.


#7    MGL      2008/04/07 (Mon) @ 13:32

Yes, the decision is of course partially predicated on how “hot or cold” the hitter has been lately or whenever, but I am using the stats (year before, during and after) to estimate their true talent so I can estimate what they might have done had they hit away.  Why do I care what enters into the decision?

Sure, the pool of hitters who are IBB’d probably will show a career year, or at least an above-average year for them, in the years they are IBB’d, and it might be interesting to present that data, but it is not relevant to my calculations. In fact, I probably should not even use the year of the IBB to estimate the true talent of the hitters, since that year is going to be a “lucky” one, which is one reason why the hitters are walked in the first place.  In fact, I definitely won’t.  I will only use the year before and year after.  Even the year before may be a “lucky” year, as hitters who get IBB’d tend to be hitters who were a little lucky in past years (thus they have a reputation as a good hitter).

Using the year of the IBB to estimate a hitter’s true talent would be a little like taking all FA signings and breaking them down into low and high salaries, and then using the year before the FA signing as an estimate of the two groups’ true talent in order to evaluate the signings.

Obviously the low salary group will have tended to have a poor year and the high salary group will have tended to have a good year, so our estimates of true talent will be biased.  Unless you want to do regression, which I don’t want to do with the IBB’d hitters.  By using the next year’s stats, I automatically have the regression included.  In fact, I can say without running the numbers that for all IBB’d hitters, their year of the IBB and prior stats will be higher than their next year’s stats and that the next year’s stats will be a much better estimate of their true talent since the prior stats are selectively sampled.

Even though you were talking about something else, thanks for accidentally straightening me out!


#8    Peter Jensen      2008/04/07 (Mon) @ 15:07

MGL - I am confused.  Why are you going through this exercise unless it’s to decide whether a manager is making a good decision when he intentionally walks a hitter in this particular situation?  Don’t you ultimately want to include as many of the factors that go into that decision as possible?  Isn’t that what you meant when you talked about including the variables that have the most effect on the model?  Isn’t the perception of the true “true value” of the player AT THE TIME THE DECISION IS BEING MADE the variable that is important, not the player’s true value calculated a year after the fact?  Are you in Canandaigua today?  If so I’ll come over and we can talk about this.


#9    Tangotiger      2008/04/07 (Mon) @ 15:23

The choice is:
1. Wanting to know if given a large set of IBB, did the manager make the correct call.  To do that, you need to know the true talent level of the player.  The best way to do that, is to use his performance that is not part of the selection criteria being studied.  And for that reason, simply using the year+1 stats is enough to establish that.  If you have a large enough group of players, and if that group of players is around the age of 28 or so.

2. Wanting to know if given a large set of IBB, what kind of talent did the manager perceive.  If managers are being influenced by streaks and short-term performance and the like, it is entirely possible that the manager’s engine in determining whether to issue the walk is identical to the sabermetric engine, but that you are getting a GIGO (garbage in, garbage out).  That managers understand when to issue a walk, but they can’t identify a lucky streak from a player’s true talent level.


#10    Matt Mitchell      2008/04/07 (Mon) @ 16:01

MGL,

Thanks. Stupid me should’ve thought of the sample size issue as well as that chapter you cite.


#11    Peter Jensen      2008/04/07 (Mon) @ 16:02

No, you are absolutely wrong.  The manager can only be evaluated on the totality of the information that was available at the time the decision was made.  Only some of that information is available to us as amateur analysts.  Neither we nor the manager knows how the player is actually going to perform in the future.  We only know how he is projected to perform in the future at the time of the decision.

Think of it this way.  If you were called on as an analyst to evaluate a particular decision on the day after the event, you could only use the best existing projection of the player’s true talent.  You couldn’t say “Let me wait a year or so and then decide what the player’s true talent actually was and then I’ll tell you if it was a good decision or not.” Isn’t that the definition of hindsight?


#12    Tangotiger      2008/04/07 (Mon) @ 16:10

As much as you think I’m wrong, you are wrong plus one!

Think of it another way: Ryan Braun is batting on July 1, 2007.  He just got called up a few weeks ago.  He’s a first round pick, but you still don’t know really how good he is.  But between July 2, 2007 and Oct 1, 2008, there is no question as to his talent level.

So, going back to what I was saying:

In alternative 1, you really want to know Braun’s actual true talent level, as of July 1, 2007.  As a saberist, if all I have is the data as of July 1, 2007, I won’t make a very good guess.  A manager has more information, he has his scouting report, etc.  So, for me to decide how good he is on July 1, 2007, I need to have all of his data, and then I can plot where he actually is.

In alternative 2, it’s a question of perception.  And in this case, you MAY want to have his future performance, since that would act as a proxy for the scouting report the manager has.  You probably don’t need it, and you can get by with the data until then, including further knowledge of recent games, and possibly matchups, etc.


#13    Peter Jensen      2008/04/07 (Mon) @ 16:35

Tango - The example you give in #12 is a clear case of what MGL is defining as a case of analysts not having enough information to judge the manager’s decision.  In this case the manager has access to information that you don’t have.  Of course, we are assuming that his scouting reports are confirmed by Braun’s future performance.  But what if they weren’t?  The manager would still have been acting on the best information available to him.  You wouldn’t have had enough information to say whether he was right or wrong on July 2, 2007.  But you think you should be able to say he was wrong on October 2, 2008?  That is just silly.  No wonder baseball people have trouble accepting anything we have to say as valuable.


#14    Tangotiger      2008/04/07 (Mon) @ 16:45

Remember, there are TWO alternatives here.  You seem to be talking only about alternative #2, as I described it.  And therefore, I am NOT implying at all that I’ll know on Oct 2, 2008 if we can blame him or not.  We can only blame him based on the information at hand, which means information to-date, plus the scouting reports.  And if I don’t have the scouting reports, maybe I get to have 100 or 200 future PA as a proxy.

My alternative #1 doesn’t seem to be what you are discussing at all, and therefore, my arguments as it relates to alternative #1 should be dismissed as irrelevant to alternative #2.  I hope that clears it up.


#15    Peter Jensen      2008/04/07 (Mon) @ 17:08

No, I am only talking about #1.  Your #2, as I understand it, is WHY a manager might be right or wrong.  #1 is IF his decision is right or wrong.  I am saying that you can only judge the IF using the information that the manager could have had at the time he made his decision.  I don’t require him to foresee the future.  For some reason you seem to think he should be able to.


#16    MGL      2008/04/07 (Mon) @ 17:19

I am 100% with Tango here and I am confused as to what you are saying, Peter.  Why should we care what criteria the manager is using to determine whether to issue the IBB, unless those criteria are relevant to the outcome?  We are assuming that the only thing relevant to the outcome of the PA, if the batter were to swing away, is his true talent (tweaked for handedness, the pitcher, etc.) at that exact point in time. All of the research that has ever been done, that I am aware of, suggests that the best estimate of a player’s true talent at any exact point in time is our normal projection.  In retrospective studies, the easy and 100% accurate way to determine someone’s true talent at any point in time is to look at their performance in any time in the future or past, trying to keep things like age and health as close to constant as possible.  And as long as we have a large enough group of players, and a large enough sample per player, we are going to nail that true talent.

We can easily test that model, though.  And anyone who disputes that it works is going to lose 100% of the time.  Pick any number of PA for any group of players, as long as it is large enough (the total number of PA’s) to reduce sample error to a bare minimum, and I will tell you EXACTLY what the stats for those PA will be by simply looking at those batters’ and pitchers’ performance in any other time period BUT those PA.

I think you know all of this, so I am not sure what you are saying.  Are you saying that we should give the manager a pass if he walks a guy because he has been hot lately or he is having a lucky year?  If so, that is fine.  My intention is not to determine whether a manager should or should not have issued a walk based on information he believes is relevant but is not.  My intention is to figure out how many runs (ultimately, the WE not the RE) would have scored if the batter were allowed to swing away and then compare that to how many runs scored after the IBB.

To figure out how many runs would have scored, we have to estimate the batters’ true talent at that point in time, and then adjust for the pitcher, and the handedness of the pitcher and batter, and any relevant information that would affect the batter swinging away (like the fact that there are runners on 2 and 3 and 1 out).

Things that the manager THINKS might affect the result of the batter swinging away (such as hot and cold streaks and batter/pitcher matchups), but that have already been proven not to, will not affect my analysis.  Why do you think they will?  I don’t understand what you are saying.

Are you saying that in order to estimate the batters’ true talent at that point in time, there are other variables besides what I have already talked about, that I need to include?  Are you suggesting that I need to include hot and cold streaks?  Batter/pitcher matchups?  Why would I include those in my model for estimating true talent when we have already proved with a high degree of certainty that those things have NO EFFECT on true talent (other than how they have changed the batters’ Marcel projection of course)?

If a player is red hot in the 20 or 50 PA before any given PA and we show that in that given PA he hits EXACTLY as he normally does (which we have shown in our research on hot and cold streaks), why would I want to NOT use a batter’s normal projection when he has been on a hot (or cold) streak and I am trying to project what he would do if he were not walked?  I already have my answer by looking at 10’s of thousands of PA when batters had been red hot for any number of PA beforehand.  And the answer was that they always hit at their normal levels.

Same thing with batter/pitcher matchups.

So rather than argue this back and forth and talking past each other, slow down and explain to me what you are saying, other than the manager has certain criteria that he uses to make the decision, some of which are relevant (handedness of the batter, overall quality of the batter, handedness of the next batter, etc.), and some things that have been proven NOT to be relevant, such as hot and cold streaks and batter/pitcher matchups.  The ones that are relevant, of course, I will include in the model and the ones that are not, I won’t.  When I say, “the model”, I mean the only thing that I need to calculate to do this exercise, which are the expected results when all the batters swing away.  Why should I care what the manager thinks in doing that, unless what the manager thinks is relevant?  And I don’t need the manager to tell me what is relevant, do I?


#17    Peter Jensen      2008/04/07 (Mon) @ 18:06

No, I would never suggest that you include hot/cold streaks or batter/pitcher matchups in assessing a batter’s true talent.  As it affects your model, the only difference between what you are saying and what I am saying is that I think you should use the projection of the batter’s true talent as it was estimated at the time of the decision instead of the estimation of his true talent using data from after the decision was made.  I am not saying that you should care how the manager came to his decision at all or try to replicate his process of decision making.

As a practical matter this means projecting the runs that would have been scored had the batter been allowed to bat by matching his true talent as it was projected (by YOUR best projection methods, not the manager’s) AT THE TIME OF THE DECISION to a group of batters with a similar profile.

I contributed to the confusion in post #6 when I used “Which” to begin the second sentence.  It made it seem that I was trying to guess how the manager made his decision and suggesting that you change your analysis to conform to his.  I apologize.  You should use your best method of analysis, but it should be limited to the data that could have been known to the manager at the time of the decision.

I fully understand that whether a manager used streakiness or batter/pitcher matchups to make his decision is a question of WHY a decision might have been good or bad and therefore falls into Tango’s category number 2.

One of the frustrating things about blog communication is the waste of energy on differences that would never occur during face to face discussions.

I was serious about meeting to discuss some of these issues.  I live in Ovid in Seneca county about 30 miles from Canandaigua.


#18    MGL      2008/04/07 (Mon) @ 20:22

We are absolutely on the same page.  I misunderstood your post as much as you may have miscommunicated (you may not have - I was doing several things at once).  Except…

The best estimate of a player’s true talent at any exact point in time actually comes from looking at before and after that time, when you are doing retrospective studies.  It seems a little counter-intuitive, but it is true.  It is especially true when you are dealing with large groups of players.  The reason for looking at the “after” is two-fold:  one, it simply gives you a larger sample size, and two, more importantly, it becomes an unbiased sample, and thus you don’t have to do any regression.  You can also look only at past performance (say the previous year), but then you have selective sampling, and you have to regress, and it is a mess.  The reason it is selective sampling and you have to regress is that managers are choosing to IBB guys based partially on how they have done before that PA.

Even if we are just selecting good players (or bad), using whatever criteria we want, in order to get an unbiased estimate of their true talent, the BEST way is to take future performance.  Future performance is always the way to go when doing these retrospective studies.  If the selective sampling does not include past performance, then you can use that past performance also, but if it does (as is the case with the IBB’s to some extent), then you are probably better off using future only.
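The selective-sampling point can be illustrated with a toy simulation. Everything in it is made up for illustration: the talent spread, the amount of single-season noise, and the rule that the top 10% by same-year performance are the ones "selected" (standing in for being IBB'd). It is a sketch of the statistical idea only, not a model of how managers actually choose.

```python
import random

rng = random.Random(0)                      # fixed seed so the run is repeatable
N = 20000                                   # hypothetical number of players
TALENT_MEAN, TALENT_SD = 0.340, 0.020       # assumed spread of true talent (wOBA-like)
NOISE_SD = 0.030                            # assumed single-season sampling noise

players = []
for _ in range(N):
    talent = rng.gauss(TALENT_MEAN, TALENT_SD)
    year_of = talent + rng.gauss(0, NOISE_SD)      # season used for the selection
    year_after = talent + rng.gauss(0, NOISE_SD)   # season not used for the selection
    players.append((talent, year_of, year_after))

# "Select" the players who looked best in the selection year
players.sort(key=lambda p: p[1], reverse=True)
selected = players[: N // 10]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

true_talent = mean(p[0] for p in selected)
est_year_of = mean(p[1] for p in selected)
est_year_after = mean(p[2] for p in selected)

print(f"true talent of selected group:  {true_talent:.3f}")
print(f"estimate from selection year:   {est_year_of:.3f}")
print(f"estimate from following year:   {est_year_after:.3f}")
```

In this toy setup the selection-year average overstates the group's true talent, while the following-year average lands close to it without any explicit regression step. Aging, which a real study would need to handle, is ignored here.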

So I understand now what you are saying, but it is not correct.

I’d love to get together with you.  I am not in Canandaigua yet.  I will be there probably in mid to late May or so.  We can arrange to hook up after that.  Play golf?


#19    tangotiger      2008/04/07 (Mon) @ 21:10

Ditto everything mgl said.

And of course, if you look too far in the future, remember to age-adjust.


#20    MGL      2008/04/07 (Mon) @ 22:18

OK, here is the batting line and wOBA for all hitters, other than pitchers, who hit away when facing 2nd and 3rd with 1 out.  Following that is their batting line in all situations over the 3 years surrounding (before, during, and after) their PA.

Batting line of all non-pitchers when they are faced with runners on 2 and 3 and 1 out:

N=6,753

s=81.7
d=24.8
t=3.4
hr=9.8
bb+hp=57.2
so=77.7
roe=6.2
wOBA=.313 (wOBA does not really mean anything here, since the weights are based on overall base/out situations.)

Overall batting line (for 3 years, before, during, and after the PA in question) for these same batters weighted by the number of times they faced the above situation:

s=81.5
d=25.2
t=2.6
hr=15.1
bb+hp=45.6
so=79.4
roe=5.0
wOBA=.292

Basically, in this situation, you have 2 competing factors.  The pitcher is trying to strike the batter out, generally does not mind walking him, and is otherwise trying to get him to hit a ball on the ground I would think.

The batter is trying to do exactly the opposite. Mainly the batter is trying not to strike out - just put the ball in play.  There is an occasional squeeze in there I guess.

Here are the coefficients I am going to use to convert any batter’s overall line to his line when faced with that situation.  They are simply the first batting line (runners on 2 and 3 with 1 out) divided by the second (overall).

s=1.002
d=.98
t=1.31
hr=.65
bb+hp=1.25
so=.98
roe=1.22
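The coefficients above can be reproduced from the two batting lines in this comment: they line up with the situational line (runners on 2 and 3, 1 out) divided by the overall line, event by event. A minimal sketch, with dictionary keys chosen here purely as labels:

```python
# Per-500-PA lines quoted in this comment
situational = {"s": 81.7, "d": 24.8, "t": 3.4, "hr": 9.8,
               "bb_hp": 57.2, "so": 77.7, "roe": 6.2}   # runners on 2 and 3, 1 out
overall = {"s": 81.5, "d": 25.2, "t": 2.6, "hr": 15.1,
           "bb_hp": 45.6, "so": 79.4, "roe": 5.0}       # same batters, all situations

# Coefficient = rate in the situation relative to the rate overall
coefficients = {event: situational[event] / overall[event] for event in overall}

for event, value in coefficients.items():
    print(f"{event:6s} {value:.3f}")
```

Because the published lines are rounded to one decimal, a coefficient recomputed this way can differ slightly from the quoted one; roe, for example, comes out near 1.24 rather than 1.22.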

You can see that the batter basically (wins out over) controls the approach.  The singles are elevated probably because some of the time the infield is playing in.  Extra base hits are way down (I think the triples increase is a fluke), especially the HR, because the batters are cutting down on their swings in order to score a run and avoid striking out.  SO are down for the same reason, and walks are only up because pitchers are pitching around a lot of these hitters (the batters are not trying to walk of course).  So the only thing that the pitcher controls more than the batter are the walks.  ROE’s are up, I assume, because there is more pressure on the defense, and because they are sometimes playing in.

So I am going to assume that when a batter is IBB’d, if he would have swung, he would hit at his overall rate times the above coefficients.  From that, we can generate a hypothetical batting line for all of the IBB’d batters.  And from that, we can use a Markov model to figure out how many runs would score to the end of the inning if these batters swing away.

The tough part is going to be estimating what happens in the Markov model after the batter who was IBB’d.  We can’t assume league average batters, I don’t think.  Since batters who are IBB’d tend to be middle of the order guys, the remaining batters in the Markov model will tend to be middle and bottom of the order guys. Maybe I will look at all batters’ lines following the IBB and use that in the model. I’m not sure what I’ll do.  Any suggestions are appreciated.

Anyway, here is the overall (3-year) batting line of the batters who were IBB’d:

s 80
d 26
t 2.3
hr 18.7
bb+hp 49.7
so 79.4
roe 4.6

Applying the coefficients to these numbers tells us what they would have done had they hit away.

But first, we have to adjust for the pitchers they faced when they were IBB’d.  Remember, they were facing better than average pitchers.

Here is the line of the pitchers on the mound when the IBB is issued:

s 81.7
d 24.7
t 2.64
hr 13.0
bb+hp 42.1
so 84.5
roe 4.9
gidp 11.3

I’ll divide these numbers by those of the league-average pitcher to get pitcher adjustment coefficients.

s 1.03
d 1.02
t 1.06
hr .93
bb+hp .95
so .99
roe .97

We will multiply the batter line above by these coefficients.

s 82.4
d 26.5
t 2.4
hr 17.4
bb+hp 47.2
so 78.6
roe 4.5

If you lost me, just keep in mind that I adjusted the IBB’d batters’ overall line for the pitchers on the mound when they were IBB’d.

Now we’ll apply the base/out adjustment by using the “runners on 2 and 3 and 1 out” coefficients.

s=1.002
d=.98
t=1.31
hr=.65
bb+hp=1.25
so=.98
roe=1.22

s 82.6
d 26.0
t 3.1
hr 11.3
bb+hp 59.0
so 77.0
roe 5.5

So these numbers are what we expect the IBB’d batters to hit if they were allowed to swing away with runners on 2 and 3 with 1 out AND against this exact pool of pitchers.
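The two scaling steps above are plain component-by-component multiplications. Here is a minimal sketch in Python. The `scale` helper and the dictionary names are made up for illustration, and rounding to one decimal after each step is an assumption chosen to match the figures as posted.

```python
# IBB'd batters' overall (3-year) line, as posted above.
ibb_batters = {"s": 80, "d": 26, "t": 2.3, "hr": 18.7,
               "bb+hp": 49.7, "so": 79.4, "roe": 4.6}

# Pitcher adjustment coefficients (IBB pitchers relative to league average).
pitcher_coef = {"s": 1.03, "d": 1.02, "t": 1.06, "hr": 0.93,
                "bb+hp": 0.95, "so": 0.99, "roe": 0.97}

# Base/out coefficients for runners on 2 and 3 with 1 out.
situation_coef = {"s": 1.002, "d": 0.98, "t": 1.31, "hr": 0.65,
                  "bb+hp": 1.25, "so": 0.98, "roe": 1.22}


def scale(line, coef):
    """Multiply each component by its coefficient, rounded to one decimal (assumed)."""
    return {k: round(line[k] * coef[k], 1) for k in line}


vs_ibb_pitchers = scale(ibb_batters, pitcher_coef)      # first adjustment
hitting_away = scale(vs_ibb_pitchers, situation_coef)   # second adjustment

# Print the component, the pitcher-adjusted value, and the final value.
for k in hitting_away:
    print(f"{k:6s} {vs_ibb_pitchers[k]:6.1f} {hitting_away[k]:6.1f}")
```

If the intermediate line is not rounded before the second multiplication, the two smallest components (triples and ROE) each move by a tenth. Everything else comes out the same.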

Now all we have left is to use a Markov model to take the above batting line followed by a certain pool of hitters (again, facing these particular pitchers), starting with 1 out and runners on 2 and 3, and see how many runs would score through the rest of the inning.  We would then compare this number to the original 1.652 runs that were actually scored after the IBB.
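To make the Markov step concrete, here is a rough sketch in Python of how such a model can be set up. It is not the model used for the numbers in this thread. The event list, the runner-advancement rules (no sac flies, no double plays, no ROE, runners never take an extra base), and the names `advance`, `expected_runs` and `certain` are all simplifying assumptions made for illustration. The first batter gets his own set of per-PA probabilities and everyone after him shares a second set.

```python
EVENTS = ("single", "double", "triple", "hr", "walk", "out")


def advance(bases, event):
    """Return (new_bases, runs) for a non-out event.

    bases is (first, second, third) as booleans. Advancement rules are assumed.
    """
    first, second, third = bases
    if event == "walk":
        # Only forced runners move up.
        runs = 1 if (first and second and third) else 0
        return (True, second or first, third or (first and second)), runs
    if event == "single":
        # Runners on second and third score; runner on first stops at second.
        return (True, first, False), int(second) + int(third)
    if event == "double":
        # Runners on second and third score; runner on first stops at third.
        return (False, True, first), int(second) + int(third)
    if event == "triple":
        return (False, False, True), int(first) + int(second) + int(third)
    if event == "hr":
        return (False, False, False), 1 + int(first) + int(second) + int(third)
    raise ValueError(event)


def expected_runs(outs, bases, first_batter, later_batters, max_pa=200, tol=1e-12):
    """Expected runs to the end of the inning from the given base/out state.

    first_batter and later_batters map event names to per-PA probabilities.
    """
    dist = {(outs, bases): 1.0}   # probability mass on each live state
    total = 0.0
    probs = first_batter
    for _ in range(max_pa):
        if sum(dist.values()) < tol:
            break
        nxt = {}
        for (o, b), mass in dist.items():
            for event in EVENTS:
                p = mass * probs.get(event, 0.0)
                if p == 0.0:
                    continue
                if event == "out":
                    if o + 1 == 3:
                        continue          # third out: this mass leaves the inning
                    key = (o + 1, b)
                else:
                    nb, runs = advance(b, event)
                    total += p * runs
                    key = (o, nb)
                nxt[key] = nxt.get(key, 0.0) + p
        dist = nxt
        probs = later_batters             # everyone after the first batter
    return total


def certain(event):
    """Toy batter who always produces the same event (for sanity checks only)."""
    return {e: 1.0 if e == event else 0.0 for e in EVENTS}


second_and_third = (False, True, True)
# Sanity check: first batter homers, everyone else makes an out.
print(expected_runs(1, second_and_third, certain("hr"), certain("out")))  # 3.0
```

To use this with the lines above, each batting line would first have to be turned into per-PA probabilities. That needs the PA totals behind the lines, which are not shown here.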

I expect the numbers to be close.  Keep in mind the results do not tell us which IBB’s were good and which were bad, only whether all IBB’s overall increased or decreased RE.  Plus we are not even looking at WE which is the only way to do it correctly. But, baby steps.

Once we get an overall number, we can go back to each and every IBB and use the same model by applying it to just the one batter who was IBB’d and the rest of the lineup, and say, “Yea or nay” for that IBB.  We will basically have a scorecard for each and every IBB.  Of course, easier said than done.


#21    MGL      (see all posts) 2008/04/07 (Mon) @ 22:20

The reason I like doing the year before, during, and after, or just before and after only, is that you usually don’t have to age adjust.


#22    MGL      (see all posts) 2008/04/07 (Mon) @ 22:51

I think I might have to redo all the numbers, using the AL only.  Whether the pitcher eventually bats is just too much of a problem.  For example, since IBB’s tend to occur in close and late situations, pitchers will tend to be pinch hit for if they come up to bat in the IBB inning.  Unless I know exactly how often that happens (which I can find out of course) and incorporate that into the “hitting away” Markov model, I am screwed.  I think I am better off just using the AL.


#23    Peter Jensen      (see all posts) 2008/04/08 (Tue) @ 06:03

I’ll divide these numbers by those of the league-average pitcher to get pitcher adjustment coefficients.

Shouldn’t you be dividing these by the numbers of the pitchers that the non IBBed batters faced?


#24    MGL      (see all posts) 2008/04/08 (Tue) @ 12:38

Peter, I am trying to scale the IBB’d batters’ overall line (facing all pitchers) to the pitchers that they faced when they were IBB’d.

When I record how many runs were scored after an IBB, that is my baseline.  So I want everything to be scaled to those pitchers.

So let’s say that after an IBB, only 1 run is scored on the average instead of the 1.652 I actually got. And let’s say the reason it is so low is that the pitchers on the mound when the IBB’s are issued are extremely good.

Now, I still want to see how those IBB’d batters would do if they hit away.  But they are going to hit away against those extremely good pitchers. 

So I have to take their estimated hitting away stats and scale them way down, because those estimated hitting away stats are first based on average pitching.

How do we scale them down?  We divide them by the coefficients obtained by dividing the average pitcher by the pitchers on the mound when the IBB’s were issued (the really good ones).

Example (using wOBA only):

Pitchers on the mound when IBB’s are issued: .300 wOBA

Average pitcher: .330

Coefficient: .330/.300 or 1.10

Those batters who were IBB’d, if they were to hit against average pitchers, would hit .380.

How would they hit against these same pitchers?  .380/1.10 or .345.

That is the number we need to compare hitting away to being IBB’d.

Of course, I am doing the above with component stats and not wOBA.
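The wOBA example above works out as follows. This is a short sketch in Python with made-up variable names, using the .300 figure given for the pitchers on the mound when IBB’s are issued.

```python
ibb_pitchers_woba = 0.300   # pitchers on the mound when IBB's are issued
avg_pitcher_woba = 0.330    # average pitcher
batters_vs_avg = 0.380      # IBB'd batters against average pitching

# Coefficient: average pitcher divided by the IBB pitchers.
coefficient = avg_pitcher_woba / ibb_pitchers_woba

# Scale the batters down to the pitchers they actually faced.
batters_vs_ibb_pitchers = batters_vs_avg / coefficient

print(round(coefficient, 2), round(batters_vs_ibb_pitchers, 3))  # 1.1 0.345
```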


#25    Peter Jensen      (see all posts) 2008/04/08 (Tue) @ 14:04

Here are the coefficients I am going to use to convert any batter’s overall line to his line when faced with that situation:  They are simply the second batting line (overall) divided by the first (runners on 2 and 3 with 1 out).

Aren’t the coefficients that you show the first batting line divided by the second?


#26    MGL      (see all posts) 2008/04/08 (Tue) @ 15:44

I have 2 sets of coefficients.  One is to convert overall batter stats to those with runners on 2 and 3 and 1 out.  The other is to adjust for the pool of pitchers when batters are IBB’d in that situation.


#27    Peter Jensen      (see all posts) 2008/04/08 (Tue) @ 19:03

MGL - Go back and look at your data.


#28    MGL      (see all posts) 2008/04/08 (Tue) @ 21:14

I’m redoing everything anyway for the AL only, 1998-2007.  I’ll post the data and the calculations later.


#29    MGL      (see all posts) 2008/04/11 (Fri) @ 03:40

I have some interesting data coming, but I am still working on it.  Give me a little time.  We wanted to “stick” this thread for a while so that it does not get buried before I get to present the data and the analysis. Thank you for your patience.  Now back to our regularly scheduled programming…


#30    MGL      (see all posts) 2008/04/14 (Mon) @ 21:04

I think I am going to write an article for THT in which I describe my findings, which I think are interesting.  So we’ll “un-stick” this thread and you can look for the THT article in the upcoming week or two.


#31    Peter Jensen      (see all posts) 2008/04/15 (Tue) @ 17:35

Did you correct the error that I pointed out in Post #25?


#32    MGL      (see all posts) 2008/04/16 (Wed) @ 01:57

Peter, I am not using that in the analysis anyway.  I wrote the article for THT, but I am not sure when it is going to be printed online.


#33    Peter Jensen      (see all posts) 2008/04/16 (Wed) @ 02:14

Looking forward to it.

