THE BOOK--Playing The Percentages In Baseball


Sunday, February 17, 2008

Linear Weights Run values of each pitch

By Tangotiger, 12:29 PM

Great job by Joe Sheehan, in looking at the run values of each pitcher’s pitch.  We’ve been talking about this for a while, and Joe did the grunt work on it.  We definitely need to look at it by count, but this is definitely a great big first step at it.


#1    MGL      (see all posts) 2008/02/17 (Sun) @ 19:10

Great, great stuff, as usual. 

I would like to see all pitches’ lwts/pitch compared to the average for that type of pitch.  In all his charts, he gives and ranks the pitches only by their lwts/pitch as compared to an average pitch.  That is deceptive.  If I told you that so-and-so’s changeup was -.02 runs per pitch, and I asked you if that was a good changeup, what would be your answer?  Well, it should be, “I have no idea, until you tell me the average lwts/pitch for a changeup!”

Joe (if you are here), did you “zero everything out” for all pitchers before you started listing each pitcher’s total lwts and lwts per pitch?  IOW, if you used Tango’s formulas for computing each pitcher’s lwts values for all their pitches, you need to go back and make sure that everything adds to zero for all pitchers.  It likely will not and you will have to add a “fudge multiplier” for all results.

I love the fact that you attempted to do a regression.  Without that, we have no idea how much of these results are likely chance and how much is likely skill.  As I always say, when giving sample results, which we almost always are, we MUST give the reader some idea of the regression, so they have SOME idea how much the differences among players may be attributable to luck and how much to skill.  It is never all that evident.  The only thing that usually is intuitively obvious (and even then, we can be fooled sometimes, as in DIPS) is whether something probably has either some or no skill.

If I want to look at the quality of a pitch, independent of how often it’s thrown,

I know that Joe knows this, but the quality of a pitch is NEVER independent of how often or WHEN it is thrown.  Even if I have an average changeup, for example, if I throw it in fastball counts, even if that is incorrect, it will appear to be a great pitch (it will have a great lwts/pitch) because the batter is not expecting it.  That is only true if you are NOT using count to determine lwts value.  If you are, then it won’t matter when I throw it.  The lwts value will include that.  That is why it is really NECESSARY to use lwts by count!  That is because some pitchers’ pitches are more or less effective because of when they throw them.  In fact, the more that I think about it, Joe, it is virtually mandatory that you do it by count, as that is a big part of a pitcher’s skill - the ability to throw a certain pitch at a certain count, or not.  For example, there are many pitchers who have a nasty, virtually unhittable curve, but they have no control with it and can only throw it when ahead in the count.  Etc.

Great, great stuff, but a long way to go.

I would love to see a comparison between those lists that are generated by players, managers and pitching coaches, of who has the best fastball, curve, etc., and the actual lwts/pitch.


#2    MGL      (see all posts) 2008/02/17 (Sun) @ 19:14

I’ve said it before and I’ll say it again.  It is a crime that teams are not hiring guys like Joe (there are several others that are doing equally great work) for BIG money, or at least doing the same thing with the pitch fX data internally (I’m pretty sure some of them are).  This is the next frontier for getting a GIGANTIC edge both in scouting and in developing better pitching and hitting techniques, as well as in evaluating pitchers (and hitters), especially after injuries.  It also enables us to at least partially answer questions like why did a sucky pitcher like Marquis have a good season - was it luck or did he learn a new pitch, improve one or more of his pitches, etc.?

Now, if Joe ends up getting the big bucks from a team, I would like a commission! ;)


#3    Renè      (see all posts) 2008/02/17 (Sun) @ 19:45

It really is good stuff, and I’m ever so glad that someone followed up on my suggestion. I also believe that doing it by count is fundamental. I picked out Papelbon’s data from Josh Kalk’s webtool (301 of his 910 pitches of the season) and this is what I got, by using average strikes and balls (in terms of runs saved) with that dataset:

Fastball: 11.45
Curve: 0.82
Splitter: 0.53
Slider: 0.41

The splitter seems to be worse than the curveball. But when you take into account every situation (2 strike foul balls, and specific run values of strikes and balls in every count), you get this:

Fastball: 8.91
Splitter: 0.72
Curve: 0.64
Slider: 0.43

I was stunned that the splitter prevented so few runs, but it is possibly explained by the fact that he never throws it for a strike (he got only one called strike while PITCHf/x was running), so unless he throws it in a pitcher’s count, hitters have the luxury of taking it, thereby turning it into a ball. In fact, here’s a table in terms of runs saved:
http://rene144.playitusa.com/wp-content/uploads/2008/02/papsrs.jpg
“Early” counts are 0-0, 1-0, 0-1, 1-1. “Deep” counts are all the other ones. Pitcher’s counts are those with a negative run value, hitter’s counts are those with a positive one. The splitter is actually bad when thrown early in the count (because of all the balls taken), but it is nearly as effective as the fastball with 2 strikes (despite being thrown 19% of the time vs. 73% of the fastball) because hitters have to protect the zone and miss it all the time (88% of the times Papelbon throws it, it isn’t put into play). That’s where the run value is maximised. In other situations, he might want to throw it to keep hitters honest and aware, but it’s actually a bad pitch for him.

I started with Papelbon because I was curious regarding his splitter, but you may only imagine how different the run values can be for starting pitchers with 2000+ pitches recorded over the course of the season.


#4    tangotiger      (see all posts) 2008/02/17 (Sun) @ 21:15

I had written to Joe prior to his article:

The only question you have is if you want to treat the pitcher as his own universe (personalized run values, such that the overall weighted run value will be zero for all pitchers) or using the league values (therefore, EVERY pitcher gets the same run value at each count, but the overall run value for each pitcher will reflect his overall talent).

It all depends on what you are trying to do.

There are definite benefits to doing it both ways.

For example, say you give Santana a personalized LWTS set of run values for each count (he’s his own universe).  The benefit here is that you are comparing each of his pitches at each count. 

A Santana changeup doesn’t exist by itself in a way that lets you compare it to somebody else’s.  It really exists (a) within the context of Santana’s other pitches, and (b) within the context of the frequency at which it’s thrown at a particular count, so you have to be very cognizant of this.


#5    MGL      (see all posts) 2008/02/17 (Sun) @ 22:56

Another monkey wrench that came to mind while reading Rene’s post (BTW, what do those numbers represent, Rene?  “This is what I got” does not cut it for my dense head!) is that a pitcher may need to throw a pitch that has a positive (bad) lwts value, even for him, in order to make his other pitches better.  For example, let’s say that I have a crappy changeup and you look at its lwts value and you say, why does he even throw that?  He might as well throw one of his other better pitches.  Well, I need to throw that bad changeup so that the batter is not always looking for my fastball, in which case the value of my fastball would be worse (if I don’t throw the changeup).  So there is a lot of game theory involved in evaluating pitchers and their pitches, which makes the whole process that much more complicated, as if it weren’t complicated enough!


#6    tangotiger      (see all posts) 2008/02/17 (Sun) @ 23:43

Agreed.  That’s why I prefer starting with a pitcher as his own universe.


#7    Renè      (see all posts) 2008/02/18 (Mon) @ 06:55

MGL/5: sorry, I should have explained better. I did exactly what Joe did.
I started by getting the run value of every event and every strike in every situation, so here’s the example:

0-0, run value is 0, Papelbon throws a fastball for a strike. An 0-0 strike has a value of -0.04697.
0-1, Papelbon throws a curveball for a ball. An 0-1 ball has a value of 0.028946.
1-1, Papelbon throws a fastball which gets an out. A 1-1 out has a value of -0.28201.
In this AB the fastball had an overall value of -0.32898, while the curveball had 0.028946. That table obviously shows runs saved, so you have to reverse the signs. The fastball saved 0.32898 runs. I applied this to all pitches in all counts (using 2 strike foul balls as neutral) and figured out how many runs each pitch saved or cost Papelbon during the season while PITCHf/x was tracking. The fastball was better than the other pitches.
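
The bookkeeping in this example plate appearance can be tallied mechanically. Here is a minimal sketch that uses only the three run values quoted above; the function name and the data layout are illustrative assumptions, not Rene's actual worksheet.

```python
from collections import defaultdict

# (pitch type, run value of the result at that count), from the example PA above
plate_appearance = [
    ("fastball", -0.04697),   # 0-0 count, strike
    ("curveball", 0.028946),  # 0-1 count, ball
    ("fastball", -0.28201),   # 1-1 count, out
]


def runs_by_pitch_type(pitches):
    """Sum run values per pitch type (negative is good for the pitcher)."""
    totals = defaultdict(float)
    for pitch_type, run_value in pitches:
        totals[pitch_type] += run_value
    return dict(totals)


totals = runs_by_pitch_type(plate_appearance)
# Flip the sign to express the totals as runs saved.
runs_saved = {pitch: -value for pitch, value in totals.items()}

for pitch, saved in runs_saved.items():
    print(f"{pitch}: {saved:+.5f} runs saved")
```

Run over a full season of tracked pitches instead of one plate appearance, the same loop would give the per-pitch-type totals discussed in this thread.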

Two excellent avenues to explore:
- the pitcher as his own “universe”, because it is actually necessary to do “bad” things every now and then to improve the effectiveness of other pitches in spots where the run value is maximised.
- going beyond weighing just superficial events (such as outs) and contextualizing them further by game situation and not just by count. A groundball out is worth more with runners on, and not all 1-0 pitches are the same depending on the score of the game.


#8    joe p      (see all posts) 2008/02/18 (Mon) @ 14:04

MGL
Thanks for the reminder about making everything equal 0 at the start.  You mean the total lwts for all pitchers and all pitches, right? 

As far as the average values for each subgroup I thought when I was comparing lwts/pitch I was comparing to the average pitch for that subgroup, but looking back, I didn’t.  Sorry about that.  I think the average values were mostly around 0, with curveballs from LHP to LHH the lowest at -.01 runs/pitch, and several pitches were as high as .01 runs/pitch.  I can post those later tonight.

That sentence about looking at the quality of a pitch independent of how often it was thrown was a poor choice of words.  I was trying to account for the fact that Webb and some other pitchers had more total appearances in pitch f/x enabled ballparks, but without LWTS by count (which is the next step, and the db queries are just about done), the correct way to adjust for that needs to account for the frequency of the pitch (which I didn’t do).  Rene mentioned something about this in one of his articles, but finding the number of runs saved, for X number of pitches thrown with a pitcher’s normal distribution of pitches, is another way to go.

Tango

The pitcher as his own universe involves personalized linear weights for every pitcher in every count, right?  Do you use personalized ‘through count’ values to get the value of each event for each pitcher in each count, or do you use the average ‘through count’ values?  I’m guessing there’s some regression involved here.


#9    Sky      (see all posts) 2008/02/18 (Mon) @ 14:08

#5 - If a crappy pitch isn’t effective even when thrown infrequently, I don’t see how it would make the other pitches better.  If, no matter how infrequently my changeup is thrown, it gets pummeled, why would a hitter bother to even think about it?


#10    Mike Fast      (see all posts) 2008/02/18 (Mon) @ 14:09

Tango, the problem that I have run into with using each pitcher-count combo as its own universe is that the sample size is so small the results can be counterintuitive.  A given pitcher can perform better on 2-0 than 1-2, for example.  Maybe that’s due to some real talent or strategy difference unique to that pitcher, but probably it’s just luck. 

I probably need to regress to the mean to make that work properly, but I wasn’t sure how to do that in that case.


#11    Renè      (see all posts) 2008/02/18 (Mon) @ 14:57

Rene mentioned something about this in one of his articles, but finding the number of runs saved, for X number of pitches thrown with a pitcher’s normal distribution of pitches, is another way to go.

Thanks for mentioning this Joe, since I forgot! While I’m at it and since I spoke about Papelbon’s runs saved by pitch type, I also figured out how many runs each single pitch saved on average. So here are the values:
Every fastball saved 0.040 runs
Splitter: 0.014
Slider: 0.029
Curveball: 0.080
Funnily enough, the curveball is more efficient per pitch thrown, possibly (probably) because of the reduced frequency. Then I went beyond and I calculated for him how many runs each type of pitch would save if he threw 2000 total pitches and maintained the same pitch distribution. This is how it would go:
Fastball: 59.22
Splitter: 4.75
Slider: 2.87
Curveball: 4.26
I guess this is more useful for a comparison with other pitchers. So far I only did Carlos Marmol, and I will move on to SPs from now on (Haren, Chris Young, Batista and other pitchers with large sample sizes).
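
The 2000-pitch projection appears to be a straight rescaling of the sample totals. The sketch below assumes that is what was done: it takes the by-count runs saved from post #3 and the 301 tracked pitches, and scales them to 2000 pitches with the pitch mix held constant. The names are hypothetical, and the results land close to, but not exactly on, the figures listed above, presumably because of rounding in the inputs.

```python
TRACKED_PITCHES = 301   # Papelbon pitches in the sample, per post #3
TARGET_PITCHES = 2000   # projection size used above

# By-count runs saved in the sample, per post #3
runs_saved_sample = {
    "fastball": 8.91,
    "splitter": 0.72,
    "curveball": 0.64,
    "slider": 0.43,
}


def project(runs_saved, tracked, target):
    """Scale sample totals to a target pitch count, keeping the pitch mix fixed."""
    scale = target / tracked
    return {pitch: runs * scale for pitch, runs in runs_saved.items()}


projected = project(runs_saved_sample, TRACKED_PITCHES, TARGET_PITCHES)
for pitch, runs in projected.items():
    print(f"{pitch}: {runs:.2f}")
```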

#9: I don’t think he’s talking about crappy pitches being ineffective. I think we should all agree that Dave Bush should drop the changeup (which is extremely ineffective at all times), but the idea comes from Papelbon’s data. He has a great splitter with 2 strikes, but it is poor when thrown early in the count. It’s not about the ineffectiveness of a bad pitch thrown infrequently. It’s about a good pitch being thrown in bad spots every now and then to keep hitters “honest” and guessing. This ends up with the splitter having a bad run value when thrown early in the count, but it is going to maximise the value of the fastball early, and of the fastball and the splitter later on. Overall, he comes out positive, but in order to do so, occasionally he has to throw “bad” pitches (I guess intentional balls enter into this equation as well).


#12    MGL      (see all posts) 2008/02/18 (Mon) @ 18:53

Rene, got it!  Great work also!

That sentence about looking at the quality of a pitch independent of how often it was thrown was a poor choice of words.

Sure, I know what you meant.  I only wanted to emphasize how interdependent a pitcher’s pitches are because of the game theory element.

Sky, sure if a pitch is so bad that a batter does not have to alter what kind of anticipation he has, then of course he should not throw it.  That is generally not the case with MLB pitchers.  My point is that pitchers probably have pitches that are “negative” in value, but they still need to throw them at some frequency.  Imagine a pitcher has a fastball that is -.01 (.01 runs better than an average pitch) and he throws it 90% of the time.  Then he has a changeup which is .01.  A person might think, “Well, why doesn’t he just throw the fastball all the time, as the fastball is a much better pitch, AND that CH is worse than even a league average pitch?”  There are 2 reasons.  One, if he did, it might change the overall value of the fastball a tick.  If it even changed it to -.005, he is still better with the 90% -.01 and 10% .01 rather than 100% -.005.  Two, he is probably throwing the CH in pitcher’s counts and that might change the fastball value considerably if all he threw were fastballs in those counts, or something like that.
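
The arithmetic behind the first reason is a frequency-weighted average. A minimal sketch using only the numbers in this example (the helper name is made up for illustration):

```python
def mix_value(mix):
    """Frequency-weighted lwts/pitch for a list of (share, runs_per_pitch) pairs."""
    return sum(share * value for share, value in mix)


# 90% fastballs at -.01 plus 10% changeups at +.01
current_mix = [(0.90, -0.01), (0.10, 0.01)]
# 100% fastballs, with the fastball degraded to -.005 because hitters sit on it
fastball_only = [(1.00, -0.005)]

print(round(mix_value(current_mix), 4))
print(round(mix_value(fastball_only), 4))
```

The mixed approach comes out to about -.008 runs per pitch against -.005 for the all-fastball approach, so the "bad" changeup still pays for itself under these assumed values.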

Again, just another example of how all of a pitcher’s pitches and their accompanying values are extremely inter-related.  And it is difficult to figure out the exact or even approximate correct ratio of pitches in different counts for every pitcher given the characteristics and effectiveness of their pitches.  That might even be something that is best done by human beings using their intuition and experience.


#13    joe p      (see all posts) 2008/02/19 (Tue) @ 01:27

Here are the average lwts/pitch for each subgroup.  These are calculated using an average value for every event, which is the way I did the calculations in the article. 

Group lwts/pitch
LHP-CB-LHB -0.02
LHP-CB-RHB 0.00
LHP-CH-LHB 0.00
LHP-CH-RHB 0.01
LHP-FB-LHB 0.00
LHP-FB-RHB 0.01
LHP-SL-LHB 0.00
LHP-SL-RHB 0.02
--------------------
RHP-CB-LHB 0.00
RHP-CB-RHB 0.00
RHP-CH-LHB 0.01
RHP-CH-RHB 0.01
RHP-FB-LHB 0.01
RHP-FB-RHB 0.01
RHP-SL-LHB 0.01
RHP-SL-RHB 0.00
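
With these subgroup averages, the comparison asked for in #1 (a pitch against the average for its own type) is a simple subtraction. A sketch, assuming the table is keyed by pitcher hand, pitch type and batter hand; the -.02 changeup is the hypothetical from #1, not a real pitcher's number.

```python
# Average lwts/pitch by (pitcher hand, pitch type, batter hand), from the table above
GROUP_AVG = {
    ("LHP", "CB", "LHB"): -0.02, ("LHP", "CB", "RHB"): 0.00,
    ("LHP", "CH", "LHB"): 0.00,  ("LHP", "CH", "RHB"): 0.01,
    ("LHP", "FB", "LHB"): 0.00,  ("LHP", "FB", "RHB"): 0.01,
    ("LHP", "SL", "LHB"): 0.00,  ("LHP", "SL", "RHB"): 0.02,
    ("RHP", "CB", "LHB"): 0.00,  ("RHP", "CB", "RHB"): 0.00,
    ("RHP", "CH", "LHB"): 0.01,  ("RHP", "CH", "RHB"): 0.01,
    ("RHP", "FB", "LHB"): 0.01,  ("RHP", "FB", "RHB"): 0.01,
    ("RHP", "SL", "LHB"): 0.01,  ("RHP", "SL", "RHB"): 0.00,
}


def relative_lwts(lwts_per_pitch, throws, pitch, bats):
    """lwts/pitch relative to the average for the same subgroup."""
    return lwts_per_pitch - GROUP_AVG[(throws, pitch, bats)]


# Hypothetical: a righty's changeup to right-handed batters at -.02 runs/pitch
changeup_vs_group = relative_lwts(-0.02, "RHP", "CH", "RHB")
print(round(changeup_vs_group, 3))
```

Against a subgroup average of +.01, that changeup is about .03 runs per pitch better than a typical pitch of its kind, which is a different picture from the raw -.02.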

I mentioned in the article that the sum of the lwts for Webb’s pitches was 18 runs better than average.  Evaluating every event based on the count it occurred in causes Webb’s pitches to be 25 runs better than average.  I had hoped that the sum of his pitches might connect with how many runs above average Webb was overall during the period I looked at, but goofed on the math in the article.  Guy showed me my mistake, and the correct math as well as doing the pitch valuations by count, makes me think you can break up a pitcher’s wins into the pitches responsible for them.

I posted this at Baseball Analysts as well, but in the time period I had for Webb he threw 113 innings with a 2.55 ERA, good for somewhere around 22-23 runs above average.  The sum of his pitches was 25 runs above average. 

I did this for Peavy too (111 ip, 2.99 ERA in my window) and he was 17-19 runs above average and the sum of his pitches was 17 runs above average. 

I did Lowe and Moyer as well and both were off by only a couple of runs.  I need to double-check my new formulas for values by count, but I think it’s pretty cool if you could say that Webb’s sinker to righties was responsible for half of his wins above average.


#14    MGL      (see all posts) 2008/02/19 (Tue) @ 05:51

Joe, #13, above, again, shouldn’t everything add to zero by definition?  Why is only one category, LHP CB to LHB, negative?

Granted, your universe of pitchers is only those whose pitches are being tracked (what % of all pitchers/pitches are tracked?), but unless that universe is biased, it should add to around zero, no?


#15    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 10:24

MGL, a little over a third of pitches were tracked during 2007.  Pitchers typically have about a quarter to half their pitches tracked, depending on what point in the season their home park got a PITCHf/x system.  There is overweighting toward the pitchers from the West Coast parks because they got their systems first.

BTW, Joe, I’ve ended up tracking Lwts/pitch to .001 runs rather than .01.  I’m not sure what the error is on the Lwts/pitch numbers, but I do know that if a pitch was thrown 500 times that .01 runs/pitch is 5 runs, which is a pretty coarse granularity.  Similar to what you found in your article, I’ve found that the best pitches in baseball are around -.04 runs per pitch.


#16    Renè      (see all posts) 2008/02/19 (Tue) @ 11:34

MGL/14: the data is biased. We have more data for certain parks (and therefore certain pitchers), but we don’t account for park or league factors. Not only that, we have more data from the summer period, which, if I recall correctly, yields more runs with respect to April and May for example, so working with lwts from the whole season isn’t perfect. I’ve been thinking about this stuff since a friend of mine and I have been onto it for the past weeks (sorry for not referring you to my work, but it’s in Italian), but I’m not sure there is an easy way around this trouble, especially with limited data.
Not only that, I think that DPs are not accounted for in this analysis, so it’s pretty obvious that data for pitchers like Webb, Lowe, Wang, Westbrook and the like isn’t going to be perfect (more DPs, more imperfection basically) when you compare lwts for pitches to ERA.
Park factors are also important. Take Fenway, where a single or double to left will systematically lead to fewer extra bases taken by the runner. Maybe including park factors in the analysis would be a nice future avenue.
Basically, there is a lot of noise that piles up. But we’re really talking about perfection. I believe that what has been found is already quite satisfactory. There might be an error of 2-3 runs over the course of a season (which is not small) but as Joe said, we can already say that 90% of Papelbon’s RSAA comes from his fastball and that Webb’s sinker to righties is responsible for half of his own RSAA. This is truly exciting stuff.

Many issues are interesting to me. Here are some more:
- not all outs are actually the same. Should sacrifices be treated differently when rating a pitch’s efficiency?
- can the study be made defense and park-independent by using the value of a batted ball (by type of BIP) rather than the actual outcome? It’s another thing I wanted to do, but I’ve only got superficial data unfortunately. Of course luck and chance will emphasize the difference in RSAA, but will provide a better account of “true talent”. If we got both actual RSAA and DIPS-RSAA, we could also understand how individual pitchers are helped/hurt by their defense and we could see how much their pitch distribution works for/against them.

As I said, I’ve been onto this stuff for weeks now, and I’m really glad to have fresh minds working on this!


#17    Guy      (see all posts) 2008/02/19 (Tue) @ 11:55

Another issue to wrestle with:  if I understand Joe’s methodology, the value of each non-BIP pitch reflects the change in expected run value for that PA (strike reduces, ball increases).  But for BIP, he uses the actual run value of that outcome.  An alternative approach would be to continue to measure the marginal change for BIP pitches as well. 

For example, when Webb has a hitter 1-2 the OPS against is just .372 when the next pitch is put in play.  So the expected run value is already low.  Now, if Webb throws a changeup and gets an out, should the changeup get full credit for the out, or only the marginal change?


#18    Renè      (see all posts) 2008/02/19 (Tue) @ 12:04

Guy, I think he’s adjusting for that (when he’s calculating by count). I know I did for example, and Joe and I spoke about it. We’re rating pitch by pitch, so when you’re 1-2, that’s a pitcher’s count by run value, and if you get an out, your “out pitch” should only get the marginal credit. For example, according to my friend’s calculations (he did the “dirty” work for me using Tango’s data for the run values), this year a single was worth 0.497, but a 3-1 single is “only” 0.343, and an 0-2 single is 0.608 because you have to figure out the run value by count and subtract the value of previous pitches, as I also explained in post #7. My calculations reflect that, and I think Joe’s latest ones do as well.
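
The marginal-credit idea reduces to one subtraction: the value of an event at a count is the event's overall run value minus the run value already banked by reaching that count. The sketch below uses the three figures quoted in this post and backs out the count values they imply; the function names are illustrative, and the implied count values are derived here rather than taken from anyone's published table.

```python
SINGLE_VALUE = 0.497  # run value of a single, per this post


def marginal_value(event_value, count_value):
    """Credit for the pitch: event value minus the value of the count it came in."""
    return event_value - count_value


def implied_count_value(event_value, event_value_at_count):
    """Invert the relation above to recover the count value."""
    return event_value - event_value_at_count


# Count values implied by the 3-1 single (0.343) and the 0-2 single (0.608)
count_3_1 = implied_count_value(SINGLE_VALUE, 0.343)
count_0_2 = implied_count_value(SINGLE_VALUE, 0.608)

print(f"implied 3-1 count value: {count_3_1:+.3f}")
print(f"implied 0-2 count value: {count_0_2:+.3f}")
print(f"single at 0-2: {marginal_value(SINGLE_VALUE, count_0_2):+.3f}")
```

The hitter's count (3-1) carries a positive value, so a single there earns less than the full 0.497; the pitcher's count (0-2) carries a negative value, so a single there costs the pitcher more.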


#19    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 12:13

Rene/#16, I raised the question earlier in another thread here (the Bedard thread, I think) about using actual batted ball outcome versus the average for that particular type of batted ball.  I’ve been using the actual batted ball outcome in my work since I don’t think we know whether DIPS theory applies at all at the pitch level, and if so, how or to what extent it applies.  Better to assume it doesn’t apply and find out from the data that it does than to assume it does apply and never be able to tell one way or the other.

One other note--I am including double plays in my lwts.  I think that’s important to do because it clearly varies a lot between pitchers and pitches.


#20    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 12:22

Rene/#18, are you counting two-strike foul balls when you calculate the lwts at each count?  It looks from your value of a single at 0-2 that you are ignoring two-strike fouls.  I’ve seen that in lwts/pitch that other people have done, and I think that’s an error.

If a pitcher’s two-strike pitches just get fouled off rather than closing the deal with a strikeout, that should be counted in their average value.  It would be a value of zero, but if the pitch is otherwise an above-average pitch, two-strike fouls would reduce its average value.

Maybe you are including two-strike fouls in your work, and our lwts just differ enough that I came to the wrong conclusion about what you had done, but we are pretty close on the 3-1 count value.  For the AL, I have the 3-1 count worth 0.146 on average, and the 0-2 count worth -0.088 on average.  Before I corrected the two-strike foul problem, I had the 0-2 count worth -0.115, which is pretty close to your number.


#21    tangotiger      (see all posts) 2008/02/19 (Tue) @ 13:41

The two-strike foul issue:

Let’s say that at a particular 2-strike count the run value is a total of +.100 runs based on all non-foul-strike pitches, and that the foul pitches represent 20% of all pitches.  Then the LWTS value at that count is .100 / (1 - .200) = .125

If this is what Mike is doing, I’ve got no issue here.  If Rene and others are not doing this, then this is a problem.


#22    joe p      (see all posts) 2008/02/19 (Tue) @ 13:42

Initially my methodology was to have an average value for every event (balls in play, balls and strikes) and then I just multiplied that by the number of events that happened.  This isn’t the most accurate way to make the evaluations, but I wanted to make sure I was actually measuring something before I spent the time refining it by count.  In the last couple of days I’ve adjusted my evaluations so that every event is weighed by the count it happens in.  A single in an 0-2 count now hurts a pitcher more than a single in an 0-0 count.

Mike, great point about 2-strike foul balls.  I was ignoring them because they didn’t change the value of the count, but you’re totally right about adding them in as a 0.


#23    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 13:48

Tango, using your example numbers, I calculate the lwts value to be .100 / (1+.200) = .083.  Right?

And 20% is a pretty good guess for two-strike fouls, if that’s what it was.  I calculate 19% of all pitches at 0-2, 21% at 1-2, 24% at 2-2, and 28% at 3-2.


#24    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 14:41

I have actually gone back and forth in my head on the two-strike foul issue a dozen times.  I’m glad there are some other people working on this publicly now so I can have someone to bounce my thoughts off of. 

On the one hand, it seems to me that the run value of a count (i.e., of all plate appearances that go through that count) should be independent of two-strike fouls.

On the other hand, there is the argument I made above in #20.

I am having trouble reconciling the two ideas either logically or numerically, particularly when I start including the value of an individual ball, strike, or ball in play.  This is why I have not published anything on this topic.  I have yet to be able to get my numbers to balance.

I realized just as I was typing this that it might be helpful to break out two-strike fouls as a separate type of pitch and not include them in the same category as strikes at all.  That way I would not have to play tricks with the value of a strike, and I could also leave the overall per-count values alone.  I will have to see if that fixes my lwts/pitch accounting problems.

The per-count run values are one thing.  Ignore the two-strike fouls


#25    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 14:44

Ignore the last sentence-and-a-half paragraph in #24 (or Tango, are you able to delete it?).  That was the thought that I was starting to type when I had my realization of the previous paragraph, but then I neglected to delete it.


#26    tangotiger      (see all posts) 2008/02/19 (Tue) @ 18:34

I’ll delete it later tonight.

No, I meant 1 MINUS .200, since the run value of +.100 was based on 80% of the PA.

I’m pretty sure that’s how I’ve done it.  I’ll have to go back and look.

***

As for the sample size, I turn everything into a Markov chain.  So, I don’t look at how a pitcher does from a 2-1 count to the end of the PA, but rather what happens at that 2-1 count: in play, 3-1, or 2-2.  Each of those has its own run value.  The 3-1 and 2-2 you’d have to determine recursively.  The in play is exactly what happened at that count.
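
A rough sketch of the recursive calculation described here: each count's value is the probability-weighted sum of what can happen on the next pitch (ball, strike, or ball in play), with the next counts valued by the same rule. Every number below is a made-up placeholder, the same probabilities are reused at every count for brevity, and two-strike fouls and hit batters are left out, so this shows the shape of the recursion only, not anyone's actual run values.

```python
from functools import lru_cache

# HYPOTHETICAL inputs, for illustration only
P_BALL, P_STRIKE, P_IN_PLAY = 0.38, 0.42, 0.20  # per-pitch outcome probabilities
IN_PLAY_VALUE = 0.03      # average run value of a ball in play
WALK_VALUE = 0.30         # run value of a walk
STRIKEOUT_VALUE = -0.28   # run value of a strikeout


@lru_cache(maxsize=None)
def count_value(balls, strikes):
    """Expected run value of a plate appearance that reaches this count."""
    if balls == 4:
        return WALK_VALUE
    if strikes == 3:
        return STRIKEOUT_VALUE
    return (
        P_BALL * count_value(balls + 1, strikes)
        + P_STRIKE * count_value(balls, strikes + 1)
        + P_IN_PLAY * IN_PLAY_VALUE
    )


def ball_value(balls, strikes):
    """Run value of a ball thrown at this count (a transition between states)."""
    return count_value(balls + 1, strikes) - count_value(balls, strikes)


def strike_value(balls, strikes):
    """Run value of a strike thrown at this count."""
    return count_value(balls, strikes + 1) - count_value(balls, strikes)


for balls in range(4):
    for strikes in range(3):
        print(f"{balls}-{strikes}: {count_value(balls, strikes):+.3f}")
```

With real data, each count would get its own observed probabilities and in-play value, and the values would be centered so that 0-0 is zero. The state depends only on the count, which is the "doesn't care how you got there" assumption of a Markov chain.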


#27    Mike Fast      (see all posts) 2008/02/19 (Tue) @ 19:10

Tango, addressing your points in reverse, I thought you (or maybe someone else whose results you presented here) had found that all 1-1 counts were not created equally, that 0-1 plus a ball had slightly different results than 1-0 plus a strike.  I guess I’m not seeing the value in doing the Markov chain approach.  What am I missing?

Regarding two-strike fouls, there were 33,821 plate appearances in 2007 MLB that had an 0-2 count.  The linear weights of the final results of all those plate appearances totaled about -3605 runs.  If you divide the two you get -0.107 runs/PA.  But there were approximately 41,387 pitches thrown in an 0-2 count (meaning ~7566 two-strike fouls).  If you divide those two, you get -0.087 runs/pitch.
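
The two divisions can be checked directly. This sketch uses only the 2007 totals quoted in this post; the variable names are mine. It also shows that the per-pitch figure is the per-PA figure shrunk by the share of 0-2 pitches that were two-strike fouls.

```python
PA_THROUGH_0_2 = 33_821   # plate appearances that reached 0-2
PITCHES_AT_0_2 = 41_387   # pitches thrown at 0-2 (approximate, per the post)
TOTAL_LWTS = -3605.0      # summed linear weights of the final results

two_strike_fouls = PITCHES_AT_0_2 - PA_THROUGH_0_2
foul_share = two_strike_fouls / PITCHES_AT_0_2

runs_per_pa = TOTAL_LWTS / PA_THROUGH_0_2
runs_per_pitch = TOTAL_LWTS / PITCHES_AT_0_2

print(f"two-strike fouls at 0-2: {two_strike_fouls}")
print(f"runs/PA:    {runs_per_pa:.3f}")
print(f"runs/pitch: {runs_per_pitch:.3f}")
# Same per-pitch number, reached by discounting runs/PA for the fouls
print(f"runs/PA * (1 - foul share): {runs_per_pa * (1 - foul_share):.3f}")
```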


#28    tangotiger      (see all posts) 2008/02/19 (Tue) @ 20:02

Right, my Markov chain assumes that each state doesn’t care about how you got there, which is not a great thing to assume, but it’s probably decent.

As for your other calculation, doing it that way, the “through count”, you are correct about the .087.

I was doing it my way, about the run value by ball, strike, and in play.  From that standpoint, I had to do it my way (recursively).


#29    Renè      (see all posts) 2008/02/19 (Tue) @ 21:28

Sorry for falling behind with the replies, anyway…

Mike/20: I am not including 2-strike foul balls in the sum of how many runs a pitch saves, because their value is zero, so there is nothing to add to the total. I do include them in the averages however, but I do separate calculations for those. Because pitchers have very different abilities in missing bats and generating foul balls I do this on a pitcher-by-pitcher basis, as they are going to have different Foul% values. The values I showed earlier regarding Papelbon are strictly totals and not averages. The average value of a Papelbon fastball is -0.03942 according to my data (factoring 2-strike foul balls into the average value). This goes back to #24. I don’t think the run value of the count should really be modified, but I do think that the average run value of a pitch should factor it in. It’s graph theory in a way, right? I add and subtract the various transitions between states (counts) and 2-strike foul balls count as zero, which means they should be factored in the average but not in the total. The run value of an 0-2 single is the same with respect to the pitch that yields the base hit, regardless of the fact that it happens on the third pitch (no 2-strike foul balls) or the fifth (2 2-strike foul balls). Of course when you calculate average values you take a look at those pitches that had a run value of 0.00 and factor them in. But doing it “automatically” (and changing the run value of 2-strike counts as a default for groups of pitchers rather than checking individually) can be unfair to pitchers who gain their “unhittability” (the ability of preventing that a pitch thrown is hit into play) from missing bats rather than generating foul balls. 
It can be a pain, but in my opinion it needs to be done by hand, pitcher by pitcher and pitch by pitch (which isn’t necessarily too hard to do when you have totals by count I guess - just use actual run values of the various counts and divide by the total number of pitches thrown by that pitcher, for every type of pitch he has).
I think I twisted my keyboard in this explanation, but I hope I was clear enough. If not, tell me and I’ll give it another shot!

Tango/28: Sal Baxamusa on THT definitely showed that there was no difference between counts other than 1-1 with respect to pitch sequence. But 1-1 was alternatively a pitcher’s count or a hitter’s count depending on how you got there. The Markov chain just has the average, but maybe (since it’s just for that count anyway) it should be differentiated. Is there any chance it could be done somehow? Actually, I don’t have pitch sequences, so I’m not sure what use I would make of that anyway…


#30    tangotiger      (see all posts) 2008/02/20 (Wed) @ 10:19

Thinking about it some more, even if the 1-1 count is different if it came from 1-0 or 0-1, what kind of difference are we talking about?  Let’s say that the 1-1 count is zero.  If it came from 1-0, it might be +.02 and if it came from 0-1 it might be -.02.  For most pitchers, it probably means up to 65% from one side and 35% from the other, which makes the overall value .006 runs.  And that’s probably an exaggerated number.  If someone wants to take a bigger look at it, feel free.  But, I have to believe that you are extremely safe by treating 1-1 as its own count.
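
The arithmetic behind that .006 figure can be checked with a few lines of Python. This is only a sketch of the weighted average described above; the function name and its parameters are made up for illustration, and the +.02/-.02 values are the guesses from the comment, not measured numbers.

```python
# Weighted average of the 1-1 run value, depending on how the count was reached.
# The +0.02 / -0.02 values are the illustrative guesses from the comment above.
def blended_count_value(share_from_1_0, value_from_1_0=0.02, value_from_0_1=-0.02):
    """Average 1-1 run value for a pitcher, weighted by the path into the count."""
    share_from_0_1 = 1.0 - share_from_1_0
    return share_from_1_0 * value_from_1_0 + share_from_0_1 * value_from_0_1

extreme = blended_count_value(0.65)  # 65% of 1-1 counts reached from 1-0
neutral = blended_count_value(0.50)  # evenly split between the two paths

print(f"65/35 split: {extreme:+.3f} runs")
print(f"50/50 split: {neutral:+.3f} runs")
```

Even at the lopsided 65/35 split, the blended value stays within a few thousandths of a run of zero, which is the point being made.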


#31    Renè      (see all posts) 2008/02/20 (Wed) @ 10:29

Tango, by the way, here is Sal’s piece I mentioned:
http://www.hardballtimes.com/main/article/the-memory-remains/

This is the relevant part:

Ball in play on 1-1 pitch
First pitch strike .336 BA .528 SLG
First pitch ball .299 BA .472 SLG


#32    Mike Fast      (see all posts) 2008/02/20 (Wed) @ 20:55

My numbers finally balance (within about 0.001 or 0.002 runs at each count, which error I chalk up at least partly to data quality problems).  I used a two-pronged approach.

What I did was this: ignore two-strike fouls and calculate lwts/PA for each count.  Use these values for valuing the transitions between counts (i.e., balls and strikes) and as the baseline to subtract from the result of any pitch thrown at that count.

For comparing how effective a pitcher is at a given count, use lwts/pitch, which includes two-strike fouls as zero-value pitches.  By the way, I think that’s an assumption that probably needs to be tested.  I would guess that two-strike fouls actually have some positive effect on run expectancy even though they don’t change the count.  That data may already exist for all I know.  But for now I’m calling them zero value.
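
To make the total-versus-average distinction concrete, here is a minimal Python sketch. The count and strikeout run values are hypothetical placeholders, not anyone's published numbers, and `summarize` is a made-up helper name. Each pitch is valued as the run value of the state it ends in minus the run value of the count it was thrown at, so a two-strike foul is worth exactly zero.

```python
# Hypothetical run values relative to 0-0; placeholders for illustration only.
VALUE_0_2 = -0.10        # assumed run value of the 0-2 count
VALUE_STRIKEOUT = -0.28  # assumed run value of a strikeout

def summarize(pitches):
    """pitches: list of (start_state_value, end_state_value) pairs.

    Returns (total runs, runs per pitch). A two-strike foul starts and ends
    in the same count, so it adds 0 to the total but 1 to the pitch count.
    """
    values = [end - start for start, end in pitches]
    total = sum(values)
    return total, total / len(values)

# Strikeout on the first 0-2 pitch.
quick_total, quick_avg = summarize([(VALUE_0_2, VALUE_STRIKEOUT)])

# Two fouls at 0-2, then the strikeout.
long_total, long_avg = summarize([
    (VALUE_0_2, VALUE_0_2),
    (VALUE_0_2, VALUE_0_2),
    (VALUE_0_2, VALUE_STRIKEOUT),
])

print(f"no fouls:  total {quick_total:+.3f}, per pitch {quick_avg:+.3f}")
print(f"two fouls: total {long_total:+.3f}, per pitch {long_avg:+.3f}")
```

The totals match, while the per-pitch average is diluted by the two zero-value fouls. That is the difference between the lwts/PA baseline and the lwts/pitch comparison described above.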

Finally, how are the rest of you accounting for intentional balls/walks?  Right now, my method counts intentional balls as normal balls up until 3-0, at which point I make the adjustment to devalue the resulting walk to the run value of an intentional walk.  This results in the somewhat illogical step of giving a negative run value (-0.028) for ball four.  Is there a better way of doing this?


#33    Renè      (see all posts) 2008/02/21 (Thu) @ 09:25

Mike: I wish I had that kind of data, but I don’t. Anyway, what is the actual problem? That before the fourth ball you can’t distinguish intentional balls from unintentional ones? What about using the X and Y coordinates of arrival to see if the pitch just missed or was clearly intentional?

And one more thing: since there are many guys working on similar projects but with different datasets (because of different classification algorithms), why don’t we “join forces” and try to work together on a single shared dataset in order to produce more coherent results?


#34    Mike Fast      (see all posts) 2008/02/21 (Thu) @ 12:22

Rene, the intentional balls are labeled as such in the pitch des field ("Intent Ball").  However, the problem I run into is needing to know the entire pitch sequence of the intentional walk in order to value each pitch in the sequence properly.  In probably 90% (guess) of the cases, it’s four straight intentional balls, and that would be easy to account for.  Just divide the intentional walk run value by four and forget treating the pitches as transitions from 0-0 to 1-0 to 2-0 to 3-0 to walk.  Basically, that means removing the intentional walks from the dataset and calculating their value separately.

But if I take that approach, it’s wrong for the other ~10% or whatever of cases where the walk was not four straight intentional balls.  The only way I know to be right in all cases is to compute separate values for all possible pitch sequences, but that involves way more work than I want to go to. 

Right now I am using the blunt club of adjusting the value all on ball four.  It might be better to use the other blunt club of removing intentional walks from the data set altogether.

It has a very marginal impact on counts other than 3-0.  However, what you do with intentional balls/walks makes a big impact on your 3-0 run value. In the AL in 2007, 12% of 3-0 counts resulted in intentional walks, and in the NL, 14%.  That’s enough to affect the overall 3-0 run value by 0.02 runs.


#35    Tangotiger      (see all posts) 2008/02/21 (Thu) @ 12:55

If you follow a Markov approach, you can throw out all the intentional balls (including pitchouts).


#36    joe p      (see all posts) 2008/02/21 (Thu) @ 14:07

My problem with intentional balls came when I was classifying pitches.  For the most part, intentional balls were outside the range that I considered fastballs, and I didn’t have a good way of naming them, so I just ignored them at the time and didn’t name them.  I think throwing them out is the best way to go when doing the linear weights.  You can’t really assign the run value of the IBB to any of a pitcher’s pitches.

Tango, when you say a Markov approach, in which part do you mean?


#37    Mike Fast      (see all posts) 2008/02/21 (Thu) @ 14:17

Tango, you’ve convinced me that the Markov approach is at least worth looking into.

I realized I had the data on intentional balls broken down by count.  (I still need to publish this data.  It’s from part 2 of the series of which the in-play data I published was part 1.)

My totals differ by one intentional walk, 1322 vs. 1323, from the Baseball-Reference data, so I’ll consider that good. 

On the 0-0 count, 1021 intentional balls were thrown.  That means that about 77% of intentional walks consisted of four straight intentional balls. 

At the 3-0 count, 1248 intentional balls were thrown.  That means that an additional 17% of intentional walks consisted of four straight balls, of which some were officially unintentional.

At the 3-1 count, 73 intentional balls were thrown; those, plus one at 3-2, mean that about 6% of intentional walks include a strike in the pitch sequence.
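
The three percentages can be reproduced from the counts quoted above. The sketch below assumes that 1322 is the intentional walk total in this data (it is what the intentional balls thrown at three-ball counts add up to: 1248 + 73 + 1) and that every intentional ball at 0-0 began a four-pitch intentional walk. The variable names are made up for illustration.

```python
# Counts quoted in the comment above (AL + NL, 2007).
total_ibb = 1322   # assumed to be the total in this dataset
ib_at_0_0 = 1021   # intentional balls thrown at 0-0
ib_at_3_0 = 1248   # intentional balls thrown at 3-0
ib_at_3_1 = 73     # intentional balls thrown at 3-1
ib_at_3_2 = 1      # intentional balls thrown at 3-2

# Four straight intentional balls from the first pitch.
four_straight_intentional = ib_at_0_0 / total_ibb
# Four straight balls, but the sequence started with unintentional ones.
started_unintentional = (ib_at_3_0 - ib_at_0_0) / total_ibb
# Ball four came at 3-1 or 3-2, so there was a strike somewhere in the sequence.
included_strike = (ib_at_3_1 + ib_at_3_2) / total_ibb

print(f"four straight intentional balls: {four_straight_intentional:.1%}")
print(f"four straight balls, some unintentional: {started_unintentional:.1%}")
print(f"sequence included a strike: {included_strike:.1%}")
```

The three shares cover all 1322 walks, which is a useful sanity check on the count-by-count breakdown.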

Hmm.  Unless you object, Tango, I might reprint some of these thoughts over at StatSpeak.  This is sort of tangential to the topic anyway.  I’ve run into way too many of these interesting tangents while I’ve been looking at the per-pitch run values.  That’s partly why I’ve not published anything on the topic as a whole.

Rene, you mentioned collaboration.  I definitely am in favor of that.  I have tried to make all my methods and data public.

I don’t have a universal pitch classification algorithm that I like or trust.  I have pitch classification data for individual pitchers embedded in spreadsheets.  I’ve made a number of these spreadsheets available at my Fastballs blog, although not every single one.  Many of the unpublished ones have only fragmentary data where I either couldn’t classify all of a pitcher’s pitches or didn’t take the time for whatever reason. 

I have not collated all my pitch classification data for every pitcher I’ve analyzed back into my database yet, so I don’t have a nice format that is available to share the full extent of what I do have.  I’ve been hoping we/someone would come up with a pitch classification algorithm that was more trustworthy than what I’ve seen so far and that my individual pitcher spreadsheets were only stepping stones in that direction rather than portions to be accumulated toward universal data.


#38    Mike Fast      (see all posts) 2008/02/21 (Thu) @ 14:27

Joe, I agree with everything you said in #36.  I throw out intentional balls and pitchouts when I classify pitches.  I suppose pitchouts could still be considered fastballs, but they are fastballs of a unique purpose that is irrelevant to most analysis.  Although it might make an interesting analysis to know if the speed and location of pitchouts have any effect on whether the basestealer is thrown out, assuming there are more than a mere handful of basestealers who are safe on pitchouts.  There I go on another tangent.

However, per-pitch run values have broader application than just to valuing pitch types.  In some of those applications, I think it makes sense to account for intentional walks.


#39    Tangotiger      (see all posts) 2008/02/21 (Thu) @ 14:46

Mike, feel free to quote anything you want.

A pitchout should certainly not be counted as a “fastball”.  Call it a pitchout.  You always have to go back to intent, not to outcome.  If the intent is to throw a pitch for the batter to NOT swing at and to throw it at 75% speed, it makes no sense to clump that pitch in a bucket of other pitches whose intent was to throw the pitch close to the strike zone and close to 100% speed, with the hope that the batter would at least think of swinging at it.

As for Markov, I probably should finally write that article I’ve been meaning to for a few years now.  I finally copied that file from my main computer to my portable drive, so I can work on it at the office.  There’s definitely some good stuff to be done with it.  But to try to answer the question quickly, look at each pitch as having a starting and ending state.  You throw a pitch at 1-0 count: it’s a ball, a strike, or in play.  You never include pitchouts in this.  You recursively determine the run value at 2-0 and 1-1, and figure out exactly the run value of the in play pitch.  Figure out the frequency of each of these three things at that count, and that gives you the run value of the 1-0 count. 

It’s actually extremely simple.  And with only 12 states (unlike the 24 base/out ones) and that you CANNOT go back into a state (you are always moving forward, or staying in the same count in the case of 2-strike fouls), you don’t get into anything circular like with the 24 base/out states (just recursive).
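
A minimal Python sketch of that recursion follows. It is not the unpublished article's method, only an illustration of the idea: the outcome probabilities and terminal run values are hypothetical and are held constant across counts for simplicity, and fouls are split out from other strikes so the two-strike case can be shown. Because a two-strike foul leaves the count unchanged, the value there satisfies V = a + P_FOUL * V, which solves to V = a / (1 - P_FOUL).

```python
from functools import lru_cache

# Hypothetical per-pitch outcome probabilities (they sum to 1). Real ones vary by count.
P_BALL, P_STRIKE, P_FOUL, P_IN_PLAY = 0.36, 0.28, 0.17, 0.19
# Hypothetical run values of the terminal events.
WALK, STRIKEOUT, IN_PLAY = 0.33, -0.28, 0.03

@lru_cache(maxsize=None)
def count_value(balls, strikes):
    """Expected run value of a plate appearance that is currently at this count."""
    if balls == 4:
        return WALK
    if strikes == 3:
        return STRIKEOUT
    after_ball = count_value(balls + 1, strikes)
    after_strike = count_value(balls, strikes + 1)
    if strikes < 2:
        # Before two strikes, a foul simply adds a strike.
        return (P_BALL * after_ball
                + (P_STRIKE + P_FOUL) * after_strike
                + P_IN_PLAY * IN_PLAY)
    # With two strikes, a foul returns to the same count: V = a + P_FOUL * V.
    a = P_BALL * after_ball + P_STRIKE * after_strike + P_IN_PLAY * IN_PLAY
    return a / (1 - P_FOUL)

# Express the 12 count states relative to 0-0.
values = {(b, s): count_value(b, s) - count_value(0, 0)
          for b in range(4) for s in range(3)}

for (b, s), v in sorted(values.items(), key=lambda kv: -kv[1]):
    print(f"{b}-{s}: {v:+.3f}")
```

Since the count can only move forward or stay put, the recursion always terminates, and the only self-reference (the two-strike foul) is handled in closed form.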

The really cool part is trying to infer how a pitcher does at each count WITHOUT having any of the count data.  That’s a bit more complicated, but you can come up with a general scheme for each pitcher in MLB history.


#40    Renè      (see all posts) 2008/02/21 (Thu) @ 19:13

Mike/37: When you talk about classification, are you talking about:
- Analyzing the retrieved raw data from scratch
or
- Properly labeling the pitches after you have performed a cluster analysis upon the data?


#41    Mike Fast      (see all posts) 2008/02/21 (Thu) @ 22:53

Rene, the way I do pitch classification is to query my database for the data on a particular pitcher, import that into Excel, and graph the speed and spin parameters.  Then my eyeball is my cluster analysis tool, either on a season level or usually on a game-by-game level.  After I classify the pitches this way, I check the scouting reports on the web and make adjustments.

Properly naming the pitches is an interesting challenge, too, but that wasn’t what I was referring to in #37.


#42          (see all posts) 2008/02/22 (Fri) @ 12:51

Mike, yes, naming pitches is interesting but I was going to say that it might be futile since defining a slider or a curveball or whatever else is just subjective. What do you call K-Rod’s breaking ball? I called it a slurve because it didn’t really fit with sliders or curveballs, but evidently there are literally hundreds of pitches once you properly break them down. A knucklecurve can be different from a curveball (speed especially?), a circle change can be different from a “straight” changeup. So it is definitely more interesting to group together the different types of pitches a pitcher has. This way when you scout him you can say: he’s got four pitches. Pitch A moves… pitch B moves… etc. Conventional wisdom is going to assign rigid names here and there, but we know that no two fastballs are the same, and the same goes for many pitches. Is Mariano Rivera’s really a cutter? Not judging by the way it moves compared with other pitches, so calling it a cutter might give you a rough idea, but it really doesn’t explain how unique it is. Joe Sheehan determined that it was more similar to LHP’s fastballs or RHP’s sliders!
Anyway, I’m going off track. I also agree that developing an appropriate cluster analysis is a priority. I think it could be done by means of similarity scores depending on the speed and movement of pitches. I know Joe Sheehan and Josh Kalk have already done something along these lines, but I was working on refining their ideas, pitch by pitch. I think Kalk just groups together the two closest pitches and then expands to group the next closest pitches until he’s left with big clusters which are then identified. This has the disadvantage of having too few clusters when there are many different pitches (does Matsuzaka have only three pitches?!?!?) because they get improperly grouped. So a different approach could be used. As I said, I’m working on that along with a friend and we might share ideas. Just drop me an e-mail (you should get it by clicking on my name) since I think we’re starting to go a little off-topic regarding pitch evaluation.
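
For readers unfamiliar with the "merge the two closest, then repeat" idea, here is a generic agglomerative clustering sketch in Python. It is not Kalk's or Sheehan's actual code. The pitch data (speed in mph, horizontal and vertical movement in inches) are invented, the function names are made up, and the stopping distance is an assumed tuning knob added so that merging stops before distinct pitches get lumped together.

```python
import math

def centroid(cluster):
    """Mean of each coordinate across the pitches in a cluster."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def agglomerate(pitches, max_merge_distance):
    """Repeatedly merge the two closest clusters (by centroid distance).

    Stops when the closest remaining pair is farther apart than
    max_merge_distance, rather than merging down to a fixed number of clusters.
    """
    clusters = [[p] for p in pitches]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_merge_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented pitches: (speed mph, horizontal movement in., vertical movement in.)
pitches = [
    (94, -5, 10), (95, -6, 9), (93, -5, 11),   # fastball-like
    (85, 3, 1), (84, 4, 0),                    # slider-like
    (83, -7, 5), (82, -8, 4),                  # changeup-like
]

clusters = agglomerate(pitches, max_merge_distance=4.0)
for c in clusters:
    print(len(c), "pitches around", tuple(round(x, 1) for x in centroid(c)))
```

Mixing mph and inches in one distance is itself a modeling choice; in practice the dimensions would probably need to be scaled before the distances mean much.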


#43    Tangotiger      (see all posts) 2008/02/25 (Mon) @ 12:17

Joe Sheehan comes back with his run values by count:
http://baseballanalysts.com/archives/2008/02/writing_about_t.php

Cool stuff…


#44    Tangotiger      (see all posts) 2008/02/25 (Mon) @ 18:09

The run values per count get an r=.99 using the following, against Sheehan’s run values:

runs
= 0.017*BallsSquared
+ 0.0125*Balls
- 0.01*StrikesSquared
- 0.04*Strikes

You get the following results:
Balls  Strikes  Runs/PA  Regressed
3      0         0.207    0.191
3      1         0.137    0.141
2      0         0.097    0.093
3      2         0.062    0.071
2      1         0.035    0.043
1      0         0.034    0.030
0      0         0.000    0.000
1      1        -0.016   -0.021
2      2        -0.037   -0.027
0      1        -0.043   -0.050
1      2        -0.083   -0.091
0      2        -0.104   -0.120

If you want to get realllly basic, just do:
runs = .06 * (Balls - Strikes)
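
Both formulas are easy to evaluate for all 12 counts. The short Python sketch below uses the coefficients exactly as given above; the function names are made up for illustration.

```python
def fitted_run_value(balls, strikes):
    """Quadratic fit from the comment above (runs relative to 0-0)."""
    return (0.017 * balls ** 2 + 0.0125 * balls
            - 0.01 * strikes ** 2 - 0.04 * strikes)

def basic_run_value(balls, strikes):
    """The 'realllly basic' version: .06 runs per ball minus strike."""
    return 0.06 * (balls - strikes)

print("count  fitted   basic")
for balls in range(4):
    for strikes in range(3):
        print(f"{balls}-{strikes}    "
              f"{fitted_run_value(balls, strikes):+.3f}   "
              f"{basic_run_value(balls, strikes):+.3f}")
```

The basic version is symmetric in balls and strikes, so it gives every even count (1-1, 2-2) a value of zero, whereas the quadratic fit puts those counts slightly in the pitcher's favor.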


#45    David Smyth      (see all posts) 2008/02/25 (Mon) @ 18:25

I’m wondering what is the avg value of a called strike vs a swinging strike. My initial guess is something like -.09 for a called and -.14 for a swinging.


#46    Mike Fast      (see all posts) 2008/02/25 (Mon) @ 18:38

David,
My numbers work out to -0.130 for a swinging strike and -0.066 for a called strike, on average.  That’s not including foul balls or foul tips in swinging strikes, no matter the count.

I’m actually intending to publish these numbers in full as soon as I finish up the last two parts of my series on Bannister.


#47    tangotiger      (see all posts) 2008/02/26 (Tue) @ 08:24

John Walsh published his numbers:

http://www.hardballtimes.com/main/article/searching-for-the-games-best-pitch/


#48    joe p      (see all posts) 2008/02/26 (Tue) @ 13:41

If John is reading this, I had a question about the runs/100 number given.  Is that per 100 pitches of that type (100 fastballs, 100 curveballs) or is it per 100 pitches from that pitcher, given how often he throws his different pitches?  It’s not a big difference, especially with the inclusion of a minimum pitch requirement, but I was just curious. 

Also, this seems like as good a place as any to ask this, but how would you go about finding the random variance for these linear weights values in order to regress them properly?  Is there a trick for finding random variance with a multinomial like there is with a binomial?


#49    Tangotiger      (see all posts) 2008/02/26 (Tue) @ 13:58

The Book, pp. 370-371: “Random Variation in Multinomials”.
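
For anyone without the book at hand, one textbook way to get the random variance of a weighted multinomial is sketched below. It is not necessarily the presentation used on those pages. It assumes each pitch is an independent draw from a fixed set of outcomes, each with a probability and a run value, so the per-pitch variance is E[w^2] - (E[w])^2 and the variance of an average over n pitches is that divided by n. The outcome probabilities and run values in the example are hypothetical.

```python
import math

def per_pitch_variance(probabilities, run_values):
    """Variance of one pitch's run value: E[w^2] - (E[w])^2."""
    mean = sum(p * w for p, w in zip(probabilities, run_values))
    mean_sq = sum(p * w * w for p, w in zip(probabilities, run_values))
    return mean_sq - mean ** 2

def standard_error(probabilities, run_values, n_pitches):
    """Random standard deviation of the average run value over n_pitches."""
    return math.sqrt(per_pitch_variance(probabilities, run_values) / n_pitches)

# Binomial special case: outcomes worth 1 and 0 give the familiar p * (1 - p).
binomial_check = per_pitch_variance([0.3, 0.7], [1, 0])

# Hypothetical outcomes: ball, called strike, swinging strike, ball in play.
probs = [0.36, 0.28, 0.17, 0.19]
weights = [0.05, -0.07, -0.13, 0.03]
se_500 = standard_error(probs, weights, 500)

print(f"binomial check: {binomial_check:.2f}")
print(f"random SD of the per-pitch average over 500 pitches: {se_500:.4f}")
```

Comparing that random standard deviation with the observed spread across pitchers is what gives an estimate of how much to regress.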


#50    joe p      (see all posts) 2008/02/26 (Tue) @ 15:56

Thanks...I must have missed that when I found the shortcut for binomials.


#51    John Walsh      (see all posts) 2008/02/27 (Wed) @ 05:44

Joe/#48

That’s 100 pitches of the given pitch type.

Regressing is a good idea—I wanted to investigate that, but I ended up saving it for a later date.  I actually need to educate myself on how to do it!

