Thursday, December 20, 2007
Fastball, slider, changeup, curveball
My hero of 2007 lays out his landscape for pitch types, counts, batting results. This is the kinda article that you need to print, and refer back to often.
SirKodiak, I was going to respond to your post in the other thread, but I’ll just do it here.
I am definitely in agreement with you. I think there is some value in the analysis being done at this time that is based on very general pitch classification. The reason is that the alternative is basically having no conclusions at all. We haven’t had this kind of data, and if we wait until we completely iron out the pitch classification thing, it will be well into next year before we can undertake studies.
It reminds me a little of DIPS. Was the initial proposition that pitchers have no influence over batted balls completely correct? No. With further research we found exceptions and learned to better quantify the extent to which pitchers have influence over batted balls. If Voros had waited to publish his initial study until he had ironed out all those things, we wouldn’t be nearly as far along. So I am glad that Josh Kalk is getting his player cards out there, even if 50% of the analysis they are generating is noise. I am glad that John Walsh has his platooning article in the Hardball Times annual, even if we have to revisit some of his conclusions later on.
On the other hand, however, it bothers me that many of the other leading PITCHf/x researchers don’t seem concerned by all the problems in the data. There seems to be a general sense that everything PITCHf/x is telling us is gold, and it’s just a matter of presenting the data with the right terminology and setting up the right studies.
It’s hard to resist that when the field for research is wide open, but I definitely see what you see, that we as a group of researchers are currently misclassifying a ton of pitches, and that makes me skeptical of some of the broader studies. I don’t know if I believe John Walsh’s conclusion that BABIP is highest for fastballs. I don’t know that I disbelieve it either; I’m just not convinced that we have the ability to tell that yet from his data.
One of the things I am working toward, albeit very slowly, is getting this data out in the public view so we can have more than a half dozen people tearing it apart to find the warts and get them fixed. Right now it seems like no one knows how to question what John Walsh or Joe Sheehan or Josh Kalk are doing because they can’t reproduce the analysis from scratch for themselves. There are flaws in what we are doing, we need more attention paid to that, and we need the help of the sabermetric community as a whole to refine the analysis. Until we can make the data and the methods public, the PITCHf/x research field is going to suffer.
I am quite optimistic this will happen over time. The progress in the past year has been phenomenal. The analysts and researchers involved are a sharp and generous bunch, and I’ve learned a ton from working with them.
Getting this data out in the public view would be outstanding! Cheers to you. I would love to work on classifying pitches, but for now am working on an old computer that is incapable of doing that. With something so raw, the more people involved, the better the chances that innovation will occur sooner. It would be a great service to the community.
I see your point about the value of getting studies out there, but the caveats need to be prominently displayed.
A couple of thoughts I have had about classifying pitches:
* Broad grouping first by algorithm, then reapplying the grouping algorithm to each group to look for pitches inside that may form separate patterns.
* Perhaps the result of the average pitch should be put in vector form rather than listing horizontal and vertical break separately. It is simple to make the hypotenuse the magnitude and just use trig to find the angle, and I think that would perhaps give viewers a better idea of what the ball is actually doing.
* It would be nice to have a few pitchers / catchers / pitching coaches willing to add their knowledge to what pitches are and are supposed to do.
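The vector form suggested in the second bullet can be sketched in a few lines of Python. This is only an illustration: the function name, the use of inches, and the convention of measuring the angle counterclockwise from the positive horizontal axis are assumptions, not anything PITCHf/x itself prescribes.

```python
import math

def break_to_polar(h_break, v_break):
    """Convert horizontal and vertical break into (magnitude, angle).

    h_break, v_break: break components in the same unit (assumed inches).
    Returns the length of the break vector and its direction in degrees,
    measured counterclockwise from the positive horizontal axis (0-360).
    """
    magnitude = math.hypot(h_break, v_break)  # the hypotenuse
    angle = math.degrees(math.atan2(v_break, h_break)) % 360
    return magnitude, angle

# Hypothetical pitch with 3 units of horizontal and 4 units of vertical break.
magnitude, angle = break_to_polar(3.0, 4.0)
print(f"break = {magnitude:.1f} at {angle:.1f} degrees")
```

Using `atan2` rather than a plain arctangent keeps the angle in the correct quadrant, which matters because arm-side and glove-side break differ only in the sign of the horizontal component.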
I took the baseball season off from any sabermetric reading and found out about pitch f/x after the regular season was over. So that is why I am sounding off a bit late. I watched about 250 baseball games this year, though, focusing on pitchers, their grips, ball movement, etc. It brought some insight with it, and renewed my love for the game as it is played on the field, but I lost hair watching Kip Wells pitch. Has anyone ever seen or done a study of a pitcher’s effectiveness and the amount of time between pitches?
This is an interesting and important debate, one that will shape the direction of analyses over the next 12-24 months with the pitch data.
First we should applaud the likes of Mike (and Josh and John W) who have been willing to make their methods public—it is a great service to the analytic community.
One issue that we should be mindful of before drawing conclusions is that there is still a lot of noise in the data. The system is recalibrated between home stands, which means that even comparing data between starts can be fraught with error. Hopefully in 2008 the data will settle down.
This means that making adjustments to the data isn’t necessarily the right thing to do (except to release position). We just have to live with it at the moment.
I firmly believe that a fully automated clustering algorithm is too unreliable to be of much use at the moment, particularly, as SirKodiak points out, when trying to discern between four seamers, two seamers, cutters and sinkers. Also, different pitchers throw different pitches that we call by the same name. A general algorithm can’t hope to capture those nuances properly.
However, what we can do is what Walsh did in his article at THT the other day and what I did for Burnson’s graphical player, which is to highlight macro-groupings (at the moment I have fastballs, change-ups and breaking balls, i.e. sliders, curves, etc.).
I also feel that the classification of pitches is slightly arbitrary. It is useful from a “what did the pitcher want to throw” perspective, but when working things out like BABIP you want to know a number of parameters: movement, location, speed, and the trade-offs between these parameters.
When examining batters, I would think that what the ball actually did (movement, location, speed) would be most important. That is why I was pushing for location in the other thread.
But when examining pitchers it would seem to me that what pitch was thrown is crucial. I’ve already seen articles talking about what pitch is thrown in what count, how player X’s sinker compares to player Y’s sinker, and if player Z is throwing pitch A too predictably and pitch B not enough. Without a rigorous classification of pitches, it seems that all of those (and many other) studies are pointless.
Mike makes some excellent points—I think he’s right to be skeptical about the results presented by myself and others. Analyzing this data is far from trivial and it’s not easy to get it right. I’m sure researchers are doing their best to get the bugs out, but there are conflicting demands: for example, getting articles written for baseball websites like THT and Baseball Analysts (and others).
So, we have to find an acceptable balance. As for making our methods public, I think that is a good idea. I tried to do that in my latest piece, but again, I’m writing for a general audience, the great majority of whom don’t care how I classify pitches. I should probably have my own blog, where I can publish the dirty details, but so far I haven’t had time for it.
As for the niceties, for example, of cutter vs. slider, I agree with Mike on this, as well. I don’t think we can sit down and write a program to distinguish every type of pitch. We have to start somewhere—let’s start with broad strokes and then refine things as we move forward. I thought I could do a decent job of distinguishing 4 basic pitches: fastball, change, slider, curve, so that’s what I presented last week.
Of course, as we move forward, we’ll learn more about making finer distinctions.
Really fascinating work, John.
One thing that would be interesting to delve into would be the interaction of count and pitch type, as it impacts outcomes. Is BABIP higher for FBs because FBs are thrown more in hitter’s counts, or is BABIP higher in “hitter’s counts” because pitchers must throw more FBs (or some of both)? It would be nice to take only neutral counts and then compare BABIP/SLGBIP for different pitch types, and also to look at the impact of count holding pitch type constant (3-1 FB vs. 0-0 FB vs. 1-2 FB).
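A rough sketch of the two comparisons proposed above, assuming each ball in play (home runs excluded) is already tagged with a pitch type and a count. The field names, the choice of 0-0 and 1-1 as "neutral" counts, and the toy records are all made up for illustration; none of it comes from the actual PITCHf/x data.

```python
from collections import defaultdict

# Assumption: which counts are treated as "neutral" is a judgment call.
NEUTRAL_COUNTS = {"0-0", "1-1"}

def babip_by(records, key):
    """Group balls in play by key(record) and return hits / balls in play."""
    totals = defaultdict(lambda: [0, 0])  # group -> [hits, balls in play]
    for r in records:
        group = key(r)
        totals[group][0] += 1 if r["hit"] else 0
        totals[group][1] += 1
    return {group: hits / bip for group, (hits, bip) in totals.items()}

# Toy balls-in-play records (invented, not real results).
records = [
    {"pitch": "FB", "count": "0-0", "hit": True},
    {"pitch": "FB", "count": "0-0", "hit": False},
    {"pitch": "FB", "count": "3-1", "hit": True},
    {"pitch": "FB", "count": "1-2", "hit": False},
    {"pitch": "CB", "count": "0-0", "hit": False},
    {"pitch": "CB", "count": "1-1", "hit": True},
    {"pitch": "CB", "count": "1-2", "hit": False},
]

# 1. Hold the count roughly constant: neutral counts only, split by pitch type.
neutral = [r for r in records if r["count"] in NEUTRAL_COUNTS]
babip_by_pitch = babip_by(neutral, key=lambda r: r["pitch"])

# 2. Hold the pitch type constant: fastballs only, split by count.
fastballs = [r for r in records if r["pitch"] == "FB"]
fb_babip_by_count = babip_by(fastballs, key=lambda r: r["count"])

print(babip_by_pitch)
print(fb_babip_by_count)
```

With real data the per-cell sample sizes would need checking before reading anything into the differences, since splitting by both count and pitch type thins out the balls in play quickly.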
* *
I know there’s been discussion in other threads regarding how best to present the break data. I think the current convention of comparing V break to a zero-rotation pitch probably works fine for visitors to this site, but is not a good choice for taking this analysis to a broader audience. Showing FBs as having more V break than CBs is profoundly counterintuitive (even leaving aside the misleading notion that balls have positive vertical movement). The same goes for the idea that a sinker has less V movement than a regular FB. I think this confusing presentation will cause many fans to question the legitimacy of this data, and create huge and unnecessary obstacles to communicating outside the sabermetric community.
My suggestion would be to treat the average MLB FB as the benchmark for V break. With the typical FB as your benchmark, other pitches will have negative V break, just as fans naturally think of them. And I think it’s analytically valid as well: a FB is what hitters see about 60% of the time, and what they saw about 90% of the time when they were first learning to hit in Little League and High School. To the extent hitters have a baseline expectation of what a thrown pitch will do, I would think it probably is a fastball of average velocity. Certainly, the expectation must be closer to that than to a zero-spin pitch.
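The re-benchmarking described above amounts to subtracting the average fastball's V break from every pitch. A minimal sketch, assuming each pitch is a dict with a type label and a V break measured against the zero-spin convention; the field names and the sample values are hypothetical, not measured data.

```python
def rebaseline_v_break(pitches, benchmark_type="FB"):
    """Express each pitch's vertical break relative to the average fastball.

    pitches: list of dicts with "type" and "v_break" (relative to a
    zero-spin pitch, in whatever unit the source data uses).
    Returns new dicts with an added "v_break_vs_fb" field.
    """
    benchmark = [p["v_break"] for p in pitches if p["type"] == benchmark_type]
    if not benchmark:
        raise ValueError("no benchmark pitches to average")
    baseline = sum(benchmark) / len(benchmark)
    return [dict(p, v_break_vs_fb=p["v_break"] - baseline) for p in pitches]

# Illustrative values only.
sample = [
    {"type": "FB", "v_break": 10.0},
    {"type": "FB", "v_break": 8.0},
    {"type": "SI", "v_break": 4.0},
    {"type": "CB", "v_break": -5.0},
]

for p in rebaseline_v_break(sample):
    print(p["type"], p["v_break_vs_fb"])
```

Under this presentation the sinker and curveball both come out negative, which matches how fans describe them, while the underlying measurements are unchanged.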
More great stuff from Joe Sheehan:
http://baseballanalysts.com/archives/2007/12/john_walsh_wrot.php
It was good to see that John Walsh pointed out the problem of just generally grouping pitches and analyzing that (all fastballs vs fastball/sinker delineation). Grouping splitters with changeups seems odd, considering pitchers still throw the forkball. It would seem to me that the people doing the vanguard work on pitch f/x data should ‘well define’ each pitch, at least individually if not collectively, before advancing into publishing findings.
As Walsh’s article pointed out, and my example of Felix Hernandez in another thread of this blog (only having sinkers listed, when it is clear that he throws a 4-seamer), numbers will be misleading if too many things are grouped together. I’ve been researching pitches for the last couple years, and while I believe grip and release action is what defines a pitch, it is not practical in this application. In my notes, I have 4 general groups of pitches: fastballs, breaking pitches, changeups, and other.
Fastball: 4 seam, 2 seam, sinker (a subset of 2 seam), cutter, splitter
Breaking pitch: 12-6 curve, curve, slider, slurve, screwball
Changeup: straight change, circle change, forkball, palmball, hybrids
Other: knuckleball, eephus, etc.
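For anyone who wants to apply these four groups to labeled pitches, the list above maps directly onto a lookup table. This is just one way to encode it: the lowercase labels and the "unclassified" fallback are assumptions, and the open-ended entries ("hybrids", "etc.") are left out because the list does not enumerate them.

```python
# The four general groups, encoded as group -> pitch labels.
PITCH_GROUPS = {
    "fastball": ["4-seam", "2-seam", "sinker", "cutter", "splitter"],
    "breaking": ["12-6 curve", "curve", "slider", "slurve", "screwball"],
    "changeup": ["straight change", "circle change", "forkball", "palmball"],
    "other": ["knuckleball", "eephus"],
}

# Reverse lookup: pitch label -> group.
GROUP_OF = {
    pitch: group
    for group, pitches in PITCH_GROUPS.items()
    for pitch in pitches
}

def group_of(pitch_label):
    """Return the general group for a pitch label, ignoring case."""
    return GROUP_OF.get(pitch_label.strip().lower(), "unclassified")

print(group_of("Slider"))
print(group_of("splitter"))
```

Keeping unknown labels as "unclassified" rather than folding them into "other" makes it easier to spot pitches that still need a home.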
There are some issues, such as cutters looking like sliders, but if groups are left too broad there are problems such as:
* 4-seam fastball and sinker have polar opposite movements, and thus results
* 12-6 curve should be equally effective against LHB and RHB, while the modern 10-4 curveball or slurve should be less effective against opposite handed batters
* Circle change has break like a screwball (purpose of which is to be effective against opposite handed batters), while forkballs and some of the hybrid changeups sink, and straight changes should not be expected to have much more movement than a 4 seam fastball
Rigorous analysis of these things (so as to come up with well defined pitches) should come well before any other analysis. Otherwise I see the potential for a lot of analysis that will both have to be undone and perhaps snowball into even more incorrect conclusions.
There is another issue as well, that hopefully someone can conquer, and that is of breaking pitches that do not break (e.g., hanging curves). If they can be put into the proper category, it would greatly enhance the breakdown of each pitcher’s pitches, since they tend to be hit very hard.