Ball_Tracking
Monday, February 06, 2012
I love the idea, off the bat.
But, I don’t like the question. The peripherals already include the diversity. A better thing to ask is whether a pitcher with more diversity performs better than one with less, once you control for fastball speed. (We will end up with a selection bias at the low-speed end, but not at the high-speed end. Well, we might, because we don’t have a measure for strike zone location, which, really, is what pitching is all about.)
Anyway, would love to see more work on this.
Saturday, February 04, 2012
Great stuff available at BrooksBaseball.
It’s interesting to me when we classify things as “Fastball”, when obviously a Moyer fastball and a Strasburg fastball move differently. First, there’s the difference in speed (16 mph), which means that gravity has more time to act on a Moyer fastball (you can see this in the charts a bit further down). The end result is that a Moyer fastball sinks more than a Strasburg sinker.
But then there are also those pitchers who throw a lot of fastballs AND sinkers, and so they will try to differentiate them enough to get value from the two pitches. For example, Strasburg doesn’t throw a sinker a lot, but when he does, the movement is a couple of inches different from his fastball. It’s also two mph slower. Moyer, on the other hand, throws his sinker and fastball at the same speed, but with significantly different movement: 7 inches on the horizontal and 5 inches on the vertical.
And it’s not just Moyer. CC also throws fastballs and sinkers, and the speed difference is only 1 mph, the horizontal is 6 inches and the vertical is 3. Felix throws his sinker and fastball at the same speed, with 7 inches of difference on the horizontal and 3 on the vertical.
Verlander is like Strasburg, in that he rarely throws a sinker, and he gets 3 inches on the horizontal and 4 on the vertical, at the same speed (so, a bit more movement differentiation than Strasburg, but not as much as CC/Felix). And maybe the reason is that his fastball already gets so much movement to begin with. Verlander’s fastball moves as much on the horizontal as Felix’s sinker does. So, he really doesn’t have much room to maneuver there. One might think that Verlander could improve his repertoire by following the Felix/CC model of limiting movement on the fastball, to differentiate it more from the sinker. But, given the success of Verlander/Strasburg, it’s hard to argue that they could actually pitch better.
Anyway, all to say that since a Verlander/Strasburg fastball really lies halfway between a Felix/CC fastball and a Felix/CC sinker, it’s not really helpful to simply classify all of these pitches as “fastball”. It’s really a Verlander-fastball and a CC-sinker and a CC-fastball.
I think it’s perfectly fine to stick with the standard classifications when you treat the pitcher as his own universe. But, when we combine pitchers, it may be more helpful to distinguish pitches based on their movement and speed, rather than on how they clustered for a particular pitcher.
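The fastball/sinker gaps quoted above can be put on one scale. What follows is a minimal sketch, assuming the horizontal and vertical gaps can simply be combined as a straight-line (Euclidean) distance; that combination is an illustrative simplification, not something taken from BrooksBaseball. Strasburg is left out because his gap is only described as “a couple of inches”.

```python
import math

# (speed gap in mph, horizontal gap in inches, vertical gap in inches)
# between each pitcher's fastball and sinker, as quoted in the post.
gaps = {
    "Moyer": (0, 7, 5),
    "CC": (1, 6, 3),
    "Felix": (0, 7, 3),
    "Verlander": (0, 3, 4),
}


def movement_separation(horizontal, vertical):
    """Straight-line distance in inches (an ASSUMED way to combine the axes)."""
    return math.hypot(horizontal, vertical)


separation = {
    name: movement_separation(h, v) for name, (_mph, h, v) in gaps.items()
}

# Print pitchers from most to least separated.
for name, inches in sorted(separation.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {inches:.1f} in")
```

On this scale Verlander’s two pitches sit 5 inches apart, versus roughly 7 to 9 inches for CC, Felix and Moyer, which is the “not much room to maneuver” point in numbers.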
By , 06:48 PM
I downloaded Bill James Baseball IQ onto my iPhone (I don’t think it is available on Android phones, but I’m not sure). Here is the web site for the app on Acta Sports:
http://www.actasports.com/titles/bill_james_baseball_iq_app/
It is pretty cool. You can read a description and see some screen captures on the above site, but basically it allows you to see heat maps and color maps of batters and pitchers (in all combinations, counts, situations, etc.) for K zone, batted balls, pitch type, etc.
Best of all, the app is free! Seems to me that they could have charged for this one, but I know nothing about the best way to make money from apps. It also seems like they could use these graphics more often on TV broadcasts.
Anyway, give it a try and see what you think…
Thursday, February 02, 2012
Good stuff from David.
Tuesday, January 31, 2012
The always affable, generous, and insightful Alan Nathan follows up.
From Alan:
Within the precision of the tracking data, knuckleball trajectories are just as smooth as those of ordinary pitches. Read on to find out how I arrived at this conclusion.
...
With apologies to John Walsh, I conclude that knuckleballs are more like bullets than butterflies.
How about it’s more like bullets fired by a 10-year-old? The kid doesn’t have good aim, he’s jittery when he fires, and if you ask him to hit the same target again, he won’t be able to. So, the pitch follows a smoothish pattern after the fact, but even at release (knowing the release angle, speed, and spin angle), you won’t be able to predict that path.
That about right?
Wednesday, January 11, 2012
Josh takes a look.
I must say, this looks like a bit of a step back from others that I’ve seen. Suppose that an umpire has pitchers that throw a lot of pitches in the middle of the strike zone. The way Josh calculates it, that’s included in his metric.
The way the other guys have done it, they looked for a “contour” of x-inches of width where the called strike to called ball ratio was 1. Basically, the wide pitches and the center pitches tell us nothing about the umpire.
Or, do they? Perhaps, rather than a step back, Josh took a step to the side. If a pitcher is throwing a lot of pitches in the middle of the strike zone, he may be doing it in ANTICIPATION of a smaller strike zone from the umpire. So, if an umpire happens to see a lot of down-the-middle pitches, then that may tell us something. Except... Josh removed any pitches that the batter swung at.
So, I think there are a lot of different considerations here, as to exactly what it is that Josh (or the interested reader) actually wants. I’m not sure whether Josh answered his intended question or some other question. And it’s quite possible that the huge sample size mitigates any of this anyway, since umpires are not paired with pitchers.
Anyway, lots of things to consider, and reformulate…
Thursday, January 05, 2012
Great job by Josh:
On pitches down the middle, the balls that are put into play have, on average, about twice the magnitude of run value as pitches that aren’t put in play. That means for the two to come into equilibrium, you would need to have about 33% of pitches put into play and 66% not put into play. But as discussed earlier, far fewer than 33% of pitches are put into play. This means that, on average, pitches thrown down the middle are good for the pitcher, not the batter.
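The break-even arithmetic in that passage can be checked directly. This is a minimal sketch, assuming (as the quote’s conclusion implies) that balls put in play favor the batter on average and pitches not put in play favor the pitcher; the function name is hypothetical.

```python
def break_even_in_play_share(magnitude_ratio):
    """Share of pitches that must be put in play for the two sides to cancel.

    magnitude_ratio: average run-value magnitude of a ball put in play divided
    by the average run-value magnitude of a pitch not put in play.
    Solves share * magnitude_ratio == (1 - share) for share.
    """
    return 1.0 / (1.0 + magnitude_ratio)


share = break_even_in_play_share(2.0)  # "about twice the magnitude"
print(f"break-even share put in play: {share:.1%}")
```

With a ratio of 2, the break-even share is one-third, so any in-play share below that tilts the average toward the pitcher.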
Merging everything together, we can see this visually:
Wednesday, December 07, 2011
Yup.
Monday, December 05, 2011
Some updates on Fangraphs. Check out the various tabs in there. Tons of good stuff.
And, if you have suggestions, post them here. David is pretty much incomparable in terms of turnaround time of taking suggestions and implementing them.
Wednesday, November 30, 2011
More good stuff from Josh.
But if we have plate discipline metrics from 2010, we also know strikeout rate from 2010. Do these metrics give us any information that the previous year’s strikeout rate does not?
If I run a regression of 2011 strikeout rate on 2010 strikeout rate and 2010 swing area, I find that swing area no longer has significance. I find the same result for O-swing. In other words, these plate discipline metrics are not useful in predicting the next year’s strikeout rate if we already know the previous year’s strikeout rate.
Tuesday, November 22, 2011
A followup to Mike’s terrific piece on the horizontal speed off the bat, this time with the added focus of the vertical launch angle.
There’s actually plenty of info here, and I can’t comment properly until I give it a second read. There’s also something that seems inconsistent, and I’m hoping Mike can set me straight on whether one of the published charts needs to be updated, or my reading skills need to be improved. I’m hoping it’s the latter.
Friday, November 18, 2011
Excellent work, and exactly the kind of thing that is actionable. As noted, it’s more helpful to break up by count, but that’s just one small step away. (And pitch types, natch.) He also did it for pitchers. Just fantastic work.
Wednesday, November 16, 2011
Great stuff from Mike:
Batters have a good deal of correlation between halves of the sample, with a correlation coefficient of r=0.76 with an average of 201 batted balls in each half. That means that we would add 63 batted balls (or about one month’s worth) at league average to the observed average speed for each batter in order to estimate his true skill.
...
Pitchers have fairly good correlation between halves of the sample, though not as good as batters. The correlation coefficient is r=0.48 with an average of 251 batted balls in each half. That means that we would add 269 batted balls (or about three months’ worth for a starter) at league average to the observed average speed for each pitcher in order to estimate his true skill.
Just fantastic stuff, and I’m glad Mike did it, and that he showed the key point, which is the sample size at which r=.50.
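The “add N batted balls at league average” figures follow from the split-half correlation. Here is a minimal sketch, assuming the usual regression-toward-the-mean relationship in which the ballast equals n * (1 - r) / r; the 75 mph batter and the 70 mph league average in the usage lines are hypothetical numbers for illustration only.

```python
def regression_ballast(n, r):
    """League-average observations to add, given split-half correlation r
    observed at n observations per half: n * (1 - r) / r."""
    return n * (1.0 - r) / r


def true_talent_estimate(observed_mean, n_observed, league_mean, ballast):
    """Weighted average of the observed mean and the league mean."""
    weighted = observed_mean * n_observed + league_mean * ballast
    return weighted / (n_observed + ballast)


batter_ballast = regression_ballast(201, 0.76)   # Mike quotes 63
pitcher_ballast = regression_ballast(251, 0.48)  # Mike quotes 269

# Hypothetical batter: 75 mph observed over 201 batted balls, league at 70 mph.
estimate = true_talent_estimate(75.0, 201, 70.0, batter_ballast)
print(round(batter_ballast), round(pitcher_ballast), round(estimate, 1))
```

The batter figure reproduces the quoted 63. The pitcher figure comes out near 272 rather than 269 when the rounded r=0.48 is used, a gap that may simply reflect rounding in the published correlation. Setting the ballast equal to n gives r=.50, which is why that is the key point.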
***
I’m not really surprised by the results. The closer you get to someone’s base physical and mental skills, the fewer observations you need. This is why scouts are so important. And the F/X and Trackman systems are, at their heart, scouting tools.
What we’ve had until recently are outcomes, results, things like OBP and K/PA, etc. What drives OBP and the like are the players’ base skills AND luck. That’s why we infer a player’s base skills by stripping out as much luck as we can figure out. We do this through a Bayesian process (or its equivalent in regression toward the mean). We need a few hundred contacted balls for a hitter, and thousands for a pitcher, in order to strip out that luck and infer the base skill.
Inside a player’s contacted ball skill is not only the horizontal speed off the bat, but placement as well.
Unseen in Mike’s data is what the horizontal speed off the bat really means. Let’s take a pitcher’s fastball speed. We presume that there’s a high degree of correlation in a pitcher’s fastball speed. I have no doubt that if you do a split-half correlation, you’ll get something ridiculous like r=.99 (really, it’s a question of how many nines) for pitchers who throw 1000 fastballs. So, we can ascertain a scouting observation: we can readily and easily ascertain a pitcher’s underlying true fastball speed.
But, what does THAT give us? He throws really hard or really soft. But, that by itself, still doesn’t tell us how EFFECTIVE he is.
The next step is to correlate that particular base skill, that scouting-level observation, into results. And Mike has given us that:
We see that a player who hits the ball at close to 80mph has a BACON of close to .300, while those who hit the ball at close to the league average (70mph) has a BACON of close to .200, and those at the league low (60mph) is just above .150.
I have to say, all those numbers look pretty low. I guess that’s what happens when you have non-linearity. For example, suppose you hit one-third of your balls at under 60mph, another third at 60-80, and the last third at over 80mph. (Numbers for illustration purposes only.) If it’s under 60mph, you get a batting average of .050 to .150, or say around an average of .120. If you hit it between 60-80, it’s .150 to .300, or an average of .220. And above 80mph, it’s from .300 all the way up to .650, for an average of say .500. That gives you an average of .280, for an average of 70mph. As you can see, the overall average for a distribution around 70mph is way above the batting average at the 70mph point.
Anyway, so what I’d like to see is this: create a DISTRIBUTION for each player, centered around his true talent horizontal speed off the bat, and apply the rates from the above chart (or a more smoothed version actually). This way, we can end up with a player’s true talent BACON, if all we know is his horizontal speed off the bat.
THAT will tell us how valuable knowing his horizontal speed off the bat is.
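The distribution idea can be sketched in a few lines. This is a minimal sketch under loudly stated assumptions: the 60/70/80 mph points come from Mike’s quoted chart, while placing the .050 and .650 figures at 40 and 100 mph, joining the points with straight lines, and using a normal distribution with a 15 mph spread are all illustrative choices, not anything Mike published.

```python
import math

# (mph, BACON) anchor points. 60/70/80 are from the quoted chart; the end
# points at 40 and 100 mph are ASSUMED placements of the .050 and .650 figures.
CURVE = [(40, 0.050), (60, 0.150), (70, 0.200), (80, 0.300), (100, 0.650)]


def bacon_at(speed):
    """Piecewise-linear BACON at a given speed, held flat beyond the ends."""
    if speed <= CURVE[0][0]:
        return CURVE[0][1]
    for (x0, y0), (x1, y1) in zip(CURVE, CURVE[1:]):
        if speed <= x1:
            return y0 + (y1 - y0) * (speed - x0) / (x1 - x0)
    return CURVE[-1][1]


def expected_bacon(mean_speed, sd_speed, step=0.1):
    """Average bacon_at over a normal spread of speeds (simple numerical sum
    covering five standard deviations either side of the mean)."""
    start = mean_speed - 5 * sd_speed
    points = int(round(10 * sd_speed / step))
    total = 0.0
    total_weight = 0.0
    for i in range(points + 1):
        x = start + i * step
        weight = math.exp(-0.5 * ((x - mean_speed) / sd_speed) ** 2)
        total += weight * bacon_at(x)
        total_weight += weight
    return total / total_weight


at_mean = bacon_at(70)
over_distribution = expected_bacon(70, 15)  # the 15 mph spread is an assumption
print(f"BACON at 70 mph: {at_mean:.3f}")
print(f"BACON over a distribution centered at 70 mph: {over_distribution:.3f}")
```

Because the curve bends upward, the average over the distribution lands above the value at the 70 mph point, which is the non-linearity effect described above. With a real smoothed curve and each player’s own spread, the same calculation would give his true talent BACON.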
Tuesday, November 15, 2011
trackman_leaders.pdf
Wednesday, October 26, 2011
I like these “differential” graphs, because they save me the trouble of comparing to the league average (though as noted later in the article, it would be better to match on the count). The chart is from the catcher/batter/ump perspective. Pujols gets more pitches (red) low and away, and fewer pitches (blue) up. No surprise of course.
This article was a long time coming, so thanks to Mike for all the hard work on this one:
I ran a regression for all the right-handed batters with at least 630 plate appearances in 2007-2011 that ended on a pitch in the strike zone.
...
With larger sample sizes, the split-half correlation improved somewhat, as expected. However, even with only four zones, much noise remained in the results. Here is the regression equation for right-handed batters:
Zone Performance in Split Half 2 = (0.32 * Zone Performance in Split Half 1) + (0.32 * Performance in Other 8 [Ed note: Mike meant 3 here] Zones in Split Half 1) + (0.36 * League Average Performance).
The correlation coefficient was r=0.46, and the p-values for both input variables were highly significant (<.0001).
With sample sizes from larger zones between 200 and 300 plate appearances in each half of the sample, both the split-half correlations and the statistical significance of the results have improved.
Let’s say the average number of PA per player in the sample is 2000 PA. So, we can say that for someone with 2000 PA, if you want to know how good he is at balls in the top left corner, you take one-third based on his performance in zone A, one-third based on his performance in the other three zones B, C, D, and one-third the league average.
Michael Young for example had OBSERVED TAv of the following: .270 (up and away), .353 (up and in), .198 (low and away), .257 (low and in). If you want his TRUE low-and-away, you would take one-third .198, one-third the other three (.293), one-third league average (whatever that would be… let’s just say it’s .220), to come up with a(n estimated) TRUE TAv of .237.
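The Michael Young arithmetic can be written out as follows. This is a minimal sketch: the function name is hypothetical, the .220 league average is the placeholder from the text rather than a real figure, and the second call simply swaps in the 0.32/0.32/0.36 weights from Mike’s regression equation for comparison.

```python
def true_zone_estimate(zone, other_zones, league, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Blend a zone's observed value, the mean of the other zones, and the
    league average using the given (zone, others, league) weights."""
    w_zone, w_other, w_league = weights
    other_mean = sum(other_zones) / len(other_zones)
    return w_zone * zone + w_other * other_mean + w_league * league


LEAGUE_TAV = 0.220  # placeholder from the post, not a real league figure

low_away = 0.198
others = [0.270, 0.353, 0.257]  # up and away, up and in, low and in

thirds = true_zone_estimate(low_away, others, LEAGUE_TAV)
regression = true_zone_estimate(
    low_away, others, LEAGUE_TAV, weights=(0.32, 0.32, 0.36)
)
print(f"{thirds:.3f} {regression:.3f}")
```

The equal-thirds shortcut gives .237, and the regression weights give .236, so the rounding to thirds costs essentially nothing here.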
Now, I’m thinking we’re going to have some selection bias here. It doesn’t look like Mike controlled for count, and he’s only looking at the very last pitch of the PA. If you know a plate appearance ended on a pitch low and in, it’s possible that you got an out. That may be one reason we see some difference. I don’t know, but we need to control for count, and even after that, I’m not sure that’s enough.
This is a great first step, so I definitely want to encourage more work like this. Just seeing the OBSERVED hot/cold, but not the TRUE hot/cold is a definite hole in (public) sabermetrics right now.
And I think the next step is to treat each pitch in the plate appearance, one by one, rather than just looking at the last pitch of the plate appearance.
Sunday, October 23, 2011
Good stuff.
Thursday, October 20, 2011
Mike did a bang-up job on PITCHf/x in the THT Annual a couple of years ago, and studes has made it available for free for the public (pdf). Tremendous stuff.
There are two other must-haves as well in book form. Dave Allen did one (I don’t remember where), and I think John Walsh or Harry Pavlidis did another. Heck, there might even be more, and I don’t remember.
In any case, thanks to studes for opening up the vault on this one. I’m looking forward to getting the new THT annual. This will be the first one in a while to which I haven’t contributed something. I think I wrote in each of the last 3 or 4.
Tuesday, October 18, 2011
Jeff presents some interesting data.
This is a perfect example of a sampling bias. While it looks like this is a complete population of pitchers, the reality is that MLB pitchers are a sample of all pitchers. Notably, if your fastball speed is below 90 mph, then the only way to be an MLB pitcher is to be able to do something else well (location, movement, other pitches, etc.).
Ultimately, the bias is so strong that the data presented apply only to those pitchers who happen to be in MLB, and you can’t apply them to all pitchers who throw at that speed.