THE BOOK--Playing The Percentages In Baseball


Ball_Tracking

Wednesday, September 28, 2011

Matt “anti-DIPS” Cain

By Tangotiger, 11:21 AM

Josh’s great investigation.

(8) Comments • 2011/09/29 • Sabermetrics, Ball_Tracking, Batted_Ball, Pitchers

Tuesday, September 27, 2011

Catcher Framing Skill, part WOWY

By Tangotiger, 02:59 PM

Mike Fast generously posted his file on Catcher Framing in his tremendous article last week.  That file is the data source of everything I’m about to post here.

Also note that when I present a rate stat, it’ll be “per 75 called pitches”, because there are around 75 called pitches per 9 innings.  Since runs allowed per 9 innings uses the notation RA9, I’ll use CP9 for called pitches per 9 innings.  (That it bears some resemblance to CP30 is either an unfortunate blemish, or a cool byproduct.)

Another note is that turning a called ball into a called strike is worth roughly +.12 runs.  However, the way that Mike has identified the pitches, it’s more like a “probably would have been called a ball” turning into a called strike.  As a result, the run value, using this data source, is going to be more like .08 runs per extra call.
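To make the rate stat concrete, here is a minimal Python sketch of the conversion. The function name and the example catcher's totals are hypothetical; only the 75-pitch denominator and the .08 run value come from the notes above.

```python
CALLED_PITCHES_PER_9 = 75    # the "CP9" denominator
RUNS_PER_EXTRA_CALL = 0.08   # run value per extra call for this data source


def framing_runs_per_cp9(extra_strikes, called_pitches):
    """Runs saved per 75 called pitches, given extra strike calls over a sample."""
    extra_per_cp9 = extra_strikes / called_pitches * CALLED_PITCHES_PER_9
    return extra_per_cp9 * RUNS_PER_EXTRA_CALL


# Hypothetical catcher: 100 extra strike calls over 7,500 called pitches,
# which is one extra call per 75 called pitches.
print(round(framing_runs_per_cp9(100, 7500), 3))  # 0.08
```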

On to the data.

Read More

(34) Comments • 2011/09/28 • Sabermetrics, Ball_Tracking, Fielding

Thursday, September 22, 2011

Changeup pitchers harder to hit

By Tangotiger, 10:47 AM

Great stuff from Josh.  He splits his pitchers into whether they rely on the changeup or not.  And he finds an 11-point difference in BABIP.  He looks at various variables to see if any of them can impact the results.

Wednesday, September 21, 2011

Catcher Framing Skill, part next

By Tangotiger, 09:28 AM

More great stuff on catcher framing, this time from Mike Fast.

***

Mike: what was the average number of pitches caught per catcher, in the 2008+2010 pool, and in the 2009+2011 pool?

The correlation for catcher ratings between the two pools is around r=0.7, depending on where exactly the playing time cutoff is set. That means that we should add about half a season’s worth, or 4500 called pitches, of league-average performance to the observed performance for each catcher in order to get a better estimate of the catcher’s actual skill level.
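A rough Python sketch of the regression described in that passage. It assumes the usual relationship r = n / (n + ballast) between the correlation and the amount of league-average "ballast" to add; the passage does not spell out its method, so treat that formula, the function names, and the example catcher as assumptions.

```python
def regressed_rate(observed_rate, n_pitches, league_rate=0.0, ballast=4500):
    """Shrink an observed rate toward league average by adding `ballast`
    called pitches of league-average performance."""
    return (observed_rate * n_pitches + league_rate * ballast) / (n_pitches + ballast)


def implied_sample_size(r, ballast):
    """Sample size n at which r = n / (n + ballast) holds (assumed relationship)."""
    return ballast * r / (1 - r)


# Hypothetical catcher: +2.0 extra calls per 100 called pitches over 4,500 pitches.
print(regressed_rate(2.0, 4500))               # 1.0, i.e. regressed halfway
print(round(implied_sample_size(0.7, 4500)))   # 10500
```

Under that assumed relationship, r = 0.7 with 4500 pitches of ballast would correspond to roughly 10,500 called pitches per catcher in each pool, which is why the question about average sample size matters.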

(18) Comments • 2011/09/22 • Sabermetrics, Ball_Tracking

Sunday, September 11, 2011

Out-of-synch PITCHf/x?

By Tangotiger, 09:25 AM

UPDATE: based on Straight Arrow reader responses below, it seems that there was a missing pitch originally, which has since been corrected by MLBAM.

Lesson learned: MLBAM is on top of things.

Original post follows:

Read More

(14) Comments • 2011/09/11 • Sabermetrics, Ball_Tracking

Friday, September 09, 2011

One mph twist

By Tangotiger, 08:18 PM

Jeff on Vargas.

(1) Comments • 2011/09/13 • Sabermetrics, Ball_Tracking

Wednesday, September 07, 2011

Where do umpires position themselves?

By Tangotiger, 12:17 PM

Wonderful charts here.  Here’s one of them.  The “in” and “out” designation is where the catcher is positioned.  So we see that the umpire is influenced not only by the batter’s handedness, but also by the position of the catcher.  Mike shows the charts of two other umpires, and you can see that umpires are like snowflakes.

***

He also points out an anomalous PITCHf/x call.

I would like to touch on one other aspect of umpire strike zone evaluation. If you look closely at Estabrook’s plot for right-handed batters, you may notice something strange. Did he really call a strike on that pitch located at 2.96 feet high and 1.87 feet wide of the center of the plate? That is 14 inches off the edge of the plate!

If you believe the PITCHf/x data, then yes, he did, but it might be wiser to believe Estabrook than the data in this case. I reviewed the video of that pitch, and while it does appear to have been outside off the plate, it almost certainly was not 14 inches outside. By no means do I advocate video as an accurate method for judging strike calls. Relying on video leads to all sorts of problems, not least not being able to tell when the ball crossed the plate, even with a straight center-field camera view as we have for the pitch in question here. With those caveats, here is a still frame showing roughly where the ball crossed the plate. From this angle, the pitch appears to be maybe five inches outside.
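The 14-inch figure in the quoted passage can be checked with a little arithmetic. This is a minimal sketch that treats the pitch as a point (it ignores the radius of the ball); the function name is made up for illustration.

```python
PLATE_WIDTH_IN = 17.0  # home plate is 17 inches wide


def inches_off_edge(px_feet):
    """Inches between a pitch's horizontal location and the nearer plate edge
    (0 if the pitch is over the plate). `px_feet` is measured from plate center."""
    return max(0.0, abs(px_feet) * 12.0 - PLATE_WIDTH_IN / 2.0)


print(round(inches_off_edge(1.87), 1))  # 13.9
```

So 1.87 feet from center is 22.4 inches, and subtracting the 8.5-inch half-width of the plate gives the roughly 14 inches cited.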

Sometimes you get calibration issues, and when technology misses, it misses big.  A good example is if you track hit locations of minor league parks.  It’s obvious that one park is hugely shifted.  MLBAM assures me that all the parks work off the same software, but every year I’ve done this, this one park is hugely off.  While most home plates are in the vicinity of x,y = 125, 200 pixels, this one park is more like 165, 230.

The real problem with the calibration on PITCHf/x is that it may not be consistent game to game, or possibly even within a game.  So, what should be systematic biases (which can be easily corrected through post-game analysis) end up looking like possibly random biases.  Knowing that all games are at 165, 230 is more beneficial to me than MLBAM tinkering with the software mid-season to try to get it right (which they haven’t done since last time I checked).
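Here is a sketch of why a consistent bias is the easy case. It assumes the miscalibration is a pure constant offset (no scaling or rotation), which is the simplest reading of the 125, 200 versus 165, 230 example; the function name and the sample hit locations are invented.

```python
REFERENCE_HOME_PLATE = (125, 200)  # typical home-plate pixel location


def recenter(points, observed_home_plate, reference=REFERENCE_HOME_PLATE):
    """Shift every (x, y) so the park's home plate lands on the reference spot."""
    dx = reference[0] - observed_home_plate[0]
    dy = reference[1] - observed_home_plate[1]
    return [(x + dx, y + dy) for x, y in points]


# Made-up hit locations from the shifted park, whose home plate sits at (165, 230).
hits = [(165, 230), (180, 150)]
print(recenter(hits, (165, 230)))  # [(125, 200), (140, 120)]
```

If the offset changes from game to game, there is no single `observed_home_plate` to plug in, and the correction has to be estimated separately for each game.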

Strasburg 2010 v 2011

By Tangotiger, 11:11 AM

PITCHf/x charts.

Monday, August 29, 2011

AJ, relief pitcher?

By Tangotiger, 10:58 AM

Lucas notes:

(2) Comments • 2011/08/29 • Sabermetrics, Ball_Tracking

Friday, August 26, 2011

Reduce speed, improve command

By Tangotiger, 10:35 AM

Great little study, using COMMANDf/x:

It appears there is not much difference in command between zero-strike and one-strike counts, despite the quarter-mph difference in velocity. At two strikes, though, pitchers seem to throw about half a mile harder than average, while losing around an inch of command.

I was more shocked by the table showing that Verlander misses his catcher’s mitt, on his fastball, by 17.4 inches… on average, which is league worst.  The average pitcher misses his catcher’s mitt by 13 inches, and the best pitcher is at 9.2 inches.

I’m not buying it.  At all. 

The second chart shows standard deviations, and Verlander again at the bottom at 11.8 inches (and league average of 7.4).

I think there’s a math error somewhere.  Almost always, the average absolute error is smaller than the standard deviation.  I think the author has all the absolute error numbers as 2x too high.
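A quick check of that relationship, as a sketch. It assumes the errors are signed misses along a single axis, normally distributed around zero; a two-dimensional miss distance would behave differently, and the article's actual error definition isn't known here. For a zero-mean normal, the average absolute error is sqrt(2/pi), or about 0.80, times the standard deviation.

```python
import math
import random
import statistics

random.seed(0)
sigma = 7.4  # inches; the league-average SD quoted above
errors = [random.gauss(0.0, sigma) for _ in range(100_000)]

mean_abs_error = statistics.fmean([abs(e) for e in errors])
sd = statistics.pstdev(errors)

ratio_theory = math.sqrt(2 / math.pi)   # about 0.80
ratio_simulated = mean_abs_error / sd

print(round(ratio_theory, 3), round(sigma * ratio_theory, 1))  # 0.798 5.9
```

Under that assumption, an SD of 7.4 inches goes with an average absolute miss of about 5.9 inches, a bit under half of the 13 inches reported.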

Even then, I’m finding it hard to believe that Verlander is off by one SD = 11.8 inches on his fastball.  If that is what it is showing, then I have to question the use of a catcher’s mitt as a proxy for pitcher intent.  I’d like to see someone look at Verlander, pitch by pitch, and show us how he really missed where he wanted to throw.

Or is the catcher simply always setting himself up in the exact same spot?

(37) Comments • 2011/08/27 • Sabermetrics, Ball_Tracking

Wednesday, August 17, 2011

Everything you wanted to know about Hit Batter locations but were afraid to ask

By Tangotiger, 12:57 PM

Great stuff by Mike.

***

If you are up for Part II Mike, how about the intentional HBP?  “First pitch thrown to the batter following a HR” should comprise a great majority of those.  What is being thrown?  Presumably 95% “fastballs” (though I’d bet they’re going to cluster more like some sort of changeup).  Presumably the location will be butt to lower back?  And are some pitchers more likely to throw an intentional HBP?

(20) Comments • 2011/08/18 • Sabermetrics, Ball_Tracking

Saturday, August 13, 2011

Should you throw a sinker at Coors?

By Tangotiger, 07:25 PM

Interesting charts here:

This is non-Colorado pitchers, and we see they’ve decided to throw fewer sinkers at Coors:

But this is Colorado pitchers, and they’ve decided to throw more sinkers at Coors (though it could be that it’s not the same pitchers in the same proportion in the two groups).

And here’s the piece de resistance, comparing how pitches move at Coors compared to the league overall.  The “eye” is where a pitch would be observed to be thrown if you were playing catch.  So, a MLB fastball looks like it “rises” in a regular park in comparison, but at Coors, it moves “straighter”.


(21) Comments • 2011/08/16 • Sabermetrics, Ball_Tracking, Parks

Friday, August 12, 2011

Jayson Stark Tracer

By Tangotiger, 05:17 PM

He says:

“This guy could always hit a fastball,” one scout said. “But he’d chase so many other pitches, he didn’t get in enough hitters’ counts to get those fastballs. Now he doesn’t chase those pitches. I’ve never seen anything like it. I’ve never seen a player make that change and do it that dramatically.”

So, Bautista doesn’t chase breaking pitches, right?  In 2008-2009 (for whatever games are in the log), he swung at 27% of curveballs, 37% of sliders, and 42% of changeups.  In 2010, he swung at 26% of curveballs, 34% of sliders, and 38% of changeups.  Looks pretty similar to me.

In 2008-2009, he swung and missed on 7% of curveballs, 11% of sliders, and 13% of changeups.  In 2010?  7%, 10%, and 10%.

I don’t see much here.  Perhaps Stark can support his opinion (which he couched by quoting one scout out of 1000 possible scouts) with actual data.  I’m sure one of the PITCHf/x jocks will help him out.

(8) Comments • 2011/08/13 • Sabermetrics, Ball_Tracking

Wednesday, August 10, 2011

What does Josh Tomlin throw?

By Tangotiger, 09:43 AM

In this Tomlin interview, he says:

If I throw 100 pitches in a game, I’ll throw about 10 or 11 curveballs, four or five changeups, 30 or 40 cutters and the rest two- and four-seam fastballs.

What does Gameday and BIS say?  I don’t know, but let’s find out together.

Curveballs are the easiest to distinguish, so hopefully, all three are in agreement.  Tomlin (or rather his memory, or possibly what his pitching coach is counting for him) says 10 or 11, BIS says 15 this year, and Gameday says 16.  Hmmm.. that’s quite odd.  Does Tomlin not know what he’s throwing?  Or is everyone else confused?

There’s also a bit of disagreement between BIS and Gameday.  Why would there be any?  You could start your clustering just on velocity, and you’ll get the majority of curveballs.  And then if you want to distinguish between “fast” curveballs and “slow” changeups, you look at vertical movement.  The tougher one I suppose is to distinguish between a “fast” curveball that doesn’t drop too much and a “slow” slider that has extra drop.  Indeed, Gameday has no sliders recorded, while BIS has 3 recorded per 100.  Tomlin himself says he doesn’t throw any sliders. So my guess is that BIS has taken some Gameday-curveballs and marked them as sliders.

How about changeups?  Tomlin says 4 or 5, BIS says 11, while Gameday says 10.5.  Again, does Tomlin not know what he’s throwing, or are his slow fastballs and sinkers simply being picked up as changeups?  That is, maybe Tomlin intends to throw a slow sinker, but it looks a lot more like a changeup.

He said 30 or 40 cutters. BIS says 28 and Gameday says 30.

Finally, that leaves about 50 fastballs / sinkers.  BIS has 43 (without distinguishing between the two), while Gameday has 34 4-seamers, and 9 2-seamers (total of 43). 
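Putting the three sources side by side makes the gaps easier to see. This sketch just tabulates the per-100 figures quoted above; taking Tomlin's ranges at their midpoints is an assumption made here for convenience, and his fastball count is whatever is left over.

```python
# Pitches per 100. Tomlin's ranges are taken at their midpoints (an assumption).
tomlin = {"curve": 10.5, "slider": 0, "changeup": 4.5, "cutter": 35}
tomlin["fastball"] = 100 - sum(tomlin.values())  # "the rest"
bis = {"curve": 15, "slider": 3, "changeup": 11, "cutter": 28, "fastball": 43}
gameday = {"curve": 16, "slider": 0, "changeup": 10.5, "cutter": 30, "fastball": 34 + 9}

print(f"{'pitch':9s} {'Tomlin':>7s} {'BIS':>7s} {'Gameday':>7s}")
for pitch in bis:
    print(f"{pitch:9s} {tomlin[pitch]:7.1f} {bis[pitch]:7.1f} {gameday[pitch]:7.1f}")
```

The biggest disagreements between Tomlin and the two classifiers sit in the changeup and fastball rows, and they run in opposite directions.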

So, it looks like he has quite a few slow fastballs (according to Tomlin) that are being picked up as changeups (according to Gameday).

My guess is that Tomlin is throwing it the way he says he’s throwing it, but the actual result may be closer to how Gameday clusters the pitches.

All to say that there’s not always a very clear line, and there’s enough uncertainty in pitch classification that both sides (Tomlin and Gameday) can be justifiably accurate.

(8) Comments • 2011/08/11 • Sabermetrics, Ball_Tracking

Thursday, July 28, 2011

Human pitch and computer pitch

By Tangotiger, 11:07 AM

Thanks to Jeff and Trip.

(11) Comments • 2011/08/08 • Sabermetrics, Ball_Tracking

Wednesday, July 27, 2011

El Jered’s BABIP

By Tangotiger, 09:19 PM

Josh gives us a breakdown.

I’m not sure what “league” means, if it’s RHP or all pitchers.  I also think it’s valuable to keep things separate between LHH and RHH.  Nonetheless, good stuff.

(2) Comments • 2011/07/27 • Sabermetrics, Ball_Tracking

Tuesday, June 28, 2011

Clustering and pitch/fx

By Tangotiger, 07:36 PM

Jimmy:

Let’s say you want to identify clusters in two-dimensional data. You can do this using a clustering algorithm such as k-means or soft k-means. In a nutshell, what this does is take an initial set of means (chosen however), evaluate the distance from each data point to each of the means using some distance metric, and then assign each data point to a mean (i.e. the closest mean). Then it re-evaluates the means given the current assignment and steps through the process again, until it converges and you have the data grouped into “k” clusters.
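For readers who want to see those steps spelled out, here is a minimal plain-Python sketch of ordinary (hard) k-means with Euclidean distance. The function name, the starting means, and the movement data are all made up for illustration; this is not how pitch/fx or any particular site classifies pitches.

```python
import math
import random


def kmeans(points, initial_means, max_iterations=100):
    """Plain k-means on 2-D points. Returns (means, assignments)."""
    means = list(initial_means)
    k = len(means)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its closest mean (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, means[j])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each mean moves to the centroid of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return means, assignments


# Made-up movement data (inches): two pitch types with very different break.
rng = random.Random(1)
fastballs = [(rng.gauss(-5, 1), rng.gauss(9, 1)) for _ in range(50)]
curves = [(rng.gauss(6, 1), rng.gauss(-7, 1)) for _ in range(50)]
means, labels = kmeans(fastballs + curves, initial_means=[(0, 5), (0, -5)])
print(means)
```

With clusters this far apart the loop settles after a couple of passes; real pitch data overlaps much more, which is where the choice of starting means and distance metric starts to matter.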

So this helps with grouping the data points, but let’s say you wanted to go a little bit further. What you can do is run the initial algorithm to find the means and cluster assignments, and then impose the assumption that each cluster is distributed around its mean (which you just found) according to a bivariate normal distribution. Then you use maximum likelihood (ML) to find the variance parameters of the bivariate normal for each cluster, which may vary for each cluster. You can assume different variance in each direction to account for clusters that aren’t spherical. Then once you have those parameters, you have the variance of each cluster.
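As a sketch of that second step: for a bivariate normal, the maximum-likelihood estimates are simply the sample mean and the sample covariance with a divisor of n (not n - 1). The function name and the toy cluster below are hypothetical, and the cluster assignment is assumed to have been done already.

```python
def fit_bivariate_normal(cluster):
    """Maximum-likelihood mean and covariance matrix for a list of (x, y) points."""
    n = len(cluster)
    mean_x = sum(x for x, _ in cluster) / n
    mean_y = sum(y for _, y in cluster) / n
    # ML estimates divide by n; the off-diagonal term lets the cluster be tilted.
    var_x = sum((x - mean_x) ** 2 for x, _ in cluster) / n
    var_y = sum((y - mean_y) ** 2 for _, y in cluster) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in cluster) / n
    return (mean_x, mean_y), ((var_x, cov_xy), (cov_xy, var_y))


# Toy cluster: four points on the corners of a square.
mean, cov = fit_bivariate_normal([(0, 0), (2, 0), (0, 2), (2, 2)])
print(mean, cov)  # (1.0, 1.0) ((1.0, 0.0), (0.0, 1.0))
```

Running this once per cluster gives a separate variance in each direction for each pitch type, which is the quantity being compared across pitches and pitchers.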

To relate this to baseball, assume the two-dimensional data we have is horizontal and vertical pitch movement, and assume that the pitcher in question has three pitches: 4-seam FB, slider, and a curve. Presumably these three pitches will form three distinct clusters when graphed. We run the k-means algorithm to identify which pitch is which (i.e. assign clusters), and then we fit each cluster to a bivariate normal distribution by ML. Then we have the variance of each cluster. Then we can compare the variance (i.e. the consistency) of each pitch’s movement relative to the other pitches, or compare it amongst pitchers with the same type of pitch. And we can track it from game to game, season to season, etcetera, so that we can say that “oh, Erik Bedard’s control of his CB has really improved this season relative to last” with some quantitative oomph rather than with simple visual evidence.

And there are a lot of other advantages to this too besides just getting the point estimate of the variance. We can also get the variance of the point estimate itself to quantify how accurate we think our estimate of that variance is. We can use the bivariate fit in real time, with Bayesian updating to improve the accuracy of the pitch/fx system itself (in identifying pitch type). There are a lot of places to go from here.

I also hear you on the problem with noisy data. That is a universal issue, but there exist a lot of ways to deal with it. I’ve heard of people transforming the data with principal components analysis first (which is a sort of clustering algorithm in itself… kinda) and then running the k-means on the transformed data to get better clustering fits. And lots of other improvements upon the plain vanilla k-means algorithm to deal with tough data. I’m sure there is literature on this stuff somewhere… but I should really shut up because I don’t understand the pitch/fx system too well.

If you’re feeling adventurous, I recommend chapters 20 and 22 of this book as an intro to the stuff I’m talking about: http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/

(5) Comments • 2011/06/29 • Sabermetrics, Ball_Tracking, Statistical_Theory

Friday, June 10, 2011

Catcher Framing Skill

By Tangotiger, 09:41 AM

Fantastic article by Max:

Let’s do some back-of-the-envelope calculations. A top catcher at framing pitches, such as Russell Martin, improves the chances of a borderline pitch being called a strike by roughly 20 percent. Since the difference in run value between a ball and a strike has been estimated around 0.13 runs (see the References and Resources section at the end), a skilled catcher might be worth 0.026 runs on a single borderline pitch.

In the data used for this analysis, each game contributes on average four pitches. That is, there are four borderline pitches per game on the outside corner. Let’s conservatively guesstimate a total of eight when we add the inside part and the upper and lower border of the strike zone. Then divide by two, to apportion the borderline pitches between the two teams playing the game. That makes four uncertain pitches a game where the catcher can make the difference.

Martin caught 97 games in 2010. Multiply that by four pitches and by 0.026 runs and you get 10 runs in a limited number of games.
According to this analysis the top catchers can win a ballgame per season (even playing fewer than 100 games) only with the skill of framing pitches.
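The back-of-the-envelope numbers in that excerpt are easy to reproduce. Every input below comes from the quoted passage; only the variable names are new.

```python
strike_prob_gain = 0.20      # extra chance a borderline pitch is called a strike
ball_strike_run_gap = 0.13   # run value difference between a ball and a strike
borderline_per_game = 4      # borderline pitches per team per game, per the estimate
games_caught = 97            # Martin, 2010

runs_per_pitch = strike_prob_gain * ball_strike_run_gap
season_runs = games_caught * borderline_per_game * runs_per_pitch

print(round(runs_per_pitch, 3), round(season_runs, 1))  # 0.026 10.1
```

Ten runs is roughly one win, which is where the "win a ballgame per season" line comes from.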

If you think that’s a lot, I’m with you.

Great stuff all-around.

(18) Comments • 2011/07/08 • Sabermetrics, Ball_Tracking

Thursday, June 09, 2011

Estimating Groundball Rates, based on velocity, trajectory, movement, location

By Tangotiger, 11:35 AM

Great job by Josh.

He points out the issue with merging handedness into one chart.  Things are clearer if you do 4 charts (hand pitcher x hand batter, or at least two: with and without the platoon advantage).  That would probably help for this chart:

For 100mph pitchers, GB rate is very high if you face a RH batter, but very low if you face a LH batter.  Except I presume you have a disproportionate number of pitchers who are righties who throw 100mph.  So, this chart can just as well say that if you have same-handed players, you get lots of GB at 100mph, but if it’s opposite-handed players, you get very few GB at 100mph.

(Note: The second chart has the y-axis label wrong.)

But, forget that negativity.  Just a great article all-round.  Big thumbs up.

(7) Comments • 2011/06/09 • Sabermetrics, Ball_Tracking

Wednesday, June 01, 2011

Establishing a batter-specific strike zone

By Tangotiger, 11:19 AM

Mike proposes different solutions of varying complexities (easy to hard).

(4) Comments • 2011/06/01 • Sabermetrics, Ball_Tracking