THE BOOK--Playing The Percentages In Baseball


Tuesday, March 31, 2009

HZR, UZR, PZR: the convergence of scouting and performance

By Tangotiger, 01:07 PM

Hey Tom, I’m a frequent reader of yours, and I had been wondering about a possible new stat derived from the vector data that UZR uses. Could we use batted balls to get an offensive stat that takes out great defensive plays? Take the example of the medium-speed, straight-at-the-shortstop ball that is converted into an out 95% of the time (I don’t recall the exact numbers): Derek Jeter makes the play and gets a UZR credit of +0.05. Could we give the batter a score of 0.05 and then multiply by a run-valued linear weight? Let’s say the ball is 95% out and 5% single; the batter would then get 5% of .47 (what I think is the run value of a single), and then we would take an average. Wouldn’t that take luck out of the stat?
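The arithmetic in the question can be sketched in a few lines of Python. This is only an illustration of the reader's proposal, not an established metric: the 95% conversion rate and the .47 run value of a single are the reader's recalled figures, the function names are hypothetical, and the out is given a run value of zero only because the question's arithmetic does so.

```python
def expected_run_value(p_out, single_value=0.47, out_value=0.0):
    """Outcome-independent credit for one batted ball.

    p_out is the rate at which this type of ball is converted into an out.
    single_value and out_value are assumed run values (see the text above);
    every ball that is not an out is treated as a single, as in the question.
    """
    if not 0.0 <= p_out <= 1.0:
        raise ValueError("p_out must be between 0 and 1")
    p_single = 1.0 - p_out
    return p_out * out_value + p_single * single_value


def average_credit(p_outs):
    """Average the per-ball credit over a batter's batted balls."""
    if not p_outs:
        raise ValueError("need at least one batted ball")
    credits = [expected_run_value(p) for p in p_outs]
    return sum(credits) / len(credits)


# The ball from the question: converted 95% of the time, so the batter is
# credited with 5% of a single whether or not the shortstop made the play.
credit = expected_run_value(0.95)
```

For the ball in the question, the credit works out to 0.05 × .47 = 0.0235 runs, and the batter receives it regardless of what the fielder actually did.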

What is the model?  The model is a set of humans, who each have their own unique skills, each of whom is faced with a different set of circumstances at different frequencies, with a changing environment (weather, park, umpires, how they feel).  Ideally, we’d put each human under all the possible combinations of parameters, and make him face each context one million times, all in the same day.  That is the model.

What we have instead is 4 PA a day, of a set of known and unknown variables.  That’s the performance data.

What we also have is how observers see those players, in-game or out.  That’s the scouting data.

The idea is to create a model that is complex and comprehensive enough to make both performance and scouting data obsolete.  We’re not there, and we’re not going to be there.  But, the plan is to work toward getting there as much as possible.  PITCHf/x, HITf/x, and FIELDf/x (or similar systems) are the gold we’ve been looking for.

Imagine knowing exactly when Ryan Howard starts his swing when he faces Johan Santana, when he expects a fastball and gets a curve, or when he expects the pitch inside and it goes outside.  It will become important to know not only the angle at which the ball comes off the bat on that particular pitch, but also the angle at which it normally comes off under those conditions.  We’re not necessarily going to care exactly where the ball ended up, and we won’t necessarily care what happened when the ball and bat met. What we may end up caring about the most is the exact millisecond prior to impact: given all the effort exerted by Santana and Howard, in the amount, direction, and timing of that effort, what should have happened?


#1 MGL 2009/03/31 (Tue) @ 15:27

The data that UZR and other defensive systems use can be used to tweak offensive evaluation and models that estimate true, long-run offensive talent, just as they are used to create these on defense, but…

If you rely exclusively on such a model for offense, you will run into problems, because we don’t have granular enough data on batted balls.

Let me give a few examples which will explain what I mean:

Even though the data tells us the speed of each batted ball (slow, medium, or hard), the “hard” ground balls hit by Juan Pierre are not the same as the “hard” ground balls hit by Ryan Howard. So, if we used the same methodology we use for defensive metrics to evaluate offense, we would likely be underrating Howard and overrating Pierre on ground balls (Howard will have more ground balls go through the IF because he hits them harder, but the “system” will assume that all ground balls of a particular speed by both players should be fielded the same).  The same thing is true of air balls.

Another thing is player speed. Player speed is an important factor in things like IF hits and whether outfield hits are singles, doubles, or triples.  Again, if we use the same methodology we use for defense, the “system” will assume that all players are of the same speed.  Again, a ground ball by Pierre is NOT the same as a ground ball by Yadier Molina because Pierre will beat many more of them out for a hit.  So all of a sudden that ground ball to Jeter that gets fielded 95% of the time is NOT really 95% for players of different speeds.
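A small Python sketch makes the size of this problem concrete. Every number in it is invented for illustration, not measured: it assumes three hypothetical speed groups of batters whose ground balls in the same bucket (medium speed, straight at the shortstop) are converted at different rates that happen to average out to 95%.

```python
# Hypothetical out rates on one bucket of ground ball, split by batter speed.
# These figures are made up for illustration; they are not real data.
out_rate_by_speed = {"fast": 0.92, "average": 0.95, "slow": 0.98}

# Hypothetical share of the bucket's balls hit by each group.
share_of_balls = {"fast": 0.25, "average": 0.50, "slow": 0.25}


def bucket_out_rate(rates, shares):
    """Out rate for the whole bucket, which is all a defensive-type system sees."""
    return sum(rates[group] * shares[group] for group in rates)


def hit_credit_error(rates, shares):
    """Per-ball error in hit probability when every batter gets the bucket rate.

    Negative means the system credits the group with fewer hits than it
    really earns on this type of ball; positive means more.
    """
    bucket = bucket_out_rate(rates, shares)
    return {group: (1.0 - bucket) - (1.0 - rates[group]) for group in rates}


errors = hit_credit_error(out_rate_by_speed, share_of_balls)
```

Under these assumed numbers the bucket is still a 95% ball overall, yet the fast group is shortchanged by about three hits per hundred such balls and the slow group is over-credited by the same amount.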

Same thing with fielder positioning.  The actual results of batted balls that go into traditional offensive metrics (out, single, double, triple, reached on error) include the positioning of the fielders.  A defensive-type system would not.

Etc.

Of course, all of these things could be accounted for, though not perfectly.

So, we are probably better off in the long run just using what actually happens for an offensive evaluation system.  In the short-run, we definitely could improve offensive evaluation systems by incorporating the types of things that we use in defensive systems like UZR.  Lots of guys are already doing that (like the THT guys) in their offensive projection systems - for example, incorporating line drive rates and things like that…

