Tuesday, June 26, 2007
Inferring Injuries
When you establish the true talent level of a player (or create a forecast, which is essentially the same thing), you want to know whether he is healthy. And if he’s not healthy, how unhealthy is he, and how persistent is the problem? So, you try to infer such things. If a player plays 159, 160, 154, 159, 161, and then 112 games in successive seasons, his OPS+ never dropped below 133 at any point in that stretch, and he is 27 years old at that point, we infer an injury. We don’t need to look more closely at the situation, though it could have been a case of someone even better usurping his playing time. There is always some uncertainty. And if the player is 37 instead of 27, we may be more inclined to infer a longer rehab period. But we still don’t know what kind of injury it was, because the data doesn’t tell us much more.
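To make that kind of inference concrete, here is a minimal sketch. It flags any season where games played falls well below the player's own prior norm. The function name, the 80% cutoff, and the three-season minimum history are all assumptions chosen for illustration, not an established method, and a flag only suggests an injury (it can't rule out lost playing time).

```python
from statistics import median

def flag_possible_injury(games_by_season, drop_fraction=0.8, min_history=3):
    """Return indices of seasons whose games played fall well below the
    player's prior median. Both thresholds are illustrative assumptions."""
    flagged = []
    for i in range(min_history, len(games_by_season)):
        # Baseline is the median of all seasons before season i.
        baseline = median(games_by_season[:i])
        if games_by_season[i] < drop_fraction * baseline:
            flagged.append(i)
    return flagged

# The games-played sequence from the example above.
games = [159, 160, 154, 159, 161, 112]
print(flag_possible_injury(games))  # -> [5]
```

Age and performance level would feed into how long a layoff you assume after a flag; this sketch only covers the games-played part.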
John Walsh shows us the data for Curt Schilling. Now, we don’t need to infer whether his performance was a matter of balls falling in for hits, or whether his true talent level was markedly different. The data removes that uncertainty. Depending on the nature of the ailment, we’ll be able to either discount the data from this performance more, or place a greater premium on it. We’re always looking for the establishment of a new talent level, as opposed to randomness creating noise around the data. It’s data like this that we need.
And for MLB teams that are not doing this... are you kidding me? What Walsh, Fox, Beamer, Sheehan, Appleman, et al. are doing is the cutting edge of sabermetrics, the point where performance and scouting converge. This is the pot of gold that is being prospected.
***
Further research would go into the “mix” of pitches, and the “strategy” of pitches, based on the game state (inning, score, base, out) conditions… i.e., Leverage Index.
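As a rough sketch of what a first pass at the “mix” question might look like, the snippet below tabulates pitch-type shares separately for low- and high-leverage situations. Everything here is an assumption for illustration: the function name, the 1.5 cutoff for “high” leverage, and the sample pitches, which are invented and not real data.

```python
from collections import Counter, defaultdict

def pitch_mix_by_leverage(pitches, high_cutoff=1.5):
    """pitches: iterable of (pitch_type, leverage_index) pairs.
    Returns {bucket: {pitch_type: share}}. The cutoff is an assumption."""
    counts = defaultdict(Counter)
    for pitch_type, leverage_index in pitches:
        bucket = "high" if leverage_index >= high_cutoff else "low"
        counts[bucket][pitch_type] += 1
    mix = {}
    for bucket, counter in counts.items():
        total = sum(counter.values())
        mix[bucket] = {p: n / total for p, n in counter.items()}
    return mix

# Invented sample, purely to show the shape of the input and output.
sample = [
    ("fastball", 0.5), ("fastball", 0.7), ("splitter", 0.9),
    ("fastball", 2.0), ("splitter", 2.5), ("splitter", 1.8), ("splitter", 3.0),
]
print(pitch_mix_by_leverage(sample))
```

A fuller treatment would also look at pitch sequencing within the plate appearance, which a simple tabulation like this ignores.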
***
The data itself also has a certain amount of uncertainty, as is easily seen with David Wells, who has a bunch of pitches recorded as being released from the wrong side of the mound (four feet from where they should be).
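One simple way to catch readings like that is to compare each pitch's horizontal release position against the pitcher's own median and flag anything too far away. This is a minimal sketch under stated assumptions: the function name, the two-foot tolerance, and the sample values are all invented for illustration and are not Wells's actual data.

```python
from statistics import median

def flag_release_outliers(release_x_feet, max_offset_feet=2.0):
    """Return indices of pitches whose horizontal release position is more
    than max_offset_feet from the pitcher's median. The tolerance is an
    illustrative assumption."""
    center = median(release_x_feet)
    return [
        i for i, x in enumerate(release_x_feet)
        if abs(x - center) > max_offset_feet
    ]

# Invented horizontal release positions in feet; two sit about four feet off.
sample_x = [2.1, 2.3, 1.9, -2.0, 2.2, 2.0, -1.8]
print(flag_release_outliers(sample_x))  # -> [3, 6]
```

Whether flagged pitches should be dropped or corrected depends on what caused the bad reading, which the numbers alone won't tell you.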