THE BOOK--Playing The Percentages In Baseball


Friday, February 05, 2010

What do you forecast?

By Tangotiger, 12:54 PM

Someone sent me an email about how to handle forecasting negative WAR since every year, we see negative WAR being actually generated.  Here is my response:

There is a difference between OBSERVED and TRUE.

If Garret Anderson and Jacque Jones and a host of other players are all TRUE replacement level, and if you give them each 600 PA: guess what happens?  Some will have a 2 WAR, some will have a 1 WAR, some will have a 0 WAR, some will have a -1 WAR and some will have a -2 WAR.

Overall, as a group, these players will have a 0 WAR.  And that’s because we KNOW that they were TRUE replacement level.  So, if you start with the idea that you know someone is a true 0 WAR, then it’s irrelevant what we will observe, just as it is irrelevant, once you know you have a truly fair coin, whether you observe 60 heads in 100 flips or 30 heads in 100 flips.
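The OBSERVED versus TRUE distinction can be sketched with a small simulation. This is only an illustration: the function names, the seeds, and especially `noise_sd` (the assumed spread of observed WAR around true talent over roughly 600 PA) are hypothetical choices, not figures from the post.

```python
import random

def observed_heads(n_flips, p_heads=0.5, rng=None):
    """Count heads in n_flips tosses of a coin whose TRUE rate is p_heads."""
    rng = rng or random.Random()
    return sum(rng.random() < p_heads for _ in range(n_flips))

def simulate_observed_war(n_players, true_war=0.0, noise_sd=1.0, seed=0):
    """Draw one observed season per player around the same true talent.

    noise_sd is an assumed spread of observed WAR over ~600 PA, chosen
    only for illustration; it is not an estimate from the post.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_war, noise_sd) for _ in range(n_players)]

# A fair coin still produces uneven counts in any one run of 100 flips.
coin_rng = random.Random(42)
heads_counts = [observed_heads(100, rng=coin_rng) for _ in range(10)]

# Many players who are all TRUE 0 WAR: individual seasons scatter,
# but the group average sits near the true value.
observed = simulate_observed_war(n_players=10_000)
group_mean = sum(observed) / len(observed)

print("heads per 100 flips:", heads_counts)
print("best and worst observed WAR:", round(max(observed), 1), round(min(observed), 1))
print("group mean observed WAR:", round(group_mean, 2))
```

Individual seasons land well above and well below zero, yet the forecast for every one of these players would still be the true rate of 0.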

And when you forecast players, it would be insane to give PA or IP > 0 for players who are below replacement level.  Therefore, by definition, the lowest (true) WAR you can give someone is 0 WAR.

We are not trying to forecast observations.  We are trying to establish the (unknowable) true rate.  And your forecast must equal the true, just as you would ALWAYS forecast a coin to come up heads 50% of the time, regardless of how many observations you have seen or are about to see.


#1    Rally      (see all posts) 2010/02/05 (Fri) @ 14:25

I have come to hate the term projections.  Fans want to credit us with Nostradumbass psychic abilities when we get them right.  Critics want to burn us at the stake when we get things wrong.  We can’t predict the future.  Nobody can.  If they claim to, they are liars.

A more accurate term would be “Estimates of Current Talent Level”.

But I realize the value of a quick, one word term, and am not going to change my site to baseballestimatesofcurrenttalentlevel.com


#2    MGL      (see all posts) 2010/02/05 (Fri) @ 14:53

"Estimated future true talent level” since a projection is generally an estimate of the average true talent level for the entire upcoming year.

Tango, obviously I agree with your post, but I’m not sure why you would say this:

And when you forecast players, it would be insane to give PA or IP > 0 for players who are below replacement level.  Therefore, by definition, the lowest (true) WAR you can give someone is 0 WAR.

Every forecaster projects playing time based upon either a “blind” weighted average of past playing time or an estimate/judgment as to what they think the player’s GM/manager will give them.  In both cases, a player with a negative WAR could easily get a playing time forecast.

The obvious example is a player on a bad team (often with bad management) who has a negative WAR projection but either the team is more optimistic as to their expected performance, they have little to no concept of replacement value, and/or they don’t have many options in the minor leagues and they don’t want to trade for or acquire a replacement player from another team.

IOW, it would not be uncommon and certainly not “insane” to expect a player with a negative WAR (according to a forecaster, which may or may not be particularly accurate) to get some playing time.

On top of everything else, if I project a player to have a -.5 WAR, for example, since there is uncertainty in my projection, if it turns out that my player is actually a 0, .5 or 1 WAR player, he certainly may get some playing time.  It’s not like all teams say to themselves, “Hmmm. Chone has so-and-so projected at -.5 WAR. We’re certainly not going to have this guy on our major league roster.” As well, if this kind of player starts out on the MLB roster and plays well for a while, even if he is a below replacement level player, he is going to get some playing time, at least until he starts to revert to his true level.

Forecasters don’t project playing time in their forecasts according to what THEY would do if they owned a team, do they?  They are trying to anticipate what a team will give a player, right? So why would you say that it would be insane to project a player with ANY playing time if he is projected to be a below replacement level player?  Not to belabor the point or anything. ;)


#3    Toffer Peak      (see all posts) 2010/02/05 (Fri) @ 15:01

"And when you forecast players, it would be insane to give PA or IP > 0 for players who are below replacement level.  Therefore, by definition, the lowest (true) WAR you can give someone is 0 WAR.”

Doesn’t this assume that:

1. the WAR model and all of its components are accurate and account for all player value (including baserunning, catcher defense, etc.), and

2. that the people who make playing time decisions (the GM and Managers) are knowledgeable about WAR and have perfect understanding of each player’s true talent level?

After all, what would you predict Craig Monroe’s WAR to be in 2010? (http://www.fangraphs.com/statss.aspx?playerid=1464&position=OF#value)

Or Delmon Young? (http://www.fangraphs.com/statss.aspx?playerid=2140&position=OF#value)


#4    Tangotiger      (see all posts) 2010/02/05 (Fri) @ 15:15

"Estimates of Current Talent Level”

Close, but not good enough.

After all, we are not estimating Troy Tulowitzki’s current talent level, but the performance that would match his current talent level, given the environment he will find himself in.

***

MGL, if we look at the current fan forecasts (which are likely optimistic to begin with):

http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=fan

We do have eight nonpitchers with a forecast of negative WAR, with Mike Jacobs, Jose Guillen and Yuniesky Betancourt leading (trailing?) the way, for a total of -3.5 wins across these 8 players.  None of the pitchers is forecast for negative WAR.

Given that there’s about 1000 wins above replacement for the whole league, I don’t know that we need to concern ourselves with the possibility that the Royals will be playing guys who are below replacement level (according to the Fans). 

So, I would say that it’s closer to insanity to forecast someone in MLB with below 0 WAR than to try to worry about making sure that you properly account for bad players being given too much playing time. So, I will end up forecasting 1003.5 WAR instead of 1000.

I mean, you have to presume that when Betancourt and Gary Matthews are being given substantial playing time, it will be because their “actual” true talent level is higher than their “fan-perceived” true talent level.  Even though, we can never know, since we’re limited to our observations of their performance.

In my personal WAR forecasts, the only player I have worse than a -0.5 WAR who might see big playing time is Garret Anderson.  If a team makes him a starting OF, then, yeah, ok, forecast a negative WAR for him. 

Betancourt, Bloomquist, Guillen, all at -0.5 for me, I’d rather forecast them at 0, on the idea that they are much better than I think, or their team (Royals in this case) will make the right call.

If you want to make an exception for the 30th team that they will end up playing guys below replacement level because they don’t have enough talent in their farm system, then, ok, I’ll relent there.

But, as a rule, let’s just call the idea of forecasting someone at less than 0 WAR as insane.


#5    Tangotiger      (see all posts) 2010/02/05 (Fri) @ 15:20

Remember what WAR is:

WAR = WARper162GP times Games Played

Monroe should have 0 games played, so his WAR is zero.

And Young, if he’s that bad a fielder, should be a DH, and if he’s a league average hitter, then that means WAR = 0.
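As a minimal sketch of that identity, assuming "WARper162GP times Games Played" means a per-162-game rate scaled by the share of a 162-game season actually played (the function name and the sample rates are hypothetical):

```python
def season_war(war_per_162, games_played):
    """Scale a per-162-game WAR rate to the games actually played.

    Assumes the rate is expressed per 162 games, so a full season
    returns the rate itself and zero games returns zero.
    """
    if games_played < 0:
        raise ValueError("games_played cannot be negative")
    return war_per_162 * games_played / 162

# A below-replacement rate contributes nothing if the player never plays.
no_playing_time = season_war(war_per_162=-1.5, games_played=0)

# A true replacement-level rate is 0 WAR however much he plays.
replacement_regular = season_war(war_per_162=0.0, games_played=150)
```

Under this reading, a negative season total requires both a negative rate and playing time above zero.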


#6    Duke      (see all posts) 2010/02/05 (Fri) @ 15:45

“Betancourt, Bloomquist, Guillen, all at -0.5 for me, I’d rather forecast them at 0, on the idea that they are much better than I think, or their team (Royals in this case) will make the right call.”

Why would you do that? Predicting playing time is a completely separate issue from predicting performance, and the idea that every team will refuse to play anyone whose true talent WAR dips below 0, despite all evidence to the contrary, strikes me as way more insane than just using whatever the projected performance result is and the most reasonable guess as to who will get playing time. To do otherwise just ignores the best evidence of who will actually play and gives the 0 WAR point, which is just an estimation, a sort of magical quality.

Should you bump up a 2008 projection of GMJ because he got paid all that money so he must be better than we think?

Should you project Betancourt the same as some legitimate 0 WAR guy when he is worse and is certain to play?


#7          (see all posts) 2010/02/05 (Fri) @ 15:59

To echo some of the above sentiments, I don’t believe that projections have to exist as though scouts and management are competent or fully rational.  To go back to the Betancourt example, it’s pretty apparent that the Royals think that he’s at least a replacement level player.  When I’m asked to make playing time predictions, I don’t necessarily care about that particular player’s on field performance.  I care about injury risk, roster-able alternatives, and behavior of management (including front office moves).  I project Betancourt to put up worse than -1 WAR because I project him to stay healthy, I project the Royals fail to find a ‘free’ comparable alternative, and I project that management will stick with him.  If I’m confident he’s going to play and I’m confident his true talent level is below replacement, then I see no way to avoid a negative projection.

I realize it’s dangerous to rely on a single anecdote for an example, but it’s the most glaring example out there.


#8    Rally      (see all posts) 2010/02/05 (Fri) @ 16:12

Picky today.  How about:
“Estimates of Talent Level for the Upcoming Season in Player’s League/Ballpark Environment”

It’s going on my front page.  Might change “upcoming” to “ongoing” after the season starts, but not ready to promise anything.


#9          (see all posts) 2010/02/05 (Fri) @ 16:12

Tango (or MGL or Rally), how “firm” do you consider the replacement level to be at the team level?  WAR as a league wide unit is all well and good, but for each individual team, isn’t the replacement level rather variable?  I’ve always considered a given team’s replacement level to be the second best player at a position.* Or the 6th best SP for pitchers.  In the cases where a free agent or easily acquirable talent is better than the second player, then I would call him replacement level.

To continue using the easy anecdote, as bad as Betancourt is, for the Royals he may well be more valuable than enticing one of the better shortstops on the market with the kind of contract required to get a guy to go to a bad team.  Maybe they can get a similar player for little more than a guaranteed contract, but it’s dubious that they can do better than Betancourt at a cost that would be reasonable to the Royals.  Maybe this is all null, maybe Mike Aviles returns from the dead.

*With utility factored in.  Say for example Chase Utley is hurt.  His apparent backup is Juan Castro but I would assume that the Phillies move Polanco to 2b and Dobbs to 3b so that replacement level for Chase Utley is Dobbs at 3b.


#10    Tangotiger      (see all posts) 2010/02/05 (Fri) @ 16:21

Ok, how about we do a little contest then.  We’ll take all the guys that the Fans are going to have a negative WAR on (and it might be more than what we see, once I get to doing the adjustments).

Let’s say that there’s going to be 20 players with a below 0 WAR, for a total of -6 WAR.

I will say that the sum total of those players will be a 0 WAR in 2010.

The Fans are going to say that the sum total will be equal to whatever their forecasted sum total is, which would be -6.

What are you guys saying?
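The contest comes down to comparing two forecast totals against one observed total. A rough sketch of the scoring, where the function name, the labels, and the sample season results are all made up for illustration:

```python
def closer_forecast(actual_wars, forecast_totals):
    """Return the label of the forecast total nearest the actual sum."""
    actual_total = sum(actual_wars)
    return min(
        forecast_totals,
        key=lambda name: abs(forecast_totals[name] - actual_total),
    )

# The two positions in the contest: 0 WAR in total versus the Fans' -6.
forecast_totals = {"Tango": 0.0, "Fans": -6.0}

# Hypothetical outcomes, NOT real 2010 results.
mild_season = [-0.5, 0.5, -1.0]   # sums to -1.0
rough_season = [-2.0, -3.0]       # sums to -5.0

print(closer_forecast(mild_season, forecast_totals))
print(closer_forecast(rough_season, forecast_totals))
```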


#11          (see all posts) 2010/02/05 (Fri) @ 16:47

Sounds like about as fine a test as could be.  20 sounds like it’s WAY too many negative players, I think that fairly unique circumstances have to exist for a below replacement level player to get the playing time required to post big negative numbers.  Right now I have 4 with one of them being an unadjusted Scott Podsednik (pre-Ankiel).  I’ll adjust him now and see where I get. 

Looking over my numbers, I have quite a few replacement level talents (exactly 0 WAR) on both sides of the ball.  I think this highlights my position on the matter nicely.  There are a couple of rare outliers and I see no reason they can’t be explained within the bounds of theory.


#12          (see all posts) 2010/02/05 (Fri) @ 16:59

Adjusted Podsednik (moved from leadoff to 9th and PA cut from the 400s to the 100s) puts him at .2 WAR.  My personal projections are left with Francoeur (-.2), Everett (-.2), and Betancourt (-1.8).  I’ll try to track down some other guys I would rate negatively but right now it’s probably obvious why I keep using Betancourt as an example.  He’s basically the only guy I think that fits the bill of a player who is blatantly below replacement level yet still going to play.  (I honestly think he should be projected at -1 WAR for the playing time I gave him (about 90 starts)...I also don’t think the numbers I put in the fields are ‘wrong’ projections and they say it’s -1.8)

Looking through my own projections is making me think that you’re probably right, I just don’t understand why certain exceptions in the framework can’t be made when it’s plainly obvious like with Betancourt.

And I’m very interested in all the fun things you’ll be doing with the fan scouting reports including this test of negative WAR projections.


#13    Tangotiger      (see all posts) 2010/02/05 (Fri) @ 17:28

If Betancourt is the sole exception, then is there really a point to making an exception?

The only way the Royals play Betancourt as much as people are saying is if he delivers with the glove as much as the Royals think he can.  Otherwise, they are not going to stick with him.

Isn’t it better to say that the Royals will man SS for 162 games for a total of 0 WAR in talent?


#14    MGL      (see all posts) 2010/02/05 (Fri) @ 19:20

"Let’s say that there’s going to be 20 players with a below 0 WAR, for a total of -6 WAR.

I will say that the sum total of those players will be a 0 WAR in 2010.”

Well of course they will.  That is because all of those players that start out badly and get dropped are likely worse than we thought and the ones who start out well and continue are likely better than we thought.

Say you have 2 players who are forecast at -.5 WAR for 200 PA each.  Both get 50 or 100 PA to start the season.  They are supposed to be -.25 WAR each at the end of those 100 PA, if they were both truly -.5 WAR players in 200 PA.  But if one is really +.5 in 200 PA and the other is -1.5 (that is a large spread in uncertainty of course), then after 50 or 100 PA, the -1.5 player gets dropped and the other one ends up with 400 PA.  That is why your projections for these bad players will always look low if you weight the results by playing time.  Actually, everyone’s will, which is one of the problems that Dan Rosenheck is running into in his research.
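The arithmetic in that example can be worked through directly. This sketch assumes the weaker player is dropped after exactly 100 PA (the comment says "50 or 100") and that each player performs exactly at his true rate; the names are illustrative only.

```python
def war_over(pa, war_per_200_pa):
    """WAR accumulated over pa plate appearances at a per-200-PA rate."""
    return war_per_200_pa * pa / 200

forecast_rate = -0.5  # both players forecast at -.5 WAR per 200 PA

# True rates and the playing time each ends up with (assumed cutoff: 100 PA).
players = [
    {"true_rate": 0.5, "pa": 400},   # starts well, keeps the job
    {"true_rate": -1.5, "pa": 100},  # starts badly, gets dropped
]

# The forecast is right on average across the two players...
mean_true_rate = sum(p["true_rate"] for p in players) / len(players)

# ...but weighting by playing time tells a different story.
total_pa = sum(p["pa"] for p in players)
actual_total = sum(war_over(p["pa"], p["true_rate"]) for p in players)
forecast_total = war_over(total_pa, forecast_rate)

print("mean true rate per 200 PA:", mean_true_rate)
print("actual WAR over", total_pa, "PA:", actual_total)
print("forecast WAR over", total_pa, "PA:", forecast_total)
```

The unweighted average of the two true rates matches the -.5 forecast, while the playing-time-weighted total comes out at +0.25 WAR against a forecast of -1.25, which is the selection effect being described.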

I gotta agree with Duke and some of the other guys above.

Duke said:

“Why would you do that? Predicting playing time is a completely separate issue from predicting performance, and the idea that every team will refuse to play anyone whose true talent WAR dips below 0, despite all evidence to the contrary, strikes me as way more insane than just using whatever the projected performance result is and the most reasonable guess as to who will get playing time.”

I mean projecting playing time is playing time and should have almost nothing to do with your performance forecast.  Playing time has a lot to do with a lot of things, and none of the things has to do with how YOU as a forecaster project a player.  If you project a player as -.5 WAR, if the manager or GM “projects” him as +1.0 or whatever, your -.5 projection becomes meaningless in terms of playing time other than the fact that the player is slightly more likely to play worse than the GM thinks at some point in time.  But even if he completely tanks because the GM or manager thought that he was way better than you did and than he really is, he still is going to get SOME playing time if the GM thinks he is a player. If your goal is to look good in evaluating your projections, then sure, go ahead and lower your playing time projections for the bad players and raise them for the good ones, even if you don’t really believe that those are accurate playing time projections, but I was not aware of the fact that that is what we are talking about.

Tango, I think you are way off base here.  If you were to ever simply retract a comment you made, I would vote for that one above (where you said it would be insane....).  And I think if we took a vote, you would be outvoted by 8-1. ;) Unless of course, by “insane,” you meant in terms of how good your projections will look when we weight your player errors by playing time.

The A-Team, when we talk about replacement level, it is defined as “as compared to the best league-wide player that you can get at around the league minimum.” But sure, on an individual team level, some teams might have a harder or easier time finding a league-wide replacement level player at a particular position for the league minimum, for whatever reasons (such as they have no one in their minor league system that good and no other team is willing to give them anyone for nothing and there happens to be no one available at that time from the minor or major league free agent pool). In addition, replacement level probably changes a little from year to year even though we tend to use the same baseline every year.  In addition, how many players need to be available at that level to call it replacement level?  If there are only 3 at each position, can we call that replacement level?  I don’t know the answer to that.  Obviously only 3 teams could take advantage of that, and then the other teams would have to “replace” a player with someone worse than that, and that would be the new replacement level.  So replacement level is kind of an iffy thing.  But, it doesn’t really matter, because it is just a way to compare one player to another.  You can use any baseline for that.  You can just as easily use the “average player in the majors.” Using “replacement level” works best for salarying purposes for obvious reasons, but for anything else, it doesn’t really serve any useful purpose, although I suppose it makes it easy to identify players that you can automatically assume should not be playing in the majors (although they could be on a roster to leverage a certain skill, like a designated pinch runner).


#15    tangotiger      (see all posts) 2010/02/05 (Fri) @ 20:52

“Let’s say that there’s going to be 20 players with a below 0 WAR, for a total of -6 WAR.

I will say that the sum total of those players will be a 0 WAR in 2010.”

Well of course they will. 

If that’s the case, then we are in agreement then that the 20 worst players in the league need to be forecasted for a total of 0 WAR, even if they “really” should be forecasted for -6?

And that’s all I said, that it would be insane to forecast anyone at below 0 WAR.

I can retract “insane” to something less blunt, but since we agree, then I don’t know what I would be retracting.


#16    MGL      (see all posts) 2010/02/05 (Fri) @ 21:35

"If that’s the case, then we are in agreement then that the 20 worst players in the league need to be forecasted for a total of 0 WAR, even if they “really” should be forecasted for -6?

And that’s all I said, that it would be insane to forecast anyone at below 0 WAR.”

No, not really, but we’re getting kind of metaphysical. Sort of a tree falling in the forest thing. I see your point though.


#17          (see all posts) 2010/02/05 (Fri) @ 22:26

"And when you forecast players, it would be insane to give PA or IP > 0 for players who are below replacement level. Therefore, by definition, the lowest (true) WAR you can give someone is 0 WAR."

Tom,
I think you are digging in to make some pedagogical point that should be made another way.

Suppose one generally makes independent predictions of playing abilities ("true" rates) and playing time, with derived predictions of season playing records by multiplication. It’s reasonable, not insane, to do it consistently. When the abilities are jointly below replacement rate, carry on with the season playing record that is below replacement value.

Use a footnote to say that the abilities and time are jointly unsustainable if management recognizes the truth, or something like that.

(This concerns the sabermetric audience without regard to rotisserie drafts.)


#18          (see all posts) 2010/02/05 (Fri) @ 22:47

Duke #6
"Predicting playing time is a completely separate issue from predicting performance,"

That is now true in practice, I infer.

It isn’t necessary in principle. Joint prediction is possible and may be valuable, at least if there is some publication of uncertainty along with point estimates. Tom somewhere recommends or urges others to report uncertainty, but I infer that “no one” yet does that (no one but Marcel?).

Should joint treatment of uncertainty be introduced for the small numbers with predicted playing abilities below replacement rates? (If they play so poorly as the predicted rates, they shouldn’t achieve the predicted playing times, and vice versa, with a strong positive correlation between the prediction errors.) That seems ad hoc to me.


#19    Rally      (see all posts) 2010/02/05 (Fri) @ 22:58

Call me insane, but I am projecting that Betancourt plays under replacement level and keeps his job all year.  He might have a -15 UZR but they won’t notice, and praise his defense anyway.

This is the team that played Neifi Perez (-1.9) every day in 2002.  Also, just noticed Neifi in over 5000 PA had a WAR of exactly 0.0


#20          (see all posts) 2010/02/05 (Fri) @ 23:05

P.S.
I don’t recall where “Tom recommends or urges others to report uncertainty.” I believe he does and good for him.

From the 2004 explanation of Marcel,
"Posted 4:22 p.m., March 10, 2004 (#8) - tangotiger
I added another column called “reliability”. That shows how much of the forecast is based on his performance, and how much was regression towards the mean.

Bobby Abreu shows a .87. That means that I regressed towards the mean 13%. Using that, it should be easy enough to figure out a confidence interval for each of the stats."

That quantifies uncertainty about the true rates or playing abilities alone, if I understand correctly. For Marcel there is no uncertainty about the playing time, iiuc. Ipso facto the reliability quantifies uncertainty about the season playing record, the product of rates and time.
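One way to read the quoted reliability figure is as the weight on the player’s own performance, with the remainder going to the mean. The sketch below is a simplified illustration of that reading, not Marcel’s actual procedure; the function name and the sample OBP figures are hypothetical.

```python
def regress_to_mean(observed_rate, league_mean, reliability):
    """Blend a player's observed rate with the league mean.

    reliability is the share of the forecast based on the player's own
    performance; (1 - reliability) is the share regressed to the mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return reliability * observed_rate + (1.0 - reliability) * league_mean

# Reliability of .87 means 13% regression toward the mean.
# The .400 observed OBP and .340 league OBP are made-up inputs.
forecast_obp = regress_to_mean(observed_rate=0.400, league_mean=0.340, reliability=0.87)
print(round(forecast_obp, 3))
```

With these inputs the forecast moves 13% of the way from .400 toward .340.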

to be continued when everyone is familiar with independent uncertainties in both dimensions

