THE BOOK--Playing The Percentages In Baseball


Tuesday, December 01, 2009

WAR, Salary, and Service: Estimating Dollars Per Win

By Tangotiger, 04:43 PM

Sky gives it a go.  I like the overall effort involved.  The main issue I have is that the intercept has to be the league minimum salary.  If it is not, then the WAR model has the wrong replacement-level baseline (for salary purposes).

When under team control: Salary = .51 + WAR*.001
When Arb eligible: Salary = 2.26 + WAR*.31
When FA eligible: Salary = 5.53 + WAR*1.23

So, an arb guy with negative 6 WAR will be paid the league minimum.  That is impossible.  The problem here is almost certainly because the “arb eligible” should be split by arb1, arb2, arb3, arb4.  You also have a similar problem with FA, where negative 4 WAR will give you the league minimum.
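
The break-even points quoted above can be checked by solving each fitted line for the WAR at which predicted salary equals the league minimum. This is a minimal sketch, assuming a league minimum of roughly $0.4MM (the post does not state the figure); the names `FITS`, `LEAGUE_MIN`, and `war_at_salary` are hypothetical.

```python
# Sky's fitted lines as quoted above, salary in $MM:
#   salary = intercept + slope * WAR
FITS = {
    "arb_eligible": (2.26, 0.31),
    "fa_eligible": (5.53, 1.23),
}
LEAGUE_MIN = 0.4  # ASSUMPTION: league-minimum salary in $MM, not given in the post


def war_at_salary(intercept, slope, salary):
    """Solve salary = intercept + slope * WAR for WAR."""
    return (salary - intercept) / slope


break_even = {
    name: war_at_salary(intercept, slope, LEAGUE_MIN)
    for name, (intercept, slope) in FITS.items()
}

for name, war in break_even.items():
    print(f"{name}: predicted salary reaches the minimum at {war:.1f} WAR")
```

Under that assumed minimum, the arb line bottoms out near negative 6 WAR and the FA line near negative 4 WAR, which is the implausible range the post objects to.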

If Sky is finding that players who generated between negative 2 and 0 WAR earned around 4MM$ in free agency, there’s a problem.


#1          (see all posts) 2009/12/01 (Tue) @ 18:22

Tango, I ran the data for you.  Of 87 FA-eligible players with <0 WAR, the average salary was $5.06 mil.  It may be a little on the high side, but it’s not a total fluke either.  In 2007, the average salary is $4.25, in 2006 it’s $3.27, in 2005 it’s $3.54, in 2004 it’s $3.07.

The reason is a big-time regression effect.

If you look at it from the other direction (predicting WAR from salary), you see more what you expect.  A FA-eligible player with league minimum salary is expected to produce 0 WAR.


#2    Jeremy      (see all posts) 2009/12/01 (Tue) @ 18:40

Wouldn’t a nonlinear model help in finding the correct replacement-level baseline for determining salary?  All players between negative 2 and 0 WAR could expect to earn around the same amount.


#3    Telnar      (see all posts) 2009/12/01 (Tue) @ 18:41

I’m not sure that ex-post WAR is a good stat to use for this analysis.  Could you run the numbers using forecast WAR (perhaps by Marcel to keep the calculations simple) and see if that fixes the intercept problem?


#4    Matt Swartz      (see all posts) 2009/12/01 (Tue) @ 19:55

The regression that predicts salary from WAR is not the relevant regression.  The regression that predicts WAR from salary is.  Not only is WAR not perfectly estimable in advance, but WAR is a noisy measure of actual productivity.  The latter regression is right.

When you run a regression on a noisy independent variable, you get attenuation bias which biases the coefficient downward and the intercept towards the mean.  That is exactly what happened and it’s why you don’t get a zero intercept.  The arb regression could very well get a nonzero intercept because it is not a market based result.  That you do get a zero intercept for the regression of WAR on salary is a great indication that WAR has the right replacement level.  That WAR is measured with noise is not an issue unless it is measured with biased noise.  There is no attenuation bias and the coefficients are unbiased (though with wider confidence intervals) if you have a noisy dependent variable.  The enemy, especially for this kind of analysis, would be using noisy independent variables.
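
The attenuation argument can be illustrated with a small simulation. This is only a sketch on made-up data, not Sky's dataset: the true line, the noise levels, and all names are assumptions chosen so that the signal and the measurement noise have equal variance.

```python
import random


def ols(x, y):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)  # fixed seed so the sketch is repeatable
N = 20000

# Hypothetical "true" relationship: y = 1.0 * x with a zero intercept.
true_x = [random.gauss(3.0, 1.0) for _ in range(N)]
true_y = [1.0 * x for x in true_x]

# Add unbiased measurement noise to one side at a time.
noisy_x = [x + random.gauss(0.0, 1.0) for x in true_x]
noisy_y = [y + random.gauss(0.0, 1.0) for y in true_y]

slope_nx, icpt_nx = ols(noisy_x, true_y)  # noisy independent variable
slope_ny, icpt_ny = ols(true_x, noisy_y)  # noisy dependent variable

print(f"noisy x: slope {slope_nx:.2f}, intercept {icpt_nx:.2f}")
print(f"noisy y: slope {slope_ny:.2f}, intercept {icpt_ny:.2f}")
```

With noise as large as the signal, the noisy-x fit should land near a slope of 0.5 with an intercept pulled up toward the mean of y, while the noisy-y fit should stay near the true slope of 1 and intercept of 0.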

What do the residuals look like?  Are they distributed in a normal way?  Maybe some other functional form would be good?  I suspect that a bunch of censoring around 0 WAR is the cause of the .16 coefficient not being larger.  Not many people get many chances when they produced below replacement level so there is going to be a censored distribution for guys with low expected WAR.  What happens if you run the regression on guys with more than 2 WAR?

BTW I love the article.  Very intelligent analysis and looking at smart information.  I always like Sky A’s articles.


#5          (see all posts) 2009/12/01 (Tue) @ 20:07

Matt’s points are right on.  As I mentioned at the time, predicting WAR from salary is the more interesting regression.  I probably should have led with it instead of revealing it later in the article.

The data, to look at it, looks pretty linear, but I didn’t do as many diagnostic checks as I probably should have - some of Matt’s other queries could be useful.

I’ll be following up with this stuff in weeks to come.


#6    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 21:13

Telnar is correct.  You can’t look at the dependent variable and treat it as independent.

The equation is y=mx+b.  The independent variables are salary and service time.  So, Sky should not have brought up, at all, the other one.


#7    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 21:37

What he could have done is used WAR of year x-1 and year x-2.  That would have been fine.  But, WAR of year x should have been out of bounds.


#8          (see all posts) 2009/12/01 (Tue) @ 22:05

It’s not out of bounds - it may not be useful for what you may be particularly looking for, but it’s certainly do-able.  If you know a player’s WAR in a particular year, and want to estimate his salary, that regression is the way to do it.  Dependent and independent variables are relative terms - it depends on what you know and what you want to find out.

If you’re a player and had a 5 WAR season, you can use the equation to see how your actual salary stacks up with other 5 WAR players in that year.  That may not be of interest to you, and it’s probably not as interesting as the 2nd regression, but it will help you make that comparison.  It is what it is - an estimate of this year’s salary from this year’s WAR - if you’re not interested in that relationship, then it may not be of use to you.


#9    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 08:22

Sky: you are suggesting that you can look at 0 WAR free agents, 0 WAR arb-eligibles, and 0 WAR slaves, plug it into that equation (0.5MM$, 2.3MM$, 5.5MM$ respectively), compare that to their actual salaries (say it was 0.5, 2.3, 5.5 respectively) and conclude .... what exactly? 

Remember, in this formula, WAR of year X is the independent variable, and salary of year X (which is set prior to the accomplishment of WAR of year X) is the dependent variable.

This is EXACTLY like running a regression of team wins to team payroll, but you are making it:

payroll = m*wins+b

And then running a regression on that.

So, what that says is that “this number of wins should have cost this amount of money”.  This works kinda well at the team level. 

It doesn’t work at the player level. You are suggesting that a team of 0 WAR (after the fact) free agents would cost 130MM$.

What you are really doing is taking, after the fact, the worst performances (which is a result of talent AND luck), and then adding up their salaries.  And then… what exactly?

Anyway, as I said, I love the whole mindset behind the article, but I disagree strongly on the relevancy (even validity) of this scenario.


#10    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 08:28

“If you know a player’s WAR in a particular year, and want to estimate his salary, that regression is the way to do it.”

How about this: let’s say you have a free agent entering that 2008 season, and his WAR for the 2008 season was 0.

Estimate his salary for 2008 and estimate his salary for 2009.

What you are saying is that if AFTER THE FACT you knew you had a 0 WAR free agent, then his salary was 5.5MM in 2008, and you would estimate his salary in 2009 as close to zero.

Is that the extent of it?


#11          (see all posts) 2009/12/02 (Wed) @ 10:23

Tom, I believe your second post is right on.  If I knew a free agent player had a 0 WAR season in 2008, I would predict that he had a 5.5 mil salary in 2008.  That’s all it tells you.  In 2009, I’d expect his salary to go down, maybe to close to zero - I didn’t answer that question.

This regression is valid, but I agree it’s of limited use.  I found it interesting that worthless (in retrospect) free agents were making over $5 million, so I decided to talk about it.  I obviously didn’t explain what it was estimating and its limitations as clearly as I should have.  I think we’re mostly on the same page.

As I said in the article: “perhaps more relevant is predicting performance from salary. A player’s salary is determined before the player performs, so it makes more sense to analyze it this way.” That regression is the real story.


#12    Rally      (see all posts) 2009/12/02 (Wed) @ 10:53

Brian Giles was less than useless in 2009.  He made a good amount of money because he was a very good player before that, and projected to continue being a good player going into 2009, though obviously that didn’t happen.

His salary for 2009 is set well before his 2009 performance.  I’d be interested in a projected WAR compared to salary.  Instead of dealing with all the calculations I’d just take Fangraphs or Baseballprojection WAR.  Then weight it like 5*2008 WAR + 4*2007 + 3*2006 + 2*regression.

For the regression, I’d suggest league average, prorated to 2008 actual plate appearances, so a full time player would have 2.0, and a backup catcher 0.5 or so.


#13    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 11:10

I really think actual WAR regressed on salary is a much much better way than projected WAR.

The goal is a good estimate of the coefficient ‘b’ of the following equation:

expected WAR = constant term + b*Salary + noise

With that in mind, consider the following.

If you have a noisy estimate of WAR, then the coefficient is unbiased ONLY as long as the noise is uncorrelated with salary.  That’s the question, and the question is not the amount of noise!

The advantage of using actual WAR is that the misestimation of WAR by a team is unlikely to be highly correlated with salary.  The difference between formula-projected WAR and the WAR teams actually project is in fact correlated with salary.  Ichiro gets paid more despite nearly every projection system coming up short on him.

Is there something I’m missing here?


#14    Rally      (see all posts) 2009/12/02 (Wed) @ 11:25

If you take my example in #12, you can get an estimate for Ichiro 2010 that is based on his actual WAR, and doesn’t matter that CHONE projects him to only hit .305 for 2010 (Which pretty much everyone, myself included, would take the over on).

It really depends what the question here is.  I don’t really see the point in comparing salary to WAR since there cannot possibly be any causation - salary is set before the season.  If you don’t want to deal with projections at all, compare salary to WAR in the previous year.


#15    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 11:34

I think the question is ‘How much do the teams that submit the winning bid in an auction value a win in dollar terms?’

If so, to answer that question, the coefficient ‘b’ from #13 is right. Any use of historical WAR in projection or literal form is a biased estimation of team’s expected WAR in that the difference between it and the team’s expected WAR is correlated with salary, biasing the coefficient.


#16    Rally      (see all posts) 2009/12/02 (Wed) @ 11:35

Projecting Ichiro’s 2009 WAR:

1. Using Chone projections I had him at 0 batting runs, good fielder and baserunner in right field, probably around 2.7 WAR

2. Or, as above: (2008 (5.4) * 5 + 2007 (5.8) * 4 + 2006 (4.2) * 3 + regression (2.0) * 2) / 14

= 4.8 WAR
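
The 5/4/3 weighting with a regression term can be written as a small function. This is a sketch of the arithmetic as described in this thread only; `weighted_war` and its parameter names are hypothetical, and the 2.0 regression value is the full-time figure suggested in #12.

```python
def weighted_war(recent_first, regression_war=2.0,
                 weights=(5, 4, 3), regression_weight=2):
    """Weighted average of the last three seasons' WAR (most recent first),
    plus a regression-to-league-average term."""
    total = sum(w * war for w, war in zip(weights, recent_first))
    total += regression_weight * regression_war
    return total / (sum(weights) + regression_weight)


# Ichiro's 2008, 2007, and 2006 WAR as quoted above.
ichiro = weighted_war([5.4, 5.8, 4.2])
print(round(ichiro, 1))  # 4.8
```

For a part-time player, the regression term would be prorated (for example 0.5 for a backup catcher), which only changes the `regression_war` argument.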

I would suggest using the 2nd method in a salary comparison model.  For one, it should be an absolute snap to do for every season Sky has salary data for.  Method one has the biases of the projection system to account for, and will only be available for recent years.


#17          (see all posts) 2009/12/02 (Wed) @ 11:55

I agree with Matt here.  To find the relationship between $ and WAR, you shouldn’t need to use projections.  Coefficient ‘b’ from Matt/13 is .163 in 2008.  That translates to $6 million per win.  The coefficient has a SE of .02.  In past recent years it has fluctuated between .16 and .21.  That pegs the dollars-per-win amount for free agents at between $5 and $6 million.

Fangraphs uses Rally’s projection method and comes up with $4.5 million/win.  I like my calculation more because it’s not subject to bias from the projections.


#18    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 12:01

But you can only apply the 6MM$ per win AFTER the 2010 season has been played.  Correct?


#19    Rally      (see all posts) 2009/12/02 (Wed) @ 13:06

If you’re asking how many wins you can expect, given you spend X amount of dollars, I agree.

But for “How many dollars should I pay for player X” we need to use projections of some sort, as do teams, since the money is committed up front.  I don’t see any way around that.


#20    David Gassko      (see all posts) 2009/12/02 (Wed) @ 13:26

Tango/Rally—you’re wrong on this. After the fact WAR IS a projection—it is in fact a perfect projection. We can’t use Sky’s first equation because it is biased (players with negative WAR were likely unlucky whereas players with some high number of WAR were likely lucky), but we can use the second one, since the number of WAR produced by players who were paid $X is in aggregate the number of WAR they were projected to produce. This is tautological.


#21    Bill Letson      (see all posts) 2009/12/02 (Wed) @ 13:41

I would be interested to see what the results would be if only the players serving the first year of a free agent contract were included.  I suspect a chunk of the difference between the Fangraphs value and Sky’s value is a “long contract penalty” of sorts - attributable to players at the tail end of contracts who no longer produce at the level they did when signed.


#22          (see all posts) 2009/12/02 (Wed) @ 13:41

Gassko is right.  Tango/18, you are right as well in that we can’t see how many wins per dollar were gained in 2010 until the 2010 season has been played. 

I suppose if you were dying to see how the 2010 market was shaping up before the season began you could use projections, but that would be the only reason to do so.  Unless I thought there was some sort of major shift going on I’d be more comfortable using the 2009 known values.


#23    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 14:25

David/Sky, I’m not sure what I’m wrong about here.

I’m saying that Sky’s first set of equations are just about useless.  Sky says it’s of limited value.  So, we’re pretty much agreed here, correct?

When under team control: Salary = .51 + WAR*.001
When Arb eligible: Salary = 2.26 + WAR*.31
When FA eligible: Salary = 5.53 + WAR*1.23

This is the one that says “given that I have a player with 0 WAR after the fact, his before-the-fact salary was 5.5MM$”.  Wholly uninteresting to me, and of pretty limited use to anyone.

***

Now, this set of equations:

When under team control: WAR = .84 + Salary*.002
When Arb eligible: WAR = .62 + Salary*.21
When FA eligible: WAR = 0 + Salary*.16

This tells you what kind of WAR to expect, given a salary.  This one is fine.  This says: given his salary, this is his WAR.  So, if I have a group of 6MM$ free agents, they’ll get 1 WAR.

Sorry if I was confusing anything.  In terms of cause-effect, we have these for y = mx +b

when x = WAR in 2008, then y = salary in 2009
when x = salary in 2008, then y = WAR in 2008

That’s the causal relationship.

So, the first set of equations that Sky produces doesn’t do either, hence its limited utility (virtually useless frankly).

The second set is the correct one.

So far, we’re all on the same page?

***

HOWEVER, I fail to see how he could have gotten the regression he did.  For example, he had:

When FA eligible: Salary = 5.53 + WAR*1.23
When FA eligible: WAR = 0 + Salary*.16

And this is for salary and WAR in the same year.  If y = mx + b, then x = (y-b)/m

But, that’s not what we get here.

***

Finally, I would definitely NOT use 6MM for 2 reasons: one is multi-year deals, and two, the uncertainty level must be quite high.  I presume a few outliers would simply kill the equation here.


#24          (see all posts) 2009/12/02 (Wed) @ 14:26

A couple suggestions for the graphs:

1) In addition to the regression lines, show the actual data points, colored the same way. (Unless this would make things too cluttered)

2) Only extend each regression line in the X-axis as far as there is real data for that group. (Showing that a pre-arb player who makes $20 million would be expected to produce .9 WAR is useless at best and potentially misleading. Obviously there are no such players.)


#25    Rally      (see all posts) 2009/12/02 (Wed) @ 14:36

OK, you’ve convinced me on the 2nd equation.

It would be interesting to look at the first year of contracts.  Guys like Gary Matthews Jr. have been a sunk cost since the end of their first year.

Not that the formula is invalid while including latter years of a contract, but we would have to be careful to heavily discount projected WAR before applying something like Holliday = 5WAR=30 million.  Heavily discount, more than we currently do, for the chance of injury or suckitude.


#26    David Gassko      (see all posts) 2009/12/02 (Wed) @ 14:44

HOWEVER, I fail to see how he could have gotten the regression he did.  For example, he had:

When FA eligible: Salary = 5.53 + WAR*1.23
When FA eligible: WAR = 0 + Salary*.16

And this is for salary and WAR in the same year.  If y = mx + b, then x = (y-b)/m

But, that’s not what we get here.

---

That’s because with regression, you cannot just re-arrange one equation to find the other. You can go from one equation to the other, but it’s more complicated than what you’re doing.
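
One way to see the link between the two equations: for simple one-variable least squares on the same sample, the product of the two slopes equals r squared, and both lines pass through the point of means. The sketch below applies that to the rounded FA-eligible coefficients quoted above, assuming both fits are ordinary least-squares regressions on the same players; the variable names are hypothetical.

```python
import math

# Sky's two FA-eligible fits as quoted in the thread (salary in $MM).
SALARY_ON_WAR = (5.53, 1.23)  # salary = 5.53 + 1.23 * WAR
WAR_ON_SALARY = (0.0, 0.16)   # WAR    = 0.00 + 0.16 * salary

a1, b1 = SALARY_ON_WAR
a2, b2 = WAR_ON_SALARY

# The product of the two OLS slopes is the squared correlation.
r_squared = b1 * b2
r = math.sqrt(r_squared)

# Both lines pass through (mean WAR, mean salary), so solving the two
# equations simultaneously recovers the means implied by the coefficients.
mean_salary = (a1 + b1 * a2) / (1 - b1 * b2)
mean_war = a2 + b2 * mean_salary

print(f"implied r = {r:.2f}")
print(f"implied mean salary = {mean_salary:.2f} $MM, mean WAR = {mean_war:.2f}")
```

The two lines only coincide when r is exactly 1 or -1, which is why rearranging one equation does not reproduce the other here.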


#27          (see all posts) 2009/12/02 (Wed) @ 15:04

Gassko/26: Exactly.

Tango/23, Rally/25: The whole point is to include things like multi-year deals and factor in things like players getting injured, stinking, etc.  You can’t buy free agents and avoid those things - it’s part of the cost.

The standard error on the “b” component of the equation is fairly low at .02.  As I said earlier it’s consistently been between .16 and .21 over the past several years - it’s pretty stable.


#28    Guy      (see all posts) 2009/12/02 (Wed) @ 15:11

Whichever way you run the regression, what you’re finding is a weak relationship between WAR and salary in the same year.  I think this is a function of 3 things:
1) back-loaded contracts that appear to pay old players far more than they’re worth (when they are really being paid for what they did years earlier);
2) general inconsistency in player performance;
3) non-symmetrical WAR curve around a player’s mean projection.  It is MUCH more likely for a projected 4-WAR player to deliver 0 WAR than 8 WAR.  This is a function of injury risk:  a player can get hurt, but can’t get “extra healthy.”

Now, it would still be interesting to learn that teams have to pay $6M for each win they actually get, even if they think they are paying $4M (or whatever).  BUT, that isn’t true if some big part of the salaries for 0-1 WAR players is actually paid by an insurance company.  Nick Esasky was paid by the Braves for 3 seasons to deliver 0 WAR (vertigo).  But it was actually their insurance co. that paid the bills.  What I don’t know is how many players are insured these days, and to what extent—without that info, it’s hard to know what teams are really paying for this limited productivity.


#29    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 15:35

To expand a little bit on David Gassko/26, the equation does not say

y=mx+b

the equation says

y=mx+b+e

where e is an error term that is assumed to have a mean of zero, but it is not zero for every player.  in fact, it is different from zero with probability one.  a regression is a method of determining m and b such that the RMSE of all the error terms is minimized.

there is a difference between biased errors and large errors that needs to be highlighted here.  that is the key to a lot of the mistakes in this thread.  having a lot of multiyear contracts where players overproduce early and underproduce late does not matter unless the expected production difference is correlated with salary.  since that does not seem true (presumably there are many expensive players who are playing out the tail end of deals and many expensive players who are playing out the early end of deals), that is a nonissue.  the only problem with large unbiased errors in a regression is that the coefficient has a high standard error.  as the dataset is large enough that the coefficient has a standard error of .02, this is not a problem.

guy/28: the mean of the error terms matters but I’m not sure the distribution of the error term does at all, so asymmetrical errors are not an issue.  I’m not 100 percent sure of this but I’m pretty sure.


#30    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 16:15

Sky/David: looks like I am wrong.

As an example, I just put these random salary/WAR numbers:
5 2
6 1
7 3
8 6
9 5
10 3
11 4
12 6
13 7
14 5
15 7

Doing it one way, I get:
y = 1.28x + 4.3 (r=.78)

Flipping the x, y, my regression gives me:
x = 0.47y - .27 (r=.78)

Interesting…
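
The two fits above can be reproduced from the eleven salary/WAR pairs with a few lines of least squares. This is a minimal sketch using only the numbers listed in this comment; the helper `ols` is a hypothetical name.

```python
SALARY = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
WAR = [2, 1, 3, 6, 5, 3, 4, 6, 7, 5, 7]


def ols(x, y):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope_sw, icpt_sw = ols(WAR, SALARY)  # salary = slope * WAR + intercept
slope_ws, icpt_ws = ols(SALARY, WAR)  # WAR = slope * salary + intercept

# Algebraically inverting the first line gives a different WAR-on-salary line.
inverted_slope = 1 / slope_sw
inverted_icpt = -icpt_sw / slope_sw

print(f"salary = {slope_sw:.2f} * WAR + {icpt_sw:.2f}")
print(f"WAR    = {slope_ws:.2f} * salary + {icpt_ws:.2f}")
print(f"inverted first line: WAR = {inverted_slope:.2f} * salary + {inverted_icpt:.2f}")
```

Each regression minimizes errors in its own dependent variable, so the fitted WAR-on-salary slope (about 0.47) is flatter than the algebraic inverse of the salary-on-WAR line (about 0.78).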


#31    Bill Letson      (see all posts) 2009/12/02 (Wed) @ 16:35

Matt, multiyear contracts absolutely are an issue.  The reason for all the discussion is that the generally accepted number as published on Fangraphs and Sky’s result are in disagreement.  Fangraphs’ methodology, as explained by Dave Cameron, uses only salary and projected WAR for players who signed that preceding offseason.  Given what we know about player aging (WAR in y+1 is more likely to be less than year y for a player eligible for free agency) and contract structure (for a multiyear deal, salary in year y+1 is almost certainly greater than or equal to year y), this must be a biased sample of all players on free agent deals, favoring those players with a lower salary/WAR ratio.  The $4.5 million from Fangraphs is not including the hidden costs of multiyear deals, whereas Sky’s methodology is.

That said, I’m not convinced the regression is right either.  Sky, if you have your data points in an easy to share format, would you be willing to do so?


#32    Guy      (see all posts) 2009/12/02 (Wed) @ 16:39

Matt:
I don’t see how the backloading of contracts can fail to impact your coefficient.  Let’s take the simple example of Jason Giambi’s 7-year Yankee contract:
WAR Salary
7.3 $10.4
5 $11.4
0.4 $12.4
4.5 $13.4
3.6 $20.4
0.8 $23.4
2.3 $23.4
Regressing WAR on salary, we find that each additional million dollars buys -.24 WAR (we lose a quarter of a game).  The fact that your 2008 sample includes young, underpaid Giambis just compounds the problem, it doesn’t offset it.

Now, reverse Giambi’s salary schedule, paying him the same total dollars but starting at $23.4M, which is much closer to what we (and I’m sure the Yankees) expect a player his age’s progression to look like.  Then, the regression tells us that we buy an extra .21 WAR for every $1M spent—much more reasonable.

The fact that there are contracts like Giambi’s will weaken the relationship between salary and WAR.
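
Both slopes in the Giambi example can be checked directly from the seven WAR/salary pairs listed above. A minimal sketch, with a hypothetical helper `slope`:

```python
WAR = [7.3, 5.0, 0.4, 4.5, 3.6, 0.8, 2.3]
SALARY = [10.4, 11.4, 12.4, 13.4, 20.4, 23.4, 23.4]  # $MM, as listed above


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


backloaded = slope(SALARY, WAR)         # actual schedule, rising over time
frontloaded = slope(SALARY[::-1], WAR)  # same dollars, schedule reversed

print(f"back-loaded:  {backloaded:+.2f} WAR per $1MM")
print(f"front-loaded: {frontloaded:+.2f} WAR per $1MM")
```

The total paid is identical in both cases; only the timing of the payments changes the sign of the slope.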


#33          (see all posts) 2009/12/02 (Wed) @ 16:52

I think Guy’s right here, though I’m not sure how big the problem is.  Reapportioning contracts reasonably would be a way to test it out.  It’s work, but probably worth doing to see what the effect is.  Of 268 potential FA eligibles, I’m not sure how many have this backloading problem.  Something to investigate....

Bill, I’ll try to put up a scatterplot for you tonight in the original article.


#34    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 16:54

Bill/31:

I can’t tell if you’re agreeing with me?  Does fangraphs’ $4.5MM number really only look at the first year of contracts?  Because that would be way off.

Guy/32:

This is why you regress WAR on salary and not the other way around.  But it could still be a problem if higher salaries are correlated with being in the early or late part of a deal.  Giambi started at $10MM and ends at $23MM right around the time that ARod starts at $23MM and ends at $30MM-ish.  There are plenty of players that build up to $10MM when Giambi started.  Other than over $25MM, how many salaries are particularly likely to be at the end of multiyear deals rather than at the beginning of others?

It’s not about avoiding a noisy relationship, it’s about avoiding a biased relationship (in the error being correlated with salary for the whole dataset).  That may be biased but I’m not sure.

If Sky has the data capabilities to just do average $/year of contract in place of salary, I would bet he still gets about .16 as a coefficient.


#35    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 16:58

Sky: what was the total salaries paid in 2008 and what was the total WAR in 2008, for the free agents in your sample?


#36    Guy      (see all posts) 2009/12/02 (Wed) @ 17:00

“though I’m not sure how big the problem is”

My guess is it’s relatively small, compared to the problem of player injuries.  But worth correcting for if you can. 

I think it would also be worth doing separate analysis for pitchers and position players.  Anecdotally, it seems that pitcher performance is much more variable.  Certainly the injury problem plays a bigger factor.

And I really think that, unless you have information on insurance coverage, you need to remove players who experienced a major drop in playing time and spent time on the DL. What do we learn from saying these players were overpaid?


#37    Jeremy      (see all posts) 2009/12/02 (Wed) @ 17:04

I regressed the WAR produced in the year after signing on the average annual value (AAV) of the contract for each of the last 3 years’ worth of free agents, and I’m very surprised to report that my AAV coefficients came out to .15, similar to Sky’s. We would have expected somewhere between .2 and .25.


#38          (see all posts) 2009/12/02 (Wed) @ 17:12

Total WAR was 306 and total salary was 1872 million.  That’s a $6.1 million per WAR ratio.  Seems to match the regression.

Guy, I don’t think the insurance is a big deal because presumably the benefit of using the insurance is balanced by players who teams buy insurance on, but never get injured (if it didn’t balance out, no company would offer the insurance).

I’ll try out some of these ideas next week.  Great discussion.


#39    Guy      (see all posts) 2009/12/02 (Wed) @ 17:21

“I don’t think the insurance is a big deal because presumably the benefit of using the insurance is balanced by players who teams buy insurance on, but never get injured (if it didn’t balance out, no company would offer the insurance).”

You’re missing the whole point of insurance:  it protects those who would suffer a catastrophic loss—like paying Albert Belle $12M to produce nothing.  Let’s say the Orioles would pay their OFs the following w/o insurance: 
WAR Salary
0 $12M
2 $6M
3 $9M

With insurance (@ $100K per $1M salary), they pay this:
0 $1.2M
2 $6.6M
3 $9.9M

The second scenario will give you a very different coefficient.
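
The size of the change can be worked out from the three-player example. This sketch assumes, as the example implies, that the insurer covers the full salary of the 0-WAR player and that the team pays a 10% premium on every contract; `INSURER_PAYS` and the other names are hypothetical.

```python
WAR = [0, 2, 3]
SALARY = [12.0, 6.0, 9.0]             # $MM, cost with no insurance
INSURER_PAYS = [True, False, False]   # ASSUMPTION: only the 0-WAR player triggers a payout
PREMIUM_RATE = 0.10                   # $100K per $1M of salary, per the example


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Team cost with insurance: premium only if the insurer pays the salary,
# otherwise salary plus premium.
team_cost = [
    s * PREMIUM_RATE if paid_out else s * (1 + PREMIUM_RATE)
    for s, paid_out in zip(SALARY, INSURER_PAYS)
]

uninsured = slope(SALARY, WAR)
insured = slope(team_cost, WAR)

print(f"uninsured: {uninsured:+.2f} WAR per $1MM")
print(f"insured:   {insured:+.2f} WAR per $1MM")
```

In this toy case the slope flips from about -0.33 to about +0.35 WAR per $1MM, even though the players and their production are unchanged.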


#40    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 17:40

Total WAR was 306 and total salary was 1872 million.  That’s a $6.1 million per WAR ratio.  Seems to match the regression.

It would have to have been a perfect match actually.  Once you reported the relationship is linear with an intercept of 0, then the sum of the salary over the sum of the WAR had to match what you reported earlier.
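
The reasoning here is that a least-squares line with an intercept term passes through the point of means, so a fitted intercept of exactly zero forces total WAR to equal the slope times total salary. A quick check with the totals and the .163 coefficient reported in this thread (variable names are hypothetical):

```python
TOTAL_WAR = 306.0
TOTAL_SALARY = 1872.0        # $MM
SLOPE_WAR_ON_SALARY = 0.163  # coefficient 'b' reported for 2008

ratio_from_totals = TOTAL_SALARY / TOTAL_WAR
ratio_from_slope = 1.0 / SLOPE_WAR_ON_SALARY

# With a zero intercept, slope * total salary should reproduce total WAR.
war_gap = TOTAL_WAR - SLOPE_WAR_ON_SALARY * TOTAL_SALARY

print(f"$ per WAR from totals: {ratio_from_totals:.2f}")
print(f"$ per WAR from slope:  {ratio_from_slope:.2f}")
print(f"unexplained WAR across the whole sample: {war_gap:.2f}")
```

The two ratios differ only in the second decimal place, a gap small enough to be consistent with rounding in the reported slope and intercept.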

The reason I ask is that getting only 306 WAR for free agents is fairly low.  I have to ask what is a free agent then.  I presume you did like Hakes/Sauer and did NOT tell us about free agents in 2008, but those guys who last signed a contract as a free agent and who played in 2008.

This is totally a different thing.

For example, why don’t you list for us all the “free agents” who had a WAR of 0 or less and who earned 6MM$ or more.  I get the feeling we’ll be seeing a lot of guys that signed in 2003-2005.


#41          (see all posts) 2009/12/02 (Wed) @ 18:02

Tango,
As I said in the article, the group is free-agent eligible players, defined as players with 6 or more years of service (estimated), regardless of whether the player was already under contract or not.

So, yeah a lot of the overpaid 0 WAR players probably are guys who signed a while ago.

I don’t see what the problem is though.  Capturing the risk that your player will be overpaid and crappy a few years down the road is the reason I set it up this way.


#42    Guy      (see all posts) 2009/12/02 (Wed) @ 18:04

Tango, I’m not following your distinction.  Why do you care when they signed the contract, as long as it was a FA contract (and not an Arb player)?  And wouldn’t the sample be smaller, not larger, if it included only those in the first year of a FA contract?


#43    Guy      (see all posts) 2009/12/02 (Wed) @ 18:09

Sky:  To clarify the point on insurance, you’re right that it won’t change the total WAR/$ ratio.  Whatever teams collect on injured players, they will spend on healthy players (and then some).  But it WILL have a big effect on the correlation between what a team pays for a player and what he produces, because it reduces the cost of the players who most underproduce.  If Mike Hampton’s various employers had insurance coverage, that alone would probably increase your coefficient by about .1!


#44    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 18:25

Tango, what is the reason you would use only players who signed that offseason?  Most contracts are backloaded money and frontloaded WAR.  You pay for the whole thing.

Guy, the insurance thing isn’t an issue for estimating this.  The Belle example only looks at players who insurance was paid out for.  Plenty of insured contracts never pay out. The amount that teams pay into insurance should be what they get out.  Also, I’m not sure how much teams can actually get insurance on contracts anymore anyway.  I’m pretty sure that mostly stopped a while ago.


#45    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 18:27

Let’s say you have this for a player in terms of WAR over the next 7 years:

3
2.5
2
1.5
1
0.5
0

Normal progression for a player.  Now teams are envisioning paying this player something like: 20MM, 19MM, 18MM, 15MM, 12MM, 6MM, 1MM

That’s a total of 91MM over 7 years.  But, the way Sky is doing it, the player is being “paid” 13MM each year.  Well, by the time the 7th year rolls around, a team is really just bargaining for 1MM$ of production, and that’s what they’re getting.

This is my point that I made to Hakes/Sauer regarding the back-loading of Tejada’s contract: it does not matter at all how much a guy is being paid for a given year.  What matters is how much he is being paid for that year IN CONJUNCTION WITH the rest of his contract.

In effect, every contract is pretty much a back-loaded contract, even though the contracts are of equal payments.

If the net result of Sky’s list is that all the 0 WAR, 6MM$+ contracts are guys who signed it 2+ years ago, that’s a bias that really tells us nothing.

Indeed, it’s not a “free agent” as we talk about.  Sky should not use that term, and neither should Hakes/Sauer use that term.

A free agent is a guy who is a free agent in the year he signs the contract.  In the year after that, he’s not a free agent.

I understand Sky may define the word how he wants it, but everybody better be darn well sure that they can’t revert to the common definition.


#46    Guy      (see all posts) 2009/12/02 (Wed) @ 18:28

Matt, you’re thinking about this wrong. But if it’s true that teams no longer purchase insurance, then the issue is moot.


#47    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 18:29

Matt: we talked about this a few weeks ago (look for Saber Rattling).  The WAR and the contract have to be used for the same time period.  If you have a 7yr contract, then you need to use 7 years of WAR.


#48    Guy      (see all posts) 2009/12/02 (Wed) @ 18:33

I don’t agree with Tango that you can’t call these players FAs if they signed 2 years ago.  Just be clear FA refers to the contract they play under.  But he’s right that you still have a big bias even if you use average annual value.  What you really want to do is use total WAR over life of each contract and total payroll over that same contract (adjusting for inflation), and see how well those correlate.  Then you would be measuring what teams get for their $.


#49          (see all posts) 2009/12/02 (Wed) @ 19:10

Tango, the situation you describe creates noise, not bias.  When you consider that players are at various stages of their contracts it won’t, on average, affect the dollar/WAR ratio.

If there are 7 of these guys you describe, one in each year of his 13MIL per year contract, the group as a whole for a particular year will produce 10.5 WAR for 91 million dollars.  That’s the same as if you examine their WAR and salary for the length of the entire contract.

Now, by chance you may not have an even distribution of players at various stages of their contracts.  That causes the noise.  But there’s no bias going on.
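A quick way to check this claim is to compare one player followed over the whole deal against a single season containing players at different contract stages. The sketch below reuses the 7-year WAR path and the flat 13MM salary from Tango's example; the function name and the two lopsided mixes are made up for illustration.

```python
war_by_contract_year = [3, 2.5, 2, 1.5, 1, 0.5, 0]
aav_mm = 13.0  # flat salary per year, MM$

def dollars_per_war(contract_years_present):
    """MM$ per WAR in one season, given which contract years (1-7) are represented."""
    salary = aav_mm * len(contract_years_present)
    war = sum(war_by_contract_year[y - 1] for y in contract_years_present)
    return salary / war

# One player followed across all 7 years of his contract.
whole_contract = (aav_mm * 7) / sum(war_by_contract_year)

# One season with exactly one player in each contract year.
even_mix = dollars_per_war([1, 2, 3, 4, 5, 6, 7])

# Hypothetical lopsided seasons: only late-contract or only early-contract players.
late_heavy = dollars_per_war([5, 6, 7])
early_heavy = dollars_per_war([1, 2, 3])

print(whole_contract, even_mix, late_heavy, early_heavy)
```

With an even mix the single-season ratio matches the whole-contract ratio. It moves a long way once the mix of contract years is uneven, which is the situation the rest of the thread argues about.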


#50    Guy      (see all posts) 2009/12/02 (Wed) @ 19:21

Sky:  True, it won’t affect the $/WAR ratio.  And if that’s all you are interested in, you don’t need regression at all—just long division.  But assuming we are interested in how well expenditures match wins, then this matters a great deal.  Using an average (and non-inflation-adjusted?) annual salary, for players who will experience a secular decline in production on average, is guaranteed to reduce the salary/WAR correlation.


#51    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 19:28

"Now, by chance you may not have an even distribution of players at various stages of their contracts.  That causes the noise.  But there’s no bias going on. “

This is bias.  You cannot apply the 6MM per win to a player’s first year under contract.  For example, take all the free agents who signed a contract entering the 2008 season.  How much is their total dollar per WAR? 

Now, take all the guys who signed in 2007.

And do the same for guys who signed in 2005/06.

And all the guys who signed in 2004 and earlier.

The dollars per win will be progressively higher.

The point is that you can’t then say you will apply a 6MM$ per win for guys signing a contract for the 2010 season, which is the implication I presume most people will be making when seeing those numbers.  That a guy who signed in the 2010 season for 18MM$ is expected to generate only 3 WAR.

That’s not what you are saying.  What you ARE saying is that EVERYONE who has signed a free agent contract in 2010 or earlier and did so for 18MM in 2010, will generate 3 WAR each.  That I can believe.  This includes say Torii Hunter and Vernon Wells, and say Matt Holliday this year.  But it’s Matt Holliday and his ilk that will provide the bulk of the contributions this year.

It’s bias, not noise.  And you can remove the bias by including a parameter that says “how many years ago did he sign”.


#52    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 20:31

And Matt Holliday will produce less and earn more later in his contract, so it’s not right to say that if he gets paid $18MM to add 4 wins, then the Yankees or whoever are paying $4.5MM/win, because the Yankees know they will be paying later.  Tango, your last comment doesn’t disprove the bias not noise argument.  Why would it make the coefficient inaccurate, rather than noisier?  Do teams structure their contracts differently than they used to?  Do 6+ service-time salaries escalate at a different pace than the average long contract?  The way that you’d approach this is different: you’d want to look at the same type of analysis for 2002 or something, and then compare it to the contracts signed in the 2001 offseason and see what you get.  I’m not sure you would get something different based on this.


#53    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 22:29

Matt: you would definitely get something different.  This is my point when I said this:

For example, take all the free agents who signed a contract entering the 2008 season.  How much is their total dollar per WAR?

Now, take all the guys who signed in 2007.

And do the same for guys who signed in 2005/06.

And all the guys who signed in 2004 and earlier.

The dollars per win will be progressively higher.

Basically, you would get 5MM$ per win for the guys who signed in 2009, 6MM$ per win for the guys who signed in 2008, 7MM$ in 2007, 8MM in 2006 and earlier.  It has to work out like that.  You know this is true because if you look at the last year of anyone’s contract, they are almost all way overpaid.

***

“Do teams structure their contracts differently than they used to?”

It doesn’t matter HOW they structure it.

***

Let me ask, are we in agreement that a free agent at age 30 will progress something like this:

4 WAR
3.5
3
2.5
2

Let’s give him a name… say… Vladimir.

So this guy, his name is Vladimir, and he got 15 WAR over 5 years.  His team, I dunno, let’s say, Les Anges, are paying him 4MM$ per win a year (baseball revenues expected to be flat for 5 years).  So, his structured contract is this each year:
$16
$14
$12
$10
$8

So, when you run your regression, you get
y=mx+b
WAR = .25Salary + 0

But Vlad says… uh, no… I want my 60MM evenly, so 12MM$ a year.

So, in his final year, he’s getting 6MM$ per win (and 3MM$ per win his first year).
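The Vladimir numbers can be reproduced in a few lines of Python. This is only a sketch of the arithmetic in the example; it assumes NumPy and uses `np.polyfit` for the straight-line fit, and the variable names are illustrative.

```python
import numpy as np

war = np.array([4.0, 3.5, 3.0, 2.5, 2.0])
structured_mm = np.array([16.0, 14.0, 12.0, 10.0, 8.0])  # 4MM$ per WAR every year
flat_mm = np.full(5, 12.0)                               # same 60MM paid evenly

# Fit WAR = slope * Salary + intercept on the structured contract.
slope, intercept = np.polyfit(structured_mm, war, 1)

# With flat payments, salary never varies, so a one-player fit is impossible;
# what changes from year to year is the implied price per win.
flat_dollars_per_war = flat_mm / war

print(slope, intercept)
print(flat_dollars_per_war)
```

Both payment schedules total 60MM for 15 WAR. Only the year-by-year view differs, running from 3MM$ per win in the first year to 6MM$ per win in the last.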

When you look at everyone in 2009, Vlad’s final year, you see lots of players who are in his position, and you see few players signing big contracts.  So, 2009 is biased toward players who are disproportionate in bad contracts signed, and there’s not enough good players in early contracts to counteract that.

This is why you have a bias, not noise.  Because the key variable to long-term contracts is not included in your regression.  If you included this variable, you increase your correlation coefficient (if it was “noise”, adding this extra parameter would do nothing).

Now, if you go back and do this for many years, this reduces the bias, because at some point, the sample will start being disproportionate the other way.  If you look at say 2005-2009, that should knock out a big part of the bias.

***

Sky, you also said this:
“Update: I’ve had a few requests to see the data points plotted, so here they are for free agent eligibles in 2008. The data looks linear to me, and although the variance of the errors does get a little larger as salary increases, it doesn’t seem like a major problem.”

It looks to me that there’s well over 200 points there, maybe 300.  But FREE AGENT ELIGIBLES must have been much closer to 100 in 2008, no?  Are these free agent eligibles, or “free agents” as you’ve defined them?


#54    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 22:49

The thing is that Matt Holliday will be underpaid this year.  So even though you omitted a variable, it won’t affect your coefficient on salary unless it is correlated with salary.  What is the variable you think should be added?  Length of contract?  Year of contract?  If you did length of contract, then you would probably get a coefficient of zero anyway, since Matt Holliday and Vladimir Guerrero are both in there and cancel each other out.  If you did year of contract, that is presumably positively correlated with salary and negatively correlated with WAR, so I guess that would be something.  But then you’re treating the first year of Matt Holliday’s deal like the first year of a one year deal.  That’s not right either.

Also, on a separate note, Vlad doesn’t want the money doled out evenly.  He wants it early just like anyone, based on present value.  But the team’s discount factor is usually lower than the player’s, since the team can get a higher return from investing than the player can (being that they are in a government approved monopoly that grows more than the stock market, and the players do not have access to a return that is higher than the stock market in all likelihood).  So the teams get more money now and are willing to give the player more money later.

Regardless, there are probably about equally many players in the 1st year of 5 years deals as in the 2nd, 3rd, 4th, and 5th year of 5 year deals.  I’m not quite sure how the correlations would work out, but I’m not convinced you get a different coefficient rather than just one with a large confidence interval.  You would need salary to be negatively correlated with the error in WAR.  It just may be true, but I don’t think it’s trivial given that all contracts probably accelerate and average salary growth may neutralize the effects of individual players seeing more WAR/$ early on in contracts.  I think this is something the data would need to say.  It could be an issue, but I’m not sure that the omitted variable in the regression affects the coefficient on salary yet and the noise/bias argument is not the same.  Just saying that a variable is omitted isn’t enough.  It needs to be correlated with the regressor, salary.
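The omitted-variable point can be illustrated with a small simulation. Everything here is made up for the demonstration: the coefficients, the salary range, the noise levels, and especially the assumption in the second case that "years since signing" rises with salary. It says nothing about which case the real data falls into.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
true_salary_coef = 0.20   # WAR per MM$ (made-up)
true_years_coef = -0.50   # WAR lost per year since signing (made-up)

salary = rng.uniform(1.0, 20.0, n)
noise = rng.normal(0.0, 0.5, n)

def salary_only_slope(years_since_signing):
    """Regress WAR on salary alone, leaving years-since-signing out of the model."""
    war = true_salary_coef * salary + true_years_coef * years_since_signing + noise
    slope, _ = np.polyfit(salary, war, 1)
    return slope

# Case 1: the omitted variable is unrelated to salary.
years_indep = rng.integers(0, 5, n).astype(float)

# Case 2: the omitted variable is correlated with salary (stylized, can go negative).
years_corr = 0.2 * salary + rng.normal(0.0, 1.0, n)

slope_indep = salary_only_slope(years_indep)  # stays near the true 0.20
slope_corr = salary_only_slope(years_corr)    # pulled toward 0.20 + (-0.50 * 0.2) = 0.10

print(slope_indep, slope_corr)
```

In the first case the omitted variable only adds scatter around the salary line. In the second the salary coefficient itself shifts, which is the distinction between noise and bias being argued here.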


#55    Tangotiger      (see all posts) 2009/12/02 (Wed) @ 23:09

"Regardless, there are probably about equally many players in the 1st year of 5 years deals as in the 2nd, 3rd, 4th, and 5th year of 5 year deals. “

And I dispute that.  The data has to show that.  6MM$ per win tells me that.  It simply doesn’t make sense.

Either that, or Rally doesn’t have a lot of WAR given out.  I give out 1000 WAR a season.  How many WAR does Rally have?

We should probably call his rWAR and mine tWAR and MGL’s mWAR.

***

The missing parameter is: “years since contract last signed”.

This way, you’d have this data for Vlad:
12MM$, 0 years ago = 4 WAR
12MM$, 1 year ago = 3.5 WAR
12MM$, 2 years ago = 3 WAR
12MM$, 3 years ago = 2.5 WAR
12MM$, 4 years ago = 2 WAR

Call me crazy, but I see a relationship there.

I would also include age of course, so that if ARod signs a free agent contract at age 26, we might still see a progression up, not down.

And I submit that this is standard.  That the farther out you last signed your contract, the lower your WAR is (age notwithstanding).
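As a sketch of what the extra parameter does with the five Vlad rows above, the snippet below fits WAR against "years since signing" using NumPy. The data are exactly the rows listed in the comment; the fit is only an illustration, not the regression anyone in the thread actually ran.

```python
import numpy as np

salary_mm = np.array([12.0, 12.0, 12.0, 12.0, 12.0])
years_since_signing = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
war = np.array([4.0, 3.5, 3.0, 2.5, 2.0])

# Salary is constant, so on its own it cannot explain any of the change in WAR.
salary_variance = salary_mm.var()

# Years since signing explains all of it in this toy example.
slope, intercept = np.polyfit(years_since_signing, war, 1)

print(salary_variance, slope, intercept)
```

Here the fitted line is WAR = 4 - 0.5 * years since signing, so each additional year removes half a win at the same 12MM$ salary.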


#56    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 23:28

Where does the $4.5MM/win number come from?  I’m not sure that I buy that $6MM/win is wrong and $4.5MM/win is right.  I’ve taken it as given before, but given this discussion, I don’t see how this is a trivial thing to compute.

I’m not disputing the existence of a relationship.  I’m asking a legitimate unanswered question which is whether the salary is correlated with the error, not whether the error exists.

Someone correct me if I’m wrong, but I think that using years since last contract signed is going to mean that the coefficient on salary is wrong.  I think the people in the first year of their deal are providing more WAR/$ than they are actually being paid for because the money is deferred to the later portion of the contract.  So, you will get that the coefficient on years since last contract signed is negative and on salary is more positive than it is for the whole population or even for the whole population of contracts signed in the same off-season if you were to look at those players’ whole deals rather than just that first year.

I would suggest comparing the coefficient done the original way Sky did it for a historical year and then done using the AAV and average WAR, with some discounting of later money, and then see how different a result you get.  I suspect it would not be very different from 0.16.


#57    Guy      (see all posts) 2009/12/02 (Wed) @ 23:37

"Regardless, there are probably about equally many players in the 1st year of 5 years deals as in the 2nd, 3rd, 4th, and 5th year of 5 year deals.”

Even if that’s true, it doesn’t matter.  The problem is that in 4 of the 5 years—for EVERY one of these players—the data is giving you a distorted view of what teams are paying for.  In Tango’s example, Vlad delivers exactly what he was paid for, if the market price is $4M/WAR.  But the regression will not give you a coefficient of .25, it will give you a coefficient of zero. 

I have to say, this is a somewhat weird discussion....

*

The scatterplot confirms that there is an asymmetry due to good players getting hurt.  There are no high-performing but low-salary players.  If a FA is paid under $5M, there’s almost no chance he will turn in a great season.  But a $15M player can deliver zero WAR.


#58    Matt Swartz      (see all posts) 2009/12/02 (Wed) @ 23:45

Just noticed the scatterplot.  What happens if you run the regression without the top three or four salaries?  It seems like the salary outliers might really be messing with the coefficient, and I think it’s worth checking.


#59    Rally      (see all posts) 2009/12/02 (Wed) @ 23:46

"Either that, or Rally doesn’t have alot of WAR given out.  I give out 1000 WAR a season.  How many WAR does Rally have?”

It’s less.  Just checked 2008, and found a total of about 865.  That would account for about half the difference we’re seeing.


#60    Rally      (see all posts) 2009/12/02 (Wed) @ 23:50

I suspect that we’d be more in agreement if we looked at first years of contracts.  I don’t think teams give out multi-year deals very efficiently.

How many free agents give significant bargains to their employers?  Wakefield for one, but I don’t think enough to balance out the Matthews, Byrnes, Pierre, Soriano, Wells, and Zito deals.


#61    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 00:13

I give out 4.4 billion$ of “Free agent” salary to the 1000 WAR.  Giving the same amount to Rally’s 865 WAR means 5.1MM$ per WAR.
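The division behind those two figures, as a short Python check. The totals are the ones quoted in this thread (4.4 billion$ of salary, 1000 tWAR, about 865 rWAR); the variable names are illustrative.

```python
free_agent_salary_mm = 4400.0  # 4.4 billion$, expressed in MM$
twar_total = 1000.0            # WAR handed out per season in Tango's system
rwar_total = 865.0             # Rally's 2008 total, as reported above

dollars_per_twar = free_agent_salary_mm / twar_total
dollars_per_rwar = free_agent_salary_mm / rwar_total

print(round(dollars_per_twar, 1), round(dollars_per_rwar, 1))
```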

***

I HIGHLY encourage reading this discussion, as a lot of the points being brought up were discussed there:
http://www.insidethebook.com/ee/index.php/site/comments/how_much_did_the_2009_free_agents_sign_for_per_win/


#62    James M.      (see all posts) 2009/12/03 (Thu) @ 00:28

"The scatterplot confirms that there is an asymmetry due to good players getting hurt.  There are no high-performing but low-salary players.  If a FA is paid under $5M, there’s almost no chance he will turn in a great season.  But a $15 M player can deliver zero WAR.”

Right.  Which means the relationship is non-linear.  At best it may be (approximately) piecewise linear.  To test this, split the data into 2 parts: $0 to $5 M and over $5 M. If the data were truly linear, the fitted values of WAR($5 M) should be very similar for both regressions.  But they won’t be. 

There is nothing magic about $5 M, of course.  You can repeat the test using different inflection points.

Alternatively, try taking logs of the salary dollars before running the regression.
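A minimal sketch of the split test described in this comment, assuming NumPy. The salary and WAR arrays below are invented purely to show the mechanics (one exactly linear set, one curved set); they are not the 2008 data, and `split_test` is a hypothetical helper name.

```python
import numpy as np

def fitted_war_at(salary_mm, war, at_mm):
    """Fit a straight line to (salary, WAR) and evaluate it at one salary."""
    slope, intercept = np.polyfit(salary_mm, war, 1)
    return slope * at_mm + intercept

def split_test(salary_mm, war, breakpoint_mm=5.0):
    """Fit separate lines below and above the breakpoint; compare both at the breakpoint."""
    salary_mm = np.asarray(salary_mm, dtype=float)
    war = np.asarray(war, dtype=float)
    low = salary_mm <= breakpoint_mm
    high = ~low
    return (fitted_war_at(salary_mm[low], war[low], breakpoint_mm),
            fitted_war_at(salary_mm[high], war[high], breakpoint_mm))

salary = np.array([1, 2, 3, 4, 5, 6, 8, 10, 12, 15], dtype=float)

linear_war = 0.16 * salary        # made-up, exactly linear
curved_war = 0.01 * salary ** 2   # made-up, curved

linear_low, linear_high = split_test(salary, linear_war)
curved_low, curved_high = split_test(salary, curved_war)

print(linear_low, linear_high)   # the two fits agree at the breakpoint
print(curved_low, curved_high)   # the two fits disagree at the breakpoint
```

One caveat about this kind of check: if the data have a single kink located exactly at the chosen breakpoint, the two fitted values can still agree, which is one more reason to repeat it at several breakpoints as suggested.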


#63    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 00:36

When Arb eligible: WAR = .62 + Salary*.21
When FA eligible: WAR = 0 + Salary*.16

In order to generate 1 WAR, the arb guy is going to be paid 1.8MM and the FA is going to be paid 6.25MM.  The arb guy is getting 29% of the salary of the FA.

In order to generate 2 WAR, the arb guy is getting 6.57MM and the FA is getting 12.5MM.  Arb guy is getting 53% of the salary of the FA for the same production.

To generate 3 WAR: 11.3MM, 18.75MM, 60%.

To generate 4 WAR: 16MM, 25MM: 64%.

Roughly speaking, the arb guy is getting around 50% of the FA.  But, there is no arb guy getting 16MM.  Indeed, I would have expected the relationship to go the other way.  I’d have figured the discount level would be much higher for the good arb-player than the bad-arb player.

However, this may be another case of bias, whereby a distinction between arb-1, arb-2, arb-3 is necessary.  I presume most of the arb guys making under 2MM are arb-1, arb-2, and most of the arb guys making 6MM+ are arb-3 players.
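The salary figures in this comment come from solving each fitted line for salary at a given WAR. A short Python sketch of that inversion follows; the coefficients are the ones quoted at the top of the comment, and the function names are illustrative.

```python
def arb_salary_mm(war):
    """Invert WAR = .62 + Salary * .21 for an arb-eligible player."""
    return (war - 0.62) / 0.21

def fa_salary_mm(war):
    """Invert WAR = 0 + Salary * .16 for an FA-eligible player."""
    return war / 0.16

for war in (1, 2, 3, 4):
    arb = arb_salary_mm(war)
    fa = fa_salary_mm(war)
    # Columns: WAR, arb salary (MM$), FA salary (MM$), arb as a share of FA
    print(war, round(arb, 2), round(fa, 2), f"{arb / fa:.0%}")
```

Because the arb line has a positive intercept (.62 WAR at zero salary), the arb share of FA salary necessarily climbs as WAR increases, which is the pattern Tango finds surprising.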


#64    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 00:40

You can practically force a linear relationship with WAR.  Just change the baseline.

Remember:

WAR = WAA + rate * PA_or_IP

Regardless, the data points look fairly linear to me.


#65    Matt Swartz      (see all posts) 2009/12/03 (Thu) @ 00:42

James/62:

I don’t see why that means it’s a non-linear relationship.  It’s just that the error terms exhibit heteroskedasticity (bigger errors for bigger salaries).

It has to be an almost exactly linear relationship or there is a severe market inefficiency that would have been corrected.  The winner of the free agent auction should be the team that has the highest value per win.  Since the playoff added value per win is approximately constant over a range of 0-6 wins when you account for the binomial variance of win total, then any situation where the relationship was nonlinear would be improved by buying and trading players by one team and adjusting accordingly.  The only possible issue is that these free agent salaries don’t take draft pick compensation into account which would actually make the salary paid nonlinear with WAR even if the opportunity cost for the team was.

I thought about taking logs of salary, but I’m not sure if we can then determine the proper linear coefficient.  Logs of salary would be the natural way to adjust for the skewed distribution of salary, but I’m not sure there’s a mathematical way to then approximate that relationship of $/WAR which should be linear.

Tango/61:

Whatever we get for a correct dataset to use in regression-- if we get an intercept closer to 0 with one WAR, that’s the better one to use.


#66    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 01:19

Good point (though only if you do salary minus 400K).

And only for free agents.


#67          (see all posts) 2009/12/03 (Thu) @ 10:31

Tango/51:

***What you ARE saying is that EVERYONE who has signed a free agent contract in 2010 or earlier and did so for 18MM in 2010, will generate 3 WAR each.  That I can believe.  This includes say Torii Hunter and Vernon Wells, and say Matt Holliday this year.***

Yes, that is what I’m saying and that’s what I’ve been saying all along.  If you give out $18 million, on average expect 3 WAR in return.  You may get better than that early and worse than that late, but on average it’s 6M per WAR.

Tango/53, The players in the scatterplot and regression are those with 6+ years of service.

Matt/58, I removed players with salaries 16M and higher - got about the same coefficient (.155).

I think long-term contracts in general are something to look into.  This weekend I plan to go back and reapportion them so they are not backloaded.  I’ll see if anything changes.  My gut says not much, but it’s worth checking.


#68    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 11:45

"Tango/53, The players in the scatterplot and regression are those with 6+ years of service. “

I think when you say:
“so here they are for free agent eligibles in 2008.”

That means something else.  If, for example, you wanted to only show the guys who signed a free agent contract for 2008, what would you say?

***

Ok, I’ve come around to accepting that there is no bias in what Sky is doing, as long as the population he is showing is representative each year.  If, for example, there are a ton of “back-loaded” (and by that, I mean guys at the end of their contract years, who are obviously at the tail-end of their career) guys in 2008, this skews the results.  The sample is not representative.

If you do what Sky did for a few years past (at least 3, preferably 5), that pretty much knocks out the non-representativeness argument.

***

Now, if Sky continues to show 6MM$ per win, then it’s going to be something else.  Simply put, I don’t believe in the 6MM$ figure.  You’ll have to drag me there to believe it.

***

I once again highly encourage reading Studes’ ground-breaking work on this in the THT07 annual, which is available for free here:
http://www.wowio.com/users/product.asp?BookId=511

Go to page 131 (by clicking at about the 40% mark of the slide bar on top).

He does what Sky does, but he uses Win Shares Above Bench.  WSAB, when looking at things in the aggregate, is a good metric.

PLEASE, PLEASE, PLEASE, everyone read that.  It’s similar to what Sky is doing, but Studes is using data from earlier years.

When you make a post, please tell me you read it.


#69          (see all posts) 2009/12/03 (Thu) @ 12:08

Here are the coefficients for the previous 5 years and the translated $ per WAR:

2008: .163 -> 6.1
2007: .217 -> 4.6
2006: .211 -> 4.7
2005: .167 -> 6.0
2004: .204 -> 4.9

The average is 5.3M per WAR.  Given that there has been salary inflation since 2004, the true $ per WAR for 2009 is probably over $5.5M.  It’s certainly more than $4.5M.
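The translation from coefficient to dollars per WAR is just a reciprocal, since the coefficient is WAR per million dollars of salary. A short Python check of the table, with illustrative variable names:

```python
# Regression coefficients (WAR per MM$ of salary) by season, as listed above.
coef_by_year = {2008: 0.163, 2007: 0.217, 2006: 0.211, 2005: 0.167, 2004: 0.204}

# MM$ per WAR is the reciprocal of WAR per MM$.
dollars_per_war = {year: round(1 / coef, 1) for year, coef in coef_by_year.items()}

# Simple unweighted average over the five seasons, with no inflation adjustment.
average = sum(1 / coef for coef in coef_by_year.values()) / len(coef_by_year)

print(dollars_per_war)
print(round(average, 1))
```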

I read the Studes article, though I didn’t find a huge amount of detail on his methods like we’ve been discussing here.


#70    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 13:40

You can’t look at 4.5:

I give out 4.4 billion$ of “Free agent” salary to the 1000 WAR.  Giving the same amount to Rally’s 865 WAR means 5.1MM$ per WAR.

My WAR, which we should call tWAR is different from Rally’s WAR, which we should call rWAR.

With tWAR, it is 4.4MM$ per win.  With rWAR, since you have fewer WAR than tWAR, each rWAR has to cost more, which is why the baseline is 5.1MM$.

I think from Sky’s chart, we can see that 2008, and especially 2005 (when you consider inflation), are real outliers.

So, I am vindicated that the sample in 2008 was not representative.

However, the overall values are still a bit high: average of 5.3MM$ per rWAR (in the average year of 2006).

The average for tWAR (in 2006) would be probably around 4.0.  And 4.0MM$ per tWAR would be equivalent to 4.6MM$ per rWAR.

We’re not THAT far off overall.

***

As for studes’ article, there’s a ton of information in there, and I was able to use that as the (initial) basis for my financial model.

***

From this point forward, it’s important to note that tWAR != rWAR.  So, when you are trying to compare to FanGraphs’ WAR (or fWAR, similar to tWAR), you are not making an even comparison.

***

Great work Sky!


#71    Rally      (see all posts) 2009/12/03 (Thu) @ 14:09

Tango, what is your position player/pitcher split for WAR?


#72          (see all posts) 2009/12/03 (Thu) @ 14:14

Thanks Tango.  It only took 70 posts to sort this all out : )

I had no idea that the types of WAR were that different.  The 2008 season is on the high side.  My estimate of a true $5.5M or $5.75M per WAR would translate to $4.75M or $5M per WAR on your scale.  So yeah, not too far off.
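Moving a price between the two scales only needs the ratio of the WAR totals. The sketch below assumes the 1000 tWAR and 865 rWAR season totals quoted earlier in the thread, and the helper names are hypothetical.

```python
TWAR_TOTAL = 1000.0  # WAR per season on Tango's scale
RWAR_TOTAL = 865.0   # WAR per season on Rally's scale (2008)

def rwar_price_to_twar(price_mm):
    """Convert MM$ per rWAR into MM$ per tWAR (more tWAR to spread the money over)."""
    return price_mm * RWAR_TOTAL / TWAR_TOTAL

def twar_price_to_rwar(price_mm):
    """Convert MM$ per tWAR into MM$ per rWAR."""
    return price_mm * TWAR_TOTAL / RWAR_TOTAL

print(rwar_price_to_twar(5.5), rwar_price_to_twar(5.75))
print(twar_price_to_rwar(4.0))
```

These come out close to the rounded figures used in the comments: roughly 4.75 and 5 on the tWAR scale, and roughly 4.6 on the rWAR scale for a 4.0 tWAR price.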

I think there is more work to be done on this issue overall, but this is a good start.


#73    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 14:50

Rally: the split is 4/3 for non-pitchers/pitchers.  This split also happens to, not-so-coincidentally, match the split in terms of salaries.


#74    Bill Letson      (see all posts) 2009/12/03 (Thu) @ 14:51

I read the Studenmund article, here’s what I see for material differences.

*Result - $4.4M for Studes, $4.7M for Sky for 2006

*DNPs - Studes omits players who are paid but do not play, Sky does not explicitly state either way unless I missed it.  I presume that he omits as well, since these players don’t appear to have entries in the Sean Smith WAR data (at least Bagwell doesn’t, I didn’t check everyone).  If that is true then this isn’t actually a difference.  If Sky actually added these in at 0 WAR, then this would contribute to the gap in results.

*WSAB vs WAR - WSAB has fewer marginal wins to go around.  If converted to the same measure of wins, the results would be spread further apart.

*Regression vs Simple Average

*Categorization - Same definitions, but Studes looks at players on an individual basis while Sky estimates based on playing time.  In Sky’s study, good players (more strictly, those with a lot of playing time) are going to be more likely to end up in the free agent category while still arb eligible or even under team control.  But correcting this would push Sky’s estimate higher, not lower.  It could be that there are more of these type of players in ‘07, ‘06, and ‘04 than in ‘08 and ‘05.  In fact, Miggy in ‘06 was actually under team control, but the 130 PA = 1 year service estimation would drop him in the free agent category.


#75    Rally      (see all posts) 2009/12/03 (Thu) @ 15:37

I assumed Sky gave 1 year of credit for 130 PA in the estimation, but would have a maximum of one year service per actual calendar year.  I did not read that as a rookie with 700 PA following an 80 PA September callup showing in the free agent category.  Correct me if I’m wrong, but I think that’s a pretty good way to estimate service time if all you go on is Baseball Databank.

I have 59% of WAR in 2008 going to non-pitchers.  Close enough to the 57% implied by a 4/3 split that I won’t worry about it.


#76    Bill Letson      (see all posts) 2009/12/03 (Thu) @ 16:06

Rally, you’re right, your reading makes much more sense than mine - my bad.


#77    Tangotiger      (see all posts) 2009/12/03 (Thu) @ 16:10

Right, I have no problem with anything between 40%-45%.

***

I posted the service time file two years ago:
http://www.insidethebook.com/ee/index.php/site/comments/request_mlb_service_time/#27

Go to post 27.  It’s mapped to the BDB ID.

That’s the career service time entering the 2008 season.

And, in there, I show service time accrued in 2007, so Sky can snap his fingers and get career service time entering 2007.

From there, he can try to estimate service time for 2006.


#78    Guy      (see all posts) 2009/12/03 (Thu) @ 16:11

How much rWAR do teams get from Arbs and slaves?  If I read it correctly, Sky has about 270 FAs delivering an average of 1.1 WAR for $7M each.  So an average (mean) team will spend $62M on 9 FAs to get 10 wins.  That should leave another, what, 25 wins (?) from non-FAs at a price of about $28M.  But I don’t see how you can get that many more wins, given the coefficients for the non-FA players.

