THE BOOK--Playing The Percentages In Baseball


Tuesday, December 11, 2007

SABR’s By The Numbers

By Tangotiger, 02:20 PM

The latest is up on Phil’s site.  I just zipped through it.  The one from Wang looks good.  And, he reports something interesting that I was going to do (I love it when someone else has the same idea, and does it before me):

The $/WARP for a fourth year player was $.64 million/WARP, for a fifth year player it was $.83 million/WARP, and for a sixth year player it was $1.29 million/WARP. Remember, it cost $1.525 million for every additional WARP in the free agent market.

0.64/1.525 = 42%.  0.83/1.525 = 54%.  1.29/1.525 = 85%.

What those numbers represent is the cost for the 3+, 4+, and 5+ year player, relative to a free agent.  I’ve been using 40%, 60%, 80%, which is fairly consistent with the above set of numbers. 
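
The ratios above can be reproduced with a few lines of Python. This is only a sketch: the dollar figures are the ones quoted from Wang's article, while the dictionary keys and the function name are made up for illustration.

```python
# $/WARP figures quoted from Wang's article, in millions of dollars per WARP.
FREE_AGENT_COST = 1.525
ARB_COST = {"3+": 0.64, "4+": 0.83, "5+": 1.29}  # keyed by service-time class

def share_of_fa_cost(cost_per_warp, fa_cost=FREE_AGENT_COST):
    """Return the cost as a rounded percentage of the free agent price."""
    return round(100 * cost_per_warp / fa_cost)

shares = {k: share_of_fa_cost(v) for k, v in ARB_COST.items()}
print(shares)
```

The rounded shares land close to the 40/60/80 rule of thumb, which is the point of the comparison.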


#1    Patriot      (see all posts) 2007/12/11 (Tue) @ 14:49

I haven’t read it yet, either, but I did notice “Does the Runs Created Formula Work for Division III Softball?”.  This is a problem that is screaming out for Base Runs.  Both of the articles I’ve published in BTN have been either largely or solely about Base Runs, and yet...I’m not surprised by this, mind, certainly not anymore, but still.


#2    Tangotiger      (see all posts) 2007/12/11 (Tue) @ 15:43

Yes, quite a joke.  Unless it has the Bill James seal of approval, we are largely ignored.  Even a site like baseball-reference.com, as avant-garde as it is, shows much too much deference to James.

I told James about my Markov, and he agreed that the run value of the HR was too high in the high environment, and he articulated, correctly, the reason.  And still, he doesn’t change RC!

On the other hand, I did post a link to a fellow who used BaseRuns in the Israeli league.  Once he accounted for the reaching base on error, BaseRuns did quite well.

I will say that once someone accepts BaseRuns, they never turn their back on it.  The same can’t be said for anything else.


#3          (see all posts) 2007/12/11 (Tue) @ 16:27

I’ll point the authors here so they can find out about Base Runs ... maybe they can get another article out of it!


#4    jinaz      (see all posts) 2007/12/11 (Tue) @ 16:43

I really like Wang’s study, because it’s so relevant to some of the things I’m fretting about with the Reds talking about shipping off their farm system for Erik Bedard.  But I had a question:

As I understand it, WARP has an extremely low baseline for position players because it compares players to “replacement level” hitting AND fielding.  So if you’re using a 75% of league average replacement level in each, then the replacement PLAYER is set to 0.75*0.75 = about 56% of league average.

Does a similar problem exist for pitchers?  Or does WARP use a more reasonable baseline for them?  Comparing my RAR numbers to WARP for a few pitchers does still show WARP to be rather high, but I’ve never seen much discussion about WARP for pitchers.

If pitchers are treated more reasonably in WARP, that could be contributing to why pitchers do so much more poorly than hitters.  Though clearly the high “bust” rate is probably the bigger source of error.
-j


#5    studes      (see all posts) 2007/12/11 (Tue) @ 16:55

I think you know this, but I already did this analysis in last year’s THT Annual.


#6    tangotiger      (see all posts) 2007/12/11 (Tue) @ 17:13

Yes, it was last year’s annual that spurred my “Sabermetrics in the off-season”, as well as settling on the 40/60/80 numbers I used.

jinaz: yes, I did talk about pitcher replacement levels.  IIRC, it’s something like 2.35 runs worse than average!  You can check out my blog here… look for “the replacement pitchers” or some such. Obviously, a 2.35 level is completely off the wall.  My guess is he chose that so that he could properly balance pitching with nonpitching.  That is, in order to offset one wrong choice (the double-replacement level for nonpitchers), he had to take an absurdly high ERA replacement level for pitching, to keep the nonpitching/pitching WARP levels in good proportion.

Every time I see WARP, I get a headache.  If there is someone out there that actually supports an ERA level of 2.35 runs worse than average, please stand up.  Clay is awfully lonely.


#7    jinaz      (see all posts) 2007/12/11 (Tue) @ 18:42

Tango, thanks--I didn’t follow this blog particularly well prior to this season, so I missed out on a lot of background work.  I’ll go check some of that out when I get a chance.

Also, in #4, my last sentence should have read “Though clearly the high “bust” rate is probably the bigger source of the difference.” Never type while holding an unrelated conversation with a coworker.
-j


#8    studes      (see all posts) 2007/12/14 (Fri) @ 20:23

Just read the Wang article.  I think he only scratches the surface, but I’m mostly interested in the BPro figure of $1.5 million per FA WARP.  I’ve looked through my copy of BPro 2006 and can’t find it.  Does anyone know what page that figure is on?


#9    studes      (see all posts) 2007/12/14 (Fri) @ 20:30

By the way, I just pulled out the 2004 BPro, which had Doug Pappas’s old Marginal Sal/Marginal Win tables.  In 2003, the average marginal salary per marginal win across all teams and all player classifications was $1.9 million.  No way that it’s only $1.5 million two years later for free agents only.  Weird math.


#10    tangotiger      (see all posts) 2007/12/14 (Fri) @ 21:58

BP uses whatever “marginal wins” suits the author at hand.  Pappas uses what I do: team win% of .300.  Clay uses team win% of .150. 

Not only does the .300 have far more grounding in reality, it also has a very linear relationship between wins and $, which is always a good bonus.
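
A rough sketch of why the two baselines give such different dollars per marginal win. Only the .300 and .150 win percentages come from the comment above; the function name and the 81-win example team are assumptions for illustration.

```python
GAMES = 162

def marginal_wins(team_wins, replacement_win_pct, games=GAMES):
    """Wins above what a replacement-level team would win over a full season."""
    return team_wins - replacement_win_pct * games

avg_team_wins = 81  # hypothetical average team
mw_300 = marginal_wins(avg_team_wins, 0.300)  # baseline of a .300 team
mw_150 = marginal_wins(avg_team_wins, 0.150)  # baseline of a .150 team
ratio = mw_150 / mw_300
print(round(mw_300, 1), round(mw_150, 1), round(ratio, 2))
```

For an average team, the .150 baseline spreads the same payroll over about 1.75 times as many marginal wins, so the dollars per marginal win come out that much lower.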


#11    VictorW      (see all posts) 2007/12/15 (Sat) @ 11:35

#8- Oakland Athletics essay, page 327


#12    VictorW      (see all posts) 2007/12/15 (Sat) @ 13:49

By the way, I’m repeating my study using WSAB.  I’ve finished the data for top 10 hitters and found the breakeven WAR is 15.37.  The average star caliber hitter gives a team $89.7 million in surplus value in PV dollars!


#13    studes      (see all posts) 2007/12/15 (Sat) @ 15:18

Thanks, Victor.  I’m able to replicate your numbers and ratios overall, so I think your study probably holds up well.  But it blows me away what a difference 30% vs. 15% makes.  Crazy.

When it comes to absolute numbers, I feel strongly that WARP should not be used for payroll analysis.  Minimum salary in today’s age gets you much more than whatever it is the WARP baseline represents.


#14    VictorW      (see all posts) 2007/12/15 (Sat) @ 15:48

MGL, I did a similar study to the one you suggest.  It was published in the latest edition of BTN.  It looks at prospects rated in the top 10 and from 11-25 by Baseball America from 1990-1999 using WARP.  I’m running the numbers again using WSAB.


#15    tangotiger      (see all posts) 2007/12/15 (Sat) @ 16:04

There’s no question that WARP is less than ideal to be used in any linear relationship.  You might as well remove some 2 to 2.5 wins per 162 G to get WARP into WAR (which uses a .300 team replacement level rather than .150).

So, if someone is saying that the average player is 4 WARP, and that represents 6MM of free agent dollars in 2005, that’s the equivalent of 2 WAR for 6MM.  And at that point, the WAR and $ are linearly related.
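
The conversion in this comment can be sketched as follows. The 2-win offset is the low end of the "2 to 2.5 wins per 162 G" range quoted above; the function name and the `season_fraction` parameter are assumptions added for illustration.

```python
def warp_to_war(warp, season_fraction=1.0, offset_per_season=2.0):
    """Shift WARP down to a .300-team baseline; the offset scales with playing time."""
    return warp - offset_per_season * season_fraction

avg_warp = 4.0        # average full-time player, in WARP
fa_dollars_mm = 6.0   # free agent dollars (millions) for that player

avg_war = warp_to_war(avg_warp)
dollars_per_warp = fa_dollars_mm / avg_warp
dollars_per_war = fa_dollars_mm / avg_war
print(avg_war, dollars_per_warp, dollars_per_war)
```

The same 6MM player costs 1.5MM per WARP but 3MM per WAR, which is why the two rates cannot be compared directly.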


#16    MGL      (see all posts) 2007/12/15 (Sat) @ 22:52

Wow (#298)!  Is your study available yet online?  Can you link it here?


#17    VictorW      (see all posts) 2007/12/16 (Sun) @ 01:13

MGL, here’s the link:
http://www.philbirnbaum.com/btn2007-08.pdf


#18    MGL      (see all posts) 2007/12/16 (Sun) @ 06:06

I have not read it carefully yet, Victor, but very nice article.  I have a couple of questions:  One, you use around 1.5 mil per WARP in the FA market.  We are using around 4.5.  That is quite a difference.  I do not understand the 1.5.  Regardless of how BP defines replacement, a WARP is a WARP, right?  As far as I know, the 4.5 is correct, is it not?  OK, I take that back.  If an average player in BP’s WARP system is 4 WAR, which I think it is (rather than 2 WAR in most systems), then 1.5 per win is correct for an average player, but after that, I think you still have to use 4.5 or thereabouts, do you not?

For example, we know that players like A-Rod and Pujols are worth around 25 mil a season as FA, no?  Well, according to Pecota, Pujols is projected for 08 as 9.2 and A-Rod as 7.4.  Let’s call that 8.3.  Using 1.5 mil per WARP, that is a FA value of 12.45.  That is not even close to being correct.  It should be twice that.  How are you/they using 1.5 per win??
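
The arithmetic in this example, as a short sketch. The projections and the 25 mil target are the figures quoted above; the variable names are made up.

```python
# PECOTA WARP projections for 2008, as quoted in the comment.
pecota_warp = {"Pujols": 9.2, "A-Rod": 7.4}

mean_warp = sum(pecota_warp.values()) / len(pecota_warp)
value_at_1_5 = mean_warp * 1.5   # FA value in millions at 1.5 mil per WARP
implied_rate = 25.0 / mean_warp  # mil per WARP needed to reach about 25 mil
print(round(mean_warp, 2), round(value_at_1_5, 2), round(implied_rate, 2))
```

Under a strictly linear scale, the rate would have to be roughly 3 mil per WARP, about double the 1.5 figure, to reach a 25 mil salary.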

Also, I would love to see a regression equation (maybe a logit regression) using prospect number and mean WARP in first 6 years.  The numbers for the 11-25 and 1-10 seem awfully close to one another, suggesting that maybe the scouts at BA are not that good, at least in terms of distinguishing a top 10 prospect from an 11-25 one.  I would also love to see you extend either a regression like that or just the categories as you did, all the way down to the 91-100 prospects.  I would love to see how well the scouts at BA are able to distinguish prospect talent.  As I said, although the 1-10 appear to do a little better than the 11-20, I am not sure that the differences are significant enough to get excited about.  For example, say we have two similar players who have the same OPS (or lwts or whatever) in the minors (MLE).  And say that one is a top 10 prospect and the other is a 11-20 prospect.  Is there enough of a difference in your numbers to give the top 10 prospect a better projection?  IOW, is it worth regressing a player’s stats based on how good a prospect he is?  If yes, can we distinguish one quintile from another?


#19    Anthony      (see all posts) 2007/12/16 (Sun) @ 10:58

Couldn’t the similar numbers of the 1-10 prospects and 11-25 prospects be due to WARP’s low replacement level? That is, if the 11-25 group has more replacement-level players, and they get, say, 2 WARP instead of 0 WAR, that’s bringing the lower group up closer to the elite group?


#20    VictorW      (see all posts) 2007/12/16 (Sun) @ 15:53

#18- MGL, I wrote the article using BP’s $/WARP and a linear system.  I know Silver also created a MORP system that is supposed to fit WARP better.  Using his definition of MORP on the BP website, it appears MORP fits much better the Pujols example you gave.  I reran the savings for top 10 hitters using MORP and a 40/60/80 rule for the arbitration years and I found the average top 10 hitter gives a PV saving of $45.18 million which is about $15 million more than the savings posted in the article.  However, given the nonlinearity of MORP, how much WARP you could acquire with those savings depends on how you distribute the money to the players.  I’ll take a look at the regression and I plan to extend the study to include all players in the top 100.
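
For readers who want to see the shape of a PV savings calculation, here is a minimal sketch. It is not the method from the article: it assumes a linear dollars-per-win scale, three years at the minimum salary, three arbitration years at 40/60/80% of market value, and a constant discount rate. Every input in the example call is hypothetical.

```python
ARB_SHARES = (0.40, 0.60, 0.80)  # assumed share of market value paid in arb years

def pv_surplus(wins_by_year, fa_price_per_win, min_salary, discount_rate):
    """Present value of (market value - salary) over six years of team control.

    wins_by_year: six seasons of wins above replacement.
    Dollar amounts are in millions.
    """
    total = 0.0
    for year, wins in enumerate(wins_by_year):
        market_value = wins * fa_price_per_win
        if year < 3:
            salary = min_salary  # pre-arbitration years
        else:
            salary = ARB_SHARES[year - 3] * market_value
        total += (market_value - salary) / (1 + discount_rate) ** year
    return total

# Hypothetical player: 2 wins a year, 4.0 per win, 0.4 minimum, no discounting.
example = pv_surplus([2] * 6, 4.0, 0.4, 0.0)
```

With a nonlinear scale such as MORP, `market_value` would need its own function in place of the simple multiplication.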


#21    tangotiger      (see all posts) 2007/12/16 (Sun) @ 17:22

Also try to rerun assuming everyone’s WARP is 2 wins too high, per 162 G or IP (with a likely linear wins to $).


#22    MGL      (see all posts) 2007/12/16 (Sun) @ 17:53

Are the pitchers’ WARP’s too high (or just wrong) also?  Does anyone know where they get 1.5 mil per WARP?  Even using their own low replacement level, that can’t be right, given the Pujols and A-Rod examples I gave.

Anthony, I guess.  Another reason why using BP’s WARP and using 1.5 mil per WARP (clearly understating the FA value of a win) is screwing up the results of the otherwise very good analysis and article.


#23    tangotiger      (see all posts) 2007/12/16 (Sun) @ 20:37

I don’t know where the 1.5 is from (which year).  2003?

1.5MM per WARP would imply 6MM for an average 4 WARP player.  He would be 2 WAR for us, meaning 3MM per WAR.  If that’s in 2003, then it would be 3.3 in 2004, 3.6 in 2005, 4.0 in 2006.  It sounds about right to me, if that’s where the 1.5 comes from.
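
The year-by-year figures in this comment are consistent with roughly 10% annual growth in the price of a win. That growth rate is inferred from the numbers, not stated by the commenter, so treat the sketch below as an assumption.

```python
def dollars_per_war_by_year(base, base_year, end_year, growth=0.10):
    """Compound a base $/WAR (millions) forward at an assumed annual growth rate."""
    return {
        year: round(base * (1 + growth) ** (year - base_year), 1)
        for year in range(base_year, end_year + 1)
    }

base_per_war = 1.5 * 4 / 2  # 1.5MM per WARP, 4 WARP average player, 2 WAR
schedule = dollars_per_war_by_year(base_per_war, 2003, 2006)
print(schedule)
```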

The average pitcher has something like 2.35 runs above replacement, per 9 IP, using WARP.


#24    studes      (see all posts) 2007/12/16 (Sun) @ 22:38

The $1.5M is a combined figure from ‘05 and ‘06 offseasons, I think.  As far as I can tell, the salary figures are correct.  It’s just that WARP starts so dang low.

Every economic analysis like this starts with a baseline of minimum salary, as it should.  The critical question is “what level of performance is a reasonable match for minimum salary.” My assumption in pulling together WSAB was that an average bench player will cost the minimum.

Another approach is to use Freely Available Talent, which I think essentially means players you can pick up off the “scrap heap” for the minimum.  Rule V guys and minor league free agents.  Maybe WARP is appropriate for that, don’t know.

OTOH, last year I looked at all players who made the minimum and found they had a Win Shares percentage of over .450 (IIRC).  That’s because there are some stars who make only the minimum cause they’re in their first or second year.  My argument is that this can be a proper baseline too, because you can get those players at the minimum.  That’s a reality of baseball economics.

It comes down to philosophy.  OTOH, if I were an actual team, I’d throw all this out the window and calculate my own replacement level for each position, based on who I’ve got in my system and some version of FAT.


#25    tangotiger      (see all posts) 2007/12/16 (Sun) @ 23:52

Studes’ question is right on.  His answer though regarding the WSAB is that an average bench player will not cost the minimum… how could he?  He’d cost something like 750K.  I’m sure if you go through and look at how much the average bench player cost, that’s how much he is being paid. 

This is fine, since the average bench player is a bit above the replacement level (i.e., the 25th guy, as opposed to the average bench player, who would be the average between 17th and 25th player).


#26    studes      (see all posts) 2007/12/17 (Mon) @ 01:17

“His answer though regarding the WSAB is that an average bench player will not cost the minimum… how could he?”

Yeah, I agree.  I’ve thought of changing the parameter in that direction, but then I always stop and think about it in the other direction: there are plenty of stars making the minimum.  In that sense, you could argue for a higher level to match the average minimum salary.

So I stay where I’m at: a compromise that probably isn’t ideal at all.  Sheesh!


#27    studes      (see all posts) 2007/12/17 (Mon) @ 01:24

BTW, did people read the James article, too?  I thought it was well done, and I was particularly heartened to see that he used the PythagenPat formula (though he didn’t name it).

His finding, that teams that significantly beat their pythagorean formula do better the next year, is interesting and begs the question of why.


#28    tangotiger      (see all posts) 2007/12/17 (Mon) @ 08:18

Presuming you’ve got some 16 regular starters per team (8 or 9 nonpitchers, 5 starting pitchers, plus 2 or 3 relievers), that leaves you with 12 bench players (presumes some guys on DL, etc) per team, for a total of 360 players.  I don’t think you can find 360 players that are paid the league minimum, out of the 840 highest-paid players.
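
The head count behind these totals, as a quick sketch. The 30-team figure is the size of the major leagues; the variable names are illustrative.

```python
TEAMS = 30
regulars_per_team = 16  # 8 or 9 nonpitchers, 5 starters, 2 or 3 relievers
bench_per_team = 12     # includes players covering for DL stints, etc.

bench_total = TEAMS * bench_per_team
player_total = TEAMS * (regulars_per_team + bench_per_team)
print(bench_total, player_total)
```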


#29    MGL      (see all posts) 2007/12/17 (Mon) @ 08:36

Yes, I read the James study and was intrigued as well.  For those that did not read it, I summarize:

While he (and others) found that teams that exceed or underperform their pythag w/l in year X do not do so (by more than .2 to .4 wins) in year X+1, he found that teams’ RS and RA significantly moved towards their actual records in year X+1.

For example, if a team’s pythag record was 81-81 in year X but they won 90 games, in year X+1 their pythag record (and actual record) was significantly better than a “matched” team that had the same pythag record (in fact, almost the same RS and RA) in year X, but an actual record close to their pythag record.

So, if team A was 800 RS and 800 RA in year X but won 90 games (should have won 81 of course) and team B was also 800/800, but won 81 as they should, he found that team A was likely to be 850/800 and win 85 games (or whatever) in year X+1, but team B was likely to be 800/800 and win 81 games again.
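
For reference, here is a sketch of the pythag calculation applied to this example. It uses the commonly cited PythagenPat form, with exponent ((RS + RA) / G) ^ 0.287; whether James used exactly this constant is an assumption, and the function names are made up.

```python
def pythagenpat_exponent(rs, ra, games=162, power=0.287):
    """Exponent that floats with the run environment (runs per game, both teams)."""
    return ((rs + ra) / games) ** power

def expected_wins(rs, ra, games=162):
    """Expected wins from runs scored and runs allowed."""
    x = pythagenpat_exponent(rs, ra, games)
    return games * rs ** x / (rs ** x + ra ** x)

team_b = expected_wins(800, 800)  # the team that stays at 800/800
team_a = expected_wins(850, 800)  # the team that moves to 850/800
print(round(team_b, 1), round(team_a, 1))
```

An 800/800 team comes out at exactly 81 wins, and an 850/800 team at a little under 86, in line with the "85 games (or whatever)" in the example.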

I think that the difference in year X+1 was mostly in runs scored and not runs allowed.  IOW, a team which overperformed its pythag record in year X saw an increase in its RS the next year, but not its runs allowed.  Same for underperforming teams I think.  I think they saw a decrease in RS but not RA in year X+1.  And again, he (correctly) used matched teams to make sure that these “regressions” the next year were not typical of all teams with these RS and RA.

Same was true of underperforming teams.

I wondered why this might be the case, but have not put too much thought into it. There obviously could be some kind of effect on the team in terms of trades/acquisitions from over or under-performing their pythag wins, but that would probably not explain such a significant finding.


#30    Rally      (see all posts) 2007/12/17 (Mon) @ 10:17

Shouldn’t we expect that of two teams of 800 runs, the one that won 90 games has a slightly better offense because they are not batting in the bottom of the 9th as much?

That’s probably a minor thing, less than 5 runs.  But if you find enough minor things you might be able to add them up and explain the effect.


#31    studes      (see all posts) 2007/12/17 (Mon) @ 10:59

“I don’t think you can find 360 players that are paid the league minimum, out of the 840 highest-paid players.”

I agree but, again, that’s just one way of looking at it.  Another way is to ask what contribution those who made the minimum last year made.  If you ask it that way, you get a very different answer.


#32    MGL      (see all posts) 2007/12/17 (Mon) @ 16:22

Rally, good point.  Teams that overachieve their pythag usually win lots of close games too.  Close games probably have more batting in the bottom of the 9th or later as well.

At the very least, James should have used RS and RA per inning.  I don’t think this is the first time that James has done a “rough” study and then thrown it out there for other people to refine.  I guess that is not all that bad.  It is better than nothing.

