THE BOOK--Playing The Percentages In Baseball

Wednesday, August 23, 2006

Selective Sampling - How NOT to Choose Players

By Tangotiger, 08:21 AM

Cy Morong takes a look at establishing the replacement level.  He says:


I ranked all NL pitchers from worst to best in RSAA per IP

And therein lies the problem.  You cannot establish, after the fact, who is a replacement-level pitcher.  Say you have a pitcher that gets thrown into the dustbin, I dunno, maybe Orlando Hernandez, between 2003 and 2004.  He was a free agent, and signed with the Yanks for 500,000$.  That’s the definition of replacement level.  In 2004, he went 8-2, with a 3.30 ERA.  However, he would fall outside Cy’s selection criteria (and those of pretty much everyone else who looks at this issue).

This is a huge selection bias, and invalidates any results presented.
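
A minimal simulation sketch of why picking the worst performers after the fact misleads. Every number here is an assumption chosen for illustration (the pool size, the talent spread, the amount of single-season luck); none of it comes from Cy's data. It only shows that the pitchers with the worst observed results have, on average, a true talent much closer to average than their results suggest.

```python
import random
import statistics

def simulate(n_pitchers=300, talent_sd=0.5, noise_sd=1.0, bottom_frac=0.10, seed=1):
    """Return (mean observed, mean true) runs per 9 vs. average for the worst observed group."""
    rng = random.Random(seed)
    pitchers = []
    for _ in range(n_pitchers):
        true = rng.gauss(0.0, talent_sd)             # true talent, positive = worse than average
        observed = true + rng.gauss(0.0, noise_sd)   # one season of luck layered on top
        pitchers.append((observed, true))
    pitchers.sort(reverse=True)                      # worst observed results first
    k = int(n_pitchers * bottom_frac)
    worst = pitchers[:k]
    return (statistics.mean(o for o, _ in worst),
            statistics.mean(t for _, t in worst))

obs, true = simulate()
print(f"worst 10% observed: {obs:+.2f} runs/9, their true talent: {true:+.2f} runs/9")
```

The gap between the two printed numbers is the selection bias: the group was chosen partly for its bad luck.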

#1    Tangotiger    2006/08/23 (Wed) @ 13:32

I also recommend these articles, which discuss True Score Theory, Theory of Reliability, and Regression Toward the Mean:

http://www.socialresearchmethods.net/kb/truescor.htm

http://www.socialresearchmethods.net/kb/reliablt.htm

http://www.socialresearchmethods.net/kb/regrmean.htm


#2    2006/08/23 (Wed) @ 15:15

I certainly don’t claim to be an expert on replacement level. But how would you go about determining it?

Also, maybe some guys who are really “replacement” level in a given year, like Hernandez, get lucky and do better than they should have. So they get left out of any calculation of the replacement level. But if they had done worse (performed at their “true” level), they might drop into the lowest 1% or lowest 5% or whatever. They would displace someone and they would have to be worse than that guy to get into the lowest 1%. So the replacement level would fall.

The reverse might be true. A guy who was in the bottom 1% might have really been better (but he was there due to bad luck). If he had performed at his “true” level some other guy would have to be pushed down into the lowest 1%. But since this new entrant into the bottom had been better than those guys before, and he joins them, he raises their level.

So we have one move raising the level and another lowering the level. I don’t know which change will dominate and I don’t think I know how to figure it out. Maybe if we keep looking at lots of years all of this good and bad luck will even out. I also wonder if the bottom 10% already have enough IP to have the good and bad luck cancel out (it was 2,000 IP in the 2000 AL).


#3    David Smyth    2006/08/23 (Wed) @ 16:45

I’m not an expert, either, but I think Tango is saying that you should have regressed everyone’s performance towards the mean. Do that, and the resulting true talent repl level should be closer to .500 than the observed repl level.


#4    Patriot    2006/08/23 (Wed) @ 18:47

And while you will still have selective sampling issues, it would be better to use playing time (say BFP or IP or appearances) as the definition of who is replacement level, not performance.  Again, not that you would be home free, because pitchers who perform poorly in their first few outings are less likely to get more outings, and could also be tagged as “replacements” when in fact their true talent is higher than they have demonstrated.


#5    MGL    2006/08/23 (Wed) @ 19:16

Yes, you can NEVER take the worst or best performances and assume any kind of true talent level unless you regress toward the mean properly.  Even then, you are not computing replacement level by any stretch of the imagination.  I am not sure what you are computing other than the true talent level of some subset of pitchers (those who fell into the bottom quintile in performance, which does not really mean anything I think).

Morong has done some great work over the years, but I don’t think this study tells us anything at all.

Now, I work with estimated true talent levels of pitchers every day, and I can tell you from experience that the spread of true talent among pitchers is much smaller than that of batters and that the SD of talent is probably close to 1/2 run per 9 innings, which means that replacement level pitchers are probably close to one run per 9 innings worse than average.  For relievers, it is probably less than that.

Replacement level is a funny thing anyway.  Replacement level to a team that does not know how to evaluate players is a lot different than replacement level for a team that does.

Even if we define it as the average talent level of a certain percentage (20% is fine) of all players, we can easily estimate what that replacement level is by looking at performance prospectively rather than after the fact.

Simply make a list of replacement players (say, cheap FA’s), in this case pitchers, going into a certain year, and then look at their collective performance in that year, not based on a minimum level of playing time.  That is replacement level, or at least the true talent level of those pitchers.  It is really not all that difficult.

The reason why it is important for each team to have their own replacement level is this:  If I am a smart team, and I can find players for 500,000 who are only 15 runs worse than average, the monetary value of any given player is going to be different (less) for me than for a team that can only find cheap players who are 20 runs worse than average.
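
A small sketch of that last point. The 15 and 20 run replacement levels are from the comment above; the runs-per-win conversion, the price of a win, and the example player are assumptions added here for illustration.

```python
RUNS_PER_WIN = 10.0       # assumed rule of thumb
DOLLARS_PER_WIN = 2.0e6   # assumed marginal price of a win; hypothetical

def value_over_replacement(player_runs, repl_runs, dollars_per_win=DOLLARS_PER_WIN):
    """Dollar value of a player's runs above what a near-minimum-salary player would give."""
    wins = (player_runs - repl_runs) / RUNS_PER_WIN
    return wins * dollars_per_win

player = 10.0  # hypothetical player, 10 runs better than average
for label, repl in (("smart team", -15.0), ("other team", -20.0)):
    dollars = value_over_replacement(player, repl)
    print(f"{label}: replacement {repl:+.0f} runs, "
          f"player worth ${dollars / 1e6:.1f}M over a cheap fill-in")
```

Under these assumptions the same player is worth about a million dollars less to the team that can find the better cheap players.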


#6    2006/08/23 (Wed) @ 20:06

If everyone should be regressed to the mean, does that mean the worst 10% would see their value increased? My understanding of what people are saying here is yes. But I guess that also means that the second worst 10% see their value raised and so on. So once everyone is regressed to the mean, would the worst 10% still be the worst 10%, but not quite as bad as before in terms of runs allowed or ERA or whatever measure you are using?

If that’s the case, is there a simple adjustment that could be performed on the worst 10% (or whatever) to account for regressing to the mean? Like just lowering the ERA for that worst 10% by 5% or 10%? When you talk about regressing to the mean, does everyone get adjusted in exactly the same way?


#7    John Beamer    2006/08/23 (Wed) @ 23:26

Cy,

You’d regress different players differently depending on number of at-bats or BFP, or some other measure of playing time. Also you may choose to regress to different means. For instance you may regress Pujols to a mean based on some sort of similarity score with some variable(s), while regressing someone like Neifi Perez to a different mean. As Tango’s link mentioned, it is not necessarily a simple exercise if you do it properly.
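
One common way to write that kind of playing-time-dependent regression is a simple shrinkage toward the group mean. This is a sketch of the general idea, not the method from The Book; the regression constant `K_PA` and the example numbers are hypothetical.

```python
def regress_to_mean(observed, n, pop_mean, k):
    """Shrink an observed rate toward pop_mean; k is the playing time at which we regress 50%."""
    weight = n / (n + k)
    return weight * observed + (1 - weight) * pop_mean

# Hypothetical numbers for illustration only.
GROUP_WOBA = 0.340   # mean of whatever group the player is assigned to
K_PA = 200           # assumed regression constant, not taken from the thread

for pa in (50, 200, 600):
    est = regress_to_mean(0.300, pa, GROUP_WOBA, K_PA)
    print(f"{pa:>3} PA at .300 observed -> estimated true wOBA {est:.3f}")
```

The less playing time a player has, the closer his estimate sits to the group mean, so two players with the same observed rate are not adjusted by the same amount.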


#8    Joe Arthur    2006/08/24 (Thu) @ 02:26

There’s some ambiguity about what is understood by “replacement level.”

One reasonable definition [the original Bill James definition] is the cheaply available talent pool. This appears to be Tango and MGL’s understanding of it. With free agents though, some players with health issues or other risks can be cheap and available in spite of a relatively high upside in talent.

But another reasonable definition is the level of play below which a player gets replaced, regardless of how much or how little money he makes [eg Carl Everett; Russ Ortiz]. Replacement level in this sense is different for teams with pennant ambitions. They have a higher threshold to replace players. In most cases the replacement body is a AAA minor league player, not a major league free agent.

Teams are now routinely giving some playing time to 50 or more players each season. Over the course of a season, there’s a lot of overlap with AAA, and a little with AA.

When thinking about regression to the mean, why wouldn’t you consider the population to be professional ball players instead of MLB players only? As John says, you can customize the individual regressions if you want to include Orlando Hernandez types as part of replacement level. But I think generally the mean (in talent) to which you want to regress is below the threshold of major league play. Most of the players I’d construe as replacement level play much of the season at AAA because they’re not good enough to be significant contributors in MLB.  I’d agree with Patriot that the best after-the-fact method of identifying replacement level players is to use playing time criteria rather than rate of performance.

The study Cy attempts would be more compelling if minor league performance could be integrated into it to better establish the ability levels of these borderline players. The lack (as far as I know) of a publicly available database combining both major and minor league evidence of performance makes it more complicated to study “replacement level” and “spread of talent” type questions.


#9    2006/08/24 (Thu) @ 07:23

John

What you’re suggesting seems to be how it’s explained in “The Book.” But I wonder if what group you assign a player to (and therefore what mean you regress them to) might be a little arbitrary. If a pitcher has not had too many IP in his career, doesn’t that make it harder to decide what group he is in? The pitchers with low IP are more likely to be replacement type anyway, I think.

Joe

If you bring minor leaguers into this, can they be rated using major league equivalents or MLEs? That is, projections of what they would do in the majors based on their minor league stats? But that is probably more than I want to do.


#10    tangotiger    2006/08/24 (Thu) @ 07:33

It really is a simple exercise, and you can read the two blog entries I have on “Spread in Talent”.  You can estimate fairly well what the true distribution of talent is for pitchers and hitters.  Once you have that distribution, it’s just a matter of drawing a line somewhere. 

For example, if we use some sort of component ERA (or wOBA), adjusted for starter/reliever role, you would get a mean of say wOBA of .340, with 1 SD equal to say .020.  (All depends on the number of players you put in your pool.) Once that is established, you draw a line somewhere.

You do the same with hitters, and maybe with them you get a wOBA of .340, with 1 SD equal to .030 (with either a positional adjustment, or you have to do off+def, where the avg def for a SS is about 15 runs higher than the avg def for a 1B).  And again, you draw a line somewhere.  That somewhere is exactly at the same # of SD from the mean as for pitchers.

Where to draw that line?  You can make it so that you have 30x20 players to the right of the line.  Or, 30x25 players to the right of the line.

And of course, you’d want to include minor league players as well.
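
A rough sketch of what "drawing a line" could look like, assuming talent in the pool is approximately normal. The pool size of 900 is purely hypothetical (the result depends heavily on how many players you put in the pool), and the .340 mean and .030 SD are the illustrative hitter figures from above.

```python
from statistics import NormalDist

def cutoff(pool_size, roster_spots, mean, sd):
    """Return (z, talent) at the line that leaves roster_spots players to its right."""
    frac_below = 1 - roster_spots / pool_size
    z = NormalDist().inv_cdf(frac_below)   # standard-normal quantile for the share below the line
    return z, mean + z * sd

POOL = 900  # assumed pool size; hypothetical
for spots in (30 * 20, 30 * 25):
    z, woba = cutoff(POOL, spots, mean=0.340, sd=0.030)
    print(f"{spots} players right of the line -> z = {z:+.2f}, wOBA = {woba:.3f}")
```

The same z would then be applied to the pitcher distribution, which is what puts hitters and pitchers at the same number of SD from the mean.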

I’d recommend this link:
http://www.tangotiger.net/talent.html


#11    Tangotiger    2006/08/24 (Thu) @ 11:26

In the original “Spread in Talent” blog entry, I ended up by saying that 1 SD of nonpitching talent was .036 of wOBA, and 1 SD of pitching talent was .026 of wOBA.

Remember also that .036 of wOBA is about .031 runs per PA and .026 is .023 runs per PA.

If we give the 162 GP hitter 700 PA, that means 1 SD is 22 runs, or 2 wins.  That is, if we’ve decided that the appropriate place is to set the replacement level at -1 SD from the mean, then we are saying we are setting the level at -2 wins.

To make that pitching-equivalent, .023 runs per PA for a pitcher with around 39 PA per 9 innings is .90 runs per 9 innings (or .81 ERA).  That would be the equivalent level to set for replacement level.  And -.90 runs per 9 innings is about -.08 wins per game, which puts the replacement level for a pitcher at .420.

If you want to make the non-pitcher replacement level -2.5 wins instead, then the pitching equivalent would be an ERA 1.00 worse than average, or .10 wins per game below average, or a replacement level of .400 for a pitcher.

(-2 wins for a nonpitcher, by the way, means it’s -18 wins for the whole nonpitching team for 162 games, or .390 win percentage)

I feel the most comfortable saying that the pitching replacement level is .420, the nonpitching replacement level is .390, making the team replacement level .316.

Want to knock it down to .410 and .380, for a team replacement level of .300?  Fine by me.  That’s what we should be using.
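
The arithmetic in this comment can be retraced in a few lines. Two things here are assumptions, not statements from the comment: the wOBA-to-runs divisor of 1.15 (chosen because it reproduces the .031 and .023 figures), and the odds-ratio formula for combining the pitching and nonpitching levels (the comment does not say how they were combined, but this formula gives .316 and roughly .300).

```python
WOBA_SCALE = 1.15   # assumed wOBA-to-runs divisor

def runs_per_pa(woba_sd):
    """Convert a spread in wOBA into runs per plate appearance."""
    return woba_sd / WOBA_SCALE

hitter_runs = runs_per_pa(0.036) * 700   # full-season hitter, 700 PA
pitcher_r9 = runs_per_pa(0.026) * 39     # about 39 PA per 9 innings

def combine(p, q):
    """Odds-ratio combination of two win percentages, each stated against a .500 baseline."""
    return p * q / (p * q + (1 - p) * (1 - q))

print(f"hitter 1 SD:  {hitter_runs:.1f} runs per season")
print(f"pitcher 1 SD: {pitcher_r9:.2f} runs per 9 innings")
print(f"team level from .420 and .390: {combine(0.420, 0.390):.3f}")
print(f"team level from .410 and .380: {combine(0.410, 0.380):.3f}")
```

The unrounded pitcher figure comes out slightly under .90 because the comment rounds to .023 runs per PA before multiplying by 39.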


#12    MGL    2006/08/24 (Thu) @ 13:14

The .75 to 1 run per 9 innings worse than average for pitchers jibes with my “experience” working with pitcher projections (there are a ton of no-name pitchers both in the majors and minors who have a true talent less than 1 run per 9 worse than average).

Most people don’t believe that a replacement pitcher can be less than 1 run worse than an average pitcher because they overestimate the spread of talent among pitchers and because they see large spreads of sample performance all the time (ERA’s less than 1 among a few closers and ERA’s of 7 and 8 among some starters and relievers).

The fact that most people think that a replacement pitcher is so bad is one reason why pitchers are paid so much money relative to their true value above replacement.

20 runs (per season) below average on offense also jibes with my experience that there are a ton of -17 per 150 players available on the cheap (which is around what I use as replacement level, for Slwts at least).  At the same time, there are many near-replacement offensive players being paid a lot of money, mostly because they are “proven veterans” or something like that.


#13    Tangotiger    2006/08/24 (Thu) @ 13:33

Actually, the 22 runs per 162 GP is off+def.  If it was off only, it’d be around 18 per 162.  So, I think we are both in sync here.


#14    tangotiger    2006/08/24 (Thu) @ 14:55

I’ve mentioned this on other occasions, so I’ll say it again.  You don’t even need to know the replacement level for salary purposes.  If you can calculate the marginal $/win in other ways, that’s all you need.

For example, the average payroll in 2006 is 77.6 million$ per team.  About 40% of that should go to the pitchers, or 31 million$.  There are 162 x 9 innings pitched.  So, for every 162 innings pitched, the average pitcher on the average team should get 3.4 million$, or about +3 million above minimum (replacement). 

How much is a marginal win worth?  Let’s say it’s 2 million$.  There are many ways to calculate this, and they come back with a number between 1 and 3.5.

Anyway, if a marginal $/win is 2 million$, and if the average pitcher is getting paid 3 million $ above replacement, then the average pitcher is expected to contribute +1.5 wins above replacement (over those 162 IP).

1.5/162*9 = +.083 wins per 9 innings, which converts to +.83 ERA.  This means the average pitcher has an ERA .83 better than replacement.

Or, the replacement level is .83 below the average.

See?  Without much effort, we figure out the replacement level… and, we don’t even need it!
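
The same calculation as a short sketch, so the sensitivity to the $/win figure is visible. The payroll, the 40% share, the +3 million above minimum, and the 1 to 3.5 million range are from the comment; the 10 runs per win conversion is an assumption (it is what turns .083 wins into .83 runs).

```python
AVG_PAYROLL = 77.6e6     # 2006 average team payroll, from the comment
PITCHER_SHARE = 0.40     # share of payroll going to pitchers, from the comment
RUNS_PER_WIN = 10.0      # assumed conversion between wins and runs

# A team throws about 162 x 9 innings, so pay per 162 IP is the pitching budget / 9.
per_162_ip = AVG_PAYROLL * PITCHER_SHARE / 9

def era_gap(dollars_per_win, above_min=3.0e6):
    """Runs per 9 innings between the average pitcher and replacement level."""
    wins_per_162_ip = above_min / dollars_per_win
    wins_per_9 = wins_per_162_ip / 162 * 9
    return wins_per_9 * RUNS_PER_WIN

print(f"pay per 162 IP: ${per_162_ip / 1e6:.1f}M")
for dpw in (1.0e6, 2.0e6, 3.5e6):
    print(f"${dpw / 1e6:.1f}M per win -> replacement is "
          f"{era_gap(dpw):.2f} runs per 9 below average")
```

At 2 million per win this reproduces the .83 figure; at the ends of the 1 to 3.5 range the implied gap moves quite a bit, which is why the $/win estimate matters more than the replacement level itself.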


#15    MGL    2006/08/24 (Thu) @ 17:14

If I own a team, I need to know the level of the best players I can get for free (.5 mil), which is going to be a level a lot higher than the “average” replacement level, since I can identify plenty of decent players who are not considered very good by most teams.


#16    Mike    2006/08/24 (Thu) @ 18:28

Yeah, but if we assume that the average replacement level is 17 to 20 runs below average (-1.6 wins or so), what would you approximate the replacement level is for “free” players? 20 to 23 runs below average, and then the rest of the guys would be in the 14 to 17 range?  It may be in the best interest for teams to separate replacement level into these two categories, but then what replacement level would you use when determining their worth in dollars?  While it may be semantics, someone like Beltran will be worth more/less depending on the replacement level used.  That’s why it may be best to just use the average replacement level (usually 17 to 18 runs below average, although that changes depending on the talent level in baseball each year) when determining each player’s dollar value.  It won’t make that much of a difference, though.


#17    tangotiger    2006/08/24 (Thu) @ 21:09

Like I said, it is completely and totally irrelevant what replacement level is, if you already know what the marginal $/win rate is.  All you need is that, the average, the expected playing time, and that’s it.

