
THE BOOK--Playing The Percentages In Baseball


Tuesday, June 17, 2008

Even more about replacement level

By Tangotiger, 01:02 PM

Referencing this article on VORP, some BTF readers make some curious statements, while JC makes some puzzling ones.

I made this post at both blogs:


Wins Above Replacement
x Dollars Per Win
x Years of Service Adjustment
equals Salary

Wins Above Average times Dollars Per Win
+
Constant Dollars Per Playing Time Unit times Playing Time Unit
, all times Years of Service Adjustment
equals Salary

It’s all about where to put that “constant”. But, it works out to the exact same thing.

(Note: in both cases, I forgot to add 390,000$.)
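A minimal Python sketch of why the two forms give the same salary. It assumes, which the post does not spell out, that replacement level is a fixed number of wins below average per playing-time unit, and that the 390,000$ is added after the service adjustment. The dollar rate, the replacement gap, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical inputs, not taken from the post.
DOLLARS_PER_WIN = 4_400_000    # assumed market rate per win
REPL_WINS_PER_PA = 2.0 / 650   # assumed: replacement is 2 wins below average per 650 PA
MIN_SALARY = 390_000           # the 390,000$ mentioned in the note


def salary_from_war(waa, pa, service_adj):
    """Form 1: wins above replacement x dollars per win x service adjustment."""
    war = waa + REPL_WINS_PER_PA * pa
    return war * DOLLARS_PER_WIN * service_adj + MIN_SALARY


def salary_from_waa(waa, pa, service_adj):
    """Form 2: wins above average, plus a constant number of dollars per PA."""
    dollars_per_pa = DOLLARS_PER_WIN * REPL_WINS_PER_PA  # where the "constant" ends up
    return (waa * DOLLARS_PER_WIN + dollars_per_pa * pa) * service_adj + MIN_SALARY


a = salary_from_war(waa=1.5, pa=600, service_adj=0.6)
b = salary_from_waa(waa=1.5, pa=600, service_adj=0.6)
print(round(a), round(b))
```

The only difference between the two functions is whether the replacement gap is folded into the win total or carried as dollars per playing-time unit.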

It was ignored by the BTF readers, and did not even get past the posting filter at JC’s site.  So, I’ll just make my comments here.

JC said this:

Are catchers really more valuable than equally-talented batters who play left field because of scarcity? There are plenty of catchers in the minor leagues and major-league teams often carry three catchers. Teams don’t normally carry nine outfielders, do they?

Yes, they are really more valuable because of scarcity.  This is the core to the replacement discussion.  Catchers are selected from a limited pool of players.  Infielders (2B, SS, 3B) are selected from that pool, plus the infielders pool.  And so on.  It looks like this:
Pool A: Catchers
Pool B: Pool A + Infielders
Pool C: Pool B + Outfielders
Pool D: Pool C + Firstbasemen
Pool E: Pool D + Nonfielders

Pitchers represent their own pool (at the MLB level).

If you don’t accept the scarcity argument, clearly you won’t accept replacement level.

I suspect that catchers are more expensive because they offer a greater defensive contribution. After all, they are the only other player besides the pitcher involved in every pitch.

Here again, we are nowhere close to the same page.  Or the same chapter.  It is irrelevant that they are involved in every pitch.  A batting tee is involved in every pitch too, but it doesn’t offer any value compared to any other batting tee.  Catchers are more expensive because they offer a greater defensive contribution that very few people can provide, while still being able to hit a ball.

Imagine that you had a DH for catchers as well.  What do you think would happen?  Catcher value would plummet.  Now, you’ve opened up the pool of players far wider, since one of the requirements (being able to hit a little bit) no longer applies.  Even though his fielding responsibility is still the same.  It’s all due to positional scarcity.

Player value is determined by opportunity cost as determined by marginal revenue product (MRP). If a player generates many millions of dollars, his value is determined by this, not by how much he makes.

Last time I took Economics was in 1989 or so, so forgive me if I’m pedestrian in my analysis.  As I understand the MRP model, you would keep adding workers to your workforce, and if he can generate revenue, he’s got MRP.  Ideally, you’d pay him no higher than his MRP.

Now, if a baseball roster was 200 players per team, then the MRP of the 25th best player on the roster will surely be higher than 390,000$.  But, this is not allowed in baseball.  You are limited to 25 players for five-sixths of the season. (28 or so really, if you include DL players).

So, even if the 25th player has a positive MRP on a 25-man roster (however you would calculate that in a zero-sum environment), it’s irrelevant.  Because he is only allowed to create value by displacing the 26th best player.

So, the MRP of each player on the roster is relative to the 26th player.  This is why we talk about wins above replacement (WAR).

The entire process to calculate WAR was presented on my blog a few months ago.  And the passionate guys over at USSM did a bang-up job at going through their roster via a replacement-lens.  I urge you to read both posts before commenting.  Or in lieu of commenting.  Just read them!

Like I said in my bypassed quote at the top, you don’t need to use the concept of replacement level.  However, you will be calculating it indirectly anyway, since you will be introducing a constant to turn wins above average into dollars (and dollars is really dollars above zero, where zero is how much they’re paying the best guy not on their roster).

So, someway, somehow, you are using the replacement-level concept, if you are converting wins to dollars.  If you are not converting to dollars, you do not need to worry about replacement-level so much, but you still need to worry about positional scarcity.

***

Notes: Bill James introduced (to me) the concept of replacement level, when he discussed Jim Rice v Ron Guidry and Clemens/Mattingly.  The positional scarcity was also introduced as a concept by Bill with his Defense Spectrum.  Marginal Dollars to Marginal Wins is my own concept, but has been done by many others, before and after me.

#1    Colin Wyers      (see all posts) 2008/06/17 (Tue) @ 13:53

I gave up arguing with JC. After I wrote a short essay on why positional averages weren’t useful comparisons, he said, “I can compare the mean at positions,” and it went downhill from there. It was almost as if he wasn’t reading anything I said - someone give me a sanity check here, but I really felt his argument boiled down to, “Well, I can make all the adjustments you’re talking about without CALLING it replacement level.”

Which is true - call it freely available talent, or league minimum, or whatever. But to argue that it’s just hard to explain - and then insist that we use marginal revenue product, which I know comes up at the dinner table all the time in my house! - is… well, I don’t know what it is. That’s why I gave up.


#2          (see all posts) 2008/06/17 (Tue) @ 16:14

Colin, I’ve had similar experiences; it’s not even on my ‘favorites’ list anymore.  I’ll usually stop by and check out a discussion if Pinto or Tango links to him.

Tango, I think Catchers are, like pitchers, a separate “pool”.  How many C could move to the infield?  Inge, sure, and maybe Mauer or something… but I feel like Varitek, Posada, any of the Molinas, etc, would be a disaster.  What benefit do we gain in assigning them a place on a spectrum?

Also, on a side note, I SWEAR I got the exact same ‘captcha’ phrase yesterday.


#3    Rally      (see all posts) 2008/06/17 (Tue) @ 16:26

I’d put Biggio in that pool.

But I agree with Mike, most catchers are a separate pool and could not handle an infield position.  Probably 1/4 or 1/3 could handle 3B, but few have the speed to play 2nd or short.


#4    Tangotiger      (see all posts) 2008/06/17 (Tue) @ 16:52

You could have a pool of 1A and 1B for Catchers and infielders, with both then being part of the pool for outfielders.

I would think, however, that all catchers could be considered as part of the 3B pool.  Regardless, the main point that there is a scarcity for the catcher pool stands.


#5    Guy      (see all posts) 2008/06/17 (Tue) @ 17:02

I assume that the scarcity of the defensive skill set required at catcher is the primary cause of the low offensive production there (even though it has proven hard to measure those skills).  But I have to think there’s an additional factor at work, which is that the very best hitters get steered away from the position at a young age, even if they could play it.  The prospect at catcher of reduced playing time, injury risk, and (perhaps) an offensive “penalty” from playing the position gives great hitters (and their coaches) an incentive to play 1B/3B/CR.  Conversely, if a marginal offensive talent has the ability to catch, it absolutely makes sense for him to hone those skills, as it’s his best chance of making a career in the game.


#6    tangotiger      (see all posts) 2008/06/17 (Tue) @ 21:07

Agreed on all counts.


#7    Me      (see all posts) 2008/06/18 (Wed) @ 10:07

How many C could move to the infield?  Inge, sure, and maybe Mauer or something… but I feel like Varitek, Posada, any of the Molinas, etc, would be a disaster.  What benefit do we gain in assigning them a place on a spectrum?

It doesn’t invalidate your point, but Posada played 2B in the minors.  He was converted to catcher.

Also, in terms of TangoTiger’s pools, I agree that they should be more complex than what is written there (separate corner OF’ers from CF’ers, etc.) but that doesn’t change his argument about scarcity.


#8    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 10:16

Is it just me, or is Phil the most thorough and plain-speaking non-academician number-cruncher in the blogosphere:
http://sabermetricresearch.blogspot.com/2008/06/replacement-players-vorp-salaries-and.html

He lays it all out for you, constructs great arguments, time and again, that both sides would bring up, and deconstructs them as plain as day.


#9    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 10:27

If you have Adam Dunn, Manny Ramirez, and Raul Ibanez in your outfield, who do you put in CF?  They are all candidates, however sucky they may all be.  If you can catch a flyball, and run faster than your grandmother, then you are a candidate for the OF.  Your overall bat+fielding may not allow you to be a “good” candidate, but you are in the pool, barely treading water.  As long as you don’t drown, you are a candidate.

Frank Thomas as a SS is drowning.  Troy Tulowitzki as a catcher is not drowning.  Ichiro as a catcher or shortstop is not drowning.

(I’m not suggesting they get in there cold.  But, one month in spring training should allow Ichiro to play the IF, to at least tread water.)

***

Regardless of the likely nuances of the argument, the main point stands regarding scarcity, as we all seem to agree.  Feel free to create the argument in a clearer and simpler fashion than I have.


#10    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 10:55

Reading more of JC’s posts, there are two issues to deal with, when discussing VORP (or the basic concept):
1. replacement level v average
2. positional scarcity

You can in fact, as my bypassed quote shows, make the discussion based on average and not replacement level, and be fine.  There’s an indirect way to bring in replacement level without talking about replacement level.  So, we can all have the discussion here and come to the agreement.

But, positional scarcity is something that cannot be ignored.  There is an enormous difference between the comparison level for catchers and LF, and that has to do with positional scarcity.

Or, more specifically, as my positional adjustments show, it’s about comparing each player at each position to what Willie Bloomquist would do at that position.  Willie as a 2B or 3B?  He’s average (fielding-wise).  As a SS or CF?  He’d be 5 or 6 runs below the average player at that position with the glove.  As a C?  10-15 runs worse (given acclimation time).  As a corner OF?  5 or 6 runs better.  As a 1B?  10 or 12 runs better than the average currently at that position.

Then you bring in his offense.  And what you end up with is that, overall (off+def), Willie Bloomquist would be, pretty much, the worst player at each position, if you’ve got some 420 nonpitchers in MLB.

The two concepts, scarcity and replacement, are part of the talent distribution concept.  And so, if you are going to talk about scarcity, you will probably end up talking about replacement level anyway.

Here’s a useful chart to remember:
http://tangotiger.net/talent.html


#11    Bjorn      (see all posts) 2008/06/18 (Wed) @ 11:20

I think one reason to treat catchers (just as pitchers) as separate from the defensive spectrum rather than at one end of it is that in theory the worst possible defensive contribution at these two positions is worse than the maximum possible offensive production.

I personally do not think this applies to any of the other seven positions. No matter how lousy you are in the field at, say, shortstop, there is in theory some level of offensive production that makes the net contribution positive.

Sure, the optimal use is going to be to move the player, but if he insists that it is either SS or nothing, there is a level of hitting that makes it worthwhile no matter how bad he is in the field.

If the player is instead a catcher, even OPSing 5.000 might not cut it if his fielding is sufficiently bad. It would have to be pretty terrible, but he still has to catch or at least block a reasonable portion of pitches and be able to throw at least 60 feet.


#12    Nashboy      (see all posts) 2008/06/18 (Wed) @ 11:48

Another factor driving catcher scarcity is pure fear. Ask any little league coach about this and they’d tell you very few kids like to catch. Fear of the bat, fear of the foul ball, and worries about pitcher velocity all make this a hard position to find volunteers for (oddly, in Canada this is less of a problem, witness the minor league depth of Canadian-raised catchers, as many youngsters here find the position akin to being a hockey goaltender).

At higher levels, throw in the need to have a fairly strong arm and be bright enough to call pitches, and it’s no wonder there is a positional scarcity at catcher.


#13    Sky      (see all posts) 2008/06/18 (Wed) @ 11:53

Two non-major things…

One, there’s a continuum between “sucky”, “treading water”, “drowning” and however else we label fielders.  You *could* put Frank Thomas at shortstop.  The cutoff between could and should depends on individual offense, the offensive talents of the pool at each position, and the individual’s relative fielding talent at each position.

Two, there *is* something valid to the claim that catchers are important because they touch the ball every pitch.  In judging fielding, there’s a quantity piece and a quality piece.  The quality piece is the difference between a good play and a bad play.  In the outfield, that’s almost a full run (difference between an XBH and an out).  But at catcher, it’s much smaller, because allowing an extra base isn’t THAT big of a deal.  But because catchers have so many chances to allow/prevent passed balls, throw out baserunners, etc., preventing all those little things adds up.

(It’s like the difference between CF and a corner spot.  Both positions take similar skills, but the CF gets more opportunities to make plays.)


#14    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 12:27

Bjorn:

If the player is instead a catcher even OPSing 5.000 might not cut it if his fielding is sufficiently bad.

If you have Bernie Williams or Juan Pierre as a catcher, say they turn everyone into Tim Raines on the basepaths, allowing steals at every base.  Say in a game there are 20 potential bases to be stolen.  A Pierre catcher will allow all those bases to be stolen, which is worth +3.4 runs or so per game.  At 162 games, that’s 500 runs, just for baserunning (steals, WP, PB, etc).  Calling the game, framing pitches, etc, all that could be worth another 100 runs.  So, he’s costing 600 runs per season.  He’d have to be worth +1 run per PA on offense to make up for it.  Needless to say, that’s basically impossible.

What about Frank Thomas at SS?  He gets 3 balls hit to him that other SS make an out, and he never gets to any, every game.  That’s 3*.8 = 2.4 runs per game he has given the other team.  That’s nearly 400 runs.  Not as bad as catcher, but pretty bad.  Maybe he makes one “easy” out, so he costs his team 1.6 runs per game, or say 260 runs per season.  Bonds, at his best, was about half that as a batter.

So, going to Sky/13, this is also a case of drowning.
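The per-game figures above scale to a season as in this rough Python sketch. The per-game inputs are the ones given in the comment; the constant and variable names are made up for illustration.

```python
GAMES = 162

# Catcher who cannot stop the running game: about 3.4 runs per game (from the comment).
catcher_runs_per_game = 3.4
catcher_baserunning = catcher_runs_per_game * GAMES

# Shortstop who misses 3 makeable plays a game, each worth about 0.8 runs.
RUNS_PER_MISSED_PLAY = 0.8
ss_no_outs = 3 * RUNS_PER_MISSED_PLAY * GAMES   # never makes a play
ss_one_out = 2 * RUNS_PER_MISSED_PLAY * GAMES   # makes one "easy" out per game

print(round(catcher_baserunning))  # about 551, which the comment rounds to 500
print(round(ss_no_outs))           # about 389, "nearly 400"
print(round(ss_one_out))           # about 259, "say 260"
```

Either way the fielding cost is in the hundreds of runs, far more than any bat can give back.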


#15    Colin Wyers      (see all posts) 2008/06/18 (Wed) @ 14:01

I’m still working on analyzing this and doing a writeup, but in the meantime, it wouldn’t hurt to get some feedback from you guys. Average hitting at position versus replacement hitting at position:

http://www.editgrid.com/user/cwyers/replacement_batting_by_position_85-07

Since JC really seems to like OPS+, I included that. Since I really like wOBA, I included that.

Replacement players were defined as:

* Rule 5 draft picks
* Players claimed off waivers
* Free agents making the league minimum.

Data was gathered from the Retrosheet transactions log and Sean Forman’s Baseball DataBank salary information. A full list of “replacement level players” used is in the third tab of the spreadsheet.


#16    David Gassko      (see all posts) 2008/06/18 (Wed) @ 14:11

Great job, Colin. I’ve been sitting on similar research for a long time, and my results are very similar to yours. What gets interesting is when you look at pitchers in the same way: even if you isolate just starters, the numbers don’t make any sense.


#17    David Gassko      (see all posts) 2008/06/18 (Wed) @ 14:19

You seem to have the same problem as I did, though, which is that center fielder replacement level ends up way too high. For the record, here are the numbers I came up with versus what you have (all numbers per 640 PA, mine come first):

C: -26, -29
1B: -7, -3
2B: -16, -19
3B: -18, +11
SS: -26, -28
LF: -15, -4
CF: -8, -2
RF: -19, -4
DH: -5, -3

You have big problems with outfielders and third basemen, but at most positions our numbers match up well.


#18    David Gassko      (see all posts) 2008/06/18 (Wed) @ 14:23

Actually, I take back (partially) what I said about pitchers. Looking at my results, I had replacement level starters at around -.6 runs per game, which seems about .4 runs too high (i.e. it should be around -1), but not absurd.


#19    Rally      (see all posts) 2008/06/18 (Wed) @ 14:27

Tango, the worst possible catcher is worse than that.  Where he really kills his team is that he can’t catch a third strike, or make the throw to first to retire the batter.

How bad is he?  Pretty much unlimited, he’ll cost as many runs as a pitcher who cannot throw a strike.


#20    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 14:37

Rally: I’m talking about the pool of players being the 5,000 nonpitchers in pro ball.  Other than Gary Sheffield willfully not catching a ball, I don’t think the “never catch a ball” will apply, any more than it would at first base or any other position on the field.

***

Colin: the determination of position is critical.  A guy who is a CF at age 23 who no longer has the legs at age 28 will check in as a LF or RF in your pool, right?  This is the problem with baseball positions: they are fluid, but you are using a rigid system.

A fairer thing to do is to determine the guy’s position as of age 25 (or whatever).  These guys, who may now be corner OF, would become the true replacement level for CF.

In any case, we know that the positional requirement is roughly 10 runs between CF and the corner OF.  Any system devised that doesn’t adhere to at least this most basic condition is a system that definitely requires a change.


#21    Colin Wyers      (see all posts) 2008/06/18 (Wed) @ 15:01

Tango - I ran the numbers per plate appearance; everything was filed under whatever was in the BAT_FLD_CD. I’ll post the code in a bit.


#22    MGL      (see all posts) 2008/06/18 (Wed) @ 15:16

David, can you print your league averages for each position for you and for Colin?

Tango, does that matter (players changing positions) if they are using the primary positions they played in any given year and their batting performance for that year?  I wouldn’t think so.


#23    MGL      (see all posts) 2008/06/18 (Wed) @ 15:23

David, why do you think that replacement pitchers (starters, at least) are so low in RA?  I have always used a low (good) number for them, and I think it is because teams are so bad at evaluating pitchers, at least they used to be.  The worse they are at evaluating players, the more each group of players, at any salary class, will clump around average, performance-wise.  So if we at least assume that teams are a lot worse at evaluating pitchers than hitters (which we, as forecasters are too), then it is expected that replacement pitchers will not be that far from the average pitcher.  No?


#24    Colin Wyers      (see all posts) 2008/06/18 (Wed) @ 15:24

MGL, I have league averages for all positions in the second tab of the spreadsheet; I call it a spreadsheet but it’s an EditGrid and should open in any browser. Hope that helps.

And I’m not using primary position for season - it’s a bit more granular than that. Say Willie Bloomquist gets four PAs in a game - he starts at left field, is moved to right, then to center, then finally ends up at second. He bats in between each move. Each PA would be added to the “bin” of the position he was playing when he batted. I’m not convinced this is the correct way to do it, but I’m not using defensive position in 1985 to figure out what position a player plays in 1994.
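A minimal Python sketch of the per-PA binning described here. The record layout, the sample values, and the function name are hypothetical; in the real data the position would come from the BAT_FLD_CD field mentioned above, and this is not the code actually used.

```python
from collections import defaultdict

# Hypothetical per-PA records: (player, fielding position when batting, value of the PA).
plate_appearances = [
    ("Bloomquist", "LF", 0.0),
    ("Bloomquist", "RF", 0.9),
    ("Bloomquist", "CF", 0.0),
    ("Bloomquist", "2B", 0.7),
]


def bin_by_position(pas):
    """Group PAs by the position played at the time of the PA.

    Returns {position: (PA count, average value per PA)}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # position -> [count, summed value]
    for _player, position, value in pas:
        totals[position][0] += 1
        totals[position][1] += value
    return {pos: (n, total / n) for pos, (n, total) in totals.items()}


bins = bin_by_position(plate_appearances)
print(bins)
```

One game from one player lands in four different bins, which is why this is more granular than a primary position per season.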


#25    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 15:34

Right.  What I mean to say is that if you are looking at a 29-yr old cheap free agent rightfielder, and if he was a CF as a 24-yr old, he, probably, should be part of the replacement pool of centerfielders, not RF.  A guy not good enough to be a CF is dropped from the pool of CF, leaving you with pretty good “replacements”.

Furthermore, even if the replacement players at CF, and the corner OF are all the same level of hitters, they won’t necessarily be the same level of fielders.  As it is, using your process, the replacement OF are all .321 to .325 wOBA hitters.  But, those may be a bunch of great fielding corner OF, and some poor fielding centerfielders.


#26    Colin Wyers      (see all posts) 2008/06/18 (Wed) @ 15:53

What really worries me is the third base issue - I can rationalize a higher-than-expected CF replacement level, with the assumption that teams are trading defense for offense - at least it’s worth considering. I can’t, however, even come close to explaining why a replacement third baseman would outhit a regular third baseman - it’s not feasible.

The only thing I can think is that a fence is down somewhere and a Tyrannosaur is in the hadrosaur paddock; a real third baseman has snuck into my replacements somewhere. There are twice as many PAs for third as any other position, as well.


#27    Tangotiger      (see all posts) 2008/06/18 (Wed) @ 16:13

You’ve got 2119 PA.  1 SD = 11 points of wOBA

Also, the number of PA makes a big difference.  Look how few PA you have from 1B.  This is what I was talking about regarding selection bias: no one even stays in the 1B replacement pool long enough to be considered.
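The 11-point figure can be roughly reproduced with the sketch below. It assumes a per-PA standard deviation of about 0.5 in wOBA units, a value picked here because it matches the number quoted; the comment does not say how the figure was derived.

```python
import math

PER_PA_SD = 0.5  # assumed spread of a single PA's wOBA value


def woba_sd(pa, per_pa_sd=PER_PA_SD):
    """Random standard deviation of an observed wOBA over `pa` plate appearances."""
    return per_pa_sd / math.sqrt(pa)


print(round(woba_sd(2119) * 1000))  # in points of wOBA: 11
```

Because the spread shrinks only with the square root of PA, a position with a few hundred PA in the pool is several times noisier than one with a couple of thousand.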


#28    Guy      (see all posts) 2008/06/18 (Wed) @ 16:33

David:
I don’t know how you weighted your sample of replacement pitchers, but could you perhaps be seeing a selection bias?  Teams experiment with a lot of marginal pitchers, and those who succeed get to keep pitching while those who don’t, don’t.  This happens considerably less with position players.  If you gave every replacement starter a weight of 1, regardless of IP, I’d guess the performance would be much worse.  (Not suggesting that’s the right method, just that if you picked one of these guys at random to pitch for your team, he’d likely be worse than -0.6).


#29    David Gassko      (see all posts) 2008/06/18 (Wed) @ 17:08

MGL/22: I actually don’t have the league averages for my positions. For Colin, they are:

C: -8
1B: +15
2B: -2
3B: +4
SS: -8
LF: +11
CF: +4
RF: +11
DH: +9

MGL/23: That’s just my gut feel, and I think Tango has maybe done some research to indicate the difference is closer to 1 run. Nonetheless, you might be right—this is just something that I need to research more.

Guy/28: The median is a .9 run difference, which sounds much closer to what I would expect. The problem is that if you look at the median results for hitters, they are way too low, so I don’t think the median is actually the way to go.
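Putting Colin's two sets of numbers side by side gives the average-to-replacement gap at each position. This is only a sketch: it assumes the league averages in this comment are on the same scale as Colin's replacement figures listed in #17 (runs per 640 PA), which is not stated, and the variable names are made up.

```python
# Colin's replacement-level batting by position (second number in each pair in #17).
colin_replacement = {"C": -29, "1B": -3, "2B": -19, "3B": 11, "SS": -28,
                     "LF": -4, "CF": -2, "RF": -4, "DH": -3}

# Colin's league-average batting by position (listed in this comment).
colin_average = {"C": -8, "1B": 15, "2B": -2, "3B": 4, "SS": -8,
                 "LF": 11, "CF": 4, "RF": 11, "DH": 9}

# Gap = how many runs better the average hitter at a position is than the replacement.
gap = {pos: colin_average[pos] - colin_replacement[pos] for pos in colin_average}

for pos, runs in gap.items():
    print(f"{pos}: {runs:+d}")
```

Under that assumption most positions come out 15 to 21 runs apart, while 3B (negative) and CF (single digits) are the ones that look off.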


#30    MGL      (see all posts) 2008/06/18 (Wed) @ 18:14

Tango (and others), if we want to find out the hitting level of replacement players at any position, we simply need to look at the freely available players at that position at that particular time.  (Probably more than one player, but not necessarily enough to fill out every roster at that position.  How many players you average is an issue, BTW; in Colin’s and David’s case, they are only using major league players, which may or may not be correct.)  That means you use the position that player primarily plays in that year (or do what Colin did, which is to use all the positions they play in that year and put each hitting performance into a separate bucket), and not what they used to play.

If you use what they used to play, then you are getting a higher batting replacement number, but a worse defensive replacement value.

Knowing the offensive replacement level at each position does us no good if we don’t know the defensive value.  So eventually we have to look at defensive replacement numbers as well.

But artificially calling a player a replacement CF’er when he has not played CF for several years, makes no sense to me.  We want to know exactly who we can put in CF if we don’t want to pay more than the major league min for a CF’er.  Now we can assume that a RF or LF who used to play CF is in this pool, but what is the point of using his batting numbers if we don’t know his defense in CF?  We might as well use a minimum salary 3B and assume we can put him in CF also (and of course there has to be a defensive adjustment also).

The only way to know the hitting value of a replacement player at a certain position is to look at players who play that position and who get paid the min salary.  That is by definition.

Now, if it looks like there is a “weird number” at a certain position, there are only 3 possibilities:  One, those replacement players are either well below average in defense (if the replacement hitting is too high) or well above (if the replacement hitting is too low).  Two, the teams are really screwing up how they evaluate and pay players at that position compared to the other positions.  Three, it is just a sample size fluke.

But again, if you start using players who play LF and RF for your CF replacement pool even though they don’t play CF anymore, and 2B for your SS replacement pool, you are creating all kinds of problems for yourself.

Bottom line, if I am a team and I want a CF’er and I don’t want to pay more than the league minimum, and I want to know how much below average his hitting is going to be, sure I can look at all players who are making the min salary, and try to ascertain what their defense will be in CF, but if I want to be more precise and a lot cleaner, I will simply look at all CF’ers who make the min salary and see what their collective batting has been while they are playing CF.  If I do that for a number of years (to get a large sample), I will know exactly what the batting level of a replacement CF’er is.  I still am not sure he is an average defender, so I am still not sure how much he is worth relative to an average CF’er.  But I can do the same thing using UZR or whatever to see the defensive level of the average CF’er making the min salary.  Again, I can’t do that if they don’t play CF any more.  I can look at their RF or LF UZR and then make some inferences or look at their UZR when they used to play CF and apply some aging adjustments, but it is a lot cleaner and more accurate to just look at those players who play CF (young and old) and make the min salary (as FA), or are put into the Rule 5 draft, and compute their average defense (and offense).


#31    MGL      (see all posts) 2008/06/18 (Wed) @ 19:10

If you are going to use corner OF’ers who used to play CF as part of your replacement CF pool, you might as well use every player for every position, and make some kind of defensive adjustment.  That defeats the whole purpose of looking at min salary players at each position of course.

As far as pitchers go (or batters), there is never a selective sampling problem as long as you don’t put a min TBF requirement in your sample.

If you look at the performance of all pitchers who fall into the category that these guys are using to define replacement (FA who make the min salary, etc.), you will get exactly the true talent of a replacement pitcher.  And as long as you classify a reliever by when he pitches in relief only and a starter by when he starts only, you won’t have a starter/reliever selective sampling problem either.

Basically, when you look at all FA pitchers who are being paid the min salary going into a season, you are looking at pitchers who the teams think stink - the worst pitchers in major league baseball (of course that could also include pitchers who are not THAT bad, but are a great injury risk, so if you are looking at replacement RATE, it is going to be a little high, as opposed to total replacement value including PT).  Now, if you take their collective performance in the next year, that should be the talent level of pitchers you can get for the major league min, by definition.  There is no selective sampling there.


#32    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 00:29

I’ve found my Tyrannosaurs. Kevin Seitzer and Gary Gaetti, both in 1993. 364 PAs for Seitzer and 566 for Gaetti - that’s 930 PAs, almost half of my sample for third basemen! Both of them were free agents (after being released) making pocket change and just absolutely destroying pitching. (Tack on Jeff Cirillo, too, and 188 totally dominant PAs).

I can only think of two ways to solve my Tyrannosaur problem - increase my sample of players or put a hard cap on PAs for every player. Or both, I suppose. Thoughts, anyone?


#33    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 00:32

Sorry for the double post, but I should clarify - the whole reason I brought up the number of PAs for 3rd basemen was that I didn’t think that teams needed replacement third basemen twice as often as any other positions; that’s why I figured a full-time starter had snuck in. It wasn’t an assertion of significant sample size.


#34    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 04:52

Okay, here’s a second pass at these numbers. I expanded my data pool to include all players making the league minimum with at least 1950 PAs, as well as all players who were purchased from other teams; I also dropped 1985 and 1986 from the sample. This is runs per 650 PAs:

C -18
1B 1
2B -7
3B -1
SS -24
LF -1
CF -7
RF -3
DH -2

I think I now have what might charitably be described as a problem at catcher. (Or shortstop, depending on your inclination.) Everything else seems to fit pretty well.

The reason I dropped ‘85 and ‘86 is that I only have STATS, Inc. ZR going back to 1987; tomorrow I’ll see how our replacements fared as fielders. As it stands I absolutely have to get to bed.


#35    MGL      (see all posts) 2008/06/19 (Thu) @ 10:54

Colin, why are you using a minimum number of PA’s to be included in the sample?  That is where you are running into selective sampling problems.  Your number for SS is the only one that is close to being correct. The other ones are WAY too high.  A replacement player is around 18-20 runs per 650 PA worse than average.  None of your numbers comes even close to that.  I assume that is because of your use of a min number of PA to include in the samples.  You say “with at least 1950 PA.” Is that total (career) in the time periods you looked at (87 to 07?)?

For something like this, where you are trying to estimate true talent, you definitely don’t want to have ANY min number of PA.  The question you are asking is, “If I (a team) were to pick up a replacement player at X position, how would he hit if I gave him an infinite number of PA?” The answer is to look at ALL players who fit the description of a replacement player at that position, and compute their collective performance, including everyone who got 1 PA or 5000 PA.

Once you start imposing min PA, that collective performance gets higher and higher.  The higher the PA requirement you impose, the higher the performance.  None of those collective performances are true talent unless you include everyone, regardless of their number of PA.  This is true for any class of players, not just replacement players.
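A rough simulation can make this concrete. The sketch below is only an illustration under made-up assumptions (every player is a true -20 per 650 PA, plays in 100-PA chunks with random noise, and is dropped once his running rate falls below -40); none of those numbers come from real data. It pools runs over everyone, then again after imposing different PA minimums.

```python
import random


def simulate(n_players=20000, true_rate=-20.0, chunk_pa=100, max_chunks=6,
             chunk_sd=40.0, cut_below=-40.0, seed=1):
    """Return (total_pa, total_runs) for each simulated player.

    All parameters are hypothetical. Rates are in runs per 650 PA.
    A player stops getting PA once his cumulative rate drops below cut_below.
    """
    rng = random.Random(seed)
    players = []
    for _player in range(n_players):
        pa = 0
        runs = 0.0
        for _chunk in range(max_chunks):
            # Observed rate for this chunk: true talent plus luck.
            observed_rate = rng.gauss(true_rate, chunk_sd)
            pa += chunk_pa
            runs += observed_rate * chunk_pa / 650.0
            if runs / pa * 650.0 < cut_below:
                break  # benched, released, or retired
        players.append((pa, runs))
    return players


def pooled_rate(players, min_pa=0):
    """Collective runs per 650 PA for players with at least min_pa PA."""
    kept = [(pa, runs) for pa, runs in players if pa >= min_pa]
    total_pa = sum(pa for pa, _ in kept)
    total_runs = sum(runs for _, runs in kept)
    return total_runs / total_pa * 650.0


players = simulate()
for min_pa in (0, 200, 400, 600):
    print(min_pa, round(pooled_rate(players, min_pa), 1))
```

With no minimum, the pooled rate should land near the true -20; as the minimum rises, only the players who started well remain and the pooled rate drifts upward.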

Let’s work backwards here. What happens is you have a group of players, say those who are in the category you are talking about, the so-called replacement players, who are a certain true talent level.  Let’s say that is -20 (per 650).  They all come out of the chute, at the beginning of any year, and half of them perform at 0 and half perform at -40 (again, per 650), just by chance alone, for the first half of the season.  We are still on track here, as combined, they are a -20, which is their true talent level.  All the ones who hit -40 are dropped because they were replacement players in the first place (no one thought very highly of them), and now they really are sucking it up.  Those players are either benched for the season, or they are cut from MLB, or retire, or what have you.

So only the ones who performed at 0 are left. Remember, they are still true -20 players (which is the number you are trying to ascertain for this group of players - you don’t know what it is yet).  Now, these guys play the rest of the season and amass another 300 or so PA (the ones that were dropped amassed 300 in the first half of the season).  They will hit -20 for the rest of the season, because they are true -20 players.

So, what do we have at the end of the season?  We have half of our players at -40 with 300 PA each, and half of our players at -10 (0 in the first half and -20 in the second half) with 600 PA.

If we put a min PA requirement of 400 or 500 PA, we will only be looking at the ones who got lucky in the beginning, and we will think that replacement level is -10, when in fact, it is -20.
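The arithmetic in that example can be checked in a few lines. This is just the toy scenario above written out (half the players dropped at -40 after 300 PA, half finishing at 600 PA); the variable and function names are mine.

```python
# Rates are runs per 650 PA; "share" is the fraction of the player pool.
dropped = {"share": 0.5, "pa": 300, "rate": -40.0}

# Survivors: 300 PA at 0 (luck), then 300 PA at their true -20 level.
survivor_rate = (300 * 0.0 + 300 * -20.0) / 600
survivors = {"share": 0.5, "pa": 600, "rate": survivor_rate}


def pooled_rate(groups):
    """PA-weighted collective rate across the given groups."""
    total_pa = sum(g["share"] * g["pa"] for g in groups)
    weighted = sum(g["share"] * g["pa"] * g["rate"] for g in groups)
    return weighted / total_pa


everyone = pooled_rate([dropped, survivors])
min_400_pa = pooled_rate([g for g in (dropped, survivors) if g["pa"] >= 400])

print("survivors:", survivor_rate)      # -10.0
print("everyone:", everyone)            # -20.0
print("min 400 PA only:", min_400_pa)   # -10.0
```

Pooling everyone, weighted by PA, recovers the true -20; the PA minimum leaves only the survivors at -10.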

This is exactly what happens in baseball when you put a min playing time requirement, either per year, or per multi-years, on the samples of players you are looking at.  You will never get a true talent level on those players.  You will always get a number which is much higher than true talent level.

This is especially true for FA making the min salary, since teams will not have much patience with them.  As soon as they demonstrate a really bad level of performance (through bad luck, at least partially, which is always the case when a player performs badly), they get benched, released or they retire, or they simply get much fewer PA for the remainder of the season or perhaps their careers.


#36    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 11:32

The minimum plate appearances are a way of estimating service time; I’m trying to filter out the “prospects” and be left with just the replacements.

And it’s career PAs prior to the start of the season I’m looking at, and only applies to players who don’t meet any of my other criteria. Here’s how it goes:

* Rule 5 picks
* Waiver pickups
* Players purchased from other teams
* Players making less than 150% of the league minimum (I changed that and should have clarified that) as free agents
* Players making less than 150% of the league minimum who had 1950 or more career PAs to start the season.

The 1950 PA criteria is only applied to players who fail to meet one of the other four criteria. There are two sampling issues in a study of this kind: the number of PAs in the sample and the number of players in the sample. I’m trying to increase the size of the sample.
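For anyone trying to follow the filter, here is one possible reading of the list above as a Python sketch. The actual work was done in SQL that isn't shown here, so the record layout and field names below are hypothetical, and as written the first three categories carry no salary check.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    # Hypothetical record layout; field names are illustrative only.
    salary: float
    league_minimum: float
    career_pa_before_season: int
    rule5_pick: bool = False
    waiver_pickup: bool = False
    purchased: bool = False
    free_agent: bool = False


def is_replacement_candidate(p, salary_ratio=1.5, min_career_pa=1950):
    """Apply the five listed criteria in order; any one is enough."""
    if p.rule5_pick or p.waiver_pickup or p.purchased:
        return True
    cheap = p.salary < salary_ratio * p.league_minimum
    if cheap and p.free_agent:
        return True
    # Cheap veterans: the PA floor is meant to screen out young prospects.
    return cheap and p.career_pa_before_season >= min_career_pa


# 390,000 is the minimum salary figure mentioned in the main post.
veteran = PlayerSeason(salary=400_000, league_minimum=390_000,
                       career_pa_before_season=2000)
prospect = PlayerSeason(salary=400_000, league_minimum=390_000,
                        career_pa_before_season=500)
print(is_replacement_candidate(veteran), is_replacement_candidate(prospect))
```

The career PA count is taken before the season being measured, so it is a proxy for service time and not a filter on the sample season itself.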


#37    MGL      (see all posts) 2008/06/19 (Thu) @ 13:05

Players making less than 150% of the league minimum who had 1950 or more career PAs to start the season.

Why are you including them?  Aren’t some of these guys barely arb eligible?

The 1950 PA criteria is only applied to players who fail to meet one of the other four criteria.

I don’t get this at all.  If they fail to meet any of the criteria, how are they replacement players?  Who is included in this group?  Guys making 10 mil a year (they fail to meet any of the above criteria, right?)?  I don’t get your criteria and I don’t get the 1950 PA requirement.

If you are using a min PA requirement before the season in which you are collecting their stats, then you don’t have a selective sampling problem.  But, then, I don’t see why you are using any kind of PA min requirement.

And what do you mean players purchased from other teams?  What is a purchase and why are these replacement players?  Can’t I “purchase” a star player?

Sorry, I can’t evaluate your methods since I now have no idea as to what your criteria are for inclusion in your “presumably replacement level player” pool.

And again, if you are getting numbers that are not close to 20 runs worse than average at each position, then you are doing something wrong OR these replacement players are not very good at defense (the -20 runs is offense and defense combined), which could very well be the case with certain positions, like catcher.  There seem to be a lot of decent-hitting catchers (like Castro of the Mets, Fick, Saltalamacchia, Inge, etc.) that teams don’t like to use as catchers if they can help it.


#38    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 13:14

Now for defensive:

1B -3.3
2B -1.0
3B 0.9
SS -6.4
LF -2.9
CF -2.7
RF -2.7

I used RLYW’s zone rating database, and then used Chris Dial’s method to convert the totals to runs.

The CF versus corner outfielder numbers - center fielders had a higher ZR than corner outfielders, but they also had a higher number of chances.

Offense and defense combined:

1B -2
2B -8
3B -1
SS -31
LF -4
CF -10
RF -6
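As a quick check on the combined column, the sketch below adds the defensive figures above to the offensive figures, assuming the offense is the runs-per-650-PA list from comment #34. Since those offensive numbers were already rounded, the sums can only be expected to match the posted totals to within about a run.

```python
# Offense: rounded runs per 650 PA from comment #34 (assumed to be the inputs).
offense = {"1B": 1, "2B": -7, "3B": -1, "SS": -24, "LF": -1, "CF": -7, "RF": -3}
# Defense: figures posted in this comment.
defense = {"1B": -3.3, "2B": -1.0, "3B": 0.9, "SS": -6.4,
           "LF": -2.9, "CF": -2.7, "RF": -2.7}
# Combined totals as posted.
posted = {"1B": -2, "2B": -8, "3B": -1, "SS": -31, "LF": -4, "CF": -10, "RF": -6}

combined = {pos: offense[pos] + defense[pos] for pos in offense}

for pos in offense:
    gap = combined[pos] - posted[pos]
    print(f"{pos}: computed {combined[pos]:+.1f}, posted {posted[pos]:+d}, gap {gap:+.1f}")
```

The 3B and SS sums differ from the posted totals by a bit under a run, which is consistent with the posted totals having been built from unrounded offense.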


#39    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 13:32

On offense, those are runs compared to league average, not league average at that position. So a replacement first baseman is a (roughly) average baseball player, not an average first baseman.

As far as purchased players - you can purchase players off another team’s 40 man roster, or from another (unaffiliated) league. A fair portion of my purchased players pool is guys from places like the Atlantic League or the Mexican League. Unfortunately, that also includes players from Japan who are “posted,” which is something I didn’t realize. I’ll need to make some adjustments there.

For the PAs issue - what I want to do is look at players who are available from the minors relatively cheaply - minor league free agents, failed prospects, roster filler. The right way to do that would be to look at service time data, which I don’t have.

So I’m using players with three full seasons of hitting that are still making the league minimum. If that PA number is too small - if I’m capturing too many actual prospects - I can increase it. But there are two sampling issues, PAs and players. Below a certain number of players, you have a real sampling issue - and there aren’t enough free agents to make a reasonable study of the issue.


#40    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 13:36

I don’t get this at all.  If they fail to meet any of the criteria, how are they replacement players?  Who is included in this group?  Guys making 10 mil a year (they fail to meet any of the above criteria, right?)?  I don’t get your criteria and I don’t get the 1950 PA requirement.

They have to be making less than 150% of the league minimum. (I have a separate table of the league minimum per season that I’m using.) So A-Rod isn’t showing up as a replacement player.


#41    Colin Wyers      (see all posts) 2008/06/19 (Thu) @ 14:40

Another pass at this. I had another fence down in my SQL - it wasn’t checking to make sure that players purchased or claimed off waivers were within the salary constraints. Somehow my replacement second basemen are outhitting the league average second baseman:

http://www.editgrid.com/user/cwyers/replacement_level_87-07

All of the infield positions look off - 1B and the outfield look fine. I don’t know.


#42    B      (see all posts) 2008/06/20 (Fri) @ 09:38

If we’ve found that replacement level per 150 games is -23 runs and per 160 -19 runs and that’s what Tango uses as a basis in his salary scale, then what’s the point of finding replacement level hitting (or even defense) for an individual position?  That question may have been worded wrong, I think.  Anyway, it intuitively makes sense that if you’re a team looking for a CF and you want to pay min. salary you will look at all current CF making min. salary (and Rule V guys) and look at their batting/defensive level for a number of years to determine the replacement level CF.  However, Tango’s salary scale uses replacement level for nonpitchers as a whole to determine dollar value for any FA position player.  Which method is more precise?  Does it matter?  In other words, can Tango’s replacement level for nonpitchers be applied/used in MGL’s hypothetical situation (which is pretty realistic) or would you need to configure replacement level specifically for a CF?

Furthermore, I think it may not matter or that the answer will be similar.  Tango found replacement levels of a nonpitcher (.380), starter (.380), and a reliever (.470) by putting those numbers into the Odds Ratio Method which tells us they will win .300 times per game.  MGL seems to do the average offensive/defensive batting level for each position of a replacement player and goes from there...I think before we can move on in this replacement level talk these two different methods need to be cleared up.  Tango’s is quite easy to figure out for overall replacement level (and accurate), but how would you apply that to MGL’s CF situation because that is what teams are facing…
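For reference, here is a sketch of how the odds ratio combination might get from those three numbers to roughly .300. The combining formula is the usual odds ratio form for two components measured against a .500 baseline; the two-thirds weight on starters is purely my assumption for illustration and is not stated anywhere in this thread.

```python
def odds_ratio_combine(a, b):
    """Combine two win percentages, each measured against a .500 baseline."""
    return (a * b) / (a * b + (1 - a) * (1 - b))


nonpitcher = 0.380
starter = 0.380
reliever = 0.470

# Assumption: starters account for about two-thirds of the pitching.
starter_share = 2 / 3
pitching = starter_share * starter + (1 - starter_share) * reliever

team = odds_ratio_combine(nonpitcher, pitching)
print(round(pitching, 3), round(team, 3))
```

Under that weighting the pitching side comes out around .410 and the all-replacement team a shade under .300. A different split between starters and relievers would move the result slightly.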


#43    Chris      (see all posts) 2008/06/20 (Fri) @ 09:42

re: 1st post…

you said you forgot to add $390 K, but don’t you have to account for the 3 years of minimum salary a replacement player would be paid?  I thought I read once where you would need to do $390K * 3 = $1.17 mil when valuing a player…


#44    MGL      (see all posts) 2008/06/20 (Fri) @ 18:32

Offense and defense combined:

1B -2
2B -8
3B -1
SS -31
LF -4
CF -10
RF -6

I don’t know what is wrong, but those numbers just don’t make any sense.  The only one that is close to what it should be is SS.  Depending on what era we are talking about, the average SS is about -10 in hitting (and average of course, by definition, in defense), so -31 is 21 runs worse, in the ballpark of what it should be.

The average second and third basemen are around league-average hitters.  You have the replacements there as only 8 runs and 1 run worse than average.  Can’t be.

The other ones are in the general vicinity, but way too high, only in the neighborhood of 10-12 runs worse than average.

If a replacement player is 1 to 1.2 wins worse than an average player, and an average FA gets paid 8 mil a year, which he does, then a marginal win is worth 7 mil or so, which it ain’t.
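To spell out that last bit of arithmetic, here is a small sketch. It assumes the average FA is about an average player, uses the common rule of thumb of roughly 10 runs per win, and subtracts the 390,000 minimum salary mentioned in the main post; those choices and the function name are mine.

```python
RUNS_PER_WIN = 10.0     # rule-of-thumb assumption
AVG_FA_SALARY = 8_000_000
MIN_SALARY = 390_000    # minimum salary figure from the main post


def dollars_per_win(runs_below_average):
    """Implied price of a marginal win if replacement is this far below average."""
    wins_above_replacement = runs_below_average / RUNS_PER_WIN
    return (AVG_FA_SALARY - MIN_SALARY) / wins_above_replacement


for gap in (10, 12, 20):
    print(f"replacement {gap} runs below average -> "
          f"${dollars_per_win(gap):,.0f} per win")
```

A 10 to 12 run gap implies something like 6.3 to 7.6 million per win, while a 20 run gap implies about 3.8 million.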


#45    B      (see all posts) 2008/06/24 (Tue) @ 08:59

No thoughts, anyone?


#46    Tangotiger      (see all posts) 2008/06/24 (Tue) @ 10:11

I don’t know what else there is to say.

My position is pretty clear: create a model that mimics reality.  Treat positions not as if they are independent of each other (other than pitcher at the MLB level).  Once you’ve got a reasonable model, what is the point of trying to use the empirical evidence that will be fraught with selective sampling and sample size issues?  All that the work that was presented here has done for me is reinforce this issue.  We need to be very careful in what empirical evidence to use, and note its limitations.

There are certain things that we know that need to be in the model.  We know that OF move around all the time.  Even in hockey, they don’t move LW to RW and vice-versa as often as MLB moves LF/RF around.  And, there’s no one that follows hockey that would dare compare a LW to a replacement-level LW.  For some reason, people have this thought in baseball.

SS become 2B at some point in their lives.  How many MLB 2B were not the star SS of their team at some point?  Relief pitchers were once starting pitchers.  This is all something that needs to be part of the model.  If you construct a model or a study that presumes this is not true, or doesn’t even require this knowledge, I don’t see how you can really move forward here.

There are certain truths, like the average hitting CF, at some point in the late 40s or early 50s, was a better hitter than the average hitting 1B.  Now, unless there were a ton more GB hit at this time than FB, thereby limiting the value of the CF, any replacement model will likely fail this test. 

In other time periods (and I mean a twenty-year one), the average hitting 2B was the same as the average hitting SS.  Clearly, we didn’t have any kind of equilibrium here.  And, that management didn’t find the talent level required to push the better SS into 2B (as would be their natural inclination) doesn’t mean we hold it against the SS.  We certainly wouldn’t hold it against high school SS who would be compared to a higher average and replacement baseline than the average 2B.

To me, the problem is that there are too many numbers in baseball that obfuscate the reality.  We all know baseball.  We shouldn’t shunt it aside because we don’t have ready numbers at our disposal, or we have easier numbers that really tell us less than what we know.


#47    Chris      (see all posts) 2008/06/24 (Tue) @ 10:23

I think he wanted to know, given MGL’s situation, if you were working for a team, which method you would use (the one you use for your salary scale or similar to what MGL did with average offensive values, etc.).  Does it even matter?  If everyone has the same baseline with their position accounted for, I don’t think it does.  It shouldn’t matter when you evaluate a potential FA signing.

Oh and for your salary scale do you do $390K * 3 (for the 3 years of min. salary)?

