THE BOOK--Playing The Percentages In Baseball


Friday, May 09, 2008

By The Numbers

By Tangotiger, 11:59 AM

Two excellent issues from editor Phil Birnbaum (Nov 07, Feb 08).  Here are my thoughts:


1. Eric does an excellent study of “low reward/risk” pitchers, that is, free agent pitchers signed for 1 year at a low cost.  He finds that from 2002-07, they ended up costing about 2MM per win.  The free agent cost of players over that period has been a bit over 3MM per win.  However, if you combine all players (free agents, arb-eligible, slave), the average cost is right around 2MM per win.

So, what we have here is an EXTREMELY efficient market, whereby a team will pay a premium for “real” free agents (those that are actually being bid upon by many teams), and will pay the going MRP (marginal revenue product) rate for “happen to be” free agents (those who really have no sought-after value that would cause a bidding war).

Great job, Eric.

***

2. Victor expands on his study of the draft.  I love all these types of articles, as they get into the risk/reward, but not with the immediacy of free agents, since it takes years to evaluate what you end up having.  That Victor can conclude that the average hitter in the 51-75th best hitter class can be equivalent to any subgroup of pitchers (1-10, 11-25, etc) tells you all you need to know about the incredible risk that pitchers carry.  Basically, picking a pitcher in the first round, by ANY team, is foolish.  You’d have to be such a much better prospect than your typical first round pitcher to be even considered being picked in the top 30.

Now, it’s possible, as Victor pointed out, that he happens to have selected draft years where the pitching talent was fairly low.  So, that’s a huge proviso there, as he notes that the pitchers who came after that didn’t make the study.  And that would make the conclusions in the previous paragraph moot.

This is an area that deserves a lot of research, and Victor has added to the already great research of Philly and Rany.  And I look forward to a lot more of it.

3. The conversion of R+RBI and outs into W/L is a good idea, but the implementation requires changes.

First of all, it’s not R+RBI you want, but R+RBI-HR.  I’ve detailed the reasons already.

Secondly, W/L for pitchers makes a lot of sense, in that if you take Wins minus Losses, all divided by 2, you get a pitcher’s Wins above average (WAA).  This works best at the career level.

So, if you want to create a similar metric, you need to ensure that a player’s hitting W/L adheres to this property.

Since we have R+RBI-HR, and you have outs, it’s fairly straightforward to get runs (and therefore wins) above average.  RAA = R+RBI-HR - outs*x, where x is some league or team constant.  If a team is +50 runs above the league average, then you set the “x” such that RAA = 50.  Easy stuff.
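As a minimal sketch of that step, the constant x can be solved from team totals and then applied to each player. The function names and the team and player totals below are made up for illustration and do not come from the article under review.

```python
def outs_constant(team_r, team_rbi, team_hr, team_outs, team_raa_target):
    """Solve team RAA = (R + RBI - HR) - outs * x for x."""
    return (team_r + team_rbi - team_hr - team_raa_target) / team_outs


def runs_above_average(r, rbi, hr, outs, x):
    """RAA = R + RBI - HR - outs * x, with x the league or team constant."""
    return (r + rbi - hr) - outs * x


# Hypothetical team that is +50 runs above the league average.
x = outs_constant(team_r=800, team_rbi=760, team_hr=180,
                  team_outs=4100, team_raa_target=50)

# Hypothetical player on that team.
player_raa = runs_above_average(r=100, rbi=110, hr=35, outs=400, x=x)
print(f"x = {x:.4f}, player RAA = {player_raa:.1f}")
```

Because every player on the team uses the same x, the player RAA figures add up to the team target.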

RAA to WAA is fairly straightforward as well, using Pythag.  Or, just divide by 10 if you want a shortcut.

Finally, since you have 162 games, you can get the “game share” for the hitter as his percentage of his team’s PA times 162.  So, a guy who gets 11.1% of his team’s PA gets 18 “games”.
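Putting the pieces together, one way to get a hitter W/L that satisfies the (W minus L)/2 = WAA property is to split the "games" evenly and then shift by WAA. This is only a sketch of that reading: the function name is invented, the divide-by-10 shortcut stands in for Pythag, and no hitter/pitcher balancing of the shares is applied.

```python
def hitter_record(raa, player_pa, team_pa, runs_per_win=10.0, team_games=162):
    """Return (wins, losses) with W + L = game share and (W - L) / 2 = WAA."""
    waa = raa / runs_per_win                      # shortcut: 10 runs per win
    games = team_games * player_pa / team_pa      # share of team PA times 162
    wins = games / 2 + waa
    losses = games / 2 - waa
    return wins, losses


# Hypothetical hitter: 700 of his team's 6300 PA (11.1%), +45 RAA.
w, l = hitter_record(raa=45, player_pa=700, team_pa=6300)
print(f"{w:.1f}-{l:.1f}, WAA = {(w - l) / 2:.1f}")
```

A hitter with an average RAA of zero comes out at exactly .500 over his share of games, which is the behavior the pitcher W/L analogy calls for.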

Now, we do have one problem.  We are giving 162 “games” for pitchers (which includes fielders) and 162 “games” for hitters.  But, that’s not the correct balance (because of the fielders), which is what the author came up against.  The “shares” you want to give out are roughly 30% higher for hitters than pitchers.

The author comes out with his own divisors which are 30% higher than the original baseline we would have started with.  Kudos to the author for making an arbitrary intuitive decision that in fact is justifiable.

Anyway, I don’t know how much what I’ve said here changes things.  It’ll certainly change a lot for the Raines and Gwynns and Boggs.

So, a good idea, with an implementation that needs work.

And, as I noted in my original Win Shares article, all of these things are simply Linear Weights, but trying to come up with a second number to accompany it, to give it more context.

#1    Tangotiger    2008/05/09 (Fri) @ 12:40

Now, it’s possible, as Victor pointed out, that he happens to have selected draft years where the pitching talent was fairly low.

The expectation of low pitching talent for any given year can be easily determined by the % of pitchers drafted (say after 30, 50, 100, or whatever players selected).

If Victor is right that the pitching talent may have been low, was that the perception at the time?  If the percentage remained the same each year, then no, that was not the perception, and therefore, we should not think that the result should differ greatly.


#2    Eric J. Seidman    2008/05/09 (Fri) @ 12:49

Thanks, Tango.  I’m going to do a followup for tomorrow or Sunday at StatSpeak looking at the Low-Risk class of this year and what’s happened thus far.

I loved the Hitter W-L too.  The Bonds-Clemens and Rose-Ryan comparisons really stuck with me and made sense.

Is this something you would look at the Game Advancement for?

I’d be really interested in seeing the Hitter W-L during the peak years of a batter’s career.  Some guys stay around for a very long time and aren’t particularly effective towards the end (or as effective as before), so seeing the peak years and/or how many “20-win” seasons a batter had would be extremely interesting.


#3    2008/05/09 (Fri) @ 14:16

I think part of the poor turnout for pitchers is because I used win shares.  Is there an adjustment I can make for pitchers to correct for win shares, e.g., add 1 WS for every 50 IP?


#4    Tangotiger    2008/05/09 (Fri) @ 14:36

Right, Fangraphs already collects the WA and LA (noted as +WPA and -WPA on its site).  All you need to do is collect it on a career level.

Pujols for example two years ago was 18-9, for a +9 wins in WPA.

When you add it up at the team level, you’ll get the typical 81-81 team to be 110-110 or so in hitting WA-LA.

So, it’s “cute” to get it like that, but ultimately, you always need to remember the basics of wins above average, and “games”.

If you insist on turning WA and LA into a cute W/L, you can try to do this:
1. take WA+LA and multiply by 0.63
2. take WA and LA and multiply each by two
3. take the value in step 1 and remove it from the new values in step 2

So, let’s take Pujols.  He’s 18, 9 for WA, LA. 
1. 27 times 0.63 = 17
2. 18, 9 becomes 36, 18
3. 36 minus 17 = 19, and 18 minus 17 = 1

Pujols gets a W/L record of 19-1.  That puts him as +9 wins above average.

His 20 “games” (i.e., 19 W and 1 L) gives him 12% of his hitting team’s share of games.  His game advancement (18, 9) is 12% of his offense’s game advancements.

Sweet, right?
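The three steps above can be sketched as a small function. The function name is invented, and 0.63 is simply the factor quoted in the steps. The steps round the offset to 17, so the unrounded values differ slightly from 19 and 1.

```python
def wa_la_to_record(wa, la, factor=0.63):
    """Turn win advancement (WA) and loss advancement (LA) into a W/L record."""
    offset = (wa + la) * factor          # step 1
    wins = 2 * wa - offset               # steps 2 and 3
    losses = 2 * la - offset
    return wins, losses


# Pujols example from the steps above: WA = 18, LA = 9.
w, l = wa_la_to_record(18, 9)
print(f"W/L = {w:.2f}-{l:.2f}  (rounds to {round(w)}-{round(l)})")
print(f"WAA = {(w - l) / 2:.1f}")
print(f"share of 162 hitting games = {(w + l) / 162:.1%}")

# A typical team at roughly 110-110 in hitting WA-LA lands near 162 "games".
tw, tl = wa_la_to_record(110, 110)
print(f"team games = {tw + tl:.1f}")
```

The gap W minus L is always 2 x (WA minus LA), so wins above average is untouched by the conversion. The factor only sets how many "games" get handed out.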

***

You do end up with 162 pitching games and 162 hitting games.  In order to solve that, you need to subtract a proportionate share of all that.  So, Pujols, who has 20 “games” really needs to have 10 “games”.  However, the gap of 19 W and 1 L has to remain.  So, you subtract 5 from each side, giving him a W-L record of 14-(-4).

In Win Shares lingo, you multiply all that by 3, and show the “double negative” as:
42+12
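A short sketch of that adjustment, with invented function names: halve the number of "games" while keeping the gap between W and L, then multiply by 3 and print the negative loss total as a plus.

```python
def halve_games(wins, losses):
    """Remove half of the games, split evenly, so W - L is unchanged."""
    per_side = (wins + losses) / 2 / 2
    return wins - per_side, losses - per_side


def win_shares_notation(wins, losses):
    """Multiply by 3 and show a negative loss total as 'W+L'."""
    ws_w, ws_l = 3 * wins, 3 * losses
    sign = "+" if ws_l < 0 else "-"
    return f"{ws_w:g}{sign}{abs(ws_l):g}"


w, l = halve_games(19, 1)          # Pujols: 19-1 becomes 14-(-4)
print(w, l)
print(win_shares_notation(w, l))   # shown as 42+12
```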


#5    Tangotiger    2008/05/09 (Fri) @ 14:48

Victor/3: oh, that would explain A LOT.

Yowza.  You used WSAB, and I’ve already talked about that a few times how to handle the pitchers.  What you need to end up with is the following:
57% nonpitchers WSAB
33% starters WSAB
10% relievers WSAB

This can be explained as using the following replacement-levels:
.380 nonpitchers
.380 starters
.470 relievers

Note also these averages:
.500 nonpitchers
.490 starters
.520 relievers

Finally, note that starters get twice the number of innings as relievers.

So, add it up:
+.120 average nonpitchers above repl
+.110 average starters above repl
+.050 average reliever above repl

Give these weights:
9 nonpitchers
6 starters
3 relievers
... these correspond to innings.

Overall, you get:
.12x9
+ .11x6
+ .05x3
= 1.89

You get these shares:
1.08/1.89 = 57% nonpitchers
0.66/1.89 = 35% starters
0.15/1.89 = 8% relievers
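The same arithmetic as a sketch in code, using only the averages, replacement levels and innings weights listed above. The dictionary layout and function name are just one way to organize it.

```python
groups = {
    "nonpitchers": {"average": 0.500, "replacement": 0.380, "weight": 9},
    "starters":    {"average": 0.490, "replacement": 0.380, "weight": 6},
    "relievers":   {"average": 0.520, "replacement": 0.470, "weight": 3},
}


def wsab_shares(groups):
    """Share of above-replacement value per group, weighted by innings."""
    contrib = {
        name: (g["average"] - g["replacement"]) * g["weight"]
        for name, g in groups.items()
    }
    total = sum(contrib.values())
    return {name: value / total for name, value in contrib.items()}


shares = wsab_shares(groups)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```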

Now, relievers’ leverage is not spread out randomly.  So, the 8% value that relievers get should probably be more like 10% or so.

You can also look at player salaries, and I’d bet that the allocation would follow this 57/33/10 split, more or less.

So, when you construct your WSAB, you need to ensure that it represents something close to that.

***

Guy or Rally has contended that the salary split should be a bit higher because pitchers represent greater risk/uncertainty.  Perhaps in the individual sense that is true, but if you’ve got 100 pitchers in your organization, I don’t know that it matters.  To the extent that it does matter, maybe pitchers should get a bit more as a group.  But, I’m pretty sure they get paid around 42% or so, so I don’t think teams are practicing that.


#6    Eric J. Seidman    2008/05/09 (Fri) @ 14:52

Yeah, that’s very sweet… and I think it’s a very cool way to indicate a batter’s contributions.  We don’t have to do it, like you said, but it’s still pretty cool to be able to relate offensive contribution to a commonly accepted pitching barometer.


#7    Tangotiger    2008/05/09 (Fri) @ 15:06

So, Pujols, who has 20 “games” really needs to have 10 “games”. 

Actually, sticking to the theme that nonpitchers should get 57% of the share (for their hitting and fielding), then you would want to subtract 43% of his 20 games or remove 8.6 games, or 4.3 W and 4.3 L.  But, as far as Fangraphs is concerned, since fielding is simply ignored, then the 50/50 split should hold.
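For comparison, here is a sketch of the general version, where the fraction of "games" to remove is a parameter (0.43 for the 57% nonpitcher share, 0.50 for the Fangraphs-style split). The function name is made up.

```python
def trim_record(wins, losses, remove_fraction):
    """Remove a fraction of the games, split evenly, so W - L is unchanged."""
    per_side = (wins + losses) * remove_fraction / 2
    return wins - per_side, losses - per_side


# Pujols at 19-1: remove 43% of his 20 games, i.e. 4.3 W and 4.3 L.
w, l = trim_record(19, 1, 0.43)
print(f"{w:.1f} W, {l:.1f} L, gap = {w - l:.1f}")
```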


#8    philly    2008/05/10 (Sat) @ 08:17

Tango/1:

“If Victor is right that the pitching talent may have been low, was that the perception at the time?  If the percentage remained the same each year, then no, that was not the perception, and therefore, we should not think that the result should differ greatly.”

I’m in the process of updating my draft data and that’s one of the questions that I’ve been interested in.  To try to address the question I established an expected WARP production for every draft slot up to #350.  There is a decent amount of aggregate production after that - about 325 WARP/draft - but for each individual slot it’s pretty close to zero.  I can then simply add up the expected slot values by position or state or school or whatever and see how MLB as an industry viewed that segment in a given year.

I haven’t had a chance to read Victor’s piece yet, but let’s just use pitchers to see how that might work.  Oh, and actually I have it split by hand so for ease I’ll stick to RHP.

This won’t format, but I’ll list the expected career WARP and the percentage of that total devoted to RHP for each year from 1987-1996.

1987: 451.7, 38.0
1988: 434.1, 36.5
1989: 366.7, 30.9
1990: 318.2, 26.8
1991: 391.1, 32.9
1992: 405.8, 34.1
1993: 483.6, 40.7
1994: 457.3, 38.5
1995: 454.7, 38.2
1996: 485.6, 40.9

The 10 year average is 425 WARP, 36%.
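A quick sketch that recomputes the averages from the list above. It assumes, as one reading of the list, that the first number is the expected career WARP attached to RHP slots and the second is RHP's percentage of the whole draft's expected WARP. Under that reading the implied draft total should be about the same every year, which makes a handy consistency check.

```python
# year: (expected career WARP for RHP slots, RHP share of draft total in %)
rhp_by_year = {
    1987: (451.7, 38.0), 1988: (434.1, 36.5), 1989: (366.7, 30.9),
    1990: (318.2, 26.8), 1991: (391.1, 32.9), 1992: (405.8, 34.1),
    1993: (483.6, 40.7), 1994: (457.3, 38.5), 1995: (454.7, 38.2),
    1996: (485.6, 40.9),
}


def ten_year_average(data):
    """Average expected WARP and average percentage across the years."""
    n = len(data)
    avg_warp = sum(warp for warp, _ in data.values()) / n
    avg_pct = sum(pct for _, pct in data.values()) / n
    return avg_warp, avg_pct


def implied_draft_total(data):
    """Total expected WARP per draft implied by each year's WARP and share."""
    return {year: warp / (pct / 100) for year, (warp, pct) in data.items()}


avg_warp, avg_pct = ten_year_average(rhp_by_year)
low_year = min(rhp_by_year, key=lambda y: rhp_by_year[y][1])
high_year = max(rhp_by_year, key=lambda y: rhp_by_year[y][1])
print(f"average: {avg_warp:.0f} WARP, {avg_pct:.0f}%")
print(f"lowest RHP share: {low_year}, highest: {high_year}")
for year, total in implied_draft_total(rhp_by_year).items():
    print(year, round(total))
```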

As a group RHP consume the most draft slots and expected production.  However, you can see that the figure does bounce around from a low of 318/27 in 1990 to a high of 486/41 in 1996.

Let’s just see if MLB was right as an industry to devote relatively fewer resources to RHP in 1990.

The top pitchers that year turned out to be:

Mussina - 123 WARP, #20 overall
Wickman - 55.5 WARP, #44
Alex Fernandez - 52.3, #4

huge gap

A bunch of guys like Mike Williams, Steve Karsay, Rick White at 25 WARP.

It did turn out to be a very shallow draft that produced just two very good starters and one decent reliever who hung around longer than most.

I’d say as an industry MLB did a pretty good job dialing back the percentage of draft resources devoted to RHP that year.

However, MLB did a poor job of balancing hitters vs pitchers overall in the 1st rd in those years, which addresses this point from Tango:

“Basically, picking a pitcher in the first round, by ANY team, is foolish.  You’d have to be such a much better prospect than your typical first round pitcher to be even considered being picked in the top 30.”

I exclude the #1 overall pick because it’s a real anomaly, but I can look at #2-30 for that 10 year period.  I use a 40 WARP career as a rough guideline for a “good” player, and it’s much more likely to find a 40+ WARP hitter than pitcher.

- 143 pitchers were picked in those slots and 11 exceeded 40 WARP or 7.7%

- 147 hitters were picked and 27 exceeded 40 WARP or 18.4%

So you can see that as an industry MLB picked hitters and pitchers in roughly equal numbers, but hitters were more than twice as likely to have “good” careers.
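The rates are easy to recheck from the counts given for slots #2-30. This is a small sketch with an invented helper name.

```python
def hit_rate(good_careers, picks):
    """Fraction of picks that exceeded the 40 WARP guideline."""
    return good_careers / picks


pitcher_rate = hit_rate(11, 143)
hitter_rate = hit_rate(27, 147)

print(f"pitchers: {pitcher_rate:.1%}, hitters: {hitter_rate:.1%}")
print(f"hitters were {hitter_rate / pitcher_rate:.1f}x as likely to clear 40 WARP")
print(f"pitchers were {143 / (143 + 147):.0%} of the picks")
```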

I’m hopefully going to post a lot of this stuff at SoSH prior to the real draft.

Oh, but one other quick, quirky thing that I just compiled the other day.

Mussina at 123 WARP was the top college pitcher drafted in this period by a lot.  The next is Scott Erickson at the top of a big group at ~55 WARP.

However, Erickson is not the second most productive college player who pitched in the majors.  He’s not even 3rd.  The next most productive college “pitcher” is Trevor Hoffman at 80 WARP.  He was a SS who was drafted in the 11th rd.  The 3rd most productive college “pitcher” is Wakefield, who was a 1B drafted in the 8th rd.

Now, if I looked at pre-FA production they would drop down the list some, but I think it is amazing to think that a SS and a 1B have been more productive over their careers than every drafted RHP save one.


#9    tangotiger    2008/05/10 (Sat) @ 09:28

Fantastic stuff!  And very cool about Hoffman/Wakefield.

***

And maybe you can write it for Hardball Times or somewhere else in addition to SOSH.  SOSH has disabled their search feature, so it’s impossible to find anything there easily, and you’ve got too much good stuff already posted there for it to be hidden away from everyone.

***

Question: do you carry IP and PA along with your WARP numbers?  The WARP numbers skew things for players with long careers, and I can give you an adjustment if you tell me you also have IP and PA.


#10    philly    2008/05/10 (Sat) @ 12:00

No, I didn’t think to include PA and IP.  This is my third iteration - every one an improvement I like to think - with these data, so I’m sure I’ll do another round in a couple of years.  I’ll try to remember to include that.

But all of what I’ll post will include very good estimates of pre-FA production (some thanks to your posting 2007 service time) in addition to career totals so I will be able to make some notes about peak vs career production.

I would like to get some broader exposure so I did consider publishing through THT or somewhere, but I tend to write and publish at the last minute and just wasn’t sure about making the process more complicated with a 3rd party.  I have thought about just making a separate forum over at SoSH.  Right now I just dump everything into the general Sox Draft forum so I can find them.  I do believe that the links are stable though so links from here or the wiki would do the trick from an archival perspective.

There’s one other point I wanted to make about the 1st rd hitters vs pitchers issue that I didn’t have a chance to earlier.  The direct comparison that I made - and everybody else makes - between 1st rd hitters and pitchers is a little too facile.  If MLB as an industry avoided pitchers in the 1st rd the pitchers wouldn’t be replaced with players as good as current 1st rd hitters.  They would be replaced by the hitters who are currently drafted in the 1st supp and early 2nd rd.  And the overall drop off in that area of the draft is pretty steep.

I believe I started to compile a comparison of 1st rd pitchers to the next x hitters a couple years ago, but don’t think I finished it.  Don’t remember the results anyway.

I’ve grouped slots #31-100 together as my next segment.  I can compare pitchers drafted #2-30 to hitters drafted #31-100.  The hitters group will be much larger and worse than the players who would replace 1st rd pitchers, but for a quickie look not bad.

Pitchers #2-30:

- 143 total
- 44 or 31% DNP
- 11 or 7.7% exceeded 40 WARP

Position Players #31-100:

- 373 total
- 232 or 62% DNP
- 12 or 3.2% exceeded 40 WARP

Obviously 1st rd pitchers look much better in comparison to these hitters than 1st rd hitters (28% DNP, 18.4% over 40 WARP).

The position players who would get drafted in the 1st rd to replace the 1st rd pitchers in a theoretical no pitchers in the 1st rd environment would be closer to 62/3 than 28/18.

It is a very interesting question.  I’m going to put it onto my to do list because the breakeven point could be pretty close from those numbers.

Oh, and it’s important to remember that I’m still excluding the #1 overall pick here.  There’s a huge, huge advantage to picking a hitter at #1.

In this 10 year span it’s the difference between:

Griffey, ARod, Chipper, Erstad, Nevin and

Andy Benes, Ben McDonald, Kris Benson, Paul Wilson and Brien Taylor.

Now Griffey and ARod are the very top of the range of amateur talent ever available in the draft, but the difference is still real.


#11    tangotiger    2008/05/10 (Sat) @ 14:20

This is a good study:
http://sports.espn.go.com/mlb/columns/story?columnist=neyer_rob&id=1811682

What I like about that study is that it looks at the top 10 HS pitchers and top 10 college pitchers, regardless of which round they were in.  After all, this is what we are trying to figure out, exactly where should they be drafted.

If you look at the way we did the groupings, an equal number in each group, it makes it a lot easier to know the equivalencies of what each set matches to the other set.


#12    philly    2008/05/14 (Wed) @ 07:28

I can’t seem to post to this thread with something I cut and paste from Word.  Just seeing if that’s the problem…

Looks like it.  When I cut and paste and hit preview it just returns to the top of the thread and I never get to this preview window.

That’s a pain in the butt.


#13    Tangotiger    2008/05/14 (Wed) @ 08:11

Send me an email, and I’ll try to figure out what the issue is.  The spam filter may not like it.  And if you are using “a href” tags, it doesn’t like those either.

If you have a URL, just type it in, and the software will hotlink it automatically.

I think I have a 9999 character limit per post.  I can check that too.

email to:
tom~tangotiger~net
replacing the ~ with the obvious characters


#14    studes    2008/05/14 (Wed) @ 08:39

Victor, WSAB already has different baselines in place for hitters vs. batters, unless you did something different than my WSAB stats.  I can’t remember exactly what I sent you.


#15    2008/05/14 (Wed) @ 09:56

14/Studes:
Do you mean hitters vs. pitchers?  I used WSAB which is what you sent me.


#16    Tangotiger    2008/05/19 (Mon) @ 12:43

This is from philly (I think):

===========================================

“After all, this is what we are trying to figure out, exactly where should they be drafted.”

True, but teams are already doing that to some extent. I’d guess that the top 10 C pitchers were drafted higher and therefore had a somewhat higher expected rate of return than the top 10 HS pitchers already.

Just to take one example that I know will be skewed towards C pitchers because the #1 overall pick was a C pitcher, here are the slots in which the top 10 HS and C pitchers were taken in 1994.

HS:

7, 10, 15, 16, 25, 27, 35, 45, 67, 70

C:

1, 3, 9, 18, 19, 23, 31, 33, 34, 36

I have the 10 HS slots with an expected pre-FA WARP of 60.0 vs 82.7 for the C slots. Just based on slots the industry as a whole had already discounted the HS pitchers somewhat.
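The expected WARP per slot comes from a table that is not reproduced in this thread, so it cannot be recomputed here. As a rough stand-in, this sketch compares the two sets of 1994 slots by their average and median pick, and takes the ratio of the two expected pre-FA WARP totals quoted above.

```python
from statistics import mean, median

hs_slots = [7, 10, 15, 16, 25, 27, 35, 45, 67, 70]
college_slots = [1, 3, 9, 18, 19, 23, 31, 33, 34, 36]

print(f"HS: mean slot {mean(hs_slots):.1f}, median {median(hs_slots)}")
print(f"C:  mean slot {mean(college_slots):.1f}, median {median(college_slots)}")

# Expected pre-FA WARP totals quoted above (60.0 for HS, 82.7 for C).
discount = 1 - 60.0 / 82.7
print(f"HS slots carry about {discount:.0%} less expected pre-FA WARP")
```

Average pick number is only a crude proxy, since expected value drops off steeply near the top of the draft, but it points the same way as the slot-value totals.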

Anyway, I went back to look at how the 10 1st rds from 1987-1996 would have been different if instead of the 148 pitchers actually drafted in the 1st rd teams took the next 148 position players. In a “no pitchers in the 1st rd” model you don’t get position players as good as the ones already being drafted in the 1st rd, you get the position players just outside of rd 1. In actuality these players were drafted in slots #31-66. How good are these position players in comparison to the 1st rd pitchers?

Pitchers:

- 148 total
- 1 greater than 100 WARP (Mussina)
- 1 greater than 80 WARP (Appier)
- 1 greater than 60 WARP (Wagner)
- 6 greater than 50 WARP (Halladay, Benes, McDowell, Nagy, A Fernandez, Jones)
- 3 greater than 40 WARP (Sele, Olson, Morris)
- 11 greater than 30 WARP
- 15 greater than 20 WARP
- 17 greater than 10 WARP
- 48 less than 10 WARP
- 45 DNP

That’s a 30% complete bust rate, 8% who exceeded 40 WARP, and 18% who were in the useful 20-40 WARP range.

In total these pitchers collected 1347.2 pre-FA WARP and so far 1859.2 career WARP.

Hitters:

- 148 total
- 0 greater than 100 WARP
- 1 greater than 80 WARP (Rolen)
- 2 greater than 60 WARP (Belle, Beltran)
- 1 greater than 50 WARP (Damon)
- 1 greater than 40 WARP (Rollins)
- 3 greater than 30 WARP
- 3 greater than 20 WARP
- 6 greater than 10 WARP
- 53 less than 10 WARP
- 78 DNP

That’s a 53% complete bust rate (vs 30% for pitchers), 3% who exceeded 40 WARP (vs 8% for pitchers), and 4% who were in the useful 20-40 WARP range (vs 18% for pitchers).

In total these hitters collected 497.9 pre-FA WARP (vs 1347.2 for pitchers) and so far 659.4 career WARP (vs 1859.2 for pitchers).

Overall, the position players who would have been pushed into the 1st rd in a “no pitcher” model were about 1/3 as productive as the pitchers actually taken.
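The summary percentages can be rechecked from the two tallies above. This sketch assumes each "greater than" line is a non-overlapping band (for example, "greater than 80" means 80 to 100), which is consistent with each list adding up to 148. The band labels and function name are only for illustration.

```python
BANDS = [">100", "80-100", "60-80", "50-60", "40-50",
         "30-40", "20-30", "10-20", "<10", "DNP"]

pitchers = dict(zip(BANDS, [1, 1, 1, 6, 3, 11, 15, 17, 48, 45]))
hitters = dict(zip(BANDS, [0, 1, 2, 1, 1, 3, 3, 6, 53, 78]))


def summarize(tiers):
    """Bust rate, 40+ WARP rate and 20-40 WARP rate for one group."""
    total = sum(tiers.values())
    over_40 = sum(tiers[b] for b in BANDS[:5])
    useful = tiers["30-40"] + tiers["20-30"]
    return {
        "total": total,
        "bust_rate": tiers["DNP"] / total,
        "over_40_rate": over_40 / total,
        "useful_rate": useful / total,
    }


for label, tiers in (("pitchers", pitchers), ("hitters", hitters)):
    s = summarize(tiers)
    print(f"{label}: n={s['total']}, bust {s['bust_rate']:.0%}, "
          f"40+ {s['over_40_rate']:.0%}, 20-40 {s['useful_rate']:.0%}")

# WARP totals quoted above: hitters relative to pitchers.
print(f"pre-FA ratio: {497.9 / 1347.2:.2f}, career ratio: {659.4 / 1859.2:.2f}")
```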

So I’d say on the individual team level it is true that the teams that took pitchers in the 1st rd probably should have taken a comparable 1st rd hitter. However, on an industry wide level it is not true that all teams should avoid pitchers in the 1st rd simply because the hitters that would be elevated into the first round were not nearly as good as the pitchers actually taken in rd 1.

One other observation, of the 12 pitchers that exceeded 40 WARP just one (Halladay) was from HS. But of the 5 position players just outside the 1st rd that exceeded 40 WARP just one (Albert Belle) was from college. So it may be the case that MLB would have been better off pushing HS pitchers out of the 1st rd while pushing some second tier HS position players up.


#17    Tangotiger    2008/05/19 (Mon) @ 12:46

Fantastic stuff!


#18    philly    2008/05/19 (Mon) @ 22:56

Yes, that was me.  Glad you figured out why it wouldn’t post.

It would probably be worth it to re-do and just back the HS 1st rd pitchers out of rd 1 and bring back in the next group of players including C pitchers with all of the position players.

