THE BOOK--Playing The Percentages In Baseball


Thursday, December 16, 2010

Lemonade

By Tangotiger, 08:55 PM

JC was kind enough to post this handy-dandy chart:

That’s the frequency of ERA, for pitchers allowed to face at least 100 MLB hitters in 2009.  Now, what do you think a similar chart would look like for AA+AAA (twice as many pitchers).  Probably something like this:
image

Now, what if all those guys were to pitch in MLB (but only against current MLB hitters).  How would they do?  Probably something like this:

image

Now, what would happen if you simply add the first chart (JC’s chart) to the third chart (my chart of minor league pitchers’ presumed performance against MLB hitters)?  You get something like this:

image

You see how the pyramid is starting to form?  If you add single A, rookie pitchers, college pitchers and so on, the pyramid keeps building, and you get the tail of a normal distribution.

This is how talent should be visualized:
http://www.tangotiger.net/talent.html
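
A rough way to see the pyramid form is to simulate it. The sketch below assumes a pool of 100,000 players with normally distributed talent and treats the best 750 as the "MLB" group; both numbers and the band width are made up for illustration, not taken from the charts above.

```python
import random

random.seed(0)

# Hypothetical numbers: a pool of 100,000 players with standard-normal talent,
# of whom the best 750 are treated as the "MLB" group.
POOL_SIZE = 100_000
MLB_SIZE = 750

talent = sorted((random.gauss(0.0, 1.0) for _ in range(POOL_SIZE)), reverse=True)
mlb = talent[:MLB_SIZE]
cutoff = mlb[-1]  # talent of the last player who makes the cut

# Count MLB players in equal-width talent bands, starting at the cutoff.
BAND = 0.25
counts = {}
for t in mlb:
    band = int((t - cutoff) / BAND)
    counts[band] = counts.get(band, 0) + 1

# Print a sideways histogram: the band nearest the cutoff comes first.
for band in sorted(counts):
    low = cutoff + band * BAND
    print(f"{low:5.2f} to {low + BAND:5.2f}: {'#' * (counts[band] // 5)} ({counts[band]})")
```

Under these assumptions the band just above the cutoff is the most crowded and each better band holds fewer players, which is the tail shape described above.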


#1    MGL      (see all posts) 2010/12/17 (Fri) @ 04:27

A “pyramid?” Isn’t a pyramid symmetrical with a point at the top, sort of like a pointy normal curve?  How is the tail end of a normal curve in the shape of a “pyramid?”


#2    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 08:25

Push the pyramid down to its (left in this case) side.


#3    tangotiger      (see all posts) 2010/12/17 (Fri) @ 08:41

The pyramid is a reference to Bill James’ term here:

Talent in baseball is not normally distributed. It is a pyramid. For every player who is 10 percent above the average player, there are probably twenty players who are 10 percent below average.

While the 1:20 ratio itself is not necessarily the best estimate, the idea is that MLB players represent one tail of a normal distribution of a larger group of all baseball players.


#4    Martin Monkman      (see all posts) 2010/12/17 (Fri) @ 10:13

Another way to think of this is that J.C.’s analysis has selection bias, based on survivorship.  (This is an untested hypothesis, but I recall seeing tabulations that show that there are, as you say and implied in the Bill James quote, far more below-average players than above-average.) By pruning his list to only those pitchers with a minimum of 100 batters faced, J.C. omits a large cohort of pitchers who didn’t get that chance.  And the fastest way to miss that chance is to pitch poorly, i.e. have a high ERA.


#5    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 10:48

Martin: you are obviously correct.  The large majority of pitchers who faced 1 to 99 batters in MLB did so because they were not good enough to face 100 batters.  SOME of them were late season September callups, and would be good enough.  But, clearly, the omitted sample of pitchers would be disproportionately below average.


#6    Sky      (see all posts) 2010/12/17 (Fri) @ 11:20

It’s almost like he’s DOUBLE selective sampling.

1. Ignoring non-MLB players.
2. Ignoring MLB players with little playing time.

If you go searching for replacement-level players, why would you think you’d find them among guys given significant MLB playing time?


#7    Martin Monkman      (see all posts) 2010/12/17 (Fri) @ 13:50

Short snappy analysis, based on all pitchers 2000-2008 (Note: some cleanup of the source data necessary ... thus the results are accurate enough for this purpose, but not necessarily precise):

Column1 - BF_grp: number of batters faced
Column2 - SEA: count of pitcher-seasons (2000-2008)
Column3 - BF_avg: average number of batters faced within group
Column4 - ERA_avg: average ERA of group












BF_grp    SEA    BF_avg   ERA_avg
<100      1934    46.08    8.11
100-399   2814   227.61    4.61
400+      1365   703.92    4.44
TOTAL     6113   276.53    5.68

Conclusion:  J.C.’s analysis (a) excludes roughly 1/3 of all pitcher-seasons and (b) those that are excluded are a group that is, on average, far less effective than those who face more batters.
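
The TOTAL row and the "roughly 1/3" figure can be cross-checked from the three group rows. This is only arithmetic on the numbers in the table; it assumes the TOTAL averages are weighted by pitcher-seasons, which is the weighting that reproduces them.

```python
# Rows from the table: (group, pitcher-seasons, average BF, average ERA)
rows = [
    ("<100", 1934, 46.08, 8.11),
    ("100-399", 2814, 227.61, 4.61),
    ("400+", 1365, 703.92, 4.44),
]

total_seasons = sum(sea for _, sea, _, _ in rows)

# Share of pitcher-seasons dropped by a 100 batters faced minimum.
excluded_share = rows[0][1] / total_seasons

# Averages weighted by pitcher-seasons (assumed weighting for the TOTAL row).
bf_all = sum(sea * bf for _, sea, bf, _ in rows) / total_seasons
era_all = sum(sea * era for _, sea, _, era in rows) / total_seasons

# Same ERA average, but only for the groups that survive the cutoff.
included = rows[1:]
included_seasons = sum(sea for _, sea, _, _ in included)
era_included = sum(sea * era for _, sea, _, era in included) / included_seasons

print(f"pitcher-seasons: {total_seasons}")
print(f"excluded share:  {excluded_share:.3f}")
print(f"BF_avg, all:     {bf_all:.2f}")
print(f"ERA_avg, all:    {era_all:.2f}")
print(f"ERA_avg, 100+:   {era_included:.2f}")
```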

And don’t try to tell me that we have to exclude the ”<100 batters faced” pitchers because of the inaccuracy in the measure (small number of batters faced). We have nearly 2,000 pitcher-seasons to look at, so the average (mean) of that group is quite precise, even if the variation around the mean is wide due to random effects.  What I’m trying to say is that our best estimate of the “true talent” of this group is an ERA of 8.11, and that estimate is quite accurate.

I’ll prepare a more robust analysis and post it to my blog in the next 24 hours.


#8    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 14:22

What I’m trying to say is that our best estimate of the “true talent” of this group is an ERA of 8.11, and that estimate is quite accurate.

That is not accurate.  If you look at how those pitchers who faced fewer than 100 batters did in the season preceding or the season following, THAT will give you a much better indicator of the true talent level.

This is just like my challenge from last week, about the league-leaders in SLG and OBP and HR.  The true talent level (for the group) is determined from the seasons that are not in the sample.

***

The problem is that once the manager saw someone face 90 batters with a 9.00 ERA, he did not let him pitch any longer.  BUT, if he saw someone face 90 batters with a 1.00 ERA, he WOULD let him pitch. 

And in their next 90 batters, it is certainly not true that the first guy would continue to post a 9.00 ERA and the second guy would continue to post a 1.00 ERA.

***

So, think of this as a similar challenge, but looking at the worst players given the least amount of playing time.
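
Here is a minimal simulation of that manager effect. It assumes every pitcher has exactly the same true talent, uses a simple per-batter on-base rate instead of ERA, and uses an invented cutoff for when the manager pulls someone; all constants are hypothetical.

```python
import random
from statistics import mean

random.seed(1)

TRUE_RATE = 0.33    # assumed chance a batter reaches base, identical for every pitcher
FIRST_LOOK = 90     # batters faced before the manager decides
PULL_ABOVE = 0.36   # hypothetical cutoff: pulled if the observed rate exceeds this
N_PITCHERS = 5000

def observed_rate(batters):
    """Share of batters who reach base when each has the same TRUE_RATE chance."""
    return sum(random.random() < TRUE_RATE for _ in range(batters)) / batters

pulled_first, pulled_next = [], []
kept_first, kept_next = [], []
for _ in range(N_PITCHERS):
    first = observed_rate(FIRST_LOOK)
    following = observed_rate(FIRST_LOOK)  # what the next 90 batters would look like
    if first > PULL_ABOVE:
        pulled_first.append(first)
        pulled_next.append(following)      # never seen in real life: he was pulled
    else:
        kept_first.append(first)
        kept_next.append(following)

print(f"pulled: first 90 = {mean(pulled_first):.3f}, next 90 = {mean(pulled_next):.3f}")
print(f"kept:   first 90 = {mean(kept_first):.3f}, next 90 = {mean(kept_next):.3f}")
```

Even though nobody in this toy league is worse than anybody else, the pulled group's observed rate looks bad, and both groups land near the shared true rate over their next 90 batters.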


#9    Guy      (see all posts) 2010/12/17 (Fri) @ 16:36

Are we seriously discussing whether baseball talent is normally distributed around the talent level of the median MLB player?  Isn’t it more appropriate to discuss, say, whether this is so nuts that it should subject JC to a one year suspension during which he is ineligible to comment on any issue related to baseball player talent and valuation? 

I wonder what JC thinks the REST of the curve looks like?  He believes there are more average players than -20 runs players, and more -20 players than -40 players, etc.  Does the curve ever turn up?  Or are there still fewer players who could provide average production for a AA team?  You wonder how community colleges even manage to field a team—those guys must be scarce as hen’s teeth! 

It boggles the mind......


#10    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 16:52

Just trying to make a lemonade with what I saw.  The only other option is a tequila.


#11    Rally      (see all posts) 2010/12/17 (Fri) @ 16:57

Well, in major league baseball there are probably more -20 players than -40 (as long as we exclude pitchers) and when you go by playing time it will look like a normal distribution.

I haven’t read the link to JC’s (have not found that a productive use of my time).  Just reading your comment #9, Guy, I’m thinking you have to be completely making up a strawman.  Nobody could actually be that nuts.  Then I remember some of JC’s other whoppers, so maybe that is what he’s saying.

I think to get a normal curve of baseball talent outside MLB, you could construct a talent pool that looks like this, from 0-100, from males in the baseball playing age range (20-40):

0 - paraplegic
50 - played in backyard as a kid, probably not good enough to make high school team, in average physical shape
75- former high school player, may not be in great shape anymore
99 - Div I college player
99.5 - minor league player
99.9 - average major leaguer
99.99 MLB star

That might look like a normal distribution, with a lot more people in the 50 range than the disabled, low end of the population.  Numbers are for illustration and not to be exact.

Making some quick estimates of population and # of MLB stars, the last number should be more like 99.9999, but hopefully you get the picture.


#12    J-Doug      (see all posts) 2010/12/17 (Fri) @ 17:00

Nice job making the point in a very concise and pithy manner, Tango.

I just don’t get how JC gets away with this. “There’s no such thing as a replacement player in this sample that I drew that deliberately excludes the vast majority of replacement level talent” is what he’s saying--and he gets a book out of this.


#13    J-Doug      (see all posts) 2010/12/17 (Fri) @ 17:02

Nobody could actually be that nuts.

Plenty of people are this nuts. They’re just not usually published.


#14    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 17:07

Neyer linked to JC, and the first three comments out of the gate all echo what we are saying in this thread.  I just saw JC’s name, and I wanted to take a little breath before reading it.  Ok, I’m going back there to see how he handles this....

Wow, he should be a politician:

When selecting sample inclusion cutoffs there is a tradeoff. Setting cutoffs high excludes some player observations, but gets a good representation of true talent. Setting cutoffs low includes more player observations, but because of so few observations it’s less able to capture true ability. I tried both lower and higher cutoffs, and the results looked somewhat similar. 100 PAs is quite low for measuring ability, and as I state in note 61, on page 244 of my book, “Lower cutoffs yielded similar results.” Cutting observations does not induce a distribution that looks like the right tail of a normal distribution, as replacement-level theory assumes.

Man, Neyer sent JC some really smart readers.  Pete here captures it:

I ran the numbers using Fangraphs WAR from 2010. About 28% of pitchers and 37% of hitters were below replacement level. Which is about 33% of all players. However, this includes pitcher hitting, which is almost universally below replacement because Fangraphs does not use a positional adjustment for it.

Also the 28% of pitchers made up 13% of innings pitched and the 37% of hitters made up only 16% of plate appearances.

But when considering these numbers you have to take several things into account.

1) If you take out pitcher hitting, about 28% of pitchers and hitters were below replacement level. And they made up 13% of innings pitched and 14% of plate appearances.

2) Most of these below replacement performances are based on very small sample sizes. The pitchers below replacement level averaged 31.2 innings and the hitters averaged 86 PA. So most of those hitters wouldn’t even make your talent distribution graph. And probably about half of those pitchers wouldn’t show up on your chart.

3) With many players right around replacement level, you would expect a good portion to have observed value below replacement level though their true value may be right around or slightly above replacement level.

4) About 60% of hitters and 75% of pitchers below replacement level were below by only a third of a win. Making it highly likely that many of them are actually above replacement level in talent.

5) If you don’t believe replacement level exists then why do you base your argument on the fact that Fangraphs set their replacement level too high? (Rally WAR at Baseball Reference and WARP at Baseball Prospectus both use lower replacement levels).

And Larry does another great post:

Pete really starts to move the ball forward on this. I, too, suspect that part of the answer is sample size – i.e., a lot of the players performing below “replacement level” really have replacement level ability, but this fact is masked by luck in small samples. Though I doubt that this explains all of the problem.

It’s frustrating that this is such a little discussed issue though – I think J.C. has a real point, but it bugs me that he doesn’t go deeper into his analysis (maybe he does in the book, I don’t know), and is IMO overly categorical in his conclusions.

And, JC just keeps doubling-down:

The best estimate of sub-replacement makeup of the league is about 1/3—that’s a lot of performance below replacement-level, especially given that so much superior talent should be available at the league minimum. Why not just raise replacement level? Because replacement-level as a concept serves no useful purpose if replacement-level players are not scarce.

***

Fold, dude, fold.


#15    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 17:09

Rally: you should read the comments… just skip over JC’s.  All of the readers there make good points.


#16    Martin Monkman      (see all posts) 2010/12/17 (Fri) @ 17:13

Guy @ #9:  The fundamental problem is that J.C. has made his data fit his belief system (which is, and I quote: “I don’t believe replacement players are cheap and abundant.") I think the goal of the discussion here is to point out the flaws in his reasoning, if not through thought experiments (your approach) then through some analysis of the data (where I’ve been going).  Either path is worth taking, and not only for the exercise.

Tango @ #8:  I’m trying to find a way to restate what I meant—I still think my basic idea is correct (these players are, collectively and/or on average, less talented than those who get to pitch more often; they are indeed replacement level players) but my articulation is poor.  I’ll save my comprehensive response until I have run some more analysis, including (as you challenge) the very bottom end of the “regression to the mean”.


#17    Guy      (see all posts) 2010/12/17 (Fri) @ 17:14

"Just reading your comment #9, Guy, I’m thinking you have to be completely making up a strawman.  Nobody could actually be that nuts.”

I would never… ok, I might, but in this case I really didn’t take his remarks out of context or construe them in an extreme way.  He has consistently argued, in his book and on his blog, that talent is normally distributed.  He very clearly argues, for example, that what we would call replacement level players are more scarce than players of average talent.


#18    Rally      (see all posts) 2010/12/17 (Fri) @ 17:50

(Rally WAR at Baseball Reference and WARP at Baseball Prospectus both use lower replacement levels).

Actually, my replacement level is higher than Fangraphs (i.e. my league has fewer WAR).  BPro is lower, and though he doesn’t publish his projections, what I remember about MGL’s are that his replacement level is higher than mine.

#17, I believe you Guy.  Just because I’m familiar with your posts and with JC’s.

But it is mind-boggling that he can actually convince himself in these arguments.  Even though I’ve seen him do it before, I still have to do a double take. 

Imagine you are in a political debate, maybe on the pros and cons of a slightly higher marginal tax rate.  One side is going to stress the need to be fiscally responsible, the other stresses the need to stimulate jobs.  You’re prepared to argue along one of those lines.

And your opponent starts off the debate by claiming that lower tax rates will cause vorpal bunnyrabbits that will eat up all the turkeys on America’s farms.  And he says it with a straight face, throws his credentials around and acts like you must be an uneducated moron to even question his expertise on the scientific consensus of the vorpal bunny aspect of taxation.

That’s how I feel when I read JC.  I’m taken off guard too much to offer a rational response.


#19    Tangotiger      (see all posts) 2010/12/17 (Fri) @ 17:55

I think MGL uses -18 runs per 150G for nonpitchers.  That’s about -17 wins for nonpitchers.

I don’t know what he uses for pitchers.  Probably something like -1 run for starters and -0.5 runs for relievers or something.  That’s about -13 wins for pitchers.  So, MGL probably sets repl level at -30 wins from league average.  Pretty close to Rally.


#20    Guy      (see all posts) 2010/12/17 (Fri) @ 18:15

If the worst player in baseball is worth $12M, and baseball players never get old, why not vorpal bunnyrabbits?  In JC-world, everything is possible.....


#21    Rally      (see all posts) 2010/12/17 (Fri) @ 18:36

If I ever find myself posting in a discussion with JC, I’ll have to use the Chewbacca defense.  Not likely though because I refuse to subject my comments to his moderation.


#22    Pete      (see all posts) 2010/12/17 (Fri) @ 19:26

Rally/18, yeah I realized that I had flipped your replacement level and Fangraphs around shortly after I posted, but I was at work so it took a while to correct it.

I also just posted some numbers over there about the actual James quote, about there being 20 players 10% below average for every player 10% above average, using data from 2006-2010.  I pointed out that by only using 1 year of data, JC neglects that the talent at the bottom turns over at a much higher rate than the talent at the top. (Though it’s currently awaiting moderation).


#23    MGL      (see all posts) 2010/12/17 (Fri) @ 21:03

I’ve said this many times before:

Is it that hard to determine replacement level, at least over a sample of several years?  I don’t think so.

Just write down all the players who were signed as FA for, say, less than 1mm (you can make it .5mm I guess), and then look at their average performance.  Isn’t that pretty much replacement level?  Using minor league players is trickier.  They all get paid around the same (league min) so you would kind of have to know who was or was not considered a prospect beforehand.

As far as J.C.’s assertion:

Are these even debatable?

1) The pool of talent to draw from in MLB is the most talented players of all the adult male baseball players in the world.

2) That talent pool is normally distributed.

3) You start by drawing from the extreme right and then go from there until you have all the players you need.

4) For every player on the right, there have to be more players on the left. What the ratio is (the slope of the curve, which is defined by the SD of the entire curve), I have no idea.

Period.


#24    Guy      (see all posts) 2010/12/18 (Sat) @ 18:26

JC responds to Colin:  http://www.sabernomics.com/sabernomics/index.php/2010/12/agreeing-and-disagreeing-with-bill-james/.  The whole discussion is so surreal.  JC’s claim is that the players we call “replacement level” are in fact quite scarce, not plentiful.  But his evidence for this “scarcity” is that 25-30% of all PAs are provided by this kind of player—which kinda makes it sound like there might be a lot of these guys.  How do you even argue with logic like that?


#25    MGL      (see all posts) 2010/12/18 (Sat) @ 22:36

It depends on how broadly you want to define replacement level (as far as their scarcity is concerned).  And “scarcity” where?  MLB?  Minors?  Both?

You can conceivably say that there are NO replacement level players in MLB - that replacement level is the talent level of the best players in the minors who are not prospects waiting to be called up.  Or you can define them as the bottom 1% or 5% (or whatever) in MLB.  It is not the point whether and where they are scarce.  It is simply the point that if you want or need a player and don’t want to pay more than the min salary or so, at what level will these players be?  The number of these players available should be self evident based on the idea of the curve that BJ explains and we are discussing.  And the fact that for every player at level X there has to be more than one player at level “less than X” should be obvious and indisputable.


#26    Martin Monkman      (see all posts) 2010/12/21 (Tue) @ 16:10

Fresh squeezed: “Agreeing with Bill James”

http://bayesball.blogspot.com/2010/12/agreeing-with-bill-james.html


#27    Tangotiger      (see all posts) 2010/12/22 (Wed) @ 17:20

I’ve never seen 100% support from Primates on anything.  There’s always someone who is contrarian almost seemingly for its sake.  Not in this case:

http://www.baseballthinkfactory.org/files/newsstand/discussion/whats_wrong_with_replacement-level_valuing_of_players/


#28    Martin Monkman      (see all posts) 2010/12/23 (Thu) @ 00:54

A few more drops of lemonade, including a response to Tango @ #8.
http://bayesball.blogspot.com/2010/12/era-distribution-curve.html

I fear I may have gone down a rabbit hole from which there is no exit—I noticed a few other interesting characteristics in this distribution, which I will write up in the days ahead.

And any comments/critique/questions are welcomed.  (As per Matt’s comment #19 at http://www.insidethebook.com/ee/index.php/site/comments/non_saber_saber/,
“Just don’t get personal.” 8-) )


#29    MGL      (see all posts) 2010/12/23 (Thu) @ 03:01

Monk, from your above blog post:

So let me clarify.  The average level of skill of the pitchers who faced fewer than 100 batters in 2009, is an average ERA of 8.72. Although Tango is correct in his assertion that the poorest performers would regress upwards, by the same token the best pitchers (some of whom managed a 0.00 ERA in their short stint) would get worse. But if we were to let all 188 of them continue to pitch, we can be 95% certain that the “true” ERA of the group would end up somewhere between 6.92 and 10.52.

Unless I am not understanding what you are saying, you are 100% wrong.  You are forgetting about regression toward the mean.  Those pitchers are not an unbiased sample.  We have Bayesian information about them.  We know what population they come from (major league pitchers or perhaps some subset of them such as relievers or young pitchers or old pitchers, etc.) and we know the mean of that population.

Wow, isn’t the name of your blog “bayesball?” The regression toward the mean is a short-cut for a Bayesian analysis of the data in order to infer the likely true talent of these pitchers.

Do you read this blog much?  We hammer into people’s heads 100 times a day that any group of players (or one player) that is above or below the mean of the population they belong to likely has a true talent closer to the mean than their actual performance in any period of time.  And that (estimated) true talent is also the performance level we expect them to achieve in any other random, unbiased sample other than the one we already observed (assuming that their true talent does not change due to age or injury or something else).

Any group of major league pitchers who has an 8.00 ERA in some number of innings less than a whole lot is not going to have a true talent anywhere near 8.00, just like no group of pitchers with a performance of 1.00 (ERA) is going to have a true talent anywhere near 1.00, again, assuming that the average number of innings per player is not really large (like a thousand or more innings).

Again, I apologize if I am not understanding your post that I quoted above.


#30    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 13:42

Martin is having a tough time posting, and this is his post.

(Please wait a second.)

==========================================
MGL/29,

This is what real-time peer review is all about—either I’m wrong or I haven’t explained it very well. Or perhaps there’s some point in between. Either way, the critique is welcome. And I think that the terms you’ve used will help me clarify what I’m trying to say, and have forced me to start thinking about other, hopefully better, ways to address the analysis.

First, though, I will admit I took a frequentist look at the bottom of the BFP distribution. In spite of the name of my blog, I’m not wedded to one approach or the other. And “Frequentist Ball” just doesn’t have the same ring. (happy face emoticon)

In answer to a couple of your questions, yes, I read The Book blog (long-time lurker, recent poster), and I understand regression to the mean. As recent evidence of both, I’ll point to my blog entries over the past few weeks where I responded to Tango’s challenge and demonstrated regression to the mean using slugging average (at the top and bottom of the SLG range).

My overall point with my recent post on ERA distribution is that there are many players who don’t pitch much and who are, on average, not as good as those who are out there more often. I was attempting to demonstrate that they are a distinct population from the rest of the pitching fraternity.

There is no argument that the <100 BFP pitchers will, as individuals, regress to the mean. But my hypothesis is that it’s not the league mean to which they would regress—it’s the mean of the population of less than 100 BFP pitchers, which is several points higher.

As I saw in the SLG regression to the mean analysis I did, it was the bottom end of the table (the players with poorer-than-average performance who we would expect to be regressing up to the mean) where things start to get interesting. In particular, there’s a challenge in analyzing this group: many of the poorer performers get a limited number of chances (i.e. small sample size at the individual level). This has two impacts for the analysis. First, it reduces our confidence in the measure (SLG or ERA, or whatever) as indicative of any one individual’s performance. Second, many of those players don’t come back for a second season so we can’t see how much regression they demonstrate.

Treating the group of infrequent performers as a whole—that is, our sample size is now the number of individuals, not the number of opportunities any one individual has—and making inferences about the sub-population average is one way to address this. This is what I was trying to get at with my “multiple pennies” example.

With all that said, I’m open to ideas about other methodological approaches to analyzing (for want of a better term) replacement-level players. 


#31    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 13:49

But my hypothesis is that it’s not the league mean to which they would regress—it’s the mean of the population of less than 100 BFP pitchers, which is several points higher.

See, that’s your problem.  The PA and ERA are not independent.  So, while you can say “my pitcher is drawn from all pitchers who were allowed to pitch to less than 100 batters”, and then treat that as your population, the issue is that the ERA generated in that population is not unbiased.  Because the determination of whether to allow the pitcher to face 100 batters is based on what his ERA was to that point.

What you CAN do is look at their batting average (as hitters). 

I think you are where I was way in the beginning of learning this.  So, congratulations on being here.  We’ll guide you, and all the others who are in this same boat we once were in, together.

***

Here’s another example: to what population of players do you regress Micah Owings’ SLG?  Are you going to look at all pitchers who have pinch hit?  No!  Because one of the reasons he pinch hit was the observed high SLG.

You have to be careful that what you are selecting on is independent (or as independent as possible) to what it is you are measuring.


#32    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 13:51

it’s the mean of the population of less than 100 BFP pitchers

And if you think about it, what you really want to say is this:

it’s the TRUE mean of the population of less than 100 BFP pitchers

Except we don’t have the true mean of this group.  We have their observed mean.  A mean which is biased.

The reason we can get away with using the league mean (all players) as a true mean is that we are presuming that 189,000 PA in a league is sufficient to establish the true mean for that population.
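
To put a rough number on why 189,000 PA is treated as enough, here is a sketch of the binomial standard error of an observed rate. The .330 rate and the 90-batter stint are illustrative assumptions; only the 189,000 figure comes from the comment above.

```python
import math

def standard_error(rate, trials):
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(rate * (1 - rate) / trials)

ASSUMED_RATE = 0.330   # illustrative on-base rate, not a figure from the thread
LEAGUE_PA = 189_000    # league plate appearances cited above
SHORT_STINT_PA = 90    # hypothetical brief look at one pitcher

league_se = standard_error(ASSUMED_RATE, LEAGUE_PA)
stint_se = standard_error(ASSUMED_RATE, SHORT_STINT_PA)

print(f"league-wide standard error: {league_se:.4f}")
print(f"90-batter standard error:   {stint_se:.4f}")
```

This only measures random noise. It says nothing about the selection bias in who gets to face fewer than 100 batters, which is the separate problem described above.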


#33    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 14:56

Martin tries to explain his reasoning (still wrong) about what he meant:

http://bayesball.blogspot.com/2010/12/era-distribution-curve.html

He uses a coin analogy.  Let me try with a dice analogy.  You have three dice, each colored: green, yellow, red.

If you roll the green die, you “win” when you roll a 1-4.

If you roll the yellow, you “win”, when you roll a 1-3.

If you roll the red, you “win” when you roll 1-2.

You roll each die 30 times.  And you end up with this sample random record:
21-9: Green
11-19: Yellow
14-16: Red

You give the record, but not the color to your friend, and you ask him: “which two dice should I keep rolling?” He’ll choose Green and Red.

The Yellow has an OBSERVED 11-19 record.  That is NOT his true rate.  Indeed, since we actually know his true rate, we would regress that record 100% to a 15-15 record.

Now, if we didn’t know the true rates of each color, we can use Bayes, if we know one is .667, the other is .500 and the other is .333 and try to figure out what Yellow’s true rate is given that we’ve observed 11-19 on Yellow, 14-16 on Red, and 21-9 on Green.

But, you cannot just treat Yellow’s record as a CENTER of 11-19, with an uncertainty of the true estimate around that mean.  That’s what Martin did, and that’s why he’s wrong.
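
The Bayes step for the dice can be sketched directly. This assumes the friend knows the three true rates (.667, .500, .333) but not which color has which, and treats all six possible assignments as equally likely beforehand; that uniform prior is an assumption added here.

```python
from fractions import Fraction
from itertools import permutations

RATES = (Fraction(2, 3), Fraction(1, 2), Fraction(1, 3))
RECORDS = {"Green": (21, 9), "Yellow": (11, 19), "Red": (14, 16)}  # (wins, losses)

def likelihood(rate, wins, losses):
    # Binomial likelihood without the constant term, which cancels when normalizing.
    return rate ** wins * (1 - rate) ** losses

colors = list(RECORDS)

# Weight of each way of assigning the three known rates to the three colors.
weights = {}
for assignment in permutations(RATES):
    weight = Fraction(1)
    for color, rate in zip(colors, assignment):
        weight *= likelihood(rate, *RECORDS[color])
    weights[assignment] = weight

total = sum(weights.values())
yellow_index = colors.index("Yellow")

# Posterior probability of each candidate true rate for Yellow.
yellow_posterior = {rate: Fraction(0) for rate in RATES}
for assignment, weight in weights.items():
    yellow_posterior[assignment[yellow_index]] += weight / total

yellow_mean = sum(rate * p for rate, p in yellow_posterior.items())

for rate, p in yellow_posterior.items():
    print(f"P(Yellow's true rate = {float(rate):.3f}) = {float(p):.4f}")
print(f"Posterior mean for Yellow: {float(yellow_mean):.3f}")
```

The answer is a set of probabilities over the three known rates, shaped by all three records at once, rather than an interval centered on Yellow's own 11-19.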

***

By the way, we shouldn’t use ERA.  We need to be using rates, like win% or OBP.  Or, as a good shorthand: wOBA.

ERA is sort of like OBP^2.  That’s why it’s going to naturally skew toward a higher ERA.


#34    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 18:16

This is from Martin

=============================================
Here’s a summary of what I’ve learned about low BFP pitchers (and by extension, low PA batters).

1. There’s bias in the sample.  My assumption had been that the bias would be eliminated with the large number of players, but this is wrong. The error in this assumption is where my coin flipping analogy falls apart.  If “success” is a head, then the coin that comes up heads >0.5 will keep being flipped, possibly with enough flips to no longer be part of the “low flip” group (over that arbitrary threshold).  Meanwhile, a coin that runs tails more often will get pulled from the trials quickly, and end up <0.5 and with few flips.  Thus, as a group, the coins with a smaller number of flips will end up looking worse than those that keep getting flipped.  Selection bias deviously rears its ugly head.

2. My second assumption had been, as a result of assumption #1, that the low BFP pitchers have a different population mean.  Back to Tango’s dice analogy: we are able to regress Yellow to 0.5 since we know that’s the true probability (in the language of probability, the prior).  But with freshly minted MLB ballplayers, we don’t have a prior for the individual, so we have to start with the league mean.  We make the assumption, until we have enough data points to judge players accurately as individuals, that every individual is average.  In order to estimate their true talent, we blend the limited observations with the league mean (i.e. regression to the mean).

3. wOBA is a much better measure to use for this analysis than ERA.  ERA throws gasoline on my already skewed fire, by magnifying the impact of each additional run or out.  (e.g. the difference in ERA between a pitcher who allows 1 earned run over 3 innings pitched versus a second who goes 2 ER/3 IP is 3.00 vs 6.00, a doubling of the measure.)
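
The blend in point 2 is often written as adding some number of league-average trials to the observed record. The sketch below shows that form; the regression constant of 200 and the sample numbers are placeholders for illustration, not values from this thread.

```python
def regress_to_mean(observed_rate, trials, league_mean, regression_trials):
    """Blend an observed rate with the league mean.

    regression_trials is the number of league-average trials added to the
    observed record; the value used below is a hypothetical placeholder.
    """
    return (observed_rate * trials + league_mean * regression_trials) / (
        trials + regression_trials
    )

LEAGUE_MEAN = 0.330       # illustrative league rate
REGRESSION_TRIALS = 200   # hypothetical constant, chosen only for the example

# A pitcher who allowed a .400 rate over 90 batters versus over 900 batters.
short_stint = regress_to_mean(0.400, 90, LEAGUE_MEAN, REGRESSION_TRIALS)
long_stint = regress_to_mean(0.400, 900, LEAGUE_MEAN, REGRESSION_TRIALS)

print(f"estimate after 90 batters:  {short_stint:.3f}")
print(f"estimate after 900 batters: {long_stint:.3f}")
```

With few trials the estimate stays close to the league mean; with many trials it moves toward the observed rate.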

Thanks to Tango and MGL for patiently working through this so I get it straight. Time to edit my blog.


#35    Tangotiger      (see all posts) 2010/12/23 (Thu) @ 18:18

Good job.

For #3, I’m not sure that you described it enough.  I guess it’s fairer to say that ERA is a ratio of sorts, rather than a rate.

