THE BOOK--Playing The Percentages In Baseball

Saturday, October 02, 2010

Two kinds of luck

By Tangotiger, 09:55 AM

There are two kinds of luck: pure luck, and talent-driven luck (or “make your own luck”).  Let me describe the difference.

Each of us, in whatever actions we are performing (throwing a pitch, driving a car, typing), has a talent level.  We’ll even say that, at a point in time t, we have a fixed true talent level TT.  TT(t) if you will.  When you apply that TT(t), you will NOT OBSERVE TT(t).  That’s because we are not automatons.  We are people.  What we WILL observe is some performance that, had we repeated those actions a million times, would have centered around TT(t), with a normal distribution around that centering point.  If you randomly pick out one of these points, this is talent-driven luck, or “make your own luck”.  It counts as something you did because you are the causative agent.  It doesn’t REPRESENT you, but it is an INDICATOR of you.  Given enough of these indicators, it will represent you, with a certain uncertainty level (the more indicators the less uncertainty).

Now, pure luck has nothing at all to do with your talent level.  You are struck by lightning.  You are a pitcher for the league’s worst offense.  You bet on double-zero.  You observe results to these external actions.  You are hospitalized, you have an 8-16 record (with a 2.74 ERA), you made a million dollars.  This has nothing to do inherently with you, even though you are the beneficiary or victim of these actions.

Suppose that in baseball, we only recorded runs scored or allowed for a game to determine the winner, but after that, we discarded the runs numbers, and only kept track of wins, and who was pitching in that game.  And suppose that pitchers always pitched complete games.  So, you can have someone with an 8-16 record, and we have NO IDEA how much of that was due to his true talent level, and how much of that was due to pure luck.  All we know is that he was a participant in those results.  We also know that half of the game is offense and half is defense, and that the pitcher has no influence on the offense.  That 8-16 record is loaded with uncertainty.  We have talent-driven luck and pure luck.

If this pitcher had a career 300-250 record, and he pitched for many teams, we now feel better.  We feel better because the sample size increased, and his talent level, and the luck from his talent level, is driving the record.  The pure luck, the noise, gets overwhelmed the more events you have.  If you get struck by lightning 10 times, maybe you have a lightning rod up your butt.  If you bet double-zero 5 times and win each time, maybe you have the fix in.
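To put rough numbers on how the noise shrinks with more events, here is a minimal sketch.  It assumes (my simplification, not anything measured) that every decision is an independent coin flip with the same win probability, so pure luck and talent-driven luck are lumped into one binomial.  The function name and the .500 probability are made up for illustration; the 24 and 550 decisions come from the 8-16 and 300-250 records above.

```python
import math


def win_pct_standard_error(true_win_pct: float, decisions: int) -> float:
    """Standard error of an observed win% over `decisions` independent
    decisions, each won with probability `true_win_pct` (an assumption)."""
    return math.sqrt(true_win_pct * (1.0 - true_win_pct) / decisions)


# Hypothetical .500 pitcher: one 8-16 season versus a 300-250 career.
season_se = win_pct_standard_error(0.5, 24)
career_se = win_pct_standard_error(0.5, 550)

print(f"one season (24 decisions): +/- {season_se:.3f}")
print(f"career (550 decisions):    +/- {career_se:.3f}")
```

Under those assumptions the one-season record carries an uncertainty of roughly .100 of win%, and the career record roughly .020.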

Now look at batted balls in play.  Suppose that we KNOW (god told us in her wisdom) that results from batted balls are almost entirely due to the talent level of fielders, and virtually none by the pitcher.  But, the pitcher is the agent that delivers the ball.  He rolled the dice.  The fielders are the ones that determine hits and outs (in this example).

Now take a more realistic example, in which the pitcher has some complicity, as do his fielders.  And let’s say we know (god again, smart girl) that pitchers/fielders are equally responsible.  Like, for example, offense/defense equally responsible for winning and losing.  But, like in the example earlier of us not knowing how many runs are scored or allowed and we just know how many wins and losses that the pitcher participated in, we have a similar situation that we don’t know why a hit or out was recorded.  All we know is that the pitcher was there when 36% of BIP were hits, and that the pitcher was there when his team won 33% of its games.

What do we do with this pitcher?  What if we know how many runs his team scored?  What if we know how good his fielders actually were?  What if we know what his career BABIP or W/L record was instead, but we don’t know anything about that particular season?

We have talent-driven luck, and we have pure luck.  The first thing you have to decide is how you want to account for the pure luck in terms of apportioning responsibility to players.  A team wins 60 games in 162, and a pitcher has 27 wins in 35 decisions.  What do you do?  Now, you find out that this team happened to score 5 runs per game in his games, while scoring 2 runs per game in the other games.  Now what do you do?

These are not easy questions, and there are no easy answers.  It’s a question of philosophy.  You need to create your own personal framework to handle luck.  And then be consistent in that application.


#1          (see all posts) 2010/10/02 (Sat) @ 10:58

"Luck” is a very poor term for what you are labeling “talent-driven luck.” Luck is an external force out of the control of the player.  That we can’t measure something or don’t believe it’s properly representative of ability is not a good excuse for being sloppy with language.  The word “luck” has a lot of connotations that are not appropriate for “talent-driven luck.” If one chooses to be imprecise with that term, one can expect a lot of confusion and trouble communicating to follow.  And that’s not just bad luck.


#2    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 11:13

Give me a better word, and I’ll use it.  You can’t complain about me saying something wrong, and then not tell me the right answer!


#3    Guy      (see all posts) 2010/10/02 (Sat) @ 11:15

What about calling it “performance variation”?  I certainly think fans get the idea that players perform better on some days than others. 

Of course, separating this from “true luck” is no simple or obvious matter.  Even in Mike’s example of a pitcher throwing a hanging curve, it still only becomes a HR if the hitter tees it up right.  We may not want to call the hanging curve “bad luck,” but isn’t it “good luck” when the hitter fouls it off instead of depositing it in the seats, or hits a screaming line drive right at the 2B?


#4    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 11:32

The only point I was trying to make is that in one case, you have a true mean, and you have variations around that mean, and a point plucked from that distribution is luck.  You want to call it performance variation, you want to call it timing.  I’m open to whatever term we want.

Then we have pure luck, something that has nothing at all to do with the player, and yet we *track it by that player* !

We also have an in-between, where there is a true mean (a pitcher’s batted ball skill) and pure luck (what his fielders do).  And yet we track it by that pitcher.  That’s the issue.

I’ll be happy to use any words you guys want.


#5    Colin Wyers      (see all posts) 2010/10/02 (Sat) @ 11:51

I think the clearest distinction is between THAT player’s luck and SOMEONE ELSE’S luck.

A pitcher gives up more (or fewer) home runs than we think he “should,” given his past results and the observed spread of talent? He’s still the one who gave up the home run.

As for run support - he clearly (a bit murkier for the NL than the AL, but still) was not the one scoring or not scoring runs. That was the hitter’s luck.


#6    Peter Jensen      (see all posts) 2010/10/02 (Sat) @ 12:00

As Mike stated, almost nothing of what is being talked about here is “pure” luck.  And as I suggested in post 51 of the Halladay-Lee thread, almost everything that is being described as luck is “variation due to small sample size”.  The exceptions are “what his fielders do”, which is not pure luck, but a fielding bias (from the pitching point of view) plus some measure of variation due to small sample size of the fielders’ ability; and the batters’ skill, which may vary from the league average.  This should be thought of as a bias as well, but as the sample size increases the bias will likely decrease, unlike the fielding bias, which may persist if the fielders remain with the team.


#7    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 12:44

If you know nothing about the fielders or hitters, then from the pitcher’s point of view, it is pure luck.  Yes, in reality, these things are biases, because we know that a pitcher is married to his team, and by extension, his hitters, fielders, and bullpen.

If a pitcher was barnstorming team to team (say David Cone gets traded every week), then these components ARE pure luck.

Or if he’s traded year to year, and has a long enough career, then all those things are pure luck.

***

I like Colin’s distinction of saying “his luck” and “someone else’s luck”.

How about “the luck as the residue of his design”, and “the luck as the residue of someone else’s design”?

***

So, that’s where we are.  We have these two scenarios:

1. You know the team’s Won/Loss record in games when the pitcher was on the mound.  You know how many runs his team scored over 162 games.  You do NOT know how many runs the pitcher allowed, nor how many runs a team scored while he was pitching.

2. You know a team’s BABIP in games when the pitcher was on the mound.  You know a team’s UZR (or FIELDf/x runs) over the course of 162 games.  You do NOT know anything else.

What do you do with that W/L record?  What do you do with that BABIP record?


#8          (see all posts) 2010/10/02 (Sat) @ 12:56

As Colin mentioned, there is certainly the aspect of the performance of players overlapping or opposing each other.  That’s just the game.  I suppose you can call it “bad luck” for Felix that he has gotten poor run support from the Mariners awful offense this year.  To me that’s a different kettle of fish than figuring out what you called “talent-driven luck.”

I didn’t suggest another name because I don’t have a better one.  But let me tell you why I think that one doesn’t fit.

I think you/we are trying to do two things with that label.  One is to identify what it is--it’s talent driven.  I think you got that part right.  Performance is not equivalent to talent itself, but it’s driven by talent.  So that’s good.

But then I think you were trying to label what it was not.  And you chose the term “luck” for that, which implies an external force out of the control of the player.  I think you/we really want something else there to describe what that kind of performance is not.

I see three related and overlapping factors at play.

1. Things can be difficult or impossible to measure.  We can’t measure them but we still want to describe them or perhaps even to quantify them if we can in some way.

2. Skills may manifest themselves in very particular situations, on one pitch, in one at bat, or in one game, for instance, but not at the season or the career level.  As sabermetricians we tend to think of these kind of skills as “not real” or “luck”, but they are real and talent-driven.

3. If we can model something with a binomial distribution or a Gaussian distribution, and the results of our model comport well with reality within certain contexts, we tend to think of the underlying reality we are modeling as being driven by chance.  That a random model fits the data fairly well, within limitations, does not necessarily mean there is no skill involved.  It’s also easy for us to forget about the limitations of our model because we are pleased with the power of the model within the limitations.


#9    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 13:23

“That a random model fits the data fairly well, within limitations, does not necessarily mean there is no skill involved.”

Your whole post was fine except this.  Nobody is saying there is no skill.

The number of 0, 1, 2, 3, 4+ times reaching base games by Albert Pujols follows a standard probability distribution.

The number of 0, 1, 2, 3, 4+ times reaching base games by Jeff Francoeur follows a standard probability distribution.

It just so happens one guy’s mean is centered around .440, and the other guy’s mean is centered around .300.

But, if you pluck out any single PA from each batter, that’s luck.  It’s luck whether he got a hit or out.  But it’s luck of a loaded die.  Pujols’ die is loaded so he is safe 44% of the time.  Frenchy’s die is loaded so he is out 70% of the time.
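The loaded-die picture can be sketched with a plain binomial.  The 4 PA per game and the function name are my assumptions for illustration; the .440 and .300 are the two means above.

```python
from math import comb


def times_on_base_distribution(obp: float, plate_appearances: int = 4) -> list[float]:
    """Binomial probability of reaching base 0, 1, ..., n times in a game,
    assuming every PA is an independent roll of the same loaded die."""
    n = plate_appearances
    return [comb(n, k) * obp**k * (1.0 - obp) ** (n - k) for k in range(n + 1)]


pujols = times_on_base_distribution(0.440)
frenchy = times_on_base_distribution(0.300)

for k, (p, f) in enumerate(zip(pujols, frenchy)):
    print(f"on base {k} times: .440 hitter {p:.3f} | .300 hitter {f:.3f}")
```

Same shape of distribution for both guys, just a different loading on the die.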

Now, we don’t want to call it luck, fine.  Call it random variation around an estimated true mean.


#10          (see all posts) 2010/10/02 (Sat) @ 13:55

Re Tango/9,

1. A lot of people are saying that there is no skill. J-Doug, for instance, but he’s not the only one.

2. It’s not random variation due to luck or a loaded die.  It’s the skill of the batter and pitcher and how they physically and mentally execute with that skill in a given confrontation.  That it happens to follow a binomial probability distribution under certain circumstances and within proscribed limitations is convenient for analysis.  That in no way proves that the underlying reality is random.  Saying that it truly is random ignores both the reality of the batter-pitcher confrontation and the limitations outside which the binomial probability distribution does not hold.


#11          (see all posts) 2010/10/02 (Sat) @ 14:00

Re-reading #9, I see maybe where a point of clarification is needed.  When I say that people are saying there is no skill involved, I don’t mean people are claiming there is no skill in the game of baseball.  I mean that they are claiming there is no skill beyond the “loading the dice”, no skill beyond giving one player a small mathematical advantage over another player, but that in large part each individual confrontation is decided by chance.  That’s what I disagree with.


#12    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 14:01

“That it happens to follow a binomial probability distribution under certain circumstances and within proscribed limitations is convenient for analysis.”

That’s not totally true.  If Pujols was a “True” .440 OBP guy, and always was .440 and never wavered from .440, we would get a certain distribution.  One explained by the binomial distribution.

Now, there are other forces at work in reality.  The batter himself is a human being.  The conditions he faces are not static.  This adds to the spread we would expect from a pure binomial.

And what do we actually see?  Well, we see something that is almost a perfect binomial, as if he were a pure .440 unchanging as a player and facing static conditions.

Not exactly of course.  But close enough that, for all intents and purposes, we can get away with it.  To treat him as a true human with all the extra conditions does nothing but slow us down considerably in the analysis.  We can get 99% of the way there by using the binomial.

The theoretical objection is fine.  But practically speaking, it’s an almost non-issue.
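A rough way to size the extra spread, as a sketch only.  Assume the batter’s talent is redrawn independently each game with mean .440 and some game-to-game standard deviation, and stays fixed within the game.  The .030 swing, 4 PA per game and 150 games are all made-up inputs, and the function name is hypothetical.  Mixing a binomial over a random p gives a per-game variance of m·p(1-p) + m(m-1)·sd², where m is PA per game.

```python
import math


def season_obp_sd(mean_obp: float, game_talent_sd: float,
                  pa_per_game: int = 4, games: int = 150) -> float:
    """SD of a season's observed OBP when talent is redrawn each game.

    Per-game variance of times on base for a binomial mixed over a random p:
        m * p * (1 - p) + m * (m - 1) * sd**2
    Games are assumed independent (an assumption of this sketch).
    """
    m = pa_per_game
    per_game_var = m * mean_obp * (1.0 - mean_obp) + m * (m - 1) * game_talent_sd**2
    return math.sqrt(games * per_game_var) / (m * games)


static_sd = season_obp_sd(0.440, 0.0)     # unchanging .440 hitter
varying_sd = season_obp_sd(0.440, 0.030)  # talent wobbles game to game

print(f"static talent:  +/- {static_sd:.4f}")
print(f"varying talent: +/- {varying_sd:.4f}")
print(f"ratio: {varying_sd / static_sd:.4f}")
```

With these invented inputs the spread widens by well under 1%.  If the swings in talent lasted for weeks rather than a single game, the extra spread would be larger, so this only covers the game-to-game case.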


#13          (see all posts) 2010/10/02 (Sat) @ 15:37

“Not exactly of course.  But close enough that, for all intents and purposes, we can get away with it.  To treat him as a true human with all the extra conditions does nothing but slow us down considerably in the analysis.  We can get 99% of the way there by using the binomial.

The theoretical objection is fine.  But practically speaking, it’s an almost non-issue.”

I am not objecting to people using the binomial distribution because it works.  I use all sorts of models simply because they work.

There is a problem if people depend too much on the models and don’t recognize when reality diverges from the model.  That’s not something I typically see you do, so that’s not really a caution I’m directing at you, but it is something that needs to be held in mind.

Secondly, there’s a problem if you’re talking to other people about the reality and you act to them as if your model is completely interchangeable with reality.  It’s not.  While the binomial model of Pujols as a true .440 OBP guy works just fine for most of your analysis, that’s not the reality of how he functions as a batter.  He is a human with all the “extra conditions.” We don’t have to treat him that way as long as what we do works.  But when you’re talking about the Cy Young award, for instance, how the model differs from reality may become important.

Your model may work fine for 99% of what you want to do.  I’m not suggesting you abandon it.  However, that does not mean it works fine for 99% of what everyone in the world wants to do 99% of the time.

It’s a non issue for you when you’re examining the kinds of questions you tackled in the Book.  There are whole areas of baseball analysis for which the “extra conditions” become very important.  The distinction is also very important to many fans when they are discussing the narrative of the game.


#14    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 15:54

This is reality: at time t, Pujols has a certain True Talent level, which is TT(t).

Each time he is thrown a pitch, say 2500 times, he has 2500 different TT(t).  We can estimate his mean TT(t) fairly easily. Let’s say that the average in 2010, that is TT(1) to TT(2500), is .440.

Now, at TT(1) we may know more.  Maybe he has some dirt in his eye, maybe he has a sore butt, maybe whatever.  So, at TT(1), we estimate it’s .360 with an uncertainty of +/- .060.  And we proceed over all 2500 PA he has faced.

Let’s say we end up getting a mean of .440, with an uncertainty of +/- .013.  What would we have otherwise presumed with a static talent level?  Maybe .440 +/- .011 or something.

Like I said, it would be no big deal, really, in the grand scheme of things. 

(Just making numbers up.)


#15          (see all posts) 2010/10/02 (Sat) @ 15:59

And I’m saying that wOBA or OBP or whatever is only a high-level summary of his talent.

His talent is really the ability to swing the bat hard and in a certain plane with a certain pitch recognition and certain stance that gives certain plate coverage.  And you can break it down more than that.  And it varies by what the opponent is doing and the game situation, and he handles certain parts of that better than others.  For most analysis you do, breaking it down to its small constituent parts is of no value.  In fact, it would probably hinder what you do.

But that doesn’t make a single wOBA number the reality of who Pujols is, even if you allow that wOBA number to fluctuate based on what he had for dinner or how the tendons in his elbow are feeling.  That’s still a very abstracted model, useful in many ways but limited in others.


#16    Colin Wyers      (see all posts) 2010/10/02 (Sat) @ 17:07

“2. You know a team’s BABIP in games when the pitcher was on the mound.  You know a team’s UZR (or FIELDf/x runs) over the course of 162 games.  You do NOT know anything else.”

I have a question I find much more interesting (and more relevant). Let’s say you have the following data for a starting pitcher in a full season:

* The DER when he was pitching (in other words, that allowed by the pitcher and his fielders)

* The league average DER

* The implied DER for the BIP the pitcher allowed, based upon the defensive metric of your choice (for UZR that’s PZR; for DRS or TZL you can come up with the same thing in concept).

* The DER when he was pitching (in other words, that allowed by the pitcher and his fielders), adjusted based upon the quality of his team’s fielders, as measured by the defensive metric of your choice

Which of those do we think is the best reflection of what the DER of a pitcher’s BIP allowed would have been if his fielders had been average in performance (not talent) for those BIP?


#17    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 18:43

Mike, well, of course wOBA is just the manifestation of his inherent skillset.  What we care about are his physical tools, and his mindset, and how his brain works.  We’re never going to have that.  If your objections are on those grounds, then I agree with you. But, this is the case for every human on the planet.


#18    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 18:56

Since I find the analogy of fielding/pitching separation on BABIP in line with the offense/defense separation of W/L records, I will recast Colin’s question along those lines:

1 * The W/L when he was pitching
2 * The league average W/L
3 * The implied W/L for the baserunners the pitcher allowed, based upon the metric of your choice (say BaseRuns).
4 * The W/L when he was pitching, adjusted based upon the quality of his team’s hitters

Which of those do we think is the best reflection of what the W/L of a pitcher’s performance would have been if his hitters had been average in performance (not talent) for those PA?

So, the choice is between #3 and #4. 
- A vote for #3 is saying that you care more about theoretical results, regardless of what actually may have happened.  You don’t include certain things like sequencing.  All you care about is the performance.
- A vote for #4 is saying that there’s a lot of luck involved, and we’re going to leave most of that luck in, and just adjust for the bias of his teammates’ performance.

Now, Colin’s question is very specific, and forces you to answer #4.  He’s saying, how do you adjust for the bias of his teammates, while keeping everything else as-is.  So, keep the luck, keep everything, but remove the bias of teammates.

***

I’ll give you one question back: what if NOTHING CHANGED AT ALL.  And the 162 game season were simply replayed.  What results would you get?

Well, now the answer is #3.

This all goes back to how you want to count luck as it relates to the player’s ledger, the talent-driven luck and the pure luck.


#19    Colin Wyers      (see all posts) 2010/10/02 (Sat) @ 19:25

Okay, taking this piece by piece:

Since I find the analogy of fielding/pitching separation on BABIP in line with the offense/defense separation of W/L records, I will recast Colin’s question along those lines:

1 * The W/L when he was pitching
2 * The league average W/L
3 * The implied W/L for the baserunners the pitcher allowed, based upon the metric of your choice (say BaseRuns).
4 * The W/L when he was pitching, adjusted based upon the quality of his team’s hitters

And I already disagree. If we are JUST trying to split between offensive support and defensive performance, then the closest equivalent to #3 involves using the pitcher’s actual runs allowed in combination with an assumed level of offensive support. (Yes, this ignores fielding - this doesn’t serve as a very good analogy for how to handle fielding if we end up having to resolve fielding to solve the analogy.)

As for #4, Cameron perfectly points out the flaw in this approach:

We can simply look at the distribution of run support for a pitcher on any given team to see that the assumption of even performance is not going to be true. If we use the Yankees rotation as an example, we see that the Yankees averaged 5.30 runs per game this year. Their distribution by starting pitcher is below:

CC Sabathia – 5.89 R/G
A.J. Burnett – 4.29 R/G
Phil Hughes – 6.75 R/G
Andy Pettitte – 6.00 R/G
Javier Vazquez – 4.12 R/G

No pitcher is actually within half a run of the team average. Burnett and Vazquez are over a run per game lower than the overall total, while Pettitte is nearly three quarters of a run per game higher and Hughes is a run and a half per game higher. If you built a metric that worked off the assumption that the Yankees offense scored the same amount of runs per game when Vazquez was on the mound as when Hughes was on the mound, you’d probably draw some pretty inaccurate conclusions.
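The gaps described in that quote are simple subtraction from the 5.30 team average.  A short sketch, using only the figures quoted above (the variable names are mine):

```python
TEAM_RUNS_PER_GAME = 5.30

# Run support per starter, as listed in the quoted passage.
run_support = {
    "CC Sabathia": 5.89,
    "A.J. Burnett": 4.29,
    "Phil Hughes": 6.75,
    "Andy Pettitte": 6.00,
    "Javier Vazquez": 4.12,
}

# Difference between each starter's support and the team average.
gaps = {name: round(rpg - TEAM_RUNS_PER_GAME, 2) for name, rpg in run_support.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:15s} {gap:+.2f} R/G vs team average")
```

Every starter lands more than half a run away from the team figure, which is the point of the quote.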

So the trouble with number four is that you end up doubly penalizing a pitcher who plays on a team with an above-average offense but receives below-average offensive support. (And vice versa for a pitcher with good run support on a bad hitting team.) In other words, you have a not insubstantial subset of pitchers who you’re doing far, far worse on than if you hadn’t adjusted their numbers at all.

The correct answer as to what DID occur from the point of view of the defense is number three, isn’t it? And then to figure out what the offense did, do the same thing, but with actual runs scored and average runs allowed. Then you end up with the right amount of total wins and the appropriate split between the two units, right?

Which of those do we think is the best reflection of what the W/L of a pitcher’s performance would have been if his hitters had been average in performance (not talent) for those PA?

So, the choice is between #3 and #4. 
- A vote for #3 is saying that you care more about theoretical results, regardless of what actually may have happened.  You don’t include certain things like sequencing.  All you care about is the performance.
- A vote for #4 is saying that there’s a lot of luck involved, and we’re going to leave most of that luck in, and just adjust for the bias of his teammates’ performance.

Now, Colin’s question is very specific, and forces you to answer #4.  He’s saying, how do you adjust for the bias of his teammates, while keeping everything else as-is.  So, keep the luck, keep everything, but remove the bias of teammates.

I think number four is actually the worst answer of the set, when talking about pitchers. (Given that the split between defense and offense is 50-50, in that scenario number 2 is probably worse. For BIP outcomes, where defense seems - to me - to be more responsible on the whole, I don’t think it is.)

The question is, can we measure #3 accurately? And if we can’t, how can we measure #4 correctly?


#20          (see all posts) 2010/10/02 (Sat) @ 19:27

“Mike, well, of course wOBA is just the manifestation of his inherent skillset.  What we care about are his physical tools, and his mindset, and how his brain works.  We’re never going to have that.  If your objections are on those grounds, then I agree with you. But, this is the case for every human on the planet.”

We don’t have everything available to us that we would like, but we have A LOT more than just his estimated true talent wOBA over the last three years.

People are arguing at Fangraphs and at this site that any deviation from his true talent wOBA is due to luck, to forces outside of his control.  So we may as well award the Cy Young to the player that Marcel or CHONE says is the best at the end of the year, right?  Because everything else is luck-driven.

I’m saying that it’s primarily skill driven.  The player’s performance is not just related to skill by his skill establishing the mean for an otherwise random distribution.  That such a distribution functions fairly well, with some problems, for certain kinds of predictive or valuative analysis is very good.  But that does not mean that a random luck-driven distribution is what is happening in reality.

That distinction is fairly unimportant when you are using WAR as a salary model.  It might have some marginal effects, but overall it’s not a big deal.

That distinction is very important when you’re trying to figure out which player had the best performance in a given year and whether long-term true-talent wOBA is all that matters or whether the player actually had a huge hand in determining what happened in each and every plate appearance they participated in.  That is, do we credit them for something like half the full result of the plate appearance, or do we credit them only for half the difference between being a .350 true talent wOBA player vs. the league average .326 wOBA?  I’m convinced of the former.  Some people are convinced of the latter.  I believe they hold a mistaken view of baseball reality.


#21    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 21:54

"using the pitcher’s actual runs allowed “

But we don’t know his runs allowed in this analogy.

Remember what I’m trying to do: we don’t know how to separate pitching from fielding, hence, BABIP is a product of both.

And we don’t know how to separate offense from defense, hence, Win% is a product of both.


#22    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 21:57

“People are arguing at Fangraphs and at this site that any deviation from his true talent wOBA is due to luck, to forces outside of his control.”

It is outside his control, the deviation, for the most part, but it is centered around his true mean.


#23    Matthew Cornwell      (see all posts) 2010/10/02 (Sat) @ 22:09

So could we say that if a veteran pitcher has a true BABIP of .297, and if he deviates a little from that in a given season - say .293 or .301, that would be within his “talent-driven luck”, but if his BABIP is .260 or .320, that would be “pure luck”?


#24    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 22:16

Actually, control is not a good word.  He has his mean, and everything that happens is random variation around his true mean.  Call it what you will.


#25    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 22:26

Matt, go back to my analogy.

You have a pitcher with a true talent .667 win%.  And, this year (no injuries), he has a .333 win%.  What can we tell from that?

Well, we know that half of the record is impacted by noise (pure luck, the offense).  It had nothing at all to do with him.  The rest of the record is impacted by the random variation around his true talent.

It could be that his offense was great, and he simply had tons of bad luck.  Or, that his offense was horrible, and he had a bit of bad luck.

We don’t know.
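To get a feel for how far off a .333 season is, here is a sketch under loud assumptions: 24 decisions (borrowed from the 8-16 example in the post), and every decision won independently with probability .667.  That single probability lumps together the pitcher and whatever his offense does, so the result says nothing about which kind of luck is responsible.  The function name is made up.

```python
from math import comb


def prob_at_most_wins(wins: int, decisions: int, true_win_pct: float) -> float:
    """Binomial probability of `wins` or fewer wins in `decisions` decisions,
    assuming independent decisions with a fixed win probability."""
    return sum(
        comb(decisions, k) * true_win_pct**k * (1.0 - true_win_pct) ** (decisions - k)
        for k in range(wins + 1)
    )


# Chance that a true .667 pitcher goes 8-16 or worse over 24 decisions.
tail = prob_at_most_wins(8, 24, 2 / 3)
print(f"P(8 or fewer wins in 24 decisions) = {tail:.5f}")
```

Under these assumptions the chance comes out well under 1%.  That is how big the combined swing has to be; splitting it between his own random variation and his offense is exactly the part we can’t do without more information.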

This is how it would work with BABIP.  There’s a certain level of impact to the BABIP that is “assigned” to the pitcher that has nothing at all to do with the pitcher.  It’s noise to the pitcher.  (Or noise+bias if you know the quality of his fielders.)


#26    Tangotiger      (see all posts) 2010/10/02 (Sat) @ 22:37

This is the example I was using for the analogy I was making:

Suppose that in baseball, we only recorded runs scored or allowed for a game to determine the winner, but after that, we discarded the runs numbers, and only kept track of wins, and who was pitching in that game.


#27    Matthew Cornwell      (see all posts) 2010/10/02 (Sat) @ 22:39

Yeah, but after 10 seasons like the guy in my example, if he is close to his career mean, we wouldn’t need to worry about it too much.  Only if he is well outside his mean do I start to wonder about the “why’s” really.  I guess that is where the regression formulas come in.  If Halladay has a career BABIP of .295 and is at .291 this year, I am not concerned really with who to attribute what to - he is close enough to his mean for it not to mean too much to me.  If it is Tim Hudson posting a BABIP .05 lower than his career mean, then I want to know why and how.


#28    Colin Wyers      (see all posts) 2010/10/03 (Sun) @ 00:39

Okay, here’s what it comes down to for me. For a guy who pitches 150+ innings, you’re looking at an average of 600 or so BIP. For a full-time fielder (150 games) you’re looking at something like 4400 BIP.

Here’s the thing. To determine the pitcher/fielder BIP split, you need to figure:

* The odds that ANY fielder makes ANY out on the play

To figure the intrafielder split, you need to figure:

* The odds that the first baseman makes a GB out on the play

* The odds that the second baseman makes a GB out on the play

* The odds that the third baseman makes a GB out on the play

* The odds that the shortstop makes a GB out on the play

* The odds that the left fielder makes an out on the play

* The odds that the center fielder makes an out on the play

* The odds that the right fielder makes an out on the play.

So you have roughly seven times as many BIP for a full-season fielder as you do a full-time pitcher (actually I could bump that up to 200+ IP and the gap closes considerably, and I think that’s probably a closer match to a 150 game fielder). But you also have seven times the work to do!

So if:

Remember what I’m trying to do: we don’t know how to separate pitching from fielding, hence, BABIP is a product of both.

is true, then UZR/TZL/etc. is not valid for fielders any more than it is for pitchers. If it is untrue, and we can use UZR/TZL/etc. to make valid conclusions about the split credit, then we should do so for both pitchers and fielders.


#29    Tangotiger      (see all posts) 2010/10/03 (Sun) @ 07:48

I agree that we should use UZR and PZR.

And when I say “We don’t know”, I mean it only in the context of my illustration.

***

If you re-read my initial post, I am intentionally assuming that we have limited data, so that I can make an analogy between the W/L record of the pitcher to his BABIP record.  In order to make this analogy, I am assuming that we only recorded the following data:
- Starting pitcher of game
- Winning team of game
- number of balls in play
- outs made on balls in play

And that the W/L information is not shared with the BABIP information.

I am making this analogy because I want to show that in both instances, there is considerable noise that has nothing at all to do with the pitcher:
- how many runs his offense scores (which we don’t record for posterity)
- how many runs his fielding saves (which we don’t know how to count)

Colin is going two steps ahead of me here, and I keep trying to bring the discussion back to the most basic level until we are all clear on what I’m trying to do.  Maybe Colin is clear because he’s ahead, but I’m not sure if everyone else is on board yet.

1. We have the random variation around a pitcher’s true talent level, which drives his “make my own luck”, and therefore influences to a degree his W/L record and his BABIP record

2. We have pure luck, or someone else’s luck, that is just noise polluting a pitcher’s W/L record and BABIP record to some degree

3. And we don’t have any other information.

And therefore my point is this...drum roll please…

how you are thinking of handling BABIP is similar to how you are thinking of handling a pitcher’s W/L record, if in both cases you have no other information beyond BABIP and W/L.

If everyone is following here, we can tackle the next step (adding more information).  But this is important to know as the basis, so that we don’t keep talking about luck.
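One way to see points 1 and 2 side by side is a small simulation. This is only a sketch under made-up assumptions: the 70% true out rate, the 600 balls in play, and the 0.015 season-to-season fielding spread are illustrative numbers, not estimates, and the function name is hypothetical.

```python
import random
import statistics


def simulate_seasons(true_out_rate, n_bip, fielding_sd, n_seasons, seed=0):
    """Return the observed out rate on balls in play for each simulated season.

    true_out_rate: the pitcher's own (assumed) true talent level
    fielding_sd:   spread of the season-level shift caused by his fielders,
                   i.e. noise that has nothing to do with the pitcher
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n_seasons):
        # Pure luck: this season's defense nudges the out probability.
        shift = rng.gauss(0.0, fielding_sd)
        p = min(max(true_out_rate + shift, 0.0), 1.0)
        # "Make my own luck": binomial variation around that probability.
        outs = sum(rng.random() < p for _ in range(n_bip))
        observed.append(outs / n_bip)
    return observed


own_only = simulate_seasons(0.70, 600, 0.0, 2000)
with_fielders = simulate_seasons(0.70, 600, 0.015, 2000)

print("spread, pitcher's own variation only:", round(statistics.pstdev(own_only), 4))
print("spread, with fielding noise added:   ", round(statistics.pstdev(with_fielders), 4))
```

With fielding_sd set to zero the spread should sit near the binomial value sqrt(0.7 * 0.3 / 600), about 0.019. Adding the fielding term widens it, and from the out rates alone there is no way to tell which part of the spread came from which source.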


#30    Nathaniel Dawson      (see all posts) 2010/10/03 (Sun) @ 22:13

"But, if you pluck out any single PA from each batter, that’s luck.  It’s luck whether he got a hit or out.  But it’s luck of a loaded die.  Pujols’ die is loaded so he is safe 44% of the time.  Frenchy is loaded so he is out 70% of the time.

Now, we don’t want to call it luck, fine.  Call it random variation around an estimated true mean.”

Call it “random occurrence”. Ichiro isn’t going to get 1.37 hits in every game he plays. Sometimes he’ll get 1, sometimes he’ll get 2, sometimes he’ll get 4, sometimes he’ll get none.

We certainly can’t know all the different factors (or anything more than a very small percentage) as to why that would happen for any particular game, but we wouldn’t think that it’s really attributable to “luck”. It’s all about a million unknowable different things that could occur on any one play, in a way that appears random to us.

Is that really different than saying that it’s “luck” or “chance”? I don’t know, but I know that people have a problem with calling it those things, for good reason.
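To put rough numbers on the "loaded die" idea, here is a minimal sketch that treats each at-bat as an independent roll. The 1.37 hits per game comes from the comment; the 4 at-bats per game and the independence between at-bats are simplifying assumptions for illustration only.

```python
from math import comb


def hits_distribution(n_ab, p_hit):
    """Binomial probability of 0..n_ab hits in a game of n_ab at-bats."""
    return [
        comb(n_ab, k) * p_hit**k * (1 - p_hit) ** (n_ab - k)
        for k in range(n_ab + 1)
    ]


N_AB = 4               # assumed at-bats per game
MEAN_HITS = 1.37       # hits per game, from the comment above
dist = hits_distribution(N_AB, MEAN_HITS / N_AB)

for k, prob in enumerate(dist):
    print(f"{k} hits: {prob:.3f}")
```

The die is the same in every game, and the average still works out to 1.37, but single games land anywhere from none to four hits. Whether you call that spread luck, chance, or random occurrence, it is variation around a fixed mean.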


#31          (see all posts) 2010/10/03 (Sun) @ 22:21

Sorry, to get around the luck issue in a different way, what if we started from a totally different direction? Right now we use a bunch of different systems: linear weights for batters/runners, UZR for fielders, FIP for pitchers… Why not just use a single all-encompassing run-estimator for all, and use double-entry accounting?

The baseout states give us all information on how many runs should score, right?

The batter deserves credit any time he does anything that advances the baseout state in a good direction, and blame whenever he doesn’t. The split between the pitcher and the fielders can be on a sliding scale depending on how we measure defense. For instance, we can split the field into bins with the Gameday fielding data (even though it isn’t perfect) and measure the likelihood in each of those bins that a ball will get caught.

If it gets caught over 95% of the time, we can give that to the pitcher (it’s an automatic out essentially, so even a high school player would probably get it). If it gets caught under 5% of the time, we can give that to the pitcher too (it’s an automatic hit essentially). All the others we can assign to fielders on the basis of the likelihood each fielder should get to a ball in that bin/how many bases the batter advanced/how many batters scored.

If people think the pitcher has more control, they can move the cutoffs to over 90% or under 10%. If they think less, it can be 98% and 2% or whatever. But it allows people to be a bit more flexible with how much blame the pitcher gets for a lot of that noise, and makes sure that every single run is accounted for exactly.

We can talk about limited information or regression for luck, but if we’re going to do double-entry accounting, why don’t we start from a place that has all the information we need to cover all runs scored/allowed, and build from there?
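The cutoff scheme described above could be sketched roughly like this. The function name, the example probabilities, and the rule of splitting proportionally to each fielder's chance of making the play are all assumptions for illustration; the comment only says the split should be based on those likelihoods.

```python
def assign_credit(run_value, catch_prob, fielder_probs, low=0.05, high=0.95):
    """Split the run value of one ball in play between pitcher and fielders.

    run_value:     change in run expectancy on the play
    catch_prob:    how often a ball in this bin gets caught by anyone
    fielder_probs: each fielder's chance of making the play in this bin
    low, high:     cutoffs outside which the play is treated as automatic
    """
    total = sum(fielder_probs.values())
    # Automatic out or automatic hit: the whole play goes to the pitcher.
    if catch_prob > high or catch_prob < low or total == 0:
        return {"pitcher": run_value}
    # Otherwise share it among fielders in proportion to their chances
    # (an assumed rule), so the pieces still add up to run_value.
    return {name: run_value * p / total for name, p in fielder_probs.items()}


# Hypothetical bins: a routine fly ball and a ball in the gap.
routine = assign_credit(-0.27, 0.98, {"CF": 0.98})
gapper = assign_credit(0.75, 0.40, {"CF": 0.30, "RF": 0.10})
print(routine)
print(gapper)
```

Because each play's run value is handed out in full, every run is accounted for, and changing low and high is all it takes to give the pitcher more or less of the noise.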


#32    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 10:02

Call it “random occurrence”.

People will think of it like lightning striking if you say that.  I don’t see that as any better than “luck”.

I’m saying random variation around an estimated true mean.  You want to call it random occurrence around an estimated true mean, that’s fine.  But, you need to have both descriptions in there (random, true mean).


#33    Tangotiger      (see all posts) 2010/10/04 (Mon) @ 10:13

Sal, that’s basically PZR and UZR.

