THE BOOK--Playing The Percentages In Baseball

Tuesday, November 08, 2011

Numbers don’t (necessarily) represent the performance of the player

By Tangotiger, 12:23 PM

Jeff asks an innocuous question:

As I understand it, the Rookie of the Year award is supposed to go to the league’s best rookie. Consensus seems to be that “best” is some combination of performance and playing time. This is why Brett Lawrie doesn’t show up at the top of many lists. But why should playing time be that important? Brett Lawrie came to the plate 171 times and hit .293/.373/.580. That is an outstanding performance. An outstanding performance over a limited sample, sure, but a more outstanding performance than any other AL rookie, as far as I can tell. Why shouldn’t he get more consideration for the award? It isn’t the AL’s most valuable rookie. It’s the AL’s best rookie. There’s room for interpretation. Man, there’s room for interpretation with everything.

It is an outstanding RESULT.  It is an outstanding OUTCOME. 

And those results, those outcomes, are LINKED to Brett Lawrie.

Can we therefore INFER, because those outcomes are linked to Brett Lawrie, that Brett Lawrie (necessarily) had an outstanding performance?

Just today, I caught every single green light.  I mean every single one.  That was an outstanding outcome.  And, it was me, Tom, driving the car.  Can you infer that I had an outstanding driving performance?  Intuitively, we know that virtually all of that was luck.  So, since we know most of it is luck, we simply conclude that all of it is luck, and regress my “performance” 100% and treat it as all luck.

Brett Lawrie came to bat fewer than 200 times.  And he was deeply involved in each one.  It SEEMS like the outcomes linked to Lawrie are OWNED by Lawrie.  But that’s not true!  A large share of those outcomes are owned by Lawrie, but not all of them.

And here’s another weird part: the more outcomes he has, the larger the share of those outcomes that we can attribute directly to Lawrie.  So, if you had 2 plate appearances for Lawrie, then we attribute very little of the OUTCOMES to Lawrie.  We just don’t know if he was being a good driver, or just happened to be in the driver’s seat at that moment in time.  If he had 20 PA, then we attribute more of those outcomes to Lawrie.  If he had 20,000 PA, then we’d attribute 99% of each of those outcomes (including those first 2 outcomes) to Lawrie.

This is a Bayes world. 

Unless we can make a perfect and direct connection from Brett Lawrie to a particular outcome, we have no choice but to INFER, from the outcome back to Lawrie, the extent to which Lawrie himself actually influenced that outcome.  And one way to do that inference is through regression.

Remember: everything we see is an observation.  And our job is to infer what caused that observation.  And the more observations we have of Lawrie, the more we can infer about each single observation.
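
One way to picture the "share of the outcomes" idea is a simple regression weight, PA / (PA + K). This is only a sketch: the constant K = 200 and the .320 league OBP are assumptions chosen for illustration, not figures from this post, and the function names are hypothetical.

```python
# Sketch only: K is an assumed regression constant, not a value given in the post.
K = 200.0  # hypothetical "ballast" of league-average plate appearances


def share_attributed(pa: float, k: float = K) -> float:
    """Share of the observed outcomes credited to the player: pa / (pa + k)."""
    return pa / (pa + k)


def regressed_rate(observed: float, pa: float, baseline: float, k: float = K) -> float:
    """Pull the observed rate toward a baseline; the pull weakens as pa grows."""
    w = share_attributed(pa, k)
    return baseline + w * (observed - baseline)


if __name__ == "__main__":
    LEAGUE_OBP = 0.320  # assumed league-average OBP, for illustration only
    for pa in (2, 20, 171, 20000):
        share = share_attributed(pa)
        est = regressed_rate(0.373, pa, LEAGUE_OBP)
        print(f"{pa:>6} PA: share {share:.3f}, regressed OBP {est:.3f}")
```

Under these assumed numbers the share is tiny at 2 PA, under half at 171 PA, and about 99% at 20,000 PA. A different K would move the middle of that curve but not its shape.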


#1    Tangotiger    2011/11/08 (Tue) @ 13:51

To put it another way:

Brett Lawrie hits .293/.373/.580 in 171 PA in 2011

Willie McCovey hits .293/.368/.590 in 261 PA in 1962

But because McCovey had already accumulated 900 PA prior to that, we can attribute a larger share of those outcomes in 1962 to McCovey than we’d attribute to Lawrie’s outcomes in 2011.

But, if Lawrie continues to play for a few more years, and does so at a high-outcome level, then we can retroactively attribute more of Lawrie’s 2011 outcomes directly to Lawrie… even though 2011 is already in the books.

How we see the outcomes, and their relationship to Lawrie, depends on what else we know about Lawrie outside those outcomes.
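
A rough sketch of that comparison, assuming the share credited to a player depends on every PA we know about and not only on the season in question. The constant K = 200 and the 1,500 future PA for Lawrie are made-up values for illustration, and the sketch ignores aging and how recent the other PA are.

```python
K = 200.0  # assumed regression constant (hypothetical, not from the thread)


def share_with_history(season_pa: float, other_pa: float, k: float = K) -> float:
    """Share of a season's outcomes credited to the player, given all PA known.

    other_pa counts plate appearances outside the season, whether earlier or later.
    """
    total = season_pa + other_pa
    return total / (total + k)


lawrie_2011 = share_with_history(171, 0)      # nothing else known yet
mccovey_1962 = share_with_history(261, 900)   # 900 PA already on record
lawrie_later = share_with_history(171, 1500)  # hypothetical: 1,500 PA in later years

print(f"Lawrie 2011, as of 2011:   {lawrie_2011:.3f}")
print(f"McCovey 1962:              {mccovey_1962:.3f}")
print(f"Lawrie 2011, years later:  {lawrie_later:.3f}")
```

The 2011 stat line never changes in this sketch. Only the weight placed on it changes as more is learned about the player.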


#2    mettle    2011/11/08 (Tue) @ 14:00

So, can we retroactively award him the ROY? ;)


#3    Tangotiger    2011/11/08 (Tue) @ 14:08

Sabermetrically-speaking: yes!

Willie McCovey won the 1959 ROY on the strength of 219 PA.

Angel Berroa’s 2003 ROY might have been retroactively removed and given to… I dunno, Mark Teixeira.


#4    matt t.    2011/11/08 (Tue) @ 14:09

Perhaps then, the ROY Award shouldn’t be awarded until a couple of seasons later after we are able to better determine what percentage of the rookie’s plate appearances can be attributed to the rookie instead of merely linked to him.  Your point seems to be that we don’t know if Lawrie was the “best” rookie because we don’t have a large enough sample size to have confidence that the results reflect more skill than luck. I don’t know what the confidence level is for 150 at bats vs. the 523 at bats that Hosmer got. If it’s large, it would be nice if the rookie eligibility rules got re-written so that someone like Lawrie didn’t lose eligibility for the ROY by batting more than 130 times and lose out on being considered for the ROY for not batting more than 150 times.


#5    Xeifrank    2011/11/08 (Tue) @ 14:33

#1. Can we use Lawrie’s minor league stats (MLE) or scouting reports or some combo of the two to possibly regress him towards a higher level than the average bloke?
vr, Xei


#6    Tangotiger    2011/11/08 (Tue) @ 15:22

Xei: absolutely.

This is the slippery slope we are on.  We want to recognize the actual performance, but then people say we should 100% rely only on the OUTCOMES.

But since the outcomes themselves are partly influenced by someone other than the player, then we’re talking about regression.  And when we do that, we can use any and all information.

Think about a pitcher’s W/L record.  It’s an OUTCOME.  It’s LINKED to the pitcher.  But we know, we can see, that there’s a lot more that goes into that (lots of teammates’ support, or lack of, in hitting, fielding, bullpen).

But things like OBP and SLG SEEM to be only about the player in question, and they’re not.
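
A minimal sketch of what "use any and all information" could look like in practice: the same regression, but toward a player-specific prior instead of the league average. The constant k = 200, the .320 league OBP, and the .345 prior standing in for minor league equivalencies or scouting are all hypothetical values, not numbers from this thread.

```python
def regress_to_prior(observed: float, pa: float, prior_mean: float, k: float = 200.0) -> float:
    """Weighted blend of the observed rate and a prior mean.

    k is an assumed regression constant; a larger pa puts more weight on observed.
    """
    w = pa / (pa + k)
    return w * observed + (1.0 - w) * prior_mean


LEAGUE_OBP = 0.320     # assumed league-average OBP
MLE_PRIOR_OBP = 0.345  # hypothetical prior built from MLEs and scouting

to_league = regress_to_prior(0.373, 171, LEAGUE_OBP)
to_mle = regress_to_prior(0.373, 171, MLE_PRIOR_OBP)

print(f"regressed toward league average: {to_league:.3f}")
print(f"regressed toward MLE prior:      {to_mle:.3f}")
```

The amount of regression is identical in both cases. What changes is where the estimate is pulled to, which is the sense in which extra information about the player matters.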


#7    xfactor    2011/11/08 (Tue) @ 15:35

tango, would you award the LCS MVP to the player with the highest talent level who appeared in the series?


#8    Tangotiger    2011/11/08 (Tue) @ 15:48

I suppose the idea to even have an MVP of a 5 or 7 game series itself doesn’t make much sense.


#9    Brent    2011/11/08 (Tue) @ 17:04

By that logic we can’t give any award to any player with a breakout/career year, because the outcomes they accumulated cannot be attributed to them.  At least not as much as a player that’s already demonstrated that level of performance.  So Pujols should have unquestionably won the NL MVP in 2010 instead of Votto since their performances were close enough that the amount of regression should have been the deciding factor?  I think that’s a bit of a slippery slope in itself. 

I understand your point; the outcome of each of Lawrie’s plate appearances is influenced by the pitcher and his repertoire, the umpire, the base-out state, the positioning and skill of the defense, the park, weather, time of day and a million other factors, in addition to the skill of Lawrie himself.  The more information we get on Lawrie the more, or less, of those outcomes can be attributed to his performance.  I get it.  You also say that scouting reports and MLEs can be used to mitigate the amount of regression we do.  Isn’t the issue just a matter of how strongly we weight those factors?  Also if we’re regressing Lawrie’s outcomes we have to regress those of his competitors as well.  Sure we’re regressing his more, but is the difference really that great? 
Therefore people that believe in Lawrie as a candidate must weight scouting reports, MLEs etc. higher than those who don’t.  Is that wrong?


#10    Tangotiger    2011/11/08 (Tue) @ 17:42

Brent, you get it.

So, people don’t want to do all that.  They simply want to BELIEVE that the outcomes we see are 100% attributable to the talent of the player.

With things like pitcher W/L record, it’s easier to see that this can’t possibly be true.

But for things like OBP, it’s not as obvious.

But yes, if Votto and Pujols or Bonds and Kent both have seasons of similar overall outcomes, then you give the MVP to Pujols and Bonds.  Even if Pujols and Bonds have outcome numbers a bit worse than Votto and Kent, the two big guys get the MVP.

Heck, if it’s old Bonds v young Pujols, you’d still lean on old Bonds getting more benefit of the doubt, because we knew more about him (at the time).  Retroactively, we’d of course regress them similarly.


#11    Greg Rybarczyk    2011/11/08 (Tue) @ 18:03

If you had a sufficiently dominant player with a long record of greatness (say Pujols to give him a name), could such a player win an MVP award with a thoroughly mediocre season, just because by this revised method you regress everyone’s outcomes back towards their true talent estimates as they existed at the beginning of the year?

That is, could this method award Pujols the 2011 NL MVP?

Or to posit an extreme case, could Pujols put up 1 or 2 months of really weak numbers (say, due to an injury), and still win the MVP because his smaller sample size of current year performance was regressed even more heavily back towards his pre-existing true talent estimate?

Taken way beyond absurd, could Babe Ruth win an MVP award even after his death - suppose you hypothesized that he played and went 0 for 300, then regressed it back to his (huge) career numbers, and got an MVP out of it?

I’m just trying to get how this would work, because it does seem like a slippery slope to me to try to present season awards on the basis of prior season performance…


#12    Greg Rybarczyk    2011/11/08 (Tue) @ 18:04

Let me clarify: I didn’t mean to suggest that Pujols’ 2011 year was mediocre.  Those words just happened to follow one another…


#13    Tangotiger    2011/11/08 (Tue) @ 19:07

It’s possible that Pedro Martinez posting an equivalent .600 record would get the Cy over someone posting a .650 record, sure.

I don’t know how big the gap would have to be.


#14    Greg Rybarczyk    2011/11/08 (Tue) @ 19:18

Taking it to another sport, wouldn’t something like this have literally guaranteed that Wayne Gretzky would win the Hart Trophy every year, once he established his dominant run?  Taking him at about age 24-25, how bad a season would he have to post to not win it?

Automatic awarding of in-season honors based on long stretches of dominance in prior seasons, with promising youngsters shut out because of their lack of league longevity, that’s starting to sound like...like… no, I can’t say it…


#15    2011/11/08 (Tue) @ 19:50

(I post here daily, and you could probably guess who this is, but I can’t afford professionally to have a Google search of my name come up with this post):

I love Tango’s analogy about stoplights.  You are driving to work, and you catch every green light.  You wouldn’t say that you had an outstanding driving performance.  You got lucky.

My fear and resistance to tying results of students’ test scores to teacher evaluations is because I am increasingly aware that most people don’t truly understand statistical evaluation.  ("He won 20 games, he should win Cy Young because all that matters is wins,” etc.) We recognize and joke about the fallacies in baseball data because we know it well ourselves and recognize the errors.  But those types of mistakes are made in many facets of life, and I am concerned that a lot of people don’t notice them.  I have NO opposition to evaluating teachers by student data whatsoever if a truly good evaluation of student test scores and my “value added” as a teacher could be derived.  I am not opposed for the reasons that the public would perceive - that I don’t want accountability for my teaching.  And I do agree that for some teachers, that is their reason for resistance.  Some teachers are very lazy.

But for me, this post nails it.  It’s that the public, even the well-educated public like my principal and my superintendent, sometimes do NOT have a great grasp of statistics.  They do not all understand causation/correlation/sample size/regression to the mean, etc. well enough for them to accurately evaluate teachers based upon statistics.

I would rather they continue to “old school scout” teachers - which is basically what they currently do - until they a.) develop better data, and b.) learn how to better analyze data.

If Paul DePodesta or a similar statistical mind were evaluating me, and created a proprietary database system for teacher evaluations at my school, then I would be 100% on board.  But that doesn’t exist, so I am not.


#16    Lex Logan    2011/11/08 (Tue) @ 21:33

I think MVP (season and series), Rookie of the Year, Cy Young, Silver Slugger, etc. should all be based on performance in the given time frame. I’m all for using modern sabermetric measures to gauge performance, but I would reject any use of prior data or future projections. FanGraphs has Brett Lawrie at 2.7 WAR, and so as a starting point I’d compare that figure to other AL rookies. We can then argue about how good the fielding part of that is and clutch hitting and whatever else, but I couldn’t care less how well he did in the minors or how he projects to future seasons.


#17    2011/11/08 (Tue) @ 23:54

Anon/15, I’ve recently been thinking a lot about teacher evaluation as it relates to sabermetrics.  Instead of doing a static “X test score produces Y pay raise,” one could devise a system to estimate true teaching talent that would recognize that the tests are merely sample data.  So of course this system would require regression/Bayesian priors, etc.  However, after getting a bit excited thinking of all I could potentially do with such a system, I soon became a bit dejected as I realized the nuances are too subtle for most people to understand.  I fear that even straight Marcel for teaching would be deemed too complicated, let alone if someone were to incorporate other data like student performance in prior years and socio-economic factors.


#18    2011/11/09 (Wed) @ 04:33

Mickey #17:  Totally agree.  Sadly, the statistics would be manipulated, misused, and made the building block for sweeping, conclusive narratives.

“This teacher is our teacher of the year because he finished with a 21-6 win-loss record, and ultimately winning games is all that matters...”

“These four teachers should be fired because they each lost 15 games this season, highest in the league...”

“Is teacher X past his prime and due to be fired?  Sure, he had an All-Star season the past four years, but he’s 55 years old and his kids didn’t perform well on that one data point we look at, this year’s one standardized test...”


#19    MGL    2011/11/09 (Wed) @ 05:22

I am not crazy about Tango’s argument. I think the ROY award (and other awards) should be based on a player’s performance, period.

However, the question that the reader posed, particularly this comment,

“But why should playing time be that important?”

makes little sense.  If we use “reductio ad absurdum” we can easily answer his question. If a player goes 1 for 1 with a HR, or 5 for 10, or has a 1.200 OPS in 20 PA, should they win the award?

Of course playing time has to enter into the equation.  Exactly how is another story…


#20    Tangotiger    2011/11/09 (Wed) @ 10:28

"player’s performance”

Yes, but why do we presume that his stat line represents his performance?

We don’t presume that with a pitcher’s W/L record, do we?

Now, if you want to be explicit and say that the ROY should be based on the RESULTS ATTRIBUTED TO THE PLAYER, then fine.  And if you mean that to be the definition of “performance”, then fine as well.

***

When Cliff Lee has a .350 BABIP with men on base, and Roy Halladay has a .250 BABIP with men on base, and all the rest of their FIP are the same, does this mean that Cliff Lee had a worse “performance”?

So, exactly what do we mean by “performance”?  It’s possible that the BABIP itself had nothing to do with Lee and Halladay’s performance, that it was all their fielders (or luck), similar to Pitcher W/L records attributed to pitchers.

It’s possible that the results with men on base and bases empty were nothing but luck, but maybe that’s part of their “performance”.

***

So, I don’t see it as very black and white, as 100%/0%, in terms of what constitutes a pitcher’s performance.


#21    MGL    2011/11/09 (Wed) @ 13:10

It’s not black and white.  Yes, there are many ways to interpret or measure performance.  We have discussed this before (many times).  So we may not be so far off in our thinking.


#22    JD    2011/11/09 (Wed) @ 13:34

"Yes, but why do we presume that his stat line represents his performance?”

Because performance is what happened. This is turning into a really stupid argument because it seems arbitrary definitions of words are being used. I know some people here like to pretend the rules of language don’t matter, but they do, and this is why. “Performance” is not true talent.

I don’t much care about Lawrie and the ROY, and I won’t even get into a debate about how awards should be given out (though I do think they should be given out based on performance - based on what actually happened - and not true talent), but this idea that we can change meanings of words to make silly arguments just insults everybody’s intelligence.


#23    Tangotiger    2011/11/09 (Wed) @ 13:45

JD:

Feel free to excuse yourself, if you feel insulted.  Otherwise, don’t tell me that I’m insulting anyone.  Only post here if you can accept that I’m not insulting anyone. 

I don’t need or want your opinion, or that of anyone who thinks I’m insulting anyone.  Don’t even respond to this particular point of mine.

***

I’m not questioning this:
“Because performance is what happened.”

You are completely missing my point if you don’t see what I’m trying to say: I’m questioning whether the ATTRIBUTION of that performance to a single player is the correct presumption.

I gave examples of the pitcher W/L record being attributed to a pitcher.  I gave examples of BABIP, and with men on base and bases empty, being attributed to a pitcher.

So we then accept that the performance BELONGS to that pitcher.  That’s a presumption.  It’s not even necessarily a reasonable presumption.

Once you start to separate things out, then you get on the path to eventually attributing things to the lottery ticket holder.  And even if you do that, we’re not even sure who is holding that ticket, because a pitcher and his fielders are linked.


#24    mettle    2011/11/09 (Wed) @ 13:49

It seems that we should take consistency to heart here, and while I think /20/ was partly meant as a straw man, I think trying to thread the needle and find some median between true talent and true performance is ridiculous in how undefined that task is. Halladay should win that every time.

So, it only seems to make sense for awards to be 100% performance given all the other comments above (Bonds/Gretzky getting the award every year).


#25    Tangotiger    2011/11/09 (Wed) @ 13:54

Again, there’s no strawman.

The question is why do we accept that the stat line as currently presented represents the performance of the player in question?

Does the Pitcher W/L record represent Cliff Lee?  Does his BABIP represent his performance?


#26    ODD    2011/11/09 (Wed) @ 14:10

Am I the only one who thinks that the stop light analogy is horrible?  Presumably the driver (and car) have no impact on the stop lights since they are generally set based on timing.  Conceivably if you drove the same roads at the same time every day you could hit all the green lights (if the traffic was the same every time).  You factor 0 into this.

But with regards to Lawrie, yeah, there are variables outside of his control, but offensively he still has to make contact to get a hit etc.  His actions are part of the equation.

Good premise, not so good analogy.


#27    2011/11/09 (Wed) @ 14:27

If I may take Tango’s point in a slightly different direction, there’s a continuum of “performance”:

Team Win%
Team Runs
Runs Scored/RBIs
OBP/wOBA/SLG
Quality of Contact
Pitch Selection

Every one of those is “what happened.” What we’re seeing is a growing recognition that actual player performance happens farther and farther down this list, and higher entries are influenced to a greater extent than we thought by things completely outside the player’s control.  Practically no one has ever used Team Win% by itself to characterize batters.  So, we’ve already all agreed that the important thing is what the player is responsible for, we’re just negotiating the price.  The question is, what to use to infer the level of performance of the player given that we cannot measure it directly.


#28    mettle    2011/11/09 (Wed) @ 14:39

Thumbs up to 27/Larry (though a few details are off; how often I wish there were some +/- thing in the comment section here).

Things more to the top (outcomes) should be for awards, things more to the bottom (talent) should be for contracts.

What other rubric would one use? Bottom for contracts and sort-of-middle-depending-on-the-day for awards?


#29    Nathaniel Dawson    2011/11/09 (Wed) @ 14:52

I like the use of the term “outcome” for a stat line that’s generated by a player. I have issues with people using the term “performance” to denote the stats that a player produces. A player “performs” actions on the field, and we measure the results (outcomes) of that performance as stats. He’s trying to perform at his optimal level to generate good outcomes.

But we’ve all seen “bad” performance lead to a “good” outcome, and seen “good” performance lead to a “bad” outcome. Performance and outcome are inextricably linked, certainly, but they do not equate to each other.

Seriously, I think understanding this concept should be part of “Sabermetrics 101”. In my mind, it’s one of the underlying forces of most sabermetric principles, and it’s much easier to understand a lot of the concepts of sabermetrics if you understand this first.


#30    WanderingWinder    2011/11/09 (Wed) @ 14:52

It’s like this: There are three major influences rolled into a particular player’s stat line: talent, luck, and externalities (like teammates, opponents, ballparks, weather, etc.). The question is, where do some of these fall? We all want talent, and none of us want externalities when evaluating a player, but so much stuff that’s easiest attributed to luck could fall either way. For a pitcher, BABIP is a little talent, a good amount of luck, and a good bit of fielding (including positioning and the skill of the guys behind him, etc.). So how much of that fielding stuff do we credit to the pitcher, and if we don’t want to do any, how do we control for it?
I want to argue that something like a hitter getting a ‘seeing-eye’ single, while probably mostly luck, maybe a little talent, and some defense, should count for the hitter in one of these awards, because he did his task of getting on base; he took advantage of the opportunity presented to him. Now, I don’t necessarily think he’s gonna be able to reproduce that consistently, so I don’t want to count it so much for my estimation of his talent level (though I will a little). I think Tango is wanting to say that he shouldn’t get much credit for it at all, because it was mostly on the luck/fielding side, and really the only consistent talent, i.e. the part that that hitter controlled, was in getting decent enough contact for the ball to be in play. Of course this is a reasonable enough position, but for something like these awards, I think it’s too fine a comb to use, because you end up not getting anywhere.
It’s somewhat analogous to this: I don’t think the Cardinals were the best team this year. I think the Phillies were. But the Cardinals deserve to win the World Series, because they did win, even though there was some luck in that, and even though I don’t think they could reliably reproduce that.
Of course, it’s more complicated on the individual level because it’s a team game, and so to separate out the different players, there are more confounding variables, because you’re looking for something smaller to start with. But this is why I wouldn’t just say “use stat X” as a measure, but rather a balance of different statistics, a little bit of eye test, knowledge of different situations, etc.
All of which seems rather unrelated to me to the question of how much weight we give to rates vs. longevity.


#31    Tangotiger    2011/11/09 (Wed) @ 14:56

Suppose a batter reaches base 250 times, and scores 75 runs, while another batter reaches base 200 times, and scores 100.  What is the “performance” metric we attribute to the player?

One batter has 160 RBIs on the strength of a .500 SLG, while another has 100 RBIs on the strength of a .600 SLG.

Crosby completes 900 passes that lead to 70 goals, while Ovechkin completes 500 passes that lead to 80 goals.  What is the performance metric that we attribute to the player?


#32    WanderingWinder    2011/11/09 (Wed) @ 15:39

Answer in all cases: I can’t tell because my information is way too limited. If it’s 250 walks vs. 100 HR + 100 3B, probably the second guy. It could easily be the other way with a different texture. Similarly in the second case. But I think nowhere is this clearer than the hockey example, where there are many different qualities of passes completed.


#33    Tangotiger    2011/11/09 (Wed) @ 15:57

My point is that some “outcomes” are considered outcomes when it changes the scoreboard.  In other cases, it’s when there’s a handoff to the next guy.

A goalie is in nets when 5 goals are scored, but all are on tip-ins, rebounds, or defensive breakdowns.  The performance line says the goalie gave up 5 goals on 21 shots.

Again, the point is that people take for granted that the linking of the player to the event automatically means that it’s related to the responsibility or otherwise ownership of the player.


#34    WanderingWinder    2011/11/09 (Wed) @ 17:44

@Tango/33
I totally agree, and I think we agree on the problem, more or less. I just think we want to use different ways of trying to get around it.


#35    mettle    2011/11/10 (Thu) @ 05:22

WPA FTW!


#36    2011/11/10 (Thu) @ 11:49

To follow up to 27/Larry:

The only thing that a pitcher actually does (other than a bit of fielding) is throw pitches. So in my mind, when we’re talking about a pitcher’s actual performance, we’re only talking about how well he threw pitches in the given time period. Everything else after that (strikeouts, singles, homeruns, runs allowed, wins/losses, etc.) are just outcomes that resulted partially from the pitcher’s performance, partially from his fielders’ performances, partially from the opposing hitters’ performances, and partially from luck.

The problem is that we have no way (at least right now) of directly measuring a pitcher’s performance, so instead we try to estimate it from the higher level results. But the farther up the list we get, the more noise there is from external factors.

So maybe there’s a 95% correlation between a pitcher’s performance and his observed K and BB rates, a 50% correlation between performance and observed ERA, a 20% correlation between performance and W/L record (numbers for illustrative purposes only). None of these are actually measuring the pitcher’s performance, but rather we’re trying to find the best way to use them to estimate what the pitcher’s performance was.
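
To make the "more noise farther up the list" idea concrete, here is a small simulation sketch. It assumes a latent performance value with unit variance and adds independent noise to produce each observed stat. The 0.95 / 0.50 / 0.20 targets are the illustrative figures from above, not measured values, and the helper names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def noise_for(target_corr):
    """Noise SD that yields target_corr when performance has SD 1."""
    return math.sqrt(1.0 / target_corr ** 2 - 1.0)


def simulate_correlation(noise_sd, n=20000, seed=0):
    """Correlation between latent performance and a noisy observed stat."""
    rng = random.Random(seed)
    performance = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [p + rng.gauss(0.0, noise_sd) for p in performance]
    return pearson(performance, observed)


if __name__ == "__main__":
    # Illustrative targets only, as in the comment above.
    for label, target in (("K and BB rates", 0.95), ("ERA", 0.50), ("W/L record", 0.20)):
        r = simulate_correlation(noise_for(target))
        print(f"{label:>15}: target {target:.2f}, simulated {r:.2f}")
```

In this toy setup the pitcher's performance is identical in all three cases. Only the amount of outside noise layered on top of it changes.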


#37    Tangotiger    2011/11/10 (Thu) @ 12:22

Right.  And then it’s a question of how far do you go with tying performance of the pitcher to the binary outcomes that result.  Does a pitcher who puts himself in bases loaded jams only to escape with 0 runs allowed count the same as a 1-2-3 inning?  Is the outcome that matters the runs on the scoreboard?

In hockey, do we care about the skater who makes all the great passes, and puts himself in position to take great shots, only to be left goal-less and assist-less, while some guy gets two goals by being in the right place at the right time, and otherwise does nothing else for the game?  What outcome are we caring about?


#38    Steven Ellingson    2011/11/10 (Thu) @ 14:12

Any other sport, and you’d get a much better response.  Everyone understands that people do other things than just score points and rebound in basketball, and most likely don’t think of 20 points, 10 rebounds as a player’s performance.

But with baseball, because of the static, 1 on 1 matchups, we have become comfortable saying that “.300/.350/.450” or “3 for 4” or “home run” are performances, that are directly attributable to the player.  We know that players aren’t all getting the same pitches from the same pitcher with the same defense.  We know that a swinging bunt looks the same as a line drive in the score card, yet we STILL attribute that performance 100% to the batter.  We only regress small samples because we know that the “performance” is not sustainable. 

I am with Tango in that these statistics are just estimates of performance. WAR is just a better estimate of performance than others. We need to regress not just to find true talent, but to get the best estimate possible of performance itself.

I’ve always been an advocate of using past years of fielding data, fans’ scouting, etc. when looking at MVP. I don’t know why other statistics should be any different.


#39    Brent    2011/11/10 (Thu) @ 16:45

I’m of the opinion that a player’s true talent varies, not just as he ages, but from day to day, or even from pitch to pitch.  People, even professional athletes, aren’t machines (as if machines are models of consistency; my work computer used to BSOD almost daily); they can lose focus, get sick and/or dinged up over the course of the season.  Some people may be healthier or recover faster or able to maintain focus longer or whatever else.  I think there are as many internal factors affecting a player as there are external ones affecting the outcomes generated by his performance. 

Now, for predictive purposes I can understand taking those into account (which we do, intentionally or not, when making projections based on past performance), but to apply that to outcomes that have already taken place… I don’t know.  To me it seems like a matter of choosing whether to over-attribute portions of outcomes to a player or under-attribute portions of those outcomes.  You’re likely erring either way.  Which side would you rather err on?  I think I’d rather overcredit the individual in this case.  It’s not like I’m signing him to a lucrative long term contract.  It’s an award which has no bearing on anything else.


#40    2011/11/11 (Fri) @ 05:34

I made a similar, but different, argument here:

http://sabermetricresearch.blogspot.com/2011/09/bayesian-cy-young.html

The difference is: Tango is arguing that you have to be Bayesian to decide who’s the best player.  By definition, the best performance must have come from the best player.  That’s because you assume that a player has a certain talent level, and any variation from it is just luck.  So, any deviation from true talent shouldn’t be rewarded with MVP.

I think that’s Tango’s argument.  I disagree.  Well, I agree with everything except that last sentence (which I am assuming is Tango’s argument—it’s not a direct quote).

To me, before you take the MVP award from player A, who had great outcomes, and give it to player B, who had less great outcomes, you have to prove, with firsthand non-Bayesian evidence, that B’s performance was actually better than A’s.  A strong Bayesian argument isn’t enough.

It’s like, suppose you have two specific suspects in a robbery, X and Y.  You know one of them did the crime, but you don’t know which one.  X has ten prior convictions.  Y doesn’t. 

You don’t say that X was more likely to have committed the crime than Y.  It’s probably true, but you don’t say it.  You need evidence.  Until you do, you have to say that they’re equally suspect.  You can say that, *in theory*, in 1,000 similar cases, you’ll find that the theoretical X did the crime 996 times.  But you don’t have the justice system say, “Alex P. Jones probably committed the crime, and John W. Smith probably did not.”

What kind of evidence do you need to award B the MVP instead of A?  Well, if A was lucky, find the luck.  Did he face worse pitchers?  Did the pitchers throw him worse pitches, in the sense of worse movement and velocity?  Did the defense not make as many plays as they should have given the trajectory of the ball?  Was the defense positioned worse?  Did he benefit from too many platoon advantages, or too many day games, or parks, or something? 

Sometimes, the luck might just be in A’s physical or mental performance.  Maybe he swung the way he always does, but just randomly was 0.1 mm closer to the ball, and got a few extra hits that way.  Maybe when there was a 50:50 chance of a fastball, A guessed right 60% of the time just by luck.

If that’s the case, and you can’t prove it, I think you have to give the MVP to A, even though I agree with Tango’s logic that B probably actually performed better.  Even if Mario Mendoza goes 4-for-4 the same game his teammate Albert Pujols goes 0-for-5, you give the game MVP to Mario, unless you can PROVE that he was just lucky.  Otherwise, you have to formally assume that Mendoza upped his skill that day, just like you have to assume OJ Simpson is innocent if the evidence isn’t reasonable-doubt proof.

I don’t think that’s a big deal.  We give the World Series to the best outcome, not the best talent.  Why should the MVP be any different?


#41          (see all posts) 2011/11/11 (Fri) @ 14:54

Phil: I don’t think that’s Tango’s argument. In particular, I don’t think this is right:

“That’s because you assume that a player has a certain talent level, and any variation from it is just luck.  So, any deviation from true talent shouldn’t be rewarded with MVP.”

A player has a certain talent level, but that doesn’t mean that he can’t perform above or below that level in a certain time period (or maybe, like Brent/39 said, his true talent is varying day to day, and his average true talent in the time period in question was different than what we’d expect his average true talent to be going forward).

I think a good way to think about this is with the following example: Suppose we have hitter A and hitter B, both of whom have 10 PA. Hitter A goes 0-10 with 10 hard hit line drives that are caught by fielders. Hitter B strikes out 3 times and hits 7 weak grounders/bloopers, 4 of which end up going for singles. Even though hitter B had better outcomes, it is highly likely that hitter A actually performed better in those 10 PA (because of the contact/batted ball profiles, not because of their true talent - in fact, hitter A probably performed above his true talent in this sample, and hitter B probably performed below his true talent in this sample).
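To put rough numbers on that example, here is a minimal sketch in Python. The per-contact hit probabilities in `HIT_PROB` are assumptions chosen for illustration only, not measured values, and `expected_hits` is a hypothetical helper. The point is just that a batted-ball profile implies an expected outcome that can differ a lot from the actual one.

```python
# Assumed chance that each kind of plate appearance becomes a hit.
# These are illustrative guesses, not measured rates.
HIT_PROB = {
    "line_drive": 0.70,
    "weak_contact": 0.20,
    "strikeout": 0.00,
}


def expected_hits(profile):
    """Sum of (assumed hit probability * count) over each PA type."""
    return sum(HIT_PROB[kind] * count for kind, count in profile.items())


# Hitter A: 10 hard line drives, all caught (0 actual hits).
hitter_a = {"line_drive": 10}
# Hitter B: 3 strikeouts and 7 weak grounders/bloopers (4 actual hits).
hitter_b = {"strikeout": 3, "weak_contact": 7}

expected_a = expected_hits(hitter_a)
expected_b = expected_hits(hitter_b)

print(f"Hitter A: expected {expected_a:.1f} hits, actual 0")
print(f"Hitter B: expected {expected_b:.1f} hits, actual 4")
```

Under these assumptions hitter A "should" have far more hits than hitter B, even though the box score says the opposite. Different assumed probabilities would change the sizes of the gaps, but not the direction, as long as line drives are much more likely to fall in than weak contact.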


#42    Tangotiger      (see all posts) 2011/11/11 (Fri) @ 15:22

All I’m trying to say is that you have outcomes, which we associate with players to some extent, and there’s the underlying performance that influences (though does not necessarily directly link to) those outcomes.

While we may say that the “better performing player” is getting rewarded, it’s actually the “player linked to the better outcomes” that is getting rewarded.

Hence a lucky pass that leads to a goal is one kind of outcome, while a great pass that leads to a shot (but no goal) is another kind of outcome, and we “reward” the action that led to the goal outcome far more than the one that didn’t.

We’re not actually rewarding performance.


#43    phil birnbaum      (see all posts) 2011/11/11 (Fri) @ 15:30

What are you suggesting we reward?


#44    Tangotiger      (see all posts) 2011/11/11 (Fri) @ 16:12

I’m not suggesting what we reward.  I’m just pointing out what it is that IS being rewarded.  I don’t particularly care who gets rewarded what.  I just care about getting the specifics right.

Just as the World Series rewards outcomes, not performance, so do these awards.


#45    Steven Ellingson      (see all posts) 2011/11/11 (Fri) @ 17:51

There was a study recently about umpires calling more borderline strikes for guys who had better control. Sorry I can’t remember the author, though I’m pretty sure it was on THT.

I forget the principle behind it, but the idea was that this made umpires MORE accurate.  Basically, if they have prior knowledge that the pitcher is likely to throw a strike, and they use that info, they’ll be more likely to be right on borderline pitches.

I feel like this argument hits a similar vein. It might not be “fair” but do we want our umpires to be fair or accurate?


#46          (see all posts) 2011/11/11 (Fri) @ 18:03

Steven/45, that’s Bayesian inference right there.  The same data (the umpire’s observation) with different prior beliefs (control of pitcher) will yield different posterior beliefs (likelihood of strike).
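A minimal sketch of that calculation in Python, for anyone who wants to see it worked through. Every number here is made up for illustration: the priors for the two pitchers and the two likelihoods describing how often a borderline pitch "looks like a strike" to the umpire are all assumptions, and `posterior_strike` is a hypothetical helper name.

```python
def posterior_strike(prior_strike, p_looks_given_strike, p_looks_given_ball):
    """Bayes' rule: P(pitch was a strike | it looked like a strike)."""
    numerator = p_looks_given_strike * prior_strike
    denominator = numerator + p_looks_given_ball * (1.0 - prior_strike)
    return numerator / denominator


# Assumed likelihoods for a borderline pitch (same for both pitchers,
# because the umpire's observation is the same data in both cases).
LOOKS_GIVEN_STRIKE = 0.6
LOOKS_GIVEN_BALL = 0.4

# Assumed priors: how often each pitcher's borderline pitch is a strike.
good_control = posterior_strike(0.60, LOOKS_GIVEN_STRIKE, LOOKS_GIVEN_BALL)
poor_control = posterior_strike(0.40, LOOKS_GIVEN_STRIKE, LOOKS_GIVEN_BALL)

print(f"good-control pitcher: {good_control:.2f}")
print(f"poor-control pitcher: {poor_control:.2f}")
```

The observation is identical in both calls. Only the prior changes, and the posterior for the good-control pitcher comes out higher, which is the sense in which using the prior can make the call more accurate on average.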

