THE BOOK--Playing The Percentages In Baseball


Wednesday, January 20, 2010

A saberist reviewing a paper of economists reviewing saberists

Dave Berri and JC Bradbury wrote a paper called “Working in the Land of the Metricians”, whose purpose is to discuss “how economists can benefit from sabermetric advances while avoiding its pitfalls.”

I will give you my thoughts on the paper, as I read it.


Part 1

Specifically, how should economists react to the work and ideas of the nonacademic analysts who are often seeking to answer the same questions?

Excellent question.  Indeed, it’s probably the only question I have. 

Fundamental to the rest of the article, we observe that these communities of nonacademic sports analysts evolved outside the typical parameters of an expert peer-review system familiar to most academics.

Yes, good setup.  This is exactly what happened.

Some metricians accuse the academic group of intellectual snobbery—that is, deliberately snubbing valid research only because the authors lack credentials. Phil Birnbaum, a frequent critic of both of the authors of this article, summarizes this complaint…

That should read “Some metricians accuse SOME IN the academic group”.  Otherwise, yes.

Birnbaum (2006) considers sabermetricians to be ‘‘no less intelligent than academic economists’’ and superior to economists in their understanding of baseball.

True.  Baseball analysis is what we do.  If you have two groups of people, equally intelligent, and one spends far more time becoming subject matter experts while the other spends more time learning technical skills, well, by definition, the subject matter experts are superior in their understanding of baseball.  And the techies are superior in their application of technical skills.

This statement reveals a curious worldview. On one hand, the aspect that is universal across both groups—members of both communities have been devoted sports fans since an early age—is considered unique to the nonacademic sports analysts.

No, this is not considered unique to saberists.  It seems that the authors think that just being a hardcore fan is what we think makes us superior.  That’s not correct.

On the other hand, when it comes to the aspect that is unique to academics—academia normally involves many years of advanced training and requires its participants to be judged competent by their peers in a ‘‘publish or perish’’ environment—metricians demand equal recognition. In our view, this mentality begets misplaced confidence.

Actually, we don’t demand equal recognition.  We don’t even demand recognition frankly.  What we would like is for some economists to realize they are still dancing disco, while the rest of us have grown up from that.  Perfect the electric slide all you want, and that won’t mean that we want to be your equals there.

...researchers seeking to study sports phenomena should proceed with caution when using informally vetted metrician findings

Agreed.

Often the merit of ideas offered by this community is judged by a consensus of pseudonymous avatars,

Ah, yes, the necessary jab at aliases.  In that one statement, the authors have insulted half the internet population, if not more.  Is it really relevant that “Tangotiger” gives feedback on a sabermetric topic, as opposed to someone with two names, like “Tom Tango”?  How do we know if “Gabriel Desjardins” appears on his birth certificate?  Is this really relevant?  So this is a statement of pure bias on the authors’ part: “judged by a consensus of pseudonymous avatars”.  They could have said “judged by a consensus of extremely dedicated students of sports analysis”.  The authors deserve scorn for such a blatant editorial slant.

many of whom appear to lack training and experience with advanced research methods.

Guilty. 

But, the presumption being made here is that “advanced research methods” are somehow beneficial.  You can make an argument that such methods see the trees as noise to the forest.  In my opinion, being a subject matter expert is a far larger prerequisite for being a quality sports analyst than being a technical expert.  Indeed, if you were to list all the great sports analysis in the past decade, my view would be supported.

And look at the fertile ground that is PITCHf/x.  What is a non-subject matter expert to do with such data?  Purely a fish out of water.  The subject matter experts are the ones who will sort out how the data should be aggregated, interpreted, etc.  The technical expertise will rely on data mining and cluster analysis.  The regression jackhammer will have little use to those who need archeological tools.

Consequently, much of what passes for research among metricians would not pass through the academic peer-review process of any academic discipline.

True. At the same time, much of what passes for research among some academicians would not pass through the non-academic review process.

While a specific academician might agree that the worst hitting outfielder in baseball could have a performance worth 12 million dollars, even the most casual baseball fan would laugh at such a result.  Indeed, while the baseball fan would say that your toaster is broken, this academician will reply that the toast is supposed to be burnt black on one side and stay white on the other.

That being said, researchers who validate metricians’ findings should acknowledge the origins of their approach.

Good.

Although we are critical of the hobbyist community’s informality

And to me, I PREFER the informality.  Things get done, things move fast.  Things get corrected quickly.

we must acknowledge that metricians are responsible for many important discoveries that should inform sports economists in their research. In fact, we have observed instances where economists have made elementary mistakes in understanding sports that would not have been made if they had paid attention to metricians’ findings. It is our hope that academic researchers using sports as their laboratory will review the analysis from all relevant outlets, including the metrician community.

Good.  If only I could believe it.  JC shows an enormous level of disdain and scorn for our work.  For example, replacement level.  Also, he only uses Dewan’s plus/minus, and proclaims it as the superior metric, when a) he’s made no effort to show why Dewan is better than UZR, and b) UZR is far more publicly accessible than Dewan.  Indeed, the rest of us see UZR and Dewan as pretty close to each other.

With these thoughts in mind, we now proceed to three specific questions:
1. What can the sports economist learn from the metrician research?
2. What issues arise when sports economists review the metrician research?
3. How should sports economists handle the less savory interaction aspects when we journey into the land of the metricians?

I haven’t read the rest of the paper yet.  I’m just cutting, pasting, and commenting as I read.  I’m only at the bottom of page 3.  Let me give you my answers, and let’s see what the authors say:

1. That the saberists have provided the bulk of the research, so that means that the sports economist needs to spend the bulk of their time looking at saberist work.

2. That there is no one single placeholder, that things evolve, that sometimes there are competing views.  It’s a bit of a mess.

3. Some sports economists go into the deep end of the pool where the subject matter experts know how to swim and dive, so respect that they know how to swim and dive without having actually been taught how to swim and dive.  Maybe we should be wearing tighter swimsuits and shaving our chest hairs and wearing a cap to improve the aerodynamics of swimming and diving.  We can accept that.  But, we’re already 95% of the way there.  You can help by adding the other 5%.  But, don’t tell us that we should go back to the wading pool, buy textbooks on how to swim, and shave our hair.  Or, in a word, don’t be a dick.

Are saberists dicks too?  I’m sure some are.  But being a dick who can swim is better than being a dick who tells an experienced swimmer how to swim.

Speaking as a saberist who has posted on, literally, hundreds of sports sites, blogs, and forums, I can say that there is nothing unsavory in my dealings with any reader.  There are those readers who refuse to be educated, and so, to those readers, you bid adieu.  Others are sincerely interested in learning or teaching or interacting.  This is by far the largest segment of my interactions.  JC, on the other hand, by his own hand, has created the unsavory landscape that he finds himself in.  How is it that the rest of us can treat each other with so much respect, and yet JC finds himself, on site after site, as the bad guy?  Stop being the swimming coach telling Michael Phelps that his technique is not correct, even if he’s the best swimmer in the world.

That’s all I have time for now.  When I review the rest of the paper, I’ll put up another post.

Part 2

According to Turocy, though the added explanatory power of a regression is superior to some simple metrics, the coefficient weights suffer from omitted variable bias due to the correlation of included metrics with other unmeasured important aspects of the game. ‘‘Linear weights’’ is therefore a superior estimator of run production, because it does not suffer from omitted variable bias… This metric was originally developed by operations research analyst Lindsey (1963) and updated by sabermetricians Thorn and Palmer (1984). Linear weights estimates contributions to baseball events using play-by-play data to weight the run-generating probabilities for individual events; thus, its estimates avoid the omitted variable bias problem.

Yes, perfect, and exactly what I’ve been saying.  So, here, JC agrees with me that simply relying on the jackhammer regression, the one where the run value of a double is only +.15 above the single, is the wrong thing to do.  Good.  But he adds:

Nevertheless, the story does highlight how academics can learn from metricians.

Yes, perfect.  This should be the rule, not the exception.

Scully (1974, 1989) used strikeout-to-walk ratio to measure pitcher performance; however, Zimbalist (1992b) and Krautmann (1999) argue that ERA is a better measure of pitcher quality because ERA has greater explanatory power. ....(then Voros DIPS talk)… Consequently, we see Scully’s general approach is confirmed by a metrician, demonstrating what the nonacademic sports research community can contribute to sports economics research.

Yes, good.  DIPS is the kind of thing that only a subject matter expert would see.  And it’s exactly the kind of thing that should be used by the sports economist.  It’s not a good idea for them to try to create a metric when we’re the ones who have so much experience in doing just that.

Part 3
Berri now takes over:

Looking at team data in our same sample from 1987-88 to 2007-08—or 591 team observations—we see that a team’s NBA Efficiency per game explains 32% of the variation in team winning percentage.

I follow the basketball metrics debate from a high level view.  But, from what I can gather, it’s similar to what you’ll find in hockey: the contributions of individual players are not recorded as well as we’d like.  And when we try to create inferences at the player level, it kind of breaks down at the team level.  And, since the only validation we have is at the team level, then, in order to get the best correlation, it’s best to throw out anything you might learn at the individual level that decreases that correlation. 

Really, the argument is very similar to the one about the various runs created metrics: we really don’t care how runs are created at the team level, as the individual hitter is what interests us.  But, to increase correlation, you throw out logic and some useful individual data, to best serve the jackhammer regression.

And after I wrote that, here’s a third-person statement that comes right after all that:

To put these results in further perspective, consider Wins Produced, a measure reported by Berri (2008). Wins Produced explains 94% of team wins.

But, that’s NOT the objective.  If that was the objective, why not simply take points scored by the team, divide by 48*5, and multiply by minutes played for each player?  And do the same for points allowed.  There, r=.999.
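
To see why a high team-level fit proves little, here is a minimal sketch of that minutes-based allocation.  The roster, minutes, and point total are made up for illustration, and the function name is hypothetical.  Every player gets the same per-minute rate, so the player credits always add back up to the team total, whatever the players actually did.

```python
# 48 minutes x 5 players on the floor = 240 player-minutes per regulation game.
MINUTES_PER_GAME = 48 * 5


def allocate_by_minutes(team_points, minutes_by_player):
    """Credit each player with team points in proportion to minutes played.

    This is the deliberately naive allocation: it ignores everything the
    player did except how long he was on the court.
    """
    per_minute = team_points / MINUTES_PER_GAME
    return {name: per_minute * minutes
            for name, minutes in minutes_by_player.items()}


# Hypothetical single-game minutes (they sum to 240); players are just labels.
minutes = {"A": 40, "B": 38, "C": 36, "D": 34,
           "E": 32, "F": 24, "G": 20, "H": 16}

credited = allocate_by_minutes(102, minutes)  # 102 team points, made up
total_credited = sum(credited.values())

for name, points in credited.items():
    print(name, round(points, 2))
print("team total recovered:", round(total_credited, 2))
```

The same allocation applied to points allowed would reproduce the team's point differential just as faithfully, which is why matching team totals says nothing about whether the individual credits are right.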

Consequently, these models are not well connected to winning and, therefore, are not useful in determining the economic value of players.

I can see why Berri is so not-liked by the basketball analyst community.  He engages them even less than JC does, and he positions his metric much higher than JC would dare (though JC does go there when he says things like Brandon Lyon, per inning pitched, is worth as much as Doc Halladay).  JC has the good sense to shut up about it, while Berri just keeps compounding his assertions.

So, Berri should actually take JC’s advice from the page before and learn from the analyst community as to what makes a good metric.  The irony is of course completely lost on Berri.

The purpose of tracking statistics is to separate a player from his teammates. The plus–minus measure, though, provides a player evaluation that fails to accomplish this objective.

I’m glad JC is my antagonist, because Berri is starting to get on my nerves.  Pure unadjusted plus/minus is crap, yes.  Bobby Orr for example was +124 in 1970-71, for the highest such mark since it’s been recorded.  Bobby Orr is also one of the three greatest hockey players of all time.  There is no doubt that the +124, while perhaps not in its degree, at least points to something real: Orr was good. That Bruins team was great, as they were probably around +160 to +170 as a team.  So, we can see that when Bobby Orr was not on the ice, the rest of the team was around +40.  So, Orr was surrounded by some good players, but certainly not a fantastic great group of players.  Clearly though, Orr was the man on that team.  It’s also not hard to see who his defense partner was: Dallas Smith who was +94.

So, in order to make sense of plus/minus, you really need to know who was on the ice when Bobby Orr was on the ice.  Bobby Orr, basically, pollutes the data.  If Orr was used equally with the other five defensemen, then that would have been fantastic.  And really, that’s what we really want: to try to isolate a player from the rest of his team. To deride plus/minus because it is simply a counting number is short-sighted.  Indeed, it shows the reason why an economist should let the saberist figure out how to create the individual metric.
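
The Orr arithmetic above can be sketched in a few lines.  The +124 is the figure quoted above; the +160 and +170 team totals are only the rough range guessed at above, not recorded values, and the helper name is hypothetical.

```python
def off_ice_plus_minus(team_plus_minus, player_plus_minus):
    """Team goal differential while the player was NOT on the ice."""
    return team_plus_minus - player_plus_minus


ORR_PLUS_MINUS = 124  # 1970-71 figure quoted in the text

# Rough team range from the text; treated here as an assumption.
off_ice = {team: off_ice_plus_minus(team, ORR_PLUS_MINUS)
           for team in (160, 170)}

for team, rest in off_ice.items():
    print(f"team +{team}: without Orr +{rest}")
```

Either end of the range leaves the rest of the team near +40, which is the on-ice versus off-ice split that raw plus/minus hides.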

Regressing a player’s plus–minus this season on his plus–minus last season reveals that the latter explains only 9% of the variation in the former.

Ack!  Again, it completely misses the point, that Berri would do this.  I can’t believe what I’m seeing here.  We should be thankful that JC is our baseball antagonist, because Berri here simply is not fighting fair.

When we complete a similar exercise for other statistics tracked for hockey players—such as shooting percentage, assists, goals, points, penalty minutes, and shots on goal—we see a level of explanatory power that ranges from 39% to 80%. In sum, every other hockey statistic tracked for skaters exhibits a far higher level of consistency across time. And this suggests that relative to plus–minus, every other statistic captures more of the player’s individual skill and less the happenstance of his teammates’ identity.

Really?  Big deal!  So, the number of shots a player takes is more indicative of him as a shooter than the plus/minus he shares with his teammates is of his plus/minus he will share with his other teammates next year?  Again, completely misses the point.

So, it appears that plus–minus is not a particularly powerful metric in hockey.

Enough.  You’re killing me here.

Again, the problem with plus–minus is that a player’s teammates can affect his measure. Consequently, as a player’s teammates change, a player will see his plus-minus value fluctuate.

Yes, this is a GOOD thing.  That shows that you can’t use plus/minus in isolation, that you need to treat the individual as part of a team.  There are 10 guys on the court when things happen, when he scores, when a teammate scores, and when an opponent scores.  And not just scoring, but turnovers, shots, fouls, etc.  Field goals and assists make you hope that the individual can be easily separated from the team; plus/minus shakes you back to reality: you need to tease out the individual contribution from the team.

Plus/minus forms the basis for everything you want.  Granted, it comes at a price: an uncertainty level that is tied to how non-random your teammates on the ice or court are with you.  Ideally, every player plays with every other player an equal amount of time.  This is very similar to the WOWY (with or without you) that I use in baseball as well.  You simply need to tie an uncertainty level around the metric.

To combat this problem, people have introduced adjusted plus–minus.

Good.  Go on.

An even greater gain is seen if 5 years of player data are examined. Estimating a coefficient for 373 players who played for five seasons, we see that 38.9% of coefficients are at least twice the value of the standard error. And 50.4% surpass the 1.5 threshold. Although more data do increase the level of statistical significance, it is still the case that most players—even when 5 years of data are used—are not found by this method to have a statistically significant impact on outcomes.

That’s good to know for basketball.  That’s the kind of good stuff I’d like to see.
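
As a rough illustration of the thresholds in that passage, the share of players clearing a given bar is just a count of coefficients that are at least so many standard errors from zero.  The coefficient and standard error pairs below are invented for the example, not taken from the paper, and the function name is hypothetical.

```python
def share_significant(estimates, threshold):
    """Fraction of (coefficient, standard_error) pairs whose coefficient is
    at least `threshold` standard errors away from zero."""
    hits = sum(1 for coef, se in estimates if abs(coef) >= threshold * se)
    return hits / len(estimates)


# Hypothetical adjusted plus/minus estimates: (coefficient, standard error).
estimates = [(2.0, 0.8), (1.2, 0.9), (-3.1, 1.0), (0.4, 1.1), (1.6, 1.0)]

share_at_2 = share_significant(estimates, 2.0)
share_at_1_5 = share_significant(estimates, 1.5)

print("at least 2.0 standard errors:", share_at_2)
print("at least 1.5 standard errors:", share_at_1_5)
```

Loosening the bar from 2.0 to 1.5 can only add players, which is the pattern in the quoted 38.9% and 50.4% figures.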

Turning to plus-minus, how many points are scored and surrendered when a player is on the court should be linked to current wins. However, efforts to separate the player from his teammates have not proven successful.

In the instance he is citing.  That doesn’t mean that it doesn’t exist in some other way of adjusting for plus/minus.

Consequently, despite the popularity of these approaches among the metricians, neither is an improvement over what has been published in academic journals.

Meaning “Wins Produced”?  Which is Berri’s metric.  Again, it’s a simplistic way to look at the problem.  Say we go back to hockey, where we see someone with 50 goals and +30 and on the same team, someone with 10 goals and +30.  And we know that they don’t play on the same line.  Now, it’s likely that the guy who scored 50 goals contributed more to the +30 than the guy with 10 goals.  So, the best thing to do is think of plus/minus as adding an extra component or layer to the understanding of the player.

The same would apply to basketball.  The combination of the other stats with plus/minus will give you a “whole greater than the sum of its parts”.

The story of Linear Weight and DIPS highlights how metricians can help sports economists. Our examination of PER and adjusted plus–minus, though, suggests economists should be careful.

The second sentence should be more like “economists should wait to see what the analysts are discovering”.  I object to the idea that the economist will swoop in to save the day, as Berri seems to do with his “Wins Produced” metric.

The metricians do not provide the same peer review that academic journals do. Consequently, just because a work is accepted among the metricians, it does not necessarily mean this work would pass muster in the academic community.

No, not the same peer review.  But, neither do we provide necessarily inferior review.  In many respects, we provide superior review.  And just because it passes muster with the academic community doesn’t mean it will pass muster with the subject matter experts in the blogosphere.

Part 4

In addition to disseminating ideas more widely, blogs offer the opportunity to discuss and debate with the audience.... However, no matter the pedigree of the participants or the overall quality of ideas, these forums are no substitute for the formal peer-review process that typically governs economics research.

Sure they are.

Regardless of the usefulness of commentary, blogs should be considered a source of ideas and inspiration for further research rather than a serious research outlet.

Again, just more summary assertions without evidence.  Say it with me: bullllll-sh!!!!!t.  I consider my blog an extremely serious research outlet.  Now, I would be delusional to say that, if nothing else came of my blog.  But, that’s hardly the case.

Why can’t my blog be BOTH: inspiration for further research AND a serious research outlet?

Good ideas developed on blogs should be written up, tested rigorously, and then published through normal channels.

Why?  Why?  Why why why why why?  If I did that, if we did that, the sabermetric movement would slow down to a crawl.  Again, more asserting of their will, with no backing at all.  Indeed, JC went out of his way to discuss Linear Weights and DIPS, and those were published in non-normal channels.  Again, another bullsh!t assertion.

And I’ve seen the results of JC’s “rigorous” tests: by rigorous, he means through whatever standards his academic teaching tells him is rigorous.  Sorry, it makes no sense.

Kruger and Dunning (1999) find that many nonexperts tend to overestimate their abilities. ‘‘Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it.’’

I agree with him.... but, I think it’s JC and Berri who are the non-experts!  We who study baseball are the subject-matter experts.  When JC and Berri swoop in to tell us that they are experts, and that Francoeur’s abysmal performance is worth 12 million dollars because he happened to play 160 games, then that means it’s JC that reached erroneous conclusions and made unfortunate choices.

Dude, c’mon.

The knowledge disparity between academics and interested laymen sometimes leads to unpleasant discourse.

Knowledge disparity?  “Interested laymen”?  How about calling us “subject matter experts” or “highly motivated students”?

...some participants often confuse ease of access in the same forum as equal expertise.

Really?  You aren’t talking about subject matter experts, are you?  If you mean the public in general, well, that’s the internet, the greatest form of communication yet invented.

When nonacademics discover their ideas are not being accepted, unpleasant behavior can result.

Can you differentiate between “nonacademics” and “subject matter experts” (SME) please?  SMEs make up a tiny percentage of these people.  So, I’m not going to defend some internet flamers.

Examples of ‘‘unpleasant behavior’’ include commentators leaving essentially the same comment over and over again, often under different aliases;

No SME would do this.  So, again, you are indicting the internet.

As a consequence, some blogowners have gone so far as to remove the comments option to avoid the unpleasantness.

As far as I know, only sabernomics.com has done so.

Really, this whole section so far is just a philosophical point of view that really has no bearing on what we’re talking about.  Lumping SMEs with outright flamers is disingenuous to say the least.

Despite the behavior of commentators frequently referred to as ‘‘trolls,’’ the online interaction has proven to be net beneficial. Most commentators are pleasant and offer useful insight. Consequently, very few (if any) academics ever walk away from the blogging experience once they have developed an audience. Apparently, by their own behavior, sports economists who are aware of the costs of blogging can certainly reap many benefits; and thus, blogging can ultimately improve the research generated by sports economists.

That’s a good enough conclusion.  One must therefore ask why JC refuses to engage with other saberists.
