THE BOOK--Playing The Percentages In Baseball

Friday, October 23, 2009

Should we hold BPro accountable for unannounced changes?

By Tangotiger, 10:28 AM

Colin points out:

Baseball Prospectus has a metric called Wins Above Replacement Player, which (ideally) should be a pretty straightforward proposition. But let’s look at a few things. I have a collection of sortable WARP1 reports from October 9th, covering 1996-2009. How does it match up with what’s currently on the site?

[snipped examples]

So what’s going on here? Aggregate WARP1 per season has declined by roughly 150 over the past few weeks. That’s right - nearly 150 wins per season have simply disappeared from Baseball Prospectus’ website. Why? It seems that BPro has changed how they convert runs to wins. Before, they were using a scale where 8.2 runs = roughly 1 win. Now it’s a lot closer to 9.2 runs per win. (For a fuller explanation, check out this discussion of the subject.)

This is, by all accounts, good (it has the nice side effect of making individual player WARP sum up to team WARP, assuming a replacement level of .220 - we can have that conversation another time). Just one question - why hasn’t BPro announced this change?
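
To make the size of that conversion change concrete, here is a minimal sketch of the arithmetic. It assumes the only thing that changed is the runs-per-win divisor (8.2 before, roughly 9.2 now, per Colin's figures); the function names and the 41-run player are hypothetical, not BPro's actual method or data.

```python
# Sketch only: assumes wins = runs above replacement / runs per win,
# and that nothing but the divisor changed between the two versions.
OLD_RUNS_PER_WIN = 8.2  # figure quoted by Colin for the old scale
NEW_RUNS_PER_WIN = 9.2  # approximate figure quoted for the new scale


def runs_to_wins(runs_above_replacement: float, runs_per_win: float) -> float:
    """Convert runs above replacement into wins above replacement."""
    return runs_above_replacement / runs_per_win


def rescale_wins(old_wins: float,
                 old_rpw: float = OLD_RUNS_PER_WIN,
                 new_rpw: float = NEW_RUNS_PER_WIN) -> float:
    """Re-express a wins figure computed under old_rpw using new_rpw."""
    return old_wins * old_rpw / new_rpw


# Hypothetical player who is 41 runs above replacement.
old_value = runs_to_wins(41, OLD_RUNS_PER_WIN)
new_value = runs_to_wins(41, NEW_RUNS_PER_WIN)
print(round(old_value, 2), round(new_value, 2))
```

Under this assumption every player's wins total shrinks by the same factor (8.2/9.2, or about 11%), which is why the league-wide mean moves rather than just the ordering of individuals.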

How accountable is a site when it adjusts its metrics?  Suppose Sean Forman decides to compute his park factors a different way.  Does he need to announce this?  What if he decides to change ERA+ to the form Guy suggests (2 - ERA/lgERA) instead of lgERA/ERA?  Does he need to tell us?  What if Fangraphs updates their WPA numbers to reflect the 2009 park factors, rather than using the 2008 PF?
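
For readers who have not seen the two ERA+ forms side by side, a rough sketch follows. It leaves out park adjustment entirely and assumes the usual presentation on a 100 = league-average scale; the function names are made up for illustration.

```python
# Sketch only: no park adjustment, and the x100 scaling is an assumption
# about presentation, not something stated in the post.

def era_plus_ratio(era: float, lg_era: float) -> float:
    """Ratio form: lgERA / ERA, scaled so league average is 100."""
    return 100 * lg_era / era


def era_plus_linear(era: float, lg_era: float) -> float:
    """Form attributed to Guy: 2 - ERA / lgERA, scaled so league average is 100."""
    return 100 * (2 - era / lg_era)


# Hypothetical pitchers in a league with a 4.00 ERA.
for era in (2.00, 4.00, 6.00):
    print(era, era_plus_ratio(era, 4.00), era_plus_linear(era, 4.00))
```

Both forms give 100 to a league-average pitcher, so the mean point barely moves, but individuals far from average shift noticeably: a 2.00 ERA in a 4.00 league is 200 in the ratio form and 150 in the linear form.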

It seems to me that making adjustments to factors that still keep the mean and standard deviation roughly equal, but move individuals up and down, is one thing, and you can make the argument that a website need not necessarily make an announcement (though announcements are ALWAYS appreciated).  But, in the case of WARP, the mean point has changed.  I think that makes it announcement-necessary.  Especially since Clay wrote an article in the BPro annual about his epiphany, and announced that the mean point of WARP would be changed.  I think status updates are appreciated.  Clay is a straight arrow, so I’d say it’s more a matter of him making the announcement a low priority than of consciously dismissing the users.

Regardless, I feel for any of the researchers out there who download data, and then find that the data has been changed.
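
A researcher holding two downloads of the same report could test for this kind of shift directly. The sketch below is one way to do it, using only the standard library; the snapshot dictionaries and player ids are invented for illustration, and a real check would load the two CSV downloads instead.

```python
from statistics import mean, pstdev


def summarize_shift(old: dict, new: dict) -> dict:
    """Compare a metric across two snapshots keyed by player id.

    Only players present in both snapshots are compared. Assumes the old
    values are not all identical (otherwise the stdev ratio is undefined).
    """
    common = sorted(set(old) & set(new))
    old_vals = [old[p] for p in common]
    new_vals = [new[p] for p in common]
    return {
        "mean_shift": mean(new_vals) - mean(old_vals),
        "stdev_ratio": pstdev(new_vals) / pstdev(old_vals),
        "max_individual_change": max(abs(n - o) for o, n in zip(old_vals, new_vals)),
    }


# Made-up values, not real WARP numbers.
old_snapshot = {"player_a": 6.0, "player_b": 3.0, "player_c": 0.0}
new_snapshot = {"player_a": 5.0, "player_b": 2.5, "player_c": 0.0}
print(summarize_shift(old_snapshot, new_snapshot))
```

A mean shift near zero with a stdev ratio near 1 suggests individuals were reshuffled around the same baseline; a clearly nonzero mean shift is the kind of change argued above to warrant an announcement.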


SabermetricsData
#1    Rally      (see all posts) 2009/10/23 (Fri) @ 14:23

They are putting these stat pages up on the web to draw traffic, to get people to look at their site.  And these are all the free pages, right? If you grab all the pages and put them in a spreadsheet, you are no longer looking at their site when you want the stats for some player.

In this situation I don’t think I’d feel any responsibility to make sure the offline spreadsheets are kept up to date.  If people are using scripts to download all these pages, that may even violate the terms of service for some stat websites.


#2    Tangotiger      (see all posts) 2009/10/23 (Fri) @ 14:44

Rally: my post was asking at what point is the guy providing the data responsible for telling you when he changes how the data is derived.

Your site would be a good example.  Suppose that you decided to change how you handle starters and relievers.  You end up giving starters way more wins, relievers a bit more, and maybe position players a bit less.  Instead of a .330 replacement level, you are using a .280.

All those graphers at BtB and THT who are creating charts from your site will suddenly become… perturbed. 

Is it really only a question of your own reputation here?  Is it a service whereby you need to explain when the basis is changed?  What kind of responsibility do you feel?

The one difference here is that Clay went out of his way to write/sell an article regarding WARP changing.  That changes the level of responsibility, doesn’t it?


#3    dkappelman      (see all posts) 2009/10/23 (Fri) @ 14:52

It’s my policy that for “real” changes, things should be announced.  Updating the 2008 PF to 2009 PF is something that would be expected and not really out of the ordinary, so I’m not sure it’s necessary.

But, I agree if you’re changing something that wouldn’t be updated normally, especially something like replacement level, it’s probably a good idea to at least recognize it.  I think if the WAR data at FanGraphs changed overnight, re-ranking all the players without some explanation, people would be more than a little confused, not to mention there’d probably be some loss in credibility.  I think it’s important to be as transparent as possible in these situations.


#4    Tangotiger      (see all posts) 2009/10/23 (Fri) @ 14:56

I think a good example would be the calculation of WAR for pitchers, which is HEAVILY reliant on FIP.  If Fangraphs changes that, you would expect that to be announced, correct?


#5    Terry      (see all posts) 2009/10/23 (Fri) @ 14:58

They only need to feel accountable if they care about their credibility....


#6    Colin Wyers      (see all posts) 2009/10/23 (Fri) @ 15:41

Rally, BPro offers a link to a downloadable CSV file for any of their sortable stat reports. I didn’t spider anything - I went through and downloaded the years I was interested in one at a time, by hand.

And it’s not about keeping them up to date. I don’t think it serves the reader very well if they can’t expect the average to at least maintain some consistency on a week-to-week basis. This goes double for stats from a decade ago, I should think.

In this case, it comes out to a change in the average WARP1, from about 2.9 to 2.6 for position players with 650 PAs.

And of course BPro can put whatever they want on their website. But the lack of transparency they display on issues like this makes it harder, I should think, for readers to use the stats BPro is providing. And they do seem to be losing mindshare to Fangraphs’ stat reports.


#7    Rally      (see all posts) 2009/10/23 (Fri) @ 15:46

"Is it really only a question of your own reputation here?  Is it a service whereby you need to explain when the basis is changed?  What kind of responsibility do you feel?”

In my case, I did feel a level of responsibility, because many of them have paid me for the data downloads.  When I added the pre-retrosheet data, I sent an email out to those who had paid so they could download the new data at no additional cost.


#8    Nick      (see all posts) 2009/10/23 (Fri) @ 16:05

WPA is park adjusted?


#9    Tangotiger      (see all posts) 2009/10/23 (Fri) @ 16:18

WPA is run-environment aware.

There are two ways to handle the issue.  One is the traditional way.  The other is to ONLY consider the runs scored for a team, and figure that that’s the environment that a player plays in.

This was discussed here:
http://www.insidethebook.com/ee/index.php/site/comments/quick_park_factors/


#10    Matthew Cornwell      (see all posts) 2011/03/26 (Sat) @ 23:31

Bringing up an oldie here.  Just checked out BP and noticed that WARP totals on Player Cards have changed.  Some pitchers have changed a lot.  The new WARP pitcher totals look more FIP reliant than before, but I can’t find any information about the change. Anybody know?


#11    Ryan JL      (see all posts) 2011/03/27 (Sun) @ 00:49

Gee, thanks a lot Matthew.  I just spent the last 15 minutes trying to figure out when Colin left BPro and how I missed it.


#12    Matthew Cornwell      (see all posts) 2011/03/27 (Sun) @ 01:10

You’re quite welcome!


#13    MGL      (see all posts) 2011/03/27 (Sun) @ 01:39

This is actually a good question and something I often run into with the data and programs I provide for FG.

I can’t tell them what to do, and DKA is a great guy and a 100% straight shooter…

I often find (and correct of course) small bugs in my programs and I also often tweak my methodologies and add new things.

My opinion is that if the changes lead to significant differences in the resultant numbers, you announce it. If not (the resultant changes are small), you don’t.

On the one hand, if you are constantly announcing changes and fixes, the site and/or the metric loses credibility. On the other hand, you have a responsibility to point out if some of the numbers have changed and why.

Obviously, there are grey areas as far as what are significant changes to the numbers and what are not, and judgment has to be used.

Here is one example of what I am talking about:

UZR uses park factors.  Every year, I have more recent data on parks, and I can update past park factors, especially for newer parks.  For example, last year, I assumed that Target Field was roughly neutral (I actually made some “manual” park factor assumptions based on the characteristics of the park).  Now that I have 1 year of data, I can (and did) change my 2010 park factors for Target Field (and all other parks - although the older parks probably won’t change much if at all since I already have lots of years of data to construct their UZR PF’s).  Therefore the 2010 UZR’s will change, especially for MIN players.  However, these changes are likely to be small and I don’t think an announcement is necessary.  Perhaps a blanket explanation, like the one above, is warranted.

For a major change (one where the numbers for individual players can change significantly), even if the “mean” does not change (and I am not sure what that means), I believe that an announcement is definitely warranted. It should be treated as a good thing (complex stats should be constantly evolving), so why not?


#14    MGL      (see all posts) 2011/03/27 (Sun) @ 01:39

Funny, I also assumed that this was a recent thread, and I was wondering what Colin was doing bashing BP!


#15    Matthew Cornwell      (see all posts) 2011/03/27 (Sun) @ 08:37

This was the only discussion that I could find that was dealing with unannounced changes at BP.

So despite the confusion I caused by bringing back an older thread...anybody know the answer to my question?


#16          (see all posts) 2011/03/27 (Sun) @ 11:32

There was a mention that WARP had changed when PECOTA was first released 6 weeks ago:

“After that, we’ll be publishing the PECOTA cards, featuring perks like the percentiles and the ten-year forecasts. We’ll update you more as we get closer to that point.

We’re also revamping the cards to use the new Wins Above Replacement Player model we’ve developed. PECOTA has already been adapted to use the new WARP, so WARP baselines have shifted a bit from what you’re used to seeing. The biggest change is among relief pitchers, who take a major hit. Please keep this in mind as you review these forecasts. We know that many of you are relying on these forecasts for your fantasy teams, and we thought that it was better to get the forecasts out now rather than wait for when the entire site was ready to transition to new WARP.”

This might be what you are looking for:
http://www.baseballprospectus.com/article.php?articleid=12377

Unfortunately, the article about pitching and WARP never seems to have happened, much like the 10 year forecasts.

