THE BOOK--Playing The Percentages In Baseball

Thursday, February 25, 2010

EqA renamed TAv (True Average)

By Tangotiger, 02:50 PM

Here’s Jay:

Ilya: Why is TAv better than wOBA?

Jay: 1. The fact that the stat is scaled to batting average makes it easier for the average fan to understand than wOBA being scaled to OBP. “.300 is good” is a notion with over 100 years of baseball history behind it.

2. EqA is park adjusted, wOBA isn’t, at least as I understand it.

3. The two have virtually identical correlations to runs scored, but TAv produces a smaller RMSE. I’ll leave the defense of that statement and the grisly math to Clay Davenport, who’s got data showing that. He’ll have an article on the topic soon once he gets the PECOTA cards up, but perhaps I can get him to chime in here as well.

Me:
1. This is a feature of wOBA, not a bug.  Some people prefer the BA scale and others prefer the OBP scale.  That doesn’t make one “better” than the other. I can just as well show wBA by dividing by 1.15 instead of multiplying by 1.15.  It’s a choice I make.  It’s different, neither better nor worse.

2. wOBA as presented on Fangraphs is not park-adjusted. wOBA as presented on StatCorner.com IS park-adjusted.  People can use whichever they prefer.

3. If I was interested in having the lowest RMSE at the team-level, I would come up with unacceptable weights.  That’s not the point.  Colin Wyers understands the point, and I hope he’s involved in trying to prove that EqA is “better” than wOBA.  At least, he’ll prove it at the game or inning level, which is where the real test should take place, not the antiquated team-level.

If you are going to call something better than wOBA, give me a chance to rebut please.
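
For readers unfamiliar with the RMSE comparison in point 3, here is a minimal sketch of how a team-level test of that kind might be computed. The run totals and both sets of estimates are invented for illustration; they are not real EqA or wOBA results, and `metric_a` and `metric_b` are hypothetical names.

```python
import math


def rmse(estimated, actual):
    """Root mean squared error between estimated and actual run totals."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical team-season run totals, invented for illustration only.
actual_runs = [700, 750, 800]
# Hypothetical run estimates from two competing run estimators.
metric_a = [710, 740, 805]
metric_b = [690, 765, 790]

print("RMSE, metric A:", rmse(metric_a, actual_runs))
print("RMSE, metric B:", rmse(metric_b, actual_runs))
```

A lower value only means the estimates sit closer to actual runs on that particular sample. As point 3 argues, the lowest team-level RMSE is not by itself the goal.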


#1    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 15:36

BP = BS.

http://www.hardballtimes.com/main/blog_article/is-eqa-better-than-woba/

Colin found that wOBA beat EqA in the period 1993-2008 at both the team and game level.


#2    David Cameron      (see all posts) 2010/02/25 (Thu) @ 15:43

When they made some good hires a few months ago, I had some hope that BP had seen the error of their ways, and were actually going to make some strides to fix the problems that have plagued them the last few years. 

Unfortunately, they’re getting more annoying by the day.  Between the crap Matt’s been spewing lately and this, it seems like BP’s tactic of choice is to try and assail FanGraphs, rather than catch up to us. 

SIERA is not better than xFIP.  EqA, or whatever they want to call it, is not better than wOBA.  PECOTA isn’t better than CHONE or ZIPS.  Hell, it may not even be better than Marcel anymore. 

Seriously, BP, you guys want to call yourself the leaders in baseball analysis? Do something interesting.  Making illegitimate claims about your mediocre metrics doesn’t count. 

/end rant.


#3    Michael      (see all posts) 2010/02/25 (Thu) @ 15:52

Isn’t EqR essentially linear? If so, why even bother with the conversation? The differences at this point are going to be very, very small, there’s really no reason to argue for or against one.

To me, I see it as a difference of scale: one is at BA, the other at OBP. To claim one is “better” in the way Jaffe did it seems unnecessarily disparaging.


#4    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 15:52

Dave/2, I agree 100%.  In addition to it being annoying, I find it disgusting.

It’s unfortunate they took Colin into the fold because he was the one who could really lay bare all this nonsense.  I’m not quite equipped to show them for the snake-oil salesmen they are.  But I wish somebody with the reputation and analytical chops could take them to task publicly.


#5    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 15:57

Yowza. 

***

First, I wince at the declaration of something better than the other, simply because I’ve seen it all and tried it all, and the best you are going to get is the slightest of marginal improvements.  I would hope that that is the prevailing spirit in which all new stats are introduced or old stats are reintroduced.  We’ve come a long way from 10 years ago.  We made big gains over that timespan, and now the gains are baby steps.  And that’s all they will be if we keep looking at the same data.

***

Also, I don’t think it’s a fair characterization of Matt.  I think he has gotten ahead of himself a bit with a conclusion here and there, but I presume I’ve been there, done that as well in the past.  As long as we can all talk about this, then we can easily move forward on any issue.

***

I’d be more than happy to act as a consultant, to anyone, be it at Fangraphs or BPro or B-R.com or elsewhere, and do it for free. 

It’s for the same reason that MGL was trying to get us to produce The Book for free(*): he was tired of seeing substandard discussions on topics and he wanted to give everyone the data and analysis to move forward.  He simply wanted to get the education level so high that we don’t keep going back in circles bouncing off walls just to still be in the same spot where we started.

(*) We explained to MGL that we needed to justify to our spouses the hundreds of hours on the book, and he relented, and allowed us to charge for the book.

That’s what I’d like to do.  It’s just so simple and obvious that a consensus can be reached, with the barest of disagreements on the periphery.

Let’s move forward on these issues.  I’m ready to do it with anyone.


#6    Sky      (see all posts) 2010/02/25 (Thu) @ 16:19

What I like about Fangraphs is that they aren’t trying to sell their stuff, for the most part.  David’s not inventing metrics and trying to sell them as good, he’s taking good things that are out there and putting them on Fangraphs.

Now, for stats like UZR, it probably takes some money to buy the framework, and it’s essentially proprietary to Fangraphs now.  But there hasn’t been a campaign to show that it’s better than +/- or PMR.  In fact, David’s quite open about its limitations.

Most of the stats, like WAR, wOBA, WPA, etc., are available for anyone to implement.  They’re a combination of a framework and number crunching.  I can see how BPro might feel it needs to use its own stats because they’re for-pay, but I don’t think that’s necessarily true.  The interface and willingness to change are huge features of FanGraphs.  I’d probably even pay some money to use it (although not what BPro charges).  If someone came along and provided more stuff in a better way, then I’d start visiting them and paying them.

Kudos to BPro and Marc Normandin for picking some excellent new fantasy writers.  I do think BPro and Fangraphs both miss the boat a little bit with their fantasy analysis, though.  It’s mostly player analysis focused on fantasy stats, with less than I’d want in terms of strategy/valuation.  I think Derek Carty and his gang at THT and the various projects Todd Zola’s been involved with provide a lot more of what I find interesting and can be used to win leagues against others who know as much about player evaluation as I do.


#7          (see all posts) 2010/02/25 (Thu) @ 16:20

[1] As far as creating a metric publicly accepted, I agree that a stat on the Batting Average scale is more likely to gain traction.  Whether you want to call that “better” or just “realism”, I don’t care.

[2] Maybe I’m old fashioned, but I prefer stats that have one definition per name.  It is quite confusing that when you hear someone reference wOBA, you don’t know whether it is park adjusted without further clarification.  Sure, that isn’t a problem in rigorous discussion, but it is unwieldy for public outreach.  Again, whether you want to call that “better” or just “realism”, I don’t care.

One other note: this discussion was in the comments of Jaffe’s article; it’s not like he set out to bash wOBA.  Personally, I like wOBA more than EqA, but the difference is minuscule.  If I were ruler of the sabermetric universe and I was faced with creating a one-stop stat to value hitting for public consumption, I’d just scale wOBA to the batting average scale and give it a catchy name.  But really, for what the purpose is here, re-branding EqA will do just fine.


#8    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 16:24

Let me clarify my agreement with Dave/2 re Matt.  I’ve been disappointed with how the work on SIERA has been done and presented.  It’s been done and presented better than a typical BP stat, but it’s still been too much about proving how BP is the leader and better than everyone else.  I don’t know if I’d use Dave’s words to describe that, but I agree with his general sentiment.


#9    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 16:31

By the way, I’m all in favor of critical comments.

However, when it comes to regulars or semi-regulars, people who have posted here, the comments should not be so strong as to make it offputting for them to post here.

I would hate if it came to the point that Matt, Colin, Pizza and the others would stop posting here because the climate is not conducive for them to post here.  At the same time, I wouldn’t want Dave Cameron to feel that he has to hold back because he’s scared of looking too much of a bad guy.

I think it’s a fair point that the stench of the old BPro somehow gets placed on the new blood, without the new blood deserving it.

All to say, let’s be tough, but fair.


#10          (see all posts) 2010/02/25 (Thu) @ 16:33

Mike/#8:

they HAVE to present it like that.  if they don’t they will lose subscribers by the truckloads. 

i myself do not have a BP subscription.  i feel i can get even level analysis from fangraphs/BTBS/here rather than having to pay for it.  the metrics are equal and generally displayed better. 

there is almost zero reason for anyone to have BP subscription.  But as a business they can’t let their subscribers know that.  you can’t expect to charge people for metrics/analysis when they say “this is equal to so and so.”

and to be fair to Matt.  i’m pretty sure he wasn’t the one who goofed on the xFIP error.  so killing him isn’t the best thing to do.


#11    Rally      (see all posts) 2010/02/25 (Thu) @ 16:33

Dave, Mike,

Totally agree.  Add more and more to the alphabet soup of stats, just so you can claim (perhaps falsely) that it’s .005% better than someone else’s stat.  This is not what the world needs.

The studies on Beyond the Boxscore using Josh’s wonderful injury database are interesting, groundbreaking.

At this point trying to add that extra .001% to estimating how many runs Joe Schlabotnik was worth is nothing but statsturbation.


#12    Mike Rogers      (see all posts) 2010/02/25 (Thu) @ 16:37

I agree with Sky/6: B-Pro seems wayyyy too concerned with trying to sell people on their stats for my liking. Fangraphs/the Daves are really good at just laying out the stat, its limitations and positives, and letting you make the decision on whether it’s worthwhile. That’s how it should be.


#13    Patriot      (see all posts) 2010/02/25 (Thu) @ 16:55

One thing that is glossed over in these discussions is how EqA (sorry, TAv) is scaled to BA versus how wOBA is scaled to OBA.  As I’ve said before, I don’t like either scale--I prefer to keep things in terms of runs--although I understand why Clay and Tango have chosen their respective scales. 

The conversion between wOBA and a useful run unit is 100% linear: (wOBA-LgwOBA)/1.15 = RAA/PA.  Converting EqA to runs/out also takes two operations, but one is a power: 5*EqA^2.5 = R/O.

The power transformation surely matches the BA scale better, but it’s a little bit of overkill.  A linear regression should fit the bill just fine.
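
The two conversions described above can be sketched in a few lines. This is only an illustration of the formulas as written in this comment; the function names and the sample inputs (.380, .330, .260) are assumptions chosen for the example, not figures from the thread.

```python
def woba_to_raa_per_pa(woba, lg_woba, scale=1.15):
    """Linear conversion: (wOBA - LgwOBA) / 1.15 = runs above average per PA."""
    return (woba - lg_woba) / scale


def eqa_to_runs_per_out(eqa):
    """Power conversion: 5 * EqA^2.5 = runs per out."""
    return 5 * eqa ** 2.5


# Illustrative inputs only (hypothetical hitter and league values).
print("RAA/PA:", woba_to_raa_per_pa(0.380, 0.330))
print("R/O:", eqa_to_runs_per_out(0.260))
```

The first function is a subtraction and a division, so equal gaps in wOBA always map to equal gaps in runs. The second raises EqA to a power, so the same gap in EqA is worth more runs at the high end of the scale than at the low end.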

In any event, I don’t care what they want me to call it, I’m going to keep referring to it as EqA.  TAv is too close to Total Average, which has been around for 30 years.


#14    David Cameron      (see all posts) 2010/02/25 (Thu) @ 17:01

Sorry if my first post comes across as hostile.  It’s just a little frustrating to see BP continually tell their subscribers things that are just substantially not true.  There’s been a lot of that in the last week or two.


#15    Red Sox Talk      (see all posts) 2010/02/25 (Thu) @ 17:20

I agree with the substantive arguments above, there’s no clear reason why TAv should be considered “better” than wOBA, though they’re both pretty darn good. This is the marketing team at BP gone a little wild, I think.

I also think they switched the metric to a worse name, because there’s nothing “false” about batting average, other than a guy who benefits from a few extra bloops or gets an unusual number of well-hit balls caught for outs. Batting average is not meant to describe offensive run values, so in that sense “equivalent run average” was a much better name.


#16    Graham MacAree      (see all posts) 2010/02/25 (Thu) @ 17:26

This all seems to come down to whether you build your stats to make some logical sense or to hit some quasi-arbitrary statistical benchmarks.

I know where I stand on that one.


#17    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 18:07

Speaking for myself, I liked Equivalent Average as the name.  Had it not been used, I would have called my stat that, cause that’s what it is.  It recasts the entire hitting profile into something that is for a player with that batting average and a proportionate OBP and proportionate SLG.  EqA is a great name.


#18    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 18:24

Can anyone tell us non-subscribers if Jay gives any further reason why they don’t want to use the Equivalent Average name any more other than this marketing mumbo-jumbo?

“This spring, we at BP have chosen to rebrand EqA as True Average (abbreviated TAv). Why? Because we feel strongly that the new name underscores our ability to get a ‘True-r’ grasp on the quality of a hitter than the aforementioned traditional or more modern stats do. Quite frankly, we’re hopeful that this simple, easy-to-remember name can reach a wider audience.”

That argument makes no sense to me.  I at least saw EqA used a few places outside BP.  I can’t imagine that re-branding as TAv is going to get them any additional traction.  If anything, it just obscures the origin of the stat.

It does conjure up some connotations with Boswell’s Total Average.  Maybe that’s a plus for them in the mainstream?  Maybe, although I doubt it.  In the sabermetric world it’s certainly not.

I don’t think I’ve invented any stats, but if I had and I’d given them a name and it had been out there for several years, I don’t think I’d rename just to get a slightly different acronym.

I suppose this is part of Will Carroll’s “Be Stupider” campaign?  Fail.


#19    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 18:48

Speaking of LIPS over in the other thread, this is an example of a good reason why you change a name of a stat:

“In light of this, we must re-assess Voros’ spectrum of what a pitcher can and cannot control. Rather than giving a pitcher credit for his strikeouts, walks, hit-by-pitch, and home runs and ignoring everything else as Voros did, we want to give him credit for his strikeouts, walks, hit-by-pitch, infield flies, outfield flies, and ground balls, while ignoring or adjusting everything else. At this point, we are not just removing defense from the equation, but luck itself, which is why I eventually changed the name of my statistic from ‘DIPS 3.0’ to ‘LIPS.’”

http://www.hardballtimes.com/main/fantasy/article/explaining-lips/


#20    tangotiger      (see all posts) 2010/02/25 (Thu) @ 19:11

Dave, cool, thanks.  I hope I didn’t come across as picking on you or anyone.  I value the posts of all you guys, and it’s a pleasure to come here and see everyone hash things out so that we actually get somewhere.


#21    Zach      (see all posts) 2010/02/25 (Thu) @ 21:00

I also don’t like the name True Average or the acronym. “True Average” has the connotation that it somehow adjusts BABIP luck or LD luck. It seems as if BP wanted to re-popularize EqA, but realized the “best” way was to bring it back with a new name.


#22    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 21:07

To me, “True” doesn’t work, because I always use it in conjunction with “True talent”, which means performance without the noise.

I didn’t think of the Total Average issue, but, yeah, that applies too.

Equivalent Average was a good name.


#23    Tommy Bennett      (see all posts) 2010/02/25 (Thu) @ 21:31

This thread makes me a little sad. I really don’t see this as something getting worked up about--and certainly not worthy of the level of venom I’ve seen expressed in the comments here. I’m relatively junior so I can’t speak for anyone but myself, but it strikes me that most of this debate is just petty. Why don’t we just talk about baseball?


#24    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 21:52

Tommy, it gets really old to see BP trumpeting themselves at the expense of the rest of the sabermetric community over and over again.  It also bugs me to see what I perceive as a shell game going on at BP.  Pay no attention to all of our issues, look at this fancy new-named stat in the hand over here!

I flatly disagree with the idea that you and Matt express that this is simply personal venom and nothing to do with facts and nothing to do with BP’s behavior.

Can you justify why BP gave a platform to JC Bradbury to shill his peak age study?  And then why did Will turn around and brag about how this was evidence that BP was leading the sabermetric discussion?  Are we not supposed to take that as a direct slap in the face of the sabermetric community?

And how can BP put out the Matt Wieters projection that it did last year, have it thoroughly debunked by (now BP’s own) Colin Wyers, and follow up on next year’s Annual with a picture of Matt Wieters right over the words claiming “deadly accurate PECOTA projections”?  At no point did BP admit an error.  Is that not basically a claim that BP doesn’t care what the facts are if they get in the way of making money?

And when people at this site identify an issue with this year’s PECOTA, BP doesn’t have the courtesy to identify who found the problem and provide a link to the thread here.  Instead, “we’ve seen statements on the internets”??  At what other major sabermetric site would you get away with that sort of attitude?

Yes, there’s a lot of good stuff that goes on at BP, and a lot of good people there.  But there is also a definite current of “Damn the (sabermetric community) torpedoes!  Full speed ahead!” That irks a few of us, to say the least.


#25    Tommy Bennett      (see all posts) 2010/02/25 (Thu) @ 22:05

I’m not in the business of justifying anything. I’m just a writer, and by no means am I the smartest guy in the room. But my general attitude is to be as kind as I can be. I think very highly of what you have written, Mike. I appreciate the work of Tango (indeed, that’s why I’m here) and many others in this thread.

I refuse to say anything disparaging about anyone else’s work. All of us love baseball and work hard to enhance others’ enjoyment of it. The amount that we agree on outweighs the things we (apparently) disagree about by so much, that it just strikes me as silly not to share in what ought to be one of the most enjoyable times of the year.

I am probably too naive to realize what more is going on here, but I’ve been reading BP, THT, and Tango for a long time, and I’ve always enjoyed smart baseball writing in all quarters. It really is just that simple for me.


#26    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 22:07

“I really don’t see this as something getting worked up about”

Seeing that you have someone who used to be a BPro subscriber for some 6+ years, you should not be projecting how you feel on the matter. 

The correct thing to do is to accept Mike’s frustrations on the matter as genuine and relevant, and ask
1. how is it that he sees it this way
2. is there anything BPro can do to change that perception?

Indeed, this whole thread should be taken as a learning opportunity, not one where you would have preferred it didn’t exist.  You should be thankful that it exists.


#27    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 22:10

Ditto Tommy/25.


#28    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 22:27

Tommy, I appreciate your comments in #25.

I think what comes across as anger and venom is perhaps better seen as what Tango correctly called it: frustration.

Why does BP refuse to work with and take the input of the broader sabermetric community outside its walls on so many things?  It didn’t used to be that way there, even just a couple years ago.

The sabermetric community has made so much progress in the last decade or so, and I believe that cooperation and sharing of ideas and data is what spurred that.

When a group becomes isolated and defensive of its ideas, as I have seen BP do in the last year or two, those ideas get stale and stagnant.

And I probably let that bother me too much personally.  I should just walk away and let BP say and do whatever they want.  I had basically come to that point last year and was at peace with it.  But then BP went and hired several of my friends, and I let my hopes get raised that things were changing for the better.  So I find it particularly frustrating to see signs that the old behavior is as entrenched as ever there.  I know I need to just let it go again and let BP do whatever BP wants and not let it bug me.

I don’t want to make enemies of the people that write and work there.  I still consider many of them friends.  I still enjoy a lot of the free writing that’s available there (shout out to David Laurila!) and the chats and a few of the stat reports.  I need to just take it at that and quit expecting BP to be what it used to be for me.


#29    Nick Steiner      (see all posts) 2010/02/25 (Thu) @ 22:31

Ditto Tango/26

I think this actually might be a nice place to voice constructive criticism of BPro.  I have similar (although probably not as strong) feelings as Mike, and am a former BPro subscriber who let my subscription lapse as well.  So here are my suggestions for how BPro can better their standing with the readers of this blog:

1) Link around the web - a lot.  Tommy used to be great at this at BtB.  Links help the conversation flow from site to site, so that people are not only doing their own stuff, but integrating it with the current research out there.  At THT, I link to plenty of other sites in my articles as do other authors (Mike Fast’s DIPS post is a perfect example of that - look at the end).  FanGraphs and THT both have links to each other’s websites on their front page.  BtB is great at this as well.

BPro shouldn’t be trying to lead the discussion - they need to actually be a part of it first, and the best way to do that is to link to other places and integrate their work with outside research.

2) Get more into Pitch f/x.  Really, it’s the future of Sabermetrics.  As great as having a DIPS estimator .1 points of RMSE better than FIP is, the real advancements are coming from pitch by pitch data.  Eric Seidman did some good work with velocity, but I haven’t heard anything else with Pitch f/x that BPro has done. 

Colin creating a fielding metric from the base up is a great series, and is truly helping to advance Sabermetrics.  Pitch f/x is the other thing. 

3) Get rid of VORP and all of the other obsolete metrics out there.  Seriously, they are useless.  EqBRR is great.  EqA was a perfectly acceptable alternative to wOBA.  The playoff odds and the adjusted standing reports are great.  SIERA is great. 

The rest of it, VORP, LEV, WARP, RARP… are either obsolete or just flat out wrong, and there is no need to have them still on the website.


#30    Tangotiger      (see all posts) 2010/02/25 (Thu) @ 22:36

“I had basically come to that point last year and was at peace with it.  But then BP went and hired several of my friends, and I let my hopes get raised that things were changing for the better.”

Two things:
1. The Five Stages of Grief

2. Kirk: “Forgive me Mr. Spock, I sometimes expect too much of you.” Bonus point to whoever can tell me the episode and location/setting of him uttering that line, or something close to it.


#31    Tommy Bennett      (see all posts) 2010/02/25 (Thu) @ 22:37

@Nick/29 #1: Can I ask you to be patient on this one for another couple weeks? :)


#32    Mike Fast      (see all posts) 2010/02/25 (Thu) @ 22:43

Tango/30.1, you’ve told me that before.  It was good advice then, and it is good advice now.

I had to look up the Star Trek quote, so I won’t spoil the answer for the rest unless they want to follow the link.
http://www.imdb.com/title/tt0708455/quotes


#33    Nick Steiner      (see all posts) 2010/02/25 (Thu) @ 22:51

Yeah!


#34    Terry      (see all posts) 2010/02/25 (Thu) @ 23:22

I’m left wondering why I should pay for BP when fangraphs doesn’t require a subscription.

Seriously.


#35          (see all posts) 2010/02/26 (Fri) @ 02:17

There are many very good analysts writing in this thread, so let me just say up-front that I’m just a fan.  I don’t have the math background to assess TAv/EqA versus wOBA or SIERA versus xFIP.  I do have a great interest in all of it, though, which began with my purchase of the Bill James Abstracts in the early 80’s.

I have been a BP subscriber for many years, but it’s no longer my “go to” site like it once was.  I check this blog and FanGraphs multiple times during the day, then The Hardball Times, with BP falling to 4th on my list.  Obviously, in my opinion, BP just isn’t as good as it used to be and these other sites have passed them by. 

This year was the first in many years in which I didn’t purchase their annual (I chose to spend my money on the Hardball Times annual and will purchase the Fangraphs Second Opinion PDF).  I have no confidence in PECOTA after the last two years and the problems they’ve had this year.  I won’t use their PFM this year for my fantasy leagues.  Every time an article uses PECOTA as the basis for their analysis (such as the recent fantasy positional rankings and the division previews), it just makes me question the entire article. 

As for SIERA, I read the 5-part series and was initially impressed with the results of their testing of SIERA versus the others, while being surprised that xFIP fared so poorly.  When the follow-up post came that they had made an error with xFIP and that SIERA and xFIP were pretty much the same, I thought, “Wow, so they did a ton of research and background work, then posted a 5-part series about this wonderful new metric, and it’s basically as good as one that already existed.  BFD.” (To be fair to Matt, his point about both having their benefits is probably true, so there may be some value there, just not as much as I initially thought.  Also, I go to Fangraphs for player stats, because the BP stat pages are ridiculously poor, and Fangraphs has xFIP, so I don’t know how much use of SIERA I will make anyway.)

Like Terry/34, I am wondering if I should continue to subscribe.  I guess I’m an optimist and I hope things will get better.  They’ve added many new writers and will greatly expand their fantasy coverage, which has been quite pitiful over the years.  My subscription doesn’t expire until March of 2011 (I subscribed for 2 years last time), so they have a year to win me back.


#36    Drew      (see all posts) 2010/02/26 (Fri) @ 02:24

I subscribed to BP for about ten years, and let my subscription lapse recently.  I left, like many others have, because I saw no reason to pay for something I could get here or at THT, Fangraphs, BtB, Baseball Analysts, etc.  That, and all my favorite writers there are now gone. 

Ultimately, BP acts the way they do because they charge people for the content.  They HAVE to be more opaque and claim that their content is unique/superior.  Otherwise they would never be able to get people to pay for it.  It’s similar to Linux/Microsoft.  If Microsoft made their OS open source, no one would ever pay for it.  That might be a little harsh to BP (implying that they put out as bad a product as Microsoft), but I think it’s apt.  Part of the reason BP has fallen behind is that no one has been able to peer review their work and tell them how to improve it.


#37          (see all posts) 2010/02/26 (Fri) @ 02:39

To me, a term like “True Average” (or Total Average for that matter) has an implication that its promoter believes he has reached the end of history, all other metrics are superseded and this is all you’ll need, now or in the future.  If you understand the history of baseball statistics, you know that such claims have been made time and again, and of course have never been proven true. 

So I get turned off by the term even before I learn that it’s just EqA in another guise. 

Relabeling always makes me suspicious that the promoter of the relabeled product is out of ideas.  I understand that a paysite needs to spend some time working on marketing, but is there anyone who could possibly react to the change of the name of a statistic by going, “Damn, I need to get out my credit card and subscribe!”?


#38    John Walsh      (see all posts) 2010/02/26 (Fri) @ 05:41

I’m with those who don’t understand people getting so worked up about BP’s way of doing business (especially Mike, who I’ve found to be one of the most level-headed, respectful and admirable guys in the online sabr community).  I think the key word is “business”, which is what BP is.  The main goal of BP, I would venture, is not to make fair comparisons of metrics or take part in the sabr community or advance sabermetrics in general.  Their main goal is to make money. 

The BP Annual is currently #21 on the Amazon top sellers list.  That’s not #21 among baseball or sports books. It’s #21 in all of Amazon.freakin.com.  To sell all those books, BP has to sell their metrics, their writing, their analysis.

I think they renamed EQA to TAv because the latter is easier to say and remember. That’s all.  (It is part of “Be Stupider”.) And they claim that TAv is better than wOBA much in the same way that Chevy says its trucks are better than Ford trucks.  You’re not supposed to believe them when they say that. Nor get angry about it.  That’s the way I look at it.

BTW, this is my view of BP’s philosophy as a whole, obviously individual writers may have different viewpoints.  But, I’d guess that BP has a policy on this that writers are expected to adhere to.


#39    Tangotiger      (see all posts) 2010/02/26 (Fri) @ 09:14

“...in the same way that Chevy says its trucks are better than Ford trucks.  You’re not supposed to believe them when they say that. Nor get angry about it.  That’s the way I look at it.”

That’s an interesting way to look at it.  You could apply it to “deadly accurate” as well I suppose. 

I’d be happy to ignore it, if some of the zombie readers weren’t so quick to repeat it.  It bothers me when, say, Chone is put down in the BPro comments, even though Chone is the one that has done better in the head-to-head testing for at least 3 years now.

So, it’s not so much that they say it, but that others believe it and propagate it.

In the same vein that Matt says that you are better off with both SIERA and FIP, aren’t you better off with Chone AND Marcel AND ZiPS (what Fangraphs shows), rather than JUST PECOTA?  Why, for example, is Marcel not on the BPro pages? Especially considering that Marcel beat PECOTA.


#40    Tangotiger      (see all posts) 2010/02/26 (Fri) @ 09:14

I mean Marcel and PECOTA.  Beer and tacos.


#41    studes      (see all posts) 2010/02/26 (Fri) @ 09:54

Wow, I just saw this in one of their SIERA articles:

Additionally, as it pertains to xFIP, we spoke with Dave Studeman of The Hardball Times in order to determine that the expected number of home runs to be substituted into the FIP formula is to be calculated through home runs per outfield flies, not the sum of those and popups.

No one at BPro has ever contacted me about xFIP--I left a comment in their first article about it. I don’t appreciate it when people imply that I’ve been involved with something when it’s not true.  Very uncool.


#42    Eric Seidman      (see all posts) 2010/02/26 (Fri) @ 10:20

Dave, I have all of the e-mails we sent to each other. I can understand if you don’t remember as it was a while ago, but we spoke, and I can prove it, and I also don’t appreciate allegations.


#43    Harry Pavlidis      (see all posts) 2010/02/26 (Fri) @ 10:36

If Ford tried to tell me that they had statistical/scientific/mathematical “proof” that their truck was better than my Subaru, and it was total BS, I wouldn’t read that as “marketing”. There is a thin line between marketing/advertising and flat-out lying. That’s crossing it.

If I then told Ford I had an issue with their claims, but they tell me that Studes can back them up, when I know he has never worked for Ford ... then, well, I think that says everything about Ford that I need to know.


#44    Harry Pavlidis      (see all posts) 2010/02/26 (Fri) @ 10:41

Obviously I wrote 43 before 42 posted (I’m late to the thread and just read it, I guess it took me >16 minutes to do so) .... so I’m quite curious to hear Dave’s response to 42.


#45    studes      (see all posts) 2010/02/26 (Fri) @ 10:46

Eric reminded me that we did speak about xFIP last September.  Obviously, I don’t remember that and it certainly doesn’t flow in the logic of their articles. But I apologize for the allegation.


#46    Harry Pavlidis      (see all posts) 2010/02/26 (Fri) @ 10:47

The beauty of the internet. Mistakes can be made quickly, and corrected quickly.


#47    J. Cross      (see all posts) 2010/02/26 (Fri) @ 10:48

My BP subscription is up in a few days and I *will* renew for a few reasons:

1. Colin Wyers’ articles (pre-BP, I know) showed me (and I’m basically a computer illiterate) how to get MySQL on my computer and start playing around with PITCHf/x and Retrosheet data and I’m having a lot of fun with it.  Summer project: learn Perl or R.

2. Russell Carleton is doing pretty cool stuff.

3. I need Kevin Goldstein to tell me which teenagers to draft and stash in the minors in my ridiculous fantasy league.

4. I want to be able to give the authors a hard time by posting comments.


#48    Harry Pavlidis      (see all posts) 2010/02/26 (Fri) @ 10:52

My quick two cents, and then back to writing about Mets’ prospects ....

BP is damaging their brand. It is so highly regarded in the front office and MSM they’ve lost track of their base.

The recent hires seemed like a huge step in the right direction, but there have been mistakes (as debated in this thread, and others I’m sure) that have impacted many members of the analytical/grassroots community.

Obviously, I have no idea of the market segments that BP gets their revenue from. Or even their traffic, or comments. But my perception is the community is their base for active readership and writing talent. And, right, wrong or indifferent, they have done damage to their brand amongst that segment. I can’t see that being healthy mid- or long-term for BP, no matter how well their book sells today.


#49          (see all posts) 2010/02/26 (Fri) @ 11:00

When I first saw this, I thought BP had written up a whole article about how their stat was better than wOBA.  That would’ve been particularly offensive.  It turns out they wrote an article about EQA’s name change to True Average and why it was good, and the first comment at BP was “why is this better than wOBA?” That’s a fair question, of course, but they cannot possibly say “well, it isn’t.” So they gave three reasons.  1 is totally cosmetic (though I agree with it).  2 is true of one version of wOBA but not the other.  3 I don’t understand.

I don’t like the new name.  EQA was a good one, for one thing.  For another, I loathe the use of “True” because it just strikes me as really arrogant.  Just a personal pet peeve.

I may let my subscription lapse the next time it comes up.  I renewed recently, right before they announced Sheehan was leaving.  I liked his writing.  I still enjoy reading Kahrl, Steve Goldman (though I can read him at Pinstriped Bible) & Kevin Goldstein.  Maybe the new talent will end up keeping my interest.  Maybe not.  But no longer do I look at their projections & stats as the best.  For projections I look at CHONE and CAIRO (I spend too much time at RLYW, what can I say).  For stats… Fangraphs.  For analysis, all over the place, including BP.


#50    WTE      (see all posts) 2010/02/26 (Fri) @ 11:05

What’s annoying to me is that BPro isn’t leading any discussion at all. They’re scrambling like hell to catch up to what others have made freely available—and then trying to cash in on it, substituting marketing for work on a better product and truly open discussion.

Aren’t they now pretty much exactly what Bill James used to complain about with Elias all the time back in the 1980s?

At any rate, my BPro subscription lapsed two years ago, and I haven’t seen any reason to renew.


#51    mb21      (see all posts) 2010/02/26 (Fri) @ 11:14

So BP has officially decided to take the path that is most accepted.  They changed EQA to reach more people and believe a stat scaled to avg is superior to one scaled to obp because more people will accept it.  They’ve become Bill James with one very notable exception: Bill James’ writing is far superior to that of BP. 

I’m not a sabermetrician so I must have missed the memo about the importance of sabermetrics becoming widely accepted by average fans rather than the importance of quality work.


#52    Rally      (see all posts) 2010/02/26 (Fri) @ 11:52

Since the lead-in is about comparisons to wOBA, I thought they must have a new formula here, since I thought Colin had already shown that EqA is not better than wOBA.

Not even a new stat, just the same formula in use for years with a marketing twist?  Now that is pretty sad.  I’m at a loss for a precedent in sabermetrics, of a stat that has been around for over a decade being renamed just because it might test better in a focus group, instead of testing better in a spreadsheet.

I’ve never considered subscribing before.  It’s not something I’m very likely to do, if I can’t find something for free on the internet that I want I’m more likely to create it myself than buy it.  It is tempting to continue reading Colin and Pizza, but the marketing wing is a big enough turnoff to me that my dollars are safe for now.


#53    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 13:23

Major kudos to Clay and BP for discussing the details of the fixes they are making to PECOTA in an Unfiltered post today.  That is EXACTLY what I have been hoping and advocating that they would do if we’re to have any faith in PECOTA.

http://www.baseballprospectus.com/unfiltered/?p=1521

I’m still reading through it, but I’m laughing at this one:

I can readily believe that teams would be stupid enough to start Sucky Player A in April, but don’t think that they’ll continue sticking with him in July.

Yuniesky Betancourt, anyone?


#54    berselius      (see all posts) 2010/02/26 (Fri) @ 13:34

Honestly, I was really disappointed in Clay’s post. My initial take on what he said about the depth chart stuff is just that, well projecting teams is hard and he seemed to just throw up his hands. It just doesn’t make sense to me. You can’t reconcile the playing time projections that PECOTA generated with the depth charts and playing time estimates, so a player’s PECOTA batting line obviously isn’t going to match up with a team projection of total offense, pitching, etc. I don’t see why they can’t just use the depth charts and the PECOTAs as inputs to a season simulator anyway. Maybe I’m just ignorant here, but isn’t that what the other team-level standings predictors do?
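
As a rough illustration of the idea in this comment (feeding depth-chart playing time and per-player projections into a team-level estimate), here is a minimal Python sketch. It is far simpler than a real season simulator: it only weights each hypothetical player's projected run rate by his assumed plate appearances, then converts team runs scored and allowed into an expected record with the Pythagorean formula. The player names, rates, plate-appearance totals, and function names are all made up for illustration, and nothing here reflects how PECOTA or BP's depth charts actually work.

```python
# Hypothetical inputs: none of these numbers come from PECOTA or a real depth chart.
# Each entry is (player, projected runs per plate appearance, depth-chart plate appearances).
lineup = [
    ("Starter A", 0.130, 600),
    ("Starter B", 0.115, 550),
    ("Bench C", 0.095, 200),
]


def team_runs(players):
    """Sum projected runs, weighting each rate projection by depth-chart playing time."""
    return sum(rate * pa for _, rate, pa in players)


def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (Pythagorean expectation)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


if __name__ == "__main__":
    print("Projected runs from these three players:", team_runs(lineup))
    # Assumed full-team totals, chosen only to show the conversion to wins.
    print("Expected wins:", round(pythagorean_wins(750, 700), 1))
```

The point of the sketch is only that playing time comes from the depth chart and the rates come from the projection, so the team total is whatever those two inputs imply together; the individual projected playing time is not needed for this step.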


#55    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 14:17

I’m with those who don’t understand people getting so worked up about BP’s way of doing business (especially Mike, who I’ve found to be one of the most level-headed, respectful and admirable guys in the online sabr community).  I think the key word is “business”, which is what BP is.  The main goal of BP, I would venture, is not to make fair comparisons of metrics or take part in the sabr community or advance sabermetrics in general.  Their main goal is to make money. 

John, I appreciate the kind words.  They mean a lot coming from you.

I explained some of the reasons behind my expressed emotions in post #28.  That may get lost in the context of all the comments that are here now.  I had no idea when I made comment #1 that this would turn into a huge sounding board on the community’s perception of BP.  I meant “BS” in the sense that Tango uses it here--a bold claim made without evidence provided to back it up.

I have no issue with BP trying to make money.  I don’t think that is or should be at odds with the other things you mention.  I believe that when they make fair comparisons of metrics, take part in the sabr community, or advance sabermetrics in general, that these also advance their business prospects.  I understand that a large chunk of their business probably comes from the fantasy baseball world and doesn’t care directly about sabermetrics. 

Nonetheless, BP is not looked to by the fantasy world because of their expertise at the fantasy game.  IOW, they’re not Ron Shandler.  (I certainly don’t mean that as a slap at Marc Normandin, whom I like, or a commentary on the new fantasy team they are bringing on, which seems like a good move.) They are looked to because of (1) PECOTA’s reputation and (2) the reputation of writers like Nate Silver, Dan Fox, Keith Woolner, and many others who are acknowledged by both the sabermetric community and the front offices of the leagues as knowing what they are talking about.

Being open and honest about mistakes and corrections actually fosters confidence, and that results in making money in the long run.  Bluster and pretension may carry someone in the short term, but it can’t last.  Eric’s post to Unfiltered about SIERA and xFIP and Clay’s post today about PECOTA are great ways for BP to regain the confidence of the sabermetric community.  More of that consistently and I won’t be suspicious that BP is hiding problems in order to carry on making a buck.

Cooperation with the sabermetric community also fosters not only links and mentions from other popular sites but also a continual fountain of new and improving ideas.  This is what BP has been lacking for a while, and eventually that will show up on the bottom line for them, too, if it hasn’t already.  Bringing in new blood helps with that, and you can gain something by discussing amongst yourselves within the walls of BP (or wherever).  But it doesn’t begin to compare with what you can gain by discussing something with the whole community.

I don’t think BP (and its revenues) and the sabermetric community have to or ought to be at odds.  I wish it weren’t so.  I believe the ball in this respect is in BP’s court, and I wish those at BP who are pushing for more openness and cooperation the best of luck.  I don’t expect perfection, by any means, and I hope no one expects it of me.  I do like honesty and transparency.  I believe that builds a far stronger and more lasting business than marketing efforts that are afraid to be honest or admit that other people out there know what they’re doing, too.


#56    Clemente      (see all posts) 2010/02/26 (Fri) @ 15:41

I think BP has been better in the last two years at admitting error or lack of sophistication, and trying to fix it.  That their whole website doesn’t stop while this occurs for someone’s objection seems to really annoy some people here.

People are complaining about problems created, pointed out, fixed and described in week or so periods, as if the world has ended and the ‘delay’ is proof of “hiding problems in order to carry on making a buck.” Grow up.


#57    Patriot      (see all posts) 2010/02/26 (Fri) @ 15:52

Re: Rally/52.  It’s not quite the same thing, but at some point STATS or BIS decided to start calling Favorite Toy estimates “Career Projections” or something similar.  It kind of lost the spirit of the Favorite Toy, I thought, but Bill must have been ok with it.


#58    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 16:15

People are complaining about problems created, pointed out, fixed and described in week or so periods, as if the world has ended and the ‘delay’ is proof of “hiding problems in order to carry on making a buck.” Grow up.

1. The problems with the Matt Wieters PECOTA projection were pointed out and described in detail nearly 11 months ago, most prominently by Colin here:
http://www.hardballtimes.com/main/article/the-death-of-superman/
No one at BP has even acknowledged this as an issue, AFAIK, much less talked about what went wrong or whether they have done anything to fix it.

2. PECOTA performed abysmally overall as a projection system in 2009, finishing worst among the major projection systems and basically indistinguishable from simply assuming players would repeat their 2008 performance in 2009. 
http://www.insidethebook.com/ee/index.php/site/comments/evaluating_the_2009_forecasts_chone_zips_fantastics_win/

PECOTA is BP’s flagship product and the basis for a large portion of their revenue.  Nobody at BP has acknowledged how poorly PECOTA performed in 2009.  Instead they continue to claim that it’s “deadly accurate” and the best of the projection systems.  It’s been 4.5 months since the 2009 season ended.  That’s plenty of time to fix whatever bedeviled PECOTA in 2009.  For all I know they’ve done that.  But if they have, they haven’t told anyone.  If you don’t think this has anything to do with a fear of losing a lot of money, I’ll have to respectfully disagree.

3. Beyond one chat comment by Eric Seidman, which I applaud, BP has never publicly acknowledged, AFAIK, that the JC Bradbury aging study was a very minority viewpoint within the sabermetric community and that many of its claims have been discredited.  Instead they (Will) very publicly stated that this showed how they were leading the sabermetric discussion.  I don’t expect Will to come out and say he was wrong.  If I did, I’d be waiting till the evil one’s minions are putting on ice skates.  But what I would like to see publicly is what I have heard privately from some BP people about publishing a rebuttal/alternative study based on more sound analysis.

It’s been over six weeks since BP published JC’s stuff.  I don’t necessarily know if we’ve heard the last word from BP on this, but I do think that’s plenty of time for it to be considered fair that I’m pushing BP on their public stance on the matter.

4. Prior to today’s post by Clay, there had been a number of issues with this year’s PECOTA release raised either here or in comments at BP Unfiltered.  Prior to today, Clay had acknowledged only a portion of those issues and had not given very many details of the fixes he was making for them.  I wrote most of my criticisms in this thread prior to today’s Unfiltered post by Clay, which I have been very upfront about applauding.

Regarding this year’s PECOTA, it is eminently reasonable to expect fixes and explanations in time frames of a week or two.  Many people have already begun or will begin to have their fantasy league drafts and auctions in the next couple weeks.  If I wait until April 1 to point out issues with this year’s PECOTA, it’s too late to do any good.

5. I want BP to succeed and my criticism should be taken in that light.  I got angry because I care, and I grant that I shouldn’t care so much.  However, as a very wise man once wrote, “Wounds from a friend can be trusted, but an enemy multiplies kisses.”

People can cast this as me simply being immature or having unrealistic expectations if they want, but I think they are missing a very valid point if they do so.
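
For readers who want to see what the comparison in point 2 looks like in practice, here is a minimal Python sketch that scores a projection against a naive baseline that simply assumes each player repeats his prior-year number. It uses root mean squared error as the yardstick, which is an assumption on my part and not necessarily the method used in the linked evaluation. All of the values below are invented and do not represent PECOTA, CHONE, or any real player.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of numbers."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up wOBA-like values for four hypothetical hitters.
last_year = [0.350, 0.310, 0.380, 0.290]   # naive baseline: repeat last season
projection = [0.340, 0.320, 0.360, 0.300]  # a hypothetical projection system
actual = [0.345, 0.325, 0.355, 0.305]      # what "actually" happened

naive_error = rmse(last_year, actual)
system_error = rmse(projection, actual)

if __name__ == "__main__":
    print("naive baseline RMSE:", naive_error)
    print("projection RMSE:   ", system_error)
```

A projection system earns its keep only to the extent that its error comes in below the naive baseline's; if the two are about the same, the system is adding little beyond last year's stat line.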


#59    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 16:55

Is this a good place to thank studes for how he runs THT?  He is without peer in that regard.  Absolutely without peer.  And that’s not intended as an insult to any of the other fine sabermetric sites out there without which my baseball experience would be much poorer and whose owners I admire.  Even though I don’t expect everyone to run a ship the way he does, this whole thread and related subjects give me new appreciation for what Dave does year in and year out.  Thank you, Dave.


#60    John Walsh      (see all posts) 2010/02/26 (Fri) @ 17:23

Mike/#55

I forgot to mention “prolific” when mentioning your virtues [g].

Kidding aside, I’m not sure that the values that you mention (and that I also consider important) are compatible with maximizing profits.  Based on the number of books they sell, it seems to me that BP reaches such a wide audience, that they are targeting more your average fan and not so much the sabermetrically savvy.

I’m guessing (no evidence to back it up, though) that the people who truly care about the relative merits of wOBA and EQA or who would really like BP to comment on their Matt Wieters projection last year, well I just think this group makes up a very small percentage of the BP market.  They are better off blowing us off and catering to the “unwashed masses” (not that basement-dwelling bloggers are especially hygienic).

I know you feel differently, but that’s how I see it.

By the way, I second your comments on Studes and his running of THT—100% class act, 0% bullshit.


#61    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 17:50

John/60, I agree for the most part.  I agree that BP wants to reach a much wider audience than the sabermetrically savvy fan, although I think, based on my impression from the comments to the articles that I read there, that the average baseball fan is not exactly their prime market, either. 

I would think that the bulk of their market is the fantasy player or baseball fan who otherwise enjoys numbers and stat breakdowns of players but doesn’t necessarily want to do any of the analysis himself.  It’s definitely a different target market than the readership of THT or Beyond the Boxscore.

However, as I said to someone else in email this morning, “They don’t keep the broad subscriber base for long if the leading sabermetric people don’t support them.  I know the fantasy market is huge for them, but even that market is informed by the sabermetric intelligentsia.”

If PECOTA gets a reputation in sabermetric circles as being for crap compared to other projection systems, you can bet that’s going to filter down to the impression held by the broader fantasy game.  I’m a regular participant on probably the major fantasy discussion board (Sky will know which one I’m talking about).  While only a small handful of the people there visit Fangraphs and BtB and THT on a regular basis and Sky and I are probably the only ones who visit here, the people there definitely listen to what the sabermetric world says.  It takes some time for it to filter through, but it does.  The Verducci Year-After Effect, for example, has started to lose credibility in the fantasy crowd.  In addition, some of the major voices in the fantasy world listen to what the sabermetric experts say, and in turn that affects what they say to their listeners and subscribers.

I don’t think BP can long afford to isolate themselves from regular and detailed interaction with the sabermetric community.  Clearly, there are people like Matt and Eric at BP who think the same way, so that they’re not isolating themselves completely.

I could be wrong, and I’m willing to be told so and why, especially by someone of your stature (figurative; maybe someday I will have the pleasure of meeting you and assessing your physical stature.  Any chance you make it to a PITCHf/x summit one of these years?).


#62    John Walsh      (see all posts) 2010/02/26 (Fri) @ 18:03

Mike/#61

Thank you for the kind words.

I hope you are not wrong.  Perhaps I underestimate the role of saber issues in the fantasy baseball public.  You certainly have a better feel for that than me.

As for the pfx Summit, if Sportvision has a great year and decides to spend some of the windfall by holding the Summit in some southern European country—well, I’m there, that’s all there is to it.  Otherwise, SF is just too far afield for me.


#63    tangotiger      (see all posts) 2010/02/26 (Fri) @ 18:15

I think you are overselling the logic of the sabermetric argument. 

They have about 15,000 subscribers (it’s somewhere between 10K and 20K, and my best estimate is 15K).  The number of those that hinge on the sabermetric argument is going to be pretty low.  Clearly, they don’t want to lose those subscribers (a bird in the hand is better than 2 in the bush), but neither do they need to cater to them too much.

From this perspective, they do a good enough job with sabermetrics to say they are a quality player.

Their book is a great example of that.


#64          (see all posts) 2010/02/26 (Fri) @ 21:37

Mike/61

I wonder how much of BP’s fan base is there for the fantasy content.  BP basically ignored the fantasy aspect for years.  To me, it felt like they considered themselves a hardcore sabermetrics site that couldn’t be trifled with fantasy baseball.  That’s changed over the last few years when they figured out the money that could be made. 

Even so, up until their recent announcement, they had exactly one writer (Marc Normandin) writing about fantasy baseball and he wrote just two columns per week.  I used PFM for drafts but BP wasn’t worth much fantasy-wise once the season started.

I think, with their new additions, they are making a concerted effort to get a larger piece of the fantasy market, but they’re coming from behind because they were slow to embrace it.


#65    Tangotiger      (see all posts) 2010/02/26 (Fri) @ 21:48

I don’t think that’s so, Bobby.  I would not be surprised if a great portion of their subscribers, at least 10% if not one-third, are fantasy-only.  Add in those that are fantasy+regular, and I’d say half their subscribers are fantasy-based.


#66    Mike Fast      (see all posts) 2010/02/26 (Fri) @ 23:13

Will contacted me today to point out how I was wrong.  This was my response to him.

Will,

I don’t know exactly the text of the comment that was passed on to you as I’ve written a lot about my thoughts about BP in the last couple days at the Book blog.  I hope that you do not assume that I hate you or your work or BP based on one forwarded comment out of its context, because that would not be correct.  I was a long-time BP subscriber, and I have been a long-time fan of your writing, particularly your UTK stuff and your efforts to educate us about the importance of the work of training staffs around the league.  I think the sum of my comments at the Book blog speak very well for themselves, but I’m not sure how to sum that up in an email.

Specifically regarding JC’s article about aging, my beef with BP is fundamentally this:  the discussion had pretty much already taken place and run its course.  A number of people had written on it and investigated it.  The research was all out there, and it showed that JC was basically and fundamentally wrong unless his conclusions were taken so narrowly as to be nearly useless in their application.  BP reaches a much wider audience (i.e., beyond the sabr-inclined folks).  How were they to know that they were getting a minority and discredited view?  I think it’s good that a vigorous discussion took place in the comments, but that’s hardly the same thing as having the ideas presented within the article itself.  And since then, as far as I know (and I could be missing something), no one among the BP authors, except for Eric in a chat, has said anything about any skepticism of JC’s ideas.  I don’t expect you to know whether JC is right or not since that’s not your field of expertise, and I don’t think any less of you for that, but you’ve got a lot of people on your staff who could address what JC said.  I don’t know why JC needed the BP platform when he has his own well-read blog, and I don’t understand why BP has chosen, either purposefully or by inaction, to communicate to its readers that JC’s ideas on aging are at least as good if not better than the competing views.

I view Voros’ piece on DIPS very differently, and did so at the time, too.  Yes, there was significant opposition to his ideas, but mostly the people who went and researched either said that they agreed with him or agreed with a modified version of his findings.  There was not a chorus of sabermetric thinkers researching and finding that his results were anywhere from ludicrous to inapplicable.

I struggle to find a more modern-day analogy.  Would you put up a guest piece by Mike Marshall and then expect that a rebuttal by Fleisig in the comments was sufficient to air out that discussion and no further commentary by yourself or any other BP staffers to clear up the state of research in the field would be necessary?

My comments about hell freezing over were in response to someone who said that I wasn’t being patient enough with BP.  I said that I thought six weeks was long enough that it was fair to complain about JC getting the pulpit at BP without BP staff expressing any opposing views, and anticipating that he would tell me in response that if I expected you to publicly back down from having rejoiced in the discussion that JC’s article provoked, that I was being unreasonable, I headed him off at the pass and said that I didn’t expect that to happen and that that specifically is not what I was requesting of BP.  I don’t expect you to quit saying what you think or to start going around apologizing because people disagree with you.  I don’t think you [owe] anyone an apology and I didn’t mean that by what I said at the Book blog.  I do think BP would have been better served by a fuller, more balanced discussion on that topic, and that the way that topic was presented hurt BP’s reputation rather than helped it.  By no means do I think you intended to do that, and you probably even still disagree that it did so.  I’m not sure what else I can say by way of explanation except asking you to read the fuller context of my comments if you want to know better what I was trying to communicate.

Having said all that, I do realize that some of my comments about what I wish BP would be or become get cloaked with emotions associated with frustration, and that’s probably not terribly productive.  In that regard, I appreciate your email to me to clear this up, and I will try to keep in mind in the future the fuller context of what you were trying to say when you talked about leading the discussion.  As I ask you to do that for me, I owe the same to you.

Thanks for reaching out to me directly.


#67    Tangotiger      (see all posts) 2010/02/27 (Sat) @ 00:04

If you click on the “See all posts” next to Mike’s name, you get to see everything he wrote.

I don’t see anything there that he said wrong.  Whatever disagreements he had, he justified them to some extent or other.  Not only that, he was very balanced.  Given the opportunity to say something nice, he went out of his way to do so.

Of all the posters to take issue with, I don’t see Mike as being that guy to worry about.  He disagreed without being disagreeable.

Note: calling someone BS is perfectly fine here.  My standing rule is you get to call bullshit on anyone who makes a summary opinion without evidence.  Indeed, that’s going to appear on my sabermetric tombstone when the time comes.  One of you should see to that.


#68    colintj      (see all posts) 2010/02/27 (Sat) @ 04:26

I was going to step in at some point, but realized Mike would say it for everyone who took his side as well as anyone would.  And I already left comments at BP.  Point being: I think there are plenty of people Mike is speaking for, even if that total is dwarfed by fantasy-style subscribers. 

I do want to say that between the new fantasy staff, SIERA, CW’s fielding metric and the EqA rebranding, it’s pretty clear what the BP business plan looks like.  They’re going to get their stats up to fangraphs quality as quick as they can with Saber Approved guys in order to keep their old crowd, while going full bore into the fantasy market.


#69    studes      (see all posts) 2010/02/27 (Sat) @ 12:17

Just catching up with this thread and I want to thank Mike and John for the kind words.  Very embarrassing, actually.

I just want to say that I admire BPro a lot and I’m thrilled that a lot of THT people have moved onto there.  Maybe I put too much of a good spin on things, but I think it shows the quality of the work we do at THT. Yes, it’s a hassle to find new writers and content, but that’s what we do. I can’t begrudge anyone moving to a situation in which they’ll get more money.

Also, when am I going to get an official designation of Sabermetric God for inventing xFIP????


#70    Matt K. (d_f)      (see all posts) 2010/02/27 (Sat) @ 14:24

Update:

A couple of days ago, I posted a comment in response to Jay’s comment in which I linked Patriot’s classic analysis of EqA/R as well as Tango’s comments on Patriot’s post. I also posted a link to Colin’s excellent wOBA vs. EqA post from THT.

Jay responded today:

“I’m familiar with that work, and I’m also familiar with data that’s been circulated internally within BP which will rebut that.”

That sounds exciting. I’m being sincere. As I responded to Jay:

“I eagerly await the publication of the internal studies you mentioned, since transparency benefits everyone. I assume that you wouldn’t refer to those studies if they weren’t going to be made available to everyone, given your comment above. It would be really interesting for all to see. Was Colin convinced?”

I really hope they do publish the findings. Is that an unreasonable expectation on my part?


#71    Matt K. (d_f)      (see all posts) 2010/02/27 (Sat) @ 15:04

Oh, and studes, although I’m not sure I have the authority to deify someone, if I did, I’d do it for you.

I’m still waiting on my own deification for “DIO.” :)


#72    Mike Fast      (see all posts) 2010/02/27 (Sat) @ 15:28

Also, when am I going to get an official designation of Sabermetric God for inventing xFIP????

Were you there on the World Series chats at THT Live where we (I think it was Colin and I) decided that the “Book” had been given to us by the Holy Trinity: MGL the Father, Tango the Son, and Andy the Holy Spirit?

I suppose you could still be another part of the pantheon.  You wanna be saber-Zeus?


#73    studes      (see all posts) 2010/02/27 (Sat) @ 18:16

I’d settle for Pope compared to those guys.


#74    Fargo      (see all posts) 2010/02/27 (Sat) @ 18:21

I see a lot of dumping going on, people grinding again on issues they’ve raised before. But very little analysis of the comparative value of EqA (whatever it’s called) and alternatives. Did EqA predate or postdate WOBA? If they are comparable (as Colin Wyers pointed out on THT), why was the “second” one, whichever one that is, created?

Ragging on BP for publishing the Bradbury article is truly strange. By publishing it they didn’t endorse it. And both BP writers and the vast majority of commenters on the article pretty much rubbished Bradbury’s analysis. That’s a mark in favor of BP, not against it.

As for the comments on PECOTA, the strongest and, in my view, most relevant question is why this year’s problems are occurring in getting the individual player forecasts stabilized and the forecast manager (hence also the team forecasts) working consistently. While it does seem there were some problems with the 2009 projections, it’s less relevant that BP explains what went wrong last year than that they learned that there were some problems and worked to solve them for this year. 

I don’t buy the argument that PECOTA—the internal logic, the basic method of using comparables—suddenly broke, after it had performed fairly well since 2003. But what I would like to know is how much of last year’s problem was due to issues that are mainly external to PECOTA’s logic, such as issues in the DT’s and MLE’s, i.e., in the data that is the input to PECOTA’s forecasts. It appears from Clay’s latest BP entries that this year there are definitely issues brought on by changes that Clay has made in defining the pool of players from which the comparables are drawn. And further there seem to be problems with how the Player Forecast Manager is cutting up playing time and filling out the rosters for each team. Both of these problems are issues that are largely external to the logic of PECOTA though they are critical to the player and team forecasts.

And this brings up the question that was raised by at least one commentator above:  why wasn’t this addressed much earlier, both by replicating and improving the 2009 forecasts and by modelling the 2010 forecasts—for example, by anticipating that expanding the player pool, which had the effect of making the comparables contain a larger number of players with little or no ML experience, might throw a lot of things out of whack.  Why wasn’t this worked on last summer and fall? Why was it only faced in December, January, and February?


#75    tangotiger      (see all posts) 2010/02/27 (Sat) @ 19:20

Fargo, EqA predates wOBA by a decade at least.  I explained why wOBA was created in a long post called “History of the wOBA, part 1”.

The issue on JC’s article was not on it being published.  It’s fine that it was published.  The issue (my issue anyway) is that it was presented with no setup, and with no obvious linking to the “pre-rebuttal” from MGL’s article from a few weeks earlier.  It’s a judgement call obviously.

As for PECOTA, let’s give BPro a few weeks to write about it.


#76    Brian Cartwright      (see all posts) 2010/02/27 (Sat) @ 19:44

Re Bradbury’s article at BP - when it was published and I saw the headline, my first thought was “Oh no!” because the subject had been thoroughly discussed back and forth for weeks, with enough time for MGL to have published his rebuttal, and then BP brings Bradbury to a new audience, many of whom were probably unaware of the history of the subject. So it started all over from scratch. BTW, it was labeled Part 1 - perhaps they have decided not to go with Part 2.

Re PECOTA - almost a year ago I was in attendance at BP’s booksigning tour in DC. I asked about Wieters and from what Clay explained I believe that up until that time PECOTA had been Nate’s baby, and then it was handed off to Clay. Plus, it was still a gargantuan spreadsheet that took days to run. In the past year Clay has converted it to a database, so presumably he has had an opportunity to be more intimately familiar with its inner workings, but at the same time 2010 is from a freshly coded PECOTA.

I can associate with those situations as I’ve spent the last few months coding Oliver in MySQL for THT. After I finished coding, they were going to do all the regular updates on their end, but I’m using Windows, they’re using Linux, and it became a hassle to move things, so I’m going to run the code. One of these days, if we ever part ways, they’ll get to keep the code and will have to make it run, and of course there will be a learning curve and some bugs to work out.

From tests I ran last year, I don’t believe there’s much difference between the major projections in working with MLB stats. Differences show up when you start mixing in MLEs, and PECOTA did well for players in the lower minors (A+ and below) compared to Zips and CHONE. I haven’t checked the 2009 projections, but of course many of those low minor players aren’t in the majors yet.


#77    Tangotiger      (see all posts) 2010/02/27 (Sat) @ 21:29

Matt Klaassen
(47692)

One classic analysis of EqA/EqR is found here:

http://walksaber.blogspot.com/2008/05/analysis-of-clay-davenports-eqr-and-eqa.html

I would read that before asserting that it’s better or worse than anything. It’s an accurate analysis (as far as I can tell) of the basic construction of the stat, and gets at some technical issues.

From the past, also see:

http://www.insidethebook.com/ee/index.php/site/comments/why_is_eqa_so_complicated/

From the more recent past, see the excellent wOBA/EqA analysis done by current BP writer Colin Wyers (sorry if this has already been linked here, seems like a natural place for it):

http://www.hardballtimes.com/main/blog_article/is-eqa-better-than-woba/

And,

Feb 25, 2010 22:09 PM

BP staff member Jay Jaffe
BP staff
(9077)

I’m familiar with that work, and I’m also familiar with data that’s been circulated internally within BP which will rebut that. As I said before, I’m leaving the math-level details regarding the formula and its construction to Clay Davenport.
Feb 27, 2010 07:43 AM

Matt Klaassen
(47692)

I was just putting it out there for people to read, not to make any particular claim. I eagerly await the publication of the internal studies you mentioned, since transparency benefits everyone. I assume that you wouldn’t refer to those studies if they weren’t going to be made available to everyone, given your comment above. It would be really interesting for all to see. Was Colin convinced?

Feb 27, 2010 10:40 AM

BP staff member Jay Jaffe
BP staff
(9077)

I won’t presume to know what Colin thinks, but I can tell you that he’s been crunching numbers on this, too.

I imagine the data and discussion will be presented along the lines of Clay’s “About EqA” piece from 2004 which was linked above: http://www.baseballprospectus.com/article.php?articleid=2596

One of the key take-home points from both that and Colin’s linked THT piece above is the time range of comparison, because these formulas have been “tuned” to a given period. I’m not sure if this has changed, but at the time, Fangraphs only had wOBA going back to 1974. In Clay’s piece, which was written in 2004, before wOBA was unveiled, he noted that there were ranges of time where EqA was essentially on par with other systems, and ranges where it was significantly superior, and that he could improve its performance over recent eras with a greater number of category inputs (remember, stats like sacrifices, intentional walks and caught stealing have relatively limited histories). I imagine all of that will find its way into the discussion.
Feb 27, 2010 12:47 PM

Matt Klaassen
(47692)

If you’re saying that people are going to publish on this, I look forward to it. It will be good to know what data sets there are, so that the results can be independently checked by disinterested parties.

I assume Colin will be publishing his results however they turn out?

I’m curious to see if he finds his earlier article to have been wrong. Of course, since wOBA is just straight linear weights converted to a rate stat, it could be adapted to any set of weights.
Feb 27, 2010 15:29 PM

I’ll comment when the article is presented.
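
For anyone following along, the point in the exchange above that wOBA is "just straight linear weights converted to a rate stat" can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the event weights are made-up placeholders, plate appearances are used as a simplified denominator, and `weighted_rate` is a hypothetical name, not anyone's published formula.

```python
# Hypothetical per-event weights, for illustration only (not published values).
PLACEHOLDER_WEIGHTS = {"BB": 0.7, "1B": 0.9, "2B": 1.25, "3B": 1.6, "HR": 2.0}


def weighted_rate(counts, weights, pa, scale=1.0):
    """Sum weight * event count, divide by plate appearances, then rescale.

    counts  -- dict of event name -> number of times it happened
    weights -- dict of event name -> linear weight (any set can be plugged in)
    pa      -- plate appearances (simplified denominator, an assumption)
    scale   -- constant that only changes the scale the result is shown on
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    total = sum(w * counts.get(event, 0) for event, w in weights.items())
    return scale * total / pa


# A made-up batting line, not a real player.
line = {"BB": 60, "1B": 100, "2B": 30, "3B": 5, "HR": 20}
rate = weighted_rate(line, PLACEHOLDER_WEIGHTS, pa=600)
```

Adapting the stat to a different set of weights means swapping the dictionary; the `scale` argument only decides which familiar scale the number resembles and does not change how players rank.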


#78    evo34      (see all posts) 2010/02/27 (Sat) @ 22:05

Does anyone know of a source for long-term projections other than BP?  I happen to feel that they are in complete disarray, and their projections are essentially useless at this point.  Problem is I cannot find anyone with even a simple long-term projection system to use instead of BP.


#79    Tangotiger      (see all posts) 2010/02/27 (Sat) @ 22:14

I think it’s fair to say that 2009 PECOTA was terrible.  I think you should wait to see how they present 2010 PECOTA before you take a position on the matter.

***

Long-term forecasts?  Huh, funny, I never thought of doing a Marcel on that.  It’s been, what, six years since I rolled out the Marcels, and they haven’t changed at all since (and won’t, ever).

But, adding long-term forecasts?  That sounds do-able.  What are we looking for?  3 yrs?  5 yrs?  6? Rest of career?  I’ll only generate on the aggregate, not year-by-year.  So, what do people want?


#80    Matt K. (d_f)      (see all posts) 2010/02/27 (Sat) @ 22:19

Tom/79 (re long-term Marcels): I know when I was messing with Marcels/my own forecasts, that Sky Kalkman encouraged me to do something like that. I think someone eventually did do it, maybe Dan T.? I’m not sure… I’d ask Sky, although, of course, you know how to do them yourself. Just thought it’d be interesting to compare.


#81    Tangotiger      (see all posts) 2010/02/27 (Sat) @ 22:21

Oh, if someone has done them, then I won’t bother.  If someone has a link, just post it here.


#82    Matt K. (d_f)      (see all posts) 2010/02/27 (Sat) @ 22:29

Tom/81 I don’t know if they have or not, I just remember some rumblings about it during 2009 at some point. Please don’t take my word for it. I thought about doing it myself, but got hung up on some of the SQL stuff and gave up (typical).


#83    Zach      (see all posts) 2010/02/27 (Sat) @ 23:30

PECOTA is having problems again, this time with their long-term forecasts for young guys:

http://www.baseballprospectus.com/unfiltered/?p=1523

...

Rally had great long-term forecasts for his projections last year, but they are missing from his site this year. ZiPS has projected career stats, but not year-to-year forecasts.


#84          (see all posts) 2010/02/28 (Sun) @ 00:21

Tom/79 5 years I think would be best.  That’s just my 2 cents.


#85    Sky      (see all posts) 2010/02/28 (Sun) @ 00:46

Joel Luckhaupt from redreporter.com tweaked Colin’s Marcel SQL code for me so that it would create “retrojections”.  But it does not project multiple years past the given projection year.  I’m told that’s more annoying to code, but would be pretty easy to do in Excel on a player-by-player basis.


#86    Mike Fast      (see all posts) 2010/02/28 (Sun) @ 02:18

Fargo/74,

Ragging on BP for publishing the Bradbury article is truly strange. By publishing it they didn’t endorse it. And both BP writers and the vast majority of commenters on the article pretty much rubbished Bradbury’s analysis. That’s a mark in favor of BP, not against it.

You say BP writers rubbished on it.  Anyone besides Colin?

As for the comments on PECOTA, the strongest and, in my view, most relevant question is why this year’s problems are occurring in getting the individual player forecasts stabilized and the forecast manager (hence also the team forecasts) working consistently. While it does seem there were some problems with the 2009 projections, it’s less relevant that BP explains what went wrong last year than that they learned that there were some problems and worked to solve them for this year.

How do we know that they learned that there were some problems and worked to solve them for this year if they don’t say a peep about it?  I’m not going to take that on faith.

As an analyst, I completely and totally sympathize with the work that Clay and team must be putting in to get this thing to work.  I wouldn’t want to be in their shoes, but I’m very glad to see when they identify and overcome issues.

However, as a potential customer for using PECOTA for my fantasy league, I’m not going to pony up money for 2010 unless I know a heck of a lot more details about why 2009 was such a failure for them and why I should expect that 2010 will be any better.

I’m cheering for them to do well, but I’m not laying out money on that basis.

I don’t buy the argument that PECOTA—the internal logic, the basic method of using comparables—suddenly broke, after it had performed fairly well since 2003.

Has anyone made that argument?  Certainly not me.

And this brings up the question that was raised by at least one commentator above:  why wasn’t this addressed much earlier, both by replicating and improving the 2009 forecasts and by modelling the 2010 forecasts—for example, by anticipating that expanding the player pool, which had the effect of making the comparables contain a larger number of players with little or no ML experience, might throw a lot of things out of whack.  Why wasn’t this worked on last summer and fall? Why was it only faced in December, January, and February?

This is where I think BP suffers from keeping all their discussion internally focused.  Maybe not.  Maybe they had a rigorous discussion among the BP writers from October onward.  I wouldn’t know.  But my opinion is that it would be much harder to work out all the kinks with only a few eyes than it would be with the eyes of the whole community, if even only limited or restricted pieces of the PECOTA system or projection results.


#87    Mike Fast      (see all posts) 2010/02/28 (Sun) @ 02:22

#86 - “rigorous discussion” should be “vigorous discussion”, although I suppose rigorous works okay there. wink


#88    Fargo      (see all posts) 2010/02/28 (Sun) @ 04:46

@Mike/#86.

(1) Re Bradbury: Colin criticized. Pretty much everyone else who commented criticized, including MGL and others from the outside. It was a thorough thrashing. I say “thank you BP” for helping to show its audience that the emperor has no clothes.

(2) Re did BP know there were problems with last year’s PECOTA’s? They read the criticisms; they know that PECOTA didn’t perform nearly as well against the other systematic forecasts last year as it had in prior years. They don’t have to have a big public ritual flagellation to expiate their sins. They just have to get to work to make sure things work better. There’s plenty of evidence that they’ve tried to do that—read Clay Davenport’s and Dave Pease’s comments, for example.

(3) What’s wrong with PECOTA now?  I think PECOTA is probably a significantly more complicated system than most of the other major forecasts. For one thing it is extremely sensitive to how the raw performance data on the referent population—the population of potential comparables—is evaluated and transformed (park effects, league and era adjustments, MLE’s). I suspect that’s where most of 2009’s problems may have come from—including the Wieters projection.

What I find least explicable is why the problems we are seeing now weren’t addressed 6-8 months ago. If I were undertaking the translation of the Excel-based system into the new code then the first thing I would have done is to replicate past annual PECOTA projections using the new code—while making no other changes in the methodology or the database. After late March 2009, Nate was no longer involved with PECOTA. They could have begun the code conversion any time after that. But it doesn’t appear that they started this until after the 2009 season was over.

Only if I could use the new code first to reproduce Nate’s results from 2008 or 2009 would I introduce any changes in the procedures. Did Clay first attempt such a direct replication? We don’t know.
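
The replicate-first step described above amounts to a regression check: run the re-coded system on old inputs and confirm it reproduces the old outputs before changing anything else. A minimal sketch in Python follows. Everything in it is hypothetical (the function name, the `(player_id, stat)` keys, the tolerance); nothing here is known about how BP's code is actually organized.

```python
def replication_report(old, new, tol=1e-6):
    """Compare legacy projections with those from re-implemented code.

    old, new -- dicts mapping (player_id, stat) -> projected value
    tol      -- largest absolute difference still counted as a match
    """
    shared = old.keys() & new.keys()
    return {
        # In the legacy output but not reproduced by the new code.
        "missing": sorted(old.keys() - new.keys()),
        # Produced by the new code but absent from the legacy output.
        "extra": sorted(new.keys() - old.keys()),
        # Present in both, but differing by more than the tolerance.
        "mismatched": {
            key: (old[key], new[key])
            for key in sorted(shared)
            if abs(old[key] - new[key]) > tol
        },
    }


# Made-up values for two fictional players.
legacy = {("p1", "OBP"): 0.350, ("p2", "OBP"): 0.320}
recoded = {("p1", "OBP"): 0.350, ("p2", "OBP"): 0.335, ("p3", "OBP"): 0.300}
report = replication_report(legacy, recoded)
```

Only once all three parts of the report come back empty for a past season would it make sense to start changing the method or the player pool, since any later difference can then be pinned on that change and not on the port.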

At some point, Clay appears to have made some important changes—he refers specifically to broadening the database of players from past seasons. Not only for the weighting of each player’s baseline performance stats for the previous three seasons, but more importantly by adding a lot more player-seasons to the potential set of comparables. It seems likely that those added player-seasons—ones that were not in Nate’s database—include mainly players with little or no ML experience. So these are players whose stats are highly subject to any assumptions and imputations made using the DT’s and MLE’s.

In addition they’ve had certain problems that seem specific to the Player Forecast Manager (PFM):  taking the PECOTA estimates and plugging them in to the PFM. Now this is more complicated than it might at first appear to be—chiefly because the PFM introduces depth charts and playing time estimates as constraints on the PECOTA estimates for each player. And lineups and batting order and the projected performance of all players on a team ‘interact’ to affect the projected performance of each and every player as well as the team as a whole. In that way the PECOTA’s aren’t simply input to the PFM; they are simultaneously dependent on the projected performance of other members of the team.

Nonetheless, the PFM worked reasonably well in previous years (maybe not so much in 2009?).  Why has it apparently become such a problem this year? Is it due to efforts to change the PFM this year? Or was this, too, transformed from an Excel spreadsheet macro to a new system of instructions for the first time this year?

And then there’s the issue of how similar players are identified. It requires another complicated piece of analysis to search the database for the player seasons that are most comparable to each current player – on the set of physical characteristics and performance indicators that PECOTA employs. This matching probably didn’t use an Excel procedure but instead some kind of analysis of covariance.  Maybe that’s where some of the problem lies this year. We haven’t heard anything about this.

Finally there’s the step of using the performance of the set of comparables to make projections for each current player’s 2010 performance. More complicated code but probably fairly straightforward, if I understand the methodology.

In sum, there are a lot of places where things could have gone awry. As Tango wrote, we have to wait for BP to explain what’s going on. As a long-time BP subscriber, I mainly want them to solve the problem.  But my confidence in the PECOTA system has been shaken, and it will have to withstand two tests to rebuild it.  First, I want to know why it took so long to get the new projections ready.  Second, and more importantly, I want to know that PECOTA is once again among the very best at what it tries to do.


#89    Tangotiger      (see all posts) 2010/02/28 (Sun) @ 10:23

Fargo/88: a perfect post.

And I love the “emperor has no clothes” perspective.


#90    Mike Fast      (see all posts) 2010/02/28 (Sun) @ 12:42

Fargo/88, very good points and very good explanation of some of the factors that likely go into PECOTA and PFM.

Re your second point, I have said several times and places that what I wanted from BP was exactly what Clay did in his Unfiltered post a couple days back.  When I first posted my complaints in this thread, that Unfiltered post had not been made.  But that was exactly the kind of thing I was asking them to do, and I was very pleased to see it.  It gives me a lot more confidence that they are embracing the issues and working to fix them.

Also, I never called for a big public self-flagellation on BP’s part.  There are several things that can be conflated here: what I would do, either ideally or practically, if I were in their shoes, what standard I want them to meet in order to get my subscription dollars back, and what I reasonably expect to see them actually do.  Maybe I haven’t been as clear as I should be as to which I was always talking about, but I know that many of the people who have a beef with what I’ve said here have definitely misunderstood which I was addressing at a particular time.

One thing that I have not seen addressed and that will probably continue to bug me enough that I won’t pay for PECOTA is what went so wrong with their MLEs that they could produce the Wieters forecast.  I don’t have access to the PECOTA cards to see if they’ve got similarly outlandish projections this year. 

One of the major reasons to pay attention to a CHONE or PECOTA system over the much simpler Marcel is that they can, at least theoretically, give a better prediction for players with little major league history.  But if MLEs are cocked up, they lose that value pretty quickly.


#91    Mike Fast      (see all posts) 2010/02/28 (Sun) @ 16:24

I was looking through the comments here:
http://www.baseballprospectus.com/unfiltered/?p=1517
and found this from Dave Pease re PECOTA:

we’ll be posting a post-mortem explaining some of the issues we’ve encountered over the last few years.

Hooray!  That’s pretty much spot on what I’ve been asking for, so once again big kudos to BP for recognizing the need and being willing to self-reflect in public. 

As blunt a critic as I have been when I have perceived that they weren’t doing this, I want to be just as big a cheerleader when they do it.

