THE BOOK--Playing The Percentages In Baseball

Tuesday, September 08, 2009

Poz’s Production

By Tangotiger, 03:31 PM

In Poz’s ongoing quest to find a stat he can champion, he’s gravitating toward a “counting” number, and the name “Production”.

I’d post at his site, but since my office firewall doesn’t like his blatant spam-triggering content (actually, it blocks blogger and wordpress software run sites), I’ll make my comments here.

As I said, the only thing someone has to do is tell me what the user requirements are, and we’ll create a stat for it.  Joe has decided that he wants a counting stat, meaning that he only wants to count all the positive things.  He’s ignoring outs altogether.  He’s not interested in replacement or average levels (as far as I can see).  Therefore, Linear Weights Ratio is NOT the stat for him.  And he CANNOT use that as his basis.

Fortunately, a stat already exists that will satisfy him: it’s the weights that go into wOBA.  For the uninitiated, it’s here:
http://insidethebook.com/woba.shtml

So the numbers that Poz cares about are these:
0.72xNIBB + 0.75xHBP + 0.90x1B + 0.92xRBOE + 1.24x2B + 1.56x3B + 1.95xHR

RBOE is reaching base on error.  You can include 0.25 for SB and -0.50 for CS (the one exception to the positive rule).  Note that these numbers change year-by-year, and those were the numbers for the 1999-2002 era.  If you insist on static numbers, I can deliver those.
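
As a rough sketch of how that weighted sum could be tallied, the snippet below applies the 1999-2002 weights quoted above to a stat line. The function name `production`, the dictionary layout, and the season line are all hypothetical and made up for illustration; only the weights come from the post.

```python
# Weights quoted in the post for the 1999-2002 era (they change year by year).
WOBA_WEIGHTS = {
    "NIBB": 0.72,  # non-intentional walks
    "HBP": 0.75,   # hit by pitch
    "1B": 0.90,
    "RBOE": 0.92,  # reached base on error
    "2B": 1.24,
    "3B": 1.56,
    "HR": 1.95,
}

# Optional baserunning terms mentioned in the post.
RUNNING_WEIGHTS = {"SB": 0.25, "CS": -0.50}


def production(line, include_running=False):
    """Sum weight * count over a stat line; missing events count as zero."""
    weights = dict(WOBA_WEIGHTS)
    if include_running:
        weights.update(RUNNING_WEIGHTS)
    return sum(w * line.get(event, 0) for event, w in weights.items())


# Hypothetical season line, invented for illustration only.
example_line = {"NIBB": 60, "HBP": 5, "1B": 100, "RBOE": 4,
                "2B": 35, "3B": 2, "HR": 30, "SB": 10, "CS": 4}

print(round(production(example_line), 2))
print(round(production(example_line, include_running=True), 2))
```

Outs never appear in the dictionary, which is the point: the total only ever counts the positive events (plus CS, if you opt in).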

This works better than looking at the positive numbers that Joe is using from Linear Weights Ratio.  The reason is the way the outs and PA issue is resolved.  In Linear Weights Ratio, the denominator is outs, and so the positive numbers get their Linear Weights values (proportionately).  But in wOBA, the denominator is PA, which includes outs and hits and walks.  In order to get wOBA aligned properly, I need different values for the numerator.  In the article I linked to, those values are:
HR 1.70, 3B 1.37, 2B 1.08, 1B 0.77, NIBB 0.62.

These are the RUN values of these events, over and above making an out.  So, a walk is +.62 runs better than an out.  For my purposes, I found it better to increase these values proportionately so that their overall (weighted) average was exactly 1.00.
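
To see how the two sets of numbers relate, here is a small sketch that scales the run values by a single factor and compares the result with the wOBA weights listed earlier. The factor of 1.15 is an assumption borrowed from the "ROO = wOBA*PA/1.15" line in comment #8 below; the post itself does not state the exact multiplier, and the published weights are rounded, so the match is only approximate.

```python
# Run values over an out, as quoted in the post.
RUN_VALUES = {"HR": 1.70, "3B": 1.37, "2B": 1.08, "1B": 0.77, "NIBB": 0.62}

# wOBA weights quoted earlier in the post (1999-2002 era).
WOBA_WEIGHTS = {"HR": 1.95, "3B": 1.56, "2B": 1.24, "1B": 0.90, "NIBB": 0.72}

# Assumed scale factor, taken from comment #8; not stated in the post body.
ASSUMED_SCALE = 1.15

for event, run_value in RUN_VALUES.items():
    scaled = run_value * ASSUMED_SCALE
    diff = scaled - WOBA_WEIGHTS[event]
    print(f"{event:5s} run={run_value:.2f} scaled={scaled:.3f} "
          f"woba={WOBA_WEIGHTS[event]:.2f} diff={diff:+.3f}")
```

Every scaled value lands within about 0.02 of the corresponding wOBA weight, which is consistent with a proportional increase plus rounding.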

For Joe’s sake, he might prefer the HR as a +1.7 because it represents something he can say like “he created 1.7 more runs with each HR than he did with an out”.  So, based on his current writings, I would say that’s the stat Joe would be favoring.

But, like I said, all we need are the user requirements.  Maybe he doesn’t know what his requirements are, and how he uses his stat will implicitly tell us.  We can derive it, and we can create something to satisfy it.

***

“Production” was a stat that Pete Palmer coined, which he labelled as “PRO” to describe.... OBP+SLG (i.e., OPS).

If someone wants to link to this blog from Joe’s blog, please do so.


#1    Tangotiger      (see all posts) 2009/09/08 (Tue) @ 16:49

He wants “Runs over out” (ROO).  That’s got to be a good acronym, no?

Pujols went 2-4, with a HR and a walk.  So, he gets about 1.7 for the HR, .8 for the single and .6 for the walk for a total of 3.1 ROOs in 5 PA.
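
For anyone who wants to check that arithmetic, a minimal sketch follows. It uses the rounded per-event values from the comment (1.7, .8, .6) and treats an out as zero; the event labels and the `game` list are just an illustrative encoding of the 2-for-4 line with a walk.

```python
# Rounded "runs over out" values used in the comment; an out is worth 0.
ROUNDED_ROO_VALUES = {"HR": 1.7, "1B": 0.8, "BB": 0.6, "OUT": 0.0}

# 2-for-4 with a HR and a single, plus a walk: 5 plate appearances.
game = ["HR", "1B", "OUT", "OUT", "BB"]

roo = sum(ROUNDED_ROO_VALUES[event] for event in game)
print(f"{roo:.1f} ROOs in {len(game)} PA")
```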

Pujols has 250 ROOs this year!

“They’re not saying boo, they’re saying ROO”.


#2    Guy      (see all posts) 2009/09/08 (Tue) @ 16:59

Ugh.  A counting stat denominated in units of, well, what?  No chance of making this work.  He’d be better off with his “hitting average” idea.


#3    Tangotiger      (see all posts) 2009/09/08 (Tue) @ 17:10

Runs over out.  That’s the unit.  Number of runs he created more than had he made an out each time.


#4    Guy      (see all posts) 2009/09/08 (Tue) @ 19:00

Ah, got it.  So what’s an average hitter’s ROO?  Must be something like 150.  Can’t see this catching on.


#5    Tangotiger      (see all posts) 2009/09/08 (Tue) @ 19:52

Well, if the average hitter gets 80 RC, and the negative runs is .30 runs per out and there are 400 outs, then that means his ROO is 200.
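
That back-of-envelope conversion can be written out as a tiny function. This is only a sketch of the reasoning in the comment: the function name is hypothetical, and the .30 runs per out is the round figure used above, not a measured constant.

```python
def roo_from_runs_created(runs_created, outs, runs_per_out=0.30):
    """Add back the run cost of each out to move the baseline to 'all outs'."""
    return runs_created + runs_per_out * outs


# The average hitter from the comment: 80 RC and 400 outs.
print(round(roo_from_runs_created(80, 400), 1))
```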

Pujols would be say 280 ROO or something.


#6    Colin Wyers      (see all posts) 2009/09/08 (Tue) @ 20:02

How does an ordinal ranking of ROO look? I’m not convinced that this “works.”


#7    Nick      (see all posts) 2009/09/08 (Tue) @ 20:27

Me neither.  I don’t know why he would just do straight R/150.


#8    Tangotiger      (see all posts) 2009/09/08 (Tue) @ 21:51

ROO = wOBA*PA/1.15
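
In code, that one-liner looks like the sketch below. The function name and the sample inputs are hypothetical; the 1.15 divisor is the one given in the comment.

```python
WOBA_SCALE = 1.15  # divisor given in the comment


def roo(woba, pa):
    """Runs over out: undo the wOBA scaling, then multiply by playing time."""
    return woba * pa / WOBA_SCALE


# Hypothetical hitter: .400 wOBA over 575 PA.
print(round(roo(0.400, 575), 1))
```

Because PA is a straight multiplier, the total rises with playing time even when the rate stays flat, which is what makes it a counting stat.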


#9    Nick      (see all posts) 2009/09/08 (Tue) @ 22:57

It seems like it would be easier to compare it to average, and do ((((wOBA-LgwOBA)/1.15)*PA)/PA)*600

Runs above average per 600 plate appearances; simple, meaningful, and understandable.
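
A short sketch of that idea, with hypothetical function names and made-up inputs. Note that multiplying and then dividing by PA cancels out, so the per-600 figure depends only on the wOBA gap.

```python
WOBA_SCALE = 1.15  # same divisor as in the ROO comment


def runs_above_average(woba, lg_woba, pa):
    """Runs above (or below) average over the hitter's actual PA."""
    return (woba - lg_woba) / WOBA_SCALE * pa


def runs_above_average_per_600(woba, lg_woba):
    """Same quantity prorated to 600 PA; actual PA cancels out."""
    return (woba - lg_woba) / WOBA_SCALE * 600


# Hypothetical hitter and league, for illustration only.
print(round(runs_above_average(0.400, 0.330, 450), 1))
print(round(runs_above_average_per_600(0.400, 0.330), 1))
```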


#10    Colin Wyers      (see all posts) 2009/09/09 (Wed) @ 02:30

Nick, sometimes we want to know a player’s value given his actual number of PAs, not per 600 PAs. Playing time matters to some extent, although not as much as ROO or “Production” would lead you to believe.

I tested ROO against wRC (I cheated and just used .12 R/PA) for 2008 and they were close but not identical in the ordinal ranking. I really am not comfortable with ROO, I must say. I don’t know why wRC or wRAR isn’t a better choice. I know that ROO lines up better with what Pos did in that post but given the numbers involved I have a tough time saying that everything he did was intentional. It’s very possible that if explained to him he’d be perfectly happy with wRC.


#11    Nick      (see all posts) 2009/09/09 (Wed) @ 02:54

Yes, obviously it could be used as a value stat by including PA; however, I don’t see why it would be a counting stat, as opposed to above average or above replacement level.  To me, that seems a lot cleaner.


#12    Guy      (see all posts) 2009/09/09 (Wed) @ 08:24

Agree with Colin:  better to use wRC, and relabel it “production” if that’s the preferred name.  The implicit ROO counter-factual of makes-an-out-every-time-up is not helpful or interesting. 

I don’t think you’ll ever sell an above/below average stat for the purpose Poz has in mind (mass acceptance).  People will immediately want to know “what’s average?”, and then add that to your above/below metric.  And frankly, the numbers are just too small to be impressive:  15 runs above average is a very good player, but “15” sounds like a bad rating.


#13    Tangotiger      (see all posts) 2009/09/09 (Wed) @ 09:42

The above average will never work because you will have Bill James and his drones who will say that -1 in 600 PA has more value than +1 in 200 PA.

I would much rather present .329 wOBA in 600 PA and .335 wOBA in 200 PA, and let the reader decide.

Palmer has proven that the “runs above average” model is not sellable.  The reason is because of the single-dimension presentation.  In order to sell it, you always have to present two dimensions, with the second dimension being games or PA or outs made.  One of those three.  Once you accept that, then you either present a “numerator” or a rate stat ("per unit of denominator").

Regardless, there are millions of baseball fans, and each will have his own idea of what he wants.  I would say to simply present the metric that satisfies their conditions, because they will always find a problem with an “uberstat”.


#14    Patriot      (see all posts) 2009/09/09 (Wed) @ 11:05

I think the Poz thread illustrates why, if there was to be one unified stat, it would have to be a rate.  At least when a rate is used, 95% of people understand that playing time has to be taken into account.  That is ingrained in every baseball fan from Batting Average if nothing else.

But when a stat like LWR Plus appears on Poz’ blog, multiple commentators think they have a gotcha since Mark Teixeira leads the AL, and Poz opposes his MVP candidacy.  It gives the illusion of a value stat (as do R, RBI, RC, etc.), when in reality you still need to consider opportunity and either implicitly or explicitly baseline to get value.  But the general fan public has an easier time accepting this principle when you give them the rate and not the total.


#15    Tangotiger      (see all posts) 2009/09/09 (Wed) @ 11:20

Patriot: fantastically well-said. Yes, I am totally on board with you here. 

We (implicitly) accept as a requirement that you need two dimensions to be able to sell this to a whole group of fans.  Given that one of the dimensions is playing time, then it behooves us to ensure that the other dimension is a rate stat (either per opp, or indexed to the league average=100).

Fantastic.  So, Poz is wrong to want only a counting stat, because he will be asked daily about the playing time component (or outs or PA or whathaveyou).

If he aligned it to average=100, then the fan will say, “accepted, but I need to know how many games he played”.

Sold!


#16    Guy      (see all posts) 2009/09/09 (Wed) @ 11:30

Agreed it should be a rate stat.  But I’m not sold on the notion that it has to be a 100=average stat.  I think that’s fine, but something like runs per game, or Poz’s hitting average, could also work.  And I think many fans would actually be more comfortable with an absolute stat rather than one tied to a shifting league average.  But why not do both, as we have with OPS and OPS+ (but better than both)?  An absolute run production rate for the masses, and an adjusted one for the sabermetric crowd.


#17    Nick      (see all posts) 2009/09/09 (Wed) @ 11:46

It’s probably been said before, but what’s wrong with 1.75*OBP + SLG?  Or you could modify it to scale with OPS.


#18    Tangotiger      (see all posts) 2009/09/09 (Wed) @ 12:43

Agreed Guy that runs per game or 100=average works, and I’m not championing one over the other.

Nick: if you make it 1.75*OBP+SLG, people are going to ask “why”.  And then they’re going to ask about 1.68 and why not include BA, etc, etc.  Not to mention: what is the unit?

If you want to go down that road, fine.  But, it’s going to be a dead end.


#19    Colin Wyers      (see all posts) 2009/09/09 (Wed) @ 12:51

I think Pos’s big problem in winning readers over right now is his lack of a real unit - “baseball points” just sounds kind of fishy, and you have people in the comments making up their own “baseball points” on the assumption that the enterprise is arbitrary. You denominate everything in runs and I think it becomes a whole lot clearer.


#20          (see all posts) 2009/09/09 (Wed) @ 15:37

If the objectives are:

1) A counting, not a rate, stat;
2) Where outs = 0; and
3) that is relatively straightforward (i.e., uses only widely available stats, hence no RBOE, and minimizes the number of calculations),

why not:

2*(HBP+NIBB)+1.5*H+TB

If you also want to satisfy objective (4): express as runs, then:

Multiply the above by .31 (or, more simply, approximate by dividing by 3).
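
A quick way to sanity-check the shorthand is to work out what it implies for each individual event and compare that with the run values quoted in the post (HR 1.70, 3B 1.37, 2B 1.08, 1B 0.77, NIBB 0.62). The sketch below does that; the function names and the per-event encoding are hypothetical, while the formula and the .31 multiplier are the ones given above.

```python
RUNS_PER_POINT = 0.31  # multiplier suggested in the comment


def shorthand_points(hbp, nibb, h, tb):
    """2*(HBP+NIBB) + 1.5*H + TB, as proposed in the comment."""
    return 2 * (hbp + nibb) + 1.5 * h + tb


def shorthand_runs(hbp, nibb, h, tb):
    """Shorthand total converted to approximate runs."""
    return RUNS_PER_POINT * shorthand_points(hbp, nibb, h, tb)


# One of each event, encoded as (HBP, NIBB, H, TB).
SINGLE_EVENTS = {
    "NIBB": (0, 1, 0, 0),
    "1B": (0, 0, 1, 1),
    "2B": (0, 0, 1, 2),
    "3B": (0, 0, 1, 3),
    "HR": (0, 0, 1, 4),
}

# Run values over an out, as quoted in the post.
POST_RUN_VALUES = {"NIBB": 0.62, "1B": 0.77, "2B": 1.08, "3B": 1.37, "HR": 1.70}

for event, counts in SINGLE_EVENTS.items():
    implied = shorthand_runs(*counts)
    print(f"{event:5s} implied={implied:.3f} post={POST_RUN_VALUES[event]:.2f}")
```

The implied per-event values stay within a few hundredths of a run of the quoted ones, which is about as close as a formula with round coefficients is going to get.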


#21    Tangotiger      (see all posts) 2009/09/09 (Wed) @ 16:03

“2*(HBP+NIBB)+1.5*H+TB”

Right, that equation is a great shorthand, and I think David Smyth and/or Studes offered that up as their “simple” wOBA-version.

Also, if you divide by PA, you get a league-average of roughly 1.000.


#22    Colin Wyers      (see all posts) 2009/09/09 (Wed) @ 16:29

If the ideal weights are the wOBA weights, then I think:

2*(BB+HBP+H)+TB

Is both cleaner and slightly more accurate. Am I wrong?


#23    Tangotiger      (see all posts) 2009/09/09 (Wed) @ 16:37

Colin:

BB     H      TB     event   sum    sum*.3636
2.0    0.0    0.0    bb      2.0    0.73
0.0    1.5    1.0    1b      2.5    0.91
0.0    1.5    2.0    2b      3.5    1.27
0.0    1.5    3.0    3b      4.5    1.64
0.0    1.5    4.0    hr      5.5    2.00


#24    Colin Wyers      (see all posts) 2009/09/09 (Wed) @ 16:45

I misread the formula and thought it was 1.5*(H+TB) for some weird reason.


#25    Rick G      (see all posts) 2009/09/09 (Wed) @ 21:10

A few thoughts:

Let’s not confuse what makes a stat an accurate measure with what makes it a widely accepted one.

At root, all popular stats have two things in common:

- They can be easily described by a non-stat person in terms of what they represent in the context of a single event, even if that description is not completely accurate.  Batting average = how often a guy gets a hit when he comes to bat.  Slugging percentage = how much power a guy hits for (the likelihood I’m going to see an extra-base hit), etc.

- The general value of the “thing” the stat measures is basically understood.  Hits are good, so more hits are better than fewer hits.

Counting stats fit these criteria clearly.  Some rate stats do, precisely because both the numerator (the counting stat) and the denominator (the opportunities to accrue that stat) are clearly understood.

Batting average isn’t intuitive in terms of what’s good and bad.  Neither is OBP.  They only have value insofar as they meet both of the criteria listed above.  The values have meaning because the average fan understands what is being measured and has developed an understanding of the scale of good and bad.  Mapping a new metric onto one of those scales may address the latter point, but it completely ignores the former.  Without both, it won’t stick.

Even while “good” batting averages change over time, batting average has stuck because it’s easy to understand what it measures.  Despite the accuracy of linear weights and so forth in estimating true run production, unless you can tell a person what that stat represents in the context of a single event, it won’t stick.

As Tango has pointed out many times, as soon as you start introducing complexity, it’s a slippery slope to the most complicated (and presumably most accurate) stat.  So I propose we go in the other direction - towards simplicity.  Batting average is actually pretty complex to calculate.  We not only have to figure out whether the batter reached base on an error or not (a subjective measure to be sure), but then we have to count up at bats (take all plate appearances, subtract out walks, sacrifices, etc.)

The biggest problem with the common stats out there right now is one basic thing: the denominator.  The “at bat” denominator makes life unnecessarily confusing and prevents the average fan from taking the leap away from batting average.  The history of baseball statistics should teach us many things.  Key among these is that change and acceptance of that change happens gradually.  Let’s not try to go from 5 mph (avg) to 15 mph (OPS) to 60 (wOBA).

Here’s a really simple stat that is a logical step from where we are, even if some might consider it a step backwards: Bases acquired per plate appearance (I’m sure it already exists under some fancy name).

- It gets walks in to the formula
- It accounts for the different value of extra base hits
- It is self explanatory. 

When Pujols has come to bat in 2009, he has acquired .805 bases on average, best in MLB by a crazy amount.  Mauer has acquired .659 bases on average, best in the AL.  People get that.
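
As a sketch of how such a stat might be computed: the comment does not spell out exactly which events count as "bases acquired", so the version below assumes total bases plus walks plus hit-by-pitches, divided by plate appearances. That definition, the function name, and the sample line are all assumptions for illustration and are not meant to reproduce the .805 or .659 figures above.

```python
def bases_per_pa(total_bases, walks, hit_by_pitch, plate_appearances):
    """Assumed definition: (TB + BB + HBP) / PA. Returns 0.0 for zero PA."""
    if plate_appearances == 0:
        return 0.0
    return (total_bases + walks + hit_by_pitch) / plate_appearances


# Hypothetical season line, invented for illustration only.
print(round(bases_per_pa(total_bases=300, walks=80, hit_by_pitch=5,
                         plate_appearances=650), 3))
```

Using PA as the denominator is the whole appeal here: there is no need to work out which plate appearances count as at bats.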

We have to be careful not to confuse a record of what has happened with a valuation of what has happened or a manipulation of what has happened such that we can best predict what is likely to happen in the future.  These require different formulae and serve different purposes.  While sabermetricians may care most about the 2nd and 3rd questions, most people don’t.  They care about the first.  And by building a better, still accessible version of the first

In the long run, it sets up the conversation for a linear weights model.  Yes, a HR is worth 4x the value of a single.  Yes, park effects matter.  Yes, stolen bases and caught stealing matter.  But until we get an easily understood stat that lays out the basic premise for a linear weights approach, we’re trying to run before we can walk.  Get bases per plate appearance accepted (everybody, EVERYBODY, in the blogosphere should start using it.  Part of the reason advanced metrics don’t take root is precisely because of the disagreements about how to best calculate them) and wOBA can soon follow.  Get to wOBA and you can get to a run-based metric.  But start simple.

