THE BOOK--Playing The Percentages In Baseball


Friday, December 21, 2007

Quantum Leap or Wiseguy?

By Tangotiger, 03:00 PM

Quantum Leap was one of my favorite shows.  The tag line for Scott Bakula’s character was “setting right what once went wrong”.  The best episode was when he managed to leap back into his own body as a kid, trying to stop his father from an early heart attack, alerting his brother of his impending doom, and keeping his sister from getting involved with the wrong guy.  Fantastic episode, and great ending.  Bakula was perfectly cast.  This show was opposite one of my other all-timers, Wiseguy, with Ken Wahl.  Basically, the show was the precursor to the Donnie Brasco movie.  Brilliant show, and Wahl had the perfect charm for the role.  Why they would put two such great shows targeting such similar audiences (namely, me) opposite each other before VCRs were ubiquitous, who knows.

Anyway, let’s set the sabermetric world right with what is currently wrong.  I’ve made my pleas in various threads here about correcting concepts or improving them.  Shows how much influence I have: nada.  I end up sounding like a wiseguy.

If by the time you read the rest of this post you are becoming offended in some way, then you’ve missed the intended spirit here.  This is not personal, it’s business.  It’s about getting things done better, not continuing to do things the same wrong way.  Check your egos at the door, drop your defense mechanisms, and let’s leave the best-defense-is-a-good-offense approach to Karl Rove.

Let’s come up with all that troubles you and me in the sabermetric world, and I (or we) can prepare a decent primer on what should change, to make it the reference point or launching pad to try to correct things.

Here’s my list:


From Baseball Prospectus:
- entire concept of LEV
- replacement level of WARP
- runs created portion of VORP
- using the term Pythagenport instead of PythagenPat
- the unnecessary complexity of EqA
- [you fill in the blanks]

From Baseball Reference:
- similarity scores (non-separation of counting and rate stats)
- the meaninglessness of OPS+, to be replaced by some form of RC+ (based on RC, LWTS, or BaseRuns)
- [you fill in the blanks]

From Tangotiger:
- over-reliance on the Fans’ Scouting Report (show us better evidence and application, already)
- WPA?
- WPA/LI?
- [you fill in the blanks]

From MGL:
- the confusion, for the reader, created by his use of replacement level for pitchers
- [you fill in the blanks]

From Bill James:
- the non-use of BaseRuns
- the non-use of Leverage Index (see: THT Annual 2008)
- the non-use of FIP
- the non-use of [you fill in the blanks]
- [you fill in the blanks]

From Hardball Times:
- [you fill in the blanks]

From [you fill in the blanks]:
- [you fill in the blanks]

#1    Tangotiger    2007/12/21 (Fri) @ 15:19

From STATS, BIS, mlb.com:
- we want to know where the ball first hit the ground; how many hops it took to get to the fielder or get past him; where did the ball and player pass each other; where was the ball each time it hit something; how long did it take to get somewhere


#2    2007/12/21 (Fri) @ 15:30

I almost feel bad responding here, as I’m a nobody.  But - I see a glaring problem almost every day as I comb blogs, and I’ve never seen an opportunity to bring it up before.  Everyone is focused on player value and, in particular, player value in the past.  This focus comes at great expense in that we ignore any attempt to figure out how good people are at playing baseball.

Let me give a few examples:
LI
WPA
Focus on replacement players
Era adjustments (I’ll comment on this later)

For fairness, some counterexamples:
Strides made in fielding metrics
Recent focus on pitch/fx data

This bothers me to a great degree because I came to love more “advanced” statistics because they came closer to describing how good players were, absent all luck.  I get the feeling that if we are giving players credit for WPA, it’s no better than using RBI as our favorite metric.  WPA/LI does a great service in trying to solve this, but it still gives players credit (or debit) based on how managers choose to use them.

Similarly, we give credit to closers because of the inherent value of being able to enter a game in high leverage situations.  I fear I’m going to see a very real example of the folly of this when Nathan is traded and the Twins refuse to use Neshek in a “closer” role.  He’ll be penalized for the imbecility of the management (or he’ll be helped because more high leverage situations come up in the 7th and 8th - either way it won’t actually reflect his skill at baseball).

This ties in to era adjustments.  I find it admirable to attempt to build a system that allows comparison across eras, but almost all such systems ignore (or seem to from my cursory inspections) the fact that the league talent level could actually change from year to year, and that management seems consistently to get smarter.  To say that a 3rd baseman from the 20’s is more valuable than one today because the average 3rd baseman then hit poorly is missing the cause of this - it’s not that the 20’s player was better than his contemporaries, it’s that managers were making a poor decision to value the wrong things at 3rd base.  I don’t find the question of which player was more valuable to a baseball owner all that interesting.  I want to know which was better at playing baseball, and which would be more valuable if employed by a strategically perfect team.

If we can find a way to move in this direction, we could begin answering questions about whether using a closer is a good idea, whether more teams should value defense vs. offense at each position differently than they do now, etc.

That turned into a bit of a rant - and it’s definitely all based on the opinion that measures of skill are far more interesting than measures of value.  So, take it for what it’s worth.


#3    2007/12/21 (Fri) @ 15:42

I love BP, but… of all the sins listed, I think the most inexcusable is one you left off. They have a lot more money pouring in than almost any other sabermetric outlet. Certainly more than you or David Pinto. They can’t manage to get a ZR/PBP fielding analysis system up and running?


#4    Tangotiger    2007/12/21 (Fri) @ 16:21

Water/2: I appreciate all comments from all, and I’m as much a nobody as you.  I hope others can also blog their ideas here.

Colin/3: great point.  If Pinto can afford it, so can BP.  Man, it’s soooo easy to spend other people’s money.  In this case though, with the money coming directly from the customers, I think it’s a great point Colin brings up.


#5    Anthony    2007/12/21 (Fri) @ 16:52

From Baseball Prospectus, I find the DT cards nearly useless, mainly because I have no idea how the translating process works. The DERA and NRA statistics...are they fielding-independent? They don’t seem to be, and the translated statistics always befuddle me. Now from the JAWS series, I see that 19th century players are getting a huge boost from the timeline adjustment. They really need to offer an explanation there.

I’d also like to see more research on additive park factors, and from BTF, a shift back towards research.


#6    2007/12/21 (Fri) @ 18:46

From BP:
My expectations are higher for BP than any of the above as it’s a paid service. I agree with all of Tango’s complaints, and they all derive from a stubborn desire to reject others’ work. When someone comes along and builds a better mousetrap, be that in the form of LEV, replacement level or whatever, embrace and adopt it.
—Secret sauce. Two of the three components are total garbage as a predictive tool (WXRL, FRAA), so who cares if they have a slight correlation with postseason success?
—Marc Normandin needs to prove that his stuff actually has some predictive ability beyond what a Marcel or PECOTA would have. 
—Get Dan Fox’s baserunning metrics up as a stat report particularly when there’s room for PAP, umpires, and RBI opps.
—This one is a bit harsh. Would Christina Kahrl and Joe Sheehan be able to get jobs at BP today if they had not been there from the beginning? I think not. Both would be fine fits at a major sports portal, but BP should be about people at the top of the profession. They are not insiders, historians or scouts, so their core competency is supposed to be in quantitative analysis. Neither ever does any research of their own, and their use of sabermetric tools is sketchy. Being a good sabermetric writer is akin to being a good lawyer. You lay out a claim and back it up with evidence. Now consider these recent excerpts from Sheehan:

“I think Troy Patton is a pretty good pitcher, capable of being a mid-rotation starter . . . There’s no real upside in the rest of this package, from back-end starter types in Matt Albers and Dennis Sarfate to the same Ron Coomer-looking package Michael Costanzo was a month ago when the Astros got him for Brad Lidge. Other than Patton, there’s no one in this trade capable of being a championship-caliber ballplayer.”

Why? Show me some evidence. And don’t you realize even a .470 pitcher (or hitter) at minimum cost is valuable? He then, for some reason, decides to combine the rosters of the Orioles and Astros and surmises this:
“The fact that we can combine these two rosters and not come up with a clear contender—there’s no way the Astorioles are one of the top eight teams in baseball, and might not be top-12”

Show me some evidence. If you insist on doing this futile exercise, use one of the projection systems to put together an estimated W-L record. Do not just pull some conclusion (even if it ends up being true) out of nowhere.


#7    Greg Rybarczyk    2007/12/21 (Fri) @ 20:42

Tango/#1: Amen!  An XYZ time/position trace of the ball from when the pitcher begins his motion until the play ends.  Put a chip in the baseball to do this.  Then embed another one in each player’s uniform.  Too costly?  Nonsense!  Get a sponsor, the on-screen graphics this would enable could be incredible.

Now my wish:

From: Anyone who invents a new metric, and anyone who has ever done so in the past.

What I want:  A plain-English description of what the number represents, complete with units, the maximum and minimum values the metric can be defined at, and the range of “typical” values.

Bonus:  Then, explain the relationship between the metric and something tangible (either something physical on the input side like time, distance, force, rpm, etc, or something on the output side, like runs, outs, bases, balls, strikes, etc.)


#8    salb918    2007/12/21 (Fri) @ 21:15

For [I don’t know - maybe THT?]

-A running projection system, updated once a week, for every player in the majors.  In other words, a Marcel projection that is updated every week so that we can see the effect of recent performance and “know” a player’s true talent over time.


#9    tangotiger    2007/12/21 (Fri) @ 21:34

Re: 8

I’ve already provided the formula for a “moving average” forecast in the past, but I’ll do so again:
weight = .9994^daysAgo

That’s how much to weight each day’s performance going back as many days as possible.

For pitchers, change the number to .9990.

This should be super-easy for anyone with PBP data to implement.

Your best bet is to get David Appleman at Fangraphs to do this.  He’s got the PBP data as he needs it.  THT, BP, and b-r.com can all do it as well.


#10    tangotiger    2007/12/21 (Fri) @ 21:41

Oh, and everyone starts off with 200 league-average PA and performance for hitters, and 300 PA for pitchers.
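A minimal sketch of the moving-average forecast described in #9 and #10, assuming a single rate stat (successes per PA, such as times on base) and day-by-day totals. The function name, the input layout, and the .330 league rate are illustrative assumptions and do not come from the posts; only the .9994 / .9990 decay and the 200 / 300 PA of league-average performance are taken from the thread.

```python
def decayed_forecast(daily, decay=0.9994, prior_pa=200.0, league_rate=0.330):
    """Estimate a rate from decayed daily performance plus league-average PA.

    daily       -- iterable of (days_ago, pa, successes) tuples (assumed layout)
    decay       -- per-day weight base: .9994 for hitters, .9990 for pitchers
    prior_pa    -- league-average PA everyone starts with: 200 hitters, 300 pitchers
    league_rate -- hypothetical league-average rate, used only for illustration
    """
    weighted_pa = 0.0
    weighted_successes = 0.0
    for days_ago, pa, successes in daily:
        w = decay ** days_ago          # weight = .9994^daysAgo
        weighted_pa += w * pa
        weighted_successes += w * successes
    # The league-average PA are added at full weight; they are not decayed.
    return (weighted_successes + prior_pa * league_rate) / (weighted_pa + prior_pa)


# Example call for a hitter: three days of made-up data, most recent first.
hitter_estimate = decayed_forecast([(0, 4, 2), (1, 5, 1), (2, 4, 3)])

# Same idea for a pitcher, with the pitcher settings from the thread.
pitcher_estimate = decayed_forecast([(0, 25, 7)], decay=0.9990, prior_pa=300.0)
```

With no playing time at all, the estimate is simply the league rate; as weighted PA accumulate, the player's own performance takes over.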


#11    studes    2007/12/21 (Fri) @ 22:04

weight = .9994^daysAgo

Can you explain this in more detail, laying out how the math would work, say, starting with Marcel in the beginning of the year?  Or point me to where it’s written up?

Remember, I’m slow with this math stuff, but I get it eventually.


#12    tangotiger    2007/12/21 (Fri) @ 23:45

I posted this elsewhere, and am cut/pasting it here:

Consider that for pitchers, you have the following:
YR: 1.00
YR-1: .70
YR-2: .70^2
YR-3: .70^3
YR-n: .70^n

What this means is that you take the most recent season, and give it a weight of 1.00. Take the previous season, and give it a weight of .70. Take the performance from two seasons ago and give it a weight of .49, and so on.

Now, we don’t have to limit this to seasonal data. You can actually look at it by day. A little algebra gives us this:
weight = .999^(daysAgo)

If today is May 23, 2008, then the performance of May 22, 2008 gets a weight of .999. Performance from May 13, 2008 gets a weight of .990. The performance of May 23, 2007 gets a weight of .693. And so on.

(For hitters, change .999 to .9994.)

This would be similar to a “moving average”, so that you are constantly updating your forecast by the day, based on the most recent data available.

(You can make it more complicated by also considering the age.)
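The "little algebra" in #12 can be checked with a short sketch: the daily base is the annual weight raised to 1/365, and a given day's weight is that base raised to the number of days elapsed. The function names below are assumptions made for illustration, and the 365-day year is also an assumption.

```python
from datetime import date


def daily_decay(annual_weight, days_per_year=365):
    """Convert a per-season weight (e.g. .70) into a per-day base."""
    return annual_weight ** (1.0 / days_per_year)


def day_weight(today, game_day, decay=0.999):
    """Weight for one day's performance: decay^daysAgo."""
    days_ago = (today - game_day).days
    return decay ** days_ago


today = date(2008, 5, 23)
for game_day in (date(2008, 5, 22), date(2008, 5, 13), date(2007, 5, 23)):
    print(game_day, round(day_weight(today, game_day), 3))

print(round(daily_decay(0.70), 3))
```

Run the same conversion in the other direction and the hitter base of .9994 works out to a per-season weight of about .80.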


#13    salb928    2007/12/22 (Sat) @ 00:06

12: Thanks, Tango.  It is hard to keep up with the volume of your output!!!


#14    Anthony    2007/12/22 (Sat) @ 01:15

That’s tremendous. The next step is to turn it into a full-blown ELO, with quality of opposition, et al included.


#15    2007/12/22 (Sat) @ 01:24

I can’t comment as well as you guys already have about most of this stuff, since I’m not all that familiar with all the metrics.  So I don’t have much to add, except, off the top of my head:

From Bill James: lack of engagement on established clutch hitting research ... my view is that there is substantial evidence that clutch hitting ability does not exist, but Bill never addresses it (other than the Cramer study).

From Baseball Forecaster: lots of results of little studies near the front of the book, but no link to the studies themselves.  Or perhaps the studies were never published, which makes the conclusions kind of useless for serious researchers.

From Baseball Prospectus: how does PECOTA work?  I mean, in detail, so I can code it myself.  If that already exists, I haven’t seen it.  If they want to keep it proprietary, that’s OK too, but then it’s useless to me as a researcher, because I don’t play Rotisserie any more.

From the field in general: too many stats denominated in too many different ways.  I’d like to see a consistent set of values.  For example, for hitters:

First, something VORPish—bulk value over replacement. 

Second, something VORPish on a RATE basis—perhaps per 600PA.  (Right now, if you know a guy is +10 runs over replacement, is he an awesome pinch-hitter, or a below-average full-time guy?)

Third, a bulk measure of run contribution, like Runs Created. 

Fourth, a RATE measure of run contribution, like RC per 600PA. 

Finally, a rate measure of contribution as if he were nine identical players on a team—RC/27 is my favorite here, but I suppose RC per 400 batting outs or something would do also.

All these could also be wins, instead of runs ... whatever.  But I think it would be a huge benefit if everyone tried answering the same question with the same units. 

This would NOT preclude everybody and his brother from using different formulas to calculate these things.  But some people give me eqA, and some people give me wOBA, and some give me this, and some give me that.  Can’t you take your eqA and convert it into “how many runs a team of nine of these guys would score,” so it means something to me?

That way, Tango would say Joe Blow is expected to create 5.5 runs/27 outs next year, and Prospectus could say 5.7 runs, and Forecaster could say 5.4 runs, and I’d understand what the (expletive) they were all talking about!
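To make the units in #15 concrete, here is a small sketch of the two rate conversions mentioned there: a bulk runs figure scaled to 600 PA, and Runs Created scaled to 27 outs. The function names and the sample numbers are illustrative assumptions; no particular run estimator is implied.

```python
def per_600_pa(runs, pa):
    """Scale a bulk runs total (e.g. runs over replacement) to 600 PA."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return runs * 600.0 / pa


def per_27_outs(runs_created, batting_outs):
    """Runs a lineup of nine identical hitters would score per 27 outs."""
    if batting_outs <= 0:
        raise ValueError("batting_outs must be positive")
    return runs_created * 27.0 / batting_outs


# The same +10 runs over replacement reads very differently as a rate:
pinch_hitter = per_600_pa(10, 100)   # +10 runs in 100 PA
full_timer = per_600_pa(10, 650)     # +10 runs in 650 PA

# A made-up season of 110 RC in 400 batting outs, expressed per 27 outs:
rc27 = per_27_outs(110, 400)
```

The first pair is the pinch-hitter versus full-time question from the comment: the bulk number is identical, but the per-600 rate separates the two cases.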


#16    studes    2007/12/22 (Sat) @ 11:50

That everyone, but particularly Retrosheet, Lahman and Baseball Reference, would use the same player IDs and release them in real time—when players are first brought up to the majors.

How did we end up with the current situation anyway?


#17    studes    2007/12/22 (Sat) @ 11:51

BTW, thanks Tango.


#18    studes    2007/12/22 (Sat) @ 11:56

So, let’s just say that I know of a website that is posting the Marcels.  Is there a way to make this daily methodology work with a one-line projection from previous years?


#19    tangotiger    2007/12/22 (Sat) @ 14:08

studes/18: that becomes a bit more complicated.  The forecast includes 200 PA of league average performance, plus weighted past performance.  As you move ahead through time, you continually discount the past performance, but the 200 PA remains a constant (plus of course you add the more recent performance).

Ideally, what you’d want is the forecast split into two: the non-regressed, and the league average data.

Fortunately, the Marcels give you enough information to infer this: I give the reliability column, which is: weightedPA/(weightedPA+200).

So, if someone has a reliability of .80, then you know that I regressed 20% toward the mean.  The weighted PA would come out to 800, and the other 200 is the league mean.

To answer your question: sure, you can do it.  If you intend for THT (or anyone else) to do this as a for-sure, then I can work out the set of equations you need to do this.
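As a rough sketch of the inference in #19 (and not the full set of equations offered there), the reliability column can be inverted to recover the weighted PA, and the published forecast can then be split into its non-regressed part and the league-average part. The function names and the sample rates are assumptions for illustration; the 200 PA figure and the reliability formula come from the comment.

```python
def weighted_pa_from_reliability(reliability, prior_pa=200.0):
    """Invert reliability = weightedPA / (weightedPA + prior_pa)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return prior_pa * reliability / (1.0 - reliability)


def unregressed_rate(forecast_rate, league_rate, reliability):
    """Recover the weighted observed rate from a regressed forecast.

    Assumes forecast = r * observed + (1 - r) * league, where r is reliability.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return (forecast_rate - (1.0 - reliability) * league_rate) / reliability


# A reliability of .80 implies 800 weighted PA alongside the 200 league PA.
weighted_pa = weighted_pa_from_reliability(0.80)

# Hypothetical rates: a .350 forecast against a .330 league average.
observed = unregressed_rate(0.350, 0.330, 0.80)
```

Once the two pieces are separated, the weighted PA and observed rate can be discounted day by day while the 200 league-average PA stay constant, as the comment describes.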


#20    Fargo    2007/12/22 (Sat) @ 15:42

Re Phil Birnbaum’s comments:

“—Get Dan Fox’s baserunning metrics up as a stat report particularly when there’s room for PAP, umpires, and RBI opps.”

I think this is in process now.  See here: http://www.baseballprospectus.com/unfiltered/?p=683

“[H]ow does PECOTA work?  I mean, in detail, so I can code it myself.  If that already exists, I haven’t seen it.  If they want to keep it proprietary, that’s OK too, but then it’s useless to me as a researcher. . . .”

It’s proprietary, as are most projection systems.  All praise to Marcel, however.


#21    2007/12/22 (Sat) @ 15:49

Fargo/20: The first quote isn’t actually mine.

I should add that I have no moral objection to BP keeping PECOTA proprietary ... they were the ones that built it, it’s their property, and they have the right to make a living off it.  All power to them.


#22    Peter Jensen    2007/12/22 (Sat) @ 20:29

Re: Tango #1 post.  Have you asked your contact at mlb.com how they get their hit ball locations?  Are they provided by a third party or do they generate them in house?  Caught balls in the air are no problem, of course.  Neither should be infield ground balls fielded for outs or 99% of outfield ground balls, the exception being balls down the line that hit the wall in foul territory before being fielded.  The problem is almost entirely balls hit in the air for hits.  Has anyone tried asking mlb if they would change their policy to recording where the ball first hits the ground instead of where it is fielded?  If you had that and could convince mlb and/or Sportvision to give us speed off the bat and vertical and horizontal angles of hit balls from the pitch f/x system you would have all the data you need without chips in the balls.

Tom - Your article on catchers in the THT Annual was great.  I would have loved to have seen effect on pitchers included as well though.  Hanrahan’s study showed a large effect but others have shown very little effect.  It would have been interesting to see what your methodology concluded.


#23    HarryAbles    2007/12/22 (Sat) @ 21:05

Tango/19:  If you explained the best way to do it, I’d gladly update projections throughout next year.


#24    2007/12/23 (Sun) @ 09:46

Studes/16:

Oh please yes! And add everyone who does projections as well as every fantasy game out there. Use a Master ID! What an enormous pita every year generating a master file of names! If I was to take over the world, this is one of the first problems that would be rectified :-þ


#25    James Holzhauer    2007/12/26 (Wed) @ 06:44

Re: 18/19, I’d like to see not only those equations, but also the ones for updating pitchers’ forecasts in-season.  My gut tells me that a pitcher who’s having a breakout first half (DIPS-wise) is more likely than a hitter to maintain that growth in the second half.

