
THE BOOK--Playing The Percentages In Baseball


Friday, September 10, 2010

Felix v CC

By Tangotiger, 04:54 PM

A GREAT idea by Poz.

I decided to look at Sabathia vs. Hernandez start by start since the beginning of the year. Like I say at the top, there’s no point to this. I’m not trying to PROVE anything here. I know that looking at start by start is about the least scientific way imaginable to compare the two pitchers. Still: I thought it would be fun to look at their starts, and simply pick who pitched better each time out. Then, at the end of the year, we total ‘em up, see who won … heck, it makes for a fun blog post if nothing else.
...So Hernandez is winning the season 17 to 12, which is pretty decisive.
...But if there’s one thing I hope this little thought experiment does, I hope it makes clear that the mysterious numbers that some people rip — WAR, FIP, xFIP and so on — these things are grounded in what ACTUALLY HAPPENS in baseball games. They’re not just throwing darts in an alley at midnight. There’s a reason that Hernandez has better numbers in all those crazy stats. It’s because game in and game out Hernandez has pitched better than Sabathia.

Great stuff, right?  Now, let’s take it to the next level.  Take Felix’s first start, and compare it, head-to-head, against each of Sabathia’s 30 starts.  Say he wins 3 and loses 27, so that start is worth 0.100 wins.  Repeat with Felix’s second start and so on.  Add them up.

In this way, rather than doing what Joe did, which could have allowed some flukes in there (like CC’s best start pitted against Felix’s best start, and Felix being better, so Sabathia gets 0 wins), each of CC’s starts is up against every one of Felix’s starts (basically his average start).
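A minimal sketch of the all-pairs comparison described above, assuming each start has already been boiled down to a single score where higher is better. The function name and the sample scores are hypothetical (they are not real 2010 game scores), and counting a tie as half a win is an assumption.

```python
def season_wins(a_scores, b_scores):
    """Credit pitcher A, for each of his starts, with the fraction of
    pitcher B's starts that it beats. A tie counts as half a win."""
    total = 0.0
    for a in a_scores:
        wins = sum(1 for b in b_scores if a > b)
        ties = sum(1 for b in b_scores if a == b)
        total += (wins + 0.5 * ties) / len(b_scores)
    return total


# Hypothetical scores, for illustration only.
felix = [70, 55, 62]
cc = [60, 58, 65]

felix_wins = season_wins(felix, cc)
cc_wins = season_wins(cc, felix)
print(felix_wins, cc_wins)
```

Because every start is matched against every opposing start, the two totals add up to the number of starts per pitcher whenever both pitchers have the same number of starts.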

Someone want to try this?


#1    Ryan JL      (see all posts) 2010/09/10 (Fri) @ 17:37

“Someone want to try this?”

That would require, what, 900 comparisons? Yeah, no thanks.  ;)

I was thinking, instead of lining them up chronologically, you could line them up in order of best start to worst (by GS or whatever), and then do what Joe did from there.  That would be a bit better, I think.


#2    Tangotiger      (see all posts) 2010/09/10 (Fri) @ 18:06

Well, once you line them up from best to worst, then the 900 comparisons go down substantially.  For example, if Felix’s #5 is better than CC’s #3, then it’s also better than CC’s #4 through #30.  So, CC gets 2 wins and 28 losses in that case.

Pretty straightforward really…
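One way to sketch that shortcut: sort the opponent's scores once, then use a binary search to count how many of them a given start beats, instead of making 30 separate comparisons. The helper names and the list of CC scores below are hypothetical placeholders, built only to mirror the "better than CC's #3" example.

```python
from bisect import bisect_left, bisect_right


def record_against(score, opponent_sorted):
    """Return (wins, losses, ties) for one start against every opponent
    start, given the opponent's scores sorted in ascending order."""
    below = bisect_left(opponent_sorted, score)       # opponent starts strictly worse
    not_above = bisect_right(opponent_sorted, score)  # opponent starts worse or equal
    ties = not_above - below
    losses = len(opponent_sorted) - not_above
    return below, losses, ties


def total_record(a_scores, b_scores):
    """Sum the per-start records of pitcher A against all of pitcher B's starts."""
    b_sorted = sorted(b_scores)
    wins = losses = ties = 0
    for a in a_scores:
        w, l, t = record_against(a, b_sorted)
        wins += w
        losses += l
        ties += t
    return wins, losses, ties


# Hypothetical CC season: two starts above 75, then 28 starts below it.
cc = [90, 85] + list(range(70, 42, -1))
result = record_against(75, sorted(cc))
print(result)  # (28, 2, 0): the start beats CC's #3 through #30
```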


#3    Xeifrank      (see all posts) 2010/09/10 (Fri) @ 18:30

What kind of park do Felix and CC pitch in?  Hitters’ or pitchers’?


#4    Tangotiger      (see all posts) 2010/09/10 (Fri) @ 18:35

Anyone who is going to quote a one-year PF, see yourself out please.


#5    Tangotiger      (see all posts) 2010/09/10 (Fri) @ 18:39

Also, when it comes to park factors, you have to be careful.  First off, Felix is a non-contact pitcher, and a groundball pitcher, and an extremely good pitcher.  So, parks don’t affect him the way they would an average-contact, average-GB, average-quality pitcher.

This is like the Bonds/3Com issue: Bonds hit as many HR at home as on the road, but LHH hit two-thirds as many homers at 3Com as away from 3Com.  So, it would be inappropriate to think that what affects LHH also affects Bonds.

Same issue at Coors, with Juan Pierre and Dante Bichette: that park simply affects each of these guys differently.

So, if you want to talk about park factors, bring something more to the table than generic park factors.


#6    Elkboy      (see all posts) 2010/09/10 (Fri) @ 18:48

I’m no researcher, but you got me wondering about this.  I used RE24 from Baseball Reference as my gauge of better game.  That’s probably completely wrong, but whatever.

I got:
Sabathia:  369-527-4
King Felix:  532-364-4

I’m not sure why they don’t match up, but that seems about right.  That puts Felix at about 17 wins and Sabathia at about 12, which is exactly what Poz came up with, oddly enough.


#7    Kincaid      (see all posts) 2010/09/10 (Fri) @ 18:49

Without any park factors, I looked at the DIPS game score (using this method from Kevin Harlow:  GSDIPS = 50 + IP + 2*K – 3*BB – 13*HR) for each start of the year for CC and Felix.  I used those for what Tango talks about.  For example, CC had a 74 game score on April 16, which is worse than Felix’s 81 on August 10, tied with Felix’s 74s on July 21 and June 19, and better than every other Felix start.  So CC gets 27 wins, 2 half-wins for the two ties, and one loss for losing to Felix’s best game, and he is 28-2 for that game.  Conversely, Felix is 2-28 against that game.  Repeat for all games.
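The formula and the April 16 example can be sketched as follows. The formula is the one quoted in the comment. The function names are hypothetical, and so is the stat line passed to `gs_dips`. In the Felix list, only the 81 and the two 74s come from the comment; the other 27 values are placeholders standing in for "every other Felix start".

```python
def gs_dips(ip, k, bb, hr):
    """DIPS game score as quoted above:
    GSDIPS = 50 + IP + 2*K - 3*BB - 13*HR.
    Assumes ip is in true innings (7 1/3 innings is 7.333, not 7.1)."""
    return 50 + ip + 2 * k - 3 * bb - 13 * hr


def record_vs(score, opponent_scores):
    """Return (wins, losses) for one start against all opponent starts,
    splitting each tie into half a win and half a loss."""
    wins = sum(1 for s in opponent_scores if score > s)
    ties = sum(1 for s in opponent_scores if score == s)
    losses = len(opponent_scores) - wins - ties
    return wins + 0.5 * ties, losses + 0.5 * ties


# Hypothetical stat line: 8 IP, 10 K, 1 BB, 1 HR.
example_score = gs_dips(8, 10, 1, 1)

# 81 and the two 74s are from the comment; the 27 scores of 60 are placeholders.
felix = [81, 74, 74] + [60] * 27
cc_record = record_vs(74, felix)
print(example_score, cc_record)  # 62 (28.0, 2.0)
```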

Felix ends up 594-306 for a .660 winning percentage against CC.  Or, in Tango’s terms, Felix has 19.8 wins to CC’s 10.2.
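The conversion from a head-to-head record to "Tango's terms" is simple arithmetic, sketched here on the assumption that each pitcher has 30 starts (which is what the 900 comparisons in this thread imply) and that a tie counts as half a win. The function name is hypothetical.

```python
def to_season_wins(wins, losses, ties=0, starts=30):
    """Convert an all-pairs record into a winning percentage and the
    equivalent number of wins over a season of `starts` games."""
    comparisons = wins + losses + ties
    pct = (wins + 0.5 * ties) / comparisons
    return pct, pct * starts


# Kincaid's DIPS game score record for Felix.
pct, season_wins = to_season_wins(594, 306)
print(round(pct, 3), round(season_wins, 1))

# Elkboy's RE24 line for Felix, which includes 4 ties.
elk_pct, elk_wins = to_season_wins(532, 364, ties=4)
print(round(elk_pct, 3), round(elk_wins, 1))
```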

Again, that’s without adjusting game score for park effects.  If I have time, I might try to come up with some crude game score park factors, but that’s a big enough difference that you can probably say Felix wins pretty easily.


#8          (see all posts) 2010/09/10 (Fri) @ 18:50

By 2010 game score rankings it’s 18.5 to 11.5 for Felix.


#9    Kincaid      (see all posts) 2010/09/10 (Fri) @ 19:05

If I plug in RE24 from B-R, I get 527-369-4 in Felix’s favour, same as Elkboy got for his Sabathia line.  Did you count for each one manually and separately?  They should match up.

Since WPA is right next to RE24, plugging WPA (B-R version) in gives 536-364 (17.9-12.1) in Felix’s favour.


#10    Xeifrank      (see all posts) 2010/09/10 (Fri) @ 19:05

Anyone who ignores park factors can also walk out the door.  Oh wait, I don’t run the asylum.  :)

That being said I’d put Liriano and Lee up with Felix for the AL CY.  Could easily argue for any of those three.  Not sure where the CC part of the equation came from.  Probably the east coast bias.


#11          (see all posts) 2010/09/10 (Fri) @ 19:20

I did the same thing with both Bill James GScore and the new game score you proposed in the previous post (1st version).
For Bill James:
Felix: .624 Win%, 18.7 wins
CC: .376 Win%, 11.3 wins

For Tom Tango Game Score (TTGScore):
Felix: .641 Win%, 19.2 wins
CC: .359 Win%, 10.8 wins

The thing that really jumped out to me looking at the TTGScore was that CC’s best game wasn’t as good as any of Felix’s 9 best starts, 2 of which were against the Yankees in Yankee Stadium: the #1 offense in baseball and the 2nd worst park to pitch in by this year’s park factors.


#12    Tangotiger      (see all posts) 2010/09/10 (Fri) @ 21:02

You guys are awesome, thanks for rolling up your sleeves.

So, Poz’s method, while crude, is easily explainable.

And, while limited by the sequence of their starts, you guys have shown that, in this instance, Joe’s method very well reflects what we want to show.

I like it.  Especially because it’s crazy to me that if the Mariners scored more runs, Felix would look like a better pitcher.

“How come they gave the Cy to CC and not Felix for best pitcher?  Because CC’s teammates scored more runs for him??”

Just an insane discussion point.


#13    anon      (see all posts) 2010/09/10 (Fri) @ 21:08

I don’t get it, why are we all of a sudden eliminating park factors?  I thought that was part of the equation.


#14    Tangotiger      (see all posts) 2010/09/10 (Fri) @ 21:44

“one-year PF”

PF count… if you do it the right way (or at least non-wrong way).


#15    Xeifrank      (see all posts) 2010/09/11 (Sat) @ 03:33

Nobody ever mentioned “one-year PF”, so I see this as a bit of a straw-man.  Gonna call you on it!

Have to be careful about rigging the method to help one particular pitcher’s numbers look better.


#16          (see all posts) 2010/09/11 (Sat) @ 03:53

Quality of opposition should be counted as well.
According to BP quality of opp OPS stat, CC has faced hitters with a 714 OPS (lowest among AL starters with 150 IP) and Felix has faced hitters with a 729 OPS.

Looking at the teams CC has faced: he has faced good-hitting teams (750+ OPS) in 8 games, while King Felix has faced such teams in 13 games (Bos, NYY x 3, Tex x 4, Min x 2, CWS x 2, Det x 2).  CC faced Boston 4 times, but twice early in the season when Boston could not hit its way out of a paper bag, and a 3rd time when the Red Sox were without Youkilis and Pedroia.

CC has also faced more weak-hitting teams (sub-710 OPS) than King Felix (13 to 7).  He faced Sea x 3, Bal x 5, Oak x 3, Cle x 2.


#17    Tangotiger      (see all posts) 2010/09/11 (Sat) @ 09:12

Xei, what are you talking about?

I said this:
“Anyone who is going to quote a one-year PF, see yourself out please.”

And one of the responses was:
“I don’t get it, why are we all of a sudden eliminating park factors?”


#18    Colin Wyers      (see all posts) 2010/09/11 (Sat) @ 11:48

“Also, when it comes to park factors, you have to be careful.  First off, Felix is a non-contact pitcher, and a groundball pitcher, and an extremely good pitcher.  So, parks don’t affect him the way they would an average-contact, average-GB, average-quality pitcher.

“This is like the Bonds/3Com issue: Bonds hit as many HR at home as on the road, but LHH hit two-thirds as many homers at 3Com as away from 3Com.  So, it would be inappropriate to think that what affects LHH also affects Bonds.

“Same issue at Coors, with Juan Pierre and Dante Bichette: that park simply affects each of these guys differently.

“So, if you want to talk about park factors, bring something more to the table than generic park factors.”

I disagree. Let’s take the Pierre issue for a second.

Okay, so we know that Pierre isn’t affected by Coors the way other hitters are - or maybe not at all. In three years playing for the Rockies, he hit .308/.356/.371. The next year, playing for Florida, he hit .305/.361/.373.

So, if we were trying to project what Pierre would do for Florida, we’d obviously want to use park factors that recognized that Juan Pierre simply wasn’t being affected by Coors.

But that’s not the same as saying that Pierre was just as valuable for the Rockies as he was for the Marlins. He very clearly wasn’t - because even if he wasn’t being affected by Coors, the hitters the Rockies pitchers were facing were. And so while Pierre wasn’t creating any more runs because of Coors, he was playing in an environment where you need more runs to generate wins.

This is why I feel that, for the purposes of this exercise, we don’t need specific park factors, but very, very generic park factors.


#19    Tangotiger      (see all posts) 2010/09/11 (Sat) @ 12:21

“that Pierre was just as valuable for the Rockies”

Right, which is why one needs to decide whether they are talking about value in a specific environment, or his value in a neutral environment.

Insofar as the Cy Young is concerned, it’s given to the “best pitcher”, or at least, “pitcher who performed the best”.  I definitely would not look for “value” in there, as in “pitcher who provided the most value to his team”.

If you want to talk about Boggs taking advantage of Fenway, or Bonds not being affected by 3Com like other hitters, thereby really shining more than his competition, then sure, give them MVP credit.

Cy is not about MVP.  For me anyway.


#20    Xeifrank      (see all posts) 2010/09/11 (Sat) @ 12:25

Tango, the “see yourself out” comment when nobody had even remotely mentioned a one year park factor was a little overboard.

This head-to-head comparison of starts is a fun exercise, but it is flawed on so many levels for determining who had a better season, and then you are going to nitpick to the nth degree on park factors?  Boo!


#21    Tangotiger      (see all posts) 2010/09/11 (Sat) @ 12:35

Xei: I’m being proactive, because someone was going to mention it, and someone was going to be lazy and look at one year.  I’ll be the judge of being a moderator, thanks…


#22    Telnar      (see all posts) 2010/09/11 (Sat) @ 19:39

I don’t think it will change the result, but the two largest adjustments which favor CC (park factor and defense) have yet to appear in this discussion.


#23    Kincaid      (see all posts) 2010/09/11 (Sat) @ 21:46

Anything DIPS should account for defense, and I’m pretty sure WPA at least is park adjusted (since win probability depends on the run environment; Tango can correct that if it’s wrong).  So there are some approaches in this thread that account for one or the other, if not both at once.  I think for the most part, people aren’t bothering with it because it’s not even close, so there’s no reason to go through the extra effort of adding in adjustments that are only going to cover a fraction of the difference.


#24    Rally      (see all posts) 2010/09/11 (Sat) @ 22:35

No, DIPS does not account for defense.  Not when you are talking about who pitched better for one season instead of projecting next year.

Whatever Tango’s intentions, I had the same reaction that Xei did when I read the preemptive “see yourself out” bit.  I haven’t even looked at what the latest park factors are, but I suspect even a one-year park factor would be better than no park factor at all.  Checking B-R for Seattle: 97 multi-year, 97 one-year.  So what are you getting so worked up over?  Seriously, dude.

For the Yankees, one-year is 96 and multi-year (only 2 in this case since it’s a new park) is also 96.  So we can ignore parks in this case.  Yankee Stadium got a quick reputation as a home run haven, but seems to depress other offensive events.  Defensive considerations: Seattle has a better defense but not by a huge margin; the pbp metrics like Seattle better but the Yankees have the better DER.

So in this case I’d be comfortable going with a game score comparison.  If I were comparing Halladay or Jimenez to Mat Latos, a game score comparison would not be appropriate.

From the time I started typing this post to finishing, the Angels have put a few dents in Felix’s Cy Young bid.  Strangely, they’ve handled him well all year long, while at the same time making a string of mediocrities look like Cy Young candidates.


#25    Tangotiger      (see all posts) 2010/09/11 (Sat) @ 23:19

I don’t know how B-R calculates single year park factors.

The number of runs scored in Seattle home games is 80% of those scored in Seattle away games.  So, no, there’s no way a single-year PF would be better than no PF in this case.
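For reference, a crude home/road run ratio of the kind quoted above can be sketched like this. It assumes the comparison is made per game and counts runs by both teams. The function name and the run totals are hypothetical, chosen only so that the ratio comes out to 80%.

```python
def raw_run_ratio(home_runs_total, home_games, road_runs_total, road_games):
    """Runs per game (both teams combined) in a club's home games,
    divided by runs per game in that club's road games."""
    home_rate = home_runs_total / home_games
    road_rate = road_runs_total / road_games
    return home_rate / road_rate


# Hypothetical totals, not actual 2010 Seattle figures.
ratio = raw_run_ratio(480, 70, 600, 70)
print(round(ratio, 3))
```

A single season of this ratio is a noisy sample, which is the concern behind the earlier warning about one-year park factors.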


#26    Kincaid      (see all posts) 2010/09/11 (Sat) @ 23:23

When you are looking at individual games for two pitchers and comparing them, and you want to remove the effects of one having a better defense behind him than the other, DIPS removes the effects of one having a better defense.  That’s all I meant when I said it accounts for defense in this case.  There are all sorts of debates you could have over how to handle defense for value metrics, but using DIPS does mostly mean one pitcher having a better defense behind him is not inflating his scores.

B-R park factors are all from last year right now (which also means Yankee Stadium’s multi-year factor is just one year).  This year, ESPN has Yankee Stadium as 1.188 and Safeco as .795 (that’s without road games added in, so halve those to compare to B-R PFs).  So using one-year PFs would make a pretty big difference in this case.
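One reading of "halve those" is to halve each factor's distance from neutral (1.000), since a team plays only about half its games at home. A small sketch under that assumption, with hypothetical function names; the 1.188 and .795 inputs are the ESPN figures quoted above.

```python
def with_road_games(home_only_factor):
    """Pull a home-only park factor halfway toward neutral (1.0), assuming
    half of a team's games are played at home."""
    return (home_only_factor + 1.0) / 2.0


def to_100_scale(factor):
    """Express a factor on the 100-is-neutral scale, rounded to an integer."""
    return round(factor * 100)


yankee_stadium = with_road_games(1.188)
safeco = with_road_games(0.795)
print(to_100_scale(yankee_stadium), to_100_scale(safeco))  # 109 90
```

On that scale the one-year figures would come out to roughly 109 and 90, against the 96 and 97 listed at B-R.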


#27    anon      (see all posts) 2010/09/13 (Mon) @ 15:10

so how do the numbers compare when you use multiple-year park factors?  I’ll bet Felix still looks better.

