THE BOOK--Playing The Percentages In Baseball


Saturday, January 29, 2011

Breathless claims of new home field advantage discoveries?  Take a deep breath because…

By Tangotiger, 09:09 PM

J-Doug, among several others, wants to talk to you.  Phil too.

I love it that we have this sabermetric army these days ready, willing and able to knock everyone back to reality.


#1          (see all posts) 2011/01/29 (Sat) @ 22:08

Also, “Wertheim: Yeah, and in games like when the Angels play the Dodgers or the Ravens play the Redskins — games where there’s negligible travel — the winning percentage stays the same. If you fly across the country, you’re not losing any more than you are when you’re the Chicago White Sox playing across town at Wrigley Field.”

Totally untrue.

http://www.behindthenethockey.com/2009/12/2/1174752/impact-of-travel-on-offensive


#2    MGL      (see all posts) 2011/01/29 (Sat) @ 23:04

I’m reading Scorecasting right now on my Kindle (downloaded in about a minute - what a wonderful thing!).  I am barely through it, but so far it is written well, but it rehashes pretty much what has already been studied and the tone is that of a pop culture book rather than a serious compilation of research (like The Book!).  Of course if you want to sell lots of books, you pretty much have to write it as such.  Anyway, I’ll reserve more judgment until I am done.


#3    MGL      (see all posts) 2011/01/29 (Sat) @ 23:15

As J. Holz points out on Phil’s blog (in the comments section), the authors make some pretty bad math/logic/data errors throughout the book.

In the first few pages, they talk about omission bias (whereby people prefer bad outcomes when it is due to omission rather than commission).  As an example, they cite the following experiment:

When people are asked whether they would vaccinate their children if the vaccination caused 5 deaths per 10,000 vaccinations and the flu caused 10 deaths per 10,000 cases of the flu, people tended to say that they would not vaccinate.  The authors state, “Clearly vaccination is the better choice.” Unless I am reading that incorrectly, “Huh?” The correct choice depends, of course, on the chances of getting the flu if you don’t vaccinate.  If it is less than 50%, which I imagine it is, then NOT getting the vaccine is the clear choice!  There are other glaring mistakes like that in the first few chapters alone…
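
The break-even point behind that objection can be sketched in a few lines. This is only an illustration: it assumes the flu figure means 10 deaths per 10,000 flu cases, that a vaccinated child faces only the vaccine risk, and the function names are made up for the example.

```python
from fractions import Fraction

# Assumed reading of the figures quoted in the comment (not checked against the book).
VACCINE_DEATH_RATE = Fraction(5, 10_000)  # deaths per vaccinated child
FLU_DEATH_RATE = Fraction(10, 10_000)     # deaths per child who catches the flu


def death_risk_unvaccinated(p_flu):
    """Death risk for an unvaccinated child who catches the flu with probability p_flu."""
    return p_flu * FLU_DEATH_RATE


def break_even_flu_probability():
    """Chance of catching the flu at which vaccinating and not vaccinating are equally risky."""
    return VACCINE_DEATH_RATE / FLU_DEATH_RATE


if __name__ == "__main__":
    print("break-even chance of catching the flu:", float(break_even_flu_probability()))
    for p_flu in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
        if death_risk_unvaccinated(p_flu) > VACCINE_DEATH_RATE:
            choice = "vaccinate"
        else:
            choice = "no gain from vaccinating"
        print(float(p_flu), choice)
```

Under these assumptions vaccination only comes out ahead when the chance of catching the flu is above one half, which is the point being made.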


#4          (see all posts) 2011/01/30 (Sun) @ 00:32

MGL/3: You’re right!  I hadn’t even noticed that.  He should have said that the flu causes 10 deaths per 10,000 children overall, not 10 deaths per 10,000 children *who get the flu*.

Good catch.


#5    Geri Monsen      (see all posts) 2011/01/30 (Sun) @ 02:40

Not having read the book, I’m not passing judgement over its conclusions.  However, J Doug’s take-down on the strike zone data linked above has a serious flaw.  The authors are claiming that the home team gets better calls in high leverage situations and may end up getting slightly worse calls in low leverage situations.  Thus, J Doug’s heat maps, which don’t separate high leverage situations from low leverage situations could possibly wash out what the authors are claiming is a real effect.

In addition, the authors claim that the HFA is greater in crucial counts—like full counts followed by other three-ball counts, followed by two strike counts.  So, while J Doug might find that HFA seems washed out and inconsequential if one assumes that the HFA is evenly applied over all pitches, he might come to a different conclusion, if he breaks it out by leverage and by count-type.

Or so the authors are claiming.


#6    J-Doug      (see all posts) 2011/01/30 (Sun) @ 03:02

So, while J Doug might find that HFA seems washed out and inconsequential if one assumes that the HFA is evenly applied over all pitches, he might come to a different conclusion, if he breaks it out by leverage and by count-type.

Geri, have you read my post? My entire analysis is based on the effect of errant calls per ball-strike count.

And have you read Phil’s? Because he debunks the low-vs-high leverage effect.

You just brought up two concerns that are not only directly addressed by the two links, but are the sole basis of the analysis.


#7    MGL      (see all posts) 2011/01/30 (Sun) @ 03:10

”...and may end up getting slightly worse calls in low leverage situations.”

That would be odd if that were true.

If, in fact, the author’s claims about the strike zone are incorrect, that would put a serious cloud over the entirety of their book.


#8    Millsy      (see all posts) 2011/01/30 (Sun) @ 13:55

Just re-posting my thoughts here from BtB, but I think it’s certainly an interesting take on the situation:

J-Doug,

It seems as though there could be some bias in the data. If the home field advantage bias is apparent and known by the players, this could affect their swing rate or the home team pitcher’s approach to pitching.

For example, if a Road Team Pitcher knows he won’t get that call on the corners, perhaps he throws it down the middle more often. If that were true, whatever offensive production statistic you want to use would seem to be increased, but the increase would not be attributed to the bias in your model. And you would see many fewer opportunities for the umpire to blow any calls.

Similarly, a Road hitter up to bat may know of a bias and swing more at pitches outside the zone, again reducing the possibility of a blown call against the Road Team.

The opposite would happen for the Home Team players, ultimately reaching some equilibrium (or near it). In that case, the model would fail to show any effect, no?

Ultimately, the HFA could be less attributed to the blown calls themselves (as you calculate), and more directly a result of pitches nearer the center of the plate due to a known bias.  This adjustment doesn’t necessarily have to happen *after* the blown call if the home bias is known, and therefore would not be picked up in the ‘blown call bias’ calculation you speak of.


#9    J-Doug      (see all posts) 2011/01/30 (Sun) @ 14:20

Millsy:

Thanks. I responded at BtB but I’ll repost here.

I would not be surprised to see this happening. We know that BABIPs are higher for the home team.

The data aren’t so cooperative, however. There’s very little split in the number of pitches thrown inside and outside the legal zone—the home team sees 45.4% and the away team 45.2%. I also find that the home team gets pounded inside more but doesn’t see more pitches off the outside corner. I also doubt these numbers are statistically significant but I’d have to look closer.

Either way, the problem is this isn’t the finding that’s reported in Scorecasting. I have to investigate further, but it seems they claim specifically that there is a significant bias in home vs. away plate calls and that this is a significant component of the home field advantage. I could be interpreting this wrong, but that seems to be what they’re saying.

Personally, I think the bias may have a lot more to do with calls on balls in play, pickoffs and stolen bases. We know that foul calls are significantly biased in favor of the home team in college basketball. It would not surprise me if there were a significant effect going on here.

And, of course, on most balls in play, a bias in the call would have a far stronger effect than the bias on called pitches, and with a smaller sample size would be less likely to even out over the same period of time.


#10    Millsy      (see all posts) 2011/01/30 (Sun) @ 14:28

Thanks for the response, J-Doug.  My suggestion was going to be to simply look at the pitch location distribution for home and away.  And I haven’t read Scorecasting, so I was not sure if they had accounted for this or not (the possibility of running some sort of bias correcting econometrics approach would seem right up the alley of a Chicago finance professor).

The pitching approach issue based on a number of things (including umpire bias) is something I’m beginning to delve into in the office.  Unfortunately I haven’t had much time to do so.  I’ve enjoyed your stuff at BtB though.


#11    MGL      (see all posts) 2011/01/30 (Sun) @ 15:26

From Scorecasting:

Major league hitters hit 27 points lower the first time they face a pitcher in a game.  Their on base percentage is about 27 points lower and their slugging percentage is 58 points lower the first time they face a pitcher.

First of all, “Lower than what?” Their overall expected numbers?  The 2nd through the order? The 4th time? My advice to anyone who is writing an informative article or book, especially one whose substance is largely based on data and mathematical interpretation of that data:  If the meaning of a sentence is ambiguous, re-write that sentence!

More importantly, where in the heck did they get these numbers from?  According to The Book, the first time through the order, batters hit .264/.336/.423.  The 3rd time, they hit .282/.346/.460.  The difference in BA is 18 points, OBA is 10 points, and SA is 37 points.  And the difference between the first time and all times combined is considerably less.  Their numbers are ridiculous on their face.
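
A quick way to check those gaps, using only the two slash lines quoted from The Book above (the helper name is just for illustration):

```python
# Slash lines (BA, OBP, SLG) as quoted in the comment above.
FIRST_TIME = (0.264, 0.336, 0.423)
THIRD_TIME = (0.282, 0.346, 0.460)


def point_gaps(low, high):
    """Gaps between two slash lines in 'points' (thousandths), rounded to whole points."""
    return tuple(round((h - l) * 1000) for l, h in zip(low, high))


if __name__ == "__main__":
    # Compare with the 27 / 27 / 58 point gaps attributed to Scorecasting.
    print(point_gaps(FIRST_TIME, THIRD_TIME))
```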

Another word of advice to those same would-be writers.  If your sentence technically makes little or no sense, even though some (or even most) people may understand what you mean, also re-write it.  In the chapter on competitive balance, which is terrible as someone points out on Phil’s blog (for example, they talk about the number of games per season constituting sample size with respect to the best team coming out on top, without addressing the notion of the number of opportunities per game, as Tango has discussed several times), they write this sentence, with regard to baseball:

If two teams play one game, anything can happen, but if they play a good many games, the better team will win a majority of the time.

That sentence is like saying, “Apples are red, but cherries are sweet.” Obviously, while that is true, apples and cherries are both red and sweet, so the “but” makes no sense.  Same with their sentence.  Over a good many games, “Anything can happen.” And in one game, the better team also wins a majority of the time.  I am a stickler for words.  You better say what you mean in an accurate fashion when you write something informative or you will be called out on it!


#12          (see all posts) 2011/01/30 (Sun) @ 15:38

#9 J-Doug - I have been diving into the home vs away records since the discussion started a couple of days ago.  I was going to go off of differences I have seen like:

Base running - I have seen away batters in Boston think they hit a ball over the Green Monster, but don’t.  They weren’t running hard and didn’t get an easy double.  It has been hard to quantify if home teams take more extra bases than away teams without seeing if the home team was built for speed in the park, but home teams on average hit 33% more triples than the away team.

Errors - After watching the Royals struggle in the Metrodome for years, familiarity with a home park can definitely help the home team.  There is a bias, but the numbers work out to about 4 extra base runners per 8000 plate appearances (not much).

I have noticed the BABIP and that is a huge difference for something that shouldn’t be so much of a difference.  I am looking into it currently and hope to publish the results at Fangraphs tomorrow.


#13    MGL      (see all posts) 2011/01/30 (Sun) @ 15:41

The authors talk quite a bit about irrational (non-optimal) behavior in sports, by players and coaches.  One thing they discuss along those lines is how loss aversion contributes to this.  For example, in baseball, they claim that when a pitcher gets to a 3-2 count, if he started at 3-0, he throws more fastballs (55 to 51%) than if he started with an 0-2 count.  They also state that the batter’s BA on the 3-2 count is 11 points higher when the count started at 3-0 rather than at 0-2.  They claim that the reason is loss aversion.  The pitcher and batter are altering their strategy because when they start out ahead, they don’t like to lose back what they had gained, so they take a more “conservative” strategy, but when they started out near failure (0-2 for the batter and 3-0 for the pitcher), they don’t mind failing as much, and they adopt a more “aggressive” approach.

While this may be true, I don’t think that the authors properly accounted for selective sampling in their data.  I doubt that the pool of pitchers and batters, especially the latter, is the same in both groups (3-0 to 3-2 and 0-2 to 3-2).  I suspect that the 3-0 to 3-2 pool of batters are much weaker than the 0-2 to 3-2 pool.  If you get behind or ahead on a strong batter, you are more likely to “nibble” than on a weak batter.  The outs and bases are also not likely to be the same in the two pools.  Anytime a pitcher nibbles at 0-2, it is more likely to be a situation where they don’t want to give up a hit and don’t mind a walk.  If you are 3-0 and then go 3-2, it is more likely a situation where you don’t want a walk.

More importantly, I think, if a pitcher started out at 0-2 and is now 3-2, he is more likely going to continue nibbling.  If he is 3-0 and then goes to 3-2, he is more likely to continue to throw a strike.  Which situation do you think will yield the higher BA?  The 3-0 of course, regardless of the pool of batters!  Why didn’t they look at OBP in this situation?


#14    Millsy      (see all posts) 2011/01/30 (Sun) @ 15:43

Jeff/12,

Out of curiosity, is the BABIP issue related to Line Drive rate, or does the lack of comfort on defense also account for the increase?  If so, I would imagine this is pretty strong evidence that away players really are getting these pitches closer to their wheelhouse.  (Of course, directly relating this to umpire calls would need a bit more work.)

Maybe I’ll have to wait and see tomorrow at Fangraphs.


#15    MGL      (see all posts) 2011/01/30 (Sun) @ 15:50

Here’s another one:

Regarding the 1968 Harvard/Yale Ivy League Championship game:

Yale controlled the game, up 29-13 with less than a minute to play...After recovering a fumble, Harvard scored an unlikely touchdown.  With nothing to lose, it tried a two-point conversion…

Huh?  “With nothing to lose?” That’s why they tried it?  Perhaps it was because they were down 16 points and needed 2 touchdowns and two 2-point conversions to tie the game?  As I said, this book is sometimes written like it is a pop-culture piece rather than a serious econometric and statistical treatise.  I suppose that is defensible to some degree.


#16    studes      (see all posts) 2011/01/30 (Sun) @ 15:58

I’m not keeping up with all the dialog, but is anyone referencing John Walsh’s work in this area from the THT 2011 Annual???

He found a little more than one-third of the home field bias is due to the home plate umpire.


#17          (see all posts) 2011/01/30 (Sun) @ 16:40

Studes, I mentioned this at BtB, too, but do you have any idea why John’s findings on this matter are different than what Dan Turkenkopf and J-Doug found?

I know we don’t do sabermetric analysis by voting, but it seems odd that Dan and J-Doug both found 0.06 runs/game by independent methods, and John found 0.14 runs/game.


#18    studes      (see all posts) 2011/01/30 (Sun) @ 16:52

I posted a summary of John’s findings here:

http://www.hardballtimes.com/main/blog_article/the-ump-in-the-home-field-advantage/

I don’t know, Mike. John found a difference of about 0.8 balls/game. Do you know if Dan and J-Doug found a similar difference?


#19          (see all posts) 2011/01/30 (Sun) @ 17:05

Dan’s work is here:
http://www.beyondtheboxscore.com/2008/4/24/459913/a-strike-is-a-strike-right

He doesn’t state the number of balls/game, but he states that he used 0.161 runs/pitch as the conversion to runs, which is very close to what John stated he used (0.17 runs/pitch).

If it were just Dan’s data, one might assume it was a difference between Dan’s study being based on 2007 and John’s being based on 2008-2009, but now J-Doug has replicated Dan’s findings from 2008-2010 data.

Oddly enough, Dan used John’s definition of the strike zone, while I believe J-Doug used the rulebook zone.
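
For anyone comparing the studies, here is a rough sketch of the conversion being discussed. It uses only the figures quoted in this thread (0.8 balls/game and 0.17 runs/pitch for John, 0.06 runs/game and 0.161 runs/pitch for Dan); the function names are invented, and treating every changed call as worth one flat run value is a simplifying assumption.

```python
def runs_per_game(changed_calls_per_game, runs_per_call):
    """Convert a per-game difference in ball/strike calls into runs per game."""
    return changed_calls_per_game * runs_per_call


def implied_calls_per_game(runs_per_game_value, runs_per_call):
    """Back out the per-game call difference implied by a runs-per-game finding."""
    return runs_per_game_value / runs_per_call


# John Walsh's figures as summarized in the thread.
walsh_runs = runs_per_game(0.8, 0.17)

# Dan Turkenkopf's runs/game and conversion factor as quoted in the thread.
dan_calls = implied_calls_per_game(0.06, 0.161)

if __name__ == "__main__":
    print(f"Walsh: {walsh_runs:.3f} runs/game")
    print(f"Dan (implied): {dan_calls:.2f} calls/game")
```

On these numbers the gap between the studies is in the count of affected calls per game, not in the run value assigned to each call.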


#20          (see all posts) 2011/01/30 (Sun) @ 17:19

#14

I am hitting a wall with the incomplete data right now, so I am looking for some help.  Over the past 4 years the difference in BABIP in the home team’s favor is 0.0071. This is a difference of over 220 additional hits for the home teams in a season.

Over that same time period, the average difference between a home and away team’s LD% is 0.3%.  The increase in LD data accounts for some of the change, but I would love to have the following data for a time period:

Home vs away defensive numbers.  Do the numbers show that home teams get to more balls in play than away teams?

- and -

Home vs away batted ball data and the BABIP value for each. I would like to see where the extra balls in play are being hit.  Is it players knowing the bounces of the infield making more plays?  Is it outfielders knowing they have plenty of range to shag a long-hit popup?


#21          (see all posts) 2011/01/30 (Sun) @ 17:30

As I posted over on the BtB thread, after reading John’s article more closely, I see that he made his calculation for 0-0 counts only and then extrapolated that out to the full game.  Given J-Doug’s and Dan’s matching findings from independent studies, I’m skeptical that John’s extrapolation from 0-0 to all counts is valid.


#22    Guy      (see all posts) 2011/01/30 (Sun) @ 18:56

MGL/13: I have to say, this sounds like a load of crap to me.  Who’s to say if throwing a fastball at 3-2 is “aggressive” or “conservative?” You are more likely to give up a hit, but less likely to give up a walk.  Both of those are a form of “loss” to the hitter.  I bet that if they found that pitchers who started 0-2 threw more fastballs, this too would have become evidence of loss aversion.  This strikes me as a fishing expedition, in which almost any disparities the authors found were deemed evidence for one of their theories.

I propose a new rule for anyone who wants to use sports statistics to prove various theories in behavioral economics.  Publicly post your theory about how the sports data could confirm the theory BEFORE you do your research, so we know what your hypothesis actually was.


#23    MGL      (see all posts) 2011/01/30 (Sun) @ 21:48

Guy, I don’t think their conclusions about loss aversion are necessarily out of line on their face, given the difference in fastball percentages they found (If I am afraid to walk someone after I started out 0-2, I might throw more fastballs if I am loss averse).  I just don’t think that (loss aversion) is necessarily the reason for the differences they found.  I think it probably has more to do with their approach given that they either just threw 2 strikes or 3 balls.  As I said earlier, if they started out 3-0 and went to 3-2, it is more likely to be a weak hitter or a situation where they didn’t want to walk the batter in the first place.  So naturally they are going to throw more fastballs at 3-2, whether they are loss averse or not.

In any case, the only proper way to do the analysis to see whether there is some kind of loss aversion going on is to control for the quality of the batter and the game, out, base runners, and score (basically hold the approach constant).  They clearly did not do that and any conclusion about loss aversion or not is unwarranted.

I agree that they seem to have started out with the assumption that players in baseball will show the same kind of loss aversive (and sub-optimal) behavior that coaches and managers tend to show, and presumably golfers show, and then they went fishing to find evidence to support that assumption, without properly controlling for confounding variables.

If I recall correctly, there were some legitimate criticisms of the golf study that these authors cite as a prime example of loss-aversive behavior by professional athletes.  I am not sure if the authors of that study controlled for number of putts made before the putt being looked at.  In other words, even if you control for distance and break, a par putt which is your second or third putt is going to be made more often than a birdie (or eagle) putt which is your first putt, for obvious reasons. You would think that they controlled for the number of putts attempted before the putt in question, but I am not sure.


#24    Guy      (see all posts) 2011/01/30 (Sun) @ 22:14

I certainly agree there are likely other causes for the difference in fastball rates.  But even leaving that aside, it’s not clear which outcome should be associated with loss aversion.

You say here (23) that the 0-2 pitcher might throw more fastballs to avoid “losing” a hitter he should have put away.  OK.  But earlier (13) you said the authors reported exactly the opposite pattern.  Which is it?  Since you say BA is higher following 3-0, I assume that’s the sequence with more fastballs.  In any case, that supports my point:  you could claim loss aversion either way.  If the 0-2 pitchers throw fewer fastballs, you could say that is “conservative” because it reduces the chance of a hit or HR.  But it increases the risk of a BB, so you could also call it “aggressive.”

*

The real question is whether there is evidence that either pitchers or hitters are acting suboptimally in one of these cases.  Presumably that must be true if pitcher and hitter talent are the same, along with score and base/out, but, as we both agree, that’s likely not the case.  So pitchers and hitters may be acting quite rationally in both sequences.


#25    Xeifrank      (see all posts) 2011/01/30 (Sun) @ 22:45

So if umpires were automated (as much as possible, at least on balls and strikes though) would we see a dip in HFA numbers?
vr, Xei


#26    MGL      (see all posts) 2011/01/30 (Sun) @ 23:59

“So if umpires were automated (as much as possible, at least on balls and strikes though) would we see a dip in HFA numbers?”

Are you not reading this thread? The authors claim that umpire bias accounts for the lion’s share of the HFA.  Sabermetric researchers claim that it accounts for some of the HFA.  So what do you think is the answer to your question?

Guy, right, I made a mistake.  The authors claim that when a pitcher starts off at 3-0, they throw more fastballs, which they call a “conservative approach.” They say that throwing more off-speed pitches is the more aggressive approach, which pitchers use when they were ahead 0-2 and are now facing a “loss”.  As you say, the important thing is whether pitchers (or batters) are using sub-optimal approaches based on whether they were 0-2 or 3-0, which should be irrelevant.  I doubt that they do.  As I said, I think the reason for the differences in pitch selection and outcome is based on the pool of pitchers and batters in each count sequence (3-0 or 0-2) and the game situation.

Anyway, the more I read this book, while they present some good stuff, the less respect I have for the authors, particularly the so-called economist.  Consider this passage:

They cite the discredited Pope and Simonsohn study whereby the authors found that .299 hitters hit .430 in their last AB of the season, thus they were somehow able to ratchet it up around 130 points just because they really wanted to, or the pitchers were throwing them meatballs (yeah, right!) Of course the real reason why .299 hitters hit .430 in their last AB is because when a .299 batter gets a hit in his next to last AB and thus reaches .300, he is likely to be pinch hit for in his last AB.  In fact, it takes only around a 2/3 pinch hit rate for a true .290 hitter to hit .430 in his last AB. Consider that in the penultimate AB of the last game of the season, .299 batters will get 29 hits in 100 AB.  Of those, say only 33% take another AB.  So we have 80.7 batters batting again and 19.3 batters who sit on the bench with a hit in their last AB.  Of those 80.7 batters, 23.4 (.290*80.7) get a hit for a total of 42.7 (23.4+19.3) hits or a BA of .427 in their last AB.
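
The selective sampling arithmetic above can be written out as a small sketch. It assumes, as the numbers in the comment imply, a true .290 talent level and that one third of the batters who get a hit in the penultimate AB stay in the game; the function name and parameters are hypothetical.

```python
def last_ab_average(true_avg, stay_in_rate, batters=100.0):
    """Observed BA in each batter's final AB when some batters sit after a hit.

    true_avg:     assumed true batting average in any single AB
    stay_in_rate: share of batters with a penultimate-AB hit who still bat again
    """
    hits_penultimate = batters * true_avg
    # These batters are lifted, so their "last AB" is the hit they just got.
    sit_out = hits_penultimate * (1.0 - stay_in_rate)
    # Everyone else bats once more at the same true average.
    bat_again = batters - sit_out
    hits_final = bat_again * true_avg
    return (sit_out + hits_final) / batters


if __name__ == "__main__":
    print(round(last_ab_average(0.290, 1 / 3), 3))  # selection only, no change in skill
    print(round(last_ab_average(0.290, 1.0), 3))    # nobody sits: back to the true average
```

No change in approach or effort is modeled here; the jump comes entirely from who gets removed from the sample.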

Anyway, the authors write this about that study:

...In that final AB of the season, .299 hitters have hit almost .430...(Why, you may ask, don’t all batters employ the same strategy of swinging wildly (.299 batters rarely take a walk in their last AB), given the success of .299 hitters?  Does this not indict their approach the rest of the season?  We think not.  For one thing, these batters never walk, so their OBA are markedly lower than those of more conservative hitters. Also, if every batter swung away liberally throughout the season, pitchers would adjust accordingly and change their strategy to throw nothing but unhittable junk.

What a bunch of crap!  For one thing, the economist author should easily have realized that the .430 BA had nothing to do with the .299 batters having more success just because they really, really wanted to.  He should have explained the selective sampling issue if he is worth his weight in tenure.  Secondly, an OBA of .430 (your OBA can rarely be lower than your BA) is “markedly lower than those of more conservative batters?” Huh?  What conservative batters is he talking about?  Bonds?  Mantle? And don’t the authors think that pitchers know that .299 batters are swinging wildly in their last couple of AB’s and are adjusting their strategy accordingly?  And pitchers would NOT throw “nothing but unhittable junk” otherwise batters would NOT be swinging at those pitches.  Have these authors never heard of game theory?  Wow!


#27          (see all posts) 2011/02/01 (Tue) @ 14:10

The paper on umpire bias and home team advantage caught my eye and I looked at it from the viewpoint of first pitches.

Of Pitches Taken:

First pitch Called a Ball (not including intentional balls)
2007-- 2008-- 2009-- 2010-- Total
=================================================
45.23% 45.69% 45.59% 44.88% 45.35% Vis
46.48% 46.27% 46.73% 46.10% 46.40% Home
=================================================
-1.25% -0.58% -1.15% -1.22% -1.05% Vis-Home

First Pitch Called a Strike
2007-- 2008-- 2009-- 2010-- Total
=================================================
35.21% 35.17% 36.14% 37.07% 35.90% Vis
34.65% 34.74% 35.20% 36.23% 35.21% Home
=================================================
+0.56% +0.43% +0.94% +0.84% +0.69% Vis-Home
(--Retrosheet Event files)

As can be seen from the above tables, the first pitch was called a ball on the Home batters at a slightly greater rate than for the visiting batters, and the first pitch was called a strike slightly more often on the Visiting batters than on the home batters. It is not a large amount of difference but it is completely consistent across the two categories for the four years studied.
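
A small sketch for recomputing the Vis-Home rows from the table values. It assumes the rows are split by batting team, and because the inputs are already rounded to two decimals, a recomputed difference can be off by 0.01 from the posted one.

```python
YEARS = (2007, 2008, 2009, 2010)

# Percent of taken first pitches called a ball / a strike, copied from the tables above.
BALL = {"Vis": (45.23, 45.69, 45.59, 44.88), "Home": (46.48, 46.27, 46.73, 46.10)}
STRIKE = {"Vis": (35.21, 35.17, 36.14, 37.07), "Home": (34.65, 34.74, 35.20, 36.23)}


def vis_minus_home(table):
    """Visitor rate minus home rate for each year, in percentage points."""
    return tuple(round(v - h, 2) for v, h in zip(table["Vis"], table["Home"]))


ball_diff = vis_minus_home(BALL)
strike_diff = vis_minus_home(STRIKE)

if __name__ == "__main__":
    for year, b, s in zip(YEARS, ball_diff, strike_diff):
        print(year, f"ball {b:+.2f}", f"strike {s:+.2f}")
```

The sign of each difference is the same in all four seasons, which is the consistency the comment points to.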

While the amounts are not large, there is a big difference between batting after an 0-1 count and batting after a 1-0 count. The following table shows MLB batters’ OPS+ relative to their Total OPS for the years in question.

tOPS+ for batting after an 0-1 count/batting after a 1-0 count
============
2007--69/126
2008--69/126
2009--67/128
2010--69/127
(--baseball-reference.com)

While this is not enough to prove umpire bias towards the home team, I do think it goes some way towards explaining why home teams have an edge in baseball.


#28    Tangotiger      (see all posts) 2011/02/01 (Tue) @ 14:39

Phil takes exception to the lucky/unlucky thesis as well:

http://sabermetricresearch.blogspot.com/2011/02/scorecasting-are-cubs-unlucky-or-is-it.html


#29    studes      (see all posts) 2011/02/01 (Tue) @ 15:54

Cliff, that was exactly what John Walsh analyzed in the THT Annual.


#30    MGL      (see all posts) 2011/02/01 (Tue) @ 16:10

#27, as you know, without using pitch f/x data (or perhaps even with it), there is no way to know whether it is because of the umpires or the pitchers (or batters).  The traditional view is that the visiting pitchers don’t pitch as well.  Given that, it would not be surprising if they threw more balls at any count, including on the first pitch.  Likewise, home batters could be taking more pitches that are balls and fewer that are strikes.

I’ll be back later with some more commentary on the book.  I will now mention something in the book which doesn’t seem to make sense:

The authors claim that in interleague games between teams in the same city, “the home teams win at exactly the same rate at which they normally do.”

It is curious that they used inter-league games in an example of how travel does not affect HFA in baseball.  Inter-league games themselves have a higher HFA than regular season ones - around 55/45 versus 53/47 or 54/46 (likely due to the DH/no DH rule), so you would expect that inter-league games among teams of the same city would NOT have the exact same HFA as normal, simply because they are IL games.


#31    Peter Jensen      (see all posts) 2011/02/01 (Tue) @ 17:42

Any theory of the cause of the HFA is going to have to explain why the majority of the HFA occurs in the 1st, 3d and 5th innings with the 1st inning contributing almost 50% more advantage than any other single inning.  I can’t imagine a reason why umpire calls would be skewed in that manner.


#32    J-Doug      (see all posts) 2011/02/01 (Tue) @ 18:09

Any theory of the cause of the HFA is going to have to explain why the majority of the HFA occurs in the 1st, 3d and 5th innings with the 1st inning contributing almost 50% more advantage than any other single inning.  I can’t imagine a reason why umpire calls would be skewed in that manner.

This seems to directly contradict Scorecasting’s claim that the HFA is weak (actually reversed) in low-leverage situations.


#33          (see all posts) 2011/02/01 (Tue) @ 18:09

#29 That’s not the first time I’ve re-invented the wheel. When I read this thread it was before you posted the link to your article on Walsh’s work. He did take it a step beyond what I did but I’m still not sure that it proves umpire bias (I’m not sure it disproves it, either).

I haven’t gotten to splitting it out on a team basis yet, and I suspect that it could be interesting to look at it by umpire, too (even with the necessarily small sample sizes).


#34    MGL      (see all posts) 2011/02/01 (Tue) @ 20:19

I just finished with the chapter, “So, what is driving the HFA?”

They make a very compelling case that it is largely or at least significantly umpire-based in all sports, including baseball.

I agree with Peter that anyone analyzing the causes of HFA in baseball needs to address the huge 1st inning imbalance.  I am not aware of anything major in the 3rd and 5th.  And yes, that seems to contradict the low and high leverage data that the authors report (they show that HFA, as reflected in home called strikes and balls, is much larger in high leverage situations and non-existent in low leverage situations), since the 1st inning is largely low leverage.

One thing the authors state is that looking at pitch F/x data, they find no difference in speed, break, and location for home and away pitchers, independent of the results of the pitch, and I assume controlling for count (although I am not sure about that).  Is that true? Surely, John, Mike, or some of the other pitch f/x guys have looked at home and away pitches.  Anyone know what they found?

Another interesting thing that the authors claim is that when Questec was in existence from 02-08, there was NO umpire bias in Questec stadiums.  That is a strong claim.  Since they assert that umpire bias is a major component of HFA, overall HFA must have been lower during the Questec period, at least when many stadiums had it installed (11 at the max, I think).  Plus, there should have been very little HFA for those teams that had Questec.  Is that true? Maybe that is one reason for the higher HFA we see in the last few years (no Questec anymore).  Although, you would think that the HFA would have simply returned to pre-2002 levels. They say that the reason that pitch f/x does not have the same (deterrent) effect on the umpires is that it is not used directly to evaluate them, which is true, so that makes sense. However, Matt Swartz’s numbers from his 4-part BP article on HFA show that from 2000-2009, roughly the “Questec era”, HFA was .541, which is quite high by historical standards.  Maybe someone can look at the team HFA for Questec teams - there were 11 of them at its peak.

Tango, if someone wanted to duplicate their ball/strike, home/road versus leverage numbers, how does one calculate the LI from the score, inning, base runners, etc.?  Is there a chart for that somewhere?  Or a formula?

One glaring mistake they make, which was pointed out here or on Phil’s blog, is this:

They say that the difference between umpire ball and strike calls for the home and away teams is around 7.3 runs a season.  They then say:

That might not sound significant but cumulatively, home teams outscore their opponents by only 10.5 runs a season.  Thus, more than 2/3 of the HFA in MLB comes by virtue of the home plate umpire’s bad calls.

And throughout the entire book, they refer directly or indirectly to this “2/3” number.  Except, even if we accept the 7.3 runs number at face value, it is not even close to 2/3 of the run differential between the home and away teams.  Most of you probably know why that 10.5 number is wrong right away. Yet, these authors don’t.  That is incredible. How are we supposed to take the rest of the book seriously when they don’t even realize that the home team scores considerably fewer runs per game or per season simply by virtue of the fact that it doesn’t bat in the bottom of the last inning in around half the games!

Now, first of all, a 10.5 run difference between home and away seems low.  It depends on the year or years of course, but I think it is twice that in the 90’s and aughts.  More importantly though, the home team averages around .5 innings per game less than the road team, which is around .25 runs per game or 40 runs a season!  So even if their 10.5 runs was right, after we add another 40 runs, that is a 50 run difference and 7.3 runs is now around 15% of the HFA rather than the 2/3 that the authors claim.  That sort of changes the whole tone of the book, doesn’t it?  How can they make such an elementary and important mistake?  Did anyone who knows even a little about baseball proofread the book?
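
For anyone who wants to check the arithmetic above, here is a short Python sketch. It uses only the figures quoted in this comment (7.3, 10.5, and the rounded 40 runs for the missing half-innings); the variable names are made up for illustration.

```python
# Figures quoted in the comment above; the names are illustrative only.
umpire_runs = 7.3          # authors' claimed ball/strike effect, runs per season
raw_run_diff = 10.5        # authors' home-minus-road run difference per season
missing_inning_runs = 40   # rounded runs the home team gives up by skipping the last half-inning

# Share of HFA attributed to umpires, before and after the adjustment.
share_claimed = umpire_runs / raw_run_diff
share_adjusted = umpire_runs / (raw_run_diff + missing_inning_runs)

print(f"authors' share:  {share_claimed:.1%}")
print(f"adjusted share:  {share_adjusted:.1%}")
```

The first number lands near the authors' "more than 2/3", and the second lands in the neighborhood of 15%, which is the point of the paragraph above.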


#35          (see all posts) 2011/02/01 (Tue) @ 20:22

I missed the “bottom of the 9th” issue too.  Dumb Phil.


#36    MGL      (see all posts) 2011/02/01 (Tue) @ 20:30

For people who have not read this book, which I assume is most people reading this thread, they admit in the book that even if called strikes and balls differ significantly for the home and away teams, which they claim is true, that alone could be due to umpire bias, or batter or pitcher home/road performance.

To rule out batter or pitcher performance, they do several things: 

One, they show using pitch f/x data that umpires call many more pitches in the zone balls for road pitchers and many more pitches outside the zone strikes for home pitchers.

Two, that the differences are marked in high-leverage situations (and especially at 2 strike and 3 ball counts when the “count leverage” is also high) and non-existent in low-leverage situations (you would not expect batter or pitcher home/road performance to change with the leverage, but behavioral and cognitive psychology would suggest that umpire behavior might).

Three, Questec significantly affects these home/road differences.

Four, pitch f/x speed, location, and movement are indistinguishable for home and road pitchers.

Again, I am wondering how much of these claims is true.  Unfortunately, when a non-subject matter is making claims like this AND they are writing and marketing a book for money, one has to be skeptical.


#37    MGL      (see all posts) 2011/02/01 (Tue) @ 20:33

Well, 10.5 runs a season would have to raise a red flag, as that is .065 rpg, which is a pythag record of 50.7% for the home team!
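
A rough check of that figure, as a Python sketch. The 4.5 runs per game baseline and the exponent of 2 are assumptions on my part, not numbers from the book or this thread, and the function name is made up.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

season_diff = 10.5                      # authors' home-minus-road runs per season
games = 162
diff_per_game = season_diff / games     # about .065 rpg

baseline_rpg = 4.5                      # ASSUMED scoring level per team per game
home = baseline_rpg + diff_per_game / 2
road = baseline_rpg - diff_per_game / 2

home_win_pct = pythag_win_pct(home, road)
print(round(diff_per_game, 3), round(home_win_pct, 3))
```

Under those assumptions the home team comes out at roughly a .507 winning percentage, far below the HFA actually observed, so the 10.5 figure cannot be the whole run differential.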

I meant “non-subject matter expert” above, of course.

This is also why “books” typically are not considered authorities.  Other than The Book, of course!  At the very least you would like to have the input of a subject matter expert.  And then you need some kind of peer review…


#38    Peter Jensen      (see all posts) 2011/02/01 (Tue) @ 21:11

MGL - There is still an umpire evaluation program.  Sportvision now has the contract, using the Pitch f/x system. I believe they now do the evaluations in all parks.


#39          (see all posts) 2011/02/01 (Tue) @ 21:46

Surely, John, Mike, or some of the other pitch f/x guys have looked at home and away pitches.  Anyone know what they found?

MGL, Tango linked to J-Doug’s post at the top of the thread, and I linked to Dan’s post on the topic in #19.

They found a home-field advantage of about 0.06 runs/game due to ball-strike calls.


#40          (see all posts) 2011/02/01 (Tue) @ 21:51

Btw, in re-reading the comments to Dan’s post (from 2008), I found this from MGL which I really like:

The only way to control for this is to look at only those “edge” pitches for all pitchers and then use what percentage of those were mistakes (and in what direction) to credit or debit the pitcher. So a pitcher with 10 “mistakes” on edge pitches per 150 with 6 not in his favor and 4 in his favor, will have the same overall credit/debit (or run value) as a pitcher with 20 mistakes, 12 against him and 8 for him. If you did the calculations “per total pitches” (150 in this case), pitcher B with more edge pitches and more mistakes will get more debits and his run value will be a lot more negative (or positive, whichever way you are doing it).

This is one of the issues I’ve been grappling with in most of the published research on the strike zone, and MGL explained it quite well for me there.


#41    MGL      (see all posts) 2011/02/01 (Tue) @ 22:02

"MGL, Tango linked to J-Doug’s post at the top of the thread, and I linked to Dan’s post on the topic in #19.

They found a home-field advantage of about 0.06 runs/game due to ball-strike calls.”

Mike, I was referring to the parameters of the pitches, not the results (ball/strikes).  IOW, is there a difference between home and away pitches themselves, in terms of speed, movement, and location.  The authors claim there isn’t.

Peter, # 38, that kind of pokes a hole in their Questec theory, unless they were somehow afraid of their home/road bias being discovered then but not now, for some reason.


#42          (see all posts) 2011/02/01 (Tue) @ 22:06

MGL/41, I’ve looked at home/away speed, and yes, there is a difference, primarily in the 1st inning.  There’s a thread on it here somewhere.

I’m not aware that anyone has looked at home/away movement or location, but I could be forgetting or overlooking something.


#43          (see all posts) 2011/02/01 (Tue) @ 22:08

It was here:
http://www.insidethebook.com/ee/index.php/site/comments/is_batting_last_an_advantage/


#44    Guy      (see all posts) 2011/02/01 (Tue) @ 22:14

"Another interesting thing that the authors claim is that when Questec was in existence from 02-08, that there was NO umpire bias in Questec stadiums.”

From Phil’s account, I believe they simply compare home and away teams in the Questec stadiums.  That’s problematic, of course, because there is no guarantee that the 11 Questec teams were average teams overall.  You would have to take these teams, compare their home performance to road performance, and then do the same for non-Questec teams to see if Questec teams enjoyed a smaller HFA on called pitches.
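
The comparison described above could be sketched like this in Python. Everything here is hypothetical: the function names, the use of a called-strike rate as the measure, and especially the numbers, which are placeholders invented to show the call and are not real data.

```python
def called_pitch_hfa(home_rate, road_rate):
    """HFA on called pitches for one group of teams: home rate minus road rate."""
    return home_rate - road_rate

def questec_effect(questec, non_questec):
    """Difference in called-pitch HFA between the two groups of teams.

    Each argument is a (home_rate, road_rate) pair for the SAME teams,
    so overall team quality cancels out within each group.
    """
    return called_pitch_hfa(*questec) - called_pitch_hfa(*non_questec)

# PLACEHOLDER numbers, invented purely for illustration.
questec_teams = (0.320, 0.318)       # (home, road) called-strike rate
non_questec_teams = (0.322, 0.314)

effect = questec_effect(questec_teams, non_questec_teams)
print(f"{effect:+.3f}")
```

A negative result would mean the Questec teams enjoyed a smaller home edge on called pitches than the non-Questec teams, which is what the authors' theory predicts.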


#45          (see all posts) 2011/02/01 (Tue) @ 22:47

Guy, right.  I don’t think they explain exactly how they do it, but if they didn’t control for team quality then they should be further ashamed of themselves.

Later on in the book, they say that attendance has a significant effect on HFA in baseball (and in other sports).  They look at attendance versus some aspect of HFA.  They claim they find a significant correlation even after adjusting for the quality of the team (obviously teams with higher attendance are better teams, on the average).  So they are obviously aware of a potential bias in selecting teams based on variable X whether that variable is correlated with the variable they are looking at (as in the attendance thing) or not (as in the Questec thing).

I seem to vaguely recall that sabermetric research has been done looking at attendance and HFA.  And that no connection was found…


#46          (see all posts) 2011/02/01 (Tue) @ 22:56

"MGL/41, I’ve looked at home/away speed, and yes, there is a difference, primarily in the 1st inning.  There’s a thread on it here somewhere.”

The authors say this:

In addition to having identical accuracy at home and on the road, pitchers throw with the same velocity...and movement no matter where they play...We tested the first inning versus later innings.  Again there was no difference...Pitchers appear to pitch no differently along any dimensions we can measure at home versus on the road, suggesting that neither the crowd nor the optics of the stadium influences their performance.

If what they say is true (and Mike appears to directly contradict them - he has 100 times more credibility in my view), that would put a serious dent in the “players on the road don’t play as well for whatever reasons” theory.

I don’t know why they implicitly assume that the only thing that would affect visiting pitchers is the crowd or stadium optics. How about they are not used to the mound or they are simply less physically and/or mentally prepared due to being on the road?


#47          (see all posts) 2011/02/01 (Tue) @ 23:05

In the thread that Mike references in #43 above, he did a data dump of fastball speed by inning for the home and away teams. In the first inning his graph shows that the home pitcher throws quite a bit faster (can’t tell from the graph - maybe .2 mph).  After that, the visiting team’s pitchers throw a little harder and a lot harder late in the game. But he is not controlling for the pool of pitchers so we have to assume that after the first inning, the pool of pitchers is different, especially in the later innings.

Certainly the Scorecasting authors should have noticed something different in the first inning and addressed that.  I can’t think of any reasons why the umpires would be so biased in the first inning.

Also, according to Guy’s numbers, there does appear to be a small spike in the 3rd and 5th innings, especially the 5th.  I assume that those numbers take into consideration the lineup.


#48          (see all posts) 2011/02/02 (Wed) @ 08:36

Could part of the HFA be due simply to the mound just not feeling the same to the visiting pitcher?


#49          (see all posts) 2011/02/02 (Wed) @ 09:36

Mike/40:

I had forgotten I did this, but when you mentioned MGL’s comment, I went back and checked.

I apparently re-ran the analysis only looking at close pitches and used a count-weighted run value.

The HFA was slightly smaller at .04 runs per game using this approach (the chart says .08 but that’s per 150 pitches, where there’s only about 75 “close” called pitches per game).

http://www.beyondtheboxscore.com/2008/5/12/506919/a-nibble-here-a-nibble-the


#50    Michael K      (see all posts) 2011/02/02 (Wed) @ 13:57

#31: “Any theory of the cause of the HFA is going to have to explain why the majority of the HFA occurs in the 1st, 3d and 5th innings with the 1st inning contributing almost 50% more advantage than any other single inning.”

That smells like a lineup effect.  The top of the order bats in the 1st, and is more likely than average to bat in the 3rd and probably also the 5th.

Anyone have the league average run scoring numbers by inning?  How closely does the inning-by-inning HFA mirror the overall run-scoring by inning?


#51    Peter Jensen      (see all posts) 2011/02/02 (Wed) @ 14:05

Michael K - Unless there has been some change in baseball that nobody has told me about, the top of the order bats in the 1st inning for both the home team AND the visiting team.  Every team plays as many away games as home games.  How does that translate into a lineup effect increasing the HFA in the first inning?


#52    Michael K      (see all posts) 2011/02/02 (Wed) @ 14:35

#51 Peter- If there are X% more (combined) runs scored in the first inning than the second inning, and if HFA uniformly affects run scoring for the home and visiting teams in opposite directions, then wouldn’t HFA appear X% higher (in terms of net runs) for the first vs. second innings?


#53    Peter Jensen      (see all posts) 2011/02/02 (Wed) @ 14:51

Michael - Not if you were calculating HFA as the percentage of the total runs scored in the inning that were scored by the home team, as I was.  Sorry if I hadn’t made that clear.
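
To illustrate the difference between the two measures, here is a small Python sketch. The 4% home scoring edge and the per-inning run levels are invented for the example, and the function names are hypothetical.

```python
def net_runs(home_runs, road_runs):
    """HFA measured as home runs minus road runs."""
    return home_runs - road_runs

def home_share(home_runs, road_runs):
    """HFA measured as the home team's share of all runs scored in the inning."""
    return home_runs / (home_runs + road_runs)

edge = 1.04  # ASSUMED home/road scoring ratio, identical in every inning

# ASSUMED road-team runs per inning for a high- and a low-scoring inning.
for label, road in [("high-scoring inning", 0.60), ("low-scoring inning", 0.45)]:
    home = road * edge
    print(label, round(net_runs(home, road), 3), round(home_share(home, road), 4))
```

With a constant proportional edge, the net-run gap grows with the scoring level, but the home share is the same in both innings. So a lineup-driven scoring bump would not by itself show up in the percentage measure.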


#54    Michael K      (see all posts) 2011/02/02 (Wed) @ 15:06

Peter, ok, sorry, so that’s not it.  There is probably a small secondary effect.  If you take HFA as a given, the visiting team is more likely than the home team to bat their 4-5-6 hitters in the second inning.  And the home team is more likely than the visiting team to bat their 7-8-9 hitters in the second inning.  Which would seemingly give the visiting team a small advantage in the second inning relative to other innings.  But I imagine this effect would be tiny.


#55    MGL      (see all posts) 2011/02/02 (Wed) @ 18:19

Tango (I almost wrote Tiger, as I was just commenting in the Tiger Woods thread), did you notice my question above about how to compute LI or whether there was a chart?


#56          (see all posts) 2011/02/02 (Wed) @ 18:56

MGL/55, Tango’s chart is here:
http://www.insidethebook.com/li.shtml

And the explanation he gave is here:
http://www.hardballtimes.com/main/article/crucial-situations

