
THE BOOK--Playing The Percentages In Baseball


Friday, November 16, 2007

Hardball Times Annual 2008, starring…

By Tangotiger, 10:54 AM

A cool collection of writers and bloggers (including MGL and me) in the latest annual.  Bill James is the big name, but I’m looking forward to reading everyone there. Is that cool, or what, that I’ve got an article right after my biggest inspiration?

I think it behooves all to support THT, as studes et al provide excellent analysis every week.  studes must kill himself this time of year every year.  I know I couldn’t stand doing the editing when we were finishing up our book.  Show studes (our version of Jerry Lewis) that you care, and buy this book.  We need studes to keep going like this every year.  We get a few thousand people visiting our blog every day: if ever you felt like you wanted to, then support THT instead.  THT is sabermetrics’ best friend.

If you can avoid Amazon, do so and order directly from the publisher.  It’s win-win for THT and the publisher.

Or, if you insist on Amazon, then go through this Amazon link (we get a small referral from Amazon), and also make a direct donation to Hardball Times (bottom right corner).  It’s win-win for The Book, THT and you.  You can even get “super saver shipping” from Amazon if you buy a second book.  The Bill James Handbook 2008 is out there as well.  Or The Book, if you’ve somehow managed to avoid getting it.

There are also lots of great minds out there I’ve come across this year.  You guys should seriously think about contributing a piece to THT (be it annual or online).  Mike, Joe, Joe, Justin, Matt, and others: support the cause.


#1    studes      (see all posts) 2007/11/16 (Fri) @ 12:13

Tango, thanks so much for the plug and the kind words.  Just so everyone knows, all of our money goes to our writers and editors (after web hosting and stats costs), and they’re not exactly quitting their day jobs.  Aaron is still a part owner of the site, but he doesn’t take any money from it, and I consider my ownership cut to be reimbursement for all the work I do on the book.

And it is a ton of work.  Honestly, I feel the same way right now that I did last year at this time: never again.  The book basically consumes me for six to eight weeks, and that’s with a lot of support from other folks.  And most of the work isn’t much fun.  Typesetting 180 pages of stats is a chore.

But, it is a great book.  The content is awesome, thanks to people like you.


#2    MGL      (see all posts) 2007/11/16 (Fri) @ 20:38

I did not know that Tango had an article (two, actually - he one-upped me!) and don’t think he knew that I had one (I sort of backed into it).

Anyway, I ditto Tango’s sentiments and it is without a doubt the best annual out there, with all due respect to BJH and BP, which are both good as well, although I am not nearly as big a fan of the Handbook as some people are (I don’t care about the goofy stats).  I like the articles in THT and BP.  That is all I really read in these pubs.  The only bad thing about having an article in THT is that it is one less thing in the book to anticipate with excitement.


#3    Mike Flatt      (see all posts) 2007/11/16 (Fri) @ 20:40

Just thought I’d let you know I bought the book.  It’s the first THT Annual I’ve bought but from reading everything, it seems like a) it’s going to be a great book, and b) you guys deserve some type of compensation for doing us a service.

Looking forward to it!


#4    David Smyth      (see all posts) 2007/11/16 (Fri) @ 21:15

I pre-ordered the book on Amazon for $13.57--a nice discount from the $19.95 price at ACTA. I understand the idea of supporting THT--I certainly hope they make enough $ to continue. But I’m also a capitalist. This means that I will generally obtain a product at the lowest price I can. If this was not profitable for THT, then they would not offer their product on Amazon. You can’t have it both ways, guys. If you want people to pay the ACTA price, then don’t offer it cheaper on Amazon. The way around that, I suppose, is not to make people feel ‘guilty’ for buying the book on Amazon (I don’t feel guilty at all), and simply request voluntary contributions or donations to the THT site.


#5    studes      (see all posts) 2007/11/16 (Fri) @ 22:08

David, we do ask for donations as well.  Feel free to chip in.  I’m glad you’re a capitalist--I am, too--but I also donate to causes I support.

That’s a bizarre comment about “not having it both ways.” If everyone buys the book through Amazon, we will not be able to continue selling it.  We expect and depend on people continuing to help the site if they can do so financially, thus making it available through Amazon for those who can’t afford full retail.


#6    Vegas Watch      (see all posts) 2007/11/16 (Fri) @ 22:14

I bought it straight from ACTA.

And I’m in college.  It’s six dollars.  Come on, people.


#7    tangotiger      (see all posts) 2007/11/16 (Fri) @ 22:34

I should say that everything I said, I speak for myself, and studes had nothing to do with it.

And, yes, getting it on Amazon, and then making the voluntary contribution is the best way for everyone all-around.  You certainly don’t have to make the extra contribution. 

The way I see it, if BP can have 10K subscribers each paying $40 a year, and THT is offering its content for free, then a few hundred people chipping in a few dollars is certainly reasonable.  Basically, we’re paying for R&D, and making sure it’s free for everyone else.  The contribution is not guilt, but a pleasure.


#8    David Smyth      (see all posts) 2007/11/16 (Fri) @ 22:50

---"that’s a bizarre comment about “not having it both ways”. If everyone buys the book through Amazon, we will not be able to continue selling it.”

To me, your comment is bizarre. Who, exactly, forced you to sell the book thru Amazon? I mean, I assume that almost all buyers are people who visit the site and would choose (or not choose) to buy the book for the $19.95. By also offering it thru Amazon at a lower price, you are dangling a carrot and then asking people not to grab it. IMO, that is not a nice or proper thing to do to people.

And just to show that I’m speaking on principle here, I will make a $20 donation to THT.


#9    David Smyth      (see all posts) 2007/11/16 (Fri) @ 23:01

Oops. I went to the site and clicked on “make a donation”, and all that comes up is buying the book. I don’t need/want a second copy. Where can I make a clean donation? Plus, my PayPal acct is inactivated. Apparently, somebody in Europe tried to use my acct to purchase something.


#10    MGL      (see all posts) 2007/11/16 (Fri) @ 23:53

I assume that they (and we) have our books on Amazon even though we make a lot less money because presumably there is more action and volume on Amazon such that THT makes more money overall by being on Amazon AND selling directly from the site. One could easily set up a reasonable model that shows that to be true.

Or it could be that Amazon sales are extra sales that would not come at all from the publisher (although it is more likely that a sale from Amazon would have a 10% chance of being a sale from the publisher or from THT itself if it wasn’t on Amazon).  In any case, if Amazon sales are extra (would not exist without Amazon) then THT or any writer would sell his book on Amazon for whatever they offered (unless they decided not to on principle and with the hope of eventually changing Amazon’s deal), be it $5 per book or 5 cents.

Given that that is the case, more or less (some model like the ones above), it is perfectly OK if one has the opportunity to steer someone to a site that gives them more profit as long as they are honest about it.  For example, if Amazon’s price is cheaper and I want to steer someone to my publisher rather than Amazon, I better tell them that, “BTW, it is cheaper on Amazon, but we make a lot more money if you buy it from the publisher.  Or you can buy on Amazon and then donate to us.  Your choice.”
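One way to make the "reasonable model" in this comment concrete is the rough Python sketch below. Everything in it is an assumption for illustration: the function names, the sales counts, and the per-book margins are hypothetical and do not come from THT or ACTA. The only input taken from the comment is the idea that some fraction of Amazon sales (10% in the example) would otherwise have been direct sales.

```python
def profit_with_and_without_amazon(direct_sales, amazon_sales,
                                   direct_margin, amazon_margin,
                                   cannibal_rate):
    """Return (profit with an Amazon listing, profit without one).

    cannibal_rate is the assumed share of Amazon buyers who would have
    bought direct (at the higher margin) if the book were not on Amazon.
    """
    with_amazon = direct_sales * direct_margin + amazon_sales * amazon_margin
    recovered_direct = cannibal_rate * amazon_sales
    without_amazon = (direct_sales + recovered_direct) * direct_margin
    return with_amazon, without_amazon


def break_even_cannibal_rate(direct_margin, amazon_margin):
    """Listing on Amazon pays off while cannibal_rate stays below this ratio."""
    return amazon_margin / direct_margin


# Hypothetical inputs: 1,000 direct sales, 3,000 Amazon sales,
# $9 profit per direct copy, $1 per Amazon copy, 10% cannibalization.
with_amz, without_amz = profit_with_and_without_amazon(
    direct_sales=1000, amazon_sales=3000,
    direct_margin=9.0, amazon_margin=1.0, cannibal_rate=0.10)
print(with_amz, without_amz)
print(break_even_cannibal_rate(9.0, 1.0))
```

Under these made-up inputs the Amazon listing comes out slightly ahead, and the break-even ratio shows how sensitive that conclusion is: the larger the gap between the direct and Amazon margins, the smaller the share of lost direct sales that can be tolerated.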


#11    tangotiger      (see all posts) 2007/11/16 (Fri) @ 23:57

There’s a “Make A Donation” button on the front page of THT.

You don’t have to be a PayPal member.  You can click the link to process a credit card, and you don’t have to register as a PayPal member.  They will simply act as a “merchant” to process the payment.

***

The Amazon thing is for wider exposure.  I know when we sold The Book, we decided to forgo Amazon, because we would end up with 50 cents for each book.

It’s really a balancing act to figure out how to do it. We decided to sell on our site for a year, and take our profits.  Then, after a year, we sold the rights to a publisher who put it on Amazon. 

It works out for us, because The Book is not a time-sensitive publication.


#12    tangotiger      (see all posts) 2007/11/17 (Sat) @ 00:02

And the reason I highlight this is because I used to always look for the best deal, even getting books used.  Now, I don’t do that, because I realize the author really gets nothing like that.  It’s really a conscious decision as to what it is you want to do with your money, be it save it and take the savings on other things (like my kid), or pay the extra, and consider the extra as R&D for the author. 

I like Tom Kyte from Oracle, and will buy his books from the publisher, even though that’s a lot more expensive.  He puts out great books, and I need him around.  For other books, I won’t be so generous, and will simply go via Amazon, or even used, and just spend the rest on my kid.

I think most people don’t even consider the effect on the author or publisher, and my post was to make people think about it.


#13    studes      (see all posts) 2007/11/17 (Sat) @ 07:54

David, we don’t decide where our book gets sold.  Our publisher does.  The way book economics work, they basically don’t care where the book is sold, because virtually all of the incremental revenue difference is cut out of our share of the proceeds.  ACTA is a great business partner, but they’re capitalists too.

Another important reason to sell on Amazon is that they make the book available to audiences we can’t economically sell to, such as Europe and Asia.

But it basically means that we make 8 to 10 times more per book when people buy it from our site instead of Amazon.

If we wanted to, we could forgo working with a publisher altogether and just sell on our site.  We have decided not to do that, primarily because we think the wider exposure is better for THT in the long run.  But I don’t see the problem with making this clear to people and asking them to buy from our site.

And thanks in advance for the donation.  The button is on the lower right hand part of the home page.


#14    Rally      (see all posts) 2007/11/17 (Sat) @ 13:36

I wouldn’t compare the Bill James Handbook to BP or THT.  It’s a good book and I’ve bought it every year, but it’s mostly just a reference book, and while I once looked forward to getting that every year, it’s kind of redundant since Sean Forman provides almost everything the Handbook has, updated daily, plus a ton of things that are not in the Handbook.

I have to say the Hardball Times is the best annual out there, but I’m a little biased.  I don’t have an article in this one but will write the Angels section for the preview book.


#15    Chris J.      (see all posts) 2007/11/20 (Tue) @ 01:54

Well studes, I hope you’re feeling more willing to edit next year’s book than you feel now.  I don’t know what would happen to it otherwise.


#16    Chris J.      (see all posts) 2007/11/20 (Tue) @ 01:55

That should say, “more willing to edit next year’s book than you feel in post #1.” That’s what I get for posting just before bedtime.


#17    Jim P      (see all posts) 2007/11/20 (Tue) @ 18:07

I support what they’re doing here.  The problem I had with buying through ACTA was the $4 shipping that goes to no one.  I too want the lowest price and also to support the authors, and Amazon plus donation is the most effective way to do it.

I wrote a book, too, but I don’t own the rights to it, so I just get a straight percentage of the wholesale price (and then I split it 50/50 with the government).  I recommend to people who ask for it (and want to help me) to just buy it on Amazon.  (Previously, I had ordered some at an author’s discount from the publisher and was selling them at a markup, but the sales were too infrequent and most would have been when I was on the road and I didn’t want to have to lug them around.)


#18    Tangotiger      (see all posts) 2007/11/20 (Tue) @ 18:18

Jim, you might want to mention the name of the book!


#19    Rob      (see all posts) 2007/11/20 (Tue) @ 19:42

I just went the Amazon plus donation route.


#20    David Smyth      (see all posts) 2007/11/20 (Tue) @ 22:26

I’ve been trying to make my promised donation for 3 days now. Every time I click on “make a donation”, the page has all these squares in place of the letters, and doesn’t work. I’m ready to say “F*ck it”, unless somebody can tell me what to do.


#21    tangotiger      (see all posts) 2007/11/20 (Tue) @ 22:34

Here’s what happens when I do it:

1. http://www.hardballtimes.com/main
2. Page down, and click Make a Donation.  Brings up PayPal.
3. Enter a price in the “Unit Price” box. 
4. Click Update Totals. Brings up the login screen.
5. Click Continue where it says:
“Don’t have a PayPal account?
Use your credit card or bank account (where available). Continue”
6. Brings up a “Pay with Credit Card” screen, where you enter your info.
7. Click Review Order

How far did you get?


#22    David Smyth      (see all posts) 2007/11/22 (Thu) @ 22:58

Like I said, I’m getting a screwed up page when I click on “make a donation”. All of the print letters are replaced by squares. There is no guidance. And trying to wing it is not working. I don’t know if the problem is with BTF or my computer, but it’s been the same for about 4 days…


#23    tangotiger      (see all posts) 2007/11/22 (Thu) @ 23:04

By BTF, I guess you meant HardballTimes.

I guess you can chalk it up to a bug somewhere.


#24    studes      (see all posts) 2007/11/22 (Thu) @ 23:22

David, the page works fine for me, too, and we’ve received donations lately.  I’m guessing you have a browser issue.


#25    Jim P      (see all posts) 2007/11/24 (Sat) @ 13:54

My book, Ultimate Techniques & Tactics, is for the intermediate to advanced ultimate frisbee player, a sort of textbook for individual and team skills.  The publisher sells it for $20, my author discount is $10, the average wholesale price (from which the royalties are derived) is about $9, and one can order 10 or 20 for resale from the publisher for $10.  I think I’ve made about minimum wage for the hours I’ve devoted specifically to preparing, writing, editing, and reviewing the book.  I don’t own the rights to it because I wasn’t involved in the publishing or the rest of the book-preparation process (copyediting, typesetting, distribution, etc.) other than what was necessary for the writing.

I guess my point was that the marginal dollars go to the publisher/owner/distributor/reseller, not the author (unless the author is also one of those).  In my case, I was fine with that.  I didn’t have to take any financial risk, I didn’t have to do any of the crappy work, I didn’t have to try to market it.


#26    tangotiger      (see all posts) 2007/11/24 (Sat) @ 22:54

When we sold our book, we had zero financial risk.  And we did as much marketing as our current publisher, which is nothing.  And now that Amazon offers order fulfillment, I think a publisher is really not offering anything here.

A self-starter can publish at Lulu.com, or can bypass the middleman, and publish via Lightning Source.  With Amazon now in the order fulfillment game, your financial risk is almost nothing.


#27    Tangotiger      (see all posts) 2007/12/07 (Fri) @ 12:39

Hardball Times is shipping now.  Feel free to use this thread to post your thoughts on it.


#28          (see all posts) 2007/12/10 (Mon) @ 19:08

I got my copy today, and it has a lot of great stuff that will take me a while to go through. Great job by everybody involved!


#29    tangotiger      (see all posts) 2007/12/11 (Tue) @ 01:11

I like mgl’s forecast article, studes’ wpa, and Cameron’s Mariners retrospective.  I liked the Deadspin one as well.  Walsh’s article is next on my list. 

It also looks like there’s a lot of freebie files for buyers of the book, which is always nice.


#30    tangotiger      (see all posts) 2007/12/11 (Tue) @ 20:54

The Walsh and Rybarczyk pieces are fantastic: they exhibit what I’ve been talking about, the pinnacle of sabermetrics being the convergence of performance and scouting data.  As I see it, those pieces are the pillars of the book.


#31    David Smyth      (see all posts) 2007/12/12 (Wed) @ 08:17

So far I’ve just read the mgl and Tango pieces. In particular, I think Tango’s approach (also used for C a few years ago, IIRC) is the coolest thing since sliced bread. And Jeter takes another sabermetric wedgie!

Is there a good way to combine those individual results for Jeter into one overall ranking?


#32    studes      (see all posts) 2007/12/12 (Wed) @ 08:37

I agree that Tango’s pieces are fantastic.  Thanks, Tango.

I probably shouldn’t admit this in public, but I do have a favorite this year.  It’s Jonathan Helfgott’s piece on international baseball.  What a great read.


#33    Mike Fast      (see all posts) 2007/12/12 (Wed) @ 10:12

I just got my annual last night, so I haven’t had the chance to read much yet, but I must say that John Walsh’s piece on the cause of the platoon advantage is fantastic!  It’s very good work by John and worth the price of the book on its own.


#34    Tangotiger      (see all posts) 2007/12/12 (Wed) @ 10:23

Thanks guys, glad you liked it.

David, there are ways to combine them, similar to how you would do “strength of schedule” adjustments. (That’s what all these things are, at their core.) I’m just not very qualified to do that myself at the moment.  One way is to do what Pinto’s PMR does, which uses Maximum Likelihood Estimation (MLE).

If Jeter’s numbers had been all over the place, then I would have applied that, or some sort of logit/probit model.  Seeing how consistent his numbers were across each parameter, I didn’t think it worthwhile for the book.

However, if the more stats-savvy out there can offer guidance, I’m willing to learn more, and come up with a single model, in addition to those intermediates I did.

By the way, I also ran it for CF for the Retrosheet years (since 1957).  Dwayne Murphy and Gary Pettis are among the standouts.  Andre Dawson was excellent too.  Mickey Mantle?  Very low.  Near the bottom.


#35    David Smyth      (see all posts) 2007/12/12 (Wed) @ 22:02

How ‘bout some opinions from the heavy hitters here on D Gassko’s study on managers? I’ll keep my reaction short--I’m simply not buying what he is selling…


#36    david smyth      (see all posts) 2007/12/13 (Thu) @ 07:41

My comment was an overreaction as a result of seeing D Baker ranked as the greatest manager of all time for improving batter performance. I do believe that even if David’s concept/execution is generally sound, the results for any individual manager should be taken with a nice grain of salt.

I mean, I saw Baker during his 4 year tenure here in Chicago, and have no sense that he helped any batters. They didn’t slow the rapid decline of S Sosa. D Lee did have his huge career year in 2005, but there’s no reason to think Baker had anything to do with it. They certainly weren’t much help to C Patterson. When they got M Murton, a strong guy with thick legs, they said that he should be able to learn to hit more HR. But they never got him to alter his GB type swing.

But, apparently some veteran hitters did improve during Baker’s 1993-2002 regime in SF. Allow me to suggest a cynical reason for that. Steroids, anyone?


#37    tangotiger      (see all posts) 2007/12/13 (Thu) @ 08:02

DSG’s process is essentially the same as mine with the catchers/pitchers.  The difference is that there’s more “without you” with my set of players year after year.  Catchers miss a substantial portion of games, while managers don’t.  DSG’s process is more reliable for managers that move around a lot.  At the least, guys like Bonds, who had a substantial amount of playing time with Baker, should not be so heavily weighted in the sample.

I like that he handled the aging issue, which would have been a big concern.

It would have been better for him to focus on individual components first, like here:
http://www.tangotiger.net/alou.html
Clearly a manager’s impact, if he were to have one, would most likely manifest itself in the BB and K rates, and not in BABIP.

All in all, a very worthy effort that needs more refinement, rather than an overhaul.


#38    Rally      (see all posts) 2007/12/13 (Thu) @ 10:28

I don’t understand why the Cubs couldn’t just leave Murton alone and be happy with a .360/.450 hitter.

Besides Bonds, Ellis Burks, Jeff Kent, and Rich Aurilia stand out in San Francisco.  Maybe they were getting “tips” from Bonds.

Tango, Have you looked at shortstops beyond the recent years you had in the THT article?  I’m wondering how Ozzie rates.  And Brooks Robinson. 

I tried a Retrosheet fielding system on some of the old seasons and the results had Brooks as the top fielder relative to position for 1957 to 1987, the pre-zone years.  Over his career Brooks was something ridiculous like +300 to +400 runs, with 3 seasons over +40 (I don’t think any other infielder got a +40).  The methodology is completely different from yours: without hit location, I tried to estimate hit location based on GB/FB rates and handedness of batters faced, then combined that with plays made to come up with a zone-type rating.


#39    David Gassko      (see all posts) 2007/12/13 (Thu) @ 10:44

“My comment was an overreaction as a result of seeing D Baker ranked as the greatest manager of all time for improving batter performance.”

***

For what it’s worth, even without Bonds, Baker does very, very well. If I remember correctly, Bonds’ real impact is to make Jim Leyland, his manager in Pittsburgh, go from about average to well below.

And of course, as always when you get results like these, I can’t help but bring up the Bill James 80/20 rule—actually, David, I think I learned that one from you back on Fanhome.


#40    Tangotiger      (see all posts) 2007/12/13 (Thu) @ 10:47

For 1957-2006, excluding 1999, I ran it for LF and CF.  I should go and do SS.  In LF, Rickey does fantastic.  Raines was a bit above average.  Rice was around average too.  The cool thing about this is that it takes care of the Green Monster issue.  I think Yaz and Greenwell did well.  Manny was fairly low IIRC.  But Mick in CF was the surprise for me.  Very low.  Imagine if sabermetrics had been alive and well in Mick’s time.


#41    studes      (see all posts) 2007/12/13 (Thu) @ 11:15

Hey, if you write those up, we’ll publish a special edition of the Annual, just for you!


#42    Harveywall      (see all posts) 2007/12/13 (Thu) @ 12:17

Tom:  I don’t mean to hijack this, but I don’t know how to start my own topic, and I have some questions about the THT Annual mentioned above.
1.  On page 126 of MGL’s article he states that New York (A) is one of the teams that won significantly fewer games than their Pythag.  This doesn’t seem to jibe w/the chart to the right.  Am I missing something?
2.  The next article is “The Best Fielding Teams...” I’m wondering how Beamer’s rankings fit w/MGL’s UZR rankings on a team basis.
3.  In Beamer’s next article on Markov in the second paragraph on pg 131, he says that “with the bases empty and one out, we’d expect a further 0.387 runs to cross home plate.”  In his table above, it seems to say that nobody on, one out yields 0.285 runs.  Again, am I missing something?
4. At the end of the first column on pg 131 Beamer says that the NL prediction is 4.80 runs and the AL is 4.86 runs.  This confuses me (unfortunately that’s pretty easy) vs the 0.5 runs difference between AL and NL pitchers’ ERA.  Can someone help me here?
5. I found your two articles very interesting, but I was amazed at the (apparently) huge amount of work that appears to have gone into the data gathering--was it really as hard as it looks?


#43    Tangotiger      (see all posts) 2007/12/13 (Thu) @ 12:31

Responding to point 5 in post 42:

The data gathering was done by Retrosheet.  The parsing of the data was done by Retrosheet’s BEVENT.  The long (not hard) part was to load it into a database with my own extra fields and additional tables.  The programs themselves are very straightforward, and I’d expect several people here to be able to reproduce anything I did.  It takes a good hour to generate the final data.  It actually looks more impressive than the underlying code.

Other ideas that can be done using similar code are: DP/pivot, RFarm/catcherBlocking, 3Bthrows/1Bscoops.  Literally, anything where you have two players that are somewhat interacting, and they have enough “without you” data, is a candidate for study.
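For readers who want to try this, here is a minimal Python sketch of a "with or without you" comparison. It is not the code behind the Annual articles, which is not shown in this thread. The function name, the tuple layout of `events`, and the choice to weight each partner by the fielder's own chances are all assumptions made for illustration.

```python
from collections import defaultdict


def wowy_plays(events, fielder):
    """Plays made by `fielder` above what other fielders made with the same partners.

    events: iterable of (partner_id, fielder_id, out_made) tuples, e.g. one
    row per ball in play with the pitcher as the partner (assumed layout).
    """
    with_f = defaultdict(lambda: [0, 0])      # partner -> [outs, chances] with fielder
    without_f = defaultdict(lambda: [0, 0])   # partner -> [outs, chances] with anyone else
    for partner, fielder_id, out_made in events:
        bucket = with_f if fielder_id == fielder else without_f
        bucket[partner][0] += int(out_made)
        bucket[partner][1] += 1

    plays = 0.0
    for partner, (outs_w, n_w) in with_f.items():
        outs_wo, n_wo = without_f.get(partner, (0, 0))
        if n_wo == 0:
            continue  # no "without you" sample for this partner, so skip it
        # Expected outs if the fielder matched the partner's rate with other fielders.
        plays += outs_w - n_w * (outs_wo / n_wo)
    return plays


# Toy data: shortstop "X" behind pitchers A, B, and C; Y and Z are other shortstops.
events = (
    [("A", "X", True)] * 8 + [("A", "X", False)] * 2
    + [("A", "Y", True)] * 6 + [("A", "Y", False)] * 4
    + [("B", "X", True)] * 3 + [("B", "X", False)] * 2
    + [("B", "Z", True)] * 7 + [("B", "Z", False)] * 3
    + [("C", "X", True)] * 4   # pitcher C never threw without X, so he is ignored
)
result = wowy_plays(events, "X")
print(result)
```

Swapping the partner column from pitcher to batter or park gives the other comparisons discussed in the thread. A more careful version would probably also down-weight partners with small "without you" samples.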


#44    studes      (see all posts) 2007/12/13 (Thu) @ 13:43

Harveywall, I don’t think this is the place to put general questions for the THT Annual.  You can email me, instead, or post something on Ballhype through our site.  A couple of responses:

- Good catch on the Yankees.  Might have meant Boston.

- The fielding article was written by John Dewan, not John Beamer.  I believe all the data is there for you to compare the systems yourself.

- #3, looks like we made a mistake in the Beamer article.  I hate it when that happens.  My apologies.

- #4, you’re right that the numbers don’t quite match actual runs scored.  I’d suggest you email John directly about that.


#46    John Beamer      (see all posts) 2007/12/13 (Thu) @ 13:47

Here is my response to post 42

Question 2:  I assume you mean Dewan’s ranking ... I provided no fielding rankings

Question 3: No you are not, the 0.387 should read 0.285.

Question 4: If you look at league ERA for 2007 in the NL you have 4.44 and AL 4.52—so superficially close. The difference comes from the fact that the AL has better hitters AND better pitchers. In other words put an AL hurler in the NL hitting environment and, bingo, half a run.

The DH/pitcher effect is also at play.

Hope that helps
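On Question 3, a figure like "0.285 runs with the bases empty and one out" is the kind of number a Markov model of an inning produces. The Python sketch below shows the general idea only. It is not the model from the Annual: the event probabilities and the runner-advancement rules are crude assumptions made up for illustration, so its output should not be expected to match the published table.

```python
from itertools import product

# Hypothetical per-plate-appearance probabilities (assumed; they sum to 1).
EVENTS = {"out": 0.675, "walk": 0.085, "single": 0.155,
          "double": 0.045, "triple": 0.005, "homer": 0.035}
EMPTY = (False, False, False)  # (runner on 1st, on 2nd, on 3rd)


def advance(bases, event):
    """Return (new_bases, runs) under deliberately crude advancement rules."""
    first, second, third = bases
    if event == "walk":
        # Only forced runners move; a run scores only with the bases loaded.
        runs = 1 if (first and second and third) else 0
        return (True, first or second, third or (first and second)), runs
    n = {"single": 1, "double": 2, "triple": 3, "homer": 4}[event]
    # Position 0 is the batter; every runner advances exactly n bases.
    runners = [0] + [b for b, on in zip((1, 2, 3), bases) if on]
    new_pos = [p + n for p in runners]
    runs = sum(1 for p in new_pos if p >= 4)
    return tuple(b in new_pos for b in (1, 2, 3)), runs


def run_expectancy(events=EVENTS, sweeps=500):
    """Expected runs to the end of the inning for each (outs, bases) state."""
    states = [(outs, bases) for outs in range(3)
              for bases in product((False, True), repeat=3)]
    value = {s: 0.0 for s in states}
    for _ in range(sweeps):  # repeated sweeps converge to the fixed point
        updated = {}
        for outs, bases in states:
            total = 0.0
            for event, p in events.items():
                if event == "out":
                    # The third out ends the inning, worth zero further runs.
                    total += p * (value[(outs + 1, bases)] if outs < 2 else 0.0)
                else:
                    new_bases, runs = advance(bases, event)
                    total += p * (runs + value[(outs, new_bases)])
            updated[(outs, bases)] = total
        value = updated
    return value


table = run_expectancy()
print(round(table[(1, EMPTY)], 3))  # bases empty, one out, under the assumed inputs
```

A real model would use observed transition frequencies between the 24 base-out states rather than fixed advancement rules.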


#47    Tangotiger      (see all posts) 2007/12/13 (Thu) @ 14:32

This thread can be used for anything THT Annual related.  But, if there’s particular topics that are of an editorial nature, you might as well contact studes directly.


#48    david smyth      (see all posts) 2007/12/13 (Thu) @ 19:31

----"And, of course, when you get results like these, I can’t help but bring up the Bill James 80/20 rule.”

That ‘rule’ has been mentioned quite a bit lately in saber articles. But, it’s not any sort of inherent truth or rule. It’s just a casual observation, which is not something to hang your hat on, in any specific case. In this case of Baker, it does absolutely nothing to persuade me off of my opinion. If you want to quote Bill James’ rules, there’s another one, which I will try to nail from memory. And that is, “no statistical finding is immune to the laws of common sense”.


#49          (see all posts) 2007/12/14 (Fri) @ 02:24

Guys, I have read the annual cover to cover and it is awesome.

My favorite articles were Tango’s Jeter piece (surely that is the final nail in the coffin), Beamer’s Markov (the spreadsheet is awesome), and Walsh’s platoon advantage.

Great job.  Cannot wait till next year.


#50    Guy      (see all posts) 2007/12/14 (Fri) @ 07:55

Mine arrived last night.  I just had time to tackle DSG’s manager piece (prompted by discussion here).  I think it’s a very interesting and ambitious analysis. 

However, I did have a question on the age adjustment used. DSG, how do you age adjust each season such that an individual player’s seasons always total to zero (or do you?).  Depending on how the age adjustment works, a possible concern would be that players with long careers (star players) could tend to systematically overperform during their pre-peak and peak years, because their career mean is dragged down by years of post-peak performance (and maybe pre-peak performance too, since stars break in at a younger age).  For short-career players, in contrast, pre-peak and peak years won’t look as good because that’s their whole career.

If this is true, then a manager who happened to have a lot of star players under, say, age 34 might have an advantage over other managers. Cox, for example, might benefit from having the prime years of Maddux and Glavine, whose Atlanta years wouldn’t look quite as good vs. career total if they had retired at age 35.  Whether this could impact the ratings in a meaningful way, I don’t know.


#51    Tangotiger      (see all posts) 2007/12/14 (Fri) @ 11:06

Rally, I ran the SS for the Retro years, and I remembered why I didn’t do that for THT Annual.  I purposely set the range of years of 1993 to present, because there was a substantial change in BABIP at that point.  That is, we had a true offensive explosion that started in either 1993 or 1994.  All the stats point to that.  So, it would be unfair to the current SS to compare to the past, without making an era adjustment.

With that provision noted, the top SS (min 5 years) in the Retro era were Mark Belanger and Ozzie Smith among a couple of others.  This was based ONLY on the Parks comparison (as noted in THT08).  I’ll run the other comps tonight.  These guys were monsters.  Concepcion was also fairly high.

If I’m going to be serious about this, I need to definitely incorporate era adjustments, and a way to merge all the parameters I noted in THT08.


#52    tangotiger      (see all posts) 2007/12/14 (Fri) @ 22:29

Among the 84 SS with at least 5 seasons since 1957, Belanger is #1 and Ozzie #2 (+45 plays), when looking at their batters.  Taveras, Blauser, Jeter are bottom 3.

When looking at pitchers, Bud Harrelson is #1, Belanger is #2.  Ozzie is #9 (+33 plays).  Bottom 3: Jeter, Eckstein, Schofield.

When looking at parks: Belanger and Kubek are top 2.  Ozzie is #5 (+41 plays).  Bottom: Julio Franco, Chris Gomez, Jeter.

The impressive thing for Ozzie is that he has 17 full seasons, far more than anyone else.  So, it includes a lot of his “down” years.  Anyway, it looks like he’s +25 to +30 runs per year, for 17 years.  Giving him some +400 runs seems justifiable by this method.


#53    Rally      (see all posts) 2007/12/15 (Sat) @ 00:49

I’m in the process of trying to run this myself.  Eckstein is a surprise to me, as he’s done so well in zone and UZR.

Also, as of this week we can go back to 1956.


#54    MGL      (see all posts) 2007/12/15 (Sat) @ 06:05

I did make a mistake with NYA and meant BOS.  I casually compared my team UZR ratings (in runs of course) with Dewan’s and some were close and others were not.  I was disappointed that they were not closer than they were overall.  I am not sure how Dewan constructed the team ratings.  I just added up each player’s runs saved or earned, which is not the best way to do it.  Doing team UZR is actually quite easy - a lot easier than doing individual UZRs where you have to worry about what to do if a CF’er catches a ball in the RF’ers area or how to apportion the responsibility when a ball is not caught, etc.  For team UZR, you simply use a league-wide, multi-year baseline for all buckets.  If a ball is caught in any bucket, then the team gets X credit and if it is not, they get Y demerit.  If I do it that way, the total SHOULD come out real close to just totaling up all players, but I am not sure.  And I don’t know which way Dewan did it with his data.  But you would think that even with different databases and slightly different methodologies, that our numbers would be closer than they are.

John, if the AL has better hitters AND better pitchers, do they not cancel each other out (assuming about the same level of “better”)?  Shouldn’t the AL runs allowed be around .3 to .4 runs higher because of the DH?  If it is not, it is because of parks or random fluctuation (or the AL pitchers are better than the AL hitters as compared to the NL, which I think is the case anyway).


#55    David Gassko      (see all posts) 2007/12/15 (Sat) @ 06:56

Guy/50: I don’t think that should be an issue given how the aging patterns work (which is explained completely in the article).


#56    Guy      (see all posts) 2007/12/15 (Sat) @ 10:49

Got to Tango’s pieces last night—really fine work.  Question:  in the Jeter piece it sounded like you thought the Park parameter gave the most correct answer.  Is that your view or am I misreading you? And if so, why?

One thought on the park analysis:  I think that full-time, durable players (like Jeter) might gain a subtle advantage here, as in their home parks they are being compared almost exclusively to visiting fielders.  Assuming home fielders have a generic edge across all parks, that will make guys who regularly log 150+ games a bit better.  (On the other hand, this also depends on the quality of the home backup player, who may not be any better than a visiting starter). 

I would also think that the GB/FB tendencies of a staff could throw off the park ratings quite a bit.  If a SS plays behind a GB staff, won’t he look good vs. other fielders?  On balance, controlling for pitcher and hitter seems much more important than park.

* *

DSG:  I could certainly be wrong, but I’ve read the article carefully and think there may be a problem.  I approximated YTY changes for a typical player using your THT graph, and compared an age 25-34 career player to a 21-40 player.  For the 25-34 years, the first player obviously was +0 vs. his own mean.  But the 21-40 player is +30 runs vs. self over those same years.  If your aging curve is based on all players, the short-career players will tend to underperform, while the long-career players overperform. 

Think of it this way:  by including a lot of 3- and 4-year players, you flatten the aging curve.  These guys only play at their peak, so peak basically equals mean.  That’s why you show peak as only +2 runs, which strikes me as far too small for a guy who plays 15+ years.  Anyway, this should be easy to check:  look at all long-career players, and see if their 18-32 age-adjusted performance is above zero.
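
A toy version of this check, assuming a made-up concave aging curve (the quadratic below is hypothetical and is not the curve from the THT article). It compares each player's age 25-34 seasons to his own career mean.

```python
def aging_curve(age):
    # Hypothetical concave curve peaking at 28, in runs per season.
    return -0.5 * (age - 28) ** 2


def runs_vs_own_mean(first_age, last_age, window):
    """Sum of (season value - career mean) over the ages in `window`."""
    career = [aging_curve(a) for a in range(first_age, last_age + 1)]
    mean = sum(career) / len(career)
    return sum(aging_curve(a) - mean for a in window)


prime = range(25, 35)  # ages 25 through 34
short_career = runs_vs_own_mean(25, 34, prime)  # plays only his prime
long_career = runs_vs_own_mean(21, 40, prime)   # same prime, plus both tails
print(short_career, long_career)
```

The short-career player is zero against his own mean by construction, while the long-career player comes out well above his own mean over the same ages. The size of the gap depends entirely on the assumed curve.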


#57    tangotiger      (see all posts) 2007/12/15 (Sat) @ 11:10

Guy, good point on the tiny subtle advantages.  I should probably do a “park factor” for fielders, and see how each position comes out.

As for which one has more impact, it completely depends on the variance.  I think the runners one had very little variance for the “without you” SS (other rate6).  Whichever one has more is the more important one.  Over a long enough time period, you would think (hope) that the variance of batters faced and pitchers standing behind would be a wash across all SS (more or less), and really it’s the park that stands out the most.  In the medium-term, probably pitchers have a bigger influence.  In the short term, you really want to know who the batters are.

In any case, I really should try to combine them all at some point.


#58    tangotiger      (see all posts) 2007/12/15 (Sat) @ 11:11

That should be a home/away factor, not park factor.


#59    tangotiger      (see all posts) 2007/12/15 (Sat) @ 13:25

Top CF, by park: Pettis, +42 plays, Dwayne Murphy, +40… a little lower, Dawson at +28 plays.  Dawson was a fantastic fielder. Andruw at +24.  Worst was Matty Alou at -59, Mick at -47.

By pitchers: Pettis +46, Piersall +41… Dawson +13.  Bottom: Alou -44, Ken Landreaux -42.

Didn’t do it by hitters yet.

Pettis was fantastic, perhaps the greatest fielding CF in the last 50 years.  I think we can give him +200 runs for his career on his fielding, plus another say 35 or so for being a CF.


#60    tangotiger      (see all posts) 2007/12/15 (Sat) @ 13:25

Mick was -14 on his pitchers.


#61    tangotiger      (see all posts) 2007/12/15 (Sat) @ 14:04

Top LF by parks or pitchers, are Rickey and Gary Ward.  Greg Luzinski is at or very close to the bottom in both lists.  The range is the typical +30 to -30 plays.  Barry is very high at +26 and +20 in the two lists.


#62    David Gassko      (see all posts) 2007/12/15 (Sat) @ 14:45

Guy,

I should have known better than to doubt you. Looking at all hitters with at least 8,000 career PA, they were +1.67 runs better than expected per 150 games over their careers. I don’t think that really changes much if anything, but the effect is certainly there.


#63    Rally      (see all posts) 2007/12/15 (Sat) @ 15:33

Good to see Pettis way up there.  The Angels had a ton of great defensive CF from the early 80’s (Lynn) to 2003 when Erstad got hurt and couldn’t play out there anymore.

I wonder if Mick would do better if we had 1951-1955 data.  That’s when he had the most speed, and while he was always hurting, the injuries probably got worse as his career continued.

I ran it for shortstops/pitchers for 1956-1992 - I’ve got my retrosheet data in different databases.

Top 3 per year were Barry Larkin (only the first few years of his career), Harrelson, and Belanger.  Ozzie at #8.  In total runs it’s Belanger +342, Ozzie +296, Harrelson +286.

At the bottom were Roy Smalley, Dick Schofield, and Frank Taveras.


#64    Guy      (see all posts) 2007/12/15 (Sat) @ 17:07

Rally/Tango:
It seems to me that one potentially biasing factor is staff GB/FB tendencies, especially in earlier eras when team composition was more stable.  What if you shrunk your denominator by removing the plays you know your player couldn’t have made (or didn’t need to make): all outs and errors recorded by teammates?  This could be especially helpful when looking at IFs, by removing a lot of the FBs.  So for a SS, opportunities are his own outs and errors plus all hits.  Does that make sense?
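
A small sketch of the proposed denominator next to the plain per-BIP rate. The function name and the counts are hypothetical; the only thing taken from the comment is the definition of opportunities as own outs, own errors and all hits.

```python
def out_rates(own_outs, own_errors, hits, teammate_outs, teammate_errors):
    """Return (outs per BIP, outs per shrunk opportunity) for one fielder."""
    all_bip = own_outs + own_errors + hits + teammate_outs + teammate_errors
    # Shrunk denominator: drop every out and error recorded by a teammate.
    opportunities = own_outs + own_errors + hits
    return own_outs / all_bip, own_outs / opportunities


# Same SS line behind two hypothetical staffs; the second staff gets far
# more of its outs from other fielders (e.g. fly balls to the outfield).
staff_a = out_rates(500, 20, 1100, 2400, 60)
staff_b = out_rates(500, 20, 1100, 2900, 60)
print(staff_a, staff_b)
```

The per-BIP rate falls when teammates record more outs, while the shrunk rate does not move.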


#65    tangotiger      (see all posts) 2007/12/15 (Sat) @ 17:17

Top CF, by hitters: Cameron +43, Pettis, Dwayne Murphy.  (Dawson +28, Mick -34, Willie Mays +16).  Matty Alou last at -56.  Dale Murphy was -45.

Dale, v Park is -41, and v Pitchers is -17.  Dawson has an enormous lead on him on fielding.  They simply don’t compare.


#66    tangotiger      (see all posts) 2007/12/15 (Sat) @ 17:19

But I don’t need to worry about that.  After all, I’m comparing Mickey Mantle to other CF that his pitchers had.  So, if his pitchers were GB heavy, this would affect any CF behind those pitchers.  Those are the guys I’m comparing Mick to: other CF behind his own pitchers.

I don’t know if this applies to Rally.


#67    Guy      (see all posts) 2007/12/15 (Sat) @ 17:26

Tango:  that’s right for the pitcher comparison, which I think is the best. But not for the hitter or park comparisons. Right?


#68    Guy      (see all posts) 2007/12/15 (Sat) @ 17:44

DSG:
You may be right that the age curve problems won’t change your manager evaluations much.  But maybe not.  The way the bias works, it will grow proportionately with the length of career.  But it’s not uniform over a player’s career:  it will be larger in the early years.  Your age 35 estimate will work pretty well for long-career players, because no one else is playing at age 35.  But your age 27 estimate will be dominated by short-termers and be a bad fit for the stars.  So if a manager gets a bunch of 18- and 20-year players in their prime—but not their decline—he could get a pretty good bonus.  Conversely, a lot of 4- and 5-year players will drag your rating down.

I also worry that the problem is magnified with pitchers, so many of whom have fairly short careers.  I think that makes your pitcher aging curve much too flat when applied to someone with a long career.  If Maddux had only pitched from age 26 to 35 (not an unusual career), his career ERA would probably be a run lower.  So if, like Cox (or should we say Schuerholz?), you happen to get some durable All-Star calibre pitchers and dump them before their decline, I could see that having an impact on the rating.

You should take a look at long-career pitchers. It might even be interesting to generate a pitcher aging curve based only on guys who pitched, say, more than 10 seasons, and see how that changes the expected peak years performance for guys like Maddux and Glavine.


#69    tangotiger      (see all posts) 2007/12/15 (Sat) @ 19:13

Guy/67: correct, we’re on the same page.  I prefer to keep the 3 (or 4) parameters separate, and not apply adjustments to each one (but you would be perfectly fine in doing so if you like).  If I’m going to adjust, I’d rather not rely just on some things, like GB/FB tendencies, but rather go full throttle, and include all the 3 parameters (hitters, pitchers, park) into a super-duper With Or Without You (WOWY).

I think the power of the WOWY is in its simplicity.  If I start making adjustments, I’ll lose that.  That’s why, if I make one adjustment, I might as well go all the way, and do some sort of “maximum likelihood estimation” process as Pinto does.

In my view, if Pinto were to present his results similar to mine, as well as his final numbers, we’d all be so much more accepting of it.  The problem with his (and MGL’s and Dewan’s) method is that we don’t see any of the internal workings.  In my case, I’m giving it on a silver platter.  It’s pure data.  You’ll be able to guess at Jeter’s or Mick’s or Pettis’s final number based on the three or four charts I presented.


#70    Tangotiger      (see all posts) 2007/12/15 (Sat) @ 21:25

Here’s something cool that I did.  Let’s take Ozzie Smith.  I looked at how his pitchers did with him at SS and without, as you know.  Ozzie made 33 more outs than the other SS.  But, I ALSO looked at how Ozzie’s 1B, 2B, and 3B did, when Ozzie was on the field with that pitcher, compared to the 1B, 2B and 3B of the non-Ozzie SS with those same pitchers.  (I did not hold the 1B, 2B, 3B constant.)

Anyway, the non-SS infield made the exact same number of outs with and without Ozzie.  In the OF, the Ozzie-presence OF made 7 fewer outs than the non-Ozzie OF.  Remember, always with the same pitchers.

With Jimmy Rollins, he made 15 more outs, the rest of his infield made 82 more outs, and his outfield made 51 fewer outs.  It would seem that Rollins was on the field with a GB heavy pitcher… BUT, remember, I controlled for the pitcher.  It’s the exact same pitchers with and without Rollins.  The only thing I can guess is that at that point in their careers, they were more GB-heavy when Rollins was on the field.  It’s kinda weird.  And Rollins is not the only example here.

I was hoping to see “0” for all the other fielders, but not at all the case.


#71    tangotiger      (see all posts) 2007/12/15 (Sat) @ 21:34

OR of course that Rollins had great infielders and horrible outfielders, with those pitchers, and when those pitchers didn’t have Rollins, they also happened to have much worse infielders and much better outfielders.  Hard to believe that after several years, this could happen.  I can look at each pitcher and see if this is the case.


#72    Rally      (see all posts) 2007/12/16 (Sun) @ 01:22

Tango,

How did you weight the with/ without BIP?

At first I just tried summing all the pitchers without, but this could produce bad results - Let’s say Jeter is behind Wang for 8000 BIP at 15% rate and Wang has 60 BIP with another shortstop.  Johan Santana comes over with 15000 BIP at 10% for other shortstops and then in his first 100 BIP, Jeter fields 10%.  A simple sum would show Jeter as a great fielder since almost all his with is behind Wang, and without is behind Santana.  A simple example, obviously 100 or so other pitchers will affect his ranking, but you can see how this could skew things.

What I did was look at the runs +/- for each pitcher, with and without, then sum the runs. But then if he’s got 6000 BIP behind one pitcher who only has 10 BIP against other SS, his rating will vary by a ton of runs based on what happened in that small 10 BIP sample.

So my solution is this:  For each SS/P combination, take (with% - wo%) * the smaller value of with bip, or without bip.  So if a shortstop is the only fielder who worked with a pitcher in 99% of his appearances, that shortstop’s rating will be close to zero, for that pitcher/ss combination.
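
A sketch of that weighting next to the simple sum, using the Wang/Santana numbers from the example above. The out rate for Wang's 60 BIP with another shortstop is not given in the comment; 15% is assumed here so that Jeter is exactly average behind both pitchers. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PitcherSplit:
    """One pitcher's balls in play, split by whether the SS was behind him."""
    with_bip: int
    with_outs: int      # outs made by the SS being rated
    without_bip: int
    without_outs: int   # outs made by other SS behind the same pitcher


def naive_rate_diff(splits):
    """Pool every pitcher first, then compare rates (the skewed version)."""
    with_rate = sum(s.with_outs for s in splits) / sum(s.with_bip for s in splits)
    wo_rate = sum(s.without_outs for s in splits) / sum(s.without_bip for s in splits)
    return with_rate - wo_rate


def min_weighted_plays(splits):
    """Sum of (with% - wo%) * min(with BIP, without BIP) over pitchers."""
    total = 0.0
    for s in splits:
        if s.with_bip == 0 or s.without_bip == 0:
            continue  # no comparison possible for this pitcher
        diff = s.with_outs / s.with_bip - s.without_outs / s.without_bip
        total += diff * min(s.with_bip, s.without_bip)
    return total


splits = [
    PitcherSplit(8000, 1200, 60, 9),     # Wang: 15% with; 15% without is assumed
    PitcherSplit(100, 10, 15000, 1500),  # Santana: 10% with and without
]
print(round(naive_rate_diff(splits), 3))
print(min_weighted_plays(splits))
```

The pooled comparison credits the shortstop with roughly five extra outs per 100 BIP purely from the pitcher mix, while the per-pitcher version returns zero.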


#73    John Beamer      (see all posts) 2007/12/16 (Sun) @ 09:36

MGL/54

Yes you are right. I have no idea (as in I haven’t studied it) as to relative batting and pitching prowess of the two leagues. If you look at the data the AL and NL ERA were close in 2007 so it is some combination of the different talent levels between the sets of batters and pitchers as well as the DH that contributes.


#74    tangotiger      (see all posts) 2007/12/16 (Sun) @ 09:57

Rally/72: you could use the min of the two values, or always use the BIP of the SS as the weight.  I chose the latter.

So, I do out_rate_other times bip_jeter, for each pitcher.  You would do out_rate_other times min(bip_jeter, bip_no_jeter).
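
The two weights can be put side by side in a short sketch. The dictionary keys and the sample numbers are hypothetical; only the two weighting rules come from the comments above.

```python
def plays_above_others(pitchers, weight="player"):
    """Outs above what the other SS would have made, summed over pitchers.

    weight="player": expected outs = out_rate_other * bip_with
    weight="min":    scale the rate gap by min(bip_with, bip_other)
    """
    total = 0.0
    for p in pitchers:
        rate_with = p["outs_with"] / p["bip_with"]
        rate_other = p["outs_other"] / p["bip_other"]
        if weight == "player":
            w = p["bip_with"]
        else:
            w = min(p["bip_with"], p["bip_other"])
        total += (rate_with - rate_other) * w
    return total


# Hypothetical pitcher lines, for illustration only.
pitchers = [
    {"bip_with": 4000, "outs_with": 520, "bip_other": 1000, "outs_other": 120},
    {"bip_with": 100, "outs_with": 10, "bip_other": 2000, "outs_other": 240},
]
print(round(plays_above_others(pitchers, "player"), 1))
print(round(plays_above_others(pitchers, "min"), 1))
```

Weighting by the shortstop's own BIP keeps the total on the scale of his actual playing time; the min rule shrinks any pairing where one side of the comparison is thin.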


#75    Guy      (see all posts) 2007/12/16 (Sun) @ 10:56

“I was hoping to see “0” for all the other fielders, but not at all the case.”

Shouldn’t all other fielders be more like -0.7 * Fielder?  Doesn’t explain the IF-OF disparity, but I’d think teammates should make fewer outs alongside a great fielder.


#76    tangotiger      (see all posts) 2007/12/16 (Sun) @ 13:04

The top 20 SS averaged +31 plays per 4000 BIP.  Their 1b+2b+3b averaged exactly 0.

The bottom 20 averaged -17 plays.  Their IF teammates averaged -1 play.

The guys in the middle were +5 and +3, respectively.

I think it likely that there is virtually no overlap.


#77    tangotiger      (see all posts) 2007/12/16 (Sun) @ 13:07

OTOH, the top 20 SS had these OF: -13 plays.
The bottom 20: +15 plays.
The middle 44: +3.

This is based on the pitcher report.  So, I think it’s likely that there’s something that’s not being captured here.


#78    Rally      (see all posts) 2007/12/16 (Sun) @ 13:10

How did Willie Mays rate by pitchers?


#79    tangotiger      (see all posts) 2007/12/16 (Sun) @ 13:55

Mays was +5.  His LF/RF were -24.  His IF were +38.  His catcher, pitcher were +30.  I’m tempted to think that there were more GB hit when Mays was on the field than not, for those pitchers.


#80    tangotiger      (see all posts) 2007/12/16 (Sun) @ 15:12

I’m looking at Jeter.  The OF make 4 fewer plays with Jeter on the field.  The rest of the fielders (pos=1,2,3,4,5) made 7 more plays.

Interestingly, his C/P made 26 more plays when Jeter is SS than when he is not.  His 3/4/5 made 19 fewer plays when Jeter is SS than otherwise (from the viewpoint of the pitcher/Jeter combination).  Hard to see why a pitcher would be able to record more outs by himself, depending on whether Jeter is his SS or not.

So, we’re ok on Jeter.

***

With Mickey Mantle though, his IF made 74 more plays, while his OF made 6 fewer plays.  In his case however, Tony Kubek was his SS in those Retrosheet years, and he was +28 plays.  I didn’t do the other Yankee IF, but it’s possible that the Yankee IF were generally great fielders.  In this case, it’s not so much that the pitcher pitched differently with Mick in the field, but that Mick’s other fielders were simply not random at all.


#81          (see all posts) 2007/12/16 (Sun) @ 23:31

Fantastic job on the book.

A question re: Walsh’s platoon article.

One thing I noticed looking at the lists of pitchers with small platoon splits is that in addition to it being guys who throw slow pitches (but not always, Marichal being an obvious exception), it’s a list of guys who throw a lot of pitches. The batter can guess what is coming when Brandon Webb takes the mound, but not when Mussina is pitching.  So, I’m wondering whether the pitcher’s throwing “everything but the kitchen sink” has close to or as much effect as the actual pitch selection.


#82    Tangotiger      (see all posts) 2007/12/21 (Fri) @ 16:40

bump


#83    Mike      (see all posts) 2007/12/25 (Tue) @ 00:18

I apologize if this has been addressed in the above posts, but I haven’t had time to read all the posts yet.

First of all, great read so far.  I am enjoying most of the articles and the presentation is nice.  Just a few things, though.

After reading MGL’s article and John Beamer’s article, I noticed that they talked about similar topics but varied in their statistics.  What explains the discrepancy between the expected wins (in any of his expected win columns) and runs scored/allowed between the teams listed in MGL’s article and the Markov chains presented by John Beamer? For example, in Beamer’s article he says that the Angels should have finished with an 86-76 record and MGL never says that they were projected to win 86 games in any of the columns in his article.  The Royals were also projected using Markov to win 64 games and MGL never goes that low.  Not only that, but Beamer talks about how a team like the Orioles conceded 59 more runs than they should have when MGL shows in his projected and actual pitching category that they conceded fewer than they were projected to...just little things like that I noticed between the two articles.  There are other examples but I don’t feel like going back and looking.  I assume that the major difference is the numbers used to calculate player projections, etc. inputted in Markov vs. MGL’s player projections.  Also, I don’t know if MGL uses Markov, but I thought he did.  With that said, Beamer’s “starter wins” column in the book did show, like MGL did in his article, that the Diamondbacks’ best record in the NL defied logic.  Great stuff.

BTW, I loved the way MGL broke down the pitching/defense/offense/baserunning on pg.122.  That type of information is what I would show the owner if I was trying to get hired as a GM.  It specifically shows areas where teams can get better.

Also, clearly wins above bench in John Beamer’s article is not a good representation of replacement level (esp. when Milton Bradley is listed as a bench player, although it makes sense given how their system determines which players are bench players) because his $/win is around $4.11 mil/win (using an average of the 66 expected wins for NL bench and 59 for AL bench).  But, when MGL always talks about scanning through his database and looking at where he sees replacement level, is he doing this using Markov data on all teams and their players? It does make sense that Markov could help us finally determine replacement level once and for all (although it seems the likes of Tango already have it for the most part). Maybe I’m just wondering if Markov is necessary/could help us…

Last few things:
Tango, what did you think of Beamer’s analysis on the last page of his “a random walk through a markov model” where he claims, through markov analysis, that dice-k would have given the Yankees 49 runs, or 4.45 more wins (assuming 11 runs/win) if they had signed him.  Seems like a lot to me in one season for one player (for that much of an impact), but maybe it is that much…

He also uses Markov to talk about Soriano’s impact on the Cubs’ lineup.  Without Soriano on the team, and with Theriot leading off and a bench player batting eighth, Beamer concludes that it would have cost them about 5 wins, dropping them below .500.  Does this seem like a lot for one player in one season?  The only reason I can see them losing 5 more games is if a replacement level player truly did play LF, but if the Cubs didn’t have Soriano (got rid of him somehow), they would surely play someone in LF who wasn’t replacement level, which wouldn’t cost them 5 wins.

With everything said above, both mgl and John Beamer’s articles were a fantastic read.  I loved the way MGL broke down each team’s performance throughout the season and that type of info would be great for a front office to have at their hands.  Beamer also did an excellent job presenting his Markov article in an informative, easy-to-read style.  I apologize if anything of what I said above doesn’t make sense because I’m thinking out loud here.


#84    Guy      (see all posts) 2007/12/25 (Tue) @ 14:41

Just got to John W’s platoon article.  Very interesting findings re: slider and FB as “source” of platoon splits. Question for John if he’s out there:  You write as though a big platoon split is a disadvantage for pitchers (pitchers are “susceptible” to big splits; the curve is “good” for a pitcher’s split).  Have you looked to see if that’s true?  Do pitchers with large splits tend to be worse pitchers? 

John’s article also raised an interesting question (to me):  Why are there any LH starters?  Only about 10% of the population is LH, and on top of that a LHP has the platoon advantage only about 2/3 as often as a RHP (40% vs. 60%).  So we should expect that something like 5% of starting pitchers, maybe less, should be LH—but it’s more like 25%?  Why is that?

Perhaps the answer lies in the process for selecting hitters in MLB. To survive, a hitter must be able to hit RHPs—if you can’t hit righties well, you can’t hit as far as professional baseball is concerned.  So hitters who can’t hit lefties (whether RHH or LHH) can stay in MLB, while RHPs don’t get to face hitters who can’t hit RHPs.


#85    tangotiger      (see all posts) 2007/12/25 (Tue) @ 15:11

A related thread to Guy/84 is here:
http://www.insidethebook.com/ee/index.php/site/comments/switch_hitters/


#86    Guy      (see all posts) 2007/12/26 (Wed) @ 14:04

If we control for platoon advantage, LHPs are better than RHPs.  Using John W’s data for 2000-2006, LHPs put up an OPS-against that is .013 better than RHPs with the platoon advantage (.724 vs. .737), and .020 better without the platoon edge (.788 vs. .808).  However, MLB digs far deeper into the LH talent pool, so the LHPs should actually be LESS talented pitchers overall (leaving platoon advantage aside). 

The only way to explain that, I think, is that the real difference is in the hitters, not the pitchers:  it’s not that the LHPs are “better pitchers,” but that major league hitters are not as good at hitting lefties.  It would make sense that hitters are selected primarily for their ability to hit RHPs, since 75% of the opposing pitchers are RHPs.  And I would guess that the RHP percentage is even higher in Little League, HS, and college (maybe even the low minors?), so the advancement selection for hitters who can hit righties is probably even stronger in the early years of development (anyone know the LHP/RHP splits for HS, college, minors?).  The ability to hit lefties, in contrast, would not be an important selection criterion. (I can still remember facing a hard throwing lefty in Little League, because it was so exotic.)

If this is true, we should see a smaller platoon split among RHHs.  RHHs have to hit OK with the platoon disadvantage, because they won’t get enough PAs against LHPs to make up a shortfall; in contrast, a LHH who is helpless against LHPs can survive as long as he hits RHPs.  And in fact this is what we observe:  RHHs have a much narrower platoon split overall.  Presumably, there is a population of RHHs out there whose platoon-neutral hitting ability is as good as many MLB players, but who had a large platoon split and thus couldn’t make it in a 75%+ RHP world.


#87    Tangotiger      (see all posts) 2007/12/26 (Wed) @ 14:31

The Book (Table 25), shows that for pitchers in 1999-2002, they earned a 19 or 20 point platoon advantage in wOBA.  That is, it’s what you’d expect: lefty pitchers enjoyed the same advantage as righty pitchers, with the platoon advantage.

If John’s 2000-06 data is showing differently, then this could very well be a blip.  It would be interesting to see the results over time.  (B-R.com has it, for those who want to do the leg work.)


#88    Guy      (see all posts) 2007/12/26 (Wed) @ 14:45

Tango, I’m talking about hitters, not pitchers.  See Table 66:  RHB platoon split is .017 wOBA, while LHB is .027—very similar to John’s results.


#89    Tangotiger      (see all posts) 2007/12/26 (Wed) @ 14:53

Ok, looking at Table 25, you get similar results: 25 point advantage for LHH and 14 for RHH.

In Table 27, I posit a what-if, of removing the bad righties, to the point of moving from a 24% LHP universe, to one with 26% LHP.  And the result is one where you have the same platoon split all-around.

And in Table 28, I continue by having 45% LHP, a major shift, with a result that RHH have no (apparent) platoon advantage.  This is what I said:
“So, what happened? Well, lefty batters face all lefty pitchers, good or bad. But, they only face good righties. That’s why they have no (apparent) platoon advantage.”

Are we contradicting each other?


#90    Mike      (see all posts) 2007/12/26 (Wed) @ 16:21

I just noticed someone else posted about the difference in MGL’s findings and Beamer’s analysis.  Maybe one of them can tell us more about the differences between the numbers in their two articles...looking forward to Beamer’s response and/or anyone else’s regarding Dice-K and Soriano.

Also, the spreadsheet looks great, but when I was fooling around with it and substituting random no-name players in for David Ortiz and Manny Ramirez, the RPG actually went up from 3.93 (why is it even 3.93 to start with?) to 3.94.  I’m not sure why this is…

Just also got done with Greg’s home run tracker article and it is fantastic.  The type of analysis and data derived in two short years from his site is great.  I think his type of analysis can improve upon many aspects of the game...the whole thing about a shift for Jones is something a team could really value.


#91    Guy      (see all posts) 2007/12/26 (Wed) @ 16:49

I think we agree on the numbers, but disagree on the interpretation.  The numbers tell us that LHHs have a larger platoon split than RHHs.  Your theory is that this is the result of LHHs getting to beat up on “weak” RHPs, who wouldn’t have a job if it weren’t for the inherent platoon advantage that comes from being a RHP (RHHs also get to hit against these weak pitchers, but the effect for them is to narrow the platoon split).

The problem with this theory, in my view, is that there are too FEW, not too many, RHPs.  Unless we believe that lefties are inherently more athletic, throw harder, etc., then 90% of the pitchers should be RH.  Add in the platoon edge, and it should probably be 95%.  But it’s only 75%.  So in terms of raw, platoon-neutral talent, the RHPs must be superior not inferior. (Maybe someone can see what the pitch/fx data says about velocity for RHP vs. LHP). 

My theory is in some sense the reverse of yours:  hitters are selected based mainly on their ability to hit righties (because that’s what matters most).  That means a RHH with a big platoon split won’t succeed, but a LHH with a big split can thrive.  If we think in terms of platoon-neutral ability (a straight average of wOBA(RHP) and wOBA(LHP)), then the impact of a 40 point platoon split is -.010 for a RHH but +.010 for a LHH. Hitters are of course selected for their overall performance, not their platoon split, but the consequence of that will be culling out RHHs with big splits (and keeping LHHs with big splits). This also explains your finding that switch hitters hit RHP better: like everyone else, they’ve been selected for the majors mainly based on their ability to hit RHPs.
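
The -.010/+.010 figure can be checked with a few lines, assuming hitters face 75% RHP (the share quoted earlier in the thread) and that the split is centered on the platoon-neutral average. The .340 baseline is an arbitrary illustration.

```python
def overall_woba(neutral, split, bats, rhp_share=0.75):
    """Overall wOBA for a hitter with a given platoon-neutral wOBA and split.

    neutral is the straight average of wOBA(RHP) and wOBA(LHP);
    the same-handed matchup is the disadvantaged one.
    """
    if bats == "R":
        vs_rhp, vs_lhp = neutral - split / 2, neutral + split / 2
    else:
        vs_rhp, vs_lhp = neutral + split / 2, neutral - split / 2
    return rhp_share * vs_rhp + (1 - rhp_share) * vs_lhp


neutral = 0.340  # hypothetical platoon-neutral wOBA
rhh_impact = overall_woba(neutral, 0.040, "R") - neutral
lhh_impact = overall_woba(neutral, 0.040, "L") - neutral
print(round(rhh_impact, 3), round(lhh_impact, 3))
```

In general the impact is the split times (RHP share - 0.5), with the sign set by which side the hitter bats from, so it grows as the pitching pool tilts further toward righties.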

Coming back to pitchers, the selection of hitters based on their ability to hit righties opens the door to more LHPs. Lots of hitters who can’t hit lefties (mostly LHHs, but also some RHHs) survive in MLB because they can hit the more numerous RHPs, and these guys are then easy pickings for LHPs.  In a sense, RHPs and LHPs are on an uneven playing field, because their opponents (hitters) have been selected mainly because of their ability to hit RHPs, with little regard for ability to hit southpaws.  (A comparable situation would be if no pitcher in the minors threw breaking balls.  Hitters would then be selected for advancement purely on the ability to hit FBs.  And then MLB pitchers with good breaking stuff would dominate.)


#92    Tangotiger      (see all posts) 2007/12/26 (Wed) @ 17:27

Only 90% of the pitchers should be RHP, if and only if 75% of the hitters are RHH.  (These numbers being the likely population split for right hand throwing and right hand hitting).

But, with so many LHH finding a job in baseball, precisely because they can beat up on RHP, they need more LHP to balance it out (to keep MLB teams honest).

That we actually have something like 73% RHP and 58% RHH tells me that the market has moved on both sides to make sure one side isn’t getting a bigger advantage than the other.

And that LHP and RHP are both, as a group, roughly .500 pitchers shows that we’re in equilibrium (insofar as the balance between pitchers and hitters).

That is, MLB can sustain a model where you have 50% LHP (i.e., bring in lots of crappy lefty pitchers), if the % of LHH goes up to something like 65% (i.e., bring in lots of crappy lefty hitters).  But, that won’t happen, because there’d be too many good RHH that are left to the sidelines.

And the platoon splits would look much different than today.  RHH would be feasting on LHP, while LHH would be facing tons of tough righties and easy lefties.


#93    Guy      (see all posts) 2007/12/26 (Wed) @ 21:27

“Only 90% of the pitchers should be RHP, if and only if 75% of the hitters are RHH.”

I don’t think this follows.  We would expect 90% of the pitchers to be RHP if 50% of the hitters were RHH.  That would mean neither LHP nor RHP had any platoon edge, so the distribution should then match the underlying population.  Any RHH proportion between 49% and 100% would serve to increase the expected RHP% even higher than 90%.  So the actual LHP% of 25-30% is at least 3x higher than what it “should” be, probably more like 5x as high. 

* *

Let’s forget about “platoon splits” and think of wOBA(RHP) and wOBA(LHP) as two discrete, but somewhat correlated, skills.  Since a hitter’s wOBA(RHP) has three times the impact on his productivity, we expect hitters to come mainly from the far right tail of the wOBA(RHP) curve. Because wOBA(LHP) matters much less, it would follow that the variance in wOBA(LHP) will be larger.  And that’s what we see: if you treat them as two separate hitting skills, the variance for wOBA(LHP) is much larger.  I don’t see how this could not be true.


#94    Guy      (see all posts) 2007/12/26 (Wed) @ 23:15

Sorry, I misstated that last point.  There’s no reason to expect the variance for wOBA(LHP) and wOBA(RHP) to differ.  But we should expect hitters’ mean performance against RHP to be higher, controlling for platoon advantage, since that’s the primary skill being selected for.  And that is the case:  hitters do better against RHPs both with the platoon advantage and without.


#95    tangotiger      (see all posts) 2007/12/27 (Thu) @ 09:32

I’m saying that if the population at large has 90% RHP and 75% RHH, then if the MLB pop was 90% RHP, they should also have 75% RHH.


#96    tangotiger      (see all posts) 2007/12/30 (Sun) @ 13:06

Recently, someone said that they would have liked to see more data on the pitchers for the With Or Without You, similar to the catchers.

The entire spreadsheet is available on the THT site.  Just follow the directions for Beamer’s article to download his file.  It’s in the same folder.

It also includes the complete catcher data, including a couple of catchers that were missed in the book (any of them with a 3-letter name).


#97          (see all posts) 2008/01/01 (Tue) @ 14:22

Let me try and tackle a couple of the issues regarding my and MGL’s articles.

Apologies for the delay but what with the holidays I have been on and off the web.

Let me back up a minute and tell you how you’d calculate a W-L record from my Markov.

1) Get a team’s batting line for the season
2) Enter into Markov to identify runs scored
3) Get a team’s batting line against for the season
4) enter into Markov to get runs allowed
5) Use pythag to get win %
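
Step 5 can be sketched in a few lines. This is a minimal illustration only: it assumes the classic Pythagorean exponent of 2 (the post doesn't say which exponent was used), the run totals are made up, and the function name `pythag_win_pct` is hypothetical. Steps 1-4 (the Markov itself) are not reproduced here.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win% from runs scored and runs allowed.

    exponent=2.0 is the classic value; it is an assumption here,
    since the post does not state which exponent was used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Made-up Markov outputs for one team-season (steps 2 and 4).
markov_rs = 800.0
markov_ra = 700.0

win_pct = pythag_win_pct(markov_rs, markov_ra)
projected_wins = win_pct * 162  # assumes a 162-game schedule
print(f"win% = {win_pct:.3f}, projected wins = {projected_wins:.1f}")
```

A team that scores more than it allows comes out above .500, and equal runs scored and allowed gives exactly .500 regardless of the exponent.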

The markov makes a number of assumptions, particularly about base advancement and the running game. Now in my model you can tweak these but I didn’t. I assumed that the running game for all of these teams is the same (and is based on a 2000-2005 base running environment—from retrosheet).

Also the Markov assumes that events are independent of each other. We know this isn’t true as walks are less likely to be issued with no outs and the bases loaded than with 2 outs, man on 3rd.

The point is before directly comparing data you have to dig into the specific assumptions behind the different models.

The relevant (closest) column of comparison between the Markov and MGL’s analysis is “projected wins based on actual underlying performance”. MGL reports that he has developed context-neutral linear weights and has adjusted for all manner of things, including park.

Although I did adjust for park in the bench analysis, I didn’t when presenting the RS/RA data. I just took the bald stats and compared the actual runs vs Markov runs. The point of the article wasn’t to produce a full analysis of a team’s performance but to illustrate the power and usefulness of the Markov.

Anyway, a quick glance at MGL’s underlying performance and Markov wins (calculated from the table presented in my article that shows Markov RS and RA) shows that we are broadly in line ... close enough given the differences in approach.

I agree with your point on bench analysis and as such I tried to adjust players accordingly to make sure I got as close to “bench level” as possible. The analytically correct way to do this is to use the depth charts from the beginning of the season to define the first team and bench and calculate the bench level on that.

As to your question on Soriano’s performance, you have to make sure that you chain correctly (i.e., identify the replacement). Soriano on his own might not be worth 5 WAR, but if the Cubs haven’t got an adequate replacement then you can have a massive difference in production (e.g., a Braves example: Teixeira is easily 50 runs better than Thorman over the course of the season). I defined the Cubs’ replacement level as the Markov bench level (which is based on actual performance).

Anyway, 5 WAR = 3 WAA, which is highish but not crazy. The Dice-K situation is the same. The Yankees’ “fifth pitcher” (by which I mean every player who took the mound outside the first four starters: Igawa, Karstens, etc.) was horrendous. Check out the stats: we are talking really terrible.

Mike—on the spreadsheet can you send me the exact line-up that you tried and what you changed to get the funny result and I’ll look into it. If there is an issue I suspect it is with either a playing time adjustment (if you alter position) or a misplaced lookup. I elected not to allow the user to adjust playing time because I thought it would add more complexity to the user interface but perhaps that was the wrong call.

You can mail me from the address on the THT page, or probably click on my name on this post.


#98    tangotiger      (see all posts) 2008/01/01 (Tue) @ 18:03

WOWY 1B:

Remembering that a 1B has four skills: on balls in play, on bunts, on holding runners, and on throws, here’s how 1B did on the first one, using WOWY, controlling only for pitcher:

Vic Power: an incredible +65 plays (per season, at 6.3 seasons).  Pujols is up there at +34 plays.  Dave Stapleton (RIP, 1986) +30 plays.  Billy Buck was +18 per season (over his 10.4 seasons).  Murray +17, Mattingly +13.  Minky only +8.  JT Snow -6.  Pete Rose -29.  Willie Stargell -30 (bad in LF too).

The 4 worst fielders: Mike Epstein, -43, Mo Vaughn -36, Frank Thomas -34, Dick Stuart -34.

Remember, I didn’t control for park or hitters.

***

Next up, instead of looking at 1B/pitcher pairs, I’ll look at 1B/SS pairs.  Should be fun.


#99          (see all posts) 2008/01/01 (Tue) @ 23:31

Great stuff, Tango!


#100    Tangotiger      (see all posts) 2008/01/02 (Wed) @ 10:21

Just noticed that Vic Power won 7 GG at 1B, which is just about how long his career lasted.


#101    Guy      (see all posts) 2008/01/02 (Wed) @ 15:43

I have a theory about Pujols—with zero data to back it up—that might also apply to some other high-scoring 1Bmen.  The theory is that Pujols’ strong rating is in part a reflection of Molina’s talent.  Molina is so good at shutting down the running game—not just a high CS%, but also very few SBA—that Pujols may be able to “cheat” off the bag more when holding a runner.  That would allow him to make plays that other 1Bmen can’t.  So it might be interesting to see if there’s any connection between 1B ratings and C’s SB rates.

* *

Also a question:  On BIP with a runner on 1B and <2 outs, do you think that it would be worth the trouble to distinguish between plays where the 1Bman forces a runner at 2B (or turns a 3-6-3) vs. settles for getting the hitter at 1B?  Getting the lead runner clearly has more value, and I can imagine that some 1Bmen are more skilled at making the tough throw to 2B.


#102    SirKodiak      (see all posts) 2008/01/02 (Wed) @ 16:05

I watched probably 130-140 Cardinal games this year, and some things I observed about Pujols’ fielding are that he plays an extremely deep 1B and that he gets to a lot of balls that the 2B could have easily gotten to.  I think a lot of 1B just head to first and allow the 2B to make the play.

As to the Pujols/Molina connection, I don’t know the numbers, but it seemed that the Cardinals attempt pickoffs (or at least throw to first) a lot more than their opponents do.  This seems to me to contradict the ‘cheating off the bag’ theory at least somewhat.

------

On the matter of the defense of first basemen on throws, I wonder if whatever results are found might be skewed by IFers being a little more careful about their throws to first when they think they have a sub-par 1B.


#103    Guy      (see all posts) 2008/01/02 (Wed) @ 16:58

Interesting observation on Pujols.  I noticed that Dan Fox’s SFR method has Pujols at a much less impressive +10.  That would be consistent with your theory, because in Fox’s method Pujols’ opportunities increase as he makes more outs.  If he “stole” 30 plays that a 2Bman would normally make, he would only be a +10 in Fox’s system.  But in Tango’s system he would be +30 (not saying he grabs that many, #s just for illustration). 

I’ve argued before that PBP metrics should only reward/penalize players based on the overall probability that a FB would become an out (including all players with P>0, not just the guy who makes that play).  I think MGL said he does it this way (not sure), but I don’t believe the others do.  Now I’m wondering if the same is true for GBs between 1B and 2B, which may often be “discretionary” in the same way.  If Pujols is making plays on softly hit 90%-out-probability GBs to the hole, he shouldn’t be getting full credit for all of those.  (I don’t think this is a problem on GBs elsewhere, because the longer throw forces IFs to make the play and throw if they think they can.)


#104    studes      (see all posts) 2008/01/02 (Wed) @ 23:12

Tango, did you see Rob Neyer’s recent blog entry about Trammell?  In it, he wonders whether or not Trammell’s fielding stats were enhanced by Tiger Stadium (a concern of Joe Sheehan’s).  Allegedly, the grass was higher there.  Sounds tailor-made for you.


#105    Alex R      (see all posts) 2008/01/06 (Sun) @ 14:05

Question for beamer post/97

Can you list the playing time assumptions for the fifth starter in the Yankees scenario?


#106    tangotiger      (see all posts) 2008/01/06 (Sun) @ 14:49

Well, score one for Rob and Joe!

Trammell’s WOWY, per season (14.2 seasons total), controlling only for…
...hitters: +9 plays
...pitchers: +1 plays
...park: -4 plays

You’d have to look more closely at the possible interdependence of the pitchers and park, notably, to try to untangle that.  But, it sure looks like the park helped Trammell quite a bit.


#107    tangotiger      (see all posts) 2008/01/06 (Sun) @ 15:06

Somebody at BTF wanted to know who was the only batter that had Jeter as his opposing SS.  That answer is Shawn Hill:
http://www.baseball-reference.com/h/hillsh01.shtml

Through 2006, he had 11 PA, of which 9 were K or SH.  That left him with 2 PA, and Jeter was his opposing SS:
http://www.baseball-reference.com/boxes/WAS/WAS200606160.shtml

Since he played in 2007, this Jeter statement is no longer true.


#108    studes      (see all posts) 2008/01/06 (Sun) @ 22:51

Cool.  Thanks, Tango.


#109    tangotiger      (see all posts) 2008/01/10 (Thu) @ 08:35

When I run WOWY on rightfielders (controlling only for parks), Ichiro is a MONSTER, at +38 plays.  Of guys who played longer than he has, the only one ahead of him was Hank Aaron at +53 plays (!!). 

It’s possible that the PBP systems simply are not picking up the nuances of Safeco.

I’ll wait until I run the WOWY on pitchers and hitters before passing judgement.

Clemente: +32.
Vlad +16.
Dewey +11.
Larry Walker +5.
Dave Parker +2.
Tony Gwynn +2.
Dale Murphy -6. (consistent with his really poor CF)

Hawk -15 (totally inconsistent with his CF… he was basically a DH in RF after his knees gave up on him… he therefore should be evaluated half as a great fielding CF, and half as a Molitor-like DH)

Sheff -19.
Juan Gone -23.

MANNY -25.  This guy just can’t catch a break anywhere!

Ken Singleton -27. 

And the worst outfielder of the Retrosheet era: Claudell Washington, -46 plays.

Again, all only controlling for park.

There is a boatload of great hitting poor fielding corner OF these days.  In effect, teams are finding lots of DH and putting them on the field.


#110    tangotiger      (see all posts) 2008/01/10 (Thu) @ 08:56

Controlling for pitchers:

Ichiro: +61 plays (!!!)

Hank is +38.

By the way, Ollie Brown (5.5 seasons) comes out as +100 plays in both views.  Who the heck is this guy?  Gotta be some sort of ballhogging, as the LF/CF are way below average.

This time, Singleton is the worst at -45 plays.  Jeff Burroughs is 2nd to last in both views.


#111    Anthony      (see all posts) 2008/01/10 (Thu) @ 10:25

The +/- plays...are they cumulative, per 150 games, per 6,000 PA, etc.?


#112    Tangotiger      (see all posts) 2008/01/10 (Thu) @ 10:34

They are exactly like in the THT08 Annual: per 4000 BIP (excludes bunts, HR).


#113    Rally      (see all posts) 2008/01/10 (Thu) @ 10:58

Check out today’s THT.  I’ve combined batter, pitcher, and at least for OF, park ratings to evaluate defense.

re: Downtown Ollie Brown - There were a number of missing plays around 1969-70, where retrosheet fills in the data with 1st putout and fielded by =9.  I removed those plays before I crunched the data, and Ollie doesn’t rate so high anymore.


#114    tangotiger      (see all posts) 2008/01/11 (Fri) @ 08:17

Rally, I bumped your TotalZone blog entry.

***

Fantastic catch on the RF issue.  I posted this to the Retro group:

Someone pointed out some invalid data with the 1969-70 data, regarding RF.

If you look at the 1968, 70 and 71 event files under
these parameters:
first putout fielder = RF
first assist fielder = RF
outs on play = 2

You will get the following counts for the three years:
61, 58, 72

However, for the 69/70 years, it’s: 161, 102

And if you set the outs on play to 1, you get:
121 for 1968
1866 for 1969
875 for 1970
1 for 1971
35 for 1972
(For LF or CF, you get at most 1 record per year.)

I looked at the event type field, and (in my database), these records show a “99”.  I’m guessing that these records are being parsed for rightfielders (pos=9) based on the parsing of this “99”.

I didn’t run Ted’s CWEVENT to confirm the same issue.
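
For anyone wanting to repeat that check, here is a rough sketch of the per-year count. It assumes the event records have already been parsed into dictionaries; the field names (`year`, `po1`, `ast1`, `outs_on_play`) and the sample rows are hypothetical stand-ins, not the actual Retrosheet or CWEVENT column names, and the function `count_rf_plays` is made up for illustration.

```python
from collections import Counter

RF = 9  # standard scorekeeping position number for right field


def count_rf_plays(events, outs_on_play):
    """Count, per year, the records matching the filter in the post:
    first putout fielder = RF, first assist fielder = RF,
    and the given number of outs on the play.
    """
    counts = Counter()
    for ev in events:
        if (ev["po1"] == RF
                and ev["ast1"] == RF
                and ev["outs_on_play"] == outs_on_play):
            counts[ev["year"]] += 1
    return dict(sorted(counts.items()))


# Tiny made-up sample, only to show the shape of the input.
sample_events = [
    {"year": 1968, "po1": 9, "ast1": 9, "outs_on_play": 1},
    {"year": 1969, "po1": 9, "ast1": 9, "outs_on_play": 1},
    {"year": 1969, "po1": 9, "ast1": 9, "outs_on_play": 1},
    {"year": 1969, "po1": 9, "ast1": 0, "outs_on_play": 1},
    {"year": 1969, "po1": 9, "ast1": 9, "outs_on_play": 2},
]

print(count_rf_plays(sample_events, outs_on_play=1))
```

A year whose count is far out of line with its neighbors, as 1969 and 1970 are above, is the signal that the records are being mis-parsed rather than reflecting real plays.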


#115    tangotiger      (see all posts) 2008/01/12 (Sat) @ 12:21

Partly inspired by Rally, I decided to make my first major change to WOWY.

As you know, WOWY controls for only one variable, for two reasons: (1) it’s super easy to explain, and (2) trying to control for two variables at once (like pitcher and hitter) would collapse my sample size.  I can look at how often Johnny Damon was the batter with Jeter at SS, and how often Clemens was the pitcher with Jeter at SS, but the combination of both Clemens and Damon with and without Jeter at SS would be tiny.

However, the batter hand is easy enough to control, and would not reduce the sample size much.  In fact, it improves the model, since the knowledge of Clemens being on the mound had an implied split of LHH/RHH.  An implied split that is in fact not that good.
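
To make the idea concrete, here is a minimal sketch of a with-or-without-you comparison bucketed by pitcher and batter hand. Everything specific in it is an assumption for illustration: the bucket layout, the made-up counts, the function name `wowy_plays_per_4000`, and especially the weighting by the smaller of the two BIP samples, which this thread does not spell out. Only the per-4000-BIP scale comes from the discussion above.

```python
def wowy_plays_per_4000(buckets, scale=4000):
    """Weighted with-minus-without difference in plays made per BIP.

    buckets maps (pitcher, batter_hand) to a tuple:
        (bip_with, plays_with, bip_without, plays_without)
    Each bucket is weighted by min(bip_with, bip_without); that
    weighting is an assumption, not the confirmed WOWY method.
    """
    weighted_diff = 0.0
    total_weight = 0.0
    for bip_w, plays_w, bip_wo, plays_wo in buckets.values():
        if bip_w == 0 or bip_wo == 0:
            continue  # no comparison possible without both samples
        weight = min(bip_w, bip_wo)
        weighted_diff += weight * (plays_w / bip_w - plays_wo / bip_wo)
        total_weight += weight
    if total_weight == 0:
        return 0.0
    return scale * weighted_diff / total_weight


# Made-up counts for one fielder.
buckets = {
    ("pitcher_a", "L"): (400, 48, 600, 66),
    ("pitcher_a", "R"): (500, 70, 500, 65),
    ("pitcher_b", "R"): (300, 36, 0, 0),  # never pitched without him
}

print(round(wowy_plays_per_4000(buckets), 1))
```

Splitting each pitcher into LHH and RHH buckets keeps the sample nearly intact, whereas bucketing on individual batters as well would leave most buckets empty on one side or the other.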

So, I made the change, and Ozzie and Mark Belanger, while still the monsters, are no longer out-of-this-world monsters.  They are +27 and +30 plays above average (barely above +20 runs per 4000 BIP).  That is a lot more palatable than what I was previously getting.

The next group of top SS are now below +20 plays per season (i.e., +15 runs).  Again, makes perfect sense.

Jeter is still last, at -31 plays.

I will hopefully write something up for THT.


#116    Tangotiger      (see all posts) 2008/01/25 (Fri) @ 10:38

Btw, 5 minutes after I wrote the above about Ozzie/Belanger, I noticed a bug.  Those two guys are still off the charts with the handedness parameter included.  I’ll get it written up before Spring Training.

***

Another positive review on the THT08 Annual:
http://www.amazinavenue.com/story/2008/1/25/1427/45905

Studes puts out a great book every year, and the more books he sells, the more likely we can keep abusing him in the summer and fall to put out another book.  (I know how exhausting the proofing and typesetting is.  Yechh.) If he were to ever retire, I don’t see how anyone could pick up the reins.

He’s our Bud Selig.  Buy THT08.  Keep Seligmund in office.


#117          (see all posts) 2008/01/26 (Sat) @ 00:58

Agreed

Without Studes the annual wouldn’t happen. I contributed a couple of articles this year, which was work enough, but I reckon Studes must have spent 50x more time than I did working on the book.

And don’t forget the Season Preview book too. Although others (DSG, Chris) work on that as well, Studes also plays a big role.


#118    Guy      (see all posts) 2008/01/26 (Sat) @ 15:32

“He’s our Bud Selig.”

Is that supposed to be a compliment?  :>)


#119    tangotiger      (see all posts) 2008/01/26 (Sat) @ 17:27

Well, insofar as the owners gave him a unanimous mandate, that’s about as big a compliment as you can give.  I didn’t say he was our George Bush!


#120    studes      (see all posts) 2008/01/30 (Wed) @ 15:35

I didn’t say he was our George Bush!

Thank you!


#121    tangotiger      (see all posts) 2008/02/23 (Sat) @ 15:23

Another THT review:

http://sabermetricstudies.com/2008/02/07/the-hardball-times-2008/


#122    david smyth      (see all posts) 2008/02/24 (Sun) @ 08:19

Link not working. So I went to the site home page and located the article.


#123    Tangotiger      (see all posts) 2009/01/08 (Thu) @ 15:42

Available in its entirety:
http://www.wowio.com/users/product.asp?BookId=5523

My articles start at page 140, and then there’s the Walsh and Rybarczyk articles a bit later that I quite enjoyed, among others.

The free online reader is a bit cumbersome, but what a great idea.  And for a 9.95 PDF download, you can’t really ask for better.


#124    Jeff      (see all posts) 2009/01/08 (Thu) @ 16:06

#123 On the link there is a list of books that readers of this book also viewed.  The top book is:

Sex Secrets: A Husband’s Guide to Lovemaking, Frank Cupidon

Impressive


#125    Aztecbill      (see all posts) 2011/05/04 (Wed) @ 14:10

Downtown Ollie Brown had the best outfield arm in the history of baseball and yes there were a number of 9-3 putouts in the early 1970s. I was there. I remember. So don’t be so quick to pull those plays. He also nailed runners at the plate from the warning track on sac fly attempts. And runners from second on hits to the gap near the warning track. Runners who hit singles to right field had to run hard or face the results of his arm.


#126    Aztecbill      (see all posts) 2011/05/04 (Wed) @ 14:12

Fans would come early to watch Brown show off his arm. He would start in the corner of the park in right field and nail throws to 3B on the fly.


#127    Aztecbill      (see all posts) 2011/05/04 (Wed) @ 14:13

On hits to right runners routinely only went one base.

