
THE BOOK--Playing The Percentages In Baseball


Forecasting

Tuesday, May 06, 2008

Pitchers WAR Aging Curves: 5 year forecast

By Tangotiger, 12:39 PM

I finally got around to doing an initial look.  This is what I did:

Read More

(3) Comments • 2008/05/07 • Sabermetrics, Forecasting

Friday, May 02, 2008

What can happen when you take a team’s stats too seriously, while ignoring player projections

By , 11:12 PM

You end up with ridiculous conclusions.  BP says that these teams have the highest chances of making the playoffs:

80.3: Arizona
68.9: Oakland
53.9: Chicago (AL)
52.1: Tampa Bay

How can TB be on the list and not Boston?  Boston is the better team with the better record!  What about CLE, an excellent team in a division with teams that have terrible records, other than CWS, who are a terrible team?  Why is OAK on the list, but not ANA, when they have the same record, and ANA is the better team?

Those numbers, as well as the rest of the numbers in BP’s playoff odds, are ridiculous because they use some kind of silly system to compute each team’s expected wp for the rest of the season.  I think (it is impossible to tell when you have 136 different labels, stats, explanations, etc.) that it involves putting a great deal of weight on a team’s current-season “something,” be it w/l record, pythag. record, underlying EQ hitting and pitching, or whatever, and then doing some kind of regression toward .500.  What they should do is what they do at the beginning of the season: use each player’s projection, both performance rates and playing time, which is THE correct way to do it.

I say what they are doing is silly and wrong not because I know and understand exactly what they are doing (I don’t), but only because if you look at the results, you can easily see that they make zero sense.

As you can see from these and their other “playoff odds” numbers, whatever they are doing is spitting out obviously ridiculous results.

Check these out:

The Angels, with an 18-12 record, and obviously a good team, based on everyone’s pre-season projections, including BP’s own, are somehow now a .486 team going forward, if I am reading their charts correctly (which is no easy task)!  How is that possible?  I assume that they must have some kind of bad underlying offense and defensive numbers so far this season.  I don’t really know.

The WS, a bad team, are somehow now a .537 team going forward, better than the Phillies, Mets, Braves, Boston, Yankees, and Tigers.

I don’t think I need to go on.  Does anyone at BP actually look at these numbers to see if they make any sense?  Because they don’t.

I am looking at the “post-season odds report” and then the link to their “adjusted standings report” and I can’t make much sense of it, so I will admit that the “pct3” column may not be their expected wp going forward, although I think it is, but either way, the final expected w/l for each team, as well as their expected odds of making the playoffs, are ridiculous.

(32) Comments • 2008/05/09 • Sabermetrics, Forecasting

What do good and bad starts by pitchers tell us?

By , 04:53 AM

I looked at the first month of the season (April only) ERA for all pitchers who were either exclusive relievers or exclusive starters.  I did this for 04-07.

From these, I put them into two groups:  Those that started out great in ERA and those that started out terribly. 

For starters, I used an ERA of over 6 or under 3 for a poor or great start.  For relievers, I used 5.2 and 1.8.  This created 82 and 89 pitchers in the two starter groups, bad and good starts, and 68 and 91 pitchers in the two reliever groups.  Again, that is for 4 years combined.  There might be duplicate pitchers (probably are) in some groups from year to year.  Not in the same year of course.

For each pitcher in each group, I computed his projected ERA before the season started, using a crude Marcel with no age adjustments (I did do a regression).

I also computed their projected ERA after the one month of great or poor performance, updating the pre-season projection with the new information by weighting the new information twice the amount of the prior (pre-season) information.  Finally, I looked at their ERA for the remainder of the season.
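As a rough sketch of one possible reading of that update step: treat the pre-season projection as a pool of innings of evidence and count each new inning twice as heavily. The function name, the `prior_ip` parameter, and the sample numbers are all hypothetical; the post does not spell out the exact weighting used.

```python
def updated_era(prior_era, prior_ip, april_era, april_ip, new_weight=2.0):
    """Blend a pre-season ERA projection with one month of new results.

    Assumption (not from the post): the prior counts as `prior_ip`
    innings of evidence, and each new inning counts `new_weight` times
    as much as one inning of the prior.
    """
    prior_w = prior_ip
    new_w = new_weight * april_ip
    return (prior_era * prior_w + april_era * new_w) / (prior_w + new_w)


# Made-up starter: a 4.50 projection backed by 400 innings of evidence,
# followed by a 2.50 ERA over 30 April innings.
print(round(updated_era(4.50, 400, 2.50, 30), 2))
```

Even with the new innings counted double, the updated figure stays much closer to the pre-season projection than to the hot month, because the month is such a small sample.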

Here are the results and some commentary:

Read More

(25) Comments • 2008/05/14 • Sabermetrics, Forecasting

Thursday, May 01, 2008

Cliff Lee’s hot start: You wanted crap/yap from a premium writer….

By , 11:13 PM

From Rob Neyer, who is lately (maybe for a long while) just as obsessed (and misguided) as almost everyone else about short-term recent performance:

So is Cliff Lee for real? I think all we can say is that he’s really healthy. He’s going to give up a higher batting average on balls in play, and some reasonable percentage of the fly balls he gives up will fly over the fence. So no, he probably doesn’t wind up winning the Cy Young Award. But I’ll bet he’s better than average. And considering how well C.C. Sabathia’s pitched in his last two starts, suddenly the Indians would seem to have the best rotation in the majors.

So Cliff Lee, 31 years old, is better than average, because he has pitched well to 128 batters after having pitched mediocrely, at best, to 3047 batters over the last 4 years?  I think not, and I will take up Neyer on that bet (he offered this time, although obviously not literally).

Here are Lee’s last 4 years’ NERC, keeping in mind that a league average pitcher, and full-time starter, within his league, is defined as 4.00:

04 4.87
05 3.84
06 4.45
07 4.93

That is a fairly sucky pitcher who, based on his 128 batters faced so far this year, is now an ever-so-slightly less sucky pitcher!  He is NOT better than a league average pitcher, nor is he a league average pitcher.  (Warning: of course, I don’t KNOW what he is for sure, but my estimate, since it is based on science, is a heck of a lot better than Neyer’s, which is based on nothing but a distorted and misinformed view of what 5 outings of good pitching, following 4 years of poor pitching, mean.)

Again, I ask, for any of these “Is he for real?” questions, that someone simply look at all players in history of about the same age and circumstances, who have had X prior stats, followed by Y (presumably really good or really bad) stats for a short period of time (whatever you want) and then see how they all did in ANY future time period you want (the longer the period, the larger the sample, of course).  Oh, you mean researchers have already done that (see Tango’s, my, and probably others’ “banner years” study)?  And the answer is that they performed at around the usual Marcel projection?  So why are these writers trying to answer the silly “Is he for real?” question and coming up with equally silly answers?  It is a combination of ignorance, the need to write something, and the need for it to be something that their audience likes (otherwise they are out of a job).  However, it doesn’t matter much if what they write is true or not.  They don’t get graded on the truth.

How about we all just say in unison, come on now, everyone together, “They will ALL likely (our best estimate) perform somewhere in between their past weighted performance and the ‘breakout’ (or collapse) period you are citing, MUCH closer to whichever is the largest sample!”

Then we can all get on with our lives.
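To put rough numbers on that, here is a minimal sketch of a plain sample-size-weighted average. It is not Marcel: it leaves out recency weights, regression toward the league mean, and aging. The four NERC values and the batters-faced totals (3047 and 128) come from the post; the 2.00 "hot start" NERC is a made-up placeholder, and the equal weighting of the four seasons is an assumption, since per-season batters faced are not given.

```python
def blended_estimate(past_rate, past_bf, recent_rate, recent_bf):
    """Sample-size-weighted average of past and recent performance."""
    total_bf = past_bf + recent_bf
    return (past_rate * past_bf + recent_rate * recent_bf) / total_bf


# Lee's NERC for 2004-07, averaged with equal weight per season
# (an assumption: the post does not give batters faced by season).
lee_past = sum([4.87, 3.84, 4.45, 4.93]) / 4

# 2.00 is a hypothetical stand-in for his NERC over the first 128 batters.
estimate = blended_estimate(lee_past, 3047, 2.00, 128)
print(round(lee_past, 2), round(estimate, 2))
```

With roughly 24 times as many batters in the old sample as in the new one, even a spectacular month barely moves the estimate.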

Anyway, I am not done with Neyer.

So, now Sabathia is part of a great rotation, considering the way he performed in his last TWO starts?

I guess before those two starts, when everyone was calling for Sabathia’s head, and wondering whether he was hiding an injury, the Indians’ staff was NOT great.  But now it is.  Considering Sabathia’s last 2 starts.  Maybe we better wait until his (or Westbrook’s or Carmona’s) next start or two.  Because if they pitch badly, then we are not so sure if the Indians have a great staff, right?  I am just following Neyer’s logic and that of every other sports writer in the world.

News flash:  The Indians staff is roughly the SAME staff it was before the season started, the same staff it is now, and the same staff it will be (assuming no major injuries) in a month from now, no matter how any of their pitchers pitch between now and then!

The sad part is that Neyer knows this stuff (I think), but he still writes the same crap that everyone else does.

(50) Comments • 2008/05/14 • Sabermetrics, Forecasting, Media

Tuesday, April 29, 2008

Brewers, Bush: What DO teams use to evaluate pitchers?

By , 05:55 AM

I’d love to be a fly on the wall during a team meeting about a certain pitcher.  Unfortunately, I’d be quickly squashed under a fly swatter after either laughing hysterically or throwing up all over the wall.

Seriously, how do you think the discussion goes?

Milwaukee is a supposedly sabermetrically-inclined team (I have my doubts about that).  They just sent Bush down to the minors.  Here are his last 4 years NERC’s (normalized component ERA - 4.00 is an average starter), as a starter:

04 3.61
05 4.44
06 3.42
07 4.47

I hate to even break it down by year because it is a goofy odd/even pattern, which is most likely meaningless.  Ignoring the goofy pattern, if you can (a lot of people just HAVE to see “reasons” in random patterns), those are the numbers of a solid #3 starter.  Really solid.  There are probably a hundred worse starters in baseball right now, and 300 worse pitchers.

But this guy gets sent down to the minors.  On a team that is clearly in contention.  Do they have any idea that he has been a decent to good major league starter for over 4 years now?  Did he lose his velocity or break into the clubhouse kitchen and steal some sodas?

Seriously, how can a half billion dollar business not be able to recognize that one of their employees is, and has been for many years, as solid as you can get without being a star?  There is NOTHING in his performance record that indicates that he is anything but.

Even this year, in 76 TBF (that is as far as my current database goes - he might have more), he has not “pitched” badly.  4.87 NERC.  There are tons of good and great pitchers with worse NERC’s than that so far.  His K rate is good - right around career norms.  His HR rate is actually stingy. His singles rate is high, which is even more luck driven than BABIP.  In any case, we are only talking about 20 IP or so - who cares what his numbers are in that many innings?  It should make almost no difference whatsoever.  If it did, then Bonderman, Buchholz, Buehrle, Burnett, Byrd, Garland, Oswalt, and about 50 other starters would all have to be sent down too.  Did he all of a sudden forget how to pitch in the last 20 innings?

Oh, yeah, his ERA is 6.75 and he is 0-3.  Hey, maybe that is what they talk about in those meetings.  A pitcher’s w/l record and his ERA.

Pitiful.  Absolutely pitiful.

(21) Comments • 2008/05/02 • Sabermetrics, Forecasting, MLB_Management

Tuesday, April 22, 2008

Small team sample size: Do I care that the Tigers are 7-13, the WS are 11-7, or that Flo is 12-7?

By , 03:07 AM

What do you think?  The answer, of course, is, “Nope!” I couldn’t care less.  Or with improper grammar, “I could care less” (which would mean sort of the opposite).

At least with players, as they accumulate current ("recent" actually, since there is no such thing as a “current” stat) performance, we use that to update their projections; with teams, the only thing to use to update their w/l projections is the projections for their players.  However, 20 games into the season, updating everyone’s projection on each team is not going to make a lick of difference in terms of the team projections.  IOW, if I use pre-season projections for all of a team’s players, that team’s w/l projection from this point on is going to be almost exactly the same as if I used updated (including this year’s stats) projections for all the players.  The reason, and this is important, is that a collection of individual player stats is NOT the same as ONE player’s stats for the same number of PA.  Not even close.
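The full argument is in the linked post. As a hedged illustration of one way to read that last point, the sketch below uses the common regression shorthand in which an observed deviation is kept in proportion to PA / (PA + k). The constant `k = 200` and the 720 and 80 PA figures are hypothetical and chosen only for illustration.

```python
def shrink_weight(pa, k=200):
    """Share of an observed deviation kept after regressing toward the prior.

    `k` is a hypothetical regression constant, not a value from the post.
    """
    return pa / (pa + k)


one_player = shrink_weight(720)    # a single hitter with 720 PA
each_of_nine = shrink_weight(80)   # nine hitters with 80 PA apiece

print(round(one_player, 2), round(each_of_nine, 2))
```

Under these assumptions, 720 PA from one hitter would move his projection a lot, while the same 720 PA spread over nine hitters moves each projection only a little, so the sum of the updated projections stays close to the pre-season total.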

Read More

(17) Comments • 2008/05/13 • Sabermetrics, Forecasting

Friday, April 18, 2008

Do Teams Overvalue Pitchers with Bad Control but Good Stuff?

By , 08:28 PM

Just as teams (the ones who are not that smart at least) do not value hitters’ individual components properly (e.g. putting too much weight on the negative value of K’s, too much weight on BA, and other garbage stats like HR and RBI), so too do they overvalue pitchers with good stuff who walk a lot of batters.

Yes, it is true that a pitcher can improve his walk rate more easily than he can improve his “stuff” (I think), but still, a pitcher’s walk rate is (obviously) an integral part of his overall talent/value, a concept which seems to be elusive for some teams.  And a pitcher with a high walk rate is very unlikely to be a major league caliber pitcher, unless he has an unusual ability at K’ing batters and/or keeping the ball in the ballpark.

Read More

(15) Comments • 2008/04/22 • Sabermetrics, Forecasting

Monday, April 14, 2008

Fastball aging curves

By Tangotiger, 10:28 AM

Rally and Sal talk about fastball speeds, and what they can tell us.  I wrote on ballhype:

Age and injuries would be the biggest cause of a dropoff.  I would guess that if you looked at pitchers only in their 20s, the fastball in 2006 would be almost identical to his 2007.  Can you confirm? Simply put, you need an age parameter. I’d also love to see it for the other pitch types. And the “split” in pitches thrown (% of pitches that are fastballs, etc). Lots of great stuff here.

And in Rally’s study, selection bias will certainly play a part here (prospects who lose their fastball speeds are much less likely to get any playing time in MLB, thereby restricting our sample pool).

(5) Comments • 2008/04/14 • Sabermetrics, Ball_Tracking, Forecasting, Scouting

Thursday, April 10, 2008

Is there a hangover effect for hitters following a knuckler?

By , 01:00 AM

Editor’s Note: I’ve changed the name of this thread to better capture the exciting research that is appearing, starting at post 14.  MGL’s original blog post appears after the jump, and the URL remains as it was in its original form.

Read More

(35) Comments • 2008/04/18 • Sabermetrics, Forecasting, Platoon

Monday, April 07, 2008

Request for Vegas data

By Tangotiger, 09:33 AM

Someone wrote to me with a request for data, in which I also have some interest:

Hi Tango, I’m an occasional commenter at the inside the book blog.  I’ve been looking for what appears to be an unusual source of baseball data, and I was hoping you may have heard of someone or some group who has been collecting it.  ... Do you know of anyone who has collected a history of Vegas lines on MLB games?

(5) Comments • 2008/04/07 • Sabermetrics, Forecasting

Thursday, April 03, 2008

FIP and ERA

By Tangotiger, 04:22 PM

Nothing big.  I figured the FIP and ERA of all pitchers since 1994 with at least 60 IP and ran a year-to-year correlation:

Read More

(26) Comments • 2008/04/04 • Sabermetrics, Forecasting

Wednesday, April 02, 2008

Community Forecast, 2007 - Pitcher Results

By Tangotiger, 03:56 PM

The previous thread on this topic focused mostly on forecast results of hitters.

I’m starting this thread to deal exclusively with pitchers.  I haven’t done anything, so I’m looking forward to seeing the results as much as you are.

(30) Comments • 2008/04/10 • Sabermetrics, Forecasting

Tuesday, March 25, 2008

The next 5 years

By Tangotiger, 03:20 PM

This is Rob Neyer’s list.  Does it make sense?  Has he balanced current talent level and aging?  While I have big problems with Win Shares, we can make some use of my 5-yr Win Shares aging curves.  The players we are interested in will follow the “20+ WS” pattern, so focus on that column.  A guy who is 22 will generate double the wins of a guy at age 34, and 50% more than a guy at age 30.  How does Neyer do?

Let’s take ARod.  He averaged 34 win shares over the last 3 years.  Therefore, his 5-yr looking forward Win Shares for a guy who is 32, would be around an average of 54% of that per year.  34 * 0.54 * 5 = 92 win shares.  That’s what we should expect from him over 5 years.  Zimmerman (whom I love), at age 23 is expected to be at 84% over 5 years.  In order to match ARod, he has to have a current talent level today of 22 win shares: 22 * .84 * 5 = 92.  Zimmerman averaged 23 win shares, so that gives him a forecast of 97 win shares.  Pretty good, Rob.  Pretty good.

You guys can go through the list and see if it makes sense.  In post 13, I have a smoothing function that will make life easy for you.  In the case of ARod, that would be:
23 + 0.64 * 34 - 0.78 * 32 = 19.8 per year.
Multiply by 5, and you get 99.

Pretty close.
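For anyone who wants to run the rest of the list, the arithmetic above can be sketched as follows. The 54% and 84% shares and the smoothing coefficients are the ones quoted in this post; the function and variable names are just illustrative, and shares for other ages would have to be read off the aging-curve table.

```python
# Per-year share of the current Win Shares level expected over the next
# five years, for the two ages quoted in the post ("20+ WS" column).
FIVE_YEAR_SHARE = {32: 0.54, 23: 0.84}


def five_year_ws(current_ws, age):
    """Five-year Win Shares forecast from the aging-curve share."""
    return current_ws * FIVE_YEAR_SHARE[age] * 5


def smoothed_ws_per_year(current_ws, age):
    """Smoothing function from post 13: 23 + 0.64 * WS - 0.78 * age."""
    return 23 + 0.64 * current_ws - 0.78 * age


print(round(five_year_ws(34, 32)))               # ARod, table method
print(round(five_year_ws(23, 23)))               # Zimmerman, table method
print(round(5 * smoothed_ws_per_year(34, 32)))   # ARod, smoothed
```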

(17) Comments • 2008/03/30 • Sabermetrics, Finances, Forecasting

Tuesday, March 18, 2008

Yappin’ Jim Cramer’s Forecast: Bear Stearns is Fine!

By Tangotiger, 09:46 AM

Note: I’m using a non-sports story to make a sports point.  I direct you to Keith Law for your economics fix on the Bear Stearns story.

Thanks to YouTube, we get to hear Jim Cramer with all his convictions.  Do you understand why I can’t stand individual forecasters?  They know as much as the market, if not less.  No one knows more, unless they have insider information. I would love for all the individual baseball forecasters and stock forecasters to shut up, unless they come out with Smooth Jimmy Apollo’s accompanying statement (courtesy of Dead on):

Well, folks, when you’re right 52% of the time, you’re wrong 48% of the time.

Homer: Why didn’t you say that before!

(19) Comments • 2008/03/19 • Sabermetrics, Forecasting

Tuesday, March 11, 2008

Forecasting Standings

By Tangotiger, 11:02 AM

Please put in your own links.  Here’s what I’ve found:

Read More

(42) Comments • 2008/03/24 • Sabermetrics, Forecasting

Thursday, March 06, 2008

The cost of high pitches at a young age

By Tangotiger, 11:13 AM

Gassko throws in this nugget:

But what stands out most is that, after controlling for all these other variables, every pitch thrown before age 26 knocks off half an inning in the latter half of a pitcher’s career. That is a pretty huge effect. Consider, 100 extra pitches a year—just three pitches a start—means 55 fewer innings pitched down the road.

An inning is roughly 16 pitches.  What David is suggesting is that for every extra pitch thrown prior to age 26, a pitcher has 8 fewer pitches of mileage after age 26.  This is a rather startling revelation, and is begging for more study.  As I’ve shown, regardless of how many pitches are thrown at ages 25-28, you should expect the same number of pitches at age 29-32:

Of the 96 warrior pitchers born between 1935-1958, they faced an average of 3777 batters at ages 25-28, and followed that up with 2692 batters at ages 29-32 (a 71.3% retention rate).  The pitchers born from 1959-1974 faced an average of 3470 batters at ages 25-28 (that is, babied by 307 batters over those 4 years), and followed that up in their ages 29-32 years with 2648 batters (76.3% retention rate).  Both groups of pitchers, the “overused” legendary pitchers and the “babied technology” pitchers both ended up facing virtually the exact same number of batters at ages 29-32! 
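The arithmetic behind both claims is easy to check. The sketch below uses only figures quoted in this post (16 pitches per inning, half an inning lost per early pitch, and the batters-faced averages for the two birth cohorts); the variable and function names are illustrative.

```python
PITCHES_PER_INNING = 16  # rough figure used in the post

# Gassko: each pitch thrown before age 26 costs half an inning later on.
pitches_lost_per_early_pitch = 0.5 * PITCHES_PER_INNING


def retention(bf_ages_25_28, bf_ages_29_32):
    """Fraction of the age 25-28 workload repeated at ages 29-32."""
    return bf_ages_29_32 / bf_ages_25_28


older = retention(3777, 2692)   # pitchers born 1935-1958
newer = retention(3470, 2648)   # pitchers born 1959-1974

print(pitches_lost_per_early_pitch)
print(round(100 * older, 1), round(100 * newer, 1))
print(3777 - 3470, 2692 - 2648)   # workload gap at 25-28 vs. at 29-32
```

The 307-batter gap at ages 25-28 shrinks to a 44-batter gap at ages 29-32, which is the sense in which the two groups ended up with virtually the same later workload.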

A breakdown by earlier age classes seems to be in order.

(9) Comments • 2008/03/06 • Sabermetrics, Forecasting, Pitchers

Tuesday, March 04, 2008

Community Forecast, 2007 - Preliminary Results

By Tangotiger, 01:44 PM

Here’s the Google Docs of the Community Forecast for last year, 2007, for hitters. Here’s how to read the chart:
Player: Albert Pujols (playerName), pujolal01 (LahmanID), 405395 (MLB.com/Elias playerID)
Forecast: 29 ballots (n1) averaged forecast OPS of 1.101; 32 ballots (n2) averaged forecast of 154 games; appeared on 94% of Cardinals ballots

And for pitchers.  I added the teamid.  There’s also a “depth” column, which is really just a sort order.  The five new columns (Starter, AceReliever, Setup_or_Swing, Mopup, Callup) each give the percentage of all the ballots cast on which the fan thought that’s how the pitcher would be used.

I did substantial data cleanup, but I may have more to do.  That’s why I’m calling this one preliminary.  But, a first pass look seems to be reasonable, and I doubt any further cleanup will change much.  If you spot anything irregular (like all of a team’s players are too low), let me know.

I’d love it if someone out there did a study here.

(103) Comments • 2008/04/19 • Sabermetrics, Forecasting

Sunday, March 02, 2008

Are Consistent Batters Easier to Project than Inconsistent Ones?

By , 09:03 PM

In another thread, where we were discussing Nate Silver’s (BP) interview on SOSH, I mentioned that I thought that the notion that players (like Pierre) who had consistent historical stats were “easier to project” was hooey.  I may have been dead wrong!  Check out this (admittedly incomplete) study I did the other day.  I am at a loss to figure out what is going on and why the results came out the way they did.

(19) Comments • 2008/03/08 • Sabermetrics, Forecasting, Streaks

Wednesday, February 27, 2008

PECOTA chat at SOSH

By Tangotiger, 05:31 PM

Already underway.

(52) Comments • 2008/03/03 • Sabermetrics, Forecasting

Sunday, February 24, 2008

Projecting Pitching Longevity Revisited

By , 06:20 AM

Another excellent, albeit incomplete again, look, by David Gassko, at whether and how we can project one pitcher to have a longer career than another, other than by our estimate of their overall pitching talent, as measured by something like regressed FIP (or simply a good context-neutral pitching projection).

Last time he debunked (not completely, mind you) the “conventional” (I put that in quotes because it also was/is the CW in sabermetric circles, or at least something that analysts occasionally mention in passing and it goes unchallenged) wisdom that high K pitchers have longer careers than similarly talented (overall) low-K pitchers, originally promulgated by Bill James, using some (severely) flawed research.

This time he looks at low and high BB pitchers and finds that the low BB ones have substantially longer (around 25%) careers.  This seems to fly in the face of his work last week, although apples and oranges (K rate and BB rate), or at least McIntosh and Granny Smith apples only, are being compared.

Hopefully, David will come back with some more work on the subject and not leave us hanging.

Speaking of “myths,” please tell everyone you know that ballparks are NOT smaller than they used to be, at least since 1990.  I have been trumpeting this for a while now.  Jay Jaffe of BP gives us the net change in park dimensions (not fence heights though) from 1990 to 2007.  Guess what?  Parks are bigger now than they used to be!  The next time you hear a commentator tell us that run scoring and home run rates are up since the 80’s and early 90’s because of “smaller parks,” please call your Congressman or at least the radio or TV station from whence the broadcaster comes!

(8) Comments • 2008/03/06 • Sabermetrics, Forecasting