THE BOOK--Playing The Percentages In Baseball


Wednesday, January 16, 2008

Yappin’ Joe Sheehan?

By Tangotiger, 10:57 AM

It looks like Joe Sheehan has decided to put out his Bloomberg List.  You know, those yappin’ talking heads who will tell you which stocks will rise over the next 6 to 12 months, which of course they won’t put their own money on, allowing them to not retire early, and then give us another list in a month.  And on and on it goes.

Now, how am I supposed to evaluate Joe here?  The Marcels seem like a good fit.  The 9 hitters that Joe mentions (the 8 plus Francoeur) are expected by Marcel to perform at these levels: 4633 PA, 1116 H, 388 XBH, 556 RBI, 612 R, 86 SB, 20 CS, 421 BB, and 765 K.  (If someone wants to work out the Chones, Bill Jameses, MGLes, Voroses, Shandlers, THTes, and PECOTAs, I’d be much obliged.)

Since Joe is calling them breakout candidates, we should see big movements on these numbers, right?  Those guys are expected, by Marcel, to hit .272.  What’s Joe saying?  He isn’t, but let’s make them .300.  That’d be a nice breakout, especially for the breakouts of all breakouts.  Their OBP is .344.  Let’s say they should have an OBP of .375.  Marcel says .436 SLG.  Let’s make our breakout guys .480.  That’s basically a 10% bump across the board.

If Joe is still writing on Oct 1, 2008, this means that either the gang of 9 breakout candidates did exactly like Marcel said they would, or that he didn’t put his money where his mouth is.  Or the breakout of all breakouts would only give a 5% bump.  Or 1%?

Disclosure: I correspond with Joe on occasion.  Though, that may change very fast pretty soon.  I don’t mean to pick on Joe.  It’s just that he’s much too smart to put out a list like this.  So as not to single out Joe, I’ll comb through all the forecasters, and come out with their “gang of 9 breakouts of all breakouts”, and see if any of them can beat Marcel by more than a hair.  Since Joe was nice enough to not pick out really young players, I’ll set the candidate list at anyone with a Marcel reliability of 0.70 or higher.  (Jason Kubel was 0.73.) That’s 327 hitters to choose from.


#1    Warren      (see all posts) 2008/01/16 (Wed) @ 12:01

This touches on something I’ve been thinking about recently: can we use the wisdom of crowds to beat Marcel if we ask the crowd questions in a specific way?  That is, we ask people who haven’t looked at any statistical projection to tell us whether particular players are likely to exceed their projection.  It’s an up or down question designed to get at information not included in (current) projections.

So we ask people to try to ignore the things that statistical models do best (weighting years appropriately, providing general aging curves, general park factors) and only focus on the things that don’t go into (current) predictive models.  Things like:

- injuries (aside from the general issue of reduced plate appearances)
- more specific aging curves (e.g. Posada had a light load as a young catcher, so maybe he’ll age better than the average catcher)
- park adjustments for players with particular traits (e.g. Tango’s discussion about how a park can’t affect Boggs and Rice in the same way)
- changes in tools (improved fastball speed, better break on slider, etc.)
- scouting generally, particularly for very young pitchers (did the guy dominate in the minors because of great stuff, or because of deception that won’t translate to the bigs)

I wonder if something similar to the size of the Fans Scouting Report would yield data that, when added to Marcel, yields better projections, or if the crowd cannot help but have their “gut” affected by the numbers, even if they try to put the numbers aside.


#2    Tangotiger      (see all posts) 2008/01/16 (Wed) @ 12:15

Actually, last year I ran the Community Forecast project to attempt to answer just that.

I am wayyyyy behind on going through the data.  It was my fault for not using dropdowns, and letting people put in free form numbers.  I get shivers every time I have to finally compile all the data.  But, I gotta do it very soon.  I can’t disappoint those who took the time to give me the data.

I’m sure that the Community did fairly well.  But, that’s as a group.  Any single person is so completely unreliable as to be worthless, whether that person is me, mgl, Joe, or Peter Gammons.


#3          (see all posts) 2008/01/16 (Wed) @ 12:39

I’ve been thinking about going through the old Bill James fantasy books, finding all the players he marked with stars, and seeing how well his predictions did.  There are no Marcels to compare them to, but it would still be interesting.

My completely uneducated gut feeling is he’d wind up beating the Marcels, but not by much.


#4    Warren      (see all posts) 2008/01/16 (Wed) @ 12:53

To be clear, what I’m suggesting is slightly different from the Community Forecast.  Rather than have people enter AVG/OBP/SLG directly (or whatever), you give them a list of players and simply ask if they are going to exceed their projection or not.  But most importantly, you don’t list their projection.  You simply ask people that have an idea of how projections work to estimate whether or not the data that projections *don’t* include would tend to work in the player’s favor or not.  But once participants actually see “real” projections, they’re probably biased based on those results. 

Here’s an example (numbers made up): based purely on my own bias in favor of weighting the most recent season too highly, I think A-Rod should be projected to hit 50 home runs in 2008.  Marcel says 40.  So I then say that the projection is too low. 

I don’t really like this method, because my choice is influenced by the projection I saw, and what we’re really measuring is the fact that (unlike good projections) I weight the most recent season too highly.  What we want is to let statistical projections do what they do best (which includes proper weighting), and just let people deal with the data that the projections don’t include.  The Community Forecast does get at this “extra” information but that extra data may get swamped by other biases from the participants (seeing patterns that aren’t there, etc.)


#5          (see all posts) 2008/01/16 (Wed) @ 12:59

One concern I have with not showing the projections is that you might get people considering effects that the projections already account for.

For instance: suppose someone is not all that familiar with regression to the mean as a formal principle, but has internalized it.  Ask that person if A-Rod will exceed his projection.  He’ll think, “A-Rod hit 54 HR, so his projection will be 54.  I think he’ll hit less, so I’ll say less.”

What you get is a result of the consensus IMPROPERLY ESTIMATING the projection, rather than of ADDING ADDITIONAL INFORMATION to the projection.

I’d be happier showing all the projections, then asking which ones are likeliest to be wrong, and in which direction.


#6    Tangotiger      (see all posts) 2008/01/16 (Wed) @ 13:06

It works out to the same thing.  I ask people how many HR ARod will hit, and not provide them with any data.  Those people will look at the data as they see fit, and use other knowledge, and come up with a number.  If that number is higher than Marcel, then that means a combination of:
a. they weight more recent seasons more
b. they saw a change in talent level for him

While we are only interested in b. for improving Marcel, there’s no way for me to know, or a fan to know, that they are basing it on b. and not a.  I can’t say “did you see a change in talent or concentration level”.  I’m going to get 90% yes votes, but they may have answered that only because they know he had an MVP year, so they infer b. from a.

Since I can’t separate a. from b., I let them pick their number.


#7    Cooper      (see all posts) 2008/01/16 (Wed) @ 13:14

I am not good with stats so my comments may be ignorant and simple.

Couldn’t you just pick 9 guys with really low BABIP numbers?  Or do all the projectors already figure that in?

If Joe Ballplayer had a BABIP of .220, I would guess he’ll have a better year next year.

Am I missing something?


#8    Tangotiger      (see all posts) 2008/01/16 (Wed) @ 13:34

This is just like stock prices.  You would hope that all known information is already factored in, so that all that’s left for you to do is:
a. have unknown information
b. be able to combine known information in ways that the market does not

Marcel doesn’t know about minor league stats, so that’s one way to beat him.


#9          (see all posts) 2008/01/16 (Wed) @ 17:09

Marcel doesn’t know about minor league stats, or park/league adjustments, but CHONE and ZiPS do.

Where you’re probably best off trying to leverage information to “beat” a projection system is when you have information that doesn’t show up in the on-field record. By which I don’t mean “intangibles” or character issues. Projection systems are generally “dumb” about injuries and how they affect players going forward. Fans generally don’t have enough info about those to try and beat PECOTA or ZiPS, but teams certainly should.


#10    MGL      (see all posts) 2008/01/16 (Wed) @ 18:41

Getting back to the Sheehan article, I thought pretty much the same thing when I read it (and there are lots of articles by analysts/journalists, like Sheehan, Law, etc., about “breakout candidates”).

The question, though, in terms of testing their hypothesis, is what does it mean to be a “breakout candidate?” Does that mean that they are expected to exceed their projection (a standard projection of course)?  Does Sheehan explicitly say this?  Are they simply players who a standard projection will underestimate?  For example, in Shandler’s projection model (a good one overall I think), he does NOT use a standard Marcel-type algorithm.  He uses, at least for some players, certain “keys” which help him to determine which players will under or over-shoot their standard projections.  Whether they work, and if yes, to what degree, I have no idea.

Or…

Is a breakout candidate one who merely has a higher variance in terms of his next year’s performance?  If that is the case, then we will NOT see a combined projection for these players exceed their standard projection and we would need lots of data (much more than one year) to test whether these types of players indeed have a higher variance in performance the next year.  IOW, is a “breakout candidate” a player who has a standard projection of .800 OPS but has a 10% chance of performing at .900 rather than a 5% chance (numbers made up)?  IOW, under a reasonable definition of a “breakout candidate” it may NOT be that their projection is any different than their standard projection.
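
One way to picture this second definition, as a rough sketch only: assume next-year OPS is normally distributed around the .800 projection. The normal shape is an assumption made here for illustration, the 5% and 10% figures are the made-up numbers from the comment, and `sd_for_tail` is a hypothetical helper name. Under those assumptions, the spread needed to put a given probability above .900 can be backed out with Python’s standard library:

```python
from statistics import NormalDist

def sd_for_tail(mean, threshold, tail_prob):
    """Standard deviation that puts `tail_prob` of a normal curve above `threshold`."""
    z = NormalDist().inv_cdf(1.0 - tail_prob)  # z-score of the cutoff
    return (threshold - mean) / z

# Same .800 projection, two different chances of reaching .900 (made-up numbers).
sd_ordinary = sd_for_tail(0.800, 0.900, 0.05)
sd_breakout = sd_for_tail(0.800, 0.900, 0.10)

# Both curves share the same mean; only the spread differs.
p_ordinary = 1.0 - NormalDist(0.800, sd_ordinary).cdf(0.900)
p_breakout = 1.0 - NormalDist(0.800, sd_breakout).cdf(0.900)
```

The two players carry the identical mean forecast, so a test that only compares actual performance to the projection would never separate them. Only the width of the distribution differs.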

So before we “test” Sheehan’s assertion, we need to specifically ask him what he means by “breakout candidate?” Another example of a person using words without explaining exactly what they mean (which I hate, BTW).  Words are indeed a powerful method of communication, but only when their meaning is crystal clear.  In this case, it is far from that, although I have not read the article in a few days.


#11    tangotiger      (see all posts) 2008/01/16 (Wed) @ 20:43

Joe wrote this in the article:

“I went looking for players who have been in the league for two-to-three years, are young enough to have development left, and have established a certain level of performance.”

That last bit makes it clear that a Marcel would work just fine here.  It’s not a guy who was up and down in his performance, so that we’re not sure how good he is.  The guy’s got an “established level” (read: consistent enough that we should otherwise expect more of the same), but Joe thinks that they are due for a breakout.

Marcel works fine here as the baseline.


#12    Pizza Cutter      (see all posts) 2008/01/16 (Wed) @ 23:07

This is a pretty common problem in my “real” job as well.  Trying to predict child development on an individual level is a fool’s game.  Sure, there are signs of significantly aberrant development that are easy to pick out to the trained (and sometimes untrained) eye.  But, ask the nervous mother whose 1.5-year-old won’t say more than a few words whether he’ll ever talk.  Answer: yeah, probably, but I have no idea when.

The biggest problem in modeling development, whether it be child vocabulary or baseball players, is that outcomes are often the combination of _several_ necessary, but independent skills (hitting for power involves physical strength, bat speed, swing mechanics, and an eye for picking out that thigh-high curve).  If any one of those is off, power will be affected.

I’d say that the next frontier in data-driven forecasting will come from a fuller appreciation of some of these multi-variable growth trajectories, and understanding the variables that go into them and how they interact.


#13    MGL      (see all posts) 2008/01/17 (Thu) @ 03:34

I have been thinking a little bit lately about the player who is not too young or old and has been fairly healthy and consistent, versus players who are very young or old, have not been consistent, and/or have had some serious injuries (usually with inconsistent performance to go along with the injuries).

While it seems like the consistent “middle” aged players are much “easier” to project, maybe that is just an illusion.  Maybe the “difficult” players project just as well. My guess is that a Marcel works just fine for ALL players on the whole.

The question is whether our projection for a so-called, not-too-young or old, consistent player has a smaller performance variance than one who is presumably harder to project. I am not convinced that that is the case.  I would like to see the data.

For starters, one can look at a player database and separate similar players into two categories:  One, players who have been pretty consistent for 3 years in a row.  Two, players who have been very inconsistent for 3 years in a row.  Other than that, the two groups would have to be the same - around the same age (say 26-32 or perhaps no age limits at all), and would have to have had some minimum number of PA in each of those 3 seasons.

Now we look at 2 things further:  One, the 4th year performance and compare that to each group’s Marcel.  They should be about equal.  IOW, both groups’ performance in year 4 should about equal their Marcels.  Two, we look at the variance in performance in year 4 among all the players in the two groups.  If that variance is the same, then basically we have shown that a projection for a consistent player is NOT easier or more reliable than one for an inconsistent player.
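
A first pass at that study might look something like the sketch below. Everything about the data layout is an assumption: the dict keys, the `cutoff` that separates “consistent” from “inconsistent”, and the choice to measure the variance of the forecast error (year-4 OPS minus Marcel) rather than of raw year-4 OPS are all illustrative choices, not anything specified in the comment.

```python
from statistics import mean, pstdev, pvariance

def compare_groups(players, cutoff):
    """Split players by 3-year consistency, then compare year-4 misses vs. Marcel.

    Each player is a dict with hypothetical keys:
      'ops3'   - list of three prior-season OPS values
      'marcel' - Marcel-style forecast for year 4
      'ops4'   - actual year-4 OPS
    `cutoff` is an arbitrary spread (std dev of the three seasons) separating
    'consistent' from 'inconsistent'.
    """
    groups = {"consistent": [], "inconsistent": []}
    for p in players:
        label = "consistent" if pstdev(p["ops3"]) <= cutoff else "inconsistent"
        groups[label].append(p["ops4"] - p["marcel"])  # forecast error
    # Mean error near zero says the forecast is unbiased for the group;
    # the variance of the errors says how tightly results cluster around it.
    return {
        label: {
            "n": len(errs),
            "mean_error": mean(errs),
            "error_variance": pvariance(errs),
        }
        for label, errs in groups.items()
        if errs
    }
```

If the two groups come back with similar `mean_error` and similar `error_variance`, that would support the guess that consistent players are no easier to project. The age and minimum-PA filters would be applied to `players` before calling the function.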

Anyone want to predict or bet on how a study like that would turn out?  I am going to guess that both groups are going to be identical in terms of Marcel and in terms of 4th year performance variance.  If that is the case (and I am not sure that it is), can we stop assuming that a projection for a player who has been consistent is more reliable than for one who has been inconsistent, even wildly inconsistent?  It is one of those things that bugs me, because we throw that out all the time (that a player who was at .700 last year but .900 the year before is somehow “harder” to predict than the player who was .800 for 3 straight years), yet we have no idea whether it is true or not.

You could do the same for injured players, older players, younger ones, certain kinds of “inconsistency” in performance over those first 3 years, etc.

Anyone want to take that on?  Should be an easy first pass.


#14    MGL      (see all posts) 2008/01/17 (Thu) @ 05:04

“I’ll set the candidate list at anyone with a Marcel reliability of 0.70 or higher.  (Jason Kubel was 0.73.) That’s 327 hitters to choose from.”

What is that?  What is a Marcel reliability?


#15    MGL      (see all posts) 2008/01/17 (Thu) @ 05:12

Speaking of hard/easy to project players, what is “upside” as in player A has more upside than player B?  Is there any merit to the concept?  Has anyone done any studies on that or do we also just accept it as true because it “sounds good?” Is it defined enough to even look into?

Let’s say that we have two players who have the same projection.  What does it mean that player A has a higher upside than player B?  Is it that player A has a greater chance of performing at some future time or establishing a projection at some point in the future, BETTER than player B?  Exactly what does that even mean and how do we define/quantify it?  Why would that be?  Can we test such a hypothesis?

Or do we just assume (whether it is true or not) that, for example, a pitcher that sucks (say, a 5.50 ERA in a 4.50 ERA league), but has great stuff, has a higher “upside” than a pitcher who has the same recent performance or projection but who has average stuff?  That is the classic example, right?  No one ever would bat an eyelash if we made that statement about 2 pitchers with those characteristics, right?  Do we know that is true?  I don’t.  I would like to see some data.

IOW, I would like to see someone set up some criteria for players that have high and low “upsides” and then test those criteria on historical databases.  Hopefully without datafitting of course.

Sometimes, with things like “breakout seasons,” clutch, and “upside,” I feel like it is the 70’s all over again and Bill James is asking, “Is that true?”


#16    tangotiger      (see all posts) 2008/01/17 (Thu) @ 08:11

Upside/downside: that’s another project that I’ve got on my todo list - proving or not the PECOTA percentiles.

***

Marcel reliability:
suppose you have the following PA in the last 3 years, most recent first: 600, 500, 400
The total weighted PA (using weights of 5,4,3) is 6200 weighted PA.  My regression toward the mean PA is 1200 (i.e., 2 seasons of 600 PA).

reliability = 6200/(6200+1200)=.84

That is, I applied 16% regression toward the mean for that player.
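
The arithmetic above is easy to put into a small function. This is only a sketch of the calculation as described in this comment; the function and parameter names are made up, and the 5/4/3 weights and 1200 regression PA are the hitter values quoted here.

```python
def marcel_reliability(pa_by_season, weights=(5, 4, 3), regression_pa=1200):
    """Reliability = weighted PA / (weighted PA + regression PA).

    `pa_by_season` lists plate appearances, most recent season first.
    """
    weighted_pa = sum(w * pa for w, pa in zip(weights, pa_by_season))
    return weighted_pa / (weighted_pa + regression_pa)

r = marcel_reliability([600, 500, 400])   # 6200 / (6200 + 1200)
regression_share = 1.0 - r                # portion pulled toward the league mean
```

Here `r` rounds to .84 and `regression_share` to .16, matching the numbers worked out by hand.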


#17    tangotiger      (see all posts) 2008/01/17 (Thu) @ 08:13

A guy with 100 PA in each of 3 seasons has a reliability of .50, and therefore gets his weighted performance regressed 50% toward the league mean.


#18    MGL      (see all posts) 2008/01/17 (Thu) @ 16:38

Got it. Never heard that before (Marcel reliability).

Basically it is a min # PA, but a little more rigorous (more recent PA are “better” than old ones, etc.).

That is a nice way to set thresholds on a study rather than the usual “at least 300 PA in each of the last 3 years” or what have you!


#19    tangotiger      (see all posts) 2008/01/17 (Thu) @ 17:01

Yup.  For example, each of these PA sets has a reliability of .75:

yr, yr-1, yr-2
440, 200, 200
200, 500, 200
200, 200, 600

That is, you get an r=.75, or regression toward the mean of 25%.  So, weight as 5,4,3, then regress by 25%.

The implication is that the weighted performance line of these 3 players will be EQUALLY predictive of the performance of these players in yr+1.

Getting 700 PA for 3 straight years gives you an r=.88, meaning you only need to regress his performance by 12%.
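
The “weight as 5,4,3, then regress” recipe can be sketched roughly as follows. The flat .370 OBP and the .330 league mean are made-up numbers for illustration, `regressed_rate` is a hypothetical name, and this is not Marcel’s actual code: it skips the age adjustment entirely.

```python
WEIGHTS = (5, 4, 3)      # most recent season first
REGRESSION_PA = 1200     # league-average PA mixed in

def regressed_rate(pa, rate, league_mean):
    """Weight each season's rate by weight * PA, then regress toward the mean."""
    weighted_pa = [w * n for w, n in zip(WEIGHTS, pa)]
    total = sum(weighted_pa)
    weighted_rate = sum(wp * r for wp, r in zip(weighted_pa, rate)) / total
    reliability = total / (total + REGRESSION_PA)
    return reliability * weighted_rate + (1.0 - reliability) * league_mean

# The three PA sets from the comment, each paired with a flat .370 OBP (made up).
pa_sets = [(440, 200, 200), (200, 500, 200), (200, 200, 600)]
forecasts = [regressed_rate(pa, (0.370, 0.370, 0.370), 0.330) for pa in pa_sets]
# Each has reliability .75, so each lands at 0.75 * .370 + 0.25 * .330 = .360
```

All three players get the same amount of regression even though their playing time is distributed very differently across the three seasons, which is the point of using reliability as a threshold.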


#20          (see all posts) 2008/01/18 (Fri) @ 02:55

So before we “test” Sheehan’s assertion, we need to specifically ask him what he means by “breakout candidate?”

From a Joe Sheehan article from January 26, 2005 called “Fun with PECOTA”:

One of the best features of PECOTA is that it tags players with “Breakout” scores, expressed as a percentage. From the BP statistical glossary:
Breakout Rate is the percent chance that a hitter’s EQR/PA or a pitcher’s PERA will improve by at least 20% relative to the weighted average of his EQR/PA in his three previous seasons of performance. High breakout rates are indicative of upside risk.

Since I don’t have a BP account, that is all I can read, so I do not know if it continues.

Sheehan article from March 28, 2000, “AL East Notebook: Breakout Candidates” lists these breakout candidates:  Calvin Maduro, Jorge Posada, Esteban Yan, Bryce Florie and John Wasdin.

There is also a February 24, 1999 article by Sheehan titled “Breakout and Flameout Review:  How well did we pick them in 1998?” that is akin to this article in predicting breakouts and flameouts.

A search of BP shows a lot of ‘breakout’ predictions over the years, but perhaps better defined with PECOTA.


#21    MGL      (see all posts) 2008/01/18 (Fri) @ 17:02

#20, interesting.  As long as Sheehan or someone can define “breakout” we can test it.  If that is defined as a “breakout rate” above a certain percentage, I have no problem with that.

Of course, young players will automatically have the highest breakout rates according to that definition - the younger the better.  So will players who have the worst performance over the last 3 years (regression toward the mean will be in the “up” direction).  So will players who have the fewest PA over the last 3 years AND were below average in those 3 years (largest regression in the UP direction).  So too for a player who has performed well below what his draft number or scouting report would suggest (the “mean” we regress toward will be higher).  So too for big, strong players who have not shown much power over the last 3 years (again, the mean we regress toward will be higher).

So, for example, without knowing anything about a player other than his stats and his age, height, and weight (and perhaps his scouting report and his draft number), to say that a very young player who has few PA in the last 3 years, is big and strong, was a high draft pick and had or has a good scouting report, AND who has performed poorly (relative to these things) over the last 3 years, is a good or great candidate for a breakout season is pretty much a trivial statement.

On the other hand, this player’s projection is going to be very good.  So as I was trying to say in previous posts, if a player does not exceed his projection, can we still consider it a breakout season?  According to BP’s definition of “breakout percentage,” it has nothing to do with a player’s projection.  A player might have a .750 OPS over the last 3 seasons, have a projection of .850 because of all the things mentioned before (he is young, strong, a great prospect, high draft pick, etc.), and yet have a very high breakout percentage.  The .850 projection alone is 13% higher than the .750.

So if Sheehan is just identifying players whose projection is a lot higher than their past performance, for reasons stated above, then he is spot on.

The question that Tango and I would like to answer (and is apparently on Tango’s “to do” list) is whether Pecota or anyone or anything else can identify “breakout” (or “collapse”) limits beyond that defined by a player’s projection and his prior number of PA (including minor league performance).

We typically assume that a player’s future potential performance curve will be around normal centered on his mean or median projection with a width defined by the weighted sample size of his past performance.  Pecota claims that they can do better than that.  Can they?


#22    Pizza Cutter      (see all posts) 2008/01/18 (Fri) @ 18:25

I’ve done a tiny bit of research on the issue of breakouts.  It seems to me that the more important part of a breakout definition is distinguishing a true breakout season from a statistical outlier season.  The best example is Brady Anderson’s 50 HR season.  The only reliable predictors of whether a breakout (I think I defined it as a 50 pt jump in OPS just to get a quick and dirty look) would continue over the next two seasons (regressed less than halfway back the original gain) were age (younger was better) and that the original OPS had been rather humble to begin with.


#23    MGL      (see all posts) 2008/01/19 (Sat) @ 03:28

I am not talking about which types of players see a “breakout” season (banner year) alter their projection over and above the normal projection algorithm (like a Marcel).  I am talking about projecting a breakout season before any breakout occurs, which is a projection over and above a normal projection algorithm.  This is what Sheehan is talking about.  Similar, but two distinct things.

As I said before, one can call a normal projection (based on a traditional projection model) a “breakout” projection for certain players (young, highly regarded, and below average in the past), but that is trivial.  What we really want to know is whether anyone can project a player to have a breakout season where the definition of breakout is significantly better than a good projection system would say.


#24    Peter Jensen      (see all posts) 2008/01/19 (Sat) @ 11:44

“The question that Tango and I would like to answer (and is apparently on Tango’s “to do” list) is whether Pecota or anyone or anything else can identify “breakout” (or “collapse”) limits beyond that defined by a player’s projection and his prior number of PA (including minor league performance).”

“What we really want to know is whether anyone can project a player to have a breakout season where the definition of breakout is significantly better than a good projection system would say.”

MGL - If someone can systematically identify “breakout” or “collapse” seasons, doesn’t that methodology become incorporated into a new projection system?  Isn’t the question that you and Tango are asking simply “can the projection systems that we have now be significantly improved”?


#25    tangotiger      (see all posts) 2008/01/19 (Sat) @ 12:09

Good point.  I suppose then that the question is if someone can forecast a breakout over and above Marcel, if I promise to never change the basic premise of Marcel.  (Which I hereby do.)


#26    MGL      (see all posts) 2008/01/19 (Sat) @ 18:54

Peter, yes of course. That is tautological.

And of course a more rigorous projection system like just about all of them (mine, ZIPS, Chone, Pecota) can do a little better than Marcel, so by definition, they will be able to project at least a small “breakout” for some players.  The question is can someone or some system project a significant “breakout” (say, 10 or 15%) over and above Marcel.  I forgot, does Marcel do age adjustments?  That is important of course.  If it does not, the younger the player, the more Marcel will undershoot the projection automatically.  Does Marcel do park adjustments?  If not, then if a player goes from Oakland to Texas, obviously it is going to look like a “breakout” season as compared to Marcel.  So I don’t think Marcel is the appropriate standard to compare to in terms of testing breakout predictions, unless AT LEAST Marcel adjusts for age and context (including league).

Now, from what I remember, Joe was not talking about some objective indicators that might be incorporated into a sophisticated projection system in order to project “breakout” seasons.  He was talking out of his *ss, with all due respect.

And again, if Joe, or anyone wants to talk about breakout seasons, he has to identify the player, quote a prevailing projection, based on several good projection systems combined (otherwise I can just take the lowest projection for any given player, and claim that this player is likely to “break out”), and then assert that this player will likely beat that projection by a lot.

Of course claiming something about one player when the results are going to include lots of random fluctuation is meaningless, as there is no way to test the claim after the fact (I can randomly claim that any player will “break out” and be “right” 10 or 20% of the time, just by luck alone, right?).

One would have to take lots of players that someone or something claimed to be “breakout” candidates, and then check those after the fact, and hope that there is enough of a sample size to get some reliable results.

That is basically what Tango wants to do with Pecota’s “breakout” and “collapse” numbers.


#27    tangotiger      (see all posts) 2008/01/19 (Sat) @ 23:18

Marcel does 3 and only 3 things:
1. Looks at the past 3 years, weighting the most recent season most heavily (5/4/3 for hitters and 3/2/1 for pitchers)
2. Age adjust (steeper curve upward for younger players and gentler slope for older players)
3. Regress toward the mean based on weighted playing time (PA or IP, with weights noted above)

League and/or park adjustments are not made, nor are minor league stats considered.  So, you could predict breakouts based on player movement along any of those 3 parameters.  Ryan Braun for example probably has a much lower Marcel forecast than the other systems.

Checking Fangraphs now…

Hmmmm… Marcel is between Chone and Bill James.  Bill James treats his performance of 2007 as indicative of talent, and is not applying any regression at all:
http://www.fangraphs.com/statss.aspx?playerid=3410&position=3B

MGL, what’s your forecast for Braun (OBP/SLG)?


#28    MGL      (see all posts) 2008/01/21 (Mon) @ 01:24

I don’t have an OBP/SLG handy for Braun, only a lwts above average, which is +31 per 630 PA, whatever that is in OPS.  His MLE for AA in 06, for around 238 PA was +45 per 630 PA (+38 in 07 in AAA and MLB), so his minor league stats are certainly going to help his projection.  He tore up most of single A and college too of course and was a 1st round pick (5 overall).  So, despite great MLB stats, I assume that Marcel will underrate him a lot, because it is going to regress a lot because of his few MLB PA and regression to zero (or less, because of his age or MLB service time - I’m not sure what the Marcel formula regresses toward), as opposed to regressing to something greater than zero, which is appropriate, given his size (6’2”, 205), minor league performance (whichever performance is NOT being used in the projection algorithm itself), and his draft status and scouting report.


#29    Rally      (see all posts) 2008/01/21 (Mon) @ 02:02

How does his AA performance in 2006 come out to a better MLE than his 2007 major league record?

He hit 303/367/589 in AA, and 324/370/634 in the majors.


#30    Tangotiger      (see all posts) 2008/01/21 (Mon) @ 08:48

OPSwins = (1.7*OBP+SLG-1)*.025*PA

So, Marcel has him as .369/.584, which would convert roughly as wins=+3.3 per 630 PA.
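
As a quick check of that conversion, here is the OPSwins formula as a small function. The function name is made up; the formula and the .369/.584 line are the ones quoted in this comment.

```python
def ops_wins(obp, slg, pa):
    """Rough wins above average from OBP and SLG, per the formula in the comment."""
    return (1.7 * obp + slg - 1.0) * 0.025 * pa

braun = ops_wins(0.369, 0.584, 630)   # about 3.3 wins
```

At roughly 10 runs per win, +3.3 wins sits right next to the +31 runs per 630 PA that MGL quoted.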

Looks like we convert him exactly the same!


#31    MGL      (see all posts) 2008/01/21 (Mon) @ 17:35

Rally, #29, I meant that my projection (or any projection that uses minor league stats) will be higher than one that does not use minor league stats if the minor league MLE’s are greater than MLB average for that age, since a Marcel-type system will essentially substitute league average (for that age) stats for those minor league stats.

The fact that Marcel and my system got around the same projection has NOTHING to do with the fact that I use minor league stats and Marcel does not.  IF Marcel encounters a player with NO major league stats, or 50 PA or so, its projection will be fixed (something less than league average I would presume, depending upon the age of the player) no matter who the player is.  Obviously if another system uses minor league stats and the player has lots of minor league stats, that system will do MUCH better than Marcel!


#32    Rally      (see all posts) 2008/01/21 (Mon) @ 19:26

MGL, I agree with your points in #31.  I read #28 as saying Braun’s 2006 in AA translates to a better LWTS than his 2007 performance.  That shocks me, because Braun’s BA/OBA/SLG are better in 2007 even if you don’t let the air out of 2006 for being in AA.

At least that’s what it looks like you are saying in #28, if you aren’t then never mind.


#33    Mike Green      (see all posts) 2008/01/24 (Thu) @ 11:04

I pay little attention to articles about “breakout candidates” for position players.

There are players who I subjectively think will do a little better than the projections because of anthropometrics (Carl Crawford, Alex Rios), but the magnitude of the differences is pretty small.  Rios is projected by Marcel at .290/.350/.480, and I’d have him subjectively at .300/.365/.500. Crawford is similar. Big whup.

Pitching is a different story entirely.  Injury histories and new reports, new pitches and so on all make projection less reliable.  Changes in team defence also affect pitching, of course.  For instance, Andy Sonnanstine is projected to have an ERA of 4.82 and a FIP to match in 2008 by Marcel.  Matt Garza is projected to have an ERA of 4.50.  I have them at 1/2 a run less each, due to the interaction between defence and pitching.  Tampa 2008 is an extreme case.


#34    Tangotiger      (see all posts) 2008/01/24 (Thu) @ 11:22

If you set a 0.50 run difference as the threshold for a big difference (implies a 1 win difference in 180 IP), then a 1 win difference on 600 PA would be 20 points in OBP and 30 in SLG.

I agree that the 15/20 difference you are citing is not that much of a difference, but it’s getting there.

***

Note, 600 forecasted PA is equivalent to 180 forecasted IP.  The 30th highest forecasted player in each of those measures is roughly at those numbers.


#35    Tangotiger      (see all posts) 2008/01/24 (Thu) @ 11:29

By the way, I personally set the “basically agreed” line at 20 OPS points (8 OBP, 12 SLG, which is less than 0.4 wins or 4 runs).
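
Those hitter thresholds can be checked against the OPSwins formula from #30. This is a sketch under the assumption that the same weights apply to differences between two forecasts (the constant in the formula cancels out when subtracting); `win_gap` is a made-up name.

```python
def win_gap(obp_gap, slg_gap, pa=600):
    """Win difference implied by gaps in OBP and SLG, using the OPSwins weights."""
    return (1.7 * obp_gap + slg_gap) * 0.025 * pa

basically_agreed = win_gap(0.008, 0.012)   # 20 OPS points
big_difference = win_gap(0.020, 0.030)     # 50 OPS points
```

The first works out to about 0.38 wins, just under the 0.4 wins cited, and the second to about 0.96 wins, close to the 1-win line from #34.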

For pitchers, that 4-run difference implies the following acceptable ERA difference at each IP level:

IP, ERA
180, 0.20
120, 0.30
90, 0.40
72, 0.50
60, 0.60

So, if you have two forecasts for relievers at 72 IP, and their ERA forecasts are 0.50 apart, it’s basically the same forecast (4 run difference).
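
The table follows directly from ERA being runs per 9 innings. A minimal sketch that rebuilds it (the function name is made up):

```python
def era_gap(run_gap, innings):
    """ERA difference equal to `run_gap` runs over `innings` innings pitched."""
    return run_gap * 9.0 / innings

table = {ip: round(era_gap(4, ip), 2) for ip in (180, 120, 90, 72, 60)}
```

Swapping in a different run cutoff or a different IP level gives the matching ERA tolerance for any pitcher.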

A 4-run difference is basically one really good or really bad game difference in a season.  Seems like a reasonable cutoff.


#36    MGL      (see all posts) 2008/01/24 (Thu) @ 22:02

Interesting (not saying it is wrong or right) that “basically the same forecast” can be less than .20 in ERA for a #1 starter (with 200+ IP) and .6 runs for a reliever.  I realize that that is the same number of wins (and that both forecasts will generate around the same value), but it seems rather odd.  Also, you have to distinguish between a 1.0 (or so) LI reliever and a closer or set-up guy.  For a closer who pitches 80 IP, you want that “same thing” ERA projection to be closer to that of a starter, no?

