THE BOOK--Playing The Percentages In Baseball


Wednesday, February 01, 2012

Tempered Yu Forecast

By Tangotiger, 10:58 AM

Brian doesn’t blindly follow his off-the-wall forecast.  Good for him. 


#1    Jesse      (see all posts) 2012/02/01 (Wed) @ 13:55

A good analysis of Yu and the Oliver system, but I don’t think I agree with using the next 3 years of performance to evaluate a projection made for a Japanese pitcher before the start of his first MLB season. Many projection systems only look back at 3 years of performance, meaning that a pitcher is essentially a different projection entity by his 3rd MLB season. I’d be more interested in how the pitchers performed in the first season they came over, compared to their projections, rather than in the first 3 seasons weighted.


#2    Tangotiger      (see all posts) 2012/02/01 (Wed) @ 14:06

I disagree.  If you sign someone for 9 years, don’t you want to have an estimate for the next 9 years?


#3    Jesse      (see all posts) 2012/02/01 (Wed) @ 14:18

I think we can all agree that you WANT the projection for the next 9 years, but his projection system is only projecting the 2012 season, no? After 1 year the pitcher has undergone some degree of change, so I don’t see it as particularly informative to continue using an outdated projection for baseline comparison. If he wanted to use 3 years for increased sample, he could update the projections to include the relevant data from both leagues and create a likewise weighted aggregate projection. Since most Japanese pitchers come over at or after their peak age, using a 3-year window instead of a 1-year window could bias the results to look worse than they actually were.


#4    Micah      (see all posts) 2012/02/01 (Wed) @ 14:41

Interesting read, but something stands out:

“It’s not the first time it’s happened, but when a player so dominates his non-major league competition that his derived major league true talent exceeds generally accepted norms, it offers an opportunity to examine the system and make some changes for the better.”

Granted, his analysis attempts to temper his projection based on a more closely regarded set of observations, but as I read the sentence above, it appears that the writer is trying to adjust his model so that his projection more closely matches expectations. Obviously, projections are only useful when compared to observations (which don’t yet exist), so maybe it’s more appropriate to say something like:

“We don’t know what to expect from Yu because we don’t believe there is an adequate/accurate way to model/predict his future performance based on his NPB history. I tried to adjust the projections based on data that appears to be more reliable, but still the projections are wildly different from consensus expectations. It will be interesting to observe how he pitches and see if we can develop a more accurate model.”

Now, I may be entirely misreading that first paragraph, and I don’t think the approach to the analysis appears as alarming as the one statement, but still this is a dangerous sentiment from the author.

Then again, as a Rangers fan, I’d like his wild projection to be more true than everyone expects.


#5    Brian Cartwright      (see all posts) 2012/02/01 (Wed) @ 20:37

#4 -

I see this as an opportunity to study the situation, and see if there’s something I might not have considered previously. I could always just regress more heavily to the mean, as that’s the easiest way to pull in outliers, but it might not offer greater accuracy.

Tango questioned my Strasburg projection, based on only his college and fall league performances. I stuck by the projection, and it turned out to be very close. But that’s just one player.

I’m trying to minimize the total error for all players. Also, I’m not going to carve out an exception to the rules. Whatever changes I might make will apply, in this example, to all players coming from Japan.

For example, I believe I could do better by comparing pitchers in the same roles (starter vs reliever). Right now, I am comparing a season pitching line in Japan to a season pitching line in the US. However, in some cases that’s taking a pitcher’s stats as a starter in Japan compared to his stats as a reliever in the US. We know that on average a player will perform better as a reliever, so I’m not making an apples to apples comparison.

Also, I wrote a piece for this year’s Hardball Times Annual that showed that expected BABIP allowed is a function of ground ball rate. A pitcher’s BABIP is heavily regressed, but it would be more accurate to regress it to a mean derived from his ground ball rate.

So this isn’t just fudging the numbers because something doesn’t look right; it’s about studying the situation and seeing if there is an informed improvement that can be made.
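
A minimal sketch of the regression idea discussed in this comment, assuming a simple weighted-average form of regression to the mean. The function name, the `stabilizer` value, and every input number below are hypothetical placeholders for illustration; none of them come from the Oliver system or from real player data.

```python
def regress_to_mean(observed_rate, opportunities, target_mean, stabilizer):
    """Shrink an observed rate toward target_mean.

    stabilizer is the number of opportunities at which the observed rate
    and the target mean receive equal weight (a hypothetical value here).
    """
    weight = opportunities / (opportunities + stabilizer)
    return weight * observed_rate + (1 - weight) * target_mean


# Illustrative inputs only, not real player data.
observed_babip = 0.270
balls_in_play = 500
league_mean = 0.300      # one mean shared by every pitcher
gb_based_mean = 0.290    # hypothetical mean implied by this pitcher's ground ball rate
stabilizer = 2000        # hypothetical; a larger value regresses more heavily

to_league = regress_to_mean(observed_babip, balls_in_play, league_mean, stabilizer)
to_gb_mean = regress_to_mean(observed_babip, balls_in_play, gb_based_mean, stabilizer)
print(round(to_league, 4), round(to_gb_mean, 4))
```

The only thing that changes between the two calls is the target: the amount of regression is the same, but the estimate is pulled toward a different mean.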


#6    Tangotiger      (see all posts) 2012/02/01 (Wed) @ 21:30

You should not regress BABIP based on GB rate and NOT regress SLG on BIP as well. You have to do both.  And you also have to do DP.

Indeed, you’d be better off not doing anything at all, if you are just going to do some of the above.  Because the RUN value of a BIP (excluding HR) is nearly identical, regardless of whether you are a GB or FB pitcher.  Check it out…

***

As for questioning the Strasburg forecast: since Brian had the same forecast for Strasburg as he had for Phil Hughes, then yeah, nailing Strasburg but completely missing on Hughes is not good.

(Not that Brian was alone on this.)

Anyway, I appreciate Brian going the extra mile here…


#7    Brian Cartwright      (see all posts) 2012/02/01 (Wed) @ 23:43

6/Tango -
First I do HP, BB-IBB and SO per PA-SH-IBB
then HR per PA-SH-IBB-HP-BB-SO
then BH per PA-SH-IBB-HP-BB-SO-HR (BABIP)
then XBH (DO+TR) per BH
then TR per XBH

Then each of these rates is regressed.

Adjusting the rate of base hits per balls in play adjusts the denominator for doubles and triples.

If BABIP is regressed to a GB-derived mean, then HR per ball contacted, or per air ball, is also regressed to a mean determined by ground ball rate.

So that covers SLG
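
The chain of rates listed in this comment can be sketched roughly as follows. This is one reading of the list, not the actual Oliver code: it assumes `bb` is total walks including intentional ones (so unintentional walks are `bb - ibb`), and the stat line at the bottom is made up for illustration.

```python
def component_rates(pa, sh, ibb, hp, bb, so, hr, si, do, tr):
    """Each rate is taken per the opportunities left after earlier outcomes.

    Assumption: bb includes ibb, so unintentional walks are bb - ibb.
    """
    ubb = bb - ibb
    base = pa - sh - ibb                # denominator for HP, BB-IBB and SO
    contacted = base - hp - ubb - so    # balls contacted: denominator for HR
    in_play = contacted - hr            # balls in play: denominator for BABIP
    bh = si + do + tr                   # base hits on balls in play
    xbh = do + tr                       # extra-base hits excluding HR
    return {
        "hp": hp / base,
        "ubb": ubb / base,
        "so": so / base,
        "hr": hr / contacted,
        "babip": bh / in_play,
        "xbh_per_bh": xbh / bh,
        "tr_per_xbh": tr / xbh,
    }


# Hypothetical season line, for illustration only.
line = dict(pa=800, sh=10, ibb=5, hp=5, bb=65, so=180, hr=20, si=130, do=35, tr=5)
rates = component_rates(**line)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Because doubles and triples are expressed per base hit, changing the regressed BABIP automatically changes how many doubles and triples come out the other end, which is the point made above about the denominator. A real implementation would also need to guard against zero denominators for small samples.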


#8    Tangotiger      (see all posts) 2012/02/01 (Wed) @ 23:50

Ok, so you not only do BABIP, but you also do SLG on BIP?

Now, do me a favor: calculate wOBA on BIP.  What’s the standard deviation you have for BABIP and wOBA on BIP?


#9    Brian Cartwright      (see all posts) 2012/02/02 (Thu) @ 00:01

SD for batters, pitchers, both?
MLB only?
unadjusted? (for ballparks, etc)


#10    Tangotiger      (see all posts) 2012/02/02 (Thu) @ 00:04

Pitchers, MLB.  And I mean your forecasts.  Park-neutral, sure.


#11    Brian Cartwright      (see all posts) 2012/02/02 (Thu) @ 00:07

Got it.

May be a little while until I get some free time to code a query.


#12    Brian Cartwright      (see all posts) 2012/02/02 (Thu) @ 06:26

wOBA on BIP = (0.77*SI+1.08*DO+1.37*TR)/(AB-SO+SF-HR)

mean = 0.2565 SD = 0.0082

BABIP = (SI+DO+TR)/(AB-SO+SF-HR)

mean = 0.2994 SD = 0.0096

I used 2010 projections, which do include minor league data as input, and 2011 actual MLB balls in play.
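
For readers who want to reproduce the two formulas above on their own data, here is a minimal sketch. The function names and the sample stat line are hypothetical; the default weights are the ones quoted in this comment.

```python
def bip_denominator(ab, so, sf, hr):
    """Balls in play, using the denominator from the formulas above."""
    return ab - so + sf - hr


def babip(si, do, tr, ab, so, sf, hr):
    """Hits on balls in play (HR excluded) per ball in play."""
    return (si + do + tr) / bip_denominator(ab, so, sf, hr)


def woba_on_bip(si, do, tr, ab, so, sf, hr, weights=(0.77, 1.08, 1.37)):
    """Weighted value of singles, doubles and triples per ball in play."""
    w_si, w_do, w_tr = weights
    return (w_si * si + w_do * do + w_tr * tr) / bip_denominator(ab, so, sf, hr)


# Hypothetical pitching line, for illustration only.
line = dict(si=110, do=35, tr=5, ab=700, so=180, sf=5, hr=20)
print(f"BABIP       = {babip(**line):.4f}")
print(f"wOBA on BIP = {woba_on_bip(**line):.4f}")
```

Passing a different `weights` tuple makes it easy to compare alternative weightings on the same set of pitchers.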


#13    Tangotiger      (see all posts) 2012/02/02 (Thu) @ 09:04

Not sure why you used .77, 1.08, 1.37.  Sticking to .90, 1.24, 1.56 would have been just about right (keeping your mean at .300).

In either case, the SD comes in at .0096, showing that there is an inverse relationship between XBH and 1B.
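
One plausible way to check these figures is to rescale the mean and SD from #12 by the ratio of the two sets of weights. The sketch below assumes the singles ratio is a good enough scale factor, since singles make up most hits on balls in play; that is an approximation, because the three weight ratios are close but not identical.

```python
brian_weights = (0.77, 1.08, 1.37)   # weights used in #12
tango_weights = (0.90, 1.24, 1.56)   # weights suggested in #13

# Ratio of new to old weight for singles, doubles and triples.
ratios = [t / b for t, b in zip(tango_weights, brian_weights)]
print([round(r, 3) for r in ratios])

# Assumption: use the singles ratio as the overall scale factor.
scale = ratios[0]
rescaled_mean = 0.2565 * scale   # mean of wOBA on BIP from #12
rescaled_sd = 0.0082 * scale     # SD of wOBA on BIP from #12

print(round(rescaled_mean, 3), round(rescaled_sd, 4))
```

Under that approximation the rescaled mean lands at about .300 and the rescaled SD at about .0096, the same spread reported for BABIP in #12.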


#14    Micah      (see all posts) 2012/02/02 (Thu) @ 11:30

Brian, I appreciate the response, and I don’t think your analysis (and adjustment) is objectively wrong; rather, the first paragraph caught me off guard. Really, what stands out to me is “better.” Independent of any judgement of your model or adjustments, that seems to be a large value judgement to make prior to any observations for Yu.

It appears you’ve tempered your model based not on what has/will happen, but instead because others don’t have the same expectations. That’s not to say that your adjustments aren’t appropriate, or that your model won’t project efficiently, but what I read (and again, I may be misreading) seems to suggest that your model is “better” when your projections conform to consensus expectation.


#15    Brian Cartwright      (see all posts) 2012/02/02 (Thu) @ 23:35

Tango, I used those weights because when I googled and went to http://www.insidethebook.com/woba.shtml I stopped at the first place you gave weights and didn’t read the whole way through.

Micah - I’m not conceding that I was wrong or that another method is better. When Tango writes a new thread about some very optimistic projection I made, it’s a reason to examine my work, but not necessarily to change it - only if I find something that I believe can objectively be improved upon. I’ve offered a couple of examples of methods I do think could be improved.

I wrote the article so that people could understand some of the process and hopefully come away more confident in my work.

