THE BOOK--Playing The Percentages In Baseball

Monday, November 30, 2009

Bayes, Marcel, Academics, Phil, back-and-forth discussion… this one has it all!

By Tangotiger, 04:06 PM

Here’s the article, with four back-and-forth commentaries.  I am pleased that Marcel was used as the benchmark, if for no other reason than that it’s so basic and open source that it should be the benchmark.

Glove-slap Phil.


#1    Tangotiger      (see all posts) 2009/11/30 (Mon) @ 16:21

I’m just reading the paper now.  Actually, I read it already.  Didn’t we link to it already?  The new stuff is the responses to the article.

One thing that was asked is why use the fielding position.  The reason is regression.  If you have two players, each of whom hit 25 HR in 600 PA, and one is a 1B and the other is a SS, there is a much better chance that the 1B will hit more HR than the SS in the next season.  That’s if you know nothing else about the players.

If you know that both players are 6’3”, 220lbs, then the position might not be so important (it might be though).

I don’t like using position.  I’ve talked about this before.  I think it’s lazy to say that you will give a different hitting forecast for ARod, depending on whether he is a SS, 3B, or RF.

You see, the reason that it works out (for MOST players) to use the position is simply because that’s an extra parameter that is linked to performance.  Most SS are not HR hitters, so if you find one that is, chances are, he’s not really that much of a HR hitter. 

And if you find a 1B with just 10 HR, then chances are, he is much better than that.  Of course, if you ALSO know that he’s a great fielder (Minky), then maybe that 10 HR is representative.  But, Teix is also a great fielder, so if he hit 10 HR you know it’s a bad luck season (no one pays 180MM$ for a great-fielding 1B who hits 10 HR).

Anyway, that’s the reason it’s used: to infer something about the player, that the other parameters can’t tell you.  But if you had better scouting information, you wouldn’t need this parameter.
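
To make the regression argument concrete, here is a minimal sketch, not anyone's actual projection system.  The positional HR rates (0.040 per PA for 1B, 0.020 per PA for SS) and the 400 PA of regression "ballast" are made-up numbers chosen purely for illustration, and `regress_to_mean` is a hypothetical helper.  It only shows the mechanism: the same 25 HR in 600 PA gets pulled toward a different mean depending on the population you pick.

```python
def regress_to_mean(hr, pa, pop_rate, ballast_pa):
    """Blend an observed HR rate with a population rate.

    hr, pa      -- observed home runs and plate appearances
    pop_rate    -- HR per PA for the population regressed toward (assumed value)
    ballast_pa  -- PA of population-average performance added (assumed value)
    Returns the regressed HR-per-PA estimate.
    """
    return (hr + pop_rate * ballast_pa) / (pa + ballast_pa)


# Hypothetical positional means, for illustration only.
POSITION_HR_RATE = {"1B": 0.040, "SS": 0.020}
BALLAST_PA = 400  # assumed amount of regression, not a measured value


def forecast_hr(hr, pa, position, next_pa=600):
    """Forecast HR over next_pa using the positional mean as the prior."""
    rate = regress_to_mean(hr, pa, POSITION_HR_RATE[position], BALLAST_PA)
    return rate * next_pa


if __name__ == "__main__":
    for pos in ("1B", "SS"):
        print(pos, round(forecast_hr(25, 600, pos), 1))
```

With these assumed inputs the 1B comes out ahead of the SS even though their observed lines are identical.  Swap in a height/weight population mean instead of a positional one and the same function applies, which is the point about position being just one more parameter.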


#2    Hizouse      (see all posts) 2009/11/30 (Mon) @ 17:35

Is Shane Jensen any relation to the Peter Jensen who posts here?


#3    Ian      (see all posts) 2009/11/30 (Mon) @ 17:51

Using position as a proxy for agility seems ok, since you’d have to think that most players play the most difficult position that they are able to play.  Inversely correlating this agility with strength might work too, but I’d expect height/weight to be better - do any of the current projection systems incorporate player size?  Would it be useful if they did?  I’d lean towards no, since people of the same size can perform quite differently.  Not to mention the dubious accuracy of listed heights/weights, and the fact that strength might not correlate that well with home-run power.

Something else I saw in the second commentary, from Glickman: he points out that if young players are expected to improve season-to-season, we should also expect them to improve during the season. I wonder if this shows up in the data, or if increased fatigue from younger players (who might not be used to playing long seasons) counteracts it.  Or maybe players only improve in the offseason, without coaches around.


#4    Peter Jensen      (see all posts) 2009/11/30 (Mon) @ 18:06

Shane is not related to me.


#5    Toffer Peak      (see all posts) 2009/11/30 (Mon) @ 18:24

Tangotiger - You bring up a question I was just thinking of last night, particularly with respect to catchers. Do you know if any of the popular projection systems (Marcel, ZiPS, CHONE, etc) regress catcher stats toward the catcher mean or if they simply regress toward all hitters? I was curious since it would seem like catchers would be more likely to break down due to injury and wear and tear and thus the projections for them might be more optimistic than for other players.


#6          (see all posts) 2009/11/30 (Mon) @ 19:28

Ian/3: >“Something else I saw in the second commentary, from Glickman: he points out that if young players are expected to improve season-to-season, we should also expect them to improve during the season.”

I noticed that too, and was intrigued.  It’s a great idea ... I always wondered whether the improvement was during the season only, between seasons only, or some of each (and in what proportion?).  I agree with Ian that fatigue might be a factor too, but, in that case, you can just compare fatigue between young and old to tease out the effect.


#7    John Harris      (see all posts) 2009/11/30 (Mon) @ 20:12

@3 and 6:

Surely you would run into huge sample size issues when looking for in-season improvement.  Younger players, who are less likely to play full seasons, would exacerbate this issue.  How would you draw conclusions from how someone did over 50 plate appearances in April when compared with 75 plate appearances in September?  At the season level, you avoid a sizeable chunk of the sample issue.  Though the idea is intriguing, I am uncertain it can be measured.


#8          (see all posts) 2009/11/30 (Mon) @ 20:28

You could just find all 23-26 year-olds with (say) 500PA.  Find the most similar season among players 30+.  See if there is a difference between their first halves, and their second halves.

You’d expect the young players to improve during the season, and the older players to decline.  Even if part of the cause for both is decline due to fatigue, the old players should show a higher decline overall.


#9    MGL      (see all posts) 2009/12/01 (Tue) @ 00:55

There is nothing “wrong” with using position as the population to regress towards. It is essentially a proxy for height and weight and somewhat a proxy for agility.

It is better to use weight/height, but position is fine, if that is all you know.  You can use any characteristic/population you want, as long as the mean is unique to that population.  If you do use height/weight, be careful about catchers.


#10    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 08:01

Phil, I weight by day.  And I think if you do that, the weight would be 52% first half and 48% second half (or 51/49).

That is, you will find barely a difference.


#11    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 08:09

Obviously, I meant higher for 2nd half.


#12          (see all posts) 2009/12/01 (Tue) @ 10:59

Tango, is that for young players, old players, or all players?


#13    Nate      (see all posts) 2009/12/01 (Tue) @ 11:37

Tango-

Somewhat off topic, but when can we expect Marcels for 2010?


#14    Tangotiger      (see all posts) 2009/12/01 (Tue) @ 11:46

Marcel 2010 will be done by the weekend.

***

Phil, the general equation for weighting is:

weight(daysAgo) = .9994^daysAgo

If you are on a steep slope, then that .9994 would be .9990 or something, perhaps even .9986.  If you are on a gentle slope, then it would be say .9996 or .9998.

Just a little research will give you what you need.
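
As a rough check on the 51/49 split mentioned in #10 and #11, the weighting equation can be sketched as below.  The assumptions are mine, not from the thread: a 180-day season split into two 90-day halves, with days counted back from the final day of the season.  The function names are hypothetical.

```python
def day_weight(days_ago, decay=0.9994):
    """weight(daysAgo) = decay ** daysAgo, per the equation above."""
    return decay ** days_ago


def half_shares(season_days=180, decay=0.9994):
    """Share of total weight falling in each half of one season.

    Assumes (for illustration) that day 0 is the last day of the season
    and that the season splits evenly into two halves.
    Returns (first_half_share, second_half_share).
    """
    half = season_days // 2
    second = sum(day_weight(d, decay) for d in range(half))
    first = sum(day_weight(d, decay) for d in range(half, season_days))
    total = first + second
    return first / total, second / total


if __name__ == "__main__":
    for decay in (0.9986, 0.9990, 0.9994, 0.9996, 0.9998):
        first, second = half_shares(decay=decay)
        print(f"decay={decay}: first half {first:.1%}, second half {second:.1%}")
```

Under these assumptions .9994 puts a bit over 51% of the weight on the second half, and the steeper decay values push that share higher.  Adding an offseason gap before the forecast date multiplies every day's weight by the same factor, so the shares within the season do not change.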

