Friday, August 27, 2010
How to do true talent batter platoon splits
Good stuff from Jeff.
You ALWAYS regress.
Right, it is a matter of how much. The larger the sample, the less you regress. You probably want to know the “50/50” point, for whatever that is worth. Look in The Book for that. It is a lot more PA for RHB than for LHB (because the spread of “platoon talent” for LHB is greater). For some reason, a lot of people seem to be obsessed with the 50/50 point. They like to say, “less than this, the sample splits (or whatever the stat is) are unreliable, and more than this, they are reliable.” I hate that characterization. It is arbitrary. What constitutes reliable or not reliable? I have no idea, other than at the extremes, I suppose. As I have said a million times (and I am not exaggerating), I ignore sample stats when I see them. I do a mental regression and that is the number that I use (for whatever).
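For anyone who wants the "mental regression" spelled out, here is a minimal sketch. It assumes the common shrinkage form in which the observed split gets weight n / (n + k), with k being the “50/50” point. The function name, the value of k, and the league-average split are all made up for illustration; the real values are in The Book.

```python
def regress(observed, n_pa, league_mean, k_pa):
    """Shrink an observed split toward the league-average split.

    k_pa is the "50/50" point: when n_pa == k_pa, the observed split
    and the league mean get equal weight.
    """
    weight = n_pa / (n_pa + k_pa)
    return weight * observed + (1.0 - weight) * league_mean


# Hypothetical numbers, for illustration only (not taken from The Book).
K_PA = 1000           # assumed 50/50 point, in PA
LEAGUE_SPLIT = 0.020  # assumed league-average platoon split, in wOBA

# The same observed split of .060 at three different sample sizes.
for n in (100, 1000, 5000):
    print(n, round(regress(0.060, n, LEAGUE_SPLIT, K_PA), 4))
```

Nothing special happens at the 50/50 point: the weight on the sample just keeps climbing as PA accumulate, which is why calling one side "unreliable" and the other "reliable" is arbitrary.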
Speaking of “I am not exaggerating”…
I read this somewhere and I love it:
An example of a common misuse of the word “literally”…
“I made a mistake at work today and the boss literally tore my head off...”
It’s not a misuse - one of the definitions of “literally” is “virtually”:
#4, OK.
Aren’t you supposed to do the regression on the ratio between wOBA v LHP and wOBA v RHP? Not the absolute difference?
I’m all for the evolution of language, but that definition of “literally” is still a weak one. Even the usage discussion from your link is kind of lame. The usage discussion is saying that “literally” is similar to the way teenagers use the word “like”, which is just to indicate an important point in the sentence. Basically meaningless but it’s, like, a pointer.
Well, I looked it up on Webster’s online, and that is what it said. The second definition is indeed “virtually,” as in NOT literally, but almost. Strange but true.
Keep in mind that the first definition of a word is the primary one. They are not equal.
And of course definitions change as common usage changes. And why not? The definition of a word is somewhat arbitrary.
I prefer ratio and others use the absolute difference. I am not sure which one is technically correct. It could actually be a combination of the two (or some other function), since we don’t necessarily know the exact function/regression equation…
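To make the two options concrete, here is a rough sketch that regresses the same batter's split both ways. It assumes the same n / (n + k) shrinkage for both and the same k, which may not be right. All names and numbers are hypothetical.

```python
def shrink(observed, n_pa, mean, k_pa):
    """Weight the observed value by n / (n + k), the mean by the rest."""
    w = n_pa / (n_pa + k_pa)
    return w * observed + (1.0 - w) * mean


def split_by_difference(woba_vs_l, woba_vs_r, n_pa, mean_diff, k_pa):
    """Regress the absolute difference (wOBA vs RHP minus wOBA vs LHP)."""
    return shrink(woba_vs_r - woba_vs_l, n_pa, mean_diff, k_pa)


def split_by_ratio(woba_vs_l, woba_vs_r, n_pa, mean_ratio, k_pa):
    """Regress the ratio (wOBA vs RHP divided by wOBA vs LHP)."""
    return shrink(woba_vs_r / woba_vs_l, n_pa, mean_ratio, k_pa)


# Hypothetical LHB: .300 wOBA vs LHP, .360 vs RHP, 500 PA vs LHP.
# The league means and k below are placeholders, not values from The Book.
print(split_by_difference(0.300, 0.360, 500, mean_diff=0.025, k_pa=1000))
print(split_by_ratio(0.300, 0.360, 500, mean_ratio=1.08, k_pa=1000))
```

The two versions agree for an average hitter and drift apart as the batter's overall level moves away from league average, since a fixed ratio implies a bigger absolute gap for a better hitter.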
To ask something that has nothing to do with English grammar ...
What are the splits and the regressions for the individual stats that make up the platoon split? Say, what is a normal split for lefty batters of singles against lefties and righties, and how much do you need to regress that?
For lefty batters, the average (all batters, weighted by their PA) singles-per-PA platoon ratio is 1.06. For RHB, it is 1.02. I don’t know off the top of my head the regression for singles (per PA). If you do it per batted ball or per non-HR batted ball, the numbers are probably different, of course. I also assume that the regression for RHB is a lot more than for LHB, given the number of PA (the smaller of the two - vs RHP and LHP).
For doubles (again, per PA), it is 1.33/1.18 (LHB/RHB); triples, 1.11/1.40; HR, 1.57/1.34; BB, 1.12/1.22; and SO, .83/.89. For OPS, it is 1.17/1.10.
I don’t have my copy of The Book in front of me. At what point do you not have to regress a LHB’s splits at all?
Is it 2000 PA (1000 PA x 2)?
And a RHB would be 4400 PA vs. LHP for no regression?
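If the regression takes the usual n / (n + k) shrinkage form (an assumption here, since the thread never spells out the formula), then the weight on the observed split never reaches 1, which fits with "You ALWAYS regress." The sketch below treats the 1000 PA and 2200 PA figures implied by the question as candidate 50/50 points. They are not checked against The Book, and the function names are made up.

```python
def weight_at(n_pa, k_pa):
    """Share of the estimate that comes from the observed split."""
    return n_pa / (n_pa + k_pa)


def pa_for_weight(weight, k_pa):
    """PA needed for the observed split to carry the given weight."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1); 1.0 would need infinite PA")
    return k_pa * weight / (1.0 - weight)


# Assumed 50/50 points, read off the question above (unverified).
K_LHB = 1000
K_RHB = 2200

# Doubling the 50/50 point does not remove the regression.
print(weight_at(2 * K_LHB, K_LHB))
print(weight_at(2 * K_RHB, K_RHB))

# PA needed for the sample to carry 90% and 99% of the weight.
for w in (0.90, 0.99):
    print(w, round(pa_for_weight(w, K_LHB)), round(pa_for_weight(w, K_RHB)))
```

Under these assumptions, twice the 50/50 point puts two-thirds of the weight on the sample split and one-third on the league average.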