THE BOOK--Playing The Percentages In Baseball

Friday, August 20, 2010

Bayes is Regression Toward The Mean

By Tangotiger, 03:25 PM

This comment from another thread sparked the exchange that follows between me and Jared:

You’ve got a weighted die.  You know it’s weighted because you built it.  It lands on “1” 25% of the time.  You also have nine unweighted dice.  You built those too.

You put all 10 in a pouch.  You roll each die 36 times.  You get these counts for the number of times you roll “1” for the 10 dice:

11-9-8-7-6-6-6-5-5-4

Which one is the weighted die?  YOU DON’T KNOW!!!

What is the CHANCE that it’s the one that rolled 11 ones?  MORE than the chance that it’s the one that rolled 4 ones.  But EACH of the 10 has a chance to be the weighted die.

It’s all based on probability, a number that is GREATER than zero and LESS than one.

If anyone says “all luck” or “all skill”, leave this blog, and never come back.  WE DON’T KNOW.  All we can do is make a best estimate as to the mean, and a best estimate as to the uncertainty of that mean.  And, if you like, a best estimate of the uncertainty of the uncertainty of that mean.  And so on.


#1    J. Cross      (see all posts) 2010/08/18 (Wed) @ 09:20

It might be interesting to see what you get when working Tango’s dice example with a) Bayes’ Theorem and b) regression to the mean.  In theory, it should be the same, but I think amid the Strasburg discussion MGL said that regression to the mean breaks down at the extremes b/c we don’t have a normal distribution (sorry if I’m misstating that).

I’d revise the problem slightly so that rather than removing all 10 dice, you remove 10 dice from an infinite supply of dice, 10% of which are weighted (since, I think this better matches the baseball analogy).

So, using Bayes’ theorem, the prior probability of any die being weighted is .1, the chance of getting a “hit” (a “1”) if weighted is .25, and if not weighted is 1/6th.

Given the results, the revised probabilities of each die being weighted are:

hits, posterior p(weighted), expected hit%

11, 0.41, .200
9, 0.20, .183
8, 0.13, .178
7, 0.08, .174
6, 0.05, .171
5, 0.03, .170
4, 0.02, .169

How does this compare to regression each die’s hit% to the population mean of 0.171?
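The Bayes step described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the setup as stated in the comment (prior of .1, hit rates of .25 and 1/6, 36 rolls per die); the function names are made up for illustration. Computed at full precision, the expected hit% can differ from the table in the third decimal.

```python
from math import comb

def posterior_weighted(hits, rolls=36, prior=0.1, p_weighted=0.25, p_fair=1/6):
    """Posterior probability that a die is the weighted kind, by Bayes' theorem."""
    # Binomial likelihood of the observed count under each hypothesis.
    like_w = comb(rolls, hits) * p_weighted**hits * (1 - p_weighted)**(rolls - hits)
    like_f = comb(rolls, hits) * p_fair**hits * (1 - p_fair)**(rolls - hits)
    return prior * like_w / (prior * like_w + (1 - prior) * like_f)

def expected_rate(hits, rolls=36, prior=0.1, p_weighted=0.25, p_fair=1/6):
    """Expected true hit rate: the two possible rates weighted by the posterior."""
    post = posterior_weighted(hits, rolls, prior, p_weighted, p_fair)
    return post * p_weighted + (1 - post) * p_fair

for hits in (11, 9, 8, 7, 6, 5, 4):
    print(hits, round(posterior_weighted(hits), 2), round(expected_rate(hits), 3))
```

Because the estimate is a weighted average of 1/6 and .25, it can never land outside that range, no matter how extreme the observed count.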


#2    J. Cross      (see all posts) 2010/08/18 (Wed) @ 09:26

oops, that should say, regressing each die’s hit% to the population mean of 0.175 (not 0.171).


#3    Tangotiger      (see all posts) 2010/08/18 (Wed) @ 11:42

J: regression toward the mean works when the distribution follows a bell curve, rather than what my example shows. 

Given that note, the true SD of the population is .025.  The observed SD is .056.  The binomial SD, given 36 trials, is .062.

As you can see, I did not make this example realistic looking at all.  The observed spread should have been:

sqrt(.025^2+.062^2) = .067, and not .056.

Use this as the observed values instead:
11
9
8
7
6
6
6
5
4
2
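For anyone who wants to check the spreads, here is a rough sketch. It assumes population (not sample) standard deviations and a binomial SD computed at p = 1/6, which is what appears to reproduce the .062 figure; those choices are assumptions, not something stated in the comment.

```python
from math import sqrt
from statistics import pstdev

ROLLS = 36

# True hit rates in the pouch: one weighted die and nine fair dice.
p_values = [0.25] + [1 / 6] * 9
true_sd = pstdev(p_values)

# Random (binomial) SD of an observed rate over 36 rolls, assuming p = 1/6.
binomial_sd = sqrt((1 / 6) * (5 / 6) / ROLLS)

# Observed spread we should expect: true and random variances add.
expected_observed_sd = sqrt(true_sd**2 + binomial_sd**2)

def observed_sd(counts, rolls=ROLLS):
    """Population SD of the observed hit rates."""
    return pstdev([c / rolls for c in counts])

original = [11, 9, 8, 7, 6, 6, 6, 5, 5, 4]
revised = [11, 9, 8, 7, 6, 6, 6, 5, 4, 2]

print(round(true_sd, 3), round(binomial_sd, 3), round(expected_observed_sd, 3))
print(round(observed_sd(original), 3), round(observed_sd(revised), 3))
```

The revised counts have a spread in line with what the true and binomial variances predict, which is the reason for swapping them in.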


#4    Tangotiger      (see all posts) 2010/08/18 (Wed) @ 12:01

Regression toward the mean would give these mean probabilities:

0.194 11
0.186 9
0.182 8
0.178 7
0.174 6
0.174 6
0.174 6
0.170 5
0.166 4
0.158 2

J, what does Bayes say?
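The comment does not spell out the arithmetic behind these numbers. One reconstruction that reproduces them, offered as an assumption rather than as the method actually used, is to regress each observed rate toward .175 by (binomial SD / observed SD) squared, using the rounded values .062 and .067:

```python
ROLLS = 36
POP_MEAN = 0.175      # 10% of dice at .25, 90% at 1/6
BINOMIAL_SD = 0.062   # rounded value from the thread
OBSERVED_SD = 0.067   # rounded value from the thread

# Share of each observed deviation attributed to luck.
regression_amount = (BINOMIAL_SD / OBSERVED_SD) ** 2

def regress(count, rolls=ROLLS):
    """Shrink an observed hit rate toward the population mean."""
    observed = count / rolls
    return POP_MEAN + (1 - regression_amount) * (observed - POP_MEAN)

for count in (11, 9, 8, 7, 6, 6, 6, 5, 4, 2):
    print(round(regress(count), 3), count)
```

With these inputs the regression amount is about 86%, so only about 14% of each die's deviation from .175 is kept.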


#5    J. Cross      (see all posts) 2010/08/18 (Wed) @ 14:52

Good stuff.

regression, hits, Bayes

0.194, 11, .200
0.186, 9, .184
0.182, 8, .178
0.178, 7, .174
0.174, 6, .171
0.174, 6, .171
0.174, 6, .171
0.170, 5, .170
0.166, 4, .169
0.158, 2, .168

I’m surprised that the agreement isn’t better, but maybe I shouldn’t be since our population of dice isn’t close to normal.  You can see that for 2 “hits” the regression predicts a true hit rate of .158, which we know doesn’t exist in our population of dice (the Strasburg projection?), whereas Bayes will never leave the range .167-.250.  I’m guessing that Bayes would approach .250 much faster than regression would.

MGL, I’m going to have to tackle your homework assignment when I have a little more time.


#6    Tangotiger      (see all posts) 2010/08/19 (Thu) @ 14:01

J Cross: here’s a more realistic example.

Let’s say you have these 10 dice of various KNOWN weights:
die# true probability of rolling a “1” or “2”
0 0.276
1 0.285
2 0.292
3 0.297
4 0.300
5 0.300
6 0.303
7 0.308
8 0.315
9 0.324

That’s a mean of .3000, with sd of .0132.

And let’s say you observed those 10 dice to see a 1 or 2 on 380 rolls this many times:
die# observed “1” or “2” on 380 rolls
A 95
B 103
C 108
D 112
E 114
F 114
G 116
H 120
I 125
J 133

Note that the order of the first chart has no bearing on the order of the second chart.

Regression toward the mean says to regress the observed values so you get these as the estimated means:
die# Observed Estimate
A 0.250 0.288
B 0.271 0.293
C 0.284 0.296
D 0.295 0.299
E 0.300 0.300
F 0.300 0.300
G 0.305 0.301
H 0.316 0.304
I 0.329 0.307
J 0.350 0.312

What does Bayes say?  I would bet it would be a pretty close match.
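Here is one way to get those estimates, as a sketch only. It assumes the share of each deviation that is kept equals true variance over (true variance + binomial variance), with the true SD of .0132 and mean of .300 taken from the first chart; the variable names are illustrative.

```python
ROLLS = 380
POP_MEAN = 0.300
TRUE_SD = 0.0132

COUNTS = {"A": 95, "B": 103, "C": 108, "D": 112, "E": 114,
          "F": 114, "G": 116, "H": 120, "I": 125, "J": 133}

# Random (binomial) variance of an observed rate over 380 rolls.
binomial_var = POP_MEAN * (1 - POP_MEAN) / ROLLS

# Fraction of each observed deviation that is kept (about 24%).
keep = TRUE_SD**2 / (TRUE_SD**2 + binomial_var)

def regress(count, rolls=ROLLS):
    """Regress an observed rate toward the population mean."""
    observed = count / rolls
    return POP_MEAN + keep * (observed - POP_MEAN)

for die, count in COUNTS.items():
    print(die, round(count / ROLLS, 3), round(regress(count), 3))
```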


#7    J. Cross      (see all posts) 2010/08/19 (Thu) @ 15:14

Indeed, you’re right.  To 3 decimal places they match up perfectly.

die, obs, est, Bayes

A 0.250 0.288 0.288
B 0.271 0.293 0.293
C 0.284 0.296 0.296
D 0.295 0.299 0.299
E 0.300 0.300 0.300
F 0.300 0.300 0.300
G 0.305 0.301 0.301
H 0.316 0.304 0.304
I 0.329 0.307 0.307
J 0.350 0.312 0.312
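The comment does not show the Bayes calculation. A minimal sketch, assuming each of the 10 known dice is equally likely beforehand and the likelihood is binomial, might look like this (the function name is made up):

```python
from math import exp, log

# The 10 known true rates; each is assumed equally likely a priori.
TRUE_RATES = [0.276, 0.285, 0.292, 0.297, 0.300,
              0.300, 0.303, 0.308, 0.315, 0.324]

def bayes_estimate(hits, rolls=380, rates=TRUE_RATES):
    """Posterior mean of the true rate under a uniform prior over `rates`."""
    # Work in logs to keep the tiny binomial likelihoods numerically stable.
    # The binomial coefficient is the same for every rate, so it cancels.
    log_like = [hits * log(p) + (rolls - hits) * log(1 - p) for p in rates]
    top = max(log_like)
    weights = [exp(v - top) for v in log_like]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, rates)) / total

for die, hits in zip("ABCDEFGHIJ", (95, 103, 108, 112, 114, 114, 116, 120, 125, 133)):
    print(die, round(bayes_estimate(hits), 3))
```

As in the weighted-die case, the estimate is a weighted average of the known rates, so it always stays between .276 and .324.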


#8    Tangotiger      (see all posts) 2010/08/19 (Thu) @ 15:37

Well, that is fantastic, isn’t it?

There you go, regression toward the mean is a perfect shortcut.

For those of you interested, the amount of regression is equal to the square of the ratio of the binomial SD to the observed SD.

In the case above, with 380 trials, the SD of the binomial was 0.0235.  The observed SD of the ten rates was 0.0269.  The amount of regression therefore is one divided by the other, then squared, or 76%.

It’s that easy.
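A quick check of that arithmetic, assuming the observed SD is the population SD of the ten observed rates and the binomial SD is taken at the observed mean:

```python
from math import sqrt
from statistics import pstdev

ROLLS = 380
COUNTS = [95, 103, 108, 112, 114, 114, 116, 120, 125, 133]

rates = [c / ROLLS for c in COUNTS]
mean_rate = sum(rates) / len(rates)

# Random (binomial) SD of a rate over 380 rolls, at the observed mean.
binomial_sd = sqrt(mean_rate * (1 - mean_rate) / ROLLS)

# Spread actually seen across the ten dice.
observed_sd = pstdev(rates)

# Share of the observed variance that is luck.
regression_amount = (binomial_sd / observed_sd) ** 2

print(round(binomial_sd, 4), round(observed_sd, 4), round(regression_amount, 2))
```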


#9    J. Cross      (see all posts) 2010/08/19 (Thu) @ 15:53

If instead of those 10 possible dice in our bag, we treat the population of dice as being normally distributed with a mean (m’) of .300 and a stdev of .0132, then the standard deviation of one toss is sqrt(.7*.3) = .4582.  Bayes tells us to do the math as follows:

1) calculate the relative precision (n’) as (.4582/.0132)^2 = 1204

This is the variance of a die roll over the variance in the population, and can be interpreted as saying that our “prior”, the population of dice, gives us as much information as about 1200 die rolls.

Then, using M = sample mean and N = number of tosses, we can get the new estimate, m’’’ (our posterior), as:

m’’’ = m’(n’/(n’ + N)) + M(N/(n’ + N))

so for die A with N = 380 and M = .250 we get:

m’’’ = .300(1204/(380+1204)) + .250(380/(1204+380))
= 0.288

with the new standard deviation in our estimate, s’’’:

s’’’ = .4582/sqrt(380+1204) = .0115

meaning that we have somewhat less uncertainty in the true value of die A than we did prior to collecting data, when we had a stdev of 0.0132.

Anyway, I went through this b/c I think this is ultimately the same math Tango is doing with the regression, no?
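The same update can be written as a short function. This is a sketch under the comment's own assumptions (normal prior, variance of one toss taken as .3*.7); the function and variable names are illustrative. Without intermediate rounding the relative precision comes out near 1205 rather than 1204, which makes no difference to the estimate at three decimals.

```python
from math import sqrt

def normal_update(prior_mean, prior_sd, sample_mean, n_rolls):
    """Normal-prior update: returns (relative precision, posterior mean, posterior sd)."""
    # SD of a single toss, evaluated at the prior mean.
    roll_sd = sqrt(prior_mean * (1 - prior_mean))
    # How many rolls' worth of information the prior carries.
    prior_weight = (roll_sd / prior_sd) ** 2
    total = prior_weight + n_rolls
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = prior_mean * prior_weight / total + sample_mean * n_rolls / total
    post_sd = roll_sd / sqrt(total)
    return prior_weight, post_mean, post_sd

# Die A: 95 hits in 380 rolls.
weight, mean, sd = normal_update(0.300, 0.0132, 0.250, 380)
print(round(weight), round(mean, 3), round(sd, 4))
```

The weight on the prior mean, prior_weight / (prior_weight + n_rolls), is exactly the regression amount.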


#10    J. Cross      (see all posts) 2010/08/19 (Thu) @ 15:55

It is pretty neat. 

1204/(380+1204) = .76

Regression is Bayes is regression.


#11    Tangotiger      (see all posts) 2010/08/19 (Thu) @ 17:19

The regression equation is this:
A = .3 * .7 / (.0132 ^ 2) = 1205

regression rate = 1205 / (1205 + n)

So, when n=380, regression amount = .76

***

Therefore, Bayes is regression toward the mean.  Interesting.


#12    Tangotiger      (see all posts) 2010/08/20 (Fri) @ 15:37

Obviously, we won’t have the true values at the ready.  Therefore, dealing only with observations, you would do the following.  Presuming the observed standard deviation after 380 rolls is .027, with an observed mean of .300, you get this:

1/A = .027^2/(.3*.7) - 1/380 = 1/1191
A = 1191

regression rate = 1191 / (1191 + n)
So, when n=380, regression amount = .76
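The two steps above, as a small sketch. The function names are made up for illustration, and the guard against a non-positive result is an addition not discussed in the thread.

```python
def regression_constant(observed_sd, mean_rate, n_per_sample):
    """Solve 1/A = observed_var / (p * (1 - p)) - 1/n for A."""
    inv_a = observed_sd**2 / (mean_rate * (1 - mean_rate)) - 1 / n_per_sample
    if inv_a <= 0:
        # Observed spread is no wider than pure luck would give.
        raise ValueError("observed variance does not exceed binomial variance")
    return 1 / inv_a

def regression_rate(a, n):
    """Fraction of an observed deviation to regress away after n trials."""
    return a / (a + n)

a = regression_constant(0.027, 0.300, 380)
print(round(a), round(regression_rate(a, 380), 2))
```

Once A is in hand, the same function gives the regression rate for any other number of trials n.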


#13    Dan      (see all posts) 2010/08/20 (Fri) @ 16:20

I don’t know much, but shouldn’t these be similar?

[~= means approximately equal, or proportional to]

bayes ~= observation * prior
regre ~= observation * weighting-to-mean

Aren’t the prior, and the weighting-to-mean, pretty much the same thing? Shouldn’t we expect these to produce very similar results?

Anyone know enough Probability Theory to answer definitively?


#14    Brian Cartwright      (see all posts) 2010/08/20 (Fri) @ 16:51

re 12 - in other words, A = 1191 means you add 1191 units of average performance to n observed?

Towards what Dan asks in 13, the average performance may be league average, or it may be an estimate based on other characteristics.

For example, if measuring a pitcher’s babip, I might regress his observed to the observed of all other pitchers with the same ground ball rate. Or regress the wOBA of a High-A shortstop to all other High-A shortstops, instead of all players at High-A, because we know shortstops as a whole hit less than the group of all players.


#15    Tangotiger      (see all posts) 2010/08/20 (Fri) @ 17:06

It’s whatever population you draw the player from.  Once you decide on the population, you use the mean and SD of that population.


#16          (see all posts) 2010/08/20 (Fri) @ 19:39

yeah, go stats!  i’m getting pumped for my marketing analysis class to start in a week.


#17    Carl Gauss      (see all posts) 2010/08/23 (Mon) @ 14:24

What a beating the statistical dead horse takes when it ambles onto the baseball diamond!

