Wednesday, November 23, 2011
Research conclusions have to be Bayesian
So says Phil.
Lex, that’s weird, never heard of that happening before. If you want to post, e-mail me and I’ll post under your name.
I am a huge proponent of Bayesian approaches, but one of the objections is that you do this thorough, rigorous, and elegant research yielding very precise numbers, regression formulas, p-values, confidence intervals, etc. Then you try and introduce some very valid and very relevant a priori distributions and guess what? You can only guess at the nature of that distribution and all of a sudden you no longer have any clean results. You are left with a mess. At best you are able to say something like, “I am no longer certain about my mean, my confidence intervals, my regression equations, my p-values, etc. In fact, I have no idea what they are anymore. All I can say is that while I thought that my conclusion was sound (given a certain level of certainty), I am no longer sure that it is sound - in fact, I may have reached a completely different conclusion, but I am not too sure about that either.”
I’m not saying that that is a reason not to use a Bayesian approach. I’m just saying that unfortunately, unless you know your prior pretty darn well, it makes the computation and the results extremely messy and uncertain, although it does lead you in a direction which is far better than the direction in which the clean results lead you.
For example, let’s say that we didn’t know the incidence of HIV infection in the population and that we had to guess at it. The statistics involved in figuring out your chances of being infected after one positive test would be a lot messier with another layer of uncertainty added in.
Another example: Imagine that we didn’t know that baseball BA talent was roughly normal (at least as a function of playing time) and additionally we didn’t know the mean or the variance. We could still do a BA projection or estimate of true talent from a sample of AB, but again, it would be messy and uncertain.
That is why some statisticians reject Bayes. They don’t like the messiness and uncertainty associated with it. They are wrong. They are much better off with it than without it in almost all cases. They just have to live with such an imperfect and unclean world…
MGL, do you have any good links to how Bayesian techniques are used in practice? Like, for instance, a simple research study where Bayesian inference was used to good effect?
I don’t know much about how it works in practice ...
No, sorry. But it is pretty straightforward, just like the AIDS test example. If you have an a priori distribution that cannot be approximated with a known shape like normal, Poisson, etc., then you simply use each element separately. Like if you know that 10% of all MLB players are true .305 hitters, 15% are true .290 hitters, etc., and you observe an MLB hitter who bats .400 in 50 AB’s, you figure the odds that a .305 hitter would hit exactly .400 in 50 AB, the odds that a .290 hitter would hit exactly .400 in 50 AB, etc.
You do the math weighted by the 10% and the 15%, etc. and you come up with an exact estimate of the .400 batter’s true talent BA.
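As a rough sketch of that weighting (not anyone's actual projection system): the code below treats each candidate talent level as a hypothesis, scores it with the binomial chance of going exactly 20-for-50, and reweights by the prior share. Only the 10%/.305 and 15%/.290 entries come from the comment above; the other talent levels and weights are hypothetical filler so the prior adds up to 1.

```python
from math import comb

def posterior_over_talents(talents, prior_weights, hits, at_bats):
    """Return the posterior weight of each candidate true-talent BA."""
    joint = []
    for talent, weight in zip(talents, prior_weights):
        # Binomial chance that a hitter of this talent gets exactly `hits` hits.
        likelihood = comb(at_bats, hits) * talent**hits * (1 - talent)**(at_bats - hits)
        joint.append(weight * likelihood)
    total = sum(joint)
    return [j / total for j in joint]

# Only the first two (talent, weight) pairs come from the comment;
# the remaining four are hypothetical filler so the prior sums to 1.
talents = [0.305, 0.290, 0.275, 0.260, 0.245, 0.230]
prior_weights = [0.10, 0.15, 0.25, 0.25, 0.15, 0.10]

# .400 in 50 AB means 20 hits.
posterior = posterior_over_talents(talents, prior_weights, hits=20, at_bats=50)
prior_mean = sum(t * w for t, w in zip(talents, prior_weights))
talent_estimate = sum(t * w for t, w in zip(talents, posterior))
print(f"prior mean {prior_mean:.4f}, estimate after 20-for-50: {talent_estimate:.4f}")
```

The estimate lands a little above the prior mean and nowhere near .400, because 50 AB is weak evidence next to the prior.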
Phil: I did one on Chipper Jones’ chance of hitting .400. Do a search for
Chipper Silver Tangotiger
And you’ll probably find it.
mgl/3: “...The statistics involved in figuring out your chances of being infected after one positive [HIV] test would be a lot messier with another layer of uncertainty added in.” The deal is, you CANNOT answer that question without applying Bayes’s Theorem; knowing the False Positive and False Negative rates of the test DOES NOT tell you what the odds are you have the disease, given a positive result. Here is a real-life example used in a statistics course I tutor: a Fecal Occult Blood test for colon cancer yielded 20 positive and 276 negative results. 2 were true positives, 18 false positives (test says yes, colonoscopy reveals no), 1 false negative, 275 true negatives.
The characteristics of the test are a 1/3 False Negative Rate (33%, i.e., 67% sensitivity) and an 18/293 False Positive Rate (6%, i.e., 94% specificity). Those rates would be expected to be stable no matter what group we tested: random, healthy people or those likely to have cancer. But to answer the question: “If I get a positive test, do I have cancer?” you must calculate 2/20, or 10%, but that rate depends on how many healthy vs. sick people are tested. The more healthy people are tested (lower prior expectation) the lower the odds are a positive result will be associated with actual cancer. This is why study panels have recommended that many screening tests such as the PSA test for prostate cancer NOT be given routinely to those with no symptoms. Doing so practically ensures the false positives swamp the true ones.
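To make the dependence on the prior concrete, here is a small sketch using the counts from the example above (2 true positives, 18 false positives, 1 false negative, 275 true negatives). The function name and the alternative prevalence values in the loop are illustrative assumptions, not part of the course example.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test), by Bayes' Theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Counts from the Fecal Occult Blood example.
tp, fp, fn, tn = 2, 18, 1, 275

sensitivity = tp / (tp + fn)                 # 2/3, so the false negative rate is 1/3
specificity = tn / (tn + fp)                 # 275/293, so the false positive rate is 18/293
prevalence = (tp + fn) / (tp + fp + fn + tn) # 3/296 in this tested group

ppv = positive_predictive_value(prevalence, sensitivity, specificity)
print(f"PPV in the tested group: {ppv:.3f}")  # matches 2/20

# Same test, different (hypothetical) groups: only the prior changes.
for assumed_prevalence in (0.001, 0.01, 0.10, 0.30):
    value = positive_predictive_value(assumed_prevalence, sensitivity, specificity)
    print(f"prevalence {assumed_prevalence:.3f} -> P(cancer | positive) = {value:.3f}")
```

The test's own error rates never change in the loop; the answer to "do I have cancer?" moves only because the share of sick people being tested moves.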
Lex, right when it comes to obvious Bayesian analyses like medical tests, researchers have no problem, especially when the numbers are clear.
However when the numbers are not so clear and subjectivity is involved, a lot of researchers don’t like to deal with bayes or flat out reject it.
For example, let’s say that someone says that they have ESP. A researcher gives them a test, say using those “cards” where they have to guess the symbols without seeing them, and the person shows a guess rate greater than expected by chance at a p=.01 level.
A lot of researchers would leave it at that. But of course this should be a Bayesian problem. The problem is how to assign the prior…
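A sketch of why the prior matters so much here, with every number an assumption made up for illustration: 100 guesses on five-symbol cards (chance rate 0.2), 30 correct, and an alternative hypothesis that a person with "ESP" would hit at a 0.3 rate. The same data is then combined with several different priors.

```python
from math import comb

def tail_p_value(hits, trials, chance):
    """One-sided p-value: chance of at least `hits` correct by guessing alone."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(hits, trials + 1))

def likelihood_ratio(hits, trials, chance, esp_rate):
    """How much more likely the data is under the ESP rate than under chance."""
    misses = trials - hits
    return (esp_rate / chance)**hits * ((1 - esp_rate) / (1 - chance))**misses

def posterior_esp(prior, ratio):
    """Posterior probability of ESP from a prior probability and a likelihood ratio."""
    return prior * ratio / (prior * ratio + (1 - prior))

# All of these values are hypothetical.
hits, trials, chance, esp_rate = 30, 100, 0.2, 0.3

p_value = tail_p_value(hits, trials, chance)
ratio = likelihood_ratio(hits, trials, chance, esp_rate)
print(f"p-value {p_value:.4f}, likelihood ratio {ratio:.1f}")

for prior in (0.5, 0.01, 1e-6):
    print(f"prior {prior:g} -> posterior {posterior_esp(prior, ratio):.6f}")
```

A result near the p=.01 level is convincing only if you walked in thinking ESP was a coin flip; with a one-in-a-million prior the posterior stays tiny.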
#5/#6:
Right, I know about examples like that ... those are fairly routine. I’m interested in situations where you don’t really have a decent prior, and are using a “non-informative prior,” where you just arbitrarily pick one and go with it.
If you already have good reason to use a certain prior, then it’s just routine. But in the “does X cause cancer?” case, it seems like the answer really does depend on your prior, so I can’t imagine how you can avoid spending most of your paper justifying your choice of prior.
Put another way: I can’t imagine even a “non-Bayesian” objecting to the examples in #5 or #6. Those are just standard probability problems. My question is, what kind of techniques ARE the “non-Bayesians” objecting to?
>“My question is, what kind of techniques ARE the “non-Bayesians” objecting to?”
I wondered the same thing. The link in the comments on your website about the objections to Bayes was not very elucidating.
I guess the criticism of Bayesian analysis would have something to do with an experiment like the ESP one. The non-Bayesian researcher would say that we have no idea what the prior is AND that is exactly what we are trying to surmise in the first place by doing the experiment. I would have to disagree. I think that the prior would be critical in an experiment like that. Carl Sagan said, “Extraordinary claims require extraordinary evidence.” That is the essence of Bayes in a nutshell.
Let me see if I can explain some of the objections clearly:
1) Introduction of bias into the results: There is a sense in which researchers want to get a clean assessment of what did or didn’t happen, absent any prior hypotheses or biases. For example, if subsequent research leads to drastic changes in your prior, you still want your original findings to be there.
2) Difficulty in quantifying priors: Using the ESP example here, how, precisely, would you quantify your prior from that? There are many other similar subjects that are studied that don’t lend themselves to a simple quantification like is done in medicine (but far more reasonable than ESP). Guesstimating some number and the garbage that would go into that guess would do far more harm than any good Bayesian techniques would provide you with.
3) Individual studies versus models/answers: In many fields of study people sometimes separate data from theory - physics is an obvious example. This is because unlike baseball analyses, it’s just way too daunting to propose a complete answer to a question based on one or two studies. Typical is that we get our results, then *subsequent* to that, we may introduce the prior as part of interpreting the results in a model. Indeed, some fields have large scale review papers that integrate dozens if not hundreds of studies in a meta-analysis using complex mathematical methods, including Bayes. That is how the task of proving “Extraordinary claims [with] extraordinary evidence” might be done, crucially not just based on one study.
Thanks, Mettle ... that helps.
So “criticizing or being anti-bayes” means criticism when bayes is used improperly....
There’s a biostatistics professor at Hopkins who had this quote on his door (I can’t remember if there was a source, but I’ve found it attributed to Peter Hamer):
“A frequentist uses impeccable logic to answer the wrong question, while a Bayesian answers the right question by making assumptions that nobody can fully believe in.”
Excellent work, Phil. I’ll post another comment on one small piece of disagreement.
MGL/14: I would add a bit to your formulation. Being anti-Bayesian means objecting when Bayesianism is used improperly, with the prior that Bayesianism is almost always used improperly.
One of the things that really impresses me about this site is how rarely statistical inference is misused (I can’t think of any examples). You tell your readers what your priors are. This is in dramatic opposition to the standard practice of just taking a frequentist result and reversing the direction of causality.
Phil,
I have one partial objection to a statement you made in the article. Some fields (I’m only familiar with two examples) have specific conversions between significance levels and the language to be used about them.
In particle physics (one such example), a difference between the measurement and the null hypothesis can be expressed as follows:
5 sigma: “Measurement of”
4 sigma: “Strong evidence for”
3 sigma: “Evidence for”
2 sigma: “Might be evidence for.”
Sigma is the combined measurement uncertainty (statistical and systematic). Systematic uncertainty is one of those areas where one needs to make an argument.
Many fields (e.g., medicine, baseball) do not have sufficient data to make measurements with this separation for most items of interest.
Thanks, Jeremy. I’ll look up “systematic uncertainty” to see what it’s all about.
BTW, from a Bayesian standpoint, isn’t even 1 sigma “evidence for”? It’s weak evidence, but evidence nonetheless.
re: #10
Phil, you may be underestimating non-Bayesian sentiment. Have you read The Theory That Would Not Die?
Nope! Will look it up, thanks!
Still, I have never heard of anyone objecting to Bayesian analysis *when the prior is known*. Are you saying that happens? For instance, are there non-Bayesians who object to the “what is the probability the patient has the disease?” problem?
I can see people “objecting” when the prior is not known and/or requires some subjective consideration, such as with the ESP example, but to ignore the existence and influence of a prior is ridiculous. I can’t imagine a credible researcher doing that.
I’m with MGL. My perception has been that the non-Bayesians are only objecting to the fact that choosing a prior is subjective.
Phil,
If the prior is known, Bayes’ theorem is a mathematical identity. If there is not a well-defined prior, Bayes’ theorem becomes a religion with some interesting sectarian squabbles.
It is important not to accept a hypothesis prematurely, because while some people are capable of rational analysis on open questions, pretty much everybody defaults to heuristics on settled matters. As Mr. Clemens said, it’s not what you don’t know that will hurt you; it’s what you do know that ain’t so.
A one-sigma “signal” is still noise. One sigma is like deciding your dice are lucky or unlucky based on one roll.
A three-sigma result ("evidence") is like seeing your candidate about 8-10 points up (or down) in a single poll. (Statistical uncertainty on most polls is set at 2 sigma = 3.5%, but you have to factor in the error bars on two candidates rather than just one.)
Jeremy,
A one-sigma result is still evidence. Weak evidence, but enough to move your posterior a little bit away from your prior.
A team that wins its first game of the season is exactly 1 SD above the mean, right? Before that game, your estimate of that team being above average in talent was 50%. After that game, it’s a tiny bit over 50%.
But, yeah, you want to make sure the conclusions you draw are commensurate with the strength of the evidence.
MGL,
Choosing a prior can be subjective. I can accept that—any results from a subjective prior are what I would call model-dependent analyses. The goal is to be model independent wherever possible, but I accept that it is not always possible.
The problem is that for a Bayesian analysis to be valid, you can’t be biased in selecting what data to include. In any world where complete population data are not readily accessible, this is not a realistic condition. Anything involving journal publication fails this test—most journals are heavily biased toward positive results rather than “no signal found.” (Particle physics results in a bunch of experimental papers with titles beginning in “Search for. . .” but even there, I would argue that negative results are severely underrepresented in the published record.)
As for not believing any credible researcher would ignore the existence of a prior, I suspect you overestimate the humans involved. If I hadn’t seen exactly the behavior you find impossible to believe both among physicists (who should know better) and elsewhere, I would probably be shocked, too.
I just called a friend who did social-psychology research, who just tried to convince me that randomized trials and properly tested survey instruments ensured that you could take p-values at face value. She got the disease-probability problem right, but couldn’t see how the idea of a prior distribution applied to her experiments.
Phil,
Yes, of course, the results from that one game will shift your posterior away from your prior. I suppose you can call that evidence if you wish.
As for systematic uncertainty, you will be more successful in searching for the synonyms “systematic error” or “bias.” I prefer “uncertainty” to “error” because the casual observer is less likely to misinterpret it. However, despite the unrelenting (last I checked) crusade of one Illinois professor (Steve Errede), the more misleading term continues to be standard.
I think there is some confusion amongst laypeople about what the distinction between “Bayesian” and “non-Bayesian” means. People are conflating the distinction between the Bayesian interpretation of probability and Bayesian inference when they say they are “rejecting Bayes.”
There is no rejecting Bayes when speaking of inference. Bayesian inference is simply any statistical inference that uses Bayes’ Rule. The existence of that rule isn’t itself contentious. By that I mean: given the underlying assumptions about probability, Bayes’ Rule is a mathematical fact about conditional probability that isn’t debated amongst statisticians or probabilists because it has been proven. Bayesian inference is more of a framework than anything else. One can easily convert Bayesian inference to an equivalent frequentist one by setting the prior distribution to have variance zero. Then you have inference in a Bayesian framework which gives results identical to those of the corresponding frequentist inference. It just doesn’t make sense to debate whether Bayesian inference exists or whether it is valid.
What is debated is:
1) whether or not they think it’s APPROPRIATE or USEFUL for the inferential task at hand, or
2) whether the Bayesian interpretation of probability that underlies Bayes’ rule is valid.
When people speak of “rejecting Bayes”, what they are really trying to get at is rejection of the Bayesian interpretation of probability. The question there is whether a “true probability”, if it exists, should be seen as the outcome of infinite realizations of an unchanging random process (i.e. a CHANCE of something happening), or whether it should be seen as the CREDENCE in a specific outcome.
For example, a coin toss is something that can be easily admitted under a frequentist interpretation, because one can easily imagine tossing the coin infinitely many times and deriving a chance from that. But the probability that a crime has been committed based on some evidence—that is not something that one can easily imagine as the outcome of a realization of an infinity of events. Well, at least to Bayesians. Frequentists have a different view on that matter.
The point is: if one speaks of “rejecting Bayes”, one needs to talk about probability interpretations. One can’t speak of “rejecting Bayesian inference” because that’s equivalent to rejecting the existence of Bayes’ Rule, which is absurd.
Gogurt,
Appreciate your summary of the philosophical difference between Bayesians and Frequentists on the deep question, about what probability is.
But are you sure that’s what “bayesian inference” refers to? My impression is it’s not Bayes’ Theorem that’s being rejected, but, rather, the practice of drawing conclusions by initially ("subjectively") choosing a prior.
There are two types of critics, right? (1) philosophical critics, and (2) non-philosophical critics who think Bayesian inference would be fine if only you could make sure you had a reasonably accurate prior.
Or, are you saying that the (2)s are actually (1)s, and saying that NO prior is appropriate because it’s not consistent with frequentism?
My impression is that there are (2)s who are not (1)s. Am I wrong?
I guess I’m one of them, actually. I think I’m a bayesian, philosophically. But if you pull an arbitrary prior out of your hat, I’m going to call you on it, and I might even accuse you of deliberately choosing a prior that makes your case look good.
Am I an anomaly? I assume I’m not. If I’m not, what do you think is the ratio of (2)s to (1)s?
Seriously, I don’t know the answer to that. I’d appreciate your estimate, if you think you have a decent one, to inform my prior.
Hey Phil,
I think we’re just speaking different dialects of the same language. When you say that:
“My impression is it’s not Bayes’ Theorem that’s being rejected, but, rather, the practice of drawing conclusions by initially ("subjectively") choosing a prior.”
You are singling out exactly that group of analyses that uses Bayes’ Theorem. That’s because whenever you specify a prior and seek to update it using data to generate a posterior, you MUST use Bayes’ Theorem. There isn’t another mathematically rigorous way to update beliefs.
My sense is that every statistician, philosophically, is a Bayesian. Basically everyone admits that probabilities exist where a frequentist interpretation isn’t valid (like the crime example I gave earlier). So in that sense, everyone is a Bayesian. But I think that statisticians differ in whether they think Bayesian inferential TECHNIQUES are always appropriate.
Even within the class of statisticians who use Bayesian techniques there are many subgroups. For example:
1) There are those who think that using subjective judgement to form a prior distribution is OK--those are statisticians who use “subjective priors.”
2) There are also those who think that it’s NOT ok to form subjective judgements--those are statisticians who use “noninformative priors” or “flat priors.”
Both groups use a prior distribution that imposes some randomness on our prior beliefs. It’s just that one group thinks it’s OK to skew those beliefs to one side based on judgement, and the other prefers not to taint the analysis.
And then there is a whole category of “naive” versus “empirical” Bayesians. Naive Bayesians believe that prior distributions should only be based on subjective judgement and should not appeal to the data (seems natural, given the definition of what a prior distribution is). Empirical Bayesians think that it’s OK to use some set of data to form a prior distribution which is then subsequently updated again by other data (e.g. using a past estimate based on data as a prior, then updating with current data).
There are lots of variations.
>“You are singling out exactly that group of analyses that uses Bayes’ Theorem.”
I didn’t phrase myself properly. My impression is that there are Bayesians (like me) who are uncomfortable with the idea of choosing a prior that might be way off.
That’s regardless of whether it’s a “subjective prior” or an “uninformative prior.” (To me, uninformative priors are also subjective.)
By your definition, I’m an “empirical Bayesian.” If you’re trying to figure out Pujols’ talent, you want to use what you know about the general distribution of baseball talent. To do otherwise is silly.
Would naive Bayesians really want to ignore the distribution of (other) baseball talent when evaluating Pujols? Or am I misunderstanding the distinction?
To MGL in 3: This seems more of an issue with a frequentist who thinks he’s a Bayesian.
Add those priors to your regression, run an MCMC algorithm, and you are given the correct posterior distribution (assuming the MCMC converged). Credible intervals are then very easy to create, as you have as many samples from the posterior distribution as you care to have. Personally, I find this no more messy than any standard regression method (and more intuitively simple). Just because you don’t have a closed form that can be evaluated doesn’t mean your endless supply of random draws from the posterior distribution is any less correct for describing your distribution.
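For readers who have not seen MCMC, here is a minimal random-walk Metropolis sketch. It is a toy, not a regression: it samples the posterior of a single batting average given a Beta prior and a hits/at-bats record. The data (30-for-100), the flat Beta(1, 1) prior, the step size, and the burn-in length are all assumptions chosen for illustration.

```python
import math
import random

def log_posterior(p, hits, at_bats, a, b):
    """Unnormalized log posterior: Beta(a, b) prior times binomial likelihood."""
    if not 0.0 < p < 1.0:
        return float("-inf")
    return (hits + a - 1) * math.log(p) + (at_bats - hits + b - 1) * math.log(1 - p)

def metropolis(hits, at_bats, a, b, n_samples=20000, burn_in=2000, step=0.05, seed=0):
    """Random-walk Metropolis sampler; returns the draws kept after burn-in."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, hits, at_bats, a, b)
    draws = []
    for i in range(n_samples):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, hits, at_bats, a, b)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        if i >= burn_in:
            draws.append(p)
    return draws

draws = sorted(metropolis(hits=30, at_bats=100, a=1.0, b=1.0))
posterior_mean = sum(draws) / len(draws)
# A 95% credible interval read directly off the sorted draws.
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
print(f"posterior mean {posterior_mean:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

This toy problem has a closed form (a Beta(31, 71) posterior), which makes it easy to check the sampler; the point of MCMC is that the same loop still works when no closed form exists.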
Though I admit I am biased. I am a Bayesian academic who readily admits I am Bayesian almost 100% because of how great MCMC methods (and other sampling methods) are.
Sorry Phil, I don’t think you phrased yourself incorrectly. I think I just misread your post. I’ll address your points in parts:
1) Right, I think that you’re right. People tend to think that using Bayesian techniques is inappropriate because they are scared that the incorrect choice of prior will mess up their results. Bayesians are concerned with this all the time--that’s why a lot of papers published by Bayesians have to do with that: what kind of prior is right for a specific type of analysis.
2) You are absolutely correct that an uninformative prior is subjective in a sense. There are different uninformative priors for different scenarios. A flat, uniform prior (for a binomial probability) is uninformative in a different way than is a Jeffreys prior (the Jeffreys prior is invariant under reparameterization). Choice of the proper uninformative prior is subjective. But it’s more the guiding principle of seeking to inject as little bias into the posterior (via the prior) that is the essence of the noninformative method.
3) Phil, I would call you an empirical Bayesian. But the distinction between empirical and naive is not cut-and-dry.
Everything comes in degrees. Subjective priors will always be based on SOME data, because any belief is necessarily founded on data at some level. If I think that God exists, it’s because I observed some events in my life (data) which sways me in that direction. Yet it’s still considered subjective.
So in that sense, it’s natural to use data to form priors. But keep in mind what a prior distribution is in its purest sense: it’s a SUBJECTIVE belief that is updated using data. Once you use data to form the subjective belief, then you are in some sense “dirtying” the subjectivity of the prior by introducing objective, data-driven inference. That’s the sense in which empirical Bayes is somewhat strange in proper Bayesian inference.
To go with your example, and to illustrate how sometimes it’s just a question of framing the model:
When you use the distribution of other baseball talent to evaluate Pujols’ baseball talent, that model would generally be considered empirical when considered with two levels because you are “objectifying” the prior by using observable data. But a naive Bayesian can come by and re-form that model as a naive one, by interpreting the “general distribution of baseball talent” as driven by yet ANOTHER more subjective distribution (perhaps whether changes in baseball rules during that year biased a parameter which the “general distribution of baseball talent” depends on).
Does that make sense?
In terms of baseball, what would you use for a handedness split as a prior if you had no data at all?
That is, you have a lefty pitcher facing a righty hitter. All you have are their overall stats. You don’t have their own split stats, nor do you even have league split stats.
Just the fact that they are of opposite hands, and your personal experience, testimonials of players, and physics that it’s tougher to hit same-handed pitching.
Let’s make this high school ball in Korea, where we couldn’t even necessarily apply MLB split data.
What prior do you use?
+1 to Perceptron/31.
I strongly get the feeling that a lot of accepted statistical methods (including simulation techniques like MCMC) haven’t yet made their way into the sabermetrics community. I just chalk this up to sabermetrics being new, but I do think there is a lot left to be learned in general. Sabermetricians should open themselves to studying more wide-ranging statistical methods, not close themselves to it because it’s inconvenient or involves nonintuitive mathematics.
I don’t know who the question is being posed to in Tango/33, but the answer is: whatever prior that your completely subjective and non-data-driven beliefs ("Just the fact that they are of opposite hands, and your personal experience, testimonials of players, and physics that it’s tougher to hit same-handed pitching.") lead you to. It’s naive, but that’s just how Bayesian techniques are supposed to work.
Tango/33
To add to what gogurt said, if you are extremely unsure, then make sure your prior has a lot of variability. Once you have enough data, the posterior will be centered around the ‘truth.’ This is essentially using a flat prior, as gogurt mentioned earlier.
For example, if I thought a player was going to hit .300, I could give it a beta(.03,.07) prior, or a beta(300,700) prior. The first is saying we think the probability is pretty much equal anywhere from .000 to 1.000, whereas the latter says it will probably be between .270 and .330.
Now if the player goes 30 for 100, the former will have a beta(30.03, 70.07) posterior and the latter a beta(330,770). The former says the truth is probably between 0.215 and .393, whereas the latter now says it will be between 0.273 and 0.327.
The flat (or uninformative) prior said he could bat anything, and once he actually did something, we had a reasonable (but wide) interval centered around .300. The informed prior already had a pretty informative interval, which got a bit narrower as we added data (and had the prior average of .300 been wrong, the posterior would be shifted according to the data).
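The beta example above can be checked with a few lines. This sketch uses the standard conjugate update (add hits to the first parameter, outs to the second) and approximates the 95% intervals by simulation with the standard library's `random.betavariate`; the helper names, sample count, and seed are assumptions, and the simulated endpoints are approximate.

```python
import random

def update_beta(a, b, hits, outs):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return a + hits, b + outs

def central_interval(a, b, mass=0.95, n_draws=50000, seed=1):
    """Approximate central interval of Beta(a, b) from simulated draws."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    tail = (1.0 - mass) / 2.0
    return draws[int(tail * n_draws)], draws[int((1.0 - tail) * n_draws) - 1]

# The two priors from the comment, both with mean .300.
priors = {"weak beta(.03,.07)": (0.03, 0.07), "informed beta(300,700)": (300.0, 700.0)}

for label, (a, b) in priors.items():
    post_a, post_b = update_beta(a, b, hits=30, outs=70)  # player goes 30 for 100
    low, high = central_interval(post_a, post_b)
    print(f"{label}: posterior beta({post_a:.2f},{post_b:.2f}), "
          f"95% interval about ({low:.3f}, {high:.3f})")
```

The two priors share a mean but carry very different weight: the weak one is worth a tenth of an at-bat, the informed one a thousand at-bats, which is why 100 new AB barely move the second interval.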
I’d like to add a third option to the choices of “rejecting Bayes” and “using Bayes” and that is:
“Not bothering to use Bayes because it’s not worth the effort and has very little value to add in the broad scientific framework I am working in and no one else really uses it anyway.”
That, I would surmise, best describes the reason Bayes isn’t used for about 70% of the papers you see in psychology, neuroscience and the like. So, it has nothing to do with specific rejection of Bayes, per se as you’ll notice that this point has nothing to do with actual statistics.
mettle/37: Actually, I think the real reason that “70% of the papers in psychology, neuroscience, and the like” don’t use Bayesian methods is that they are ignorant of them. Or maybe they know that they exist but don’t have the expertise to implement them efficiently.
I mean, of course psychologists and sociologists and whoever are going to think Bayesian methods are difficult. Because they are psychologists and sociologists, and they study psychology and sociology. But that shouldn’t preclude them from consulting a statistician who studies statistics and whose livelihood depends precisely on making good inferences.
If one avoids using Bayesian techniques because they can’t be bothered to figure out MCMC (for example), that is a shame. MCMC is just the tip of the iceberg. There are tons of well-established statistical techniques that help make sense of data, and furthermore could be readily applied to lots of other scientific fields. If it takes a statistician to help make sense of them, then so be it. Working together is rarely a bad thing.
I should add that I was an undergrad in statistics at a very frequentist school, and never even knew Bayesian statistics existed until grad school (I graduated in 2009, so modern Bayesian analysis was well established). If stats undergrads are ignorant of Bayesian methods, then certainly most psychologists and neurologists are as well.
gogurt, sabermetrics often lacks in advanced and rigorous statistical methods because most sabermetricians are not statisticians and, as you say, it is a relatively new science.
Now, that might actually be a good thing because what you might gain in accuracy, you might lose in transparency. Remember we are talking about a game and not about climate change or some such important thing…
MGL/40:
Noted. There definitely is a trade-off with accuracy and transparency. But I think it all comes down to how “good” is defined. Is “good” wider proliferation of sabermetric thinking, or is “good” more accurate and powerful methods for analyzing baseball?
If it is the former, then you’re right. Maybe we shouldn’t get too deep into the statistics. But if it’s the latter, then I don’t see a reason not to go digging in deeper, often-more-opaque mathematics for more powerful techniques.
Anyway, I know this is opening up a can of worms and promise not to take it any further, but:
How can people not be bothered when a sabermetrician scolds a baseball traditionalist for shunning the use of numbers on one hand but then immediately turn around to argue against more mathematically-sophisticated techniques for the sake of transparency? It just seems somewhat inconsistent to me.
I don’t think “more powerful” techniques would help us. That is, the reason we haven’t embraced more detailed, technical methods is that they wouldn’t tell us much that we don’t already know from simpler methods.
But, prove me wrong. Discover something about baseball using a complicated technique that wasn’t possible with a simple technique. We’re listening.
Phil/42,
You have a point. It may very well be true that there are diminishing returns to more sophisticated statistical techniques.
But… consider your challenge accepted.
-Jimmy
What are “more powerful” methods? A Bayesian hierarchical model probably won’t help us much over a regression, even if the former might be better in general. But there are other methods that could be quite helpful.
But what about methods like hidden Markov models and factor analyses? I have used both for baseball analysis and had interesting results, telling me things regression couldn’t. Both of these and others may have been used occasionally elsewhere, but most of what I have seen is all about plots (weak), summary statistics (stronger) and regression (quite powerful).
Of course, this says nothing of Bayesian vs frequentist. Certainly Bayesian methods are no more ‘powerful’ than frequentist, just different. Thanks to MCMC and similar techniques however, Bayesians can tackle more complex problems very easily (if you have a likelihood, a prior, and some time, then MCMC methods can give the posterior for literally anything).
In terms of regression v Markov: regression is simply a dumbed-down version of Markov. If you had a Markov model, you wouldn’t need regression. It’s simply easier/quicker to use regression. It’s a matter of how much breakdown at the extreme end-points you are willing to live with.
That’s not the argument between Bayes and non-Bayes.
Yes… defining “more powerful” is a problem. But if we knew the answer, we’d be using them already, wouldn’t we? In the end, it really is just about asking the right question.
Also, Tango I am definitely not following you when you say that a regression (what kind of regression?) is a dumbed-down version of an HMM. Care to explain?
-Jimmy
As an example, we know exactly how runs are created in baseball, if you have all the estimated inputs. Markov describes it perfectly.
But, if you rely on regressions (using those same inputs), you won’t necessarily get close to the right answer.
I didn’t say Markov models, but hidden Markov models. An HMM is a mixture model of Markov processes, where the mixture indicator is unknown.
I got off the topic I suppose. Bayes vs non-Bayes has nothing to do with more powerful vs less powerful. Bayesian methods are not more powerful than frequentist, like I said.
#41 I am not arguing against using more sophisticated and rigorous methods in order to discover things we don’t know about baseball or refine things we do know. Come on! Do you think I would reject a valid finding just because it was derived from a sophisticated, opaque method, the way a traditionalist might vis-à-vis an ordinary sabermetric insight or discovery? I might be skeptical if only because of the opacity (and rightly so).
For example, when Andy Dolphin, our co-author of The Book, used a fairly sophisticated technique to identify and quantify clutch talent, Tango and I were thrilled. That being said, I think most people are pretty happy that most sabermetricians use pretty simple and transparent statistical methods. The reason is simple; most sabermetric research is designed to be presented to the average stats oriented fan. Sabermetrics is unlike other scientific disciplines in that regard.
38/gogurt
I think there is general awareness but the things I pointed out in #12 diminish their utility significantly. But you’re right—I wish there were a knowledgeable statistician at my right elbow to work with on that sort of stuff.
I should also correct myself a bit: Researchers do sometimes use MCMC for corrections of p<.05 and generally use Bonferroni correction where appropriate. The former I guess maybe is technically Bayes, though it doesn’t feel like it to me and the latter I’d say isn’t.
@mettle Just to chime in with gogurt#38
As a graduate student in Behavioral Neuroscience, I can tell you that none of the statistics courses I’ve been required to take discussed Bayesian statistics. Furthermore, I have never encountered a paper in my field that uses anything other than frequentist significance testing. If you aren’t working on neuronal spike characteristics, Bayesian statistics is just a name.
51/Ian
Perhaps look harder.
MCMC is the de facto standard for significance correction in functional imaging studies.
(I do hope your stats classes at least discuss Bonferroni correction - I think that’s pretty standard.)
Also, if you have the chance to read any meta-analysis papers, combining many neuroscience studies into one, you’ll sometimes find Bayesian methods.
It may be the case that some neuroscience students don’t know what is going on when they click a button on their software analysis package, but some of these methods are baked into the code.
52/mettle
I don’t work on fMRI data, so fair enough; it’s a big field. Yes, the Bonferroni correction was taught, but I’m not sure I follow you on the connection with Bayesian statistics. I think it’s safe to say one can teach the adjustment from a purely frequentist viewpoint. Here’s my understanding: If the null hypothesis is true, one would expect to make a type I error in one out of twenty repeated trials (assuming an alpha of .05). But do twenty tests and the chance of getting at least one significant p-value is .64. The Bonferroni correction reduces the alpha level to account for this. I’m not trying to be pedantic in walking through this, I’m just trying to understand where Bayes comes in.
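The arithmetic in this comment can be checked with a short Python sketch. It assumes the twenty tests are independent and that every null hypothesis is true; the function name `familywise_error_rate` is made up for illustration and is not from any package mentioned in the thread.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent
    tests when every null hypothesis is true (an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha = 0.05
n_tests = 20

# No correction: each test is run at alpha = .05.
uncorrected = familywise_error_rate(alpha, n_tests)

# Bonferroni correction: each test is run at alpha / n_tests instead.
bonferroni_alpha = alpha / n_tests
corrected = familywise_error_rate(bonferroni_alpha, n_tests)

print(f"uncorrected familywise error rate: {uncorrected:.2f}")
print(f"per-test alpha after Bonferroni:   {bonferroni_alpha:.4f}")
print(f"corrected familywise error rate:   {corrected:.3f}")
```

Under those assumptions the uncorrected rate comes out at about .64, matching the figure above, and the corrected rate stays just under the original .05. Nothing in the calculation involves a prior, which is consistent with the point that the adjustment can be taught from a purely frequentist viewpoint.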
53/ It doesn’t ... I was just asking, hence the parentheses.
MGL/49:
“The reason is simple; most sabermetric research is designed to be presented to the average stats oriented fan. Sabermetrics is unlike other scientific disciplines in that regard.”
Really? I dispute that sabermetrics is “designed to be presented to the average stats oriented fan.” My understanding of sabermetrics was always that it had one goal and one goal only: to improve quantitative analyses in baseball. And by “improve” I mean more accuracy, more precision, more inferential potential. I mean, how else can one possibly define “improve analyses”?
It seems that you are defining “improve analyses” by whether it is digestible to the average sports fan. Now I agree that it’s admirable to make things easily-understandable, but I don’t see that as the main goal of sabermetrics. If a sabermetric method happens to be transparent, then that’s a bonus. But it’s not the real goal.
Anyway, I agree that this has nothing to do with the whole Bayesian vs. non-Bayesian argument. I apologize for getting off on a tangent like this.
---
Also, I just want to clear something up. MCMC is not a “significance correction” technique. It can be used for such purposes in combination with other methods, but at its core MCMC is a TOOL used to simulate the posterior distribution of complicated models, which usually happen to be Bayesian because… conditional probability is involved, which means Bayes’ theorem is involved. It is not a model in and of itself.
For example, if I have a joint distribution f(X,Y) and I wish to obtain the marginal distribution f(X), then I would integrate f(X,Y) with respect to Y. In the general case with many variables, this is not always easy to do. So MCMC is a way to get around that by using Markov chains to “draw” out the stationary distribution.
This is directly related to Bayesian statistics. Basically, any time you have a Bayesian model with complicated joint distributions, MCMC is the go-to method for getting the marginal. People use it all the time. It’s just that people might not write about it in papers because it’s just a tool, like integration.
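A minimal sketch of that idea in Python, using a random-walk Metropolis sampler (one simple member of the MCMC family). The joint distribution here is an assumption chosen so the answer can be checked: a standard bivariate normal with correlation 0.8, whose marginal f(X) is known to be standard normal. The names `log_joint` and `metropolis` are illustrative only.

```python
import math
import random

RHO = 0.8  # assumed correlation between X and Y, for illustration only


def log_joint(x: float, y: float) -> float:
    """Unnormalized log density of a standard bivariate normal with
    correlation RHO. MCMC only needs the density up to a constant."""
    return -(x * x - 2.0 * RHO * x * y + y * y) / (2.0 * (1.0 - RHO * RHO))


def metropolis(n_samples: int, step: float = 1.0, burn_in: int = 1000, seed: int = 0):
    """Random-walk Metropolis: returns n_samples (x, y) draws whose
    long-run distribution is the joint f(X, Y)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    current = log_joint(x, y)
    draws = []
    for i in range(n_samples + burn_in):
        # Symmetric Gaussian proposal around the current point.
        x_new = x + rng.gauss(0.0, step)
        y_new = y + rng.gauss(0.0, step)
        proposed = log_joint(x_new, y_new)
        # Accept with probability min(1, f(new) / f(current)).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            x, y, current = x_new, y_new, proposed
        if i >= burn_in:
            draws.append((x, y))
    return draws


draws = metropolis(50_000)

# "Integrating out" Y is just ignoring the y coordinate of each draw.
xs = [x for x, _ in draws]
mean_x = sum(xs) / len(xs)
var_x = sum((x - mean_x) ** 2 for x in xs) / len(xs)
print(f"marginal of X: mean ~ {mean_x:.2f}, variance ~ {var_x:.2f}")
```

The sample mean and variance of the x values should land near 0 and 1, the known marginal in this toy case. In a real model the integral would not have a closed form, which is where the sampler earns its keep.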
Excellent blog. I don’t think it was too long.
To amplify on the “routine test” example, if the test were only given to those exhibiting clear symptoms of the disease, with perhaps a 50% likelihood of actually having it, a positive result would be more important: 1000 take the test, 500 have the disease, results are 50 false positives and 500 true positives; odds you have the disease (given a positive test) are now 10-to-1 in favor. The misuse of medical screening tests designed for those with symptoms, applied to healthy people with no symptoms, is a serious problem resulting from, essentially, ignorance about Bayes’ Theorem. It is impossible to calculate the answer to “if the test is positive, what are the chances I have the disease” without some prior estimate.
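The numbers in this example can be reproduced with a few lines of Python. The sensitivity (100%) and false positive rate (10%) are inferred from the counts given above, 500 true positives among 500 sick people and 50 false positives among 500 healthy people. The 0.1% prior used for the "healthy people with no symptoms" case is purely a hypothetical value for contrast, not a figure from the post.

```python
def posterior_given_positive(prior: float, sensitivity: float,
                             false_positive_rate: float) -> float:
    """Bayes' theorem: P(disease | positive test)."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Rates implied by the counts in the comment (500/500 and 50/500).
SENSITIVITY = 1.0
FALSE_POSITIVE_RATE = 0.10

# Symptomatic patients: 50% prior, as in the comment.
symptomatic = posterior_given_positive(0.50, SENSITIVITY, FALSE_POSITIVE_RATE)
odds = symptomatic / (1.0 - symptomatic)

# Routine screening: hypothetical 0.1% prior, chosen only for illustration.
screening = posterior_given_positive(0.001, SENSITIVITY, FALSE_POSITIVE_RATE)

print(f"symptomatic: P(disease | positive) = {symptomatic:.3f}, odds about {odds:.0f}-to-1")
print(f"screening:   P(disease | positive) = {screening:.3f}")
```

With the same test, the answer moves from roughly 10-to-1 in favor to about a 1% chance once the prior drops, which is the point about needing some prior estimate before the question can be answered at all.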
Phil, I am unable to post comments on your website, the “verification” letters do not appear on my home PC (Windows XP) or work (multiple Windows 7 systems.)