
THE BOOK--Playing The Percentages In Baseball


Saturday, October 10, 2009

An AIDS vaccine trial and statistical significance

By , 05:42 AM

Here is the link to the WSJ article on the study:

http://online.wsj.com/article/SB125511780864976689.html?mod=WSJ_hpp_MIDDLTopStories

I’ll try and summarize the findings and the controversy:

Apparently in drug studies, they often (or always, or sometimes) report 3 different data sets.  The first is most enrollees, including non-compliant ones.  The second includes those who didn’t follow some of the guidelines (a smaller number than group one), and the third is those who followed all of the guidelines, the fewest subjects of course.  They compare the results (in this case, the rate of HIV infection) of all three groups to the placebo group.  There were about 8200 subjects in the first group.  I don’t know over what time period the study took place.  The article did not say.

What they found was that the difference between the control (placebo) and non-control subjects for group I was significant at p=.039, for group II it was p=.076, and for group III it was p=.16.  So the results (difference in rate of infection between the drug-taking group and the placebo-taking group) were statistically significant (at the 5% level) for group I, almost significant for group II, and not even close to significant for group III.

The controversy was that originally the researchers only reported the results for Group I.

According to the article:

The results announced last month were based on a “modified intent to treat analysis,” which includes virtually everyone who enrolled in the study, regardless of whether they ended up getting the full course of the vaccine. It is a good stand-in for the real world, where people don’t always follow instructions properly.

Any results that include subjects who didn’t take the drug or hardly took it at all (group I) obviously have more noise in them.  That is the last group you want to use to report your results.  The second group apparently includes subjects who took the drugs to some degree, but did not take them exactly according to the protocol, or some such thing.  Again, you are likely to have more noise in group II than in group III.  I suppose you could make the argument that this group includes people who took the drug to some extent and that they might have gotten some benefit.

Basically, it seems to me that ONLY reporting the results from group I, which were significant at the 3.9% level, while the results for group III (the “pure” group) were only significant at the 16% level (which is not very good at all for a medical study I would think), is a gigantic ethical breach, and I can’t imagine that a researcher would be allowed to do such a thing.

Aren’t there rules for how you are supposed to report the data?

From the article:

“I think in general it’s best to lay out as much data as possible,” said Barton Haynes, director of the center and an HIV vaccine expert at Duke University, who was at the meeting. “This is a very difficult situation for everyone, and we’ll have to wait until all the data are released so we can drill down into it.”

Isn’t that what we always say?  Present ALL the data and let the readers decide for themselves?

Anyway, as we always say, there is no magic number for a result to be significant.  Isn’t that especially true for medical studies when you are dealing with potentially saving lives?  If you do a study and get results significant at the p=.16 level, shouldn’t you simply redo the study?  Even if you get a very significant result, don’t you always have to redo a study before a drug is approved?  Again, I don’t know how long this study took.  Since they were testing a vaccine, it probably took a long time so it may not be all that practical to do another study.  Don’t you sometimes have to do multiple studies on drugs in order to tweak the protocols, dosages, etc.?

Aside from comments, which are more than welcome of course, I’ll pose this question to all the smart minds on this blog:  Let’s say that you do a similar study and it takes 5 or 10 years.  What kind of results do you need before you allow the drug to be sold to the general public?  Clearly you will eventually have a chance to re-check the results on a much, much larger sample in another 5 or 10 years if it is approved, but what do you need to approve the drug, at least on a “temporary” basis (if that is even allowed)?  1%?  5%? 10%? 15%? 20%?

Do we want a whole bunch of drugs out there when we KNOW that some percentage of them will not work and only succeeded in a trial or trials by chance?  Do we put as many drugs out there (assuming that we think they are safe and there are few if any alternatives) as possible and then collect more data on them and if it turns out that they are not effective, we remove them from the market?  Should it depend on how life-saving the drug is (and how dangerous it might be) and how long it takes to conduct these trials?  Do you think there are lots of drugs out there that are completely ineffective but passed a series of trials by chance alone?  Would we even know it, especially if it is a drug that is not supposed to have a high success rate (like some of the cancer drugs which may or may not work, or some drugs whose effects are not necessarily noticeable, like some of the anti-depressants)?  And since most drugs have a fairly significant placebo effect, there may be many drugs out there whose only value is in their placebo effect.
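To put rough numbers on the "passed a trial by chance" question, here is a minimal sketch. Every input is an assumption made up for illustration: the share of candidate drugs that truly work (10%), the power of a trial (80%), and the function name itself. None of it comes from the article.

```python
def share_of_approvals_that_are_duds(prior_effective, alpha, power):
    """Fraction of trial 'successes' that come from drugs with no real effect.

    prior_effective: assumed fraction of tested drugs that truly work
    alpha: significance threshold (chance a useless drug passes anyway)
    power: chance a truly effective drug passes the trial
    """
    true_passes = prior_effective * power
    false_passes = (1 - prior_effective) * alpha
    return false_passes / (true_passes + false_passes)


# Hypothetical inputs: 10% of candidate drugs really work, and power is 80%.
# Power is held fixed for simplicity; in a real trial it rises as alpha is loosened.
for alpha in (0.01, 0.05, 0.10, 0.15, 0.20):
    share = share_of_approvals_that_are_duds(0.10, alpha, 0.80)
    print(f"threshold {alpha:.0%}: {share:.0%} of passing drugs do nothing")
```

Under these made-up inputs, 36% of the drugs that pass a single trial at the 5% level would be duds, which is one reason the answer depends so heavily on how many candidates work in the first place and on whether the trial is repeated.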


#1    Nick      (see all posts) 2009/10/10 (Sat) @ 07:38

It seems weird that the P-Value gets higher as the “purity” of each group increases.  Wouldn’t you expect higher significance from a sample with less noise? 

As to your question, it obviously depends on the situation.  In this instance, since we are talking about a vaccine for AIDS, there are obviously no alternatives and the upside far exceeds the downside.  Is there really much to gain from holding the drug back 10 years for more testing?  It may very well prove to be right or wrong, but by that time thousands of people will have died - when at least a certain percentage of them may have been helped by the vaccine.  And as you say, the P-Value thresholds are arbitrary - ideally you would just want to incorporate it into a Bayesian problem.  As long as there is a decent chance the drug works for any significant percentage of the population, why not?

If it was a cosmetic drug, or one that would be harmful towards future studies if it was released and found to be faulty, then I would say no.  But I can’t really think of a downside towards releasing the AIDS vaccine.


#2    Guy      (see all posts) 2009/10/10 (Sat) @ 08:37

I totally agree that all the data should be released.  No argument there.

But I would read this in the opposite way you do.  The really unethical thing would be to release results only for the smaller group that was fully compliant.  That is the LEAST important group, not the most.  Two reasons:  1) you will never get full compliance in real world, so can’t replicate the result, and 2) risk that those who are compliant are a non-random subset whose better outcomes have nothing to do with the treatment. 

In general, you want to look at the entire treated group.  If there is a small group who truly got zero treatment, and no conceivable benefit (in group I but not group II), you could argue for their exclusion.  But maybe being involved in this process had a hidden benefit (affecting behavior), so hard to say.  And the p for group 2 is .07, close to significant, so even if you excluded the truly zero-treatment folks, results are probably significant.

In general, the results look encouraging to me.  The success rate is very similar in all 3 groups, which means the only reason for “insignificant” results in groups 2 and 3 is smaller sample size.  If group 3 showed zero or small impact, then I would agree that it’s likely the group I result is a fluke.
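A minimal sketch of the sample-size point, using a pooled two-proportion z-test and only the standard library. The counts are hypothetical (a 1.0% placebo infection rate against 0.7% for the vaccine); they are not the trial's data, and the function name is my own.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: the same 30% reduction in both scenarios, only the arm size differs.
p_large = two_proportion_p_value(80, 8000, 56, 8000)  # 8000 subjects per arm
p_small = two_proportion_p_value(40, 4000, 28, 4000)  # 4000 subjects per arm
print(f"large sample: p = {p_large:.3f}")
print(f"small sample: p = {p_small:.3f}")
```

With identical infection rates in both scenarios, the larger sample comes in under 5% (about 0.04) and the half-size sample does not (about 0.14), so a rising p-value by itself says nothing about the effect shrinking.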

And as you often remind us, let’s not fetishize “significance.” It’s solely a function of sample size and size of effect.  If there was something in the article that should draw an MGL rebuke, it’s this:

“The “per protocol” analysis also showed that the supposed effectiveness was lower, at 26.2%. Dr. Kim, of the U.S. Army, declined to comment on the data. It isn’t clear why the vaccine was seemingly ineffective among participants who followed the guidelines to the letter.”

This is bad reporting. There’s no way the 26% success rate in group 3 is statistically different from the group I rate, given smaller sample size.  More importantly, p=.16 does NOT mean the vaccine was “ineffective” in group 3, only that sample size prevents us from being certain of its effectiveness.  (Now, depending on the cost and possible side effects, and considering bad prior results on similar vaccine, I agree we may not want to start vaccinating people without more testing.  But calling it “ineffective” based on p=.16 is wrong.)

The benefit here is modest (if it exists), which makes the result “insignificant” in this test.  But if true, it’s still well worth knowing—a bit more important than clutch hitting or preventing BABIP!


#3    jinaz      (see all posts) 2009/10/10 (Sat) @ 10:29

I’m not in drug testing.  But the general guideline according to my training (I’m a biologist) is to decide before the study begins what data you will release in the publication.  You don’t let the data determine what goes into the paper.  Now, it’s often the case that things happen and you have to add alternative analyses to address unexpected confounds or surprises.  And sometimes it becomes necessary to drop less informative parts of a study for the sake of space or focus in the paper. 

But I’m not sure that this is the same thing as happened in this paper.  It sounds like their data were noisy, and they chose the analysis that showed the most significant-looking results.  That is NEVER supposed to be what determines how you analyse your data.  Transformations, model fitting, parametric vs. nonparametric, etc, are all data-driven decisions.  But you can’t choose which dataset to use based on which one shows the most significant results!
-j


#4    Guy      (see all posts) 2009/10/10 (Sat) @ 10:57

"But you can’t choose which dataset to use based on which one shows the most significant results!”

Agreed.  But let’s be clear:  this is not a case where the result in the full treatment group is “not significant” because the vaccine had little or no effect, whereas the vaccine appears to help in the larger sample.  In that case, it would be very misleading to report only the larger sample.  But here, the three samples all show roughly the same positive effect for the vaccine.  The only difference is sample size—all we’re seeing is that a 25-30% reduction in AIDS will not be significant given the size of the full-treatment sample.  And the plan may always have been to focus on the full sample, which is indeed the most important one.


#5    jinaz      (see all posts) 2009/10/10 (Sat) @ 11:05

In that case, they need to at least report that subsets of the data were not significant, and demonstrate that the problem was statistical power (which can be specifically reported), not the direction or size of the effect. -j


#6    Depot      (see all posts) 2009/10/10 (Sat) @ 12:22

Actually, the number you want is the group 1 number.  The other numbers aren’t meaningful because they include the selection effect (people adhering to all the guidelines are different from those that aren’t).  The division between group 1 and the control group is completely random (I assume).  The others are not and, consequently, those results are biased...and rather meaningless.

You really care about 2 numbers.  First, you want the difference in the two groups (what they report).  Second, you want that number scaled by the number of compliers in each group (technically, this is a Wald estimate or an IV estimate).


#7    MGL      (see all posts) 2009/10/10 (Sat) @ 14:47

I hadn’t thought about the bias of the compliant group.  It is true that once you eliminate some people for any reason (other than randomly eliminating them) from one of the groups, all of a sudden your control group and your experimental group are no longer equivalent.  For example, the people who were compliant may have had a better lifestyle than those who were non-compliant, such that they are at a lower risk for AIDS with or without the vaccine.

On the other hand, let’s say that I have two groups. One is all the people in my experimental group, including those who never took the drug at all.  The other group is those who were 100% compliant or somewhat compliant.  The first group shows a significant difference in infection rate (from the control group) and the second group does not.  You would have a hard time convincing me that the drug had any effect at all, since the effect was only shown in the people who didn’t take the drug at all.

IOW, it is not really helpful to include people who didn’t take the drug, assuming that you are trying to determine the efficacy of the drug.  That is true even though Group I mimics what happens in real life.

Actually, I take all that back.  Don’t you do the same thing with the control group - the ones taking a placebo?  Split them into 3 groups - one, all of them, two, those who were somewhat compliant, and three, those who were 100% compliant?  That way, your experimental and control groups for all three groups are equivalent.

In any case, all the data should be released and then the people who make the decisions about which drugs should or should not be released and what further studies, if any, are warranted, etc., can make the appropriate decisions.  That should be obvious.

I don’t have any beef with the article.  The article was about the beef that people had with the researchers only releasing part of the data, obviously to make the study look more successful than it was. I agree with that beef.  All data need to be released.  Why wouldn’t they be?


#8    Pizza Cutter      (see all posts) 2009/10/10 (Sat) @ 14:56

I’ve actually consulted on a (real) research project involving AIDS treatment, although certainly not this one and not one involving a vaccine.  One of the things to know about HIV/AIDS treatment is that it’s a weird virus in a number of ways.  If you’re going to do treatment for HIV, you are best off if you do the full course of the treatment.  That’s intuitive.  What’s weird is that if you are not 100% compliant, you are actually better off not having the treatment at all, rather than doing it part way.  (Our group is looking at what motivates people to do full treatment vs. partial.) I don’t know if the same effect applies here in the vaccine study, but these folks are certainly steeped in that research, and that might be what’s in the back of their minds as to why they are releasing things in stages of treatment purity (industry term: “compliance”).

On the philosophical issue, of course, everyone would be happier if the p value was 0.0000000001, and there would be no questions.  But, whether or not to release the vaccine is a question denominated in yes or no.  At what point do we say that yes there is an effect?  As I used to teach my Intro Stats students, 5% is as good a place as any.

But most of all, let’s consider the motivations here.  The group that discovers an AIDS vaccine is probably going to win a Nobel Prize.  (Let’s pause for everyone to make a B-Rock joke.) I hate to say this, but researchers do stuff like this all the time.  There are plenty of statistical ways to cook the books, and here, the chance at personal glory might have been too good to pass up.

Also, anti-depressants work just fine.  At least the ones that are most commonly prescribed.


#9    Depot      (see all posts) 2009/10/10 (Sat) @ 15:15

MGL,

This is why you want to adjust for the compliance rate.  If you only look at people who comply, then you might as well not do a random trial.

Say you give each person in the treatment group a drug to take.  Only 80% actually take it.  And assume that 0% in the control group get it (this isn’t necessary, but it’s easier this way).  The difference in outcome is X.  There are reasons to care about X.  If you’re just studying the effect of the drug, then you want X/.8.
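The X/.8 adjustment can be sketched in a couple of lines. This assumes the simple case described above (uptake differs between arms and the drug does nothing for people who don't take it); the function name and the 0.004 effect size are hypothetical.

```python
def wald_estimate(itt_effect, uptake_treatment, uptake_control=0.0):
    """Scale the intent-to-treat difference by the gap in uptake between arms."""
    uptake_gap = uptake_treatment - uptake_control
    if uptake_gap <= 0:
        raise ValueError("treatment arm must have higher uptake than control arm")
    return itt_effect / uptake_gap


# Hypothetical X: the treatment arm's infection rate is 0.4 percentage points lower.
itt_effect = 0.004
effect_among_takers = wald_estimate(itt_effect, 0.8)
print(f"effect per person who actually took the drug: {effect_among_takers:.4f}")
```

The comparison between arms stays fully randomized; only the scaling changes, which is why this avoids the selection problem of looking at compliers alone.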

In any study, you have to choose which numbers to report.  Why report the meaningless ones?  Maybe it is, in fact, typical to report all 3 of those numbers in medical studies.  But why?  Only the 1 has any meaning.  The other 2 don’t tell us anything.


#10    MGL      (see all posts) 2009/10/10 (Sat) @ 15:49

Depot, I don’t get what you are advocating.  Which 1 has any meaning and which ones don’t tell you anything?

The only group which has any meaning in terms of actually testing the drug is the 100% compliant group.  The only value in the result from people who took the drug but were not 100% compliant is in the fact that there might be some benefit from taking the drug even if you were not 100% compliant.  But there is no value in including people who didn’t take the drug at all.  That only adds noise to and dilutes the results.

It is true that you are introducing a bias in the treatment group by eliminating those that were not compliant, but as long as you eliminate those same people in the control group (they don’t take their placebos properly or at all), then you still have a proper study where both the control and experimental groups are equivalent.  So, your statement, “If you only look at people who comply, then you might as well not do a random trial” is not correct.  You can do anything you want in an experimental group as long as you are able to do the same thing in the control group and you still retain the “randomness” between the two groups.

The fact that the treatment group which includes people who don’t even take the drug mimics reality is not relevant to your study.  You are trying to see if the drug has any effect.  Including people who didn’t take the drug in the results makes no sense at all in terms of testing the efficacy of the drug.

Depot, I don’t understand your post at all.


#11    Dan Brooks      (see all posts) 2009/10/10 (Sat) @ 17:44

Not to interject in the discussion, this is somewhat offtopic: The idea that “most drugs have a fairly significant placebo effect, there may be many drugs out there whose only value is in their placebo effect” is overstated, I think.

The placebo effect is usually understood to be some psychological/physical effect on the progression of an illness from the mere belief that one is taking an actual drug to treat that illness.

But, in reality, “The Placebo effect” probably accounts for only a small fraction of changes in the control group. For example, improvement of a patient’s condition from taking a sugar pill can be due to a host of different factors: time, accessibility to the hospital in which he is being administered the fake treatment, etc.

It’s a good idea to include control (“Placebo”) conditions in studies because they act as a great comparison for the real medicine, but I think the actual benefits of taking the placebo pill are usually overstated.

Unless you’re marketing “Head On! Apply Directly to the Forehead!”, in which case the Placebo Effect nets you millions of dollars.


#12    MGL      (see all posts) 2009/10/10 (Sat) @ 18:00

Dan, yes I agree.  When I say placebo effect, I am including the effect of doing nothing.  As you say, in many cases, illnesses go away by themselves. I have even heard numbers like 30% of the time (obviously it depends on what the illness is and what constitutes “going away” or “improvement”).  So when I say that there are probably many drugs that do nothing other than the “placebo effect,” what I mean is that there are probably many drugs that do nothing, period, but people think they “work” because of all the reasons why an illness or malady gets better, including, but not limited to, the psychological effect of taking something that you think might work.

Pizza, I know you are a psychologist, but I mentioned anti-depressants for several reasons. One, the only way we know whether they “work” or not, is from self-reporting data.  That is more than a little questionable.  Talk about placebo effect.  (I realize that the clinical studies compare one group to another group taking a placebo.)

Two, it is not like an infection which goes away after you take an antibiotic.  It is often very difficult for people to determine whether they feel any better when taking anti-depressants, especially when their feelings and moods change all the time from day to day and with their experiences, with or without medication.

And three, there are some recent studies I read (at least one, that is) which suggest that anti-depressants have little or no effect on those people with anything but severe depression.  I’ve had 20 years of personal experience with many anti-depressants as well as with many friends and family members (depression runs in my family).  I have done a lot of research on them.  I also have an undergrad degree in psych and went to graduate school for a while in clinical psych.  So I am not just talking out of my derriere.  The efficacy of anti-depressants is FAR from an exact science or a “sure thing.”


#13    Guy      (see all posts) 2009/10/10 (Sat) @ 18:49

"The only group which has any meaning in terms of actually testing the drug is the 100% compliant group.  The only value in the result from people who took the drug but were not 100% compliant is in the fact that there might be some benefit from taking the drug even if you were not 100% compliant.  But there is no value in including people who didn’t take the drug at all.”

This is close to backwards.  The most important group is the entire sample, while the least important—by far—is the 100% compliant group.  And there absolutely can be value in including totally non-compliant people:  suppose people are non-compliant because they are at less risk.  Or suppose compliant people engage in more risky behavior because they believe the vaccine protects them.  In either case, excluding the non-compliers will reduce not increase your accuracy.  Comparing everyone in the treatment group to the entire control group is your best comparison.

Now, maybe you can compare real compliers to placebo “compliers” as you suggest.  I don’t know if that’s ever done. But you then have some risk that the two groups differ in some way relevant to outcomes—by chance, or some systemic difference you’re unaware of.  For example, suppose certain kinds of people have a bad early reaction to the treatment, and that’s why they aren’t 100% compliant?  The placebo won’t have the same side effects, so the placebo compliant group won’t really be the same.  Again, your best comparison is the full sample. 
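Here is a small expected-value sketch of that side-effect scenario. All the numbers are invented, and the vaccine in this model does nothing at all. The only thing that differs between arms is who stays compliant: high-risk people drop out of the vaccine arm more often than out of the placebo arm.

```python
def infection_rate(group):
    """Expected infection rate for a list of (population share, risk) pairs."""
    total = sum(share for share, _ in group)
    return sum(share * risk for share, risk in group) / total


RISK = {"low": 0.005, "high": 0.015}  # hypothetical baseline infection risks
MIX = {"low": 0.5, "high": 0.5}       # same mix in both arms, thanks to randomization

# Hypothetical compliance rates; high-risk people comply less in the vaccine arm only.
COMPLY_VACCINE = {"low": 0.9, "high": 0.5}
COMPLY_PLACEBO = {"low": 0.9, "high": 0.9}


def arm_rates(comply):
    """Return (rate for everyone randomized, rate for compliers only)."""
    everyone = [(MIX[k], RISK[k]) for k in MIX]
    compliers = [(MIX[k] * comply[k], RISK[k]) for k in MIX]
    return infection_rate(everyone), infection_rate(compliers)


itt_vaccine, pp_vaccine = arm_rates(COMPLY_VACCINE)
itt_placebo, pp_placebo = arm_rates(COMPLY_PLACEBO)
print(f"everyone:       vaccine {itt_vaccine:.4f} vs placebo {itt_placebo:.4f}")
print(f"compliers only: vaccine {pp_vaccine:.4f} vs placebo {pp_placebo:.4f}")
```

The full-sample comparison correctly shows no difference, while the compliers-only comparison shows a lower infection rate in the vaccine arm even though the vaccine is inert in this model. Flip the compliance rates and the bias runs the other way.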

And I don’t think the information in this article gives us reason to be as cynical about the researchers as Pizza suggests.  The result they released is the most important, and may be what they always planned to focus on.  The group 3 result is consistent with the result of the larger sample, so hardly needs to be hidden even if you doubt the ethics of the researchers.  The people criticizing the study appear to be those who had predicted its failure and argued against the trial in the first place—they may have an axe to grind.  The researchers may have acted unethically—I don’t have enough information to say—but nothing in the article convinces me they did.


#14    MGL      (see all posts) 2009/10/10 (Sat) @ 19:04

Guy, I said as long as both the experimental and control groups are equalized, the best group to study is the compliant group only.  There is ZERO value in including the non-compliant persons other than to balance the control and experimental groups.  Since the control (placebo) group should have non-compliant persons as well (for the same reasons as the experimental group), as I said, this is the ONLY group to use.


#15    Guy      (see all posts) 2009/10/10 (Sat) @ 19:32

MGL:  Read my 2nd paragraph.  You can’t be sure that the treatment compliant group and the placebo compliant group are identical.  And just by chance, you might get higher-risk people as the control compliers and lower-risk people as the treatment compliers (or vice-versa).  And I can’t even imagine why you want to exclude partially compliant participants:  if you don’t even know if a treatment works (and why else does the trial exist), how can you possibly know that only 100% compliance works?

It’s difficult if not impossible to ensure your two compliant groups are identical.  Focusing on them introduces more risk of bias than it removes.  The best approach will usually be to take two random groups, treat one group as best you can, and then compare total results for both groups.  Just as was done here.


#16    Depot      (see all posts) 2009/10/10 (Sat) @ 21:05

I don’t know if this is the source of confusion, but the article isn’t clear on this point.  MGL is assuming that when the non-compliers are thrown out in the treatment group, the non-compliers are also thrown out in the control group.  I’m guessing this is right, though I didn’t read it that way at first either.  Guy seems to be making the assumption that I originally made - that they were throwing out people in the treatment group but not throwing out people in the control group.  Yes?  That would be problematic.  But I’m guessing they are limiting both samples so everything should be fine.


#17    MGL      (see all posts) 2009/10/10 (Sat) @ 22:08

"that they were throwing out people in the treatment group but not throwing out people in the control group.  Yes?  That would be problematic.”

Yes, that would be a problem.  If they did that, they don’t know how to do a study, right?

Guy, of course we don’t KNOW that the control and experimental groups are equivalent if we throw out the non-compliant people in both groups any more than we KNOW that the two groups were equivalent in the first place when they were randomly selected!  As long as you throw out the same people from both groups, you are still left with two, random, presumably equivalent groups!

And yes, I also said that you might want to look at the partially compliant people since there could be some value to a drug even if it is not taken according to the “protocol.” Other than that, there is no value in looking at the results from people who are not taking the drug at all or hardly taking the drug at all.  Period.  There is no other way to couch it.  There is no argument there.  If you are not taking the drug and you are in the experimental group, there is no value in your results.  Your results are noise. It is up to the researchers to make sure that the same people are “removed” from the control group, that’s all.  If you absolutely can’t or won’t do that, then I guess you are stuck with using Group I (everyone), including the “noise” from the non-compliant people.  But that is a bad second choice.  There is no reason not to remove from the control group the same people you remove from the experimental group.  No reason at all.  I am 99% sure that they do that even though I know nothing about medical research.

I don’t think there is any controversy here.  I don’t think any reasonable person would argue that the researchers should get to choose which data is presented in order to slant the study one way or another, whichever way suits their needs.


#18    Guy      (see all posts) 2009/10/10 (Sat) @ 23:06

"of course we don’t KNOW that the control and experimental groups are equivalent if we throw out the non-compliant people in both groups any more than we KNOW that the two groups were equivalent in the first place when they were randomly selected!”

Sure, the two original samples could differ. But that’s not a reason to compound the problem by adding potential additional bias.  The smaller compliant samples are more likely to differ by chance, on top of any difference in your original samples. 

“As long as you throw out the same people from both groups, you are still left with two, random, presumably equivalent groups!”

Unless patients’ reaction to the treatment itself affects their full compliance—which certainly seems possible. (If researchers can be certain this isn’t the case, then they should be equivalent but smaller samples.)

“If you are not taking the drug and you are in the experimental group, there is no value in your results.”

Simply not true.  Let’s say the treatment compliance group, just by chance, is a disproportionately low risk pool, while the control compliance group is a high risk group (increasing the chance the treatment will appear to work).  In that scenario, including the non-compliance people offsets that error, by adding more high-risk people to the treatment sample and adding more low-risk people to the control sample.  Using the full, larger sample protects you against certain kinds of sample variation in the compliance samples: any group overrepresented there will be underrepresented in the non-compliers.  Even though these non-compliance patients could not possibly have benefitted from the vaccine, including their results may still give you a more accurate comparison, because the likelihood of getting HIV is not only, or even mainly, a function of the inoculation.


#19    MGL      (see all posts) 2009/10/11 (Sun) @ 02:56

I understand what you are saying but that is a backhanded way of giving something value, if you know what I mean.  Clearly the results of people who don’t take a certain drug tell you NOTHING (zero, zilch, nada) about the effectiveness of that drug.  The inclusion of those people in one group in order to “balance” your comparison to another group is a backhanded, unnecessary way of achieving results.  Surely no researcher in his right mind would recommend that method. You are MUCH better off balancing the groups by trying at least to remove similar persons from your control group.  Much better off. It simply makes no sense at all to purposely introduce into your data set in the EXPERIMENTAL group, people who are not taking the drug (let’s assume that non-compliance means not taking the drug at all).  No sense at all. To me at least.


#20          (see all posts) 2009/10/11 (Sun) @ 04:34

Cochrane published a systematic review showing that tricyclic antidepressants are “1.47 times more likely to bring about a response” in people with depression than placebo. However, they also published a systematic review concluding that TCAs only resulted in small differences in mood compared to “active placebos”, i.e., placebos that contain ingredients which mimic the side effects of TCAs.

Here is the link to that review:
http://www.cochrane.org/reviews/en/ab003012.html

In light of the authors’ conclusions (“...the effects of antidepressants may generally be overestimated and their placebo effects may be underestimated.”), I am not inclined to believe Pizza’s assertion that antidepressants “work just fine.” However, as the limits of my knowledge do not extend much past these studies, I would be very (earnestly) interested to hear his response.

Also, re: active placebos in a depression study: Isn’t there a major ethical problem with telling your control group subject that you are going to give him a pill to help his depression, and then just give him sugar and saltpeter? First, there’s an opportunity cost in that the depressive is now not receiving medicine that is more likely to help his mood. And on top of that, you’re condemning the poor guy to flaccidity for the duration of the trial, potentially making things much, much (,much) worse.


#21    Alt_n      (see all posts) 2009/10/11 (Sun) @ 11:01

Most analysis in a clinical trial is done on the “intent-to-treat” population.  This will generally include a few subjects who never receive any treatment at all, but study sponsors will always design the study to try to reduce this number.

For example, most trials will consider every subject who is randomized (i.e., assigned either the active drug or the placebo) as an “intent to treat” subject.  So the goal of the sponsor is to reduce the amount of time between randomization and the administration of the first dose. 

In practice, this means that randomization will be done right when the subject is ready to receive his or her first dose of medication--the subject is in the clinic for the first dose, the doctor telephones a central clearinghouse number, which tells the doctor “administer 50 mg from vial number 1234” or some such instruction, and the subject becomes part of the intent-to-treat population the instant the doctor is given that instruction.

Obviously, some subjects will do things like back out when they see the needle, or disappear when the doctor is off making the phone call, but this shouldn’t be too large of a percentage of the population.  I suspect that almost everybody in that “modified intent to treat” population described in the article received at least one dose of the vaccine.  The article calls the vaccine a “regimen,” so obviously there is more than one dose involved.

Somebody needs to look at the data closely in order to determine why the subjects who received the “wrong” regimen seemed to do better than the ones who received the protocol-specified regimen.  Not knowing anything at all about this trial other than what the linked article says, I suspect that the vaccine works to an extent, but the regimen selected by the designers of the study is not a valid one.  Hopefully there is enough data on non-compliant subjects to get an idea of what a better regimen actually would be.

The debate above--whether it is better to perform analysis on the “intent to treat” population or the “according to protocol” population--is also a debate seen within the clinical research community.  There are certainly valid reasons to perform analysis on the “intent to treat” population in some trials, but there is a certain slavish devotion to intent-to-treat analysis that just doesn’t make sense.


#22    Terry      (see all posts) 2009/10/11 (Sun) @ 11:17

Torture the data long enough and it will confess....


#23    Guy      (see all posts) 2009/10/11 (Sun) @ 13:17

"The debate above--whether it is better to perform analysis on the “intent to treat” population or the “according to protocol” population--is also a debate seen within the clinical research community.”

The difference being, of course, that neither MGL nor I actually know anything about clinical research… :>)

“Somebody needs to look at the data closely in order to determine why the subjects who received the “wrong” regimen seemed to do better than the ones who received the protocol-specified regimen.”

We don’t really know that’s true here, do we?  We only know that the gap between the low-compliance treatment and control groups is greater.  It could just be that the low-compliance control group had a very low infection rate, for some reason.  Also, why couldn’t the low-treatment group include more at-risk patients, who benefit from the vaccine, while the 100% compliant population includes many people who can’t benefit because they actually have a 0% chance of infection?  Anyway, without knowing the sample sizes it’s hard to know how reliable the results are for the low- and zero-compliance samples. 
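To put the sample-size point in concrete terms, here is a minimal sketch using made-up counts (they are NOT the trial's data). It runs a standard pooled two-proportion z-test on the same pair of infection rates at two different sample sizes; the function name and the numbers are assumptions chosen only for illustration.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: identical infection rates (1.0% vs 0.6%),
# first with 1,000 subjects per arm, then with 4,000 per arm.
small = two_proportion_p_value(10, 1000, 6, 1000)
large = two_proportion_p_value(40, 4000, 24, 4000)

print(f"n=1000 per arm: p = {small:.3f}")
print(f"n=4000 per arm: p = {large:.3f}")
```

With these invented numbers the same observed gap is nowhere near significant in the smaller sample and crosses the 5% level in the larger one, which is why the p-values for the smaller compliance subgroups are hard to interpret without the counts.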

*

I can see a lot of reasons why intent-to-treat is the default.  As you say, it gives researchers an incentive to treat as broadly as possible.  You don’t have to worry about whether full, partial, and zero treatment are defined reasonably.  You don’t have to worry about whether the two full treatment samples differ, by chance or poor administration or because the treatment itself impacted compliance.  You just ask, given a population you attempted to treat, what is the net impact?  The bias here, in general, will be to UNDERSTATE the actual effect of the treatment, which seems like the bias you want.  But the 100% treatment group does provide an important check.  If you see a result in the full group but not with 100% treatment, that raises serious questions.
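A rough sketch of why intent-to-treat tends to understate the effect, under two simplifying assumptions that are mine and not from the article: non-compliers get no benefit at all, and compliers and non-compliers have the same baseline risk. All numbers are hypothetical.

```python
def treatment_arm_rate(baseline_rate, true_reduction, compliance):
    """Expected infection rate in the whole treatment arm (intent-to-treat).

    Assumes non-compliers receive no benefit and share the compliers'
    baseline risk. Both are simplifying assumptions for illustration.
    """
    compliers = compliance * baseline_rate * (1 - true_reduction)
    non_compliers = (1 - compliance) * baseline_rate
    return compliers + non_compliers


# Hypothetical inputs: 1% baseline infection rate, a treatment that halves
# risk when actually taken, and 70% of the treatment arm complying.
baseline = 0.01
itt_rate = treatment_arm_rate(baseline, true_reduction=0.50, compliance=0.70)
observed_reduction = 1 - itt_rate / baseline

print(f"ITT infection rate: {itt_rate:.4f}")
print(f"Observed reduction: {observed_reduction:.0%} (true reduction: 50%)")
```

Under those assumptions the observed reduction is just the true reduction scaled by the compliance rate, so the bias runs toward zero. If compliers and non-compliers differ in baseline risk, which is exactly the worry raised above, this simple scaling no longer holds.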


#24    MGL      (see all posts) 2009/10/11 (Sun) @ 16:32

Josh, good stuff.  I was skeptical of Pizza’s claims for the same reasons you cite.  I didn’t want to offend him though.  Many psychologists and psychiatrists are biased and think that certain anti-depressants work “just fine” (whatever that even means).

“Also, re: active placebos in a depression study: Isn’t there a major ethical problem with telling your control group subject that you are going to give him a pill to help his depression, and then just give him sugar and saltpeter?”

Isn’t that an issue in all clinical trials with control groups?

How else are you going to test drugs?  I assume that the people are told that they may be getting placebos and it is their choice to participate or not.  And I assume that they get paid and/or free drugs in exchange for being a guinea pig.


#25    Alt_n      (see all posts) 2009/10/11 (Sun) @ 17:51

#20/Josh:  “Also, re: active placebos in a depression study: Isn’t there a major ethical problem with telling your control group subject that you are going to give him a pill to help his depression...”

Well, you don’t tell the subject that you are giving him a pill to help his depression, you tell him the exact probability that he will be receiving a study medication or a placebo.  It’s called “informed consent"--you have to let the subject know exactly what he or she will be getting, or (assuming the study is blinded) the probability that he or she will be in each group.

#24/MGL:  “How else are you going to test drugs?”

You have stumbled onto another controversy in the research community.  The FDA is more likely to approve a clinical trial involving a placebo than the EMEA (the European equivalent to the FDA).  The EMEA’s position is that a placebo trial is almost always unethical, unless there is absolutely no alternative treatment available to the subject.  The FDA is tending more and more toward the same position. 

If you have developed a treatment for HIV, for example, it is not ethical to compare that treatment to placebo.  You need to compare it to the current standard of care--in other words, the 2 groups that subjects are randomized into are not “experimental” and “placebo,” they are “experimental” and “standard of care.” You need to prove that your drug is better than what people are getting right now, not that it is better than nothing.

#24/MGL:  “And I assume that they get paid and/or free drugs in exchange for being a guinea pig.”

They get paid, but not too much.  Just enough to compensate them for their time and their expenses for traveling to the doctor’s office.  They get free medical tests, which can be a pretty significant benefit (it has happened that an ECG or a lab test performed as part of the trial uncovered a life-threatening but treatable medical condition in a study subject).  Of course, all of the study medications are free, and many trials that do involve placebos are set up so that all subjects eventually receive the study medication.


#26    Josh      (see all posts) 2009/10/11 (Sun) @ 19:59

#25/Alt:

“Well, you don’t tell the subject that you are giving him a pill to help his depression, you tell him the exact probability that he will be receiving a study medication or a placebo.  It’s called “informed consent"--you have to let the subject know exactly what he or she will be getting, or (assuming the study is blinded) the probability that he or she will be in each group.”

Gotcha. Thanks for the clarification.

#24/MGL:

“Isn’t that an issue in all clinical trials with control groups?”

But there’s an extra issue in an active placebo trial. The person receiving an active placebo is not only not getting potentially positive treatment, he is receiving a substance that is actively harming him through the side effects.

I understand your point about informed consent, it just intuitively strikes me as ethically icky. Maybe I’ll be relieved of that feeling as I think about it more.

The interesting thing to me about the antidepressant study is that something about the side effects themselves seems to be providing relief. Maybe it’s a standard placebo effect, but maybe whatever mechanism advances depression of mood is also (inversely) related to depression of libido (and the common effects that show up in anti-depressant users).

#24/MGL:

“I was skeptical of Pizza’s claims for the same reasons you cite.  I didn’t want to offend him though.”

Ha. I don’t think Pizza is overly sensitive. Judging by his other writings, I’m pretty sure he can hack the scrutiny- especially when he makes a hasty remark such as this.

Moreover, I think we’d be a lot better off if we all stopped worrying so much about offending people when we’re honestly trying to get at the truth.


#27    Pizza Cutter      (see all posts) 2009/10/11 (Sun) @ 20:16

MGL, offend away!  I don’t always agree with you, but you are always logical and address issues substantively, so I don’t mind being told I’m full of manure by you or anyone else who will engage in similar terms.  (It’s one of the reasons I like hanging out here.)

Josh, that specific study covers tri-cyclics, which are second-choice (on a good day) drugs and rarely used.  The primary drugs that psychiatrists go for are the SSRI’s (Selective Serotonin Reuptake Inhibitors).  Prozac, Zoloft, and Paxil are the big three, and they are chemically and neurologically very different than tri-cyclics.

I went to PsycINFO, which is THE database for psychology professional journal articles, and typed in “fluoxetine (the chemical name for Prozac) and efficacy”.  Within the last two years alone (2008-9), I found three meta-analyses (studies which pool together all available studies and analyze them collectively), looking at the effects of anti-depressants vs. placebo.  Guirguis-Blake et al. (2008) looked at moderate to severe depression in adolescents and found a good-sized effect for meds over placebo.  The TADS (Treatment of Adolescent Depression Study) data found that the effect size was smaller, although significantly different from placebo.  Hansen et al. (2008) found that fluoxetine was better at preventing relapses of depression at six months than placebo.  This is admittedly a truncated search, but of course, we don’t have time to review everything here.  I’ve (been required to) read a lot of the stuff out there on the subject.  Most of the arguing left is about which med works best of all.

The evidence is not 100% clear cut, I’ll grant that.  It looks like the best course of treatment is a *combination* of talk therapy with meds, and there are studies that do find zero effect for meds, although they are out-numbered.  There is also the problem of the “file drawer effect” in which studies which do not produce significant effects are “not publishable.” As the drug companies themselves do most of the efficacy studies, there is the fear that they only release the one study that shows that “it works!” and keep the other 25 hidden.  (America needs a pharmaceutical trials registry.) I respect that those are clear biases in the field, and if these issues are addressed and produce different results, I will gladly change my recommendations on the subject.
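To see why the file drawer matters, here is a back-of-the-envelope sketch. It assumes a drug with no real effect, independent studies, and the usual 5% significance threshold; the 26 comes from the “one released, 25 hidden” scenario above, and the function name is just for illustration.

```python
def prob_at_least_one_false_positive(num_studies, alpha=0.05):
    """Chance that at least one of several independent studies of a drug
    with NO real effect comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** num_studies


# One published study plus 25 left in the file drawer.
chance = prob_at_least_one_false_positive(26)
print(f"Chance of at least one spurious 'it works!' result: {chance:.0%}")
```

Under those assumptions a company running that many trials on a useless drug would more likely than not end up with at least one publishable “success,” which is the argument for a registry that records every trial up front.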

There’s also the simple fact that nothing works for everyone.  MGL, you spoke to the fact that your experience with anti-depressants has been poor.  In my family, it’s been a mixed bag.  As professionals, we like to refer to pharmacotherapy as “brain surgery with a butter knife.” The evidence says that it works a good chunk of the time.  That’s small consolation to the person for whom it doesn’t work, but the evidence says with a good amount of force that you’re likely to be better off with meds.  As a clinician, I would love to be 100% certain about things.  The problem is that I can’t be.  The best evidence I’ve seen so far says that anti-depressants really do help people to recover from depression in a lot of cases.  They aren’t perfect.  They have side effects.  They have drawbacks.  They need to be monitored carefully.  There may be some element of placebo in there.  I worry though that people mistake all of these caveats for “there’s no point then.”


#28    MGL      (see all posts) 2009/10/11 (Sun) @ 20:33

Good stuff Pizza!

“MGL, you spoke to the fact that your experience with anti-depressants has been poor.”

Just mixed, not poor really.  I have been on and off SSRI’s for 20 years.  Hard to say what effect they have and how much is a placebo effect.  I just put them in a different class than, say, an antibiotic, which clearly “works” AND you can tell rather easily whether it works or not.

I have known many people who go on SSRI’s and when asked if they are having an effect, they say, “I’m not really sure.” That answer, in my experience with myself and family and friends, seems to be more common than, say, “Oh yeah, it’s great!”

I am not claiming that they don’t work for some people by any means.  I just don’t think one can say, “They work just fine,” depending upon what you mean by “just fine” of course.  If “just fine” means significantly better than a placebo in most studies, then I’m OK with that statement.  If “just fine” means, “like an anti-biotic or a pain-killer,” then I am not so OK with that statement.

No real disagreement I don’t think.


#29    Josh      (see all posts) 2009/10/11 (Sun) @ 23:59

Pizza,

I used the TCA data because I cannot find any systematic review (which I think provides superior quality information to meta-analysis) for SSRIs. My understanding is that SSRIs are more prescribed because of their tolerability, i.e., they have fewer side effects. This does not mean that the most prescribed antidepressants are more effective than the very dubious TCAs.

You point to meta-analyses that show SSRIs are more effective than placebo. But, as I pointed out above, Cochrane’s systematic review showed TCAs also do better than placebo. But, when you compare them to active placebos, they look like they don’t work.

I can’t find a review that compares SSRIs to active placebo, but here is a meta-analysis that shows SSRIs as being no more effective than TCAs:

http://www.ncbi.nlm.nih.gov/pubmed/10760555?dopt=Citation

Another part of the TCA review that I find suggestive is the fact that you can take the standard dosage and cut it down to a much lower dosage… and get no reduction in effectiveness (though you’ll get a reduction in side effects, as with substituting SSRIs). Of course, it could be that the normal dosage is just too high. Another explanation is that these drugs don’t really work.

