Saturday, October 10, 2009
An AIDS vaccine trial and statistical significance
Here is the link to the WSJ article on the study:
http://online.wsj.com/article/SB125511780864976689.html?mod=WSJ_hpp_MIDDLTopStories
I’ll try to summarize the findings and the controversy:
Apparently drug studies often report results for three different data sets. Group I includes almost all enrollees, including non-compliant ones. Group II (a smaller number than Group I) includes those who didn’t follow some of the guidelines, and Group III includes only those who followed all of the guidelines, the smallest number of subjects, of course. The results for each of the three groups (in this case, the rate of HIV infection) are compared to those for the placebo group. There were about 8200 subjects in the first group. I don’t know over what time period the study took place; the article did not say.
What they found was that the difference between the placebo (control) subjects and the vaccinated subjects was significant at p=.039 for Group I, p=.076 for Group II, and p=.16 for Group III. So the results (the difference in rate of infection between the vaccine group and the placebo group) were statistically significant (at the 5% level) for Group I, almost significant for Group II, and not even close to significant for Group III.
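To make the comparison concrete, here is a minimal sketch that checks each reported p-value against the conventional 5% threshold. The three p-values are the ones quoted above; the names `p_values`, `ALPHA`, and `is_significant` are just mine for illustration, and 5% is a convention, not a rule from the study.

```python
# p-values as quoted above for the three analysis groups
p_values = {"Group I": 0.039, "Group II": 0.076, "Group III": 0.16}

ALPHA = 0.05  # conventional 5% significance threshold (an assumption, not a law)


def is_significant(p, alpha=ALPHA):
    """Return True if the p-value falls below the chosen threshold."""
    return p < alpha


# Map each group to whether its result clears the threshold
verdicts = {group: is_significant(p) for group, p in p_values.items()}

for group, p in p_values.items():
    label = "significant" if verdicts[group] else "not significant"
    print(f"{group}: p = {p} -> {label} at the {ALPHA:.0%} level")
```

Only Group I clears the 5% bar, which is the whole source of the controversy.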
The controversy was that originally the researchers only reported the results for Group I.
According to the article:
The results announced last month were based on a “modified intent to treat analysis,” which includes virtually everyone who enrolled in the study, regardless of whether they ended up getting the full course of the vaccine. It is a good stand-in for the real world, where people don’t always follow instructions properly.
Any results that include subjects who didn’t take the drug or hardly took it at all (group I) obviously have more noise in them. That is the last group you want to use to report your results. The second group apparently includes subjects who took the drugs to some degree, but did not take them exactly according to the protocol, or some such thing. Again, you are likely to have more noise in group II than in group III. I suppose you could make the argument that this group includes people who took the drug to some extent and that they might have gotten some benefit.
Basically, it seems to me that ONLY reporting the results from group I, which were significant at the 3.9% level, while the results for group III (the “pure” group) were only significant at the 16% level (which is not very good at all for a medical study I would think), is a gigantic ethical breach, and I can’t imagine that a researcher would be allowed to do such a thing.
Aren’t there rules for how you are supposed to report the data?
From the article:
“I think in general it’s best to lay out as much data as possible,” said Barton Haynes, director of the center and an HIV vaccine expert at Duke University, who was at the meeting. “This is a very difficult situation for everyone, and we’ll have to wait until all the data are released so we can drill down into it.”
Isn’t that what we always say? Present ALL the data and let the readers decide for themselves?
Anyway, as we always say, there is no magic number for a result to be significant. Isn’t that especially true for medical studies when you are dealing with potentially saving lives? If you do a study and get results significant at the p=.16 level, shouldn’t you simply redo the study? Even if you get a very significant result, don’t you always have to redo a study before a drug is approved? Again, I don’t know how long this study took. Since they were testing a vaccine, it probably took a long time so it may not be all that practical to do another study. Don’t you sometimes have to do multiple studies on drugs in order to tweak the protocols, dosages, etc.?
Aside from comments, which are more than welcome of course, I’ll pose this question to all the smart minds on this blog: Let’s say that you do a similar study and it takes 5 or 10 years. What kind of results do you need before you allow the drug to be sold to the general public? Clearly you will eventually have a chance to re-check the results on a much, much larger sample in another 5 or 10 years if it is approved, but what do you need to approve the drug, at least on a “temporary” basis (if that is even allowed)? 1%? 5%? 10%? 15%? 20%?
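One rough way to think about those thresholds: if a drug truly does nothing, the threshold is approximately the chance that a single trial of it comes out "significant" anyway. The sketch below works out how many of 100 hypothetical useless drugs would slip through at each cutoff. The figure of 100 drugs and the function name are made up for illustration, and I am ignoring one-sided versus two-sided tests.

```python
def expected_false_approvals(n_ineffective, alpha):
    """Expected number of truly ineffective drugs that pass one trial
    when 'passing' means p < alpha (simplifying assumption: alpha is
    the per-trial chance of a false positive)."""
    return n_ineffective * alpha


# The candidate thresholds asked about above
thresholds = [0.01, 0.05, 0.10, 0.15, 0.20]

# Hypothetical pool of 100 drugs that do nothing at all
N_INEFFECTIVE = 100

table = {a: expected_false_approvals(N_INEFFECTIVE, a) for a in thresholds}

for alpha, count in table.items():
    print(f"threshold {alpha:.0%}: about {count:.0f} of {N_INEFFECTIVE} useless drugs pass")
```

So a looser threshold gets real drugs to market faster, but the price is roughly proportional: quadruple the threshold and you quadruple the duds that get through.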
Do we want a whole bunch of drugs out there when we KNOW that some percentage of them will not work and only succeeded in a trial or trials by chance? Do we put as many drugs out there (assuming that we think they are safe and there are few if any alternatives) as possible and then collect more data on them and if it turns out that they are not effective, we remove them from the market? Should it depend on how life-saving the drug is (and how dangerous it might be) and how long it takes to conduct these trials? Do you think there are lots of drugs out there that are completely ineffective but passed a series of trials by chance alone? Would we even know it, especially if it is a drug that is not supposed to have a high success rate (like some of the cancer drugs which may or may not work, or some drugs whose effects are not necessarily noticeable, like some of the anti-depressants)? And since most drugs have a fairly significant placebo effect, there may be many drugs out there whose only value is in their placebo effect.
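As a back-of-the-envelope way to frame the "passed by chance alone" question, here is a small sketch. It assumes the trials are independent and that the threshold is the per-trial chance of a false positive. The 10% prior (share of tested drugs that really work) and the 80% power are numbers I made up purely for illustration; nobody knows the real ones.

```python
def chance_of_passing_all(alpha, n_trials):
    """Chance that a truly ineffective drug passes every one of
    n_trials independent trials, each with false-positive rate alpha."""
    return alpha ** n_trials


def share_ineffective_among_approved(prior_effective, power, alpha):
    """Among drugs that pass a single trial, the fraction that are
    actually ineffective (Bayes' rule with the stated assumptions)."""
    true_positives = prior_effective * power
    false_positives = (1 - prior_effective) * alpha
    return false_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only to illustrate the arithmetic
PRIOR_EFFECTIVE = 0.10  # assumed share of tested drugs that really work
POWER = 0.80            # assumed chance a working drug passes a trial
ALPHA = 0.05            # conventional significance threshold

two_trials = chance_of_passing_all(ALPHA, 2)
dud_share = share_ineffective_among_approved(PRIOR_EFFECTIVE, POWER, ALPHA)

print(f"Useless drug passes two independent trials: {two_trials:.2%}")
print(f"Share of single-trial passers that are useless: {dud_share:.0%}")
```

The point of the second function is that the answer depends heavily on how many of the drugs being tested work in the first place. If most candidates are duds, a surprisingly large share of the "winners" can be duds too, even at the 5% level, which is one argument for requiring a second trial.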