[E]very time in the past that these researchers had claimed that an association observed in their observational trials was a causal relationship, and that causal relationship had then been tested in experiment, the experiment had failed to confirm the causal interpretation -- i.e., the folks from Harvard got it wrong. Not most times, but every time. No exception. Their batting average circa 2007, at least, was .000.
...
I never used the word scientist to describe the people doing nutrition and obesity research, except in very rare and specific cases. Simply put, I don’t believe these people do science as it needs to be done; it would not be recognized as science by scientists in any functioning discipline.
Science is ultimately about establishing cause and effect. It’s not about guessing. You come up with a hypothesis -- force x causes observation y -- and then you do your best to prove that it’s wrong. If you can’t, you tentatively accept the possibility that your hypothesis was right. Peter Medawar, the Nobel Laureate immunologist, described this proving-it’s-wrong step as “the critical or rectifying episode in scientific reasoning.” Here’s Karl Popper saying the same thing: “The method of science is the method of bold conjectures and ingenious and severe attempts to refute them.” The bold conjectures, the hypotheses, making the observations that lead to your conjectures… that’s the easy part. The critical or rectifying episode, which is to say, the ingenious and severe attempts to refute your conjectures, is the hard part. Anyone can make a bold conjecture. (Here’s one: space aliens cause heart disease.) Making the observations and crafting them into a hypothesis is easy. Testing them ingeniously and severely to see if they’re right is the rest of the job -- say 99 percent of the job of doing science, of being a scientist.
...
[B]ecause this is supposed to be a science, we ask the question whether we can imagine other less newsworthy explanations for the association we’ve observed. What else might cause it? An association by itself contains no causal information. There are an infinite number of associations that are not causally related for every association that is, so the fact of the association itself doesn’t tell us much.
...
[A]s we move from the bottom quintile of meat-eaters (those who are effectively vegetarians) to the top quintile of meat-eaters we see an increase in virtually every accepted unhealthy behavior -- smoking goes up, drinking goes up, sedentary behavior (or lack of physical activity) goes up -- and we also see an increase in markers for unhealthy behaviors -- BMI goes up, blood pressure, etc. So what could be happening here?
...
[P]eople who comply with their doctors’ orders when given a prescription are different and healthier than people who don’t. This difference may be ultimately unquantifiable. The compliance effect is another plausible explanation for many of the beneficial associations that epidemiologists commonly report, which means this alone is a reason to wonder if much of what we hear about what constitutes a healthful diet and lifestyle is misconceived.
...
[W]henever epidemiologists compare people who faithfully engage in some activity with those who don’t -- whether taking prescription pills or vitamins or exercising regularly or eating what they consider a healthful diet -- the researchers need to account for this compliance effect or they will most likely infer the wrong answer. They’ll conclude that this behavior, whatever it is, prevents disease and saves lives, when all they’re really doing is comparing two different types of people who are, in effect, incomparable.
...
[O]bservational studies may have inadvertently focused their attention specifically on, as Jerry Avorn says, the “Girl Scouts in the group, the compliant ongoing users, who are probably doing a lot of other preventive things as well.”
...
It’s this compliance effect that makes these observational studies the equivalent of conventional wisdom-confirmation machines.
...
So when we compare people who ate a lot of meat and processed meat in this period to those who were effectively vegetarians, we’re comparing people who are inherently incomparable. We’re comparing health-conscious compliers to non-compliers; people who cared about their health and had the income and energy to do something about it and people who didn’t. And the compliers will almost always appear to be healthier in these cohorts because of the compliance effect if nothing else. No amount of “correcting” for BMI and blood pressure, smoking status, etc. can correct for this compliance effect, which is the product of all these health-conscious behaviors that can’t be measured, or just haven’t been measured. And we know this because it’s even present in randomized controlled trials. When the Harvard people insist they can “correct” for this, or that it’s not a factor, they’re fooling themselves. And we know they’re fooling themselves because the experimental trials keep confirming that.
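The argument above can be made concrete with a toy simulation. What follows is a minimal sketch, not anything taken from the studies discussed: every name and number in it (`trait`, `marker`, `simulate`, the noise levels, the sample size) is a hypothetical chosen for illustration. It assumes an unmeasured "health-consciousness" trait that drives both health and the decision to adopt some behavior, while the behavior itself has zero causal effect. It then compares three estimates: the raw observational difference, the difference after "correcting" for a noisy measured marker of the trait, and the difference under random assignment.

```python
import random
from collections import defaultdict
from statistics import mean


def mean_diff(pairs):
    """Mean health of adopters minus mean health of non-adopters."""
    yes = [health for adopted, health in pairs if adopted]
    no = [health for adopted, health in pairs if not adopted]
    return mean(yes) - mean(no)


def simulate(n=20000, seed=0):
    """Return (raw, adjusted, randomized) differences in mean health.

    Assumption: the behavior has NO causal effect on health at all.
    """
    rng = random.Random(seed)
    observational, randomized = [], []
    strata = defaultdict(list)
    for _ in range(n):
        trait = rng.gauss(0, 1)            # unmeasured health-consciousness
        health = trait + rng.gauss(0, 1)   # depends on the trait, not the behavior
        marker = trait + rng.gauss(0, 1)   # noisy measured proxy (think BMI)
        adopts = trait + rng.gauss(0, 1) > 0   # compliers self-select
        assigned = rng.random() < 0.5          # coin flip, as in a trial
        observational.append((adopts, health))
        randomized.append((assigned, health))
        strata[round(marker)].append((adopts, health))

    # "Correct" for the marker: compare within strata, weight by stratum size.
    weighted, total = 0.0, 0
    for group in strata.values():
        has_both = any(a for a, _ in group) and any(not a for a, _ in group)
        if has_both:
            weighted += len(group) * mean_diff(group)
            total += len(group)
    return mean_diff(observational), weighted / total, mean_diff(randomized)


raw, adjusted, trial = simulate()
print(f"observational, raw:      {raw:+.2f}")
print(f"observational, adjusted: {adjusted:+.2f}")
print(f"randomized:              {trial:+.2f}")
```

Under these assumptions the raw comparison shows a sizable "benefit", adjusting for the measured marker shrinks it without removing it, and only the randomized comparison lands near zero, which is the true effect in this toy setup.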
...
This is why the best epidemiologists -- the ones I quote in the NYT Magazine article -- think this nutritional epidemiology business is a pseudoscience at best. Observational studies like the Nurses’ Health Study can come up with the right hypothesis of causality about as often as a stopped clock gives you the right time. It’s bound to happen on occasion, but there’s no way to tell when that is without doing experiments to test all your competing hypotheses. And what makes this all so frustrating is that the Harvard people don’t see the need to look for alternative explanations of the data -- for all the possible confounders -- and to test them rigorously, which means they don’t actually see the need to do real science.
...
Now we’re back to doing experiments -- i.e., how we ultimately settle this difference of opinion. This is science. Do the experiments.
...
So we do a randomized controlled trial. Take as many people as we can afford, randomize them into two groups -- one that eats a lot of red meat and bacon, one that eats a lot of vegetables and whole grains and pulses, and very little red meat and bacon -- and see what happens. These experiments have effectively been done. They’re the trials that compare Atkins-like diets to other more conventional weight loss diets -- AHA Step 1 diets, Mediterranean diets, Zone diets, Ornish diets, etc. These conventional weight loss diets tend to restrict meat consumption to different extents because they restrict fat and/or saturated fat consumption and meat has a lot of fat and saturated fat in it. Ornish’s diet is the extreme example. And when these experiments have been done, the meat-rich, bacon-rich Atkins diet almost invariably comes out ahead, not just in weight loss but also in heart disease and diabetes risk factors. I discuss this in detail in chapter 18 of Why We Get Fat, “The Nature of a Healthy Diet.” The Stanford A TO Z Study is a good example of these experiments. Over the course of the experiment -- two years in this case -- the subjects randomized to the Atkins-like meat- and bacon-heavy diet were healthier. That’s what we want to know.
Now Willett and his colleagues at Harvard would challenge this by saying somewhere along the line, as we go from two years out to decades, this health benefit must turn into a health detriment. How else can they explain why their associations are the opposite of what the experimental trials conclude? And if they don’t explain this away somehow, they might have to acknowledge that they’ve been doing pseudoscience for their entire careers. And maybe they’re right, but I certainly wouldn’t bet my life on it.
Ultimately we’re left with a decision about what we’re going to believe: the observations, or the experiments designed to test those observations. Good scientists will always tell you to believe the experiments. That’s why they do them.
...
Conventional methods assume all errors are random and that any modeling assumptions (such as homogeneity) are correct. With these assumptions, all uncertainty about the impact of errors on estimates is subsumed within conventional standard deviations for the estimates (standard errors), such as those given in earlier chapters (which assume no measurement error), and any discrepancy between an observed association and the target effect may be attributed to chance alone. When the assumptions are incorrect, however, the logical foundation for conventional statistical methods is absent, and those methods may yield highly misleading inferences.
...
Systematic errors can be and often are larger than random errors, and failure to appreciate their impact is potentially disastrous.
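The point about systematic versus random error can be sketched numerically. The example below is a hypothetical illustration, not the method of the text being quoted: the function name `estimate`, the bias of 0.3, and the unit-variance noise are all assumptions. It assumes a true effect of zero, adds a fixed systematic bias to every measurement, and computes the conventional standard error and 95% interval, which account only for the random part.

```python
import random
from statistics import mean, stdev


def estimate(n, bias, seed=0):
    """Return (estimate, standard_error) from n biased, noisy measurements.

    Assumptions: the true effect is 0, every measurement is shifted by the
    same systematic `bias`, and random noise is Gaussian with unit variance.
    """
    rng = random.Random(seed)
    true_effect = 0.0
    sample = [true_effect + bias + rng.gauss(0, 1) for _ in range(n)]
    est = mean(sample)
    se = stdev(sample) / n ** 0.5   # reflects random error only
    return est, se


for n in (100, 10000):
    est, se = estimate(n, bias=0.3)
    low, high = est - 1.96 * se, est + 1.96 * se
    print(f"n={n:>6}: estimate={est:+.3f}  se={se:.3f}  "
          f"95% interval=({low:+.3f}, {high:+.3f})")
```

In this sketch, collecting more data shrinks the standard error but leaves the bias untouched, so the interval tightens around the wrong value and ends up excluding the true effect of zero.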