Looking for the Pony in the HERS Data
One of the great public health advances of the last century was the development of the randomized, controlled clinical trial, which is designed to control for known and unknown differences in persons who take or do not take medication (by the use of randomization), and for the effects of patient and provider expectation (by the use of double-blind placebo control).
See p 917
The late Reuel A. Stallones was fond of teaching that clinical trials are done during a narrow window of opportunity when there is enough evidence for benefit to justify the time and expense of the trial, but not so much evidence that it would be unethical to deprive participants of the “active” treatment by assigning them to placebo. This window is particularly difficult to achieve when the medication has been in use for many years, its benefit has been demonstrated repeatedly in epidemiological studies and in clinical practice, and thought leaders and major medical organizations have already recommended its widespread use.
Such was the case with hormone replacement therapy (HRT) and the prevention of heart disease. At the beginning of the Heart and Estrogen/progestin Replacement Study (HERS), a secondary prevention trial of estrogen in postmenopausal women with heart disease, several experts opined that this trial was unnecessary at best and unethical at worst, given the consistency of the observational data, which certainly look very impressive in a meta-analysis, and the plethora of potential cardioprotective mechanisms for estrogen that have been demonstrated in vivo and in vitro.
It was therefore more than a bit of a shock when the HERS trial results showed no overall benefit to HRT and a completely unexpected early excess of cardiovascular events.1 The reaction of the research and clinical community to these results has been one of disbelief and denial accompanied by a frantic search for possible explanations for the “trial failure.”
Subgroup analyses often are used to show that benefit in subgroups parallels benefit in the study overall. For example, it is useful and reassuring to know that results are equivalent in different age, sex, and ethnic groups, as illustrated in a recent subgroup analysis of the Dietary Approaches to Stop Hypertension (DASH)-sodium trial for high blood pressure.2 The rationale for the post hoc subgroup analyses in HERS is different. Can we find a subgroup of women who might be benefited or harmed by HRT when the overall results show no benefit?
It can be argued that subgroup analysis of null results is akin to “looking for the pony.” In this modern fable, a man has two sons, one a hopeless pessimist and the other an unrealistic optimist. Determined to change their thinking to a less extreme position, the man buys a room full of toys for the pessimist and a room full of horse manure for the optimist. When he returns later, the pessimist is crying because his toys are already broken or soon will be. In contrast, the optimist is happily shoveling through his gift, explaining, “With all that manure there must be a pony in there somewhere.”
When the original HERS results were published in 1998,1 the authors already had looked very hard for the pony, which in this case consisted of characteristics that might explain the unexpected results of the HERS trial. In the present issue of Circulation, Furberg and colleagues3 publish the results of multiple subgroup analyses from HERS. They report 9 statistically significant interactions—that is, subgroups of women who did better or worse in the HRT group than in the placebo group, either overall or in the first year when excess cardiovascular events were observed. These 9 statistically significant comparisons out of 172 comparisons (86 tests for first-year outcomes and 86 for cumulative 4-year outcomes) approximate the 5 out of 100 differences expected by chance at P<0.05.
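The arithmetic behind that comparison can be checked with a short sketch. It assumes, purely for illustration, that the 172 tests are independent and that no true interaction exists; the HERS subgroup tests overlap and are certainly not independent, so the tail probability is only a rough guide. The function names are hypothetical and are not taken from the HERS analysis.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests significant by chance alone at level alpha."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """P(at least k chance-significant results), assuming independent tests."""
    # Sum the binomial probabilities of 0..k-1 chance findings, then take the complement.
    below = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below


# Figures from the editorial: 86 first-year tests plus 86 cumulative 4-year tests.
n_tests = 86 + 86
alpha = 0.05
observed = 9

print(f"tests performed: {n_tests}")
print(f"expected by chance: {expected_false_positives(n_tests, alpha):.1f}")
print(f"P(>= {observed} by chance): {prob_at_least(observed, n_tests, alpha):.2f}")
```

Under these assumptions the expected count is 172 × 0.05 = 8.6, so 9 significant interactions is about what chance alone would produce.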
There is no scientific way to reduce the number of associations sought in post hoc analyses or to determine which of the observed associations were not due to chance. The authors emphasize that they sought biologically plausible associations, but previously implausible associations sometimes prove to be coherent in the context of new discoveries in biology or physiology. More importantly, biological plausibility is quite easy to theorize; anyone with 2 hours and a little imagination can do it.
For example, Furberg et al3 report more heart disease for women assigned to HRT who were also taking digitalis, and less heart disease for women smokers who were assigned to HRT. Both of these associations are biologically plausible, in a stretch. Like estrogen, digitalis is a steroid and could have estrogen-like effects; perhaps too much estrogen is a bad thing. Women smokers treated with HRT have lower levels of estradiol than do nonsmokers; in this case, smoking prevents “too much” estrogen. In a previously published subgroup analysis,4 there was an apparent cardiovascular benefit for women with high lipoprotein(a) levels at baseline and harm for those with low lipoprotein(a). The benefit is plausible, the harm inexplicable.
HERS subgroup analyses do suggest that HRT works in qualitatively different ways among a few subgroups. As concisely stated by Sackett and colleagues,5 however, the statistics of determining subgroup prognoses are about prediction, not etiology. “They are indifferent to whether the prognostic factor is physiologically logical…or a biologically nonsensical and random, noncausal quirk….” Therefore, even when the difference in response makes biological sense, if it was not hypothesized before the trial and is not supported by similar results from another trial, the observation should not override conclusions based on the overall results.
Furberg et al3 have done a wonderful job indicating the caveats and limitations of multiple post hoc subset analyses. So then why perform these post hoc analyses? Any results will remain suspect unless confirmed in another trial. Nevertheless, despite the dangers of wrong conclusions, it would make no more sense to answer only the “main” question in a large clinical trial than it would to have a large observational study and not explore the data for disease associations unimagined when the cohort was established. One thing is clear: None of the interactions reported by Furberg and colleagues3 seems likely to explain the null cardiovascular results in HERS.
Can we salvage a hypothesis in the face of a negative trial? Early in the Lipid Research Clinics (LRC) Coronary Primary Prevention Trial, when the cholesterol–heart disease hypothesis was first being tested, investigators were given teaching slides prepared by the program office. One of these slides listed 6 or 7 reasons why the LRC trial results could be negative and the lipid hypothesis could still be true. Similar convolutions followed the release of the HERS results. It was variously proposed that the HERS study was too small and too short; events were too few; the women were too old, too sick, or too noncompliant; and the wrong estrogen or the wrong progestin was studied, possibly in the wrong dose as well. Further, it was only one trial and probably a fluke.
Now, 3 years after HERS, we still have no clinical trial evidence that HRT reduces the risk of coronary heart disease (CHD) events. The 4 published secondary prevention trials have shown no benefit to HRT. HERS examined fatal and nonfatal CHD as the primary outcome1 and stroke as a secondary outcome6; the Papworth Hormone-replacement therapy Atherosclerosis Survival Enquiry (PHASE) study examined fatal and nonfatal CHD, including hospitalization for unstable angina, as the primary outcome7; the Estrogen Replacement and Atherosclerosis (ERA) study used quantitative coronary angiography as the primary outcome8; and the Women’s Estrogen and Stroke Trial (WEST) examined stroke as the primary outcome.9 In HERS, women were treated with conjugated equine estrogen and medroxyprogesterone acetate; in PHASE, active treatment was transdermal estradiol plus norethisterone; in ERA, conjugated equine estrogen with or without medroxyprogesterone acetate; and in WEST, oral 17-β estradiol. No results from primary prevention trials specifically designed to examine the effect of HRT on CHD events have been published, but a pooled analysis of 26 small, short trials of HRT, mostly using unopposed estrogen, with CHD or cardiovascular disease noted as adverse events, showed no benefit.10,11 Letters to trial participants after the second and third year of the Women’s Health Initiative, in which one third of women are taking unopposed estrogen, report an excess of heart disease and stroke.12
Thus, all clinical trial data with CHD outcomes published to date support the revised American Heart Association position that estrogen should not be prescribed to prevent or treat CHD.13 The present publication does not suggest any subgroup likely to obtain benefit. The clinical trial results do not exclude the possibility that physiological levels of endogenous estrogen are cardioprotective.
HERS is not the first trial showing that best clinical practice can be misinformed when not evidence-based. In the recent past, clinical trials have shown that although medicine corrected dangerous cardiac arrhythmias, the patient was harmed rather than helped,14 and that, despite the strong and biologically plausible reason to hope that the antioxidant vitamin E would prevent cardiovascular disease, it does not.15
The opinions expressed in this editorial are not necessarily those of the editors or of the American Heart Association.
- Furberg CD, Vittinghoff E, Davidson M, et al. Subgroup interactions in the Heart and Estrogen/Progestin Replacement Study: lessons learned. Circulation. 2002; 105: 917–922.
- Sackett DL, Straus SE, Richardson WS, et al. Evidence-Based Medicine. How to Practice and Teach EBM. 2nd ed. London: Churchill Livingstone; 2000: 99.
- Simon J, Hsia J, Cauley JA, et al. Postmenopausal hormone therapy and risk of stroke: the Heart and Estrogen/progestin Replacement Study (HERS). Circulation. 2001; 103: 638–642.
- Clarke S, Kelleher J, Lloyd-Jones H, et al. Transdermal hormone replacement therapy for secondary prevention of coronary artery disease in postmenopausal women (abstract P 1194). Eur Heart J. 2000; 21 (suppl): 212.
- Hemminki E, McPherson K. Impact of postmenopausal hormone therapy on cardiovascular events and cancer: pooled data from clinical trials. BMJ. 1997; 315: 149–153.
- Mosca L, Collins P, Herrington DM, et al. Hormone replacement therapy and cardiovascular disease: a statement for healthcare professionals from the American Heart Association. Circulation. 2001; 104: 499–503.