(Circulation. 2002;106:1015.)
© 2002 American Heart Association, Inc.
Clinical Cardiology: New Frontiers
From the Duke Clinical Research Institute (R.M.C.), Duke University Medical Center, Durham, NC, and the Department of Biostatistics and Medical Informatics (D.L.D.), University of Wisconsin, Madison, Wis.
Correspondence to Robert M. Califf, MD, Duke Clinical Research Institute, PO Box 17969, Durham, NC 27715.
Key Words: trials, cardiovascular diseases, therapy, outcome assessment, statistics
| Introduction |
|---|
| Principle 1: Treatment Effects Are Modest |
|---|
At the other extreme from chronic therapies designed to prevent events, Guyatt and Sackett popularized the concept of the n=1 clinical trial, which is oriented toward symptomatic treatments.1,2 This approach engages the clinician and the patient in an experiment in which the treatment, or placebo, is randomly allocated and symptoms are measured until treatment efficacy or failure is proved. Although this approach may have merit in many clinical situations, it can be treacherous in chronic disease therapies in which the acute response does not predict the long-term outcome, or when the anticipated treatment effect is so small that a response to treatment is difficult to differentiate from random fluctuations in the measurement or unrelated changes in the natural history of the disease.
A practitioner's individual experience is simply not adequate to recognize treatment effects of the size usually seen in therapies to prevent future events in a chronic disease. In fact, a practitioner's personal experience has a reasonable probability of misleading him or her about what to expect when the next patient is treated. Within any large clinical trial, multiple practitioners will experience outcomes that differ from the overall results of the trial. Although monitoring each individual patient closely for symptomatic and physiological improvement is critical in any clinical practice, it is not the best way to determine whether proposed treatments are effective (with the exception of suitable situations for the n=1 trial), especially when the treatment is primarily given to alter the long-term course of a chronic disease. Furthermore, the experience with the last patient says little about what to expect in the next patient; rather, to detect modest treatment effects, large randomized trials are needed.3
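The point that personal experience can mislead may be illustrated with a rough calculation. The sketch below is not drawn from any trial: it assumes a hypothetical practitioner who observes equal numbers of treated and untreated patients, a hypothetical 10% event rate without treatment, and a 15% relative reduction with treatment. Under those assumptions it computes the exact binomial probability that the treated patients have at least as many events as the untreated ones, so that the therapy appears useless or harmful.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k events among n patients with event rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_no_apparent_benefit(n_per_group, control_rate, relative_reduction):
    """P(treated events >= untreated events) for one practitioner.

    All inputs are illustrative assumptions, not trial data.
    Intended for group sizes up to a few hundred patients.
    """
    treated_rate = control_rate * (1 - relative_reduction)
    total = 0.0
    cdf_control = 0.0  # running P(untreated events <= k)
    for k in range(n_per_group + 1):
        cdf_control += binom_pmf(k, n_per_group, control_rate)
        # treated group has exactly k events, untreated group has k or fewer
        total += binom_pmf(k, n_per_group, treated_rate) * cdf_control
    return total


if __name__ == "__main__":
    for n in (50, 500):
        p = prob_no_apparent_benefit(n, control_rate=0.10, relative_reduction=0.15)
        print(f"{n} patients per group: P(no apparent benefit) = {p:.2f}")
```

Under these assumed rates, the probability of a misleading impression falls as the number of patients grows, which is the argument for large randomized trials rather than individual observation.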
Another important implication of the modest treatment effect of most therapies is the need for the clinical community to give more thought to the degree of benefit it considers to be a "clinically meaningful difference." If the construct for developing new therapies is to demonstrate a clinically meaningful difference in a generalizable clinical trial, and if most such differences are small, then many marginally incremental therapies will need to be evaluated in trials and used in practice. In addition to the specifics of the disease in the individual patient, the overall magnitude of the disease as a societal issue must be considered; small effects in epidemic diseases deserve serious consideration because of the overall impact of the treatment. For example, in acute coronary syndromes, a relative reduction of 15% in recurrent clinical events has recently been considered clinically important; this level is far below the perceived threshold that drove the sample size calculations for clinical trials just a decade ago. As we develop more incrementally beneficial therapies for the epidemics of hypertension, heart failure, coronary artery disease, and sudden death, it is likely that the minimally important clinical difference will become even smaller.
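To see why a smaller "clinically meaningful difference" drives trial size, the following sketch applies the standard normal-approximation formula for comparing two proportions. The 10% control event rate, the two-sided alpha of 0.05, and the 80% power are assumptions chosen for illustration; only the 15% relative reduction comes from the text, and the 30% figure is a hypothetical larger effect used for comparison.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(control_rate, relative_reduction, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect a relative reduction.

    Uses the usual two-proportion normal approximation; the default
    alpha and power are conventional choices, not values from the article.
    """
    treated_rate = control_rate * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = control_rate * (1 - control_rate) + treated_rate * (1 - treated_rate)
    difference = control_rate - treated_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / difference**2)


if __name__ == "__main__":
    for reduction in (0.30, 0.15):
        n = patients_per_arm(control_rate=0.10, relative_reduction=reduction)
        print(f"{reduction:.0%} relative reduction: about {n} patients per arm")
```

Because the required sample size scales roughly with the inverse square of the absolute difference, halving the target relative reduction more than quadruples the number of patients under these assumptions.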
| Principle 2: Qualitative Interactions Are Uncommon, but Quantitative Interactions Are Usual |
|---|
The clinical consequence of this finding is that when a therapy is shown to be beneficial for patients with a clinical condition, the therapy can be applied systematically to the population in clinical practice. The burden is on the clinician to justify failing to treat rather than having to justify treating each patient with the diagnosis. This approach allows practices and health systems to develop clinical practice standards and performance measures that can be introduced into hospitals and clinics to ensure reliable use of effective practices.
A quantitative interaction occurs when there is a significant difference in response to treatment in one group compared with another, but the direction of the treatment effect (benefit or harm) is the same in both groups. Quantitative interactions are common, and the sicker patients almost always have a greater benefit from treatment than do the less sick patients. This finding that sicker patients derive more benefit from treatment is at odds with what practitioners commonly observe in their own patients, which is that less sick patients have better outcomes with treatments. This intuitive lesson from "clinical experience" is incorrect, of course, because it cannot take into account the fact that less sick patients also do better without treatment.
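The paradox described above can be made concrete with simple arithmetic. The sketch below assumes, purely for illustration, that a therapy produces the same 20% relative risk reduction in every patient and compares a hypothetical low-risk group (5% untreated event rate) with a hypothetical high-risk group (30%). None of these figures is taken from a trial.

```python
def treatment_summary(baseline_risk, relative_reduction):
    """Return (treated risk, absolute risk reduction, number needed to treat).

    Assumes a constant relative reduction across risk groups, which is a
    simplifying assumption for illustration only.
    """
    treated_risk = baseline_risk * (1 - relative_reduction)
    absolute_reduction = baseline_risk - treated_risk
    number_needed_to_treat = 1 / absolute_reduction
    return treated_risk, absolute_reduction, number_needed_to_treat


if __name__ == "__main__":
    for label, risk in (("less sick", 0.05), ("sicker", 0.30)):
        treated, arr, nnt = treatment_summary(risk, relative_reduction=0.20)
        print(
            f"{label}: untreated {risk:.0%}, treated {treated:.0%}, "
            f"absolute reduction {arr:.1%}, NNT about {nnt:.0f}"
        )
```

In this hypothetical, the less sick patients still have fewer events on treatment than the sicker patients, which matches what the practitioner sees, yet the sicker patients avoid several times as many events per patient treated.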
This principle has important implications for treatment selection. Rather than selecting patients who have the best outcomes with a given therapy, the important construct is to select patients in whom the outcome with therapy is most favorable compared with what would have happened without the therapy. Multiple studies have shown that selection of patients for angiography14 and revascularization14–16 tends to err toward low-risk patients who get less benefit rather than high-risk patients who get the greatest benefit. Similarly, the elderly are less likely to be treated with secondary prevention therapies, despite consistent findings that show greater benefit in older versus younger patients.17 This pattern of treatment selection needs to change, and doing so will require that clinicians understand the paradox that patients who benefit the most from treatment are not necessarily those who tend to do best with therapy.
Multiple studies have also demonstrated that patients enrolled in clinical trials tend to be different from the general population of treated patients. Early studies of coronary bypass grafting enrolled only a small minority of eligible patients.18 Later studies documented a paucity of women and older patients in trials evaluating acute MI.19 A recent evaluation of trials funded by the National Heart, Lung, and Blood Institute (NHLBI) found little change in the underrepresentation of women in studies enrolling both sexes,20 but single-sex studies focused on women have resulted in women accounting for 54% of the overall study population in NHLBI-funded studies. Lee and colleagues21 recently demonstrated that enrollment of the elderly and women in large acute coronary syndrome trials continues to lag behind observed community demographics, and other studies have demonstrated that excluded patients have a much higher mortality rate. Thus, despite the general principles of interactions, the quantification of these observations is often lacking in clinical trials because of lack of enrollment of relevant populations.
A popular notion is that postmarketing surveillance can make up for this deficiency in the ability to generalize. Indeed, observational postmarketing studies can provide support for broader use of therapies that have been proved effective,22,23 but these studies are plagued by uncertainty about confounding factors that cannot be excluded as a cause for observed differences in outcomes without randomization. There is little that a clinician can rationally do about this issue except to participate in clinical trials that enroll relevant populations.
| Principle 3: Unintended Targets Are Common |
|---|
-blockade raises doubts about blood-pressure lowering as a surrogate, because it apparently does not carry all of the effect on clinical outcome,25 and the question of the degree to which the blood pressure alone is the critical issue in antihypertensive therapy remains under debate. These lessons should be sobering to the practicing clinician. At the time of the Cardiac Arrhythmia Suppression Trial (CAST), as described in the first part of this series of articles, antiarrhythmic drugs were being used in hundreds of thousands of patients for asymptomatic ventricular arrhythmias.26 The inability of practitioners to detect the excess risk of sudden death from their individual practices led to a large number of preventable deaths. Before it was recognized that substantial toxicity was occurring with mibefradil, a novel calcium channel blocker, >400 000 patients had been treated.27 The problems encountered included heart failure, heart block, and sudden death, hardly clinical events that are difficult to detect. These examples point out that relying on clinical experience to decide which therapies to use is not adequate. A similar experience has now occurred with cerivastatin,28 and questions have been raised about cyclooxygenase-2 inhibitors.29
The common occurrence of the unintended target has implications for the clinical dictum, "Do no harm." Indeed, anyone who has sat on a Data Monitoring Committee for a large clinical trial is aware of the many unusual but real examples of unpredictable harm done to individuals by therapies that, on average, are significantly beneficial. Perhaps a better dictum would be: "Always attempt to do more good than harm with treatment selection."
| Principle 4: Interactions Are Unpredictable |
|---|
In the development of glycoprotein (GP) IIb/IIIa inhibitors, it was known that the interaction with thrombin inhibitors would be critical. Initial trials evaluated the use of abciximab as an adjunct to percutaneous coronary intervention (PCI), and full-dose heparin was used. The Evaluation of c7E3 for the Prevention of Ischemic Complications (EPIC) trial demonstrated a reduction in ischemic events but an increase in hemorrhage.30 This combination of outcomes led to a request by the Food and Drug Administration for additional trials to attempt to reduce the bleeding. In subsequent trials, the dose of heparin was reduced; not only was bleeding reduced, but in indirect comparisons the magnitude of the benefit of abciximab increased. The investigators had no conceptual basis for a better effect in preventing ischemic events by reducing the dose of thrombin inhibitor below levels shown to be effective in many previous studies of heparin in PCI. In retrospect, perhaps this situation resulted from prevention of bleeding with plaque hemorrhage and hypotension, both of which may precipitate new ischemic events.
The concomitant use of aspirin and ACE inhibitors provides another instructive example. Conceptual reasons can be garnered for an additive benefit of these two classes of drugs in the treatment of patients after MI. However, physiological experiments and observational studies provided ample reason to be concerned that aspirin would nullify the benefits of ACE inhibitors. A careful systematic overview in post-MI patients clarified the issue by demonstrating that ACE inhibitors were beneficial in patients treated with aspirin, although the magnitude of the benefit was less than in patients not treated with aspirin.9
The disastrous experience with mibefradil underscores how complex and difficult this issue can be. During its development, investigators noted the metabolism of mibefradil by cytochrome P-450 pathways and raised concern about pharmacokinetic interactions with other drugs metabolized by this pathway.31 Most experts and the manufacturer of the drug felt that these interactions would not cause major clinical difficulties. Unfortunately, a number of complications resulted, including deaths, before the drug was pulled from the market.
These examples point out that assumptions by clinicians about untested combinations of potent therapies may not only be incorrect, but they could also lead to widespread negative outcomes. Caution is needed to withstand the temptation to prescribe untested combinations. The imperative is growing to conduct more trials with the use of a factorial design so that such interactions can be sorted out.32 These examples also bring up the important issue of the interaction of the clinical community with regulatory agencies. Unless regulatory agencies require trials to include more representative patients and to evaluate therapies in the context of the complex clinical world of multiple interactions and comorbidities, it is unlikely that the medical products industry will be motivated to conduct such studies. However, even when everything possible is done before marketing, unforeseen effects of therapies will occur, and the clinical community needs to enhance its interaction with regulators to avoid missing important information from postmarketing surveillance.
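A factorial design makes such interactions measurable because every combination of two therapies is randomized. The sketch below uses invented event rates for a hypothetical 2×2 trial of drug A and drug B; it compares the effect of A in patients not receiving B with its effect in patients receiving B, on the absolute risk scale. The rates are assumptions chosen to show an attenuated benefit, similar in spirit to the aspirin and ACE inhibitor example, and are not data from any study.

```python
def factorial_effects(event_rates):
    """Summarize a 2x2 factorial trial on the absolute risk scale.

    event_rates maps (receives_a, receives_b) to an event rate.
    Returns (effect of A without B, effect of A with B, interaction),
    where negative effects mean fewer events with drug A.
    """
    effect_a_without_b = event_rates[(True, False)] - event_rates[(False, False)]
    effect_a_with_b = event_rates[(True, True)] - event_rates[(False, True)]
    interaction = effect_a_with_b - effect_a_without_b
    return effect_a_without_b, effect_a_with_b, interaction


if __name__ == "__main__":
    # Hypothetical event rates for the four randomized cells.
    rates = {
        (False, False): 0.12,  # neither drug
        (True, False): 0.09,   # drug A only
        (False, True): 0.10,   # drug B only
        (True, True): 0.09,    # both drugs
    }
    without_b, with_b, interaction = factorial_effects(rates)
    print(f"Effect of A without B: {without_b:+.2f}")
    print(f"Effect of A with B:    {with_b:+.2f}")
    print(f"Interaction:           {interaction:+.2f}")
```

In this invented example drug A helps in both settings, but less when B is also given. A trial that tested A only in patients already taking B, or only in those not taking it, could not have estimated that difference.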
| Principle 5: Long-Term Effects Deserve Evaluation |
|---|
A graphic example that has captured the attention of the national press is the diet combination fenfluramine-phentermine (fen-phen).35 In small clinical trials performed over short periods of time, the combination caused weight loss. Only longer-term clinical observations raised the issue of valvular insufficiency.36 Yet, because longer-term randomized clinical trials were not done, the community is unclear about the extent to which the valvular lesions caused irreversible harm.
More recent examples include the Heart and Estrogen/progestin Replacement Study (HERS) and Prospective Randomized Flosequinan Longevity Evaluation (PROFILE) trials. In HERS, the administration of hormone replacement therapy to postmenopausal women with an intact uterus and with documented coronary heart disease led to excess thrombotic events in the first year and fewer thrombotic events between the first and fourth years of follow-up.37 In PROFILE, flosequinan was shown to improve quality of life in the first several months of treatment, but over subsequent follow-up, both quality of life and survival were adversely affected.38 This same pattern emerged in the evaluation of vesnarinone.39 The recent example of the Long-term Intervention with Pravastatin in Ischaemic Disease (LIPID) study demonstrates the benefit of monitoring patients even when the randomized portion of the trial is stopped early for benefit. In this case, the event-rate curves have continued to diverge, enabling the investigators to have more power to look at key subgroups and ancillary questions and to add important extra data to support the benefits of pravastatin therapy.40
The increasing number of implantable devices in cardiovascular medicine raises similar issues. No one would advocate withholding access to new cardiac valves until after a decade of follow-up, yet we know differences among valves that could be highly significant will not emerge until long-term follow-up. Recently, a clinical trial of a new valve aimed at reducing thrombotic events unfortunately demonstrated an excess rate of perivalvular leak compared with a standard valve41; it is unlikely that this difference could have been detected without a randomized trial. Similar issues will exist for newer biological approaches to dealing with restenosis, such as radiation or chemotherapy.
These findings should motivate clinicians to help develop more long-term evidence about the effects of chronically administered therapies. Multiple therapies have been recalled recently through postmarketing surveillance,42 but the fact that postmarketing surveillance is sometimes effective should not obscure the obvious problems with nonrandomized postmarketing observation. The absence of a control group or a true denominator often makes it difficult or impossible to know whether observed findings are due to random variation or whether the observed rate of events is even above the expected rate for the population. Unfortunately, too little research support is oriented toward understanding how to conduct long-term studies in a rapidly changing medical care environment.
| Principle 6: Class Effects Can Be Uncertain |
|---|
The Antiplatelet Trialists' Collaboration made the point that antiplatelet drugs generally reduce ischemic events.46 However, the effort tended to lump all such agents together. When the trials with aspirin alone were evaluated, little effect could be discerned in patients with peripheral arterial disease. Whether this represents an agent-specific lack of effect continues to be debated, and the Food and Drug Administration labeling for aspirin did not include an indication for patients with peripheral arterial disease. The Clopidogrel versus Aspirin in Patients at Risk of Ischaemic Events (CAPRIE) trial then demonstrated a small but significant benefit of clopidogrel over aspirin,43 and interestingly, the largest benefit of clopidogrel was in patients with peripheral arterial disease. Most recently, despite the profound effects of orally administered GP IIb/IIIa inhibitors on platelet aggregation, it has been found that not only do they fail to reduce ischemic events, but their administration actually increases ischemic events.44
In the arena of intravenous GP IIb/IIIa inhibitors, tremendous debates have occurred about "class effects." Recently, a direct comparison pitted tirofiban against abciximab in the setting of PCI.45 Not only did tirofiban fail as a noninferior therapy compared with abciximab, but it turned out to be significantly inferior for the primary end point of 30-day major cardiac events. However, with longer-term follow-up,46 the significant benefit of abciximab was no longer present, raising a justifiable debate about the appropriate duration of follow-up to demonstrate therapeutic superiority or equivalence. Just before this direct comparison as an adjunct to PCI, abciximab was found to be ineffective in medical stabilization of acute coronary syndromes,47 whereas both tirofiban and eptifibatide had been shown to be effective for acute coronary syndromes.48,49 More recent pharmacokinetic data50 demonstrated that the dose of abciximab may have variable inhibition of platelet aggregation, raising the issue of whether different doses in the same "class" act differently. Thus, this class of drugs is distinguished by different molecules with proven effectiveness in different clinical situations, and multiple clinical studies have been incapable of predicting these findings on the basis of biological data. Assuming a homogeneous class effect would have been a major error.
Although differences among the ß-blockers have been recognized for some time, they have usually been lumped into the same class. In the treatment of heart failure, 3 ß-blockers have been shown to reduce mortality, whereas a fourth failed in a major trial.11,12,51 Furthermore, indirect comparisons have claimed that carvedilol has a greater effect on left ventricular (LV) function than metoprolol, stimulating a direct comparative trial.52
A variety of statins have been developed with the goal of preventing atherosclerotic events by lowering low-density lipoprotein (LDL) cholesterol. Several of the statins have been evaluated in definitive trials and have demonstrated profound effects on clinical events. Others have been shown to produce greater lowering of LDL cholesterol but in the absence of definitive mortality trials. The clinical community must decide whether to believe the class effect and use the drugs interchangeably on the basis of the surrogate of LDL-lowering, or whether there is enough distrust of the class effect to preferentially use the statins that have been shown to reduce mortality. Some experts have voiced concern that the different molecular structures may lead to different clinical effects,53 whereas others have focused on the mounting evidence that the effect of statins on clinical events may be due to multiple effects beyond LDL-cholesterol lowering.54
The ACE inhibitors have produced a dramatic effect on mortality and morbidity in patients with heart failure, and multiple ACE inhibitors have been shown to be effective in mortality trials. However, only one ACE inhibitor, ramipril, has been shown to have a definitive major influence on cardiac events in patients with preserved LV function.55 This ACE inhibitor, like several others, has been shown to have a high level of tissue penetration. Again, the clinical community is left in the quandary of whether to embrace the class effect or to preferentially use the particular ACE inhibitor that has been shown to benefit patients with preserved LV function.
The area of thrombin inhibition has also created a major controversy. Unfractionated heparin became standard therapy in acute coronary syndromes without definitive trials by current standards. Subsequently, multiple trials have been completed with several different low-molecular-weight heparins. When all of the low-molecular-weight heparin trials are combined, the evidence for superiority over unfractionated heparin is not definitive.56 However, a particular low-molecular-weight heparin, enoxaparin, has been found to be superior to unfractionated heparin on its own in 2 separate trials. Because each of the low-molecular-weight heparin preparations has different pharmacological properties, many have made the argument that the different trials of low-molecular-weight heparin should not be combined.57
This issue of class effect raises major questions for both doctor-patient interactions and health system decisions. When a particular drug or device has been shown to have a specific benefit, would it be appropriate to substitute a less-expensive agent from the same class? Given the uncertainties reviewed above, clinicians should carefully evaluate such situations to ensure that there is not a substantial question about the class effect in that case. If such a question arises, the clinician must perform the difficult task of weighing that uncertainty against the likelihood of compliance with more affordable medicine for the patient or the financial benefit to the health system. In any case, the examples reviewed above make a strong case against therapeutic substitutions being administrative decisions, and for clinicians having a primary role when therapeutic substitution is considered.
In the final installment of this four-part series, we will present the last 5 of the 11 principles we have gleaned from the past 15 years of clinical research in cardiovascular medicine. Together, these principles should help practicing clinicians understand the statistical issues that surround clinical trials and appropriately apply the lessons of those trials in their daily care of patients with heart diseases.