B-Type Natriuretic Peptide and Clinical Judgment in Emergency Diagnosis of Heart Failure
Analysis From Breathing Not Properly (BNP) Multinational Study
Background— We sought to determine the degree to which B-type natriuretic peptide (BNP) adds to clinical judgment in the diagnosis of congestive heart failure (CHF).
Methods and Results— The Breathing Not Properly Multinational Study was a prospective diagnostic test evaluation study conducted in 7 centers. Of 1586 participants who presented with acute dyspnea, 1538 (97%) had clinical certainty of CHF determined by the attending physician in the emergency department. Participants underwent routine care and had BNP measured in a blinded fashion. The reference standard for CHF was adjudicated by 2 independent cardiologists, also blinded to BNP results. The final diagnosis was CHF in 722 (47%) participants. At an 80% cutoff level of certainty of CHF, clinical judgment had a sensitivity of 49% and specificity of 96%. At 100 pg/mL, BNP had a sensitivity of 90% and specificity of 73%. In determining the correct diagnosis (CHF versus no CHF), adding BNP to clinical judgment would have enhanced diagnostic accuracy from 74% to 81%. In those participants with an intermediate (21% to 79%) probability of CHF, BNP at a cutoff of 100 pg/mL correctly classified 74% of the cases. The areas under the receiver operating characteristic curve were 0.86 (95% CI 0.84 to 0.88), 0.90 (95% CI 0.88 to 0.91), and 0.93 (95% CI 0.92 to 0.94) for clinical judgment, for BNP at a cutoff of 100 pg/mL, and for the 2 in combination, respectively (P<0.0001 for all pairwise comparisons).
Conclusions— The evaluation of acute dyspnea would be improved with the addition of BNP testing to clinical judgment in the emergency department.
Received May 9, 2002; revision received May 21, 2002; accepted May 28, 2002.
We are in the midst of a chronic disease epidemic of congestive heart failure (CHF) worldwide.1–8 This epidemic is marked by a rapid rise in prevalent cases over the past decade that is due in part to the aging population and improved survival in patients with other cardiovascular conditions.1–8 However, the diagnosis of CHF has been fundamentally unchanged and has been based on the clinical history, physical examination, ECG, chest x-ray, and assessment of left ventricular function over the past several decades. B-type natriuretic peptide (BNP) is a cardiac neurohormone specifically secreted from the cardiac ventricles as a response to ventricular volume expansion, pressure overload, and resultant increased wall tension.9,10 The present (2001) American College of Cardiology/American Heart Association practice guidelines for the evaluation and management of CHF state that the role of blood BNP in the identification of patients with CHF remains to be fully clarified.11 We sought to specifically determine the added diagnostic value of BNP over the conventional information obtained in the evaluation of patients with acute dyspnea.
The Breathing Not Properly (BNP) Multinational Study was an international 7-center prospective study (5 US centers and 2 European centers). The study was conducted from April 1999 to December 2000. The institutional review boards of all study centers approved the study protocol, and all participants provided written informed consent.
A total of 1666 patients presenting to the emergency departments (EDs) of the study centers with a primary complaint of dyspnea were screened. Eighty patients were excluded from the study on the basis of the protocol exclusion criteria, which included the presence of advanced renal failure (calculated creatinine clearance <15 mL/min), acute myocardial infarction, and overt cause of dyspnea, including chest wall trauma or penetrating lung injury. A total of 1586 participants were enrolled in the present study. Of those 1586 individuals, 48 had records that lacked the ED physician assessment of clinical probability of CHF; those patients were excluded, leaving a final set of 1538 to be analyzed.
Baseline demographics, clinical history, and objective assessment of clinical signs were gathered by trained ED research personnel who were present continuously during the evaluation of the consenting individuals. All participants were seen and examined by an attending physician, and findings from the ECG, chest x-ray, and blood tests were categorized in a structured checklist. On disposition from the ED, research personnel recorded the attending physician’s estimate of clinical probability of CHF on a visual analog scale.
Measurement of BNP
During initial evaluations, a blood sample (5 mL) was collected into tubes containing potassium EDTA (1 mg/mL blood). In a 15-minute period, BNP was measured by using the Triage BNP Test (Biosite Inc). The Triage BNP Test is a fluorescence immunoassay for the quantitative determination of BNP in whole-blood and plasma specimens. Precision, analytical sensitivity, and stability characteristics of the system have been previously described.12 In brief, the coefficient of variation for intra-assay precision has been reported to be 9.5%, 12.0%, and 13.9%, and the coefficient of variation for interassay precision is known to be 10.0%, 12.4%, and 14.8% for BNP levels of 28.8, 584.0, and 1180.0 pg/mL, respectively.13 The measurable range of the BNP assay was 5.0 to 1300.0 pg/mL. Consistent with concurrent research using the Triage BNP Test, each sample was tested in triplicate to minimize variation from single observations and for internal controls. Final results were reported as the mean of the 3 samples. Of note, the current approved clinical method is to measure BNP in a single run of the test. Test results were kept in separated data binders linked only by a study code; thus, both ED physicians and adjudicating cardiologists were blinded regarding the BNP results.
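The triplicate reporting described above can be sketched in a few lines. This is a minimal illustration only: the function name `summarize_replicates` and the example readings are hypothetical, and the coefficient of variation is computed here from the sample standard deviation, which is an assumption rather than the assay manufacturer's documented procedure.

```python
from statistics import mean, stdev


def summarize_replicates(values_pg_ml):
    """Return (mean, coefficient of variation in %) for replicate BNP readings."""
    if len(values_pg_ml) < 2:
        raise ValueError("need at least two replicates to estimate variation")
    m = mean(values_pg_ml)
    # CV = sample standard deviation expressed as a percentage of the mean.
    cv_percent = 100.0 * stdev(values_pg_ml) / m
    return m, cv_percent


# Hypothetical triplicate readings in pg/mL, not study data.
reported_value, cv = summarize_replicates([92.0, 101.0, 107.0])
print(f"reported BNP = {reported_value:.1f} pg/mL, CV = {cv:.1f}%")
```

Averaging three runs reduces the influence of any single reading, which matters most near a decision threshold such as 100 pg/mL.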
Reference Standard Definition of Heart Failure
Approximately 30 days after the ED visit, the case report form (excluding the estimate of CHF probability), ECG, chest x-ray, echocardiogram, and all other clinical tests and consultations were reviewed by 2 independent cardiologists at the local study center who were not treating physicians. In addition, case report information was used to calculate the Framingham scores (requiring 2 major or 1 major and 2 minor criteria for CHF) and National Health and Nutrition Examination Survey (NHANES) scores (requiring ≥3 points for CHF) for CHF. After reviewing all information, if agreement was achieved, then the case was categorized as one of the following: (1) dyspnea due to CHF, (2) history of CHF but dyspnea due to noncardiac cause, or (3) dyspnea due to noncardiac cause. In the event of disagreement (n=164; 10.7%, range 0% to 24.3% across 7 sites), cases were adjudicated by the study end-points committee. For binary analyses of CHF versus no CHF, groups 2 and 3 were combined.
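The scoring thresholds and the collapsing of the 3 adjudicated categories into a binary outcome can be written out explicitly. The sketch below assumes the counts of major and minor Framingham criteria and the NHANES point total have already been tallied from the case report form; the function names are hypothetical and the individual criteria are not reproduced here.

```python
def framingham_positive(major: int, minor: int) -> bool:
    """Positive if 2 major criteria, or 1 major plus 2 minor criteria, are met."""
    return major >= 2 or (major >= 1 and minor >= 2)


def nhanes_positive(points: int) -> bool:
    """Positive if the NHANES score is 3 points or more."""
    return points >= 3


def is_chf(adjudicated_category: int) -> bool:
    """Collapse the 3 adjudicated categories into CHF versus no CHF.

    1 = dyspnea due to CHF
    2 = history of CHF but dyspnea due to noncardiac cause
    3 = dyspnea due to noncardiac cause
    """
    if adjudicated_category not in (1, 2, 3):
        raise ValueError("category must be 1, 2, or 3")
    # Groups 2 and 3 are combined as "no CHF" for binary analyses.
    return adjudicated_category == 1


# Hypothetical case: 1 major and 2 minor criteria, NHANES score of 2.
print(framingham_positive(major=1, minor=2), nhanes_positive(2), is_chf(2))
```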
Sample Size and Power
The primary end point was diagnostic accuracy at the optimum cutoff of BNP and at ≥80% ED physician estimate of clinical probability of CHF. The following assumptions were made in the sample size calculation: diagnostic accuracy of the ED physician, 85%; prevalence of CHF as a final diagnosis in the ED dyspnea population, 30%; and effect size of ≥5% absolute difference between clinical judgment and BNP, β=0.20 and α=0.05 (2-sided). The calculated sample size of 1613 was set for study to have 80% power to observe a ≥5% absolute difference in diagnostic accuracy between the groups. With the 1538 participants evaluated in this analysis having a higher prevalence of CHF and larger effect size than expected, the observed power was 99%.
Baseline characteristics were reported in counts and proportions or mean±SD values as appropriate. Univariate comparisons were made with χ2 or 2-sample t tests as appropriate. Because this was the largest and most broadly inclusive population with dyspnea to be tested for BNP to date, we decided a priori to derive the optimum cut point for BNP from the parent population of 1586 participants. We arrived at the optimum cut point of 100 pg/mL by selecting the point on the receiver operating characteristic (ROC) curve that jointly maximized sensitivity and specificity. The optimum cut point for ED clinical certainty of CHF was chosen at ≥80%, a cut point providing reasonable and actionable certainty of a cardiovascular syndrome.14 Decision statistics were computed from 2×2 tables and reported as sensitivity, specificity, and positive and negative predictive value. Diagnostic accuracy was computed as the sum of the concordant cells divided by the sum of all cells in the 2×2 table. Agreement between clinical judgment and BNP was quantified by using Cohen’s κ statistic. The positive likelihood ratio was taken as the slope of the ROC curve for the optimum cut point and was expressed as sensitivity/(1−specificity). Pairwise comparisons among the areas under ROC curves were made by using DeLong’s method.15 Logistic regression was used to combine clinical judgment with BNP data in predicting final adjudicated diagnosis, generating a graphic displayed as a heart failure diagnosis nomogram. Judgments of 0% were set to 1%, and judgments of 100% were set to 99% so that the log of the odds could be computed.
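The decision statistics defined in this section follow directly from the 4 cells of a 2×2 table. The sketch below shows the standard formulas; the class name `TwoByTwo`, the function `cohen_kappa`, and all counts are hypothetical illustrations and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table of test result versus reference diagnosis."""
    tp: int  # test positive, CHF present
    fp: int  # test positive, CHF absent
    fn: int  # test negative, CHF present
    tn: int  # test negative, CHF absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self) -> float:
        # Concordant cells divided by the sum of all cells.
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def positive_lr(self) -> float:
        return self.sensitivity / (1.0 - self.specificity)


def cohen_kappa(both_pos: int, first_only: int, second_only: int, both_neg: int) -> float:
    """Cohen's kappa for agreement between two binary classifications."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    p_first = (both_pos + first_only) / n
    p_second = (both_pos + second_only) / n
    # Agreement expected by chance if the two classifications were independent.
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
table = TwoByTwo(tp=90, fp=20, fn=10, tn=80)
kappa = cohen_kappa(both_pos=40, first_only=10, second_only=20, both_neg=30)
print(table.sensitivity, table.specificity, table.accuracy, table.positive_lr, kappa)
```

A κ near 0 indicates that two tests agree little beyond chance, which is the sense in which two predictors can be described as relatively independent.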
The demographics for the study sample were as follows: age 64.0±16.7 (range 18 to 105) years; 883 (55.7%) men and 703 (44.3%) women; and 773 (48.7%) white, 715 (45.1%) African American, and 98 (6.2%) other race. Additional baseline characteristics are reported in Table 1 according to the ED attending physician’s judgment of CHF probability in the following categories: low, 0% to 20%; intermediate, 21% to 79%; and high, 80% to 100%. Of note, 511 (33.2%) participants had a prior history of CHF by self-report or by records available to the ED physician. The frequency histogram by decile of clinical probability for CHF is given in Figure 1. The histogram was trimodal, indicating that ED physicians tended to be relatively certain in either establishing or rejecting the diagnosis of CHF, with an additional hump at 50% certainty.
Symptoms and Physical Examination Findings
All participants in the present study required dyspnea on exertion or at rest for study inclusion. Table 2 indicates that the cardinal symptoms and signs (paroxysmal nocturnal dyspnea, elevated jugular venous pressure, pulmonary rales, cardiac enlargement, third heart sound, hepatic enlargement, and edema) of CHF were more common as ED clinical judgment was more certain of the diagnosis of CHF. Conversely, approximately one fourth of all participants had wheezing, regardless of pretest probability for CHF.
Diagnostic Testing Performed
All participants underwent ECG, and most, 1476 (96.0%), had chest x-rays performed in the ED. Table 3 lists the results of these tests stratified by the clinical probability of CHF as assessed by the ED physicians. All ECG abnormalities were more frequent in those with high clinical probabilities of CHF. Likewise, chest x-ray abnormalities indicating signs of CHF were more frequent in the high probability of CHF category. However, the frequency of pneumonic infiltrate did not differ significantly across the categories (all <10%).
Reference Standard for Heart Failure
Two independent cardiologists at each study center evaluated all clinical data, including echocardiograms with reported ejection fractions in 689 (44.8%) cases. There was initial agreement between the 2 cardiologists in 1374 (89.3%) of the cases. The remaining 164 cases required adjudication locally between the 2 cardiologists, including requesting additional data from the treating physicians and, finally, review by the end-points committee if disagreement remained. The diagnosis of CHF (n=722) was supported by positive NHANES and Framingham scores in 599 (83.0%) and 621 (86.0%) individuals, respectively. The cardiologists reported that the diagnosis of CHF was supported in 587 (81.3%) by chest x-ray, 448 (62.0%) by echocardiography, 34 (4.7%) by nuclear ventriculography, and 55 (7.6%) by cardiac catheterization. In addition, the cardiologists reported that 490 (67.9%) of those with CHF had an expected response to CHF therapy. Conversely, 684 (91.4%) of those 748 found not to have CHF had cumulative evidence from chest x-ray, echocardiography, or ventriculography suggesting that CHF was not the cause of dyspnea.
The diagnostic accuracy for high (80% to 100%) ED probability of CHF on clinical grounds was 74.0%. The other decision statistics for this category were as follows: sensitivity 49% (95% CI 47% to 52%), specificity 96% (95% CI 95% to 97%), positive predictive value 91% (95% CI 90% to 92%), negative predictive value 68% (95% CI 66% to 71%), and positive likelihood ratio 11.5. Diagnostic accuracy for BNP ≥100 pg/mL was 81.2%. The other decision statistics for BNP were as follows: sensitivity 90% (95% CI 89% to 92%), specificity 73% (95% CI 71% to 73%), positive predictive value 75% (95% CI 72% to 77%), negative predictive value 90% (95% CI 88% to 91%), and positive likelihood ratio 3.4. For a composite decision based on clinical probability of 80% to 100% or BNP >100 pg/mL, or both, the diagnostic accuracy was 81.5%, sensitivity was 94% (95% CI 93% to 95%), specificity was 70% (95% CI 68% to 73%), positive predictive value was 74% (95% CI 71% to 76%), negative predictive value was 93% (95% CI 92% to 94%), and the positive likelihood ratio was 3.2. As an overall measure of diagnostic value, BNP levels ≥100 pg/mL would have added to clinical judgment, thus boosting accuracy from 74.0% to 81.5% (P<0.0001) (Figure 2). Overall, BNP at a cut point of 100 pg/mL and clinical judgment ≥80% certainty were relatively independent indicators, as reflected by a κ value of 0.30 (P<0.0001). In participants without a self-reported history of CHF (n=1027), the diagnostic accuracy of BNP was 80.4%. In other important subgroups, including men, women, whites, African Americans, the elderly (aged >70 years), and those with ischemic heart disease, the diagnostic accuracy of BNP was 83.6%, 78.0%, 80.7%, 81.0%, 78.1%, and 81.2%, respectively. Compared through a range of values with the use of ROC curves (Figure 3), the areas under the ROC curve were 0.86, 0.90, and 0.93 for clinical judgment, for BNP, and for the 2 in combination, respectively (P<0.001 for all pairwise comparisons).
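As a consistency check, diagnostic accuracy can be approximately recovered from the reported sensitivity, specificity, and prevalence of CHF (722 of 1538), because accuracy is the prevalence-weighted average of the two. The sketch below uses the rounded percentages quoted above, so the results are expected to agree with the reported figures only to within rounding; the function names are hypothetical.

```python
def accuracy_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy as the prevalence-weighted average of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def positive_lr(sensitivity: float, specificity: float) -> float:
    """Positive likelihood ratio, sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


# Prevalence of adjudicated CHF in the analyzed sample.
PREVALENCE = 722 / 1538

# Rounded (sensitivity, specificity) pairs as reported in the text.
REPORTED = {
    "clinical judgment >=80%": (0.49, 0.96),
    "BNP >=100 pg/mL": (0.90, 0.73),
    "either criterion": (0.94, 0.70),
}

for label, (sens, spec) in REPORTED.items():
    acc = accuracy_from_rates(sens, spec, PREVALENCE)
    print(f"{label}: accuracy ~{acc:.1%}, LR+ ~{positive_lr(sens, spec):.1f}")
```

Because the inputs are rounded to whole percentages, the recomputed accuracies and likelihood ratios are close to, but not identical to, the reported values.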
Heart Failure Diagnosis Nomogram
Figure 4 displays a CHF diagnosis nomogram with the estimate of pretest probability being the certainty in the ED that dyspnea is due to CHF. The rates of actual CHF by final adjudicated diagnosis were 17.1%, 33.6%, and 49.3% for the low-, intermediate-, and high-probability groups, respectively (P<0.0001 for trend). The middle line represents BNP level in picograms per milliliter at the time of presentation. When a straight line is drawn through the pretest probability and BNP level in picograms per milliliter, the posttest probability is found on the right line. For example, a clinical judgment of 20% probability of CHF with a BNP of 1000 pg/mL yields an ≈85% probability of CHF based on these 2 predictors. As indicated, BNP has the greatest value as a diagnostic test in the intermediate zone of probability. In this category, BNP ≥100 pg/mL correctly classified 315 (74.0%) of the 427 cases as CHF or not CHF. Importantly, in this intermediate group, only 30 (7.0%) of 427 had a BNP level <100 pg/mL and a final adjudicated diagnosis of CHF. Of note, in 721 participants with low (≤20%) ED probability of CHF, 123 (17.1%) of 721 indeed had a final adjudicated diagnosis of CHF. Of these 123 individuals, 111 (90.2%) would have had the misdiagnosis corrected if the additional information of BNP ≥100 pg/mL had been provided. Conversely, in the cases in which the ED clinician was completely certain the diagnosis was not CHF (n=232), BNP was <100 pg/mL in 80.6% and would have been confirmatory of a final diagnosis of noncardiac dyspnea in 182 (85.4%) of 213 and would have corrected the diagnosis in 14 (73.7%) of 19. Similarly, in the cases in which the ED clinician was 100% certain that CHF was present (n=109), BNP was ≥100 pg/mL in 89.0% and would have been confirmatory of a final diagnosis of CHF in 96 (92.3%) of 104 and would have corrected the final diagnosis in 4 (80.0%) of 5.
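The general form of a nomogram of this kind can be sketched as a logistic model that combines the log odds of the clinician's pretest probability with the BNP level. The sketch below is illustrative only. The coefficients are hypothetical placeholders, not the fitted values behind Figure 4, and the use of the natural logarithm of BNP as a predictor is an assumption; only the resetting of 0% and 100% judgments to 1% and 99% comes from the statistical methods.

```python
import math


def clamp_probability(p: float, low: float = 0.01, high: float = 0.99) -> float:
    """Set judgments of 0% to 1% and 100% to 99% so the log odds are finite."""
    return min(max(p, low), high)


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inverse_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# HYPOTHETICAL coefficients for illustration; the fitted values are not given in the text.
INTERCEPT = -4.0
SLOPE_JUDGMENT = 0.8
SLOPE_LOG_BNP = 0.8


def posttest_probability(pretest: float, bnp_pg_ml: float) -> float:
    """Combine pretest probability and BNP on the log-odds scale."""
    if bnp_pg_ml <= 0:
        raise ValueError("BNP must be positive")
    p = clamp_probability(pretest)
    linear = (
        INTERCEPT
        + SLOPE_JUDGMENT * logit(p)
        + SLOPE_LOG_BNP * math.log(bnp_pg_ml)  # assumption: BNP enters on a log scale
    )
    return inverse_logit(linear)


# With these placeholder coefficients the output will not match Figure 4.
print(posttest_probability(pretest=0.20, bnp_pg_ml=1000.0))
```

With positive slopes, the posttest probability rises with either a higher pretest estimate or a higher BNP level, which is the behavior read off by drawing a straight line across the nomogram.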
This is the first large-scale prospective study of BNP as a diagnostic test that incorporates the ED physician’s pretest probability of CHF when blinded to the BNP result. In pilot studies, Cheng, Morrison, Dao, and colleagues12,13,16 found similar additive value of BNP in the clinical diagnosis of CHF by ED physicians. Importantly, ED physician clinical judgment has been shown to have a high diagnostic accuracy, which can be refined in a safe and conservative manner with BNP. In other words, incorporating BNP into the clinical evaluation of CHF raises the diagnostic accuracy by 10% in patients for whom the ED physician has a high confidence of the diagnosis of CHF. Importantly, in the one third of patients for whom the ED physician is uncertain of the diagnosis (intermediate probability), adding BNP to clinical judgment correctly classified 74% of the patients and only misclassified 7% of the patients as not having CHF when the final diagnosis was indeed CHF. Our results, derived from a broad population at 7 centers, which included 44.3% women and 45.1% African Americans, are somewhat different from the results of Dao et al,16 who measured BNP in 250 patients with acute dyspnea, 96% of whom were men and of unspecified race. Notably, Dao et al found a lower cutoff value of 80 pg/mL but a very similar area under the ROC curve for clinical judgment (0.88 versus 0.86 in the present study). However, the area under the ROC curve for BNP in the study of Dao et al was 0.97 compared with 0.90 in the present study; thus, we observed a lower sensitivity and specificity with BNP.16 This can be explained by our more heterogeneous population and the fact that it was a multicenter study, with 7 sets of cardiology reviewers adjudicating the final diagnosis. Furthermore, the Veterans Administration population studied by Dao et al benefited from a comprehensive longitudinal electronic medical record, and, likely, there was less variation in the ascertainment of the original diagnosis and greater precision in the gold standard assessment of CHF.
The present study confirms the value of the careful history and physical examination in patients with dyspnea.17 The sharpest gradients in the cardinal features of CHF were seen across our diagnostic probability categories. Indeed, the symptom of paroxysmal nocturnal dyspnea and the physical examination findings of rales, cardiac enlargement on palpation, third heart sound, and peripheral edema were all 2 to 3 times more likely in those patients with a high probability of CHF. Conversely, wheezing did not appear to be a factor in the clinicians’ ability to discriminate among cases. The ECG and chest x-ray appeared to be valuable to clinicians in assigning a clinical probability of CHF. Of note, only a pneumonic infiltrate appeared to be of little help in making an assignment of CHF probability. Despite the value of a careful clinical examination seen in the present study, it has been shown in several prior studies that the clinical examination for CHF is limited.18–22
The source of plasma BNP is cardiac ventricles, which suggests that BNP may be a more sensitive and specific indicator of ventricular disorders than other natriuretic peptides.23 This release appears to be responsive to wall tension, which, in turn, is affected by a variety of determinants that are deranged in CHF.24–27 The results of the BNP Multinational Study reported in the present study suggest that the biological properties of this peptide make it an attractive test for the acute ED diagnosis of CHF.
The present study has the limitations common to any study that attempts to create a gold standard for a clinical syndrome. Blinded cardiologists used all possible information in making the final adjudicated CHF diagnosis. We attempted to aid in this process by creating standardized CHF scores from 2 prior validated methods for the cardiologists to view with all of the clinical data. We acknowledge that misclassification bias is possible and difficult to quantify. It is also possible that the measurement of BNP could have been confounded by other factors, including acute ischemia or renal insufficiency, in patients who were not excluded on these grounds.28,29 It is unlikely that missing data, either in the pretest, test, or posttest probability categories, have influenced the results, given the fact that the study sample was restricted to 1538 individuals, ensuring complete data in all cases.
We believe that the importance of the present study will be to advance the current state of certainty regarding the usefulness of BNP in the diagnosis of CHF as indicated by the most recently published set of CHF guidelines.11 Our findings are supportive of the recently published European guidelines for the diagnosis and treatment of CHF, which incorporate BNP as a diagnostic test for routine clinical practice.30 In addition to being a useful outpatient screening tool for left ventricular dysfunction, results of the BNP Multinational Study support the use of BNP in the ED.31 Routine use of BNP in the evaluation of suspected heart failure would be largely confirmatory, yet still valuable in cases in which the clinician has a high degree of certainty of the diagnosis. Importantly, BNP would clarify the final diagnosis in a large proportion of cases encountered in the ED. We anticipate that the published nomogram in this article will be useful to ED and other physicians in establishing the diagnosis of CHF in patients with dyspnea of uncertain etiology. This nomogram leverages an objective yet conservative approach, providing a safeguard for the highly subjective clinical assessment of patients who have CHF presenting as dyspnea of uncertain etiology. To put this in context, 90% of the patients who had CHF but were thought by the ED physician to be of low probability (≤20%) would have been correctly diagnosed with a point-of-care blood test, allowing for rapid triage and appropriate care of these patients. Importantly, a final degree of clinical utility is achieved by integration of a careful history, physical examination, ECG, chest x-ray, and BNP level, as demonstrated in the ROC curves.
In conclusion, in a multinational sample of men and women seen in the ED with acute dyspnea, BNP measurement would have added to clinical judgment in establishing a final diagnosis of CHF. In those patients with an intermediate probability of CHF, BNP would have clarified the diagnosis in the majority of cases.
Some financial support and Triage BNP devices and meters were provided by Biosite, Inc. We are indebted to Roberta A. Sullivan, BSN, MPH, for preparation of this manuscript.
This article originally appeared Online on July 1, 2002 (Circulation. 2002;106:r1–r7).
Drs McCullough, Omland, McCord, Wu, Maisel, Abraham, Kazanegra, Hollander, Storrow, and Duc and P. Clopton have received honoraria from Biosite. Drs McCullough, Maisel, McCord, Abraham, Hollander, Storrow, and Omland are consultants for Biosite. Drs McCord, Maisel, and McCullough are members of the speaker’s bureau for Biosite. Dr Wu has received grants from and owns stock in Biosite.
- Graves EJ. US Department of Health and Human Services, Detailed Diagnoses and Procedures, National Hospital Discharge Survey, 1990. Washington, DC: National Center for Health Statistics, Vital and Health Statistics; 1991. Series 13, No. 113, DHHS publication (PHS) 92-1774.
- Ranofsky AL. Inpatient Utilization of Short-Stay Hospitals by Diagnosis. Washington, DC: US Department of Health, Education, and Welfare, National Center for Health Statistics, Vital and Health Statistics; 1974. Series 13, No. 16, DHEW publication (HRA) 75-1767.
- Hunt SA, Baker DW, Chin MH, et al. ACC/AHA guidelines for the evaluation and management of chronic heart failure in the adult: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation. 2001; 104: 2996.
- DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988; 44: 837–845.
- Remes J, Miettinen H, Reunanen A, et al. Validity of clinical diagnosis of heart failure in primary health care. Eur Heart J. 1991; 12: 315–321.
- Wheeldon NM, MacDonald TM, Flucker CJ, et al. Echocardiography in chronic heart failure in the community. Q J Med. 1993; 86: 17–23.
- Davie AP, Francis CM, Love MP, et al. Value of the electrocardiogram in identifying heart failure due to left ventricular systolic dysfunction. BMJ. 1996; 312: 222.
- Cohn JN, Johnson GR, Shabetai R, for the V-HeFT VA Cooperative Studies Group. Ejection fraction, peak exercise oxygen-consumption, cardiothoracic ratio, ventricular arrhythmias and plasma norepinephrine as determinants of prognosis in heart failure. Circulation. 1993; 87 (suppl VI): VI-5–VI-16.
- Tsutamoto T, Wada A, Maeda K, et al. Attenuation of compensation of endogenous cardiac natriuretic peptide system in chronic heart failure: prognostic role of plasma brain natriuretic peptide concentration in patients with chronic symptomatic left ventricular dysfunction. Circulation. 1997; 96: 509–516.
- Luchner A, Stevens TL, Borgeson DD, et al. Differential atrial and ventricular expression of myocardial BNP during evolution of heart failure. Am J Physiol. 1998; 274: 1684–1689.
- Remme WJ, Swedberg K. Guidelines for the diagnosis and treatment of chronic heart failure. Eur Heart J. 2001; 22: 1527–1560.