Inaccuracy of Estimated Resting Oxygen Uptake in the Clinical Setting
Background—The Fick principle (cardiac output = oxygen uptake (O2)/systemic arterio-venous oxygen difference) is used to determine cardiac output in numerous clinical situations. However, estimated rather than measured O2 is commonly used because of complexities of the measurement, though the accuracy of estimation remains uncertain in contemporary clinical practice.
Methods and Results—From 1996 to 2005, resting O2 was measured via the Douglas bag technique in adult patients undergoing right heart catheterization. Resting O2 was estimated by each of 3 published formulae. Agreement between measured and estimated O2 was assessed overall, and across strata of body mass index, sex, and age. The study included 535 patients, with mean age 55 yrs, mean body mass index 28.4 kg/m2; 53% women; 64% non-white. Mean (±standard deviation) measured O2 was 241 ± 57 ml/min. Measured O2 differed significantly from values derived from all 3 formulae, with median (interquartile range) absolute differences of 28.4 (13.1, 50.2) ml/min, 37.7 (19.4, 63.3) ml/min, and 31.7 (14.4, 54.5) ml/min, for the formulae of Dehmer, LaFarge, and Bergstra, respectively (P<0.0001 for each). The measured and estimated values differed by >25% in 17% to 25% of patients depending on the formula used. Median absolute differences were greater in severely obese patients (body mass index > 40 kg/m2), but were not affected by sex or age.
Conclusions—Estimates of resting O2 derived from conventional formulae are inaccurate, especially in severely obese individuals. When accurate hemodynamic assessment is important for clinical decision-making, O2 should be directly measured.
Accurate determination of cardiac output (Qc) is important in the hemodynamic evaluation of valve area, pulmonary and systemic vascular resistance, and severity of heart failure. The Fick method [Qc = oxygen uptake (O2)/systemic arterio-venous oxygen difference]1,2 is the time-honored gold standard for determining Qc, and has been used to validate other techniques such as indicator dilution and foreign gas rebreathing.3 Application of the Fick method requires measurement of O2. However, direct measurement of O2 through (1) mass spectrometry analysis of timed Douglas bag collections of exhaled air, or (2) breath-by-breath analysis of exhaled air using indirect calorimetry or metabolic cart analysis4,5 is time consuming and involves specific equipment that requires frequent calibration and is expensive to maintain. As a result, resting O2 is commonly estimated rather than measured using derived formulae available in the peer-reviewed literature.6–9 However, the accuracy of the formulae and nomograms most commonly used to estimate resting O2 is questionable, with most estimating methods derived from limited samples of highly selected, ethnically homogenous populations consisting of similarly aged, lean adults,10–12 populations that differ substantially from contemporary adult cardiology practice. Other formulae were derived from clinical populations composed exclusively or primarily of infants and children.6,8 Hence, we assessed the accuracy of estimated resting O2 compared with measured O2 obtained by the gold-standard analysis of timed collections of exhaled air by the method of Douglas in a large population of consecutive adult patients who underwent right-heart cardiac catheterization for clinical indications at our hospital.
We conducted a retrospective study of consecutive patients who underwent right heart cardiac catheterization with direct measurement of resting O2 at Parkland Memorial Hospital between 1996 and 2005. Charts were reviewed for demographic, anthropometric, and baseline clinical characteristics. This study was approved by the University of Texas Southwestern Medical Center Institutional Review Board.
Calculating Estimated O2
Estimated resting O2 was calculated by the formula of Dehmer et al7,9: $\mathrm{O_2\ (ml/min)} = 125\ \mathrm{(ml/min/m^2)} \times \mathrm{BSA\ (m^2)}$, where BSA is body surface area calculated according to the formula of Dubois13: $\mathrm{BSA\ (m^2)} = 0.007184 \times \mathrm{Weight\ (kg)}^{0.425} \times \mathrm{Height\ (cm)}^{0.725}$. For sensitivity analysis, O2 was also estimated using the formula of LaFarge6: $\mathrm{O_2\ (ml/min)} = [138.1 - (X \times \ln \mathrm{age}) + (0.378 \times \mathrm{heart\ rate})] \times \mathrm{BSA}$ (men: X = 11.49; women: X = 17.04); and the formula of Bergstra8: $\mathrm{O_2\ (ml/min)} = 157.3 \times \mathrm{BSA} + X - (10.5 \times \ln \mathrm{age}) + 4.8$ (men: X = 10; women: X = 0).
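The three estimating formulae can be sketched in a few lines of Python. This is an illustrative sketch, not the study's analysis code: the function and parameter names are hypothetical, age is assumed to be in years and heart rate in beats/min, and the LaFarge expression is read as a per-square-meter value that is then multiplied by BSA.

```python
import math


def bsa_dubois(weight_kg, height_cm):
    """Body surface area (m^2) by the Dubois formula."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def o2_dehmer(bsa_m2):
    """Estimated resting O2 (ml/min): 125 ml/min/m^2 times BSA."""
    return 125.0 * bsa_m2


def o2_lafarge(bsa_m2, age_years, heart_rate_bpm, is_male):
    """Estimated resting O2 (ml/min) by the LaFarge formula.

    Assumption: the bracketed term is in ml/min/m^2 and is scaled by BSA.
    """
    x = 11.49 if is_male else 17.04
    per_m2 = 138.1 - x * math.log(age_years) + 0.378 * heart_rate_bpm
    return per_m2 * bsa_m2


def o2_bergstra(bsa_m2, age_years, is_male):
    """Estimated resting O2 (ml/min) by the Bergstra formula as printed."""
    x = 10.0 if is_male else 0.0
    return 157.3 * bsa_m2 + x - 10.5 * math.log(age_years) + 4.8


# Hypothetical patient, for illustration only.
bsa = bsa_dubois(weight_kg=80.0, height_cm=175.0)
estimates = {
    "Dehmer": o2_dehmer(bsa),
    "LaFarge": o2_lafarge(bsa, age_years=55.0, heart_rate_bpm=70.0, is_male=True),
    "Bergstra": o2_bergstra(bsa, age_years=55.0, is_male=True),
}
```

For the same hypothetical patient the three formulae return noticeably different values, which illustrates why the choice of formula alone can shift a Fick-derived result.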
Direct Measurement of O2
Resting O2 was measured in all patients using the gold-standard technique of Douglas,1 with analysis of a 3-minute collection of exhaled air obtained through a properly fitted mouthpiece with a 3-way valve. Exhaled volume was measured with a Tissot spirometer, and concentrations of oxygen, carbon dioxide, and nitrogen were determined with a mass spectrometer (Marquette MGA 1100) that was calibrated before every measurement. All testing was completed with patients in the supine position.
The magnitude of agreement between directly measured and estimated resting O2 was assessed by median absolute difference, ordinary least products regression,14 and typical error analysis.15 Median absolute difference (ml/min) for the overall cohort was calculated as a median of the absolute value of the differences between measured and estimated resting O2 determined for each patient. The degree of disagreement between measured and estimated resting O2 was calculated as a percent error, dividing the absolute difference by the corresponding measured oxygen uptake, and multiplying by 100. Ordinary least products regression was used to assess both fixed and proportional error for the overall cohort.14 Variance of comparative plot data points for estimated versus measured O2 in the overall cohort was assessed by intraclass correlation coefficient. Typical error estimation, expressed as a coefficient of variation derived from the standard deviation of the mean absolute difference divided by the square root of 2,15 is reported for body mass index (BMI) strata. Median absolute difference between measured and estimated resting O2 was assessed in the overall cohort and in patients stratified by (1) BMI using clinical categories (<25, 25–29.9; 30–34.9; 35–39.9; ≥40 kg/m2); (2) sex; and (3) age stratified by median split. The 1-sample Wilcoxon signed-ranks test was used to determine statistical significance for the median of the raw difference of values between estimated and measured O2 in the overall cohort and in patients stratified by sex, age, and BMI categories. Tests for interaction were used to assess statistical significance for median absolute difference amongst BMI categories. The clinical relevance of errors of O2 estimation is demonstrated in a hypothetical clinical context of aortic valve area (AVA) calculation, comparing results using O2 estimated by each of the 3 formulae versus measured O2. 
Diagnostic performance was assessed by receiver operating characteristic (ROC) curve analysis comparing areas under the curve, and by calculation of optimal cut points via Youden’s index16 for each estimating formula to identify AVA <1.0 cm2 derived from direct O2 measurement. Sensitivity, specificity, and positive and negative predictive values were conventionally defined and compared to assess diagnostic accuracy. Attempts to develop a more accurate estimating equation using the present data included piecewise linear models, restricted cubic splines, additive models, variable transformations, interaction terms, and higher-order variables. Cross-validated samples were obtained in 100 iterations, with 75% used as the training set and 25% used as the validation set. All testing was 2-tailed at a significance level of 0.05, with analyses performed using SAS Version 9.1.3 (Cary, NC) and no corrections made for multiple comparisons.
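As a rough illustration of the cut-point step, Youden's index (sensitivity + specificity − 1) can be evaluated at every candidate threshold and the maximum taken. The sketch below assumes that a lower estimated AVA indicates a positive result and that both classes are present; the function name and the toy data are hypothetical and are not drawn from the study.

```python
def youden_optimal_cutpoint(scores, is_positive):
    """Return (cut, J) maximizing Youden's J = sensitivity + specificity - 1.

    A case is called positive when its score is <= cut (lower estimated
    AVA means more severe stenosis). Both classes must be non-empty.
    """
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        true_pos = sum(1 for s, p in zip(scores, is_positive) if p and s <= cut)
        true_neg = sum(1 for s, p in zip(scores, is_positive) if not p and s > cut)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy data: estimated AVA (cm^2) and whether measured-O2 AVA was < 1.0 cm^2.
estimated_ava = [0.7, 0.8, 0.9, 1.1, 1.2, 0.95]
severe_by_measurement = [True, True, True, False, False, False]
cut, j = youden_optimal_cutpoint(estimated_ava, severe_by_measurement)
```

Scanning only the observed score values keeps the search exhaustive for a finite sample, since sensitivity and specificity can change only at those values.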
Patient characteristics for the overall cohort and selected strata are shown in Table 1. The overall cohort comprised 535 patients with a mean age of 55 years and a mean BMI of 28.4 kg/m2; 53% were women. The mean measured O2 for the overall cohort was 241±56.6 ml/min (mean ± standard deviation; Table 2), with a range of 108 to 457 ml/min. Using the Dehmer formula, the mean estimated O2 was 235.4 ± 32 ml/min, with a range of 162 to 356 ml/min; it differed significantly from the direct measurement with a median absolute difference of 28.4 (13.1, 50.2) ml/min [median (25th, 75th percentile)] (Table 2 and Figure 1A; P<0.0001).
Analysis of agreement in the overall cohort between directly measured O2 and O2 estimated by the Dehmer formula using ordinary least products regression demonstrated significant fixed error [reflected by y intercept >0 (95% confidence interval, 92.4–106.1)] and proportional error [reflected by slope <1.0 (95% CI, 0.53–0.61); Figure 1A]. Poor agreement between directly measured and estimated O2 was also observed when the LaFarge and Bergstra formulae were used (Table 2; Figure 1B and 1C). Intraclass correlation coefficients were similar for the LaFarge and Bergstra formulae and slightly higher than for the Dehmer formula, demonstrating slightly more consistency overall (Figure 1).
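Ordinary least products (geometric mean) regression has a closed form: the slope is the ratio of the two standard deviations, signed by the correlation, and the line passes through the two means. The sketch below is an illustration under stated assumptions, not the authors' SAS code: the function names are hypothetical, estimated O2 is assumed to be on the y axis and measured O2 on the x axis, and the correlation is assumed to be positive.

```python
import math
import statistics


def olp_from_summary(mean_x, sd_x, mean_y, sd_y, correlation_sign=1.0):
    """Ordinary least products slope and intercept from summary statistics."""
    slope = math.copysign(sd_y / sd_x, correlation_sign)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def olp_from_data(x, y):
    """Ordinary least products slope and intercept from paired data."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    # The sign of the cross-product sum equals the sign of the correlation.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return olp_from_summary(
        mean_x, statistics.stdev(x), mean_y, statistics.stdev(y), cross
    )


# Reported cohort summaries: measured 241 +/- 56.6, Dehmer 235.4 +/- 32 ml/min.
slope, intercept = olp_from_summary(241.0, 56.6, 235.4, 32.0)
```

With the reported means and standard deviations, this gives a slope of roughly 0.57 and an intercept of roughly 99 ml/min, both inside the confidence intervals quoted above. An intercept above 0 corresponds to fixed error and a slope below 1.0 to proportional error.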
The magnitude of error between measured and estimated O2 expressed as a percent error for the overall cohort is shown in Figure 2. For all 3 formulae used to estimate O2, the degree of error ranged from 10 to 25% in ≈40% of the overall cohort, and the error was >25% in 17–25% of the cohort, depending on the estimating formula used.
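The percent-error bands described here follow directly from paired measured and estimated values. The sketch below shows the calculation on hypothetical data; the function names, the band labels, and the four paired values are illustrative assumptions and are not study data.

```python
import statistics


def median_absolute_difference(measured, estimated):
    """Median of |measured - estimated| across patients (ml/min)."""
    return statistics.median(abs(m - e) for m, e in zip(measured, estimated))


def percent_errors(measured, estimated):
    """Absolute difference as a percentage of the measured value."""
    return [abs(m - e) / m * 100.0 for m, e in zip(measured, estimated)]


def error_band_shares(errors_pct):
    """Fraction of patients with error <10%, 10% to 25%, and >25%."""
    n = len(errors_pct)
    return {
        "<10%": sum(1 for e in errors_pct if e < 10.0) / n,
        "10-25%": sum(1 for e in errors_pct if 10.0 <= e <= 25.0) / n,
        ">25%": sum(1 for e in errors_pct if e > 25.0) / n,
    }


# Hypothetical paired values in ml/min, for illustration only.
measured = [200.0, 250.0, 300.0, 180.0]
estimated = [230.0, 240.0, 220.0, 190.0]
mad = median_absolute_difference(measured, estimated)
bands = error_band_shares(percent_errors(measured, estimated))
```

Dividing by the measured value, rather than the estimate, keeps the directly measured O2 as the reference standard, matching the definition given in the statistical methods.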
The median absolute difference between measured and estimated O2 using the 3 formulae was stratified by BMI, sex, and age, as shown in Table 2. When estimating O2 using the Dehmer formula, the difference between measured and estimated O2 widened as BMI increased, with significant disagreement between measured and estimated O2 observed in all BMI strata. Patients with BMI <40 kg/m2 had a median absolute difference ranging from 24.6 (11.6, 43.7) ml/min to 29.3 (14.8, 57.8) ml/min, whereas median absolute difference was significantly higher at 47.0 (29.5, 83.4) ml/min in the ≥40 kg/m2 group when compared with the other strata (P for interaction <0.0001). Similar results were observed with both the LaFarge and Bergstra formulae (P for interaction =0.001 and 0.005, respectively). When analyzed using typical error analysis, the error associated with estimating resting O2 was similarly magnified in the ≥40 kg/m2 group (Table 2).
Both measured and estimated O2 were higher in men compared with women (Table 2). In sex-stratified analyses, median absolute differences between measured and estimated resting O2 were large in both men and women for all 3 formulae tested. Although statistically significant, the difference in error between sexes was small (<10 ml/min) and varied in direction depending on the estimating formula used.
In analyses stratified by median age, the median absolute difference between estimated and measured O2 was large in both the ≤55 years and >55 years groups [31.7 (13.5, 55.2) ml/min and 26.5 (12.7, 46.6) ml/min, respectively; P<0.0001 for each] when calculated by the Dehmer formula, though the difference between the 2 groups did not reach statistical significance (P=0.13). With both the LaFarge and Bergstra formulae, the median absolute difference for both age strata was >30 ml/min, and the differences between age groups similarly did not reach statistical significance (Table 2; P=0.92 and P=0.20, respectively).
We were unable to resolve or improve upon the discordance between measured and estimated resting O2 using data-derived estimating equations, which we explored using a variety of methods of multivariable linear regression. In all cases of the exploratory models, the mean bias was unacceptably large, and the median percent difference was no less than 15%. Additionally, the predicted R2 based on the PRESS statistic17 yielded values no larger than 0.27 (data not shown), demonstrating unacceptably poor model performance.
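The PRESS-based predicted R2 can be illustrated with a single-predictor linear model, a deliberate simplification of the multivariable models explored in the study. Each leave-one-out residual is obtained from the ordinary residual and the point's leverage, so no refitting is needed. The function name and the toy data are hypothetical.

```python
import statistics


def predicted_r2_simple(x, y):
    """Predicted R^2 = 1 - PRESS / SST for simple linear regression."""
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    press = 0.0
    for a, b in zip(x, y):
        residual = b - (intercept + slope * a)
        leverage = 1.0 / n + (a - mean_x) ** 2 / sxx
        # Leave-one-out residual via the leverage identity e_i / (1 - h_ii).
        press += (residual / (1.0 - leverage)) ** 2

    sst = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - press / sst


# Toy data, for illustration only.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.0, 3.0, 2.0, 5.0, 4.0]
pred_r2 = predicted_r2_simple(toy_x, toy_y)
```

In this toy example the ordinary R2 is 0.64 while the predicted R2 is far lower, which shows how the PRESS statistic penalizes a model that fits its own sample better than it predicts held-out points.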
To demonstrate the clinical importance of error in resting O2 estimation, we calculated hypothetical Fick-derived Qc based on measured O2 and on O2 estimated by each of the 3 formulae for each patient in the study. Fick-derived Qc values were used to derive hypothetical AVA by the Hakki equation,18 as a clinical example to determine the effect of errors in resting O2 estimation. For this analysis, each patient was assigned a hypothetical mean aortic valve gradient of 40 mm Hg and an arteriovenous oxygen difference of 4.5. An AVA of <1.0 cm2 is classified as severe stenosis and warrants consideration for surgical correction. Using AVA <1.0 cm2 derived from directly measured O2 as the outcome, the Dehmer formula had a sensitivity of 93% (95% CI, 90–95%) with a specificity of 33% (95% CI, 25–40%) and an area under the curve (AUC) of 0.79 (95% CI, 0.75–0.84; Table 3). The LaFarge formula had similar sensitivity, specificity, and AUC when compared with the Dehmer formula; the Bergstra formula had the lowest sensitivity of the formulae tested (75%; 95% CI, 71–79%), but was significantly more specific (77%; 95% CI, 68–86%) with a slightly greater AUC of 0.82 (95% CI, 0.77–0.87) when compared with the other formulae (Figure 3). Similarly, the optimal ROC cut point for the Bergstra formula (0.94 cm2) was the closest of the 3 formulae to the clinically important diagnostic cutoff of 1.0 cm2 (Table 3). Expanded clinical applications of the errors encountered in resting O2 estimation are shown in Figures 4 and 5, represented by theoretical Fick cardiac output and systemic vascular resistance (hypothetical mean arterial pressure of 75 mm Hg, central venous pressure of 8 mm Hg, arteriovenous oxygen difference of 4.5), where a large majority of data points deviate notably from the line of equality.
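The hypothetical calculations above can be sketched as follows. This is a minimal illustration under stated assumptions: the arteriovenous oxygen difference of 4.5 is taken to be in ml O2 per dl of blood (the text does not state units), systemic vascular resistance is expressed in dyn·s·cm⁻⁵ using the conventional factor of 80, and all function names and the 300 versus 270 ml/min example values are hypothetical.

```python
import math


def fick_cardiac_output(o2_ml_min, av_o2_diff_ml_dl):
    """Fick cardiac output (L/min); the factor 10 converts ml/dl to ml/L."""
    return o2_ml_min / (av_o2_diff_ml_dl * 10.0)


def ava_hakki(cardiac_output_l_min, mean_gradient_mm_hg):
    """Aortic valve area (cm^2) by the Hakki simplification."""
    return cardiac_output_l_min / math.sqrt(mean_gradient_mm_hg)


def svr(map_mm_hg, cvp_mm_hg, cardiac_output_l_min):
    """Systemic vascular resistance (dyn*s*cm^-5)."""
    return 80.0 * (map_mm_hg - cvp_mm_hg) / cardiac_output_l_min


AV_DIFF = 4.5    # assumed ml O2/dl
GRADIENT = 40.0  # mm Hg

# Hypothetical patient: measured O2 of 300 ml/min, estimate 10% lower.
qc_measured = fick_cardiac_output(300.0, AV_DIFF)
qc_estimated = fick_cardiac_output(270.0, AV_DIFF)
ava_measured = ava_hakki(qc_measured, GRADIENT)
ava_estimated = ava_hakki(qc_estimated, GRADIENT)
svr_measured = svr(75.0, 8.0, qc_measured)
svr_estimated = svr(75.0, 8.0, qc_estimated)
```

Under these assumptions, the measured value gives an AVA just above 1.0 cm2 and the estimate gives an AVA just below it, so a 10% underestimate of O2 is enough to reclassify this hypothetical patient as having severe stenosis. Because Qc and AVA scale linearly with O2 while resistance scales inversely, any percent error in O2 carries through to each derived quantity.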
This study demonstrates that in a large, consecutive sample of adult patients undergoing right-heart cardiac catheterization, estimation of resting O2 by commonly used formulae is inaccurate compared with the gold-standard analysis of timed collections of exhaled air. Estimated O2 was most inaccurate in morbidly obese patients with BMI ≥40 kg/m2. To put the degree of inaccuracy of resting O2 estimation into clinical context, the observed error and variance in estimated O2 applied to Fick-calculated Qc yields up to 38% error within 1 standard deviation (SD) and up to 64% error within 2 SDs. Because Fick-calculated Qc and resting O2 are directly proportional, error in estimation of resting O2 >25%, a magnitude of error observed in 17% to 25% of the present study cohort depending on the formula used, can dramatically alter clinical decision-making.
Formulae and nomograms for the estimation of resting O2, published as early as 1954, were derived from highly selected cohorts of homogeneous ethnicity and age.12 More recently published estimating formulae presently in broader clinical use, such as that by LaFarge and colleagues,6 were derived primarily from pediatric patients. Similarly, many of the patients used to derive the Bergstra formula8 were infants and children with congenital heart disease, possibly confounding the application of these methods in adult patients. Previous studies have demonstrated errors in O2 estimation using these and other formulae in pediatric19–22 and adult23–25 populations. However, these analyses have limitations, including small sample size with homogeneity of patient population, lack of subgroup analysis to identify other variables that may influence errors in estimation, and use of fewer analytic methods to explain the errors observed. Our study builds on previous analyses by better addressing the above-mentioned limitations, allowing for a comprehensive assessment of errors in resting O2 estimation not seen to date. In the present study, we observed a substantial fixed and proportional error when O2 was estimated by each of the 3 formulae compared with analysis of timed collections of exhaled air, with error in excess of 25% in many patients.
We previously demonstrated the inaccuracy of resting O2 estimation in a smaller cohort of research participants.25 The present observations confirm and extend these previous findings of exaggerated error in the most obese patients, now in a larger sample of adult patients in a clinical setting where such methods are commonly applied. Several of the contemporary formulae6,8 for estimating O2 incorporate weight or body surface area without taking into account the degree of adiposity, which may impact the accuracy of estimation, because fat has little impact on oxygen uptake. In addition, there is a metabolic requirement for accommodating excess weight, including physical support and respiratory efforts, all of which may materially influence resting O2. Although predictive formulae for maximal O2 have adjusted for the metabolic cost associated with excess adiposity to improve accuracy,26,27 whether similar adiposity adjustments for the estimation of O2 at rest could improve accuracy remains to be determined.
In contrast to our previous study where estimation with mean absolute difference was exaggerated in men compared with women, we found no significant difference in observed disagreement between measured and estimated O2 at rest when stratified by sex in the present study. We postulate that results from our previous study may be explained by the fact that on average, men have greater absolute and proportional fat-free mass than women, which likely contributed to the sex-based error of the estimating formula. In the present analyses, the error was numerically greater in men compared with women using the Dehmer and Bergstra formulae, whereas with the LaFarge formula the error was numerically greater in women. However, the magnitude of differences was not statistically significant between the groups, and none of the formulae yielded a significant statistical interaction between error and sex, challenging the necessity of inclusion of sex as is done in the LaFarge and Bergstra formulae.
Although early nomograms for estimating O2 at rest, some of which were derived from pediatric populations, were stratified by age,10,11 we found no age-based error association in this study. No association between the degree of error of the estimating formula and age was present when assessing O2 using each of the 3 formulae. Similar results were found in studies of both older adults with mean age 60 years24 and younger adults with mean age 39 years.25
The errors encountered when resting O2 is estimated instead of directly measured can potentially impact clinical decision making, including determining the initiation and titration of inotropic support, decision for mechanical ventricular support, determining eligibility and monitoring response of pulmonary vasodilator therapy, and determining candidacy for valve procedures, among others.
The clinical relevance of the observed errors in O2 estimation as depicted by hypothetical hemodynamic calculations demonstrates the potential impact on critical clinical decision making, as derivations of AVA by all 3 estimating formulae revealed substantial inaccuracies in diagnostic performance. The Dehmer and LaFarge formulae were adequately sensitive in detection of potential severe aortic stenosis but were poorly specific. Using the Dehmer and LaFarge formulae, a significant number of subjects with AVA >1.0 cm2 derived from directly measured O2 were found to have estimated AVAs <1.0 cm2, reflecting the high false positive rate. The Bergstra formula was significantly more specific than the other formulae tested, but overall had the largest combined proportion of false negative and false positive results, leading to potential clinical misclassification of AVA when estimated to be both greater and less than 1.0 cm2. We similarly observed this critical degree of potential clinical error in our previous study, which was composed of research study participants and not actual patients, contrasted with the present study of patients who underwent cardiac catheterization for clinical indications.
From a drug regulatory standpoint, estimation versus measurement of O2 is also an important consideration. In 2010, the FDA Cardiovascular and Renal Drugs Advisory Committee convened to discuss the potential role of using pulmonary vascular resistance index (PVRI) and its change in response to therapy as a surrogate for drug efficacy in pediatric patients with pulmonary arterial hypertension (PAH). Therapy in these pediatric patients would consist of drugs with approved indications in adults.28 In this context, in both pediatric and adult PAH patients, virtually all of the available data from clinical trials had been compiled using estimating formulae for resting O2, a key parameter in the calculation of PVRI.29 Given the degree of error of the estimation of resting O2 in the present study, and the absence of validation or critical assessment in the pediatric population, the routine use of resting O2 estimation in the calculation of hemodynamics and their response to therapy for drug registration studies should be closely scrutinized, if not abandoned.
Our study has limitations. Differences in methods between operators with regard to exhaled air collection and analysis could have influenced the accuracy of resting O2 measurement using the method of Douglas. However, to address this, a specific protocol was followed by all operators and the equipment was routinely maintained and calibrated. In addition, although patients were stationary and in a steady state condition before timed collections of exhaled air, it is unlikely that this method equates to a true resting O2 measurement given the anxiety associated with cardiac catheterization and the variable clinical stability across a population of patients undergoing right heart catheterization for clinical purposes.
Errors in hemodynamic assessment may adversely impact clinical assessment and therapeutic decision-making across a spectrum of serious cardiovascular conditions. Given the imperative in such situations for accuracy of assessment, if the Fick method is to be used for Qc estimation, resting O2 should be directly measured and not estimated.
Sources of Funding
This study was supported by the clinical research fellowship from the Doris Duke Charitable Foundation. The funding organization did not participate in study design, data collection, analysis, interpretation, or writing of this report, or the decision to submit this paper for publication. The corresponding author had full access to all of the data and the accuracy of the data analysis, and had final responsibility for the decision to submit for publication. Dr. Narang is supported by a clinical research fellowship from the Doris Duke Charitable Foundation.
- Received April 25, 2013.
- Accepted September 23, 2013.
- © 2013 American Heart Association, Inc.
- Consolazio CF,
- Johnson RE,
- Pecora LJ
- Baim DS,
- Grossman W
- Lazlo G
- Webb P,
- Troutman SJ Jr.
- LaFarge CG,
- Miettinen OS
- Bergstra A,
- van Dijk RB,
- Hillege HL,
- Lie KI,
- Mook GA
- Astrand PO,
- Ryhming I
- Dubois D,
- Dubois E
- Hopkins WG
- Hakki AH,
- Iskandrian AS,
- Bemis CE,
- Kimbiris D,
- Mintz GS,
- Segal BL,
- Brice C
- Kendrick AH,
- West J,
- Papouchado M,
- Rozkovec A
- Wasserman K,
- Hansen JE,
- Sue DY,
- Stringer WW,
- Whipp BJ
- Stockbridge N
- 29. Pfizer. The role of hemodynamic measurements in pediatric pulmonary arterial hypertension clinical trials: Experience from the revatio® (sildenafil) pediatric program. Groton, CT: Pfizer; 2010.
Clinical Perspective
The Fick method is the gold standard for determining cardiac output in numerous clinical scenarios. A primary determinant of Fick-derived cardiac output is resting oxygen uptake (O2); any error in its estimation produces a proportional error in cardiac output, a hemodynamic parameter that directly influences critical clinical decision-making. Our study demonstrates the inaccuracies of estimating O2 with commonly used formulae compared with its direct measurement. It is important to consider these limitations when calculating cardiac output based on estimated O2, especially in obese and severely obese individuals as commonly encountered in contemporary practice. When accurate hemodynamic assessment is important for clinical decision-making, O2 should be directly measured.