(Circulation. 2007;115:2652-2659.)
© 2007 American Heart Association, Inc.
Pediatric Cardiology
From the University of Texas School of Public Health, Houston (L.G.B.), and Department of Pediatrics, University of California at Davis Children's Hospital, Sacramento (J.P.M.).
Correspondence to James P. Marcin, MD, MPH, Department of Pediatrics, Section of Critical Care Medicine, 2516 Stockton Blvd, Sacramento, CA 95817. E-mail jpmarcin{at}ucdavis.edu
Received December 5, 2006; accepted March 9, 2007.
| Abstract |
|---|
Methods and Results: We used data from the state of California's patient discharge data set from the years 1998-2003 to replicate 4 previous research studies of pediatric cardiac surgery volume and mortality. The total number of pediatric surgeries varied from 12 801 to 13 971 depending on the selection criteria applied. Using this larger and more contemporary data set, we found a weaker and less consistent volume-mortality relationship than had been reported previously. We also developed a new model, which incorporated elements of the old models, and found a statistically significant relationship between higher volume and lower mortality (odds ratio=0.86 per 100-patient increase in annual volume; 95% CI, 0.81 to 0.92). Post hoc analyses show that this relationship was related to the performance of the single largest-volume hospital.
Conclusions: With the use of data from California, the volume-mortality relationship among pediatric cardiac surgery patients has changed since previous research, such that the old models no longer describe a clear or consistent association. With the use of a continuous definition of volume and an updated model, an association is observed but is dependent on highly leveraged covariate patterns found in the largest-volume hospital.
Key Words: heart defects, congenital; hospitals; pediatrics; quality of health care; survival
| Introduction |
|---|
Since 1990, 4 separate research investigations have analyzed the relationship between volume and in-hospital mortality in large, statewide populations of pediatric cardiac surgery patients.2-5 Despite differences in their methodologies, all reported a significant inverse relationship between case volume and in-hospital mortality and inferred that their research was supportive of regionalization for pediatric cardiac surgeries. All but 1 study used cut points for defining "large" and "small" hospitals that were determined endogenously, and each varied in placement of these volume cut points. Nevertheless, the 1 study that defined volume as a continuous variable found that the magnitude of association between mortality and volume may be decreasing with time,5 as other longitudinal studies have reported for other surgical services.6-10 The most recent analysis for pediatric cardiac surgery patients used California hospital discharge data from 1995 to 1997.
This investigation reexamines the relationship between hospital volume and mortality for pediatric cardiac surgery patients in California, using a larger and more contemporary hospital discharge data set (1998-2003) from the California Office of Statewide Health Planning and Development. The major objectives of this work were (1) to determine whether the previously developed models continue to show results similar to those that had been published, (2) to incorporate the most salient features of the old models into a new model, and (3) to explore the association between risk-adjusted mortality and volume through post hoc analyses.
| Methods |
|---|
One of the variables used in previous research was the median household income obtained from a patient's home zip code. After inclusion and exclusion criteria were applied (see below), the Patient Discharge Database data were merged by the patient's 5-digit zip code to the median household income variable provided by the US Census 2000 database.12 The study was approved by the University of California at Davis and the California Health and Welfare Agency's Committee for the Protection of Human Subjects.
Replication of Previous Authors' Work
To replicate the 4 previous studies, we closely followed the population selection criteria, variable definitions, and statistical models according to their published methods.2-5 These studies varied in their criteria for population selection, the number of years of data involved, the definition of case volume, and the covariates included in the predictive models. An overview of the methodologies used by each of these investigations is presented in Table 1, along with notations when their methods could not be replicated precisely.
Most deviations from the published methods noted in Table 1 are minor. However, Hannan et al3 had selected cases from a surgical registry and used patient-level clinical data to generate their risk adjustment model. For the replication, we applied the selection criteria used by Sollano et al5 and the International Classification of Diseases, 9th Revision, Clinical Modification procedure and diagnosis codes for risk adjustment. Severe cyanosis/hypoxia was defined by diagnosis codes 786.5, 768.9, 997.01, 728.5, and 770.83. Extracardiac anomalies were defined as diagnosis codes 740 to 744.9 and 748 to 759.9. Pulmonary hypertension (at time of admission) was defined by diagnosis codes 747.83 and 416.0 to 416.9. Because there is no satisfactory International Classification of Diseases, 9th Revision, Clinical Modification code for acidemia, it was not included in the model.
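The code-based definitions above can be expressed as simple record-level flags. The following is a minimal sketch in Python, not the software used in the study; the function names are hypothetical, and matching the quoted ranges on the 3-digit category (for example, treating "740 to 744.9" as categories 740 through 744) is an assumption made for illustration.

```python
# Illustrative sketch only: flag risk-adjustment conditions from ICD-9-CM
# diagnosis codes, using the code lists quoted in the text.

CYANOSIS_HYPOXIA = {"786.5", "768.9", "997.01", "728.5", "770.83"}


def category(code):
    """Return the 3-digit category of a numeric code ('416.8' -> 416).

    Non-numeric codes (for example V or E codes) return -1 so they never
    fall inside a numeric range.
    """
    head = code.split(".")[0]
    return int(head) if head.isdigit() else -1


def risk_flags(diagnosis_codes):
    """Map one record's diagnosis codes to the three condition flags."""
    cats = [category(c) for c in diagnosis_codes]
    return {
        "severe_cyanosis_hypoxia": any(
            c in CYANOSIS_HYPOXIA for c in diagnosis_codes
        ),
        # 740 to 744.9 and 748 to 759.9 (assumed to mean whole categories)
        "extracardiac_anomaly": any(
            740 <= k <= 744 or 748 <= k <= 759 for k in cats
        ),
        # 747.83 and 416.0 to 416.9
        "pulmonary_hypertension": any(
            c == "747.83" or k == 416
            for c, k in zip(diagnosis_codes, cats)
        ),
    }
```

A record would be passed as a list of code strings, for example `risk_flags(["745.4", "416.8"])`.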
In addition, as noted in Table 1, 3 of the 4 investigations made use of a 4-level variable representing surgical complexity, with level 1 representing the category with the least risk for mortality. The 4 severity categories utilized by Hannan et al3 were approximated for the replication by first ranking the procedures identified by Jenkins et al4 (who used International Classification of Diseases, 9th Revision, Clinical Modification codes) by observed percent mortality in the selected population and comparing these observed rates against their original severity classifications. Procedures were allowed to vary by, at most, 1 severity level above or below their original severity classification by Jenkins and colleagues.4 However, when there were <100 cases for a given procedure, the procedure was given its original severity categorization regardless of the observed mortality rate. The resulting classification of procedures into surgical severity categories is presented in Table 2. If there were >1 risk codes, the highest code was used for risk adjustment.
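The reassignment rule described above (a shift of at most 1 level, and no shift when a procedure has <100 cases) can be sketched as follows. This is an illustration in Python under stated assumptions: the function name is hypothetical, and `observed_level` stands for whatever level the ranking by observed mortality suggests, a step whose exact details are not given in the text.

```python
def replication_severity(original_level, observed_level, n_cases,
                         min_cases=100):
    """Severity category (1 to 4) assigned to one procedure.

    original_level: category from the original classification
    observed_level: category suggested by observed mortality (assumed input)
    n_cases:        number of cases for this procedure in the population
    """
    # Procedures with too few cases keep their original category.
    if n_cases < min_cases:
        return original_level
    # Otherwise allow a move of at most 1 level, staying within 1 to 4.
    low = max(1, original_level - 1)
    high = min(4, original_level + 1)
    return min(max(observed_level, low), high)
```

For example, a procedure originally in category 2 whose observed mortality suggests category 4 would be placed in category 3 if it had at least 100 cases, and left in category 2 otherwise.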
New Model Development
The new predictive model for mortality was developed with the use of the study population selected with the criteria developed by Sollano and colleagues5 (Table 1). There were 12 801 cases representing 11 840 unique individuals in the data set; a total of 681, 125, 6, and 3 patients had 2, 3, 4, and 5 repeat admissions for cardiac surgeries, respectively. All records are treated as independent cases and are referred to collectively as the study population.
Model development began with use of standard logistic regression and an adaptive process of choosing a reasonable set of predictors of mortality. We started with a list of all the patient-level variables that had been identified as potentially important by all previous authors, even if these variables were not used in their final models. Each potential predictor was analyzed for its crude relationship with mortality and with volume with the use of both annual volume and 6-year total volume. The variables were also evaluated for the frequency of their use in the previously published models, statistical significance in the published models, and quality of the measure in terms of missing or unknown values. Those variables with favorable evaluations (ie, those associated with both volume and mortality, those that had been used in >1 of the published models, those that had statistically significant coefficients in the published models, and those without large numbers of missing or unknown values) were used to create a new starting model for mortality. We checked for interactions between these initial variables. We then assessed all other variables that had been used in the previously developed models for their ability to improve model fit (with the likelihood ratio statistic) when added singly to the starting model. Those variables with significant coefficients in the separate analyses were then added all at once to the starting model and rechecked for their contributions to model fit through backwards selection.
Volume was added next to the model, initially as 6-year total volume and ultimately as annual volume such that each hospital could contribute as many as 6 annual volume values. We divided the annual volume term by 100 so that the coefficient would correspond to the effect of a 100-person increase in annual case volume. We also dichotomized annual volume at 75 cases per year because California Children's Services guidelines recommend that hospitals conduct a minimum of 75 pediatric congenital open heart surgeries per year.13
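As an illustration of the two volume definitions, the sketch below (Python, with hypothetical function names) builds the rescaled and dichotomized terms for one hospital-year and shows why rescaling matters: with volume entered as volume/100, exponentiating the logistic coefficient gives the odds ratio per 100-patient increase.

```python
import math


def volume_terms(annual_volume, threshold=75):
    """Return (volume per 100 cases, indicator for >= threshold cases)."""
    return annual_volume / 100.0, int(annual_volume >= threshold)


def odds_ratio_per_100(beta):
    """Odds ratio for a 100-case increase, given the coefficient on volume/100."""
    return math.exp(beta)
```

For instance, `volume_terms(250)` gives a continuous term of 2.5 and a high-volume indicator of 1.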
Interactions between the already included covariates and volume were also considered in a graphical analysis, and finally, all other variables not yet tested for their ability to improve model fit were added to the model singly and assessed with the likelihood ratio statistic. After the standard logistic model was complete, the model was refit as a 2-level, hierarchical model, with patients nested within hospitals and hospital-level variance ascribed to the intercept term. A second-order penalized quasi-likelihood estimation procedure was used to perform the calculations.
Model Diagnostics
Model diagnostics were performed for the finalized standard logistic regression model. Model discrimination was assessed with the use of the receiver operating characteristics curve and its area statistic (area under the curve).14 Additional diagnostics involved graphical analysis of statistics calculated by covariate pattern. These statistics included Pearson χ² and deviance residuals, leverage values, and overall measures of fit, including the Hosmer-Lemeshow change in Pearson χ² (Hosmer-Lemeshow Δχ²) and deviance (Hosmer-Lemeshow ΔD) and influence (Pregibon's Δβ).15
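The area under the curve has a direct interpretation: it is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The following Python sketch computes it by direct pairwise comparison; it is an illustration of the statistic, not the routine used in the study, and the function name is hypothetical.

```python
def auc(predicted, died):
    """Area under the ROC curve by pairwise comparison.

    predicted: predicted probabilities of death, one per case
    died:      matching outcomes (truthy = died, falsy = survived)
    """
    deaths = [p for p, d in zip(predicted, died) if d]
    survivors = [p for p, d in zip(predicted, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p_death in deaths:
        for p_surv in survivors:
            if p_death > p_surv:
                wins += 1.0      # correctly ordered pair
            elif p_death == p_surv:
                wins += 0.5      # tie counts as half
    return wins / (len(deaths) * len(survivors))
```

The pairwise loop is quadratic in the number of cases; rank-based formulas give the same value more efficiently on large data sets.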
Simulations for Graphical Presentation
We wanted to demonstrate graphically the performance of hospitals over the range of annual case volumes given their case mix for a given year. To accomplish this, a simulation procedure was followed to calculate exact probabilities for the observed mortalities, given our best estimates for the individual risks of the patients involved. First, the predictive model for mortality without volume was applied to the data set. Coefficients from the model were then used to generate, for each case (record) in the data set, a predicted probability of mortality. Each simulation began with randomly assigning to each case a uniformly distributed number between 0 and 1. A case was then defined as a simulated death if the random number was less than that individual case's predicted probability of mortality; otherwise, the case was defined as a simulated survivor. For each cohort of patients described by a given annual volume value, the simulated number of deaths was counted and added to a data set specific to that volume value. These simulations were run 10 000 times so that each volume-specific data set reflected a distribution of potential numbers of deaths against which the observed number could be compared.
The probability of observing at least the number of deaths that actually occurred is calculated by summing the number of times at least that number of deaths appeared in the data set of simulations and dividing by 10 000. These procedures were an adaptation of the methodology published by Luft and Brown16 to address biases that may arise in the generation of CIs about expected mortality rates when the expected number of deaths is <5 (methods utilizing approximation to the normal and Poisson distributions may show bias under this condition).
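The simulation described above can be sketched for a single hospital-year cohort as follows. This is a minimal Python illustration, not the Stata code used in the study; the function names, the fixed seed, and the use of Python's standard random number generator are assumptions.

```python
import random


def simulate_death_counts(predicted_probs, n_sims=10_000, seed=0):
    """Simulate the number of deaths in one cohort n_sims times.

    predicted_probs: predicted probability of death for each case in the
                     cohort, from the model fitted without volume
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility
    counts = []
    for _ in range(n_sims):
        # A case is a simulated death when its uniform draw on [0, 1)
        # falls below its predicted probability of mortality.
        deaths = sum(1 for p in predicted_probs if rng.random() < p)
        counts.append(deaths)
    return counts


def prob_at_least(counts, observed_deaths):
    """Share of simulations with at least the observed number of deaths."""
    return sum(1 for c in counts if c >= observed_deaths) / len(counts)
```

A high value from `prob_at_least` means the observed deaths were few relative to what the case mix would predict, which is how the cluster of high probabilities for the largest hospital should be read.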
Post Hoc Analyses
Two post hoc analyses were conducted. First, we stratified the data by age category to investigate whether there were differences in the volume-outcome associations between the different age categories. Second, to test the possibility of a trend in the volume-outcome association over time, we added a term for discharge year and an interaction term for discharge year and volume to the final model.
Software
Standard logistic regression, regression diagnostics, and the simulation procedures were conducted with the use of Stata version 8.2 (College Station, Tex; 2004). Multilevel analysis was conducted with the use of MLwiN version 2.00 (2003).
The authors had full access to and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.
| Results |
|---|
Replicated Models
Jenkins and colleagues4 used a single year's worth of data, and therefore no hospital contributed >1 volume value to the data set. Although the number of patients and hospitals from our data set is comparable to that originally studied by Jenkins and colleagues, the overall mortality rate among pediatric cardiac surgery patients has decreased, and there were no deaths in hospitals conducting between 1 and 9 pediatric cardiac surgeries during the years 1998, 1999, 2001, and 2002. Therefore, the model was fitted to all 6 years of data (each hospital could have up to 6 different volume values and therefore could be categorized differently in different years). The odds ratios (ORs) for the volume variables were 3.04 (95% CI, 0.89 to 10.40) for annual volumes 1 to 9, 1.61 (95% CI, 1.12 to 2.30) for annual volumes 10 to 100, and 1.45 (95% CI, 1.10 to 1.90) for annual volumes 101 to 300, with annual volumes >300 as the reference category.
In the replication of the work conducted by Sollano and colleagues,5 volume (defined continuously as "volume/100") was not significantly associated with mortality in the overall population (OR, 0.99; 95% CI, 0.98 to 1.01). However, there was a small but statistically significant volume-outcome association among hospitals conducting surgery on children aged <30 days (OR, 0.97; 95% CI, 0.95 to 0.99). We found no clear trend between volume and surgical complexity among infants aged <30 days: ORs, 0.98, 0.96, 0.99, and 0.95 at surgical complexity levels 1, 2, 3, and 4, respectively; only the OR at complexity category 4 was statistically significant at the 0.05 level.
Hannan and colleagues3 did not calculate an effect for volume per se but applied their logistic regression model to arrive at the number of expected deaths for annual volumes of <100 and ≥100. These expected rates were used to generate severity-adjusted mortality that approximates what the mortality "would have been" had the groups experienced an average severity of illness identical to that of the state's population. With our data set, the risk-adjusted mortality rates for these 2 volume categories were 3.87% (95% CI, 2.56% to 4.87%) and 3.56% (95% CI, 3.27% to 3.99%). Therefore, the groups' severity-adjusted mortality rates were not statistically different from each other or the overall population rate (3.62%).
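A common way to compute a severity-adjusted rate of this kind is indirect standardization: the ratio of observed to expected deaths in a group, multiplied by the overall population rate. The sketch below assumes that formula; the text does not spell out the exact calculation, and the function name is hypothetical.

```python
def risk_adjusted_rate(observed_deaths, expected_deaths, overall_rate):
    """Indirectly standardized mortality rate for one volume group.

    observed_deaths: deaths actually recorded in the group
    expected_deaths: sum of model-predicted probabilities in the group
    overall_rate:    crude mortality rate of the whole population
    """
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return (observed_deaths / expected_deaths) * overall_rate
```

Under this formula, a group with exactly as many deaths as its case mix predicts has an adjusted rate equal to the overall rate.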
The research of Chang and Klitzner2 estimated the number of deaths that could theoretically be avoided through statewide regionalization. Application of their selection and exclusion criteria resulted in 13 917 cases and 622 deaths across 20 hospitals, yielding an overall mortality rate of 4.5%. In this replication, hospitals were classified as small, medium, and large with the use of Chang and Klitzner's original mean annual volume cut points of 70 and 170 cases per year. The original case mix adjustment variable was also used, which dichotomized cases as either high or low risk on the basis of principal procedure codes. ORs for small- and medium-volume categories, compared with the large-volume category, were determined separately for the high- and low-risk classifications of procedures, with adjustment for age, gender, race, type of insurance, family income, type of admission, prematurity, failure to thrive, Down syndrome, and pulmonary hypertension. Among the low-risk cases, the relative odds of mortality were 1.52 (95% CI, 1.03 to 2.26) and 0.82 (95% CI, 0.58 to 1.19) at small- and medium-volume hospitals, respectively, and among the high-risk cases the odds were 1.84 (95% CI, 1.27 to 2.67) and 1.07 (95% CI, 0.77 to 1.47) at small- and medium-volume hospitals, respectively.
New Model
Our risk-adjusted mortality prediction model is presented in Table 3. The model includes age (categorized as age <30, 30 to 89, and 90 to 360 days with >360 days as the reference group), surgical complexity categories 2, 3, and 4 (defined as in the replication of the work by Hannan and colleagues,3 with complexity category 1 as the reference group), nonelective admission status (with admissions having at least 24 hours' notice as the reference group), cardiopulmonary bypass surgery, pulmonary hypertension as a condition present on admission, extracardiac anomalies, and expected nonprivate payer status (with expected payers of all other types as the reference group). The OR for volume was 0.86 (95% CI, 0.81 to 0.92) and denotes that a 100-patient increase in annual volume is associated with a 13.9% decrease in odds of mortality. We noted the crude (unadjusted) association between volume and mortality was negligible (OR, 1.00; 95% CI, 0.94 to 1.07). The addition of the volume variable to the model of covariates significantly improved model fit (χ²(1)=19.19 for the likelihood ratio; P<0.0001).
Overall fit was acceptable according to the Hosmer-Lemeshow Pearson χ² statistic (χ²(8)=10.30, P=0.24). The area under the curve suggested good discrimination at 0.82. Additional diagnostic analyses, however, showed that the covariate patterns with the highest leverage values occur at the highest volumes. Because of this finding, we re-ran the analysis excluding the single highest-volume hospital (which had 99 deaths among 2832 cases over the 6-year period) and found that the OR for annual volume increased to 0.93 (95% CI, 0.82 to 1.05) and volume no longer contributed significantly to the fit of the model (χ²(1)=1.33 for the likelihood ratio; P=0.25).
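The likelihood ratio tests reported for the volume term compare nested models that differ by one parameter, so the statistic is referred to a χ² distribution with 1 degree of freedom. The following Python sketch shows the calculation; the function names are hypothetical, and the closed form used for the P value applies only to the 1-degree-of-freedom case.

```python
import math


def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_pvalue_1df(statistic):
    """Upper-tail P value for a chi-square statistic with 1 df.

    A chi-square variable with 1 df is the square of a standard normal
    variable, so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2.0))
```

Applied to the two statistics quoted in the text, `chi2_pvalue_1df(19.19)` falls well below 0.0001 and `chi2_pvalue_1df(1.33)` rounds to 0.25.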
The possibility that the data would exhibit significant within-hospital correlation was considered by specification of a multilevel logistic model in which random variance was ascribed to the constant (intercept) term. The estimation procedure yielded coefficients and errors for the fixed part of the model that were essentially unchanged from those of the standard logistic model, and the hospital-level variance estimate was not statistically different from 0 (σ²μ0=0.006, SE=0.014). Therefore, there was no significant correlation of outcomes to warrant the use of this modeling strategy.
When the standard logistic regression model was re-run with annual hospital volume dichotomized at 75 cases per year, hospitals with 75 pediatric cardiac surgery cases or more per year had an OR of 0.75 (95% CI, 0.55 to 1.02) relative to hospitals with <75 such cases per year. When the largest-volume hospital was removed from the analysis, the OR was 0.84 (95% CI, 0.62 to 1.16).
Figure 2 shows the results of the simulations used to generate an exact probability for at least the number of deaths observed for each volume value. The figure shows that there is a wide range of these probabilities for the observed data and that these probabilities are approximately evenly distributed above and below the 50% level throughout all volume values except for a cluster of high probabilities at the far right of the figure. These 6 highest volume values all correspond to the largest hospital, which had performed well in each of the 6 years of the study.
Post Hoc Analyses
First, when the data were stratified by age, coefficients for the volume effect increased in absolute value with decreasing age. A statistically significant association was obtained only among children aged <30 days (n=1203; OR, 0.77; 95% CI, 0.69 to 0.86). Second, we found no evidence that there was a change in the volume-outcome association over time. Although there was a trend in decreasing mortality over time, with the OR for discharge year being 0.92 (95% CI, 0.82 to 1.02), the interaction term of discharge year and volume was not significant (OR, 1.02; 95% CI, 0.97 to 1.06).
| Discussion |
|---|
Our post hoc analysis replicating the work by Sollano et al5 demonstrated a small but significant relationship between volume and mortality among infants; however, there are limitations unique to this subanalysis. The exclusions used in our analyses (ie, patients with very low birth weight and patients aged <3 months receiving certain surgical procedures) limit the generalizability of the findings to the infant population as a whole. For example, 91% of children aged <30 days were unscheduled admissions compared with 14% unscheduled admissions in the remainder of the population. We lack important clinical details that could explain the circumstances under which these surgeries were being conducted.
Hospital case volume was defined as a continuous variable, as was done by Sollano et al,5 because this definition avoids the pitfalls of deciding a priori what constitutes "small" or "large" volumes. Although we considered the threshold of 75 cases per year (not statistically significant), the common practice of using the data to define volume thresholds in favor of finding an association was expressly avoided. We also believe that after examining the volume-outcome scatterplot (Figure 1), there is no clear threshold that would justify dichotomization or any other transformation. It is possible that there is a very high volume threshold (eg, 400 cases per year) above which risk-adjusted mortality could be lower; however, given the fact that only 1 hospital conducted >400 cases per year in our data, the selection of this threshold for California data is not feasible.
We found no evidence for a trend in the volume effect over the 6-year period. Annual volume was not averaged over the 6-year period because several hospitals experienced sudden increases or decreases in their caseload, which likely reflected department closings/openings or gains/losses of surgeons; such changes potentially dissociate any quality of care (or selective referral) concepts linked to an average annual volume measure. These same possibilities may explain why the level-2 error variance (with level 2 defined by hospital identification) was relatively small. Similar to Sollano et al,5 we found that no information was gained by rendering the model multilevel. Alternative definitions of volume are possible, such as lagged measures, average daily volume, and measures that do not assume major hospital staffing changes. However, our model was intended to build on the findings of the replications, and we avoided deviating from the manner in which volume has been conceptualized previously.
The fact that leverage increases with volume was noted by Chang and Klitzner2; however, these authors did not report how responsive the volume-outcome association was to these covariate patterns. The fact that a single institution affected the statistical significance of a volume-outcome investigation is similar to the finding of another research effort investigating in-hospital survival of blunt trauma patients. In an analysis by Glance et al,17 regression diagnostics identified the largest-volume trauma center to be an outlier, and exclusion of cases from this trauma center resulted in no demonstrable volume-outcome association. The authors concluded that the hospital volume criteria established by the American College of Surgeons needed to be reexamined. We would similarly recommend that the current California Children's Services minimum pediatric cardiac surgery volume recommendation of 75 cases per year be reexamined because our findings were very sensitive to a single institution's outcomes.
We could not replicate all previous studies precisely. Most notably, the research conducted by Hannan and colleagues3 utilized clinical information that was unavailable in the Office of Statewide Health Planning and Development data set. However, our use of up to 20 procedure and 24 diagnostic code fields is a strength. A recent study demonstrated very similar results when comparing a mortality prediction model based on patient-level data with a model based on administrative data for coronary artery bypass graft surgery when robust model development and careful statistical methods are used.18
The process by which covariates were screened and the order in which they were added to the model followed procedures conceptualized a priori. The identification, definition, and selection of covariates may be susceptible to criticism in the process of the model's development. Because there was no evidence of an unadjusted volume-mortality association, any inference of significant association in the final model must be attributed to the effects of confounding factors (barring, of course, any sources of bias). The strength of the association of a covariate with mortality and volume will tend to drive the strength of association between volume and mortality. Since the publication of the volume-mortality studies replicated here, more sophisticated "complexity categories" have been developed, notably the Risk-Adjusted classification for Congenital Heart Surgery (RACHS)19 and Aristotle20 scoring methods. We cannot discount the possibility that had another risk adjustment model been used, we may have obtained different results. However, given the goals of our research, a customized model that more closely fits the data being analyzed is more appropriate. We were encouraged by the fact that our model fit the data well and that the covariates predicted mortality in the expected ways with respect to direction and magnitude. Choices were made a priori with awareness of the various objections that could be made, and it is hoped that the processes used are transparent and seem reasonable to most.
The surgical complexity categories used in this and other studies group multiple procedures together of similar, but not identical, risk. Risk adjustment with the use of these surgical complexity categories and other comorbidities is an imperfect process. Even without consideration of the impact of the largest-volume hospital, our results do not imply that hospitals may perform as well across the gamut of volumes as high-volume hospitals for all procedures. The lack of a statistically significant association may reflect that low-volume hospitals already avoid specific surgeries that they are ill-equipped to perform.
Although we found a statistically significant association between volume and mortality, it is important to translate ORs into absolute risk for illustrative purposes. For example, if it is assumed that a hospital had a mean mortality rate of 3.6% (the overall average of the data set), an additional 100 cases is associated with, on average, a mortality rate of 3.1%. In other words, with this example, the OR of 0.86 translates to 1 fewer death for every 200 additional cases performed. The OR of 0.77 for infants translates to approximately 1 fewer death for every 120 additional cases performed.
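The conversion from an OR to absolute risk in this example can be reproduced by moving from risk to odds, applying the OR, and converting back. The Python sketch below illustrates the arithmetic; the function names are hypothetical, and applying the same 3.6% baseline to the infant OR is an assumption, since the text does not state which baseline was used for that figure.

```python
def risk_after_odds_ratio(baseline_risk, odds_ratio):
    """Risk implied by applying an odds ratio to a baseline risk."""
    odds = baseline_risk / (1.0 - baseline_risk)   # risk -> odds
    new_odds = odds * odds_ratio                   # apply the OR
    return new_odds / (1.0 + new_odds)             # odds -> risk


def cases_per_death_avoided(baseline_risk, odds_ratio):
    """Number of cases over which one fewer death is expected."""
    new_risk = risk_after_odds_ratio(baseline_risk, odds_ratio)
    return 1.0 / (baseline_risk - new_risk)
```

With a baseline of 0.036, an OR of 0.86 gives a risk that rounds to 3.1% and roughly 200 cases per death avoided; an OR of 0.77 at the same baseline gives a figure in the neighborhood of 120, in line with the values quoted above.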
Finally, the illustration of how hospitals fall within their probability distributions over the range of volumes (Figure 2) is informative. The removal of the largest-volume hospital was not done to imply that it was a "statistical outlier" but rather to demonstrate the impact of highly leveraged covariate patterns on the results. These results provide an illustration of how the finding of a statistically significant association is not automatically supportive of regionalization because it would be unfeasible for all pediatric cardiac surgeries to be performed at a single hospital in the state. We recommend that future volume-outcome studies present statistical diagnostics to elucidate the reason for an apparent association or lack thereof; overall measures of fit and discrimination are insufficient assessments of model performance.
| Acknowledgments |
|---|
None.