(Circulation. 2008;117:1955-1963.)
© 2008 American Heart Association, Inc.
Health Services and Outcomes Research
From the Center for Quality and Safety, Department of Surgery, and Institute for Health Policy, Massachusetts General Hospital, and Harvard Medical School (D.M.S.), and Department of Health Care Policy, Harvard Medical School, and the Department of Biostatistics, Harvard School of Public Health (S.T.N.), Boston, Mass.
Correspondence to Sharon-Lise T. Normand, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, MA 02115. E-mail Sharon{at}hcp.med.harvard.edu
Received November 9, 2007; accepted February 13, 2008.
| Abstract |
|---|
Methods and Results: Data from 14 Massachusetts hospitals were analyzed for 4393 adults undergoing isolated coronary artery bypass graft surgery in 2003. Mortality estimates were adjusted using clinical data prospectively collected by hospital personnel and submitted to a data coordinating center designated by the state. The primary outcome was hospital-specific, risk-standardized, 30-day all-cause mortality after surgery. Propensity scores were used to assess the comparability of case mix (covariate balance) for each Massachusetts hospital relative to the pool of patients undergoing coronary artery bypass grafting surgery at the remaining hospitals and for selected pairwise comparisons. Using hierarchical logistic regression, we indirectly standardized the mortality rate of each hospital using its expected rate. Predictive cross-validation was used to avoid underidentification of true outlying hospitals. Overall, there was sufficient overlap between the case mix of each hospital and that of all other Massachusetts hospitals to justify comparison of individual hospital performance with that of the remaining hospitals. As expected, some pairwise hospital comparisons indicated lack of comparability. This finding illustrates the fallacy of assuming that risk adjustment per se is sufficient to permit direct side-by-side comparison of healthcare providers. In some instances, such analyses may be facilitated by the use of propensity scores to improve covariate balance between institutions and to justify such comparisons.
Conclusions: Risk-adjusted outcomes, commonly the focus of public report cards, have a specific interpretation. With indirect standardization, these outcomes reflect a provider's performance for its specific case mix relative to the expected performance of an average provider for that same case mix. Unless study design or post hoc adjustments have resulted in reasonable overlap of case-mix distributions, such risk-adjusted outcomes should not be used to directly compare one institution with another.
Key Words: health care quality assessment; outcomes research; risk; statistics
| Introduction |
|---|
Provider profiling has a number of features that distinguish it from other types of outcomes research. First, unlike trials of new medications or treatment regimens, randomization of patients to hospitals or physicians would often be both impractical and unethical. Thus, profiling studies are almost always observational in nature, relying on data from usual practice settings. In further contrast to drug trials that involve direct comparisons of outcomes for only a few treatments, profiling studies typically assess outcomes for many providers, usually with regard to some population reference standard. Finally, when profiling is based on outcomes measures such as mortality or morbidity, risk adjustment is necessary to account for preexisting conditions that may confound their assessment.
Despite their increasingly widespread use, considerable confusion exists among consumers, the media, payers, and providers as to the correct meaning and interpretation of risk-adjusted outcomes. For example, many incorrectly interpret such outcomes as having "leveled the playing field" to permit direct comparison of one provider with another. Direct comparability may sometimes be justified in an observational study, but this would be fortuitous and is not an inherent characteristic of the study design.
Correct interpretation of the concept of risk-adjusted outcomes is neither a trivial nor a strictly academic concern. Such outcomes are used to designate centers of excellence, to determine reimbursement levels in pay for performance programs, to rank institutions, and to classify providers as "outliers." These determinations may have profound effects on patient access, hospital reputation, referrals, and financial survival.
The goal of this article is to systematically review the fundamental concepts from which the deceptively simple term "risk-adjusted outcome" is derived. We develop the concept of risk-adjusted outcomes in the context of causal inference theory and illustrate the derivation of indirectly standardized mortality ratios, often referred to as O/E (observed/expected) ratios. Key methodological concepts (eg, outlier determination and direct comparison of hospitals) are illustrated through the example of coronary artery bypass grafting surgery (CABG) mortality profiling, in which the difference in outcomes of a hospital compared with the reference standard is generally regarded as a reflection of quality of care.5
| Methods |
|---|
A fundamental precept of causality is that only one of a series of potential outcomes can be experienced at any one time.7,17,20,23,24 In CABG hospital profiling, a patient can undergo CABG at only one hospital on a given day. Therefore, some method must be used to estimate what would hypothetically have occurred to that patient had he or she undergone surgery at a different hospital. The observed result is referred to as the actual outcome, and the unobservable estimated outcome is the counterfactual.7,17,20,23,24 Estimation of this counterfactual outcome, the hypothetical result if treated under a different set of circumstances, is the primary motivator for risk model development. Several approaches have been developed to estimate these potential outcomes for individual patients and subsequently to assess the overall performance of a hospital.
Estimation of Counterfactuals for Risk Adjustment and Standardization
The simplest estimator of a counterfactual would be the average result of treating a similar condition (eg, a CABG procedure) in the overall population or at another specific institution. However, this estimator is likely to be both inaccurate and misleading. Patients are nonrandomly allocated among institutions, and use of crude mortality rates from other hospitals as the counterfactual outcomes would ignore systematic differences among patients such as acuity status. At the other end of the spectrum, the counterfactual outcomes could be determined through randomization,15,18,19,25 the most internally valid design. Both measured and unmeasured confounders would be balanced, so the mortality experience of patients undergoing CABG at one hospital could serve as the counterfactual outcome for patients treated at another hospital. However, it is implausible to think that most patients would consent to randomization for anything but truly experimental care; for this reason, almost all profiling studies are conducted with observational data. Matching and stratification are other methods sometimes used to derive counterfactuals, but they quickly become impractical when more than a few predictor variables are considered, the typical case in mortality profiling.
Most profiling studies have relied on regression modeling to derive counterfactual outcomes, and it is the method used here. Risk adjustment, the term commonly used for this approach, refers to the results of statistical regression models that relate the outcome for a specific patient to his or her observed characteristics.4,26–29 Then, because the main focus of profiling is to determine how the overall experience of a particular hospital compares to what would be "expected," the next step is to standardize the results of an institution to the reference population.
Indirect standardization is used for almost all profiling and public report cards. With this method, the expected rate represents what the mortality rate would have been at a hospital given its actual distribution of patients but replacing its observed mortality rates with rates estimated from the entire group of providers. The indirectly standardized mortality ratio, often referred to as the ratio of observed to expected outcomes (O/E ratio), compares the outcomes for the specific distribution of patients at a hospital with their expected results had they been treated by an average provider in the reference population.
Indirect standardization is accomplished by first summing the individual risk probabilities for each patient within a given hospital using the coefficients estimated from the regression model and the patient's specific distribution of confounders. This yields the expected total number of deaths for that hospital. This counterfactual hospital mortality often is used as the denominator of the ratio of observed to expected mortality (O/E ratio), a form of causal estimand. This O/E ratio is favorable if <1 and unfavorable if >1. As a final step, the O/E ratio may be multiplied by the unadjusted population mortality rate for the procedure to obtain what is often called the risk-adjusted mortality rate but which is more correctly designated the risk-standardized mortality rate (RSMR) or standardized mortality incidence rate (SMIR).30–34
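In symbols, the steps just described can be summarized as follows. The notation is introduced here for illustration and does not appear in the original text: $\hat{p}_{ij}$ is the model-based probability of death for patient $i$ at hospital $j$, $y_{ij}$ is the observed 30-day death indicator, $n_j$ is the number of patients at hospital $j$, and $\bar{y}$ is the unadjusted population mortality rate.

```latex
\begin{aligned}
E_j &= \sum_{i=1}^{n_j} \hat{p}_{ij}, \qquad
O_j = \sum_{i=1}^{n_j} y_{ij}, \\
\frac{O_j}{E_j} &< 1 \ \text{(favorable)}, \qquad
\frac{O_j}{E_j} > 1 \ \text{(unfavorable)}, \\
\text{RSMR}_j &= \frac{O_j}{E_j} \times \bar{y}.
\end{aligned}
```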
Outlier Determination and the Direct Comparison of Hospitals
Outliers
The main goal of outcomes profiling is to identify differences in hospital quality. Because the risk-standardized rates for each hospital are derived from the reference population, it is most appropriate to determine whether these rates are statistically different from the population average. If so, the hospital is regarded as a statistical outlier. Most commonly, this is achieved by determining whether the 95% interval for a hospital's risk-standardized mortality estimate includes the overall state average mortality (or alternatively, whether the interval around its O/E ratio includes 1). If the interval excludes that value, the hospital typically is classified as an outlier. An important but overlooked aspect of outlier determination is the effect on expected outcomes when true outlying programs are included in the development of the statistical model. This problem and a potential solution (cross-validated P values) are described further in the Illustration.
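The interval rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example intervals are hypothetical, and the state rate of 2.25% is the crude rate reported later in the Results.

```python
def classify_hospital(interval_low, interval_high, state_rate):
    """Label a hospital by where its 95% interval sits relative to the state rate."""
    if interval_low > state_rate:
        return "high outlier"    # whole interval lies above the state average
    if interval_high < state_rate:
        return "low outlier"     # whole interval lies below the state average
    return "not an outlier"      # interval includes the state average


STATE_RATE = 2.25  # crude state 30-day mortality (%), from the Results

# Hypothetical intervals (in %), for illustration only
print(classify_hospital(1.10, 3.90, STATE_RATE))
print(classify_hospital(2.60, 5.40, STATE_RATE))
```

The same rule applies to O/E ratios by passing the interval for the ratio and a reference value of 1.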
Risk Factor Distribution and Direct Comparability
In addition to comparing individual hospitals with the reference population to determine outlier status, some consumers also seek to directly compare individual hospitals with one another. A problem with direct comparisons that has been widely recognized by statisticians, and that was the motivation for the development of balancing methods such as propensity scores,14–16,18,19,35–41 is that of covariate imbalance. Absent randomization, the patient cohorts from 2 hospitals may be unbalanced with regard to the frequency of confounders. The implications of such imbalance have received little attention in the context of risk-adjusted outcomes profiling, which in turn has led to both misunderstanding and misuse.
In general, only the results for those patients with comparable risk profiles (eg, that overlap the risk distributions of the 2 providers) should be directly compared. Consider the extreme but not uncommon example of a state or region with many small community hospitals and 1 or 2 tertiary/quaternary hospitals. As a general principle, direct comparison of a community to a tertiary hospital would be appropriate only for the relatively small proportion of patients who overlap between the 2 hospitals. Although the results for the overlap group can be used to estimate expected outcomes for patients not in common between the 2 institutions, this form of extrapolation depends heavily on assumptions that are typically unverifiable. For example, the indirectly risk-standardized results at a community hospital apply to its specific type of patients, who might be relatively low risk compared with a tertiary center. It cannot be assumed that a favorable risk-standardized mortality at the community hospital, based on its lower risk case mix, could necessarily be achieved if it were confronted with the higher-risk case mix of the tertiary center, including some types of patients that it rarely, if ever, encounters.
Propensity scores are a useful method to construct treatment and control groups that may differ in number of subjects but are similar to randomized studies in their balanced distribution of all measured confounders.14–16,18,19,35–41 The propensity score is the likelihood of receiving treatment of one type compared with another (or in the case of profiling, exposure to one or another specific provider) on the basis of a patient's set of observed characteristics. It provides a convenient scalar (1-number) summary of the information contained in all of the patient's measured covariates. The propensity score may then be used for matching, stratification, blocking, or weighting in regression modeling.
The problem of covariate imbalance has received little attention in provider profiling studies.42–45 If the propensity score provides a convenient summary estimate of individual patient risk, then each provider will have a specific distribution of propensity scores that characterizes its "case mix." For 2 providers to be comparable, the area of overlap in their respective propensity score distributions should be identified. As shown in Figure 1A, 2 hypothetical hospitals (hospitals 1 and 2) might by chance (or as a result of randomization) have substantial overlap in their propensity score distributions. The area of shaded overlap in Figure 1A indicates that a majority of patients treated at hospital 2 have a similar propensity to have been treated at hospital 1. For almost every patient who underwent CABG at hospital 1, we can find a "similar" patient from among those having CABG at hospital 2.
Figure 1B depicts a different set of 2 hospitals with significant imbalance in their average patient risk as measured by their propensity score distributions. Only a small percentage of patients at the 2 institutions have comparable risk profiles. It is only the group of patients who overlap from which relative performance inferences should be drawn.
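A rough numerical counterpart to the overlap shown in Figure 1 is sketched below. It assumes that a propensity score (for example, on the log-odds scale) is already available for every patient, and it uses a simple min-max rule to define the region of common support. That rule and the function names are assumptions made for this sketch; the article itself judges overlap from the plotted score distributions.

```python
def overlap_region(scores_a, scores_b):
    """Range of propensity scores that is covered by both hospitals, or None."""
    low = max(min(scores_a), min(scores_b))
    high = min(max(scores_a), max(scores_b))
    return (low, high) if low <= high else None


def share_in_overlap(scores, region):
    """Fraction of one hospital's patients whose score falls inside the region."""
    if region is None:
        return 0.0
    low, high = region
    inside = sum(1 for s in scores if low <= s <= high)
    return inside / len(scores)


# Hypothetical log-odds of the propensity score for two hospitals
hospital_1 = [-2.0, -1.0, 0.0, 1.0, 2.0, 6.0, 7.0]
hospital_2 = [-3.0, -2.5, -1.5, -0.5, 0.5, 1.5]

region = overlap_region(hospital_1, hospital_2)
print(region)
print(share_in_overlap(hospital_1, region))
print(share_in_overlap(hospital_2, region))
```

Only the patients inside the returned region would support a direct comparison of the 2 hospitals; the remaining patients have no counterpart at the other institution.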
Illustration
Study Population
We examined data from all adults (≥18 years of age) undergoing isolated CABG at all acute-care, nonfederal hospitals in Massachusetts between January 1, 2003, and December 31, 2003. Data collection is mandated by the Massachusetts Department of Public Health.
Data Sources
We used clinical data submitted to a data coordinating center (Mass-DAC) located in the Harvard Medical School Department of Health Care Policy. Data are collected by trained hospital personnel using the Society of Thoracic Surgeons National Adult Cardiac Database instrument.46 Supplemental patient and surgeon identifying information also is collected using additional data forms developed by Mass-DAC. The data are sent electronically to Mass-DAC, where they are cleaned, audited, and verified using internal and external procedures.
End Points
The primary end point is the hospital-specific, risk-standardized, all-cause, 30-day mortality rate. Mortality data are obtained in 2 ways. First, hospital personnel are responsible for collecting 30-day mortality for all patients undergoing cardiac surgery. Second, patient identifying information is linked to this registry from the Massachusetts Registry of Vital Records and Statistics to verify date of death. The registry includes mortality information for Massachusetts residents and all records of deaths that occur within the Commonwealth regardless of the state of residence. Because Mass-DAC has access to Social Security numbers, the Social Security Index Web site47 also is searched to identify deaths, including those reported to the Social Security Administration by funeral homes or by relatives.
Statistical Analyses
Distributions of clinical and demographic variables are computed and stratified by hospital to identify unusual or extreme values. Because of data collection protocols and auditing procedures, no data are missing in the clinical variables or outcomes for the mortality models.
Risk Adjustment
We first estimated a propensity score model in which the dependent variable was multinomial, assuming 13 distinct values corresponding to the 13 hospitals (1 hospital is the reference group). The specific clinical variables included in the model were selected from a literature review of existing models and expert opinion from a panel of senior cardiac surgeons. A multinomial logistic regression model was estimated, and predictions for each patient in the sample were subsequently obtained. Thus, each patient had 14 estimated probabilities, each reflecting the likelihood that the patient would undergo CABG at 1 specific hospital rather than 1 of the remaining 13 hospitals. For this reason, the sum of the 14 estimated probabilities for each patient was 1.
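Written out, a multinomial logistic model of this kind takes the following form. The notation is introduced here for illustration: $H_i$ is the hospital at which patient $i$ had surgery, $x_i$ is the vector of clinical covariates, and $\alpha_k$ and $\gamma_k$ are the intercept and coefficients for hospital $k$. Labeling the reference hospital as hospital 14 is an arbitrary choice made for this sketch.

```latex
\begin{aligned}
\Pr(H_i = k \mid x_i) &= \frac{\exp(\alpha_k + x_i^{\top}\gamma_k)}
  {1 + \sum_{m=1}^{13} \exp(\alpha_m + x_i^{\top}\gamma_m)}, \qquad k = 1, \dots, 13, \\
\Pr(H_i = 14 \mid x_i) &= \frac{1}{1 + \sum_{m=1}^{13} \exp(\alpha_m + x_i^{\top}\gamma_m)}, \\
\sum_{k=1}^{14} \Pr(H_i = k \mid x_i) &= 1 .
\end{aligned}
```

The last line holds by construction, because the 14 probabilities share a common denominator whose terms are exactly the 14 numerators.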
To compare the performance of each hospital with that of its peers, it is necessary to assess whether the population of patients undergoing surgery at a particular hospital is comparable to that of all other Massachusetts hospitals on the basis of their observed characteristics. To accomplish this, we examined the overlap between the distribution of the propensity scores for patients undergoing surgery at each hospital and the distribution of the propensity scores for patients not undergoing surgery at that hospital. Ideally, the estimated propensity scores of the latter group would cover the entire range of estimated propensity scores at the particular hospital being studied. This finding would provide support for the assumption that the 2 groups of patients (those treated at a particular hospital versus all others) were similar in terms of observable demographic characteristics and other comorbidities.
We next estimated a regression model for the mortality outcomes. The dependent variable was binary, assuming a value of 1 if the patient died of any cause within 30 days of surgery and 0 otherwise. We included the same set of confounders used in the propensity score model. We included a random hospital-specific intercept that represented the underlying quality of the hospital and accounted for within-hospital correlation of patients. We calculated odds ratios (ORs) conditional on the hospital random effects that apply to comparisons of patients belonging to the same hospital (see Larsen and Merlo48 for a discussion of differences between conditional and unconditional ORs).
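A model of the type described can be written as follows. The symbols are introduced here for illustration: $Y_{ij}$ is the 30-day death indicator for patient $i$ at hospital $j$, $x_{ij}$ is the confounder vector, and $\beta_{0j}$ is the hospital-specific intercept. The normal distribution for the random intercepts is the conventional choice and is an assumption of this sketch, because the text does not state the distribution.

```latex
\begin{aligned}
\operatorname{logit} \Pr(Y_{ij} = 1 \mid x_{ij}, \beta_{0j}) &= \beta_{0j} + x_{ij}^{\top}\beta, \\
\beta_{0j} &\sim N(\mu, \tau^{2}), \qquad j = 1, \dots, 14 .
\end{aligned}
```

Under this form, the conditional OR for the $k$th confounder is $\exp(\beta_k)$, which compares 2 patients within the same hospital, and $\tau^{2}$ is the between-hospital variance.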
The size of between-hospital variation was summarized by the median OR (MOR).49 The MOR considers 2 CABG patients with the same set of observed risk factors but selected randomly from 2 different hospitals. The MOR is the OR between the patient with a higher probability of dying and the patient with a lower probability of dying. A MOR value >1 supports the hypothesis that between-hospital variation in mortality exists after adjustment for patient characteristics. If the between-hospital variation were 0, this would imply that differences in hospital outcomes, after adjustment for patient characteristics, are due only to random sampling variability. Although between-hospital variation will always be >0 in practice, some have suggested that small values can be effectively ignored by essentially setting the between-hospital variation component to 0. We see no reason to assume that between-hospital variation is 0 given that this value can be estimated.
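For a random intercept that is normally distributed on the logit scale, the MOR has the closed form exp(sqrt(2 × variance) × z), where z is the 75th percentile of the standard normal distribution. The short Python sketch below applies that formula; the function name is hypothetical. With the between-hospital variance values reported in the Results (0.0939 and 0.048), it reproduces the reported MORs of 1.34 and 1.23.

```python
import math
from statistics import NormalDist


def median_odds_ratio(between_hospital_variance):
    """MOR for a normally distributed random intercept on the logit scale."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * between_hospital_variance) * z75)


print(round(median_odds_ratio(0.0939), 2))  # 1.34 (all 14 hospitals)
print(round(median_odds_ratio(0.048), 2))   # 1.23 (hospital D excluded)
print(median_odds_ratio(0.0))               # 1.0 (no between-hospital variation)
```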
We calculated the mortality risk for each patient using the observed values of his or her confounding variables. The individual risk factors were multiplied by the estimated coefficients from the regression model, transformed onto the probability scale, and summed to obtain the expected number of deaths at each hospital.
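The calculation just described can be sketched in Python as follows. The intercept, coefficient values, and risk-factor names below are hypothetical placeholders, not estimates from the study; only the sequence of steps (linear predictor, inverse-logit transformation, summation) follows the text.

```python
import math


def inverse_logit(linear_predictor):
    """Transform a log-odds value onto the probability scale."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def predicted_risk(intercept, coefficients, risk_factors):
    """Mortality risk for one patient from his or her observed risk factors."""
    linear_predictor = intercept + sum(
        coefficients[name] * value for name, value in risk_factors.items()
    )
    return inverse_logit(linear_predictor)


def expected_deaths(intercept, coefficients, patients):
    """Expected number of deaths: the sum of the patients' predicted risks."""
    return sum(predicted_risk(intercept, coefficients, p) for p in patients)


# Hypothetical coefficients and patients, for illustration only
coefficients = {"renal_failure": 0.9, "prior_cabg": 0.7, "cardiogenic_shock": 1.8}
patients = [
    {"renal_failure": 0, "prior_cabg": 0, "cardiogenic_shock": 0},
    {"renal_failure": 1, "prior_cabg": 0, "cardiogenic_shock": 0},
    {"renal_failure": 1, "prior_cabg": 1, "cardiogenic_shock": 1},
]
print(expected_deaths(-4.5, coefficients, patients))
```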
Hospital RSMRs
We next estimated a risk-standardized mortality ratio for each hospital by computing the ratio of the "observed" number of deaths to the expected number of deaths (RSMR). However, rather than use the actual numbers of deaths at a hospital, we used an adjusted number (called a shrinkage estimate) that avoids several statistical problems associated with the observed number, including small sample sizes and clustering.28,34,50,51 We then multiplied the standardized mortality ratio by the crude state mortality rate to obtain hospital-specific RSMRs. Ninety-five percent posterior intervals for each RSMR were computed.
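One common way to form such a ratio from a hierarchical model is sketched below in Python. It is an assumption of this sketch, not a formula given in the text, that the shrunken "observed" count is the sum of predicted risks computed with the hospital's own estimated intercept, while the expected count uses the average intercept across hospitals. All names and input values are hypothetical.

```python
import math


def inverse_logit(linear_predictor):
    """Transform a log-odds value onto the probability scale."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def risk_standardized_rate(patient_log_odds, mean_intercept,
                           hospital_intercept, crude_rate):
    """RSMR = (shrunken 'observed' deaths / expected deaths) x crude rate.

    patient_log_odds holds each patient's covariate contribution to the
    log-odds of death, excluding any intercept.
    """
    # Numerator: predicted deaths using the hospital's own (shrunken) intercept
    shrunken_observed = sum(
        inverse_logit(hospital_intercept + lp) for lp in patient_log_odds
    )
    # Denominator: deaths expected if an average hospital treated the same patients
    expected = sum(inverse_logit(mean_intercept + lp) for lp in patient_log_odds)
    return shrunken_observed / expected * crude_rate


# Hypothetical case mix; 2.25 is the crude state rate (%) given in the Results
case_mix = [-1.0, -0.5, 0.0, 0.8]
print(risk_standardized_rate(case_mix, -4.0, -4.0, 2.25))  # average hospital
print(risk_standardized_rate(case_mix, -4.0, -3.7, 2.25))  # above-average intercept
```

A hospital whose estimated intercept equals the average recovers the crude rate, and a higher intercept yields a rate above it.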
Cross-Validation
Because all hospitals contribute to the model used to estimate the expected number of deaths, each hospital helps to define its own expected behavior.50,51 If one hospital is truly "outlying," with an unusually high or low mortality rate, it may "inflate" the estimated between-hospital variance component because the regression model adapts to incorporate the results of the unusual hospital. Consequently, this hospital will be less likely to be identified as an outlier. With a very large number of hospitals, the results of one institution are unlikely to distort the model substantially. However, with a smaller number of cardiac surgery hospitals, as in Massachusetts or other individual states, one aberrant hospital could substantially influence the counterfactual outcome and make the performance of that hospital less likely to be identified as an outlier.
We addressed this problem through cross-validation. In a second set of analyses, the data from each hospital were sequentially deleted from the determination of the counterfactual distribution for its particular patients. With this approach, the expected number of deaths for a hospital represents how well the rest of the hospitals in the state would fare with the patients from that specific hospital. We computed the difference between the observed numbers of deaths in each hospital and the number of deaths predicted using its case mix and the regression coefficients from a model based on all other hospitals. Posterior predictive probability values, which reflect the similarity of the mortality experience of a particular hospital to that of its peers, also were computed.50 Extreme predictive P values (P≤0.01 or P≥0.99) indicate a discrepancy between the observed data and what is predicted by the model developed from the remaining hospitals.
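A simplified version of this comparison is sketched below in Python. It assumes that each patient at the held-out hospital already has a predicted risk from a model fitted to the other hospitals, and it treats those risks as fixed and the patients as independent. The posterior predictive P values in the article also reflect uncertainty in the model parameters and between-hospital variation, which this sketch ignores. The function names and input values are hypothetical.

```python
def death_count_distribution(risks):
    """Exact distribution of the number of deaths among independent patients."""
    dist = [1.0]  # before any patient is added, P(0 deaths) = 1
    for p in risks:
        updated = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            updated[k] += prob * (1.0 - p)  # this patient survives
            updated[k + 1] += prob * p      # this patient dies
        dist = updated
    return dist


def upper_tail_probability(risks, observed_deaths):
    """P(predicted deaths >= observed deaths) under the held-out predictions."""
    dist = death_count_distribution(risks)
    return sum(dist[observed_deaths:])


# Hypothetical leave-one-hospital-out risks for 200 patients at one hospital
held_out_risks = [0.01] * 120 + [0.03] * 60 + [0.10] * 20
print(sum(held_out_risks))                         # predicted number of deaths
print(upper_tail_probability(held_out_risks, 12))  # observed 12 deaths
```

With this tail definition, a very small value means that the hospital had more deaths than the model built from its peers would predict for the same patients.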
The authors had full access to and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.
| Results |
|---|
Table 2 illustrates the prevalence of the individual covariates from which these propensity score density distributions were derived. Column 1 shows the characteristics of the subset of patients at hospital B who do not overlap with hospital F (ie, for whom the log-odds of their propensity scores are >5). The prevalence of individual high-risk characteristics is quite elevated in this patient subset (eg, 24% renal failure, 17% reoperation, 10% cardiogenic shock, 52% emergent or salvage), and hospital F has no experience with patients having this overall level of acuity. The last 2 columns demonstrate the balancing properties of propensity scores in the area of overlap, in which patients are found from both hospitals with comparable log-odds of propensity score. For many of the most important covariates (eg, prior CABG, cardiogenic shock, recent myocardial infarction, urgent or emergent/salvage status), the prevalence was comparable for hospital B and F patients in the overlap region.
Although direct hospital-to-hospital covariate balance was poor, the overlap of estimated propensity score distributions for each hospital compared with the propensity score distribution for patients at most of the remaining hospitals was excellent. For example, Figure 2A displays the overlap for hospital B and all remaining hospitals based on the predictions obtained from the multinomial logistic regression model. This suggests that a comparison of the performance of hospital B relative to the overall group of other Massachusetts CABG providers is statistically valid.
The prevalence of the confounders and their relationship to 30-day mortality are presented in Table 3. Between-hospital variation measured by the MOR, after accounting for patient risk factors, is 1.34. This implies that for 2 patients with the same observed risk factors treated at 2 randomly selected hospitals, the patient treated in the hospital with higher mortality risk has, in median, 1.34 times the odds of dying within 30 days of isolated CABG compared with the patient treated in the hospital with lower mortality risk.
The last column of Table 4 depicts the typical profiling results that would be obtained with the entire state experience (all 14 hospitals) as the counterfactual. The 95% posterior interval of each hospital for its RSMR includes the state crude rate of 2.25%. This would imply that no hospital had a higher- or lower-than-expected mortality rate given its case mix. In most public report cards, this finding would be regarded as sufficient evidence for the absence of statistical outliers, but as noted previously, this conclusion may be misleading. The 3 columns on the left demonstrate the results of analyses performed with cross-validation, sequentially deleting the results of each hospital from the determination of its own counterfactual. The result of this cross-validation predictive P value analysis was highly significant (P=0.01) for hospital D on the left side of Table 4. Supporting this concern is the fact that the between-hospital variation in risk-adjusted mortality is reduced by approximately 50% when hospital D is excluded from the model (from 0.0939 to 0.048; data not shown), and the MOR decreases from 1.34 to 1.23. Finally, a 2.26% excess mortality rate results when hospital D is compared with its peers. These findings all suggest that hospital D is in fact a statistical outlier.
| Discussion |
|---|
Are current report cards useful? Yes, they are useful when interpreted in the correct context. Most outcomes report cards use indirect standardization. In this context, the RSMR of a hospital may be interpreted as a measure of quality for the type of patient it treats. Properly constructed and interpreted, report cards facilitate comparisons of hospitals with the entire experience of a larger population of providers (eg, a state or region). Such a comparison group for each hospital typically will be rich enough to support a valid assessment of its quality of care, and it provides meaningful information to payers, regulators, and healthcare consumers.
| Conclusions |
|---|
| Acknowledgments |
|---|
Dr Normand is contracted by the Massachusetts Department of Public Health to monitor hospital cardiac quality and also receives funding from Yale University to develop risk models for CMS.
Disclosures
None.
| References |
|---|
2. Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academies Press; 2001.
3. Institute of Medicine. Performance Measurement: Accelerating Improvement. Washington, DC: National Academies Press; 2006.
4. Gatsonis CA. Profiling providers of medical care. In: Armitage P, Colton T, ed. Encyclopedia of Biostatistics, Volume 6. 2nd ed. Chichester, UK: John Wiley & Sons Ltd; 2005: 4252–4254.
5. Normand S-LT. Quality of care. In: Armitage P, Colton T, ed. Encyclopedia of Biostatistics, Volume 6. 2nd ed. Chichester, UK: John Wiley & Sons Ltd; 2005: 4348–4352.
6. Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Stat Sci. 1990; 5: 472–480.
7. Holland PW. Statistics and causal inference. J Am Stat Assoc. 1986; 81: 945–960.
8. Holland PW, Rubin DB. Causal inference in retrospective studies. Eval Rev. 1988; 12: 203–231.
9. Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005; 95: S144–S150.
10. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia, Pa: Lippincott-Raven; 1998.
11. Pearl J. Causality: Models, Reasoning, and Inference. Cambridge, UK: Cambridge University Press; 2000.
12. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol. 1986; 123: 392–402.
13. Rosenbaum PR, Rubin DB. Estimating the effects caused by treatments: comment. J Am Stat Assoc. 1984; 79: 26–28.
14. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983; 70: 41–55.
15. Rosenbaum PR. Observational Studies. New York, NY: Springer; 2002.
16. Little RJ, Rubin DB. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annu Rev Public Health. 2000; 21: 121–145.
17. Rubin DB. Causal inference using potential outcomes: design, modeling, decisions. J Am Stat Assoc. 2005; 100: 322–331.
18. Gelman A. Applied Bayesian Modeling and Causal Inference From Incomplete Perspectives. Chichester, UK: Wiley; 2004.
19. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge, UK: Cambridge University Press; 2007.
20. Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol. 2002; 31: 422–429.
21. Rubin DB. The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat Med. 2007; 26: 20–36.
22. Rubin DB. Direct and indirect causal effects via potential outcomes. Scand J Stat. 2004; 31: 161–170.
23. Rubin DB. Bayesian-inference for causal effects: role of randomization. Ann Stat. 1978; 6: 34–58.
24. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974; 66: 688–701.
25. Fleiss JL, Levin BA, Paik MC. Statistical Methods for Rates and Proportions. Hoboken, NJ: J. Wiley; 2003.
26. Shahian DM, Blackstone EH, Edwards FH, Grover FL, Grunkemeier GL, Naftel DC, Nashef SA, Nugent WC, Peterson ED. Cardiac surgery risk models: a position article. Ann Thorac Surg. 2004; 78: 1868–1877.
27. Shahian DM, Normand SL, Torchiana DF, Lewis SM, Pastore JO, Kuntz RE, Dreyer PI. Cardiac surgery report cards: comprehensive review and statistical critique. Ann Thorac Surg. 2001; 72: 2155–2168.
28. Normand S-LT, Glickman ME, Gatsonis CA. Statistical methods for profiling providers of medical care: issues and applications. J Am Stat Assoc. 1997; 92: 803–814.
29. McNeil BJ, Pedersen SH, Gatsonis C. Current issues in profiling quality of care. Inquiry. 1992; 29: 298–307.
30. Hannan EL, Wu C, Ryan TJ, Bennett E, Culliford AT, Gold JP, Hartman A, Isom OW, Jones RH, McNeil B, Rose EA, Subramanian VA. Do hospitals and surgeons with higher coronary artery bypass graft surgery volumes still have lower risk-adjusted mortality rates? Circulation. 2003; 108: 795–801.
31. Hannan EL, Kumar D, Racz M, Siu AL, Chassin MR. New York State's Cardiac Surgery Reporting System: four years later. Ann Thorac Surg. 1994; 58: 1852–1857.
32. Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SL. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with an acute myocardial infarction. Circulation. 2006; 113: 1683–1692.
33. Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SL. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with heart failure. Circulation. 2006; 113: 1693–1701.
34. Shahian DM, Torchiana DF, Shemin RJ, Rawn JD, Normand SL. Massachusetts cardiac surgery report card: implications of statistical methodology. Ann Thorac Surg. 2005; 80: 2106–2113.
35. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am Stat. 1985; 39: 33–38.
36. Rosenbaum PR, Rubin DB. Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc. 1984; 79: 516–524.
37. Rubin DB. The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat Med. 2007; 26: 20–36.
38. D'Agostino RB Jr. Propensity scores in cardiovascular research. Circulation. 2007; 115: 2340–2343.
39. Braitman LE, Rosenbaum PR. Rare outcomes, common treatments: analytic strategies using propensity scores. Ann Intern Med. 2002; 137: 693–695.
40. D'Agostino RB Jr. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med. 1998; 17: 2265–2281.
41. Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. Am J Epidemiol. 1999; 150: 327–333.
42. Glance LG, Osler TM, Mukamel DB, Dick AW. Use of a matching algorithm to evaluate hospital coronary artery bypass grafting performance as an alternative to conventional risk adjustment. Med Care. 2007; 45: 292–299.
43. Huang IC, Frangakis C, Dominici F, Diette GB, Wu AW. Application of a propensity score approach for risk adjustment in profiling multiple physician groups on asthma care. Health Serv Res. 2005; 40: 253–278.
44. Dehejia RH, Wahba S. Causal effects in nonexperimental studies: reevaluating the evaluation of training programs. J Am Stat Assoc. 1999; 94: 1053–1062.
45. Tchernis R, Horvitz-Lennon M, Normand SL. On the use of discrete choice models for causal inference. Stat Med. 2005; 24: 2197–2212.
46. Society of Thoracic Surgeons. STS National Database. Available at: http://www.sts.org/sections/stsnationaldatabase/. Accessed September 5, 2007.
47. Social Security Death Index interactive search. Available at: http://ssdi.rootsweb.com/cgi-bin/ssdi.cgi. Accessed September 5, 2007.
48. Larsen K, Merlo J. Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression. Am J Epidemiol. 2005; 161: 81–88.
49. Larsen K, Petersen JH, Budtz J, Endahl L. Interpreting parameters in the logistic regression model with random effects. Biometrics. 2000; 56: 909–914.
50. Normand ST, Shahian DM. Statistical and clinical aspects of hospital outcomes profiling. Stat Sci. 2007; 22: 206–226.
51. Draper D, Gittoes M. Statistical analysis of performance indicators in UK higher education. J Royal Stat Soc Ser A (Stat Soc). 2004; 167: 449–474.
52. Iezzoni LI. Risk Adjustment for Measuring Health Care Outcomes. 3rd ed. Chicago, Ill: Health Administration Press; 2003.
| Footnotes |
|---|
Related Article:
Circulation 2008 117: 1909.