(Circulation. 2007;115:109-113.)
© 2007 American Heart Association, Inc.
Statistical Primer for Cardiovascular Research |
From Massachusetts General Hospital Biostatistics Center and Institute for Health Policy (S.R.R.), and Massachusetts General Hospital Biostatistics Center, Department of Biostatistics, Harvard School of Public Health (D.A.S.), Boston, Mass.
Correspondence to Sowmya R. Rao, PhD, Massachusetts General Hospital, Biostatistics Center, 50 Staniford St, Suite 560, Boston, MA 02114. E-mail srrao{at}partners.org
Key Words: proportional hazards models Kaplan-Meiers estimate survival
| Introduction |
|---|
|
|
|---|
Survival distributions are usually described in terms of 2 functions: the survival function, S(t), defined as the probability that a person survives past a specified time t; and the hazard function, h(t), which is the instantaneous failure rate and is defined as: equation
|
|
Suppose a patient has survived to time t; then the hazard function is the probability that the patient will have an event in the next instant. The hazard function is conceptually useful in describing survival distributions but is rarely published. The greater the hazard function, the shorter is the survival time.
The survival methods that we describe require that the censoring time is independent of the event time. This is called noninformative censoring. An example that illustrates when this assumption would always be met is a clinical trial in which patients enter the study over a period of time and there are no dropouts. If the patient does not have an event before the end of the study, the patients event time will be censored. The distribution of the potential censoring time will only depend on when the patient entered the study. This time will be independent of the patients time to event as long as there are no secular trends in the survival distribution. An extreme example of an instance when these assumptions would not be met is a study of time to death, where patients are no longer followed up after they recover from a disease. Patients who are lost to follow-up in a clinical trial or drop out of a clinical trial are problematic because the time to their last observation may or may not be related to their unobserved event time. For instance, if patients who feel better drop out of the study, the censoring may be informative. Approaches to this problem have been the focus of an extensive literature.1
| Estimating the Survival Function |
|---|
|
|
|---|
The Kaplan-Meier method2 is frequently used to estimate the survival function when there are censored data. The best way to understand this method is to break up the time scale into intervals that end at each event time. Let t1,t2,... be ordered event times. Then, because no event occurs before time t1, the value of S(t) is 1 from t=0 to just before t=t1. Suppose that n1 is the number of patients that are being observed at time t1 (by convention, if a patients censoring time is t1, the patient is considered to be observed at t1), and m1 is the number of events at t1. Then, S(t) is equation
|
|
for t=t1 up until just before t2. Note that this is simply the proportion of patients that survive past t1 among those who survived until t1. At t2, one estimates the probability that a patient survives past t2 given that the patient lives up to t2 by equation
|
|
The estimate of S(t) is then equation
|
|
from t=t2 up to just before t=t3. Most computer programs that compute the Kaplan-Meier survival curve start with 2 columns of data; the first is the survival or censoring time, and the second is a censoring indicator that is 0 if the time is a censoring time and 1 if the time is an event time.
The plots with survival time on the horizontal axis and the proportion surviving, S(t), on the vertical axis when there is censoring are called the KM curves. These start at 1 (because probability of survival beyond time 0 is 1) and step down toward zero. If there are subjects who survive beyond the study time, then the survival curve does not go to zero but stays horizontal. Because the survival curve does not change in intervals in which no events occur, one can calculate the curve at event times only. The mean time to an event is estimated by the area under the survival function. If the largest time is an event time, then the survival function goes to zero at that time, and this estimate is finite. Otherwise, the mean time cannot be estimated. This is why median survival is used more often than mean survival. The median survival time is estimable only if the survival curve drops to or below 0.5. The median survival time is estimated by the value on the horizontal axis at the intersection of a horizontal line drawn from the vertical axis to the survival line where S(t)=0.5. If the KM curve drops to or below 0.5 but does not equal 0.5, then the first event time when the curve falls below 0.5 is used.
Survival curves can be compared to assess differences in treatment effects. If
2 groups are being compared, survival curves are plotted for each group. If the survival curves are parallel to each other, then the group that consistently has a higher survival curve than the other has longer survival, and the treatment given to this group is concluded to be the better of the treatments being studied. Although visually the survival curves for the 2 groups might seem different from each other, we need to test whether the "true" survival curves are statistically different by using a formal test.
| Hypothesis Testing |
|---|
|
|
|---|
Proportional Hazards or Log-Rank Test
The log-rank test can be used to test the hypothesis of no difference in survival between the 2 groups. This test makes the assumptions that the observations are independent and that the censoring distribution is independent of the survival distribution; notably, the censoring distribution can be different in each group, so that the test can be used to compare a current treatment group with a historical control.
This tests the null hypothesis that there is no statistical difference between the survival curves in the 2 groups. The basic idea behind the test is that at each event time ti there will be n1i patients in group 1 and n2i patients in group 2. Under the null hypothesis of no treatment effect, the probability that the treatment group of the patient who had the event will be in group 1 is equation
|
|
Thus, if we define an indicator variable
i to equal 1 if the patient is from group 1 and zero if the patient is from group 2, then equation
|
|
has a mean equal to zero. This is the numerator of the log-rank test. The denominator is the standard deviation of this quantity: equation
|
|
The log-rank test can be used to compare
2 survival curves. It is preferable to use the multivariable regression method to assess the relationship of many risk factors to survival.
Cox Proportional Hazards Model
This is the most popular method to evaluate the relationship between covariates and survival with the use of a mathematical model. This is called a semiparametric model because it does not assume any distribution for the baseline hazard. The model is defined as equation
|
|
where
0(t) is the baseline hazard at time t and x1,x2, ..., xk are k independent covariates. No assumptions are made regarding the baseline hazard function.
We can test the association of each of the independent variables with survival time adjusted for other covariates. It is important to understand the meaning of the parameters in a proportional hazards model. Suppose first that the covariate, say x1, has 2 values, 0 and 1. Then, exp(
1) is the hazard ratio for patients with x1=1 versus those with x1=0. That is the instantaneous probability of an event in one group divided by that probability in the other. Notice that we are modeling the hazard so that if patients for whom x1=1 have longer survival times, then
1 will be negative. The model specifies that the hazard ratio is constant over time and for the values of all the other covariates. When the covariate is continuous, then exp(
1) is the hazard ratio for a unit change in the value of x1. It is often helpful to divide continuous covariates by their standard deviation so that the units for each covariate are comparable, and
1,
2,... have the same scale, which would be 1 standard deviation. Hazard ratios are approximately equal to the relative risk (ratio of risk in the exposed group to the risk in the unexposed group) and are used interchangeably.
Parametric Methods
Parametric methods assume that the survival times follow a specified distribution. Exponential, Weibull, Gompertz, Gamma, and log-normal distribution are often used for survival times. In the exponential model the hazard function is constant, which means a persons probability of an event in the future is independent of how long the person has gone without an event. Weibull, Gompertz, or Gamma can be used when the hazard function is monotonically increasing or decreasing. The log-normal distribution is assumed when the hazard increases in the beginning and then decreases.
Probably the most commonly used parametric survival model is the exponential model, which is defined as equation
|
|
This model has some very useful properties. Without covariates, the survival function is S(t)=exp[t exp(
0)]. The hazard function is constant exp(
0), the mean survival is exp(
0), and the median survival is exp(
0)xlog(0.5). To estimate
0, let n be the total number of events and let f be the total amount of follow-up; then
0 is estimated by log(n/f), which has a standard error of 1/
n. The estimates from the exponential model are the same as those from the proportional hazards model when the data are exponential, which is why the exponential distribution is often used for sample size calculations even when the data are to be analyzed with a proportional hazards model or a log-rank test.3,4
Kalbfliesch and Prentice5 provide more details on parametric methods.
A brief discussion of other issues related to survival analysis is presented in the next section.
| Additional Topics |
|---|
|
|
|---|
Interval Censoring
It is important to identify the time origin in a survival analysis. The usual time origin is the entry into the study. In that case, at time 0 the value of n0 is the total number of patients. However, many survival techniques will also work without this restriction. For instance, one can analyze age at death among nursing home patients using survival theory. In that case, the time variable is age, and nj will not decrease but will vary as people of different ages enter the nursing home. This is known as truncation. For those interested in this type of survival analysis, see Hyde11 and Turnbull.12 There are also a wealth of techniques for the situation in which the event time is only known up to an interval; for instance, if the event were the development of a condition that could only be diagnosed by an ultrasound or a laboratory test, all that one would know was that it occurred after the last negative test and before the next positive one. This is known as interval censoring.13
Time-Varying Covariates
The assumptions of proportional hazards may not hold for a given data set. The proportional hazards model can be made more general because one can add time-varying covariates to handle situations in which the hazard ratio is not constant over time or add interaction and quadratic terms when the hazard ratio is not constant over other covariates. For instance, the covariate txZ would model a situation in which the hazard ratio increases linearly when Z=1 compared with Z=0. Thus, issues of whether the model "fits" the data are actually issues about whether the model is correctly specified. Tests of fit are described by Schoenfeld,3,14 and Wei.15 and the consequences of misspecification on testing are described by Lagakos and Schoenfeld16 and Gail et al.17
If the model is used for testing, the conclusion is that misspecification is not a great problem.16 However, estimates are biased and represent a weighted average of the hazard over the duration of the study.
One of the advantages of covariate adjustment is that it can help to ameliorate the effects of informative censoring. In an analysis with covariates, the censoring distribution can depend on the covariates. Informative censoring in this case would be a dependency between the event time and the censoring time for patients with the same value of all covariates included in the model. Thus, for instance, if a covariate affected both the censoring time and the event time, then the censoring would be informative in a model without the covariate but noninformative in a model with the covariate.
Power Calculations
Power calculations are useful for designing studies. To calculate the power for survival analysis, one needs to know the total number of subjects in the trial, accrual time (time during which subjects are recruited in the study), failure time (time at which an event or death occurs), median survival ratio (or hazard ratio) or the minimum detectable hazard, and the significance level.3,4,13,18 Standard software packages like SAS, SPSS, S-PLUS, and STATA have procedures that are simple to use for all methods described in this article.
| Example |
|---|
|
|
|---|
The aim of the study was to evaluate whether n-3 fatty acids prevent potentially fatal ventricular arrhythmias in high-risk patients. A total of 402 patients with implantable cardioverter-defibrillators (ICDs) were randomly assigned to double-blind treatment with either a fish oil or olive oil daily supplement for 12 months. The primary end point was time to first ICD event for ventricular tachycardia or fibrillation confirmed by stored electrograms or death from any cause. Analyses were performed both according to the intention to treat and according to actual treatment. All randomized subjects in this study were included in the intention-to-treat analysis. The primary analysis, based on confirmed events, was an intention-to-treat analysis of the survival free of appropriate ICD events for ventricular tachycardia/ ventricular fibrillation and/or death from any cause, which included all ICD events that occurred during the 12-month period after the first dose of the study drug, irrespective of the duration of treatment. An "on-treatment" analysis was done that included all ICD events that occurred no later than 2 months after treatment was stopped. In this analysis, the date of cessation of treatment plus 2 months was used as the censoring variable. To ensure that the time to event was independent of time to noncompliance, conditional on covariates in the model, the authors tested for associations between baseline variables and time to noncompliance and used any that were significant as covariates in this analysis.20
Time to first event analysis was calculated by the Kaplan-Meier method, and survival time across the 2 groups was compared with log-rank tests. Cox proportional hazards models were also performed to calculate hazard ratios and to adjust for clinical covariates that were associated with noncompliance in the on-treatment analysis and with the primary end point in the multivariable analysis.
The survival plots displayed in the Figure and the results of the analysis displayed in the Table were published previously in Circulation.19
|
|
Time to First Event Analyses
In the primary analysis, according to the intention-to-treat principle, there was a trend toward a longer time to first ICD event for ventricular tachycardia/ventricular fibrillation confirmed by electrograms or death from any cause among patients randomized to fish oil compared with those randomized to the olive oil placebo (P=0.057). According to the KM estimates (Figure), 28% of patients in the fish oil arm (n=57) and 39% of patients in the olive oil arm (n=78) had reached the primary end point at 12 months. This difference corresponds to a hazard ratio of 0.72. The multivariable analysis that controlled for baseline clinical characteristics resulted in a hazard ratio of 0.67, 95% confidence limits of 0.47 to 0.95, and significance of P=0.024 (Table).
In the "on-treatment" analysis of "confirmed" events, which included all who had taken any prescribed oil supplements during the 12-month period, the hazard ratio was 0.73, which was not significant (P=0.11). This analysis controlled for baseline left ventricular ejection fraction, which was the only variable affecting time to noncompliance. After the investigators controlled for more baseline variables, the reduction in risk associated with use of fish oil became significant (hazard ratio, 0.67; P=0.037) (multivariable analysis, Table).
It is interesting that the P value decreases in the multivariable analysis compared with the analysis that only considers treatment. This is not necessarily due to confounding, which was not large in this study. The effect is rather due to the fact that a multivariable model that controls for factors that affect the time to event increases the power to see a treatment effect when the covariates are equally distributed between the treatment groups, which is usually the case in clinical trials. This effect is quantified in Schoenfeld et al.21 When a study is designed, it is often a conundrum to decide whether the covariate analysis should be a primary or secondary analysis, but the decision regarding the primary analysis and the covariates to be used in the analysis should be made before data collection for a confirmatory type of study. It has more power at the cost of complexity.
| Acknowledgments |
|---|
This study was supported by research grant CA-74302 to Dr Schoenfeld.
Disclosures
None.
| References |
|---|
|
|
|---|
This article has been cited by other articles:
![]() |
K. L. Lunetta Genetic Association Studies Circulation, July 1, 2008; 118(1): 96 - 101. [Full Text] [PDF] |
||||
![]() |
D. Oakes and D. R. Peterson Survival Methods: Additional Topics Circulation, June 3, 2008; 117(22): 2949 - 2955. [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Circulation Home | Subscriptions | Archives | Feedback | Authors | Help | AHA Journals Home | Search Copyright © 2007 American Heart Association, Inc. All rights reserved. Unauthorized use prohibited. |