(Circulation. 2000;102:1523.)
© 2000 American Heart Association, Inc.
Clinical Investigation and Reports
From Høgskolen i Stavanger (T.E., S.O.A., J.H.H.), Department of Electrical and Computer Engineering, Stavanger, Norway; Ulleval University Hospital (K.S.), Institute for Experimental Medical Research and Norwegian Air Ambulance, Oslo, Norway; and Ulleval University Hospital (K.S., P.A.S.), Department of Anesthesiology, Oslo, Norway.
Correspondence to Trygve Eftestøl, Høgskolen i Stavanger, Department of Electrical and Computer Engineering, PO Box 2557, Ullandhaug, 4091 Stavanger, Norway. E-mail trygve-e{at}ux.his.no
| Abstract |
|---|
Methods and Results: Centroid frequency, peak power frequency, spectral flatness, and energy were studied. A second decorrelated feature set was generated with the coefficients of the principal component analysis transformation of the original feature set. Each feature set was split into training and testing sets for improved reliability in the evaluation of nonparametric classifiers for each possible feature combination. The combination of centroid frequency and peak power frequency achieved a mean±SD sensitivity of 92±2% and specificity of 27±2% in testing. The highest performing classifier corresponded to the combination of the 2 dominant decorrelated spectral features with sensitivity and specificity equal to 92±2% and 42±1% in testing or a positive predictive value of 0.15 and a negative predictive value of 0.98. Using the highest performing classifier, 328 of 781 shocks not leading to ROSC would have been avoided, whereas 7 of 87 shocks leading to ROSC would not have been administered.
Conclusions: The ECG contained information predictive of shock outcome. This could reduce the delivery of unsuccessful shocks and thereby the duration of unnecessary "hands-off" intervals during cardiopulmonary resuscitation. The low specificity and positive predictive value indicate that other features should be added to improve performance.
Key Words: cardiopulmonary resuscitation; fibrillation; defibrillation; Fourier analysis
| Introduction |
|---|
From 20% to 80% of defibrillation attempts in clinical studies are reported to cause the discontinuation of VF8 9 10 11 (the great variability depends on different time definitions of VF reoccurrence or different shock waveforms), but we found a frequency of only 10% ROSC after 883 individual shocks in 156 patients.11 This is similar to the results of previous reports.8 9 12
Brown et al12 reported that the combination of centroid frequency (CF) and peak power frequency (PPF) of the VF could predict ROSC with 100% sensitivity and 47.1% specificity. We question the reliability of their results due to both the study design and the small data set, with only 9 incidents of ROSC after 128 shocks in 55 patients. The reliability of their results could have been confirmed if the prognostic criteria had been defined from 1 data set ("training set") and the sensitivity and specificity had been derived from a new data set ("testing set") instead of both having been determined from the same data set.
We have therefore attempted to predict defibrillation outcome in human cardiac arrest by combining features of spectral characterization, splitting the data into training and testing sets, and using classifier generalization techniques in an attempt to increase the degree of expected reliability. One of the combinations studied was that reported by Brown et al.12
| Methods |
|---|
Characterization
The ECG segments before shock were grouped according to shock
outcome. Outcome was defined as ROSC if a palpable pulse was
present in the postshock period (independent of duration). The
remainder of the shocks corresponded to No ROSC, including conversions
to electromechanical dissociation, asystole, VF (VF starting >5
seconds after the shock), or nonreset shocks (VF starting <5 seconds
after the shock). If the initial postshock rhythm was present for
>10% of the duration of the interval, it was defined as the postshock
rhythm. Otherwise, the next rhythm was considered. This was done
through an automated procedure, which handled all except 15 shocks.
These failures were caused by inconsistencies in the annotation structure.
Each shock was analyzed as an independent event because we
wanted to predict the result of each shock independent of clinical
variables that may affect outcome.12 Thus, we made no
distinction between shocks with or without prior ACLS in the class
division scheme (Table 1). We evaluated
whether ACLS affected the results with the best predicting feature by
further dividing the ROSC and No ROSC groups into subgroups who did or
did not receive ACLS before shock.
The prediction analysis was performed in 2 stages that apply pattern recognition methods.14 First, the ECG was spectrally characterized (feature extraction), and second, decision regions for shock outcome prediction were determined and evaluated.
Feature Extraction
The characterizing features were computed from the estimated
power spectral density (PSD) of each ECG
segment.4 5 6 12 15 16 17 18
[Equation not recovered: PSD estimate of the ECG segment]
We attempted to discriminate between preshock ECG segments that correspond to ROSC and No ROSC outcome by computing the following features from the ECG segment PSD estimates.
The CF, or median frequency,4 5 15 17 is given
by
[Equation not recovered: definition of CF]
where $f_l \le f \le f_u$; $f_l$ and $f_u$ are the lower-
and higher-frequency band limits, respectively. By varying these
limits, we could study the effect of extracting features from different
frequency bands.
PPF is given by
[Equation not recovered: definition of PPF]
The spectral flatness measure (SFM)19 of the VF is given
by:
[Equation not recovered: definition of SFM]
Various time domain measurements of signal amplitude characteristics of
VF have been investigated.6 12 16 20 21 22 In the
present study, we investigated an alternative frequency
bandlimited energy measurement (ENRG):
[Equation not recovered: definition of ENRG]
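As a rough illustration of the feature extraction step, the sketch below computes the 4 spectral features from a periodogram. The defining equations are not available in this text, so the formulas in the code are the common textbook definitions (power-weighted mean frequency for CF, location of the PSD maximum for PPF, ratio of geometric to arithmetic mean of the PSD for SFM, summed PSD for ENRG) and should be read as assumptions rather than the exact expressions used in the study. The Hamming window, the mean removal, and the name `spectral_features` are likewise illustrative; the sampling rate, segment length, and zero padding follow the Experimental Setup.

```python
import numpy as np

FS = 100.0      # sampling rate in Hz (from the Experimental Setup)
SEG_LEN = 400   # samples per preshock segment
NFFT = 512      # zero-padded FFT length


def spectral_features(segment, f_lo=0.0, f_hi=25.0):
    """Return (CF, PPF, SFM, ENRG) for one ECG segment, using textbook formulas."""
    x = np.asarray(segment, dtype=float)
    x = x - x.mean()                                # remove DC offset (assumption)
    windowed = x * np.hamming(x.size)               # window choice is an assumption
    psd = np.abs(np.fft.rfft(windowed, n=NFFT)) ** 2 / x.size
    freqs = np.fft.rfftfreq(NFFT, d=1.0 / FS)

    band = (freqs >= f_lo) & (freqs <= f_hi)        # keep f_l <= f <= f_u
    f, p = freqs[band], psd[band]
    p = np.maximum(p, 1e-12)                        # guard against log(0) in SFM

    cf = np.sum(f * p) / np.sum(p)                  # centroid frequency
    ppf = f[np.argmax(p)]                           # peak power frequency
    sfm = np.exp(np.mean(np.log(p))) / np.mean(p)   # spectral flatness
    enrg = np.sum(p)                                # band-limited energy
    return cf, ppf, sfm, enrg


# Synthetic stand-in for a VF segment: a 5 Hz sinusoid plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(SEG_LEN) / FS
demo = np.sin(2 * np.pi * 5.0 * t) + 0.05 * rng.standard_normal(SEG_LEN)
cf, ppf, sfm, enrg = spectral_features(demo)
```

For a nearly pure 5 Hz tone, both CF and PPF land close to 5 Hz and SFM is close to 0; a broadband signal would push SFM toward 1.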
An alternative decorrelated feature set was generated by principal component analysis (PCA) transformation.14 The features were projected onto the eigenvectors that best represented the entire data set. Thus, the new decorrelated feature set is represented by the magnitudes of the projections along the eigenvectors. Before classification, a combination from either the original or the decorrelated feature set was placed into a feature vector, v.
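The PCA decorrelation described above can be sketched as an eigendecomposition of the feature covariance matrix followed by a projection. This is a minimal illustration on synthetic data; the function name `pca_decorrelate` is hypothetical, and because the text does not say whether the features were scaled before the transformation, the sketch only centers them.

```python
import numpy as np


def pca_decorrelate(features):
    """Project feature vectors (rows) onto the eigenvectors of their covariance matrix."""
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # dominant component first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return centered @ eigvecs, eigvals


# Synthetic, deliberately correlated stand-ins for [SFM, ENRG, CF, PPF].
rng = np.random.default_rng(1)
base = rng.standard_normal((200, 2))
raw = np.column_stack([base[:, 0],
                       base[:, 0] + 0.1 * base[:, 1],
                       base[:, 1],
                       base[:, 1] - 0.1 * base[:, 0]])
scores, variances = pca_decorrelate(raw)
score_cov = np.cov(scores, rowvar=False)            # diagonal up to rounding error
```

The columns of `scores` play the role of PCA1 through PCA4. In this synthetic case the 4 inputs are built from only 2 independent sources, so the last 2 variances are essentially 0.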
Classification
In the classifier, each feature vector, v, was considered to
belong to 1 of the K classes, $\omega_i$, $i=1, \ldots, K$,
which corresponded to the shock outcome rhythms (K=5). As shown in
Table 1, $\omega_1$ and $\omega_2, \ldots, \omega_5$ corresponded to
the ROSC and No ROSC groups, respectively. Decision regions for these
defined classes can be calculated retrospectively from annotated data
with classification theory. The class membership of new data is decided
through prospective comparison with these decision regions.
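K decision regions, $R_i$, $i=1, \ldots, K$, are computed by assigning costs for the possible wrong decisions.
Each cost, $C(\omega_i, \omega_j)$, expresses the risk associated with
classification of a pattern of the true class $\omega_i$ as belonging
to the decided class $\omega_j$. A reject class, $\omega_{K+1}$, is
added to handle ambiguous or out-of-range patterns. Each $R_i$ is
calculated by selecting the minimum component of the risk vector
$\mathbf{r}=[r_1\ r_2\ \ldots\ r_{K+1}]^T$, where

$$r_i = \sum_{j=1}^{K} C(\omega_j, \omega_i)\, P(\omega_j \mid \mathbf{v})$$

and $P(\omega_j \mid \mathbf{v})$, $j=1, \ldots, K$, denotes the a posteriori probability
function for class $\omega_j$, which is derived according to Bayes rule:

$$P(\omega_j \mid \mathbf{v}) = \frac{p(\mathbf{v} \mid \omega_j)\, P(\omega_j)}{\sum_{i=1}^{K} p(\mathbf{v} \mid \omega_i)\, P(\omega_i)}$$

where $P(\omega_i)$ and $p(\mathbf{v} \mid \omega_i)$ denote the a priori probability
and the class-specific probability density function (PDF) for class
$\omega_i$, respectively.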
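The minimum-risk rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the demo uses 2 classes instead of K=5, the density values are invented, and the risk of the reject class is taken to be a constant because the text does not say how it was assigned. Only the 10% prior for ROSC echoes the rate quoted in the Introduction.

```python
def posteriors(likelihoods, priors):
    """Bayes rule: P(w_j | v) from class densities p(v | w_j) and priors P(w_j)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def min_risk_decision(post, cost, reject_cost):
    """Return (decision, risks); decision index K (one past the last class) means reject.

    cost[j][i] is the cost of deciding class i when the true class is j.
    """
    k = len(post)
    risks = [sum(cost[j][i] * post[j] for j in range(k)) for i in range(k)]
    risks.append(reject_cost)        # constant reject risk: an assumption
    return min(range(k + 1), key=risks.__getitem__), risks


# Class 0 = ROSC, class 1 = No ROSC; the densities at v are invented numbers.
post = posteriors(likelihoods=[0.5, 0.2], priors=[0.1, 0.9])

# Equal costs: the more probable class (No ROSC) wins.
equal_costs, _ = min_risk_decision(post, [[0, 1], [1, 0]], reject_cost=0.5)

# Raising the cost of missing a ROSC shifts the decision toward ROSC.
weighted, _ = min_risk_decision(post, [[0, 5], [1, 0]], reject_cost=2.0)

# With a cheap reject option, the same ambiguous pattern is rejected instead.
rejected, _ = min_risk_decision(post, [[0, 5], [1, 0]], reject_cost=0.5)
```

Scaling one row of the cost matrix, as in the second call, is the mechanism by which a target sensitivity for one class can be approached.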
The classifier performance characteristics are expressed by the
sensitivity (probability of positive prediction of ROSC outcome) and
specificity (probability of negative prediction of No ROSC outcome)
given by
[Equations not recovered: sensitivity and specificity in terms of $P(\omega_j \mid \omega_i)$]
where $P(\omega_j \mid \omega_i)$ expresses
the proportion of true class $\omega_i$ with the
corresponding decision being $\omega_j$.
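As a small sketch of how sensitivity and specificity follow from the proportions of true class versus decided class, the function below collapses a K-class confusion matrix into ROSC versus No ROSC. Counting any decision among the No ROSC classes as a correct negative, and ignoring rejected patterns, are assumptions; the function name and the counts in the example are invented for illustration.

```python
def sensitivity_specificity(confusion, positive=0):
    """confusion[i][j] counts patterns of true class i that were decided as class j."""
    k = len(confusion)
    negatives = [c for c in range(k) if c != positive]
    tp = confusion[positive][positive]
    fn = sum(confusion[positive][j] for j in negatives)
    tn = sum(confusion[i][j] for i in negatives for j in negatives)
    fp = sum(confusion[i][positive] for i in negatives)
    return tp / (tp + fn), tn / (tn + fp)


# Toy 3-class example: class 0 = ROSC, classes 1 and 2 = two No ROSC rhythms.
toy = [[9, 1, 0],
       [4, 5, 1],
       [6, 2, 2]]
sens, spec = sensitivity_specificity(toy)
```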
The decision regions were calculated iteratively with minimization of
the object function
[Equation not recovered: object function involving $P(\omega_i \mid \omega_i)$]
This is done by multiplying the costs, $C(\omega_i, \omega_j)$, $j \ne i$, by
a factor. By setting $i=1$ ($\omega_1$ corresponding to ROSC), this allowed
specification of a sensitivity for the recognition of ROSC outcome.
The underlying statistics can be estimated with classification theory.14 Multidimensional histograms were applied in which the feature space is divided into bins of equal volume, in which the PDF estimates are computed. Each feature is normalized by dividing the respective feature axis into nb equal-sized intervals in the range from the minimum to the maximum feature value. The PDF estimates in each histogram bin are then distributed by applying an elliptic gaussian kernel function, resulting in a smoother continuous estimate.14
Histogram bin resolution and kernel width are the 2 key parameters of the classifier. A small number of large bins provide low histogram resolution, whereas a large number of small bins provide high resolution. Each feature axis of the PDF is divided into nb intervals. Thus, if the feature dimension is D for a specific feature combination, the feature space is divided into nb^D bins of equal volume. Smoothness is governed by the width of the kernel function. A narrow kernel function provides a high-resolution estimate with high variance, whereas a wide kernel function provides a smoother low-resolution estimate with low variance.
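The interplay of bin resolution and kernel width can be illustrated with a smoothed multidimensional histogram. The sketch below is one possible reading of the method, not the study's code: the kernel variance is expressed in units of bins, the gaussian is applied separably along each feature axis, the result is normalized to per-bin probability mass, and the names `smoothing_matrix` and `histogram_pdf` are hypothetical.

```python
import numpy as np


def smoothing_matrix(n_b, variance):
    """n_b x n_b gaussian weights between bins; variance 0 means no smoothing."""
    if variance <= 0:
        return np.eye(n_b)
    idx = np.arange(n_b)
    return np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * variance))


def histogram_pdf(vectors, n_b, kernel_variance, lo, hi):
    """Smoothed histogram estimate over n_b**D equal-volume bins (D = feature dimension)."""
    data = np.asarray(vectors, dtype=float)
    counts, _ = np.histogramdd(data, bins=n_b, range=list(zip(lo, hi)))
    weights = smoothing_matrix(n_b, kernel_variance)
    for axis in range(data.shape[1]):               # smooth one feature axis at a time
        smoothed = np.tensordot(weights, counts, axes=(1, axis))
        counts = np.moveaxis(smoothed, 0, axis)     # put the smoothed axis back in place
    return counts / counts.sum()                    # per-bin probability mass


# Synthetic 2-feature training set; the range runs from min to max feature value.
rng = np.random.default_rng(2)
train = rng.normal(loc=[0.0, 1.0], scale=[1.0, 0.5], size=(300, 2))
lo, hi = train.min(axis=0), train.max(axis=0)
raw_pdf = histogram_pdf(train, n_b=16, kernel_variance=0, lo=lo, hi=hi)
smooth_pdf = histogram_pdf(train, n_b=16, kernel_variance=5, lo=lo, hi=hi)
```

With `kernel_variance=0` the estimate is the raw histogram, which leaves many empty bins at high resolution; a wider kernel spreads mass into neighbouring bins at the price of a blurrier estimate.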
The concept of generality is important in the design of classifiers. The decision regions are calculated with a training set of feature vectors that represent the experience on which the classifier will base future decisions. Testing is done on an independent set. In a well-designed classifier, the testing performance should approach the training performance. Both the histogram bin resolution and the kernel width applied in the estimation affect generality.
Training and testing were conducted with a cross-validation technique.14 In each of S consecutive experiments, an (S-1)/S portion of the entire data set is used to train classifier number i. The remaining 1/S portion is kept out for testing. i is varied from 1 to S, thus producing S classifier performance results.
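The cross-validation scheme can be sketched as a generator of S train/test index splits. The shuffling, the seed, and the function name are assumptions, and no stratification by outcome class is attempted because the text does not mention it. The value `s=2` mirrors the split into halves mentioned in the Discussion; whether that was the value of S actually used is also an assumption. The total of 868 is the sum of the 87 ROSC and 781 No ROSC shocks quoted in the Abstract.

```python
import random


def s_fold_splits(n_items, s, seed=0):
    """Yield (train_idx, test_idx) for S experiments; each item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)            # shuffling is an assumption
    folds = [indices[i::s] for i in range(s)]       # S nearly equal-sized portions
    for i in range(s):
        test = folds[i]                             # the 1/S portion held out
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(s_fold_splits(n_items=868, s=2))
```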
Experimental Setup
The ECG was sampled at 100 Hz with 8-bit resolution,
and PSD was estimated from segment lengths L=400 zero padded to 512
samples. Three feature sets were extracted with frequency ranges
(fl-fu Hz) of 0 to 50, 0
to 25, and 0 to 12.5 Hz. The spectral features produced in each of
these experiments were vSFM,
vENRG, vCF, and
vPPF. The PCA transformation of these features
gave the corresponding decorrelated feature set of
vPCA1, vPCA2,
vPCA3, and
vPCA4. The ECG immediately before
defibrillation was analyzed, and the measurements were grouped
according to the postshock rhythm for classifier design (Table
1).
Classifiers were designed and tested with the use of all possible
combinations of spectral features and decorrelated features (Table 2).
The statistical functions were estimated with multidimensional histograms. The resolution was adjusted by setting nb equal to 4, 8, 16, 32, 64, and 128 bins. For each of these resolutions, the smoothness was varied by setting the variance of the gaussian kernel function, kw, equal to 0, 1, 5, 10, 15, and 20.
This combination of changing bin size resolution and smoothness enabled
a search for the classifier that met the generality criterion, which we
defined to be that the test sensitivities and specificities should
approach the training sensitivities and specificities to within a 5%
tolerance range. In Figure 1, the
training and testing specificities and sensitivities are shown as
functions of resolution and kernel width. Training with high bin
resolution and narrow kernel width generates a classifier with 100%
performance in both sensitivity and specificity as the result
of overtraining, as verified by the large deviation in sensitivity in
test performance. Generality in sensitivity is achieved either
by increasing the kernel width or by using lower bin resolution, with
both resulting in lower specificity.
A full-scale evaluation with respect to generality of the
classifiers corresponding to all possible feature combinations (Table
2) was performed.
Finally, for a given feature combination, the classifier with the best general performance was defined as that corresponding to the highest average test performance, lowest bin resolution, and narrowest kernel width, requiring that the training and test performances satisfied the generality criterion.
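The selection rule above can be written as a filter followed by a sort. The sketch rests on several interpretations that the text does not settle: the 5% tolerance is read as 5 percentage points, "average test performance" as the mean of test sensitivity and specificity, and ties are broken first by lower bin resolution and then by narrower kernel. The dictionary keys and all numbers in the candidate list are invented for illustration.

```python
def is_general(c, tol=5.0):
    """Generality criterion: test results within tol percentage points of training."""
    return (abs(c["train_sens"] - c["test_sens"]) <= tol
            and abs(c["train_spec"] - c["test_spec"]) <= tol)


def best_general(candidates, tol=5.0):
    """Highest mean test performance, then lowest n_b, then narrowest kernel."""
    general = [c for c in candidates if is_general(c, tol)]
    if not general:
        return None
    return min(general, key=lambda c: (-(c["test_sens"] + c["test_spec"]) / 2.0,
                                       c["n_b"], c["k_w"]))


# Hypothetical grid results (percent) for one feature combination.
candidates = [
    {"n_b": 128, "k_w": 0, "train_sens": 100, "train_spec": 100, "test_sens": 60, "test_spec": 90},
    {"n_b": 32, "k_w": 10, "train_sens": 93, "train_spec": 44, "test_sens": 92, "test_spec": 42},
    {"n_b": 16, "k_w": 5, "train_sens": 93, "train_spec": 44, "test_sens": 92, "test_spec": 42},
    {"n_b": 8, "k_w": 1, "train_sens": 92, "train_spec": 30, "test_sens": 91, "test_spec": 28},
]
best = best_general(candidates)
```

The first candidate mimics overtraining (perfect training results, poor test sensitivity) and is filtered out; the two tied candidates are separated by the lower bin resolution.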
Statistical Analysis
The comparisons between ROSC and No ROSC and between ACLS
and No ACLS were tested with the Wilcoxon rank sum test and
presented as median values (25th and 75th percentiles).
P<0.05 was regarded as statistically significant.
Classifier performance results are presented as the
mean±SD of the cross-validated sensitivities and specificities.
| Results |
|---|
The test performance results of the classifiers that met the
generality criterion are shown in Figure 2.
The performances of the reference classifier [vCF vPPF], for comparison with earlier work, and of the highest performing classifier [vPCA1 vPCA2] are listed in Table 4, and the class-specific PDFs with corresponding decision regions for these 2 classifiers are shown in Figure 3. The highest performing classifier, [vPCA1 vPCA2], shows a clearer distinction between ROSC and No ROSC than the reference classifier [vCF vPPF], where there is more intermingling of the classes. The highest performing classifier was based on PCA decorrelation and dimension reduction to 2 features and achieved a sensitivity of 92±2% and a specificity of 42±1% in testing, or a positive predictive value of 0.15 and a negative predictive value of 0.98 (Table 5).
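The predictive values can be checked against the shock counts quoted in the Abstract (7 of 87 ROSC shocks would have been withheld; 328 of 781 No ROSC shocks would have been avoided). The short calculation below assumes that these pooled counts describe the highest performing classifier and that no shocks were assigned to the reject class; the variable names are illustrative.

```python
rosc_total, rosc_withheld = 87, 7            # shocks leading to ROSC; those that would be withheld
no_rosc_total, no_rosc_avoided = 781, 328    # shocks not leading to ROSC; those that would be avoided

tp = rosc_total - rosc_withheld              # ROSC shocks correctly predicted
fn = rosc_withheld                           # ROSC shocks predicted as No ROSC
tn = no_rosc_avoided                         # No ROSC shocks correctly predicted
fp = no_rosc_total - no_rosc_avoided         # No ROSC shocks predicted as ROSC

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                         # positive predictive value
npv = tn / (tn + fn)                         # negative predictive value
```

Rounded to 2 decimals, these reproduce the reported 92% sensitivity, 42% specificity, positive predictive value of 0.15, and negative predictive value of 0.98. The low positive predictive value reflects the low prevalence of ROSC (87 of 868 shocks).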
The frequency ranges of 0 to 25 and 0 to 12.5 Hz were best suited for
the discrimination of ROSC from No ROSC outcomes (Table 3).
Spectral flatness was the least suitable individual feature for all
frequency ranges. Although the 3 other spectral features seem
promising, the results of the decorrelated features indicate that only
2 features are significantly different when grouped according to ROSC
and No ROSC outcome. This indicates that there is redundant information
in the original feature set.
The highest performing single-feature spectral classifiers were CF and PPF in the low-frequency range. The combination of these 2 features did not improve the results. For the single decorrelated feature classifiers, the 2 principal classifiers gave the best results for all frequency ranges. The combination of decorrelated features improved the performance significantly when the 2 midfrequency-range principal features were combined. The inclusion of >2 decorrelated features did not further improve the performance.
Whether ACLS caused changes in the PCA1 feature
is summarized in Table 6. The No ACLS/No
ROSC subgroup may be considered the starting point, where the initial
shocks are futile, and is further divided into the following subgroups:
The ACLS/No ROSC subgroup, where treatment has been futile and the myocardial condition probably has deteriorated as reflected by a significant decrease in the feature values.
The ACLS/ROSC subgroup, where treatment probably has caused an improvement in myocardial condition, which is reflected by a significant increase in the feature values comparable to that corresponding to the No ACLS/ROSC subgroup.
| Discussion |
|---|
We further demonstrated how classification methodology allows the combination of features with an increase in classifier performance compared with individual classification of features. We also showed how decorrelation by PCA allows dimensional reduction in the feature set with no decrease in performance compared with a combination of the complete feature set.
The rate of ROSC after individual shocks in patients is reported to be low8 9 11 12 ; the rate was 10% in a recent study from Oslo.11 Most shocks are thus individually futile. Based on the present results, 42% of the unsuccessful shocks (328 of 781) could have been avoided, and a period of chest compressions, ventilations, and vasoactive drugs could have been administered before a new defibrillation attempt was made. Studies in animals have shown that this may be favorable,23 and a recent study in humans indicated that this might improve the outcome.3 It would minimize the detriment of "hands-off" intervals, where the vital organs are without perfusion, which reduces the possibility of ROSC, recovery with intact neurological status, or both. The number of shocks should also be kept to a minimum, because repetitive shocks and total electric power are injurious to the already ischemic myocardium and increase the severity of postresuscitation myocardial dysfunction.7 Moreover, because the spectral characteristics of the VF have been reported to reflect myocardial perfusion,4 5 6 the defibrillator also might guide the CPR attempt, because the myocardial perfusion depends on compression force, rate, and duration.24 25 26 27
On the other hand, 7 shocks that resulted in a pulse-giving rhythm would not have been administered. These shocks presumably would have been administered later if CPR changed the characteristics of the VF. The effects of this could not be evaluated. The comparison of ACLS with No ACLS features illustrates this aspect of use of the features for online monitoring of the CPR efficiency. The use of the features as monitoring parameters for performance feedback during CPR is an interesting idea that is closely related to the prediction problem. Retrospectively, we studied the influence of ACLS on a single feature and demonstrated changes in values according to treatment. The present study demonstrates how a general classifier can be designed through cross-validation, which allows training and testing on independent data sets in combination with different resolutions and kernel widths in the estimation of the statistics that describe the features. This method gives an indication of how well the classifier will perform when challenged with new data in the future.
In a similar study of 128 shocks in 55 patients with only 9 successful shocks (defined as a conversion of VF to a supraventricular rhythm with a palpable pulse or blood pressure of any duration within 2 minutes of the shock without ongoing CPR), Brown et al12 extracted 4 parameters from the recorded ECG. The combination of CF and PPF gave the best predictive potential (sensitivity 100%, specificity 47.1%).12 The same combination of features gave a poorer predictive potential in the present study (sensitivity 92±2%, specificity 27±2%). We believe that the results of our generalized classifier are more realistic due to the larger database and the use of independent testing and generalization that were not done by Brown et al.12 Those authors generated the sensitivity and specificity with the same data from which the threshold values were computed, with no independent evaluation.
Noc et al6 reported in pigs that maximum and mean VF amplitude and dominant VF frequency were all acceptable shock outcome predictors. They derived the threshold values from 1 group and tested these in a separate validation group but had different results in the 2 groups, indicating that the results might not be reliable.6 Our results indicate that Brown et al12 would have experienced the same if their threshold values had been tested on new data.
Our method includes independent testing and generalization to avoid these problems. To ensure reliability, the data were split in 2. Training performance for ROSC and No ROSC prediction was computed from half of the data, whereas the other half was used to compute the corresponding test performance.
There are some limitations in the present study. First, the number of ROSC observations is low. Second, in the cross-validation processing of the data, the test performances were considered in the design of the classifiers to choose the generalizing parameters. Ideally, a final evaluation should have been performed on yet another data set that did not influence the design process. Third, we used only 1 type of classifier: the histogram method. To obtain even more reliable results, the experiments should be repeated with other types of classifiers.
Spectral characterization of VF can be of clinical importance if it can be incorporated into the software of defibrillators. We have demonstrated a method to develop an outcome predictor for defibrillation attempts in out-of-hospital cardiac arrest patients, although the sensitivity of 92±2% and specificity of 42±1% are not satisfactory for clinical use. Therefore, other features should also be investigated to add discriminative power to the feature set.
| Acknowledgments |
|---|
Received February 7, 2000; revision received April 26, 2000; accepted May 2, 2000.