Accuracy of Serial National Institutes of Health Stroke Scale Scores to Identify Artery Status in Acute Ischemic Stroke
Background— Early recovery after intravenous thrombolysis can be observed in stroke; however, the utility of measuring clinical improvement to assess artery status has not been established. We sought to determine the accuracy of serial National Institutes of Health Stroke Scale (NIHSS) scores to detect complete early recanalization of the middle cerebral artery.
Methods and Results— Data from the CLOTBUST trial (Combined Lysis of Thrombus in Brain Ischemia Using Transcranial Ultrasound and Systemic tPA) were used to determine the most sensitive and specific NIHSS-derived parameter to identify complete recanalization. Then, reproducibility was tested against a separate patient population (Barcelona data set). NIHSS scores were determined before tissue plasminogen activator bolus and at 60 and 120 minutes in both data sets. Receiver operating characteristic curves were used to compare test performance. The accuracy of individual cutoffs was demonstrated by sensitivity, specificity, and positive and negative predictive values. A total of 122 patients in the CLOTBUST data set and 98 in the Barcelona data set received 0.9 mg/kg intravenous tissue plasminogen activator (mean age 69±12 versus 72±12 years, 57% male versus 51% male, median NIHSS 16 versus 17 points, mean time from onset to treatment 140±32 versus 177±59 minutes, and complete recanalization of the middle cerebral artery in 19% versus 17%). For identification of recanalization, an NIHSS score reduction of ≥40% offered the best tradeoff, with sensitivity, specificity, positive predictive value, and negative predictive value of 65%, 85%, 50%, and 91% at 60 minutes and 74%, 80%, 58%, and 89% at 120 minutes, respectively. Test performance was equal in the Barcelona data set.
Conclusions— Relative changes in serial NIHSS scores can serve as a simple clinical indicator of arterial status after intravenous thrombolysis. Accuracy parameters are affected by the process of recanalization and its varying clinical significance.
Received July 11, 2006; accepted February 22, 2007.
Post hoc exploratory analysis of the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA Stroke Study1 showed that a neurological improvement by ≥5 points on the National Institutes of Health Stroke Scale (NIHSS) at 24 hours was observed significantly more often in patients treated with tissue plasminogen activator (tPA) than in placebo-treated patients.2 In addition, dramatic early clinical improvement was observed in stroke patients who achieved complete recanalization after treatment with intravenous thrombolysis.3–5 Such evidence suggests that serial NIHSS examinations within the first few hours are a useful and simple clinical indicator of early recanalization. The wide-ranging impact of such a simple clinical tool is that it can be used in place of, or in addition to, diagnostic tests such as transcranial ultrasound, computed tomographic angiography, magnetic resonance angiography, or digital subtraction angiography, which are limited by their availability, procedural risk, length of time to perform, and sometimes their cost.
The utility of a clinical surrogate marker for detecting recanalization is 2-fold. First, it can be used to identify a persistent occlusion after intravenous thrombolysis and thus select patients for additional rescue therapies such as intra-arterial thrombolysis or mechanical thrombectomy.6–10 Second, it can be used as a suitable end point in designing clinical trials that would identify the biological activity of an intervention aimed at achieving early recanalization.
The NIHSS is the most widely used neurological deficit scale, with documented reliability, validity, and outcome predictive ability.11–15 We sought to determine the accuracy of serial NIHSS score measurements to detect complete recanalization of the middle cerebral artery (MCA) and to validate our results against a separate data set.
The present study used 2 independent patient data sets. The first (CLOTBUST data set) originates from the Combined Lysis of Thrombus in Brain Ischemia Using Transcranial Ultrasound and Systemic tPA (CLOTBUST) trial.16 This data set was used to determine the accuracy of serial NIHSS score measurements for detecting recanalization (development data set). The second set (Barcelona data set), obtained in Barcelona, Spain, apart from the CLOTBUST trial, validated the results against a separate patient population (validation data set).
Both data sets included patients with acute MCA occlusions (including isolated M1 occlusion, proximal M2 occlusion, tandem internal carotid artery–MCA occlusion, or terminal internal carotid artery occlusion) who were treated with intravenous tPA (0.9 mg/kg body weight, with 10% given as a bolus). CLOTBUST was a phase II multicenter, randomized clinical trial that determined the safety and signal-of-efficacy of adjuvant therapy with continuous transcranial Doppler (TCD) monitoring versus sham TCD monitoring.16 The Barcelona data set was an open-label series of consecutive patients with MCA occlusions diagnosed by TCD between January 2001 and June 2005 (excluding those enrolled in the CLOTBUST study). In CLOTBUST, patients were treated within a 0- to 3-hour window of symptom onset; in the Barcelona data set, they were treated within a 0- to 6-hour window.17
In both data sets, complete recanalization of the MCA was determined with the previously validated Thrombolysis in Brain Ischemia (TIBI) flow grading system (which consists of a 6-point scale, with a score of 0 indicating no flow and 5 indicating completely normal flow). TIBI demonstrated >90% accuracy compared with angiography.18–20 Complete recanalization was defined as TIBI 5 flow in the symptomatic artery. The NIHSS score was determined at baseline and at 60 and 120 minutes in both data sets by physicians trained and certified in NIHSS scoring.
In the CLOTBUST trial, treating physicians determined the NIHSS scores without knowledge of vessel recanalization. In the Barcelona data set, physicians were not blinded to the diagnosis made by TCD. In both data sets, however, physicians were not aware of the purposes of this analysis. In the CLOTBUST data set, patients with unknown NIHSS score at 60 minutes (4 patients) and at 120 minutes (11 patients) were excluded from the analysis. In the Barcelona data set, 3 patients with missing data at 120 minutes were excluded.
Development (CLOTBUST) Data Set
Serial NIHSS scoring offers 3 options for describing clinical improvement: (1) the absolute value of the NIHSS score at a certain time point (ie, NIHSS60 at 60 minutes); (2) the absolute improvement, ie, the difference between the NIHSS score at baseline (NIHSSbaseline) and at a certain time point (ΔNIHSS60=NIHSSbaseline minus NIHSS60); and (3) the relative improvement, ie, percent reduction from baseline to a certain time point (%NIHSS60=ΔNIHSS60 divided by NIHSSbaseline, expressed as a percentage).
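The 3 parameters defined above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study; the function name `nihss_parameters` and the example scores are hypothetical.

```python
def nihss_parameters(baseline: int, followup: int) -> dict:
    """Return the 3 NIHSS-derived parameters for one follow-up time point.

    baseline: NIHSS score before the tPA bolus (NIHSSbaseline)
    followup: NIHSS score at the time point of interest (eg, NIHSS60)
    """
    if baseline <= 0:
        # A relative change is undefined when the baseline score is 0.
        raise ValueError("baseline NIHSS must be positive")
    absolute_improvement = baseline - followup                 # ΔNIHSS
    percent_reduction = 100.0 * absolute_improvement / baseline  # %NIHSS
    return {
        "absolute_value": followup,
        "absolute_improvement": absolute_improvement,
        "percent_reduction": percent_reduction,
    }


# Hypothetical patient: baseline NIHSS 16, NIHSS 8 at 60 minutes.
example = nihss_parameters(baseline=16, followup=8)
print(example)
```

A negative `percent_reduction` simply indicates neurological worsening relative to baseline.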
Our first step was to select the most “valuable” NIHSS-derived parameter for identifying complete MCA recanalization. This value was gauged by its accuracy and performance regardless of the baseline NIHSS scores.
To measure the accuracy of different NIHSS-derived parameters, we constructed an empirical (nonparametric) receiver operating characteristic curve for each parameter and calculated the area under the receiver operating characteristic curve (AUC) with 95% CIs.21 The AUC is a measure of the accuracy of a diagnostic test that ranges from 0.5 (no diagnostic ability) to 1.0 (perfect diagnostic ability).
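As an illustration of what an empirical (nonparametric) AUC measures, the sketch below computes it as the proportion of recanalized/nonrecanalized patient pairs in which the recanalized patient has the larger parameter value, counting ties as one half. This is a generic textbook formulation, not necessarily the procedure of reference 21, and it omits the 95% CI; the function name and the data are hypothetical.

```python
def empirical_auc(values_recanalized, values_not_recanalized):
    """Empirical AUC: P(value_pos > value_neg) + 0.5 * P(tie)."""
    if not values_recanalized or not values_not_recanalized:
        raise ValueError("both groups must contain at least one patient")
    wins = 0.0
    for pos in values_recanalized:
        for neg in values_not_recanalized:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(values_recanalized) * len(values_not_recanalized))


# Hypothetical % NIHSS reductions at 60 minutes (not study data).
recanalized = [80, 55, 40, 10]
not_recanalized = [35, 20, 40, 0, 5]
auc = empirical_auc(recanalized, not_recanalized)
print(f"AUC = {auc:.3f}")
```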
Because response to stroke treatment is governed by initial stroke severity, we then tested the performance of the NIHSS-derived parameter irrespective of NIHSSbaseline. This step was intended to confirm that the accuracy of the selected parameter was similar across the range of NIHSSbaseline. To address this issue, we calculated whether there was an association (Spearman rank correlation coefficient [rs]) between the value of the NIHSS-derived parameter and NIHSSbaseline in patients with complete recanalization. If such an association was absent, it meant that the same degree of change (ie, NIHSS60 or ΔNIHSS60 or %NIHSS60) was achieved with different NIHSSbaseline scores (eg, 5 points or 50% NIHSS reduction similarly predicted recanalization whether the NIHSSbaseline was 10 or 20 points).
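The independence check described above can be sketched as a Spearman rank correlation between baseline score and each candidate parameter among recanalized patients. The code below is a minimal pure-Python illustration (average ranks for ties, then Pearson correlation of the ranks); the helper names and the data are hypothetical, and no P value is computed.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rs = Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rs is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical recanalized patients (not study data).
baseline = [8, 12, 16, 20, 24]
absolute_improvement = [4, 6, 8, 10, 12]  # grows with baseline severity
percent_reduction = [45, 60, 50, 40, 55]  # unrelated to baseline severity

rs_absolute = spearman_rs(baseline, absolute_improvement)
rs_percent = spearman_rs(baseline, percent_reduction)
print(rs_absolute, rs_percent)
```

In this constructed example the absolute improvement tracks baseline severity perfectly, whereas the percent reduction shows no rank association with it, which is the pattern that would favor the relative measure.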
After the most valuable NIHSS-derived parameter was selected, we used the receiver operating characteristic curve to determine accuracy at varying thresholds, namely, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and likelihood ratios (positive likelihood ratio and negative likelihood ratio) with 95% CIs (Pearson χ2). We aimed to detect thresholds with the best tradeoffs between sensitivity and specificity, as well as PPV and NPV.
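For a single threshold, the accuracy measures listed above all follow from the 2×2 table of test result (eg, % NIHSS reduction at or above the cutoff) against complete recanalization. The sketch below shows the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's tables, and CIs are omitted.

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy measures from a 2x2 table.

    tp: test positive, recanalized      fp: test positive, not recanalized
    fn: test negative, recanalized      tn: test negative, not recanalized
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Likelihood ratios; guard against division by zero at the extremes.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = diagnostic_accuracy(tp=13, fp=15, fn=7, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```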
Validation (Barcelona) Data Set
The overall accuracy of the selected NIHSS-derived parameter was tested by comparing the AUC for the Barcelona data set with the CLOTBUST data set.21 Next, performance of the selected thresholds of the most valuable NIHSS-derived parameter from the CLOTBUST data set was assessed on the Barcelona set.
Statistical significance for intergroup differences was assessed by χ2 test for categorical variables. For continuous variables, the 2-sample Student t test was used to compare the means and the Mann-Whitney U test to compare the medians. Repeated-measures ANOVA was used to compare time-event occurrence data. For comparison of AUC and NIHSS-derived parameters between the CLOTBUST and Barcelona data sets, a maximum difference was set to 0.01 to ensure comparative validity. A value of P<0.05 was considered significant.
The authors had full access to and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.
In the CLOTBUST data set, we analyzed 122 patients at 60 minutes and 115 patients at 120 minutes. In the Barcelona data set, we analyzed 98 patients at 60 minutes and 95 at 120 minutes. Baseline characteristics of both data sets are shown in Table 1. Mean onset-to-treatment time was shorter in the CLOTBUST data set than in the Barcelona data set (140±32 versus 177±59 minutes). The overall mean NIHSS scores in the CLOTBUST data set in patients with early complete recanalization versus partial or no recanalization, shown in Figure 1, were significantly different (P<0.001, repeated-measures ANOVA).
The performance of different NIHSS-derived parameters (absolute value versus absolute improvement versus relative improvement) for diagnosing complete recanalization in the CLOTBUST data set is shown in Figure 2 and Table 2. There was no difference in accuracy between different NIHSS-derived parameters at 60 or 120 minutes as measured by AUC. The percent of NIHSS reduction at 60 and 120 minutes, however, was the only parameter that did not correlate with and was independent of the pretreatment stroke severity, and therefore, it was selected for further analysis and validation.
The accuracy parameters at different thresholds of % NIHSS reduction at 60 and 120 minutes are listed in Table 3. The best tradeoff between sensitivity and specificity was achieved with a cutoff of ≥40% NIHSS reduction at 60 and 120 minutes. Such a threshold offered sensitivity of 65% (95% CI, 45% to 81%) and 74% (95% CI, 57% to 86%) and specificity of 85% (95% CI, 76% to 91%) and 80% (95% CI, 70% to 87%) at 60 and 120 minutes, respectively. In addition, a ≥40% NIHSS reduction at 60 and 120 minutes offered a positive likelihood ratio of 4.3 (95% CI, 2.5 to 6.7) and 3.7 (95% CI, 2.4 to 5.2) and a negative likelihood ratio of 0.4 (95% CI, 0.2 to 0.6) and 0.3 (95% CI, 0.2 to 0.5), respectively.
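As a check on the arithmetic, the reported likelihood ratios follow from the point estimates of sensitivity and specificity through the standard definitions (rounded values; the CIs are not reproduced here):

```latex
\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1-\text{specificity}}, \qquad
\mathrm{LR}^{-} = \frac{1-\text{sensitivity}}{\text{specificity}}
\]
\[
\text{60 minutes:}\quad
\mathrm{LR}^{+} = \frac{0.65}{1-0.85} \approx 4.3, \qquad
\mathrm{LR}^{-} = \frac{1-0.65}{0.85} \approx 0.4
\]
\[
\text{120 minutes:}\quad
\mathrm{LR}^{+} = \frac{0.74}{1-0.80} = 3.7, \qquad
\mathrm{LR}^{-} = \frac{1-0.74}{0.80} \approx 0.3
\]
```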
The best tradeoff between PPV and NPV was achieved with a cutoff of ≥80% NIHSS reduction at 60 and 120 minutes. Such a threshold offered a PPV of 67% (95% CI, 30% to 90%) and 86% (95% CI, 49% to 97%) and an NPV of 84% (95% CI, 76% to 89%) and 77% (95% CI, 68% to 84%) at 60 and 120 minutes, respectively. The association between ≥80% NIHSS reduction and recanalization was statistically significant (P<0.01 for 60 and 120 minutes). If NIHSS reduction was ≥80% at 2 hours, then 85% of patients achieved a modified Rankin score of 0 or 1 at 3 months; if NIHSS reduction was <80% at 2 hours, then 30% of patients achieved similar results at 3 months (P=0.003). In the Barcelona data set, the overall accuracy (AUC) of % NIHSS reduction and of selected thresholds (sensitivity, specificity, PPV, and NPV) did not differ from the CLOTBUST data set (Table 4).
The present study showed that a serial neurological examination is a valuable measure of arterial status after intravenous thrombolysis, with similar performance in 2 different patient data sets. We demonstrated that considerable change in the NIHSS scores happens within the first 60 minutes of treatment after complete recanalization is achieved and thus can serve as an indicator of early complete recanalization. This is in agreement with other studies that showed that recanalization is associated with short-term improvement3,5,22 and long-term outcome.19,23,24
The present study developed and validated a parameter from serial NIHSS score measurements suitable for identifying arterial status irrespective of pretreatment stroke severity. We showed that clinical improvement, measured as the NIHSS percentage reduction, performs independently of baseline stroke severity. We therefore consider NIHSS percentage reduction as a better way to gauge the early recanalization process than the absolute numbers of the NIHSS scores or NIHSS point reduction. This is because patients with complete recanalization are more likely to achieve the same degree of relative reduction of the deficit rather than the same absolute improvement.
This somewhat contradicts the most sensitive measure of successful thrombolysis at 2 hours from treatment onset (total NIHSS score ≤5 points), derived from the NINDS rt-PA Stroke Study.25 If such a criterion is applied to the present data, however, sensitivity to detect recanalization for patients with an NIHSS baseline score <14 (77%) would be significantly different (P<0.01) from sensitivity for patients with NIHSS baseline score ≥14 (28%; not shown in results).
The present data have important implications for the design of clinical trials, because the absolute values of NIHSS score or NIHSS point reduction may not be accurate measures of early clinical improvement unless adjusted for baseline NIHSS score. Likewise, other studies have argued for a similar approach that adjusted 3- and 12-month clinical outcome to the baseline NIHSS score, which better evaluates the effect of treatment.26–28
The present study provides sensitivities and specificities at different thresholds of NIHSS percentage reduction. We showed that the optimal tradeoff between sensitivity and specificity is reached with ≥40% NIHSS score reduction compared with baseline. This means that if all patients whose NIHSS score improves by <40% after intravenous thrombolysis are sent for rescue interventional therapy, 85% of all persisting occlusions will be identified, and 65% of all recanalizations will be spared intervention. In terms of likelihood ratios, a ≥40% NIHSS score reduction was 4.3 times as likely, and a <40% NIHSS score reduction 0.4 times as likely, in patients whose occlusion recanalized as in patients who did not experience recanalization.
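Predictive values, unlike sensitivity and specificity, depend on how common recanalization is in the population tested. The sketch below applies Bayes' rule to the 60-minute point estimates for the ≥40% cutoff, assuming that the 19% rate of complete recanalization reported for the CLOTBUST data set applies at 60 minutes; that assumption is ours, and the function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Point estimates at 60 minutes; prevalence of 0.19 is an assumption.
ppv, npv = predictive_values(sensitivity=0.65, specificity=0.85, prevalence=0.19)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under this assumption the result is close to the reported PPV of 50% and NPV of 91% at 60 minutes. In a setting with a different recanalization rate, the same cutoff would yield different predictive values.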
For different purposes, different cutoffs may apply; eg, a cutoff of ≥60% NIHSS score reduction, because of its high specificity, may be a more appropriate end point for a clinician who does not have a TCD at bedside. The processes of either brain stunning (delayed improvement after recanalization)29 or clinically “silent” recanalization (ie, no improvement despite recanalization) may be responsible for limited sensitivity. In addition, limited specificity (ie, improvement despite lack of recanalization) may be explained by partial recanalization or the presence of good collateral flow.30
We also showed that NIHSS score reduction by ≥80% at 2 hours after the start of intravenous thrombolytic therapy predicts recanalization with a PPV of 86% and an NPV of 77%. The present results confirm the findings from the NINDS trial, which suggested that major neurological improvement at 24 hours may be a useful surrogate for thrombolytic activity.31 The present data are, however, distinct, because we related NIHSS scores directly to recanalization rather than to 3-month outcome.
The main limitation of the present study is that the patient population was limited to MCA occlusions. The present results cannot be generalized to the entire spectrum of stroke patients, especially those with non-MCA infarction or small-vessel stroke. We recognize, additionally, that the lack of blinding to the TCD evaluation in the Barcelona cohort may have biased the assessment of the NIHSS score.
The main strength of the present study is that the performance of NIHSS percentage reduction was equivalent in 2 separate and characteristically different (ie, age and onset-to-treatment time) data sets. These results can therefore be generalized to patients with MCA occlusion within 6 hours from stroke onset.
In conclusion, changes in serial NIHSS scores can be used as a clinical indicator of arterial status, although accuracy is affected by the process of recanalization and its varying clinical significance. Ultimately, the present data may have an impact on the design of clinical trials, as well as a practical application in emergency settings. Further studies will be needed to determine the final implications for patient management.
Sources of Funding
This study was supported by NINDS grants 1K23NS02229-01 and 1P50NS044227. Canadian sites were supported by the Canadian Institutes of Health Research and the Alberta Heritage Foundation for Medical Research. The CLOTBUST trial is an investigator-sponsored trial (protocol A2207s, Genentech, Inc) that is exempt from investigational new drug status by the Food and Drug Administration. Spencer Technologies, Seattle, Wash, provided power-motion Doppler units and technical support for all participating study sites. DWL, Multigon, and Nicolet also provided portable equipment to hospitals in Houston, Tex.
Christou I, Alexandrov AV, Burgin WS, Wojner AW, Felberg RA, Malkoff M, Grotta JC. Timing of recanalization after tissue plasminogen activator therapy determined by transcranial Doppler correlates with clinical recovery from ischemic stroke. Stroke. 2000; 31: 1812–1816.
Alexandrov AV, Demchuk AM, Felberg RA, Christou I, Barber PA, Burgin WS, Malkoff M, Wojner AW, Grotta JC. High rate of complete recanalization and dramatic clinical recovery during tPA infusion when continuously monitored with 2-MHz transcranial Doppler monitoring. Stroke. 2000; 31: 610–614.
Alexandrov AV, Burgin WS, Demchuk AM, El-Mitwalli A, Grotta JC. Speed of intracranial clot lysis with intravenous tissue plasminogen activator therapy: sonographic classification and short-term improvement. Circulation. 2001; 103: 2897–2902.
Ernst R, Pancioli A, Tomsick T, Kissela B, Woo D, Kanter D, Jauch E, Carrozzella J, Spilker J, Broderick J. Combined intravenous and intra-arterial recombinant tissue plasminogen activator in acute ischemic stroke. Stroke. 2000; 31: 2552–2557.
Hill MD, Barber PA, Demchuk AM, Newcommon NJ, Cole-Haskayne A, Ryckborst K, Sopher L, Button A, Hu W, Hudon ME, Morrish W, Frayne R, Sevick RJ, Buchan AM. Acute intravenous–intra-arterial revascularization therapy for severe ischemic stroke. Stroke. 2002; 33: 279–282.
Keris V, Rudnicka S, Vorona V, Enina G, Tilgale B, Fricbergs J. Combined intraarterial/intravenous thrombolysis for acute ischemic stroke. AJNR Am J Neuroradiol. 2001; 22: 352–358.
Lee KY, Kim DI, Kim SH, Lee SI, Chung HW, Shim YW, Kim SM, Heo JH. Sequential combination of intravenous recombinant tissue plasminogen activator and intra-arterial urokinase in acute ischemic stroke. AJNR Am J Neuroradiol. 2004; 25: 1470–1475.
Lewandowski CA, Frankel M, Tomsick TA, Broderick J, Frey J, Clark W, Starkman S, Grotta J, Spilker J, Khoury J, Brott T. Combined intravenous and intra-arterial r-tPA versus intra-arterial therapy of acute ischemic stroke: Emergency Management of Stroke (EMS) Bridging Trial. Stroke. 1999; 30: 2598–2605.
Adams HP Jr, Davis PH, Leira EC, Chang KC, Bendixen BH, Clarke WR, Woolson RF, Hansen MD. Baseline NIH stroke scale score strongly predicts outcome after stroke: a report of the Trial of Org 10172 in Acute Stroke Treatment (TOAST). Neurology. 1999; 53: 126–131.
Lyden P, Lu M, Jackson C, Marler J, Kothari R, Brott T, Zivin J; NINDS tPA Stroke Trial Investigators. Underlying structure of the National Institutes of Health Stroke Scale: results of a factor analysis. Stroke. 1999; 30: 2347–2354.
Goldstein LB, Samsa GP. Reliability of the National Institutes of Health Stroke Scale: extension to non-neurologists in the context of a clinical trial. Stroke. 1997; 28: 307–310.
Lyden P, Brott T, Tilley B, Welch KM, Mascha EJ, Levine S, Haley EC, Grotta J, Marler J; NINDS tPA Stroke Study Group. Improved reliability of the NIH stroke scale using video training. Stroke. 1994; 25: 2220–2226.
Weimar C, Konig IR, Kraywinkel K, Ziegler A, Diener HC. Age and National Institutes of Health Stroke Scale score within 6 hours after onset are accurate predictors of outcome after cerebral ischemia: development and external validation of prognostic models. Stroke. 2004; 35: 158–162.
Ribo M, Molina CA, Rovira A, Quintana M, Delgado P, Montaner J, Grive E, Arenillas JF, Alvarez-Sabin J. Safety and efficacy of intravenous tissue plasminogen activator stroke treatment in the 3- to 6-hour window using multimodal transcranial Doppler/MRI selection protocol. Stroke. 2005; 36: 602–606.
Burgin WS, Malkoff M, Felberg RA, Demchuk AM, Christou I, Grotta JC, Alexandrov AV. Transcranial Doppler ultrasound criteria for recanalization after thrombolysis for middle cerebral artery stroke. Stroke. 2000; 31: 1128–1132.
Demchuk AM, Burgin WS, Christou I, Felberg RA, Barber PA, Hill MD, Alexandrov AV. Thrombolysis In Brain Ischemia (TIBI) transcranial Doppler flow grades predict clinical severity, early recovery, and mortality in patients treated with intravenous tissue plasminogen activator. Stroke. 2001; 32: 89–93.
Saqqur M, Shuaib A, Alexandrov AV, Hill MD, Calleja S, Tomsick T, Broderick J, Demchuk AM. Derivation of transcranial Doppler criteria for rescue intra-arterial thrombolysis: multicenter experience from the Interventional Management of Stroke Study. Stroke. 2005; 36: 865–868.
Felberg RA, Okon NJ, El-Mitwalli A, Burgin WS, Grotta JC, Alexandrov AV. Early dramatic recovery during intravenous tissue plasminogen activator infusion: clinical pattern and outcome in acute middle cerebral artery stroke. Stroke. 2002; 33: 1301–1307.
Molina CA, Alexandrov AV, Demchuk AM, Saqqur M, Uchino K, Alvarez-Sabin J. Improving the predictive accuracy of recanalization on stroke outcome in patients treated with tissue plasminogen activator. Stroke. 2004; 35: 151–156.
Smith WS, Sung G, Starkman S, Saver JL, Kidwell CS, Gobin YP, Lutsep HL, Nesbit GM, Grobelny T, Rymer MM, Silverman IE, Higashida RT, Budzik RF, Marks MP. Safety and efficacy of mechanical embolectomy in acute ischemic stroke: results of the MERCI trial. Stroke. 2005; 36: 1432–1438.
Broderick JP, Lu M, Kothari R, Levine SR, Lyden PD, Haley EC, Brott TG, Grotta J, Tilley BC, Marler JR, Frankel M. Finding the most powerful measures of the effectiveness of tissue plasminogen activator in the NINDS tPA Stroke Trial. Stroke. 2000; 31: 2335–2341.
Abciximab Emergent Stroke Treatment Trial (AbESTT) Investigators. Emergency administration of abciximab for treatment of patients with acute ischemic stroke: results of a randomized phase 2 trial. Stroke. 2005; 36: 880–890.
Alexandrov AV, Hall CE, Labiche LA, Wojner AW, Grotta JC. Ischemic stunning of the brain: early recanalization without immediate clinical improvement in acute ischemic stroke. Stroke. 2004; 35: 449–452.
Brown DL, Johnston KC, Wagner DP, Haley EC Jr. Predicting major neurological improvement with intravenous recombinant tissue plasminogen activator treatment of stroke. Stroke. 2004; 35: 147–150.
The present study is the first to provide data on the accuracy of serial National Institutes of Health Stroke Scale (NIHSS) scores for detecting recanalization of the middle cerebral artery in patients treated with intravenous tissue plasminogen activator. The utility of a clinical surrogate marker for detecting recanalization is 2-fold: (1) to identify a persistent occlusion after intravenous thrombolysis, which may enable the selection of patients suitable for rescue therapies, and (2) to serve as an end point in clinical trials that allow assessment of the efficacy of an intervention aimed at achieving early recanalization. The present study used data on 126 patients from the CLOTBUST trial to determine the accuracy of serial NIHSS score measurements for detecting recanalization and then validated the results against a separate patient population (98 patients from Barcelona). The best tradeoff between sensitivity and specificity was achieved with a cutoff of ≥40% reduction in NIHSS score at 60 and 120 minutes. Such a threshold offered sensitivity of 65% (95% CI, 45% to 81%) and 74% (95% CI, 57% to 86%) and specificity of 85% (95% CI, 76% to 91%) and 80% (95% CI, 70% to 87%) at 60 and 120 minutes, respectively. The best tradeoff between positive and negative predictive value was achieved with a cutoff of ≥80% NIHSS reduction at 60 and 120 minutes. Such a threshold offered a positive predictive value of 67% (95% CI, 30% to 90%) and 86% (95% CI, 49% to 97%) and a negative predictive value of 84% (95% CI, 76% to 89%) and 77% (95% CI, 68% to 84%) at 60 and 120 minutes, respectively. Thus, the present study showed that a serial neurological examination is a valuable measure of arterial patency after intravenous thrombolysis, with similar performance in 2 different patient data sets.
Clinical trial registration information—URL: http://www.strokecenter.org/trials/TrialDetail.aspx?tid=452.