Abstract 15230: Development and Validation of a Natural Language Processing System for Identifying and Classifying Severe Congenital Heart Defects: The PORTAL CHD Study
Introduction: The number of adults with congenital heart defects (CHD) is increasing nationally, with recognition of the need to improve surveillance and care across the lifespan. Yet accurately identifying patients with severe CHD from electronic medical records (EMR) is challenging.
Methods: Within 3 integrated healthcare delivery systems (Kaiser Permanente in Northern California (KPNC), Southern California (KPSC), and Colorado (KPCO)) serving >8.5 million members, we identified 66,775 adults with a CHD diagnostic code during 2008-2013; 1,652 had ≥1 severe CHD diagnosis code (atrioventricular canal defect [AVCD], Eisenmenger syndrome, single ventricle physiology, Tetralogy of Fallot [TOF], or transposition of the great arteries [TGA]). We developed a natural language processing (NLP) system using Linguamatics i2E, with split-sample validation in KPNC based on medical record review by a CHD specialist. Positive predictive value (PPV) and negative predictive value (NPV) were calculated for each severe CHD condition. Accuracy of the refined NLP system was also evaluated among 216 KPCO patients and 373 KPSC patients with severe CHD diagnosis codes.
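As an illustration of the per-condition metrics named above, the following is a minimal sketch of how PPV and NPV can be computed from paired NLP and chart-review labels. It is not the study's code: the function name `ppv_npv` and the toy labels are hypothetical, and chart review is assumed to be the reference standard.

```python
from typing import Optional, Sequence, Tuple


def ppv_npv(
    nlp_flags: Sequence[bool], chart_review: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (PPV, NPV) for one condition.

    nlp_flags[i] is True when the NLP system assigned the condition to
    patient i; chart_review[i] is True when the reviewer confirmed it.
    A metric is None when its denominator is zero.
    """
    if len(nlp_flags) != len(chart_review):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(nlp_flags, chart_review))
    tp = sum(n and c for n, c in pairs)          # NLP positive, confirmed
    fp = sum(n and not c for n, c in pairs)      # NLP positive, not confirmed
    tn = sum(not n and not c for n, c in pairs)  # NLP negative, not confirmed
    fn = sum(not n and c for n, c in pairs)      # NLP negative, but confirmed

    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical toy labels for six patients (not study data).
nlp = [True, True, True, False, False, True]
chart = [True, True, False, False, True, True]
ppv, npv = ppv_npv(nlp, chart)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

Because each denominator counts only the patients the NLP system labeled positive (PPV) or negative (NPV), a condition with very few NLP-negative patients yields an unstable NPV.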
Results: In KPNC, the NLP system had high PPV and NPV across all severe CHD diagnoses, and both improved after refinement (range 96-100%, Table). Validation in KPCO showed lower PPVs and NPVs, particularly the NPVs for TGA and single ventricle (Table). The same NLP system performed better in KPSC, with PPVs and NPVs of approximately 90% for all conditions except TGA, for which the NPV was 0%: the NLP classified one of 44 diagnoses as not TGA, although all of the ICD-9 diagnoses were accurate.
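The 0% NPV for TGA in KPSC can be reproduced with a short worked calculation. This is a sketch under two assumptions not stated explicitly in the abstract: that all 44 patients were confirmed as TGA on review, and that the NLP classified the remaining 43 as TGA. Under those assumptions the counts are TP = 43, FP = 0, FN = 1, TN = 0.

```latex
\mathrm{PPV} = \frac{TP}{TP + FP} = \frac{43}{43 + 0} = 100\%,
\qquad
\mathrm{NPV} = \frac{TN}{TN + FN} = \frac{0}{0 + 1} = 0\%
```

With a single NLP-negative patient, the NPV can only take the value 0% or 100%, so this figure reflects one misclassification rather than broadly poor performance.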
Conclusion: We developed and validated an NLP system for accurate identification of persons with severe CHD from unstructured EMR data that can facilitate population-based opportunities for quality improvement and future research.
- Congenital heart disease
- Electronic health records (EHRs)
- Epidemiologic methods
- Healthcare delivery systems
Author Disclosures: K.K. Lee: None. J. Yang: None. A.K. Meadows: None. K. Reynolds: None. D. Magid: None. S. Sung: None. A.S. Go: None.
© 2016 by American Heart Association, Inc.