TY - JOUR
T1 - Creating rare epilepsy cohorts using keyword search in electronic health records
AU - Barbour, Kristen
AU - Tian, Niu
AU - Yozawitz, Elissa G.
AU - Wolf, Steven
AU - McGoldrick, Patricia E.
AU - Sands, Tristan T.
AU - Nelson, Aaron
AU - Basma, Natasha
AU - Grinspan, Zachary M.
N1 - Publisher Copyright: © 2023 International League Against Epilepsy.
PY - 2023/10
Y1 - 2023/10
AB - Objective: Administrative codes to identify people with rare epilepsies in electronic health records are limited. The current study evaluated the use of keyword search as an alternative method for rare epilepsy cohort creation using electronic health records data. Methods: Data included clinical notes from encounters with International Classification of Diseases, Ninth Revision (ICD-9) codes for seizures, epilepsy, and/or convulsions during 2010–2014, across six health care systems in New York City. We identified cases with rare epilepsies by searching clinical notes for keywords associated with 33 rare epilepsies. We validated cases via manual chart review. We compared the performance of keyword search to manual chart review using positive predictive value (PPV), sensitivity, and F-score. We selected an initial combination of keywords using the highest F-scores. Results: Data included clinical notes from 77 924 cases with ICD-9 codes for seizures, epilepsy, and/or convulsions. The all-keyword search method identified 6095 candidates, and manual chart review confirmed that 2068 (34%) had a rare epilepsy. The initial combination method identified 1862 cases with a rare epilepsy, and this method performed as follows: PPV median =.64 (interquartile range [IQR] =.50–.81, range =.20–1.00), sensitivity median =.93 (IQR =.76–1.00, range =.10–1.00), and F-score median =.71 (IQR =.63–.85, range =.18–1.00). Using this method, we identified four cohorts of rare epilepsies with over 100 individuals, including infantile spasms, Lennox–Gastaut syndrome, Rett syndrome, and tuberous sclerosis complex. We identified over 50 individuals with two rare epilepsies that do not have specific ICD-10 codes for cohort creation (epilepsy with myoclonic atonic seizures, Sturge–Weber syndrome). Significance: Keyword search is an effective method for cohort creation. These findings can improve identification and surveillance of individuals with rare epilepsies and promote their referral to specialty clinics, clinical research, and support groups.
KW - automated
KW - clinical data
KW - cohort creation
KW - genetic epilepsy
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85167344602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85167344602&partnerID=8YFLogxK
U2 - 10.1111/epi.17725
DO - 10.1111/epi.17725
M3 - Article
C2 - 37498137
AN - SCOPUS:85167344602
SN - 0013-9580
VL - 64
SP - 2738
EP - 2749
JO - Epilepsia
JF - Epilepsia
IS - 10
ER -