TY - JOUR
T1 - Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records
AU - Weiner, Michael
AU - Dexter, Paul R.
AU - Heithoff, Kim
AU - Roberts, Anna R.
AU - Liu, Ziyue
AU - Griffith, Ashley
AU - Hui, Siu
AU - Schelfhout, Jonathan
AU - Dicpinigaitis, Peter
AU - Doshi, Ishita
AU - Weaver, Jessica P.
N1 - Publisher Copyright: © 2020 American College of Chest Physicians
PY - 2021/6
Y1 - 2021/6
N2 - Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection. Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC? Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals. Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals). Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified. Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.
KW - chronic cough
KW - electronic health records
KW - natural language processing
KW - structured data
KW - unstructured data
UR - http://www.scopus.com/inward/record.url?scp=85105319455&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105319455&partnerID=8YFLogxK
U2 - 10.1016/j.chest.2020.12.011
DO - 10.1016/j.chest.2020.12.011
M3 - Article
C2 - 33345951
AN - SCOPUS:85105319455
SN - 0012-3692
VL - 159
SP - 2346
EP - 2355
JO - Chest
JF - Chest
IS - 6
ER -