Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records

Michael Weiner; Paul R. Dexter; Kim Heithoff; Anna R. Roberts; Ziyue Liu; Ashley Griffith; Siu Hui; Jonathan Schelfhout; Peter Dicpinigaitis; Ishita Doshi; Jessica P. Weaver

doi:10.1016/j.chest.2020.12.011

Medicine

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection.

Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC?

Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals.

Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals).

Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified.

Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.
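The chronicity rule stated in the abstract (three encounters spanning 56 to 120 days) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `has_chronic_cough`, its parameters, and the reading of the rule (at least three distinct cough-encounter dates whose first-to-last span is between 56 and 120 days, inclusive) are assumptions made here for illustration only.

```python
from bisect import bisect_right
from datetime import date, timedelta


def has_chronic_cough(encounter_dates, min_encounters=3, min_span=56, max_span=120):
    """Return True if some group of cough encounters meets the assumed rule.

    Assumed reading (not confirmed by the source): at least `min_encounters`
    distinct encounter dates whose first-to-last span is between `min_span`
    and `max_span` days, inclusive.
    """
    days = sorted(set(encounter_dates))  # distinct dates, oldest first
    for i, first in enumerate(days):
        # Index of the latest encounter within max_span days of `first`.
        j = bisect_right(days, first + timedelta(days=max_span)) - 1
        enough_encounters = (j - i + 1) >= min_encounters
        long_enough = (days[j] - first).days >= min_span
        if enough_encounters and long_enough:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical patient with three cough encounters over about ten weeks.
    patient = [date(2010, 1, 1), date(2010, 2, 1), date(2010, 3, 15)]
    print(has_chronic_cough(patient))
```

Under this reading, encounters clustered within a few weeks, or separated by more than 120 days, would not qualify, which matches the misclassification risk for widely spaced encounters noted under Limitations.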

Original language	English (US)
Pages (from-to)	2346-2355
Number of pages	10
Journal	Chest
Volume	159
Issue number	6
DOIs	https://doi.org/10.1016/j.chest.2020.12.011
State	Published - Jun 2021

Keywords

chronic cough
electronic health records
natural language processing
structured data
unstructured data

ASJC Scopus subject areas

Pulmonary and Respiratory Medicine
Critical Care and Intensive Care Medicine
Cardiology and Cardiovascular Medicine

Cite this

@article{2fac0da9740848dfb2ac53d17a055422,

title = "Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records",

abstract = "Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection. Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC? Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals. Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals). Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified. Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.",

keywords = "chronic cough, electronic health records, natural language processing, structured data, unstructured data",

author = "Michael Weiner and Dexter, {Paul R.} and Kim Heithoff and Roberts, {Anna R.} and Ziyue Liu and Ashley Griffith and Siu Hui and Jonathan Schelfhout and Peter Dicpinigaitis and Ishita Doshi and Weaver, {Jessica P.}",

note = "Publisher Copyright: {\textcopyright} 2020 American College of Chest Physicians",

year = "2021",

month = jun,

doi = "10.1016/j.chest.2020.12.011",

language = "English (US)",

volume = "159",

pages = "2346--2355",

journal = "Chest",

issn = "0012-3692",

publisher = "American College of Chest Physicians",

number = "6",

}

TY - JOUR

T1 - Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records

AU - Weiner, Michael

AU - Dexter, Paul R.

AU - Heithoff, Kim

AU - Roberts, Anna R.

AU - Liu, Ziyue

AU - Griffith, Ashley

AU - Hui, Siu

AU - Schelfhout, Jonathan

AU - Dicpinigaitis, Peter

AU - Doshi, Ishita

AU - Weaver, Jessica P.

PY - 2021/6

Y1 - 2021/6

N2 - Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection. Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC? Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals. Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals). Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified. Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.

AB - Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection. Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC? Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals. Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals). Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified. Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.

KW - chronic cough

KW - electronic health records

KW - natural language processing

KW - structured data

KW - unstructured data

UR - http://www.scopus.com/inward/record.url?scp=85105319455&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85105319455&partnerID=8YFLogxK

U2 - 10.1016/j.chest.2020.12.011

DO - 10.1016/j.chest.2020.12.011

M3 - Article

C2 - 33345951

AN - SCOPUS:85105319455

SN - 0012-3692

VL - 159

SP - 2346

EP - 2355

JO - Chest

JF - Chest

IS - 6

ER -
