A tale of three subspecialties: Diagnosis recording patterns are internally consistent but Specialty-Dependent

Jose Franck Diaz-Garelli, Roy Strowd, Tamjeed Ahmed, Brian J. Wells, Rebecca Merrill, Javier Laurini, Boris Pasche, Umit Topaloglu

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Background: Structured diagnoses (DX) are crucial for secondary use of electronic health record (EHR) data. However, they are often suboptimally recorded. Our previous work showed initial evidence of variable DX recording patterns in oncology charts even after biopsy (BX) records are available.

Objective: We verified this finding's internal and external validity. We hypothesized that this recording pattern would be preserved in a larger cohort of patients for the same disease. We also hypothesized that this effect would vary across subspecialties.

Methods: We extracted DX data from EHRs of patients treated for brain, lung, and pancreatic neoplasms, identified through clinician-led chart reviews. We used statistical methods (i.e., binomial and mixed model regressions) to test our hypotheses.

Results: We found variable recording patterns in brain neoplasm DX (i.e., a larger number of distinct DX, OR=2.2, P<0.0001; higher descriptive specificity scores, OR=1.4, P<0.0001; and much higher entropy after the BX, OR=3.8, P=0.004 and OR=8.0, P<0.0001), confirming our initial findings. We also found strikingly different patterns for lung and pancreas DX: both seemed to have much lower DX sequence entropy after the BX (OR=0.198, P=0.015 and OR=0.099, P=0.015, respectively, compared to OR=3.8, P=0.004 for brain DX). We also found statistically significant differences between the brain dataset and both the lung (P<0.0001) and pancreas (0.009<P<0.08) datasets.

Conclusion: Our results suggest that disease-specific DX entry patterns exist and are established differently by clinical subspecialty. These differences should be accounted for during clinical data reuse and data quality assessments, and also during EHR entry system design, to maximize the likelihood of accurate, precise, and consistent data entry.
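As a rough illustration of the "DX sequence entropy" mentioned in the Results, the sketch below computes the Shannon entropy of the diagnosis codes recorded in a chart before and after a biopsy. This is an assumption about one plausible way to quantify recording variability, not the authors' actual metric or code; the function name `shannon_entropy` and the example code sequences are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(codes):
    """Shannon entropy (in bits) of the empirical distribution of codes.

    0.0 means the same code was recorded every time; larger values mean
    the recorded codes are spread over more distinct values.
    """
    if not codes:
        return 0.0
    total = len(codes)
    counts = Counter(codes)
    return sum((n / total) * log2(total / n) for n in counts.values())


# Hypothetical DX code sequences for one chart (illustrative only).
before_bx = ["C71.9", "C71.9", "C71.9", "C71.9"]
after_bx = ["C71.9", "C71.1", "D43.2", "C71.1"]

print(shannon_entropy(before_bx))  # one distinct code
print(shannon_entropy(after_bx))   # three distinct codes
```

Under this definition, a chart whose recorded diagnosis settles on a single code after the biopsy would show lower entropy, while a chart that keeps switching between codes would show higher entropy.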

Original language: English (US)
Pages (from-to): 369-377
Number of pages: 9
Journal: JAMIA Open
Issue number: 3
State: Published - Oct 1 2019
Externally published: Yes


Keywords

  • Clinical data management
  • Data quality
  • Electronic health records
  • Learning healthcare system
  • Secondary use of clinical data

ASJC Scopus subject areas

  • Health Informatics

