PEDF, a pleiotropic WTC-LI biomarker: Machine learning biomarker identification and validation

George Crowley, James Kim, Sophia Kwon, Rachel Lam, David J. Prezant, Mengling Liu, Anna Nolan

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, there remains unaddressed multicollinearity in our serum cytokine, chemokine, and high-throughput platform datasets used to phenotype WTC-disease. To address this concern, we used automated, machine-learning-based pruning of high-dimensional data and validated the identified biomarkers. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1, %Pred < lower limit of normal (LLN); n = 100) and controls (n = 127), all of whom had their biomarkers assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics, followed by feature selection performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping parent cohort via binary logistic regression with 5-fold cross-validation. Random forests of metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) identified that the top 5% of biomarkers important to class separation included pigment epithelium-derived factor (PEDF), macrophage-derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Models validated via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1, %Pred) binary logistic regression had an AUCROC of 0.90 (0.84-0.96). Decreased PEDF and MIP-4, and increased Apo-AII, were associated with increased odds of WTC-LI. Increased GRO and MCP-1, and simultaneously decreased MDC, were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort.
One biomarker (PEDF, an antiangiogenic agent) is a novel, predictive biomarker of particulate-matter-related lung disease. Other biomarkers (GRO, MCP-1, MDC, MIP-4) reveal immune cell involvement in WTC-LI pathogenesis. Findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.
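
The validation step described in the abstract (binary logistic regression scored by AUCROC under 5-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic, the two features are hypothetical markers rather than any biomarker from the study, the group sizes of 100 and 127 are borrowed from the abstract only for scale, and the plain gradient-descent fit stands in for whatever statistical software the authors actually used. No confounder adjustment is included.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear(w, b, x):
    # Linear predictor: intercept plus weighted sum of features.
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_logistic(X, y, lr=0.5, epochs=300):
    # Unregularized binary logistic regression via batch gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear(w, b, xi)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    # AUROC as the fraction of case/control pairs in which the case
    # scores higher than the control; ties count as one half.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    # Deal shuffled indices of each class round-robin into k folds so that
    # every fold keeps roughly the overall case/control ratio.
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Synthetic data (assumption): feature 0 is shifted lower in cases and
# feature 1 higher in cases, each by one standard deviation.
rng = random.Random(0)
labels = [1] * 100 + [0] * 127
X = [[rng.gauss(-1.0 if y else 0.0, 1.0), rng.gauss(1.0 if y else 0.0, 1.0)]
     for y in labels]

folds = stratified_folds(labels, 5, rng)
fold_aucs = []
for k, test_idx in enumerate(folds):
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    w, b = fit_logistic([X[i] for i in train_idx], [labels[i] for i in train_idx])
    scores = [sigmoid(linear(w, b, X[i])) for i in test_idx]
    fold_aucs.append(auc(scores, [labels[i] for i in test_idx]))

mean_auc = sum(fold_aucs) / len(fold_aucs)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(mean_auc, 3))
```

In this sketch a negative fitted weight means that lower values of a feature raise the predicted odds of being a case, which is the same direction of association the abstract reports for markers whose decrease was associated with increased odds of WTC-LI. Holding out each fold in turn means every AUC is computed on samples the model was not fitted on.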

Original language: English (US)
Article number: e1009144
Journal: PLoS Computational Biology
Issue number: 7
State: Published - Jul 2021

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Ecology
  • Molecular Biology
  • Genetics
  • Cellular and Molecular Neuroscience
  • Computational Theory and Mathematics

