Prediction of interactions between cell surface proteins by machine learning

Zhaoqian Su, Brian Griffin, Scott Emmons, Yinghao Wu

Research output: Contribution to journal › Article › peer-review

Abstract

Cells detect changes in their external environments and communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions to fulfill their functions. The interactions between cell surface proteins are highly dynamic and thus challenging to detect using traditional experimental techniques. Here, we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile so that the probability of interaction between the query proteins could be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross-validation results show that we can achieve higher than 70% accuracy in identifying the protein–protein interactions (PPIs) within this dataset. We then applied this method to a group of 46 cell surface proteins in Caenorhabditis elegans, screening every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins, complementing current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community. Moreover, the general framework of the machine learning classification can also be extended to study the interactions of proteins in other domain superfamilies.
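
The profile-construction and all-pairs screening steps described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy fragment library, the exact-substring matching rule, the binary features, and the names `FRAGMENT_PAIRS`, `build_profile`, and `all_pairs` are hypothetical. The abstract does not say how fragments are matched to query sequences or how the profile is encoded.

```python
from itertools import combinations

# Hypothetical toy library: each entry is a pair of short sequence fragments
# assumed to face each other across a known Ig-domain interface.
FRAGMENT_PAIRS = [
    ("VKLT", "GSYE"),
    ("TCRA", "LQWD"),
    ("PEDS", "VKLT"),
]

def build_profile(seq_a, seq_b, library=FRAGMENT_PAIRS):
    """Return one feature per library entry: 1.0 if both fragments of the
    pair occur in the two query sequences (in either orientation), else 0.0.
    Exact substring matching is an assumption made for this sketch."""
    profile = []
    for frag_x, frag_y in library:
        forward = frag_x in seq_a and frag_y in seq_b
        reverse = frag_y in seq_a and frag_x in seq_b
        profile.append(1.0 if (forward or reverse) else 0.0)
    return profile

def all_pairs(protein_ids):
    """Enumerate every unordered pair of distinct proteins to screen."""
    return list(combinations(protein_ids, 2))

# One profile vector per query pair; a classifier would take this as input.
profile = build_profile("MAVKLTCRAQ", "PPGSYELQWD")

# Screening 46 proteins against each other, excluding self-pairs.
pairs = all_pairs([f"P{i}" for i in range(46)])
print(profile, len(pairs))
```

In this sketch the profile has one dimension per library entry, so a realistic library would yield the high-dimensional input that the classifiers read. Counting only distinct pairs is also an assumption; if same-protein (homophilic) interactions are screened as well, `itertools.combinations_with_replacement` would enumerate them.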

Original language: English (US)
Pages (from-to): 567-580
Number of pages: 14
Journal: Proteins: Structure, Function and Bioinformatics
Volume: 92
Issue number: 4
State: Published - Apr 2024

Keywords

  • cell surface protein
  • immunoglobulin domain
  • immunoglobulin fold
  • machine learning
  • protein–protein interactions

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
