Identifying functionally informative evolutionary sequence profiles

Nelson Gil, Andras Fiser

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Motivation: Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually.

Results: We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases.
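The selection criterion described in the abstract can be illustrated with a small sketch. The abstract does not say how SAMMI scores, normalizes, or samples alignments, so everything below is an assumption made for illustration: the function names (`column_mutual_information`, `mean_pairwise_mi`, `select_msa_by_mi`) are hypothetical, the score is a plain average of pairwise column mutual information in bits, and gap characters are treated as ordinary symbols. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), expressed with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def mean_pairwise_mi(msa):
    """Average mutual information over all pairs of columns in an MSA.

    `msa` is a list of equal-length aligned sequences (strings).
    Assumption: gaps ('-') count as ordinary symbols.
    """
    columns = list(zip(*msa))
    pairs = list(combinations(columns, 2))
    if not pairs:
        return 0.0
    return sum(column_mutual_information(a, b) for a, b in pairs) / len(pairs)


def select_msa_by_mi(candidates):
    """Return the candidate MSA with the highest mean pairwise column MI."""
    return max(candidates, key=mean_pairwise_mi)


# Toy candidates: two columns that covary, two that vary independently,
# and two that are fully conserved.
covarying = ["AC", "AC", "GT", "GT"]
independent = ["AC", "AT", "GC", "GT"]
conserved = ["AC", "AC", "AC", "AC"]

best = select_msa_by_mi([conserved, independent, covarying])
```

In this toy case the covarying alignment scores highest, while the fully conserved and the independently varying alignments both score zero. That mirrors the intuition in the abstract: mutual information needs sequence diversity, but only rewards diversity that is coupled across columns.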

Original language: English (US)
Pages (from-to): 1278-1286
Number of pages: 9
Journal: Bioinformatics
Volume: 34
Issue number: 8
State: Published - Apr 15 2018

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
