CoreProbe: A novel algorithm for estimating relative abundance based on metagenomic reads

Dongmei Ai, Hongfei Pan, Ruocheng Huang, Li C. Xia

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

With the rapid development of high-throughput sequencing technology, the analysis of metagenomic sequencing data and the accurate, efficient estimation of relative microbial abundance have become important ways to explore the composition and function of microbial communities. The accuracy and efficiency of relative abundance estimation depend closely on both the estimation algorithm and the choice of reference sequences used for sequence alignment. We introduced the microbial core genome as the reference sequence for the potential microbes in a metagenomic sample, constructed finite mixture and latent Dirichlet models, and used the Gibbs sampling algorithm to estimate the relative abundance of microorganisms. Simulation results showed that our approach improves efficiency while maintaining high accuracy, making it better suited to high-throughput metagenomic data. The new approach is implemented in our CoreProbe package, which provides a pipeline for accurate and efficient estimation of the relative abundance of microbes in a community. The tool is available free of charge from the CoreProbe website. The Docker image can be pulled with the following command: sudo docker pull panhongfei/coreprobe:1.0.
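To make the modelling idea in the abstract concrete, the following is a minimal sketch of Gibbs sampling for a finite mixture model with a Dirichlet prior on the abundances. It is not the CoreProbe implementation. The function names, the symmetric prior `alpha`, and the input matrix `likelihoods` (one row per read, one column per reference genome, holding an assumed probability of the read given that genome) are all hypothetical choices for illustration, and the sketch estimates read proportions without any correction for genome or core-genome length.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs_abundance(likelihoods, n_iter=500, burn_in=100, alpha=1.0, seed=0):
    """Estimate mixture proportions (relative read abundance) by Gibbs sampling.

    likelihoods[r][g] is an assumed P(read r | genome g); every read must
    have at least one non-zero entry. All names here are illustrative.
    """
    rng = random.Random(seed)
    n_genomes = len(likelihoods[0])
    genomes = range(n_genomes)
    theta = [1.0 / n_genomes] * n_genomes  # start from uniform abundances
    running = [0.0] * n_genomes

    for it in range(n_iter):
        # Step 1: sample a genome assignment for each read given theta.
        counts = [0] * n_genomes
        for row in likelihoods:
            weights = [t * p for t, p in zip(theta, row)]
            g = rng.choices(genomes, weights=weights)[0]
            counts[g] += 1
        # Step 2: sample theta from its Dirichlet posterior given the counts.
        theta = sample_dirichlet([alpha + c for c in counts], rng)
        # Accumulate post-burn-in samples to form the posterior mean.
        if it >= burn_in:
            for g in genomes:
                running[g] += theta[g]

    kept = n_iter - burn_in
    return [v / kept for v in running]


# Toy input: 60 reads unique to genome 0, 20 unique to genome 1,
# 20 reads equally compatible with genomes 0 and 1, none for genome 2.
reads = [[1.0, 0.0, 0.0]] * 60 + [[0.0, 1.0, 0.0]] * 20 + [[0.5, 0.5, 0.0]] * 20
estimate = gibbs_abundance(reads)
print(estimate)
```

In this toy input the ambiguous reads are shared between genomes 0 and 1 in proportion to the current abundance estimates, so genome 0 should end up with the largest share and genome 2, which attracts no reads, with a share close to zero.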

Original language: English (US)
Article number: 313
Journal: Genes
Volume: 9
Issue number: 6
State: Published - Jun 20 2018
Externally published: Yes

Keywords

  • Core genome
  • Dirichlet model
  • Gibbs sampling
  • Metagenomics
  • Relative abundance estimation

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
