SMITE: An R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information

N. Ari Wijetunga, Andrew D. Johnston, Ryo Maekawa, Fabien Delahaye, Netha Ulahannan, Kami Kim, John M. Greally

Research output: Contribution to journal › Article › peer-review

17 Scopus citations


Background: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution.

Results: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data.

Conclusions: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states.
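The abstract does not give the formula SMITE uses, so the following is only a generic illustration of the idea of combining weighted, non-independent p-values. It is a minimal Python sketch of a weighted Stouffer (Z-score) combination in which the variance term is adjusted by an assumed correlation matrix. The function name `combine_pvalues`, its parameters, and the clamping constant are hypothetical and are not taken from the SMITE package.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def combine_pvalues(pvalues, weights=None, corr=None, eps=1e-12):
    """Weighted Stouffer combination of one-sided p-values.

    Illustrative only: this is NOT claimed to be SMITE's procedure.
    pvalues : list of p-values, one per data type (e.g. expression, methylation)
    weights : optional relative weight per data type (defaults to equal weights)
    corr    : optional correlation matrix of the underlying z-scores
              (defaults to the identity, i.e. independence)
    eps     : assumed clamp so p-values of exactly 0 or 1 do not break inv_cdf
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0] * n
    if corr is None:
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Convert each p-value to a z-score (small p gives a large positive z).
    z = [_STD_NORMAL.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps)) for p in pvalues]

    # Weighted sum of z-scores.
    numerator = sum(w * zi for w, zi in zip(weights, z))

    # Variance of the weighted sum: w' R w. Positive correlation inflates it,
    # which makes the combined p-value less extreme than under independence.
    variance = sum(
        weights[i] * weights[j] * corr[i][j] for i in range(n) for j in range(n)
    )

    return 1.0 - _STD_NORMAL.cdf(numerator / sqrt(variance))


if __name__ == "__main__":
    # Two individually weak signals for one gene, expression weighted higher.
    print(combine_pvalues([0.04, 0.08], weights=[0.6, 0.4]))
    # Same inputs, but assuming the two measurements are correlated (r = 0.5).
    print(combine_pvalues([0.04, 0.08], weights=[0.6, 0.4],
                          corr=[[1.0, 0.5], [0.5, 1.0]]))
```

In this sketch, treating correlated measurements as independent would overstate the combined significance, which is why the correlation matrix enters the denominator. The weights let a more trusted data type contribute more to the combined score.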

Original language: English (US)
Article number: 41
Journal: BMC Bioinformatics
Issue number: 1
State: Published - Jan 18 2017


Keywords

  • Bioinformatics
  • Epigenetic
  • Gene expression
  • Genomic
  • Interaction network
  • Modules

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

