Heavy-Tailed noise suppression and derivative wavelet scalogram for detecting DNA copy number aberrations

Nha Nguyen, An Vo, Haibin Sun, Heng Huang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Most existing array comparative genomic hybridization (array CGH) data processing methods and evaluation models assume that the probability density function (pdf) of the noise in array CGH data is Gaussian. In practice, however, the noise distribution is peaky and heavy-tailed. A Gaussian pdf is therefore inadequate for approximating the noise in array CGH data; it introduces false detections of chromosomal aberrations and leads to misunderstanding of disease pathogenesis. A more accurate model of the noise in array CGH data is necessary and benefits the detection of DNA copy number variations. We analyze real array CGH data from different platforms and show that the noise distribution is fitted very well by a generalized Gaussian distribution (GGD). Based on this new noise model, we propose a novel array CGH processing method that combines the advantages of the smoothing and segmentation approaches. The new method uses a generalized Gaussian bivariate shrinkage function and a one-directional derivative wavelet scalogram in generalized Gaussian noise. In the smoothing step, with the new generalized Gaussian noise model, we derive a heavy-tailed noise suppression algorithm in the stationary wavelet domain. In the segmentation step, the 1D Gaussian derivative wavelet scalogram is employed to detect break points. Both real and simulated array CGH data with different noise types (Gaussian noise, GGD noise, and real noise) are used in our experiments. We demonstrate that our new method outperforms other state-of-the-art methods in terms of both root mean squared error and receiver operating characteristic curves.
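The contrast between a Gaussian and a "peaky and heavy-tailed" noise pdf can be illustrated with a minimal sketch. The abstract does not state the parameterization used in the paper, so the code below assumes the standard GGD form f(x) = beta / (2 alpha Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta); the function names `ggd_pdf`, `scale_for_std`, and `gaussian_pdf` are hypothetical and are not taken from the article.

```python
import math


def ggd_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density (assumed standard form).

    mu is the location, alpha the scale, beta the shape.
    beta = 2 gives a Gaussian, beta = 1 a Laplace distribution.
    """
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))


def scale_for_std(sigma, beta):
    """Scale alpha for which the GGD has standard deviation sigma."""
    return sigma * math.sqrt(math.gamma(1.0 / beta) / math.gamma(3.0 / beta))


def gaussian_pdf(x, sigma=1.0):
    """Zero-mean Gaussian density, used as the reference model."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    sigma = 1.0
    # Compare densities at the peak (x = 0) and in the tail (x = 4),
    # with every distribution matched to the same standard deviation.
    for beta in (2.0, 1.5, 1.0):
        alpha = scale_for_std(sigma, beta)
        peak = ggd_pdf(0.0, alpha=alpha, beta=beta)
        tail = ggd_pdf(4.0, alpha=alpha, beta=beta)
        print(f"beta={beta}: pdf(0)={peak:.4f}, pdf(4)={tail:.6f}")
    print(f"Gaussian: pdf(0)={gaussian_pdf(0.0):.4f}, pdf(4)={gaussian_pdf(4.0):.6f}")
```

With the variance held fixed, a shape parameter below 2 raises the density both at the center and far from it relative to the Gaussian, which is the qualitative behavior the abstract attributes to array CGH noise. The shape values used here are illustrative only; the abstract reports no fitted parameters.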

Original language: English (US)
Article number: 7970153
Pages (from-to): 1625-1635
Number of pages: 11
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 15
Issue number: 5
State: Published - Sep 1 2018
Externally published: Yes

Keywords

  • DNA copy number variations
  • Heavy-tailed noise
  • aCGH
  • wavelet

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
