TY - GEN
T1 - Fast cancer classification based on mass spectrometry analysis in robust stationary wavelet domain
AU - Pham, Phuong
AU - Yu, Li
AU - Nguyen, Minh
AU - Nguyen, Nha
PY - 2012
Y1 - 2012
N2 - Mass spectrometry (MS) is a technology recently adopted for high-dimensional detection of proteins in proteomics. However, because MS data (MALDI-TOF) are high-resolution and noisy, most existing MS analysis algorithms are not robust to noise and run slowly. New algorithms are therefore needed to analyze such data. In this paper, we propose a novel feature extraction method that accounts for the inherent noise of mass spectra. The proposed method combines the stationary wavelet transform (SWT) with a bivariate shrinkage estimator for MS feature extraction and denoising. Statistical feature testing is then applied to the denoised wavelet coefficients to select significant features for biomarker identification. To evaluate the effectiveness of the proposed method, a double cross-validation support vector machine classifier, which has high generalizability, and a fast Modest AdaBoost classifier, which significantly improves experimental runtime, are applied to cancer classification using the features selected by the proposed method. Several experiments are carried out to evaluate the performance of the proposed method. The results show that it can be an effective tool for analyzing MS data.
AB - Mass spectrometry (MS) is a technology recently adopted for high-dimensional detection of proteins in proteomics. However, because MS data (MALDI-TOF) are high-resolution and noisy, most existing MS analysis algorithms are not robust to noise and run slowly. New algorithms are therefore needed to analyze such data. In this paper, we propose a novel feature extraction method that accounts for the inherent noise of mass spectra. The proposed method combines the stationary wavelet transform (SWT) with a bivariate shrinkage estimator for MS feature extraction and denoising. Statistical feature testing is then applied to the denoised wavelet coefficients to select significant features for biomarker identification. To evaluate the effectiveness of the proposed method, a double cross-validation support vector machine classifier, which has high generalizability, and a fast Modest AdaBoost classifier, which significantly improves experimental runtime, are applied to cancer classification using the features selected by the proposed method. Several experiments are carried out to evaluate the performance of the proposed method. The results show that it can be an effective tool for analyzing MS data.
KW - Bivariate shrinkage
KW - Boosting
KW - Feature extraction
KW - Mass spectrometry
KW - SVM
KW - SWT
UR - http://www.scopus.com/inward/record.url?scp=81255176747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81255176747&partnerID=8YFLogxK
U2 - 10.1007/978-94-007-2598-0_21
DO - 10.1007/978-94-007-2598-0_21
M3 - Conference contribution
AN - SCOPUS:81255176747
SN - 9789400725973
T3 - Lecture Notes in Electrical Engineering
SP - 189
EP - 199
BT - IT Convergence and Services, ITCS 2011 and IRoA 2011
T2 - 3rd International Conference on Information Technology Convergence and Services, ITCS 2011 and 2011 FTRA International Conference on Intelligent Robotics, Automations, Telecommunication Facilities, and Applications, IRoA 2011
Y2 - 20 October 2011 through 22 October 2011
ER -