A common open representation of mass spectrometry data and its application to proteomics research

Patrick G.A. Pedrioli, Jimmy K. Eng, Robert Hubley, Mathijs Vogelzang, Eric W. Deutsch, Brian Raught, Brian Pratt, Erik Nilsson, Ruth H. Angeletti, Rolf Apweiler, Kei Cheung, Catherine E. Costello, Henning Hermjakob, Sequin Huang, Randall K. Julian, Eugene Kapp, Mark E. McComb, Stephen G. Oliver, Gilbert Omenn, Norman W. Paton, Richard Simpson, Richard Smith, Chris F. Taylor, Weimin Zhu, Ruedi Aebersold

Research output: Contribution to journal › Review article › peer-review

665 Scopus citations


A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.

Original language: English (US)
Pages (from-to): 1459-1466
Number of pages: 8
Journal: Nature Biotechnology
Issue number: 11
State: Published - Nov 2004
Externally published: Yes

ASJC Scopus subject areas

  • Biotechnology
  • Bioengineering
  • Applied Microbiology and Biotechnology
  • Molecular Medicine
  • Biomedical Engineering

