TY - JOUR
T1 - Batch effects and pathway analysis
T2 - Two potential perils in cancer studies involving DNA methylation array analysis
AU - Harper, Kristin N.
AU - Peters, Brandilyn A.
AU - Gamble, Mary V.
PY - 2013/6
Y1 - 2013/6
AB - Background: DNA methylation microarrays have become an increasingly popular means of studying the role of epigenetics in cancer, although the methods used to analyze these arrays are still being developed and existing methods are not always widely disseminated among microarray users. Methods: We investigated two problems likely to confront DNA methylation microarray users: (i) batch effects and (ii) the use of widely available pathway analysis software to analyze results. First, DNA taken from individuals exposed to low and high levels of drinking water arsenic was plated twice on Illumina's Infinium 450 K Human Methylation Array, once in order of exposure and again following randomization. Second, we conducted simulations in which random CpG sites were drawn from the 450 K array and subjected to pathway analysis using Ingenuity's IPA software. Results: The majority of differentially methylated CpG sites identified in Run One were due to batch effects; few sites were also identified in Run Two. In addition, the pathway analysis software reported many significant associations between our data, randomly drawn from the 450 K array, and various diseases and biological functions. Conclusions: These analyses illustrate the pitfalls of not properly controlling for chip-specific batch effects as well as using pathway analysis software created for gene expression arrays to analyze DNA methylation array data. Impact: We present evidence that (i) chip-specific effects can simulate plausible differential methylation results and (ii) popular pathway analysis software developed for expression arrays can yield spurious results when used in tandem with methylation microarrays.
UR - http://www.scopus.com/inward/record.url?scp=84879015632&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84879015632&partnerID=8YFLogxK
U2 - 10.1158/1055-9965.EPI-13-0114
DO - 10.1158/1055-9965.EPI-13-0114
M3 - Article
C2 - 23629520
AN - SCOPUS:84879015632
SN - 1055-9965
VL - 22
SP - 1052
EP - 1060
JO - Cancer Epidemiology Biomarkers and Prevention
JF - Cancer Epidemiology Biomarkers and Prevention
IS - 6
ER -