Biostatistics Advance Access originally published online on April 21, 2006
Biostatistics 2007 8(1):118-127; doi:10.1093/biostatistics/kxj037
Adjusting batch effects in microarray expression data using empirical Bayes methods
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA and Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA cli{at}hsph.harvard.edu
Department of Genetics and Complex Diseases, Harvard School of Public Health, Boston, MA, USA
* To whom correspondence should be addressed.
Non-biological experimental variation or "batch effects" are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (
) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.
Keywords: Batch effects; Empirical Bayes; Microarrays; Monte Carlo
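The abstract does not spell out the estimators, so the following is only a minimal illustrative sketch of the general empirical Bayes idea of borrowing information across genes, not the method of the paper. It assumes a location-only adjustment, a normal-normal model, and a method-of-moments prior; the function name `shrink_batch_offsets` and its arguments are hypothetical.

```python
from statistics import mean, variance


def shrink_batch_offsets(data, batches):
    """Location-only batch adjustment with simple normal-normal shrinkage.

    data    : list of genes, each a list of expression values (one per sample)
    batches : list of batch labels, one per sample
    Assumes at least 2 genes and at least 2 samples in every batch.
    Returns a new genes x samples list; the input is not modified.
    """
    adjusted = [list(row) for row in data]
    gene_means = [mean(row) for row in data]

    for b in sorted(set(batches)):
        cols = [j for j, label in enumerate(batches) if label == b]
        n = len(cols)

        # Raw per-gene offset of this batch from the gene's overall mean.
        raw = [mean([row[j] for j in cols]) - gm
               for row, gm in zip(data, gene_means)]

        # Approximate sampling variance of each raw offset.
        s2 = [variance([row[j] for j in cols]) / n for row in data]

        # Prior for the offsets, estimated from all genes in the batch
        # (method of moments, with the prior variance floored at zero).
        prior_mean = mean(raw)
        prior_var = max(variance(raw) - mean(s2), 0.0)

        for g, row in enumerate(adjusted):
            # Noisy per-gene offsets are pulled toward the batch-wide mean.
            weight = prior_var / (prior_var + s2[g] + 1e-12)
            shrunk = prior_mean + weight * (raw[g] - prior_mean)
            for j in cols:
                row[j] -= shrunk

    return adjusted
```

With only a few arrays per batch, each gene's raw offset is noisy, so the shrinkage weight moves it toward the offset shared by all genes in that batch. A fuller treatment would also adjust batch-specific scale and account for biological covariates, which this sketch deliberately omits.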



