Biostatistics Advance Access originally published online on April 21, 2006
Biostatistics 2007 8(1):118-127; doi:10.1093/biostatistics/kxj037

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Adjusting batch effects in microarray expression data using empirical Bayes methods

W. Evan Johnson and Cheng Li*

Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA and Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; cli@hsph.harvard.edu

Ariel Rabinovic

Department of Genetics and Complex Diseases, Harvard School of Public Health, Boston, MA, USA

* To whom correspondence should be addressed.

Non-biological experimental variation, or "batch effects," is commonly observed across multiple batches of microarray experiments, often making it difficult to combine data from these batches. The ability to combine microarray data sets is advantageous to researchers because it increases statistical power to detect biological phenomena in studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.

Keywords: Batch effects; Empirical Bayes; Microarrays; Monte Carlo
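
The abstract does not spell out the estimation steps, so the following is only a minimal sketch of the general idea behind a parametric empirical Bayes batch adjustment, not the authors' software. It assumes a location-only model (no scale adjustment, no covariates), a normal prior on each batch's per-gene offsets, and a crude plug-in estimate of that prior taken across genes. The function name `eb_location_adjust` and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def eb_location_adjust(data, batches):
    """Shrink per-gene batch offsets toward a per-batch prior, then subtract them.

    data    : list of genes; each gene is a list of expression values, one per sample
    batches : list of batch labels, one per sample (same order as the values)
    Assumption: location-only adjustment; this is an illustration, not ComBat.
    """
    labels = sorted(set(batches))
    idx = {b: [j for j, lab in enumerate(batches) if lab == b] for b in labels}
    n_samples = len(batches)

    # Step 1: raw offset of each batch from the gene's overall mean, plus a
    # pooled within-batch variance for each gene.
    raw, s2 = [], []
    for gene in data:
        overall = mean(gene)
        batch_mean = {b: mean([gene[j] for j in idx[b]]) for b in labels}
        raw.append({b: batch_mean[b] - overall for b in labels})
        ss = sum((gene[j] - batch_mean[batches[j]]) ** 2 for j in range(n_samples))
        s2.append(ss / (n_samples - len(labels)))

    # Step 2: plug-in prior for each batch, estimated across all genes.
    prior_mean = {b: mean([r[b] for r in raw]) for b in labels}
    prior_var = {b: pvariance([r[b] for r in raw]) for b in labels}

    # Step 3: posterior-mean (shrunken) offsets under a normal-normal model,
    # subtracted from every sample in the corresponding batch.
    adjusted = []
    for g, gene in enumerate(data):
        new_gene = list(gene)
        for b in labels:
            noise = s2[g] / len(idx[b])  # sampling variance of the raw offset
            denom = prior_var[b] + noise
            weight = prior_var[b] / denom if denom > 0 else 1.0
            offset = weight * raw[g][b] + (1 - weight) * prior_mean[b]
            for j in idx[b]:
                new_gene[j] = gene[j] - offset
        adjusted.append(new_gene)
    return adjusted


# Toy example: three genes, two batches of two samples; batch "b" is shifted up by about 2.
toy = [[1.0, 1.2, 3.1, 3.0], [5.0, 5.3, 7.2, 7.1], [2.0, 2.1, 4.4, 4.0]]
toy_batches = ["a", "a", "b", "b"]
toy_adjusted = eb_location_adjust(toy, toy_batches)
```

With only two samples per batch, each gene's raw offset is noisy, so the weight on the gene's own estimate is small and the offset is pulled toward the value shared across genes. That borrowing of information across genes is what makes this style of adjustment usable at small batch sizes.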


