Biostatistics Advance Access published online on April 21, 2006
Biostatistics, doi:10.1093/biostatistics/kxj037
1 Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA; Department of Biostatistics, Harvard School of Public Health, Boston, MA
* To whom correspondence should be addressed.
Non-biological experimental variation, or batch effects, is commonly observed across multiple batches of microarray experiments, often making it difficult to combine data from these batches. Combining microarray data sets helps researchers increase statistical power to detect biological phenomena in studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and nonparametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.
Received February 28, 2006
Revised April 14, 2006
Accepted April 14, 2006
Article
Adjusting batch effects in microarray expression data using Empirical Bayes methods
W. Evan Johnson 1, Ariel Rabinovic 2, and Cheng Li 1 *
2 Department of Genetics and Complex Diseases, Harvard School of Public Health, Boston, MA
Cheng Li, E-mail: cli{at}hsph.harvard.edu