Biostatistics Advance Access originally published online on December 22, 2006
Biostatistics 2007 8(2):485-499; doi:10.1093/biostatistics/kxl042
Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data
Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
Department of Statistics, University of California, Berkeley, CA, USA
Division of Genetics and Bioinformatics, Walter and Eliza Hall Institute, Melbourne, Australia and Department of Statistics, University of California, Berkeley, CA, USA
Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA ririzarr{at}jhsph.edu
* To whom correspondence should be addressed.
In most microarray technologies, a number of critical steps are required to convert raw intensity measurements into the data relied upon by data analysts, biologists, and clinicians. These data manipulations, referred to as preprocessing, can influence the quality of the ultimate measurements. In the last few years, the high-throughput measurement of gene expression has been the most popular application of microarray technology. For this application, various groups have demonstrated that the use of modern statistical methodology can substantially improve the accuracy and precision of gene expression measurements, relative to ad hoc procedures introduced by designers and manufacturers of the technology. Other applications of microarrays are now becoming increasingly popular. In this paper, we describe a preprocessing methodology for a technology designed for the identification of DNA sequence variants in specific genes or regions of the human genome that are associated with phenotypes of interest such as disease. In particular, we describe a methodology useful for preprocessing Affymetrix single-nucleotide polymorphism (SNP) chips and obtaining genotype calls from the preprocessed data. We demonstrate how our procedure improves on existing approaches using data from 3 relatively large studies, including one in which large numbers of independent calls are available. The proposed methods are implemented in the package oligo, available from Bioconductor.
Keywords: Affymetrix; Genotyping; High-throughput; Microarrays
Received June 27, 2006; revised September 18, 2006; accepted for publication October 12, 2006.