


Biostatistics Advance Access published online on March 10, 2009

Biostatistics, doi:10.1093/biostatistics/kxp003
All versions of this article: 10/3/446 (most recent); kxp003v1

© The Author 2009. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

A note on oligonucleotide expression values not being normally distributed

Johanna Hardin*

Department of Mathematics, Pomona College, 610 North College Avenue, Claremont, CA 91711, USA
jo.hardin{at}pomona.edu

Jason Wilson

Department of Mathematics, Biola University, La Mirada, CA 90639, USA
jason.wilson{at}biola.edu

* To whom correspondence should be addressed.

Novel techniques for analyzing microarray data are constantly being developed. Though many of these methods contribute to biological discoveries, the inability to evaluate them properly limits their ability to advance science. Because the underlying distribution of microarray data is unknown, novel methods are typically tested on data assumed to be normally distributed. However, microarray data are not, in fact, normally distributed, and assuming that they are can have misleading consequences. Using an Affymetrix technical replicate spike-in data set, we show that oligonucleotide expression values are not normally distributed under any of the standard methods for calculating expression values. The resulting data tend to include a large proportion of genes with skewed and heavy-tailed distributions. We also show that standard methods can give unexpected and misleading results when the data are not well approximated by the normal distribution. Robust methods are therefore recommended when analyzing microarray data, and new techniques should be evaluated with skewed and/or heavy-tailed data distributions.
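As a rough illustration of what skewed and heavy-tailed values look like numerically, the sketch below computes moment-based skewness and excess kurtosis for simulated samples. It is not the authors' analysis and uses no Affymetrix data: the normal and log-normal samples, the sample size, the seed, and all function names are assumptions chosen only for illustration.

```python
import random
import statistics


def central_moment(values, order):
    """Return the central moment of the given order, using divisor n."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def sample_skewness(values):
    """Moment-based skewness m3 / m2^1.5; near 0 for symmetric data."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis m4 / m2^2 - 3; near 0 for normal data."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3.0


def simulate_values(kind, n, seed):
    """Draw n hypothetical expression values (illustrative, not real data)."""
    rng = random.Random(seed)
    if kind == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if kind == "lognormal":
        # Log-normal draws are right-skewed with a heavy upper tail.
        return [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    raise ValueError(f"unknown kind: {kind}")


if __name__ == "__main__":
    for kind in ("normal", "lognormal"):
        values = simulate_values(kind, n=2000, seed=1)
        print(
            kind,
            "skewness=%.2f" % sample_skewness(values),
            "excess_kurtosis=%.2f" % excess_kurtosis(values),
            "mean=%.2f" % statistics.fmean(values),
            "median=%.2f" % statistics.median(values),
        )
```

For the simulated normal sample both statistics should sit close to zero, whereas the log-normal sample should show clearly positive skewness and excess kurtosis, with the mean pulled above the median. That gap between mean and median is one simple reason a robust summary can behave differently from a mean-based one on such data.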

Keywords: Affymetrix; Distributions; Microarray data; Nonnormality

Received July 15, 2008; revised November 12, 2008; accepted for publication January 27, 2009.




This article has been cited by other articles:


S. Stjernqvist and T. Ryden
A continuous-index hidden Markov jump process for modeling DNA copy number data
Biostat., October 1, 2009; 10(4): 773 - 778.


