Biostatistics Advance Access published online on November 29, 2007
Biostatistics, doi:10.1093/biostatistics/kxm042
MOST: detecting cancer differential gene expression
Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371 henglian{at}ntu.edu.sg
* To whom correspondence should be addressed.
We propose a new statistic for detecting differentially expressed genes when the genes are activated in only a subset of the samples. Statistics designed for this unconventional circumstance have proved valuable for most cancer studies, where oncogenes are activated in only a small number of disease samples. Previous efforts in this direction include cancer outlier profile analysis (Tomlins and others, 2005), outlier sum (Tibshirani and Hastie, 2007), and outlier robust t-statistics (Wu, 2007). The new statistic, called maximum ordered subset t-statistics (MOST), seems natural when the number of activated samples is unknown. We compare MOST with the other statistics and find that the proposed method often has more power than its competitors.
Keywords: Cancer; COPA; Differential gene expression; Microarray
Received August 8, 2007; revised September 24, 2007; revised October 23, 2007; accepted for publication October 25, 2007.