Biostatistics Advance Access originally published online on July 23, 2009
Biostatistics 2009 10(4):694-705; doi:10.1093/biostatistics/kxp024
Sample size calculations for controlling the distribution of false discovery proportion in microarray experiments
Department of Biostatistics, Kyoto University School of Public Health, Yoshidakonoe-cho, Sakyo-ku, Kyoto 606-8501, Japan, toura-kyt{at}umin.ac.jp
Department of Data Science, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569, Japan
Department of Pharmacoepidemiology, Kyoto University School of Public Health, Yoshidakonoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
* To whom correspondence should be addressed.
The false discovery proportion (FDP), the proportion of false rejections among all rejections, provides a useful criterion for controlling false positives in multiple testing to detect differentially expressed genes in microarray experiments. Because FDP can vary substantially when genes are correlated, some authors have considered controlling the actual FDP, rather than its expectation (the false discovery rate), in multiple testing. However, there has been no attempt to do this in the design of microarray experiments. In this article, we develop a sample size calculation procedure that controls the distributions of FDP and of true positives simultaneously under blockwise correlation structures among genes. The block sizes, correlation coefficients, and effect sizes may vary across gene blocks. Gene clustering is proposed to identify gene blocks using historical data sets. The adequacy of the procedure is demonstrated using simulated data sets. An application to a clinical study of lymphoma is also provided.
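As a small illustration of the quantity defined above, the FDP of a single experiment can be computed from the rejection decisions and the (in practice unknown) truth of each null hypothesis. The sketch below is not taken from the article: the function name, the argument names, and the six-gene example are hypothetical, and it assumes the common convention that FDP is 0 when nothing is rejected.

```python
def false_discovery_proportion(rejected, is_true_null):
    """Return V / max(R, 1): false rejections divided by all rejections.

    rejected[i]     -- True if the null hypothesis for gene i was rejected
    is_true_null[i] -- True if gene i is truly non-differential
    Assumed convention (not from the article): FDP = 0 when R = 0.
    """
    if len(rejected) != len(is_true_null):
        raise ValueError("inputs must have the same length")
    total_rejections = sum(1 for r in rejected if r)
    false_rejections = sum(
        1 for r, h0 in zip(rejected, is_true_null) if r and h0
    )
    return false_rejections / max(total_rejections, 1)


# Hypothetical example: 6 genes, 4 rejected, 1 of the rejected is a true null.
rejected = [True, True, True, True, False, False]
is_true_null = [False, False, False, True, True, True]
example_fdp = false_discovery_proportion(rejected, is_true_null)
```

The false discovery rate is the expectation of this quantity over repeated experiments, so a single experiment's FDP can differ from it, which is the variability the abstract refers to.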
Keywords: False discovery proportion; Gene expression; Microarray; Sample size
Received November 20, 2008; revised May 20, 2009; accepted for publication June 22, 2009.