Biostatistics Advance Access published online on March 18, 2008
Biostatistics, doi:10.1093/biostatistics/kxn002
Optimal screening for promising genes in 2-stage designs
Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281-S9, 9000 Gent, Belgium beatrijs.moerkerke@ugent.be
Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281-S9, 9000 Gent, Belgium and Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
* To whom correspondence should be addressed.
Detecting genetic markers with biologically relevant effects remains a challenge because of multiple testing. Standard analysis methods focus on evidence against the null hypothesis and primarily control the type I error. The worthwhile alternative, by contrast, is specified for power calculations at the design stage. The balanced test proposed by Moerkerke and others (2006) and Moerkerke and Goetghebeur (2006) incorporates this alternative directly into the decision criterion to achieve better power. Genetic markers are selected and ranked according to the balance of evidence they contain against the null and against the target alternative. In this paper, we build on this guiding principle to develop 2-stage designs for screening genetic markers when the cost of measurements is high. For a given marker, a first sample may already provide sufficient evidence for or against the alternative. If not, more data are gathered at the second stage, which is followed by a binary decision based on all available data. By optimizing the parameters that determine the decision process over the 2 stages (such as the area of the "gray" zone that leads to the gathering of extra data), the expected cost per marker can be reduced substantially. We also demonstrate that, compared with 1-stage designs, 2-stage designs achieve a better balance between true negatives and true positives for the same cost.
Keywords: Alternative p-value; Balanced test; Cost-efficient screening; False discovery rate; Gene selection; Multiple testing; Optimal designs; Two-stage designs
Received September 5, 2006; revised July 3, 2007; revised December 31, 2007; accepted for publication January 22, 2008.
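The 2-stage decision process outlined in the abstract can be illustrated with a minimal sketch. It is not the balanced-test criterion of the paper: it assumes unit-variance normal measurements, a simple standardized sum as the test statistic, and hypothetical cutoffs named lower, upper, and final_cut. A marker whose first-stage statistic falls between lower and upper lands in the "gray" zone and receives a second sample; the expected cost per marker is then the first-stage sample size plus the second-stage sample size weighted by the probability of landing in the gray zone.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_stage_decision(stage1, draw_stage2, lower, upper, final_cut):
    """Decide on one marker; return (selected, number_of_measurements_used).

    Assumptions (illustrative, not taken from the paper):
    - observations are independent with unit variance,
    - the statistic is sum(data) / sqrt(len(data)),
    - lower, upper, and final_cut are hypothetical cutoffs.
    """
    n1 = len(stage1)
    z1 = sum(stage1) / math.sqrt(n1)
    if z1 <= lower:
        # Enough evidence against the alternative: stop without selecting.
        return False, n1
    if z1 >= upper:
        # Enough evidence for the alternative: stop and select.
        return True, n1
    # Gray zone: gather extra data, then decide on all available data.
    data = list(stage1) + list(draw_stage2())
    z = sum(data) / math.sqrt(len(data))
    return z >= final_cut, len(data)


def expected_cost(effect, n1, n2, lower, upper):
    """Expected number of measurements for a marker with the given true effect.

    Under the unit-variance normal assumption, the first-stage statistic is
    normal with mean effect * sqrt(n1) and variance 1.
    """
    mean_z1 = effect * math.sqrt(n1)
    p_gray = normal_cdf(upper - mean_z1) - normal_cdf(lower - mean_z1)
    return n1 + n2 * p_gray


if __name__ == "__main__":
    # Compare the expected cost for a null marker and for a marker with a
    # (hypothetical) target effect of 0.5 standard deviations.
    for effect in (0.0, 0.5):
        print(effect, expected_cost(effect, n1=10, n2=10, lower=0.0, upper=2.0))
```

In this sketch, widening the gray zone sends more markers to the second stage and raises the expected cost, while narrowing it forces more decisions on first-stage data alone; an optimal design in the sense of the abstract would search over such parameters under the paper's own criterion.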