Published by Oxford University Press 2007.
Software and data news
Biostatistics wishes to encourage rapid dissemination of novel statistical methods motivated by substantive problems in the health or biomedical sciences and innovative applications of statistics in these areas. Novel methods are more likely to be adopted by practitioners if software for their implementation is freely available. Innovative applications are more likely to stimulate the interest of others if the data can be made available for wider use. We already encourage authors to deposit software and/or data sets as supplementary information when their papers are accepted for publication. However, we have not until now had any means of disseminating software that is developed only after publication or data sets that can only be released after publication.
To remedy this, from the first issue of 2007 we are adding to the journal a regular feature "Software and Data News," which we encourage authors of published Biostatistics papers to use to announce the availability of software or data related to their papers in the journal. Each item for this feature should include a short descriptive paragraph, a reference to the Biostatistics paper to which the announcement is relevant, and a web address from which the software or data can be downloaded free of charge. Authors should include with their submission a signed letter stating the basis on which they have the authority to make the software available in this way. In addition, the software should be licensed and distributed in such a manner that readers are free to examine the source code of the relevant programs.
Computational enhancement of a shrinkage-based analysis of variance F-test proposed for differential gene expression analysis
Department of Biostatistics, MS 768, St Jude Children's Research Hospital, Memphis, TN 38105, USA stanley.pounds{at}stjude.org
* To whom correspondence should be addressed.
Shrinkage is widely used in the analysis of microarray gene expression data (Allison and others, 2005). Cui and others (2005) develop a shrinkage-based estimator of the mean-squared error for use in analysis of variance-style analyses of microarray gene expression data. Their estimator includes terms V and B that are simple functions of the mean and variance of a log-transformed chi-square random variable. They provide simulation-based estimates of these quantities for use in their formula. However, analytical expressions for these parameters can be derived using standard transformation and expectation theory (Casella and Berger, 1990). In particular, the mean of a log-transformed chi-square variable with $\nu$ degrees of freedom is given by $\ln(2) + \psi(\nu/2)$, where $\psi$ is the digamma function. Additionally, the second moment is given by $\ln(2)^2 + 2\ln(2)\,\psi(\nu/2) + \psi'(\nu/2) + \psi(\nu/2)^2$, where $\psi'$ is the first derivative of the digamma function. Analytical expressions for V and B are easily derived from these results. All the necessary functions are available in R (www.r-project.org), and thus it is possible to perform the calculations without reliance on simulation-based estimates. I have developed an R routine that uses the correct values of V and B to compute the statistic $F_S$ proposed in equation (3.5) of Cui and others (2005). The routine accepts the expression data matrix and a vector of group labels as arguments and returns a vector with the F-statistics. The supplementary materials (available at Biostatistics online, http://www.biostatistics.oxfordjournals.org) provide instructions on how to obtain and use the routine. I gratefully acknowledge the support of the American Lebanese-Syrian Associated Charities.
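The moment formulas above can be checked numerically. The following is a minimal sketch in Python, not the R routine announced here: the function names are hypothetical, and the digamma and trigamma functions are approximated with a standard recurrence plus asymptotic series so that the example needs only the standard library (in R, the built-in digamma() and trigamma() functions would serve the same purpose). The expressions for V and B themselves are defined in Cui and others (2005) and are not reproduced here.

```python
import math


def digamma(x):
    """Approximate the digamma function psi(x) for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Recurrence psi(x) = psi(x + 1) - 1/x shifts x into the range
    # where the asymptotic series below is accurate.
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6) + 1/(240x^8)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 / 240)))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Approximate psi'(x), the first derivative of the digamma function."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Recurrence psi'(x) = psi'(x + 1) + 1/x^2.
    while x < 10.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi'(x) ~ 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    tail = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 / 30)))
    return result + inv + 0.5 * inv2 + tail


def log_chisq_mean(df):
    """Mean of ln(X) for X ~ chi-square with df degrees of freedom."""
    return math.log(2.0) + digamma(df / 2.0)


def log_chisq_second_moment(df):
    """Second moment of ln(X) for X ~ chi-square with df degrees of freedom."""
    ln2 = math.log(2.0)
    psi = digamma(df / 2.0)
    return ln2 ** 2 + 2.0 * ln2 * psi + trigamma(df / 2.0) + psi ** 2


def log_chisq_variance(df):
    """Variance of ln(X): second moment minus squared mean."""
    return log_chisq_second_moment(df) - log_chisq_mean(df) ** 2


if __name__ == "__main__":
    for df in (2, 4, 10):
        print(df, log_chisq_mean(df), log_chisq_variance(df))
```

Subtracting the squared mean from the second moment leaves only the $\psi'(\nu/2)$ term, so the variance of the log-transformed variable does not depend on $\ln(2)$; the sketch computes it the long way to mirror the two formulas in the text.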
REFERENCES
Allison DA, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nature Genetics (2005) 7:55–65.
Casella G, Berger R. Statistical Inference (1990) Pacific Grove, CA: Brooks/Cole. Wadsworth.
Cui X, Hwang JTG, Qiu J, Blades NJ, Churchill GA. Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics (2005) 6:59–75.
Received November 17, 2006;
revised February 10, 2007; revised March 16, 2007;