Biostatistics 2007 8(3):505-506; doi:10.1093/biostatistics/kxm021

Published by Oxford University Press 2007.

Software and data news

Biostatistics wishes to encourage rapid dissemination of novel statistical methods motivated by substantive problems in the health or biomedical sciences and innovative applications of statistics in these areas. Novel methods are more likely to be adopted by practitioners if software for their implementation is freely available. Innovative applications are more likely to stimulate the interest of others if the data can be made available for wider use.

We already encourage authors to deposit software and/or data sets as supplementary information when their papers are accepted for publication. However, until now we have had no means of disseminating software that is developed only after publication, or data sets that can be released only after publication.

To remedy this, from the first issue of 2007 we are adding to the journal a regular feature "Software and Data News," which we encourage authors of published Biostatistics papers to use to announce the availability of software or data related to their papers in the journal. Each item for this feature should include a short descriptive paragraph, a reference to the Biostatistics paper to which the announcement is relevant, and a web address from which the software or data can be downloaded free of charge. Authors should include with their submission a signed letter stating the basis on which they have the authority to make the software available in this way. In addition, the software should be licensed and distributed in such a manner that readers are free to examine the source code of the relevant programs.


 

Computational enhancement of a shrinkage-based analysis of variance F-test proposed for differential gene expression analysis

Stanley B. Pounds*

Department of Biostatistics, MS 768, St Jude Children's Research Hospital, Memphis, TN 38105, USA stanley.pounds{at}stjude.org

* To whom correspondence should be addressed.

Shrinkage is widely used in the analysis of microarray gene expression data (Allison and others, 2005). Cui and others (2005) develop a shrinkage-based estimator of the mean-squared error for use in analysis of variance-style analyses of microarray gene expression data. Their estimator includes terms $V$ and $B$ that are simple functions of the mean and variance of a log-transformed chi-square random variable. They provide simulation-based estimates of these quantities for use in their formula. However, analytical expressions for these parameters can be derived using standard transformation and expectation theory (Casella and Berger, 1990). In particular, the mean of a log-transformed chi-square variable with $\nu$ degrees of freedom is given by $\ln(2) + \Psi(\nu/2)$, where $\Psi$ is the digamma function. Additionally, the second moment is given by $[\ln(2)]^2 + 2\ln(2)\,\Psi(\nu/2) + \Lambda(\nu/2) + [\Psi(\nu/2)]^2$, where $\Lambda$ is the first derivative of the digamma function (the trigamma function). Subtracting the squared mean from the second moment shows that the variance is simply $\Lambda(\nu/2)$. Analytical expressions for $V$ and $B$ are easily derived from these results. All the necessary functions are available in R (www.r-project.org), and thus it is possible to perform the calculations without reliance on simulation-based estimates. I have developed an R routine that uses the exact analytical values of $V$ and $B$ to compute the statistic $F_S$ proposed in equation (3.5) of Cui and others (2005). The routine accepts the expression data matrix and a vector of group labels as arguments and returns a vector with the F-statistics. The supplementary materials (available at Biostatistics online, http://www.biostatistics.oxfordjournals.org) provide instructions on how to obtain and use the routine. I gratefully acknowledge the support of the American Lebanese-Syrian Associated Charities.
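The moment formulas above can be checked numerically. The following is a minimal sketch in Python, not the R routine described in this note: the function names are hypothetical, and the digamma and trigamma functions are approximated here with a standard recurrence plus asymptotic series so that the example needs only the standard library (in R, the built-in `digamma()` and `trigamma()` would serve the same purpose). It computes only the mean, second moment, and variance of a log-transformed chi-square variable; the expressions for V and B themselves are given in Cui and others (2005) and are not reproduced here.

```python
import math


def digamma(x):
    """Approximate the digamma function Psi(x) for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Recurrence Psi(x) = Psi(x + 1) - 1/x shifts x up to where the series is accurate.
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic series: 1/(12x^2) - 1/(120x^4) + 1/(252x^6) - 1/(240x^8)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Approximate the trigamma function (first derivative of digamma) for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Recurrence Psi'(x) = Psi'(x + 1) + 1/x^2.
    while x < 10.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 * (1.0 / 42 - inv2 / 30)))
    return result + inv + 0.5 * inv2 + inv * series


def log_chisq_mean(nu):
    """Mean of ln(X) for X chi-square with nu degrees of freedom."""
    return math.log(2.0) + digamma(nu / 2.0)


def log_chisq_second_moment(nu):
    """Second moment of ln(X), following the expression in the text."""
    psi = digamma(nu / 2.0)
    ln2 = math.log(2.0)
    return ln2 ** 2 + 2.0 * ln2 * psi + trigamma(nu / 2.0) + psi ** 2


def log_chisq_variance(nu):
    """Variance of ln(X); algebraically this reduces to trigamma(nu / 2)."""
    return log_chisq_second_moment(nu) - log_chisq_mean(nu) ** 2


if __name__ == "__main__":
    for nu in (2, 5, 10, 30):
        print(nu, log_chisq_mean(nu), log_chisq_variance(nu))
```

For 2 degrees of freedom the chi-square variable is exponential with mean 2, so the mean of its logarithm should equal ln(2) minus the Euler-Mascheroni constant and the variance should equal π²/6, which gives a convenient sanity check on the formulas.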


    REFERENCES

    Allison DA, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nature Genetics (2005) 7:55–65.

    Casella G, Berger R. Statistical Inference (1990) Pacific Grove, CA: Brooks/Cole. Wadsworth.

    Cui X, Hwang JTG, Qiu J, Blades NJ, Churchill GA. Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics (2005) 6:59–75.

    Received November 17, 2006; revised February 10, 2007; revised March 16, 2007;