Biostatistics 4:555-567 (2003)
© 2003 Oxford University Press
Multivariate exploratory tools for microarray data analysis

Aniko Szabo, Kenneth Boucher, David Jones, Alexander D. Tsodikov. Huntsman Cancer Institute and Department of Oncological Sciences, University of Utah, 2000 Circle of Hope, Salt Lake City, UT 84112-5550, USA aniko.szabo{at}hci.utah.edu
Lev B. Klebanov. Department of Probability and Statistics, Charles University, Sokolovska 83, Praha-8, CZ-18675, Czech Republic
Andrei Y. Yakovlev. Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Box 630, Rochester, NY 14642, USA and Huntsman Cancer Institute and Department of Oncological Sciences, University of Utah, 2000 Circle of Hope, Salt Lake City, UT 84112-5550, USA
To whom correspondence should be addressed
The ultimate success of microarray technology in basic and applied biological sciences depends critically on the development of statistical methods for gene expression data analysis. The most widely used tests for differential expression of genes are essentially univariate. Such tests disregard the multidimensional structure of microarray data. Multivariate methods are needed to utilize the information hidden in gene interactions and hence to provide more powerful and biologically meaningful methods for finding subsets of differentially expressed genes. The objective of this paper is to develop methods of multidimensional search for biologically significant genes, considering expression signals as mutually dependent random variables. To attain these ends, we consider the utility of a pertinent distance between random vectors and its empirical counterpart constructed from gene expression data. The distance furnishes exploratory procedures aimed at finding a target subset of differentially expressed genes. To determine the size of the target subset, we resort to successive elimination of smaller subsets resulting from each step of a random search algorithm based on maximization of the proposed distance. Different stopping rules associated with this procedure are evaluated. The usefulness of the proposed approach is illustrated with an application to the analysis of two sets of gene expression data.
Keywords: Cross-validation; Differential expression; Permutation test; Probability distance; Random search; Sets of genes
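The search procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not name the distance it uses, so the code below substitutes an energy-type distance with a Euclidean kernel as a stand-in. The function names (`pairwise_mean_dist`, `empirical_distance`, `random_search`), the single-gene swap move, and the default iteration count are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def pairwise_mean_dist(a, b):
    """Mean Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).mean()


def empirical_distance(x, y, genes):
    """Energy-type distance between two groups, restricted to a gene subset.

    x, y: arrays of shape (samples, genes), one per phenotype.
    This kernel is an assumed stand-in; the abstract does not specify one.
    """
    xs, ys = x[:, genes], y[:, genes]
    return (2 * pairwise_mean_dist(xs, ys)
            - pairwise_mean_dist(xs, xs)
            - pairwise_mean_dist(ys, ys))


def random_search(x, y, subset_size, n_iter=2000, seed=0):
    """Random search: swap one gene at a time, keep swaps that raise the distance."""
    rng = np.random.default_rng(seed)
    n_genes = x.shape[1]
    current = rng.choice(n_genes, size=subset_size, replace=False)
    best = empirical_distance(x, y, current)
    for _ in range(n_iter):
        candidate = current.copy()
        outside = np.setdiff1d(np.arange(n_genes), current)
        # Replace one randomly chosen member with a gene outside the subset.
        candidate[rng.integers(subset_size)] = rng.choice(outside)
        score = empirical_distance(x, y, candidate)
        if score > best:
            current, best = candidate, score
    return np.sort(current), best
```

Under these assumptions, calling `random_search(x, y, subset_size=3)` on two expression matrices returns the indices of one candidate subset and its distance. The successive elimination mentioned in the abstract would correspond to removing the returned genes from both matrices and repeating the search; the stopping rules and the permutation test are not sketched here.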

