Biostatistics Advance Access originally published online on March 29, 2009
Biostatistics 2009 10(3):468-480; doi:10.1093/biostatistics/kxp005
Estimating equation–based causality analysis with application to microarray time series data
Department of Biostatistics, Division of Quantitative Sciences, University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
jhu{at}mdanderson.org
Department of Statistics, University of Virginia, Charlottesville, VA, USA
fh6e{at}virginia.edu
* To whom correspondence should be addressed.
Microarray time-course data can be used to explore interactions among genes and to infer gene networks. A crucial step in constructing a gene network is to develop an appropriate causality test. In this regard, the expression profile of each gene can be treated as a time series. A typical existing method establishes Granger causality with a Wald-type test, which relies on the assumption that the data are homoscedastic and normally distributed. However, this assumption can be seriously violated in real microarray experiments, which may lead to inconsistent test results and false scientific conclusions. To overcome this drawback, we propose an estimating equation–based method that is robust to both heteroscedasticity and nonnormality of the gene expression data; it requires only that the residuals be uncorrelated. We use simulation studies and a real-data example to demonstrate the applicability of the proposed method.
Keywords: Chi-square approximation; Estimating equation; F-test; False-positive rate; Granger causality; Time-course data
Received March 30, 2008; revised November 24, 2008; accepted for publication December 18, 2008.
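The contrast the abstract draws, between a classical Wald-type Granger test and a test that tolerates heteroscedastic errors, can be illustrated with a small sketch. The code below is not the estimating-equation method proposed in the paper. It swaps in a standard heteroscedasticity-robust (sandwich, HC0) covariance for the regression coefficients, purely as an assumption, to show how the two Wald statistics are formed. The function name `granger_wald`, the lag order `p`, and the simulated series are all hypothetical.

```python
import numpy as np

def granger_wald(x, y, p=1, robust=True):
    """Wald statistic for H0: the p lags of x do not help predict y.

    Illustrative only: robust=True uses an HC0 sandwich covariance,
    robust=False uses the classical homoscedastic covariance.
    Returns (statistic, degrees_of_freedom).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Design matrix: intercept, p lags of y, then p lags of x.
    cols = [np.ones(n - p)]
    cols += [y[p - k : n - k] for k in range(1, p + 1)]
    cols += [x[p - k : n - k] for k in range(1, p + 1)]
    Z = np.column_stack(cols)
    target = y[p:]

    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta = ZtZ_inv @ Z.T @ target          # least-squares estimate
    resid = target - Z @ beta

    if robust:
        # Sandwich covariance: allows the error variance to change over time.
        meat = Z.T @ (Z * resid[:, None] ** 2)
        cov = ZtZ_inv @ meat @ ZtZ_inv
    else:
        # Classical covariance: assumes constant error variance.
        sigma2 = resid @ resid / (len(target) - Z.shape[1])
        cov = sigma2 * ZtZ_inv

    idx = slice(1 + p, 1 + 2 * p)          # coefficients on the lags of x
    b = beta[idx]
    stat = float(b @ np.linalg.solve(cov[idx, idx], b))
    return stat, p

# Hypothetical simulated data: x drives y with heteroscedastic noise.
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    noise_sd = 0.5 + abs(x[t - 1])         # error variance depends on x
    y[t] = 0.8 * x[t - 1] + noise_sd * rng.standard_normal()

stat_robust, df = granger_wald(x, y, p=1, robust=True)
stat_classical, _ = granger_wald(x, y, p=1, robust=False)
CHI2_CRIT_1DF_5PCT = 3.841                 # 5% critical value, chi-square with 1 df
print("robust stat:", stat_robust, "classical stat:", stat_classical)
```

Each statistic is compared with a chi-square critical value whose degrees of freedom equal the number of tested lags. Under heteroscedastic errors the two statistics can differ noticeably, which is the kind of discrepancy the abstract warns about.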