Biostatistics Advance Access originally published online on June 19, 2007
Biostatistics 2008 9(1):187-198; doi:10.1093/biostatistics/kxm024

© The Author 2007. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Identification of SNP interactions using logic regression

Holger Schwender* and Katja Ickstadt

Collaborative Research Center 475, Department of Statistics, University of Dortmund, 44221 Dortmund, Germany holger.schwender{at}udo.edu

* To whom correspondence should be addressed.

Interactions of single nucleotide polymorphisms (SNPs) are assumed to be responsible for complex diseases such as sporadic breast cancer. Important goals of studies concerned with such genetic data are thus to identify combinations of SNPs that lead to a higher risk of developing a disease and to measure the importance of these interactions. Many approaches based on classification methods such as CART and random forests allow the importance of single variables to be measured, but none of these methods enables the importance of combinations of variables to be quantified directly. In this paper, we show how logic regression can be employed to identify SNP interactions that explain the disease status in a case–control study, and we propose 2 measures for quantifying the importance of these interactions for classification. These approaches are then applied both to simulated data sets and to the SNP data of the GENICA study, a study dedicated to the identification of genetic and gene–environment interactions associated with sporadic breast cancer.

Keywords: Feature selection; GENICA; Single nucleotide polymorphism; Variable importance measure

Received July 5, 2006; revised November 29, 2006; revised March 2, 2007; accepted for publication April 25, 2007.





