Biostatistics Vol. 5 No. 4 © Oxford University Press 2004; all rights reserved.
Bayesian analysis of binary prediction tree models for retrospectively sampled outcomes

Institute of Statistics & Decision Sciences, Duke University, Durham, NC 27708-0251, USA
jennifer{at}stat.duke.edu
Department of Surgery, Duke University, Durham, NC 27708-0251, USA
Department of Molecular Genetics & Microbiology, Duke University, Durham, NC 27708-0251, USA
Computational & Applied Genomics Program, Duke University, Durham, NC 27708-0251, USA
Institute of Statistics & Decision Sciences, Duke University, Durham, NC 27708-0251, USA
To whom correspondence should be addressed.
Classification tree models are flexible analysis tools that can evaluate interactions among predictors as well as generate predictions for responses of interest. We describe Bayesian analysis of a specific class of tree models in which binary response data arise from a retrospective case-control design. We are also particularly interested in problems with potentially very many candidate predictors. This scenario is common in studies of gene expression data, which provide a key motivating context. Innovations here include the introduction of tree models that explicitly address and incorporate the retrospective design, and the use of nonparametric Bayesian models involving Dirichlet process priors on the distributions of predictor variables. The model specification influences the generation of trees through Bayes factor-based tests of association that determine significant binary partitions of nodes during a process of forward generation of trees. We describe this constructive process and discuss questions of generating and combining multiple trees via Bayesian model averaging for prediction. Additional discussion of parameter selection and sensitivity is given in the context of an example that concerns prediction of breast tumour status using high-dimensional gene expression data; the example demonstrates the exploratory/explanatory uses of such models as well as their primary utility in prediction. Shortcomings of the approach and comparison with alternative tree modelling algorithms are also discussed, as are issues of modelling and computational extensions.
Keywords: Bayesian analysis; Binary classification tree; Bioinformatics; Case-control design; Metagenes; Molecular classification; Predictive classification; Retrospective sampling; Tree models
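
The abstract's idea of a Bayes factor test of association under retrospective sampling can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the authors' model: it replaces the Dirichlet process priors with simple Beta(a, b) priors on the probability that a predictor falls at or below a threshold within each outcome group, and the function names (log_bayes_factor, best_split), the midpoint threshold grid, and the default a = b = 1 are all hypothetical choices made here for illustration.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_bayes_factor(case_low, case_high, ctrl_low, ctrl_high, a=1.0, b=1.0):
    """Log Bayes factor for association between outcome and a thresholded predictor.

    Retrospective view: the predictor is modelled conditional on the outcome.
    H1: Pr(x <= tau | case) and Pr(x <= tau | control) differ, each with an
        independent Beta(a, b) prior (an assumption of this sketch).
    H0: both groups share one probability with a Beta(a, b) prior.
    """
    log_m1 = (log_beta(a + case_low, b + case_high) - log_beta(a, b)
              + log_beta(a + ctrl_low, b + ctrl_high) - log_beta(a, b))
    log_m0 = (log_beta(a + case_low + ctrl_low, b + case_high + ctrl_high)
              - log_beta(a, b))
    return log_m1 - log_m0


def best_split(x, z, a=1.0, b=1.0):
    """Scan midpoints between sorted distinct x values; return (tau, log BF).

    x: predictor values; z: 0/1 outcomes (1 = case, 0 = control).
    Returns None if x has fewer than two distinct values.
    """
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        tau = (lo + hi) / 2.0
        case_low = sum(1 for xi, zi in zip(x, z) if zi == 1 and xi <= tau)
        case_high = sum(1 for xi, zi in zip(x, z) if zi == 1 and xi > tau)
        ctrl_low = sum(1 for xi, zi in zip(x, z) if zi == 0 and xi <= tau)
        ctrl_high = sum(1 for xi, zi in zip(x, z) if zi == 0 and xi > tau)
        lbf = log_bayes_factor(case_low, case_high, ctrl_low, ctrl_high, a, b)
        if best is None or lbf > best[1]:
            best = (tau, lbf)
    return best
```

In a forward tree-growing procedure of the kind the abstract describes, a node would be split only when the largest log Bayes factor across candidate predictors and thresholds exceeds some chosen cutoff; the cutoff itself is a tuning choice that this sketch leaves open.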