Biostatistics Advance Access published online on June 24, 2009
Biostatistics, doi:10.1093/biostatistics/kxp020
Rank-based estimation in the ℓ1-regularized partly linear model for censored outcomes with application to integrated analyses of clinical predictors and gene expression data
Department of Biostatistics, Emory University, Atlanta, GA 30322, USA bajohn3{at}emory.edu
* To whom correspondence should be addressed.
We consider estimation and variable selection in the partial linear model for censored data. This model is a direct extension of the accelerated failure time model, which is itself an important alternative to the proportional hazards model. We extend rank-based lasso-type estimators to a model that may contain nonlinear effects. Variable selection in such a partial linear model has direct application to high-dimensional survival analyses that attempt to adjust for clinical predictors. In the microarray setting, previous methods can adjust for other clinical predictors by assuming that clinical and gene expression data enter the model linearly in the same fashion. Here, we select important variables after adjusting for prognostic clinical variables, but the clinical effects are assumed to be nonlinear. Our estimator is based on stratification and can be extended naturally to account for multiple nonlinear effects. We illustrate the utility of our method through simulation studies and an application to the Wisconsin prognostic breast cancer data set.
Keywords: Lasso; Logrank; Penalized least squares; Survival analysis
Received November 4, 2008; revised May 6, 2009; accepted for publication May 27, 2009.