Biostatistics Advance Access originally published online on April 14, 2005
Biostatistics 2005 6(3):465-478; doi:10.1093/biostatistics/kxi022
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Reduced rank proportional hazards model for competing risks
Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, P.O. Box 9604, 2300 RC Leiden, The Netherlands m.fiocco{at}lumc.nl
* To whom correspondence should be addressed.
| SUMMARY |
|---|
|
|
|---|
Competing events concerning individual subjects are of interest in many medical studies. For example, leukemia-free patients surviving a bone marrow transplant are at risk of developing acute or chronic graft-versus-host disease, or they might develop infections. In this situation, competing risks models provide a natural framework to describe the disease. When incorporating covariates influencing the transition intensities, an obvious approach is to use Cox's proportional hazards model for each of the transitions separately. A practical problem then is how to deal with the abundance of regression parameters. Our objective is to describe the competing risks model in fewer parameters, both in order to avoid imprecise estimation in transitions with rare events and in order to facilitate interpretation of these estimates. Suppose that the regression parameters are gathered into a p x K matrix B, with p and K as the number of covariates and transitions, respectively. We propose the use of reduced rank models, where B is required to be of lower rank R, smaller than both p and K. One way to achieve this is to write B = A

with A and
matrices of dimensions p x R and K x R, respectively. We shall outline an algorithm to obtain estimates and their standard errors in a reduced rank proportional hazards model for competing risks and illustrate the approach on a competing risks model applied to 8966 leukemia patients from the European Group for Blood and Marrow Transplantation.
Keywords: Biplot; Competing risks; Prognostic factors; Reduced rank; Survival analysis
| 1. INTRODUCTION |
|---|
|
|
|---|
In many clinical trials with survival data, one has to deal with many events concerning individual subjects. Each individual can experience repeated events of the same type, or multiple types of events. We consider here models for competing risks, where the occurrence of one event precludes the occurrence of others. An example is given by bone marrow transplants where a patient may develop acute or chronic graft-versus-host disease (GvHD), return of platelet counts to normal level or development of infection. A patient can die from these or other causes, including transplant-related toxicities. Another example of multiple events concerns cancer trials where, after the initial treatment local recurrence, distant metastasis or death may occur. Competing risks models have one initial state 0 (alive, event-free) and a number of absorbing states k = 1, ..., K, corresponding to the different types of events. For more details, we refer to a recent review by Andersen etal. (2002)
In most trials, the influence of a number of prognostic factors on the disease or recovery process is of interest. One straightforward way to model the effect of these prognostic factors on the time to an event from each of the competing causes is to use Cox's proportional hazards model for each of the transitions separately.
We propose as an alternative approach a reduced rank proportional hazards model. In this model, a parametrization is used such that the matrix of all regression coefficients is of lower rank R. For the special case of R = 1, such a reduced rank model is a proportional hazards model where all covariates have the same effect on each transition (from state 0 to one of the K possible states) apart from proportionality coefficients. Several advantages derive from the application of such a model: fewer parameters need to be estimated, it allows for clearer interpretation of the parameter estimates, and the model can also handle transitions with rare events. The principle of reduced ranks is well known in the literature in the case of multivariate linear regression (e.g. Reinsel and Velu, 1998
) and has recently been extended to generalized linear models (Yee and Hastie, 2003
). However, the application to survival data is novel.
The outline of the paper is as follows. In Section 2 we introduce the competing risks model and the necessary notation; in Section 3 we introduce reduced rank regression and describe an algorithm for estimating the parameters of the reduced rank model. Section 4 describes an application to leukemia-free patients surviving a bone marrow transplant. We describe the data set we have used as an application for the reduced rank model and present estimation results. In Section 5 we discuss our methods and results. Technical details dealing with the computation of standard errors can be found in the supplementary material (www.biostatistics.oupjournals.org).
| 2. COMPETING RISKS MODEL |
|---|
|
|
|---|
In the competing risks framework, in addition to the survival time T, the event cause
is observed. The data consist of failure (event) times for different subjects and the failures (events) are categorized into distinct and exclusive types.
Let Z(t) and hk(t) denote, respectively, the covariate vector and the transition intensities hk(t) from state 0 to state k. The transition intensities are the cause-specific hazards defined by
![]() |
for
The function hk(t) gives the instantaneous failure rate from cause k at time t, given the regression vector Z(t), in the presence of other failure types. In what follows, we shall suppress t in the notation Z(t), and consider only time-constant covariates, but our approach can also be applied to time-dependent covariates. Denote the marginal survival probability and the cumulative incidence function for event k by
![]() |
![]() | (2.1) |
respectively. The cumulative incidence function Fk(t) for a failure of type k is also called the subdistribution. Note the use of S(s) and not
to get the proper incidence function; note also that the cumulative incidence function for cause k depends on the cause-specific hazards for all K causes.
If Z is a p x 1 covariate vector, then for event k,
the proportional hazards model (Cox, 1972
) specifies that
![]() | (2.2) |
where ßk is a p-vector of regression coefficients for a type-k event and hk0(t) is the corresponding baseline hazard. In what follows, we shall suppress the dependence on Z on the hazard functions. The p x K regression coefficients can be gathered into a matrix
![]() | (2.3) |
Let ti and di be the observed survival time and cause of failure, respectively, for individual i. The likelihood function based on n independent observations is then
![]() |
which is completely specified by the cause-specific hazards hk(t),
This implies that standard inference procedures may also be applied to the competing risks model. The parameters can be estimated by maximizing the partial likelihood
![]() |
where
denote the Dk times of failure of type k, Zki is the corresponding regression variable for the individual who fails at tki, and Rk(tki) is the set of subjects at risk for failure of type k just prior to time tki. Note that this method of estimation requires no assumption concerning the interrelation among the causes of failure. Unfortunately, with this approach the cumulative incidence functions are rather complicated nonlinear functions of the covariates and the effects on the cumulative incidence functions of the covariates are not described by simple parameters. For a detailed discussion of regression analysis of the cumulative incidence functions, see Fine and Gray (1999)
and Fine (2001)
.
| 3. REDUCED RANK MODEL |
|---|
|
|
|---|
Reduced rank regression is not a new idea. Anderson (1951)
Consider the regression matrix B defined in (2.3). The K columns of B represent the regression coefficients for each transition. The classical multivariate linear regression model, that relates a set of K multiple responses to a set of p predictor variables, assumes implicitly that the p x K regression coefficient matrix B is of full rank.
A practical concern is that even for a moderate number of variables, the number of parameters in the resulting regression matrix can be large. Moreover, in situations with few events, full models cannot be identified. The rank R reduced rank regression model specifies that
![]() | (3.1) |
and reduces the number of parameters to R(p + K R). Requirement (3.1) can be achieved by writing the regression matrix B as
![]() | (3.2) |
where A and
have dimensions p x R and K x R, respectively,
![]() |
The cause-specific hazard function hk(t) for a reduced rank R model can then be written as
![]() | (3.3) |
where
is the indicator vector for transition k. In the special case of R = 1, the reduced rank model (3.3) becomes
![]() | (3.4) |
Thus, in the rank 1 model, all covariates have the same effects on each transition apart from the proportionality coefficients
k. The factor 
Z can be seen as a prognostic score for a patient with vector of covariates Z, which determines how likely a patient is to experience an event. The parameter
k determines the effect of the prognostic score on transition k.
![]() | (3.5) |
In the rank 2 model, each patient has two prognostic scores
and two coefficients
k1 and
k2 which determine the effect of these prognostic scores on transition k.
An advantage of the reduced rank proportional hazards model is that for lower ranks, it is able to cope with rare transitions. In the reduced rank model of rank R, the events of all transitions are used to estimate
whereas the parameter vector
k, which has dimension R < p, is estimated using the events of transition k only, and for small R this leads to an increase in precision. This idea of borrowing strength from neighboring units (here transitions) is well known in empirical Bayes estimation.
Yee and Hastie (2003)
considered a more general setup where the covariate vector Z is partitioned into Z1 and Z2 and the reduced rank regression is only applied to Z2. For ease of notation, we shall not pursue this here, although our methods can also accommodate it.
It is possible to test for the overall effect of a covariate. One way to do this is to include the covariate of interest in Z1 in the partial reduced rank setup of the previous paragraph. In principle, it is also possible to include the covariate in the reduced rank part and check the overall effect with a likelihood ratio test. However, the distribution of the likelihood ratio test statistic under the null hypothesis of no overall effect has yet to be established.
For the reduced rank model defined in (3.3), the matrix B is uniquely defined but the decomposition
is not, since for any R x R nonsingular rotation matrix
is also a reduced rank model. If one wishes to obtain a unique set of parameter values for A and
from the elements of B, additional conditions have to be imposed. One possibility is to use the singular value decomposition,
where
are the nonzero eigenvalues of
, and
is p x K such that
The columns Ui are the normalized eigenvectors of
corresponding to the
and
where
One could also fix the upper diagonal R x R matrix of A to be the identity matrix. For an accurate overview of the topic (see Reinsel and Velu 1998
, Chapter 2), or the summary in Yee and Hastie (2003)
. In our illustration, we choose a different decomposition of the matrix B into A and
, motivated by our particular application (see Section 4). If the covariates Z are transformed by a linear transformation given by a p x p nonsingular matrix L, then
implies that the corresponding p x K matrix
of the covariate vector LZ is given by
If we also define
it follows that
Below we describe the algorithm we have used for estimating the regression parameters for reduced rank proportional hazard models in the competing risks framework. We created an augmented data set, where each subject is treated as if he/she had K identical event-times. A variable stratum has been added to the augmented data set that distinguishes between the K different failure causes, and a variable status indicating whether, with respect to type k, the event-time has been observed (status = 1) or censored (status = 0).
- 1. Initial estimate for
: any K x R matrix of rank R.
- 2. For given
with elements
estimate A with a Cox regression, stratified by transition, with covariates Wkr defined by
, i.e.

(3.6)
- 3. For given A with columns
estimate
by a Cox regression for each transition
separately, with
as covariates, i.e.

(3.7)
- 4. Repeat steps 2 and 3 until log-likelihoods of subsequent iterations do not differ by more than a prespecified tolerance.
- 2. For given
can be used, but the speed of convergence of the iterative procedure may depend on the initial matrix
. If the rank R 1 and the rank R solutions are close to each other,
can be taken as the K x (R 1) matrix
R 1 from rank R 1 with an additional random column. Preliminary results show that this choice can considerably speed up convergence compared to a random initial matrix. Convergence of this algorithm is not guaranteed. Clearly, it shares the convergence problem of the Cox model in that it does not converge if all subjects die in the same order as the covariate. Reduced rank models of lower rank will be less vulnerable to this danger and may be a good way to resolve such convergence problems in the full rank model (see also the earlier discussion on rare events).
During iteration, although B stabilizes, A and
may not because of the nonuniqueness. However, this will not be a problem in practice since convergence is judged by the log-likelihood, unless for example, A and
will tend to zero and infinity. This could lead to numerical instability although we have never encountered this problem.
For given
, the model is a generalized linear model for A, and vice versa. The idea of alternately optimizing the likelihood with respect to A and
is well established and known under such names as the nonlinear iterative partial least squares (NIPALS) algorithm by Wold and Lyttkens (1969)
, the crisscross method by Gabriel and Zamir (1979)
, and the partitioned algorithm by Smyth (1996)
. At each iteration the likelihood is guaranteed to be larger than or equal to that of the previous iteration. Since the likelihood is bounded above by the maximum likelihood of the full rank model, this proves convergence of the iterative process. Note that convergence is a necessary but not sufficient condition for
to be the maximum likelihood estimator. It is advisable to run the iterative algorithm using different starting values.
In the supplementary material (http://www.biostatistics.oupjournals.org), we describe how to obtain standard errors for the estimated elements of the matrix B.
| 4. ILLUSTRATION |
|---|
|
|
|---|
We now apply a reduced rank proportional hazard model as described in Section 3 to data provided by the European Group for Blood and Marrow Transplantation where we consider deaths from competing causes as events. Hematopoietic stem cell transplantation (HSCT) is an effective and standard treatment for many severe disorders of the hematopoietic system for acute and chronic leukemia. However, the procedure is associated with considerable mortality. After transplantation, leukemia patients are at risk for death from the procedure itself and from the relapse of their leukemia. The most common nonrelapse complication is GvHD which results from a reaction of immune cells in the donor graft against normal host tissues. Other transplant-related causes of death are infection, hemorrhage, and severe organ toxicity from high-dose chemotherapy given prior to the transplant. We consider in our analysis patients with early leukemia, classified as acute myeloid leukemia (AML) or acute lymphoid leukemia (ALL) in first complete remission or chronic myeloid leukemia (CML). All patients had an allogenic transplantation from a human leukocyte antigens (HLA)-identical sibling donor. We perform our analysis on 8966 patients who underwent bone marrow transplantation. The important prognostic factors studied are shown in Table 1.
|
Almost 3000 patients (32%) had missing values for GvHD prevention. In order to retain these patients and to include this important prognostic factor in the analysis, we added missing as an additional category for this factor.
As already mentioned, fatal infection remains a major complication of HSCT. Causes of death may be roughly divided into death due to relapse or transplant-related mortality (TRM). Within TRM, a further distinction can be made between death due to development of acute or chronic GvHD and death due to different kinds of infections, like bacterial, viral, and fungal. All other causes of death are gathered in the category other. One other type of infection, parasitic infection, was also recorded, but since only 28 patients died from parasitic infection, we combined this category with fungal infection. The pooling of these two infections is not strictly necessary for reduced rank models of lower rank, since they can also deal with transitions where few events are present, as discussed in Section 3. However, to be able to fit reduced rank models of higher rank, collapsing these categories was necessary. With these definitions, patients after HSCT are exposed to six competing risks. The number of events for each cause of death is recorded in Table 2.
|
We start modeling the data by fitting for each transition a Cox proportional hazard model with nine covariates and six transitions; this proportional hazards model has full rank 6, and is referred to as the full model. The effect of the nine covariates for each of the six transitions is shown in Table 3.
|
We then fitted each of the five reduced rank models. Since the usual arguments leading to an asymptotic chi-squared distribution for the likelihood ratio test do not hold here, comparison between models is based on the Akaike Information Criterion (AIC, Akaike, 1974
|
The rank 2 model can be illustrated with a biplot (Gabriel,1971
|
Table 5 shows parameter estimates for A, B, and
for the rank 3 model. Standard errors for
and
have not been computed since these estimates are not uniquely identifiable. Motivated by the biplot of the rank 2 model, we define
in such a way that the first, second, and third columns of
are highly correlated with death due to infections or other diseases, death due to relapse, and death due to GvHD, respectively. This was achieved by computing the logarithm of predicted 5-year cumulative incidence for each individual for each death cause, as well as the principal component of the log-predicted 5-year cumulative incidence function for death due to infections and other causes. Subsequently, canonical correlation analysis was applied to this principal component and the logarithm of predicted 5-year cumulative incidence for relapse and for GvHD, together with the three columns formed by
at termination of iteration. This yielded the results described below.
|
We now explain how the results obtained by fitting the reduced rank 3 model can be interpreted. Each patient is given three prognostic scores, s1, s2, and s3, defined by
Each score determines how likely a patient is to experience an event. The estimated parameter
determines the size of the effect of prognostic score sr for transition k. From Table 5 we see that the first score s1 is mainly determined by year of HSCT (later years of transplantation resulting in lower scores) and age at transplantation (higher scores for higher ages at transplantation). The prognostic score s2 plays an important role for the risk of relapse. The risk factor GvHD prevention yields a positive score, CML a negative score. Positive scores give an increased risk of relapse. Finally, in the prognostic score s3 disease classification CML and age at transplantation stand out; the influence of score s3 is the largest negative for fungal infections and positive for GvHD and other causes. Figure 2 shows the distribution of the scores s1, s2, and s3 of the estimated rank 3 model in our 8966 patients. The means (standard deviations) of s1, s2, and s3 are 0.46 (0.99), 0.08 (0.48), and 0.29 (0.37), respectively. The correlations between s1 and s2, s1 and s3, and s2, and s3 are 0.088, 0.663, and 0.525, respectively. Figure 2 also shows the effect of the three scores s1, s2, and s3 on the predicted 5-year cumulative incidence for each of the six transitions. For each patient, the 5-year cumulative incidence
determined by
and the estimated baseline hazard from the rank 3 model were calculated. The distribution of the logarithms of
and their relation with the scores are plotted. As we can see from Figure 2, there is strong positive correlation between the score s1 and death due to infections and other causes; s2 has strong positive correlations with relapse, while s3 has a strong impact on GvHD.
|
To illustrate the effect of the covariates on the cumulative incidence functions, based on the rank 3 model, we consider four patients, 0, ..., 3. Patient 0 has the reference category for each of his/her prognostic factors. Thus, s1 = s2 = s3 = 0, and the corresponding cumulative incidence functions are the baseline cumulative incidence functions. The prognostic factors of the four patients and their resulting scores s1, s2, and s3 are summarized in Table 6. Patients 1, 2, and 3 have been chosen in such a way that they have a relatively high risk of death due to infections (Patient 1), high risk of death due to GvHD (Patient 2), or high risk of death due to relapse (Patient 3). Figure 3 shows the stacked cumulative incidence functions for the six transitions, for each of the four patients.
|
|
| 5. DISCUSSION |
|---|
|
|
|---|
The two most important objectives of applying a reduced rank competing risks model are reduction of the number of parameters to be estimated, and clearer interpretation of the estimates. The application of the reduced rank competing risks model to the bone marrow transplant data set clearly shows the merit of the reduced rank techniques, for both objectives. The number of parameters has been reduced from 54 (full rank model) to 36 (rank 3 model). Moreover, the rank 3 model (and also the biplot of the rank 2 model) clearly shows the most important aspects of the effect of prognostic factors on each of the curves of death: years of transplantation and the age of a patient at transplantation have a strong influence on death due to infections, whereas type of leukemia (AML, CML, or ALL) and GvHD prevention have a major impact on the death due to relapse and GvHD. These are conclusions which make clear clinical sense, but are not immediate from the full (rank 6) model.
As already mentioned in Section 4, we also included in our analysis patients with missing values for GvHD prevention, by adding missing as an extra level of this prognostic factor. We redid the analysis with those patients with missing GvHD prevention removed. The optimal rank of the model, as judged from AIC, remained unchanged, and we observed no major changes in the parameter estimates of the rank 3 model.
Useful tools for the specification of the rank include information-theoretic model selection criteria, such as the AIC and BIC based on measures of the predictive performance of models of various ranks. However, further work is still required on the asymptotic distribution of the likelihood ratio statistic for testing between two reduced rank models of different ranks (cf. Anderson,1984
).
Recently, Yee and Hastie (2003)
proposed a reduced rank model for generalized linear models. In principle, the original data set can be augmented and adapted so that the reduced rank Poisson regression should yield identical results. However, this augmented data set would become very large, quite possibly too large in our application for further analysis.
The reduced rank approach can be extended to wider topics in survival analysis. Work is in progress in applying this method to multistate models and to time-dependent covariates. In the context of competing risks, one could adopt this approach also following Fine and Gray's (1999)
method.
| ACKNOWLEDGMENTS |
|---|
The authors would like to thank the editor and referees for many constructive comments. This work was supported by a grant (ZonMW 2002-912-02-015 Survival analysis for complicated data) from the Netherlands Organization for Scientific Research. The European Group for Blood and Marrow Transplantation is gratefully acknowledged for making data available for this analysis.
| REFERENCES |
|---|
|
|
|---|
-
AKAIKE, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19, 716723.
ANDERSEN, P. K., ABILDSTROM, S. Z. AND ROSTHOJ, S. (2002). Competing risks as a multi-state model. Statistical Methods in Medical Research 11, 203215.
ANDERSON, J. A. (1984). Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B 46, 130.
ANDERSON, T. W. (1951). Estimating linear restrictions on regression coefficients for multivariate normal distribution. Annals of Mathematical Statistics 22, 327351.[CrossRef][Web of Science]
COX, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B 34, 187220.
FINE, J. P. (2001). Regression modeling of competing crude failure probabilities. Biostatistics 2, 8598.[Abstract]
FINE, J. P. AND GRAY, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association 94, 496509.[CrossRef][Web of Science]
GABRIEL, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika 53, 453467.
GABRIEL, K. R. AND ZAMIR, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics 21, 489498.[CrossRef][Web of Science]
REINSEL, G. C. AND VELU, R. P. (1998). Multivariate Reduced-Rank Regression. New York: Springer.
SCHWARZ, G. (1978). Estimating the dimension of a model. Annals of Statistics 6, 461464.[CrossRef][Web of Science]
SMYTH, G. (1996). Partitioned algorithms for maximum likelihood and other non-linear estimation. Statistics and Computing 6, 201216.[CrossRef][Web of Science]
WOLD, H. AND LYTTKENS, E. (1969). Nonlinear iterative partial least squares (NIPALS) estimation procedure. Bulletin of the International Statistical Institute 43, 2951.
YEE, T. W. AND HASTIE, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling 3, 1541.
Received February 3, 2004; revised July 4, 2004; revised October 11, 2004; accepted for publication February 7, 2005.
![]()
CiteULike
Connotea
Del.icio.us What's this?
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||















r = 1, 2, 3, and the logarithm of the 5-year cumulative incidence function for all transitions.