Biostatistics Advance Access originally published online on November 10, 2005
Biostatistics 2006 7(2):252-267; doi:10.1093/biostatistics/kxj005
One- and two-sample nonparametric inference procedures in the presence of a mixture of independent and dependent censoring
Department of Biostatistics, Harvard University, 677 Huntington Avenue, Boston, MA 02115, USA
Department of Preventive Medicine, Northwestern University, 680 North Lake Shore Drive, Suite 1102, Chicago, IL 60611, USA
Department of Biostatistics, Harvard University, 677 Huntington Avenue, Boston, MA 02115, USA wei{at}sdac.harvard.edu
* To whom correspondence should be addressed.
| SUMMARY |
|---|
In survival analysis, the event time $T$ is often subject to dependent censorship. Without assuming a parametric model between the failure and censoring times, the parameter $\Theta$ of interest, for example, the survival function of $T$, is generally not identifiable. On the other hand, the collection $\Omega$ of all attainable values for $\Theta$ may be well defined. In this article, we present nonparametric inference procedures for $\Omega$ in the presence of a mixture of dependent and independent censoring variables. By varying the criteria for classifying censoring into the dependent or independent category, our proposals can be quite useful for the so-called sensitivity analysis of censored failure times. The case in which the failure time is subject to possibly dependent interval censorship is also discussed. The new proposals are illustrated with data from two clinical studies on HIV-related diseases.
Keywords: Competing risks; Martingale; Sensitivity analysis; Simultaneous confidence interval; Survival analysis
| 1. INTRODUCTION |
|---|
In survival analysis, the time to the event of interest is often subject to dependent right censorship. For example, in a double-blind clinical trial, AIDS Clinical Trials Group 175 (ACTG 175), conducted by the ACTG, 2467 patients were randomly assigned to one of four daily regimens (Hammer et al., 1996
50% decline in the CD4 cell count, development of AIDS, and death. One thousand nine hundred and two event times were censored. Although the majority of these event times was censored administratively, 663 patients were off the treatments without having any of the above clinical events due to, for example, toxicity or request from the patient or investigator, which is likely related to the primary end point. As indicated by Tsiatis (1975)
With various parametric assumptions on the dependence structure between the event and censoring times, novel inference procedures and sensitivity analyses were proposed, for example, by Fisher and Kanarek (1974), Slud and Rubinstein (1983), Klein and Moeschberger (1988), Klein et al. (1992), Moeschberger and Klein (1995), Zheng and Klein (1995), Lin et al. (1996), DiRienzo and Lagakos (2001), and DiRienzo (2003). When auxiliary variables are available, innovative research has been done, for example, by Robins and Rotnitzky (1992), Robins (1993), Robins and Finkelstein (2000), Satten et al. (2001), and Scharfstein and Robins (2002).
In this article, we consider the case in which the failure time $T$ may be censored by either a dependent or an independent censoring variable, without assuming a parametric or semiparametric dependence structure between the failure and censoring times. Although the parameter $\Theta$ of interest for this case may not be identifiable, the collection $\Omega$ of all possible values of $\Theta$ is often well defined. For example, if $\Theta$ is the survival function of $T$, $\Omega$ is the collection of nonincreasing functions which are bounded by the Peterson bounds (Peterson, 1976). For the two-sample problem with the proportional hazards assumption, $\Theta$ is the ratio of two hazard functions, which is a scalar parameter, and $\Omega$ is the set of all possible positive values of $\Theta$ under the above nonparametric setting. In this paper, we propose inference procedures for $\Omega$ under various one- and two-sample settings. Specifically, we present a consistent estimate $\hat{\Omega}$ and a $(1 - \alpha)$ confidence set $\tilde{\Omega}$ for $\Omega$ such that $\mathrm{pr}(\Omega \subseteq \tilde{\Omega}) \ge 1 - \alpha$, where $0 < \alpha < 1$. Such confidence interval estimation provides more information than single point estimation. Moreover, by varying the criteria for classifying censoring into the dependent or independent category, our proposal can be quite useful for sensitivity analysis of censored failure time observations. The new proposals are illustrated with the data from the aforementioned ACTG 175 study. To the best of our knowledge, under the nonparametric setting for the relationship between the survival and censoring variables, there are no confidence interval estimation procedures available for the set of all attainable values of the parameter of interest in the presence of a mixture of dependent and independent censoring.
Lastly, in this paper we discuss the case that the failure time is subject to dependent interval censorship and present certain one- and two-sample inference procedures. The procedures are illustrated with the data from a well-known study on the HIV-1 infection incidence among hemophilia patients.
| 2. INFERENCES WITH RIGHT-CENSORED OBSERVATIONS |
|---|
Let $T$ be the continuous failure time of interest, $D$ be the continuous, dependent censoring variable, and $C$ be the independent censoring variable. Also, let $\{(T_i, D_i, C_i), i = 1, \ldots, n\}$ be $n$ independent copies of $(T, D, C)$. For the $i$th subject, one can only observe $(X_i, \Delta_i)$, where $X_i = \min(T_i, D_i, C_i)$ and
$$\Delta_i = \begin{cases} 1 & \text{if } X_i = T_i, \\ 2 & \text{if } X_i = D_i, \\ 0 & \text{if } X_i = C_i. \end{cases}$$
First, suppose that we are interested in making inferences about the survival function $S(t)$ of $T$. In the presence of censoring, $S(t)$ generally cannot be estimated well nonparametrically for small or large $t$. Here, we let the parameter $\Theta$ be the function $S(\cdot)$ defined on a predetermined, finite interval $\mathcal{L} = [\tau_1, \tau_2]$, where $\tau_1$ and $\tau_2$ are known constants such that $\mathrm{pr}(X \le \tau_1,\ T < D \wedge C) > 0$ and $\mathrm{pr}(X \ge \tau_2) > 0$. Without assuming a parametric dependence structure between $T$ and $D$, $S(\cdot)$ is not identifiable. On the other hand, the set $\Omega$ of all attainable values of $\Theta$ is the collection of nonincreasing functions which are bounded below by $S_L(t)$ and above by $S_U(t)$, where $S_L(t) = \mathrm{pr}(T \wedge D > t)$ and
$$1 - S_U(t) = \mathrm{pr}(T \le t,\ T < D), \tag{2.1}$$
$t \in \mathcal{L}$ (Peterson, 1976). In the competing risks literature, the right-hand side of (2.1) is the so-called cumulative incidence function.
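Since $T$ and $D$ are continuous, these bounds can be written as
$$S_L(t) = \exp\{-\Lambda_T(t) - \Lambda_D(t)\}, \qquad S_U(t) = 1 - \int_0^t \exp\{-\Lambda_T(u) - \Lambda_D(u)\}\, d\Lambda_T(u), \tag{2.2}$$
where
$$\Lambda_T(t) = \int_0^t \frac{d\,\mathrm{pr}(T \le u,\ T < D)}{\mathrm{pr}(T \wedge D \ge u)}$$
and
$$\Lambda_D(t) = \int_0^t \frac{d\,\mathrm{pr}(D \le u,\ D < T)}{\mathrm{pr}(T \wedge D \ge u)}$$
are the cumulative cause-specific hazard functions with respect to $T$ and $D$, respectively (Aalen, 1978; Kalbfleisch and Prentice, 2002, p. 251). To obtain a consistent estimate $\hat{\Omega}$ for $\Omega$, one needs to estimate $S_L(t)$ and $S_U(t)$. To this end, let
$$N_{mi}(t) = I\{X_i \le t,\ \Delta_i = I(m = T) + 2 I(m = D)\}, \qquad Y_i(t) = I(X_i \ge t),$$
where $I(\cdot)$ is the indicator function and $m = T, D$. Then, the Aalen-Nelson estimator $\hat{\Lambda}_m(t) = \sum_{i} \int_0^t dN_{mi}(u) / \sum_{j} Y_j(u)$ is a consistent estimator for $\Lambda_m(t)$, and a consistent estimator $\hat{\Omega}$ for $\Omega$ is the set of nonincreasing functions which are bounded above by $\hat{S}_U(t)$ and below by $\hat{S}_L(t)$, where $\hat{S}_L(t)$ and $\hat{S}_U(t)$ are obtained by replacing $\Lambda_m(t)$ in (2.2) with $\hat{\Lambda}_m(t)$.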
Note that in the presence of independent and dependent censoring, it seems quite natural to replace each dependently censored observation by a value which is beyond the largest observed or censored event time in the data set and construct the standard Kaplan-Meier (KM) estimator to estimate the upper bound SU(·) of the survival function. Unfortunately, this naive estimator may not be consistent for SU(·) and generally yields a larger estimate than ours, because the independent censoring assumption required by such a KM estimator is violated.
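As a rough illustration of the plug-in step, the following Python sketch computes the two bound estimates from the observed pairs. It assumes the standard competing-risks forms $S_L(t) = \exp\{-\Lambda_T(t) - \Lambda_D(t)\}$ and $S_U(t) = 1 - \int_0^t S_L(u-)\,d\Lambda_T(u)$, no tied observation times, and the coding 1 = failure, 2 = dependent censoring, 0 = independent censoring. The function name `peterson_bounds` and the left-limit convention are illustrative choices, not taken from the paper.

```python
from math import exp

def peterson_bounds(x, delta, times):
    """Plug-in estimates (S_L(t), S_U(t)) for each t in `times`.

    x     -- observed times X_i = min(T_i, D_i, C_i), assumed untied
    delta -- 1 = failure, 2 = dependent censoring, 0 = independent censoring
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    result = []
    for t in times:
        cum_T = cum_D = 0.0     # Aalen-Nelson cumulative hazard estimates
        cum_inc = 0.0           # estimate of pr(T <= t, T < D)
        at_risk = len(x)        # number of subjects with X_j >= current time
        for i in order:
            if x[i] > t:
                break
            s_left = exp(-cum_T - cum_D)   # lower bound just before x[i]
            if delta[i] == 1:
                cum_T += 1.0 / at_risk
                cum_inc += s_left / at_risk
            elif delta[i] == 2:
                cum_D += 1.0 / at_risk
            at_risk -= 1
        result.append((exp(-cum_T - cum_D), 1.0 - cum_inc))
    return result
```

For the toy input `peterson_bounds([1, 2, 3, 4], [1, 2, 0, 1], [2.5])`, the returned pair is roughly (0.558, 0.75).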
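Now, to obtain a $(1 - \alpha)$ confidence set $\tilde{\Omega}$ of $\Omega$, one needs the joint distribution of the process $\{\hat{S}_L(s) - S_L(s),\ \hat{S}_U(t) - S_U(t)\}$. To this end, note that
$$\hat{\Lambda}_m(t) - \Lambda_m(t) = \sum_{i=1}^n \int_0^t \frac{dM_{mi}(u)}{\sum_{j=1}^n I(X_j \ge u)},$$
where
$$M_{mi}(t) = I\{X_i \le t,\ \Delta_i = I(m = T) + 2 I(m = D)\} - \int_0^t I(X_i \ge u)\, d\Lambda_m(u).$$
Since $M_{Ti}(\cdot)$ and $M_{Di}(\cdot)$ are orthogonal martingales (Fleming and Harrington, 1991, p. 42), it follows that the processes $n^{1/2}\{\hat{\Lambda}_T(s) - \Lambda_T(s)\}$ and $n^{1/2}\{\hat{\Lambda}_D(t) - \Lambda_D(t)\}$ converge jointly to a two-dimensional Gaussian process, for $s, t \in \mathcal{L}$, as $n \to \infty$. To relax the constraint that the cumulative hazard function is nonnegative, one usually reparametrizes this function by considering its log-transformation. By the functional $\delta$-method, for large $n$, the distribution of the process, indexed by $(s, t)$,
$$n^{1/2}\big(\log \hat{\Lambda}_T(s) - \log \Lambda_T(s),\ \log \hat{\Lambda}_D(t) - \log \Lambda_D(t)\big)$$
can be well approximated by that of the process
$$\left( \frac{n^{1/2}}{\hat{\Lambda}_T(s)} \sum_{i=1}^n \int_0^s \frac{dM_{Ti}(u)}{\sum_{j} I(X_j \ge u)},\ \frac{n^{1/2}}{\hat{\Lambda}_D(t)} \sum_{i=1}^n \int_0^t \frac{dM_{Di}(u)}{\sum_{j} I(X_j \ge u)} \right). \tag{2.3}$$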
Generally, the distribution of a function of (2.3) may be rather difficult to obtain analytically. On the other hand, we may approximate the distribution of (2.3) utilizing a simple perturbation technique proposed by Lin et al. (1993). To this end, let $\{G_1, \ldots, G_n\}$ be a random sample from the standard normal distribution, which is independent of the data. Consider a process which is obtained by replacing $M_{mi}(t)$ in (2.3) by $G_i \times I\{X_i \le t,\ \Delta_i = I(m = T) + 2 \times I(m = D)\}$, $m = T, D$. Then, for large $n$, conditional on the data, the distribution of the resulting process
![]() | (2.4) |
gives a good approximation to the unconditional distribution of (2.3). Note that the only random quantities in (2.4) are Gi, i = 1, ..., n. Also note that the two components of (2.4) utilize the same Gi multiplier. Since T and D are assumed to be continuous random variables, which do not have events simultaneously, conditional on the data these two components of (2.4) are uncorrelated. To obtain an approximation to the distribution of (2.4), one may generate a large number, say, N, of independent random samples {Gi, i = 1, ..., n} to obtain N realizations of (2.4).
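The resampling step can be sketched in Python as follows. The sketch perturbs only the unnormalised hazard differences, that is, it evaluates $\sum_i G_i\, I(X_i \le t, \Delta_i = \text{cause}) / Y(X_i)$, with $Y(u)$ the number of subjects still at risk at $u$. The $n^{1/2}$ scaling and the division by the estimated cumulative hazard that the log-transformation requires are omitted, and the function names are hypothetical.

```python
import random

def perturbed_sum(x, delta, cause, t, g):
    """sum_i g_i * I(x_i <= t, delta_i == cause) / #{j : x_j >= x_i}."""
    total = 0.0
    for xi, di, gi in zip(x, delta, g):
        if xi <= t and di == cause:
            at_risk = sum(1 for xj in x if xj >= xi)
            total += gi / at_risk
    return total

def perturbation_draws(x, delta, s, t, n_rep=1000, seed=0):
    """n_rep realisations of the pair (failure part at s, dependent part at t)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_rep):
        # one standard normal multiplier per subject, shared by both components
        g = [rng.gauss(0.0, 1.0) for _ in x]
        draws.append((perturbed_sum(x, delta, 1, s, g),
                      perturbed_sum(x, delta, 2, t, g)))
    return draws
```

Only the multipliers change between replications; the data, and hence the risk-set sizes, stay fixed, which is what makes the approximation conditional on the data.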
For convenience, define two random processes
such that
![]() |
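It follows that the distribution of the process $\{\hat{S}_L(s) - S_L(s),\ \hat{S}_U(t) - S_U(t)\}$ is asymptotically Gaussian and it can be approximated by the conditional distribution of the process $\{S_L^*(s) - \hat{S}_L(s),\ S_U^*(t) - \hat{S}_U(t)\}$, where $S_L^*$ and $S_U^*$ are obtained by replacing $\Lambda_m(t)$ in (2.2) with $\Lambda_m^*(t)$, $m = T, D$. Let $\hat{\sigma}_L(s)$ and $\hat{\sigma}_U(t)$ be the estimated standard errors for $\hat{S}_L(s)$ and $\hat{S}_U(t)$, respectively. These two standard errors can be obtained via the sample variances based on the above $N$ realizations of $(S_L^*, S_U^*)$. A $(1 - \alpha)$ confidence set $\tilde{\Omega}$ for $\Omega$ is the collection of nonincreasing functions $S(\cdot)$ which satisfy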
![]() | (2.5) |
where $t \in \mathcal{L}$ and $c$ is chosen to satisfy
![]() | (2.6) |
Note that the probability measure (2.6) is generated by {Gi, i = 1, ..., n}, but conditional on the data.
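One way to compute the standard errors and the cutoff from the N realizations is sketched below. Because the displays (2.5) and (2.6) are not reproduced here, the sketch assumes a two-sided sup-type statistic: the largest standardised absolute deviation of either bound over a time grid, with c taken as its empirical (1 − α) quantile. The input layout and the name `band_cutoff` are illustrative.

```python
from math import ceil
from statistics import stdev

def band_cutoff(lower_dev, upper_dev, alpha=0.05):
    """lower_dev[r][k], upper_dev[r][k]: deviation of the r-th realisation from
    the estimated lower/upper bound at the k-th grid time.
    Returns (c, se_lower, se_upper); assumes nonzero spread at every grid time."""
    n_grid = len(lower_dev[0])
    # standard errors from the sample variance across realisations
    se_l = [stdev([row[k] for row in lower_dev]) for k in range(n_grid)]
    se_u = [stdev([row[k] for row in upper_dev]) for k in range(n_grid)]
    sups = []
    for lo, up in zip(lower_dev, upper_dev):
        sups.append(max(max(abs(lo[k]) / se_l[k], abs(up[k]) / se_u[k])
                        for k in range(n_grid)))
    sups.sort()
    # empirical (1 - alpha) quantile of the sup statistic
    idx = min(len(sups) - 1, max(0, ceil((1.0 - alpha) * len(sups)) - 1))
    return sups[idx], se_l, se_u
```

Taking the supremum over the whole grid before extracting the quantile is what makes the resulting band simultaneous rather than pointwise, and explains why c exceeds the usual 1.96.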
Now, we use the data from the ACTG 175 study to illustrate the above inference procedures for the survival function. Although there were four treatment groups in the study, for illustration, we only compare the AZT (zidovudine) monotherapy with the other three treatments combined. Six hundred and nineteen out of 2467 patients were randomly assigned to the AZT monotherapy. There were 423 and 1479 censored failure times in the AZT and combined groups, respectively. In Table 1, we list the reasons for censoring. Here, for illustration, we let $D$ be the dependent censoring time when the study patient was off treatment without reaching the primary clinical event due to toxicity or the request of the investigator or patient. There were 157 and 506 such dependently censored events for the AZT and combined groups, respectively. For each group, $\tau_1$ and $\tau_2$ are chosen to be approximately equal to the lower and upper fifth percentiles of the observed failure times, respectively. For the AZT group, $\tau_1 = 140$ (days) and $\tau_2 = 950$ (days), and for the combined group, $\tau_1 = 170$ and $\tau_2 = 995$. The standard error estimates $\hat{\sigma}_L(t)$ and $\hat{\sigma}_U(t)$ and the cutoff point $c$ are obtained with $N = 1000$ realizations of $S_L^*$ and $S_U^*$. In Figures 1(a) and (b), the collection of nonincreasing functions, whose upper and lower bounds are denoted by the solid lines, is the point estimate $\hat{\Omega}$, and the region bounded by the dotted lines is the 0.95 confidence set $\tilde{\Omega}$. These figures are quite informative. For example, on Day 700, on average, the survival probability is between 0.82 and 0.87 with a confidence band of (0.79, 0.89) for the combined treatment group. For the monotherapy group, the survival probability is approximately between 0.71 and 0.76 with a confidence band of (0.65, 0.81).
Note that $\mathcal{L} = [\tau_1, \tau_2]$ may be chosen based on clinical interest. For the present data set, we find that the choice of this interval does not seem critical with respect to the cutoff value $c$ in (2.6). For example, if we let $\tau_1$ and $\tau_2$ be the lower and upper 1st, 5th, 10th, and 20th percentiles of the observed failure times for the AZT group, with $N = 1000$, the corresponding cutoff points, $c$, are 2.9, 2.7, 2.7, and 2.6, respectively.
Suppose that we are interested in making inferences about the quantile process of the survival function, for example, the median or upper and lower quartiles of T. To this end, let tp be the pth quantile of the survival function S(·), that is, 1 − S(tp) = p. Here, the parameter
is a function tp of p
= [p1, p2], a predetermined interval such that
Let tlp and tup be the pth quantiles for SL(·) and SU(·), respectively. Then, the set
of all possible values of
consists of nondecreasing functions tp, p
, which are bounded below by the function tlp and above by tup. A consistent estimator
can be obtained easily via estimators
and
for tlp and tup by solving the equations
and
One may use the aforementioned perturbation technique to obtain a (1
) confidence set
for
. Since the processes
and
are tight, it follows that the asymptotic distribution of the process
indexed by (p, r), is the same as the conditional distribution of
Conditional on the data, let
and
be the random variables such that
![]() |
Then, using the results from Goldwasser et al. (2004)
, for large n, the distribution of the process
indexed by (p, r), can be approximated well by that of the process
where p, r
. Let
lp and
up be the estimated standard errors of log
and log
respectively. Then,
consists of all nondecreasing functions tp, such that,
![]() | (2.7) |
p
. Here, c is chosen to satisfy
![]() | (2.8) |
Again, we use the data from ACTG 175 to illustrate the above procedure. In Figures 1(c) and (d), we present the point estimates $\hat{\Omega}$ and 0.95 interval estimates $\tilde{\Omega}$ for the corresponding $p$th quantiles based on (2.7) and (2.8), with $p \in \mathcal{M} = [0.04, 0.32]$ for the AZT group and $\mathcal{M} = [0.03, 0.21]$ for the combined group. Here, $\hat{\Omega}$ is the region bounded by the solid lines and $\tilde{\Omega}$ is bounded by the dotted lines. For example, with $p = 0.15$, on average, the 15th percentile is between 607 and 786 with a confidence band of (493, 967) for the combined group. On the other hand, on average, the 15th percentile for the AZT group is between 395 and 476 with a band of (268, 696).
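The quantile bounds can be read off the estimated survival bounds by inverting them on a time grid. The helper below is a minimal sketch that returns the smallest grid time at which one minus the survival estimate reaches p. Its name and the grid-based inversion are assumptions made for illustration, and the numbers in the usage lines are made up.

```python
def quantile_from_survival(times, surv, p):
    """Smallest grid time t with 1 - surv(t) >= p, or None if never reached.

    times -- increasing grid of time points
    surv  -- nonincreasing survival estimates on that grid
    """
    for t, s in zip(times, surv):
        if 1.0 - s >= p:
            return t
    return None

# Hypothetical bound estimates on a common grid.
grid = [1, 2, 3, 4]
s_lower = [0.9, 0.7, 0.5, 0.3]
s_upper = [0.95, 0.85, 0.75, 0.6]
t_low = quantile_from_survival(grid, s_lower, 0.25)   # from the lower survival bound
t_up = quantile_from_survival(grid, s_upper, 0.25)    # from the upper survival bound
```

Because the lower survival bound drops faster, it yields the smaller quantile, so the pair `(t_low, t_up)` brackets the attainable values of the pth quantile.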
In this section, we present nonparametric and semiparametric inference procedures for various parameters which quantify the relative merit between two independent groups of failure times in the presence of dependent censoring. To this end, all the aforementioned theoretical and empirical quantities in Section 2.1 are subindexed by their group membership $k$, $k = 1, 2$. For example, the data now consist of $\{(X_{ki}, \Delta_{ki}), i = 1, \ldots, n_k; k = 1, 2\}$.
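Suppose that we are interested in $\Theta = \{S_2(t) - S_1(t), t \in \mathcal{L}\}$, the difference of two underlying survival functions, where $\mathcal{L}$ is a predetermined interval $[\tau_1, \tau_2]$ such that $\mathrm{pr}(X_{k1} \le \tau_1,\ T_{k1} < D_{k1} \wedge C_{k1},\ k = 1, 2) > 0$ and $\mathrm{pr}(X_{k1} \ge \tau_2,\ k = 1, 2) > 0$. Note that $\Omega$ consists of functions $S_2(\cdot) - S_1(\cdot)$, which satisfy
$$S_{2L}(t) - S_{1U}(t) \le S_2(t) - S_1(t) \le S_{2U}(t) - S_{1L}(t), \tag{2.9}$$
$t \in \mathcal{L}$. A consistent estimator $\hat{\Omega}$ of $\Omega$ can be obtained by replacing $S_{kL}(t)$ and $S_{kU}(t)$ in the lower and upper bounds of (2.9) with their empirical counterparts, $k = 1, 2$. A $(1 - \alpha)$ confidence set $\tilde{\Omega}$ is the collection of functions of $t$, which belong to the intervals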
![]() |
where $\hat{\sigma}_{kL}(t)$ and $\hat{\sigma}_{kU}(t)$ are the estimated standard errors of $S^*_{kL}(t)$ and $S^*_{kU}(t)$, $k = 1, 2$, and $c$ is chosen such that
![]() |
Again, an approximation to the above cutoff point c can be obtained via the perturbation technique discussed in Section 2.1.
We use the data from ACTG 175 to illustrate the above proposal. To this end, we let $S_2(t)$ and $S_1(t)$ be the survival functions for the combined and AZT groups, respectively. First, we assume that the dependent censoring is due to toxicity or the request from the patient or investigator. In Figure 2(a), we present a point estimate $\hat{\Omega}$ and a 0.95 interval estimate $\tilde{\Omega}$ for $\Omega$ with $\mathcal{L} = [170, 950]$. With $N = 1000$ sets of realizations from $\{S^*_{kL}, S^*_{kU}, k = 1, 2\}$, $\hat{\Omega}$ is composed of functions bounded by the solid lines, and $\tilde{\Omega}$ is the set of functions bounded by the dotted lines. For example, on Day 700, the estimated set of all possible values of the difference between the two survival probabilities is (0.03, 0.19) with a 0.95 confidence band of (−0.03, 0.24). In Figure 2(b), we present a similar plot, but assume that the dependent censoring event is only due to the request from the patient or investigator. There are 90 and 290 such events in the AZT and combined groups, respectively. For this case, on Day 700, the point estimate for the difference between the two groups is (0.06, 0.16) with a confidence band of (0.004, 0.22). Lastly, we assume that all the censoring variables are independent of $T$, and in Figure 2(c) we provide the KM estimate, denoted by the solid line, and a 0.95 confidence set whose boundaries are the dotted lines. For this case, on Day 700, the point estimate for the difference between the two groups is 0.09 with a confidence band of (0.04, 0.13). The plots in Figure 2 provide valuable information regarding sensitivity to the censoring assumptions.
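The pointwise bounds on the difference follow from combining the group-specific bounds: the lower limit pairs the lower bound of group 2 with the upper bound of group 1, and the upper limit does the reverse. A small Python sketch with made-up inputs (the function name is hypothetical):

```python
def difference_bounds(s1_lower, s1_upper, s2_lower, s2_upper):
    """Bounds on S2(t) - S1(t) at each grid time, given bounds for each group."""
    return [(l2 - u1, u2 - l1)
            for l1, u1, l2, u2 in zip(s1_lower, s1_upper, s2_lower, s2_upper)]

# Hypothetical bound estimates at two time points.
bounds = difference_bounds([0.70, 0.60], [0.80, 0.75],
                           [0.85, 0.80], [0.90, 0.85])
```

Rerunning such a computation under different classifications of the censoring causes gives the kind of side-by-side comparison shown in the three panels of Figure 2.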
under the proportional hazards model. Suppose that there exists an unknown constant
such that
![]() | (2.10) |
a two-sample proportional hazards model. We are interested in making inferences about
. Note that for t
,
![]() |
Let
L = sup t
L(t) and
U = inf t 

U(t). It is not difficult to show that any member of the interval
= [
L,
U] is an attainable value for
in Model (2.10). Let
and
be the estimators obtained by replacing S(t) with
in
L(t) and
U(t), respectively. Similarly,
and
are obtained with S(t) replaced by S*(t). A consistent point estimator for
is
where
and
To derive a (1
) confidence set
unfortunately, it is rather difficult, if not impossible, to obtain the joint distribution of
and
analytically or numerically. Now, consider the following class of interval estimates for
, indexed by time t
,
![]() | (2.11) |
where
L(t) and
U(t) are the estimated standard errors for
and
Note that for any predetermined t
, an interval (2.11) with c
1.96 is a valid 0.95 confidence set for
. However, such an interval for
can be quite large. To obtain a robust interval estimate, first, we let the cutoff point c in (2.11) be chosen such that
![]() | (2.12) |
With this relatively larger threshold value than 1.96, the set of intervals (2.11) is a (1
) simultaneous confidence band for
across t
. Thus, the narrowest interval from this band is a valid (1
) confidence set for
. For example, a possible choice for
is the interval
![]() | (2.13) |
Now, we use the data set from ACTG 175 to illustrate the procedure (2.13). First, let us assume that the dependent censoring event is due to toxicity or the request from the patient or investigator. For this case, with $\mathcal{L} = [170, 950]$, the cutoff point $c$ based on (2.12) is 2.48. In Figure 3, we present a 0.95 simultaneous confidence band (2.11) with $c = 2.48$. The upper boundary of this band attains its minimum at $t = 844$, and the lower boundary attains its maximum at $t = 812$. It follows from (2.13) that a 0.95 confidence interval for the parameter in Model (2.10) is (−1.21, −0.03), indicating that even without assuming a parametric model between the failure and dependent censoring times, patients in the combined group were doing better than those in the AZT group. It is interesting to note that for any predetermined $t \in [600, 800]$, the corresponding 0.95 confidence interval is almost identical to our interval (−1.21, −0.03). On the other hand, if one chooses $t < 500$, the resulting interval is quite wide. For example, when $t = 200$, the pointwise interval is (−2.21, 0.50), which is much wider than ours. Lastly, if one assumes that all censorings are independent of $T$, the maximum partial likelihood estimate for the parameter is −0.60 and the corresponding 0.95 interval is (−0.78, −0.43).
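The final step, taking the narrowest interval compatible with every member of a simultaneous band, amounts to intersecting the intervals across time. A sketch, assuming the band has already been evaluated on a grid of time points (function name and numbers are illustrative):

```python
def intersect_band(lower, upper):
    """Intersection of the intervals [lower[k], upper[k]] over all grid points.

    Returns (max of lower limits, min of upper limits), or None if empty."""
    lo, hi = max(lower), min(upper)
    return (lo, hi) if lo <= hi else None

# Hypothetical band values at three time points.
interval = intersect_band([-2.0, -1.2, -1.5], [0.5, 0.1, -0.1])
```

The two limits may come from different time points, as in the ACTG 175 example where the limiting values occur at t = 812 and t = 844.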
Now, suppose that there exists an unknown
such that S2(e
t) = S1(t), t > 0, the so-called two-sample accelerated failure time model (Kalbfleisch and Prentice, 2002). Note that
![]() |
for p
= [p1, p2], where
and
are the lower and upper boundaries of
. Let
L = sup p

L(p) and
U = inf p

U(p). Then,
= [
L,
U]. Let
and
Also, let
and
= log
The point estimate
where
and
are the empirical counterparts of
L and
U, respectively. Moreover, it follows from a similar argument in Section 2.1 that the distribution of
can be approximated well by that of
where p, r
. Similar to the case of the proportional hazards model discussed above, a (1
) confidence interval
of
is
![]() | (2.14) |
where the cutoff point c is chosen such that
![]() |
For the ACTG 175 study, we considered the case that the censoring was due to the toxicity or the request from the patient or investigator. With M = [0.04, 0.21], a 0.95 confidence interval (2.14) for
is (0.20, 1.07).
| 3. INFERENCES WITH INTERVAL-CENSORED DATA |
|---|
Suppose that for each Ti, one cannot observe Ti directly, but only observe an interval (ELi, EUi) which contains Ti, i = 1,..., n. When ELi and EUi are independent of Ti, nonparametric estimation procedures for S(t) were proposed, for example, by Peto (1973)
Unlike the case of dependent right censorship, for interval-censored data, even if one can identify which interval censorings are informative and which are not, it is not clear how to utilize this valuable information to obtain sharp theoretical bounds such as the Peterson bounds for $S(t)$. Here, we propose inference procedures which are valid even when all interval censorings are informative. To this end, let $S_L(t)$ and $S_U(t)$ be the survival functions of $E_{Li}$ and $E_{Ui}$, respectively. The parameter $\Theta$ is $\{S(t), t \in \mathcal{L}\}$, where $\mathcal{L}$ is the predetermined interval $[\tau_1, \tau_2]$ such that $\mathrm{pr}(E_{Ui} \le \tau_1) > 0$ and $\mathrm{pr}(E_{Li} \ge \tau_2) > 0$. The set $\Omega$ consists of nonincreasing functions $S(t)$ such that $S_L(t) \le S(t) \le S_U(t)$, $t \in \mathcal{L}$. The bounds $S_L(t)$ and $S_U(t)$ can be estimated well by the empirical survival functions $\hat{S}_L(t) = n^{-1} \sum_{i=1}^n I(E_{Li} > t)$ and $\hat{S}_U(t) = n^{-1} \sum_{i=1}^n I(E_{Ui} > t)$, and a consistent estimator $\hat{\Omega}$ for $\Omega$ can be obtained accordingly.
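With interval-censored data, the two bounds are survival functions of the interval endpoints, so a natural estimate is the empirical survival function of each endpoint. A minimal sketch, assuming a subject with no observed upper endpoint is coded with an infinite value (this coding and the function name are assumptions):

```python
def empirical_survival(values, t):
    """Proportion of `values` strictly greater than t."""
    return sum(1 for v in values if v > t) / len(values)

# Hypothetical intervals (E_L, E_U); float("inf") marks no observed upper endpoint.
left = [1.0, 2.0, 3.0, 5.0]
right = [2.0, 4.0, 6.0, float("inf")]
s_low = empirical_survival(left, 2.5)    # bound from the left endpoints
s_up = empirical_survival(right, 2.5)    # bound from the right endpoints
```

Since each failure time lies between its two endpoints, the estimate from the left endpoints can never exceed the one from the right endpoints.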
To obtain a (1
) confidence set
of
, note that for large n, the distribution of the process
![]() |
can be approximated well by the conditional distribution of
![]() | (3.1) |
where
Now, let SL*(s) and SU*(t) be the random processes such that
![]() |
A (1
) confidence set of
is exactly like (2.5), where
L(t) and
U(t) are the estimated standard errors for
and
via (3.1), and the cutoff point c is obtained via (2.6) with the current
and
Now, for comparing two independent groups of failure times {Tki, i = 1,..., nk;k = 1, 2} with the interval-censored data {(EkLi, EkUi)}, let us assume that the two failure times follow a proportional hazards model with parameter e
. Using the arguments via (2.11)-(2.13) with the current
and
one can obtain a (1
) confidence interval (2.13) for
.
We use the so-called five-center cohort data set from a well-known, multicenter study on the HIV-1 infection incidence among hemophilia patients to illustrate the above interval estimation procedures (Kroner et al., 1994; Betensky et al., 2002). During the 1980s, persons with hemophilia had a high risk of infection with HIV due to their need for infusion of factor VIII or factor IX concentrate, products manufactured from donors' plasma. For this five-center cohort, patients were enrolled without regard to their HIV antibody status. For each patient, repeated serum samples were taken between early 1978 and early 1987, and HIV seroconverters were individuals with both a last negative and a first positive serum sample. Thus, each infected subject was assigned a window of time in which he/she seroconverted. It is not clear from the literature whether the sampling times for the patient were independent of the underlying $T$. In Figure 4, the solid lines are the upper and lower boundaries of the point estimate $\hat{\Omega}$, and a 0.95 interval estimate $\tilde{\Omega}$ for the collection of $S(\cdot)$ is the region bounded by the dashed lines. Here, we let $\tau_1 = 1000$ (days) and $\tau_2 = 5000$ (days). Note that the dotted line in the center is the estimated $S(\cdot)$ under the assumption of independent interval censoring.
One of the goals of the study was to examine whether the patient's average annual dose of nonheat-treated factor VIII concentrate used from 1978 (or birth) to 1984 was related to the time of seroconversion. For all the analyses done for this study in the literature, the dose level was classified as high (>20 000 U), low (1-20 000 U), or none. Let us assume that the failure time $T_{2i}$ for the high-dose group and $T_{1i}$ for the group not using factor VIII concentrate have a proportional hazards structure with an unknown proportionality parameter. First, we obtain the two bounds corresponding to (2.11) for $2500 \le t \le 4500$ and $c = 2.67$. Then, it follows from (2.13) that a 0.95 confidence interval for this parameter is (2.1, 3.6), indicating that the high-dose group of patients tended to have a much higher HIV incidence rate than the group of patients who did not use this particular concentrate.
Under the assumption of independent interval censoring, the estimated log hazard ratio via the nonparametric maximum likelihood estimation for the proportional hazards model (Huang and Wellner, 1997) is 3.1 with a 0.95 confidence interval of (2.7, 3.5), which is not markedly different from ours.
| 4. REMARKS |
|---|
In this article, the approach we took for handling the dependent censoring case is quite different from those in the literature. For most clinical studies with an event time as the end point, the censoring variable is a mixture of dependent and independent censoring times. Almost all existing methods assume parametric or semiparametric dependence structures between the failure and the mixed censoring times. As one of the referees kindly pointed out, in practice it seems rather difficult to quantify such a dependence relationship in order to implement the resulting inference procedures. Our proposal does not need a parametric model to be specified, but does need information about the causes of censoring.
Under the current setting, a sensitivity analysis consists of various subanalyses corresponding to different classifications of censoring as either dependent or independent. For example, in Figure 2, we present results from three distinct censoring classifications for ACTG 175. In Figure 2(a), inferences about the difference between the two survival functions were made assuming that dependent censoring was due to toxicity or due to the patient's or primary physician's request for withdrawal. The most common reason of such a request was that the patient did not respond favorably to the assigned treatment with respect to certain efficacy-related markers. This type of drop out was likely related to the time to the event of interest. The relationship between the toxicity occurrence and the event time is not that clear. One could argue that the treatment might be too potent and caused toxicity. In that case a patient, who developed toxicity, might have much lower HIV-RNA values (viral-load) during the study and, consequently, a longer expected time to event than other patients. On the other hand, toxicity could be an indicator of general poor health, and therefore of shorter expected times to event. Of course it is also possible that toxicity is not at all correlated with the primary end point. In Figure 2(b), we present the results obtained under the assumption that toxicity leads to independent censoring. The confidence band for the difference of two survival functions is slightly tighter in this case compared to that of dependent censoring. This suggests that even if we misclassified toxicity as a cause of independent or dependent censoring, there was no significant impact on the conclusion of the treatment difference. Lastly, in Figure 2(c) we present the results obtained under the assumption that all censoring is independent of the underlying event times. This routine analysis likely exaggerates the treatment difference.
As suggested by an associate editor and the editors, a sensitivity analysis under our setting should consist of three components: (a) providing rationales for classifying sources of censoring as either dependent or independent, (b) specifying any uncertainty involved in this classification, and (c) clearly describing how to perform sensitivity analyses. Furthermore, we strongly encourage investigators of clinical studies to carefully document each patient's reasons for going off treatment or off study so that rational and informative sensitivity analyses can be performed at the interim looks and also at the end of the study.
We have compared our proposal with a typical parametric method in the literature. Specifically, we applied the novel procedure for the one-sample problem studied by Slud and Rubinstein (1983) to analyze the data set from ACTG 175. Slud and Rubinstein introduced a function $\rho(t)$, which reflects the relationship between $T$ and the censoring variable $C^*$, where
$$\rho(t) = \lim_{\delta \downarrow 0} \frac{\mathrm{pr}(t < T < t + \delta \mid T > t,\ C^* \le t)}{\mathrm{pr}(t < T < t + \delta \mid T > t,\ C^* > t)},$$
and under the present setting $C^* = \min(C, D)$. For each $\rho(t)$, the survival function of $T$ is identifiable and can be estimated in the presence of the dependent censoring variable $C^*$. They suggested specifying two functions $\rho_1(t)$ and $\rho_2(t)$ to obtain two bounds of the nonidentifiable survival function. Note that when $\rho_1(t) = 0$ and $\rho_2(t) = \infty$, the resulting survival functions correspond to the lower and upper Peterson bounds with the dependent censoring variable $C^*$. In comparing with our method, we assumed that the dependent censoring was due to toxicity or the request from the patient or investigator. In Figure 5, for various $\rho_1$ and $\rho_2$, we present the estimated Slud-Rubinstein bounds (dashed lines) and our point estimate $\hat{\Omega}$ (solid lines). Their bounds are markedly narrower than ours in Figure 5(a), but much wider in Figure 5(d). Since Slud and Rubinstein used $\rho(t)$ to model the dependence between the failure time and a mixture of the dependent and independent censoring times, it is not clear which $\rho(t)$'s would result in our estimated upper and lower bounds of all possible values of the underlying survival function. A generalization of this type of parametric method is to model the relationship, say, via $\rho(t)$, only between the dependent censoring and failure times and derive inference procedures in the presence of an extra independent censoring variable.
Extending our proposals to the general regression problems seems quite challenging due to the difficulty of identifying possible values of the regression parameters with dependent censorship under a nonparametric setting.
| ACKNOWLEDGMENTS |
|---|
The authors are very grateful to two referees, an associate editor, and the editors for insightful comments on the paper. This research is partially supported by the grants from US National Institutes of Health.
| REFERENCES |
|---|
AALEN, O. (1978). Nonparametric estimation of partial transition probabilities in multiple decrement models. Annals of Statistics 6, 534545.
BACCHETTI, P. (1990). Estimating the incubation period of AIDS by comparing population infection and diagnosis patterns. Journal of the American Statistical Association 85, 10021008.[CrossRef]
BETENSKY, R. A., LINDSEY, J. C., RYAN, L. M. AND WAND, M. P. (2002). A local likelihood proportional hazards model for interval censored data. Statistics in Medicine 21, 263275.[CrossRef][Web of Science][Medline]
BETENSKY, R. A., RABINOWITZ, D. AND TSIATIS, A. A. (2001). Computationally simple accelerated failure time regression for interval censored data. Biometrika 88, 703711.
CAI, T. AND BETENSKY, R. A. (2003). Hazard regression for interval censored data with penalized spline. Biometrics 59, 570579.[CrossRef][Web of Science][Medline]
DIRIENZO, A. G. (2003). Nonparametric comparison of two survival-time distributions in the presence of dependent censoring. Biometrics 59, 497504.[Medline]
DIRIENZO, A. G. AND LAGAKOS, S. W. (2001). Bias correction for score tests arising from misspecified proportional hazards regression models. Biometrika 88, 421434.
FISHER, L. AND KANAREK, P. (1974). Presenting censored survival data when censoring and survival times may not be independent. In Proschan, F. and Serfling, R. (eds), Reliability and Biometry: Statistical Analysis of Lifelength. Philadelphia, PA: SIAM, pp. 303326.
FLEMING, T. R. AND HARRINGTON, D. P. (1991). Counting Processes and Survival Analysis. New York: Wiley.
GENTLEMAN, R. AND GEYER, C. J. (1994). Maximum likelihood for interval censored data: consistency and computation. Biometrika 81, 618-623.
GOLDWASSER, M. A., TIAN, L. AND WEI, L. J. (2004). Statistical inference for infinite dimensional parameters via asymptotically pivotal estimating functions. Biometrika 91, 81-94.
HAMMER, S. M., KATZENSTEIN, D. A., HUGHES, M. D., GUNDACKER, H., SCHOOLEY, R. T., HAUBRICH, R. H., HENRY, W. K., LEDERMAN, M. M., PHAIR, J. P., NIU, M. et al. (1996). A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine 335, 1081-1090.
HUANG, J. (1996). Efficient estimation for the proportional hazards model with interval censoring. Annals of Statistics 24, 540-568.
HUANG, J. (1999). Asymptotic properties of nonparametric estimation based on partly interval-censored data. Statistica Sinica 9, 501-519.
HUANG, J. AND WELLNER, J. A. (1997). Interval censored survival data: a review of recent progress. Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis. New York: Springer.
JOLY, P., COMMENGES, D. AND LETENNEUR, L. (1998). A penalized likelihood approach for arbitrarily censored and truncated data: application to age-specific incidence of dementia. Biometrics 54, 185-194.
KALBFLEISCH, J. D. AND PRENTICE, R. L. (2002). The Statistical Analysis of Failure Time Data, 2nd edition. New York: Wiley.
KLEIN, J. P. AND MOESCHBERGER, M. L. (1988). Bounds on net survival probabilities for dependent competing risks. Biometrics 44, 529-538.
KLEIN, J. P., MOESCHBERGER, M. L., LI, Y. H. AND WANG, S. T. (1992). Estimating random effects in the Framingham heart study (with discussion). In Klein, J. and Goel, P. (eds), Survival Analysis: State of the Art. Dordrecht, The Netherlands: Kluwer, pp. 99-120.
KOOPERBERG, C. AND CLARKSON, D. B. (1997). Hazard regression with interval-censored data. Biometrics 53, 1485-1494.
KRONER, B. L., ROSENBERG, P. S., ALEDORT, L. M., ALVORD, W. G. AND GOEDERT, J. J. (1994). HIV-1 infection incidence among persons with hemophilia in the United States and Western Europe, 1978-1990. Journal of Acquired Immune Deficiency Syndromes 7, 279-286.
LIN, D. Y., ROBINS, J. M. AND WEI, L. J. (1996). Comparing two failure time distributions in the presence of dependent competing risks. Biometrika 83, 381-393.
LIN, D. Y., WEI, L. J. AND YING, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80, 557-572.
MOESCHBERGER, M. L. AND KLEIN, J. P. (1995). Statistical methods for dependent competing risks. Lifetime Data Analysis 1, 195-204.
PETERSON, A. V. (1976). Bounds for a joint distribution function with fixed sub-distribution functions: application to competing risks. Proceedings of the National Academy of Sciences of the United States of America 73, 11-13.
PETO, R. (1973). Experimental survival curves for interval-censored data. Applied Statistics 22, 86-91.
RABINOWITZ, D., TSIATIS, A. A. AND ARAGON, J. (1995). Regression with interval-censored data. Biometrika 82, 501-513.
ROBINS, J. M. (1993). Information recovery and bias adjustment in proportional hazards regression analysis of randomized trials using surrogate markers. In Proceedings of the Biopharmaceutical Section, American Statistical Association, pp. 24-33.
ROBINS, J. M. AND FINKELSTEIN, D. H. (2000). Correcting for non-compliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank test. Biometrics 56, 779-788.
ROBINS, J. M. AND ROTNITZKY, A. (1992). Recovery of information and adjustment for dependent censoring using surrogate markers. In Jewell, N. and Dietz, K. (eds), AIDS Epidemiology: Methodological Issues. Boston, MA: Birkhauser, pp. 297-331.
ROSENBERG, P. S. (1995). Hazard function estimation using B-splines. Biometrics 51, 874-887.
SATTEN, G. A., DATTA, S. AND ROBINS, J. M. (2001). An estimator for the survival function when data are subject to dependent censoring. Statistics and Probability Letters 54, 397-403.
SCHARFSTEIN, D. O. AND ROBINS, J. M. (2002). Estimation of the failure time distribution in the presence of informative censoring. Biometrika 89, 617-634.
SLUD, E. V. AND RUBINSTEIN, L. V. (1983). Dependent competing risks and summary survival curves. Biometrika 70, 643-649.
TSIATIS, A. A. (1975). A nonidentifiability aspect of the problem of competing risk. Proceedings of the National Academy of Sciences of the United States of America 72, 20-22.
TURNBULL, B. W. (1976). The empirical distribution function with arbitrary grouped, censored and truncated data. Journal of the Royal Statistical Society Series B: Statistical Methodology 38, 290-295.
ZHENG, M. AND KLEIN, J. P. (1995). Estimates of marginal survival for dependent competing risks based on an assumed copula. Biometrika 82, 127-138.
Received April 13, 2004; revised August 8, 2005; revised October 24, 2005; accepted for publication October 26, 2005.
This article has been cited by other articles:
J. Zhang and D. F. Heitjan. Impact of nonignorable coarsening on Bayesian inference. Biostat., October 1, 2007; 8(4): 722-743.
S. W. Lagakos. Time-to-Event Analyses for Long-Term Treatments: The APPROVe Trial. N. Engl. J. Med., July 13, 2006; 355(2): 113-117.















dotted lines are boundaries for 




















dotted line is the estimated S(·) under the assumption of independent interval censoring).


