Biostatistics Advance Access originally published online on April 5, 2006
Biostatistics 2006 7(4):599-614; doi:10.1093/biostatistics/kxj028
Parametric survival models for interval-censored data with time-dependent covariates
The Biostatistics Center, Department of Biostatistics and Epidemiology, School of Public Health and Health Services, The George Washington University, 6110 Executive Boulevard, Suite 750, Rockville, MD 20852, USA jml{at}biostat.bsc.gwu.edu
Merck and Company, Blue Bell, PA 19422, USA
* To whom correspondence should be addressed.
| SUMMARY |
|---|
We present a parametric family of regression models for interval-censored event-time (survival) data that accommodates both fixed (e.g. baseline) and time-dependent covariates. The model employs a three-parameter family of survival distributions that includes the Weibull, negative binomial, and log-logistic distributions as special cases, and can be applied to data with left-, right-, interval-, or non-censored event times. Standard methods, such as Newton-Raphson, can be employed to estimate the model, and the resulting estimates have an asymptotically normal distribution about the true values with a covariance matrix that is consistently estimated by the information function. The deviance function is described to assess model fit, and a robust sandwich estimate of the covariance may also be employed to provide asymptotically robust inferences when the model assumptions do not apply. Spline functions may also be employed to allow for non-linear covariate effects. The model is applied to data from a long-term study of type 1 diabetes to describe the effects of longitudinal measures of glycemia (HbA1c) over time (the time-dependent covariate) on the risk of progression of diabetic retinopathy (eye disease), an interval-censored event-time outcome.
Keywords: Interval-censored data; Parametric models; Time-dependent covariate
| 1. INTRODUCTION |
|---|
In medical and biological research, the analysis of event-time or survival data aims to describe the risk (hazard) function of event times in a population, the associated survival or cumulative incidence functions, and the effects of covariates on risk. When event times are not observed exactly, these times are censored. The event time is "right censored" when follow-up is curtailed without observing the event. "Left censoring" arises when the event occurs at some unknown time prior to an individual's inclusion in a cohort. The event time is considered "interval censored" when an event occurs within some interval of time but the exact time of the event is unknown (cf. Kalbfleisch and Prentice, 2002).
A variety of models have been developed for interval-censored data. Finkelstein and Wolfe (1985) present a semiparametric model, which is based on factoring the joint likelihood function for a random interval, and they consider a set of distinct endpoints that comprise the interval as nuisance parameters. Finkelstein (1986) also develops a method for fitting the proportional hazards model for interval-censored data where the baseline survival function quantities at the distinct endpoints are considered nuisance parameters. Seaman and Bird (2001) also extend the interval-censored proportional hazards model to accommodate time-dependent covariates. The baseline hazard function is defined as piecewise constant between a specified finite set of time points, and the function is estimated over each interval using the EM algorithm. Betensky and others (2002) describe a proportional hazards model for interval-censored data using local likelihood estimation, and they allow an arbitrarily smooth covariate function to describe the covariate vector's effect on the hazard function.
Goetghebeur and Ryan (2000) develop a semiparametric regression model for interval-censored data that employs the EM algorithm. Here, the E-step requires estimating the risk set sizes and number of events that occurred at each set of possible event times and the M-step estimates the regression coefficients. Rabinowitz and others (2000) use conditional logistic regression to fit proportional odds models to interval-censored data, and they assume that the conditional distribution of the interval endpoints given the covariates follows a semiparametric proportional odds regression model.
Parametric models have also been proposed. Odell and others (1992) describe a Weibull regression model for interval-censored data with fixed (e.g. baseline) covariates. Rabinowitz and others (1995) extend the accelerated failure time model to the interval-censored case. They present a class of score statistics for estimating the regression coefficients without specifying the distribution function of the residuals or the joint distribution of the covariates and the interval times. Moreover, Younes and Lachin (1997) present a link-based model that can be applied to the interval-censored case. The model employs a link function to describe the manner by which the covariates act upon the survival times, and it uses B-splines to approximate the background hazard function. Kooperberg and Clarkson (1997) apply the hazard regression methodology of Kooperberg and others (1995) to interval-censored data and time-dependent covariates, and they estimate the logarithm of the conditional hazard function using splines and tensor products.
All these methods for interval-censored data that allow for time-dependent covariates are either computationally intensive or of high dimension due to the many nuisance parameters. As an alternative, we present a family of parametric survival models for left, right, and interval-censored data with fixed and time-dependent covariates. This approach provides a direct computational solution with only a few model parameters in addition to the covariate effects. Furthermore, the proposed family of parametric models includes a proportional hazards model and a proportional odds model as special cases. As noted by Lindsey (1998), parametric regression models in the presence of heavily interval-censored data are robust and are generally more informative than the corresponding non-parametric models.
In Section 2, we present a three-parameter family of event-time distributions that includes the Weibull, negative binomial, and log-logistic, among others, as special cases, and the associated hazard and cumulative hazard functions. We then describe the likelihood function for the family in terms of linear functions of fixed and time-dependent covariates, model estimation, and inferences. In Section 3, we present model diagnostics to examine the event time distribution, the functional form of the covariates, and overall fit of the model. An example is provided in Section 4, followed by caveats and discussion in Section 5.
| 2. MODEL SPECIFICATION |
|---|
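Following Odell and others (1992), let $T_i$ be the event time, $t_i$ be the observed event, left- or right-censoring time, and for interval-censored observations, let $t_{Li}$ be the left-censoring time and $t_{Ri}$ the right-censoring time. Indicator functions for the ith observation are defined as follows:

$$
\begin{aligned}
\delta_{Ei} &= 1 \text{ if the event is observed exactly at } t_i, \text{ 0 otherwise},\\
\delta_{Ri} &= 1 \text{ if the observation is right censored at } t_i, \text{ 0 otherwise},\\
\delta_{Li} &= 1 \text{ if the observation is left censored at } t_i, \text{ 0 otherwise},\\
\delta_{Ii} &= 1 \text{ if the observation is interval censored in } (t_{Li}, t_{Ri}], \text{ 0 otherwise}.
\end{aligned}
\tag{2.1}
$$

Note that $\delta_{Ei} + \delta_{Ri} + \delta_{Li} + \delta_{Ii} = 1$.

A subject is right censored ($\delta_{Ri} = 1$) when the subject was last known to be event free at time $t_i$. A subject is left censored ($\delta_{Li} = 1$) when the subject is known to have had the event sometime prior to $t_i$ but it is not known when the subject was previously event free or when the event occurred. A subject is interval censored ($\delta_{Ii} = 1$) when it is known that the subject was event free at time $t_{Li}$ and to have had the event sometime up to $t_{Ri}$. When time is defined with a definite start time at $t = 0$, then a left-censored observation can be viewed as an interval-censored observation with $t_{Li} = 0$ and $t_{Ri} = t_i$. A subject is known to have had the event exactly ($\delta_{Ei} = 1$) at time $t_i$, when the subject is known to have been event free immediately prior to $t_i$, i.e. $t_{Li} = t_i^{-}$, $t_{Ri} = t_i$.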
Let $f(t)$ denote the probability density function of the event times and $F(t)$ denote the distribution function. Under the assumption of independent censoring, the likelihood function for a sample of n independent observations is

$$
L = \prod_{i=1}^{n} f(t_i)^{\delta_{Ei}} \, [1 - F(t_i)]^{\delta_{Ri}} \, F(t_i)^{\delta_{Li}} \, [F(t_{Ri}) - F(t_{Li})]^{\delta_{Ii}}
\tag{2.2}
$$

(Odell and others, 1992), where $t_i$ is the observed event or censoring time, $(t_{Li}, t_{Ri}]$ is the censoring interval, and $\delta_{Ei}$, $\delta_{Ri}$, $\delta_{Li}$, and $\delta_{Ii}$ are the indicators of exactly observed, right-censored, left-censored, and interval-censored observations in (2.1).

To accommodate covariate effects, both quantitative and qualitative, let $x_i$ be a vector of p fixed (e.g. baseline) covariates for the ith subject. Then also assume that for the ith subject, additional time-dependent covariates are updated at a sequence of update times $t_{i0} < t_{i1} < \cdots < t_{iJ_i}$, where $t_{i0}$ is the time at which a subject enters follow-up (usually zero as in a clinical trial). The set of update times $\{t_{ij}\}$ may differ among subjects. At the jth update time $t_{ij}$ of the ith subject, let $y_{ij}$ denote a vector of q time-dependent covariate values that are updated at that time. The covariate vector at the jth update time can also be denoted as $y_i(t_{ij})$. Then let $Y_i$ denote the complete sequence of time-dependent covariate values over time for the ith subject, and let $Y_i(t)$ denote the sequence up to time t. Note that under this model, the covariates are updated at discrete points in time. Different expressions are required if covariate values are updated continuously, such as a function of time itself (see Section 5).
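The likelihood function is then

$$
L = \prod_{i=1}^{n} f(t_i \mid x_i, Y_i(t_i))^{\delta_{Ei}} \, [1 - F(t_i \mid x_i, Y_i(t_i))]^{\delta_{Ri}} \, F(t_i \mid x_i, Y_i(t_i))^{\delta_{Li}} \, [F(t_{Ri} \mid x_i, Y_i(t_{Ri})) - F(t_{Li} \mid x_i, Y_i(t_{Li}))]^{\delta_{Ii}}
\tag{2.3}
$$

where $f(t \mid x_i, Y_i(t))$ is the event density for a specified event-time distribution conditional on the fixed covariate values $x_i$ and the sequence of time-dependent covariate values $Y_i(t)$ up to time $t$. The function $F(t \mid x_i, Y_i(t))$ is the corresponding cumulative distribution function. Specification of the hazard function in terms of the covariates leads to a specification of the cumulative hazard and survival function probabilities, conditional on the fixed covariates and the history of the time-dependent covariate processes for a given subject.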
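To make the structure of a likelihood with mixed censoring concrete, the following is a minimal Python sketch for the simplest case of a Weibull distribution with no covariates. The function name, the tuple encoding of the observations, and the parametrization $S(t) = \exp(-\theta t^{\alpha})$ are illustrative assumptions and are not taken from the authors' SAS IML program.

```python
import math

def weibull_loglik(obs, alpha, theta):
    """Log likelihood for exact, right-, left- and interval-censored times.

    obs: list of (kind, t_left, t_right) tuples (hypothetical encoding):
      ("exact", t, t), ("right", t, None), ("left", None, t),
      ("interval", t_left, t_right).
    Assumes the survival function S(t) = exp(-theta * t**alpha).
    """
    def surv(t):
        return math.exp(-theta * t ** alpha)

    def log_density(t):
        # log f(t) = log hazard + log survival, hazard = alpha*theta*t^(alpha-1)
        return (math.log(alpha * theta) + (alpha - 1.0) * math.log(t)
                - theta * t ** alpha)

    total = 0.0
    for kind, lo, hi in obs:
        if kind == "exact":
            total += log_density(lo)
        elif kind == "right":
            total += math.log(surv(lo))            # event after lo
        elif kind == "left":
            total += math.log(1.0 - surv(hi))      # event before hi
        elif kind == "interval":
            total += math.log(surv(lo) - surv(hi)) # event in (lo, hi]
        else:
            raise ValueError(f"unknown observation type: {kind}")
    return total
```

Each observation contributes exactly one of the four factors of the likelihood, selected by its censoring type; a function of this kind can be passed to any general-purpose optimizer.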
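Let $\beta$ and $\eta$ be the coefficient vectors for the fixed and time-dependent covariates $x_i$ and $y_{ij}$, respectively, so that

$$
\theta_{ij} = \exp(x_i'\beta + y_{ij}'\eta),
\tag{2.4}
$$

where $\theta_{ij}$ designates a rate parameter conditional on the covariate values at update time $t_{ij}$. Note that with no time-dependent covariates, $\theta_{ij} = \theta_i = \exp(x_i'\beta)$.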
A left-censored observation is assumed to have an initial value of the time-dependent covariate at time $t_{i0}$, assuming that the subject is known to be event free at that time. Otherwise, left-censored observations must be excluded from the analysis. However, left-censored observations could be employed in a model with only fixed covariate values, assuming that those values were determined prior to the event (e.g. gender).
We now introduce a general form for the hazard function that, with an additional parameter, can span a family of distributions that includes the Weibull and log-logistic distributions, among others, as special cases. For subject i, conditional on covariates measured at baseline and at time $t_{ij}$, this hazard can be expressed as

$$
\lambda(t \mid x_i, y_{ij}) = \frac{\alpha\,\theta_{ij}\,t^{\alpha-1}}{(1 + \theta_{ij}\,t^{\alpha})^{\kappa}}, \qquad t_{ij} \le t < t_{i,j+1},
\tag{2.5}
$$

where $\alpha$ and $\kappa$ are general hazard function parameters. By construction, the rate parameter $\theta_{ij}$ is assumed to be constant in the interval $[t_{ij}, t_{i,j+1})$, $j = 0, 1, \ldots$, for the ith subject.
Specific values for $\alpha$ and $\kappa$ yield a specific distribution. In particular,
and
yield a negative binomial distribution for event times. More generally,
$\kappa = 0$ yields a Weibull hazard, and the parameter vectors $\beta$ and $\eta$ are the change in the log relative risk per unit increase in $x$ and $y$, respectively. The hazard function will be decreasing for $\alpha < 1$, constant for $\alpha = 1$, and increasing for $\alpha > 1$. Selecting $\kappa = 1$ yields a log-logistic hazard, and $\beta$ and $\eta$ are the change in the log odds ratio of cumulative incidences. This hazard is decreasing for $\alpha \le 1$. For $\alpha > 1$ and fixed $\theta$, the hazard increases to a maximum at time $t = [(\alpha - 1)/\theta]^{1/\alpha}$, then decreases to zero as time approaches infinity. For
and
, the hazard increases rapidly, plateaus, and then begins to slowly decline, similar to the hazard for the log normal distribution. Thus, this family of hazards indexed by the additional parameter
$\kappa$ encompasses a wide range of survival distributions.
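The special cases described above can be checked numerically. The sketch below assumes the hazard has the form $\alpha\theta t^{\alpha-1}/(1+\theta t^{\alpha})^{\kappa}$, a form chosen because it reproduces the Weibull ($\kappa = 0$) and log-logistic ($\kappa = 1$) behavior described in the text; the function names are hypothetical.

```python
import math

def hazard(t, alpha, kappa, theta):
    """Assumed family: alpha*theta*t^(alpha-1) / (1 + theta*t^alpha)^kappa."""
    return alpha * theta * t ** (alpha - 1.0) / (1.0 + theta * t ** alpha) ** kappa

def weibull_hazard(t, alpha, theta):
    """Weibull hazard, the kappa = 0 member of the assumed family."""
    return alpha * theta * t ** (alpha - 1.0)

def loglogistic_hazard(t, alpha, theta):
    """Log-logistic hazard, the kappa = 1 member of the assumed family."""
    return alpha * theta * t ** (alpha - 1.0) / (1.0 + theta * t ** alpha)

def loglogistic_peak_time(alpha, theta):
    """Time at which the log-logistic hazard is maximal (requires alpha > 1)."""
    if alpha <= 1.0:
        raise ValueError("the log-logistic hazard has no interior maximum for alpha <= 1")
    return ((alpha - 1.0) / theta) ** (1.0 / alpha)
```

For instance, with `alpha = 2` and `theta = 0.5` the log-logistic hazard peaks at `loglogistic_peak_time(2.0, 0.5)`, i.e. at the square root of 2, and is lower on either side of that time.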
![]() | (2.6) |
for which the hazard function can be conveniently expressed as
![]() | (2.7) |
This simplifies the expressions for the score equations and Hessian.
To describe the expression for the cumulative hazard, we impose the condition that the set of update times $\{t_{ij}\}$ for the ith subject includes the event or censoring time $t_i$ for that subject. If in fact, as will most often be the case, the time-dependent covariate values are not updated exactly at an event or censoring time, then the interval between two update times can be split into two intervals with an added update time equal to the event or censoring time.
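The interval-splitting step can be sketched as follows. This is only an illustration under assumed data structures: update times and covariate values are held in two parallel Python lists, and the helper name `add_update_time` is hypothetical.

```python
import bisect

def add_update_time(update_times, values, t):
    """Return copies of the update times and covariate values with t included.

    update_times: sorted list of times at which the covariate was updated.
    values: covariate value in force from each update time onward.
    If t is not already an update time, it is inserted and assigned the value
    in force just before t, so the interval containing t is split in two.
    """
    if t in update_times:
        return list(update_times), list(values)
    k = bisect.bisect_left(update_times, t)
    if k == 0:
        raise ValueError("t precedes the first update time")
    new_times = update_times[:k] + [t] + update_times[k:]
    new_values = values[:k] + [values[k - 1]] + values[k:]
    return new_times, new_values
```

Because the added time carries forward the value already in force, the hazard itself is unchanged; the split only ensures that the cumulative hazard can be accumulated exactly up to the event or censoring time.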
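Let $I(S)$ denote the indicator function where $I(S) = 1$ if S is true, 0 otherwise. Then, the cumulative hazard at time $t$ for the ith subject is

$$
\Lambda_i(t) = \sum_{j \ge 1} I(t_{ij} \le t) \int_{t_{i,j-1}}^{t_{ij}} \frac{\alpha\,\theta_{i,j-1}\,s^{\alpha-1}}{(1 + \theta_{i,j-1}\,s^{\alpha})^{\kappa}} \, ds
\tag{2.8}
$$

$$
= \sum_{j \ge 1} I(t_{ij} \le t) \, \frac{(1 + \theta_{i,j-1}\,t_{ij}^{\alpha})^{1-\kappa} - (1 + \theta_{i,j-1}\,t_{i,j-1}^{\alpha})^{1-\kappa}}{1 - \kappa},
\tag{2.9}
$$

where $\theta_{i,j-1}$ is the rate parameter in force between update times $t_{i,j-1}$ and $t_{ij}$. For $\kappa = 1$ the latter term is undefined. However, in this case, the antiderivative of the hazard is

$$
\int \frac{\alpha\,\theta\,t^{\alpha-1}}{1 + \theta\,t^{\alpha}} \, dt = \log(1 + \theta\,t^{\alpha})
\tag{2.10}
$$

and using l'Hospital's rule with implicit differentiation it follows that $\lim_{\kappa \to 1} [(1 + \theta t^{\alpha})^{1-\kappa} - 1]/(1 - \kappa) = \log(1 + \theta t^{\alpha})$. The resulting expressions for the hazard and survivor function equal those for the log-logistic model. Thus, the expression in (2.10) should be employed to compute the gradient and Hessian in cases where $\kappa$ is specified to be 1, or the interim estimate in an iterative computation yields a value close to 1, i.e. $|\kappa - 1| < \epsilon$ for some small $\epsilon$.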
For a subject with an observed event time or a right- or left-censored event time, the cumulative hazard is evaluated at $t_i$; and for an interval-censored observation ($\delta_{Ii} = 1$) at both $t_{Li}$ and at $t_{Ri}$. In the simple case with no time-dependent covariates, the term $\theta_i$ is constant over time for the ith individual. The cumulative hazard function for the ith subject evaluated at time t is then expressed as

$$
\Lambda_i(t) = \frac{(1 + \theta_i\,t^{\alpha})^{1-\kappa} - 1}{1 - \kappa}.
\tag{2.11}
$$
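A cumulative hazard with a piecewise-constant rate parameter can be accumulated interval by interval. The sketch below assumes the hazard form $\alpha\theta t^{\alpha-1}/(1+\theta t^{\alpha})^{\kappa}$, switches to the logarithmic antiderivative when $\kappa$ is within a small tolerance of 1, and uses a hypothetical layout in which `thetas[j]` is the rate in force between `times[j]` and `times[j + 1]`.

```python
import math

def _antiderivative(t, alpha, kappa, theta, eps=1e-8):
    """Antiderivative of the assumed hazard, evaluated at time t."""
    g = 1.0 + theta * t ** alpha
    if abs(kappa - 1.0) < eps:
        return math.log(g)                      # log-logistic limit
    return g ** (1.0 - kappa) / (1.0 - kappa)

def cumulative_hazard(times, thetas, alpha, kappa):
    """Cumulative hazard from times[0] to times[-1].

    times: increasing update times, the last being the evaluation time.
    thetas: rate parameters, thetas[j] in force on (times[j], times[j + 1]].
    """
    if len(thetas) != len(times) - 1:
        raise ValueError("need one rate parameter per interval")
    total = 0.0
    for j in range(len(times) - 1):
        total += (_antiderivative(times[j + 1], alpha, kappa, thetas[j])
                  - _antiderivative(times[j], alpha, kappa, thetas[j]))
    return total
```

With a single rate parameter and $\kappa = 0$ the sum telescopes to the Weibull cumulative hazard $\theta t^{\alpha}$, which provides a simple check of an implementation.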
From the general expression for the likelihood in (2.3) and that for the hazard function with fixed and time-dependent covariates in (2.5), the score equations and Hessian matrix can be derived. The expressions are presented in the Appendix. These can then be used to provide the maximum likelihood estimates of the model parameters and the estimated variance of the estimates using an iterative procedure such as the Newton-Raphson algorithm or variations thereof. Alternatively, a derivative-free iterative procedure may be employed to fit the model and estimate the Hessian. The program available from the authors obtains the model estimates using the Newton-Raphson ridge optimization method (cf. Press and others, 1992) to maximize the log-likelihood function through the SAS IML function NLPNRR (SAS, 1999). Initial values are obtained by fitting a Weibull accelerated failure time model using SAS PROC LIFEREG and then transforming the time acceleration parameter estimates to Weibull risk model estimates (cf. Lachin, 2000). At each iteration, the Hessian is estimated by the SAS IML function NLPFDD, which uses the algorithm of Gill and others (1983) based on finite difference equations with central difference approximations. The final estimate of the Hessian when the model has converged is then used to estimate the observed information matrix and the covariance matrix of the coefficient estimates.

Using the theorem from Lehmann (1983, pp. 429–30), Sparling (2002) provides a proof that the resulting estimates are asymptotically normally distributed about the true values with a covariance matrix that can be consistently estimated from the estimated observed information matrix. The resulting estimates then provide a basis for confidence interval estimates and Wald or likelihood ratio tests of significance.
| 3. MODEL DIAGNOSTICS |
|---|
The shape of the assumed hazard function is determined by the values of $\alpha$ and $\kappa$. Specific hypotheses such as $H_0\colon \kappa = 0$ (i.e. a Weibull model) or $H_0\colon \kappa = 1$ (i.e. a log-logistic model) can be tested using a Wald or likelihood ratio test. If, for example, it is desired to employ a Weibull model, then that model could be fit if the corresponding test is not significant by setting $\kappa = 0$ in all the equations.

Alternatively, the model can be fit for other values of $\kappa$, and the adequacy of those values can be assessed by examining the values of the log likelihood function, or the optimal value $\hat{\kappa}$ could be estimated from the model. In this case, however, the coefficient estimates $\hat{\beta}$ or $\hat{\eta}$ no longer have a convenient interpretation as the log relative hazards ($\kappa = 0$) or as the log cumulative incidence odds ratios ($\kappa = 1$).
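As an illustration of how such tests of a single shape parameter are computed, the sketch below implements a two-sided Wald test and a 1-degree-of-freedom likelihood ratio test using only the Python standard library. The function names are hypothetical, and the inputs (an estimate, its standard error, and two maximized log likelihoods) are assumed to come from a fitted model.

```python
import math

def wald_test(estimate, se, null_value=0.0):
    """Two-sided Wald test; returns (z, p) under the normal approximation."""
    z = (estimate - null_value) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def likelihood_ratio_test_1df(loglik_full, loglik_restricted):
    """Likelihood ratio test for one restricted parameter.

    The statistic 2*(loglik_full - loglik_restricted) is referred to a
    chi-square distribution with 1 df, whose upper tail probability is
    erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_restricted)
    p = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p
```

The restricted log likelihood is obtained by refitting the model with the shape parameter fixed at the hypothesized value, for example 0 for the Weibull model or 1 for the log-logistic model.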
The functional form of the covariate effects can be explored using spline functions (Smith, 1979; Ramsay, 1988).
Following Therneau and others (1990)
, the deviance for the family of models herein can be described as
![]() | (3.1) |
where h is the set of subject-specific implied parameters. Let
,
, and
denote the hazard, survival, and cumulative hazard functions, respectively, from the saturated model; and let
,
, and
denote the estimates from the fitted model. Let
be the individual per-subject estimates of the parameter vector
.
The Appendix shows that
for left-censored, right-censored and interval-censored observations. For non-censored observations where
and
is the event time for individual i, then
![]() | (3.2) |
The first two terms are not readily obtained because the subject-specific estimate of the cumulative hazard function at time
is a function of the subject-specific estimate of the hazard function at time
. However, an approach similar to that in Therneau and others (1990)
can be used. For the non-censored case in the current model, the derivatives of the log-likelihood function with respect to each element of the parameter vector
can be solved under the constraint that the second derivatives given are negative. This approach involves solving a set of simultaneous equations per subject by numerical methods such as Newton-Raphson ridge optimization.
The deviance statistic may indicate that the model does not fit the data well because important covariates have been omitted, or components of the model are mis-specified such as the variance of the responses as a function of the conditional expectation. In the latter case, the "information sandwich" can be used to provide an estimate of the covariance matrix that is robust to mis-specification (Royall, 1986).
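For reference, the information sandwich has the following standard generic form. The notation here is an assumption introduced for illustration: $\phi$ denotes the full parameter vector, $U_i(\hat{\phi})$ the score contribution of the ith subject evaluated at the estimates, and $I(\hat{\phi})$ the observed information.

```latex
\widehat{\Sigma}_R(\hat{\phi})
  = I(\hat{\phi})^{-1}
    \left[ \sum_{i=1}^{n} U_i(\hat{\phi})\, U_i(\hat{\phi})^{\top} \right]
    I(\hat{\phi})^{-1}
```

When the model is correctly specified, the middle term estimates the same quantity as $I(\hat{\phi})$ and the expression reduces to the usual inverse information.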
| 4. GLUCOSE EXPOSURE AND THE RISK OF RETINOPATHY IN DIABETES |
|---|
In longitudinal studies or clinical trials, subjects may undergo an examination or procedure at regularly scheduled follow-up visits to determine whether disease progression has occurred. In this case, the time to specific outcomes is interval censored by the schedule of follow-up assessments. Furthermore, covariates related to the outcome may also be assessed periodically over time.
For example, the study of the Epidemiology of Diabetes Interventions and Complications (EDIC) is a follow-up observational study of the subjects who had previously participated in the Diabetes Control and Complications Trial (DCCT). Men and women aged 13–39 years with type 1 diabetes mellitus (T1DM) were enrolled in the DCCT between 1983 and 1989. Patients were recruited into two cohorts: a primary prevention cohort with no pre-existing complications and a secondary intervention cohort with minimal complications present.

Patients were randomized to either intensive or conventional treatment and were followed for an average of 6.5 years. Intensive therapy was aimed at maintaining near normal levels of blood glucose while conventional therapy had no such glucose target. The DCCT Research Group (1993) showed that intensive therapy markedly reduced the risk of progression of diabetes complications, principally retinopathy (diabetic retinal abnormalities, potentially leading to blindness) that was assessed from a retinal evaluation every 6 months.
The level of glucose exposure (glycemia) over the preceding 6–8 weeks is provided by the hemoglobin A1c (HbA1c), expressed as the percentage of all hemoglobin (red cells) that have been glycosylated through exposure to glucose molecules in blood, the half-life of hemoglobin being 6–8 weeks. The history of glycemia before study entry was represented by the level of HbA1c on initial screening and the pre-existing duration of diabetes, and the history of glycemia during the study by the mean level of HbA1c during the study and the duration of follow-up. The DCCT Research Group (1995) showed that the lifetime history of glycemia represented by all four factors was the dominant determinant of the risk of complications, and that the group differences in the updated current mean HbA1c (a time-dependent covariate) during the study explained virtually all the effect of DCCT treatment group on complications.
At the close of the DCCT in 1993, all subjects were referred to their personal physicians for care and followed annually during EDIC at which time HbA1c was assessed. During EDIC, the levels of HbA1c were approximately equal in the two former DCCT treatment groups. Retinopathy was assessed in about one-quarter of the patients at years 1–3 timed in relation to the original date of entry (i.e. 4, 8, or 12 years since entry), and in all subjects at year 4. Thus, the times of progression of retinopathy are interval censored with staggered intervals. One objective of EDIC is to assess the long-term effects of levels of glycemia during the DCCT and EDIC on risk of further progression of retinopathy from the levels present at the end of the DCCT. The DCCT/EDIC Research Group (2000) showed that former intensive therapy reduced the risks of further progression of retinopathy during EDIC. The remaining question is the extent to which glycemic levels during DCCT and EDIC are associated with risk of further retinopathy progression during EDIC.
Fixed (baseline) covariates include primary versus secondary cohort at EDIC baseline (1 if in the primary cohort, 0 if secondary), the duration of T1DM in months and HbA1c level (%) on initial screening that represent the pre-DCCT level of glycemia, and the mean HbA1c (%) during the DCCT and months duration of follow-up in the DCCT that represent the level of glycemia during the DCCT. The time-dependent covariate is the updated current mean HbA1c during the EDIC, i.e. the value at year 1, the mean of years 1 and 2, then years 1–3, updated at the time of each successive measurement. In those cases where the EDIC annual HbA1c is missing (not measured), the mean value from the previous visit is carried forward. Nine patients are deleted because they are missing all HbA1c values. Of the remaining 1316 subjects, 1085 are right censored and 231 have interval-censored times of progression.
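The construction of the updated current mean, with the previous mean carried forward at a missed visit, can be sketched in a few lines of Python. The function name and the use of `None` to mark a missing annual value are assumptions made for illustration.

```python
def updated_current_mean(annual_values):
    """Running mean of the values observed so far, one entry per visit.

    annual_values: list of floats, with None for a missed measurement.
    At a missed visit the previous running mean is carried forward; visits
    before the first observed value yield None.
    """
    means, total, count = [], 0.0, 0
    for value in annual_values:
        if value is not None:
            total += value
            count += 1
        means.append(total / count if count else None)
    return means
```

For example, `updated_current_mean([8.0, None, 9.0, 10.0])` yields the covariate values 8.0, 8.0, 8.5, and 9.0 at years 1 through 4.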
Approximately half the subjects within each treatment group were enrolled (by design) into the primary and secondary cohorts. On entry, the mean duration of diabetes was 69 ± 49 (SD) months and the mean level of HbA1c was 9.0 ± 1.6%. The mean duration of follow-up during the DCCT was 73 ± 20 months and the mean level of HbA1c was 8.1 ± 1.4% during the DCCT. The mean level of HbA1c during EDIC was 8.2 ± 1.3%.
Prior knowledge about the distribution of progression of retinopathy suggests that the event times might follow a Weibull distribution. Figure 1 presents the cumulative incidence of retinopathy progression estimated using the Turnbull (1976) empirical estimate and using the Weibull model-based estimate separately within each treatment group and shows that the two estimates are superimposable. Under a Weibull distribution assumption, the regression model coefficients have a log hazard ratio or log relative risk interpretation. Analyses using regression splines to represent the effect of the EDIC HbA1c showed that a simple linear effect was satisfactory (see Sparling, 2002, for details).
Table 1 presents the maximum likelihood estimates of the parameters, their variances, Wald tests, and likelihood ratio tests for the Weibull model. The deviance for the fitted Weibull model is
with
and
$p = 0.624$, which indicates adequate fit of the model. Further, the deviance/df = 0.99 does not indicate any overdispersion. Thus, a robust estimate of the variance–covariance matrix is not employed.
Primary versus secondary cohort and duration of follow-up in the DCCT do not have any meaningful effects on progression of retinopathy adjusted for the other covariates. T1DM duration and the HbA1c on entry have significant effects whereas the effect of the duration of follow-up in the DCCT is not significant. By far the greatest effect is contributed by the mean level of HbA1c during the DCCT with a 2.04-fold increase in the risk of progression of retinopathy per unit increase in DCCT mean HbA1c percent (95% CI: 1.64, 2.54). Interestingly, the time-dependent EDIC mean HbA1c over 4 years following the DCCT has a significant effect on risk, but with a much smaller 1.19-fold increase in risk per unit increase in HbA1c percent (95% CI: 1.02, 1.38). The finding that the effect of glycemic exposure during the DCCT persists and outweighs that of the level of HbA1c during the first 4 years of EDIC has led to the hypothesis of metabolic memory, namely that the effects of hyperglycemia are long lasting.
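The reported relative risks can be related back to the model coefficients with a little arithmetic. The sketch below assumes the confidence limits are symmetric Wald limits computed on the log scale, which is an assumption about how they were obtained; the function name is hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover a coefficient and its standard error from a hazard ratio
    and a Wald-type 95% confidence interval computed on the log scale."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# DCCT mean HbA1c: 2.04-fold risk per unit increase (95% CI: 1.64, 2.54)
beta_dcct, se_dcct = log_hr_and_se(2.04, 1.64, 2.54)

# EDIC updated mean HbA1c: 1.19-fold risk per unit increase (95% CI: 1.02, 1.38)
beta_edic, se_edic = log_hr_and_se(1.19, 1.02, 1.38)

# Under a log-linear effect, a 2-unit difference multiplies the risk by hr**2
hr_two_units = math.exp(2.0 * beta_dcct)
```

Under these assumptions the DCCT coefficient is about 0.71 with a standard error of about 0.11, and a 2-unit higher DCCT mean HbA1c corresponds to roughly a 4.2-fold risk.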
An unrestricted model provides a shape parameter estimate
.3247 with
and 95% confidence limits
. The Wald test of the hypothesis
yields
, the likelihood ratio test yields
. These results indicate that the distribution of event times conditional on covariates does not significantly deviate from a Weibull distribution. In this model,
with
, close to the values in the Weibull model. Further,
. Because the observed cumulative incidence at the close of follow-up in the cohort is still low (
), there is little information to reliably estimate the shape parameter
.
Similar covariate effects were obtained in a logistic regression analysis of the subset of subjects who were assessed at 4 years (The DCCT/EDIC Research Group, 2000). That analysis described the cross-sectional association of the mean HbA1c over 4 years with the prevalence of progression at 4 years, whereas the analysis herein shows the more desirable prospective association between the time-dependent HbA1c levels and the incidence (risk) of progression using all visits in all patients.
| 5. DISCUSSION |
|---|
We describe a family of parametric regression models for survival data that allows for fixed (e.g. baseline) and/or time-dependent covariates with mixtures of left, right, and interval censoring. The model is fit using standard maximum likelihood estimation from the full likelihood for which all the conditions for convergence have been rigorously proven in Sparling (2002). The family of models is characterized by an additional parameter $\kappa$ that allows fitting a Weibull model ($\kappa = 0$) or a log-logistic model ($\kappa = 1$) as special cases, or allows for the optimal value of this parameter to also be estimated from the observed data. In the latter case, however, the ability to differentiate a Weibull from a log-logistic model, or to accurately estimate the optimal value, will be roughly proportional to the observed cumulative incidence. If the observed cumulative incidence is low, as in the example herein (Figure 1), there is inadequate information about the true shape of the hazard function to allow precise estimation of $\kappa$.
The family of models presented herein allows for incorporation of time-dependent covariates for which the values are updated or change at discrete points in time. For a time-dependent covariate that changes continuously over time, such as a function of time itself, the integrals in the cumulative hazard function may not be expressible in closed form. For example, rather than having a vector of discrete time-dependent covariates $y_{ij}$ with distinct values at each of the update times $t_{ij}$, a covariate that is a function of time itself would have continuously changing time-dependent values specified as $y_i(t)$, $t > 0$. Then, the rate parameter of the hazard function is no longer constant over intervals of time. Consequently, the cumulative hazard at time t for the ith subject is

$$
\Lambda_i(t) = \int_{0}^{t} \lambda(s \mid x_i, y_i(s)) \, ds.
\tag{5.1}
$$
While such an expression may not have a closed form, the expression could be evaluated numerically.
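One way to carry out such a numerical evaluation is composite Simpson quadrature. The sketch below is generic: `hazard_fn` is a hypothetical user-supplied function of time that already incorporates the covariate path, and it is assumed to be finite on the whole interval, including at time zero.

```python
import math

def cumulative_hazard_numeric(hazard_fn, t, n=200):
    """Composite Simpson approximation of the integral of hazard_fn on [0, t].

    hazard_fn: function of time returning the hazard, assumed finite on [0, t]
    (a hazard that is unbounded at 0 would need a different quadrature rule).
    n: number of subintervals, rounded up to an even number.
    """
    if n % 2:
        n += 1
    h = t / n
    total = hazard_fn(0.0) + hazard_fn(t)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * hazard_fn(k * h)
    return total * h / 3.0

# Illustrative covariate that changes continuously: a rate that grows
# log-linearly with time, giving a hazard of exp(0.1 * s).
example = cumulative_hazard_numeric(lambda s: math.exp(0.1 * s), 3.0)
```

The survival probability then follows as the exponential of minus the cumulative hazard, so the same routine can feed each term of the likelihood.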
Many covariates will in fact be a function of time, in theory, including biochemical or biological measures that change from day to day, such as the HbA1c measured in the EDIC example above. However, in practice, as in the EDIC, all that is known are the time-dependent covariate's values for a finite set of update times. In this situation, the time-dependent covariate effect must be interpreted in the context of the study design and the follow-up schedule of assessments at which changes in the covariate are observed. For the EDIC example, the estimated relative risk per unit increase in HbA1c is that associated with differences between values updated approximately annually, the specified schedule during EDIC. The relative risks (or hazard ratios) then have a prospective interpretation when applied to other settings with the same (or a similar) schedule of update times.
As with any model that employs fixed and time-dependent covariates, caution should be taken in the interpretation of a fixed (e.g. baseline) effect when a time-dependent covariate is influenced by that fixed effect, such as treatment group (see, for example, Kalbfleisch and Prentice, 2002). In this situation, the model describes the risks of the event given the time-dependent covariate values and treatment group. If the effect of the treatment group is predominantly reflected in the time-dependent covariate process, this type of analysis would show a minimal treatment group effect. However, an analysis not including the time-dependent covariate might show a substantial treatment group effect. In this case, the proper interpretation is that treatment group has an effect on risk, and also an effect on the time-dependent covariate Y, but that after adjusting for Y, group has little further effect. This suggests that factors related to Y reflect the underlying mechanism by which treatment has an effect on risk. In fact, such analyses and findings are useful to illustrate the mechanism by which fixed baseline covariates such as treatment group have an effect on risk.
A SAS IML macro written by Oliver Bautista and Yvonne Sparling is available from www.bsc.gwu.edu under the link to downloadable software.
| APPENDIX |
|---|
From (2.3), the log likelihood is
![]() | (A.1) |
In order to simplify the expressions, the conditioning on
is omitted for the expressions for
and
. For any parameter
, where
, the score equation is
![]() | (A.2) |
The derivatives of
, of the log hazard and the increments in cumulative hazard with respect to each parameter
at covariate update time u, are
![]() | (A.3) |
where
is the value for the
th time-dependent covariate value for the ith subject at update time u. The derivatives of terms involving the cumulative hazard at event or censoring time t are then provided by
![]() | (A.4) |
The score equation for parameter
is then obtained by evaluating the respective derivatives for each subject. The maximum likelihood estimates of the model parameters are those values
such that the joint set of score equations are equal to zero when all parameters are fixed at these values.
The Hessian for any parameters
and
has elements
![]() | (A.5) |
The partial second derivatives with respect to
,
, and
are presented in Tables 2–4, respectively.
The estimated Hessian matrix then has elements
based on the vector of parameter estimates
. The observed information is
. If
is the
th term in the parameter vector
then the estimated variance of
is obtained as
![]() | (A.6) |
where
is the (
) term of
.
To simplify the derivation, we use
to refer to
. In the left-censored case (
and
is the left-censoring time for individual i),
![]() | (A.7) |
For the right-censored case (
and
is the right-censoring time for individual i),
![]() | (A.8) |
In the interval-censored case (
and
is the left endpoint of the censoring interval and
is the right endpoint of the censoring interval for individual i),
![]() | (A.9) |
![]() | (A.10) |
| ACKNOWLEDGMENTS |
|---|
This work was supported by a grant from the National Cancer Institute and by a contract from the National Institute of Diabetes, Digestive, and Kidney Diseases for the study of the EDIC. Conflict of Interest: None declared.
| REFERENCES |
|---|
Betensky RA, Lindsey JC, Ryan LM, Wand MP. (2002) A local likelihood proportional hazards model for interval censored data. Statistics in Medicine 21:263–75.
Finkelstein DM. (1986) A proportional hazards model for interval-censored failure time data. Biometrics 42:845–54.
Finkelstein DM and Wolfe RA. (1985) A semiparametric model for regression analysis of interval-censored failure time data. Biometrics 41:933–45.
Gill PE, Murray W, Saunders MA, Wright MH. (1983) Computing forward-difference intervals for numerical optimization. SIAM Journal on Scientific and Statistical Computing 4:310–21.
Goetghebeur E and Ryan L. (2000) Semiparametric regression analysis of interval-censored data. Biometrics 56:1139–44.
Kalbfleisch JD and Prentice RL. (2002) The Statistical Analysis of Failure Time Data, 2nd edition (John Wiley and Sons, Inc., New York).
Kooperberg C and Clarkson DB. (1997) Hazard regression with interval-censored data. Biometrics 53:1485–94.
Kooperberg C, Stone CJ, Truong YK. (1995) Hazard regression. Journal of the American Statistical Association 90:78–94.
Lachin JM. (2000) Biostatistical Methods: The Assessment of Relative Risks (John Wiley and Sons, Inc., New York).
Lehmann EL. (1983) Theory of Point Estimation (John Wiley and Sons, Inc., New York).
Lindsey JK. (1998) A study of interval censoring in parametric regression models. Lifetime Data Analysis 4:329–54.
Odell PM, Anderson KM, D'Agostino RB. (1992) Maximum likelihood estimation for interval-censored data using a Weibull-based accelerated failure time model. Biometrics 48:951–9.
Press WH, Vetterling WT, Teukolsky SA, Flannery BP. (1992) Numerical Recipes in C (Cambridge University Press, London).
Rabinowitz D, Betensky R, Tsiatis AA. (2000) Using conditional logistic regression to fit proportional odds models to interval censored data. Biometrics 56:511–8.
Rabinowitz D, Tsiatis A, Aragon J. (1995) Regression with interval-censored data. Biometrika 82:501–13.
Ramsay JO. (1988) Monotone regression splines in action (with discussion). Statistical Science 3:42561.
Royall RM. (1986) Model robust inference using maximum likelihood estimators. International Statistical Review 54:2216.
SAS Institute Inc. (1999) SAS/IML Users' Guide, Version 8(SAS Institute Inc., Cary, NC).
Seaman SR and Bird SM. (2001) Proportional hazards model for interval-censored failure times and time-dependent covariates: application to hazard of HIV infection of injecting drug users in prison. Statistics in Medicine 20:185570.[CrossRef][Web of Science][Medline]
Smith PL. (1979) Splines as a useful and convenient statistical tool. The American Statistician 33:5762.
Sparling YH. (2002) Parametric survival models for interval-censored data with time-dependent covariates, [Doctoral Dissertation](The George Washington University, Washington, DC).
The DCCT/EDIC Research Group YH. (2000) Retinopathy and nephropathy in patients with type 1 diabetes four years after a trial of intensive therapy. The New England Journal of Medicine 342:3819.
The DCCT Research Group YH. (1993) The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. The New England Journal of Medicine 329:97786.
The DCCT Research Group YH. (1995) The relationship of glycemic exposure (HbA1c) to the risk of development and progression of retinopathy in the diabetes control and complications trial. Diabetes 44:96883.[Abstract]
Therneau TM, Grambsch PM, Fleming TR. (1990) Martingale-based residuals for survival models. Biometrika 77:14760.
Turnbull BW. (1976) The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society, Series B 38:2905.
Younes N and Lachin J. (1997) Link-based models for survival data with interval and continuous time censoring. Biometrics 53:1199211.[CrossRef]
Received March 31, 2005; revised January 27, 2006; accepted for publication March 8, 2006.