Biostatistics Advance Access published online on October 15, 2009
Biostatistics, doi:10.1093/biostatistics/kxp040
Varying-coefficient models for longitudinal processes with continuous-time informative dropout
MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK li.su{at}mrc-bsu.cam.ac.uk
Center for Statistical Sciences, Department of Community Health, Box G-S121-7, Brown University, Providence, RI 02912, USA
* To whom correspondence should be addressed.
| SUMMARY |
|---|
Dropout is a common occurrence in longitudinal studies. Building upon the pattern-mixture modeling approach within the Bayesian paradigm, we propose a general framework of varying-coefficient models for longitudinal data with informative dropout, where measurement times can be irregular and dropout can occur at any point in continuous time (not just at observation times) together with administrative censoring. Specifically, we assume that the longitudinal outcome process depends on the dropout process through its model parameters. The unconditional distribution of the repeated measures is a mixture over the dropout (administrative censoring) time distribution, and the continuous dropout time distribution with administrative censoring is left completely unspecified. We use Markov chain Monte Carlo to sample from the posterior distribution of the repeated measures given the dropout (administrative censoring) times; Bayesian bootstrapping on the observed dropout (administrative censoring) times is carried out to obtain marginal covariate effects. We illustrate the proposed framework using data from a longitudinal study of depression in HIV-infected women; a strategy for sensitivity analysis on unverifiable assumptions is also demonstrated.
Keywords: HIV/AIDS; Missing data; Nonparametric regression; Penalized splines
| 1. INTRODUCTION |
|---|
Many longitudinal studies suffer from dropout, which is termed "informative" if the dropout process depends on the unobserved outcomes even after conditioning on the observed data. To account for informative dropout, a number of model-based approaches have been proposed for the joint modeling of the dropout and longitudinal outcome processes (Little, 1995).
Focusing on the pattern-mixture modeling approach, in this article we develop a general framework of varying-coefficient models (VCMs) (Hastie and Tibshirani, 1993) for longitudinal data, where measurement times may be irregular across individuals and where dropout can occur at any point in continuous time (not just at observation times) and be unobserved due to administrative censoring. Specifically, the conditional distribution of the longitudinal outcomes given the dropout time (or the administrative censoring time) follows a VCM, where the outcome model parameters such as regression coefficients, variance components, and correlation parameters depend on the dropout time (or the administrative censoring time) through unspecified smooth functions. Two separate VCMs are specified to distinguish the administratively censored individuals from those who actually drop out. The full-data distribution is a mixture over the dropout/administrative censoring time distribution, which is left unspecified.
The proposed framework generalizes the pattern-mixture models (PMMs) (Little, 1993, 1994; Fitzmaurice and Laird, 2000), the conditional linear models (CLMs) (Wu and Bailey, 1989), and the class of VCMs developed for continuous outcomes in Hogan and others (2004). Specifically, our approach is distinguished from the work in Hogan and others (2004) by (a) handling administrative censoring separately from other types of dropout that could be related to the outcome process, (b) allowing all model parameters to depend on the dropout process, and (c) accommodating binary outcomes. In this article, we demonstrate the proposed approach using both continuous and binary longitudinal data with continuous-time dropout. The unspecified smooth functions are modeled by Bayesian penalized splines (Ruppert and others, 2003). When the marginal covariate effects on the outcome process are of interest, Rubin's (1981) Bayesian bootstrap (BB) is used for averaging over the dropout time distribution with administrative censoring. The advantage of building our VCM framework within the Bayesian paradigm is that there is no need to model the continuous dropout time distribution parametrically. With a frequentist approach that models the dropout times nonparametrically, extra simulation by bootstrapping the continuous dropout times is necessary for standard error estimation if the delta method fails (Hogan and others, 2004). In contrast, the BB merges naturally with the Markov chain Monte Carlo (MCMC) for the outcome process model, and the variability of the observed dropout/administrative censoring times is appropriately taken into account when making inferences on the marginal covariate effects.
The HIV epidemiology research study (HERS) (Smith and others, 1997; Ickovics and others, 2001) was a longitudinal study of women with, or at high risk for, HIV infection. Twelve core visits (each in a calendar time window) were scheduled for 1310 women, where a variety of clinical, behavioral, and sociologic outcomes were to be recorded approximately every 6 months. If women came to the sites on a date outside the visit window, the visit procedures were not performed. Further, mid-interval visits were added for severely immunosuppressed women (CD4 count < 100). The actual measurement times correspond to assessment dates and vary across participants.
Our interest is in studying the course of depression in the 753 women who had HIV infection at baseline and did not drop out of the study due to HIV-related death before the study end. Depression was measured using the Center for Epidemiologic Studies Depression Scale (CES-D). The CES-D includes 20 questions related to mood, each of which can take a value from 0 (symptom rarely present) to 3 (symptom almost always present); scores therefore range from 0 to 60. Because the distribution of CES-D for a general population can be very skewed, in practice transformations or nonparametric methods need to be applied (Radloff, 1977). In HIV research, a score of 16 or greater for CES-D is frequently used as a cutoff for clinical depression (Ickovics and others, 2001; Cook and others, 2004; Leserman, 2008). This can avoid the potential nonnormality problem for continuous CES-D data and can be useful for screening depression cases in study populations. As in the original analysis of the HERS CES-D data presented in Ickovics and others (2001), our proposed VCMs were originally motivated by the analysis of the dichotomized HERS CES-D data. However, we will also illustrate our framework using continuous HERS CES-D data and compare the results from the 2 analyses.
A challenge with the analysis of these HERS depression data is that dropout could be related to the disease progression and associated depressive symptoms. Figure 1 presents the Kaplan–Meier curves for the dropout time by race and baseline CD4 count. Only 173 women finished the 12 scheduled visits, and their dropout times are treated as administratively censored. We distinguish these women from those who dropped out prematurely due to reasons other than HIV-related death.
The remainder of this article is organized as follows. The proposed modeling framework is described in Section 2. Estimation procedures are detailed in Section 3. In Section 4, we apply our methods to the HERS depression data. Conclusions and discussion follow in Section 5.
| 2. VCMS FOR INFORMATIVE DROPOUT |
|---|
Suppose that the data come from $N$ individuals, and for the $i$th ($i = 1,\ldots,N$) individual, there is an outcome process $\{Y_i(t)\}$, where $t$ ($t \ge 0$) is the time since enrollment. Correspondingly, there is a $p$-dimensional covariate process $\{x_i(t)\}$ associated with $\{Y_i(t)\}$. In the absence of dropout, the conditional distribution of the variable $Y_i(t)$ given $x_i(t)$ can be described by a model $F$ with parameters $\theta$,

$$Y_i(t) \mid x_i(t) \sim F\{\theta; x_i(t)\}.$$

For example, if $Y_i(t)$ is continuous, we might assume that $F\{\theta; x_i(t)\}$ is a Gaussian process with mean function $\mu_i(t) = x_i(t)\beta$ and variance–covariance function $\mathrm{cov}\{Y_i(s), Y_i(t) \mid x_i(s), x_i(t)\} = V_i(s,t)$ ($s \le t$), where $\beta$ is a $p \times 1$ vector of regression coefficients. Parametric forms can be used for $V_i(s,t)$, for instance, $V_i(s,t) = \sigma^2 \exp(-\rho|t - s|)$. In this case, $\theta = (\beta^T, \sigma^2, \rho)^T$.
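The exponential covariance form mentioned above can be illustrated with a short sketch. The function name and the parameter names `sigma2` (variance) and `rho` (decay rate) are assumptions made for this example, and the numbers are arbitrary rather than taken from the HERS data.

```python
import math

def exp_cov_matrix(times, sigma2, rho):
    """Covariance matrix with (s, t) entry sigma2 * exp(-rho * |t - s|)."""
    return [[sigma2 * math.exp(-rho * abs(t - s)) for t in times] for s in times]

# Hypothetical visit times (years since enrollment) and parameter values.
V = exp_cov_matrix([0.0, 0.5, 1.0], sigma2=4.0, rho=0.8)
```

The diagonal entries equal `sigma2`, and the covariance decays as the gap between measurement times grows, which is why this form suits irregular measurement times.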
If $\{Y_i(t)\}$ is a binary process, we might assume that $F\{\theta; x_i(t)\}$ follows a marginalized transition model (MTM) (Heagerty, 2002), with marginal mean $g\{\mu^{M}_i(t)\} = x_i(t)\beta$, where $g(\cdot)$ is a link function. The serial dependence is modeled by the conditional mean of $Y_i(t)$ given its history $\mathcal{Y}_i(t-)$ before $t$ and the covariate process history $\mathcal{X}_i(t)$, that is, $\mu^{C}_i(t) = E\{Y_i(t) \mid \mathcal{Y}_i(t-), \mathcal{X}_i(t); \alpha\}$. Here, $\alpha$ is the dependence parameter vector and $\theta = (\beta^T, \alpha^T)^T$.
For the $i$th individual, let $T_i$ denote the administrative censoring time (or the scheduled study end) and let $D_i$ denote the dropout time. The observed data consist of the total follow-up time, $U_i = \min(D_i, T_i)$, and the indicator for dropout, $\delta_i = I(D_i < T_i)$. In other words, for individuals who are administratively censored (or finish the study), $D_i$ is right censored and $\delta_i = 0$; for individuals who drop out prematurely, $D_i$ is observed. At the time points $t_{i1},\ldots,t_{in_i}$ ($t_{in_i} \le U_i$), we also observe the outcome measurements $Y_i = \{Y_i(t_{i1}),\ldots,Y_i(t_{in_i})\}^T$.
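As a small illustration of how the observed follow-up time and dropout indicator arise from the dropout and administrative censoring times, consider the following sketch. The function name and the example times are hypothetical.

```python
def observe(dropout_time, censor_time):
    """Return (U, delta): follow-up time min(D, T) and dropout indicator I(D < T)."""
    u = min(dropout_time, censor_time)
    delta = 1 if dropout_time < censor_time else 0
    return u, delta

# A participant who drops out at 2.5 years in a study scheduled to end at 6 years,
# and one whose dropout time exceeds the scheduled end (administratively censored).
early = observe(2.5, 6.0)
completer = observe(7.0, 6.0)
```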
When $D_i \ge T_i$ for all $i$, it is not necessary to consider the dropout process while modeling the outcome process $\{Y_i(t)\}$. Otherwise, the dropout process is potentially informative. To deal with this situation, we assume that the full data comprise $\{Y_i(t), x_i(t), U_i, \delta_i\}$ and factor the joint distribution as

$$f(y, u, \delta \mid x) = f(y \mid u, \delta, x)\, f(u, \delta \mid x).$$

To induce the dependence between $y$ and $(u, \delta)$ in the first factor, we assume that

$$Y_i(t) \mid x_i(t), U_i = u, \delta_i = \delta \sim F\{\theta(u, \delta); x_i(t)\},$$

where $F$ is an appropriate outcome model and

$$\theta(u, \delta) = \delta\,\theta_1(u) + (1 - \delta)\,\theta_0(u).$$

Here, $\theta_1(\cdot)$ and $\theta_0(\cdot)$ are the vectors of functions for the dropout time $D_i$ and the administrative censoring time $T_i$, respectively. Therefore, the administratively censored individuals are distinguished from those who drop out by allowing them to have different model parameters for the outcome process. The second factor $f(u, \delta \mid x)$ can be specified using any distribution for event times, where the dependence on $x$ can be checked using standard event time regression analysis methods. In the HERS application, the covariates are discrete and we allow $f(u, \delta \mid x)$ to be completely unspecified within the levels of $x$.
Different assumptions can be made for the form of $\theta(u, \delta)$. For example, if $\theta(u, \delta)$ are constant in $u$, then the dropout process is ignorable and methods for modeling $Y_i(t)$ given $x_i(t)$ can be used without explicitly considering $(U_i, \delta_i)$. When the dropout/administrative censoring times and the values of $\theta(u, \delta)$ are discrete, we have a PMM (Little, 1993; Fitzmaurice and Laird, 2000). When the dropout/administrative censoring times are continuous and $\theta(u, \delta)$ are polynomial functions, we have the CLM (Wu and Bailey, 1989). The VCMs by Hogan and others (2004) generalized the CLM for continuous outcome data by allowing the mean parameters to be unspecified smooth functions. Unlike in Hogan and others (2004), our approach handles the administrative censoring differently from other outcome-related dropout. For example, in the HERS analysis reported in Section 4, we assume that $\theta_1(u)$ are unspecified smooth functions for the dropped-out individuals and $\theta_0(u)$ are constants for the administratively censored individuals. Furthermore, we extend the work in Hogan and others (2004) by allowing all model parameters to depend on $u$ and accommodating binary outcomes.
In a linear mixed model (LMM) for the HERS depression data introduced in Section 1, "missingness at random" (MAR) is assumed such that the conditional distribution of missing CES-D scores given the observed ones for those who remained in the study at u is the same as the corresponding conditional distribution for those who left the study at u (Molenberghs and others, 1998), that is,
|
|
For our VCM, if the marginal distribution of {Yi(t)|xi(t)} is of interest, we assume that conditional on u and the covariates, the outcome distribution after u can be characterized by the same parameters in the distribution for the observed data. For example, the time trend of the CES-D score estimated from the observed data can be extrapolated for the missing CES-D scores beyond u up to the study end.
Neither the assumption in the LMM nor the one in the VCM can be verified from the observed data. One advantage of the pattern-mixture modeling approach is that the extrapolation of the missing data is transparent, which makes substantive critique and empirical sensitivity analysis relatively straightforward (Little and Wang, 1996; Daniels and Hogan, 2000; Rotnitzky and others, 2001). For example, in the HERS analysis reported in Section 4, we can assume a different time slope for the CES-D scores beyond u. The sensitivity parameter would be the difference between the time slopes before u and beyond u, which cannot be identified by the observed data. Then, we can recompute the quantities of interest (such as the marginal CES-D profiles) to check their sensitivity to the nonidentifiable parameters. Because the unidentifiable part of the model is separated from the part that is identifiable from the observed data, in the VCM the inferences based on the observed data remain the same regardless of the sensitivity parameters.
Hogan and others (2004) developed varying-coefficient LMMs for continuous longitudinal data, where the mean parameters were allowed to depend on the dropout time, but the variance components were constants. In addition, they did not distinguish administrative censoring from other outcome-related dropouts. We generalize their model by allowing variance-component parameters to vary by the dropout/administrative censoring times.
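Recall that for the $i$th subject, $Y_i$ is an $n_i \times 1$ continuous outcome vector, $U_i = u$ is the observed dropout/administrative censoring time, and $\delta_i = 0, 1$ is the indicator for dropout. Let $x_i = \{x_i(t_{i1})^T,\ldots,x_i(t_{in_i})^T\}^T$ be the $n_i \times p$ exogenous covariate matrix associated with the fixed effects and $z_i = \{z_i(t_{i1})^T,\ldots,z_i(t_{in_i})^T\}^T$ be an $n_i \times q$ covariate matrix associated with the random effects. Conditional on $(U_i, \delta_i)$, we assume that

$$Y_i \mid b_i, x_i, z_i, U_i = u, \delta_i \sim N\big[x_i\beta_{\delta_i}(u) + z_i b_i,\; R_i\{\phi_{\delta_i}(u)\}\big], \qquad b_i \mid U_i = u, \delta_i \sim N\big[0,\; G_i\{\phi_{\delta_i}(u)\}\big], \qquad (2.1)$$

where $\beta_1(u)$ and $\beta_0(u)$ are 2 $p \times 1$ vectors of unknown regression coefficient functions and $\phi_1(u)$ and $\phi_0(u)$ are the vectors of unknown variance-component functions.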
We use a Cholesky decomposition for modeling the variance components as functions of u (Daniels and Zhao, 2003). Other formulations using multivariate normal distributions are possible; this one is chosen for convenience. Details are given in the supplementary material available at http://www.biostatistics.oxfordjournals.org.
In the HERS analysis reported in Section 4.1, we assume that $\beta_1(u)$ and $\phi_1(u)$ are unspecified smooth functions that are modeled by penalized splines. Because the administrative censoring times in these data are similar, we assume that $\beta_0(u)$ and $\phi_0(u)$ are constant functions. In practice, when study participants have staggered entry and the administratively censored individuals are a heterogeneous group with respect to the outcome distribution, we could also allow $\beta_0(u)$ and $\phi_0(u)$ to be unspecified smooth functions. Note that with variance components varying by the dropout/administrative censoring times, we need to pay attention to the effective number of parameters that are incorporated in the VCM (Spiegelhalter and others, 2002). If the results for variance-component functions suggest particular parametric forms, we could reduce the model complexity accordingly.
To illustrate the VCM for binary longitudinal data, we build on MTMs (Heagerty, 2002), where the mean and serial dependence structures, and their dependence on the dropout process, are separately specified.
Specifically, let $\mu^{M}_{ij}(u) = E\{Y_i(t_{ij}) \mid x_i(t_{ij}), U_i = u, \delta_i\}$ ($j = 1,\ldots,n_i$) and

$$g\{\mu^{M}_{ij}(u)\} = x_i(t_{ij})\,\beta_{\delta_i}(u), \qquad (2.2)$$

where $g(\cdot)$ is a link function, $x_i(t_{ij})$ is a $1 \times p$ covariate vector, and $\beta_1(u)$ and $\beta_0(u)$ are 2 $p \times 1$ vectors of unknown regression coefficient functions.
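Serial dependence between the outcomes within individuals follows an $r$th-order Markov model; that is,

$$\mu^{C}_{ij}(u) = E\{Y_i(t_{ij}) \mid Y_i(t_{i,j-1}),\ldots,Y_i(t_{i1}), x_i, U_i = u, \delta_i\} = E\{Y_i(t_{ij}) \mid Y_i(t_{i,j-1}),\ldots,Y_i(t_{i,j-r}), x_i, U_i = u, \delta_i\}.$$

The dependence structure is modeled via

$$\mathrm{logit}\{\mu^{C}_{ij}(u)\} = \Delta_{ij} + \sum_{l=1}^{r} \gamma_{ijl,\delta_i}(u)\, y_{i,j-l}, \qquad (2.3)$$

although in principle any valid link function can be used (Heagerty, 2002). Note that, for simplicity, the dependence of $\Delta_{ij}$ and $\gamma_{ijl,\delta_i}(u)$ on $x_i(t_{ij})$ is suppressed for now. The log-odds ratios $\gamma_{ijl,\delta_i}(u)$ measure the dependence between $Y_i(t_{ij})$ and $Y_i(t_{i,j-1}),\ldots,Y_i(t_{i,j-r})$ among those with $U_i = u$ and $\delta_i = \delta$; the intercept $\Delta_{ij}$ is determined such that the mean structure in (2.2) and the Markov dependence structure in (2.3) are simultaneously satisfied (Azzalini, 1994; Heagerty, 2002).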
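To make the role of the intercept concrete, the sketch below solves for it in the first-order case with a logit link: given the marginal mean at the current visit, the marginal mean at the previous visit, and a log-odds ratio, it finds the intercept under which the conditional model averages back to the required marginal mean. The names `solve_intercept`, `mu_m`, `mu_m_prev`, and `gamma` are assumptions for this illustration, the bisection solver is one simple choice among several, and the numbers are arbitrary.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def solve_intercept(mu_m, mu_m_prev, gamma, tol=1e-10):
    """Find the intercept d such that
    mu_m = expit(d + gamma) * mu_m_prev + expit(d) * (1 - mu_m_prev)."""
    def implied(d):
        return expit(d + gamma) * mu_m_prev + expit(d) * (1.0 - mu_m_prev)
    lo, hi = -30.0, 30.0          # implied(d) is increasing in d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if implied(mid) < mu_m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values: marginal depression probability 0.4 now, 0.5 at the previous visit.
delta = solve_intercept(mu_m=0.4, mu_m_prev=0.5, gamma=1.2)
```

When the log-odds ratio is zero, the conditional and marginal means coincide and the intercept reduces to the logit of the marginal mean.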
We further assume that the serial dependence $\gamma_{ijl,\delta_i}(u)$ can be modeled via

$$\gamma_{ijl,\delta_i}(u) = z_{i,l}(t_{ij})\,\alpha_{l,\delta_i}(u), \qquad (2.4)$$

where $z_{i,l}(t_{ij})$ is a subset of the covariates $x_i(t_{ij})$, while $\alpha_{l,1}(u)$ and $\alpha_{l,0}(u)$ are 2 $d_l \times 1$ ($l = 1,\ldots,r$) vectors of unknown functions of $u$. For example, if $\gamma_{ijl,1}(u) = \alpha_{l1,1}(u) + \alpha_{l2,1}(u) \cdot Z_i$, where $Z_i$ is a treatment group indicator, individuals for whom $Z_i = 1$ are allowed to have different serial dependence compared with individuals for whom $Z_i = 0$, given that they drop out at $u$.
As with the VCM for the continuous HERS depression data, we assume that each element of the MTM parameters is an unspecified smooth function of u for individuals who dropped out of the HERS, while for administratively censored individuals, we assume that the MTM parameters are constant in u.
| 3. ESTIMATION |
|---|
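Suppose $\omega$ indexes the dropout/administrative censoring time distribution $f(u, \delta \mid x; \omega)$, and let $\Theta$ denote the set of parameters in the VCM for the outcome process. The likelihood from the $i$th individual can be partitioned as

$$L_i(\Theta, \omega) = f(y_i \mid x_i, u_i, \delta_i; \Theta)\, f(u_i, \delta_i \mid x_i; \omega).$$

If the priors for $\Theta$ and $\omega$ are independent, it follows that $\omega$ is not a part of the posterior for $\Theta$. The inference for $f(u, \delta \mid x; \omega)$ can be based on the marginal likelihood $\prod_{i=1}^{N} f(u_i, \delta_i \mid x_i; \omega)$, whereas the inference for $\Theta$ is based on $\prod_{i=1}^{N} f(y_i \mid x_i, u_i, \delta_i; \Theta)$.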
In the HERS analysis reported in Section 4.1, the set of parameters $\Theta$ here includes those indexing the smooth functions for the regression coefficients and variance components when the dropout time is observed and the parameter vector (constant in $u$) when the dropout time is administratively censored. Using the same notation as in Section 2.1, the log-likelihood associated with the continuous outcome process can be written, up to an additive constant, as

$$\ell(\Theta) = -\frac{1}{2}\sum_{i=1}^{N}\Big[\log|V_i| + \{y_i - x_i\beta_{\delta_i}(u_i)\}^T V_i^{-1}\{y_i - x_i\beta_{\delta_i}(u_i)\}\Big],$$

where $V_i = z_i G_i\{\phi_{\delta_i}(u_i)\} z_i^T + R_i\{\phi_{\delta_i}(u_i)\}$.
In the HERS analysis reported in Section 4.1, we assume a first-order serial dependence structure. The likelihood contribution for the $i$th individual corresponding to the model in (2.2–2.4) can be written as

$$L_i(\Theta) = \{\mu^{M}_{i1}(u_i)\}^{y_{i1}}\{1 - \mu^{M}_{i1}(u_i)\}^{1 - y_{i1}} \prod_{j=2}^{n_i}\{\mu^{C}_{ij}(u_i)\}^{y_{ij}}\{1 - \mu^{C}_{ij}(u_i)\}^{1 - y_{ij}}.$$

The smooth functions in (2.1) and (2.2–2.4) are modeled by Bayesian penalized splines with low-rank thin-plate spline bases (Ruppert and others, 2003; Crainiceanu and others, 2005).
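The low-rank thin-plate spline representation of a scalar smooth function $\theta(\cdot)$ is

$$\theta(u; \eta) = \psi_0 + \psi_1 u + \sum_{k=1}^{K} \nu_k |u - \kappa_k|^3, \qquad (3.1)$$

where $\eta = (\psi_0, \psi_1, \nu_1, \ldots, \nu_K)^T$ is a vector of regression coefficients and $\kappa_1 < \cdots < \kappa_K$ are fixed knots. We set $\kappa_k$ at the $k/(K + 1)$ sample quantile of the $u$'s (Ruppert, 2002; Ruppert and others, 2003; Crainiceanu and others, 2007). Let $\psi = (\psi_0, \psi_1)^T$, $\nu = (\nu_1, \ldots, \nu_K)^T$, $U_1 = (1, u)$, $U_2 = (|u - \kappa_1|^3, \ldots, |u - \kappa_K|^3)$, and $\Omega$ be a $K \times K$ matrix whose $(l,k)$th entry is $|\kappa_l - \kappa_k|^3$. Using the reparameterization $\tilde{\nu} = \Omega^{1/2}\nu$ and $\tilde{U}_2 = U_2\Omega^{-1/2}$, (3.1) can be rewritten as $\theta(u; \eta) = U_1\psi + \tilde{U}_2\tilde{\nu}$.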
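A minimal sketch of the spline representation in its original (not reparameterized) form is given below: knots are placed at sample quantiles of the observed follow-up times and the smooth function is evaluated as a linear term plus cubic radial terms. The function names, the linear-interpolation quantile rule, and all numeric values are assumptions made for this example.

```python
def knots_at_quantiles(u_values, K):
    """Place knot k at the k/(K+1) sample quantile (linear interpolation between order statistics)."""
    xs = sorted(u_values)
    n = len(xs)
    knots = []
    for k in range(1, K + 1):
        pos = (n - 1) * k / (K + 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        knots.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return knots

def spline_value(u, linear_coefs, radial_coefs, knots):
    """Evaluate c0 + c1*u + sum_k b_k * |u - kappa_k|**3."""
    c0, c1 = linear_coefs
    return c0 + c1 * u + sum(b * abs(u - kappa) ** 3
                             for b, kappa in zip(radial_coefs, knots))

# Hypothetical dropout times (years) and coefficients.
u_obs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
knots = knots_at_quantiles(u_obs, K=3)
value = spline_value(2.0, linear_coefs=(1.0, 0.5),
                     radial_coefs=(0.1, -0.2, 0.05), knots=knots)
```

In a Bayesian fit the radial coefficients would be shrunk toward zero by their prior, which is what controls the smoothness of the estimated curve.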
In the HERS analysis reported in Section 4, we assign to $\psi$ independent normal priors with mean zero and large variance and to $\tilde{\nu}$ the prior $N(0, \tau \cdot I)$, where $I$ is a $K \times K$ identity matrix. Estimating the smoothing parameter $\tau$ is similar to estimating variance components in Bayesian hierarchical models (Gelman, 2006), and the curve estimation by penalized splines can be sensitive to the choice of prior for $\tau$. Crainiceanu and others (2007) discussed this issue and found that inverse-Gamma priors can be used in practice when certain conditions are met such that the posterior inference of $\tau$ is insensitive to the hyperparameters in the prior for $\tau$. In our applications, we use inverse-Gamma priors for $\tau$ and the estimated curves fit the observed data reasonably well. Additional analyses using Uniform priors for $\tau^{1/2}$ give similar results for curve estimation. Therefore, we only present the results with inverse-Gamma priors for $\tau$.
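In our VCM approach, we leave the dropout/administrative censoring time distribution $f(u_i, \delta_i \mid x_i; \omega)$ completely unspecified and use Rubin's (1981) BB (Kim and others, 2005) to obtain the posterior for $P(U = u_i, \delta = \delta_i \mid x_i)$.
We now briefly describe the BB procedure. Suppose $\mathcal{U} = (U_1, \ldots, U_N)$ is a random sample from an unknown distribution. For simplicity, we assume that there are no ties in $\mathcal{U}$. The BB posterior for $\pi_i = P(U = U_i)$ can be obtained by

$$p(\pi_1, \ldots, \pi_N \mid \mathcal{U}) \propto L(\pi_1, \ldots, \pi_N; \mathcal{U})\; p(\pi_1, \ldots, \pi_N).$$

Because there is only one observation at each $U_i$, the empirical likelihood is given by

$$L(\pi_1, \ldots, \pi_N; \mathcal{U}) = \prod_{i=1}^{N} \pi_i.$$

Using a noninformative prior $\prod_{i=1}^{N} \pi_i^{-1}$ for $(\pi_1, \ldots, \pi_N)$, we have Rubin's BB posterior

$$(\pi_1, \ldots, \pi_N) \mid \mathcal{U} \sim \mathrm{Dirichlet}(1, \ldots, 1).$$

In the HERS analysis reported in Section 4, at each iteration of the MCMC we then simulate $P(U = u_i, \delta = \delta_i \mid x_i)$ from Dirichlet(1,...,1) for each combination of the discrete covariates.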
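One draw from Dirichlet(1,...,1) can be generated by normalizing independent unit-rate exponential variates, a standard construction. The sketch below uses only the Python standard library; the function name, seed, and stratum size are assumptions for illustration, and the authors' own implementation in WinBUGS may differ.

```python
import random

def bb_weights(n, rng):
    """Draw one Dirichlet(1,...,1) vector by normalizing n independent Exp(1) draws."""
    g = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(g)
    return [x / total for x in g]

rng = random.Random(2009)   # arbitrary seed for reproducibility
w = bb_weights(5, rng)      # weights for a hypothetical stratum of 5 individuals
```

Each call yields a fresh set of nonnegative weights summing to one over the observed dropout/administrative censoring times in a covariate stratum.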
To obtain inference on the marginal mean $E(Y_i \mid x_i)$, the empirical averages $\sum_{i=1}^{N^*} P(U = u_i, \delta = \delta_i \mid x_i)\, E(Y_i \mid x_i, u_i, \delta_i)$ can be computed using the posterior samples of $P(U = u_i, \delta = \delta_i \mid x_i)$ and $E(Y_i \mid x_i, u_i, \delta_i)$ from the VCM, where $N^*$ is the sample size corresponding to a specific combination of the discrete covariate values.
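The averaging step can be sketched as follows: for each posterior draw of the conditional means within one covariate stratum, a fresh Dirichlet(1,...,1) weight vector is drawn and the weighted sum gives one posterior draw of the marginal mean. The function name, the input layout, and the numbers are hypothetical; in practice the conditional means would come from the MCMC output of the fitted VCM.

```python
import random

def marginal_mean_draws(cond_mean_draws, rng):
    """cond_mean_draws[m][i] is posterior draw m of the conditional mean for
    individual i in one covariate stratum. Returns one marginal-mean draw per m."""
    out = []
    for draw in cond_mean_draws:
        # Dirichlet(1,...,1) weights via normalized Exp(1) variates.
        g = [rng.expovariate(1.0) for _ in draw]
        total = sum(g)
        out.append(sum(gi / total * mi for gi, mi in zip(g, draw)))
    return out

rng = random.Random(1)
# Two hypothetical posterior draws for a stratum of three individuals.
draws = marginal_mean_draws([[0.2, 0.5, 0.9], [0.1, 0.4, 0.6]], rng)
```

Summaries of these draws, such as means and quantiles, then reflect uncertainty in both the outcome model and the dropout time distribution.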
For example, in (2.1), when the identity link is used for modeling the mean structure and $f(u, \delta \mid x) = f(u, \delta)$, the marginal covariate effects can be approximated by $\sum_{i=1}^{N} P(U = u_i, \delta = \delta_i)\,\beta_{\delta_i}(u_i)$. However, when other link functions are used for the mean structure and/or $f(u, \delta \mid x) \ne f(u, \delta)$, the marginal covariate effects might not be readily available. Here, the effect of covariate difference $x - x'$ is

$$E(Y \mid x) - E(Y \mid x') = \sum_{\delta \in \{0,1\}} \int g^{-1}\{x\beta_{\delta}(u)\}\, dF(u, \delta \mid x) - \sum_{\delta \in \{0,1\}} \int g^{-1}\{x'\beta_{\delta}(u)\}\, dF(u, \delta \mid x'),$$

which cannot be simplified to $(x - x')E_{U,\delta}\{\beta_{\delta}(u)\}$ because of its dependence on $x$. In a simple scenario with treatment groups and measurement times as the covariates, we can compute $\sum_{i=1}^{N^*} P(U = u_i, \delta = \delta_i \mid x_i)\, E(Y_i \mid x_i, u_i, \delta_i)$ and plot summaries of the posteriors to demonstrate the marginal covariate effects. For other more complicated situations with many confounders or a number of quantitative covariates of interest, a simple summary of the marginal effects in PMMs might not be immediately obtainable (Fitzmaurice and Laird, 2000).
The prior specification for Bayesian penalized splines is discussed in Section 3.2. For the constant parameter vector in the administrative censoring group, independent vague normal priors with mean zero and large variance are assigned. We use MCMC to sample from the posterior distributions; summary statistics such as posterior means and 95% credible bands are then used for inference. The MCMC is implemented in the WinBUGS package (version 1.4.1) and its development interface (WBDev; Spiegelhalter and others, 2003). The programs for the HERS analysis reported in Section 4 are provided in the supplementary material available at Biostatistics online.
| 4. APPLICATION: THE HERS STUDY |
|---|
Our goal is to describe the depression changes over time by baseline characteristics, such as race (Black, White, Latina including others) and baseline CD4 count (CD4 > 200), for the 753 women who did not suffer HIV-related deaths in the HERS. We first present the analysis using the continuous CES-D data and then analyze the binary data using the cutoff CES-D ≥ 16 to define clinical depression.
We fit 3 models. The first is an LMM assuming that $(Y_{ij} \mid b_i, X_{ij}, Z_{ij}) \sim N(X_{ij}\beta + Z_{ij}b_i, \sigma^2)$, where $Y_{ij}$ is the CES-D score at time $t_{ij}$; $X_{ij}$ is the covariate vector that includes intercept, race, baseline CD4 count, time, and the interaction between time and baseline CD4 count; and $Z_{ij}$ is the covariate vector associated with a random intercept and a random time slope $b_i = (b_{i1}, b_{i2})^T$. In addition, we fit 2 VCMs. The first has unspecified smooth functions of $u$ in the mean structure only (VCM1). Specifically, it is assumed that $(Y_{ij} \mid b_i, X_{ij}, Z_{ij}, u, \delta_i) \sim N(X_{ij}\beta_{\delta_i}(u) + Z_{ij}b_i, \sigma^2)$, where $\beta_0(u) = \beta_0$ is a vector that is constant in $u$ for the administrative censoring group. The second VCM also includes the variance components as smooth functions of $u$ (VCM2). Further details about variance-component parametrization, prior specification, and posterior inference can be found in the supplementary material available at Biostatistics online.
We first focus on the results from VCM2. Additional results from the LMM and VCM1 as well as an example of sensitivity analysis based on VCM2 can be found in the supplementary material available at Biostatistics online. Figure 2 gives the results for β1(u) and β0. The intercept and race effects are fairly constant over u. The main effect of baseline CD4 count is decreasing as u increases. The main effect of time has a downward trend toward zero, which suggests that for the group with baseline CD4 ≤ 200 earlier dropout was associated with steeper change in expected CES-D scores over time. The interaction between time and baseline CD4 count increases toward zero, which shows that the positive time slopes for the group with baseline CD4 > 200 are less steep than those for the group with baseline CD4 ≤ 200. Overall, we expect that VCM2 will adjust the expected CES-D profiles upward and the adjustment might differ between the baseline CD4 groups.
Figure 3 shows the estimated smooth functions for the variance components. Since later dropouts are generally associated with more observations within patients, we would expect that the estimated variability of the random intercepts, random slopes, and residual errors decreases as u increases. As seen from Figure 3, this is true for the estimated random-slope standard deviation (SD) and the error SD; but the estimated random-intercept SD increases as u increases. This upward trend for random-intercept SD suggests that those early dropouts in the HERS might be a more homogeneous group in terms of their baseline CES-D levels. Overall, all estimated variance components are not constant over u. In fact, by allowing the variance components to vary by u, the within-individual correlation structure in VCM2 is different from the one in VCM1 (see the supplementary material available at Biostatistics online for variance-component estimates from the LMM and VCM1). It is well known that with complete data and likelihood-based approaches, properly modeling the within-individual correlation structure can affect the variability estimates more than the point estimates of the mean regression coefficients (Diggle and others, 2002
|
Table 1 gives the results for the intercept and the time slope in estimated CES-D profile by race and baseline CD4 count. The intercept estimates are close across all models, which is expected because in the early study period the influence of dropout is minimal. However, both VCMs adjust the time slope estimates upward compared with the LMM, where for the group with baseline CD4 ≤ 200, the adjustment is larger and the time slopes are changed completely to be positive. Therefore, without taking into account the dropout process such as in the LMM, we might incorrectly conclude that both baseline CD4 groups had downward trends for the CES-D scores over time. Further, both VCMs give similar time slope estimates for the group with baseline CD4 > 200, but the point estimates and variability estimates of the time slopes in the group with baseline CD4 ≤ 200 differ between the VCMs. This might be explained by the different levels of missingness between the baseline CD4 groups. Allowing the variance components to vary by u in VCM2 therefore might have larger impact on the point and variability estimates for the group with baseline CD4 ≤ 200.
In summary, regardless of baseline CD4 count, we observed that Whites and Latinas (including others) had larger CES-D scores over time than Blacks. For all races, the expected CES-D scores for the patients with baseline CD4 ≤ 200 were increasing over time, while the CES-D scores for those with baseline CD4 > 200 were decreasing over time.
First we fit an MTM(1). The same set of covariates as in Section 4.1 is used. The marginal mean of depression follows $\mathrm{logit}(\mu^{M}_{ij}) = X_{ij}\beta$, and the dependence structure is assumed to follow a first-order Markov model with constant serial dependence $\mathrm{logit}(\mu^{C}_{ij}) = \Delta_{ij} + \gamma \cdot y_{i,j-1}$. We then fit the varying-coefficient MTM(1) as in Section 2.2. The mean structure for the dropout group follows

$$\mathrm{logit}\{\mu^{M}_{ij}(u_i)\} = X_{ij}\,\beta_1(u_i), \qquad (4.1)$$

while the dependence structure follows $\mathrm{logit}\{\mu^{C}_{ij}(u_i)\} = \Delta_{ij} + \gamma(u_i) \cdot y_{i,j-1}$. Further details are provided in the supplementary material available at Biostatistics online.
For individuals with observed dropout times, Figure 4 presents the estimated smooth functions. The main effect for baseline CD4 count shows a downward trend over u. The main effect for time decreases to approximately zero as u increases, which again suggests that earlier dropout was associated with larger time slope in depression probability for the group with baseline CD4 ≤ 200. The interaction between time and baseline CD4 count increases toward zero, which suggests that the group with baseline CD4 > 200 had time slopes that vary less over u. Overall, we expect that the VCM could adjust the marginal probability profiles of depression upward for the group with baseline CD4 ≤ 200. The within-individual serial dependence is positive and increases slightly as u increases. Note that in Figure 4 the corresponding estimates from the administrative censoring group are close to those at the right boundary of the observed dropout times.
Table 2 presents the estimated marginal covariate effects from the fitted MTM(1) assuming MAR. The estimated interaction between time and baseline CD4 count is positive, which means that regardless of race, the group with baseline CD4 ≤ 200 had steeper decline in depression prevalence over time. Based on substantive knowledge and the results in Section 4.1, this is not sensible and may be an artefact of selection bias due to informative dropout. Marginal probability profiles estimated from the VCM by race and baseline CD4 count are presented in Figure 4 of the supplementary material available at Biostatistics online. For the group with baseline CD4 ≤ 200, the VCM adjusts the marginal probability of depression, and the downward trends shown in the MTM(1) under MAR are moved upward. The adjustment for the group with baseline CD4 > 200 is minimal. As a result, the estimated interaction between time and baseline CD4 count becomes negative in the VCM, which is shown by the difference between marginal probability profiles in Figures 5 and 6 of the supplementary material available at Biostatistics online. Note that after averaging over the dropout/administrative censoring time distribution, the effects of race, baseline CD4 count, and time are no longer independent. Results for the administrative censoring group are also given in the supplementary material available at Biostatistics online.
In summary, regardless of baseline CD4 count, we observed that Latinas (including others) had higher prevalence of depression over time than Blacks and Whites. Within race groups, the patients with baseline CD4 ≤ 200 had depression prevalence over time similar to that of the patients with baseline CD4 > 200; unlike the results based on continuous CES-D data, the depression prevalence remained relatively constant over time for all race and baseline CD4 groups. It should be noted that the analyses based on continuous and binary CES-D data focused on different scientific questions. With continuous CES-D data, we are interested in the covariate effects on the absolute levels of CES-D, while with binary CES-D data the targets are the covariate effects on the prevalence of clinical depression. We have seen that in both cases the race effects are similar, but the baseline CD4 effects differ.
| 5. DISCUSSION |
|---|
We have proposed a Bayesian VCM approach for longitudinal data with continuous-time informative dropout. Our framework assumes that the parameters in the outcome process depend on the dropout time through unspecified functions, where administratively censored dropout times are handled separately and no modeling of the continuous dropout time distribution is needed in order to obtain inference for marginal covariate effects. While the VCM is widely applicable, we used both continuous and binary data from an HIV longitudinal study to show that our approach has the potential to adjust for selection biases induced by early dropouts of poor responders.
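The marginalization step summarized above can be illustrated with a minimal sketch. It assumes a hypothetical function `cond_mean(u)` that returns the outcome mean conditional on dropout time `u` (standing in for a single posterior draw from a fitted model), and it averages that function over Bayesian bootstrap draws of the dropout time distribution, placing Dirichlet(1, ..., 1) weights on the observed times as in Rubin (1981). The function name, arguments, and toy numbers are illustrative assumptions, not the authors' implementation; in particular, this simplified version ignores administrative censoring and does not pair bootstrap draws with MCMC draws.

```python
import numpy as np


def bayesian_bootstrap_marginal(cond_mean, dropout_times, n_draws=1000, seed=0):
    """Marginalize a conditional mean over the dropout time distribution.

    cond_mean     : callable u -> E[Y | dropout time = u] (hypothetical stand-in
                    for one posterior draw of the conditional outcome model)
    dropout_times : observed dropout times
    Returns an array of n_draws marginal means, one per bootstrap draw.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(dropout_times, dtype=float)
    # Conditional mean evaluated at each observed dropout time.
    values = np.array([cond_mean(ui) for ui in u], dtype=float)
    # Bayesian bootstrap: Dirichlet(1, ..., 1) weights on the observed times,
    # so the dropout time distribution itself is never modeled parametrically.
    weights = rng.dirichlet(np.ones(len(u)), size=n_draws)  # shape (n_draws, n)
    # Each row of weights gives one draw of the mixture over dropout times.
    return weights @ values


# Toy usage with made-up numbers: mean outcome decreases with later dropout.
toy_times = [0.5, 1.0, 2.5, 4.0, 6.0]
toy_draws = bayesian_bootstrap_marginal(lambda u: 20.0 - 1.5 * u, toy_times, n_draws=500)
print(toy_draws.mean(), np.percentile(toy_draws, [2.5, 97.5]))
```

The spread of the returned draws reflects only uncertainty about the dropout time distribution; uncertainty in the conditional model would enter by repeating the calculation across posterior draws of `cond_mean`.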
Our VCM approach provides a convenient framework for sensitivity analysis because the unidentifiable part of the model can be distinguished from the identifiable part and for the latter the inferences remain the same regardless of the sensitivity parameters. In our analysis of the HERS depression data, we emphasized that sensitivity analysis should be based on those parameters that cannot be identified by the observed data. More in-depth research on this aspect is needed, building on general sensitivity analysis strategies developed for PMMs (Scharfstein and others, 1999; Daniels and Hogan, 2000; Molenberghs and others, 2003). For example, informative priors for sensitivity parameters can be introduced using expert opinions and/or prior elicitation based on previous studies (Lee, 2007).
Appropriate summary of marginal covariate effects is a challenge in the pattern-mixture modeling approach to informative dropout. In practice, we might prefer to specify the marginal covariate effects directly in the model. Thus, approaches to marginalizing PMMs are worth further research (Wilkins and Fitzmaurice, 2006; Wilkins and Fitzmaurice, 2007; Roy and Daniels, 2007). In our ongoing research, we plan to extend the VCM for binary data by separately specifying the marginal model and the conditional model given the dropout/administrative censoring time, while constraints are imposed such that they are satisfied simultaneously.
In our VCM, we distinguished administrative censoring from other dropout. In the HERS application, we assumed that for the administrative censoring group the outcome model parameters do not vary with the administrative censoring times but are distinct from the parameters in the dropout group. However, our VCM specification is flexible, and in practice, similar unspecified smooth functions can also be used to capture the heterogeneity within the administrative censoring group with respect to the outcome process. This is particularly useful when study participants have staggered entry and the observed administrative censoring times vary considerably. When there is no dropout, variation in these administrative censoring times is usually ignorable. However, when informative dropout is present, for example, in the context of the HERS analysis, it is possible that the longer a participant stays on a study without dropping out, the less steep the participant's true depression trend is likely to be. In this situation, modeling the relationship between the outcome process and the administrative censoring times would be necessary.
Outcome-related death mixed with dropout is another problem that warrants further research. Because extrapolating the missing data beyond death is inappropriate, instead of modeling the marginal mean of the outcomes, a more meaningful quantity of interest would be the mean of the longitudinal outcomes conditional on being alive (Kurland and Heagerty, 2005). When the survival information is available, we could extrapolate the missing data in the VCM up to the observed survival times for summarizing marginal covariate effects. If survival times are censored, further work on joint modeling is needed.
| FUNDING |
|---|
The National Institutes of Health (R01-AI-50505, R01-HL-79457); the US Centers for Disease Control and Prevention (U64-CCU10675). Funding to pay the Open Access publication charges for this article was provided by Medical Research Council (UK) grant U.1052.00.009.
| ACKNOWLEDGMENTS |
|---|
The authors thank Jeffrey Blume, Mike Daniels, Constantine Gatsonis, Patrick Heagerty, Tony Lancaster, the editor, and the referee for helpful comments and suggestions. Conflict of Interest: None declared.
| REFERENCES |
|---|
Azzalini A. Logistic regression for autocorrelated data with application to repeated measures. Biometrika (1994) 81:767–775.
Cook J, Grey D, Burke J, Cohen M, Gurtman A, Richardson J, Wilson T, Young M, Hessol N. Depressive symptoms and AIDS-related mortality among a multisite cohort of HIV-positive women. American Journal of Public Health (2004) 94:1133–1140.
Crainiceanu C, Ruppert D, Carroll RJ. Spatially adaptive penalized splines with heteroscedastic errors. Journal of Computational and Graphical Statistics (2007) 16:265–288.
Crainiceanu C, Ruppert D, Wand M. Bayesian analysis for penalized spline regression using WinBUGS. Journal of Statistical Software (2005) 14:1–24.
Daniels M, Hogan J. Reparameterizing the pattern mixture model for sensitivity analyses under informative dropout. Biometrics (2000) 56:1241–1248.
Daniels M, Hogan J. Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis (2008) Monographs on Statistics and Applied Probability, Volume 101. New York: Chapman & Hall.
Daniels M, Zhao Y. Modelling the random effects covariance matrix in longitudinal data. Statistics in Medicine (2003) 22:1631–1647.
Diggle P, Heagerty P, Liang K-Y, Zeger S. Analysis of Longitudinal Data (2002) New York: Oxford University Press.
Diggle P, Kenward M. Informative dropout in longitudinal data analysis (with discussion). Applied Statistics (1994) 43:49–93.
Fitzmaurice GM, Laird NM. Generalized linear mixture models for handling nonignorable dropouts in longitudinal studies. Biostatistics (2000) 1:141–156.
Follman D, Wu MC. An approximate generalized linear model with random effects for informative missing data. Biometrics (1995) 51:151–168.
Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis (2006) 1:515–533.
Hastie T, Tibshirani R. Varying-coefficient models. Journal of the Royal Statistical Society, Series B (1993) 55:757–796.
Heagerty P. Marginalized transition models and likelihood inference for longitudinal categorical data. Biometrics (2002) 58:342–351.
Henderson R, Diggle P, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics (2000) 1:465–480.
Hogan JW, Laird NM. Mixture models for the joint distribution of repeated measures and event times. Statistics in Medicine (1997a) 16:239–257.
Hogan JW, Laird NM. Model-based approaches to analyzing incomplete longitudinal and failure time data. Statistics in Medicine (1997b) 16:259–272.
Hogan JW, Lin X, Herman B. Mixtures of varying coefficient models for longitudinal data with discrete or continuous non-ignorable dropout. Biometrics (2004) 60:854–864.
Ickovics JR, Hamburger ME, Vlahov D, Schoenbaum EE, Schuman P, Boland RJ, Moore J, for the HIV Epidemiology Research Study Group. Mortality, CD4 cell count decline, and depressive symptoms among HIV-seropositive women. Journal of the American Medical Association (2001) 285:1466–1474.
Kenward M, Molenberghs G. Parametric models for incomplete continuous and categorical data. Statistical Methods in Medical Research (1999) 8:51–83.
Kim Y, Lee J, Kim J. Bayesian bootstrap analysis of doubly censored data using Gibbs sampler. Statistica Sinica (2005) 15:969–980.
Kurland B, Heagerty P. Directly parameterized regression conditioning on being alive: analysis of longitudinal data truncated by deaths. Biostatistics (2005) 6:241–258.
Lee JY. Sensitivity analysis and informative priors for longitudinal binary data with dropout. PhD Thesis (2007) Providence, RI: Brown University.
Leserman J. Role of depression, stress and trauma in HIV disease progression. Psychosomatic Medicine (2008) 70:539–545.
Little R. Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association (1993) 88:125–134.
Little R. A class of pattern-mixture models for normal incomplete data. Biometrika (1994) 81:471–483.
Little R. Modeling the dropout mechanism in repeated measures studies. Journal of the American Statistical Association (1995) 90:1112–1121.
Little R, Wang Y. Pattern-mixture models for multivariate incomplete data with covariates. Biometrics (1996) 52:98–111.
Molenberghs G, Michiels B, Kenward MG, Diggle PJ. Monotone missing data and pattern-mixture models. Statistica Neerlandica (1998) 52:153–161.
Molenberghs G, Thijs H, Kenward M, Verbeke G. Sensitivity analysis of continuous incomplete longitudinal outcomes. Statistica Neerlandica (2003) 57:112–135.
Radloff L. The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological Measurement (1977) 1:385–401.
Rotnitzky A, Scharfstein D, Su T, Robins J. Methods for conducting sensitivity analysis of trials with potentially nonignorable competing causes of censoring. Biometrics (2001) 57:103–113.
Roy J, Daniels MJ. A general class of pattern mixture models for nonignorable dropout with many possible dropout times. Biometrics (2007) 64:538–545.
Rubin D. The Bayesian bootstrap. Annals of Statistics (1981) 9:130–134.
Ruppert D. Selecting the number of knots for penalized splines. Journal of Computational and Graphical Statistics (2002) 11:735–757.
Ruppert D, Wand M, Carroll R. Semiparametric Regression (2003) Cambridge: Cambridge University Press.
Scharfstein D, Robins J, Rotnitzky A. Adjusting for nonignorable nonresponse using semiparametric nonresponse models with time dependent covariates (with discussion). Journal of the American Statistical Association (1999) 94:1096–1146.
Smith D, Warren D, Vlahov D, Schuman P, Stein M, Greenberg B, Holmberg S. Design and baseline participant characteristics of the human immunodeficiency virus epidemiology research (HER) study: a prospective cohort study of human immunodeficiency virus infection in US women. American Journal of Epidemiology (1997) 146:459–469.
Spiegelhalter D, Best N, Carlin B, van der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B (2002) 64:583–639.
Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS Version 1.4 User Manual. (2003) Cambridge: Medical Research Council Biostatistics Unit.
Ten Have T, Kunselman A, Pulkstenis E, Landis J. Mixed effects logistic regression models for longitudinal binary response data with informative drop-out. Biometrics (1998) 54:367–383.
Tsiatis A, Davidian M. Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica (2004) 14:809–834.
Verbeke G, Molenberghs G, Thijs H, Lesaffre E, Kenward M. Sensitivity analysis for nonrandom dropout: a local influence approach. Biometrics (2001) 57:7–14.
Wilkins KJ, Fitzmaurice GM. A hybrid model for nonignorable dropout in longitudinal binary responses. Biometrics (2006) 62:168–176.
Wilkins KJ, Fitzmaurice GM. A marginalized pattern-mixture model for longitudinal binary data when nonresponse depends on unobserved responses. Biostatistics (2007) 8:297–305.
Wu M, Bailey K. Estimation and comparison of changes in the presence of informative right censoring: conditional linear model (corr:v46 p889). Biometrics (1989) 45:939–955.
Wu M, Carroll R. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics (1988) 44:175–188.
Wulfsohn M, Tsiatis A. A joint model for survival and longitudinal data measured with error. Biometrics (1997) 53:330–339.
Received July 17, 2008; revised November 17, 2008; revised March 18, 2009; revised August 6, 2009; accepted for publication September 14, 2009.