Biostatistics Advance Access originally published online on January 20, 2006
Biostatistics 2006 7(3):469-485; doi:10.1093/biostatistics/kxj019
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Estimation in regression models for longitudinal binary data with outcome-dependent follow-up
Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue and Brigham and Women's Hospital, Boston, MA, USA fitzmaur{at}hsph.harvard.edu
Medical University of South Carolina, Charleston, SC, USA
School of Public Health, University of North Carolina, Chapel Hill, NC, USA
Dana Farber Cancer Institute, Boston, MA, USA
University of Miami School of Medicine, Miami, FL, USA
* To whom correspondence should be addressed.
| SUMMARY |
|---|
|
|
|---|
In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of pre-specified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity. In this paper, we consider likelihood-based estimation of the regression parameters in marginal models for longitudinal binary data when the follow-up times are not fixed by design, but can depend on previous outcomes. In particular, we consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation is that the follow-up time process can be ignored when making likelihood-based inferences about the marginal regression model parameters. That is, maximum likelihood (ML) estimation of the regression parameters relating the probability of success at a given time to covariates does not require that a model for the distribution of follow-up times be specified. However, to obtain consistent parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. In general, ML estimation requires specification of all higher-order moments and the likelihood for a marginal model can be intractable except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood for estimation of the marginal model parameters. The pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. Results from a simulation study and a study of asymptotic bias are presented. Finally, we illustrate the main results using data from a longitudinal observational study that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.
Keywords: Follow-up time process; Generalized estimating equations; Maximum likelihood; Multinomial distribution; Pseudolikelihood
| 1. INTRODUCTION |
|---|
|
|
|---|
In many observational studies, individuals are followed over time, and the outcome variable of interest is measured repeatedly at different follow-up times. Unlike designed longitudinal studies, individuals are not necessarily measured at a set of common time points specified by the study protocol, but instead are measured at irregular intervals. Furthermore, the follow-up times may depend on the previous outcome measures. For example, consider a recent longitudinal observational study (Lipshultz et al., 1995
|
In this particular study, each individual has a set of random follow-up times and it is reasonable to conjecture that the follow-up times might depend on previous outcomes. That is, the physician of a patient with an abnormal left ventricular mass measurement may demand more frequent echocardiograms and at shorter follow-up intervals. For example, in this study, the median time to the first follow-up visit was 2.6 years for patients who initially had normal left ventricular mass, and 1.7 years for patients who initially had abnormal left ventricular mass (log-rank test, p-value < 0.025).
The focus of this paper is on estimation of regression parameters in marginal models for longitudinal binary data where the follow-up times are not fixed by design, but can depend on previous outcomes. We discuss assumptions about the distribution of follow-up times that result in the likelihood function for the observed data separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation of the likelihood function is that the former process can be ignored when making likelihood-based inferences about the latter. That is, maximum likelihood (ML) estimation of the regression parameters does not require that a model for the distribution of follow-up times be specified. If the follow-up time process is considered to be a form of selection, then there are very close parallels with ignorable missing data mechanisms (Little and Rubin, 1987
). However, to obtain consistent regression parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. ML estimation requires specification of all higher-order moments and it is prohibitively difficult to compute the likelihood function for marginal models except in cases where the number of repeated measurements is small. When the likelihood is intractable, one alternative is the generalized estimating equations (GEE) approach (Liang and Zeger, 1986
). However, when the follow-up times depend on previous outcomes, the standard GEE approach yields biased parameter estimates (Lipsitz et al., 2002
). To circumvent problems with intractable likelihoods for marginal models and the potential bias of standard GEE methods, we propose a pseudolikelihood to estimate the regression parameters when the follow-up times depend on previous outcomes. The pseudolikelihood uses a linear approximation to the conditional distribution of the response at any occasions, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. Recently, Lin et al. (2004)
have proposed a weighted GEE approach for handling outcome-dependent follow-up that also avoids specification of the joint distribution for the vector of repeated outcomes. In their weighted GEE, contributions to the estimating equations are weighted by the inverse of the intensity-of-visit process. The weighted GEE approach makes weaker assumptions about the follow-up time process but requires correct specification of a model for the follow-up time process. In addition, it can incorporate auxiliary, internal time-dependent covariates into the model for the follow-up time process. In contrast, the proposed pseudolikelihood estimator makes stronger assumptions about the follow-up time process but does not require specification of a model for the follow-up time process.
In the following, we refer to the model for the outcome of interest as the measurement model. Furthermore, given that the follow-up times are generated from some underlying distribution, we also refer to the follow-up time process. In Section 2, we introduce some notation and consider models for the measurement and follow-up time processes. We discuss assumptions regarding the follow-up time process and consider likelihood-based inferences about the measurement model. A pseudolikelihood approach is proposed to circumvent difficulties with intractable likelihoods for marginal models for longitudinal binary data. In Section 3, we demonstrate how misspecification of the model for the within-subject association among the repeated binary responses will, in general, result in regression parameter estimates that are asymptotically biased. The results of a simulation study, assessing the potential magnitude of the bias in finite samples, and also comparing estimators in terms of variance and coverage probabilities, are presented in Section 3. Finally, we illustrate the main results using data from the longitudinal study of the cardiotoxic effects of doxorubicin chemotherapy introduced earlier.
| 2. MODELS FOR MEASUREMENT AND FOLLOW-UP PROCESSES |
|---|
|
|
|---|
Suppose that n independent individuals are to be followed up intermittently over an interval [0,
], where the constant
> 0 is determined with the knowledge that outcome measurements could potentially be observed up to time
. It is assumed that the outcome at time t, denoted by Yi(t), is binary, i.e. Yi(t) equals 1 if the ith individual has response 1 (say success) at time t, and 0 otherwise. Associated with Yi(t), each individual has a (J x 1) covariate vector Xi(t) which can include both time-stationary and time-varying covariates.
A logistic regression model is assumed for the marginal mean of Yi(t),
![]() | (2.1) |
where ß is the vector of logistic regression parameters. Although a logit link function is adopted, in principle any suitable link function can be used. Note that if Xi(t) includes time-varying covariates, they are assumed to be external covariates in the sense described by Kalbfleisch and Prentice (1980)
. Specifically, it is assumed that any time-varying covariate process at time t is conditionally independent of all previous responses, given the past history of the covariate process. When this assumption is violated, the validity and interpretability of the model given by (2.1) is questionable (see, for example, Fitzmaurice et al., 1993
; Pepe and Anderson, 1994
; Robins et al., 1999
). Since Yi(t) is binary, the variance function over time is determined completely by the marginal mean, i.e.
![]() | (2.2) |
We denote the correlation function by
![]() | (2.3) |
and it depends upon the higher-order moments; this is specified in terms of an additional set of parameters,
.
Next, let Ni(t) denote a counting process for the number of follow-up measurements on the ith individual by continuous time point t
0. The indicator random variable dNi(t) equals 1 if a follow-up measurement is obtained on the ith individual at follow-up time t, and equals 0 otherwise. Note that all the information about the times of follow-up is contained in the counting process Ni(t), or equivalently, in dNi(t). Thus, the ith individual (i = 1, ..., n) has a set of binary responses at Mi = Ni(
) random follow-up times.
Finally, we consider the distribution of the follow-up times and their dependence on the outcome measurement function {Yi(t)}. Let
and
denote the history of the outcome measurement and the follow-up time process through time t; further, let
Let
denote the history of the observed outcome measurement process through time t. Also, let Yi(t) = {Yi(s): t
s
}. Next, we assume that the conditional distribution of dNi(t), given
and
depends only on the history of the observed outcome measurements at the previous times, and perhaps on
For example, patients with a history of poor health outcomes on previous visits may be expected to have shorter times of follow-up. Specifically, we assume that
![]() | (2.4) |
Thus, (2.4) implies that follow-up at time t depends only on the previously observed data (i.e. the previously observed outcomes and the times at which they were observed). Equation (2.4) is identifiable only to the extent that it asserts other possible endogenous covariates, say W, do not affect the timing of the measurements. In principle, the conditional distribution of the follow-up times can follow any time-to-event distribution. For reasons that will become apparent in Section 2.2, we do not consider particular families of distributions for dNi(t).
Recall that in most longitudinal studies the parameters of primary interest are ß, and sometimes
; the parameters of the follow-up time distribution, say
, are usually regarded as nuisance parameters. Indeed, in many longitudinal studies, both
and
might be regarded as nuisance parameters. In this section, we discuss estimation of (ß,
).
The assumption about the conditional distribution of the follow-up times given by (2.4) implies the following conditional independence relationship for the distribution of the outcome measurement process at time t,
![]() | (2.5) |
which follows by expressing
![]() |
and substituting from (2.4). That is, the conditional distribution of the outcome measurement process at time t, given the history of the observed outcome measurement process and the visit history, is the same for those individuals who were followed up as for those who were not. Note, however, that
due to the assumed dependence of visit times on the history of the observed outcome measurement process.
Since (2.5) holds under the assumption about the conditional distribution of the follow-up time process given by (2.4), the log-likelihood for the observed data is
|
|
Assuming that
and (ß,
) are separable, the ML estimate of (ß,
) is obtained by maximizing the second term on the right-hand side of (2.6) and solving
|
|
for
Thus, given (2.4) and the assumption that
and (ß,
) are functionally distinct sets of parameters, (ß,
) can be estimated ignoring the distribution of the follow-up times by maximizing
|
|
Using standard results from likelihood theory, and under suitable regularity conditions, it can be shown that the ML estimate
is consistent and asymptotically multivariate normal, with asymptotic covariance matrix approximated by the inverse of the Fisher information matrix.
To formulate the likelihood, we must specify the conditional distributions
![]() | (2.9) |
In general, this requires specification of the success probabilities (2.1) and correlations (2.3), in addition to higher-order moments. In principle, given (ß,
), specification of these higher-order moments is relatively straightforward (e.g. using the Bahadur, 1961
, representation for the multinomial distribution). However, because the number of higher-order moments increases rapidly with the number of repeated measurements, in practice, it is prohibitively difficult to compute the likelihood function for marginal models except in cases where the number of repeated measurements is relatively small.
Although ML estimation yields consistent parameter estimates when the follow-up times depend on previous outcomes, it does require that the multinomial distribution for the vector of repeated binary outcomes be correctly specified. Because the likelihood is intractable except when Mi is small, we propose a pseudolikelihood (PL) approach to circumvent these difficulties. The proposed pseudolikelihood is a modification of the likelihood to reduce its complexity and is based on approximating the conditional distributions in (2.8). Specifically, the proposed pseudo log-likelihood is
|
|
![]() | (2.11) |
where pi(t) is the marginal probability at time t given by (2.1). This is a Gaussian approximation for
i(t), with
i(t) taking the same form as the conditional mean from the multivariate normal distribution. Thus, the pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only.
An interesting special case of this pseudolikelihood arises when an exponential correlation,
i(u, t) =
|u|, is assumed. Then, under (2.11), the conditional distribution of Yi(t), given the history of previous responses,
depends only on the most recently observed value of the response prior to t,
This is a first-order Markov-type dependence, and it can be shown that the pseudo log-likelihood,
![]() | (2.12) |
yields consistent estimators of (ß,
) under the following conditional independence assumption,
![]() | (2.13) |
Assumption (2.13) is stronger than assumption (2.4) and is not verifiable from the data at hand (see Appendix) unless further parametric modeling assumptions are made [e.g. with additional modeling assumptions of the kind discussed in Diggle and Kenward, 1994
, assumption (2.13) becomes weakly testable]. It implies that the conditional distribution of the outcome measurement process at time t, given the previous observed outcome, is the same for those individuals who were followed up at time t as for those who were not. Thus, when assumption (2.13) holds, maximization of (2.12) yields pseudo maximum likelihood estimates (PMLEs), (
,
), which are consistent (see Appendix) provided the true conditional distribution
has been correctly specified in (2.12), even though
![]() |
When the true conditional distribution does satisfy the first-order Markov-type assumption,
![]() |
then the PMLE is the ML estimate of (ß,
).
Next, for the more general case given by (2.10) and (2.11), we derive the pseudoscore vector denoted by
|
|
where
is specified by (2.11),
and
We note that PL estimators can be derived from (2.10) and (2.11) under more general assumptions about the correlation structure, e.g. second-order antedependence. The PMLE
is obtained as the solution to
e.g. using the NewtonRaphson algorithm. However, in general, the inverse of the pseudo-Fisher information matrix does not yield a valid estimate of the asymptotic variance. Instead, a so-called empirical or robust estimator of variance is required. The appropriate adjustment takes the usual form of the sandwich estimator,
![]() |
![]() | (2.14) |
The variance estimate is obtained by replacing (ß,
) with
in (2.14).
It must be recognized that the PL estimator of ß (and
) is consistent only if it can be shown that
at the true ß and
. Under the assumption about the conditional distribution of the follow-up times given by (2.4),
has mean equal to 0 at the true ß and
provided
is a linear function of the previous responses. In the bivariate case, the linear approximation of
i(t) can be shown to be exact provided that the covariance matrix has been correctly specified. However, in the multivariate case this no longer holds and the adequacy of the approximation will depend on two factors: (i) how well
i(t) is approximated by a linear function of the previous responses and (ii) how closely
approximates the true underlying covariance of the responses. We examine the adequacy of the approximation in Section 3 with studies of asymptotic and finite-sample bias.
Finally, we note that although the proposed PL estimator requires a Gaussian approximation of
i(t), ß and
are estimated jointly by maximizing (2.10). An alternative approach is to make a Gaussian approximation for
i(t) and, in addition, decouple estimation of ß from the estimation of
(see, for example, Hand and Crowder, 1996
). This is precisely the approach taken in the modified GEE based on Gaussian estimation, as suggested by Lipsitz et al. (1992)
, Lee et al. (1999
, unpublished manuscript), and Lipsitz et al. (2000)
. The modified GEE uses the multivariate normal estimating equations for estimation of the correlation parameters,
, combined with the standard GEE for the marginal mean regression parameters, ß, thereby decoupling estimation of ß from estimation of
. The use of the multivariate normal estimating equations for
ensures that the estimated correlation matrix is non-negative definite, while avoiding full specification of the multinomial joint distribution of the responses. Lipsitz et al. (2000)
have shown that this modified GEE yields little bias when there are missing data that are assumed to be missing at random. As mentioned earlier, if the follow-up time process is considered to be a form of selection, then there are very close parallels with missing at random missing data mechanisms (Little and Rubin, 1987
). In Sections 3 and 4, we explore the properties of the proposed PL estimator and compare it to the standard and modified (or decoupled) GEE estimators of the regression parameters in marginal models.
| 3. STUDIES OF BIAS |
|---|
|
|
|---|
In this section, we consider the asymptotic and finite-sample bias resulting from the different methods when follow-up times are random and depend on previous outcomes. Our main concern is with the potential bias in estimating ß. We consider the PL estimator and compare it to both the standard and modified (or decoupled) GEE estimators (Lipsitz et al., 2000
We consider a two-group longitudinal design configuration with a binary response measured on four occasions. We assume that half of the individuals are assigned to each of the two groups, i.e. the group indicator variable, xi, equals 0 or 1 with pr(xi = 1) = 0.5. The marginal model for the mean of Yi(t) is
![]() | (3.15) |
and an autoregressive correlation is assumed,
![]() | (3.16) |
for
= 0.2 or 0.4.
It is assumed that each individual is seen four times, with the first time point set equal to Ti1 = 0. A detailed description of the specification of the joint distribution of the responses and the follow-up times can be found as supplementary material available at Biostatistics online (www.biostatistics.oupjournals.org). We specify the conditional distributions of the follow-up times as first-order, second-order, or third-order, depending on how many of the previous outcomes the follow-up time depends. Finally, the distribution for the measurement model is specified through a series of conditional distributions, given the past history of responses, via the Bahadur representation (Bahadur, 1961
); the Bahadur representation of those distributions is specified such that (2.15) and (2.16) hold. We formulate two sets of conditional distributions. The first set is the first-order dependence model, in which the distribution of the outcome at a given time depends only on the previous outcome; a Markov-type dependence. The second set is the full Bahadur model, in which the distribution of the outcome at a given time depends on all previous outcomes. A more detailed description can be found as supplementary material available at Biostatistics online.
In the study of asymptotic bias, we fix (ß0, ß1, ß2, ß3) = (1, 1, 1, 1) and set
= 0.2 or 0.4. Thus, we specify two models for the outcome distribution, full Bahadur and first-order dependence, and three models for the follow-up time distribution, first-order, second-order, and third-order.
Letting
denote an estimate from one of the methods described in Section 2.3,
where
is not necessarily equal to ß. Here we are primarily interested in assessing the asymptotic biases,
Following Rotnitzky and Wypij (1994)
, the asymptotic bias can be ascertained by simply considering an artificial sample comprised of one suitably weighted observation for each possible realization of the outcomes and follow-up times. Then, we can solve for
in the usual way, except that each pseudoindividual's contribution to the estimating equations is weighted by its respective probability. We examine the asymptotic bias of the various estimators of ß, and explore how different specifications of the working correlation with GEE can affect the asymptotic bias. The three working correlation structures considered are independence [
i(u, t) = 0], exchangeable [
i(u, t) =
], and the correct model, first-order autoregressive.
Table 2 displays the asymptotic bias for the grouptime interaction effect (ß3), the effect usually of most interest in a longitudinal study. It can be seen that the PL estimator is unbiased under a first-order dependence measurement model, regardless of the nature of the follow-up time process. The PL estimator has relatively small bias (<5%) under the full Bahadur measurement model, even when the follow-up process is second-order or third-order. In contrast, the bias of the modified GEE is large under the naive assumption of independence, and is also relatively large under the (incorrectly specified) exchangeable correlation structure. The bias of these two GEE estimators increases for larger values of the correlation parameter
. When the correctly specified (AR1) correlation structure is assumed, the bias of the modified GEE is minimal and comparable to the bias of the PL estimator. Thus, use of the correct model for the correlation with the modified GEE appears to reduce greatly the bias (when compared to the GEE with incorrect correlation structure). Overall, the results in Table 2 indicate that the proposed pseudolikelihood approach has little bias, even in the configurations when it might be expected to have bias (full Bahadur and second-order or third-order follow-up). In addition, the results indicate that the modified GEE requires careful modeling of the correlation structure in order to minimize bias.
|
To complement the study of asymptotic bias reported in Section 3.1, we conducted a simulation study using a similar design configuration. Specifically, we considered a two-group longitudinal design configuration with a binary response measured on three occasions, as opposed to four occasions in the study of asymptotic bias. We considered samples of size n = 400, with half of the individuals assigned to each of the two groups. The marginal model for the mean of Yi(t) is
![]() |
where xi is the group indicator variable and (ß0, ß1, ß2, ß3) = (1, 1, 1, 1). An autoregressive correlation was assumed,
![]() |
for
= 0.2, 0.4, or 0.6. Following the design used in Section 3.1, we specified two models for the outcome distribution, first-order and full Bahadur, and two models for the follow-up distribution, first-order and second-order. For each combination of the outcome and follow-up time distributions, we performed 1000 replicate simulations.
Tables 3, 4, and 5 display the relative bias, variance, and coverage probabilities of nominal 95% confidence intervals for the grouptime interaction effect (ß3). In terms of bias (see Table 3), there is very discernible bias for the GEE under the naive assumption of independence and under the incorrectly specified exchangeable correlation structure. In particular, the bias increases with larger values of the correlation parameter
. For
= 0.4, the relative bias of the working independence GEE is approximately 100%; when
= 0.6, the relative bias exceeds 400%. For
= 0.4, the relative bias of the working exchangeable correlation GEE is approximately 50%; when
= 0.6, the relative bias exceeds 100%. In contrast, the finite-sample bias of the working autoregressive correlation GEE and pseudolikelihood estimates is modest, with relative bias less than 10% for almost all the design configurations. These results complement the earlier results from the study of asymptotic bias, where it was found that the proposed pseudolikelihood approach, or use of the correct correlation model with the modified GEE, greatly reduce the bias when compared to the GEE with incorrect correlation structure. The results of the simulation study suggest that the bias of the GEE with incorrect correlation structure can be very substantial in finite samples.
|
|
|
In terms of variance (see Table 4), the GEE under the naive assumption of independence and under the incorrectly specified exchangeable correlation structure, performed poorly, especially for larger values of
. In particular, the working independence GEE estimator was found to be very inefficient, with large increases in the variance as
increased from 0.2 to 0.6. In contrast, the variances of the working autoregressive correlation GEE and pseudolikelihood estimators were discernibly smaller. In terms of both variance and mean-squared error, the pseudolikelihood estimator, in general, performed somewhat better than the working autoregressive correlation GEE, although there were a few configurations where the working autoregressive correlation GEE performed better.
Finally, in term of coverage probabilities for nominal 95% confidence intervals, the pseudolikelihood estimator had adequate coverage in 11 of the 12 configurations considered; in one configuration the coverage probability was approximately 90%. Coverage probabilities for the working autoregressive correlation GEE were less impressive. In all configurations, the coverage probabilities were less than 95%, dropping below 90% in 7 of the 12 configurations. Coverage probabilities for the GEE with incorrect correlation structure were poor, with coverage probabilities decreasing with increasing
. In particular, when
= 0.6, coverage probabilities ranged from 51.7 to 79.5%.
| 4. APPLICATION: LONGITUDINAL STUDY OF CARDIOTOXICITY |
|---|
|
|
|---|
In this section, we illustrate the key results using data from the longitudinal observational study that motivated this paper. Doxorubicin chemotherapy has been very effective in the treatment of acute lymphoblastic leukemia in children, with 5-year survival rates of at least 80% (Schorin et al., 1994
Abnormalities of the left ventricular mass of the heart were determined by examining echocardiograms from n = 111 children and adults who had received cumulative doses from between 45 to 550 mg of doxorubicin per square meter of body-surface area for the treatment of acute lymphoblastic leukemia in childhood. The main covariates of interest are the cumulative dose (dichotomized at 300 mg; 1 = 300 mg or more, 0 = less than 300 mg), age at the end of treatment (ranging from 1.5 to 20 years), time since the end of chemotherapy to a given heart left ventricular mass measurement (ranging from 0 to 13.9 years), and gender (1 = female, 0 = male). We note that clinical investigators suspect that the previous outcome is the main determinant of the timing of future visits. Thus, the timing of the patient visit is assumed to depend on the previous left ventricular mass status. This is an important assumption that is required for appropriate inference from subsequent analyses.
We fit the following logistic regression model for abnormal left ventricular mass of the heart at time t,
![]() |
To minimize the possible bias in estimating ß, we fit a highly parameterized model for the within-subject association. However, attempts to obtain the ML estimates for the full Bahadur model with over 100 parameters for the within-subject association resulted (not surprisingly) in convergence problems for the algorithms used (NewtonRaphson and NelderMead). Instead, we fit the conditional probability approximation given in (2.11), with autoregressive (or, more precisely, exponential) correlation pattern
![]() |
For illustrative purposes, we also used the GEE approach, implemented using the SAS procedure PROC GLIMMIX, with the following correlation structures: independence, exchangeable, and autoregressive (2.1).
Table 6 displays the estimates of ß obtained using the pseudolikelihood and GEE approaches. For the GEE with a working independence correlation structure, standard errors of the estimates of ß were based on the empirical or so-called robust variance estimator (Huber, 1967
; White, 1982
) in order to correct for the potential misspecification of the covariance. Overall, all approaches yielded similar estimates for the gender and dose effects. However, the approaches produced discernibly different estimates for the interaction between age and time. In particular, there is a 42% difference between the PL-AR1 and GEE-IND; a 61% difference between the PL-AR1 and GEE-CS estimates; and a 25% difference between the PL-AR1 and GEE-AR1 estimates.
|
Also, for these data,
using the pseudolikelihood and
using the GEE approach with AR1 correlation. Thus, as might be expected, observations close in time are very similar, and there is strong evidence that the repeated measures are not independent. Because there was empirical evidence that follow-up time appeared to depend on the previous outcomes (e.g. the median time to the first follow-up visit was 1.7 years for patients with abnormal initial left ventricular mass measurements, and 2.6 years for patients with normal initial left ventricular mass measurements), it should not be surprising that the estimates of the interaction effect differ when using the pseudolikelihood and GEE approaches. | 5. CONCLUSION |
|---|
|
|
|---|
In this paper, we consider ML estimation of regression parameters in marginal models for longitudinal binary data when the follow-up times are random and depend on previous outcomes. Under this follow-up time process, we have shown that ML estimation of the regression parameters does not require that a model for the distribution of follow-up times be specified. However, the model for the entire joint distribution of the repeated measures must be correctly specified. Thus, any misspecification of the higher-order moments can result in biased estimates of the regression parameters.
When follow-up depends on previous outcomes, ML estimation requires specification of all higher-order moments. However, it is prohibitively difficult to compute the likelihood function for marginal models for longitudinal binary data except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood to estimate the regression parameters. The pseudolikelihood uses a linear approximation to the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. The results from a study of asymptotic bias and a simulation study demonstrate that the proposed pseudolikelihood approach, or use of the correct correlation model with modified GEEs, greatly reduce the bias when compared to the GEE with incorrect correlation structure. This underscores the need to model carefully the correlation structure when follow-up depends on previous outcomes.
A caveat of the proposed pseudolikelihood approach is that it requires strong assumptions on either the follow-up time process [assumption (2.13)] or the measurement model. When the follow-up times depend only on the previous outcome, the PMLE is consistent provided the true conditional distribution
has been correctly modeled, even though
On the other hand, when the true conditional distribution does satisfy the first-order Markov-type dependence,
the PMLE is the ML estimate and is consistent even when assumption (2.13) does not hold, but the follow-up times depend on any of the previous outcomes [assumption (2.4)]. We note that the Bahadur dependence discussed in the study of asymptotic bias may not always be adequate for correctly specifying the conditional distribution of the current response, given the past history of responses. For example, when the true binary measurement process depends on random effects, this would induce a non-Markov dependence marginally (when averaged over the distribution of the random effects). However, we conjecture that the bias of the PL estimator may be modest unless any departures from these assumptions are relatively extreme. The simulation study results lend partial support to this conjecture.
Finally, we note that the results for both ML and PL estimation presented in this paper can be readily extended to the case where the longitudinal outcome is categorical with more than two levels (e.g. ordinal and nominal responses).
| APPENDIX |
|---|
|
|
|---|
Under assumption (2.13), the PMLE of (ß,
) is consistent.
Sketch of Proof. The pseudoscore function s(ß,
) can be expressed as
![]() |
where
![]() |
![]() |
The PMLE of ß (and
) is consistent, under suitable regularity conditions, if it can be shown that E[s(ß,
)] = 0 at the true (ß,
). Recall that assumption (2.13) implies the following conditional independence relationship,
![]() |
and thus
![]() |
Hence, under assumption (2.13) and provided the model for
i(t; ß,
) has been correctly specified, s(ß,
) has mean equal to 0 at the true ß and
.
Finally, we note that because
is clearly not estimable from the observed data, assumption (2.13) is not verifiable from the data at hand.
| ACKNOWLEDGMENTS |
|---|
We thank the Associate Editor and the referees for their helpful comments and suggestions. We are grateful for the support provided by National Institutes of Health grants GM 29745, GM 70335, HL 69800, HL 53392, HL 78522, HL 72705, AI 60373, CA 74015, CA 70101, CA 68484, CA 79060, and HR 96041. We also thank the Dana-Farber Cancer Institute Childhood Acute Lymphoblastic Leukemia Consortium for permission to use the data from the cardiotoxicity study. Conflict of Interest: None declared.
| REFERENCES |
|---|
|
|
|---|
-
BAHADUR, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. In Solomon, H. (ed), Studies in Item Analysis and Prediction. Stanford Mathematical Studies in the Social Sciences VI. Palo Alto, CA: Stanford University Press, pp. 158168.
DIGGLE, P. J. AND KENWARD, M. G. (1994). Informative dropout in longitudinal data analysis (with discussion). Applied Statistics 43, 4993.[CrossRef]
FITZMAURICE, G. M., LAIRD, N. M. AND ROTNITZKY, A. (1993). Regression models for discrete longitudinal responses (with discussion). Statistical Science 8, 284309.
HAND, D. J. AND CROWDER, M. J. (1996). Practical Longitudinal Data Analysis. London: Chapman and Hall.
HUBER, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1. Berkeley, CA: University of California Press, pp. 221233.
KALBFLEISCH, J. D. AND PRENTICE, R. L. (1980). The Statistical Analysis of Failure Time Data. New York: John Wiley.
LEE, H., LAIRD, N. M. AND JOHNSTON, G. (1999). Combining GEE and REML for estimation of generalized linear models with incomplete multivariate data (Unpublished).
LIANG, K. Y. AND ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 1322.
LIN, H., SCHARFSTEIN, D. O. AND ROSENHECK, R. A. (2004). Analysis of longitudinal data with irregular, outcome-dependent follow-up. Journal of the Royal Statistical Society Series B 66, 791813.[CrossRef]
LIPSHULTZ, S. E., COLAN, S. D., GELBER, R. D., PEREZ-ATAYDE, A. R., SALLAN, S. E. AND SANDERS, S. P. (1991). Late cardiac effects of doxorubicin therapy for acute lymphoblastic leukemia in childhood. New England Journal of Medicine 324, 808815.[Abstract]
LIPSHULTZ, S. E., LIPSITZ, S. R., MONE, S. M., GOORIN, A. M., SALLAN, S. E. AND SANDERS, S. P. (1995). Female sex and higher drug dose as risk factors for late cardiotoxic effects of doxorubicin therapy for childhood cancer. New England Journal of Medicine 332, 17381743.
LIPSITZ, S. R., FITZMAURICE, G. M., IBRAHIM, J. G., GELBER, R. AND LIPSHULTZ, S. (2002). Parameter estimation in longitudinal studies with outcome-dependent follow-up. Biometrics 58, 5059.
LIPSITZ, S. R., LAIRD, N. M. AND HARRINGTON, D. P. (1992). A three-stage estimator for studies with repeated and possibly missing binary outcomes. Applied Statistics 41, 203213.
LIPSITZ, S. R., MOLENBERGHS, G., FITZMAURICE, G. M. AND IBRAHIM, J. (2000). GEE with Gaussian estimation of the correlations when data are incomplete. Biometrics 56, 528536.[CrossRef][Web of Science][Medline]
LITTLE, R. J. AND RUBIN, D. B. (1987). Statistical Analysis with Missing Data. New York: Wiley & Sons.
PEPE, M. S. AND ANDERSON, G. L. (1994). A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Communications in Statistics 23, 939951.
ROBINS, J. M., GREENLAND, S. AND HU. F.-C. (1999). Estimation of the causal effect of a time-varying exposure on the marginal mean of a repeated binary outcome. Journal of the American Statistical Association 94, 687712.[CrossRef]
ROTNITZKY, A. AND WYPIJ, D. (1994). A note on the bias of estimators with missing data. Biometrics 50, 11631170.[CrossRef][Web of Science][Medline]
SCHORIN, M. A., BLATTNER, S., GELBER, R. D., TARBELL, N. J., DONNELLY, M., DALTON, V., COHEN, H. J. AND SALLAN, S. E. (1994). Treatment of childhood acute lymphoblastic leukemia: results of Dana-Farber Cancer Institute/Children's Hospital Acute Lymphoblastic Leukemia Consortium Protocol 85-01. Journal of Clinical Oncology 12, 740747.[Abstract]
WHITE, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica 50, 126.[CrossRef][Web of Science]
Received May 29, 2003; revised May 25, 2004; revised February 11, 2005; revised June 16, 2005; revised December 8, 2005; revised January 4, 2006; accepted for publication January 18, 2006.
![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
R. K. Ghanta, R. Chen, N. Narayanasamy, S. McGurk, S. Lipsitz, F. Y. Chen, and L. H. Cohn Suture bicuspidization of the tricuspid valve versus ring annuloplasty for repair of functional tricuspid regurgitation: Midterm results of 237 consecutive patients J. Thorac. Cardiovasc. Surg., January 1, 2007; 133(1): 117 - 126. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||









is approximated by








for the grouptime interaction effect (ß3 = 1)







