Biostatistics Advance Access originally published online on June 20, 2006
Biostatistics 2007 8(2):297-305; doi:10.1093/biostatistics/kxl010
A marginalized pattern-mixture model for longitudinal binary data when nonresponse depends on unobserved responses
Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA and Division of General Medicine, Brigham and Women's Hospital, 1620 Tremont Street, 3rd Floor, Boston, MA 02120-1613, USA fitzmaur{at}hsph.harvard.edu
* To whom correspondence should be addressed.
| SUMMARY |
|---|
This paper proposes a method for modeling longitudinal binary data when nonresponse depends on unobserved responses. The proposed method presumes that the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates, and can accommodate both monotone and non-monotone missingness. The approach involves a marginally specified pattern-mixture model that directly parameterizes both the marginal means at each occasion and the dependence of each response on indicators of nonresponse pattern. This formulation readily incorporates a variety of nonresponse processes assumed within a sensitivity analysis. Once identifying restrictions have been made, estimation of model parameters proceeds via solution to a set of modified generalized estimating equations. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks, while featuring certain advantages of each. The paper concludes with application of the method to data from a contraceptive clinical trial with substantial dropout.
Keywords: Binary data; Dropout; Longitudinal method; Marginal regression; Missing data; Nonresponse
| 1. INTRODUCTION |
|---|
Missing data are a common feature of longitudinal designs that complicate their analyses. For example, a year-long clinical trial comparing two doses of a contraceptive regimen (Machin and others, 1988
A range of methodologies for handling longitudinal data subject to nonresponse has been proposed, accommodating both general patterns of missingness and monotone patterns of dropout. When the focus is upon the marginal means of discrete responses subject to nonresponse that is not missing at random (NMAR), both likelihood-based (e.g. Baker, 1995; Molenberghs and others, 1997; Ekholm and Skinner, 1998) and quasi-likelihood (e.g. Rotnitzky and others, 1998; Fitzmaurice and Laird, 2000) methods have been proposed. One aspect that all models for NMAR nonresponse must address is the inescapable need to postulate assumptions about how the probability of nonresponse may depend on unobserved responses. Unless supplemental information is available, the observed data provide no means to verify such assumptions. Consequently, sensitivity analysis under a range of assumptions about the missingness process is recommended.
Joint models for the "full" data (i.e. complete responses, both observed and unobserved, and indicators of missingness) can be broadly classified into "selection" and "pattern-mixture" models (Little, 1993). Selection models factor the joint distribution into the marginal distribution of the complete responses and the conditional distribution of missingness indicators given observed and unobserved responses. In contrast, pattern-mixture models factor the joint distribution into the conditional distribution of responses given missingness pattern and the marginal distribution of missingness indicators. In either case, identifying constraints are required when the models are applied to the (incomplete) observed data. When marginal means are the target of inference, selection models hold the advantage of directly parameterizing the marginal distribution of the responses, whereas parameters in pattern-mixture models have interpretations "conditional" on patterns of nonresponse. Pattern-mixture models have an advantage when adopting assumptions concerning the dependence between unobserved responses and nonresponse. Stratification by patterns of nonresponses makes clear how the observed data lack information sufficient to estimate certain parameters; the latter require identifying constraints. The assumptions and implications concerning unobserved responses entailed in such constraints are transparent, in contrast to those for selection models.
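The two factorizations described above can be written side by side. The notation here (Y for the complete responses, R for the missingness indicators, X for the covariates) is chosen for illustration and drops the subject index:

```latex
\begin{align*}
\text{Selection:} \quad & f(Y, R \mid X) = f(Y \mid X)\, f(R \mid Y, X) \\
\text{Pattern-mixture:} \quad & f(Y, R \mid X) = f(Y \mid R, X)\, f(R \mid X)
\end{align*}
```

In the selection form the first factor is the marginal distribution of the responses, which is why its parameters have marginal interpretations; in the pattern-mixture form the first factor is stratified by R, which is why the unidentified pieces are easy to locate.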
We propose a method that retains advantageous features of both selection and pattern-mixture models. We develop a marginally specified pattern-mixture model for longitudinal binary data that directly estimates the marginal mean of each response, given covariates, while adopting assumptions about the dependence between unobserved responses and missingness indicators. The approach formulates models for the marginal means (similar to selection models) yet constrains these means via models conditional on nonresponse pattern (similar to pattern-mixture models) that characterize the nonresponse process. The approach is semiparametric, avoiding the proliferation of nuisance parameters in full likelihood approaches (e.g. Wilkins and Fitzmaurice, 2006). Higher-order moments, and their dependence on nonresponse pattern, are left unspecified; as a result, sensitivity analysis is simplified appreciably. Modified generalized estimating equations enable subsequent estimation of marginal mean regression parameters.
| 2. MARGINALIZED PATTERN-MIXTURE MODEL |
|---|
Let Yi = (Yi1,...,YiT)' denote the T × 1 vector of binary responses for the ith subject. Let Xi = (xi1,...,xiT)' denote the T × p matrix of covariates, which we consider to be fully observed (e.g. external or fixed by study design). Let Ri = [Rij] = [I(Yij is observed)] be a vector of missingness indicators. Recall that pattern-mixture models factor the joint distribution for the complete data as f(Yi,Ri|Xi) = f(Yi|Xi,Ri)f(Ri|Xi).
The proposed method directly models the marginal probabilities of each response as well as the marginal probabilities of nonresponse, but avoids fully specifying nuisance characteristics of the data. An additional aspect of the model captures the association between each response and missingness indicators. Specifically, model formulation involves three components:
- (i) Marginal model for the mean of the jth response, Yij: E(Yij|xij),
- (ii) Marginal model for nonresponse pattern, Ri: f(Ri|Xi), and
- (iii) Conditional model for association between each Yij and Ri: E(Yij|Ri,xij).
To specify (i), we link the marginal mean of each response, µij = E(Yij|xij), to covariates xij:

$$\operatorname{logit}(\mu_{ij}) = x_{ij}'\beta \qquad (2.1)$$

To specify (ii) and (iii), we represent each unique missingness pattern possible for the ith subject by a single scalar value ri = r(Ri). We set r = 0 to indicate a fully observed response vector, and let r = 0,1,...,K - 1 index missingness patterns. We allow the marginal distribution of the nonresponse pattern, πi = [Pr(ri = r(Ri) = r)] = [πir], to depend on covariates Xi, with parameters γ. For component (iii), characterizing the missingness process, we specify a model for the mean of Yij, conditional on nonresponse pattern:

$$\operatorname{logit}\{\mu_{ij}(r)\} = \operatorname{logit} E(Y_{ij} \mid r_i = r, x_{ij}) = \Delta_{ij} + z_{ij}(r)'\alpha \qquad (2.2)$$

The model given by (2.2) specifies how the mean response for subjects exhibiting nonresponse differs from the complete cases. This parametrization defines Δij implicitly as a function of θ = (β', α')', πi, and covariates xij, due to the relationship between (2.1) and (2.2),

$$\mu_{ij} = \sum_{r=0}^{K-1} \pi_{ir}\, \mu_{ij}(r),$$

where πir = Pr(ri = r|Xi); zij(r) = z(xij,ri = r) is allowed to depend on the covariates Xi, as well as on nonresponse pattern such that complete cases have zij(r = 0) = 0.
Components (i) and (iii) have distinct roles. The marginal mean model within component (i) is chosen to match the desired target of inference. The conditional mean model within (iii) expresses an assumed nonresponse process. It must be recognized that unverifiable assumptions within (iii) drive inferences about the model parameters in (i). As a result, we recommend sensitivity analysis under a number of distinct assumptions within (iii). Identifying restrictions for (2.2) are discussed in Section A of the supplementary material (http://www.biostatistics.oxfordjournals.org). Finally, we note that there are constraints on the bivariate associations among the responses and nonresponse indicators in (2.2), determined by the marginal probabilities in (2.1) and the nonresponse probabilities. Such constraints, however, are known to be relatively weak (e.g. Liang and others, 1992; Fitzmaurice and others, 1993) and are unlikely to pose any problems in most practical applications.
Once identifying constraints are adopted, estimation of θ (and γ) can proceed via the solution to a set of modified GEEs. The form of the equations is given by

$$\sum_{i} D_i^{o\prime}\, W_i \{Y_i^{o} - \mu_i^{o}(r_i)\} = 0,$$

where $\mu_i^{o}(r)$ is formed from the components of µi(r) corresponding to the observed responses $Y_i^{o}$, and

$$D_i^{o} = \frac{\partial \mu_i^{o}(r)}{\partial \theta'}$$

the corresponding rows of the Jacobian "conditional" on nonresponse pattern; the form of this derivative matrix is outlined in Section B of the supplementary material (http://www.biostatistics.oxfordjournals.org). While expressed in terms of conditional means, the estimating equations depend on β through the implicitly defined terms, Δij. We must choose a weighting matrix Wi of dimensions commensurate with the observed data vector $Y_i^{o}$. Although an identity matrix is the simplest choice, we recommend $W_i = A_i^{-1}$, where $A_i = \operatorname{diag}[\operatorname{Var}(Y_i^{o} \mid r_i = r)]$. Note that non-diagonal matrices, resulting from an assumed form of "working correlation," require assumptions about conditional higher-order moments and further identifying restrictions. We note that these GEEs are different from those within a standard pattern-mixture formulation (e.g. Fitzmaurice and Laird, 2000), since the goal is to estimate regression parameters for the marginal rather than conditional means.
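Estimation involves two steps: (i) determining the implicitly defined terms Δij given xij, an estimate of πi = [πir], and the current value of θ, say $\theta^{(m)}$; (ii) a modified Newton-Raphson step that updates $\theta^{(m)}$. Thus, estimation involves iterating the following two steps until convergence:
- Solve for the implicitly defined $\Delta^{(m)} = [\Delta_{ij}^{(m)}]$ via the series of equations $\operatorname{expit}(x_{ij}'\beta^{(m)}) = \sum_{r} \hat{\pi}_{ir} \operatorname{expit}\{\Delta_{ij} + z_{ij}(r)'\alpha^{(m)}\}$, one for each subject i and occasion j.
- Form conditional means µi(r) = [µij(r)] from updated $\Delta^{(m)}$, current $\theta^{(m)}$; update $\theta^{(m)}$ with a modified Newton-Raphson step applied to the estimating equations.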
The marginal nonresponse probabilities πi can be estimated by a general multinomial model involving relevant covariates (Xi) parameterized by γ. To efficiently solve the nonlinear equations in Step 1 at each iteration, we employ Brent's method (Press and others, 1992).
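When the only covariate in the nonresponse model is a single categorical factor (for example, a treatment arm), a saturated multinomial model reduces to the sample proportion of each pattern within each group. The sketch below illustrates that special case only; the function name and the data are hypothetical.

```python
from collections import Counter, defaultdict

def pattern_proportions(records):
    """Estimate pattern probabilities within each group.

    records: iterable of (group, pattern) pairs, one per subject.
    Returns {group: {pattern: proportion}}.
    """
    counts = defaultdict(Counter)
    for group, pattern in records:
        counts[group][pattern] += 1
    result = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        result[group] = {r: n / total for r, n in sorted(counter.items())}
    return result

# Hypothetical data: group 0/1, pattern 0 = complete, 1 and 2 = dropout patterns.
records = (
    [(0, 0)] * 6 + [(0, 1)] * 3 + [(0, 2)] * 1
    + [(1, 0)] * 5 + [(1, 1)] * 2 + [(1, 2)] * 3
)
pi_hat = pattern_proportions(records)
```

With continuous or multiple covariates, a multinomial regression would take the place of these proportions.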
Given correct specification of the marginal means µi, marginal nonresponse probabilities πi, and conditional means µi(r), the estimating equations are unbiased and their solution provides consistent estimates of the model parameters. Because the covariance is misspecified in our choice of Wi, we recommend use of a "sandwich" estimator (Huber, 1967) of the covariance of these estimates. Its form follows from the set of joint estimating equations effectively used, with components indexed by nonresponse pattern, k = 1,...,K - 1, and by occasion, j = 1,...,T; a Taylor expansion argument establishes the multivariate normal distribution for the estimates.
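The idea behind a sandwich estimator can be seen in a one-parameter toy problem, which is not the model of this paper: estimating a common probability from clustered binary data with an independence working assumption. The estimating function for cluster i is the sum of (y - p) over its observations; the "bread" is minus the derivative of the summed estimating functions, and the "meat" is the sum of their squares. All names and data below are hypothetical.

```python
def sandwich_variance(clusters):
    """Return (p_hat, robust variance of p_hat) for clustered binary data.

    Estimating function per cluster: U_i(p) = sum_j (y_ij - p).
    bread = -d/dp sum_i U_i(p) = total number of observations.
    meat  = sum_i U_i(p_hat)^2.
    """
    n_total = sum(len(c) for c in clusters)
    p_hat = sum(sum(c) for c in clusters) / n_total
    bread = n_total
    meat = sum((sum(c) - len(c) * p_hat) ** 2 for c in clusters)
    return p_hat, meat / bread ** 2

# Hypothetical clusters with strong within-cluster similarity.
clusters = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 1], [0, 0, 0, 1]]
p_hat, robust_var = sandwich_variance(clusters)

# Variance that would be reported if all 16 observations were independent.
naive_var = p_hat * (1 - p_hat) / sum(len(c) for c in clusters)
```

Here the robust variance exceeds the naive one because observations within a cluster are positively associated, which the independence working assumption ignores.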
| 3. APPLICATION TO CLINICAL TRIAL OF CONTRACEPTING WOMEN |
|---|
Next, we illustrate the proposed method using data from the clinical trial of contracepting women introduced earlier. Recall the trial comparing two doses of a contraceptive: four injections of 100 or 150 mg of DMPA were given at 90-day intervals. The outcome of interest is a repeated binary response indicating amenorrhea during follow-up intervals. In this study, there was substantial dropout for reasons thought likely to be related to the outcome. More than one-third of the women dropped out of the trial: 17% dropped out after the first 90-day interval, 13% dropped out after the second interval, and 7% dropped out after the third interval.
We considered the following logistic model for the probability of amenorrhea:
|
|
where t = 1,2,3,4 represents time elapsed (in terms of 90-day intervals) and dosei = I(150 mg DMPA). We estimated the vector of treatment-specific dropout probabilities, $\pi_{\mathrm{dose}_i}$, nonparametrically. Finally, to complete specification of the marginally specified pattern-mixture model, we considered three assumptions for dropout
|
|
- (a) "next dropout pattern": For j > k, logit E(Yij|Xij,Di = k) = logit E(Yij|Xij,Di = j).
- (b) "dropout trend": logit E(Yij|Xij,Di = k) - logit E(Yij|Xij,Di = j) = $\alpha_{(\mathrm{dose})j}\,(k - j)$.
- (c) "complete-case contrast": logit E(Yij|Xij,Di = k) - logit E(Yik|Xik,Di = k) = logit E(Yij|Xij,Di = 4) - logit E(Yik|Xik,Di = 4).
In (a), the mean response at any occasion following dropout is tied to the corresponding mean for those who drop out at the subsequent occasion. In (b), a linear trend in dropout time is assumed for the mean response at any occasion; thus, the mean response at any occasion following dropout is extrapolated from the "observable" trend across dropout patterns. Note that in both (a) and (b), there is an implicit assumption that those who drop out early in the study are more similar to those who drop out soon after than to those who drop out later or complete the study. In contrast, (c) assumes that any longitudinal trend in the mean response following dropout, relative to the mean prior to dropout, is similar to the corresponding trend for the study completers. Assumptions (a)-(c) were chosen for illustrative purposes only; in a true sensitivity analysis, the right-hand side of equations (a)-(c) might include an additional sensitivity parameter whose value is varied across a plausible range. Ideally, any assumptions made should be guided by subject matter considerations.
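The mechanics of such identifying restrictions can be sketched as filling in a table of pattern-specific logits. The sketch rests on assumptions that are not confirmed by the text: D is read as the last occasion at which a subject is observed (so the mean at occasion j is estimable only for patterns with D >= j), restriction (b) is read as a shift of slope × (k - j) relative to pattern D = j, and a single scalar slope stands in for whatever dose- or occasion-specific parameter the authors used. With slope 0 the function reproduces restriction (a). All numbers are hypothetical.

```python
def fill_unobserved(observed, T, slope=0.0):
    """Complete a table of logit means indexed by (dropout pattern k, occasion j).

    observed: {(k, j): logit mean} for the estimable cells, j <= k.
    For each unobserved cell (j > k), borrow the value from pattern D = j at
    occasion j and shift it by slope * (k - j).
    slope = 0 gives the "next dropout pattern" restriction.
    """
    full = dict(observed)
    for k in range(1, T + 1):
        for j in range(k + 1, T + 1):
            full[(k, j)] = observed[(j, j)] + slope * (k - j)
    return full

T = 4
# Hypothetical estimable logits: later occasions and earlier dropout are higher.
observed = {
    (k, j): -1.0 + 0.3 * j + 0.2 * (T - k)
    for k in range(1, T + 1)
    for j in range(1, k + 1)
}

table_a = fill_unobserved(observed, T)              # restriction (a)
table_b = fill_unobserved(observed, T, slope=-0.1)  # a trend-type restriction
```

Varying `slope` over a plausible range is one simple way to turn a single restriction into a family for sensitivity analysis.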
The estimated marginal probabilities of amenorrhea at each occasion, under the three dropout assumptions, are presented in Table 1. Also presented in Table 1 are the corresponding estimates under the assumption that dropout is completely at random (MCAR), as a point of reference. In general, the three assumptions concerning NMAR dropout shift the marginal probabilities upward (see Figure 1). This suggests that women who drop out have a higher risk of amenorrhea. However, treatment differences remain stable and are relatively unaffected by these three assumed dropout processes (see Figure 2).
Remaining in the spirit of a sensitivity analysis, we also considered a selection model estimated by weighted GEEs (e.g. Rotnitzky and others, 1998
|
|
allowing dropout to depend upon the current response, adjusting for the previous response and dose group. We considered the two parameters subscripted 20 and 21 fixed, and assessed the sensitivity of inferences across a range of plausible values: the first within (-log(3), log(3)) and the second within (-log(4), log(4)). Interestingly, despite their distinct parameterization of dropout, the selection models yielded qualitatively similar inferences about dosage group differences in the rates of amenorrhea.
| 4. DISCUSSION |
|---|
It is an inescapable fact that all methods for handling nonresponse have to make some unverifiable assumptions; this is true of both selection and pattern-mixture models. Selection and pattern-mixture models have their own distinct advantages and disadvantages. The proposed marginally specified pattern-mixture models attempt to capitalize on desirable features of each approach. Specifically, these models circumvent the obvious drawback of pattern-mixture models. By construction, the regression parameters in marginally specified pattern-mixture models have "marginal" interpretations. Unlike conditionally specified pattern-mixture formulations which cannot directly model the marginal probabilities (e.g. Hogan and Laird, 1997
Of note, the proposed model is semiparametric. The avoidance of full distributional assumptions can be seen as advantageous in this setting as it avoids the need to make identifying restrictions on all higher-order moments (Wilkins and Fitzmaurice, 2006). Likelihood approaches require the specification of two-, three-, and higher-way moments; the number of parameters grows exponentially with the number of measurement occasions. This proliferation of parameters must also be considered when adopting assumptions about nonresponse, with a proportionate subset of the nuisance parameters requiring identifying constraints. However, when the target of inference is the marginal means, it is unnecessary to directly model such higher-order moments. The proposed method limits consideration only to the targeted marginal means and, to account for nonresponse, the bivariate dependence of each response on indicators of nonresponse pattern; the latter dependence is parameterized differently than in Wilkins and Fitzmaurice (2006). Consequently, far fewer identifying restrictions are necessary, greatly simplifying sensitivity analysis. The proposed method also involves computation that is much less intensive than for comparable likelihood-based methods (Wilkins and Fitzmaurice, 2006). For example, in fitting models to the contraception data, likelihood-based estimation requires 30 times more CPU time.
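The exponential growth mentioned above can be made concrete with a simple count. This is a generic count for a saturated distribution of T binary responses within a single nonresponse pattern, not a count taken from any specific likelihood model cited here:

```latex
\[
\underbrace{2^{T} - 1}_{\text{free cell probabilities}}
\;=\;
\underbrace{T}_{\text{means}}
\;+\;
\underbrace{2^{T} - 1 - T}_{\text{two-way and higher-order associations}},
\qquad
T = 4:\quad 15 = 4 + 11 .
\]
```

With four occasions, 11 of the 15 free quantities per pattern are association terms rather than means, and this count applies within every nonresponse pattern.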
Finally, a potential drawback of the proposed method is that it examines departures from MCAR rather than missing at random (MAR). The weaker MAR assumption is a more appealing point of reference for a sensitivity analysis; alternate methods, such as weighted GEEs and full likelihood-based approaches targeting marginal inferences (e.g. Birmingham and Fitzmaurice, 2002), maintain the advantage of examining departures from MAR.
| ACKNOWLEDGMENTS |
|---|
This work was supported by National Institutes of Health grants GM 29745, HL 69800, and MH 17119. Conflict of Interest: None declared.
| REFERENCES |
|---|
Baker SG. (1995) Marginal regression for repeated binary data with outcome subject to non-ignorable non-response. Biometrics 51:1042-52.
Birmingham J and Fitzmaurice GM. (2002) A pattern-mixture model for longitudinal binary responses with nonignorable nonresponse. Biometrics 58:989-96.
Ekholm A and Skinner C. (1998) The Muscatine children's obesity data reanalysed using pattern mixture models. Applied Statistics 47:251-63.
Fitzmaurice GM and Laird NM. (2000) Generalized linear mixture models for handling nonignorable dropouts in longitudinal studies. Biostatistics 1:141-56.
Fitzmaurice GM, Laird NM, Rotnitzky A. (1993) Regression models for discrete longitudinal responses (with discussion). Statistical Science 8:284-309.
Heagerty PJ and Zeger SL. (2000) Marginalized multilevel models and likelihood inference. Statistical Science 15:1-26.
Hogan JW and Laird NM. (1997) Mixture models for the joint distribution of repeated measures and event times. Statistics in Medicine 16:239-57.
Huber PJ. (1967) The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, CA) Volume 1: pp. 221-33.
Liang K-Y, Zeger SL, Qaqish B. (1992) Multivariate regression analyses for categorical data (with discussion). Journal of the Royal Statistical Society, Series B 54:3-40.
Little RJA. (1993) Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association 88:125-34.
Machin D, Farley TM, Busca B, Campbell MJ, d'Arcangues C. (1988) Assessing changes in vaginal bleeding patterns in contracepting women. Contraception 38:165-79.
Molenberghs G, Kenward MG, Lesaffre E. (1997) The analysis of longitudinal ordinal data with nonrandom drop-out. Biometrika 84:33-44.
Press WH, Teukolsky SA, Vetterling WT, Flannery BP. (1992) Numerical Recipes in C: The Art of Scientific Computing 2nd edition (Cambridge University Press, Cambridge, UK).
Rotnitzky A, Robins JM, Scharfstein DO. (1998) Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association 93:1321-39.
Wilkins KJ and Fitzmaurice GM. (2006) A hybrid model for non-ignorable dropout in longitudinal binary responses. Biometrics 62:168-76.
Received November 29, 2004; revised October 3, 2005; revised June 9, 2006; accepted for publication June 16, 2006.