Biostatistics Advance Access originally published online on January 10, 2007
Biostatistics 2007 8(4):722-743; doi:10.1093/biostatistics/kxm001
Impact of nonignorable coarsening on Bayesian inference
Genentech, Inc., South San Francisco, CA 94080, USA zhang.jiameng{at}gene.com
Department of Biostatistics and Epidemiology, University of Pennsylvania, School of Medicine, Philadelphia, PA 19104, USA
* To whom correspondence should be addressed.
| SUMMARY |
|---|
The coarse data model of Heitjan and Rubin (1991) generalizes the missing data model of Rubin (1976) to cover other forms of incompleteness such as censoring and grouping. The model has 2 components: an ideal data model describing the distribution of the quantity of interest and a coarsening mechanism that describes a distribution over degrees of coarsening given the ideal data. The coarsening mechanism is said to be nonignorable when the degree of coarsening depends on an incompletely observed ideal outcome, in which case failure to properly account for it can spoil inferences. A theme in recent research is to measure sensitivity to nonignorability by evaluating the effect of a small departure from ignorability on the maximum likelihood estimate (MLE) of a parameter of the ideal data model. One such construct is the "index of local sensitivity to nonignorability" (ISNI) (Troxel and others, 2004), which is the derivative of the MLE with respect to a nonignorability parameter evaluated at the ignorable model. In this paper, we adapt ISNI to Bayesian modeling by instead defining it as the derivative of the posterior expectation. We propose the application of ISNI as a first step in judging the robustness of a Bayesian analysis to nonignorable coarsening. We derive formulas for a range of models and apply the method to evaluate sensitivity to nonignorable coarsening in 2 real data examples, one involving missing CD4 counts in an HIV trial and the other involving potentially informatively censored relapse times in a leukemia trial.
Keywords: Censoring; Coarse data; Ignorability; Missingness; Sensitivity analysis
| 1. INTRODUCTION |
|---|
Heitjan and Rubin (1991)
An elementary sensitivity analysis that avoids the use of explicit nonignorable models would impute the coarsened observations by "extreme" values. For instance, a simple sensitivity analysis for censored data is to impute the censored survival time by either the maximum survival time or the censoring time (Allison, 1995). Large discrepancy between the inferences from different imputed data sets indicates high sensitivity to nonignorability—a common finding with this method, as it in effect assumes a high degree of nonignorability. Park and others (2006)
introduced a nonparametric approach that enumerates the collection of all attainable values for the survival function using information about the causes of censoring. Their sensitivity analysis examines the change of the collection when one varies the classification of censoring as either dependent or independent. One may also use the difference of the likelihood or the posterior density between the ignorable and nonignorable models to represent the difference in key inferences. This difference can be quantified, for example, by the Kullback–Leibler divergence or any other relevant measure of distance. Cook (1986)
used geometric normal curvatures to characterize the behavior of this likelihood displacement. Verbeke and Molenberghs (2001)
and Jansen and others (2003)
extended the method to continuous outcomes and multivariate and longitudinal binary data with possibly nonignorable missingness.
An alternative approach is to explicitly specify a coarsening model in which a single parameter indexes the magnitude of nonignorability and to quantify how the inference of interest varies as a function of this parameter. One variant is to evaluate the range of an estimate as the nonignorability parameter varies over its parameter space (Moeschberger and Klein, 1984; Zheng and Klein, 1995; Robins, 1997). Such a "global" sensitivity analysis can be computationally burdensome because it may require fitting the nonignorable model many times. Moreover, the bounds from such an analysis may be too wide to be of practical value if the limits of the nonignorability parameter represent values of nonignorability that are unlikely to be realized in practice. By contrast, "local" sensitivity analysis methods examine the sensitivity of parameter estimates to small perturbations from the ignorable model. Copas and Li (1997) first proposed this approach by measuring sensitivity of inferences on a normal mean to a probit selection mechanism. Siannis (2004) and Siannis and others (2005) applied a similar idea to measure sensitivity of the maximum likelihood estimate (MLE) of parameters in a survival model to small departures from noninformative censoring. Troxel and others (2004) derived a simple, general measure that they denoted the index of local sensitivity to nonignorability (ISNI), which they applied in the univariate generalized linear model. Other recent papers have applied ISNI analysis to causal modeling in a clinical trial with noncompliance (Xie and Heitjan, 2004), to longitudinal modeling (Ma and others, 2005), and to the coarse data model (Zhang, 2004; Zhang and Heitjan, 2005, 2006).
Past approaches to local sensitivity analysis have in most cases considered sensitivity of the MLE. As the MLE is equivalent to the posterior mode with a flat (and therefore potentially improper) prior, this approach may be unsatisfactory if the ultimate goal is to make Bayesian inferences, particularly when the sample size is small. Moreover, there is no straightforward non-Bayesian analogue of quantities like the probability that a parameter falls in some subset of the parameter space. Thus, in this paper, we propose a local sensitivity analysis for Bayesian inferences in the coarse data model.
The rest of the paper is organized as follows: Section 2 reviews the coarsening model. In Sections 3 and 4, we derive the general form of ISNI for Bayesian inference and present methods to evaluate the sensitivity. In Sections 5 and 6, we apply our method to real data sets involving missing and censored data. To illustrate the range of applicability of the method, in the appendices we present ISNI formulas for a function of a parameter of interest (Appendix A), for a range of models with missing data (Appendix B), and for censored data with parametric models (Appendix C).
| 2. NONIGNORABLE COARSENING |
|---|
In the coarsening model (Heitjan and Rubin, 1991), the ideal data, X, taking values in a sample space $\mathcal{X}$, is distributed according to the density $f_\theta(x)$, where $\theta$ is the parameter of interest. A second random variable G, taking values in a sample space $\mathcal{G}$, controls the degree of coarsening of X in the sense that G determines a partition of the sample space of X. The resulting data element is Y = Y(X, G), a subset of $\mathcal{X}$ in which X is known to lie. If we also do not observe G precisely, then we denote H = H(X, G) to be the range of plausible values of G, a subset of $\mathcal{G}$. We moreover assume that the distribution of G given X is $f_\gamma(g|x)$ and that $\theta$ and $\gamma$ are "distinct" in the sense that they are a priori independent. The correct likelihood is
$$L(\theta, \gamma; y, h) = \int\!\!\int f_{Y,H|X,G}(y, h|x, g)\, f_\gamma(g|x)\, f_\theta(x)\, dg\, dx,$$
where $f_{Y,H|X,G}(y, h|x, g)$ is a known, degenerate function that equals 1 when Y(x, g) = y and H(x, g) = h and 0 otherwise.
Because the correct likelihood may be challenging to work with both analytically and numerically, in applications we typically favor the simplified likelihood
$$L(\theta; y) = \int_y f_\theta(x)\, dx.$$
With the same prior for $\theta$, this likelihood yields the same inferences as the correct likelihood if, for the fixed observed (y, h) and for every $\gamma$, the quantity
$$\int f_{Y,H|X,G}(y, h|x, g)\, f_\gamma(g|x)\, dg \tag{2.1}$$
takes the same value for all $x \in y$. In such a case, we say that the data are "coarsened at random"; if, moreover, the parameters $\theta$ and $\gamma$ are distinct, in the sense of Rubin (1976), then the coarsening mechanism is "ignorable."
Although the model stated above covers a broad range of data structures, it is helpful to have a notation that allows for n independent units, possibly not identically distributed. Thus, we assume that X and G are n-vector random variables and that the components $X_i$ of X, i = 1, ..., n, are independent with density $p_\theta(x_i)$. The density of X is then
$$f_\theta(x) = \prod_{i=1}^{n} p_\theta(x_i). \tag{2.2}$$
We also assume that the elements $G_i$ given $X_i = x_i$ are independent and distributed according to densities $p_\gamma(g_i|x_i)$, leading to the conditional density of G given X taking the form
$$f_\gamma(g|x) = \prod_{i=1}^{n} p_\gamma(g_i|x_i). \tag{2.3}$$
| 3. ISNI FOR BAYESIAN INFERENCE |
|---|
Bayesian data analysis reduces to the computation of posterior expectations of functions of key parameters of interest. Thus, our local sensitivity analysis is based on the derivative of a posterior mean with respect to a nonignorability parameter, evaluated at the ignorable model.
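We begin by further defining $\gamma = (\gamma_0, \gamma_1)'$, where $\gamma_1$ indicates the degree of nonignorability in the sense that if $\gamma_1 = 0$, $f_\gamma(g|x) = f_\gamma(g)$ and the coarsening mechanism is ignorable. We assume that primary interest is in the posterior distribution of $\theta$. Let $\psi = (\theta, \gamma_0)'$, E(·) be the posterior expectation operation, and $\tilde\psi(\gamma_1) = E(\psi \mid \gamma_1, y, h)$ be the posterior expectation of $(\theta, \gamma_0)$ when the nonignorability parameter is fixed at $\gamma_1$.
To assess the sensitivity of $\tilde\theta(\gamma_1)$, the $\theta$ component of $\tilde\psi(\gamma_1)$, to the nonignorable coarsening, we calculate the change of $\tilde\theta(\gamma_1)$ when $\gamma_1$ is perturbed around the ignorable model. Define ISNI[$\tilde\theta(\gamma_1)$] as the rate of change of the posterior mean of $\theta$ evaluated at $\gamma_1 = 0$. We then have (see Appendix A)
$$\mathrm{ISNI}[\tilde\theta(\gamma_1)] = \left.\frac{\partial \tilde\theta(\gamma_1)}{\partial \gamma_1}\right|_{\gamma_1 = 0} = \mathrm{COV}_I\!\left(\theta,\ \left.\frac{\partial \ell_C(\theta, \gamma_0; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right),$$
where $\ell_C(\theta, \gamma_0; \gamma_1, y, h)$ is the log-likelihood and $\mathrm{COV}_I(\cdot)$ is the posterior covariance under the ignorable model. Note that $\partial \ell_C / \partial \gamma_1 |_{\gamma_1 = 0}$, the score statistic, is a function of $\theta$ and $\gamma_0$. ISNI describes the amount by which a unit change in $\gamma_1$ displaces the posterior expectation of the parameter of interest from its value under the ignorable model. Large absolute values of ISNI thus generally suggest high sensitivity to nonignorability. If no data are coarsened, then the coarsening mechanism is ignorable in the Bayesian sense, in that varying $\gamma_1$ can have no effect on inferences regarding $\theta$. In such situations, ISNI in general equals 0.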
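Under the ignorable model with distinct parameters, we can factor the posterior into 2 parts, one involving only $\theta$ and the other involving only $\gamma_0$. Therefore, $\mathrm{COV}_I(\theta, \gamma_0) = 0$, and by the delta method,
$$\mathrm{ISNI}[\tilde\theta(\gamma_1)] \approx \mathrm{VAR}_I(\theta) \left.\frac{\partial^2 \ell_C(\theta, \gamma_0; \gamma_1, y, h)}{\partial \theta\, \partial \gamma_1}\right|_{\theta = \tilde\theta(0),\ \gamma_0 = \tilde\gamma_0(0),\ \gamma_1 = 0},$$
where $\tilde\theta(0)$ and $\tilde\gamma_0(0)$ are the posterior means of $\theta$ and $\gamma_0$ under the ignorable model and $\mathrm{VAR}_I(\cdot)$ is the posterior variance under the ignorable model.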
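Note that this approximation is similar to ISNI for the MLE $\hat\theta$ (Troxel and others, 2004), whose formula replaces the posterior variance of $\theta$ with the inverse information under the ignorable model
$$\mathrm{ISNI}(\hat\theta) = \left.\left[-\frac{\partial^2 \ell_C}{\partial \theta\, \partial \theta'}\right]^{-1} \frac{\partial^2 \ell_C}{\partial \theta\, \partial \gamma_1}\right|_{\theta = \hat\theta(0),\ \gamma_0 = \hat\gamma_0(0),\ \gamma_1 = 0},$$
where $\hat\gamma_0(0)$ and $\hat\theta(0)$ are the MLEs of $\gamma_0$ and $\theta$ from the ignorable model. In both cases, the first factor measures the precision of estimation of $\theta$ under the ignorable model and the second factor measures the nonorthogonality of $\theta$ and $\gamma_1$.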
If the ignorable model actually holds and $\theta(0)$ is the true value of $\theta$, then under suitable regularity conditions, both $|\tilde\theta(0) - \theta(0)|$ and $|\hat\theta(0) - \theta(0)|$ decrease to 0 at the rate $n^{-1/2}$ (Serfling, 2002; Walker, 1969; Bernardo and Smith, 1994). Because these 2 estimates converge to the truth at the same rate, the Bayesian ISNI and the ISNI for the MLE agree asymptotically as $n \to \infty$.
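The covariance form of ISNI and its delta-method approximation can be compared numerically. The following is a minimal sketch, assuming we already hold draws of $\theta$ from the ignorable-model posterior and can evaluate the score at each draw; the toy posterior, the linear score, and all function names are hypothetical choices made for illustration. For a score that is exactly linear in $\theta$, the two forms coincide.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n - 1) of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def isni_covariance(theta_draws, score_draws):
    """ISNI as COV_I(theta, score), estimated from ignorable-model draws."""
    return cov(theta_draws, score_draws)

def isni_delta(theta_draws, cross_derivative):
    """Delta-method form: VAR_I(theta) times d(score)/d(theta)."""
    return cov(theta_draws, theta_draws) * cross_derivative

# Hypothetical toy posterior: theta ~ N(1.0, 0.5^2) under the ignorable model,
# with an assumed score that is linear in theta: score = a + b * theta.
rng = random.Random(0)
theta = [rng.gauss(1.0, 0.5) for _ in range(5000)]
a, b = 2.0, -3.0
score = [a + b * t for t in theta]

isni_cov = isni_covariance(theta, score)
isni_dm = isni_delta(theta, b)
print(isni_cov, isni_dm)  # the two forms agree for a linear score
```

When the score is nonlinear in $\theta$, the delta-method value is only an approximation to the covariance, and the gap between the two gives a rough sense of how well the approximation holds.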
| 4. SENSITIVITY EVALUATION BY ISNI |
|---|
We cannot in all cases use ISNI alone to index the sensitivity because its interpretation generally depends on the units of measurement. To address this problem (in the context of maximum likelihood estimation), Troxel and others (2004) proposed a scale-free transformation of the index. If the second- and higher-order derivatives of $\tilde\theta(\gamma_1)$ with respect to $\gamma_1$ evaluated at $\gamma_1 = 0$ are small, we can approximate $\tilde\theta(\gamma_1)$ by $\tilde\theta(\gamma_1) \approx \tilde\theta(0) + \gamma_1 \times \mathrm{ISNI}(\tilde\theta)$. We define the posterior mean to be sensitive to nonignorability if, for a plausible value of $\gamma_1$, the difference between the posterior means of the parameters under the ignorable and nonignorable models exceeds one posterior standard deviation (SD) under the ignorable model.
Here, we adapt the graphical sensitivity evaluation method of Zhang (2004) to the Bayesian case. The idea is to set $\gamma_1^* = \mathrm{SD}_I(\theta)/\mathrm{ISNI}(\tilde\theta)$, where $\mathrm{SD}_I(\theta)$ is the posterior SD of $\theta$ under the ignorable model; $\gamma_1^*$ is then approximately the value of $\gamma_1$ that can cause a one-SD change in the posterior mean of $\theta$. If $\gamma_1^*$ is plausible for the data, in the sense that it is in line with prior notions of how bad the nonignorability can be, then we say that $\tilde\theta$ is sensitive to the nonignorable coarsening. One can check the plausibility of $\gamma_1^*$ by plotting $q(\gamma_1^*, x^*)$ versus $x^*$, where $x^*$ ranges over the set of all plausible values of x and q(·, ·) is a measure whose plausibility is easy to interpret, such as the conditional expectation or hazard function of G given X = x. Note that because $\gamma_1 = 0$ implies an ignorable coarsening mechanism, $q(\gamma_1 = 0, x^*)$ should be constant in the $x^*$ argument. The greater the departure of $\gamma_1^*$ from 0, the larger will be the variation of $q(\gamma_1^*, x^*)$. If the range of $q(\gamma_1^*, x^*)$ is small enough to be considered plausible for the data at hand, then $\gamma_1^*$ is plausible for the data, and we conclude that inferences are sensitive. If on the other hand the range of $q(\gamma_1^*, x^*)$ is larger than seems plausible, then inferences are insensitive and the ignorable model can be considered reliable. Thus, $\gamma_1^*$ is similar to the minimum value of the odds ratio for hidden bias $\Gamma$ that appears in the work of Rosenbaum (1995) on observational studies.
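The calibration step just described can be sketched in a few lines. This is an illustrative sketch only: it assumes a logistic coarsening model in which q is the probability of being observed, and the numerical inputs (`sd_i`, `isni`, `gamma0`, the grid) are hypothetical values, not results from any data set.

```python
import math

def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def gamma1_star(sd_ignorable, isni):
    """gamma_1 value giving roughly a one-SD shift under the linear approximation."""
    return sd_ignorable / isni

def plausibility_curve(gamma0, gamma1, x_grid):
    """q(gamma1, x*) = Pr[observed | X = x*] under an assumed logistic model."""
    return [expit(gamma0 + gamma1 * x) for x in x_grid]

# Hypothetical inputs chosen only to show the mechanics.
sd_i, isni, gamma0 = 0.2, 0.5, 1.0
g_star = gamma1_star(sd_i, isni)
x_grid = [-2.0, -1.0, 0.0, 1.0, 2.0]

flat = plausibility_curve(gamma0, 0.0, x_grid)      # ignorable model: constant in x*
curve = plausibility_curve(gamma0, g_star, x_grid)  # curve implied by gamma1_star
spread = max(curve) - min(curve)                    # range to be judged for plausibility
print(g_star, spread)
```

Judging whether `spread` is plausible remains a subject-matter decision; the code only produces the curve whose range is to be inspected.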
Likely choices of q(·, ·) may involve $\gamma_0$ as well as $\gamma_1$. For example, in the missing data model, one may set $q(\gamma_1, x^*) = \Pr_\gamma[X_i \text{ is observed} \mid X_i = x^*]$, in which case q(·, ·) implicitly depends on $\gamma_0$. In such cases, we may propagate uncertainty properly by evaluating not the limits of $q(\gamma_1, x^*)$ itself but rather the posterior probability that $q(\gamma_1, x^*)$ exceeds some benchmark value. We might then evaluate the posterior probabilities
|
|
and
|
|
where the outer probability operator refers to the posterior distribution of $\gamma_0$ assuming that $\gamma_1 = 0$. If both P1 and P2 are close to 1, the suggestion is that under $\gamma_1^*$ there is a wide range of probabilities of missingness, and therefore, we conclude that $\gamma_1^*$ is implausible and that the inference is insensitive to nonignorability. We will use this approach in what follows.
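A posterior probability of this kind can be approximated by simulation. The sketch below is one possible reading, not a definition taken from the text: it assumes a logistic missingness model, a beta posterior for expit($\gamma_0$) under the ignorable model, and benchmark limits of 0.1 and 0.9; the function name `prob_outside` and its arguments are hypothetical.

```python
import math
import random

def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def prob_outside(gamma1, x, n_obs, n_mis, lo=0.1, hi=0.9,
                 v=1.0, w=1.0, draws=4000, seed=0):
    """Posterior probability, over gamma0 under the ignorable model, that
    Pr[observed | X = x] = expit(gamma0 + gamma1 * x) lies outside (lo, hi)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(draws):
        # Assumed ignorable-model posterior: expit(gamma0) ~ Beta(v + n_obs, w + n_mis).
        h0 = rng.betavariate(v + n_obs, w + n_mis)
        gamma0 = math.log(h0 / (1.0 - h0))
        q = expit(gamma0 + gamma1 * x)
        if q < lo or q > hi:
            count += 1
    return count / draws

# Hypothetical check at the two extremes of x for a large gamma1.
p_at_max = prob_outside(5.0, 10.0, n_obs=51, n_mis=10)
p_at_min = prob_outside(5.0, -10.0, n_obs=51, n_mis=10)
print(p_at_max, p_at_min)
```

Values near 1 at both extremes of x would indicate that the candidate $\gamma_1$ forces nearly deterministic missingness, which is the situation the text treats as implausible.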
The values of ISNI depend on the specification of the joint distribution of X and G, both the distribution of X and the conditional distribution of G|X (the coarsening mechanism). In our experience, the conclusions of ISNI-based local sensitivity analyses have been robust to the choice of parametric model for the coarsening mechanism, as long as it allows a sufficient range of nonignorability.
| 5. ISNI FOR MISSING DATA |
|---|
Let $X_i$, i = 1, ..., n, be independent outcomes with joint density given by (2.2). Let $G_i$ be an accompanying indicator of whether $X_i$ is observed, and assume that the distribution of G given X = x is given by (2.3). Assume specifically that the conditional distribution of $G_i$ given $X_i = x_i$ and $\gamma$ is Bernoulli:
$$\Pr_\gamma[G_i = 1 \mid X_i = x_i] = h(\gamma_0 + \gamma_1 x_i),$$
where h(·) is a monotone increasing link function. If $\gamma_1 > 0$ ($< 0$), larger (smaller) outcomes are more likely to be observed, and there is a risk of overestimating (underestimating) the mean outcome.
Research to date suggests that MLEs from nonignorable models can be sensitive to the complete data model but are robust to the selection model (Kenward, 1998; Xie, 2003). Thus, the choice of link may have a modest effect on the sensitivity analysis, and we henceforth assume h(·) to be the commonly used logistic link,
$$h(u) = \mathrm{expit}(u) = \frac{e^u}{1 + e^u}.$$
It is convenient to express the prior of $\gamma_0$ as a beta $\beta(v, w)$ density on the $h(\gamma_0)$ scale,
$$\pi\{h(\gamma_0)\} \propto h(\gamma_0)^{v - 1}\{1 - h(\gamma_0)\}^{w - 1},$$
which allows us to think of v and w as prior numbers of observed and missing outcomes, respectively. Then, we have the general result that
![]() |
Note that, as with ISNI for the MLE, this expression equals 0, and consequently, sensitivity is 0 if all units are observed.
We illustrate the analysis in the special case of estimation of a normal mean µ; formulas for some other distributions appear in Appendix B. Let $\theta = (\theta_1, \theta_2)' = (\mu, \sigma^2)'$. A closed-form expression for ISNI for the posterior mean $\tilde\mu$ depends on the prior distribution of $(\mu, \sigma^2)$. We consider 3 cases:
case 1. A noninformative prior. We assume that µ and log σ are uniform and independent:
Then,
and
|
|
where no = 
gi and
.
case 2. A semi-conjugate prior. Assuming that $\sigma^2$ is known and $\mu \sim N(\mu_0, \tau_0^2)$, we have that
and
|
|
case 3. A conjugate prior. We assume that $\mu \mid \sigma^2 \sim N(\mu_0, \sigma^2/k_0)$ and $\sigma^2 \sim \text{Inv-}\chi^2(\nu_0, \sigma_0^2)$. Then, we have
with
|
|
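For intuition about how ISNI behaves in the normal-mean setting, a Monte Carlo sketch follows. It rests on assumptions that are ours rather than the text's: the variance $\sigma^2$ is known, µ has a flat prior, expit($\gamma_0$) has a $\beta(v, w)$ prior, and the score at $\gamma_1 = 0$ is obtained by differentiating the logistic-selection log-likelihood, which gives $(1 - h)\sum_{\text{obs}} x_i - h\, n_{\text{mis}}\, \mu$ with $h = \mathrm{expit}(\gamma_0)$. Function names and the summary inputs are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n - 1)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def isni_normal_mean_mc(xbar_obs, n_obs, n_mis, sigma,
                        v=1.0, w=1.0, draws=20000, seed=1):
    """Monte Carlo COV_I(mu, score) under the assumptions stated above."""
    rng = random.Random(seed)
    sum_obs = n_obs * xbar_obs
    mu_draws, score_draws = [], []
    for _ in range(draws):
        mu = rng.gauss(xbar_obs, sigma / n_obs ** 0.5)   # mu | data, flat prior
        h = rng.betavariate(v + n_obs, w + n_mis)         # expit(gamma0) | data
        mu_draws.append(mu)
        score_draws.append((1.0 - h) * sum_obs - h * n_mis * mu)
    return cov(mu_draws, score_draws)

def isni_normal_mean_closed(n_obs, n_mis, sigma, v=1.0, w=1.0):
    """Same covariance in closed form: -n_mis * E_I[h] * VAR_I(mu)."""
    post_mean_h = (v + n_obs) / (v + w + n_obs + n_mis)
    return -n_mis * post_mean_h * sigma ** 2 / n_obs

# Hypothetical summary statistics: 40 observed, 10 missing, sigma = 1.
closed = isni_normal_mean_closed(40, 10, 1.0)
mc = isni_normal_mean_mc(0.5, 40, 10, 1.0)
print(closed, mc)
```

Under these assumptions the index is proportional to the number of missing outcomes and to the posterior variance of µ, and it vanishes when nothing is missing.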
To illustrate the graphical sensitivity analysis of Section 4, we generated 10 samples containing 4000 $(X_i, G_i)$ pairs, where the $X_i$ values for each sample were the same and were generated from a standard normal. Here, $G_i$ given $X_i = x_i$ are independent binomials with $\Pr_\gamma[G_i = 1 \mid X_i = x_i] = \mathrm{expit}(\gamma_0 + \gamma_1 x_i)$. We set the values of $\gamma_0 = 3$ and $\gamma_1 = \pm 3, \pm 5, \pm 8, \pm 10, \pm 20$ for these simulated data. In Figure 1, we randomly chose and plotted 100 pairs from 2 samples with $\gamma_1 = 3$ and $-3$, respectively. For $\gamma_1 = 3$ ($-3$), the missingness is heavier for subjects with smaller (larger) outcomes.
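The simulation design can be reproduced along the following lines. This is a sketch of the data-generating step only; the seeds and the helper names `simulate` and `observed_mean` are assumptions, and no attempt is made to match the random draws used in the original study.

```python
import math
import random

def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def simulate(n, gamma0, gamma1, x_seed=0, g_seed=1):
    """Return (x, g) pairs with x ~ N(0, 1) and
    Pr[g = 1 | x] = expit(gamma0 + gamma1 * x).
    Reusing x_seed keeps the outcomes identical across samples."""
    x_rng, g_rng = random.Random(x_seed), random.Random(g_seed)
    xs = [x_rng.gauss(0.0, 1.0) for _ in range(n)]
    return [(x, 1 if g_rng.random() < expit(gamma0 + gamma1 * x) else 0)
            for x in xs]

def observed_mean(pairs):
    """Mean of the outcomes whose indicator g equals 1."""
    obs = [x for x, g in pairs if g == 1]
    return sum(obs) / len(obs)

pos = simulate(4000, 3.0, 3.0)    # small outcomes are more often missing
neg = simulate(4000, 3.0, -3.0)   # large outcomes are more often missing
full_mean = sum(x for x, _ in pos) / len(pos)
print(observed_mean(pos), full_mean, observed_mean(neg))
```

The observed-data mean sits above the full-data mean when $\gamma_1 > 0$ and below it when $\gamma_1 < 0$, which is the direction of bias discussed for the ignorable analysis.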
We analyzed the missing data by the ignorable and nonignorable models assuming a noninformative prior, with results summarized in Table 1. We used importance sampling to calculate the posterior means. The posterior mean from the ignorable model ($\tilde\mu_0$) was an overestimate (underestimate) when $\gamma_1 > 0$ ($< 0$). The posterior mean from the nonignorable model ($\tilde\mu_{\text{nonig}}$) was close to the true value 0. We calculated $\gamma_1^*$ as defined in Section 4. Taking the range of plausible values of q(·, ·) to be (0.1, 0.9), we defined P1 and P2 to be the posterior probabilities, averaging over the distribution of the $\gamma$ parameters, that the probability of being observed is either less than 0.1 or greater than 0.9, for fixed $\gamma_1^*$ and for the maximum or minimum values of x. If both P1 and P2 are near 1, indicating that subjects with extreme values of x either all give complete data or all are missing, then $\gamma_1^*$ would be implausible for the data.
In Figure 2, we plot the posterior means assuming different values for the nonignorability parameter $\gamma_1$ for the data with true $\gamma_1 = \pm 3$. ISNI($\tilde\mu$) is the slope of $\tilde\mu(\gamma_1)$ at $\gamma_1 = 0$, which reflects the behavior of $\tilde\mu(\gamma_1)$ near the ignorable model. Evidently, the linear approximation is reasonable in the range $\gamma_1 \in (-3, 3)$ for these data, and consequently, the value of $\gamma_1^*$ accurately estimates the degree of nonignorability that is sufficient to cause a one-SD change in µ.
The probability of being observed given the outcome value x and fixed values of $\gamma_0$ and $\gamma_1$ is $\mathrm{expit}(\gamma_0 + \gamma_1 x)$. One can check whether $\gamma_1^*$ is plausible for the data by plotting $\mathrm{expit}(\tilde\gamma_0 + \gamma_1^* x^*)$ versus $x^*$, where $x^*$ ranges over the set of all plausible values of the outcome and $\tilde\gamma_0$ is the posterior mean of $\gamma_0$ from the ignorable model. If the range of probabilities is plausible for the data, $\gamma_1^*$ is also plausible for the data and µ is sensitive to nonignorable missingness. We present these plots in Figure 3. The ranges of the probability in the last plot in each row are much larger than the ranges in other plots, showing relatively low sensitivity to nonignorable missingness. As shown in the 5th and 10th rows of Table 1, the posterior probabilities P1 and P2 were both 1, indicating low sensitivity; the differences of posterior expectations from the ignorable and nonignorable models for these data sets were close to or less than one SD, whereas for the other data sets, the differences were larger. The plots and posterior probabilities show that the sensitivity to nonignorability increases with the percentage of missing data.
Adult AIDS Clinical Trials Group (AACTG) study 5125 sought to compare the impact on HIV lipoatrophy of regimens containing the protease inhibitor lopinavir/ritonavir (LPV/r) with that of regimens containing nucleoside reverse transcriptase inhibitors (NRTI). Subjects received either LPV/r or NRTI and were followed for 72 weeks beyond the enrollment of the last subject. CD4 count at week 72 was a key safety end point but was missing for 10 of the 61 subjects (6 on NRTI and 4 on LPV/r). There was a concern that the data were missing nonignorably because poor health may be associated with both low CD4 counts and a propensity to miss scheduled visits. We analyzed the data to determine whether the difference between the arms in mean 72-week CD4 count was sensitive to nonignorability.
We assume that the CD4 counts ($X_i$) are independent and normally distributed with mean $z_i\beta$ and variance $\sigma^2$, where $z_i = (1, 1)$ for subject i in the NRTI arm, $z_i = (1, 0)$ for subject i in the LPV/r arm, and $\beta = (\beta_0, \beta_1)'$. With the noninformative prior $p(\beta, \sigma^2) \propto \sigma^{-2}$, we have the ignorable model posterior
|
|
where $Z_o$ is the subset of Z corresponding to subjects for which the outcome is observed, $\hat\beta$ is the least-squares estimate of $\beta$, and $s^2$ is the residual variance. We measure the sensitivity with
|
|
To check the sensitivity of the posterior mean of $\beta_1$ without fitting the nonignorable model, we set $\gamma_1^* = 0.0122$ as defined in Section 4 and $\tilde\gamma_0$ to be the posterior mean of $\gamma_0$ under the ignorable model. Figure 4 shows plots of the probability of observing a CD4 outcome given $\tilde\gamma_0$ and $\gamma_1^*$ versus the set of all plausible values of the outcome. The range of probabilities is large (from 0 to 1), which suggests that we would not have sensitivity unless the missingness mechanism is such that subjects with CD4 less than 100 would always miss visits and subjects with CD4 greater than 1000 would always attend visits. This is not plausible, in that subjects with evidently low CD4 do sometimes appear for visits and subjects with evidently high CD4 do sometimes skip visits. We also calculated the posterior probability, under the ignorable model, that the probability of staying in the trial is less than 0.1 (greater than 0.9) at max(x) (min(x)). Both probabilities are equal to 1, in agreement with Figure 4. Therefore, we conclude that the posterior mean of the treatment effect on week-72 CD4 is robust to all but extreme nonignorability.
We are also interested in estimating the probability that the mean week-72 CD4 is less than 500 (the lower bound of the normal range) for the 2 treatment arms separately. Table 2 shows the inferences from the ignorable model along with the ISNIs and corresponding $\gamma_1^*$ values. We check the plausibility of the $\gamma_1^*$ values in Figure 4. The ranges of the probability of not dropping out given $\gamma_1^*$ seem plausible for the trial. Therefore, we conclude that the probabilities that the mean CD4 is below normal for the 2 arms are sensitive to possible nonignorable dropout. The probability that the mean CD4 is below normal for the NRTI arm is more sensitive than that for the LPV/r arm, probably because there are more missing data in the NRTI arm.
This analysis demonstrates how the ISNI approach can produce a separate sensitivity analysis for each model parameter of interest—indeed, for each function of interest of the parameters. This property is important because it may happen that the 2 parameters representing the 2 study arms separately are both substantially sensitive to nonignorability, whereas their difference (i.e. the treatment effect) is not.
| 6. ISNI FOR CENSORED DATA |
|---|
Let X be an n-vector of times to the onset of a particular event and G be an accompanying n-vector of censoring times from another possible event. Assume that the components of X are independent with joint density given by (2.2) and that the components of G given X = x are given by (2.3); specifically, we assume a proportional hazards model for the hazard function of the censoring variable:
$$h_\gamma(g_i \mid x_i) = h_0(g_i)\exp(\gamma_1 x_i),$$
where h0(·) is the baseline censoring hazard. Taking H0(·) to be the cumulative baseline censoring hazard, the full likelihood is
![]() |
where $d_i = 1$ if $x_i \le g_i$ and 0 otherwise.
Under this model, a negative (positive) value of $\gamma_1$ implies that the posterior mean of survival under the ignorable model will be too large (small). In general, ISNI for the posterior mean of $\theta$ is
![]() |
Again, note that if no subjects are censored (all $d_i = 1$), then ISNI is 0, and every inference about $\theta$ is completely insensitive to nonignorability.
Assume that the marginal distribution of $X_i$ is exponential with mean $e^{-\theta}$ and that the conditional distribution of $G_i$ given $X_i = x_i$ is exponential with mean $e^{-\gamma_0 - \gamma_1 x_i}$. (Formulas for censored data in other models appear in Appendix C.) ISNI for the posterior mean survival $e^{-\theta}$ is in general
|
|
We consider 2 special cases:
case 1. Under the Jeffreys prior for $e^\theta$ and $e^{\gamma_0}$, we have that
![]() |
where ti = min(xi, gi).
case 2. Conjugate priors for $e^\theta$ and $e^{\gamma_0}$ take the prior for the censoring rate to be $\Gamma(k, f)$ and the prior for the survival rate to be $\Gamma(p, q)$. We can think of p and k as prior numbers of events and censorings, in prior lengths of follow-up q and f, respectively. Then, ISNI for the posterior mean survival $e^{-\theta}$ is
![]() |
where ti = min(xi, gi).
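The covariance definition of ISNI can be evaluated by simulation for the exponential model. The sketch below is an illustration under our own assumptions: gamma posteriors for the event rate $\lambda = e^\theta$ and censoring rate $\rho = e^{\gamma_0}$ of the kind that arise from priors proportional to 1/rate, and a score at $\gamma_1 = 0$ derived by differentiating the exponential likelihood, $-\rho \sum_{d_i = 1} t_i^2 + \sum_{d_i = 0} (1 - \rho t_i)(t_i + 1/\lambda)$. The toy data and function names are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n - 1)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def isni_mean_survival_closed(times, events):
    """VAR_I(mu) * sum over censored units of (1 - E_I[rho] * t_i), mu = 1/lambda.
    Assumes lambda ~ Gamma(d, rate=T) and rho ~ Gamma(c, rate=T); needs d > 2."""
    total = sum(times)
    d = sum(events)
    c = len(events) - d
    var_mu = total ** 2 / ((d - 1) ** 2 * (d - 2))   # inverse-gamma variance
    rho_mean = c / total
    return var_mu * sum(1.0 - rho_mean * t
                        for t, e in zip(times, events) if e == 0)

def isni_mean_survival_mc(times, events, draws=50000, seed=2):
    """Monte Carlo COV_I(mu, score) under the same assumed posteriors."""
    rng = random.Random(seed)
    total = sum(times)
    d = sum(events)
    c = len(events) - d
    sum_t2_events = sum(t * t for t, e in zip(times, events) if e == 1)
    cens = [t for t, e in zip(times, events) if e == 0]
    sum_t_cens, sum_t2_cens = sum(cens), sum(t * t for t in cens)
    mu_draws, score_draws = [], []
    for _ in range(draws):
        lam = rng.gammavariate(d, 1.0 / total)   # second argument is the scale
        rho = rng.gammavariate(c, 1.0 / total)
        mu = 1.0 / lam
        # Expanded form of the score stated in the lead-in.
        score = (-rho * sum_t2_events + sum_t_cens - rho * sum_t2_cens
                 + mu * (c - rho * sum_t_cens))
        mu_draws.append(mu)
        score_draws.append(score)
    return cov(mu_draws, score_draws)

# Hypothetical toy data: 40 follow-up times, 16 of them censored.
times = [0.1 * (i + 1) for i in range(40)]
events = [0 if i % 5 in (0, 2) else 1 for i in range(40)]
closed = isni_mean_survival_closed(times, events)
mc = isni_mean_survival_mc(times, events)
print(closed, mc)
```

With no censored observations the sum over censored units is empty and the closed-form value is 0, in line with the remark that inferences are then insensitive to nonignorability.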
A bone marrow transplant (BMT) for leukemia is considered a failure if the patient either relapses or dies while in remission. Oncologists postulated that the development of effective techniques for treating graft-versus-host disease and other complications of BMT would considerably reduce the incidence of death in remission. Thus, the marginal distribution of time to relapse, which is what we would observe if we could eliminate death in remission, was considered to be a worthy object of statistical inference. In this context, death in remission censors relapse, and because times to relapse and death in remission are possibly correlated, the censoring is potentially nonignorable and a sensitivity analysis is in order. We illustrate the procedure using data from a multicenter trial of BMT (Copelan and others, 1991). In up to 7 years of follow-up, 42 of 137 patients relapsed, 41 died in remission, and the rest survived without relapse. We assume that interest is in the mean time to relapse.
Let $X_i$ be the relapse time, $G_{1i}$ be the time to death in remission, and $G_{2i}$ be the maximum time of observation. We assume that $X_i$ is exponential with mean $e^{-\theta}$, $G_{1i}$ given $X_i = x_i$ is exponential with mean $e^{-\gamma_0 - \gamma_1 x_i}$, and $G_{2i}$ is uniform and independent of both $X_i$ and $G_{1i}$. The correct likelihood is
![]() |
where
![]() |
With this nonignorable model, it is possible (although not trivial) to calculate the posterior means of all the model parameters. Table 3 shows the posterior inferences from ignorable and nonignorable models. The Bayes factor is larger than 3, implying substantial evidence against $H_0: \gamma_1 = 0$. Because the posterior mean of $\gamma_1$ is negative, posterior mean relapse time from the ignorable model is evidently inflated. Moreover, the difference of posterior expectations from the ignorable and nonignorable models exceeds one SD, suggesting high sensitivity to nonignorability.
We calculated the ISNI for mean relapse time assuming a Jeffreys prior for $(e^\theta, e^{\gamma_0})$. To assess the sensitivity of posterior mean relapse time ($\tilde\mu$) to nonignorability without fitting the nonignorable model, we set $\gamma_1^* = \mathrm{SD}(\mu_0)/\mathrm{ISNI}(\tilde\mu) = 398.4/456114.4 = 0.00087$ and plotted ratios of the hazard of being censored by death in remission, $\exp(\gamma_1^* x^*)$ versus $x^*$, where $x^*$ ranges over the set of all plausible values of time to relapse (Figure 5). Because the ranges of plotted hazard ratios are plausible for the data, $\gamma_1^*$ is a plausible degree of nonignorability, and we conclude that posterior mean relapse time is sensitive to even this relatively modest amount of dependent censoring. In Table 4, we present the posterior mean of the censoring hazard ($\exp(\gamma_0 + \gamma_1^* x^*)$) and the posterior probability of the censoring hazard under the nonignorable model being less than the hazard under the ignorable model ($\exp(\gamma_0 + \gamma_1^* x^*) < \exp(\gamma_0)$), given $\gamma_1^*$, using the posterior distribution of $\gamma_0$ from the ignorable model and setting x to the extreme observed values of the relapse time. Because the ranges of the posterior means and probabilities are not implausible for the application, we again conclude sensitivity.
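The arithmetic behind the calibration value and the hazard-ratio curve can be checked directly. In this sketch the SD and ISNI are the figures quoted in the text; the grid of relapse times is a hypothetical choice in the same time units as the data, and `hazard_ratio` is an illustrative helper name.

```python
import math

sd_ignorable = 398.4   # posterior SD quoted in the text
isni = 456114.4        # ISNI quoted in the text
gamma1_star = sd_ignorable / isni

def hazard_ratio(gamma1, x_star):
    """Censoring hazard at relapse time x_star relative to x_star = 0."""
    return math.exp(gamma1 * x_star)

# Hypothetical grid of plausible relapse times.
grid = [0, 500, 1000, 1500, 2000, 2500]
ratios = [hazard_ratio(gamma1_star, x) for x in grid]
print(round(gamma1_star, 5), ratios)
```

The resulting list is the kind of curve whose range is then judged for plausibility against what is known about death in remission.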
| 7. DISCUSSION |
|---|
When data are possibly nonignorably coarsened, a formal assessment of sensitivity can be an important first step in their analysis. We have presented here a simple method for conducting such analyses for moment-based Bayesian inferences under the coarse data model. The method has the advantages of accommodating any type of parametric model, avoiding the estimation of a nonignorable model, and allowing a separate sensitivity analysis for each model parameter or function of model parameters. We have presented formulas for ISNI under simple parametric models for common coarsening scenarios. Our exposition is intended to be illustrative rather than comprehensive, in that we conjecture that it is straightforward to derive analogous formulas for more complicated failure and coarsening models.
When an analytical posterior is not available, a straightforward numerical approach would calculate ISNI from repeated fits of the model at a range of values of the nonignorability parameter $\gamma_1$. Alternatively, if using a Markov chain Monte Carlo (MCMC) estimation algorithm, one could fit the full nonignorable model and take the necessary derivative numerically by evaluating $E(\theta \mid \gamma_1, y, h)$ at a range of convenient $\gamma_1$ values in the sample path. We do not counsel the latter strategy, however, as the posterior from the full nonignorable model may behave badly and be difficult to explore. Moreover, there is no guarantee that MCMC will sojourn long enough in the vicinity of $\gamma_1 = 0$ to impart sufficient accuracy to the computation of ISNI.
A potential criticism of the method is its dependence on a parametric model for the coarsening mechanism. In all cases that we are aware of, the nonignorable model is at best poorly identified from the data because we never know the precise values of the coarsened observations that would allow us to robustly estimate such a model. Fortunately, results obtained so far in the missing data context suggest that the sensitivity analysis itself is reasonably robust to assumptions about the missing data selection model (Xie, 2003). Thus, there is empirical support for the conjecture that the choice of coarsening model has minimal influence on evaluations of sensitivity, at least locally. The validity of assumptions regarding the complete data model is equally a concern, but again our experience suggests that the degree of sensitivity to nonignorability is robust to the complete data model, which, if generally true, would simplify matters considerably (Xie, 2003; Zhang and Heitjan, 2006).
The sensitivity measures derived here are similar to those previously derived for MLEs in Troxel and others (2004) and Zhang (2004). This is unsurprising, as it is well known that Bayesian and frequentist data summaries differ only slightly in large samples (e.g. Kass and others, 1989). If the objective is to execute Bayesian inference, the Bayesian ISNI approach has the advantage that one can readily apply it to other posterior moments such as the probability that a parameter $\theta$ lies below a constant c, that $\theta_1 < \theta_2$, and so on (see Appendix A). The MLE version of ISNI may also be inaccurate as a substitute for Bayesian ISNI in small-sample situations where the posterior mean and MLE have not converged.
One can generalize the method to cover more complicated coarsening models such as those involving multiple nonignorability parameters, for example, a separate nonignorability parameter for each treatment group in a randomized clinical trial. Xie (2003) and Xie and Heitjan (2004) have described a method that identifies the direction of maximal local sensitivity to nonignorability. Approaches that exploit prior information on the nonignorability parameters are also possible. We note moreover that although the analysis presented here relies on parametric models, in principle the same reasoning is applicable to nonparametric or semiparametric models such as those underlying the Kaplan–Meier curve and Cox regression.
When an ISNI-based analysis suggests modest sensitivity, one can use the ignorable model with some assurance of robustness to minor departures from ignorability. When the analysis suggests more pronounced sensitivity, one has a range of options: First, if there is sufficient prior information on the coarsening mechanism, one can fit the nonignorable model. If not, then one can either conduct global sensitivity analysis to explore the dependence of inferences on $\gamma_1$ more generally or decline to conduct further inferential analyses on the grounds that model-based inferences are evidently sensitive to things unknown. In the survival example, we were able to estimate the nonignorable model without heroic efforts, but we are not suggesting that the results are therefore reliable. If one has recourse to nonignorable modeling, it seems prudent to consider sensitivity of the results to a range of potential nonignorable models, relying on the findings only to the extent that reasonable models agree.
| APPENDIX A |
|---|
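Letting $\pi(\theta)$ be the prior of $\theta$, the posterior mean of $\theta$ for a fixed $\gamma_1$ is

$$E[\theta \mid \gamma_1, y, h] = \frac{\int \theta\, L_C(\theta; \gamma_1, y, h)\, \pi(\theta)\, d\theta}{\int L_C(\theta; \gamma_1, y, h)\, \pi(\theta)\, d\theta},$$

where $L_C(\theta; \gamma_1, y, h)$ is the likelihood accounting for the coarsening. Define

$$\mathrm{ISNI} = \left.\frac{\partial E[\theta \mid \gamma_1, y, h]}{\partial \gamma_1}\right|_{\gamma_1 = 0} = E_I\!\left[\theta \left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right] - E_I[\theta]\, E_I\!\left[\left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right] = COV_I\!\left(\theta, \left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right),$$

where $\ell_C(\theta; \gamma_1, y, h)$ is the log of $L_C(\theta; \gamma_1, y, h)$, $L_I(\theta; y, h)$ is the likelihood when $\gamma_1 = 0$, and $E_I(\cdot)$ and $COV_I(\cdot)$ are the posterior expectation and covariance under the ignorable model, that is, $E_I[f(\theta)] = \int f(\theta)\, L_I(\theta; y, h)\, \pi(\theta)\, d\theta \big/ \int L_I(\theta; y, h)\, \pi(\theta)\, d\theta$.

For a function $Q(\cdot)$ of $\theta$, ISNI of $Q(\theta)$ is therefore

$$COV_I\!\left(Q(\theta), \left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right).$$

To calculate the sensitivity of the probability that $\theta \in c$, where $c$ is a subset of the sample space of $\theta$, we take $Q(\theta)$ to be the indicator $1\{\theta \in c\}$, and we have

$$E_I\!\left[1\{\theta \in c\} \left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right] - Pr_I[\theta \in c]\, E_I\!\left[\left.\frac{\partial \ell_C(\theta; \gamma_1, y, h)}{\partial \gamma_1}\right|_{\gamma_1 = 0}\right],$$

where $Pr_I[\cdot]$ is the posterior probability under the ignorable model.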
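As a numerical illustration of the covariance form of ISNI described in this appendix, the following Python sketch compares a finite-difference derivative of the posterior expectation with the posterior covariance under the ignorable model. It is only a sketch under stated assumptions: a Bernoulli outcome with success probability `theta`, a logistic missingness model whose intercept `gamma0` is treated as known, a uniform prior, and integration on a grid. The data and all names are hypothetical and do not come from the paper.

```python
import numpy as np

def expit(u):
    return 1.0 / (1.0 + np.exp(-u))

# Hypothetical data: 8 observed binary outcomes and 4 missing ones.
x_obs = np.array([1, 0, 1, 1, 0, 1, 0, 1])
n_mis = 4
gamma0 = 0.5                                  # assumption: treated as known here
theta = np.linspace(1e-4, 1 - 1e-4, 20001)    # grid over the Bernoulli probability
prior = np.ones_like(theta)                   # assumption: uniform prior

def log_lik(gamma1):
    """Coarse-data log-likelihood on the theta grid for a given gamma1."""
    s, n_obs = x_obs.sum(), x_obs.size
    h1, h0 = expit(gamma0 + gamma1), expit(gamma0)  # Pr[observed | x=1], Pr[observed | x=0]
    ll = s * np.log(theta * h1) + (n_obs - s) * np.log((1 - theta) * h0)
    ll += n_mis * np.log(theta * (1 - h1) + (1 - theta) * (1 - h0))
    return ll

def post_expect(q, gamma1):
    """Posterior expectation of q(theta) for fixed gamma1 (grid approximation)."""
    ll = log_lik(gamma1)
    w = np.exp(ll - ll.max()) * prior
    return np.sum(q * w) / np.sum(w)

def isni_cov(q):
    """Posterior covariance of q(theta) with d l_C / d gamma1 at gamma1 = 0."""
    h0 = expit(gamma0)
    score = x_obs.sum() * (1 - h0) - n_mis * h0 * theta  # analytic derivative at gamma1 = 0
    return post_expect(q * score, 0.0) - post_expect(q, 0.0) * post_expect(score, 0.0)

def isni_fd(q, eps=1e-4):
    """Central finite difference of the posterior expectation at gamma1 = 0."""
    return (post_expect(q, eps) - post_expect(q, -eps)) / (2 * eps)

below_half = (theta < 0.5).astype(float)      # indicator for the event theta < 0.5
isni_mean_cov, isni_mean_fd = isni_cov(theta), isni_fd(theta)
isni_prob_cov, isni_prob_fd = isni_cov(below_half), isni_fd(below_half)
print(isni_mean_cov, isni_mean_fd)
print(isni_prob_cov, isni_prob_fd)
```

In this toy setting the two routes should agree closely for both the posterior mean and the posterior probability, which is the point of the identity: the sensitivity can be read off from quantities computed under the ignorable model alone.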
| APPENDIX B |
|---|
Note that with a $\mathrm{Beta}(v, w)$ distribution for the prior of $h(\gamma_0)$, we can think of $v$ and $w$ as prior numbers of observed and missing outcomes. Then, we have
![]() |
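If $\theta = (\theta_1, \theta_2)'$ and $p_\theta(x_i)$ is in the exponential family (McCullagh and Nelder, 1989)

$$p_\theta(x_i) = \exp\left\{\frac{x_i\theta_1 - b(\theta_1)}{a(\theta_2)} + c(x_i, \theta_2)\right\}$$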
for some a(·), b(·), and c(·), then we have
|
|
and
![]() |
Assume that we have predictors $z_i = (z_{1i}, z_{2i}, \ldots, z_{pi})$ and that the $X_i$ are independent and normally distributed with mean $z_i\beta$ and variance $\sigma^2$, which gives in the generalized linear model formulation
|
The ISNI for the posterior mean $E[\beta \mid \gamma_1, y, g]$ is
![]() |
We further assume that the prior distributions of µ and $\log\sigma$ are both uniform and that µ and $\log\sigma$ are a priori independent, yielding
|
|
Then, we have
|
|
where $Z_o$ is the subset of $Z$ consisting of all rows for which the corresponding outcome is observed and $n_o$ is the number of observed subjects, and
|
|
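To make the role of the observed-row design matrix concrete, here is a small Python sketch of the ignorable-model fit for the normal linear model. It assumes uniform priors on the regression coefficients and on $\log\sigma$, in which case the posterior mean of the coefficients coincides with the least-squares fit on the observed rows. The data, the indicator vector `g`, and all variable names are hypothetical; the sketch does not reproduce the paper's ISNI formula, only the ignorable-model ingredients it is built from.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 12, 2
Z = np.column_stack([np.ones(n), rng.normal(size=n)])           # hypothetical design matrix
x = Z @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)    # hypothetical outcomes
g = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1], dtype=bool)  # True = outcome observed

# Keep only the rows whose outcome is observed.
Z_o, x_o = Z[g], x[g]
n_o = int(g.sum())

# Under the flat prior on (beta, log sigma), the posterior mean of beta
# under the ignorable model equals the least-squares estimate on observed rows.
beta_0, *_ = np.linalg.lstsq(Z_o, x_o, rcond=None)

# Posterior mean of sigma^2 under the same prior (requires n_o - p > 2).
resid = x_o - Z_o @ beta_0
rss = float(resid @ resid)
sigma2_mean = rss / (n_o - p - 2)

print(beta_0, sigma2_mean)
```

The missing rows of `Z` play no part in this ignorable fit; they would enter only through the sensitivity calculation.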
Suppose the $X_i$ are $n$ independent Bernoulli random variables with a common success probability $\Pr[X_i = 1]$, so that $b'(\theta_1) = \Pr[X_i = 1]$. Furthermore, we assume that this probability has a conjugate beta prior with parameters $e$ and $f$. Therefore, its posterior distribution is also beta with parameters $p$ and $q$, where $p = \sum_i g_i x_i + e$ and $q = \sum_i g_i(1 - x_i) + f$. ISNI for the posterior mean of this probability given $(\gamma_1, y, g)$ is then
![]() |
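The conjugate update for the Bernoulli case can be sketched in a few lines of Python. The example below computes the posterior beta parameters from the observed outcomes only, with `g` as the observation indicator. The data and the prior values are hypothetical, and the sketch covers only the ignorable-model posterior, not the ISNI formula itself.

```python
import numpy as np

# Hypothetical complete outcomes and observation indicators (1 = observed).
# Values of x where g == 0 would be unknown in practice and are never used.
x = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
g = np.array([1, 1, 1, 0, 1, 1, 0, 1, 1, 1])

e, f = 1.0, 1.0                      # assumed prior beta parameters

p = np.sum(g * x) + e                # observed successes plus e
q = np.sum(g * (1 - x)) + f          # observed failures plus f

post_mean = p / (p + q)                           # posterior mean under the ignorable model
post_var = p * q / ((p + q) ** 2 * (p + q + 1))   # posterior variance of a beta(p, q)

print(p, q, post_mean, post_var)
```

Multiplying by `g` is what drops the unobserved outcomes from both counts.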
Assume that we have predictors $z_i = (z_{1i}, z_{2i}, \ldots, z_{pi})$ and that the $X_i$ are $n$ independent Bernoulli variables with $\Pr[X_i = 1] = \mathrm{expit}(z_i\beta)$. Then, $b'(\theta_1) = \mathrm{expit}(z_i\beta)$. ISNI for the posterior mean $E[\beta \mid \gamma_1, y, g]$ is
![]() |
where the quantity with subscript 0 is the posterior mean of $\beta$ from the ignorable model.
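Suppose the $X_i$ are $n$ independent Poisson variables with mean $\lambda$. Then, $\theta_1 = \ln(\lambda)$, $b(\theta_1) = e^{\theta_1}$, $a(\theta_2) = 1$, and $E_\theta(X_i) = b'(\theta_1) = \lambda$. Taking $\lambda$ to have a conjugate gamma prior with parameters $e$ and $f$, ISNI for the posterior mean $E[\lambda \mid \gamma_1, y, g]$ is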
![]() |
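The exponential-family bookkeeping for the Poisson case can be checked numerically. The Python sketch below verifies that the derivative of $b(\theta_1) = e^{\theta_1}$ at $\theta_1 = \ln(\text{mean})$ recovers the mean, and then performs the conjugate gamma update using observed counts only. It assumes, as a labeled assumption not confirmed by the text, that the prior parameters `e` and `f` are the shape and rate of the gamma distribution. The data are hypothetical.

```python
import numpy as np

lam = 2.5                            # hypothetical Poisson mean
theta1 = np.log(lam)                 # natural parameter

def b(t):
    """Cumulant function for the Poisson family, b(theta1) = exp(theta1)."""
    return np.exp(t)

# Central finite difference approximation of b'(theta1); should recover lam.
h = 1e-6
b_prime = (b(theta1 + h) - b(theta1 - h)) / (2 * h)

# Hypothetical counts and observation indicators (1 = observed).
x = np.array([3, 1, 4, 0, 2, 5])
g = np.array([1, 1, 0, 1, 1, 0])

e, f = 2.0, 1.0                      # assumption: gamma prior shape e and rate f

shape_post = e + np.sum(g * x)       # shape plus total of observed counts
rate_post = f + np.sum(g)            # rate plus number of observed subjects
post_mean = shape_post / rate_post   # posterior mean under the ignorable model

print(b_prime, shape_post, rate_post, post_mean)
```

If the gamma prior in the paper is instead parameterized by shape and scale, the rate line would change accordingly.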
Assume that we have predictors $z_i = (z_{1i}, z_{2i}, \ldots, z_{pi})$ and that the $X_i$ are $n$ independent Poisson variates with mean $\lambda_i = \exp(z_i\beta)$. Then, $E_\theta(X_i) = b'(\theta_1) = \exp(z_i\beta)$. ISNI for the posterior mean $E[\beta \mid \gamma_1, y, g]$ is
![]() |
where the quantity with subscript 0 is the posterior mean of $\beta$ from the ignorable model.
Let Xi be n independent gamma variates with parameters
i = (µi,
), where
is a constant coefficient of variation and the mean is
= E[ß|
1, y, g] is then
![]() |
where the quantity with subscript 0 is the posterior mean of $\beta$ from the ignorable model.
Assume the Xi are n independent inverse Gaussian variables with parameters
i = (µi,
2), where
2 is the variance. Then,
2) = 1/
2 = 1/
2. Furthermore, we assume that we have predictors zi = (z1i, z2i, ..., zpi) with
= E[ß|
1,y, g] is then
![]() |
where the quantity with subscript 0 is the posterior mean of $\beta$ from the ignorable model.
| APPENDIX C |
|---|
Using the notation of the body of the paper, ISNI for the parameter of a general censored data–selection model is
![]() |
![]() |
where
We assume that the marginal distribution of Xi is Weibull with parameter
= (
,
i)' and mean
i = eziß : |
|
ISNI for
is then
![]() |
where the quantity with subscript 0 is the posterior mean of $\beta$ from the ignorable model.
Assume that the marginal distribution of Xi is exponential with mean e–
, which has a gamma prior with parameters p and q, and h0(gi) =
2g
. ISNI for the posterior mean survival
![]() |
![]() |
where $t_i = \min(x_i, g_i)$.
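The observed follow-up time, the minimum of the event time and the censoring time, can be illustrated with a short Python sketch of the ignorable-censoring fit for exponential survival. This is a sketch under assumptions that the text does not confirm: the gamma prior is placed on the hazard rate with shape `p` and rate `q`, and censoring is treated as ignorable. The data and names are hypothetical, and the sketch does not reproduce the ISNI expression.

```python
import numpy as np

# Hypothetical event times x and censoring times g.
x = np.array([2.0, 5.0, 1.5, 7.0, 3.0])
g = np.array([4.0, 3.0, 6.0, 6.5, 3.5])

t = np.minimum(x, g)                 # observed follow-up time t_i = min(x_i, g_i)
d = (x <= g).astype(int)             # 1 if the event was observed, 0 if censored

p, q = 2.0, 1.0                      # assumption: gamma prior shape p and rate q on the hazard

shape_post = p + d.sum()             # shape plus number of observed events
rate_post = q + t.sum()              # rate plus total follow-up time

# Posterior mean of the mean survival time 1/hazard (finite when shape_post > 1).
mean_survival = rate_post / (shape_post - 1)

print(t, d, shape_post, rate_post, mean_survival)
```

Only `t` and `d` enter the ignorable posterior; the unobserved parts of `x` beyond the censoring time are never used.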
| ACKNOWLEDGMENTS |
|---|
We acknowledge with thanks the contributions of the reviewers. The United States Public Health Service supported this research under grant HL68074 and cooperative agreements AI038855 and NS032228. Conflict of Interest: None declared.
| REFERENCES |
|---|
Allison PD. Survival Analysis Using the SAS System: A Practical Guide (1995) Cary, NC: SAS Institute.
Bernardo JM, Smith AFM. Bayesian Theory (1994) New York: Wiley.
Cook RD. Assessment of local influence. Journal of the Royal Statistical Society, Series B (1986) 48:133–169.
Copas JB, Li HG. Inference for non-random samples (with discussion). Journal of the Royal Statistical Society, Series B (1997) 59:55–95.
Copelan EA, Biggs JC, Thompson JM, Crilley P, Szer J, Klein JP, Kapoor N, Avalos BR, Cunningham I, Atkinson K, et al. Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with BuCy2. Blood (1991) 78:838–843.
Diggle P, Kenward MG. Informative drop-out in longitudinal data analysis. Applied Statistics (1994) 43:49–93.
Fisher L, Kanarek P. Presenting censored survival data when censoring and survival times may not be independent. In: Proschan F, Serfling RJ, eds. Reliability and Biometry (1974) Philadelphia, PA: Society for Industrial and Applied Mathematics. 303–326.
Heitjan DF. Ignorability and coarse data: some biomedical examples. Biometrics (1993) 49:1099–1109.
Heitjan DF. Ignorability in general incomplete-data models. Biometrika (1994) 81:701–708.
Heitjan DF. Ignorability, sufficiency and ancillarity. Journal of the Royal Statistical Society, Series B (1997) 59:375–381.
Heitjan DF, Rubin DB. Ignorability and coarse data. Annals of Statistics (1991) 19:2244–2253.
Jansen I, Hens N, Molenberghs G, Aerts M, Verbeke G, Kenward MG. A local influence approach applied to binary data from a psychiatric study. Biometrics (2003) 59:409–148.
Kass RE, Tierney L, Kadane JB. Approximate methods for assessing influence and sensitivity in Bayesian analysis. Biometrika (1989) 76:663–674.
Kenward MG. Selection models for repeated measurements with nonrandom drop out: an illustration of sensitivity. Statistics in Medicine (1998) 17:2723–2732.
Lagakos SW. General right-censoring and its impact on the analysis of survival data. Biometrics (1979) 35:139–156.
Ma G, Troxel AB, Heitjan DF. An index of local sensitivity to nonignorable dropout in longitudinal modeling. Statistics in Medicine (2005) 24:2129–2150.
McCullagh P, Nelder JA. Generalized Linear Models (1989) London and New York: Chapman and Hall.
Moeschberger ML, Klein JP. Consequences of departures from independence in exponential series systems. Technometrics (1984) 26:277–284.
Park Y, Tian L, Wei LJ. One- and two-sample nonparametric inference procedures in the presence of a mixture of independent and dependent censoring. Biostatistics (2006) 7:252–267.
Robins JM. Non-response models for the analysis of non-monotone non-ignorable missing data. Statistics in Medicine (1997) 16:21–37.
Rosenbaum PR. Observational Studies (1995) New York: Springer.
Rubin DB. Inference and missing data. Biometrika (1976) 63:581–592.
Serfling RJ. Approximation Theorems of Mathematical Statistics (2002) New York: Wiley.
Siannis F. Applications of a parametric model for informative censoring. Biometrics (2004) 60:704–714.
Siannis F, Copas J, Lu G. Sensitivity analysis for informative censoring in parametric survival models. Biostatistics (2005) 6:77–91.
Troxel AB, Ma G, Heitjan DF. An index of local sensitivity to nonignorability. Statistica Sinica (2004) 14:1221–1237.
Tsiatis A. A nonidentifiability aspect of the problem of competing risks. Proceedings of the National Academy of Sciences of the United States of America (1975) 72:20–22.
Verbeke G, Molenberghs G. Sensitivity analysis for nonrandom dropout: a local influence approach. Biometrics (2001) 57:7–14.
Walker AM. On the asymptotic behaviour of posterior distributions. Journal of the Royal Statistical Society, Series B (1969) 31:80–88.
Xie H. An index of sensitivity to nonignorability: extensions and applications [Doctoral dissertation]. In: Department of Biostatistics (2003) New York: Columbia University.
Xie H, Heitjan DF. Sensitivity analysis of causal inference in a clinical trial subject to crossover. Clinical Trials (2004) 1:21–30.
Zhang J. Sensitivity analysis of nonignorable coarsening [Doctoral dissertation]. In: Department of Biostatistics & Epidemiology (2004) Philadelphia: University of Pennsylvania.
Zhang J, Heitjan DF. Nonignorable censoring in randomized clinical trials. Clinical Trials (2005) 2:488–496.
Zhang J, Heitjan DF. A simple sensitivity analysis tool for nonignorable coarsening: application to dependent censoring. Biometrics (2006) 62:1260–1268.
Zheng M, Klein JP. Estimates of marginal survival for dependent competing risks based on an assumed copula. Biometrika (1995) 82:127–138.
Received April 7, 2006; revised December 11, 2006; accepted for publication December 20, 2006.