Biostatistics Advance Access originally published online on May 25, 2005
Biostatistics 2005 6(4):615-632; doi:10.1093/biostatistics/kxi032
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A Bayesian mixture model relating dose to critical organs and functional complication in 3D conformal radiation therapy
Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA tdjtdj{at}umich.edu
Department of Biostatistics, School of Public Health and Department of Radiation Oncology, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
Department of Radiation Oncology, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
* To whom correspondence should be addressed.
| SUMMARY |
|---|
|
|
|---|
A goal of cancer radiation therapy is to deliver maximum dose to the target tumor while minimizing complications due to irradiation of critical organs. Technological advances in 3D conformal radiation therapy has allowed great strides in realizing this goal; however, complications may still arise. Critical organs may be adjacent to tumors or in the path of the radiation beam. Several mathematical models have been proposed that describe the relationship between dose and observed functional complication; however, only a few published studies have successfully fit these models to data using modern statistical methods which make efficient use of the data. One complication following radiation therapy of head and neck cancers is the patient's inability to produce saliva. Xerostomia (dry mouth) leads to high susceptibility to oral infection and dental caries and is, in general, unpleasant and an annoyance. We present a dosedamageinjury model that subsumes any of the various mathematical models relating dose to damage. The model is a nonlinear, longitudinal mixed effects model where the outcome (saliva flow rate) is modeled as a mixture of a Dirac measure at zero and a gamma distribution whose mean is a function of time and dose. Bayesian methods are used to estimate the relationship between dose delivered to the parotid glands and the observational outcomesaliva flow rate. A summary measure of the dosedamage relationship is modeled and assessed by a Bayesian
2 test for goodness-of-fit.
Keywords: Bayesian analysis; 3D conformal radiation therapy; Dirac measure; MCMC; Mixture model; Goodness-of-fit; Radiotherapy
| 1. INTRODUCTION |
|---|
|
|
|---|
Three-dimensional (3D) conformal radiation therapy (Lichter, 1991
Radiation therapy is given in a series of daily small fractionated doses (typical dose fraction sizes are 1.53 Gray (Gy), with the most common values being 1.8 and 2.0 Gy) delivered to the target 5 days per week over approximately a 6-week period. The total dose delivered to the tumor is typically on the order of 6075 Gy and should be homogeneous, in order to avoid cold spots from which the tumor could regenerate. The fractionated dose distribution may differ from one day to the next due to designed field size changes and technique changes. However, in practice only one or two changes are made during the course of treatment. The 3D treatment-planning software calculates the dose, at every fraction, at every voxel (volume element) in the path of the radiation beam based on volumetric data sets typically obtained by computed tomography (CT). The goal of the treatment plan is two-fold. First, to deliver the prescribed overall dose of radiation homogeneously to the target (typically a malignant tumor). Second, to minimize the dose delivered to the surrounding healthy tissues and organs and to those in the path of the radiation beam. The development of 3D conformal radiation therapy treatment has allowed radiation oncologists to meet these goals to a larger extent than previously possible.
The dose distribution to a specified region is typically summarized by the dose volume histogram (DVH) based on the total delivered dose to that region. The DVH is analogous to the survivor functionit is one minus the distribution function of the total dose over the region of interest. Throughout this paper, the dose distribution function to the region of interest (in our case an healthy organ) will be denoted by F.
The model we consider in this article can be broken down into two parts. The first part relates the DVH, or equivalently F, to the damage suffered by a specific healthy organ. The damage to the organ is unobserved. The second part of the model relates damage to injury incurred by the organ. Injury is the observed outcome. Several biomathematical models relating dose to damage incurred have been proposed, e.g. Chappell et al. (1995)
, Dawson et al. (2002)
, Deasy et al. (2002)
, Jackson et al. (1995)
, Lyman (1985)
and Seppenwoolde et al. (2003)
. However, few of these have been fit to data using modern statistical methodology, as pointed out by Shultheiss et al. (1983)
. Some analyses of data from 3D conformal therapy have been published (Boersma et al., 1998
; Chao et al., 2001
; Jackson et al., 1993
, 1995
; Kwa et al., 1998
; Roesink et al., 2001
; Scrimger et al., 2004
; Theuws et al., 1998
). In these papers, the authors have attempted to estimate the relationship between the DVH and complications. (However, the analyses, for the most part, have not been very sophisticated from a statistical point of view.) Although the current usage of statistical methods in the field of 3D conformal radiation therapy is limited, there have been some improvements in recent years. Maximum likelihood estimation has been used (Roberts and Hendry, 1993
). Profile likelihood confidence intervals have been advocated (Schilstra and Meertens, 2001
). The use of simulation to assess goodness-of-fit has been suggested (Stavrev and Stavrev, 2000
). The bootstrap has been used to obtain confidence intervals (Chao et al., 2001
), and generalized linear models/generalized estimating equations have been employed (Eisbruch et al., 1999
). Shultheiss (2001)
lists a number of problem areas in modeling normal tissue radiation injury. The list includes (i) too few data for analysis or too few events, (ii) assigning values to parameters that should be estimated, (iii) using inappropriate statistical methods, ad hoc methods, and univariate analysis, and (iv) ignoring model validation.
In this article, we address the aforementioned issues. We use an empirical measure of damage and concentrate on modeling the relationship between damage and injury to critical organs, the parotid glands (the major salivary producing glands), that occurs in head and neck cancer patients. The data come from an ongoing study being conducted in the Department of Radiation Oncology at the University of Michigan's Comprehensive Cancer Center. This article lays the foundation for future work where we will use the proposed model to compare, and develop, different biomathematical models that relate dose to damage.
The paper is organized as follows. The data set is described in Section 2, while our model is presented in Section 3. Details of the Markov chain Monte Carlo (MCMC) estimation procedure are detailed in Section 4. Results of the data analysis are given in Section 5 along with an analysis of the goodness-of-fit of the model. In Section 6, we propose an analysis using off-the-shelf software and argue that our approach is necessary to capture all the structure in the data. We conclude with a discussion of future work in Section 7.
| 2. THE PAROTID DATA SET |
|---|
|
|
|---|
The data set is comprised of 87 head and neck cancer patients. The outcome of interest is saliva flow rate from the parotid glands that are located in the cheek just in front of each ear. Saliva flow rate (ml/min) was obtained by placing a suction cup over the secretory duct of the parotid gland (Heft and Baum, 1984
|
|
We found from the goodness-of-fit summary (Section 5.1) that a transformation of the data was necessary. The goodness-of-fit (GOF) summary suggested that there were too many larger than expected observed values and that a transformation of the observations was necessary. The log transformation is a common function used to transform strictly positive, skewed data to more symmetrically distributed observations. However, these data have many observations at 0 and we felt it was desirable to treat all observations equally (that is, transform all observations). Further, a log transformation of only the nonzero observations skewed the data in the opposite direction because nonzero saliva flow rates were measured as low as 0.005 ml/min. Therefore, since all but two observations were less than 2 ml/min, we transformed the data by squaring the observed values divided by 2. That is, if y is an observation, we took y* = (y/2)2. This transformation was sufficient to achieve a satisfactory GOF summary (see Section 5.1).
| 3. THE DOSEDAMAGEINJURY MODEL |
|---|
|
|
|---|
The dosedamageinjury model is a radiobiologically based model that is central to the field of radiation therapy and that is amenable to statistical data analysis (Jackson et al., 1993, 1995
We present a dosedamageinjury model and focus on the relationship between damage and injury. We take a Bayesian approach to estimating model parameters. This part of the model assumes that the injury incurred by the parotid gland is measured by saliva flow rate. Saliva flow rate is assumed to be a smooth, monotone decreasing function of damage. Random effects for model parameters are used to account for overdispersion. Before introducing this part of the model, we first discuss the dosedamage relationship.
As mentioned in Section 1, several biomathematical models that relate dose to damage have been proposed. One of the main goals of this ongoing study is to evaluate the appropriateness of these models to these data, modify these models if necessary and develop new models that are appropriate for these data. However, in this article our focus is on the relationship between damage and injury and not on validating the previously proposed dosedamage models.
Nevertheless, the first part of our model relates dose (observed) to damage (unobserved). In general, damage can be written as a functional, T(F), of the dose distribution, F = 1 DVH, to the parotid and is typically a value between 0 and 1, where 0 represents no damage and 1 represents maximal damage. The biomathematical models mentioned are of this type. More empirically, we might think of damage as a scalar summary of F, such as a functional that maps F to the real line. Two examples of such a functional are the generalized mean: T(F) = [
x
dF(x)]1/
and the inverse cumulative distribution function (inverse cdf): T(F) = F1(p). Although in these cases, damage is some scalar summary of F, it has the same unit as dose, Gy, and it does not lie in [0, 1].
We take an empirical approach to estimate the damage from the dose distribution. In particular, we use the inverse cdf,
![]() |
to specify the relationship between dose and damage. Formally, d is the dose at which 100p percentage of the parotid received dose d or less . We take d to be a measure of damage, thus d can take values from 0 to the maximum possible dose in this study, 81.5 Gy. We estimate the posterior distribution of p from the data. A relatively large estimate of p suggests that the dose to a majority of the parotid gland is important in assessing overall damage. Whereas, a relatively small estimate of p suggests that the dose to only a small portion of the parotid gland is necessary. This could have implications as to whether an attempt should be made to reduce the overall dose distribution to the parotid gland, or to attempt to spare a portion of the gland and not worry about the rest. We assign the quantile p, which we assume to be the same for all subjects, a U[0, 1] (uniform) prior distribution.
The second part of our model relates damage to the injury incurred by the parotid gland. Specifically, the (transformed) expected saliva flow rate is modeled as a smooth, monotone decreasing parametric function of damage. In our particular application, saliva flow rate is used to determine the extent of injury to the parotid glands. This part of the model is a nonlinear mixture model with both fixed and random effects. As mentioned in Section 2, there are numerous outcome measures of zero. To account for these zero saliva flow rates, we model saliva flow rate at a given level of damage and at each time, as a mixture of a Dirac measure (point mass) at zero and a gamma distribution. This type of mixture model (a point mass at zero mixed with a standard distribution function) is reminiscent of Lambert's (1992)
zero-inflated Poisson regression model. The parotid glands are indexed by k (=0 for ipsilateral and =1 for contralateral), time is indexed by j, and the stimulation condition is indexed by
(=0 for normal and =1 for stimulated). Subjects are indexed by i. dik will denote the damage incurred by parotid k of subject i.
Let Zijk
be a latent allocation variable taking values in {0, 1} such that p(Zijk
= 1) =
j
(dik). Let, N(µ,
2) denote the normal distribution with mean µ and variance
2 and let G(µ,
) denote the gamma distribution with mean µ and shape parameter
. Thus, if X
G(µ,
) the density of X is given by (
/µ)
x
1 e
x/µ/
(
). Then saliva flow rate Yijk
conditional on the allocation variable, as well as all other model parameters, has the following distribution:
![]() |
Marginalizing over Zijk
, Yijk
is a mixture of a point mass at zero and a gamma random variable:
![]() |
We model the mixing parameters, {
j
}, as smooth, monotone decreasing functions of damage allowing for the possibility that the mixing proportion at a damage of zero is less than one (
j
(0) < 1). This allows subjects to have zero saliva flow at baseline, prior to radiation therapy, due to xerostomia (dry mouth).
j
(dik) can also be interpreted as the probability of nonzero saliva flow rate under stimulation condition
at time j for a subject that has incurred damage level dik to parotid k . In particular, we model the logit of
j
as a linear function of damage:
![]() | (3.1) |
The intercept, 
> 0, is indexed by stimulation condition and the slope,
j > 0, is indexed by time. Indexing the intercept by stimulation condition allows for differing baseline proportions of subjects with no saliva flow between the stimulated and normal conditions. Indexing the slope by time allows for the probability of nonzero saliva flow to change over time at a given level of damage, thus allowing for the possibility that the parotid recovers from damage over time. 
and
j are assigned vague, proper exponential priors with rates 0.01 (
(0.01)). At baseline, prior to radiation therapy, we set
0
0 since no radiation has been given and no damage has occurred due to radiation exposure. Therefore, we estimate the posterior distribution of
j only at the six posttreatment times j=1, 2, 3, 4, 5, 6 corresponding to 1, 3, 6, 12, 18, and 24 months postradiation therapy, respectively.
When Z = 1, the conditional expected outcome, µj
(dik), is restricted to be a smooth, monotone decreasing, parametrized function of the damage to the parotid: µj
: [0,
)
[0,
). Let
ik
, ßij
,
ik
> 0. We use a scaled version (scaled by
ik
) of the incomplete gamma function
|
| (3.2) |
(Here we have taken a bit of notational liberty. Technically, we should have shown the conditional dependence of µj
on the parameters
ijk
, ßij
, and
ik
. Henceforth, this conditional dependence will be implicit.) µj
achieves its conditional maximum,
ik
, at dik = 0 and has an asymptote of zero as dik
. We chose this function because of its flexible shape. It takes three parameters:
, which scales the range and
, ß which together control location and shape. It has a sigmoid shape and inflection point at
(
1)/ß for
> 1 and is exponential when
= 1. When 0 <
< 1, the function approaches it's asymptote of zero at a rate faster than exponential. One could also choose the survivor function of any distribution with nonnegative support in lieu of the incomplete gamma function, for example, the Weibull or lognormal distribution.
We model the priors for the parameters of the mean function, µj
, in a hierarchical fashion, thus accounting for some of the correlation within subjects across parotid glands, stimulation condition and time. In particular, let
i = (
i00,
i01,
i10,
i11)T, for all i. Further, let Wd(S) denote the Wishart distribution with d degrees of freedom and symmetric, positive definite scale matrix S for which the dimension will be obvious. To satisfy the positivity constraint on
i, we use a lognormal prior:
![]() |
where g = ln(0.1, 0.1, 0.1, 0.1)T, V = 100I4, R = 0.01I4, and Id is the d x d identity matrix. This gives a rather vague, but proper prior on both the random effects and their mean. To account for overdispersion, random parotid and stimulation effects within each subject,
i, are centered about the population mean,
, with covariance
.
accounts for correlation within subjects between and across parotids and stimulation conditions at baseline. A priori, we assume that the components of the random effects vector are uncorrelated, as well as the components of the population mean vector.
Lognormal priors are also placed on the other two parameters of the mean function. Specifically, let
ij = (
ij00,
ij01,
ij10,
ij11)T, for all i and j. Let ßi
=(ßi1
,..., ßi6
)T for all i,
. Then
![]() | (3.3) |
where a = ln(3, 3, 3, 3)T, U = I4, and T = 2I4; and
![]() | (3.4) |
where h = (0.3, , 0.3)T, W = I6, and S = 2I6.
accounts for posttreatment correlation between stimulation conditions and parotid glands. We assume that this correlation is constant over time.
accounts for correlation over time within a subject. We assume that this correlation is the same for both normal and stimulated conditions. We note here that the prior specifications for the mean of the random effects, the prior distributions in (eqn:alpha) and (eqn:beta) are somewhat informative. These priors were chosen, however, in combination to give a broad range of shapes (Figure 2) for the prior mean function µj
which is of particular interest to this study. Thus, we believe that this prior specification does not overly influence the posterior distribution of the mean function. Finally, the shape parameters,
j
, are given an exponential prior distribution with rate 0.1. This completes our model specification.
|
Accounting for correlation over time within a parotid gland, between parotid glands and across stimulation conditions, along with numerous outcome measures of zero and any potential overdispersion requires one of two solutions: (1) either make simplifying assumptions and ignore some or all potential correlation within the data or (2) fit the data with a necessarily complex model that accounts for the various forms of correlation, numerous outcome measures of zero, and included random effects to help account for overdispersion. The latter choice is preferred from a statistical point of view.
| 4. MCMCESTIMATION |
|---|
|
|
|---|
Because of the complexity of the model, the posterior distribution of the model parameters is not analytically tractable and so we estimate the posterior distribution through MCMC simulation. The MCMC simulation is a hybrid MetropolisHastings and Gibbs algorithm. For those parameters whose full conditional posterior distributions are analytically tractable (
,
,
,
,
, ß) we perform a Gibbs update and perform a MetropolisHastings step for all other parameters. We update parameters in the following order: 
,
= 0, 1 ;
j, j = 1,..., 6;
ik, i = 1,..., N, k = 0, 1;
;
; ßi
, i = 1,..., N,
= 0, 1;
; ß;
i, i = 1,..., N;
;
;
j
, j = 0, 1,..., 6,
= 0, 1; and p.
A complication arises in estimating the mean function µj
(dik) for a particular parotid gland from a specific subject under a distinct stimulation condition at a given time in that the observation is obtained at only one level of damage, dik. However, we would like to obtain a smooth, monotone decreasing estimate of saliva flow rate as a function of damage incurred. And we would like to have this estimate for each parotid under both conditions. (Estimation of this mean function is of interest insomuch as it is part of the model, it will allow saliva flow predictions to be made and will allow us to compare various dosedamage relationships.) To circumvent this complication, we use the baseline observation as a second point, which corresponds to the observed saliva flow rate at dik = 0. This assumes that saliva flow rate is not a function of time when no radiation has been delivered, which we believe is a reasonable assumption. A third observation can be obtained as a latent variable evaluated at the estimated damage incurred by the opposing parotid gland within the particular patient of interest. This is accomplished by drawing the latent observation from its posterior predictive density. These draws are easily obtained at each step of the MCMC simulation. First draw
from a Bernoulli distribution with parameter
j
(dik*), where k* = |k 1| (i.e. at the damage received by the other parotid gland). Then if
set
and if
draw
from a gamma distribution with mean µj
(dik*) and shape parameter
j
. Here,
j
(dik*), µj
(dik*), and
j
are the current draws from the posterior distribution of the parameters. By the hierarchical structure placed on the parameters of the mean function and assuming that the covariance matrix
is constant over time and
is the same regardless of stimulation condition, we borrow strength over time, across stimulation conditions and between parotid glands when drawing these latent variables from the aforementioned gamma distribution.
Missing data are treated as MCAR (Little and Rubin, 2002
). Since this is an ongoing study, missing data can occur because patients have not completed the study, missed appointments, become lost to follow-up, or die. However, complications to the parotid glands are most likely not the cause of death, nor are they associated with death, and so we feel that this assumption is valid. Further, in a related analysis on a subset of these data, Eisbruch et al. (2001)
found no relationship between xerostomia questionnaire scores and patterns of missing data. Missing observations are also treated as latent variables and the values are drawn from the posterior predictive distribution as above. Drawing the missing data gives us a complete balanced data set which simplifies computation for some of the model parameters. In particular, it allows us to easily draw the precision matrices
1,
1, and
1 from their conjugate full conditional distributions.
We ran the simulation for 250 000 iterations with a burn-in of 50 000 iterations. We assessed convergence graphically and ran the chain from several different initial states. The posterior distribution was estimated by saving every 10th iteration after burn-in for a total of 20 000 draws from the posterior.
| 5. DATA ANALYSIS RESULTS |
|---|
|
|
|---|
Damage, as we have modeled it, is a function of a random parameter p. As such, damage has a posterior distribution and can cause difficulty in interpreting results of the model, especially for radiation oncologists who are interested in, among other things, the probability of nonzero saliva flow at a specified level of damage. That is, at a point estimate of damage. The estimated marginal posterior density of p has a median of 0.341, a 95% Bayesian credible interval (equal tail area) of (0.295, 0.381) and is quite symmetric, with an estimated skewness coefficient of 0.001. Thus, this distribution is fairly tight around its median value and roughly symmetric. Therefore, we reran our simulation, conditional on p = 0.341. Results of the unconditional simulation and the conditional simulation are virtually indistinguishable. Hence, we report results conditional on p = 0.341.
Two summary measures of interest are the probability of nonzero saliva flow at each time point under both conditions,
j
(d), and the marginal population mean saliva flow rate as a function of damage for each condition and parotid gland, µj
(d). The first of these is obtained from (eqn:logit). The second can be obtained by (eqn:mean) evaluated at the median values of
ik
, ßij
, and
ik
. We concentrate on the 12-month results.
The median probability of nonzero saliva flow rate is displayed in Figure 3 for both stimulation conditions along with the Bayesian 95% point-wise credible intervals (Wahba, 1983
). From this figure, one can see that as the level of damage increases the probability of having nonzero saliva flow decreases (as it should: more damage, less saliva). A patient who has an estimated damage of 26.8 Gy will have a 50/50 chance of having no saliva production (under normal conditions at 12 months). This value is called the toxic dose 50, denoted TD50 = 26.8 Gy. According to our model, if the dose distribution to the parotid could be altered, without altering the dose to the tumor, in such a way that the 0.341 quantile of F = 1 DVH is made smaller, the probability of the patient having nonzero saliva flow rate would be greater. Let 
and
j denote the median values of the estimated marginal posterior distributions of 
and
j, respectively. An estimate of the TD50 at time j for stimulation condition
is 
/
j. Under stimulated conditions, the estimated TD50=48.3 Gy at 1 year. The estimated TD50 at each of the postradiation therapy follow-up times are tabulated in Table 2. We see that there is an increase in the TD50 up to 18 months and then a slight drop-off. The increase indicates that the parotid glands recover from ionizing radiation-induced damage, to some degree. The slight drop-off may be due to chance variation or may be due to the pattern of missing observations shown in Table 1.
|
|
The median of the estimated marginal population mean saliva flow rate posterior distribution for each condition by parotid gland is shown in Figure 4 along with the Bayesian 95% credible bands. In this figure, the gray shaded area represents the range of values of damage estimated from the observed data for that particular parotid gland. The solid black dots represent sample means from the observed data at 12 months. The observed data were first grouped into the lower, middle, and upper third quantiles of their estimated damage evaluated at the median value of p. Then for each grouped set of data, the mean saliva flow rate was plotted against the mean damage. The sample baseline mean saliva rate is also depicted in the graphs as a solid dot at a damage of 0. The x symbol represents the mean saliva flow rate of the means of the latent observations evaluated by grouping the latent observations according to the grouping performed on the observed observations from the opposing parotid gland. Figure 4 shows that our model does a good job at modeling the marginal mean saliva flow rate. Note the similarity in shapes of the predicted marginal mean saliva flow rates in the four figures; especially the similarity in shape between the ipsilateral and contralateral predicted mean saliva flow rates under normal conditions and between the two parotid glands under stimulated conditions. Also note the positions of the means of the latent variables relative to the positions of the observed means from the opposing parotid gland under each stimulation condition. The latent means track the relative positions of the observed means fairly closely. This is a consequence of borrowed strength across the glands, stimulation condition, and time that was alluded to in Section 4.
|
Figure 5 shows the median of the marginal population mean saliva flow rate posterior distribution superimposed on the 12-month, stimulated data. This figure also displays the baseline observations as well as the means of the latent and missing values that were drawn from the estimated posterior predictive distribution. Again, the latent/missing predicted observations mix nicely throughout the observed data, due to the strength borrowed from the observed data. At damage levels larger than about 25, the latent contralateral predicted observations are a bit higher than the observed ipsilateral observations. This is a consequence of the larger predicted baseline population mean (0.51 for the contralateral side, 0.36 for the ipsilateral side, both under stimulated conditions).
|
We assessed model adequacy by a Bayesian
2 goodness-of-fit statistic (Johnson, 2004
). This goodness-of-fit test is similar to the classical
2 goodness-of-fit test, except in the Bayesian version the bin probabilities pb are held fixed at their null values and the bin counts are considered random.
Let
denote a sample from the posterior distribution and F the marginal sampling distribution after integrating out the missing data and latent variables. For this goodness-of-fit test we used 10 equally spaced bins such that the bin counts mb(
) are determined by counting the number of F(yijk
|
)
((b 1)/10, b/10] for b = 1,..., 10. When Yijk
|
= 0, we use a randomization procedure to determine bin assignment. That is, we draw z
U(0, 1
j
(dik)), then if F(z)
((b 1)/10, b/10] we increment mb by 1. Johnson's Bayesian
2 statistic is then
![]() |
where n is the sample size and pb is the theoretical probability of bin assignment, in our case with 10 bins, pb = 0.1. Johnson suggests two summary measures of the distribution of RB(
). First is the proportion of times it exceeds the critical value of a
distribution, 16.919. That is, the 0.95 quantile of a
2 distribution with nine degrees of freedom. Values much larger than the nominal 0.05 suggest lack of fit. The second suggestion is the probability that RB(
) is greater than a
random deviate. The nominal value of this probability is 0.5 and values much greater than this nominal value suggest lack of fit. For our model, the proportion of times RB(
) exceeds the critical value of a
distribution is 0.052. The probability that RB(
) is greater than a
random deviate is 0.51. Both of these summaries suggest that our model fits the data adequately. Figure 6 depicts the quantilequantile plot of RB values calculated from 20 000 samples from the posterior distribution.
|
As Johnson points out, the Bayesian
2 goodness-of-fit statistic is also a useful tool for code verification. The distribution of RB tends to deviate from its null distribution when the model is incorrectly specified or there are coding errors. It is evident from Figure 6 that RB does not deviate from its null distribution and thus we are confident that we have coded our model correctly and that this is an adequate model for the data. | 6. IS SUCH COMPLEXITY NECESSARY? |
|---|
|
|
|---|
It could be argued that we have unnecessarily complicated this analysis, and that perhaps a simpler model with less assumptions would fit the data adequately (and one that can be fit using off-the-shelf software). In this section, we relax some of our assumptions and fit the data with a generalized linear mixed model (GLMM). We will relax the assumption that the four saliva flow measurements from each individual are correlated and treat the four measurements as independent measurements. That is, at each time point, the saliva flow rate from the ipsilateral and contralateral parotid glands at the stimulated and unstimulated conditions are all independent measurements. We will still assume that the saliva flow rates from a particular parotid gland under a particular stimulation condition are correlated over time.
To fit the data we use the quasilikelihood approach specifying the mean function and modeling the variance of the outcome, saliva flow rate, as a function of the mean. Specifically, let Yij denote the saliva flow rate from parotid i at time j = 0, 1, 3, 6, 12, 18, 24. Let µij = E(Yij). Further, let C denote the set of contralateral parotid glands and S denote the set of parotid glands that were stimulated. Since the outcome is strictly nonnegative, we model the log of the expected saliva flow as a function of damage di to parotid i. As before,
where Fi = 1 DVHi. In particular,
![]() |
Above,
is the model intercept and the ai
N(0,
2) are parotid-specific random effects. I(A) is the indicator function equal to 1 if A is true and 0 otherwise and ßj are dose effect parameters, one for each j > 0. Also,
and
are the side and stimulation condition fixed effects, respectively. Further,
![]() | (6.1) |
Note the similarity between this model and a model for count data with a Poisson error and canonical log link. In fact, fitting this model with the function glmmPQL in the MASS library (Venables and Ripley, 2002
) in R (R Development Core Team, 2004
), we can specify a Poisson family with log link. The model fits are identical, but since the observations are not integer counts, we encounter several warning messages that the data are not integer count data. Thus, we resort to the quasilikelihood approach. In this model, the mean saliva flow rate is an exponential function of damage. Our mixture model allows mean saliva flow rate to be a more flexible function of damage which contains, as a special case, the exponential function. Several correlation structures were considered. The particular correlation structure specified in (6.1) was chosen because it gave the fewest problems with convergence.
Ideally, p, and hence the p th quantiles di, should be estimated from the data, but since di are nonlinear functions of the DVHs, they cannot be estimated from the data in this quasilikelihood GLMM approach. Thus, we fit the model using several different values for p. In particular, we chose several different values of p between 0 and 1 and fit a separate model for each choice. Assessment of model fit is based on two measures: the sum of the squared Pearson standardized residuals, given as
2, and the Bayesian information criterion (BIC), with smaller values of each denoting better fit. BIC is the sum of two terms: one that measures the goodness-of-fit (2 times the log-likelihood) and the other that is a penalty term for complexity (number of parameters times the log of the number of observations). Formally, this quasilikelihood approach does not have a likelihood and thus there is no theoretical justification for using the BIC. Further, it is not clear how many parameters are in a random effects model. However, because the equations used in our quasilikelihood estimation are the same as the likelihood equations for count data using a Poisson model with log link and the number of parameters is not changing (no matter how many there are), we feel justified in informally using BIC as a criterion. The following table, Table 3, shows that the 0.55 quantile results in the best fitting model according to
2 and that the 0.90 quantile gives the best fitting model according to BIC. The missing values in the table at p=0.4 and 0.45 are a result of nonconvergence due to numerical instability for those particular values of p. The best fitting model based on each criterion is shown in boldface. It is disconcerting that these criteria chose models that are quite different with regard to the best p.
|
A residual analysis shows a deficiency of this model. Figure 7 displays the Pearson standardized residuals as a function of the fitted values from this model (that chosen by
2) and the Bayesian standardized residuals as a function of the Bayesian predicted values (Gelfand et al., 1992
|
By contrast, the residuals from our mixture model shown in Panels B and D of Figure 7 show much less residual structure and do not indicate any severe model deficiencies. This is a result of our explicit modeling of the observed zeros.
To summarize, we feel that the level of complexity of our mixture model is justified on several points. First, we had several convergence problems with the GLMM approach. Second, this approach cannot account for the large number of zero outcomes in the data. Third, our decision on which quantile to use is based on a heuristic with no theoretical justification. Fourth, different model selection criteria chose quite different models. Fifth, this approach cannot estimate p nor can it gives measures of uncertainty about p. Lastly, nonnested models cannot be compared using these model comparison measures (this will become important when we compare different dosedamage modelsa central goal of this ongoing study, see Section 7).
| 7. DISCUSSION |
|---|
|
|
|---|
We have developed and implemented a realistic, albeit complicated, model to analyze the parotid data set outlined in Section 2. The need to account for the numerous outcome measures of zero and to account for within parotid gland correlation over time, correlation between stimulation conditions, and correlation between the two parotid glands within a subject requires such a complex model. However, this model did come at a price: time and effort in developing specialized software. In this model, we used an empirical measure of damage, namely the inverse cdf of the dose distribution and concentrated on modeling the damageinjury relationship in the dosedamageinjury model. Further, our analysis shows that the parotid gland recovers with time. A similar result was found by Scrimger et al. (2004)
2 goodness-of-fit statistic, on the other hand, is useful in determining model fit as opposed to model choice. | ACKNOWLEDGMENTS |
|---|
This work was funded by the National Institutes of Health, grant number RO1 CA095096.
| REFERENCES |
|---|
|
|
|---|
-
BOERSMA, L. J., VAN DEN BRINK, M., BRUCE, A. M., SHOUMAN, T., GRAS, L., TE VELDE, A. AND LEBESQUE, J. V. (1998). Estimation of the incidence of late bladder and rectum complications after high-dose (7085 GY) conformal radiotherapy for prostate cancer, using dosevolume histograms. International Journal of Radiation Oncology, Biology, Physics 41, 8392.[CrossRef][Web of Science][Medline]
CHAO, K., DEASY, J., MARKMAN, J., HAYNIE, J., PEREZ, C., PURDY, J. AND LOW, D. (2001). A prospective study of salivary function sparing in patients with head-and-neck cancers receiving intensity-modulated or three-dimensional radiation therapy: initial results. International Journal of Radiation Oncology, Biology, Physics 49, 907916.[CrossRef][Web of Science][Medline]
CHAPPELL, R., NONDAHL, D. M. AND FOWLER, J. (1995). Modelling does and local control in radiotherapy. Journal of the American Statistical Association 90, 829938.[CrossRef]
DAWSON, L. A., NORMOLLE, D., BALTER, J. M., MCGINN, C. J., LAWRENCE, T. S. AND TEN HAKEN, R. K. (2002). Analysis of radiation-induced liver disease using the lyman ntcp model. International Journal of Radiation Oncology, Biology, Physics 53, 810882.[CrossRef][Web of Science][Medline]
DEASY, J. O., NIEMIERKO, A., HERBERT, D., YAN, D., JACKSON, A., TEN HAKEN, R. K., LANGER, M. AND SAPARETO, S. (2002). Methodological issues in radiation dosevolume outcome analyses: summary of a joint aapm/nih workshop. Medical Physics 29, 21092127.[CrossRef][Web of Science][Medline]
EISBRUCH, A., TEN HAKEN, R. K., KIM, H. M., MARSH, L. H. AND SHIP, J. A. (1999). Dose, volume and function relationships in parotid salivary glands following conformal and intensity-modulated irradiation of head and neck cancer. International Journal of Radiation Oncology, Biology, Physics 45, 577587.[CrossRef][Web of Science][Medline]
EISBRUCH, A., KIM, H. M., TERRELL, J. E., MARSH, L. H., DAWSON, L. A. AND SHIP, J. A. (2001). Xerostomia and its predictors following parotid-sparing irradiation of head-and-neck cancer. International Journal of Radiation Oncology, Biology, Physics 50, 695704.[CrossRef][Web of Science][Medline]
GELFAND, A. E., DEY, D. K. AND CHANG, H. (1992). Model determination using predictive distributions with implementation via sampling-based methods (with Discussion). In Bernardo, J. M. and DeGroot, M. H. (eds), Bayesian Statistics 4. Oxford: Oxford University Press.
GELFAND, A. E. AND GHOSH, S. K. (1998). Model choice: a minimum posterior predictive loss approach. Biometrika 85, 111.
HEFT, M. W. AND BAUM, B. J. (1984). Unstimulated and stimulated parotid salivary flow rate in individuals of different ages. Journal of Dental Research 63, 11821185.
JACKSON, A., KUTCHER, G. J. AND YORKE, E. D. (1993). Probability of radiation induced complications for normal tissues with parallel architecture subject to non-uniform radiation. Medical Physics 20, 613625.[CrossRef][Web of Science][Medline]
JACKSON, A., KUTCHER, G. J. AND YORKE, E. D. (1995). Analysis of clinical complication data for radiation hepatic using a parallel architecture model. International Journal of Radiation Oncology, Biology, Physics 31, 883891.[CrossRef][Web of Science][Medline]
JOHNSON, V. E. (2004). A Bayesian chi-squared test for goodness of fit. Annals of Statistics 32, 23612384.[CrossRef]
KWA, S. L. S., LEBESQUE, J. V., THEUWS, J. C. M., MARKS, L. B., MUNLEY, M. T., BENTEL, G. B., OETZEL, D., SPAHN, U., GRAHAM, M. V., DRZYMALA, R. E. et al. (1998). Radiation pneumonitis as a function of mean dose lung dose: an analysis of pooled data of 540 patients. International Journal of Radiation Oncology, Biology, Physics 42, 19.[Web of Science][Medline]
LAMBERT, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34, 114.
LICHTER, A. S. (1991). Three-dimensional conformal radiation therapy: a testable hypothesis. International Journal of Radiation Oncology, Biology, Physics 21, 853855.[Web of Science][Medline]
LITTLE, R. J. A. AND RUBIN, D. B. (2002). Bayesian Data Analysis. New York: Wiley-Interscience.
LYMAN, J. T. (1985). Complication probability as assessed from dosevolume-histograms. Radiation Research 104, s13s19.[CrossRef]
NIEMIERKO, A. AND GOITEIN, M. (1992). Modeling of normal tissue response to radiation: the critical volume model. International Journal of Radiation Oncology, Biology, Physics 25, 135145.
R DEVELOPMENT CORE TEAM (2004). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
ROBERTS, S. AND HENDRY, J. (1993). A direct maximum-likelihood analysis of a collection of worldwide tumour-control data. Radiotherapy and Oncology 29, 6974.[CrossRef][Web of Science][Medline]
ROESINK, J., MOERLAND, M., BATTERMANN, J., JAN HORDIJK, G. AND TERHAARD, C. (2001). Quantitative dosevolume response analysis of changes in parotid gland function after radiotherapy in the head-and-neck region. International Journal of Radiation Oncology, Biology, Physics 51, 938946.[CrossRef][Web of Science][Medline]
SCHILSTRA, C. AND MEERTENS, H. (2001). Calculation of the uncertainty in complication probability for various does-response models. International Journal of Radiation Oncology, Biology, Physics 50, 147158.[CrossRef][Web of Science][Medline]
SCRIMGER, R. A., STAVREV, P., PARLIAMENT, M. B., FIELD, C., THOMPSON, H., STAVREVA, N. AND FALLONE, B. G. (2004). Phenomenologic model describing flow reduction for parotid gland irradiation with intensity-modulated radiotherapy: evidence of significant recovery effect. International Journal of Radiation Oncology, Biology, Physics, 178185.
SEPPENWOOLDE, Y., LEBESQUE, J. V., DE JAEGER, K., BELDERBOS, J. S. A., BOERSMA, L. J., SCHILSTRA, C., HENNING, G. T., HAYMAN, J. A., MARTEL, M. K. AND TEN HAKEN, R. K. (2003). Comparing different ntcp models that predict the incidence of radiation pneumonitis. International Journal of Radiation Oncology, Biology, Physics 55, 724735.[CrossRef][Web of Science][Medline]
SHULTHEISS, T. E. (2001). The controversies and pitfalls in modeling normal tissue radiation injury/damage. Seminars in Radiation Oncology 11, 210214.[CrossRef][Web of Science][Medline]
SHULTHEISS, T. E., ORTON, C. G. AND PECK, R. A. (1983). Models in radiotherapy: volume effects. Medical Physics 10, 410415.[CrossRef][Web of Science][Medline]
SPIEGELHALTER, D. J., BEST, N. G., CARLIN, B. P. AND VAN DER LINDE, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B 64, 583639.[CrossRef]
STAVREV, P. AND STAVREV, N. (2000). Fraction size and dose parameters related to the incidence of pericardial effusions: regarding Martel et al. International Journal of Radiation Oncology, Biology, Physics 48, 609617.
THEUWS, J. C., KWA, S. L., WAGENAAR, A. C., SEPPENWOOLDE, Y., BOERSMA, L. J., DAMEN, E. M., MULLER, S. H., BAAS, P. AND LEBESQUE, J. V. (1998). Prediction of overall pulmonary function loss in relation to the 3-d dose distribution for patients with breast cancer and malignant lymphoma. Radiotherapy and Oncology 49, 233243.[CrossRef][Web of Science][Medline]
VENABLES, W. N. AND RIPLEY, B. D. (2002). Modern Applied Statistics with S, 4th edition. New York: Springer.
WAHBA, G. (1983). Bayesian "confidence intervals" for the cross-validated smoothing spline. Journal of the Royal Statistical Society, Series B, Methodological 45, 133150.
YORK, E. D., KUTCHER, G. J., JACKSON, A. AND LING, C. C. (1993). Probability of radiation-induced complications in normal tissue with parallel architecture under conditions of uniform whole or partial organ irradiation. Radiotherapy and Oncology 26, 226237.[CrossRef][Web of Science][Medline]
Received November 17, 2004; revised May 9, 2005; accepted for publication May 19, 2005.
![]()
CiteULike
Connotea
Del.icio.us What's this?
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||













, observed data by
, missing data (means) by
, and latent variables (means) by x. All baseline data are observed and plotted at 1 to assist in visual clarity.




