Skip Navigation


Biostatistics Advance Access originally published online on August 28, 2006
Biostatistics 2007 8(2):438-452; doi:10.1093/biostatistics/kxl021
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow All Versions of this Article:
8/2/438    most recent
kxl021v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Disclaimer
Google Scholar
Right arrow Articles by Healy, B.
Right arrow Articles by Degruttola, V.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Healy, B.
Right arrow Articles by Degruttola, V.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Hidden Markov models for settings with interval-censored transition times and uncertain time origin: application to HIV genetic analyses

Brian Healy*

Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, MA 02115, USA bhealy{at}hsph.harvard.edu

Victor Degruttola

Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, MA 02115, USA

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
The simplicity and flexibility of Markov models make them appealing for investigations of the acquisition of HIV drug-resistance mutations, whose presence can define specific Markov states. Because the exact time of acquiring a mutation is not observed during clinical research studies on HIV infection, it is important that methods for fitting such models accommodate interval-censored transition times. Furthermore, many such studies include patients with extensive treatment experience prior to the onset of the studies. Therefore, the virus in these patients may have already acquired resistance mutations by study entry. Retrospective data regarding the time on treatment, which is often known or known with error, provide information about the acquisition rates before the start of a study. Finally, variability in the genetic sequences of circulating HIV creates uncertainty in the Markov states. This paper considers approaches to fitting Markov models to data with interval-censored transition times when the time origin and the Markov states are known with error. The methods were applied to AIDS Clinical Trial Group protocol 398, a randomized comparison of mono- versus dual-protease inhibitor use in heavily pretreated patients. We found that the estimated rates of acquiring certain classes of mutations depended on the presence of others, and that the precision of these estimates can be considerably improved by inclusion of retrospective data.

Keywords: Hidden Markov models; HIV; Interval censoring; Resistance mutations; Uncertain time origin


    1. INTRODUCTION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
Disease progression can sometimes be characterized as a series of transitions through clinically meaningful discrete states. This paper discusses the application of Markov models in settings where the transition times are not directly observed, but known only within an interval. An example of such a situation arises in studies of the acquisition of HIV drug-resistance mutations in patients receiving antiretroviral treatment. Plasma from patients may be genotyped throughout these studies so that the evolution of the virus can be observed, but the exact times of transition may not correspond to the times of measurement. Furthermore, previously treated patients have often acquired a considerable number of mutations before entry into such studies. The time at which patients began treatment may be known (perhaps with error), but the precise time at which the mutations were acquired is not observed. This paper also discusses methods of combining retrospective and prospective data to estimate parameters from Markov models used to characterize the genetic evolution of HIV under selective drug pressure.

1.1 Background

Albert (1962)Go described a Markov model that assumes an exponential hazard for each transition applicable in settings in which transition times are known. Foulkes and DeGruttola (2003)Go applied the model to characterize development (or loss) of HIV-resistance mutations that occur under drug pressure. The states of the Markov model correspond to the absence or presence of specific mutations in the viral genome. They also considered hidden Markov models (HMMs) to accommodate the fact that the exact state of the virus is not always known because of measurement error or genetic variability among the viral strains that circulate in the blood of patients. Foulkes and DeGruttola (2003)Go assumed that transitions occurred at times of measurement—an assumption that is only applicable when genotyping is done very frequently. In this paper, we consider approaches to fitting Markov model and HMM that take into account interval-censored transition times.

Turnbull (1976)Go used a self-consistency algorithm to make inferences on failure time distributions from interval-censored data. DeGruttola and Lagakos (1989)Go and Sternberg and Satten (1999)Go extended Turnbull's work to settings with multivariate responses. Satten and Sternberg (1999)Go developed a method that estimates parameters from a semi-Markov model where transition times are interval censored and the initiation time of the process is unknown. The approach treats the unknown initiation time as a nuisance parameter in the likelihood. Satten and Longini (1996)Go developed a procedure to estimate Markov model parameters that conditions on the initiation time in order to remove dependence on this time. These methods are designed for settings where there is no information regarding the initiation time.

We develop methods for fitting Markov model or HMM that apply to settings in which some information about the start of the process is available at the start of the study, even if it is known with error. The methods shown here extend ideas from the methods of Foulkes and DeGruttola (2003)Go and the methods of Satten and Sternberg (1999)Go to incorporate the extra retrospective information available at the start of a study.

1.2 Application

The clinical study that motivated our research was AIDS Clinical Trial Group (ACTG) 398 (Hammer and others, 2002Go), a randomized trial that compared the benefits of using single- versus dual-protease inhibitor (PI) regimens in combination therapy in treatment-experienced patients. All patients were placed on a four drug-class regimen of amprenavir, a PI; abacavir, a nucleoside analog reverse transcriptase inhibitor (NRTI); efavirenz, a non-nucleoside reverse transcriptase inhibitor (NNRTI); and adefovir dipivoxil, a nucleotide reverse transcriptase (RT) inhibitor. Patients were randomized to receive either an additional PI that they had not previously received or placebo. Prior to the study, all the patients but one had NRTI experience (AZT, 3TC, DDI, or D4T) either as monotherapy or in combination with other drugs, and roughly 50% of the patients had NNRTI experience (nevirapine or delavirdine) in addition. Start and stop dates of these treatments were recorded for each patient, but the accuracy of these data may have been lower than the prospectively collected data. One of the main results from the study was that prior exposure to NNRTIs was the most important predictor of success.

Beyond information on patient outcomes, this study provided a rich data set for investigating how the HIV genotype evolved under selective drug pressure. Here we use the term "evolution" not only in the strict biological sense but also to refer to observed changes in genotype that could be the result of detecting previously undetectable viruses under selective drug pressure. Since HIV could only be genotyped in patients with detectable virus, plasma samples were sequenced at baseline (by design, all patients entered the study with detectable virus), early in the study before most patients could have achieved full suppression (generally 8 weeks), at virological rebound, and possibly an additional time thereafter. Multiple amino acids were sometimes detected at specific codons, reflecting genetic diversity among viral particles in plasma, but patients who maintained full virological suppression could not be genotyped after they had achieved it.

Investigations of acquisition rates of drug-resistance mutations estimated from genotype data have mostly used only prospectively collected information, but the process of accumulating such mutations begins at initiation of treatment, not the start of a specific study. All the NRTIs to which patients were exposed before the start of ACTG 398 have resistance mutations in common with abacavir, specifically at codons 41, 65, 151, 184, 210, and 215 from the RT region of the viral genome. The NNRTIs used before entry into this study also have many of the same resistance mutations as efavirenz, specifically at codons 103, 106, 181, and 188 from the RT region. Therefore, treatments used prior to the start of ACTG 398 have similar resistance profiles (sets of mutations that confer resistance) to those used during the study. The question we address is how to use the treatment history to improve estimates of rates of acquisition of resistance mutations.

Our investigation focuses on determining the order and relative rates of acquisition of mutations among patients who have detectable virus during the study. Therefore, the rates of transition we estimate apply to the retrospective experience of all study patients and the prospective experience of patients with evidence of replicating virus throughout most or all of the study (70% of patients). While we could include the prospective experience of patients whose lack of detectable virus implies no occurrence of mutations, we cannot do so for the retrospective experience because of the study entry criteria. By including only viremic patients prospectively, we are able to make use of the retrospective information to answer the scientifically relevant questions. As our focus is the order in which muta- tions occur and the impact of mutations of a given type on the rate of acquisition of others, we need not include the experience of patients after they achieved full virological suppression in ACTG 398.

The statistical challenge we consider is developing methods for fitting Markov model and HMM that accommodate interval-censored transition times in settings where the time on treatment (TOT) before the study is known, possibly with error. In Section 2, we describe different approaches for using the treatment history in fitting Markov model and HMM. We discuss the identifiability of the parameters of the model in Section 3, and present the results of applying the method to the ACTG 398 data in Section 4. Section 5 provides discussion of the results.


    2. EXPANDING THE MARKOV MODEL
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
To review the Markov model proposed by Albert (1962)Go, we use the following notation: H patients are observed over time in one of Ns possible states. P(t) is an NsxNs matrix of transition probabilities, where the (i,j)th element is the transition probability from state i to j at time t. Under the assumptions of an exponential hazard and stationary transition probabilities (i.e. the transition probability only depends on the length of time in a state), the probabilities can be written as P(t) = exp(tQ), where Q is the matrix of infinitesimal generators. The parameters of interest are the elements of Q, which correspond to the rates of transition. When the transition times are known exactly, the likelihood for the transition parameters has the form given below:


Formula (2.1)

where P(Z0 = z0,h) is the probability that the initial state is z0,h for patient h, zx,h is the xth state for patient h, zyh,h is the final state for patient h, qzx,hz(x + 1),h is the appropriate element of the Q matrix, qzx,h = {sum}x!=x'qzx,hzx',h (by Theorem 2.1 from Albert, 1962Go), tzx,h is the time in the xth state, and Th is the total time on the study. We define d' to be a vector of the states and transition times for all patients. The maximum likelihood estimate (MLE) for each of the transition parameters is Formula, where Nij is the number of jumps from state i to state j and Ai is the total time spent in state i.

As an example, suppose we are interested in the rate at which patients acquire NRTI- and NNRTI-specific resistance mutations and the possible effect of one type of mutation on the acquisition rate of the other. If the transition times were known, the methods of Foulkes and DeGruttola (2003)Go could be applied to the data from ACTG 398 with the states given in Table 1 and the pathways shown in Figure 1. The arrows in Figure 1 are unidirectional because we only consider an ascending Markov model in this paper, and the codons in Table 1 were chosen because they corresponded to important resistance mutations for the drugs taken by patients before and during the study. If q12, the rate of acquisition of an NRTI mutation by a virus without resistance mutations, differed from q34, the rate of acquisition of an NRTI mutation by a virus that had already acquired an NNRTI mutation, then the presence of an NNRTI mutation is associated with an increase in the rate of acquiring an NRTI mutation.


View this table:
[in this window]
[in a new window]

 
Table 1. States of Markov model and HMM. The mutations for each state are listed in parentheses

 

Figure 1
View larger version (4K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Schematic of possible pathways.

 
Because times of transitions are known only within an interval for patients followed in ACTG 398, we need to extend the models described above. Furthermore, the TOT prior to entry into the study constitutes another interval in which transitions could have occurred. To make use of this interval, we must model the relationship between the transition parameters before and during the study. Model extensions to accommodate interval-censored transition times and retrospective treatment information are discussed in the following sections.

2.1 Interval-censored transition times

Like Albert (1962)Go, we assume a Markov model with exponentially distributed time intervals between transitions. Since the density of the transition times is modeled explicitly, we can use a convolution in the model of the second, third, and later transitions. For simplicity, the likelihood for the transition parameters, Q, is shown for a case with a maximum of two transitions, but the method can be extended to any number of transitions:


Formula (2.2)

where patient h transitions from state ih to jh in the interval (th(i,1),th(i,2)) and then to kh in the interval (th(j,1),th(j,2)), P(Z0 = ih) and the q‘s are as defined above, S is the sum of the transition times, and s1 is the first transition time. Note d is a vector of the states and transition intervals for all patients. The equation above has been left general so that any two transitions are possible, including reversions, although we do not consider reversions in this paper. Using the correct bounds of integration, it is possible for both transitions to have occurred in the same interval or for transitions to be right censored. As an example, we consider the states from Table 1. For a patient who entered the study in state 1 (no mutations at codons of interest), transitioned to state 2 (acquired an NRTI mutation) between 3 and 5 weeks, and then transitioned to state 4 (acquired an NNRTI mutation and now has both types of mutation) between 20 and 24 weeks, the bounds for the integrals in (2.2) are (3, 5) and (20,24).

2.2 Treatment history data

As described above, the initiation of the process of interest may occur before the start of observation. In our setting, the information available at the start of ACTG 398 reflected the process over an interval from treatment initiation to the start of the study. The transition probabilities can be estimated using this interval if the starting state is known. For ACTG 398, it is reasonable to assume that the starting state for all patients is state 1 because primary resistance mutations were rare when the patients initiated treatment.

Although the prospective and retrospective data can be used to estimate the transition parameters separately, it is more efficient to combine the two data sources, provided a model is available to link the two sets of parameters. In ACTG 398, the specific NRTIs and NNRTIs were changed at the start of the study, but the treatments used before and during the study induce similar resistance mutations. The simplest model assumes that transition rates do not change when the study begins, qFormula = qFormula, for all i and j (superscripts b and d correspond to before and during the study). Each equality can be tested using the likelihood ratio test to find the most parsimonious model. A more complex model of the link between the two sets of parameters could include covariates, which would extend the methods below to include ideas developed by Loecke and DeGruttola (thesis).

2.3 Known TOTs

For the remainder of this section, methods are developed under the assumption that all the transition rates before and during the study are equal. When the assumption is not met for a subgroup of the rates, a simple extension to the likelihoods given in this section is required to include the additional parameters. In the next two subsections, we consider the cases in which the TOTs before the study are known exactly and known with error, respectively.

When only one treatment is given before and during the study, the start of treatment is the start of the process and all transitions are possible from this time, leading to the likelihood below:


Formula (2.3)

where P(Z0 = ih) and the q‘s are defined as above, r1(t),...,rH(t) are the TOTs before the start of the study for each patient, and bh = (bh,1,bh,2,bh,3,bh,4) is a vector of the appropriate bounds of integration for patient h. For example, the bounds of integration for a patient who transitioned once before the study and did not transition on study are bh = (0,rh(t),th(j,1) + rh(t),{infty}), where th(j,1) is the time from the start of the study to the end of follow-up for patient h. Note qjhkh is replaced by qjh for this patient because the second transition does not occur. For a second example, if the patient from the example in Section 2.1 started treatment 10 weeks before the study, the bounds for this patient are bh = (13,15,30,34).

When treatments from two drug classes are administered at different times, all transitions are possible in some time intervals, but only a subset of transitions are possible in other intervals. As an example, suppose a patient in ACTG 398 started NRTIs 2 years before the study and then added NNRTIs 1 year before the study. If this patient was initially in state 1, the patient could only have transitioned to state 2 in the first year of treatment; but, assuming the patient did not transition in the first year, the patient could have transitioned to either state 2 or 3 in the second year. Note that when the number of possible transitions increases or decreases, the value of qi changes accordingly because it is the sum of qij for i!=j. For the patient above, q1 = q12 for the first year, and q1 = q12 + q13 for the second year.

The likelihood for the Markov model in which patients start with treatments from one class and then add treatments from another class for the remainder of the time before they enter the study is shown in (2.4). For this likelihood, we assume that only one transition is possible before the start of the second treatment.


Formula (2.4)

where I(m = 1) is an indicator function that equals 1 if the first transition occurred before the second treatment began and 0 otherwise, rh = (rh(t1),rh(t2)) is a vector of the times on each of treatments for patient h, r = (r1,...,rH), qih(1) is the transition parameter when only a subset of the transitions are possible, and qih(2) is the same parameter when all transitions are possible. The two terms in (2.4) are required because a patient who transitions before the start of the study may have transitioned before or after the start of the second treatment; therefore, both possibilities must be included in the likelihood. If the patient from the earlier example transitioned from state 1 to state 2 before the study, the patient could have transitioned either before (m = 1) or after the start of the NNRTIs. Conversely, if a patient never transitioned or transitioned only to state 3 before the study, only the first term is required (m = 0) because this patient must have remained in state 1 until the start of NNRTIs. The bounds of integration are found using the approach described above. This approach can be extended to three or more treatments with different initiation times.

An important further extension accommodates patients who discontinued a class of drugs prior to the start of the study. All patients enrolled in ACTG 398 were continuously exposed to NRTIs, but some discontinued NNRTIs. The likelihood becomes more complex in this case and is shown in the Appendix. Maximizing the likelihood in the Appendix determines the parameter estimates, and the inverse of the hessian matrix is the variance of the parameter estimates.

2.4 TOTs known with error

This section generalizes the approach of including the times on each treatment to fit the case in which the times are known with error. For ACTG 398, there are three time intervals of interest prior to the study: the time on NRTIs before starting NNRTIs, the time on both treatments, and the time on NRTIs after NNRTIs were discontinued before the start of the study. We assume that the time on NRTIs after discontinuation of NNRTIs is known exactly. This procedure allows error to be added to the total time on each treatment even though one component of the time of exposure is held constant. Therefore, the impact of error in the recorded times can be investigated without making the procedure overly computationally complex.

To determine the sensitivity of results to uncertainty in the TOTs, the reported times are coarsened by assuming each time is known within an interval that we specify. The intervals can also be defined by information from outside sources, such as dates at which drugs became available. Although other approaches can be used, the interval-censored approach accommodates both the uncertainty in the reported times and the dates from other sources.

We assume a flexible parametric distribution for the total time on each treatment to assure numerical stability. For continuous distributions, the use of the coarsened intervals is straightforward. For a multinomial distribution, which is used in our application, consider the following example. Assume that only one treatment is given before the study, and assume that the probability of starting the treatment during one of three possible intervals, ( – 35, – 25), ( – 25, – 15), and ( – 15, – 5) weeks where 0 is the time at the start of the study, is given by a multinomial distribution; we call these the TOT intervals. We consider three patients with the following coarsened intervals; ( – 35, – 5), ( – 10, – 5), ( – 30, – 20). Any of the TOT intervals partially covered by the patient's interval is considered admissible, so the admissible TOT intervals for these patients are (1,1,1), (0,0,1), and (1,1,0). In the following equations, if the TOT interval is not admissible for a patient, the probability of this interval is set to 0. This method can be generalized to multiple treatments, and a discussion of identifiability of parameters is given in Section 3.

For the remainder of this section, we consider a general discrete distribution for the TOTs; if a continuous distribution was used, the sums would be changed to integrals. Here we denote the nth mass point of the bivariate distribution of the times on the two treatments by r'n = (rn(t1),rn(t2)). When the parameters of the TOTs distribution are known, the likelihood is shown here:


Formula (2.5)

where dh is a vector of the states and transition intervals for patient h; uh is a vector of the left and right limits of the coarsened intervals for the first and second treatment for patient h; u = (u1,...,uH); r'' = (r1(t3),...,rH(t3)) is the time on NRTIs after discontinuation of NNRTIs; and L3 is the likelihood shown in the Appendix.

If the parameters of the TOTs distribution are unknown, an expectation maximization (EM) algorithm (Dempster and others, 1977Go) can be used to estimate all the parameters. The complete data include the exact TOTs (r), but only the intervals for the TOTs, (u), are observed. For the E step, the conditional probability that Rh = rFormula can be calculated as shown in (2.7). The expected conditional log likelihood, shown in (2.6), is maximized in the M step.


Formula (2.6)


Formula (2.7)

where G is a vector including the Markov parameters (Q) and time distribution parameters, and l indexes the values in the discrete distribution.

When the EM algorithm is used to estimate the parameters of the distribution, the result from Louis (1982)Go provides variance estimates for the parameters. The first term is the complete data information matrix, and the second term is the added uncertainty from the missing data. The expansion of the second term is given for a similar example in Loecke and DeGruttola (thesis).

2.5 Extensions to the HMM

In HIV genomic studies, as described above, the states of a Markov model can be defined by specific patterns of mutations. When measurement techniques allow detection of multiple clones or mixtures of amino acids at specific codons as in ACTG 398, multiple states may be observed at a single time point. Although these techniques provide additional information about the virus, the exact state of the virus at certain time points and the exact path (sequence of states) taken by the virus are no longer observed. A HMM includes the uncertainty of state and path into the likelihood by summing over the possible paths using the proper weighting. In this section, we extend the model with interval-censored transition times and TOTs known with error to HMMs.

When the TOTs are known, the HMM is an extension of the likelihood in the Appendix. Using an EM algorithm for estimation, the complete data consist of d, r, and each patient's path. The expected value of the conditional log likelihood and the probability of path v for patient h are given by


Formula (2.8)


Formula (2.9)

where vh is the set of possible paths for patient h, v = (v1,...,vH), and w indexes the possible paths. Iterating between maximizing the conditional log likelihood and calculating the probabilities of the paths until convergence provide the MLE for the transition parameters. We again apply the result from Louis (1982)Go to find the variance of the estimates for the HMM.

Finally, the HMM can be combined with the method presented above in which the TOTs are known with error. The expected value of the conditional log likelihood is given in (2.10), which combines the ideas of (2.6) and (2.8):


Formula (2.10)

where pFormula(V = v|Qm,dh,rFormula,vh,rh(t3)) is given by (2.9). The pFormula(Rh = rFormula|Gm,dh,uh,rh(t3)) term is similar to the probability of the TOTs term from (2.7), except that it is the marginal probability over path. Again, the result from Louis (1982)Go is used to determine the variance of the estimates.


    3. IDENTIFIABILITY OF PARAMETERS
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
The parameter set of the model, G, includes the TOT parameters and the Markov transition parameters. The time and transition parameters are identifiable when either both sets are identifiable independently or there is additional information from the transitions to identify the time parameters or vice versa. When a multinomial distribution is used for the TOTs, as in Section 4, a sufficient condition for identifiability is that all possible patterns of times and all possible transition paths have positive probability. This ensures that with a large enough sample, each parameter set is identifiable independently. Although this condition is sufficient, it is not necessary; both sets of parameters can be identified under less strict conditions, though the conditions are more difficult to characterize.

We provide an example here. Assume that we are interested in the patients described in the example from Section 2.4, all patients began in state 1, and all patients are observed at the start of the study (time 0). The data consist of a vector of three indicators for which the three time points (time – 3, time – 2, time – 1) are admissible, followed by the state of each patients at the start of the study: (1,1,1;1), (0,0,1;1), and (1,1,0;2). An illustration of this is provided in Figure 2. The time parameters are not identifiable alone because all patients have the same values for the time – 3 and time – 2. Conversely, when the transition information is included, all parameters are identifiable. The maximum likelihood places mass on time – 1, which is known for patient 2 who did not transition and plausible for patient 1 who also failed to transition. Maximum likelihood also places mass on time – 3, doing so allows extra time for patient 3 to transition. Similar arguments can be made to show how adding information regarding the TOT to the likelihood can make the overall likelihood identifiable even if the parameters of the HMM alone are not.


Figure 2
View larger version (4K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 2. Circles denote admissible time points, dark circles for patients who did not transition and light circles for patients who transitioned.

 
Since there is no simple rule for when we have identifiability, inspection of the likelihood is required. First, checking that the second derivative matrix is negative definite ensures that a local maximum has been reached. Second, if a global maximum has been reached, all starting values must arrive at the same maximum. In practice, we can investigate whether a local maximum is also the global maximum by considering a wide range of starting values for the model.


    4. APPLICATION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
Our investigation focuses on the development of resistance mutations at NRTI- and NNRTI-specific codons and possible interactions between these types of mutations. To investigate the relative rates of acquisition of these mutations, several different models with four possible states (Table 1) and two possible paths (Figure 1) were applied to the data from ACTG 398, described in Section 1. We assume that each patient was in state 1 at the beginning of treatment because primary resistance mutations were rare when these patients initiated their first regimens. We also assume a strictly ascending Markov model because we do not expect true reversions while patients are under drug pressure.

The interval-censored Markov model and HMM were first applied to the prospective data of patients with replicating virus; Table 2 provides the results. The estimates of the transition rates from the HMM were lower than the estimates from the Markov model in part because of the method of handling mixtures of amino acids at single codons in the two models. Under the Markov model, a genotype was classified as having a mutation at a specific codon as long as the mixture included a mutant amino acid. Under the HMM, a genotype was considered wild type and mutant at a specific codon if both types of amino acid were present. By allowing the possibility of different paths with fewer transitions, the HMM led to lower transition rate estimates.


View this table:
[in this window]
[in a new window]

 
Table 2. Transition parameter estimates using prospective data only. Units are per year

 
We view all the following results as exploratory and therefore do not adjust for multiple comparisons. The first result from both models was that the acquisition rates for an NRTI mutation were lower than the rates for an NNRTI mutation; all comparisons of relevant parameters were highly significant (p < 0.003), except the comparison of q34 and q13 which was less significant (p = 0.023). The second result was that the acquisition rate for an NNRTI mutation increased when a patient had already acquired an NRTI mutation (q13 < q24, p < 0.001). Although the point estimate of the acquisition rate for an NRTI mutation increased when a patient had already acquired an NNRTI mutation (q12 < q34), the difference was not significant (p = 0.28) because of the large variance in the estimates. The transition from state 1 to state 2 occurred rarely on the study because a large percentage of patients had acquired such a mutation prior to the study.

There was a considerable amount of retrospective information to help in estimation of each transition rate because there were 13, 41, 22, and 122 NNRTI exposed patients in states 1, 2, 3, and 4, respectively, at baseline. The similarities between the NRTI and NNRTI drugs taken before and during the study suggested an initial assumption that qFormula = qFormula for all i,j, and formal testing of each was undertaken to find the most parsimonious model. Likelihood ratio tests were used to compare several nested models under the assumption of known state and TOTs (Table 3). Based on these tests, the most appropriate model had five transition parameters, q12, q13, qFormula, qFormula, and q34. The basic form of the resulting likelihood is shown in the Appendix. This parameter set also provided the best model when the state membership was considered known with error.


View this table:
[in this window]
[in a new window]

 
Table 3. The maximized log likelihoods for the models of interest

 
4.1 Markov model with TOTs considered known

For the Markov model with known TOTs and state membership, maximization of the likelihood yielded the results in the first column of Table 4. As described in Section 1, all patients contributed retrospective data and 70% of patients—those who did not rapidly become and then remain virologically suppressed during the study—contributed prospective data for this and the subsequent analyses. The results using the combined data were consistent with the results using the prospective data alone. Specifically, the acquisition rate of an NRTI mutation (q12,q34) was lower than that for an NNRTI mutation (q13,qFormula,qFormula), p < 0.002 for all pairwise comparisons; and the acquisition rate of an NNRTI mutation significantly increased when patients had already acquired an NRTI mutation (qFormula and qFormula) compared to patients without such mutations (q13), p < 0.005 for each comparison. Interestingly, among patients with an NRTI mutation, the acquisition rate of an NNRTI mutation was higher in patients who acquired the mutation on study (qFormula < qFormula,p = 0.008). The difference between the point estimate of q12 and q34 was much smaller than in Table 2 and remained nonsignificant (p = 0.75), showing that a NNRTI mutation did not have an effect on the acquisition of an NRTI mutation.


View this table:
[in this window]
[in a new window]

 
Table 4. Transition parameter estimates using treatment history and prospective data. q24b is the transition parameter before the study and q24d is during the study. Units are per year

 
The major difference between the first column of Table 2 and the first column of Table 4 was the tightened confidence intervals for each of the acquisition rate estimates. The acquisition rate of an NRTI mutation by a virus in state 1 (q12) changed most drastically because many patients acquired one of these mutations before the study, but few did so on study.

4.2 Markov model with TOTs considered known with error

We next assumed that state membership was measured exactly, but the TOTs were considered known with error. For the NNRTI-experienced patients, the times on NRTIs and NNRTIs were discretized and assumed to be distributed as jointly multinomial. The time prior to the study was divided into three intervals of 2 years and a fourth interval of roughly 6 years for the NRTI treatments and three intervals of 34 weeks and a fourth interval of roughly 150 weeks for the NNRTIs in order to have a similar number of patients in each interval. The mass was placed at the midpoint of the first three intervals and at 119 and 412 weeks, respectively, for the fourth interval because the midpoint of the fourth interval was skewed by extreme values. By noting that NNRTI treatment should not begin before NRTI treatment, two of the 16 parameters of the multinomial distribution were immediately set to zero. For the NNRTI naive group, the distribution of the time on NRTIs was assumed to be the marginal distribution and the time on NNRTIs was set to be exactly equal to 0.

To reflect the uncertainty in the TOTs, we placed an interval of 2 years around the recorded NRTI time and 34 weeks around the recorded NNRTI time if the NNRTI time was greater than 0. These intervals were quite large, and therefore reflected considerable pessimism regarding the quality of the data on treatment history. All the intervals of the multinomial distribution covered by the patient's interval were considered allowable. This procedure led to some total times that were impossible since we considered the time on NRTIs after discontinuation of NNRTIs to be known exactly. To remedy this problem, we used the time on NNRTIs from the multinomial and assigned the total time on NRTIs to be the maximum of the time on NRTIs from the multinomial and the sum of the time on NNRTIs and the time on NRTIs after discontinuation of NNRTIs.

In our example, we estimated the transition and multinomial parameters, but the multinomial estimates are not presented because these were nuisance parameters. The second column of Table 4 displays the Markov parameter estimates with the appropriate confidence intervals. As expected, the estimate of qFormula was the same as when the TOTs were considered known. A very small change in the remaining estimates was observed. As we explain in Section 5, our approach to handling the TOTs has a small, but systematic tendency to place more mass on the 1->3->4 pathway. The comparisons between the parameter estimates yielded similar results to those found when the TOTs were considered known, except that the difference between q13 and qFormula was only marginally significant (p = 0.056). Furthermore, the estimates had much tighter confidence intervals compared to the estimates from Table 2. Even though the TOTs were considered to be known with error, the estimates including these data were more precise than the estimates using the prospective data alone.

4.3 Hidden Markov Model

Finally, we explored the case in which the state membership may not be known exactly using an ascending HMM. The third and fourth columns of Table 4 show the results from the HMM when the TOTs were considered known and were considerd known with error, respectively. Each estimate was less than or equal to the corresponding standard Markov model estimate, reflecting the way in which the HMM accommodated mixtures. The significant differences described above were found again in these models. As before with the standard Markov model, the estimates in Table 4 had tighter confidence intervals compared to the HMM in Table 2, even when the TOTs were considered known with error. As in the standard Markov model, the rates along the 1->3->4 pathway increased when the TOTs were considered known with error.

The results across the different methods showed a fairly high degree of consistency. Acquisition rates for an NNRTI mutation were higher than those for an NRTI mutation regardless of the presence of mutations of the other class. The acquisition rate of an NNRTI mutation increased in the presence of an NRTI mutation, and this effect was even greater when the NNRTI mutation developed during the study. There was no significant effect of an NNRTI mutation on the acquisition of an NRTI mutation, and this lack of effect was demonstrated more persuasively after the inclusion of the retrospective data. Finally, inclusion of the retrospective data enhanced the precision of the parameter estimates even when the retrospective data were considered known with error.

A simulation study using data comparable to ACTG 398 demonstrated that the Markov model yielded unbiased estimates when the state membership was known exactly. Furthermore, when the state membership was known with error, estimates from the Markov model were biased if state membership was treated as known, but estimates from the HMM were unbiased. The results of the simulation study are provided in Section A of the supplementary material (available at Biostatistics online http://www.biostatistics.oxfordjournals.org.ezp1.harvard.edu).


    5. DISCUSSION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
Proper application of Markov models to estimate the acquisition rate of viral mutations must account for the complex nature of genetic data, especially in treatment-experienced patients. Our methods address two of the possible extensions to the works of Albert (1962)Go and Foulkes and DeGruttola (2003)Go, interval-censored times of mutation and inclusion of retrospective data.

From prospective data alone, limited information may be available to estimate some of the parameters. For example, when many patients start a study after at least one state transition, only a small number of transitions from the original state can be observed during the study. Incorporation of retrospective data provides additional information related to these and other transitions. Retrospective information is also virtually "free" in the sense that such information is routinely collected at entry into studies; the only cost is the assumptions required, while the benefit is increased precision.

The decision about when and how to use retrospective data to augment the prospective data depends not only on knowledge of the treatment conditions before and during the study but also on the number of patients in each of the starting states. ACTG 398 observed each patient's state at study entry, but we needed to assume that each patient was in state 1 at treatment initiation. Therefore, no direct information was available from the retrospective data to estimate q24 and q34 because no patients could be observed to move from state 2 or 3 to state 4 before the start of the study. However, a reasonable number of patients were in states 2, 3, and 4 at the start of the study, so it appears likely that each pathway to state 4 was taken. The ascending Markov model allowed us to use the number of patients in each state to make inference about all transition parameters. We performed additional analyses using the retrospective data for observed transitions only (not presented) and found that the results were consistent with those from Section 4.

Coarsening the TOTs allowed us to determine the robustness of our conclusions to uncertainty in the retrospective data. Our method of coarsening was designed to reflect uncertainty in the TOTs while inducing only a small degree of bias. When the discrete time points were not admissible, the time on NRTIs was assigned to exactly equal the sum of the time on NNRTIs and the time on NRTIs after discontinuation of NNRTIs. This reduced the chance of getting an NRTI mutation before an NNRTI mutation. Although coarsening did have some impact on the parameter estimates, the effect was small, particularly for the transitions from state 1, and the qualitative interpretation of the results remained unchanged. Our general conclusions appear to be robust to considerable uncertainty in the pre-study time of exposure to treatment.

To check the Markov assumption during the study, the time to the next transition was compared in patients who were in state 3 for two consecutive measurements to the time in patients who transitioned from state 1 to state 3 using the log rank test. There was no difference between the two groups (p = 0.37), providing some evidence in favor of the Markov assumption because the time to the next transition was not affected by the history. At the same time, a relatively small numbers of patients contributed to this analysis (although greater than if the transitions from state 2 were used), so the power was only modest.

Finally, our analysis showed that the rate of acquisition of an NNRTI mutation among patients who had an NRTI mutation was greater during the study than prior to the study. As mentioned above, patients were exposed to nevirapine or delavirdine before the study and efavirenz for the duration of the study. Several studies, Van Leth and others (2004)Go and Nunez and others (2002)Go, have shown nevirapine to be similar to efavirenz regarding both virologic failure rates and resistance mutations so the change in the effect of an NRTI mutation on the acquisition of an NNRTI mutation was surprising. A possible explanation for the observed difference is that the patients on the study were selected by the ability to survive extensive antiretroviral experience; those who quickly developed resistance may have been at higher risk of death. Another explanation is that some of the acquisitions of new NNRTI mutations during the study were actually the reemergence of archived mutant species. When patients discontinue treatment, often the virus reverts to the wild-type virus, which is more fit in the absence of drug pressure, but the undetectable mutant virus remains in the body (Hance and others, 2001Go). Since some patients discontinued NNRTIs before ACTG 398, the increase in qFormula compared to qFormula may have been due to the reemergence of archived mutant species, rather than the acquisition of new mutations. This and other possible biases underscore the importance of testing whether the rates are the same before and during the study prior to combining the data.


    APPENDIX
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 
Given below is the expansion of the Markov likelihood for the case with one treatment continuously given and a second treatment given after the first, but possibly discontinued before the study. There are 11 total terms in the likelihood for the possible sequences of states and times, although only three of the terms are shown here. The first term (p = 1) corresponds to no transitions before or during the study. The second term (p = 2) corresponds to a transition from state 1 to state 2 before the start of the study and no further transitions. Since it is uncertain if the patient transitioned before, while, or after the second treatment was given, there are three terms required for this path. The final term (p = 11) is for patients who transition from state 1 to state 3 and then to state 4 during the study.


Formula (A.1)

where rh(t1) is the time on NRTIs, rh(t2) is the time on NNRTIs, rh(t3) is the time on NRTIs after discontinuation of NNRTIs before the start of the study, t‘s are the relevant longitudinal times, and qij are the relevant transition parameters.


    ACKNOWLEDGMENTS
 
Brian Healy is supported by AIDS training grant (5 T32 AI007358-17) and Victor DeGruttola is supported by National Institute of Allergy and Infectious Disease R01-51164. We would also like to acknowledge Chengcheng Hu and the reviewers for their helpful comments. Conflict of Interest: None declared.


    REFERENCES
 TOP
 SUMMARY
 1. INTRODUCTION
 2. EXPANDING THE MARKOV...
 3. IDENTIFIABILITY OF PARAMETERS
 4. APPLICATION
 5. DISCUSSION
 APPENDIX
 REFERENCES
 

    Albert A. (1962) Estimating the infinitesimal generator as a continuous time, finite state Markov process. Annals of Mathematical Statistics 33:727–53.[Web of Science]

    Degruttola V and Lagakos SW. (1989) Analysis of doubly-censored survival data, with applications to AIDS. Biometrics 45:1–11.[CrossRef][Web of Science][Medline]

    Dempster AP, Laird NM, Rubin DB. (1977) Maximum estimation from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B 39:1–38.

    Foulkes AF and Degruttola V. (2003) Characterizing the progression of viral mutations over time. JASA Applications and Case Studies 98:859–67.

    Hammer S, Vaida F, Bennett KK, Holohan MK, Sheiner L, Eron JJ, Wheat LJ, Mitsuyasu RT, Gulick RM, Valentine FT, others FT. (2002) Dual vs. single protease inhibitor therapy following antiretroviral treatment failure: a randomized trial. Journal of the American Medical Association 288:169–80.[Abstract/Free Full Text]

    Hance AJ, Lemiale V, Izopet J, Lecossier D, Joly V, Massip P, Mammano F, Descamps D, Brun-vezinet F, Clavel F. (2001) Changes in human immunodeficiency virus type 1 populations after treatment interruption in patients failing antiretroviral therapy. Journal of Virology 75:6410–7.[Abstract/Free Full Text]

    Louis TA. (1982) Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B 44:226–33.

    Nunez M, Soriano V, Martin-Carbonero L, Barrios A, Barreiro P, Blanco F, Garcia-Benayas T, Gonzalez-Lahoz J. (2002) SENC (Spanish efavirenz vs. nevirapine comparison) trial: a randomized, open-label study in HIV-infected naive individuals. HIV Clinical Trials 3:186–94.[CrossRef][Medline]

    Satten GA and Longini IM. (1996) Markov chains with measurement error: estimating the "true" course of a marker of the progression of human immunodeficiency virus disease. Applied Statistics 45:275–309.[CrossRef][Web of Science]

    Satten GA and Sternberg MR. (1999) Fitting semi-Markov models to interval-censored data with unknown initiation times. Biometrics 55:507–13.[CrossRef][Web of Science][Medline]

    Sternberg MR and Satten GA. (1999) Discrete-time nonparametric estimation for semi-Markov models of chain-of-events data subject to interval censoring and truncation. Biometrics 55:514–22.[CrossRef][Web of Science][Medline]

    Turnbull B. (1976) The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society, Series B 38:290–5.

    Van Leth F, Phanuphak P, Ruxrungtham K, Baraldi E, Miller S, Gazzard B, Cahn P, Lalloo UG, Van Der Westhuizen IP, Malan DR, et al. (2004) Comparison of first-line antiretroviral therapy with regimens including nevirapine, efavirenz, or both drugs, plus stavudine and lamivudine: a randomised open-label trial, the 2NN Study. Lancet 363:1253–63.[CrossRef][Web of Science][Medline]

    Received November 28, 2005; revised April 18, 2006; revised August 3, 2006; accepted for publication August 21, 2006.


    Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



    This Article
    Right arrow Abstract Freely available
    Right arrow FREE Full Text (PDF) Freely available
    Right arrow All Versions of this Article:
    8/2/438    most recent
    kxl021v1
    Right arrow Alert me when this article is cited
    Right arrow Alert me if a correction is posted
    Services
    Right arrow Email this article to a friend
    Right arrow Similar articles in this journal
    Right arrow Similar articles in PubMed
    Right arrow Alert me to new issues of the journal
    Right arrow Add to My Personal Archive
    Right arrow Download to citation manager
    Right arrowRequest Permissions
    Right arrow Disclaimer
    Google Scholar
    Right arrow Articles by Healy, B.
    Right arrow Articles by Degruttola, V.
    Right arrow Search for Related Content
    PubMed
    Right arrow PubMed Citation
    Right arrow Articles by Healy, B.
    Right arrow Articles by Degruttola, V.
    Social Bookmarking
     Add to CiteULike   Add to Connotea   Add to Del.icio.us  
    What's this?