

Biostatistics Advance Access originally published online on October 9, 2006
Biostatistics 2007 8(3):595-608; doi:10.1093/biostatistics/kxl031

© 2006 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Screening designs for drug development

David Rossell

Department of Biostatistics & Applied Mathematics, The University of Texas, M. D. Anderson Cancer Center, Houston, TX 77030, USA and Department of Statistics, Rice University, Houston, TX 77005, USA

Peter Müller* and Gary L. Rosner

Department of Statistics, Rice University, Houston, TX 77005, USA pmueller{at}mdanderson.org

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. DRUG SCREENING
 3. EXPECTED UTILITY MAXIMIZATION...
 4. SIMULATION EXAMPLE
 5. UNCERTAINTY AND SENSITIVITY...
 6. A CLINICAL IMMUNOLOGY...
 7. DISCUSSION
 REFERENCES
 
We propose drug screening designs based on a Bayesian decision-theoretic approach. The discussion is motivated by screening designs for phase II studies. The proposed screening designs allow consideration of multiple treatments simultaneously. In each period, new treatments can arise and currently considered treatments can be dropped. Once a treatment is removed from the phase II screening trial, a terminal decision is made about abandoning the treatment or recommending it for a future confirmatory phase III study. The decision about dropping treatments from the active set is a sequential stopping decision. We propose a solution based on decision boundaries in the space of marginal posterior moments for the unknown parameter of interest that relates to each treatment. We present a Monte Carlo simulation algorithm to implement the proposed approach. We provide an implementation of the proposed method as an easy-to-use R library available for public domain download (http://www.stat.rice.edu/~rusi/ or http://odin.mdacc.tmc.edu/~pm/).

Keywords: backward induction; Bayesian optimal design; clinical trial design; forward simulation; utility function


    1. INTRODUCTION
We develop a Bayesian decision-theoretic approach to screening designs for drug development. The proposed process is appropriate for a sequence of phase II studies targeting the same disease area, carried out at the same institution, competing for the same pool of potentially eligible patients, and subject to common resource constraints. For example, at large institutions dedicated to clinical research in cancer, such as the University of Texas, M. D. Anderson Cancer Center, a large number of new agents or new combinations of anticancer agents undergo evaluation for activity. The process is typically carried out through separate phase II studies with only informal learning between studies, even if the studies draw patients with similar disease characteristics. We develop an approach that considers such a sequence of studies as one large encompassing screening design and borrows information between studies. An easy-to-use implementation, as an R library, allows interested readers to implement the proposed algorithm with minimal effort.

Most screening designs for culling active therapies from many new agents that are in development consider each study in isolation, even though investigators recognize the need for reproducibility of results (Simon, 1987). After several similar phase II studies have appeared, one is left to combine the information informally and arrive at a decision whether to move ahead with the treatment or not. The question of how many repeat studies to complete is also left informal. In particular, one intuitively would think that the number of replicate phase II studies might depend on the strength of evidence already available concerning the activity of the new agent. Currently, however, decision making does not incorporate such quantitative information in a formal way.

Yao and others (1996) proposed a formal way to screen multiple agents for activity in a series of phase II vaccine trials. For each treatment being considered, a single-arm clinical study is carried out. The decision concerns choosing the sample size for each phase II study and a threshold to minimize the overall expected sample size (or time) needed until an active agent is identified. The decision problem is discussed in the frequentist paradigm, in which the type I and type II error probabilities are prespecified and preserved over the sequence of experiments. The formal setup in Yao and others (1996) considers one treatment at a time and assumes independent binary outcomes. In later work, Yao and Venkatraman (1998), Wang and Leung (1998), and Leung and Wang (2001) consider a variety of extensions leading to 2-stage designs and fully sequential designs in the same setup. Strauss and Simon (1995) consider a generalization based on 2-armed randomized trials for each new treatment. One arm is the new treatment, and the other arm is the best treatment found so far. At the end of this sequence of randomized studies, one chooses the "winner" that will be compared to a standard regimen in a randomized comparative trial. Stout and Hardwick (2005) discuss the above-mentioned approaches as special cases of a more general setup.

In this paper, we build on these methods to develop a sequential decision-theoretic design for drug screening. We introduce two important directions of generalization. First, we allow for multiple treatments to be considered at any given time. New treatments can arise, and existing ones can be dropped at any time if the current evidence suggests that it is optimal to abandon further development of them or that it is optimal to move them to phase III. Second, we cast drug screening as a decision problem. Using a simulation-based solution allows us to consider essentially arbitrarily complex utility functions and probability models. Also, the proposed approach includes the possibility to restrict the action space, for example by considering only designs with certain type I and type II error probabilities.

We propose a probability model that allows borrowing information between treatments, which is appropriate when treatments target the same disease and are likely to be based on similar mechanisms. We consider a utility function that includes terms related to sampling cost and a final payoff that is realized if the future phase III trial shows a statistically significant improvement over the standard of care. The use of a utility function that is based on the sampling cost and the final payoff means that we focus on the perspective of the drug developer or the investigator who is carrying out the trial. We propose to accommodate the interests of regulators and patients by restricting consideration to rules that satisfy constraints on type I and type II error probabilities. For comparison, we also consider a utility of the form proposed by Yao and others (1996), who seek to minimize the number of patients before the first treatment is recommended for phase III. The decision criterion for the screening trial is the expected utility, appropriately marginalizing with respect to the unknown true success probability and the future outcomes in the phase III study. For an extensive discussion of utility functions for clinical trials, see Gittins and Pezeshk (2002).

In Section 2, we formally state the drug screening process as a decision problem by defining a probability model, an action space, and a utility function that serves as the decision criterion. In Section 3, we discuss a simulation-based approach for solving the decision problem. In Section 4, we show results for a simulated example. In Section 5, we assess the uncertainty and robustness of these results. In Section 6, we compare our approach with that of Yao and Venkatraman (1998) in a clinical immunology problem. Finally, we conclude in Section 7 with a discussion of features and limitations of the proposed approach.


    2. DRUG SCREENING
Our approach is based on casting the screening process as a formal decision problem. The basic ingredients of a decision-theoretic setup are an action space $A$ of possible decisions $d \in A$, a probability model $p(\theta, y)$ for all relevant random variables, including parameters $\theta$ and future data $y$, and a utility function $u(d, \theta, y)$. The probability model is conveniently factored into a prior probability model $p(\theta)$ and a sampling model $p(y \mid \theta)$. It can be argued (DeGroot, 2004) that a rational decision maker should choose an action in $A$ to maximize the expectation of $u$. The expectation is with respect to $p$, conditioning on all data observed at the time of decision making, and marginalizing over all parameters and all future data. Sometimes, the action space is restricted to decisions that satisfy certain constraints, for example prespecified bounds on type I and type II errors (false-positive and false-negative rates). In such cases, the maximization is carried out over the restricted set.

2.1. Action space and probability model

Let $y_{ti}$ be the outcome at time $t = 1, \ldots, T$ for treatment $i \in A_t$, where $A_t$ is the set of treatments being considered at time $t$. We assume a finite time horizon $T$ for the entire screening process, and we allow for a random number of treatments at any given time $t$.

After observing the outcomes $y_{ti}$, $i \in A_t$, we make a sequential stopping decision $d_{ti}$ for each treatment. We denote with $d_{ti} = 0$ the action of removing treatment $i$ from $A_t$ and with $d_{ti} = 1$ the action of continuing recruitment for treatment $i$. If we decide $d_{ti} = 0$, then a terminal second-step decision $a_i$ indicates whether to abandon treatment $i$ ($a_i = 0$) or to recommend proceeding with a confirmatory phase III study ($a_i = 1$).

Finally, before the next decision at time $t + 1$, new treatments might be proposed and added to the set $A_{t+1}$. Let $\Delta n_t$ denote the number of new treatments arising in period $t$, and denote with $\pi_j = \Pr(\Delta n_t = j)$, for $j = 0, 1, \ldots$, its probability distribution. In the last period, $T$, continuation is not possible. That is, $d_{Ti} = 0$ for all $i \in A_T$. Figure 1 illustrates the sequence of decisions and observations.


Fig. 1. Multiple binomial experiments are available at time t. Some of them are dropped, and some are introduced at the end of period t. 1

 
The formal definition of the decision problem requires a probability model for all involved random variables. We assume binomial sampling. We make this assumption mainly for ease of exposition. With minor modifications, the proposed approach can be adapted to other sampling models. Thus, without major loss of generality, we assume

$$y_{ti} \mid \theta_i \sim \text{Bin}(N_{ti}, \theta_i), \qquad (2.1)$$

with known $N_{ti}$. In particular, accrual rates can vary across treatments. The unknown success probabilities arise from a common prior distribution, possibly involving a regression on treatment-specific covariates. We use a Beta prior, $\theta_i \sim \text{Be}(u, v)$, with random hyperparameters $(u, v)$ that allow borrowing of information between treatments. As prior distribution on these hyperparameters, we assume Gamma distributions, subject to a bound on $u + v$,

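$$u \sim \text{Ga}(A_u, B_u), \quad v \sim \text{Ga}(A_v, B_v), \quad \text{subject to } u + v \le 10. \qquad (2.2)$$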

The restriction limits the extent of borrowing of strength across treatments. That is, no matter how many treatments and patients we have observed, the data will never provide more information about a new treatment than the equivalent of 10 patients. The choice of 10 is arbitrary. Any alternative bound, or no bound, could be used without any change in the following discussion. In the context of phase II trials with typically small sample sizes, we consider 10 to be a reasonable choice.
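The hierarchical model can be illustrated with a short simulation. The following is a minimal sketch, not the authors' R implementation: it assumes the second Gamma parameter is a rate, handles the bound on $u + v$ by rejection sampling, and uses hypothetical function names.

```python
import random


def draw_hyperparameters(rng, a_u=3.0, b_u=1.0, a_v=3.0, b_v=1.0, bound=10.0):
    """Draw (u, v) from independent Gamma priors, rejecting draws with u + v > bound."""
    while True:
        # random.gammavariate takes (shape, scale); B is assumed to be a rate, so scale = 1 / B.
        u = rng.gammavariate(a_u, 1.0 / b_u)
        v = rng.gammavariate(a_v, 1.0 / b_v)
        if u + v <= bound:
            return u, v


def draw_treatment(rng, u, v, n_patients=2):
    """Draw theta ~ Be(u, v) and one binomial cohort outcome y ~ Bin(n_patients, theta)."""
    theta = rng.betavariate(u, v)
    y = sum(rng.random() < theta for _ in range(n_patients))
    return theta, y


rng = random.Random(1)
u, v = draw_hyperparameters(rng)
theta, y = draw_treatment(rng, u, v)
```

Because $u + v$ acts as a prior sample size for each new $\theta_i$, the rejection step is what caps the borrowed information at the equivalent of 10 patients.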

Finally, we include a bound $N^\star$ for the number of eligible patients who can be recruited for enrollment at time $t$. Setting $N^\star = \infty$ defines the problem without recruitment limits. We assume without loss of generality that $N^\star$ remains the same across $t$. When a new treatment arises and no patients are available, data collection for the new treatment has to wait until one of the existing treatments is dropped. We do not consider adaptive allocation to treatments.

2.2. Utility function

Let $n_T$ be the overall number of treatments considered in the screening process, and let $d \equiv (d_{ti},\ t = 1, 2, 3, \ldots, T;\ i \in A_t)$ and $a \equiv (a_i,\ i = 1, \ldots, n_T)$ denote the sequence of decisions. Recall that $d_{ti}$ denotes the stopping decision and $a_i$ denotes the terminal decision upon stopping enrollment for treatment $i$. Let $y = (y_{ti},\ t = 1, 2, \ldots, T,\ i \in A_t)$, and let $\theta = (\theta_i;\ i = 1, \ldots, n_T)$ denote the parameters of the sampling model for $y$.

The utility function $u(d, a, \theta, y)$ formalizes preferences across possible outcomes corresponding to assumed responses $y$, parameters $\theta$, and decisions $(d, a)$; that is, it reports the value of a hypothetical realization $(y, \theta, a, d)$ of the entire trial. An important advantage of the proposed simulation-based solution is that we are free to specify a utility function that reflects the scientific problem, without being restricted to analytically convenient forms.

In our implementation, we use a utility function that includes sampling cost plus a payoff for every treatment that is recommended for phase III and is approved at the end of a future confirmatory phase III study that compares the experimental therapy versus the standard of care. The payoff is weighted by the size of the advantage over the standard of care. Regulatory approval is formalized as a statistically significant treatment effect at the conclusion of the confirmatory trial. We build a utility function for the entire process in steps, leading eventually to the utility function stated in (2.3).

First, suppose that for treatment $i$ we start recruitment at time $t_{0i}$ and we stop recruitment at time $t_{1i}$, that is, $d_{ti} = 0$ at time $t = t_{1i}$. If the treatment is abandoned ($a_i = 0$), then we only record a linear sampling cost $c_1 \sum_{t = t_{0i}}^{t_{1i}} N_{ti} = c_1 N_{\cdot i}$. Here, $N_{\cdot i}$ is the total number of patients assigned to treatment $i$.

If we proceed with a phase III trial ($a_i = 1$), then we record the sampling cost $c_1 n_3$, where $n_3$ is the sample size of the future study, and we add a payoff for a significant phase III result, weighted by the estimated size of the advantage over the standard of care. Let $\theta_0$ denote the success probability for the standard of care. Let $\hat\theta_i$ and $\hat\theta_0$ denote the maximum likelihood estimates for $\theta_i$ and $\theta_0$ at the end of the phase III trial, and let $B$ denote the event of observing a significant result. Let $c_2$ denote the reward for recommending a treatment that shows a significant treatment effect in the confirmatory trial, that is, the reward for a successful drug development. The reward is scaled by the estimated size of the advantage over placebo and the probability of $B$. We record $c_2 \Pr(B \mid y_1, \ldots, y_{t_{1i}})\, E(\hat\theta_i - \hat\theta_0 \mid B, y_1, \ldots, y_{t_{1i}})$. Putting everything together, we have

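$$u(d, a, \theta, y) = \sum_{i=1}^{n_T} \left\{ -c_1 N_{\cdot i} + a_i \left[ -c_1 n_3 + c_2 \Pr(B \mid y_1, \ldots, y_{t_{1i}})\, E(\hat\theta_i - \hat\theta_0 \mid B, y_1, \ldots, y_{t_{1i}}) \right] \right\}. \qquad (2.3)$$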
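The contribution of a single treatment to the utility, as described in the preceding paragraphs, can be sketched as follows. The function name and argument names are hypothetical; `prob_b` and `expected_gain` stand for $\Pr(B \mid \cdot)$ and the conditional expected advantage, which are assumed to have been computed elsewhere.

```python
def treatment_utility(n_phase2, recommend, n3, prob_b, expected_gain, c1=1.0, c2=10000.0):
    """Utility contribution of one treatment.

    n_phase2      -- total phase II patients assigned to the treatment
    recommend     -- terminal decision (True: proceed to phase III)
    n3            -- phase III sample size
    prob_b        -- predictive probability of a significant phase III result
    expected_gain -- expected estimated advantage over standard of care, given significance
    """
    utility = -c1 * n_phase2  # the phase II sampling cost is always paid
    if recommend:
        # phase III sampling cost plus the reward, scaled by probability and size of the effect
        utility += -c1 * n3 + c2 * prob_b * expected_gain
    return utility
```

The default weights match the relative weights $c_1 = 1$ and $c_2 = 10000$ used in the simulation example of Section 4.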

We now discuss the evaluation of $n_3$, $\Pr(B \mid y_1, \ldots, y_{t_{1i}})$, and $E(\hat\theta_i - \hat\theta_0 \mid B, y_1, \ldots, y_{t_{1i}})$. Let $(m_{ti}, s_{ti})$ denote the posterior mean and standard deviation for $\theta_i$ at time $t$, and let $(m_{\cdot i}, s_{\cdot i})$ denote their values at time $t_{1i}$. The phase III sample size $n_3$ is chosen for a test comparing $H_0\colon \theta_i = \theta_0$ versus an alternative $H_1\colon \theta_i = m_{\cdot i}$, for a given significance level $\alpha_3$ and power $1 - \beta_3$. Let Formula, and let $z_p$ denote the $(1 - p)$ standard normal quantile. We assume that the final test is carried out as a z-test to compare two binomial proportions. Assuming known $\theta_0$, we approximate the phase III sample size as

Formula
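For orientation, a common textbook normal-approximation sample size for a one-sided z-test comparing two proportions is sketched below. This is an assumption for illustration only and is not claimed to match the approximation used in the paper; the function names are hypothetical, and the result is a per-arm size under equal allocation.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_upper(p):
    """Return z_p, the (1 - p) quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - p)


def phase3_sample_size(m, theta0, alpha3=0.05, beta3=0.20):
    """Per-arm sample size for H0: theta = theta0 versus H1: theta = m (one-sided test)."""
    if m <= theta0:
        raise ValueError("the desired power is not achievable unless m > theta0")
    pooled = (m + theta0) / 2.0  # pooled success probability under equal allocation
    numerator = (z_upper(alpha3) * sqrt(2.0 * pooled * (1.0 - pooled))
                 + z_upper(beta3) * sqrt(m * (1.0 - m) + theta0 * (1.0 - theta0)))
    return ceil((numerator / (m - theta0)) ** 2)
```

The required size grows quickly as the posterior mean $m_{\cdot i}$ approaches $\theta_0$, which is why the sampling cost $c_1 n_3$ can outweigh the expected payoff for marginal treatments.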

Next, we evaluate the posterior predictive probability $p(B \mid y_1, \ldots, y_{t_{1i}})$. The event $B$ is defined by the z-statistic falling in the rejection region in favor of the experimental arm. Thus,

Formula

Using a normal approximation to the posterior predictive distribution Formula, we can approximate $p(B \mid y_1, \ldots, y_{t_{1i}})$. Denote by $\mu_\Delta$ and $\sigma_\Delta^2$ the moments of this normal approximation,

Formula

Finally, we evaluate $E(\hat\theta_i - \hat\theta_0 \mid B, y_1, \ldots, y_{t_{1i}})$, the posterior predictive expectation for the size of the advantage over standard of care, conditional on $B$. This conditional expectation is evaluated as the expected value of a normal random variable left truncated at Formula (Jawitz, 2004).
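The mean of a left-truncated normal variable has a standard closed form, $E(X \mid X > c) = \mu + \sigma\,\varphi(a) / \{1 - \Phi(a)\}$ with $a = (c - \mu)/\sigma$. A minimal sketch follows; the function name is hypothetical, and the truncation point is left as an input because it is not recoverable from the text above.

```python
from statistics import NormalDist


def left_truncated_normal_mean(mu, sigma, lower):
    """Expected value of X ~ N(mu, sigma^2) conditional on X > lower."""
    std_normal = NormalDist()
    a = (lower - mu) / sigma            # standardized truncation point
    tail = 1.0 - std_normal.cdf(a)      # probability of exceeding the truncation point
    return mu + sigma * std_normal.pdf(a) / tail
```

For very large standardized truncation points the tail probability underflows, so a production implementation would need a more careful evaluation of the ratio.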

2.3. Decision boundaries

The described action set, probability model, and utility function formally define the decision problem. We now proceed to find the optimal solution by maximizing the utility $u(d, a, \theta, y)$ as a function of the decisions, marginalizing with respect to $\theta$ and all future data that are unknown at the time of a decision and conditioning on all available data.

We first discuss the terminal decision $a_i$, the indicator for recommending a phase III trial. The terminal decision is carried out at time $t = t_{1i}$. From (2.3), we find that $a_i = 1$ is optimal if and only if

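$$-c_1 n_3 + c_2 \Pr(B \mid y_1, \ldots, y_{t_{1i}})\, E(\hat\theta_i - \hat\theta_0 \mid B, y_1, \ldots, y_{t_{1i}}) > 0. \qquad (2.4)$$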

If $m_{\cdot i} < \theta_0$, it is not possible to achieve the desired power in the phase III trial, and we set $a_i = 0$. This settles the terminal decision $a_i$ once we have decided to stop enrollment in treatment $i$.
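The terminal rule can be sketched as a comparison of the expected phase III payoff with the phase III sampling cost. The function and argument names are hypothetical, and the boundary case $m_{\cdot i} = \theta_0$ is treated here as "abandon", which is an assumption since the text only addresses $m_{\cdot i} < \theta_0$.

```python
def terminal_decision(m, theta0, n3, prob_b, expected_gain, c1=1.0, c2=10000.0):
    """Return 1 to recommend phase III, 0 to abandon the treatment."""
    if m <= theta0:
        # the phase III trial cannot reach the desired power, so the treatment is abandoned
        return 0
    # recommend only if the expected reward exceeds the phase III sampling cost
    return 1 if c2 * prob_b * expected_gain - c1 * n3 > 0 else 0
```

The phase II sampling cost does not enter this comparison because it has already been incurred under either terminal decision.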

The continuation decision $d_{ti}$ is complicated by its sequential nature. To find the optimal solution at time $t$, we need to compare expected utilities under $d_{ti} = 0$ and $d_{ti} = 1$. To find the expected utility under continuation, $d_{ti} = 1$, we need to know the decision for $t + 1$, and so on. A full solution involves the use of backward induction. But the computational cost of backward induction makes a full solution infeasible even in fairly simple situations. DeGroot (2004), Brockwell and Kadane (2003), and Berry and others (2001) discuss alternative, computationally intensive approaches that allow one to approximate full backward induction. Many Bayesian clinical trial designs avoid the difficult problem of optimal sequential decisions by stopping short of a formal decision-theoretic approach. Instead, many methods include a combination of posterior inference for the probability model with reasonable but ad hoc rules for the desired decisions. A typical example is the approach proposed in Thall and others (1995). The method proceeds by evaluating posterior probabilities of clinically meaningful events. When these probabilities cross predefined boundaries, certain decisions are indicated. The boundaries are fixed to achieve desired frequentist properties. Spiegelhalter and others (2004) refer to such decision rules as proper Bayes. The main problem with such approaches is the large number of arbitrary choices. The major advantage is the ease of implementation.

We propose rules that are derived as optimal Bayes rules by maximizing expected utility. But we avoid the prohibitive computational cost of backward induction by appealing to an approximation. Instead of a full backward induction solution, we use decision boundaries in the space of marginal posterior moments $(\log(s_{ti}), m_{ti})$ to approximate the optimal sequential decision. See Figure 2 for an example. The decision boundary is defined by two line segments starting at $(s_0, b_0)$ and going through $(s_1, b_1)$, with $b_1 > b_0$, and $(s_1, b_2)$, with $b_2 < b_0$, respectively. The two values $s_0$ and $s_1$ are fixed, leaving $b = (b_0, b_1, b_2)$ to identify the decision boundary. At the end of each period $t$, we compare the marginal moments $(\log(s_{ti}), m_{ti})$ with the decision boundaries. If $m_{ti}$ lies between the two lines, then we continue to accrue patients for treatment $i$, that is, $d_{ti} = 1$. If not, we drop treatment $i$ ($d_{ti} = 0$). In summary,

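$$d_{ti}(b) = \begin{cases} 1 & \text{if } b_0 + (b_2 - b_0)\,\dfrac{\log(s_{ti}) - s_0}{s_1 - s_0} < m_{ti} < b_0 + (b_1 - b_0)\,\dfrac{\log(s_{ti}) - s_0}{s_1 - s_0}, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.5)$$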

We write d = d(b) to highlight the nature of d as a rule determined by the decision boundary b.
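The boundary rule can be sketched directly from the geometric description: two lines through $(s_0, b_0)$, one passing through $(s_1, b_1)$ and the other through $(s_1, b_2)$. The function name is hypothetical, and the sketch assumes $s_1 > s_0$.

```python
def continue_sampling(log_s, m, b, s0, s1):
    """Return 1 to continue accrual, 0 to stop, for posterior moments (log_s, m)."""
    b0, b1, b2 = b
    frac = (log_s - s0) / (s1 - s0)   # relative position between the two fixed offsets
    upper = b0 + (b1 - b0) * frac     # upper half line, through (s0, b0) and (s1, b1)
    lower = b0 + (b2 - b0) * frac     # lower half line, through (s0, b0) and (s1, b2)
    return 1 if lower < m < upper else 0
```

When `log_s` falls below `s0` the two lines cross, so the continuation interval is empty and the rule always stops, which agrees with the stated property that accrual ends once $\log(s_{ti}) < s_0$.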


Fig. 2. Forward simulation. For 20 treatments, we plot $(m_{ti}, \log s_{ti})$ from the time $t_{0i}$ treatment $i$ arose until $t_{1i}$ when it stopped. The 2 thick black half lines show a decision boundary $b$.

 
Figure 2 shows the decision boundaries for a specific choice of $(b_0, b_1, b_2)$. The (fixed) offset $s_0$ determines the $\log(s_{ti})$ value where the two half lines join. We always stop accruing patients when $\log(s_{ti}) < s_0$. This has the desirable implication of imposing an upper bound on the amount of information, as measured by posterior variance, before making a stopping decision. The stopping decision is followed by the terminal decision $a_i$, as described earlier.

Using decision boundaries as in (2.5) reduces the solution of the sequential decision problem to finding the optimal parameters $b = (b_0, b_1, b_2)$. The optimal choice is determined by maximizing expected utility

$$U(b) = E\{u(d(b), a, \theta, y)\}. \qquad (2.6)$$

The expectation is over $\theta \sim p(\theta)$ and $y_{ti} \sim p(y_{ti} \mid \theta)$, plugging in the optimal terminal decisions $a_i$.


    3. EXPECTED UTILITY MAXIMIZATION BY SIMULATION
We resort to forward simulation to evaluate expected utility

$$U(b) = \int u(d(b), a, \theta, y)\, dp(\theta, y), \qquad (3.1)$$

using the optimal terminal rule $a$. Forward simulation was introduced in Carlin and others (1998) to solve sequential decision problems that can be described by decision boundaries. We simulate once, up front, possible realizations, $j = 1, \ldots, M$, of the screening process, keeping all treatments in the trial until a final horizon $T$. That is, we do not include stopping in the simulation. The arrival of new treatments is simulated using the multinomial probabilities $\pi_j$.

To evaluate expected utility $U(b)$ for a decision boundary described by $b$, we look through the file of saved simulations. Let $u_i$ denote the $i$th term in (2.3). Whenever a treatment hits the decision boundary $b$, it is removed from the current set. When this happens, we compute the optimal terminal decision $a_i$ using (2.4) and record the realized utility $u_i$ for this treatment. Summing $u_i$ over all treatments, we get a realization of the utility (2.3). Averaging over all simulated realizations, $j = 1, \ldots, M$, we obtain an estimate $\hat{U}(b)$ of the expected utility $U(b)$. In other words, we use the Monte Carlo average $\hat{U}(b)$ to evaluate the expected utility integral (3.1). Similarly, we can evaluate the expected value of other summary statistics, such as the number of patients tested with each treatment or the probabilities of type I and II errors. Finally, evaluating $\hat{U}(b)$ over a grid on $b$, we find the optimal decision boundary $b^\ast$.
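The key computational idea, simulating trajectories once without stopping and then replaying them under any candidate boundary, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes a fixed conjugate Be(u, v) prior instead of the hierarchical model, treats treatments independently with no arrival process, and reports the mean number of patients per treatment rather than the full utility. All names are hypothetical.

```python
import math
import random


def simulate_trajectory(rng, horizon=50, cohort=2, u=1.0, v=1.0):
    """Simulate one treatment for the full horizon, with no stopping.

    Returns the true theta and the per-period list of (log posterior SD, posterior mean).
    """
    theta = rng.betavariate(u, v)
    a, b = u, v
    path = []
    for _ in range(horizon):
        y = sum(rng.random() < theta for _ in range(cohort))
        a, b = a + y, b + cohort - y          # conjugate Beta update
        mean = a / (a + b)
        sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))
        path.append((math.log(sd), mean))
    return theta, path


def stopping_period(path, b, s0, s1):
    """Replay a stored path and return the first period at which the boundary is hit."""
    b0, b1, b2 = b
    for t, (log_s, m) in enumerate(path, start=1):
        frac = (log_s - s0) / (s1 - s0)
        if not (b0 + (b2 - b0) * frac < m < b0 + (b1 - b0) * frac):
            return t
    return len(path)                           # forced stop at the horizon


def mean_patients(paths, b, s0, s1, cohort=2):
    """Monte Carlo average of the number of patients per treatment under boundary b."""
    return sum(cohort * stopping_period(p, b, s0, s1) for _, p in paths) / len(paths)


rng = random.Random(0)
stored = [simulate_trajectory(rng) for _ in range(200)]   # simulated once, reused for every b
narrow = mean_patients(stored, (0.5, 0.6, 0.4), s0=-3.0, s1=-1.0)
wide = mean_patients(stored, (0.5, 0.9, 0.1), s0=-3.0, s1=-1.0)
```

Because every candidate boundary is evaluated on the same stored trajectories, comparisons between boundaries share the same simulation noise, and a grid search only requires repeated replays rather than new simulations.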

Evaluation of $U(b)$ as a sample average $\hat{U}(b)$ does not exploit assumed regularities of the expected utility surface as a function of $b$. That is, we ignore that we could learn about $U(b)$ also by looking at close-by designs $b'$. This is formalized by fitting a smooth surface $\tilde{U}(b)$ to the observed sample averages $\hat{U}(b)$ as a function of $b$. Such smoothing was proposed in Müller and Parmigiani (1996) as a generic method to improve expected utility evaluations. We propose to define the smooth surface $\tilde{U}(b)$ as a locally weighted linear regression of $\hat{U}(b)$ on $b$, using only main effects for $b_0$, $b_1$, and $b_2$.
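A locally weighted linear regression with main effects only can be sketched with the standard library alone. The Gaussian kernel and the function names are assumptions made for illustration; the text does not specify the weight function.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def local_linear_fit(points, values, target, bandwidth):
    """Smoothed value at `target` from a kernel-weighted linear fit (main effects only)."""
    p = len(target) + 1
    xtwx = [[0.0] * p for _ in range(p)]
    xtwy = [0.0] * p
    for b, val in zip(points, values):
        dist2 = sum((bi - ti) ** 2 for bi, ti in zip(b, target))
        w = math.exp(-0.5 * dist2 / bandwidth ** 2)            # assumed Gaussian kernel weight
        row = [1.0] + [bi - ti for bi, ti in zip(b, target)]   # covariates centered at target
        for i in range(p):
            xtwy[i] += w * row[i] * val
            for j in range(p):
                xtwx[i][j] += w * row[i] * row[j]
    # with centered covariates, the intercept is the fitted value at the target point
    return solve_linear_system(xtwx, xtwy)[0]
```

Evaluating `local_linear_fit` at every grid point, with the grid of designs as `points` and the Monte Carlo averages as `values`, yields the smoothed surface.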

The described algorithm requires the evaluation of $(m_{ti}, s_{ti})$ for a large number of times, treatments, and simulations. This can be very computationally intensive when no closed form is available, as is the case for the model defined by (2.1) and (2.2). We implemented instead an empirical Bayes approximation to $(m_{ti}, s_{ti})$ as proposed, for example, in Gelman and others (1995).


    4. SIMULATION EXAMPLE
We implemented the described method for the following problem. We assume a standard of care with success probability $\theta_0 = 0.5$, sampling in cohorts of $N_{ti} = 2$ patients, and multinomial probabilities $(\pi_0, \ldots, \pi_3) = (0.7, 0.2, 0.05, 0.05)$ for the arrival of new treatments $\Delta n_t$. We specify no limit on the number of available patients, that is, $N^\star = \infty$.

We set the prior parameters to $A_u = 3$, $B_u = 1$, $A_v = 3$, and $B_v = 1$, corresponding to a prior mean $E(\theta_i) = 0.5$ and standard deviation $\text{SD}(\theta_i) = 0.27$. The experimenter expects new treatments to be as good as the standard of care on average, but the actual performance of individual treatments can vary considerably. For the utility function, we use relative weights $c_1 = 1$ and $c_2 = 10000$; that is, the final payoff for a successful drug is 10 000 times the sampling cost for one patient. The value of $c_2$ was chosen to achieve a power of approximately 80%. See below for a definition of power and type II error in the context of this simulation. The time horizon is set to $T = 100$. We investigated the impact of $T$ on the solution by considering a doubling of the time horizon to $T = 200$. Comparing the reported optimal rules, we found no significant change, leading us to interpret $T = 100$ as a reasonable approximation for a process with infinite horizon.

We add one more important feature to the decision problem. Let $1 - \beta$ denote the probability of an effective treatment, that is, a treatment with simulation truth $\theta_i > \theta_0$, being recommended for phase III. The probability is over repeated simulations and averaging with respect to the prior over all $\theta_i > \theta_0$. We refer to $\beta$ as the false-negative probability (type II error) and $1 - \beta$ as power. Similarly, we define $\alpha$ as the false-positive probability (type I error). We constrain the set of allowable decision rules $b$ to rules that imply $\alpha \le 0.05$ and $\beta \le 0.20$, that is, power of at least 80%. The motivation for adding the constraint is that the utility function (2.3) could be criticized as being too narrowly focused on the perspective of the investigator and drug developer only. In the simulation, the constraint is imposed by restricting the grid search for the optimal rule $b^\ast$ to decision rules $b$ that satisfy the conditions. To evaluate $\alpha$, we find the relative frequency of simulated treatments with $\theta_i < \theta_0$ in the forward simulation that are recommended for phase III. To evaluate $\beta$, we find the relative frequency of simulated treatments with $\theta_i > \theta_0$ that are not recommended for phase III. All results are based on $M = 1000$ forward simulations.
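The error rates can be estimated from simulated treatments by simple counting. The sketch below assumes one reading of the definitions: $\alpha$ is the fraction of ineffective treatments that are recommended, and $\beta$ is the fraction of effective treatments that are not. Function names and the input format are hypothetical.

```python
def error_rates(treatments, theta0=0.5):
    """Estimate (alpha, beta) from (true theta, recommended for phase III) pairs."""
    ineffective = [rec for theta, rec in treatments if theta < theta0]
    effective = [rec for theta, rec in treatments if theta > theta0]
    # alpha: share of ineffective treatments that were nevertheless recommended
    alpha = sum(1 for rec in ineffective if rec) / len(ineffective) if ineffective else 0.0
    # beta: share of effective treatments that were not recommended
    beta = sum(1 for rec in effective if not rec) / len(effective) if effective else 0.0
    return alpha, beta


def satisfies_constraints(alpha, beta, max_alpha=0.05, max_beta=0.20):
    """Check whether a decision rule may enter the constrained grid search."""
    return alpha <= max_alpha and beta <= max_beta
```

In a grid search, `satisfies_constraints` would be applied to each candidate boundary before its estimated expected utility is compared with the others.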

We evaluate the expected utility $U(b)$ in (3.1) over a 3-dimensional grid, as described in Section 3. Figure 3(a), (b), and (c) plot the Monte Carlo estimate $\hat{U}(b)$ of this surface for several values of $b_0$. The flat nature of the surface with respect to $b_1$ indicates that a wide range of $b_1$ values yields similar expected utility.


Fig. 3. Heat map of unsmoothed expected utilities $\hat{U}(b)$ (first row) and smoothed expected utilities $\tilde{U}(b)$ (second row) on a grid. The black star in each panel marks the optimal $(b_1, b_2)$ combination for that $b_0$.

 
Figure 2 shows the optimal decision boundaries subject to $\alpha \le 0.05$ and $\beta \le 0.2$, some simulated trajectories, and the terminal decisions. For each treatment in the simulated trial, we plot the trajectory of $(m_{ti}, \log s_{ti})$. We follow each treatment from right (high prior variance) to left until the trajectory crosses a decision boundary. This defines the stopping time $t = t_{1i}$ and the terminal decision $a_i$.

Rows 1 and 2 of Table 1 provide the solution to both the unconstrained and the constrained optimization problems. The optimal unconstrained decision has higher expected utility than the optimal constrained decision, but it requires a larger average number of patients and it implies higher $\alpha$ and $\beta$.


Table 1. Optimal decisions for the simulation example

 
Finally, we fit a smooth surface $\tilde{U}(b)$ to the Monte Carlo estimates $\hat{U}(b)$ by locally weighted linear regression, as proposed in Section 3. The optimal bandwidth is selected by leaving 1/3 of the grid points out of the model fit and minimizing the mean square error of the predictions for those points. Figure 3(d), (e), and (f) show the fit.

The optimal decisions, shown in rows 3 and 4 of Table 1, are very similar to those obtained without smoothing. The fact that the optimal design changed little confirms that the chosen Monte Carlo sample size, M = 1000, was sufficiently large for this optimal design problem. For smaller M, the advantages of smoothing should be more noticeable.


    5. UNCERTAINTY AND SENSITIVITY OF THE OPTIMAL DECISION

5.1. Uncertainty

We consider two sources of uncertainty in the final solution b*. First, numerical errors in evaluating the expected utilities imply uncertainty about the location of the maximum b*. We refer to this uncertainty as "numerical uncertainty." Second, even if we identify the correct mode of the expected utility surface, there may be other designs with almost equally high expected utility. We refer to a set of designs b with expected utility U(b) within a small neighborhood of U(b*) as "almost optimal designs." While it is possible to reduce the first source of uncertainty by more extensive simulation, the latter uncertainty is inherent in the problem. We can only aim to honestly describe it.

To evaluate the numerical uncertainty in $b^\ast$, we select designs $b$ within a neighborhood of $b^\ast$. We then approximate the expected utility $U(b)$ in that neighborhood with a quadratic response surface, $U(b) = Q(b; \gamma) + \epsilon$. Here, $\gamma$ are the regression coefficients of the quadratic function and $\epsilon$ are independent normal residuals. The posterior distribution on $\gamma$ implies a posterior distribution on the mode of the response surface, which is a function of $\gamma$. We report 95% posterior intervals for this mode to summarize the numerical uncertainty in the optimization. The results are shown in Table 2. The table is based on a neighborhood of $b^\ast$ defined by $\|b - b^\ast\| \le 0.01$. We judge the reported uncertainties to be negligible based on the comparison with the set of almost optimal designs discussed below. The small size of the reported numerical uncertainties confirms that the chosen Monte Carlo sample size $M = 1000$ was sufficiently large.


Table 2. Uncertainty in the determination of the optimal decision

 
Next, we find the set of almost optimal designs. In the forward simulation, 20^3 = 8000 triples b = (b0, b1, b2) were considered, that is, we estimated the expected utility and the type I and type II error rates for 8000 possible values of b. Of these, 55 designs satisfied the constraints on the type I and type II error rates and had an expected utility greater than 95% of the utility under the optimal design b*, that is, U(b) > 0.95 U(b*). Here, b* denotes the optimal rule for the constrained problem. We refer to these 55 designs as the almost optimal designs. They are suboptimal but only by a negligible difference in expected utility. The range of almost optimal designs is reported in Table 2. Since the decision problem is invariant with respect to any additive shift of the utility function, it is not possible to recommend a universal threshold like the 95% chosen here. The choice depends on the problem. The reported range of almost optimal designs is a useful diagnostic to help interpret, critique, and modify the proposed solution. Typically, the utility function is only a stylized description of the decision problem. The range of almost optimal designs allows the investigator to consider adjustments of the proposed solution to accommodate secondary goals and nuances of the decision problem that were not included in the formal utility function. For example, a large range on b1 or b2 might lead an investigator to propose designs with a narrower continuation region than b*, that is, a shorter total time for each treatment under consideration.
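The selection of almost optimal designs amounts to a filter over the simulated grid. The following sketch assumes that the constraints take the form of upper bounds alpha_max and beta_max on the estimated error rates and that the optimal utility is positive, so that a 95% fraction is meaningful. The function name and argument names are hypothetical.

```python
import numpy as np

def almost_optimal(designs, utility, alpha_hat, beta_hat,
                   alpha_max, beta_max, frac=0.95):
    """Return the constrained optimum, the almost optimal set, and its range.

    designs   : (n, 3) array of boundary parameters (b0, b1, b2)
    utility   : (n,) Monte Carlo estimates of expected utility
    alpha_hat : (n,) estimated type I error rates
    beta_hat  : (n,) estimated type II error rates
    """
    designs = np.asarray(designs, dtype=float)
    utility = np.asarray(utility, dtype=float)
    feasible = (np.asarray(alpha_hat) <= alpha_max) & \
               (np.asarray(beta_hat) <= beta_max)
    if not feasible.any():
        raise ValueError("no design satisfies the error-rate constraints")
    u_star = utility[feasible].max()        # utility of the constrained optimum
    best = designs[feasible][np.argmax(utility[feasible])]
    # Assumes u_star > 0; the threshold is not shift invariant.
    keep = feasible & (utility > frac * u_star)
    ranges = np.vstack([designs[keep].min(axis=0),
                        designs[keep].max(axis=0)])
    return best, designs[keep], ranges
```

The returned ranges array holds the componentwise minimum and maximum of b0, b1, and b2 over the almost optimal set, which is the kind of summary reported in Table 2.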

5.2. Sensitivity analysis

We assess the sensitivity of the solutions with respect to the choice of the main features of the decision problem: the utility function, the prior probability model, the parameterization of the decision boundary, and the maximum number of patients enrolled across all trials at each time (N*).

We first consider changes to the utility function defined in (2.3). We leave the general form of the utility unchanged, but we now weight the payoff for a significant phase III result by the true advantage over placebo (θi − θ0), rather than the estimated advantage. We define the utility function

Formula (5.1)

The optimal designs under the corresponding expected utility U2(b) are shown in rows 5–6 of Table 1 (smooth version only). Compared to the solution under the original utility function, b0 in the solution of the unconstrained problem decreases, the expected sample size increases, and the type I error probability decreases slightly. The solution of the constrained problem is robust with respect to the change in the utility.

Next, we consider changes in the prior probability model. In (2.2), we defined a hierarchical model with hyperparameters that allow the pooling of information between treatments. We investigate the change in the optimal design if, at the time of analysis, we ignore the hierarchy and use independent beta priors. We continue to use the hierarchical model as the simulation truth. For a meaningful comparison, we use U2 since it does not depend on model-based estimates of θi. Table 1 presents the optimal decisions for both the constrained and unconstrained problems. Without pooling information, the expected number of patients per treatment is increased. The change is most extreme in the constrained problem. The expected utility of the optimal design changes only slightly. We conclude that by using the hierarchical prior, we can gain the same payoff with fewer patients. Of course, this conclusion is only valid if the true sampling process does in fact include dependence across treatments.

Next, we consider changes in the parameterization of the decision boundaries. In (2.5), we imposed that the boundaries be linear in log(sti). We investigate changing the boundaries to be linear in sti, that is, in (2.5) we replace log(sti) by sti. Results are shown in Table 1. The optimal design, its expected utility, and the expected number of patients per treatment are similar to the results for the log-scale parameterization. Lack of such robustness would be a concern. It would indicate that the optimal sequential rule is very poorly approximated by boundaries on the chosen grid.

Finally, we consider changing N*. We investigate the solution for only N* = 10 eligible patients available to enroll at each time (across all treatments in At). The optimal rule, shown in Table 1, results in smaller sample sizes and reduced utility, especially under the unconstrained problem.


    6. A CLINICAL IMMUNOLOGY EXAMPLE
We apply the proposed approach to a screening design for vaccines in a clinical immunology scenario. The same example was analyzed in Yao and others (1996) and Yao and Venkatraman (1998) using an approach that minimizes the expected number of patients until the first effective treatment is identified. The design is subject to bounds on the type I and type II error rates. They propose a 2-stage design where an interim analysis is carried out after the first N1 patients. The treatment is discarded if the number of successes is ≤ K1. Otherwise, N2 more patients are accrued, and the treatment is discarded if the overall number of successes is ≤ K2 and recommended otherwise. They then repeat the same process with the second treatment, and so forth. The decision parameters are K1, K2, N1, and N2. The method also includes a truncation, that is, stopping the accrual before observing N1 patients if the number of failures is already ≥ N1 − K1, and stopping before N2 patients if we already have more than N2 − K2 failures. Truncation can significantly reduce the expected number of patients.
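The 2-stage rule with truncation can be sketched as a simulation of a single treatment. This is an illustration under stated assumptions rather than the procedure of Yao and Venkatraman: N2 is taken to be the number of additional patients in the second stage, and truncation is implemented as stopping as soon as the discard decision is already certain. The function name two_stage_trial is hypothetical.

```python
import random

def two_stage_trial(theta, n1, k1, n2, k2, rng):
    """Simulate one treatment; return (recommended, patients_used).

    theta : true success probability of the treatment
    n1,k1 : stage-1 size and discard threshold (discard if successes <= k1)
    n2,k2 : additional stage-2 patients and overall discard threshold
    rng   : a random.Random instance
    """
    successes = failures = 0

    # Stage 1, truncated once more than k1 successes has become impossible.
    while successes + failures < n1:
        if failures >= n1 - k1:
            return False, successes + failures
        if rng.random() < theta:
            successes += 1
        else:
            failures += 1
    if successes <= k1:
        return False, n1

    # Stage 2, truncated once more than k2 successes has become impossible.
    total = n1 + n2
    while successes + failures < total:
        if failures >= total - k2:
            return False, successes + failures
        if rng.random() < theta:
            successes += 1
        else:
            failures += 1
    return successes > k2, total
```

Averaging patients_used over many simulated treatments, and dividing by the fraction recommended, would give a Monte Carlo estimate of the number of patients needed per recommended treatment.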

In our sequential approach, we define the utility function to be the average number of patients needed to recommend one treatment. This allows us to compare the results across methods. Specifically, we define the utility function to be the ratio of the total number of patients enrolled across all treatments to the number of treatments recommended for phase III. Let N·i = Σt Nti denote the total number of patients on treatment i, summed over the time periods t. We define

Formula

Like Yao and others, we set the success probability for the standard of care to be θ0 = 0.5, and we use a Beta prior with parameters Formula and Formula, chosen to match the moments based on historical data, Formula and Formula. The prior gives a high probability to success probabilities close to 0 or 1.
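Matching a Beta prior to a given mean and variance has a closed form, which the following sketch illustrates. The numerical values in the usage note are hypothetical and are not the historical moments used in the paper. The function name beta_from_moments is likewise an assumption.

```python
def beta_from_moments(mean, var):
    """Return (a, b) of the Beta distribution with the given mean and variance.

    Uses mean = a / (a + b) and var = mean * (1 - mean) / (a + b + 1).
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("var must be positive and below mean * (1 - mean)")
    nu = mean * (1.0 - mean) / var - 1.0    # nu = a + b
    return mean * nu, (1.0 - mean) * nu
```

For example, a hypothetical mean of 0.5 and variance of 0.125 give a = b = 0.5. When both parameters are below 1, the density is U-shaped, which is the situation in which a prior favors success probabilities close to 0 or 1.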

We evaluate designs with M = 1000 simulations on a grid with 20 equally spaced values of b0 in [0.3,0.7], b1 in [0.3,0.8], and b2 in [0.2,0.6]. We use cohorts of N = 2 patients. After each batch, the posterior moments are evaluated and the decision to stop is taken according to (2.5). For the terminal decision, we use a fixed rule. Upon stopping the enrollment, a treatment is recommended when stopping was indicated by crossing the upper boundary, and a treatment is abandoned if stopping was indicated by crossing the lower boundary. We then select b to maximize the Monte Carlo sample average utility in the forward simulation. Again, the maximization is restricted to designs b that satisfy the constraints on the type I and type II error rates.
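The cohort-wise updating of posterior moments can be sketched for a single treatment as follows. The sketch assumes a conjugate Beta(a, b) prior with binomial responses, takes the summaries to be the posterior mean and standard deviation, and leaves the stopping boundaries of (2.5) to a caller-supplied function stop_rule, which is hypothetical. The cap max_patients and the decision to abandon at the cap are also assumptions made only to guarantee termination.

```python
import math
import random

def beta_moments(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

def run_treatment(theta, a, b, stop_rule, rng, cohort=2, max_patients=200):
    """Enroll cohorts until stop_rule(m, s) returns a terminal action.

    stop_rule must return "continue", "recommend", or "abandon".
    Returns (action, patients_enrolled).
    """
    n = 0
    while n < max_patients:
        # Simulate one cohort of binary responses.
        y = sum(rng.random() < theta for _ in range(cohort))
        a, b, n = a + y, b + cohort - y, n + cohort
        m, s = beta_moments(a, b)           # posterior summaries after the batch
        action = stop_rule(m, s)
        if action != "continue":
            return action, n
    return "abandon", n                     # assumption: abandon at the cap
```

A forward simulation would call run_treatment repeatedly for each candidate b, with stop_rule built from that b, and average the resulting utilities over the M simulations.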

Table 3 shows the optimal decision boundaries for several values of αmax and βmax, and compares them with the 2-stage optimal design with truncation proposed in Yao and Venkatraman (1998). The fully sequential approach with the optimal decision boundaries yields a reduction between 27% and 57% in the expected number of patients necessary to recommend a treatment for phase III evaluation. In some cases, the actual type I and type II error rates are lower than the upper bounds imposed by the constraints. The reduced sample size is a natural consequence of the fully sequential setup and does not reflect any deficiency in the other method.



 
Table 3. Clinical immunology example

 

    7. DISCUSSION
We have proposed a Bayesian decision-theoretic approach to optimal screening designs for phase II studies. Its main strength is the generality of the simulation-based solution which allows for a wide range of probability models and essentially any utility function. Another advantage is the possibility of optimizing within a subset of rules that satisfy certain properties.

As a decision-theoretic approach, the proposed method inherits the usual limitations of expected utility maximization. In particular, it requires the specification of a utility function and a prior probability model.

The use of decision boundaries to solve the sequential design problem greatly reduces the computational burden to find the optimal sequential decision. At the same time, however, it restricts the possible actions to those described by such decision boundaries. Instead of decision boundaries, one could consider the optimal decision for all possible values of a suitable summary statistic, in our case (sti, mti), on a finite grid. This is explored in Ding (2006).

The basic framework developed in this paper allows many generalizations. The prior model could easily be generalized to include a regression of success probabilities on treatment-specific covariates. For example, we might learn that treatments that target a specific molecular mechanism are more successful than others. Another important direction of generalization is the sampling model. Consider problems with a continuous response or an outcome that reports the time to some clinically meaningful event. The algorithm can still be used in such problems as long as we can define a single parameter upon which to base the inference. For example, when analyzing time until tumor progression, one could define the summaries (mti,sti) as posterior moments of a log hazard ratio for treatment relative to the standard of care. The nature of the event time as a delayed response would cause no difficulty in the optimal design scheme. Delayed responses are accounted for in the definition of the posterior moments. In particular, the definition of the likelihood function would include different factors for censored observations and for observed event times, as usual in posterior inference for event time data.


    ACKNOWLEDGMENTS
 
Research was supported by National Institutes of Health/National Cancer Institute grants R33 CA97534-01 and R01 CA075981. We thank Raquel Montes Díaz and Roberto Carta for work on earlier prototypes of the proposed method. Conflict of Interest: None declared. Funding to pay the Open Access publication charges for this article was provided by M. D. Anderson Cancer Center.


    REFERENCES

    Berry DA, Mueller P, Grieve AP, Smith M, Parke T, Blazek R, Mitchard N, Krams M. Adaptive Bayesian designs for dose-ranging drug trials. In: Gatsonis C, Kass RE, Carlin B, Carriquiry A, Gelman A, Verdinelli I, West M, eds. Case Studies in Bayesian Statistics, Volume V (2001) New York: Springer. 99–182. Lecture Notes in Statistics.

    Brockwell AE, Kadane JB. A gridding method for Bayesian sequential decision problems. Journal of Computational and Graphical Statistics (2003) 12:566–584.

    Carlin B, Kadane J, Gelfand A. Approaches for optimal sequential decision analysis in clinical trials. Biometrics (1998) 54:964–975.

    DeGroot M. Optimal Statistical Decisions (2004) New York: Wiley-Interscience.

    Ding M. Bayesian optimal design for phase II screening trials [PhD Thesis]. (2006) Houston, TX: M.D. Anderson Cancer Center and Rice University.

    Gelman A, Carlin J, Stern H, Rubin D. Bayesian Data Analysis (1995) Boca Raton, FL: Chapman & Hall.

    Gittins J, Pezeshk H. A decision-theoretic approach to sample size determination in clinical trials. Journal of Biopharmaceutical Statistics (2002) 12:535–551.

    Jawitz J. Moments of truncated continuous univariate distributions. Advances in Water Resources (2004) 27:269–281.

    Leung DHY, Wang YG. Optimal designs for evaluating a series of treatments. Biometrics (2001) 57:168–171.

    Müller P, Parmigiani G. Optimal design via curve fitting of Monte Carlo experiments. Journal of the American Statistical Association (1996) 90:1322–1330.

    Simon R. How large should a phase II trial of a new drug be? Cancer Treatment Reports (1987) 71:1079–1085.

    Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health Care Evaluation (2004) Chichester, UK: John Wiley and Sons.

    Stout Q, Hardwick J. Optimal screening designs with flexible cost and constraint structures. Journal of Statistical Planning and Inference (2005) 132:149–162.

    Strauss N, Simon R. Investigating a sequence of randomized phase-II trials to discover promising treatments. Statistics in Medicine (1995) 14:1479–1489.

    Thall P, Simon R, Estey E. Bayesian sequential monitoring designs for single-arm clinical trials with multiple outcomes. Statistics in Medicine (1995) 14:357–379.

    Wang YG, Leung DHY. An optimal design for screening trials. Biometrics (1998) 54:243–250.

    Yao TJ, Begg CB, Livingston PO. Optimal sample size for a series of pilot trials of new agents. Biometrics (1996) 52:992–1001.

    Yao TJ, Venkatraman ES. Optimal two-stage design for a series of pilot trials of new agents. Biometrics (1998) 54:1183–1189.

    Received April 25, 2005; revised September 1, 2006; accepted for publication October 3, 2006.

