Biostatistics Advance Access originally published online on October 27, 2006
Biostatistics 2007 8(3):625-631; doi:10.1093/biostatistics/kxl034
© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Sample size determination for matched-pair equivalence trials using rate ratio

Nian-Sheng Tang

Department of Statistics, Yunnan University, Kunming 650091, China

Man-Lai Tang*

Department of Mathematics, Hong Kong Baptist University, Kowloon, Hong Kong (mltang@math.hkbu.edu.hk)

Shun-Fang Wang

Department of Statistics, Yunnan University, Kunming 650091, China

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. PROCEDURES FOR EQUIVALENCE...
 3. SAMPLE SIZE FORMULAE
 4. ANALYSIS OF THE...
 5. CONCLUSION
 REFERENCES
 
In this article, we compare Wald-type, logarithmic transformation, and Fieller-type statistics for the classical 2-sided equivalence testing of the rate ratio under matched-pair designs with a binary end point. These statistics can be implemented through sample-based, constrained least squares estimation and constrained maximum likelihood (CML) estimation methods. Sample size formulae based on the CML estimation method are developed. We consider formulae that control a prespecified power or confidence width. Our simulation studies show that statistics based on the CML estimation method generally outperform other statistics and methods with respect to actual type I error rate and average width of confidence intervals. Also, the corresponding sample size formulae are valid asymptotically in the sense that the exact power and actual coverage probability for the estimated sample size are generally close to their prespecified values. The methods are illustrated with a real example from a clinical laboratory study.

Keywords: Constrained maximum likelihood estimation method; Equivalence study; Sample size formula; Score test statistic


    1. INTRODUCTION
Many new treatments have been developed because they offer advantages such as better safety profiles, easier administration or lower cost, while maintaining efficacies similar to those of standard treatments. This changes the nature of the clinical investigation from a superiority to a noninferiority or an equivalence trial design (Hauck and Anderson, 1999).

An example is a clinical laboratory study of several radioallergosorbent test (RAST) methods (Garcia and others, 1997). Briefly, the rate of production of in vitro immunoglobulin E (IgE) antibodies to the benzylpenicilloyl (BPO) determinant is a useful tool for evaluating suspected penicillin-allergic subjects. BPO conjugated to human serum albumin (HSA) is usually considered to be the standard test, and BPO conjugated to an aminospacer (SP) has only recently been suggested. The objective of this trial is to demonstrate that the ratio of the true success rates lies between a pair of clinically acceptable equivalence margins $\delta_0$ and $\delta_1$, with $\delta_0 < \delta_1$. However, only 60 samples of sera were obtained. It was reported that this could be an undersized study, and a proper sample size determination was needed once $\delta_0$ and $\delta_1$ were fixed (Tang, 2003).

Sample size determination for assessing the equivalence/noninferiority of 2 treatments via a rate ratio under a matched-pair design has only recently been studied (see Tang, 2003 and references therein). Adopting the logarithmic transformation and Fieller-type statistics based on sample-based and constrained least squares estimations of nuisance parameters, Lui and Cumberland (2001) developed sample size formulae for noninferiority testing of the rate ratio in matched-pair designs. Nam and Blackwelder (2002) derived sample size formulae based on a Wald-type statistic and the constrained maximum likelihood (CML) Fieller-type statistic. Tang and others (2002) developed sample size formulae that control the desired power and confidence width based on a score-type statistic. Tang (2003) found that the score-type statistic of Tang and others is identical to Nam and Blackwelder's CML statistic and that the sample size formulae based on the CML estimation method are valid asymptotically. However, all these studies have mainly been concerned with noninferiority testing. Systematic evaluations of equivalence trials via rate ratios have not been carried out.

In this article, we consider the problem of testing equivalence via a rate ratio. We compare the performance of Wald-type, logarithmic transformation, and Fieller-type statistics. These statistics are implemented through sample-based, constrained least squares estimation, and CML estimation methods. We discuss both significance testing and confidence interval approaches. We then consider sample size formulae for different tests and approaches based on the CML estimation method. Simulations are conducted to demonstrate the asymptotic validity of the proposed formulae. We illustrate our methodologies with the aforementioned clinical laboratory study. Finally, we give a brief discussion.


    2. PROCEDURES FOR EQUIVALENCE HYPOTHESIS TESTS

2.1. Interval hypothesis

We assume that the disease status (i.e. diseased or nondiseased) of a given subject can be determined by a gold standard. A random sample of $n_g$ subjects is drawn from each of the diseased ($g = d$) and nondiseased ($g = \bar{d}$) populations. A reference diagnostic test and a new diagnostic test are then administered to each of these $n_g$ sampled subjects in random order. We define "concordant" as a positive test result on a diseased subject or a negative test result on a nondiseased subject. Let $i = 1$ ($j = 1$) if a subject shows a concordant result in the new (reference) test; otherwise $i = 0$ ($j = 0$). Let $X_{ijg}$ be the number of subjects that show result $(i,j)$ ($i = 0,1$; $j = 0,1$) in the $g$th population ($g = d, \bar{d}$) and $x_{ijg}$ the corresponding observed value of $X_{ijg}$. The 4 outcomes and probabilities in population $g$ are summarized in Table 1, where $0 \le p_{ijg} \le 1$ denotes the response probability of cell $(i,j)$, $p_{i+g} = p_{i1g} + p_{i0g}$ and $p_{+jg} = p_{1jg} + p_{0jg}$, $i,j = 0,1$; $x_{i+g} = x_{i1g} + x_{i0g}$ and $x_{+jg} = x_{1jg} + x_{0jg}$, $i,j = 0,1$. Note that $n_g = x_{1+g} + x_{0+g} = x_{+1g} + x_{+0g}$, $g = d, \bar{d}$. Hence, the sensitivities for the new and reference diagnostic tests are given by $p_{1+d}$ and $p_{+1d}$, respectively. Similarly, the specificities for the new and reference diagnostic tests are given by $p_{1+\bar{d}}$ and $p_{+1\bar{d}}$, respectively. Let $\hat{p}_{ijg} = x_{ijg}/n_g$, $\hat{p}_{i+g} = x_{i+g}/n_g$, and $\hat{p}_{+jg} = x_{+jg}/n_g$ be the sample-based estimates of $p_{ijg}$, $p_{i+g}$, and $p_{+jg}$, for $i,j = 0,1$ and $g = d, \bar{d}$, respectively. The ratio $p_{1+d}/p_{+1d}$ (or $p_{1+\bar{d}}/p_{+1\bar{d}}$) provides a measure for assessing equivalence between 2 test procedures in terms of sensitivity (or specificity), and the vector $(x_{11g}, x_{10g}, x_{01g})$ follows a multinomial distribution with response probabilities $(p_{11g}, p_{10g}, p_{01g})$. The equivalence between the new and reference diagnostic procedures can be described by the following interval hypotheses:

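$$H_{0g}: \frac{p_{1+g}}{p_{+1g}} \le \delta_{0g} \ \text{ or } \ \frac{p_{1+g}}{p_{+1g}} \ge \delta_{1g} \quad \text{versus} \quad H_{1g}: \delta_{0g} < \frac{p_{1+g}}{p_{+1g}} < \delta_{1g}, \qquad (2.1)$$

where $\delta_{0g} < \delta_{1g}$ are predetermined clinically meaningful lower and upper equivalence limits.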
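As a concrete illustration of the notation above, the following minimal Python sketch computes the sample-based estimate of the rate ratio $p_{1+g}/p_{+1g}$ from the four cell counts of one population. The class name `PairedCounts`, the function name `rate_ratio`, and the counts in the usage note are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class PairedCounts:
    """Observed counts x_ij for one population g.

    i indexes the new test and j the reference test (1 = concordant result).
    """
    x11: int
    x10: int
    x01: int
    x00: int

    @property
    def n(self) -> int:
        # n_g = x_{1+g} + x_{0+g}
        return self.x11 + self.x10 + self.x01 + self.x00


def rate_ratio(counts: PairedCounts) -> float:
    """Sample-based estimate of p_{1+g} / p_{+1g}; n_g cancels in the ratio."""
    x1_plus = counts.x11 + counts.x10  # concordant on the new test
    x_plus1 = counts.x11 + counts.x01  # concordant on the reference test
    if x_plus1 == 0:
        raise ValueError("ratio undefined when x_{+1g} = 0")
    return x1_plus / x_plus1
```

For example, with the hypothetical counts `PairedCounts(20, 5, 3, 2)`, the new test is concordant on 25 subjects and the reference test on 23, so the estimated ratio is 25/23.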


Table 1. Matched-pair equivalence trial data

 
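To test the interval hypothesis $H_{0g}$ in (2.1), we adopt the widely used two one-sided tests (TOST) approach. This consists of testing the following 1-sided hypotheses (see Dunnett and Gent, 1977; Schuirmann, 1987):

$$H_{0lg}: \frac{p_{1+g}}{p_{+1g}} \le \delta_{0g} \quad \text{versus} \quad H_{1lg}: \frac{p_{1+g}}{p_{+1g}} > \delta_{0g} \qquad (2.2)$$

$$H_{0ug}: \frac{p_{1+g}}{p_{+1g}} \ge \delta_{1g} \quad \text{versus} \quad H_{1ug}: \frac{p_{1+g}}{p_{+1g}} < \delta_{1g} \qquad (2.3)$$

and taking the p value of the equivalence test for (2.1) to be the maximum of the p values of the two one-sided tests of (2.2) and (2.3) (Berger and Hsu, 1996).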
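The rule of taking the larger of the two one-sided p values can be sketched in a few lines of Python. The sketch assumes that each one-sided statistic is asymptotically standard normal at its null boundary, that the lower-margin test rejects for large values, and that the upper-margin test rejects for small values. The function name `tost_p_value` and its arguments are hypothetical.

```python
from statistics import NormalDist


def tost_p_value(t_lower: float, t_upper: float) -> float:
    """p value of the interval hypothesis as the maximum of two one-sided p values.

    t_lower: statistic for the lower-margin hypothesis (large values reject).
    t_upper: statistic for the upper-margin hypothesis (small values reject).
    Both are assumed to be approximately N(0, 1) at the null boundary.
    """
    std_normal = NormalDist()
    p_lower = 1.0 - std_normal.cdf(t_lower)  # upper-tail probability
    p_upper = std_normal.cdf(t_upper)        # lower-tail probability
    return max(p_lower, p_upper)
```

Equivalence is declared at level $\alpha$ only when this maximum does not exceed $\alpha$, which is the same as requiring both one-sided tests to reject.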

2.2 Asymptotic equivalence tests

Let Formula and Formula be any estimates of $p_{+1g}$ and $p_{11g}$ under the null hypothesis $H_{0kg}$, for $k = l, u$, respectively, and let $z_\alpha$ be the upper $100\alpha$ percentile of the standard normal distribution. For sufficiently large $n_g$, we consider the following equivalence tests for hypotheses (2.2) and (2.3) at level $\alpha$. All derivations are presented in the supplementary material available at Biostatistics online. Reject $H_{0g}$ at the level $\alpha$ if

(T1) Wald-type tests based on measurement $\delta_g = p_{1+g}/p_{+1g}$:

Formula (2.4)

(T2) Logarithmic transformation tests based on measurement $\log(\delta_g) = \log(p_{1+g}/p_{+1g})$:

Formula (2.5)

(T3) Fieller-type tests based on measurement $p_{1+g} - \delta_g p_{+1g}$:

Formula (2.6)

As reported by Farrington and Manning (1990) and Tang (2003), the choices of Formula, Formula, Formula, and Formula have a substantial impact on the performance of the test statistics in (2.4), (2.5), and (2.6). Here, we consider 3 methods to estimate Formula, and Formula. They are the sample-based method (M1), the constrained least squares method (M2), and the CML method (M3) (see the supplementary material available at Biostatistics online).

Note that the statistic $T_{1lg}$ based on the CML method is identical to the CML Fieller-type statistic and the score-type statistic (see Nam and Blackwelder, 2002; Tang and others, 2002). Note also that $T_{1lg}$ and $T_{2lg}$ could be undefined when $\hat{p}_{+1g} = x_{+1g}/n_g = 0$. To overcome this, we add 0.5 to $x_{+1g}$.
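To make the structure of a logarithmic transformation test concrete, here is a minimal sketch of one sample-based version. It is not the article's formula (2.5), whose exact form is given in the supplementary material. It assumes the standard delta-method variance of the log ratio of paired proportions, $(x_{10} + x_{01})/(x_{1+} x_{+1})$, evaluated at the sample estimates. The function name and the counts in the usage note are hypothetical. The adjustment of adding 0.5 when $x_{+1g} = 0$ follows the text above.

```python
import math


def log_ratio_statistics(x11, x10, x01, delta0, delta1):
    """One-sided statistics on the log scale for the margins delta0 < delta1.

    Assumption: the variance of log(p_hat_{1+} / p_hat_{+1}) is estimated by
    the delta-method expression (x10 + x01) / (x_{1+} * x_{+1}).
    """
    x1_plus = x11 + x10
    x_plus1 = x11 + x01
    if x_plus1 == 0:
        x_plus1 = 0.5  # adjustment described in the text for x_{+1g} = 0
    if x1_plus == 0 or (x10 + x01) == 0:
        raise ValueError("statistic undefined for these counts")
    log_ratio = math.log(x1_plus / x_plus1)
    se = math.sqrt((x10 + x01) / (x1_plus * x_plus1))
    t_lower = (log_ratio - math.log(delta0)) / se  # large values favor ratio > delta0
    t_upper = (log_ratio - math.log(delta1)) / se  # small values favor ratio < delta1
    return t_lower, t_upper
```

With hypothetical counts $x_{11} = 40$, $x_{10} = 6$, $x_{01} = 4$ and margins 0.9 and 1/0.9, this gives a lower statistic of about 2.13 and an upper statistic of about -0.87, so only the lower-margin test would reject at the 0.05 level.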

In some applications, one may want to assess simultaneously the equivalence of the sensitivity and the specificity of a new test and a reference test. In this case, we can establish equivalence between the new and the reference tests only when we are able to reject both hypotheses $H_{0d}: p_{1+d}/p_{+1d} \le \delta_{0d}$ or $p_{1+d}/p_{+1d} \ge \delta_{1d}$ and $H_{0\bar{d}}: p_{1+\bar{d}}/p_{+1\bar{d}} \le \delta_{0\bar{d}}$ or $p_{1+\bar{d}}/p_{+1\bar{d}} \ge \delta_{1\bar{d}}$. Hence, the intersection-union test discussed by Berger and Hsu (1996) can be used. That is, the equivalence of a new test to a reference test can be established at level $\alpha$ if $T_{kld} \ge z_\alpha$, $T_{kud} \le -z_\alpha$, $T_{kl\bar{d}} \ge z_\alpha$, and $T_{ku\bar{d}} \le -z_\alpha$, for $k = 1,2,3$.
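The intersection-union decision can be written as a single conjunction over the two populations. The sketch below assumes that the lower-margin statistic rejects when it is at least $z_\alpha$ and the upper-margin statistic rejects when it is at most $-z_\alpha$. The function name `iut_equivalence` and the dictionary keys are hypothetical.

```python
from statistics import NormalDist


def iut_equivalence(stats_by_population, alpha=0.05):
    """Declare equivalence only if every one-sided test rejects in every population.

    stats_by_population maps a population label to a pair (t_lower, t_upper).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # upper 100*alpha percentile
    return all(
        t_lower >= z_alpha and t_upper <= -z_alpha
        for t_lower, t_upper in stats_by_population.values()
    )
```

Because every component test is run at level $\alpha$ and all must reject, no multiplicity adjustment is applied in this sketch, in line with the intersection-union principle cited above.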

2.3 Tests based on the confidence interval approach

Often, the estimation of treatment difference is of more interest than the testing of specific hypotheses. It is well known that an equivalence hypothesis can be tested via the confidence interval approach (Tang and others, 2002; Liu and others, 2002). Briefly, the equivalence between the test and reference procedures can be established at the $\alpha$ level of significance if and only if the corresponding $100(1 - \alpha)$ percent confidence interval lies entirely in the interval $(\delta_{0g}, \delta_{1g})$. It then follows from (2.4)–(2.6) that the $100(1 - \alpha)$ percent asymptotic test-based confidence intervals are given by Formula, where

Formula

with k' = 1,2,3.
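The interval-inclusion rule can be illustrated with a simple sample-based interval on the log scale. This is only a sketch and is not the article's test-based interval, whose limits are defined through (2.4)–(2.6). It assumes the delta-method standard error $\sqrt{(x_{10} + x_{01})/(x_{1+} x_{+1})}$ and uses the same critical value $z_\alpha$ as the tests. Function names and counts are hypothetical.

```python
import math
from statistics import NormalDist


def log_ratio_interval(x11, x10, x01, alpha=0.05):
    """Interval exp(log(ratio) -/+ z_alpha * se) for p_{1+} / p_{+1}.

    Assumption: se is the delta-method standard error of the log ratio,
    evaluated at the sample-based estimates.
    """
    x1_plus = x11 + x10
    x_plus1 = x11 + x01
    if x1_plus == 0 or x_plus1 == 0:
        raise ValueError("interval undefined when a marginal count is zero")
    log_ratio = math.log(x1_plus / x_plus1)
    se = math.sqrt((x10 + x01) / (x1_plus * x_plus1))
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return math.exp(log_ratio - z_alpha * se), math.exp(log_ratio + z_alpha * se)


def is_equivalent(interval, delta0, delta1):
    """Equivalence holds when the interval lies entirely inside (delta0, delta1)."""
    lower, upper = interval
    return delta0 < lower and upper < delta1
```

For hypothetical counts $x_{11} = 40$, $x_{10} = 6$, $x_{01} = 4$, the interval is roughly (0.93, 1.17). It is not contained in (0.9, 1/0.9), so equivalence would not be declared with those margins.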


    3. SAMPLE SIZE FORMULAE
Determining the appropriate sample size is an essential step in the design of any equivalence trial. In general, sample size planning can be approached from 2 different perspectives, namely the significance testing and confidence interval approaches. Depending on the goals of the study, the sample size may be planned using the significance testing approach, the confidence interval approach, or a combination of the two. Derivations of the approximate sample size formulae for the 3 proposed statistics (T1, T2, and T3) and the 2 approaches (significance testing and confidence interval approaches) based on the CML method (M3) are presented in the supplementary material available at Biostatistics online.

The supplementary material available at Biostatistics online also compares the performances of the 3 proposed statistics (T1, T2, and T3) using the 3 different estimation methods (M1, M2, and M3). The results can be summarized as follows:

  1. In general, T2 and T3, using M3, give consistently good performance.
  2. In particular, T3 demonstrates robust behavior for almost all settings, while T2 gives conservative performance when sample size is small and moderate. We therefore recommend T3 using M3.
  3. All sample size formulae for controlling a prespecified power are asymptotically valid in the sense that the exact power for the estimated sample size is close to the prespecified power level.
  4. Similarly, all sample size formulae for controlling the confidence interval width are asymptotically valid in the sense that both the prespecified coverage and half-width are well controlled.
  5. Required sample sizes are generally smaller for T3 than for T1 or T2.


    4. ANALYSIS OF THE LABORATORY STUDY
In this section, we illustrate our proposed methodology using as an example the clinical laboratory study described in Section 1. Briefly, 30 positive control sera (serum samples from penicillin-allergic subjects with a positive clinical history and a positive penicillin skin test) and 30 negative control sera (sera from subjects with no history of penicillin allergy and a negative skin test) were tested for BPO determinant–specific IgE antibodies by RAST using different conjugates coupled to the solid phase. The standard procedure is benzylpenicillin conjugated to HSA and the new procedure is benzylpenicillin conjugated to SP. The results are summarized in Table 2.


Table 2. Observed frequencies of BPO-HSA and BPO-SP for the positive and negative control groups

 
Suppose that we want to show that BPO-SP is equivalent to BPO-HSA on the basis of the specificity and/or sensitivity. The null hypothesis of interest is

$$H_{0g}: \frac{p_{1+g}}{p_{+1g}} \le \delta_{0g} \ \text{ or } \ \frac{p_{1+g}}{p_{+1g}} \ge \delta_{1g},$$

where $\delta_{0g} = 0.9$ and $\delta_{1g} = 1/\delta_{0g}$ for $g = d$ or $\bar{d}$. We applied to this data set the 9 equivalence tests that have been presented in this paper (3 statistics under each of 3 estimation methods). If we take the significance level of each of the two one-sided tests to be 0.05, all tests except the Wald test based on the CML estimation method yield p values greater than the prespecified 0.05 nominal level, so equivalence of BPO-SP to BPO-HSA cannot be established on the basis of sensitivity and/or specificity.

Suppose that we focus now on the CML method. We want to plan a trial with FormulaFormula, and $p_{10g} = p_{1+g} - p_{11g} = \delta_g p_{+1g} - p_{11g}$ for $g = d$ or $\bar{d}$. The sample sizes that are required for obtaining 80% power with a 0.05 significance level are 122, 121, and 117 for the Wald-type statistic, the logarithmic transformation statistic, and the Fieller-type statistic, respectively, on the basis of the sensitivity, while the corresponding sample sizes are 56, 55, and 52, respectively, on the basis of the specificity. Clearly, the sample sizes that are obtained from different statistics do not differ substantially in this example.

Suppose now that another investigator wants to rerun the experiment using similar settings, but with the aim of estimating the rate ratios of the sensitivities and specificities (i.e. $\delta_d$ and $\delta_{\bar{d}}$) and constructing the corresponding 90% confidence intervals with both half-widths being controlled at prescribed values. According to the data, the estimates of $p_{1+d}$, $p_{+1d}$, $p_{1+\bar{d}}$, and $p_{+1\bar{d}}$ are given by Formula, Formula, Formula, and Formula, respectively. The estimates of $\delta_d$ and $\delta_{\bar{d}}$ are given by Formula and Formula, respectively. Based on these, we set $\delta_d = 1.4$, Formula, $p_{+1d} = 0.6$, Formula, $p_{10d} = 0.3$, and Formula. The corresponding sample sizes for the 90% confidence intervals with half-width controlled at each of $w = 0.05$ and $0.1$ are reported in Table 3. In all the cases, we observe that the sample sizes based on the logarithmic transformation statistic are slightly smaller than those based on the Wald-type (or Fieller-type) statistic.
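The design values $\delta_d = 1.4$, $p_{+1d} = 0.6$, and $p_{10d} = 0.3$ determine the full set of cell probabilities through the marginal identities $p_{1+g} = \delta_g p_{+1g}$, $p_{11g} = p_{1+g} - p_{10g}$, and $p_{01g} = p_{+1g} - p_{11g}$. The short sketch below carries out this arithmetic and checks that the result is a valid probability vector. The function name `cell_probabilities` is hypothetical.

```python
def cell_probabilities(delta, p_plus1, p10):
    """Recover (p11, p10, p01, p00) from the rate ratio and two design values.

    Uses p_{1+} = delta * p_{+1}, p11 = p_{1+} - p10, and p01 = p_{+1} - p11.
    """
    p1_plus = delta * p_plus1
    p11 = p1_plus - p10
    p01 = p_plus1 - p11
    p00 = 1.0 - p11 - p10 - p01
    cells = {"p11": p11, "p10": p10, "p01": p01, "p00": p00}
    # Small tolerance guards against floating-point round-off at the boundary.
    if any(value < -1e-12 for value in cells.values()):
        raise ValueError("design values imply a negative cell probability")
    return cells
```

With the sensitivity settings above, this gives $p_{1+d} = 0.84$, $p_{11d} = 0.54$, $p_{01d} = 0.06$, and $p_{00d} = 0.10$, which sum to 1 together with $p_{10d} = 0.3$.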


    5. CONCLUSION
Our findings in this article are consistent with those of comparative binomial trials (see Farrington and Manning, 1990) and noninferiority trials (see Tang, 2003). That is, the sample-based and constrained least squares methods can produce incorrect sample sizes and inflated type I error rates, whereas the CML method usually produces accurate sample sizes with fairly well-controlled type I error rates. In general, we find that both logarithmic transformation and Fieller-type statistics are desirable choices in equivalence trial sample size calculations.

We consider sensitivity and specificity separately, although there are summary indices that combine both sensitivity and specificity. Two common choices for this purpose are Youden's index and the likelihood ratio of a positive (or negative) test (see Biggerstaff, 2000). Extension of the present work to these indices is under consideration.


Table 3. Sample sizes for controlling a 90% confidence interval at half-width w for BPO data

 

    ACKNOWLEDGMENTS
 
The first author's work was sponsored by the National Natural Science Foundation of China (Project no. 10561008) and Natural Science Fund of Yunnan Province (Project no. 2004A0002M). The second author's work was fully supported by a grant from the Research Grant Council of the Hong Kong Special Administration (Project no. CUHK4371/04M). The authors are grateful to the editor and referees for their valuable suggestions that greatly enhanced the manuscript and to Professor N. Balakrishnan for reading the article for us. The second author would like to thank Ms Chow Hoi-Sze Daisy for her kind encouragement during the preparation of the manuscript. Conflict of Interest: None declared.


    REFERENCES

    Berger JR, Hsu J. Bioequivalence trials, intersection union tests and equivalence confidence sets (with discussion). Statistical Science (1996) 11:283–319.

    Biggerstaff BJ. Comparing diagnostic tests: a simple graphic using likelihood ratios. Statistics in Medicine (2000) 19:649–663.

    Dunnett CW, Gent M. Significance testing to establish equivalence between treatments, with special reference to data in the form of 2x2 tables. Biometrics (1977) 33:593–602.

    Farrington CP, Manning G. Test statistics and sample size formulae for comparative binomial trials with null hypothesis of non-zero risk difference or non-unity relative risk. Statistics in Medicine (1990) 9:1447–1454.

    Garcia JJ, Blanca M, Moreno F, Vega JM, Mayorga C, Fernandez J, Juarez C, Romano A, De Ramon E. Determination of IgE antibodies to the benzylpenicilloyl determinant: a comparison of the sensitivity and specificity of three radio allergo sorbent test methods. Journal of Clinical Laboratory Analysis (1997) 11:251–257.

    Hauck WW, Anderson S. Some issues in the design and analysis of equivalence trials. Drug Information Journal (1999) 33:109–118.

    Liu JP, Hsueh HM, Hsieh E, Chen JJ. Tests for equivalence or non-inferiority for paired binary data. Statistics in Medicine (2002) 21:231–245.

    Lui KJ, Cumberland WG. Sample size determination for equivalence test using rate ratio of sensitivity and specificity in paired sample data. Controlled Clinical Trials (2001) 22:373–389.

    Nam J, Blackwelder WC. Analysis of the ratio of marginal probabilities in a matched-pair setting. Statistics in Medicine (2002) 21:689–699.

    Schuirmann DJ. A comparison of the two one-sided procedure and the power approach for assessing the equivalence of average bioavailability. Journal of Pharmacokinetics and Biopharmaceutics (1987) 15:657–680.

    Tang ML. Matched-pair non-inferiority trials using rate ratio: a comparison of current methods and sample size refinement. Controlled Clinical Trials (2003) 24:364–377.

    Tang ML, Tang NS, Chan ISF, Chan BPS. Sample size determination for establishing equivalence/noninferiority via ratio of two proportions in matched-pair design. Biometrics (2002) 58:957–963.

    Received June 21, 2005; revised December 8, 2005; revised March 25, 2006; revised April 28, 2006; revised July 10, 2006; revised September 17, 2006; accepted for publication October 20, 2006.

