Skip Navigation



Biostatistics Advance Access published online on November 13, 2007

Biostatistics, doi:10.1093/biostatistics/kxm039
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow Supplementary Material
Right arrow All Versions of this Article:
9/3/391    most recent
kxm039v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Disclaimer
Google Scholar
Right arrow Articles by Zheng, G.
Right arrow Articles by Ng, H. K. T.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Zheng, G.
Right arrow Articles by Ng, H. K. T.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2007. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Genetic model selection in two-phase analysis for case–control association studies

Gang Zheng

Office of Biostatistics Research, National Heart, Lung and Blood Institute, 6701 Rockledge Drive, Bethesda, MD 20892-7931, USA

Hon Keung Tony Ng*

Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, PO Box 750332, Dallas, TX 75275-0332, USA ngh{at}mail.smu.edu

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 
The Cochran–Armitage trend test (CATT) is well suited for testing association between a marker and a disease in case–control studies. When the underlying genetic model for the disease is known, the CATT optimal for the genetic model is used. For complex diseases, however, the genetic models of the true disease loci are unknown. In this situation, robust tests are preferable. We propose a two-phase analysis with model selection for the case–control design. In the first phase, we use the difference of Hardy–Weinberg disequilibrium coefficients between the cases and the controls for model selection. Then, an optimal CATT corresponding to the selected model is used for testing association. The correlation of the statistics used for selection and the test for association is derived to adjust the two-phase analysis with control of the Type-I error rate. The simulation studies show that this new approach has greater efficiency robustness than the existing methods.

Keywords: Cochran–Armitage trend test; Disease risk; Efficiency robustness; Hardy–Weinberg disequilibrium; SNP


    1. INTRODUCTION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 
The case–control design is often employed for testing association between a marker and a disease. A single-nucleotide polymorphism (SNP) is a marker with 2 alleles, denoted as A and B, where A is the at-risk one. Its genotypes are denoted as AA, AB, or BB. Testing genetic association is equivalent to finding any general association between the outcome and the observed genotypes in a 2 x 3 contingency table, for which Pearson's chi-squared test can be used. It has an asymptotic chi-squared distribution with 2 degrees of freedom (df) under the null hypothesis. However, as A is the risk allele, an individual with genotype AA is more likely to have disease than an AB individual, who in turn is more likely to have disease than a BB individual. The Cochran–Armitage trend test (CATT) (Cochran, 1954Go; Armitage, 1955), which utilizes this risk model, is usually more powerful than Pearson's chi-squared test with 2 df (Zheng and others, 2006Go). CATT has been suggested for genetic association in the case–control design (Sasieni, 1997Go). It is known that the statistical properties of CATT depend on the choice of scores and that the best (or optimal) set of scores which maximizes the power of the test depends on the genetic model (Sasieni, 1997Go; Slager and Schaid, 2001Go; Zheng and others, 2003Go). For complex diseases, however, the genetic models of the disease loci are unknown. Therefore, analysis using the CATT based on one score is not robust (Graubard and Korn, 1987Go; Freidlin and others, 2002Go). In this situation, one usually uses the CATT optimal for the additive model (Sasieni, 1997Go). On the other hand, when the true model is unknown, robust tests are preferable. The maximin efficiency robust test (MERT) (Gastwirth, 1966Go, 1985Go) and the maximum of the 3 CATTs (MAX) were studied in Freidlin and others (2002)Go. Here, we follow the criterion of efficiency robustness in the previous articles and say that one test has greater efficiency robustness across a set of plausible models than another test when the minimum power of the first test is higher than that of the second test. Wang and Sheffield (2005)Go introduced a restricted likelihood ratio test for the case–control data and showed that it had comparable power with MAX. Empirical results also demonstrate that MAX has greater efficiency robustness than MERT (Freidlin and others, 2002Go). For comparison with our proposed method, we only consider CATT, MAX, and MERT.

Wittke-Thompson and others (2005)Go studied the directions (signs) of the Hardy–Weinberg disequilibrium (HWD) coefficients when Hardy–Weinberg equilibrium (HWE) holds in the population and used these to confirm the underlying genetic model. We further show that HWD coefficients can be used to divide the parameter space into 4 unique regions, from which genetic models can be selected. The selection of genetic models based on the above theory is, however, robust to departure from HWE in the population. Next, we propose a two-phase analysis for genetic association with model selection. In the first phase, we apply the difference of HWD coefficients between the cases and the controls to classify the underlying genetic model into 3 categories: the recessive region, additive/multiplicative region, or dominant region. In the second phase, we apply the appropriate CATT, optimal for the selected model, to test genetic association. Such two-phase selection-testing analysis has been studied by Hogg (1974)Go and extensively studied in clinical trials (e.g. Thall and others, 1988Go), but not in genetic analysis. We further derive and adjust the asymptotic correlation between the 2 analyses so that the two-phase analysis with model selection has an approximately correct size. Extensive simulation studies are also conducted to compare this new approach with the existing approaches. Finally, for illustration, we apply the two-phase analysis with model selection to real data.


    2. NOTATION AND MODEL
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 
Denote the allele frequencies in cases and controls as p = Pr(A|case) and q = Pr(A|control) and the genotypes as g0 = BB, g1 = AB, and g2 = AA. Assume that r cases and s controls are independently sampled. The observed counts for genotypes (g0, g1, g2) are (r0, r1, r2) in cases and (s0, s1, s2) in controls. The observed genotype counts (r0, r1, r2) and (s0, s1, s2) follow multinomial distributions (r; p0, p1, p2) and mul(s; q0, q1, q2), respectively, where pi = Pr(gi|case) and qi = Pr(gi|control) for i = 0, 1, 2. The null hypothesis of no association can be stated as H0 : pi = qi for i = 0, 1, 2. Denote the disease prevalence as k = Pr(case). Then, pi = Pr (gi)fi / k and qi = Pr(gi)(1 – fi)/(1 – k), where fi = Pr(case|gi) > 0 is the penetrance and Formula. The null hypothesis is equivalent to H0: f0 = f1 = f2 = k. A genetic model is referred to as a relationship between 2 genotype relative risks, denoted as {lambda}1 = f1 /f0 and {lambda}2 = fs/f0. The null and alternative hypotheses can be equivalently stated as H0: {lambda}1 = {lambda}2 = 1 against H1: {lambda}2 ≥ {lambda}1 ≥ 1 with at least one inequality. Then, under H0, p = q. The recessive (REC), additive (ADD), multiplicative (MUL), and dominant (DOM) models refer to {lambda}1 = 1, {lambda}1 = ({lambda}2 + 1)/2, {lambda}2 = {lambda}Formula, and {lambda}1 = {lambda}2, respectively.

CATT models association between the outcome as a linear trend in genotypes. In a logistic regression model for case–control data with a single covariate (genotype) coded with scores (0, x, 1) for (g0, g1, g2), where 0 ≤ x ≤ 1, the score statistic is equivalent to the CATT statistic (Sasieni, 1997Go)

Formula (2.1)

where ni = ri + si for i = 0, 1, 2, n = r + M, and (x0, x1, x2) = (0, x, 1). Under H0, for a given x isin [0, 1], Zx asymptotically follows a standard normal distribution N(1, 0). Sasieni (1997)Go, Slager and Schaid (2001)Go, and Zheng and others (2003)Go showed that the optimal choices of x for the REC, ADD (MUL), and DOM models are x = 0, 1/2, and 1, respectively.

In genetic association studies, departure from HWE in cases has also been used to test genetic association in the case–control design (Nielsen and others, 1998Go). However, using departure from HWE in cases as a test statistic has lower power for the additive model and no power at all for the multiplicative model (Nielsen and others, 1998Go; Deng and others, 2000Go; Song and Elston, 2003Go, 2006Go). In general, Pr(g0) = {Pr(B)}2 + FPr(A)Pr(B), Pr(g1) = 2Pr(A)Pr(B)(1 – F), and Pr(g2) = {Pr(A)}2 + FPr(A)Pr(B), where F is the Wright coefficient of inbreeding and HWE holds in the population if and only if F = 0.


    3. TWO-PHASE ANALYSIS WITH GENETIC MODEL SELECTION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 

3.1. HWD coefficients and genetic models

HWE is usually tested using the HWD coefficients (Weir, 1996Go), denoted as Formula. When HWE holds in the population, Formula, under which Wittke-Thompson and others (2005)Go studied the directions of Formula in cases (Formula) and controls (Formula) and their relations with the underlying genetic model. They considered an additive model as Formula regardless of the reference penetrance Formula, but here we use the definition Formula.

Additional relations among Formula, Formula, and Formula are as follows: (i) Formula does not imply that Formula or Formula and vice versa; (ii) under the null hypothesis of no association, it follows from Formula that Formula. This implies that, when HWE holds in the population Formula, genetic association can be tested under the null hypothesis Formula. On the other hand, under the alternative hypothesis of association, whether or not HWE holds in the population cannot be tested using the retrospective case–control data. Song and Elston (2006)Go considered a Hardy–Weinberg disequilibrium trend test (HWDTT) based on Formula. Here, following Wittke-Thompson and others (2005)Go, we assume that HWE holds in the population and use the HWDTT for genetic model selection. The sensitivity of departure from HWE is examined empirically in Section B of the supplementary material available at Biostatistics online (http://www.biostatistics.oxfordjournals.org). Substituting Formula and Formula into Formula and Formula with Formula, one obtains the following: Formula and Formula. Wittke-Thompson and others (2005)Go proved that Formula and Formula under the REC model (Formula), Formula and Formula under the DOM model (Formula), and Formula and Formula under the MUL model (Formula). Under the ADD model (Formula), it can be shown similarly that both Formula and Formula are negative. The 4 genetic models with the directions of Formula are plotted in Figure 1. The open triangle, formed by REC and DOM in Figure 1, indicates the null and alternative hypotheses. The vertex Formula corresponds to the null hypothesis. The signs of Formula for the 4 models are indicated with the labels of the genetic models. The curve CD corresponds to Formula, the condition used to determine the sign of Formula. It follows from Figure 1 that the 4 genetic models plus the curve CD divide the alternative hypothesis into 4 mutually exclusive open regions: Formula, Formula, Formula, and Formula (boundaries are excluded). The signs of Formula in the 4 open regions can be uniquely determined when HWE holds in the population (see Section A of the supplementary material available at Biostatistics online). Based on the above analysis, the difference of HWD coefficients between cases and controls can be used to classify the REC and DOM models. For example, the REC and DOM models imply that Formula and Formula, respectively.


Figure 1
View larger version (8K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Signs of HWD coefficients and genetic models.

 
3.2. Two-phase analysis with model selection

When the genetic model is unknown, Formula has the greatest efficiency robustness among the family of statistics Formula against model misspecification and is often used in applications (Sasieni, 1997Go; Slager and Schaid, 2001Go; Freidlin and others, 2002Go; Zheng and others, 2005Go). Other robust tests considered by Freidlin and others (2002)Go are Formula and the MERT given by Formula , where Formula , is a consistent estimate of the asymptotic null correlation between Formula and Formula. The MERT and MAX have been shown to have high-efficiency robustness in hypothesis testing in survival and ordered categorical data, and genetic data analyses when the model underlying the data is uncertain, but a family of scientifically plausible models and their normally distributed tests are available (Gastwirth, 1966Go, 1985Go; Davies, 1977Go; Zucker and Lakatos, 1990Go; Whittemore and Tu, 1998Go; Freidlin and others, 1999Go; Gastwirth and Freidlin, 2000Go; Shih and Whittemore, 2001Go).

We propose a two-phase analysis where we classify the underlying genetic model in the first phase followed by testing association using the optimal CATT for the selected model in the second phase. Song and Elston (2006)Go considered testing association using the test statistic

Formula

where Formula and Formula are the estimates of Formula and Formula, respectively, with Formula and Formula .

From Figure 1 and the signs of Formula for the 4 genetic models and regions, it seems that the ADD and MUL models cannot be distinguished by the signs as Formula have the same sign Formula under the 2 models. However, Formula is asymptotically optimal for both ADD and MUL models under local alternatives. Thus, it is not necessary to distinguish them. On the other hand, the ADD (MUL) model may be distinguished from the REC and DOM models by the signs of Formula. Thus, in the first phase, we classify the genetic model in the recessive region (Formula) if Formula, the dominant region (Formula) if Formula, and in the middle region (Formula and Formula) containing the ADD or MUL models otherwise, where Formula is a prespecified real number. In the second phase, we choose Formula as a test statistic for association, given by Formula if Formula, Formula if Formula, and Formula if otherwise, where we choose Formula. The performance of the above model selection is examined and the results are reported in Section B of the supplementary material available at Biostatistics online. When Formula, the probabilities to select correct REC/DOM and ADD/MUL models are greater than 65% and 85% and are robust to departure from HWE.

3.3. Adjustment of the sizes of the two-phase analysis with model selection

To correct the size of Formula analytically, the correlation between the two phases of the analysis needs to be derived. In Section C of the supplementary material available at Biostatistics online, we obtain Formula, Formula, and Formula, where Formula under Formula. Note that Formula and Formula are asymptotically independent.

Suppose in the second phase the null hypothesis corresponding to the jth SNP (Formula) is rejected at the level Formula using the normal distribution. Then, to control the overall level at the given Formula in the two-phase analysis with model selection for M SNPs, we have FormulaFormula and Formula, where Formula is the upper Formulath percentile of Formula. Using the bivariate normal distribution of (Zx, ZHWDTT) with correlation {rho}x, {alpha}* satisfies

Formula (3.1)

where Formula, Formula, and Formula. Given Formula and M and under HWE in the population, this shows that {alpha}* only asymptotically depends on p (=q, under Formula), which can be estimated consistently using the observed data (Weir, 1996Go) by Formula . The supplementary material available at Biostatistics online (Section E) includes a SAS IML macro to calculate {alpha}*. Table 1 (the last column) reports the simulated levels of significance using the adjusted significance level ({alpha}*) for Formula. The adjusted level {alpha}* was calculated for each simulated data set. The sizes of the analytically adjusted level ({alpha}*) based on (3.1) are not significantly different from the nominal level. For rare allele frequency (Formula), the level {alpha}* leads to a conservative test. The last part of Table 1 also shows that departure from HWE does not inflate the Type-I error rates of the 4 genotype-based test statistics (see comments of Sasieni, 1997Go).


View this table:
[in this window]
[in a new window]

 
Table 1 Type-I error rates of the robust tests based on 10 000 replications with the nominal level 0.05 using r cases and s controls and Wright's coefficient of inbreeding F. The {alpha} is unadjusted and {alpha}* is analytically adjusted levels

 

    4. NUMERICAL RESULTS
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 
In Section D of the supplementary material available at Biostatistics online, we reported simulation studies to compare power for various approaches. Our results show that the two-phase analysis with model selection (Formula) has greater efficiency robustness than MAX, although their power difference is about 2–3%. The Formula has at least 95% efficiency under ADD/MUL and 100–225% (100–123%) under REC (DOM) relative to the CATT.

To illustrate the use of the two-phase model selection, we consider the top 2 SNPs in a genome-wide case–control study for age-related macular degeneration (AMD) presented in Klein and others (2005)Go. After the quality control of genotyping, there were 103 611 SNPs left for testing association. Each SNP was tested at the level Formula. The top 2 SNPs (rs380390 and rs1329428) with smallest p-values were found in a close region and were further studied by biology and haplotype analysis. The results show that this region is associated with AMD. For both SNPs, the underlying genetic models are uncertain.

The data set for the top SNP (rs380390) was given by Formula for cases and Formula for controls. The CATT with the additive score is Formula with unadjusted p-value Formula. The MERT is Formula with unadjusted p-value Formula. Further, we have Formula and its simulated p-value is Formula based on Formula replicates, which indicates that the association is marginally significant. For the model selection, using (3.1), we obtained the adjusted significance level Formula and Formula. This suggests that we use Formula with p-value Formula. After Bonferroni correction, the adjusted p-value is Formula. Thus, the SNP is significantly associated with the AMD after Bonferroni correction. Note that the two-phase analysis selected the largest CATT in the first phase.

The second top SNP (rs1329428) was given by Formula for cases and Formula for controls. Using (3.1), the adjusted significance level is Formula. The CATT Formula with unadjusted p-value Formula. On the other hand, Formula with unadjusted p-value Formula. The MAX is equal to Formula and its simulated p-value is Formula. Since Formula, we select the CATT with the recessive score in the second phase and Formula with unadjusted p-value Formula, which is not significant after Bonferroni correction. In fact, none of the 4 nonadaptive robust tests reached statistical significance; however, the two-phase test selected the appropriate CATT.


    5. DISCUSSION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 
In testing genetic association between a SNP and a disease using a case–control design, we studied a new two-phase analysis, where an underlying genetic model is selected in the first phase based on the difference of HWD coefficients between cases and controls. Then, in the second phase, we use the CATT for association with the score optimal for the selected model. Since two-phase analyses are correlated, we derived their asymptotic null correlations and used them to adjust the level in the two-phase analysis to control the Type-I error rate.

Our simulation results demonstrate that this new approach compares favorably with other existing approaches, including the CATT with the additive score (Sasieni, 1997Go) and the robust tests (MERT and MAX) of Freidlin and others (2002)Go. When the underlying genetic model is uncertain, it has the greatest efficiency robustness among all the tests considered here. Thus, it is useful for genetic data analysis using case–control data. Since it has been shown that MAX and likelihood ratio tests have comparable power, we did not compare our approach to the likelihood ratio test.

One assumption in this article is that HWE holds in the population. Although model selection can be uniquely determined when HWE holds in the population, our empirical results showed that a moderate departure from HWE had little effect on the power performance of the model selection under various genetic models. Nor did departure of HWE inflate Type-I error rates. It is known that case–control designs for genetic association may be affected by population stratifications and/or cryptic relatedness (Devlin and Roeder, 1999Go; Pritchard and Rosenberg, 1999Go; Satten and others, 2001Go; Zheng and others, 2005Go; Gorroochurn and others, 2006Go). This needs to be examined in the future. In addition, we chose a fixed threshold value as the upper 95th normal percentile in the model selection phase. Data-driven threshold levels may further improve the power. This also needs future research.


    ACKNOWLEDGMENTS
 
The authors gratefully acknowledge the editor, the associate editor, and the 2 anonymous referees for their constructive comments. The authors would also like to thank Prof. William R. Schucany and Ms Julia Kozlitina for their suggestions which resulted in a much improved version of the manuscript.

Conflict of Interest: None declared.


    REFERENCES
 TOP
 SUMMARY
 1. INTRODUCTION
 2. NOTATION AND MODEL
 3. TWO-PHASE ANALYSIS WITH...
 4. NUMERICAL RESULTS
 5. DISCUSSION
 REFERENCES
 

    Armitage P. Tests for linear trends in proportions and frequencies. Biometrics (1955) 11:375–386.[Medline]

    Cochran WG. Some methods for strengthening the common chi-square tests. Biometrics (1954) 10:417–451.[Medline]

    Davies RB. Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika (1977) 64:247–254.[Abstract/Free Full Text]

    Deng HW, Chen WM, Recker RR. QTL fine mapping by measuring and testing for Hardy-Weinberg and linkage disequilibrium at a series of linked marker loci in extreme samples of population. American Journal of Human Genetics (2000) 66:1027–1045.[CrossRef][Web of Science][Medline]

    Devlin B, Roeder K. Genomic control for association studies. Biometrics (1999) 55:997–1004.[CrossRef][Web of Science][Medline]

    Freidlin B, Podgor MJ, Gastwirth JL. Efficiency robust tests for survival or ordered categorical data. Biometrics (1999) 55:883–886.[CrossRef][Web of Science][Medline]

    Freidlin B, Zheng G, Li Z, Gastwirth JL. Trend tests for case-control studies of genetic markers: power, sample size and robustness. Human Heredity (2002) 53:146–152.[CrossRef][Web of Science][Medline]

    Gastwirth JL. On robust procedures. Journal of the American Statistical Association (1966) 61:929–948.[CrossRef][Web of Science]

    Gastwirth JL. The use of maximin efficiency robust tests in combining contingency tables and survival analysis. Journal of the American Statistical Association (1985) 80:380–384.[CrossRef][Web of Science]

    Gastwirth JL, Freidlin B. On power and efficiency robust linkage tests for affected sibs. Annals of Human Genetics (2000) 64:443–453.[CrossRef][Web of Science][Medline]

    Gorroochurn P, Heiman GA, Hodge SE, Greenberg DA. Centralizing the non- central chi-square: a new method to correct for population stratification in genetic case-control association studies. Genetic Epidemiology (2006) 30:277–289.[CrossRef][Web of Science][Medline]

    Graubard BI, Korn EL. Choice of column scores for testing independence in ordered 2 x K contingency tables. Biometrics (1987) 43:471–476.[CrossRef][Web of Science][Medline]

    Hogg RV. Adaptive robust procedures—partial review and some suggestions for future applications and theory. Journal of the American Statistical Association (1974) 69:909–923.[CrossRef][Web of Science]

    Klein RJ, Zeiss C, Chew EY, Tsai JY, Sackler RS, Haynes C, Henning AK, Sangiovanni JP, Mane SM, Mayne ST, et al. Complement factor H polymorphism in aged-related macular degeneration. Science (2005) 308:385–389.[Abstract/Free Full Text]

    Nielsen DM, Ehm MG, Weir BS. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. American Journal of Human Genetics (1998) 63:1531–1540.[CrossRef][Web of Science][Medline]

    Pritchard MJ, Rosenberg NA. Use of unlinked genetic markers to detect population stratification in association studies. American Journal of Human Genetics (1999) 65:220–228.[CrossRef][Web of Science][Medline]

    Sasieni PD. From genotypes to genes: doubling the sample size. Biometrics (1997) 53:1253–1261.[CrossRef][Web of Science][Medline]

    Satten GA, Flanders WD, Yang Q. Account for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model. American Journal of Human Genetics (2001) 68:466–477.[CrossRef][Web of Science][Medline]

    Shih M, Whittemore AS. Allele-sharing among affected relatives: nonparametric methods for identifying genes. Statistical Methods in Medical Research (2001) 10:27–55.[Abstract/Free Full Text]

    Slager SL, Schaid DJ. Case-control studies of genetic markers: power and sample size approximations for Armitage's test for trend. Human Heredity (2001) 52:149–153.[CrossRef][Web of Science][Medline]

    Song K, Elston RC. The Hardy-Weinberg disequilibrium (HWD) measure and test statistics for a disease-susceptibility locus with multiple alleles allowing for an inbreeding coefficient (F). Genetica (2003) 119:269–293.[CrossRef][Web of Science][Medline]

    Song K, Elston RC. A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case-control studies. Statistics in Medicine (2006) 25:105–126.[CrossRef][Web of Science][Medline]

    Thall PF, Simon R, Ellenberg SS. Two-stage selection and testing designs for comparative clinical trials. Biometrika (1988) 75:303–310.[Abstract/Free Full Text]

    Wang K, Sheffield VC. A constrained-likelihood approach to marker-trait association studies. American Journal of Human Genetics (2005) 77:768–780.[CrossRef][Web of Science][Medline]

    Weir BS. Genetic Data Analysis II: Methods for Discrete Population Genetic Data (1996) Sunderland, MA: Sinauer Associations Inc.

    Whittemore AS, Tu I-P. Simple, robust linkage tests for affected sibs. American Journal of Human Genetics (1998) 62:1228–1242.[CrossRef][Web of Science][Medline]

    Wittke-Thompson JK, Pluzhnikov A, Cox NJ. Rational inferences about departure from Hardy-Weinberg equilibrium. American Journal of Human Genetics (2005) 76:967–986.[CrossRef][Web of Science][Medline]

    Zheng G, Freidlin B, Gastwirth JL. Comparison of robust tests for genetic association using case-control studies (2nd special issue in honor of E.L. Lehmann). IMS Lecture Notes—Monograph Series (2006) 49:253–265.[CrossRef]

    Zheng G, Freidlin B, Li Z, Gastwirth JL. Choice of scores in trend tests for case-control studies of candidate-gene associations. Biometrical Journal (2003) 45:335–348.[CrossRef][Web of Science]

    Zheng G, Freidlin B, Li Z, Gastwirth JL. Genomic control for association studies under various genetic models. Biometrics (2005) 61:186–192.[CrossRef][Web of Science][Medline]

    Zucker DM, Lakatos E. Weighted log rank type statistics for comparing survival curves when there is a time lag in the effectiveness of treatment. Biometrika (1990) 77:853–864.[Abstract/Free Full Text]

    Received April 16, 2007; revised July 25, 2007; revised October 1, 2007; accepted for publication October 12, 2007.


    Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



    This Article
    Right arrow Abstract Freely available
    Right arrow FREE Full Text (PDF) Freely available
    Right arrow Supplementary Material
    Right arrow All Versions of this Article:
    9/3/391    most recent
    kxm039v1
    Right arrow Alert me when this article is cited
    Right arrow Alert me if a correction is posted
    Services
    Right arrow Email this article to a friend
    Right arrow Similar articles in this journal
    Right arrow Similar articles in PubMed
    Right arrow Alert me to new issues of the journal
    Right arrow Add to My Personal Archive
    Right arrow Download to citation manager
    Right arrowRequest Permissions
    Right arrow Disclaimer
    Google Scholar
    Right arrow Articles by Zheng, G.
    Right arrow Articles by Ng, H. K. T.
    Right arrow Search for Related Content
    PubMed
    Right arrow PubMed Citation
    Right arrow Articles by Zheng, G.
    Right arrow Articles by Ng, H. K. T.
    Social Bookmarking
     Add to CiteULike   Add to Connotea   Add to Del.icio.us  
    What's this?