Biostatistics Advance Access originally published online on December 6, 2005
Biostatistics 2006 7(2):167-181; doi:10.1093/biostatistics/kxj009
A tail strength measure for assessing the overall univariate significance in a dataset
Department of Statistics, Stanford University, Stanford, CA 94305, USA jtaylor{at}stat.stanford.edu
Department of Health Research and Policy and Department of Statistics, Stanford University, Stanford, CA 94305, USA tibs{at}stanford.edu
* To whom correspondence should be addressed.
| SUMMARY |
|---|
We propose an overall measure of significance for a set of hypothesis tests. The tail strength is a simple function of the p-values computed for each of the tests. This measure is useful, for example, in assessing the overall univariate strength of a large set of features in microarray and other genomic and biomedical studies. It also has a simple relationship to the false discovery rate of the collection of tests. We derive the asymptotic distribution of the tail strength measure, and illustrate its use on a number of real datasets.
Keywords: Multiple testing; p-value
| 1. INTRODUCTION |
|---|
Dave et al. (2004)
The left panel of Figure 1 shows the ordered Cox scores T(k) for each gene (see, e.g. Kalbfleisch and Prentice, 1980). These are the partial likelihood score statistics, and are plotted against the expected (null) order statistics, where the expectation is estimated by repeated permutations of the patient labels. We see that there is little deviation from the expected values. The right panel shows a similar plot for the leukemia data of Golub et al. (1999). This problem compares two disease classes, so the scores T(k) are the ordered two-sample t-statistics. There are many more large values than we would expect to see by chance. Perhaps this is why the Golub dataset has become the most common testing ground for authors proposing new methods for microarray analysis.
In the reanalysis of the Dave et al. (2004)
In this paper, we propose a measure of the overall statistical significance in testing the global hypothesis of no gene effects. We call it the tail strength. We derive its asymptotic distribution and illustrate its use on a number of real datasets. We also relate our measure to the false discovery rate (FDR) and the area under the ROC curve.
| 2. TAIL STRENGTH |
|---|
We first define our measure based on a set of p-values. Later, we give an equivalent form in terms of test statistics. We assume that we have null hypotheses H0i, and associated p-values pi, i = 1, 2, ..., m. The global null hypothesis states that H0i holds for all i, and we assume that under this hypothesis, the pi are i.i.d. U[0, 1] random variables.
Let the ordered p-values be $p_{(1)} \le p_{(2)} \le \cdots \le p_{(m)}$. We define the tail strength as

$$\mathrm{TS}(p_1, \ldots, p_m) = \frac{1}{m}\sum_{k=1}^{m}\left(1 - p_{(k)}\,\frac{m+1}{k}\right). \qquad (2.1)$$

Now under the global null hypothesis, each $p_k$ has a uniform distribution, so that the expected value of the kth smallest p-value $p_{(k)}$ is $k/(m+1)$ and TS has expectation zero. The tail strength measures the deviation of each p-value from its expected value: $p_{(k)} < k/(m+1)$ causes the kth term $1 - p_{(k)}(m+1)/k$ to be positive. Thus, large positive values of TS indicate evidence against the global null hypothesis; that is, they indicate that there are more small p-values than we would expect by chance. Note also that the particular form of TS gives more weight to the lowest p-values, so that it is most sensitive to deviations in the tail. In contrast, for example, the sum of successive differences in p-values would not have this property.
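As a minimal sketch of the definition, assuming the tail strength is the average over k of $1 - p_{(k)}(m+1)/k$, the statistic can be computed directly from a list of p-values. The function name `tail_strength` and the toy inputs are illustrative choices, not part of the paper.

```python
def tail_strength(pvalues):
    """Average over k of 1 - p_(k) * (m + 1) / k, with p_(k) the kth smallest p-value."""
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    if m == 0:
        raise ValueError("need at least one p-value")
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(p_sorted, start=1)) / m


# p-values sitting exactly at their null expectations k / (m + 1): every term is 0.
m = 9
null_like = [k / (m + 1) for k in range(1, m + 1)]
ts_null = tail_strength(null_like)

# Halving every p-value makes every term equal to 1 - 1/2.
ts_half = tail_strength([p / 2 for p in null_like])
```

The two toy inputs show the scale of the statistic: p-values at their null expectations give 0, and p-values uniformly half as large as expected give 0.5.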
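There is a Bayesian model for this setting that we will find useful in our later analysis. Given a prior null probability $\pi_0$ and an alternative distribution $F_1$, the Bayesian model for observing m p-values is the following: for $1 \le i \le m$ independently
- generate $H_{0,i} \sim \mathrm{Bernoulli}(\pi_0)$, so that the ith null hypothesis holds with probability $\pi_0$;
- if the null hypothesis holds ($H_{0,i} = 0$ in this coding), generate $p_i \sim \mathrm{Unif}(0, 1)$, else generate $p_i \sim F_1$.

Marginally, the p-values are then i.i.d. with distribution function

$$F(x) = \pi_0 x + (1 - \pi_0) F_1(x).$$

Further, without any constraint on $F_1$, the parameter $\pi_0$ is obviously unidentifiable.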
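The two-group model above can be sketched as a small simulator. The alternative used here, $p = U^4$ with $U$ uniform (so that $F_1(x) = x^{1/4} \ge x$), is purely an assumed example; the paper does not specify $F_1$, and the names `simulate_two_group_pvalues` and `alt_power` are hypothetical.

```python
import random


def simulate_two_group_pvalues(m, pi0, alt_power=4.0, seed=0):
    """Draw m p-values from the two-group model.

    With probability pi0 the null holds and p ~ Unif(0, 1); otherwise
    p = U ** alt_power, whose cdf is F1(x) = x ** (1 / alt_power) >= x.
    The power-law alternative is an illustrative assumption.
    """
    rng = random.Random(seed)
    pvalues, is_null = [], []
    for _ in range(m):
        null = rng.random() < pi0          # null indicator for this hypothesis
        u = rng.random()
        pvalues.append(u if null else u ** alt_power)
        is_null.append(null)
    return pvalues, is_null


pvalues, is_null = simulate_two_group_pvalues(m=2000, pi0=0.8)
frac_null = sum(is_null) / len(is_null)
```

Only the p-values would be observed in practice; the null indicators are returned here so that the simulated mixture can be inspected.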
The most common application of tail strength is likely to be in assessing the univariate (marginal) effect of a set of predictors. Suppose we have predictors xij, j = 1, 2, ..., m, and response variable Yi, for observations i = 1, 2, ..., N. Letting y = (y1, y2, ..., yN) and xj = (x1j, ..., xNj), we form a test statistic for each predictor:
![]() | (2.2) |
Thus, Tj measures the univariate effect of the jth predictor on the response. The two-sample t-statistic is a simple example.
There is an equivalent form of tail strength, expressed in terms of the Tj. Suppose we have a null distribution Prob0 for these statistics, derived from a set of permutations or asymptotic theory. This yields a set of p-values
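$$p_{(k)} = \mathrm{Prob}_0\left(|T| \ge |T_{(k)}|\right), \quad k = 1, \ldots, m, \qquad (2.3)$$

where $|T_{(1)}| \ge |T_{(2)}| \ge \cdots \ge |T_{(m)}|$ are the test statistics ordered by absolute value. Then

$$\mathrm{TS} = \frac{1}{m}\sum_{k=1}^{m}\frac{k - (m+1)\,p_{(k)}}{k}. \qquad (2.4)$$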
Each term is the proportion of test statistics that exceeds the expected number, when testing at value T(k).
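A hedged sketch of this test-statistic form: p-values are taken as the fraction of null statistics at least as large in absolute value as each observed statistic, and each term is $(k - (m+1)p_{(k)})/k$. Pooling one list of null statistics (for example, from permutations) across all features is an assumption made here for brevity, and the function names are hypothetical.

```python
from bisect import bisect_left


def pvalues_from_null(stats, null_stats):
    """p_j = fraction of null statistics with |T*| >= |T_j| (pooled null, an assumption)."""
    null_abs = sorted(abs(t) for t in null_stats)
    n = len(null_abs)
    # bisect_left counts null values strictly below |t|; the rest are >= |t|.
    return [(n - bisect_left(null_abs, abs(t))) / n for t in stats]


def tail_strength_from_stats(stats, null_stats):
    """Average over k of (k - (m + 1) * p_(k)) / k."""
    p_sorted = sorted(pvalues_from_null(stats, null_stats))
    m = len(p_sorted)
    return sum((k - (m + 1) * p) / k for k, p in enumerate(p_sorted, start=1)) / m


null_stats = [-3, -2, -1, 0, 1, 2, 3]
p_example = pvalues_from_null([2.5, 0.0], null_stats)   # two of seven nulls reach 2.5
ts_example = tail_strength_from_stats([2.5], null_stats)
```

With a single statistic 2.5 and this toy null, the p-value is 2/7 and the one term is $1 - 2 \cdot (2/7) = 3/7$.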
For the FL and leukemia datasets, TS equals −0.027 and 0.655, respectively. Hence, the FL p-values are slightly larger than we would expect under the uniform distribution. In contrast, the leukemia genes are highly significant. The value 0.655 for the leukemia data indicates that there are (on average) 65.5% more significant test statistics than we would expect by chance.
The left panel of Figure 2 shows the tail strength measure applied to some simulated microarray data. There are 1000 genes (features) and 20 samples; all measurements are standard N(0, 1), except for the first 100 genes in the second 10 samples, which were generated as N($\mu$, 1). The plot shows tail strength divided by its standard error, from 100 realizations at each of the seven different values of $\mu$. We see that TS has the desired behavior: it is centered around zero when the overall null hypothesis holds ($\mu = 0$) and then becomes more and more positive as $\mu$ increases.
In the right panel, the genes are not independent but have pairwise correlation 0.5 before $\mu$ is added. Hence, the resulting p-values are exchangeable (but not independent) under the null hypothesis. We see that the expectation of tail strength behaves as in the independent case; however, its variance seems to be inflated. We address this issue in Section 3.
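The independent-gene simulation can be sketched as follows. To stay within the standard library, this sketch swaps the two-sample t-statistic for a z-test with known unit variance (so the null p-values are exactly uniform); that substitution, the seed, and the function names are assumptions, and TS is scaled by $1/\sqrt{m}$ as a stand-in for its null standard error.

```python
import math
import random


def tail_strength(pvalues):
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(p_sorted, start=1)) / m


def simulate_ts(mu, n_genes=1000, n_signal=100, n_per_group=10, seed=0):
    """TS for one simulated dataset: the first n_signal genes are shifted by mu in group 2."""
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)   # sd of the difference in group means (unit variance)
    pvalues = []
    for j in range(n_genes):
        shift = mu if j < n_signal else 0.0
        mean1 = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_group)) / n_per_group
        mean2 = sum(rng.gauss(shift, 1.0) for _ in range(n_per_group)) / n_per_group
        z = (mean2 - mean1) / se
        pvalues.append(math.erfc(abs(z) / math.sqrt(2.0)))   # two-sided normal p-value
    return tail_strength(pvalues)


m = 1000
ratio_null = simulate_ts(mu=0.0) * math.sqrt(m)     # TS divided by 1 / sqrt(m)
ratio_signal = simulate_ts(mu=2.0) * math.sqrt(m)
```

Repeating this over many seeds and several values of `mu` would give a plot in the spirit of the left panel; a single realization per value is shown here only to keep the sketch short.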
The quantity TS is closely related to the FDR (Benjamini and Hochberg, 1995; Efron et al., 2001; Storey, 2002; Efron and Tibshirani, 2002; Genovese and Wasserman, 2002).
We first review the FDR. Table 1 displays the various outcomes when testing m null hypotheses $H_{0i}$, $1 \le i \le m$. The quantity V is the number of false positives (type I errors), while R is the total number of hypotheses rejected, which depends on the testing procedure.
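The FDR (Benjamini and Hochberg, 1995) is the expected proportion of false positives among the rejected hypotheses, E(V/R). When all hypotheses with p-values at most x are rejected, a plug-in estimate of the FDR, with the proportion of true null hypotheses set to one, is

$$\widehat{\mathrm{FDR}}(x) = \frac{x}{\hat{F}_m(x)}, \qquad (2.5)$$

where

$$\hat{F}_m(x) = \frac{\#\{i : p_i \le x\}}{m} \qquad (2.6)$$

is the empirical cumulative distribution function of the p-values $p_1, \ldots, p_m$.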
In Efron and Tibshirani (2002) and Storey (2002), it is shown that under the Bayesian model of Section 2,
![]() |
For extensions to large samples, see Storey et al. (2004).
Finally, we can derive the relationship between tail strength and FDR. Looking at the plug-in estimate (2.5), it is easy to see that
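$$\mathrm{TS} = \frac{1}{m}\sum_{k=1}^{m}\left(1 - \frac{m+1}{m}\,\widehat{\mathrm{FDR}}(p_{(k)})\right), \qquad (2.7)$$

because the plug-in estimate $\widehat{\mathrm{FDR}}(x) = x/\hat{F}_m(x)$ equals $m\,p_{(k)}/k$ at $x = p_{(k)}$ when the p-values are distinct.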
Figure 3 gives a graphical interpretation of the simple relationship (2.7): TS measures a weighted area under the curve $1 - \widehat{\mathrm{FDR}}(x)$, evaluating this function at the observed p-values $p_k$. Hence, the faster $1 - \widehat{\mathrm{FDR}}(x)$ goes to one ($\widehat{\mathrm{FDR}}(x)$ drops to zero) as $x \to 0$, the higher the TS. Further, the tighter the p-values are bunched up near 0, the larger the TS.
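The link between TS and the estimated FDR can be checked numerically. This sketch assumes the plug-in estimate is $x/\hat{F}_m(x)$ with $\hat{F}_m$ the empirical cdf and the null proportion set to one, in which case TS is the average of $1 - \frac{m+1}{m}\widehat{\mathrm{FDR}}(p_{(k)})$ for distinct p-values; the function names and the toy p-values are illustrative.

```python
def tail_strength(pvalues):
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(p_sorted, start=1)) / m


def fdr_hat(x, pvalues):
    """Plug-in FDR estimate with the null proportion set to one: x / F_m(x)."""
    m = len(pvalues)
    ecdf = sum(1 for p in pvalues if p <= x) / m   # empirical cdf at x
    return x / ecdf


pvals = [0.001, 0.004, 0.03, 0.2, 0.45, 0.6, 0.9]   # distinct toy p-values
m = len(pvals)
ts_direct = tail_strength(pvals)
ts_via_fdr = sum(1.0 - (m + 1) / m * fdr_hat(p, pvals) for p in pvals) / m
```

The two numbers agree up to floating-point error because the empirical cdf at the kth smallest p-value is exactly k/m.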
Another way of seeing that TS can be phrased directly in terms of the test statistics (as in (2.4)) comes from the fact that the estimated FDR appearing in (2.7) can be computed on the scale of the test statistics or the p-values. Therefore, TS is unchanged under any one-to-one transformation of the p-values, and is not tied to the choice of test statistic used to test each null hypothesis $H_{0i}$, $1 \le i \le m$.
When the p-values are i.i.d. with distribution F, the following result, proven in the Appendix, is therefore not surprising: as $m \to \infty$,

$$\mathrm{TS} \longrightarrow \int_0^1 \left(1 - \mathrm{FDR}(x)\right) dF(x), \qquad (2.8)$$

where

$$\mathrm{FDR}(x) = \frac{x}{F(x)} \qquad (2.9)$$

is the (asymptotic) population FDR, with the unknown proportion of true null hypotheses $\pi_0$ set to one. In other words, the tail strength statistic estimates the average amount by which the true FDR function falls below its null value of one, with the average computed with respect to the true distribution of p-values.
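As a worked illustration of this limit, consider a sketch that assumes the limit is $\int_0^1 (1 - x/F(x))\,dF(x)$ and takes the power-law family $F(x) = x^{a}$ with $0 < a \le 1$ as an assumed example (it is not a model used in the paper):

```latex
\begin{align*}
F(x) &= x^{a}, \qquad f(x) = a\,x^{a-1}, \qquad
\mathrm{FDR}(x) = \frac{x}{F(x)} = x^{1-a}, \\
\int_0^1 \bigl(1 - \mathrm{FDR}(x)\bigr)\, dF(x)
  &= \int_0^1 \bigl(1 - x^{1-a}\bigr)\, a\,x^{a-1}\, dx
   = \int_0^1 a\,x^{a-1}\, dx - \int_0^1 a\, dx
   = 1 - a .
\end{align*}
```

Under this assumed family, $a = 1$ (uniform p-values) gives a limit of 0, while $a = 1/2$ gives 0.5, so smaller $a$ (p-values concentrated nearer zero) yields a larger limiting tail strength.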
If F is stochastically dominated by Unif(0, 1), then TS is asymptotically normal with variance
![]() | (2.10) |
where $C(F) \le 1$ if $F(x) \ge x$ for each x in [0, 1]. Hence, we have the asymptotic approximation
$$\mathrm{Var}(\mathrm{TS}) \approx \frac{1}{m}. \qquad (2.11)$$
In Section 3, we examine the accuracy of this approximation.
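A quick Monte Carlo check of the null variance, as a sketch assuming independent Unif(0, 1) p-values and taking the approximation to be $\mathrm{Var}(\mathrm{TS}) \approx 1/m$. The sample sizes, seed, and names are arbitrary illustrative choices.

```python
import random
import statistics


def tail_strength(pvalues):
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(p_sorted, start=1)) / m


def null_ts_sample(m, reps, seed=0):
    """TS values from reps independent sets of m uniform p-values."""
    rng = random.Random(seed)
    return [tail_strength([rng.random() for _ in range(m)]) for _ in range(reps)]


m = 200
sample = null_ts_sample(m, reps=2000)
mean_ts = statistics.fmean(sample)              # should be close to 0
scaled_var = m * statistics.variance(sample)    # should be close to 1 if Var(TS) ~ 1/m
```

If the approximation holds, `scaled_var` should be near one; under dependence between features it can be far larger, which is the issue taken up in Section 3.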
Note that the quantity $m_1 = m - m_0$ measures how many non-null genes there are in the dataset. Various authors have studied this as a measure of univariate strength (cf. Benjamini and Hochberg, 2000; Storey et al., 2004). However, this does not really measure how different the non-null p-values are from Unif(0, 1). Further, in the Bayesian model described earlier, this parameter is not identifiable without some constraint on the alternatives. In contrast, tail strength is identifiable and measures how far the non-null p-values are from Unif(0, 1).
The asymptotic behavior of tail strength is summarized in the following theorem.
THEOREM 2.1 Under the Bayesian model of Storey (2002), suppose that $F(x) \ge x$. Then, if F has density f, as $m \to \infty$
, TS is asymptotically normally distributed with mean
and variance
The proof appears in the Appendix.
In the diagnostic testing literature (cf. Hanley and McNeil, 1982; Pepe, 2003), the ROC curve is used to discriminate between two samples. Such a curve can also be constructed to compare a sample of test statistics to a given null distribution. A commonly used summary of the ROC curve is the area under the ROC curve. In the two-sample setting, the area is essentially equivalent to the Mann–Whitney test statistic (Hanley and McNeil, 1982). This measure places equal weight on departures from Unif(0, 1) without focusing on the most interesting region, the tail of the test statistics. One solution is to only look at the area under the ROC curve up to some false positive level t0 (Pepe, 2003), but the choice of t0 is somewhat arbitrary. Here we show that the tail strength measure is related to a weighted area under such an ROC curve, weighted to accentuate the tail of the test statistics.
It is well-known (Hanley and McNeil, 1982) that for two independent samples
and
the expected area under the empirical ROC curve
![]() |
is
![]() |
The measure TS is also closely related to the area under the ROC curve (Pepe, 2003; Hanley and McNeil, 1982). Let
![]() |
be the population ROC curve reflected along the line y = x. Suppose that $X \sim \mathrm{Unif}(0, 1)$ and $Y \sim F$, then
![]() |
and this quantity is 1/2 if F = Unif(0, 1). This suggests that the area under the (ROC) curve
![]() | (2.12) |
is a measure of departure from uniformity. It is positive whenever $F(x) \ge x$, or, equivalently, whenever the p-values are stochastically dominated by Unif(0, 1). This quantity places equal weight on the differences for all values of x with no focus on the tail. One way to adjust it is to insert a weight into the expression (2.12)
![]() | (2.13) |
The choice $w(x) = x^{-1}$ corresponds to TS in the asymptotic setting. In finite samples, the above integral is of course replaced by a Riemann sum.
The partial AUC proposed by Pepe (2003) also attempts to accentuate the tail
![]() |
Though the axes are reversed, this is equivalent to choosing a weight
![]() |
while (2.12). Setting $w(x) = x^{-1}$, which puts more weight on the tail, yields TS.
| 3. ESTIMATES OF VARIANCE |
|---|
To use the tail strength measure in practice, we need a reasonably accurate estimate of its variance. Formula (2.11) is very simple, but assumes both independence of the genes and the global null hypothesis.
To assess the accuracy of formula (2.11), we did a simulation experiment. To ensure that the correlation structure was realistic, we used the gene expression data from Rieger et al. (2004), consisting of 12 625 genes and 58 samples in two classes. We constructed three scenarios: in the null scenario, datasets were created by permuting the sample labels (leaving the expression data intact); in the first non-null scenario, we first permuted the sample labels and then added 2000 units to the first 500 genes (2000 was about the largest average difference in group means in the actual data); and in the second non-null scenario, we added a random amount $u_i \sim N(0, 200^2)$ to 2000 genes i chosen at random.
We carried out 100 realizations of this experiment, and the results are shown in Table 2. The quantity sd is the actual standard deviation of tail strength over the 100 realizations; the second column is the asymptotic standard error from (2.11). The third column, the permutation-based estimate (subscript perm), is a different estimate, one that starts with the permutation values used in the original computation of tail strength. We compute tail strength from successive blocks of 20 permutations, and then compute the standard deviation of the resulting tail strength values. This estimate explicitly assumes that the null hypothesis is true. We see that the asymptotic standard error is much too small, in general, because of the lack of independence of the genes. On the other hand, the permutation-based estimate is reasonably accurate under both the null and non-null scenarios. As we might expect, it is somewhat conservative (too large) under the non-null setup, but this is acceptable in practice. Alternatively, one could use a (non-null) bootstrap process to estimate the standard error, but this would require a great deal more computation. This might be an impediment to routine use of the tail strength measure, so we use the permutation-based estimate in the real data examples in this paper.
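The effect of dependence on the variance can be illustrated with a small simulation. This is not the Rieger data experiment and not the block-permutation estimate itself; it is a sketch that assumes equicorrelated null z-scores (one shared Gaussian factor) and compares the empirical standard deviation of TS with the nominal value $1/\sqrt{m}$. All names and settings are hypothetical.

```python
import math
import random
import statistics


def tail_strength(pvalues):
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(p_sorted, start=1)) / m


def null_ts_sd(m, rho, reps, seed=0):
    """Empirical sd of TS when null z-scores share pairwise correlation rho."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        shared = rng.gauss(0.0, 1.0)    # common factor inducing the correlation
        pvals = []
        for _ in range(m):
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            pvals.append(math.erfc(abs(z) / math.sqrt(2.0)))   # two-sided p-value
        values.append(tail_strength(pvals))
    return statistics.stdev(values)


m = 200
nominal = 1.0 / math.sqrt(m)
sd_indep = null_ts_sd(m, rho=0.0, reps=300)
sd_corr = null_ts_sd(m, rho=0.5, reps=300)
```

Each p-value is still marginally uniform under this construction, yet the spread of TS across realizations is much larger when `rho` is 0.5, which is why an estimate that respects the correlation structure (such as one built from permutations of the sample labels) is needed.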
| 4. EXAMPLES |
|---|
Figure 4 shows the tail strength measure and asymptotic 90% confidence intervals, applied to 12 different datasets. The datasets are summarized in Table 3. The first nine datasets are from microarray studies, and all report positive findings. Most of these are described in Dettling (2004)
The remaining datasets are from neuroimaging studies. The datasets aud-over and aud-sent are from an auditory functional magnetic resonance imaging study (Taylor and Worsley, 2005
All the datasets (except for FL) show significant (non-zero) tail strengths of various degrees. For the subset of classification problems among these studies, Table 4 compares the estimated tail strength with the misclassification rate from the nearest shrunken centroid classifier (Tibshirani et al., 2001) [results from other classifiers, given in Dettling (2004), are quite similar]. The error rates were computed by repeated (2/3, 1/3) train-test splits of the data, except for the skin data which uses 14-fold cross-validation.
There is one interesting (qualitative) discrepancy in Table 4: the multiclass brain dataset shows very different behavior in tail strength and misclassification rate. The tail strength is high (0.82), but the misclassification rate seems poor (23.5%). The test statistic for each gene is an F-statistic, the ratio of between-class to within-class variance. Figure 5 shows the ordered test statistics versus their expected values under the null hypothesis. There is clearly more variation than we would expect by chance.
There are some possible explanations for the seeming discrepancy between tail strength and classification rate in the brain example. First note that with five classes, the base error rate is 80%, so that the value 23.5% is actually a substantial reduction in this rate. In addition, there are only 42 cases in this dataset, so that the training set on which the classifiers were trained had only 28 cases on average. For the five classes, the classwise error rates were 15, 6, 6, 20 and 57%. We computed the tail strengths for each class versus the rest (based on a two-sample t-statistic): they were 0.39, 0.53, 0.67, 0.54 and 0.32. Hence, class 5 has both a high error rate and a lower tail strength. It seems that the overall tail strength, based on the F-statistic for all five classes, fails to capture the difficulty in predicting class five.
| 5. DISCUSSION |
|---|
The tail strength measure is potentially useful for assessing the overall statistical significance of a set of hypothesis tests. It gives a quantitative measure of the overall strength of evidence against the global null hypothesis of no association between a large set of features and an outcome of interest. We suggest that the tail strength could be routinely reported in such studies, to give the reader a crude idea of the degree of departure from the no association null in a complex dataset.
In statistics, there is of course a long history and a substantial literature in the area of multiple hypothesis testing. With the flurry of applications in genomics, there has been a resurgence of interest in this area (see, e.g. Dudoit et al., 2003, for a summary). Our work has a close relationship to the FDR approach to multiple testing, as we have shown in Section 2.2. There is a recent work of Efron (2005) (Section 5) in which quantities similar to tail strength are considered, based on a local version of the FDR.
Another concept that seems connected to tail strength is the higher criticism of Donoho and Jin (2004), generalizing an idea introduced by Tukey (1976). They define

$$\mathrm{HC}_m = \max_{1 \le i \le \alpha_0 m} \frac{\sqrt{m}\,\left(i/m - p_{(i)}\right)}{\sqrt{p_{(i)}\left(1 - p_{(i)}\right)}} \qquad (5.14)$$

for some $\alpha_0 > 0$. This statistic is designed as an overall summary of the p-values, and they prove that it is optimal for detecting certain sparse patterns of p-values. They also show that the asymptotic
$\alpha$-percentile for $\mathrm{HC}_m$ is of the size $\sqrt{2 \log\log m}$.
We attempted some numerical comparisons of $\mathrm{HC}_m$ with tail strength on the datasets in this paper, but these were not successful. The presence of some very small p-values made the denominator very small and caused the statistic to get very large. In addition, it was not clear how to choose $\alpha_0$ and the significance cutpoint in finite samples. We leave this comparison for future study.
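The numerical difficulty described above is easy to reproduce. The sketch below assumes the usual form of the higher criticism statistic, the maximum over $i \le \alpha_0 m$ of $\sqrt{m}\,(i/m - p_{(i)})/\sqrt{p_{(i)}(1 - p_{(i)})}$; the default `alpha0=0.5` and the toy p-values are arbitrary illustrative choices.

```python
import math


def higher_criticism(pvalues, alpha0=0.5):
    """Max over i <= alpha0 * m of sqrt(m) * (i/m - p_(i)) / sqrt(p_(i) * (1 - p_(i)))."""
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    i_max = max(1, int(alpha0 * m))
    best = -math.inf
    for i in range(1, i_max + 1):
        p = p_sorted[i - 1]
        denom = math.sqrt(p * (1.0 - p))
        if denom == 0.0:
            continue    # term undefined when p is exactly 0 or 1
        best = max(best, math.sqrt(m) * (i / m - p) / denom)
    return best


hc_moderate = higher_criticism([0.01, 0.5, 0.9, 0.95])
hc_tiny = higher_criticism([1e-12, 0.5, 0.9, 0.95])   # one very small p-value
```

A single p-value of `1e-12` drives the denominator to about `1e-6`, so the statistic jumps from roughly 4.8 to roughly 500000, which matches the instability reported in the text.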
In summary, the tail strength measure proposed here is simple to compute, with no parameters that require adjustment. It must be stressed, however, that it does not measure all the interesting structure that might be present in a dataset. When applied to univariate association measures, it does not capture interactions or multivariate effects that might exist.
| APPENDIX |
|---|
In the FDR setting, previous work has shown that examination of the limiting behavior of (estimates of) FDR and local FDR is useful in understanding what the various techniques are doing in a population setting. In this section, we carry out a similar analysis for TS, and give the proof of Theorem 2.1.
We can write
![]() |
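Under H0, the spacings of the order statistics are distributed as

$$\left(p_{(1)},\, p_{(2)} - p_{(1)},\, \ldots,\, p_{(m)} - p_{(m-1)},\, 1 - p_{(m)}\right) \overset{d}{=} \left(\frac{\xi_1}{\sum_{j=1}^{m+1}\xi_j},\, \ldots,\, \frac{\xi_{m+1}}{\sum_{j=1}^{m+1}\xi_j}\right),$$

where the $\xi_i \sim \mathrm{Exp}(1)$ are i.i.d. exponential random variables.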
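The exponential representation of uniform spacings can be checked by simulation. This sketch builds the ordered p-values as normalized partial sums of m + 1 independent Exp(1) variables and compares their averages with $k/(m+1)$; the function name, seed, and sample sizes are illustrative.

```python
import random


def uniform_order_stats_via_spacings(m, rng):
    """Ordered uniforms built as partial sums of m + 1 Exp(1) variables over their total."""
    xi = [rng.expovariate(1.0) for _ in range(m + 1)]
    total = sum(xi)
    order_stats, running = [], 0.0
    for value in xi[:m]:          # the last spacing is 1 - p_(m), so it is not added
        running += value
        order_stats.append(running / total)
    return order_stats


rng = random.Random(0)
m, reps = 4, 20000
means = [0.0] * m
for _ in range(reps):
    for k, p in enumerate(uniform_order_stats_via_spacings(m, rng)):
        means[k] += p / reps
expected = [(k + 1) / (m + 1) for k in range(m)]   # k / (m + 1) for k = 1..m
```

The simulated means should sit close to 0.2, 0.4, 0.6 and 0.8, the null expectations of the ordered p-values that enter the definition of tail strength.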
This suggests that TS should be asymptotically normally distributed, at least under H0, because it is a sum of approximately independent random variables. In fact, TS is also asymptotically normally distributed when the p-values are identically distributed with distribution F, as in the Bayesian model of Storey (2002). We could alternatively assume that some fixed proportion $\pi_0$ of the p-values are i.i.d. Unif(0, 1), and the remaining are i.i.d. from some distribution $F_1$, such that
![]() |
This is the mixture model used in the development of local FDR by Efron et al. (2001) and Efron and Tibshirani (2002). This assumption would not likely change the essence of our main result, only complicate the proofs.
We begin by expressing (2.1) in yet another way, in terms of quantile processes. Let
![]() | (A.1) |
be the quantile process of the p-values and
![]() | (A.2) |
be the population quantile function.
Given the definition of
m, it is not hard to see that
![]() |
Using this fact, the expression (2.1) takes the form of a Riemann sum
![]() | (A.3) |
Such an expression is simpler to analyze than (2.1), using some results from the theory of quantile processes.
If Q is Riemann integrable, then this expression converges to
![]() |
If, further, F has a density, then making the substitution u = Q(x) we see that this expression is equal to (2.8).
The result we will use from the theory of quantile processes (Barrio, 2004) is the following: under H0,
![]() | (A.4) |
where $(B(x))_{0 \le x \le 1}$ is a standard Brownian bridge, that is, a continuous Gaussian process on [0, 1] with mean 0 and covariance function

$$\mathrm{Cov}\left(B(s), B(t)\right) = \min(s, t) - st. \qquad (A.5)$$
Suppose the p-values are i.i.d. with distribution F, where F is twice differentiable with strictly positive density f on (0, 1). Then (Barrio, 2004)
![]() | (A.6) |
This suggests that
![]() |
A straightforward application of Theorem 1 of Shorack (1972), combined with the comments above, suffices to prove Theorem 2.1.
REMARK A.1 Actually, F need not even have a density for the central limit result above to hold, though the expected value will be changed slightly. If F has density f, then under the hypothesis $F(x) \ge x$
so the variance under H0 is an upper bound.
| ACKNOWLEDGMENTS |
|---|
We would like to thank Brad Efron for helpful comments, and editors and referees whose comments substantially improved this paper. Robert Tibshirani was partially supported by National Science Foundation grant DMS-9971405 and National Institutes of Health contract N01-HV-28183.
| REFERENCES |
|---|
ALIZADEH, A., EISEN, M., DAVIS, R. E., MA, C., LOSSOS, I., ROSENWALD, A., BOLDRICK, J., SABET, H., TRAN, T., YU, X. et al. (2000). Identification of molecularly and clinically distinct subtypes of diffuse large B cell lymphoma by gene expression profiling. Nature 403, 503–511.
ALON, U., BARKAI, N., NOTTERMAN, D., GISH, K., YBARRA, S., MACK, D. AND LEVINE, A. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America 96, 6745–6750.
BARRIO, E. (2004). Empirical and Quantile Processes in the Asymptotic Theory of Goodness of Fit Tests. http://www.eio.uva.es/ems/Goodness_of_fit-Laredo_2004.pdf.
BENJAMINI, Y. AND HOCHBERG, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289–300.
BENJAMINI, Y. AND HOCHBERG, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. Journal of Educational and Behavioral Statistics 25, 60–83.
DAVE, S. S., WRIGHT, G., TAN, B., ROSENWALD, A., GASCOYNE, R. D., CHAN, W. C., FISHER, R. I., BRAZIEL, R. M., RIMSZA, L. M., GROGAN, T. M. et al. (2004). Prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating immune cells. The New England Journal of Medicine 351, 2159–2169.
DETTLING, M. (2004). Bagboosting for tumor classification with gene expression data. Bioinformatics 20, 3583–3593.
DEUTSCH, G. K., DOUGHERTY, R. F., BAMMER, R., SIOK, W. T., GABRIELI, J. D. AND WANDELL, B. (2005). Correlations between white matter microstructure and reading performance in children. Cortex 41, 354–363.
DONOHO, D. AND JIN, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Annals of Statistics 32, 962–994.
DUDOIT, S., SHAFFER, J. P. AND BOLDRICK, J. C. (2003). Multiple hypothesis testing in microarray experiments. Statistical Science 18, 71–103.
EFRON, B. (2005). Local false discovery rates. Technical Report, Stanford University.
EFRON, B. AND TIBSHIRANI, R. (2002). Empirical Bayes methods and false discovery rates for microarrays. Genetic Epidemiology 23, 70–86.
EFRON, B., TIBSHIRANI, R., STOREY, J. AND TUSHER, V. (2001). Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association 96, 1151–1160.
GENOVESE, C. AND WASSERMAN, L. (2002). Operating characteristics and extensions of the FDR procedure. Journal of the Royal Statistical Society Series B 64, 499–517.
GOLUB, T., SLONIM, D. K., TAMAYO, P., HUARD, C., GAASENBEEK, M., MESIROV, J. P., COLLER, H., LOH, M. L., DOWNING, J. R., CALIGIURI, M. A. et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–536.
HANLEY, J. A. AND MCNEIL, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36.
KALBFLEISCH, J. AND PRENTICE, R. (1980). The Statistical Analysis of Failure Time Data. New York: Wiley.
KHAN, J., WEI, J. S., RINGNéR, M., SAAL, L. H., LADANYI, M., WESTERMANN, F., BERTHOLD, F., SCHWAB, M., ANTONESCU, C. R., PETERSON, C. et al. (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Medicine 7, 673–679.
LIAO, C., WORSLEY, K., POLINE, J.-B., ASTON, J., DUNCAN, G. AND EVANS, A. (2002). Estimating the delay of the response in fMRI data. Neuroimage 16, 593–606.
PEPE, M. S. (2003). Partial AUC estimation and regression. Biometrics 59, 614–623. http://www.blackwell-synergy.com/doi/abs/10.1111/1541-0420.00071.
POMEROY, S., TAMAYO, P., GAASENBEEK, M., STURLA, L., ANGELO, M., MCLAUGHLIN, M., KIM, J., GOUMNEROVA, L., BLACK, P., LAU, C. et al. (2002). Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 415, 436–442.
RIEGER, K., HONG, W., TUSHER, V., TANG, J., TIBSHIRANI, R. AND CHU, G. (2004). Toxicity from radiation therapy associated with abnormal transcriptional responses to DNA damage. Proceedings of the National Academy of Sciences of the United States of America 101, 6634–6640.
ROSENWALD, A., WRIGHT, G., CHAN, W. C., CONNORS, J. M., CAMPO, E., FISHER, R. I., GASCOYNE, R. D., MULLER-HERMELINK, H. K., SMELAND, E. B. AND STAUDT, L. M. (2002). The use of molecular profiling to predict survival after chemotherapy for diffuse large B-cell lymphoma. The New England Journal of Medicine 346, 1937–1947.
SCHWARTZMANN, A., DOUGHERTY, R. AND TAYLOR, J. (2005). Cross-subject comparison of principal diffusion direction maps. Magnetic Resonance in Medicine 53, 1423–1431.
SHORACK, G. R. (1972). Functions of order statistics. Annals of Mathematical Statistics 43, 412–427.
SINGH, D., FEBBO, P., ROSS, K., JACKSON, D., MANOLA, J., LADD, C., TAMAYO, P., RENSHAW, A., D'AMICO, A., RICHIE, J. et al. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1, 203–209.
STOREY, J. D. (2002). A direct approach to false discovery rates. Journal of the Royal Statistical Society Series B 64, 479–498.
STOREY, J. D., TAYLOR, J. E. AND SIEGMUND, D. O. (2004). Strong control, conservative point estimation, and simultaneous conservative consistency of false discovery rates: a unified approach. Journal of the Royal Statistical Society Series B 66, 187–205.
TAYLOR, J. AND WORSLEY, K. (2005). Analysis of hemodynamic delay in the FIAC data. 11th Annual Meeting of the Organization for Human Brain Mapping, Toronto, June 12–16, 2005.
TIBSHIRANI, R. (2005). Immune signatures in follicular lymphoma. The New England Journal of Medicine 352, 1496–1497.
TIBSHIRANI, R., HASTIE, T., NARASIMHAN, B. AND CHU, G. (2001). Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proceedings of the National Academy of Sciences of the United States of America 99, 6567–6572.
TUKEY, J. (1976). T13 n: The Higher Criticism. Course notes, Stat 411. Princeton University.
Received August 25, 2005; revised November 29, 2005; accepted for publication December 1, 2005.