<?xml version="1.0" encoding="ISO-8859-1"?>

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
 xmlns:prism="http://purl.org/rss/1.0/modules/prism/"
 xmlns:admin="http://webns.net/mvcb/"
>

<channel rdf:about="http://biostatistics.oxfordjournals.org">
<title>Biostatistics - Advance Access</title>
<link>http://biostatistics.oxfordjournals.org</link>
<description>Biostatistics - RSS feed of articles</description>
<prism:eIssn>1468-4357</prism:eIssn>
<prism:publicationName>Biostatistics</prism:publicationName>
<prism:issn>1465-4644</prism:issn>
<items>
 <rdf:Seq>
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn007v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn009v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn008v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn006v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn004v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn002v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm059v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm058v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm053v3?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn005v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm055v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn001v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn003v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm057v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm050v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm056v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm051v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm049v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm052v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm054v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm048v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm047v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm046v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm044v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm045v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm041v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm042v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm038v1?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm039v1?rss=1" />
 </rdf:Seq>
</items>
</channel>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn007v1?rss=1">
<title><![CDATA[Testing for association on the X chromosome]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn007v1?rss=1</link>
<description><![CDATA[
<p>The problem of testing for genotype&ndash;phenotype association with loci on the X chromosome in mixed-sex samples has received surprisingly little attention. A simple test can be constructed by counting alleles, with males contributing a single allele and females 2. This approach does assume not only Hardy&ndash;Weinberg equilibrium in the population from which the study subjects are sampled but also, perhaps, an unrealistic alternative hypothesis. This paper proposes 1 and 2 degree-of-freedom tests for association which do not assume Hardy&ndash;Weinberg equilibrium and which treat males as homozygous females. The proposed method remains valid when phenotype varies between sexes, provided the allele frequency does not, and avoids the loss of power resulting from stratification by sex in such circumstances.</p>
]]></description>
<dc:creator><![CDATA[Clayton, D.]]></dc:creator>
<dc:date>2008-04-25</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn007</dc:identifier>
<dc:title><![CDATA[Testing for association on the X chromosome]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-04-25</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn009v1?rss=1">
<title><![CDATA[Time-dependent covariates in the proportional subdistribution hazards model for competing risks]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn009v1?rss=1</link>
<description><![CDATA[
<p>Separate Cox analyses of all cause-specific hazards are the standard technique of choice to study the effect of a covariate in competing risks, but a synopsis of these results in terms of cumulative event probabilities is challenging. This difficulty has led to the development of the proportional subdistribution hazards model. If the covariate is known at baseline, the model allows for a summarizing assessment in terms of the cumulative incidence function. Mathematically, the model also allows for including random time-dependent covariates, but practical implementation has remained unclear due to a certain risk set peculiarity. We use the intimate relationship of discrete covariates and multistate models to naturally treat time-dependent covariates within the subdistribution hazards framework. The methodology then straightforwardly translates to real-valued time-dependent covariates. As with classical survival analysis, including time-dependent covariates does not result in a model for probability functions anymore. Nevertheless, the proposed methodology provides a useful synthesis of separate cause-specific hazards analyses. We illustrate this with hospital infection data, where time-dependent covariates and competing risks are essential to the subject research question.</p>
]]></description>
<dc:creator><![CDATA[Beyersmann, J., Schumacher, M.]]></dc:creator>
<dc:date>2008-04-22</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn009</dc:identifier>
<dc:title><![CDATA[Time-dependent covariates in the proportional subdistribution hazards model for competing risks]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-04-22</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn008v1?rss=1">
<title><![CDATA[Estimating time-to-event from longitudinal ordinal data using random-effects Markov models: application to multiple sclerosis progression]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn008v1?rss=1</link>
<description><![CDATA[
<p>Longitudinal ordinal data are common in many scientific studies, including those of multiple sclerosis (MS), and are frequently modeled using Markov dependency. Several authors have proposed random-effects Markov models to account for heterogeneity in the population. In this paper, we go one step further and study prediction based on random-effects Markov models. In particular, we show how to calculate the probabilities of future events and confidence intervals for those probabilities, given observed data on the ordinal outcome and a set of covariates, and how to update them over time. We discuss the usefulness of depicting these probabilities for visualization and interpretation of model results and illustrate our method using data from a phase III clinical trial that evaluated the utility of interferon beta-1a (trademark Avonex) for patients with relapsing&ndash;remitting MS.</p>
]]></description>
<dc:creator><![CDATA[Mandel, M., Betensky, R. A.]]></dc:creator>
<dc:date>2008-04-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn008</dc:identifier>
<dc:title><![CDATA[Estimating time-to-event from longitudinal ordinal data using random-effects Markov models: application to multiple sclerosis progression]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-04-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn006v1?rss=1">
<title><![CDATA[On outcome-dependent sampling designs for longitudinal binary response data with time-varying covariates]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn006v1?rss=1</link>
<description><![CDATA[
<p>A typical longitudinal study prospectively collects both repeated measures of a health status outcome as well as covariates that are used either as the primary predictor of interest or as important adjustment factors. In many situations, all covariates are measured on the entire study cohort. However, in some scenarios the primary covariates are time dependent yet may be ascertained retrospectively after completion of the study. One common example would be covariate measurements based on stored biological specimens such as blood plasma. While authors have previously proposed generalizations of the standard case&ndash;control design in which the clustered outcome measurements are used to selectively ascertain covariates (<cross-ref type="bib" refid="bib14">Neuhaus and Jewell, 1990</cross-ref>) and therefore provide resource efficient collection of information, these designs do not appear to be commonly used. One potential barrier to the use of longitudinal outcome-dependent sampling designs would be the lack of a flexible class of likelihood-based analysis methods. With the relatively recent development of flexible and practical methods such as generalized linear mixed models (<cross-ref type="bib" refid="bib4">Breslow and Clayton, 1993</cross-ref>) and marginalized models for categorical longitudinal data (see <cross-ref type="bib" refid="bib10">Heagerty and Zeger, 2000</cross-ref>, for an overview), the class of likelihood-based methods is now sufficiently well developed to capture the major forms of longitudinal correlation found in biomedical repeated measures data. Therefore, the goal of this manuscript is to promote the consideration of outcome-dependent longitudinal sampling designs and to both outline and evaluate the basic conditional likelihood analysis allowing for valid statistical inference.</p>
]]></description>
<dc:creator><![CDATA[Schildcrout, J. S., Heagerty, P. J.]]></dc:creator>
<dc:date>2008-03-27</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn006</dc:identifier>
<dc:title><![CDATA[On outcome-dependent sampling designs for longitudinal binary response data with time-varying covariates]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-27</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn004v1?rss=1">
<title><![CDATA[Estimating hepatitis C prevalence in England and Wales by synthesizing evidence from multiple data sources. Assessing data conflict and model fit]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn004v1?rss=1</link>
<description><![CDATA[
<p>Multiparameter evidence synthesis is becoming widely used as a way of combining evidence from multiple and often disparate sources of information concerning a number of parameters. Synthesizing data in one encompassing model allows propagation of evidence and learning. We demonstrate the use of such an approach in estimating the number of people infected with the hepatitis C virus (HCV) in England and Wales. Data are obtained from seroprevalence studies conducted in different subpopulations. Each subpopulation is modeled as a composition of 3 main HCV risk groups (current injecting drug users (IDUs), ex-IDUs, and non-IDUs). Further, data obtained on the prevalence (size) of each risk group provide an estimate of the prevalence of HCV in the whole population. We simultaneously estimate all model parameters through the use of Bayesian Markov chain Monte Carlo techniques. The main emphasis of this paper is the assessment of evidence consistency and investigation of the main drivers for model inferences. We consider a cross-validation technique to reveal data conflict and leverage when each data source is in turn removed from the model.</p>
]]></description>
<dc:creator><![CDATA[Sweeting, M. J., De Angelis, D., Hickman, M., Ades, A. E.]]></dc:creator>
<dc:date>2008-03-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn004</dc:identifier>
<dc:title><![CDATA[Estimating hepatitis C prevalence in England and Wales by synthesizing evidence from multiple data sources. Assessing data conflict and model fit]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn002v1?rss=1">
<title><![CDATA[Optimal screening for promising genes in 2-stage designs]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn002v1?rss=1</link>
<description><![CDATA[
<p>Detecting genetic markers with biologically relevant effects remains a challenge due to multiple testing. Standard analysis methods focus on evidence against the null and protect primarily the type I error. On the other hand, the worthwhile alternative is specified for power calculations at the design stage. The balanced test as proposed by Moerkerke <I>and others</I> (2006) and Moerkerke and Goetghebeur (2006) incorporates this alternative directly in the decision criterion to achieve better power. Genetic markers are selected and ranked in order of the balance of evidence they contain against the null and the target alternative. In this paper, we build on this guiding principle to develop 2-stage designs for screening genetic markers when the cost of measurements is high. For a given marker, a first sample may already provide sufficient evidence for or against the alternative. If not, more data are gathered at the second stage which is then followed by a binary decision based on all available data. By optimizing parameters which determine the decision process over the 2 stages (such as the area of the "gray" zone which leads to the gathering of extra data), the expected cost per marker can be reduced substantially. We also demonstrate that, compared to 1-stage designs, 2-stage designs achieve a better balance between true negatives and positives for the same cost.</p>
]]></description>
<dc:creator><![CDATA[Moerkerke, B., Goetghebeur, E.]]></dc:creator>
<dc:date>2008-03-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn002</dc:identifier>
<dc:title><![CDATA[Optimal screening for promising genes in 2-stage designs]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm059v1?rss=1">
<title><![CDATA[A Bayesian approach to functional-based multilevel modeling of longitudinal data: applications to environmental epidemiology]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm059v1?rss=1</link>
<description><![CDATA[
<p>Flexible multilevel models are proposed to allow for cluster-specific smooth estimation of growth curves in a mixed-effects modeling format that includes subject-specific random effects on the growth parameters. Attention is then focused on models that examine between-cluster comparisons of the effects of an ecologic covariate of interest (e.g. air pollution) on nonlinear functionals of growth curves (e.g. maximum rate of growth). A Gibbs sampling approach is used to get posterior mean estimates of nonlinear functionals along with their uncertainty estimates. A second-stage ecologic random-effects model is used to examine the association between a covariate of interest (e.g. air pollution) and the nonlinear functionals. A unified estimation procedure is presented along with its computational and theoretical details. The models are motivated by, and illustrated with, lung function and air pollution data from the Southern California Children's Health Study.</p>
]]></description>
<dc:creator><![CDATA[Berhane, K., Molitor, N.-T.]]></dc:creator>
<dc:date>2008-03-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm059</dc:identifier>
<dc:title><![CDATA[A Bayesian approach to functional-based multilevel modeling of longitudinal data: applications to environmental epidemiology]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm058v1?rss=1">
<title><![CDATA[A transdimensional Bayesian model for pattern recognition in DNA sequences]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm058v1?rss=1</link>
<description><![CDATA[
<p>Identification of transcription factor binding sites (TFBSs) is essential to elucidate gene regulatory networks. This article is focused on the recognition of overrepresented short patterns, called "motifs", that may correspond to regulatory binding sites in the DNA sequences upstream of genes. An integrated Bayesian model is proposed to incorporate all unknown characteristics in motif discovery, including the number of motifs, motif widths, motif compositions, the number of motif sites, and locations of motif sites. Reversible jump Markov chain Monte Carlo is used to obtain posterior inference in the transdimensional parameter space. We present a number of suggestions for graphical summarization of the posterior distribution over the complex parameter space. The basic model is extended using a third-order Markov structure for nonmotif bases and allowing positions within a motif to be switched between 2 types: "conserved" and "degenerate." We evaluate the prediction accuracy for the simulated data with 3 motifs and apply the model to upstream sequences in high signal-to-noise regions in a human ChIP-chip study. The performance of the Bayesian model is assessed using yeast data sets of various numbers of sequences and background structures, with and without true TFBSs. The performance is also compared to other computational methods, including 2 statistical approaches, AlignACE and multiple expectation maximization for motif elicitation, and 1 word numeration&ndash;based approach, yeast motif finder (YMF).</p>
]]></description>
<dc:creator><![CDATA[Li, S. M., Wakefield, J., Self, S.]]></dc:creator>
<dc:date>2008-03-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm058</dc:identifier>
<dc:title><![CDATA[A transdimensional Bayesian model for pattern recognition in DNA sequences]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm053v3?rss=1">
<title><![CDATA[Efficient p-value estimation in massively parallel testing problems]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm053v3?rss=1</link>
<description><![CDATA[
<p>We present a new method to efficiently estimate very large numbers of <I>p</I>-values using empirically constructed null distributions of a test statistic. The need to evaluate a very large number of <I>p</I>-values is increasingly common with modern genomic data, and when interaction effects are of interest, the number of tests can easily run into billions. When the asymptotic distribution is not easily available, permutations are typically used to obtain <I>p</I>-values but these can be computationally infeasible in large problems. Our method constructs a prediction model to obtain a first approximation to the <I>p</I>-values and uses Bayesian methods to choose a fraction of these to be refined by permutations. We apply and evaluate our method on the study of association between 2-way interactions of genetic markers and colorectal cancer using the data from the first phase of a large, genome-wide case&ndash;control study. The results show enormous computational savings as compared to evaluating a full set of permutations, with little decrease in accuracy.</p>
]]></description>
<dc:creator><![CDATA[Kustra, R., Shi, X., Murdoch, D. J., Greenwood, C. M. T., Rangrej, J.]]></dc:creator>
<dc:date>2008-03-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm053</dc:identifier>
<dc:title><![CDATA[Efficient p-value estimation in massively parallel testing problems]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn005v1?rss=1">
<title><![CDATA[Boosting method for nonlinear transformation models with censored survival data]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn005v1?rss=1</link>
<description><![CDATA[
<p>We propose a general class of nonlinear transformation models for analyzing censored survival data, of which the nonlinear proportional hazards and proportional odds models are special cases. A cubic smoothing spline&ndash;based component-wise boosting algorithm is derived to estimate covariate effects nonparametrically using the gradient of the marginal likelihood, which is computed using importance sampling. The proposed method can be applied to survival data with high-dimensional covariates, including the case when the sample size is smaller than the number of predictors. Empirical performance of the proposed method is evaluated via simulations and analysis of a microarray survival data set.</p>
]]></description>
<dc:creator><![CDATA[Lu, W., Li, L.]]></dc:creator>
<dc:date>2008-03-15</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn005</dc:identifier>
<dc:title><![CDATA[Boosting method for nonlinear transformation models with censored survival data]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-15</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm055v1?rss=1">
<title><![CDATA[A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm055v1?rss=1</link>
<description><![CDATA[
<p>This manuscript describes a novel, linear mixed-effects model&ndash;fitting technique for the setting in which correlated data indicators are not completely observed. Mixed modeling is a useful analytical tool for characterizing genotype&ndash;phenotype associations among multiple potentially informative genetic loci. This approach involves grouping individuals into genetic clusters, where individuals in the same cluster have similar or identical multilocus genotypes. In haplotype-based investigations of unrelated individuals, corresponding cluster assignments are unobservable since the alignment of alleles within chromosomal copies is not generally observed. We derive an expectation conditional maximization approach to estimation in the mixed modeling setting, where cluster assignments are ambiguous. The approach has broad relevance to the analysis of data with missing correlated data identifiers. An example is provided based on data arising from a cohort of human immunodeficiency virus type-1&ndash;infected individuals at risk for antiretroviral therapy&ndash;associated dyslipidemia.</p>
]]></description>
<dc:creator><![CDATA[Foulkes, A. S., Yucel, R., Li, X.]]></dc:creator>
<dc:date>2008-03-14</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm055</dc:identifier>
<dc:title><![CDATA[A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-03-14</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn001v1?rss=1">
<title><![CDATA[Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn001v1?rss=1</link>
<description><![CDATA[
<p>Genome-wide association studies (GWAS) provide an important approach to identifying common genetic variants that predispose to human disease. A typical GWAS may genotype hundreds of thousands of single nucleotide polymorphisms (SNPs) located throughout the human genome in a set of cases and controls. Logistic regression is often used to test for association between a SNP genotype and case versus control status, with corresponding odds ratios (ORs) typically reported only for those SNPs meeting selection criteria. However, when these estimates are based on the original data used to detect the variant, the results are affected by a selection bias sometimes referred to as the "winner's curse" (<cross-ref type="bib" refid="bib4">Capen <I>and others</I>, 1971</cross-ref>). The actual genetic association is typically overestimated. We show that such selection bias may be severe in the sense that the conditional expectation of the standard OR estimator may be quite far away from the underlying parameter. Also, standard confidence intervals (CIs) may have coverage rates far from the desired level for the selected ORs. We propose and evaluate 3 bias-reduced estimators, and also corresponding weighted estimators that combine corrected and uncorrected estimators, to reduce selection bias. Their corresponding CIs are also proposed. We study the performance of these estimators using simulated data sets and show that they reduce the bias and give CI coverage close to the desired level under various scenarios, even for associations having only small statistical power.</p>
]]></description>
<dc:creator><![CDATA[Zhong, H., Prentice, R. L.]]></dc:creator>
<dc:date>2008-02-28</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn001</dc:identifier>
<dc:title><![CDATA[Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-28</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxn003v1?rss=1">
<title><![CDATA[Regression models for infant mortality data in Norwegian siblings, using a compound Poisson frailty distribution with random scale]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxn003v1?rss=1</link>
<description><![CDATA[
<p>The power variance function distributions, which include the gamma and compound Poisson (CP) distributions among others, are commonly used in frailty models for family data. In a previous paper, we presented a frailty model constructed by randomizing the scale parameter in a CP distribution. When combined with a parametric baseline hazard, this yields a model with heterogeneity on both the individual and the family level and a subgroup with zero frailty, corresponding to people not experiencing the event. In this paper, we discuss covariates in the model. Depending on where the covariates are inserted in the model, one may have proportional hazards at the individual level, the family level, and a larger group level (for covariates shared by many families, e.g. ethnic groups) or get accelerated failure times. Each of these alternatives gives a specific interpretation of the covariate effects. An application to data on infant mortality in siblings from the Medical Birth Registry of Norway is included. We compare the results for some of the different covariate modeling options.</p>
]]></description>
<dc:creator><![CDATA[Moger, T. A., Aalen, O. O.]]></dc:creator>
<dc:date>2008-02-27</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn003</dc:identifier>
<dc:title><![CDATA[Regression models for infant mortality data in Norwegian siblings, using a compound Poisson frailty distribution with random scale]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-27</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm057v1?rss=1">
<title><![CDATA[Modeling temperature effects on mortality: multiple segmented relationships with common break points]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm057v1?rss=1</link>
<description><![CDATA[
<p>We present a model for estimation of temperature effects on mortality that is able to capture jointly the typical features of every temperature&ndash;death relationship, that is, nonlinearity and delayed effect of cold and heat over a few days. Using a segmented approximation along with a doubly penalized spline-based distributed lag parameterization, estimates and relevant standard errors of the cold- and heat-related risks and the heat tolerance are provided. The model is applied to data from Milano, Italy.</p>
]]></description>
<dc:creator><![CDATA[Muggeo, V. M. R.]]></dc:creator>
<dc:date>2008-02-27</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm057</dc:identifier>
<dc:title><![CDATA[Modeling temperature effects on mortality: multiple segmented relationships with common break points]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-27</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm050v1?rss=1">
<title><![CDATA[ROC analysis with multiple classes and multiple tests: methodology and its application in microarray studies]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm050v1?rss=1</link>
<description><![CDATA[
<p>The accuracy of a single diagnostic test for binary outcome can be summarized by the area under the receiver operating characteristic (ROC) curve. Volume under the surface and hypervolume under the manifold have been proposed as extensions for multiple class diagnosis (Scurfield, 1996, 1998). However, the lack of simple inferential procedures for such measures has limited their practical utility. Part of the difficulty is that calculating such quantities may not be straightforward, even with a single test. The decision rule used to generate the ROC surface requires class probability assessments, which are not provided by the tests. We develop a method based on estimating the probabilities via some procedure, for example, multinomial logistic regression. Bootstrap inferences are proposed to account for variability in estimating the probabilities and perform well in simulations. The ROC measures are compared to the correct classification rate, which depends heavily on class prevalences. An example of tumor classification with microarray data demonstrates that this property may lead to substantially different analyses. The ROC-based analysis yields notable decreases in model complexity over previous analyses.</p>
]]></description>
<dc:creator><![CDATA[Li, J., Fine, J. P.]]></dc:creator>
<dc:date>2008-02-27</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm050</dc:identifier>
<dc:title><![CDATA[ROC analysis with multiple classes and multiple tests: methodology and its application in microarray studies]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-27</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm056v1?rss=1">
<title><![CDATA[Linear mixed models for longitudinal shape data with applications to facial modeling]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm056v1?rss=1</link>
<description><![CDATA[
<p>We present a novel application of methods for analysis of high-dimensional longitudinal data to a comparison of facial shape over time between babies with cleft lip and palate and similarly aged controls. A pairwise methodology introduced by Fieuws and Verbeke (2006) is used to apply a linear mixed-effects model to high-dimensional data, such as those describing facial shape. The approach involves fitting bivariate linear mixed-effects models to all the pairwise combinations of responses, where the latter result from the individual coordinate positions, and aggregating the results across repeated parameter estimates (such as the random-effects variance for a particular coordinate). We describe one example using landmarks and another using facial curves from the cleft lip study, the latter using B-splines to provide an efficient parameterization. The results are presented in 2 dimensions, both in the profile and in the frontal views, with bivariate confidence intervals for the mean position of each landmark or curve, allowing objective assessment of significant differences in particular areas of the face between the 2 groups. Model comparison is performed using Wald and pseudolikelihood ratio tests.</p>
]]></description>
<dc:creator><![CDATA[Barry, S. J. E., Bowman, A. W.]]></dc:creator>
<dc:date>2008-02-05</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm056</dc:identifier>
<dc:title><![CDATA[Linear mixed models for longitudinal shape data with applications to facial modeling]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-05</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm051v1?rss=1">
<title><![CDATA[Mixture models with multiple levels, with application to the analysis of multifactor gene expression data]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm051v1?rss=1</link>
<description><![CDATA[
<p>Model-based clustering is a popular tool for summarizing high-dimensional data. With the number of high-throughput large-scale gene expression studies still on the rise, the need for effective data-summarizing tools has never been greater. By grouping genes according to a common experimental expression profile, we may gain new insight into the biological pathways that steer biological processes of interest. Clustering of gene profiles can also assist in assigning functions to genes that have not yet been functionally annotated. In this paper, we propose 2 model selection procedures for model-based clustering. Model selection in model-based clustering has to date focused on the identification of data dimensions that are relevant for clustering. However, in more complex data structures, with multiple experimental factors, such an approach does not provide easily interpreted clustering outcomes. We propose a mixture model with multiple levels that provides sparse representations both "within" and "between" cluster profiles. We explore various flexible "within-cluster" parameterizations and discuss how efficient parameterizations can greatly enhance the objective interpretability of the generated clusters. Moreover, we allow for a sparse "between-cluster" representation with a different number of clusters at different levels of an experimental factor of interest. This enhances interpretability of clusters generated in multiple-factor contexts. Interpretable cluster profiles can assist in detecting biologically relevant groups of genes that may be missed with less efficient parameterizations. We use our multilevel mixture model to mine a proliferating cell line expression data set for annotational context and regulatory motifs. We also investigate the performance of the multilevel clustering approach on several simulated data sets.</p>
]]></description>
<dc:creator><![CDATA[Jornsten, R., Keles, S.]]></dc:creator>
<dc:date>2008-02-05</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm051</dc:identifier>
<dc:title><![CDATA[Mixture models with multiple levels, with application to the analysis of multifactor gene expression data]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-02-05</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm049v1?rss=1">
<title><![CDATA[Penalized loss functions for Bayesian model comparison]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm049v1?rss=1</link>
<description><![CDATA[
<p>The deviance information criterion (DIC) is widely used for Bayesian model comparison, despite the lack of a clear theoretical foundation. DIC is shown to be an approximation to a penalized loss function based on the deviance, with a penalty derived from a cross-validation argument. This approximation is valid only when the effective number of parameters in the model is much smaller than the number of independent observations. In disease mapping, a typical application of DIC, this assumption does not hold and DIC under-penalizes more complex models. Another deviance-based loss function, derived from the same decision-theoretic framework, is applied to mixture models, which have previously been considered an unsuitable application for DIC.</p>
]]></description>
<dc:creator><![CDATA[Plummer, M.]]></dc:creator>
<dc:date>2008-01-21</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm049</dc:identifier>
<dc:title><![CDATA[Penalized loss functions for Bayesian model comparison]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-01-21</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm052v1?rss=1">
<title><![CDATA[Statistical models for quantifying diagnostic accuracy with multiple lesions per patient]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm052v1?rss=1</link>
<description><![CDATA[
<p>We propose random-effects models to summarize and quantify the accuracy of the diagnosis of multiple lesions on a single image without assuming independence between lesions. The number of false-positive lesions was assumed to be distributed as a Poisson mixture, and the proportion of true-positive lesions was assumed to be distributed as a binomial mixture. We considered univariate and bivariate mixture models, both parametric and nonparametric. We applied our tools to simulated data and to data from a study assessing the diagnostic accuracy of virtual colonography with computed tomography in 200 patients suspected of having one or more polyps.</p>
]]></description>
<dc:creator><![CDATA[Zwinderman, A. H., Glas, A. S., Bossuyt, P. M., Florie, J., Bipat, S., Stoker, J.]]></dc:creator>
<dc:date>2008-01-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm052</dc:identifier>
<dc:title><![CDATA[Statistical models for quantifying diagnostic accuracy with multiple lesions per patient]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-01-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm054v1?rss=1">
<title><![CDATA[A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm054v1?rss=1</link>
<description><![CDATA[
<p>Longitudinal data often contain missing observations and error-prone covariates. Extensive attention has been directed to analysis methods to adjust for the bias induced by missing observations. There is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. It is not clear what the impact of ignoring measurement error is when analyzing longitudinal data with both missing observations and error-prone covariates. In this article, we study the effects of covariate measurement error on estimation of the response parameters for longitudinal studies. We develop an inference method that adjusts for the biases induced by measurement error as well as by missingness. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and variance structures. Furthermore, the proposed method employs the so-called functional modeling strategy to handle the covariate process, with the distribution of covariates left unspecified. These features, plus the simplicity of implementation, make the proposed method very attractive. In this paper, we establish the asymptotic properties for the resulting estimators. With the proposed method, we conduct sensitivity analyses on a cohort data set arising from the Framingham Heart Study. Simulation studies are carried out to evaluate the impact of ignoring covariate measurement error and to assess the performance of the proposed method.</p>
]]></description>
<dc:creator><![CDATA[Yi, G. Y.]]></dc:creator>
<dc:date>2008-01-16</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm054</dc:identifier>
<dc:title><![CDATA[A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2008-01-16</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm048v1?rss=1">
<title><![CDATA[Weighted clustering of called array CGH data]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm048v1?rss=1</link>
<description><![CDATA[
<p>Array comparative genomic hybridization (aCGH) is a laboratory technique to measure chromosomal copy number changes. A clear biological interpretation of the measurements is obtained by mapping these onto an ordinal scale with categories loss/normal/gain of a copy. The pattern of gains and losses harbors a level of tumor specificity. Here, we present WECCA (weighted clustering of called aCGH data), a method for weighted clustering of samples on the basis of the ordinal aCGH data. Two similarities to be used in the clustering and particularly suited for ordinal data are proposed, which are generalized to deal with weighted observations. In addition, a new form of linkage, especially suited for ordinal data, is introduced. In a simulation study, we show that the proposed clustering method is competitive with clustering using the continuous data. We illustrate WECCA using an application to a breast cancer data set, where WECCA finds a clustering that relates better to survival than the original one.</p>
]]></description>
<dc:creator><![CDATA[Van Wieringen, W. N., Van De Wiel, M. A., Ylstra, B.]]></dc:creator>
<dc:date>2007-12-22</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm048</dc:identifier>
<dc:title><![CDATA[Weighted clustering of called array CGH data]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-22</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm047v1?rss=1">
<title><![CDATA[Significance levels for studies with correlated test statistics]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm047v1?rss=1</link>
<description><![CDATA[
<p>When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.</p>
]]></description>
<dc:creator><![CDATA[Shi, J., Levinson, D. F., Whittemore, A. S.]]></dc:creator>
<dc:date>2007-12-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm047</dc:identifier>
<dc:title><![CDATA[Significance levels for studies with correlated test statistics]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm046v1?rss=1">
<title><![CDATA[Complementary hierarchical clustering]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm046v1?rss=1</link>
<description><![CDATA[
<p>When applying hierarchical clustering algorithms to cluster patient samples from microarray data, the clustering patterns generated by most algorithms tend to be dominated by groups of highly differentially expressed genes that have closely related expression patterns. Sometimes, these genes may not be relevant to the biological process under study or their functions may already be known. The problem is that these genes can potentially drown out the effects of other genes that are relevant or have novel functions. We propose a procedure called complementary hierarchical clustering that is designed to uncover the structures arising from these novel genes that are not as highly expressed. Simulation studies show that the procedure is effective when applied to a variety of examples. We also define a concept called relative gene importance that can be used to identify the influential genes in a given clustering. Finally, we analyze a microarray data set from 295 breast cancer patients, using clustering with the correlation-based distance measure. The complementary clustering reveals a grouping of the patients that is uncorrelated with a number of known prognostic signatures and that has significantly differing distant metastasis-free probabilities.</p>
]]></description>
<dc:creator><![CDATA[Nowak, G., Tibshirani, R.]]></dc:creator>
<dc:date>2007-12-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm046</dc:identifier>
<dc:title><![CDATA[Complementary hierarchical clustering]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-18</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm044v1?rss=1">
<title><![CDATA[Monitoring late-onset toxicities in phase I trials using predicted risks]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm044v1?rss=1</link>
<description><![CDATA[
<p>Late-onset (LO) toxicities are a serious concern in many phase I trials. Since most dose-limiting toxicities occur soon after therapy begins, most dose-finding methods use a binary indicator of toxicity occurring within a short initial time period. If an agent causes LO toxicities, however, an undesirably large number of patients may be treated at toxic doses before any toxicities are observed. A method addressing this problem is the time-to-event continual reassessment method (TITE-CRM, Cheung and Chappell, 2000). We propose a Bayesian dose-finding method similar to the TITE-CRM in which doses are chosen using time-to-toxicity data. The new aspect of our method is a set of rules, based on predictive probabilities, that temporarily suspend accrual if the risk of toxicity at prospective doses for future patients is unacceptably high. If additional follow-up data reduce the predicted risk of toxicity to an acceptable level, then accrual is restarted, and this process may be repeated several times during the trial. A simulation study shows that the proposed method provides a greater degree of safety than the TITE-CRM, while still reliably choosing the preferred dose. This advantage increases with accrual rate, but the price of this additional safety is that the trial takes longer to complete on average.</p>
]]></description>
<dc:creator><![CDATA[Bekele, B. N., Ji, Y., Shen, Y., Thall, P. F.]]></dc:creator>
<dc:date>2007-12-14</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm044</dc:identifier>
<dc:title><![CDATA[Monitoring late-onset toxicities in phase I trials using predicted risks]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-14</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm045v1?rss=1">
<title><![CDATA[Sparse inverse covariance estimation with the graphical lasso]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm045v1?rss=1</link>
<description><![CDATA[
<p>We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm&mdash;the <I>graphical lasso</I>&mdash;that is remarkably fast: It solves a 1000-node problem (~500000 parameters) in at most a minute and is 30&ndash;4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and B&uuml;hlmann (2006). We illustrate the method on some cell-signaling data from proteomics.</p>
]]></description>
<dc:creator><![CDATA[Friedman, J., Hastie, T., Tibshirani, R.]]></dc:creator>
<dc:date>2007-12-12</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm045</dc:identifier>
<dc:title><![CDATA[Sparse inverse covariance estimation with the graphical lasso]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-12</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm041v1?rss=1">
<title><![CDATA[Predicting renal graft failure using multivariate longitudinal profiles]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm041v1?rss=1</link>
<description><![CDATA[
<p>Patients who have undergone renal transplantation are monitored longitudinally at irregular time intervals over 10 years or more. This yields a set of biochemical and physiological markers containing valuable information to anticipate a failure of the graft. A general linear, generalized linear, or nonlinear mixed model is used to describe the longitudinal profile of each marker. To account for the correlation between markers, the univariate mixed models are combined into a multivariate mixed model (MMM) by specifying a joint distribution for the random effects. Due to the high number of markers, a pairwise modeling strategy, where all possible pairs of bivariate mixed models are fitted, is used to obtain parameter estimates for the MMM. These estimates are used in a Bayes rule to obtain, at each point in time, the prognosis for long-term success of the transplant. It is shown that allowing the markers to be correlated can improve this prognosis.</p>
]]></description>
<dc:creator><![CDATA[Fieuws, S., Verbeke, G., Maes, B., Vanrenterghem, Y.]]></dc:creator>
<dc:date>2007-12-03</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm041</dc:identifier>
<dc:title><![CDATA[Predicting renal graft failure using multivariate longitudinal profiles]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-12-03</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm042v1?rss=1">
<title><![CDATA[MOST: detecting cancer differential gene expression]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm042v1?rss=1</link>
<description><![CDATA[
<p>We propose a new statistic for the detection of differentially expressed genes when the genes are activated only in a subset of the samples. Statistics designed for this unconventional circumstance have proved to be valuable for most cancer studies, where oncogenes are activated for a small number of disease samples. Previous efforts made in this direction include cancer outlier profile analysis (Tomlins <I>and others</I>, 2005), outlier sum (Tibshirani and Hastie, 2007), and outlier robust <I>t</I>-statistics (Wu, 2007). We propose a new statistic called maximum ordered subset <I>t</I>-statistics (MOST), which seems to be natural when the number of activated samples is unknown. We compare MOST to other statistics and find that the proposed method often has more power than its competitors.</p>
]]></description>
<dc:creator><![CDATA[Lian, H.]]></dc:creator>
<dc:date>2007-11-29</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm042</dc:identifier>
<dc:title><![CDATA[MOST: detecting cancer differential gene expression]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-11-29</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm038v1?rss=1">
<title><![CDATA[The separation of timescales in Bayesian survival modeling of the time-varying effect of a time-dependent exposure]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm038v1?rss=1</link>
<description><![CDATA[
<p>In this paper, we apply flexible Bayesian survival analysis methods to investigate the risk of lymphoma associated with kidney transplantation among patients with end-stage renal disease. Of key interest is the potentially time-varying effect of a time-dependent exposure: transplant status. Bayesian modeling of the baseline hazard and the effect of transplant requires consideration of 2 timescales: time since study start and time since transplantation, respectively. Previous related work has not dealt with the separation of multiple timescales. Using a hierarchical model for the hazard function, both timescales are incorporated via conditionally independent stochastic processes; smoothing of each process is specified via intrinsic conditional Gaussian autoregressions. Features of the corresponding posterior distribution are evaluated from draws obtained via a Metropolis&ndash;Hastings&ndash;Green algorithm.</p>
]]></description>
<dc:creator><![CDATA[Haneuse, S. J.-P. A., Rudser, K. D., Gillen, D. L.]]></dc:creator>
<dc:date>2007-11-19</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm038</dc:identifier>
<dc:title><![CDATA[The separation of timescales in Bayesian survival modeling of the time-varying effect of a time-dependent exposure]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-11-19</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/kxm039v1?rss=1">
<title><![CDATA[Genetic model selection in two-phase analysis for case control association studies]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/kxm039v1?rss=1</link>
<description><![CDATA[
<p>The Cochran&ndash;Armitage trend test (CATT) is well suited for testing association between a marker and a disease in case&ndash;control studies. When the underlying genetic model for the disease is known, the CATT optimal for the genetic model is used. For complex diseases, however, the genetic models of the true disease loci are unknown. In this situation, robust tests are preferable. We propose a two-phase analysis with model selection for the case&ndash;control design. In the first phase, we use the difference of Hardy&ndash;Weinberg disequilibrium coefficients between the cases and the controls for model selection. Then, an optimal CATT corresponding to the selected model is used for testing association. The correlation of the statistics used for selection and the test for association is derived to adjust the two-phase analysis with control of the Type-I error rate. The simulation studies show that this new approach has greater efficiency robustness than the existing methods.</p>
]]></description>
<dc:creator><![CDATA[Zheng, G., Ng, H. K. T.]]></dc:creator>
<dc:date>2007-11-13</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm039</dc:identifier>
<dc:title><![CDATA[Genetic model selection in two-phase analysis for case control association studies]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:publicationDate>2007-11-13</prism:publicationDate>
<prism:section>Article</prism:section>
</item>

</rdf:RDF>