Biostatistics Advance Access published online on September 12, 2007
Biostatistics, doi:10.1093/biostatistics/kxm031
Stochastic segmentation models for array-based comparative genomic hybridization data analysis
Department of Statistics and Cancer Center, Stanford University, Stanford, CA 94305-4065, USA
Department of Statistics, Columbia University, New York, NY 10027, USA
Department of Statistics, Stanford University, Stanford, CA 94305-4065, USA nzhang@stat.stanford.edu
* To whom correspondence should be addressed.
Array-based comparative genomic hybridization (array-CGH) is a high-throughput, high-resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimating the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions in which every location has the same copy number. For the analysis of array-CGH data, we propose a new stochastic segmentation model and an associated estimation procedure that have attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Our method can also estimate other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation. We propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher-density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.
Keywords: Array-CGH; Bayesian inference; Hidden Markov models; Jump probabilities
Received October 10, 2006; revised June 4, 2007; accepted for publication July 11, 2007.
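The abstract's idea of estimating the signal by posterior means, without committing to a segmentation, can be illustrated with a much simpler stand-in. The sketch below is not the paper's stochastic segmentation model: it assumes a finite-state hidden Markov model with known signal levels, a fixed probability of staying at the current level, and Gaussian noise with known standard deviation. The function name `posterior_mean_signal`, its parameters, and the simulated data are all hypothetical. For a fixed number of levels, the forward-backward recursion runs in time linear in the sequence length.

```python
import numpy as np


def posterior_mean_signal(y, levels, stay_prob, sigma):
    """Posterior mean of a piecewise-constant signal under a toy HMM.

    Illustrative assumption only: hidden states are the known `levels`,
    the chain stays put with probability `stay_prob` and otherwise jumps
    uniformly to another level, and noise is Gaussian with sd `sigma`.
    """
    y = np.asarray(y, dtype=float)
    levels = np.asarray(levels, dtype=float)
    n, k = len(y), len(levels)  # requires k >= 2

    # Transition matrix: stay_prob on the diagonal, the rest spread evenly.
    trans = np.full((k, k), (1.0 - stay_prob) / (k - 1))
    np.fill_diagonal(trans, stay_prob)

    # Gaussian emission likelihoods, rescaled per row to avoid underflow.
    logp = -0.5 * ((y[:, None] - levels[None, :]) / sigma) ** 2
    emit = np.exp(logp - logp.max(axis=1, keepdims=True))

    # Forward pass with per-step normalization (uniform initial state).
    alpha = np.empty((n, k))
    alpha[0] = emit[0] / k
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):
        alpha[t] = emit[t] * (alpha[t - 1] @ trans)
        alpha[t] /= alpha[t].sum()

    # Backward pass with per-step normalization.
    beta = np.ones((n, k))
    for t in range(n - 2, -1, -1):
        beta[t] = trans @ (emit[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # Marginal state posteriors and the resulting posterior mean signal.
    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post @ levels, post


# Simulated log-ratio sequence with one elevated region (made-up values).
rng = np.random.default_rng(0)
truth = np.concatenate([np.zeros(50), np.full(30, 0.58), np.zeros(40)])
y = truth + rng.normal(0.0, 0.2, size=truth.size)
est, post = posterior_mean_signal(
    y, levels=[-1.0, 0.0, 0.58], stay_prob=0.98, sigma=0.2
)
```

Here `est` is a smoothed estimate of the signal at every location, and each row of `post` gives the posterior probability of each level at that location, which is the kind of quantity one could use to judge how much confidence to place in a proposed segment.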