Biostatistics Advance Access originally published online on March 10, 2006
Biostatistics 2006 7(4):585-598; doi:10.1093/biostatistics/kxj027
Published by Oxford University Press 2006.
Pooling biospecimens and limits of detection: effects on ROC curve analysis
Division of Epidemiology, Statistics & Prevention, NICHD, NIH, DHHS, Bethesda, MD 20892, USA and Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
Division of Epidemiology, Statistics & Prevention, NICHD, NIH, DHHS, Bethesda, MD 20892, USA schistee{at}mail.nih.gov
* To whom correspondence should be addressed.
Epidemiological studies frequently face two restrictions in the evaluation of biomarkers: cost and instrument sensitivity. Costs can hamper the evaluation of the effectiveness of new biomarkers. In addition, many assays are affected by a limit of detection (LOD), which depends on the instrument sensitivity. Two common cost-cutting strategies are taking a random sample of the available specimens and pooling biospecimens. We compare these two sampling strategies when an LOD effect exists, using as the criterion the efficiency of receiver operating characteristic (ROC) curve analysis, specifically the estimation of the area under the ROC curve (AUC) for normally distributed markers. We propose and examine a method to estimate the AUC from pooled and unpooled samples when an LOD is in effect. We conclude that pooling is the most efficient cost-cutting strategy when the LOD affects less than 50% of the data. However, when much more than 50% of the data are affected, the pooling design is not recommended.
Keywords: Limit of detection; Maximum likelihood; Pooling design; Receiver operating characteristics; Sampling
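The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator, only the standard ingredients such a method would plausibly combine. It rests on three assumptions that the abstract does not state: a pooled measurement is the arithmetic mean of `pool_size` independent individual values (so its variance is the individual variance divided by `pool_size`), measurements below the LOD are left-censored with only their count known, and the markers in cases and controls are independent. The function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and s.d. sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def binormal_auc(mu_cases, sigma_cases, mu_controls, sigma_controls):
    """AUC = P(case marker > control marker) for independent normal markers."""
    spread = math.sqrt(sigma_cases ** 2 + sigma_controls ** 2)
    return norm_cdf((mu_cases - mu_controls) / spread)


def censored_pooled_loglik(mu, sigma, observed, n_below_lod, lod, pool_size=1):
    """Log-likelihood of individual-level (mu, sigma) from pooled assays.

    observed    : pooled measurements at or above the LOD
    n_below_lod : number of pooled measurements reported as below the LOD
    Assumption  : each pooled value is the mean of pool_size individual
                  values, hence normal with s.d. sigma / sqrt(pool_size).
    """
    sigma_pool = sigma / math.sqrt(pool_size)
    loglik = sum(norm_logpdf(x, mu, sigma_pool) for x in observed)
    if n_below_lod > 0:
        # Each censored value contributes P(pooled measurement < LOD).
        p_below = max(norm_cdf((lod - mu) / sigma_pool), 1e-300)
        loglik += n_below_lod * math.log(p_below)
    return loglik
```

Under these assumptions, one would maximize `censored_pooled_loglik` separately for cases and controls with any numerical optimizer and plug the resulting estimates into `binormal_auc`. Setting `pool_size=1` covers the unpooled random-sample design.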