Biostatistics (2004), 5, 2, pp. 177-191
Biostatistics Vol. 5 No. 2 © Oxford University Press 2004; all rights reserved.

Generalized additive models for cancer mapping with incomplete covariates

Jonathan L. French†

Biostatistics, Global Research and Development, Pfizer, Inc., 50 Pequot Avenue, New London, CT, 06320, USA,
Jonathan_L_French@groton.pfizer.com

Matthew P. Wand

Harvard School of Public Health, Boston, USA

† To whom correspondence should be addressed.

Maps depicting cancer incidence rates have become useful tools in public health research, giving valuable information about the spatial variation in rates of disease. Typically, these maps are generated using count data aggregated over areas such as counties or census blocks. However, with the proliferation of geographic information systems and related databases, it is becoming easier to obtain exact spatial locations for the cancer cases and suitable control subjects. The use of such point data allows us to adjust for individual-level covariates, such as age and smoking status, when estimating the spatial variation in disease risk. Unfortunately, such covariate information is often subject to missingness. We propose a method for mapping cancer risk when covariates are not completely observed. We model these data using a logistic generalized additive model. Estimates of the linear and non-linear effects are obtained using a mixed effects model representation. We develop an EM algorithm to account for missing data and the random effects. Since the expectation step involves an intractable integral, we estimate the E-step with a Laplace approximation. This framework provides a general method for handling missing covariate values when fitting generalized additive models. We illustrate our method through an analysis of cancer incidence data from Cape Cod, Massachusetts. These analyses demonstrate that standard complete-case methods can yield biased estimates of the spatial variation of cancer risk.

Keywords: Binary response; Expectation Maximization; Generalized linear model; Laplace approximation; Logistic regression; Method of weights; Missing data
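
The abstract notes that the expectation step involves an intractable integral, which is handled with a Laplace approximation. As a rough illustration of that idea only (not the authors' implementation), the sketch below approximates the integral of exp(h(u)) over a single random intercept u in a toy logistic model. The data `y`, the fixed linear predictors `eta`, the variance `sigma2`, and all function names are hypothetical choices made for this example; the paper's integrals are over missing covariates and spline random effects and are higher-dimensional.

```python
import math

def laplace_integral(h, dh, d2h, u0=0.0, tol=1e-10, max_iter=100):
    """Laplace approximation to the integral of exp(h(u)) over the real line.

    Assumes h is smooth with a single maximum where d2h < 0.
    """
    u = u0
    for _ in range(max_iter):
        step = dh(u) / d2h(u)      # Newton step towards the mode of h
        u -= step
        if abs(step) < tol:
            break
    curvature = -d2h(u)            # negative second derivative at the mode
    return math.exp(h(u)) * math.sqrt(2.0 * math.pi / curvature)

# Hypothetical toy data: binary responses, fixed linear predictors,
# and a normal random intercept with variance sigma2.
y = [1, 0, 1, 1]
eta = [0.2, -0.5, 0.1, 0.4]
sigma2 = 1.0

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def h(u):
    # Logistic log-likelihood plus the (unnormalized) normal log-density of u.
    loglik = sum(yi * (ei + u) - math.log1p(math.exp(ei + u))
                 for yi, ei in zip(y, eta))
    return loglik - u * u / (2.0 * sigma2)

def dh(u):
    return sum(yi - expit(ei + u) for yi, ei in zip(y, eta)) - u / sigma2

def d2h(u):
    return -sum(expit(ei + u) * (1.0 - expit(ei + u)) for ei in eta) - 1.0 / sigma2

def trapezoid(f, lo, hi, n):
    # Brute-force reference quadrature, feasible only in one dimension.
    width = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * width) for i in range(1, n))
    return total * width

approx = laplace_integral(h, dh, d2h)
reference = trapezoid(lambda u: math.exp(h(u)), -10.0, 10.0, 4000)
print(approx, reference)
```

In one dimension the quadrature check is cheap, which makes it easy to see how close the approximation is. The Laplace approach becomes attractive when the integral has many dimensions and grid-based quadrature is no longer practical.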

