Biostatistics Advance Access originally published online on May 25, 2005
Biostatistics 2005 6(4):633-652; doi:10.1093/biostatistics/kxi033

© The Author 2005. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

Regression analysis of longitudinal binary data with time-dependent environmental covariates: bias and efficiency

Jonathan S. Schildcrout*

Department of Biostatistics, Vanderbilt University, S-2323 Medical Center North, Nashville, TN 37232-2158, USA. E-mail: jonathan.schildcrout@vanderbilt.edu

Patrick J. Heagerty

Department of Biostatistics, University of Washington, F-600 Health Sciences Building, Campus Mail Stop 357232, Seattle, WA 98195-7232, USA

* To whom correspondence should be addressed.

Generalized estimating equations (GEE; Liang and Zeger, 1986) are a widely used, moment-based procedure for estimating marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean of the response at time t to be correctly specified as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Although multiple lagged covariates may be predictive of outcomes, researchers often focus on parameters in a ‘cross-sectional’ model, in which the response is expressed as a function of a single lag of the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid the collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994) showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice of response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias–efficiency trade-off associated with working correlation choices for binary response data. We investigate data characteristics and design features (e.g. cluster size, overall response association, functional form of the response association, and covariate distribution) that influence the small- and large-sample properties of parameter estimates obtained from several different weighting schemes or, equivalently, ‘working’ covariance models. We find that the impact of covariance model choice depends strongly on these data features, and that key aspects of the data should be examined before choosing a weighting scheme.
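
The condition attributed above to Pepe and Anderson (1994) can be written out as a short sketch. The notation here is an assumption introduced for illustration and does not come from the abstract: Y_it is the binary response for subject i at time t, X_it the time-varying covariate, mu_it(beta) the cross-sectional mean E[Y_it | X_it], and w_ist the (s,t) element of the inverse working covariance matrix V_i^{-1} for a cluster of size n_i.

```latex
% Illustrative notation (assumed, not taken from the abstract):
%   \mu_{it}(\beta) = E[Y_{it} \mid X_{it}]   cross-sectional mean
%   w_{ist}         = (s,t) element of V_i^{-1}, the inverse working covariance
\begin{align}
  U(\beta)
    &= \sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( Y_i - \mu_i(\beta) \bigr)
       \notag \\
    &= \sum_{i=1}^{N} \sum_{s=1}^{n_i} \sum_{t=1}^{n_i}
       \frac{\partial \mu_{is}}{\partial \beta} \, w_{ist} \,
       \bigl( Y_{it} - \mu_{it}(\beta) \bigr) = 0 .
\end{align}
% A sufficient condition for E[U(\beta)] = 0 under an arbitrary working covariance
% is that the full-covariate conditional mean equals the cross-sectional mean:
\begin{equation}
  E\bigl[ Y_{it} \mid X_{i1}, \dots, X_{i n_i} \bigr]
    = E\bigl[ Y_{it} \mid X_{it} \bigr] = \mu_{it}(\beta) .
\end{equation}
% Under working independence, w_{ist} = 0 for s \neq t, so only
% E[ Y_{it} - \mu_{it}(\beta) \mid X_{it} ] = 0 is required,
% which holds by the definition of the cross-sectional mean.
```

Read this way, the off-diagonal weights w_ist (s ≠ t) pair the residual at time t with covariate information from time s, which is where bias can enter when lagged or future covariate values carry additional information about Y_it. Working independence removes those cross terms at a possible cost in efficiency.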

Keywords: ALR; Bias–variance trade-off; GEE; Longitudinal data; Marginal model






