Biostatistics Advance Access published online on May 25, 2005

Biostatistics, doi:10.1093/biostatistics/kxi033
© The Author 2005. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.
Received January 15, 2004
Revised May 12, 2005
Accepted May 19, 2005

Article

Regression Analysis of Longitudinal Binary Data with Time-Dependent Environmental Covariates: Bias and Efficiency

Jonathan S. Schildcrout 1* and Patrick J. Heagerty 2

1 Department of Biostatistics, Vanderbilt University, S-2323 Medical Center North, Nashville, TN 37232-2158
2 University of Washington

* To whom correspondence should be addressed.
Jonathan S. Schildcrout, E-mail: jonathan.schildcrout{at}vanderbilt.edu


   Abstract

Generalized estimating equations (GEE; Liang and Zeger, 1986) are a widely used, moment-based procedure for estimating marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean of the response at time t to be expressed properly as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Although multiple lagged covariates may be predictive of outcomes, researchers often focus on parameters in a "cross-sectional" model, in which the response is expressed as a function of a single lag in the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid the collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994) showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice of response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias-efficiency tradeoff associated with working correlation choices for binary response data. We investigate data characteristics and design features (e.g. cluster size, overall response association, functional form of the response association, and covariate distribution) that influence the small- and large-sample characteristics of parameter estimates obtained from several different weighting schemes or, equivalently, "working" covariance models. We find that the impact of covariance model choice depends strongly on the specific structure of the data, and that key aspects should be examined before choosing a weighting scheme.
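
The condition described above can be written out explicitly. The following is only a sketch in notation assumed for illustration (Y_{it}, X_{it}, n_i, N, the link g, D_i, V_i and the weights w_{i,st} do not appear in the abstract); it shows why off-diagonal weights in the working covariance are the source of the potential bias, while working independence is not.

```latex
% Assumed notation: Y_{it} is the binary response and X_{it} the time-varying
% covariate for subject i = 1, ..., N at time t = 1, ..., n_i.

% Cross-sectional marginal mean model with a generic link g (assumed form):
\mu_{it}(\beta) = E(Y_{it} \mid X_{it}), \qquad
g\{\mu_{it}(\beta)\} = \beta_0 + \beta_1 X_{it}

% GEE with working covariance V_i and D_i = \partial \mu_i / \partial \beta^{\top}:
U(\beta) = \sum_{i=1}^{N} D_i^{\top} V_i^{-1} \{ Y_i - \mu_i(\beta) \} = 0

% Writing w_{i,st} for the (s,t) entry of V_i^{-1} and
% d_{is} = \partial \mu_{is} / \partial \beta, the same equation expands to
U(\beta) = \sum_{i=1}^{N} \sum_{s=1}^{n_i} \sum_{t=1}^{n_i}
           d_{is} \, w_{i,st} \, \{ Y_{it} - \mu_{it}(\beta) \}

% The condition of Pepe and Anderson (1994): the mean given the full
% covariate history equals the cross-sectional mean,
E(Y_{it} \mid X_{it}) = E(Y_{it} \mid X_{i1}, \dots, X_{i n_i})
```

Under working independence V_i is diagonal, so only the s = t terms remain; each of these involves X_{it} alone and has mean zero whenever the cross-sectional model is correct. With a non-diagonal V_i, a term with s ≠ t multiplies the residual at time t by a function of X_{is}, and its expectation is zero in general only if the displayed condition holds.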

Keywords: bias-variance tradeoff; marginal model; longitudinal data; GEE; ALR.