Biostatistics Advance Access originally published online on May 25, 2005
Biostatistics 2005 6(4):633-652; doi:10.1093/biostatistics/kxi033
Regression analysis of longitudinal binary data with time-dependent environmental covariates: bias and efficiency
Department of Biostatistics, Vanderbilt University, S-2323 Medical Center North, Nashville, TN 37232-2158, USA jonathan.schildcrout{at}vanderbilt.edu
Department of Biostatistics, University of Washington, F-600 Health Sciences Building, Campus Mail Stop 357232, Seattle, WA 98195-7232, USA
* To whom correspondence should be addressed.
Generalized estimating equations (Liang and Zeger, 1986) is a widely used, moment-based procedure to estimate marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean for the response at time t to be expressed properly as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Although multiple lagged covariates may be predictive of outcomes, researchers often focus on parameters in a cross-sectional model, where the response is expressed as a function of a single lag in the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid the collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994) showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice for the response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias-efficiency trade-off associated with working correlation choices for application with binary response data. We investigate data characteristics or design features (e.g. cluster size, overall response association, functional form of the response association, covariate distribution, and others) that influence the small and large sample characteristics of parameter estimates obtained from several different weighting schemes or, equivalently, working covariance models.
We find that the impact of covariance model choice depends highly on the specific structure of the data features, and that key aspects should be examined before choosing a weighting scheme.
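The condition attributed above to Pepe and Anderson (1994) can be written out explicitly. The following is a sketch in standard GEE notation, not the authors' own formulation: the symbols (subject index i, time index t, cluster size n_i, the logit link, and the use of the same-time covariate X_it as the "single lag") are assumptions introduced here for illustration.

```latex
% Assumed notation: subjects i = 1, ..., N; times t = 1, ..., n_i;
% binary response Y_{it}; time-varying covariate X_{it}.

% Cross-sectional marginal mean model (logit link assumed):
\mu_{it}(\beta) = E[\,Y_{it} \mid X_{it}\,], \qquad
\operatorname{logit}\,\mu_{it}(\beta) = \beta_0 + \beta_1 X_{it}

% GEE estimating equation with working covariance V_i:
U(\beta) = \sum_{i=1}^{N} D_i^{\top} V_i^{-1}\,\{\,Y_i - \mu_i(\beta)\,\} = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}, \qquad
V_i = A_i^{1/2}\, R_i(\alpha)\, A_i^{1/2}

% Condition under which U(\beta) has mean zero for any working correlation R_i:
E[\,Y_{it} \mid X_{it}\,] = E[\,Y_{it} \mid X_{i1}, \ldots, X_{i n_i}\,]
```

When R_i is not diagonal, U(β) contains cross terms in which a function of the covariate at time s multiplies the residual at a different time t. Those terms have mean zero only if the last condition holds. Under working independence, V_i is diagonal, so each residual is paired only with its own same-time covariate, and a correctly specified cross-sectional mean is enough for unbiasedness.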
Keywords: ALR; Bias-variance trade-off; GEE; Longitudinal data; Marginal model



