Biostatistics Advance Access originally published online on September 12, 2006
Biostatistics 2007 8(2):468-473; doi:10.1093/biostatistics/kxl024
Numerical equivalence of imputing scores and weighted estimators in regression analysis with missing covariates
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, PO Box 19024, Seattle, WA 98109-1024, USA cywang{at}fhcrc.org
Department of Statistics, Feng-Chia University, Taichung, Taiwan, Republic of China
Insightful Corporation, 1700 Westlake Avenue North, Suite 500, Seattle, WA 98109, USA
* To whom correspondence should be addressed.
Imputation, weighting, direct likelihood, and direct Bayesian inference (Rubin, 1976) are important approaches for missing-data regression. Many useful semiparametric estimators have been developed for regression analysis of data with missing covariates or outcomes. It has been established that some semiparametric estimators are asymptotically equivalent, but it has not been shown that many are numerically the same. We applied some existing methods to a bladder cancer case-control study and noted that they were numerically the same when the observed covariates and outcomes were categorical. To understand the analytical background of this finding, we further show that when observed covariates and outcomes are categorical, some estimators are not only asymptotically equivalent but also numerically identical. That is, although their estimating equations are different, they lead numerically to exactly the same root. These estimators include a simple weighted estimator, an augmented weighted estimator, and a mean-score estimator. The numerical equivalence may elucidate the relationship between imputing scores and weighted estimation procedures.
Keywords: Estimating equation; Ignorable missingness; Inverse selection probability; Missing at random
Received May 1, 2006; revised August 25, 2006; accepted for publication September 8, 2006.
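The numerical equivalence stated in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a logistic model with a binary outcome Y, a binary always-observed covariate Z, and a binary covariate X that is missing at random, and it assumes that both the selection probabilities and the conditional expected scores are estimated by simple frequencies within each observed (Y, Z) cell. The data, the function names (`score`, `simulate`, `estimating_functions`), and all parameter values are hypothetical. Under these assumptions the three estimating functions agree at every value of the regression parameter, so they share the same root.

```python
import math
import random
from collections import defaultdict


def score(beta, y, x, z):
    """Logistic score contribution of one subject.

    Assumed model for this sketch: logit P(Y=1 | X, Z) = b0 + b1*X + b2*Z.
    """
    eta = beta[0] + beta[1] * x + beta[2] * z
    resid = y - 1.0 / (1.0 + math.exp(-eta))
    return [resid, resid * x, resid * z]


def simulate(n, seed=0):
    """Hypothetical data: Y and Z always observed, X observed only if delta == 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.randint(0, 1)
        x = 1 if rng.random() < 0.3 + 0.3 * z else 0
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * x + 0.5 * z)))
        y = 1 if rng.random() < p else 0
        # Missing at random: selection depends only on the observed (y, z).
        delta = 1 if rng.random() < 0.4 + 0.2 * y + 0.2 * z else 0
        rows.append((y, x, z, delta))
    return rows


def estimating_functions(beta, rows):
    """Return (simple weighted, augmented weighted, mean-score) sums at beta."""
    n_all = defaultdict(int)   # subjects per (y, z) cell
    n_val = defaultdict(int)   # subjects with X observed per cell
    cell_sum = defaultdict(lambda: [0.0, 0.0, 0.0])
    for y, x, z, delta in rows:
        cell = (y, z)
        n_all[cell] += 1
        if delta:
            n_val[cell] += 1
            s = score(beta, y, x, z)
            cell_sum[cell] = [a + b for a, b in zip(cell_sum[cell], s)]
    if any(n_val[c] == 0 for c in n_all):
        raise ValueError("every (y, z) cell needs at least one subject with X observed")

    # Cell-frequency estimates of the selection probability and of E[score | y, z].
    pi_hat = {c: n_val[c] / n_all[c] for c in n_all}
    cell_mean = {c: [t / n_val[c] for t in cell_sum[c]] for c in n_all}

    weighted = [0.0, 0.0, 0.0]
    augmented = [0.0, 0.0, 0.0]
    mean_score = [0.0, 0.0, 0.0]
    for y, x, z, delta in rows:
        cell = (y, z)
        m = cell_mean[cell]
        if delta:
            s = score(beta, y, x, z)
            w = 1.0 / pi_hat[cell]
            for k in range(3):
                weighted[k] += w * s[k]
                augmented[k] += w * s[k] + (1.0 - w) * m[k]
                mean_score[k] += s[k]
        else:
            # X is treated as unobserved here: only the imputed score is used.
            for k in range(3):
                augmented[k] += m[k]
                mean_score[k] += m[k]
    return weighted, augmented, mean_score


if __name__ == "__main__":
    data = simulate(2000, seed=1)
    for name, value in zip(("weighted", "augmented", "mean-score"),
                           estimating_functions([-1.0, 0.8, 0.5], data)):
        print(name, value)
```

The agreement follows from cell-level bookkeeping: within each (Y, Z) cell, imputing the average observed score for every subject with missing X is the same as inflating each observed score by the inverse of the cell's observed fraction, and the augmentation term sums to zero within the cell. With continuous observed variables or model-based selection probabilities, this exact cancellation would not be expected in general.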