Skip Navigation


Biostatistics Advance Access originally published online on October 24, 2006
Biostatistics 2007 8(3):654-673; doi:10.1093/biostatistics/kxl036
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow All Versions of this Article:
8/3/654    most recent
kxl036v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Disclaimer
Google Scholar
Right arrow Articles by Leitenstorfer, F.
Right arrow Articles by Tutz, G.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Leitenstorfer, F.
Right arrow Articles by Tutz, G.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Generalized monotonic regression based on B-splines with an application to air pollution data

Florian Leitenstorfer* and Gerhard Tutz

Department of Statistics, Ludwig-Maximilians-Universität München, Akademiestraße 1, 80799 München, Germany florian.leitenstorfer{at}stat.uni-muenchen.de

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 
In many studies, it is known that one or more of the covariates have a monotonic effect on the response variable. In these circumstances, standard fitting methods for generalized additive models (GAMs) generate implausible results. A fitting procedure is proposed that incorporates monotonicity assumptions on one or more smooth components within a GAM framework. The algorithm uses the monotonicity restriction for B-spline coefficients and provides componentwise selection of smooth components. Stopping criteria and approximate pointwise confidence bands are derived. The method is applied to the data from a study conducted in the metropolitan area of São Paulo, Brazil, where the influence of several air pollutants like SO2 on respiratory mortality is investigated.

Keywords: Air pollution data; Generalized additive models; Likelihood-based boosting; Monotonic regression


    1. INTRODUCTION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 
In many biometrical problems where generalized smooth regression methods are used, a monotonic relationship between one or more explanatory variables and the response variable has to be assumed. A typical problem of this type which will be considered more closely arises in studies, where the influence of air pollution on mortality or illness is investigated (see, e.g. Schwartz, 1994bGo, or Conceição and others, 2001Go). In these analyses, an increase of deaths or cases of illness is expected with an increasing concentration of a specific pollutant. When standard smoothing techniques, like spline smoothing (Green and Silverman, 1994Go) or local polynomial fitting (Fan and Gijbels, 1996Go), are applied to data of this type in a generalized additive modeling approach, the fitted curves may lead to unconvincing results. In the following, it is proposed to incorporate knowledge about monotonic relationships in the estimation by using monotonic regression methods. Other important biometrical problems that require monotonic regression techniques are the estimation of dose–response functions (e.g. Kelly and Rice, 1991Go, Lee, 1996Go, or Dilleen and others, 2003Go) and the estimation of human growth curves (e.g. Ducharme and Fontez, 2004Go).

Starting from the pool adjacent violators algorithm (PAVA) (see, e.g. Robertson and others, 1988Go) which produces a step function, a variety of methods have been developed to smooth the PAVA results, so as to obtain a smooth estimate of the underlying monotonic function. Details of such approaches, which are mainly based on kernel regression techniques, are given in Friedman and Tibshirani (1984)Go, Mukerjee (1988)Go, or Mammen and others (2001)Go. Alternative approaches, which will be pursued in the following, are based on the expansion of a monotonic function into a sum of basis functions, that is, f = {sum}j{alpha}jBj. To assure monotonicity of the estimate, constraints have to be put on the coefficients {alpha}j. Ramsay (1988)Go suggests the use of monotonic basis functions (integrated splines), while Kelly and Rice (1991)Go propose a B-spline basis. As the B-spline approach has become very popular in nonparametric regression (see Eilers and Marx, 1996Go), we will focus on the latter.

Many publications on monotonic regression focus on unidimensional smoothing problems with a Gaussian response variable y (see, e.g. Ramsay, 1998Go, Zhang, 2004Go, or Turlach, 2005Go). In the example considered here, as in many ecological or biometrical applications, one has multiple covariates x' = (x1,...,xp), and only for some of the covariates, a monotonic relationship to E(y|x) has to be assumed. Furthermore, the response variables are typically binary or count data, which are considered as binomial or Poisson distributed. Because little work has been done on monotonic regression in a generalized linear model (GLM) context, least-squares approaches have often been used in such cases (see, e.g. Kelly and Rice, 1991Go), which lead to unsatisfactory results. Flexible modeling tools are needed where monotonicity restrictions can easily be incorporated into a generalized additive model (GAM) framework.

Recently, boosting approaches became increasingly important in nonparametric regression (see, e.g. Bühlmann and Yu, 2003Go). As demonstrated by Tutz and Leitenstorfer (2006)Go, monotonicity restrictions are easy to include in likelihood-based algorithms for generalized response problems by componentwise boosting of monotonic basis functions in each step. In the present paper, we suggest boosting based on B-spline basis functions, rather than using monotonic basis functions as in Tutz and Leitenstorfer (2006)Go or Ramsay (1988)Go. When using B-splines, the monotonicity condition of the estimate is preserved in a different way. A special update scheme for the basis coefficients is proposed which shows good performance. It should be noted that the proposed method avoids the use of algorithms which handle inequality constraints. Procedures of this type typically are computationally burdensome and often yield unstable estimates. From a Bayesian perspective, a B-spline approach to monotonic regression has been suggested by Brezger and Steiner (2004)Go.

We illustrate generalized monotonic regression techniques on a data set that has previously been analyzed by Conceição and others (2001)Go, Singer and others (2002)Go, and Einbeck and others (2004)Go. The data have been collected to evaluate the association between mortality due to respiratory causes and the concentration of various air pollutants in the city of São Paulo, Brazil, from 1994 to 1997. The analysis presented here focuses on the effect on people older than 65 years, which is one of the most susceptible population segments. GAMs are frequently used for the analysis of such data, where one first aims at controlling for seasonal and weather effects and then includes an air pollution variable in the model (see Schwartz, 1994bGo). There are numerous publications that suggest that respiratory mortality increases with the concentration of air pollutants (see, e.g. Schwartz, 1994aGo, for a review). Under this assumption, it seems sensible to estimate smooth effects of pollution concentration on mortality under a monotonicity constraint, resulting in more reliable fits.

In Section 2, the concept of monotonic likelihood boosting based on B-splines is introduced, and an extension to multiple covariate settings is given. In Section 3, the performance of our approach is evaluated in various simulation studies. In Section 4, we take a closer look on the data set mentioned above. Note that throughout the paper, we take monotonic to mean nondecreasing.


    2. BOOSTING B-SPLINES IN GENERALIZED MONOTONIC REGRESSION
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 

2.1. Monotonicity constraints for B-splines

First, we consider a generalized smooth monotonic regression problem with dependent variable y that can be non-Gaussian and a single covariate x. As in GLMs (e.g. McCullagh and Nelder, 1989Go), it is assumed that yi|xi has a distribution from a simple exponential family f(yi|xi) = exp{[yi{theta}ib({theta}i)]/{varphi} + c(yi,{varphi})}, where {theta}i is the canonical parameter and {varphi} denotes the dispersion parameter. The link between µi = E(yi|xi) and the explanatory variable xi is determined by µi = h({eta}i), where h is a given response function which is strictly monotone (the inverse of the link function g = h – 1), and the predictor {eta}i = {eta}(xi) is a function of x. While in GLMs, {eta}(x) is assumed to be a linear predictor, here more generally it is assumed that {eta}(x) = f(x) is a smooth function that satisfies the monotonicity condition

Formula (2.1)

Obviously, monotonicity in {eta} transforms into monotonicity in the means.

Due to their flexibility, smoothing methods based on B-splines are a common tool in statistics (see, e.g. Eilers and Marx, 1996Go). Such approaches are based on an expansion of f into B-spline basis functions, where a sequence of knots {tj} is placed equidistantly within the range [xm in,xmax]. With Formula denoting the number of interior knots, one obtains the linear term

Formula (2.2)

where q denotes the degree of the B-splines and Formula (the number of basis functions). An algorithm for the computation of B-splines of degree q is given in De Boor (1978)Go. Monotonicity can be assured in the following way. Suppose we have B-splines of degree q ≥ 1. Let h be the distance between the equally spaced knots. Then the derivative {eta}'(x) = {partial}{eta}(x)/{partial}x can be written as

Formula

for a proof see De Boor (1978)Go. Since Bj(x,q – 1) ≥ 0, it follows from

Formula (2.3)

that {eta}'(x) ≥ 0 holds. In other words, since (2.3) is a sufficient condition for the monotonicity of {eta}(x), the sequence of coefficients {alpha}j has to be nondecreasing in order to obtain monotonic functions. This property of B-splines has been previously exploited by Kelly and Rice (1991)Go and Brezger and Steiner (2004)Go in a monotonic regression setting. Note that throughout this paper, we work with B-splines that are centered by their corresponding integral.

2.2. An outline of the algorithm

Boosting was originally introduced within the machine learning community (e.g. Schapire, 1990Go) for classification problems. More recently, the approach has been extended to regression modeling with a continuous dependent variable (e.g. Bühlmann and Yu, 2003Go, Bühlmann, 2006Go). The basic idea is to fit a function iteratively by fitting in each stage a "weak" learner to the current residual. In componentwise boosting as proposed by Bühlmann and Yu (2003)Go, only the contribution of one variable is updated in one step. In contrast to these approaches, we propose to update a specific simplification of the predictor which makes it easy to control the monotonicity restriction.

For simplicity, in the following the degree q of the B-splines is suppressed. In matrix notation, the data are given by y = (y1,...,yn)' and x = (x1,...,xn)'. Based on the expansion into basis functions, the data set may be collected in matrix form (y,B), where B = (B1(x),...,Bm(x)),Bj(x) = (Bj(x1),...,Bj(xn))'.

The residual model that is fitted by weak learners in one iteration step uses a grouping of B-splines. One considers for r = 1,...,m – 1 the simplified model with predictor

Formula (2.4)

When fitting model (2.4), the monotonicity constraint is easily checked by comparing the estimates Formula and Formula, since monotonicity follows from Formula. Given an estimate from the previous fitting

Formula

refitting is performed by

Formula

It is obvious that Formula is monotonic if the estimates fulfill Formula, provided that the previous estimate Formula was monotonic. The grouping of basis functions into B1,...,Br and Br + 1,...,Bm, the effect of which is adapted by the amount {alpha}1(r) in the first and {alpha}2(r) in the second group, allows monotonicity to be controlled in a simple way. Fitting a full model with m new parameters would imply much more computational effort. Furthermore, if individual constraints are imposed on each of the m parameters, problems with convergence might occur. Instead, the possible groupings (r = 1,...,m – 1) are evaluated and by analogy with componentwise boosting, the best refit is selected. The grouping of B-splines can be derived as a restricted model in the sense of restricted least-squares estimators in linear models (see Theil and Goldberger, 1961Go). In the usual form of a smoothed estimate based on B-splines, model (2.4) is given by

Formula (2.5)

with the constraints {alpha}1 = ··· = {alpha}r = {alpha}1(r), {alpha}r + 1 = ··· = {alpha}m = {alpha}2(r). The constraints specify that blocks of r and m r parameters have to be identical.

Before giving the algorithm, which is based on likelihood-based boosting strategies as proposed by Tutz and Binder (2006)Go, the fit of model (2.4) is embedded into the framework of penalized likelihood estimation. Moreover, the model is generalized to a model that contains parametric effects in addition to the smooth monotonic effects. Thus, the intercept term is replaced by z'{alpha}0, where z is a vector of covariates and {alpha}0 is an unknown parameter vector (possibly specifying only the intercept). It is assumed that zi always contains an intercept. In order to avoid identifiability issues which arise for B-splines in connection with an intercept, the update step is split up into 2 parts. In the first part, the smooth component is updated and in the second, the parametric term. In the first part, one fits by penalized likelihood. Therefore, one considers

Formula

with 0r, 1r denoting the vectors of length r containing 0s and 1s only. Then the linear predictor may be represented in matrix form by {eta}(x) = B(r){alpha}(r), where B(r) = BR(r) and {alpha}(r) = ({alpha}1(r),{alpha}2(r))'. It is proposed that in each boosting step, the model is estimated by the 1-step Fisher scoring based on generalized ridge regression (Marx and others, 1992Go). Standard ridge regression maximizes the penalized log-likelihood

Formula

where li({alpha}(r)) = li(h(B(r){alpha}(r))) is the usual log-likelihood contribution of the ith observation and P({alpha}(r)) = ({lambda}/2){alpha}Formula{alpha}(r) represents the penalty term with ridge parameter {lambda}. Model (2.5) is asymmetric in a specific sense. If, for example, r = 2, the first constraint {alpha}1 = {alpha}2 concerns only 2 parameters, whereas the second constraint {alpha}3 = ··· = {alpha}m concerns m – 2 parameters, which for m = 22 means 20 parameters are restricted. To account for this asymmetry, we tried the modified penalty Formula where the parameters are weighted by the number of parameters that are implicitly considered as identical. However, numerical examples showed that the performance of the modified penalty is nearly the same as for the usual ridge penalty Formula. Hence, in the following we use this simpler scheme, which in matrix form leads to the penalized log-likelihood

Formula

where {Lambda} = diag{1,1} and {lambda} > 0 represents the ridge parameter. Derivation yields the corresponding penalized score function

Formula (2.6)

with W({eta}) = D2({eta}){Sigma}({eta}) – 1, D({eta}) = diag{{partial}h({eta}1)/{partial}{eta},...,{partial}h({eta}n)/{partial}{eta}}, {Sigma}({eta}) = diag{{sigma}Formula,...,{sigma}Formula}, {sigma}Formula = var(yi), all of them evaluated at the current value of {eta}. The monotonicity constraint from (2.3) is incorporated by taking into account only estimates which fulfill Formula. It is easily seen that the update scheme given below yields the desired nondecreasing sequences of estimates Formula in each boosting iteration. The update of the parametric term z'{alpha}0 is performed in the same way but without penalization and with the design matrix determined by Z = (1,z1,...,zu), where z1,...,zu denote the observed covariate values.

Monotonic Likelihood Boosting for B-splines

Step 1: (Initialization)

Set Formula, Formula, Formula, and Formula.

Step 2: (Iteration)

For l = 1,2,...

1) Fitting step, monotone component
For r = 1,···,m – 1, compute the modified ridge estimate based on the 1-step Fisher scoring,

Formula (2.7)

where Formula, Formula, Formula, and Formula. Let Formula denote the candidates that fulfill the monotonicity constraint. If A = {emptyset}, stop. Otherwise continue with Step 2.

2) Selection step and update, monotone component
Compute the potential update of the linear predictor, Formula, r{epsilon}{1,···,m – 1}. Choose rl{epsilon}A such that the deviance is minimized, that is

Formula

where Formula. Set

Formula (2.8)


3) Fitting step and update, parametric term
Based on the 1-step Fisher scoring, one obtains

Formula

where Formula and Formula. Set

Formula


When using boosting techniques, the number of iterations l plays the role of a smoothing parameter. Therefore, in order to prevent overfitting, a stopping criterion is necessary. A quite common measure of the complexity of a smooth regression fit is the hat matrix. Consequently, Bühlmann and Yu (2003)Go and Bühlmann (2006)Go developed a hat matrix for L2-boosting with continuous dependent variable. In the case of likelihood boosting, for more general exponential-type distributions, the hat matrix has to be approximated. For integrated splines, Tutz and Leitenstorfer (2006)Go give an approximation based on first-order Taylor expansions, which shows satisfying properties. It is straightforward to derive the hat matrix for the present case along the lines of Tutz and Leitenstorfer (2006)Go. With Formula, Ml = {Sigma}FormulaWFormulaB(rl)(BFormulaWlB(rl) + {lambda}{Lambda}) – 1BFormulaWFormula{Sigma}Formula, where Formula and Formula, and Formula where Formula and Formula, l = 1,2,···, the approximate hat matrix is given by

Formula (2.9)

with Formula. By considering tr(Hl) as the degree of freedom of the smoother, we use as potential stopping criteria an information criterion proposed by Akaike (AIC) and the bayesian information criterion (BIC) criteria, AIC(l) = Devl + 2tr(Hl) and BIC(l) = Devl + log(n)tr(Hl), where Formula denotes the deviance of the model in the lth boosting step. The optimal number of boosting iterations is defined by lFormula = argminlAIC(l) or lFormula = argminlBIC(l). Since the BIC (Schwarz, 1978Go) penalizes the complexity of the fit stronger, usually sparser models result. A more extensive treatment of stopping criteria for boosting algorithms is given in Bühlmann and Yu (2006)Go.

2.3. Extension to GAMs

In biometrical or ecological problems, one is usually interested in the effect of several smooth predictor variables, some of which might have monotonic influence on y. In the following, we demonstrate that the concept given above can easily be extended to a GAM setting (see, e.g. Hastie and Tibshirani, 1990Go, or Marx and Eilers, 1998Go). Let

Formula (2.10)

where for some of the p unknown smooth functions (say f1,...,fv, v ≤ p) monotonicity constraints are assumed to hold. Using the matrix notation from above, we have a design matrix (Z,X), with the matrix of linear terms Z and the matrix of smooth components X = (x1,...,xp), where xs = (x1s,...,xns)'. Componentwise expansion of X into B-spline basis functions leads to the matrix (B(1),...,B(p)), where B(s) refers to the sth predictor.

It is essential to distinguish between components that are or are not under monotonicity restrictions. For the former, grouping of basis functions is done within each component in the same way as described in (2.4). For the unconstrained components, we follow Bühlmann and Yu (2003)Go and Tutz and Binder (2006)Go and use penalized regression splines (P-splines, cf. Eilers and Marx, 1996Go) as weak learners for the chosen component. Thereby, the second-order differences of the B-spline coefficients are penalized. For simplicity, it is assumed that the same number of basis functions m is used for all fs. The vector of basis coefficients for all smooth terms in the model is then given by {alpha} = ({alpha}11,...,{alpha}1m,...,{alpha}p1,...,{alpha}pm)'. Thus, Step 2 (iteration) of the algorithm described above is extended as follows:

Step 2: (Iteration)

For l = 1,2,...

1. Fitting step, smooth components
For s = 1,...,p,
  • If s{epsilon}{1,...,v} (the components under monotonicity constraint), compute the estimates from (2.7) componentwise for the possible groupings r = 1,...,m – 1, with

    Formula (2.11)

    The set of indices for components s and split points r that satisfy the monotonicity constraint is given by

    Formula


  • If s{epsilon}{v + 1,···,p} (the components without constraints), compute the 1-step Fisher scoring estimate of the P-spline,

    Formula (2.12)

    where

    Formula

    denotes the matrix representation of the second-order differences. Since the P-spline fit (2.12) does not distinguish between split points r{epsilon}{1,...,m – 1}, for convenience of notation we set r = 0 and extend the selection set by A2 = {(s,0),s{epsilon}{v + 1,...,p}}, yielding

    Formula (2.13)



2. Selection step and update, smooth components
Compute the potential update of the linear predictor, which only for the monotonic coefficients s ≤ v depends on the split point r. Otherwise, r is set to 0, indicating that Formula is not affected by r. Choose (sl,rl){epsilon}A such that the deviance is minimized, that is

Formula

In each iteration, only the basis coefficients belonging to the chosen component sl are refitted. That is, if the selected sl is in {1,...,v}, then Formula, j = 1,...,m, are updated by the refitting scheme (2.8). If sl > v, then update Formula with Formula from (2.12).

3. Fitting step and update, parametric terms. See above.

By using BFormula from (2.11), along with B(s) and the penalty matrices for the P-spline estimates, the hat-matrix approximation and the corresponding AIC and BIC stopping criteria can be extended to the additive setting. One obtains the hat matrix from (2.9), where

Formula

with l = 1,2,.... In the case of many predictors, it might occur that boosting stops before a certain component has been chosen. Thus, the extended approach has the nice effect of doing variable selection for smooth components, similar to the methods proposed by Bühlmann and Yu (2003)Go. This additional strength is important only in data sets with a large number p of covariates, where only some of them are influential.

A derivation of standard deviations for function and parameter estimates is given in Appendix A.1.


    3. SIMULATION RESULTS
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 
In order to evaluate the performance of the proposed method, we conduct some simulation studies. In a first setting, a unidimensional Poisson regression model is considered, with response yi generated from P(exp({eta}i)), where {eta}i = {eta}(xi) is specified by a monotonic function. The xi are drawn from a U[0,5]-distribution. We investigate a step function, {eta}(x) = 2cI(x > 2.5), and a plateau function, {eta}(x) = c(2/{1 + exp[ – 10(x – 1)]} + 2/{1 + exp[ – 5(x – 4)]} – 1). The strength of the signal is controlled by the constant c.

For Generalized Monotonic B-spline Boosting (GMBBoost), a B-spline basis of degree q = 3 is used, with 20 equally spaced knots in the domain of the data, which results in m = 22 basis functions. The ridge parameter has been chosen as {lambda} = 300, and the maximum number of boosting iterations is limited to L = 500. Note that for better comparability, the predictor variable is always rescaled to [0,1] before proceeding further.

GMBBoost is compared to unconstrained penalized regression splines as implemented in the R library "mgcv", where the penalization parameter is determined by the unbiased risk estimation (UBRE) criterion (see Wood, 2000Go, Wood, 2001Go, Wood, 2003Go). In order to obtain comparability with the proposed approach, a cubic regression spline basis with a dimension of k = 22 is used. Furthermore, following a suggestion of a referee, we consider a monotonicity-constrained version of this approach, based on quadratic programming. This involves embedding a monotone smoother in an iteratively reweighted least-squares loop. When using regression splines, this monotone smoother can be obtained by the function "pcls" in mgcv which solves a penalized least-squares quadratic programming problem. For details, see Wood (1994)Go. The smoothing parameter is chosen by UBRE applied to the unconstrained model. Ordinary PAVA is also included, since in 1-parameter exponential families, it is the restricted maximum likelihood estimate for isotonic regression (see Robertson and others, 1988Go).

A criterion for the performance of the fitting methods is the averaged Kullback–Leibler (AKL) distance,

Formula (3.1)

where in the case of Poisson regression we have Formula, with Formula and µi = exp[{eta}(xi)]. The means of AKL over S = 250 simulated data sets are given in Table 1 for selected sample sizes and noise levels. For the step function example, it is seen that GMBBoost is a strong competitor that clearly outperforms the unconstrained and constrained MGCV fits. Note that especially in the lower noise case, in about 10% of the simulated data sets, no fit could be obtained by the restricted version of MGCV. PAVA, which might be thought of being appropriate for this example due to its noncontinuous character, does better only in the case of a stronger signal and n = 100. For the plateau function, monotonicity-restricted MGCV is the best performer, but the performance of GMBBoost is very similar. It is obvious that using PAVA is not a good idea when the true underlying function is smooth. Note that the differences between the stopping criteria are marginal. For the step function, BIC does a little better, whereas in the plateau example, AIC has the advantage.


View this table:
[in this window]
[in a new window]

 
Table 1. AKL error over S = 250 simulated data sets for 1-dimensional Poisson regression, with corresponding standard errors. The number of instances where no fit could be obtained is given in brackets, the best 2 procedures in boldface

 
To assess the performance of the GMBBoost extension to the GAM framework, we conduct simulation studies from 2 settings with multiple covariates. We investigate again a Poisson model with logarithmic link, where the linear predictor is given by

Formula (3.2)

and c relates to the strength of the signal. First, p = 2 and 2 monotonic increasing functions,

Formula

and

Formula

are considered ({alpha}0 = 0.5). The covariates xi1 and xi2 are drawn independently from a U[0,1]-distribution. We compare GMBBoost with both smooth components under a monotonicity restriction to the additive extension of MGCV based on cubic regression splines, where the multiple smoothing parameters are selected by UBRE based on Newton's method in multidimensions (see, e.g. Gu and Wahba, 1991Go, and Wood, 2000Go). As above, for each smooth component, B-splines of degree 3 with 20 equidistant knots within the domain of the data and a ridge parameter of {lambda} = 300 are used. We use a dimension of k = 22 in each component for MGCV. Table 2 shows the results of the mean AKL over S = 250 for 2 different levels of the signal strength and sample sizes of n = 100 and 200. In all settings, GMBBoost yields considerably better estimates than the unconstrained GAM. The differences between AIC- and BIC-stopped GMBBoost are again small, with no clear preference for either criterion.


View this table:
[in this window]
[in a new window]

 
Table 2. AKL error over S = 250 simulated data sets for 2-dimensional Poisson regression, with corresponding standard errors. The number of instances where no fit could be obtained is given in brackets

 
Finally, we consider a setting in higher dimensions, where only some of the components are under monotonicity constraints. A Poisson model with a log-link and a linear predictor as given in (3.2) is considered, where p = 5 and the last 3 functions are assumed to be monotonic. See Figure 1 for the shape of the functions ({alpha}0 = 0). The algebraic expressions of these functions are given in Appendix A.2. The covariates are drawn from a N5(0,{Sigma})-distribution, where a compound symmetry correlation structure of the design is obtained by using {Sigma} = {rho}11' + (1 – {rho})I, we chose {rho} = 0.4. In order to keep the strength of the signal comparable to the example above, rather small values of c = 0.2 and 0.3 are chosen.


Figure 1
View larger version (6K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Functions fs(·), s = 1,...,5, used for the simulation in higher dimensions. The last 3 functions are monotonic, the first 2 are not.

 
In this case, GMBBoost selects between 2 different types of weak learners: grouped basis functions with restrictions and P-splines (see Section 2.3). Simulations suggest that different ridge parameters should be used for these 2 types of learners. In the examples presented above, we observed that a ridge parameter of {lambda} = 300 for the grouped B-splines is large enough to yield an appropriate fit. We also tried larger values which required a higher number of iterations, but only with marginally differing results. Hence, {lambda} = 300 is also applied here. However, we allow for more flexibility in the choice of the P-spline penalty parameter {lambda}P. In each setting, GMBBoost cycles for 3 different values of {lambda}P, and the combination of {lambda}P and number of iterations l is chosen that minimizes AIC or BIC. A maximum number of L = 120 boosting iterations for each cycle suffices in most cases, which makes the procedure computationally feasible. Since small values of {lambda}P may lead to premature stopping, we consider 3 different values of {lambda}P{epsilon}{1000,3000,5000}. The choice of basis functions and knots is the same as in the example with p = 2. We again compare GMBBoost to MGCV which uses the same settings as before.

In Figure 2, the logarithm of the AKL is given for the various settings. It is seen that GMBBoost outperforms MGCV in most considered settings, with a distinct dominance in the higher noise case c = 0.2 (left panel). Interestingly, BIC seems to stop boosting prematurely in the higher dimensional case, yielding an inferior performance compared to AIC-stopped GMBBoost. Thus, we recommend to use the AIC criterion in problems where also nonrestricted smooth components have to be estimated.


Figure 2
View larger version (9K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 2. Boxplots of log(AKL) for different fitting methods for the model with 5 predictors with different noise levels. In each panel, the results of sample sizes for n = 200 (left) and n = 300 (right) are given. The number of instances where no MGCV fit could be obtained is given at the bottom of the corresponding boxplots.

 

    4. AIR POLLUTION IN SÃO PAULO
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 
In the following, the air pollution data mentioned in Section 1 are investigated more closely. The objective is to evaluate the association between mortality for respiratory causes and the concentration of SO2, CO, PM10, and O3. Previous analyses of this data set (see, e.g. Conceição and others, 2001Go, or Singer and others, 2002Go) focus on the population of children under 5. In the analysis presented here, we consider the risk group of elderly people (aged 65 years or older; see also Saldiva and others, 1995Go). More detailed information about the origin of the data and the preprocessing steps is given in Appendix A.3. The response variable is the number of daily deaths of elderly people attributed to respiratory causes in the city of São Paulo. The sample size is n = 1351. A standard approach for data of this type is to use a generalized additive "core" model which includes terms to control for trend, seasonality, and other influential variables like temperature or humidity, cf. Schwartz (1994b)Go. As the dependent variable consists of count data, we use a Poisson model along with the log-link and consider a core model similar to the model of Singer and others (2002)Go,

Formula (4.1)

The model includes a nonspecified function of time (in days) to control for long-term seasonality. We also tried models where temperature (daily minimum, 2-day lagged) and humidity were considered as smooth functions. Since we found no evidence of nonlinearity, we decided to include these covariates as linear terms. In addition, day-of-week dummies are included to control for short-term seasonality and the number of deaths by nonrespiratory causes as a linear term. The basic strategy to investigate the effect of a specific pollutant is to take only this pollutant into the model. In the following, we will focus exclusively on the concentration of SO2, given in daily mean values of µg/m3 (2-day lagged), considering the predictor

Formula (4.2)

where f2(·) is monotonic increasing (for a discussion see concluding remarks). To account for that assumption, model (4.2) has been fitted by the boosting procedure described in Section 2 (GMBBoost), where f2 was estimated under the monotonicity constraint. We used a B-spline basis of degree 3 with Formula equidistant interior knots for each of the smooth component and concentrate on AIC-stopped GMBBoost. When using the same values for the penalty parameters {lambda} and {lambda}P as in the simulation study, no distinct minimal AIC-value was attained after L = 400 boosting iterations. To keep computation feasible, we decided to use considerably lower values of {lambda} = 30 and {lambda}P{epsilon}{30,50,100}. Then, AIC-stopped GMBBoost chose {lambda}P = 30 at an optimal number of lopt = 71 iterations. It should be noted that the results for the other values of {lambda}P are very similar.

The fixed effects were reestimated in each iteration. For comparison, we also fitted a GAM by the use of the R package mgcv with the same adjustments as in the simulations.

Figure 3 shows the estimated curves Formula and Formula for the various fitting procedures. It is seen from Formula (left panel) that a strong seasonal pattern in mortality is evident for both fitting methods. The more interesting result is the difference in the fits for the concentration of SO2 (right panel). The MGCV fit shows a rather wiggly curve up to a concentration of 35 µg/m3. Interestingly, the fitted curve decreases for very small concentrations, which is counterintuitive. In contrast, GMBBoost shows different behavior. Since monotonicity is assumed for this component, one obtains a fit that increases first slowly (up to about 40 µg/m3) and then steeply (up to 60 µg/m3), where it remains constant on a rather high level of mortality for high pollutant concentrations.


Figure 3
View larger version (11K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 3. Core model + f2(SO2), estimated curves for the smooth components, GMBBoost with monotonic fitting of f2(SO2), AIC-stopped (solid), and MGCV (dash-dotted). Data points are given as rug at the foot of each panel.

 
In Table 3, the parameter estimates for the fixed effects, controlling for temperature, humidity, long-term seasonality, and nonrespiratory deaths, are given for the different fitting methods, along with the corresponding standard errors. It is seen that the estimates are rather stable across fitting procedures. We also give the deviance and the effective degrees of freedom for both models. It is seen that the MGCV fit is closer to the data, whereas GMBBoost yields a sparser model due to the restriction. Note that for GMBBoost, the effective degrees of freedom have to be interpreted with caution, since they depend on the approximation of the hat matrix.


View this table:
[in this window]
[in a new window]

 
Table 3. Core model + f2(SO2), estimates of fixed coefficients for MGCV and AIC-stopped GMBBoost, along with corresponding standard errors

 
The editor pointed out with good reason that overdispersion may be a problem in Poisson models. We therefore estimated the scale parameter,

Formula

for both models, where edf denotes the effective degrees of freedom. From Table 3, it is seen that Formula is in both cases fairly close to one, indicating that overdispersion should not be an issue in this application. In addition, we computed the partial serial correlation of residuals. The inclusion of days of the week as covariates had the effect that partial serial correlation is rather weak. The maximal value was 0.062. For significance level 0.05 for single lags, 2 lags out of 30 (1 and 13) were just above the critical value. When adjusting for multiple tests, serial correlation seems hardly relevant. Similar results were reported by Singer and others (2002)Go.

Figure 4 shows the curves fitted by AIC-stopped GMBBoost and approximate 0.95 pointwise confidence bands as derived in Appendix A.1. Although confidence intervals are rather large for a SO2 concentration above 40 µg/m3, the confidence intervals show that the increase should be taken seriously.


Figure 4
View larger version (13K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 4. Core model + f2(SO2), AIC-stopped GMBBoost with monotonic fitting of f2(SO2) along with the approximate confidence intervals. Data points are given as rug at the foot of each panel.

 
In studies of the type presented here, one is often interested in the risk of death at a certain pollutant concentration, relative to the risk of death at the minimum concentration of that pollutant (see, e.g. Singer and others, 2002Go, or Einbeck and others, 2004Go). Let SO2(i) be the recorded concentration in observation i, i = 1,···,n, and SO2(min) the minimum concentration recorded, then the relative risk of death is defined by

Formula

In Figure 5, the estimated relative risk curve is given for 2 fitting methods. It is obvious that the nonmonotonic fit yields unreliable results with a fit which indicates that the relative risk of death is below 1 up to 45 µg/m3. In contrast, the GMBBoost fit shows monotonic increase over the whole risk curve. For concentrations higher than 60 µg/m3, the risk seems to remain on a fairly high level for the AIC-stopped boosting.


Figure 5
View larger version (5K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 5. Relative risk curves versus SO2 concentration for MGCV (·) and AIC-stopped GMBBoost (o).

 

    5. CONCLUDING REMARKS
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 
A procedure is proposed that incorporates monotonicity constraints for one or more components within a GAM. By using monotonicity, the procedure yields stable estimates which in contrast to GAM fitting avoids overfitting and questionable wiggly estimates. Einbeck and others (2004)Go try to find stable estimates by downweighting observations. However, with downweighting approaches, problems arise in higher dimensions, since densities have to be estimated. The monotone regression boosting approach does not suffer from these problems. It should also be noted that the problem of choosing smoothing parameters—which in the case of higher dimensional covariates is hard to tackle—is avoided by boosting techniques. The only crucial tuning parameter is the number of boosting iterations, which is chosen by the AIC or the BIC criterion. Moreover, the procedure is a strong competitor to alternative approaches.

The results provided by the proposed methodology go with other findings regarding the relationship between air pollution and mortality in the São Paulo area. In an earlier study, Saldiva and others (1995)Go investigated all causes of mortality of elderly people. They found a significant effect of SO2 by using a parametric Poisson model, controlling for seasonality by dummy variables for months and years. Later, Pereira and others (1998)Go investigated the association between air pollution and mortality of fetuses with a parametric Poisson model, reported a weakly increasing risk of death for SO2, and derived a dose–response relationship between the pollutant and intrauterine mortality. In the study by Conceição and others (2001)Go, the respiratory mortality of children under 5 was investigated by using GAM techniques and the same database as this paper. They found an increase in relative risk of death of about a 25% on the most polluted days (compared to the least polluted days) for SO2. This is comparable to the relative risk of death that we observed for elderly people using the GMBBoost procedure, see Figure 5. Braga and others (2001)Go explored hospital admissions due to respiratory causes of children and adolescents under 20. They observed a significant effect of SO2 on infants under 2 but no significance in other age groups.

To conclude, previous investigations of adverse health effects of SO2 in São Paulo are supported by our approach. Besides the known effects on children's health (Conceição and others (2001)Go), SO2 seems also to affect the health of elderly people. A particular characteristic of the effects of the pollutant SO2 in São Paulo is that most of the pollution in this city is caused by auto engine exhaust gases. SO2 (as well as CO) may be seen as good proxies for automotive emissions.

A referee raised the problem that there is not necessarily an increasing relationship between high air pollution concentration and mortality. When ambient air pollution is at a very high level, people may adjust their behavior, for example they may stay inside on particularly polluted days (see Bresnahan and others, 1997Go, or Neidell, 2002Go, for a detailed discussion). Nevertheless, there is empirical evidence that high pollutant concentrations result in an increasing mortality. Thus, we think that it is reasonable to consider ambient air pollution as a proxy for air pollution exposure.

An implementation in R of the approach outlined in this paper is available at http://www.statistik.lmu.de/institut/lehrstuhl/semsto/Software/software.htm.


    Appendix
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 

A.1. Standard deviations

In order to obtain standard deviations for function estimates, we suggest starting from the approximate hat matrix given in (2.9). Consider the model from (2.10), where components 1,...,v are estimated under the monotonicity constraint and components v + 1,...,p are not; the linear predictor after l boosting iterations is given by

Formula (A.1)

where Formula and Formula results from updating the linear terms in each iteration. Let sk be the smooth component chosen in the kth boosting iteration, one has from the update step of the extended algorithm,

Formula (A.2)

where, according to (2.10) and (2.12),

Formula (A.3)

From (A.3), it becomes apparent that the type of the update of the chosen smooth component depends on the presence or absence of a monotonicity restriction. Using the indicator function I(·), (A.2) can be written in the closed form

Formula

With the approximation of the hat matrix, one has Formula, which leads to

Formula

and hence, in a recursive fashion,

Formula

where

Formula

Approximate confidence intervals for the estimate of the smooth component fs after l boosting iterations are then found from

Formula

where Formula, with Formula.

Standard deviations for the fixed effects may be obtained in a similar way. The vector of fixed parameters after l boosting iterations is given by

Formula (A.4)

Let Formula and use the approximation of the hat matrix, Formula, where Formula, then (A.4) may be expressed in a recursive way as

Formula

where Formula and Formula. With Formula, it immediately follows that

Formula

A.2. Algebraic expressions of smooth functions

The smooth functions used in the simulation study of the 5-dimensional additive Poisson model (see Figure 1) are given in algebraic terms by

Formula

A.3. Preprocessing of the data set

The data stem from a study conducted between January 1994 and December 1997. The original database can be found on the web at

http://www.ime.usp.br/jmsinger/Polatm9497.zip

(the data file is called master.xls). The concentrations of several air pollutants were recorded daily by 13 monitoring stations of the São Paulo state air pollution controlling agency. For SO2, 24-h mean values (in µg/m3) are given (variables SO21 to SO221). We took the mean over all monitoring station where records are available to get an overall variable for SO2 concentration. The daily minimum temperature (TEMPMIN) and relative humidity (HUMIDMED) are provided by the University of São Paulo. We follow Conceição and others (2001)Go and consider the measurements taken 2 days before as influential. This refers to the measurements of the SO2 concentration and the minimum temperature, as well as the measurement of the concurrent day is used for relative humidity. Daily records of mortality were recorded from the municipal mortality information improvement program. For several age groups, the total number of deaths is given, as well as a more detailed partition into deaths due to respiratory diseases, cardiovascular, and other causes. In the present analysis, the respiratory mortality of population at an age of 65 years or older is investigated. We use the daily number of deaths in this age group due to respiratory diseases (RES65) as response variable. The mortality due to nonrespiratory causes, which is used as a covariate, contains the deaths due to cardiovascular (CAR65) and other (OTH65) causes. For a more detailed description of the classification of mortality, we refer to Conceição and others (2001)Go. The preprocessing results in a data set of n = 1351 complete observations.


    ACKNOWLEDGMENTS
 
We gratefully acknowledge support from the Deutsche Forschungsgemeinschaft (SFB 386, "Statistical Analysis of Discrete Structures"). We thank the "Laboratorio de Poluição Atmosférica Experimental, Faculdade de Medicina, Universidade de São Paulo, Brazil", Jochen Einbeck, and Julio M. Singer for letting us use the data on air pollution in São Paulo. We also thank Harald Binder and 2 unknown referees for constructive comments. Conflict of Interest: None declared.


    REFERENCES
 TOP
 SUMMARY
 1. INTRODUCTION
 2. BOOSTING B-SPLINES IN...
 3. SIMULATION RESULTS
 4. AIR POLLUTION IN...
 5. CONCLUDING REMARKS
 Appendix
 REFERENCES
 

    Braga ALF, Saldiva PHN, Pereira LAA, Menezes JJC, Conceição GMS, Lin CA, Zanobetti A, Schwartz J, Dockery DW. Health effects of air pollution exposure on children and adolescents in São Paulo, Brazil. Pediatric Pulmonology (2001) 31:106–113.[CrossRef][Web of Science][Medline]

    Bresnahan B, Dickie M, Gerking S. Averting behavior and urban air pollution. Land Economics (1997) 73:340–357.[CrossRef][Web of Science]

    Brezger A, Steiner WJ. Monotonic regression based on Bayesian P-splines: an application to estimating price response functions from store-level scanner data. SFB Discussion Paper 331 (2004) Munich, Germany: LMU München.

    Bühlmann P. Boosting for high-dimensional linear models. Annals of Statistics (2006) 34:559–583.[CrossRef][Web of Science]

    Bühlmann P, Yu B. Boosting with the L2-loss: regression and classification. Journal of the American Statistical Association (2003) 98:324–339.[CrossRef][Web of Science]

    Bühlmann P, Yu B. Sparse boosting. Journal of Machine Learning Research (2006) 7:1001–1024.[Web of Science]

    Conceição GMS, Miraglia SGEK, Kishi HS, Saldiva PHN, Singer JM. Air pollution and children mortality: a time series study in São Paulo, Brazil. Environmental Health Perspectives (2001) 109:347–350.[CrossRef][Web of Science][Medline]

    De Boor C. A Practical Guide to Splines (1978) New York: Springer.

    Dilleen M, Heimann G, Hirsch I. Non-parametric estimators of a monotonic dose-response curve and bootstrap confidence intervals. Statistics in Medicine (2003) 22:869–882.[CrossRef][Web of Science][Medline]

    Ducharme GR, Fontez B. A smooth test of goodness-of-fit for growth curves and monotonic nonlinear regression models. Biometrics (2004) 60:977–986.[CrossRef][Web of Science][Medline]

    Eilers PHC, Marx BD. Flexible smoothing with B-splines and penalties. Statistical Science (1996) 11:89–121.[CrossRef][Web of Science]

    Einbeck J, Andre CDS, Singer JM. Local smoothing with robustness against outlying predictors. Environmetrics (2004) 15:541–554.[CrossRef][Web of Science]

    Fan J, Gijbels I. Local Polynomial Modelling and Its Applications (1996) London: Chapman & Hall.

    Friedman JH, Tibshirani R. The monotone smoothing of scatterplots. Technometrics (1984) 26:243–250.[Medline]

    Green DJ, Silverman BW. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach (1994) London: Chapman & Hall.

    Gu C, Wahba G. Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM Journal of Scientific and Statistical Computing (1991) 12:383–398.[CrossRef]

    Hastie T, Tibshirani R. Generalized Additive Models (1990) London: Chapman & Hall.

    Kelly C, Rice J. Monotone smoothing with application to dose–response curves and the assessment of synergism. Biometrics (1991) 46:1071–1085.[Web of Science]

    Lee C-IC. On estimation for monotone dose-response curves. Journal of the American Statistical Association (1996) 91:1110–1119.[CrossRef][Web of Science]

    Mammen E, Marron J, Turlach B, Wand M. A general projection framework for constrained smoothing. Statistical Science (2001) 16:232–248.[CrossRef][Web of Science]

    Marx B, Eilers P, Smith E. Ridge likelihood estimation for generalized linear regression. In: Statistical Modelling—van der Heijden R, Jansen W, Francis B, Seeber G, eds. (1992) Amsterdam: North-Holland. 227–238.

    Marx BD, Eilers PHC. Direct generalized additive modelling with penalized likelihood. Computational Statistics & Data Analysis (1998) 28:193–209.[CrossRef][Web of Science]

    McCullagh P, Nelder JA. Generalized Linear Models (1989) 2nd edition. New York: Chapman & Hall.

    Mukerjee H. Monotone nonparametric regression. Annals of Statistics (1988) 16:741–750.[Web of Science]

    Neidell M. Air pollution, health and socio-economic status: the effect of outdoor air quality on childhood asthma. Journal of Health Economics (2002) 23:1209–1236.[CrossRef]

    Pereira LAA, Loomis D, Conceição GMS, Braga ALF, Arcas RM, Kishi HS, Singer JM, Böhm GM, Saldiva PHN. Association between air pollution and intrauterine mortality in São Paulo, Brazil. Environmental Health Perspectives (1998) 106:325–329.[Web of Science][Medline]

    Ramsay JO. Monotone regression splines in action. Statistical Science (1988) 3:425–461.

    Ramsay JO. Estimating smooth monotone functions. Journal of the Royal Statistical Society B (1998) 60:365–375.[CrossRef]

    Robertson T, Wright FT, Dykstra RL. Order-Restricted Statistical Inference (1988) New York: Wiley.

    Saldiva PHN, Pope, III CA, Schwartz J, Dockery DW, Lichtenfels AJFC, Salge JM, Barone IA, Böhm GM. Air pollution and mortality in elderly people: a time-series study in São Paulo, Brazil. Archives of Environmental Health (1995) 50:159–163.[Web of Science][Medline]

    Schapire RE. The strength of weak learnability. Machine Learning (1990) 5:197–227.[Web of Science]

    Schwartz J. Air pollution and daily mortality: a review and meta analysis. Environmental Research (1994a) 64:36–52.[Medline]

    Schwartz J. Nonparametric smoothing in the analysis of air pollution and respiratory illness. Canadian Journal of Statistics (1994b) 22:471–487.[CrossRef]

    Schwarz G. Estimating the dimension of a model. Annals of Statistics (1978) 6:461–464.[Web of Science]

    Singer JM, Andre CDS, Lima PL, Conçeicão GMS. Association between atmospheric pollution and mortality in São Paulo, Brazil: regression models and analysis strategy. In: Statistical Data Analysis Based on the L1 Norm and Related Methods—Dodge Y, ed. (2002) Berlin: Birkhäuser. 439–450.

    Theil H, Goldberger AS. On pure and mixed estimation in econometrics. International Economic Review (1961) 2:65–78.[CrossRef][Web of Science]

    Turlach BA. Shape constrained smoothing using smoothing splines. Computational Statistics (2005) 20:81–103.[Web of Science]

    Tutz G, Binder H. Generalized additive modelling with implicit variable selection by likelihood based boosting. Biometrics (2006) (in press).

    Tutz G, Leitenstorfer F. Generalized smooth monotonic regression in additive modeling. Journal of Computational and Graphical Statistics (2006) in press.

    Wood SN. Monotonic smoothing splines fitted by cross validation. SIAM Journal on Scientific Computing (1994) 15:1126–1133.[CrossRef][Web of Science]

    Wood SN. Modelling and smoothing parameter estimation with multiple quadratic penalties. Journal of the Royal Statistical Society B (2000) 62:413–428.[CrossRef]

    Wood SN. mgcv: GAMs and generalized ridge regression for R. R News (2001) 1:20–25.

    Wood SN. Thin plate regression splines. Journal of the Royal Statistical Society B (2003) 65:95–114.[CrossRef]

    Zhang JT. A simple and efficient monotone smoother using smoothing splines. Journal of Nonparametric Statistics (2004) 16:779–796.[CrossRef][Web of Science]

    Received June 2, 2005; revised March 21, 2006; revised July 27, 2006; accepted for publication October 20, 2006.


    Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



    This Article
    Right arrow Abstract Freely available
    Right arrow FREE Full Text (PDF) Freely available
    Right arrow All Versions of this Article:
    8/3/654    most recent
    kxl036v1
    Right arrow Alert me when this article is cited
    Right arrow Alert me if a correction is posted
    Services
    Right arrow Email this article to a friend
    Right arrow Similar articles in this journal
    Right arrow Similar articles in PubMed
    Right arrow Alert me to new issues of the journal
    Right arrow Add to My Personal Archive
    Right arrow Download to citation manager
    Right arrowRequest Permissions
    Right arrow Disclaimer
    Google Scholar
    Right arrow Articles by Leitenstorfer, F.
    Right arrow Articles by Tutz, G.
    Right arrow Search for Related Content
    PubMed
    Right arrow PubMed Citation
    Right arrow Articles by Leitenstorfer, F.
    Right arrow Articles by Tutz, G.
    Social Bookmarking
     Add to CiteULike   Add to Connotea   Add to Del.icio.us  
    What's this?