Biostatistics Advance Access originally published online on June 5, 2006
Biostatistics 2007 8(2):228-238; doi:10.1093/biostatistics/kxl003

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

A model-based approach to Bayesian classification with applications to predicting pregnancy outcomes from longitudinal β-hCG profiles

Rolando de la Cruz-Mesía* and Fernando A. Quintana

Departamento de Estadística, Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Casilla 306, Correo 22, Santiago, Chile rolando{at}mat.puc.cl

* To whom correspondence should be addressed.


    SUMMARY
 
This paper discusses Bayesian statistical methods for the classification of observations into two or more groups based on hierarchical models for nonlinear longitudinal profiles. Parameter estimation for a discriminant model that classifies individuals into distinct predefined groups or populations uses appropriate posterior simulation schemes. The methods are illustrated with data from a study involving 173 pregnant women. The main objective in this study is to predict normal versus abnormal pregnancy outcomes from beta human chorionic gonadotropin data available at early stages of pregnancy.

Keywords: Discriminant analysis; Longitudinal data; Nonlinear hierarchical models


    1. INTRODUCTION
 
In different biomedical situations, markers are needed for early detection of the onset of a specific disease, taking into account any longitudinal information that becomes available. One such example concerns pregnant women. To detect a number of complications arising during pregnancy, a variety of quantities or characteristics are measured at antenatal examinations (Yamashita and others, 1989; Witt and others, 1990; Hahlin and others, 1991). One of these is the beta human chorionic gonadotropin (β-hCG), also known as the "pregnancy hormone" or "announcer of pregnancy," which keeps the corpus luteum (yellow body) producing progesterone and estrogen during the early stages of pregnancy. The β-hCG hormone is produced by the placenta. It is detectable in the blood and urine within 10 days of fertilization. After the fertilized egg implants or attaches to the inside of the uterus or other structure inside the mother, the levels of β-hCG rise rapidly, frequently exceeding 100 mIU/ml. The levels continue to increase throughout the first trimester of pregnancy and reach a peak 60–80 days after the fertilized egg implants. The exact level of β-hCG in the blood can be measured using standard tests. These measurements can help give a rough estimate of the age of the fetus. They can also help to determine if the pregnancy is progressing normally. Levels that are abnormally low or high may be signs that an abnormal medical condition is present. This would suggest the need for further evaluation and testing.

β-hCG levels vary greatly. However, it is not the absolute values that matter but their relative changes. In a normal pregnancy, the level of this hormone approximately doubles every 1.5 days up to 5 weeks after the last menstrual period, and then every 3.5 days from the 7th week onward (Frits and Guo, 1987). After the first trimester, levels should gradually decrease over time and quickly decrease to zero after the pregnancy has ended. Abnormally high levels of β-hCG may indicate choriocarcinoma of the uterus, Down syndrome in the fetus, hydatidiform mole of the fetus, or ovarian cancer. Lower-than-normal β-hCG levels may indicate ectopic pregnancy, a miscarriage, or spontaneous abortion. In any case, a failure to exhibit normal growth patterns in β-hCG levels should usually be interpreted as a complication of pregnancy.
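Because the profiles are later analysed on the log10 scale, a doubling time corresponds to a constant daily slope of log10(2) divided by the doubling time. The short sketch below converts the doubling times quoted above into log10 slopes and fold changes; the function names are illustrative and do not come from the paper.

```python
import math

def log10_slope_per_day(doubling_time_days: float) -> float:
    """Daily increase in log10 concentration implied by a doubling time."""
    return math.log10(2.0) / doubling_time_days

def fold_change(elapsed_days: float, doubling_time_days: float) -> float:
    """Multiplicative change in concentration after elapsed_days."""
    return 2.0 ** (elapsed_days / doubling_time_days)

early = log10_slope_per_day(1.5)   # doubling every 1.5 days
later = log10_slope_per_day(3.5)   # doubling every 3.5 days
print(f"log10 slope per day: early {early:.3f}, later {later:.3f}")
print(f"fold change over 7 days at a 3.5-day doubling time: {fold_change(7.0, 3.5):.1f}")
```

On this scale the early phase rises by roughly 0.2 log10 units per day and the later phase by a little under 0.1, which is why a flattening log10 profile is informative even when absolute levels differ widely between women.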

The main objective of this article is to explore a classification technique for predicting the outcome of pregnancy on the basis of the results of certain diagnostic tests administered to pregnant women. The inference problem is formally described as a discriminant analysis based on longitudinal β-hCG outcomes. Our approach is fully Bayesian and provides the posterior (or predictive) probability of outcome of pregnancy in women based on the longitudinal marker. Physicians can then make decisions on the basis of these probabilities. We consider a hierarchical structure that accommodates nonlinear profiles and allows classification for individuals with very few observations. This is especially relevant in the context of our motivating example, where roughly 60% of the women had only one or two β-hCG measurements.

Extensions of classical discriminant analysis to multivariate response curves observed over fixed time intervals have been considered by Albert (1983). Some extensions to the case of unbalanced data have also been proposed, typically using linear or nonlinear random-effects models to describe the longitudinal profiles in each group. Verbeke and Lesaffre (1996) proposed the so-called heterogeneity model, i.e. a linear mixed-effects model with random effects sampled from a mixture of normal distributions. Verbeke and Molenberghs (2000) indicated that the classification rule implied by the heterogeneity model is equivalent to the discriminant function proposed by Tomasko and others (1999). Further developments along this direction have been discussed in Wernecke and others (2004). Marshall and Barón (2000) considered nonlinear random-effects models to describe evolutions in different groups and stated the optimal allocation rule. Brown and others (2001) discussed discriminant analysis using linear random-effects models from a Bayesian viewpoint.

In our approach, the classes or groups are predefined and the task is to understand the basis for the classification from a set of labeled subjects (training data set). This information is then used to classify future subjects. In the case that the classes or groups are unknown a priori and need to be estimated from the data, latent class models offer a fruitful approach. See additional developments along this direction in Muthén and Shedden (1999), Lin and others (2000), McCulloch and others (2002), Muthén and others (2002), and references therein.

The rest of this article is organized as follows: We first give a brief description of the data set in Section 2. In Section 3, we extend the framework of traditional classification methods to the longitudinal hierarchical setting. In Section 4, we illustrate the proposed longitudinal method using data from Santiago, Chile on the β-hCG measured in women with normal and abnormal pregnancy outcomes. An appropriate posterior simulation scheme based on the Gibbs sampling algorithm is described. Finally, Section 5 discusses the results.


    2. PREGNANT WOMEN DATA
 
There is an increased risk of early pregnancy loss after assisted reproduction. Various studies have addressed the question of early β-hCG values and their relationship to pregnancy outcome after in vitro fertilization (e.g. Yamashita and others, 1989). We consider a data set with a total of 173 young women, representing different pregnancies over a period of 2 years in a private fertilization obstetrics clinic in Santiago, Chile. The β-hCG concentrations for the 173 women were measured during the first 80 days of gestational age. One of the main targets of the study was to evaluate these concentrations at early stages of pregnancy, with the purpose of identifying women with a high risk of loss. Consequently, pregnancy outcomes were divided into two groups: normal and abnormal. The women were classified as normal pregnancies if they had a normal delivery or as abnormal pregnancies if they had any complication resulting in a nonterminal delivery and loss of the fetus. The resulting data set consists of 124 patients diagnosed with normal pregnancy and 49 patients with abnormal pregnancy. The 173 women altogether contribute a total of 375 observations, where the number of samples per woman ranged from 1 to 6 (median 2). These data were originally presented in Marshall and Barón (2000).

We analyze the vectors of time-varying β-hCG measurements for the 173 women. Approximately 30% of the 173 women had one β-hCG measurement, 31% had two, 33% had three, and 6% had four or more measurements. Figure 1 presents the subject-specific log10 β-hCG profiles for both groups.


Fig. 1. Observed profiles of β-hCG for all 173 women.

 
The two populations appear clearly distinct when considering the ensemble of profiles. However, for any one of the profiles, the classification into one or the other subpopulation is far less certain, in particular when considering series of partial responses. Thus, our main inference goal in analyzing these data is to provide a classification rule for a new patient. The rule should allow sequential updating as data accrue for the new patient.


    3. CLASSIFICATION USING NONLINEAR HIERARCHICAL MODELS
 
Suppose we are given a training data set comprising $m$ units $\{(y_i, z_i),\ i = 1,\dots,m\}$. Here $y_i = (y_{i1},\dots,y_{in_i})' \in \mathbb{R}^{n_i}$ represents the observed response vector for the $i$th unit, taken at arbitrary times $t' = (t_1, t_2, \dots, t_{n_i})$. Let $z_i \in \{1, 2, \dots, g\}$ denote the known group label for the $i$th unit. In our application, $m = 173$, $g = 2$, with $z_i = 1$ and $z_i = 2$ indicating normal and abnormal pregnancy, respectively. The label $z_i$ is known for some women with already reported delivery, but unknown for women with partial data before delivery. Without loss of generality, we assume that $z_i$, $i = 1,\dots,m$, is known, and $z_{m+1}$ is unknown. Let $y^m = (y_1,\dots,y_m, z_1,\dots,z_m)$ denote all data, including the recorded class memberships $z_i$, up to the $m$th patient.

Classification can be done either on a within-sample or on a predictive basis. For the former, the posterior probabilities $\{p(z_i = k \mid y^m);\ k = 1,\dots,g\}$ are appropriate, and these can be directly estimated as empirical averages over the Markov chain Monte Carlo run. The predictive classification problem assumes that a future unit $m + 1$ with as yet unknown label $z_{m+1}$ is recorded, so interest focuses on inference about $z_{m+1}$, i.e. we are in principle interested in $\{p(z_{m+1} \mid y^m, y_{m+1})\}$, where $y_{m+1}$ is the currently available partial response vector for the new patient $m + 1$.

We consider for group k a generic hierarchical model of the form


Formula (3.1)

In words, data $y_{ik}$ for the $i$th sampling unit in group $k$ are sampled from a probability model parameterized by a vector $\theta_{ik}$. We will assume the parameter vector $\theta_{ik}$ to be partitioned into common fixed effects and unit-specific random effects. The random effects are assumed to be generated from a distribution parameterized by a $d$-dimensional hyperparameter vector $\varphi_k$, thus implying a parametric model for random effects. Bayesian inference under parametric assumptions for random effects has been discussed by Bennett and others (1995), Racine-Poon (1985), Wakefield and others (1994), Wakefield (1996), Wakefield and Bennet (1996), and Ziegelmann and Brown (2001), among others.

For the top-level sampling model $p(y_{ik} \mid \theta_{ik})$ in (3.1), we assume a nonlinear regression

$$y_{ijk} = f(\theta_{ik}; t_{ijk}) + \epsilon_{ijk},$$

with a mean function $f(\theta_{ik}; \cdot)$ parameterized by $\theta_{ik}$ and evaluated at known times $t_{ijk}$, $j = 1,\dots,n_i$, $k = 1,\dots,g$. The residual term $\epsilon_{ijk}$ is assumed to be normally distributed with mean 0 and variance $\sigma_k^2$. To facilitate classification, we augment the model with a marginal probability for $z_i$

$$p(z_i = k) = \pi_k,$$

with $k = 1,\dots,g$. Specific choices for our motivating example will be discussed in Section 4. Note, however, that the augmented model implies the desired classification as a conditional probability $p(z_{m+1} \mid y^m, y_{m+1})$, marginalizing with respect to the unknown $\theta_{ik}$ and other possibly unknown hyperparameters.

From a Bayesian viewpoint, the classification probabilities are obtained by weighting the posterior distributions of the parameters. Using Bayes' rule and some algebraic manipulations, the classification probability that a new unit $y_{m+1}$ belongs to the $k$th group is


Formula (3.2)

However, the integration is usually analytically intractable. Therefore, we shall construct a set of stationary samples $\{\Theta^{(b)},\ b = 1,\dots,B\}$ from their posterior distribution and use

Formula (3.3)

to approximate (3.2). If the prevalences $\pi_k$ are unknown hyperparameters as well, then (3.3) would use the imputed values $\pi_k^{(b)}$.
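One way such a Monte Carlo approximation can be organised is sketched below. It averages, over posterior draws, the per-draw membership probabilities obtained from Bayes' rule (prevalence times likelihood, normalised over groups). This averaged form is an assumption made for illustration, and the toy likelihood (independent normal observations with a group-specific mean and standard deviation) is a hypothetical stand-in for the model's actual density; all names and values are illustrative.

```python
import math

def normal_logpdf(y, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((y - mean) / sd) ** 2

def classification_probs(y_new, draws, prevalences):
    """Average over posterior draws of the per-draw group membership probabilities.

    draws: list of posterior draws; each draw holds one (mean, sd) pair per group.
    """
    g = len(prevalences)
    totals = [0.0] * g
    for draw in draws:
        log_w = []
        for k in range(g):
            mean, sd = draw[k]
            log_lik = sum(normal_logpdf(y, mean, sd) for y in y_new)
            log_w.append(math.log(prevalences[k]) + log_lik)
        top = max(log_w)                      # subtract the maximum for numerical stability
        w = [math.exp(v - top) for v in log_w]
        s = sum(w)
        for k in range(g):
            totals[k] += w[k] / s
    return [t / len(draws) for t in totals]

# Hypothetical posterior draws for two groups (values chosen only for illustration).
draws = [[(3.9, 0.5), (2.8, 0.7)], [(4.0, 0.5), (3.0, 0.6)], [(4.1, 0.4), (2.9, 0.7)]]
print(classification_probs([3.7, 4.2], draws, [0.7, 0.3]))
```

If the prevalences were sampled as well, the fixed `prevalences` argument would be replaced by one set of values per draw.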

Under the "zero-one" loss function (Hastie and others, 2001), the Bayes classification of an existing unit $y_i$ and a future one $y_{m+1}$ are, respectively, given by


$$\hat{z}_i = \arg\max_{k} p(z_i = k \mid y^m) \quad \text{and} \quad \hat{z}_{m+1} = \arg\max_{k} p(z_{m+1} = k \mid y^m, y_{m+1}). \tag{3.4}$$

In other words, the unit is classified in that group for which the highest posterior probability is attained, thus minimizing the expected misclassification rate. Of course, there are other loss functions that could be used, depending on whether it is more important to avoid false-positive or false-negative cases (see Hastie and others, 2001).
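The difference between the zero-one rule and an asymmetric loss can be made concrete with a small sketch. The cost values below are hypothetical and only illustrate how penalising a missed abnormal pregnancy more heavily can change the decision for the same posterior probabilities.

```python
def bayes_classify(probs):
    """Zero-one loss: return the 1-based label with the highest posterior probability."""
    return max(range(len(probs)), key=lambda k: probs[k]) + 1

def min_expected_loss(probs, loss):
    """General loss: loss[j][k] is the cost of deciding group k+1 when the truth is group j+1."""
    g = len(probs)
    expected = [sum(probs[j] * loss[j][k] for j in range(g)) for k in range(g)]
    return min(range(g), key=lambda k: expected[k]) + 1

probs = [0.7, 0.3]                      # posterior probabilities of normal (1) and abnormal (2)
zero_one = [[0, 1], [1, 0]]
asymmetric = [[0, 1], [5, 0]]           # hypothetical: missing an abnormal case costs 5
print(bayes_classify(probs))
print(min_expected_loss(probs, zero_one))
print(min_expected_loss(probs, asymmetric))
```

With the asymmetric costs, deciding "normal" has expected loss 0.3 x 5 = 1.5 while deciding "abnormal" has expected loss 0.7 x 1 = 0.7, so the rule flags the patient as abnormal even though the normal group has the higher posterior probability.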


    4. APPLICATION
 
We apply the proposed model to the analysis of the longitudinal β-hCG data discussed in Section 2.

4.1 Model specification

Mean values of the log10 β-hCG for these 173 women show a nonlinear relationship with days of pregnancy. Figure 1 shows time profiles for normal and abnormal subjects. The analysis in Marshall and Barón (2000) suggests that woman-to-woman variation is adequately accounted for by the introduction of random effects to model the asymptotic behavior of the log10 β-hCG level ($\theta_{ik}$ below). Specifically, they proposed the following nonlinear random-effects model:

Formula (4.1)

where $y_{ijk}$ represents the $j$th log10 β-hCG measurement on the $i$th woman in group $k$, where $k = 1, 2$ represent, respectively, the normal and abnormal pregnancy groups, $t_{ijk}$ is the day at which the $y_{ijk}$ measurement was recorded, and the $\theta_{ik}$, $i = 1,\dots,173$, are assumed $N(\theta_{k1}, \tau_k^2)$ and independent of the $e_{ijk}$. The parameters $\theta_1 = (\theta_{11}, \theta_{12}, \theta_{13})$ and $\theta_2 = (\theta_{21}, \theta_{22}, \theta_{23})$ represent the population parameters of the logistic curve for the normal and abnormal pregnancy groups, respectively.
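The random-effects structure just described can be simulated directly. The sketch below assumes a three-parameter logistic mean with a woman-specific asymptote drawn from a normal distribution; the particular parameterisation (midpoint and scale), the function names, and all numerical values are assumptions made for illustration and are not the fitted model.

```python
import math
import random

def logistic_mean(t, asymptote, midpoint, scale):
    """Assumed three-parameter logistic curve (this parameterisation is an assumption)."""
    return asymptote / (1.0 + math.exp(-(t - midpoint) / scale))

def simulate_profile(times, theta_k, tau_k, sigma_k, rng):
    """Draw one log10 profile: random asymptote plus normal measurement error."""
    theta_k1, theta_k2, theta_k3 = theta_k
    theta_ik = rng.gauss(theta_k1, tau_k)            # woman-specific asymptote
    return [logistic_mean(t, theta_ik, theta_k2, theta_k3) + rng.gauss(0.0, sigma_k)
            for t in times]

rng = random.Random(0)
# Hypothetical population parameters and standard deviations, for illustration only.
profile = simulate_profile([10.0, 20.0, 35.0], (4.5, 15.0, 7.0), 0.3, 0.2, rng)
print([round(y, 2) for y in profile])
```

Because only the asymptote is random, two women in the same group share the timing and steepness of the rise and differ in the plateau they approach.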

After integrating out random effects in the $k$th group, the likelihood in the $k$th group is still described by means of a normal distribution

$$y_{ik} \sim N_{n_i}\left(\mu_k(t_{ik}),\ \tau_k^2 v_{ik} v_{ik}' + \sigma_k^2 I\right),$$

where the mean vector $\mu_k(t_{ik})$ has elements

Formula

and represents the population curve at time $t_{ijk}$. Furthermore, $I$ is an $n_i \times n_i$ identity matrix, and the vector $v_{ik}$ has elements

Formula

which depend on the values of the unknown population parameters $\theta_{k2}$ and $\theta_{k3}$. We complete the Bayesian formulation of Model (4.1) assuming prior independence for parameters, with distributions specified as

$$\theta_k \sim N_3(\theta_0, D_0), \tag{4.2}$$


$$\sigma_k^2 \sim IG(a_1, c_1), \tag{4.3}$$


$$\tau_k^2 \sim IG(a_2, c_2), \tag{4.4}$$

for $k = 1, 2$. Here, IG denotes the inverse gamma distribution. In practice, the specification of hyperparameters $\theta_0$, $D_0$, $a_1$, $c_1$, $a_2$, and $c_2$ may be difficult, so in Section 4.2 we choose hyperparameter values defining vague priors.
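Because the random effect enters the mean linearly, the marginal covariance obtained after integrating it out is a rank-one update of a scaled identity matrix, so the marginal log-likelihood can be evaluated without inverting a matrix. The following is a minimal sketch, assuming a covariance of the form tau2 * v v' + sigma2 * I and a caller-supplied vector v; the function name and arguments are illustrative.

```python
import math

def marginal_loglik(y, v, theta_k1, tau2, sigma2):
    """Log density of N(theta_k1 * v, tau2 * v v' + sigma2 * I) evaluated at y.

    Uses the Sherman-Morrison identity and the matrix determinant lemma,
    so only inner products of length-n vectors are needed.
    """
    n = len(y)
    r = [yi - theta_k1 * vi for yi, vi in zip(y, v)]    # residuals from the population curve
    vv = sum(vi * vi for vi in v)
    vr = sum(vi * ri for vi, ri in zip(v, r))
    rr = sum(ri * ri for ri in r)
    denom = sigma2 + tau2 * vv
    quad = (rr - tau2 * vr * vr / denom) / sigma2        # r' Sigma^{-1} r
    logdet = (n - 1) * math.log(sigma2) + math.log(denom)
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)

# Hypothetical inputs: three observations and their curve factors v.
print(marginal_loglik([3.1, 3.9, 4.2], [0.6, 0.85, 0.95], 4.4, 0.09, 0.04))
```

For a woman with a single observation the covariance reduces to the scalar sigma2 + tau2 * v^2, which is a convenient check on the implementation.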

4.2 Posterior computation

Posterior distributions of the parameters were estimated using the Gibbs sampling algorithm. The values of the hyperparameters in (4.2), (4.3), and (4.4) were taken as $\theta_0 = (0, 0, 0)^T$, $D_0 = 1000\, I_3$, $a_1 = a_2 = 3$, and $c_1 = c_2 = 0.01$, which give the prior variance of $\sigma^2$ and of $\tau^2$ to be 2500. The resulting prior densities are proper, but vague and hence relatively uninformative. Prior probabilities of group membership were assumed proportional to the size of the groups in the training sample. We also performed the analysis with different hyperparameter values, obtaining very similar results. This suggests robustness to the hyperparameter choices.
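The quoted prior variance of 2500 can be checked numerically. The sketch below assumes that IG(a, c) denotes an inverse gamma distribution with shape a and scale 1/c (equivalently, a gamma prior with shape a and scale c on the precision); this parameterisation is inferred from the reported value rather than stated in the text.

```python
def inverse_gamma_moments(shape: float, scale: float):
    """Mean and variance of an inverse gamma distribution (the variance requires shape > 2)."""
    mean = scale / (shape - 1.0)
    variance = scale ** 2 / ((shape - 1.0) ** 2 * (shape - 2.0))
    return mean, variance

a, c = 3.0, 0.01
# Assumption: the scale of the inverse gamma is 1/c.
mean, variance = inverse_gamma_moments(a, 1.0 / c)
print(mean, variance)
```

Under this reading the prior mean is 50 and the prior variance is 100^2 / (2^2 x 1) = 2500, in agreement with the text.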

The full conditionals for implementing the Gibbs sampler for $\theta_{k1}$, $\theta_{ik}$, $\sigma_k^2$, and $\tau_k^2$ are straightforwardly derived as normal, normal, inverse gamma, and inverse gamma distributions, respectively. The full conditionals for $\theta_{k2}$ and $\theta_{k3}$ are not available in closed form. A separate random walk Metropolis algorithm was used to simulate each $\theta_{k\ell}$ ($\ell = 2, 3$). For this, we use a normal proposal distribution centered at the current value of $\theta_{k\ell}$ with variance given by a constant $c$ times the inverse information matrix. The tuning constant $c$ controls the acceptance rate of the algorithm. If $c$ is too small, then the acceptance rate is high, but jump sizes are correspondingly small, yielding slow convergence. Conversely, the selection of high values of $c$ leads to larger jump sizes, but at the cost of lower acceptance rates. Following the recommendation of Gelman and others (1996), $c$ should be selected so as to yield empirical acceptance rates around 0.25. Consequently, we chose $c = 2$.
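The trade-off governed by the tuning constant can be seen in a scalar random walk Metropolis sampler. The sketch below uses a standard normal target as a hypothetical stand-in for a full conditional; since that target has unit information, the proposal variance equals c. It is not the authors' C implementation.

```python
import math
import random

def random_walk_metropolis(log_target, start, proposal_sd, n_iter, rng):
    """Scalar random walk Metropolis; returns the chain and the empirical acceptance rate."""
    x, log_p = start, log_target(start)
    chain, accepted = [], 0
    for _ in range(n_iter):
        candidate = x + rng.gauss(0.0, proposal_sd)
        log_p_cand = log_target(candidate)
        # Accept with probability min(1, target(candidate) / target(x)).
        if math.log(1.0 - rng.random()) < log_p_cand - log_p:
            x, log_p = candidate, log_p_cand
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def log_std_normal(x):
    """Hypothetical stand-in target: standard normal log density up to a constant."""
    return -0.5 * x * x

rng = random.Random(42)
for c in (0.01, 2.0, 100.0):
    _, rate = random_walk_metropolis(log_std_normal, 0.0, math.sqrt(c), 5000, rng)
    print(f"c = {c:g}: acceptance rate {rate:.2f}")
```

Small values of c accept almost every proposal but move slowly, while large values propose big jumps that are mostly rejected; in practice c is adjusted in pilot runs until the acceptance rate is near the desired level.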

To perform the Gibbs sampling, we chose starting points in a neighborhood of the maximum likelihood estimates of model parameters. For this we used the library NLME of Pinheiro and Bates (2000) to fit the models. The algorithm was implemented using computer code written in C. We generated 250000 iterations. After 10000 iterations, samples were collected, at a spacing of 240 iterations, to obtain approximately independent samples, giving samples of size 1000 for calculating posterior quantities of interest.
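The burn-in and thinning arithmetic can be verified in a few lines. The exact iteration indices that were retained are an assumption here (the sketch keeps the first iteration after burn-in and every 240th thereafter); only the counts come from the text.

```python
def retained_iterations(n_iter: int, burn_in: int, spacing: int) -> range:
    """0-based indices of the iterations kept after burn-in, one every `spacing`."""
    return range(burn_in, n_iter, spacing)

kept = retained_iterations(250_000, 10_000, 240)
print(len(kept), kept[0], kept[-1])
```

The 240,000 post-burn-in iterations divided by a spacing of 240 give exactly 1000 retained samples, matching the sample size reported above.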

We checked convergence for each monitored variable, using the CODA software of Best and others (1995) to produce convergence diagnostics. Application of Geweke's (1995) convergence criterion separately to each of the model parameters showed no evidence against convergence, as the absolute value of the Z-statistic was less than 1.7 in all cases. Also, all sequences passed the stationarity test of Heidelberger and Welch (1983).

4.3 Results

The mean gestational ages (days) in women with normal and abnormal pregnancies were 34.5 and 32.9 days, respectively; this difference was not significant. We found that concentrations of β-hCG were significantly lower in women with an abnormal pregnancy than in those with a normal pregnancy.

Table 1 presents posterior summaries of the parameters. Observed differences in the group-specific estimates of the variance components suggest that there is more between-subject variability in the abnormal group than in the normal group.


Table 1. Summary of model fitting

 
As part of the analysis, we estimated individual log10 β-hCG profiles and standard errors. Fitted profiles with ±2 SD curves are displayed for six selected patients in Figure 2. Three of them were in the normal group (patients 3, 24, 66) and the remaining three in the abnormal group (patients 27, 34, 45). We note that the inference captures the varying observation error between subjects. An important feature of the Bayesian approach is that the credible intervals take into account the variability of all parameters, including unknown hyperparameters. We note here that all these patients, except patient 34, were correctly classified according to our proposed approach.


Fig. 2. Fitted curves for three patients in the normal group (patients 3, 24, 66) and three in the abnormal group (patients 27, 34, 45). The points are the actual observations. The solid lines represent the fitted curves and the dashed lines represent the fitted curve ± two posterior standard deviations.

 
Once the posterior probabilities $p(z_i = k \mid y^m)$ are estimated, several quantities of interest can be readily evaluated. Among these, "sensitivity" and "specificity" are popular ways to summarize the classification results. Letting $A$ = {the patient is classified as abnormal} and $E$ = {the patient actually belongs to the abnormal group}, then sensitivity and specificity are defined, respectively, as $\Pr[A \mid E]$ and $\Pr[A^c \mid E^c]$. These are simply estimated as $\sum_{i=1}^{m} I(\hat{z}_i = 2, z_i = 2) / \sum_{i=1}^{m} I(z_i = 2)$ and $\sum_{i=1}^{m} I(\hat{z}_i = 1, z_i = 1) / \sum_{i=1}^{m} I(z_i = 1)$, respectively, where $\hat{z}_i$ was defined in (3.4). From our results we found 57% sensitivity and 95% specificity.
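These summaries are simple counts over the true and assigned labels. A minimal sketch follows, using a small made-up set of labels (1 = normal, 2 = abnormal) rather than the study data; the function names are illustrative.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=2):
    """Sensitivity and specificity, treating `positive` (abnormal = 2) as the event of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for z, zh in pairs if z == positive and zh == positive)
    fn = sum(1 for z, zh in pairs if z == positive and zh != positive)
    tn = sum(1 for z, zh in pairs if z != positive and zh != positive)
    fp = sum(1 for z, zh in pairs if z != positive and zh == positive)
    return tp / (tp + fn), tn / (tn + fp)

def error_rate(true_labels, predicted_labels):
    """Proportion of units whose assigned label differs from the true label."""
    wrong = sum(1 for z, zh in zip(true_labels, predicted_labels) if z != zh)
    return wrong / len(true_labels)

# Made-up labels for illustration only.
truth = [2, 2, 2, 2, 1, 1, 1, 1, 1, 1]
assigned = [2, 2, 1, 1, 1, 1, 1, 1, 1, 2]
print(sensitivity_specificity(truth, assigned), error_rate(truth, assigned))
```

When the abnormal group is much smaller than the normal group, as in this study, a low overall error rate can coexist with modest sensitivity, so both summaries are worth reporting.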

We also computed the misclassification error rate, which was found to be 16% for this data set. However, it is well known that the error rate obtained by applying the classifier to the same data from which it has been formed tends to be biased downward as an estimate of the true error rate (Hastie and others, 2001). Several methods are available to solve this problem. For moderately large data sets, we could consider a series of randomly chosen divisions of the data into two components, one reserved for deriving the classification rule (the training sample) and the other to assess this rule (the test sample). Under this method, the estimated error rate is the average error rate over all the generated divisions. For smaller data sets like the one at hand, a cross-validation (CV) technique can be used to compensate for the lack of data. Specifically, we consider the leave-one-out cross-validation method, which consists of splitting the $m$ samples into a training sample of size $m - 1$ and a test sample of size 1. The estimated error rate is then the proportion of the $m$ possible divisions in which the test sample was misclassified. A critical advantage of this method is that all the data can be used for training, without the need to set aside subjects purely for testing. The CV error rate is a widely used estimator of the performance of a classification rule when the sample size is small. A direct generalization of the above procedure is the $k$-fold CV, where the data are divided randomly into $k$ components of approximately equal sizes. Next, $k - 1$ components are used to compute the classification rule, which is then tested on the omitted component. This process is repeated $k$ times, until all components have been used to assess the classification rule, and the error rates are averaged.
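The cross-validation loop itself does not depend on the classifier. The sketch below pools misclassifications over k random folds (equivalent to averaging fold error rates when folds have equal size, and identical to leave-one-out when k equals the number of units). The nearest-mean classifier and the scalar data are hypothetical stand-ins for refitting the hierarchical model on each training split.

```python
import random

def k_fold_error(data, labels, k, fit, predict, rng):
    """Pooled misclassification rate over k random folds (k = len(data) gives leave-one-out)."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[j::k] for j in range(k)]
    errors = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        rule = fit([data[i] for i in train_idx], [labels[i] for i in train_idx])
        errors += sum(predict(rule, data[i]) != labels[i] for i in test_idx)
    return errors / len(data)

def fit_means(xs, zs):
    """Toy rule: store the mean of each group."""
    groups = {}
    for x, z in zip(xs, zs):
        groups.setdefault(z, []).append(x)
    return {z: sum(v) / len(v) for z, v in groups.items()}

def predict_nearest(rule, x):
    """Assign the group whose stored mean is closest to x."""
    return min(rule, key=lambda z: abs(x - rule[z]))

# Made-up, well-separated scalar data for illustration only.
xs = [0.1, 0.2, 0.0, 0.3, 5.0, 5.1, 4.9, 5.2]
zs = [1, 1, 1, 1, 2, 2, 2, 2]
print(k_fold_error(xs, zs, len(xs), fit_means, predict_nearest, random.Random(0)))
print(k_fold_error(xs, zs, 4, fit_means, predict_nearest, random.Random(0)))
```

For an MCMC-based classifier, leave-one-out requires one posterior simulation per unit, which is the main practical reason to prefer a small number of folds.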

We performed both leave-one-out CV and 5-fold CV, obtaining misclassification error rates of 17% and 17.3%, respectively. The two error rates are very similar, which suggests that these methods provide reasonably consistent estimates. See further properties and discussion of these methods in Hastie and others (2001).

In the above analysis, we have used all the available information. However, it is interesting to assess the predictive power of our model. A possible way of doing this is to study how the classification probabilities change with the number of available observations for a given patient. Thus, we generate from the corresponding posterior predictive distributions one future patient for each group, and evaluate (3.2) for up to five possible observations. Time points were chosen from the empirical distribution of observed times within each group. Figure 3 shows the evolution of these probabilities for each future patient. For the normal patient, we observe a steady growth of the probabilities. For the abnormal patient, however, this probability first increases and then starts to decrease to values that leave no question about the classification. A possible explanation for this is the rather heterogeneous patterns found for abnormal patients. Indeed, many of these show an initial increase in the log10 β-hCG responses (just as all the normal patients do) followed by a decrease in some of the patients. Thus, the classification probabilities for abnormal patients require more observations than those for normal patients to reflect the correct outcome.


Fig. 3. Evolution of classification probabilities for one normal (solid line) and one abnormal (dashed line) future patient as a function of the number of observations.

 

    5. DISCUSSION
 
This paper proposes a general Bayesian framework for the classification of longitudinal profiles where the underlying models in each group or population are given by nonlinear hierarchical models. The approach has the advantage of being appropriate for classifying longitudinal profiles of data sets with an unbalanced data structure. It also has the ability to use all the information for classifying subjects over time, regardless of the number or timing of the observations. Moreover, the influence on discrimination of both the between-group and within-group components of variability can be readily quantified, and the posterior simulation scheme is straightforwardly implemented.

This approach is particularly appropriate for decision making in clinical practice where the number and times of observations are often arbitrary and depend on the progression of the patient. In this context, the approach we have presented solves one important problem.

In our study, the group "abnormal pregnancies" implies women with any complication resulting in a nonterminal delivery and loss of the fetus. This involves a wide range of pathologies including spontaneous abortion (hard to prevent in advance) and ectopic pregnancy (an acute condition that requires surgery). Measurements of serum β-hCG at regular examination times may help identify women with a high probability of abnormal pregnancy and who may benefit from closer surveillance. For instance, in the case of ectopic pregnancy, this may help to reduce the risk of tubal rupture. Our results may help both physicians and patients make informed treatment decisions on the basis of objective staging information. Indeed, a reliable and inexpensive diagnostic test to differentiate between normal pregnancies and pregnancies with early adverse outcome might reduce the psychological tension and anxiety present in many patients. It may also help to reduce treatment cost by making it more efficient. On the other hand, for patients judged by this test to be at high risk of an unfavorable outcome, a more careful follow-up might help to reduce the associated risks.

Other markers, such as serum progesterone measurements, have been used to distinguish normal from abnormal (spontaneous miscarriage) pregnancies. A straightforward generalization of our approach would accommodate additional information when this is available, either by inclusion of more covariates or by considering other markers, thus extending the framework to a multivariate one.


    ACKNOWLEDGMENTS
 
We are grateful to Guillermo Marshall for providing us with the β-hCG data set. The first author thanks the Comisión Nacional de Investigación Científica y Tecnológica for partially supporting his PhD studies at the Pontificia Universidad Católica de Chile. We thank a referee and the Editor for their valuable comments, which helped improve this manuscript. Conflicts of Interest: None declared.


    REFERENCES
 

    Albert A. (1983) Discriminant analysis based on multivariate response curve: a descriptive approach to dynamic allocation. Statistics in Medicine 2:95–106.

    Bennett JE, Racine-Poon A, Wakefield JC. (1995) MCMC for nonlinear hierarchical models. In Gilks WR, Richardson S, Spiegelhalter DJ (Eds.). Markov Chain Monte Carlo in Practice (Oxford University Press, Oxford) pp. 339–358.

    Best NG, Cowles MK, Vines SK. (1995) CODA: Convergence Diagnosis and Output Analysis Software for Gibbs Sampling Output, Version 0.3 (MRC Biostatistics Unit, Cambridge).

    Brown PJ, Kenward MG, Bassett EE. (2001) Bayesian discrimination with longitudinal data. Biostatistics 2:417–32.

    Frits MA and Guo SM. (1987) Doubling time of human chorionic gonadotropin (hCG) in early normal pregnancy: relationship to hCG concentration and gestational age. Fertility and Sterility 47:584–9.

    Gelman A, Roberts GO, Gilks WR. (1996) Efficient Metropolis jumping rules. In Bernardo JM, Berger JO, Dawid AP, Smith AFM (Eds.). Bayesian Statistics (Oxford University Press, Oxford) Volume 5: pp. 599–607.

    Geweke J. (1995) Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In Bernardo JM, Berger JO, Dawid AP, Smith AFM (Eds.). Bayesian Statistics (Oxford University Press, Oxford) Volume 4: pp. 169–194.

    Hahlin M, Sjoblom P, Lindblom B. (1991) Combined use of progesterone and human chorionic gonadotrophin determination for differential diagnosis of very early pregnancy. Fertility and Sterility 55:492–6.

    Hastie T, Tibshirani R, Friedman J. (2001) Elements of Statistical Learning: Data Mining, Inference and Prediction (Springer, New York).

    Heidelberger P and Welch P. (1983) Simulation run length control in the presence of an initial transient. Operations Research 31:1109–44.

    Lin HQ, McCulloch CE, Turnbull BW, Slate EH, Clark LC. (2000) A latent class mixed model for analysing biomarker trajectories with irregularly scheduled observations. Statistics in Medicine 19:1303–18.

    Marshall G and Barón AE. (2000) Linear discriminant models for unbalanced longitudinal data. Statistics in Medicine 19:1969–81.

    McCulloch CE, Lin H, Slate EH, Turnbull BW. (2002) Discovering subpopulation structure with latent class mixed models. Statistics in Medicine 21:417–29.

    Muthén B, Brown CH, Masyn K, Jo B, Khoo ST, Yang CC, Wang CP, Kellam SG, Carlin JB, Liao J. (2002) General growth mixture modeling for randomized preventive interventions. Biostatistics 3:459–75.

    Muthén B and Shedden K. (1999) Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 55:463–9.

    Pinheiro JC and Bates DM. (2000) Mixed-Effects Models in S and S-PLUS (Springer, New York).

    Racine-Poon A. (1985) A Bayesian approach to nonlinear random effects models. Biometrics 41:1015–24.

    Tomasko L, Helms RW, Snapinn SM. (1999) A discriminant analysis extension to mixed models. Statistics in Medicine 18:1249–60.

    Verbeke G and Lesaffre E. (1996) A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association 91:217–21.

    Verbeke G and Molenberghs G. (2000) Linear Mixed Models for Longitudinal Data (Springer, New York).

    Wakefield J. (1996) The Bayesian analysis of population pharmacokinetic models. Journal of the American Statistical Association 92:62–75.

    Wakefield J and Bennet J. (1996) Bayesian modeling of covariates for population pharmacokinetic models. Journal of the American Statistical Association 91:917–27.

    Wakefield JC, Smith AFM, Racine-Poon A, Gelfand AE. (1994) Bayesian analysis of linear and non-linear population models by using the Gibbs sampler. Applied Statistics 43:201–21.

    Wernecke K-D, Kalb G, Schink B, Wegner B. (2004) A mixed model approach to discriminant analysis with longitudinal data. Biometrical Journal 46:246–54.

    Witt BR, Wolf GC, Wainwright CJ, Johnston PD, Thorneycroft IH. (1990) Relaxin, Ca-125, progesterone, estradiol, Schwangerschaft protein, and human chorionic gonadotrophin as predictors of outcome in threatened and not-threatened pregnancy. Fertility and Sterility 53:1029–36.

    Yamashita T, Okamoto S, Thomas A, MacLachlan V, Healy DL. (1989) Predicting pregnancy outcome after in-vitro fertilization and embryo transfer using estradiol, progesterone and human chorionic gonadotrophin β-subunit. Fertility and Sterility 51:304–9.

    Ziegelmann P and Brown PJ. (2001) Bayesian approach in pharmacokinetics models. In George EI (Ed.). Bayesian Methods with Applications to Science, Policy, and Official Statistics, Selected Papers from ISBA 2000. Monographs of Official Statistics (Eurostat, Luxembourg) pp. 583–92.

    Received June 23, 2004; revised April 25, 2005; revised December 15, 2005; revised April 12, 2006; accepted for publication May 10, 2006.

