Biostatistics Advance Access first published online on May 11, 2006
This version published online on March 5, 2007

Biostatistics, doi:10.1093/biostatistics/kxl004

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

A unification of models for meta-analysis of diagnostic accuracy studies

Roger M. Harbord*

Medical Research Council Health Services Research Collaboration, Department of Social Medicine, University of Bristol, Canynge Hall, Whiteladies Road, Bristol BS8 2PR, UK roger.harbord{at}bristol.ac.uk

Jonathan J. Deeks

Centre for Statistics in Medicine, Oxford, UK

Matthias Egger

Department of Social and Preventive Medicine, University of Berne, Switzerland

Penny Whiting and Jonathan A. C. Sterne

Medical Research Council Health Services Research Collaboration, Department of Social Medicine, University of Bristol, UK

* To whom correspondence should be addressed.


    SUMMARY
 TOP
 SUMMARY
 1. INTRODUCTION
 2. THE BIVARIATE MODEL
 3. THE HSROC MODEL
 4. RELATIONS BETWEEN THE...
 5. FOCUS OF INFERENCE...
 6. EXAMPLE: LYMPHANGIOGRAPHY FOR...
 7. DISCUSSION
 REFERENCES
 
Studies of diagnostic accuracy require more sophisticated methods for their meta-analysis than studies of therapeutic interventions. A number of different, and apparently divergent, methods for meta-analysis of diagnostic studies have been proposed, including two alternative approaches that are statistically rigorous and allow for between-study variability: the hierarchical summary receiver operating characteristic (ROC) model (Rutter and Gatsonis, 2001) and bivariate random-effects meta-analysis (van Houwelingen and others, 1993, 2002; Reitsma and others, 2005). We show that these two models are very closely related, and define the circumstances in which they are identical. We discuss the different forms of summary model output suggested by the two approaches, including summary ROC curves, summary points, confidence regions, and prediction regions.

Keywords: Bivariate normal distribution; Diagnostic tests; Hierarchical models; HSROC model; Meta-analysis; ROC analysis; Sensitivity and specificity


    1. INTRODUCTION
 
There is increasing interest in systematic reviews and meta-analyses of data from diagnostic accuracy studies (Deeks, 2001; Deville and others, 2002; Bossuyt and others, 2003; Khan and others, 2003; Whiting and others, 2004; Tatsioni and others, 2005; Gluud and Gluud, 2005). Typically, the data from each primary study are summarized as a 2×2 table, based on dichotomized test result against true disease status, from which familiar measures such as sensitivity and specificity can be derived.
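As a minimal sketch of this data reduction, the following Python function derives sensitivity, specificity, and their logit transforms from one 2×2 table. The function name `accuracy_from_table` and the counts are illustrative assumptions, not taken from any study discussed in this paper.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def accuracy_from_table(tp, fn, fp, tn):
    """Summarize one 2x2 table of test result against true disease status.

    tp, fn: diseased patients with a positive / negative test result
    fp, tn: non-diseased patients with a positive / negative test result
    """
    sens = tp / (tp + fn)   # true-positive rate
    spec = tn / (tn + fp)   # true-negative rate
    return {
        "sens": sens,
        "spec": spec,
        "logit_sens": logit(sens),
        "logit_spec": logit(spec),
    }

# Hypothetical counts, for illustration only.
result = accuracy_from_table(tp=30, fn=10, fp=5, tn=55)
print(result)
```

A table with a zero cell makes the empirical logit undefined in this sketch, which is one practical reason to model the binomial counts directly rather than the empirical logits.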

Several statistical methods for meta-analysis of data from diagnostic test accuracy studies have been proposed (Moses and others, 1993; Rutter and Gatsonis, 2001; Dukic and Gatsonis, 2003; Siadaty and Shu, 2004; Reitsma and others, 2005). These methods reflect two important characteristics of such data. First, a negative correlation between sensitivity and specificity is expected because of the trade-off between these measures as the test threshold varies (Moses and others, 1993; Deeks, 2001). Second, and in contrast to meta-analysis of data from randomized controlled trials, substantial between-study heterogeneity is to be expected and must be incorporated in the models (Lijmer and others, 2002). The inferential focus of these methods is also a matter of debate. Some authors propose estimating summary measures of sensitivity and specificity, or prediction regions within which we may expect the results of a future study to lie (Reitsma and others, 2005), while others suggest that in the presence of substantial heterogeneity, the results of meta-analyses should be presented as summary receiver operating characteristic (SROC) curves (Rutter and Gatsonis, 2001).

Littenberg and Moses (1993) (see also Moses and others, 1993) proposed a frequently used method of generating a SROC curve by simple linear regression. However, the assumptions of simple linear regression are not met, so the method is only approximate. There is also uncertainty as to the most appropriate weighting of the regression (Walter, 2002; Rutter and Gatsonis, 2001).

Two statistically rigorous methods for the meta-analysis of data from diagnostic test accuracy studies have been proposed (Reitsma and others, 2005; Rutter and Gatsonis, 2001) that overcome these problems but are necessarily more complex. In this paper, we review the characteristics of these methods. We show that although these have been discussed as alternative ways to analyze such data, they are equivalent in many circumstances and hence often lead to identical statistical inferences. Section 2 describes the bivariate model, while Section 3 describes the hierarchical summary receiver operating characteristic (HSROC) model. In Section 4, we explain the relationship between these two models. In Section 5, we discuss the different focus of inference and presentation of model estimates suggested by the two parameterizations. A worked example is presented in Section 6, and the implications of the work are discussed in Section 7.


    2. THE BIVARIATE MODEL
 
The bivariate model is based on an approach to meta-analysis introduced by van Houwelingen and others (1993) (see also van Houwelingen and others, 2002). It has recently been applied to meta-analysis of diagnostic accuracy studies by Reitsma and others (2005).

Following Reitsma and others (2005), we define $\mu_{Ai}$ as the logit-transformed sensitivity in study $i$, and $\mu_{Bi}$ as the logit-transformed specificity. We use the letter $\mu$ where Reitsma and others (2005) used $\theta$ to avoid a clash of notation with the HSROC model defined in Section 3. The bivariate model is a random-effects model in which the logit transforms of the true sensitivity and true specificity in each study are assumed to have a bivariate normal distribution across studies, thereby allowing for the possibility of correlation between them (Reitsma and others, 2005):

$$
\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \Sigma_{AB} \right), \qquad \Sigma_{AB} = \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}. \tag{2.1}
$$

Covariates that affect either sensitivity or specificity or both can be included in a natural way by replacing one or both of the means $\mu_A$ and $\mu_B$ by linear predictors in the covariates. For example, for a single covariate $Z$ that may affect both sensitivity and specificity, we could replace $\mu_A$ by $\mu_A + \nu_A Z_i$ and $\mu_B$ by $\mu_B + \nu_B Z_i$.


    3. THE HSROC MODEL
 
The HSROC model (Rutter and Gatsonis, 2001) was motivated by a model for ordinal regression (McCullagh, 1980) that has been used to estimate a receiver operating characteristic (ROC) curve from a single study with data available for multiple thresholds (Tosteson and Begg, 1988). The model is formulated in terms of the probability $\pi_{ij}$ that a patient in study $i$ with disease status $j$ has a positive test result, where $j = 0$ for a patient without the disease and $j = 1$ for a patient with the disease. In the usual terminology of diagnostic accuracy studies, $\pi_{i1}$ is the true-positive rate or sensitivity in study $i$, while $\pi_{i0}$ is the false-positive rate, equal to 1 – specificity. The HSROC model is defined by separate equations for within-study variation (Level I) and between-study variation (Level II). (The Bayesian formulation originally presented by Rutter and Gatsonis (2001) requires an additional third level specifying the priors for the model parameters.)

3.1 HSROC level I (within study) model

The Level I model for study $i$ takes the form

$$
\operatorname{logit}(\pi_{ij}) = (\theta_i + \alpha_i X_{ij}) \exp(-\beta X_{ij}), \tag{3.1}
$$

where $X_{ij}$ is a dummy variable denoting the true disease status for a patient in study $i$ with disease status $j$. Rutter and Gatsonis (2001) chose to code $X_{i0} = -\tfrac{1}{2}$ for those without disease ($j = 0$) and $X_{i1} = +\tfrac{1}{2}$ for those with disease ($j = 1$). Both $\theta_i$ and $\alpha_i$ are allowed to vary between studies. Rutter and Gatsonis (2001) refer to the $\theta_i$ as "cutpoint parameters" or "positivity criteria," as they model the trade-off between sensitivity and specificity in each study: true-positive rate (sensitivity) and false-positive rate (1 – specificity) both increase with increasing $\theta_i$. The $\alpha_i$ are "accuracy parameters," as they measure the difference between true-positive and false-positive fractions in each study. When $\beta = 0$, the diagnostic odds ratio for each study does not depend on the cutpoint parameter $\theta_i$, and $\alpha_i$ is then the log of the diagnostic odds ratio. $\beta$ is a "scale parameter" or "shape parameter" which models possible asymmetry in the ROC curve by allowing true-positive and false-positive fractions to increase at different rates as $\theta_i$ increases. When $\beta \neq 0$, the diagnostic odds ratio varies with $\theta_i$ even if the accuracy parameter $\alpha_i$ is held fixed. $\beta$ is assumed to be constant across studies, although this assumption can be relaxed somewhat, for example to allow a different value of $\beta$ in each of several groups of studies (Rutter and Gatsonis, 2001).
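The roles of the three Level I parameters can be checked numerically. The sketch below assumes the Level I form logit(π_ij) = (θ_i + α_i X_ij) exp(−β X_ij) with the coding X = −1/2 (no disease) and X = +1/2 (disease). The function names and parameter values are illustrative assumptions.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def hsroc_level1(theta, alpha, beta):
    """Sensitivity and false-positive rate for one study.

    Assumes X = +1/2 for diseased and X = -1/2 for non-diseased patients.
    """
    sens = expit((theta + 0.5 * alpha) * math.exp(-0.5 * beta))
    fpr = expit((theta - 0.5 * alpha) * math.exp(0.5 * beta))
    return sens, fpr

def log_dor(sens, fpr):
    """Log diagnostic odds ratio."""
    return math.log(sens / (1.0 - sens)) - math.log(fpr / (1.0 - fpr))

# Hypothetical values: with beta = 0 the log DOR equals alpha for every theta;
# with beta != 0 it changes as theta changes.
for beta in (0.0, 0.5):
    for theta in (-1.0, 0.0, 1.0):
        s, f = hsroc_level1(theta, alpha=2.0, beta=beta)
        print(beta, theta, round(log_dor(s, f), 4))
```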

3.2 HSROC level II (between study) model

Level II models the variation of the parameters $\theta_i$ and $\alpha_i$ between studies. In the simplest case, $\theta_i$ and $\alpha_i$ are assumed to have independent normal distributions, with $\theta_i \sim N(\Theta, \sigma_\theta^2)$ and $\alpha_i \sim N(\Lambda, \sigma_\alpha^2)$. More generally, the means of the two distributions may be determined by linear functions of study-level covariates. For example, with a single covariate $Z$ that affects both the cutpoint and accuracy parameters,

$$
\theta_i \sim N(\Theta + \gamma Z_i, \sigma_\theta^2), \tag{3.2}
$$

$$
\alpha_i \sim N(\Lambda + \lambda Z_i, \sigma_\alpha^2), \tag{3.3}
$$

where the coefficients $\gamma$ and $\lambda$ express the effect of the covariate $Z$ on the cutpoint and accuracy parameters, respectively. This model may be extended to include more than one covariate, or to allow the covariates that affect the accuracy parameters to differ from those that affect the cutpoint parameters.


    4. RELATIONS BETWEEN THE TWO MODELS
 
We now clarify the relationship between the bivariate and HSROC models. We shall start from the HSROC model. For brevity, first let $b = \exp(\beta/2)$. We can reexpress level I of the HSROC model by splitting (3.1) into separate equations for those with and without disease:

$$
\operatorname{logit}(\pi_{i1}) = b^{-1}\left(\theta_i + \tfrac{1}{2}\alpha_i\right), \tag{4.1}
$$

$$
\operatorname{logit}(\pi_{i0}) = b\left(\theta_i - \tfrac{1}{2}\alpha_i\right). \tag{4.2}
$$

The bivariate model is written in terms of $\mu_{Ai}$ and $\mu_{Bi}$, the logit transforms of sensitivity and specificity in study $i$. In the notation introduced in Section 3, the sensitivity in study $i$ is $\pi_{i1}$ and the specificity is $1 - \pi_{i0}$, so

$$
\mu_{Ai} = \operatorname{logit}(\pi_{i1}), \tag{4.3}
$$

$$
\mu_{Bi} = \operatorname{logit}(1 - \pi_{i0}) = -\operatorname{logit}(\pi_{i0}). \tag{4.4}
$$

We can therefore relate the random variables that form the basis of the two models:

$$
\mu_{Ai} = b^{-1}\left(\theta_i + \tfrac{1}{2}\alpha_i\right), \tag{4.5}
$$

$$
\mu_{Bi} = -b\left(\theta_i - \tfrac{1}{2}\alpha_i\right). \tag{4.6}
$$

This pair of equations tells us that $\mu_{Ai}$ and $\mu_{Bi}$ are linear combinations of two random variables, $\theta_i$ and $\alpha_i$, which the HSROC model assumes to have independent normal distributions (conditional on any covariates). Any pair of linear combinations of independent normally distributed random variables has a bivariate normal distribution (see, e.g. Dudewicz and Mishra, 1988, p. 242). Therefore, the HSROC model implies that the joint distribution of $\mu_{Ai}$ and $\mu_{Bi}$ is bivariate normal. So the HSROC model is precisely equivalent to the bivariate model. We give explicit expressions for the relationships between their parameters in the subsections that follow.

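We can express the relationship more concisely using matrix notation. We may write (4.5) and (4.6) in a single matrix equation as

$$
\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} = \begin{pmatrix} b^{-1} & \tfrac{1}{2} b^{-1} \\ -b & \tfrac{1}{2} b \end{pmatrix} \begin{pmatrix} \theta_i \\ \alpha_i \end{pmatrix}. \tag{4.7}
$$

Inverting this,

$$
\begin{pmatrix} \theta_i \\ \alpha_i \end{pmatrix} = S \begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix}, \qquad S = \begin{pmatrix} \tfrac{1}{2} b & -\tfrac{1}{2} b^{-1} \\ b & b^{-1} \end{pmatrix}. \tag{4.8}
$$

$S$ is then the transformation matrix associated with the change from the bivariate model coordinates (logit-transformed sensitivity and specificity) to the HSROC model coordinates (cutpoint and accuracy parameters). Note that $S$ is not orthogonal ($S^{-1} \neq S^{T}$). As illustrated in Section 6, it follows that when plotted in bivariate model space (logit-ROC space), the axes corresponding to the coordinates of the HSROC model are not perpendicular to each other.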
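A quick numerical check of these two statements is sketched below, assuming the matrix forms implied by solving $\mu_{Ai} = b^{-1}(\theta_i + \alpha_i/2)$ and $\mu_{Bi} = -b(\theta_i - \alpha_i/2)$ for $\theta_i$ and $\alpha_i$. The helper names and the value of $\beta$ are illustrative assumptions.

```python
import math

def hsroc_to_bivariate_matrix(beta):
    """Matrix taking (theta_i, alpha_i) to (mu_Ai, mu_Bi)."""
    b = math.exp(beta / 2.0)
    return [[1.0 / b, 0.5 / b],
            [-b, 0.5 * b]]

def transformation_matrix_s(beta):
    """Inverse matrix S, taking (mu_Ai, mu_Bi) to (theta_i, alpha_i)."""
    b = math.exp(beta / 2.0)
    return [[0.5 * b, -0.5 / b],
            [b, 1.0 / b]]

def matmul2(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

beta = 0.4  # arbitrary illustrative value
S = transformation_matrix_s(beta)
inverse_check = matmul2(S, hsroc_to_bivariate_matrix(beta))
orthogonality_check = matmul2(S, transpose2(S))
print(inverse_check)        # should be the identity, up to rounding
print(orthogonality_check)  # would be the identity only if S were orthogonal
```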

4.1 Relation between parameters of models with no covariates

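We can then express the relationship between the parameters of the two models without covariates in terms of the transformation matrix $S$ by taking the expectation and variance of both sides of (4.8):

$$
\begin{pmatrix} \Theta \\ \Lambda \end{pmatrix} = S \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \tag{4.9}
$$

$$
\begin{pmatrix} \sigma_\theta^2 & 0 \\ 0 & \sigma_\alpha^2 \end{pmatrix} = S \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix} S^{T}. \tag{4.10}
$$

The assumption of the HSROC model that $\theta_i$ and $\alpha_i$ are uncorrelated, i.e. the off-diagonal elements above are zero, fixes the value of $b$ and hence the transformation matrix $S$. So $S$ is a non-orthogonal transformation that diagonalizes the variance–covariance matrix of the bivariate model. On expanding the right-hand side of (4.10), we find that these off-diagonal elements are zero if and only if $b = (\sigma_B/\sigma_A)^{1/2}$ or, equivalently,

$$
\beta = \log(\sigma_B/\sigma_A). \tag{4.11}
$$

Thus, the shape parameter ($\beta$) of the HSROC model is determined solely by the ratio of the variances of logit sensitivity and logit specificity in the bivariate model, and, perhaps surprisingly, is unrelated to their correlation. Equations (4.9) and (4.10) then allow us to relate the other parameters of the HSROC model to those of the bivariate model:

$$
\Theta = \tfrac{1}{2}\left[ (\sigma_B/\sigma_A)^{1/2} \mu_A - (\sigma_A/\sigma_B)^{1/2} \mu_B \right], \tag{4.12}
$$

$$
\Lambda = (\sigma_B/\sigma_A)^{1/2} \mu_A + (\sigma_A/\sigma_B)^{1/2} \mu_B, \tag{4.13}
$$

$$
\sigma_\theta^2 = \tfrac{1}{2}\left( \sigma_A \sigma_B - \sigma_{AB} \right), \tag{4.14}
$$

$$
\sigma_\alpha^2 = 2\left( \sigma_A \sigma_B + \sigma_{AB} \right). \tag{4.15}
$$

We can also invert these equations to give the five parameters of the bivariate model in terms of those of the HSROC model:

$$
\mu_A = b^{-1}\left( \Theta + \tfrac{1}{2}\Lambda \right), \tag{4.16}
$$

$$
\mu_B = -b\left( \Theta - \tfrac{1}{2}\Lambda \right), \tag{4.17}
$$

$$
\sigma_A^2 = b^{-2}\left( \sigma_\theta^2 + \tfrac{1}{4}\sigma_\alpha^2 \right), \tag{4.18}
$$

$$
\sigma_B^2 = b^{2}\left( \sigma_\theta^2 + \tfrac{1}{4}\sigma_\alpha^2 \right), \tag{4.19}
$$

$$
\sigma_{AB} = -\left( \sigma_\theta^2 - \tfrac{1}{4}\sigma_\alpha^2 \right), \tag{4.20}
$$

where $b = \exp(\beta/2)$, as defined above.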
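The two-way conversion can be sketched in Python as follows. The formulas coded here are the ones that follow from $\theta_i = \tfrac{1}{2}(b\mu_{Ai} - b^{-1}\mu_{Bi})$ and $\alpha_i = b\mu_{Ai} + b^{-1}\mu_{Bi}$ with $b = (\sigma_B/\sigma_A)^{1/2}$. The function names and the numerical inputs are illustrative assumptions, not estimates from any data set.

```python
import math

def bivariate_to_hsroc(mu_a, mu_b, var_a, var_b, cov_ab):
    """Return (beta, Theta, Lambda, var_theta, var_alpha)."""
    sd_a, sd_b = math.sqrt(var_a), math.sqrt(var_b)
    beta = math.log(sd_b / sd_a)           # shape parameter
    b = math.exp(beta / 2.0)
    big_theta = 0.5 * (b * mu_a - mu_b / b)
    big_lambda = b * mu_a + mu_b / b
    var_theta = 0.5 * (sd_a * sd_b - cov_ab)
    var_alpha = 2.0 * (sd_a * sd_b + cov_ab)
    return beta, big_theta, big_lambda, var_theta, var_alpha

def hsroc_to_bivariate(beta, big_theta, big_lambda, var_theta, var_alpha):
    """Return (mu_a, mu_b, var_a, var_b, cov_ab)."""
    b = math.exp(beta / 2.0)
    mu_a = (big_theta + 0.5 * big_lambda) / b
    mu_b = -b * (big_theta - 0.5 * big_lambda)
    common = var_theta + 0.25 * var_alpha
    var_a = common / b ** 2
    var_b = common * b ** 2
    cov_ab = -(var_theta - 0.25 * var_alpha)
    return mu_a, mu_b, var_a, var_b, cov_ab

# Hypothetical bivariate parameters, for illustration only.
bivariate = (0.7, 1.6, 0.2, 0.5, 0.1)
hsroc = bivariate_to_hsroc(*bivariate)
round_trip = hsroc_to_bivariate(*hsroc)
print(hsroc)
print(round_trip)  # should reproduce the bivariate inputs, up to rounding
```

Note that the shape parameter returned by `bivariate_to_hsroc` does not use `cov_ab` at all, in line with the remark above that β depends only on the ratio of the variances.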

4.2 Relations between parameters of models with covariates

We now move on to examine the relationship between the models when covariates are included. If the bivariate model is extended to include a single covariate $Z$ that affects both the sensitivity and specificity, (4.10) is unchanged, while (4.9) for the expectation of (4.8) becomes

$$
E\begin{pmatrix} \theta_i \\ \alpha_i \end{pmatrix} = S \begin{pmatrix} \mu_A + \nu_A Z_i \\ \mu_B + \nu_B Z_i \end{pmatrix} = S \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix} + S \begin{pmatrix} \nu_A \\ \nu_B \end{pmatrix} Z_i. \tag{4.21}
$$

This is of the form

$$
E\begin{pmatrix} \theta_i \\ \alpha_i \end{pmatrix} = \begin{pmatrix} \Theta + \gamma Z_i \\ \Lambda + \lambda Z_i \end{pmatrix}. \tag{4.22}
$$

The extension to more than one covariate, with each covariate affecting both accuracy and cutpoint parameters, is straightforward. Therefore, a bivariate model in which one or more covariates affect both sensitivity and specificity is equivalent to an HSROC model in which the same covariates are allowed to affect both accuracy and cutpoint parameters.

However, a bivariate model in which different covariates are allowed to affect sensitivity and specificity, or in which covariates are included for only sensitivity or only specificity, will not be equivalent to an HSROC model including covariates, unless constraints are imposed on the relationship between the coefficients of the covariates in the HSROC model. The converse is also true.
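As a worked illustration of such constraints (a sketch under the notation of Sections 2 and 3, with $\nu_A$, $\nu_B$ the covariate coefficients of the bivariate model, $\gamma$, $\lambda$ those of the HSROC model, and $b = \exp(\beta/2)$), solving the linear relation between the two sets of random effects gives $\theta_i = \tfrac{1}{2}(b\mu_{Ai} - b^{-1}\mu_{Bi})$ and $\alpha_i = b\mu_{Ai} + b^{-1}\mu_{Bi}$. Taking expectations and matching the coefficients of $Z_i$ then yields

$$
\gamma = \tfrac{1}{2}\left( b\,\nu_A - b^{-1}\nu_B \right), \qquad \lambda = b\,\nu_A + b^{-1}\nu_B .
$$

Under these relations, a covariate that affects only sensitivity ($\nu_B = 0$) corresponds to the constraint $\lambda = 2\gamma$ in the HSROC model, and a covariate that affects only specificity ($\nu_A = 0$) corresponds to $\lambda = -2\gamma$.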


    5. FOCUS OF INFERENCE AND MODEL OUTPUTS
 
The two parameterizations make different forms of model output appear more natural.

5.1 HSROC model

The HSROC model gives rise to a SROC curve by allowing the threshold parameter $\theta_i$ to vary while holding the accuracy parameter $\alpha_i$ fixed at its mean $\Lambda$. For the model without covariates, the expected sensitivity for a given specificity is then given by (Rutter and Gatsonis, 2001; Macaskill, 2004)

$$
\text{sensitivity} = \frac{1}{1 + \exp\left[ -\left( \Lambda e^{-\beta/2} + e^{-\beta} \operatorname{logit}(1 - \text{specificity}) \right) \right]}. \tag{5.1}
$$

Rutter and Gatsonis (2001) suggest that the curve be restricted to the observed range of estimated specificities of the studies to discourage extrapolation beyond the data. If $\beta = 0$, the curve is symmetric about the "sensitivity = specificity" diagonal. This SROC curve does not depict the uncertainty in any of the parameter estimates and depicts the variability in threshold but not in accuracy.
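The curve can be evaluated with a few lines of Python. This sketch assumes the curve has the form logit(sensitivity) = Λ exp(−β/2) + exp(−β) logit(1 − specificity), which is what results from eliminating θ while holding α at Λ. The function name and the parameter values are illustrative assumptions.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def sroc_sensitivity(specificity, big_lambda, beta):
    """Expected sensitivity on the SROC curve at a given specificity."""
    linear = (big_lambda * math.exp(-beta / 2.0)
              + math.exp(-beta) * logit(1.0 - specificity))
    return expit(linear)

# Hypothetical accuracy and shape parameters, for illustration only.
for spec in (0.6, 0.7, 0.8, 0.9):
    print(spec, round(sroc_sensitivity(spec, big_lambda=2.0, beta=0.3), 4))
```

With `beta=0.0` the curve is symmetric in the sense stated above: if specificity s gives sensitivity t, then specificity t gives sensitivity s.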

5.2 Bivariate model

As Reitsma and others (2005) suggest, confidence and prediction regions in ROC space can be constructed using the estimates from the bivariate model. As sensitivity and specificity may be highly correlated, separate confidence intervals for the mean logit sensitivity $\mu_A$ and mean logit specificity $\mu_B$ may be misleading. It is preferable to use an elliptical joint confidence region for both parameters. Such an ellipse is most easily generated using a parametric representation (Douglas, 1993):

$$
\mu_A = \hat{\mu}_A + s_A\, c \cos t, \tag{5.2}
$$

$$
\mu_B = \hat{\mu}_B + s_B\, c \cos(t + \arccos r), \tag{5.3}
$$

where $s_A$ and $s_B$ are the estimated standard errors of $\hat{\mu}_A$ and $\hat{\mu}_B$, $r$ is the estimate of their correlation, and varying $t$ from 0 to $2\pi$ generates the boundary of the ellipse. The constant $c$ has been called the boundary constant of the ellipse (Alexandersson, 2004); asymptotically, to give a $100(1 - \alpha)\%$ confidence region, $c = \sqrt{\chi^2_{2;\alpha}}$, where $\chi^2_{2;\alpha}$ is the upper $100\alpha\%$ point of the $\chi^2$ distribution with two degrees of freedom (here $\alpha$ denotes the significance level, not the accuracy parameter of Section 3). When the number of studies is small, it may be preferable to use a more conservative approximate confidence region given by $c = \sqrt{2 f_{2,n-2;\alpha}}$, where $n$ is the number of studies and $f_{2,n-2;\alpha}$ is the upper $100\alpha\%$ point of the $F$ distribution with degrees of freedom 2 and $n - 2$ (Douglas, 1993; Chew, 1966). Such an ellipse in logit-ROC space can then be back-transformed to conventional ROC space to give a confidence region for the summary operating point.
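A minimal sketch of this construction is given below. It assumes the parametric form $\mu_A = \hat{\mu}_A + s_A c \cos t$, $\mu_B = \hat{\mu}_B + s_B c \cos(t + \arccos r)$, and it uses closed forms for the two boundary constants: the $\chi^2$ distribution with two degrees of freedom has upper tail probability $\exp(-x/2)$, and the $F$ distribution with 2 and $m$ degrees of freedom has upper tail probability $(1 + 2f/m)^{-m/2}$. Function names and input values are illustrative assumptions.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def boundary_constant(level=0.95, n_studies=None):
    """Boundary constant c; uses the F-based version if n_studies is given."""
    alpha = 1.0 - level
    if n_studies is None:
        # Upper alpha point of chi-square(2) is -2 ln(alpha).
        return math.sqrt(-2.0 * math.log(alpha))
    m = n_studies - 2
    # Upper alpha point of F(2, m), from its closed-form tail probability.
    f_upper = 0.5 * m * (alpha ** (-2.0 / m) - 1.0)
    return math.sqrt(2.0 * f_upper)

def ellipse_boundary(mu_a_hat, mu_b_hat, s_a, s_b, r, c, n_points=200):
    """Boundary points (logit sensitivity, logit specificity) of the ellipse."""
    points = []
    for k in range(n_points + 1):
        t = 2.0 * math.pi * k / n_points
        logit_sens = mu_a_hat + s_a * c * math.cos(t)
        logit_spec = mu_b_hat + s_b * c * math.cos(t + math.acos(r))
        points.append((logit_sens, logit_spec))
    return points

def to_roc_space(points):
    """Back-transform to (1 - specificity, sensitivity) for plotting."""
    return [(1.0 - expit(lb), expit(la)) for la, lb in points]

# Hypothetical estimates, for illustration only.
c = boundary_constant(0.95)
points = ellipse_boundary(0.7, 1.6, 0.15, 0.2, 0.3, c)
roc_points = to_roc_space(points)
print(c, boundary_constant(0.95, n_studies=17))
print(roc_points[:3])
```

The F-based constant is always the larger of the two, so the corresponding region is wider, and it approaches the $\chi^2$-based constant as the number of studies grows.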

It is also possible to construct a prediction region giving the region which has a given probability (e.g. 95%) of including the "true" sensitivity and specificity of a future study. The covariance matrix for the true logit sensitivity and logit specificity in a future study is

$$
\operatorname{Var}\begin{pmatrix} \hat{\mu}_A \\ \hat{\mu}_B \end{pmatrix} + \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}. \tag{5.4}
$$

In practice, both terms (the covariance matrix of the estimated means and the between-study covariance matrix of the bivariate model) must be estimated by fitting the model to the data. The parameters $s_A$, $s_B$, and $r$ in (5.2) and (5.3) can then be replaced by the corresponding quantities derived from this covariance matrix to give the prediction ellipse in logit-ROC space. Again, this can be back-transformed to a prediction region for the true sensitivity and specificity of a future study in conventional ROC space.
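The substitution described here amounts to adding two 2×2 matrices and reading off standard deviations and a correlation. A sketch follows; the function name and the matrices are illustrative assumptions, not fitted values.

```python
import math

def prediction_ellipse_inputs(between_study_cov, cov_of_means):
    """Return (s_a, s_b, r) for the prediction ellipse.

    between_study_cov: estimated covariance matrix of the true logit
        sensitivity and logit specificity across studies
    cov_of_means: estimated covariance matrix of the two estimated means
    Both are 2x2 nested lists ordered (sensitivity, specificity).
    """
    total = [[between_study_cov[i][j] + cov_of_means[i][j] for j in range(2)]
             for i in range(2)]
    s_a = math.sqrt(total[0][0])
    s_b = math.sqrt(total[1][1])
    r = total[0][1] / (s_a * s_b)
    return s_a, s_b, r

# Hypothetical matrices, for illustration only.
sigma = [[0.20, 0.10], [0.10, 0.50]]
var_mu_hat = [[0.020, 0.005], [0.005, 0.040]]
s_a, s_b, r = prediction_ellipse_inputs(sigma, var_mu_hat)
print(s_a, s_b, r)
```

Because the between-study covariance matrix is added to the covariance matrix of the means, the prediction ellipse is never smaller than the confidence ellipse in either direction.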


    6. EXAMPLE: LYMPHANGIOGRAPHY FOR DIAGNOSIS OF LYMPH NODE METASTASIS
 
As an example, we shall apply both methods to data on 17 studies of lymphangiography for the diagnosis of lymph node metastasis in women with cervical cancer, one of three imaging techniques in the meta-analysis of Scheidler and others (1997), which has been much used as an example data set for methodological papers on diagnostic meta-analysis (Rutter and Gatsonis, 2001; Macaskill, 2004; Reitsma and others, 2005). A SROC plot showing the estimates of sensitivity and specificity from the individual studies is shown in Figure 1.


Fig. 1. SROC plot for the data of Scheidler and others (1997) on lymphangiography for the diagnosis of lymph node metastasis in women with cervical cancer. The areas of the circles are in proportion to the number of patients in each study.

 
We fitted both the bivariate and the HSROC models using the NLMIXED procedure in the statistical software package SAS (SAS Institute Inc., 2003), using code similar to that given by Macaskill (2004) and available from the authors on request. Note that our results differ slightly from those in Reitsma and others (2005), as they used empirical logit transforms and their standard errors followed by the MIXED procedure in SAS, whereas we chose to model the binomial error structure directly using the NLMIXED procedure.

Table 1 shows the parameter estimates obtained for both models, and the result of applying (4.11)–(4.20) to transform estimates from the HSROC model to the corresponding parameters of the bivariate model and vice versa. The standard errors of the transformed estimates were computed by the delta method using the ESTIMATE statement of the NLMIXED procedure. As can be seen, the results are virtually identical. (The standard errors are identical in theory due to the close relationship between the delta method and maximum likelihood; Cox, 1998; Cox and Hinkley, 1974, Exercise 4.15.) By taking the inverse logit transforms of $\mu_A$ and $\mu_B$, respectively, and assuming their estimates have a normal distribution, the summary estimate of sensitivity is found to be 0.67 (95% CI, 0.60–0.74) and that of specificity is 0.84 (95% CI, 0.76–0.89). In this example, $\sigma_{AB}$ is estimated to be positive, though with large standard error. This implies a positive correlation between sensitivity and specificity across the studies, not the negative correlation that would be expected if the between-study heterogeneity were due mainly to variation in threshold.
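The back-transformation used for the summary estimates can be sketched as follows: a Wald interval is formed on the logit scale and its end points are passed through the inverse logit. The function name and the inputs below are illustrative assumptions and are not the fitted values behind Table 1.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def back_transformed_ci(logit_estimate, standard_error, z=1.959964):
    """Point estimate and approximate 95% CI on the probability scale.

    Assumes the logit-scale estimate is approximately normal;
    z is the upper 2.5% point of the standard normal distribution.
    """
    lower = logit_estimate - z * standard_error
    upper = logit_estimate + z * standard_error
    return expit(logit_estimate), expit(lower), expit(upper)

# Hypothetical logit-scale estimate and standard error.
estimate, ci_low, ci_high = back_transformed_ci(0.7, 0.15)
print(round(estimate, 3), round(ci_low, 3), round(ci_high, 3))
```

The resulting interval is generally asymmetric about the point estimate on the probability scale, and it always lies inside (0, 1).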


Table 1. Results of fitting the bivariate and HSROC models to the lymphangiography data

 
Figure 2 shows the 95% confidence region for the summary operating point and a 95% prediction region for the true operating point in a single future study in both logit-transformed ROC space (left panel) and back-transformed to conventional ROC space (right panel). The prediction region covers a greater range of specificity than sensitivity, in contrast to the estimates from the separate studies shown in the SROC plot in Figure 1, which exhibit more variation in estimated sensitivity than specificity. This is due to the fact that most of the studies had a considerably larger number of patients with negative results on the reference test than positive results, leading to greater sampling variability in the estimates of sensitivity than specificity. The prediction region is for the true sensitivity and specificity in a future study, not the estimated values.


Fig. 2. Summary points, lines, and regions in logit-transformed ROC space (left) and conventional ROC space (right). Filled circle: summary point. Solid line: SROC curve. Dotted line: boundary of the confidence region for the summary point. Dashed line: boundary of prediction region. The left-hand panel also shows the HSROC model coordinate axes in logit-transformed ROC space. Note that these axes do not align with the major or minor axes of the ellipse.

 
Also shown in Figure 2 is the SROC curve (a straight line in logit-transformed ROC space). Note that the SROC curve takes a conventional shape despite the positive estimate of the correlation between sensitivity and specificity. The left-hand panel also shows the HSROC coordinate axes in logit-transformed ROC space. Note that these axes do not align with the major or minor axes of the ellipse. The $\theta$ axis is parallel to the ROC curve, while its horizontal reflection is parallel to the $\alpha$ axis. The method of Littenberg and Moses (1993) (using unweighted linear regression) gives a curve similar to, but slightly above, the HSROC curve, as shown in Macaskill (2004).


    7. DISCUSSION
 
We have shown that the HSROC model and the bivariate random-effects model for meta-analysis of diagnostic accuracy studies are very closely related, and in common situations identical. In the absence of study-level covariates, they are different parameterizations of the same model. The bivariate model allows inclusion of covariates that affect sensitivity or specificity or both, while the HSROC model allows covariates that affect accuracy or threshold parameters or both. An HSROC model that allows one or more covariates to affect both accuracy and threshold parameters is equivalent to a bivariate model that allows the same covariates to affect both sensitivity and specificity. However, the HSROC model can be more easily extended to include a covariate to affect the degree of asymmetry of the SROC curve.

The models may differ in the options for introducing greater model parsimony by dropping or combining parameters. The HSROC framework allows the analyst to drop the random effect for the accuracy parameter and assume this is fixed across all studies, and hence that only the threshold parameter varies between studies. This corresponds to perfect negative correlation between the logit transforms of sensitivity and specificity in the bivariate model ($\sigma_{AB} = -\sigma_A \sigma_B$). The confidence and prediction regions then collapse to lie along the SROC curve. The HSROC framework also allows the assumption of a symmetric SROC curve with constant diagnostic odds ratio by setting $\beta = 0$, which in the bivariate model corresponds to equal variances of logit sensitivity and logit specificity ($\sigma_A^2 = \sigma_B^2$). The ability to enforce such constraints on bivariate model parameters may vary between software packages. By contrast, it does not appear natural to set any of the parameters of the bivariate model to zero. One practical advantage of the bivariate model is that it can be fitted in a wider range of software, for example MLwiN, SAS, or the Stata package "gllamm" (Rabe-Hesketh and others, 2004), whereas the HSROC model is at present only estimable using WinBUGS or the NLMIXED procedure in SAS.

As we have seen, the different parameterizations of the HSROC and bivariate models arise from different ideas of the most appropriate meta-analytic summaries of the results of diagnostic test accuracy studies, and have primarily been used to produce these chosen summaries. The HSROC parameterization naturally leads to a SROC curve when the threshold parameter $\theta$ is allowed to vary between studies but the accuracy parameter $\alpha$ is fixed at its mean. This may be reasonable when there is little or no detectable heterogeneity in the accuracy parameter, i.e. $\sigma_\alpha^2$ is estimated to be close to zero, or when there is considerably greater variability in threshold than in accuracy. The bivariate model parameterization naturally leads to a summary operating point, i.e. a summary sensitivity and specificity, together with confidence intervals for each or a joint confidence region for both together. When there is a considerable degree of between-study heterogeneity, as is common in meta-analysis of diagnostic accuracy studies, a prediction region may be preferable to a confidence region.

In our example in Section 6, fitting both models to the same data gave near-identical results in agreement with the formulae derived in Section 4.1, when both models were fitted using the NLMIXED procedure in SAS. However, such close agreement may not always be found in practice, particularly if the models are fitted using different approaches in different software. Rutter and Gatsonis (2001) originally proposed fitting the HSROC model using a Bayesian Markov chain Monte Carlo method. Unlike maximum likelihood estimates, Bayesian posterior means or medians are not invariant under nonlinear transformations such as those in Section 4.1. Reitsma and others (2005) fit the bivariate model using the MIXED procedure in SAS, which, unlike the NLMIXED procedure, requires first calculating empirical estimates of the logit transforms of sensitivity and specificity and their standard errors, treating the latter as fixed and approximating the within-study variability of the logits by a normal distribution. This approach is less computationally demanding but involves some degree of approximation when the study sizes are small. In addition, regardless of the method of estimation, there is typically little information on the covariance parameter $\sigma_{AB}$ of the bivariate model unless there are many studies of reasonable size with considerable variation in sensitivity and specificity between them. Its estimation may therefore prove troublesome (R. Riley and others, in preparation).

Another reason for apparent discrepancies between results in previous publications is that when fitting models to the data of Scheidler and others (1997), authors have made different assumptions about the equality of parameters between the three imaging techniques assessed, of which for simplicity we have only considered one, lymphangiography, in the example here. Rutter and Gatsonis (2001) allowed all five parameters of the HSROC model to differ between the three imaging techniques. Macaskill (2004) assumed that the two variance parameters $\sigma_\theta^2$ and $\sigma_\alpha^2$ were the same while the other three parameters $\Lambda$, $\Theta$, and $\beta$ differed. Reitsma and others (2005) estimated a bivariate model in which the three variance–covariance parameters $\sigma_A$, $\sigma_B$, and $\sigma_{AB}$ are the same for the three imaging techniques and the two location parameters $\mu_A$ and $\mu_B$ differ, thereby constraining the three SROC curves to have the same degree of asymmetry.

It may initially seem surprising that (4.11) for $\beta$, the shape parameter of the HSROC model, does not involve the covariance $\sigma_{AB}$ of the bivariate model but only the ratio of the variances. In fact, $\sigma_{AB}$ only enters (4.14) and (4.15) for the variances of the HSROC parameters. It follows that the equation for the SROC curve given by the HSROC model does not require this covariance. It is therefore possible to use (4.11), (4.13), and (5.1) to estimate the equation of the SROC curve from separate conventional "univariate" random-effects meta-analyses of logit-transformed sensitivity and logit-transformed specificity. These could be performed using any of the widely available packages for random-effects meta-analysis. The estimates obtained by such an approach will not be identical to those from the bivariate model as joint marginal normality of two random variables does not imply they have a bivariate normal distribution. However, if the bivariate normal model does hold, separate univariate analyses should give consistent estimates of the means and variances, with only a slight loss in efficiency (Riley and others, 2006). Separate univariate analyses may therefore provide an alternative to the method of Littenberg and Moses (1993) as a way of generating a SROC curve using widely available algorithms. Separate univariate analyses may also be useful in providing starting values for the iterative procedures required to fit either the bivariate or the HSROC models, which may aid convergence.
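The univariate route to a SROC curve can be sketched as follows, assuming that the means and between-study variances of logit sensitivity and logit specificity have already been estimated by two separate random-effects meta-analyses. The sketch uses $\beta = \log(\sigma_B/\sigma_A)$, $\Lambda = b\mu_A + b^{-1}\mu_B$ with $b = \exp(\beta/2)$, and the curve logit(sensitivity) = Λ exp(−β/2) + exp(−β) logit(1 − specificity). The function name and inputs are illustrative assumptions.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def sroc_from_univariate(mu_a, var_a, mu_b, var_b):
    """Return (beta, Lambda, curve) from two univariate meta-analyses.

    mu_a, var_a: mean and between-study variance of logit sensitivity
    mu_b, var_b: mean and between-study variance of logit specificity
    curve(specificity) gives the expected sensitivity on the SROC curve.
    """
    beta = 0.5 * math.log(var_b / var_a)   # equals log(sd_b / sd_a)
    b = math.exp(beta / 2.0)
    big_lambda = b * mu_a + mu_b / b

    def curve(specificity):
        return expit(big_lambda * math.exp(-beta / 2.0)
                     + math.exp(-beta) * logit(1.0 - specificity))

    return beta, big_lambda, curve

# Hypothetical univariate estimates, for illustration only.
beta, big_lambda, curve = sroc_from_univariate(0.7, 0.2, 1.6, 0.5)
print(beta, big_lambda, curve(0.8))
```

One consequence of these formulas is that the curve passes through the summary point: evaluating `curve` at the summary specificity `expit(mu_b)` returns the summary sensitivity `expit(mu_a)`. The covariance between logit sensitivity and logit specificity is not needed at any step.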

There is empirical evidence that aspects of the design and conduct of diagnostic accuracy studies can lead to bias or increased variation in their results. Exploration of potential sources of heterogeneity is therefore a crucial component of systematic reviews of such studies. Sources of between-study heterogeneity may include differences in patient selection and clinical setting, disease severity, specifics of the index and reference tests, and interobserver variability (Lijmer and others, 1999; Whiting and others, 2004). The expected effect of a covariate on test performance may lead to a preference for one of the two parameterizations implied by the HSROC and bivariate models. For example, "spectrum bias," in which the subjects included in a study are not representative of the patients who will receive the test in practice (Whiting and others, 2003), might be expected to affect test accuracy rather than threshold, and might therefore be most appropriately investigated using the HSROC approach. Conversely, between-study variation in disease severity will affect sensitivity but not specificity, leading to a preference for the bivariate approach. For most study characteristics, however, there are few a priori reasons to prefer one approach over the other; further empirical research on this issue is needed.

The methods explored in this paper assume that only summary data from each study are available in the form of a 2×2 table. Meta-analysis of individual patient data may offer particular advantages for diagnostic research (Khan and others, 2003). It would allow differences in patient spectra to be properly accounted for, and enable assessment of the additional information provided by a test above that already known from patient history and clinical examination. For test results that are originally numerical or ordered categorical, it would also capture within-study information about the ROC curve that is lost when a particular threshold is chosen and the results collapsed into a summary 2×2 table.

In summary, we have demonstrated that the HSROC and bivariate models are very closely related and often identical. The parameter estimates from either model can be used to produce a summary operating point, an SROC curve, confidence regions, or prediction regions. The choice between these parameterizations depends partly on the degrees of and reasons for between-study heterogeneity. Empirical evidence about this would be useful in guiding analysts.


    ACKNOWLEDGMENTS
 
We wish to acknowledge helpful discussions with Petra Macaskill. This work was supported by the MRC Health Services Research Collaboration. Jonathan J. Deeks is funded in part by a Senior Scientist in Evidence Synthesis Award from the UK Department of Health. Conflict of Interest: None declared.


    REFERENCES

    Alexandersson A. (2004) Graphing confidence ellipses: an update of ellip for Stata 8. Stata Journal 4:242–56.

    Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HCW. (2003) Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. British Medical Journal 326:41–4.

    Chew V. (1966) Confidence, prediction, and tolerance regions for the multivariate normal distribution. Journal of the American Statistical Association 61:605–17.

    Cox C. (1998) Delta method. In Armitage P and Colton T (Eds.). Encyclopedia of Biostatistics 1st edition (Wiley, Chichester, UK) pp. 1125–1127.

    Cox DR and Hinkley DV. (1974) Theoretical Statistics (Chapman and Hall, London).

    Deeks JJ. (2001) Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. British Medical Journal 323:157–62.

    Deville W, Buntinx F, Bouter L, Montori V, de Vet H, van der Windt D, Bezemer P. (2002) Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Medical Research Methodology 2:9.

    Douglas JB. (1993) Confidence regions for parameter pairs. American Statistician 47:43–5.

    Dudewicz EJ and Mishra SN. (1988) Modern Mathematical Statistics (Wiley, New York).

    Dukic V and Gatsonis C. (2003) Meta-analysis of diagnostic test accuracy assessment studies with varying number of thresholds. Biometrics 59:936–46.

    Gluud C and Gluud LL. (2005) Evidence based diagnostics. British Medical Journal 330:724–6.

    Khan KS, Bachmann LM, ter Riet G. (2003) Systematic reviews with individual patient data meta-analysis to evaluate diagnostic tests. European Journal of Obstetrics & Gynecology and Reproductive Biology 108:121–5.

    Lijmer JG, Bossuyt PMM, Heisterkamp SH. (2002) Exploring sources of heterogeneity in systematic reviews of diagnostic tests. Statistics in Medicine 21:1525–37.

    Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM. (1999) Empirical evidence of design-related bias in studies of diagnostic tests. Journal of the American Medical Association 282:1061–6.

    Littenberg B and Moses LE. (1993) Estimating diagnostic accuracy from multiple conflicting reports: a new meta-analytic method. Medical Decision Making 13:313–21.

    Macaskill P. (2004) Empirical Bayes estimates generated in a hierarchical summary ROC analysis agreed closely with those of a full Bayesian analysis. Journal of Clinical Epidemiology 57:925–32.

    McCullagh P. (1980) Regression models for ordinal data. Journal of the Royal Statistical Society, Series B, Methodological 42:109–42.

    Moses LE, Shapiro D, Littenberg B. (1993) Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Statistics in Medicine 12:1293–316.

    Rabe-Hesketh S, Pickles A, Skrondal A. (2004) GLLAMM manual. U.C. Berkeley Division of Biostatistics Working Paper Series Working Paper 160. http://www.bepress.com/ucbbiostat/paper160.

    Reitsma JB, Glas AS, Rutjes AWS, Scholten RJPM, Bossuyt PM, Zwinderman AH. (2005) Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of Clinical Epidemiology 58:982–90.

    Riley RD, Abrams KR, Lambert P, Sutton AJ, Thompson JR. (2006) An evaluation of bivariate random-effects meta-analysis for the joint synthesis of two correlated outcomes. Statistics in Medicine doi:10.1002/sim.2524.

    Rutter CM and Gatsonis CA. (2001) A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in Medicine 20:2865–84.

    SAS Institute Inc. (2003) The SAS System for Windows. Version 9.1 (SAS Institute Inc, Cary, NC).

    Scheidler J, Hricak H, Yu KK, Subak L, Segal MR. (1997) Radiological evaluation of lymph node metastases in patients with cervical cancer: a meta-analysis. Journal of the American Medical Association 278:1096–101.

    Siadaty M and Shu J. (2004) Proportional odds ratio model for comparison of diagnostic tests in meta-analysis. BMC Medical Research Methodology 4:27.

    Tatsioni A, Zarin DA, Aronson N, Samson DJ, Flamm CR, Schmid C, Lau J. (2005) Challenges in systematic reviews of diagnostic technologies. Annals of Internal Medicine 142:1048–55.

    Tosteson AN and Begg CB. (1988) A general regression methodology for ROC curve estimation. Medical Decision Making 8:204–15.

    van Houwelingen H, Arends LR, Stijnen T. (2002) Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in Medicine 21:589–624.

    van Houwelingen HC, Zwinderman KH, Stijnen T. (1993) A bivariate approach to meta-analysis. Statistics in Medicine 12:2273–84.

    Walter SD. (2002) Properties of the summary receiver operating characteristic (SROC) curve for diagnostic test data. Statistics in Medicine 21:1237–56.

    Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. (2003) The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology 3:25.

    Whiting P, Rutjes AWS, Reitsma JB, Glas AS, Bossuyt PMM, Kleijnen J. (2004) Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Annals of Internal Medicine 140:189–202.

    Received January 31, 2006; revised March 17, 2006; revised April 10, 2006; accepted for publication May 10, 2006.

