Biostatistics Advance Access originally published online on May 25, 2005
Biostatistics 2005 6(3):500-502; doi:10.1093/biostatistics/kxi034
Letter to the editor
Department of Epidemiology, University of North Carolina at Chapel Hill, CB #7435, Chapel Hill, NC 27599-7435 beverly_rockhill{at}unc.edu
In their recent paper, Gail and Pfeiffer (2005) argue that discriminatory accuracy of a risk prediction model is more crucial in the screening setting than in the preventive intervention setting, noting that use of specific loss functions (rather than use of general measures of discriminatory accuracy) helps one to see this distinction. I have four main comments on, and questions about, these authors' conclusion.
First, the dichotomy between screening and preventive intervention seems somewhat artificial and arbitrary. For instance, the example of screening used in the paper (colonoscopy) can be considered primary or secondary prevention with respect to colon cancer, and, indeed, it is not clear how the authors are considering it (see following paragraph). Colonoscopy can exert its protective effects on risk of colon cancer over a relatively long period (by allowing removal of premalignant lesions which may or may not become malignant) or over a relatively short period (by allowing removal of already malignant lesions).
The assignation of costs or losses to different possible outcomes, and in particular their relative value, rather than the distinction between screening and preventive intervention, seems to be the critical issue. For instance, in the situation of screening for breast cancer with clinical breast exam, both the costs and benefits of this intervention are quite modest (some might even say nil) compared to the costs and benefits of colonoscopy; so it is unlikely that the loss structure would resemble that for colonoscopy, even though both could be considered as screening applications. Screening with prostate-specific antigen is different still in its relative cost structure from the above two screening examples.
In the screening or secondary prevention setting, the relevant disease states are "have disease" and "do not have disease". In the primary prevention setting, as Gail and Pfeiffer note, the relevant disease states are "will get disease" and "will not get disease" (over some considered time period). In their example on colonoscopy, Gail and Pfeiffer assign the greatest cost to those who have disease but who do not get screened: a cost of 100. The decision to not screen those who do not need to be screened is associated with the lowest cost, 0. Screening those who do not have disease, and therefore cannot benefit, is associated with a cost of 1. The second highest cost in this example, 11, is paradoxically assigned to the only group who stands to benefit, those with disease who are screened. This cost of 11 is arrived at by assuming (see p. 233) that colonoscopy reduces risks of death or morbidity in a person otherwise "destined to develop colon cancer" by a factor of 0.1. The language of "destined to develop colon cancer" is a bit confusing here, if this is a true disease-screening application rather than primary prevention intervention, but this is not the key issue here; my goal is simply to note the assigned costs in this example.
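To make the assigned costs concrete, the following minimal sketch computes the disease probability at which screening and not screening carry equal expected loss under the four colonoscopy costs quoted above. The function names and the dictionary layout are hypothetical, and reading 11 as 1 + 0.1 × 100 is an assumption about how the cost was built, not a formula stated here.

```python
# Illustrative sketch only. The four cost values are those quoted above;
# all names are hypothetical and chosen for this example.

def expected_loss(risk, cost_if_disease, cost_if_no_disease):
    """Expected loss of one action, given probability `risk` of disease."""
    return risk * cost_if_disease + (1.0 - risk) * cost_if_no_disease


def screening_threshold(costs):
    """Risk at which screening and not screening have equal expected loss.

    Above this risk, screening has the lower expected loss (provided
    screening is cheaper than not screening for those with disease).
    """
    extra_cost_no_disease = (
        costs[("screen", "no disease")] - costs[("no screen", "no disease")]
    )
    saving_if_disease = (
        costs[("no screen", "disease")] - costs[("screen", "disease")]
    )
    return extra_cost_no_disease / (extra_cost_no_disease + saving_if_disease)


COLONOSCOPY_COSTS = {
    ("no screen", "disease"): 100.0,
    ("no screen", "no disease"): 0.0,
    ("screen", "no disease"): 1.0,
    # Assumed decomposition: cost of screening plus 0.1 times the unscreened cost.
    ("screen", "disease"): 1.0 + 0.1 * 100.0,
}

threshold = screening_threshold(COLONOSCOPY_COSTS)
print(f"Break-even risk for screening: {threshold:.4f}")
```

Under these costs the break-even risk is 1/90: the cost of 11 enters the decision only through the saving of 89 relative to the unscreened cost of 100, which is one way to read what the costs are "relative to".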
If one carries the cost logic from the screening example to the tamoxifen preventive intervention example, should not the largest cost be assigned to those women who are destined to get breast cancer (in the next 5 years) but who do not receive tamoxifen? Further, since the relative risk associated with tamoxifen chemoprevention is 0.5 (Fisher et al., 1998) (i.e. stronger than the 0.9 assumed in the colonoscopy example), it is not clear why the cost of not giving tamoxifen to a woman who stands to benefit should not be even greater (relative to other costs in the example) than the cost of 100 in the colon cancer screening example. Yet, in the tamoxifen preventive intervention example, Gail and Pfeiffer assign a cost of 1 to the group of women who will get breast cancer but who do not receive preventive intervention and, surprisingly, this is the same cost assigned to those who will get (would have gotten) breast cancer but who do receive the intervention. The only costs in this example are 0's and 1's; there is no wide range of costs as in the colonoscopy example.
My second main comment on this paper thus relates to the seeming inconsistency between the two examples and can be stated as follows: why is failing to screen someone destined to develop colon cancer much more costly, relative to other possible states, than failing to provide an effective chemopreventive agent to someone destined to develop the relevant cancer? Is there truly no relative benefit to giving the chemopreventive agent to the only type of person who stands to benefit?
Thirdly, it is unclear why, in the colonoscopy example, a correct decision (screening those who truly stand to benefit) is assigned a much higher cost than an incorrect decision (screening those who cannot possibly benefit). It is thus not clear what the costs in this example encompass, or what they are relative to. At both the societal and individual levels, it is better ethically, and often economically, to screen those who can truly benefit rather than people who will not; so, should the assigned cost structure not reflect this?
Finally, the key point of Gail and Pfeiffer's work is that the conception of costs will determine the optimal decision regarding intervention to prevent morbidity and/or mortality. It follows, thus, that the choice of different cost structures in this paper could have led to different conclusions about the importance of discriminatory accuracy in different settings. More specifically, it is possible that if the costs of tamoxifen-induced endometrial cancer and pulmonary embolism (in addition to stroke) had been considered in the prevention example, the key conclusion of the authors, that lower discriminatory accuracy is acceptable in this situation of preventive intervention, would have been altered.
Gail and Pfeiffer (2005) note in their conclusion that using general criteria of discriminatory accuracy to help guide decision making has its own problems of implicit preference for a specific loss structure. The Food and Drug Administration set the guidelines for discrimination and decision making about tamoxifen with no reference to any statistical criteria, general or specific (i.e. a woman is high risk and eligible for chemoprevention with tamoxifen if her five-year Gail risk is greater than 1.67%). It bears repeating, as it is a counterintuitive result to many scientists and physicians, that models which have excellent calibration, as does the Gail model, nevertheless can have very poor discriminatory accuracy as measured by relevant statistical criteria (Rockhill et al., 2001). I agree with Gail and Pfeiffer's argument that more information than just the predicted absolute risk from the model must come into play in decision making, and I commend them for pushing this line of analysis further. However, the conclusion one comes to about the importance of discriminatory accuracy is highly dependent on the assumed structure of relative costs. In our current health care system, with its emphasis on informed decision making by individual patients, it seems clear that for any given possible intervention, there is no single cost/loss structure to be uniformly imposed on all. This reality draws attention to the importance of patient education in basic probability, and it also leads us from the realm of statistical models and loss functions, which speak to averages, into the realm of psychology.
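The remark above that a well-calibrated model can still discriminate poorly can be illustrated with a small exact calculation. The sketch below assumes the concordance statistic (area under the ROC curve) as the measure of discriminatory accuracy and uses a made-up population of three equally sized risk groups; the risk values, weights, and function name are hypothetical and are not taken from the Gail model or from any cited study.

```python
# Illustrative sketch only: hypothetical risk groups, not data from any study.

def auc_for_calibrated_model(risks, weights):
    """Concordance (AUC) of a perfectly calibrated model.

    Group i has population share proportional to weights[i] and true event
    probability risks[i], so predicted and observed risk agree by construction.
    The AUC is the probability that a random case has a higher predicted risk
    than a random non-case, counting ties as one half.
    """
    total = sum(weights)
    shares = [w / total for w in weights]
    prevalence = sum(s * r for s, r in zip(shares, risks))
    # Distribution of predicted risk among cases and among non-cases.
    case_dist = [s * r / prevalence for s, r in zip(shares, risks)]
    control_dist = [s * (1.0 - r) / (1.0 - prevalence) for s, r in zip(shares, risks)]

    auc = 0.0
    for r_case, p_case in zip(risks, case_dist):
        for r_ctrl, p_ctrl in zip(risks, control_dist):
            if r_case > r_ctrl:
                auc += p_case * p_ctrl
            elif r_case == r_ctrl:
                auc += 0.5 * p_case * p_ctrl
    return auc


HYPOTHETICAL_RISKS = [0.01, 0.02, 0.03]    # assumed 5-year risks
HYPOTHETICAL_WEIGHTS = [1.0, 1.0, 1.0]     # assumed equal group sizes

auc = auc_for_calibrated_model(HYPOTHETICAL_RISKS, HYPOTHETICAL_WEIGHTS)
print(f"AUC of a perfectly calibrated model: {auc:.3f}")
```

Because the assumed risks span a narrow range, cases and non-cases have heavily overlapping predicted risks, and the concordance stays well below 1 even though calibration is perfect by construction.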
REFERENCES
GAIL, M. H. AND PFEIFFER, R. M. (2005). On criteria for evaluating models of absolute risk. Biostatistics 6, 227-239.
FISHER, B., COSTANTINO, J. P., WICKERHAM, D. L., REDMOND, C. K., KAVANAH, M., CRONIN, W. M., VOGEL, V., ROBIDOUX, A., DIMITROV, N., ATKINS, J., DALY, M., WIEAND, S., TAN-CHIU, E., FORD, L. AND WOLMARK, N. (1998). Tamoxifen for prevention of breast cancer: report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. Journal of the National Cancer Institute 90, 1371-1388.
ROCKHILL, B., SPIEGELMAN, D., BYRNE, C., HUNTER, D. J. AND COLDITZ, G. A. (2001). Validation of the Gail et al. model of breast cancer risk prediction and implications for chemoprevention. Journal of the National Cancer Institute 93, 358-366.
Received April 11, 2005; revised May 9, 2005; accepted for publication May 19, 2005.