Original Article
Polytomous logistic regression analysis could be applied more often in diagnostic research

https://doi.org/10.1016/j.jclinepi.2007.03.002

Abstract

Objective

Physicians commonly consider the presence of all differential diagnoses simultaneously. Polytomous logistic regression modeling allows for simultaneous estimation of the probability of multiple diagnoses. We discuss and (empirically) illustrate the value of this method for diagnostic research.

Study Design and Setting

We used data from a study on the diagnosis of residual retroperitoneal mass histology in patients presenting with nonseminomatous testicular germ cell tumor. The differential diagnoses include benign tissue, mature teratoma, and viable cancer. Probabilities of each diagnosis were estimated with a polytomous logistic regression model and compared with the probabilities estimated from two consecutive dichotomous logistic regression models.
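The two ways of obtaining three diagnostic probabilities can be sketched as follows. This is a minimal illustration, not the study's fitted models. All coefficients and predictor values are hypothetical. Treating benign tissue as the reference category is an assumption, as is the ordering of the two dichotomous models (tumor vs. benign first, then viable cancer vs. mature teratoma among tumors).

```python
import math


def linear_predictor(x, intercept, betas):
    """Intercept plus the weighted sum of predictor values."""
    return intercept + sum(b * xi for b, xi in zip(betas, x))


def polytomous_probs(x, coefs):
    """Probabilities from one polytomous (multinomial) logistic model.

    coefs maps each non-reference category to (intercept, betas).
    The reference category ("benign", an assumption) has linear predictor 0.
    """
    lp = {"benign": 0.0}
    for category, (intercept, betas) in coefs.items():
        lp[category] = linear_predictor(x, intercept, betas)
    denom = sum(math.exp(v) for v in lp.values())
    return {category: math.exp(v) / denom for category, v in lp.items()}


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def consecutive_probs(x, tumor_model, cancer_model):
    """Probabilities from two consecutive dichotomous logistic models.

    tumor_model: tumor (teratoma or cancer) vs. benign.
    cancer_model: viable cancer vs. mature teratoma, given tumor.
    """
    p_tumor = logistic(linear_predictor(x, *tumor_model))
    p_cancer_given_tumor = logistic(linear_predictor(x, *cancer_model))
    return {
        "benign": 1.0 - p_tumor,
        "teratoma": p_tumor * (1.0 - p_cancer_given_tumor),
        "cancer": p_tumor * p_cancer_given_tumor,
    }


# Hypothetical patient with two predictors and hypothetical coefficients.
x = [1.0, 0.0]
p_poly = polytomous_probs(
    x, {"teratoma": (0.2, [0.8, -0.5]), "cancer": (-1.0, [0.4, 0.3])}
)
p_seq = consecutive_probs(x, (0.5, [0.7, -0.2]), (-1.2, [-0.4, 0.8]))
print(p_poly)
print(p_seq)
```

Both approaches return three probabilities that sum to one, which is what makes a per-patient comparison between them possible.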

Results

We provide interpretations of the odds ratios derived from the polytomous regression model and present a simple score chart to facilitate calculation of predicted probabilities from the polytomous model. For both modeling methods, we show calibration plots and areas under the receiver operating characteristic (ROC) curve comparing each diagnostic outcome category with the other two. The ROC areas were similar for the polytomous and dichotomous regression models, respectively: 0.83 (95% confidence interval [CI] = 0.80–0.85) vs. 0.83 (95% CI = 0.80–0.85) for benign tissue, 0.78 (95% CI = 0.75–0.81) vs. 0.78 (95% CI = 0.75–0.81) for mature teratoma, and 0.66 (95% CI = 0.61–0.71) vs. 0.64 (95% CI = 0.59–0.69) for viable cancer.
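An ROC area that compares each outcome category with the other two can be computed as in the sketch below, which uses the Mann-Whitney formulation of the area. The four patients and their predicted probabilities are invented for illustration, the function names are hypothetical, and no confidence intervals are computed.

```python
def roc_area(scores, is_case):
    """ROC area as the Mann-Whitney statistic.

    This is the proportion of (case, non-case) pairs in which the case
    has the higher score, counting ties as one half.
    """
    cases = [s for s, c in zip(scores, is_case) if c]
    others = [s for s, c in zip(scores, is_case) if not c]
    wins = sum((a > b) + 0.5 * (a == b) for a in cases for b in others)
    return wins / (len(cases) * len(others))


def one_vs_rest_areas(probs, labels):
    """ROC area for each category against all other categories combined.

    probs: one dict per patient mapping category to predicted probability.
    labels: the observed category of each patient.
    """
    categories = sorted(set(labels))
    return {
        c: roc_area([p[c] for p in probs], [y == c for y in labels])
        for c in categories
    }


# Invented predictions for four hypothetical patients.
probs = [
    {"benign": 0.70, "teratoma": 0.20, "cancer": 0.10},
    {"benign": 0.20, "teratoma": 0.60, "cancer": 0.20},
    {"benign": 0.30, "teratoma": 0.30, "cancer": 0.40},
    {"benign": 0.25, "teratoma": 0.45, "cancer": 0.30},
]
labels = ["benign", "teratoma", "cancer", "benign"]
areas = one_vs_rest_areas(probs, labels)
print(areas)
```

The same function applies to the probabilities from either modeling method, so the two sets of ROC areas are directly comparable.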

Conclusion

Polytomous logistic regression is a useful technique to simultaneously model predicted probabilities of multiple diagnostic outcome categories. The performance of a polytomous prediction model can be assessed similarly to a dichotomous logistic regression model, and predictions by a polytomous model can be made with a user-friendly method. Because the simultaneous consideration of the presence of multiple (differential) conditions serves clinical practice better than consideration of the presence of only one target condition, polytomous logistic regression could be applied more often in diagnostic research.

Introduction

Diagnostic practice starts with a patient presenting with particular signs and symptoms. The physician then defines the differential diagnoses and implicitly estimates the probability of presence of all possible conditions given the patient's clinical and nonclinical profile [1], [2], [3]. Usually, one of these differential diagnoses is defined as the working diagnosis or target condition, to which the diagnostic workup is primarily directed. Diagnostic studies commonly focus on the ability of tests to include or exclude this target condition by dichotomizing the diagnostic outcome; the alternative diagnoses are included in the category “target condition absent.” Accordingly, diagnostic studies that aim to develop diagnostic prediction rules commonly use dichotomous logistic regression analysis. A well-known example is the Wells rule to diagnose deep venous thrombosis [4]. However, diagnostic prediction rules that estimate the probability of presence vs. absence of one target condition may oversimplify clinical practice. Rules that estimate the probabilities of presence of each of the potential conditions may be preferable.

As early as the early 1980s, Begg and coworkers discussed the use of polytomous logistic regression to model more than two unordered outcome categories simultaneously [5], [6]. The method has received little attention since then [7], and we believe it is time to revisit polytomous logistic regression analysis to address diagnostic questions.

We provide an introduction to the principles of polytomous logistic regression and show an application with empirical data from a study on diagnosis of residual retroperitoneal mass histology in patients with nonseminomatous testicular germ cell tumor (NSTGCT) [8]. We explain the interpretation of the derived odds ratios (ORs), study several aspects of the polytomous model performance, and present a user-friendly format for application of the polytomous regression model. Finally, the advantages and disadvantages of polytomous logistic regression are discussed.
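For orientation, the general textbook form of a polytomous logistic regression model with three unordered outcome categories can be written as below. The notation is ours rather than the article's, and the choice of benign tissue as the reference category is an assumption made for illustration.

```latex
% Log odds of category k relative to the reference category (benign),
% for predictors x = (x_1, \dots, x_M):
\ln \frac{P(Y = k \mid x)}{P(Y = \text{benign} \mid x)}
  = \alpha_k + \sum_{m=1}^{M} \beta_{k,m} x_m ,
  \qquad k \in \{\text{teratoma}, \text{cancer}\}

% Predicted probabilities, which sum to one over the three categories:
P(Y = k \mid x)
  = \frac{\exp\bigl(\alpha_k + \sum_{m} \beta_{k,m} x_m\bigr)}
         {1 + \sum_{j \in \{\text{teratoma}, \text{cancer}\}}
              \exp\bigl(\alpha_j + \sum_{m} \beta_{j,m} x_m\bigr)} ,
\qquad
P(Y = \text{benign} \mid x)
  = \frac{1}{1 + \sum_{j} \exp\bigl(\alpha_j + \sum_{m} \beta_{j,m} x_m\bigr)}

% Odds ratio for predictor m, category k vs. the reference category:
\mathrm{OR}_{k,m} = \exp(\beta_{k,m})
```

Under this form, each OR describes how a one-unit increase in a predictor changes the odds of one diagnosis relative to the reference diagnosis, not relative to all other diagnoses combined.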

Patients

We used data from previous studies on residual retroperitoneal mass histology in patients (n = 1,094) treated with chemotherapy for metastatic NSTGCT [8], [9], [10], [11]. These studies were primarily performed to develop and validate a dichotomous diagnostic prediction model to discriminate benign tissue from other histologies. Patients with elevated levels of the serum tumor markers alpha-fetoprotein (AFP) and human chorionic gonadotropin (HCG) at the time of surgery, extragonadal primaries,

Results

The final diagnosis was benign tissue in 425 (39%) patients, mature teratoma in 535 (49%), and viable cancer in 134 (12%) (Table 1). Overall, 46% of the patients had teratoma-negative tumor histology. Tumor marker levels of AFP and HCG were normal in approximately one third of all patients (31% and 35%, respectively). Absence of mature teratoma in the primary tumor was more frequent among patients with benign masses (Table 1).

Discussion

In this article, we examined polytomous logistic regression in diagnostic studies with multiple diagnoses. We explained the interpretation of the ORs derived from the polytomous regression model and showed several model performance measures and a user-friendly format (score chart) to facilitate the use of a polytomous regression model in practice.

Acknowledgments

For this research project, we received financial support from the Netherlands Organization for Scientific Research (grant numbers ZONMW 904-66-112 and 917-46-360).

References (33)

  • A. Wijesinha et al. Methodology for the differential diagnosis of a complex data set. A case study using data from routine CT scan examinations. Med Decis Making (1983)
  • E.W. Steyerberg et al. Prediction of residual retroperitoneal mass histology after chemotherapy for metastatic nonseminomatous germ cell tumor: multivariate analysis of individual patient data from six study groups. J Clin Oncol (1995)
  • E.W. Steyerberg et al. Validity of predictions of residual retroperitoneal mass histology in nonseminomatous testicular cancer. J Clin Oncol (1998)
  • Y. Vergouwe et al. External validity of a prediction rule for residual mass histology in testicular cancer: an evaluation for good prognosis patients. Br J Cancer (2003)
  • E.W. Steyerberg et al. Residual mass histology in testicular cancer: development and validation of a clinical prediction rule. Stat Med (2001)
  • A. Agresti. An introduction to categorical data analysis (1996)