Elsevier

Annals of Emergency Medicine

Volume 44, Issue 3, September 2004, Pages 235-241
Disaster and terrorism
Syndromic surveillance: the effects of syndrome grouping on model accuracy and outbreak detection☆☆

https://doi.org/10.1016/j.annemergmed.2004.03.030

Abstract

Study objective

Data used by syndromic surveillance systems must be grouped into syndromes or prodromes. Previous studies have examined the accuracy of different methods of syndromic grouping. We seek to study the effects of different syndrome grouping methods on model accuracy, a key factor in the outbreak-detection performance of syndromic surveillance systems.

Methods

Daily emergency department visit rates were analyzed from 2 urban academic tertiary care hospitals for 1,680 consecutive days. During this period, each hospital census totaled approximately 230,000 patient visits. Three methods were used to group the visits into a respiratory-related syndrome category: 1 relying on chief complaint, 1 on diagnostic codes, and 1 on a combination of the two. The different groupings of the syndromic data resulting from these methods were used to build different historical models that were then tested for forecasting accuracy and for sensitivity to detecting simulated outbreaks.

Results

For both hospitals, the data grouped according to chief complaints alone yielded the lowest model accuracy and the lowest detection sensitivity. Using diagnostic codes to group the data yielded better results in accuracy and sensitivity. Combining the 2 grouping methods yielded the best results in accuracy and sensitivity. Temporal smoothing of the data was shown to improve sensitivity in all cases, although to various degrees in the different models.

Conclusion

The methods used to group input data into syndromic categories can have substantial effects on the overall performance of syndromic surveillance systems. The results suggest that incorporating diagnostic data into these systems can improve both their modeling accuracy and their detection sensitivity. Furthermore, the best results may be achieved by using a combination of methods to group visits into syndromic categories.

Introduction

Syndromic surveillance systems are being developed and deployed at the local, state, and national levels in response to the threat of bioterrorism.[1] Syndromic surveillance[2] is an approach to public health monitoring that relies on detecting clinical case features that are discernable before confirmed diagnoses are made.[3] In particular, before the laboratory confirmation of an infectious disease, infected individuals may exhibit behavioral patterns, symptoms, signs, or laboratory findings that can be detected through a variety of data sources.

There are 4 basic methodologic stages in processing emergency department (ED) syndromic data for outbreak detection. First, in the syndromic grouping stage, data are gathered from the sources that feed into the system and are organized according to a coding scheme that allows each patient to be assigned to a particular syndromic group. Next is the modeling stage, in which historical data, usually reaching back from 1 to several years, are analyzed to establish a model of the normal temporal pattern. Third, in the detection stage, the expected values from the model (eg, daily frequencies of patients presenting in each syndromic group) are compared against observed values collected in the field to determine whether abnormal activity is occurring. Finally, in the alarms stage, thresholds are set for evaluation of whether the unusual patterns warrant notification. Degradation of the information quality anywhere along this 4-stage process might dramatically hamper outbreak detection performance.
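The modeling, detection, and alarm stages described above can be illustrated with a minimal sketch. It assumes a simple day-of-week mean as the historical model and an alarm threshold expressed in SDs of the historical residuals; neither is stated in this excerpt to be the model or threshold used in the study, and the function names and the `threshold_sds` parameter are hypothetical.

```python
from statistics import mean, pstdev


def fit_day_of_week_model(history):
    """Modeling stage (assumed model): expected count for each weekday.

    history is a list of daily visit counts; index 0 is treated as weekday 0.
    Returns the 7 expected values and the SD of the historical residuals.
    """
    if len(history) < 7:
        raise ValueError("need at least one full week of history")
    by_day = [[] for _ in range(7)]
    for i, count in enumerate(history):
        by_day[i % 7].append(count)
    expected = [mean(counts) for counts in by_day]
    residuals = [count - expected[i % 7] for i, count in enumerate(history)]
    return expected, pstdev(residuals)


def detect(observed, start_index, expected, residual_sd, threshold_sds=3.0):
    """Detection and alarm stages: flag days whose excess over the
    expected value is larger than threshold_sds historical SDs."""
    alarms = []
    for offset, count in enumerate(observed):
        weekday = (start_index + offset) % 7
        excess = count - expected[weekday]
        alarms.append(excess > threshold_sds * residual_sd)
    return alarms


# Two weeks of made-up historical counts for one syndromic group.
history = [10, 12, 11, 13, 12, 9, 8, 12, 10, 13, 11, 10, 11, 10]
expected, residual_sd = fit_day_of_week_model(history)

# Three newly observed days that follow directly after the history.
alarms = detect([11, 12, 20], len(history), expected, residual_sd)
```

In this sketch a noisier grouping inflates `residual_sd`, which raises the effective alarm threshold; that is one way degraded input at the grouping stage could reduce sensitivity downstream.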

Previous studies have shown that chief complaints and International Classification of Diseases, Ninth Revision (ICD-9)[4] diagnostic codes can be used to classify ED encounters into syndromes[5-9] and are valuable at the syndromic grouping stage. Chief complaints are usually recorded during triage, and all US EDs use the ICD-9 coding for billing. Real-time data feeds from these encounters have been successfully established in a number of cities.[1, 10-12]

The goal of this study was to measure the impact of 3 syndromic grouping methods on the modeling and detection stages. We compare forecasting accuracy at the modeling stage and detection sensitivity at the detection stage of 3 groupings according to chief complaints, ICD-9 codes, and an inclusive combination of chief complaints and ICD-9 codes.
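As a rough illustration of the 3 groupings, the sketch below assigns a visit to the respiratory group by chief complaint (CC), by ICD-9 code (DX), or by an inclusive combination of the two (BOTH). The term list, the ICD-9 prefixes, and the `Visit` record are assumptions made for the example; the study's actual classification rules are not given in this excerpt.

```python
from dataclasses import dataclass

# Illustrative assumptions only, not the study's term or code lists.
RESPIRATORY_TERMS = ("cough", "wheez", "shortness of breath")
RESPIRATORY_ICD9_PREFIXES = ("466", "486", "487")


@dataclass
class Visit:
    chief_complaint: str   # free text recorded at triage
    icd9_codes: tuple      # diagnostic codes assigned to the visit


def matches_cc(visit):
    """True if the chief complaint contains any respiratory term."""
    text = visit.chief_complaint.lower()
    return any(term in text for term in RESPIRATORY_TERMS)


def matches_dx(visit):
    """True if any ICD-9 code starts with a respiratory prefix."""
    return any(code.startswith(RESPIRATORY_ICD9_PREFIXES)
               for code in visit.icd9_codes)


def is_respiratory(visit, method):
    """Apply one of the three grouping methods: CC, DX, or BOTH."""
    if method == "CC":
        return matches_cc(visit)
    if method == "DX":
        return matches_dx(visit)
    if method == "BOTH":
        # Inclusive combination: either source is enough.
        return matches_cc(visit) or matches_dx(visit)
    raise ValueError(f"unknown grouping method: {method}")


def daily_count(visits, method):
    """Number of one day's visits that fall in the respiratory group."""
    return sum(is_respiratory(v, method) for v in visits)


# Made-up visits for a single day.
visits = [
    Visit("Cough and fever", ("780.6",)),
    Visit("Fever", ("486",)),
    Visit("Ankle injury", ("845.00",)),
]
counts = {m: daily_count(visits, m) for m in ("CC", "DX", "BOTH")}
```

Because BOTH is an inclusive OR, its daily count can never be smaller than the count from either single method.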

Section snippets

Setting

Data were extracted from the information systems of 2 major urban, academic, tertiary care hospitals sharing the same catchment area, each having an annual census of approximately 50,000 ED visits. Hospital 1 is a general hospital, and hospital 2 is a pediatric hospital. Eligible participants were all patients treated in the EDs of hospitals 1 or 2 between June 1, 1998, and January 5, 2003. This period included 1,680 consecutive days at each hospital. During this period, each hospital census

Results

Table 1 shows the average total number of daily visits and the SD for each of the 6 time series throughout the entire data sets. For both hospitals, fewer visits were considered respiratory related when only the chief complaints were examined; more visits were included if only the diagnostic codes were examined. As would be expected, the most visits were included when the chief complaint and the diagnostic code were considered together.

To examine these daily visit numbers in greater detail, a

Limitations

Because there is a paucity of data available on actual biologic warfare attacks, we relied, as have others,[17] on simulated attacks for model validation. The characteristics of a true biologic attack, as reflected in ED syndromic surveillance data, would vary, depending on a wide range of characteristics. The strength of the approach we have taken lies in the use of semisynthetic data for simulation. Although the outbreak signal is simulated, the background noise is generated with authentic ED

Discussion

This study examines the relationship between methods of syndromic grouping and outbreak detection performance. We demonstrate that dramatically better surveillance can be achieved when the proper syndromic grouping methods are applied to certain types of data. The ordering of the model accuracies (BOTH>DX>CC) is directly consistent with the ordering of detection sensitivities, which suggests that the noise effects of input data coding propagate through a syndromic surveillance system, having

Acknowledgements

We thank Karen Olson, PhD, Shlomit Feit, and Allison Beitel for their methodologic contributions, which enabled generation of the data sets used in this study.

References (18)

  • K.D. Mandl et al. Implementing syndromic surveillance: a practical guide informed by the early experience. J Am Med Inform Assoc (2004)
  • M.D. Lewis et al. Disease outbreak detection system using syndromic data in the greater Washington DC area. Am J Prev Med (2002)
  • F.-C. Tsui et al. Technical description of RODS: a real-time public health surveillance system. J Am Med Inform Assoc (2003)
  • J.W. Buehler et al. Framework for evaluating public health surveillance systems for early detection of outbreaks: recommendations from the CDC Working Group. MMWR Recomm Rep (2004)
  • Centers for Disease Control and Prevention. Updated guidelines for evaluating public health surveillance systems: recommendations from the guidelines working group. MMWR (2001)
  • American Medical Association. ICD-9-CM 2002: International Classification of Diseases, 9th Revision (2002)
  • A.J. Beitel et al. Use of emergency department chief complaint and diagnostic codes for identifying respiratory illness in a pediatric population. Pediatr Emerg Care (2004)
  • J.U. Espino et al. Accuracy of ICD-9–coded chief complaints and diagnoses for the detection of acute respiratory illness. Proc AMIA Symp (2001)
  • O. Ivanov et al. Accuracy of three classifiers of acute gastrointestinal syndrome for syndromic surveillance. Proc AMIA Symp (2002)
Author contributions: BYR and KDM conceived the study and designed the experiments. KDM undertook recruitment of participating hospitals, developed the data sets, and obtained institutional review board approval. BYR developed the simulations and performed the analyses and integration of the results. BYR and KDM formulated the findings, and both had equal roles in drafting and revising the manuscript. BYR takes responsibility for the paper as a whole.

☆☆ Supported by the National Institutes of Health through a grant from the National Library of Medicine (R01LM07677-01), by contract 290-00-0020 from the Agency for Healthcare Research and Quality, and by the Alfred P. Sloan Foundation (grant 2002-12-1).

Reprints not available from the authors.