
Review of syndromic surveillance: implications for waterborne disease detection
  1. Magdalena Berger1,
  2. Rita Shiau2,
  3. June M Weintraub1
  1. 1San Francisco Department of Public Health, Environmental Health Section, USA
  2. 2San Francisco Department of Public Health, Communicable Disease Control and Prevention Section
  1. Correspondence to:
 Ms M Berger
 San Francisco Department of Public Health, Environmental Health Section, 1390 Market Street, Suite 910, San Francisco, CA 94102, USA; mberger{at}


Syndromic surveillance is the gathering of data for public health purposes before laboratory or clinically confirmed information is available. Interest in syndromic surveillance has increased because of concerns about bioterrorism. In addition to bioterrorism detection, syndromic surveillance may be suited to detecting waterborne disease outbreaks. Theoretical benefits of syndromic surveillance include potential timeliness, increased response capacity, ability to establish baseline disease burdens, and ability to delineate the geographical reach of an outbreak. This review summarises the evidence gathered from retrospective, prospective, and simulation studies to assess the efficacy of syndromic surveillance for waterborne disease detection. There is little evidence that syndromic surveillance mitigates the effects of disease outbreaks through earlier detection and response. Syndromic surveillance should not be implemented at the expense of traditional disease surveillance, and should not be relied upon as a principal outbreak detection tool. The utility of syndromic surveillance is dependent on alarm thresholds that can be evaluated in practice. Syndromic data sources such as over the counter drug sales for detection of waterborne outbreaks should be further evaluated.


Syndromic surveillance is a tool for outbreak detection that has been used by public health departments since the mid-1990s.1 The CDC has defined syndromic surveillance as “…surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.”2 In the USA, interest in syndromic surveillance increased after September 11, 2001 because of concern about the possibility of bioterrorist attacks.

Theoretically, syndromic surveillance systems have the potential to supplement traditional infectious disease surveillance systems by providing information about the extent of an outbreak or seasonal increases in disease incidence. They may also provide reassurance that an outbreak is not happening. As syndromic data are gathered before diagnostic or laboratory information is available, health departments may be able to recognise and respond to increases in disease incidence before formal diagnoses are made, and to respond to outbreaks early in their course. In this way, syndromic surveillance has the potential to effectively mitigate the extent of morbidity, mortality, and social and financial unrest resulting from natural or manmade outbreaks.

The effectiveness of syndromic surveillance in terms of informing timely and successful public health interventions has not been demonstrated in public health practice. Although the impetus for its development has been its potential use as a bioterrorism preparedness tool, it may be well suited to monitoring naturally occurring infectious and chronic disease. The utility of syndromic surveillance for monitoring waterborne or environmentally mediated disease incidence has not been assessed to date. However, any decision about the implementation of a syndromic surveillance system cannot focus on only one aspect of disease surveillance and response. This decision has to be made within the context of the existing public health system and its ability to respond to public health threats of all types. Additionally, the design of an effective syndromic surveillance system must take into account the data needs and response capabilities of many different public health responders.

This review summarises the evidence gathered from retrospective, prospective, and simulation studies to assess the efficacy of syndromic surveillance for waterborne disease detection. The original aim of this review was to determine the utility of syndromic surveillance systems for waterborne disease detection in the San Francisco Bay Area. All articles were selected from peer reviewed publications found on Medline under the search term “syndromic surveillance”.


Three levels of data may be used for disease surveillance: pre-clinical data, clinical pre-diagnostic data, and diagnostic data.3 Syndromic surveillance usually uses two types of data sources: pre-clinical and clinical pre-diagnostic data. Traditional surveillance generally focuses on diagnostic data.

Pre-clinical data sources

Public behaviour may be an early indicator of an increase in disease incidence in the population. Information about school and work absenteeism,1,4 calls to nurse help lines,5 poison control centres,6 water utility complaint lines,7 and sales information of over the counter and prescription drugs1,4,7,8,9,10,11,12,13,14 may give information about the level of disease in the population. Other environmental data, such as drinking water turbidity levels, may also be useful indicators of disease incidence. While pre-clinical data sources tend to be more timely than clinical data sources, they are much less specific and therefore provide a less solid basis for public health decision making.

Clinical pre-diagnostic data sources

Most clinical data sources currently used by syndromic surveillance systems consist of electronic data gathered for an independent purpose and adapted to the needs of the specific surveillance system. Sources of clinical data used for syndromic surveillance include health care utilisation records such as registers of emergency department chief complaints,1,4,15–24 discharge data,20 ambulance dispatch data,25 and ICD-9 or other procedure codes used in inpatient and outpatient settings.4,20,26 Electronically captured orders for diagnostic tests may also be used as a data source.

Different groupings of symptoms, ICD-9 codes, or drugs, as well as the most effective free text search algorithms are being researched to determine how these data are best analysed to accurately reflect disease activity in the population being studied.17,21,27,28


Usually, data sources that are chronologically distant from diagnostic information are timely but not as specific as data sources that are chronologically closer to diagnosis.3 Potential syndromic surveillance data sources along with each source’s capacity for being timely and specific are described in figure 1. Comparing two data sources from this figure illustrates the trade off between data timeliness (the amount of lag time from an event to data about the event being available for interpretation by a public health department) and specificity/accuracy (the degree to which information with relevant patient characteristics is available). Over the counter (OTC) drug data have the benefit of timeliness; for example, people who are beginning to have symptoms of an intestinal illness may purchase OTC drugs a day or two before presenting at the emergency department (ED) or other health care facility. However, OTC sales data are not very specific; while a rise in sales of OTC antidiarrhoeal remedies may be the result of an increase in incidence of diarrhoeal illness, it may be unrelated to disease incidence. An increase in OTC sales may be explained by store specials, hoarding behaviours, or duplicate data transfers. It would be difficult to make a public health decision based solely on increased OTC sales because these data do not contain any patient specific information. Involving pharmacists or pharmacy managers in initial models and categorisation of OTC data may increase the correlation of OTC sales data and possible human disease. Timeliness of data acquisition is only valuable if the signal is accurate and specific enough to inform public health decision making.

Figure 1

 Surveillance data sources and health seeking behaviour.

The timeliness and specificity of OTC data may be compared with ED chief complaint data. An increase in chief complaints of diarrhoea is an unambiguous signal that the incidence of intestinal illness has increased. It is less timely than the OTC data but yields much more detailed information. Nonetheless, this information does not distinguish between a natural and expected increase, a natural outbreak, a result of a bioterrorism event, or a coincidental, simultaneous increase in intestinal illnesses of varying aetiologies. It would still be difficult to make a public health decision based on these data alone without further investigation.

An increase in sales of OTC antidiarrhoeal drugs occurring concurrently with spikes in other data such as increases in worker absenteeism, and increases in ED chief complaints of diarrhoea would be a more reliable indicator of a true increase in disease incidence than any of these signals on their own. Developing a mechanism for evaluating data from disparate sources and of varying relevance to public health response is one of the challenges in finding practical applications for syndromic surveillance information.3
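As a minimal sketch of how such corroboration across data sources could be expressed (this is not a method described in the studies reviewed), the following Python function reports a day only when at least a chosen number of independent sources signal on that day. The source names, the input format, and the `min_sources` parameter are illustrative assumptions.

```python
from collections import defaultdict


def corroborated_days(signals, min_sources=2):
    """Return sorted days on which at least `min_sources` distinct sources signalled.

    `signals` maps a source name (hypothetical, e.g. "otc_sales") to an
    iterable of day indices on which that source exceeded its own threshold.
    """
    sources_by_day = defaultdict(set)
    for source, days in signals.items():
        for day in days:
            sources_by_day[day].add(source)
    return sorted(
        day for day, sources in sources_by_day.items()
        if len(sources) >= min_sources
    )


# Made-up signal days for three hypothetical data sources.
example = {
    "otc_sales": [3, 4, 9],
    "absenteeism": [4, 5],
    "ed_chief_complaints": [4, 9],
}
print(corroborated_days(example, min_sources=2))  # [4, 9]
print(corroborated_days(example, min_sources=3))  # [4]
```

Raising `min_sources` reduces the number of days that reach investigators, at the cost of ignoring signals that appear in only one source.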

The accuracy of aberration detection signals generated by a syndromic surveillance system is also greatly affected by the amount of baseline data that are available. Studies reporting the use of short term, or “drop-in” syndromic surveillance systems set up specifically for outbreak detection during a high profile event consistently cite the lack of baseline data as a factor that hampered the determination of an appropriate signal threshold for these systems, often leading to systems that were overly sensitive and did not account for seasonal or other temporal factors.15,29,30 Before a syndromic surveillance system is instituted for use as a regular surveillance tool, individual jurisdictions should ensure that at least one year of historical data are available for use as a reference for signal threshold determination and signal investigation.
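One way to see why baseline length matters is a simple threshold rule. The sketch below is a generic mean plus k standard deviations check written in Python, not the algorithm of any system reviewed here. The function name, the 28 day window, and the multiplier `k` are illustrative assumptions; a system following the recommendation above would draw on at least one year of history so that seasonal patterns are reflected in the threshold.

```python
from statistics import mean, stdev


def flag_aberrations(counts, baseline_days=28, k=3.0):
    """Flag days whose count exceeds the baseline mean + k * standard deviation.

    The baseline for day i is the `baseline_days` counts immediately before it.
    Days without a full baseline are skipped, which is the position a
    "drop-in" system with no historical data is in.
    """
    flagged = []
    for i in range(baseline_days, len(counts)):
        window = counts[i - baseline_days:i]
        threshold = mean(window) + k * stdev(window)
        if counts[i] > threshold:
            flagged.append(i)
    return flagged


# Made-up daily counts: four quiet weeks, then a spike on day 29.
daily_counts = [10, 12, 11, 9, 10, 13, 11] * 4 + [12, 30, 11]
print(flag_aberrations(daily_counts))  # [29]
```

With fewer than `baseline_days` of history the function cannot evaluate any day at all, and a short window like this one cannot distinguish an outbreak from an ordinary seasonal rise.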


Retrospective evaluations

Retrospective analyses show that syndromic surveillance systems may provide information that would allow public health departments to predict outbreaks earlier than by using traditional surveillance. For example, in the Milwaukee, Wisconsin cryptosporidiosis outbreak of 1993,31 newspaper reports of over the counter antidiarrhoeal drugs being sold out at pharmacies were one of the first clues that an unusual event had occurred.7 Calls to nurse hotlines showed a fourfold increase in the standard deviation of diarrhoea rates one day before observation of unusual activity by local pharmacists and five days before the local public health department was notified of the possible outbreak.5 Similarly, a retrospective review of pharmacy records showed that sales of over the counter drugs increased fivefold during an outbreak of cryptosporidiosis in North Battleford, Saskatchewan.7,32 During a 1983 outbreak of waterborne campylobacteriosis in Florida, pharmacy sales of antidiarrhoeals increased eightfold when compared with the same period in the previous year.16 Retrospective analysis of Canadian outbreaks of Cryptosporidium, E coli O157:H7, and Campylobacter also all showed increases in OTC sales that corresponded to increases in disease incidence.33 In New York City, retrospective analyses of ambulance dispatch data and ED data showed that they were effective in predicting city wide respiratory, gastrointestinal, and influenza outbreaks.25,34


Simulated outbreaks can examine the ability of syndromic surveillance systems to accurately predict outbreaks while minimising false alarms.34,35 According to most simulations, detection algorithms used in syndromic surveillance can best identify large, geographically widespread increases in disease. Syndromic surveillance detection algorithms are less successful at identifying small counts or small increases in disease, showing that syndromic surveillance may augment traditional disease surveillance but may not improve timeliness or sensitivity. Simulations of syndromic surveillance suggest that it is best suited for detecting diseases that have a narrow incubation period distribution, a steep epidemic curve, a long prodromal phase, are not included on routine diagnostic tests, and do not have a specific disease identifying clinical feature.36

To date no simulations of waterborne disease outbreaks have been conducted. Stoto et al tested a hypothetical syndromic surveillance system by simulating a “fast” outbreak (a disease that would have a steep epidemic curve) and a “slow” outbreak (a disease with a more gradual epidemic curve) of influenza-like illness (ILI) using three years of data from an ED.37 Results of this exercise show that an episode of either “fast” or “slow” ILI during the annual influenza season was undetectable by syndromic surveillance when set to a 1% false positive rate. During the non-flu (RSI) season syndromic surveillance algorithms were successful at detecting a “fast” outbreak by day three. However, the “slow” outbreaks were much harder to detect. Only one algorithm had a 50% probability of detecting this outbreak by day nine. By day nine, such outbreaks would usually be detected through traditional surveillance systems. As Stoto et al point out, “The sudden appearance of large number of ILI cases…five times the daily average…especially in the summer- is clearly exceptional and ER physicians do not require any sophisticated statistical algorithm to tell them so”.37

Prospective reports

Several prospective evaluations of long term surveillance systems18,23,34,38–43 have been published. Syndromic surveillance has been in use in New York City since 1995 to detect outbreaks of waterborne diarrhoeal disease. The waterborne disease surveillance system uses data from OTC antidiarrhoeal sales, ED chief complaint logs, reports from sentinel nursing homes, and reports of positive stool samples from clinical laboratories. Prospective studies evaluating this system between 2001 and 200434,40 reviewed citywide signals and smaller spatial clusters to find out if signals represented real outbreaks, and whether outbreaks detected by traditional surveillance were also detected by the syndromic system. These studies found that 75% of the citywide outbreak signals correlated with true citywide viral gastroenteritis outbreaks. However, there was no correlation between smaller spatial clusters detected by the syndromic surveillance system and gastrointestinal (GI) outbreaks detected by traditional surveillance. Additionally, there were 36 GI outbreaks detected by traditional surveillance, none of which were detected by the syndromic surveillance system. Despite the extensive use of staff resources, the availability of fully electronic and timely data, and the ability to quickly initiate responses to various alerts, the syndromic system did not appreciably improve the timeliness or sensitivity of overall disease surveillance.40 A system in Connecticut that tracks hospital admissions shows similar discrepancies between GI outbreaks detected by traditional and syndromic surveillance.42 This system generated 35 GI illness alarms in the course of one year, only one of which turned out to be a true GI outbreak. Additionally, none of 15 GI illness clusters detected by laboratory surveillance were detected by the syndromic surveillance system.
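The Connecticut figures quoted above can be expressed as two simple proportions. The following Python sketch is a worked calculation rather than an analysis taken from the cited study. It assumes that each alarm counts as one positive result and that the 15 clusters detected by laboratory surveillance serve as the reference set; the variable names are illustrative.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


# Figures as reported in the text for the Connecticut system.
gi_alarms = 35
alarms_that_were_true_outbreaks = 1
lab_detected_clusters = 15
lab_clusters_also_flagged = 0

# Share of alarms that corresponded to a true GI outbreak.
ppv = proportion(alarms_that_were_true_outbreaks, gi_alarms)
# Share of laboratory-detected clusters that the syndromic system also flagged.
sensitivity_vs_lab = proportion(lab_clusters_also_flagged, lab_detected_clusters)

print(f"Positive predictive value of an alarm: {ppv:.1%}")
print(f"Share of laboratory-detected clusters flagged: {sensitivity_vs_lab:.0%}")
```

On these numbers roughly 1 alarm in 35 (about 2.9%) pointed to a true outbreak, and none of the laboratory-detected clusters was flagged.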

The University of Maryland implemented a syndromic surveillance system in its university hospital and evaluated its system’s ability to detect actual clusters of patients with GI or respiratory symptoms presenting to the hospital from 2001 to 2002.39 This system incorporated admission, discharge and transfer data as well as laboratory information system data. The evaluation showed that a peak in ED visits and an increase in stool test orders corresponded to a cluster of patients who later tested positive for infections with Shigella spp. The authors found the system to be timely and sensitive for their needs. A similar system is operated by the Department of Defense (DoD).23 The DoD system detected three outbreaks of diarrhoeal diseases in three different locations, one of which was laboratory confirmed to be caused by rotavirus. A system based on ED chief complaint data administered by the Westchester County Department of Health was less specific. Over the course of 277 days, 59 signals were detected, none of which corresponded to an outbreak of a communicable disease.43 A multi-jurisdictional system implemented for the 2002 Olympic games in Utah tracked encounters at urgent care centres and EDs using the real-time outbreak and disease surveillance system.18 During the two month Olympic event the system’s detection algorithms were exceeded twice; neither of these alarms corresponded to a real outbreak. The system was successfully maintained after the end of the Olympic event.

A study by Henry et al quantified the sensitivity, specificity, and positive predictive value (PPV) of the Kaiser Mid-Atlantic Region nurse hotline syndromic system by comparing the level of concordance between syndrome assignments based on the nurse hotline algorithm and the diagnosis made based on a subsequent Kaiser office visit. The authors found that the nurse hotline system achieved the highest sensitivity and PPV for respiratory and GI illness syndromes.38
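For readers unfamiliar with these measures, the sketch below shows how sensitivity, specificity, and PPV can be computed from paired record-level labels, treating the later office visit diagnosis as the reference. It is an illustration in Python and not the procedure used by Henry et al; the function name, the input format, and the example counts are all made up for the purpose.

```python
def concordance_metrics(pairs):
    """Compute sensitivity, specificity, and PPV from paired boolean labels.

    `pairs` is an iterable of (hotline_positive, diagnosis_positive) tuples,
    where the diagnosis is treated as the reference standard.
    """
    tp = fp = fn = tn = 0
    for hotline, diagnosis in pairs:
        if hotline and diagnosis:
            tp += 1
        elif hotline and not diagnosis:
            fp += 1
        elif not hotline and diagnosis:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # diagnosed cases the hotline caught
        "specificity": ratio(tn, tn + fp),  # non-cases the hotline left unflagged
        "ppv": ratio(tp, tp + fp),          # hotline assignments later confirmed
    }


# Hypothetical records: 8 true positives, 2 false positives,
# 2 false negatives, 88 true negatives.
records = (
    [(True, True)] * 8 + [(True, False)] * 2
    + [(False, True)] * 2 + [(False, False)] * 88
)
print(concordance_metrics(records))
```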

Staff and resources used

Performance evaluations often do not provide information about time and resources that are used in maintaining the system and investigating the alarms. In New York City, staff maintain the system, spending several hours a day, seven days per week to download and analyse data.1 Investigation of false signals is a significant burden on staff resources.37 An often mentioned solution is to investigate only those alarms that are maintained for two or more days and in two or more geographical areas. However, if the main goal of syndromic surveillance is to increase timeliness of outbreak detection and response then waiting two or more days to start an investigation would negate the timeliness benefit of the system.44 Similarly, if, because of the low specificity of syndromic signals, local health departments require that syndromic signals are substantiated by specimen collection and laboratory confirmation before public health action may be taken, any potential advantages of timeliness are lost.45 Decision makers must take into account the ability of the local public health system to respond to signals when considering the implementation of any syndromic surveillance system.
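The timeliness cost of a persistence rule can be made concrete with a small sketch. The Python function below encodes one possible reading of "two or more days and two or more geographical areas": an area qualifies on a given day if it has alarmed on each of the last `min_days` consecutive days, and an investigation starts once at least `min_areas` areas qualify. This interpretation, the function name, and the example data are assumptions for illustration only.

```python
def investigation_days(alarm_days_by_area, min_days=2, min_areas=2):
    """Return the days on which the persistence rule would trigger an investigation.

    `alarm_days_by_area` maps an area name to the set of day indices on
    which that area produced an alarm.
    """
    all_days = set().union(*alarm_days_by_area.values())
    triggered = []
    for day in sorted(all_days):
        # Count areas with an unbroken run of alarms ending on this day.
        qualifying_areas = sum(
            all((day - offset) in days for offset in range(min_days))
            for days in alarm_days_by_area.values()
        )
        if qualifying_areas >= min_areas:
            triggered.append(day)
    return triggered


# Made-up alarms: the first alarm occurs on day 5 in one area.
alarms = {"north": {5, 6, 7}, "south": {6, 7}, "east": {7}}
print(investigation_days(alarms))  # [7]
```

In this example the first alarm appears on day 5 but the rule does not trigger until day 7, a two day delay of the kind described above.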


Criteria derived from simulations can be used to assess which waterborne diseases may be appropriate for identification by syndromic surveillance.36

(1) The disease should have a narrow incubation period distribution.

Incubation periods of waterborne disease agents vary widely depending on dose, host susceptibility, and other factors.

(2) The disease should have a steep epidemic curve and a long prodromal phase.

Most waterborne diseases do not have a prodromal phase and an outbreak will not necessarily have a steep epidemic curve.

(3) The disease should not have a specific disease identifying clinical feature.

Initial symptoms of most potentially waterborne diseases are non-specific and generally include diarrhoea and other GI distress. Most of these illnesses do not have a disease identifying clinical or historical feature that allows clinicians to pinpoint the cause before performing laboratory tests.46

(4) The disease should not be included in routine diagnostic tests.

Laboratory tests are not routinely ordered for many waterborne diseases,46 and acute GI illnesses are generally under diagnosed and underreported.47–49

Table 1 lists characteristics of potentially waterborne pathogens. Because many waterborne diseases lack clinically identifying features and are not part of routine testing, they may be good candidates for detection by syndromic surveillance, based on the four criteria listed above. However, many of the diseases that fit these criteria, while good candidates for syndromic surveillance, are commonly occurring and not necessarily of high public health importance.

Table 1

 Characteristics of potentially waterborne disease agents

What this paper adds

To date, little attention has been paid to the possibility of using syndromic surveillance for monitoring waterborne disease incidence. This review of the benefits and shortcomings of syndromic surveillance may prove useful to public health practitioners and planners who are considering approaches to improve traditional surveillance systems.

Waterborne disease surveillance in the San Francisco Bay Area

In the San Francisco Bay Area four counties receive water from a common water utility. With the exception of a multi-county cryptosporidiosis surveillance project, surveillance for potentially waterborne disease in the San Francisco Bay Area is conducted by each county separately. There is no formal, timely coordination of waterborne disease surveillance across county lines. A system with multi-jurisdictional disease monitoring capabilities could provide public health benefit by permitting early detection of a multi-county waterborne outbreak. Surveillance data captured from multiple jurisdictions and interpreted centrally may lead to earlier outbreak detection than data gathered and interpreted by staff in separate counties who may not be aware of disease incidence in neighbouring jurisdictions.

Syndromic surveillance data sources may potentially provide cross-jurisdictional data, information about the geographical scope of an outbreak once one is identified, and additional reassurance that an outbreak is not occurring. However, the practical utility of syndromic data sources is uncertain; their outbreak detection benefits are currently theoretical and will remain so until accurate electronic capture of data and signal detection algorithms are refined or data sensitivity increases. The benefits of a regional waterborne disease surveillance system must be weighed against the resources required to set up such a system and the true risk of a waterborne disease outbreak in the San Francisco Bay Area. Based on the absence of any known waterborne outbreaks in the history of the water utility, extensive watershed protection measures, and a protected water source located in a national park, it would seem that the risk of a waterborne disease outbreak occurring in the San Francisco Bay Area is quite small.

Potential waterborne disease syndromic surveillance data sources

Table 2 lists the potential waterborne disease syndromic surveillance data sources along with indicators of data quality and utility. Potential data sources in the table are divided into data that are currently available and accessible in electronic format in the San Francisco Bay Area and data that are not currently automated or electronically available but could be useful once they became automated and electronic.

Table 2

 Potential syndromic surveillance data sources

Compared with other options, OTC surveillance for waterborne disease is currently the most feasible source of syndromic surveillance data available in the San Francisco metropolitan area because of the relative ease of implementing an existing, nationally funded system. Nursing home surveillance entails large inputs in terms of health department and on-site staff and fiscal resources for specimen testing, data monitoring, and signal investigation. Water utility complaint, nurse call line, and school and worker absenteeism logs are currently not compiled electronically in a central location; setting up a system to electronically capture these data would entail considerable commitments of will and resources from the department of public health, other city agencies, and private and public partnerships with water utilities, insurance providers, hospitals and clinics, and large employers in the San Francisco Bay Area. Mechanisms for data storage, sharing, and retrieval would need to be established for each partnership. Finally and most importantly, dedicated public health staff who would compile, manage, and analyse syndromic data on a regular basis, and respond to syndromic surveillance signals would be needed. 

Signal verification and response activities may include: (1) identifying data import and aberration detection algorithm problems that may lead to erroneous signals (for example, duplicate data, batch transfers from certain institutions, miscoding of information at the point of entry, or text-string search algorithms that are too specific or not specific enough); (2) verifying the validity of the signal by looking for the presence of signals in other data sources; (3) if the signal is deemed to be composed of possible true cases, manually reviewing hospital logs and charts (by hospital or public health department staff) and compiling a line list for clinical and/or laboratory based case verification; and (4) carrying out traditional outbreak investigation activities and applying interventions. Timeliness provided by a syndromic surveillance system can only be useful if all of the above functions are supported by and integrated into the activities of the local health department on a sustained basis.

Policy implications

This review will help policy makers weigh the costs and benefits of implementing a syndromic surveillance system and will clarify the drawbacks and advantages of potential data sources.

It is possible that outbreaks of cryptosporidiosis, cyclosporiasis, legionellosis, hepatitis A, and others may be detected through monitoring of OTC sales or that a suspected outbreak may be confirmed or better characterised through the use of these data. If OTC surveillance is to be used primarily as a back up data source to traditional surveillance, staff time for checking the web based interface would not exceed 15 minutes per day. Signal investigation would require additional resources. While the usefulness of OTC monitoring for waterborne diseases is only theoretical, given the availability of multi-county data and the low amount of staff time and effort needed to monitor the data, utilisation and prospective evaluation of these data may be recommended for two purposes: (1) reassurance of the absence of a waterborne disease outbreak and (2) establishing baseline familiarity that may prove helpful in the event of a waterborne disease outbreak. OTC data need to be correlated with known outbreaks in the geographical area where surveillance is occurring to clarify the validity and representativeness of these data before they can be used for prospective outbreak detection.


Because the efficacy of syndromic surveillance is not proved, it remains unclear whether it is worth the investment of personnel and financial resources to implement. In addition to the issues discussed in this review, it is challenging to develop sensible response protocols for syndromic surveillance systems because the likelihood of false alarms is so high, and because information is currently not specific enough to enable more timely outbreak detection or disease control activities. Although most syndromic surveillance systems do not use personally identifiable information, there are also considerations of personal privacy and public comfort with health information storage and analysis. The data needs of a syndromic surveillance system must be weighed against the needs and public demands for patient privacy.

There are many theoretical benefits of syndromic surveillance, including potential timeliness, increased response capacity, ability to establish baseline disease burdens, and ability to delineate the geographical reach of an outbreak. However, in the absence of empirical evidence of its efficacy to mitigate the effects of natural or intentional waterborne disease outbreaks through earlier detection and response, syndromic surveillance should not be implemented at the expense of traditional public health activities, and it should not be relied upon as a principal waterborne disease outbreak detection tool at this time. For waterborne diseases, syndromic surveillance systems should continue to be assessed and implemented as a supplemental system to help develop better statistical methods and sensible alarm thresholds that can be applied and evaluated in practice.



  • Funding: none.

  • Competing interests: none.
