Too much data? Too much information? The COVID-19 pandemic has made the case. The WHO coined the term ‘infodemic’ to describe the overabundance of information, including misinformation, disseminated in real time via multiple channels.1 2 A related concept is the ‘datademic’, which describes the overabundance of data. I argue in this essay that the infodemic intoxicates public health surveillance and decision-making, and that we need to revisit how we conduct surveillance in the age of big data by fostering a slow data culture.
Why too much data intoxicate public health surveillance
Surveillance is the ongoing systematic collection, analysis and interpretation of data, closely integrated with the timely dissemination of the resulting information to those responsible for preventing and controlling disease and injury.3 Traditionally, it requires high-quality data that are collected for this purpose using well-defined methods. In the era of infodemic and big data, access to different types of data has increased tremendously, offering new opportunities for surveillance. These new data, however, are not collected primarily for surveillance, are often of relatively low (or poorly documented) quality and, what is highly problematic for surveillance, are weakly consistent across settings and time.3 As a consequence, the quality and reliability of information derived from these data are questionable.
An additional problem is what is called selectivity bias.4 With some types of big data, it is difficult to define the source population from which they emerged. For instance, routinely collected data from healthcare providers are usually event based rather than population based, and the population using the services of these providers changes unpredictably over time. These data are further exposed to surveillance bias and the streetlight effect.3 Defining the source population is also very difficult with data from social media networks or internet queries, and it is not clear of which population they are representative. As a result, how to generalise information derived from these data to a target population is a serious challenge for decision-makers.
Multiplication of information producers
Beyond quality, volume is the other major problem that intoxicates surveillance activities. In the field of public health surveillance, we chronically complain about the lack of data, for example, to estimate the burden of given diseases or to evaluate the effect of specific policies on population health. When available, however, these data have the advantage of being collected to address relatively well-defined public health surveillance needs; they are designed for this purpose. In this time of infodemic, we are not short of data; on the contrary, we are overwhelmed by data and information that are, on the one hand, easily available but, on the other hand, not designed to address public health needs. New methods are needed to make them useful for decision-making.
Further, there is not only a multiplication of new data sources but also a multiplication of information producers (figure 1). Until recently, the production of surveillance information was tightly regulated, usually under the guidance of public health experts more or less trained in epidemiology and statistics, mandated by public agencies, and on whom decision-makers could rely. The health information production market is now open and deregulated, with data analysed by multiple people, with or without epidemiological expertise, who make their analyses available through different types of traditional and social media, and who compete with traditional information producers. The problem is that substantial resources are needed to handle this huge volume of information, as decision-makers have to screen and sieve it to sort out what is useful for decisions.
While poor quality and huge volume of data can lead to the production of information of questionable utility, a related problem is misinformation, which can be defined as incorrect or false information that is shared with or without the intent to harm.5 6 It is nurtured by several psychosocial mechanisms such as congeniality bias, motivated reasoning, doom-scrolling and negativity bias, and is amplified by the algorithms of social networks.7 The spread of false information delays the adoption of effective public health measures, notably vaccination. Due to the scale of misinformation within the digital information ecosystem, tracking and countering misinformation has become a new, and resource-consuming, activity for public health agencies.8 9
Failure of scientists
Scientists have not helped substantially to mitigate the effects of the infodemic or to prevent misinformation. Actually, they might have contributed to these trends. More than 500 000 scientific papers are estimated to have been published on COVID-19 up to September 2021.10 It has also become standard to practise ‘medicine by press release’ by providing information on a new treatment or a test directly through the media. Hence, sensational and exaggerated research findings are spread rather than evidence-based and potentially useful ones.11 Further, the mediatisation of preprints has favoured the spread of non-peer-reviewed information. Another trap for scientists is academic militantism, which blurs the boundary between science making and politics.12
More broadly, the COVID-19 pandemic magnifies how biomedical and public health science is evolving, with the mass production of low-value research and with much effort devoted to communication designed for academic self-promotion rather than for knowledge transfer.11 In this era of infodemic, scientists should rather help decision-makers to deal with the mass of health information and sort out the good from the bad through, for example, misinformation-monitoring systems.13 More than ever, there is a need for trustworthy gatekeepers of knowledge, working within an evidence-based framework, on whom decision-makers can rely.
How to cope? Moving from a big to a slow data culture
In the age of infodemic and big data, we need to slow the surveillance data and information production process. Making sense of these data and information, and making them useful for decision-makers, requires time. We call here for a ‘slow data’ culture, defined as a shared way of collecting, analysing, and disseminating data and information in order to address explicitly the needs of public health decision-makers, and which can be fostered in three steps.
The first step is to admit that we need more than data for an efficient public health surveillance system. Indeed, we have to focus on the process of producing useful information that matches the needs of decision-makers. Data, whatever their scale, are not enough; ‘big’ data do not speak for themselves any more than ‘small’ data do. In many surveillance systems, a large share of resources is used for data collection and analyses, while more weight should be given to information dissemination and communication. We need to identify surveillance needs explicitly, evaluate systematically what is gained by considering extra data, and anticipate the resources needed to handle these new data. We also need to strengthen the surveillance and containment of infodemic and misinformation, for example, through real-time tracking systems for false COVID-19 claims and the use of trusted fact-checkers.13 14
Second, it is essential to re-enchant expertise and evidence. The pandemic has exposed strengths and weaknesses of scientists, and raised doubts about the value of reason and science.15 Public health agencies, epidemiologists, and data scientists nevertheless have unique expertise and credibility to guide the adequate use of data for surveillance.16 More than ever, we need experts in public health surveillance with a clear understanding of the difference between evidence production and health policymaking,6 at the interface between decision-making and data science, and working in trustworthy, science-grounded, accountable institutions. This is necessary to counter misinformation through the promotion and dissemination of credible information.14 17
Finally, it is necessary to strengthen the skills of society at large in the field of population health science. Epidemiology should be taught early in schools18 and, within a consequentialist and evidence-based perspective,19 epidemiologists have to be trained in knowledge translation and health communication.16 In the age of big data and infodemic, increasing health data and epidemiological literacy at the population level is needed for citizen-centred and accountable public health surveillance systems.
Data availability statement
No data are available. Not applicable.
Patient consent for publication
This study does not involve human participants.
Contributors AC wrote the manuscript and is responsible for the overall content as guarantor.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.