“Screening is the systematic application of a test or inquiry, to identify individuals at sufficient risk of a specific disorder to benefit from further investigation or direct preventive action, among persons who have not sought medical attention on account of symptoms of that disorder.”1
Early detection and intervention is intuitively attractive, particularly for cancer, where the consequences of disease may be serious and it seems obvious that early detection must be beneficial. In 1968 Wilson and Jungner2 proposed a set of criteria for assessing whether screening is worthwhile, and their carefully reasoned approach has served public health well in the following decades. Now, with several decades of screening experience to draw on, we propose that the following criteria (1 to 3 below) need to be considered in assessing whether screening is worthwhile. In addition, one further, rapidly developing issue (4 below) needs urgent debate and research. These points are listed below and then briefly discussed.
1. Cancer screening programmes should no longer be introduced without evidence from randomised trials demonstrating a mortality benefit or a quality of life benefit (for example, need for less invasive treatment)
2. The extent of inconsequential disease that will be generated by the screening programme should be estimated and carefully considered before widespread introduction of screening
3. The benefit (demonstrated by trials) should be weighed against the harms of screening to assess whether there is likely to be a net benefit from screening before widespread introduction of screening
4. If screening is introduced, potential participants in the screening programme should be given information that allows them to weigh up the probable benefit and harms, using their own values and preferences
Assessing the mortality or quality of life benefits of cancer screening is, in itself, a complex exercise. The best level of evidence comes from randomised trials, which may randomise participants to screening or non-screening arms (as for example in trials of mammographic screening),3 or may randomise screen detected cases to early versus later treatment (as for example in trials of cholesterol screening).4,5 It is important that randomised trials of screening meet quality criteria for randomised trials in general (that is, there should be allocation concealment, blind assessment of outcomes, small losses to follow up, and analysis by intention to treat).6 In addition trials of screening need to avoid lead time bias by using appropriate outcome measures such as the mortality rates in both arms of the trial (and not survival time from diagnosis as this will probably result in lead time bias). While trials of screening should be critically appraised (and may be found wanting as in the recent controversy over the analysis of mammographic screening trials by Olsen and Gotzsche7), they remain the best level of evidence to establish the benefit of early detection and treatment.
An insistence on randomised trials of screening may be regarded by some as unreasonable. But decisions that have the potential to affect hundreds of thousands (or millions) of people and that will place large burdens on healthcare resources, as population screening programmes inevitably do, must be based on the best possible evidence. Furthermore, there is an ethical imperative to base interventions that will be conducted on asymptomatic people on the best possible evidence because of the potential to harm those who are well, for example by psychological or physical adverse effects of investigations conducted on people who receive false positive screening test results.
Screening can lead to widespread over-detection and over-treatment of inconsequential disease. Screening for cervical cancer and screening for prostate cancer both frequently detect low grade disease that is unlikely ever to become symptomatic in screening participants’ lifetimes. Inconsequential disease has also been called “pathologists’ disease” (with apologies to pathologists) because there is histological evidence of disease, but the changes are low grade and unlikely to progress to invasive, symptomatic disease. For example, in cervical cancer screening abnormal smear tests are diagnosed in “6800 of every 100 000 women screened when the annual incidence of invasive cervical cancer in England and Wales was never greater than 30 per 100 000”.8 Clearly this represents enormous over-detection and, because no one can tell which cases will progress, enormous over-treatment of cervical intraepithelial abnormalities. The costs of inconsequential disease may be enormous, both for individuals and for the health budgets of countries that introduce cancer screening programmes.
WEIGHING UP BENEFITS AND HARMS
If a clear benefit of screening is demonstrable in randomised trials, the harms of screening need to be weighed against that benefit and an assessment made of whether there is likely to be a net benefit of screening. Harms include psychological and physical adverse effects of follow up tests, as well as the effects of inconsequential disease. Balance sheets of the consequences of screening are urgently needed (see table 1 for an example). Assessments of the ratio of true positive: false positive results when new screening tests are proposed may also be useful in weighing benefit against harms.
INDIVIDUAL VALUES AND PREFERENCES
Traditionally, public information about screening has been aimed at achieving high uptake rather than informed choice,10–13 and individual values are usually ignored for the sake of achieving population benefits from screening. Yet when benefits and harms are finely balanced, individuals’ preferences are likely to be important in deciding whether there is net benefit or net harm for an individual participant. Two people considering, for example, colorectal cancer screening may well come to opposite but informed and rational views as to whether screening is worthwhile for them. There appears to be a concern that full disclosure of the benefit and risks associated with screening programmes will reduce uptake, and therefore population benefits. This creates a tension with the guidelines from the General Medical Council, which make it clear that full information about the benefit, risks, and consequences of screening should be provided to potential participants. At the same time it needs to be recognised that the way information about the benefits and harms of screening is presented is important, and there are concerns that those most likely to be deterred from screening may be the most socially disadvantaged.10
The following glossary is intended to provide a framework for reading research on screening for cancer and to stimulate discussion on the points raised above. The terms are presented in groups of terms with related meanings rather than alphabetically.
The target condition in this context refers to cancer or high grade abnormality that is likely to progress to cancer within a specified time frame (months to a few years). We consider low grade abnormalities that are unlikely to progress to cancer, such as low grade squamous intraepithelial lesions in programme screening for cervical cancer, to be false positives or nuisance positives (see False positive and Inconsequential disease).
BENEFITS OF SCREENING
Accurate early diagnosis of cancer (or pre-cancer) gives the opportunity to start treatment before disease progresses, thus potentially reducing the need for aggressive therapy, reducing the likelihood of metastatic disease, and averting cancer deaths. The size of the benefit depends on the effectiveness of early intervention compared with intervention at the time when the target condition is expected to manifest clinically. Reassurance afforded by a normal result may also be considered a benefit, although it is difficult to see reassurance as a benefit if the public was unaware of the target condition before the introduction of a screening programme.
HARMS OF SCREENING
For those who receive an abnormal test result, harms of screening may include complications arising from early investigation and treatment, unnecessary treatment of people with inconsequential disease, and unnecessary investigations of those who receive false positive results.4 Adverse psychological effects of labelling or early diagnosis, anxiety, and the costs and inconvenience incurred are also potential harms. A negative result may give false reassurance and potentially cause a delayed presentation of symptomatic disease.14 This is true both for false negatives (when disease is present but currently undetected) and for true negatives (if disease develops after the screening test, and the reassurance afforded by the previous negative test leads to delayed clinical presentation).
A balance sheet is a table outlining the consequences of screening (for example, per 1000 or per 10 000 people screened). It gives the numbers of people who experience the benefit(s) and harm(s) of screening, to help people decide whether they consider screening worthwhile.15 See table 1 for an example.
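As an illustration only, the arithmetic behind such a balance sheet can be sketched as follows. The class `ScreeningInputs`, the function `balance_sheet`, and every number used are hypothetical placeholders chosen for the example; they are not taken from any trial or from table 1.

```python
from dataclasses import dataclass


@dataclass
class ScreeningInputs:
    # Every value supplied here is a hypothetical placeholder, not a trial result.
    risk_of_death_unscreened: float  # risk of dying from the cancer without screening
    relative_risk: float             # risk ratio for cancer death, screened vs unscreened
    risk_of_false_positive: float    # risk of at least one false positive result
    risk_of_inconsequential: float   # risk of having inconsequential disease detected


def balance_sheet(inputs, per=1000):
    """Return expected numbers of people affected per `per` people screened."""
    deaths_averted = inputs.risk_of_death_unscreened * (1 - inputs.relative_risk) * per
    return {
        "cancer deaths averted": round(deaths_averted, 1),
        "false positive results": round(inputs.risk_of_false_positive * per, 1),
        "inconsequential disease detected": round(inputs.risk_of_inconsequential * per, 1),
    }


# Hypothetical inputs: 5 per 1000 die without screening, relative risk 0.8.
example = ScreeningInputs(0.005, 0.8, 0.10, 0.002)
for consequence, number in balance_sheet(example).items():
    print(f"{consequence}: {number} per 1000 screened")
```

Presenting benefits and harms on the same denominator is the point of the exercise; a real balance sheet would draw each row from trial or programme data.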
Inconsequential disease is a disease state that is detected by screening but that would not contribute to a poor outcome if undetected by screening. An example is low grade prostate cancer that is detected by screening but that, without screening, would not have become symptomatic in the man’s lifetime.
The screening interval is the period of time between each screening round.
An interval cancer is a cancer that is diagnosed after a normal screening test result was given and before the next scheduled screening examination or during some defined time period after the screening test. Interval cancers include a spectrum of cancers, from those that either did not exist or were undetectable at the previous screening round to those that were detectable but missed.16 While this theoretical distinction exists, in practice it can be difficult to separate accurately the “missed” cancers (or false negative results) from cancers that either did not exist or were undetectable at the previous screen.17
Rates of participation (or screening uptake rates) are the proportion of those eligible for screening who are actually screened. High participation rates have been considered an important measure of programme success as higher participation rates should be associated with greater population benefit. However, the potential benefit that the individual participant can expect is not affected by the participation (or non-participation) of others. Once the programme is established, the cost effectiveness should not be greatly affected by participation rates as costs and benefits will tend to increase proportionally as participation rates rise.18 Thus there is increasing debate about whether the programmes should be judged by their participation rate or by their capacity to provide for informed choice and informed participation.10 To achieve informed choice, information about the purpose of screening, including information on mortality reduction with screening, the likelihood of positive and negative findings (including false negative and false positive results), uncertainties and risks of screening, important medical, social or financial consequences of screening, and follow up plans should be given.19 Screening decision aids may be useful for this purpose.
A false negative is a normal (negative) test result in a person who has the target condition.
A false positive is an abnormal (positive) test result in a person who does not have the target condition. The false positive rate (the proportion of unaffected individuals identified as positive through screening), or 1−specificity, is important in quantifying the number of people without the cancer who will need further investigation. A reduction in the false positive rate is usually only achieved at the expense of a greater false negative rate. The test thresholds chosen for recall and follow up should ideally reflect the implications of false negative and false positive results and the values that participants ascribe to these implications. Increasingly, it is being suggested that we consider the cumulative false positive rate of a screening programme, because the screening test is usually repeated over a number of years at a recommended interval between tests.20
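A rough sketch of how a cumulative false positive risk builds up over repeated rounds is given below. It assumes that the person remains free of the target condition and that results in successive rounds are independent with a constant false positive rate; both are simplifying assumptions made for illustration, and the function name and the figures are hypothetical.

```python
def cumulative_false_positive_risk(fpr_per_round, rounds):
    """Risk of at least one false positive over `rounds` screening rounds.

    Assumes independent rounds with a constant false positive rate,
    which real programmes will only approximate.
    """
    return 1 - (1 - fpr_per_round) ** rounds


# Hypothetical example: a 5% false positive rate per round, 10 rounds.
risk = cumulative_false_positive_risk(0.05, 10)
print(f"Risk of at least one false positive: {risk:.1%}")
```

Under these assumptions, a per-round rate that looks small can still mean that a large minority of participants are recalled at least once over the life of a programme.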
SENSITIVITY (TRUE POSITIVE OR DETECTION RATE* OF A SCREENING TEST)
The proportion of individuals who have the target condition who receive a positive (abnormal) test result. A screening test that is more sensitive will pick up a greater proportion of those with the target condition and thus make a favourable difference to outcome when early intervention is effective. However, it is possible that a more sensitive test may worsen the trade off between benefits and harms if the test picks up more cases of inconsequential disease.
SPECIFICITY (TRUE NEGATIVE RATE OF A SCREENING TEST)
The proportion of individuals free of the cancer in a population who are correctly identified by a screening test as being free of the cancer. If a new screening test is more specific (results in fewer false positive results) then the potential harm of screening may be reduced.
POSITIVE PREDICTIVE VALUE (OF A SCREENING TEST)
The proportion of individuals with an abnormal test result who have the target condition. A related measure, the “odds of being affected given a positive result” (OAPR), expresses the same information as odds: it is the ratio of true positives to false positives.1
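The test accuracy measures defined above can all be read off a two by two table of screening result against true disease status. The sketch below uses hypothetical counts for 10 000 people screened; the function name `screening_test_metrics` is introduced here for illustration only.

```python
def screening_test_metrics(tp, fp, fn, tn):
    """Accuracy measures from counts of true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),          # positives among those with the condition
        "specificity": tn / (tn + fp),          # negatives among those without it
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "ppv": tp / (tp + fp),                  # affected among those testing positive
        "oapr": tp / fp,                        # odds of being affected given a positive result
    }


# Hypothetical counts: 50 of 10 000 people screened have the target condition.
metrics = screening_test_metrics(tp=40, fp=995, fn=10, tn=8955)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test is 80% sensitive and 90% specific, yet fewer than 4% of people with a positive result have the target condition, because the condition is rare among those screened.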
INTENTION TO SCREEN ANALYSIS/PRINCIPLE
In a trial of a screening intervention, patient outcomes are analysed according to the group to which subjects were randomised, irrespective of whether those in the screening and control arms actually participated in screening. The importance of this principle lies in ensuring that randomisation is preserved, thus maintaining an equal distribution of important factors that may influence the outcome in the control and intervention groups. Using intention to screen analysis also reflects more closely the population benefit that can be expected given participation rates that are likely to be encountered in practice. If desired, an adjustment can be made to de-attenuate the effect of less than 100% participation to provide an estimate of the probable benefit for those people who do actually attend for screening.21
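One simple form of such an adjustment is sketched below. It assumes that people who do not attend gain no benefit, that they have the same underlying risk as the control arm, and that nobody in the control arm is screened; these are strong assumptions, and the method described in reference 21 may differ. The function name and figures are hypothetical.

```python
def relative_risk_among_attenders(rr_intention_to_screen, participation):
    """De-attenuate an intention to screen relative risk for incomplete participation.

    Under the stated assumptions,
    RR_ITT = participation * RR_attenders + (1 - participation) * 1,
    which is solved here for RR_attenders.
    """
    if not 0 < participation <= 1:
        raise ValueError("participation must be a proportion in (0, 1]")
    return 1 - (1 - rr_intention_to_screen) / participation


# Hypothetical example: a 20% reduction by intention to screen, 50% participation.
print(f"{relative_risk_among_attenders(0.8, 0.5):.2f}")
```

The intention to screen estimate remains the unbiased comparison; the adjusted figure is only as good as the assumptions about non-attenders.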
Sojourn time is the duration of the period in which the target condition is asymptomatic but detectable by screening.22 It is also called the preclinical duration or preclinical phase.
LEAD TIME BIAS (ZERO TIME SHIFT)
Lead time is the period of time between the detection of the target condition by screening and the time when the condition would have been diagnosed clinically (if screening had not occurred). In effect screening brings forward in time the diagnosis, giving people more time to live knowing their diagnosis (in other words more “disease time”). Screening may therefore seem effective (even if ineffective) if survival from time of diagnosis is used as the outcome to compare screen detected cases with clinically diagnosed cases. This is called lead time bias.
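A toy calculation makes the bias concrete. The ages below are invented for illustration, and screening is deliberately assumed to be completely ineffective, so the age at death is the same with or without it.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical person: screening would detect the cancer at 62, symptoms would
# lead to diagnosis at 65, and death occurs at 70 in either case.
AGE_AT_DEATH = 70
screen_detected = survival_from_diagnosis(62, AGE_AT_DEATH)
clinically_diagnosed = survival_from_diagnosis(65, AGE_AT_DEATH)
lead_time = 65 - 62

print(f"Survival if screen detected: {screen_detected} years")
print(f"Survival if clinically diagnosed: {clinically_diagnosed} years")
print(f"Lead time: {lead_time} years")
```

Survival from diagnosis is three years longer for the screen detected case, which is exactly the lead time, although the person dies at the same age. Comparing mortality rates between trial arms avoids this problem.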
LENGTH TIME BIAS
People whose disease is discovered by screening may also seem to live longer because screening tends to detect slowly progressing disease and may miss rapidly progressing disease that becomes symptomatic between screening rounds.23 Thus, a comparison of survival between screen detected cases and clinically diagnosed cases may show an apparent benefit of screening even if screening is ineffective (length time bias).
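The over-representation of slowly progressing disease among screen detected cases can be sketched with a steady state argument: at a single screen, the chance that a case is in its detectable preclinical phase is proportional to how long that phase lasts. The sketch below assumes a perfectly sensitive test and a stable population; the function name and all figures are hypothetical.

```python
def screen_detected_mix(incidence_share, mean_sojourn_years):
    """Share of each disease type among cases found at a single screen.

    Each type is weighted by incidence share times mean sojourn time,
    assuming a steady state and a perfectly sensitive test.
    """
    weights = {
        kind: incidence_share[kind] * mean_sojourn_years[kind]
        for kind in incidence_share
    }
    total = sum(weights.values())
    return {kind: weight / total for kind, weight in weights.items()}


# Hypothetical example: slow and fast disease arise equally often, but the
# slow type is detectable for 4 years and the fast type for 1 year.
mix = screen_detected_mix(
    incidence_share={"slow": 0.5, "fast": 0.5},
    mean_sojourn_years={"slow": 4.0, "fast": 1.0},
)
print(mix)
```

In this made-up example, four in five screen detected cases are of the slow type even though both types are equally common, so screen detected cases would show better survival whether or not early treatment helps.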
RELATIVE RISK REDUCTION (RRR)
The difference in event rates between control and intervention groups, divided by the event rate in the control group, or 1−relative risk (RR). This measure is used when the outcome of interest is an adverse event and the intervention reduces the risk.
ABSOLUTE RISK REDUCTION (ARR)
The difference in event rates between control and intervention groups. The value of the absolute risk reduction depends on the baseline risk of disease and thus can present a more realistic estimate of the size of the mortality benefit24 than the relative risk reduction.
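The contrast between the two measures can be sketched with two hypothetical populations that share the same relative risk reduction but differ tenfold in baseline risk. The function name and the event rates are invented for illustration.

```python
def risk_reductions(control_rate, intervention_rate):
    """Return (absolute risk reduction, relative risk reduction)."""
    arr = control_rate - intervention_rate
    rrr = arr / control_rate
    return arr, rrr


# Hypothetical event rates: the same 20% relative reduction at two baseline risks.
low_arr, low_rrr = risk_reductions(control_rate=0.005, intervention_rate=0.004)
high_arr, high_rrr = risk_reductions(control_rate=0.050, intervention_rate=0.040)

print(f"Low baseline risk:  RRR {low_rrr:.0%}, ARR {low_arr * 1000:.0f} per 1000")
print(f"High baseline risk: RRR {high_rrr:.0%}, ARR {high_arr * 1000:.0f} per 1000")
```

Both populations show a 20% relative risk reduction, but the absolute benefit is 1 per 1000 in one and 10 per 1000 in the other.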
NUMBER NEEDED TO SCREEN (NNS)
In the case of a randomised trial of screening (in which participants are randomised to screening or to no screening), the reciprocal of the ARR is the number needed to screen (NNS) to prevent one adverse outcome. Recently it has been suggested the term NNS should be adjusted for participation rate and that this be known as the NNBS—number needed to be screened.25 The number of people who actually need to participate in screening to prevent one death is lower than the number of people who need to be invited to screen to prevent one death; therefore, NNBS figures are usually lower than the NNS. As NNS is based on absolute risk, making comparisons between screening programmes using NNS may be misleading if there are differences in the baseline risks of conditions being compared.26
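The relation between the two numbers can be sketched as follows. The participation adjustment shown assumes that all of the benefit accrues to those who attend, so that the absolute risk reduction among attenders is the intention to screen value divided by the participation rate; this is a simplifying assumption made here, and the calculation proposed in reference 25 may differ in detail. Function names and figures are hypothetical.

```python
def number_needed_to_screen(arr):
    """Reciprocal of the absolute risk reduction from a trial of screening."""
    return 1 / arr


def number_needed_to_be_screened(arr_invited, participation):
    """NNS adjusted for participation.

    Assumes the whole benefit seen among those invited is experienced by the
    proportion who actually attend.
    """
    arr_attenders = arr_invited / participation
    return 1 / arr_attenders


# Hypothetical example: ARR of 1 per 1000 invited, 60% participation.
print(f"NNS:  {number_needed_to_screen(0.001):.0f}")
print(f"NNBS: {number_needed_to_be_screened(0.001, 0.6):.0f}")
```

Under this assumption the NNBS is simply the NNS multiplied by the participation rate, which is why it comes out lower.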