Assessing psychosocial/quality of life outcomes in screening: how do we do it better?
High quality research on the psychosocial outcomes of screening programmes is urgently needed.

Assessing non-medical outcomes of screening presents constant challenges. Marteau and colleagues1 offer some insight into these complexities in their study of abdominal aortic aneurysm (AAA) screening. The paper reports that self assessed health (SAH) was lower among men who were found to have an aortic aneurysm than among men who were not, yet baseline measurement indicated that much of this difference pre-dated screening. Poorer SAH seemed to predict having an aortic aneurysm. The authors suggest that the findings have implications for the methods used to assess the psychological impact of screening tests and warn us not to conclude that poorer outcomes are necessarily a product of screening if baseline differences have not been assessed.

Marteau et al’s1 findings are extremely interesting and raise important issues for the assessment of psychosocial or quality of life (QOL) outcomes in the screening context. Adequate assessment of psychosocial as well as medical outcomes is crucially important, especially given the potential of screening to detect inconsequential disease,2–4 but it presents many challenges. These have received comparatively little attention. We have identified three main methodological concerns: (1) the need for a control group (preferably created by randomisation); (2) the need for baseline and follow up measurements; (3) the need for reliable measurement tools with high criterion and content validity.

The first concern, obtaining an adequate control group, perhaps presents the most difficulty. If our goal is to assess the impact of screening we need to measure the combined impact of the screening procedure, follow up tests, and treatments. The best way to achieve this is to randomise people to be screened or not screened and to measure the psychosocial impact on everyone at multiple times (see fig 1), in a way that is analogous to the assessment of the medical outcomes of screening.2 This would mean that, as well as establishing, for example, the mortality rate from breast cancer (in a trial of mammography screening) in all those randomised to screening and all those randomised to usual care, the investigators would need to measure average QOL effects in these groups as well. Investigators will thus have to ensure appropriate measures are taken from those randomised to screening who (1) do not respond to the screening invitation; (2) test negative (including those who are truly negative and those who later are discovered to be false negatives); (3) test positive (again both true and false positives), or from random samples of people in each of these groups. Comparable measures will also be needed in the usual care group, including in those who do and do not seek screening through alternative systems. Some of the test positive group will in fact have inconsequential disease, but as this is not identifiable on an individual level, the only way to estimate the psychosocial impact of this is by comparison of the screened group as a whole with the usual care control group. Clearly this will add to the complexity and challenges of data collection for randomised trials of screening, but comparatively small sample sizes will be needed for psychosocial outcomes (compared with medical outcomes). Furthermore, efficiencies may be achievable by carefully designed sampling strategies. 
In summary, it should be feasible to validly answer questions about the real psychosocial impact of screening in this way.
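The arm level comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the subgroup shares and mean QOL scores are invented, and the names `Subgroup` and `arm_mean` are hypothetical. It shows how subgroup means estimated from samples could be weighted by each subgroup's share of its arm, so that everyone randomised to screening is compared with everyone randomised to usual care.

```python
from dataclasses import dataclass


@dataclass
class Subgroup:
    name: str
    share: float     # proportion of the randomised arm in this subgroup
    mean_qol: float  # mean QOL score estimated from a sample of the subgroup


def arm_mean(subgroups):
    """Share-weighted mean QOL for a whole randomised arm."""
    total_share = sum(s.share for s in subgroups)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(s.share * s.mean_qol for s in subgroups)


# Invented figures, for illustration only.
screened = [
    Subgroup("did not respond to invitation", 0.30, 70.0),
    Subgroup("tested negative", 0.63, 72.0),
    Subgroup("tested positive", 0.07, 60.0),
]
usual_care = [
    Subgroup("sought screening elsewhere", 0.10, 71.0),
    Subgroup("did not seek screening", 0.90, 71.0),
]

# Negative values mean lower average QOL in the arm invited to screening.
difference = arm_mean(screened) - arm_mean(usual_care)
print(f"Screened arm minus usual care arm: {difference:.2f}")
```

Because the comparison is made between whole arms, it does not require knowing which individuals in the test positive group have inconsequential disease.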

Figure 1 Design of randomised trials for valid estimation of the psychosocial impact of screening.

Alternatively, other designs may be feasible in some circumstances. For example, people could be randomised to receive or not receive their results and subsequent tests and treatments, with follow up of psychosocial outcomes. Such designs have commonly been used in the past to evaluate screening for risk factors such as high cholesterol and high blood pressure in terms of medical outcomes.2

The second concern is the importance of taking baseline and follow up measures in both screened and unscreened groups. All psychosocial/QOL studies obviously take measures after screening (point 3, see fig 1), and many, as in Marteau’s study,1 take them before and after testing (points 1 and 3). However, we have been unable to find studies that have taken and reported measures at points 1, 2, 3, and 4 (see fig 1) or more. In particular, measures are rarely taken or reported among appropriate controls at points 2 and 4. For example, a study by Wardle et al5 assessed anxiety among adults randomised to be sent, or not sent, information about sigmoidoscopy screening together with a question about whether they would be interested in attending, but follow up measures were not reported in either arm.
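A small worked example may help to show why all four measurement points matter. The sketch below uses invented scores and hypothetical function names. It assumes that points 1 and 3 are the before and after measures in the screened group and points 2 and 4 are the corresponding measures in controls, which is our reading of the text rather than a restatement of fig 1. It compares a follow up only difference with a difference in changes from baseline.

```python
def mean(scores):
    return sum(scores) / len(scores)


def follow_up_only_difference(screened_after, control_after):
    """Difference in mean scores at follow up, ignoring baseline."""
    return mean(screened_after) - mean(control_after)


def difference_in_changes(screened_before, screened_after,
                          control_before, control_after):
    """Change in the screened group minus change in the control group."""
    screened_change = mean(screened_after) - mean(screened_before)
    control_change = mean(control_after) - mean(control_before)
    return screened_change - control_change


# Invented scores (higher = better wellbeing), for illustration only.
screened_before = [66, 64, 68]  # point 1
control_before = [70, 68, 72]   # point 2
screened_after = [64, 62, 66]   # point 3
control_after = [69, 67, 71]    # point 4

naive = follow_up_only_difference(screened_after, control_after)
adjusted = difference_in_changes(screened_before, screened_after,
                                 control_before, control_after)
print(f"Follow up only difference: {naive:.1f}")
print(f"Difference in changes from baseline: {adjusted:.1f}")
```

With these invented figures most of the apparent deficit at follow up was already present at baseline, which is the pattern that cannot be detected when measures are missing at any of the four points.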

Thirdly, it is imperative to select instruments that adequately capture psychosocial outcomes/QOL. What exactly constitutes psychosocial outcomes or QOL is often loosely defined. QOL itself has been described by many researchers as an atheoretical construct6–10 and there is little clear consensus about what should or should not be used to assess it adequately, particularly in the context of screening. Most psychological and QOL measures are designed for use in patient populations and as such are designed to capture relatively large decrements in QOL/wellbeing. Screening may lead to comparatively small decreases in psychological wellbeing/QOL, but the decrement may occur across very large numbers of people and so may still be important. General psychological/QOL measures may not be sensitive enough to capture all outcomes. Some screening specific measures have been developed to combat this problem, for example, the perceived consequences questionnaire,11 the cervical screening questionnaire,12 and the PEAPS Q.13 It has been argued recently that such measures should be used to assess screening outcomes rather than other widely used generic measures.14 However, quantitative measures that can be equated to and calibrated against other adverse health outcomes and events are crucial if we are truly to gain a measure of how screening affects the wellbeing of individuals and populations. Once the psychosocial/QOL impact of a screening test is adequately captured it may then be weighed against the test’s medical outcomes to comprehensively evaluate its worth as a screening tool.

Marteau’s study1 also raises interesting questions about what represents QOL/psychosocial outcomes. Although consensus on QOL/psychosocial measurement is limited, most evaluations include some component of emotional and social functioning with a measure of perceived health or physical functioning sometimes also included. Marteau1 reports only perceived health (SAH). The finding that SAH is poorer after screening in the group with screen detected aortic aneurysms is not at all surprising. The purpose of screening is to identify people at increased risk of disease and inform them of their status. As such, the finding that a person rates their health as poorer after an abnormal screening result is an inevitable consequence of screening, and perhaps may be viewed as an indication that a person has understood their test result, rather than a measure of psychosocial wellbeing.

This brings us to Marteau’s1 finding that SAH was poorer before screening in men who subsequently had aneurysms found, predicting AAA even after adjustment for known risk factors. This is a puzzling finding: why should a person’s perception of their health status predict whether they have an asymptomatic condition? It could be that the results are explained by the failure to measure smoking at baseline and adjust for it appropriately. However, given the arguments presented by the authors and the very small change in the odds ratios for SAH after adjustment for other known AAA risk factors (age, family history, blood pressure, and social deprivation), it is quite possible that even if smoking were included, SAH would still remain an independent predictor of AAA. Alternatively, the finding might be related to an increased likelihood of other symptomatic cardiovascular conditions that affect SAH among the screen positive group.

If, however, the association is not the result of such factors, then it presents us with an astonishing finding: that asymptomatic AAA in some way makes people feel recognisably less well. This seems hard to believe, especially as most of the aneurysms identified by screening in the study were comparatively small. Nevertheless, it raises the question of whether SAH might be a predictor of disease in other screening programmes, such as those for cancer or heart disease. Could it be possible that people who are subsequently found to have colorectal cancer or bowel polyps have poorer SAH before screening, or that women who have cervical intraepithelial neoplasia have poorer perceived health? These speculations seem unlikely but not impossible, and we are unaware of any evidence to support or refute them, other than the substantial body of evidence that SAH is a strong predictor of mortality, especially among men.15 Thus the association between SAH and clinical outcomes of other screening tests would seem to warrant investigation. The finding raises the possibility that screening programmes of the future might incorporate tests of SAH. This is of course a highly speculative suggestion and one that would need much more investigation.

In conclusion, Marteau et al’s1 study highlights the urgent need for high quality research on the psychosocial outcomes of screening programmes. Just as with medical outcomes, the strongest designs will be randomised trials with before and after screening measurements. We believe it is feasible and important to include validated psychosocial measures within future randomised trials of screening.


We would like to thank Professor Les Irwig for comments on an earlier version of this paper.

