Systematic reviews of health effects of social interventions: 1. Finding the evidence: how far should you go?
David Ogilvie (1), Val Hamilton (2), Matt Egan (1), Mark Petticrew (1)

(1) MRC Social and Public Health Sciences Unit, University of Glasgow, UK
(2) Development and Alumni Office, University of Glasgow

Correspondence to: Dr D Ogilvie, MRC Social and Public Health Sciences Unit, University of Glasgow, 4 Lilybank Gardens, Glasgow G12 8RZ, UK


Study objective: There is little guidance on how to identify useful evidence about the health effects of social interventions. The aim of this study was to assess the value of different ways of finding this type of information.

Design: Retrospective analysis of the sources of studies for one systematic review.

Setting: Case study of a systematic review of the effectiveness of interventions in promoting a population shift from using cars towards walking and cycling.

Main results: Only four of the 69 relevant studies were found in a “first-line” health database such as Medline. About half of all relevant studies were found through the specialist Transport database. Nine relevant studies were found through purposive internet searches and seven relevant studies were found by chance. The unique contribution of experts was not to identify additional studies, but to provide more information about those already found in the literature.

Conclusions: Most of the evidence needed for this review was not found in studies indexed in familiar literature databases. Applying a sensitive search strategy across multiple databases and interfaces is very labour intensive. Retrospective analysis suggests that a more efficient method might have been to search a few key resources, then to ask authors and experts directly for the most robust reports of studies identified. However, internet publications and serendipitous discoveries did make a significant contribution to the total set of relevant evidence. Undertaking a comprehensive search may provide unique evidence and insights that would not be obtained using a more focused search.

  • systematic review
  • searching
  • methodology
  • evidence based policy
  • transport

There are frequent calls for better evidence about the effects of interventions to improve population health.1–4 Systematic reviews such as those in the Cochrane Library are increasingly seen as the best way to assess and synthesise evidence about effectiveness in health care,5,6 and the NHS Centre for Reviews and Dissemination and the Cochrane Collaboration (among others) have published detailed guidance on how to do this.7,8 In social policy, the Campbell Collaboration also promotes systematic reviews of effectiveness in the fields of education, criminal justice, and social welfare.9

The question of how best to synthesise evidence is not solely the concern of systematic reviewers. Most public health specialists draw on published reviews when giving advice and helping to formulate local or national policy. Some also commission reviews, while others undertake higher level, quicker syntheses such as evidence briefings10 or brief reviews for health impact assessment (HIA).11 All of these types of “evidence synthesis” are intended to influence policy, so it is important to make them as useful as possible and ensure their findings are not misleading.11 This implies a need to consider carefully what methods are used.

Some areas of public health practice, such as immunisation or screening, involve comparatively discrete and replicable interventions. Their effectiveness can readily be studied using conventional, epidemiologically based methods for systematic reviews. However, systematic reviews of more complex public health interventions are more methodologically challenging,12 and researchers working in the wider field of improving health through social policy face similar, if greater, challenges in two particular areas. The first is that “conventional” methods, and the positivist epistemological position that underlies them, are less widely accepted as a means of generating evidence in the social sciences than in the biomedical sciences.13,14 The second is that even if the principles of the approach are accepted, it can be difficult to apply them to studies of interventions that are often complex, highly contextual, or not amenable to the types of study design usually accorded high status in the health research community.13,15–17 Nevertheless, we cannot ignore questions of effectiveness in this area. Models of health such as the socioecological model18 highlight the importance of the social, physical, and economic environments as determinants of health, but we know remarkably little about the actual effects of interventions to change these3,19 and we cannot assume that apparently sensible measures will be either effective or free from harmful effects.20–22

Transport is an aspect of public policy and social organisation that may have important influences on health. We carried out a systematic review to address a broadly specified research question in this area: what interventions are effective in promoting a population shift from using cars towards walking and cycling? (box). We have previously reported our synthesis of the best available evidence to answer this question.23 However, our experience also illuminates the more general problem of how to synthesise evidence about the public health implications of interventions whose primary focus is not health care, or even health, but some other area of social policy.

In the absence of much methodological research or guidance in this area, we have therefore lifted the lid on the “private life” of our review to explore some of the scientific issues raised.24 We have concentrated on the input side of the review process—how evidence is found and selected—because decisions made here have a large effect on the duration and cost of a review, as well as on the nature of its outputs. Many quicker forms of evidence synthesis, such as those used in rapid HIA, depend for their feasibility on a severely constrained input phase, and all researchers naturally hope to achieve their objectives as efficiently as possible. In this paper, we examine one phase of the review: the search for evidence. A companion paper deals with the subsequent problem of how to select the “best available” evidence for inclusion.25


Designing the search strategy for a systematic review entails trading off sensitivity (breadth of coverage) against specificity (efficiency of searching).26–28

The main resource for most systematic reviews is electronic literature databases. In practice, the number of databases searched varies widely between reviews; in one sample from the Cochrane Library, between one and 27 databases had been searched for each review.29 Even in the field of health promotion and public health, a substantial minority of journals are not indexed in a popular biomedical database such as Medline.30 In a cross disciplinary topic area, it may be particularly important to search a large number of databases,7,8,26,28 but terminology and the quality of indexing and abstracting vary widely between databases and disciplines.31–33

Reviewers are also advised to search more widely using reference lists, conference proceedings, and other sources of “grey” literature.34 In an emerging field, it may also be particularly valuable to contact experts for help in identifying relevant studies.35 However, we lack a clear understanding of how best to use these additional sources of evidence. For example, excluding grey literature from meta-analyses of trials has been found to result in an overestimation of effect size by an average of 12%,36 but another study has suggested that a comprehensive search strategy may have little effect on the overall result and may introduce bias by including trials of lower validity.37 Internet search engines offer an alternative way of finding evidence, but current guidance on how to search the internet systematically is largely limited to warnings about how difficult this might be.7,33

Key findings of the review

  • Review question—What interventions are effective in promoting a population shift from using cars towards walking and cycling?

  • Studies finally included—22 experimental or observational studies with a prospective or controlled retrospective design that evaluated any intervention applied to an urban population or area by measuring outcomes in members of the local population.

  • Key findings—We found evidence from a few comparatively well conducted studies that targeted behaviour change programmes could change the behaviour of motivated subgroups, resulting (in the largest study) in a shift of around 5% of all trips at a population level. Single studies of commuter subsidies and a new railway station also showed positive effects. The balance of best available evidence about agents of change, publicity campaigns, engineering measures, and charging road users suggested that they had not been effective in our terms. We also found evidence from single controlled studies that car share clubs and telecommuting were not effective; if anything, participation in these interventions was associated with negative effects. Participants in randomised controlled trials of active commuting experienced short term improvements in certain measures of health and fitness, but we found no good evidence about effects on health of any effective intervention at population level.23

We used a comprehensive search strategy for our review. In this paper, we report the findings of a retrospective analysis of how and where we found the evidence relevant to our review question. This analysis was intended to answer two questions: what were the relative and distinct contributions of different ways of finding evidence, and could we have found the evidence that mattered more efficiently?


We have reported full details of our search strategy previously.23

We chose to restrict our search of electronic databases to the 20 databases that had produced the highest yield in the search for a previous systematic review on a related topic, the health effects of new roads.38

We developed our search syntax iteratively.7 We first conducted a scoping search with a provisional set of terms, retrieved the 100 most relevant abstracts, and then added additional indexing or text word terms used in those references to our search strategy. We then adapted the search syntax for each database or interface used. We did not limit the search using terms for study design.

We decided not to attempt a “systematic” internet search. Instead, we used three quality assured gateway sites and our own knowledge to generate lists of potentially relevant web sites, from which we selected a purposive sample of 16 sites that contained bibliographies or searchable databases of documents. These represented a range of types of organisation (academic, government, and voluntary), countries of origin (Canada, all the countries of the European Union, Norway, and the United States of America), and language of publication (Danish, English, French, Norwegian, and Swedish).

We posted our review protocol on our web site along with an interim list of over 200 references. We then invited experts personally, and relevant electronic mail groups more generally, to review our list of references and suggest additional studies. We also searched the reference lists of all documents obtained and our own existing collections of references.

When our review was complete, we analysed where we had obtained the references for all relevant studies. We based this analysis on a notional hierarchy of sources ranging from “first-line” health databases such as Medline (at the top) to stumbling upon studies by chance (at the bottom). This hierarchy reflected the general order in which we had conducted our search. For each study, we identified the highest order source from which we had identified a reference to it—either a primary report of the study, or a secondary source (literature review, book chapter, or similar) that included an appropriate reference to a primary report.
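The attribution rule described above can be sketched in a few lines of Python. This is an illustration only, not code used in the review. The text states only the top rung of the hierarchy (“first-line” health databases) and the bottom rung (chance); the names and ordering of the middle rungs, and the study identifiers, are assumptions made for the example.

```python
from collections import Counter

# Hypothetical hierarchy, highest order first. Only the first and last
# entries are stated in the text; the middle entries are assumed here.
HIERARCHY = [
    "first_line_health_database",
    "specialist_database",
    "purposive_internet_search",
    "reference_list",
    "expert",
    "chance",
]
RANK = {source: position for position, source in enumerate(HIERARCHY)}


def highest_order_source(sources_found):
    """Return the highest order source among those that yielded a study."""
    return min(sources_found, key=RANK.__getitem__)


def tally_by_source(studies):
    """Count studies by the highest order source that identified each one."""
    counts = Counter()
    for sources_found in studies.values():
        counts[highest_order_source(sources_found)] += 1
    return counts


# Hypothetical example: each study maps to every source that referenced it.
example_studies = {
    "study_A": {"specialist_database", "expert"},
    "study_B": {"chance"},
    "study_C": {"first_line_health_database", "reference_list"},
}
example_counts = tally_by_source(example_studies)
```

Under this rule a study found both in a specialist database and through an expert is credited to the database alone, which is why a source low in the hierarchy can appear to contribute few unique studies even when it returned many references.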


We identified 69 relevant studies, of which we included 22 in our final synthesis.23 We found about half of all relevant studies through the Transport database; we had not found these in databases more familiar to health and social science researchers (table). Only four of the 69 relevant studies were found in one of our “first-line” health databases.

Searching reference lists contributed comparatively few studies, and we identified no studies solely on the recommendation of an expert. Where experts did suggest references, these proved to be either general background papers, or more up to date or comprehensive reports about studies we had already identified; searching reference lists also contributed to this latter group of documents. We found nine relevant studies through our purposive internet search; we had not found these studies by searching the databases of published literature. We also found seven relevant studies by chance—one through unstructured web browsing (surfing), the remainder because we ordered a book or set of conference proceedings for one particular article and found other relevant articles in the same publication.


Principal findings

Most of the evidence we needed was not found by searching mainstream health literature databases. The Transport database was the key to this review; we also found relevant evidence by searching the internet and by chance. The contribution of experts was not to identify additional studies, but to help us find better reports of studies we already knew about.

Searching electronic literature databases

Although the studies we identified through first line health databases were of comparatively high methodological quality, they contributed a small minority of the total evidence needed for the review. This is not the case for all systematic reviews. A study of Cochrane reviews of trials found that most relevant trials were indexed in the Cochrane controlled trials register, Medline or Embase, and that searching an additional 26 databases contributed only 2.4% of the total number of trials identified.29 None the less, even where a database such as Medline does yield most of the relevant studies for a review, the value of searching a range of sources has been acknowledged: systematic reviews on risk communication in primary care,26 exercise therapy in cancer,28 acupuncture,34 and lipid lowering agents34 have all included unique references found only in other specialist databases. In our topic area, which lies far from the clinical focus of most health databases, our findings confirm the importance of searching widely in topic specific databases that may be unfamiliar to public health researchers.

Reviewers should not underestimate the complexity and time demands of searching across multiple databases with different technical and syntactical requirements.7 These are exemplified by Transport, by far the most important database for our review. Although this is the largest and most widely used transport database, and has the particular advantage of indexing a large amount of “grey” literature, Wentz et al found it impossible to construct a satisfactory search strategy to find controlled evaluation studies in this database.39 We designed a highly sensitive search and did not include terms for particular study designs, but sensitive searches tend to be imprecise and require reviewers to scan thousands of irrelevant items: we examined over 5000 titles or abstracts in this review.23,27
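The cost of a sensitive search can be made concrete with a rough calculation, sketched here in Python. It rests on two simplifying assumptions that are not in the text: that exactly 5000 records were screened (the text says “over 5000”), and that all 69 relevant studies came from those records (some were in fact found by other routes). Both assumptions overstate precision, so the result is an upper bound.

```python
def precision(relevant_found, records_screened):
    """Proportion of screened records that turned out to be relevant."""
    if records_screened <= 0:
        raise ValueError("records_screened must be positive")
    return relevant_found / records_screened


def number_needed_to_read(relevant_found, records_screened):
    """Average number of records screened per relevant study found."""
    if relevant_found <= 0:
        raise ValueError("relevant_found must be positive")
    return records_screened / relevant_found


# Figures taken from the text, under the assumptions stated above.
upper_bound_precision = precision(69, 5000)           # 0.0138, about 1.4%
lower_bound_nnr = number_needed_to_read(69, 5000)     # about 72.5 records
```

On these figures, no more than about 1.4% of the titles or abstracts examined led to a relevant study, or at least 72 records screened for each one found.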

Searching for “grey” and unpublished literature

In our topic area, many relevant studies have never been fully reported in a scientific journal, so it can be difficult to find evidence that can be meaningfully appraised. In particular, the vast and rapidly expanding amount of information available on the internet can be a mixed blessing. The advantages include quicker and cheaper access to some full text journal articles and the increasing tendency to publish “grey” literature online. However, other reviewers have reported finding few or no useful studies by searching the internet.34,40 We found nine relevant studies through purposive web searching. Only two of these were included in our final synthesis, which suggests that most of the work we found was of comparatively low quality. However, finding and appraising such “low grade” evidence may still be important to develop a taxonomy of interventions, critique current approaches to evaluation, and show how the evidence base might be strengthened.25

The Cochrane reviewers’ handbook specifically suggests that reviewers send interim lists of references to authors and experts in the field and ask if they know of any other relevant studies, but warns that asking researchers for information on “unpublished” studies can be unrewarding.8 Our previous systematic review of the health effects of new roads included several important unpublished studies that could only have been found in this way,38 and in another systematic review on near-patient testing, McManus et al found that 24% of eligible references were recommended by experts.35 In this review, however, we identified no studies solely on the recommendation of an expert. Instead, we found that experts helped us to find better reports of studies we already knew about.

What is already known on this subject?

  • We need better syntheses of evidence about the effects of interventions to influence the wider determinants of health, ranging from full scale systematic reviews to more rapid evidence briefings and health impact assessments

  • Relevant evidence may be dispersed across many different disciplines and types of paper or electronic publication, and may be hard to find

  • We lack an accepted, evidence based methodology for finding this evidence.

A surgical strike on the evidence?

In retrospect, it seems that most relevant studies could have been found in—or in references from—documents indexed in a handful of key resources. This suggests that we might have reached similar conclusions if we had followed an alternative search strategy by searching those few resources, then asking authors and experts directly for the most robust reports of studies of the interventions identified. This hypothesis could, of course, be tested prospectively in a future review by applying two or more search strategies in parallel and comparing the results. McNally et al reached a similar conclusion at the end of their review on access to health care for people with learning disabilities, commenting that their time might have been better spent assessing the value of each database more critically at the outset.40 Of course, our analysis is based on where we actually found studies in practice, rather than where we might have found them in theory had we used different search terms or screened the results of the search in different ways. The efficiency implications of these decisions could also be investigated in future research.
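A prospective comparison of two search strategies run in parallel could be summarised along the following lines. This is a hypothetical sketch: the function name, the study identifiers, and the choice to measure recall against the union of both result sets are assumptions, not a method described in the review.

```python
def compare_strategies(comprehensive, targeted):
    """Summarise the overlap between two sets of relevant study identifiers."""
    comprehensive, targeted = set(comprehensive), set(targeted)
    all_found = comprehensive | targeted
    return {
        "found_by_both": comprehensive & targeted,
        "only_comprehensive": comprehensive - targeted,
        "only_targeted": targeted - comprehensive,
        # Share of all known relevant studies that the targeted search found.
        "targeted_recall": len(targeted) / len(all_found) if all_found else None,
    }


# Hypothetical identifiers for relevant studies found by each strategy.
summary = compare_strategies(
    comprehensive={"s1", "s2", "s3", "s4"},
    targeted={"s1", "s2", "s5"},
)
```

Counting studies is only part of the comparison. Whether the studies missed by the targeted search would have changed the conclusions of the synthesis is a separate judgment that such a tally cannot make.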

A more targeted search (a “surgical strike” to hit the most relevant evidence) might be more efficient and help to guard against the temptation to keep searching for just one more relevant study. It would also be expected to reduce the often daunting quantity of time and money needed to carry out a systematic review. It is always necessary to find a balance between comprehensiveness and precision when developing a search strategy, and the law of diminishing returns applies as much to literature searching as to any other activity.8,28 However, there is a subjective and serendipitous element to literature searching that would be lost in a highly targeted approach. We did find some relevant studies purely by chance, and we have no way of knowing whether we might have found them by other means. Hawker et al have also commented on the importance of serendipity in finding evidence for their review on the transfer of patient information.41 We also concur with their finding that eventually, references to the same study begin to appear repeatedly and one gains the impression of having reached adequate saturation in the search, in much the same way that a qualitative researcher may continue sampling until no new conceptual categories are generated.42 Constraining the search options, particularly when studies of many different types are being sought, must surely reduce the likelihood that reviewers will reach a point at which they can reasonably judge their search to be complete.

What does this study add?

  • Relying on mainstream electronic databases of health literature would have seriously compromised the scope and value of our evidence synthesis

  • Evidence about the relative contributions from literature databases, the internet, and contacting experts to systematic reviews on clinical topics cannot necessarily be generalised to wider public health topics

  • Comprehensive searching may seem inefficient, but may also provide unique evidence—and insights into that evidence—that would not be obtained using a more focused search.

Implications for evidence synthesis in public health

Some of our findings contradict those of others who have published methodological analyses of their systematic reviews. This is not surprising, given the heterogeneity of review questions and the nature of the evidence available in our particular topic area. Colleagues planning to synthesise evidence about the health effects of social interventions should consider three important findings from our case study. Firstly, the temptation to rely on the electronic databases of health literature with which public health researchers are most familiar may seriously compromise the scope and value of the exercise. Secondly, evidence about the relative contributions from literature databases, the internet, and contacting experts to systematic reviews on clinical topics cannot necessarily be generalised to wider public health topics. Thirdly, undertaking a comprehensive search may seem inefficient, but may also provide unique evidence—and insights into that evidence—that would not be obtained using a more focused search.

Table 1

 Sources of studies for the review



  • Funding: The review was funded by the Chief Scientist Office of the Scottish Executive Health Department and by the ESRC Evidence Network. DO is now funded by a Medical Research Council fellowship. The funding sources played no part in the design, analysis, interpretation, or writing up of the study or in the decision to publish.

  • Competing interests: none known.

  • Ethical approval: not required.