Publication and related bias in meta-analysis: Power of statistical tests and prevalence in the literature

https://doi.org/10.1016/S0895-4356(00)00242-0

Abstract

Publication and selection biases in meta-analysis are more likely to affect small studies, which also tend to be of lower methodological quality. This may lead to “small-study effects,” where the smaller studies in a meta-analysis show larger treatment effects. Small-study effects may also arise because of between-trial heterogeneity. Statistical tests for small-study effects have been proposed, but their validity has been questioned. A set of typical meta-analyses containing 5, 10, 20, and 30 trials was defined based on the characteristics of 78 published meta-analyses identified in a hand search of eight journals from 1993 to 1997. Simulations were performed to assess the power of a weighted regression method and a rank correlation test in the presence of no bias, moderate bias, or severe bias. We based evidence of small-study effects on P < 0.1. The power to detect bias increased with increasing numbers of trials. The rank correlation test was less powerful than the regression method. For example, assuming a control group event rate of 20% and no treatment effect, moderate bias was detected with the regression test in 13.7%, 23.5%, 40.1%, and 51.6% of meta-analyses with 5, 10, 20, and 30 trials; the corresponding figures for the correlation test were 8.5%, 14.7%, 20.4%, and 26.0%. Severe bias was detected with the regression method in 23.5%, 56.1%, 88.3%, and 95.9% of meta-analyses with 5, 10, 20, and 30 trials, as compared with 11.9%, 31.1%, 45.3%, and 65.4% with the correlation test. Similar results were obtained in simulations incorporating moderate treatment effects. However, the regression method gave false-positive rates that were too high in some situations (large treatment effects, few events per trial, or all trials of similar sizes). Using the regression method, evidence of small-study effects was present in 21 (26.9%) of the 78 published meta-analyses. Tests for small-study effects should routinely be performed in meta-analysis. Their power is limited, however, particularly for moderate amounts of bias or for meta-analyses based on a small number of small studies. When evidence of small-study effects is found, careful consideration should be given to possible explanations in the reporting of the meta-analysis.

Introduction

The importance of being systematic when reviewing the evidence available on the benefits and risks of medical interventions is widely recognized. Systematic reviews are prepared using reproducible strategies to avoid bias [1]. Results from individual studies are, when appropriate, combined in a meta-analysis, a technique that has become increasingly popular in medical research [2]. Meta-analysis is not an infallible tool, however. Large multicenter trials have contradicted the results of earlier meta-analyses that were based on smaller trials [3], [4], [5]. Furthermore, meta-analyses addressing the same question have produced contradictory results [6], [7].

The dissemination of findings from clinical trials is influenced by a host of factors that modify the probability that a study is included in a meta-analysis. If these factors are associated with trial results, the meta-analysis will be biased. Publication bias, the tendency for studies that show a statistically significant effect of treatment to be published more often, has been documented repeatedly [8], [9], [10]. Studies reporting statistically significant results are also more likely to be published in English [11] and in journals indexed in Medline [8], to be cited by other authors [12], [13], and to give rise to multiple publications [14], [15]. Such “positive” studies are therefore more likely to be identified and included in meta-analyses, which may introduce bias.

Several aspects of trial quality have been shown to influence effect sizes. Inadequate concealment of treatment allocation, resulting for example from the use of open random number tables, has been associated with larger treatment effects [16], [17]. Larger effects have also been found when investigators assessing outcomes knew which treatment patients had received [16], [18], or when some participants, for example those not adhering to study medications, were excluded from the analysis [19]. Investigators sometimes undermine random allocation, for example by opening assignment envelopes or holding translucent envelopes up to a light bulb [20].

The smaller a study, the larger the treatment effect necessary for its results to be declared statistically significant. In addition, the greater investment of money and time in larger studies means that they are more likely to be of high methodological quality and to be published even if their results are negative. Bias in a systematic review may therefore be evident as an association between treatment effect and study size, which may be displayed graphically in “funnel plots”: scatter plots of the treatment effects estimated from individual studies (horizontal axis) against study size or standard error (vertical axis), first used in educational research and psychology [21]. The name “funnel plot” reflects the fact that the precision of the estimated treatment effect increases with the sample size of the component studies. Effect estimates from small studies therefore scatter widely at the bottom of the graph, with the spread narrowing among larger studies. In the absence of bias the plot resembles a symmetrical inverted funnel (see Fig. 1a).
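As a concrete illustration of this construction, the following short Python sketch (using numpy and matplotlib) simulates a set of unbiased trials sharing a common odds ratio and plots their log odds ratios against their standard errors, with the vertical axis inverted so that small, imprecise trials sit at the bottom. The trial sizes, control group event rate, and variable names are illustrative assumptions, not values taken from this study.

```python
# Minimal funnel plot sketch for simulated, unbiased trials (illustrative only).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

true_log_or = 0.0                              # assumed: no treatment effect
p_control = 0.2                                # assumed control group event rate
arm_sizes = rng.integers(20, 500, size=30)     # assumed range of per-arm sizes

log_ors, ses = [], []
for n in arm_sizes:
    # convert the odds ratio into a treatment group event probability
    odds_treat = p_control / (1 - p_control) * np.exp(true_log_or)
    p_treat = odds_treat / (1 + odds_treat)
    a = rng.binomial(n, p_treat) + 0.5         # treatment events (+0.5 correction)
    c = rng.binomial(n, p_control) + 0.5       # control events (+0.5 correction)
    b, d = n + 1 - a, n + 1 - c                # non-events in each arm
    log_ors.append(np.log(a * d / (b * c)))
    ses.append(np.sqrt(1 / a + 1 / b + 1 / c + 1 / d))

plt.scatter(log_ors, ses)
plt.gca().invert_yaxis()                       # imprecise (small) trials at the bottom
plt.axvline(true_log_or, linestyle="--")
plt.xlabel("Log odds ratio")
plt.ylabel("Standard error")
plt.title("Funnel plot without bias: symmetrical inverted funnel")
plt.show()
```

In the absence of bias the points scatter symmetrically around the dashed line and narrow towards the top; deleting some of the small “negative” trials before plotting reproduces the asymmetry discussed next.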

Bias, for example because smaller studies showing no statistically significant effects (open circles in Fig. 1a) remain unpublished, will lead to an asymmetrical funnel plot with a gap in the bottom right side of the graph (Fig. 1b). In this situation the combined estimate from the meta-analysis will overestimate the treatment effect [5], [22]. Such asymmetry might also result from the overestimation of treatment effects in smaller studies of lower methodological quality (Fig. 1c).

Funnel plot asymmetry cannot, however, be considered to be proof of bias in a meta-analysis. Heterogeneity between the treatment effects in different trials may lead to funnel plot asymmetry if the true treatment effect is larger in the smaller trials. For example, if a combined outcome is considered, substantial benefit may be seen only in patients at high risk for the component of the combined outcome which is affected by the intervention [23]. Trials conducted in high-risk patients will also tend to be smaller, because of the difficulty in recruiting such patients. Similarly, some interventions may have been implemented less thoroughly in larger trials, thus explaining the more positive results in smaller trials. This is particularly likely in trials of complex interventions in chronic diseases, such as rehabilitation after stroke or multifaceted interventions in diabetes mellitus. Different mechanisms which can lead to funnel plot asymmetry are summarized in Table 1.

We suggest the term “small-study effects” to describe a trend for the smaller studies in a meta-analysis to show larger treatment effects. While “small” trials are generally defined in relative terms (compared with the larger trials in a particular meta-analysis), a trial may be considered small in absolute terms if its sample size is too small to detect a clinically plausible effect. If small-study effects are present, there will be an association between the size of the treatment effect and its standard error, so that the funnel plot will be asymmetrical (Fig. 1b and c); further investigation of the reasons for these effects, including the potential for bias in the meta-analysis, is then required.

The association between the size of the treatment effect and its standard error is the basis of two statistical methods: a rank correlation test [24] and a regression method [5]. These have been proposed as a means of avoiding the subjectivity associated with visual assessment of funnel plots, and are described in more detail in the Methods section. However, the validity of these methods has been questioned [25], [26], [27]. In the present study we examine the power of the two methods to detect small-study effects in circumstances typical of recently published meta-analyses, and identify situations in which the methods are not reliable. We then examine the evidence for small-study effects in meta-analyses published in leading journals.
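To make the two procedures concrete, the sketch below gives a minimal Python implementation of each test as it is usually described in the literature (a regression of the standardized effect on precision with a test of the intercept, and Kendall's rank correlation between standardized effects and their variances). It should be read as an illustration of the idea, not as the exact code used in this study.

```python
# Minimal sketches of the two tests for small-study effects, following their
# standard published descriptions; not the code used by the authors.
import numpy as np
from scipy import stats

def egger_regression_test(effects, ses):
    """Regression method: regress the standardized effect (effect / SE) on
    precision (1 / SE) and test whether the intercept differs from zero."""
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    z = effects / ses
    precision = 1.0 / ses
    res = stats.linregress(precision, z)
    t = res.intercept / res.intercept_stderr
    p = 2 * stats.t.sf(abs(t), df=len(effects) - 2)   # two-sided P for the intercept
    return res.intercept, p

def begg_rank_correlation_test(effects, ses):
    """Rank correlation method: Kendall's tau between standardized deviations
    from the fixed-effect pooled estimate and the within-trial variances."""
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    v = ses ** 2
    w = 1.0 / v
    pooled = np.sum(w * effects) / np.sum(w)   # fixed-effect pooled estimate
    v_star = v - 1.0 / np.sum(w)               # variance of (effect - pooled)
    t_star = (effects - pooled) / np.sqrt(v_star)
    tau, p = stats.kendalltau(t_star, v)
    return tau, p
```

Applying both functions to the log odds ratios and standard errors of the trials in a meta-analysis, and comparing the resulting P-values against the 0.1 threshold used here, reproduces the decision rule whose power is examined below.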

Section snippets

Identification of published meta-analyses

To examine the characteristics of meta-analyses published in recent years, we searched four general medicine journals (Annals of Internal Medicine, BMJ, JAMA, Lancet) and four specialist journals (American Journal of Cardiology, Cancer, Circulation, Obstetrics and Gynecology), selected because a Medline search indicated that they publish many meta-analyses. Volumes from 1993 to 1997 were hand-searched for meta-analyses based on at least five trials with binary endpoints. Meta-analyses of

Simulations based on hypothetical, “typical” meta-analyses

Fig. 3 shows the power of the tests to detect bias (P < 0.1), derived from simulations with a control group event rate of 20% and assuming no treatment effect. Table 3 shows the results from the same analyses for control group event rates of 5%, 10%, and 20%. In the absence of bias, the proportion of false-positive results was close to the nominal value of 10%. The power increased with increasing numbers of trials but was low for meta-analyses with fewer than 10 trials and for moderate bias (bias
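The snippet above is truncated in this preview. As a rough illustration of the kind of simulation it describes, the Python sketch below generates meta-analyses of a given number of trials with a 20% control group event rate and no treatment effect, optionally suppresses some small trials without a nominally significant benefit (one plausible bias mechanism, assumed here rather than taken from the authors' protocol), applies a regression test of the kind described above, and reports the proportion of simulated meta-analyses with P < 0.1.

```python
# Illustrative power simulation; the bias model and all numeric settings are
# assumptions, not the authors' exact simulation protocol.
import numpy as np
from scipy import stats

def simulate_power(n_trials=10, p_control=0.2, true_log_or=0.0,
                   bias=False, n_sims=1000, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    positives = 0
    for _ in range(n_sims):
        log_ors, ses = [], []
        while len(log_ors) < n_trials:
            n = int(rng.integers(20, 500))                 # assumed per-arm size range
            odds_treat = p_control / (1 - p_control) * np.exp(true_log_or)
            p_treat = odds_treat / (1 + odds_treat)
            a = rng.binomial(n, p_treat) + 0.5             # treatment events (+0.5)
            c = rng.binomial(n, p_control) + 0.5           # control events (+0.5)
            b, d = n + 1 - a, n + 1 - c
            log_or = np.log(a * d / (b * c))
            se = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
            # crude bias model: small trials without a significant benefit
            # (z > -1.96) are "published" with only 20% probability
            if bias and n < 100 and log_or / se > -1.96 and rng.random() > 0.2:
                continue
            log_ors.append(log_or)
            ses.append(se)
        # regression test: intercept of standardized effect on precision
        log_ors, ses = np.array(log_ors), np.array(ses)
        res = stats.linregress(1.0 / ses, log_ors / ses)
        t = res.intercept / res.intercept_stderr
        p = 2 * stats.t.sf(abs(t), df=n_trials - 2)
        positives += p < alpha
    return positives / n_sims

# Example: estimated power for a biased 10-trial meta-analysis at P < 0.1.
print(simulate_power(n_trials=10, bias=True))
```

Under these assumed settings the function returns only a rough analogue of the power figures reported in the Results; it is meant to convey the structure of the simulation, not to reproduce its numbers.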

Discussion

The fallibility of meta-analysis is not surprising given the biases which may be introduced in the process of locating and selecting studies [8], [11], [12], [32], and the often inadequate quality of component studies [16], [17]. These biases are more likely to affect small studies, which also tend to be of lower methodological quality. Several biases, as well as other sources of heterogeneity, may act simultaneously in a given meta-analysis and their relative contribution cannot always be disentangled.

It

Acknowledgements

We are grateful to Nicola Low, Stephen Sharp and George Davey Smith for helpful comments on earlier drafts of the manuscript. We thank Deborah Tallon and Martin Schneider for their help with hand searching and data extraction.

References (40)

  • J. LeLorier et al.

    Discrepancies between meta-analyses and subsequent large randomized, controlled trials

    N Engl J Med

    (1997)
  • M. Egger et al.

    Bias in meta-analysis detected by a simple, graphical test

    Br Med J ...
  • A. Leizorovicz et al.

    Low molecular weight heparin in prevention of perioperative thrombosis

    Br Med J

    (1992)
  • K. Dickersin et al.

    Factors influencing publication of research results. Follow-up of applications submitted to two institutional review boards

    JAMA

    (1992)
  • J.M. Stern et al.

    Publication bias: evidence of delayed publication in a cohort study of clinical research projects

    Br Med J

    (1997)
  • P.C. Gøtzsche

    Reference bias in reports of drug trials

    Br Med J

    (1987)
  • P.C. Gøtzsche

    Multiple publication of reports of drug trials

    Eur J Clin Pharmacol

    (1989)
  • M.R. Tramèr et al.

    Impact of covert duplicate publication on meta-analysis: a case study

    Br Med J

    (1997)
  • K.F. Schulz et al.

    Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials

    JAMA

    (1995)
  • J.H. Noseworthy et al.

    The impact of blinding on the results of a randomized, placebo-controlled multiple sclerosis clinical trial

    Neurology

    (1994)
