Over the past decade, a great number of studies have searched for gene-disease associations. Ioannidis et al, studying a variety of diseases, pointed out that the results of the first study to report such an association correlate only moderately with later results.1
Sporadic Alzheimer’s disease (AD) has been extensively studied in search of genetic causes.2 Apart from the well established relation with the ApoE ε-4 genotype, other candidate genes have weaker associations with AD and their role remains controversial. In fact, results from case-control studies are frequently contradictory, and meta-analyses have been conducted to clarify whether a gene-AD association exists.3
The goal of this paper is to describe how the odds ratio between a gene and AD changes as new studies are published.
We included case-control studies published before July 2003 on AD and the following polymorphisms: myeloperoxidase 7 (6 studies),4–8 low density lipoprotein receptor related protein gene (LRP) exon 3 polymorphism (10 studies),9–18 Glu298Asp polymorphism in the endothelial nitric oxide synthase (NOS-3) (10 studies),8,16,19–26 and cathepsin D (14 studies).8,16,27–37 A study on the LRP-AD association was excluded because it only reported allelic data.38 As this is a fast-developing field, this list of genes is not intended to be exhaustive; we selected them because, having worked on them, we knew that each had been reported in more than five papers.
To select these papers, we searched Medline for papers including in the text “Alzheimer” and any of the following: “myeloperoxidase”, “cathepsin”, “nitric oxide”, “lipoprotein related-receptor”, or “lrp”. The search was completed by reviewing the reference lists of the selected papers. The articles found were then reviewed to select only case-control studies relating AD to any of the genes listed in the preceding paragraph. Papers were included if they reported crude results on the gene-AD association (for example, a two by two table) or odds ratios with confidence intervals; we excluded papers reporting only allelic results (for example, an allelic odds ratio or allelic distribution). We did not assess the quality of the papers, as this is not pertinent to the goal of our study.
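The two kinds of input accepted above (a two by two table, or an odds ratio with its confidence interval) can both be reduced to a log odds ratio and its variance. The following is a minimal sketch of that step, not the code used in this study. It assumes a 95% confidence interval computed on the log scale (Woolf's method); the function names, argument order, and the example counts are hypothetical.

```python
import math

def odds_ratio_from_table(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a two by two table.

    Assumed cell layout (hypothetical naming, not taken from the paper):
      a = cases carrying the genotype,    b = cases not carrying it
      c = controls carrying the genotype, d = controls not carrying it
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    # Woolf's variance of the log odds ratio: sum of reciprocal cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def log_or_and_variance_from_ci(odds_ratio, lower, upper, z=1.96):
    """Recover the log odds ratio and its variance from a reported OR and CI.

    Assumes the interval is symmetric on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(odds_ratio), se ** 2

# Illustrative counts only; these are not data from any of the included studies.
or_, lo, hi = odds_ratio_from_table(20, 80, 10, 90)
log_or, var = log_or_and_variance_from_ci(or_, lo, hi)
```

Working on the log scale is the usual choice because the sampling distribution of the log odds ratio is closer to normal than that of the odds ratio itself.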
For each gene-AD relation we performed a cumulative meta-analysis: we obtained an odds ratio from the first study, then added the second published paper and calculated a common odds ratio, and so on until all the articles had been included in order of publication. Common odds ratios were estimated by weighting each study by the inverse of its variance in a random effects model. The statistical analysis was performed using Stata 8/SE (Stata Corporation, College Station, TX, USA).
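The cumulative procedure described above can be sketched as follows. This is an illustration rather than the analysis actually run in Stata: it assumes the DerSimonian-Laird estimator for the between-study variance, which is one common random effects method but is not specified in the text, and the function names and the three example studies are hypothetical.

```python
import math

def pool_random_effects(log_ors, variances):
    """Inverse-variance pooled log OR under a random effects model.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird moment estimator.
    """
    k = len(log_ors)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w
    tau2 = 0.0
    if k > 1:
        # Cochran's Q measures heterogeneity around the fixed-effect estimate
        q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
        c = sum_w - sum(wi ** 2 for wi in w) / sum_w
        tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum_w_star
    return pooled, 1.0 / sum_w_star

def cumulative_meta_analysis(studies, z=1.96):
    """studies: list of (label, log_or, variance), sorted by publication date.

    Returns one (label, pooled OR, lower, upper) tuple per added study.
    """
    results = []
    for i in range(1, len(studies) + 1):
        subset = studies[:i]
        pooled, var = pool_random_effects(
            [s[1] for s in subset], [s[2] for s in subset]
        )
        se = math.sqrt(var)
        results.append(
            (subset[-1][0], math.exp(pooled),
             math.exp(pooled - z * se), math.exp(pooled + z * se))
        )
    return results

# Illustrative values only; these are not the studies analysed in this paper.
example = [
    ("first study", math.log(2.4), 0.20),
    ("second study", math.log(1.0), 0.10),
    ("third study", math.log(1.1), 0.05),
]
for label, or_, lo, hi in cumulative_meta_analysis(example):
    print(f"{label}: OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With a single study the pooled estimate is simply that study's odds ratio, so the first row of the output reproduces the initial report; each later row shows how the common odds ratio shifts as a study is added.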
Figure 1 displays the evolution of the cumulative odds ratio for each gene. Only the myeloperoxidase polymorphism had an initial odds ratio close to 1, and it changed little when later studies were added. The LRP exon 3 polymorphism had the highest initial odds ratio (2.41); as further studies were added, the cumulative odds ratio decreased progressively to 1.35. Similar changes may be seen for the cathepsin D gene (its odds ratio falls from 2.40 to 1.26) and NOS-3 (its odds ratio falls from 1.72 to 1.07).
Our results, similar to those reported by Ioannidis et al,1 suggest that the first study dealing with a gene-AD relation tends to overestimate the association. A possible explanation is publication bias, leading to a delay in the publication of studies with odds ratios (ORs) closer to the null.
One of the main causes of publication bias is that papers with negative results (that is, no association) have a higher probability of being rejected regardless of their scientific quality, while papers with novel positive results are seen as more attractive and are more likely to be published.39 As genetic epidemiology develops, a progressively higher number of genetic markers will be tested for gene-disease association, and more papers on this subject will compete for the limited space in scientific journals; editors will, therefore, have to make a choice and, probably, papers with positive results hold the advantage. However, the probability of type I error rises as the proportion of studies with positive results increases.
Once a strong association has been described, studies with negative results are more likely to be published because they contradict the first report. Then, if the first result was attributable to a type I error, subsequent results would tend towards the null hypothesis (that is, OR = 1).
Our gene selection is partial; therefore, we do not intend to establish our results as a kind of gold standard in gene-disease association, but as a call for scepticism towards the very first results on any genetic marker.
This situation is a challenge for researchers, referees, editors and, we believe, especially for readers. When a genetic marker is suggested as a putative cause of a disease, readers should bear in mind the need for consistency in causal epidemiology. Consistency refers to the reliability of the results in different populations and under different circumstances, and is one of the causality criteria proposed by Hill and generally accepted.40 Before a gene-disease association is recognised as true, it is necessary to replicate investigations independently and, if needed, to combine their results in meta-analyses.
After we submitted the final version of our paper, a meta-analysis of the association between cathepsin D and AD appeared.41 It largely agrees with our results and also notes the dissipation of the postulated effect.
Conflicts of interest: none declared.