Challenges in reproducing results from publicly available data: an example of sexual orientation and cardiovascular disease risk
Nichole Austin (1), Sam Harper (1), Jay S Kaufman (1), Ghassan B Hamra (2)

  1. Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
  2. Department of Environmental and Occupational Health, School of Public Health, Drexel University, Philadelphia, Pennsylvania, USA

Correspondence to Dr Sam Harper, Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 1020 Pine Avenue West, Montreal, Quebec, Canada H3A 1A2; sam.harper{at}


Background Replication is a vital part of the research process and has recently received considerable attention. Analyses using publicly available data should, if adequately described, be reproducible without assistance from the original investigators. Using data from the US National Health and Nutrition Examination Survey (NHANES), a recent study reported a statistically significant difference in cardiovascular disease risk comparing subgroups of sexual minority men. We attempted to reproduce these findings and assessed whether the results were robust to alternative analytic strategies and assumptions.

Methods We used the exclusion criteria and coding strategy described in the original paper to construct our analytical data set. Sampling weights were constructed in accordance with NHANES analytical guidelines. We estimated crude and covariate-adjusted associations between sexual orientation and vascular age using the regression models specified in the original report. We also conducted a series of sensitivity analyses to improve on the original findings.

Results Our replication attempt was partially successful: we reproduced the general trends reported in the original analysis, but not the identical effect estimates. Importantly, we identified a potential misapplication of the Framingham Risk Score; correcting for this increased the probability that the reported statistically significant result was a type I error.

Conclusions This paper supports the recent calls for greater transparency and improved reporting in research. Even with a publicly available and well-documented data source, we were unable to exactly replicate another study's original findings. Our sensitivity analyses revealed key issues in the original analysis and demonstrate the scientific importance of research replication.

  • Cardiovascular disease
  • Health inequalities
