Sample selection and validity of exposure–disease association estimates in cohort studies

Costanza Pizzi; Bianca De Stavola; Franco Merletti; Rino Bellocco; Isabel dos Santos Silva; Neil Pearce; Lorenzo Richiardi

doi:10.1136/jech.2009.107185

Theory and methods

Costanza Pizzi¹,²,
Bianca De Stavola²,
Franco Merletti¹,
Rino Bellocco³,⁴,
Isabel dos Santos Silva⁵,
Neil Pearce²,⁶,
Lorenzo Richiardi¹

¹Cancer Epidemiology Unit, CeRMS and CPO-Piemonte, University of Turin, Italy
²Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
³Department of Statistics, University of Milano Bicocca, Milan, Italy
⁴Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
⁵Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
⁶Centre for Public Health Research, Massey University Wellington Campus, New Zealand

Correspondence to Costanza Pizzi, Via Santena 7, 10126 Torino, Italy; costanza.pizzi@lshtm.ac.uk

Abstract

Background Participants in cohort studies are frequently selected from restricted source populations. It has been recognised that such restriction may affect the study validity.

Objectives To assess the bias that may arise when analyses involve data from cohorts based on restricted source populations, an area little studied in quantitative terms.

Methods Monte Carlo simulations were used, based on a setting where the exposure and one risk factor for the outcome, which are not associated in the general population, influence selection into the cohort. All the parameters involved in the simulations (ie, prevalence and effects of exposure and risk factor on both the selection and outcome process, selection prevalence, baseline outcome incidence rate, and sample size) were allowed to vary to reflect real life settings.
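The setting described in the Methods can be illustrated with a minimal Monte Carlo sketch. It is not the authors' simulation code, which the abstract does not give. It assumes a logistic selection model with no exposure-by-risk-factor interaction, exponential event times with administrative censoring, and a crude rate ratio (events per person-time, ignoring the risk factor) as a stand-in for the estimated HR. The function name, the parameter names and all default values are hypothetical. The true exposure HR is set to 1 by default, so any departure of the log rate ratio from 0 in the selected cohort reflects selection-induced bias plus Monte Carlo error.

```python
import math
import random


def simulate_log_rate_ratio(
    n=200_000,
    prev_exposure=0.5,    # P(E = 1) in the general population (assumed)
    prev_risk=0.5,        # P(R = 1), independent of E (assumed)
    or_sel_exposure=4.0,  # selection OR, exposed vs unexposed
    or_sel_risk=4.0,      # selection OR, risk factor present vs absent
    base_sel_odds=1.0,    # selection odds when E = R = 0 (assumed)
    hr_exposure=1.0,      # true exposure HR; 1.0 means no real effect
    hr_risk=4.0,          # HR of the unmeasured risk factor
    base_rate=0.01,       # baseline incidence per person-year (assumed)
    follow_up=10.0,       # administrative censoring time, years (assumed)
    seed=1,
):
    """Return crude log rate ratios (exposed vs unexposed) for the whole
    population and for the selected cohort, ignoring the risk factor."""
    rng = random.Random(seed)
    events = {"all": [0, 0], "selected": [0, 0]}
    person_time = {"all": [0.0, 0.0], "selected": [0.0, 0.0]}

    for _ in range(n):
        # Exposure and risk factor are drawn independently.
        e = int(rng.random() < prev_exposure)
        r = int(rng.random() < prev_risk)

        # Logistic selection model: both E and R influence selection.
        odds = base_sel_odds * or_sel_exposure ** e * or_sel_risk ** r
        selected = rng.random() < odds / (1.0 + odds)

        # Exponential event time under proportional hazards, censored at follow_up.
        t = rng.expovariate(base_rate * hr_exposure ** e * hr_risk ** r)
        had_event = int(t <= follow_up)
        observed = min(t, follow_up)

        groups = ("all", "selected") if selected else ("all",)
        for group in groups:
            events[group][e] += had_event
            person_time[group][e] += observed

    def log_rate_ratio(group):
        rate_exposed = events[group][1] / person_time[group][1]
        rate_unexposed = events[group][0] / person_time[group][0]
        return math.log(rate_exposed / rate_unexposed)

    return {group: log_rate_ratio(group) for group in events}


if __name__ == "__main__":
    for group, value in simulate_log_rate_ratio().items():
        print(group, round(value, 3))
```

Calling the function with `or_sel_exposure=2.0, or_sel_risk=2.0, hr_risk=2.0` gives a weaker-association scenario of the kind the abstract contrasts with the strong one. Passing the inverse ORs (0.25 or 0.5) changes the selection pattern in the other direction.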

Results The simulations show that when the exposure and risk factor are strongly associated with selection (ORs of 4 or 0.25) and the unmeasured risk factor is associated with a disease HR of 4, the bias in the estimated log HR for the exposure–disease association is ±0.15. When these associations decrease to values more commonly seen in epidemiological studies (eg, ORs and HRs of 2 or 0.5), the bias in the log HR drops to just ±0.02.
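To read these log-scale biases on the ratio scale, exponentiate them: a bias of ±0.15 in the log HR multiplies the estimated HR by about 1.16 or 0.86, and a bias of ±0.02 by about 1.02 or 0.98. The short sketch below does only this conversion; the helper name is hypothetical.

```python
import math


def hr_distortion(log_bias):
    """Factor by which an HR estimate is multiplied, given a bias in the log HR."""
    return math.exp(log_bias)


if __name__ == "__main__":
    for log_bias in (0.15, -0.15, 0.02, -0.02):
        print(f"{log_bias:+.2f} -> x{hr_distortion(log_bias):.2f}")
```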

Conclusions Using a restricted source population for a cohort study will, under a range of sensible scenarios, produce only relatively weak bias in estimates of the exposure–disease associations.

Directed Acyclical Graphs
selection bias
confounding
Monte Carlo Simulations
BIAS ME
epidemiology ME

Footnotes

Funding The study was conducted within projects partially funded by Compagnia SanPaolo/FIRMS, the Piedmont Region, the Italian Ministry of University and Research (MIUR), the Italian Association for Research on Cancer (AIRC) and the Massey University Research Fund (MURF). The Centre for Public Health Research is supported by a Programme Grant from the Health Research Council of New Zealand.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
