Baseline selection on a collider: a ubiquitous mechanism occurring in both representative and selected cohort studies
  1. Lorenzo Richiardi1,
  2. Neil Pearce2,3,
  3. Eva Pagano4,
  4. Daniela Di Cuonzo4,5,
  5. Daniela Zugna1,
  6. Costanza Pizzi1
  1. 1 Department of Medical Sciences, University of Turin and CPO-Piemonte, Turin, Italy
  2. 2 Medical Statistics, London School of Hygiene & Tropical Medicine, London, UK
  3. 3 Centre for Public Health Research, Massey University, Wellington, New Zealand
  4. 4 Cancer Epidemiology Turin, AOU Città della Salute e della Scienza and CPO-Piemonte, Torino, Italy
  5. 5 Cancer Epidemiology, AOU Città della Salute e della Scienza, Turin, Italy
  Correspondence to Dr Lorenzo Richiardi, Department of Medical Sciences, University of Turin, Turin 10124, Italy; lorenzo.richiardi{at}


There is debate as to whether cohort studies are valid when they are based on a source population that is not representative of a given general population. This baseline selection may introduce collider bias if the exposure of interest and some other outcome risk factors affect the probability of being in the source population, thus altering the associations between the exposure and those risk factors. We argue that this mechanism is not specific to ‘selected cohorts’ and also occurs in ‘representative cohorts’, owing to the selection processes that operate in any population. These selection processes are linked, for example, to vital status, immigration and emigration, which, in turn, may be affected by environmental and social determinants, lifestyles and genetics. We provide real-world examples of this phenomenon using data on the population of the Piedmont region, Italy. In addition to well-recognised mechanisms, such as shared common causes, the associations between the exposure of interest and the risk factors for the outcome of interest in any source population are potentially shaped by collider bias due to the underlying selection processes. We conclude that, when conducting a cohort study, different source populations, whether ‘selected’ or ‘representative’, may lead to different associations between the exposure and the outcome risk factors, and thus to different degrees of lack of exchangeability, but that one approach is not inherently more or less biased than the other. The key issue is whether the relevant risk factors can be identified and controlled.
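
The collider mechanism described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the analysis performed on the Piedmont data: the variable names, the sample size and the selection probabilities are all hypothetical. A binary exposure and a binary outcome risk factor are generated independently, and both are assumed to increase the probability of belonging to the source population.

```python
import random


def simulate_population(n=200_000, seed=0):
    """Simulate a binary exposure and a binary outcome risk factor that are
    independent in the full population, plus a selection indicator."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        exposure = rng.random() < 0.5
        risk_factor = rng.random() < 0.5  # drawn independently of exposure
        # Hypothetical selection model (an assumption of this sketch): both
        # variables raise the probability of being in the source population.
        p_selected = 0.1 + 0.4 * exposure + 0.4 * risk_factor
        selected = rng.random() < p_selected
        rows.append((exposure, risk_factor, selected))
    return rows


def odds_ratio(rows):
    """Exposure-risk factor odds ratio from (exposure, risk_factor, ...) rows."""
    counts = {(e, r): 0 for e in (False, True) for r in (False, True)}
    for exposure, risk_factor, *_ in rows:
        counts[(exposure, risk_factor)] += 1
    return (counts[(True, True)] * counts[(False, False)]) / (
        counts[(True, False)] * counts[(False, True)]
    )


population = simulate_population()
source_population = [row for row in population if row[2]]

or_everyone = odds_ratio(population)
or_selected = odds_ratio(source_population)

print(f"Odds ratio before selection: {or_everyone:.2f}")
print(f"Odds ratio in the source population: {or_selected:.2f}")
```

Under these assumed selection probabilities, the exposure–risk factor odds ratio is expected to be close to 1 before selection and close to (0.9 × 0.1)/(0.5 × 0.5) = 0.36 in the source population, although the two variables were generated independently. The size and direction of the induced association depend entirely on the assumed selection model.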

  • collider bias
  • cohort studies
  • bias
  • selection
  • representativeness



  • Contributors: LR, DZ, NP and CP contributed to the concept and design. EP, DD and LR contributed to the acquisition and analysis of the data. All authors contributed to the interpretation of the results and to critically revising the manuscript. LR and NP drafted the work. All authors gave final approval and agree to be accountable for all aspects, ensuring integrity and accuracy.

  • Funding: LR, CP and DZ received funding from the European Union’s Horizon 2020 Research and Innovation Programme (LifeCycle Project, grant agreement number 733206). NP’s involvement in this work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013/ERC grant agreement number 668954).

  • Competing interests: None declared.

  • Patient consent for publication: Not required.

  • Ethics approval: No research approval was obtained as the included examples are based on deidentified administrative data.

  • Provenance and peer review: Not commissioned; externally peer reviewed.