Generalizing molecular results arising from incomplete biological samples: expected bias and unexpected findings

Ann Epidemiol. 2002 Jan;12(1):7-14. doi: 10.1016/s1047-2797(01)00267-8.

Abstract

Purpose: In molecular epidemiology, obtaining biological samples for all subjects targeted for study is frequently hampered by ethical, clinical, and logistic factors. The extent to which the incompleteness of biological samples could cause bias is rarely analyzed in depth. Here we report some expected biases and some unexpected findings from a study on mutations in the K-ras gene in exocrine pancreatic cancer (EPC).

Methods: In this case-case study, all patients registered with EPC between 1980 and 1990 at two general hospitals were retrospectively identified from the hospital tumor registries. Their clinical records were abstracted and paraffin-embedded samples retrieved from pathology records. DNA was amplified, and mutations in codon 12 of the K-ras gene were detected using the artificial RFLP technique.

Results: Results on the mutations (RM) were obtained for 51 of the 149 cases of EPC (34.2%). There were no significant differences in the availability of RM by age, gender, or tumor stage at diagnosis, but RM were over five times more likely to be available from one of the hospitals. Subjects with RM were more likely to have received treatment with curative intent (OR = 11.56, 95% CL: 2.88-46.36). The existence of RM was positively associated with the availability of information on alcohol use and family history of cancer. Subjects with RM tended to belong to higher occupational groups and to smoke less than subjects without RM. Unexpectedly (given that in EPC K-ras mutations have consistently been found to be unrelated to age, gender, tumor stage, and other clinical factors), cases with a K-ras mutation were more likely than wild-type cases to have information on tobacco and alcohol use (OR = 3.29, p = .21), medical history (OR = 4.46, p = .41), and family history of cancer (OR = 4.80, p = .01). The relationship between completeness of clinical records and K-ras mutations among cases with RM could not be accounted for by age, gender, or occupational group.
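
The width of the confidence limits reported in the Results can be read as a measure of precision. The following is a minimal sketch, not the authors' analysis: it assumes the 95% limits for the curative-intent odds ratio are Wald-type limits, symmetric on the log scale (the abstract does not state how they were computed), and uses only the figures quoted above. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_se_from_limits(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by log-symmetric confidence limits.

    Assumption: the limits are exp(log(OR) -/+ z * SE), i.e. Wald-type.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def geometric_midpoint(lower, upper):
    """Point estimate implied by log-symmetric limits (their geometric mean)."""
    return math.sqrt(lower * upper)


# Figures taken from the abstract.
availability = 51 / 149            # proportion of cases with results on mutations
lower, upper = 2.88, 46.36         # 95% CL for treatment with curative intent

implied_or = geometric_midpoint(lower, upper)
implied_se = log_or_se_from_limits(lower, upper)

print(f"Availability of RM: {availability:.1%}")
print(f"OR implied by the limits: {implied_or:.2f}")
print(f"Implied SE of log(OR): {implied_se:.2f}")
```

If the assumption holds, the geometric midpoint of the limits should fall close to the reported OR of 11.56, and the large implied standard error reflects how few subjects contribute to this comparison.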

Conclusions: Simple tests of age and gender distributions among subjects with and without available clinical information and molecular results may not rule out selection and information bias. Studies using biological specimens need, even more than classic studies, to explain clearly the process followed to include and exclude subjects. Additional caution is needed when generalizing molecular results arising from incomplete biological specimens.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Female
  • Genes, ras / genetics*
  • Humans
  • Male
  • Middle Aged
  • Molecular Epidemiology*
  • Pancreatic Neoplasms / epidemiology
  • Pancreatic Neoplasms / genetics*
  • Pancreatic Neoplasms / pathology
  • Registries
  • Retrospective Studies
  • Selection Bias*
  • Spain / epidemiology