Background: Selection bias in observational epidemiology—the notion that people who participate in a study are fundamentally different from those who do not—is a perennial concern. In cohort studies, a potentially important but little investigated manifestation of selection bias is the distortion of the exposure–disease relationship according to participation status.
Methods: Seven years after the original UK Health and Lifestyle Survey (HALS1; analytic sample N = 6484), attempts were made to resurvey participants (HALS2). The baseline characteristics, mortality experience following the completion of HALS2 and, finally, the baseline risk factor–cardiovascular disease (CVD) mortality gradients in HALS2 non-participants (N = 1894) and participants (N = 4590) were compared.
Results: Resurvey non-participants, based on data from HALS1, were younger, were of lower social class and had a lower prevalence of hypertension or self-reported limiting long-standing illness, but a higher prevalence of psychological distress (p⩽0.027). The risk of death from future CVD was significantly higher in those baseline study members who did not participate in HALS2. However, the magnitude of the association between a series of risk factors and CVD mortality was essentially the same in the HALS2 non-participants and participants (p value for interaction⩾0.108).
Conclusion: In the present cohort study, non-response at resurvey did not bias the observed associations between baseline risk factors and later CVD mortality. Future studies should also examine the impact of non-response to baseline surveys on these relationships.
In prospective cohort studies, bias introduced by systematic (ie, non-random) non-response is a perennial problem, and one that is likely to become more acute with secular declines in survey participation over recent decades.1 2 Widely referred to as selection bias, this phenomenon has several manifestations. First, it may lead to error in the estimation of risk factor or disease prevalence at study baseline. Second, it may result in inaccuracy in the assessment of disease rates during follow-up. The very nature of non-attendance complicates empirical examination of the impact of these types of bias. However, this has been overcome by investigators using a number of approaches: gathering background information on non-participants from routine data sources (eg, driving licence records3 or mandatory census4); comparing the characteristics of study members who took part in all phases of data collection (eg, medical examination and questionnaire administration) with those of study members who only partially participated (eg, questionnaire administration);5 6 or contrasting the characteristics of participants in a baseline survey according to whether or not they took part in subsequent resurvey(s).7
Based on these methods, there is some evidence that risk factor or disease prevalence at study baseline differs according to response status, but the findings are inconsistent. Thus, in comparison with responders, non-responders may be younger3 8 or older,7 9 10 and have a greater11 12 13 or similar morbid load. There is a much greater degree of consistency as regards smoking7 12 14 and socioeconomic disadvantage,4 8 13 14 15 16 both of which appear to be more common in non-participants. Despite this apparently discordant literature, with very few exceptions,17 a series of studies examining death rates after baseline survey have reported elevated rates of future mortality in non-participants relative to participants.5 6 18 19 20 21 22
Although epidemiologists commonly examine differences between respondents and non-respondents in levels of risk factors, and, where possible, mortality, few studies have examined a third manifestation of selection bias: whether survey participation status affects estimates of the association between risk factors and disease. In a cohort study of cancer incidence, the magnitude of the association between cigarette smoking and carcinoma of the lung,5 and between body mass index (BMI) and colon cancer,3 5 was essentially the same according to survey participation status. Two studies have assessed risk of cardiovascular disease (CVD) mortality according to participation status. In an older US female population, the strength of the association between BMI and death from CVD was similar in non-responders and responders.3 In a cohort of men and women from Finland, similar results were reported when socioeconomic disadvantage was the exposure of interest.4
To our knowledge, the impact of non-response on the association between CVD and several well-established risk indices—blood pressure, alcohol intake, physical activity and common mental disorder23—has yet to be examined. This is the primary purpose of the present analyses.
Baseline examination (HALS1)
The UK Health and Lifestyle Survey (HALS1) was conducted in 1984/5. In a household-based random sample of 9003 adults aged 18 and over (77.5% of target population), interviews were administered and physical measurements were made in participants’ homes.24 The socioeconomic profile of this sample is almost identical to that seen in the 1981 UK census data for both men (38.9% non-manual in HALS1 vs 39% in the census) and women (59.2% non-manual in HALS1 vs 60% in the census);25 a similar level of agreement is apparent for age and ethnicity.24 During the home visits, enquiries were made about employment history, smoking habits, alcohol consumption, engagement in exercise activities such as keep fit, sports, jogging, swimming, cycling or dancing, long-standing illness or disability, and experience of heart attack, angina or a stroke, including treatments. Using standard protocols, blood pressure, height and weight were collected on a further home visit (7268 responded; 81% of those initially interviewed), and the respondents were invited to complete and post back the 30-item General Health Questionnaire (GHQ-30),26 which provides an assessment of psychiatric status (6317 responded; 70% of those interviewed). In total, 98% of those who took part in HALS1 were subsequently “flagged” for mortality using the Office for National Statistics’ NHS Central Register. Death certificates were obtained for those who died and causes of death were coded according to the 9th revision of the International Classification of Diseases (ICD-9).
In 1991/2, approximately 7 years after the first survey, a resurvey (HALS2) was carried out in order to describe changes that had occurred in health and lifestyle among the original respondents.27 Of the 9003 men and women who took part in HALS1, 718 (8%) had died before the start of HALS2 in 1991, leaving 8285 as potential participants. Of these, 6484 (72% of the 9003 HALS1 study members) had complete data on levels of risk factors at baseline (occupational social class, blood pressure, height and weight, self-reported illness, smoking habits, alcohol consumption, physical activity and GHQ-30 score) together with data on mortality, and so could be included in our analyses.
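The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only counts reported in this article (the HALS2 participant and non-participant numbers come from the abstract); the variable names are illustrative. It also shows that the 72% figure corresponds to 6484 of the 9003 HALS1 members.

```python
# Participant flow as reported in the text (all counts taken from the article).
hals1_total = 9003             # took part in HALS1
died_before_hals2 = 718        # died before HALS2 began in 1991
analytic_sample = 6484         # complete baseline risk factor and mortality data
hals2_participants = 4590      # from the abstract
hals2_non_participants = 1894  # from the abstract

eligible_for_resurvey = hals1_total - died_before_hals2


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


print(f"Eligible for resurvey: {eligible_for_resurvey}")
print(f"Died before HALS2: {pct(died_before_hals2, hals1_total):.1f}% of HALS1")
print(f"Analytic sample: {pct(analytic_sample, hals1_total):.1f}% of HALS1")
print("Resurvey non-participation within the analytic sample: "
      f"{pct(hals2_non_participants, analytic_sample):.1f}%")
```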
CVD was coded according to the ICD-928 (ICD-9 codes 390–434, 436–448.30). Analyses were based on deaths that occurred over a maximum of 14 years between the start of the HALS2 survey in September 1991 and 30 May 2005. We used Cox proportional hazards regression models29 to compute hazard ratios (HRs) with accompanying 95% confidence intervals (CIs) for the association of risk factors with CVD mortality.
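For readers unfamiliar with how a hazard ratio and its confidence interval arise from a Cox model, the following is a minimal sketch only, not the software or model specification used in this study. It fits a single binary covariate by Newton-Raphson on the partial likelihood (Breslow handling of ties) and omits the age and sex adjustment. The follow-up times, event indicators and exposure values are invented toy data, and the function name `cox_fit` is hypothetical.

```python
import math


def cox_fit(times, events, x, max_iter=25, tol=1e-10):
    """Fit a one-covariate Cox model by Newton-Raphson.

    Returns the log hazard ratio and its standard error. Ties are handled
    with the Breslow approximation (each event uses the full risk set).
    """
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue  # censored subjects contribute only via risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            s0 = sum(math.exp(beta * x_j) for x_j in risk)
            s1 = sum(x_j * math.exp(beta * x_j) for x_j in risk)
            s2 = sum(x_j * x_j * math.exp(beta * x_j) for x_j in risk)
            score += x_i - s1 / s0                # first derivative
            info += s2 / s0 - (s1 / s0) ** 2      # observed information
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


# Toy data (invented for illustration; NOT HALS data).
times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 14, 14, 14, 14]  # years
events = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]          # 1 = CVD death
exposed = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0]         # 1 = risk factor present

beta, se = cox_fit(times, events, exposed)
hr = math.exp(beta)
ci_low, ci_high = math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)
print(f"HR = {hr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

In practice an established survival analysis package would be used, with covariates such as age and sex entered alongside the risk factor of interest.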
Based on data from HALS1, non-participants in the resurvey were younger and of lower social class than those who participated in HALS2, and had a lower prevalence of hypertension and self-reported limiting long-standing illness but a higher prevalence of psychological distress, as indicated by a score of 6 or more on the GHQ (table 1).
Compared with those who participated in HALS2, risk of death from all causes and from CVD was significantly higher in the non-participants: age- and sex-adjusted HRs (95% CI) for all-cause and cardiovascular mortality were 1.39 (1.24 to 1.57) and 1.28 (1.07 to 1.55), respectively. Notably, these associations were little attenuated by further adjustment for the risk factors described in table 1: fully adjusted HRs (95% CI) for all-cause and cardiovascular mortality were 1.36 (1.21 to 1.53) and 1.26 (1.04 to 1.53), respectively.
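As a rough consistency check on these estimates, the standard error of each log hazard ratio, and hence an approximate z statistic and two-sided p value, can be back-calculated from the reported confidence interval. The sketch below assumes the intervals are Wald intervals on the log scale, which is an assumption and not something stated in the article; the helper name `wald_from_ci` is illustrative, and rounding of the published figures limits precision.

```python
import math


def wald_from_ci(hr, lower, upper):
    """Back-calculate SE of log(HR), z statistic and two-sided p value from a
    reported HR and 95% CI, assuming a Wald interval on the log scale."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# HRs (95% CI) for non-participants vs participants, as reported in the text.
reported = {
    "all-cause, age- and sex-adjusted": (1.39, 1.24, 1.57),
    "CVD, age- and sex-adjusted": (1.28, 1.07, 1.55),
    "all-cause, fully adjusted": (1.36, 1.21, 1.53),
    "CVD, fully adjusted": (1.26, 1.04, 1.53),
}

results = {}
for label, (hr, lower, upper) in reported.items():
    se, z, p = wald_from_ci(hr, lower, upper)
    results[label] = (se, z, p)
    print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this assumption, all four approximate p values fall below 0.05, in line with the description of the excess risk as statistically significant.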
In table 2 we show the relation between each of the baseline risk factors and mortality from CVD in the participants compared with the non-participants in HALS2. Among the participants, all the baseline risk factors were associated with an increased risk of death from CVD in age- and/or sex-adjusted analyses with statistical significance apparent in most analyses. Some of these relationships were attenuated after mutual adjustment. Similar results were apparent among the non-participants such that no interaction term was statistically significant. A similar pattern of results was evident when total mortality was the outcome of interest (results available on request).
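One simple way to see what a non-significant interaction means is to compare the two stratum-specific hazard ratios directly, using a z test on the difference of their logarithms. This is an approximation offered for illustration: the article does not describe how its interaction p values were computed (a product term in the Cox model is the more usual approach), and the hazard ratios below are HYPOTHETICAL values, not figures from table 2. Independence of the two estimates is assumed, which is reasonable because participants and non-participants are disjoint groups.

```python
import math


def se_from_ci(lower, upper):
    """Standard error of log(HR) implied by a 95% Wald CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * 1.96)


def compare_hazard_ratios(hr_a, ci_a, hr_b, ci_b):
    """z test for the difference between two independent log hazard ratios."""
    diff = math.log(hr_a) - math.log(hr_b)
    se_diff = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value
    return z, p


# HYPOTHETICAL hazard ratios for one risk factor (not taken from table 2).
hr_participants, ci_participants = 1.60, (1.20, 2.13)
hr_non_participants, ci_non_participants = 1.45, (0.95, 2.21)

z, p = compare_hazard_ratios(hr_participants, ci_participants,
                             hr_non_participants, ci_non_participants)
print(f"z = {z:.2f}, p for difference = {p:.2f}")
```

With these illustrative inputs the two estimates differ by much less than their combined uncertainty, which is the pattern the interaction tests in the text describe.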
In the present study, risk factor and health indices at baseline differed according to participation at resurvey some 7 years later. That non-participants at resurvey herein were, at baseline, younger, of lower socioeconomic status and had elevated minor psychiatric morbidity accords with most,3 4 8 11 12 13 but not all,7 9 10 14 studies. Despite these non-participants also having a lower prevalence of hypertension or self-reported limiting long-standing illness, in keeping with a series of other cohorts,5 6 18 19 20 21 22 we observed an elevated risk of CVD and all-cause mortality at follow-up in resurvey non-participants relative to participants. Notably, however, this selection bias did not have an impact on the association of established risk factors with future total and CVD mortality. These results support those from the very few other studies on this topic, in which obesity3 and socioeconomic disadvantage4 were similarly predictive of CVD mortality according to survey response. While we examined the predictive value of a greater range of risk factors than was possible in these reports, a limitation of our study is that we were not able to directly explore the impact of non-response to the original (baseline) survey on the associations between risk factors and CVD deaths. However, as indicated, census comparison suggests that the men and women who took part in HALS1 were highly representative of the UK population.
In conclusion, in this prospective cohort study, which has a typical level of participant non-response at resurvey, and in which there were expected risk factors and mortality differences according to participation status, there was no evidence that this selection bias actually modified the risk factor–CVD mortality association. This suggests that, in the present cohort study at least, and perhaps others, non-response at resurvey does not bias the observed associations between baseline risk factors and later CVD mortality.
What is already known on this subject
Investigators working on cohort studies typically attempt to explore selection bias by comparing the baseline characteristics and mortality experience of participants and non-participants.
In general, the risk factor profile and mortality experience of non-participants is less favourable.
Very little is known about whether these differentials translate into substantially different risk factor–cardiovascular disease associations in survey participants and non-participants.
What this study adds
In a study which had a typical level of non-response to a resurvey, and in which there were expected differentials in both risk factors and mortality according to resurvey participation, there was in fact no evidence that this selection bias modified the risk factor–cardiovascular disease mortality association.
This suggests that, in the present study and perhaps others, non-response at resurvey does not bias the observed associations between baseline risk factors and later cardiovascular disease mortality.
Future studies should examine the impact of non-response to baseline surveys on the associations between risk factors and cardiovascular disease.
Funding The Medical Research Council (MRC) Social and Public Health Sciences Unit receives funding from the MRC and the Chief Scientist Office at the Scottish Government Health Directorates. G-DB is a Wellcome Trust Fellow (WBS U.1300.00.006.00012.01). The MRC supports CG.
Competing interests None.
Ethics approval Approval was obtained from various Local Research Ethics Committees across the UK.
Provenance and peer review Not commissioned; externally peer reviewed.