Background: Declining response rates pose a serious threat to the validity of estimates derived from epidemiological studies. If respondents and non-respondents differ systematically from each other, there can be a bias in the results of the study. A population-based cohort study was conducted to investigate disparities in socioeconomic structure between respondents and non-respondents and the contribution of these disparities to socioeconomic differences in total and cardiovascular mortality.
Design: Data comprised 32 354 male and female participants and 4890 non-participants aged 35–74 years who belonged to the sample in one of the five FINRISK surveys in 1972, 1977, 1982, 1987 or 1992 in Finland. They were followed up for 9 years and 6 months.
Results: It was found that the lower socioeconomic groups were over-represented among non-respondents both in men and women. When comparing the relative risk of death using the highest socioeconomic group of the participants as the reference group, it was found that although the socioeconomic gradient was similar for participants and non-participants—that is, lower groups had a higher risk of death—the risk was at a higher level among non-respondents.
Conclusions: Basing analysis on participants does not distort the relative risk of death associated with socioeconomic position. However, it does underestimate the absolute risk.
- ISCED, International Standard Classification of Education
- SES, socioeconomic status
Response rates have been declining in epidemiological studies for the past three decades.1,2 This trend poses a serious threat to the validity of the estimates derived from these studies. Large population-based surveys in particular require a sufficiently large sample to achieve adequate statistical power. A low response rate undermines this aim by reducing the number of respondents, which results in wide confidence intervals and weak statistical power. The more important consequence of a low response rate is, however, the bias in the survey estimates caused by differences between respondents and non-respondents. This self-selection bias is not an automatic feature of a low response rate, though. If respondents and non-respondents do not differ systematically from each other, there will be no bias in the estimates; only the statistical precision of the study is reduced.3–5 Many studies have shown6–17 that these two groups usually differ from each other with respect to race, socioeconomic status (SES) and health behaviour. The process leading to non-response is, therefore, not completely random but is structured in a way that affects the outcome of the survey.
According to several studies, non-respondents are more likely to be young, single, less educated men with unhealthy lifestyles.5,18,19 Mortality among non-respondents is often higher, sometimes even twice as high as among respondents.20,21 These facts suggest that survey estimates, for example, the prevalences of smoking and alcohol consumption, are underestimates if the response rate is low.
This study aimed to investigate disparities in socioeconomic structure between respondents and non-respondents and further to analyse to what extent these disparities contribute to the differences in total mortality and cardiovascular mortality between these groups. It was based on five FINRISK surveys conducted between 1972 and 1992 in Finland.22 The respondents and non-respondents were followed up for mortality for 9 years and 6 months. This study presents a follow-up of 32 354 male and female participants and 4890 non-participants representing the general population. The total and cause-specific mortality data are complete both for participants and non-participants.
The study population comprised people who were selected for population-based FINRISK surveys in 1972, 1977, 1982, 1987 or 1992 in four areas of Finland (North Karelia, Kuopio, Turku/Loimaa and Helsinki). The total sample size for all surveys in the age range 25–74 years was 54 453, out of which 45 902 (84%) participated. The youngest age group (25–34 years) was excluded from the study (n = 14 532) because of the low death rate and unsettled socioeconomic status in the group. Participants who had missing information on socioeconomic indicators (n = 240) or who had died between the sampling and the survey date (n = 98) were also excluded from the analyses. There were 2339 participants who were selected for more than one survey, and they were included only in their first survey cohort. Thus, 32 354 male and female participants and 4890 non-participants aged 35–74 years were included in the analyses.
The study protocol for the FINRISK surveys included a postal questionnaire on health behaviour (such as smoking, use of alcohol and physical exercise) and other health-related data, and a personal health examination including measurements of blood pressure, height and weight, among others. Recipients were asked to fill in the questionnaire and bring it to the survey site where the health examination was performed. If a person did not appear at the survey site, he/she was contacted by phone to arrange a new survey date. The surveys were approved by the ethics committee, conducted according to the ethical rules of the National Public Health Institute (KTL Helsinki, Finland) and carried out in accordance with the Declaration of Helsinki.
Each study cohort was followed up for 9 years and 6 months for total and cardiovascular mortality. Data on mortality were taken from the National Causes of Death Register, and cardiovascular deaths (the sum of ischaemic heart disease and stroke deaths) included the following International Classification of Diseases (ICD) codes as the underlying causes of death: ICD-8: 410–414, 430–434, 795, ICD-9: 410–414, 430, 431, 433, 434, 436, 437, 798, ICD-10: I21–I25, I46, I60, I61, I63, I64, R96, R98, R99. With the help of this country-wide register, the follow-up was in practice 100% complete. These data were further linked to the individual level records of the censuses carried out in Finland in 1970, 1975, 1980, 1985 and 1990 to obtain data on occupational class, household income and education. The linkage was performed by Statistics Finland using personal identification numbers unique to every resident of Finland.
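The cause-of-death definition above can be sketched as a lookup by ICD revision. This is only an illustration of the code list given in the text: the function name, the revision labels and the choice to compare only the three-character category (so that subcodes match their parent) are assumptions, not a description of how the register data were actually processed.

```python
def _numeric(*parts):
    """Expand integers and (start, end) tuples into a set of 3-digit code strings."""
    codes = set()
    for part in parts:
        start, end = part if isinstance(part, tuple) else (part, part)
        codes.update(str(n) for n in range(start, end + 1))
    return codes


# Underlying-cause codes counted as cardiovascular deaths, as listed in the text.
CVD_CODES = {
    "ICD-8": _numeric((410, 414), (430, 434), 795),
    "ICD-9": _numeric((410, 414), 430, 431, 433, 434, 436, 437, 798),
    "ICD-10": {"I21", "I22", "I23", "I24", "I25", "I46",
               "I60", "I61", "I63", "I64", "R96", "R98", "R99"},
}


def is_cardiovascular_death(revision, code):
    """Return True if the underlying cause is in the study's cardiovascular list.

    Assumption: only the first three characters are compared, so '410.9'
    or 'I21.4' are treated as belonging to '410' and 'I21'.
    """
    category = code.strip().upper().replace(".", "")[:3]
    return category in CVD_CODES[revision]
```

Note that the same three-digit code can be classified differently depending on the revision: 432 is inside the ICD-8 range 430–434 but is not in the ICD-9 list.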
Occupational class, level of education and household income were used as indicators of a person’s position in society. Data on occupational class are formed of several different classification criteria—for example, person’s stage in life (student, economically active, pensioner and so on), occupation and status in employment (self-employed, employee). Occupational class was divided into six groups: upper non-manual employees (administrative, managerial and professional occupations), lower non-manual employees (lower-level administrative and clerical occupations), manual workers, farmers, other employers and others. The group “others” was highly heterogeneous including people with a long unemployment history or those whose occupational class was unknown. Pensioners were classified according to their past occupational group. Women’s occupational class was primarily determined by their own present or past status. Housewives and others without present or past occupational class were classified according to the occupation of their spouse.
Income was defined as total household income per year adjusted for family size using the OECD equivalence scale, where the first adult in the household was weighted as 1.0, other adults as 0.7 and children aged <18 years as 0.5.23 Income was divided into tertiles by study year. Education was divided into three groups: primary education referred to a maximum of 9 years of formal education (corresponding to the UNESCO International Standard Classification of Education (ISCED) levels 0–2), secondary education to 10–12 years of formal education (ISCED levels 3–4) and tertiary education to ≥13 years of formal education (ISCED levels 5–6).
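As a minimal sketch of the income adjustment described above, the household income can be divided by the sum of the equivalence weights and then ranked into tertiles within a study year. The function names are hypothetical, and the rank-based tertile split (including how ties are broken) is an assumption; the text does not say how the cut-points were computed.

```python
def equivalised_income(household_income, n_adults, n_children):
    """Household income divided by the equivalence weights given in the text:
    1.0 for the first adult, 0.7 for each other adult, 0.5 per child under 18."""
    if n_adults < 1:
        raise ValueError("a household needs at least one adult")
    units = 1.0 + 0.7 * (n_adults - 1) + 0.5 * n_children
    return household_income / units


def tertile_labels(values):
    """Label each value 1 (lowest) to 3 (highest) by rank within one study year.

    Assumption: groups are made as equal in size as possible and ties are
    broken by input order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 3 // len(values) + 1
    return labels
```

For example, a household of two adults and two children has 1.0 + 0.7 + 2 × 0.5 = 2.7 consumption units, so a household income of 54 000 corresponds to an equivalised income of about 20 000.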
Analyses were conducted separately for men and women. The participants from different survey years (1972, 1977, 1982, 1987 and 1992) were pooled together. Descriptive statistics were computed to measure socioeconomic distributions of participants and non-participants. Kaplan–Meier curves and log rank tests were used to examine differences in mortality between participants and non-participants in different income groups.
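For readers unfamiliar with the Kaplan–Meier estimator, the product-limit calculation can be sketched in a few lines. This is a generic textbook version, not the software used in the study; the function name and the input layout (one follow-up time and one death indicator per person) are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each person (for example, years)
    events -- 1 if the person died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the share surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Computing one curve per group (for instance, participants and non-participants within an income tertile) gives the curves that a log rank test then compares.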
Cox’s proportional hazards model was used to investigate three aspects of the contribution of non-participation to the estimates of relative risk (RR) of death between socioeconomic groups:
(1) Contribution of socioeconomic status to possible excess mortality among non-participants. In this model, participation was the main target of interest and SES category was used as a covariate.
(2) Participants versus non-participants. How did participants and non-participants differ from each other with respect to socioeconomic differences in total and cardiovascular mortality? The highest socioeconomic group (upper non-manual employees, tertiary education, the highest income tertile, respectively) of the participants was used as the reference group.
(3) Participants versus total sample. The total sample (participants and non-participants) represented the study population and our aim was to investigate to what extent estimates derived from only observing the participants were distorted by systematic differences between participants and non-participants. The highest socioeconomic group for participants and for the total sample, respectively, were used as reference groups.
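One way to write down the three models above is sketched below. The notation is ours and is an assumption about the parameterisation, not a formula taken from the study: N = 1 for non-participants and 0 for participants, s_k are indicator variables for SES group k, g_{N,k} are indicators for the joint participation-by-SES categories, and z collects the adjustment covariates (age, study area and study year are reported for the first model; we assume the same adjustment for the others).

```latex
% Model 1: excess mortality of non-participants, with SES as a covariate
h(t \mid N, \mathbf{s}, \mathbf{z})
  = h_0(t)\,\exp\!\bigl(\beta N + \boldsymbol{\gamma}^{\top}\mathbf{s}
      + \boldsymbol{\delta}^{\top}\mathbf{z}\bigr),
\qquad \mathrm{HR}_{\text{non-participation}} = e^{\beta}

% Model 2: joint categories of participation and SES group,
% reference = participants in the highest SES group
h(t \mid \mathbf{g}, \mathbf{z})
  = h_0(t)\,\exp\!\Bigl(\sum_{(N,k) \neq (0,\mathrm{ref})} \theta_{N,k}\, g_{N,k}
      + \boldsymbol{\delta}^{\top}\mathbf{z}\Bigr)

% Model 3: SES gradient, fitted once among participants and once in the total sample
h(t \mid \mathbf{s}, \mathbf{z})
  = h_0(t)\,\exp\!\Bigl(\sum_{k \neq \mathrm{ref}} \beta_k s_k
      + \boldsymbol{\delta}^{\top}\mathbf{z}\Bigr)
```

In each case the relative risk for a category is the exponential of its coefficient, for example exp(θ_{1,k}) for non-participants in SES group k relative to participants in the highest group.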
Table 1 presents participation rates for the FINRISK surveys in 1972–92 for men and women aged 35–74 years. The participation rate varied among men from 92% to 73% and among women from 93% to 77%, depending on the study area and year. In general, there was a declining trend over time. The average total participation rate for all the surveys was 85% for men and 89% for women.
The socioeconomic distribution of the participants differed from that of the non-participants (table 2), and the size of the difference depended on the socioeconomic indicator used. Among men, manual workers and the group “others” made up a larger share of non-participants than of participants. Farmers represented 20% of the participants but only 11% of the non-participants. The proportion of upper non-manual employees was equal in both groups. For women, the differences were similar in nature. The socioeconomic distribution of participants and non-participants also differed clearly by household income. The lowest income tertile was underrepresented among the participants and formed 43–44% of the non-participants. This was the case for both genders. Education was the only indicator for which the distribution did not differ significantly between participants and non-participants.
Kaplan–Meier curves for participants and non-participants in the highest and lowest income tertiles deviated fairly evenly during the whole follow-up period among men (fig 1). At the end of the 9.5-year follow-up approximately 24% of the non-participating men in the lowest income tertile had died, whereas among the participating men in the highest tertile only about 9% had died. Among women, mainly the non-participants in the lowest income tertile differed from the other groups (fig 1). Also among women the curves deviated evenly during the whole follow-up period. We also used Kaplan–Meier curves to eliminate the possibility of people not responding because of an existing serious disease during the examination period. As can be seen from fig 1, there is no sudden drop in the survival probability of the non-participants which suggests that there was no substantial difference between participants and non-participants in this respect.
Non-participants had approximately twice as high a risk of death during the follow-up period as participants (hazard ratio, 1.95 (95% CI: 1.76 to 2.15) for men and 2.41 (95% CI: 2.06 to 2.82) for women) when adjusted for age, study area and study year. When the model was further adjusted for occupational class, the excess risk of non-participants decreased to 1.71 (95% CI: 1.55 to 1.89) for men and 2.19 (95% CI: 1.87 to 2.57) for women. Occupational class was the strongest explanatory covariate out of the three SES indicators.
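A common descriptive way to summarise how much of an excess risk is accounted for by an added covariate is the relative change in the excess hazard ratio, (HR_base − HR_adjusted) / (HR_base − 1). The sketch below applies that formula to the point estimates quoted above; the function name is hypothetical, and this summary measure is our illustration rather than a calculation reported in the study.

```python
def percent_excess_explained(hr_base, hr_adjusted):
    """Share (in %) of the excess hazard ratio removed by further adjustment.

    Uses point estimates only and ignores their confidence intervals.
    """
    if hr_base <= 1:
        raise ValueError("no excess risk to explain when the base HR is <= 1")
    return 100 * (hr_base - hr_adjusted) / (hr_base - 1)


# Hazard ratios before and after adjustment for occupational class (from the text).
men = percent_excess_explained(1.95, 1.71)    # about 25%
women = percent_excess_explained(2.41, 2.19)  # about 16%
```

On this measure, occupational class accounts for roughly a quarter of the excess among men and about one sixth among women, leaving most of the excess unexplained.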
Table 3 shows the RRs of death for different socioeconomic groups between participants and non-participants when compared to the reference group (upper non-manual employees, tertiary education, the highest income tertile among participants).
The expected socioeconomic gradient was seen among participants. Among non-participants, the socioeconomic gradient was similar but at a higher level of risk. Male non-participants who were employees or manual workers had a statistically significantly higher risk of death than their socioeconomic counterparts who participated. For example, non-participant upper non-manual employees had a 1.87 (95% CI: 1.22 to 2.86) times higher risk of dying during the follow-up period than participant upper non-manual employees. In lower socioeconomic groups the risk increased further. These results also held for education and income. Among women the results were similar to those for men, except that the difference between participants and non-participants was statistically significant for farmers but not for manual workers. Interaction between participation and SES category was not statistically significant except for women when household income was used as the SES indicator, indicating that the effect of SES on mortality was generally not different between participants and non-participants.
Hazard ratios of all-cause mortality between socioeconomic groups were calculated for the whole sample (table 3). The socioeconomic gradient of the total sample was very similar to the gradient found among the participants. This was the case for both men and women and for all socioeconomic indicators.
The same analyses were conducted for cardiovascular deaths (table 4). The findings among men were consistent with those of total mortality, although the 95% CIs of the hazard ratios among participants and non-participants were generally overlapping. Among women, the pattern was similar, but the CIs were wide due to the small numbers of cardiovascular deaths.
Compared to the corresponding analysis for the whole sample (table 4) the socioeconomic gradient and the hazard ratios were similar. Interaction between participation and SES category for cardiovascular deaths was not statistically significant.
Our study showed that lower socioeconomic groups were over-represented among non-participants in the FINRISK surveys in 1972–92. The response rates were lower among manual workers and in the heterogeneous “others” group than among non-manual employees, in particular among the upper non-manual employees. This was true also when household income was used as the socioeconomic indicator. People in the lowest income tertile were underrepresented among the participants forming almost 44% of the non-respondents.
The non-participants had a higher risk of death than participants. This higher risk was especially clear in lower socioeconomic groups. For example, among non-participating men, manual workers, those with primary education only and those belonging to the lowest income tertile had over three times higher risk of death than the participating upper non-manual employees. In general, the socioeconomic gradient in all-cause mortality and cardiovascular mortality was similar among non-participants and participants, but in every SES category the non-participants had approximately two times higher risk of death than the participants. After adjustment for socioeconomic status, the excess risk of non-participants decreased but was still almost double among men and more than double among women compared to participants. Even though non-respondents belonged to the lower socioeconomic groups more often than respondents, SES explained only a minor part of the mortality difference between participants and non-participants.
We compared the socioeconomic differences in all-cause mortality between participants and the total sample in order to establish how non-participation contributes to the relative socioeconomic differences in mortality. We did not find any substantial differences between participants and the total sample. In both groups, persons with lower socioeconomic position had a higher risk of death than persons with higher socioeconomic position and the RRs were almost identical. This was true both for all-cause mortality and cardiovascular mortality. According to our findings, non-participation did not seem to distort the relative socioeconomic differences in mortality, although the observed relative differences were on a lower absolute level of risk among participants than among the total sample.
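Why a roughly constant excess risk among non-participants leaves the relative differences almost unchanged can be seen with a small worked example. All numbers below are hypothetical and chosen only for illustration (they are not study results), and the sketch uses simple risks instead of hazards: non-participants are given twice the risk of participants in every SES group, and participation is slightly lower in the low SES group.

```python
def total_sample_risk(risk_participants, excess_ratio, participation):
    """Risk of death in a whole SES group that mixes participants and
    non-participants, where non-participants have excess_ratio times the
    participants' risk and `participation` is the share who took part."""
    return risk_participants * (participation + excess_ratio * (1 - participation))


# Hypothetical inputs (NOT from the study).
risk_high, risk_low = 0.05, 0.10   # risk among participants, high vs low SES
part_high, part_low = 0.90, 0.85   # participation rate, high vs low SES
excess = 2.0                       # non-participants' risk relative to participants

total_high = total_sample_risk(risk_high, excess, part_high)  # about 0.055
total_low = total_sample_risk(risk_low, excess, part_low)     # about 0.115

rr_participants = risk_low / risk_high   # 2.0
rr_total = total_low / total_high        # about 2.09
```

In this toy setting the relative risk barely moves (2.0 among participants against about 2.09 in the total sample), while the absolute risk in each group is higher in the total sample than among participants alone. With a larger gap in participation between SES groups the relative risk would shift more.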
Education was the only SES indicator in our study that did not yield an unbalanced distribution between participants and non-participants. This was partly due to age, which acted as a confounding factor between education and participation. We calculated educational distribution among participants and non-participants using two age strata, 35–54 years and 55–74 years, and found that in the older age group, the least educated were overrepresented among non-participants. Education was the most problematic SES indicator in our study. This was partly due to the comparison of people from different generations. It is known that among older people education is not the best possible SES indicator because most of the older population have only a minimal amount of formal education.24 Our study population was born between 1913 and 1957. The educational system was not strictly comparable before and after World War II, which makes it difficult to assess the educational qualifications of the people selected in the sample. Before the war, having only a basic education did not necessarily result in a low socioeconomic position, and this is also reflected in the unbalanced distribution between the educational groups. More than two thirds of respondents and non-respondents had only primary education.
There is some evidence that similar differences exist between participants and non-participants of epidemiological studies in populations that differ considerably both culturally and geographically from each other.5 Usually, non-participants tend to be single men, with low education and relatively low income.5,7,16 They often have unhealthier lifestyles than participants with excess smoking and alcohol use.16,21,25 Some studies did not indicate significant differences between participants and non-participants or these differences were not as usually expected.26,27 Overall, there seems to be fairly good consensus on the existence and nature of these differences at least regarding populations of industrialised countries. Our study confirms this generally accepted view on the part of socioeconomic status but goes beyond that by concluding that although participants and non-participants differ systematically from each other, it does not necessarily distort the relative socioeconomic differences in mortality.
The main strengths of this study were its population-based design, the large number of all-cause and cardiovascular deaths and the long and complete follow-up including both participants and non-participants of the surveys. The possibility of linking these data at an individual level with socioeconomic factors provided by Statistics Finland made it possible to obtain uniquely accurate information on total and cardiovascular mortality in different socioeconomic groups. The cardiovascular diagnoses in the National Causes of Death Register have been validated recently.28 The coverage and the diagnostic accuracy of this register were found to be good.
The average response rates of the FINRISK surveys were relatively high, up to 90%. This poses a question about the applicability of the findings in this study to surveys with considerably lower response rates. The absolute effect of non-response is likely to grow larger along with increasing non-response rate. Therefore the results of this study are not likely to be directly applicable to surveys with markedly larger non-response.
In conclusion, lower socioeconomic groups were to some extent over-represented among non-participants of the population-based FINRISK surveys compared with participants. However, within every socioeconomic group the non-participants had approximately two times higher all-cause and cardiovascular mortality than the participants. Only a small part of the excess mortality among non-participants was explained by their different socioeconomic distribution. It seems that non-participation in epidemiological studies does not significantly distort the RR estimates of socioeconomic indicators, but it does lead to a substantial underestimation of absolute risk of total or cardiovascular death.
What this study adds
Non-participants in epidemiological studies usually differ systematically from participants with regard to socioeconomic status, race and health behaviour.
This may result in bias in the estimates derived from these studies. The effect this bias has on relative socioeconomic differences in mortality is largely unknown.
We found that the socioeconomic gradient in all-cause mortality was similar for participants and non-participants (that is, lower groups had a higher risk of death), but the risk was at a higher level among non-participants.
Basing analyses on participants does not distort the RR of death associated with socioeconomic position. However, it does underestimate the absolute risk.
We thank the Finnish Foundation for Cardiovascular Research for providing funding for the project.
Competing interests: None declared.