Study objective: To assess the representativeness of survey participants by systematically comparing volunteers in a national health and sexuality survey with the Australian population in terms of self reported health status (including the SF-36) and a wide range of demographic characteristics.
Design: A cross sectional sample of Australian residents was compared with demographic data from the 1996 Australian census and health data from the 1995 National Health Survey.
Setting: The Australian population.
Participants: A stratified random sample of adults aged 18–59 years drawn from the Australian electoral roll, a compulsory register of voters. Interviews were completed with 1784 people, representing 40% of those initially selected (58% of those for whom a valid telephone number could be located).
Main results: Participants were of similar age and sex to the national population. Consistent with prior research, respondents had higher socioeconomic status, more education, were more likely to be employed, and less likely to be immigrants. The prevalence estimates, means, and variances of self reported mental and physical health measures (for example, SF-36 subscales, women’s health indicators, current smoking status) were similar to population norms.
Conclusions: These findings considerably strengthen inferences about the representativeness of data on health status from volunteer samples used in health and sexuality surveys.
Keywords: health surveys
Non-response is a problem for all studies that rely on volunteer samples, especially those exploring sensitive topics. In the broad literature on health surveys, participation bias has been explored by comparing respondents with non-respondents,1–6 early with late respondents,7–9 or drop outs in follow up studies with continuing participants.10,11 The results generally indicate that non-respondents are more likely to be older, male, non-white, of low socioeconomic status, and smokers and, importantly, to have poorer health status. A few studies have compared respondents with census or large scale population surveys to assess the demographic and health characteristics of their samples12,13 and have observed similar differences in the characteristics of volunteers.
In recent years, numerous population based studies have had a dual focus on sexuality and health. Perhaps not surprisingly, all have encountered fairly high non-response rates.14–18 In most of these studies the sample has been compared with the general population on a limited number of demographic characteristics, such as age, gender, and marital status. The underlying inference is that comparable demographics indicate the sample is representative on important dependent variables, such as health status and sexuality, although most researchers would acknowledge that this inference relies more on faith than scientific evidence.
Health status is known to be strongly correlated with sexual experience and function.19–23 For example, poor current health status is associated with sexual dysfunction, reduced sexual activity, and lower levels of relationship satisfaction, while, conversely, certain patterns of sexual behaviour are associated with adverse health outcomes such as those associated with sexually transmitted diseases. It would seem important therefore to assess the extent to which the health of volunteers in sexuality survey samples is representative of the population.
As part of a population based Australian survey of sexuality and health (the National Study of Health, Intimacy and Social Relations), we evaluated the representativeness of the study sample by comparing demographic and health related characteristics with a large scale national health survey and with national census data. Both comparison datasets were collected by the Australian Bureau of Statistics and had very high participation rates. Unlike previous studies, we were able to compare our participants with the population across a range of health measures, including the eight subscales of the MOS Short-Form 36 (SF-36) questionnaire,24 four women’s health indicators, and other measures such as smoking status and body mass index (BMI). This enabled a direct assessment of possible volunteer bias based on measures of health status.
This study used a cross sectional, telephone interview survey of a randomly selected sample of the Australian population. In October, 1999, a total of 4449 people were randomly selected from the entire Commonwealth electoral roll (enrolment to vote is compulsory in Australia) within the age categories of 18–29, 30–39, 40–49, and 50–59 years. The roll provided details on full name, age group, sex, and residential address. The frequencies selected within each age group were based on the age distribution of the Australian population according to the 1996 census data.
Australian White Pages and Desktop Marketing Systems Pty Ltd (Marketing Pro, Victoria) databases were searched to obtain the phone numbers of those selected. People who were listed in the directories were initially contacted by telephone to notify them of their selection in the study and to check their address details. Where a name and address match could not be found in the phone directories, other avenues for contacting these people were explored using clues obtained from the databases. For those who were found, a letter on University of Queensland letterhead was sent to their homes inviting them to participate; it included a description of the study and an outline of what would be required of them. A time frame was specified during which they would be contacted by telephone to conduct the interview, with the opportunity to change this time if it was not convenient. In addition, a coded set of responses was sent to each subject so that, for sensitive questions, the respondent could see the numerical response codes and use them to answer the interviewer anonymously. Those people who refused to participate were invited to answer some basic demographic questions such as employment status and income, as well as their general health status, so they could be compared with the survey participants.
All interviews were conducted by trained interviewers in a computer assisted telephone interview (CATI) laboratory, located at the University of Queensland, Brisbane, between 23 November 1999 and 2 April 2000 (excluding the Christmas and New Year period). Interviews were performed between 4 pm and 9 pm on Monday to Friday, and between 9 am and 8.30 pm on Saturday and Sunday. Participants’ responses were entered directly into a computer using a software package called Surveycraft (SPSS Technologies for Market Research), which checked the validity and consistency of responses as they were entered. If a respondent gave inconsistent or unlikely answers, the interviewer received a warning and was requested to verify the question response.
Participants took a mean of 42.3 minutes to fill out the questionnaire (standard deviation 10.3 minutes, range 14.5–107.6 minutes), which contained a number of validated instruments to measure general health status (SF-36),24 anxiety and depression (HADS),25 sexual related satisfaction and dysfunction,26 and history of sexual abuse.27 In addition there were questions on use of drugs and alcohol as well as standard demographic items adopted from the Australian census and the Australian National Health Survey (NHS).28 The questionnaire was subject to extensive piloting with students and public volunteers of representative ages and socioeconomic status (SES) groups before being used in the formal study. The study received ethical approval from the Behavioural and Social Sciences Ethical Review Committee of the University of Queensland.
To assess the representativeness of the sample of respondents, a comparison of their demographic characteristics was made with the general Australian population aged 18–59 years using the 1996 National Census data. A range of survey items regarding health status were compared with data from a large scale population based survey of the Australian population conducted in 1995 (National Health Survey: NHS).28–30 This survey obtained information (through personal interviews) from residents of a sample of 23 817 private dwellings (houses, flats, etc) and non-private dwellings (hotels, boarding houses, etc) selected at random using a stratified multi-stage area sample, which ensured that all segments of the population were represented. The selection methods also ensured that persons within each state and territory had a known and, in the main, an equal chance of selection in the survey. The survey resulted in a total of 53 828 interviews, reflecting a household response rate of 91.5%, and, unlike our study, should therefore not suffer from significant participation bias. For comparison with our study, we only used people from the NHS who were aged 18 to 59 years, which constituted a sample of 31 508. A General Health and Well-Being Form (MOS SF-36)24 was given to adults (aged 18 years and over) in approximately half the dwellings for self completion before administration of the main questionnaire. The response rate for the General Health and Well-Being component was 95%, resulting in a sample of 15 938 people aged 18 to 59 years (51% of total).29 A Women’s Health Supplementary Form was provided at the completion of their interview to female respondents aged 18 years and over who were not selected in the General Health and Well-Being sample. The response rate for the Women’s Health component was 93%, resulting in a sample of 7747 women aged 18–59 years (48% of total).
Variables that were comparable between the NHS and our study included weight, height, and BMI; self rated health status; smoking status; mean scores of the SF-36 subscales; history of breast cancer and hysterectomy; and two cancer screening practices (women only). Analysis of NHS data was weighted to reflect the appropriate sampling fraction of each respondent.
Comparisons of sociodemographic characteristics were also made between people in the different response categories to check for systematic differences between contact and response groups. Respondents and non-respondents were classified in terms of the degree of geographical remoteness from major urban centres and the socioeconomic status of their residential area based on postcode/zipcode. Remoteness was assessed using the Accessibility/Remoteness Index of Australia (ARIA), which uses distances to population centres as the basis for quantifying service access and classifies areas as highly accessible, accessible, moderately accessible, remote, and very remote.31 Socioeconomic status for residential locations was assessed using the Australian Socio-Economic Indexes for Areas (SEIFA), which uses census information on prevalence of low income earners, relatively lower educational attainment, high unemployment, rented dwellings, and people lacking fluency in English to calculate an Index of Disadvantage.32 This continuous scale is then categorised based on the Australian population values for the 10th, 25th, 50th, 75th, and 90th centiles.
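As an illustration of the centile based categorisation described above, the following minimal sketch assigns an area score to one of six bands. The cut point values, band labels, and the function name `seifa_band` are hypothetical and chosen only for illustration; they are not the published SEIFA centile values, and this is not necessarily how the classification was implemented in the study.

```python
import bisect

# Hypothetical score values at the 10th, 25th, 50th, 75th and 90th
# population centiles (illustrative only, not the published SEIFA values).
CENTILE_CUTS = [880.0, 940.0, 1000.0, 1060.0, 1110.0]

# Five cut points define six bands, ordered from most to least disadvantaged.
BANDS = ["<10th", "10th-24th", "25th-49th", "50th-74th", "75th-89th", ">=90th"]


def seifa_band(score, cuts=CENTILE_CUTS):
    """Return the centile band for a continuous index score.

    A score exactly equal to a cut point is placed in the higher band.
    """
    return BANDS[bisect.bisect_right(cuts, score)]


if __name__ == "__main__":
    for example_score in (850.0, 1000.0, 1150.0):
        print(example_score, seifa_band(example_score))
```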
The distributions of sociodemographic characteristics between the different categories of survey respondents were compared using the χ2 test of association. To compare the distribution of sociodemographic and health characteristics of study participants with the distributions in the population using the census and NHS data, the χ2 goodness of fit test was used. Crude odds ratios (OR) and 95% confidence intervals (CI) were used to assess the magnitude of the difference between health related characteristics of the sample with those from the NHS.
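The two calculations named above can be sketched as follows. This is a minimal illustration under stated assumptions: the confidence interval uses the standard log (Woolf) method, which the text does not specify; the function names are hypothetical; and the sketch ignores the sampling weights that were applied to the NHS data.

```python
import math


def chi_square_gof(observed_counts, population_props):
    """Chi-square goodness of fit statistic and degrees of freedom.

    observed_counts: counts in the sample for each category.
    population_props: reference proportions (for example, from a census),
    which should sum to 1.
    """
    n = sum(observed_counts)
    expected = [n * p for p in population_props]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
    return statistic, len(observed_counts) - 1


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with an approximate 95% confidence interval.

    a, b: sample members with and without the characteristic.
    c, d: reference group members with and without the characteristic.
    The interval uses the log (Woolf) method, an assumption of this sketch.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(chi_square_gof([60, 40], [0.5, 0.5]))
    print(crude_odds_ratio(30, 70, 20, 80))
```

The goodness of fit statistic would then be referred to a χ2 distribution with the returned degrees of freedom to obtain a p value.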
The overall contact rate of those selected from the electoral roll was 66%. The participation rate of those contacted was 61%. Full details of the proportions in each of the response categories are given in table 1. Of the 4449 names randomly selected from the electoral roll, 1793 (40%) successfully completed an interview. At the time of interview, nine of these were outside of the age criteria (that is, 18–59 years) and were thus excluded from further analysis, leaving a total of 1784 participants.
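The arithmetic linking these figures can be checked with a short sketch. The counts and rates are those reported above; the variable names are illustrative, and the contact and participation rates are entered as the rounded percentages given in the text.

```python
# Figures reported in the text.
SELECTED = 4449            # names drawn from the electoral roll
COMPLETED = 1793           # completed interviews
OUT_OF_AGE_RANGE = 9       # respondents outside 18-59 years at interview
CONTACT_RATE = 0.66        # rounded, as reported
PARTICIPATION_RATE = 0.61  # rounded, as reported

# Overall response rate computed directly from the counts.
overall_rate = COMPLETED / SELECTED

# The same rate approximated as contact rate x participation rate.
product_of_rates = CONTACT_RATE * PARTICIPATION_RATE

# Participants retained for analysis.
analysed = COMPLETED - OUT_OF_AGE_RANGE

if __name__ == "__main__":
    print(f"Overall response rate: {overall_rate:.1%}")
    print(f"Contact x participation: {product_of_rates:.1%}")
    print(f"Participants analysed: {analysed}")
```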
The distributions of sex, age, ARIA, and SEIFA between the various response groups are outlined in table 2. There was no sex difference between response groups; however, younger people were generally less likely to be contacted, were more often not available for interview, and were less likely to be listed in the white pages. Younger people were also less likely to refuse outright, and more likely to answer a few basic sociodemographic questions (that is, partial completers). There was very little difference between the degree of geographical remoteness of participants and non-participants (table 2), the only differences being that remote residents were less likely to be partial completers and those living in major urban centres (that is, highly accessible) were more difficult to contact. There was very little difference between the SES of response groups as measured by the SEIFA score, the main difference being that those of lower SES were slightly more difficult to contact (table 2).
Participants had a similar age and sex distribution to both the Australian census and the NHS (table 3). Marital status could not be compared with the census data, as the category “de facto” was not included in the corresponding census question. Compared with the NHS, participants were less likely to be married. Participants in our sex survey were significantly more likely to be born in Australia than in either the census or the NHS. It should be noted that both the census and NHS surveys had translators so that the interview was administered to residents who did not speak English. We, however, did not utilise translation of questionnaires and thus our target population could be considered to be English speaking Australian citizens. The respondents tended to have completed more years of education and were more likely to be currently studying than either the census population or NHS sample, and they were also significantly more likely to be currently employed (table 3). Compared with the Australian census, study participants were more likely to reside in areas of higher socioeconomic status although the geographical accessibility of participants’ residence was similar to the Australian population.
The health related characteristics of the study participants were compared with those of the NHS (table 4). The prevalence of current smoking was equivalent, although participants in our study were slightly less likely to be ex-smokers than people in the general population. The mean height of the participants was similar to the population, although study participants had significantly greater body mass than those in the NHS, with an OR of 1.8 (95% CI 1.5 to 2.2) associated with the highest category (≥30 kg/m2) compared with the lowest (<20 kg/m2).
On all measures of self reported health status, the differences between study respondents and the (much more representative) NHS were minimal. Among women, the prevalence of self reported breast cancer and hysterectomy, and the rate of screening for cancer of the cervix were very similar, although our survey participants may be more likely to have undertaken clinical screening for breast cancer.
Mean transformed scores for the eight subscales of the SF-36 were compared between study participants and the NHS (table 5), a higher score indicating better health. The eight subscales of the SF-36 are: “physical functioning”, “role limitations due to physical problems”, “bodily pain”, “general health perceptions”, “vitality/energy”, “social functioning”, “role limitations due to emotional problems”, and “mental health” (which covers psychological distress and well-being). In general, the health of the study participants was very similar to that of the population. The largest differences were poorer “role limitations due to physical problems” and poorer “social functioning” scores seen among study participants. Our respondents also had slightly more “role limitations due to emotional problems” but had a higher “mental health” score than the population. The differences in mean SF-36 scores between study participants and the population were less than one tenth of a standard deviation for six of the eight measures. It is also noteworthy that the variances of the measures from each of the samples were very similar.
In a large population based sex survey conducted in Australia, we have used a number of techniques to assess how representative our participants were of the population from which they were drawn. In a recent book on sex research methodology,33 discussants at a Kinsey Institute seminar agreed that data collection in this field had progressed much more quickly than had the careful development of methodology. Together with empirical analyses of bias and error in such surveys,34–36 it is clear that we should be cautious in interpreting the meaning of sex surveys without adequately testing the conditions under which measurements vary.
Working from a simple random sample of 4449 people drawn from the best available list of the Australian population (the Commonwealth electoral roll), we were successful in gaining telephone numbers for 69%. This rate is less than optimal, but is the best that could be achieved from exhaustive checking of up to date, publicly available databases. Those for whom no number could be found were similar to contactable subjects in terms of gender, geographical location, and socioeconomic status (see table 2). Predictably, younger people were significantly less likely to have listed telephone numbers.
Of those who were contacted, 39% refused to participate. This response rate of 61% is a little lower than the levels of participation recorded previously in surveys of this kind, which tend to vary in the range of 63%–73%.37 Differences were noted between our sample and the broader Australian population on marital status, country of birth, education, employment status, and socioeconomic index (table 3). It is possible that sexuality may be related to these factors, and thus, in extrapolating our estimates of the prevalence of sexual behaviour to the population, it will be necessary to adjust for these demographic differences. It should be noted, however, that the most recent Australian census with data available for this comparison was that of 1996. There may have been some small shifts in the characteristics of the Australian population between 1996 and the conduct of our survey in 1999, which may explain some of the differences observed.
The comparisons to NHS data are the most interesting aspect of this analysis. By using a validated instrument (the SF-36) and some specific questions about diseases, screening practices and smoking, we were able to assess which particular aspects of health are most related to non-response. In fact, the differences were minimal. The effect sizes for each of the eight SF-36 subscales (table 5) are all less than 0.15 of a standard deviation, the average being just 0.07. To put this in context, a difference in means of about one standard deviation distinguishes chronically diseased and healthy people,38 and people with mild disease are on average about half a standard deviation below the norm for a healthy population.39
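The effect sizes referred to above are standardised mean differences, which can be sketched as follows. This is an illustration only: the choice of the reference (population) standard deviation as the denominator is an assumption, since the text does not state which standard deviation was used, and the subscale values shown are hypothetical rather than taken from table 5.

```python
def standardised_difference(sample_mean, reference_mean, reference_sd):
    """Difference in means expressed in reference standard deviation units."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (sample_mean - reference_mean) / reference_sd


def mean_absolute_effect(effects):
    """Average of the absolute effect sizes across subscales."""
    return sum(abs(e) for e in effects) / len(effects)


if __name__ == "__main__":
    # Hypothetical (sample mean, reference mean, reference SD) per subscale.
    subscales = {
        "physical functioning": (86.0, 85.0, 20.0),
        "social functioning": (82.0, 85.0, 22.0),
        "mental health": (76.5, 75.0, 17.0),
    }
    effects = {
        name: standardised_difference(*values)
        for name, values in subscales.items()
    }
    for name, effect in effects.items():
        print(f"{name}: {effect:+.2f} SD")
    print(f"mean absolute effect: {mean_absolute_effect(list(effects.values())):.2f} SD")
```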
These findings contrast with “conventional” health surveys, which often find that people who refuse to participate have relatively poor current health status and more risky behaviours such as smoking.2,3,7–10,40 It may be, however, that such people are more willing to volunteer if there is a dual focus on sexuality and health. For example, volunteers for special sexuality surveys tend to have more risk taking, sensation seeking personalities and are slightly more (rather than less) likely to smoke cigarettes, drink alcohol, and have poorer health.35,41,42
The most pronounced demographic difference between the sample and general population was the under-representation of people born outside of Australia. One important consideration is that our sampling frame was the electoral roll. Although enrolment is compulsory for every person aged 18 years or over who is an Australian citizen and has lived at his or her current address for one month,43 enrolment rates are comparatively low among young people and those who speak languages other than English. Also, the high rate of Australian born participants in our sample may partially be attributable to a large proportion of non-Australian born residents not being enrolled to vote because of citizenship ineligibility.
Based on a comparison of the demographic and health characteristics between respondents and non-respondents, there were some expectable differences. The results do not directly resolve the issue of participation bias with respect to sexual behaviour measures, and other sources of bias may still be present, such as respondents being unwilling to disclose certain behaviours.44 However, in terms of estimating the prevalence of sexual characteristics, the greatest predictors are age, gender and general health,23,26,45–48 hence we propose that because of the small differences observed between our sample and the population on these variables, the estimates we obtain should be generalisable to the population.
We gratefully acknowledge all the women who conducted the telephone interviews as well as the participants. Special thanks also go to Ruth Armstrong, Ghazala Suleman and Andrea Lanyon for their assistance with conducting the study.
Funding: this study was funded by the Australian National Health and Medical Research Council.
Conflicts of interest: none.