Background To estimate prevalence and incidence of diseases through self-reports in observational studies, it is important to understand the accuracy of participant reports. We aimed to quantify the agreement of self-reported and general practitioner-reported diseases in an old-aged population and to identify socio-demographic determinants of agreement.
Methods This analysis was conducted as part of the AugUR study (n=2449), a prospective population-based cohort study in individuals aged 70–95 years, including 2321 participants with consent to contact physicians. Self-reported chronic diseases of participants were compared with medical data provided by their respective general practitioners (n=589, response rate=25.4%). We derived overall agreement, over-reporting/under-reporting, and Cohen’s kappa and used logistic regression to evaluate the dependency of agreement on participants’ sociodemographic characteristics.
Results Among the 589 participants (53.1% women), 96.9% reported at least one of the evaluated chronic diseases. Overall agreement was >80% for hypertension, diabetes, myocardial infarction, stroke, cancer, asthma, bronchitis/chronic obstructive pulmonary disease and rheumatoid arthritis, but lower for heart failure, kidney disease and arthrosis. Cohen’s kappa was highest for diabetes and cancer and lowest for heart failure, musculoskeletal, kidney and lung diseases. Sex was the primary determinant of agreement on stroke, kidney disease, cancer and rheumatoid arthritis. Agreement for myocardial infarction and stroke was most compromised by older age and for cancer by lower educational level.
Conclusion Self-reports may be an effective tool to assess diabetes and cancer in observational studies in the old and very old aged. In contrast, self-reports on heart failure, musculoskeletal, kidney or lung diseases may be substantially imprecise.
- GENERAL PRACTICE
- HEALTH STATUS
Data availability statement
Data are available upon reasonable request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What is already known on this topic
Self-reports of diseases in studies are known to have limitations. They have mostly been studied in younger cohorts and through comparison with diverse sources of medical data.
What this study adds
This study investigates the agreement of self-reports of the old and very old population in Germany with reports of general practitioners. The old aged are most frequently affected by the studied diseases and yet give inaccurate self-reports on heart failure, musculoskeletal, kidney or lung diseases, whereas self-reports on diabetes and cancer are more accurate. Agreement between these two sources of information is also affected by sociodemographic factors, such as sex, age and education level.
How this study might affect research, practice or policy
When data on self-reported diseases are used in a similar population group, the results of this study will help to interpret their accuracy.
As a population grows older, chronic diseases are not only burdens to the individual but also increasingly become a challenge to the healthcare system and society as a whole. Epidemiological longitudinal and panel studies have been designed to investigate older populations in terms of ageing and its impact on health.1–3 These studies often use questionnaire-based self-report measures of chronic disease.
Limitations of self-reports have been the subject of studies before: sociodemographic factors, illness perceptions, severity of symptoms and resources to understand a condition may impact consistency and accuracy of self-reports.4–7 Several studies focus on the validity and reliability of self-reports in different study populations and with different measures of agreement;5 8–19 however, there has only been a small number of such studies focused on Germany’s elderly population.
The German Altersbezogene Untersuchungen zur Gesundheit der Universität Regensburg (AugUR) study was established as a research platform to estimate the prevalence and incidence of chronic conditions and to understand associated risk factors for chronic health disabilities in the elderly.20 In the following AugUR substudy, we covered diseases and medical events, which are dealt with in general practice: hypertension, diabetes, myocardial infarction, heart failure, stroke, kidney disease, cancer, lung diseases and musculoskeletal diseases. All of them are either persistent or irreversible and, therefore, should become apparent in both the participants’ self-reports and the medical data provided by their health professionals. Sex, age, the status of living with a partner and education level might influence the participants’ awareness of the disease and, therefore, the degree of agreement with the diagnosis made by their general practitioners (GPs). Recognising sociodemographic factors can help to identify subgroups, which might benefit from better communication about diagnosed diseases.21 22 Limited awareness or knowledge of one’s own status of disease may lead to non-compliance with therapy as well as the oversight of warning signs of deterioration.23
Two questions were addressed in the following substudy analysis: (1) To what extent do self-reports of old-aged individuals agree with the information given by their GPs? (2) Are there sociodemographic determinants of agreement (age, sex, living status, education level)?
AugUR cohort study
The German AugUR study is a prospective study of the general elderly population in and around the city of Regensburg, Bavaria. The study region comprises ~347 000 inhabitants, including 45 000 residents aged 70 or older.20 From the latter, a random sample from population registries in Regensburg and selected communities of the county was identified (n=13 971) and invited for the baseline study centre visits in two consecutive, comparable surveys (AugUR-1 in 2013–2015 and AugUR-2 in 2017–2019). From a total of 13 522 contactable persons (n=449 deceased or moved outside the recruitment area before being invited), 2449 participated at the two AugUR baseline study surveys (n=1133 in AugUR-1 and n=1316 in AugUR-2, respectively), resulting in a net response rate of 18.1% (n=8171 did not respond at all, n=2902 actively refused participation). Further details on recruitment and response are presented in the online supplemental note. Consent and valid information to contact GPs was given by 2321 participants (figure 1).
For 788 AugUR-1 baseline participants, 3-year follow-up data were collected in 2016–2018. Between baseline and follow-up, 67 persons died, 37 moved away, whereas 3 did not agree to be recontacted (net response rate=788/(1133-67-37-3)=76.8%). The consecutive 6-year follow-up started in November 2019 and paused in March 2020 due to the COVID-19 pandemic after inclusion of n=123 participants (net response rate so far=72.4%).
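The net response rates quoted above can be re-derived from the counts given in the text. The following is a minimal arithmetic sketch; the variable names are chosen here for illustration and are not taken from the study's own data management.

```python
# Recruitment figures quoted in the text (baseline surveys AugUR-1 and AugUR-2).
invited = 13_971
not_contactable = 449            # deceased or moved away before being invited
participants = 1_133 + 1_316     # AugUR-1 + AugUR-2

contactable = invited - not_contactable
baseline_rate = participants / contactable

# 3-year follow-up of AugUR-1: deaths, moves and refusals of recontact
# are removed from the denominator.
eligible_followup = 1_133 - 67 - 37 - 3
followup_rate = 788 / eligible_followup

print(f"contactable persons: {contactable}")
print(f"baseline net response rate: {baseline_rate:.1%}")
print(f"follow-up net response rate: {followup_rate:.1%}")
```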
We consider the participants physically mobile and without major cognitive impairments, since they had to visit the study centre and actively take part in the study programme.
AugUR study programme and data management
The study programme, including a standardised in-person interview, was conducted in the study centre at the University Medical Centre Regensburg. AugUR focuses on ophthalmic diseases classified within the study24 and general chronic diseases evaluated through in-person-interview.20
The baseline questionnaire included an assessment of sociodemographic characteristics such as sex, age, living status and education level. Medical conditions and further medical history were evaluated at baseline and follow-up visits via self-report using the question ‘Has a physician ever diagnosed one of the following conditions?’. Possible answers were ‘yes’, ‘no’ and ‘I do not know’. The latter was excluded from the analysis.
Patterns of missing values of both self-reports and GP reports are presented in online supplemental table 1.
Some of the terms for diagnoses had to be adapted, so participants could better understand them. For example, the patients were asked about ‘heart weakness’ instead of the medical term ‘heart failure’, which was used in the questionnaires for GPs. Questionnaire data were transferred to an electronic case report form.20 For all participants, we evaluated the status of disease and the sociodemographic characteristics at their most recent interview. We included follow-up data; therefore, inconsistencies between baseline and follow-up information could arise. This is a common phenomenon among longitudinal data and needs to be dealt with.4 Adjustments had to be made if a participant reported a condition in an earlier interview and denied it during the following visit. Since the lifetime prevalences of diseases were asked for, and not the current prevalences of diseases, we set the disease status ‘yes’ for subsequent interviews and used that status in further analysis.
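The adjustment for inconsistent answers across visits can be sketched as follows. This is an illustration only: the function name and the handling of visits with only missing or 'I do not know' answers are assumptions made here, not the study's documented procedure.

```python
from typing import Optional


def lifetime_status(answers: list[Optional[str]]) -> Optional[str]:
    """Return the disease status used for analysis.

    `answers` holds one entry per interview in chronological order.
    Anything other than 'yes' or 'no' (for example None or
    'I do not know') is treated as missing, which is an assumption.
    """
    valid = [a for a in answers if a in ("yes", "no")]
    if not valid:
        return None
    # Lifetime prevalence: a diagnosis reported once stays 'yes',
    # even if it is denied at a later visit.
    return "yes" if "yes" in valid else "no"


print(lifetime_status(["yes", "no"]))   # reported at baseline, denied at follow-up
print(lifetime_status(["no", "no"]))
print(lifetime_status(["I do not know"]))
```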
The study complies with the 1964 Helsinki declaration and its later amendments. All participants provided informed written consent. For this substudy, participants who had given informed consent to contact their physicians were included.
AugUR study participants’ data collected from GPs
A total of 2321 AugUR participants gave informed consent to contact their physicians and provided valid contact information of their respective GPs (n=169). A standardised survey of the participants’ GPs was conducted between October 2020 and March 2021. Of the 169 contacted GPs, 62 responded and provided data on 589/2321 study participants (25.4%), forming the AugUR GP substudy population. Multicentric data from GPs were collected using the web-based Magana Trial Manager (MaganaMed GmbH, Regensburg, Germany). The GPs recorded the data directly online or provided the data on paper-based questionnaires, which were transferred to Magana Trial Manager by trained AugUR staff. The questions read: ‘According to your records, does this patient have a diagnosis of disease X?’. Additionally, due to the study design, the date of first diagnosis was requested for diabetes, myocardial infarction, heart failure, stroke, kidney disease and cancer. It is possible that a disease was diagnosed by a GP after the participant’s last chance of reporting it in an interview, as a proportion of participant interviews had been conducted in 2018 or earlier. If that was the case, the disease was considered absent. Raw data and further analysis on absent cases are found in online supplemental tables 2 and 3.
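The rule for diagnoses made after the last interview can be illustrated with a short sketch. The function name, the argument names and the treatment of a missing diagnosis date are hypothetical and were chosen here for illustration.

```python
from datetime import date
from typing import Optional


def gp_status_for_comparison(gp_has_diagnosis: bool,
                             first_diagnosis: Optional[date],
                             last_interview: date) -> bool:
    """GP-reported status as compared with the self-report.

    A diagnosis first made after the participant's last interview could
    not have been self-reported, so it is counted as absent.
    Assumption: if no diagnosis date is available, the GP report is
    taken at face value.
    """
    if not gp_has_diagnosis:
        return False
    if first_diagnosis is not None and first_diagnosis > last_interview:
        return False
    return True


# Hypothetical participant: last interview in 2018, diagnosis in 2020.
print(gp_status_for_comparison(True, date(2020, 5, 1), date(2018, 3, 15)))
```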
Self-reports and GP reports of up to 589 participants were compared (online supplemental table 4). Figure 2 gives an overview of the agreement parameters used in the following analyses and how they were determined. Concordance in absence or presence of disease between self-reports and GP reports was analysed by calculating overall agreement (figure 2). In addition, we identified over-reporters and under-reporters. An over-reporter is defined as a person reporting an illness that is not confirmed by the GP, while an under-reporter does not report an illness that is reported by the GP. Under-reporters/over-reporters and sensitivity/specificity can easily be translated into one another (figure 2). However, sensitivity and specificity are terms used when a gold standard is available25 and are, therefore, unsuitable for our analyses. To describe agreement between self-reports and GP reports in an omnibus index and control for agreement by chance, Cohen’s kappa was calculated. According to Landis and Koch’s classification for agreement adjusted by chance, we refer to kappa values between 0.81 and 1 as ‘almost perfect’, 0.61 to 0.80 as ‘substantial’, 0.41 to 0.60 as ‘moderate’ and 0 to 0.40 as ‘poor to fair’.26 27 As kappa can be influenced towards both higher and lower values by the distributions in the cross-table, especially in the marginal totals, we also calculated specific agreement in the form of positive and negative agreement (figure 2), as suggested by Hansen et al and used by Cicchetti and Feinstein.9 28 The proportions of specific agreement (ie, positive and negative agreement) estimate the conditional probability that, if one of the raters (GP or participant), randomly selected, makes a positive/negative rating, the other rater will do so as well.
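The agreement parameters described above can be computed from a 2×2 cross-table as in the following sketch. Kappa and positive and negative agreement use their standard formulas. Because figure 2 is not reproduced here, the denominators for over-reporting and under-reporting (GP-negative and GP-positive participants, respectively) are an assumption based on their stated correspondence to specificity and sensitivity. The counts are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class CrossTable:
    both_yes: int    # a: disease in self-report and GP report
    self_only: int   # b: self-report yes, GP report no (over-reporters)
    gp_only: int     # c: self-report no, GP report yes (under-reporters)
    both_no: int     # d: disease in neither report


def agreement_measures(t: CrossTable) -> dict:
    a, b, c, d = t.both_yes, t.self_only, t.gp_only, t.both_no
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, derived from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return {
        "overall_agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
        "positive_agreement": 2 * a / (2 * a + b + c),
        "negative_agreement": 2 * d / (2 * d + b + c),
        # Assumed denominators: GP-negative and GP-positive participants.
        "over_reporting": b / (b + d),
        "under_reporting": c / (a + c),
    }


def landis_koch(kappa: float) -> str:
    """Label kappa using the categories quoted in the text."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    return "poor to fair"


# Hypothetical counts, for illustration only.
example = CrossTable(both_yes=40, self_only=10, gp_only=20, both_no=130)
measures = agreement_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
print(landis_koch(measures["kappa"]))
```

The sketch does not guard against empty cells or margins; a table with no GP-positive participants, for instance, would raise a division error and would need separate handling.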
Disease frequencies and total numbers of overall agreement were assessed, stratified by the independent variables sex, age (‘old’ vs ‘very old’, split at the median age of 79.03 years), the status of living with a partner and education level. Because, in Germany, 8 years of education are often followed by vocational training and >8 years qualify for higher education, we set 8 years of school education as the threshold to form the two groups.
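The two dichotomisations can be sketched as below. The ages and years of schooling are invented for illustration, and assigning a participant exactly at the median to the ‘old’ group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median

# Hypothetical participants: (age in years, years of schooling).
participants = [(71.2, 8), (74.8, 10), (79.0, 8), (83.5, 13), (90.1, 8)]

age_cut = median(age for age, _ in participants)

for age, school_years in participants:
    # Assumption: participants at or below the median count as 'old'.
    age_group = "very old" if age > age_cut else "old"
    education = ">8 years" if school_years > 8 else "<=8 years"
    print(f"age {age}: {age_group}, education {education}")
```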
In logistic regression analyses, we used overall agreement as the dependent variable (participant overall agrees with GP: coded as 1; participant does not overall agree with GP, that is, is under-reporter or over-reporter: coded as 0), with sex, age (in years, linear), status of living with a partner and education level as independent variables. Due to limited sample size and partially low prevalence of diseases, separate analyses of true-positives and true-negatives or under-reporters and over-reporters would have led to unreliable results. A p value of <0.05 was used as criterion for statistical significance.
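The outcome coding and the meaning of an odds ratio for agreement can be illustrated with a simplified sketch. It computes only an unadjusted odds ratio for one binary predictor (sex), which equals the exponentiated coefficient of a logistic model containing that predictor alone. It is not the multivariable model fitted in the study, and all counts are hypothetical.

```python
def code_agreement(self_report: bool, gp_report: bool) -> int:
    """1 if participant and GP agree, 0 for over-reporters and under-reporters."""
    return 1 if self_report == gp_report else 0


def unadjusted_odds_ratio(records: list[tuple[bool, int]]) -> float:
    """Odds of agreement in men divided by odds of agreement in women.

    `records` holds (is_male, agrees) pairs, with agrees coded 0/1.
    """
    counts = {(sex, y): 0 for sex in (True, False) for y in (0, 1)}
    for is_male, agrees in records:
        counts[(is_male, agrees)] += 1
    odds_men = counts[(True, 1)] / counts[(True, 0)]
    odds_women = counts[(False, 1)] / counts[(False, 0)]
    return odds_men / odds_women


# Hypothetical data: 80 of 100 men and 90 of 100 women agree with their GP.
records = ([(True, 1)] * 80 + [(True, 0)] * 20
           + [(False, 1)] * 90 + [(False, 0)] * 10)
odds_ratio = unadjusted_odds_ratio(records)
print(f"unadjusted OR (men vs women): {odds_ratio:.3f}")
```

An odds ratio below 1 in this coding means that men are less likely than women to agree with their GP.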
Data management and statistical analyses were performed using SAS 9.4 software (SAS Institute, Cary, North Carolina), IBM SPSS Statistics for Windows, V.126.96.36.199 (IBM, Armonk, New York) and Microsoft Excel V.2019.
Characteristics of GP substudy individuals and AugUR study population
The characteristics of all AugUR participants and the substudy sample with available GP data showed comparable distributions (table 1).
The median time gap between the self-report and the GP report was 2.66 years (range: 0.78–7.73 years) for all 589 participants (table 1).
Frequency of diseases in self-reports and GP reports
96.9% of the 589 participants reported at least one of the depicted chronic diseases. For most diseases, the frequency was higher in self-reports than in GP reports, for example, for myocardial infarction, kidney disease, lung diseases and musculoskeletal diseases. For diabetes and hypertension, GP reports showed higher frequencies than self-reports (table 2).
Self-reported disease frequencies stratified by sex, old versus very old age, status of living with a partner and education level show that men more often report diabetes, myocardial infarction, stroke, kidney diseases, cancer and COPD/chronic bronchitis. Women more often claim to suffer from hypertension, heart failure, asthma, rheumatoid arthritis and arthrosis (online supplemental figure 1). Very old participants more often report all chronic diseases except kidney diseases and asthma. Participants who lived with a partner more often self-reported cancer and asthma. Participants who had been educated for >8 years more often reported heart failure, stroke and rheumatoid arthritis (online supplemental figure 1).
Agreement on diseases between self-reports and GP reports
As one of our goals was to determine how well self-reports depict the actual disease status of our participants, we analysed to what extent the participants’ self-reports align with their GPs’ records. Overall agreement, over-reporting, under-reporting, kappa, and positive and negative specific agreement were evaluated.
High overall agreement of >80% was found for hypertension, diabetes, myocardial infarction, stroke, cancer, asthma, chronic bronchitis/COPD and rheumatoid arthritis. Overall agreement of <80% was found for heart failure, kidney disease and arthrosis (table 2). High proportions of over-reporting were seen for hypertension (35.5%), kidney disease (25.6%) and arthrosis (59.5%). High proportions of under-reporting were seen for heart failure (57.9%), kidney disease (57.1%), chronic bronchitis/COPD (54.8%) and rheumatoid arthritis (53.8%) (table 2).
Positive agreement of >80% was found for hypertension and diabetes (table 2). Negative agreement of >80% was found for most conditions, except for hypertension, kidney disease and arthrosis (table 2).
Cohen’s kappa was substantial for diabetes and cancer. Moderate agreement in observed data was found for myocardial infarction, stroke, asthma and hypertension. Poor to fair agreement was observed for chronic bronchitis/COPD, heart failure, rheumatoid arthritis, arthrosis and kidney disease (table 2).
Analysis of age, sex, living status and education level
We depicted overall agreement, stratified by sex, age, status of living with a partner and education level (online supplemental figure 1).
Regression analyses showed that men were significantly less likely than women to report the same as their GP regarding stroke (OR=0.391, p value=0.017), kidney disease (OR=0.606, p value=0.011) and cancer (OR=0.528, p value=0.019) (table 3).
For rheumatoid arthritis, the opposite effect was found, as women were significantly less likely to agree with their GPs (OR=2.276, p value=0.005) (table 3).
Older participants were significantly less likely to agree with their GPs on myocardial infarction (OR=0.931, p value=0.031) and stroke (OR=0.927, p value=0.022) (table 3). The regression analysis also showed that participants with low education of ≤8 years of schooling were significantly less likely to agree with their GPs on their cancer status (OR=1.927, p value=0.013) (table 3).
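Because age entered the model in years as a linear term, the odds ratios for age are read here as applying per additional year of age; that reading is an assumption drawn from the methods description. Under it, the effect over a wider age span follows by exponentiation, as this short sketch shows.

```python
# Odds ratios for age as reported in the text (assumed to be per year of age).
or_per_year = {"myocardial infarction": 0.931, "stroke": 0.927}

years = 10  # illustrative age difference, chosen here
or_per_decade = {disease: value ** years for disease, value in or_per_year.items()}

for disease, value in or_per_decade.items():
    print(f"{disease}: OR over {years} years = {value:.2f}")
```

A per-year odds ratio close to 1 can therefore still correspond to roughly halved odds of agreement across a 10-year age difference.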
The status of living with a partner did not show significant association with overall agreement.
Our study highlights opportunities and challenges of interview-based self-reports on diseases and medical events in the population aged 70+. We found high agreement of self-reports and GP reports for diabetes and cancer. Therefore, self-reports are an effective tool to assess these diseases in observational studies in the old and very old population. Low agreement for heart failure as well as for musculoskeletal, kidney or lung diseases indicates substantial imprecision when relying on self-reports. Our association analyses showed that being male, being of very old age or having received 8 years or less of school education was associated with lower agreement and, thus, with more inaccurate information concerning the self-reported disease status, which can induce differential misclassification.
The data presented here showed the highest overall agreement and kappa values between self-reports and GP reports for diabetes and cancer. This can be explained by both diseases requiring intensive intervention and being very present in a person’s daily life: to lower the risk of organ damage, participants with diabetes need to adjust their lifestyles in terms of diet and exercise or require medication.29 Furthermore, in Germany, a high proportion of patients suffering from diabetes is enrolled in disease management programmes, which include educational events and training, continuous monitoring by GPs and repeated screening for organ damage by ophthalmologists and by physicians specialised in internal medicine. A cancer diagnosis often causes physical and mental strains,30 leading to higher awareness of the disease. Our results are in line with other studies (table 4).
However, the agreement on cancer was not regularly addressed and varied among studies with kappa values from 0.33 to 0.67 with our kappa value of 0.66 on the top end. In comparably old study populations, similar values were found (table 4).
In AugUR, overall agreement was high, but Cohen’s kappa was moderate for myocardial infarction and stroke, which were both mainly under-reported. Myocardial infarction and stroke can be life-threatening events. However, not all events may have been explicitly explained by the physician, leading to a lack of awareness. Comparable results are seen in other studies with kappa values ranging from 0.33 to 0.80 and 0.36 to 0.71 for myocardial infarction and stroke, respectively (table 4). The wide range of kappa values may be caused by the variety of terms used to ask about both diseases: for example, while we asked participants about a ‘brain attack’, other studies also differentiated transient ischaemic attacks. Variability in the severity of the events between study populations may also explain some of the differences in reported agreement, since more threatening events might lead to higher awareness and, thus, higher agreement.
In our study, agreement for asthma was moderate, while agreement for chronic bronchitis/COPD was only poor to fair. Lung diseases were asked about in different ways across studies, some asking for either asthma or COPD and some for chronic lung diseases in general, but reported agreement was comparable to our results for asthma (table 4). Previous reports indicated that 50% of patients with asthma believed they only had asthma when they were currently experiencing symptoms,31 which might explain our finding of high under-reporting.
For hypertension, our kappa value of 0.50 was well within the range of other reports (table 4). We found high over-reporting and low negative agreement, indicating that the participants were more likely to report a diagnosis not confirmed by their GP than to deny one. In the literature, however, under-reporting of hypertension is predominantly documented.6
We observed low kappa values for heart failure, kidney disease and musculoskeletal diseases (table 4). Throughout the literature, heart failure is an example of rather poor agreement between self-reports and GP reports or medical records. Low agreement may be due to the complexity of the heart failure diagnosis and communication with patients.5 In contrast, published data on the agreement for kidney disease are limited. Two studies showed kappa values of 0.40 and 0.47 (table 4), which are high compared with our kappa of 0.15, possibly due to the younger age of their study participants with lower kidney disease frequency and the setting of interviewing hospitalised participants. For musculoskeletal diseases such as rheumatoid arthritis and arthrosis, we found poor to fair agreement in line with the literature (table 4). Fluctuating symptoms of joint diseases and the tendency to treat them without consulting a physician32 may explain low agreement. As we see especially high proportions of over-reporting for arthrosis, while rheumatoid arthritis is under-reported more often, it is possible that patients confuse the two diseases with one another.
Agreement of self-reported disease with GP reports in the general population provides insights into awareness of having a disease or of not having a disease. Awareness of disease is a prerequisite for compliance with treatment plans and lifestyle changes33–35; awareness of not having a disease documents a general understanding of one’s own health status.
In stratified analyses, overall agreement might be higher for the group in which disease frequency is lower compared with the other group (eg, men vs women), so subgroup disease frequencies must be considered in the interpretation of the results. According to our data, men are more likely to overall agree with their GPs on rheumatoid arthritis; however, disease frequencies are much lower in that group (9.6% in men vs 17.0% in women), so significant association may be influenced by that difference, as it might be easier to correctly report the absence of a disease than the presence of a disease. While we found that men were less likely to agree with their GPs on stroke, kidney disease and cancer, there is no consensus on the influence of sex on agreement for these diseases in the literature.9–11 14 15
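The dependence of overall agreement on disease frequency can be made concrete with a small sketch. The two frequencies are those quoted above for rheumatoid arthritis. The reporting accuracy values (half of all cases under-reported, 5% of non-cases over-reported) are invented and deliberately set equal in both groups; they are not study results.

```python
def expected_overall_agreement(frequency: float,
                               under_reporting: float,
                               over_reporting: float) -> float:
    """Expected share of participants whose self-report matches the GP report."""
    agree_when_diseased = frequency * (1 - under_reporting)
    agree_when_healthy = (1 - frequency) * (1 - over_reporting)
    return agree_when_diseased + agree_when_healthy


# Identical, hypothetical reporting behaviour in both groups.
under, over = 0.50, 0.05

men = expected_overall_agreement(0.096, under, over)
women = expected_overall_agreement(0.170, under, over)
print(f"men:   {men:.3f}")
print(f"women: {women:.3f}")
```

With identical reporting behaviour, the group with the lower disease frequency still shows higher overall agreement, because most of its agreement comes from correctly reported absence of disease.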
Our findings of older age being associated with lower agreement for myocardial infarction and stroke were in accordance with a general decline of agreement by age reported in other studies.8–12 18
Living with a partner was found to improve health awareness36 and, thus, to potentially increase agreement. We did not find this status to be significantly associated with agreement and none of the discussed studies evaluated the influence of living with a partner.
Our finding that participants with higher education of >8 years were more likely to agree with their GPs regarding cancer indicates a higher ability to comprehend this diagnosis. Other studies also found higher education to be positively associated with higher agreement.8 12 14 The fully adjusted model with sex, age and living with a partner as covariables showed that the education effect on overall agreement for cancer self-report is not confounded by the other aspects.
Strengths and limitations
We need to acknowledge a limitation caused by the low response rate of GPs (26.0%), which is partly because we refrained from sending reminders to GPs in the middle of the COVID-19 pandemic. To address potential selection bias, we compared the characteristics and disease count of our study subgroup with those of AugUR participants not represented in our GP substudy. As a random selection of practitioners gave information on the disease status, we consider the subgroup sample not to be subject to selection bias. Yet, our sample from Regensburg and selected communities of the county may not be representative of other regions. The median time gap of 2.66 years between self-reports and GP reports is a limitation as well; however, we tried to minimise it by including all available follow-up information. Furthermore, our use of lay language for diseases in participants’ questionnaires may have increased disagreement.9 14 37 Regarding the exclusion of missing values, small numbers were found for most diseases in both self-reports and GP reports. However, in GP reports, we found more missing values for diseases where treatment by a specialist rather than the GP is plausible, that is, lung diseases and musculoskeletal diseases.
This study’s major strength is its focus on old and very old individuals. The old and very old population is difficult to address in observational studies of wider age range and, therefore, often under-represented. This is in sharp contrast to the fact that the chronic diseases studied here are more frequent in old age. We tailored the study programme to the needs of the elderly by employing a leaner questionnaire and allocating additional time for walking around the study premises as well as for answering questions. The standardised face-to-face interviews with participants, rather than the use of unsupervised questionnaires, are also a strength of our study, as is the wide variety of diseases evaluated here.
Patient consent for publication
This study involves human participants and was approved by the Ethics Committee of the University of Regensburg, Germany, vote 12-101-0258. Participants gave informed consent to participate in the study before taking part.
The authors greatly appreciate the outstanding and committed study assistance of Lydia Mayerhofer, Magdalena Scharl, Sabine Schelter and Josef Simon. We thank Miriam Stoffregen and Caroline Kästner for critically reading the manuscript. Moreover, we thank all study participants for contributing to the AugUR study. We would like to express our special thanks to the general practitioners who made this work possible by providing data on diagnoses.
Contributors All authors have contributed to interpreting results and manuscript writing. All authors have read and approved the manuscript. Further contributions are: ABS: manuscript design, statistical analyses. MEZ: data management. FJD: data curation. AD: study physician. CB: study physician. MK: study support. JL: study support. IMH: study PI, project supervision, manuscript design. KJS: study coordination, project initiation and supervision, data management, statistical analysis, manuscript design. KJS is the guarantor of the paper.
Funding The AugUR study was supported by grants from the German Federal Ministry of Education and Research (BMBF 01ER1206 and BMBF 01ER1507) to IMH, by the German Research Foundation (DFG HE 3690/7-1 and BR 6028/2-1) to IMH and CB and by institutional budget (University of Regensburg).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.