J Epidemiol Community Health 54:221-226 doi:10.1136/jech.54.3.221
  • Theory and methods

Validation of self diagnosis of high blood pressure in a sample of the Spanish EPIC cohort: overall agreement and predictive values

  1. M-J Tormo,
  2. C Navarro,
  3. M D Chirlaque,
  4. X Barber,
  5. the EPIC Group of Spain*
  1. Consejería de Sanidad y Política Social, Murcia, Spain
  1. Dr M-J Tormo, Servicio de Epidemiología, Consejería de Sanidad y Política Social, Ronda de Levante 11, E-30008 Murcia, Spain
  • Accepted 31 May 1999


STUDY OBJECTIVE High blood pressure is related to several chronic conditions, but its repeated measurement in large cohort studies is often not feasible, so investigators have to rely on the subjects' self reports. The aim of the study is to validate such self diagnosis in a sample of members of the Spanish EPIC cohort study.

DESIGN Comparison of high blood pressure self diagnosis with the information in the personal medical record kept at the primary health centre of reference for that population.

SETTING A small town near the EPIC-Murcia centre, one of five Spanish EPIC centres located in the south east, where inclusion in the cohort was offered to the general population.

PARTICIPANTS The agreement between self reported high blood pressure status and data from medical records was measured in a representative sample of men and women (n= 248) aged 30–69 years. Medical records were studied for a diagnosis of high blood pressure, an anti-hypertensive pharmacological treatment or the subject's inclusion in a hypertension control programme run in the medical centre only for hypertensive people (definite high blood pressure cases). In addition, in the absence of such a diagnosis, medical annotations of systolic or diastolic blood pressure ⩾140/90 mm Hg (possible high blood pressure cases) were considered. Sensitivity, specificity, positive and negative predictive values and κ scores were calculated for all, definite and possible high blood pressure cases. Variables associated with the probability of having a true positive or true negative self report of high blood pressure were also tested.

MAIN RESULTS As expected, sensitivity was higher among definite cases (72.7%) than among possible cases (31.6%). Accordingly, the agreement between self report and medical record was higher for definite cases (κ = 0.65) than for possible cases (κ = 0.29), leading to a moderate overall agreement for all cases (κ = 0.58; 95% CI: 0.47, 0.69). Having some level of education (OR: 0.31; 95% CI: 0.09, 1.05) was negatively associated with a true self report of high blood pressure, while being female was positively associated (OR: 4.01; 95% CI: 1.04, 16.8). No variable showed any association with having a true self report of being normotensive.

CONCLUSIONS High blood pressure self report shows a moderate agreement with medical information in this cohort allowing it to be used, with caution, as a surrogate variable of actual blood pressure status. However, because of its moderate sensitivity, it is not possible to rule out some underestimation when using self reported high blood pressure information for high blood pressure frequency measurements such as prevalence or incidence rates. This underestimation will be higher among men and educated people.

Prospective epidemiological studies often require a measure of presumed risk factors in large groups of people. Moreover, a re-assessment of the level of exposure seen at baseline is commonly needed during the follow up period. High blood pressure is related to many chronic conditions, but its actual measurement in large epidemiological studies, which often needs different observers, carries logistic difficulties regarding quality control that have not been resolved to date.1-6 In these circumstances it is necessary to rely on cohort members' self reported information on blood pressure levels or hypertensive status. Validity always needs to be considered when the end point of an analysis depends on such self reported data. Several factors can influence the validity of a self report: difficult or selective recall, unawareness of the diagnosis and unwillingness to report the condition are considered in most of the works that have specifically approached the problem.7-10 A practical way to evaluate self report validity is to compare the reported condition against the information provided by the medical record,8-12 mostly because, in practice, hypertension is a medical definition, subject to variability according to subjective medical criteria, more than an objective end point. The Spanish EPIC cohort13 forms part of the European EPIC study,14 a prospective investigation to elucidate the role of diet in several chronic diseases including cancer, hypertension, coronary heart disease and stroke. The objective of this study is to evaluate self reported high blood pressure, one of the items in the questionnaire, which has been used in baseline analysis and will be the source of identification of incident cases of high blood pressure during the follow up.



The Spanish EPIC cohort is composed of 25 813 women and 15 634 men aged 29–69 years recruited from five Spanish Autonomous Communities, three from the north (Asturias, Navarra and Guipuzcoa) and two from the south (Murcia and Granada).13 The participants are healthy volunteers from different social sectors, selected from urban and rural areas. The majority are active blood donors and, to a lesser extent, industrial workers, civil servants or members of the general population. The recruitment period started in November 1992 and lasted an average of three years depending upon the centre; the last subject was recruited in July 1996. Information about dietary intake was obtained through personal interviews using the dietary history method, by means of a computerised questionnaire specially designed for this purpose and previously validated.15-17 A questionnaire of non-dietary variables was also administered, covering lifestyle factors such as tobacco consumption, occupational physical activity, leisure time physical activity and level of education, and some self reported ailments or illnesses (cancer, myocardial infarction, stroke, diabetes, hyperlipidaemia, etc). One of these previous conditions was increased blood pressure. The questionnaire asked: “Have you ever been told by a physician that you have or have had high blood pressure?”. In the case of an affirmative answer, the questionnaire asked the age at that time and whether the person received any treatment because of this diagnosis.


From the EPIC cohort of Murcia (n= 8523) we selected for the validation study only those members coming from a small town (Alcantarilla) near the local coordinating centre. The reasons for this selection were, on the one hand, that they were volunteer members of the general population (not blood donors, as most members of the EPIC cohort were) and, on the other hand, that the town's health centre was a centre of reference for the primary health training of medical graduates. Thus, better organisation, including improved filing and retrieval of medical records, could be expected.

Sample size was calculated on the basis of the observed prevalence of high blood pressure in the entire cohort (about 20%), for a precision of ± 4% and a confidence level of 95%, using Epiinfo 6.0 statistical software.18 Three hundred and fifteen people (n=315) were randomly selected from all 1402 EPIC cohort members recruited in Alcantarilla. During recruitment, cohort members gave written permission for review of their medical records throughout the follow up period (minimum 10 years) and provided the names of their primary health physicians. Because high blood pressure is a condition for which diagnosis, treatment and control are mainly done at the primary health care level in Spain, no attempt was made to retrieve information from the different hospitals of reference.
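The sample size reasoning above can be sketched with the standard formula for estimating a proportion in a finite population. This is only an illustration: the function name is hypothetical, and the use of the finite population correction with N = 1402 is an assumption, because the text does not state which Epiinfo options were used. Under these assumptions the formula gives a minimum of about 302 people, somewhat fewer than the 315 actually selected; the text does not explain the difference.

```python
import math
from statistics import NormalDist


def sample_size_proportion(p, d, population, confidence=0.95):
    """Minimum sample size to estimate a proportion p within +/- d.

    Hypothetical helper using the usual finite-population formula;
    it is NOT claimed to be the exact routine run in Epiinfo 6.0.
    """
    # Two-sided standard normal quantile (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_term = z ** 2 * p * (1 - p)
    numerator = population * variance_term
    denominator = d ** 2 * (population - 1) + variance_term
    return math.ceil(numerator / denominator)


# Values taken from the text: prevalence about 20%, precision +/- 4%,
# 1402 cohort members recruited in Alcantarilla (assumed population).
n_min = sample_size_proportion(p=0.20, d=0.04, population=1402)
print(n_min)  # 302
```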

For people in this sample, having ever been told of having high blood pressure was considered a definite diagnosis of high blood pressure if: (a) a medical diagnosis of high blood pressure appeared in the medical record, (b) in the absence of such a diagnosis, the medical record showed evidence of regular use of antihypertensive drugs or (c) the person was included in a hypertension control programme run in the centre only for hypertensive people, after being referred by their physician. A possible diagnosis of high blood pressure was considered when the person had more than one consecutive reading of systolic/diastolic blood pressure ⩾140/90 mm Hg but no medical diagnosis of high blood pressure was explicitly written in the medical record.

A trained health worker actively sought the medical records of the entire sample and blindly reviewed those found. Spain has a social security system that gives health coverage to almost the entire population (more than 95%) and assigns a specific primary health physician to each registered person. Thus, for those people whose medical records were not available, we first retrieved the information given for that cohort member during the interview or, if no information had been provided, we asked the regional social security headquarters for the physician officially assigned. With this information we asked the physician directly whether he or she knew that person, to rule out any loss of medical records.


Firstly, the search for medical records is described, including the people found, the reasons for not finding medical records, the confirmation of high blood pressure and the different sources that provided information.

Secondly, the selected sample was characterised according to whether or not medical records were found and, among those people whose medical records were found, according to their self reported high blood pressure status (yes/no), considering the main demographic characteristics (gender, age groups, level of education: less than primary schooling and primary or higher) and previous self reported conditions (cardiovascular: myocardial infarction, stroke, angina, other cardiovascular illnesses; and diabetes).

Thirdly, the medical record diagnosis for each condition (yes or no) was cross tabulated with the self reported answer (yes or no) obtained from the questionnaire. The analysis of the agreement between the two sources of information was based first on considering definite and possible diagnoses separately and then on combining both categories. Agreement was considered across several demographic variables (gender, age group, level of education). Furthermore, lifestyle characteristics were also taken into account in the analysis: current smoking (at least one cigarette a day), current alcohol drinking (any amount higher than 0 g/day), obesity estimated by actual measurement of height and weight and expressed in terms of body mass index (BMI: weight in kg/height in m²; normal: BMI less than 25; overweight: BMI greater than 25 and less than 30; obesity: BMI greater than 30), and previous self reported conditions. The validity of the self reported questionnaire data compared with the medical records was expressed in terms of sensitivity (true positives correctly identified/all true positives), specificity (true negatives correctly identified/all true negatives), positive predictive value (true positives correctly identified/all positives identified by the questionnaire data), and negative predictive value (true negatives correctly identified/all negatives identified by the questionnaire data). κ scores were calculated to determine the agreement between self reported questionnaire data and medical records. As in previous works8 19 20 we considered a κ score less than 0.40 as poor to fair agreement, 0.41–0.60 as moderate agreement, 0.61–0.80 as substantial agreement, and 0.81–1.00 as almost perfect agreement.
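The four validity measures and the κ classification defined above can be sketched as follows. The function names are hypothetical. The cell counts are not printed as a 2×2 table in the text; they are derived here from totals reported in the Results (248 records found, 68 self reports, 85 medical record diagnoses, 54 with both), so they should be read as a reconstruction. A κ of exactly 0.40 falls between the published bands and is assigned to "poor to fair" here as an assumption.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity of self report, taking the medical record as reference."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives / all true positives
        "specificity": tn / (tn + fp),  # true negatives / all true negatives
        "ppv": tp / (tp + fp),          # true positives / all self reported positives
        "npv": tn / (tn + fn),          # true negatives / all self reported negatives
    }


def agreement_label(kappa):
    """Map a kappa score to the agreement bands used in the paper."""
    if kappa <= 0.40:   # assumption: 0.40 itself counts as poor to fair
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Reconstructed counts: 54 both positive, 68 - 54 = 14 self report only,
# 85 - 54 = 31 medical record only, 248 - 54 - 14 - 31 = 149 both negative.
measures = validity_measures(tp=54, fp=14, fn=31, tn=149)
for name, value in measures.items():
    print(f"{name}: {100 * value:.1f}%")

print(agreement_label(0.58), agreement_label(0.65), agreement_label(0.29))
```

With these reconstructed counts the sketch gives 63.5%, 91.4%, 79.4% and 82.8%, the same values as reported in the Results for all cases.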

Following the analysis performed by Haapanen et al 8 on the agreement of self reported chronic conditions, including hypertension, and to explore and summarise the association of several explanatory variables (gender, age group, level of education, smoking status, alcohol drinking, obesity and previous illnesses) with the agreement between self reported and medical record information, we fitted logistic regression models separately for the true positive and for the true negative reports of hypertension. The results are expressed as unadjusted odds ratios for agreement between the self administered questionnaire and the medical record information. Analyses were performed with SPSS 7.5 21 and Epiinfo 6.0 18 and all tests are two sided. For logistic regression coefficients, given the small number of cases in most cells, exact limits for the confidence intervals have been calculated.
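For a single binary explanatory variable, the odds ratio from an unadjusted logistic regression equals the cross-product ratio of the corresponding 2×2 table, which the sketch below computes. Two things are assumptions and not part of the study: the counts are made up purely for illustration, and the interval uses Woolf's logarithmic approximation, which is not the exact method the authors describe and can differ noticeably when cells are small.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with an approximate (Woolf) 95% interval.

    a, b: correct / incorrect self reports in the group of interest
    c, d: correct / incorrect self reports in the reference group
    Hypothetical helper; the paper used exact limits, not this method.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up counts for illustration only (NOT data from the study).
odds_ratio, lower, upper = odds_ratio_woolf(a=30, b=10, c=12, d=16)
print(f"OR = {odds_ratio:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```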


key points
  • Self report of high blood pressure (HBP) is often requested in health surveys but its validity is seldom evaluated

  • Comparing self reported HBP with its diagnosis in medical records we have found a moderate agreement (κ=0.58)

  • HBP self report is, in this study, more specific (91.4%) than sensitive (63.5%)

Of the 315 people selected, it was possible to access the medical records of 248 (78.7%) (table 1). The reasons for not finding the records of a fifth of those originally selected were: (a) the person had a physician but no medical record, meaning that no first visit between that person and the assigned primary health physician had ever taken place, and (b) the person had no primary health physician assigned. While 68 (27.4%) people had a self report of high blood pressure in their EPIC questionnaire, 85 (34.3%) had a high blood pressure diagnosis in their medical record; of this group, 66 cases (26.6%) fulfilled the criteria for definite and 19 (7.7%) for possible high blood pressure. Both self report and medical record of high blood pressure were found in 54 cases, leading to a confirmation rate of 79.4%. Of those cases where medical information was found, 54 had confirmation from more than one source, 19 only through medical recording of actual blood pressure levels (but without a medical diagnosis of high blood pressure), 11 only through the medical record and one only from the hypertension control programme.

Table 1

Reference population and sample used in the validation study: case ascertainment and sources of information (%). EPIC-Murcia cohort

As seen in table 2, people whose medical records were not found were more likely to be men, younger and with primary schooling or higher. Conversely, almost all medical records for people with a previous medical diagnosis of diabetes were found. Among those people whose medical records were found, the proportion of self reported hypertensives was more than twice as high in women (13.6% in men and 30.4% in women). Most of the hypertensives were in the upper age bands (aged ⩾55) and without any formal level of education. Thus, the prevalence of hypertension was 40.7% among people aged ⩾55 compared with only 20.4% among those aged less than 55, and among people without any formal education the prevalence was 35.8%, three times higher than among people with some level of education (11.8%).

Table 2

Demographic and health status characteristics (%) in the selected sample according to finding or not of medical records and, among those with medical records (MR) found, according to self reported high blood pressure (HBP) status. EPIC-Murcia cohort

Concerning all 85 confirmed cases (definite plus possible cases), the sensitivity of self reported high blood pressure diagnosis was 63.5% with a corresponding specificity of 91.4% (table 3). Thus, using the information from the questionnaire we were able to confirm about four fifths of the self reported hypertensive cases (positive predictive value: 79.4%) and of the self reported normotensive cases (negative predictive value: 82.8%). Agreement between the self administered questionnaire information and medical records, measured by the κ score, was 0.58 (95% CI: 0.47, 0.69), or moderate agreement according to our classification. Considering only definite cases, all parameters (sensitivity, specificity and positive predictive value) improved, leading to an overall agreement of 0.65 (95% CI: 0.53, 0.76), or substantial agreement by our criteria. Conversely, when only possible cases were included all parameters worsened, yielding a poor to fair overall agreement of 0.29 (95% CI: 0.05, 0.53).
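As a check on the overall agreement figure, κ and an approximate confidence interval can be recomputed from a 2×2 table. The cell counts below are reconstructed from totals given in the text (54 positive in both sources, 68 self reports, 85 medical record diagnoses, 248 records), and the standard error is the simple large-sample approximation; the paper does not say which formula was used, so both points are assumptions of this sketch. The function name is hypothetical.

```python
import math


def cohen_kappa_ci(tp, fp, fn, tn, z=1.96):
    """Cohen's kappa for a 2x2 table with an approximate 95% interval."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    # Simple large-sample standard error (an approximation).
    se = math.sqrt(observed * (1 - observed) / (n * (1 - expected) ** 2))
    return kappa, kappa - z * se, kappa + z * se


# Reconstructed counts: 14 = 68 - 54, 31 = 85 - 54, 149 = 248 - 54 - 14 - 31.
kappa, lower, upper = cohen_kappa_ci(tp=54, fp=14, fn=31, tn=149)
print(f"kappa = {kappa:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
# kappa = 0.58 (95% CI: 0.47, 0.69)
```

Under these assumptions the result agrees with the reported value for all cases.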

Table 3

Sensitivity, specificity, positive and negative predictive values and κ score for self reported diagnosis of high blood pressure according to the diagnosis criteria, demographic characteristics and cardiovascular risk factors. EPIC-Murcia cohort

The analysis, including all cases, was repeated considering demographic and lifestyle characteristics. Agreement was higher among women, people with less than primary schooling, people who had never smoked and obese people.

Finally, we looked for personal variables that could help to explain a true positive recall of high blood pressure and, conversely, a true negative assessment of normal blood pressure. For that purpose an unadjusted logistic regression analysis was performed (table 4) separately for these two categories (true positive and true negative cases) with age group (less than 55 and 55 or more), gender, level of education (none/some), smoking habit (never/ever), alcohol intake (none/some) and BMI (normal weight, overweight and obese) as explanatory variables. Among people with true high blood pressure, being female was positively associated with a self report of high blood pressure in the questionnaire. Thus, the odds that a woman with true high blood pressure correctly self reported her condition were four times those of a man (OR = 4.01; 95% CI: 1.04, 16.8). Conversely, having some formal education (OR = 0.31; 95% CI: 0.09, 1.05) was negatively associated with a true high blood pressure self report. To rule out any confounding effect of age group, a stratified analysis was performed. The direction of the associations of gender and level of education with being a true hypertensive remained (for age group <55: OR for gender = 14.0; 95% CI: 1.40, 339.8 and OR for level of education = 0.35; 95% CI: 0.31, 1.20; for age group ⩾55: OR for gender = 1.25; 95% CI: 0.13, 10.22 and OR for level of education = 0.39; 95% CI: 0.01, 16.02), although statistical significance was lost because of the small numbers in each stratum (data not shown).

Table 4

Association (OR; 95% CI) between demographic characteristics and cardiovascular risk factors with self reporting of high blood pressure among those true positive and true negative cases of high blood pressure: definite and possible cases, EPIC-Murcia cohort


The main finding of this work is that self report of high blood pressure in the EPIC-Murcia cohort, considered as a whole, has a moderate agreement with medical records that improves substantially when only definite cases of high blood pressure are considered. Moreover, the agreement decreases when only possible cases are considered, which lends consistency to the finding. In addition, the main problem among possible cases is a lack of sensitivity. Given that this diagnosis is based, according to our classification, on the recording of systolic and diastolic blood pressure ⩾140/90 mm Hg and not on a medical diagnosis of high blood pressure, this possibly means that those people were unaware of their blood pressure level. One explanation is that they possibly did not receive such information from their physicians and therefore declared themselves normotensive. In this sense, the results are probably more a function of medical behaviour than a problem of recording or recalling that information properly. Additionally, the change in the medical criteria for considering a person hypertensive, from earlier WHO22 reports (hypertensive: systolic blood pressure ⩾160 or diastolic blood pressure ⩾95 or under treatment) to the most recent23 (hypertensive: systolic blood pressure ⩾140 or diastolic blood pressure ⩾90 or under pharmacological treatment), may have produced an under-registration of hypertensive people whose blood pressure levels fell between the two criteria.

Considering all cases together, the EPIC question on high blood pressure is very specific but only moderately sensitive, meaning that a higher proportion of truly hypertensive cases will be classified as normotensive than the converse. Thus, some underestimation in the measures of frequency (prevalence or incidence) or attenuation in the measures of association (relative risk, odds ratio) could be expected when using this question for hypertensive case ascertainment during the follow up or in aetiological research on high blood pressure determinants. The impact of such underestimation will be greater among men and people with some educational level.

Some studies have previously measured the agreement of high blood pressure self report.8-11 24-26 Thus, de Sanjosé et al 10 carried out in Barcelona (Spain) a survey among 203 people who had previously visited a primary health centre and who, two weeks later, were given a household interview to estimate the level of under reporting of medical visits and to compare the reporting of chronic restrictive diseases. For self report of hypertension they found a prevalence of 17.2%, while the medical records gave an equivalent prevalence of 17.7%. Thus, overall agreement for this chronic condition was 91.6%, with a κ score of 0.71. These extremely good results could be attributable to the short recall period used (two weeks), as such answers are less prone to memory errors. Closer to our study is that of Haapanen et al 8 who undertook an agreement study in three north eastern municipalities of Finland (n= 600 people) between questionnaire data and medical records of chronic diseases including hypertension and high blood pressure. The agreement for hypertension was substantial, with a prevalence of 31.6% and an overall κ score of 0.77 for definite and 0.78 for possible high blood pressure categories. Vargas et al 11, in the context of the North American National Health and Nutrition Examination Survey (NHANES III), found in a sample of non-Hispanic Americans a sensitivity and specificity of high blood pressure diagnosis of 71% and 90% respectively, while Ford et al 24 found a lower sensitivity (56%–72%) among Puerto Ricans, Mexican-Americans and Cubans. Bush et al 26, in a non random sample of people aged more than 65 years, found high agreement rates, measured by κ score, from 0.93 for diabetes to 0.53 for cataracts. In this context, high blood pressure showed a κ score of 0.71. In general, agreement for high blood pressure and hypertension is among the highest found when several chronic ailments are tested.

These studies have found several conditions related to a correct self report. Thus, de Sanjosé10 reported a higher (non-significant) under reporting of medical visits among people who were young, male and of higher socioeconomic status, while Haapanen8 reported, for truly negative self report of cardiovascular diseases, only negative associations with higher age and with more than three health service contacts per year during the period, in the sense that older people and more frequent users of health services recall their non-diseased status poorly. Our study showed a positive association of correct high blood pressure self report with being a woman and a negative association with having some level of education. However, most categories had small numbers, as is usual in validation studies, which precluded additional analysis, including adjustment (except for the stratified analysis of gender and educational level by age group), and possibly caused a general lack of power for detecting further associations. Looking at the figures on demographic and lifestyle characteristics (table 2), it is noticeable that medical records could not be retrieved for twice as many people with some level of education (30.3%) as for those with less than primary schooling (15.6%). Thus, it seems possible that more educated people had lower morbidity or preferred to use private medical services instead of the national health coverage, so some selection bias cannot be ruled out.

Finally, we were able to retrieve medical information for most of the people included in the study, but it was impossible to confirm or reject the high blood pressure status of a fifth of them. Hypothetically, some improvement in medical record retrieval could be achieved by reviewing reference hospital registries but, in Spain, uncomplicated high blood pressure is a common ailment seen mostly in primary health centres. However, a non-negligible proportion of the selected persons (9%) had no physician assigned from the public system, which has almost universal coverage. A physician is assigned when the person asks for one, for the first time, at the social security headquarters. Possibly, then, these people did not ask for a physician from the public health system but used private medical services. Given that the study was designed not to contact the selected cohort members directly but to use previously recorded information, we are not able to say anything further about this subgroup of people.

This validation study is based on a small proportion of the cohort members recruited in the Murcia EPIC centre. Reliable results when stratifying by some demographic and lifestyle characteristics would require a larger study sample. However, the findings can be used, with caution, as the best estimate of the information obtained by the EPIC question on high blood pressure in the cohort gathered in the Murcia EPIC centre and, in the absence of further data, as an approximation of self reported high blood pressure diagnosis in other EPIC centres or even in the Spanish national health surveys, which have asked a question similar to that used in this questionnaire.

On the other hand, the source of information taken as the gold standard has been the medical record. There is no comprehensive evaluation of the quality of medical records in Murcia or in Spain. However, the recording system has been one of the key points of the recent reforms of the primary health care system in Spain. In addition, among the intervention programmes offered in such health centres, hypertension has been the leading one, with 100% of the centres in Murcia offering it.27

In summary, self report of high blood pressure has a moderate to substantial agreement with medical records in this small sample of the EPIC-Murcia cohort, but if it is used in aetiological research on determinants of incident cases of hypertension it should be used carefully to minimise misclassification arising from a relative lack of sensitivity/positive predictive value.


The authors thank Patricia Esteras for assistance in the field work and Josefina Almansa for her help in the EPIC administrative tasks.


  • * A list of participants in the EPIC Group of Spain is shown at the end of the paper.

  • EPIC is a European study coordinated by the Unit of Nutrition and Cancer of the International Agency for Research on Cancer (IARC) (Agreement AEP/93/02). In Spain it receives financial support from the European Program Against Cancer, which is part of the EC (Agreement SOC 97 200302 05F02), the Health Research Fund (FIS) of the Spanish Ministry of Health (Exp 96–0032), the participant Regional Government and the Spanish Scientific Foundation against Cancer.

  • At the time of the study the EPIC (European Prospective Investigation on Cancer) Group of Spain was composed of (in alphabetic order): Antonio Agudo**(1), Pilar Amiano**(6), Xavier Barber**(4), Ana Barcos**(5), Aurelio Barricarte*(5), José Maria Beguiristain**(6), Maria Dolores Chirlaque**(4), Miren Dorronsoro*(6), Carlos A González*,***(1), Cristina Lasheras**(2), Carmen Martínez*(3), Carmen Navarro*(4), Guillem Pera**(4), Jose R. Quirós*(2), Mauricio Rodríguez***(3), María José Tormo**(4). *Principal Investigator, **Co-investigator, ***Study coordinator. (1) IREC (Instituto de Investigación Epidemiológica y Clínica), Mataró, Barcelona, Spain; (2) Consejería de Sanidad y Servicios Sociales de Asturias, Oviedo, Spain; (3) EASP (Escuela Andaluza de Salud Pública), Granada, Spain; (4) Consejería de Sanidad y Política Social, Murcia, Spain; (5) Departamento de Salud de Navarra, Pamplona, Spain; (6) Dirección de Salud de Gipúzkoa, San Sebastian, Spain.