STUDY OBJECTIVES To study geographical differences in diastolic blood pressure and the influence of the social environment (census percentage of people with low educational achievement) on individual diastolic blood pressure level, after controlling for individual age and educational achievement. To compare the results of multilevel and ecological analyses.
DESIGN Cross sectional analysis performed by multilevel linear regression modelling, with women at the first level and urban areas at the second level, and by single level ecological regression using areas as the unit of analysis.
SETTING Malmö, Sweden (population 250 000).
PARTICIPANTS 15 569 women aged 45 to 73, residing in 17 urban areas, who took part in the Malmö Diet and Cancer Study (1991–1996).
MAIN RESULTS In the “fixed effects” multilevel analysis, low educational achievement at both individual (β=1.093, SE=0.167) and area levels (β=2.966, SE=1.250) were independently associated with blood pressure, although in the “random effects” multilevel analysis almost none of the total variability in blood pressure across persons was attributable to areas (intraclass correlation=0.3%). The ecological analysis also found an association between the area educational variable and mean diastolic blood pressure (β=4.058, SE=1.345).
CONCLUSIONS The small intraclass correlation found indicated very marginal geographical differences and almost no influence of the urban area on individual blood pressure. However, these slight differences were enough to detect an effect of the social environment on blood pressure. The ecological study overestimated the associations found in the “fixed” effects multilevel analysis, and neither distinguished individual from area levels nor provided information on the intraclass correlation. Ecological analyses are inadequate to evaluate geographical differences in health.
- variance analysis
- ecological bias
- blood pressure
Equity in health care is a cornerstone of the Swedish public health system.1 As in many other countries, cardiovascular diseases are the foremost cause of disability and death in Sweden, and increased blood pressure is one of the main risk factors.2 As the blood pressure levels vary considerably from one area to another,3-5 it is of interest to analyse the possible causes of these differences.
There is considerable evidence regarding the effect of individual characteristics on blood pressure level.2 However, contextual factors (those related to the residential area of the person) could also influence individual blood pressure.6 Single level ecological analyses have revealed geographical differences in blood pressure that are associated with area socioeconomic environment.7 8 However, it is still a matter of debate whether or not area effects exist,9-11 and whether they are explained by individual level factors.12 13 On the one hand, some multilevel analyses have suggested that the residential area of the individual may play a small part in individual health and that impaired health in deprived areas is mainly attributable to a person's social and economic status, and not to area contextual factors.12 13 On the other hand, most multilevel studies have concluded that the social environment has a clear influence on individual health that is independent of individual factors.6 14-16
Therefore, our intention was to study geographical differences in diastolic blood pressure and the influence of the social environment on individual diastolic blood pressure level. We performed a multilevel analysis17-21 of 15 569 women (first level) from 17 of the 18 urban areas (second level) in the city of Malmö. We also wished to compare the results of the multilevel analysis with the outcome of a conventional single level ecological analysis of the same population.
The city of Malmö in southern Sweden had a population of approximately 250 000 people in 1992. For administrative purposes, the city is divided into 18 large geographical areas. In this study, 17 of those areas were analysed, the zone of the harbour being excluded because of its small number of residents. These urban divisions have been used to study the distribution and determinants of health in the city for many years. The median number (1st–3rd quartiles) of women aged 45 to 73 in those areas was 2229 (1302–3457).
THE MALMÖ DIET AND CANCER STUDY
The MDCS is a prospective cohort study in the city of Malmö. The 17 388 women who participated in the MDCS cohort represented 41% of all women born between 1923 and 1950 living in Malmö during the baseline period 1991–1996. This study was limited to the 15 569 women for whom complete information on the variables studied could be obtained. The median number (1st–3rd quartiles) of participants per area was 918 (513–1099) and the participation rate in different areas of the city ranged from 30% to 67%.7 Women were recruited for the study through letters of invitation, by advertisements in the local media, and with the cooperation of major employers in Malmö. Letters sent directly to potential subjects accounted for 80% of the participants. A detailed description of the design and aims of the cohort study may be found elsewhere.22
A self administered questionnaire and a seven day personal diary were used to obtain information on relevant characteristics of those women participating. Each person completed both information sources at home within the same 1–2 week period set between the first and the second consecutive baseline visits to the project office. Blood pressure was measured at the initial visit, and other information gathered.22
The age of each woman was ascertained at the first visit to the study centre. Ages were then aggregated into four groups: 45–49, 50–59, 60–69, and 70–73. The youngest group was used for reference. Systolic and diastolic (Phase V) blood pressure measurements were done on the right arm under standardised conditions after five minutes of supine rest.
Low educational level was defined as fewer than nine years of formal education.
Ecological variables—The Skåne Council Statistics Office supplied us with age specific information (ages 20 to 79 in 10 year groups) on the number of inhabitants and the number of people with low educational achievement (that is, fewer than nine years of formal education) in every area.
STATISTICAL AND EPIDEMIOLOGICAL METHODS
Simple variance components multilevel linear regression models19 20 with women in the first level and areas in the second level were fitted to the data. The first model (i) was empty (that is, a model without any independent variable in which only the 2nd level area intercepts varied). In the second model (ii), age alone was factored in. In the third model (iii), age together with the individual educational variable was fitted. In the fourth model (iv), age together with census percentage of people with low educational achievement was included. Finally (v), in order to disentangle individual from area effects, age and individual educational level were entered together with census percentage of people with low educational achievement.
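A minimal sketch of model (v), in notation assumed here for illustration (the symbols $y_{ij}$, $\beta_k$, $u_j$ and $e_{ij}$ do not appear in the original text): for woman $i$ in area $j$, with three indicator variables for the age groups above the reference group, an individual low education indicator, and the census percentage of people with low educational achievement in the area, a two level random intercept model can be written as below. Normality of the two random terms is the usual assumption for this kind of model and is stated here as an assumption. Models (i) to (iv) would be obtained by dropping the corresponding terms.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \sum_{k=1}^{3} \beta_k \,\mathrm{age}_{k,ij}
        + \beta_4 \,\mathrm{loweduc}_{ij}
        + \beta_5 \,\mathrm{LOWEDUC}_{j}
        + u_j + e_{ij}, \\
u_j &\sim N(0, \Omega_u), \qquad e_{ij} \sim N(0, \Omega_e).
\end{aligned}
```

Here $\Omega_u$ is the between area (2nd level) variance and $\Omega_e$ the between individual (1st level) variance; $\beta_4$ and $\beta_5$ correspond to the individual and area educational coefficients discussed in the fixed effects analysis.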
We separated the information produced by the multilevel analysis into two parts: (a) a random effects analysis, and (b) a fixed effects analysis.
Fixed effects analysis
In the fixed effects analysis we observed the strength of the association (that is, the slopes of the regression) between individual diastolic blood pressure and individual low educational achievement, as well as the aggregated variable census percentage of people with low educational achievement. The strength of these associations was appraised by the β coefficient (standard error).
We aimed to investigate whether the social environment (as appraised by the area educational variable) was associated with individual diastolic blood pressure independently of the women's age and educational achievement.
Random effects analysis
In the random effects part of the multilevel analysis we analysed geographical differences in diastolic blood pressure and the effects of the geographical boundaries of the residential area of the individual on individual diastolic blood pressure. This analysis was approached in two ways. On the one hand, we studied the second level variance (that is, the variance between areas) in mean diastolic blood pressure.
$\Omega_u$ = area variance
On the other hand, we aimed to find out if individual diastolic blood pressure was clustered in the areas (that is, if diastolic blood pressure was more similar between women living in the same area than between women from different areas). A possible clustering would indicate that the geographical boundaries had specific influence on individual blood pressure, and give deeper information than simple differences in mean diastolic blood pressure. Such possible clustering was appraised by the intraclass correlation (ICC) according to the formula
$$\mathrm{ICC} = \frac{\Omega_u}{\Omega_u + \Omega_e}$$
$\Omega_e$ = individual variance
The higher the intraclass correlation was, the higher the clustering of the women in the areas in relation to individual diastolic blood pressure. A very low intraclass correlation, therefore, would indicate that the areas were not more different than in random samples taken from the population in Malmö (in random samples the people are not clustered at all).
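As a small illustration of the formula above, the following Python sketch computes the intraclass correlation from two variance components. The function name and the numerical values are hypothetical and chosen only for illustration; they are not the estimates obtained in this study.

```python
def intraclass_correlation(area_variance: float, individual_variance: float) -> float:
    """Share of the total variance that lies between areas.

    Implements ICC = area variance / (area variance + individual variance).
    """
    total = area_variance + individual_variance
    if area_variance < 0 or individual_variance < 0 or total == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return area_variance / total


# Hypothetical variance components (not the study's estimates):
# a between area variance of 0.3 against an individual variance of 99.7.
icc = intraclass_correlation(0.3, 99.7)
print(f"ICC = {icc:.1%}")
```

With components of this relative size, almost all of the variation in blood pressure would lie between individuals rather than between areas.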
Starting with the crude area variance and the intraclass correlation in the “empty” model we investigated whether area differences were attributable to the individual composition of the areas, or whether they were caused by a true contextual influence. If the crude area differences disappeared when adjusting for individual factors the area differences would be compositional; otherwise the remaining differences would be contextual. A contextual effect would be further supported if, when adjusting for the census educational variable, the remaining 2nd level variance was reduced or vanished.
In order to know the percentage of area differences in diastolic blood pressure that was explained by the adjusted model we calculated the percentage of 2nd level variance explained ($R_2^2$) as:
$$R_2^2 = \frac{\Omega_{u0} - \Omega_{u1}}{\Omega_{u0}} \times 100$$
$\Omega_{u0}$ = 2nd level variance of the initial model
$\Omega_{u1}$ = 2nd level variance of the adjusted model
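The calculation above can be sketched in Python as follows. The function name and the two variances are hypothetical; the variances are chosen only so that the result equals a 26% reduction, and they are not the variance estimates of the study.

```python
def percent_area_variance_explained(var_initial: float, var_adjusted: float) -> float:
    """Percentage of 2nd level variance explained by the adjusted model.

    var_initial:  2nd level variance of the initial model
    var_adjusted: 2nd level variance of the adjusted model
    """
    if var_initial <= 0:
        raise ValueError("the initial 2nd level variance must be positive")
    return (var_initial - var_adjusted) / var_initial * 100


# Hypothetical area variances before and after adjustment.
explained = percent_area_variance_explained(0.50, 0.37)
print(f"2nd level variance explained: {explained:.0f}%")
```

A negative result is possible with this formula when the adjusted model has a larger 2nd level variance than the initial one, which is why the sketch does not clip the value.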
Parameters were estimated using Restricted Iterative Generalised Least Squares (RIGLS) and Markov Chain Monte Carlo (MCMC), with prior values obtained by RIGLS.23 As the results were very similar, the RIGLS estimates are presented in the tables. The MLwiN software package, version 1.1, was used to perform the analyses.23
The age adjusted24 association between mean diastolic blood pressure on the one hand, and census percentage of people with low educational achievement on the other, was studied by weighted (number of inhabitants) linear regression analysis. This method was chosen because it is a customary analytical technique used in the city of Malmö.7 The strength of these associations was appraised by the β coefficient (standard error).
The squared correlation coefficient ($R^2$) was obtained in order to calculate the proportion of the area variation in mean diastolic blood pressure that was explained by the age adjusted area educational variable.
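The ecological analysis described above amounts to a weighted least squares regression of area means on the area educational variable. The sketch below shows one way to obtain the slope, intercept and weighted squared correlation coefficient with the standard library only. The function name and the four example areas are hypothetical, and the age adjustment applied in the study is not reproduced here.

```python
def weighted_linear_regression(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    x: area exposure (for example, proportion with low education)
    y: area mean outcome (for example, mean diastolic blood pressure)
    w: weights (for example, number of inhabitants per area)
    Returns (slope, intercept, r_squared).
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Weighted residual and total sums of squares give the weighted R squared.
    ss_res = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data for four areas (not taken from the study).
low_education = [0.20, 0.30, 0.40, 0.50]
mean_dbp = [83.0, 83.5, 84.2, 84.6]
inhabitants = [1500, 2200, 3100, 900]
slope, intercept, r2 = weighted_linear_regression(low_education, mean_dbp, inhabitants)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}, R2 = {r2:.2f}")
```

Because only area means enter this calculation, nothing in it carries information on the variation between individuals within an area.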
COMPARISON BETWEEN ECOLOGICAL AND MULTILEVEL ANALYSIS
In 1986, Aitkin and Longford25 performed a very valuable study comparing multilevel and ecological analysis. On an empirical basis and given extended statistical information, these authors concluded that studies comparing areas (schools in their paper) should not be based on ecological analyses, as these give unpredictable results when the information on within area variability is suppressed. The book by Snijders and Bosker19 also gives a detailed treatment of this issue in chapter 3. However, ecological analyses are still frequently used in epidemiology and community health. Therefore, we performed a simple epidemiological comparison of the parameters obtained in the ecological and the multilevel analysis.
In the multilevel analysis we studied the β coefficients (standard error), the 2nd level variance explained ($R_2^2$), and the intraclass correlation. In the ecological analysis we studied the β coefficients (standard error), and the squared correlation coefficient ($R^2$). The intraclass correlation could not be computed in the ecological analysis, as there was no individual information.
The ecological analysis has only one level, the area, which corresponds with the 2nd level in the multilevel analysis. As the area educational variable is constant for all people within an area (that is, there is no individual variation within the areas), both the fixed part of the multilevel analysis and the ecological analysis (that is, the β coefficients of the area educational variable) were estimating the association with mean diastolic blood pressure (that is, the between areas regression). The β coefficient of the individual educational level variable estimated the within areas regression in the multilevel analysis.
The 2nd level explained proportion of variance in the multilevel analysis, $R_2^2$, measures the reduction of variance between group means. In this sense, the squared correlation coefficient, $R^2$, of the ecological analysis is the analogue of the $R_2^2$ in the multilevel analysis, especially when the number of people in the sample is large (see Snijders and Bosker19 page 103 for more details).
Ecological analyses have revealed geographical health differences associated with the social environment.
Some multilevel studies have suggested that the social environment influences individual health, but others that the residential area of the person plays a minor part.
This apparent contradiction was explained when the different information conveyed by the “fixed” and the “random” effects multilevel analyses was differentiated.
The random effects (that is, intraclass correlation) evidenced almost no geographical influence on individual blood pressure, and gave deeper information than the simple analysis of the differences in mean blood pressure between the areas.
The “fixed” effects multilevel analysis (that is, β coefficients) revealed an association between educational achievement and blood pressure at both the individual and at the area level.
Compared with the multilevel analysis, the ecological analysis overestimated the association between educational achievement and blood pressure and it could not disentangle individual from area levels.
DESCRIPTION OF THE POPULATION
The crude mean diastolic blood pressure of the areas studied varied between 83.6 mm Hg and 85.4 mm Hg and the age adjusted mean between 82.8 and 84.9 mm Hg. The crude percentage of people with low educational achievement according to the census varied between 20% and 51% and the age adjusted mean between 22% and 54%. The mean diastolic blood pressure increased with the census-based variable “percentage of people with low educational achievement” (fig 1). The percentage of women with low educational achievement in the cohort was highly correlated to the analogous census variable (R=0.814, p<0.001).
The age adjusted ecological analysis shows that mean diastolic blood pressure increased significantly with rising census percentage of people with low educational achievement.
The age adjusted census-based educational variable explained 41% of the variation in age adjusted mean diastolic blood pressure (table 1).
In the age adjusted models, the individual level of diastolic blood pressure increased in the presence of individual low educational achievement and as the percentage of people with low educational achievement in the area increased. The model including simultaneously individual age, individual educational achievement and census percentage of people with low educational achievement showed that both individual and census-based educational variables remained positively associated with individual diastolic blood pressure level, after adjusting for each other (table 1). The slope of the between areas regression (area low educational achievement variable) was about 2.7 times higher than the slope of the within areas regression (individual low educational achievement variable).
The empty model (that is, the one containing random intercepts only, and without independent variables) showed a very small clustering of women in city areas in relation to diastolic blood pressure (that is, intraclass correlation = 0.4%). When adjusting for the individual age composition of the areas, the between area variance was reduced by 26%, and the intraclass correlation became 0.3%. The age and educational achievement composition of the areas explained 59% of the between area variance in diastolic blood pressure (table 1).
Individual age and the census-based area variable “percentage of women with low educational achievement” explained 65% of the between area variance in diastolic blood pressure level (table 1).
When individual age, individual educational level and the census-based variable “percentage of women with low educational achievement” were entered together, the clustering of women within given areas with respect to diastolic blood pressure level almost vanished (that is, intraclass correlation = 0.1%), and this model explained 73% of the area differences in diastolic blood pressure (table 1).
COMPARISON BETWEEN ECOLOGICAL AND MULTILEVEL ANALYSIS
The β coefficient of ecological analysis reflected the fixed effects of the area educational variable found in the multilevel analysis (table 1). However, after entering individual educational level the multilevel β coefficient was lower than the ecological. Only the multilevel analysis, moreover, could separate the contextual effect of the census educational variable from the compositional effect of the educational achievement of the women in the areas (table 1). The ecological β coefficient was 4.3 times higher than the β coefficient of the individual educational variable, and 1.6 times higher than the β coefficient of the area variable in the multilevel analysis.
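The ratios quoted in the results can be checked with a few lines of arithmetic. The sketch below takes the ecological coefficient as 4.655, the value quoted for the ecological analysis in the discussion, and the two multilevel coefficients from the model that includes both educational variables (1.093 and 2.966); the variable names are ours.

```python
# Coefficients as quoted in the text.
beta_ecological = 4.655            # area educational variable, ecological analysis
beta_area_multilevel = 2.966       # area educational variable, full multilevel model
beta_individual_multilevel = 1.093  # individual educational variable, full multilevel model

ecological_vs_individual = beta_ecological / beta_individual_multilevel
ecological_vs_area = beta_ecological / beta_area_multilevel
between_vs_within = beta_area_multilevel / beta_individual_multilevel

print(f"ecological / individual: {ecological_vs_individual:.1f}")
print(f"ecological / area:       {ecological_vs_area:.1f}")
print(f"between / within areas:  {between_vs_within:.1f}")
```

Rounded to one decimal place, these ratios reproduce the factors of 4.3, 1.6 and 2.7 given in the text.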
The age adjusted census educational variable explained a large part of the between area variance in diastolic blood pressure in the multilevel analysis (that is, 65%), and this variable also explained a considerable amount in the ecological analysis (that is, 41%). However, only the multilevel analysis indicated that individual age and educational achievement explained 59% of the area differences and that after this adjustment the clustering of women in the areas in relation to diastolic blood pressure was very small (intraclass correlation = 0.2%).
In the fixed effects part of the multilevel analysis, individual diastolic blood pressure increased with the census percentage of people with low educational achievement. This area variable still retained its association with high blood pressure after adjusting for individual education level. These facts suggest that an area's educational level had a contextual effect on individual blood pressure that was not captured by individual levels of education.
In the random effects part of the multilevel analysis the area in which a person lived accounted for a very small amount of the individual differences in blood pressure (intraclass correlation= 0.4%). This minor intraclass correlation indicated that the geographical clustering of women in the areas in relation to diastolic blood pressure was very small. From these results (that is, random effects), it can be concluded that the residential area of the person had a minor impact on individual blood pressure, and that the geographical differences in individual diastolic blood pressure were only marginal. The intraclass correlation provided deeper information on geographical differences than the simple analysis of the variation in mean blood pressure between the areas.
In our study, the geographical differences were very small, but still enough to provide sufficient range of exposure to detect an association in the fixed effects part of the multilevel analysis.
Our results accorded with both aetiologically oriented multilevel studies showing that the social environment influenced individual health related factors,6 10 15 16 26 and with previous studies that indicated a very small influence of the area of residence (as geographical boundary) in relation to health indicators.9 11 12
The information provided by the intraclass correlation is rarely discussed in multilevel analysis within epidemiology and community health. However, this information is central when studying area effects on individual health (that is, blood pressure level). As expressed by Rodriguez and Goldman27: “Estimates of the extent to which observations within a given group are correlated with one another are valuable not only for obtaining improved estimates of fixed effects and their standard errors, but also for yielding important substantive information”. Snijders and Bosker have expressed similar opinions19 (page 9): “The more the diastolic blood pressure levels of the individuals within an area are alike (as compared with individuals of other areas), the more likely that the determinants of blood pressure have to do with the area environment. Absence of dependency in this case implies absence of area effects on individual blood pressure” (italic type is ours and substitutes the terms “pupils”, “schools” and “pupil achievement” in the original text by “individuals”, “area” and “diastolic blood pressure” respectively without changing the meaning of the text).
The area educational variable explained a large part of the differences in blood pressure found among the various areas but, as commented by Aitkin and Longford25 and by Singer,28 as the amount of variation between areas was very small (age and individual educational achievement adjusted intraclass correlation of 0.2%) it explained a great deal of very little.
THE ECOLOGICAL VERSUS THE MULTILEVEL ANALYSIS
Many studies have given evidence about the weaknesses of ecological studies30 and a previous comparison of the ecological and the multilevel analyses by Aitkin and Longford25 concluded that comparing areas should not be based on ecological analysis as it gives unpredictable results when the information on within area variability is suppressed. See also Snijders and Bosker19 chapter 3.
Diez-Roux31 has pointed out two kinds of fallacies in ecological analyses, the ecological and the sociological. The ecological fallacy is the risk of making erroneous inferences on individual associations when the units of analysis are groups. The sociological fallacy is the risk of making erroneous inferences on group associations when the units of analysis are groups, because the role of individual level factors is ignored.
In our study the estimated β coefficient of the area educational variable in the ecological analysis (β = 4.655) was similar to the β coefficient in the fixed part of the multilevel analysis (β = 4.058). However, when the individual educational variable was entered, the ecological analysis overestimated the individual association between educational achievement and blood pressure level by a factor of about 4, and the association with the area educational variable by a factor of about 1.5. This suggests the presence of both ecological (the risk of making erroneous inferences on individual associations when the units of analysis are groups) and sociological (the risk of making erroneous inferences on group associations when the units of analysis are groups by ignoring the role of individual level factors) fallacies.
Definite effects of the social environment on individual blood pressure (that is, β = 2.966) existed side by side with marginal geographical differences and minor influence of the residential area of the individual on individual blood pressure (intraclass correlation= 0.3%), a fact that is not counter-intuitive when the different information conveyed by the fixed and the random effects is noticed. Each measurable variable has a distribution among areas determined together by the mean, the position parameter, and the standard deviation or the standard error, the variance parameters. In the fixed part of the multilevel model (and in the ecological analysis) we study the position parameter (that is, means). In the random part of the multilevel model we analyse the variance parameter. An association in the fixed part of the model can be found, while the variability can be small or large.
Figure 1 is based on the cohort and represents a classic ecological association. Observing means gives the impression that the areas are quite separated from each other, and a simple inspection suggests an association between the area educational variable and blood pressure. Figure 2, also based on the cohort, allows a visual approximation of the variance between individuals and between areas. The administrative areas of Malmö appear more like 17 random samples taken from the whole city population than segregated territories with different blood pressure levels.
In figure 3 we simulated two sets of data. Both gave about the same area β coefficient (β approximately 4.5), but the intraclass correlation was 0.21% in the first set and 84% in the second. In the first case, the areas are like random samples taken from the whole population, and the geographical environment has almost no effect on the individual outcome (observe that this situation is like the empirical results of this study). In the second case, the clustering of the individuals is very large, and the geographical environment has a very high influence on the individual outcome. Despite that, the size of the β coefficient is similar (that is, β approximately 4.5). The ecological analysis resembles a multilevel analysis with an intraclass correlation of 100% (that is, all individuals in an area have the same value), but that is not true, as the lack of individual information does not allow the intraclass correlation to be calculated. In relation to the area variable, the ecological analysis correctly estimates the β coefficient (that is, β approximately 4.5).
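The contrast described for figure 3 can be illustrated with a small simulation. The following Python sketch is not the simulation used for the figure; the number of women per area, the intercept of 82 mm Hg, the two individual standard deviations, the range of the area variable and the seed are all assumptions made for illustration. It generates two data sets with the same area slope (4.5) and then estimates the ecological slope from the area means and the intraclass correlation from a one way analysis of variance.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(individual_sd, n_areas=17, n_per_area=2000, beta=4.5, seed=1):
    """Simulate individual outcomes in areas that differ only through x_j."""
    rng = random.Random(seed)
    # Hypothetical area exposure: proportion with low education, 0.20 to 0.52.
    x = [0.20 + 0.02 * j for j in range(n_areas)]
    areas = [
        [82.0 + beta * xj + rng.gauss(0.0, individual_sd) for _ in range(n_per_area)]
        for xj in x
    ]
    return x, areas


def anova_icc(areas):
    """Intraclass correlation from one way ANOVA (balanced design assumed)."""
    k, n = len(areas), len(areas[0])
    area_means = [mean(a) for a in areas]
    grand_mean = mean(area_means)
    ms_between = n * sum((m - grand_mean) ** 2 for m in area_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for a, m in zip(areas, area_means) for y in a
    ) / (k * (n - 1))
    area_variance = max((ms_between - ms_within) / n, 0.0)
    return area_variance / (area_variance + ms_within)


def ecological_slope(x, areas):
    """Ordinary least squares slope of the area means on the area exposure."""
    y = [mean(a) for a in areas]
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


results = {}
for label, sd in (("weak clustering", 10.0), ("strong clustering", 0.2)):
    x, areas = simulate(individual_sd=sd)
    results[label] = (ecological_slope(x, areas), anova_icc(areas))
    slope, icc = results[label]
    print(f"{label}: slope = {slope:.2f}, ICC = {icc:.2%}")
```

Under these assumptions the two data sets share the same underlying slope, while the intraclass correlation differs by orders of magnitude, because only the individual variance around the area means changes between them.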
Interpreting the ecological analysis (fig 1), or the fixed part of the multilevel analysis, as if the context were a main cause of health variation in the areas would be a “contextual misinterpretation”. Such a “contextual misinterpretation” could be made even when the ecological and the sociological fallacy are absent.
LIMITATIONS OF THE STUDY
The number of areas studied in the present analysis was relatively small (nj = 17), a factor that may have rendered inaccurate the distributional assumptions in the multilevel regression, despite the fact that the number of women sampled was high (ni = 15 569). However, parameter estimation using MCMC gave similar results.
The possibility that the cohort may not be representative of the whole population may have reduced the external validity of our results. A previous analysis, however, has shown that participants could be regarded as fairly representative of the general population in each of the 17 areas studied, at least in relation to the main sociodemographic variables.7 In any case, this potential selection bias does not invalidate the present comparison between the multilevel and the ecological approach, as they are both based on the same population.
In conclusion, in our study both the ecological and the “fixed” effects multilevel analyses evidenced an association between educational achievement and blood pressure. The ecological analysis overestimated this association and could neither disentangle individual from area levels nor inform on the size of the intraclass correlation. The ecological analysis was inappropriate to evaluate geographical differences in health.
Most multilevel analyses have been focused on studying “fixed” effects (for example, β coefficients, odds ratios) in order to analyse the association between the social environment (measured by certain area characteristics) and individual health. The study of “random” effects (for example, intraclass correlation), however, is still rather unusual in social epidemiology even though it gives very relevant information for understanding the impact of the areas' geographical boundaries on individual health.
To sum up, we observed a definite association between the social environment and individual diastolic blood pressure, even though the geographical boundaries played a small part in individual blood pressure; both aspects need to be considered. Even if the social environment influences health, resource allocation and public health programmes should not be based on large administrative areas,11 such as the ones studied. Allocating resources to control blood pressure exclusively to the more “unhealthy” areas (that is, those with higher mean diastolic blood pressure) would leave unattended many people in need who reside in the more “healthy” areas.
We wish to express our gratitude to Min Yang, Research Officer, Multilevel Models Project at the Institute of Education, University of London; John W Lynch, Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan; and Peter Allebeck, Department of Social Medicine, University of Gothenburg, for their critical reading of the manuscript and valuable comments.
Funding: this study was funded by an ALF-Government Grant, Dnr M: E 39 390/98 (Juan Merlo); the National Institute for Public Health; the Swedish Medical Research Council; and the Swedish Cancer Society.
Conflicts of interest: none.