Sociodemographic determinants of intraurban variations in COVID-19 incidence: the case of Barcelona
  1. Antonio López-Gay (1, 2)
  2. Jeroen Spijker (2)
  3. Helen V S Cole (3)
  4. Antonio G Marques (4)
  5. Margarita Triguero-Mas (5, 6)
  6. Isabelle Anguelovski (7, 8)
  7. Marc Marí-Dell'Olmo (9, 10, 11)
  8. Juan A Módenes (2)
  9. Dolores Álamo-Junquera (9)
  10. Fernando López-Gallego (12)
  11. Carme Borrell (9, 10, 11)

Affiliations
  1. Department of Geography, Autonomous University of Barcelona, Barcelona, Spain
  2. Center for Demographic Studies, Bellaterra, Spain
  3. Barcelona Lab for Urban Environmental Justice and Sustainability, Autonomous University of Barcelona, Barcelona, Spain
  4. Department of Signal Theory and Communications, Rey Juan Carlos University, Madrid, Spain
  5. Institute for Environmental Science and Technology, Barcelona Lab for Urban Environmental Justice and Sustainability, Autonomous University of Barcelona, Cerdanyola del Vallès, Spain
  6. Department of Urban Studies and Planning, Mariana Arcaya's Research Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  7. Autonomous University of Barcelona, Bellaterra, Spain
  8. Catalan Institution for Research and Advanced Studies, Barcelona, Spain
  9. Agència de Salut Pública de Barcelona, Barcelona, Spain
  10. CIBER de Epidemiología y Salud Pública, Madrid, Spain
  11. Institut d’Investigació Biomèdica Sant Pau (IIB Sant Pau), Barcelona, Spain
  12. Heterogeneous Biocatalysis Laboratory, CIC biomaGUNE, San Sebastian, Spain

Correspondence to Dr Antonio López-Gay, Department of Geography, Autonomous University of Barcelona, 08193 Bellaterra, Barcelona, Spain; tlopez{at}


Background Intraurban sociodemographic risk factors for COVID-19 have yet to be fully understood. We investigated the relationship between COVID-19 incidence and sociodemographic factors in Barcelona at a fine-grained geography.

Methods This cross-sectional ecological study is based on 10 550 confirmed cases of COVID-19 registered during the first wave in the municipality of Barcelona (population 1.64 million). We considered 16 variables on the demographic structure, urban density, household conditions, socioeconomic status, mobility and health characteristics for 76 geographical units of analysis (neighbourhoods), using a lasso analysis to identify the most relevant variables. We then fitted a multivariate Quasi-Poisson model that explained the COVID-19 incidence by neighbourhood in relation to these variables.

Results Neighbourhoods with (1) greater population density, (2) an aged population structure, (3) a high presence of nursing homes, (4) high proportions of individuals who left their residential area during lockdown and/or (5) high proportions of residents working in health-related occupations were more likely to register a higher number of cases of COVID-19. Conversely, COVID-19 incidence was negatively associated with (6) the percentage of residents with post-secondary education and (7) the percentage of the population born in countries with a high Human Development Index.

Conclusion As in other historical pandemics, the incidence of COVID-19 is associated with neighbourhood sociodemographic factors, with a greater burden faced by already deprived areas. Because urban social and health injustices already existed in those geographical units with higher COVID-19 incidence in Barcelona, the current pandemic is likely to reinforce health and social inequalities together with urban environmental injustice.

  • COVID-19
  • spatial analysis
  • social inequalities
  • public health
  • neighbourhood/place

Data availability statement

Our data are accessible to researchers upon reasonable request for data sharing to the corresponding author. Our dataset has been built based on publicly available data in the referred repositories.

This article is made freely available for use in accordance with BMJ’s website terms and conditions for the duration of the covid-19 pandemic or until otherwise determined by BMJ. You may use, download and print the article for any lawful, non-commercial purpose (including text and data mining) provided that all copyright notices and trade marks are retained.


Global flows of people, resources and capital involved in the production and maintenance of urban life facilitate the spread of infectious disease and the emergence of pandemics.1 After appearing in China in late 2019, COVID-19 was first confirmed in Spain and elsewhere in Europe by late January 2020. Previous research on virus transmission has shown that socioeconomic and cultural factors at the individual, household and neighbourhood levels are essential mechanisms for community spread of the virus.2 3

Individual-level risk factors such as gender, age or race/ethnicity are known to influence infectious disease incidence,4 5 including COVID-19.6 7 Although infection rates are similar between genders, men are more likely to have comorbid conditions (such as hypertension, diabetes, obesity and cardiovascular diseases) that are also risk factors associated with worse COVID-19 outcomes.8 9 Women, however, are often more exposed because they work more frequently in care professions.10 Older people are also known to be more susceptible to COVID-19 and show higher fatality rates.11 In contrast, the role that children play in disease transmission is still unclear, as they are rarely the index case12 and are less likely to transmit COVID-19 to adults.13 On the other hand, school closures are likely to have led to increased childcare by seniors,14 potentially increasing the risk of transmission.

Individual socioeconomic factors such as level of education, income, employment status and type of occupation are also thought to impact risk of COVID-19. Although initial COVID-19 outbreaks emerged from international (business) travel and winter holidays,15 subsequent trends reveal that those working in specific occupations, especially frontline, ‘essential’ jobs in health, care, retail and hospitality, are more at risk of infection.16 17 Individuals living in poverty and other marginalised populations are more susceptible to infectious diseases.5 For instance, in the US context, racialised minorities (especially African Americans) are vulnerable social groups that exhibit higher than average rates of infectious diseases. This has been attributed to systematic and interpersonal racism, and poorer access to healthcare facilities and other health-promoting resources.18

Public health researchers have also long acknowledged the importance of neighbourhood-level sociodemographic and physical characteristics—including racial and economic residential segregation, and the spatial distribution of affordable and fresh food, or public transport—for understanding health outcomes.19 20 Structural contexts and neighbourhood environments can therefore create uneven poor living conditions and lasting environmental injustices for lower income or immigrant residents living in certain areas of a city,21 resulting in health inequity by neighbourhood. In fact, during the 1918 influenza pandemic, researchers already found a significant association between disease transmissibility and neighbourhood-level social characteristics such as population density, illiteracy and unemployment.4

Emerging research on COVID-19 shows similar patterns and pathways.22 For example, people living in denser neighbourhoods with poor and overcrowded housing conditions have an elevated risk of infection, as social contact is more likely in these living conditions.11 23 Urban connectivity, mobility and the mode of transport also play an important role in the spread of COVID-19.24 At the neighbourhood level, greater use of private motor vehicles and less public transport mobility means less exposure to infection.25 Likewise, infection rates may be lower where part of the (more mobile, international and national) population was able to leave before movement restrictions or where a higher proportion of people was able to work from home during lockdown. Conversely, rates may be higher where more essential workers live, as they are more likely to commute (women and immigrants from low-income countries are over-represented in these occupations). Overall, higher mortality rates from COVID-19 are associated with poorer neighbourhood conditions, including a scarcity of healthcare facilities.26 The number of nursing and retirement homes has also been associated with a greater number of infections in the neighbourhood.27

To date, COVID-19 research on spatial variations has mainly been set at the national or subnational levels. At this level of analysis, it is very difficult to disentangle the different intervening factors behind risks and exposures to COVID-19, as this approach fails to reveal the diverse patterns within these larger geographies. There is therefore a need to focus on geographically smaller units to better account for confounding factors28 and to enhance the predictive accuracy and interpretability of the resulting statistical model. As of late 2020, neighbourhood-level studies of socio-spatial inequality in COVID-19 infection and mortality have primarily focused on the USA and UK.29 30 Very little is known about such patterns in mainland Europe,31 especially in much denser and mixed-use urban environments. To address these shortfalls, we investigated the relationship between COVID-19 incidence and a comprehensive range of intraurban sociodemographic factors in Barcelona, Spain.


Study design and study population

This cross-sectional ecological study used data from the COVID-19 Register of the Barcelona Public Health Agency. During the first wave, Spain registered one of the highest per capita numbers of cases in Europe, making analysis at the local scale more reliable. Barcelona became one of the initial hotspots in the country, possibly due to its international position in tourism, business, education and research.32

Our study included 10 550 laboratory-confirmed cases of COVID-19 in Barcelona between 9 March and 3 May 2020. We selected these dates to focus on the first outbreak of the pandemic. During this period, tests were essentially performed on those hospitalised or belonging to specific at-risk groups, especially healthcare workers, as well as residents and workers in long-term care facilities (LTCFs). However, confirmed cases registered in LTCFs were excluded for two reasons: test campaigns were unevenly implemented across time and space, and the addresses of residents correspond to those of the LTCFs, which do not necessarily reflect the socioeconomic position of the residents themselves.

Our geographical unit of observation is the neighbourhood. We aggregated addresses of positive-tested individuals by neighbourhood of residence. Although the municipality of Barcelona (1.64 million inhabitants) is officially divided into 73 barris (Catalan for neighbourhoods), for statistical purposes we followed the adaptation developed by the Spanish National Statistical Office in several studies.33 This alternative division is based on the official administrative division but creates more statistically robust units in terms of population size, merging the least populated units with neighbouring ones and splitting the most populated ones, always according to urban and sociodemographic criteria. Our final division consists of 76 units (henceforth referred to as neighbourhoods). On average, they contain 21 500 inhabitants and cover an area of 1.3 km². These units are very diverse in terms of wealth, housing characteristics, demographic ageing and health, factors known to be associated with the spread of infectious diseases.
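The aggregation step described above can be sketched as follows. This is a minimal illustration only: the neighbourhood labels, case counts and populations are hypothetical, and `cumulative_incidence` is a helper name introduced here, not part of the study's code. The last lines apply the same formula to the city-wide totals quoted in the text (10 550 cases, a rounded population of 1.64 million).

```python
from collections import Counter


def cumulative_incidence(case_neighbourhoods, population, per=100_000):
    """Return cases per `per` inhabitants for every neighbourhood.

    case_neighbourhoods: one neighbourhood label per confirmed case.
    population: mapping of neighbourhood label to number of inhabitants.
    """
    counts = Counter(case_neighbourhoods)
    return {unit: counts.get(unit, 0) / pop * per
            for unit, pop in population.items()}


# Hypothetical toy data: three neighbourhoods, one label per case.
cases = ["A"] * 230 + ["B"] * 65 + ["C"] * 140
population = {"A": 22_000, "B": 20_000, "C": 21_500}

incidence = cumulative_incidence(cases, population)
for unit, value in sorted(incidence.items()):
    print(f"{unit}: {value:.1f} per 100 000")

# Same formula applied to the city-wide figures quoted in the text.
city_incidence = 10_550 / 1_640_000 * 100_000
print(f"Barcelona overall: {city_incidence:.0f} per 100 000")
```

Because the population figure is rounded, the city-wide value is approximate.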

Intraurban sociodemographic covariates

A total of 16 neighbourhood-level indicators on demographic structure, socioeconomic status, urban and household density, mobility and health characteristics were initially chosen based on earlier established associations with COVID-19 (see table 1 for sources, expected association with COVID-19 and summary statistics). Specifically, we included information on the proportion of (1) young people (ages 0–15 years) and (2) elderly (70 years and older), and (3) the percentage of the population aged 70+ years that was male. Socioeconomic indicators included were (4) mean income per person, (5) age-standardised ratio of population with at least post-secondary education, (6) percentage of the population born in foreign countries with a high Human Development Index (HDI) and (7) percentage born in countries with a low HDI. We also included (8) population density, (9) average number of persons per dwelling and (10) people living alone. We obtained mobility data on (11) the availability of private transportation and (12) mobility during lockdown. We also captured the presence of (13) transient populations (measured as the rate of inhabitants automatically deregistered by the municipality, which occurs when foreign residents fail to renew their registration), as cumulative infection may be lower in areas with hypermobile groups (eg, international students) that were likely to leave the city due to the pandemic. We also incorporated (14) the number of LTCF beds per 1000 inhabitants and (15) the percentage of the economically active population working in the health sector. Lastly, we included (16) life expectancy at birth as a proxy for general health status.

Table 1

Covariates used in the study: hypothesised association with COVID-19, definitions, sources and summary statistics before transformation (when required*)

Statistical analyses

Data transformation

The distribution of each neighbourhood-level sociodemographic indicator and covariate was first assessed for normality using visual inspection of QQ plots and the Kolmogorov-Smirnov test for normality. Accordingly, we log-transformed: (1) young population, (2) income, (3) foreigners from high-HDI countries, (4) foreigners from low-HDI countries, (5) mobility during lockdown and (6) transient populations. We also used a square root transformation for the nursing homes (LTCF beds) variable.
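A rough sketch of this kind of check, using only the Python standard library. The data are synthetic (a deliberately right-skewed indicator for 76 units), and `ks_distance_to_normal` is a hypothetical helper that returns only the Kolmogorov-Smirnov distance. It does not compute a p value; when the mean and standard deviation are estimated from the same sample, the classical Kolmogorov-Smirnov p values no longer apply directly, so the distance is used here purely for comparison before and after transformation.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_distance_to_normal(values):
    """Kolmogorov-Smirnov distance between the empirical distribution of
    `values` and a normal distribution with the sample mean and SD."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical step function
        # just after and just before each observation.
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance


# Synthetic, right-skewed indicator for 76 hypothetical neighbourhoods.
n_units = 76
raw = [math.exp(NormalDist().inv_cdf((i - 0.5) / n_units))
       for i in range(1, n_units + 1)]
logged = [math.log(v) for v in raw]

d_raw = ks_distance_to_normal(raw)
d_log = ks_distance_to_normal(logged)
print(f"KS distance, raw: {d_raw:.3f}; after log transform: {d_log:.3f}")

# A square root transform stays defined at zero, unlike the logarithm
# (hypothetical bed counts per 1000 inhabitants).
beds = [0.0, 4.0, 9.0, 25.0]
sqrt_beds = [math.sqrt(v) for v in beds]
```

A smaller distance after transformation indicates a distribution closer to normal; in practice this would be read alongside the QQ plots.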

Multiple variables model

To fit the total number of cases observed in each unit of analysis, we relied on a generalised linear model (Quasi-Poisson regression) that takes into account the total population as an offset as well as the sociodemographic variables. Given the relatively large number of covariates included in the study and the potential multicollinearity among them, we ran a lasso analysis to automatically identify the most relevant variables.34 In the context of generalised linear regression modelling and prediction, lasso performs both variable selection and regularisation to enhance the prediction accuracy and interpretability of the statistical model. The hyperparameter of the lasso-regularised maximum likelihood estimator was set using cross-validation and, once lasso had identified the most informative variables, we fitted the final Quasi-Poisson model that explained the COVID-19 incidence for each unit of analysis considered. Finally, variable elasticities were calculated. These make it possible to estimate the increase in cumulative incidence (and to predict the total number of positive cases) for a 1% change in a particular covariate, and thereby to compare the effects of the different covariates.
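To make the model structure concrete, here is a minimal sketch of a Poisson log-linear fit with log(population) as offset, written in plain Python with Newton-Raphson iterations. It is not the authors' code: the data are synthetic, there is a single hypothetical covariate (`pct_70plus`), the lasso step is omitted, and `fit_quasi_poisson` is a name introduced here. The quasi-Poisson coefficients coincide with the Poisson ones; the difference is the dispersion, estimated below as the Pearson chi-square divided by the residual degrees of freedom, which multiplies the standard errors by its square root. The elasticity line assumes an untransformed covariate under a log link, where the point elasticity is the coefficient times the covariate value (for a log-transformed covariate the coefficient itself is the elasticity).

```python
import math


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fitted_counts(X, population, beta):
    """Expected cases: population * exp(X beta), so log(population) is the offset."""
    return [pop * math.exp(sum(x * b for x, b in zip(row, beta)))
            for row, pop in zip(X, population)]


def fit_quasi_poisson(X, y, population, max_iter=50, tol=1e-10):
    """Return (coefficients, dispersion, fitted counts). X must start with a 1s column."""
    n, p = len(y), len(X[0])
    # Start from the overall rate for the intercept and zero slopes.
    beta = [math.log(sum(y) / sum(population))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = fitted_counts(X, population, beta)
        score = [sum(X[i][j] * (y[i] - mu[i]) for i in range(n)) for j in range(p)]
        info = [[sum(X[i][j] * X[i][k] * mu[i] for i in range(n)) for k in range(p)]
                for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    mu = fitted_counts(X, population, beta)
    # Pearson chi-square over residual degrees of freedom.
    dispersion = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu)) / (n - p)
    return beta, dispersion, mu


# Synthetic data for 76 hypothetical neighbourhoods (not the study data).
n_units = 76
pct_70plus = [10 + 15 * ((37 * i) % n_units) / (n_units - 1) for i in range(n_units)]
population = [15_000 + 170 * i for i in range(n_units)]
centre = sum(pct_70plus) / n_units
X = [[1.0, x - centre] for x in pct_70plus]  # intercept + centred covariate
true_rate = [math.exp(-5.05 + 0.04 * (x - centre)) for x in pct_70plus]
# Deterministic multiplicative noise to mimic overdispersion.
cases = [round(pop * r * (1 + 0.25 * math.sin(1.7 * i)))
         for i, (pop, r) in enumerate(zip(population, true_rate))]

beta, dispersion, mu = fit_quasi_poisson(X, cases, population)
elasticity_at_mean = beta[1] * centre  # % change in incidence per 1% change in x
print(f"slope={beta[1]:.4f}, dispersion={dispersion:.2f}, "
      f"elasticity at mean={elasticity_at_mean:.2f}")
```

A dispersion well above 1, as in this synthetic example, is the situation in which a quasi-Poisson model is preferred over a plain Poisson model.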


The intraurban geography of the COVID-19 cumulative incidence in Barcelona during the period of study reveals strong spatial clustering: units with the highest values lie close to one another, as do units with the lowest values (figure 1). Northern neighbourhoods (mainly located within the districts of Nou Barris and Horta-Guinardó) have the highest incidence values, with some of them exceeding 1000 cases per 100 000 inhabitants during the 8 weeks of observation. In contrast, the incidence in the geographical units located in the southeast of the city (ie, the historical centre) is less than one-third of that in the worst-affected neighbourhoods.

Figure 1

Intraurban distribution of COVID-19 cumulative incidence in Barcelona from 9 March to 3 May 2020 (per 100 000 inhabitants).

From the initial 16 variables considered, the lasso method selected the following seven as meaningful for explaining the observed COVID-19 levels (see also online supplemental material): (1) elderly, (2) high education, (3) foreigners from high-HDI countries, (4) population density (urban), (5) mobility during lockdown, (6) LTCF and (7) health workers. These variables are mapped in figure 2.
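How a lasso penalty drops variables can be illustrated with a deliberately simplified stand-in. The study applied lasso to the Poisson-type likelihood with a cross-validated penalty; the sketch below instead uses a squared-error loss, a fixed penalty `lam` and a tiny hand-made design, purely to show the soft-thresholding mechanism that sets weak coefficients exactly to zero. All names and numbers are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink `value` towards zero by `lam`; return 0 if it falls within [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Columns of X and the response y are assumed centred (no intercept).
    """
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
    return beta


# Two orthogonal, centred covariates; the response depends strongly on the
# first (coefficient 3) and only weakly on the second (coefficient 0.1).
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [3.1, 2.9, -2.9, -3.1]

selected = lasso_coordinate_descent(X, y, lam=0.5)
print(selected)
```

With this penalty the weak covariate is removed entirely while the strong one is kept but shrunk, which is why a model is often refitted without the penalty on the selected variables, as done in the study with the final Quasi-Poisson model.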

Figure 2

Intraurban distribution of the sociodemographic covariates. HDI, Human Development Index.

Results of our Quasi-Poisson model confirm that the associations between the final selection of variables and the intraurban COVID-19 incidence in Barcelona are all in the expected direction (table 2). Neighbourhoods that are densely populated, with a higher number of older adults, with more LTCFs and with higher proportions of individuals who left their area of residence during lockdown were statistically more likely to have a higher number of cases of COVID-19 during the first outbreak of the pandemic. The association with work in health-related occupations was weaker (p=0.063). Conversely, the association with COVID-19 cases is negative for the other two socioeconomic factors, post-secondary-educated residents and population born in high-HDI countries, with the second being less relevant (note that while the cross-validation analysis of the lasso-regularised 16-variable regression deems the high-HDI variable meaningful, the p value associated with the 7-variable regression casts doubt on its statistical significance). Considering the effect of the factors on the number of COVID-19 infections in a neighbourhood of Barcelona with average characteristics, a 1% increase in older people or mobility during lockdown would lead to almost 30 extra cases, while a neighbourhood with a 1% higher ratio of post-secondary-educated inhabitants would register 26 fewer cases during the observed period according to our model. Finally, we ran a Global Moran’s I test to assess the potential spatial autocorrelation of the model’s residuals, but the results were not significant (see online supplemental material).
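For readers unfamiliar with the residual check mentioned above, the Global Moran's I statistic can be sketched in a few lines. The example assumes binary contiguity weights and a toy chain of four units; the weighting scheme actually used in the study is not described in this text, and `morans_i` is a hypothetical helper. Significance is usually assessed separately, for instance by random permutation of the values, which is not shown here.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary contiguity weights.

    values: one number per spatial unit (eg, model residuals).
    neighbours: neighbours[i] is the set of indices adjacent to unit i.
    """
    n = len(values)
    centre = sum(values) / n
    z = [v - centre for v in values]
    total_weight = sum(len(adjacent) for adjacent in neighbours)
    cross = sum(z[i] * z[j]
                for i, adjacent in enumerate(neighbours)
                for j in adjacent)
    return (n / total_weight) * cross / sum(v * v for v in z)


# Four hypothetical units arranged in a line: 0 - 1 - 2 - 3.
chain = [{1}, {0, 2}, {1, 3}, {2}]

smooth = morans_i([1, 2, 3, 4], chain)       # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours always differ
print(f"smooth gradient: {smooth:.3f}, alternating pattern: {alternating:.3f}")
```

Values clearly above the null expectation of -1/(n-1) indicate clustering of similar residuals, values clearly below it indicate a checkerboard-like pattern, and values near it are consistent with no spatial autocorrelation.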

Table 2

Results of the generalised linear (Quasi-Poisson regression) analysis of social and demographic factors on COVID-19 infection rates in Barcelona from 9 March to 3 May 2020

Discussion, interpretation and implications


Our results confirm that incidence of COVID-19 is related to several intraurban sociodemographic factors. In Barcelona, higher rates of infection were found in geographical units that were more densely populated, had more residents aged 70 years or over, observed high levels of mobility during lockdown, contained more nursing home facilities and had the highest levels of people working in health-related occupations. Conversely, neighbourhoods with relatively more residents with high levels of education and with an immigration background from high-HDI countries registered fewer COVID-19 infections.

Our results are mostly in line with other indicators of spatial health inequalities for Barcelona which indicate that residents in neighbourhoods located in the north of the city—generally lower income neighbourhoods, with lower education, denser areas and higher immigration from lower HDI countries (as an indicator of ethnicity)—also have lower life expectancy and suffer more from chronic diseases.35 The same exposures that put residents at risk of general poor health and comorbidities also have implications for risk of COVID-19 infections.8 9

The environmental justice literature further demonstrates several causal pathways that may account for health differences by neighbourhood socioeconomic status, showing, for example, that neighbourhoods with high percentages of low-income and non-university-educated residents have historically had more environmental hazards,36 exposing residents to greater risks and greater related health impacts. Because urban social and health injustices, including poor housing conditions and a greater risk of economic disadvantage, already existed in those neighbourhoods with higher COVID-19 incidence in Barcelona, the current pandemic is likely to reinforce health and social inequalities and urban environmental injustice. People living in these neighbourhoods have less of a social safety net during times of both health and socioeconomic stress. They are thus more likely to face an unjust burden in overcoming the pandemic and its economic consequences.

During spring 2020, the lockdown in Spain limited mobility strictly to those working in essential services, including low-wage jobs that require commuting by public transit to other parts of the city, which predicts higher COVID-19 incidence in geographical units with higher numbers of commuters. For these workers, additional health inequalities are likely to manifest because essential workers are often underpaid and underprotected, in positions that require close interactions with the public. Additionally, they may already suffer from underlying health conditions due to their lower socioeconomic status, as recent research suggests.37 As non-essential workers are losing their jobs or facing less pay, these hardships affect communities with lower education (and consequently lower income) more, and jeopardise their ability to overcome the pandemic in the long term.38 In contrast, more privileged residents have a greater ability to recover financially and physically. The negative association we found between infection and neighbourhoods with high percentages of individuals with a post-secondary degree and/or born in high-HDI countries can be understood from a dual perspective. First, the presence of residents of this type is closely associated with neighbourhoods dominated by middle and upper socioeconomic households, which, in addition, were more likely to work remotely. Second, this group is increasingly formed by young mobile and transient populations,39 who had the chance to return to their home countries at the initial stage of the pandemic.

Last, the results also indicate an expected structural age-related vulnerability: neighbourhoods with a higher percentage of residents over 70 years and/or with more nursing homes had higher COVID-19 incidence. These are thus intersectional social vulnerabilities, particularly important in a context like Spain, which has a highly aged population and a high number of residents in nursing homes, many of whom suffer from comorbid conditions.

Strengths and limitations

Barcelona is an excellent example for disentangling the spread of the infection within dense and highly mixed-use European urban areas. Its socioeconomic and urban conditions differ significantly from those of other urban contexts where most of the research has been conducted. Another strength of our study is that the high number of COVID-19 cases in Barcelona enabled us to test various area-level indicators. In addition, the vast availability of aggregated sociodemographic data at a fine-grained scale allowed us to include many contextual factors that in other studies are often analysed separately. Nevertheless, using geographically aggregated data also has its limitations, as associations found in ecological studies may not necessarily reflect those observed at the individual level. An interesting future line of analysis would be to create buffer zones based on case addresses in order to overcome the limitations of administrative boundaries. Another limitation was that our estimates cover only the municipality of Barcelona and do not include data from the metropolitan area. Last, our measurement of incidence was biased towards patients with more severe COVID-19, as testing procedures were restricted to hospital admissions at this stage of the pandemic. The seroprevalence study conducted between 27 April and 11 May estimated that 7% of the residents in Barcelona’s province had developed IgG antibodies against SARS-CoV-2.40 Assuming this prevalence for the city, the total number of cases that we analysed represented between 10% and 15% of the people who became infected during our period of study. Therefore, our model is likely to be biased in estimating intraurban variations of the entire infected population, but not in predicting the most severe cases. Our results may also differ from those of subsequent waves, when massive and rapid COVID-19 testing that also detects asymptomatic cases became available. As asymptomatic infection is more common among younger people, the predictive value of the percentage 70+ variable in intraurban variation of COVID-19 will likely be lower in subsequent waves.
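The order of magnitude of the detected share can be checked with a back-of-envelope calculation. The sketch below uses only the rounded figures quoted in the text (1.64 million inhabitants, 10 550 confirmed cases, the 7% point estimate); the variable names are introduced here for illustration. It does not attempt to reproduce the 10% to 15% range, whose exact inputs (for instance the survey's uncertainty interval or a more precise population count) are not given in this text; the last lines only show which prevalence values that range would correspond to under these rounded figures.

```python
population = 1_640_000    # rounded municipal population quoted in the text
confirmed_cases = 10_550  # laboratory-confirmed cases analysed
seroprevalence = 0.07     # point estimate for Barcelona's province

# Share of estimated infections captured as confirmed cases.
estimated_infections = population * seroprevalence
detected_share = confirmed_cases / estimated_infections
print(f"Estimated infections: {estimated_infections:,.0f}")
print(f"Detected share at 7% prevalence: {detected_share:.1%}")

# Prevalence implied by a given detected share, under the same rounded figures.
implied_prevalence = {share: confirmed_cases / (population * share)
                      for share in (0.10, 0.15)}
for share, prevalence in implied_prevalence.items():
    print(f"A detected share of {share:.0%} implies a prevalence of {prevalence:.1%}")
```

Whatever the exact denominator, the calculation supports the qualitative point made above: the confirmed cases are a small and severity-weighted fraction of all infections.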

Final thoughts

Despite initial media and political narratives framing the pandemic as a social equaliser, our analysis shows how vulnerable groups by occupation, age and ethnicity, who reside in Barcelona neighbourhoods with poor pre-existing social and environmental conditions, have statistically higher incidences of COVID-19. With the pandemic, their exposure to overlapping health risks has been compounded by new ones. The COVID-19 pandemic is therefore likely to reinforce existing health and social inequalities, and exacerbate urban environmental injustice in the city. These trends call for public policies and planning interventions to address neighbourhood environmental and social factors, strengthen social welfare and healthcare systems, and improve open green and public spaces to serve as resources and refuges for socially vulnerable groups.

What is already known on this subject

  • Previous research on virus transmission has shown that individual, household, and neighbourhood-level socioeconomic and cultural factors are associated with viral transmission.

  • Most COVID-19 research on spatial variations has been set at the national or subnational regional level. Because of the internal heterogeneity of these units, it is very difficult to disentangle the different intervening demographic and socioeconomic factors behind risks and exposures to COVID-19.

  • The limited research on the COVID-19 pandemic at the neighbourhood level (mainly in the USA and UK) identifies the effect of sociodemographic determinants, like socioeconomic status or ethnicity.

What this study adds

  • We analyse the spread of COVID-19 in Barcelona, a very dense and highly segregated city in Southern Europe, where the first outbreak led to very high incidence levels.

  • We test a wide range of sociodemographic and urban characteristics, including mobility during lockdown, 16 variables in total, in order to predict intraurban variations in COVID-19 infections at the neighbourhood level in Barcelona.

  • The COVID-19 pandemic is likely to reinforce existing health and social inequalities, and exacerbate urban environmental injustice. These trends call for public policies and planning interventions that address historically poor neighbourhood environmental and social factors, strengthen social welfare systems, and improve open green and public spaces in cities.

Ethics statements

Patient consent for publication

Ethics approval

No ethical approval was sought for this study as it used aggregated, anonymous and publicly available data, collected at the neighbourhood level.


  • Twitter @tonilopezga, @popageing, @hvscole, @ItaTrigueroMas, @ianguelovski, @epistatistic, @ModenesJA, @flopez_gallego, @carme1848

  • Contributors AGM, AL-G and FL-G conceived the study. AL-G, FL-G, MM-D and JAM collected data, calculated indicators and built the final dataset. AGM, AL-G, MM-D, JS and MT-M conducted the statistical analyses. IA, CB, HC, AGM, AL-G, JS and MT-M wrote the paper. AL-G was the principal investigator of the study. All authors contributed to the interpretation of data, and read, edited and approved the final manuscript.

  • Funding This project has been funded by the following programmes: H2020 European Research Council (GREEN LULUs SG/GA678034 and HEALIN CoG/GA864616); Ministerio de Ciencia e Innovación (CSO2016-79142-R; GLOBFAM/RTI2018-096730-B-I00I3; ‘Maria de Maeztu’ Program/CEX2019-000940-M; ‘Ramón y Cajal' Program/RYC-2013-14851; 'Juan de la Cierva' Program/FJCI-2017-33842 and IJC2018-035322-I); Agència de Gestió d'Ajuts Universitaris i de Recerca (DEMFAMS/2017 SGR 1454); Talent Research Program, Universitat Autònoma de Barcelona.

  • Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.