Neighbourhood risk factors of recurrent tuberculosis in Cape Town: a cohort study using geocoded notification data
  1. Marjan Molemans1,2,3,4,
  2. Frank van Leth4,5,
  3. David Henry McKelly6,
  4. Robin Wood7,8,
  5. Sabine Hermans1,2,4,9
  1. 1 Amsterdam Institute for Global Health and Development, Amsterdam, Netherlands
  2. 2 Department of Global Health, Amsterdam UMC Locatie Meibergdreef, Amsterdam, Netherlands
  3. 3 Amsterdam Institute for Social Science Research, Amsterdam, Netherlands
  4. 4 Amsterdam Public Health Research Institute, Amsterdam, Netherlands
  5. 5 Department of Health Sciences, VU Amsterdam, Amsterdam, Netherlands
  6. 6 Smart Place, Council for Scientific and Industrial Research, Cape Town, South Africa
  7. 7 University of Cape Town Desmond Tutu HIV Centre, Cape Town, South Africa
  8. 8 Faculty of Health Sciences, University of Cape Town Institute of Infectious Disease and Molecular Medicine, Cape Town, South Africa
  9. 9 Department of Infectious Diseases, Amsterdam UMC Locatie Meibergdreef, Amsterdam, Netherlands
  1. Correspondence to Mrs Marjan Molemans, Amsterdam Institute for Global Health and Development, Amsterdam 1105, North Holland, The Netherlands; m.molemans{at}aighd.org

Abstract

Background Individuals with a history of tuberculosis (TB) disease are at higher risk of developing a subsequent episode than those without. Considering the role of social and environmental factors in tuberculosis, we assessed neighbourhood-level risk factors associated with recurrent tuberculosis in Cape Town, South Africa.

Methods This cohort consisted of patients who completed treatment for their first drug-sensitive TB episode between 2003 and 2015. Addresses were geocoded at neighbourhood level. Data on neighbourhood-level factors were obtained from the Census 2011 (household size, population density) and the City of Cape Town (Socio-Economic Index). Neighbourhood-level TB burden was calculated annually by dividing the number of notified TB episodes by the population in that neighbourhood. Multilevel survival analysis was performed with the outcome recurrent TB, defined as a second episode of TB, and controlling for individual-level risk factors (age, gender and time since first episode in years). Follow-up ended at the second episode, or on 31 December 2015, whichever came first.

Results The study included 173 421 patients from 700 neighbourhoods. Higher Socio-Economic Index was associated with a lower risk of recurrence compared with average Socio-Economic Index. An increased risk was found for higher household size and TB burden, with an increase of 20% for every additional person in mean household size and 10% for every additional TB episode/100 inhabitants. No association was found with population density.

Conclusion Recurrent TB was associated with increased household size and TB burden at neighbourhood level. These findings could be used to target TB screening activities.

  • GIS
  • SOCIAL CLASS
  • TUBERCULOSIS
  • Health inequalities

Data availability statement

Data may be obtained from a third party and are not publicly available. These data were obtained from the Cape Town Electronic TB register, after receiving permission from Cape Town City Health (Health.Services@capetown.gov.za). Access to the data was limited to the conduct of relevant analysis and publication of results. A request to access data can be made directly to Cape Town City Health.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Research on recurrent tuberculosis (TB) has largely focused on biomedical factors, while the association between neighbourhood factors and TB in general has mixed findings.

WHAT THIS STUDY ADDS

  • This study shows that neighbourhood TB burden and mean household size play an important role in the risk of recurrent TB. Living in a neighbourhood with very good or good Socio-Economic Index is associated with a decreased risk of recurrent TB.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • This study shows the importance of neighbourhood factors in recurrent TB and can provide a good starting point for targeted screening.

Background

Recurrent episodes of tuberculosis (TB) play an important role in the yearly incidence of TB in Cape Town, South Africa, accounting for one quarter of TB notifications.1 2 Individuals with a prior episode of TB disease are at higher risk of developing another episode than those without.3

The conceptual framework from Lönnroth et al identifies upstream determinants and proximate risk factors of TB disease.4 The association between the higher risk of TB and the upstream determinants, low socioeconomic status and poverty, is largely effectuated by proximate risk factors, which can be categorised in two groups: factors directly increasing exposure (such as crowding and poor ventilation) and factors impairing host defence (such as smoking, HIV, malnutrition, lung diseases, diabetes, alcoholism, age and gender).4 Studies exploring risk factors for recurrent TB have mainly focused on biomedical risk factors.5–8

Neighbourhood socioeconomic status is an upstream determinant for TB. Mean household size and population density are measures of the proximate risk factor crowding. The evidence on the association between different measures of neighbourhood socioeconomic status and TB is mixed; studies found no association in South Africa,9 10 a positive association in Zambia11 and a negative association in Zambia and South Africa.12 No studies have examined neighbourhood factors and recurrent TB, leaving the path from upstream determinants to proximate risk factors for TB recurrence understudied.

Recurrent TB disease is due either to relapse or to reinfection. A systematic review showed that in settings with higher background TB incidence, the proportion of reinfections is higher and the predominance of relapse over reinfection decreases.13 DNA fingerprinting in a small cohort in South Africa showed that relapse occurs most often shortly after treatment completion, while reinfection is more common after 12 months and is responsible for two-thirds of recurrences.14

Drawing from the literature on recurrent TB and on neighbourhood risk factors for TB, we developed a conceptual framework, adapted from Lönnroth et al,4 with the information available to us (figure 1). The primary objective was to study the association between neighbourhood risk factors and recurrent TB in Cape Town, South Africa. We also performed two secondary analyses. First, we compared the association between neighbourhood factors and recurrence within 12 months of treatment completion versus after 12 months, as a proxy for the underlying mechanism of relapse versus reinfection. Second, as HIV status of patients with TB was poorly registered prior to 2009,15 we performed an analysis restricted to the time period 2009–2015, including HIV status as an individual factor.

Figure 1

This model, adapted from Lönnroth et al and based on available information, identifies upstream determinants and proximate risk factors of TB disease.4 The association between the higher risk of TB and the upstream determinants, low socioeconomic status and poverty, is largely effectuated by proximate risk factors, which can be categorised in two groups: factors directly increasing exposure (such as crowding and poor ventilation) and factors impairing host defence (such as smoking, HIV, malnutrition, lung diseases, diabetes, alcoholism, age and gender).4 The factors in the dotted boxes cannot be measured. TB, tuberculosis.

Methods

Study setting

Cape Town had 3.7 million inhabitants according to the 2011 census.16 The city health department delivered healthcare in 129 facilities.17

TB care is free of charge and is provided predominantly by the public health sector, mainly in community clinics; patients diagnosed in the private sector are referred to a public clinic for treatment, since insurance schemes do not cover TB treatment.18

Sputum smear microscopy was used for diagnosis before 2013, and Xpert MTB/RIF thereafter.19

Study design and population

Our cohort consisted of all patients who completed treatment for their first notified drug-sensitive TB episode between 1 January 2003 and 31 December 2015 in Cape Town, South Africa. As the Electronic TB Register does not contain personal identifiers, episodes of the same person had previously been identified using probabilistic linkage.2 Patients who did not have their first episode during this period, or who did not complete their treatment, were excluded.

We included both individual and neighbourhood risk factors in our analysis. The neighbourhood factor Socio-Economic Index (SEI) is an upstream determinant in the model of Lönnroth et al,4 while mean household size and population density are proximate risk factors that affect the exposure to infectious droplets.4 The individual factors age, gender and HIV status are proximate risk factors that can impair host defence.4

Data sources

Individual characteristics, addresses and information on the TB episode were retrieved from the Electronic TB Register for Metropolitan Cape Town.

Information on neighbourhood characteristics was retrieved from the 2011 South African National Census, except for the variable SEI, which was retrieved from the City of Cape Town report on the Socio-Economic Index.20 Neighbourhoods with fewer than 20 households were excluded (87 out of 922 neighbourhoods). Neighbourhoods were categorised as very low (4.7% of all neighbourhoods), low (4.4%), average (7.4%), high (43.4%) and very high (40.1%) SEI, resulting in 6.7% of the population living in neighbourhoods categorised as very low, and 18.8% in neighbourhoods categorised as low.20

Study definitions

TB recurrence was defined as having a second episode of TB disease during the follow-up time.

The individual-level characteristic age was included as a time-varying variable, categorised in six age categories. Gender had two categories: male and female. HIV status at TB diagnosis was positive, negative or unknown. Follow-up time was defined as years since end of treatment, categorised in 13 one-year categories.

The neighbourhood-level factor population density was calculated as the number of inhabitants in the neighbourhood per square kilometre, divided by 1000. Mean household size, a discrete variable, was calculated from the information on household size in the census. The census included categorical information on how many households have a specific size, ranging from 1 to 10+. To calculate the mean neighbourhood household size, we multiplied the number of households in each category by the household size of that category, using 10 for the 10+ category, summed these products and divided the total by the total number of households.
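As an illustration, the household size calculation described above can be sketched in Python. This is a minimal sketch and not the authors' code: the function name, the dictionary layout and the example counts are hypothetical, and the only elements taken from the text are the weighting by household size and the use of 10 for the 10+ category.

```python
def mean_household_size(households_by_size):
    """Weighted mean household size for one neighbourhood.

    households_by_size maps a household size category (1 to 10, where 10
    stands in for the census category '10+') to the number of households
    in that category. The layout is a hypothetical one for illustration.
    """
    total_households = sum(households_by_size.values())
    if total_households == 0:
        raise ValueError("neighbourhood has no households")
    # Number of people implied by each category, summed over categories.
    total_people = sum(size * count for size, count in households_by_size.items())
    return total_people / total_households


# Made-up counts for a single neighbourhood (not data from the study).
example_counts = {1: 50, 2: 80, 3: 100, 4: 90, 5: 40, 10: 5}
example_mean = mean_household_size(example_counts)
```

Because households of more than 10 people are counted as 10, a calculation of this kind can only understate the true mean in neighbourhoods where such households exist.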

The Socio-Economic Index was calculated by the City of Cape Town using variables on housing, education, household services and economics from the census; it consisted of five categories ranging from very low to very high.20

To calculate the annual neighbourhood TB burden, we divided the total number of notified TB episodes in a calendar year (therefore including any number of recurrences, patients who had their first episode before our study period started and patients who did not complete treatment for their first episode) by the neighbourhood population size in 2011 and multiplied by 100, to estimate the notification rate per 100 inhabitants. Because some neighbourhoods had a small number of inhabitants and because of potential misclassifications in geocoding (see below), some neighbourhoods had unrealistic numbers of TB burden, even after the corrections made in the geocoding described below. Therefore, we excluded neighbourhoods for which the TB burden was greater than 3 standard deviations from the mean TB burden in any of the included years. This resulted in the exclusion of an additional 21 neighbourhoods, 11 of which also did not have a SEI. The TB burden of the year in which the patient completed treatment for the first episode was used and was time-varying in the analysis.
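The burden calculation and the exclusion rule in this paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and data layout are hypothetical, the deviation is taken in either direction from the mean, and the sample standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def annual_tb_burden(notified_episodes, population_2011):
    """Notified TB episodes in a calendar year per 100 inhabitants (2011 census population)."""
    return 100.0 * notified_episodes / population_2011


def neighbourhoods_to_exclude(burden_by_year, n_sd=3.0):
    """Return ids of neighbourhoods whose burden lies more than n_sd standard
    deviations from the mean burden in ANY of the included years.

    burden_by_year: {year: {neighbourhood_id: burden}} (hypothetical layout).
    """
    excluded = set()
    for burdens in burden_by_year.values():
        values = list(burdens.values())
        if len(values) < 2:
            continue  # a standard deviation needs at least two values
        year_mean = mean(values)
        year_sd = stdev(values)  # assumption: sample SD, not population SD
        for neighbourhood_id, burden in burdens.items():
            if abs(burden - year_mean) > n_sd * year_sd:
                excluded.add(neighbourhood_id)
    return excluded


# Made-up example: twenty neighbourhoods at 1 per 100 and one implausible value.
toy_year = {f"n{i}": 1.0 for i in range(20)}
toy_year["n20"] = 50.0
flagged = neighbourhoods_to_exclude({2011: toy_year})
```

A neighbourhood is dropped as soon as it is flagged in a single year, which mirrors the wording "in any of the included years".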

Geocoding

The addresses were geocoded and then mapped to their neighbourhood. For this we used subplace, the second-lowest administrative area in the census, as it is the most homogeneous in population groups. A subplace was defined as ‘suburb, section or zone of a township, smallholdings, village, sub-village, ward or informal settlement’.21 In this paper, we use the term neighbourhood for this level. The lowest level in the census was the enumeration area (100–250 households), and was created for survey purposes, making it less constant over time,22 and therefore, not suited for this analysis. Cape Town had a total of 922 neighbourhoods in the 2011 census. Since the addresses were collected as part of routine TB register data and were not intended for research purposes, they were not noted in a standardised manner. This affected their quality and our ability to geocode them.

We used complementary methods to geocode the addresses. We used Stata 16.1 (StataCorp) and ArcGIS (ESRI, Redlands, California, USA) to geocode the addresses with HERE maps (HERE technologies, Eindhoven, the Netherlands) as reference. We then used Google maps to perform a manual check of the neighbourhood location of a random sample geocoded by both methods and found Stata geocoding to be more correct. In case both methods were able to geocode the address to a neighbourhood, we therefore used the one geocoded by Stata. A subsequent check revealed some neighbourhoods with an unrealistically high annual TB burden given their population size, as a result of non-informative addresses. For these instances, we made systematic corrections using informative parts in the address. Online supplemental figure E1 shows the geocoding flow diagram.

After geocoding, the patients were assigned the neighbourhood in which the address was located. We then merged the census data and SEI data with the TB register data based on the neighbourhood.

We used the neighbourhood characteristics of the first TB episode, because we had this information both for patients with and without a recurrence.

Statistical methods

We used a multilevel survival model with individual and neighbourhood-level risk factors and the outcome recurrent TB, where neighbourhood was included as a random intercept. Follow-up time ended at the second episode, or on 31 December 2015, whichever came first.
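The article does not describe how the follow-up data were laid out for the model. One common arrangement for survival analyses with a categorical follow-up time and time-varying covariates is a person-period table with one row per patient per year of follow-up. The sketch below shows that arrangement purely as an assumption; the function name and the row layout are hypothetical, and the default of 13 intervals follows the 13 one-year categories mentioned in the study definitions.

```python
def split_follow_up(follow_up_years, had_recurrence, max_intervals=13):
    """Split one patient's follow-up into one-year intervals.

    Returns a list of (interval_number, years_at_risk, event) tuples, where
    event is 1 only in the final interval of a patient who had a recurrence.
    """
    if follow_up_years < 0 or follow_up_years > max_intervals:
        raise ValueError("follow-up must lie between 0 and max_intervals years")
    rows = []
    remaining = follow_up_years
    interval = 1
    while remaining > 0:
        at_risk = min(1.0, remaining)  # time spent in this one-year interval
        remaining -= at_risk
        event = 1 if (had_recurrence and remaining <= 0) else 0
        rows.append((interval, at_risk, event))
        interval += 1
    return rows


# A hypothetical patient with a recurrence 2.5 years after treatment completion.
rows_recurrence = split_follow_up(2.5, True)
# A hypothetical patient censored after exactly one year.
rows_censored = split_follow_up(1.0, False)
```

In a table like this, time-varying covariates such as age category can take a different value in each row, and the neighbourhood identifier on each row is what a random intercept would be grouped on.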

We performed two subanalyses. The first was stratified by the timing of recurrence: within the first 12 months or after 12 months. In the second subanalysis, we restricted the time period to 2009–2015 and included individual HIV status as a covariate. We also performed the primary model without HIV for the time period 2009–2015 to allow for comparison.

Results

We included 173 421 patients, with 15 013 recurrences, from 700 neighbourhoods (online supplemental material Figure E2).

Of the patients, 52% were male and the largest age category was 25–34 years (table 1). Neighbourhoods classified as low SEI housed 35% of patients, and the median neighbourhood density was 14 850 people/km². The median household size was 3.4 and the median annual TB burden was 1.1 per 100 inhabitants. The characteristics of the included neighbourhoods are reported in table 2.

Table 1

Individual and neighbourhood-level characteristics of patients (n=173 421)

Table 2

Characteristics of the included neighbourhoods (n=700)

Multilevel survival analysis

We found an association with the neighbourhood-level characteristics mean household size and annual TB burden, while we did not find an association with population density (figure 2 and online supplemental material). For every one-person increase in mean household size, the hazard for recurrent TB increased by 20% (hazard ratio (HR) 1.20, 95% confidence interval (CI) 1.12 to 1.28). For every increase in neighbourhood TB burden by 1 per 100 inhabitants, the hazard increased by 10% (HR 1.10, 95% CI 1.06 to 1.14). Patients who lived in a neighbourhood with a high or very high SEI had a lower hazard of recurrent TB than patients living in an average SEI neighbourhood. The individual factors that we controlled for showed the following associations: the hazard of recurrent TB was higher in all age groups compared with the reference group 0–14 years, with the highest hazard in the 35–44 years age group. Male gender was associated with an increased hazard of recurrent TB, and longer time after the end of treatment was associated with a decreased hazard, compared with the first year.
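The per-unit hazard ratios above can be read across larger differences between neighbourhoods. The sketch below assumes that each continuous covariate enters the model linearly on the log-hazard scale, so that a hazard ratio per unit compounds multiplicatively; the article reports only the per-unit estimates, and the helper names are hypothetical.

```python
def scale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a difference of `units`, assuming a log-linear effect."""
    return hr_per_unit ** units


def percent_change(hazard_ratio):
    """Express a hazard ratio as a percentage change in hazard."""
    return (hazard_ratio - 1.0) * 100.0


# Reported estimates: HR 1.20 per additional person in mean household size,
# HR 1.10 per additional TB episode per 100 inhabitants.
hr_two_persons = scale_hazard_ratio(1.20, 2)   # two-person difference in mean household size
hr_two_episodes = scale_hazard_ratio(1.10, 2)  # difference of 2 episodes per 100 inhabitants
```

Under this assumption, a two-person difference in mean household size corresponds to a hazard ratio of about 1.44 rather than 1.40, because the increases multiply rather than add.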

Figure 2

Association (hazard ratio and 95% confidence interval) between recurrent TB and individual and neighbourhood-level factors (n=173 421). TB, tuberculosis.

Multilevel survival analysis, stratified by timing of the recurrence

The analysis was stratified by timing of recurrence within 12 months of treatment completion, or after 12 months. We found associations similar to those found in the main analysis (figure 3 and online supplemental material), and the estimates did not differ between the two strata, with the exception of male gender. The association of annual TB burden with recurrence was present both in the stratum of recurrence within 12 months and after 12 months.

Figure 3

Association (hazard ratio and 95% confidence interval) between recurrent TB, within (n=173 421) and after 12 months (n=157 049), and individual and neighbourhood-level factors.

Multilevel survival analysis, including individual HIV status

The subset of this analysis included 93 751 patients from 635 neighbourhoods. Of these, 54 276 (58%) patients were HIV positive; 36 407 (39%) patients were HIV negative and 3068 (3%) had missing HIV status. Demographics were similar to the entire cohort (online supplemental material).

There was a clear association between HIV status and the hazard of recurrent TB, with an HR of 1.75 (95% CI 1.64 to 1.86) for HIV positive patients, compared with the reference group of HIV negative patients (figure 4 and online supplemental material). The other associations were similar to those of the main analysis, with the following exception: the association with annual TB burden was weaker when we restricted the time period to 2009–2015 and became even weaker when we included HIV status as an individual risk factor.

Figure 4

Association (hazard ratio and 95% confidence interval) between recurrent TB and individual, including HIV, and neighbourhood-level factors in time period 2009–2015 (n=93 751).

Discussion

This study shows that the higher the mean household size and the annual TB burden in the neighbourhood, the higher the risk of recurrent TB, while a higher population density did not affect the risk of recurrent TB. Living in a neighbourhood with a very high or high SEI decreased the risk of recurrent TB compared to an average SEI neighbourhood, while we did not observe an association with living in a low or very low SEI neighbourhood. We also found no difference in neighbourhood factors that were associated with recurrence within or after 12 months after previous treatment. We found a strong association with positive HIV status, and a weaker association with annual TB burden compared with the main analysis. This indicates that the effect of annual TB burden is partly explained by individual HIV status.

In the context of the model of Lönnroth et al,4 we did find an effect of the upstream determinant, SEI, although not for all categories. The downstream risk factors mean household size and neighbourhood TB burden are also associated with the risk of recurrent TB.

As both annual TB burden in the neighbourhood and mean household size increase the likelihood of exposure to infectious droplets, this suggests that reinfection plays an important role. The lack of association with population density has two possible explanations. First, population density does not take the inhabitable surface into account, so neighbourhoods with a low population density could still be very dense in some areas if part of the surface is uninhabitable. Second, the risk associated with population density is possibly mediated by annual TB burden and household size.

The reported decreased hazard of recurrence in high and very high SEI neighbourhoods is consistent with the findings in Zambia and in the Western Cape of South Africa.12 However, other studies found no association in South Africa.9 10 Our finding on neighbourhood TB burden is also in line with the study by Marx et al, which found that South African health districts with higher notification rates had higher proportions of recurrent TB.23

Although this study shows the value of combining register and census data, it also shows the shortcomings of geocoding in a context with limited resources. First, the register did not always record the addresses in a way that allowed them to be geocoded, as they were not collected for research purposes. Second, there is no national dataset of addresses to which the addresses could be compared. During the COVID-19 pandemic, researchers also drew attention to the difficulties of geocoding South African addresses.24 A recent systematic review showed that there are limited studies on spatially targeted interventions for TB, and highlighted the lack of municipal address systems in limited-resource settings as a potential cause.25

Issues with geocoding have also been reported in other studies on TB in Viet Nam26 and India.27 Another study in Cape Town, focusing on neighbourhood characteristics of hospitalised childhood burn injury using hospital register data, reported that 18.4% of addresses could not be geocoded,28 compared with 11.1% in our study. We found that using health subdistricts or districts, as two other studies in South Africa did,23 29 instead of geocoding addresses, was not suitable in the Cape Town metropolitan area, as health districts are very heterogeneous in terms of population and therefore neighbourhood factors.

The strengths of our study are that it is representative at population level and that we have a large cohort with a long follow-up time. However, there are a number of limitations. First, the difficulties in geocoding possibly caused misclassification of neighbourhoods. If we assume that people with lower SEI had a lower address quality, for example because they lived in an area without street names and house numbers, and if they had more recurrent TB, we would have underestimated the association between SEI and recurrent TB. Second, the covariates mean household size, SEI and population density were measured only at the 2011 census. Third, since the register only included episodes of drug-sensitive TB, if a patient had a second episode that was drug-resistant, it was not included. Our data are thus an underestimation of the total burden of recurrent TB. Fourth, we could not include neighbourhood HIV prevalence and ART coverage, due to a lack of data at this level. Fifth, we assigned patients the neighbourhood characteristics of the address of their first TB episode, which amounts to an assumption that patients did not move during the follow-up time. As we expect most people who moved during follow-up would have moved to neighbourhoods with similar population and neighbourhood characteristics, we do not expect this to significantly bias our results.

Given the large burden of recurrent TB in Cape Town, it is important to consider interventions to reduce this burden. Based on our study, it could be beneficial to put additional effort into TB screening in neighbourhoods with a large mean household size and a high annual TB burden. The association with household size is clear in all our analyses, and as this information is available in most settings, it could provide a starting point for targeting neighbourhoods. The WHO recommends systematic TB screening in communities with a high TB prevalence, defined as 0.5% or higher.30 If we compare this with our study population, we see that more than 75% of our cohort lived in a neighbourhood that meets this criterion. Therefore, targeting those neighbourhoods with a high household size could help prioritise scarce resources.

A recent study showed the feasibility and usefulness of targeting neighbourhoods with a high TB burden for TB screening.31 The ‘Targeted Universal Testing for TB’ (TUTT) study and a modelling study showed the added value of targeting people with a previous episode of TB as one of the risk groups for TB screening.32 33 Our study and that of Marx et al 23 contribute to the evidence that living in a neighbourhood with a high TB burden increases the risk of recurrent TB, indicating that it should be targeted in screening. The best strategy for this, community-based as in the ‘Tuberculosis Neighbourhood Expanded Testing’31 or clinic-based approach as in the TUTT study, is unknown. Future research could focus on comparing these strategies. Another interesting direction for future research would be to investigate neighbourhood factors and TB burden of all episodes, instead of only recurrence.

In conclusion, neighbourhood factors were associated with recurrent TB in Cape Town. Mean household size and annual TB burden increased the risk of recurrent TB, while living in a neighbourhood with high or very high SEI decreased the risk, when controlling for the individual-level factors of age, gender and follow-up time. Our findings indicate that targeting certain neighbourhoods for screening may be important.

Ethics statements

Patient consent for publication

Ethics approval

The Human Research Ethics Committee at the University of Cape Town granted ethical approval and the Cape Town City Health Department granted permission to use the data.

Acknowledgments

The authors thank Dr Karen Jennings and Judy Caldwell for their input in the conceptualisation and interpretation of the findings.

Footnotes

  • Twitter @FvLScience

  • Contributors MM: data management, geocoding, data analysis, interpretation of the data and drafting the manuscript. FvL: data analysis, interpretation of the data and critical review of the manuscript. DHM: data acquisition, geocoding, interpretation of the data and critical review of the manuscript. RW: study conceptualisation, data acquisition, interpretation of the data and critical review of the manuscript. SH: study conceptualisation, data acquisition, data analysis, interpretation of the data and critical review of the manuscript. All authors approved the final draft of the manuscript. MM is the guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
