LB2 Applying machine learning to pooled qualitative studies on active travel: a method to uncover unanticipated patterns to inform behaviour change?
E Haynes¹, R Garside¹, J Green², MP Kelly³, J Thomas⁴, C Guell¹

¹European Centre for Environment and Human Health, University of Exeter Medical School, Truro, UK
²School of Population Health and Environmental Sciences, King’s College London, London, UK
³Cambridge Institute of Public Health, University of Cambridge, Cambridge, UK
⁴Institute of Education, University College London, London, UK


Background Innovative approaches are required to better address physical inactivity. To move beyond individual approaches to behaviour change and develop more appropriate insights for the complex challenge of increasing population levels of activity, recent research has drawn on social practice theory. This theoretical approach describes the relational character of active living and related social practices. However, to date these investigations have been limited to small-scale qualitative research studies. To move beyond individual contexts and population groups and uncover conditions for ‘practice change’ across similar datasets, we explored a novel approach to qualitative data synthesis. Our aim was to pool several qualitative studies and apply machine learning to uncover patterns and interconnections in ‘active travel’ that have not emerged from the original qualitative data analyses.

Methods A pooled qualitative dataset of almost 250 transcripts was drawn from five studies conducted in different contexts in the UK, including Belfast, London, Glasgow, Cambridge and Cardiff. Machine learning approaches such as text mining have previously been applied to identify key recurring terms in large data sets. Recent software developments suggest the possibility of identifying ‘concepts within context’. This unsupervised analysis of inter-relating concepts, which focuses on pattern recognition, is known as ‘topic modelling analysis’. Text mining analysis software, Leximancer, was used to analyse the data and produce inter-topic distance maps to illustrate ‘themes’ and constituent ‘concepts’.
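As an illustration only, the idea of identifying ‘concepts within context’ can be sketched as counting how often pairs of terms appear in the same text segment. This is not Leximancer's algorithm, which the abstract does not describe; it is a simplified stand-in using only the Python standard library. The stop-word list, the function names and the toy segments are all hypothetical and are not drawn from the study transcripts.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOPWORDS = {"the", "a", "to", "and", "i", "is", "it", "of", "my", "on", "in", "for", "with"}


def tokenize(segment):
    """Lower-case a text segment and return its distinct non-stop-word terms."""
    words = re.findall(r"[a-z']+", segment.lower())
    return {w for w in words if w not in STOPWORDS}


def cooccurrence_counts(segments):
    """Count, for each unordered pair of terms, the segments containing both."""
    counts = Counter()
    for segment in segments:
        terms = sorted(tokenize(segment))  # sorted so each pair has one canonical order
        counts.update(combinations(terms, 2))
    return counts


def linked_terms(counts, term):
    """Return the terms that co-occur with `term`, most frequent first."""
    linked = Counter()
    for (a, b), n in counts.items():
        if a == term:
            linked[b] += n
        elif b == term:
            linked[a] += n
    return linked.most_common()


# Invented toy segments, NOT taken from the study data.
toy_segments = [
    "Cycling to work takes twenty minutes",
    "People think cycling needs fitness",
    "Cycling people wear lycra",
]

counts = cooccurrence_counts(toy_segments)
print(linked_terms(counts, "cycling"))
```

In a sketch like this, terms that frequently share a segment would be drawn closer together on a concept map. Interpreting why they co-occur still requires returning to the original transcripts.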

Results In our analysis, we interrogated how far this software could facilitate an inductive, data-driven process and provide an analytical ‘fresh lens’, with the potential to identify novel patterns and linkages that manual coding might miss. For example, one newly ‘uncovered’ theme was that women’s accounts of ‘cycling’ were closely connected to ‘people’. When we explored this in the original data, the connection related to their notions of who is a ‘cyclist’, what ‘cyclists’ look like, and aspects such as required fitness. In contrast, for men, ‘cycling’ did not connect to ‘people’ but to logistics: how to get to work and how long it takes. Such researcher input and interpretative work was a necessary analytical next step to make meaning from the software outputs.

Conclusion This study contributes new insights into the application of machine learning to qualitative social science research, which remains rare to date, and towards a social science approach to behaviour change. Developing new methods and conceptual understandings can inform future research and policy decisions about social environments for promoting social practices which increase physical activity.

Research funded by the Academy of Medical Sciences and the Wellcome Trust: Springboard – Health of the Public 2040 (HOP001\1051).

Keywords
  • qualitative data
  • text mining
  • machine learning
  • active travel
  • behaviour change
  • innovative approaches
  • synthesis
