OP82 How to identify physical-mental health multimorbidity using routine records
Regina Prigge (1), Kelly Fleetwood (1), Caroline Jackson (1), Stewart Mercer (1), John Norrie (1), Daniel Morales (2,3), Daniel Smith (4), Cathie Sudlow (5), Bruce Guthrie (1)

  1. Usher Institute, University of Edinburgh, Edinburgh, UK
  2. Division of Population Health and Genomics, University of Dundee, Dundee, UK
  3. Department of Public Health, University of Southern Denmark, Odense, Denmark
  4. Division of Psychiatry, University of Edinburgh, Edinburgh, UK
  5. British Heart Foundation Data Science Centre, Health Data Research UK, London, UK


Background There is huge variation in the measurement of multimorbidity in research, in terms of both the number of conditions and the specific conditions included in multimorbidity definitions. Furthermore, the measurement of specific conditions is inconsistent and poorly reported, while code lists frequently remain unpublished, making reproducibility and comparability of existing analyses challenging. Our aim was to define and identify 154 physical and mental health conditions in routinely collected health records.

Methods We used data from 172,596 UK Biobank participants with linkage to primary care, hospital and death records. We determined the prevalence and incidence of 154 conditions in electronic health records using information based on three diagnosis coding systems (Read version 2, Read version 3, ICD-10) and one coding system with information on operations and clinical procedures (OPCS-4). For each morbidity and coding system, we applied a two-step process of systematically identifying existing code lists and creating new code lists for conditions with no published code list.
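As a rough illustration of how a condition might be identified across several coding systems, the sketch below flags a participant as having a condition if any of their records matches a code list for that condition in the corresponding coding system. The record layout, function names and the tiny code lists are assumptions made for this example; they are not the authors' code lists or pipeline.

```python
# Hypothetical code lists keyed by (condition, coding system).
# The codes are illustrative only and far from a complete definition.
CODE_LISTS = {
    ("asthma", "read_v2"): {"H33.."},
    ("asthma", "icd10"): {"J45", "J46"},
}


def participants_with_condition(records, condition, code_lists=CODE_LISTS):
    """Return the set of participant IDs with at least one matching code.

    records: iterable of (participant_id, coding_system, code) tuples,
    an assumed layout for linked primary care, hospital and death records.
    """
    matched = set()
    for participant_id, system, code in records:
        if code in code_lists.get((condition, system), set()):
            matched.add(participant_id)
    return matched


def prevalence(records, condition, n_participants, code_lists=CODE_LISTS):
    """Proportion of participants with any record matching the condition."""
    cases = participants_with_condition(records, condition, code_lists)
    return len(cases) / n_participants
```

A participant counts once however many matching records they have, so the same person appearing in both primary care and hospital data does not inflate the estimate.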

Results We identified existing Read version 2 and ICD-10 code lists for all 154 conditions. However, given the format of Read version 2 codes provided by UK Biobank, we had to convert existing 7-digit Read version 2 code lists to 5-digit Read version 2 code lists and account for resulting inconsistencies. We additionally identified existing OPCS-4 code lists for a subset of nine conditions, for which operation and procedure codes were deemed relevant. We successfully identified existing Read version 3 code lists for 118 of the 154 conditions and developed a 7-step process to create new Read version 3 code lists for the remaining 36 conditions: 1) mapping of Read v2 codes to Read v3 codes; 2) free-text search of Read v3 code descriptions; 3) manual search of the code hierarchy for potentially relevant codes above and below the codes already identified; 4) identification of preferred terms and synonyms; 5) creation of a combined code list; 6) determining the frequency of each code in UK Biobank; 7) clinical review and selection of relevant codes.
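The mechanical parts of the 7-step process (steps 1, 2, 5 and 6) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the mapping table, the description dictionary, the function names and the placeholder codes are all hypothetical, and the hierarchy search, synonym identification and clinical review (steps 3, 4 and 7) remain manual.

```python
from collections import Counter


def map_v2_to_v3(v2_codes, mapping):
    """Step 1: map Read v2 codes to Read v3 via a lookup {v2_code: [v3_code, ...]}."""
    return {v3 for v2 in v2_codes for v3 in mapping.get(v2, [])}


def free_text_search(descriptions, terms):
    """Step 2: case-insensitive search of Read v3 descriptions {code: description}."""
    lowered = [t.lower() for t in terms]
    return {
        code
        for code, text in descriptions.items()
        if any(t in text.lower() for t in lowered)
    }


def candidate_list(v2_codes, mapping, descriptions, terms, observed_codes):
    """Steps 5-6: combine candidate codes and attach their observed frequency.

    observed_codes: iterable of Read v3 codes as they occur in the records.
    Returns (code, description, count) rows, most frequent first, for review.
    """
    combined = map_v2_to_v3(v2_codes, mapping) | free_text_search(descriptions, terms)
    counts = Counter(observed_codes)
    rows = [(code, descriptions.get(code, ""), counts[code]) for code in combined]
    return sorted(rows, key=lambda row: (-row[2], row[0]))
```

Sorting by frequency puts the codes that affect case counts most at the top of the list, which is where clinical review time is best spent.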

Conclusion Identifying existing code lists and creating new code lists is time-consuming, and new, more elaborate coding systems such as SNOMED-CT will add to this burden. To improve the reproducibility and comparability of analyses, collaborative approaches to code list development are needed in which code lists are cross-validated and newly created code lists are made publicly available. Algorithmic approaches to code list development may help, but their validity needs evaluating.

  • Clinical coding
  • multimorbidity
  • electronic health records
