Elsevier

Public Health

Volume 125, Issue 10, October 2011, Pages 688-696
Original Research
Name analysis to classify populations by ethnicity in public health: Validation of Onomap in Scotland

https://doi.org/10.1016/j.puhe.2011.05.003

Summary

Objectives

Health inequalities between ethnic minorities and the general population are persistent. Efforts to address them are hampered by the inability to classify individuals’ ethnicity accurately. ‘Onomap’, a new name-based ethnicity classification methodology, is intended to overcome this problem. This paper evaluates the diagnostic accuracy of Onomap in identifying population groups by ethnicity, and discusses applications to public health practice.

Study design

Onomap was applied to three independent reference datasets (birth registration, pupil census and register of Polish health professionals) collected in Britain and Poland at individual level (n = 260,748).

Methods

Onomap classifications were compared with the ethnicity recorded in each reference database, which was treated as the ‘gold standard’. Outcome measures included sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). Ninety-five percent confidence intervals and Chi-squared tests were used.
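As an illustration of these outcome measures, the sketch below computes sensitivity, specificity, PPV and NPV for a single ethnic group from a two-by-two table, each with a 95% confidence interval. It is a minimal Python sketch and not the authors' analysis code: the function names and the counts are hypothetical, and the normal-approximation (Wald) interval is an assumption, since the summary does not state how the intervals were calculated.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal-approximation (Wald) interval; this choice is an
    assumption, not a method confirmed by the paper.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostic_accuracy(tp, fp, fn, tn):
    """Outcome measures for one group, treating the reference dataset as truth.

    tp: in the group by reference and by name classification
    fp: assigned to the group by name, but not by reference
    fn: in the group by reference, but not assigned by name
    tn: outside the group by both
    """
    return {
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
        "ppv": proportion_ci(tp, tp + fp),
        "npv": proportion_ci(tn, tn + fn),
    }


# Hypothetical counts for illustration only; they are not taken from the study.
metrics = diagnostic_accuracy(tp=80, fp=20, fn=20, tn=880)
for name, (estimate, lower, upper) in metrics.items():
    print(f"{name}: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

For a small minority group, the large number of true negatives keeps specificity and NPV high even when sensitivity and PPV are modest, which is the pattern reported for the non-British groupings in the Results.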

Results

Onomap identified the majority of those in the British participant group with high sensitivity and PPV (>95%), and low misclassification (<5%), although specificity and NPV were lowest in this group (56–87%). Outcome measures for all other non-British groupings were high for specificity and NPV (>98%), but variable for sensitivity and PPV (17–89%). Differences in misclassification by gender were statistically significant. Using maiden name rather than married name in women improved classification outcomes for those born in the British Isles (0.53%, 95% confidence interval 0.26–0.8%; P < 0.001) but not for South Asian or Polish groups.

Conclusions

Onomap offers an effective methodology for identifying population groups in both health-related and educational datasets, categorizing populations into a variety of ethnic groups. This evaluation suggests that it can successfully assist health researchers, planners and policy makers in identifying and addressing health inequalities.

Section snippets

Background

Health inequalities exist between ethnic minorities and indigenous populations.1, 2, 3, 4, 5, 6, 7 In order to address the underlying causes of these inequalities, it is essential to systematically identify and classify individuals into population groups defined by ethnicity. To the authors’ knowledge, there are currently limited means by which to identify such groups between decennial censuses. Information on ethnicity is usually collected at the level of the individual, typically being

Methods

A diagnostic accuracy study was carried out to evaluate the Onomap-assigned classification of a person’s cultural, ethnic or linguistic origins against three independent reference datasets that contained information on migration origin or ethnicity collected in Scotland and Poland:

  • birth registration data collected between 2004 and 2008 in the region of Lothian, Scotland;

  • the City of Edinburgh Council pupil census data collected between 2005 and 2008;

  • a registration database of Polish healthcare

Birth registration database

Onomap was unable to classify 307 records (0.4%). The most common reason for this was names not present in the dictionary (n = 231, 75%). Fathers’ names were more likely to be classified by Onomap than mothers’ names (unclassified fathers’ names: n = 142, 0.4%; unclassified mothers’ names: n = 185, 0.5%). This difference was statistically significant (0.09%, 95% CI 0.08–0.1%; P < 0.05).
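A comparison of this kind, a difference between two proportions with a 95% confidence interval, can be sketched as follows. This is an illustrative Python example under stated assumptions: the function name is hypothetical, the unpooled normal-approximation interval is an assumed method, and the counts are invented because the snippet does not report the denominators.

```python
import math


def difference_in_proportions(x1, n1, x2, n2, z=1.96):
    """Return (difference, lower, upper) for p1 - p2.

    Uses an unpooled normal-approximation interval; this is an assumed
    method, not one confirmed by the paper.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts for illustration only; they are not taken from the study.
diff, lower, upper = difference_in_proportions(x1=50, n1=1000, x2=30, n2=1000)
print(f"difference: {diff:.4f} (95% CI {lower:.4f} to {upper:.4f})")
```

An interval that excludes zero corresponds to a difference that is statistically significant at the 5% level under this approximation.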

Parents were predominantly classified in the British group (86.8%), with South Asians and

Discussion

Onomap Version 2 is a quick, effective and user-friendly tool for identifying population groups in health-related and educational datasets, categorizing populations into a variety of Onomap ethnic groups and languages of origin.

There are some limitations intrinsic to the nature of the three independent datasets evaluated. Within the birth registration dataset, country of birth of the parents is not obviously representative of a person’s ethnicity, especially amongst established migrant

Application to public health practice

Like other name-based methodologies, Onomap may be most useful when looking at population health rather than individuals’ health. Within Lothian, Onomap has been used to monitor changes since the 2001 Census. The influx of 25,000 Poles since 2004 has been identified, making Poles the biggest ethnic minority within Lothian, and explaining some new demands placed on health and council services. Onomap has identified Poles registered with primary care, allowing mapping of the population to

Conclusion

There is a policy drive to improve ethnicity recording in routinely collected health datasets.3, 11, 12, 13 Nevertheless, improvements are unlikely to extend to all disease registers and administrative datasets in the near future, and the need for alternative means to identify population groups by ethnicity remains. Besides contemporary datasets, past health registers remain largely unanalysed by ethnicity because of their lack of classification and the impracticality of contacting individuals to

Ethical approval

Use of the data was approved by NHS Lothian Caldicott Guardian.

Funding

None declared.

Competing interests

Pablo Mateos holds the copyright of the Onomap classification and may receive royalties for its licensing in the future.

Acknowledgements

The authors would like to thank Caroll Brown (NHS Lothian) for her assistance in executing Onomap, Sanjeev Paul (City of Edinburgh Council) for his assistance with the pupil census dataset, Mette Tranter (NHS Lothian) for her assistance with maps, Evropi Theodoratou (University of Edinburgh) for statistical advice, and the Health Promotion Foundation, Warsaw, Poland for provision of the Polish dataset.

References (36)

  • P.J. Aspinall

    The future of ethnicity classifications

    J Ethn Migr Stud

    (2009)
  • R. Bhopal

    Ethnicity, race and health in multicultural societies

    (2007)
  • P. Kumarapeli et al.

    Ethnicity recording in general practice computer systems

    J Public Health

    (2006)
  • O. Sangowawa et al.

    Can we implement ethnic monitoring in primary health care and use the data?

    J Public Health Med

    (2000)
  • D. Lauderdale et al.

    Asian American ethnic identification by surname

    Popul Res Policy Rev

    (2000)
  • P. Mateos

    An ontology of ethnicity based upon personal names. Implications for neighbourhood profiling

    (2007)
  • P. Mateos

    A review of name-based ethnicity classification methods and their potential in population studies

    Popul Space Place

    (2007)
  • P. Mateos et al.

    The cultural, ethnic and linguistic classification of populations and neighbourhoods using personal names. CASA working paper 116

    (2007)