A glossary for big data in population and public health: discussion and commentary on terminology and research methods
  1. Daniel Fuller1,
  2. Richard Buote2,
  3. Kevin Stanley3
  1. 1 School of Human Kinetics and Recreation, Memorial University of Newfoundland, St. John’s, Canada
  2. 2 Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, Canada
  3. 3 Department of Computer Science, College of Arts and Science, University of Saskatchewan, Saskatoon, Canada
  1. Correspondence to Dr Daniel Fuller, School of Human Kinetics and Recreation, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador A1C 5S7, Canada; dfuller{at}


The volume and velocity of data are growing rapidly and big data analytics are being applied to these data in many fields. Population and public health researchers may be unfamiliar with the terminology and statistical methods used in big data. This creates a barrier to the application of big data analytics. The purpose of this glossary is to define terms used in big data and big data analytics and to contextualise these terms. We define the five Vs of big data and provide definitions and distinctions for data mining, machine learning and deep learning, among other terms. We provide key distinctions between big data and statistical analysis methods applied to big data. We contextualise the glossary by providing examples where big data analysis methods have been applied to population and public health research problems and provide brief guidance on how to learn big data analysis methods.

  • public health
  • research methods
  • methodology



  • Contributors: DF and KS were responsible for the conceptualisation of this manuscript. DF and RB contributed to the definitions and references for the glossary. All three authors contributed to the writing and editing of this manuscript. All authors have read and approved the final submitted version of this manuscript.

  • Funding: Funding for this paper was provided by Dr. Fuller’s Canada Research Chair.

  • Competing interests: None declared.

  • Provenance and peer review: Not commissioned; externally peer reviewed.