OP87 Dagitty and directed acyclic graphs in observational research: a critical review
  1. PWG Tennant (1,2)
  2. J Textor (3)
  3. MS Gilthorpe (1,4)
  4. GTH Ellison (1,4)

  1. Leeds Institute for Data Analytics, University of Leeds, Leeds, UK
  2. School of Healthcare, University of Leeds, Leeds, UK
  3. Department of Tumour Immunology, Radboud University Medical Centre, Nijmegen, The Netherlands
  4. Leeds Institute for Cardiovascular and Metabolic Medicine, University of Leeds, Leeds, UK


Background Empirical researchers working with observational data have been slow to adopt modern statistical methods for causal inference, which remain poorly recognised among applied quantitative researchers. First introduced in 2010, DAGitty is a free web application (and R package) that enables empirical researchers to draw directed acyclic graphs (DAGs) and identify minimally-sufficient adjustment sets without explicit knowledge of graphical model theory. This review examines empirical research articles that have used DAGitty as an aid for analysing observational data.

Methods Articles citing ‘DAGitty’ published before 1 July 2016 were identified by searching Web of Science, Medline, Scopus, PubMed, and Google Scholar. Original articles describing the analysis of observational data were identified by inspecting the published manuscripts. Information on the use and presentation of DAGs and adjustment sets was extracted into a standardised table. Bibliographic details (including journal discipline) were obtained from Thomson Reuters’ Journal Citation Reports.

Results 124 original articles describing the analysis of observational data were identified from 151 unique articles citing DAGitty. Two (2%) were published in 2012, seven (6%) in 2013, 23 (19%) in 2014, 46 (37%) in 2015, and 46 (37%) in the first half of 2016. The first authors came from 18 countries, most commonly the USA (n=36, 29%), Germany (n=19, 15%), Australia (n=14, 11%), Sweden (n=12, 10%), the UK (n=10, 8%), and Denmark (n=6, 5%). The host journals represented 43 academic disciplines, most commonly ‘public, environmental, and occupational health’ (n=29, 23%), ‘environmental studies’ (n=13, 10%), ‘multidisciplinary sciences’ (n=11, 9%), ‘oncology’ (n=10, 8%), ‘nutrition and dietetics’ (n=9, 7%), and ‘immunology’ (n=8, 6%).

29 (23%) articles included a DAG in the manuscript, 41 (33%) in supplementary material, while 53 (44%) contained no DAG. DAGs varied greatly in scope, from three-variable overviews to graphs with 30+ variables. Very few DAGs were saturated, whether completely or in order of transit. At the extreme, some researchers omitted all arcs except those that were explicitly evidenced. Adjustment sets were often modified beyond the minimally-sufficient set(s) by adding competing exposures (to ‘improve precision’), mediators (to ‘improve face validity’), and interaction terms; or by removing variables using stepwise (p-value) methods or ‘minimum change’ criteria.
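To illustrate why adding a mediator to a minimally-sufficient adjustment set matters, the following is a minimal sketch in Python. It is not DAGitty's algorithm: it assumes Pearl's back-door criterion, tested by brute force over candidate sets, and uses a hypothetical four-variable DAG (exposure X, outcome Y, confounder C, mediator M). All function and variable names are illustrative.

```python
from itertools import combinations

# Hypothetical toy DAG: each edge is a (parent, child) pair.
EDGES = [
    ("C", "X"), ("C", "Y"),  # C confounds X and Y
    ("X", "M"), ("M", "Y"),  # M mediates the effect of X on Y
]

def parents(edges, node):
    return {a for a, b in edges if b == node}

def ancestors(edges, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents(edges, stack.pop()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def descendants(edges, node):
    """Return all strict descendants of node."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for a, b in edges:
            if a == cur and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def d_separated(edges, x, y, z):
    """Test d-separation of x and y given z via the moralised ancestral graph."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    # Moralise: drop edge directions and link every pair of co-parents.
    links = {frozenset(e) for e in sub}
    for node in keep:
        for p, q in combinations(sorted(parents(sub, node)), 2):
            links.add(frozenset((p, q)))
    # Search from x without passing through z; separated if y is unreachable.
    seen, stack = {x}, [x]
    while stack:
        cur = stack.pop()
        for link in links:
            if cur in link:
                (other,) = link - {cur}
                if other not in z and other not in seen:
                    seen.add(other)
                    stack.append(other)
    return y not in seen

def satisfies_backdoor(edges, x, y, z):
    """Back-door criterion: z has no descendant of x and blocks all back-door paths."""
    if set(z) & descendants(edges, x):
        return False
    backdoor_graph = [(a, b) for a, b in edges if a != x]  # remove arcs out of x
    return d_separated(backdoor_graph, x, y, set(z))

def minimal_adjustment_sets(edges, x, y):
    """Enumerate valid sets by increasing size, skipping supersets of valid sets."""
    nodes = sorted({n for e in edges for n in e} - {x, y})
    found = []
    for size in range(len(nodes) + 1):
        for cand in combinations(nodes, size):
            if any(set(f) <= set(cand) for f in found):
                continue
            if satisfies_backdoor(edges, x, y, cand):
                found.append(cand)
    return found

print(minimal_adjustment_sets(EDGES, "X", "Y"))
print(satisfies_backdoor(EDGES, "X", "Y", ("C", "M")))
```

In this toy graph, adjusting for C alone satisfies the back-door criterion, whereas the set containing both C and the mediator M does not, because M is a descendant of the exposure. The brute-force search is only practical for small graphs.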

Conclusion Use of DAGitty in empirical research is increasing exponentially. There is, however, huge variation in practice, with many researchers choosing to blend DAG-based methods with more traditional or accepted approaches to model specification. Guidelines for ‘best practice’ should be developed and included in teaching material and/or journal guidelines.

  • Methods
  • Causal Inference
  • Graphical Model Theory
  • Directed Acyclic Graphs
