Abstract
Background Empirical researchers working with observational data have been slow to adopt modern statistical methods for causal inference, which remain poorly recognised among applied quantitative researchers. First introduced in 2010, DAGitty is a free web application (and R package) that enables empirical researchers to draw directed acyclic graphs (DAGs) and identify minimally-sufficient adjustment sets without explicit knowledge of graphical model theory. This review examines empirical research articles that have used DAGitty as an aid for analysing observational data.
Methods Articles citing ‘DAGitty’ published before 1 July 2016 were identified by searching Web of Science, Medline, Scopus, PubMed, and Google Scholar. Original articles describing the analysis of observational data were identified by inspecting the published manuscripts. Information on the use and presentation of DAGs and adjustment sets was extracted into a standardised table. Bibliographic details (including journal discipline) were obtained from Thomson Reuters’ Journal Citation Reports.
Results 124 original articles describing the analysis of observational data were identified from 151 unique articles citing DAGitty. Two (2%) were published in 2012, seven (6%) in 2013, 23 (19%) in 2014, 46 (37%) in 2015, and 46 (37%) in the first half of 2016. The first authors came from 18 countries, most commonly the USA (n=36, 29%), Germany (n=19, 15%), Australia (n=14, 11%), Sweden (n=12, 10%), the UK (n=10, 8%), and Denmark (n=6, 5%). The host journals represented 43 academic disciplines, most commonly ‘public, environmental, and occupational health’ (n=29, 23%), ‘environmental studies’ (n=13, 10%), ‘multidisciplinary sciences’ (n=11, 9%), ‘oncology’ (n=10, 8%), ‘nutrition and dietetics’ (n=9, 7%), and ‘immunology’ (n=8, 6%).
29 (23%) articles included a DAG in the manuscript, 41 (33%) in supplementary material, while 53 (44%) contained no DAG. DAGs varied greatly in scope, from three-variable overviews to graphs with 30+ variables. Very few DAGs were saturated, whether completely or in order of transit. At the extreme, some researchers omitted all arcs except those that were explicitly evidenced. Adjustment sets were often modified beyond the minimally-sufficient set(s) by adding competing exposures (to ‘improve precision’), mediators (to ‘improve face validity’), and interaction terms; or by removing variables using stepwise (p-value) methods or ‘minimum change’ criteria.
Conclusion Use of DAGitty in empirical research is increasing exponentially. There is, however, huge variation in practice, with many researchers choosing to blend DAG-based methods with more traditional/accepted approaches to model specification. Guidelines for ‘best practice’ should be developed and included in teaching material and/or journal guidelines.
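Illustration For readers unfamiliar with the term, the following minimal sketch shows what a minimally-sufficient adjustment set is. It is not DAGitty's own algorithm: it applies the simpler back-door criterion by brute force to a small hypothetical DAG, and the variable names (`age`, `diet`, `mediator`) and function names are assumptions made for this example only.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def descendants(edges, node):
    """Return every node reachable from `node` along directed arcs."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def d_separated(edges, x, y, z):
    """True if the set z d-separates x and y (moralised ancestral graph test)."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    keep = ancestors(parents, {x, y} | set(z))
    # Undirected moral graph: link each child to its parents and "marry" co-parents.
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # x and y are d-separated if every path between them passes through z.
    seen, stack = set(), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n in seen or n in z:
            continue
        seen.add(n)
        stack.extend(nbrs[n])
    return True

def minimal_backdoor_sets(edges, exposure, outcome):
    """Enumerate minimal sets satisfying the back-door criterion (small DAGs only)."""
    nodes = {n for e in edges for n in e}
    # Descendants of the exposure (including mediators) may not be adjusted for.
    candidates = sorted(nodes - descendants(edges, exposure) - {exposure, outcome})
    # Remove arcs leaving the exposure so that only back-door paths remain.
    backdoor_graph = [(a, b) for a, b in edges if a != exposure]
    found = []
    for k in range(len(candidates) + 1):
        for z in combinations(candidates, k):
            zs = set(z)
            if any(f <= zs for f in found):
                continue  # a smaller valid set already exists, so zs is not minimal
            if d_separated(backdoor_graph, exposure, outcome, zs):
                found.append(zs)
    return found

# Hypothetical DAG: a confounder (age), a mediator, and a competing exposure (diet).
example_dag = [
    ("age", "exposure"), ("age", "outcome"),
    ("exposure", "mediator"), ("mediator", "outcome"),
    ("exposure", "outcome"), ("diet", "outcome"),
]
print(minimal_backdoor_sets(example_dag, "exposure", "outcome"))  # [{'age'}]
```

In this hypothetical graph the only minimal set is the confounder `age`. The mediator and the competing exposure `diet` are the kinds of variable that the reviewed articles often added beyond the minimally-sufficient set.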