Background Machine learning algorithms are increasingly used to inform decision-making in healthcare and health policy. Data-driven algorithms are commonly thought to offer more consistent and fairer decision-making than human judgement alone. However, because such algorithms have no understanding of the origins and meaning of a dataset (or of the variables therein), they are prone to reinforcing unfair and unjust decision-making.
To avoid such problems, there is increasing interest in making algorithms more transparent, explainable, and interpretable. Unfortunately, these terms are frequently used interchangeably, with little awareness of their specific meanings and limitations.
Methods We propose the following definitions of transparency, explainability, and interpretability to aid with the classification and scrutiny of algorithms.
Transparency (what is in the model?) is achieved when there is sufficient information about how the model was developed, what variables it includes, for what reason, and in what context(s) it has been trained, tested, and deployed.
Explainability (how does the model work?) is achieved when human beings can provide technical explanations about how the algorithm behaves and computes a specific output or probability. An algorithm must be transparent to be explainable. Explainability may be considered as a limited form of understanding that is ignorant of the external context and meaning of each variable.
Interpretability (why does the model work this way?) is achieved when it is possible to discern why the algorithm has determined a certain output. It requires transparency, explainability, and knowledge of the context, as well as the meaning of, and relationship among, all relevant variables. Interpretability may be considered as a richer form of understanding that is aware of the external context and meaning of each variable.
Discussion The ongoing conflation of transparency, explainability, and interpretability is a serious problem in the design, deployment, and scrutiny of algorithms. In health and social science, for example, it is common for transparent and explainable prediction models to be misinterpreted as having some external meaning. Algorithmic fairness and justice can ultimately only be achieved with full interpretability, which in turn requires an external understanding of the context (i.e. causal knowledge).
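The distinction can be illustrated with a minimal sketch, which is not taken from the authors' work. It assumes a hypothetical toy dataset in which a background factor `z` drives both a recorded variable `x` and the outcome `y`, while `x` itself has no effect on `y`. All names and values are invented for illustration. The fitted model is transparent (its single variable is known) and explainable (its output is a slope times `x`), yet its coefficient carries no external meaning unless the context is known.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var = sum((x - x_bar) ** 2 for x in xs)
    return cov / var


# Hypothetical toy data (an assumption for illustration only):
# z is a background factor, x = 2*z + a and y = 3*z + b, where a and b
# vary independently of each other. x therefore has no effect on y.
rows = [
    (z, 2 * z + a, 3 * z + b)
    for z in (0, 1)
    for a in (0, 1)
    for b in (0, 1)
]

# Transparent and explainable: one variable, one slope, fully traceable.
marginal_slope = ols_slope(
    [x for _, x, _ in rows],
    [y for _, _, y in rows],
)

# Using contextual knowledge: the same slope estimated within levels of z.
stratum_slopes = {
    z: ols_slope(
        [x for zz, x, _ in rows if zz == z],
        [y for zz, _, y in rows if zz == z],
    )
    for z in (0, 1)
}

print("slope of y on x, ignoring z:", marginal_slope)
print("slope of y on x within each level of z:", stratum_slopes)
```

In this toy example the slope of `y` on `x` is 1.2 when `z` is ignored and 0 within each level of `z`. Every step of the computation can be explained, but reading the 1.2 as "what `x` does to `y`" would be a misinterpretation; only knowledge of how the variables relate to one another reveals this.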
Conclusion We offer three distinct definitions of algorithmic transparency, explainability, and interpretability to aid with the classification and scrutiny of algorithms.