Explainable AI as evidence of fair decisions

This paper will propose that explanations are valuable to those impacted by a model's decisions (model patients) to the extent that they provide evidence that a past adverse decision was unfair. Under this proposal, we should favor models and explainability methods which generate counterfactual...

Bibliographic Details
Main Author: Derek Leben
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-02-01
Series: Frontiers in Psychology
Subjects: xAI; explainability; fairness; discrimination; counterfactual explanations
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1069426/full
author Derek Leben
collection DOAJ
description This paper will propose that explanations are valuable to those impacted by a model's decisions (model patients) to the extent that they provide evidence that a past adverse decision was unfair. Under this proposal, we should favor models and explainability methods which generate counterfactuals of two types. The first type of counterfactual is positive evidence of fairness: a set of states under the control of the patient which (if changed) would have led to a beneficial decision. The second type of counterfactual is negative evidence of fairness: a set of irrelevant group or behavioral attributes which (if changed) would not have led to a beneficial decision. Each of these counterfactual statements is related to fairness, under the Liberal Egalitarian idea that treating one person differently than another is justified only on the basis of features which were plausibly under each person's control. Other aspects of an explanation, such as feature importance and actionable recourse, are not essential under this view, and need not be a goal of explainable AI.
first_indexed 2024-04-10T15:19:14Z
format Article
id doaj.art-44c67cb365ad454da47c57c453bddf71
institution Directory Open Access Journal
issn 1664-1078
language English
last_indexed 2024-04-10T15:19:14Z
publishDate 2023-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Psychology
spelling doaj.art-44c67cb365ad454da47c57c453bddf71; 2023-02-14T16:21:44Z; eng; Frontiers Media S.A.; Frontiers in Psychology; 1664-1078; 2023-02-01; 14; 10.3389/fpsyg.2023.1069426; 1069426; Explainable AI as evidence of fair decisions; Derek Leben; https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1069426/full; xAI; explainability; fairness; discrimination; counterfactual explanations
title Explainable AI as evidence of fair decisions
topic xAI
explainability
fairness
discrimination
counterfactual explanations
url https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1069426/full