Objective criteria for explanations of machine learning models

Abstract: Objective criteria to evaluate the performance of machine learning (ML) model explanations are a critical ingredient in bringing greater rigor to the field of explainable artificial intelligence. In this article, we survey three of our proposed criteria, each of which targets a different class of explanations. The first, targeted at real‐valued feature importance explanations, defines a class of "infidelity" measures that capture how well an explanation matches the ML model; we show that explanations minimizing such infidelity measures correspond to many popular recently proposed explanations and, moreover, satisfy well‐known game‐theoretic axiomatic properties. The second, targeted at feature set explanations, defines a robustness‐analysis‐based criterion and shows that deriving explainable feature sets from this criterion yields qualitatively more impressive explanations. Lastly, for sample explanations, we provide a decomposition‐based criterion that yields highly scalable and compelling classes of sample‐based explanations.
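
For readers who want a concrete picture of the first criterion, the sketch below shows how an infidelity-style score could be estimated: the explanation's predicted effect of a random perturbation is compared against the change the model actually exhibits, and the squared discrepancy is averaged. The function names (model, explanation, sample_perturbation), the squared-error form, and the Monte Carlo setup are illustrative assumptions for this sketch, not the article's exact definition.

    import numpy as np

    def infidelity_estimate(model, explanation, x, sample_perturbation, n_samples=1000):
        """Monte Carlo estimate of an infidelity-style score (illustrative sketch).

        model:               callable mapping an input vector to a scalar prediction
        explanation:         feature-importance vector for input x (same shape as x)
        sample_perturbation: callable returning a random perturbation vector for x
        """
        errors = []
        for _ in range(n_samples):
            I = sample_perturbation(x)                 # random perturbation of the input
            predicted_change = np.dot(I, explanation)  # change the explanation predicts
            actual_change = model(x) - model(x - I)    # change the model actually shows
            errors.append((predicted_change - actual_change) ** 2)
        return float(np.mean(errors))                  # lower score = more faithful explanation

Under this reading, an explanation with low infidelity is one whose importance weights, taken as a local linear summary, track the model's behavior under perturbation.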

Bibliographic Details
Main Authors: Chih‐Kuan Yeh, Pradeep Ravikumar (Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA)
Format: Article
Language: English
Published: Wiley, 2021-12-01
Series: Applied AI Letters
ISSN: 2689-5595
Subjects: explainable AI; feature importance; important feature sets; objective evaluation criteria; sample explanations
Online Access: https://doi.org/10.1002/ail2.57