Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products

Although algorithmic auditing has emerged as a key strategy to expose systematic biases embedded in software platforms, we struggle to understand the real-world impact of these audits, as scholarship on the impact of algorithmic audits on increasing algorithmic fairness and transparency in commercial systems is nascent.


Bibliographic Details
Main Authors: Buolamwini, Joy, Raji, Inioluwa Deborah
Other Authors: Massachusetts Institute of Technology. Center for Civic Media
Format: Article
Language: en_US
Published: Conference on Artificial Intelligence, Ethics, and Society 2020
Subjects: algorithm, audit, ethics, intersectionality, AI, bias, race, gender
Online Access: https://hdl.handle.net/1721.1/123456
_version_ 1824458135400611840
author Buolamwini, Joy
Raji, Inioluwa Deborah
author2 Massachusetts Institute of Technology. Center for Civic Media
author_facet Massachusetts Institute of Technology. Center for Civic Media
Buolamwini, Joy
Raji, Inioluwa Deborah
author_sort Buolamwini, Joy
collection MIT
description Although algorithmic auditing has emerged as a key strategy to expose systematic biases embedded in software platforms, we struggle to understand the real-world impact of these audits, as scholarship on the impact of algorithmic audits on increasing algorithmic fairness and transparency in commercial systems is nascent. To analyze the impact of publicly naming and disclosing performance results of biased AI systems, we investigate the commercial impact of Gender Shades, the first algorithmic audit of gender and skin type performance disparities in commercial facial analysis models. This paper 1) outlines the audit design and structured disclosure procedure used in the Gender Shades study, 2) presents new performance metrics from targeted companies IBM, Microsoft and Megvii (Face++) on the Pilot Parliaments Benchmark (PPB) as of August 2018, 3) provides performance results on PPB by non-target companies Amazon and Kairos, and 4) explores differences in company responses as shared through corporate communications that contextualize differences in performance on PPB. Within 7 months of the original audit, we find that all three targets released new API versions. All targets reduced accuracy disparities between males and females and darker and lighter-skinned subgroups, with the most significant update occurring for the darker-skinned female subgroup, which underwent a 17.7% to 30.4% reduction in error between audit periods. Minimizing these disparities led to a 5.72% to 8.3% reduction in overall error on the Pilot Parliaments Benchmark (PPB) for target corporation APIs. The overall performance of non-targets Amazon and Kairos lags significantly behind that of the targets, with error rates of 8.66% and 6.60% overall, and error rates of 31.37% and 22.50% for the darker female subgroup, respectively.
first_indexed 2024-09-23T12:04:49Z
format Article
id mit-1721.1/123456
institution Massachusetts Institute of Technology
language en_US
last_indexed 2025-02-19T04:21:05Z
publishDate 2020
publisher Conference on Artificial Intelligence, Ethics, and Society
record_format dspace
spelling mit-1721.1/123456 2025-02-06T18:44:51Z Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products Buolamwini, Joy Raji, Inioluwa Deborah Massachusetts Institute of Technology. Center for Civic Media algorithm, audit, ethics, intersectionality, AI, bias, race, gender Although algorithmic auditing has emerged as a key strategy to expose systematic biases embedded in software platforms, we struggle to understand the real-world impact of these audits, as scholarship on the impact of algorithmic audits on increasing algorithmic fairness and transparency in commercial systems is nascent. To analyze the impact of publicly naming and disclosing performance results of biased AI systems, we investigate the commercial impact of Gender Shades, the first algorithmic audit of gender and skin type performance disparities in commercial facial analysis models. This paper 1) outlines the audit design and structured disclosure procedure used in the Gender Shades study, 2) presents new performance metrics from targeted companies IBM, Microsoft and Megvii (Face++) on the Pilot Parliaments Benchmark (PPB) as of August 2018, 3) provides performance results on PPB by non-target companies Amazon and Kairos, and 4) explores differences in company responses as shared through corporate communications that contextualize differences in performance on PPB. Within 7 months of the original audit, we find that all three targets released new API versions. All targets reduced accuracy disparities between males and females and darker and lighter-skinned subgroups, with the most significant update occurring for the darker-skinned female subgroup, which underwent a 17.7% to 30.4% reduction in error between audit periods. Minimizing these disparities led to a 5.72% to 8.3% reduction in overall error on the Pilot Parliaments Benchmark (PPB) for target corporation APIs. The overall performance of non-targets Amazon and Kairos lags significantly behind that of the targets, with error rates of 8.66% and 6.60% overall, and error rates of 31.37% and 22.50% for the darker female subgroup, respectively. 2020-01-16T16:54:24Z 2020-01-16T16:54:24Z 2019 Article https://hdl.handle.net/1721.1/123456 Raji, I. & Buolamwini, J. (2019). Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products. Conference on Artificial Intelligence, Ethics, and Society. en_US application/pdf Conference on Artificial Intelligence, Ethics, and Society
spellingShingle algorithm, audit, ethics, intersectionality, AI, bias, race, gender
Buolamwini, Joy
Raji, Inioluwa Deborah
Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title_full Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title_fullStr Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title_full_unstemmed Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title_short Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
title_sort actionable auditing investigating the impact of publicly naming biased performance results of commercial ai products
topic algorithm, audit, ethics, intersectionality, AI, bias, race, gender
url https://hdl.handle.net/1721.1/123456
work_keys_str_mv AT buolamwinijoy actionableauditinginvestigatingtheimpactofpubliclynamingbiasedperformanceresultsofcommercialaiproducts
AT rajiinioluwadeborah actionableauditinginvestigatingtheimpactofpubliclynamingbiasedperformanceresultsofcommercialaiproducts