Context and Participation in Machine Learning

Bibliographic Details
Main Author: Suresh, Harini
Other Authors: Guttag, John V.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/150314
Description
Summary: ML systems are shaped by human choices and norms, from problem conceptualization to deployment. They are then used in complex socio-technical contexts, where they interact with and affect diverse populations. However, development decisions are often made in isolation, without deeply taking into account the deployment context in which the system will be used. Moreover, these decisions are typically hidden from users in that context, who have few avenues to understand whether or how they should use the system. As a result, there are numerous examples of ML systems that in practice are harmful, poorly understood, or misused.

We propose an alternative approach to the development and deployment of ML systems, one focused on incorporating the participation of the people who use and are affected by a system. We first develop two frameworks that lend clarity to the human choices that shape ML systems and the broad populations that these systems affect. These inform a prospective question: how can we shape new systems from the start to reflect context-specific needs and to advance justice and equity? We address this question through an in-depth case study of co-designing ML tools to support activists who monitor gender-related violence. Drawing on intersectional feminist theory and participatory design, we develop methods for data collection, annotation, modeling, and evaluation that prioritize sustainable partnerships and challenge power inequalities.

We then consider an alternative paradigm in which we do not have full control over the development lifecycle, e.g., where a model has already been built and made available. In these cases, we show how deployment tools can give downstream stakeholders the information and agency to understand ML systems and hold them accountable. We describe the design of two novel deployment tools that provide intuitive, useful, and context-relevant insight into model strengths and limitations. The first uses example-based visualizations and an interactive input editor to help users assess the reliability of individual model predictions. The second, Kaleidoscope, enables context-specific evaluation, allowing downstream users to translate their implicit knowledge of "good model behavior" for their context into explicitly defined, semantically meaningful tests.

This dissertation demonstrates several ways that context-specific considerations and meaningful participation can shape the development and use of ML systems. We hope that this is a step towards the broader goal of building ML-based systems that are grounded in societal context, are shaped by diverse viewpoints, and contribute to justice and equity.