Learning with confident examples: Rank pruning for robust classification with noisy labels

P̃Ñ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate ρ1 for positive examples and ρ0 for negative examples. We propose Rank Pruning (RP) to solve P̃Ñ learning and the open problem of estimating the noise rates. Unlike prior...

Full description

Bibliographic Details
Main Authors: Chuang, Isaac L., Wu, Tailin, Northcutt, Curtis G.
Other Authors: Massachusetts Institute of Technology. Department of Physics
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/1721.1/137802
_version_ 1826214248521400320
author Chuang, Isaac L.
Wu, Tailin
Northcutt, Curtis G.
author2 Massachusetts Institute of Technology. Department of Physics
author_facet Massachusetts Institute of Technology. Department of Physics
Chuang, Isaac L.
Wu, Tailin
Northcutt, Curtis G.
author_sort Chuang, Isaac L.
collection MIT
description P̃Ñ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate ρ1 for positive examples and ρ0 for negative examples. We propose Rank Pruning (RP) to solve P̃Ñ learning and the open problem of estimating the noise rates. Unlike prior solutions, RP is efficient and general, requiring O(T) for any unrestricted choice of probabilistic classifier with T fitting time. We prove RP achieves consistent noise estimation and equivalent expected risk as learning with uncorrupted labels in ideal conditions, and derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation and F1, error, and AUC-PR for both MNIST and CIFAR datasets, regardless of the amount of noise. To highlight, RP with a CNN classifier can predict if an MNIST digit is a one or not with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples.
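The abstract's pipeline (fit any probabilistic classifier on the noisy labels, use confident examples to estimate how much of each observed class is mislabeled, prune the least confident examples, and refit) can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the threshold and noise-fraction estimators below are simplified stand-ins for the paper's consistent estimators, and the classifier, function name, and synthetic setup are all assumptions for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def rank_prune_fit(X, s, clf=None):
    """Simplified Rank-Pruning-style sketch. `s` holds the observed
    (possibly flipped) binary labels; returns a classifier refit on
    the pruned, more-confident subset."""
    clf = clf or LogisticRegression(max_iter=1000)
    # Out-of-sample probability of the positive class for every example.
    g = cross_val_predict(clf, X, s, cv=3, method="predict_proba")[:, 1]
    # Confident thresholds: mean predicted score within each noisy class.
    lb = g[s == 1].mean()  # scores above this look confidently positive
    ub = g[s == 0].mean()  # scores below this look confidently negative
    # Rough estimates of the mislabeled fraction inside each observed
    # class (stand-ins for the paper's noise-rate estimators): examples
    # whose score falls on the wrong side of the other class's threshold.
    frac_bad_pos = np.mean(g[s == 1] <= ub)
    frac_bad_neg = np.mean(g[s == 0] >= lb)
    # Prune that fraction of least-confident examples from each class.
    pos, neg = np.where(s == 1)[0], np.where(s == 0)[0]
    k1 = int(frac_bad_pos * len(pos))
    k0 = int(frac_bad_neg * len(neg))
    keep_pos = pos[np.argsort(g[pos])[k1:]]   # drop lowest-scored positives
    keep_neg = neg[np.argsort(-g[neg])[k0:]]  # drop highest-scored negatives
    keep = np.concatenate([keep_pos, keep_neg])
    return clf.fit(X[keep], s[keep])
```

Because pruning happens on ranked predicted probabilities rather than on a re-weighted loss, any classifier exposing `predict_proba` can be dropped in, which is the "any unrestricted choice of probabilistic classifier" generality the abstract claims.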
first_indexed 2024-09-23T16:02:29Z
format Article
id mit-1721.1/137802
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T16:02:29Z
publishDate 2021
record_format dspace
spelling mit-1721.1/1378022023-08-11T17:20:46Z Learning with confident examples: Rank pruning for robust classification with noisy labels Chuang, Isaac L. Wu, Tailin Northcutt, Curtis G. Massachusetts Institute of Technology. Department of Physics P̃Ñ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate ρ1 for positive examples and ρ0 for negative examples. We propose Rank Pruning (RP) to solve P̃Ñ learning and the open problem of estimating the noise rates. Unlike prior solutions, RP is efficient and general, requiring O(T) for any unrestricted choice of probabilistic classifier with T fitting time. We prove RP achieves consistent noise estimation and equivalent expected risk as learning with uncorrupted labels in ideal conditions, and derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation and F1, error, and AUC-PR for both MNIST and CIFAR datasets, regardless of the amount of noise. To highlight, RP with a CNN classifier can predict if an MNIST digit is a one or not with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples. 2021-11-08T19:46:52Z 2021-11-08T19:46:52Z 2017 2019-06-17T16:16:40Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137802 Chuang, Isaac L., Wu, Tailin and Northcutt, Curtis G. 2017. "Learning with confident examples: Rank pruning for robust classification with noisy labels." en http://auai.org/uai2017/proceedings/papers/35.pdf Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf arXiv
spellingShingle Chuang, Isaac L.
Wu, Tailin
Northcutt, Curtis G.
Learning with confident examples: Rank pruning for robust classification with noisy labels
title Learning with confident examples: Rank pruning for robust classification with noisy labels
title_full Learning with confident examples: Rank pruning for robust classification with noisy labels
title_fullStr Learning with confident examples: Rank pruning for robust classification with noisy labels
title_full_unstemmed Learning with confident examples: Rank pruning for robust classification with noisy labels
title_short Learning with confident examples: Rank pruning for robust classification with noisy labels
title_sort learning with confident examples rank pruning for robust classification with noisy labels
url https://hdl.handle.net/1721.1/137802
work_keys_str_mv AT chuangisaacl learningwithconfidentexamplesrankpruningforrobustclassificationwithnoisylabels
AT wutailin learningwithconfidentexamplesrankpruningforrobustclassificationwithnoisylabels
AT northcuttcurtisg learningwithconfidentexamplesrankpruningforrobustclassificationwithnoisylabels