Do We Train on Test Data? Purging CIFAR of Near-Duplicates

The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked datasets in computer vision and are often used to evaluate novel methods and model architectures in the field of deep learning. However, we find that 3.3% and 10% of the images from the test sets of these datasets have dupli...

Bibliographic Details
Main Authors: Björn Barz, Joachim Denzler
Format: Article
Language: English
Published: MDPI AG 2020-06-01
Series: Journal of Imaging
Online Access: https://www.mdpi.com/2313-433X/6/6/41