Data augmentation: A comprehensive survey of modern approaches

To ensure good performance, modern machine learning models typically require large amounts of high-quality annotated data. However, data collection and annotation are usually performed manually and consume considerable time and resources. The quality and representativeness of curated data fo...

Bibliographic Details
Main Authors: Alhassan Mumuni, Fuseini Mumuni
Format: Article
Language: English
Published: Elsevier 2022-12-01
Series: Array
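Subjects: Review of data augmentation; Computer vision; Generative adversarial network; Meta-learning; Synthetic data; Machine learning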
Online Access: http://www.sciencedirect.com/science/article/pii/S2590005622000911
_version_ 1811209452828753920
author Alhassan Mumuni
Fuseini Mumuni
author_facet Alhassan Mumuni
Fuseini Mumuni
author_sort Alhassan Mumuni
collection DOAJ
description To ensure good performance, modern machine learning models typically require large amounts of high-quality annotated data. However, data collection and annotation are usually performed manually and consume considerable time and resources. The quality and representativeness of curated data for a given task are usually dictated by the natural availability of clean data in the particular domain as well as the level of expertise of the developers involved. In many real-world settings, it is not feasible to obtain sufficient training data. Currently, data augmentation is the most effective way of alleviating this problem. The main goal of data augmentation is to increase the volume, quality and diversity of training data. This paper presents an extensive and thorough review of data augmentation methods applicable in computer vision domains. The focus is on more recent and advanced data augmentation techniques. The surveyed methods include deeply learned augmentation strategies as well as feature-level and meta-learning-based data augmentation techniques. Data synthesis approaches based on realistic 3D graphics modeling, neural rendering, and generative adversarial networks are also covered. Unlike previous surveys, we cover a more extensive array of modern techniques and applications. We also compare the performance of several state-of-the-art augmentation methods and present a rigorous discussion of the effectiveness of various techniques in different usage scenarios, based on performance results on different datasets and tasks.
first_indexed 2024-04-12T04:39:39Z
format Article
id doaj.art-06579df9a3974ed5b06ef40a946cf282
institution Directory Open Access Journal
issn 2590-0056
language English
last_indexed 2024-04-12T04:39:39Z
publishDate 2022-12-01
publisher Elsevier
record_format Article
series Array
spelling doaj.art-06579df9a3974ed5b06ef40a946cf282
2022-12-22T03:47:42Z
eng
Elsevier
Array
2590-0056
2022-12-01
16
100258
Data augmentation: A comprehensive survey of modern approaches
Alhassan Mumuni, Cape Coast Technical University, P. O. Box DL 50, Cape Coast, Ghana (corresponding author)
Fuseini Mumuni, University of Mines and Technology, P.O. Box 237, Tarkwa, Ghana
To ensure good performance, modern machine learning models typically require large amounts of high-quality annotated data. However, data collection and annotation are usually performed manually and consume considerable time and resources. The quality and representativeness of curated data for a given task are usually dictated by the natural availability of clean data in the particular domain as well as the level of expertise of the developers involved. In many real-world settings, it is not feasible to obtain sufficient training data. Currently, data augmentation is the most effective way of alleviating this problem. The main goal of data augmentation is to increase the volume, quality and diversity of training data. This paper presents an extensive and thorough review of data augmentation methods applicable in computer vision domains. The focus is on more recent and advanced data augmentation techniques. The surveyed methods include deeply learned augmentation strategies as well as feature-level and meta-learning-based data augmentation techniques. Data synthesis approaches based on realistic 3D graphics modeling, neural rendering, and generative adversarial networks are also covered. Unlike previous surveys, we cover a more extensive array of modern techniques and applications. We also compare the performance of several state-of-the-art augmentation methods and present a rigorous discussion of the effectiveness of various techniques in different usage scenarios, based on performance results on different datasets and tasks.
http://www.sciencedirect.com/science/article/pii/S2590005622000911
Review of data augmentation
Computer vision
Generative adversarial network
Meta-learning
Synthetic data
Machine learning
title Data augmentation: A comprehensive survey of modern approaches
topic Review of data augmentation
Computer vision
Generative adversarial network
Meta-learning
Synthetic data
Machine learning
url http://www.sciencedirect.com/science/article/pii/S2590005622000911