Data augmentation: A comprehensive survey of modern approaches
To ensure good performance, modern machine learning models typically require large amounts of quality annotated data. Meanwhile, the data collection and annotation processes are usually performed manually, and consume a lot of time and resources. The quality and representativeness of curated data fo...
Main Authors: | Alhassan Mumuni, Fuseini Mumuni |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2022-12-01 |
Series: | Array |
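Subjects: | Review of data augmentation, Computer vision, Generative adversarial network, Meta-learning, Synthetic data, Machine learning |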
Online Access: | http://www.sciencedirect.com/science/article/pii/S2590005622000911 |
_version_ | 1811209452828753920 |
---|---|
author | Alhassan Mumuni; Fuseini Mumuni |
author_facet | Alhassan Mumuni; Fuseini Mumuni |
author_sort | Alhassan Mumuni |
collection | DOAJ |
description | To ensure good performance, modern machine learning models typically require large amounts of quality annotated data. Meanwhile, the data collection and annotation processes are usually performed manually, and consume a lot of time and resources. The quality and representativeness of curated data for a given task is usually dictated by the natural availability of clean data in the particular domain as well as the level of expertise of developers involved. In many real-world application settings it is often not feasible to obtain sufficient training data. Currently, data augmentation is the most effective way of alleviating this problem. The main goal of data augmentation is to increase the volume, quality and diversity of training data. This paper presents an extensive and thorough review of data augmentation methods applicable in computer vision domains. The focus is on more recent and advanced data augmentation techniques. The surveyed methods include deeply learned augmentation strategies as well as feature-level and meta-learning-based data augmentation techniques. Data synthesis approaches based on realistic 3D graphics modeling, neural rendering, and generative adversarial networks are also covered. Different from previous surveys, we cover a more extensive array of modern techniques and applications. We also compare the performance of several state-of-the-art augmentation methods and present a rigorous discussion of the effectiveness of various techniques in different scenarios of use based on performance results on different datasets and tasks. |
first_indexed | 2024-04-12T04:39:39Z |
format | Article |
id | doaj.art-06579df9a3974ed5b06ef40a946cf282 |
institution | Directory Open Access Journal |
issn | 2590-0056 |
language | English |
last_indexed | 2024-04-12T04:39:39Z |
publishDate | 2022-12-01 |
publisher | Elsevier |
record_format | Article |
series | Array |
spelling | doaj.art-06579df9a3974ed5b06ef40a946cf282; 2022-12-22T03:47:42Z; eng; Elsevier; Array; 2590-0056; 2022-12-01; 16; 100258; Data augmentation: A comprehensive survey of modern approaches; Alhassan Mumuni [0]; Fuseini Mumuni [1]; [0] Cape Coast Technical University, P. O. Box DL 50, Cape Coast, Ghana (corresponding author); [1] University of Mines and Technology, P.O. Box 237, Tarkwa, Ghana; To ensure good performance, modern machine learning models typically require large amounts of quality annotated data. Meanwhile, the data collection and annotation processes are usually performed manually, and consume a lot of time and resources. The quality and representativeness of curated data for a given task is usually dictated by the natural availability of clean data in the particular domain as well as the level of expertise of developers involved. In many real-world application settings it is often not feasible to obtain sufficient training data. Currently, data augmentation is the most effective way of alleviating this problem. The main goal of data augmentation is to increase the volume, quality and diversity of training data. This paper presents an extensive and thorough review of data augmentation methods applicable in computer vision domains. The focus is on more recent and advanced data augmentation techniques. The surveyed methods include deeply learned augmentation strategies as well as feature-level and meta-learning-based data augmentation techniques. Data synthesis approaches based on realistic 3D graphics modeling, neural rendering, and generative adversarial networks are also covered. Different from previous surveys, we cover a more extensive array of modern techniques and applications. We also compare the performance of several state-of-the-art augmentation methods and present a rigorous discussion of the effectiveness of various techniques in different scenarios of use based on performance results on different datasets and tasks.; http://www.sciencedirect.com/science/article/pii/S2590005622000911; Review of data augmentation; Computer vision; Generative adversarial network; Meta-learning; Synthetic data; Machine learning |
spellingShingle | Alhassan Mumuni; Fuseini Mumuni; Data augmentation: A comprehensive survey of modern approaches; Array; Review of data augmentation; Computer vision; Generative adversarial network; Meta-learning; Synthetic data; Machine learning |
title | Data augmentation: A comprehensive survey of modern approaches |
title_full | Data augmentation: A comprehensive survey of modern approaches |
title_fullStr | Data augmentation: A comprehensive survey of modern approaches |
title_full_unstemmed | Data augmentation: A comprehensive survey of modern approaches |
title_short | Data augmentation: A comprehensive survey of modern approaches |
title_sort | data augmentation a comprehensive survey of modern approaches |
topic | Review of data augmentation; Computer vision; Generative adversarial network; Meta-learning; Synthetic data; Machine learning |
url | http://www.sciencedirect.com/science/article/pii/S2590005622000911 |
work_keys_str_mv | AT alhassanmumuni dataaugmentationacomprehensivesurveyofmodernapproaches AT fuseinimumuni dataaugmentationacomprehensivesurveyofmodernapproaches |