Deep Attention Neural Network for Multi-Label Classification in Unmanned Aerial Vehicle Imagery
The multi-label classification problem in Unmanned Aerial Vehicle (UAV) images is particularly challenging compared to single-label classification due to its combinatorial nature. To tackle this issue, we propose in this paper a deep learning approach based on encoder-decoder neural network architec...
Main Authors: | Aaliyah Alshehri, Yakoub Bazi, Nassim Ammour, Haidar Almubarak, Naif Alajlan |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | UAV imagery, deep learning, attention neural network, multi-label image classification |
Online Access: | https://ieeexplore.ieee.org/document/8808853/ |
_version_ | 1818855317913468928 |
---|---|
author | Aaliyah Alshehri, Yakoub Bazi, Nassim Ammour, Haidar Almubarak, Naif Alajlan |
author_sort | Aaliyah Alshehri |
collection | DOAJ |
description | The multi-label classification problem in Unmanned Aerial Vehicle (UAV) images is particularly challenging compared to single-label classification due to its combinatorial nature. To tackle this issue, we propose a deep learning approach based on an encoder-decoder neural network architecture with channel and spatial attention mechanisms. Specifically, the encoder module, which is based on a pre-trained convolutional neural network (CNN), transforms the input image into a set of feature maps using an appropriate feature combination. To improve the feature representation further, this module incorporates a squeeze-and-excitation (SE) layer for modelling the interdependencies between the channels of the feature maps. The decoder module, which is based on a long short-term memory (LSTM) network, generates the classes present in the image sequentially. At each time step, it predicts the next class label by aligning its hidden state to the corresponding region in the image by means of an adaptive spatial attention mechanism. Experiments carried out on two UAV datasets with a spatial resolution of 2 cm show that our method is promising in predicting the labels present in the image while attending to the relevant objects. Additionally, it provides better classification results than state-of-the-art methods. |
first_indexed | 2024-12-19T08:06:41Z |
format | Article |
id | doaj.art-4544ee0b2eae43f89ef651ab1a92f422 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T08:06:41Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-4544ee0b2eae43f89ef651ab1a92f422; 2022-12-21T20:29:45Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; vol. 7, pp. 119873-119880; doi:10.1109/ACCESS.2019.2936616; 8808853; Deep Attention Neural Network for Multi-Label Classification in Unmanned Aerial Vehicle Imagery; Aaliyah Alshehri (0); Yakoub Bazi (1), https://orcid.org/0000-0001-9287-0596; Nassim Ammour (2), https://orcid.org/0000-0002-4875-4640; Haidar Almubarak (3); Naif Alajlan (4); Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia (all five authors); https://ieeexplore.ieee.org/document/8808853/; UAV imagery; deep learning; attention neural network; multi-label image classification |
title | Deep Attention Neural Network for Multi-Label Classification in Unmanned Aerial Vehicle Imagery |
topic | UAV imagery, deep learning, attention neural network, multi-label image classification |
url | https://ieeexplore.ieee.org/document/8808853/ |
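The description above names two attention mechanisms: a squeeze-and-excitation (SE) layer that reweights feature-map channels in the encoder, and a spatial attention step that aligns the decoder's hidden state with image regions. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. The function names, the weight matrices `W1` and `W2`, and the dot-product scoring in `spatial_attention` are illustrative assumptions; the record does not give these details, and the "adaptive" variant of the attention mechanism is not reproduced here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def squeeze_excite(fmap, W1, W2):
    """Channel attention in the style of a squeeze-and-excitation layer.

    fmap: list of C channels, each a flat list of spatial activations.
    W1:   (C // r) x C weights of the bottleneck layer (hypothetical values).
    W2:   C x (C // r) weights of the expansion layer (hypothetical values).
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in fmap]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, h) for h in matvec(W1, z)]
    gates = [sigmoid(s) for s in matvec(W2, hidden)]
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [[g * a for a in ch] for g, ch in zip(gates, fmap)]
    return scaled, gates


def spatial_attention(regions, hidden_state):
    """Soft spatial attention over K image regions.

    regions:      list of K feature vectors, one per spatial location.
    hidden_state: decoder hidden state with the same dimension.
    Scores use a plain dot product, which is an assumption of this sketch.
    Returns the context vector and the attention weights.
    """
    scores = [sum(r * h for r, h in zip(region, hidden_state))
              for region in regions]
    # Numerically stable softmax over the K locations.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the region features.
    dim = len(regions[0])
    context = [sum(w * region[d] for w, region in zip(weights, regions))
               for d in range(dim)]
    return context, weights


if __name__ == "__main__":
    # Two channels with two spatial positions each (toy values).
    fmap = [[1.0, 3.0], [0.0, 0.0]]
    scaled, gates = squeeze_excite(fmap, W1=[[1.0, 0.0]], W2=[[1.0], [-1.0]])
    print("channel gates:", gates)

    # Two regions; the hidden state points toward the first one.
    context, weights = spatial_attention([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
    print("attention weights:", weights)
```

In a full model of the kind the description outlines, the context vector would be fed to the LSTM together with the previous label at each time step, so that each predicted class label is tied to the image region that received the most attention weight.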