A multimodal deep learning approach for gravel road condition evaluation through image and audio integration
This study investigates the combination of audio and image data to classify road conditions, particularly focusing on loose gravel scenarios. The dataset underwent binary categorisation, comprising audio segments capturing gravel sounds and corresponding images. Early feature fusion, utilising a pre...
Main Authors: | Nausheen Saeed, Moudud Alam, Roger G Nyberg |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2024-06-01 |
Series: | Transportation Engineering |
Subjects: | Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666691X24000034 |
_version_ | 1797322456168398848 |
---|---|
author | Nausheen Saeed; Moudud Alam; Roger G Nyberg |
author_facet | Nausheen Saeed; Moudud Alam; Roger G Nyberg |
author_sort | Nausheen Saeed |
collection | DOAJ |
description | This study investigates combining audio and image data to classify road conditions, focusing on loose gravel. The dataset, comprising audio segments capturing gravel sounds and the corresponding images, was labelled into two classes. Early feature fusion, using a pre-trained Very Deep Convolutional Networks 19 (VGG19) model and principal component analysis (PCA), improved the accuracy of the Random Forest classifier, which surpassed other models in accuracy, precision, recall, and F1-score. Late fusion, which combines the decisions of individual image and audio classifiers based on Densely Connected Convolutional Networks 121 (DenseNet121) through logical conjunction and disjunction gates (AND and OR), performed notably well, especially with the OR gate, which achieved 97 % accuracy. Late fusion enhances adaptability by compensating for limitations in one modality with information from the other. Adapting maintenance to the identified road conditions minimises unnecessary environmental impact. The method can help identify loose gravel on gravel roads, which can substantially improve road safety and support a precise, data-driven maintenance strategy. |
first_indexed | 2024-03-08T05:14:36Z |
format | Article |
id | doaj.art-e916ac5245064e1fafbca8b2a6344efe |
institution | Directory of Open Access Journals |
issn | 2666-691X |
language | English |
last_indexed | 2024-03-08T05:14:36Z |
publishDate | 2024-06-01 |
publisher | Elsevier |
record_format | Article |
series | Transportation Engineering |
spelling | doaj.art-e916ac5245064e1fafbca8b2a6344efe; 2024-02-07T04:45:58Z; eng; Elsevier; Transportation Engineering; 2666-691X; 2024-06-01; 16; 100228; A multimodal deep learning approach for gravel road condition evaluation through image and audio integration; Nausheen Saeed [0]; Moudud Alam [1]; Roger G Nyberg [2]; [0] Corresponding author; School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden; [1] School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden; [2] School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden; This study investigates combining audio and image data to classify road conditions, focusing on loose gravel. The dataset, comprising audio segments capturing gravel sounds and the corresponding images, was labelled into two classes. Early feature fusion, using a pre-trained Very Deep Convolutional Networks 19 (VGG19) model and principal component analysis (PCA), improved the accuracy of the Random Forest classifier, which surpassed other models in accuracy, precision, recall, and F1-score. Late fusion, which combines the decisions of individual image and audio classifiers based on Densely Connected Convolutional Networks 121 (DenseNet121) through logical conjunction and disjunction gates (AND and OR), performed notably well, especially with the OR gate, which achieved 97 % accuracy. Late fusion enhances adaptability by compensating for limitations in one modality with information from the other. Adapting maintenance to the identified road conditions minimises unnecessary environmental impact. The method can help identify loose gravel on gravel roads, which can substantially improve road safety and support a precise, data-driven maintenance strategy.; http://www.sciencedirect.com/science/article/pii/S2666691X24000034; Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning |
spellingShingle | Nausheen Saeed; Moudud Alam; Roger G Nyberg; A multimodal deep learning approach for gravel road condition evaluation through image and audio integration; Transportation Engineering; Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning |
title | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
title_full | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
title_fullStr | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
title_full_unstemmed | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
title_short | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
title_sort | multimodal deep learning approach for gravel road condition evaluation through image and audio integration |
topic | Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning |
url | http://www.sciencedirect.com/science/article/pii/S2666691X24000034 |
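
The decision-level (late) fusion described in the abstract, where the outputs of a separate image classifier and audio classifier are combined with an AND or OR gate, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `late_fusion` and `accuracy`, the label convention (1 = loose gravel, 0 = not loose gravel), and the toy decisions below are all assumptions made for illustration, and the numbers are not results from the study.

```python
from typing import List


def late_fusion(image_preds: List[int], audio_preds: List[int], gate: str = "or") -> List[int]:
    """Combine per-sample binary decisions from two classifiers with a logic gate.

    Assumed convention (hypothetical): 1 = loose gravel, 0 = not loose gravel.
    """
    if len(image_preds) != len(audio_preds):
        raise ValueError("both modalities must provide one decision per sample")
    if gate == "or":
        # OR gate: flag the sample if either modality flags it.
        return [int(bool(i) or bool(a)) for i, a in zip(image_preds, audio_preds)]
    if gate == "and":
        # AND gate: flag the sample only if both modalities agree.
        return [int(bool(i) and bool(a)) for i, a in zip(image_preds, audio_preds)]
    raise ValueError(f"unknown gate: {gate!r}")


def accuracy(preds: List[int], labels: List[int]) -> float:
    """Fraction of samples whose fused decision matches the label."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Made-up toy decisions, for illustration only.
labels = [1, 1, 0, 0, 1]
image_preds = [1, 0, 0, 0, 1]  # hypothetical image-classifier decisions
audio_preds = [0, 1, 0, 1, 1]  # hypothetical audio-classifier decisions

or_fused = late_fusion(image_preds, audio_preds, gate="or")
and_fused = late_fusion(image_preds, audio_preds, gate="and")

print("OR :", or_fused, accuracy(or_fused, labels))
print("AND:", and_fused, accuracy(and_fused, labels))
```

In this toy data the OR gate recovers positives that one modality misses, at the cost of passing through a false positive from the other; the AND gate does the opposite. That trade-off is one plausible reading of how late fusion lets one modality compensate for limitations in the other.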