A multimodal deep learning approach for gravel road condition evaluation through image and audio integration

This study investigates the combination of audio and image data to classify road conditions, particularly focusing on loose gravel scenarios. The dataset underwent binary categorisation, comprising audio segments capturing gravel sounds and corresponding images. Early feature fusion, utilising a pre...


Bibliographic Details
Main Authors: Nausheen Saeed, Moudud Alam, Roger G Nyberg
Format: Article
Language: English
Published: Elsevier 2024-06-01
Series: Transportation Engineering
Subjects: Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning
Online Access: http://www.sciencedirect.com/science/article/pii/S2666691X24000034
author Nausheen Saeed
Moudud Alam
Roger G Nyberg
collection DOAJ
description This study investigates the combination of audio and image data to classify road conditions, focusing on loose gravel. The dataset was categorised into two classes and comprises audio segments capturing gravel sounds together with the corresponding images. Early feature fusion, using a pre-trained Very Deep Convolutional Network 19 (VGG19) and principal component analysis (PCA), improved the accuracy of the Random Forest classifier, which surpassed the other models in accuracy, precision, recall, and F1-score. Late fusion, involving decision-level processing with logical conjunction and disjunction gates (AND and OR) applied to individual image and audio classifiers based on Densely Connected Convolutional Networks 121 (DenseNet121), demonstrated notable performance, especially with the OR gate, which achieved 97% accuracy. The late fusion method enhances adaptability by compensating for limitations in one modality with information from the other. Adapting maintenance to the identified road conditions minimises unnecessary environmental impact. This method can help to identify loose gravel on gravel roads, which can substantially improve road safety and support a precise, data-driven maintenance strategy.
first_indexed 2024-03-08T05:14:36Z
format Article
id doaj.art-e916ac5245064e1fafbca8b2a6344efe
institution Directory Open Access Journal
issn 2666-691X
language English
last_indexed 2024-03-08T05:14:36Z
publishDate 2024-06-01
publisher Elsevier
record_format Article
series Transportation Engineering
spelling doaj.art-e916ac5245064e1fafbca8b2a6344efe | 2024-02-07T04:45:58Z | eng | Elsevier | Transportation Engineering | 2666-691X | 2024-06-01 | 16 | 100228 | A multimodal deep learning approach for gravel road condition evaluation through image and audio integration | Nausheen Saeed (0), Moudud Alam (1), Roger G Nyberg (2) | (0) Corresponding author.; School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden | (1) School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden | (2) School of Information and Engineering, Dalarna University, Röda vägen 3, Borlänge, Sweden | http://www.sciencedirect.com/science/article/pii/S2666691X24000034 | Gravel road maintenance; Data fusion; Sound analysis; Machine vision; Machine Learning
title A multimodal deep learning approach for gravel road condition evaluation through image and audio integration
topic Gravel road maintenance
Data fusion
Sound analysis
Machine vision
Machine Learning
url http://www.sciencedirect.com/science/article/pii/S2666691X24000034
work_keys_str_mv AT nausheensaeed amultimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
AT moududalam amultimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
AT rogergnyberg amultimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
AT nausheensaeed multimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
AT moududalam multimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
AT rogergnyberg multimodaldeeplearningapproachforgravelroadconditionevaluationthroughimageandaudiointegration
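The description above mentions late fusion, in which the binary decisions of separate image and audio classifiers are combined with AND and OR gates. The following is a minimal sketch of that idea only, not the authors' implementation: the function names `late_fusion` and `accuracy`, the label encoding (1 = loose gravel, 0 = not loose gravel), and the sample predictions are all hypothetical and chosen for illustration.

```python
from typing import List


def late_fusion(image_preds: List[int], audio_preds: List[int], gate: str = "OR") -> List[int]:
    """Combine per-sample binary decisions from two classifiers.

    Assumed encoding (hypothetical): 1 = loose gravel, 0 = not loose gravel.
    """
    if len(image_preds) != len(audio_preds):
        raise ValueError("both modalities must provide one decision per sample")
    if gate == "OR":
        # Flag a sample when either modality flags it.
        return [int(bool(i) or bool(a)) for i, a in zip(image_preds, audio_preds)]
    if gate == "AND":
        # Flag a sample only when both modalities agree on a positive.
        return [int(bool(i) and bool(a)) for i, a in zip(image_preds, audio_preds)]
    raise ValueError("gate must be 'OR' or 'AND'")


def accuracy(preds: List[int], labels: List[int]) -> float:
    """Fraction of samples where the fused decision matches the label."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


if __name__ == "__main__":
    # Made-up decisions and labels, for illustration only.
    image_preds = [1, 0, 0, 1, 0]
    audio_preds = [0, 0, 1, 1, 0]
    labels = [1, 0, 1, 1, 0]
    for gate in ("OR", "AND"):
        fused = late_fusion(image_preds, audio_preds, gate)
        print(gate, fused, accuracy(fused, labels))
```

In this toy data each classifier misses one positive that the other catches, so the OR gate recovers both, which mirrors the description's point that one modality can compensate for limitations in the other. The AND gate is the more conservative choice: it reduces false positives at the cost of missing samples that only one modality detects.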