Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks
Audio signal classification finds various applications in detecting and monitoring health conditions in healthcare. Convolutional neural networks (CNN) have produced state-of-the-art results in image classification and are being increasingly used in other tasks, including signal classification. However, audio signal classification using CNN presents various challenges. In image classification tasks, raw images of equal dimensions can be used as a direct input to CNN. Raw time-domain signals, on the other hand, can be of varying dimensions. In addition, the temporal signal often has to be transformed to frequency-domain to reveal unique spectral characteristics, therefore requiring signal transformation. In this work, we overview and benchmark various audio signal representation techniques for classification using CNN, including approaches that deal with signals of different lengths and combine multiple representations to improve the classification accuracy. Hence, this work surfaces important empirical evidence that may guide future works deploying CNN for audio signal classification purposes.
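The abstract notes that raw time-domain signals vary in length and are typically transformed into a time-frequency representation before being passed to a CNN. As a purely illustrative aid (not the pipeline benchmarked in the paper), the sketch below converts a recording of arbitrary duration into a fixed-size log-mel spectrogram image; the sample rate, mel-band count, 224×224 target size, and the file name cough_001.wav are assumptions made for the example.

```python
# Illustrative sketch: variable-length audio -> fixed-size time-frequency image for a CNN.
# Not the authors' exact pipeline; sample rate, n_mels, and target size are assumed values.
import numpy as np
import librosa
from scipy.ndimage import zoom

def audio_to_cnn_input(path, sr=16000, n_mels=128, target_shape=(224, 224)):
    """Load an audio file of any duration and return a fixed-size log-mel image."""
    y, sr = librosa.load(path, sr=sr, mono=True)  # resample to a common rate
    mel = librosa.feature.melspectrogram(y=y, sr=sr,
                                         n_fft=1024, hop_length=512, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)  # time-frequency image: (n_mels, frames)

    # The number of frames depends on the signal length, so interpolate (bilinear zoom)
    # the spectrogram to the fixed image size the CNN expects.
    scale = (target_shape[0] / log_mel.shape[0], target_shape[1] / log_mel.shape[1])
    image = zoom(log_mel, scale, order=1)

    # Normalise to [0, 1] so images from different recordings are on a comparable scale.
    image = (image - image.min()) / (image.max() - image.min() + 1e-8)
    return image.astype(np.float32)

# Example usage (hypothetical file): add a channel axis to get a (224, 224, 1) CNN input.
# x = np.expand_dims(audio_to_cnn_input("cough_001.wav"), axis=-1)
```

Interpolating the time-frequency image to a fixed size is one common way of handling variable-length inputs; padding or truncating the waveform before the transform is another, and the paper's keywords (interpolation, fusion) indicate that such alternatives and their combinations are among the representations compared.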
Main Authors: | Roneel V. Sharan, Hao Xiong, Shlomo Berkovsky |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-05-01 |
Series: | Sensors |
Subjects: | convolutional neural networks; fusion; interpolation; machine learning; spectrogram; time-frequency image |
Online Access: | https://www.mdpi.com/1424-8220/21/10/3434 |
ISSN: | 1424-8220 |
DOI: | 10.3390/s21103434 |
Author Affiliation: | Australian Institute of Health Innovation, Macquarie University, Sydney, NSW 2109, Australia (all authors) |