DEW: A wavelet approach of rare sound event detection.

Bibliographic Details
Main Authors: Sania Gul, Muhammad Salman Khan, Ata Ur-Rehman
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0300444&type=printable
author Sania Gul
Muhammad Salman Khan
Ata Ur-Rehman
collection DOAJ
description This paper presents a novel sound event detection (SED) system for rare events occurring in an open environment. Wavelet multiresolution analysis (MRA) is used to decompose a 30-second input audio clip into five levels. Wavelet denoising is then applied to the third and fifth levels of the MRA to filter out the background. Significant transitions, which may represent the onset of a rare event, are then estimated in these two levels by combining a peak-finding algorithm with the K-medoids clustering algorithm. Short one-second portions, called 'chunks', are cropped from the input audio signal at the estimated locations of the significant transitions. Features are extracted from these chunks by a wavelet scattering network (WSN) and given as input to a support vector machine (SVM) classifier, which classifies them. The proposed SED framework produces an error rate comparable to that of SED systems based on convolutional neural network (CNN) architectures. The proposed algorithm is also computationally efficient and lightweight compared to deep learning models, as it has no learnable parameters. It requires only a single epoch of training, which is 5, 10, 200, and 600 times fewer than the models based on CNNs and deep neural networks (DNNs), CNN with a long short-term memory (LSTM) network, a convolutional recurrent neural network (CRNN), and CNN, respectively. The proposed model requires neither concatenation with previous frames for anomaly detection nor the additional training data creation needed for the other comparative deep learning models. It needs to check almost 360 times fewer chunks for the presence of rare events than the other baseline systems used for comparison in this paper. All these characteristics make the proposed system suitable for real-time applications on resource-limited devices.
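The front end described in the abstract (five-level decomposition, denoising of a detail level, peak picking, one-second chunk cropping) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the soft-threshold value, the naive local-maximum peak picker, the synthetic test signal, and all function names. The K-medoids clustering, wavelet scattering network, and SVM stages are not reproduced.

```python
import math

def haar_dwt_step(signal):
    """One Haar analysis step: returns (approximation, detail), each half length."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def haar_decompose(signal, levels=5):
    """Return ([d1, ..., d_levels], final approximation) of a Haar decomposition."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return details, approx

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (soft thresholding)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def find_peaks(values, min_height):
    """Indices of local maxima of |values| that reach min_height."""
    mag = [abs(v) for v in values]
    return [i for i in range(1, len(mag) - 1)
            if mag[i] >= min_height and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]]

def crop_chunks(signal, sample_rate, peak_indices, level, chunk_seconds=1.0):
    """Crop fixed-length chunks starting at the sample position of each peak."""
    n = int(sample_rate * chunk_seconds)
    chunks = []
    for idx in peak_indices:
        # A coefficient at `level` covers 2**level input samples.
        start = min(idx * 2 ** level, max(len(signal) - n, 0))
        chunks.append((start, signal[start:start + n]))
    return chunks

# Synthetic 4-second clip (hypothetical): quiet background plus one loud burst.
sample_rate = 256
signal = [0.01 * math.sin(2 * math.pi * 3 * t / sample_rate) for t in range(1024)]
for t in range(516, 580):          # burst begins just after the 2-second mark
    signal[t] += 1.0

details, _ = haar_decompose(signal, levels=5)
level = 3                          # the abstract uses levels 3 and 5
denoised = soft_threshold(details[level - 1], thr=0.5)
peaks = find_peaks(denoised, min_height=0.5)
chunks = crop_chunks(signal, sample_rate, peaks, level)
```

In this toy setting the peaks mark the start and end of the burst, and only the chunks around them would be passed to a feature extractor and classifier; the rest of the clip is never inspected, which is the idea behind the reduced chunk count reported in the abstract.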
first_indexed 2024-04-24T13:47:35Z
format Article
id doaj.art-d43bcd7129c3462086ed57503d9fb77d
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-24T13:47:35Z
publishDate 2024-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-d43bcd7129c3462086ed57503d9fb77d 2024-04-04T05:34:35Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2024-01-01 19 3 e0300444 10.1371/journal.pone.0300444 DEW: A wavelet approach of rare sound event detection. Sania Gul, Muhammad Salman Khan, Ata Ur-Rehman https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0300444&type=printable
title DEW: A wavelet approach of rare sound event detection.
url https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0300444&type=printable