Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High‐Resolution Spectral Features
Recently, deep recurrent neural networks have achieved great success in various machine learning tasks, and have also been applied to sound event detection. The detection of temporally overlapping sound events in realistic environments is much more challenging than in monophonic detection problems....
Main Authors: | Hyoung‐Gook Kim, Jin Young Kim |
---|---|
Format: | Article |
Language: | English |
Published: | Electronics and Telecommunications Research Institute (ETRI), 2017-12-01 |
Series: | ETRI Journal |
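Subjects: | Acoustic event detection; Deep recurrent neural networks; Gated recurrent neural network; Multichannel audio |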
Online Access: | https://doi.org/10.4218/etrij.17.0117.0157 |
_version_ | 1819014665179496448 |
---|---|
author | Hyoung‐Gook Kim; Jin Young Kim |
collection | DOAJ |
description | Recently, deep recurrent neural networks have achieved great success in various machine learning tasks, and have also been applied to sound event detection. The detection of temporally overlapping sound events in realistic environments is much more challenging than in monophonic detection problems. In this paper, we present an approach to improve the accuracy of polyphonic sound event detection in multichannel audio based on gated recurrent neural networks in combination with auditory spectral features. In the proposed method, spatial and spectral‐domain noise‐reduced harmonic features based on human hearing perception are extracted from multichannel audio and used as high‐resolution spectral inputs to train gated recurrent neural networks. This yields faster and more stable convergence than long short‐term memory recurrent neural networks. Our evaluation reveals that the proposed method outperforms conventional approaches. |
first_indexed | 2024-12-21T02:19:27Z |
format | Article |
id | doaj.art-8642a6e8e8e14fa78367c5a42dc8760e |
institution | Directory of Open Access Journals |
issn | 1225-6463 2233-7326 |
language | English |
last_indexed | 2024-12-21T02:19:27Z |
publishDate | 2017-12-01 |
publisher | Electronics and Telecommunications Research Institute (ETRI) |
record_format | Article |
series | ETRI Journal |
spelling | doaj.art-8642a6e8e8e14fa78367c5a42dc8760e; 2022-12-21T19:19:10Z; eng; Electronics and Telecommunications Research Institute (ETRI); ETRI Journal; ISSN 1225-6463, 2233-7326; 2017-12-01; vol. 39, no. 6, pp. 832-840; 10.4218/etrij.17.0117.0157 |
title | Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High‐Resolution Spectral Features |
topic | Acoustic event detection; Deep recurrent neural networks; Gated recurrent neural network; Multichannel audio |
url | https://doi.org/10.4218/etrij.17.0117.0157 |