Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High‐Resolution Spectral Features

Bibliographic Details
Main Authors: Hyoung‐Gook Kim, Jin Young Kim
Format: Article
Language: English
Published: Electronics and Telecommunications Research Institute (ETRI) 2017-12-01
Series: ETRI Journal
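Subjects: Acoustic event detection; Deep recurrent neural networks; Gated recurrent neural network; Multichannel audio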
Online Access: https://doi.org/10.4218/etrij.17.0117.0157
author Hyoung‐Gook Kim
Jin Young Kim
collection DOAJ
description Recently, deep recurrent neural networks have achieved great success in various machine learning tasks and have also been applied to sound event detection. Detecting temporally overlapping sound events in realistic environments is much more challenging than monophonic detection. In this paper, we present an approach to improve the accuracy of polyphonic sound event detection in multichannel audio based on gated recurrent neural networks in combination with auditory spectral features. In the proposed method, spatial and spectral‐domain noise‐reduced harmonic features based on human hearing perception are extracted from multichannel audio and used as high‐resolution spectral inputs to train gated recurrent neural networks. This provides faster and more stable convergence than long short‐term memory recurrent neural networks. Our evaluation reveals that the proposed method outperforms conventional approaches.
first_indexed 2024-12-21T02:19:27Z
format Article
id doaj.art-8642a6e8e8e14fa78367c5a42dc8760e
institution Directory Open Access Journal
issn 1225-6463
2233-7326
language English
last_indexed 2024-12-21T02:19:27Z
publishDate 2017-12-01
publisher Electronics and Telecommunications Research Institute (ETRI)
record_format Article
series ETRI Journal
spelling doaj.art-8642a6e8e8e14fa78367c5a42dc8760e | 2022-12-21T19:19:10Z | eng | Electronics and Telecommunications Research Institute (ETRI) | ETRI Journal | 1225-6463 | 2233-7326 | 2017-12-01 | 39 | 6 | 832 | 840 | 10.4218/etrij.17.0117.0157
title Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High‐Resolution Spectral Features
topic Acoustic event detection
Deep recurrent neural networks
Gated recurrent neural network
Multichannel audio
url https://doi.org/10.4218/etrij.17.0117.0157