Recognizing sound-event by machine learning

Bibliographic Details
Main Author: Athaariq Ramadino
Other Authors: Jiang Xudong
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access: https://hdl.handle.net/10356/136924
Description
Summary: Environmental sounds provide important context to events. Environmental sound recognition is made possible by developments in computing and statistics. One chief method of analyzing sound events is the spectrogram. Multiple feature extraction techniques exist; however, not all of them are suitable for environmental sound recognition. In this paper, a new technique, here termed “2D complex-log spectrum”, is used. From the spectrogram, a second FFT is taken along the time dimension. The result is then regularized in order to maximize discriminating features. The technique is applied to the RWCP and NTU-SEC databases and compared with other feature extraction techniques, achieving >95% recognition in the best-case scenario.
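
The pipeline outlined in the summary (spectrogram, then a second FFT along the time dimension) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the author's implementation: the summary does not give the frame length, hop size, window, the exact form of the complex logarithm, or the regularization step. The function names `spectrogram` and `second_fft_features`, the Hann window, the log-magnitude compression before the second FFT, and the `log1p` scaling used as a stand-in for regularization are all hypothetical choices made for illustration.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram of shape (freq_bins, frames) from a Hann-windowed STFT.

    frame_len and hop are illustrative defaults, not values from the source.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames, one frame per column.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # First FFT: along the sample axis of each frame (frequency dimension).
    return np.abs(np.fft.rfft(frames, axis=0))


def second_fft_features(signal, frame_len=256, hop=128, eps=1e-10):
    """Take a second FFT along the time axis of a log-compressed spectrogram."""
    spec = spectrogram(signal, frame_len, hop)
    # Assumption: log compression of the magnitude before the second FFT.
    log_spec = np.log(spec + eps)
    # Second FFT: along the time (frame) axis, one transform per frequency bin.
    modulation = np.fft.rfft(log_spec, axis=1)
    # Assumption: log1p scaling stands in for the unspecified regularization step.
    return np.log1p(np.abs(modulation))
```

For a one-dimensional NumPy array `x` of audio samples, `second_fft_features(x)` returns a two-dimensional array indexed by acoustic frequency and by frequency along the time axis; such an array could be flattened and passed to a classifier of the reader's choice.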