Summary: | Environmental sounds provide important context to events. Environmental sound
recognition is made possible by developments in computing and statistics. One chief
method of analyzing sound events is the spectrogram. Many feature extraction
techniques exist; however, not all of them are suitable for environmental sound
recognition. In this paper, a new technique, termed the “2D complex-log
spectrum”, is introduced. Starting from the spectrogram, a second FFT is taken along
the time dimension. The result is then regularized to maximize discriminating
features. The technique is applied to the RWCP and NTU-SEC databases, and compared
to other feature extraction techniques, achieving a recognition rate above 95% in
the best case.
|
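
The pipeline outlined in the summary (spectrogram, then a second FFT along the time dimension, then regularization) can be illustrated with a minimal sketch. This is not the paper's implementation. The summary does not define the complex-log step or the regularization, so the sketch assumes a plain log-magnitude spectrogram and uses a simple scale-to-unit-maximum step as a stand-in. All function names, the Hann window, the frame length, and the hop size are hypothetical choices made for illustration. A naive DFT from the standard library is used in place of an FFT to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectrogram(signal, frame_len, hop):
    """Complex short-time spectrum: spec[t][b] for frame t and frequency bin b."""
    # Periodic Hann window (an assumed choice, not taken from the paper).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    spec = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spec.append(dft(frame)[: frame_len // 2 + 1])  # keep non-negative bins
    return spec


def second_transform_over_time(spec, eps=1e-10):
    """For each frequency bin, take the DFT of its log-magnitude track over time.

    Assumption: log magnitude is used here; the paper's complex-log step
    may differ. Returns out[b][m] for frequency bin b and time-axis index m.
    """
    n_frames, n_bins = len(spec), len(spec[0])
    out = []
    for b in range(n_bins):
        track = [math.log(abs(spec[t][b]) + eps) for t in range(n_frames)]
        out.append(dft(track))
    return out


def scale_to_unit_max(coeffs):
    """Placeholder for the regularization step: magnitudes scaled so the maximum is 1."""
    mags = [[abs(c) for c in row] for row in coeffs]
    peak = max(max(row) for row in mags)
    return [[m / peak for m in row] for row in mags]


# Toy input: a stationary sine whose period (4 samples) divides the hop size.
toy_signal = [math.sin(math.pi * n / 2) for n in range(64)]
toy_spec = spectrogram(toy_signal, frame_len=16, hop=8)
toy_coeffs = second_transform_over_time(toy_spec)
toy_features = scale_to_unit_max(toy_coeffs)
```

For a stationary tone such as the toy input, each frequency bin's track is constant over time, so the second transform concentrates its energy at time-axis index 0. Sounds whose spectrum changes over time spread energy into the higher indices, which is the kind of temporal structure the second transform is meant to expose.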