Single-ended quality measurement of a music content via convolutional recurrent neural networks

Bibliographic Details
Main Authors: Kamila Organiściak, Józef Borkowski
Format: Article
Language: English
Published: Polish Academy of Sciences 2021-01-01
Series: Metrology and Measurement Systems
Subjects:
Online Access: https://journals.pan.pl/Content/117865/PDF/art12.pdf
Description
Summary: The paper examines the use of a Convolutional Bidirectional Recurrent Neural Network (CBRNN) for quality measurement of music content. Its key contribution, compared with existing research, is that the model is evaluated on detecting acoustic anomalies without requiring a reference (clean) signal. Because real music content may combine instrumental sounds, speech, singing voice, and various audio effects, it is more complex to analyze than clean speech or artificial signals, especially without a comparison to known reference content. The presented results may be treated as a proof of concept, since only some specific types of artefacts are covered (examples of quantization defects, missing sound, distortion of gain characteristics, and added noise). However, the described model can be readily extended to detect other impairments or used as a pre-trained model for transfer learning. To examine the model's efficiency, several experiments were performed and are reported in the paper. Raw audio samples were transformed into Mel-scaled spectrograms and passed as input to the model, first on their own, then together with additional features (Zero Crossing Rate, Spectral Contrast). According to the obtained results, overall accuracy increases significantly (by 10.1%) when Spectral Contrast information is provided together with the Mel-scaled spectrograms. The paper also examines the influence of the recurrent layers on the effectiveness of the artefact classification task.
ISSN: 2300-1941
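Illustration: The summary mentions three input features: Mel-scaled spectrograms, Zero Crossing Rate, and Spectral Contrast. The sketch below shows one way such features could be computed per frame and stacked into a single matrix, using NumPy only. It is not the authors' pipeline: the frame length, hop size, number of Mel bands, octave-band layout, and all function names are assumptions chosen for illustration, and the spectral contrast shown is a simplified peak-to-valley variant.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Slice a 1-D signal into overlapping frames (no padding)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]  # shape: (n_frames, frame_len)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels=40):
    """Triangular filters spaced evenly on the mel scale."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def power_spectrum(frames):
    """Hann-windowed power spectrum of each frame."""
    window = np.hanning(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1)) ** 2

def log_mel(power, sr, n_fft, n_mels=40):
    """Mel-band energies in dB."""
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel + 1e-10)

def spectral_contrast(power, sr, n_bands=6, fmin=200.0, alpha=0.02):
    """Simplified peak-to-valley difference (dB) in octave-wide bands."""
    freqs = np.linspace(0.0, sr / 2.0, power.shape[1])
    edges = np.concatenate(([0.0], fmin * 2.0 ** np.arange(n_bands)))
    out = np.zeros((power.shape[0], n_bands))
    for b in range(n_bands):
        mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
        band = np.sort(power[:, mask], axis=1)
        k = max(1, int(round(alpha * band.shape[1])))
        valley = band[:, :k].mean(axis=1)   # quietest bins in the band
        peak = band[:, -k:].mean(axis=1)    # loudest bins in the band
        out[:, b] = 10.0 * np.log10(peak + 1e-10) - 10.0 * np.log10(valley + 1e-10)
    return out

def extract_features(x, sr, frame_len=1024, hop=512, n_mels=40, n_bands=6):
    """Stack log-mel, spectral contrast, and ZCR into (n_frames, n_features)."""
    frames = frame_signal(x, frame_len, hop)
    power = power_spectrum(frames)
    return np.hstack([
        log_mel(power, sr, frame_len, n_mels),
        spectral_contrast(power, sr, n_bands),
        zero_crossing_rate(frames)[:, None],
    ])

# Synthetic stand-in for a music excerpt: one second of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
features = extract_features(tone, sr)
zcr = zero_crossing_rate(frame_signal(tone))
```

In this layout each row of `features` describes one frame, so the matrix could be fed to a convolutional or recurrent model as a time sequence; how the paper actually combines the features with the spectrogram input is not stated in this record.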