Embedding-Based Music Emotion Recognition Using Composite Loss


Bibliographic Details
Main Authors: Naoki Takashima, Frederic Li, Marcin Grzegorzek, Kimiaki Shirahama
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Music emotion recognition; embeddings; canonical correlation analysis; Kullback-Leibler divergence; bidirectional retrieval
Online Access: https://ieeexplore.ieee.org/document/10097747/
Abstract: Most music emotion recognition approaches perform classification or regression to estimate a general emotional category from a distribution of music samples, without considering emotional variations (e.g., happiness can be further categorised as strong, moderate or mild happiness). We propose an embedding-based music emotion recognition approach that associates music samples with emotions in a common embedding space, considering both general emotional categories and fine-grained discrimination within each category. Since the association of music samples with emotions is uncertain due to subjective human perception, we compute embeddings with a composite loss designed to maximise two statistical characteristics: the correlation between music samples and emotions, based on canonical correlation analysis, and a probabilistic similarity between a music sample and an emotion, measured by KL-divergence. Experiments on two benchmark datasets demonstrate the effectiveness of our embedding-based approach, the composite loss and the learned acoustic features. In addition, detailed analysis shows that our approach achieves robust bidirectional music emotion recognition, not only identifying music samples that match a specific emotion but also detecting the emotions expressed in a given music sample.
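The abstract describes a composite loss that combines a CCA-based correlation term with a KL-divergence term over paired music/emotion embeddings. The paper's exact formulation is not reproduced in this record, so the following is only a minimal NumPy sketch of that general idea: the function name, the per-dimension Pearson correlation used as a crude stand-in for the full CCA objective, and the softened one-hot similarity targets are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def composite_loss(music_emb, emotion_emb, alpha=0.5):
    """Toy composite loss over paired embeddings (illustrative only).

    music_emb, emotion_emb: (n, d) arrays where row i of each forms a
    matched music/emotion pair. The loss combines:
      * a correlation term: mean per-dimension Pearson correlation
        between the two views (a crude stand-in for the CCA objective),
        entered negated so that high correlation lowers the loss;
      * a KL term: divergence between each music sample's softmax
        similarity distribution over the emotions in the batch and a
        softened one-hot target favouring its paired emotion.
    """
    # Correlation term: centre both views, correlate dimension-wise.
    m = music_emb - music_emb.mean(axis=0)
    e = emotion_emb - emotion_emb.mean(axis=0)
    denom = np.sqrt((m ** 2).sum(axis=0) * (e ** 2).sum(axis=0)) + 1e-8
    corr = ((m * e).sum(axis=0) / denom).mean()

    # KL term: predicted emotion distribution vs. softened one-hot target.
    sims = music_emb @ emotion_emb.T              # (n, n) similarities
    p = softmax(sims, axis=1)
    target = softmax(5.0 * np.eye(len(music_emb)), axis=1)
    kl = (target * (np.log(target + 1e-8) - np.log(p + 1e-8))).sum(axis=1).mean()

    return -corr + alpha * kl
```

As a sanity check, identical views (correlation 1, well-aligned similarities) should score strictly better than anti-correlated ones, e.g. `composite_loss(x, x) < composite_loss(x, -x)` for a random batch `x`.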
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3265807
Published in: IEEE Access, Vol. 11, 2023, pp. 36579-36604 (Article 10097747)
Author affiliations:
Naoki Takashima: Graduate School of Science and Engineering, Kindai University, Higashiosaka, Japan
Frederic Li (ORCID: 0000-0003-2110-4207): Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
Marcin Grzegorzek: Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
Kimiaki Shirahama (ORCID: 0000-0003-1843-5152): Faculty of Informatics, Kindai University, Higashiosaka, Japan