Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging.

In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to...

Bibliographic Details
Main Authors: Taketo Akama, Hiroaki Kitano, Katsuhiro Takematsu, Yasushi Miyajima, Natalia Polouliakh
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0294643&type=printable
author Taketo Akama
Hiroaki Kitano
Katsuhiro Takematsu
Yasushi Miyajima
Natalia Polouliakh
collection DOAJ
description In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to deduce associated tags, such as genre and mood. Given the limitations and non-scalability of human supervision signals, it becomes crucial for models to learn from alternative sources to enhance their performance. Contrastive learning-based self-supervised learning, which exclusively relies on learning signals derived from music audio data, has demonstrated its efficacy in the context of auto-tagging. In this work, we propose a model that builds on the self-supervised learning approach to address the similarity-based retrieval challenge by introducing our method of metric learning with a self-supervised auxiliary loss. Furthermore, diverging from conventional self-supervised learning methodologies, we discovered the advantages of concurrently training the model with both self-supervision and supervision signals, without freezing pre-trained models. We also found that refraining from employing augmentation during the fine-tuning phase yields better results. Our experimental results confirm that the proposed methodology enhances retrieval and tagging performance metrics in two distinct scenarios: one where human-annotated tags are consistently available for all music tracks, and another where such tags are accessible only for a subset of music tracks.
first_indexed 2024-03-08T23:54:52Z
format Article
id doaj.art-905cc4c4bce5426facb87b1cc96ef5e5
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-03-08T23:54:52Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-905cc4c4bce5426facb87b1cc96ef5e5; 2023-12-13T05:32:39Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2023-01-01; 18; 11; e0294643; 10.1371/journal.pone.0294643; Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging.; Taketo Akama; Hiroaki Kitano; Katsuhiro Takematsu; Yasushi Miyajima; Natalia Polouliakh; https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0294643&type=printable
title Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging.
url https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0294643&type=printable