A Framework for Content-Based Search in Large Music Collections

We address the problem of scalable content-based search in large collections of music documents. Music content is highly complex and versatile and presents multiple facets that can be considered independently or in combination. Moreover, music documents can be digitally encoded in many ways. We prop...

Bibliographic Details
Main Authors: Tiange Zhu, Raphaël Fournier-S’niehotta, Philippe Rigaux, Nicolas Travers
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Big Data and Cognitive Computing
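Subjects: music collections; digital music encoding; music information retrieval; scalable and content-based search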
Online Access: https://www.mdpi.com/2504-2289/6/1/23
author Tiange Zhu
Raphaël Fournier-S’niehotta
Philippe Rigaux
Nicolas Travers
collection DOAJ
description We address the problem of scalable content-based search in large collections of music documents. Music content is highly complex and versatile and presents multiple facets that can be considered independently or in combination. Moreover, music documents can be digitally encoded in many ways. We propose a general framework for building a scalable search engine, based on (i) a music description language that represents music content independently of any specific encoding, (ii) an extendible list of feature-extraction functions, and (iii) indexing, searching, and ranking procedures designed to be integrated into the standard architecture of a text-oriented search engine. As a proof of concept, we also detail an actual implementation of the framework for searching in large collections of XML-encoded music scores, based on the popular ElasticSearch system. It is released as open source on GitHub and is available as a ready-to-use Docker image for communities that manage large collections of digitized music documents.
first_indexed 2024-03-09T20:06:38Z
format Article
id doaj.art-47f9ee4b36794a4090fda661b41fdaa1
institution Directory Open Access Journal
issn 2504-2289
language English
last_indexed 2024-03-09T20:06:38Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Big Data and Cognitive Computing
spelling doaj.art-47f9ee4b36794a4090fda661b41fdaa1
2023-11-24T00:29:06Z
eng
MDPI AG
Big Data and Cognitive Computing
2504-2289
2022-02-01
6
1
23
10.3390/bdcc6010023
A Framework for Content-Based Search in Large Music Collections
Tiange Zhu (CEDRIC Laboratory, CNAM Paris, 75003 Paris, France)
Raphaël Fournier-S’niehotta (CEDRIC Laboratory, CNAM Paris, 75003 Paris, France)
Philippe Rigaux (CEDRIC Laboratory, CNAM Paris, 75003 Paris, France)
Nicolas Travers (Research Center, Léonard de Vinci Pôle Universitaire, 92400 Paris La Défense, France)
https://www.mdpi.com/2504-2289/6/1/23
music collections
digital music encoding
music information retrieval
scalable and content-based search
title A Framework for Content-Based Search in Large Music Collections
topic music collections
digital music encoding
music information retrieval
scalable and content-based search
url https://www.mdpi.com/2504-2289/6/1/23