A transformers-based approach for fine and coarse-grained classification and generation of MIDI songs and soundtracks


Bibliographic Details
Main Authors:	Simone Angioni, Nathan Lincoln-DeCusatis, Andrea Ibba, Diego Reforgiato Recupero
Format:	Article
Language:	English
Published:	PeerJ Inc. 2023-06-01
Series:	PeerJ Computer Science
Subjects:	Classification, Deep Learning, Generation, MIDI, Transformers
Online Access:	https://peerj.com/articles/cs-1410.pdf

Collection:	DOAJ
Abstract:	Music is an extremely subjective art form whose commodification via the recording industry in the 20th century has led to an increasingly subdivided set of genre labels that attempt to organize musical styles into definite categories. Music psychology studies the processes through which music is perceived, created, responded to, and incorporated into everyday life, and modern artificial intelligence technology can be applied to the same questions. Music classification and generation are emerging fields that have recently gained much attention, especially with the latest advances in deep learning. Self-attention networks have brought large benefits to classification and generation tasks in different domains and on different types of data (text, images, videos, sounds). In this article, we analyze the effectiveness of Transformers for both classification and generation, studying the performance of classification at different granularities and of generation using different human and automatic metrics. The input data consist of MIDI sounds drawn from different datasets: sounds from 397 Nintendo Entertainment System (NES) video games, classical pieces, and rock songs from different composers and bands. We performed classification within each dataset to identify the type or composer of each sample (fine-grained) and classification at a higher level. In the latter, we combined the three datasets with the goal of identifying each sample as just NES, rock, or classical (coarse-grained). The proposed Transformers-based approach outperformed competitors based on deep learning and machine learning approaches. Finally, the generation task was carried out on each dataset, and the resulting samples were evaluated using human and automatic metrics (local alignment).
ISSN:	2376-5992
DOI:	10.7717/peerj-cs.1410
Volume:	9, article e1410
Author affiliations:
Simone Angioni:	Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Sardegna, Italy
Nathan Lincoln-DeCusatis:	Department of Music, Fordham University, New York, United States of America
Andrea Ibba:	Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Sardegna, Italy
Diego Reforgiato Recupero:	Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Sardegna, Italy
Institution:	Directory Open Access Journal
Record ID:	doaj.art-dc47468cdf7c46c99821b4e178f901b1
First indexed:	2024-03-13T04:01:32Z
Last indexed:	2024-03-13T04:01:32Z
